
The Project
This page provides background information about the ReKisstory Project
The Mission
ReKisstory Project is a project of Go Sugimoto with a mission to:
- help people to easily and effectively perform research on human knowledge
- make exploring time and space data fun
- demonstrate the power of data science and data integration on the web
To achieve these goals, he has created a simple, user-friendly search engine tool, "ReKisstory". It allows users to search a large-scale database of human knowledge and analyze a broad spectrum of data, with a special focus on time and space. Search results are presented in interactive visualizations (table, graph, timeline, and map, plus a network view). However, this is just one building block of what we envision. If you are interested in the bigger vision, please check out our blog post
The Target Audience
ReKisstory is a research project in a Computer Science department. It is therefore technically oriented, and we expect an audience with a technical interest in a cutting-edge technology called the Semantic Web. However, the author belongs to the User Centric Data Science (UCDS) group and would like to address a wide range of end users, because the project has many user-facing aspects. In addition, he works in an interdisciplinary field, especially Digital Humanities (DH). He therefore welcomes anybody who is interested in history, humanities, technology, science, and beyond. Potential target groups include:
- Humanities and social science researchers (historians, archaeologists, philosophers, economists)
- Data (science) experts (Semantic Web, Knowledge Engineering, Metadata, Wikidata)
- Data organizers and holders (library, archives, museums, universities)
- Web developers (data visualization, web design, Python)
- School teachers and students
- Tourists and tourism organizers (local guide, regional promoters)
- Creators and journalists (artists, musicians, game designers, writers, influencers)
The Start
The author started the project out of his long experience in Cultural Heritage (CH) and Humanities, and out of "frustration" with an emerging technical field called Linked Data (LD) (watch a great 10+ min video for beginners by @manusporny). The author's frustration resembled Tim Berners-Lee's story: Berners-Lee invented the World Wide Web out of his frustration over data management (his vision of LD). LD is a simple set of design principles for data on the web (the term also refers to the data that follows those principles), and it would take the web to a new level. It is no longer entirely new. It has been increasingly popularized over the last decades because it makes the web more database-oriented, so that we can ask questions and machines can answer them more easily
That sounds like ChatGPT! That is true, but while ChatGPT is based on Machine Learning with a Large Language Model (LLM), LD is based on datasets that are mostly designed and generated carefully by humans. Although inventions like ChatGPT have amazed us, one criticism is their black-box approach: it is not easy to know how the machine arrives at its information. Bias introduced by input data is another concern for Machine Learning. Because ChatGPT can give convincing answers through dialogue in a user-friendly interface, there is a risk of over-trusting it. LD, on the other hand, can be more trustworthy in the sense that we can know where the information comes from (if designed properly). The purposes are also quite different. ChatGPT is not meant to integrate data on the web; it is currently more of a "standalone tool". LD has a formalized mechanism for reasoning to answer your questions. Note also that the two technologies need not compete; they can complement each other. In fact, there is research on combining both
Technically speaking, LD allows us to create, publish, and share standardized structured data in a decentralized manner, so that datasets connect to each other through hyperlinks. It is a technology for moving from human-friendly "static" websites (documents) to a new form of web: "dynamic", machine-friendly, interlinked graph databases of our knowledge, called the Semantic Web. LD is designed to be its building block. While in Machine Learning machines "stealthily guess" answers to your question by employing probability theory and a large amount of text processing, in the Semantic Web machines "logically and semantically interpret" data through inference. In addition, LD can easily and smoothly integrate complicated and heterogeneous data scattered across the web. Consequently, we should be able to automate data processing on the web efficiently, which is not easy in many cases today. Great, right? However, there are still challenges in using LD. In the domain of CH and Humanities:
- LD is under-explored for research analysis
- Time and space are crucial across many sub-domains, but are under-represented in use cases
- SPARQL is widely used to query LD, but too complicated for many (ordinary) users
- Thus, many LD web applications over-simplify LD for end-users
- Distributed data is not integrated sufficiently
- Use cases of LD are often narrow in scope, and/or do not scale up beyond boundaries (domains, organizations, countries)
On top of these issues, the author feels that LD adoption in the web community in general is slow. Although these issues are also related to more fundamental problems, including LD quality and quantity, the author would like to address those challenges, assuming a reasonably favorable LD environment, in order to demonstrate a scenario for, and an incentive toward, moving forward to our near future
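To give a sense of why SPARQL can feel complicated to ordinary users, here is a minimal sketch of the kind of query a time-and-space search might need, built as a Python string. It is an illustration under stated assumptions, not ReKisstory's actual query: the helper name build_birth_query is hypothetical, and the Wikidata identifiers (P31 "instance of", Q5 "human", P19 "place of birth", P569 "date of birth", Q727 "Amsterdam") are chosen for the example. The string is only built, not sent; it could be submitted to the public Wikidata endpoint at https://query.wikidata.org/sparql, where the wd:, wdt:, wikibase:, and bd: prefixes are predefined.

```python
def build_birth_query(place_qid: str, start_year: int, end_year: int, limit: int = 50) -> str:
    """Return a SPARQL query for humans born in a place within a year range.

    Hypothetical helper for illustration; only CE years (>= 1) are handled.
    """
    if not (place_qid.startswith("Q") and place_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {place_qid!r}")
    if not 1 <= start_year <= end_year:
        raise ValueError("expected 1 <= start_year <= end_year")
    return f"""
SELECT ?person ?personLabel ?birth WHERE {{
  ?person wdt:P31 wd:Q5 ;          # instance of: human
          wdt:P19 wd:{place_qid} ; # place of birth
          wdt:P569 ?birth .        # date of birth
  FILTER(YEAR(?birth) >= {start_year} && YEAR(?birth) <= {end_year})
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
ORDER BY ?birth
LIMIT {limit}
""".strip()


# Example: people born in Amsterdam (Q727) in the 17th century.
query = build_birth_query("Q727", 1600, 1700)
print(query)
```

Even this small question requires knowing item and property identifiers, prefixes, and filter syntax, which is the kind of burden an auto-suggest interface can take off the user.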
The Major Research Questions
- How to demonstrate that LD could be a powerful framework for research analysis?
- How to explore time and space more interestingly with LD?
- How to make LD more approachable for users with limited or no LD experience, while preserving technical excitement for web experts?
- What would be a generic and flexible approach to integrate LD with a wide range of heterogeneous data distributed online and offline?
- What kind of domain agnostic tool could be developed to explore broad knowledge available in LD?
ReKisstory is my answer to these challenges, although it is far from perfect. I hope it offers a glimpse of the power of LD. This project also aims to attract more attention to LD within the web community, to increase its benefits to society
The Technology
The technology is based on widely accepted and cutting-edge standards
- Linked Data (especially Wikidata) and SPARQL for data sources and queries
- Python and its packages with Flask and Jinja for application logic and web development
- Flowbite and Tailwind CSS for frontend components and design
- External APIs (Reconciliation API for an easy user experience and MediaWiki API for Wikipedia data processing)
- GitHub version control and Docker image for app deployment
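As a small illustration of the MediaWiki API item in the list above, the sketch below builds a request URL for the plain-text introduction of a Wikipedia article. How ReKisstory itself calls the API is not stated here, so this is an assumption: the helper name wikipedia_intro_url is hypothetical, and the parameters shown (prop=extracts, exintro, explaintext) are those of the TextExtracts extension available on Wikipedia. Only the URL is constructed; no request is sent.

```python
from urllib.parse import urlencode


def wikipedia_intro_url(title: str, lang: str = "en") -> str:
    """Build a MediaWiki Action API URL for the plain-text intro of an article.

    Hypothetical helper for illustration only.
    """
    params = {
        "action": "query",
        "prop": "extracts",   # provided by the TextExtracts extension
        "exintro": 1,         # only the section before the first heading
        "explaintext": 1,     # plain text instead of HTML
        "titles": title,
        "format": "json",
    }
    return f"https://{lang}.wikipedia.org/w/api.php?{urlencode(params)}"


url = wikipedia_intro_url("Vincent van Gogh")
print(url)
```

Changing the lang argument (for example "ja" or "nl") targets another language edition of Wikipedia.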
Cool Concepts
- Rare serendipitous analytical tool for Linked Data useful for many fields
- Strong time-space interactive exploration capability suitable for Humanities research
- Unique data science capability by external data integration with generalized flexible methods and workflow
- Middle ground for any level of users (Auto-suggest user experience for complex expert query)
Cool Features
- Auto-suggest driven easy-for-everybody user interface
- Hyperlinks everywhere to find more information
- "Wikipedia-level" multilingual support for auto-suggest and search results
- Holistic search engine (initially started as a biography project, but any time-related entity can be searched)
- Items editable in timeline (move and remove)
- Item grouping in timeline (By item or relation)
- Item grouping on map (Find)
- Filter conditions for search (time, place, media only, or all except IDs) (Compare)
- Filtering of search results by a keyword and sorting the results in a column (Table view)
- Co-presentation of lifespan (dotted bar) and query result span (filled bar) (Find)
- Search overlap between two time durations or points in combination (rare for history web databases)
- Inclusion of context history (rare for history web databases)
- Before Common Era/Before Christ (BCE/BC) support in timeline (rare in standard computer applications)
- Time calculation across three dates (start, end, target date)
- Three data import streams by user (CSV import, REST API endpoint, SPARQL endpoint)
- Easy data mapping function for user imported data
- Combined view of timeline and statistics
- Provenance information: the source RDF statement (not just the HTML document) can be found. This can be used as an error checker for Wikidata
- Network view for implicit relations (Find)
- Display corresponding Wikipedia articles (Compare)
- Image gallery to list middle-quality images
- Geo location search based on map input (max 4 places) or place name (Geo Find)
- GPS/Wi-Fi detection of the user location to search items in the area (Geo Find)
- Reload downloaded search results (Compare)
- Software portability: there is no persistent database behind the application. All data is generated dynamically on the fly and stored only temporarily for CSV download
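Two of the features above, the overlap search between two time durations or points and the time calculation across three dates, can be sketched in a few lines. This is a minimal illustration at year precision, not the application's implementation: the Span class and the functions overlap and position_in_span are hypothetical names, and the reading of "three dates" as a span plus a target year is an assumption. A single point in time is modelled as a span whose start and end are equal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """A closed time span in whole years (start <= end)."""
    start: int
    end: int

    def __post_init__(self):
        if self.start > self.end:
            raise ValueError("start must not be after end")


def overlap(a: Span, b: Span) -> Optional[Span]:
    """Return the span shared by a and b, or None if they do not overlap."""
    start = max(a.start, b.start)
    end = min(a.end, b.end)
    return Span(start, end) if start <= end else None


def position_in_span(span: Span, target: int) -> dict:
    """Relate a target year to a span (three dates: start, end, target)."""
    return {
        "since_start": target - span.start,
        "until_end": span.end - target,
        "inside": span.start <= target <= span.end,
    }


rembrandt = Span(1606, 1669)  # lifespan of Rembrandt
vermeer = Span(1632, 1675)    # lifespan of Vermeer
shared = overlap(rembrandt, vermeer)
print(shared)                             # Span(start=1632, end=1669)
print(position_in_span(rembrandt, 1642))  # 1642: year of The Night Watch
```

Here the two lifespans overlap from 1632 to 1669, and the target year 1642 falls 36 years after the start and 27 years before the end of the first span.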
Limitations
- Waiting time for query results is much longer than other databases (due to heavy SPARQL)
- Inferences are limited so far for performance reasons (cf. query optimization)
- Little or no control over external APIs
- Type-based auto-suggest is slower
- Customizability of visualization libraries is limited
- Quality of source data might not be satisfactory (Wikidata)
- Year-level time calculation (no month-level) for BC/BCE
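The BC/BCE limitation above reflects a general difficulty. Python's standard datetime module, for example, cannot represent years before 1 (datetime.MINYEAR is 1), and the historical calendar has no year 0. The sketch below shows one common workaround at year precision: converting to astronomical year numbering, where 1 BCE becomes 0 and 2 BCE becomes -1. Whether ReKisstory works this way is not stated; the function names to_astronomical and years_between are hypothetical.

```python
import datetime


def to_astronomical(year: int, era: str) -> int:
    """Convert a historical year to astronomical numbering (1 BCE -> 0, 2 BCE -> -1)."""
    if year < 1:
        raise ValueError("historical years start at 1; there is no year 0")
    era = era.upper()
    if era in ("BCE", "BC"):
        return 1 - year
    if era in ("CE", "AD"):
        return year
    raise ValueError(f"unknown era: {era!r}")


def years_between(start: tuple, end: tuple) -> int:
    """Whole years from start to end, each given as a (year, era) pair."""
    return to_astronomical(*end) - to_astronomical(*start)


print(datetime.MINYEAR)                        # 1: datetime cannot go earlier
print(years_between((44, "BCE"), (14, "CE")))  # 57
print(years_between((1, "BCE"), (1, "CE")))    # 1
```

A naive calculation of 44 + 14 would give 58 for the first example; the conversion removes the off-by-one error caused by the missing year 0.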
Future Work
- Pick up items from Find and move to Compare section
- Member login
- Save and share search results
- Query optimization and semantic inferences
- AI integration
- Stable service and commercialization
- Research & Development collaboration
Potential Use Cases
ReKisstory can be used in many different ways. Here we list some ideas for use cases
If you have a good idea and/or use ReKisstory for interesting purposes, please share it with us and other users
- Find ideas for a museum exhibition
- Create school learning materials for history related subjects
- Write a blog about art history (fine art, rock 'n' roll, opera, literature, etc.)
- Find interesting spots and/or people around your house or school
- Navigate with GPS to find dolmens (it is not easy to find "hidden archaeological monuments" on site; the creator did this in Emmen in the Netherlands!)
- Find tourist attractions or cafes when travelling abroad
- Create a pub quiz for favorite footballers and/or teams
- List all manga characters who have superpowers
- Get inspiration for your cooking
- Discover name-sake items
- Identify the distribution of disasters for a scientific investigation
- Initiate your background research on transportation, before full-scale research
The Name and The Logo
ReKisstory is the name chosen for the project. In a nutshell, it is a combination of Reki + Kiss + History. We would like to explore history with love. The ReKisstory app aims to show the way to step into time and space more easily and freely.
What is Reki? It is the Japanese character "歴", representing experience and record. Chinese has essentially the same character: "歷" or "历". History in Japanese is "Rekishi (歴史)". The logo of ReKisstory is based on the character "歴". Note that ReKisstory also contains the word "history". We think it is also very important that users can build their own "stories" through "history".
Terms of Use
- All our data is sourced from Wikidata and Wikipedia. ReKisstory displays it on the fly and only stores the search results and user-imported data for the CSV download function
- Wikidata content is under the free license of Creative Commons CC-0. Wikipedia articles are under the Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0 DEED). Wikimedia Foundation Terms of Use should be respected for both
- Website content of ReKisstory Project is under the Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0 DEED)
- Code of ReKisstory application is not public so far
- For questions, feedback, bug-reports, and collaboration, please use the contact information below
- Technical support is limited. It may be provided on a case-by-case basis due to a lack of human resources
- ReKisstory is an experimental application. The author is developing it for a good cause. Please use it ethically and responsibly. He and Vrije Universiteit Amsterdam accept no liability for any damage caused directly or indirectly by the use of the application. Thank you for your understanding
- Please read the privacy statement of Vrije Universiteit Amsterdam
Contact
- Go Sugimoto (Vrije Universiteit Amsterdam)
- g.sugimoto@vu.nl
- ORCID: https://orcid.org/0000-0003-2646-6784