The Project

The Mission

ReKisstory Project is a project of Go Sugimoto with a mission to:

help people research human knowledge easily and effectively
make it fun to explore time and space data
demonstrate the power of data science and data integration on the web

To achieve these goals, he has created a simple, user-friendly search engine tool, "ReKisstory". It lets users search a large-scale database of human knowledge and analyze a broad spectrum of data with a special focus on time and space. Search results are presented in interactive visualizations (table, graph, timeline, map, and network). However, this is just one building block of what we envision. If you are interested in the bigger vision, please check out our blog post.

The Target Audience

It is a research project in a Computer Science department. It is therefore technically oriented, and we expect an audience familiar with a cutting-edge technology called the Semantic Web. However, the author belongs to the User Centric Data Science (UCDS) group, so he would like to address a wide range of end-users, because the project has many user-facing aspects. In addition, he works in an interdisciplinary field, especially Digital Humanities (DH). He therefore welcomes anybody who is interested in history, humanities, technology, science, and beyond. The main potential target groups are:

Humanities and social science researchers (historians, archaeologists, philosophers, economists)
Data (science) experts (Semantic Web, Knowledge Engineering, Metadata, Wikidata)
Data organizers and holders (library, archives, museums, universities)
Web developers (data visualization, web design, Python)
School teachers and students
Tourists and tourism organizers (local guide, regional promoters)
Creators and journalists (artists, musicians, game designers, writers, influencers)

The Start

The author started the project out of his long experience in Cultural Heritage (CH) and Humanities and his "frustration" with an emerging technical field called Linked Data (LD) (watch a great 10+ min video for beginners by @manusporny). His frustration was just like Tim Berners-Lee's story: he invented the World Wide Web out of his frustration over data management (his vision of LD). LD is a simple set of design principles for data on the web (the term also refers to data that follows those principles), which would take the web to a new level. It is no longer entirely new. It has become increasingly popular over the last decades, because it makes the web more database-oriented, so that we can ask questions and machines can answer them more easily.

That sounds like ChatGPT! That's true, but while ChatGPT is based on Machine Learning with a Large Language Model (LLM), LD is based on datasets that we humans have mostly designed and generated carefully. Although inventions like ChatGPT have amazed us, one criticism is their black-box approach: it is not easy to know how the machine obtains its information. Biases caused by input data are another concern for Machine Learning. Because ChatGPT can give convincing answers through dialogue in a user-friendly interface, there is a risk of over-trusting it. LD, on the other hand, can be more trustworthy in the sense that we can know where the information comes from (if designed properly). In addition, the purposes are quite different. ChatGPT is not meant to integrate data on the web; it is currently rather a "standalone tool". LD has a formalized mechanism for reasoning to answer your questions. Note also that the two technologies may not compete but complement each other. In fact, there is research on combining both.

Technically speaking, LD allows us to create, publish, and share standardized structured data in a decentralized manner, with datasets connected to each other through hyperlinks. It is a technology for moving from human-friendly "static" websites (documents) to a new form of web: "dynamic", machine-friendly, interlinked graph databases of our knowledge, called the Semantic Web. LD is designed to be its building block. In Machine Learning, machines "stealthily guess" answers to your question by employing probability theory and large-scale text processing. In the Semantic Web, machines "logically and semantically interpret" data through inferences. In addition, LD can easily and smoothly integrate complicated and heterogeneous data scattered across the web. Consequently, we should be able to efficiently automate data processing on the web, which is not easy in many cases today. Great, right? However, there are still challenges in using LD. In the domain of CH and Humanities:

LD is under-explored for research analysis
Time and space are crucial across many sub-domains, but are under-represented in use cases
SPARQL is widely used to query LD, but too complicated for many (ordinary) users
Thus, many LD web applications over-simplify LD for end-users
Distributed data is not integrated sufficiently
Use cases of LD are often narrow in scope, and/or do not scale up beyond boundaries (domains, organizations, countries)

On top of these issues, the author feels that LD adoption in the web community in general has been slow. Although these issues are also related to more fundamental problems, including LD quality and quantity, the author would like to address the challenges above on the assumption that a reasonably good LD environment is available, in order to demonstrate a scenario of, and an incentive for, moving forward to our near future.
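The contrast drawn above between guessing and logical inference can be sketched in a few lines of Python. This is only an illustration and not how ReKisstory works internally: the identifiers (ex:Hokusai, ex:Painter, and so on), the TRIPLES set, and the helper names infer_types and types_of are all hypothetical. The sketch stores facts as (subject, predicate, object) triples and applies one standard RDFS rule: if a thing has a type, it also has every superclass of that type.

```python
# Hypothetical mini knowledge graph: each fact is a (subject, predicate, object) triple.
# The "ex:" identifiers are made up for illustration and are not real Wikidata IDs.
TRIPLES = {
    ("ex:Hokusai", "rdf:type", "ex:Painter"),
    ("ex:Painter", "rdfs:subClassOf", "ex:Artist"),
    ("ex:Artist", "rdfs:subClassOf", "ex:Person"),
    ("ex:Hokusai", "ex:bornIn", "ex:Edo"),
}


def infer_types(triples):
    """Return the triples plus every rdf:type fact implied by rdfs:subClassOf."""
    inferred = set(triples)
    changed = True
    while changed:  # repeat until no new fact can be derived
        changed = False
        for s, p, o in list(inferred):
            if p != "rdf:type":
                continue
            for s2, p2, o2 in list(inferred):
                if p2 == "rdfs:subClassOf" and s2 == o:
                    new_fact = (s, "rdf:type", o2)
                    if new_fact not in inferred:
                        inferred.add(new_fact)
                        changed = True
    return inferred


def types_of(entity, triples):
    """List all types of an entity, sorted alphabetically."""
    return sorted(o for s, p, o in triples if s == entity and p == "rdf:type")


ALL_FACTS = infer_types(TRIPLES)
print(types_of("ex:Hokusai", TRIPLES))    # only the stated type
print(types_of("ex:Hokusai", ALL_FACTS))  # stated and inferred types
```

Nobody stated that ex:Hokusai is an ex:Person, yet the machine can derive it, and every derived fact can be traced back to the explicit triples it came from.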

The Major Research Questions

How to demonstrate that LD could be a powerful framework for research analysis?
How to explore time and space more interestingly with LD?
How to make LD more approachable for users with limited or no LD experience, while preserving technical excitement for web experts?
What would be a generic and flexible approach to integrate LD with a wide range of heterogeneous data distributed online and offline?
What kind of domain agnostic tool could be developed to explore broad knowledge available in LD?

ReKisstory is the author's answer to these challenges, although it is far from perfect. The hope is that it offers a glimpse of the power of LD. This project also aims to attract more attention to LD within the web community, in order to increase its benefits to society.

The Technology

The technology is based on widely accepted and cutting-edge standards

Linked Data (especially Wikidata) and SPARQL for data sources and queries
Python and its packages with Flask and Jinja for application logic and web development
Flowbite and Tailwind CSS for frontend components and design
External APIs (Reconciliation API for easy User Experience and MediaWiki API for Wikipedia data processing)
GitHub version control and Docker image for app deployment
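To give a feel for the first item in the list, here is a minimal sketch of querying Wikidata with SPARQL from Python, using only the standard library. It is not ReKisstory's actual code, which is not public. The function names build_query, flatten_bindings, and run_query, the User-Agent string, and the SAMPLE response are assumptions made for illustration. The endpoint URL and the identifiers P31 (instance of), Q5 (human), P19 (place of birth), and P569 (date of birth) are real Wikidata terms.

```python
import json
import re
import urllib.parse
import urllib.request
from typing import Dict, List

WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(place_qid: str, limit: int = 10) -> str:
    """Build a SPARQL query for humans born in the given place."""
    if not re.fullmatch(r"Q\d+", place_qid):
        raise ValueError("expected a Wikidata item ID such as 'Q727'")
    return f"""
SELECT ?person ?personLabel ?birth WHERE {{
  ?person wdt:P31 wd:Q5 ;
          wdt:P19 wd:{place_qid} ;
          wdt:P569 ?birth .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {int(limit)}
""".strip()


def flatten_bindings(result: dict) -> List[Dict[str, str]]:
    """Turn a SPARQL JSON result into plain {variable: value} rows."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in result["results"]["bindings"]
    ]


def run_query(query: str) -> List[Dict[str, str]]:
    """Send the query to Wikidata (needs network access; not called below)."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        WIKIDATA_ENDPOINT + "?" + params,
        headers={"User-Agent": "rekisstory-doc-example/0.1"},  # hypothetical name
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return flatten_bindings(json.load(response))


# Hypothetical response in the standard SPARQL JSON results layout.
SAMPLE = {
    "head": {"vars": ["person", "personLabel", "birth"]},
    "results": {"bindings": [{
        "person": {"type": "uri", "value": "http://www.wikidata.org/entity/Q_EXAMPLE"},
        "personLabel": {"type": "literal", "value": "Example Person"},
        "birth": {"type": "literal", "value": "1900-01-01T00:00:00Z"},
    }]},
}

print(build_query("Q727", limit=5))  # Q727 is Amsterdam
print(flatten_bindings(SAMPLE))
```

Even this small query shows why SPARQL can be intimidating for ordinary users, and why an auto-suggest interface that hides it is attractive.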

Cool Concepts

Rare serendipitous analytical tool for Linked Data useful for many fields
Strong time-space interactive exploration capability suitable for Humanities research
Unique data science capability by external data integration with generalized flexible methods and workflow
Middle ground for any level of users (Auto-suggest user experience for complex expert query)

Cool Features

Auto-suggest driven easy-for-everybody user interface
Hyperlinks everywhere to find more information
"Wikipedia-level" multilingual support for auto-suggest and search results
Holistic search engine (it started as a biography project, but any time-related entity can be searched)
Items editable in timeline (move and remove)
Item grouping in timeline (By item or relation)
Item grouping on map (Find)
Filter conditions for search (time, place, media only, or all except IDs) (Compare)
Filtering of search results by a keyword and sorting the results in a column (Table view)
Co-presentation of lifespan (dotted bar) and query result span (filled bar) (Find)
Search overlap between two time durations or points in combination (rare for history web databases)
Inclusion of context history (rare for history web databases)
Before Common Era/Before Christ (BCE/BC) support in timeline (rare in standard computer applications)
Time calculation across three dates (start, end, target date)
Three data import streams by user (CSV import, REST API endpoint, SPARQL endpoint)
Easy data mapping function for user imported data
Combined view of timeline and statistics
Provenance information: the source RDF statement (not just the HTML document) can be found. This can be used as an error checker for Wikidata
Network view for implicit relations (Find)
Display corresponding Wikipedia articles (Compare)
Image gallery to list medium-quality images
Geo location search based on map input (max 4 places) or place name (Geo Find)
GPS/Wi-Fi detection of the user's location to search items in the area (Geo Find)
Reload downloaded search results (Compare)
Software portability: there is no persistent database behind the application. All data is generated dynamically on the fly and stored only temporarily for CSV download
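Two of the features above, the overlap search between two time durations or points and the time calculation across three dates, can be sketched with plain year arithmetic. This is one plausible reading of those features and not the application's actual logic. The names Span, overlap, and relate are hypothetical, and the year values are illustrative only.

```python
from typing import NamedTuple, Optional


class Span(NamedTuple):
    """A duration in whole years; a single point in time has start == end."""
    start: int  # first year, inclusive
    end: int    # last year, inclusive


def overlap(a: Span, b: Span) -> Optional[Span]:
    """Return the years shared by two spans, or None if they never meet."""
    start = max(a.start, b.start)
    end = min(a.end, b.end)
    return Span(start, end) if start <= end else None


def relate(span: Span, target: int) -> dict:
    """Relate a target year to the start and end of a span."""
    return {
        "since_start": target - span.start,  # e.g. age at the target year
        "until_end": span.end - target,      # years left after the target year
        "inside": span.start <= target <= span.end,
    }


# Illustrative values only.
lifespan = Span(1760, 1849)
event = Span(1787, 1837)
point = Span(1800, 1800)

print(overlap(lifespan, event))             # duration against duration
print(overlap(lifespan, point))             # duration against point
print(overlap(event, Span(1900, 1910)))     # no shared years
print(relate(lifespan, 1800))
```

Treating a point as a span whose start and end coincide keeps the overlap test the same for every combination of durations and points.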

Limitations

Waiting time for query results is much longer than other databases (due to heavy SPARQL)
Inferences are limited so far for performance reasons (cf. query optimization)
Little or no control over external APIs
Type-based auto-suggest is slower
Customizability of visualization libraries is limited
Quality of source data might not be satisfactory (Wikidata)
Year-level time calculation (no month-level) for BC/BCE
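The last limitation has a simple background: Python's standard datetime module cannot represent years before 1 CE, and the historical calendar has no year 0. A common workaround, sketched below, is to convert years to astronomical year numbering (1 BCE becomes 0, 2 BCE becomes -1) before doing arithmetic. The function names are hypothetical, and this is an assumption about one reasonable approach, not a description of ReKisstory's implementation.

```python
def to_astronomical(year: int, era: str) -> int:
    """Convert a historical year (1 or greater) plus era to astronomical numbering."""
    if year < 1:
        raise ValueError("historical years start at 1; there is no year 0")
    era = era.upper()
    if era in ("BC", "BCE"):
        return 1 - year  # 1 BCE -> 0, 2 BCE -> -1, ...
    if era in ("AD", "CE"):
        return year
    raise ValueError("unknown era: " + era)


def years_between(start, end) -> int:
    """Whole years from start to end, each given as a (year, era) pair."""
    return to_astronomical(*end) - to_astronomical(*start)


def format_year(astronomical_year: int) -> str:
    """Convert an astronomical year back to a historical label."""
    if astronomical_year < 1:
        return f"{1 - astronomical_year} BCE"
    return f"{astronomical_year} CE"


print(years_between((5, "BCE"), (5, "CE")))  # 9, not 10, because there is no year 0
print(years_between((1, "BCE"), (1, "CE")))  # 1
print(format_year(to_astronomical(44, "BC")))
```

Working at year level avoids the harder question of how months and days should be counted in ancient calendars, which fits the limitation stated in the list.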

Future Work

Pick up items from Find and move to Compare section
Member login
Save and share search results
Query optimization and semantic inferences
AI integration
Stable service and commercialization
Research & Development collaboration

Potential Use Cases

ReKisstory can be used in many different ways. Here we list some ideas for use cases

If you have a good idea and/or use ReKisstory for interesting purposes, please share it with us and other users

Find ideas for a museum exhibition
Create school learning materials for history-related subjects
Write a blog about art history (fine art, rock 'n' roll, opera, literature, etc.)
Find interesting spots and/or people around your house or school
Navigate with GPS to find dolmens (it is not easy to find "hidden archaeological monuments" on site. The creator did this in Emmen in the Netherlands!)
Find tourist attractions or cafes when travelling abroad
Create a pub quiz for favorite footballers and/or teams
List all manga characters who have superpowers
Get inspiration for your cooking
Discover namesake items
Identify the distribution of disasters for a scientific investigation
Initiate your background research on transportation, before full-scale research

The Name and The Logo

ReKisstory is the name chosen for the project. In a nutshell, it is a combination of Reki + Kiss + History. We would like to explore history with love. The ReKisstory app aims to show a way to step into time and space more easily and freely.

What is Reki? It is the Japanese character "歴", representing experience and record. Chinese has essentially the same character: "歷" or "历". History in Japanese is "Rekishi (歴史)". The logo of ReKisstory is based on the character "歴". Note that ReKisstory also contains the word "history". We think it is also very important that users can build their own "stories" through "history".

Terms of Use

All our data is sourced from Wikidata and Wikipedia. ReKisstory displays it on the fly and only stores the search results and user-imported data for the CSV download function
Wikidata content is under the free license of Creative Commons CC-0. Wikipedia articles are under the Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0 DEED). Wikimedia Foundation Terms of Use should be respected for both
Website content of ReKisstory Project is under the Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0 DEED)
Code of ReKisstory application is not public so far
For questions, feedback, bug-reports, and collaboration, please use the contact information below
Technical support is limited. It may be provided on a case-by-case basis due to a lack of human resources
ReKisstory is an experimental application. The author is developing it for a good cause. Please use it ethically and responsibly. He and Vrije Universiteit Amsterdam accept no liability for any damage caused directly or indirectly by the use of the application. Thank you for your understanding
Please read the privacy statement of Vrije Universiteit Amsterdam

Contact

Go Sugimoto (Vrije Universiteit Amsterdam)
g.sugimoto@vu.nl
ORCID: https://orcid.org/0000-0003-2646-6784