This paper is available on arXiv under a CC 4.0 license.
Authors:
(1) Vogt, Lars, TIB Leibniz Information Centre for Science and Technology;
(2) Konrad, Marcel, TIB Leibniz Information Centre for Science and Technology;
(3) Prinz, Manuel, TIB Leibniz Information Centre for Science and Technology.
Table of Links
- Abstract & Introduction
- Interoperability
- Semantic interoperability and what natural languages like English can teach us
- Requirements for successfully communicating terms and statements
- Parallels between the structure of natural language statements and data schemata with implications for semantic interoperability
- What makes a term a good term and a schema a good schema?
- The need for a machine-actionable Rosetta Stone for (meta)data that acts as an interlingua for specifying reference terms and reference schemata to support cognitive and semantic interoperability
- Rosetta Stone and machine-readability: UPRIs, XML Schema datatypes, and RDF for communicating terms, datatypes, and statements
- Rosetta Stone and machine-interpretability: Wikidata and a modeling paradigm for (meta)data statements based on English
- Rosetta Stone and semantic interoperability: Specifying term mappings and schema crosswalks
- Rosetta Stone and cognitive interoperability: Specifying display templates and using a query builder
- Discussion
- Related work
- Conclusion, Acknowledgements, & References
Conclusion
Today, there is pressure to rapidly produce new and FAIR (meta)data that integrate with existing (meta)data sources from different domains. Knowledge graph management platforms must therefore support the rapid development of data management solutions that are suitable for everyday research, that demonstrate the added value of knowledge graph technologies, and that can be implemented by smaller projects with limited budgets or even by individual researchers for their personal research knowledge graph (71). This is all the more true given the value of manually authored knowledge from domain experts. According to Bradley’s brief history of knowledge engineering (72), the most valuable documentation of (meta)data comes from the researchers who created the (meta)data, and it is the task of knowledge engineers to make it as easy as possible for those researchers to document their (meta)data in as FAIR a manner as possible. We expect that the Rosetta Framework will make this task much easier.
Ideally, a knowledge graph management platform supports the needs of different categories of users. Domain experts are not interested in the internals of the application logic or the underlying data structure of the graph, nor in the additional information that has to be added to make (meta)data machine-actionable. They want to access the information in the graph in intuitive ways, with the (meta)data presentation reduced to the information they are currently interested in.
Expert users and data scientists, on the other hand, want to interact with the (meta)data, perhaps analyzing it using Jupyter Notebooks, R, or other data analysis tools. They require the (meta)data to be easily accessible in the formats they prefer, and they want the knowledge graph to provide convenient and efficient tools that support their data management operations.
Finally, data stewards, ontology engineers, and application engineers require that the knowledge graph infrastructure can be easily incorporated into the overall software stack of their organizations while they retain full control over the ethical, privacy, and legal aspects of their (meta)data (following Barend Mons’ data visiting as opposed to data sharing (19)). They want the platform to make their lives easier when they have to develop new targeted applications on top of the knowledge graph.
The main idea of the Rosetta Framework is to clearly distinguish between a generic data storage model, data display models, and data access/export models, and thereby to decouple the application data model from the display of data in the UI. In this way, the framework supports all of these needs. We argue that this decoupling is essential for solving many of the inherent problems of knowledge management systems. A Rosetta-driven FAIR knowledge graph application builds on the information provided by a set of statement type classes and their associated reference schemata and display templates. On this basis, it allows users to intuitively make statements about statements, provides input forms and human-readable data views in the UI, and enforces graph patterns for internal interoperability and machine-actionability of (meta)data. It also allows the specification of additional data schemata, together with corresponding schema crosswalks, for access and export, with the possibility of adding further schemata for newly emerging standards. Finally, it provides CRUD queries derived from the storage model of each statement type, which can be intuitively used, combined, and further specified using the Rosetta Query Builder for searching and exploring the graph.
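To make the separation of storage, display, and export concrete, the following is a minimal Python sketch. It is not the Rosetta Framework's implementation or API: the class names (`StatementType`, `Statement`), the functions (`render`, `export_json`, `export_csv`), the example statement, and the crosswalk field names are all assumptions introduced here for illustration only.

```python
import csv
import io
import json
from dataclasses import dataclass


@dataclass
class StatementType:
    """Hypothetical statement type: one storage model, one display template."""
    name: str
    slots: tuple            # storage model: ordered slot names
    display_template: str   # display model: str.format template over the slots


@dataclass
class Statement:
    type: StatementType
    values: dict

    def __post_init__(self):
        # Enforce the storage model: every slot filled, no extra keys.
        if set(self.values) != set(self.type.slots):
            raise ValueError(f"slots must be exactly {self.type.slots}")


def render(stmt):
    """Display model: human-readable view of a stored statement."""
    return stmt.type.display_template.format(**stmt.values)


def export_json(stmt, crosswalk):
    """Export model: rename slots to a target schema's field names."""
    record = {crosswalk[slot]: stmt.values[slot] for slot in stmt.type.slots}
    return json.dumps(record, sort_keys=True)


def export_csv(stmts, stmt_type):
    """Export model: one CSV row per statement, columns in slot order."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(stmt_type.slots)
    for stmt in stmts:
        writer.writerow([stmt.values[slot] for slot in stmt_type.slots])
    return buf.getvalue()


# Illustrative statement type and instance (made up for this sketch).
weight = StatementType(
    name="weight_measurement",
    slots=("object", "value", "unit"),
    display_template="{object} has a weight of {value} {unit}",
)
stmt = Statement(weight, {"object": "apple X", "value": "212.45", "unit": "g"})

# Hypothetical crosswalk to the field names of some external schema.
crosswalk = {"object": "measuredItem", "value": "amount", "unit": "unitCode"}

print(render(stmt))
print(export_json(stmt, crosswalk))
print(export_csv([stmt], weight), end="")
```

In this sketch, changing the display template or adding another crosswalk leaves the stored slot values untouched, which is the kind of decoupling the paragraph above argues for.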
With the low-code Rosetta Editor and an openly accessible reference schema repository, the Rosetta Framework provides a set of resources and tools that increase the overall cognitive interoperability of (meta)data and of knowledge graph applications. Domain experts can define their own Rosetta-driven FAIR knowledge graph applications, and developers can access and export their data as JSON, RDF, or CSV.
We think the time has come to focus not only on the machine-actionability of (meta)data but also on their human-actionability, and thus their human-friendliness, by creating knowledge graph applications that meet the requirements of cognitive interoperability. The Rosetta Framework is our suggestion for how to get there.
Acknowledgements
We thank Philip Strömert, Roman Baum, Björn Quast, Peter Grobe, István Míko, and Kheir Eddine for discussing some of the presented ideas. We are solely responsible for all the arguments and statements in this paper. This work was supported by the ERC H2020 Project ‘ScienceGraph’ (819536). We are also grateful to the taxpayers of Germany.