WissKI
What is an ontology in computer science and knowledge representation?
Gruber’s widely used definition of an ontology as a “specification of a conceptualization” [1] shows that, in the field of knowledge representation, an ontology is a concrete system, in contrast to the philosophical use of the term. The Digital Solutions Team understands ontologies in the sense used in the context of the Semantic Web: a system that orders the concepts of some domain in a hierarchical structure. These concepts and their instances can have specific properties and can be related to each other.
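The idea of concepts in a hierarchy, with instances and relations between them, can be illustrated with a minimal sketch. All concept, instance and property names below are hypothetical and are not taken from any ontology used by the team.

```python
# Hypothetical mini-ontology: all names are illustrative assumptions.
# Each concept points to its direct superconcept.
SUBCLASS_OF = {
    "Person": "Actor",
    "Actor": "Entity",
    "Place": "Entity",
    "Event": "Entity",
    "Interview": "Event",
}

# Each instance is typed with exactly one concept in this sketch.
INSTANCE_OF = {
    "interview_1": "Interview",
    "place_1": "Place",
}

# Relations between instances, stored as (subject, property, object).
RELATIONS = [
    ("interview_1", "took_place_at", "place_1"),
]


def ancestors(concept):
    """Return all superconcepts of a concept, nearest first."""
    chain = []
    while concept in SUBCLASS_OF:
        concept = SUBCLASS_OF[concept]
        chain.append(concept)
    return chain


def is_a(instance, concept):
    """True if the instance belongs to the concept directly or via the hierarchy."""
    direct = INSTANCE_OF[instance]
    return concept == direct or concept in ancestors(direct)
```

Because of the hierarchy, `is_a("interview_1", "Event")` holds even though the instance was only declared as an `Interview`.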
About WissKI
We are currently experimenting with the open-source virtual research environment WissKI, which was developed by a joint team from Friedrich-Alexander-Universität Erlangen-Nürnberg, Germanisches Nationalmuseum and Zoologisches Forschungsmuseum Alexander Koenig Bonn. The English translation of the project’s name, “scientific communication infrastructure”, highlights its aims. Because it is built on the content management system Drupal, it supports a wide variety of configurations and add-ons.
What is a graph-based database and how does it differ from a relational database?
At its beginning, the World Wide Web consisted of HTML files. These files contained hyperlinks that pointed to other pages and in this way created the web. In contrast, the W3C aims to establish a web of data that organizes its contents in a graph. Furthermore, machines should be able to read these contents and process their meaning. For this reason, the Semantic Web Stack builds on standardized, machine-readable formats, with XML as one of its original base layers. The more advanced languages of the stack, RDF and OWL, are designed for modelling graphs and ontologies.
In graph-based databases, the world is modelled in a subject-predicate-object structure. Each predicate connects a subject with its object. This implies directed and labelled edges, because the machine-readable meaning of each connection has to be included in order to obtain a meaningful graph. As the number of nodes and edges increases, a graph unfolds. One can explore this graph and possibly discover new connections between entities. A user can either run structured SPARQL queries or simply follow the edges from node to node, e.g. after creating a visualization of the graph. The latter in particular relates strongly to the idea of serendipity.
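A minimal sketch of the subject-predicate-object structure, written in plain Python rather than against a real triple store. The data and the helper names `match` and `neighbours` are assumptions made for illustration; an actual system would store RDF and answer SPARQL queries instead.

```python
# Hypothetical example data: every identifier below is made up for illustration.
TRIPLES = {
    ("alice", "participated_in", "interview_1"),
    ("bob", "participated_in", "interview_1"),
    ("interview_1", "took_place_at", "place_1"),
    ("alice", "born_in", "place_1"),
}


def match(subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        (s, p, o)
        for (s, p, o) in TRIPLES
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    )


def neighbours(node):
    """Follow the outgoing edges of a node by one hop."""
    return [(p, o) for (_, p, o) in match(subject=node)]


# Structured querying: roughly what a SPARQL pattern such as
#   ?who :participated_in :interview_1
# asks for.
participants = [s for (s, _, _) in match(predicate="participated_in", obj="interview_1")]

# Exploratory browsing: start at one node and follow its edges.
alice_edges = neighbours("alice")
```

The first call corresponds to the structured query route, the second to following edges through the graph by hand.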
An important difference between graph-based and relational databases lies in how they model a domain. Relational databases work on tables, with each table describing a limited dataset. Several tables and their contents can be combined with JOIN operations. If you want to extend the data model, there are two possibilities: either you add a column to an existing table or you create a new table. In graph-based systems, you can simply connect new (sub-)graphs to an existing system by creating a connection, i.e. an edge, between two nodes.
Generally, the graph-based approach is better suited for evolving systems, as the structure of relational databases is more static. Furthermore, retrieval that follows many relationships can be more efficient in graphs, because edges are traversed directly instead of being reconstructed through JOIN operations.
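The contrast between the two approaches can be sketched with Python’s built-in `sqlite3` module on one side and a plain set of triples on the other. The table names, column names and data are hypothetical and only serve to show where a schema change is needed.

```python
import sqlite3

# Relational side: an in-memory database with three small tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE event (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE participation (person_id INTEGER, event_id INTEGER);
    INSERT INTO person VALUES (1, 'Alice');
    INSERT INTO event VALUES (1, 'Interview 1');
    INSERT INTO participation VALUES (1, 1);
    """
)

# Retrieval combines the tables with JOIN operations.
rows = conn.execute(
    """
    SELECT person.name, event.label
    FROM person
    JOIN participation ON participation.person_id = person.id
    JOIN event ON event.id = participation.event_id
    """
).fetchall()

# Extending the relational model means changing the schema first.
conn.execute("ALTER TABLE event ADD COLUMN place TEXT")
conn.execute("UPDATE event SET place = 'Place 1' WHERE id = 1")
place = conn.execute("SELECT place FROM event WHERE id = 1").fetchone()[0]

# Graph side: the same facts as edges; a new kind of fact is just one more edge.
graph = {("Alice", "participated_in", "Interview 1")}
graph.add(("Interview 1", "took_place_at", "Place 1"))
```

In the relational version the new `place` information requires an `ALTER TABLE` statement, whereas the graph version only gains an additional edge.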
Workflow
In order to create a digital research environment for the “Cluster of Excellence Africa Multiple”, the development of an ontology for structuring research data is a crucial step. This challenging task is tackled step by step in manageable units, so the Digital Solutions Team starts with the development of a basic structure. Some concepts are similar across most of the cluster’s research projects. A lot of data is collected about persons, geographical entities and points in time. Furthermore, from the point of view of knowledge representation, most research is about events. These events may include interviews, births, artistic performances, wars, floods and anything else that takes place in some manner.
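One way to picture such a basic structure is an event-centred model in which persons, places and points in time are attached to events. The following is a rough sketch under that assumption; the class and field names are hypothetical and do not describe the team’s actual ontology.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Person:
    name: str


@dataclass(frozen=True)
class Place:
    label: str


@dataclass
class Event:
    kind: str    # e.g. "interview", "birth", "flood"
    date: str    # assumed to be an ISO 8601 string; real data may need vaguer dates
    place: Place
    participants: list = field(default_factory=list)


def events_involving(person, events):
    """Return the events in which the given person took part."""
    return [event for event in events if person in event.participants]
```

Project-specific ontologies could then refine `Event` into more specialised kinds while sharing the same core concepts.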
It is necessary to become familiar in detail with the requirements of each project group while keeping the underlying abstract concepts in mind. The main idea is to collect information from individual projects, develop finer-grained ontologies for their needs and integrate these into an upper ontology. Currently, we are interviewing project groups and either working with existing data schemes or developing a scheme in cooperation with the researchers. As this is an iterative process, its output is continuously validated and adapted.
Reference
[1] Gruber, T.R.: A translation approach to portable ontology specifications. Knowledge Acquisition. 5, 199–220 (1993). https://doi.org/10.1006/knac.1993.1008