Big Data & Model Management
Research field: Big Data & Model Management
The working group Big Data & Model Management discusses research on and applications of big data and model management technologies. Model management aims at developing technologies and mechanisms to support the integration, merging, evolution, and matching of complex data models. This support is required for the management of complex, integrated, distributed, heterogeneous information systems. The basic concepts in model management are models, mappings, and operators. Models (e.g. an XML Schema, a schema of a relational database, or an ontology in OWL) describe the structure of data. Mappings represent relationships between elements of different models. Operators are operations on models and mappings (e.g. merging and matching of models, composition of mappings).
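The three concepts can be made concrete with a small Python sketch. The names `Model`, `Mapping`, and `compose` are hypothetical and chosen for illustration only; they are not taken from any system of the working group. A mapping is simplified here to a set of element-to-element correspondences, which is much weaker than the mappings used in actual model management research.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Model:
    """A data model, reduced to a name and a set of element names."""
    name: str
    elements: frozenset


@dataclass
class Mapping:
    """Correspondences between elements of a source and a target model."""
    source: Model
    target: Model
    pairs: set = field(default_factory=set)  # {(source_element, target_element)}


def compose(first: Mapping, second: Mapping) -> Mapping:
    """Operator: chain two mappings A->B and B->C into a mapping A->C."""
    if first.target != second.source:
        raise ValueError("mappings do not share a middle model")
    pairs = {
        (a, c)
        for (a, b) in first.pairs
        for (b2, c) in second.pairs
        if b == b2
    }
    return Mapping(first.source, second.target, pairs)


# Three toy models in different modelling languages (invented example data).
relational = Model("relational", frozenset({"person.name", "person.city"}))
xml = Model("xml", frozenset({"/person/name", "/person/address/city"}))
owl = Model("owl", frozenset({"Person.hasName"}))

rel_to_xml = Mapping(relational, xml, {
    ("person.name", "/person/name"),
    ("person.city", "/person/address/city"),
})
xml_to_owl = Mapping(xml, owl, {("/person/name", "Person.hasName")})

rel_to_owl = compose(rel_to_xml, xml_to_owl)
print(rel_to_owl.pairs)
```

Only `person.name` survives the composition, because `person.city` has no counterpart in the OWL model. Losing correspondences in this way is one reason why operators need a precise definition.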
The management of metadata is of particular importance for model management. Metadata is data about data and is becoming more and more important as distributed and heterogeneous information systems need to be integrated. Using a metadata-based approach in the design and implementation of an integrated information system increases the flexibility and adaptability of the system, as information about the structure of data models and their dependencies is not hidden in the source code of the system. Instead, this information is captured in semantically rich metadata models, which enable the (re)use of the information in various contexts. Furthermore, a semantically rich representation of data models supports the definition of model management operators.
The working group addresses the following topics in more detail.
Meta Database Systems
The ConceptBase system is a deductive, object-oriented meta database system. It is based on the conceptual modelling language O-Telos. The system is available free of charge for non-commercial purposes. Current work focuses on improving the graphical user interface (in particular, the graphical editor), on interfaces to other modelling languages (such as XML Schema or OWL) and to other database systems, and on the ongoing improvement of the ConceptBase kernel system.
Formal Representation of Models and Mappings
A fundamental problem of model management is a formal representation of models and mappings between these models. The formal representation should enable the definition of operators in an efficient and correct way. Our current goal is to define a generic meta model that is able to represent data models in various modeling languages (such as SQL DDL, XML Schema, or OWL).
Mappings between models are required for many operations in model management. Constructing such mappings manually can be tedious if the models contain thousands of elements. Therefore, (semi-)automatic mechanisms are required to support the creation of mappings. We are currently developing a system for the matching of ontologies.
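To illustrate what semi-automatic support can look like, the following sketch proposes mapping candidates purely from the string similarity of element labels, using `difflib.SequenceMatcher` from the Python standard library. This is a deliberately naive stand-in and not the matching approach of the working group; the function name `match_elements`, the threshold, and the sample labels are assumptions made for the example.

```python
from difflib import SequenceMatcher


def match_elements(source, target, threshold=0.7):
    """Propose (source, target, score) candidates by label similarity."""
    candidates = []
    for s in source:
        for t in target:
            # Case-insensitive similarity ratio in [0, 1].
            score = SequenceMatcher(None, s.lower(), t.lower()).ratio()
            if score >= threshold:
                candidates.append((s, t, round(score, 2)))
    # Best candidates first, so a human reviewer sees them at the top.
    return sorted(candidates, key=lambda c: c[2], reverse=True)


# Invented element labels from two hypothetical models.
source_labels = ["CustomerName", "PostalCode", "Phone"]
target_labels = ["customer_name", "postcode", "email"]

for s, t, score in match_elements(source_labels, target_labels):
    print(f"{s} ~ {t} ({score})")
```

The result is a ranked list of proposals rather than a finished mapping: a user still has to confirm or reject each candidate, which is the "semi" in semi-automatic. Realistic matchers typically also take structure, data types, and instance data into account.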
Quality-Oriented Data Integration
Within a company, data is managed in several systems with different data models and characteristics. However, an integrated view of the data is required to get a comprehensive overview of the state of the organization. This might also include the integration of external sources. In this context, the quality of the data in the various sources must also be considered. Several sources might provide the same data but with different quality characteristics (e.g. correctness, accuracy, response time of the source). An algorithm for quality-oriented data integration has been developed in this thesis. We currently plan an implementation of this algorithm using Semantic Web and Grid technologies (see below).
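The idea of choosing among sources that offer the same data with different quality characteristics can be sketched as a weighted score. This is not the algorithm from the thesis mentioned above; the class `Source`, the functions `quality_score` and `pick_source`, the weighting scheme, and all numbers are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    accuracy: float         # fraction of correct values, 0..1 (assumed known)
    response_time_s: float  # average response time in seconds


def quality_score(src, weights, max_response_time_s=5.0):
    """Combine two quality dimensions into one score in [0, 1]."""
    # Faster sources score higher; responses slower than the cap score 0.
    speed = max(0.0, 1.0 - src.response_time_s / max_response_time_s)
    total = weights["accuracy"] + weights["speed"]
    return (weights["accuracy"] * src.accuracy + weights["speed"] * speed) / total


def pick_source(sources, weights):
    """Return the source with the highest score for the given weights."""
    return max(sources, key=lambda s: quality_score(s, weights))


# Two hypothetical sources that provide the same data.
sources = [
    Source("crm_db", accuracy=0.98, response_time_s=2.0),
    Source("web_cache", accuracy=0.85, response_time_s=0.5),
]

# A user who values accuracy and a user who values speed get different answers.
print(pick_source(sources, {"accuracy": 0.8, "speed": 0.2}).name)
print(pick_source(sources, {"accuracy": 0.2, "speed": 0.8}).name)
```

With accuracy weighted heavily, the slower but more accurate `crm_db` is chosen; with speed weighted heavily, `web_cache` is chosen. The point of the sketch is only that the best source depends on the quality requirements of the query, not on the data alone.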
The vision of the Semantic Web is to have semantic annotations for the data that is available on the web. This supports the search and integration of the data, as the data can be located by its semantic description and not only by its syntactic representation (e.g. keyword-based search). Data integration and data quality are also problems in this context. Grid and P2P systems can be seen as the underlying technologies that enable the implementation of distributed information systems based on the idea of the Semantic Web. Research in this field is currently done in the context of the EU IST project SEWASIE.
Joining the Working Group
If you are interested in joining the working group on Model Management, please contact Christoph Quix. Students can join the group as a student assistant (HiWi) or write their bachelor/diploma/master thesis in this research area. Seminars and lab courses are planned for future terms.
The following topics are currently available for a bachelor/diploma/master thesis:
We are looking for a student research assistant (HiWi) who will support us in improving our model management prototype GeRoMeSuite:
The results of this research have been integrated into the model management prototype system GeRoMeSuite, which was presented at VLDB '07.