Big Data & Model Management
Big data is a buzzword that summarizes various aspects of handling large amounts of heterogeneous data. The goals are to perform efficient analytics and to derive new information from large collections of potentially heterogeneous data. Heterogeneity is an important issue in big data: data is not only large in volume and produced at high speed (velocity), it also comes in a high variety. This research group applies and extends technologies that have been developed in the context of data integration and model management. Research in model management aims at developing technologies and mechanisms to support the integration, merging, evolution, and matching of complex data models to support the management of complex, integrated, distributed, heterogeneous information systems.
Overview
The research group has long experience in developing systems and applications for handling complex, heterogeneous data. The model management system GeRoMeSuite has been developed as a platform for generic model management. This means that heterogeneous modeling languages (e.g., XML Schema, the Relational Data Model, OWL) are represented in a generic metamodel (GeRoMe) in order to enable the integration and mapping of models represented in different modeling languages.
In general, model management aims at developing technologies and mechanisms to support the integration, merging, evolution, and matching of complex data models. This support is required for the management of complex, integrated, distributed, heterogeneous information systems. Basic concepts in model management are models, mappings, and operators. Models describe the structure of data. Mappings represent relationships between elements from different models. Operators are operations on models and mappings (e.g., merging and matching of models, composition of mappings).
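The three basic concepts can be illustrated with a small Python sketch. It is a toy under strong simplifying assumptions and does not reflect the GeRoMeSuite implementation: a model is reduced to a named set of element names, a mapping to a set of element pairs, and the names `Model`, `Mapping`, `match`, `compose`, and `merge` as well as the sample models are hypothetical. The match operator here only compares normalized names, whereas real matchers also use structure, types, and instances.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    """A model describes the structure of data (here: just a set of element names)."""
    name: str
    elements: frozenset


@dataclass(frozen=True)
class Mapping:
    """A mapping relates elements of a source model to elements of a target model."""
    source: str
    target: str
    pairs: frozenset  # set of (source_element, target_element) tuples


def _normalize(element: str) -> str:
    # Assumption: names that differ only in case or underscores denote the same concept.
    return element.lower().replace("_", "")


def match(a: Model, b: Model) -> Mapping:
    """Toy match operator: pair up elements whose normalized names are equal."""
    pairs = frozenset(
        (x, y) for x in a.elements for y in b.elements
        if _normalize(x) == _normalize(y)
    )
    return Mapping(a.name, b.name, pairs)


def compose(m1: Mapping, m2: Mapping) -> Mapping:
    """Compose a mapping A->B with a mapping B->C into a mapping A->C."""
    if m1.target != m2.source:
        raise ValueError("mappings are not composable")
    pairs = frozenset(
        (x, z) for (x, y1) in m1.pairs for (y2, z) in m2.pairs if y1 == y2
    )
    return Mapping(m1.source, m2.target, pairs)


def merge(a: Model, b: Model, mapping: Mapping) -> Model:
    """Toy merge operator: union of both models, keeping one copy of matched elements."""
    matched_in_b = {y for (_, y) in mapping.pairs}
    return Model(f"{a.name}+{b.name}", a.elements | (b.elements - matched_in_b))


# Hypothetical models standing in for a relational schema, an XML schema, and an ontology.
relational = Model("relational", frozenset({"customer_id", "name", "email"}))
xml = Model("xml", frozenset({"customerId", "Name", "phone"}))
owl = Model("owl", frozenset({"CustomerID", "name"}))

rel_to_xml = match(relational, xml)
xml_to_owl = match(xml, owl)
rel_to_owl = compose(rel_to_xml, xml_to_owl)
merged = merge(relational, xml, rel_to_xml)
```

In this sketch, `rel_to_owl` is derived purely from the two existing mappings without comparing the relational and OWL models directly, which is the point of a composition operator.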
The management of metadata is of particular importance for information integration, model management, and big data applications. Metadata is data about data and provides semantics to heterogeneous data; only with a description does data become understandable and potentially more valuable as information. Furthermore, using a metadata-based approach in the design and implementation of an integrated information system increases the flexibility and adaptability of the system, as information about the structure of data models and their dependencies is not hidden in the source code of the system. Instead, this information is captured in semantically rich metadata models, which enable the (re)use of the information in various contexts. In addition, a semantically rich representation of data models supports the definition of model management operators.
The following topics are addressed in more detail in this research group:
- Big Data Architectures
- Systems to manage Big Data (Hadoop, NoSQL systems such as MongoDB, etc.)
- Scientific Data Management, especially in Life Science
- Schema Mapping & Matching
- Quality-oriented Data Integration
- Semantic Web
Software
Current projects
ConceptBase
A deductive object manager for meta databases
Dataspace Framework
Enabling integrated access to structured and unstructured data
HUMIT
Human-centered support for incremental, interactive data integration, using high-throughput processes in the life sciences as an example
Model Management
Working Group Model Management
Completed projects
Cooperative Cars - CoCar
Joint project between Ericsson in Aachen and Fraunhofer FIT
GeRoMeSuite
GeRoMeSuite is a prototype system for generic model management.
mi-Mappa
Systematic innovation management for medical technology: contexts and data mining decode patent data
MINIMUM: MergINg logIcal scheMas Using Mapping constraints
Merging multiple logical data schemas using schema mappings in the form of data dependencies.
P2P Information Management
Contextualized Peer-to-Peer Information Management in Wearable and Environmental Computing