Informatik 5
Information Systems
Prof. Dr. M. Jarke

Contact

Prof. Dr. M. Jarke
RWTH Aachen
Informatik 5
Ahornstr. 55
D-52056 Aachen
Tel +49/241/8021501
Fax +49/241/8022321

Relaxed Functional Dependency Discovery in Heterogeneous Data Lakes

Year 2019

Functional dependencies are important for defining the constraints and relationships that every database instance has to satisfy. Relaxed functional dependencies (RFDs) can be used for data exploration and profiling in datasets with lower data quality. In this work, we present an approach for RFD discovery in heterogeneous data lakes. More specifically, the goal is to discover RFDs in structured, semi-structured, and graph data. Our solution is novel in the following aspects: (1) We introduce a generic metamodel to the problem of RFD discovery, which allows us to define and detect RFDs for data stored in heterogeneous sources in an integrated manner. (2) We apply clustering techniques during RFD discovery for partitioning and pruning. (3) We perform an intensive evaluation with nine datasets, which shows that our approach is effective for discovering meaningful RFDs, reducing redundancy, and detecting inconsistent data.
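
To make the notion of a relaxed dependency concrete, the following is a minimal sketch of one common relaxation: a dependency that may be violated by a bounded fraction of tuples, measured with the g3 error (the smallest fraction of rows that would have to be removed for the dependency to hold exactly). This is a generic illustration and not the discovery approach of the paper; the function names `g3_error` and `holds_relaxed`, the `max_error` threshold, and the sample rows are all assumptions made up for this example.

```python
from collections import Counter, defaultdict


def g3_error(rows, lhs, rhs):
    """Fraction of rows that must be removed so that lhs -> rhs holds exactly."""
    if not rows:
        return 0.0
    # For every distinct LHS value combination, count the RHS values seen with it.
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        groups[key][row[rhs]] += 1
    # In each group we can keep at most the rows sharing the most frequent RHS value.
    kept = sum(max(counts.values()) for counts in groups.values())
    return (len(rows) - kept) / len(rows)


def holds_relaxed(rows, lhs, rhs, max_error=0.0):
    """True if lhs -> rhs holds up to the tolerated error (0.0 means an exact FD)."""
    return g3_error(rows, lhs, rhs) <= max_error


# Hypothetical sample data: one row contains a misspelled city name.
rows = [
    {"zip": "52056", "city": "Aachen"},
    {"zip": "52056", "city": "Aachen"},
    {"zip": "52056", "city": "Aachen"},
    {"zip": "52056", "city": "Achen"},
    {"zip": "10115", "city": "Berlin"},
]

print(g3_error(rows, ["zip"], "city"))
print(holds_relaxed(rows, ["zip"], "city"))
print(holds_relaxed(rows, ["zip"], "city", max_error=0.25))
```

On this sample, the exact dependency zip -> city fails because of the single misspelled row, while the relaxed version with a tolerance of 0.25 still holds. This is the kind of situation in which a relaxed dependency remains useful for profiling lower-quality data, and the violating row is a candidate for an inconsistency.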

Details

The 38th International Conference on Conceptual Modeling (ER 2019)
