
Informatik 5
Information Systems
Prof. Dr. M. Jarke

Contact

Prof. Dr. M. Jarke
RWTH Aachen
Informatik 5
Ahornstr. 55
D-52056 Aachen
Tel +49/241/8021501
Fax +49/241/8022321


Best Effort Schemaless Reference Reconciliation

Thesis type
  • Bachelor
  • Master
  • Diplom
Status Cancelled
Supervisor(s)

Information overload is a common symptom of the Internet age. Search engines help users find a "needle in a haystack". However, data-intensive applications increasingly demand not just an isolated piece of information but a collection of interlinked data elements. Furthermore, for machines to be able to interpret the information, it needs to be structured.

Our project is a first step towards the goal of extracting useful structured information from unstructured data. We aim to consolidate a collection of triples extracted from the web so that duplicates, whether explicit or implicit, are identified and merged. The thesis candidate is responsible for developing a scalable approach to reconciling natural-language triples using well-established algorithms from databases and information retrieval.
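To make the task more concrete, the following is a minimal sketch of what reconciling triples could look like. It is not the project's actual method. It assumes triples are (subject, predicate, object) strings, uses token-set Jaccard similarity as a stand-in for the "well-established algorithms" mentioned above, and merges matches with a union-find structure. All function names and the 0.8 threshold are hypothetical choices made for illustration, and Python is used only for brevity.

```python
import re
from itertools import combinations


def normalize(text):
    """Lowercase the text and return its alphanumeric tokens as a set."""
    return frozenset(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def triple_similarity(t1, t2):
    """Average Jaccard similarity over subject, predicate, and object."""
    scores = [jaccard(normalize(x), normalize(y)) for x, y in zip(t1, t2)]
    return sum(scores) / len(scores)


def reconcile(triples, threshold=0.8):
    """Group triples whose similarity reaches the (assumed) threshold."""
    parent = list(range(len(triples)))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Naive all-pairs comparison: quadratic, so only suitable for a sketch.
    for i, j in combinations(range(len(triples)), 2):
        if triple_similarity(triples[i], triples[j]) >= threshold:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[max(ri, rj)] = min(ri, rj)

    groups = {}
    for i, triple in enumerate(triples):
        groups.setdefault(find(i), []).append(triple)
    return list(groups.values())


triples = [
    ("Albert Einstein", "was born in", "Ulm"),
    ("albert einstein", "was born in", "Ulm."),
    ("Marie Curie", "won", "Nobel Prize"),
]
groups = reconcile(triples)
```

Here the first two triples differ only in case and punctuation, so they end up in the same group, while the third stays on its own. The all-pairs loop does not scale; a scalable approach would presumably first narrow the candidate pairs, for example by blocking or indexing, before comparing them.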

For more information, see the following attachment:

ba-ref-reconciliation.pdf — PDF document, 742Kb

Prerequisites

  • Experience in Java programming
  • Good command of the English language
  • Knowledge of algorithms and data structures
