
Informatik 5
Information Systems
Prof. Dr. M. Jarke


Entity Recognition in Information Extraction

Thesis type: Master
Status: Running
Proposal on: 05. Feb 2013 16:45
Proposal room: Seminarraum I5
Supervisor(s)
Advisor(s)

Dataspaces are composed of heterogeneous data sources: structured, unstructured and partially structured. This heterogeneity makes it harder for users to interact with a dataspace, and they may struggle to fulfil their information needs in such an environment. The quality of the information coming from the different sources therefore plays an important role. A prevalent problem in dataspaces is that the information contained in the heterogeneous data sources that compose a dataspace cannot easily be reconciled.

The goal of this thesis is to address the entity recognition problem in the context of an information extraction system. The system extracts structured triples of the form (subject, predicate, object) from unstructured text documents. To make the triples more useful, the subjects and/or objects of such triples need to be linked to identified entities.
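To make the linking step concrete, the following is a minimal Java sketch of a dictionary-based linker. It is not the method proposed for the thesis. The class name `EntityLinkingSketch`, the alias dictionary, and entity identifiers such as `entity:rwth` are all hypothetical and chosen only for illustration. A real system would also need to handle ambiguous and unknown mentions.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class EntityLinkingSketch {

    /** A (subject, predicate, object) triple as produced by the extraction step. */
    public static final class Triple {
        public final String subject;
        public final String predicate;
        public final String object;

        public Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        @Override
        public String toString() {
            return "(" + subject + ", " + predicate + ", " + object + ")";
        }
    }

    // Hypothetical dictionary: normalized surface form -> entity identifier.
    private final Map<String, String> dictionary = new HashMap<>();

    /** Registers a surface form (alias) for an entity identifier. */
    public void addAlias(String surfaceForm, String entityId) {
        dictionary.put(normalize(surfaceForm), entityId);
    }

    /** Trims, lower-cases, and collapses whitespace so trivially different mentions match. */
    public static String normalize(String mention) {
        return mention.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
    }

    /** Returns the entity identifier for a mention, or null if the mention is unknown. */
    public String link(String mention) {
        return dictionary.get(normalize(mention));
    }

    /** Replaces subject and object with entity identifiers where a link exists. */
    public Triple linkTriple(Triple t) {
        String s = link(t.subject);
        String o = link(t.object);
        return new Triple(s != null ? s : t.subject, t.predicate, o != null ? o : t.object);
    }

    public static void main(String[] args) {
        EntityLinkingSketch linker = new EntityLinkingSketch();
        // Illustrative aliases only; the identifiers are made up for this example.
        linker.addAlias("RWTH Aachen", "entity:rwth");
        linker.addAlias("RWTH Aachen University", "entity:rwth");
        linker.addAlias("Aachen", "entity:aachen");

        Triple raw = new Triple("rwth  aachen university", "is located in", "Aachen");
        System.out.println(linker.linkTriple(raw));
    }
}
```

Unknown mentions are left unchanged rather than dropped, so a triple is never lost just because no entity could be identified for it.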

Prerequisites

  • Good programming skills in Java
  • Background in information retrieval, relational databases, statistics and probability theory
  • Knowledge of information extraction or natural language processing is a plus
