Informatik 5
Information Systems
Prof. Dr. M. Jarke

Contact

Prof. Dr. M. Jarke
RWTH Aachen
Informatik 5
Ahornstr. 55
D-52056 Aachen
Tel +49/241/8021501
Fax +49/241/8022321

Conjunctive Triple Queries Over Text Documents

Thesis type Master
Student Danilo Djordjevic
Status Finished
Submitted in 2012
Proposal on 05. Jul 2011 15:40
Proposal room Seminarraum I5
Presentation on 27. Mar 2012 16:00
Presentation room Seminarraum I5
Supervisor(s)
Advisor(s)

Structured relational databases have been highly successful for data management over the past three decades. However, with the advent of the internet, more and more information is presented as unstructured or semi-structured web text. Emails and many other documents on personal computers are also plain text. Such loosely structured data calls for data management facilities such as querying. On the one hand, today's widespread use of text documents is enabled by information retrieval (IR) techniques, e.g., keyword search. Modern search engines like Google follow a keyword-in, document-out paradigm: users have to follow links and navigate possibly large texts to locate the information they need. On the other hand, structured data stored in relational databases requires knowledge of the database schema to manipulate, and querying and updating are not friendly to casual users. Furthermore, there is no good technique for exploring the relationships between structured databases and plain text documents or web texts.

This thesis aims at processing conjunctive triple queries over text documents. The candidate will build on our existing prototype, which processes single semi-bound triple queries. The goal is to extend the prototype so that it can process a conjunction of triples. Both runtime information, such as the posed query, and the statistical profiles of the text corpora should be utilized. Within the project, we will investigate both effectiveness (e.g., precision and recall) and efficiency.
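
To illustrate what a conjunction of triples could look like, here is a minimal Java sketch. It is not the prototype's code: the (subject, predicate, object) representation, the `?` prefix for variables, the class and method names, and the example facts are all assumptions made for illustration. It also assumes the triples have already been extracted from text and match by exact string equality, which real text would rarely allow.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: evaluates a conjunction of triple patterns by joining variable bindings. */
class ConjunctiveTripleQuerySketch {

    /** A (subject, predicate, object) triple. In a pattern, terms starting with '?' are variables. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        String[] terms() {
            return new String[] {subject, predicate, object};
        }
    }

    static boolean isVariable(String term) {
        return term.startsWith("?");
    }

    /** Extends the binding so that the pattern matches the fact, or returns null on a mismatch. */
    static Map<String, String> match(Triple pattern, Triple fact, Map<String, String> binding) {
        Map<String, String> extended = new HashMap<>(binding);
        String[] p = pattern.terms();
        String[] f = fact.terms();
        for (int i = 0; i < 3; i++) {
            if (isVariable(p[i])) {
                // A variable must keep the value it was bound to by an earlier pattern.
                String bound = extended.putIfAbsent(p[i], f[i]);
                if (bound != null && !bound.equals(f[i])) {
                    return null;
                }
            } else if (!p[i].equals(f[i])) {
                // A bound (constant) term must equal the fact's term (assumed exact match).
                return null;
            }
        }
        return extended;
    }

    /** Nested-loop join: every pattern must match under one consistent set of bindings. */
    static List<Map<String, String>> evaluate(List<Triple> patterns, List<Triple> facts) {
        List<Map<String, String>> results = new ArrayList<>();
        results.add(new HashMap<>()); // start with a single empty binding
        for (Triple pattern : patterns) {
            List<Map<String, String>> next = new ArrayList<>();
            for (Map<String, String> binding : results) {
                for (Triple fact : facts) {
                    Map<String, String> extended = match(pattern, fact, binding);
                    if (extended != null) {
                        next.add(extended);
                    }
                }
            }
            results = next;
        }
        return results;
    }

    public static void main(String[] args) {
        // Invented example facts, standing in for triples extracted from text.
        List<Triple> facts = Arrays.asList(
                new Triple("Alice", "worksAt", "RWTH"),
                new Triple("RWTH", "locatedIn", "Aachen"),
                new Triple("Aachen", "locatedIn", "Germany"),
                new Triple("Bob", "worksAt", "MIT"));
        // Conjunction: (?p worksAt ?u) AND (?u locatedIn ?c)
        List<Triple> query = Arrays.asList(
                new Triple("?p", "worksAt", "?u"),
                new Triple("?u", "locatedIn", "?c"));
        for (Map<String, String> b : evaluate(query, facts)) {
            System.out.println(b.get("?p") + " works in " + b.get("?c"));
        }
    }
}
```

The shared variable `?u` is what makes the query conjunctive: the second pattern only matches facts consistent with the value bound by the first. The sketch joins the patterns in the order given; choosing a better order, for example from statistics about the corpus, is the kind of optimization the description above alludes to and is left out here.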

Prerequisites

Good programming skills in Java
Background in relational databases
Knowledge of information retrieval, information extraction, or natural language processing is a plus
