Schema Generation from Unstructured Data
| Thesis type | |
|---|---|
| Student | Katrin Landwehr |
| Status | Finished |
| Submitted in | 2010 |
| Proposal on | 01. Dec 2009 15:30 |
| Proposal room | Seminarraum I5 |
| Presentation on | 10. Sep 2010 10:45 |
| Presentation room | Seminarraum I5 |
| Supervisor(s) | |
| Advisor(s) | |

This diploma thesis focuses on automatically generating a schema over a collection of unstructured documents. The schema should give users an overview of the information available in the domain of interest and let them pose structured queries by exploiting formerly inaccessible semantics of the data.

The input to the generation process consists of triples conveying relationships between entities encountered in the texts, and unary predicates specifying type information on these entities. Additionally, some queries on the text set may already be known and can be used in creating the schema. However, no schema or partial schema is known prior to generation, the domains are completely unknown, and the process itself should produce acceptable candidate schemata without any human supervision.
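
To make the input format concrete, the following is a minimal sketch of how relationship triples and unary type predicates could be combined into candidate relation signatures. It is not the method developed in the thesis: the names `Triple`, `candidate_schema`, and `min_support`, the sample entities, and the simple support-counting heuristic are all assumptions chosen for illustration.

```python
from collections import Counter
from typing import NamedTuple


class Triple(NamedTuple):
    """Hypothetical container for one extracted relationship."""
    subject: str
    relation: str
    obj: str


def candidate_schema(triples, entity_types, min_support=1):
    """Count typed relation signatures and keep those with enough support.

    entity_types maps an entity to the set of types asserted for it by
    unary predicates. This counting heuristic is an illustrative
    assumption, not the thesis's algorithm.
    """
    counts = Counter()
    for t in triples:
        # An entity may carry several type predicates, or none at all.
        for s_type in entity_types.get(t.subject, ()):
            for o_type in entity_types.get(t.obj, ()):
                counts[(s_type, t.relation, o_type)] += 1
    return {sig: n for sig, n in counts.items() if n >= min_support}


# Invented sample input: three triples plus type information.
triples = [
    Triple("Ada Lovelace", "born_in", "London"),
    Triple("Alan Turing", "born_in", "London"),
    Triple("Alan Turing", "worked_at", "Bletchley Park"),
]
entity_types = {
    "Ada Lovelace": {"Person"},
    "Alan Turing": {"Person"},
    "London": {"City"},
    "Bletchley Park": {"Organization"},
}

schema = candidate_schema(triples, entity_types, min_support=2)
```

With `min_support=2`, only the signature `("Person", "born_in", "City")` survives in this sample, because it is backed by two triples. A support threshold of this kind is one simple way to filter candidates without human supervision, but it is only one of many possible criteria.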

