Informatik 5
Information Systems
Prof. Dr. M. Jarke
Contact

Prof. Dr. M. Jarke
RWTH Aachen
Informatik 5
Ahornstr. 55
D-52056 Aachen
Tel +49/241/8021501
Fax +49/241/8022321

Crawling and Scraping Scientific Events for Semantic Analysis

Thesis type
  • Bachelor
Status Open
Supervisor(s)
Many Web-based services for scientific events build on communication tools such as email lists, wikis, or blogs. Recently, some of these services have started to identify important semantic properties, such as topics, deadlines, venues, PC members, or organizers, using automatic semantic extraction tools. These tools are based on natural language processing and produce structured data, i.e. RDF triples. Combined with a data model of the scientific event domain, this data would make complex queries possible. Such queries have the potential to improve scientific communication by helping researchers place publications and events more effectively.

The task of this bachelor thesis is to create crawlers for communication tools such as email lists, wikis, or blogs, to build scraping tools on top of existing tools, to store the data in an existing structured database, and to realize a Web-based query interface and/or RESTful API. The bachelor thesis is embedded in ongoing projects at our chair.

The ideal candidate should be able to program in Java and JavaScript, have some knowledge of databases, and be interested in semantic technologies as well as Web services.
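
To give a rough idea of the structured output mentioned in the description, the following minimal Java sketch pulls a submission deadline out of a call-for-papers text and serializes it as one RDF statement in N-Triples syntax. It is only an illustration under stated assumptions: the regular expression stands in for the NLP-based extraction tools, and the class name `EventTripleSketch`, the namespace `http://example.org/event/`, the property `submissionDeadline`, and the sample text are all hypothetical and not taken from the chair's projects.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EventTripleSketch {
    // Hypothetical namespace for events and properties (an assumption, not a real vocabulary).
    static final String EX = "http://example.org/event/";
    // Standard XML Schema datatype IRI for calendar dates.
    static final String XSD_DATE = "http://www.w3.org/2001/XMLSchema#date";

    // Simplified stand-in for NLP-based extraction:
    // matches "Submission deadline: YYYY-MM-DD" (ISO dates only, case-insensitive).
    private static final Pattern DEADLINE = Pattern.compile(
            "(?i)submission\\s+deadline\\s*:\\s*(\\d{4}-\\d{2}-\\d{2})");

    // Returns the first ISO date following "Submission deadline:", if any.
    static Optional<String> extractDeadline(String text) {
        Matcher m = DEADLINE.matcher(text);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // Builds one N-Triples line: <subject> <predicate> "literal"^^<datatype> .
    // eventId is assumed to contain only IRI-safe characters.
    static String toNTriple(String eventId, String isoDate) {
        return "<" + EX + eventId + "> <" + EX + "submissionDeadline> \""
                + isoDate + "\"^^<" + XSD_DATE + "> .";
    }

    public static void main(String[] args) {
        // Made-up sample text, as it might appear on a mailing list.
        String cfp = "Call for Papers. Submission deadline: 2024-03-15. Venue: Aachen.";
        extractDeadline(cfp)
                .map(date -> toNTriple("example-workshop", date))
                .ifPresent(System.out::println);
    }
}
```

Statements in this form could be loaded into a triple store and queried together with other extracted properties such as venues or topics; a real extraction pipeline would need to handle far more varied date formats and phrasings than this pattern does.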

