
Informatik 5
Information Systems
Prof. Dr. M. Jarke

HiWi - Web Crawling and Mining

Job type: HiWi
Extent: 15-19h
Status: Open
Contact: Dr. Michael Derntl
Contact: PD Dr. Ralf Klamma, AOR

We are looking for a student interested in tools and crawlers operating on web data and relational databases.

We develop and host so-called Mediabases, which offer tools for working with and analyzing the media resources (e.g. blogs, papers, websites, forums, mailing lists) of communities of practice on the web. These media resources are crawled from the web, stored in the Mediabase databases, and updated every day.

Because the representations of these resources keep evolving (e.g. new possibilities with HTML5), the crawlers face many new challenges and must be adapted accordingly.

Your task will be to fix and further develop these crawlers and the related tools that operate on the Mediabase data. You will have the opportunity to work and learn in a stimulating environment, and the tools you develop will be publicly available on the web to large communities of users all over the world.

Prerequisites

We expect strong practical skills in Python and SQL. Proficiency in Perl is a plus.
