
Informatik 5
Information Systems
Prof. Dr. M. Jarke


Prof. Dr. M. Jarke
RWTH Aachen
Informatik 5
Ahornstr. 55
D-52056 Aachen
Tel +49/241/8021501
Fax +49/241/8022321







Information about Diploma/Master thesis process

Open Theses  

An Editor for Prototype-based Knowledge Bases
Posted on 24. Oct 2016; Supervised by Prof. Dr. Stefan Decker; Advisor(s): Dr. Michael Cochez
Prototype ontologies are a new approach to knowledge representation. The task of the student is to create an editor for prototype-based ontologies, building on the prototype knowledge base code provided. The editor must be intuitive to use and give suggestions to the user. Further, it must show how the final values of the prototypes have been derived.
Posted on 21. Jun 2017; Supervised by Prof. Dr. Thomas Rose; Advisor(s): Thomas Osterland
Often noticed only as the technology behind the digital currency Bitcoin, blockchain is a novel protocol that allows distributed, secure storage of information and untampered execution of program code in trustless environments. Have you ever felt an intense desire to write a thesis about blockchain, or do you have a slight hope that blockchain is the one and only topic that touches your heart? Take your chance now! We are looking forward to hearing from you.
Comparing Communities and Topics in Wikipedias
Posted on 11. Sep 2017; Supervised by PD Dr. Ralf Klamma, AOR; Advisor(s): Bernhard Göschlberger, MLBT MSc BSc, Mohsen Shahriari
Investigate the relationship between communities and topics by applying overlapping community detection to the social network of contributors and subsets of intrawiki link networks on different Wikipedias.
Developing a Data Annotation Tool for Scientific Data Management
Posted on 25. Oct 2016; Supervised by Prof. Dr. Stefan Decker; Advisor(s): Dr. Oya Deniz Beyan
Semantic technologies and RDF data representation can improve the reusability of scientific data and enable scientists to reproduce experiments. However, there is no tool that helps researchers make their data semantically interpretable by computers. The aim of the thesis is to understand the benefits of Semantic Web technologies for reproducible research and to develop a tool that converts experimental data to RDF by annotating it with selected data models.
Bachelor, Master
Evaluating the performance of all-pairs personalized page rank
Posted on 23. Oct 2017; Supervised by Prof. Dr. Stefan Decker; Advisor(s): Dr. Michael Cochez
Recently, a new approach for computing the personalized PageRank (PPR) for all nodes in a graph was presented by the advisor of this thesis. Personalized PageRank can be used to determine how important specific nodes in a graph are from the viewpoint of a specific node or set of nodes. For example, it can be used to determine which web pages are relevant for a user, given a set of pages the user has visited before. The improvement devised by the advisor is useful when the PPR has to be computed for all nodes in the graph. The gain in speed comes from reusing PPR values already computed for other nodes and from a clever ordering of the nodes. The student working on this thesis will experiment with this technique and others. First, the student needs to investigate ways to parallelize the algorithms and analyse the algorithms and their parameters experimentally. For a master thesis, the student has to improve the algorithm by optimizing loops and test experimentally whether other improvements in the ordering pay off (i.e., whether other heuristics for finding the order work better and can be computed fast enough to speed up the overall algorithm). In addition, the student would investigate parallel computation of the order and of the actual PPR. External resources: sections 3.2-4
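To illustrate what is being computed, here is a minimal sketch of personalized PageRank for a single source node by plain power iteration, with the naive all-pairs baseline that simply repeats it for every node. This is not the advisor's optimized method (it reuses nothing and applies no node ordering); the function names, the damping value, and the handling of dangling nodes are assumptions made for the example.

```python
def personalized_pagerank(graph, source, damping=0.85, iterations=100):
    """Power-iteration PPR for one source node (illustrative sketch).

    graph maps each node to a list of its successors; every successor
    must itself be a key of graph. All teleportation goes to `source`.
    """
    rank = {node: 0.0 for node in graph}
    rank[source] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in graph}
        for node, successors in graph.items():
            if successors:
                share = damping * rank[node] / len(successors)
                for succ in successors:
                    nxt[succ] += share
            else:
                # Assumption: a dangling node returns its mass to the source.
                nxt[source] += damping * rank[node]
        # Teleport step: the remaining probability mass jumps to the source.
        nxt[source] += 1.0 - damping
        rank = nxt
    return rank


def all_pairs_ppr(graph, **kwargs):
    """Naive baseline: one independent PPR computation per node."""
    return {node: personalized_pagerank(graph, node, **kwargs) for node in graph}


graph = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
ppr_a = personalized_pagerank(graph, "a")
all_ppr = all_pairs_ppr(graph)
```

The baseline costs one full iteration run per node, which is the cost that reusing already computed values and ordering the nodes aim to reduce.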
Evaluation of Approximate Hierarchical Clustering Algorithms
Posted on 29. May 2017; Supervised by Prof. Dr. Stefan Decker; Advisor(s): Dr. Michael Cochez
There are several algorithms that perform hierarchical clustering approximately, resulting in an approximate dendrogram. This makes it possible to cluster big data sets. In this thesis the student will evaluate several existing algorithms in terms of resource use and clustering quality. As part of this work, the student has to implement some of the algorithms to work on a GPU, as they are not scalable enough for CPU computing. External resources: mainly Section IV, "Sidestep: Measuring the Quality of a Dendrogram"
Evaluation of Stream Sampling Algorithms
Posted on 08. Feb 2017; Supervised by Prof. Dr. Stefan Decker; Advisor(s): Dr. Michael Cochez
The student will implement several stream sampling algorithms and perform experiments to compare their performance. The implementations are built on top of streaming frameworks such as Apache Spark, Apache Flink, and Apache Storm.
Bachelor, Master
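As an example of the kind of algorithm meant by stream sampling, the following is a minimal sketch of classic reservoir sampling (Algorithm R), which keeps a uniform random sample of k items from a stream of unknown length in a single pass. It is plain Python rather than a Spark, Flink, or Storm implementation, and the function name and parameters are assumptions made for illustration.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return a uniform random sample of up to k items from an iterable."""
    if rng is None:
        rng = random.Random()
    reservoir = []
    for index, item in enumerate(stream):
        if index < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item number index+1 is kept with probability k / (index + 1).
            slot = rng.randint(0, index)  # both bounds are inclusive
            if slot < k:
                reservoir[slot] = item
    return reservoir


sample = reservoir_sample(range(1000), 10, random.Random(0))
short = reservoir_sample(range(3), 10)
```

Memory use is bounded by k regardless of stream length, which is what makes this family of algorithms suitable for streaming frameworks.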
Including Attributes in a Graph Embedding
Posted on 29. Sep 2017; Supervised by Prof. Dr. Stefan Decker; Advisor(s): Dr. Michael Cochez
Lately, several methods for embedding graph nodes into a vector space have been proposed. These embeddings can then be used to train other machine learning models. Most approaches, however, only take relations between nodes representing entities in the graph into account. If the graph also has nodes representing literal values (numbers, strings, etc.), these are ignored. In this thesis, the student will investigate how these attributes can be included in the embedding.
Interactive Support for Business Modelling
Posted on 21. Jun 2017; Supervised by Prof. Dr. Thomas Rose; Advisor(s): Thomas Osterland
A business model is an abstract model of the business of one or more cooperating organisations. It is a conceptual and architectural implementation of a business strategy and the foundation for the implementation of business processes and information systems. The research objective of this thesis is the design, implementation and evaluation of an interactive tool for the engineering of a business model.
Measuring coherence across media in learning environments
Posted on 28. Sep 2017; Supervised by PD Dr. Ralf Klamma, AOR, Dr. Marc Spaniol
Computational linguistics has provided impressive results for measuring the quality of writing, e.g. for automatic essay scoring. Content-based multimedia indexing has delivered many models for the analysis of multimedia materials. In modern educational platforms, e.g. MOOCs and self-regulated learning platforms, multimedia materials are produced by educational designers but also by the learners during their learning processes. Coherence is a semantic measure for the local and global connectivity of, e.g., sentences, paragraphs, videos, and slides. To measure the coherence of multimedia materials, many computational methods ranging from natural language processing to machine learning need to be combined in a common coherence model. The goal of this master thesis is to co-develop a model for cross-media coherence and to prototypically combine a few of these computational methods for one or two analysis scenarios, e.g. a MOOC or a webinar.
Bachelor, Master
Mobility Service Payment using Privacy-Preserving Interval Operations
Posted on 17. Nov 2017; Supervised by Prof. Dr. Matthias Jarke, Prof. Dr. Ulrike Meyer; Advisor(s): Dipl.-Inform. Christian Samsel, Dipl.-Kfm. Markus Christian Beutel, Stefan Wüller
Together with the IT security chair, we would like to investigate the possibilities of employing cryptographic operations for bartering in ride-sharing scenarios.
Bachelor, Master
Optimizing Mining Maximal Frequent Patterns with MFPAS
Posted on 29. Sep 2017; Supervised by Prof. Dr. Stefan Decker; Advisor(s): Dr. Michael Cochez, Rezaul Karim
Recently, a new approach for finding maximal frequent patterns (MFPAS) was presented by the supervisor and advisors of this thesis. Several further optimizations of the algorithm are possible. The student working on this thesis will experiment with different optimization possibilities and analyse their effect experimentally. For a master thesis, further theoretical analysis of the optimizations and of the original algorithm is necessary. Related paper: M. R. Karim, M. Cochez, O. D. Beyan, C. F. Ahmed, and S. Decker. Mining maximal frequent patterns in transactional databases and dynamic data streams: a Spark-based approach. Information Sciences, 2017.
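For readers unfamiliar with the term, the sketch below shows what a maximal frequent pattern is: an itemset that occurs in at least a minimum number of transactions and has no frequent proper superset. It is a brute-force illustration of the definition only, not the MFPAS algorithm from the related paper; the function name, the absolute support threshold, and the sample transactions are assumptions made for the example.

```python
from itertools import combinations


def maximal_frequent_itemsets(transactions, min_support):
    """Brute-force search for maximal frequent itemsets (definition only)."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    frequent = []
    for size in range(1, len(items) + 1):
        found_any = False
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            support = sum(1 for t in transactions if candidate <= t)
            if support >= min_support:
                frequent.append(candidate)
                found_any = True
        if not found_any:
            # No frequent set of this size, so no larger set can be frequent.
            break
    # Keep only itemsets without a frequent proper superset.
    return [s for s in frequent if not any(s < other for other in frequent)]


transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
maximal = maximal_frequent_itemsets(transactions, min_support=3)
```

With a threshold of three transactions, each pair of items is frequent while the full triple is not, so the three pairs are the maximal patterns. The exhaustive enumeration is exponential in the number of items, which is why dedicated algorithms and their optimizations matter in practice.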
Post-Mortem Community Information Systems Success Analytics
Posted on 31. May 2016; Supervised by PD Dr. Ralf Klamma, AOR
The goal of this thesis is to integrate post-mortem community data dumps with the MobSOS real-time community information systems success awareness framework.
Process Interaction across Blockchains
Posted on 10. Nov 2017; Supervised by Prof. Dr. Thomas Rose; Advisor(s): Thomas Osterland
FIT and Bosch jointly offer two Master theses: one on the combination of different application-specific blockchains and one on a registry of services governed by blockchains.

Running Theses

Completed Theses

Data Analysis in the Industry: Inferring Causal Relations in Industrial Data
Syed, Muhammad Ali in 2018; Supervised by Prof. Dr. Matthias Jarke, PD Dr. Christoph Quix; Advisor(s): Dr. Christoph Paulitsch, Dr.-Ing. Matthias Loskyll
Semantic Data Profiling in Data Lake
Ansari, Jasim Waheed in 2018; Supervised by Prof. Dr. Stefan Decker; Advisor(s): Dr. Oya Deniz Beyan, Naila Karim, Dr. Michael Cochez
The scope of the thesis is to extend current semantic profiling efforts for data lakes with ontology enrichment, and to develop tools that systematically extract, manage, and exploit the datasets' metadata and display the updated datasets with semantically or syntactically correct results.