
Informatik 5
Information Systems
Prof. Dr. M. Jarke

Prof. Dr. Stefan Decker - Theses


Bachelor, Master
Conversion from RDF to Prototype-based Knowledge Base - Open
Supervised by Prof. Dr. Stefan Decker; Advisor(s): Dr. Michael Cochez
Prototypes have recently been proposed as a new way to represent knowledge. In recent years, many datasets have been published using RDF. Your task is to find out how an RDF dataset can be converted to prototypes efficiently. For a master thesis, you also have to work out the best conversion under different requirements, such as updates to the original RDF data.
Bachelor
Developing a Data Annotation Tool for Scientific Data Management - Open
Supervised by Prof. Dr. Stefan Decker; Advisor(s): Dr. Oya Deniz Beyan
Semantic technologies and RDF data representation can improve the reusability of scientific data and enable scientists to reproduce experiments. However, there is no tool that helps researchers make their data semantically interpretable by computers. The aim of the thesis is to understand the benefits of Semantic Web technologies for reproducible research and to develop a tool that converts experimental data to RDF by annotating it with selected data models.
Bachelor, Master
Evaluating the performance of all-pairs personalized page rank - Open
Supervised by Prof. Dr. Stefan Decker; Advisor(s): Dr. Michael Cochez
Recently, a new approach for computing the personalized PageRank (PPR) for all nodes in a graph was presented by the advisor of this thesis. Personalized PageRank can be used to determine how important specific nodes in a graph are from the viewpoint of a specific node or set of nodes. For example, it can be used to determine which web pages are relevant for a user, given a set of pages they have visited before. The improvement devised by the advisor is useful when the PPR has to be computed for all nodes in the graph. The gain in speed comes from reusing already computed PPR values for other nodes and from a clever ordering of the nodes. The student working on this thesis will experiment with this technique and others. First, the student needs to investigate ways to parallelize the algorithms and analyse the algorithms and their parameters experimentally. For a master thesis, the student has to improve the algorithm to optimize loops and test experimentally whether other improvements in the ordering pay off (i.e., whether other heuristics for finding the order work better and can be computed fast enough to speed up the overall algorithm). In addition, the student would investigate parallel computation of the order and of the actual PPR. External resources: https://pdfs.semanticscholar.org/c63e/b92a805c3eb636865ecfbd799fda32194753.pdf http://users.jyu.fi/~miselico/papers/GlobalRDFEmbedding.pdf sections 3.2-4
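To make the notion of personalized PageRank concrete, the following is a minimal sketch of the textbook power iteration for a single seed set. It is not the advisor's all-pairs method and does not reuse values between nodes; the function name, the dictionary-based graph format, and the default parameters are assumptions chosen for illustration.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Plain power iteration for personalized PageRank (illustrative baseline).

    graph: dict mapping each node to a list of its out-neighbours
           (every neighbour must itself be a key of the dict).
    seeds: non-empty set of nodes the random walk restarts from.
    """
    nodes = list(graph)
    # Restart distribution: uniform over the seed nodes, zero elsewhere.
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        # With probability (1 - damping) the walk jumps back to a seed.
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            out = graph[n]
            if out:
                share = damping * rank[n] / len(out)
                for m in out:
                    nxt[m] += share
            else:
                # Dangling node: hand its mass back to the seeds.
                for m in nodes:
                    nxt[m] += damping * rank[n] * restart[m]
        rank = nxt
    return rank


graph = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
ranks = personalized_pagerank(graph, {"a"})
```

A naive all-pairs computation would call this function once per node, each time with that node as the only seed. That repeated work is what an approach based on reuse and node ordering aims to avoid.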
Bachelor
Evaluation of Stream Sampling Algorithms - Open
Supervised by Prof. Dr. Stefan Decker; Advisor(s): Dr. Michael Cochez
The student will implement several stream sampling algorithms and perform experiments to compare their performance. The implementations are done on top of streaming frameworks like Spark, Apache Flink, and Storm.
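As an example of the kind of algorithm in scope, here is a minimal single-machine sketch of classic reservoir sampling (often called Algorithm R), which keeps a uniform sample of fixed size from a stream of unknown length. It is plain Python rather than an implementation on Spark, Flink, or Storm, and the function name and parameters are assumptions for illustration.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return a uniform random sample of up to k items from an iterable."""
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # both bounds inclusive
            if j < k:
                reservoir[j] = item
    return reservoir


sample = reservoir_sample(iter(range(1000)), 10, random.Random(0))
```

The sketch uses constant memory in the stream length, which is the property that makes this family of algorithms attractive for streaming frameworks.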
Master
Exploring Unknown Environments - Finding Pollution in Underground Pipes - Open
Supervised by Prof. Dr.-Ing. Gerd Ascheid, Prof. Dr. Stefan Decker; Advisor(s): Dr. Michael Cochez, Ahmed Hallawa
Bachelor, Master
Optimizing Mining Maximal Frequent Patterns with MFPAS - Open
Recently, a new approach for finding maximal frequent patterns (MFPAS) was presented by the supervisor and advisors of this thesis. Several further optimizations of the algorithm are possible. The student working on this thesis will experiment with different optimization possibilities and analyse their effect experimentally. For a master thesis, further theoretical analysis of the optimizations and of the original algorithm is necessary. Related paper: M. R. Karim, M. Cochez, O. D. Beyan, C. F. Ahmed, and S. Decker. Mining maximal frequent patterns in transactional databases and dynamic data streams: a Spark-based approach. Information Sciences, 2017.
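For readers unfamiliar with the term, a frequent pattern is maximal when no proper superset of it is also frequent. The following brute-force sketch illustrates only that definition on a toy transaction database. It is not MFPAS and does not scale; the function name and the absolute support threshold are assumptions for illustration.

```python
from itertools import combinations


def maximal_frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of maximal frequent itemsets (toy sizes only).

    min_support is an absolute count of transactions.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    frequent = []
    for size in range(1, len(items) + 1):
        level = []
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            support = sum(1 for t in transactions if itemset <= t)
            if support >= min_support:
                level.append(itemset)
        if not level:
            # Anti-monotonicity: no larger itemset can be frequent either.
            break
        frequent.extend(level)
    # Keep only itemsets that have no frequent proper superset.
    return [s for s in frequent if not any(s < other for other in frequent)]


db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
result = maximal_frequent_itemsets(db, min_support=3)
```

With a support threshold of 3, every pair is frequent but the triple is not, so the three pairs are the maximal patterns. Lowering the threshold to 2 makes the triple the only maximal pattern.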
Bachelor, Master
Accelerating Graph Embedding using GPUs and Distributed Computing - Running
Lately, several methods for embedding graph nodes into a vector space have been proposed. These embeddings can then be used to train other machine learning models. Learning these embeddings is typically done using CPUs. In this thesis, the student would look into the use of other hardware, such as GPUs, and distributed computation options to speed up the learning process. The challenge is that algorithms working on graphs typically have poor memory locality. Hence, existing algorithms might need profound modification before they can be used on GPUs or in a distributed fashion.
Master
Analysis of Breast Cancer Genomic Data with Multimodal Deep Belief Network - Running
The objective of this thesis is to analyse breast cancer genomic data with a view to predicting breast cancer genomic biomarkers. Specifically, the analysis will include classification and regression of breast cancer patients based on their genetic information.
Master
Locality-sensitive Hashing using not-so-random Hash Functions - Running
Supervised by Prof. Dr. Stefan Decker; Advisor(s): Dr. Michael Cochez
Locality-sensitive hashing is used to speed up near-neighbor search in high-dimensional space. When the distance of interest is cosine distance, random hyperplane hashing (RHH) is used. This technique is based on randomly selecting hyperplanes. However, in some cases (when we have more information about the dataset) it seems reasonable not to choose the hyperplanes completely at random. Further, if normal RHH is performed with a low number of hyperplanes, the hyperplanes are likely not to cover the space very well. This thesis will be about choosing the hyperplanes in a data-dependent way and sampling them such that they cover the space well (including a comparison with angular quantization).
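As background, the standard fully random variant of RHH can be sketched as follows: each hyperplane through the origin contributes one bit, namely the side of the hyperplane on which the vector lies. This is the baseline the thesis would try to improve on, not a data-dependent scheme; the function names and the use of Gaussian normal vectors are assumptions for illustration.

```python
import random


def random_hyperplanes(num_planes, dim, rng):
    """Draw hyperplane normal vectors with i.i.d. standard normal entries."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]


def rhh_signature(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        1 if sum(h * v for h, v in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming(sig_a, sig_b):
    """Number of positions in which two signatures differ."""
    return sum(a != b for a, b in zip(sig_a, sig_b))


planes = random_hyperplanes(64, 3, random.Random(42))
v = [1.0, 2.0, 3.0]
sig_v = rhh_signature(v, planes)
sig_scaled = rhh_signature([2.0 * x for x in v], planes)
sig_opposite = rhh_signature([-x for x in v], planes)
```

The fraction of differing bits between two signatures estimates the angle between the vectors divided by pi, so vectors pointing in the same direction collide on every bit and opposite vectors differ on every bit. With few hyperplanes this estimate is coarse, which is the coverage problem described above.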
Master
Patterns for Integrating Rule Based and Process Based Model Components of Computerized Clinical Guidelines - Running
Supervised by Prof. Dr. Stefan Decker, Dr. rer. nat. Cord Spreckelsen
Master
Smart Quality Check for FHIR Resources System - Running
Supervised by Prof. Dr. Stefan Decker; Advisor(s): Dr. Oya Deniz Beyan
Fast Healthcare Interoperability Resources (FHIR) defines a set of "Resources" that represent granular clinical concepts. It is a standard for exchanging healthcare information electronically between stakeholders in the healthcare environment, including care providers, patients, and mobile application developers. The aim of this thesis is to develop tools that validate the conformance of resources against a set of business rules.
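One possible shape for such a check is sketched below: a FHIR resource is treated as a parsed JSON dictionary, and each business rule is a small function that returns an error message or nothing. The two rules shown are hypothetical examples rather than rules taken from the thesis or from the FHIR specification, and all function names are assumptions.

```python
from datetime import date


def rule_has_identifier(resource):
    """Hypothetical rule: a Patient must carry at least one identifier."""
    if not resource.get("identifier"):
        return "Patient must carry at least one identifier"
    return None


def rule_birth_date_not_in_future(resource):
    """Hypothetical rule: a full birthDate must not lie in the future."""
    raw = resource.get("birthDate")
    if raw is None:
        return None
    try:
        born = date.fromisoformat(raw)
    except (TypeError, ValueError):
        # Partial dates such as "1980" are skipped in this sketch.
        return None
    if born > date.today():
        return "birthDate lies in the future"
    return None


def validate(resource, rules):
    """Apply every rule and collect the messages of the ones that fail."""
    messages = (rule(resource) for rule in rules)
    return [m for m in messages if m is not None]


RULES = [rule_has_identifier, rule_birth_date_not_in_future]
good = {"resourceType": "Patient", "identifier": [{"value": "123"}], "birthDate": "1980-05-17"}
bad = {"resourceType": "Patient", "birthDate": "2999-01-01"}
good_errors = validate(good, RULES)
bad_errors = validate(bad, RULES)
```

Keeping each rule as an independent function makes it straightforward to add, remove, or report on rules separately, though a real tool would also need structural validation against the FHIR resource definitions.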