Informatik 5
Information Systems
Prof. Dr. M. Jarke

Prof. Dr. Christoph Quix - Theses

Next-generation Sequencing Using Data Lakes - MA / BA - Open
Supervised by Prof. Dr. Thomas Berlage, Prof. Dr. Christoph Quix; Advisor(s): Dr. Sandra Geisler, Tillmann Eitelberg
NGS Pipeline on Azure Data Lake

The development of next-generation sequencing (NGS) technologies at the beginning of the 21st century opened the door to revolutionary study setups that give access to a long-closed dimension of knowledge in fields of research such as genetics, ecology, and medicine. The so-called "pipeline" links the individual work steps and their corresponding tools in the workflow for analyzing an NGS dataset.

Microsoft Azure Data Lake is based on two services. The Azure Data Lake Store can store different types of data of almost unlimited size. With Azure Data Lake Analytics, it is possible to run massively parallel data transformation and processing programs in U-SQL, R, Python, and .NET over petabytes of data stored in the Azure Data Lake Store.

In our company, we develop the system BOA, a pipeline based on Microsoft Azure Data Lake designed for analyzing NGS datasets. To achieve maximum performance and optimal results, the data for the various steps are partitioned and analyzed using different algorithms. Data input and output in BOA are handled through a user-friendly interface. In addition, the system can analyze the data directly against various reference databases.

The goals of this thesis are:
- to evaluate different algorithms for data partitioning in the context of RNA sequences
- to evaluate different algorithms in different languages (C#, R, Python) to optimize the performance of sequence alignment
- to design and implement an incremental loading process for different RNA reference databases used in the alignment process
Feature Clustering and Visualization of High Dimensional Data using Clique Cover Theory - Running
Approaches such as clustering and classification that are analytically or computationally manageable in low dimensions become intractable as the number of dimensions increases. This is a consequence of the phenomenon known as "the curse of dimensionality", which is commonly observed in high-dimensional data. The aim of this thesis is therefore to develop a novel approach for feature clustering, selection, and visualization using the graph-theoretical approach of clique covers.
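The description does not say how the thesis builds its graph or computes the cover, so the following is only a minimal sketch of the general idea, under stated assumptions: features are linked when the absolute Pearson correlation of their values reaches a chosen threshold, and the graph is then covered with cliques by a simple greedy heuristic. The function names, the correlation criterion, and the threshold are all illustrative and not taken from the thesis.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def similarity_graph(columns, threshold):
    """Adjacency sets: features i and j are linked if |correlation| >= threshold.

    The correlation criterion is an assumption made for this sketch.
    """
    adj = {i: set() for i in range(len(columns))}
    for i, j in combinations(range(len(columns)), 2):
        if abs(pearson(columns[i], columns[j])) >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def greedy_clique_cover(adj):
    """Partition the vertices into cliques with a greedy heuristic.

    Each vertex joins the first existing clique whose members are all its
    neighbours; otherwise it starts a new clique. The result is a valid
    clique cover, but not necessarily a minimum one.
    """
    cliques = []
    for v in sorted(adj, key=lambda u: -len(adj[u])):  # high-degree vertices first
        for clique in cliques:
            if all(u in adj[v] for u in clique):
                clique.append(v)
                break
        else:
            cliques.append([v])
    return cliques


# Toy data: four features (columns) observed on five samples.
columns = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 10],   # perfectly correlated with feature 0
    [1, -1, 1, -1, 1],
    [2, -2, 2, -2, 2],  # perfectly correlated with feature 2
]
feature_groups = greedy_clique_cover(similarity_graph(columns, threshold=0.9))
print(feature_groups)
```

Each resulting clique is a group of mutually similar features, so one representative per clique could serve as a reduced feature set. Finding a minimum clique cover is NP-hard in general, which is why the sketch settles for a greedy heuristic.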
Classification of Mechanically Ventilated Patients Based on Weaning Difficulty - Finished
Completed by Ilias Spyros in 2019; Supervised by Prof. Dr. Christoph Quix, Johannes Bickenbach; Advisor(s): Dr. Sandra Geisler, Jermain Kaminski, Arne Peine