Thesis Projects
Information about Diploma/Master thesis process
Open Theses
This thesis investigates how compact, application-specific ontology modules can be extracted automatically from large ontologies, preserving the semantic coherence needed for downstream tasks while drastically reducing complexity.
Knowledge graphs like Wikidata combine rich relational structure with natural-language descriptions, yet most models are trained narrowly for a single task and transfer poorly. This thesis investigates how a single generative graph foundation model, pretrained on large-scale text-rich knowledge graphs, can be adapted to a range of downstream tasks, including knowledge graph completion, text-conditional subgraph ...
Large language models (LLMs) are increasingly used in biomedical applications, including literature mining (PMID: 40188094), drug discovery (PMID: 38730226; 41362614; https://arxiv.org/abs/2510.27130), clinical decision support (PMID: 40753316), and patient data analysis (PMID: 41034564). Hybrid approaches combining LLMs with structured knowledge bases and retrieval-augmented generation (RAG) improve performance and interpretability (PMID: 38830083; https://www.biorxiv.org/content/10.1101/2025.05.08.652829v2). However, LLM-based systems ...
Running Theses
Currently, the primary way users interact with Large Language Models (LLMs) is through two-dimensional chat interfaces. However, for use cases in Extended Reality (XR) environments, the interaction paradigm shifts from a flat screen to a spatial experience. Here, LLMs can be represented, for example, as XR agents, which are personified versions of the LLM. While 3D environments ...
Narrative Classification identifies stories via NLP but often lacks generalizability. While LLMs augment other text tasks, their narrative application remains exploratory. This thesis investigates whether an ontology-based LLM-agent framework incorporating specific data characteristics improves synthetic training data quality.
Large Language Models (LLMs) are increasingly used to support data wrangling, but their integration into interactive transformation workflows raises new challenges for auditability, reproducibility, and accountability. When users approve, reject, or refine LLM-generated suggestions, conventional data lineage systems often fail to capture why a change occurred, who was responsible for it, and which transformation produced ...
Knowledge-augmented multiple-choice question answering (MCQA) aims to improve robustness and factual grounding by integrating external structured knowledge (e.g., knowledge graphs) into language-model-based decision making. Current high-performing systems typically retrieve a local subgraph relevant to a question and candidate answers, then combine pretrained language representations with explicit graph reasoning modules.
This thesis investigates an alternative representation path: ...
Abstract: This master’s thesis examines the applicability of automatic ontology generation and ontology-based data integration to the configuration of co-simulation scenarios. To study power systems through simulations, it is useful to model sub-domains with separate simulators, which are then combined in co-simulations to form complex simulation scenarios. However, what is gained through focused modelling of ...
View all running theses
Completed Theses
Thesis Type: Master
Student: Sebastian Miller
Status: In Progress
Background
Supervisory control and data acquisition (SCADA) systems are increasingly connected through information and communication technologies, exposing smart grids to cyberattacks and operational disruptions. Conventional signature-based intrusion detection systems (IDSs) reliably identify known attacks but cannot detect previously unseen patterns, while statistical and machine-learning-based IDSs may achieve high detection rates but often ...
Federated Machine Learning Architecture for an MDF Production Industry Use Case
Data-driven quality assurance in grinding manufacturing technology
While recent advancements in natural language processing have been largely driven by increasingly powerful large language models (LLMs), the role of data quality in fine-tuning these models remains underexplored. This thesis addresses the often-overlooked but critical aspect of data-centric AI by investigating how different types and levels of data degradation affect the performance of fine-tuned ...
View all completed theses