Category: ‘Theses’
Image stream anomaly detection using feature sketch for industrial quality control
Is there data? Repository approaches to simplify the search for reusable research data
Scientific discovery has always been a continuous, exponentially growing process. As data-driven research generates ever larger numbers of datasets, researchers and institutions have become increasingly aware of the importance of making research data openly accessible. Making the underlying data freely accessible contributes to reproducibility and transparency in research and fosters public trust in scientific discovery. In this light, the role of data repositories as a medium for sharing reusable data, and thus for connecting data providers with data consumers, is becoming highly relevant. This thesis investigates approaches for data repositories to enhance collaborative, data-driven research, with a special focus on improving data discoverability.
Moving from Ad-hoc to a Structured Testing Approach in a Graph Analysis and Community Detection Framework
Algorithmic Approaches to Overlapping Community Detection – Multiplex Networks
Algorithmic Approaches to Overlapping Community Detection – Protein-Protein Interaction Networks
Integrating Semantics in Data Spaces by the automatic generation of mapping rules between data sources and ontologies
Consistency-checking German Pathology Reports using Large Language Models
High data quality is becoming increasingly critical to today’s medical research. Once the relevant data quality rules have been identified, data quality checks can be implemented relatively easily on structured data. Pathology reports, however, are predominantly narrative. Comprising microscopy, histology, and diagnosis sections, they provide very few structured elements, so consistency checking across the sections of a report is mainly performed manually. Recent Large Language Models offer the potential to automate this process.
Knowledge and Social Context-enhanced Fake News Detection
In today’s digital era, information disseminates rapidly through online platforms such as Twitter. These platforms have fundamentally transformed how societies communicate, share, and perceive information. However, this transformation also presents challenges, notably the spread of false information, or so-called fake news. The spread of fake news has been shown to sway public opinion, disrupt electoral processes, and even endanger public safety.
WILLM: A System for Academic Writing Improvement based on Large Language Models
Writing is an essential skill in both academic and professional contexts, allowing individuals to convey their thoughts, ideas, and information clearly and concisely. However, writing effectively, efficiently, and accurately can still be challenging, especially for non-native speakers of the language.
Scientific Question Answering using Retrieval-Augmented Large Language Models
The immense growth of scientific literature makes it nearly impossible for researchers to keep pace with all new developments in their domains. An automated scientific Question Answering (QA) system could substantially expedite literature review, hypothesis generation, and knowledge extraction. With the emergence of Large Language Models (LLMs) like GPT, BERT, and their successors, the landscape of QA has shifted significantly.