This thesis aims to develop a pipeline for the automatic translation of generic LaTeX expressions into domain-specific, processable languages.
Thesis Type 

Student 
David Korets'kyy 
Status 
Running 
Presentation room 
Seminar room I5 6202 
Supervisor(s) 
Stefan Decker 
Advisor(s) 
Laurenz Neumann, Maximilian Kißgen 
Contact 
laurenz.neumann@dbis.rwth-aachen.de, kissgen@dbis.rwth-aachen.de 
Many domains of computer science, such as database theory, rely on a variety of mathematical notations, which are commonly written in LaTeX, a popular markup language for mathematical notation.
However, because LaTeX is purely a markup language, expressions written in it carry no semantics that can be processed directly within a specific domain such as relational algebra. This limits the use of LaTeX in education: students often cannot use it in automatically graded exercises and must instead use specially defined languages that can be processed programmatically.
This thesis project aims to address this challenge by developing a pipeline that translates well-defined subsets of LaTeX expressions into domain-specific, processable algebras, enabling better comprehension and enhancing assignments with automatic assessment. To assess whether the student learning experience is improved, the pipeline should be evaluated via a user survey.
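To illustrate what such a translation step could look like, the following is a minimal sketch, not the pipeline to be developed in the thesis. It assumes a tiny, hypothetical subset of relational algebra (selection `\sigma_{...}(...)`, projection `\pi_{...}(...)`, and plain relation names) and uses a hand-written recursive-descent parser. The names `Relation`, `Selection`, `Projection`, and `parse` are invented for this example.

```python
import re
from dataclasses import dataclass


# Hypothetical node types for a tiny relational-algebra subset.
@dataclass(frozen=True)
class Relation:
    name: str


@dataclass(frozen=True)
class Selection:
    condition: str
    source: object


@dataclass(frozen=True)
class Projection:
    attributes: tuple
    source: object


# Matches "\sigma_{...}(" or "\pi_{...}(" at the start of the input.
_UNARY = re.compile(r"\\(sigma|pi)_\{([^{}]*)\}\(")
_NAME = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def parse(latex: str):
    """Translate a LaTeX string of the assumed subset into a syntax tree."""
    text = latex.replace(" ", "")  # whitespace carries no meaning in this subset
    node, rest = _parse_expr(text)
    if rest:
        raise ValueError(f"unexpected trailing input: {rest!r}")
    return node


def _parse_expr(text: str):
    """Parse one expression and return (node, unparsed remainder)."""
    match = _UNARY.match(text)
    if match:
        operator, subscript = match.groups()
        inner, rest = _parse_expr(text[match.end():])
        if not rest.startswith(")"):
            raise ValueError("missing closing parenthesis")
        rest = rest[1:]
        if operator == "sigma":
            return Selection(subscript, inner), rest
        return Projection(tuple(subscript.split(",")), inner), rest
    name = _NAME.match(text)
    if not name:
        raise ValueError(f"cannot parse: {text!r}")
    return Relation(name.group()), text[name.end():]


tree = parse(r"\pi_{name}(\sigma_{age > 30}(Person))")
print(tree)
```

The resulting tree is a projection on `name` over a selection with the condition `age>30` on the relation `Person`. Unlike the original LaTeX string, this structure can be evaluated or compared against a reference solution by a program.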
Goals & Objectives:
 Formulating the theoretical background of adapting LaTeX expressions into domain-specific algebras
 Exploring and adapting state-of-the-art approaches for parser generation and LaTeX translation
 Developing a parser and modular framework in Python to translate LaTeX expressions into domain-specific representations
 Evaluating the impact of LaTeX statements for automated assessments on the student learning experience
Challenges:
The validity of the algebraic statements must be ensured, which includes resolving ambiguities in LaTeX notation (e.g. “->” versus “\rightarrow”, parentheses, etc.). In addition, the framework needs to remain modular, meaning that users have to be able to define their own translation rules without breaking the system.
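Both challenges can be sketched in a few lines of Python. The example below is a rough illustration under stated assumptions, not a proposed design: the table `CANONICAL`, the function `normalize`, and the class `RuleRegistry` are hypothetical names, and the functional-dependency rule is only an example of a user-defined translation. Notation variants are first mapped to one canonical spelling, and translation rules are then registered per canonical command, so that adding a rule cannot silently overwrite an existing one.

```python
import re

# Hypothetical table: several surface spellings map to one canonical token.
CANONICAL = {
    "->": r"\rightarrow",
    r"\to": r"\rightarrow",
    r"\left(": "(",
    r"\right)": ")",
}


def _pattern(spelling: str) -> str:
    """Build a regex for one spelling; commands must not continue with a letter."""
    pattern = re.escape(spelling)
    if spelling[-1].isalpha():
        pattern += r"(?![A-Za-z])"  # so that "\to" does not match inside "\top"
    return pattern


def normalize(latex: str) -> str:
    """Rewrite all known notation variants to their canonical spelling."""
    for spelling, canonical in CANONICAL.items():
        # A function replacement avoids backslash-escape processing by re.sub.
        latex = re.sub(_pattern(spelling), lambda _m, c=canonical: c, latex)
    return latex


class RuleRegistry:
    """Maps canonical LaTeX commands to user-defined translation functions."""

    def __init__(self):
        self._rules = {}

    def register(self, command: str):
        def decorator(func):
            if command in self._rules:
                raise ValueError(f"rule for {command!r} is already registered")
            self._rules[command] = func
            return func
        return decorator

    def translate(self, command: str, *operands):
        if command not in self._rules:
            raise ValueError(f"no rule registered for {command!r}")
        return self._rules[command](*operands)


registry = RuleRegistry()


# Example user-defined rule (an assumption): read the arrow as a functional dependency.
@registry.register(r"\rightarrow")
def functional_dependency(left, right):
    return {"type": "fd", "lhs": left, "rhs": right}


print(normalize("A -> B"))
print(registry.translate(r"\rightarrow", "A", "B"))
```

Rejecting duplicate registrations and unknown commands with an explicit error is one simple way to keep user extensions from breaking existing behaviour; a real framework would likely need finer-grained conflict handling.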
Requirements:
 Proficiency in Python and LaTeX
 Basic knowledge of parsers and formal grammars
 Nice to have: proficiency in ANTLR and Jupyter notebooks, knowledge of compilers