DISCO-ML: Decision Interoperability Specification Conventions for Operationalized Machine Learning

October 20th, 2025

Sensor-based machine learning (ML) systems, such as those used for predictive maintenance, environmental monitoring, and industrial automation, require scalable, explainable, and continuously evolving data infrastructures. The complexity of these systems lies not only in the technical pipeline (data ingestion, feature engineering, model training, deployment, monitoring) but also in the design decisions stakeholders make along the way. These decisions range from architectural trade-offs (edge vs. cloud processing) and ethical considerations (data privacy, fairness) to explainability requirements and system scalability.

While existing modeling techniques support documenting software architecture and data flows, there is no widely accepted notation that explicitly captures, traces, and communicates design decisions for sensor-based ML infrastructures in a way that is interpretable across diverse stakeholders (e.g., data scientists, system architects, domain experts, managers), machine-readable, or both.

This thesis aims to identify, compare, and develop suitable modeling notations and information structures that support the transparent documentation and communication of design decisions in sensor-based ML systems.

Thesis Type	Master
Status	Running
Presentation room	Seminar room I5 6202
Supervisor(s)	Stefan Decker
Advisor(s)	Michal Slupczynski
Contact	slupczynski@dbis.rwth-aachen.de

Related Work

Architectural Decision Records (ADR) for lightweight documentation of software engineering design choices.
UML/SysML for representing system architectures and data flows.
Ontologies and knowledge graphs for semantic representation of ML lifecycle concepts.
MLOps frameworks that address automation but rarely integrate decision traceability.
Design Patterns for TEAMS: Tailoring Engagement and Alignment for MLOps Stakeholders
FLUX: Feedback Latency and Utilization Examination — Optimizing Real-Time AI Pipelines

Potential Research Questions

RQ1: Which modeling notations and information structures are most suitable for representing design decisions in sensor-based ML infrastructures?
RQ2: What elements must be included to ensure that these information structures cover key aspects of ML infrastructures?
RQ3: How can ML design decisions be traced over time to support transparency and accountability?
RQ4: How well do different stakeholder roles understand and use these representations?
RQ5: Can such models be reused or adapted to support similar ML pipelines in various domains?
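
To make the tracing question in RQ3 more concrete, the following minimal sketch shows one conceivable way to follow a decision through its revisions. It is not part of the thesis topic: the `DecisionVersion` class, the `supersedes` link, the `trace_history` helper, and all example entries are hypothetical and only assume that each revised decision points back to the one it replaces.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DecisionVersion:
    """Hypothetical snapshot of a design decision at one point in time."""
    identifier: str
    summary: str
    decided_on: date
    supersedes: Optional[str] = None  # identifier of the version this one replaces


def trace_history(log: dict[str, DecisionVersion], latest_id: str) -> list[DecisionVersion]:
    """Follow the supersedes links backwards and return the chain, oldest first."""
    chain: list[DecisionVersion] = []
    seen: set[str] = set()
    current: Optional[str] = latest_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in supersedes links at {current}")
        seen.add(current)
        version = log[current]
        chain.append(version)
        current = version.supersedes
    chain.reverse()
    return chain


# Illustrative entries only; they do not describe a real system.
log = {
    v.identifier: v
    for v in [
        DecisionVersion("D-001", "Process sensor data in the cloud", date(2025, 1, 10)),
        DecisionVersion("D-002", "Move preprocessing to the edge", date(2025, 4, 2), supersedes="D-001"),
        DecisionVersion("D-003", "Move inference to the edge as well", date(2025, 9, 15), supersedes="D-002"),
    ]
}

for version in trace_history(log, "D-003"):
    print(version.decided_on.isoformat(), version.identifier, version.summary)
```

A linked chain of immutable versions is only one option; a knowledge graph or a versioned document store could serve the same purpose.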

Methodology / Approach

Literature Review & Comparative Analysis
- Analyze existing modeling approaches and their relevance to ML infrastructures.
Conceptual Framework Development
- Define a structured notation to capture decisions, alternatives, rationale, stakeholders, and impacts.
- Integrate metadata for trust/explainability.
Evaluation through Expert Reviews
- Conduct evaluations with multi-disciplinary stakeholders.
- Use structured questionnaires or interviews to assess usefulness, interpretability, and reusability.
Prototype / Case Study Application
- Apply the framework to a real or simulated sensor-based ML pipeline.
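
The elements listed under Conceptual Framework Development (decisions, alternatives, rationale, stakeholders, impacts, and trust/explainability metadata) could be given a machine-readable form along the following lines. This is a minimal sketch under stated assumptions, not the notation the thesis will produce: the `DecisionRecord` class, its field names, and the example values are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DecisionRecord:
    """Hypothetical record for one design decision (all field names are assumptions)."""
    identifier: str
    decision: str
    alternatives: list[str]
    rationale: str
    stakeholders: list[str]
    impacts: list[str]
    # Free-form metadata, e.g. notes related to trust or explainability.
    metadata: dict[str, str] = field(default_factory=dict)

    def to_json(self) -> str:
        """Serialize the record so that tools as well as people can read it."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Illustrative values only; they do not describe a real system.
record = DecisionRecord(
    identifier="D-001",
    decision="Run anomaly detection on the edge device",
    alternatives=["Run anomaly detection in the cloud"],
    rationale="Illustrative only: lower latency for on-site alerts",
    stakeholders=["system architect", "domain expert"],
    impacts=["data ingestion", "deployment"],
    metadata={"explainability": "feature attributions logged per alert"},
)

print(record.to_json())
```

Serializing to JSON is one arbitrary choice for the sketch; the same fields could equally be expressed in a graphical notation or an ontology.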

Expected Contributions

A structured classification of modeling techniques suited to ML infrastructure design decisions.
A novel or adapted modeling framework that combines architectural, stakeholder, and decision elements.
Evaluation results and guidelines for practitioners and researchers.
A demonstrator or case-study application to a real or simulated sensor-based ML system.

If you are interested in this thesis or a related topic, or have additional questions, please do not hesitate to send a message to slupczynski@dbis.rwth-aachen.de.
Please apply with a meaningful CV and a recent transcript of your academic performance.

Prerequisites

Strong interest in software engineering, explainable AI, and stakeholder collaboration.
Familiarity with modeling techniques (e.g., UML, BPMN) and/or ontologies.
Basic knowledge of machine learning pipelines and system architecture.
Experience with Python, ML frameworks, or MLOps tools is beneficial but not mandatory.

DBIS