Interoperable Data Exchange between Data Stream Processing Platforms

March 5th, 2024

Thesis Type
  • Bachelor
  • Master
Sandra Geisler
Liam Tirpitz

Over recent years, driven by an overall increase in data volume, increasingly complex data processing pipelines have been deployed on heterogeneous resources across both centralized and distributed environments. To analyze data in real time, users can choose from many existing data stream processing platforms, such as Apache Spark Streaming and Apache Flink. However, these platforms have different strengths and weaknesses, and each typically excels at processing data in either edge or cloud environments, not both.

To build efficient and dynamic data pipelines from distributed edge environments to centralized cloud infrastructure, or even across organizations, multiple platforms can be chained together.
However, the heterogeneity of processing platforms and the lack of common interfaces between them pose a challenge. One way to dynamically connect processing pipelines would be to use a centralized orchestrator and message queue. Processing edge data on the data path to a centralized cloud environment, though, requires platforms to be coupled dynamically without centralized data exchange.

In the context of this thesis, you will therefore develop a unified, cross-platform data stream interface that enables interoperable, on-demand data exchange between data stream processing platforms.

Towards that goal, you will:

  • Review related work, such as Apache Wayang.
  • Analyze modern event and data stream processing platforms, such as Apache Spark Streaming, Apache Flink, Apache Kafka, or NebulaStream, with regard to the extensibility of their source and sink definitions, as well as their commonalities and differences.
  • Define a common exchange format and cross-platform interconnection.
  • Implement and evaluate the developed format for existing data stream processing platforms.
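
To give a rough idea of what a common exchange format could look like, the following is a minimal, purely illustrative sketch in Python. It is not a prescribed design: the `StreamRecord` envelope, its fields, and the `sink`/`source` helpers are hypothetical names chosen for this example, and JSON lines are assumed as the wire encoding only for simplicity. The actual format and interconnection mechanism are what you would design in the thesis.

```python
import json
from dataclasses import dataclass, asdict
from typing import Any, Dict, Iterable, Iterator, List


@dataclass
class StreamRecord:
    """Hypothetical platform-neutral envelope for one stream element."""
    stream: str               # logical stream name
    event_time_ms: int        # event timestamp in milliseconds
    schema: Dict[str, str]    # field name -> type name (assumed convention)
    payload: Dict[str, Any]   # field name -> value


def encode(record: StreamRecord) -> str:
    """Serialize a record to one JSON line (what a sink adapter would emit)."""
    return json.dumps(asdict(record), sort_keys=True)


def decode(line: str) -> StreamRecord:
    """Parse one JSON line back into a record (what a source adapter would read)."""
    record = StreamRecord(**json.loads(line))
    # Reject records whose payload does not cover the declared schema.
    missing = set(record.schema) - set(record.payload)
    if missing:
        raise ValueError(f"payload lacks fields: {sorted(missing)}")
    return record


def sink(records: Iterable[StreamRecord]) -> List[str]:
    """Stand-in for the sink side of platform A: records -> wire lines."""
    return [encode(r) for r in records]


def source(lines: Iterable[str]) -> Iterator[StreamRecord]:
    """Stand-in for the source side of platform B: wire lines -> records."""
    for line in lines:
        yield decode(line)
```

In this sketch, each platform would only need one sink adapter and one source adapter for the shared envelope, rather than a dedicated connector for every pair of platforms.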

The scope of this thesis can be adapted to either a bachelor or master thesis.


Interested? Questions? Contact us with a CV and a current transcript of records!

Liam Tirpitz, M.Sc. – Tel: +49 241 80-21542