Applying Anonymization Methodologies to Distributed Analytics

January 10th, 2022

Thesis Type: Bachelor
Status: Finished
Advisor(s): Sascha Welten

The thesis examines existing anonymization techniques and applies them to distributed analytics. The challenge is that anonymizing the distributed partial results separately yields lower data quality than anonymizing the entire result dataset at once. To preserve as much data quality as possible despite anonymization, synthetic data are temporarily generated and anonymized together with the original data. Once all results of the distributed analysis are available, the synthetic entries are deleted so that only the original data remain. The thesis evaluates several ways of using synthetic data for anonymization, both on their own and in combination with other existing anonymization concepts, and compares them with one another.
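
The idea of padding with temporary synthetic entries can be illustrated with a small Python sketch. It is not the thesis's implementation: the choice of k-anonymity by generalizing a single age attribute, the two example sites, the padding strategy, and all function names (bucket, pad_with_synthetic, anonymize, merge_and_strip) are assumptions made purely for illustration. Each site either anonymizes its own records alone or first pads sparse age intervals with flagged synthetic records; a central step then merges the results and removes the synthetic rows.

```python
import random
from collections import Counter

def bucket(age, width):
    """Generalize an exact age to an interval label such as '20-24'."""
    lo = (age // width) * width
    return f"{lo}-{lo + width - 1}"

def pad_with_synthetic(ages, k, width, rng):
    """Return (age, is_synthetic) rows, padding sparse intervals up to k rows.

    Hypothetical padding strategy, chosen only for this sketch.
    """
    rows = [(a, False) for a in ages]
    counts = Counter(a // width for a in ages)
    for b, n in sorted(counts.items()):
        for _ in range(max(0, k - n)):
            rows.append((rng.randint(b * width, b * width + width - 1), True))
    return rows

def anonymize(rows, k, widths=(5, 10, 20, 50, 100)):
    """Generalize with the narrowest width that gives every interval >= k rows."""
    for width in widths:
        out = [(bucket(a, width), syn) for a, syn in rows]
        counts = Counter(label for label, _ in out)
        if all(c >= k for c in counts.values()):
            return out
    raise ValueError("no candidate width satisfies k")

def merge_and_strip(site_results):
    """Central step: combine all site results, then drop the synthetic rows."""
    return [label for rows in site_results for label, syn in rows if not syn]

K = 2
sites = {"A": [23, 24, 61], "B": [21, 62, 63]}  # made-up example data
rng = random.Random(0)

# Each site anonymizes only its own original records.
plain = merge_and_strip(
    [anonymize([(a, False) for a in ages], K) for ages in sites.values()]
)
# Each site pads with synthetic records first, then anonymizes.
padded = merge_and_strip(
    [anonymize(pad_with_synthetic(ages, K, 5, rng), K) for ages in sites.values()]
)

print(Counter(plain))
print(Counter(padded))
```

In this toy setting, anonymizing each site alone forces every record into the coarse interval 0-99, whereas the padded variant keeps the 5-year intervals 20-24 and 60-64 once the synthetic rows are removed. Whether the merged result still meets the privacy criterion after deletion depends on the combined data; here each remaining interval holds three original records, so k = 2 is satisfied.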