This course provides participants with a comprehensive and versatile toolbox of data visualisation and analysis methods, which can be transferred to a vast number of applications.
Type | Lab (advanced level) |
Term | WS 2022 |
Starting | 21.10.2022 at 9:30am |
Mentor(s) | Thomas Rose |
Assistant(s) |
As more and more processes move to the digital world, data sets on numerous aspects of our daily lives become increasingly abundant. The ability to make sense of these data sets and to employ them successfully to understand and improve services is becoming a key skill in most industries. Such data sets call for a full-fledged quantitative analysis to answer questions that used to be hard or even impossible to tackle: how do government policies influence labour market decisions, which farming decisions influence animal welfare, and what individual characteristics determine a person's wage? Data analysis is certainly a key capability for turning data into an opportunity, be it for societal or business reasons.
Methods, processes and tools for data analysis and visualization are therefore becoming key ingredients in producing the knowledge that decision processes depend on.
This lab course teaches methods and tools for analyzing and visualizing data sets as a technical cornerstone of Data Science. It conveys the technical foundation and gives ample opportunity to practice data analysis and visualization in a variety of contexts while employing a healthy toolbox of methods. Innovation in analysis is founded on method orchestration. Previous lab courses have approached specific sets of questions, such as how to assess gender gaps[1] or how to explore commuter patterns for the planning of transit routes, mobility[2] and the distribution of mining nodes for Distributed Ledger Technology. Moreover, data analytics and visualization have been applied to the assessment of animal welfare on the basis of farming decisions[3].
At the beginning of the course, the key concepts of data analysis and visualization are taught using R, a free open-source software package. This part focuses on data processing, exploratory data analysis (EDA), regression and classification models, and the creation of a variety of data visualizations to inform and communicate the results of the analysis. The course expects basic knowledge of statistical terminology, such as mean, variance and distributions.
Supplementary techniques, which can be covered as the need arises, may include additional visualization tools, combining multiple data sets, presentation skills and more. All these methods and techniques are practiced on applied real-world data sets.
In the second part, students combine these methods to answer increasingly complex questions. The heart of this course consists of group-specific applied projects, in which students analyze real-world data sets on their own, work on answering complex questions and present their results to a larger audience.
Apart from acquiring the functional and technical foundations, students will experience the operative potential of data analysis and its application in analysis processes. In effect, this course turns data science into a living lab.