Large language models (LLMs) can transform a natural language (NL) text query into an NL text answer. A common use case is personal assistants, e.g., for learning activities. In such teaching contexts, they can process knowledge recorded in plain text documents, create summaries, or teach knowledge according to a curriculum. However, interfaces for LLMs are typically text-based chats. This can be enhanced by showing body language, e.g., gestures that support the conveyed content. With the help of desktop-based virtual agents, the chat interface can be turned into a video call in which the LLM is personified by an agent that responds with gestures in addition to the output text.
Thesis Type |
Student | Sebastian Meinberger
Status | Running
Presentation room | Seminar room I5 6202
Supervisor(s) | Stefan Decker
Advisor(s) | Benedikt Hensen
Contact | hensen@dbis.rwth-aachen.de
This thesis explores the use case of an LLM that teaches knowledge recorded in plain NL text. The focus lies on enhancing the LLM-to-user interaction by creating multimodal visualizations of the answers to user queries. The user-to-LLM interaction consists of regular NL text queries entered through a keyboard. As the interface, an established chat program should be used to provide the LLM as a chatbot. This existing interface is then extended with visualizations in the form of a virtual agent. In a comparative study, the thesis can investigate different visualization types and gather data about their impact. One visualization type is a virtual face combined with text-to-speech output; the face can synchronize its lip movements with the speech and show suitable expressions. Another visualization type is a full 3D agent that can express gestures and use virtual objects for demonstration purposes. By comparing these visualizations to a traditional text chat, insights can be gained about the strengths and weaknesses of such visual personifications of the LLM.
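As an illustration of how a text answer could drive both speech and gestures, the following is a minimal sketch and not part of the thesis specification. It assumes the LLM is prompted to embed inline tags such as `[gesture:point]` in its answer; this tag format, the `Cue` type, and the function names are hypothetical and chosen only for this example. The answer is split into an ordered list of cues that a virtual agent could play back, while the plain text remains available for the chat window.

```python
import re
from dataclasses import dataclass

# Hypothetical inline markup: the tag format "[gesture:<name>]" is an
# assumption made for this sketch, not something the thesis prescribes.
GESTURE_TAG = re.compile(r"\[gesture:([a-z_]+)\]")


@dataclass
class Cue:
    kind: str   # "speech" or "gesture"
    value: str  # text to speak, or the name of the gesture to play


def split_answer(answer: str) -> list[Cue]:
    """Split an annotated answer into ordered speech and gesture cues."""
    cues = []
    pos = 0
    for match in GESTURE_TAG.finditer(answer):
        # Text between the previous tag and this one becomes a speech cue.
        text = answer[pos:match.start()].strip()
        if text:
            cues.append(Cue("speech", text))
        cues.append(Cue("gesture", match.group(1)))
        pos = match.end()
    # Remaining text after the last tag.
    tail = answer[pos:].strip()
    if tail:
        cues.append(Cue("speech", tail))
    return cues


def plain_text(cues: list[Cue]) -> str:
    """Recover the tag-free text, e.g. for the chat window or text-to-speech."""
    return " ".join(c.value for c in cues if c.kind == "speech")
```

In such a setup, the speech cues could be passed to a text-to-speech component and the gesture cues to the agent's animation system, so that the text-only baseline and the agent-based variants in a comparative study would be driven by the same LLM answer.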
Must: Knowledge of LLMs
Beneficial: Experience with the Unity 3D engine, C# and Python