Supervised fine-tuning LLMs for cybersecurity playbook translation

April 5th, 2024

Thesis Type	Master
Student	Rabina Borici
Status	Finished
Proposal on	03/05/2024 11:15 am
Proposal room	Seminar room I5 6202
Presentation on	04/02/2025 10:00 am
Presentation room	Seminar room I5 6202
Supervisor(s)	Stefan Decker
Advisor(s)	Mehdi Akbari G. fehermsen
Contact	mehdi.akbari.gurabi@fit.fraunhofer.de felix.hermsen@fit.fraunhofer.de

The master’s thesis project aims to automate the translation of unstructured or semi-structured cybersecurity playbooks into a standardized, machine-readable format (OASIS CACAO) using Large Language Models (LLMs). It highlights the benefits of structured processes, interoperable visualizations, collaborative knowledge sharing, and automated response actions, reducing human intervention. A key research focus is ensuring the accuracy, reliability, and effectiveness of LLM-generated workflows during playbook translation. It includes a concept where security operators use LLMs to convert unstructured text into structured workflows, with syntax checkers and playbook management components ensuring standard compliance and content accuracy. The main research problem of the master’s thesis is scarcity of data in both unstructured and machine-readable formats. It will focus on fine-tuning of LLMs (e.g., GPT-3, 4 or Llama 2) with small quantity of data (with the help of other strategies such as feature extraction) for the CACAO playbook translation and further development of an already existing CACAO syntax checker component for syntax verification.

These are the resources we talked about for initial thinking of the topic:

Playbook Examples and guidelines: These links will provide simple practical examples of cybersecurity playbooks:
- https://github.com/phantomcyber/playbooks
- https://gitlab.com/syntax-ir/playbooks
- https://github.com/luduslibrum/awesome-playbooks

OASIS CACAO Specification: This document details the Collaborative Automated Course of Action Operations (CACAO) standard for cybersecurity playbooks: https://docs.oasis-open.org/cacao/security-playbooks/v2.0/security-playbooks-v2.0.html
- CACAO v2.0 syntax validator: https://github.com/opencybersecurityalliance/cacao-roaster/tree/main/src/diagram/modules/features/validator

Fine-tuning with OpenAI: This resource from OpenAI discusses fine-tuning, a method for effectively utilizing Large Language Models (LLMs) like GPT-3 or GPT-4. Understanding how to fine-tune models that results in better outcomes will be key in automating the playbook translation process: https://platform.openai.com/docs/guides/fine-tuning/when-to-use-fine-tuning
Some relevant articles:

Prerequisites:

Basic knowledge in the domains of cyber security, Natural Language Processing (Specifically, Generative AI).

DBIS

Supervised fine-tuning LLMs for cybersecurity playbook translation

Quick Links

Recent News

Recent Publications