DBIS

Reasoning-based LLM enhancement for the cybersecurity playbook translation into a standardised format

June 27th, 2024

Thesis Type
  • Bachelor
Student
Justin Backes
Status
Running
Proposal on
12/07/2024 10:45 am
Proposal room
Seminar room I5 6202
Supervisor(s)
Stefan Decker
Advisor(s)
Mehdi Akbari G.
Roman Matzutt
Contact
mehdi.akbari.gurabi@fit.fraunhofer.de
roman.matzutt@fit.fraunhofer.de

This thesis project aims to automate the translation of unstructured or semi-structured cybersecurity playbooks into a standardized, machine-readable format (OASIS CACAO) using Large Language Models (LLMs). A key research focus is ensuring the accuracy, reliability, and effectiveness of LLM-generated workflows during playbook translation. The underlying concept is that security operators use LLMs to convert unstructured text into structured workflows, while syntax checkers and playbook management components ensure standard compliance and content accuracy. The thesis will focus on applying reasoning-based prompting techniques, such as Chain-of-Thought and ReAct, to state-of-the-art LLMs for CACAO playbook translation, and on further developing an existing CACAO syntax checker component for syntax verification and prompt improvement. The outcome will contribute to a methodological approach for using LLMs to automate the translation of cybersecurity playbooks effectively.
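The interplay between the LLM and the syntax checker described above can be pictured as a generate, check, and retry loop. The following Python sketch is only an illustration under stated assumptions: `check_playbook` is a hypothetical, heavily simplified stand-in for a real CACAO validator, the list of required properties is an assumed subset that should be verified against the CACAO v2.0 specification, and `llm` is any caller-supplied function that maps a prompt string to a response string.

```python
import json
from typing import Callable, List

# Assumed subset of top-level playbook properties, for illustration only.
# The CACAO v2.0 specification is the authoritative source.
REQUIRED_KEYS = ("type", "spec_version", "id", "name", "workflow_start", "workflow")


def check_playbook(text: str) -> List[str]:
    """Return a list of problems in a candidate playbook (empty if none found)."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(doc, dict):
        return ["top-level value must be a JSON object"]

    errors = [f"missing property: {key}" for key in REQUIRED_KEYS if key not in doc]

    workflow = doc.get("workflow", {})
    if not isinstance(workflow, dict):
        errors.append("workflow must be an object keyed by step id")
        return errors

    # The start reference must name an existing step.
    start = doc.get("workflow_start")
    if start is not None and (not isinstance(start, str) or start not in workflow):
        errors.append(f"workflow_start refers to unknown step: {start}")

    # Every on_completion reference must name an existing step.
    for step_id, step in workflow.items():
        nxt = step.get("on_completion") if isinstance(step, dict) else None
        if nxt is not None and (not isinstance(nxt, str) or nxt not in workflow):
            errors.append(f"step {step_id} points to unknown step: {nxt}")
    return errors


def translate(playbook_text: str, llm: Callable[[str], str], max_rounds: int = 3) -> str:
    """Ask the LLM for a playbook and feed checker errors back until it passes."""
    prompt = (
        "Think step by step: list the actions in the playbook below, "
        "order them, then output only a CACAO JSON playbook.\n\n" + playbook_text
    )
    for _ in range(max_rounds):
        candidate = llm(prompt)
        errors = check_playbook(candidate)
        if not errors:
            return candidate
        # Append the checker feedback so the next attempt can correct it.
        prompt += (
            "\n\nThe previous output was rejected:\n"
            + "\n".join(errors)
            + "\nPlease correct it."
        )
    raise ValueError(f"no valid playbook after {max_rounds} rounds")
```

Passing the checker's error messages back into the prompt is one simple way to let syntax verification inform prompting. A structural check of this kind says nothing about whether the workflow content is faithful to the source text, which would need separate evaluation.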

* OASIS CACAO Specification: This document details the Collaborative Automated Course of Action Operations (CACAO) standard for cybersecurity playbooks: https://docs.oasis-open.org/cacao/security-playbooks/v2.0/security-playbooks-v2.0.html

* CACAO v2.0 syntax validator: https://github.com/opencybersecurityalliance/cacao-roaster/tree/main/src/diagram/modules/features/validator

* Playbook Examples:

1. Phantom Cyber’s GitHub repository provides simple, practical examples of cybersecurity playbooks: https://github.com/phantomcyber/playbooks
2. https://github.com/luduslibrum/awesome-playbooks

Seed papers:

Process Modeling With Large Language Models: https://arxiv.org/pdf/2403.07541.pdf

A Method for Extracting BPMN Models from Textual Descriptions Using Natural Language Processing: https://zir.nsk.hr/islandora/object/unipu:8207/datastream/PDF/download

Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources: https://arxiv.org/pdf/2305.13269


Prerequisites:

Basic knowledge of cybersecurity and Natural Language Processing (specifically, Generative AI).