Name: Advancing Cyber Incident Timeline Analysis Through Retrieval-Augmented Generation and Large Language Models
Keywords: FOS: Computer and information sciences, Cryptography and Security, DFIR, digital forensics, GenAI, QA75.5-76.95, Machine Learning (cs.LG), cyber incident, Machine Learning, Artificial Intelligence (cs.AI)

Publication: Article, Preprint, 30 Dec 2024
Embargo end date: 01 Jan 2024
Publisher: MDPI AG
Journal: Computers, volume 14, page 67 (eissn: 2073-431X,

Authors: Fatma Yasmine Loumachi; Mohamed Chahine Ghanem; Mohamed Amine Ferrag;

doi: 10.20944/preprints202412.2516.v1 , 10.3390/computers14020067 , 10.20944/preprints202412.2516.v2 , 10.48550/arxiv.2409.02572

arXiv: 2409.02572



Abstract

Cyber timeline analysis, also called forensic timeline analysis, is critical in Digital Forensics and Incident Response (DFIR) investigations. It involves examining artefacts and events, particularly their timestamps and associated metadata, to detect anomalies, establish correlations, and reconstruct a detailed sequence of the incident. Traditional approaches rely on processing structured artefacts, such as logs and filesystem metadata, using multiple specialised tools for evidence identification, feature extraction, and timeline reconstruction. This paper introduces GenDFIR, a context-specific framework powered by the capabilities of large language models (LLMs). Specifically, it proposes the use of Llama 3.1 8B in a zero-shot setting, selected for its ability to understand cyber threat nuances, integrated with a Retrieval-Augmented Generation (RAG) agent. Our approach comprises two main stages: (1) Data Preprocessing and Structuring: incident events, represented as textual data, are transformed into a well-structured document, forming a comprehensive knowledge base of the incident. (2) Context Retrieval and Semantic Enrichment: a RAG agent retrieves relevant incident events from the knowledge base based on user prompts, and the LLM processes the retrieved context, enabling detailed interpretation and semantic enhancement. The proposed framework was tested on synthetic cyber incident events in a controlled environment. Results were assessed using DFIR-tailored, context-specific metrics designed to evaluate the framework’s performance, reliability, and robustness, and were supported by human evaluation to validate the accuracy and reliability of the outcomes. Our findings demonstrate the potential of LLMs in DFIR and in automating the timeline analysis process. This approach highlights the power of Generative AI, particularly LLMs, and opens new possibilities for advanced threat detection and incident reconstruction.
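
The two stages described in the abstract can be illustrated with a minimal Python sketch. This is not the GenDFIR implementation: the event lines, the `|`-separated layout, and the function names (`build_knowledge_base`, `retrieve`, `build_prompt`) are hypothetical, and retrieval uses a simple term-frequency cosine similarity as a stand-in for whatever embedding model a real RAG agent would use. The final call to an LLM such as Llama 3.1 8B is left out; the sketch stops at the prompt that would be sent to it.

```python
import math
import re
from collections import Counter

# Hypothetical synthetic incident events (illustrative only, not from the paper).
EVENTS = [
    "2024-01-05 09:12:03 | auth | Failed login for user admin from 203.0.113.7",
    "2024-01-05 09:12:41 | auth | Successful login for user admin from 203.0.113.7",
    "2024-01-05 09:15:10 | filesystem | File payroll.xlsx copied to external drive",
    "2024-01-05 10:02:55 | network | Routine DNS lookup for updates.example.com",
]


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric tokens."""
    return re.findall(r"[a-z0-9.]+", text.lower())


def build_knowledge_base(events):
    """Stage 1: turn raw event lines into structured, indexed records."""
    kb = []
    for i, line in enumerate(events):
        timestamp, source, description = [p.strip() for p in line.split("|", 2)]
        kb.append({
            "id": i,
            "timestamp": timestamp,
            "source": source,
            "description": description,
            # Term-frequency vector; an assumption standing in for embeddings.
            "vector": Counter(tokenize(f"{source} {description}")),
        })
    return kb


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(kb, query, k=2):
    """Stage 2: pick the k events most similar to the query, oldest first."""
    q = Counter(tokenize(query))
    top = sorted(kb, key=lambda e: cosine(q, e["vector"]), reverse=True)[:k]
    return sorted(top, key=lambda e: e["timestamp"])


def build_prompt(query, retrieved):
    """Assemble the retrieved context and the user question for the LLM."""
    context = "\n".join(
        f"[{e['timestamp']}] ({e['source']}) {e['description']}" for e in retrieved
    )
    return (
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
        "Answer using only the context above."
    )


if __name__ == "__main__":
    kb = build_knowledge_base(EVENTS)
    hits = retrieve(kb, "login admin", k=2)
    print(build_prompt("login admin", hits))
```

Sorting the retrieved events by timestamp before building the prompt keeps the context in chronological order, which suits a timeline-analysis question; a real system might instead preserve relevance order or pass both.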

Related Organizations

London Metropolitan University
United Kingdom
University of Liverpool
United Kingdom
University of Guelma
Algeria

Keywords

FOS: Computer and information sciences, Cryptography and Security, DFIR, digital forensics, GenAI, QA75.5-76.95, Machine Learning (cs.LG), cyber incident, Machine Learning, Artificial Intelligence (cs.AI), Emerging Technologies (cs.ET), Artificial Intelligence, Electronic computers. Computer science, incident response, timeline analysis, Cryptography and Security (cs.CR), Emerging Technologies

Research products (2)

- GenDFIR software on GitHub (IsRelatedTo)
- PurpleLlama software on GitHub (IsRelatedTo)

Impact by BIP!

- selected citations: 7. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.