World Journal of Advanced Research and Reviews
Article . 2025 . Peer-reviewed
Data sources: Crossref
ZENODO
Article . 2025
License: CC BY
Data sources: ZENODO
ZENODO
Article . 2025
License: CC BY
Data sources: Datacite

Leveraging big data engineering techniques for automated evidence extraction and pattern recognition in cybercrime forensic analysis

Authors: Nsor, Michael; Bakare, Felix Adebayo

Abstract

The exponential growth of cybercrime, ranging from identity theft to ransomware and state-sponsored attacks, has overwhelmed traditional digital forensic methodologies. These conventional approaches often rely on manual inspection and isolated system logs, making them time-consuming, error-prone, and insufficient for tracking complex, multi-layered cyber threats. In this context, big data engineering emerges as a transformative enabler for scalable, automated, and intelligent cyber forensic analysis. This paper explores the integration of big data engineering techniques, including distributed data processing, real-time stream analytics, NoSQL-based evidence repositories, and parallelized machine learning algorithms, to automate evidence extraction and uncover hidden patterns in massive, heterogeneous datasets. We propose a foundational framework that combines the Hadoop and Spark ecosystems with forensic tools to manage and analyze unstructured, semi-structured, and structured digital evidence from diverse sources, including logs, emails, file systems, and network packets. Through case-driven evaluation, we demonstrate how the system can detect behavioral anomalies, correlate time-sensitive events across systems, and extract digital artifacts with minimal human intervention. Particular focus is given to the scalability of the architecture, the forensic integrity of the data pipeline, and the legal admissibility of the outputs. The paper further discusses the challenges of maintaining chain-of-custody and privacy compliance in a high-throughput forensic environment. By bridging big data engineering and digital forensics, this study positions automated pattern recognition and evidence extraction as central to the next generation of cybercrime investigation tools. The resulting framework enhances operational efficiency, investigative depth, and the ability to respond to increasingly sophisticated cyber threats.
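
The abstract names two mechanisms without showing them: correlating time-sensitive events across systems and preserving chain-of-custody through the data pipeline. The following is a minimal single-machine sketch of both ideas in plain Python, offered only as an illustration. It is not the authors' framework and does not use Hadoop or Spark. The `Event` fields, the source labels, and the function names `custody_chain`, `verify_chain`, and `correlate` are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # All field names are illustrative assumptions, not taken from the paper.
    source: str       # originating system, e.g. "firewall" or "mail"
    timestamp: float  # seconds since the epoch
    entity: str       # shared key such as an IP address or account name
    detail: str       # free-text description of what was observed


def custody_chain(events):
    """Return one SHA-256 digest per event; each digest covers the previous one."""
    digests = []
    prev = "0" * 64  # fixed starting value for the first record
    for ev in events:
        record = json.dumps(
            {"prev": prev, "source": ev.source, "timestamp": ev.timestamp,
             "entity": ev.entity, "detail": ev.detail},
            sort_keys=True,
        )
        prev = hashlib.sha256(record.encode("utf-8")).hexdigest()
        digests.append(prev)
    return digests


def verify_chain(events, digests):
    """True only if recomputing the chain reproduces the stored digests."""
    return custody_chain(events) == digests


def correlate(events, window_seconds):
    """Pair events that share an entity, come from different sources,
    and occur within window_seconds of each other."""
    by_entity = {}
    for ev in sorted(events, key=lambda e: e.timestamp):
        by_entity.setdefault(ev.entity, []).append(ev)
    pairs = []
    for evs in by_entity.values():
        for i, first in enumerate(evs):
            for second in evs[i + 1:]:
                if second.timestamp - first.timestamp > window_seconds:
                    break  # events are time-sorted, so later ones are farther away
                if second.source != first.source:
                    pairs.append((first, second))
    return pairs
```

Because each digest covers its predecessor, altering any earlier record changes every digest after it, which is what makes tampering detectable. In a distributed setting the grouping step in `correlate` would typically become a keyed, windowed operation in the processing engine, but that mapping is an assumption here and not something the abstract specifies.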

Keywords

Big Data Engineering, Stream Analytics, Automated Evidence Extraction, Pattern Recognition, Cybercrime Forensics, Digital Investigation Systems
