The file fragment classification problem : a combined neural network and linear programming discriminant model approach

Author: Wilgenbus, Erich Feodor

Abstract

The increased use of digital media to store legal as well as illegal data has created the need for specialized tools that can monitor, control and even recover this data. An important task in computer forensics and security is to identify the true file type to which a computer file or computer file fragment belongs. File type identification is traditionally done by means of metadata, such as file extensions and file header and footer signatures. Consequently, traditional metadata-based file object type identification techniques work well when the required metadata is available and unaltered, but they are not reliable when the integrity of the metadata is not guaranteed or the metadata is unavailable. As an alternative, any pattern in the content of a file object can be used to determine the associated file type. This is called content-based file object type identification. Supervised learning techniques can be used to infer a file object type classifier by exploiting some unique pattern that underlies a file type's common file structure. This study builds on existing literature regarding the use of supervised learning techniques for content-based file object type identification, and explores the combined use of multilayer perceptron neural network classifiers and linear programming-based discriminant classifiers as a solution to the multiple-class file fragment type identification problem. The purpose of this study was to investigate and compare the use of a single multilayer perceptron neural network classifier, a single linear programming-based discriminant classifier and a combined ensemble of these classifiers in the field of file type identification. The ability of each individual classifier, and of the ensemble of these classifiers, to accurately predict the file type to which a file fragment belongs was tested empirically.
The study found that both a multilayer perceptron neural network and a linear programming-based discriminant classifier (used in a round robin) seemed to perform well in solving the multiple class file fragment type identification problem. The results of combining multilayer perceptron neural network classifiers and linear programming-based discriminant classifiers in an ensemble were not better than those of the single optimized classifiers.
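
As a rough illustration of content-based fragment classification with a round robin of pairwise classifiers, the sketch below may help. It is not the thesis's method: the byte-frequency histogram features are an assumption (the abstract does not say which features were used), and a simple nearest-centroid rule stands in for each pairwise discriminant instead of a linear programming model or a multilayer perceptron. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def byte_histogram(fragment: bytes) -> list[float]:
    """Normalised frequency of each of the 256 byte values (assumed feature set)."""
    counts = Counter(fragment)
    total = len(fragment) or 1  # avoid division by zero on an empty fragment
    return [counts.get(b, 0) / total for b in range(256)]


def centroid(vectors: list[list[float]]) -> list[float]:
    """Component-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def squared_distance(a: list[float], b: list[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train(samples: dict[str, list[bytes]]) -> dict[str, list[float]]:
    """Store one centroid per file type; pairwise contests happen at predict time."""
    return {
        label: centroid([byte_histogram(f) for f in fragments])
        for label, fragments in samples.items()
    }


def predict(model: dict[str, list[float]], fragment: bytes) -> str:
    """Round robin: every pair of classes holds a contest, most votes wins."""
    x = byte_histogram(fragment)
    votes: Counter = Counter()
    for a, b in combinations(sorted(model), 2):
        # Stand-in pairwise discriminant: the closer centroid wins the contest.
        winner = a if squared_distance(x, model[a]) <= squared_distance(x, model[b]) else b
        votes[winner] += 1
    return votes.most_common(1)[0][0]


# Toy training data, invented purely for illustration.
toy_samples = {
    "text": [b"hello world hello", b"the quick brown fox"],
    "digits": [b"0123456789" * 3, b"3141592653"],
    "binary": [bytes(range(256)), bytes([0, 255] * 50)],
}
toy_model = train(toy_samples)
```

Calling `predict(toy_model, fragment)` returns the label that wins the most pairwise contests. Replacing the centroid comparison with a trained two-class model is where a linear programming-based discriminant or a neural network would plug in.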

MSc (Computer Science), North-West University, Potchefstroom Campus, 2013

Masters

Country
South Africa
Keywords

Multilayer perceptron neural network, Linear programming-based discriminant classifier, Linear programming-based discriminant analysis, File type identification, File fragment type identification, Computer file format identification, Computer file fragment format identification, File fragment format identification, Classification, Ensembles, 006
