
handle: 10394/10215
The increased use of digital media to store legal as well as illegal data has created the need for specialized tools that can monitor, control and even recover this data. An important task in computer forensics and security is to identify the true file type to which a computer file or computer file fragment belongs. File type identification is traditionally done by means of metadata, such as file extensions and file header and footer signatures. As a result, traditional metadata-based file object type identification techniques work well in cases where the required metadata is available and unaltered. However, traditional approaches are not reliable when the integrity of metadata is not guaranteed or metadata is unavailable. As an alternative, any pattern in the content of a file object can be used to determine the associated file type. This is called content-based file object type identification. Supervised learning techniques can be used to infer a file object type classifier by exploiting some unique pattern that underlies a file type's common file structure. This study builds on existing literature regarding the use of supervised learning techniques for content-based file object type identification, and explores the combined use of multilayer perceptron neural network classifiers and linear programming-based discriminant classifiers as a solution to the multiple class file fragment type identification problem. The purpose of this study was to investigate and compare the use of a single multilayer perceptron neural network classifier, a single linear programming-based discriminant classifier and a combined ensemble of these classifiers in the field of file type identification. The ability of each individual classifier and of the ensemble of these classifiers to accurately predict the file type to which a file fragment belongs was tested empirically.
The study found that both a multilayer perceptron neural network and a linear programming-based discriminant classifier (used in a round robin) seemed to perform well in solving the multiple class file fragment type identification problem. The results of combining multilayer perceptron neural network classifiers and linear programming-based discriminant classifiers in an ensemble were not better than those of the single optimized classifiers.
MSc (Computer Science), North-West University, Potchefstroom Campus, 2013
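The contrast the abstract draws between metadata-based and content-based identification can be illustrated with a minimal Python sketch. The signature table, the function names, and the choice of a normalized byte-frequency histogram as the content feature are assumptions made here for illustration only; the abstract does not state which features or signatures the study used.

```python
from collections import Counter

# Hypothetical, deliberately tiny signature table (illustration only).
SIGNATURES = {
    b"%PDF": "pdf",
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
}


def identify_by_signature(data: bytes):
    """Metadata-based check: match a known header signature at offset 0."""
    for magic, file_type in SIGNATURES.items():
        if data.startswith(magic):
            return file_type
    return None  # header missing or altered, e.g. a mid-file fragment


def byte_histogram(fragment: bytes):
    """Content-based feature (assumed): relative frequency of each byte value."""
    if not fragment:
        return [0.0] * 256
    counts = Counter(fragment)  # iterating bytes yields ints in 0..255
    total = len(fragment)
    return [counts.get(value, 0) / total for value in range(256)]
```

A fragment taken from the middle of a file carries no header, so `identify_by_signature` returns `None`, while `byte_histogram` still yields a 256-value feature vector that a supervised classifier could be trained on.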
File fragment type identification, Klassifikasie, 006, Lêerfragmentformaatidentifisering, Linear programming-based discriminant analysis, Classification, Multilayer perceptron neural network, Lineêre programmeringgebaseerde diskriminantklassifiseerder, File type identification, Multilaag-perseptron neurale netwerk, Rekenaarlêerfragmentformaatidentifisering, Ensembles, Rekenaarlêerformaatidentifisering
