
doi: 10.1109/ares.2012.85
Malicious software (malware) represents a threat to the security and privacy of computer users. Traditional signature-based and heuristic-based methods are unsuccessful in detecting some forms of malware. This paper presents a malware detection approach based on supervised learning. The main contributions of the paper are an ensemble learning algorithm, two pre-processing techniques, and an empirical evaluation of the proposed algorithm. Sequences of operational codes are extracted as features from malware and benign files. These sequences are used to produce three different data sets with different configurations. A set of learning algorithms is evaluated on the data sets and the predictions are combined by the ensemble algorithm. The predicted output is decided on the basis of veto voting. The experimental results show that the approach can accurately detect both novel and known malware instances with higher recall in comparison to majority voting.
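The difference between veto voting and majority voting can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not specify the veto rule or the feature configuration, so the sketch assumes the simplest possible rule (a single "malware" vote overrides all "benign" votes) and plain overlapping opcode n-grams. All function names, labels, and the sample opcodes are hypothetical.

```python
from collections import Counter

MALWARE, BENIGN = "malware", "benign"


def opcode_ngrams(opcodes, n):
    """Return the overlapping n-grams of a list of opcode mnemonics."""
    return [tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]


def majority_vote(predictions):
    """Label a file as malware only if malware votes outnumber benign votes.

    Ties fall back to benign (an assumption made for this sketch).
    """
    counts = Counter(predictions)
    return MALWARE if counts[MALWARE] > counts[BENIGN] else BENIGN


def veto_vote(predictions):
    """Assumed veto rule: any single malware vote overrides the rest."""
    return MALWARE if MALWARE in predictions else BENIGN


# Hypothetical opcode sequence and base-classifier outputs for one file.
features = opcode_ngrams(["push", "mov", "call", "ret"], 2)
votes = [BENIGN, BENIGN, MALWARE]

print(features)
print("majority:", majority_vote(votes))
print("veto:", veto_vote(votes))
```

With two benign votes and one malware vote, the majority rule returns benign while the assumed veto rule returns malware. This illustrates why a veto scheme can raise recall on malware, at the possible cost of more false alarms.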
veto voting, Datavetenskap (datalogi), classification, majority voting, Computer Sciences, feature extraction, detection, scareware, ensembles, Malware
| Indicator | Value |
| --- | --- |
| Selected citations | 11 |
| Popularity | Top 10% |
| Influence | Top 10% |
| Impulse | Top 10% |
