
Every day thousands of malware are released online. The vast majority of these malware employ some kind of obfuscation ranging from simple XOR encryption, to more sophisticated anti-analysis, packing and encryption techniques. Dynamic analysis methods can unpack the file and reveal its hidden code. However, these methods are very time consuming when compared to static analysis. Moreover, considering the large amount of new malware being produced daily, it is not practical to solely depend on dynamic analysis methods. Therefore, finding an effective way to filter the samples and delegate only obfuscated and suspicious ones to more rigorous tests would significantly improve the overall scanning process. Current techniques of identifying obfuscation rely mainly on signatures of known packers, file entropy score, or anomalies in file header. However, these features are not only easily bypass-able, but also do not cover all types of obfuscation. In this paper, we introduce a novel approach to identify obfuscated files based on anomalies in their instructions-based characteristics. We detect the presence of interleaving instructions which are the result of the opaque predicate anti-disassembly trick, and present distinguishing statistical properties based on the opcodes and control flow graphs of obfuscated files. Our detection system combines these features with other file structural features and leads to a very good result of detecting obfuscated malware.
Other Computer Engineering, malware detection
Other Computer Engineering, malware detection
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 8 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
