
Existing malicious JavaScript (JS) detection tools and commercial anti-virus tools mostly use feature-based or signature-based approaches to detect JS malware. These tools are weak in resistance to obfuscation and JS malware variants, not mentioning about providing detailed information of attack behaviors. Such limitations root in the incapability of capturing attack behaviors in these approches. In this paper, we propose to use Deterministic Finite Automaton (DFA) to abstract and summarize common behaviors of malicious JS of the same attack type. We propose an automatic behavior learning framework, named JS*, to learn DFAs from dynamic execution traces of JS malware, where we implement an effective online teacher by combining data dependency analysis, defense rules and trace replay mechanism. We evaluate JS* using real world data of 10000 benign and 276 malicious JS samples to cover 8 most-infectious attack types. The results demonstrate the scalability and effectiveness of our approach in the malware detection and classification, compared with commercial anti-virus tools. We also show how to use our DFAs to detect variants and new attacks.
malware detection, L*, Programming Languages and Compilers, Software Engineering, malicious JavaScript, behavior modelling
malware detection, L*, Programming Languages and Compilers, Software Engineering, malicious JavaScript, behavior modelling
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 29 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Top 10% | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
