Detection and classification of malicious JavaScript via attack behavior modelling

Publication: Article, Conference object; 13 Jul 2015; Singapore; Publisher: ACM; Journal: Proceedings of the 2015 International Symposium on Software Testing and Analysis

Authors: Yinxing Xue; Junjie Wang; Yang Liu; Hao Xiao; Jun Sun; Mahinthan Chandramohan

doi: 10.1145/2771783.2771814

Abstract

Existing malicious JavaScript (JS) detection tools and commercial anti-virus tools mostly use feature-based or signature-based approaches to detect JS malware. These tools resist obfuscation and JS malware variants poorly, and they do not provide detailed information about attack behaviors. These limitations stem from the inability of such approaches to capture attack behaviors. In this paper, we propose to use a Deterministic Finite Automaton (DFA) to abstract and summarize the common behaviors of malicious JS of the same attack type. We propose an automatic behavior learning framework, named JS*, to learn DFAs from dynamic execution traces of JS malware, in which we implement an effective online teacher by combining data dependency analysis, defense rules, and a trace replay mechanism. We evaluate JS* on real-world data of 10000 benign and 276 malicious JS samples covering the 8 most infectious attack types. The results demonstrate the scalability and effectiveness of our approach in malware detection and classification, compared with commercial anti-virus tools. We also show how to use our DFAs to detect variants and new attacks.
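
To make the idea of matching an execution trace against a per-attack-type DFA concrete, here is a minimal Python sketch. It is not the JS* implementation: it only illustrates acceptance checking and classification, not how the DFAs are learned. The class `BehaviorDFA`, the function `classify`, the action names, and the attack label are all hypothetical, and skipping actions outside a DFA's alphabet is an assumption made for the illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple

State = str
Action = str


class BehaviorDFA:
    """A DFA over abstract actions observed in an execution trace (illustrative only)."""

    def __init__(
        self,
        start: State,
        accepting: Set[State],
        transitions: Dict[Tuple[State, Action], State],
    ) -> None:
        self.start = start
        self.accepting = accepting
        self.transitions = transitions
        # The alphabet is every action that labels at least one transition.
        self.alphabet = {action for (_, action) in transitions}

    def accepts(self, trace: Iterable[Action]) -> bool:
        state = self.start
        for action in trace:
            # Assumption: actions unknown to this DFA are ignored.
            if action not in self.alphabet:
                continue
            nxt = self.transitions.get((state, action))
            if nxt is None:
                # A missing transition acts as an implicit rejecting dead state.
                return False
            state = nxt
        return state in self.accepting


def classify(trace: List[Action], models: Dict[str, BehaviorDFA]) -> List[str]:
    """Return the labels of all models whose DFA accepts the trace."""
    return [name for name, dfa in models.items() if dfa.accepts(trace)]


# Hypothetical three-step behavior: decode, create an element, inject it.
injection = BehaviorDFA(
    start="q0",
    accepting={"q3"},
    transitions={
        ("q0", "decode_string"): "q1",
        ("q1", "create_element"): "q2",
        ("q2", "inject_into_dom"): "q3",
    },
)
models = {"hypothetical_injection": injection}

suspicious = ["decode_string", "log_message", "create_element", "inject_into_dom"]
benign = ["create_element", "log_message"]

print(classify(suspicious, models))
print(classify(benign, models))
```

In this sketch a trace is flagged only when it drives some DFA into an accepting state, and the label of the accepting DFA serves as the classification. An empty result means no modelled behavior matched.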

Country

Singapore

Related Organizations

Singapore Management University, Singapore
Nanyang Technological University, Singapore
Singapore University of Technology and Design, Singapore

Keywords

malware detection, L*, Programming Languages and Compilers, Software Engineering, malicious JavaScript, behavior modelling

Impact by BIP!

- selected citations: 29. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.