ScriptNet: Neural Static Analysis for Malicious JavaScript Detection

Publication: Article, Preprint
Published: 01 Nov 2019
Embargo end date: 01 Jan 2019
Publisher: IEEE
Journal: MILCOM 2019 - 2019 IEEE Military Communications Conference (MILCOM)

Authors: Stokes, Jack W.; Agrawal, Rakshit; McDonald, Geoff; Hausknecht, Matthew

doi: 10.1109/milcom47813.2019.9020870, 10.48550/arxiv.1904.01126

arXiv: 1904.01126

Abstract

Malicious scripts are an important computer infection threat vector in the wild. For web-scale processing, static analysis offers substantial computing efficiencies. We propose the ScriptNet system for neural malicious JavaScript detection, which is based on static analysis. We use the Convoluted Partitioning of Long Sequences (CPoLS) model, which processes JavaScript files as byte sequences. Lower layers capture the sequential nature of these byte sequences, while higher layers classify the resulting embedding as malicious or benign. Unlike previously proposed solutions, our model variants are trained in an end-to-end fashion, allowing discriminative training even for the sequential processing layers. Evaluating this model on a large corpus of 212,408 JavaScript files indicates that the best performing CPoLS model offers a 97.20% true positive rate (TPR) for the first 60K byte subsequence at a false positive rate (FPR) of 0.50%. The best performing CPoLS model significantly outperforms several baseline models.
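As an illustration of two ideas in the abstract, the following is a minimal sketch rather than the authors' implementation. It shows (a) truncating a script to its first 60K bytes and partitioning it into fixed-length chunks of byte values, and (b) computing the TPR achievable at a bounded FPR from classifier scores. The function names `to_chunks` and `tpr_at_fpr`, the chunk length of 1,000 bytes, and the zero-padding of the last chunk are assumptions made for this example; the abstract does not specify them.

```python
from typing import List, Sequence

MAX_BYTES = 60_000  # the abstract reports results on the first 60K bytes
CHUNK_LEN = 1_000   # hypothetical partition length, not given in the abstract


def to_chunks(script: bytes, max_bytes: int = MAX_BYTES,
              chunk_len: int = CHUNK_LEN) -> List[List[int]]:
    """Keep the first max_bytes bytes and split them into fixed-length
    chunks of byte values, zero-padding the final chunk (an assumption)."""
    data = list(script[:max_bytes])
    chunks = [data[i:i + chunk_len] for i in range(0, len(data), chunk_len)]
    if chunks and len(chunks[-1]) < chunk_len:
        chunks[-1] = chunks[-1] + [0] * (chunk_len - len(chunks[-1]))
    return chunks


def tpr_at_fpr(scores: Sequence[float], labels: Sequence[int],
               max_fpr: float) -> float:
    """Highest TPR over score thresholds whose FPR does not exceed max_fpr.

    labels: 1 = malicious, 0 = benign. A file is flagged as malicious
    when its score is greater than or equal to the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one malicious and one benign label")
    best_tpr = 0.0
    # Lowering the threshold can only add detections, so TP and FP counts
    # are non-decreasing; stop once the FPR budget is exceeded.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        if fp / negatives > max_fpr:
            break
        best_tpr = tp / positives
    return best_tpr
```

In this sketch, the chunks returned by `to_chunks` stand in for the partitioned input that a sequence model would consume, and `tpr_at_fpr(scores, labels, 0.005)` corresponds to reading off the TPR at the 0.50% FPR operating point quoted above.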

Related Organizations

Microsoft (United States)
United States
University of California, Santa Cruz
United States

Keywords

FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Cryptography and Security, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Cryptography and Security (cs.CR), Machine Learning (cs.LG)

Related research

keras software on GitHub (relation: IsRelatedTo)

Impact by BIP!

selected citations: 7. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.