publication . Preprint . Article . 2017

Towards Building an Intelligent Anti-Malware System: A Deep Learning Approach using Support Vector Machine (SVM) for Malware Classification

Abien Fred Agarap;
Open Access English
  • Published: 31 Dec 2017
Abstract
Effective and efficient mitigation of malware is a long-standing endeavor in the information security community. The development of an anti-malware system that can counteract previously unknown malware is an active area of work that may benefit several sectors. We envision an intelligent anti-malware system that utilizes the power of deep learning (DL) models. Using such models would enable the detection of newly-released malware through mathematical generalization, that is, by finding the relationship between a given malware $x$ and its corresponding malware family $y$, $f: x \mapsto y$. To accomplish this feat, we used the Malimg dataset (Nataraj et al., 2011) which consists of...
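To make the mapping $f: x \mapsto y$ concrete, here is a minimal sketch of a multi-class classifier trained with the SVM hinge loss. It is an illustration only and not the architecture from the paper, whose details are cut off in the abstract above. It assumes a one-vs-rest linear SVM fitted by subgradient descent, and it uses synthetic 2-D points as a stand-in for malware features rather than the Malimg data. All names (`CENTERS`, `make_toy_data`, `train_one_vs_rest`, `predict`) and all hyperparameters are hypothetical.

```python
import random

# Hypothetical cluster centers, one per stand-in "family" (NOT Malimg data).
CENTERS = [(-3.0, -3.0), (3.0, -3.0), (0.0, 3.0)]


def make_toy_data(n_per_class=30, seed=0):
    """Draw Gaussian points around each center; the label is the center index."""
    rng = random.Random(seed)
    xs, ys = [], []
    for label, (cx, cy) in enumerate(CENTERS):
        for _ in range(n_per_class):
            xs.append([cx + rng.gauss(0.0, 0.5), cy + rng.gauss(0.0, 0.5)])
            ys.append(label)
    return xs, ys


def score(w, b, x):
    """Linear decision value w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_one_vs_rest(xs, ys, n_classes, epochs=200, lr=0.01, reg=0.01, seed=0):
    """Fit one linear SVM per class with L2-regularized hinge loss."""
    rng = random.Random(seed)
    dim = len(xs[0])
    weights = [[0.0] * dim for _ in range(n_classes)]
    biases = [0.0] * n_classes
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit samples in a fresh random order each epoch
        for i in order:
            x, y = xs[i], ys[i]
            for k in range(n_classes):
                t = 1.0 if y == k else -1.0  # +1 for "is class k", else -1
                violated = t * score(weights[k], biases[k], x) < 1.0
                for j in range(dim):
                    # Subgradient of reg/2 * |w|^2 + max(0, 1 - t * (w . x + b))
                    grad = reg * weights[k][j] - (t * x[j] if violated else 0.0)
                    weights[k][j] -= lr * grad
                if violated:
                    biases[k] += lr * t
    return weights, biases


def predict(weights, biases, x):
    """f: x -> y, the class whose SVM gives the highest decision value."""
    scores = [score(w, b, x) for w, b in zip(weights, biases)]
    return scores.index(max(scores))


if __name__ == "__main__":
    xs, ys = make_toy_data()
    weights, biases = train_one_vs_rest(xs, ys, n_classes=len(CENTERS))
    correct = sum(predict(weights, biases, x) == y for x, y in zip(xs, ys))
    print(f"training accuracy: {correct / len(xs):.2f}")
```

In a deep learning setting, the same hinge objective would typically sit on top of learned features instead of raw inputs. The sketch keeps the features fixed so that only the classification step is visible.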
Subjects
ACM Computing Classification System: ComputingMethodologies_PATTERNRECOGNITION, Software_OPERATINGSYSTEMS
free text keywords: Computer Science - Neural and Evolutionary Computing, Computer Science - Cryptography and Security, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Statistics - Machine Learning
17 references, page 1 of 2

[1] 2017. Deep MNIST for Experts. (Nov 2017). https://www.tensorflow.org/get_started/mnist/pros

[2] Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2015. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. (2015). http://tensorflow.org/ Software available from tensorflow.org.

[3] Abien Fred Agarap. 2017. A Neural Network Architecture Combining Gated Recurrent Unit (GRU) and Support Vector Machine (SVM) for Intrusion Detection in Network Traffic Data. arXiv preprint arXiv:1709.03082 (2017).

[4] Kyunghyun Cho, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078 (2014).

[5] C. Cortes and V. Vapnik. 1995. Support-vector Networks. Machine Learning 20, 3 (1995), 273-297. https://doi.org/10.1007/BF00994018

[6] Felan Carlo C. Garcia and Felix P. Muga II. 2016. Random Forest for Malware Classification. arXiv preprint arXiv:1609.07770 (2016).

[7] Ian Goodfellow, Yoshua Bengio, and Aaron Courville. 2016. Deep Learning. MIT Press. http://www.deeplearningbook.org.

[8] J. D. Hunter. 2007. Matplotlib: A 2D graphics environment. Computing In Science & Engineering 9, 3 (2007), 90-95. https://doi.org/10.1109/MCSE.2007.55

[9] Diederik Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).

[10] Andrew L Maas, Awni Y Hannun, and Andrew Y Ng. 2013. Rectifier nonlinearities improve neural network acoustic models. In Proc. ICML, Vol. 30.

[11] Warren S McCulloch and Walter Pitts. 1943. A logical calculus of the ideas immanent in nervous activity. The bulletin of mathematical biophysics 5, 4 (1943), 115-133.

[12] Lakshmanan Nataraj, S Karthikeyan, Gregoire Jacob, and BS Manjunath. 2011. Malware images: visualization and automatic classification. In Proceedings of the 8th international symposium on visualization for cyber security. ACM, 4.

[13] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. 2011. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research 12 (2011), 2825-2830.

[14] Frank Rosenblatt. 1958. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological review 65, 6 (1958), 386.

[15] Gary B Shelly and Misty E Vermaat. 2011. Discovering Computers, Complete: Your Interactive Guide to the Digital World. Cengage Learning.
