Off-Policy Q-Learning Technique For Intrusion Response In Network Security

With our increasing dependence on computing devices, we need adequate, efficient, and effective mechanisms for protecting our networks. Intrusion Detection Systems (IDS) attempt to solve two main problems: 1) detecting an attack by analyzing incoming traffic and inspecting the network (intrusion detection), and 2) producing a prompt response when an attack occurs (intrusion prevention). It is critical to create an intrusion detection model that detects a breach in time, and it is challenging to make that model respond automatically, with an acceptable delay, at every stage of the monitoring process. We cannot afford security measures that consume excessive computational power, nor can we accept a mechanism that reacts with a delay. In this paper, we propose an intrusion response mechanism based on artificial intelligence, and more precisely on reinforcement learning techniques (RLT). The RLT help us create a decision agent that controls the interaction with an uncertain environment. The goal is to find an optimal policy, which represents the intrusion response; we therefore solve the reinforcement learning problem using a Q-learning approach. Our agent produces an optimal immediate response while evaluating the network traffic. This Q-learning approach balances exploration and exploitation and provides a unique, self-learning, and strategic artificial intelligence response mechanism for IDS.

Keywords

Intrusion prevention, network security, optimal policy, Q-learning.
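
As a rough illustration of the Q-learning idea named in the abstract, the following is a minimal tabular sketch with epsilon-greedy exploration. It is not the authors' implementation: the state labels, response actions, reward table, uniform traffic model, and all function names and hyperparameters are hypothetical and chosen only to make the example self-contained. The update applied is the standard off-policy rule Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).

```python
import random

# Hypothetical traffic labels and response actions (not from the paper).
STATES = ["normal", "suspicious", "attack"]
ACTIONS = ["allow", "alert", "block"]

# Hypothetical reward table: REWARD[state][action].
REWARD = {
    "normal":     {"allow": 1.0,  "alert": -0.5, "block": -1.0},
    "suspicious": {"allow": -0.5, "alert": 1.0,  "block": -0.2},
    "attack":     {"allow": -2.0, "alert": 0.2,  "block": 1.0},
}


def step(state, action, rng):
    """Toy environment: look up the reward, then draw the next
    traffic sample uniformly at random (an assumption)."""
    return REWARD[state][action], rng.choice(STATES)


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[state][a])


def train(steps=20000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Run tabular Q-learning on the toy environment and return the Q-table."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in STATES}
    state = rng.choice(STATES)
    for _ in range(steps):
        action = choose_action(q, state, epsilon, rng)
        reward, next_state = step(state, action, rng)
        # Off-policy target: bootstrap from the best next action,
        # regardless of which action the behavior policy takes next.
        best_next = max(q[next_state].values())
        q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Extract the response with the highest learned value per state."""
    return {s: max(ACTIONS, key=lambda a: q[s][a]) for s in STATES}


if __name__ == "__main__":
    print(greedy_policy(train()))
```

Because the next state in this toy setup does not depend on the chosen action, the greedy policy should settle on the response with the highest immediate reward in each state. A realistic intrusion response agent would need state features derived from actual traffic and a reward signal that reflects the cost of missed attacks and false alarms.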
