To the best of our knowledge, no existing work uses AMSO to tackle scalability issues in NSALG for intrusion detection. This work therefore intends to fill that gap by applying Adaptive Multi-Swarm Optimisation for Feature Selection (AMSO), a filter- and wrapper-based feature selection method, to NSALG in order to improve scalability and the intrusion detection rate.
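The filter stage of such a method ranks features with an information-theoretic criterion, and this study adopts symmetrical uncertainty for that purpose. The sketch below shows one way the ranking could be computed for discrete features, using the standard definition SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)). It is only an illustration: the function names and the toy records are hypothetical, and it does not reproduce the AMSO implementation of Tran (2019).

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value between 0 and 1."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant, so nothing is shared
    h_xy = entropy(list(zip(x, y)))   # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy    # I(X; Y)
    return 2.0 * mutual_info / (h_x + h_y)


def rank_features(columns, labels):
    """Return feature names sorted by SU with the class label, highest first."""
    scores = {name: symmetrical_uncertainty(col, labels)
              for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy records (not taken from any real dataset).
columns = {
    "protocol": ["tcp", "udp", "tcp", "udp"],  # unrelated to the label
    "flag": ["SF", "SF", "S0", "S0"],          # determines the label
}
labels = ["normal", "normal", "attack", "attack"]
ranking = rank_features(columns, labels)
```

In this toy example the `flag` column fully determines the label and `protocol` is independent of it, so `flag` is ranked first. Continuous features would need to be discretised before this kind of counting-based estimate can be applied.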
PROBLEM STATEMENT
Intrusion detection systems (IDSs) have been gaining importance in computer security. This is because distinguishing what belongs to a system (self) from what does not (nonself) is a difficult and challenging problem. Two main difficulties exist. First, intruders try to mimic self, making evidence scarce. Second, IDSs should provide prompt responses, which is difficult to achieve since a continuous search for strange or suspicious activities quickly gathers enormous amounts of data. There are two main approaches to intrusion detection in computer security: (i) knowledge-based and (ii) behaviour-based. In knowledge-based intrusion detection, the system searches continuously for evidence of attacks based on knowledge accumulated from known attacks (Hervé, 1999). In behaviour-based intrusion detection, also called anomaly detection, intrusions are detected as deviations from the natural or expected behaviour of the system. The main advantages of the latter models are that (1) they can potentially detect novel attacks or penetration attempts (so-called zero-day attacks), (2) they are less dependent on operating systems, and (3) they can detect abuse of privilege and many other types of attack (Hervé, 1999). Several strategies have been explored, leading to different behaviour-based models (Forrest, 2007). The artificial immune systems community has been particularly active in this respect (Bereta, 2009). Probably the best quantitative results so far have been achieved by negative selection algorithms, first proposed by Forrest (1994). These methods nevertheless have limitations. Over the years, deployed IDSs have lacked the ability to detect previously unknown intrusions (Denatious, 2012). This is a great concern as the nature of intrusions keeps evolving. This has prompted research into the defense mechanism of the human system.
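The negative selection idea mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the model proposed in this work: records are assumed to be real-valued feature vectors scaled to [0, 1], matching is assumed to be Euclidean distance within a fixed radius, and all names (`generate_detectors`, `is_nonself`, `radius`) and sample values are hypothetical.

```python
import math
import random


def generate_detectors(self_samples, n_detectors, radius, rng, max_tries=100_000):
    """Detector generation: keep random candidates that match no self sample."""
    dim = len(self_samples[0])
    detectors = []
    for _ in range(max_tries):
        if len(detectors) == n_detectors:
            break
        candidate = tuple(rng.random() for _ in range(dim))
        # Censoring step: discard any candidate that lies within `radius`
        # of a self sample, because it would raise a false alarm on self.
        if all(math.dist(candidate, s) > radius for s in self_samples):
            detectors.append(candidate)
    return detectors


def is_nonself(sample, detectors, radius):
    """Detection: flag a sample when at least one detector matches it."""
    return any(math.dist(sample, d) <= radius for d in detectors)


# Hypothetical "normal" (self) records clustered in one corner of the space.
rng = random.Random(0)
self_samples = [(0.20, 0.20), (0.25, 0.20), (0.20, 0.25), (0.22, 0.23)]
radius = 0.1
detectors = generate_detectors(self_samples, n_detectors=500, radius=radius, rng=rng)

# By construction, no detector matches a training self sample.
flags_for_self = [is_nonself(s, detectors, radius) for s in self_samples]
```

Only self data is needed to build the detectors, which is what makes the approach behaviour-based. Whether a given unseen record is flagged depends on how well the random detectors cover the nonself region.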
The human system mechanisms have served as inspiration for the development of nature-inspired algorithms. Nature-inspired algorithms such as Artificial Immune System (AIS) and Artificial Neural Network (ANN) have gained popularity due to their ability to solve real-world problems efficiently (Wu, 2010). AIS algorithms include the negative selection algorithm (NSALG), clonal selection algorithm (CSALG), artificial immune network algorithm (AINALG) and dendritic cell algorithm (DCALG) (Silva, 2016). NSALG has gained the attention of researchers due to its intrinsic anomaly detection characteristic. Salau-Ibrahim (2020) proposed an NSALG-based anomaly IDS model that uses wrapper-based feature selection to tackle the scalability issue. However, scalability remains a challenge. To the best of our knowledge, no existing work uses AMSO to tackle scalability issues in NSALG for intrusion detection. This work therefore intends to fill that gap by applying Adaptive Multi-Swarm Optimisation for Feature Selection (AMSO), a filter- and wrapper-based feature selection method, to NSALG in order to improve scalability and the intrusion detection rate. The adaptive multi-swarm optimisation for feature selection algorithm (AMSO) (Tran, 2019) was proposed for high-dimensional feature selection problems. AMSO starts by ranking the available features in descending order, using any information-theoretic criterion. Following Tran (2019), this study uses symmetrical uncertainty as the information-theoretic criterion. AMSO was chosen for its novelty and its attention to dimensionality reduction. Since AMSO has a stochastic element, fitness is reported as the median over 30 independent runs of the algorithm.
Negative Selection Algorithm Based Intrusion Detection Model
978-1-7281-5200-4/20/$31.00 ©2020 IEEE
Salau-Ibrahim Taofeekat Tosin
Department of Computer Science
Al-Hikmah University
Ilorin, Nigeria
[email protected]
Jimoh Rasheed Gbenga
Faculty of Communication and Information Science
University of Ilorin
Ilorin, Nigeria
[email protected]
Abstract— The ever-growing security challenges posed by multifaceted network intrusions have been a hindrance to the success of Information Technology innovations. Hence, it becomes imperative to provide tools that can address them without compromising the integrity, confidentiality and availability of network resources. This paper presents a model for detecting intrusion in a network using the Negative Selection Algorithm. Negative selection, which is inspired by the Human Immune System (HIS), has been used for anomaly detection due to its self/non-self discrimination potential. However, it suffers from a high rate of false positives and from scalability issues. This paper addresses these issues using feature selection to reduce the dimensionality of the dataset. The intrusion detection model is evaluated using the NSL-KDD dataset. The results obtained using the benchmark dataset showed that the scalability issue was reduced in the proposed approach.
Keywords— Intrusion Detection System, Artificial Immune System, Negative Selection Algorithm, Feature Selection.
I. INTRODUCTION
The rapid growth of technologies and sophisticated cyber threats has made research in the field of network intrusion detection open-ended, as new intrusions are being introduced every day. Consequently, it is vital to deploy better ways of detecting intrusions on today’s Information Systems (IS) that thrive and deliver considerably in a networked environment. Therefore, new Intrusion Detection Systems (IDS) must be capable of identifying unknown threats as well as coping with the voluminous data from networked systems. Intrusion detection systems are hardware or software systems for automating the process of intrusion detection in a computer or network. IDS is an old concept that has been in existence since the 1980s. It was first introduced to the research community in James Anderson’s influential paper [1].
Since that time, research on intrusion detection systems has gained substantial focus as a result of advancements in technology that have increased the vulnerability of information system assets to various attacks. Over the years, deployed IDS have lacked the ability to detect previously unknown intrusions [2], [3]. This is a great concern as the nature of intrusions keeps evolving. This has prompted research into the defense mechanism of the human system. The human system mechanisms have served as inspiration for the development of nature-inspired algorithms. Nature-inspired algorithms such as Artificial Immune System (AIS) and Artificial Neural Network (ANN) have gained popularity due to their ability to solve real-world problems efficiently [4]. AIS algorithms include the negative selection algorithm (NSALG), clonal selection algorithm (CSALG), artificial immune network algorithm (AINALG) and dendritic cell algorithm (DCALG) [5]. Recently, many IDS models have been implemented using AIS algorithms [6]–[10]. In this paper, an NSALG-based anomaly IDS model is proposed using wrapper-based feature selection to tackle the scalability issue. The model has three modules: the data preprocessing module, the NSALG module and the alert generation module. The rest of the paper is organized as follows: Section 2 provides a review of IDS, AIS and NSALG concepts. The methodology is explained in Section 3. Section 4 presents the experiment conducted and the results obtained, as well as a comparison between NSALG with and without feature selection (FS). Lastly, Section 5 presents the conclusion and future direction of the research.
II. RELATED WORK
A. Intrusion Detection System
Intrusion detection is the process of detecting activities on a computer or networked computers that attempt to compromise the confidentiality, integrity and availability of resources. Generally, the components of an intrusion detection system are data collection, detection and response.
The data collection component is composed of the target system, event generator, log data storage and data collection configuration. The detection component is made up of the analysis engine, state information and detection policy. Lastly, the response component is made up of the response unit and response policy. The components of IDS are depicted in Fig. 1.
Fig. 1. Components of IDS [11]
The target system is usually the system under surveillance. It collects data such as network traffic, system logs, and application logs. The event generator controls log information. Data transformation and cleaning also take place during event generation. The transformed data resides in the log data storage in preparation for analysis. The data collection configuration contains configuration information on how collected data are handled. The analysis engine handles implementation of the detection algorithm. The detection policy contains information about representation, threshold values and affinity measures. The state information contains details about the current state of the system. The response unit receives information about the nature of the event that occurred, that is, whether it is normal or intrusive. Using the rules stored in the response policy database, the appropriate countermeasure to the event is triggered.
This research work is partially funded by National Office for Technology Acquisition and Promotion (NOTAP)-Industrial Technology Transfer Fellowship.
IDS can be categorized either by analysis/detection or placement/location approach. Considering the method of detection, an IDS can be misuse-based or anomaly-based. Misuse detection uses a database of known intrusions, while anomaly detection uses a database of normal processes on the computer or network.
For placement approach, an IDS can be host-based (resident on the host computer) or network-based (resident on the network). For a detailed taxonomy of IDS, refer to [12], [13]. Several techniques have been used for designing intrusion detection systems, such as statistical methods, data mining techniques, knowledge-based techniques, mobile-agent-based approaches and machine learning approaches [2], [3], [13]–[17]. As promising as these methods seem, there are still challenges in differentiating malicious actions from normal actions, in parameter selection and in modelling behavior using stochastic methods. Furthermore, difficulty in obtaining high-quality training data, high resource consumption, storage space for the huge amount of data required for analysis, time, high false positive rates and the inability to detect new intrusions are issues that require attention [2], [18], [19].
B. Negative Selection Algorithm
NSALG works based on the principles of T-cell maturation and the ability to distinguish between self and non-self cells in the human body. These functions of maturation and discrimination take place in the thymus. In the thymus, T-cells that bind to self-proteins (self-cells) are destroyed via apoptosis. Afterwards, only T-cells that can recognize antigens remain, i.e. those that do not bind to self-proteins. The remaining T-cells are the mature ones that eventually leave the thymus to flow through the body and protect against foreign antigens [7], [20], [21]. In the field of computer security, NSALG was first brought into the limelight in 1994 by Forrest [22]. At that time, it was applied to determine the effect of a computer virus. The algorithm has two phases: detector generation and detection. In the detector generation phase, self-elements are