Скачать книгу

events, as well as the possibility of integrating both previous knowledge and previous data (Heckerman 2008). Its major drawback is that results are comparable to statistical techniques, but this requires additional computation efforts. Kruegel et al. (2003) proposed a multisensor fusion approach using a Bayesian network–based classifier for the classification and cancellation of false alarms, according to which the outputs of various sensors of the intrusion detection system are aggregated to generate a single alarm. Han et al. (2015) proposed an intrusion detection algorithm based on Bayesian networks relying on the analysis into main components. The authors calculate the characteristic data value of the attack on the original network, and then extract the main properties by analysis into main components.

      1.3.4.3. Markov chains

      A Markov chain is a random process related to a finite number of states, with memoryless transition probabilities. During the learning phase, probabilities associated with transitions are estimated from the normal behavior of the target system. Detection of anomalies is then achieved by comparing the anomaly score obtained for the sequences observed at a fixed threshold. In the case of a hidden Markov model (Hu et al. 2009; Zegeye et al. 2018; Liang et al. 2019), the system we are interested in is assumed to be a Markov process in which states and transitions are masked. In the literature, several methods have been presented for solving the intrusion detection problem by inspecting the packet headers. Mahoney and Chan (2001) experimented with anomaly detection on DARPA network data by comparing the header fields of the network packet. Several systems use the Markov model for intrusion detection: PHAD (Packet Header Anomaly Detector) (Mahoney and Chan 2001), LERAD (Learning Rules for Anomaly Detection) (Mahoney and Chan 2002a) and ALAD (Application Layer Anomaly Detector) (Mahoney and Chan 2002b). In the book of Zegeye et al. (2018), an intrusion detection system using the hidden Markov model is proposed. The phase of network traffic analysis involves characteristic extraction techniques, reduction of dimensions and vector quantization, which plays an important role in large sets of data, as the amount of data transmitted increases every day. Model performances with respect to the KDD 99 dataset indicate an accuracy above 99%.

      1.3.4.4. Support-vector machines

      1.3.5. Clustering techniques

      Clustering techniques operate by organizing observed data in groups, depending on a given similarity or a distance measurement. Similarity can be measured by using the cosine formula, the binary weighted cosine formula proposed by Rawat (2005) or other formulas. The most commonly used procedure for clustering involves the selection of a representative point for each cluster. Then each new data point is classified as belonging to a given group depending on the proximity to the corresponding representative point. There are at least two approaches for the classification-based detection of anomalies. In the first approach, the anomaly detection model is formed using unlabeled data including both normal and attack traffic. In the second approach, the model is formed using only normal data and a normal activity profile is created. The idea underlying the first approach is that abnormal or attack data represent a small percentage of the total data. If this hypothesis is verified, anomalies and attacks can be detected depending on cluster size: large clusters correspond to normal data and the other data points to attacks. Liao and Vemuri (2002) used the K-nearest neighbor (K-nn) approach, based on the Euclidian distance, to define the belonging of data points to a given cluster. The Minnesota intrusion detection system is a network-based anomaly detection approach that uses data exploration and clustering techniques (Levent et al. 2004).

      1.3.6. Hybrid techniques

      Many researchers suggested that the monitoring capacity of current IDS systems could be improved by adopting a hybrid approach including detection techniques of both anomalies and signatures (Lunt et al. 1992; Anderson et al. 1995; Fortuna et al. 2002; Hwang et al. 2007). Sabhnani and Serpen (2003) proved that no single classification technique enables the detection of all the attack classes at an acceptable false alarm rate and with a good detection accuracy. The authors used various techniques to classify the intrusions by means of a KDD 1998 dataset. Many researchers proved that the hybrid or set-based classification technique can improve detection accuracy (Mukkamala et al. 2005; Chen et al. 2005; Aslahi-Shahri et al. 2016; Hamamoto et al. 2018; Hajimirzaei and Navimipour 2019; Sai Satyanarayana Reddy et al. 2019). A hybrid approach involves the integration of various learning or decision-making models. Each learning model operates differently and uses a different set of functionalities. The integration of various learning models yields better results than the individual learning or decision-making models and reduces their individual limitations. A significant advantage of the combination of redundant and complementary classification techniques is that it increases robustness and accuracy in most applications.

Скачать книгу