
methods will result in higher detection and false positive rates. Therefore, the effectiveness of unsupervised intrusion detection methods is sensitive to parameter choices, especially when the boundaries between normal and abnormal behavior are not clearly distinguishable. It is therefore useful to identify the observations whose anomaly scores are extreme and deviate significantly from the rest, and to assume that such observations are “abnormal”. Conversely, the observations whose anomaly scores are significantly distant from the “abnormal” ones are assumed to be “normal”. An ensemble‐based supervised learning approach is then proposed to find a global and efficient anomaly threshold using the information from both “normal” and “abnormal” behavior.

      This section summarizes the important lessons learned from the development of robust unsupervised SCADA data‐driven Intrusion Detection Systems (IDSs), which are detailed in the various chapters of this book. The first lesson relates to the design of a SCADA security testbed through which the practicality and efficiency of SCADA security solutions are evaluated and tested, while the remaining three lessons focus on the details of the various elements of a robust unsupervised SCADA data‐driven IDS.

       The evaluation and testing of security solutions tailored to SCADA systems is a challenging issue facing researchers and practitioners working on such systems. The reasons include privacy, security, and legal constraints that prevent organizations from publishing their SCADA data. In addition, it is not feasible to conduct experiments on actual live systems, as this is highly likely to affect their availability and performance. Moreover, the establishment of a real SCADA lab can be costly and place‐constrained, and is therefore not available to all researchers and practitioners. In this book, a framework for a SCADA security testbed is described that builds a full SCADA system from a hybrid of emulation and simulation methods. A real SCADA protocol is implemented, and therefore realistic SCADA network traffic is generated. A key benefit of this framework is that it is a realistic alternative to real‐world SCADA systems and, in particular, it can be used to evaluate the accuracy and efficiency of unsupervised SCADA data‐driven IDSs.

       Unsupervised anomaly‐detection methods are time‐ and cost‐efficient because they can learn from unlabeled data: human expertise is not required to identify the behavior (whether normal or abnormal) of each observation in a large training data set. Anomaly scoring methods are believed to be promising automatic methods for assigning an anomaly degree to each observation (Chandola et al., 2009). The k‐NN method is one of the most interesting and effective methods for computing the degree of anomaly based on the neighborhood density of a particular observation (Wu et al., 2008). However, this method has a high computational cost, especially with the large and high‐dimensional data that we expect to have in the development of an unsupervised SCADA data‐driven IDS. Therefore, this book describes an efficient k‐nearest neighbor‐based method, called kNNVWC (k‐Nearest Neighbor approach based on Various‐Widths Clustering), which utilizes a novel various‐widths clustering algorithm and the triangle inequality.
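To make the idea of a neighborhood‐based anomaly score concrete, the following is a minimal brute‐force sketch, not the kNNVWC method itself. It assumes the score of an observation is the mean Euclidean distance to its k nearest neighbors, which is one common choice; the function name `knn_anomaly_scores` and the sample points are purely illustrative.

```python
import math


def knn_anomaly_scores(points, k):
    """Score each point by the mean distance to its k nearest neighbors.

    Assumption: mean k-NN distance is used as the anomaly degree.
    Brute force, so the cost grows quadratically with the number of points.
    """
    if not 0 < k < len(points):
        raise ValueError("k must be between 1 and len(points) - 1")
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


# Four points in a tight square plus one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
scores = knn_anomaly_scores(points, k=2)
```

The isolated point receives the largest score. The quadratic cost of this brute‐force search is the kind of overhead that motivates a faster neighbor search on large, high‐dimensional data.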

       It is not feasible to retain all the training data in SCADA data‐driven anomaly detection methods, especially when these are built from a large training data set. This is because such detection methods will be used for on‐line monitoring, and therefore the more information retained in the detection methods, the larger the memory capacity and the higher the computation cost required. To address this issue, this book describes a clustering‐based method, called SDAD (SCADA Data‐Driven Anomaly Detection), that extracts proximity‐based detection rules for each behavior (normal and abnormal); these rules are expected to amount to only a tiny fraction of the training data. Each rule comprehensively represents a subset of observations that represent only one behavior.
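As an illustration of how a small set of proximity‐based rules could replace the full training data at monitoring time, consider the sketch below. The rule format is an assumption made for this example only: each rule is taken to be a centroid, a radius, and a behavior label, and an observation covered by no rule is treated as abnormal by default. The names `ProximityRule` and `classify` are hypothetical and are not taken from SDAD.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ProximityRule:
    center: tuple   # assumed: centroid of the observations the rule summarizes
    radius: float   # assumed: maximum distance the rule covers
    label: str      # "normal" or "abnormal"


def classify(observation, rules, default="abnormal"):
    """Return the label of the nearest rule covering the observation.

    Assumption: observations outside every rule fall back to `default`.
    """
    best = None
    for rule in rules:
        d = math.dist(observation, rule.center)
        if d <= rule.radius and (best is None or d < best[0]):
            best = (d, rule.label)
    return best[1] if best else default


rules = [
    ProximityRule(center=(0.5, 0.5), radius=1.0, label="normal"),
    ProximityRule(center=(10.0, 10.0), radius=0.5, label="abnormal"),
]
inside_normal = classify((0.6, 0.4), rules)
inside_abnormal = classify((10.1, 10.0), rules)
uncovered = classify((5.0, 5.0), rules)
```

Only the rules are kept in memory, so the cost of checking an observation depends on the number of rules rather than on the size of the training set.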

       Unsupervised anomaly‐detection methods rely mainly on assumptions to find the near‐optimal anomaly detection threshold, so their accuracy depends on the validity of those assumptions. This book describes an efficient method, called GATUD (Global Anomaly Threshold to Unsupervised Detection), which first identifies observations whose anomaly scores deviate significantly from the others to represent “abnormal” behavior. Conversely, a tiny portion of observations whose anomaly scores are the smallest is taken to represent “normal” behavior. An ensemble‐based decision‐making method is then described, which aims to find a global and efficient anomaly threshold using the information from both “normal” and “abnormal” behavior.
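The first step described above, picking the most confidently “normal” and “abnormal” observations from their anomaly scores, can be sketched as follows. This is a simplified illustration, not GATUD: the fixed counts `n_normal` and `n_abnormal` are assumptions standing in for a proper test of significant deviation, and the midpoint threshold is a deliberately naive stand‐in for the ensemble‐based decision‐making step, which is not implemented here.

```python
def pseudo_label(scores, n_normal, n_abnormal):
    """Return indices of the lowest-scoring and highest-scoring observations.

    Assumption: fixed counts are used instead of a statistical deviation test.
    """
    if n_normal < 1 or n_abnormal < 1 or n_normal + n_abnormal > len(scores):
        raise ValueError("counts must be positive and must not overlap")
    order = sorted(range(len(scores)), key=scores.__getitem__)
    return order[:n_normal], order[-n_abnormal:]


def midpoint_threshold(scores, normal, abnormal):
    """Naive stand-in for the ensemble step: halfway between the two groups."""
    highest_normal = max(scores[i] for i in normal)
    lowest_abnormal = min(scores[i] for i in abnormal)
    return (highest_normal + lowest_abnormal) / 2


scores = [0.1, 0.2, 0.15, 0.3, 0.25, 0.2, 0.1, 0.35, 9.0, 8.5]
normal, abnormal = pseudo_label(scores, n_normal=3, n_abnormal=2)
threshold = midpoint_threshold(scores, normal, abnormal)
flagged = [i for i, s in enumerate(scores) if s > threshold]
```

Observations left unlabeled by `pseudo_label` are the ambiguous ones; a learned decision boundary is meant to handle these better than a fixed cut‐off.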

      The remainder of the book is structured as follows. Chapter 2 introduces SCADA systems, their architectures, and their main components to readers who are not familiar with them. This includes a description of the relationship between the main components and of the three generations of SCADA systems. The classification of a SCADA IDS based on its architecture and implementation is also described.

      Chapter 4 describes in detail kNNVWC, an efficient method that finds the k‐nearest neighbors in large and high‐dimensional data. In kNNVWC, a new various‐widths clustering algorithm is introduced, where the data is partitioned into a number of clusters using various widths. Triangle inequality is adapted to prune unlikely clusters in the search process of k‐nearest neighbors for an observation. Experimental results show that kNNVWC performs well in finding k‐nearest neighbors compared to a number of k‐nearest neighbor‐based algorithms, especially for a data set with high dimensions, various distributions, and large size.
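The pruning idea can be sketched in a few lines. By the triangle inequality, no member of a cluster with centroid c and radius r can be closer to a query q than d(q, c) − r, so a cluster whose lower bound is no better than the current k‐th best distance can be skipped without examining its members. The code below is a generic illustration under the assumption that clusters are already given as (centroid, radius, members) triples with every member within the radius of its centroid; it does not reproduce the various‐widths clustering algorithm, and the name `knn_with_pruning` is hypothetical.

```python
import heapq
import math


def knn_with_pruning(query, clusters, k):
    """Find the k nearest neighbors of query, skipping clusters that cannot help.

    clusters: list of (centroid, radius, members); radius must cover all members.
    Returns (sorted list of (distance, point), number of clusters scanned).
    """
    # Visit the closest clusters first so the k-th best distance tightens early.
    ordered = sorted(clusters, key=lambda c: math.dist(query, c[0]))
    best = []  # max-heap via negated distances: (-distance, point)
    scanned = 0
    for centroid, radius, members in ordered:
        # Triangle inequality: every member is at least this far from query.
        lower_bound = max(0.0, math.dist(query, centroid) - radius)
        if len(best) == k and lower_bound >= -best[0][0]:
            continue  # no member can beat the current k-th neighbor
        scanned += 1
        for p in members:
            d = math.dist(query, p)
            if len(best) < k:
                heapq.heappush(best, (-d, p))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, p))
    return sorted((-nd, p) for nd, p in best), scanned


clusters = [
    ((0.0, 0.0), 1.0, [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]),
    ((10.0, 10.0), 1.0, [(10.0, 10.0), (11.0, 10.0)]),
]
neighbors, scanned = knn_with_pruning((0.2, 0.1), clusters, k=2)
```

In this example the distant cluster is pruned, so only one of the two clusters is scanned while the result matches an exhaustive search.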

      Chapter 5 describes SDAD, a method that extracts proximity‐based detection rules from unlabeled SCADA data, based on a clustering‐based method. The evaluation of SDAD is carried out using real and simulated data sets. The extracted proximity‐based detection rules show a significant detection accuracy rate compared with an existing clustering‐based intrusion detection algorithm.

      Chapter 6 describes GATUD, a method that finds a global and efficient anomaly threshold. GATUD is proposed as an add‐on component that can be attached to any unsupervised anomaly detection method in order to define the near‐optimal anomaly threshold. GATUD shows significant and promising results with two unsupervised anomaly detection methods.

      Chapter 7 looks at the authentication aspects related to SCADA environments. It describes two innovative protocols which are based on TPASS (Threshold Password‐Authenticated Secret Sharing) protocols; one is built on two‐phase
