CN113194098A

CN113194098A - Water distribution system network physical attack detection method based on deep learning

Info

Publication number: CN113194098A
Application number: CN202110480658.7A
Authority: CN
Inventors: 李娟�; 王迪; 杜海龙; 刘贲; 乔乔; 左英泽
Original assignee: Jilin University
Current assignee: Jilin University
Priority date: 2021-04-30
Filing date: 2021-04-30
Publication date: 2021-07-30

Abstract

The invention discloses a deep learning-based method for detecting cyber-physical attacks on a water distribution system, comprising the following steps: step one, constructing a stacked self-encoder and inputting training set data to obtain reconstructed data; step two, comparing the reconstructed data with the training set data and constructing a reconstruction error matrix; step three, decomposing the reconstruction error matrix into window errors over time steps, and comparing each window error with a threshold value: if the window error is not larger than the threshold value, the water distribution system network is normal; if the window error is larger than the threshold value, an attack on the water distribution system network is detected. By learning the features of high-dimensional data, the invention detects attacks accurately.

Description

Water distribution system network physical attack detection method based on deep learning

Technical Field

The invention relates to the technical field of rescue equipment, in particular to a deep learning-based method for detecting network physical attack of a water distribution system.

Background

At present, a water distribution system integrates the data acquisition, transmission, online monitoring, real-time automatic operation and other equipment of a modern cyber-physical system. Normal operation is achieved mainly by an integrated sensor network and programmable logic controllers that control the corresponding equipment. Data acquisition and transmission are handled mainly by a supervisory control and data acquisition (SCADA) system, referred to below as the monitoring and data acquisition system, which can monitor the state, flow, pressure and the like of any part of the water distribution system in real time. However, combining physical infrastructure with intelligent network technology exposes the physical infrastructure to the virtual network and makes it vulnerable to network attacks, under which the system may behave abnormally. An anomaly may generally be defined as data that does not conform to a specified notion of normal behavior. Cyber attacks have become an increasingly serious threat. The Federal Bureau of Investigation has classified cyber crime as a top priority in fighting crime. According to an investigation by the U.S. Department of Homeland Security, the water distribution system is among the critical infrastructure most vulnerable to cyber attacks, and is the third most targeted industry after critical manufacturing and energy. An attacker indirectly causes anomalies in the water distribution system by attacking the monitoring and data acquisition system. For example, attackers can use different attack methods to maliciously tamper with the water level of a water tank. Therefore, it is extremely important to identify these anomalies.

To mitigate these threats in the future, recent research has discussed the importance of building a mature network security culture to strengthen the ability to defend against network attacks. In addition, security measures are applied to the corresponding components of the cyber-physical system (including the remote sensors and actuators, the communication network and the monitoring and data acquisition system module) to enhance their resilience in the face of cyber attacks. At present, the detection of cyber-physical attacks on water supply infrastructure has been studied in several ways. A hydraulic model has been used to study network attack detection in a canal network, and a genetic algorithm combined with a recursive Bayesian rule has been used to detect abnormal hydraulic or water quality events. Among data-based methods, a data-driven clustering technique has been used as an intrusion detection method for attacks on the monitoring and data acquisition system; outliers can be detected by analyzing the correlations between the data of different sensors acquired by that system, but the method does not scale well to large data sets, which usually contain a large amount of irrelevant and redundant information. Model-based methods can detect more attacks, but they require a good physical hydraulic model. Among learning-based methods, principal component analysis and an artificial neural network have been used to set a threshold for detecting attacks, but the threshold is chosen from human experience, so either the detection recall is very low, or the recall is very high with too many false alarms, and no balance is achieved.

As the frequency of cyber physical attacks on water distribution systems continues to increase, it is desirable to develop an attack detection method to protect the critical infrastructure of water supply systems.

Disclosure of Invention

The invention aims to design and develop a deep learning-based method for detecting physical attacks on a water distribution system network, which reproduces data acquired in the operation process of facilities in a water supply system under normal conditions through a stack self-encoder, judges the attack behavior in the water distribution system network by analyzing and processing reconstruction errors, and accurately identifies the attacks on the water distribution system network through learning the characteristics of high-dimensional data.

The technical scheme provided by the invention is as follows:

a method for detecting network physical attacks of a water distribution system based on deep learning comprises the following steps:

step one, constructing a stack self-encoder and inputting data to obtain reconstructed data;

step two, comparing the reconstruction data with the training set data and constructing a reconstruction error matrix;

step three, decomposing the reconstruction error matrix into a window error of a time step, and comparing the window error with a threshold value:

if the window error is not larger than the threshold value, the water distribution system network is normal;

and if the window error is larger than the threshold value, the water distribution system network detects the attack.

Preferably, the constructing of the stack self-encoder includes the steps of:

step 1, constructing a self-encoder;

step 2, stacking a plurality of self-encoders to construct a stack self-encoder;

step 3, normalizing the training set data, the verification set data and the test set data, training the stack self-encoder with the training set data, verifying the stack self-encoder with the verification set data and generating a threshold value, and testing the stack self-encoder with the test set data.

Preferably, the training set data is data collected when the water distribution system network is operating normally, the validation set data is data collected when the water distribution system network is under various types of attacks, and the test set data is data collected when the water distribution system network is under various types of attacks.

Preferably, the hidden layer h_i of the self-encoder satisfies:

h_i＝f(W₁x+b₁)；

wherein x represents the input data, x＝[x₍₁₎,x₍₂₎,x₍₃₎,...,x_(n)]^T∈Rⁿ, n denotes the dimension of the data, W₁ represents the weight matrix of the encoder, b₁ represents the offset vector of the encoder, h_i＝[h₍₁₎,h₍₂₎,h₍₃₎,...,h_(m)]^T∈R^m, m denotes the dimension of the hidden layer with m < n, and f(z) denotes the activation function;

the output data of the self-encoder satisfies:

y＝f(W₂h+b₂)；

wherein y represents the output data, y＝[y₍₁₎,y₍₂₎,y₍₃₎,...,y_(n)]^T∈Rⁿ, W₂ represents the weight matrix of the decoder, and b₂ represents the offset vector of the decoder.

Preferably, the activation function satisfies:

f(z)＝1/(1+e^(−z))；

preferably, the activation function satisfies:

f(z)＝(e^z−e^(−z))/(e^z+e^(−z))；

preferably, the threshold is the 99th percentile of the reconstruction errors calculated on the 20% of the training set held out for verification.

Preferably, the method further comprises evaluating the stack self-encoder:

F₁＝2·Precision·Recall/(Precision+Recall)；

in the formula, F₁ represents the balance index of recall and precision, Recall represents the recall rate, and Precision represents the precision rate;

the balance index of recall and precision ranges from 0 to 1; a balance index of 0 indicates that the output result of the stack self-encoder is worst, and a balance index of 1 indicates that the output result of the stack self-encoder is optimal.

Preferably, the recall rate satisfies:

Recall＝TP/(TP+FN)；

in the formula, TP represents the number of actual attacks determined to be attacks, and FN represents the number of actual attacks determined to be normal behavior.

Preferably, the precision rate satisfies:

Precision＝TP/(TP+FP)；

in the formula, FP represents the number of actually normal behaviors determined to be attacks.

The invention has the following beneficial effects:

(1) The deep learning-based method for detecting cyber-physical attacks on a water distribution system designed and developed by the invention relies mainly on data collected by the monitoring and data acquisition system during normal operation of the water supply system, that is, attack-free data. Through layer-by-layer training the network converges quickly and avoids overfitting and the local optima of gradient descent, so it fits the data better and extracts the data features.

(2) The method for detecting the network physical attack of the water distribution system based on deep learning, which is designed and developed by the invention, focuses on the balance between the detection recall rate and the accuracy rate through the evaluation index, namely, the attack is detected as much as possible while the false alarm is reduced, and the algorithm is optimized.

(3) The method for detecting the network physical attack of the water distribution system based on deep learning, which is designed and developed by the invention, can identify most attacks simulated in a data set, and is superior to the traditional algorithm.

Drawings

FIG. 1 is a schematic structural diagram of the self-encoder according to the present invention.

FIG. 2 is a schematic structural diagram of the stacked self-encoder according to the present invention.

FIG. 3 is a diagram illustrating the layer-by-layer training process of the stacked self-encoder according to the present invention.

FIG. 4 is a diagram illustrating the overall fine-tuning structure of the stacked self-encoder according to the present invention.

FIG. 5 is a schematic diagram of attack detection with the stacked self-encoder according to the present invention.

FIG. 6 is a diagram illustrating a detection result of the stacked self-encoder according to the present invention.

FIG. 7 is a diagram illustrating the result of the artificial neural network test according to the present invention.

FIG. 8 is a diagram illustrating the detection results of the self-encoder according to the present invention.

FIG. 9 is a diagram illustrating the detection results of the shallow stacked self-encoder according to the present invention.

FIG. 10 is a diagram illustrating the detection results of the deep stack self-encoder according to the present invention.

Detailed Description

The present invention is described in further detail below in order to enable those skilled in the art to practice the invention with reference to the description.

The invention provides a deep learning-based method for detecting network physical attack of a water distribution system, which comprises the following steps:

As shown in FIG. 1, the self-encoder is an unsupervised deep neural network. Like other artificial neural networks, it realizes a mathematical mapping through the weights of the synapses connecting its neurons. It comprises three layers, namely an encoding layer, a hidden layer and a decoding layer, and the weights of each layer are trained by adjusting the parameters so that the output is approximately equal to the input:

y≈x；

where x denotes the input data, x＝[x₍₁₎,x₍₂₎,x₍₃₎,...,x_(n)]^T∈Rⁿ, y denotes the output data, y＝[y₍₁₎,y₍₂₎,y₍₃₎,...,y_(n)]^T∈Rⁿ, and n denotes the dimension of the data;

the self-encoder mainly encodes and decodes input data, and during encoding, the encoding layer maps the input data to the output of the hidden layer, learns and extracts the characteristics of the data:

h_i＝f(W₁x+b₁)；

wherein W₁ represents the weight matrix of the encoder, b₁ represents the offset vector of the encoder, h_i＝[h₍₁₎,h₍₂₎,h₍₃₎,...,h_(m)]^T∈R^m, m represents the dimension of the hidden layer, generally m < n, and f(z) denotes the activation function;

common activation functions are the logistic (sigmoid) function and the hyperbolic tangent (tanh) function:

f(z)＝1/(1+e^(−z))；

f(z)＝(e^z−e^(−z))/(e^z+e^(−z))；

the hidden layer compresses the input data and learns its features; during decoding, the decoder maps the features output by the hidden layer back to the input space and reconstructs the input data as closely as possible:

y＝f(W₂h+b₂)；

wherein W₂ represents the weight matrix of the decoder, b₂ represents the offset vector of the decoder, and y represents the output data.
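As an illustration of the encoding and decoding mappings above, the following is a minimal NumPy sketch and not the implementation of the invention. The function names, the random initialization and the sizes n = 43 and m = 32 are assumptions made for the example; tanh is used as the activation function f(z).

```python
import numpy as np

def init_autoencoder(n, m, seed=0):
    """Randomly initialise W1 (m x n), b1 (m), W2 (n x m), b2 (n).

    The initialisation scheme is an assumption for this sketch.
    """
    rng = np.random.default_rng(seed)
    scale = 1.0 / np.sqrt(n)
    return {
        "W1": rng.normal(0.0, scale, size=(m, n)),
        "b1": np.zeros(m),
        "W2": rng.normal(0.0, scale, size=(n, m)),
        "b2": np.zeros(n),
    }

def encode(params, x):
    """h = f(W1 x + b1), with f = tanh."""
    return np.tanh(params["W1"] @ x + params["b1"])

def decode(params, h):
    """y = f(W2 h + b2), with f = tanh."""
    return np.tanh(params["W2"] @ h + params["b2"])

def reconstruct(params, x):
    """Full pass: input -> hidden code -> reconstruction."""
    return decode(params, encode(params, x))

params = init_autoencoder(n=43, m=32)   # m < n, as in the description
x = np.linspace(-1.0, 1.0, 43)          # hypothetical normalised sample
h = encode(params, x)
y = reconstruct(params, x)
print(h.shape, y.shape)
```

Because tanh maps into the interval from -1 to 1, a decoder of this form can only reproduce inputs that have been scaled into that range, which is one reason the data are normalized before training.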

The stacked self-encoder is formed by stacking a plurality of self-encoders, and the parameters of the network are initialized by unsupervised layer-by-layer pre-training. FIG. 2 shows the structure of the 4-layer stacked self-encoder. The training input data is X＝{x₁,x₂,x₃,...,x_N}, where N denotes the total number of training samples; each input sample is projected to the hidden layer h_i and then mapped to the reconstructed data y_i.

The input data x is encoded by the first-layer self-encoder to obtain the output y₁ of the first layer and the parameters of the first hidden layer h₁. When the second-layer self-encoder is trained, the output y₁ of the first layer is used as the input of the second layer and, with the parameters of the first hidden layer h₁ left unchanged, the second hidden layer h₂ is trained and the output y₂ of the second layer is obtained. The layer-by-layer training process is shown in FIG. 3. In this layer-by-layer method the stacked self-encoder is trained up to the n-th layer to obtain the output data y; each hidden layer extracts the features of its current input, so the learned representation progresses from initial low-level features to final high-level features.

As shown in FIG. 4, all decoding layers except the last are finally discarded. The main purpose of layer-by-layer training is to train the parameters of the hidden layers; finally, the whole network is fine-tuned with a loss function and a network optimizer so that y is as close to x as possible. Through layer-by-layer training, the stacked self-encoder fits the data better and extracts the data features.

The training set is used to train a network that learns the underlying representations and features of the training set. As shown in FIG. 5, in the attack detection process, each single self-encoder is first pre-trained layer by layer, the hidden layers are then stacked, and finally the network is fine-tuned to form the complete stacked self-encoder. Newly input normal data can then be reconstructed well, and a minimized error between the reconstructed data and the input data indicates that the network parameters of the stacked self-encoder have been trained successfully. When abnormal data is input, however, the network cannot reconstruct it well, because the network parameters were trained on a training set that contains no anomalies; this is the basis of the anomaly detection method. The reconstruction error is further decomposed into window errors over time steps: since an attack occurs only within a certain period of time, detection is performed on windows of time steps, and each window error is compared with a threshold value. If the error is greater than the threshold value, the window is regarded as abnormal (an attack) and a flag is returned; otherwise, it is considered normal.

The window error is obtained from the overall error matrix: a moving time window slides over the matrix and the average error within each window is calculated.
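The sliding-window decomposition can be sketched as follows. This is an illustrative NumPy example under stated assumptions: the description does not specify whether squared or absolute errors are used, nor the window length, so the squared error, the window of 2 time steps, the threshold of 0.4 and the function names are all hypothetical.

```python
import numpy as np

def window_errors(X, Y, window):
    """Average reconstruction error over a sliding window of time steps.

    X, Y: arrays of shape (T, n) holding the inputs and their reconstructions.
    Returns an array of length T - window + 1.
    """
    error_matrix = (X - Y) ** 2               # assumed squared error, shape (T, n)
    per_step = error_matrix.mean(axis=1)      # one error value per time step
    kernel = np.ones(window) / window         # uniform moving average
    return np.convolve(per_step, kernel, mode="valid")

def flag_attacks(win_err, threshold):
    """True where the window error exceeds the threshold (attack flagged)."""
    return win_err > threshold

# Toy data: 6 time steps, 2 variables, poorly reconstructed at steps 3 and 4.
X = np.zeros((6, 2))
Y = X.copy()
Y[3:5] = 1.0
win_err = window_errors(X, Y, window=2)
flags = flag_attacks(win_err, threshold=0.4)
print(win_err, flags)
```

Averaging over a window smooths isolated single-step spikes, so a flag is raised mainly when the error stays high for several consecutive time steps, which matches the observation that an attack persists over a period of time.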

The 99th percentile of the reconstruction errors calculated on the 20% of the training set held out for verification was chosen as the threshold.
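A percentile threshold of this kind can be computed in one line. The sketch below uses `numpy.percentile` with its default linear interpolation; the function name `fit_threshold` and the toy error values are assumptions for illustration only.

```python
import numpy as np

def fit_threshold(held_out_errors, q=99.0):
    """Return the q-th percentile of the errors measured on attack-free data."""
    return float(np.percentile(held_out_errors, q))

# Toy errors 1, 2, ..., 100 standing in for held-out window errors.
errors = np.arange(1, 101, dtype=float)
threshold = fit_threshold(errors)
n_above = int((errors > threshold).sum())
print(threshold, n_above)
```

Since the held-out data contain no attacks, roughly 1% of normal windows exceed the threshold by construction, which bounds the false alarm rate expected on data that resemble the training set.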

Ideally, the detection algorithm can detect all attacks without false positives, but in practice this cannot be done, so the classification index is used to evaluate the detection result of the algorithm.

The ability to identify attacks is usually evaluated by the true positive rate (TPR), also called recall. Intuitively, it is the ability of the classifier to detect all attacks, that is, the ratio of the number of correctly detected attack time steps to the total number of time steps during which the system is under attack. It is defined as:

TPR＝Recall＝TP/(TP+FN)；

wherein TP represents the number of true positives, that is, actual attack behavior judged to be an attack, and FN represents the number of false negatives, that is, actual attack behavior judged to be normal behavior.

The ability to avoid false alarms is usually assessed by the true negative rate (TNR), that is, the proportion of normal time steps that are correctly classified over the whole period of normal system operation. It is defined as:

TNR＝TN/(TN+FP)；

wherein FP represents the number of false positives, that is, actually normal behavior judged to be an attack, and TN represents the number of true negatives, that is, actually normal behavior judged to be normal behavior.

Corresponding to the recall rate is the precision, that is, the ratio of the number of correctly detected attack time steps to the total number of time steps flagged as attacks. It is defined as:

Precision＝TP/(TP+FP)；

Therefore, when optimizing and comparing algorithms, it is necessary to strike a balance between recall and precision. If the recall is very high, the algorithm detects most attacks, but the precision tends to be low, producing a large number of false positives, which indicates that the algorithm is too sensitive. Conversely, if the precision is very high, the algorithm raises few false alarms, but the recall tends to be very low and many attacks go undetected, which indicates that the algorithm is not sensitive enough. Balancing recall and precision is therefore very important.

This requires an index that balances recall and precision, namely the F₁ index. The F₁ index is the harmonic mean of recall and precision, defined as:

F₁＝2·Precision·Recall/(Precision+Recall)；

All the evaluation indexes range from 0 to 1; an index of 0 is the worst result and an index of 1 is the best.
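The four evaluation indexes can be computed directly from per-time-step labels. The following is a small sketch in plain Python; the function name, the label encoding (1 for attack, 0 for normal) and the toy labels are assumptions for illustration.

```python
def detection_metrics(y_true, y_pred):
    """Compute recall (TPR), TNR, precision and F1 from binary labels.

    y_true, y_pred: sequences of 0 (normal) and 1 (attack), one per time step.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    # Guard against empty denominators so the function never divides by zero.
    recall = tp / (tp + fn) if tp + fn else 0.0
    tnr = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"recall": recall, "tnr": tnr, "precision": precision, "f1": f1}

# Toy example: 4 attack steps (3 detected) and 6 normal steps (1 false alarm).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = detection_metrics(y_true, y_pred)
print(metrics)
```

In this toy case TP = 3, FN = 1, FP = 1 and TN = 5, so recall and precision are both 0.75 and their harmonic mean F₁ is also 0.75.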

Examples

The C-Town water distribution system is simulated on the basis of real-world physical infrastructure. C-Town comprises 429 pipelines, 388 nodes, 7 storage tanks, 11 pumps, 5 valves (only one of which is actually monitored and recorded) and 1 reservoir. The physical facilities are controlled by 9 programmable logic controllers, which mainly control the states (on or off) of the pumps and valves to keep the water distribution system operating normally.

In this example, three data sets were used, namely a training set, a validation set (Table 1) and a test set (Table 2); they were generated with a MATLAB toolbox and EPANET2, which simulate the hydraulic response of the water distribution system.

TABLE 1 Validation set attack signature

TABLE 2 test set attack signature

All data sets contain hourly monitoring and data acquisition system readings of 43 monitored system variables: 7 tank water levels (L_T<id>), 11 pump flows (F_PU<id>) and 11 pump states (S_PU<id>), 1 valve flow (F_V2) and 1 valve state (S_V2), and 12 node pressures (P_J280, P_J269, P_J300, P_J256, P_J289, P_J415, P_J302, P_J306, P_J307, P_J317, P_J4, P_J422).
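The 43-variable count can be checked by building the list of variable names. The sketch below assumes that tank and pump identifiers are numbered consecutively from 1 (the text gives only the pattern `<id>`); the node pressure names are taken from the list above.

```python
# Assumption: tanks are numbered 1..7 and pumps 1..11.
tank_levels = [f"L_T{i}" for i in range(1, 8)]      # 7 tank water levels
pump_flows = [f"F_PU{i}" for i in range(1, 12)]     # 11 pump flows
pump_states = [f"S_PU{i}" for i in range(1, 12)]    # 11 pump states
valve = ["F_V2", "S_V2"]                            # valve flow and state
node_pressures = [
    "P_J280", "P_J269", "P_J300", "P_J256", "P_J289", "P_J415",
    "P_J302", "P_J306", "P_J307", "P_J317", "P_J4", "P_J422",
]                                                   # 12 node pressures

columns = tank_levels + pump_flows + pump_states + valve + node_pressures
print(len(columns))
```

The total is 7 + 11 + 11 + 2 + 12 = 43, which is the input dimension n of the network.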

The total time step of the data collected in the training set is 365 days, and no attack is contained, namely the data set is collected under the condition that the water supply network system normally operates; the validation set contains 7 types of attacks, with a total attack time step of 492 hours (table 1, attacks 1-7); the test set contained 7 types of attacks with a total attack time step of 407 hours (table 2, attacks 8-14).

An attacker tries to hide or modify the data collected by the monitoring and data acquisition system so that the system issues a wrong instruction and the programmable logic controller performs a wrong operation, thereby realizing the attack. For example, in attack 9 the attacker makes the water level of tank T2 reported to programmable logic controller 3 lower than the historical value, although the tank is actually operating within the normal range. The system is deceived, opens valve V2 and fills tank T2 until the water level becomes too high and the tank overflows.

The training set is divided into two parts, namely training set 1 (80%) and verification set 1 (20%). Training set 1 is used to train and fit the network model, and verification set 1 is used to verify the network and generate an adaptive threshold. The data are normalized before the network is trained, which makes training and fitting easier; the data in the validation set and the test set are normalized as well.
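The split and normalization can be sketched as follows. The description does not state the normalization formula or how the 80%/20% split is drawn, so min-max scaling fitted on training set 1 and a chronological split are assumptions of this example, as are the function names and the random stand-in data.

```python
import numpy as np

def split_train_val(X, val_fraction=0.2):
    """Chronological split: first 80% for training, last 20% for verification."""
    cut = int(round(len(X) * (1.0 - val_fraction)))
    return X[:cut], X[cut:]

def fit_minmax(X):
    """Per-variable minimum and range, taken from the training part only."""
    lo = X.min(axis=0)
    hi = X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # avoid division by zero
    return lo, span

def apply_minmax(X, lo, span):
    """Scale with statistics fitted elsewhere (the same ones for every set)."""
    return (X - lo) / span

# Stand-in for 365 days of hourly readings of 43 variables.
rng = np.random.default_rng(0)
data = rng.random((365 * 24, 43))

train_1, val_1 = split_train_val(data)
lo, span = fit_minmax(train_1)
train_1_scaled = apply_minmax(train_1, lo, span)
val_1_scaled = apply_minmax(val_1, lo, span)
print(train_1.shape, val_1.shape)
```

Fitting the scaling statistics on training set 1 alone and reusing them for the other sets keeps information from the verification and test data out of the training step.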

The network is trained on the training set (data from normal operation of the water distribution system), then verified and tested on the validation set (Table 1) and the test set (Table 2). In the layer-by-layer training process, the data of the 43 sensor variables are input into the network for feature extraction. Training uses random mini-batches with a batch size of 32 and 24 iterations; during overall fine-tuning of the network the number of iterations is set to 50. The hyperbolic tangent function is selected as the activation function, with the aim of training better and faster and reducing the influence of the randomness of a deep network.

Training uses the Adam optimization algorithm, which differs considerably from the traditional stochastic gradient descent algorithm. Stochastic gradient descent updates the network weights with a single learning rate α that does not change during training, which can lead to training problems such as exploding gradients, vanishing gradients and local optima. The Adam algorithm computes first-order and second-order moment estimates of the gradient and adapts the step size for each parameter, thereby optimizing the network. The training loss function is the mean square error loss:

MSE＝(1/n)·Σ_{i=1}^{n}(x_(i)−y_(i))²；
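For reference, the mean square error and a single Adam update can be written in a few lines. This is a generic sketch of the standard Adam rule, not the training code of the embodiment; the coefficients beta1 = 0.9, beta2 = 0.999 and eps = 1e-8 are commonly used defaults assumed here, and the toy problem (fitting a vector to a fixed target) is hypothetical.

```python
import numpy as np

def mse(x, y):
    """Mean square error between input x and reconstruction y."""
    return float(np.mean((x - y) ** 2))

def adam_step(theta, grad, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; `state` holds the step count and both moment estimates."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad        # first moment
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2   # second moment
    m_hat = state["m"] / (1 - beta1 ** state["t"])              # bias correction
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps)

# Toy problem: move theta towards a fixed target by minimising the MSE.
target = np.array([0.5, -0.2, 0.1])
theta = np.zeros(3)
state = {"t": 0, "m": np.zeros(3), "v": np.zeros(3)}
initial_loss = mse(target, theta)

after_first_step = None
for step in range(500):
    grad = 2.0 * (theta - target) / theta.size   # gradient of the MSE
    theta = adam_step(theta, grad, state, lr=0.01)
    if step == 0:
        after_first_step = theta.copy()

final_loss = mse(target, theta)
print(initial_loss, final_loss)
```

On the first step the bias-corrected ratio of the two moments has magnitude close to 1 for every parameter, so each parameter moves by about the learning rate regardless of the gradient scale; this per-parameter scaling is what distinguishes Adam from a single fixed learning rate.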

In this embodiment, reducing the learning rate during training and stopping training early are also used. With the Adam optimizer, the initial learning rate l_r is set to 0.001. To reduce the learning rate, an action is triggered when the model performance has not improved after 1 iteration, and the learning rate is reduced by the following formula (with the coefficient set to 0.5). To stop training early, an action is triggered when the verification loss of the network has not decreased for 3 consecutive iterations. The aim is to prevent overfitting during network training and to accelerate the convergence of the network.

l_r'＝l_r·factor；

In the formula, l_r' denotes the learning rate after the iteration, l_r denotes the initial learning rate, and factor is the iteration coefficient.
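The two rules can be simulated on a list of verification losses. The sketch below is framework-independent and only illustrates the logic; the function name, the way the two counters are reset and the toy loss values are assumptions, while the initial learning rate 0.001, the factor 0.5 and the patience values 1 and 3 come from the description.

```python
def run_schedule(val_losses, lr=0.001, factor=0.5, lr_patience=1, stop_patience=3):
    """Apply learning-rate reduction and early stopping to a loss sequence.

    Returns (iteration at which training stopped, learning rate at that point).
    """
    best = float("inf")
    lr_wait = 0      # iterations without improvement since the last reduction
    stop_wait = 0    # consecutive iterations without improvement
    for iteration, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            lr_wait = 0
            stop_wait = 0
            continue
        lr_wait += 1
        stop_wait += 1
        if lr_wait >= lr_patience:
            lr = lr * factor          # l_r' = l_r * factor
            lr_wait = 0
        if stop_wait >= stop_patience:
            return iteration, lr      # early stop
    return len(val_losses), lr

# Toy verification losses: improvement stalls from iteration 5 onwards.
losses = [0.50, 0.40, 0.41, 0.39, 0.40, 0.40, 0.40, 0.30]
stopped_at, final_lr = run_schedule(losses)
print(stopped_at, final_lr)
```

In this toy run the learning rate is halved at iterations 3, 5, 6 and 7, and training stops at iteration 7 after three consecutive iterations without improvement, so the last loss value is never reached.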

The artificial neural network used for comparison is an unsupervised neural network that has been used to predict dynamic time series patterns of water resource variables; it is trained in the same way as the stacked self-encoder and tested on the validation set and the test set.

Table 3 compares the detection recall and precision, and the TPR and TNR indexes, of the stacked self-encoder, the artificial neural network and the self-encoder. The detection results of the artificial neural network and the self-encoder are inferior to those of the stacked self-encoder: on both the validation set and the test set, the recall of the stacked self-encoder is close to 60% and its precision stays at about 94%.

TABLE 3 Artificial neural network, autoencoder, Stack autoencoder detection index

FIG. 6 shows the detection results of the stacked self-encoder network. FIGS. 6, 7 and 8 show more intuitively that the stacked self-encoder algorithm is superior to the artificial neural network and the self-encoder algorithm. The artificial neural network fails to detect attack 7 (Table 1, validation set) and attack 13 (Table 2, test set), and FIG. 8 shows that the self-encoder is inferior to the stacked self-encoder in detecting attack 12 of the test set.

The stacked self-encoder uses the 4-layer structure of FIG. 2, with the numbers of hidden-layer variables set to 32, 16 and 32, respectively. Table 4 compares it with a shallow stacked self-encoder and a deep stacked self-encoder. The shallow stacked self-encoder has the same structure, also a stack of 4 self-encoder layers, except that the numbers of hidden-layer variables are set to 18, 8 and 18, respectively. The deep stacked self-encoder differs in that it uses 6 layers, with the numbers of hidden-layer variables set to 32, 16, 8, 16 and 32, respectively. Table 4 shows that the shallow stacked self-encoder achieves the highest precision on the test set but a very low recall. The detection results of the shallow stacked self-encoder are shown in FIG. 9, and those of the deep stacked self-encoder in FIG. 10.
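The three architectures can be compared by listing their weight-matrix shapes. The sketch below assumes fully connected layers with one bias per unit and an input and output width of 43 (the number of monitored variables); the helper names are hypothetical.

```python
def layer_shapes(n_inputs, hidden_sizes):
    """(inputs, outputs) of each weight matrix from the input to the output layer."""
    sizes = [n_inputs, *hidden_sizes, n_inputs]
    return [(sizes[i], sizes[i + 1]) for i in range(len(sizes) - 1)]

def count_parameters(shapes):
    """Weights plus one bias per output unit, summed over all layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in shapes)

architectures = {
    "stacked": [32, 16, 32],
    "shallow": [18, 8, 18],
    "deep": [32, 16, 8, 16, 32],
}

summary = {}
for name, hidden in architectures.items():
    shapes = layer_shapes(43, hidden)
    summary[name] = (len(shapes), count_parameters(shapes))
    print(name, shapes, summary[name])
```

Under these assumptions, three hidden layers give four weight layers and five hidden layers give six, which agrees with the "4-layer" and "6-layer" naming used in the text.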

TABLE 4 detection indexes of stack auto-encoder, shallow stack auto-encoder, deep stack auto-encoder

Finally, the recall and precision of each model are combined into a single index. Table 5 shows the resulting F₁ index, from which it can be seen that the adopted 4-layer stacked self-encoder is superior to the other models.

TABLE 5 F₁ parameter index

The deep learning-based method for detecting cyber-physical attacks on a water distribution system designed and developed by the invention relies mainly on data collected by the monitoring and data acquisition system during normal operation of the water distribution system, that is, attack-free data. Through layer-by-layer training the network converges quickly and avoids overfitting and the local optima of gradient descent, and it can identify most of the attacks simulated in the data sets. The algorithm focuses on the balance between detection recall and precision, that is, it detects as many attacks as possible while reducing false alarms.

While embodiments of the invention have been described above, it is not limited to the applications set forth in the description and the embodiments, which are fully applicable to various fields of endeavor for which the invention may be embodied with additional modifications as would be readily apparent to those skilled in the art, and the invention is therefore not limited to the details given herein and to the embodiments shown and described without departing from the generic concept as defined by the claims and their equivalents.

Claims

1. A method for detecting network physical attacks of a water distribution system based on deep learning is characterized by comprising the following steps:

2. The deep learning-based water distribution system network physical attack detection method of claim 1, wherein constructing a stacked self-encoder comprises the steps of:

step 1, constructing a self-encoder;

3. The method of claim 2, wherein the training set data is data collected when the network of the water distribution system is operating normally, the validation set data is data collected when the network of the water distribution system is under multiple types of attacks, and the test set data is data collected when the network of the water distribution system is under multiple types of attacks.

4. The deep learning-based water distribution system network physical attack detection method of claim 2, wherein the self-encoder hidden layer h_i satisfies:

h_i＝f(W₁x+b₁)；

the output data of the self-encoder satisfies:

y＝f(W₂h+b₂)；

5. The deep learning-based water distribution system cyber-physical attack detection method of claim 4, wherein the activation function satisfies:

6. The deep learning-based water distribution system cyber-physical attack detection method of claim 4, wherein the activation function satisfies:

7. The method of claim 1, wherein the threshold is the 99th percentile of reconstruction errors computed over a 20% training set.

8. The deep learning-based water distribution system network physical attack detection method of claim 7, further comprising evaluating the stack autoencoder to:

9. The deep learning-based water distribution system cyber-physical attack detection method of claim 8, wherein the recall rate satisfies:

10. The deep learning-based method for detecting cyber-physical attacks on a water distribution system of claim 9, wherein the accuracy rate satisfies: