CN113194098A - Water distribution system network physical attack detection method based on deep learning - Google Patents

Info

Publication number
CN113194098A
Authority
CN
China
Prior art keywords
encoder
data
distribution system
water distribution
self
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110480658.7A
Other languages
Chinese (zh)
Inventor
李娟�
王迪
杜海龙
刘贲
乔乔
左英泽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jilin University
Original Assignee
Jilin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jilin University filed Critical Jilin University
Priority to CN202110480658.7A priority Critical patent/CN113194098A/en
Publication of CN113194098A publication Critical patent/CN113194098A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/15 Correlation function computation including computation of convolution operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/16 Implementing security features at a particular protocol layer

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Pure & Applied Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Molecular Biology (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Algebra (AREA)
  • Geometry (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a deep learning-based method for detecting cyber-physical attacks on a water distribution system, which comprises the following steps: step one, constructing a stack self-encoder and inputting training set data to obtain reconstructed data; step two, comparing the reconstructed data with the training set data and constructing a reconstruction error matrix; step three, decomposing the reconstruction error matrix into window errors over time steps, and comparing each window error with a threshold value: if the window error is not larger than the threshold value, the water distribution system network is normal; and if the window error is larger than the threshold value, an attack on the water distribution system network is detected. The invention learns from high-dimensional data in order to detect attacks accurately.

Description

Water distribution system network physical attack detection method based on deep learning
Technical Field
The invention relates to the technical field of rescue equipment, in particular to a deep learning-based method for detecting network physical attack of a water distribution system.
Background
At present, a water distribution system integrates the data acquisition, transmission, online monitoring, real-time automatic operation and other equipment of a modern cyber-physical system. Normal operation is achieved mainly by an integrated sensor network and programmable logic controllers that control the corresponding equipment. Data acquisition and transmission are handled mainly by a monitoring and data acquisition (SCADA) system, which monitors the state, flow, pressure and other quantities of components of the water distribution system in real time. However, combining physical infrastructure with intelligent network technology exposes that infrastructure to the virtual network and makes it vulnerable to cyber attacks, under which the system may behave abnormally. An anomaly can generally be defined as data that does not conform to a specified notion of normal behavior. Cyber attacks have become an increasingly serious threat. The Federal Bureau of Investigation has classified cyber crime as a top priority in fighting crime. According to investigations by the U.S. Department of Homeland Security, water distribution systems are among the critical infrastructure most vulnerable to cyber attacks, ranking as the third most targeted sector after critical manufacturing and energy. An attacker indirectly causes anomalies in the water distribution system by attacking the monitoring and data acquisition system; for example, attackers can use different attack methods to maliciously tamper with tank water levels. It is therefore extremely important to identify these anomalies.
To mitigate these threats, recent research has discussed the importance of building a mature cyber security culture to strengthen the ability of intelligent systems to defend against cyber attacks. In addition, security measures are applied to the components of the cyber-physical system (including remote sensors and actuators, the communication network and the monitoring and data acquisition system modules) to improve their ability to respond to cyber attacks. Detection of cyber-physical attacks on water supply infrastructure has been studied in several ways. Hydraulic models have been used to study cyber attack detection in canal networks, and genetic algorithms and recursive Bayesian rules have been used to detect abnormal hydraulic or water quality events. Among data-based methods, data-driven clustering has been used as an intrusion detection technique for attacks on the monitoring and data acquisition system; outliers can be detected by analyzing correlations between the data of different sensors, but this approach does not scale well to large data sets, which usually contain a large amount of irrelevant and redundant information. Model-based methods can detect more attacks, but they require a good physical hydraulic model. Learning-based methods detect attacks by setting a threshold using principal component analysis and artificial neural networks, but the threshold is chosen from human experience, so either the recall is very low or the recall is very high with too many false alarms, and no balance is reached.
As the frequency of cyber physical attacks on water distribution systems continues to increase, it is desirable to develop an attack detection method to protect the critical infrastructure of water supply systems.
Disclosure of Invention
The invention aims to design and develop a deep learning-based method for detecting cyber-physical attacks on a water distribution system. A stack self-encoder reproduces the data collected while the facilities of the water supply system operate under normal conditions; attack behavior in the water distribution system network is judged by analyzing and processing the reconstruction errors, and attacks are identified accurately by learning the features of high-dimensional data.
The technical scheme provided by the invention is as follows:
a method for detecting network physical attacks of a water distribution system based on deep learning comprises the following steps:
step one, constructing a stack self-encoder and inputting data to obtain reconstructed data;
step two, comparing the reconstruction data with the training set data and constructing a reconstruction error matrix;
step three, decomposing the reconstruction error matrix into window errors over time steps, and comparing each window error with a threshold value:
if the window error is not larger than the threshold value, the water distribution system network is normal;
and if the window error is larger than the threshold value, an attack on the water distribution system network is detected.
Preferably, the constructing of the stack self-encoder includes the steps of:
step 1, constructing a self-encoder;
step 2, stacking a plurality of self-encoders to construct a stack self-encoder;
step 3, normalizing the training set data, the validation set data and the test set data, training the stack self-encoder with the training set data, validating the stack self-encoder with the validation set data and generating a threshold value, and testing the stack self-encoder with the test set data.
Preferably, the training set data is data collected when the water distribution system network is operating normally, the validation set data is data collected when the water distribution system network is under various types of attacks, and the test set data is data collected when the water distribution system network is under various types of attacks.
Preferably, the hidden layer $h_i$ of the self-encoder satisfies:
$h_i = f(W_1 x + b_1)$;
wherein $x$ represents the input data, $x = [x^{(1)}, x^{(2)}, x^{(3)}, \ldots, x^{(n)}]^T \in \mathbb{R}^n$, $n$ denotes the dimension of the data, $W_1$ represents the weight matrix of the encoder, $b_1$ represents the offset vector of the encoder, $h_i = [h^{(1)}, h^{(2)}, h^{(3)}, \ldots, h^{(m)}]^T \in \mathbb{R}^m$, $m$ denotes the dimension of the hidden layer with $m < n$, and $f(z)$ denotes the activation function;
the output data of the self-encoder satisfies:
$y = f(W_2 h + b_2)$;
wherein $y$ represents the output data, $y = [y^{(1)}, y^{(2)}, y^{(3)}, \ldots, y^{(n)}]^T \in \mathbb{R}^n$, $W_2$ represents the weight matrix of the decoder, and $b_2$ represents the offset vector of the decoder.
Preferably, the activation function satisfies:
$f(z) = \dfrac{1}{1 + e^{-z}}$
Preferably, the activation function satisfies:
$f(z) = \dfrac{e^{z} - e^{-z}}{e^{z} + e^{-z}}$
Preferably, the threshold is the 99th percentile of the reconstruction errors calculated on the 20% of the training set held out for validation.
Preferably, the method further comprises evaluating the stack self-encoder:
$F_1 = \dfrac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$
in the formula, $F_1$ represents the balance index of recall and precision, Recall represents the recall rate, and Precision represents the precision rate;
the balance index of recall and precision ranges from 0 to 1: a value of 0 indicates that the output of the stack self-encoder is worst, and a value of 1 indicates that it is optimal.
Preferably, the recall rate satisfies:
$\mathrm{Recall} = \dfrac{TP}{TP + FN}$
in the formula, TP represents the number of time steps that are actual attacks and are judged to be attacks, and FN represents the number of time steps that are actual attacks but are judged to be normal behavior.
Preferably, the precision rate satisfies:
$\mathrm{Precision} = \dfrac{TP}{TP + FP}$
in the formula, FP represents the number of time steps that are actually normal but are judged to be attacks.
The invention has the following beneficial effects:
(1) The deep learning-based method for detecting cyber-physical attacks on a water distribution system designed and developed by the invention relies mainly on data collected by the monitoring and data acquisition system during normal operation of the water supply system, i.e. attack-free data. Through layer-by-layer training the network converges quickly and avoids overfitting and the local optima of gradient descent, so it fits the data better and extracts data features.
(2) Through its evaluation indexes, the method focuses on the balance between detection recall and precision, i.e. it detects as many attacks as possible while reducing false alarms, and the algorithm is optimized accordingly.
(3) The method can identify most of the attacks simulated in the data set and is superior to traditional algorithms.
Drawings
FIG. 1 is a schematic structural diagram of the self-encoder according to the present invention.
FIG. 2 is a schematic structural diagram of the stacked self-encoder according to the present invention.
FIG. 3 is a diagram illustrating the layered training process of the stacked self-encoder according to the present invention.
FIG. 4 is a diagram illustrating the overall fine-tuning structure of the stacked self-encoder according to the present invention.
FIG. 5 is a schematic diagram of attack detection with the stacked self-encoder according to the present invention.
FIG. 6 is a diagram illustrating the detection results of the stacked self-encoder according to the present invention.
FIG. 7 is a diagram illustrating the detection results of the artificial neural network according to the present invention.
FIG. 8 is a diagram illustrating the detection results of the self-encoder according to the present invention.
FIG. 9 is a diagram illustrating the detection results of the shallow stacked self-encoder according to the present invention.
FIG. 10 is a diagram illustrating the detection results of the deep stacked self-encoder according to the present invention.
Detailed Description
The present invention is described in further detail below so that those skilled in the art can practice it with reference to the description.
The invention provides a deep learning-based method for detecting network physical attack of a water distribution system, which comprises the following steps:
As shown in fig. 1, the self-encoder (autoencoder) is an unsupervised deep neural network. Like other artificial neural networks, it realizes a mathematical mapping through weighted synaptic connections between neurons. It comprises three layers, namely an encoding layer, a hidden layer and a decoding layer, and the weights of each layer are trained by adjusting the parameters so that the output is approximately equal to the input:
$y \approx x$;
where $x$ denotes the input data, $x = [x^{(1)}, x^{(2)}, x^{(3)}, \ldots, x^{(n)}]^T \in \mathbb{R}^n$, $y$ denotes the output data, $y = [y^{(1)}, y^{(2)}, y^{(3)}, \ldots, y^{(n)}]^T \in \mathbb{R}^n$, and $n$ denotes the dimension of the data;
The self-encoder mainly encodes and decodes the input data. During encoding, the encoding layer maps the input data to the output of the hidden layer, learning and extracting the features of the data:
$h_i = f(W_1 x + b_1)$;
wherein $W_1$ represents the weight matrix of the encoder, $b_1$ represents the offset vector of the encoder, $h_i = [h^{(1)}, h^{(2)}, h^{(3)}, \ldots, h^{(m)}]^T \in \mathbb{R}^m$, $m$ denotes the dimension of the hidden layer (generally $m < n$), and $f(z)$ denotes the activation function.
Common activation functions are the logistic (sigmoid) function and the hyperbolic tangent (tanh) function:
$f(z) = \dfrac{1}{1 + e^{-z}}$
$f(z) = \dfrac{e^{z} - e^{-z}}{e^{z} + e^{-z}}$
The hidden layer compresses the input data and learns its features. During decoding, the decoder maps the features output by the hidden layer back to the input space and reconstructs the input data as closely as possible:
$y = f(W_2 h + b_2)$;
wherein $W_2$ represents the weight matrix of the decoder, $b_2$ represents the offset vector of the decoder, and $y$ represents the output data.
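The encoding and decoding equations can be illustrated with a minimal NumPy sketch. It is only an illustration under stated assumptions: the weights are random rather than trained, the function name `autoencoder_forward` is hypothetical, and the sizes n = 43 and m = 32 are borrowed from the example later in the description.

```python
import numpy as np

def autoencoder_forward(x, W1, b1, W2, b2, f=np.tanh):
    """Encode x (length n) into h (length m), then decode h back to y (length n)."""
    h = f(W1 @ x + b1)   # hidden layer: h = f(W1 x + b1)
    y = f(W2 @ h + b2)   # reconstruction: y = f(W2 h + b2)
    return h, y

rng = np.random.default_rng(0)
n, m = 43, 32                                   # m < n, so the hidden layer compresses
W1 = rng.normal(scale=0.1, size=(m, n))         # encoder weights (untrained, illustrative)
b1 = np.zeros(m)
W2 = rng.normal(scale=0.1, size=(n, m))         # decoder weights (untrained, illustrative)
b2 = np.zeros(n)

x = rng.uniform(0.0, 1.0, size=n)               # one normalized input sample
h, y = autoencoder_forward(x, W1, b1, W2, b2)
print(h.shape, y.shape)                         # (32,) (43,)
```

With untrained weights y is not yet close to x; training adjusts W1, b1, W2 and b2 so that the reconstruction error shrinks.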
The stacked self-encoder is formed by stacking several self-encoders, and the parameters of the network are initialized by unsupervised layer-by-layer pre-training. Fig. 2 shows the structure of the 4-layer stacked self-encoder. The training input data is $X = \{x_1, x_2, x_3, \ldots, x_N\}$, where $N$ denotes the total number of training samples; each input sample is projected to the hidden layer $h_i$ and then mapped to the reconstructed data $y_i$.
The input data $x$ is encoded by the first-layer self-encoder to obtain the output $y_1$ of the first layer and the parameters of the first hidden layer $h_1$. When the second-layer self-encoder is trained, the output $y_1$ of the first layer is used as the input of the second layer; the second hidden layer $h_2$ is trained without changing the first hidden layer $h_1$, and the output $y_2$ of the second layer is obtained. The layered training process is shown in fig. 3. In this layer-by-layer method, the stacked self-encoder is trained up to the $n$th layer to obtain the output data $y$; each hidden layer extracts the features of its current input, so the learned representation progresses from initial low-level features to final high-level features.
As shown in fig. 4, all decoding layers except the last one are eventually discarded. The main purpose of layer-by-layer training is to train the parameters of the hidden layers; finally, the whole network is fine-tuned with a loss function and a network optimizer so that $y$ is as close to $x$ as possible. Through layer-by-layer training, the stacked self-encoder can better fit the data and extract data features.
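The layer-by-layer pre-training described above might be sketched as follows. This is a simplified illustration, not the patented implementation: it assumes that each self-encoder is trained on the hidden-layer output of the previous one, uses plain full-batch gradient descent instead of Adam, trains on random stand-in data, and omits the final fine-tuning pass. The function names `train_autoencoder` and `pretrain_stack` are hypothetical.

```python
import numpy as np

def train_autoencoder(X, m, epochs=50, lr=0.5, seed=0):
    """Train one tanh self-encoder on X (N, n) by full-batch gradient descent on the MSE."""
    rng = np.random.default_rng(seed)
    N, n = X.shape
    W1 = rng.normal(scale=0.1, size=(n, m)); b1 = np.zeros(m)
    W2 = rng.normal(scale=0.1, size=(m, n)); b2 = np.zeros(n)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                       # hidden layer, shape (N, m)
        Y = np.tanh(H @ W2 + b2)                       # reconstruction, shape (N, n)
        dY = 2.0 * (Y - X) / X.size * (1.0 - Y ** 2)   # MSE gradient at the output pre-activation
        dH = (dY @ W2.T) * (1.0 - H ** 2)              # back-propagated to the hidden pre-activation
        W2 -= lr * (H.T @ dY); b2 -= lr * dY.sum(axis=0)
        W1 -= lr * (X.T @ dH); b1 -= lr * dH.sum(axis=0)
    return (W1, b1), (W2, b2)

def pretrain_stack(X, hidden_sizes, epochs=50, lr=0.5):
    """Greedy pre-training: earlier layers stay frozen while the next layer is trained."""
    encoders, inputs, decoder = [], X, None
    for k, m in enumerate(hidden_sizes):
        enc, decoder = train_autoencoder(inputs, m, epochs, lr, seed=k)
        encoders.append(enc)
        inputs = np.tanh(inputs @ enc[0] + enc[1])     # features passed on to the next layer
    return encoders, decoder, inputs                   # only the last decoder is kept

rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(200, 43))              # stand-in for normalized sensor readings
encoders, last_decoder, features = pretrain_stack(X, hidden_sizes=(32, 16, 32))
print([W.shape for W, _ in encoders])                  # [(43, 32), (32, 16), (16, 32)]
```

In a full pipeline the pre-trained weights would initialize the complete network, which is then fine-tuned end to end so that the output reconstructs the 43 input variables.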
The training set is used to train a network that learns the underlying representations and features of the training data. As shown in fig. 5, in the attack detection process the individual self-encoders are first pre-trained layer by layer, their hidden layers are then stacked, and finally the network is fine-tuned to form the complete stacked self-encoder. When newly input normal data can be reconstructed well, i.e. the error between the reconstructed data and the input data is minimized, the network parameters of the stacked self-encoder have been trained successfully. When abnormal data is input, however, the network cannot reconstruct it well, because its parameters were trained on a training set without anomalies; this is the basis of the anomaly detection method. The reconstruction error is further decomposed into window errors over time steps: because an attack only occurs within a certain period of time, detection is performed on windows of time steps, and each window error is compared with a threshold. If the error is greater than the threshold, the window is regarded as abnormal and a flag is returned; otherwise it is considered normal.
The window error is obtained from the overall error matrix: a moving time window slides over the matrix, and the average error within each window is calculated.
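A sliding-window average of the reconstruction error could be computed as in the sketch below. The window length and the use of squared error per time step are assumptions made for illustration, since the text does not fix either; `window_errors` is a hypothetical name.

```python
import numpy as np

def window_errors(X, Y, window):
    """Average reconstruction error over a moving window of time steps.

    X, Y: arrays of shape (T, n) with inputs and reconstructions.
    Returns an array of length T - window + 1.
    """
    per_step = np.mean((X - Y) ** 2, axis=1)            # one error value per time step
    kernel = np.ones(window) / window                   # uniform weights give a moving average
    return np.convolve(per_step, kernel, mode="valid")

# Toy check: six time steps, a reconstruction error only at step 3.
X = np.zeros((6, 2))
Y = np.zeros((6, 2))
Y[3] = 2.0
w = window_errors(X, Y, window=3)
print(w.shape)                                          # (4,)
```

Every window that contains the poorly reconstructed step receives a raised average, so a short attack shows up in several consecutive windows.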
The 99th percentile of the reconstruction error calculated over the 20% of the training set held out for validation is chosen as the threshold.
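The percentile threshold and the comparison against it can be sketched in a few lines. The helper names `fit_threshold` and `detect` are hypothetical, and the error values are synthetic stand-ins for the window errors of the held-out attack-free data.

```python
import numpy as np

def fit_threshold(validation_errors, q=99.0):
    """Threshold = q-th percentile of the errors on held-out attack-free data."""
    return float(np.percentile(validation_errors, q))

def detect(errors, threshold):
    """True where the window error exceeds the threshold (flagged as attack)."""
    return np.asarray(errors) > threshold

validation_errors = np.arange(1, 101, dtype=float)      # synthetic errors 1, 2, ..., 100
threshold = fit_threshold(validation_errors)
flags = detect([50.0, 100.0], threshold)
print(flags)                                            # [False  True]
```

By construction about 1% of the attack-free validation windows lie above this threshold, which bounds the expected false-alarm rate on similar normal data.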
Ideally, the detection algorithm can detect all attacks without false positives, but in practice this cannot be done, so the classification index is used to evaluate the detection result of the algorithm.
The ability to identify attacks is usually evaluated by the true positive rate (TPR), also called recall, which intuitively is the ability of the classifier to detect all attacks, i.e. the ratio of the number of correctly detected attack time steps to the total number of time steps during which the system is under attack. It is defined as:
$TPR = \mathrm{Recall} = \dfrac{TP}{TP + FN}$
wherein TP is the number of true positives (actual attack behavior judged to be an attack) and FN is the number of false negatives (actual attack behavior judged to be normal behavior).
The ability to avoid false alarms is usually assessed by the true negative rate (TNR), i.e. the proportion of normal time steps that are correctly classified out of all time steps of normal system operation. It is defined as:
$TNR = \dfrac{TN}{TN + FP}$
wherein FP is the number of false positives (actually normal behavior judged to be an attack) and TN is the number of true negatives (actually normal behavior judged to be normal behavior).
Corresponding to recall is precision, i.e. the ratio of the number of correctly detected attack time steps to the total number of time steps flagged as attacks, defined as:
$\mathrm{Precision} = \dfrac{TP}{TP + FP}$
Therefore, when optimizing and comparing algorithms, a balance must be struck between recall and precision. If the recall is too high, the algorithm detects most attacks, but the accompanying precision is too low and produces a large number of false positives, which indicates that the algorithm is too sensitive. Conversely, if the precision is too high, the algorithm avoids false alarms well, but the recall is very low and many attacks go undetected, which indicates that the algorithm is not sensitive enough. Balancing recall and precision is therefore very important.
This requires a further index that balances recall and precision, namely the $F_1$ index, which is the harmonic mean of recall and precision and is defined as:
$F_1 = \dfrac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$
All the evaluation indexes range from 0 to 1: a value of 0 is the worst result and a value of 1 is the best.
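As a worked illustration of these indexes, the sketch below computes recall, TNR, precision and F1 from per-time-step labels. The function name `detection_metrics` and the toy labels are hypothetical and are not results from the patent.

```python
import numpy as np

def detection_metrics(y_true, y_pred):
    """Recall (TPR), TNR, precision and F1 from boolean attack labels per time step."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = int(np.sum(y_true & y_pred))      # attack, flagged as attack
    fn = int(np.sum(y_true & ~y_pred))     # attack, flagged as normal
    fp = int(np.sum(~y_true & y_pred))     # normal, flagged as attack
    tn = int(np.sum(~y_true & ~y_pred))    # normal, flagged as normal
    recall = tp / (tp + fn) if tp + fn else 0.0
    tnr = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"recall": recall, "tnr": tnr, "precision": precision, "f1": f1}

# Toy labels: 4 attack steps and 6 normal steps; 2 attacks found, 1 false alarm.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
metrics = detection_metrics(y_true, y_pred)
print(metrics)
```

Here TP = 2, FN = 2, FP = 1 and TN = 5, so recall is 1/2, precision is 2/3, TNR is 5/6 and F1 is 4/7.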
Examples
The C-Town water distribution system is simulated based on real-world physical infrastructure. C-Town comprises 429 pipes, 388 nodes, 7 storage tanks, 11 pumps, 5 valves (of which only one is actually monitored and recorded) and 1 reservoir. This infrastructure is controlled by 9 programmable logic controllers, which mainly control the states (on or off) of the pumps and valves to keep the water distribution system operating normally.
In this example, three data sets were used, namely a training set, a validation set (table 1) and a test set (table 2), which were generated with a MATLAB toolbox and EPANET2 to simulate the hydraulic response of the water distribution system.
TABLE 1 Validation set attack characteristics
[Table 1 is an image in the original publication and is not reproduced here.]
TABLE 2 Test set attack characteristics
[Table 2 is an image in the original publication and is not reproduced here.]
All data sets contain hourly monitoring and data acquisition system readings of 43 monitored system variables: 7 tank water levels (L_T<id>), 11 pump flows (F_PU<id>) and 11 pump states (S_PU<id>), 1 valve flow (F_V2) and state (S_V2), and 12 node pressures (P_J280, P_J269, P_J300, P_J256, P_J289, P_J415, P_J302, P_J306, P_J307, P_J317, P_J4, P_J422).
The training set covers a total of 365 days and contains no attacks, i.e. it was collected while the water supply network was operating normally. The validation set contains 7 types of attacks with a total attack duration of 492 hours (table 1, attacks 1-7); the test set contains 7 types of attacks with a total attack duration of 407 hours (table 2, attacks 8-14).
An attacker tries to hide or modify the data collected by the monitoring and data acquisition system so that the system issues wrong instructions and the programmable logic controllers perform wrong operations, thereby carrying out the attack. For example, in attack 9 the attacker makes the water level of tank T2 reported to programmable logic controller 3 appear lower than its historical values, although the actual level is within the normal range. The system is deceived: it opens valve V2 and fills tank T2 until the tank overflows.
The training set is divided into two parts: training set 1 (80%) and validation set 1 (20%). Training set 1 is used to train and fit the network model, and validation set 1 is used to validate the network and generate an adaptive threshold. Before the network is trained, the data are normalized to make training and fitting easier, and the data in the validation set and the test set are normalized in the same way.
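The split and normalization step might look like the sketch below. Two points are assumptions rather than statements from the text: the split is taken chronologically (first 80% of the rows), and normalization is min-max scaling with statistics from the training part only. The name `split_and_scale` is hypothetical and the readings are random stand-ins.

```python
import numpy as np

def split_and_scale(data, train_frac=0.8):
    """Split attack-free data into training set 1 and validation set 1, then min-max scale."""
    cut = int(len(data) * train_frac)
    train, val = data[:cut], data[cut:]
    lo, hi = train.min(axis=0), train.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)      # guard against constant columns

    def scale(a):
        return (a - lo) / span                  # reuse the training statistics everywhere

    return scale(train), scale(val), scale

rng = np.random.default_rng(0)
readings = rng.normal(size=(365 * 24, 43))      # stand-in for one year of hourly readings
train_1, val_1, scale = split_and_scale(readings)
print(train_1.shape, val_1.shape)               # (7008, 43) (1752, 43)
```

The returned `scale` function would also be applied to the attack-containing validation and test sets so that all data pass through the same transformation.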
The network is trained on the training set (i.e. data from normal operation of the water distribution system) and then validated and tested on the validation set (table 1) and the test set (table 2). During layered training, the variable data of the 43 sensors are input into the network for feature extraction. Training uses random mini-batches with a batch size of 32 and 24 iterations; during overall fine-tuning of the network, the number of iterations is set to 50. The hyperbolic tangent function is selected as the activation function. The aim is to train better and faster and to reduce the influence of the randomness of the deep network.
During training, the Adam optimization algorithm is used. Adam differs considerably from traditional stochastic gradient descent, which updates the network weights with a single learning rate α that does not change during training and can therefore lead to training problems such as exploding gradients, vanishing gradients and local optima. Adam computes first-order and second-order moment estimates of the gradient and uses them to set independent adaptive learning rates for different parameters, which optimizes the network. The training loss is the mean square error loss function:
$L_{MSE} = \dfrac{1}{n} \sum_{i=1}^{n} \left( x^{(i)} - y^{(i)} \right)^2$
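For reference, a single Adam update can be written out as below. This sketch follows the standard published form of Adam; the values beta1 = 0.9, beta2 = 0.999 and eps = 1e-8 are common defaults and are assumptions here, while the learning rate 0.001 matches the embodiment. The function name `adam_step` and the toy loss are hypothetical.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update of the weights w at step t (t starts at 1)."""
    m = beta1 * m + (1.0 - beta1) * grad          # first-order moment estimate
    v = beta2 * v + (1.0 - beta2) * grad ** 2     # second-order moment estimate
    m_hat = m / (1.0 - beta1 ** t)                # bias correction for the zero initialization
    v_hat = v / (1.0 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)   # per-parameter adaptive step
    return w, m, v

# Toy loss sum(w**2), whose gradient is 2 * w.
w = np.array([1.0, -2.0])
m = np.zeros_like(w)
v = np.zeros_like(w)
w, m, v = adam_step(w, 2.0 * w, m, v, t=1)
print(w)
```

On the first step the update has magnitude close to the learning rate for every parameter, regardless of the gradient scale, which is the adaptive behavior described above.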
In this embodiment, early stopping and learning-rate reduction during training are also used. With the Adam optimizer, the initial learning rate $l_r$ is set to 0.001. If the model performance has not improved after 1 iteration, a learning-rate reduction is triggered and the learning rate is reduced by the formula below (with the coefficient set to 0.5). If the validation loss of the network has not decreased for 3 consecutive iterations, early stopping is triggered. The aim is to prevent overfitting during network training and to accelerate the convergence of the network.
$l_r' = l_r \cdot factor$;
In the formula, $l_r'$ denotes the learning rate after the iteration, $l_r$ denotes the initial learning rate, and $factor$ is the reduction coefficient.
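The interaction of learning-rate reduction and early stopping can be traced with a small sketch that runs over a recorded sequence of validation losses. It is framework independent and does not reproduce any particular library's callback API; the name `lr_schedule` and the loss values are hypothetical.

```python
def lr_schedule(val_losses, lr=0.001, factor=0.5, lr_patience=1, stop_patience=3):
    """Return the learning rate used after each iteration, stopping early if needed."""
    best = float("inf")
    lr_wait = stop_wait = 0
    history = []
    for loss in val_losses:
        if loss < best:                       # improvement: reset both counters
            best, lr_wait, stop_wait = loss, 0, 0
        else:
            lr_wait += 1
            stop_wait += 1
            if lr_wait >= lr_patience:        # reduce: lr' = lr * factor
                lr *= factor
                lr_wait = 0
        history.append(lr)
        if stop_wait >= stop_patience:        # no improvement for 3 iterations in a row
            break
    return history

losses = [0.9, 0.5, 0.6, 0.4, 0.45, 0.41, 0.42, 0.3]
history = lr_schedule(losses)
print(len(history), history[-1])
```

In this trace the learning rate is halved four times and training stops after the seventh iteration, so the final loss value of 0.3 is never reached.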
The artificial neural network used for comparison is an unsupervised neural network that is trained in the same way as the stacked self-encoder and tested on the validation set and the test set; such networks have previously been used to predict dynamic time series patterns of water resource variables.
Table 3 compares the recall and accuracy of the stacked self-encoder, the artificial neural network and the self-encoder, together with their TPR and TNR indexes. The detection results of the artificial neural network and the self-encoder are inferior to those of the stacked self-encoder: on both the validation set and the test set, the recall of the stacked self-encoder is close to 60% while its accuracy remains at about 94%.
TABLE 3 Detection indexes of the artificial neural network, the self-encoder and the stacked self-encoder
[Table 3 is an image in the original publication and is not reproduced here.]
Fig. 6 shows the detection results of the stacked self-encoder network. Figs. 6, 7 and 8 show more intuitively that the stacked self-encoder algorithm is superior to the artificial neural network and the self-encoder: the artificial neural network does not detect attack 7 (table 1, validation set) or attack 13 (table 2, test set), and fig. 8 shows that the self-encoder is inferior to the stacked self-encoder in detecting attack 12 of the test set.
The stacked self-encoder uses the 4-layer structure of fig. 2, with the numbers of hidden-layer variables set to 32, 16 and 32 respectively. Table 4 compares it with a shallow stacked self-encoder and a deep stacked self-encoder. The shallow stacked self-encoder has the same structure as the stacked self-encoder, i.e. it is also a stack of 4 layers, except that the numbers of hidden-layer variables are set to 18, 8 and 18 respectively. The deep stacked self-encoder differs in that it uses 6 layers, with the numbers of hidden-layer variables set to 32, 16, 8, 16 and 32 respectively. Table 4 shows that on the test set the shallow stacked self-encoder achieves the highest accuracy but a very low recall. The detection results of the shallow stacked self-encoder are shown in fig. 9, and those of the deep stacked self-encoder in fig. 10.
TABLE 4 Detection indices of the stacked autoencoder, shallow stacked autoencoder and deep stacked autoencoder
Figure BDA0003048446680000121
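The three configurations compared in Table 4 can be sketched as plain lists of hidden-layer widths. The following is a minimal illustration, not the patent's implementation: the input dimension of 40, the sigmoid activation, the random initialization and the helper names (`init_stack`, `reconstruct`, `n_parameters`) are all assumptions introduced here.

```python
import numpy as np

# Hidden-layer widths as stated in the description.
CONFIGS = {
    "stacked": [32, 16, 32],
    "shallow": [18, 8, 18],
    "deep": [32, 16, 8, 16, 32],
}

def sigmoid(z):
    # Assumed activation; the source gives its activation functions only as images.
    return 1.0 / (1.0 + np.exp(-z))

def init_stack(n_inputs, hidden, seed=0):
    """Create (W, b) pairs for n_inputs -> hidden widths -> n_inputs."""
    rng = np.random.default_rng(seed)
    sizes = [n_inputs] + list(hidden) + [n_inputs]
    return [
        (rng.normal(scale=0.1, size=(n_out, n_in)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]

def reconstruct(x, layers):
    """Pass one sample through every layer: a = f(W a + b)."""
    a = x
    for W, b in layers:
        a = sigmoid(W @ a + b)
    return a

def n_parameters(layers):
    """Total number of weights and offsets in the stack."""
    return sum(W.size + b.size for W, b in layers)

N_INPUTS = 40  # hypothetical number of sensor readings per time step
stacks = {name: init_stack(N_INPUTS, widths) for name, widths in CONFIGS.items()}
```

Under this reading, three hidden widths give four weight layers and five hidden widths give six, which is one possible interpretation of the "4-layer" and "6-layer" counts in the text.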
Finally, the recall and precision of each model are combined into a single weighted index. Table 5 shows the resulting F1 index, from which it can be seen that the adopted 4-layer stacked autoencoder outperforms the other models.
TABLE 5 F1 parameter index
Figure BDA0003048446680000122
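The F1 index combines recall and precision into one number. The sketch below uses the standard definitions (recall = TP / (TP + FN), precision = TP / (TP + FP), F1 as their harmonic mean). This is an assumption, because the source gives its formulas only as images. The function names and the example labels are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FN, FP, TN for binary labels (1 = attack, 0 = normal)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fn, fp, tn

def recall(tp, fn):
    """Share of actual attacks that were flagged."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def precision(tp, fp):
    """Share of flagged time steps that were actual attacks."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def f1_index(p, r):
    """Harmonic mean of precision and recall, between 0 and 1."""
    return 2.0 * p * r / (p + r) if (p + r) else 0.0

# Made-up labels for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]
tp, fn, fp, tn = confusion_counts(y_true, y_pred)
score = f1_index(precision(tp, fp), recall(tp, fn))
```

A detector that raises few alarms can reach high precision with low recall, which is the pattern reported for the shallow stacked autoencoder; the harmonic mean penalizes that imbalance.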
The invention designs and develops a deep-learning-based method for detecting cyber-physical attacks on a water distribution system. The method relies mainly on data collected by the supervisory control and data acquisition (SCADA) system during normal operation of the water distribution system, that is, attack-free data. Through layer-by-layer training the network converges quickly and avoids overfitting and the local optima of gradient descent, and it can identify most of the attacks simulated in the data set. The algorithm focuses on the balance between detection recall and precision, that is, on detecting as many attacks as possible while reducing false alarms.
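Layer-by-layer training can be read as greedy layer-wise pretraining: each autoencoder is trained to reconstruct its own input, and its hidden code becomes the input of the next one. The sketch below shows that reading under several assumptions not stated in the source: sigmoid activations, mean squared error, full-batch gradient descent, and the names `train_autoencoder` and `pretrain_stack`. Samples are stored as rows, so the products are written `X @ W` rather than `W x`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, m, epochs=300, lr=0.5, seed=0):
    """Train one autoencoder (n -> m -> n) on attack-free rows of X."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    W1 = rng.normal(scale=0.1, size=(n, m))
    b1 = np.zeros(m)
    W2 = rng.normal(scale=0.1, size=(m, n))
    b2 = np.zeros(n)
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)            # hidden code
        Y = sigmoid(H @ W2 + b2)            # reconstruction
        losses.append(float(np.mean((Y - X) ** 2)))
        # Backpropagate the mean squared error through both sigmoid layers.
        dZ2 = 2.0 * (Y - X) / X.size * Y * (1.0 - Y)
        dZ1 = (dZ2 @ W2.T) * H * (1.0 - H)
        W2 -= lr * (H.T @ dZ2)
        b2 -= lr * dZ2.sum(axis=0)
        W1 -= lr * (X.T @ dZ1)
        b1 -= lr * dZ1.sum(axis=0)
    return (W1, b1), losses

def pretrain_stack(X, hidden_sizes):
    """Train encoders one at a time; each sees the previous layer's code."""
    encoders, history, inputs = [], [], X
    for m in hidden_sizes:
        (W1, b1), losses = train_autoencoder(inputs, m)
        encoders.append((W1, b1))
        history.append(losses)
        inputs = sigmoid(inputs @ W1 + b1)
    return encoders, history

# Synthetic "normal operation" data, already scaled to [0, 1].
rng = np.random.default_rng(1)
X = np.clip(0.8 + 0.05 * rng.standard_normal((64, 6)), 0.0, 1.0)
encoders, history = pretrain_stack(X, [4, 2])
```

Because each layer solves only a small reconstruction problem, its loss falls steadily from the first epoch; whether the patent also fine-tunes the whole stack afterwards is not stated in this section.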
While embodiments of the invention have been described above, the invention is not limited to the applications set forth in the description and the embodiments. It is fully applicable to the various fields to which it is suited, and further modifications can readily be made by those skilled in the art. The invention is therefore not limited to the specific details and illustrations shown and described herein, provided that there is no departure from the general concept defined by the claims and their equivalents.

Claims (10)

1. A deep-learning-based method for detecting cyber-physical attacks on a water distribution system, characterized by comprising the following steps:
step one, constructing a stacked autoencoder and inputting data to obtain reconstructed data;
step two, comparing the reconstructed data with the training set data and constructing a reconstruction error matrix;
step three, decomposing the reconstruction error matrix into a window error for each time step, and comparing the window error with a threshold value:
if the window error is not larger than the threshold value, the water distribution system network is normal;
if the window error is larger than the threshold value, an attack on the water distribution system network is detected.
2. The deep learning-based water distribution system network physical attack detection method of claim 1, wherein constructing the stacked autoencoder comprises the following steps:
step 1, constructing an autoencoder;
step 2, stacking a plurality of autoencoders to construct the stacked autoencoder;
step 3, normalizing the training set data, the validation set data and the test set data, training the stacked autoencoder with the training set data, validating the stacked autoencoder with the validation set data and generating the threshold value, and testing the stacked autoencoder with the test set data.
3. The method of claim 2, wherein the training set data is data collected when the network of the water distribution system is operating normally, the validation set data is data collected when the network of the water distribution system is under multiple types of attacks, and the test set data is data collected when the network of the water distribution system is under multiple types of attacks.
4. The deep learning-based water distribution system network physical attack detection method of claim 2, wherein the autoencoder hidden layer $h_i$ satisfies:
$$h_i = f(W_1 x + b_1);$$
wherein $x$ represents the input data, $x = [x^{(1)}, x^{(2)}, x^{(3)}, \ldots, x^{(n)}]^T \in \mathbb{R}^n$, $n$ denotes the dimension of the data, $W_1$ represents the weight matrix of the encoder, $b_1$ represents the offset vector of the encoder, $h_i = [h^{(1)}, h^{(2)}, h^{(3)}, \ldots, h^{(m)}]^T \in \mathbb{R}^m$, $m$ denotes the dimension of the hidden layer with $m < n$, and $f(z)$ denotes the activation function;
the output data of the autoencoder satisfies:
$$y = f(W_2 h_i + b_2);$$
wherein $y$ represents the output data, $y = [y^{(1)}, y^{(2)}, y^{(3)}, \ldots, y^{(n)}]^T \in \mathbb{R}^n$, $W_2$ represents the weight matrix of the decoder, and $b_2$ represents the offset vector of the decoder.
5. The deep learning-based water distribution system cyber-physical attack detection method of claim 4, wherein the activation function satisfies:
Figure FDA0003048446670000021
6. The deep learning-based water distribution system cyber-physical attack detection method of claim 4, wherein the activation function satisfies:
Figure FDA0003048446670000022
7. The method of claim 1, wherein the threshold is the 99th percentile of the reconstruction errors computed over 20% of the training set.
8. The deep learning-based water distribution system network physical attack detection method of claim 7, further comprising evaluating the stacked autoencoder by:
$$F_1 = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
in the formula, F1 represents the balance index of recall and precision, Recall represents the recall rate, and Precision represents the precision rate;
the balance index of recall and precision ranges from 0 to 1; a value of 0 indicates that the output of the stacked autoencoder is worst, and a value of 1 indicates that the output of the stacked autoencoder is optimal.
9. The deep learning-based water distribution system cyber-physical attack detection method of claim 8, wherein the recall rate satisfies:
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
in the formula, TP represents the number of samples that are actual attacks and are determined to be attacks, and FN represents the number of samples that are actual attacks but are determined to be normal.
10. The deep learning-based method for detecting cyber-physical attacks on a water distribution system of claim 9, wherein the precision rate satisfies:
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
in the formula, FP represents the number of samples that are actually normal but are determined to be attacks.
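The detection steps of claims 1, 2 and 7 (normalization, a percentile threshold, and comparison of a per-time-step window error with that threshold) can be sketched as follows. Several details are assumptions not stated in the claims: min-max scaling fitted on the training set, squared error averaged over sensors, a trailing moving-average window, and taking the last 20% of the training errors for the threshold. All function names are hypothetical.

```python
import numpy as np

def fit_min_max(train):
    """Per-sensor minimum and range, estimated on the training set only."""
    lo = train.min(axis=0)
    span = train.max(axis=0) - lo
    span = np.where(span > 0, span, 1.0)  # avoid dividing by zero for constant sensors
    return lo, span

def normalize(data, lo, span):
    """Apply the training-set scaling to any of the three data sets."""
    return (data - lo) / span

def window_errors(x, x_hat, window=3):
    """Reconstruction error per time step, averaged over a trailing window."""
    per_step = ((x - x_hat) ** 2).mean(axis=1)  # one value per row of the error matrix
    out = np.empty_like(per_step)
    for t in range(len(per_step)):
        start = max(0, t - window + 1)
        out[t] = per_step[start:t + 1].mean()
    return out

def percentile_threshold(errors, fraction=0.2, q=99.0):
    """q-th percentile of the errors on the last `fraction` of the training set."""
    errors = np.asarray(errors, dtype=float)
    k = max(1, int(round(len(errors) * fraction)))
    return float(np.percentile(errors[-k:], q))

def detect(x, x_hat, threshold, window=3):
    """True where the window error exceeds the threshold, i.e. an attack is flagged."""
    return window_errors(x, x_hat, window) > threshold
```

A short usage example: with `x` and `x_hat` as arrays of shape (time steps, sensors), `detect(x, x_hat, percentile_threshold(train_window_errors))` returns one Boolean per time step, where `train_window_errors` stands for window errors computed on attack-free data. Averaging over a window means an isolated spike keeps the flag raised for the following `window - 1` steps.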
CN202110480658.7A 2021-04-30 2021-04-30 Water distribution system network physical attack detection method based on deep learning Pending CN113194098A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110480658.7A CN113194098A (en) 2021-04-30 2021-04-30 Water distribution system network physical attack detection method based on deep learning

Publications (1)

Publication Number Publication Date
CN113194098A true CN113194098A (en) 2021-07-30

Family

ID=76983123

Country Status (1)

Country Link
CN (1) CN113194098A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110706720A (en) * 2019-08-16 2020-01-17 广东省智能制造研究所 Acoustic anomaly detection method for end-to-end unsupervised deep support network
CN111967571A (en) * 2020-07-07 2020-11-20 华东交通大学 MHMA-based anomaly detection method and equipment
CN112673381A (en) * 2020-11-17 2021-04-16 华为技术有限公司 Method and related device for identifying confrontation sample

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Li Juan et al.: "Evaluation Object Extraction Based on Stacked Denoising Autoencoders", Highlights of Sciencepaper Online *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113949549A (en) * 2021-10-08 2022-01-18 东北大学 Real-time traffic anomaly detection method for intrusion and attack defense
CN113949549B (en) * 2021-10-08 2022-08-23 东北大学 Real-time traffic anomaly detection method for intrusion and attack defense

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210730