CN115221520A

CN115221520A - Open set identification-based unknown attack detection method for industrial control network

Info

Publication number: CN115221520A
Application number: CN202210877769.6A
Authority: CN
Inventors: 曹向辉; 石鹏
Original assignee: Southeast University
Current assignee: Southeast University
Priority date: 2022-07-25
Filing date: 2022-07-25
Publication date: 2022-10-21

Abstract

The invention discloses an unknown attack detection method for an industrial control network based on open set identification, comprising the following steps: S1: initializing the structure and the hyperparameters of the model to be trained; S2: designing a class distance loss function and combining it with a cross entropy loss function to establish a joint loss function; S3: pre-training the model, updating the activation vector center of each class and the network parameters in each iteration; S4: computing activation vectors on the pre-trained model and fitting a Weibull distribution using the correctly classified samples; S5: adjusting the activation vectors of the known classes, computing the activation vector of the unknown attack, and outputting the probabilities that a test sample belongs to each known class and to the unknown attack. The method mines discriminative and generalizable features and improves the detection rate of unknown attacks. When unknown attacks are present in the industrial control network, it identifies them while still correctly classifying samples of known types, thereby improving network security.

Description

Open set identification-based unknown attack detection method for industrial control network

Technical Field

The invention belongs to the technical field of artificial intelligence and information security, in particular to network attack threats against industrial control systems, and mainly relates to an unknown attack detection method for an industrial control network based on open set identification.

Background

An industrial control system is a collection of devices, systems, networks, and controllers used to operate and control automated industrial processes. Such systems are common in industries such as electric power, sewage treatment, oil and gas transportation, chemicals, and transportation. With the deep integration of informatization and industrialization and the rapid development of the Internet of Things, industrial control systems have been opened up without a corresponding improvement in security boundary control, so they face serious network attack threats.

Intrusion detection techniques detect the presence of security policy violations and signs of network attacks by analyzing information collected from network nodes. Distinguishing between normal traffic and intrusion behavior can be seen as a classification problem. Due to the superior performance of machine learning and deep learning on classification problems, researchers have explored their application in the field of intrusion detection and demonstrated effectiveness.

Most machine learning and deep learning algorithms rest on a closed-set assumption: every class that appears in the test phase is present in the training data. Intrusion detection in a real industrial control network, however, is not a closed-set problem. As attack types keep increasing, traditional intrusion detection algorithms based on machine learning show their shortcomings: they are usually trained on known attack samples, so they can achieve a high detection rate on the attack types present in the training set but have difficulty detecting attacks of unknown types.

Disclosure of Invention

Aiming at the problem that unknown attacks are difficult to detect in the industrial control network environment of the prior art, the invention provides an unknown attack detection method for an industrial control network based on open set identification, comprising the following steps: S1: initializing the structure and the hyperparameters of the model to be trained; S2: designing a class distance loss function and combining it with a cross entropy loss function to establish a joint loss function; S3: pre-training the model, updating the activation vector center of each class and the network parameters in each iteration; S4: computing activation vectors on the pre-trained model and fitting a Weibull distribution using the correctly classified samples; S5: adjusting the activation vectors of the known classes, computing the activation vector of the unknown attack, and outputting the probabilities that a test sample belongs to each known class and to the unknown attack. The method mines discriminative and generalizable features and improves the detection rate of unknown attacks. When unknown attacks are present in the industrial control network, it correctly classifies samples of known types and identifies the unknown attacks, thereby improving network security.

To achieve the above purpose, the invention adopts the following technical scheme: an unknown attack detection method for an industrial control network based on open set identification, comprising the following steps:

S1, data initialization: using a deep neural network as the training model, determining the number of hidden layers and the number of nodes in each hidden layer, and initializing the optimization algorithm and the hyperparameter values;

S2, establishing a joint loss function L: designing a class distance loss function $L_{CD}$ and combining it with the cross entropy loss function $L_{CE}$ to establish the joint loss function, which is calculated as:

$$L = L_{CE} + \lambda L_{CD}$$

wherein the hyperparameter λ controls the relative weight of the two loss functions;

S3, pre-training the model: updating the activation vector center of each class and the network parameters in each training iteration to obtain a high-accuracy pre-trained model;

S4, fitting Weibull distributions: computing activation vectors on the high-accuracy pre-trained model obtained by the iterative updates of step S3, and fitting a Weibull distribution using the correctly classified samples;

S5, outputting test probabilities: on the basis of the fitted Weibull distributions, correcting the activation vectors of the known classes, computing the activation vector of the unknown class, and outputting the probabilities that a test sample belongs to each known class and to the unknown attack.

Compared with the prior art, the invention has the following beneficial effects:

(1) The unknown attack detection algorithm based on open set identification calibrates the activation vector of each class and thereby adds a way of computing the probability that an input is an unknown attack sample, while retaining the ability to discriminate between known normal traffic and known attack samples. This solves the problem that current unsupervised methods cannot classify known attacks at a fine granularity, and makes up for the inability of closed-set methods to detect unknown attacks.

(2) In the training phase, the method back-propagates the combination of the class distance loss function and the cross entropy loss function, so that sufficiently discriminative and generalizable features are learned and the detection of unknown attacks is improved.

Drawings

FIG. 1 is a flow chart of the steps of the detection method of the present invention.

Detailed Description

The present invention will be further illustrated with reference to the accompanying drawings and detailed description, which will be understood as being illustrative only and not limiting in scope.

Example 1

An unknown attack detection method for an industrial control network based on open set identification, as shown in FIG. 1, comprises the following steps:

S1, data initialization: using a deep neural network as the training model, determining the number of hidden layers and the number of nodes in each hidden layer, and initializing the optimization algorithm and the hyperparameter values;

In this embodiment, the deep neural network uses a 4-layer structure comprising an input layer, an output layer, and two hidden layers with 32 and 16 nodes, respectively. The learning rate is set to 0.005, the batch size to 256, and the tail size to 500; the number α of known classes to be adjusted is set to 2, the parameter β that adjusts the rate of change of the class centers is set to 0.005, and the scaling parameter λ that weights the class distance loss function in the joint loss function is set to 0.01.
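As a minimal sketch, the settings of this embodiment can be collected in one place. The names HYPERPARAMS and layer_shapes, and the n_features and n_classes arguments, are hypothetical; the source does not state the input or output dimensions of the network.

```python
# Hyperparameter values stated in this embodiment; the key names are illustrative.
HYPERPARAMS = {
    "learning_rate": 0.005,  # step size mu used in the parameter update
    "batch_size": 256,       # number of samples per batch, M
    "tail_size": 500,        # number of largest distances used per Weibull fit
    "alpha": 2,              # number of top-ranked known classes to adjust
    "beta": 0.005,           # rate of change of the class centers
    "lam": 0.01,             # weight of the class distance loss in the joint loss
}


def layer_shapes(n_features, n_classes, hidden=(32, 16)):
    """Return the (inputs, outputs) shape of each weight matrix.

    n_features and n_classes are assumptions supplied by the caller.
    """
    sizes = [n_features, *hidden, n_classes]
    return list(zip(sizes[:-1], sizes[1:]))
```

A 4-layer network (input, two hidden layers, output) has three weight matrices, which is what layer_shapes returns.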

S2, establishing a joint loss function L:

First, a class distance loss function (CD loss) is designed. The first part of the CD loss considers the intra-class distance: samples of the same class should be as close together as possible, which is equivalent to minimizing the distance between each sample and the center of its class. The second part considers the inter-class distance: samples of different classes should be far apart, which is equivalent to maximizing the distance between the centers of different classes. The class distance loss function is calculated as follows:

wherein M is the number of samples in a batch; N is the number of known classes; $v_i$ is the activation vector of sample i;

$C_{y_i}$ is the activation vector center of the class $y_i$ to which sample i belongs; $C_m$ and $C_n$ are the m-th and n-th activation vector centers, respectively.
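The formula of the class distance loss is not reproduced in this text, so the following is only one possible form consistent with the description: the mean squared distance from each activation vector to its class center, minus the mean squared distance between pairs of class centers. The function names, the squared Euclidean distance, and the equal weighting of the two parts are assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def class_distance_loss(vectors, labels, centers):
    """Assumed CD loss: intra-class term minus inter-class term."""
    # Intra-class part: pull each activation vector toward its own class center.
    intra = sum(sq_dist(v, centers[y]) for v, y in zip(vectors, labels)) / len(vectors)
    # Inter-class part: push the centers of different classes apart.
    n = len(centers)
    pairs = [(m, k) for m in range(n) for k in range(m + 1, n)]
    inter = sum(sq_dist(centers[m], centers[k]) for m, k in pairs) / len(pairs) if pairs else 0.0
    return intra - inter
```

Minimizing this quantity decreases the intra-class distances and increases the distances between class centers, which is the behavior the text describes.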

The method uses a deep neural network as the training model, and when a deep neural network performs a multi-class classification task, the cross entropy loss function (CE loss) is generally used to measure the error:

wherein M is the number of samples in a batch; N is the number of known classes; $p_{ij}$ is the probability that the i-th sample belongs to class j; when $y_i = j$,

the indicator function $\phi(y_i = j)$ takes the value 1. The CE loss provides gradient feedback, and the magnitude of the gradient is proportional to the difference between the predicted value of the network and the true value.

The joint loss function combines the CE loss and the CD loss, and the hyperparameter λ controls the relative weight of the two loss functions:

$$L = L_{CE} + \lambda L_{CD}$$
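A minimal sketch of the cross entropy term and of the joint loss follows. Averaging the cross entropy over the batch is an assumption, since the normalization is not visible in this text; the function names are hypothetical.

```python
import math


def cross_entropy(probs, labels):
    """Mean negative log-probability of the true class over a batch.

    probs[i][j] is the predicted probability that sample i belongs to class j.
    The 1/M averaging is an assumption.
    """
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def joint_loss(ce, cd, lam=0.01):
    """L = L_CE + lambda * L_CD, with lambda = 0.01 as in this embodiment."""
    return ce + lam * cd
```

With λ = 0.01 the class distance term acts as a small regularizer on top of the classification error.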

S3, pre-training the model: updating the activation vector center of each class and the network parameters in each training iteration to obtain a high-accuracy pre-trained model, wherein the step further comprises:

S31, dividing the data set:

An unknown attack detection data set is constructed from the natural gas pipeline data set, and the known data are divided in a 7:3 ratio, with 70% serving as the training set and the remaining 30% as the test set.
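The split can be sketched as follows. How the unknown classes are chosen is not stated in the text; here it is assumed that the caller names the held-out classes, that known samples are shuffled before the 7:3 split, and that all unknown samples are placed in the test set. The function name and arguments are hypothetical.

```python
import random


def build_open_set_split(samples, labels, unknown_classes, train_ratio=0.7, seed=0):
    """Split known-class data 7:3 and keep unknown-class data for testing only."""
    known = [(x, y) for x, y in zip(samples, labels) if y not in unknown_classes]
    unknown = [(x, y) for x, y in zip(samples, labels) if y in unknown_classes]
    random.Random(seed).shuffle(known)  # shuffling is an assumption
    cut = round(train_ratio * len(known))
    train = known[:cut]
    test = known[cut:] + unknown  # unknown attacks never appear in training
    return train, test
```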

S32, calculating the back propagation error over a batch of M samples on the training set:

wherein t is the current iteration number; $x_i$ is the i-th sample; λ is the hyperparameter that weights the two loss functions;

S33, updating the class activation vector centers batch by batch:

wherein

$C_j^t$ is the current activation vector center of class j, and t is the current iteration number; the scalar β controls the rate of change of the class centers; $\Delta C_j$ is the change in the activation vector center of class j; M is the number of samples in a batch; $v_i$ is the activation vector of the i-th sample; when $y_i = j$, the function $\phi(\cdot)$ takes the value 1.
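The update formula itself is missing from this text. The sketch below assumes a center-loss-style rule that matches the symbols defined above: the change of center j is the sum of (C_j - v_i) over the batch samples of class j, divided by one plus the number of such samples, and the center then moves by -β times that change. This rule and the function name are assumptions.

```python
def update_centers(centers, vectors, labels, beta=0.005):
    """One batch update of every class activation vector center (assumed rule)."""
    new_centers = []
    for j, center in enumerate(centers):
        # Samples of the batch whose label is j, i.e. phi(y_i = j) = 1.
        members = [v for v, y in zip(vectors, labels) if y == j]
        delta = [
            sum(center[k] - v[k] for v in members) / (1 + len(members))
            for k in range(len(center))
        ]
        new_centers.append([c - beta * d for c, d in zip(center, delta)])
    return new_centers
```

The added 1 in the denominator leaves a center unchanged when its class is absent from the batch, instead of dividing by zero.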

S34, updating the network parameters:

wherein μ is the learning rate and t is the iteration number.
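The parameter update formula is not reproduced here. Assuming plain gradient descent, which is consistent with a learning rate μ and an iteration index t, one step can be sketched as below. The optimizer actually used in the embodiment is not stated, and the function name is hypothetical.

```python
def gradient_step(params, grads, lr=0.005):
    """theta_(t+1) = theta_t - mu * dL/dtheta, applied element-wise.

    grads holds the gradient of the joint loss with respect to each parameter.
    """
    return [p - lr * g for p, g in zip(params, grads)]
```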

S4, fitting Weibull distributions: computing activation vectors on the high-accuracy pre-trained model obtained by the iterative updates of step S3, and fitting a Weibull distribution using the correctly classified samples, wherein the step further comprises:

S41: after a sufficiently high accuracy is reached, the pre-trained model is saved, and the activation vector v(x) of each correctly classified sample is obtained on the pre-trained model. The mean activation vector of each known class is calculated, MAV = [$m_1$, $m_2$, ..., $m_N$], where the mean activation vector of class j is:

$$m_j = \frac{1}{N_j}\sum_{i=1}^{N_j} v(x_{i,j})$$

wherein $x_{i,j}$ denotes a sample i correctly classified as class j, and $N_j$ is the number of training samples correctly classified as class j.
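The mean activation vectors can be sketched as a per-class average over the correctly classified training samples. The function name and the choice of returning None for a class with no correctly classified sample are assumptions.

```python
def mean_activation_vectors(vectors, labels, predictions, n_classes):
    """Average the activation vectors of correctly classified samples per class."""
    dim = len(vectors[0])
    sums = [[0.0] * dim for _ in range(n_classes)]
    counts = [0] * n_classes
    for v, y, p in zip(vectors, labels, predictions):
        if y == p:  # keep only correctly classified samples
            counts[y] += 1
            for k in range(dim):
                sums[y][k] += v[k]
    return [
        [s / counts[j] for s in sums[j]] if counts[j] else None
        for j in range(n_classes)
    ]
```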

S42: the distance between each correctly classified sample of a class and the corresponding MAV is calculated; after the distances are sorted, the tail-size largest distances are selected to fit the Weibull distribution of each class. The fitted model comprises a scale parameter $\lambda_j$, a location parameter $\tau_j$, and a shape parameter $K_j$, and is obtained with the FitHigh function in libMR.
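The tail selection that precedes the fit can be sketched as follows. The Euclidean distance is an assumption, since the distance measure is not named in the text, and the Weibull fit itself (performed by libMR in the source) is not reproduced. The function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def tail_distances(vectors, mav, tail_size=500):
    """Return the tail_size largest distances to the class MAV, largest first."""
    distances = sorted((euclidean(v, mav) for v in vectors), reverse=True)
    return distances[:tail_size]
```

The returned list is the data to which the per-class Weibull distribution would be fitted; if a class has fewer than tail_size correctly classified samples, all of its distances are returned.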

S5, outputting test probabilities: on the basis of the fitted Weibull distributions, correcting the activation vectors of the known classes, computing the activation vector of the unknown class, and outputting the probabilities that a test sample belongs to each known class and to the unknown attack, wherein the step further comprises:

S51: on the basis of the fitted Weibull distributions, the activation vectors of the known classes are corrected and the activation vector of the unknown class is calculated. The test data are input into the neural network, and the components of the resulting activation vector v(x) are sorted in descending order, with $s_j$ denoting the index of the j-th largest component. Adjustment coefficients are set for the first α components:

wherein $\lambda_j$ is the scale parameter of the Weibull distribution; $\tau_j$ is the location parameter; $K_j$ is the shape parameter.
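The coefficient formula is missing from this text. The sketch below assumes an OpenMax-style form: for the j-th largest component, the coefficient is one minus a rank weight (α - j + 1)/α times the Weibull cumulative probability of the sample's distance to that class's MAV. The rank weight, the use of τ as a shift of the distance, and all function names are assumptions.

```python
import math


def weibull_cdf(distance, scale, loc, shape):
    """Weibull cumulative probability of a distance, shifted by loc (assumed role of tau)."""
    z = max(distance - loc, 0.0) / scale
    return 1.0 - math.exp(-(z ** shape))


def adjustment_coefficients(activation, distances, weibulls, alpha=2):
    """Coefficient omega_j per class; only the alpha largest components are adjusted.

    distances[j] is the distance of the test sample to the MAV of class j and
    weibulls[j] is the fitted (scale, loc, shape) triple of class j.
    """
    order = sorted(range(len(activation)), key=lambda j: activation[j], reverse=True)
    omega = [1.0] * len(activation)
    for rank, j in enumerate(order[:alpha], start=1):
        scale, loc, shape = weibulls[j]
        weight = (alpha - rank + 1) / alpha  # assumed rank weighting
        omega[j] = 1.0 - weight * weibull_cdf(distances[j], scale, loc, shape)
    return omega
```

A sample far from a class's MAV gets a cumulative probability near 1, so the coefficient of that class shrinks toward 0; a sample close to the MAV keeps a coefficient near 1.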

S52: the adjustment coefficients are used to recalculate the activation vector $v(x)^*$ and the activation vector of the unknown attack $v_0(x)$:

$$v(x)^* = v(x) \circ \omega(x)$$
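The element-wise product above can be sketched directly. The expression for the unknown-attack activation is not visible in this text; the sketch assumes the OpenMax-style choice of summing the activation mass removed from the known classes, that is, the sum of v_j(x)(1 - ω_j(x)). The function name is hypothetical.

```python
def recalibrate(activation, omega):
    """Return (v0, v_star): unknown-class activation and adjusted known activations."""
    # Element-wise (Hadamard) product v(x) * omega(x).
    v_star = [v * w for v, w in zip(activation, omega)]
    # Assumed form: the mass taken away from the known classes goes to class 0.
    v0 = sum(v * (1.0 - w) for v, w in zip(activation, omega))
    return v0, v_star
```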

S53: $v_0(x)$ and $v(x)^*$ are used as the input to SoftMax to calculate the probability that the test data belong to each class, where the classes comprise the known classes of the training samples and the unknown class. The probability of class j is:

$$P(y = j \mid x) = \frac{e^{v_j(x)^*}}{\sum_{i=0}^{N} e^{v_i(x)^*}}, \quad j = 0, 1, \ldots, N, \quad \text{with } v_0(x)^* = v_0(x)$$

A total of N+1 probabilities are calculated; the class index with the largest probability is the predicted output class, and class 0 corresponds to the unknown attack.
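The final step can be sketched as a SoftMax over the N+1 activations followed by an argmax. Subtracting the maximum score before exponentiation is a numerical-stability detail added here, not something stated in the text; the function names are hypothetical.

```python
import math


def open_set_probabilities(v0, v_star):
    """SoftMax over [v0] + v_star; index 0 is the unknown attack."""
    scores = [v0] + list(v_star)
    top = max(scores)  # shift for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(probabilities):
    """Index of the largest probability; 0 means unknown attack."""
    return max(range(len(probabilities)), key=lambda j: probabilities[j])
```

For example, with v0 = 4.0 and adjusted known activations [1.0, 0.0, 1.0], the largest probability falls on index 0 and the sample is reported as an unknown attack.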

The unknown attack detection algorithm based on open set identification calibrates the activation vector of each class and thereby adds a way of computing the probability that an input is an unknown attack sample, while retaining the ability to discriminate between known normal traffic and known attack samples. This solves the problem that current unsupervised methods cannot classify known attacks at a fine granularity, and makes up for the inability of closed-set methods to detect unknown attacks.

It should be noted that the above only illustrates the technical idea of the present invention and does not limit its scope of protection. Those skilled in the art may make modifications and refinements without departing from the principle of the present invention, and such modifications and refinements fall within the scope of protection of the claims of the present invention.

Claims

1. An unknown attack detection method for an industrial control network based on open set identification, characterized by comprising the following steps:

S1, data initialization: using a deep neural network as the training model, determining the number of hidden layers and the number of nodes in each hidden layer, and initializing the optimization algorithm and the hyperparameter values;

S2, establishing a joint loss function L: designing a class distance loss function $L_{CD}$ and combining it with the cross entropy loss function $L_{CE}$ to establish the joint loss function, which is calculated as:

$$L = L_{CE} + \lambda L_{CD}$$

S3, pre-training the model: updating the activation vector center of each class and the network parameters in each training iteration to obtain a high-accuracy pre-trained model;

S4, fitting Weibull distributions: computing activation vectors on the high-accuracy pre-trained model obtained by the iterative updates of step S3, and fitting a Weibull distribution using the correctly classified samples;

S5, outputting test probabilities: on the basis of the fitted Weibull distributions, correcting the activation vectors of the known classes, computing the activation vector of the unknown class, and outputting the probabilities that a test sample belongs to each known class and to the unknown attack.

2. The method for detecting unknown attacks on an industrial control network based on open set identification as claimed in claim 1, characterized in that: the hyperparameters in step S1 include at least a learning rate, a batch size, a tail size, the number α of known classes to be adjusted, a parameter β for adjusting the rate of change of the class centers, and a scaling parameter λ that weights the class distance loss function in the joint loss function.

3. The method for detecting unknown attacks on an industrial control network based on open set identification as claimed in claim 2, characterized in that: in step S2, the class distance loss function $L_{CD}$ is:

wherein $C_{y_i}$ is the activation vector center of class $y_i$; $C_m$ and $C_n$ are the m-th and n-th activation vector centers, respectively;

the cross entropy loss function $L_{CE}$ is:

wherein $p_{ij}$ is the probability that the i-th sample belongs to class j; when $y_i = j$,

the indicator function $\phi(y_i = j)$ takes the value 1.

4. The method for detecting unknown attacks on an industrial control network based on open set identification as claimed in claim 3, characterized in that: the step S3 further includes:

S31: dividing the data set: constructing an unknown attack detection data set, and dividing the known data into a training set and a test set in proportion;

S32: calculating the back propagation error over a batch of M samples:

S33: updating the class activation vector centers batch by batch:

wherein $C_j^t$ is the current activation vector center of class j; the scalar β controls the rate of change of the class centers; $\Delta C_j$ is the change in the activation vector center of class j; when $y_i = j$, the function $\phi(\cdot)$ takes the value 1;

S34: updating the network parameters:

wherein μ is the learning rate and t is the iteration number.

5. The method for detecting unknown attacks on an industrial control network based on open set identification as claimed in claim 4, characterized in that: the step S4 further includes:

S41: after a sufficiently high accuracy is reached, saving the pre-trained model, obtaining on the pre-trained model the activation vector v(x) of each correctly classified sample, and calculating the mean activation vector of each known class, MAV = [$m_1$, $m_2$, ..., $m_N$], where the mean activation vector of class j is:

$$m_j = \frac{1}{N_j}\sum_{i=1}^{N_j} v(x_{i,j})$$

wherein $x_{i,j}$ denotes a sample i correctly classified as class j, and $N_j$ is the number of training samples correctly classified as class j;

S42: calculating the distance between each correctly classified sample of a class and the corresponding MAV, sorting the distances, and fitting the Weibull distribution of each class to the tail-size largest distances.

6. The method for detecting unknown attacks on an industrial control network based on open set identification as claimed in claim 5, characterized in that: in step S42, the fitted model comprises a scale parameter $\lambda_j$, a location parameter $\tau_j$, and a shape parameter $K_j$, and is obtained with the FitHigh function in libMR.

7. The method for detecting unknown attacks on an industrial control network based on open set identification as claimed in claim 5 or 6, characterized in that: the step S5 further includes:

S51: inputting the test data into the neural network, sorting the components of the resulting activation vector v(x) in descending order, with $s_j$ denoting the index of the j-th largest component, and setting adjustment coefficients for the first α components:

wherein $\lambda_j$ is the scale parameter of the Weibull distribution; $\tau_j$ is the location parameter; $K_j$ is the shape parameter;

S52: using the adjustment coefficients to recalculate the activation vector $v(x)^*$ and the activation vector of the unknown attack $v_0(x)$:

$$v(x)^* = v(x) \circ \omega(x)$$

S53: using the activation vector of the unknown attack $v_0(x)$ and the activation vector $v(x)^*$ as the input to SoftMax to calculate the probability that the test data belong to each class, wherein the probability of class j is: