Disclosure of Invention
The invention addresses the complex data characteristics and relatively low identification precision encountered when identifying information attacks on electric power cyber-physical systems (CPS), and provides an electric CPS information attack identification method based on a stacked self-coding network model that starts from the correlation and redundancy of CPS data features. Firstly, properties of CPS data such as non-functional dependence and nonlinear correlation are analyzed, a feature selection method based on the maximum information coefficient is provided, and an optimal attack feature set is determined. Then, an information attack identification model based on the stacked self-coding network is constructed, in which an unsupervised pre-training encoder and a supervised fine-tuning classifier are arranged to train and update the network parameters. Finally, the initial parameters of the model are optimized with a self-adaptive cuckoo algorithm. Example analysis shows that the method can effectively improve the identification precision of information attacks.
The technical scheme adopted for realizing the aim of the invention is as follows: a power CPS information attack identification method based on a stack type self-coding network model is characterized by comprising the following steps:
1) a maximum-correlation minimum-redundancy attack feature selection method improved with the maximum information coefficient (MIC) is used, which reflects the nonlinear correlation and non-functional dependency relationships in the data features, analyzes the correlation and redundancy among the features, and thereby selects the optimal attack feature set:
(a) a set D formed by the feature pair <x, y> is partitioned with a grid G; the mutual information value in each sub-grid is calculated while the positions of the division points are changed, giving the maximum mutual information value over the whole grid G; the maximum normalized I(D, x, y) values obtained for the different division points form the feature matrix M(D)_(x,y), as shown in formula (1):
the maximum information coefficient is defined as formula (2):
wherein MIC(D) lies in the range [0, 1] and B(n) is the upper limit of the grid size; if B(n) is too large, the data in the set D may be concentrated in a small portion of the sub-grids, and if B(n) is too small, only a small part of the data can be searched; the effect is best when B(n) = n^0.6;
(b) the larger the MIC between a feature and the category, the stronger the correlation and the greater the influence on the final classification result; the larger the MIC between two features, the stronger their substitutability, i.e. the stronger the redundancy; the correlation and redundancy are quantified as shown in formula (3) and formula (4):
wherein D represents the correlation between the feature set F and the attack category c, and R represents the redundancy among the features in the set F; F and |F| are respectively the feature set and the number of features, x_i represents the i-th feature, and c represents the category label; MIC(x_i, c) represents the maximum information coefficient between feature i and the target category, and MIC(x_i, x_j) represents the maximum information coefficient between feature i and feature j;
(c) the optimal attack feature set is selected in terms of feature correlation and redundancy, and the selected set must satisfy the conditions of maximum correlation and minimum redundancy; with the original feature set denoted F and the optimal feature subset F_(m-1) of m-1 features already obtained, the selection of the m-th feature from the remaining features F - F_(m-1) should satisfy formula (5):
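As an illustration of step (a), the following Python sketch approximates the maximum information coefficient with equal-width grids. It is a simplification and not the method of the invention as such: a full MIC computation also optimizes the positions of the division points, whereas this sketch only varies the number of bins on each axis. The function names (`approx_mic`, `mutual_information`) and the equal-width binning are assumptions made for illustration; the grid bound B(n) = n^0.6 is taken from the text, and the normalization by log min(x, y) follows the usual MIC definition.

```python
import math
from collections import Counter


def _bin(values, k):
    """Assign each value to one of k equal-width bins (assumed binning scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / k
    return [min(int((v - lo) / width), k - 1) for v in values]


def mutual_information(bx, by):
    """Mutual information (in nats) of two discretized sequences."""
    n = len(bx)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in joint.items()
    )


def approx_mic(x, y, exponent=0.6):
    """Largest normalized mutual information over all grids with nx * ny < n**exponent."""
    limit = len(x) ** exponent  # B(n) = n^0.6 as stated in the text
    best = 0.0
    nx = 2
    while nx * 2 < limit:
        ny = 2
        while nx * ny < limit:
            mi = mutual_information(_bin(x, nx), _bin(y, ny))
            best = max(best, mi / math.log(min(nx, ny)))
            ny += 1
        nx += 1
    return min(best, 1.0)  # guard against floating-point overshoot
```

With this sketch, a feature compared with itself scores close to 1, a constant feature scores 0, and a nonlinear but deterministic relationship such as a parabola still yields a clearly non-zero score, which is the property the text relies on.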
2) considering the complex relationships among the data features of the power CPS, an information attack identification model based on a Stacked Auto-Encoder (SAE) is provided through analysis of historical data, with the following steps:
(d) constructing an unsupervised pre-training encoder in which the output layer of the network reproduces the input layer as closely as possible, so that the low-dimensional data of the middle hidden layer can represent the original data; each layer of the neural network is pre-trained with a layer-by-layer greedy training method and the network parameters are initialized layer by layer; in this way the physical features and the information features are each given a layer-by-layer abstract representation and encoded into low-dimensional data features, which reduces the difficulty of model training;
(e) constructing a supervised fine-tuning classifier: the encoded data are encoded repeatedly to obtain the dimension-reduced physical features and information features, and a softmax classifier is constructed to perform the final attack identification step; the number of output-layer neurons is set to N, and for N types of electric power CPS information attack each neuron represents one type of attack;
(f) the SAE identification model places high demands on the setting of the initial parameters when its parameters are tuned, and the objective function for the initial parameters of the model is given by formula (6):
where n is the total number of samples, y' (i) represents the desired output sample, and y (i) represents the actual output sample;
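Steps (e) and (f) can be illustrated with a short Python sketch of a softmax output layer and an error objective over n samples. Formula (6) is not reproduced in the text, so the mean squared error used here is only an assumption that is consistent with the symbols n, y'(i) and y(i); the function names `softmax` and `mse_objective` are likewise hypothetical.

```python
import math


def softmax(logits):
    """Turn N output-layer activations into N class probabilities."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mse_objective(desired, actual):
    """Assumed objective: squared error between y'(i) and y(i), averaged over n samples."""
    n = len(desired)
    return sum(
        sum((d - a) ** 2 for d, a in zip(y_des, y_act))
        for y_des, y_act in zip(desired, actual)
    ) / n


# One neuron per attack type: the predicted class is the most probable neuron.
probs = softmax([0.2, 2.5, -1.0])
predicted_class = probs.index(max(probs))
```

Each one-hot label plays the role of the desired output y'(i), and the softmax vector plays the role of the actual output y(i).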
3) after the objective function of the model is obtained, a self-adaptive cuckoo algorithm is provided for carrying out function solving, initial parameters are effectively set, and the weight and the threshold value in the SAE identification model are optimized:
(g) the adaptive step size factor α_0 is set dynamically; the larger this value, the stronger the global search capability but the lower the convergence precision of the algorithm; the smaller this value, the higher the optimization precision but the slower the convergence speed; the dynamic setting is as in formula (7):
in the formula, t_i represents the current number of iterations and t_max represents the maximum number of iterations;
(h) a self-adaptive cuckoo algorithm is provided to solve for the initial parameters of the model; the traditional cuckoo algorithm is improved by setting the discovery probability p_a dynamically so that it increases gradually as the search progresses, which keeps the balance between global search and local search in the later stage of evolution, improves the overall convergence precision of the algorithm, and prevents the algorithm from being trapped in a local optimum; the dynamic setting is as in formula (8):
in the formula, p_a represents the probability of discovering a bird nest, p_a,max represents the maximum discovery probability, t_i represents the current number of iterations, and t_max represents the maximum number of iterations;
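The qualitative behaviour described for the two adaptive parameters can be sketched in Python as follows. Formulas (7) and (8) are not reproduced in the text, so the linear ramps below are stand-ins chosen only to show the stated trends (α_0 shrinking and p_a growing toward p_a,max as t_i approaches t_max); they should not be read as the formulas of the invention. All function names and default values are hypothetical.

```python
def step_factor(t_i, t_max, alpha_start=0.1, alpha_end=0.001):
    """Assumed schedule: the step size factor decreases linearly with the iteration count."""
    return alpha_start - (alpha_start - alpha_end) * t_i / t_max


def discovery_probability(t_i, t_max, pa_min=0.05, pa_max=0.25):
    """Assumed schedule: the discovery probability rises linearly toward pa_max."""
    return pa_min + (pa_max - pa_min) * t_i / t_max


# Early iterations: large steps, low discovery probability (global search).
# Late iterations: small steps, high discovery probability (fine local search).
schedule = [(step_factor(t, 100), discovery_probability(t, 100)) for t in (0, 50, 100)]
```

Any monotone schedule with the same end points would illustrate the same trade-off between global search early on and precise local search later.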
4) after the network parameters are initialized by the self-adaptive cuckoo algorithm, they are fine-tuned on this basis by reverse (back-propagation) adjustment, the weights of the neural network are trained, CPS information attacks are identified, and operation and maintenance personnel take corresponding measures according to the identification result.
Compared with the prior art, the electric CPS information attack identification method based on the stacked self-coding network has the beneficial effects that:
1) considering the characteristics of high dimension, nonlinear correlation, non-function dependence relationship and the like of the electric power CPS data, a maximum information coefficient is introduced to select data characteristics, and an optimal attack characteristic set is determined. The correlation and redundancy among the characteristics are analyzed, and the identification precision and the training speed of the model are effectively improved;
2) a stacked self-coding network identification model is constructed, and each layer of the neural network is pre-trained with a layer-by-layer greedy training method, which handles the high dimensionality of the data features and extracts abstract features in depth. The label result is compared with the classification result, and the model parameters are adjusted with a back propagation algorithm so that the parameters of all layers of the identification model come as close as possible to the global optimum;
3) the cuckoo algorithm is improved: adaptively setting the discovery probability and the step size factor increases the convergence speed of the objective function and prevents the algorithm from being trapped in a local optimum;
4) the method is scientific and reasonable, and has strong applicability and good effect.
Detailed Description
The following describes in detail an electric CPS information attack identification method based on a stacked self-coding network according to the present invention with reference to the accompanying drawings.
Referring to fig. 1, a power CPS information attack identification method based on a stacked self-coding network includes the following steps:
1) considering that properties of CPS data such as high dimensionality, nonlinear correlation and non-functional dependency seriously hinder research and application, the invention provides a maximum-correlation minimum-redundancy attack feature selection method improved with the maximum information coefficient, which reflects the nonlinear correlation and non-functional dependency relationships in the data features, analyzes the correlation and redundancy among the features, and thereby selects the optimal attack feature set:
(a) Data preprocessing. The CPS data may contain null values or obviously abnormal infinite values (such as NaN and INF values), which seriously affect the attack identification process; because the data set relied on by the invention is very large, every record containing such a value is deleted in its entirety. The identification problem is a multi-class classification problem, so the category attribute is converted into one-hot coded form; for example, event type 1 is converted into (1, 0, 0, …, 0) and event type 41 into (0, 0, …, 0, 1). The values of different features in the original data differ greatly in magnitude, which easily produces large errors, so the original data are normalized;
(b) a set D formed by the feature pair <x, y> is partitioned with the grid G, and the mutual information value in each sub-grid is calculated while the positions of the division points are changed, giving the maximum mutual information value over the whole grid G. The maximum normalized I(D, x, y) values obtained for the different division points form the feature matrix M(D)_(x,y), as shown in formula (1):
the maximum information coefficient is defined as formula (2):
wherein MIC(D) lies in the range [0, 1] and B(n) is the upper limit of the grid size. If B(n) is too large, the data in the set D may be concentrated in a small portion of the sub-grids, and if B(n) is too small, only a small part of the data can be searched; the effect is best when B(n) = n^0.6;
(c) the larger the MIC between the features and the categories is, the stronger the representative correlation is, and the greater the influence on the final classification result is; the larger the MIC from feature to feature, the stronger the substitutability, i.e., the greater the redundancy, between the features. The correlation and redundancy process of quantitative analysis is shown in formula (3) and formula (4):
wherein D represents the correlation between the feature set F and the attack category c, and R represents the redundancy among the features in the set F; F and |F| are respectively the feature set and the number of features, x_i represents the i-th feature, and c represents the category label; MIC(x_i, c) represents the maximum information coefficient between feature i and the target category, and MIC(x_i, x_j) represents the maximum information coefficient between feature i and feature j;
(d) the optimal attack feature set is selected in terms of feature correlation and redundancy, and the selected set must satisfy the conditions of maximum correlation and minimum redundancy. With the original feature set denoted F and the optimal feature subset F_(m-1) of m-1 features already obtained, the selection of the m-th feature from the remaining features F - F_(m-1) should satisfy formula (5):
(e) the algorithm flow is as follows. Input: feature set F, category label C. Output: optimal attack feature set F'.
① Discretize the physical features F_P in the feature set F, and initialize the feature set F' as the empty set.
② Calculate the maximum information coefficient between each feature and the category label C, and remove irrelevant and weakly relevant features.
③ Find the feature F_i in F that maximizes formula (5), add it to the optimal attack feature set F', and delete F_i from F.
④ Repeat ③ to continue selecting features from those remaining in the feature set F.
⑤ Obtain the optimal attack feature set F'.
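The greedy flow ① to ⑤ can be sketched in Python as below, assuming the MIC values have already been computed. Formula (5) is not reproduced in the text; the criterion used here (relevance to the class minus the mean redundancy with the features already selected) is the common difference form of maximum-correlation minimum-redundancy selection and is an assumption. The function name `select_features`, the `min_relevance` threshold and the toy MIC values are hypothetical.

```python
def select_features(relevance, redundancy, k, min_relevance=0.0):
    """Greedy maximum-correlation minimum-redundancy selection.

    relevance:  dict mapping feature name -> MIC(feature, class label)
    redundancy: dict mapping frozenset({a, b}) -> MIC(a, b)
    k:          number of features to select
    """
    # Step 2: drop irrelevant and weakly relevant features.
    remaining = {f for f, r in relevance.items() if r > min_relevance}
    selected = []  # step 1: F' starts empty

    def score(f):
        if not selected:
            return relevance[f]
        mean_red = sum(redundancy[frozenset((f, s))] for s in selected) / len(selected)
        return relevance[f] - mean_red  # assumed form of the selection criterion

    # Steps 3 and 4: repeatedly move the best-scoring feature from F to F'.
    while remaining and len(selected) < k:
        best = max(sorted(remaining), key=score)  # sorted() makes ties deterministic
        selected.append(best)
        remaining.remove(best)
    return selected  # step 5


# Toy MIC values (made up for illustration only).
rel = {"a": 0.9, "b": 0.85, "c": 0.5, "d": 0.05}
red = {
    frozenset(("a", "b")): 0.95,
    frozenset(("a", "c")): 0.10,
    frozenset(("b", "c")): 0.20,
}
print(select_features(rel, red, k=3, min_relevance=0.1))  # ['a', 'c', 'b']
```

In the toy data, feature b is highly relevant but almost a duplicate of a, so the less relevant but complementary feature c is selected before it; feature d is removed in step ② as weakly relevant.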
2) Considering the complex relationships among the data features of the power CPS, an information attack identification model based on a Stacked Auto-Encoder (SAE) is provided through analysis of historical data, with the following steps:
(f) an unsupervised pre-training encoder is constructed in which the output layer of the network reproduces the input layer as closely as possible, so that the low-dimensional data of the middle hidden layer can represent the original data. Each layer of the neural network is pre-trained with a layer-by-layer greedy training method and the network parameters are initialized layer by layer; in this way the physical features and the information features are each given a layer-by-layer abstract representation and encoded into low-dimensional data features, which reduces the difficulty of model training.
(g) A supervised fine-tuning classifier is constructed: the encoded data are encoded repeatedly to obtain the dimension-reduced physical features and information features, and a softmax classifier is constructed to perform the final attack identification step. The number of output-layer neurons is set to N; for N types of electric power CPS information attack, each neuron represents one type of attack.
(h) The SAE identification model places high demands on the setting of the initial parameters when its parameters are tuned, and the objective function for the initial parameters of the model is given by formula (6):
where n is the total number of samples, y' (i) represents the desired output samples, and y (i) represents the actual output samples.
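The layer-by-layer greedy pre-training of step (f) can be illustrated with a small pure-Python sketch. Each auto-encoder is trained to reproduce its own input, and its hidden code then becomes the input of the next layer. The class name `AutoEncoder`, the sigmoid activations, the plain stochastic gradient descent, and the layer sizes are all assumptions made for illustration; the text does not specify them.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class AutoEncoder:
    """One encoder/decoder pair trained to reproduce its input (inputs assumed in [0, 1])."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
        self.b2 = [0.0] * n_in

    def encode(self, x):
        return [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def decode(self, h):
        return [sigmoid(sum(w * v for w, v in zip(row, h)) + b)
                for row, b in zip(self.w2, self.b2)]

    def loss(self, data):
        """Mean squared reconstruction error per sample."""
        return sum(
            sum((r - v) ** 2 for r, v in zip(self.decode(self.encode(x)), x))
            for x in data
        ) / len(data)

    def train(self, data, epochs=200, lr=0.5):
        for _ in range(epochs):
            for x in data:
                h = self.encode(x)
                r = self.decode(h)
                # Error signals for the squared error with sigmoid units.
                d_out = [(ri - xi) * ri * (1 - ri) for ri, xi in zip(r, x)]
                d_hid = [
                    sum(self.w2[i][j] * d_out[i] for i in range(len(x))) * h[j] * (1 - h[j])
                    for j in range(len(h))
                ]
                for i, d in enumerate(d_out):
                    for j, hj in enumerate(h):
                        self.w2[i][j] -= lr * d * hj
                    self.b2[i] -= lr * d
                for j, d in enumerate(d_hid):
                    for i, xi in enumerate(x):
                        self.w1[j][i] -= lr * d * xi
                    self.b1[j] -= lr * d


def pretrain_stack(data, layer_sizes, seed=0):
    """Greedy layer-wise pre-training: train one layer, encode, train the next."""
    rng = random.Random(seed)
    layers, current = [], data
    for n_hidden in layer_sizes:
        ae = AutoEncoder(len(current[0]), n_hidden, rng)
        ae.train(current)
        layers.append(ae)
        current = [ae.encode(x) for x in current]
    return layers, current
```

The codes returned by `pretrain_stack` are the low-dimensional features that a softmax classifier would receive in step (g); supervised fine-tuning of the whole stack is not shown here.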
3) After the objective function of the model is obtained, a self-adaptive cuckoo algorithm is provided for carrying out function solving, initial parameters are effectively set, and the weight and the threshold value in the SAE identification model are optimized:
(i) randomly generating n initial bird nest positions, which correspond respectively to the initial weights and threshold parameters of the stacked self-coding network model; the neural network trains the model with these parameter values, and the result is calculated according to the following formula:
in the formula, x_i^(t) indicates the position of the i-th bird nest in the t-th generation, α indicates the step size control factor, ⊕ denotes point-to-point multiplication, x_best represents the optimal solution of the current generation, α_0 takes the fixed value 0.01, and L(λ) is a random search path obeying the Lévy distribution:
wherein μ and v follow normal distributions, β = 1.5, and φ is as follows:
(j) in general, the larger the step size factor α_0, the stronger the global search capability but the lower the convergence accuracy of the algorithm; the smaller the value, the higher the optimization accuracy but the slower the convergence rate. This value is fixed in the standard cuckoo algorithm, so the convergence process of the algorithm lacks adaptivity. The dynamic setting of the invention is shown in formula (7):
wherein t_i represents the current number of iterations and t_max represents the maximum number of iterations; the value of α_0 decreases gradually as the number of iterations increases, which ensures that the step length shrinks gradually, so that the algorithm performs global search in the initial stage and improves the optimization precision in the later stage;
(k) by integrating the above processes, the expression of the adaptive cuckoo algorithm for generating new individuals is as follows:
(l) evaluating all bird nests in each iteration, and storing the best bird nest position of the current generation.
(m) after the new generation of bird nest positions is obtained, replacing the poorer bird nest positions of the previous generation with the better-performing ones, thereby obtaining a group of better bird nest positions.
(n) generating a random number rand in the range [0, 1]. If rand > p_a, part of the solutions are discarded and the same number of new solutions are generated by a preferential random walk, as follows:
wherein the two nest positions appearing in the formula represent two random solutions of the t-th generation.
(o) the discovery probability p_a is typically a fixed value of 0.25, and its size determines whether the current solution is retained. In order to prevent the algorithm from falling into a local optimum, the cuckoo algorithm is further improved: the discovery probability p_a is set dynamically so that it increases gradually as the search progresses, which keeps the balance between global search and local search in the later stage of evolution, improves the overall convergence precision of the algorithm, and prevents the algorithm from being trapped in a local optimum. The expression is shown in formula (8):
in the formula, p_a represents the probability of discovering a bird nest, p_a,max represents the maximum discovery probability, t_i represents the current number of iterations, and t_max represents the maximum number of iterations.
(p) after a new group of bird nest positions is obtained, replacing the poorer bird nest positions in e_k with the better-performing bird nest positions according to the objective function, thereby obtaining the latest bird nest positions.
(q) finding the optimal bird nest position in Q_k. If the maximum number of iterations has not been reached, returning to step (l) to continue the search and optimization; otherwise, outputting the optimal position.
(r) using the values corresponding to the optimal bird nest position as the initial parameters of the model, and performing forward training and reverse adjustment of the model.
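Steps (i) to (r) can be sketched as the following Python loop. Several parts are assumptions and are marked as such in the comments: the Lévy step uses Mantegna's algorithm with β = 1.5, which matches the symbols μ, v, β and φ described above; the update x + α_0 (x - x_best) L(λ) is the usual cuckoo search form; the linear schedules for α_0 and p_a are stand-ins because formulas (7) and (8) are not reproduced; and a simple sphere function replaces the SAE training error as the objective. The name `adaptive_cuckoo_search` and all default values are hypothetical.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Levy-distributed step via Mantegna's algorithm (assumed form of L(lambda))."""
    phi = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
           / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    mu = rng.gauss(0, 1)
    v = rng.gauss(0, 1) or 1e-12  # avoid division by zero
    return phi * mu / abs(v) ** (1 / beta)


def adaptive_cuckoo_search(objective, dim, n_nests=15, t_max=200, bounds=(-1.0, 1.0),
                           seed=0, alpha_start=0.1, alpha_end=0.001,
                           pa_min=0.05, pa_max=0.25):
    rng = random.Random(seed)
    lo, hi = bounds

    def clip(value):
        return max(lo, min(hi, value))

    # Step (i): random initial nests, i.e. candidate initial weights and thresholds.
    nests = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_nests)]
    fitness = [objective(x) for x in nests]

    for t in range(t_max):
        frac = t / t_max
        alpha0 = alpha_start - (alpha_start - alpha_end) * frac  # assumed stand-in for (7)
        pa = pa_min + (pa_max - pa_min) * frac                   # assumed stand-in for (8)
        best = nests[fitness.index(min(fitness))][:]             # step (l)

        # Steps (k) and (m): Levy-flight candidates replace worse nests.
        for i in range(n_nests):
            cand = [clip(x + alpha0 * (x - b) * levy_step(rng))
                    for x, b in zip(nests[i], best)]
            f = objective(cand)
            if f < fitness[i]:
                nests[i], fitness[i] = cand, f

        # Steps (n) and (p): if rand > p_a, try a preferential random walk
        # built from two random solutions of the current generation.
        for i in range(n_nests):
            if rng.random() > pa:
                j, k = rng.sample(range(n_nests), 2)
                r = rng.random()
                cand = [clip(x + r * (xj - xk))
                        for x, xj, xk in zip(nests[i], nests[j], nests[k])]
                f = objective(cand)
                if f < fitness[i]:
                    nests[i], fitness[i] = cand, f

    # Steps (q) and (r): the best nest would be used as the initial model parameters.
    i_best = fitness.index(min(fitness))
    return nests[i_best], fitness[i_best]


def sphere(x):
    """Stand-in objective; the text uses the SAE training error instead."""
    return sum(v * v for v in x)


best_position, best_value = adaptive_cuckoo_search(sphere, dim=3)
```

In the setting described in the text, each nest position would be decoded into the weights and thresholds of the stacked self-coding network and `objective` would return the training error of formula (6) for that initialization.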
4) After the network parameters are initialized by the self-adaptive cuckoo algorithm, they are fine-tuned on this basis by reverse (back-propagation) adjustment, the weights of the neural network are trained, CPS information attacks are identified, and operation and maintenance personnel take corresponding measures according to the identification result.
In order to verify that the electric power CPS information attack identification method based on the stacked self-coding network can effectively identify information attacks, the inventors compared the proposed method with traditional machine learning methods. Fig. 2 shows that the adaptive cuckoo algorithm places the initial parameters in a suitable region, so that after the fine-tuning process the trained model converges to a more ideal state and the training speed is improved to a certain extent. As can be seen from Fig. 3, in the optimal feature selection process about 75% of the 128 features provide a relatively high learning value, and the two features with the highest correlation are the A-phase voltage phase angles of the physical devices R1 and R2. Fig. 4 indicates that the model identification accuracy changes considerably with the initial number of features: model 1 selects relatively few features and therefore lacks part of the valid information, whereas model 3 has a large feature dimension and contains redundant and weakly correlated features, which confuse the model to some extent, increase the complexity of the training process, and slightly reduce the identification accuracy. Fig. 5 compares the proposed identification method with conventional machine learning algorithms and demonstrates the feasibility and accuracy of the proposed method.