CN113904786A

CN113904786A - False data injection attack identification method based on line topology analysis and power flow characteristics

Info

Publication number: CN113904786A
Application number: CN202110729263.6A
Authority: CN
Inventors: 任洲洋; 王文钰
Original assignee: Chongqing University
Current assignee: Chongqing University
Priority date: 2021-06-29
Filing date: 2021-06-29
Publication date: 2022-01-07
Anticipated expiration: 2041-06-29
Also published as: CN113904786B

Abstract

The invention discloses a false data injection attack identification method based on line topology analysis and power flow characteristics, which comprises the following steps: 1) acquiring historical power flow data and the original topological structure of the power system; 2) obtaining a converted topological structure of the power system; 3) injecting a plurality of attack data into the historical power flow data to obtain power flow sample data containing false data; 4) extracting power flow feature values from the power flow sample data and labelling each sample according to whether attack data is present; 5) establishing, based on the power flow feature values of the power flow sample data, a graph convolution network model for judging whether the power system is attacked; 6) acquiring the current power flow data of the power system, extracting its power flow feature values, inputting them into the graph convolution network model, and judging whether a power system line is attacked and the position of the attacked line. The method uses a graph attention mechanism neural network to locate the false data attack.

Description

False data injection attack identification method based on line topology analysis and power flow characteristics

Technical Field

The invention relates to the field of cyber security of power systems, and in particular to a false data injection attack identification method based on line topology analysis and power flow characteristics.

Background

With the development of Industry 4.0, more and more industries are becoming intelligent by integrating information technologies. The power system is the largest man-made system in the world and was among the first industrial systems to integrate information technology with automation technology, forming a cyber-physical power system that combines real-time perception, dynamic control and information-based decision making. However, building this system introduces new vulnerabilities into the grid. During information exchange, an attacker can compromise the cyber security of the power system by means such as replay attacks and False Data Injection Attacks (FDIAs), so that the control center mistakenly believes the system is still operating normally and is misled into making wrong decisions, causing primary equipment to trip or triggering cascading failures through the disconnection of a key line. Detecting and identifying FDIAs is therefore very important for the safe and stable operation of the power system.

Existing research methods generally consider system data at a single time section and detect FDIAs using the spatial features of the data. However, false data constructed by an attacker generally conforms to the operating laws of the power system, so effective detection of FDIAs is difficult to achieve by considering only the spatial characteristics of the data under continuous time sections. In addition, when identifying false data injection attacks, existing methods are usually based on the traditional topological structure, in which primary equipment forms the nodes and transmission lines form the edges.

Disclosure of Invention

The invention aims to provide a false data injection attack identification method based on line topology analysis and power flow characteristics, which comprises the following steps:

1) Acquiring historical power flow data and the original topological structure of the power system.

2) Converting the original topological structure of the power system to obtain the converted topological structure of the power system.

The converted topological structure is established as follows:

2.1) Establishing an adjacency matrix $A_G$ that characterizes the original topological structure of the power system, namely:

According to the concepts of complex network theory and graph theory, a topological structure diagram with N nodes and M edges can be represented by an N×N adjacency matrix $A_G$, whose element $a_{ij}$ is

$$a_{ij} = \begin{cases} 1, & \text{if nodes } v_i \text{ and } v_j \text{ are connected} \\ 0, & \text{otherwise} \end{cases} \quad (1)$$

where $a_{ij}$ is an element of the node adjacency matrix $A_G$, and $v_i$ and $v_j$ are the i-th and j-th nodes of the network.

2.2) Extracting the number of adjacent nodes of each node from the node adjacency matrix $A_G$ and writing it into the cell array $A_{cell}$.

2.3) Determining the adjacent lines of each line in the traditional topological graph of the power system and constructing a cell array $E_{cell}$ based on the line topology.

2.4) Taking the lines as the nodes of the topological structure diagram, searching for the adjacent lines of each line, and establishing the line adjacency matrix $A_G'$.

2.5) Using the line adjacency matrix $A_G'$ to establish the converted topological structure of the power system, in which lines are nodes and the connection relations between lines are edges.

3) Injecting a plurality of attack data into the historical power flow data to obtain power flow sample data with false data.

The attack data a is as follows:

$$a = Hc \quad (2)$$

where $c = [c_1, c_2, \ldots, c_n]^T$ is an arbitrary non-zero vector, $c \in R^{n \times 1}$, n is the number of state variables, and H is the Jacobian matrix representing the topology of the power system.

4) Extracting the power flow feature values of the power flow sample data and labelling each sample according to whether attack data is present.

The power flow feature values comprise an electrical betweenness index and a power flow offset coefficient index.

The electrical betweenness $B_e(m, n)$ is as follows:

where L and G are the sets of load nodes and generation nodes in the power grid, respectively; $W_i$ and $W_j$ are the active power output of the generator and the node load value, respectively; and $I^{ij}(m, n)$ is the change in current on line (m, n) after a unit current source is connected between generation node i and load node j.

The power flow offset coefficient index $M_i$ is as follows:

where $P_{i0}$ and $P_{j0}$ are the initial active power of line i and line j, respectively; L is the set of all transmission lines in the power grid; and $\Delta P_{ji}$ is the change in active power on line j caused by the disconnection of line i.

The power flow feature values are preprocessed by z-score standardization. The standardized data x' is as follows:

$$x' = \frac{x - x_\mu}{x_\sigma}$$

where $x_\mu$ and $x_\sigma$ are the sample mean and standard deviation, respectively, and x is the data before preprocessing.

5) Establishing a graph convolution network model for judging whether the power system is attacked, based on the power flow feature values of the power flow sample data.

The graph convolution network model is established as follows:

5.1) Randomly dividing the power flow feature values of the power flow sample data into a test set and a training set.

5.2) Building a graph convolution network comprising an input layer, a plurality of hidden layers and an output layer.

5.3) Training the graph convolution network with the training set to obtain the trained graph convolution network.

5.4) Testing the trained graph convolution network with the test set. If the accuracy of the network output is greater than a preset threshold, the graph convolution network model is complete; otherwise, power flow sample data and power flow feature values are acquired again and the process returns to step 1).

The feature learning relation between the node feature values of any two layers of the graph convolution network model is as follows:

where l is the layer in which the node is located; s is the number of adjacent nodes of node $v_i$; $x_{ni}^{l}$ is the feature value of the n-th node in layer l; ni is the index of a neighbor node of node i; $g(\cdot)$ is the activation function; and w is the network weight. The relation gives the feature value of node i at layer l+1.

The output feature $h'_i$ of the graph convolution network model is as follows:

$$h'_i = g\Big(\sum_{j \in N_i} \alpha_{ij} W h_j\Big)$$

where $\alpha_{ij}$ is the attention coefficient, W is a shareable network parameter, $N_i$ is the set of line nodes adjacent to node i, and $g(\cdot)$ is the activation function.

The attention coefficient $\alpha_{ij}$ is obtained as follows:

a) Denoting the M-dimensional feature input set of all line nodes in the converted topological structure of the power system as

$$h = \{h_1, h_2, \ldots, h_N\}, \quad h_i \in R^{M}$$

and the output feature set as

$$h' = \{h'_1, h'_2, \ldots, h'_N\}, \quad h'_i \in R^{M'}$$

b) Calculating the degree of mutual influence $e_{ij}$ between line nodes, namely:

$$e_{ij} = a\big([W h_i \,\|\, W h_j]\big)$$

where W is a shareable parameter, $[\cdot \,\|\, \cdot]$ denotes concatenation, and the shared attention mechanism $a(\cdot)$ maps the concatenated features to the correlation coefficient $e_{ij}$, completing the learning of the correlation between node i and node j.

c) Mapping the concatenated features with a single-layer feedforward neural network that uses the nonlinear activation function LeakyReLU and whose parameter is the weight vector $\vec{a} \in R^{2M'}$, so that the degree of mutual influence $e_{ij}$ becomes:

$$e_{ij} = \mathrm{LeakyReLU}\big(\vec{a}^{T} [W h_i \,\|\, W h_j]\big)$$

where $\vec{a}$ is the weight vector and $h_i$, $h_j$ are the inputs.

d) Normalizing the degree of mutual influence $e_{ij}$ with the softmax function to obtain the attention coefficient $\alpha_{ij}$, namely:

$$\alpha_{ij} = \mathrm{softmax}_j(e_{ij}) = \frac{\exp(e_{ij})}{\sum_{k \in N_i} \exp(e_{ik})}$$

The activation function of the graph convolution network model is the ELU function:

$$g(x) = \begin{cases} x, & x > 0 \\ \alpha\,(e^{x} - 1), & x \le 0 \end{cases}$$

where $\alpha$ is an adjustment parameter, x is the input data, and g(x) is the output data.

6) Acquiring the current power flow data of the power system and extracting its power flow feature values. The power flow feature values of the current power flow data are input into the graph convolution network model to judge whether a power system line is attacked.

The advantages of the method are as follows. The traditional topological structure is converted and a graph attention mechanism neural network is used, yielding a graph neural network FDIA identification method that takes branch power flow characteristics into account. A feature vector representing the power flow characteristics of each branch is calculated from the line power flow values, and the connection relations between lines together with the neural network on the topological structure are used to mine deep data features and locate the false data attack. The identification precision and accuracy of the method are established by comparison with other identification methods under different attack levels.

Drawings

FIG. 1 is a flow chart of a graph neural network FDIAs identification method considering branch flow characteristics;

FIG. 2 shows the connection relations between the lines of the IEEE 39-bus standard system after topology conversion;

FIG. 3 shows the loss function descent processes of the graph convolution network and the graph attention mechanism.

Detailed Description

The present invention is further illustrated by the following examples, but it should not be construed that the scope of the above-described subject matter is limited to the following examples. Various alterations and modifications can be made without departing from the technical idea of the invention, and all changes and modifications made by the ordinary technical knowledge and the conventional means in the field are intended to be included in the scope of the invention.

Example 1:

Referring to FIG. 1 to FIG. 3, a false data injection attack identification method based on line topology analysis and power flow characteristics comprises steps 1) to 6) as set out in the Disclosure of Invention above.

Example 2:

A false data injection attack identification method based on line topology analysis and power flow characteristics comprises the following steps:

1) Converting the traditional topological structure of the power system to obtain the topological structure among the lines of the power system.

2) Using the branch power flow data of the power system, injecting attack data into the power flow data according to the false data construction method, calculating the electrical betweenness and power flow offset coefficient indexes as the power flow features of each line, and labelling whether an attack is present in the data.

3) Selecting a suitable activation function and loss function for the graph attention mechanism, and constructing a graph neural network model based on the attention mechanism.

4) Normalizing the line power flow features with the z-score method to obtain a node feature data set with consistent dimensions. The normalized data set is divided into a training sample set and a test sample set.

5) Using the training sample set to update and optimize the parameters of the graph neural network, and using the test sample set to verify that the method accurately identifies the attacked line, thereby identifying the false data injection attack.

Example 3:

A false data injection attack identification method based on line topology analysis and power flow characteristics, whose main steps are as in Example 2, wherein the converted topological structure diagram is established as follows:

1) According to the concepts of complex network theory and graph theory, a topological structure diagram with N nodes and M edges can be represented by an N×N adjacency matrix $A_G$, whose element $a_{ij}$ is

$$a_{ij} = \begin{cases} 1, & \text{if nodes } v_i \text{ and } v_j \text{ are connected} \\ 0, & \text{otherwise} \end{cases} \quad (1)$$

2) The number of adjacent nodes of each node is obtained from the adjacency matrix $A_G$: the number of entries equal to 1 in a row of the adjacency matrix is the number of nodes adjacent to that node, which determines the elements of the cell array. In other words, $a_{ij} = 1$ means that the two nodes are connected. The result is stored in the cell array $A_{cell}$.

3) Because the converted topological graph takes lines as nodes and the connection relations between lines as edges, the lines in the traditional topological graph of the power system are numbered so that the adjacent lines of each line can be found and the corresponding cell array formed.

4) With the lines as nodes of the topological structure diagram, the adjacent lines of each line are searched for. According to the array of line numbers, the lines connected to the end nodes of a given line are the adjacent lines of that branch.

5) Based on the line-topology cell array $E_{cell}$, the corresponding line adjacency matrix $A_G'$ is established.

6) Based on the line adjacency matrix $A_G'$, the topological graph of the power system is drawn with lines as nodes and the connection relations between lines as edges.

7) The construction of the power system topology conversion model graph is complete.
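The conversion steps of this example can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `line_adjacency`, the 0-based line numbering taken from the upper triangle of the node adjacency matrix, and the 4-bus test network are all hypothetical, and parallel lines between the same pair of buses are not represented because a 0/1 adjacency matrix cannot encode them.

```python
import numpy as np

def line_adjacency(node_adj):
    """Convert a bus-level adjacency matrix into a line-level adjacency matrix.

    Two lines are treated as adjacent when they share an end bus.
    """
    node_adj = np.asarray(node_adj)
    n_bus = node_adj.shape[0]
    # Number every line once by scanning the upper triangle (0-based numbers).
    lines = [(i, j) for i in range(n_bus) for j in range(i + 1, n_bus)
             if node_adj[i, j]]
    # Cell-array analogue: for each bus, the numbers of the lines incident to it.
    incident = {bus: [] for bus in range(n_bus)}
    for k, (i, j) in enumerate(lines):
        incident[i].append(k)
        incident[j].append(k)
    # Lines that meet at the same bus become neighbours in the converted graph.
    line_adj = np.zeros((len(lines), len(lines)), dtype=int)
    for line_numbers in incident.values():
        for p in line_numbers:
            for q in line_numbers:
                if p != q:
                    line_adj[p, q] = 1
    return lines, line_adj

# Hypothetical 4-bus star network: lines 0-1, 1-2 and 1-3 all meet at bus 1.
A_G = np.array([[0, 1, 0, 0],
                [1, 0, 1, 1],
                [0, 1, 0, 0],
                [0, 1, 0, 0]])
lines, A_G_line = line_adjacency(A_G)
print(lines)
print(A_G_line)
```

Because all three lines of the toy network share bus 1, every pair of lines is adjacent in the converted graph.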

Example 4:

A false data injection attack identification method based on line topology analysis and power flow characteristics, whose main steps are as in Example 2, wherein an attack quantity is injected into the original power flow data according to the power flow data of each branch in the power system and the false data construction method, and a feature matrix of the branch nodes is thereby established. Establishing the feature matrix that takes branch power flow into account involves the following two aspects:

1) Method for constructing false data

Given the topology of the power system, an attacker injects into the measurement system an attack vector a that satisfies

$$a = Hc \quad (2)$$

where $c = [c_1, c_2, \ldots, c_n]^T$ is an arbitrary non-zero vector, $c \in R^{n \times 1}$, and n is the number of state variables. The measurement then becomes $z_a = z + a$, and the state estimate is

$$\hat{x}_a = \hat{x} + c$$

Accordingly, the residual equation becomes

$$\|z_a - H\hat{x}_a\| = \|z + a - H(\hat{x} + c)\| = \|z - H\hat{x}\|$$

where $\hat{x}$ is the state estimate obtained from the unattacked measurement z. As the residual equation (2.25) shows, when FDIAs are present in the measured data the residual remains within the range allowed by the threshold, so the attack bypasses the bad data detection and identification module and successfully attacks the power system.
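The stealth property described above can be checked numerically. The sketch below is a hedged illustration, not the procedure of the invention: it assumes a linear measurement model with identity weights, uses a random matrix as a stand-in for the Jacobian H, and the helper names `estimate_state` and `residual_norm` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_meas, n_state = 8, 3                       # illustrative sizes only
H = rng.normal(size=(n_meas, n_state))       # stand-in for the measurement Jacobian
x_true = rng.normal(size=n_state)
z = H @ x_true + 0.01 * rng.normal(size=n_meas)   # noisy, unattacked measurements

def estimate_state(H, z):
    """Least-squares state estimate (identity weights assumed)."""
    x_hat, *_ = np.linalg.lstsq(H, z, rcond=None)
    return x_hat

def residual_norm(H, z):
    """Norm of the measurement residual used by bad-data detection."""
    return np.linalg.norm(z - H @ estimate_state(H, z))

c = np.array([0.5, -0.2, 0.1])               # arbitrary non-zero vector
a = H @ c                                    # attack vector a = Hc
z_a = z + a                                  # attacked measurements

shift = estimate_state(H, z_a) - estimate_state(H, z)
print("estimate shift:", shift)
print("residual before:", residual_norm(H, z))
print("residual after: ", residual_norm(H, z_a))
```

The estimate moves by c while the residual norm is unchanged up to rounding, which is why a residual test alone does not flag this kind of attack.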

2) Node feature values considering branch power flow data

2.1) Electrical betweenness

Electric energy in the power system is output by generators and transmitted through lines to load nodes for consumption, and the propagation of power flow obeys Kirchhoff's laws. In other words, the magnitude of the power flow is affected by the branch impedance, and power tends to flow through the line with the smallest impedance. Therefore, when the power topology and the power flow are considered together, electrical betweenness is taken as one of the line node features: it reflects the coupling between load distribution and consumption, generation output and the topological structure of the power system, and it reflects how generators and loads use each transmission line. The calculation formula is as follows:

where L and G are the sets of load nodes and generation nodes in the power grid, respectively; $W_i$ and $W_j$ are the active power output of the generator and the node load value, respectively; and $I^{ij}(m, n)$ is the change in current on line (m, n) after a unit current source is connected between generation node i and load node j, that is, the unit power change value.
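As a rough illustration of how such an index could be evaluated, the sketch below sums the line current caused by a unit injection over every generator-load pair. It rests on several assumptions that are not taken from the text: each pair is weighted by $\sqrt{W_i W_j}$, a DC (susceptance-only) network model is used to obtain the unit-injection currents, and the function name `electrical_betweenness` and the 3-bus network are hypothetical.

```python
import numpy as np

def electrical_betweenness(n_bus, lines, susceptance, gen_power, load_power):
    """Electrical betweenness of each line under a DC network model.

    lines       : list of (u, v) bus pairs
    susceptance : per-line susceptance, same order as `lines`
    gen_power   : dict {generator bus: active power output W_i}
    load_power  : dict {load bus: load value W_j}
    ASSUMPTION: each generator-load pair is weighted by sqrt(W_i * W_j).
    """
    lap = np.zeros((n_bus, n_bus))            # weighted graph Laplacian
    for (u, v), b in zip(lines, susceptance):
        lap[u, u] += b
        lap[v, v] += b
        lap[u, v] -= b
        lap[v, u] -= b
    lap_pinv = np.linalg.pinv(lap)            # pseudo-inverse handles the singular Laplacian
    result = np.zeros(len(lines))
    for i, w_i in gen_power.items():
        for j, w_j in load_power.items():
            if i == j:
                continue
            inj = np.zeros(n_bus)
            inj[i], inj[j] = 1.0, -1.0        # unit source between bus i and bus j
            theta = lap_pinv @ inj            # bus potentials for this injection
            for k, ((u, v), b) in enumerate(zip(lines, susceptance)):
                result[k] += np.sqrt(w_i * w_j) * abs(b * (theta[u] - theta[v]))
    return result

# Hypothetical 3-bus triangle: generator at bus 0, load at bus 2, equal susceptances.
tri_lines = [(0, 1), (1, 2), (0, 2)]
B_e = electrical_betweenness(3, tri_lines, [1.0, 1.0, 1.0], {0: 4.0}, {2: 9.0})
print(B_e)
```

In the triangle, the direct line from bus 0 to bus 2 carries twice the current of the two-line detour, so its index is twice as large.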

2.2) Power flow offset coefficient index

When the operating state of the power system is considered, the transmission and distribution of power flow among the lines must be taken into account. According to the N-1 security criterion of the power system, when a line in the system goes out of service, the system regulates itself by redistributing the power flow. If the power flow distribution becomes unbalanced during this self-regulation, for example some lines carry a relatively small power flow while others carry an excessive one, lines are taken out of service by overload and a cascading failure of the power system follows. The power flow offset coefficient index characterizes the influence of a power flow change in one line on the power flow of the whole system and quantifies the mutual influence among the lines in the system. It is therefore used as one of the features reflecting the power flow in the power system. The index is calculated as follows:

where $P_{i0}$ and $P_{j0}$ are the initial active power of line i and line j, respectively; L is the set of all transmission lines in the power grid; and $\Delta P_{ji}$ is the change in active power on line j caused by the disconnection of line i.

Example 5:

A false data injection attack identification method based on line topology analysis and power flow characteristics, whose main steps are as in Example 2, wherein the functions of the attention-based graph convolutional neural network are selected as follows:

1) Solving method of the graph convolution network

When a convolution operation is performed on the nodes of the graph, the feature learning relation between the node feature values of two neural unit layers is

where l is the layer in which the node is located; s is the number of adjacent nodes of node $v_i$; $x_{ni}^{l}$ is the feature value of the n-th node in layer l; and ni is the index of a neighbor node of node i. In the graph convolution network, the feature value of node i at layer l+1 depends only on the nodes adjacent to node i at layer l. $g(\cdot)$ is the activation function used for nonlinear learning of the data features, and w is the network weight, which is continuously updated during training.

For an input signal x, $x \in R^{n}$, composed of the node features in the graph, the Fourier transform is defined as $F(x) = U^{T}x$, and the inverse Fourier transform is accordingly defined as

$$F^{-1}(\hat{x}) = U\hat{x}$$

where $\hat{x}$ is the frequency-domain result of the Fourier transform of the input signal. The elements of the transformed signal $\hat{x}$ are exactly the coordinates of the graph signal in the orthogonal space spanned by the eigenvectors of the Laplacian matrix, which completes the projection of the original input onto the corresponding graph. The input signal can therefore be written as

$$x = \sum_{i} \hat{x}_i u_i$$

which is exactly the inverse Fourier transform, where $u_i$ is the i-th eigenvector of the Laplacian matrix, that is, the i-th column of U. Combining the convolutional neural network formula with spectral graph theory, the graph convolution operation is calculated as follows

where the formula involves the input signal, $f_l$, the number of layers of the input signal, and the parameters updated by the convolutional network.

In the cyber-physical power system, the importance of each branch to the system differs according to the admittance matrix and the line connections, that is, the edges of the converted topological graph carry different weights. In addition, each branch may be disconnected according to operating requirements, whereas a traditional graph convolution network requires topological graphs of identical structure in the training and testing stages; its computational cost grows with the scale of the topology, and it cannot handle dynamic graphs. To solve these problems, a graph attention mechanism is introduced: the network learns the feature weights of adjacent nodes and completes the convolution operation of the graph neural network by weighting and summing the features of the adjacent nodes.

2) Function selection for the graph attention mechanism

The graph attention mechanism builds on the graph convolutional neural network. According to the connection relations and the node features of adjacent nodes, it calculates the correlation coefficients between nodes and assigns them different weights to account for the differences between nodes. For a power system with N lines, the M-dimensional feature input set of all line nodes in the converted topology is expressed as

$$h = \{h_1, h_2, \ldots, h_N\}, \quad h_i \in R^{M}$$

and the output feature set as

$$h' = \{h'_1, h'_2, \ldots, h'_N\}, \quad h'_i \in R^{M'}$$

For a given node i, the correlation coefficients between it and each of its adjacent nodes, that is, the degree of mutual influence between line nodes, are calculated one by one according to

$$e_{ij} = a\big([W h_i \,\|\, W h_j]\big)$$

The original input features are enhanced by a linear transformation with the shareable parameter W and concatenated, denoted $[\cdot \,\|\, \cdot]$; the shared attention mechanism $a(\cdot)$ maps the concatenated features to the correlation coefficient $e_{ij}$, completing the learning of the correlation between node i and node j. In the graph attention mechanism, a single-layer feedforward neural network with the nonlinear activation function LeakyReLU (negative half-axis slope 0.2) maps the concatenated features; its parameter is the weight vector $\vec{a} \in R^{2M'}$. The correlation coefficient can then be expressed as

$$e_{ij} = \mathrm{LeakyReLU}\big(\vec{a}^{T} [W h_i \,\|\, W h_j]\big)$$

Because the correlation coefficients have different scales, the correlation coefficients between nodes calculated by formula (4.5) are normalized with the softmax function to better distribute the weights between nodes, giving the attention coefficient

$$\alpha_{ij} = \mathrm{softmax}_j(e_{ij}) = \frac{\exp(e_{ij})}{\sum_{k \in N_i} \exp(e_{ik})}$$

The operation of the graph attention mechanism is shown in FIG. 1. Using the normalized attention coefficients, and similarly to the graph convolution network, the features of the adjacent nodes are weighted and summed to obtain the output feature $h'_i$ of node i:

$$h'_i = g\Big(\sum_{j \in N_i} \alpha_{ij} W h_j\Big)$$

where $\alpha_{ij}$ is the normalized attention coefficient, W is a shareable network parameter, $N_i$ is the set of nodes adjacent to node i, and $g(\cdot)$ is the activation function.
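The attention computation described in this part can be sketched as a single-head layer in NumPy. This is an illustrative forward pass only, not the trained model of the invention: the function name `gat_layer`, the random parameters, the 3-line toy graph, and the choice to let each node attend to itself as well as to its neighbours are assumptions.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    """LeakyReLU with negative half-axis slope 0.2."""
    return np.where(x > 0, x, slope * x)

def elu(x, alpha=1.0):
    """ELU activation; the exponent is clipped at 0 to avoid overflow."""
    return np.where(x > 0, x, alpha * (np.exp(np.minimum(x, 0.0)) - 1.0))

def gat_layer(h, adj, W, a):
    """Single-head graph attention forward pass.

    h   : (N, M) input features of the N line nodes
    adj : (N, N) 0/1 line adjacency matrix
    W   : (M, M_out) shared linear transformation
    a   : (2 * M_out,) attention weight vector
    """
    n_nodes = h.shape[0]
    wh = h @ W                                   # W h_i for every node
    m_out = wh.shape[1]
    # a^T [Wh_i || Wh_j] splits into a_left . Wh_i + a_right . Wh_j
    left = wh @ a[:m_out]
    right = wh @ a[m_out:]
    e = leaky_relu(left[:, None] + right[None, :])   # e[i, j]
    # ASSUMPTION: self-loops are added so every node has at least one neighbour.
    mask = (adj + np.eye(n_nodes)) > 0
    e = np.where(mask, e, -np.inf)               # non-neighbours get zero weight
    e = e - e.max(axis=1, keepdims=True)         # numerically stable softmax
    weights = np.exp(e)
    alpha = weights / weights.sum(axis=1, keepdims=True)
    return elu(alpha @ wh), alpha                # weighted sum, then activation

rng = np.random.default_rng(1)
adj = np.array([[0, 1, 1],
                [1, 0, 0],
                [1, 0, 0]])                      # line 0 touches lines 1 and 2
h = rng.normal(size=(3, 2))                      # two features per line node
W = rng.normal(size=(2, 4))
a = rng.normal(size=8)
h_out, alpha = gat_layer(h, adj, W, a)
print(alpha.round(3))
```

Each row of `alpha` sums to 1, and entries for non-adjacent line pairs are exactly zero, so a change in topology only changes the mask rather than the learned parameters.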

The graph attention mechanism uses attention coefficients to reflect the degree of correlation between line nodes and to aggregate the features of adjacent nodes. It removes the traditional graph convolution network's dependence on the topological structure and suits systems whose topology changes continuously. In a power system, the topology is dynamic because of circuit breaker operations and differences in line capacity. When identifying FDIAs, it is therefore more appropriate to use a graph attention mechanism network for feature learning and for judging whether a branch is attacked.

3) Selection of the activation function

The invention introduces a different activation function, the ELU function, for mapping features between two neuron layers in the graph attention mechanism. ELU improves on the ReLU activation function. ReLU sets all negative values to 0 and passes positive values linearly, so its output contains no negative values and the mean of the output is greater than 0. The next layer is then prone to bias and mean shift, and non-convergence can occur when a deeper network is trained. The ELU function addresses this by giving negative inputs a non-zero activation, and is calculated as follows

$$g(x) = \begin{cases} x, & x > 0 \\ \alpha\,(e^{x} - 1), & x \le 0 \end{cases}$$

where $\alpha$ is the tuning parameter that controls the value at which the negative part of the ELU activation function saturates.

When the graph attention mechanism is trained on the topological graph of the power system, the number of nodes is large and the topology can change when a branch is disconnected. ELU, which is robust to noise, is therefore selected as the activation function to avoid non-convergence during training.
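The mean-shift argument above can be illustrated with a small numerical comparison. The input grid and the value `alpha=1.0` are arbitrary choices for illustration, and the helper names are hypothetical.

```python
import numpy as np

def relu(x):
    """ReLU: negative inputs become 0, positive inputs pass through."""
    return np.maximum(x, 0.0)

def elu(x, alpha=1.0):
    """ELU: negative inputs map smoothly towards -alpha."""
    return np.where(x > 0, x, alpha * (np.exp(np.minimum(x, 0.0)) - 1.0))

x = np.linspace(-3.0, 3.0, 601)        # zero-mean, symmetric inputs
relu_mean = relu(x).mean()
elu_mean = elu(x).mean()
saturation = elu(np.array([-50.0]))[0]  # far negative input

print("mean of ReLU outputs:", relu_mean)
print("mean of ELU outputs: ", elu_mean)
print("ELU at -50:          ", saturation)
```

For these zero-mean inputs the ELU outputs have a mean closer to zero than the ReLU outputs, and very negative inputs saturate at about -alpha.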

Example 6:

A false data injection attack identification method based on line topology analysis and power flow characteristics, whose main steps are as in Example 2, wherein the data is preprocessed to keep the dimensions of the node features consistent, as follows:

The common min-max normalization method processes data based on the difference between the maximum and minimum values in the sample. In the power system scenario to which the invention is applied, the disconnection of a transmission line affects the topology (for example, the power is 0 when a line is out of service), so a small number of data points that deviate from the sample mean may appear in the sample. Preprocessing based on the maximum or minimum value, as in conventional methods, can be distorted by such outlying data. The invention therefore preprocesses the data with the z-score standardization method, which uses the information of the whole sample: the raw data is processed with the sample standard deviation and mean, which limits the influence of a small amount of deviating data on normalization, preserves the distribution characteristics of the raw data, and benefits feature extraction. The calculation formula is

$$x' = \frac{x - x_\mu}{x_\sigma}$$

where $x_\mu$ and $x_\sigma$ are the sample mean and standard deviation, respectively.
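A minimal sketch of the two preprocessing options follows, applied to a made-up sample in which one line is out of service. The flow values are invented for illustration, and the population standard deviation (NumPy's default) is assumed because the text does not say which estimator is used.

```python
import numpy as np

def z_score(x):
    """z-score standardization: subtract the mean, divide by the standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()      # ASSUMPTION: population std (ddof=0)

def min_max(x):
    """min-max normalization onto [0, 1]."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

# Invented line flows; the last entry is a line that is out of service.
flows = np.array([98.0, 101.0, 100.0, 102.0, 99.0, 0.0])
z = z_score(flows)
mm = min_max(flows)
print("z-score:", z.round(3))
print("min-max:", mm.round(3))
```

After z-score standardization the feature has zero mean and unit standard deviation regardless of its original unit, which is what makes features of different dimensions comparable as network inputs.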

Example 7:

A false data injection attack identification method based on line topology analysis and power flow characteristics, whose main steps are as in Example 2, wherein the training data samples and test data samples are used to tune and optimize the parameters of the method as follows:

The line admittance matrix and historical operating load data of the power system are read, the node adjacency matrix is obtained, and the converted topological structure diagram, with lines as nodes and the connection relations between lines as edges, is constructed.

Based on the historical load data, the power flow in each line is calculated with the Newton-Raphson method, false data is injected into the historical power flow using the false data construction method, and power flow samples containing false data are constructed. Based on the resulting power flow data, a node feature matrix reflecting the power flow operating characteristics is calculated and used as the input of the graph attention mechanism neural network to judge which line is attacked.

The calculated node feature data is preprocessed according to the normalization formula and divided into test samples and training samples. The training samples are used to adjust the parameters of the graph-attention-based convolution network and to build a graph convolution network model whose final purpose is to determine whether a node is attacked. The test samples are used to verify the validity of the FDIA identification method and to determine the specific attacked line; if no FDIAs are present, the identification process ends.

Example 8:

an experiment of a false data injection attack identification method based on line topology analysis and power flow characteristics mainly comprises the following steps:

1) Converting the traditional topology of the power system to obtain the topology among the lines of the power system. The PyTorch deep learning framework is used to draw the correlations among the lines of the IEEE 39-node standard test case, which has 46 lines in total. Note that numbering in Python starts at 0, so the numbers in the figure run from 0 to 45. The line numbers in the topology diagram follow the result of the PowerWorld power flow simulation, and the final topology conversion result is shown in FIG. 2.

2) Using the branch power flow data of the power system, attack data are injected into the power flow data according to the false data construction method, the electrical betweenness and power flow deviation coefficient indices are calculated as the power flow characteristics of each line, and the data are labeled according to whether an attack is present. Continuous-time load data from a substation in a certain region of the country are applied to the IEEE 39-node system: they are processed according to the load data of the IEEE 39-node system to obtain per-unit load values, which are then substituted into the IEEE 39-node system as its load data. The final node feature calculation results are shown in Table 1.

TABLE 1 Characteristic values of nodes

Line number	Electrical betweenness	Power flow deviation coefficient	Line number	Electrical betweenness	Power flow deviation coefficient
1	1.075	22.43	24	0.986	26.41
2	0.320	-20.62	25	2.080	-4.92
3	1.780	-9.55	26	2.930	5.92
4	1.930	3.54	27	1.938	-0.02
5	0.984	0	28	1.330	-0.02
6	0.998	-69.67	29	1.256	-0.01
7	0.874	11.07	30	0.864	2.27
8	0.968	11.89	31	0.798	35.28
9	0.784	0.94	32	0.782	0.01
10	0.886	5.78	33	0.537	0
11	0.936	2.21	34	0.254	0
12	0.952	2.68	35	1.135	-0.01
13	1.273	-5.10	36	1.325	0.10
14	0.682	102.97	37	0.584	0
15	0.920	5.50	38	1.324	0
16	0.889	55.95	39	0.654	0
17	0.546	55.95	40	1.456	-13.40
18	1.180	-4.45	41	0.576	0
19	1.220	4.45	42	1.800	-3.39
20	0.687	0	43	0.236	0.05
21	1.260	47.71	44	1.246	-0.03
22	1.147	-30.03	45	0.998	0.01
23	1.330	5.01	46	0.847	0

As is apparent from Table 1, the electrical betweenness of a line is related to the degree of the corresponding node in the topological graph and, with the topology fully taken into account, reflects the power flow distribution of each node (i.e., each line). For different line nodes, some indices deviate widely in value and the indices have different dimensions. To ensure that the detection model works normally and identifies the attacked line, all indices in the node feature matrix are normalized with the z-score method, and the normalized indices serve as the node matrix of the neural network based on the graph attention mechanism for subsequent calculation.

3) Selecting a suitable activation function and loss function for the graph attention mechanism, and constructing a graph neural network model based on the attention mechanism. In the invention, ELUs is selected as the activation function and cross entropy as the loss function; when the attention mechanism is updated, the weight coefficient of each node in the conversion topology graph is determined using the mappings of the LeakyReLU activation function and the softmax function. The convergence process of the neural network is compared in FIG. 3, and the detection performance is shown in Table 2.

TABLE 2 example Performance indicator results based on IEEE39 Standard System

4) The line power flow characteristics are normalized with the z-score method to obtain a node feature data set with consistent dimensions. The normalized data set is divided into a training sample set and a test sample set. The normalized value of each node feature in Table 1 is calculated with the z-score normalization equation and substituted into the graph attention convolution network determined in step 3) for computation.

5) The training sample set is used to update and optimize the parameters of the graph neural network, and the test set is used to verify that the method can accurately identify the attacked line and thus identify the false data injection attack. The amplitude disturbance σ is set to 0.005, 0.05 and 0.1, respectively, and the final results are compared with other false data identification methods in Table 3.

TABLE 3 comparative analysis of identification results by different methods

From the data in Table 3, the graph convolution network based on the traditional topology, the algorithm fusing a convolutional neural network with bad data identification, and the graph convolution network based on the line topology proposed by the invention do not differ greatly in precision and recall. However, as the amplitude disturbance of the state values increases, the accuracy and precision of the former two methods decline continuously, while the detection precision and recall of the proposed method increase slightly and remain stable. The former two methods identify and analyze the state estimates of nodes on the basis of the original topology; when amplitude disturbances occur in the system, the larger the disturbance amplitude, the more likely normal data are misjudged, and detection performance falls. The proposed method considers both the connection relations between lines and the characteristics of the power flow data and, being line-based, better reflects the characteristics of the system when disturbances occur.
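For reference, the accuracy, precision and recall discussed above can be computed from per-line attack labels as in the following sketch. The label vectors are hypothetical and are not results from the patent; label 1 is assumed to mean "attacked".

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels, 1 = attacked."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacked, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal, not flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal, misjudged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # attacked, missed
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Hypothetical ground truth and model output for six line nodes.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
accuracy, precision, recall = classification_metrics(y_true, y_pred)
```

Misjudging normal data, as described for large disturbance amplitudes, raises the false positive count and therefore lowers precision and accuracy while leaving recall unchanged.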

Example 9:

referring to fig. 1, a false data injection attack identification method based on line topology analysis and power flow characteristics mainly includes the following steps:

The main steps for converting the traditional topological structure of the power system are as follows:

1.1) according to the concepts of complex network theory and graph theory, for a topology graph with N nodes and M edges, the graph network structure can be represented by an N × N adjacency matrix A_G, where the elements a_ij of the matrix are

a_ij = 1 if nodes v_i and v_j are connected, and a_ij = 0 otherwise

1.2) the nodes adjacent to each node can be read from the adjacency matrix A_G: in each row of the adjacency matrix, the entries equal to 1 mark the nodes adjacent to that node (a_ij = 1 means that the two nodes are connected), which determines the elements of the cell array. The result is stored in the cell array A_cell.

1.3) because the converted topological graph takes lines as nodes and the connection relations between lines as edges, the lines in the traditional topological graph of the power system must be numbered so that the adjacent lines of each line can be searched and the corresponding cell array formed.

1.4) when lines are used as the nodes of the topological graph, the adjacent lines of each line are found. According to the array formed by the line numbers, the lines connected to the head and tail nodes of a given line are the adjacent lines of that branch.

1.5) from the line-topology-based cell array E_cell, the corresponding line adjacency matrix A_G' can be established.

1.6) drawing the topological graph of the power system with lines as nodes and the connection relations between lines as edges according to the line adjacency matrix A_G'.

1.7) completing the construction of the conversion model graph of the power system topological structure.
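Steps 1.1) to 1.7) can be sketched as follows. This is an illustrative Python version under stated assumptions: lines are given as (from_node, to_node) pairs, the four-node network is hypothetical, and the nested list `e_cell` plays the role of the cell array E_cell.

```python
import numpy as np

def line_adjacency(lines):
    """Build E_cell and the line adjacency matrix A_G' from a list of lines.

    Two lines are adjacent when they share a head or tail node.
    """
    m = len(lines)
    # For each line, collect the numbers of the lines sharing an end node.
    e_cell = [
        [k for k in range(m) if k != i and set(lines[i]) & set(lines[k])]
        for i in range(m)
    ]
    a_line = np.zeros((m, m), dtype=int)
    for i, neighbours in enumerate(e_cell):
        for k in neighbours:
            a_line[i, k] = 1
    return e_cell, a_line

# Hypothetical network with lines 1-2, 2-3, 3-4 and 2-4, numbered from 0.
lines = [(1, 2), (2, 3), (3, 4), (2, 4)]
e_cell, a_line = line_adjacency(lines)
```

In the converted graph each row of `a_line` describes one line node, so line 0 (between nodes 1 and 2) is adjacent to lines 1 and 3, which both touch node 2.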

According to the power flow data of each branch in the power system, injecting attack quantity into the original power flow data according to the construction mode of the false data, and therefore establishing a characteristic matrix of each branch node. Establishing a characteristic matrix considering branch flow mainly comprises the following two aspects:

2.1) construction mode of false data

a＝Hc (2)

Wherein c = [c₁, c₂, ..., c_n]^T is an arbitrary non-zero vector, c ∈ R^(n×1), and n is the number of states. After the measured value becomes z_a = z + a, the state estimate is

x̂_a = x̂ + c

where x̂ is the state estimate obtained from the original measurements z. Accordingly, the residual equation becomes

‖z_a - H x̂_a‖ = ‖z + a - H(x̂ + c)‖ = ‖z - H x̂‖

As the residual equation shows, when FDIAs exist in the measured data, the residual still remains within the range allowed by the threshold, so the attack bypasses the detection and identification of the bad data module and successfully attacks the power system.
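The effect of an attack vector of the form a = Hc on the residual can be checked numerically. The sketch below assumes a linear measurement model with unit weights and ordinary least-squares estimation; the matrix H, the states and the vector c are random or hypothetical stand-ins rather than data from a real power system.

```python
import numpy as np

rng = np.random.default_rng(0)

m, n = 8, 3                      # hypothetical: 8 measurements, 3 states
H = rng.normal(size=(m, n))      # stand-in for the Jacobian matrix H
x_true = rng.normal(size=n)
z = H @ x_true + 0.01 * rng.normal(size=m)   # measurements with small noise

def estimate(H, z):
    """Least-squares state estimate (unit measurement weights assumed)."""
    return np.linalg.lstsq(H, z, rcond=None)[0]

def residual_norm(H, z):
    """Norm of the measurement residual z - H x_hat."""
    return np.linalg.norm(z - H @ estimate(H, z))

c = np.array([0.5, -0.2, 0.1])   # arbitrary non-zero vector c
a = H @ c                        # attack vector a = Hc
z_a = z + a                      # compromised measurements

r_clean = residual_norm(H, z)
r_attacked = residual_norm(H, z_a)
shift = estimate(H, z_a) - estimate(H, z)    # equals c up to rounding
```

The residual norm is the same before and after the injection while the state estimate moves by c, which is why a residual-based bad data detector does not flag the attack.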

2.2) consideration of node characteristic values of branch flow data

2.2.1) electrical betweenness

Electric energy in the power system is output by the generators and transmitted through the lines to the load nodes, where it is consumed, and the propagation of the power flow obeys Kirchhoff's laws. In other words, the magnitude of the power flow is influenced by the branch impedance, and the power flow tends to pass through the lines with the smallest impedance. Therefore, when the power system topology and the power flow are considered together, the electrical betweenness is taken as one of the characteristics of the line nodes: it reflects the coupling between load distribution and consumption, generation output and the topology of the power system, as well as the degree to which the power sources and loads use each transmission line. The calculation formula is as follows:
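As a hedged sketch, one form of electrical betweenness commonly used in the literature can be written with the symbols defined in claim 4 (G and L the sets of generation and load nodes, W_i and W_j the generator output and node load, I_ij(m, n) the current on line (m, n) caused by a unit current source between i and j). The square-root weighting and the absolute value are assumptions taken from that common form and are not confirmed by this text.

```latex
B_e(m,n) = \sum_{i \in G,\; j \in L} \sqrt{W_i W_j}\,\bigl| I_{ij}(m,n) \bigr|
```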

2.2.2) power flow deviation coefficient index

When the operating state of the power system is considered, the transmission and distribution of power flow among the lines must be taken into account. According to the N-1 security principle of the power system, when a line in the system goes out of operation, the system regulates itself by redistributing the power flow. If the power flow distribution becomes unbalanced during this self-regulation, for example some lines carry relatively little power flow while others carry too much, the overloaded lines are taken out of operation and a cascading failure of the power system results. The power flow deviation coefficient index characterizes the influence of a change of the power flow in one line on the power flow in the whole system and quantifies the mutual influence among the lines in the system. It is therefore used as one of the characteristics reflecting the power flow in the power system. The calculation formula of the index is as follows:

The selection method of the activation function and the loss function of the attention mechanism is as follows:

3.1) solving method of graph convolution network

When the nodes in the graph are subjected to convolution operation, the characteristic learning relation between the node characteristic values of the two neural unit layers is

Wherein l denotes the layer in which the node is located, s denotes the number of nodes adjacent to node v_i, x_ni^l denotes the feature value of the n-th node in layer l, and n_i is the number of a neighbor node of node i; in the graph convolution network, the feature value of node i in layer l+1 depends only on the nodes adjacent to node i in layer l. g(·) denotes the activation function, used for nonlinear learning of the data features, and w is the network weight, which is continuously updated during training.

Wherein

Which is exactly the inverse Fourier transform. Therefore, combining the convolutional neural network calculation formula with the graph convolution operation derived from spectral graph theory gives the calculation formula

Wherein

is the input signal, f_l is the number of layers of the input signal, and

are the parameters updated by the convolutional network.

In the cyber-physical power system, the importance of each branch to the system differs depending on the admittance matrix and the line connections of each branch; that is, the edges in the converted topological graph carry different weights. In addition, each branch may be disconnected according to operating requirements, whereas a traditional graph convolution network requires topological graphs with the same structure in the training and testing stages, its computational cost grows with the scale of the topology, and it cannot handle dynamic graphs. To solve these problems, a graph attention mechanism is introduced: the network learns the feature weights of adjacent nodes and performs a weighted summation of their features, which completes the convolution operation of the graph neural network.

3.2) function selection method of the graph attention mechanism

The graph attention mechanism builds on the graph convolutional neural network: it calculates the connection coefficients between nodes from the connection relations and the node features of adjacent nodes, that is, it assigns different weights to the neighbors of each node so that they are treated differently. For a power system with N lines, the M-dimensional feature input set of all line nodes in the conversion topology is represented as

h = {h_1, h_2, ..., h_N}, h_i ∈ R^M

and the output feature set is

h' = {h'_1, h'_2, ..., h'_N}

The original input features are given an enhanced representation through a linear transformation with the shareable parameter W and are then concatenated, [·||·]. The shared attention mechanism a(·) maps the concatenated features to a correlation coefficient e_ij, completing the learning of the correlation between node i and node j. In the graph attention mechanism, a single-layer feedforward neural network with the nonlinear activation function LeakyReLU (negative half-axis slope 0.2) is used to map the concatenated features, and its parameter is the weight vector a.

The calculation formula of the correlation coefficient can be expressed as

e_ij = LeakyReLU(a^T [W h_i || W h_j])

The correlation coefficients are normalized with the softmax function to obtain the attention coefficients α_ij. With the normalized attention coefficients between nodes, the features of the nodes are weighted and summed, as in the graph convolution network, to obtain the output feature h' of each node.
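The attention computation described in 3.2) can be illustrated with a single-head layer in NumPy. This is a sketch rather than the patented model: the self-loop in the mask, the random parameters W and a, and the three-node line graph are assumptions made only for the example, and the output activation is left to the caller.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    """LeakyReLU with negative half-axis slope 0.2."""
    return np.where(x > 0, x, slope * x)

def gat_layer(h, adj, W, a):
    """Single-head graph attention layer.

    h: (N, M) input features, adj: (N, N) 0/1 adjacency,
    W: (M, M_out) shared linear map, a: (2 * M_out,) weight vector.
    Returns the attention coefficients alpha and the weighted sums.
    """
    n = h.shape[0]
    wh = h @ W
    m_out = wh.shape[1]
    # a^T [W h_i || W h_j] splits into a part for node i and a part for node j.
    src = wh @ a[:m_out]
    dst = wh @ a[m_out:]
    e = leaky_relu(src[:, None] + dst[None, :])
    # Only adjacent nodes take part; each node also attends to itself (assumption).
    mask = (adj + np.eye(n)) > 0
    e = np.where(mask, e, -np.inf)
    e = e - e.max(axis=1, keepdims=True)     # numerical stability for softmax
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha, alpha @ wh

rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])  # hypothetical line graph
h = rng.normal(size=(3, 2))    # two features per line node
W = rng.normal(size=(2, 4))
a = rng.normal(size=8)
alpha, h_out = gat_layer(h, adj, W, a)
```

Because the coefficients are learned from features rather than fixed by the adjacency matrix, the same parameters W and a can be applied when a branch is disconnected and the mask changes.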

3.3) selection method of activation function

The invention introduces a new activation function, the ELUs function, for mapping features between two neuron layers in the graph attention mechanism; it improves on the ReLU activation function. The ReLU activation function sets all negative values to 0 and outputs positive values linearly, so its output contains no negative values and the mean of the output is greater than 0; the next layer is therefore easily biased, a mean shift arises, and training of deeper networks may fail to converge. The ELUs function effectively solves this problem by introducing activation for negative values, and its calculation formula is as follows

g(x) = x for x > 0, and g(x) = α(e^x - 1) for x ≤ 0

When the graph attention mechanism is used to train on the topological graph of the power system, the number of nodes is large and the topology may change when branches are disconnected, so ELUs, which is more robust to noise, is selected as the activation function to avoid non-convergence of training.
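A short NumPy sketch of the two activation functions makes the difference for negative inputs concrete. The default α = 1.0 is an assumption; the text only states that α is an adjusting parameter.

```python
import numpy as np

def relu(x):
    """ReLU: negative inputs become 0, positive inputs pass through."""
    return np.maximum(np.asarray(x, dtype=float), 0.0)

def elu(x, alpha=1.0):
    """ELU: x for x > 0, alpha * (exp(x) - 1) for x <= 0."""
    x = np.asarray(x, dtype=float)
    # Clamp the exponent argument so large positive x cannot overflow.
    negative_part = alpha * (np.exp(np.minimum(x, 0.0)) - 1.0)
    return np.where(x > 0, x, negative_part)

samples = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
relu_out = relu(samples)
elu_out = elu(samples)
```

For inputs symmetric about zero, the ELU outputs have a mean closer to zero than the ReLU outputs, which is the mean-shift argument made above.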

4) The line power flow characteristics are normalized with the z-score method to obtain a node feature data set with consistent dimensions. The normalized data set is divided into a training sample set and a test sample set. The normalization formula is

x' = (x - x_μ) / x_σ

Wherein x_μ and x_σ are the sample mean and standard deviation, respectively.

The parameters of the proposed method are adjusted and optimized using training data samples and test data samples as follows:

5.1) reading the line admittance matrix and historical operating load data of the power system, obtaining the node adjacency matrix, and constructing a conversion topology graph with lines as nodes and the connection relations between lines as edges.

5.2) based on the historical load data, calculating the power flow in each line with the Newton-Raphson method, injecting false data into the historical power flow using the false data construction method, and constructing power flow samples containing false data. A node feature matrix that reflects the power flow operating characteristics is calculated from the available power flow data and used as the input of the graph attention neural network to determine the attacked line.

5.3) preprocessing the calculated node feature data according to the normalization formula and dividing them into test samples and training samples. The training samples are used to adjust the parameters of the graph convolution network based on the graph attention mechanism, yielding a graph convolution network model whose final purpose is to determine whether a node is attacked. The test samples are used to verify the validity of the FDIAs identification method and to determine the specific attacked line; if no FDIAs exist, the identification process ends.
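The division into training and test samples in step 5.3) can be sketched as a random split. The 80/20 ratio, the fixed seed and the synthetic feature and label arrays are assumptions for illustration; the text does not specify how the split is made beyond its being into two sets.

```python
import numpy as np

def split_samples(features, labels, test_ratio=0.2, seed=0):
    """Randomly divide labeled samples into a training set and a test set."""
    features = np.asarray(features)
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))
    n_test = int(round(test_ratio * len(features)))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return (features[train_idx], labels[train_idx]), (features[test_idx], labels[test_idx])

# Hypothetical data: 10 samples with two normalized features each
# (electrical betweenness, power flow deviation coefficient); label 1 = attacked.
features = np.arange(20, dtype=float).reshape(10, 2)
labels = np.array([0, 1, 0, 0, 1, 0, 1, 0, 0, 1])
(train_x, train_y), (test_x, test_y) = split_samples(features, labels)
```

Keeping the test samples out of parameter adjustment is what allows them to verify the identification method afterwards.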

Claims

1. A false data injection attack identification method based on line topology analysis and power flow characteristics is characterized by comprising the following steps:

1) acquiring historical power flow data and an original topological structure of the power system;

2) Converting the original topological structure of the power system to obtain a power system conversion topological structure;

3) injecting a plurality of attack data into historical power flow data to obtain power flow sample data with false data;

4) extracting a power flow characteristic value of the power flow sample data, and marking a label whether attack data exists or not for the power flow characteristic value;

5) establishing a graph convolution network model for judging whether the power system is attacked or not based on the load flow characteristic value of the load flow sample data;

6) obtaining current flow data of the power system, and extracting a flow characteristic value of the current flow data; and inputting the current power flow characteristic value of the current power flow data into the graph convolution network model, and judging whether the power system line is attacked or not.

2. The method for identifying the false data injection attack based on the line topology analysis and the power flow characteristic as claimed in claim 1, wherein the step of establishing the converted topology structure diagram is as follows:

1) establishing an adjacency matrix A_G representing the original topological structure of the power system, namely:

A_G＝[a_ij]_N×N

In the formula, a_ij represents an element of the node adjacency matrix A_G; v_i and v_j represent the i-th and j-th nodes of the network;

2) extracting the number of adjacent nodes of each node from the node adjacency matrix A_G and writing it into the cell array A_cell;

3) determining the adjacent lines of each line in the traditional topological graph of the power system, and constructing a cell array E_cell based on the line topology;

4) using the lines as the nodes of the topological graph, finding the adjacent lines of each line, thereby establishing a line adjacency matrix A_G';

5) according to the line adjacency matrix A_G', establishing a power system conversion topological structure which takes the lines as nodes and the connection relations between the lines as edges.

3. The method for identifying the false data injection attack based on the line topology analysis and the power flow characteristic as claimed in claim 1, wherein the attack data a is as follows:

a＝Hc (2)

wherein c = [c₁, c₂, ..., c_n]^T is an arbitrary non-zero vector; c ∈ R^(n×1); n is the number of states; H is a Jacobian matrix representing the topology of the power system.

4. The method for identifying the false data injection attack based on the line topology analysis and the power flow characteristics as claimed in claim 1, wherein the power flow characteristic value comprises an electrical betweenness and a power flow deviation coefficient index;

the electrical betweenness B_e(m, n) is as follows:

in the formula, L and G are respectively the sets of load nodes and power generation nodes in the power grid; W_i and W_j respectively represent the active power output by the generator and the node load value; I_ij(m, n) represents the amount of current change on the line (m, n) after a unit current source is connected between the power source node i and the load node j;

the power flow deviation coefficient index M_i is as follows:

in the formula, P_i0 and P_j0 respectively represent the initial active power of line i and the initial active power of line j; L is the set of all power transmission lines in the power grid; ΔP_ji is the amount of change in active power on line j due to the disconnection of line i.

5. The method for identifying the false data injection attack based on the line topology analysis and the power flow characteristic as claimed in claim 4, wherein the power flow characteristic value is preprocessed data; the pre-processing includes z-score normalization of the data; the normalized data x' are as follows:

x' = (x - x_μ) / x_σ

in the formula, x_μ and x_σ are the sample mean and standard deviation, respectively; x is the data before pre-processing.

6. The method for identifying the false data injection attack based on the line topology analysis and the power flow characteristic as claimed in claim 1, wherein the step of establishing a graph convolution network model for judging whether the power system is attacked or not comprises:

1) randomly dividing the tidal current characteristic value of the tidal current sample data into a test set and a training set;

2) building a graph convolution network; the graph convolution network model comprises an input layer, a plurality of hidden layers and an output layer;

3) training the graph convolution network by using a training set to obtain a trained graph convolution network;

4) testing the trained graph convolution network using the test set; if the accuracy of the output result of the graph convolution network is greater than a preset threshold value, the building of the graph convolution network model is finished; otherwise, load flow sample data and load flow characteristic values are acquired again, and the process returns to step 1).

7. The method for identifying the false data injection attack based on the line topology analysis and the power flow characteristic as claimed in claim 1, wherein the feature learning relationship between the feature values of any two layers of nodes of the graph convolution network model is as follows:

in the formula, l represents the layer in which the node is located; s denotes the number of adjacent nodes of node v_i; x_ni^l represents the feature value of the n-th node in the l-th layer; n_i is the number of a neighbor node of node i; g(·) represents an activation function;

is the network weight;

is the feature value of node i in the (l+1)-th layer.

8. The method for identifying the false data injection attack based on the line topology analysis and the power flow characteristic as claimed in claim 1, wherein the output characteristics of the graph convolution network model

are as follows:

in the formula, α_ij is the attention coefficient, and W is a shareable network parameter;

represents the input.

9. The method for identifying the false data injection attack based on the line topology analysis and the power flow characteristic as claimed in claim 8, wherein the step of obtaining the attention coefficient α_ij comprises:

1) recording the M-dimensional feature input set of all line nodes in the power system conversion topological structure as

and the output feature set as

2) calculating the degree of mutual influence e_ij between line nodes, namely:

wherein W is a shareable parameter; [·||·] represents concatenation; the shared attention mechanism a(·) is used for mapping the concatenated features to the correlation coefficient e_ij, completing the learning of the correlation between node i and node j;

represents the input;

3) mapping the concatenated features using a single-layer feedforward neural network with the nonlinear activation function LeakyReLU, and updating the degree of mutual influence e_ij as follows:

in the formula,

represents the weight vector;

4) normalizing the degree of mutual influence e_ij using the softmax function to obtain the attention coefficient α_ij, namely:

in the formula,

represents the input.

10. The method for identifying the false data injection attack based on the line topology analysis and the power flow characteristic as claimed in claim 1, wherein the activation function of the graph convolution network model is an ELUs function;

the ELUs function is as follows:

g(x) = x for x > 0, and g(x) = α(e^x - 1) for x ≤ 0

in the formula, α is an adjusting parameter; x is input data; g(x) is output data.