Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a power distribution network topology identification method based on an attention mechanism and a convolutional neural network.
The object of the invention can be achieved by the following technical solution:
A power distribution network topology identification method based on an attention mechanism and a convolutional neural network, comprising the following steps:
S1: acquiring historical measurement data of the power distribution network and the corresponding topological structures, and constructing a database;
S2: preprocessing the measurement data;
S3: screening the features according to their feature contribution degree, and constructing a feature set;
S4: constructing a power distribution network topology identification model, and training the model on the feature set;
S5: feeding the measurement data of the power distribution network to be identified into the trained power distribution network topology identification model to obtain the topological structure of the power distribution network to be identified.
Preferably, the preprocessing comprises data cleaning, missing value filling and abnormal value removing.
Preferably, step S3 specifically comprises:
S31: obtaining the feature contribution degree of the different measurement data to the topology identification of the power distribution network;
S32: ranking the feature contribution degrees, selecting the measurement data with the highest feature contribution degree as the feature set, and taking the corresponding topological structure as the label of the feature set.
Preferably, in step S31, a random forest algorithm is used to calculate the feature contribution degree of the measurement data.
Preferably, the power distribution network topology identification model is a convolutional neural network based on an attention mechanism.
Preferably, the convolutional neural network comprises an attention module, and an input layer, a hidden layer and an output layer which are connected in sequence, wherein the attention module is arranged behind the first layer of the hidden layer.
Preferably, the loss function of the convolutional neural network is:
$$L = -\sum_{i=1}^{N} y_i \log p_i$$
where $p = [p_1, \dots, p_N]$ is a probability distribution in which each element $p_i$ represents the probability that the sample belongs to topology $i$; $y = [y_1, \dots, y_N]$ is the sample label, with $y_i = 1$ when the sample belongs to topology $i$ and $y_i = 0$ otherwise; and $N$ is the total number of topology classes.
Preferably, the measured data includes node voltage amplitude and node injection power.
Preferably, the node injection power is active power.
Preferably, the topology structure is a topology structure diagram corresponding to the measurement data of each group of nodes.
Compared with the prior art, the invention has the following advantages:
1) to deal with the large amount of redundant measurement data in the power distribution network, the feature set is screened with a random forest algorithm, which reduces the dimension of the data set, lowers the computational complexity and space complexity of the subsequent model, and improves the identification efficiency of the model;
2) the invention uses a convolutional neural network to mine the correlation between the feature classes and the topological structure and to learn the mapping between them, so that the current topology can be identified from measurement data of a single time section alone, which overcomes the drawbacks of existing identification methods, namely that thresholds are difficult to set and that the methods apply only to radial networks;
3) an attention mechanism is added to the convolutional neural network so that attention is applied to the relevant features, which greatly improves the robustness of the model and addresses the high noise level of measurement data; the model identifies topologies well even in highly noisy data and therefore has high practical value.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments. Note that the following description of the embodiments is merely illustrative; it is not intended to limit the invention, its applications, or its uses, and the invention is not limited to the following embodiments.
Examples
A power distribution network topology identification method based on an attention mechanism and a convolutional neural network, as shown in FIG. 1, comprises the following steps:
S1: acquiring historical measurement data of the power distribution network and the corresponding topological structures, and constructing a database;
In this embodiment, the measurement data include the node voltage amplitude and the node injection power, where active power, which is easy to obtain, is used as the node injection power.
The IEEE 33-node power distribution network used as the calculation example in this embodiment is shown in FIG. 3. Because distributed generation may be connected to an actual power distribution network, nodes 7, 10, 14, and 33 are selected as distributed generation access points, and the line parameters are the IEEE 33-node standard parameters. The line connections are changed on the basis of this topology to generate 28 topological structures, of which 20 are radial networks and 8 are networks containing loops. The operating scenario is varied for each topology, and 3000 groups of sample data per topology are obtained by MATLAB simulation, giving 84000 groups in total. Each group of sample data consists of the voltage amplitudes and injection powers of the 33 nodes.
S2: preprocessing the measurement data.
The preprocessing comprises data cleaning, missing value filling and abnormal value removal.
Specifically, the maximum voltage amplitude is first estimated, and values exceeding the specified variation range are deleted as abnormal values;
then, the data are normalized:
$$v_{norm} = \frac{v - v_{min}}{v_{max} - v_{min}}$$
where $v$ and $v_{norm}$ are the voltage amplitudes of a node before and after normalization, respectively, and $v_{min}$ and $v_{max}$ are the minimum and maximum voltage amplitudes of that node in the training data set, respectively.
That is, the normalized voltage of a node equals the difference between its actual measurement and its lowest measurement over all time instants, divided by the difference between its maximum and minimum measurements over all time instants;
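The normalization described above can be sketched in a few lines of Python. This is a minimal illustration, not the embodiment's implementation: the function name `min_max_normalize`, the sample values, and the handling of a constant feature are assumptions introduced here.

```python
def min_max_normalize(values, v_min, v_max):
    """Scale voltage amplitudes with the training-set minimum and maximum."""
    span = v_max - v_min
    if span == 0:
        # Assumption: a constant feature carries no information, map it to 0.0.
        return [0.0 for _ in values]
    return [(v - v_min) / span for v in values]


# Hypothetical per-unit voltage amplitudes of one node at four time instants.
train = [0.95, 0.97, 1.00, 1.05]
v_min, v_max = min(train), max(train)
normalized = min_max_normalize(train, v_min, v_max)
```

Taking the extrema from the training set only, as the text specifies, means that test data may fall slightly outside the interval [0, 1].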
and finally, the deleted values and the missing values are filled in:
$$v_{i,t} = v_{i,t-1} + \frac{1}{n}\sum_{j=1}^{n}\left(v_{j,t} - v_{j,t-1}\right)$$
where $v_{i,t}$ is the voltage amplitude of node $i$ at time $t$, $n$ is the total number of nodes on the same branch as node $i$, and $j$ indexes those adjacent nodes.
Specifically, training samples with partially missing data are discarded and the data are collected again, whereas test samples with partially missing data are filled in so that topology identification can proceed normally. Exploiting the similarity of the voltage fluctuations at adjacent nodes, a missing value is filled with the node's own voltage amplitude at the previous instant plus the average change in the voltage amplitudes of the adjacent nodes since the previous instant.
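The filling rule can be illustrated as follows. This is a sketch under one reading of the description: the missing amplitude is taken to be the node's previous amplitude plus the mean change observed at its neighbors. The function name `fill_missing` and the numbers are hypothetical.

```python
def fill_missing(prev_value, neighbor_now, neighbor_prev):
    """Estimate a missing voltage amplitude at time t for one node.

    prev_value    : the node's own amplitude at time t-1
    neighbor_now  : amplitudes of the adjacent nodes at time t
    neighbor_prev : amplitudes of the same adjacent nodes at time t-1
    """
    if not neighbor_now or len(neighbor_now) != len(neighbor_prev):
        raise ValueError("need matching, non-empty neighbor readings")
    # Change in each neighbor's amplitude between t-1 and t.
    deltas = [now - prev for now, prev in zip(neighbor_now, neighbor_prev)]
    mean_delta = sum(deltas) / len(deltas)
    return prev_value + mean_delta


# Two hypothetical neighbors rise by 0.005 and 0.003, so the estimate rises by 0.004.
estimate = fill_missing(0.980, [0.990, 0.970], [0.985, 0.967])
```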
S3: screening the features according to their feature contribution degree, and constructing a feature set.
Step S3 specifically comprises:
S31: obtaining the feature contribution degree of the different measurement data to the topology identification of the power distribution network;
S32: ranking the feature contribution degrees, selecting the measurement data with the highest feature contribution degree as the feature set, and taking the node voltage amplitudes and the corresponding distribution network topological structures as the feature data set and the corresponding labels for subsequent model training.
In order to reduce the computational complexity and space complexity of subsequent model training, a random forest algorithm is used to calculate the contribution degree of each feature class to the power distribution network topology identification. The principle is as follows:
$$C(X) = \frac{1}{N}\sum_{k=1}^{N}\left(\mathrm{errOOB}_2^{(k)} - \mathrm{errOOB}_1^{(k)}\right)$$
where $N$ is the number of decision trees in the forest. For each decision tree, the corresponding out-of-bag (OOB) data are used to compute the out-of-bag error, denoted errOOB1. Noise is then randomly added to feature X in all out-of-bag samples (for example, the samples' values of feature X are changed at random), and the out-of-bag error is computed again and denoted errOOB2. The feature contribution ranking is shown in FIG. 4. Except for the root node, the voltage amplitude of every node has a higher contribution degree than the node injection power, so the node voltage amplitudes are selected as the feature set.
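The out-of-bag procedure described above can be sketched as follows. This is an illustration only: it does not train a random forest, the "trees" are hypothetical stand-ins supplied as `(predict, X_oob, y_oob)` tuples, and the perturbation is implemented as a random permutation of the feature's values, which is one common way to realize the random change the text mentions.

```python
import random


def oob_error(predict, X, y):
    """Fraction of out-of-bag samples that a tree misclassifies."""
    wrong = sum(1 for xi, yi in zip(X, y) if predict(xi) != yi)
    return wrong / len(y)


def feature_contribution(trees, feature, rng):
    """Average increase in OOB error after perturbing one feature.

    trees: list of (predict, X_oob, y_oob), one entry per decision tree.
    """
    total = 0.0
    for predict, X, y in trees:
        err1 = oob_error(predict, X, y)               # errOOB1
        column = [row[feature] for row in X]
        rng.shuffle(column)                           # perturb feature values
        X_perturbed = [row[:feature] + [v] + row[feature + 1:]
                       for row, v in zip(X, column)]
        err2 = oob_error(predict, X_perturbed, y)     # errOOB2
        total += err2 - err1
    return total / len(trees)


def stump(row):
    """Toy stand-in for a trained tree: it only looks at feature 0."""
    return 1 if row[0] > 0.5 else 0


X_oob = [[0.1, 0.9], [0.2, 0.1], [0.8, 0.3], [0.9, 0.7]]
y_oob = [0, 0, 1, 1]
trees = [(stump, X_oob, y_oob)] * 3
rng = random.Random(0)
score_used = feature_contribution(trees, 0, rng)
score_unused = feature_contribution(trees, 1, rng)
```

A feature that the trees never use, such as feature 1 here, gets a contribution of zero, because perturbing it cannot change any prediction.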
S4: constructing a power distribution network topology identification model, and training the model on the feature set.
As shown in FIG. 2, the power distribution network topology identification model is a convolutional neural network based on an attention mechanism. The convolutional neural network comprises an attention module and an input layer, a hidden layer and an output layer connected in sequence, with the attention module placed after the first layer of the hidden layer.
The attention module uses an attention mechanism to scan all features quickly, identify the feature classes that need the most attention, and concentrate attention on them while reducing the attention given to less important feature classes, which improves both efficiency and accuracy. Its implementation principle is shown in FIG. 5: the feature importance of all input features is calculated to obtain a weight for each feature class, and the weights are multiplied with the input features to obtain the attention-processed feature data set. The attention mechanism and the convolutional neural network are combined to construct the power distribution network topology identification model, whose basic structure is shown in FIG. 6. The input layer receives the voltage amplitudes of the 33 nodes, and the output layer gives the probability of belonging to each of the 28 topologies. The attention mechanism is inserted into the model as a hidden layer to improve its noise immunity.
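The weighting step of the attention module can be illustrated with a small sketch. The text does not specify how the importance scores are computed, so the scores below are fixed hypothetical numbers (in a trained network they would come from learned parameters), and the softmax used to turn scores into weights is an assumption.

```python
import math


def softmax(scores):
    """Numerically stable softmax: positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def apply_attention(features, scores):
    """Reweight each input feature by its attention weight."""
    weights = softmax(scores)
    return weights, [w * f for w, f in zip(weights, features)]


features = [0.2, 0.8, 0.5]   # hypothetical normalized voltage amplitudes
scores = [0.0, 2.0, 0.0]     # hypothetical importance scores
weights, attended = apply_attention(features, scores)
```

The second feature receives the largest weight, so its contribution to the following layers is emphasized relative to the other two.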
The loss function of the convolutional neural network is:
$$L = -\sum_{i=1}^{N} y_i \log p_i$$
where $p = [p_1, \dots, p_N]$ is a probability distribution in which each element $p_i$ represents the probability that the sample belongs to topology $i$; $y = [y_1, \dots, y_N]$ is the sample label, with $y_i = 1$ when the sample belongs to topology $i$ and $y_i = 0$ otherwise; and $N$ is the total number of topology classes.
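The definitions of p and y above match the standard categorical cross-entropy loss. Assuming that is the loss intended, it can be evaluated for a single sample as in the sketch below; the function name `cross_entropy` and the small constant `eps`, which guards against taking the logarithm of zero, are additions made here.

```python
import math


def cross_entropy(p, y, eps=1e-12):
    """Categorical cross-entropy of predicted distribution p and one-hot label y."""
    return -sum(yi * math.log(max(pi, eps)) for pi, yi in zip(p, y))


# Hypothetical three-class example: the sample belongs to topology 0.
p = [0.7, 0.2, 0.1]
y = [1, 0, 0]
loss = cross_entropy(p, y)
```

Because y is one-hot, only the predicted probability of the true topology contributes, so the loss here equals the negative logarithm of 0.7.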
S5: feeding the measurement data of the power distribution network to be identified into the power distribution network topology identification model to obtain the topological structure of the power distribution network to be identified.
In this embodiment, to demonstrate the effectiveness and superiority of the invention, power distribution network topology identification models are also constructed with several common intelligent algorithms, namely a Convolutional Neural Network (CNN), a Deep Neural Network (DNN), and eXtreme Gradient Boosting (XGBoost), and the experimental results are compared.
The evaluation indexes of the model are Accuracy (ACC), Precision (PRE) and Recall (REC), calculated as follows:
$$ACC = \frac{T}{N}, \qquad PRE = \frac{TP}{TP + FP}, \qquad REC = \frac{TP}{TP + FN}$$
where $T$ is the number of correctly classified samples over all classes, $N$ is the total number of samples over all classes, $TP$ is the number of positive samples classified as positive, $FP$ is the number of negative samples classified as positive, and $FN$ is the number of positive samples classified as negative.
In addition, since the model is in effect a multi-class classifier, a confusion matrix is used to show intuitively how well each individual class is classified.
The feature data set and the label set are randomly split, with 70% used as the training set and 30% as the test set. The model training results are shown in the following table:
TABLE 1 Performance comparison of the four algorithms
As can be seen from the above table, and given that a classification problem is judged mainly on accuracy, both the convolutional neural network combined with the attention mechanism (ACNN) and the CNN achieve high accuracy, and ACNN also has good precision and recall, so its topology identification performance is good.
From theoretical analysis, ACNN should show better classification performance when the data contain noise, and the advantage should become more pronounced as the noise increases. High-performance PMU devices and micro-PMU devices have measurement errors of 0.05% and 0.01%, respectively, while other common measurement devices have larger errors. Noise of 0.01%, 0.05%, 0.5% and 1% is therefore added to the data to simulate real measurement data.
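One way to inject noise at these levels is sketched below. The noise model is an assumption, since the text gives only the percentages: each value is multiplied by one plus a zero-mean Gaussian term whose standard deviation equals the stated percentage. The function name and sample values are hypothetical.

```python
import random


def add_measurement_noise(values, level_percent, rng):
    """Perturb each value with zero-mean Gaussian noise.

    Assumed model: the standard deviation is level_percent % of the value.
    """
    sigma = level_percent / 100.0
    return [v * (1.0 + rng.gauss(0.0, sigma)) for v in values]


rng = random.Random(42)
clean = [1.00, 0.98, 0.97]   # hypothetical per-unit voltage amplitudes
noisy = {level: add_measurement_noise(clean, level, rng)
         for level in (0.01, 0.05, 0.5, 1.0)}
```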
After adding noise to the test data, the recognition results are shown in the following table, where the Total Vector Error (TVE) is used to measure the noise level:
TABLE 2 Test set accuracy for the IEEE 33-node system with measurement noise taken into account
As can be seen from the results in the above table, within a certain range, the higher the noise level in the data, the more clearly the classification advantage of ACNN shows.
To further verify the validity of the model, a normalized confusion matrix is calculated for the model, and the result is shown as a confusion matrix heat map in FIG. 7. The element in the i-th row and j-th column of the confusion matrix is the probability that a sample whose true topology label is i is predicted as topology j. The results show that the diagonal elements are almost all 1 and the off-diagonal elements are essentially 0, so the model performs well. The method can therefore identify the topology of the power distribution network accurately and effectively.
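A row-normalized confusion matrix of the kind described above can be computed as in the following sketch. The function name and the small three-class example are hypothetical, and class labels are assumed to be integers starting at 0.

```python
def normalized_confusion_matrix(y_true, y_pred, n_classes):
    """Entry [i][j] is the share of class-i samples predicted as class j."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        counts[t][p] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows for classes with no samples are left as zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
cm = normalized_confusion_matrix(y_true, y_pred, 3)
```

A perfect classifier yields the identity matrix; in this example one class-1 sample is mistaken for class 2, so row 1 splits evenly between columns 1 and 2.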
The above embodiments are merely examples and do not limit the scope of the present invention. These embodiments may be implemented in other various manners, and various omissions, substitutions, and changes may be made without departing from the technical spirit of the present invention.