CN116562114A - Power transformer fault diagnosis method based on graph convolution neural network

Power transformer fault diagnosis method based on graph convolution neural network

Info

Publication number
CN116562114A
CN116562114A
Authority
CN
China
Prior art keywords
gcn
transformer
layer
samples
data
Prior art date
Legal status
Pending
Application number
CN202211479696.1A
Other languages
Chinese (zh)
Inventor
何明锋
陈飞
李付林
叶国庆
李毓
张波
黄红辉
季克勤
侯健生
黄健
王珂
沃建栋
叶宏
贺燕
吴峰
金坚锋
杨艳天
王赢聪
Current Assignee
Jinhua Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
Original Assignee
Jinhua Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
Priority date
Filing date
Publication date
Application filed by Jinhua Power Supply Co of State Grid Zhejiang Electric Power Co Ltd filed Critical Jinhua Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
Priority to CN202211479696.1A priority Critical patent/CN116562114A/en
Publication of CN116562114A publication Critical patent/CN116562114A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • G06F30/27Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2119/00Details relating to the type or aim of the analysis or the optimisation
    • G06F2119/02Reliability analysis or reliability optimisation; Failure analysis, e.g. worst case scenario performance, failure mode and effects analysis [FMEA]

Abstract

The invention discloses a power transformer fault diagnosis method based on a graph convolutional neural network (GCN). The method comprises the following steps: S1, constructing a GCN-based power transformer fault diagnosis method; S2, constructing the GCN structure; S3, performing transformer fault diagnosis with the GCN, including data processing and determination of the model outputs. Transformer faults are divided into thermal faults and discharge faults: the thermal faults include low-temperature thermal faults (LT), medium-temperature thermal faults (MT), and high-temperature thermal faults (HT), and the discharge faults include partial discharge (PD), low-energy discharge (LD), and high-energy discharge (HD). The diagnosis process includes (1) data import and normalization; (2) reconstruction and division of the data; and (3) initialization of the structure and parameters of the GCN. The model is then trained, and finally the performance of the GCN is evaluated.

Description

Power transformer fault diagnosis method based on graph convolution neural network
Technical Field
The invention belongs to the technical field of power grid fault diagnosis, and particularly relates to a power transformer fault diagnosis method based on a graph convolution neural network.
Background
With the continuous expansion of power systems, the number of transformers keeps growing, and so does the volume of transformer fault data; the accuracy of traditional fault diagnosis methods therefore needs further improvement. The operating state of a transformer directly affects the safety and power quality of the whole power system. Once a transformer fails, it can cause a local or even large-scale power outage, disrupting the operation of the power system and causing economic losses. Accurate diagnosis of the state of power transformers is therefore of great importance to the power system.
At present, most large transformers are oil-immersed. When a fault occurs, an oil-immersed transformer releases a large amount of gas into the oil, and the dissolved gas content is the key indicator used by dissolved gas analysis (DGA) for fault diagnosis. Existing DGA-based transformer fault diagnosis methods can be divided into two categories: distance-based methods and model-based methods. The first category mainly includes case-based reasoning, expert systems, k-nearest neighbors (KNN), and Siamese neural networks. Although these distance-based methods make full use of historical data and prior knowledge through similarity metrics, they struggle to capture the complex nonlinear relationships between the dissolved gases and the corresponding labels, which limits the accuracy of transformer fault diagnosis. The second category of traditional model-based algorithms includes support vector machines (SVM), multi-layer perceptrons (MLP), extreme gradient boosting (XGBoost), and LightGBM. While these traditional methods suit smaller datasets, their limited feature-extraction capability makes it difficult to fully exploit the latent relationships between the dissolved gases and the corresponding labels.
The power transformer fault diagnosis method based on deep learning disclosed in publication CN 115329908 A obtains a fault sample dataset of a power transformer, preprocesses it into a training dataset, constructs a preset CNN-based fault diagnosis model, trains it on the training dataset, optimizes the hyperparameters of the trained model to obtain a target fault diagnosis model, and then analyzes new data with that model to output the corresponding fault diagnosis result. However, the method is complex, hard for engineering personnel to operate, considers few influencing factors, and does not comprehensively account for the causes of power transformer faults. A graph convolutional neural network (GCN) can effectively mine the complex nonlinear relationships between fault types and dissolved gases through graph convolution layers with strong learning capability, and can also use an adjacency matrix to represent the similarity between unknown samples and labeled samples, thereby improving the accuracy of transformer fault diagnosis.
Disclosure of Invention
The flow of the power transformer fault diagnosis method based on the graph convolution neural network provided by the invention is shown in figure 1, and the method specifically comprises the following 3 steps.
S1 Constructing the GCN-based power transformer fault diagnosis method
The objective is to construct a graph G = (V, E) that takes as input a feature matrix X of dissolved gas contents and an adjacency matrix A of the samples:
Input = (X, A)  (1)
where X is an n × d feature matrix composed of the feature description x_i of each node i, n is the number of nodes (in transformer fault diagnosis, the number of samples), and d is the number of input features. The adjacency matrix represents, in matrix form, a similarity measure between the historical data and the current samples.
The output of the graph convolution layers is an n × F node matrix Y, where F is the number of transformer states. Each graph convolution layer can be written as a nonlinear function:
H^{(i+1)} = f(H^{(i)}, A),  i = 0, 1, ..., L  (2)
where L is the number of graph convolution layers. When i = 0, H^{(0)} = X; when i = L, H^{(L)} = Y. Specific graph convolution layers differ only in the choice of the activation function f and the manner of parameterization.
A simple form of the layer-wise propagation rule of the graph convolution layer is:
f(H^{(i)}, A) = σ(A H^{(i)} W^{(i)})  (3)
where σ is a nonlinear activation function, such as the rectified linear unit (ReLU), and W^{(i)} is the weight matrix of the i-th graph convolution layer.
Although the graph convolution layer is very powerful, it has two limitations to be addressed:
1) Multiplication by the adjacency matrix A means that, for each node, the layer sums the feature vectors of all neighboring nodes but not of the node itself (unless the graph contains self-loops). This limitation can be solved by enforcing self-loops in the graph-structured data, i.e., adding the identity matrix to A:
A' = A + I  (4)
2) The second limitation is that A' is not normalized, so the multiplication may change the scale of the feature vectors, which can be verified by examining the eigenvalues of A'. To solve this problem, A' should be normalized symmetrically:
A'' = D^{-1/2} A' D^{-1/2}  (5)
where D is the diagonal degree matrix of A':
D_{ii} = Σ_j A'_{ij}  (6)
After applying these two techniques, the new propagation rule of the graph convolution layer becomes:
f(H^{(i)}, A) = σ(A'' H^{(i)} W^{(i)})  (7)
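For concreteness, the preprocessing of Eqs. (4)-(6) and the propagation rule of Eq. (7) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the patent's implementation; the function names are ours, and a bias term is included to match Eqs. (8)-(11) below.

```python
import numpy as np

def normalize_adjacency(A):
    """Eqs. (4)-(6): add self-loops, then symmetrically normalize."""
    A_prime = A + np.eye(A.shape[0])           # A' = A + I
    d = A_prime.sum(axis=1)                    # degrees D_ii = sum_j A'_ij
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))     # D^{-1/2}
    return D_inv_sqrt @ A_prime @ D_inv_sqrt   # A'' = D^{-1/2} A' D^{-1/2}

def gcn_layer(H, A_norm, W, b):
    """Eq. (7) with ReLU as sigma and an added bias: ReLU(A'' H W + b)."""
    return np.maximum(0.0, A_norm @ H @ W + b)
```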
S2 Constructing the GCN structure
Typically, a neural network diagnoses the fault type Y of the power transformer from the input dissolved gas content X. In addition to X, the GCN requires an n × n adjacency matrix A, where n is the number of samples in the dataset. Samples in the training set and validation set are linked only to samples with the same label; for example, if the i-th and j-th samples both belong to partial discharge, A(i, j) = A(j, i) = 1. For samples in the test set (unknown samples), a Siamese network is used to extract low-dimensional features of the input variables, from which the Euclidean distances between samples are computed. KNN is then used to find the k labeled samples closest to each unknown sample, and these are considered connected; for example, if k = 1 and the j-th sample is the Euclidean nearest neighbor of the i-th unknown sample, A(i, j) = A(j, i) = 1. In this way, the adjacency matrix A represents a similarity measure between the historical data and the current samples.
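A simplified sketch of this adjacency construction is given below. For brevity it measures Euclidean distance on the normalized features directly, whereas the method described above first extracts low-dimensional Siamese embeddings; the function name and the sample ordering (training block first, test block last) are assumptions of the sketch.

```python
import numpy as np

def build_adjacency(X_train, y_train, X_test, k=10):
    """n x n adjacency A: same-label links for labeled samples, KNN links for unknown ones."""
    n_train, n_test = len(X_train), len(X_test)
    A = np.zeros((n_train + n_test, n_train + n_test))
    # Labeled samples: A(i, j) = 1 iff samples i and j share the same fault label.
    same = (y_train[:, None] == y_train[None, :]).astype(float)
    A[:n_train, :n_train] = same - np.eye(n_train)   # self-loops are added later via A' = A + I
    # Unknown samples: connect each to its k nearest labeled samples.
    for i in range(n_test):
        dist = np.linalg.norm(X_train - X_test[i], axis=1)
        for j in np.argsort(dist)[:k]:
            A[n_train + i, j] = A[j, n_train + i] = 1.0
    return A
```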
As shown in Fig. 2, the graph-structured data (X, A) is fed to the first layer, whose output is H^{(1)}. Specifically, the mixed feature matrix A''X linearly combines the feature vector of each node with those of its neighbors using the weights in A''. This new feature set is then multiplied by the weight matrix W_1 and a bias vector b_1 is added. Finally, an activation function (e.g., ReLU) is applied to obtain the output of the first layer:
H^{(1)} = ReLU(A'' X W_1 + b_1)  (8)
Likewise, the output of the second graph convolution layer is
H^{(2)} = ReLU(A'' H^{(1)} W_2 + b_2)  (9)
where W_2 and b_2 are the weight matrix and bias vector of the second graph convolution layer.
Two dense layers follow the graph convolution layers. Before being input to them, the data H^{(2)} must be flattened. In the third layer, the output H^{(3)} is obtained through a weight matrix W_3, a bias vector b_3, and an activation function:
H^{(3)} = ReLU(H^{(2)} W_3 + b_3)  (10)
The output of the fourth layer, through the Softmax function, is:
Y = Softmax(H^{(3)} W_4 + b_4)  (11)
wherein Y is the type of transformer fault.
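Putting Eqs. (8)-(11) together, the whole forward pass is four matrix operations. The sketch below reuses normalize_adjacency and gcn_layer from step S1 and stores the weights in a plain dictionary; both choices are ours, not the patent's.

```python
import numpy as np

def softmax(Z):
    e = np.exp(Z - Z.max(axis=1, keepdims=True))      # numerically stable Softmax
    return e / e.sum(axis=1, keepdims=True)

def gcn_forward(X, A, p):
    """Two graph convolution layers, two dense layers, Softmax output (Eqs. 8-11)."""
    A_norm = normalize_adjacency(A)
    H1 = gcn_layer(X, A_norm, p["W1"], p["b1"])       # Eq. (8)
    H2 = gcn_layer(H1, A_norm, p["W2"], p["b2"])      # Eq. (9)
    H3 = np.maximum(0.0, H2 @ p["W3"] + p["b3"])      # Eq. (10): dense + ReLU
    return softmax(H3 @ p["W4"] + p["b4"])            # Eq. (11): n x 7 state matrix Y
```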
S3, performing transformer fault diagnosis by using GCN
S301 data processing
In normal operation, the solid organic insulating material and insulating oil of a power transformer age gradually under the combined action of the electric and thermal fields. Small amounts of gases, such as hydrogen and low-molecular-weight hydrocarbon gases, dissolve in the transformer oil. If a discharge or thermal fault occurs, the dissolved gas content rises rapidly; if gas is produced faster than the transformer oil can absorb it, the excess gas diffuses into the gas relay and triggers an alarm. At present, a common detection technique for diagnosing the fault type of an oil-immersed transformer is to analyze the dissolved gas content, and several new features constructed with the IEC ratio method, the Dornenburg ratio method, and the Rogers ratio method further improve the accuracy of fault diagnosis. Previous work has shown that CO and CO2 correlate weakly with the transformer fault type, while H2, C2H6, CH4, C2H2, and C2H4 correlate strongly with it. Therefore, the contents of the dissolved gases H2, C2H6, CH4, C2H2, and C2H4 are chosen as the original features, and 4 new features constructed by the Rogers ratio method (CH4/H2, C2H2/C2H4, C2H4/C2H6, C2H6/CH4) are further taken as input variables of the GCN.
Since the values of these 9 features differ greatly in scale, using them directly as input variables would harm model performance and could even prevent the loss function from converging. Therefore, the nine features are mapped to the interval [0, 1] by min-max normalization before being fed to the GCN:
x_i' = (x_i - x_{i,min}) / (x_{i,max} - x_{i,min})  (12)
where x_i and x_i' denote the i-th feature before and after normalization, respectively, and x_{i,min} and x_{i,max} denote the minimum and maximum values of the i-th feature before normalization.
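Eq. (12) applied column-wise over the feature matrix is one line of NumPy; a minimal sketch:

```python
import numpy as np

def min_max_normalize(X):
    """Map each of the 9 feature columns into [0, 1] per Eq. (12)."""
    x_min, x_max = X.min(axis=0), X.max(axis=0)
    return (X - x_min) / (x_max - x_min)   # assumes x_max > x_min for every feature
```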
S302 Output variables of the model
Transformer faults can be classified into thermal faults and discharge faults. Specifically, the thermal faults include low-temperature thermal faults (LT), medium-temperature thermal faults (MT), and high-temperature thermal faults (HT), and the discharge faults include partial discharge (PD), low-energy discharge (LD), and high-energy discharge (HD). To effectively calculate the cross-entropy loss function during GCN training, the state types of the transformer are encoded as shown in Table 1.
Table 1 Transformer state encoding

Transformer state | Encoding
Normal state | 1000000
Low-temperature thermal fault (LT) | 0100000
Medium-temperature thermal fault (MT) | 0010000
High-temperature thermal fault (HT) | 0001000
Partial discharge (PD) | 0000100
Low-energy discharge (LD) | 0000010
High-energy discharge (HD) | 0000001
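A small helper reproducing the encoding of Table 1 (the state order is taken from the table; the function name is ours):

```python
import numpy as np

STATES = ["Normal", "LT", "MT", "HT", "PD", "LD", "HD"]   # row order of Table 1

def one_hot(labels):
    """Encode state labels as 1 x 7 vectors so cross-entropy can be computed directly."""
    Y = np.zeros((len(labels), len(STATES)))
    for i, lab in enumerate(labels):
        Y[i, STATES.index(lab)] = 1.0
    return Y
```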
S303 fault diagnosis procedure
The fault diagnosis process of the transformer based on GCN is shown in fig. 4, and the specific steps are as follows:
(1) data import and normalization
The dissolved gases H2, C2H6, CH4, C2H2, and C2H4 are taken as the original features, and 4 new features are constructed using the Rogers ratio method. These 9 features serve as input variables of the GCN. To obtain the adjacency matrix A, a Siamese network extracts low-dimensional features of the input variables, from which the Euclidean distances between samples are computed; KNN then finds the k samples closest to each unknown sample, which are considered connected. In addition, the input data is mapped into [0, 1] using min-max normalization.
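As an illustration, the 9 input features can be assembled as follows; the sketch assumes `gas` is an n × 5 array with columns ordered H2, C2H6, CH4, C2H2, C2H4 and strictly positive gas contents.

```python
import numpy as np

def build_features(gas):
    """Stack the 5 raw gas contents with the 4 Rogers ratios into an n x 9 matrix X."""
    H2, C2H6, CH4, C2H2, C2H4 = gas.T
    ratios = np.stack([CH4 / H2, C2H2 / C2H4, C2H4 / C2H6, C2H6 / CH4], axis=1)
    return np.hstack([gas, ratios])
```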
(2) Reconstruction and partitioning of data
Because the adjacency matrix contains a large number of zero elements, storing it densely wastes space, so it is reshaped into a sparse matrix in coordinate (COO) format. In the dataset, 75% of the samples are used to train the GCN, and the remaining samples are used to evaluate the performance of the model.
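A minimal sketch of this step, assuming X (the n × 9 feature matrix) and A (the dense adjacency matrix) from the previous steps:

```python
import numpy as np
from scipy.sparse import coo_matrix

A_sparse = coo_matrix(A)            # keep only the non-zero entries of the adjacency matrix

n = X.shape[0]
idx = np.random.permutation(n)      # random 75% / 25% split of the samples
train_idx, test_idx = idx[: int(0.75 * n)], idx[int(0.75 * n):]
```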
(3) Initializing the structure and parameters of a GCN
To improve the accuracy of transformer fault diagnosis, the optimal structure and parameters must be explored before training the GCN. These mainly include the number of graph convolution layers, the number of iterations, the size k used to build the adjacency matrix A, and the choice of optimizer. One basic structure of the GCN is shown in Table 2. The graph convolution filters have sizes of 8 and 16, respectively, and all graph convolution layers use ReLU activations. To mitigate overfitting, each graph convolution layer is followed by a dropout layer with probability 0.25. Both the filter sizes and the dropout probability are the best values found experimentally.
Table 2 Basic structure of the GCN

Through many experiments in the case study, a dense-layer output of a 1 × 7 vector was finally determined to represent the state of each unknown sample.
S304 training
The GCN is trained with the back-propagation algorithm, which mainly consists of two steps: forward propagation and backward weight updating. In forward propagation, the input variables pass through several graph convolution layers and are then fed to the dense layers, which output the label of each sample. The diagnosis result and the ground truth are used to compute the loss function (error). In backward weight updating, the chain rule propagates the error from the output layer back through the intermediate layers, and the weights of each layer are then updated by gradient descent. When the set number of iterations is reached, the test set is used to evaluate the performance of the GCN. In addition, an ensemble technique is used during training to improve the accuracy of the model.
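The loop below sketches this training procedure in PyTorch under a few assumptions: `model` implements the forward pass of Eqs. (8)-(11), `y` holds integer class labels, and Adam (selected later in the case study) is the optimizer; the ensemble step is omitted.

```python
import torch

def train_gcn(model, X, A_norm, y, train_idx, iterations=800, lr=0.01):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(iterations):
        optimizer.zero_grad()
        out = model(X, A_norm)                        # forward propagation over all nodes
        loss = loss_fn(out[train_idx], y[train_idx])  # loss on the training samples only
        loss.backward()                               # backward pass via the chain rule
        optimizer.step()                              # gradient-descent (Adam) weight update
    return model
```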
S305 evaluating the performance of GCN
For a binary classification problem, the result is either the positive class or the negative class, and precision and recall can be used to evaluate model performance; however, these indicators do not directly apply to multi-class problems such as transformer fault diagnosis. In general, a k-class problem can be decomposed into k binary classification problems. Therefore, in addition to accuracy, Macro F1 and the geometric mean of the per-class recalls (G-mean) are used to evaluate the performance of the model. Both Macro F1 and G-mean are positive indicators: the larger the value, the better the performance.
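These three indicators can be computed as follows; a minimal sketch using scikit-learn, with the G-mean taken as the geometric mean of the per-class recalls:

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, recall_score

def evaluate(y_true, y_pred):
    acc = accuracy_score(y_true, y_pred)
    macro_f1 = f1_score(y_true, y_pred, average="macro")
    recalls = recall_score(y_true, y_pred, average=None)    # recall of each of the 7 classes
    g_mean = float(np.prod(recalls) ** (1.0 / len(recalls)))
    return acc, macro_f1, g_mean
```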
Compared with the prior art, the invention has the beneficial effects that:
the invention provides a power transformer fault diagnosis method based on a graph convolution neural network, which is used for improving the accuracy of transformer fault diagnosis. Constructing a GCN-based power transformer fault diagnosis method; constructing a GCN structure; performing transformer fault diagnosis by using GCN, and performing data processing on transformer faults, wherein the diagnosis process comprises (1) data importing and normalizing; (2) reconstructing and dividing data; (3) the structure and parameters of the GCN are initialized. And training the model, and finally evaluating the performance of the GCN. The graph convolutional neural network (GCN) can effectively mine complex nonlinear relations between fault types and dissolved gases by using graph convolution layers with strong learning capability, and can also use an adjacency matrix to represent similarity measurement between unknown samples and marked samples, so that the accuracy of transformer fault diagnosis is improved.
Drawings
FIG. 1 is a flow chart of the method steps of the present invention;
FIG. 2 is a graph of a graph convolutional neural network;
FIG. 3 is a diagram of a multi-layer perceptron;
FIG. 4 is a GCN-based fault diagnosis flowchart;
FIG. 5 is a diagram of a GCN training process;
FIG. 6 is a graph of test set indicators for different k values.
Detailed Description
(1) Description of data
To test the performance of the GCN for transformer fault diagnosis, simulations and analyses were performed on an actual dataset from the State Grid company. The voltage rating of the sampled transformers is 220 kV. After data cleaning, 718 samples remain in the dataset, covering 7 state types: normal, low-temperature thermal fault, medium-temperature thermal fault, high-temperature thermal fault, partial discharge, low-energy discharge, and high-energy discharge. 75% of the samples were used for training, and the remaining samples were used as test samples to evaluate the performance of the model. The sample counts of the respective state types are shown in Table 3.
Table 3 Sample distribution of the dataset

Transformer state | Total samples | Training samples | Test samples
Normal state | 52 | 39 | 13
LT | 99 | 74 | 25
MT | 73 | 55 | 18
HT | 168 | 126 | 42
PD | 105 | 79 | 26
LD | 42 | 31 | 11
HD | 179 | 134 | 45
(2) GCN training effect
In order to clearly observe the training process of the GCN, fig. 5 shows the trend of the loss function as the number of iterations increases.
In the early stages of the training process, the loss function of the training set drops rapidly with increasing number of iterations. When the number of iterations is greater than 400, the loss function tends to be constant and does not continue to drop, indicating that the GCN has converged. Generally, the training process of the GCN is relatively stable, and the convergence speed is high. To ensure convergence of the GCN, after 800 iterations, the unknown samples were diagnosed using the GCN.
To analyze the effect of the number of layers on GCN performance, the number of layers was gradually increased and the index of the test set under the different layers was counted as shown in Table 4.
Table 4 Test set indices for different numbers of graph convolution layers

Layers | Accuracy | Macro F1 | G-mean | Time/s | Parameters
1 | 0.714 | 0.689 | 0.709 | 35.27 | 80
2 | 0.793 | 0.772 | 0.791 | 65.99 | 152
3 | 0.780 | 0.753 | 0.769 | 96.35 | 224
4 | 0.699 | 0.672 | 0.688 | 126.75 | 296
5 | 0.683 | 0.655 | 0.673 | 157.65 | 368
From Table 4 it can be concluded that: 1) At first, the test set indices increase with the number of graph convolution layers, indicating that the performance of the GCN improves gradually. With 2 graph convolution layers, the accuracy, Macro F1, and G-mean of the GCN reach their maxima and fault diagnosis performance is best. This suggests that too few graph convolution layers make it difficult to mine the complex nonlinear relationships between the dissolved gases and the transformer states, while adding graph convolution layers improves the feature-learning capability of the GCN and thus the diagnostic accuracy. 2) The number of parameters to be trained increases linearly with the number of graph convolution layers. When the number of layers exceeds 2, adding more graph convolution layers makes the performance of the GCN worse, because the number of samples in the dataset is limited: too many layers not only increase the number of parameters to be trained and consume much training time, but also lead to overfitting, reducing diagnostic accuracy. 3) In general, the number of graph convolution layers should be chosen according to the size of the dataset; with a small number of samples, the GCN achieves better performance with 2 graph convolution layers.
The size of k determines how many training samples are connected to each unknown sample, which directly affects the adjacency matrix A. To investigate the effect of k on GCN performance, k was varied from 1 to 20 and the performance of the GCN on the test set was recorded, as shown in Fig. 6.
When k is very small, each unknown sample is connected only to its nearest samples, which makes it difficult to fully exploit the similarity between samples and limits accuracy. Conversely, if k is very large, each unknown sample is connected to many samples, possibly including samples of different types, which introduces noise and limits the performance of the GCN. Thus, as k increases, the accuracy, Macro F1, and G-mean first increase and then decrease. When k = 10, the accuracy, Macro F1, and G-mean on the test set reach their maxima and the fault diagnosis performance of the GCN is best.
After initializing the structure and parameters of the GCN, the neural network is optimized by gradient descent. Currently popular gradient-based optimizers include SGD, RMSprop, Adagrad, Adadelta, Adam, Adamax, and Nadam. These methods are often used as black boxes, since their principles are too complex to analyze in practical engineering. To find the optimizer best suited to the GCN for transformer fault detection, each of these popular optimizers was configured and simulated, and the test set indices were recorded as shown in Table 5.
Table 5 Test set indices under different optimizers
As can be seen from Table 5, the GCN performs well when RMSprop, Adamax, Nadam, or Adam is used as the optimizer. Specifically, the accuracy, Macro F1, and G-mean of the Adam algorithm are slightly higher than those of the other optimizers, indicating that Adam is the most suitable optimizer for the GCN in transformer fault detection. Furthermore, the Adagrad, Adadelta, and SGD algorithms all correspond to an accuracy below 0.7, indicating that they are not suitable for GCN-based transformer fault diagnosis.
3.3 Comparison under different input features
To illustrate the effectiveness of the GCN, common distance-based methods (KNN and the Siamese network) and model-based methods (CNN, MLP, XGBoost, and SVM) were used as baselines, and the test set indices were compared under different input features. Through repeated experiments, the optimal parameters and structures of the baseline algorithms were determined as follows:
1) For KNN, k is 7. 2) The Siamese network comprises two CNNs with shared weights, which are used to compute the Euclidean distance between a pair of input samples. Specifically, the two convolution layers have 16 and 36 filters, respectively, with 2 × 2 convolution kernels; the max-pooling size is 2 × 2; a dropout layer with probability 0.25 is inserted after the two convolution layers to mitigate overfitting; the dense layers have 8 and 1 neurons, respectively; and all layers use the ReLU activation function. 3) The CNN comprises two convolution layers, two max-pooling layers, two dropout layers, and two dense layers; the dropout probability is 0.25, the convolution kernel size is 3, the pooling size is 2 × 2, the convolution layers use ReLU activations, and the dense layers have 14 and 7 neurons, respectively. 4) For the MLP, the input layer has 9 neurons and the hidden layers have 9 and 7 neurons, respectively; the number of output neurons equals the number of classes, and a dropout layer is inserted between the dense layers to mitigate overfitting. 5) For XGBoost, the maximum depth is 5, the gamma value is 0.2, the subsample rate is 0.6, and the minimum child weight is 3. 6) For the SVM, the transformer fault types are classified using the fitcecoc function in MATLAB 2018a.
The above algorithms were trained under the different input feature sets, and the simulation results on the test set are shown in Tables 6 and 7.
Table 6 Test set results with the 5 raw features as input

Method | Accuracy | Macro F1 | G-mean
GCN | 0.781 | 0.758 | 0.776
CNN | 0.740 | 0.717 | 0.734
MLP | 0.633 | 0.602 | 0.620
XGBoost | 0.634 | 0.606 | 0.626
SVM | 0.642 | 0.616 | 0.635
KNN | 0.669 | 0.644 | 0.664
Siamese network | 0.751 | 0.725 | 0.741
Table 7 Test set results with all 9 features as input
From the tables it can be concluded that: 1) Accuracy represents the probability that the model correctly identifies positive and negative classes; if the dataset is imbalanced, the majority classes can dominate the accuracy. Therefore, Macro F1 and G-mean were also chosen to evaluate model performance. As the tables show, the values of accuracy, Macro F1, and G-mean are similar, indicating that the model diagnoses the various fault types with very similar probabilities of success. In addition, the G-mean values are all greater than 0, indicating that no fault type is missed entirely by any model. 2) The accuracy, Macro F1, and G-mean of every algorithm in Table 7 exceed those in Table 6, indicating that the four features constructed by the Rogers ratio method improve the performance of each algorithm in transformer fault diagnosis to some extent. 3) Under both input feature sets, the GCN achieves higher diagnostic performance than the other algorithms. Taking Table 7 as an example, the accuracy, Macro F1, and G-mean of the GCN are 79.3%, 77.2%, and 79.1%, respectively. Compared with CNN, MLP, XGBoost, SVM, KNN, and the Siamese network, the accuracy of the GCN is higher by 3.7%, 12.1%, 15.2%, 10.5%, 9.3%, and 2.2%; the Macro F1 of the GCN is higher by 4.0%, 12.3%, 15.9%, 11.0%, 9.5%, and 2.7%; and the G-mean of the GCN is higher by 4.3%, 12.2%, 16.0%, 10.8%, 9.6%, and 2.8%, respectively. 4) Last but not least, both the CNN and the GCN use convolution layers to extract features of the input data; the difference is that the GCN additionally accounts for the similarity between unknown and labeled samples through the adjacency matrix. The GCN outperforms the CNN, which suggests that ignoring the similarity between samples limits the accuracy of fault diagnosis. The GCN can both explore the relationships between features and fault types with its graph convolution layers and exploit the similarity between samples, and can therefore diagnose transformer fault types more accurately.

Claims (4)

1. A power transformer fault diagnosis method based on a graph convolutional neural network, characterized by comprising the following steps: S1, constructing a GCN-based power transformer fault diagnosis method; S2, constructing the GCN structure; S3, performing transformer fault diagnosis with the GCN, including data processing and determination of the model outputs, wherein transformer faults are divided into thermal faults and discharge faults; specifically, the thermal faults include low-temperature thermal faults (LT), medium-temperature thermal faults (MT), and high-temperature thermal faults (HT), and the discharge faults include partial discharge (PD), low-energy discharge (LD), and high-energy discharge (HD); the diagnosis process includes (1) data import and normalization, (2) reconstruction and division of the data, and (3) initialization of the structure and parameters of the GCN; the model is then trained, and finally the performance of the GCN is evaluated.
2. The power transformer fault diagnosis method based on a graph convolutional neural network according to claim 1, characterized in that S1, constructing the GCN-based power transformer fault diagnosis method, comprises: constructing a graph G = (V, E) that takes as input a feature matrix X of dissolved gas contents and an adjacency matrix A of the samples:
Input = (X, A)  (1)
wherein X is an n × d feature matrix composed of the feature description x_i of each node i, n is the number of nodes (the number of samples in transformer fault diagnosis), and d is the number of input features; the adjacency matrix represents, in matrix form, a similarity measure between the historical data and the current samples;
the output of the graph convolution layers is an n × F node matrix Y, where F is the number of transformer states; each graph convolution layer can be written as a nonlinear function:
H^{(i+1)} = f(H^{(i)}, A),  i = 0, 1, ..., L  (2)
where L is the number of graph convolution layers; when i = 0, H^{(0)} = X, and when i = L, H^{(L)} = Y; specific graph convolution layers differ only in the choice of the activation function f and the manner of parameterization;
a simple form of the layer-wise propagation rule of the graph convolution layer is:
f(H^{(i)}, A) = σ(A H^{(i)} W^{(i)})  (3)
where σ is a nonlinear activation function, such as the rectified linear unit (ReLU), and W^{(i)} is the weight matrix of the i-th graph convolution layer;
although the graph convolution layer is very powerful, it has two limitations to be addressed:
1) multiplication by the adjacency matrix A means that, for each node, the layer sums the feature vectors of all neighboring nodes but not of the node itself (unless the graph contains self-loops); this limitation is solved by enforcing self-loops in the graph-structured data, i.e., adding the identity matrix to A:
A' = A + I  (4)
2) the second limitation is that A' is not normalized, so the multiplication may change the scale of the feature vectors, which can be verified by examining the eigenvalues of A'; to solve this problem, A' is normalized symmetrically:
A'' = D^{-1/2} A' D^{-1/2}  (5)
where D is the diagonal degree matrix of A':
D_{ii} = Σ_j A'_{ij}  (6)
after applying these two techniques, the new propagation rule of the graph convolution layer becomes:
f(H^{(i)}, A) = σ(A'' H^{(i)} W^{(i)})  (7).
3. The power transformer fault diagnosis method based on a graph convolutional neural network according to claim 1, characterized in that S2, constructing the GCN structure, comprises: typically, a neural network diagnoses the fault type Y of the power transformer from the input dissolved gas content X; in addition to X, the GCN requires an n × n adjacency matrix A, where n is the number of samples in the dataset; samples in the training set and validation set are linked only to samples with the same label; for example, if the i-th and j-th samples both belong to partial discharge, A(i, j) = A(j, i) = 1; for samples in the test set (unknown samples), a Siamese network is used to extract low-dimensional features of the input variables, from which the Euclidean distances between samples are computed; KNN is then used to find the k labeled samples closest to each unknown sample, which are considered connected; for example, if k = 1 and the j-th sample is the Euclidean nearest neighbor of the i-th unknown sample, A(i, j) = A(j, i) = 1; in this way, the adjacency matrix A represents a similarity measure between the historical data and the current samples;
as shown in fig. 2, the graph-structured data (X, A) is fed to the first layer, whose output is H^{(1)}; specifically, the mixed feature matrix A''X linearly combines the feature vector of each node with those of its neighbors using the weights in A''; this new feature set is multiplied by the weight matrix W_1 and a bias vector b_1 is added; an activation function (e.g., ReLU) is then applied to obtain the output of the first layer:
H^{(1)} = ReLU(A'' X W_1 + b_1)  (8)
likewise, the output of the second graph convolution layer is
H^{(2)} = ReLU(A'' H^{(1)} W_2 + b_2)  (9)
where W_2 and b_2 are the weight matrix and bias vector of the second graph convolution layer;
two dense layers follow the graph convolution layers; before being input to them, the data H^{(2)} must be flattened; in the third layer, the output H^{(3)} is obtained through a weight matrix W_3, a bias vector b_3, and an activation function:
H^{(3)} = ReLU(H^{(2)} W_3 + b_3)  (10)
the output of the fourth layer, through the Softmax function, is:
Y = Softmax(H^{(3)} W_4 + b_4)  (11)
    wherein Y is the type of transformer fault.
4. The power transformer fault diagnosis method based on a graph convolutional neural network according to claim 1, characterized in that S3, performing transformer fault diagnosis with the GCN, comprises:
S301 data processing
in normal operation, the solid organic insulating material and insulating oil of a power transformer age gradually under the combined action of the electric and thermal fields; small amounts of gases, such as hydrogen and low-molecular-weight hydrocarbon gases, dissolve in the transformer oil; if a discharge or thermal fault occurs, the dissolved gas content rises rapidly, and if gas is produced faster than the transformer oil can absorb it, the excess gas diffuses into the gas relay and triggers an alarm; at present, a common detection technique for diagnosing the fault type of an oil-immersed transformer is to analyze the dissolved gas content, and several new features constructed with the IEC ratio method, the Dornenburg ratio method, and the Rogers ratio method further improve the accuracy of fault diagnosis; previous work has shown that CO and CO2 correlate weakly with the transformer fault type, while H2, C2H6, CH4, C2H2, and C2H4 correlate strongly with it; therefore, the contents of the dissolved gases H2, C2H6, CH4, C2H2, and C2H4 are chosen as the original features, and 4 new features constructed by the Rogers ratio method (CH4/H2, C2H2/C2H4, C2H4/C2H6, C2H6/CH4) are further taken as input variables of the GCN;
since the values of these 9 features differ greatly in scale, using them directly as input variables would harm model performance and could even prevent the loss function from converging; therefore, the nine features are mapped to the interval [0, 1] by min-max normalization before being fed to the GCN:
x_i' = (x_i - x_{i,min}) / (x_{i,max} - x_{i,min})  (12)
where x_i and x_i' denote the i-th feature before and after normalization, respectively, and x_{i,min} and x_{i,max} denote the minimum and maximum values of the i-th feature before normalization;
S302 output variables of the model
transformer faults can be classified into thermal faults and discharge faults; specifically, the thermal faults include low-temperature thermal faults (LT), medium-temperature thermal faults (MT), and high-temperature thermal faults (HT), and the discharge faults include partial discharge (PD), low-energy discharge (LD), and high-energy discharge (HD); to effectively calculate the cross-entropy loss function during GCN training, the various state types of the transformer are encoded;
S303 fault diagnosis procedure
    The fault diagnosis process of the transformer based on GCN is shown in fig. 4, and the specific steps are as follows:
    (1) data import and normalization
the dissolved gases H2, C2H6, CH4, C2H2, and C2H4 are taken as the original features, and 4 new features are constructed using the Rogers ratio method; these 9 features serve as input variables of the GCN; to obtain the adjacency matrix A, a Siamese network extracts low-dimensional features of the input variables, from which the Euclidean distances between samples are computed; KNN then finds the k samples closest to each unknown sample, which are considered connected; in addition, the input data is mapped into [0, 1] using min-max normalization;
    (2) reconstruction and partitioning of data
because the adjacency matrix contains a large number of zero elements, storing it densely wastes space, so it is reshaped into a sparse matrix in coordinate (COO) format; in the dataset, 75% of the samples are used to train the GCN, and the remaining samples are used to evaluate the performance of the model;
    (3) initializing the structure and parameters of a GCN
to improve the accuracy of transformer fault diagnosis, the optimal structure and parameters must be explored before training the GCN; these mainly include the number of graph convolution layers, the number of iterations, the size k used to build the adjacency matrix A, and the choice of optimizer;
S304 training
the GCN is trained with the back-propagation algorithm, which mainly consists of two steps: forward propagation and backward weight updating; in forward propagation, the input variables pass through several graph convolution layers and are then fed to the dense layers, which output the label of each sample; the diagnosis result and the ground truth are used to compute the loss function (error); in backward weight updating, the chain rule propagates the error from the output layer back through the intermediate layers, and the weights of each layer are then updated by gradient descent; when the set number of iterations is reached, the test set is used to evaluate the performance of the GCN; in addition, an ensemble technique is used during training to improve the accuracy of the model;
S305 evaluating the performance of the GCN
for a binary classification problem, the result is either the positive class or the negative class, and precision and recall can be used to evaluate model performance; however, these indicators do not directly apply to multi-class problems such as transformer fault diagnosis; in general, a k-class problem can be decomposed into k binary classification problems; therefore, in addition to accuracy, Macro F1 and the geometric mean of the per-class recalls (G-mean) are used to evaluate the performance of the model; both Macro F1 and G-mean are positive indicators: the larger the value, the better the performance.
CN202211479696.1A 2023-04-25 2023-04-25 Power transformer fault diagnosis method based on graph convolution neural network Pending CN116562114A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211479696.1A CN116562114A (en) 2023-04-25 2023-04-25 Power transformer fault diagnosis method based on graph convolution neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211479696.1A CN116562114A (en) 2023-04-25 2023-04-25 Power transformer fault diagnosis method based on graph convolution neural network

Publications (1)

Publication Number Publication Date
CN116562114A true CN116562114A (en) 2023-08-08

Family

ID=87490406

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211479696.1A Pending CN116562114A (en) 2023-04-25 2023-04-25 Power transformer fault diagnosis method based on graph convolution neural network

Country Status (1)

Country Link
CN (1) CN116562114A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117235465A (en) * 2023-11-15 2023-12-15 国网江西省电力有限公司电力科学研究院 Transformer fault type diagnosis method based on graph neural network wave recording analysis
CN117235465B (en) * 2023-11-15 2024-03-12 国网江西省电力有限公司电力科学研究院 Transformer fault type diagnosis method based on graph neural network wave recording analysis
CN117950906A (en) * 2024-03-27 2024-04-30 西南石油大学 Method for deducing fault cause of server based on neural network of table graph
CN117950906B (en) * 2024-03-27 2024-06-04 西南石油大学 Method for deducing fault cause of server based on neural network of table graph
CN117970224A (en) * 2024-03-29 2024-05-03 国网福建省电力有限公司 CVT error state online evaluation method, system, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication