CN113822417A - Transformer fault type diagnosis method combining two machine learning methods - Google Patents

Transformer fault type diagnosis method combining two machine learning methods

Info

Publication number
CN113822417A
Authority
CN
China
Prior art keywords
layer
network
output
input
transformer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111106947.7A
Other languages
Chinese (zh)
Inventor
薛凯丹
王心琦
王耀辉
郑贤宇
李强
黄凯
秦莹
李寅伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
PowerChina Chengdu Engineering Co Ltd
Original Assignee
PowerChina Chengdu Engineering Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by PowerChina Chengdu Engineering Co Ltd filed Critical PowerChina Chengdu Engineering Co Ltd
Priority to CN202111106947.7A
Publication of CN113822417A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/11 Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
    • G06F17/13 Differential equations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Algebra (AREA)
  • Economics (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Human Resources & Organizations (AREA)
  • Tourism & Hospitality (AREA)
  • Water Supply & Treatment (AREA)
  • Medical Informatics (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Public Health (AREA)
  • General Business, Economics & Management (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)

Abstract

The invention relates to the field of transformers, and in particular to a transformer fault type diagnosis method combining two machine learning methods, which greatly improves the accuracy of transformer fault type judgment. The method comprises the following steps: preprocessing training samples; inputting the preprocessed training samples into a BP neural network and training the BP neural network; fusing the plurality of feature vectors output by the BP neural network into one feature vector; inputting the fused feature vector into a CNN network and training the CNN network to obtain a transformer fault detection composite network; and judging the transformer fault type according to the transformer fault detection composite network. The method is suitable for judging the fault type of a transformer.

Description

Transformer fault type diagnosis method combining two machine learning methods
Technical Field
The invention relates to the field of transformers, in particular to a transformer fault type diagnosis method combining two machine learning methods.
Background
At present, much transformer fault diagnosis research is based on dissolved gas analysis (DGA), and existing diagnosis methods include the following. Transformer fault diagnosis based on the least-squares support vector machine (LS-SVM): when a support vector machine is used for transformer fault diagnosis, the DGA data obtained by chromatographic analysis are generally fed to the support vector machine as input data, and the corresponding diagnosis result is given after the support vector machine performs the classification; this approach has the advantages of fast diagnosis, high accuracy and so on;
the transformer fault diagnosis method based on the multi-class, multi-kernel learning support vector machine: compared with the traditional two-class method, it reduces the difficulty of constructing the network model and selecting its parameters, increases the solving speed and lowers the complexity of the problem. The extreme learning machine can be simply understood as a single-hidden-layer neural network; unlike an ordinary neural network, the inter-layer weights and neuron thresholds are generated randomly during training and are not changed afterwards, and the optimal performance is obtained only by adjusting the number of hidden-layer neuron nodes;
the transformer fault diagnosis method based on the weighted extreme learning machine: the weighted extreme learning machine improves the training ability and the generalization ability, and it can be combined with cross-validation to study how changes in its parameters affect the algorithm performance so that the optimal parameters are selected; compared with the SVM (support vector machine), the weighted extreme learning machine has higher classification accuracy on unbalanced data sets;
transformer fault diagnosis research based on the BP network: this is a supervised learning algorithm in which labeled data are used as the samples for network training and the learned model is used to predict the samples to be recognized. The structure of the BP network includes an input layer, a hidden layer, a fully connected layer and an output layer. The proportions of the various characteristic gases generated by the transformer relative to the total gas are input, and codes of the different fault types are output.
However, the transformer fault diagnosis method based on the support vector machine is essentially a two-class model; when it is used for a multi-class problem, a complex transformation process is often needed, which makes the construction of the network model more complicated;
the transformer fault diagnosis method based on the multi-class, multi-kernel learning support vector machine has a simple network structure and a high training speed, but its learning ability is insufficient;
the judgment accuracy of the transformer fault diagnosis method based on the weighted extreme learning machine is not high;
and in transformer fault diagnosis research based on the BP network, the input and output of the network are flat vectors, the learning effect is poor, and the judgment accuracy is not high.
Disclosure of Invention
The invention aims to provide a transformer fault type diagnosis method combining two machine learning methods, which greatly improves the accuracy of transformer fault type judgment.
The invention adopts the following technical scheme to achieve the above purpose: the transformer fault type diagnosis method combining two machine learning methods comprises the following steps:
step 1, preprocessing a training sample;
step 2, inputting the preprocessed training sample into a BP neural network, and training the BP neural network;
step 3, fusing a plurality of feature vectors output by the BP neural network into one feature vector;
step 4, inputting the fused feature vector into a CNN network, and training the CNN network to obtain a transformer fault detection composite network;
and step 5, judging the fault type of the transformer according to the transformer fault detection composite network.
Further, in step 1, the training samples are the various characteristic gases generated by the transformer oil under discharge or thermal fault conditions.
Further, in step 2, the specific method for inputting the preprocessed training samples into the BP neural network is:
the multiple characteristic gases are taken as the input data according to their set proportions, and the input data are provided in matrix form; the size of the matrix is n×1, n is a positive integer, and the elements of the matrix sum to 1.
Further, in step 2 or step 3, the BP neural network includes an input layer, a hidden layer and an output layer; an input signal enters the network structure from the input layer, the data passes through the hidden-layer nodes and reaches the output layer under the action of the activation function, and the final signal is output by the output layer.
Further, with the number of input-layer nodes of the BP network set to m and the number of output-layer nodes set to n, where m is an integer greater than zero, the hidden-layer node output formula is:
y_k = f1( Σ_i v_ki · x_i )
where v_ki is the weight from the input layer to the hidden layer, f1(x) is the transfer function of the hidden layer, x_i is the training sample input to the network, i = 1, 2, …, p, and p is the number of weights;
the node output formula of the output layer is:
o_j = f2( Σ_k w_jk · y_k ), j = 1, 2, …, n
where w_jk is the weight between the hidden layer and the output layer and f2(x) is the transfer function of the output layer.
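A minimal NumPy sketch of these two formulas follows; the sigmoid transfer functions, layer sizes and random weight initialization are illustrative assumptions rather than values fixed by the patent.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
m, hidden, n = 5, 8, 5              # assumed sizes: m inputs, 8 hidden nodes, n outputs
V = rng.normal(size=(hidden, m))    # v_ki: input -> hidden weights
W = rng.normal(size=(n, hidden))    # w_jk: hidden -> output weights

def bp_forward(x):
    """Hidden layer: y_k = f1(sum_i v_ki * x_i);
    output layer:   o_j = f2(sum_k w_jk * y_k)."""
    y = sigmoid(V @ x)   # hidden-layer output
    o = sigmoid(W @ y)   # output-layer output
    return y, o

x = np.full((m, 1), 1.0 / m)        # an n x 1 gas-proportion vector as input
_, o = bp_forward(x)
print(o.ravel())
```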
Further, in step 3, the plurality of feature vectors output by the BP neural network are fused into one feature vector by using the Concatenate vector splicing method.
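The fusion can be sketched as follows; treating the n BP output vectors as columns of an n×n matrix is one plausible reading of the splicing described here, so the exact arrangement should be taken as an assumption.

```python
import numpy as np

def concatenate_features(vectors):
    """Splice n feature vectors of shape (n, 1) into one n x n matrix.

    Each BP output vector becomes one column of the fused matrix, which
    is then used as the CNN input (assumed column-wise layout).
    """
    return np.concatenate(vectors, axis=1)

n = 5
vectors = [np.random.rand(n, 1) for _ in range(n)]   # n vectors from n BP passes
fused = concatenate_features(vectors)
print(fused.shape)   # (5, 5)
```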
Further, step 4 is preceded by:
judging the global error: if the global error meets the requirement, go to step 4; otherwise, back-calculate the general error of each sample and adjust the weights of the output layer and the hidden layer according to the global error and the general error of each sample;
the global error is calculated as:
E = Σ_p E_p = (1/2) · Σ_p Σ_j (d_j^p − o_j^p)²
where o_j^p is the actual output obtained after the training sample is input into the network, d_j^p is the desired output of the BP network, and E_p is the general error of the p-th sample;
the weight adjustment formula of the output layer is:
Δw_jk = η · δ_j^y · y_k
where δ_j^y is the error signal, δ_j^y = (d_j − o_j) · f2′(net_j);
the weight adjustment formula of the hidden layer is:
Δv_ki = η · δ_k^x · x_i, with δ_k^x = f1′(net_k) · Σ_j δ_j^y · w_jk.
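The following sketch shows a standard gradient-descent update consistent with the error signal and weight adjustments described above; the learning rate, sigmoid derivative and one-hot target are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bp_train_step(V, W, x, d, eta=0.1):
    """One BP update: forward pass, per-sample error E_p, then delta
    rules for the output-layer and hidden-layer weights."""
    y = sigmoid(V @ x)                   # hidden-layer output
    o = sigmoid(W @ y)                   # output-layer output
    E_p = 0.5 * np.sum((d - o) ** 2)     # general error of this sample

    delta_out = (d - o) * o * (1 - o)            # output-layer error signal
    delta_hid = (W.T @ delta_out) * y * (1 - y)  # back-propagated hidden error

    W += eta * delta_out @ y.T           # adjust hidden -> output weights
    V += eta * delta_hid @ x.T           # adjust input -> hidden weights
    return E_p

rng = np.random.default_rng(1)
V, W = rng.normal(size=(8, 5)), rng.normal(size=(5, 8))
x = np.full((5, 1), 0.2)
d = np.eye(5)[:, [2]]                    # assumed one-hot desired output
for _ in range(100):
    err = bp_train_step(V, W, x, d)
print(err)
```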
further, in step 4 or step 5, the CNN network includes an input layer, a convolutional layer, a pooling layer, and a full connection layer;
the formula for calculating the convolutional layer is as follows:
Figure BDA0003272804960000036
wherein
Figure BDA0003272804960000037
Denotes the kth convolution kernel of the convolution layer, N denotes the size of the convolution kernel, g denotes the number of rows of the convolution kernel, h denotes the number of columns of the convolution kernel,
Figure BDA0003272804960000038
is the bias coefficient, x, of the convolution layer corresponding to the convolution kernell-1Is the matrix data that is input to the device,
Figure BDA0003272804960000039
an element representing the ith row and jth column in the profile of the kth channel of the ith layer of the network, σ (x) being the activation function;
the pooling layer is used for performing downsampling operation on the input feature matrix, realizing feature dimension reduction, reducing the calculation amount of the rear layer, and setting the sampling size to be S1 multiplied by S2, so that the pooling operation formula is as follows:
Figure BDA00032728049600000310
x is a two-dimensional input vector, y is an output matrix after mean pooling, ypqOutputting data of a p row and a q column of the matrix for the pooling layer;
the full connectivity layer is used to linearly combine features throughout the network, and the formula of the full connectivity layer is:
Figure BDA00032728049600000311
wherein
Figure BDA00032728049600000312
The corresponding elements in the input matrix representing the fully connected layer,
Figure BDA00032728049600000313
expressed is a weight coefficient between jth neurons of the l-th layer of the l-1 th layer,
Figure BDA00032728049600000314
bias coefficients in the jth neuron of the ith layer are indicated.
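To make the three layer formulas concrete, here is a small NumPy sketch of a single-channel convolution, mean pooling and fully connected layer; the kernel size, pooling size and random weights are assumptions for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv2d(x, kernel, bias):
    """Valid convolution: out[i, j] = sigma(sum_gh k[g, h] * x[i+g, j+h] + b)."""
    N = kernel.shape[0]
    rows, cols = x.shape[0] - N + 1, x.shape[1] - N + 1
    out = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            out[i, j] = np.sum(kernel * x[i:i + N, j:j + N]) + bias
    return sigmoid(out)

def mean_pool(x, s1, s2):
    """y_pq = mean of the (s1 x s2) block of x corresponding to output (p, q)."""
    rows, cols = x.shape[0] // s1, x.shape[1] // s2
    return x[:rows * s1, :cols * s2].reshape(rows, s1, cols, s2).mean(axis=(1, 3))

def fully_connected(a_prev, weights, bias):
    """a_j = sigma(sum_i w_ij * a_i + b_j), with the feature map flattened."""
    return sigmoid(weights @ a_prev.reshape(-1, 1) + bias)

rng = np.random.default_rng(2)
x = rng.random((5, 5))                           # fused n x n input matrix (n = 5 assumed)
fmap = conv2d(x, rng.normal(size=(2, 2)), 0.1)   # 4 x 4 feature map
pooled = mean_pool(fmap, 2, 2)                   # 2 x 2 after mean pooling
scores = fully_connected(pooled, rng.normal(size=(5, pooled.size)), rng.normal(size=(5, 1)))
print(scores.ravel())
```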
Further, a softmax layer is arranged behind the fully connected layer to classify the output of the fully connected layer; the calculation formula of the softmax function is:
P(y = j | x) = e^{z_j} / Σ_{k'=1..k} e^{z_{k'}}, j = 1, 2, …, k
where P(y = j | x) is the probability that a sample belongs to the j-th of the k classes and z_j is the j-th output of the fully connected layer; each element of the matrix output by the softmax layer corresponds to the probability of the corresponding transformer fault.
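A short sketch of this softmax step follows; the max-subtraction for numerical stability is an implementation detail assumed here, not stated in the patent.

```python
import numpy as np

def softmax(z):
    """P(y = j | x) = exp(z_j) / sum_k exp(z_k); each output element is read
    as the probability of the corresponding transformer fault type."""
    z = z - np.max(z)     # numerical stability, result unchanged
    e = np.exp(z)
    return e / e.sum()

scores = np.array([1.2, -0.3, 0.4, 2.0, 0.1])   # assumed fully connected outputs
print(softmax(scores))                          # probabilities, sum to 1
```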
Further, after the softmax layer arranged behind the fully connected layer classifies the output of the fully connected layer, the method further comprises: judging the final error; if the final error meets the requirement, going to step 2; otherwise, calculating from the final error the error produced by each neuron in each layer of the CNN network, calculating the partial derivatives of the final error with respect to the weight and bias of each neuron in each layer, updating the parameters of each neuron in each layer of the CNN network according to these partial derivatives, and going to step 4;
the final error is calculated as:
C = (1/2) · Σ_x ‖ y_label − a^L(x) ‖²
where x denotes the input sample, y_label denotes the actual (label) value corresponding to the input sample, a^L(x) denotes the predicted value of the CNN network, and L is the maximum layer number of the CNN network;
the error produced by each neuron in each layer of the CNN network is calculated from the final error according to the formula (for the output layer L):
δ^L = ∇_a C ⊙ σ′(z^L)
where δ_j^l denotes the error produced by the j-th neuron in layer l, C denotes the final error of the network, and the symbol ⊙ denotes the Hadamard product, which represents the element-wise (dot) product between matrices;
the error is then propagated backwards layer by layer:
when calculating the neuron errors of a convolutional layer (whose following layer is a pooling layer):
δ^l = up(δ^{l+1}) ⊙ σ′(z^l)
when calculating the neuron errors of a pooling layer (whose following layer is a convolutional layer):
δ^l = (δ^{l+1} * rot180(w^{l+1})) ⊙ σ′(z^l)
when calculating the neuron errors of a fully connected layer:
δ^l = ((w^{l+1})^T · δ^{l+1}) ⊙ σ′(z^l)
where z^l denotes the input of the activation function of layer l, a^l = σ(z^l) denotes the activation-function output of layer l, up(·) is the upsampling operation corresponding to the pooling operation function down(x), and * denotes convolution;
the formulas for the partial derivatives of the final error with respect to the weight and bias of each neuron in each layer of the CNN network are:
∂C/∂w^l = a^{l−1} · δ^l,  ∂C/∂b^l = δ^l
and the formula for updating the parameters of each neuron of each layer is:
w_new = w_old − η · Σ_x a^{l−1} δ^l,  b_new = b_old − η · Σ_x δ^l
where η represents the learning rate, used to regulate the rate of convergence, w_old represents the weight before updating, w_new the updated weight, b_old the bias before updating, and b_new the updated bias.
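A minimal sketch of this update rule for one layer is given below; the gradient values here are placeholders standing in for the back-propagated errors δ^l described above.

```python
import numpy as np

def update_layer(w_old, b_old, a_prev, delta, eta=0.01):
    """w_new = w_old - eta * sum_x a^{l-1} delta^l ;
    b_new = b_old - eta * sum_x delta^l  (summing over the batch)."""
    grad_w = sum(d @ a.T for a, d in zip(a_prev, delta))   # sum_x a^{l-1} delta^l
    grad_b = sum(delta)                                    # sum_x delta^l
    return w_old - eta * grad_w, b_old - eta * grad_b

rng = np.random.default_rng(3)
w, b = rng.normal(size=(5, 4)), np.zeros((5, 1))
a_prev = [rng.random((4, 1)) for _ in range(3)]   # activations a^{l-1} for 3 samples
delta = [rng.random((5, 1)) for _ in range(3)]    # errors delta^l for the same samples
w_new, b_new = update_layer(w, b, a_prev, delta)
print(w_new.shape, b_new.shape)
```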
The invention combines the fast learning of the BP network on training samples with the strong feature-extraction capability of the CNN network to construct a composite neural network that uses the characteristic gases for probabilistic diagnosis of transformer faults, which greatly improves the accuracy of transformer fault type judgment; through data preprocessing, neuron saturation caused by differences in the magnitude of the input sample data is avoided, and the variables are given equal importance at the start of network training.
Drawings
Fig. 1 is a flow chart of the transformer fault type determination composite training of the present invention.
Detailed Description
The invention relates to a transformer fault type diagnosis method combining two machine learning methods, which comprises the following steps:
step 1, preprocessing a training sample;
step 2, inputting the preprocessed training sample into a BP neural network, and training the BP neural network;
step 3, fusing a plurality of feature vectors output by the BP neural network into one feature vector;
step 4, inputting the fused feature vector into a CNN network, and training the CNN network to obtain a transformer fault detection composite network;
and step 5, judging the fault type of the transformer according to the transformer fault detection composite network.
In step 1, the training samples are the various characteristic gases generated by the transformer oil under discharge or thermal fault conditions.
In step 2, the specific method for inputting the preprocessed training samples into the BP neural network is:
the multiple characteristic gases are taken as the input data according to their set proportions, and the input data are provided in matrix form; the size of the matrix is n×1, n is a positive integer, and the elements of the matrix sum to 1.
In step 2 or step 3, the BP (Back Propagation) neural network includes an input layer, a hidden layer and an output layer; nodes in adjacent layers are connected with each other while nodes within the same layer are not connected. An input signal enters the network structure from the input layer, the data passes through the hidden-layer nodes and reaches the output layer under the action of the activation function, and the final signal is output by the output layer.
With the number of input-layer nodes of the BP network set to m and the number of output-layer nodes set to n, where m is an integer greater than zero, the hidden-layer node output formula is:
y_k = f1( Σ_i v_ki · x_i )
where v_ki is the weight from the input layer to the hidden layer, f1(x) is the transfer function of the hidden layer, x_i is the training sample input to the network, i = 1, 2, …, p, and p is the number of weights;
the node output formula of the output layer is:
o_j = f2( Σ_k w_jk · y_k ), j = 1, 2, …, n
where w_jk is the weight between the hidden layer and the output layer and f2(x) is the transfer function of the output layer.
In the step 3, a plurality of feature vectors output by the BP neural network are fused into one feature vector by adopting a Concatenate vector splicing method.
The following steps are performed before step 4:
judging the global error: if the global error meets the requirement, go to step 4; otherwise, back-calculate the general error of each sample and adjust the weights of the output layer and the hidden layer according to the global error and the general error of each sample;
the global error is calculated as:
E = Σ_p E_p = (1/2) · Σ_p Σ_j (d_j^p − o_j^p)²
where o_j^p is the actual output obtained after the training sample is input into the network, d_j^p is the desired output of the BP network, and E_p is the general error of the p-th sample;
the weight adjustment formula of the output layer is:
Δw_jk = η · δ_j^y · y_k
where δ_j^y is the error signal, δ_j^y = (d_j − o_j) · f2′(net_j);
the weight adjustment formula of the hidden layer is:
Δv_ki = η · δ_k^x · x_i, with δ_k^x = f1′(net_k) · Σ_j δ_j^y · w_jk.
in step 4 or step 5, the cnn (volumetric Neural networks) network includes an input layer, a convolutional layer, a pooling layer, and a full connection layer; the convolutional layer is a parameter set composed of a plurality of different convolutional kernels, and can extract corresponding features of an input matrix by using different convolutional kernels for identification and classification. Different types of convolution kernels have the capacity of local feature extraction, while the same type of convolution kernels can share the weight of the network, and the two characteristics of the convolution kernels enable the connection number of the convolution neural network to be larger than the number of the network weight, so that the complexity of the network is effectively reduced.
The calculation formula of the convolutional layer is:
x_{i,j}^{l,k} = σ( Σ_{g=1..N} Σ_{h=1..N} w_{g,h}^{l,k} · x_{i+g−1, j+h−1}^{l−1} + b^{l,k} )
where w^{l,k} denotes the k-th convolution kernel of layer l, N denotes the size of the convolution kernel, g and h index the rows and columns of the convolution kernel, b^{l,k} is the bias coefficient of the convolutional layer corresponding to that convolution kernel, x^{l−1} is the input matrix data, x_{i,j}^{l,k} denotes the element in row i and column j of the feature map of the k-th channel of layer l of the network, and σ(x) is the activation function;
the pooling layer performs a downsampling operation on the input feature matrix to reduce the feature dimension and the amount of computation in the following layers; with the sampling size set to S1 × S2, the pooling operation formula is:
y_pq = (1 / (S1 · S2)) · Σ_{i=(p−1)·S1+1..p·S1} Σ_{j=(q−1)·S2+1..q·S2} x_{i,j}
where x is the two-dimensional input, y is the output matrix after mean pooling, and y_pq is the element in row p and column q of the pooling-layer output matrix;
the fully connected layer is used to linearly combine the features of the whole network, and its formula is:
a_j^l = σ( Σ_i w_{i,j}^l · a_i^{l−1} + b_j^l )
where a_i^{l−1} denotes the corresponding element of the input matrix of the fully connected layer, w_{i,j}^l denotes the weight coefficient between the i-th neuron of layer l−1 and the j-th neuron of layer l, and b_j^l denotes the bias coefficient of the j-th neuron of layer l.
A softmax layer is arranged behind the fully connected layer to classify the output of the fully connected layer; the calculation formula of the softmax function is:
P(y = j | x) = e^{z_j} / Σ_{k'=1..k} e^{z_{k'}}, j = 1, 2, …, k
where P(y = j | x) is the probability that a sample belongs to the j-th of the k classes and z_j is the j-th output of the fully connected layer; each element of the matrix output by the softmax layer corresponds to the probability of the corresponding transformer fault.
After the softmax layer arranged behind the fully connected layer classifies the output of the fully connected layer, the method further comprises: judging the final error; if the final error meets the requirement, going to step 2; otherwise, calculating from the final error the error produced by each neuron in each layer of the CNN network, calculating the partial derivatives of the final error with respect to the weight and bias of each neuron in each layer, updating the parameters of each neuron in each layer of the CNN network according to these partial derivatives, and going to step 4;
the final error is calculated as:
C = (1/2) · Σ_x ‖ y_label − a^L(x) ‖²
where x denotes the input sample, y_label denotes the actual (label) value corresponding to the input sample, a^L(x) denotes the predicted value of the CNN network, and L is the maximum layer number of the CNN network;
the error produced by each neuron in each layer of the CNN network is calculated from the final error according to the formula (for the output layer L):
δ^L = ∇_a C ⊙ σ′(z^L)
where δ_j^l denotes the error produced by the j-th neuron in layer l, C denotes the final error of the network, and the symbol ⊙ denotes the Hadamard product, which represents the element-wise (dot) product between matrices;
the error is then propagated backwards layer by layer:
when calculating the neuron errors of a convolutional layer (whose following layer is a pooling layer):
δ^l = up(δ^{l+1}) ⊙ σ′(z^l)
when calculating the neuron errors of a pooling layer (whose following layer is a convolutional layer):
δ^l = (δ^{l+1} * rot180(w^{l+1})) ⊙ σ′(z^l)
when calculating the neuron errors of a fully connected layer:
δ^l = ((w^{l+1})^T · δ^{l+1}) ⊙ σ′(z^l)
where z^l denotes the input of the activation function of layer l, a^l = σ(z^l) denotes the activation-function output of layer l, up(·) is the upsampling operation corresponding to the pooling operation function down(x), and * denotes convolution;
the formulas for the partial derivatives of the final error with respect to the weight and bias of each neuron in each layer of the CNN network are:
∂C/∂w^l = a^{l−1} · δ^l,  ∂C/∂b^l = δ^l
and the formula for updating the parameters of each neuron of each layer is:
w_new = w_old − η · Σ_x a^{l−1} δ^l,  b_new = b_old − η · Σ_x δ^l
where η represents the learning rate, used to regulate the rate of convergence, w_old represents the weight before updating, w_new the updated weight, b_old the bias before updating, and b_new the updated bias.
Fig. 1 is a flow chart of the composite training for judging the fault type of the transformer, and the composite network training process comprises the following steps:
preprocessing the collected training samples: the input data is the proportion of five characteristic gases in the total gas volume, and the size of an input matrix is n multiplied by 1. The ideal output result is a fault type proportion matrix, the size of the output matrix is n multiplied by 1, and the sum of each element in the matrix is 1. And inputting the training samples into a BP (Back propagation) network (BP neural network) to train the BP neural network. In the training process of the BP neural network, the training function can describe training result information and an error change curve through interval steps of displaying results. When the network training step number exceeds the preset maximum training step number, the network training is stopped, or the network training error is smaller than the set target error value, and the network also stops training.
During BP network training, the BP network is first initialized; then the outputs of all hidden-layer units and all output-layer units are computed in turn; the overall error E (i.e. the global error) is computed over the whole sample set; whether E meets the requirement (i.e. whether it is within the threshold range) is judged; if not, the generalized errors of all units are back-calculated, the weights of each layer are adjusted, and the process returns to the step after BP network initialization;
if the requirement is met, n feature vectors output by the BP network are fused into one feature vector (n multiplied by n) by using a Concatenate vector splicing method, and the specific operation method is to splice the feature vectors. The prior knowledge is that the feature vectors which are mutually fused are carried out in the feature space with the same dimensionality, the feature modes are similar, and the properties are similar. And inputting the fused feature vector into a CNN network for network training as an input data matrix, wherein an ideal output result is a fault type proportion matrix, the size of the output matrix is n multiplied by 1, the sum of elements is 1, and each element represents each possible fault probability. After the two networks are trained respectively, the same training sample is used for training the composite network, and the transformer fault diagnosis error of the whole network is reduced.
During CNN network training, the CNN network is first initialized; then the convolutional-layer output, pooling-layer output and fully-connected-layer output are computed in turn; a softmax layer behind the fully connected layer classifies the output of the fully connected layer; after classification, the final error is judged; if the error requirement or the required number of training iterations is met, the process returns to BP network initialization for composite network training; if not, the network weights are adjusted by back propagation, i.e. the parameters of each neuron in each layer of the CNN network are updated, and the process returns to the step after CNN network initialization.
A sample after data preprocessing (a column matrix, n×1) is processed n times by the BP network, the results are spliced by Concatenate into a new matrix (n×n), and the new matrix is then processed by the CNN network to output a column matrix (5×1) that represents the possible proportions of the various transformer fault types.
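An end-to-end sketch of this data flow is shown below; the BP and CNN components are stand-ins (random weights, fixed sizes) intended only to make the n×1 → n×n → 5×1 shape pipeline explicit, not a trained implementation of the patented network.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Stand-in BP network: n x 1 proportion vector in, n x 1 feature vector out.
V, W = rng.normal(size=(8, n)), rng.normal(size=(n, 8))
bp_pass = lambda x: sigmoid(W @ sigmoid(V @ x))

# Stand-in "CNN": flatten the fused n x n matrix, one linear layer + softmax.
Wc = rng.normal(size=(5, n * n))
def cnn_pass(m):
    z = Wc @ m.reshape(-1, 1)
    e = np.exp(z - z.max())
    return e / e.sum()

x = np.full((n, 1), 1.0 / n)                 # preprocessed gas proportions
features = [bp_pass(x) for _ in range(n)]    # n passes through the BP network
fused = np.concatenate(features, axis=1)     # Concatenate -> n x n matrix
fault_probs = cnn_pass(fused)                # 5 x 1 fault-probability column
print(fault_probs.ravel(), fault_probs.sum())
```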
The labeled data samples used for learning are converted, through preprocessing, into the proportions of the characteristic gas contents in the total produced gas and arranged from top to bottom into a column matrix; if more than one fault type exists, the predicted output is a column vector (whose elements sum to 1) expressing the proportions of the various transformer faults.
The invention combines the fast learning of the BP network on training samples with the strong feature-extraction capability of the CNN network to construct a composite neural network that uses the characteristic gases for probabilistic diagnosis of transformer faults. The existing BP network usually shows a fast and efficient learning effect on single-classification problems and has a simple, clear structure. When handling multi-classification problems, however, it is often one-sided, tends to fall into a local optimum under the influence of the training sample set, and its accuracy is often not high, so it cannot provide very accurate transformer fault type judgment. The CNN network is widely used in image recognition; its characteristic is that convolution kernels of different sizes slide over the input matrix and perform convolution on the data (or pixels) of the covered area to obtain the final result. The CNN network can extract a large number of sample features, but for a flat input matrix the range of information it can extract is limited and the optimal learning effect cannot be achieved. Therefore, by combining the BP network and the CNN network, the output vectors obtained from multiple passes through the BP network are combined into an n×n matrix and input into the CNN network for convolution, and after the fully connected layer the softmax function produces the probabilities of the various transformer faults.
Through data preprocessing, neuron saturation caused by differences in the magnitude of the input sample data is avoided, and the variables are given equal importance at the start of network training.
Considering that in practice a transformer fault is often not a single fault but a combination of various types of discharge and overheating, and compared with the prior art in which the output is the fault-type code of a single diagnosis result, the method can obtain the probabilities of the multiple possible faults of the transformer.
In conclusion, the method and the device greatly improve the accuracy of judging the fault type of the transformer.

Claims (10)

1. The transformer fault type diagnosis method combining the two machine learning methods is characterized by comprising the following steps of:
step 1, preprocessing a training sample;
step 2, inputting the preprocessed training sample into a BP neural network, and training the BP neural network;
step 3, fusing a plurality of feature vectors output by the BP neural network into one feature vector;
step 4, inputting the fused feature vector into a CNN network, and training the CNN network to obtain a transformer fault detection composite network;
and step 5, judging the fault type of the transformer according to the transformer fault detection composite network.
2. The method for diagnosing the fault type of the transformer by combining the two machine learning methods as claimed in claim 1, wherein in the step 1, the training samples are a plurality of characteristic gases generated by transformer oil in the event of discharge or thermal fault.
3. The method for diagnosing the fault type of the transformer by combining the two machine learning methods according to claim 2, wherein in the step 2, the specific method for inputting the preprocessed training samples into the BP neural network comprises:
the method comprises the steps of taking multiple characteristic gases as input data according to a set proportion, inputting the input data in a matrix form, setting the size of the matrix to be n x 1, setting n to be a positive integer, and adding each element in the matrix to form a sum of 1.
4. The method for diagnosing the fault type of the transformer by combining the two machine learning methods according to claim 3, wherein in step 2 or step 3, the BP neural network comprises an input layer, a hidden layer and an output layer, an input signal enters the network structure from the input layer, after passing through nodes of the hidden layer, data reaches the output layer through the action of an activation function, and a final signal is output by the output layer.
5. The transformer fault type diagnosis method combining two machine learning methods according to claim 4, wherein the number of input-layer nodes of the BP network is set to m and the number of output-layer nodes to n, with m an integer greater than zero, and the hidden-layer node output formula is:
y_k = f1( Σ_i v_ki · x_i )
where v_ki is the weight from the input layer to the hidden layer, f1(x) is the transfer function of the hidden layer, x_i is the training sample input to the network, i = 1, 2, …, p, and p is the number of weights;
the node output formula of the output layer is:
o_j = f2( Σ_k w_jk · y_k ), j = 1, 2, …, n
where w_jk is the weight between the hidden layer and the output layer and f2(x) is the transfer function of the output layer.
6. The transformer fault type diagnosis method combining the two machine learning methods according to claim 5, wherein in step 3, a plurality of feature vectors output by the BP neural network are fused into one feature vector by using a Concatenate vector splicing method.
7. The method for diagnosing the fault type of the transformer by combining the two machine learning methods according to claim 6, wherein the step 4 is preceded by the steps of:
judging the global error: if the global error meets the requirement, go to step 4; otherwise, back-calculate the general error of each sample and adjust the weights of the output layer and the hidden layer according to the global error and the general error of each sample;
the global error is calculated as:
E = Σ_p E_p = (1/2) · Σ_p Σ_j (d_j^p − o_j^p)²
where o_j^p is the actual output obtained after the training sample is input into the network, d_j^p is the desired output of the BP network, and E_p is the general error of the p-th sample;
the weight adjustment formula of the output layer is:
Δw_jk = η · δ_j^y · y_k
where δ_j^y is the error signal, δ_j^y = (d_j − o_j) · f2′(net_j);
the weight adjustment formula of the hidden layer is:
Δv_ki = η · δ_k^x · x_i, with δ_k^x = f1′(net_k) · Σ_j δ_j^y · w_jk.
8. the transformer fault type diagnosis method combining two machine learning methods according to claim 7, wherein in step 4 or step 5, the CNN network comprises an input layer, a convolutional layer, a pooling layer and a full connection layer;
the calculation formula of the convolutional layer is:
x_{i,j}^{l,k} = σ( Σ_{g=1..N} Σ_{h=1..N} w_{g,h}^{l,k} · x_{i+g−1, j+h−1}^{l−1} + b^{l,k} )
where w^{l,k} denotes the k-th convolution kernel of layer l, N denotes the size of the convolution kernel, g and h index the rows and columns of the convolution kernel, b^{l,k} is the bias coefficient of the convolutional layer corresponding to that convolution kernel, x^{l−1} is the input matrix data, x_{i,j}^{l,k} denotes the element in row i and column j of the feature map of the k-th channel of layer l of the network, and σ(x) is the activation function;
the pooling layer performs a downsampling operation on the input feature matrix to reduce the feature dimension and the amount of computation in the following layers; with the sampling size set to S1 × S2, the pooling operation formula is:
y_pq = (1 / (S1 · S2)) · Σ_{i=(p−1)·S1+1..p·S1} Σ_{j=(q−1)·S2+1..q·S2} x_{i,j}
where x is the two-dimensional input, y is the output matrix after mean pooling, and y_pq is the element in row p and column q of the pooling-layer output matrix;
the fully connected layer is used to linearly combine the features of the whole network, and its formula is:
a_j^l = σ( Σ_i w_{i,j}^l · a_i^{l−1} + b_j^l )
where a_i^{l−1} denotes the corresponding element of the input matrix of the fully connected layer, w_{i,j}^l denotes the weight coefficient between the i-th neuron of layer l−1 and the j-th neuron of layer l, and b_j^l denotes the bias coefficient of the j-th neuron of layer l.
9. The transformer fault type diagnosis method combining the two machine learning methods according to claim 8, wherein a softmax layer is arranged behind the full connection layer to classify the output results of the full connection layer, and the calculation formula of the softmax function is as follows:
P(y = j | x) = e^{z_j} / Σ_{k'=1..k} e^{z_{k'}}, j = 1, 2, …, k
where P(y = j | x) is the probability that a sample belongs to the j-th of the k classes and z_j is the j-th output of the fully connected layer; each element of the matrix output by the softmax layer corresponds to the probability of the corresponding transformer fault.
10. The transformer fault type diagnosis method combining the two machine learning methods according to claim 9, wherein after the softmax layer arranged behind the fully connected layer classifies the output of the fully connected layer, the method further comprises: judging the final error; if the final error meets the requirement, going to step 2; otherwise, calculating from the final error the error produced by each neuron in each layer of the CNN network, calculating the partial derivatives of the final error with respect to the weight and bias of each neuron in each layer, updating the parameters of each neuron in each layer of the CNN network according to these partial derivatives, and going to step 4;
the final error is calculated as:
C = (1/2) · Σ_x ‖ y_label − a^L(x) ‖²
where x denotes the input sample, y_label denotes the actual (label) value corresponding to the input sample, a^L(x) denotes the predicted value of the CNN network, and L is the maximum layer number of the CNN network;
the error produced by each neuron in each layer of the CNN network is calculated from the final error according to the formula (for the output layer L):
δ^L = ∇_a C ⊙ σ′(z^L)
where δ_j^l denotes the error produced by the j-th neuron in layer l, C denotes the final error of the network, and the symbol ⊙ denotes the Hadamard product, which represents the element-wise (dot) product between matrices;
the error is then propagated backwards layer by layer:
when calculating the neuron errors of a convolutional layer (whose following layer is a pooling layer):
δ^l = up(δ^{l+1}) ⊙ σ′(z^l)
when calculating the neuron errors of a pooling layer (whose following layer is a convolutional layer):
δ^l = (δ^{l+1} * rot180(w^{l+1})) ⊙ σ′(z^l)
when calculating the neuron errors of a fully connected layer:
δ^l = ((w^{l+1})^T · δ^{l+1}) ⊙ σ′(z^l)
where z^l denotes the input of the activation function of layer l, a^l = σ(z^l) denotes the activation-function output of layer l, up(·) is the upsampling operation corresponding to the pooling operation function down(x), and * denotes convolution;
the formulas for the partial derivatives of the final error with respect to the weight and bias of each neuron in each layer of the CNN network are:
∂C/∂w^l = a^{l−1} · δ^l,  ∂C/∂b^l = δ^l
and the formula for updating the parameters of each neuron of each layer is:
w_new = w_old − η · Σ_x a^{l−1} δ^l,  b_new = b_old − η · Σ_x δ^l
where η represents the learning rate, used to regulate the rate of convergence, w_old represents the weight before updating, w_new the updated weight, b_old the bias before updating, and b_new the updated bias.
CN202111106947.7A 2021-09-22 2021-09-22 Transformer fault type diagnosis method combining two machine learning methods Pending CN113822417A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111106947.7A CN113822417A (en) 2021-09-22 2021-09-22 Transformer fault type diagnosis method combining two machine learning methods

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111106947.7A CN113822417A (en) 2021-09-22 2021-09-22 Transformer fault type diagnosis method combining two machine learning methods

Publications (1)

Publication Number Publication Date
CN113822417A true CN113822417A (en) 2021-12-21

Family

ID=78920790

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111106947.7A Pending CN113822417A (en) 2021-09-22 2021-09-22 Transformer fault type diagnosis method combining two machine learning methods

Country Status (1)

Country Link
CN (1) CN113822417A (en)

Citations (4)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109100648A (en) * 2018-05-16 2018-12-28 上海海事大学 Ocean current generator impeller based on CNN-ARMA-Softmax winds failure fusion diagnosis method
CN109116203A (en) * 2018-10-31 2019-01-01 红相股份有限公司 Power equipment partial discharges fault diagnostic method based on convolutional neural networks
CN111127423A (en) * 2019-12-23 2020-05-08 金陵科技学院 Rice pest and disease identification method based on CNN-BP neural network algorithm
CN112284735A (en) * 2020-10-21 2021-01-29 兰州理工大学 Multi-sensor rolling bearing fault diagnosis based on one-dimensional convolution and dynamic routing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨雨洁 et al.: "Application of an RBF-BP composite neural network in transformer fault diagnosis"

Similar Documents

Publication Publication Date Title
CN113011499B (en) Hyperspectral remote sensing image classification method based on double-attention machine system
CN110287983B (en) Single-classifier anomaly detection method based on maximum correlation entropy deep neural network
CN108846426B (en) Polarization SAR classification method based on deep bidirectional LSTM twin network
CN109116834B (en) Intermittent process fault detection method based on deep learning
CN111665819A (en) Deep learning multi-model fusion-based complex chemical process fault diagnosis method
CN113723010B (en) Bridge damage early warning method based on LSTM temperature-displacement correlation model
CN106548230A (en) Diagnosis Method of Transformer Faults based on Modified particle swarm optimization neutral net
CN110287777B (en) Golden monkey body segmentation algorithm in natural scene
CN113705526A (en) Hyperspectral remote sensing image classification method
CN111046961B (en) Fault classification method based on bidirectional long-time and short-time memory unit and capsule network
CN114488140B (en) Small sample radar one-dimensional image target recognition method based on deep migration learning
CN112070128A (en) Transformer fault diagnosis method based on deep learning
CN107832789B (en) Feature weighting K nearest neighbor fault diagnosis method based on average influence value data transformation
CN112381763A (en) Surface defect detection method
CN112381179A (en) Heterogeneous graph classification method based on double-layer attention mechanism
CN110119805B (en) Convolutional neural network algorithm based on echo state network classification
CN109740695A (en) Image-recognizing method based on adaptive full convolution attention network
CN107423705A (en) SAR image target recognition method based on multilayer probability statistics model
CN112766283B (en) Two-phase flow pattern identification method based on multi-scale convolution network
CN111275165A (en) Network intrusion detection method based on improved convolutional neural network
CN112766603A (en) Traffic flow prediction method, system, computer device and storage medium
CN113901448A (en) Intrusion detection method based on convolutional neural network and lightweight gradient elevator
CN113947182A (en) Traffic flow prediction model construction method based on double-stage stack graph convolution network
CN109145685B (en) Fruit and vegetable hyperspectral quality detection method based on ensemble learning
CN115965864A (en) Lightweight attention mechanism network for crop disease identification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20211221)