CN113822417A - Transformer fault type diagnosis method combining two machine learning methods - Google Patents

Transformer fault type diagnosis method combining two machine learning methods

Info

Publication number
CN113822417A
Authority
CN
China
Prior art keywords
layer
network
output
input
transformer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111106947.7A
Other languages
Chinese (zh)
Inventor
薛凯丹
王心琦
王耀辉
郑贤宇
李强
黄凯
秦莹
李寅伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
PowerChina Chengdu Engineering Co Ltd
Original Assignee
PowerChina Chengdu Engineering Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by PowerChina Chengdu Engineering Co Ltd filed Critical PowerChina Chengdu Engineering Co Ltd
Priority to CN202111106947.7A
Publication of CN113822417A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/11 Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
    • G06F17/13 Differential equations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Algebra (AREA)
  • Economics (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Human Resources & Organizations (AREA)
  • Tourism & Hospitality (AREA)
  • Water Supply & Treatment (AREA)
  • Medical Informatics (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Public Health (AREA)
  • General Business, Economics & Management (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)

Abstract

The invention relates to the field of transformers, and in particular to a transformer fault type diagnosis method combining two machine learning methods, which greatly improves the accuracy of transformer fault type judgment. The method comprises the following steps: preprocessing training samples; inputting the preprocessed training samples into a BP neural network and training the BP neural network; fusing the plurality of feature vectors output by the BP neural network into one feature vector; inputting the fused feature vector into a CNN network and training the CNN network to obtain a transformer fault detection composite network; and judging the transformer fault type according to the transformer fault detection composite network. The method is suitable for judging the fault type of a transformer.

Description

Transformer fault type diagnosis method combining two machine learning methods
Technical Field
The invention relates to the field of transformers, in particular to a transformer fault type diagnosis method combining two machine learning methods.
Background
At present, much transformer fault diagnosis research is based on dissolved gas analysis (DGA), and existing diagnosis methods include the following. Transformer fault diagnosis based on the least-squares support vector machine (LS-SVM): when a support vector machine is used for transformer fault diagnosis, the DGA data obtained by chromatographic analysis are generally fed to the support vector machine as input data, and the corresponding diagnosis result is given after the support vector machine performs the classification; this approach has the advantages of fast diagnosis, high accuracy and so on;
the transformer fault diagnosis method based on the multi-class, multi-kernel learning support vector machine: compared with the traditional two-class method, it reduces the difficulty of constructing the network model and selecting its parameters, increases the solving speed and lowers the complexity of the problem. The extreme learning machine can be simply understood as a single-hidden-layer neural network; unlike an ordinary neural network, the inter-layer weights and neuron thresholds are generated randomly during training and are not changed afterwards, and the optimal performance is obtained only by adjusting the number of hidden-layer neuron nodes;
the transformer fault diagnosis method based on the weighted extreme learning machine: the weighted extreme learning machine improves the training ability and the generalization ability, and it can be combined with cross-validation to study how changes in its parameters affect the algorithm performance so that the optimal parameters are selected; compared with the SVM (support vector machine), the weighted extreme learning machine has higher classification accuracy on unbalanced data sets;
transformer fault diagnosis research based on the BP network: this is a supervised learning algorithm in which labeled data are used as the samples for network training and the learned model is used to predict the samples to be recognized. The structure of the BP network includes an input layer, a hidden layer, a fully connected layer and an output layer. The proportions of the various characteristic gases generated by the transformer relative to the total gas are input, and codes of the different fault types are output.
However, the transformer fault diagnosis method based on the support vector machine is essentially a two-class model; when it is used for a multi-class problem, a complex transformation process is often needed, which makes the construction of the network model more complicated;
the transformer fault diagnosis method based on the multi-class, multi-kernel learning support vector machine has a simple network structure and a high training speed, but its learning ability is insufficient;
the judgment accuracy of the transformer fault diagnosis method based on the weighted extreme learning machine is not high;
and in transformer fault diagnosis research based on the BP network, the input and output of the network are flat vectors, the learning effect is poor, and the judgment accuracy is not high.
Disclosure of Invention
The invention aims to provide a transformer fault type diagnosis method combining two machine learning methods, which greatly improves the accuracy of transformer fault type judgment.
The invention adopts the following technical scheme to achieve the above purpose: the transformer fault type diagnosis method combining two machine learning methods comprises the following steps:
step 1, preprocessing a training sample;
step 2, inputting the preprocessed training sample into a BP neural network, and training the BP neural network;
step 3, fusing a plurality of feature vectors output by the BP neural network into one feature vector;
step 4, inputting the fused feature vector into a CNN network, and training the CNN network to obtain a transformer fault detection composite network;
and step 5, judging the fault type of the transformer according to the transformer fault detection composite network.
Further, in step 1, the training samples are the various characteristic gases generated by the transformer oil under discharge or thermal fault conditions.
Further, in step 2, the specific method for inputting the preprocessed training samples into the BP neural network is:
the multiple characteristic gases are taken as the input data according to their set proportions, and the input data are provided in matrix form; the size of the matrix is n×1, n is a positive integer, and the elements of the matrix sum to 1.
Further, in step 2 or step 3, the BP neural network includes an input layer, a hidden layer and an output layer; an input signal enters the network structure from the input layer, the data passes through the hidden-layer nodes and reaches the output layer under the action of the activation function, and the final signal is output by the output layer.
Further, with the number of input-layer nodes of the BP network set to m and the number of output-layer nodes set to n, where m is an integer greater than zero, the hidden-layer node output formula is:
y_k = f1( Σ_i v_ki · x_i )
where v_ki is the weight from the input layer to the hidden layer, f1(x) is the transfer function of the hidden layer, x_i is the training sample input to the network, i = 1, 2, …, p, and p is the number of weights;
the node output formula of the output layer is:
o_j = f2( Σ_k w_jk · y_k ), j = 1, 2, …, n
where w_jk is the weight between the hidden layer and the output layer and f2(x) is the transfer function of the output layer.
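A minimal NumPy sketch of these two formulas follows; the sigmoid transfer functions, layer sizes and random weight initialization are illustrative assumptions rather than values fixed by the patent.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
m, hidden, n = 5, 8, 5              # assumed sizes: m inputs, 8 hidden nodes, n outputs
V = rng.normal(size=(hidden, m))    # v_ki: input -> hidden weights
W = rng.normal(size=(n, hidden))    # w_jk: hidden -> output weights

def bp_forward(x):
    """Hidden layer: y_k = f1(sum_i v_ki * x_i);
    output layer:   o_j = f2(sum_k w_jk * y_k)."""
    y = sigmoid(V @ x)   # hidden-layer output
    o = sigmoid(W @ y)   # output-layer output
    return y, o

x = np.full((m, 1), 1.0 / m)        # an n x 1 gas-proportion vector as input
_, o = bp_forward(x)
print(o.ravel())
```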
Further, in step 3, the plurality of feature vectors output by the BP neural network are fused into one feature vector by using the Concatenate vector splicing method.
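The fusion can be sketched as follows; treating the n BP output vectors as columns of an n×n matrix is one plausible reading of the splicing described here, so the exact arrangement should be taken as an assumption.

```python
import numpy as np

def concatenate_features(vectors):
    """Splice n feature vectors of shape (n, 1) into one n x n matrix.

    Each BP output vector becomes one column of the fused matrix, which
    is then used as the CNN input (assumed column-wise layout).
    """
    return np.concatenate(vectors, axis=1)

n = 5
vectors = [np.random.rand(n, 1) for _ in range(n)]   # n vectors from n BP passes
fused = concatenate_features(vectors)
print(fused.shape)   # (5, 5)
```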
Further, step 4 is preceded by:
judging the global error: if the global error meets the requirement, go to step 4; otherwise, back-calculate the general error of each sample and adjust the weights of the output layer and the hidden layer according to the global error and the general error of each sample;
the global error is calculated as:
E = Σ_p E_p = (1/2) · Σ_p Σ_j (d_j^p − o_j^p)²
where o_j^p is the actual output obtained after the training sample is input into the network, d_j^p is the desired output of the BP network, and E_p is the general error of the p-th sample;
the weight adjustment formula of the output layer is:
Δw_jk = η · δ_j^y · y_k
where δ_j^y is the error signal, δ_j^y = (d_j − o_j) · f2′(net_j);
the weight adjustment formula of the hidden layer is:
Δv_ki = η · δ_k^x · x_i, with δ_k^x = f1′(net_k) · Σ_j δ_j^y · w_jk.
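The following sketch shows a standard gradient-descent update consistent with the error signal and weight adjustments described above; the learning rate, sigmoid derivative and one-hot target are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bp_train_step(V, W, x, d, eta=0.1):
    """One BP update: forward pass, per-sample error E_p, then delta
    rules for the output-layer and hidden-layer weights."""
    y = sigmoid(V @ x)                   # hidden-layer output
    o = sigmoid(W @ y)                   # output-layer output
    E_p = 0.5 * np.sum((d - o) ** 2)     # general error of this sample

    delta_out = (d - o) * o * (1 - o)            # output-layer error signal
    delta_hid = (W.T @ delta_out) * y * (1 - y)  # back-propagated hidden error

    W += eta * delta_out @ y.T           # adjust hidden -> output weights
    V += eta * delta_hid @ x.T           # adjust input -> hidden weights
    return E_p

rng = np.random.default_rng(1)
V, W = rng.normal(size=(8, 5)), rng.normal(size=(5, 8))
x = np.full((5, 1), 0.2)
d = np.eye(5)[:, [2]]                    # assumed one-hot desired output
for _ in range(100):
    err = bp_train_step(V, W, x, d)
print(err)
```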
further, in step 4 or step 5, the CNN network includes an input layer, a convolutional layer, a pooling layer, and a full connection layer;
the formula for calculating the convolutional layer is as follows:
Figure BDA0003272804960000036
wherein
Figure BDA0003272804960000037
Denotes the kth convolution kernel of the convolution layer, N denotes the size of the convolution kernel, g denotes the number of rows of the convolution kernel, h denotes the number of columns of the convolution kernel,
Figure BDA0003272804960000038
is the bias coefficient, x, of the convolution layer corresponding to the convolution kernell-1Is the matrix data that is input to the device,
Figure BDA0003272804960000039
an element representing the ith row and jth column in the profile of the kth channel of the ith layer of the network, σ (x) being the activation function;
the pooling layer is used for performing downsampling operation on the input feature matrix, realizing feature dimension reduction, reducing the calculation amount of the rear layer, and setting the sampling size to be S1 multiplied by S2, so that the pooling operation formula is as follows:
Figure BDA00032728049600000310
x is a two-dimensional input vector, y is an output matrix after mean pooling, ypqOutputting data of a p row and a q column of the matrix for the pooling layer;
the full connectivity layer is used to linearly combine features throughout the network, and the formula of the full connectivity layer is:
Figure BDA00032728049600000311
wherein
Figure BDA00032728049600000312
The corresponding elements in the input matrix representing the fully connected layer,
Figure BDA00032728049600000313
expressed is a weight coefficient between jth neurons of the l-th layer of the l-1 th layer,
Figure BDA00032728049600000314
bias coefficients in the jth neuron of the ith layer are indicated.
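To make the three layer formulas concrete, here is a small NumPy sketch of a single-channel convolution, mean pooling and fully connected layer; the kernel size, pooling size and random weights are assumptions for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv2d(x, kernel, bias):
    """Valid convolution: out[i, j] = sigma(sum_gh k[g, h] * x[i+g, j+h] + b)."""
    N = kernel.shape[0]
    rows, cols = x.shape[0] - N + 1, x.shape[1] - N + 1
    out = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            out[i, j] = np.sum(kernel * x[i:i + N, j:j + N]) + bias
    return sigmoid(out)

def mean_pool(x, s1, s2):
    """y_pq = mean of the (s1 x s2) block of x corresponding to output (p, q)."""
    rows, cols = x.shape[0] // s1, x.shape[1] // s2
    return x[:rows * s1, :cols * s2].reshape(rows, s1, cols, s2).mean(axis=(1, 3))

def fully_connected(a_prev, weights, bias):
    """a_j = sigma(sum_i w_ij * a_i + b_j), with the feature map flattened."""
    return sigmoid(weights @ a_prev.reshape(-1, 1) + bias)

rng = np.random.default_rng(2)
x = rng.random((5, 5))                           # fused n x n input matrix (n = 5 assumed)
fmap = conv2d(x, rng.normal(size=(2, 2)), 0.1)   # 4 x 4 feature map
pooled = mean_pool(fmap, 2, 2)                   # 2 x 2 after mean pooling
scores = fully_connected(pooled, rng.normal(size=(5, pooled.size)), rng.normal(size=(5, 1)))
print(scores.ravel())
```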
Further, a softmax layer is arranged behind the fully connected layer to classify the output of the fully connected layer; the calculation formula of the softmax function is:
P(y = j | x) = e^{z_j} / Σ_{k'=1..k} e^{z_{k'}}, j = 1, 2, …, k
where P(y = j | x) is the probability that a sample belongs to the j-th of the k classes and z_j is the j-th output of the fully connected layer; each element of the matrix output by the softmax layer corresponds to the probability of the corresponding transformer fault.
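A short sketch of this softmax step follows; the max-subtraction for numerical stability is an implementation detail assumed here, not stated in the patent.

```python
import numpy as np

def softmax(z):
    """P(y = j | x) = exp(z_j) / sum_k exp(z_k); each output element is read
    as the probability of the corresponding transformer fault type."""
    z = z - np.max(z)     # numerical stability, result unchanged
    e = np.exp(z)
    return e / e.sum()

scores = np.array([1.2, -0.3, 0.4, 2.0, 0.1])   # assumed fully connected outputs
print(softmax(scores))                          # probabilities, sum to 1
```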
Further, after the softmax layer arranged behind the fully connected layer classifies the output of the fully connected layer, the method further comprises: judging the final error; if the final error meets the requirement, going to step 2; otherwise, calculating from the final error the error produced by each neuron in each layer of the CNN network, calculating the partial derivatives of the final error with respect to the weight and bias of each neuron in each layer, updating the parameters of each neuron in each layer of the CNN network according to these partial derivatives, and going to step 4;
the final error is calculated as:
C = (1/2) · Σ_x ‖ y_label − a^L(x) ‖²
where x denotes the input sample, y_label denotes the actual (label) value corresponding to the input sample, a^L(x) denotes the predicted value of the CNN network, and L is the maximum layer number of the CNN network;
the error produced by each neuron in each layer of the CNN network is calculated from the final error according to the formula (for the output layer L):
δ^L = ∇_a C ⊙ σ′(z^L)
where δ_j^l denotes the error produced by the j-th neuron in layer l, C denotes the final error of the network, and the symbol ⊙ denotes the Hadamard product, which represents the element-wise (dot) product between matrices;
the error is then propagated backwards layer by layer:
when calculating the neuron errors of a convolutional layer (whose following layer is a pooling layer):
δ^l = up(δ^{l+1}) ⊙ σ′(z^l)
when calculating the neuron errors of a pooling layer (whose following layer is a convolutional layer):
δ^l = (δ^{l+1} * rot180(w^{l+1})) ⊙ σ′(z^l)
when calculating the neuron errors of a fully connected layer:
δ^l = ((w^{l+1})^T · δ^{l+1}) ⊙ σ′(z^l)
where z^l denotes the input of the activation function of layer l, a^l = σ(z^l) denotes the activation-function output of layer l, up(·) is the upsampling operation corresponding to the pooling operation function down(x), and * denotes convolution;
the formulas for the partial derivatives of the final error with respect to the weight and bias of each neuron in each layer of the CNN network are:
∂C/∂w^l = a^{l−1} · δ^l,  ∂C/∂b^l = δ^l
and the formula for updating the parameters of each neuron of each layer is:
w_new = w_old − η · Σ_x a^{l−1} δ^l,  b_new = b_old − η · Σ_x δ^l
where η represents the learning rate, used to regulate the rate of convergence, w_old represents the weight before updating, w_new the updated weight, b_old the bias before updating, and b_new the updated bias.
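A minimal sketch of this update rule for one layer is given below; the gradient values here are placeholders standing in for the back-propagated errors δ^l described above.

```python
import numpy as np

def update_layer(w_old, b_old, a_prev, delta, eta=0.01):
    """w_new = w_old - eta * sum_x a^{l-1} delta^l ;
    b_new = b_old - eta * sum_x delta^l  (summing over the batch)."""
    grad_w = sum(d @ a.T for a, d in zip(a_prev, delta))   # sum_x a^{l-1} delta^l
    grad_b = sum(delta)                                    # sum_x delta^l
    return w_old - eta * grad_w, b_old - eta * grad_b

rng = np.random.default_rng(3)
w, b = rng.normal(size=(5, 4)), np.zeros((5, 1))
a_prev = [rng.random((4, 1)) for _ in range(3)]   # activations a^{l-1} for 3 samples
delta = [rng.random((5, 1)) for _ in range(3)]    # errors delta^l for the same samples
w_new, b_new = update_layer(w, b, a_prev, delta)
print(w_new.shape, b_new.shape)
```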
The invention combines the fast learning of the BP network on training samples with the strong feature-extraction capability of the CNN network to construct a composite neural network that uses the characteristic gases for probabilistic diagnosis of transformer faults, which greatly improves the accuracy of transformer fault type judgment; through data preprocessing, neuron saturation caused by differences in the magnitude of the input sample data is avoided, and the variables are given equal importance at the start of network training.
Drawings
Fig. 1 is a flow chart of the transformer fault type determination composite training of the present invention.
Detailed Description
The invention relates to a transformer fault type diagnosis method combining two machine learning methods, which comprises the following steps:
step 1, preprocessing a training sample;
step 2, inputting the preprocessed training sample into a BP neural network, and training the BP neural network;
step 3, fusing a plurality of feature vectors output by the BP neural network into one feature vector;
step 4, inputting the fused feature vector into a CNN network, and training the CNN network to obtain a transformer fault detection composite network;
and step 5, judging the fault type of the transformer according to the transformer fault detection composite network.
In step 1, the training samples are the various characteristic gases generated by the transformer oil under discharge or thermal fault conditions.
In step 2, the specific method for inputting the preprocessed training samples into the BP neural network is:
the multiple characteristic gases are taken as the input data according to their set proportions, and the input data are provided in matrix form; the size of the matrix is n×1, n is a positive integer, and the elements of the matrix sum to 1.
In step 2 or step 3, the BP (Back Propagation) neural network includes an input layer, a hidden layer and an output layer; nodes in adjacent layers are connected with each other while nodes within the same layer are not connected. An input signal enters the network structure from the input layer, the data passes through the hidden-layer nodes and reaches the output layer under the action of the activation function, and the final signal is output by the output layer.
With the number of input-layer nodes of the BP network set to m and the number of output-layer nodes set to n, where m is an integer greater than zero, the hidden-layer node output formula is:
y_k = f1( Σ_i v_ki · x_i )
where v_ki is the weight from the input layer to the hidden layer, f1(x) is the transfer function of the hidden layer, x_i is the training sample input to the network, i = 1, 2, …, p, and p is the number of weights;
the node output formula of the output layer is:
o_j = f2( Σ_k w_jk · y_k ), j = 1, 2, …, n
where w_jk is the weight between the hidden layer and the output layer and f2(x) is the transfer function of the output layer.
In the step 3, a plurality of feature vectors output by the BP neural network are fused into one feature vector by adopting a Concatenate vector splicing method.
The following steps are performed before step 4:
judging the global error: if the global error meets the requirement, go to step 4; otherwise, back-calculate the general error of each sample and adjust the weights of the output layer and the hidden layer according to the global error and the general error of each sample;
the global error is calculated as:
E = Σ_p E_p = (1/2) · Σ_p Σ_j (d_j^p − o_j^p)²
where o_j^p is the actual output obtained after the training sample is input into the network, d_j^p is the desired output of the BP network, and E_p is the general error of the p-th sample;
the weight adjustment formula of the output layer is:
Δw_jk = η · δ_j^y · y_k
where δ_j^y is the error signal, δ_j^y = (d_j − o_j) · f2′(net_j);
the weight adjustment formula of the hidden layer is:
Δv_ki = η · δ_k^x · x_i, with δ_k^x = f1′(net_k) · Σ_j δ_j^y · w_jk.
in step 4 or step 5, the cnn (volumetric Neural networks) network includes an input layer, a convolutional layer, a pooling layer, and a full connection layer; the convolutional layer is a parameter set composed of a plurality of different convolutional kernels, and can extract corresponding features of an input matrix by using different convolutional kernels for identification and classification. Different types of convolution kernels have the capacity of local feature extraction, while the same type of convolution kernels can share the weight of the network, and the two characteristics of the convolution kernels enable the connection number of the convolution neural network to be larger than the number of the network weight, so that the complexity of the network is effectively reduced.
The calculation formula of the convolutional layer is:
x_{i,j}^{l,k} = σ( Σ_{g=1..N} Σ_{h=1..N} w_{g,h}^{l,k} · x_{i+g−1, j+h−1}^{l−1} + b^{l,k} )
where w^{l,k} denotes the k-th convolution kernel of layer l, N denotes the size of the convolution kernel, g and h index the rows and columns of the convolution kernel, b^{l,k} is the bias coefficient of the convolutional layer corresponding to that convolution kernel, x^{l−1} is the input matrix data, x_{i,j}^{l,k} denotes the element in row i and column j of the feature map of the k-th channel of layer l of the network, and σ(x) is the activation function;
the pooling layer performs a downsampling operation on the input feature matrix to reduce the feature dimension and the amount of computation in the following layers; with the sampling size set to S1 × S2, the pooling operation formula is:
y_pq = (1 / (S1 · S2)) · Σ_{i=(p−1)·S1+1..p·S1} Σ_{j=(q−1)·S2+1..q·S2} x_{i,j}
where x is the two-dimensional input, y is the output matrix after mean pooling, and y_pq is the element in row p and column q of the pooling-layer output matrix;
the fully connected layer is used to linearly combine the features of the whole network, and its formula is:
a_j^l = σ( Σ_i w_{i,j}^l · a_i^{l−1} + b_j^l )
where a_i^{l−1} denotes the corresponding element of the input matrix of the fully connected layer, w_{i,j}^l denotes the weight coefficient between the i-th neuron of layer l−1 and the j-th neuron of layer l, and b_j^l denotes the bias coefficient of the j-th neuron of layer l.
A softmax layer is arranged behind the fully connected layer to classify the output of the fully connected layer; the calculation formula of the softmax function is:
P(y = j | x) = e^{z_j} / Σ_{k'=1..k} e^{z_{k'}}, j = 1, 2, …, k
where P(y = j | x) is the probability that a sample belongs to the j-th of the k classes and z_j is the j-th output of the fully connected layer; each element of the matrix output by the softmax layer corresponds to the probability of the corresponding transformer fault.
After the softmax layer arranged behind the fully connected layer classifies the output of the fully connected layer, the method further comprises: judging the final error; if the final error meets the requirement, going to step 2; otherwise, calculating from the final error the error produced by each neuron in each layer of the CNN network, calculating the partial derivatives of the final error with respect to the weight and bias of each neuron in each layer, updating the parameters of each neuron in each layer of the CNN network according to these partial derivatives, and going to step 4;
the final error is calculated as:
C = (1/2) · Σ_x ‖ y_label − a^L(x) ‖²
where x denotes the input sample, y_label denotes the actual (label) value corresponding to the input sample, a^L(x) denotes the predicted value of the CNN network, and L is the maximum layer number of the CNN network;
the error produced by each neuron in each layer of the CNN network is calculated from the final error according to the formula (for the output layer L):
δ^L = ∇_a C ⊙ σ′(z^L)
where δ_j^l denotes the error produced by the j-th neuron in layer l, C denotes the final error of the network, and the symbol ⊙ denotes the Hadamard product, which represents the element-wise (dot) product between matrices;
the error is then propagated backwards layer by layer:
when calculating the neuron errors of a convolutional layer (whose following layer is a pooling layer):
δ^l = up(δ^{l+1}) ⊙ σ′(z^l)
when calculating the neuron errors of a pooling layer (whose following layer is a convolutional layer):
δ^l = (δ^{l+1} * rot180(w^{l+1})) ⊙ σ′(z^l)
when calculating the neuron errors of a fully connected layer:
δ^l = ((w^{l+1})^T · δ^{l+1}) ⊙ σ′(z^l)
where z^l denotes the input of the activation function of layer l, a^l = σ(z^l) denotes the activation-function output of layer l, up(·) is the upsampling operation corresponding to the pooling operation function down(x), and * denotes convolution;
the formulas for the partial derivatives of the final error with respect to the weight and bias of each neuron in each layer of the CNN network are:
∂C/∂w^l = a^{l−1} · δ^l,  ∂C/∂b^l = δ^l
and the formula for updating the parameters of each neuron of each layer is:
w_new = w_old − η · Σ_x a^{l−1} δ^l,  b_new = b_old − η · Σ_x δ^l
where η represents the learning rate, used to regulate the rate of convergence, w_old represents the weight before updating, w_new the updated weight, b_old the bias before updating, and b_new the updated bias.
Fig. 1 is a flow chart of the composite training for judging the fault type of the transformer, and the composite network training process comprises the following steps:
preprocessing the collected training samples: the input data is the proportion of five characteristic gases in the total gas volume, and the size of an input matrix is n multiplied by 1. The ideal output result is a fault type proportion matrix, the size of the output matrix is n multiplied by 1, and the sum of each element in the matrix is 1. And inputting the training samples into a BP (Back propagation) network (BP neural network) to train the BP neural network. In the training process of the BP neural network, the training function can describe training result information and an error change curve through interval steps of displaying results. When the network training step number exceeds the preset maximum training step number, the network training is stopped, or the network training error is smaller than the set target error value, and the network also stops training.
During BP network training, the BP network is first initialized; then the outputs of all hidden-layer units and all output-layer units are computed in turn; the overall error E (i.e. the global error) is computed over the whole sample set; whether E meets the requirement (i.e. whether it is within the threshold range) is judged; if not, the generalized errors of all units are back-calculated, the weights of each layer are adjusted, and the process returns to the step after BP network initialization;
if the requirement is met, n feature vectors output by the BP network are fused into one feature vector (n multiplied by n) by using a Concatenate vector splicing method, and the specific operation method is to splice the feature vectors. The prior knowledge is that the feature vectors which are mutually fused are carried out in the feature space with the same dimensionality, the feature modes are similar, and the properties are similar. And inputting the fused feature vector into a CNN network for network training as an input data matrix, wherein an ideal output result is a fault type proportion matrix, the size of the output matrix is n multiplied by 1, the sum of elements is 1, and each element represents each possible fault probability. After the two networks are trained respectively, the same training sample is used for training the composite network, and the transformer fault diagnosis error of the whole network is reduced.
During CNN network training, the CNN network is first initialized; then the convolutional-layer output, pooling-layer output and fully-connected-layer output are computed in turn; a softmax layer behind the fully connected layer classifies the output of the fully connected layer; after classification, the final error is judged; if the error requirement or the required number of training iterations is met, the process returns to BP network initialization for composite network training; if not, the network weights are adjusted by back propagation, i.e. the parameters of each neuron in each layer of the CNN network are updated, and the process returns to the step after CNN network initialization.
A sample after data preprocessing (a column matrix, n×1) is processed n times by the BP network, the results are spliced by Concatenate into a new matrix (n×n), and the new matrix is then processed by the CNN network to output a column matrix (5×1) that represents the possible proportions of the various transformer fault types.
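An end-to-end sketch of this data flow is shown below; the BP and CNN components are stand-ins (random weights, fixed sizes) intended only to make the n×1 → n×n → 5×1 shape pipeline explicit, not a trained implementation of the patented network.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Stand-in BP network: n x 1 proportion vector in, n x 1 feature vector out.
V, W = rng.normal(size=(8, n)), rng.normal(size=(n, 8))
bp_pass = lambda x: sigmoid(W @ sigmoid(V @ x))

# Stand-in "CNN": flatten the fused n x n matrix, one linear layer + softmax.
Wc = rng.normal(size=(5, n * n))
def cnn_pass(m):
    z = Wc @ m.reshape(-1, 1)
    e = np.exp(z - z.max())
    return e / e.sum()

x = np.full((n, 1), 1.0 / n)                 # preprocessed gas proportions
features = [bp_pass(x) for _ in range(n)]    # n passes through the BP network
fused = np.concatenate(features, axis=1)     # Concatenate -> n x n matrix
fault_probs = cnn_pass(fused)                # 5 x 1 fault-probability column
print(fault_probs.ravel(), fault_probs.sum())
```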
The labeled data samples used for learning are converted, through preprocessing, into the proportions of the characteristic gas contents in the total produced gas and arranged from top to bottom into a column matrix; if more than one fault type exists, the predicted output is a column vector (whose elements sum to 1) expressing the proportions of the various transformer faults.
The invention combines the fast learning of the BP network on training samples with the strong feature-extraction capability of the CNN network to construct a composite neural network that uses the characteristic gases for probabilistic diagnosis of transformer faults. The existing BP network usually shows a fast and efficient learning effect on single-classification problems and has a simple, clear structure. When handling multi-classification problems, however, it is often one-sided, tends to fall into a local optimum under the influence of the training sample set, and its accuracy is often not high, so it cannot provide very accurate transformer fault type judgment. The CNN network is widely used in image recognition; its characteristic is that convolution kernels of different sizes slide over the input matrix and perform convolution on the data (or pixels) of the covered area to obtain the final result. The CNN network can extract a large number of sample features, but for a flat input matrix the range of information it can extract is limited and the optimal learning effect cannot be achieved. Therefore, by combining the BP network and the CNN network, the output vectors obtained from multiple passes through the BP network are combined into an n×n matrix and input into the CNN network for convolution, and after the fully connected layer the softmax function produces the probabilities of the various transformer faults.
Through data preprocessing, neuron saturation caused by differences in the magnitude of the input sample data is avoided, and the variables are given equal importance at the start of network training.
Considering that in practice a transformer fault is often not a single fault but a combination of various types of discharge and overheating, and compared with the prior art in which the output is the fault-type code of a single diagnosis result, the method can obtain the probabilities of the multiple possible faults of the transformer.
In conclusion, the method and the device greatly improve the accuracy of judging the fault type of the transformer.

Claims (10)

1. The transformer fault type diagnosis method combining the two machine learning methods is characterized by comprising the following steps of:
step 1, preprocessing a training sample;
step 2, inputting the preprocessed training sample into a BP neural network, and training the BP neural network;
step 3, fusing a plurality of feature vectors output by the BP neural network into one feature vector;
step 4, inputting the fused feature vector into a CNN network, and training the CNN network to obtain a transformer fault detection composite network;
and step 5, judging the fault type of the transformer according to the transformer fault detection composite network.
2. The method for diagnosing the fault type of the transformer by combining the two machine learning methods as claimed in claim 1, wherein in the step 1, the training samples are a plurality of characteristic gases generated by transformer oil in the event of discharge or thermal fault.
3. The method for diagnosing the fault type of the transformer by combining the two machine learning methods according to claim 2, wherein in the step 2, the specific method for inputting the preprocessed training samples into the BP neural network comprises:
the method comprises the steps of taking multiple characteristic gases as input data according to a set proportion, inputting the input data in a matrix form, setting the size of the matrix to be n x 1, setting n to be a positive integer, and adding each element in the matrix to form a sum of 1.
4. The method for diagnosing the fault type of the transformer by combining the two machine learning methods according to claim 3, wherein in step 2 or step 3, the BP neural network comprises an input layer, a hidden layer and an output layer, an input signal enters the network structure from the input layer, after passing through nodes of the hidden layer, data reaches the output layer through the action of an activation function, and a final signal is output by the output layer.
5. The transformer fault type diagnosis method combining two machine learning methods according to claim 4, wherein the number of input-layer nodes of the BP network is set to m and the number of output-layer nodes to n, with m an integer greater than zero, and the hidden-layer node output formula is:
y_k = f1( Σ_i v_ki · x_i )
where v_ki is the weight from the input layer to the hidden layer, f1(x) is the transfer function of the hidden layer, x_i is the training sample input to the network, i = 1, 2, …, p, and p is the number of weights;
the node output formula of the output layer is:
o_j = f2( Σ_k w_jk · y_k ), j = 1, 2, …, n
where w_jk is the weight between the hidden layer and the output layer and f2(x) is the transfer function of the output layer.
6. The transformer fault type diagnosis method combining the two machine learning methods according to claim 5, wherein in step 3, a plurality of feature vectors output by the BP neural network are fused into one feature vector by using a Concatenate vector splicing method.
7. The method for diagnosing the fault type of the transformer by combining the two machine learning methods according to claim 6, wherein the step 4 is preceded by the steps of:
judging the global error: if the global error meets the requirement, go to step 4; otherwise, back-calculate the general error of each sample and adjust the weights of the output layer and the hidden layer according to the global error and the general error of each sample;
the global error is calculated as:
E = Σ_p E_p = (1/2) · Σ_p Σ_j (d_j^p − o_j^p)²
where o_j^p is the actual output obtained after the training sample is input into the network, d_j^p is the desired output of the BP network, and E_p is the general error of the p-th sample;
the weight adjustment formula of the output layer is:
Δw_jk = η · δ_j^y · y_k
where δ_j^y is the error signal, δ_j^y = (d_j − o_j) · f2′(net_j);
the weight adjustment formula of the hidden layer is:
Δv_ki = η · δ_k^x · x_i, with δ_k^x = f1′(net_k) · Σ_j δ_j^y · w_jk.
8. the transformer fault type diagnosis method combining two machine learning methods according to claim 7, wherein in step 4 or step 5, the CNN network comprises an input layer, a convolutional layer, a pooling layer and a full connection layer;
the calculation formula of the convolutional layer is:
x_{i,j}^{l,k} = σ( Σ_{g=1..N} Σ_{h=1..N} w_{g,h}^{l,k} · x_{i+g−1, j+h−1}^{l−1} + b^{l,k} )
where w^{l,k} denotes the k-th convolution kernel of layer l, N denotes the size of the convolution kernel, g and h index the rows and columns of the convolution kernel, b^{l,k} is the bias coefficient of the convolutional layer corresponding to that convolution kernel, x^{l−1} is the input matrix data, x_{i,j}^{l,k} denotes the element in row i and column j of the feature map of the k-th channel of layer l of the network, and σ(x) is the activation function;
the pooling layer performs a downsampling operation on the input feature matrix to reduce the feature dimension and the amount of computation in the following layers; with the sampling size set to S1 × S2, the pooling operation formula is:
y_pq = (1 / (S1 · S2)) · Σ_{i=(p−1)·S1+1..p·S1} Σ_{j=(q−1)·S2+1..q·S2} x_{i,j}
where x is the two-dimensional input, y is the output matrix after mean pooling, and y_pq is the element in row p and column q of the pooling-layer output matrix;
the fully connected layer is used to linearly combine the features of the whole network, and its formula is:
a_j^l = σ( Σ_i w_{i,j}^l · a_i^{l−1} + b_j^l )
where a_i^{l−1} denotes the corresponding element of the input matrix of the fully connected layer, w_{i,j}^l denotes the weight coefficient between the i-th neuron of layer l−1 and the j-th neuron of layer l, and b_j^l denotes the bias coefficient of the j-th neuron of layer l.
9. The transformer fault type diagnosis method combining the two machine learning methods according to claim 8, wherein a softmax layer is arranged behind the full connection layer to classify the output results of the full connection layer, and the calculation formula of the softmax function is as follows:
P(y = j | x) = e^{z_j} / Σ_{k'=1..k} e^{z_{k'}}, j = 1, 2, …, k
where P(y = j | x) is the probability that a sample belongs to the j-th of the k classes and z_j is the j-th output of the fully connected layer; each element of the matrix output by the softmax layer corresponds to the probability of the corresponding transformer fault.
10. The transformer fault type diagnosis method combining the two machine learning methods according to claim 9, wherein after the softmax layer arranged behind the fully connected layer classifies the output of the fully connected layer, the method further comprises: judging the final error; if the final error meets the requirement, going to step 2; otherwise, calculating from the final error the error produced by each neuron in each layer of the CNN network, calculating the partial derivatives of the final error with respect to the weight and bias of each neuron in each layer, updating the parameters of each neuron in each layer of the CNN network according to these partial derivatives, and going to step 4;
the final error is calculated as:
C = (1/2) · Σ_x ‖ y_label − a^L(x) ‖²
where x denotes the input sample, y_label denotes the actual (label) value corresponding to the input sample, a^L(x) denotes the predicted value of the CNN network, and L is the maximum layer number of the CNN network;
the error produced by each neuron in each layer of the CNN network is calculated from the final error according to the formula (for the output layer L):
δ^L = ∇_a C ⊙ σ′(z^L)
where δ_j^l denotes the error produced by the j-th neuron in layer l, C denotes the final error of the network, and the symbol ⊙ denotes the Hadamard product, which represents the element-wise (dot) product between matrices;
the error is then propagated backwards layer by layer:
when calculating the neuron errors of a convolutional layer (whose following layer is a pooling layer):
δ^l = up(δ^{l+1}) ⊙ σ′(z^l)
when calculating the neuron errors of a pooling layer (whose following layer is a convolutional layer):
δ^l = (δ^{l+1} * rot180(w^{l+1})) ⊙ σ′(z^l)
when calculating the neuron errors of a fully connected layer:
δ^l = ((w^{l+1})^T · δ^{l+1}) ⊙ σ′(z^l)
where z^l denotes the input of the activation function of layer l, a^l = σ(z^l) denotes the activation-function output of layer l, up(·) is the upsampling operation corresponding to the pooling operation function down(x), and * denotes convolution;
the formulas for the partial derivatives of the final error with respect to the weight and bias of each neuron in each layer of the CNN network are:
∂C/∂w^l = a^{l−1} · δ^l,  ∂C/∂b^l = δ^l
and the formula for updating the parameters of each neuron of each layer is:
w_new = w_old − η · Σ_x a^{l−1} δ^l,  b_new = b_old − η · Σ_x δ^l
where η represents the learning rate, used to regulate the rate of convergence, w_old represents the weight before updating, w_new the updated weight, b_old the bias before updating, and b_new the updated bias.
CN202111106947.7A 2021-09-22 2021-09-22 Transformer fault type diagnosis method combining two machine learning methods Pending CN113822417A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111106947.7A CN113822417A (en) 2021-09-22 2021-09-22 Transformer fault type diagnosis method combining two machine learning methods

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111106947.7A CN113822417A (en) 2021-09-22 2021-09-22 Transformer fault type diagnosis method combining two machine learning methods

Publications (1)

Publication Number Publication Date
CN113822417A true CN113822417A (en) 2021-12-21

Family

ID=78920790

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111106947.7A Pending CN113822417A (en) 2021-09-22 2021-09-22 Transformer fault type diagnosis method combining two machine learning methods

Country Status (1)

Country Link
CN (1) CN113822417A (en)

Citations (4)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109100648A (en) * 2018-05-16 2018-12-28 上海海事大学 Ocean current generator impeller based on CNN-ARMA-Softmax winds failure fusion diagnosis method
CN109116203A (en) * 2018-10-31 2019-01-01 红相股份有限公司 Power equipment partial discharges fault diagnostic method based on convolutional neural networks
CN111127423A (en) * 2019-12-23 2020-05-08 金陵科技学院 Rice pest and disease identification method based on CNN-BP neural network algorithm
CN112284735A (en) * 2020-10-21 2021-01-29 兰州理工大学 Multi-sensor rolling bearing fault diagnosis based on one-dimensional convolution and dynamic routing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨雨洁 et al.: "Application of an RBF-BP composite neural network in transformer fault diagnosis"

Similar Documents

Publication Publication Date Title
CN113011499B (en) Hyperspectral remote sensing image classification method based on double-attention machine system
CN110287983B (en) Single-classifier anomaly detection method based on maximum correlation entropy deep neural network
CN108846426B (en) Polarization SAR classification method based on deep bidirectional LSTM twin network
CN109116834B (en) Intermittent process fault detection method based on deep learning
CN111665819A (en) Deep learning multi-model fusion-based complex chemical process fault diagnosis method
CN113723010B (en) Bridge damage early warning method based on LSTM temperature-displacement correlation model
CN106548230A (en) Diagnosis Method of Transformer Faults based on Modified particle swarm optimization neutral net
CN110287777B (en) Golden monkey body segmentation algorithm in natural scene
CN113705526A (en) Hyperspectral remote sensing image classification method
CN111046961B (en) Fault classification method based on bidirectional long-time and short-time memory unit and capsule network
CN114488140B (en) Small sample radar one-dimensional image target recognition method based on deep migration learning
CN112070128A (en) Transformer fault diagnosis method based on deep learning
CN107832789B (en) Feature weighting K nearest neighbor fault diagnosis method based on average influence value data transformation
CN112381763A (en) Surface defect detection method
CN112381179A (en) Heterogeneous graph classification method based on double-layer attention mechanism
CN110119805B (en) Convolutional neural network algorithm based on echo state network classification
CN109740695A (en) Image-recognizing method based on adaptive full convolution attention network
CN107423705A (en) SAR image target recognition method based on multilayer probability statistics model
CN112766283B (en) Two-phase flow pattern identification method based on multi-scale convolution network
CN111275165A (en) Network intrusion detection method based on improved convolutional neural network
CN112766603A (en) Traffic flow prediction method, system, computer device and storage medium
CN113901448A (en) Intrusion detection method based on convolutional neural network and lightweight gradient elevator
CN113947182A (en) Traffic flow prediction model construction method based on double-stage stack graph convolution network
CN109145685B (en) Fruit and vegetable hyperspectral quality detection method based on ensemble learning
CN115965864A (en) Lightweight attention mechanism network for crop disease identification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20211221)