CN112308146A - Distribution transformer fault identification method based on operation characteristics

Info

Publication number: CN112308146A
Application number: CN202011200609.5A
Authority: CN
Inventors: 傅俪; 林国庆; 郭俊; 翁宇游; 谢炜
Original assignee: Electric Power Research Institute of State Grid Fujian Electric Power Co Ltd; State Grid Fujian Electric Power Co Ltd
Current assignee: Electric Power Research Institute of State Grid Fujian Electric Power Co Ltd; State Grid Fujian Electric Power Co Ltd
Priority date: 2020-11-02
Filing date: 2020-11-02
Publication date: 2021-02-02

Abstract

The invention relates to a distribution transformer fault identification method based on operation characteristics. The method comprises the following steps. Step S1: starting from the power outage records of distribution transformer body faults, and combining the operation characteristic data recorded before each fault (distribution transformer archive parameters, operation data and environmental data), screen the outage records that correspond to a distribution transformer body fault, trace each record to a specific distribution transformer equipment ID, and construct a distribution transformer data set. Step S2: based on the distribution transformer data set, extract the important characteristic variables that influence distribution transformer faults by using a random forest algorithm, and then construct a distribution transformer fault early warning model by using a machine learning algorithm. Step S3: based on the distribution transformer fault early warning model, output the distribution transformer operation risk grade and locate the fault part.

Description

Distribution transformer fault identification method based on operation characteristics

Technical Field

The invention relates to a distribution transformer fault identification method based on operation characteristics.

Background

By analysing the perceived health state of distribution transformer equipment together with historical fault information, the causes of abnormal equipment states and the latent operating patterns of the distribution transformer during the fault period can be mined, early signs of faults can be identified, the fault position, fault degree and development trend can be assessed, and the optimal time for operation, maintenance and overhaul can be determined.

Disclosure of Invention

The invention aims to provide a distribution transformer fault identification method based on operation characteristics, which can identify early signs of faults, assess the fault position, fault degree and development trend, and determine the optimal time for operation, maintenance and overhaul.

In order to achieve the purpose, the technical scheme of the invention is as follows: a distribution transformer fault identification method based on operation characteristics comprises the following steps:

Step S1: starting from the power outage records of distribution transformer body faults, and combining the operation characteristic data recorded before each fault, including distribution transformer archive parameters, operation data and environmental data, screen the outage records that correspond to a distribution transformer body fault, trace each record to a specific distribution transformer equipment ID, and construct a distribution transformer data set;

Step S2: based on the distribution transformer data set, extract the important characteristic variables that influence distribution transformer faults by using a random forest algorithm, and then construct a distribution transformer fault early warning model by using a machine learning algorithm;

Step S3: based on the distribution transformer fault early warning model, output the distribution transformer operation risk grade and locate the fault part.

In an embodiment of the present invention, the important characteristic variables affecting distribution transformer faults are extracted by the random forest algorithm in step S2 as follows: first, half of the data are randomly drawn from the distribution transformer data set to grow each classification and regression tree, and the remaining half are kept as out-of-bag data; second, at each node of each tree, half of the characteristic variables are randomly drawn, the information content of each of these features is calculated, and the feature with the largest information content is selected to split the node; then, the features are arranged in descending order of information content, and splitting stops when the error value is at its minimum; finally, the set of characteristic variables with the largest overall information content and the smallest error is selected as the core characteristic variables, namely the important characteristic variables affecting distribution transformer faults.
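The sampling described above can be sketched in plain Python. This is a minimal illustration that follows the text's half-and-half split rather than the bootstrap resampling more commonly used in random forests; the function names and the feature names are hypothetical and do not come from the patent.

```python
import random

def split_in_bag_out_of_bag(n_samples, rng):
    """Return (in_bag, out_of_bag) index lists, each holding about half of the data."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    half = n_samples // 2
    return sorted(indices[:half]), sorted(indices[half:])

def sample_candidate_features(feature_names, rng):
    """Draw half of the features (at least one) as split candidates for one node."""
    k = max(1, len(feature_names) // 2)
    return rng.sample(feature_names, k)

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Rows used to grow one tree, and the rows held back as out-of-bag data.
in_bag, out_of_bag = split_in_bag_out_of_bag(10, rng)

# Hypothetical feature names, for illustration only.
features = ["load_rate", "oil_temp", "service_years", "ambient_temp"]
candidates = sample_candidate_features(features, rng)
```

The out-of-bag rows are never seen by the tree grown on the in-bag rows, so they can be used to estimate the error value that the text uses as its stopping criterion.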

In an embodiment of the present invention, the machine learning algorithm adopted in step S2 is the AdaBoost algorithm.

In an embodiment of the invention, the distribution transformer fault early warning model is constructed by using the AdaBoost algorithm as follows:

Define the distribution transformer data set $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, where each instance $x_i \in \chi$, the instance space $\chi \subset \mathbb{R}^n$, and $y_i$ belongs to the label set $\{-1, +1\}$;

(1) Initialize the weight distribution of the data in the distribution transformer data set, namely the training data. Each training sample is initially given the same weight, 1/N:

$D_1 = (w_{11}, w_{12}, \ldots, w_{1i}, \ldots, w_{1N}), \quad w_{1i} = \frac{1}{N}, \quad i = 1, 2, \ldots, N$

where $D_1$ represents the sample weights in the first iteration, $w_{1i}$ represents the weight of the i-th sample in the first iteration, and N is the total number of samples;

(2) Perform multiple iterations, m = 1, 2, …, M, where m denotes the iteration round and M the number of iterations:

A. Learn from the distribution transformer data set with the weight distribution $D_m$ to obtain a basic classifier:

$G_m(x): \chi \to \{-1, +1\}$

This expression indicates that, in the m-th iteration, the basic classifier $G_m(x)$ classifies a sample x as either -1 or +1; that is, $G_m(x)$ is a binary classifier;

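B. Calculate the classification error rate of $G_m(x)$ on the distribution transformer data set:

$e_m = P(G_m(x_i) \neq y_i) = \sum_{i=1}^{N} w_{mi} I(G_m(x_i) \neq y_i)$

where $I(\cdot)$ is the indicator function, equal to 1 when its argument is true and 0 otherwise;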

C. Calculate the coefficient $a_m$ of $G_m(x)$, which represents the importance of $G_m(x)$ in the final classifier:

$a_m = \frac{1}{2} \ln \frac{1 - e_m}{e_m}$

From the above formula, when $e_m < 1/2$, $a_m > 0$, and $a_m$ increases as $e_m$ decreases, meaning that the smaller the classification error rate of a basic classifier, the greater its role in the final classifier;

D. Update the weight distribution of the distribution transformer data set, so that the weights of samples misclassified by the basic classifier $G_m(x)$ increase while the weights of correctly classified samples decrease:

$D_{m+1} = (w_{m+1,1}, w_{m+1,2}, \ldots, w_{m+1,i}, \ldots, w_{m+1,N})$,

$w_{m+1,i} = \frac{w_{mi}}{Z_m} \exp(-a_m y_i G_m(x_i)), \quad i = 1, 2, \ldots, N$

where $Z_m$ is a normalization factor that makes $D_{m+1}$ a probability distribution:

$Z_m = \sum_{i=1}^{N} w_{mi} \exp(-a_m y_i G_m(x_i))$

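(3) Combine the weak classifiers:

$f(x) = \sum_{m=1}^{M} a_m G_m(x)$

This gives the final classifier, namely the distribution transformer fault early warning model:

$G(x) = \operatorname{sign}(f(x)) = \operatorname{sign}\left(\sum_{m=1}^{M} a_m G_m(x)\right)$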
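A single round of the iteration can be checked numerically. The sketch below assumes the standard AdaBoost forms of the coefficient, $a_m = \frac{1}{2}\ln\frac{1-e_m}{e_m}$, and of the weight update, $w_{m+1,i} \propto w_{mi}\exp(-a_m y_i G_m(x_i))$; the five labels and predictions are made-up illustration values and the function names are hypothetical.

```python
import math

def classifier_coefficient(error_rate):
    """Coefficient a_m of a basic classifier, given its weighted error rate e_m."""
    return 0.5 * math.log((1.0 - error_rate) / error_rate)

def update_weights(weights, labels, predictions, coefficient):
    """Return the renormalized weights D_{m+1} and the normalization factor Z_m."""
    unnormalized = [
        w * math.exp(-coefficient * y * g)
        for w, y, g in zip(weights, labels, predictions)
    ]
    z = sum(unnormalized)
    return [u / z for u in unnormalized], z

# Made-up example: five samples, the second one is misclassified.
labels = [1, 1, -1, -1, 1]
predictions = [1, -1, -1, -1, 1]
weights = [1.0 / len(labels)] * len(labels)  # D_1: every sample starts at 1/N

# e_m is the total weight of the misclassified samples.
error_rate = sum(w for w, y, g in zip(weights, labels, predictions) if y != g)
coefficient = classifier_coefficient(error_rate)
new_weights, z = update_weights(weights, labels, predictions, coefficient)
```

In this example the misclassified sample ends up with a larger weight than each correctly classified one, and the new weights again sum to 1, which is the behaviour step D describes.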

Compared with the prior art, the invention has the following beneficial effects: the method can identify early signs of faults, assess the fault position, fault degree and development trend, and determine the optimal time for operation and maintenance.

Drawings

Fig. 1 is a flow chart of the method of the present invention.

Fig. 2 shows the processing flow for distribution transformer fault operation characteristic data.

Fig. 3 shows the process of constructing the distribution transformer equipment fault operation characteristic model.

Detailed Description

The technical scheme of the invention is specifically explained below with reference to the accompanying drawings.

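As shown in Fig. 1, the present invention provides a distribution transformer fault identification method based on operation characteristics, including the following steps:

Step S1: starting from the power outage records of distribution transformer body faults, and combining the operation characteristic data recorded before each fault, including distribution transformer archive parameters, operation data and environmental data, screen the outage records that correspond to a distribution transformer body fault, trace each record to a specific distribution transformer equipment ID, and construct a distribution transformer data set;

Step S2: based on the distribution transformer data set, extract the important characteristic variables that influence distribution transformer faults by using a random forest algorithm, and then construct a distribution transformer fault early warning model by using a machine learning algorithm;

Step S3: based on the distribution transformer fault early warning model, output the distribution transformer operation risk grade and locate the fault part.

In a specific implementation of the present invention, the method is realized as follows: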

As shown in Fig. 2, correlation analysis is first performed on the basis of the power outage records of distribution transformer body faults, combined with the operation characteristic data recorded before each fault, such as distribution transformer archive parameters, operation data and environmental data. In this example, fault information pointing to a specific distribution transformer cannot be obtained directly, so the outage records that correspond to distribution transformer body faults are screened by searching the fault summaries in the fault records and are then traced to specific equipment IDs. About 290 outage records of public distribution transformer body faults that meet the conditions are finally obtained.
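The screening step can be illustrated with a small keyword filter. This is only a sketch under stated assumptions: the patent does not give the record layout or the search terms, so the field names (`record_id`, `fault_summary`, `equipment_id`), the keyword list and the sample records below are all hypothetical.

```python
# Hypothetical keywords suggesting a fault in the transformer body itself.
BODY_FAULT_KEYWORDS = ("winding", "bushing", "oil leak", "tap changer")

def is_body_fault(fault_summary):
    """True if the free-text fault summary mentions any body-fault keyword."""
    summary = fault_summary.lower()
    return any(keyword in summary for keyword in BODY_FAULT_KEYWORDS)

def screen_outage_records(records):
    """Keep records that describe a body fault and can be traced to an equipment ID."""
    return [
        record for record in records
        if record.get("equipment_id") and is_body_fault(record["fault_summary"])
    ]

# Made-up outage records, for illustration only.
records = [
    {"record_id": "R1", "fault_summary": "Winding burned out after overload", "equipment_id": "T-001"},
    {"record_id": "R2", "fault_summary": "Tree fell on overhead line", "equipment_id": "T-002"},
    {"record_id": "R3", "fault_summary": "Bushing flashover", "equipment_id": None},
    {"record_id": "R4", "fault_summary": "Oil leak at tank weld", "equipment_id": "T-004"},
]

body_fault_records = screen_outage_records(records)
```

Records that cannot be tied to an equipment ID are dropped here because the later steps need the archive, operation and environmental data of a specific transformer.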

As shown in Fig. 3, multiple groups of algorithms are then explored and compared; the random forest and AdaBoost algorithms are found to offer high interpretability, low implementation difficulty and high prediction accuracy. Important characteristic variables that influence distribution transformer faults are extracted by using the random forest algorithm, and a distribution transformer fault early warning model is constructed by using machine learning algorithms such as AdaBoost. The model outputs the distribution transformer operation risk level and locates the fault position, which helps to ensure equipment safety. The specific steps are as follows:

(1) Random forest feature screening

The specific steps of random forest feature screening are as follows: first, half of the data are randomly drawn from the training set to grow each classification and regression tree, and the remaining half are kept as out-of-bag data; second, at each node of each tree, half of the characteristic variables are randomly drawn, the information content of each of these features is calculated, and the feature with the largest information content is selected to split the node; then, splitting stops when the error value is at its minimum. Finally, the set of characteristic variables with the largest overall information content and the smallest error is selected as the core characteristic variables.
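The node-splitting criterion can be illustrated as follows. The patent does not define "information content" precisely; this sketch assumes it means entropy-based information gain for a threshold split, which is one common choice. The feature names and values are made up for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting on values <= threshold."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0
    n = len(labels)
    remainder = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(labels) - remainder

def best_gain(values, labels):
    """Largest information gain over all candidate thresholds of one feature."""
    thresholds = sorted(set(values))[:-1]  # the largest value cannot split the data
    return max((information_gain(values, labels, t) for t in thresholds), default=0.0)

def rank_features(columns, labels):
    """Sort features in descending order of their best information gain."""
    gains = {name: best_gain(values, labels) for name, values in columns.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)

# Made-up data: +1 marks a faulty transformer, -1 a healthy one.
columns = {
    "load_rate": [0.4, 0.5, 0.9, 1.1, 0.6, 1.2],
    "ambient_temp": [20, 35, 22, 34, 21, 36],
}
labels = [-1, -1, 1, 1, -1, 1]

ranking = rank_features(columns, labels)
```

In this toy data the hypothetical `load_rate` feature separates the two classes perfectly, so it ranks first and would be chosen to split the node.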

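(2) AdaBoost

The algorithm flow of AdaBoost is as follows:

Given a training data set $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, where each instance $x_i \in \chi$, the instance space $\chi \subset \mathbb{R}^n$, and $y_i$ belongs to the label set $\{-1, +1\}$, the goal of AdaBoost is to learn a series of weak classifiers, or basic classifiers, from the training data and then combine these weak classifiers into one strong classifier.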

Step 1. Initialize the weight distribution of the training data. Each training sample is initially given the same weight, 1/N:

$D_1 = (w_{11}, w_{12}, \ldots, w_{1i}, \ldots, w_{1N}), \quad w_{1i} = \frac{1}{N}, \quad i = 1, 2, \ldots, N$

Step 2. Perform multiple iterations, m = 1, 2, …, M, where m denotes the iteration round.

a. Learn from the training data set with the weight distribution $D_m$ to obtain a basic classifier (the basic classifier is designed by selecting the threshold with the lowest error rate):

$G_m(x): \chi \to \{-1, +1\}$

b. Calculate the classification error rate of $G_m(x)$ on the training data set:

$e_m = P(G_m(x_i) \neq y_i) = \sum_{i=1}^{N} w_{mi} I(G_m(x_i) \neq y_i)$

From the above formula, the error rate $e_m$ of $G_m(x)$ on the training data set is exactly the sum of the weights of the samples misclassified by $G_m(x)$.

c. Calculate the coefficient $a_m$ of $G_m(x)$, which represents the importance of $G_m(x)$ in the final classifier (the purpose is to obtain the weight that the basic classifier carries in the final classifier):

$a_m = \frac{1}{2} \ln \frac{1 - e_m}{e_m}$

From the above formula, when $e_m < 1/2$, $a_m > 0$, and $a_m$ increases as $e_m$ decreases, meaning that a basic classifier with a smaller classification error rate plays a greater role in the final classifier.

d. Update the weight distribution of the training data set (to obtain a new sample weight distribution for the next iteration):

$D_{m+1} = (w_{m+1,1}, w_{m+1,2}, \ldots, w_{m+1,i}, \ldots, w_{m+1,N})$,

$w_{m+1,i} = \frac{w_{mi}}{Z_m} \exp(-a_m y_i G_m(x_i)), \quad i = 1, 2, \ldots, N$

In this way, the weights of samples misclassified by the basic classifier $G_m(x)$ increase and the weights of correctly classified samples decrease, so that AdaBoost can focus on the samples that are harder to separate.

Here $Z_m$ is a normalization factor that makes $D_{m+1}$ a probability distribution:

$Z_m = \sum_{i=1}^{N} w_{mi} \exp(-a_m y_i G_m(x_i))$

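Step 3. Combine the weak classifiers:

$f(x) = \sum_{m=1}^{M} a_m G_m(x)$

The final classifier is thus obtained as follows:

$G(x) = \operatorname{sign}(f(x)) = \operatorname{sign}\left(\sum_{m=1}^{M} a_m G_m(x)\right)$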
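The three steps can be sketched end to end in plain Python, using threshold classifiers (decision stumps) as the basic classifiers, in line with the text's remark about selecting the threshold with the lowest error rate. The sketch assumes the standard AdaBoost formulas for the error rate, the coefficient and the weight update; the function names and the one-dimensional toy data are illustrative and are not taken from the patent.

```python
import math

def train_stump(samples, labels, weights):
    """Return (error, feature, threshold, polarity) of the lowest-error stump."""
    best = None
    for j in range(len(samples[0])):
        for threshold in sorted({x[j] for x in samples}):
            for polarity in (1, -1):
                preds = [polarity if x[j] <= threshold else -polarity for x in samples]
                # Weighted error: total weight of the misclassified samples.
                err = sum(w for w, y, p in zip(weights, labels, preds) if p != y)
                if best is None or err < best[0]:
                    best = (err, j, threshold, polarity)
    return best

def stump_predict(stump, x):
    """Classify one sample as +1 or -1 with a trained stump."""
    _, j, threshold, polarity = stump
    return polarity if x[j] <= threshold else -polarity

def adaboost_fit(samples, labels, n_rounds):
    """Train n_rounds stumps and return a list of (coefficient, stump) pairs."""
    n = len(samples)
    weights = [1.0 / n] * n  # Step 1: uniform initial weights
    ensemble = []
    for _ in range(n_rounds):  # Step 2: iterate
        stump = train_stump(samples, labels, weights)
        # Clamp the error away from 0 and 1 so the logarithm stays finite.
        err = min(max(stump[0], 1e-10), 1.0 - 1e-10)
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, stump))
        preds = [stump_predict(stump, x) for x in samples]
        weights = [w * math.exp(-alpha * y * p) for w, y, p in zip(weights, labels, preds)]
        z = sum(weights)  # normalization factor Z_m
        weights = [w / z for w in weights]
    return ensemble

def adaboost_score(ensemble, x):
    """Weighted vote f(x) of all basic classifiers."""
    return sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)

def adaboost_predict(ensemble, x):
    """Step 3: final classifier G(x) = sign(f(x))."""
    return 1 if adaboost_score(ensemble, x) >= 0 else -1

# Toy data: one feature, ten samples; no single threshold separates the classes.
samples = [[float(v)] for v in range(10)]
labels = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

model = adaboost_fit(samples, labels, n_rounds=3)
predictions = [adaboost_predict(model, x) for x in samples]
```

No single stump classifies this toy set correctly, but the weighted combination of three stumps does, which illustrates how weak classifiers are combined into a strong one.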

The above are preferred embodiments of the present invention. Any change made according to the technical scheme of the present invention whose functional effect does not exceed the scope of that technical scheme falls within the protection scope of the present invention.

Claims

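1. A distribution transformer fault identification method based on operation characteristics, characterized by comprising the following steps:

Step S1: starting from the power outage records of distribution transformer body faults, and combining the operation characteristic data recorded before each fault, including distribution transformer archive parameters, operation data and environmental data, screening the outage records that correspond to a distribution transformer body fault, tracing each record to a specific distribution transformer equipment ID, and constructing a distribution transformer data set;

Step S2: based on the distribution transformer data set, extracting the important characteristic variables that influence distribution transformer faults by using a random forest algorithm, and then constructing a distribution transformer fault early warning model by using a machine learning algorithm;

Step S3: based on the distribution transformer fault early warning model, outputting the distribution transformer operation risk grade and locating the fault part.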

2. The distribution transformer fault identification method based on operation characteristics as claimed in claim 1, wherein the important characteristic variables affecting distribution transformer faults are extracted by the random forest algorithm in step S2 as follows: first, half of the data are randomly drawn from the distribution transformer data set to grow each classification and regression tree, and the remaining half are kept as out-of-bag data; second, at each node of each tree, half of the characteristic variables are randomly drawn, the information content of each of these features is calculated, and the feature with the largest information content is selected to split the node; then, the features are arranged in descending order of information content, and splitting stops when the error value is at its minimum; finally, the set of characteristic variables with the largest overall information content and the smallest error is selected as the core characteristic variables, namely the important characteristic variables affecting distribution transformer faults.

3. The distribution transformer fault identification method based on operation characteristics as claimed in claim 1, wherein the machine learning algorithm adopted in step S2 is the AdaBoost algorithm.

4. The distribution transformer fault identification method based on operation characteristics as claimed in claim 3, wherein the distribution transformer fault early warning model is constructed by using the AdaBoost algorithm as follows:

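defining the distribution transformer data set $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, where each instance $x_i \in \chi$, the instance space $\chi \subset \mathbb{R}^n$, and $y_i$ belongs to the label set $\{-1, +1\}$;

(1) initializing the weight distribution of the data in the distribution transformer data set, namely the training data, each training sample being initially given the same weight, 1/N:

$D_1 = (w_{11}, w_{12}, \ldots, w_{1i}, \ldots, w_{1N}), \quad w_{1i} = \frac{1}{N}, \quad i = 1, 2, \ldots, N$

where $D_1$ represents the sample weights in the first iteration, $w_{1i}$ represents the weight of the i-th sample in the first iteration, and N is the total number of samples;

(2) performing multiple iterations, m = 1, 2, …, M, where m denotes the iteration round and M the number of iterations:

A. learning from the distribution transformer data set with the weight distribution $D_m$ to obtain a basic classifier:

$G_m(x): \chi \to \{-1, +1\}$

B. calculating the classification error rate of $G_m(x)$ on the distribution transformer data set:

$e_m = P(G_m(x_i) \neq y_i) = \sum_{i=1}^{N} w_{mi} I(G_m(x_i) \neq y_i)$

C. calculating the coefficient $a_m$ of $G_m(x)$, which represents the importance of $G_m(x)$ in the final classifier:

$a_m = \frac{1}{2} \ln \frac{1 - e_m}{e_m}$

D. updating the weight distribution of the distribution transformer data set, so that the weights of samples misclassified by the basic classifier $G_m(x)$ increase while the weights of correctly classified samples decrease:

$D_{m+1} = (w_{m+1,1}, w_{m+1,2}, \ldots, w_{m+1,i}, \ldots, w_{m+1,N})$,

$w_{m+1,i} = \frac{w_{mi}}{Z_m} \exp(-a_m y_i G_m(x_i)), \quad i = 1, 2, \ldots, N$

where $Z_m = \sum_{i=1}^{N} w_{mi} \exp(-a_m y_i G_m(x_i))$ is a normalization factor that makes $D_{m+1}$ a probability distribution;

(3) combining the weak classifiers:

$f(x) = \sum_{m=1}^{M} a_m G_m(x)$

thereby obtaining the final classifier, namely the distribution transformer fault early warning model:

$G(x) = \operatorname{sign}(f(x)) = \operatorname{sign}\left(\sum_{m=1}^{M} a_m G_m(x)\right)$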