CN114118248A - Power transformer fault diagnosis method based on width learning under unbalanced sample - Google Patents

Power transformer fault diagnosis method based on width learning under unbalanced sample

Info

Publication number
CN114118248A
CN114118248A (application CN202111385191.4A)
Authority
CN
China
Prior art keywords: sample, stage, classifier, training, samples
Prior art date
Legal status
Pending
Application number
CN202111385191.4A
Other languages
Chinese (zh)
Inventor
许超
李小兰
杨波
王振浩
李泽曦
张琦
刘东延
卢毅
杨旭
郑舒文
谭澈
赵宁
孙守道
赵贝加
谢杰
张志鹏
Current Assignee
Shenyang Aibeike Technology Co ltd
State Grid Corp of China SGCC
Shenyang Power Supply Co of State Grid Liaoning Electric Power Co Ltd
Original Assignee
Shenyang Aibeike Technology Co ltd
State Grid Corp of China SGCC
Shenyang Power Supply Co of State Grid Liaoning Electric Power Co Ltd
Priority date
Filing date
Publication date
Application filed by Shenyang Aibeike Technology Co ltd, State Grid Corp of China SGCC, Shenyang Power Supply Co of State Grid Liaoning Electric Power Co Ltd filed Critical Shenyang Aibeike Technology Co ltd
Priority to CN202111385191.4A priority Critical patent/CN114118248A/en
Publication of CN114118248A publication Critical patent/CN114118248A/en
Pending legal-status Critical Current

Classifications

    • G06F18/2415 Classification techniques relating to the classification model based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/253 Fusion techniques of extracted features
    • G06Q10/20 Administration of product repair or maintenance
    • G06Q50/06 Energy or water supply

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Biology (AREA)
  • Economics (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Human Resources & Organizations (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Marketing (AREA)
  • Health & Medical Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Public Health (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Water Supply & Treatment (AREA)
  • Probability & Statistics with Applications (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Other Investigation Or Analysis Of Materials By Electrical Means (AREA)

Abstract

The invention discloses a power transformer fault diagnosis method based on width learning under unbalanced samples. A first-stage BLS classifier first performs a first-level classification of transformer faults, and a second-stage Softmax classifier then refines it into a second-level classification, improving both the accuracy and the efficiency of fault diagnosis. In training the classifiers, a width learning system (BLS), which constructs its network model with an incremental algorithm of horizontal network expansion, avoids the main drawbacks of deep learning: compared with a deep neural network it achieves high accuracy with a high training speed, and in practical application its parameters are simple to set and easy to operate.

Description

Power transformer fault diagnosis method based on width learning under unbalanced sample
Technical Field
The invention relates to the technical field of transformer fault diagnosis, in particular to a width learning-based power transformer fault diagnosis method under an unbalanced sample.
Background
The role of power transformers in power systems is irreplaceable, and their normal operation is closely tied to the normal transmission of electric energy across the entire power grid. Although the research and manufacturing technology of electrical equipment in China has advanced in recent years, insulation aging, harsh environments, excessive operating load and similar factors can still induce power transformer faults of varying severity, causing social and economic losses and, in serious cases, large-scale power outages, grid collapse and other major accidents.
Dissolved gas analysis (DGA) of transformer oil examines the differences in content and gas-production rate of characteristic gases such as hydrogen, methane, ethylene, acetylene and ethane in the insulating oil under different operating states, providing an important basis for evaluating the operating state of a transformer. Because it supports live, on-line detection, DGA is widely applied in China in the field of transformer condition monitoring and fault diagnosis.
From theoretical analysis and practical experience, researchers initially established simple, well-defined method systems such as the three-ratio method, the Rogers ratio method and the Duval triangle method. Limited, however, by problems such as missing codes, absolute thresholds and the need for field verification, these methods have gradually been relegated to auxiliary means of transformer fault diagnosis. With the development of machine learning theory, fault diagnosis methods based on artificial intelligence have become a popular research direction owing to their higher accuracy in classifying transformer operating-state types. However, as models grow, the weights of deep learning networks update slowly, and the classification success rate of unsupervised and semi-supervised networks is low.
Therefore, developing an artificial-intelligence-based fault diagnosis method suitable for power transformers has become a problem to be solved urgently.
Disclosure of Invention
In view of the above, the invention provides a power transformer fault diagnosis method based on width learning under an unbalanced sample, so as to solve the problem of low diagnosis success rate of the conventional artificial intelligent fault diagnosis method.
The technical scheme of the invention is a power transformer fault diagnosis method based on width learning under unbalanced samples, which comprises the following steps:
s1: performing characteristic extraction on dissolved gas in the transformer oil based on a dissolved gas analysis method to obtain primary input data;
s2: inputting the primary input data into a trained primary BLS classifier to obtain a primary classification of the transformer state;
s3: fusing the features of the primary input data with the corresponding first-level classification of the transformer state, then inputting the fused features into a trained second-stage Softmax classifier to obtain the second-level classification of the transformer state.
Preferably, the first-level classification includes: normal, discharge fault, and overheat fault; the second level of classification includes: normal, partial discharge, spark discharge, arc discharge, spark discharge with superheat, arc discharge with superheat, low temperature superheat, medium temperature superheat, and high temperature superheat.
Further preferably, in step S1, the characteristic extraction is performed on the dissolved gas in the transformer oil based on a dissolved gas analysis method to obtain primary input data, specifically:
the volume fractions of 5 gases including hydrogen, methane, ethylene, acetylene and ethane in the transformer oil are obtained based on a dissolved gas analysis method, and acquired data are obtained
g_i = (g_i1, g_i2, g_i3, g_i4, g_i5), i = 1, 2, …, G;

each collected sample g_i is then normalized according to the following formula to obtain the primary input data:

x_ij = (g_ij − min_i g_ij) / (max_i g_ij − min_i g_ij)

where i = 1, 2, …, G denotes the sample number, G is the total number of samples, and j = 1, 2, …, 5 indexes the five gases.
Further preferably, the specific training process of the first-stage BLS classifier in step S2 is as follows:
s201: acquiring historical dissolved gas characteristics in transformer oil, and performing normalization processing to obtain sample characteristic data X;
s202: obtaining a first-stage classification Y of the transformer state corresponding to each sample characteristic data X;
s203: converting the sample characteristic data X into characteristic nodes according to the following formula;
Z_i = φ_i(X W_ei + β_ei), i = 1, 2, …, n;

where W_ei ∈ R^{M×p} and β_ei ∈ R^{1×p}; W_ei is the weight and β_ei the bias of the i-th group of feature nodes, M is the dimension of the sample feature data, p is the number of feature nodes per group, and n is the number of groups of feature nodes;
s204: generating an enhanced node by the characteristic node according to the following formula;
H_j = ξ_j(Z^n W_hj + β_hj), j = 1, 2, …, m;

where Z^n = [Z_1, Z_2, …, Z_n], W_hj ∈ R^{np×q} and β_hj ∈ R^{1×q}; W_hj is the weight and β_hj the bias of the j-th group of enhancement nodes, m is the number of groups of enhancement nodes, and q is the number of enhancement nodes per group;
s205: splicing the characteristic nodes and the enhanced nodes to be used as an input layer, and calculating an output weight by adopting the following formula according to a first-stage classification Y of the transformer state corresponding to the sample characteristic data X to obtain a trained first-stage BLS classifier;
W^m = [Z^n | H^m]^+ Y;

where H^m = [H_1, H_2, …, H_m] and ^+ denotes the pseudo-inverse.
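As an illustrative sketch of the output-weight computation of step s205 (array shapes and values are assumptions, not from the patent), the pseudo-inverse solution can be written with NumPy:

```python
import numpy as np

# Illustrative shapes (assumptions): N samples, feature nodes Z (N x 6),
# enhancement nodes H (N x 4), one-hot first-level targets Y (N x 3).
rng = np.random.default_rng(0)
N = 20
Z = rng.standard_normal((N, 6))
H = np.tanh(Z @ rng.standard_normal((6, 4)))  # enhancement nodes derived from Z
Y = np.eye(3)[rng.integers(0, 3, N)]          # one-hot first-level labels

A = np.hstack([Z, H])                         # input layer [Z^n | H^m]
W = np.linalg.pinv(A) @ Y                     # W^m = [Z^n | H^m]^+ Y
```

The pseudo-inverse gives the least-squares fit of the spliced node layer to the first-level labels.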
Further preferably, in step S3, the training process of the second-stage Softmax classifier specifically includes the following steps:
s301: performing feature fusion on the sample feature data X and a first class classification Y of the transformer state corresponding to the sample feature data X;
s302: separating the samples after the characteristics are fused to obtain a minority sample set and a majority sample set;
s303: obtaining a plurality of balance training subsets according to the minority sample set and the majority sample set;
s304: and inputting the plurality of balanced training subsets into a second-stage Softmax classifier for training with the EasyEnsemble ensemble learning method, obtaining the trained second-stage Softmax classifier.
More preferably, step S302: separating the sample after the characteristic fusion to obtain a minority sample and a majority sample, specifically:
s3021: acquiring the classes of the unbalanced data set;
s3022: separating the feature-fused samples according to the classes of the unbalanced data set; the class with the fewest samples among all classes forms the minority sample set P, and the remaining classes form the majority sample sets N_1, N_2, …, N_{K−1}.
More preferably, S303: obtaining a plurality of balance training subsets according to the minority sample set and the majority sample set, specifically:
s3031: performing T rounds of repeated, random, independent undersampling with replacement on each majority sample set N_i; each sampling round t produces a subset N_it, giving the majority-class subsets N_t = {N_1t, N_2t, …, N_{K−1,t}} with |N_it| = |P|, where |N_it| is the number of samples in the majority-class subset N_it and |P| is the number of samples in the minority sample set P;

s3032: combining each collection of majority-class subsets N_t = {N_1t, N_2t, …, N_{K−1,t}} with the minority sample set P to obtain a plurality of balanced training subsets.
More preferably, step S304: inputting the balance training subset into a second-stage Softmax classifier for training to obtain the trained second-stage Softmax classifier, which specifically comprises the following steps:
s3041: taking a plurality of balance data subsets as training samples, and respectively inputting the training samples into corresponding sub-classifiers for training;
s3042: and calculating the multi-class probability of the transformer state with softmax as the activation function, and taking the second-level class code number k (k = 1, 2, …, 8) with the highest probability as the diagnosis result, where the probability of class k is

ŷ_k = softmax(o)_k = e^{o_k} / Σ_j e^{o_j}

with o the output vector of the sub-classifier;
s3043: updating the parameters of the sub-classifier through the cross-entropy loss function

L_2 = − Σ_k y_k log ŷ_k

to obtain the trained sub-classifier, where y_k is the k-th component of the second-stage output characteristic Y_{2,t} and ŷ_k is the predicted probability;
s3044: merging the weights and bias parameters of the trained sub-classifiers, according to the preset weights α_t, into the final second-stage Softmax classifier H_2, obtaining the trained second-stage Softmax classifier

H_2(x) = argmax_k Σ_{t=1}^{T} α_t h_{2,t}(x).
According to the method for diagnosing the fault of the power transformer based on the width learning under the unbalanced sample, the fault of the transformer is firstly classified in a first stage by adopting a first-stage BLS classifier, and then classified in a second stage by adopting a second-stage Softmax classifier on the basis of the first-stage classification, so that the accuracy and the efficiency of fault diagnosis are improved.
In the process of training the classifiers, a width learning system (BLS) that constructs its network model with an incremental algorithm of horizontal network expansion avoids the main drawbacks of deep learning: compared with a deep neural network it achieves high accuracy with a high training speed, and in practical application its parameters are simple to set and easy to operate.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments or technical solutions in the prior art of the present invention, the drawings used in the description of the embodiments or prior art will be briefly described below, and it is obvious for those skilled in the art that other drawings can be obtained based on these drawings without creative efforts.
FIG. 1 is a schematic diagram of hierarchical classification provided by the disclosed embodiment of the present invention;
FIG. 2 is a DAG diagram of transformer states provided by the disclosed embodiment of the invention;
FIG. 3 is a block diagram of a first stage classifier width learning system (BLS) in accordance with a disclosed embodiment of the invention;
FIG. 4 is a schematic diagram of an Easyensemble algorithm provided by the disclosed embodiment of the invention;
FIG. 5 is a flow chart of a training process for providing a first-stage classifier and a second-stage classifier according to an embodiment of the present disclosure.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. The following description refers to the accompanying drawings, in which like numerals refer to the same or similar elements throughout the different views, unless otherwise specified. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of methods consistent with certain aspects of the invention, as detailed in the appended claims.
The embodiment provides a width learning-based power transformer fault diagnosis method under an unbalanced sample, which comprises the following steps:
s1: performing characteristic extraction on dissolved gas in the transformer oil based on a dissolved gas analysis method to obtain primary input data;
s2: inputting the primary input data into a trained primary BLS classifier to obtain a primary classification of the transformer state;
s3: after feature fusion of the primary input data with its corresponding first-level classification of the transformer state, the fused features are input into a trained second-stage Softmax classifier, and the second-level classification of the transformer state is obtained;
in the diagnosis method, a first-stage BLS classifier is adopted to perform first-stage classification on the fault of the transformer, and a second-stage Softmax classifier is adopted to perform second-stage classification on the fault on the basis of the first-stage classification, and fig. 1 is a schematic diagram of specific hierarchical classification.
Referring to fig. 2, according to the hierarchical classification structure of transformer state types, the states are divided into 3 first-level classifications Y (normal, discharge fault and overheat fault) and a second-level classification Y_z of 9 subdivided state types: normal, partial discharge, spark discharge, arc discharge, spark discharge with overheat, arc discharge with overheat, low-temperature overheat, medium-temperature overheat and high-temperature overheat.
In the above embodiment, five gases (hydrogen, methane, ethylene, acetylene and ethane) are used as the characteristic gases. To reduce the influence of absolute-value fluctuations of each characteristic gas concentration between samples and to ensure convergence of the model network, a normalization method uses the global extreme values of each characteristic gas to scale its values into the target interval [0, 1] according to that gas's own maximum and minimum. This quantifies the distribution of a given characteristic gas across samples while preserving the relative magnitudes of the component-gas volume fractions within each sample. Each original gas sample g_i is processed as

x_ij = (g_ij − min_i g_ij) / (max_i g_ij − min_i g_ij), i = 1, 2, …, G,

and the normalized gas data are recorded as

X = [x_1; x_2; …; x_G], x_i = (x_i1, x_i2, x_i3, x_i4, x_i5),

where i = 1, 2, …, G denotes the sample number, G is the total number of samples, and j = 1, 2, …, 5 indexes the five gases.
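A minimal sketch of this min-max normalization, assuming made-up volume fractions for the five gases:

```python
import numpy as np

# Rows are samples g_i; columns are the 5 characteristic gases
# (H2, CH4, C2H4, C2H2, C2H6). Values are illustrative, not real DGA data.
g = np.array([[120.0, 30.0, 12.0, 0.5, 25.0],
              [ 40.0, 10.0,  4.0, 0.1,  8.0],
              [200.0, 55.0, 30.0, 2.0, 60.0]])

g_min = g.min(axis=0)                 # per-gas minimum over all samples
g_max = g.max(axis=0)                 # per-gas maximum over all samples
x = (g - g_min) / (g_max - g_min)     # each gas column scaled into [0, 1]
```

Each column is scaled independently, so the relative ordering of samples within a gas is preserved.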
Because the number of cases is unbalanced across the fault classes of a power transformer, directly using historical data as samples to train the first-stage and second-stage classifiers would give low fault-diagnosis accuracy. Aiming at this sample imbalance, the present embodiment therefore provides a width learning system (BLS) that constructs its network model with an incremental algorithm of horizontal network expansion. The specific training processes of the classifiers are as follows:
1. specific training procedure for first stage BLS classifier
In the first-level classification of the power transformer, the balance of the samples meets the training requirement, so the collected sample data can be used directly for training the first-stage BLS classifier after normalization. The steps are as follows:
s201: acquiring historical dissolved gas characteristics in transformer oil, and performing normalization processing to obtain sample characteristic data X;
as characteristic gases, 5 gases of hydrogen, methane, ethylene, acetylene, and ethane were used. In order to reduce the influence of the concentration of each characteristic gas on the absolute value fluctuation of a sample and ensure the convergence of a model network, a normalization method is adopted to combine the extreme values of the global characteristic gas, and a target interval of [0, 1 ] is carried out according to the maximum value and the minimum value of the characteristic gas per se]Reduction of numerical values ofAnd (4) placing. Under the condition of quantifying a certain characteristic gas distribution among samples and simultaneously keeping the relation of volume fractions and relative sizes of all component gases in the samples, g is taken as each original gas sampleiThe treatment is carried out, and the treatment is carried out,
Figure BDA0003366740190000064
i is 1,2, …, G. And the normalized gas data are recorded as:
Figure BDA0003366740190000065
wherein,
Figure BDA0003366740190000066
in the formula: i ═ 1,2, …, G, indicating the sample number; g is the total number of samples; j-1, 2,3,4,5, each represent 5 gases.
S202: obtaining a first-stage classification Y of the transformer state corresponding to each sample characteristic data X;
s203: converting the sample characteristic data X into characteristic nodes according to the following formula;
the normalized input data is set as { X, Y }. epsilon.RN×(M+C)Where X is sample feature data, Y is a first level of classification, N represents the number of samples, M represents the dimension of the input sample feature, and C represents the number of classes. Before converting the input sample features into feature nodes, parameters n and p need to be defined. Where n represents the number of sets of feature nodes and p represents p nodes per set of feature nodes. The process of converting the input sample features into feature nodes:
Zi=φi(XWeiei),i=1,2,…,n
wherein X ∈ RN×M,Wei∈RM×p,βei∈R1×pAnd W isei,βeiGenerated in a random manner, representing the weights and biases used to generate the ith set of feature nodes, respectively. ZiRepresenting the ith set of generated feature nodes. All n groups of characteristic nodes are spliced together to obtain the mostFinal set of feature nodes ZnWherein Z isn=[Z1,Z2,…,Zn]。
S204: generating an enhanced node by the characteristic node according to the following formula;
similarly, the number of groups of enhancement nodes is defined as m, and the number of enhancement nodes in each group is defined as q. The process of generating the enhancement node from the feature node is as follows:
H_j = ξ_j(Z^n W_hj + β_hj), j = 1, 2, …, m

where Z^n ∈ R^{N×np}, W_hj ∈ R^{np×q}, β_hj ∈ R^{1×q}; W_hj and β_hj are generated randomly and represent, respectively, the weight and bias used to generate the j-th group of enhancement nodes, and H_j is the j-th group of generated enhancement nodes. All m groups of enhancement nodes are spliced together to obtain the final enhancement-node set H^m = [H_1, H_2, …, H_m].
S205: splicing the characteristic nodes and the enhancement nodes to be used as an input layer, and obtaining a trained first-stage BLS classifier according to a first-stage classification Y of the transformer state corresponding to the sample characteristic data X;
after the enhanced nodes are generated, the feature nodes and the enhanced nodes need to be spliced together to be used as an input layer, and an output result is obtained through further calculation. Let us assume the weight between the connection input layer and output layer to be WmThe output Y of the first stage BLS classifier is then obtained, which is also the first stage classification, as shown in fig. 3.
Y = [Z_1, Z_2, …, Z_n | H_1, …, H_m] W^m = [Z^n | H^m] W^m
The pseudo-inverse algorithm is a convenient way to obtain the output-layer weights of a flat neural network with random weights, so the output-layer weights can be quickly calculated as

W^m = [Z^n | H^m]^+ Y
however, due to the dimension and speed of the training data, it is too costly to use standard methods such as orthogonal projection, iteration, and singular value decomposition to compute the generalized inverse. The invention adopts another method to solve the pseudo-inverse.
W^m = argmin_W ( ‖[Z^n | H^m] W − Y‖_v^{σ_1} + λ ‖W‖_u^{σ_2} )

where σ_1 > 0, σ_2 > 0, and v and u are regularization norms. Relative to the original generalized inverse, a constraint term λ is added to the least-squares estimate over A = [Z^n | H^m] to obtain a regularized pseudo-inverse. When λ = 0, the problem degenerates to least squares and the solution of the original pseudo-inverse problem is recovered. Taking σ_1 = σ_2 = 2 and v = u = 2 turns the optimization problem into a ridge-regression learning algorithm. Ridge regression is a biased estimate of the regression parameter that admits larger residuals but makes the model more general. The connection weights of the first-stage BLS classifier can therefore be approximated as

W^m = (λI + A^T A)^{−1} A^T Y
then, there are:
A^+ = lim_{λ→0} (λI + A^T A)^{−1} A^T
and finally, obtaining the trained first-stage BLS classifier through the calculated weight.
2. Specific training process of second-stage Softmax classifier
Because the second-level classification of the power transformer suffers from a severe unbalanced-sample problem, the collected training samples must be balanced before the second classifier is trained, as follows:
s301: performing feature fusion on the sample feature data X and a first class classification Y of the transformer state corresponding to the sample feature data X;
the original characteristics X can obtain three primary labels through a primary classifier, wherein the primary labels are respectively normal, discharge fault and overheat fault, and the codes are respectively 1,0 and 0 as shown in the following table 1; 0,1, 0; 0,0,1, and then performing feature fusion on the obtained first-stage classification Y and the original features X to serve as the input of a next-stage classifier.
TABLE 1
first-level state | code
normal | 1, 0, 0
discharge fault | 0, 1, 0
overheat fault | 0, 0, 1
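The feature-fusion step (concatenating the original gas features with the one-hot first-level code) can be sketched as follows, with made-up feature values:

```python
import numpy as np

# Two normalized gas samples (5 features each) and their one-hot
# first-level codes from Table 1. Values are illustrative.
X = np.array([[0.2, 0.1, 0.8, 0.0, 0.3],
              [0.9, 0.4, 0.1, 0.7, 0.2]])
Y = np.array([[1, 0, 0],        # normal
              [0, 1, 0]])       # discharge fault

X2 = np.hstack([X, Y])          # fused input: 5 gas values + 3 label codes
```

The fused matrix X2 is what the second-stage classifier consumes.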
S302: separating the samples after the characteristics are fused to obtain a minority sample set and a majority sample set;
unbalanced dataset class C1,C2,…,C9The sample set with the least number of samples in the normal, partial discharge, spark discharge, arc discharge, low-temperature overheat, medium-temperature overheat, high-temperature overheat, spark discharge and overheat, and arc discharge and overheat is a least number of sample sets P, the number is | P |, and the other most sample sets are N1,N2,…,NK-1
S303: obtaining a plurality of balance training subsets according to the minority sample set and the majority sample set;
repeating all majority sample set for T times with replaced random independent undersampling, majority sample set NiSeveral subsets N can be generated during each sampling titAnd satisfy | NitI ═ P |, where i ═ 1,2, …, K-1; t ═ 1,2, …, T, as shown in table 2.
Table 2:
Figure BDA0003366740190000092
in the above sampling process, most samplesThis NiArbitrary sample (x) of (a)ij,yij)∈NiIs acquired at least once to a subset NitThe probability of (1) - (1-1/| N)i|)|P|Probability L of occurring at least once in T subsetsoneAnd all occurrences of the possibility LallRespectively as follows:
L_one = 1 − (1 − 1/|N_i|)^{|P|·T}

L_all = [1 − (1 − 1/|N_i|)^{|P|}]^T
under the condition of constant | P |, the larger the sampling frequency T is, the more the majority of samples (x)ij,yij) Likelihoods L occurring in the ensemble of training subsetsoneThe larger the sample is, so that most samples can be guaranteed to be sampled to a training set, and information loss in the undersampling process is prevented; meanwhile, there is a partial majority sample set sample number | NiIn the case that the number of samples | P | is not different from the number of samples | P | in the minority sample set by multiple times, the samples (x)ij,yij) Probability of occurrence in the ensemble of training subsets LallThe sampling quality of a small number of most samples can be guaranteed to a certain extent, compared with the sampling quality of a plurality of samples with larger difference multiples.
Then, the majority-class subsets N_t = {N_1t, N_2t, …, N_{K−1,t}} obtained in each undersampling round t are combined with the minority sample set P to form the training subset of each sub-classifier, D_t = N_t ∪ P. Each class in the training subset then has the same number of samples, i.e. each training subset is a balanced data set.
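A sketch of this balanced-subset construction, with assumed class sizes and index-based sampling:

```python
import numpy as np

def balanced_subsets(majority_sets, P_idx, T, seed=0):
    # Sketch of steps S3031 and S3032: T rounds of with-replacement
    # undersampling of each majority class down to |P|, then union with P.
    rng = np.random.default_rng(seed)
    subsets = []
    for t in range(T):
        Nt = [rng.choice(Ni, size=len(P_idx), replace=True) for Ni in majority_sets]
        subsets.append(np.concatenate(Nt + [P_idx]))   # D_t = N_t U P
    return subsets

# Assumed example: two majority classes of 40 samples each, minority class of 10.
majors = [np.arange(0, 40), np.arange(40, 80)]
P_idx = np.arange(80, 90)
D = balanced_subsets(majors, P_idx, T=5)
```

Each D_t holds |P| indices per class, so every sub-classifier trains on a balanced set.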
S304: and inputting the plurality of balanced training subsets into the second-stage Softmax classifier for training with the EasyEnsemble ensemble learning method, obtaining the trained second-stage Softmax classifier.
As shown in FIG. 4, when the easy Ensemble ensemble learning method is adopted, first, X is input to the second level2Constructing a data-balanced training subset X2,tThe fault type with small number of samples is used as a few class samples, the other classes are used as a majority class sample to carry out undersampling to obtain T training subsets, and each training subset can be recorded as
Dt=(X2,t,Y2,t),t=1,2,...,T
In the formula, Y2,t is the second-stage output corresponding to X2,t.
As shown in FIG. 5, T sub-classifiers h2,t are trained in parallel; the multi-classification probability of the transformer state is calculated with softmax as the activation function, and the second-level class code number k (k = 1, 2, 3, 4, 5, 6, 7, 8) with the highest probability is taken as the diagnosis result.
P(y = k | x) = exp(zk) / Σj=1..8 exp(zj), k = 1, 2, …, 8
The parameters of the sub-classifiers are then updated through the cross-entropy loss function L2:

L2 = -Σk=1..8 yk · log(ŷk)

wherein yk is the one-hot coded label of class k and ŷk is the softmax probability predicted for class k.
After the parallel training of the sub-classifiers is finished, the weights and bias parameters of the sub-classifiers are merged according to the preset weights αt into the final second-stage Softmax classifier H2, whose final output is:

H2(x) = argmax_k Σt=1..T αt · h2,t(x)
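A minimal sketch of this second stage follows: T single-layer softmax sub-classifiers trained with cross-entropy, merged by preset weights αt at prediction time. The class and function names are illustrative, and plain gradient descent stands in for whatever optimizer the patent actually uses:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

class SoftmaxSubClassifier:
    """One sub-classifier h_{2,t}: a single softmax layer trained by
    gradient descent on the cross-entropy loss."""
    def __init__(self, n_features, n_classes, lr=0.1, epochs=200, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.01, size=(n_features, n_classes))
        self.b = np.zeros(n_classes)
        self.lr, self.epochs = lr, epochs

    def fit(self, X, y_onehot):
        for _ in range(self.epochs):
            p = softmax(X @ self.W + self.b)
            grad = p - y_onehot                  # dL/dz for softmax + cross-entropy
            self.W -= self.lr * X.T @ grad / len(X)
            self.b -= self.lr * grad.mean(axis=0)
        return self

    def predict_proba(self, X):
        return softmax(X @ self.W + self.b)

def ensemble_predict(classifiers, alphas, X):
    """Merge the T sub-classifiers with preset weights alpha_t and return
    the class index k with the highest combined probability."""
    p = sum(a * c.predict_proba(X) for a, c in zip(alphas, classifiers))
    return p.argmax(axis=1)
```

In use, each sub-classifier would be fitted on one balanced subset D_t, and ensemble_predict plays the role of the merged classifier H2.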
other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It is to be understood that the present invention is not limited to what has been described above, and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (8)

1. A power transformer fault diagnosis method based on width learning under unbalanced samples, characterized by comprising the following steps:
S1: performing feature extraction on the dissolved gas in the transformer oil based on a dissolved gas analysis method to obtain primary input data;
S2: inputting the primary input data into a trained first-stage BLS classifier to obtain a first-stage classification of the transformer state;
S3: after performing feature fusion on the primary input data and the corresponding first-stage classification of the transformer state, inputting the fused features into a trained second-stage Softmax classifier to obtain a second-stage classification of the transformer state.
2. The power transformer fault diagnosis method based on width learning under unbalanced samples according to claim 1, wherein the first-stage classification comprises: normal, discharge fault, and overheat fault; and the second-stage classification comprises: normal, partial discharge, spark discharge, arc discharge, spark discharge with overheating, arc discharge with overheating, low-temperature overheating, medium-temperature overheating, and high-temperature overheating.
3. The power transformer fault diagnosis method based on width learning under unbalanced samples according to claim 1, wherein in step S1, feature extraction is performed on the dissolved gas in the transformer oil based on the dissolved gas analysis method to obtain the primary input data, specifically:
the volume fractions of 5 gases in the transformer oil, namely hydrogen, methane, ethylene, acetylene and ethane, are obtained based on the dissolved gas analysis method, giving the collected data
gi = (gi1, gi2, gi3, gi4, gi5);
the collected data gi are normalized according to the following formula to obtain the primary input data:
xij = (gij - mini(gij)) / (maxi(gij) - mini(gij));
wherein i = 1, 2, …, G denotes the sample number; G is the total number of samples; and j = 1, 2, 3, 4, 5 denotes the 5 gases respectively.
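By way of illustration only: the normalization formula in the source is an image and could not be recovered exactly, so the sketch below assumes ordinary column-wise min-max scaling of the five gas volume fractions; the function name and data are hypothetical:

```python
import numpy as np

def normalize_dga(G):
    """Column-wise min-max normalization of the collected DGA matrix G
    (rows: samples i = 1..G; columns: the 5 gas volume fractions
    H2, CH4, C2H4, C2H2, C2H6), mapping each gas to [0, 1]."""
    G = np.asarray(G, dtype=float)
    g_min = G.min(axis=0)
    g_max = G.max(axis=0)
    span = np.where(g_max > g_min, g_max - g_min, 1.0)  # avoid divide-by-zero
    return (G - g_min) / span

# Three hypothetical samples of the five gas volume fractions.
X1 = normalize_dga([[10.0, 5.0, 3.0, 0.1, 2.0],
                    [50.0, 9.0, 7.0, 0.5, 4.0],
                    [30.0, 7.0, 5.0, 0.3, 3.0]])
```

The normalized matrix X1 is what the claims call the primary input data fed to the first-stage classifier.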
4. The power transformer fault diagnosis method based on width learning under unbalanced samples according to claim 1, wherein the specific training process of the first-stage BLS classifier in step S2 is as follows:
S201: acquiring historical dissolved-gas features in the transformer oil and performing normalization processing to obtain sample feature data X;
S202: obtaining the first-stage classification Y of the transformer state corresponding to each sample feature data X;
S203: converting the sample feature data X into feature nodes according to the following formula:
Zi = φi(X·Wei + βei), i = 1, 2, …, n;
wherein Wei ∈ R^(M×p), βei ∈ R^(1×p); Wei denotes the weights of the i-th group of feature nodes, βei denotes the bias of the i-th group of feature nodes, M denotes the dimension of the sample feature data, p denotes the number of feature nodes in each group, and n denotes the number of groups of feature nodes;
S204: generating enhancement nodes from the feature nodes according to the following formula:
Hj = ξj(Z^n·Whj + βhj), j = 1, 2, 3, …, m;
wherein Z^n = [Z1, Z2, …, Zn], Whj ∈ R^(np×q), βhj ∈ R^(1×q); Whj denotes the weights of the j-th group of enhancement nodes, βhj denotes the bias of the j-th group of enhancement nodes, m denotes the number of groups of enhancement nodes, and q denotes the number of enhancement nodes in each group;
S205: splicing the feature nodes and the enhancement nodes as the input layer, and calculating the output weights with the following formula according to the first-stage classification Y of the transformer state corresponding to the sample feature data X, obtaining the trained first-stage BLS classifier:
W^m = [Z^n | H^m]^+ · Y;
wherein H^m = [H1, H2, …, Hm], and (·)^+ denotes the Moore-Penrose pseudoinverse.
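Steps S203-S205 can be sketched compactly: random feature nodes, random enhancement nodes, and output weights solved in closed form with the pseudoinverse. This is an illustrative minimal BLS, not the patent's exact implementation; tanh stands in for the unspecified mappings φi and ξj, and all names are ours:

```python
import numpy as np

rng = np.random.default_rng(0)

def bls_train(X, Y, n_groups=3, p=8, m_groups=3, q=8):
    """Minimal Broad Learning System: random feature nodes Z_i, random
    enhancement nodes H_j, output weights via Moore-Penrose pseudoinverse."""
    M = X.shape[1]
    We = [rng.normal(size=(M, p)) for _ in range(n_groups)]
    be = [rng.normal(size=(1, p)) for _ in range(n_groups)]
    Z = np.hstack([np.tanh(X @ W + b) for W, b in zip(We, be)])   # Z^n
    Wh = [rng.normal(size=(Z.shape[1], q)) for _ in range(m_groups)]
    bh = [rng.normal(size=(1, q)) for _ in range(m_groups)]
    H = np.hstack([np.tanh(Z @ W + b) for W, b in zip(Wh, bh)])   # H^m
    A = np.hstack([Z, H])                                         # [Z^n | H^m]
    Wm = np.linalg.pinv(A) @ Y                                    # W^m = A^+ Y
    return (We, be, Wh, bh, Wm)

def bls_predict(model, X):
    We, be, Wh, bh, Wm = model
    Z = np.hstack([np.tanh(X @ W + b) for W, b in zip(We, be)])
    H = np.hstack([np.tanh(Z @ W + b) for W, b in zip(Wh, bh)])
    return np.hstack([Z, H]) @ Wm
```

Because the output weights are solved in one shot rather than by iterative backpropagation, training is a single least-squares problem, which is the main appeal of width (broad) learning here.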
5. The power transformer fault diagnosis method based on width learning under unbalanced samples according to claim 4, wherein the training process of the second-stage Softmax classifier in step S3 is as follows:
S301: performing feature fusion on the sample feature data X and the first-stage classification Y of the transformer state corresponding to the sample feature data X;
S302: separating the feature-fused samples to obtain a minority-class sample set and majority-class sample sets;
S303: obtaining a plurality of balanced training subsets from the minority-class sample set and the majority-class sample sets;
S304: inputting the plurality of balanced training subsets into the second-stage Softmax classifier for training using the EasyEnsemble ensemble learning method to obtain the trained second-stage Softmax classifier.
6. The power transformer fault diagnosis method based on width learning under unbalanced samples according to claim 5, wherein step S302, separating the feature-fused samples to obtain a minority-class sample set and majority-class sample sets, specifically comprises:
S3021: acquiring the categories of the unbalanced data set;
S3022: separating the feature-fused samples according to the categories of the unbalanced data set, wherein the category with the fewest samples among all categories forms the minority-class sample set P, and the remaining categories form the majority-class sample sets N1, N2, …, NK-1.
7. The power transformer fault diagnosis method based on width learning under unbalanced samples according to claim 5, wherein S303, obtaining a plurality of balanced training subsets from the minority-class sample set and the majority-class sample sets, specifically comprises:
s3031: for the majority sample set NiPerforming repeated T times of random independent undersampling with release, and collecting multiple types of samples NiSeveral subsets N can be generated during each sampling titObtaining a plurality of sample subsets Nt={N1t,N2t,…,NK-1tIs and | NitI | ═ P |, where | Nit| is a subset N of majority class samplestThe number of samples in the sample set P, | P | the number of samples in the minority sample set P;
s3032: a plurality of sample subsets Nt={N1t,N2t,…,NK-1tAnd respectively combining the training data with the minority class sample sets P to obtain a plurality of balanced training subsets.
8. The power transformer fault diagnosis method based on width learning under unbalanced samples according to claim 5, wherein step S304, inputting the balanced training subsets into the second-stage Softmax classifier for training to obtain the trained second-stage Softmax classifier, specifically comprises:
S3041: taking the plurality of balanced training subsets as training samples and inputting them respectively into the corresponding sub-classifiers for training;
S3042: calculating the multi-classification probability of the transformer state with softmax as the activation function, and taking the second-level class code number k (k = 1, 2, 3, 4, 5, 6, 7, 8) with the highest probability as the diagnosis result,
P(y = k | x) = exp(zk) / Σj=1..8 exp(zj)
k* = argmax_k P(y = k | x);
S3043: updating the parameters of the sub-classifiers through the cross-entropy loss function L2,
L2 = -Σk=1..8 yk · log(ŷk),
to obtain the trained sub-classifiers, wherein Y2,t is the second-stage output;
s3044: the trained sub-classifier weight and the bias parameter are processed according to the preset weight value alphatMerge into the final second stage Softmax classifier H2Obtaining a second-stage Softmax classifier after training
Figure FDA0003366740180000034
CN202111385191.4A 2021-11-22 2021-11-22 Power transformer fault diagnosis method based on width learning under unbalanced sample Pending CN114118248A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111385191.4A CN114118248A (en) 2021-11-22 2021-11-22 Power transformer fault diagnosis method based on width learning under unbalanced sample


Publications (1)

Publication Number Publication Date
CN114118248A true CN114118248A (en) 2022-03-01

Family

ID=80439426


Country Status (1)

Country Link
CN (1) CN114118248A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination