CN111860658A

CN111860658A - Transformer fault diagnosis method based on cost sensitivity and integrated learning

Info

Publication number: CN111860658A
Application number: CN202010721965.5A
Authority: CN
Inventors: 刘云鹏; 和家慧; 刘一瑾; 王权
Original assignee: North China Electric Power University
Current assignee: North China Electric Power University
Priority date: 2020-07-24
Filing date: 2020-07-24
Publication date: 2020-10-30

Abstract

The invention discloses a transformer fault diagnosis method based on cost sensitivity and ensemble learning, which comprises the following steps: preprocessing fault data of various transformers and dividing it into a training sample set and a test set; establishing a transformer fault diagnosis model based on the AdaCost algorithm; training on the training sample set with weight distribution $D_t$ to obtain a weak learner $h_t(x)$; calculating the learning error rate of $h_t(x)$ and the weight it occupies in forming the strong classifier; introducing cost factors and updating the weight distribution of each sample in the training sample set; iterating until the number of iterations required by the error rate is reached, and forming a strong learner; and inputting the test set into the strong learner and voting to determine the fault type. Being based on the AdaCost algorithm, the method addresses the low overall precision of classifiers on unbalanced data sets and thereby improves fault judgment accuracy.

Description

Transformer fault diagnosis method based on cost sensitivity and integrated learning

Technical Field

The invention relates to the technical field of power equipment fault detection, in particular to a transformer fault diagnosis method based on cost sensitivity and integrated learning.

Background

Using artificial intelligence techniques such as machine learning to mine and analyze power equipment big data in depth is the trend in intelligent operation and maintenance. The power transformer is one of the most important pieces of electrical equipment in a power system; knowing its operating state improves its operation and maintenance and helps ensure the safe operation of the power grid. Abnormal-state samples of power transformers are scarce, and the information in fault cases and abnormal samples is often missing or incomplete, so the class distribution of transformer sample data sets is unbalanced. Although classification models such as artificial neural networks (ANNs) and support vector machines (SVMs) perform well in transformer fault diagnosis, on an unbalanced transformer sample set they pursue the minimum loss value or the maximum class margin, so the decision surface shifts toward the sparsely populated class. The miss rate for fault samples then becomes far higher than that for normal samples, and the classification accuracy of fault samples cannot be guaranteed, which can cause significant losses to the power system and even to the wider economy and society.

The class distribution of an unbalanced data set is highly skewed, so a machine learning model may over-fit or under-fit when performing classification, and its accuracy and robustness drop considerably. Learning from unbalanced data sets is both a focus and a difficulty in machine learning. Researchers have carried out a great deal of work aimed at improving classification performance on minority-class samples, and the proposed methods fall mainly into two levels: the algorithm level and the data level.

The data level mainly includes undersampling and oversampling. In essence, sample balance is achieved by adding minority-class samples or removing majority-class samples. The processing methods for unbalanced data sets mainly comprise four types: random oversampling, random undersampling, balanced sampling, and synthetic minority oversampling.

The algorithm level is dominated by cost-sensitive methods, which are currently widely applied in fields such as image recognition, medical diagnosis and credit rating. Cost-sensitive learning has three main implementation approaches:

(1) From the perspective of the learning model, a specific learning method is modified so that it can adapt to learning from unbalanced data; perceptrons, support vector machines, decision trees, neural networks and others each have cost-sensitive versions. Taking the cost-sensitive decision tree as an example, it can be improved in three respects to adapt to unbalanced data: decision threshold selection, splitting criterion selection, and pruning. Introducing a cost matrix into each of these makes the model adapt to unbalanced data sets.

(2) Based on Bayes risk theory, cost-sensitive learning is treated as post-processing of the classification result: a model is first learned in the traditional way, and its results are then adjusted with the aim of minimizing loss.

(3) From the perspective of preprocessing, costs are used to adjust weights so that the classifier becomes cost-sensitive; that is, during training the weights of high-cost misclassified samples are raised so that the classifier pays more attention to them. The representative algorithm is AdaCost, which is based on ensemble learning.

In many studies that use cost-sensitive algorithms, the elements of the cost matrix, namely the misdiagnosis costs between classes, are given by domain experts drawing on domain knowledge, and are therefore somewhat subjective. In actual transformer fault diagnosis, the misdiagnosis costs between faults are difficult to specify accurately: domain experts must combine domain knowledge with repeated tests and weigh factors such as fault severity and fault nature, and a cost matrix determined by expert scoring is inevitably highly subjective.

Disclosure of Invention

The invention aims to provide a transformer fault diagnosis method based on cost sensitivity and ensemble learning, which is based on an AdaCost algorithm, improves the weight of a high-cost error classification sample, reduces the weight of a high-cost correct classification sample, solves the problem of low overall precision of a classifier under an unbalanced data set, and further improves the fault judgment accuracy.

In order to achieve the purpose, the invention provides the following scheme:

A transformer fault diagnosis method based on cost sensitivity and ensemble learning comprises the following steps:

S1, preprocessing various transformer fault data and dividing it into a training sample set and a test set;

S2, establishing a transformer fault diagnosis model based on the AdaCost algorithm; let the training sample set be $X = \{(x_1, y_1), (x_2, y_2), \dots, (x_m, y_m)\}$, where $x_i$ is the sample feature vector formed from the gases dissolved in oil and $y_i$ is the fault type label, $x_i \in X$, $y_i \in Y = \{+1, -1\}$; let the number of iterations be $T$, $t = 1, 2, \dots, T$; let the sample weight distribution of the t-th iteration be $D_t = (w_{t1}, w_{t2}, \dots, w_{tm})$, $i = 1, 2, \dots, m$, with the initial weights distributed evenly, $w_{1i} = 1/m$;

let the weak learner formed by the t-th iteration be $h_t(x)$;

S3, training on the training sample set with weight distribution $D_t$ to obtain a weak learner $h_t(x)$;

S4, calculating the learning error rate $e_t$ of $h_t(x)$, $t = 1, 2, \dots, T$, where $I(\cdot)$ is the error indicator function;

S5, calculating the weight $\alpha_t$ that $h_t(x)$ occupies in forming the strong classifier, $t = 1, 2, \dots, T$;

S6, introducing cost factors and updating the weight distribution of each sample in the training sample set;

S7, letting $t$ take the values $1, 2, \dots, T$ in sequence, iterating until the number of iterations $T$ required by the error rate is reached, and integrating all weak learners through a combination strategy to form a strong learner;

S8, inputting the test set into the strong learner and voting to determine the fault type.
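Steps S2 to S7 can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes NumPy, a decision stump as the weak learner, the standard AdaBoost expressions for $e_t$ and $\alpha_t$, and a cost-adjustment function often used with AdaCost ($\beta_i = 0.5(1 - c_i)$ for correctly classified samples and $\beta_i = 0.5(1 + c_i)$ for misclassified ones, with a per-sample cost $c_i \in [0, 1]$). The names `fit_stump`, `stump_predict`, `adacost_fit` and `adacost_predict` are hypothetical.

```python
import numpy as np

def fit_stump(X, y, w):
    """Weighted decision stump: choose (feature, threshold, sign) with the lowest weighted error."""
    best = (np.inf, 0, 0.0, 1)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = np.where(X[:, j] <= thr, sign, -sign)
                err = np.sum(w[pred != y])
                if err < best[0]:
                    best = (err, j, thr, sign)
    return best[1:]

def stump_predict(stump, X):
    j, thr, sign = stump
    return np.where(X[:, j] <= thr, sign, -sign)

def adacost_fit(X, y, cost, T=10):
    """cost: assumed per-sample misclassification cost in [0, 1]."""
    m = len(y)
    w = np.full(m, 1.0 / m)                       # S2: uniform initial weights
    learners, alphas = [], []
    for _ in range(T):                            # S7: repeat for t = 1..T
        stump = fit_stump(X, y, w)                # S3: train weak learner on D_t
        pred = stump_predict(stump, X)
        e = np.clip(np.sum(w[pred != y]), 1e-10, 1 - 1e-10)   # S4: weighted error
        alpha = 0.5 * np.log((1 - e) / e)         # S5: learner weight
        # S6: assumed cost adjustment; high cost -> big increase if wrong, small decrease if right
        beta = np.where(pred == y, 0.5 * (1 - cost), 0.5 * (1 + cost))
        w = w * np.exp(-alpha * y * pred * beta)
        w /= w.sum()                              # normalisation (the role of Z_t)
        learners.append(stump)
        alphas.append(alpha)
    return learners, alphas

def adacost_predict(learners, alphas, X):
    """Strong learner: sign of the alpha-weighted sum of weak learner outputs."""
    score = sum(a * stump_predict(s, X) for s, a in zip(learners, alphas))
    return np.where(score >= 0, 1, -1)
```

A binary learner of this kind handles one pair of classes; labels must be encoded as +1 and -1, matching $Y = \{+1, -1\}$ in step S2.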

Optionally, in step S4, the learning error rate $e_t$ of $h_t(x)$, $t = 1, 2, \dots, T$, is calculated by the following formula:
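The formula itself is not reproduced in this text. Assuming the standard AdaBoost definition, which is consistent with the indicator function $I(\cdot)$ named in step S4 and the sample weights $w_{ti}$ of step S2, the weighted error rate would read:

```latex
e_t = \sum_{i=1}^{m} w_{ti}\, I\bigl(h_t(x_i) \neq y_i\bigr), \qquad t = 1, 2, \dots, T
```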

Optionally, in step S5, the weight $\alpha_t$ that $h_t(x)$ occupies in forming the strong classifier, $t = 1, 2, \dots, T$, is calculated by the following formula:
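The formula itself is not reproduced in this text. Assuming the standard AdaBoost learner weight, which matches the adjustment coefficients $\exp(-\alpha_t)$ and $\exp(\alpha_t)$ given in the detailed description, it would read:

```latex
\alpha_t = \frac{1}{2} \ln \frac{1 - e_t}{e_t}, \qquad t = 1, 2, \dots, T
```

Under this assumption a weak learner with $e_t < 0.5$ receives a positive weight, and the weight grows as the error rate falls.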

Optionally, in step S6, introducing a cost factor and updating the weight distribution of each sample in the training sample set specifically includes:

wherein $\beta_i$ is a penalty factor obtained from the cost matrix, and $Z_t$ is a normalization factor that ensures the sample weights sum to 1, calculated by the following formula:
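Neither the update rule nor the expression for $Z_t$ is reproduced in this text. A common way of writing a cost-adjusted AdaBoost update, given here as an assumption using the symbols defined above, is:

```latex
w_{t+1,i} = \frac{w_{ti}}{Z_t} \exp\bigl(-\alpha_t\, y_i\, h_t(x_i)\, \beta_i\bigr),
\qquad
Z_t = \sum_{i=1}^{m} w_{ti} \exp\bigl(-\alpha_t\, y_i\, h_t(x_i)\, \beta_i\bigr)
```

Since $y_i h_t(x_i) = +1$ for a correctly classified sample and $-1$ for a misclassified one, $\beta_i$ scales how strongly each weight is decreased or increased, and dividing by $Z_t$ makes the updated weights sum to 1.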

Optionally, in step S7, letting $t$ take the values $1, 2, \dots, T$ in sequence, iterating until the number of iterations $T$ required by the error rate is reached, and integrating all weak learners through a combination strategy to form a strong learner specifically includes:

the strong classifier $H(x)$ is represented as follows:
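The expression is not reproduced in this text. Assuming the usual AdaBoost combination strategy, a weighted vote of the weak learners with the weights $\alpha_t$ from step S5, it would read:

```latex
H(x) = \operatorname{sign}\!\left( \sum_{t=1}^{T} \alpha_t\, h_t(x) \right)
```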

According to the specific embodiments provided by the invention, the invention discloses the following technical effects. The transformer fault diagnosis method based on cost sensitivity and ensemble learning constructs a transformer fault diagnosis model based on the AdaCost algorithm. AdaCost is an improvement on AdaBoost that modifies AdaBoost's weight update strategy: a cost factor $\beta_i$ is introduced into $D_t(x)$, and the matrix formed by the $\beta_i$ is called the cost matrix. As a result, the weight of a high-cost misclassified sample is increased substantially, while the weight of a high-cost correctly classified sample is reduced only modestly. The overall idea is that the weights of high-cost samples increase quickly and decrease slowly, which addresses the low overall precision of classifiers on unbalanced data sets. Because the AdaCost algorithm accounts for differences in misclassification cost, it handles the imbalance of transformer fault data sets well and improves both the recognition of fault samples and the overall classification accuracy.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without inventive effort.

FIG. 1 is a schematic diagram of the boosting algorithm;

FIG. 2 is a flow diagram of an AdaBoost training process;

FIG. 3 is a cost matrix composition diagram;

FIG. 4 is a diagram of the AdaCost-based transformer fault diagnosis model.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.

The AdaCost algorithm is an improvement on the AdaBoost algorithm. AdaBoost is a tree ensemble algorithm based on the boosting method whose core strategy is to iterate repeatedly, updating the sample weights and the base classifier weights. Each round learns one classifier and updates the sample weights according to its performance: the weights of correctly classified samples are reduced and the weights of misclassified samples are increased. The final model is a weighted linear combination of the models from all iterations, in which more accurate classifiers receive larger weights.

In AdaBoost, the sample weight adjustment coefficient for correctly classified samples is $\exp(-\alpha_t)$, so their weights are all reduced by the same proportion; for misclassified samples the coefficient is $\exp(\alpha_t)$, so their weights are all increased by the same proportion. Although on an unbalanced sample set AdaBoost can pay more attention to minority-class samples and to some extent avoids biasing the classifier toward the majority class, it scales weights by the same proportion and ignores cost, so the improvement in classification performance is limited. The AdaCost algorithm modifies AdaBoost's weight update strategy: the basic idea is to increase the weight of a high-cost misclassified sample substantially and to reduce the weight of a high-cost correctly classified sample only modestly. Overall, the weights of high-cost samples increase quickly and decrease slowly, which addresses the low overall precision of classifiers on unbalanced data sets.
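The contrast between the two update rules can be made concrete with a small numerical sketch. The AdaBoost factors follow the text above; the AdaCost factor assumes a cost-adjustment function often used with AdaCost, $\beta = 0.5(1 - c)$ for a correct classification and $\beta = 0.5(1 + c)$ for a misclassification, with an assumed cost $c \in [0, 1]$. The function names are hypothetical.

```python
import math

def adaboost_factor(alpha, correct):
    """AdaBoost: every sample is scaled by exp(-alpha) if correct, exp(alpha) if wrong."""
    return math.exp(-alpha) if correct else math.exp(alpha)

def adacost_factor(alpha, correct, cost):
    """Assumed AdaCost-style factor: the cost changes how far the weight moves."""
    beta = 0.5 * (1 - cost) if correct else 0.5 * (1 + cost)
    return math.exp(-alpha * beta) if correct else math.exp(alpha * beta)

if __name__ == "__main__":
    alpha = 1.0
    for cost in (0.1, 0.9):
        print(
            f"cost={cost}: misclassified x{adacost_factor(alpha, False, cost):.3f}, "
            f"correct x{adacost_factor(alpha, True, cost):.3f}"
        )
```

With these assumptions a high-cost sample gains more weight than a low-cost one when misclassified and loses less weight when classified correctly, which is the "increase quickly, decrease slowly" behaviour described above.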

Since AdaCost builds on AdaBoost, which in turn is based on the boosting method, the principle of the boosting algorithm is described first, as shown in FIG. 1. First, initial weights are distributed evenly across the samples in the initial training set, and weak learner 1 is trained; then the weight distribution of the training samples is updated based on the learning error rate, and weak learner 2 is trained; the iteration is repeated until the number of iterations $T$ required by the error rate is reached; finally, all weak learners are integrated through a combination strategy to form a strong learner.

Three elements of the Boosting algorithm:

Let the weak learner be $f_i(x, \theta_i)$ and the strong learner be $F(x)$, where $x$ is the sample feature quantity and $\theta_i$ is the learning error rate.

(1) Function model: the weak learners are superposed through a combination strategy to obtain the strong learner, denoted as formula (1).

(2) Objective function: let the loss function be $E\{F(x)\}$, and take the objective function $H(x)$ as the optimization target of the algorithm.

(3) Optimization algorithm: the error rate $\theta_i$ is optimized step by step; the optimization formula is given as formula (3).
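Formulas (1) to (3) are not reproduced in this text. One conventional way to write the three elements, offered as an assumption in the spirit of forward stagewise additive modelling, is shown below; $F_{i-1}(x)$ is a hypothetical symbol for the sum of the first $i - 1$ weak learners.

```latex
\begin{align}
F(x) &= \sum_{i=1}^{T} f_i(x, \theta_i) \tag{1} \\
H(x) &= \min_{F} E\{F(x)\} \tag{2} \\
\theta_i^{*} &= \arg\min_{\theta_i} E\{F_{i-1}(x) + f_i(x, \theta_i)\} \tag{3}
\end{align}
```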

Taking the weak learner $f_i(x, \theta_i)$ of the three elements above and selecting $E\{F(x)\}$ as the exponential loss function yields the AdaBoost algorithm. AdaBoost, short for adaptive boosting, is an adaptive iterative algorithm based on the boosting idea. It is adaptive in that the sample distribution changes according to whether each sample is classified correctly: the weights of correctly classified samples are lowered and the weights of misjudged samples are raised, so misjudged samples are more likely to be selected for the next weak classifier. The AdaBoost training process is shown in FIG. 2.

AdaCost modifies the weight update strategy of the AdaBoost algorithm by introducing a cost factor $\beta_i$ into $D_t(x)$; the matrix formed by the $\beta_i$ is called the cost matrix. As a result, the weight of a high-cost misclassified sample is increased substantially, while the weight of a high-cost correctly classified sample is reduced only modestly. The core of AdaCost lies in determining the cost matrix, which describes the cost (penalty) information of the data set to be classified and thus determines how the classifier should be trained when different classification errors incur different penalties. The cost matrix is a square matrix of order $N$, where $N$ is the number of classes in the data set to be classified; its structure is shown in FIG. 3.

Here the matrix element $c_{ij}$ represents the cost of classifying a sample whose true class is $i$ into class $j$, and row $i$ of the matrix gives the costs of misclassifying true class-$i$ samples into the other classes. An entry with $i = j$ corresponds to a correct prediction of the sample class, and an entry with $i \neq j$ corresponds to an incorrect classification result. The cost matrix is set according to domain knowledge of the classification task, and the assignment of $c_{ij}$ generally follows these principles:

(1) the cost of misclassification must be greater than the cost of correct classification;

(2) a correct prediction incurs no cost, i.e. $c_{ii} = 0$;

(3) the greater the difference in the degree of loss, the larger the difference between the values of $c_{ij}$ and $c_{ji}$.
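Principles (1) and (2) can be checked mechanically. The sketch below assumes NumPy; the helper name `check_cost_matrix` and the values in `C_EXAMPLE` are purely hypothetical and are not taken from the patent.

```python
import numpy as np

def check_cost_matrix(C):
    """Return True if C is square, has a zero diagonal (principle 2)
    and strictly positive off-diagonal costs (principle 1)."""
    C = np.asarray(C, dtype=float)
    if C.ndim != 2 or C.shape[0] != C.shape[1]:
        return False
    off_diagonal = ~np.eye(C.shape[0], dtype=bool)
    return bool(np.all(np.diag(C) == 0) and np.all(C[off_diagonal] > 0))

# Hypothetical 3-class example: row = true class, column = predicted class.
# c_12 != c_21 illustrates principle (3): asymmetric losses give asymmetric costs.
C_EXAMPLE = [
    [0.0, 1.0, 2.0],
    [5.0, 0.0, 1.0],
    [8.0, 2.0, 0.0],
]

if __name__ == "__main__":
    print(check_cost_matrix(C_EXAMPLE))
```

Principle (3) depends on domain judgement about losses, so it is illustrated by the asymmetric entries rather than checked by the function.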

The invention provides a transformer fault diagnosis method based on cost sensitivity and integrated learning, which comprises the following steps:

S2, establishing a transformer fault diagnosis model based on the AdaCost algorithm; let the training sample set be $X = \{(x_1, y_1), (x_2, y_2), \dots, (x_m, y_m)\}$, where $x_i$ is the sample feature vector formed from the gases dissolved in oil and $y_i$ is the fault type label, $x_i \in X$, $y_i \in Y = \{+1, -1\}$; let the number of iterations be $T$, $t = 1, 2, \dots, T$; let the sample weight distribution of the t-th iteration be $D_t = (w_{t1}, w_{t2}, \dots, w_{tm})$, $i = 1, 2, \dots, m$, with the initial weights distributed evenly, $w_{1i} = 1/m$;

let the weak learner formed by the t-th iteration be $h_t(x)$;

In step S4, the learning error rate $e_t$ of $h_t(x)$, $t = 1, 2, \dots, T$, is calculated by the following formula:

In step S5, the weight $\alpha_t$ that $h_t(x)$ occupies in forming the strong classifier, $t = 1, 2, \dots, T$, is calculated by the following formula:

In step S6, introducing a cost factor and updating the weight distribution of each sample in the training sample set specifically includes:

In step S7, letting $t$ take the values $1, 2, \dots, T$ in sequence, iterating until the number of iterations $T$ required by the error rate is reached, and integrating all weak learners through a combination strategy to form a strong learner specifically includes:

the strong classifier $H(x)$ is represented as follows:

and (3) verifying the feasibility and effectiveness of AdaCost applied to transformer fault diagnosis by taking the data of dissolved gas in oil of the power transformer as a sample set, and taking macro F1 as an evaluation index of a model and marking as alpha macro-F1. According to the IEC 60599 standard, transformer faults are classified into 6 types of partial discharge, low-energy discharge, high-energy discharge, low-temperature overheat, medium-temperature overheat, and high-temperature overheat. Considering that the transformer oil temperature is increased due to the long-term discharge of the transformer, and a thermal fault is generated, the supplement discharge and overheating are the 7 th fault type.

As described above, transformer fault diagnosis is a multi-class problem, so a one-versus-one strategy is adopted: for the 8 transformer states (normal, partial discharge, low-energy discharge, high-energy discharge, low-temperature overheating, medium-temperature overheating, high-temperature overheating, and discharge combined with overheating), 28 pairwise classifiers are constructed and trained. During testing, each sample is input to all of the classifiers and the fault type is determined by voting. The AdaCost-based transformer fault diagnosis model is shown in FIG. 4.
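The figure of 28 classifiers follows from pairing the 8 states: $8 \times 7 / 2 = 28$. The sketch below enumerates the pairs and shows a simple majority vote; the helper `vote` and the tie-breaking behaviour (the first winner encountered among tied states) are assumptions for illustration, since the text does not say how ties are resolved.

```python
from collections import Counter
from itertools import combinations

STATES = [
    "normal",
    "partial discharge",
    "low-energy discharge",
    "high-energy discharge",
    "low-temperature overheating",
    "medium-temperature overheating",
    "high-temperature overheating",
    "discharge combined with overheating",
]

# One binary classifier per unordered pair of states (one-versus-one).
PAIRS = list(combinations(STATES, 2))

def vote(pairwise_winners):
    """Return the state that wins the most pairwise decisions.

    pairwise_winners holds one winning state name per pairwise classifier.
    """
    return Counter(pairwise_winners).most_common(1)[0][0]

if __name__ == "__main__":
    print(len(PAIRS))
```

Each pairwise classifier only needs the labels +1 and -1, which is why a binary AdaCost learner can be reused for every pair.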

A cost matrix formed by expert scoring is highly subjective. The role of $\beta_i$ is to raise the weight of high-cost misclassified samples substantially and to reduce the weight of high-cost correctly classified samples only modestly, while the confusion matrix reflects the number of misjudged samples (class pairs with more misjudged samples are treated as carrying higher weight). The cost matrix N is therefore derived from the confusion matrix obtained by decision tree training.
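The text does not give the rule that turns the confusion matrix into a cost matrix. One possible reading, offered purely as an assumption, is to zero the diagonal and scale the misclassification counts so that the most frequently confused class pair has cost 1. The function name `cost_from_confusion` is hypothetical, and NumPy is assumed.

```python
import numpy as np

def cost_from_confusion(confusion):
    """Assumed derivation: off-diagonal counts scaled to [0, 1], diagonal set to 0.

    confusion[i][j] is the number of true class-i samples predicted as class j.
    """
    cost = np.asarray(confusion, dtype=float).copy()
    np.fill_diagonal(cost, 0.0)           # correct predictions carry no cost
    peak = cost.max()
    return cost / peak if peak > 0 else cost

if __name__ == "__main__":
    # Hypothetical confusion matrix from a baseline decision tree.
    confusion = [[50, 2, 0], [5, 10, 1], [4, 0, 6]]
    print(cost_from_confusion(confusion))
```

Other normalisations, for example dividing each row by its class size, would be equally compatible with the description.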

An evaluation index suitable for classifying unbalanced data is selected: the overall classification accuracy macro F1, denoted $\alpha_{\text{macro-F1}}$, is taken as the evaluation index of the classifier. Comparing the transformer fault diagnosis results of the decision tree model and the AdaCost model, the $\alpha_{\text{macro-F1}}$ of the AdaCost model is 12.1 percentage points higher than that of the decision tree model ($\alpha_{\text{macro-F1}} = 0.7152$), a relative improvement of 16.91%. This shows that the method handles the imbalance of transformer fault data sets well and improves both the recognition of fault samples and the overall classification accuracy. The AdaCost algorithm also accounts for differences in misclassification cost, which matches practical engineering needs.
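Macro F1 is the unweighted mean of the per-class F1 scores, so every class counts equally regardless of how many samples it has. A minimal sketch of the computation from a confusion matrix follows; it assumes NumPy, rows as true classes and columns as predicted classes, and the function name `macro_f1` is hypothetical.

```python
import numpy as np

def macro_f1(confusion):
    """Unweighted mean of per-class F1; classes with no support or predictions score 0."""
    confusion = np.asarray(confusion, dtype=float)
    tp = np.diag(confusion)
    predicted = confusion.sum(axis=0)     # column sums: samples predicted as each class
    actual = confusion.sum(axis=1)        # row sums: samples truly in each class
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    return float(f1.mean())

if __name__ == "__main__":
    print(macro_f1([[1, 1], [0, 2]]))
```

Because each class contributes equally to the mean, poor performance on a rare fault class lowers the score as much as poor performance on the normal class, which is why the index suits unbalanced data.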

The transformer fault diagnosis method based on cost sensitivity and ensemble learning provided by the invention constructs a transformer fault diagnosis model based on the AdaCost algorithm. AdaCost is an improvement on AdaBoost that modifies AdaBoost's weight update strategy: a cost factor $\beta_i$ is introduced into $D_t(x)$, and the matrix formed by the $\beta_i$ is called the cost matrix. As a result, the weight of a high-cost misclassified sample is increased substantially, while the weight of a high-cost correctly classified sample is reduced only modestly. The overall idea is that the weights of high-cost samples increase quickly and decrease slowly, which addresses the low overall precision of classifiers on unbalanced data sets. Because the AdaCost algorithm accounts for differences in misclassification cost, it handles the imbalance of transformer fault data sets well and improves both the recognition of fault samples and the overall classification accuracy. The AdaCost algorithm is suitable not only for fault diagnosis but also for other classification applications with unbalanced data, and therefore has strong generality and generalization ability.

The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.

The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims

1. A transformer fault diagnosis method based on cost sensitivity and integrated learning is characterized by comprising the following steps:

let the weak learner formed by the t-th iteration be h_t(x)；

2. The transformer fault diagnosis method based on cost sensitivity and ensemble learning according to claim 1, wherein in step S4 the learning error rate $e_t$ of $h_t(x)$, $t = 1, 2, \dots, T$, is calculated by the following formula:

3. The transformer fault diagnosis method based on cost sensitivity and ensemble learning according to claim 1, wherein in step S5 the weight $\alpha_t$ that $h_t(x)$ occupies in forming the strong classifier, $t = 1, 2, \dots, T$, is calculated by the following formula:

4. The transformer fault diagnosis method based on cost sensitivity and ensemble learning according to claim 1, wherein step S6, introducing cost factors and updating the weight distribution of each sample in the training sample set, specifically includes:

5. The transformer fault diagnosis method based on cost sensitivity and ensemble learning according to claim 1, wherein step S7, in which $t$ takes the values $1, 2, \dots, T$ in sequence, iteration is repeated until the number of iterations $T$ required by the error rate is reached, and all weak learners are integrated through a combination strategy to form a strong learner, specifically includes:

the strong classifier $H(x)$ is represented as follows: