CN117454140A

CN117454140A - Transformer winding fault classification method, device, equipment and medium

Info

Publication number: CN117454140A
Application number: CN202311488336.2A
Authority: CN
Inventors: 王松; 谢飞; 邱晟璇; 王国昊; 化小蕊
Original assignee: Northwest A&F University
Current assignee: Northwest A&F University
Priority date: 2023-11-09
Filing date: 2023-11-09
Publication date: 2024-01-26

Abstract

The invention discloses a transformer winding fault classification method, device, equipment and medium, and relates to the technical field of power transformer safety control and management. First, deviation features of sample data under various fault types are extracted by a plurality of data feature index extraction methods to obtain a plurality of candidate deviation feature sets. The feature importance of each deviation feature is then evaluated by a random forest, and parameter optimization is carried out according to the feature importance to determine the preferred deviation features and the preferred number of adjacent nodes used by the nearest-neighbor node (K-nearest neighbor, KNN) algorithm, so that transformer winding faults are classified by the optimized nearest-neighbor node algorithm. By extracting multiple deviation features from the sample data, evaluating their importance, and applying the deviation features in order of importance over multiple rounds of transformer fault classification, the parameters are optimized, and the optimized deviation features accurately reflect the difference between frequency response curves before and after winding deformation, thereby improving the accuracy of winding fault classification.

Description

Transformer winding fault classification method, device, equipment and medium

Technical Field

The application relates to the technical field of power transformer safety control management, in particular to a transformer winding fault classification method, device, equipment and medium.

Background

In general, the power transformer is one of the vital devices in a power system: it acts as a hub connecting power grids of different voltage classes and is responsible for efficient power exchange. The winding is one of the key components of the transformer, and its quality and reliability are important for ensuring safe and stable operation of the transformer. However, in actual operation, factors such as short-circuit currents inevitably cause varying degrees of deformation of the internal windings. Statistical analysis shows that winding deformation is one of the most common causes of transformer failure and shutdown.

The frequency response analysis (Frequency response analysis, FRA) method is a winding deformation diagnosis method with high sensitivity, good repeatability and strong anti-interference capability. The method judges whether the winding has deformation problem or not by comparing and analyzing the deviation between frequency response curves of the same transformer winding before and after deformation. Therefore, accurately interpreting the deviation between these two curves is critical for winding deformation diagnostics using frequency response analysis.

In the prior art, with the rapid development of artificial intelligence technology, transformer winding fault diagnosis based on machine learning and frequency response analysis has been widely applied. By building machine learning models such as a support vector machine (Support Vector Machine, SVM), K-nearest neighbor (K-Nearest Neighbor, KNN) or an artificial neural network (Artificial Neural Networks, ANNs) and training them on data, the ability to recognize winding deformation automatically can be obtained. Such methods need to analyze accurately the difference between frequency response curves before and after winding deformation, so the feature extraction method is important and plays a key role in constructing a suitable machine learning model.

However, there is currently no unified standard for evaluating the difference between two curves. Existing algorithms for detecting winding deformation faults often suffer from inaccurate application of feature indexes and poor parameter selection, and cannot accurately reflect the difference between frequency response curves before and after winding deformation, so the accuracy of winding fault classification is low.

Disclosure of Invention

Based on the above, it is necessary to provide a method, an apparatus, a device and a medium for classifying faults of a transformer winding.

The technical scheme adopted in the specification is as follows:

the specification provides a transformer winding fault classification method, which comprises the following steps:

constructing sample data of a frequency response curve corresponding to the normal state of the transformer winding and various fault types, and marking the real fault types corresponding to the sample data;

according to a plurality of data characteristic index extraction methods, extracting a plurality of deviation characteristics between frequency response curves of a transformer winding in a normal state and various fault types in sample data to obtain a plurality of deviation characteristic sets to be selected;

carrying out feature importance assessment on each deviation feature in each deviation feature set to be selected through a random forest, and carrying out descending order arrangement on each deviation feature set to be selected according to the feature importance;

according to the ordering of each to-be-selected deviation feature set, inputting the deviation feature in at least one to-be-selected deviation feature set into a nearest node algorithm to classify the faults of the transformer winding, and obtaining the predicted fault types corresponding to each sample data;

and determining the optimal deviation characteristics and the optimal number of adjacent nodes adopted by the nearest-neighbor node algorithm by taking the minimum difference between the predicted fault type and the actual fault type of each sample data as an optimization target so as to classify the faults of the transformer windings according to the optimized nearest-neighbor node algorithm.

Optionally, the constructing sample data of the frequency response curves corresponding to the normal state of the transformer winding and various fault types specifically includes:

according to the electrical characteristics of the transformer winding, the number of spacers and/or the number of faulty discs at different positions of the transformer are adjusted to simulate transformer winding faults of different degrees under different fault types;

based on the frequency response test, sample data of a frequency response curve corresponding to the normal state of the transformer winding and various fault types are obtained;

the different fault types comprise axial displacement, forced buckling of winding discs, disc space variation and inter-disc short circuit.

Optionally, the method for extracting the multiple data characteristic indexes specifically includes:

Euclidean distance, Lin's concordance coefficient, sum of errors, sum squared ratio error, correlation coefficient, sum squared max-min ratio error, absolute sum of logarithmic error, cross-correlation factor, root mean square error, comparative standard deviation, least squares error, expectation, min-max ratio, standard deviation, sum squared error, spectrum deviation, absolute average difference, random spectrum deviation, normalized correlation coefficient, and covariance.

Optionally, evaluating the feature importance of each deviation feature in each candidate deviation feature set through a random forest specifically includes:

Taking a plurality of deviation features corresponding to each sample data as sample features thereof, constructing a plurality of decision trees according to the sample features of each sample data, and classifying faults of the transformer winding to form a random forest;

for each decision tree, calculating a first prediction error of the decision tree according to the out-of-bag data constructing the decision tree and the corresponding labels thereof;

selecting the characteristics to be evaluated from all deviation characteristics of the out-of-bag data of the decision tree one by one, randomly perturbing the characteristics to be evaluated, and calculating a second prediction error of the decision tree after the characteristics to be evaluated are perturbed through the out-of-bag data after the characteristics to be evaluated are perturbed and the corresponding labels;

according to the difference between the first prediction error and the second prediction error, calculating the average precision decline value of the decision tree after disturbance of the feature to be evaluated by the following formula:

$$\mathrm{MDA}_j(X) = \left|\mathrm{OOBerror}_j(X) - \mathrm{OOBerror\_new}_j(X)\right|;$$

the average, over all decision trees, of the mean decrease in accuracy after perturbation of the feature to be evaluated is calculated as the importance of the feature to be evaluated by the following formula:

$$\mathrm{MDA}_{avg}(X) = \frac{1}{N}\sum_{j=1}^{N}\mathrm{MDA}_j(X);$$

wherein the feature to be evaluated is the deviation feature type corresponding to the candidate deviation feature set X, j is the index of the decision tree, N is the number of decision trees in the random forest, $\mathrm{MDA}_j(X)$ is the mean decrease in accuracy of the j-th decision tree after the deviation feature X of its out-of-bag data is perturbed, $\mathrm{OOBerror}_j(X)$ is the first prediction error of the j-th decision tree, $\mathrm{OOBerror\_new}_j(X)$ is the second prediction error of the j-th decision tree after the deviation feature X of its out-of-bag data is perturbed, and $\mathrm{MDA}_{avg}(X)$ is the average of the mean decrease in accuracy over all decision trees after perturbation of the feature X to be evaluated.

Optionally, inputting the deviation features in the at least one candidate deviation feature set into the nearest-neighbor node algorithm to classify faults of the transformer winding, so as to obtain the predicted fault type corresponding to each sample data, specifically includes:

taking at least one deviation feature corresponding to each sample data as a sample feature thereof, and inputting the sample feature corresponding to each sample data into a nearest neighbor node algorithm;

according to sample characteristics of sample data to be classified and preset classified sample data, calculating Euclidean distance between the sample data to be classified and the classified sample data, and carrying out ascending sorting on the classified sample data according to the Euclidean distance;

selecting neighbor sample data of a preset number of neighbor nodes of sample data to be classified from the classified sample data according to the sequence from front to back;

according to Euclidean distance between the sample data to be classified and the neighbor sample data, calculating the weight of each neighbor sample data by an inverse distance weighting method;

And respectively carrying out weighted summation on the fault types of the neighbor sample data according to the weights of the neighbor sample data, and calculating the predicted fault types corresponding to the sample data to be classified.

Optionally, the determining the preferred deviation feature and the number of preferred neighboring nodes adopted by the nearest neighboring node algorithm with the minimum difference between the predicted fault type and the actual fault type of each sample data as an optimization target specifically includes:

according to the sequence of each deviation feature set to be selected, taking the deviation features of different numbers corresponding to each sample data as sample features;

determining the predicted fault type of the sample data by using a nearest neighbor node algorithm and sample features composed of deviation features under different numbers and different preset adjacent node numbers;

the preferred deviation feature and the preferred number of adjacent nodes are determined with the minimum difference between the predicted fault type and the true fault type for each sample data as an optimization objective.
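The parameter search described above can be sketched as a small grid search. This is a minimal illustration under stated assumptions, not the claimed implementation: the helper names (`idw_knn`, `loo_accuracy`, `grid_search`), the leave-one-out evaluation scheme and the sample values are all hypothetical, and the feature columns are assumed to be already sorted in descending order of importance.

```python
import math
from collections import defaultdict

def idw_knn(train, query, k):
    """Inverse-distance-weighted vote among the k nearest labelled samples."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    votes = defaultdict(float)
    for feats, label in nearest:
        # Small constant avoids division by zero (assumed safeguard).
        votes[label] += 1.0 / (math.dist(feats, query) + 1e-12)
    return max(votes, key=votes.get)

def loo_accuracy(samples, n_features, k):
    """Leave-one-out accuracy using only the first n_features columns."""
    cut = [(feats[:n_features], label) for feats, label in samples]
    hits = 0
    for i, (feats, label) in enumerate(cut):
        rest = cut[:i] + cut[i + 1:]
        hits += idw_knn(rest, feats, k) == label
    return hits / len(cut)

def grid_search(samples, feature_counts, ks):
    """Try every (number of features, number of neighbours) pair; keep the best."""
    scored = {(n, k): loo_accuracy(samples, n, k)
              for n in feature_counts for k in ks}
    best = max(scored, key=scored.get)
    return best, scored[best]

# Hypothetical samples: (deviation features ranked by importance, fault label).
samples = [((0.10, 0.9), "AD"), ((0.15, 0.1), "AD"), ((0.20, 0.5), "AD"),
           ((0.80, 0.2), "SC"), ((0.85, 0.8), "SC"), ((0.90, 0.4), "SC")]
(best_n, best_k), best_acc = grid_search(samples, feature_counts=[1, 2], ks=[1, 3])
```

Maximizing accuracy here plays the role of minimizing the difference between predicted and true fault types; any other mismatch measure could be substituted in `loo_accuracy`.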

Optionally, the method further comprises:

normalizing the plurality of candidate deviation feature sets by the following formula:

$$x^{*} = \frac{x - \min(x)}{\max(x) - \min(x)}$$

wherein x is the deviation feature to be normalized, max(x) is the maximum deviation feature value in the deviation feature set to which the deviation feature x belongs, min(x) is the minimum deviation feature value in the deviation feature set to which the deviation feature x belongs, and $x^{*}$ is the normalized deviation feature.

The specification provides a transformer winding fault classification device, comprising:

the construction module is used for constructing sample data of frequency response curves corresponding to the normal state of the transformer winding and various fault types, and marking the real fault types corresponding to the sample data;

the extraction module is used for extracting various deviation characteristics between frequency response curves of the transformer winding in the sample data in a normal state and various fault types according to various data characteristic index extraction methods to obtain a plurality of deviation characteristic sets to be selected;

the evaluation module is used for evaluating the feature importance of each deviation feature in each deviation feature set to be selected through a random forest, and arranging the deviation feature sets to be selected in a descending order according to the feature importance;

the prediction module is used for inputting the characteristics in at least one type of deviation characteristic set to be selected into a nearest node algorithm to classify faults of the transformer winding according to the sequence of the deviation characteristic sets to be selected, so as to obtain the predicted fault types corresponding to the data of each sample;

and the parameter optimization module is used for determining the optimal deviation characteristic and the optimal adjacent node number adopted by the nearest-neighbor node algorithm by taking the minimum difference between the predicted fault type and the actual fault type of each sample data as an optimization target so as to classify the faults of the transformer winding according to the optimized nearest-neighbor node algorithm.

The present specification provides a computer readable storage medium storing a computer program which when executed by a processor implements the above-described transformer winding fault classification method.

The present specification provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the above-described transformer winding fault classification method when executing the program.

At least one of the above technical solutions adopted in this specification can achieve the following beneficial effects:

firstly, obtaining frequency response curves of transformer windings in various states as sample data and marking the sample data, then extracting deviation features of the sample data in various fault types through a plurality of data feature index extraction methods to obtain a plurality of to-be-selected deviation feature sets, then evaluating the feature importance of the various deviation features through random forests, inputting at least one deviation feature into a nearest node algorithm for transformer winding fault classification according to the importance of the deviation feature to obtain predicted fault types corresponding to the sample data, finally determining the optimal deviation features and the optimal number of adjacent nodes adopted by the nearest node algorithm by taking the minimum difference between the predicted fault types and the actual fault types of the sample data as an optimization target, and carrying out transformer winding fault classification according to the optimized nearest node algorithm.

According to the invention, the deviation features of sample data are extracted by utilizing various data feature indexes, then the importance of the deviation features is evaluated, and finally one or more deviation features are applied to the transformer fault classification of multiple rounds according to the importance of the deviation features, so that the best-performing deviation feature combination is determined, parameter optimization is realized, the difference between frequency response curves before and after winding deformation is accurately reflected by the optimal deviation features, and the accuracy of winding fault classification is improved.

Drawings

The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:

fig. 1 is a schematic flow chart of a method for classifying faults of a transformer winding provided in the present specification;

fig. 2 is a schematic diagram of a fault test of a transformer winding provided in the present specification;

FIG. 3 is a schematic diagram of evaluating the importance of various deviation features of a fault classification provided herein;

FIG. 4 is a schematic diagram of evaluation of importance of various deviation features for a DSV fault level classification provided herein;

Fig. 5 is an integrated flow diagram of a transformer winding fault classification method provided in the present specification;

fig. 6 is a schematic diagram of a transformer winding fault classification device provided in the present specification;

fig. 7 is a schematic diagram of a computer device for implementing a method for classifying faults of a transformer winding according to the present disclosure.

Detailed Description

For the purposes of making the objects, technical solutions and advantages of the present specification more apparent, the technical solutions of the present application will be clearly and completely described below with reference to specific embodiments of the present specification and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the disclosure, are intended to be within the scope of the present application based on the embodiments described herein.

Currently, in the conventional KNN algorithm, the voting weight of each nearest neighbor is equal. This means that, in determining the final classification, neighbors close to the sample to be classified have the same influence as neighbors farther away. However, the numbers of winding deformation samples of different types differ, so the data are imbalanced. If neighbors at different distances from the sample to be classified are treated as equally important, the classification accuracy of the model may be reduced. The invention therefore employs an inverse distance weighted K-nearest neighbor (Inverse Distance Weighting KNN, IDW-KNN) algorithm. The algorithm uses the inverse of the distance to assign different weights to the sample points around the unknown sample: sample points closer to the unknown sample receive higher weights, and sample points farther away receive lower weights. By summing the weights of the nearest neighbors per class and classifying the sample by weighted voting, the error of the model can be reduced and the classification accuracy further improved. Introducing this method effectively mitigates the influence of imbalanced sample data on the model.
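The inverse-distance-weighted voting described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `idw_knn_predict`, the small constant `eps` that guards against a zero distance, and the two-feature sample values are hypothetical and are not taken from the specification.

```python
import math
from collections import defaultdict

def idw_knn_predict(train_X, train_y, query, k, eps=1e-12):
    """Classify `query` by inverse-distance-weighted voting among its k nearest neighbours."""
    # Euclidean distance from the query to every labelled sample.
    dists = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Ascending sort by distance, then keep the k closest samples.
    neighbours = sorted(dists, key=lambda d: d[0])[:k]
    # Each neighbour votes for its label with weight 1/d (eps is an assumed safeguard).
    votes = defaultdict(float)
    for d, label in neighbours:
        votes[label] += 1.0 / (d + eps)
    # The label with the largest summed weight wins.
    return max(votes, key=votes.get)

# Hypothetical samples: two "AD" samples far from the query, one "SC" sample close to it.
train_X = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0)]
train_y = ["AD", "AD", "SC"]
prediction = idw_knn_predict(train_X, train_y, query=(0.9, 0.9), k=3)
```

With k = 3 an unweighted majority vote would return "AD" (two votes against one), whereas the weighted vote favours the single, much closer "SC" neighbour, which is the behaviour the paragraph above motivates for imbalanced data.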

In order to accurately reflect the difference between frequency response curves before and after winding deformation and to extract effective winding fault features, and in view of the inaccurate application of feature indexes and poor parameter selection that often affect existing intelligent algorithms for detecting winding deformation faults, the invention provides a transformer winding deformation diagnosis method based on random forest feature optimization and an improved K-nearest neighbor algorithm.

The following describes in detail the technical solutions provided by the embodiments of the present application with reference to the accompanying drawings.

Fig. 1 is a schematic flow chart of a transformer winding fault classification method in the present specification, which specifically includes the following steps:

s101: sample data of frequency response curves corresponding to normal states and various fault types of the transformer winding are constructed, and real fault types corresponding to the sample data are marked.

Generally, when the server of the service platform performs fault classification on the transformer winding, the classification model can be trained through sample data, and then a fault classification task is performed through the trained classification model.

Therefore, the server of the service platform can firstly construct sample data of frequency response curves corresponding to the normal state of the transformer winding and various fault types, and marks the sample data.

The specific fault types can be determined according to specific needs, and the specification does not limit them. For example, one or more of axial displacement, forced buckling of winding discs, disc space variation, and inter-disc short circuit may be included.

In particular, when constructing sample data, in one or more embodiments of the present disclosure, the server of the service platform may, according to the electrical characteristics of the transformer winding, adjust the number of spacers and/or the number of faulty discs at different positions of the transformer to simulate transformer winding faults of different degrees under different fault types. Sample data of the frequency response curves corresponding to the normal state of the transformer winding and the various fault types are then obtained through the frequency response test. Fig. 2 is a schematic diagram of a transformer winding fault test in the present specification; as can be seen from Fig. 2, a tester can adjust each part of the transformer to simulate transformer winding faults of different degrees under different fault types.

The frequency response method described herein can be implemented by injecting an excitation signal of a low voltage sweep frequency into one terminal of the winding and measuring a corresponding response signal from the other terminal of the winding. Then, by calculating the magnitude of the ratio of the response voltage to the input voltage, the magnitude of the frequency response can be obtained, specifically by the following formula:

where A(f) is the amplitude of the FRA at frequency f, and U₀(f) and U₁(f) are the output voltage and the input voltage of the winding at frequency f, respectively.
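A minimal sketch of the amplitude calculation, assuming the decibel convention A(f) = 20·log10(|U₀(f)| / |U₁(f)|) that is common in frequency response analysis. The scaling factor is an assumption, since the text only specifies the magnitude of the ratio of the response voltage to the input voltage; the function name `fra_amplitude_db` and the sweep values are hypothetical.

```python
import math

def fra_amplitude_db(u_out, u_in):
    """Amplitude in dB of the output/input voltage ratio at one frequency.

    The 20*log10 scaling is an assumed convention, not taken from the specification.
    """
    return 20.0 * math.log10(abs(u_out) / abs(u_in))

# Hypothetical sweep: (frequency in Hz, output voltage, input voltage) as complex phasors.
sweep = [(1e3, 0.5 + 0.0j, 1.0 + 0.0j), (1e4, 0.1j, 1.0 + 0.0j)]

# One (frequency, amplitude) point per swept frequency forms the frequency response curve.
curve = [(f, fra_amplitude_db(u_o, u_i)) for f, u_o, u_i in sweep]
```

Using `abs` on complex phasors keeps only the magnitude of the ratio, which is what the amplitude curve requires; phase information is discarded.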

For example, the server may obtain 63 sets of frequency response data through the frequency response test, including frequency response data under axial displacement (Axial Displacement, AD), forced buckling of winding discs (Forced Buckling, FB), disc space variation (Disc Space Variation, DSV) and inter-disc short circuit (Short Circuit, SC), as well as frequency response data of healthy windings. Four different fault levels may be set for each fault type, for example a level-1, level-2, level-3 and level-4 axial displacement fault.

After obtaining the sample data of the frequency response curves of the normal state and various fault types, the server can label each sample data, and the labels can identify the fault types and the fault degrees corresponding to the sample data. For example, four fault types may be identified by AD, FB, DSV, SC, respectively, with 1, 2, 3, 4 identifying four degrees of fault for different fault types, respectively.

The server mentioned in the present specification may be a server provided on a service platform, or a device such as a desktop, a notebook, or the like capable of executing the aspects of the present specification. For convenience of explanation, only the server is used as the execution subject.

S102: and extracting various deviation features between frequency response curves of the transformer winding in a normal state and various fault types in the sample data according to various data feature index extraction methods to obtain a plurality of deviation feature sets to be selected.

After the frequency response curves of the transformer windings under various conditions are obtained, the server can conduct feature extraction on sample data according to various data feature index extraction methods.

The method for extracting the various data characteristic indexes can comprise the following steps:

Euclidean distance (ED)

Lin's concordance coefficient (LCC)

sum of errors (SE)

sum squared ratio error (SSRE)

correlation coefficient (CC)

sum squared max-min ratio error (SSMMRE)

absolute sum of logarithmic error (ASLE)

cross-correlation factor (CCF)

root mean square error (RMSE)

comparative standard deviation (CSD)

least squares error (LSE)

expectation (E)

min-max ratio (MM)

standard deviation (SD)

sum squared error (SSE)

spectrum deviation (σ)

absolute average difference (DABS)

random spectrum deviation (σs)

normalized correlation coefficient (ρ)

covariance (COVAR)

Taking the Euclidean distance as an example, the server can calculate the Euclidean distance between pairs of sampling points on the frequency response curve of the transformer winding in the normal state and on the frequency response curve under each fault type in the sample data, and use it as a deviation feature. For instance, the Euclidean distance between the sampling point pairs on the normal-state curve and on the curve under an axial displacement fault is obtained as a deviation feature; further, the Euclidean distance between the sampling point pairs on the normal-state curve and on the curve under a level-1 axial displacement fault is obtained as the deviation feature corresponding to that fault degree. Other fault degrees and other fault types are handled in the same way and are not described in detail here.
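As an illustration of how deviation features between a normal-state curve and a fault curve might be computed, the sketch below evaluates three of the listed indexes using their common textbook definitions. These definitions (in particular the mean-centred Pearson form of the correlation coefficient), the function names and the curve values are assumptions for illustration and may differ from the formulas used in the specification.

```python
import math

def euclidean_distance(ref, meas):
    """ED: square root of the summed squared point-wise differences."""
    return math.sqrt(sum((r - m) ** 2 for r, m in zip(ref, meas)))

def rmse(ref, meas):
    """RMSE: root of the mean squared point-wise difference."""
    return math.sqrt(sum((r - m) ** 2 for r, m in zip(ref, meas)) / len(ref))

def correlation_coefficient(ref, meas):
    """CC in the Pearson (mean-centred) form; an assumed definition."""
    n = len(ref)
    mean_r, mean_m = sum(ref) / n, sum(meas) / n
    cov = sum((r - mean_r) * (m - mean_m) for r, m in zip(ref, meas))
    var_r = sum((r - mean_r) ** 2 for r in ref)
    var_m = sum((m - mean_m) ** 2 for m in meas)
    return cov / math.sqrt(var_r * var_m)

# Hypothetical amplitude curves (dB) sampled at the same frequencies.
healthy = [-10.0, -20.0, -30.0, -25.0]
faulty = [-11.0, -22.0, -29.0, -27.0]

# One deviation feature per index for this (healthy, faulty) curve pair.
features = {
    "ED": euclidean_distance(healthy, faulty),
    "RMSE": rmse(healthy, faulty),
    "CC": correlation_coefficient(healthy, faulty),
}
```

Repeating this for every fault sample yields one value per index per sample, that is, one candidate deviation feature set per index.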

The server can obtain one candidate deviation feature set composed of one kind of deviation feature from the Euclidean distance, and another candidate deviation feature set composed of another kind of deviation feature from Lin's concordance coefficient. Deviation features can likewise be extracted by the other data feature index extraction methods, so that a plurality of candidate deviation feature sets composed of the various deviation features are obtained.

For example, the server may obtain multiple candidate deviation feature sets {X, Y, …}, where X is the candidate deviation feature set corresponding to the Euclidean distance, $X = \{x_1, x_2, \ldots, x_n\}$, $x_1$ is the Euclidean distance between the frequency response curve of the first sample data and the corresponding normal-state frequency response curve, and $x_2$ to $x_n$ are the Euclidean distances corresponding to the other sample data. Y is the candidate deviation feature set corresponding to Lin's concordance coefficient and is constructed in the same way as X, which is not repeated here.

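Further, in one or more embodiments of the present disclosure, the server may also normalize the candidate deviation feature sets. Specifically, the server may normalize each candidate deviation feature set by the following formula:

$$x^{*} = \frac{x - \min(x)}{\max(x) - \min(x)}$$

where x is the deviation feature to be normalized, max(x) and min(x) are the maximum and minimum deviation feature values in the candidate deviation feature set to which x belongs, and $x^{*}$ is the normalized deviation feature.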

The influence of noise and outliers in the sample data on subsequent data processing can be further reduced by normalization.
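A minimal sketch of min-max normalization applied to one candidate deviation feature set, consistent with the definitions of max(x), min(x) and the normalized feature given in this specification. The function name, the handling of a constant set and the example values are assumptions for illustration.

```python
def min_max_normalize(values):
    """Scale every value of one deviation feature set into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant set carries no spread; mapping it to zeros is an assumed convention.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical Euclidean-distance feature set, one value per sample.
ed_set = [3.2, 0.8, 5.6, 2.0]
ed_norm = min_max_normalize(ed_set)
```

Each candidate deviation feature set is normalized separately, so indexes with very different numeric ranges contribute on a comparable scale to the distance computed by the nearest-neighbor node algorithm.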

Table 1. Schematic table of deviation features

Table 1 is a schematic table of deviation features in the present specification. Corresponding to the four fault types in step S101 and the four fault levels under each fault type, Table 1 exemplarily shows the values of the deviation features correlation coefficient (CC), Euclidean distance (ED), absolute average difference (DABS), comparative standard deviation (CSD), sum squared ratio error (SSRE) and sum squared max-min ratio error (SSMMRE), each computed against the frequency response curve of the transformer winding in the normal state.

S103: and evaluating the feature importance of each deviation feature in each deviation feature set to be selected through a random forest, and arranging the deviation feature sets to be selected in a descending order according to the feature importance.

After obtaining the plurality of candidate deviation feature sets, the server can use the deviation feature in each candidate deviation feature set corresponding to each sample data as the sample feature corresponding to the sample data.

Continuing with the example in step S102, the server may take $\{x_1, y_1, \ldots\}$, drawn from the candidate deviation feature sets, as the sample features of the first sample data, and likewise for the other sample data. The feature importance of each deviation feature in the sample features is then evaluated by a random forest (Random Forest, RF).

Specifically, in one or more embodiments of the present disclosure, a server may first construct a plurality of decision trees to classify faults of a transformer winding according to sample characteristics of each sample data to form a random forest.

The server may then calculate, for each decision tree, a first prediction error (OOBerror) of that decision tree from its out-of-bag (Out Of Bag, OOB) data, that is, the samples not drawn when the tree was built, and the corresponding labels.

And then the server can select the characteristics to be evaluated from the deviation characteristics of the out-of-bag data of the decision tree one by one, randomly perturb the characteristics to be evaluated, and calculate a second prediction error (OOBerror_new) of the decision tree after the characteristics to be evaluated are perturbed through the out-of-bag data after the characteristics to be evaluated are perturbed and the corresponding labels of the out-of-bag data.

The server may then calculate the mean decrease in accuracy (Mean Decrease Accuracy, MDA) of the decision tree after perturbation of the feature to be evaluated, based on the difference between the first prediction error and the second prediction error, by the following formula:

$$\mathrm{MDA}_j(X) = \left|\mathrm{OOBerror}_j(X) - \mathrm{OOBerror\_new}_j(X)\right|$$

where the feature to be evaluated is the deviation feature type corresponding to the candidate deviation feature set X, j is the index of the decision tree, $\mathrm{MDA}_j(X)$ is the mean decrease in accuracy of the j-th decision tree after the deviation feature X of its out-of-bag data is perturbed, $\mathrm{OOBerror}_j(X)$ is the first prediction error of the j-th decision tree, and $\mathrm{OOBerror\_new}_j(X)$ is the second prediction error of the j-th decision tree after the deviation feature X of its out-of-bag data is perturbed.

Then, the mean of the MDA values of all decision trees after the feature to be evaluated is perturbed is calculated by the following formula and taken as the importance of the feature to be evaluated:

$$\mathrm{MDA\_avg}(X) = \frac{1}{N}\sum_{j=1}^{N}\mathrm{MDA}_j(X)$$

where $\mathrm{MDA\_avg}(X)$ is the mean of the MDA values of the decision trees after the feature $X$ to be evaluated is perturbed, that is, the importance of the feature to be evaluated, and $N$ is the number of decision trees in the random forest.
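The per-tree perturbation and averaging described above can be sketched as follows. This is a minimal illustration, not the patented implementation: a fitted decision tree is assumed to be any callable that maps a feature row to a label, and the names `oob_error`, `mda_for_tree`, and `mda_avg` are hypothetical. The random perturbation is assumed to be a shuffle of the feature column within each tree's out-of-bag rows.

```python
import random
from typing import Callable, List, Sequence

Row = List[float]
Tree = Callable[[Row], int]  # assumption: a fitted tree is a callable row -> label


def oob_error(tree: Tree, rows: Sequence[Row], labels: Sequence[int]) -> float:
    """Fraction of out-of-bag rows that the tree misclassifies."""
    wrong = sum(1 for row, label in zip(rows, labels) if tree(row) != label)
    return wrong / len(rows)


def mda_for_tree(tree: Tree, rows: Sequence[Row], labels: Sequence[int],
                 feature: int, rng: random.Random) -> float:
    """|OOBerror - OOBerror_new| for one tree after shuffling one feature column."""
    baseline = oob_error(tree, rows, labels)
    column = [row[feature] for row in rows]
    rng.shuffle(column)  # random perturbation of the feature to be evaluated
    perturbed = [row[:feature] + [value] + row[feature + 1:]
                 for row, value in zip(rows, column)]
    return abs(baseline - oob_error(tree, perturbed, labels))


def mda_avg(trees: Sequence[Tree], oob_rows: Sequence[Sequence[Row]],
            oob_labels: Sequence[Sequence[int]], feature: int, seed: int = 0) -> float:
    """Mean of the per-tree MDA values, used as the importance of the feature."""
    rng = random.Random(seed)
    values = [mda_for_tree(tree, rows, labels, feature, rng)
              for tree, rows, labels in zip(trees, oob_rows, oob_labels)]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Toy stand-in for a decision tree: it looks only at feature 0.
    stump = lambda row: int(row[0] > 0.5)
    rows = [[[0.1, 0.9], [0.9, 0.2], [0.2, 0.5], [0.8, 0.1]]]
    labels = [[0, 1, 0, 1]]
    print(mda_avg([stump], rows, labels, feature=0))
    print(mda_avg([stump], rows, labels, feature=1))
```

Because the toy tree never reads feature 1, shuffling that column cannot change its out-of-bag error, so its importance is zero; shuffling feature 0 may raise the error.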

The above formulas show that the more the accuracy drops when a deviation feature is perturbed, the more important that deviation feature is; in other words, a larger MDA value indicates a more important deviation feature. The deviation features can therefore be ranked in descending order of importance according to the calculated MDA values.
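As a small illustration of this ranking step, the sketch below sorts feature names by their averaged MDA values in descending order. The function name `rank_by_importance` and the numeric values are made up for the example and do not come from the specification.

```python
from typing import Dict, List


def rank_by_importance(mda_avg_by_feature: Dict[str, float]) -> List[str]:
    """Return feature names sorted by averaged MDA, most important first."""
    return sorted(mda_avg_by_feature, key=mda_avg_by_feature.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical importance values, for illustration only.
    example = {"feature_a": 0.02, "feature_b": 0.11, "feature_c": 0.07}
    print(rank_by_importance(example))
```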

Further, in one or more embodiments of the present disclosure, the feature importance of each deviation feature in each candidate deviation feature set may be evaluated by the random forest over one or more rounds. The specific number of rounds may be determined as needed and is not limited in the present disclosure.

Fig. 3 is a schematic diagram of evaluating the importance of each deviation feature of a fault classification in the present specification, and as can be seen from fig. 3, among the importance of each deviation feature of a fault classification, the sum-of-squares error is the highest, the sum-of-squares error is the lowest, and the importance ranking of other deviation features is not illustrated.

Fig. 4 is a schematic diagram of the importance evaluation of each deviation feature for DSV fault level classification in the present specification. As can be seen from Fig. 4, among the deviation features for DSV fault level classification, Lin's concordance coefficient has the highest importance and the normalized correlation coefficient has the lowest; the importance ranking of the other deviation features is not listed one by one.

S104: According to the ranking of the candidate deviation feature sets, input the deviation features in at least one candidate deviation feature set into the nearest-neighbor node algorithm to classify transformer winding faults, and obtain the predicted fault type corresponding to each sample data.

After the importance evaluation and ranking of the deviation features, the server may select candidate deviation feature sets incrementally, following the ranking in step S103 and increasing the number of selected sets from low to high, and input the deviation features in the selected sets into the nearest-neighbor node algorithm (K-Nearest Neighbor, KNN) to classify transformer winding faults.

Specifically, with at least one deviation feature corresponding to each sample data as a sample feature thereof, the server may input the sample feature corresponding to each sample data into the nearest neighbor node algorithm.

Through the nearest-neighbor node algorithm, the Euclidean distance between the sample data to be classified and each preset classified sample data is calculated from their sample features, and the classified sample data are sorted in ascending order of Euclidean distance. The Euclidean distance measures the similarity between the sample data to be classified and the classified sample data: the shorter the distance, the more similar they are. Specifically, the calculation can be performed by the following formula:

$$d(\alpha_i, \beta_j) = \sqrt{\sum_{k=1}^{n}\left(\alpha_{ik} - \beta_{jk}\right)^2}$$

where, in the feature space $\omega$ formed by the $n$-dimensional deviation features, $\alpha_i \in \omega$ and $\beta_j \in \omega$ are the sample features corresponding to two sample data in the data set, $\alpha_i = (\alpha_{i1}, \alpha_{i2}, \ldots, \alpha_{in})$, and $\beta_j = (\beta_{j1}, \beta_{j2}, \ldots, \beta_{jn})$. The Euclidean distance is taken as an example here; the specific distance measure can be determined as needed and is not limited in this specification. For example, it may be the Manhattan distance, the Mahalanobis distance, or another distance measure.
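The distance calculation and ascending sort can be sketched as below. The helper names `euclidean` and `sort_by_distance`, and the representation of classified samples as `(features, label)` pairs, are assumptions made for the illustration.

```python
import math
from typing import List, Sequence, Tuple


def euclidean(alpha: Sequence[float], beta: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal dimension."""
    if len(alpha) != len(beta):
        raise ValueError("feature vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(alpha, beta)))


def sort_by_distance(query: Sequence[float],
                     classified: Sequence[Tuple[Sequence[float], str]]
                     ) -> List[Tuple[float, str]]:
    """Return (distance, label) pairs for the classified samples, nearest first."""
    pairs = [(euclidean(query, features), label) for features, label in classified]
    return sorted(pairs, key=lambda pair: pair[0])


if __name__ == "__main__":
    # Hypothetical two-dimensional deviation features with fault-type labels.
    samples = [([3.0, 4.0], "SC"), ([1.0, 0.0], "AD")]
    print(sort_by_distance([0.0, 0.0], samples))
```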

Then, in order from front to back, a preset number of classified sample data are selected as the neighbor sample data of the sample data to be classified. The preset number of adjacent nodes can be determined as needed and is not limited in this specification; for example, it may be a preset value from 1 to 20.

Finally, the predicted fault type of the sample data to be classified is determined according to the voting result of the neighbor sample data.

Classification by the KNN algorithm is a mature technique, so the classification process is only briefly described here and is not repeated.
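A minimal sketch of the equal-weight vote follows, assuming the neighbors are already given as `(distance, label)` pairs sorted in ascending order of distance. The function name `knn_majority_vote` is hypothetical.

```python
from collections import Counter
from typing import Sequence, Tuple


def knn_majority_vote(sorted_neighbors: Sequence[Tuple[float, str]], k: int) -> str:
    """Each of the k nearest neighbors casts one vote for its own fault type."""
    top_k = sorted_neighbors[:k]
    votes = Counter(label for _, label in top_k)
    # most_common breaks ties by first occurrence, i.e. by the closest neighbor.
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    neighbors = [(0.1, "AD"), (0.2, "SC"), (0.3, "SC"), (0.9, "AD")]
    print(knn_majority_vote(neighbors, k=3))
```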

As can be seen from the above, the conventional KNN algorithm assumes that all nearest neighbor samples have equal voting weights, i.e. the impact on the model prediction results is the same. However, due to the imbalance in the number of different types of transformer winding deformation samples (i.e., failure samples), i.e., maldistribution of data samples, the classification accuracy of the model may be reduced if all samples have the same weight.

Thus, in one or more embodiments of the present description, transformer winding fault classification is performed by an inverse distance weighted nearest neighbor node algorithm (Inverse Distance Weighting K-NearestNeighbor, IDW-KNN). The algorithm uses the inverse of the distance to give different weights to neighbor samples around the sample to be predicted, i.e. the size of the weights is determined according to the distance. Specifically, samples closer to the sample are given greater weight, and samples farther from the sample are given less weight. The results are then weighted summed according to the weight value assigned to each nearest neighbor sample and the samples are classified using voting.

Specifically, the server may select, from the classified sample data, neighbor sample data of a predetermined number of neighboring nodes of the sample data to be classified according to the order from front to back, and calculate the weight of each neighbor sample data by an inverse distance weighting method according to the euclidean distance between the sample data to be classified and the neighbor sample data, specifically by the following formula:

where $w(a)$ is the weight of the $a$-th neighbor sample, and $d(\alpha_a, \beta_a)$ is the distance, which may be the Euclidean distance, between the sample data to be classified $\alpha_a$ and the $a$-th neighbor sample.

Finally, the fault types of the neighbor sample data are weighted and summed according to the weights of the neighbor sample data, and the predicted fault type corresponding to the sample data to be classified is calculated.

In the weighted voting, the weights of neighbor samples belonging to the same fault type are added together. In the conventional nearest-neighbor voting process, each neighbor node casts one vote for its own type; with inverse distance weighting, each neighbor node instead casts a number of votes equal to its inverse distance for its own type. The fault type with the largest weight sum is then determined as the predicted fault type of the sample data to be classified. The same procedure applies to the further classification of the fault degree under a given fault type and is not repeated here.
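The weighted vote can be sketched as follows. The weight is assumed here to be the plain reciprocal of the distance, and a small `eps` term (not part of the specification) is added to avoid division by zero when a neighbor coincides with the sample to be classified. The function name `idw_knn_vote` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Sequence, Tuple


def idw_knn_vote(sorted_neighbors: Sequence[Tuple[float, str]], k: int,
                 eps: float = 1e-12) -> str:
    """Sum inverse-distance weights per fault type and return the heaviest type."""
    totals: Dict[str, float] = defaultdict(float)
    for distance, label in sorted_neighbors[:k]:
        totals[label] += 1.0 / (distance + eps)  # assumed weight: 1 / distance
    return max(totals, key=totals.get)


if __name__ == "__main__":
    # One very close "AD" neighbor outweighs two distant "SC" neighbors.
    neighbors = [(0.1, "AD"), (1.0, "SC"), (1.0, "SC")]
    print(idw_knn_vote(neighbors, k=3))
```

With equal weights the two "SC" neighbors would win this vote; under inverse distance weighting the single nearby "AD" neighbor dominates, which is the behavior the passage describes for unevenly distributed samples.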

By applying the IDW-KNN algorithm, the similarity and the difference between the classified samples and the samples to be classified can be reflected better, the problem of the traditional KNN algorithm when the unbalanced sample distribution is processed is solved, the contribution of each classified sample to the classification result is ensured to be proportional to the position and the distance of the classified sample in the sample space, and therefore the accuracy of the fault classification of the transformer winding is improved.

S105: Taking the minimum difference between the predicted fault type and the true fault type of each sample data as the optimization target, determine the preferred deviation features and the preferred number of adjacent nodes adopted by the nearest-neighbor node algorithm, so as to classify transformer winding faults according to the optimized nearest-neighbor node algorithm.

Based on the process of determining the predicted fault type in step S104, the server may select different numbers of deviation features and preset different numbers of adjacent nodes, and determine the optimal deviation feature selection and the optimal number of adjacent nodes (i.e., the K value) through multiple rounds of classification prediction.

Taking the 20 kinds of deviation features in step S102 as an example, in each round of classification prediction the server may take different numbers of deviation features, chosen according to the ranking of the candidate deviation feature sets, as the sample features of each sample data, and determine the predicted fault type of the sample data through the nearest-neighbor node algorithm using those sample features and different preset numbers of adjacent nodes. Finally, the preferred deviation features and the preferred number of adjacent nodes are determined with the minimum difference between the predicted fault type and the true fault type of each sample data as the optimization target. The preferred deviation features are the one or more best-performing deviation features selected from the 20 deviation features for subsequent transformer winding fault classification, and the preferred number of adjacent nodes is the optimal K value. Provided that the accuracy is sufficiently high, as few deviation features as possible may be selected so as to limit the complexity of the classification algorithm.

For example, the server may perform the transformer winding fault classification prediction by using the bias feature set to be selected with the highest importance according to the ranking of the bias feature sets to be selected, determine the optimal K value under the condition, and then determine the prediction accuracy of the transformer winding fault classification prediction by using the bias feature set to be selected with the highest importance under the optimal K value.

Then, the server can conduct transformer winding fault classification prediction through the first two to-be-selected deviation feature sets in the to-be-selected deviation feature set sequence, determine an optimal K value under the condition, and then determine the prediction accuracy of the transformer winding fault classification prediction under the optimal K value.

By analogy, candidate deviation feature sets are added one at a time, following their ranking, for transformer winding fault classification prediction, and the optimal K value for each number of sets and the prediction accuracy under that optimal K value are determined respectively.

The server can then determine, from the change in prediction accuracy and with the aim of selecting as few deviation features as possible, the preferred deviation features finally used for transformer winding fault classification, and take the optimal K value corresponding to the preferred deviation features as the preferred number of adjacent nodes.
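The multi-round search over the number of ranked feature sets and the K value can be sketched as below. The callable `evaluate(n, k)`, which is assumed to return the prediction accuracy obtained with the top `n` ranked feature sets and `k` adjacent nodes, is hypothetical, as are the function name and the `tolerance` parameter used to express "as few features as possible at sufficiently high accuracy".

```python
from typing import Callable, List, Sequence, Tuple


def select_features_and_k(evaluate: Callable[[int, int], float],
                          max_feature_sets: int,
                          k_values: Sequence[int],
                          tolerance: float = 0.0) -> Tuple[int, int, float]:
    """Return (number of feature sets, K, accuracy) for the smallest adequate setting."""
    best_per_n: List[Tuple[int, int, float]] = []
    for n in range(1, max_feature_sets + 1):
        # Optimal K for this number of feature sets.
        k_best = max(k_values, key=lambda k: evaluate(n, k))
        best_per_n.append((n, k_best, evaluate(n, k_best)))
    top_accuracy = max(accuracy for _, _, accuracy in best_per_n)
    # Prefer the fewest feature sets whose accuracy is within the tolerance of the best.
    for n, k_best, accuracy in best_per_n:
        if accuracy >= top_accuracy - tolerance:
            return n, k_best, accuracy
    raise ValueError("max_feature_sets must be at least 1")


if __name__ == "__main__":
    # Made-up accuracies for illustration; real values would come from KNN runs.
    table = {(1, 1): 0.70, (1, 3): 0.80, (2, 1): 0.90,
             (2, 3): 0.95, (3, 1): 0.95, (3, 3): 0.90}
    print(select_features_and_k(lambda n, k: table[(n, k)], 3, [1, 3]))
```

In practice `k_values` could be `range(1, 21)`, matching the 1 to 20 range mentioned for the preset number of adjacent nodes.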

After the preferred deviation features and the preferred number of adjacent nodes have been determined, they can be applied to the nearest-neighbor node algorithm or the inverse distance weighted nearest-neighbor node algorithm, and the server can then classify transformer winding faults through the parameter-optimized nearest-neighbor node algorithm or inverse distance weighted nearest-neighbor node algorithm.

Based on the transformer winding fault classification method shown in fig. 1, firstly, obtaining frequency response curves of transformer windings in various states as sample data and marking the sample data, then extracting deviation features of the sample data in various fault types through a plurality of data feature index extraction methods to obtain a plurality of to-be-selected deviation feature sets, then carrying out feature importance assessment on the various deviation features through random forests, inputting at least one deviation feature into a nearest node algorithm for transformer winding fault classification according to the importance of the deviation feature sets, obtaining predicted fault types corresponding to the sample data, finally determining preferred deviation features and the number of preferred adjacent nodes adopted by the nearest node algorithm by taking the minimum difference between the predicted fault types and the actual fault types of the sample data as an optimization target, and carrying out transformer winding fault classification according to the optimized nearest node algorithm.

In addition, before the faults of the transformer windings to be tested are classified, a model is built according to the transformer windings, fault windings of various types and degrees are simulated, the fault types of the transformer are fully covered, frequency response tests are carried out, and the comprehensiveness of a frequency response data set is improved.

According to the invention, RF feature selection is combined with IDW-KNN to carry out deviation analysis on the healthy and faulty frequency response curves, so that the differences between the frequency response curves before and after winding deformation can be fully mined and the degree of matching between the healthy and faulty frequency response curves can be evaluated by comparison and quantification. The method addresses the inaccurate application of feature indexes and the poor parameter selection of current intelligent algorithms, and overcomes the bottleneck of insufficient evaluation of the running state of the transformer. The method can provide finer and more complete diagnosis results, thereby improving the accuracy and reliability of transformer fault diagnosis.

On the basis, the invention considers the K value and the number of the selected deviation features from low to high, and determines the optimal K value and the number of the deviation features by taking the highest diagnosis accuracy of the RF-IDW-KNN model as the target, thereby realizing parameter optimization. Compared with the prior art, the method can reflect winding deformation faults more accurately, improve the classification efficiency and accuracy of the algorithm model, and reduce the feature space dimension. The method is simple and easy to understand, has low calculation complexity, successfully overcomes the defects of the traditional frequency response method in terms of depending on experience judgment and being influenced by external environment, and provides a unified framework for engineers lacking relevant experience for fault classification diagnosis.

When the method for classifying faults of the transformer winding provided in the present specification is applied, the steps may not be executed according to the sequence of the steps shown in fig. 1, and the execution sequence of the specific steps may be determined according to needs, which is not limited in the present specification.

Based on the above description, in one or more embodiments of the present disclosure, the server may first establish a frequency response data set, then perform feature extraction, perform feature importance assessment and ranking through a random forest to obtain a feature set to be selected, then determine a fault classification result of a sample to be classified through an inverse distance weighted nearest neighbor node algorithm, determine an optimal K value and feature number through multiple rounds of application, and finally perform transformer winding fault classification through the optimized inverse distance weighted nearest neighbor node algorithm. Fig. 5 is a schematic diagram illustrating an integration flow of a transformer winding fault classification method according to the present disclosure.

In addition, the present specification also provides an example of applying the transformer winding fault classification method in which its accuracy is tested; the test results are shown in Table 2.

Table 2 Test results

As can be seen from table 2, the test sample set includes AD, FB, DSV, SC samples with four different fault types and different fault degrees, and can be used for testing the accuracy of the algorithm, and the diagnosis result shows that the diagnostic effect of the algorithm on the set of samples is good. The test accuracy for the fault type and the fault degree can reach 100%. Test results show that the transformer winding fault classification method provided by the invention can evaluate the fault state of the transformer winding and accurately identify the fault type without manual intervention.

Based on the same idea as the transformer winding fault classification method provided above for one or more embodiments, the present specification further provides a corresponding transformer winding fault classification device, as shown in Fig. 6.

Fig. 6 is a schematic diagram of a transformer winding fault classification device provided in the present specification, including:

the construction module 201 is configured to construct sample data of a frequency response curve corresponding to a normal state of a transformer winding and various fault types, and label real fault types corresponding to the sample data;

the extracting module 202 is configured to extract a plurality of deviation features between frequency response curves of the transformer winding in the sample data in a normal state and under various fault types according to a plurality of data feature index extracting methods, so as to obtain a plurality of to-be-selected deviation feature sets;

the evaluation module 203 is configured to evaluate feature importance of each bias feature in each bias feature set to be selected through a random forest, and arrange each bias feature set to be selected in a descending order according to the feature importance;

the prediction module 204 is configured to input, according to the order of each of the to-be-selected deviation feature sets, features in at least one of the to-be-selected deviation feature sets into a nearest node algorithm to perform transformer winding fault classification, so as to obtain a predicted fault type corresponding to each sample data;

and the parameter optimization module 205 is configured to determine, with the minimum difference between the predicted fault type and the actual fault type of each sample data as an optimization target, a preferred deviation feature and a preferred number of neighboring nodes adopted by the nearest neighboring node algorithm, so as to classify the transformer winding faults according to the optimized nearest neighboring node algorithm.

Optionally, the construction module 201 simulates transformer winding faults of different degrees under different fault types by adjusting the number of spacer blocks and/or faulty disks at different positions of the transformer according to the electrical characteristics of the transformer winding, and obtains, based on a frequency response test, sample data of the frequency response curves corresponding to the normal state of the transformer winding and to the various fault types, wherein the different fault types comprise overall axial displacement, forced buckling of winding disks, disk spacing variation, and inter-disk short circuit.

Optionally, the plurality of data feature index extraction methods include: Euclidean distance, Lin's concordance coefficient, sum of errors, sum of squares error, correlation coefficient, sum of squares max-min ratio error, absolute sum of logarithmic error, cross-correlation factor, root mean square error, comparison standard deviation, least squares error, expected, minimum maximum, standard deviation, sum of squares error, spectral deviation, absolute average difference, random spectral deviation, normalized correlation coefficient, covariance.

Optionally, the evaluation module 203 takes the plurality of deviation features corresponding to each sample data as its sample features, and constructs a plurality of decision trees for transformer winding fault classification according to the sample features of each sample data to form a random forest. For each decision tree, a first prediction error of the decision tree is calculated according to the out-of-bag data of the decision tree and the corresponding labels. Features to be evaluated are selected one by one from the deviation features of the out-of-bag data of the decision tree and randomly perturbed, and a second prediction error of the decision tree is calculated from the perturbed out-of-bag data and the corresponding labels. The mean decrease in accuracy of the decision tree after the feature to be evaluated is perturbed is calculated from the difference between the first prediction error and the second prediction error by the following formula:

$$\mathrm{MDA}_j(X) = \left|\mathrm{OOBerror}(X)_j - \mathrm{OOBerror\_new}(X)_j\right|$$

The mean of the MDA values of all decision trees after the feature to be evaluated is perturbed is calculated as the importance of the feature to be evaluated by the following formula:

$$\mathrm{MDA\_avg}(X) = \frac{1}{N}\sum_{j=1}^{N}\mathrm{MDA}_j(X)$$

where the feature to be evaluated is the deviation feature type corresponding to the candidate deviation feature set $X$, $j$ is the index of the decision tree, $N$ is the number of decision trees in the random forest, $\mathrm{MDA}_j(X)$ is the mean decrease in accuracy of the $j$-th decision tree after deviation feature $X$ of its out-of-bag data is perturbed, $\mathrm{OOBerror}(X)_j$ is the first prediction error of the $j$-th decision tree, $\mathrm{OOBerror\_new}(X)_j$ is the second prediction error of the $j$-th decision tree after deviation feature $X$ of its out-of-bag data is perturbed, and $\mathrm{MDA\_avg}(X)$ is the mean of the MDA values of the decision trees after the feature $X$ to be evaluated is perturbed.

Optionally, the prediction module 204 takes at least one deviation feature corresponding to each sample data as its sample feature and inputs the sample feature corresponding to each sample data into the nearest-neighbor node algorithm. The Euclidean distance between the sample data to be classified and each preset classified sample data is calculated from their sample features, and the classified sample data are sorted in ascending order of Euclidean distance. In order from front to back, a preset number of classified sample data are selected as the neighbor sample data of the sample data to be classified. The weight of each neighbor sample data is calculated by the inverse distance weighting method according to the Euclidean distance between the sample data to be classified and that neighbor sample data. The fault types of the neighbor sample data are then weighted and summed according to these weights to calculate the predicted fault type corresponding to the sample data to be classified.

Optionally, the parameter optimization module 205 uses the deviation features of different numbers corresponding to each sample data as the sample features according to the order of the deviation feature sets to be selected, determines the predicted fault type of the sample data by using the sample features composed of the deviation features of different numbers and different preset adjacent node numbers through the nearest neighbor node algorithm, and determines the preferred deviation features and the preferred adjacent node numbers by taking the minimum difference between the predicted fault type and the actual fault type of each sample data as the optimization target.

Optionally, the extracting module 202 further normalizes the plurality of candidate deviation feature sets by the following formula:

$$x^{*} = \frac{x - \min(x)}{\max(x) - \min(x)}$$

where $x$ is the deviation feature to be normalized, $\max(x)$ is the maximum deviation feature value in the deviation feature set containing $x$, $\min(x)$ is the minimum deviation feature value in that set, and $x^{*}$ is the normalized deviation feature.
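A minimal sketch of this normalization follows, assuming it is the usual min-max scaling implied by the $\max(x)$ and $\min(x)$ terms. The function name `min_max_normalize` is hypothetical, and mapping a constant feature set to zeros is an added assumption to avoid division by zero.

```python
from typing import List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale the values of one deviation feature set into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature set carries no information, map it to 0.
        return [0.0 for _ in values]
    return [(value - lo) / (hi - lo) for value in values]


if __name__ == "__main__":
    print(min_max_normalize([2.0, 4.0, 6.0]))
```

Scaling each feature set to a common range keeps features with large numeric values from dominating the Euclidean distance used by the nearest-neighbor step.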

For specific limitations on the transformer winding fault classification device, reference may be made to the above limitation on the transformer winding fault classification method, and no further description is given here. The above-described respective modules in the transformer winding fault classification apparatus may be implemented in whole or in part by software, hardware, and combinations thereof. The above modules may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.

The present specification also provides a computer readable storage medium storing a computer program operable to perform the transformer winding fault classification method provided in fig. 1 above.

The present specification also provides a schematic structural diagram of the computer device shown in fig. 7, where, as shown in fig. 7, the computer device includes a processor, an internal bus, a network interface, a memory, and a nonvolatile memory, and may include hardware required by other services. The processor reads the corresponding computer program from the nonvolatile memory into the memory and then runs the computer program to realize the transformer winding fault classification method provided in fig. 1.

Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program stored on a non-transitory computer-readable storage medium, and that the program, when executed, may comprise the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include read-only memory (Read-Only Memory, ROM), magnetic tape, floppy disk, flash memory, optical memory, or the like. Volatile memory may include random access memory (Random Access Memory, RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM).

The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered to be within the scope of this specification.

Claims

1. A method for classifying faults of a transformer winding, comprising:

2. The method for classifying faults of a transformer winding according to claim 1, wherein the constructing sample data of frequency response curves corresponding to normal states of the transformer winding and various fault types specifically comprises:

3. The transformer winding fault classification method according to claim 1, wherein the multiple data characteristic index extraction method specifically comprises:

4. The transformer winding fault classification method according to claim 1, wherein the feature importance evaluation is performed on each deviation feature in each candidate deviation feature set through a random forest, specifically comprising:

$$\mathrm{MDA}_j(X) = \left|\mathrm{OOBerror}(X)_j - \mathrm{OOBerror\_new}(X)_j\right|;$$

wherein the feature to be evaluated is the deviation feature type corresponding to the candidate deviation feature set $X$, $j$ is the index of the decision tree, $\mathrm{MDA}_j(X)$ is the mean decrease in accuracy of the $j$-th decision tree after deviation feature $X$ of its out-of-bag data is perturbed, $\mathrm{OOBerror}(X)_j$ is the first prediction error of the $j$-th decision tree, $\mathrm{OOBerror\_new}(X)_j$ is the second prediction error of the $j$-th decision tree after deviation feature $X$ of its out-of-bag data is perturbed, and $\mathrm{MDA\_avg}(X)$ is the mean of the MDA values of the decision trees after the feature $X$ to be evaluated is perturbed.

5. The method for classifying faults of a transformer winding according to claim 1, wherein the step of inputting the features in the at least one bias feature set to the nearest node algorithm to classify faults of the transformer winding to obtain the predicted fault types corresponding to each sample data comprises the following steps:

6. The transformer winding fault classification method according to claim 1, wherein the determining the preferred deviation feature and the preferred number of adjacent nodes adopted by the nearest-neighbor node algorithm with the minimum difference between the predicted fault type and the true fault type of each sample data as an optimization target specifically comprises:

7. The transformer winding fault classification method of claim 1, further comprising:

normalizing the plurality of candidate bias feature sets by:

8. A transformer winding fault classification device, comprising:

9. A computer-readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method of any of the preceding claims 1-7.

10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of any of the preceding claims 1-7 when executing the program.