CN114019282A - Transformer fault diagnosis method based on principal component analysis and random forest phase fusion - Google Patents
Transformer fault diagnosis method based on principal component analysis and random forest phase fusion
- Publication number
- CN114019282A (application CN202111298905.8A)
- Authority
- CN
- China
- Prior art keywords
- fault
- random forest
- chromatographic data
- data set
- oil chromatographic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01R—MEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
- G01R31/00—Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N30/00—Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
- G01N30/02—Column chromatography
Landscapes
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Analytical Chemistry (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Immunology (AREA)
- Pathology (AREA)
- Housings And Mounting Of Transformers (AREA)
Abstract
The transformer fault diagnosis method based on the fusion of principal component analysis and random forest comprises: performing ratio dimension increasing on a fault oil chromatographic data set of transformers with determined fault types to obtain a first fault oil chromatographic data set, and performing fault coding; eliminating the correlation among the dimensional features with a principal component analysis model, tuning the first fault oil chromatographic data set until 8 principal features remain, and taking the result as a second fault oil chromatographic data set; dividing the second fault oil chromatographic data set into a training set and a test set in a ratio of 0.8:0.2; training a tuned random forest classification model with the training set to obtain a first random forest classification model, and checking its fault diagnosis accuracy with the test set; and inputting transformer oil chromatographic data into the first random forest classification model whose fault diagnosis accuracy is not less than the set fault diagnosis accuracy (i.e., the final random forest classification model) to obtain a diagnosis result. Combining ratio dimension increasing with the principal component analysis model fully mines the information content of the fault oil chromatographic data set and improves the accuracy of fault diagnosis.
Description
Technical Field
The application relates to the field of transformer fault diagnosis, in particular to a transformer fault diagnosis method based on principal component analysis and random forest fusion.
Background
A power transformer is a stationary electrical device that transforms an AC voltage of one value into one or more different voltages of the same frequency. It is one of the important electrical devices in a power system: power plants and substations use power transformers to raise or lower the voltage to the level required by each power consumption area. A power transformer failure reduces power supply reliability, so to improve reliability a failed power transformer must be repaired promptly.
To allow prompt repair after a power transformer fails, the prior art applies dissolved gas analysis to the power transformer oil and judges the fault type from the components and contents of the gases using the three-ratio method. However, the three-ratio method extracts the effective information in a fault data set inefficiently, so the accuracy of judging the fault type of the power transformer is low.
Disclosure of Invention
The application provides a transformer fault diagnosis method based on principal component analysis and random forest fusion, and aims to solve the technical problem that the accuracy rate of judging the fault type of a power transformer is low.
In order to solve the technical problem, the embodiment of the application discloses the following technical scheme:
in a first aspect, the embodiment of the application discloses a transformer fault diagnosis method based on the fusion of principal component analysis and random forest, which comprises: detecting the oil chromatograms of transformers with determined fault types, extracting the fault oil chromatographic data of the detected faulty transformers, and integrating the fault oil chromatographic data into a fault oil chromatographic data set;
performing ratio dimension increasing on the fault oil chromatographic data set, taking the fault oil chromatographic data set after ratio dimension increasing as a first fault oil chromatographic data set, and performing fault coding;
judging the correlation among the dimensional features according to a correlation heat map of the 36 dimensional features in the first fault oil chromatographic data set;
establishing a principal component analysis model, eliminating the correlation among the dimensional features with the principal component analysis model, tuning the first fault oil chromatographic data set until 8 principal features remain, and taking the result as a second fault oil chromatographic data set, where the second fault oil chromatographic data set contains 99% of the information content of the first fault oil chromatographic data set, and dividing the second fault oil chromatographic data set into a training set and a test set in a ratio of 0.8:0.2;
establishing a tuned random forest classification model, training the random forest classification model with the training set, taking the trained random forest classification model as a first random forest classification model, checking the fault diagnosis accuracy of the first random forest classification model with the test set, and taking the first random forest classification model as the final random forest classification model when its fault diagnosis accuracy is greater than or equal to the set fault diagnosis accuracy;
and inputting newly detected transformer oil chromatographic data into the final random forest classification model for fault diagnosis to obtain a diagnosis result of whether the transformer has a fault.
Optionally, establishing the principal component analysis model, eliminating the correlation among the dimensional features with the principal component analysis model, tuning the first fault oil chromatographic data set until 8 principal features remain, and taking the result as the second fault oil chromatographic data set, which contains 99% of the information content of the first fault oil chromatographic data set, includes:
inputting a sample set $X = \{x_1, x_2, \cdots, x_m\}$ of an n-dimensional space, where $x_i \in \mathbb{R}^n$, to be mapped to a k-dimensional space;
preprocessing, in which the sample variance is scaled to 1 according to the following formula:
$x_i = x_i / \sigma$
calculating the covariance matrix $XX^T$ and performing eigendecomposition on the covariance matrix $XX^T$;
obtaining the k largest eigenvalues and the k corresponding eigenvectors, denoted $\omega_1, \omega_2, \cdots, \omega_k$, and outputting the projection matrix $W = (\omega_1, \omega_2, \cdots, \omega_k)$, where $\omega_k \in \mathbb{R}^n$; in the parameter tuning process, the number of dimensions equals the number of principal components, and the smallest k that retains 99% of the variance of the original data is selected as the number of principal components (the formula itself is not reproduced in this text);
where m is the number of samples, $X^{(i)}$ is the initial matrix, and the second term of the formula (whose symbol is missing from this text) is the matrix after dimensionality reduction to k dimensions. The numerator is the sum of the distances between the original points and their projections. The smaller the error, the more completely the reduced data represent the data before dimensionality reduction; an error below 0.01 means that the reduced data retain 99% of the information.
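The retention formula itself does not survive in this text. A commonly used criterion that matches the surrounding description (a numerator summing the squared distances between original points and their projections, and a 0.01 threshold) is sketched below. The symbol $x^{(i)}_{\mathrm{approx}}$ for the projected point and the reading of m as the number of samples in $X = \{x_1, \cdots, x_m\}$ are assumptions, not notation confirmed by the source.

```latex
\frac{\dfrac{1}{m}\sum_{i=1}^{m}\left\| x^{(i)} - x^{(i)}_{\mathrm{approx}} \right\|^{2}}
     {\dfrac{1}{m}\sum_{i=1}^{m}\left\| x^{(i)} \right\|^{2}} \le 0.01
```

Under this reading, k is increased from 1 until the inequality first holds, and that k is taken as the number of principal components.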
Optionally, establishing the tuned random forest classification model includes:
setting the impurity measure as criterion = 'mse'; setting whether samples are drawn with replacement (the value is missing from this text); setting the number of features considered at each split as max_features = 'sqrt'; setting the maximum depth of the tree (the value is missing from this text); setting the minimum number of samples required to split a node as min_samples_split = 5; setting the number of decision trees as n_estimators = 1000; and setting the minimum number of samples per leaf node as min_samples_leaf = 4.
Optionally, performing ratio dimension increasing on the extracted fault oil chromatographic data, taking the fault oil chromatographic data after ratio dimension increasing as the first fault oil chromatographic data, and integrating the data into a fault oil chromatographic data set includes:
the fault oil chromatographic data set comprises the contents of hydrogen, methane, ethane, ethylene, acetylene, carbon monoxide, and carbon dioxide, and the content of total hydrocarbons obtained by summing all the hydrocarbons;
the first fault oil chromatographic data set comprises the contents of hydrogen, methane, ethane, ethylene, acetylene, carbon monoxide, carbon dioxide, and total hydrocarbons, together with the content ratios of hydrogen to each of methane, ethane, ethylene, acetylene, carbon monoxide, carbon dioxide, and total hydrocarbons; of methane to each of ethane, ethylene, acetylene, carbon monoxide, carbon dioxide, and total hydrocarbons; of ethane to each of ethylene, acetylene, carbon monoxide, carbon dioxide, and total hydrocarbons; of ethylene to each of acetylene, carbon monoxide, carbon dioxide, and total hydrocarbons; of acetylene to each of carbon monoxide, carbon dioxide, and total hydrocarbons; of carbon monoxide to each of carbon dioxide and total hydrocarbons; and of carbon dioxide to total hydrocarbons.
Optionally, inputting newly detected transformer oil chromatographic data into the final random forest classification model for fault diagnosis, and obtaining a diagnosis result of whether the transformer has a fault, including:
when the diagnosis result is that the transformer has a fault, the type and the position of the fault can be obtained.
The beneficial effects of the present application are as follows:
the transformer fault diagnosis method based on the fusion of principal component analysis and random forest provided by the embodiments of the present application comprises: detecting the oil chromatograms of transformers with determined fault types, extracting the fault oil chromatographic data of the detected faulty transformers, and integrating the data into a fault oil chromatographic data set; performing ratio dimension increasing on the fault oil chromatographic data set, taking the result as a first fault oil chromatographic data set, and performing fault coding; judging the correlation among the dimensional features according to a correlation heat map of the 36 dimensional features in the first fault oil chromatographic data set; establishing a principal component analysis model, eliminating the correlation among the dimensional features with it, tuning the first fault oil chromatographic data set until 8 principal features remain, and taking the result as a second fault oil chromatographic data set, which contains 99% of the information content of the first fault oil chromatographic data set; dividing the second fault oil chromatographic data set into a training set and a test set in a ratio of 0.8:0.2; establishing a tuned random forest classification model, training it with the training set, taking the trained model as a first random forest classification model, checking its fault diagnosis accuracy with the test set, and taking it as the final random forest classification model when that accuracy is greater than or equal to the set fault diagnosis accuracy; and inputting newly detected transformer oil chromatographic data into the final random forest classification model for fault diagnosis to obtain a diagnosis result of whether the transformer has a fault. The ratio dimension increasing method raises the dimensionality of the fault oil chromatographic data, the principal component analysis model eliminates the correlation among the dimensional features and reduces the first fault oil chromatographic data set to 8 principal features, and the combination of the two fully mines the information content of the fault oil chromatographic data set. Training the random forest classification model on the training set of the second fault oil chromatographic data set, whose information content has been fully mined, improves the accuracy with which the model judges whether the power transformer is faulty and the fault type, and thus improves the accuracy of the transformer fault diagnosis method as a whole.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
In order to more clearly explain the technical solution of the present application, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious to those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic flowchart of a transformer fault diagnosis method based on principal component analysis and random forest fusion according to an embodiment of the present application;
Fig. 2 is a schematic process diagram of a transformer fault diagnosis method based on principal component analysis and random forest fusion according to an embodiment of the present application.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to Fig. 1, to address the low accuracy of judging the fault type of a power transformer, the embodiment of the present application provides a transformer fault diagnosis method based on the fusion of principal component analysis and random forest, including steps S110 to S160.
S110: detecting the oil chromatograms of transformers with determined fault types, extracting the fault oil chromatographic data of the detected faulty transformers, and integrating the fault oil chromatographic data into a fault oil chromatographic data set.
S120: performing ratio dimension increasing on the fault oil chromatographic data set, taking the fault oil chromatographic data set after ratio dimension increasing as a first fault oil chromatographic data set, and performing fault coding.
In some embodiments, as shown in Fig. 2, in actual coding, performing singular value decomposition on the covariance matrix during principal component analysis yields the S matrix. The expression for the principal component analysis error is then equivalent to an expression in terms of S (the formula itself is not reproduced in this text),
where $S_i$ is the matrix of eigenvalues.
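The equivalent expression is missing from this text. One standard form, given here only as a plausible reading, uses the diagonal entries of S sorted in descending order. Treating $S_i$ as the i-th diagonal entry, and n as the number of original dimensions, are assumptions.

```latex
1 - \frac{\sum_{i=1}^{k} S_{i}}{\sum_{i=1}^{n} S_{i}} \le 0.01
\quad\Longleftrightarrow\quad
\frac{\sum_{i=1}^{k} S_{i}}{\sum_{i=1}^{n} S_{i}} \ge 0.99
```

Read this way, the first k eigenvalues must account for at least 99% of the total, which avoids recomputing the projection for every candidate k.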
In some embodiments, as shown in Fig. 2, performing ratio dimension increasing on the extracted fault oil chromatographic data, taking the fault oil chromatographic data after ratio dimension increasing as the first fault oil chromatographic data, and integrating the data into a fault oil chromatographic data set includes:
the fault oil chromatographic data set comprises the contents of hydrogen, methane, ethane, ethylene, acetylene, carbon monoxide, and carbon dioxide, and the content of total hydrocarbons obtained by summing all the hydrocarbons;
the first fault oil chromatographic data set comprises the contents of hydrogen, methane, ethane, ethylene, acetylene, carbon monoxide, carbon dioxide, and total hydrocarbons (TH), together with the content ratios of hydrogen to each of methane, ethane, ethylene, acetylene, carbon monoxide, carbon dioxide, and total hydrocarbons; of methane to each of ethane, ethylene, acetylene, carbon monoxide, carbon dioxide, and total hydrocarbons; of ethane to each of ethylene, acetylene, carbon monoxide, carbon dioxide, and total hydrocarbons; of ethylene to each of acetylene, carbon monoxide, carbon dioxide, and total hydrocarbons; of acetylene to each of carbon monoxide, carbon dioxide, and total hydrocarbons; of carbon monoxide to each of carbon dioxide and total hydrocarbons; and of carbon dioxide to total hydrocarbons. The 8 contents and the 28 pairwise ratios together give the 36 dimensional features.
The specific ratio mode of the first fault oil chromatogram data set is shown in table 1:
TABLE 1
where TH = CH4 + C2H6 + C2H4 + C2H2.
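The ratio dimension increasing described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `ratio_dimension_increase`, the small `eps` guard against division by zero, and the sample gas contents are all hypothetical.

```python
from itertools import combinations

import numpy as np

# Gas order follows the description; TH is total hydrocarbons.
GASES = ["H2", "CH4", "C2H6", "C2H4", "C2H2", "CO", "CO2", "TH"]


def ratio_dimension_increase(contents, eps=1e-9):
    """Expand 7 measured gas contents into 36 features.

    contents: array of shape (n_samples, 7), ordered as GASES[:7].
    eps is a hypothetical guard against division by zero.
    """
    contents = np.asarray(contents, dtype=float)
    # TH = CH4 + C2H6 + C2H4 + C2H2 (columns 1 to 4).
    th = contents[:, 1:5].sum(axis=1, keepdims=True)
    base = np.hstack([contents, th])  # 8 content features
    names = list(GASES)
    ratios = []
    # Every unordered pair (i < j) gives one ratio: 28 in total.
    for i, j in combinations(range(len(GASES)), 2):
        ratios.append(base[:, i] / (base[:, j] + eps))
        names.append(f"{GASES[i]}/{GASES[j]}")
    return np.hstack([base, np.column_stack(ratios)]), names


# Hypothetical gas contents for one sample (units as measured).
sample = np.array([[120.0, 45.0, 12.0, 60.0, 3.0, 300.0, 2500.0]])
features, feature_names = ratio_dimension_increase(sample)
```

Taking each pair once, in the order the gases are listed, reproduces the count of 36 features used in step S130.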
In some embodiments, as shown in Fig. 2, because the random forest classification model is a tree-based model, it does not rely on vector-space distances when processing variables; a numerical code is treated only as a category, i.e., no partial order is implied, so label encoding is a reasonable choice. In the present application, "high temperature overheat", "medium temperature overheat", "low temperature overheat", "partial discharge", "low energy discharge", and "normal" in the fault type are encoded as "1", "2", "3", "4", "5", "6", and "7", respectively, using LabelEncode (label encoding).
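A minimal sketch of label encoding, assuming scikit-learn's `LabelEncoder` is the tool meant by "LabelEncode". Only the six fault types named above are used. `LabelEncoder` assigns codes in sorted label order starting from 0, so the codes are shifted by 1 here and will not necessarily match the numbering used in the application.

```python
from sklearn.preprocessing import LabelEncoder

# The six fault types named in the description.
fault_labels = [
    "high temperature overheat",
    "medium temperature overheat",
    "low temperature overheat",
    "partial discharge",
    "low energy discharge",
    "normal",
]

encoder = LabelEncoder()
# fit_transform returns 0-based codes in sorted label order;
# adding 1 gives 1-based codes (an assumption about the scheme).
codes = encoder.fit_transform(fault_labels) + 1

# Codes can be mapped back to the fault-type names.
decoded = encoder.inverse_transform(codes - 1)
```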
S130: judging the correlation among the dimensional features according to a correlation heat map of the 36 dimensional features in the first fault oil chromatographic data set.
S140: establishing a principal component analysis model, eliminating the correlation among the dimensional features with the principal component analysis model, tuning the first fault oil chromatographic data set until 8 principal features remain, and taking the result as a second fault oil chromatographic data set, where the second fault oil chromatographic data set contains 99% of the information content of the first fault oil chromatographic data set; and dividing the second fault oil chromatographic data set into a training set and a test set in a ratio of 0.8:0.2.
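The matrix behind such a correlation heat map can be computed as in the sketch below. The data are randomly generated stand-ins, the 0.9 threshold is a hypothetical choice, and Pearson correlation is assumed because the source does not name the correlation measure.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for the first data set: 50 samples, 36 features.
X = rng.lognormal(size=(50, 36))
X[:, 1] = 2.0 * X[:, 0]  # deliberately redundant column for illustration

# 36 x 36 Pearson correlation matrix (columns are features).
corr = np.corrcoef(X, rowvar=False)

# List strongly correlated feature pairs; the threshold is an assumption.
threshold = 0.9
pairs = [
    (i, j)
    for i in range(corr.shape[0])
    for j in range(i + 1, corr.shape[1])
    if abs(corr[i, j]) > threshold
]
# The matrix could be drawn as a heat map, e.g. with matplotlib's imshow.
```

Strongly correlated pairs like these are the redundancy that the principal component analysis step is meant to remove.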
In some embodiments, as shown in Fig. 2, establishing the principal component analysis model, eliminating the correlation among the dimensional features with the principal component analysis model, tuning the first fault oil chromatographic data set until 8 principal features remain, and taking the result as the second fault oil chromatographic data set, which contains 99% of the information content of the first fault oil chromatographic data set, includes:
inputting a sample set $X = \{x_1, x_2, \cdots, x_m\}$ of an n-dimensional space, where $x_i \in \mathbb{R}^n$, to be mapped to a k-dimensional space;
preprocessing, in which the sample variance is scaled to 1 according to the following formula:
$x_i = x_i / \sigma$
calculating the covariance matrix $XX^T$ and performing eigendecomposition on the covariance matrix $XX^T$;
obtaining the k largest eigenvalues and the k corresponding eigenvectors, denoted $\omega_1, \omega_2, \cdots, \omega_k$, and outputting the projection matrix $W = (\omega_1, \omega_2, \cdots, \omega_k)$, where $\omega_k \in \mathbb{R}^n$; in the parameter tuning process, the number of dimensions equals the number of principal components, and the smallest k that retains 99% of the variance of the original data is selected as the number of principal components (the formula itself is not reproduced in this text);
where m is the number of samples, $X^{(i)}$ is the initial matrix, and the second term of the formula (whose symbol is missing from this text) is the matrix after dimensionality reduction to k dimensions. The numerator is the sum of the distances between the original points and their projections. The smaller the error, the more completely the reduced data represent the data before dimensionality reduction; an error below 0.01 means that the reduced data retain 99% of the information.
In some embodiments, one of the most critical parameters of the principal component analysis model is n_components: if set to an integer, it is the number of principal components to reduce to; if set to a decimal, it is the proportion of information that the reduced data should retain. The principal component parameter is set as n_components = 8, i.e., the dimensionality is reduced to 8 principal components.
This prevents the principal component analysis model from discarding too much of the information in the second fault oil chromatographic data set while eliminating the correlation among the dimensional features. The information content of the second fault oil chromatographic data set is thereby preserved, which improves the accuracy with which the random forest classification model, and hence the transformer fault diagnosis method as a whole, judges whether the power transformer is faulty and the fault type.
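The two ways of setting `n_components` can be illustrated with scikit-learn's `PCA`, which appears to be the library the parameter name refers to. The data below are synthetic (36 features driven by 8 hidden factors), and `StandardScaler` is an assumption: it centers the data in addition to scaling the variance to 1, whereas the source mentions only division by σ.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in: 200 samples, 36 correlated features built
# from 8 hidden factors plus a little noise.
latent = rng.normal(size=(200, 8))
mixing = rng.normal(size=(8, 36))
X_first = latent @ mixing + 0.01 * rng.normal(size=(200, 36))

# Scale each feature to unit variance before PCA.
X_scaled = StandardScaler().fit_transform(X_first)

# Integer form: keep exactly 8 principal components.
pca = PCA(n_components=8)
X_second = pca.fit_transform(X_scaled)
retained = pca.explained_variance_ratio_.sum()

# Decimal form: keep the fewest components explaining 99% of variance.
pca_auto = PCA(n_components=0.99, svd_solver="full").fit(X_scaled)
```

On real oil chromatographic data, `retained` would be the figure to compare against the 99% target stated in the description.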
In some embodiments, the data set is divided into a training set and a test set using the train_test_split() function, where the test set size is set as test_size = 0.2 and the random seed is set as random_state = 1, so that the data set is divided identically on every run and the results can be reproduced.
S150: establishing a tuned random forest classification model, training the random forest classification model with the training set, taking the trained random forest classification model as a first random forest classification model, checking the fault diagnosis accuracy of the first random forest classification model with the test set, and taking the first random forest classification model as the final random forest classification model when its fault diagnosis accuracy is greater than or equal to the set fault diagnosis accuracy.
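A short sketch of the split with the stated settings, using scikit-learn's `train_test_split`. The feature matrix and labels are random placeholders for the second fault oil chromatographic data set and its fault codes.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Placeholder data: 100 samples with 8 principal features, codes 1 to 7.
X_second = rng.normal(size=(100, 8))
y = rng.integers(1, 8, size=100)

# 0.8 : 0.2 split with a fixed seed, as stated in the description.
X_train, X_test, y_train, y_test = train_test_split(
    X_second, y, test_size=0.2, random_state=1
)

# Repeating the call with the same seed gives the same partition.
X_train_again, _, _, _ = train_test_split(
    X_second, y, test_size=0.2, random_state=1
)
```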
In some embodiments, as shown in Fig. 2, establishing the tuned random forest classification model includes:
setting the impurity measure as criterion = 'mse'; setting whether samples are drawn with replacement (the value is missing from this text); setting the number of features considered at each split as max_features = 'sqrt'; setting the maximum depth of the tree (the value is missing from this text); setting the minimum number of samples required to split a node as min_samples_split = 5; setting the number of decision trees as n_estimators = 1000; and setting the minimum number of samples per leaf node as min_samples_leaf = 4.
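The settings above can be sketched with scikit-learn's `RandomForestClassifier`, with three deviations that are assumptions rather than the applicant's choices. First, the text lists criterion = 'mse', which scikit-learn accepts only for regressors, so 'gini' is substituted here. Second, the sampling and maximum-depth values are missing from the text, so library defaults are used. Third, the data, the `random_state` of the forest, and the `REQUIRED_ACCURACY` threshold are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic 8-feature, 3-class data standing in for the second data set.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=6, n_redundant=0,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)

model = RandomForestClassifier(
    n_estimators=1000,     # number of decision trees (from the text)
    criterion="gini",      # assumption: 'mse' is not valid for classifiers
    max_features="sqrt",   # features considered at each split (from the text)
    min_samples_split=5,   # from the text
    min_samples_leaf=4,    # from the text
    random_state=1,        # hypothetical seed for reproducibility
)
model.fit(X_train, y_train)  # the "first" random forest model

# Accept the model as final only if it meets the set accuracy.
accuracy = model.score(X_test, y_test)
REQUIRED_ACCURACY = 0.6  # hypothetical threshold
is_final_model = accuracy >= REQUIRED_ACCURACY
```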
In some embodiments, the random forest classification model also needs to consider two parameters: the number n_tree of decision trees to construct, and the number k of input features considered when each node of a decision tree is split; usually k can be taken as $\log_2 n$, where n is the number of features in the original data set. The construction of a single decision tree can be divided into the following steps:
assuming the number of training samples is m, each decision tree receives m input samples, drawn at random from the training set with replacement;
assuming the number of training sample features is n, k features are randomly selected from the n features for each decision tree, and the best of these k input features is then chosen for splitting;
each tree is split until all training samples at a node belong to the same class. No pruning is required during the splitting of the decision tree.
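The three steps above can be sketched for a single tree as follows. The data are random placeholders, and scikit-learn's `DecisionTreeClassifier` is used as a stand-in: its `max_features` option draws the k candidate features at each node, which matches the per-node description of k.

```python
import math

import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
m, n = 120, 8  # hypothetical numbers of training samples and features
X = rng.normal(size=(m, n))
y = rng.integers(0, 3, size=m)

# Number of candidate features per split: k = log2(n).
k = max(1, int(math.log2(n)))

# Step 1: draw m samples from the training set with replacement.
idx = rng.integers(0, m, size=m)

# Steps 2 and 3: consider k random features at each split and grow
# the tree until every leaf is pure (no depth limit, no pruning).
tree = DecisionTreeClassifier(max_features=k, random_state=0)
tree.fit(X[idx], y[idx])

# A fully grown tree classifies its own bootstrap sample perfectly.
train_accuracy = tree.score(X[idx], y[idx])
```

A forest repeats this n_tree times with different bootstrap samples and combines the trees by majority vote.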
S160: inputting newly detected transformer oil chromatographic data into the final random forest classification model for fault diagnosis to obtain a diagnosis result of whether the transformer has a fault.
In some embodiments, inputting newly detected transformer oil chromatographic data into a final random forest classification model for fault diagnosis to obtain a diagnosis result of whether a fault exists in the transformer, including:
when the diagnosis result is that the transformer has a fault, the type and the position of the fault can be obtained.
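Diagnosis of a newly measured sample can be sketched by chaining the steps in one scikit-learn pipeline. Everything here is illustrative: the data are synthetic, the tree count is reduced to 100 to keep the example quick, and the predicted value is only a fault-type code. How the fault position is derived is not described in this text.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic 36-feature data standing in for the first data set.
X, y = make_classification(
    n_samples=200, n_features=36, n_informative=10,
    n_classes=3, random_state=0,
)
y = y + 1  # 1-based fault codes, as in the description

# Scaling, PCA to 8 components, and the forest in one object, so a
# new sample goes through exactly the same transformations.
diagnoser = make_pipeline(
    StandardScaler(),
    PCA(n_components=8),
    RandomForestClassifier(n_estimators=100, random_state=1),
)
diagnoser.fit(X, y)

# A "newly detected" sample: one row of 36 ratio-expanded features.
new_sample = X[:1]
predicted_code = int(diagnoser.predict(new_sample)[0])
```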
As can be seen from the foregoing embodiments, the transformer fault diagnosis method based on principal component analysis and random forest fusion provided in the embodiments of the present application includes: detecting the oil chromatogram of a transformer with a definite fault type, extracting the fault oil chromatographic data of the detected faulty transformer, and integrating the fault oil chromatographic data into a fault oil chromatographic data set; performing ratio dimension raising on the fault oil chromatographic data set, taking the data set after ratio dimension raising as a first fault oil chromatographic data set, and performing fault coding; determining the correlation among the dimensional features according to a correlation heat map of the 36 dimensional features in the first fault oil chromatographic data set; establishing a principal component analysis model, eliminating the correlation among the dimensional features with the principal component analysis model, and reducing the first fault oil chromatographic data set, through parameter tuning, to the remaining 8 principal features; taking the reduced data set as a second fault oil chromatographic data set, which contains 99% of the information content of the first fault oil chromatographic data set; dividing the second fault oil chromatographic data set into a training set and a test set at a ratio of 0.8:0.2; establishing a parameter-tuned random forest classification model, training it with the training set, and taking the trained model as a first random forest classification model; checking the fault diagnosis accuracy of the first random forest classification model with the test set, and taking it as the final random forest classification model when its fault diagnosis accuracy is greater than or equal to the set fault diagnosis accuracy; and inputting newly detected transformer oil chromatographic data into the final random forest classification model for fault diagnosis to obtain a diagnosis result of whether the transformer has a fault.
The ratio dimension-raising method increases the dimension of the fault oil chromatographic data, the principal component analysis model eliminates the correlation among the dimensional features and reduces the first fault oil chromatographic data set to the remaining 8 principal features, and the combination of the two fully mines the information content of the fault oil chromatographic data set. The training set of the second fault oil chromatographic data set, whose information content has been fully mined, is input into the random forest classification model to train it. This improves the accuracy with which the random forest classification model judges whether the power transformer has a fault and which type of fault it is, and in turn improves the accuracy of the transformer fault diagnosis method based on principal component analysis and random forest fusion.
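The correlation check among the dimensional features described above can be illustrated with a minimal sketch. It assumes each feature is held as a plain list of numbers keyed by a feature name; the function names `pearson` and `correlated_pairs`, the 0.9 threshold, and the handling of constant columns are illustrative assumptions, not part of the application.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Assumption: a constant column is treated as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def correlated_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every feature pair with |r| >= threshold.

    `columns` maps a feature name to its list of values. These are the
    values a correlation heat map would display cell by cell.
    """
    result = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            result.append((name_a, name_b, r))
    return result
```

Pairs returned by `correlated_pairs` are the strongly correlated features that the principal component analysis step is then expected to decorrelate.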
Since the above embodiments are described with reference to and in combination with one another, different embodiments share common parts. For the same or similar parts, the embodiments in this specification may be referred to one another, and these parts are not described again in detail herein.
It is noted that, in this specification, relational terms such as "first" and "second," and the like, are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a circuit structure, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such circuit structure, article, or apparatus. Without further limitation, the presence of an element identified by the phrase "comprising an … …" does not exclude the presence of other like elements in a circuit structure, article or device comprising the element.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
The above-described embodiments of the present application do not limit the scope of the present application.
Claims (5)
1. A transformer fault diagnosis method based on principal component analysis and random forest fusion is characterized by comprising the following steps:
detecting oil chromatography of the transformer with the definite fault type, extracting fault oil chromatography data of the detected fault transformer, and integrating the fault oil chromatography data into a fault oil chromatography data set;
carrying out ratio dimension increasing on the fault oil chromatographic data set, taking the fault oil chromatographic data set subjected to ratio dimension increasing as a first fault oil chromatographic data set, and carrying out fault coding;
judging the correlation among all the dimensional features according to a correlation heat map of the 36 dimensional features in the first fault oil chromatographic data set;
establishing a principal component analysis model, eliminating the correlation among the dimensional features with the principal component analysis model, reducing the first fault oil chromatographic data set through parameter tuning to the remaining 8 principal features, taking the reduced first fault oil chromatographic data set as a second fault oil chromatographic data set, wherein the second fault oil chromatographic data set contains 99% of the information content of the first fault oil chromatographic data set, and dividing the second fault oil chromatographic data set into a training set and a test set at a ratio of 0.8:0.2;
establishing a parameter-tuned random forest classification model, training the random forest classification model with the training set, taking the trained random forest classification model as a first random forest classification model, checking the fault diagnosis accuracy of the first random forest classification model with the test set, and taking the first random forest classification model as a final random forest classification model when its fault diagnosis accuracy is greater than or equal to the set fault diagnosis accuracy;
and inputting newly detected transformer oil chromatographic data into the final random forest classification model for fault diagnosis to obtain a diagnosis result of whether the transformer has faults or not.
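The 0.8:0.2 split and the accuracy check in claim 1 can be sketched independently of any particular classifier. The sketch below is a minimal illustration under stated assumptions: the helper names `split_80_20`, `accuracy` and `accept_model`, the random shuffle, and the `predict` callable standing in for the trained random forest are all hypothetical.

```python
import random


def split_80_20(samples, labels, seed=0):
    """Shuffle the data set and divide it 0.8:0.2 into training and test parts."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # assumption: a random split is intended
    cut = int(round(0.8 * len(indices)))

    def pick(idx):
        return [samples[i] for i in idx], [labels[i] for i in idx]

    return pick(indices[:cut]), pick(indices[cut:])


def accuracy(predict, samples, labels):
    """Fraction of samples whose predicted fault code equals the true one."""
    correct = sum(1 for x, y in zip(samples, labels) if predict(x) == y)
    return correct / len(labels)


def accept_model(predict, test_samples, test_labels, required_accuracy):
    """True when the model reaches the set fault diagnosis accuracy on the test set."""
    return accuracy(predict, test_samples, test_labels) >= required_accuracy
```

In use, `predict` would wrap the first random forest classification model; only a model for which `accept_model` returns true would be kept as the final model.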
2. The transformer fault diagnosis method based on principal component analysis and random forest fusion according to claim 1, wherein establishing the principal component analysis model, eliminating the correlation among the dimensional features with the principal component analysis model, reducing the first fault oil chromatographic data set through parameter tuning to the remaining 8 principal features, and taking the reduced first fault oil chromatographic data set as the second fault oil chromatographic data set, which contains 99% of the information content of the first fault oil chromatographic data set, comprises the following steps:
inputting a sample set $X = \{x_1, x_2, \ldots, x_m\}$ in an n-dimensional space, wherein $x_i \in \mathbb{R}^n$, to be mapped to a k-dimensional space;
preprocessing so that the sample variance becomes 1, by the formula:
$$x_i = x_i / \sigma$$
calculating the covariance matrix $XX^T$ and performing eigendecomposition on the covariance matrix $XX^T$;
obtaining the k largest eigenvalues and the k eigenvectors corresponding to them, denoted $\omega_1, \omega_2, \ldots, \omega_k$;
outputting the projection matrix $W = (\omega_1, \omega_2, \ldots, \omega_k)$, where $\omega_k \in \mathbb{R}^n$; in the parameter tuning process, the number of dimensions equals the number of principal components, and the smallest k that retains 99% of the variance of the original data is selected as the number of principal components, according to the formula:
$$\frac{\frac{1}{m}\sum_{i=1}^{m}\left\|X^{(i)} - X_{\text{approx}}^{(i)}\right\|^{2}}{\frac{1}{m}\sum_{i=1}^{m}\left\|X^{(i)}\right\|^{2}} \le 0.01$$
wherein m represents the number of features; $X^{(i)}$ represents the initial matrix; $X_{\text{approx}}^{(i)}$ represents the matrix after dimensionality reduction to k dimensions; the numerator represents the sum of the distances between the original points and the projected points; the smaller the error, the more completely the dimension-reduced data represents the data before dimensionality reduction; and if the error is less than 0.01, the dimension-reduced data retains 99% of the information.
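The choice of the smallest k in claim 2 can be sketched from the eigenvalues alone. For mean-centered data, the relative reconstruction error after keeping the top k principal components equals one minus the share of the total eigenvalue sum held by the k largest eigenvalues, so the 99% criterion reduces to a cumulative sum. The function name `smallest_k` and the assumption that the eigenvalues of the covariance matrix are already available are illustrative only.

```python
def smallest_k(eigenvalues, retained=0.99):
    """Smallest number of principal components that keeps `retained` of the variance.

    `eigenvalues` are the eigenvalues of the covariance matrix of
    mean-centered data (assumed to be computed beforehand).
    """
    values = sorted(eigenvalues, reverse=True)
    total = sum(values)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    kept = 0.0
    for k, value in enumerate(values, start=1):
        kept += value
        # Relative error is 1 - kept/total; stop once it is at most 1 - retained.
        if kept / total >= retained - 1e-12:
            return k
    return len(values)
```

For the data set in this application, such a selection would be expected to return 8 for the 36 dimensional features, provided the eigenvalue spectrum matches what the application reports.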
3. The transformer fault diagnosis method based on principal component analysis and random forest fusion as claimed in claim 1, wherein establishing the parameter-tuned random forest classification model comprises:
the impurity measure is set as: criterion = 'mse'; whether samples are drawn with replacement is set as: …; the number of features considered when splitting is limited to: max_features = 'sqrt'; the maximum depth of the tree is set as: …; the minimum number of samples required to split a node on an attribute is set as: min_samples_split = 5; the number of decision trees is set as: n_estimators = 1000; and the minimum number of samples per leaf node is set as: min_samples_leaf = 4.
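The parameter names in claim 3 match the keyword arguments of scikit-learn's `RandomForestClassifier`, so the settings can be collected as a configuration sketch. Two points are assumptions rather than statements of the application: scikit-learn's classifier accepts impurity criteria such as 'gini' or 'entropy' and not 'mse', so 'gini' is substituted here; and the sampling and maximum-depth settings have no value in the claim text, so they are left at the library defaults. The names `RF_PARAMS` and `build_classifier` are hypothetical.

```python
# Hyperparameters from claim 3, keyed by scikit-learn keyword argument names.
RF_PARAMS = {
    # Assumption: the claim lists 'mse', which scikit-learn's classifier
    # does not accept; 'gini' is swapped in as a classification criterion.
    "criterion": "gini",
    "max_features": "sqrt",   # features considered at each split
    "min_samples_split": 5,   # minimum samples needed to split a node
    "min_samples_leaf": 4,    # minimum samples in each leaf node
    "n_estimators": 1000,     # number of decision trees
}


def build_classifier(params=RF_PARAMS):
    """Create the classifier; scikit-learn is imported only when this is called."""
    from sklearn.ensemble import RandomForestClassifier
    return RandomForestClassifier(**params)
```

Keeping the settings in a dictionary makes it straightforward to record which values were tuned and to add the sampling and depth settings once their values are known.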
4. The transformer fault diagnosis method based on principal component analysis and random forest fusion according to claim 1, wherein integrating the fault oil chromatographic data into a fault oil chromatographic data set, performing ratio dimension raising on the fault oil chromatographic data set, and taking the data set after ratio dimension raising as the first fault oil chromatographic data set comprises:
the fault oil chromatographic data set comprises the contents of hydrogen, methane, ethane, ethylene, acetylene, carbon monoxide and carbon dioxide, and the total hydrocarbon content obtained by summing all hydrocarbons;
the first fault oil chromatographic data set comprises the contents of hydrogen, methane, ethane, ethylene, acetylene, carbon monoxide, carbon dioxide and total hydrocarbon; the ratios of hydrogen to each of methane, ethane, ethylene, acetylene, carbon monoxide, carbon dioxide and total hydrocarbon; the ratios of methane to each of ethane, ethylene, acetylene, carbon monoxide, carbon dioxide and total hydrocarbon; the ratios of ethane to each of ethylene, acetylene, carbon monoxide, carbon dioxide and total hydrocarbon; the ratios of ethylene to each of acetylene, carbon monoxide, carbon dioxide and total hydrocarbon; the ratios of acetylene to each of carbon monoxide, carbon dioxide and total hydrocarbon; the ratios of carbon monoxide to each of carbon dioxide and total hydrocarbon; and the ratio of carbon dioxide to total hydrocarbon.
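The ratio dimension raising in claim 4 amounts to keeping the 8 measured contents and adding one ratio for every unordered pair of them, which gives 8 + 28 = 36 features. A minimal sketch follows; the short gas keys, the function name `ratio_features`, and the small `eps` added to the denominator to avoid division by zero are assumptions for illustration.

```python
from itertools import combinations

# The 8 measured quantities, in the order listed in the claim.
GASES = ["H2", "CH4", "C2H6", "C2H4", "C2H2", "CO", "CO2", "total_hydrocarbon"]


def ratio_features(sample, eps=1e-9):
    """Expand one sample (dict of 8 contents) into 36 features.

    The first 8 features are the raw contents; the remaining 28 are the
    ratios of each quantity to every quantity listed after it.
    """
    features = {gas: sample[gas] for gas in GASES}
    for numerator, denominator in combinations(GASES, 2):
        # eps is an assumption that guards against a zero content.
        features[f"{numerator}/{denominator}"] = sample[numerator] / (sample[denominator] + eps)
    return features
```

Applying `ratio_features` to every record of the fault oil chromatographic data set yields rows with the 36 dimensional features that the correlation analysis and principal component analysis operate on.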
5. The transformer fault diagnosis method based on principal component analysis and random forest fusion as claimed in claim 1, wherein inputting newly detected transformer oil chromatographic data into the final random forest classification model for fault diagnosis to obtain a diagnosis result of whether the transformer has a fault comprises:
when the diagnosis result is that the transformer has a fault, obtaining the type and the position of the fault.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111298905.8A CN114019282A (en) | 2021-11-04 | 2021-11-04 | Transformer fault diagnosis method based on principal component analysis and random forest phase fusion |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114019282A true CN114019282A (en) | 2022-02-08 |
Family
ID=80061050
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||