CN114019282A - Transformer fault diagnosis method based on principal component analysis and random forest phase fusion - Google Patents

Info

Publication number
CN114019282A
Authority
CN
China
Prior art keywords
fault
random forest
chromatographic data
data set
oil chromatographic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111298905.8A
Other languages
Chinese (zh)
Inventor
陈龙谭
于虹
李�昊
王宣军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electric Power Research Institute of Yunnan Power Grid Co Ltd
Original Assignee
Electric Power Research Institute of Yunnan Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electric Power Research Institute of Yunnan Power Grid Co Ltd filed Critical Electric Power Research Institute of Yunnan Power Grid Co Ltd
Priority to CN202111298905.8A priority Critical patent/CN114019282A/en
Publication of CN114019282A publication Critical patent/CN114019282A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01RMEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
    • G01R31/00Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02Column chromatography

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Housings And Mounting Of Transformers (AREA)

Abstract

A transformer fault diagnosis method based on the fusion of principal component analysis and random forest. The fault oil chromatographic data set of transformers with known fault types is raised in dimension by forming ratios, giving a first fault oil chromatographic data set, and the fault types are encoded. A principal component analysis model eliminates the correlation among the dimensional features and, through parameter tuning, reduces the first fault oil chromatographic data set to 8 principal features, which form a second fault oil chromatographic data set. The second data set is divided into a training set and a test set at a ratio of 0.8:0.2. A random forest classification model with tuned parameters is trained on the training set, giving the first random forest classification model, and its fault diagnosis accuracy is checked on the test set. A first random forest classification model whose fault diagnosis accuracy is not less than the set fault diagnosis accuracy becomes the final random forest classification model, into which transformer oil chromatographic data are input to obtain a diagnosis result. Combining ratio dimension raising with the principal component analysis model fully mines the information content of the fault oil chromatographic data set and improves the accuracy of fault diagnosis.

Description

Transformer fault diagnosis method based on principal component analysis and random forest phase fusion
Technical Field
The application relates to the field of transformer fault diagnosis, in particular to a transformer fault diagnosis method based on principal component analysis and random forest fusion.
Background
A power transformer is a static electrical device that converts an AC voltage of one value into one or more different voltages of the same frequency. It is one of the important electrical devices in a power system: power plants and substations use power transformers to step the voltage up or down to the level required by the supply area and to deliver power to each location. A power transformer failure reduces power supply reliability, so the transformer must be repaired promptly after it fails.
To enable timely repair after a power transformer fault, the prior art applies dissolved gas analysis to the transformer oil and judges the fault type from the gas components and contents using the three-ratio method. However, the three-ratio method extracts effective information from a fault data set inefficiently, so the accuracy of the fault type judgment is low.
Disclosure of Invention
The application provides a transformer fault diagnosis method based on principal component analysis and random forest fusion, and aims to solve the technical problem that the accuracy rate of judging the fault type of a power transformer is low.
In order to solve the technical problem, the embodiment of the application discloses the following technical scheme:
In a first aspect, an embodiment of the application discloses a transformer fault diagnosis method based on principal component analysis and random forest fusion, which comprises: detecting the oil chromatograms of transformers with determined fault types, extracting the fault oil chromatographic data of the detected faulty transformers, and integrating the fault oil chromatographic data into a fault oil chromatographic data set;
carrying out ratio dimension increasing on the fault oil chromatographic data set, taking the fault oil chromatographic data set subjected to ratio dimension increasing as a first fault oil chromatographic data set, and carrying out fault coding;
judging the correlation among the dimensional features according to a correlation heat map of the 36 dimensional features in the first fault oil chromatographic data set;
establishing a principal component analysis model, eliminating the correlation among the dimensional features with the principal component analysis model, and tuning the parameters so that the first fault oil chromatographic data set is reduced to 8 principal features, the reduced data set serving as a second fault oil chromatographic data set that contains 99% of the information content of the first fault oil chromatographic data set; and dividing the second fault oil chromatographic data set into a training set and a test set at a ratio of 0.8:0.2;
establishing a random forest classification model with tuned parameters, training the random forest classification model on the training set, taking the trained model as a first random forest classification model, checking the fault diagnosis accuracy of the first random forest classification model on the test set, and taking the first random forest classification model as the final random forest classification model when its fault diagnosis accuracy is greater than or equal to the set fault diagnosis accuracy;
and inputting newly detected transformer oil chromatographic data into a final random forest classification model for fault diagnosis to obtain a diagnosis result of whether the transformer has faults or not.
Optionally, establishing the principal component analysis model, eliminating the correlation among the dimensional features with it, and tuning the parameters so that the first fault oil chromatographic data set is reduced to 8 principal features forming the second fault oil chromatographic data set, which contains 99% of the information content of the first fault oil chromatographic data set, includes:
inputting a sample set $X = \{x_1, x_2, \dots, x_m\}$ in n-dimensional space, where $x_i \in \mathbb{R}^n$, to be mapped to k-dimensional space;
preprocessing so that the sample mean becomes 0 (the source gives this formula as image BDA0003337486830000021; it is written here in its standard form):
$$x_i \leftarrow x_i - \frac{1}{m}\sum_{j=1}^{m} x_j$$
preprocessing so that the sample variance becomes 1 (image BDA0003337486830000022, standard form), applied to each feature $j$:
$$\sigma_j^2 = \frac{1}{m}\sum_{i=1}^{m}\left(x_i^{(j)}\right)^2, \qquad x_i^{(j)} \leftarrow \frac{x_i^{(j)}}{\sigma_j}$$
calculating the covariance matrix $XX^T$ and performing eigendecomposition on it;
taking the k largest eigenvalues and the k eigenvectors corresponding to them, denoted $\omega_1, \omega_2, \dots, \omega_k$, and outputting the projection matrix $W = (\omega_1, \omega_2, \dots, \omega_k)$, where $\omega_k \in \mathbb{R}^n$. In the parameter tuning, the number of dimensions equals the number of principal components, and the smallest k that retains 99% of the variance of the original data is chosen as the number of principal components (image BDA0003337486830000023, standard form):
$$\frac{\frac{1}{m}\sum_{i=1}^{m}\left\lVert X^{(i)} - X^{(i)}_{\mathrm{approx}}\right\rVert^{2}}{\frac{1}{m}\sum_{i=1}^{m}\left\lVert X^{(i)}\right\rVert^{2}} \le 0.01$$
where m is the number of samples in the sample set; $X^{(i)}$ is the original data; and $X^{(i)}_{\mathrm{approx}}$ (shown as image BDA0003337486830000024 in the source) is the data after reduction to k dimensions. The numerator is the sum of squared distances between the original points and their projections. The smaller the error, the more completely the reduced data represent the original data; an error below 0.01 means that the reduced data retain 99% of the information.
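The two preprocessing steps above (mean to 0, variance to 1) can be illustrated with a minimal sketch in plain Python. The function name `standardize` and the toy values are assumptions made for illustration; they do not come from the application.

```python
import math

def standardize(samples):
    """Center each feature to mean 0 and scale it to variance 1.

    `samples` is a list of m rows, each holding n feature values.
    """
    m = len(samples)
    n = len(samples[0])
    # Per-feature mean over the m samples.
    means = [sum(row[j] for row in samples) / m for j in range(n)]
    centered = [[row[j] - means[j] for j in range(n)] for row in samples]
    # Per-feature standard deviation of the centered data (population form, 1/m).
    stds = [math.sqrt(sum(row[j] ** 2 for row in centered) / m) for j in range(n)]
    # Constant features have zero spread; leave them at 0 instead of dividing by 0.
    return [
        [row[j] / stds[j] if stds[j] > 0 else 0.0 for j in range(n)]
        for row in centered
    ]

# Toy values only, not measured oil chromatographic data.
toy = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
scaled = standardize(toy)
```

After this step every column of `scaled` has mean 0 and variance 1, which is the precondition the covariance and eigendecomposition steps rely on.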
Optionally, establishing the random forest classification model with tuned parameters includes the following settings:
the impurity measure: criterion = 'mse';
whether samples are drawn with replacement: (value missing in the source text);
the number of features considered when limiting branching: max_features = 'sqrt';
the maximum depth of the tree: (value missing in the source text);
the minimum number of samples required to split a node by attribute: min_samples_split = 5;
the number of decision trees: n_estimators = 1000;
the minimum number of samples in a leaf node: min_samples_leaf = 4.
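The settings listed above can be collected in a plain dictionary, shown here as a hedged sketch. The names `rf_params` and `classifier_params` are illustrative. Note one assumption that goes beyond the text: in scikit-learn, `RandomForestClassifier` accepts "gini", "entropy" or "log_loss" as `criterion`, while "mse" is a regression criterion, so the second dictionary substitutes "gini" purely as an example.

```python
# Settings whose values survive in the text above.
rf_params = {
    "criterion": "mse",        # as written in the source text
    "max_features": "sqrt",
    "min_samples_split": 5,
    "n_estimators": 1000,
    "min_samples_leaf": 4,
}
# The sampling-with-replacement flag and the maximum depth are named in the
# text, but their values are not recoverable, so they are left out here.

# Assumption: a classifier needs a classification criterion; "gini" is a
# substitute chosen for this sketch, not a value taken from the application.
classifier_params = {**rf_params, "criterion": "gini"}
```

Such a dictionary could be unpacked into a constructor call, for example `RandomForestClassifier(**classifier_params)`, if scikit-learn is the library in use.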
Optionally, the extracted fault oil chromatographic data is subjected to ratio dimension increasing, the fault oil chromatographic data subjected to ratio dimension increasing is used as first fault oil chromatographic data, and the fault oil chromatographic data is integrated into a fault oil chromatographic data set, including:
the fault oil chromatographic data set comprises the contents of hydrogen, methane, ethane, ethylene, acetylene, carbon monoxide and carbon dioxide, and the total hydrocarbon content obtained by summing the hydrocarbons;
the first fault oil chromatographic data set comprises the contents of hydrogen, methane, ethane, ethylene, acetylene, carbon monoxide, carbon dioxide and total hydrocarbons, together with the following content ratios: hydrogen to each of methane, ethane, ethylene, acetylene, carbon monoxide, carbon dioxide and total hydrocarbons; methane to each of ethane, ethylene, acetylene, carbon monoxide, carbon dioxide and total hydrocarbons; ethane to each of ethylene, acetylene, carbon monoxide, carbon dioxide and total hydrocarbons; ethylene to each of acetylene, carbon monoxide, carbon dioxide and total hydrocarbons; acetylene to each of carbon monoxide, carbon dioxide and total hydrocarbons; carbon monoxide to each of carbon dioxide and total hydrocarbons; and carbon dioxide to total hydrocarbons.
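A quick count shows that this construction yields the 36 dimensional features referred to in the method: 8 contents plus one ratio for each unordered pair of the 8 quantities.

```latex
8 + \binom{8}{2} = 8 + \frac{8 \cdot 7}{2} = 8 + 28 = 36
```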
Optionally, inputting newly detected transformer oil chromatographic data into the final random forest classification model for fault diagnosis, and obtaining a diagnosis result of whether the transformer has a fault, including:
when the diagnosis result is that the transformer has a fault, the type and the position of the fault can be obtained.
The beneficial effects of the present application are as follows:
The transformer fault diagnosis method based on principal component analysis and random forest fusion provided by the embodiments of the application comprises: detecting the oil chromatograms of transformers with determined fault types, extracting the fault oil chromatographic data of the detected faulty transformers, and integrating the data into a fault oil chromatographic data set; performing ratio dimension raising on the fault oil chromatographic data set, taking the result as a first fault oil chromatographic data set, and performing fault coding; judging the correlation among the dimensional features according to a correlation heat map of the 36 dimensional features in the first fault oil chromatographic data set; establishing a principal component analysis model, eliminating the correlation among the dimensional features with it, and tuning the parameters so that the first data set is reduced to 8 principal features, which form a second fault oil chromatographic data set containing 99% of the information content of the first; dividing the second data set into a training set and a test set at a ratio of 0.8:0.2; establishing a random forest classification model with tuned parameters, training it on the training set to obtain a first random forest classification model, and checking its fault diagnosis accuracy on the test set; taking the first random forest classification model as the final random forest classification model when its fault diagnosis accuracy is greater than or equal to the set fault diagnosis accuracy; and inputting newly detected transformer oil chromatographic data into the final random forest classification model for fault diagnosis to obtain a diagnosis result of whether the transformer has a fault.
Ratio dimension raising increases the dimensionality of the fault oil chromatographic data, the principal component analysis model eliminates the correlation among the dimensional features and reduces the first fault oil chromatographic data set to 8 principal features, and the combination of the two fully mines the information content of the fault oil chromatographic data set. Training the random forest classification model on the training set of the second fault oil chromatographic data set, whose information content has been fully mined, improves the accuracy with which the model judges whether the power transformer has a fault and what type of fault it is, and thereby improves the accuracy of the transformer fault diagnosis method as a whole.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
In order to more clearly explain the technical solution of the present application, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious to those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic flowchart of a transformer fault diagnosis method based on principal component analysis and random forest fusion according to an embodiment of the present application;
fig. 2 is a schematic process diagram of a transformer fault diagnosis method based on principal component analysis and random forest fusion according to an embodiment of the present application.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to Fig. 1, an embodiment of the present application provides a transformer fault diagnosis method based on principal component analysis and random forest fusion, which addresses the low accuracy of judging the fault type of a power transformer and includes steps S110 to S160.
S110: Detect the oil chromatograms of transformers with determined fault types, extract the fault oil chromatographic data of the detected faulty transformers, and integrate the data into a fault oil chromatographic data set.
S120: Perform ratio dimension raising on the fault oil chromatographic data set, take the result as a first fault oil chromatographic data set, and perform fault coding.
In some embodiments, as shown in Fig. 2, in an actual implementation, performing singular value decomposition on the covariance matrix during principal component analysis yields the matrix S. The expression for the principal component analysis error is then equivalent to the following (the source gives it as image BDA0003337486830000041; it is written here in its standard form):
$$1 - \frac{\sum_{i=1}^{k} S_{ii}}{\sum_{i=1}^{n} S_{ii}} \le 0.01$$
where $S_{ii}$ are the diagonal entries of S, the matrix of eigenvalues.
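The equivalence follows from a standard property of principal component analysis, sketched here under the assumption that the data are mean-centered and that the diagonal entries $S_{jj}$ of S are the eigenvalues of the covariance matrix in descending order. The mean squared projection error equals the sum of the discarded eigenvalues, and the total variation equals the sum of all eigenvalues:

```latex
\frac{\frac{1}{m}\sum_{i=1}^{m}\lVert X^{(i)} - X^{(i)}_{\mathrm{approx}}\rVert^{2}}
     {\frac{1}{m}\sum_{i=1}^{m}\lVert X^{(i)}\rVert^{2}}
= \frac{\sum_{j=k+1}^{n} S_{jj}}{\sum_{j=1}^{n} S_{jj}}
= 1 - \frac{\sum_{j=1}^{k} S_{jj}}{\sum_{j=1}^{n} S_{jj}}
\le 0.01
```

The practical benefit is that k can be chosen from the eigenvalues alone, without projecting and reconstructing the data for every candidate k.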
In some embodiments, as shown in fig. 2, performing ratio dimension increasing on the extracted fault oil chromatographic data, taking the fault oil chromatographic data after the ratio dimension increasing as first fault oil chromatographic data, and integrating the fault oil chromatographic data into a fault oil chromatographic data set, includes:
the fault oil chromatographic data set comprises the contents of hydrogen, methane, ethane, ethylene, acetylene, carbon monoxide and carbon dioxide, and the total hydrocarbon content obtained by summing the hydrocarbons;
the first fault oil chromatographic data set comprises the contents of hydrogen, methane, ethane, ethylene, acetylene, carbon monoxide, carbon dioxide and total hydrocarbons (TH), together with the following content ratios: hydrogen to each of methane, ethane, ethylene, acetylene, carbon monoxide, carbon dioxide and total hydrocarbons; methane to each of ethane, ethylene, acetylene, carbon monoxide, carbon dioxide and total hydrocarbons; ethane to each of ethylene, acetylene, carbon monoxide, carbon dioxide and total hydrocarbons; ethylene to each of acetylene, carbon monoxide, carbon dioxide and total hydrocarbons; acetylene to each of carbon monoxide, carbon dioxide and total hydrocarbons; carbon monoxide to each of carbon dioxide and total hydrocarbons; and carbon dioxide to total hydrocarbons.
The specific ratios in the first fault oil chromatographic data set are shown in Table 1:
TABLE 1
Figure BDA0003337486830000042
where $\mathrm{TH} = \mathrm{CH_4} + \mathrm{C_2H_6} + \mathrm{C_2H_4} + \mathrm{C_2H_2}$.
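The ratio dimension raising described above can be sketched in plain Python as follows. The function name `raise_dimension`, the short gas labels, the `eps` guard and the toy contents are all assumptions for illustration; in particular, the source does not say how a zero content in a denominator is handled.

```python
from itertools import combinations

# Gas order follows the listing in the text; TH is the total hydrocarbon content.
GASES = ["H2", "CH4", "C2H6", "C2H4", "C2H2", "CO", "CO2", "TH"]

def raise_dimension(sample, eps=1e-9):
    """Return the 8 contents plus the 28 pairwise ratios (36 features).

    `sample` maps the seven gas names to contents; TH is computed from the
    four hydrocarbons. `eps` is an assumed guard against division by zero
    (for example when no acetylene is detected).
    """
    values = dict(sample)
    values["TH"] = sum(values[g] for g in ("CH4", "C2H6", "C2H4", "C2H2"))
    features = {g: values[g] for g in GASES}
    # Each earlier gas is divided by each later one: H2/CH4, ..., CO2/TH.
    for a, b in combinations(GASES, 2):
        features[f"{a}/{b}"] = values[a] / (values[b] + eps)
    return features

# Toy contents, not measured data.
toy = {"H2": 50.0, "CH4": 20.0, "C2H6": 5.0, "C2H4": 10.0,
       "C2H2": 0.0, "CO": 300.0, "CO2": 2000.0}
features = raise_dimension(toy)
```

Keeping the features in a dictionary keyed by name makes it easy to check that exactly 36 columns are produced before they are passed on to the correlation analysis.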
In some embodiments, as shown in Fig. 2, because the random forest classification model is tree-based, it does not rely on vector-space distance measures when processing variables; a numerical value serves only as a category symbol with no implied order, so label encoding is a reasonable choice. In the present application, "high temperature overheat", "medium temperature overheat", "low temperature overheat", "partial discharge", "low energy discharge", and "normal" among the fault types are encoded as "1", "2", "3", "4", "5", "6", and "7", respectively, using label encoding (LabelEncode).
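A minimal sketch of such a label encoding in plain Python is given below. The names `FAULT_TYPES`, `ENCODE` and `DECODE` are illustrative. The text quotes seven codes but names only six fault types, so the sketch encodes just the six names that appear and makes no guess about the remaining one.

```python
# The six fault-type names that appear in the text, in the order listed there.
FAULT_TYPES = [
    "high temperature overheat",
    "medium temperature overheat",
    "low temperature overheat",
    "partial discharge",
    "low energy discharge",
    "normal",
]

# Codes start at 1 to follow the text. With only six names available, the
# codes assigned here are an assumption and may differ from the application's.
ENCODE = {name: code for code, name in enumerate(FAULT_TYPES, start=1)}
DECODE = {code: name for name, code in ENCODE.items()}

labels = ["normal", "partial discharge", "normal"]
encoded = [ENCODE[name] for name in labels]
```

The inverse mapping `DECODE` is what turns a predicted code back into a readable fault type at diagnosis time.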
S130: Judge the correlation among the dimensional features according to a correlation heat map of the 36 dimensional features in the first fault oil chromatographic data set.
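The values behind such a heat map are pairwise correlation coefficients. The sketch below computes them in plain Python, assuming the Pearson coefficient is meant (the text does not name the coefficient); the function names and toy columns are illustrative.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    # A constant column has no defined correlation; report 0.0 in that case.
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def correlation_matrix(columns):
    """Pairwise correlations between feature columns (the heat map values)."""
    return [[pearson(a, b) for b in columns] for a in columns]

# Toy columns: the second is an exact multiple of the first, the third is not.
cols = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [4.0, 1.0, 3.0, 2.0]]
corr = correlation_matrix(cols)
```

Entries close to 1 or -1 off the diagonal indicate strongly correlated features, which is the redundancy that the principal component analysis step is meant to remove.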
S140: Establish a principal component analysis model, eliminate the correlation among the dimensional features with the principal component analysis model, and tune the parameters so that the first fault oil chromatographic data set is reduced to 8 principal features, the reduced data set serving as a second fault oil chromatographic data set that contains 99% of the information content of the first fault oil chromatographic data set. Divide the second fault oil chromatographic data set into a training set and a test set at a ratio of 0.8:0.2.
In some embodiments, as shown in Fig. 2, establishing the principal component analysis model, eliminating the correlation among the dimensional features with it, and tuning the parameters so that the first fault oil chromatographic data set is reduced to 8 principal features forming the second fault oil chromatographic data set, which contains 99% of the information content of the first fault oil chromatographic data set, includes:
inputting a sample set $X = \{x_1, x_2, \dots, x_m\}$ in n-dimensional space, where $x_i \in \mathbb{R}^n$, to be mapped to k-dimensional space;
preprocessing so that the sample mean becomes 0 (the source gives this formula as image BDA0003337486830000051; it is written here in its standard form):
$$x_i \leftarrow x_i - \frac{1}{m}\sum_{j=1}^{m} x_j$$
preprocessing so that the sample variance becomes 1 (image BDA0003337486830000052, standard form), applied to each feature $j$:
$$\sigma_j^2 = \frac{1}{m}\sum_{i=1}^{m}\left(x_i^{(j)}\right)^2, \qquad x_i^{(j)} \leftarrow \frac{x_i^{(j)}}{\sigma_j}$$
calculating the covariance matrix $XX^T$ and performing eigendecomposition on it;
taking the k largest eigenvalues and the k eigenvectors corresponding to them, denoted $\omega_1, \omega_2, \dots, \omega_k$, and outputting the projection matrix $W = (\omega_1, \omega_2, \dots, \omega_k)$, where $\omega_k \in \mathbb{R}^n$. In the parameter tuning, the number of dimensions equals the number of principal components, and the smallest k that retains 99% of the variance of the original data is chosen as the number of principal components (image BDA0003337486830000053, standard form):
$$\frac{\frac{1}{m}\sum_{i=1}^{m}\left\lVert X^{(i)} - X^{(i)}_{\mathrm{approx}}\right\rVert^{2}}{\frac{1}{m}\sum_{i=1}^{m}\left\lVert X^{(i)}\right\rVert^{2}} \le 0.01$$
where m is the number of samples in the sample set; $X^{(i)}$ is the original data; and $X^{(i)}_{\mathrm{approx}}$ (shown as image BDA0003337486830000054 in the source) is the data after reduction to k dimensions. The numerator is the sum of squared distances between the original points and their projections. The smaller the error, the more completely the reduced data represent the original data; an error below 0.01 means that the reduced data retain 99% of the information.
In some embodiments, one of the most critical parameters of the principal component analysis model is n_components. If it is set to an integer, the data are reduced to that number of principal components; if it is set to a decimal fraction, it specifies the proportion of information that the reduced data must retain. Here the parameter is set to n_components = 8, i.e. the data are reduced to 8 principal components.
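The integer and decimal readings of n_components can be illustrated with a small helper in plain Python. `resolve_n_components` is a hypothetical function written for this sketch, not a library call, and the eigenvalues are toy numbers rather than values from the application.

```python
def resolve_n_components(eigenvalues, n_components):
    """Return how many principal components to keep.

    An int is taken as the component count itself. A float in (0, 1) is
    taken as the share of total variance to retain, and the smallest
    sufficient count is returned.
    """
    ordered = sorted(eigenvalues, reverse=True)
    if isinstance(n_components, int):
        return min(n_components, len(ordered))
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= n_components:
            return k
    return len(ordered)

# Toy eigenvalues, chosen only to show the two modes.
toy_eigenvalues = [6.0, 2.5, 1.0, 0.45, 0.05]
k_fixed = resolve_n_components(toy_eigenvalues, 3)
k_ratio = resolve_n_components(toy_eigenvalues, 0.99)
```

With these toy eigenvalues the first three components hold 95% of the variance, so the 0.99 setting needs a fourth component while the integer setting simply keeps three.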
This avoids losing too much of the information content when the principal component analysis model eliminates the correlation among the dimensional features. The information content of the second fault oil chromatographic data set is thereby preserved, which improves the accuracy with which the random forest classification model judges whether the power transformer has a fault and what type of fault it is, and in turn improves the accuracy of the transformer fault diagnosis method based on principal component analysis and random forest fusion.
In some embodiments, the data set is divided into a training set and a test set using the train_test_split() function, with the test set size set to test_size = 0.2 and the random seed set to random_state = 1, so that the division is identical on every run and the results are reproducible.
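The effect of a fixed seed can be shown with a plain-Python stand-in for the split. `split_train_test` is a hypothetical helper written for this sketch; it illustrates the 0.8:0.2 division and its repeatability, and it will not reproduce the exact shuffling order of train_test_split().

```python
import random

def split_train_test(rows, test_size=0.2, seed=1):
    """Shuffle with a fixed seed, then take the first share as the test set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]

# Toy row identifiers standing in for the rows of the second data set.
rows = list(range(100))
train, test = split_train_test(rows)
train_again, test_again = split_train_test(rows)
```

Because the generator is seeded inside the function, two calls with the same arguments return the same partition, which is what makes a reported accuracy repeatable.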
S150: Establish a random forest classification model with tuned parameters, train it on the training set, and take the trained model as a first random forest classification model. Check the fault diagnosis accuracy of the first random forest classification model on the test set, and take it as the final random forest classification model when its fault diagnosis accuracy is greater than or equal to the set fault diagnosis accuracy.
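The acceptance check in this step amounts to comparing test-set accuracy with a threshold. A minimal sketch in plain Python follows; the function names, the toy fault codes and the 0.9 threshold are assumptions for illustration, since the application does not state the set fault diagnosis accuracy.

```python
def accuracy(predicted, actual):
    """Share of test samples whose predicted fault code matches the true code."""
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)

def accept_model(predicted, actual, required_accuracy):
    """True when the first model may be promoted to the final model."""
    return accuracy(predicted, actual) >= required_accuracy

# Toy fault codes; the 0.9 threshold is an assumed example value.
actual = [1, 2, 3, 4, 5, 6, 6, 6, 2, 1]
predicted = [1, 2, 3, 4, 5, 6, 6, 6, 2, 3]
is_final = accept_model(predicted, actual, required_accuracy=0.9)
```

If the check fails, the natural follow-up would be to retune and retrain before checking again, although the text does not spell out that branch.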
In some embodiments, as shown in Fig. 2, establishing the random forest classification model with tuned parameters includes the following settings:
the impurity measure: criterion = 'mse';
whether samples are drawn with replacement: (value missing in the source text);
the number of features considered when limiting branching: max_features = 'sqrt';
the maximum depth of the tree: (value missing in the source text);
the minimum number of samples required to split a node by attribute: min_samples_split = 5;
the number of decision trees: n_estimators = 1000;
the minimum number of samples in a leaf node: min_samples_leaf = 4.
In some embodiments, the random forest classification model also needs two further parameters to be considered: the number $n_{tree}$ of decision trees to build, and the number k of input features considered when each node of a decision tree is split. Usually k can be taken as $\log_2 n$, where n is the number of features in the original data set. The construction of a single decision tree can be divided into the following steps:
assuming that the number of training samples is m, each decision tree receives m input samples, drawn at random from the training set with replacement;
assuming that the number of training sample features is n, k features are selected at random from the n features for each decision tree, and the best of these k input features is then chosen for splitting;
each tree keeps splitting until all training examples at a node belong to the same class. No pruning is required during decision tree splitting.
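The two sources of randomness in the steps above, sampling with replacement and random feature selection, can be sketched in plain Python. The function names, the fixed seed and the toy sizes are assumptions for illustration.

```python
import math
import random

def bootstrap_indices(m, rng):
    """Draw m sample indices from range(m) with replacement."""
    return [rng.randrange(m) for _ in range(m)]

def candidate_features(n, k, rng):
    """Pick k distinct feature indices out of n as split candidates."""
    return rng.sample(range(n), k)

rng = random.Random(0)   # fixed seed, assumed here for repeatability
m, n = 20, 8             # toy sizes: 20 samples, 8 principal features
k = int(math.log2(n))    # k = log2(n) as in the text, which gives 3 for n = 8

sample_ids = bootstrap_indices(m, rng)
feature_ids = candidate_features(n, k, rng)
```

Sampling with replacement means some training samples appear several times in a tree's input and others not at all, which is what makes the individual trees differ from one another.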
S160: Input newly detected transformer oil chromatographic data into the final random forest classification model for fault diagnosis to obtain a diagnosis result of whether the transformer has a fault.
In some embodiments, inputting newly detected transformer oil chromatographic data into a final random forest classification model for fault diagnosis to obtain a diagnosis result of whether a fault exists in the transformer, including:
when the diagnosis result is that the transformer has a fault, the type and the position of the fault can be obtained.
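The application does not describe how the trees of the forest are combined into one diagnosis. A common choice for random forest classification is a majority vote over the per-tree predictions, sketched below in plain Python as an assumption; the function name and the vote values are illustrative.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Return the fault code predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]

# Hypothetical outputs of five trees for one new oil chromatographic sample.
votes = [4, 4, 6, 4, 2]
diagnosis = majority_vote(votes)
```

The winning code would then be mapped back to its fault type name through the label encoding used during training.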
As can be seen from the foregoing embodiments, the transformer fault diagnosis method based on principal component analysis and random forest fusion provided by the embodiments of the application comprises: detecting the oil chromatograms of transformers with determined fault types, extracting the fault oil chromatographic data of the detected faulty transformers, and integrating the data into a fault oil chromatographic data set; performing ratio dimension raising on the fault oil chromatographic data set, taking the result as a first fault oil chromatographic data set, and performing fault coding; judging the correlation among the dimensional features according to a correlation heat map of the 36 dimensional features in the first fault oil chromatographic data set; establishing a principal component analysis model, eliminating the correlation among the dimensional features with it, and tuning the parameters so that the first data set is reduced to 8 principal features, which form a second fault oil chromatographic data set containing 99% of the information content of the first; dividing the second data set into a training set and a test set at a ratio of 0.8:0.2; establishing a random forest classification model with tuned parameters, training it on the training set to obtain a first random forest classification model, and checking its fault diagnosis accuracy on the test set; taking the first random forest classification model as the final random forest classification model when its fault diagnosis accuracy is greater than or equal to the set fault diagnosis accuracy; and inputting newly detected transformer oil chromatographic data into the final random forest classification model for fault diagnosis to obtain a diagnosis result of whether the transformer has a fault.
Ratio dimension raising increases the dimensionality of the fault oil chromatographic data, the principal component analysis model eliminates the correlation among the dimensional features and reduces the first fault oil chromatographic data set to 8 principal features, and the combination of the two fully mines the information content of the fault oil chromatographic data set. Training the random forest classification model on the training set of the second fault oil chromatographic data set, whose information content has been fully mined, improves the accuracy with which the model judges whether the power transformer has a fault and what type of fault it is, and thereby improves the accuracy of the transformer fault diagnosis method as a whole.
Since the above embodiments are described with reference to and in combination with other embodiments, different embodiments share common portions, and the same or similar portions of the embodiments in this specification may be cross-referenced. They are not described again in detail here.
It is noted that, in this specification, relational terms such as "first" and "second," and the like, are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a circuit structure, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such circuit structure, article, or apparatus. Without further limitation, the presence of an element identified by the phrase "comprising an … …" does not exclude the presence of other like elements in a circuit structure, article or device comprising the element.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
The above-described embodiments of the present application do not limit the scope of the present application.

Claims (5)

1. A transformer fault diagnosis method based on principal component analysis and random forest fusion is characterized by comprising the following steps:
detecting oil chromatography of the transformer with the definite fault type, extracting fault oil chromatography data of the detected fault transformer, and integrating the fault oil chromatography data into a fault oil chromatography data set;
carrying out ratio dimension increasing on the fault oil chromatographic data set, taking the fault oil chromatographic data set subjected to ratio dimension increasing as a first fault oil chromatographic data set, and carrying out fault coding;
determining the correlations among the dimensional features according to a correlation heat map of the 36 dimensional features in the first fault oil chromatographic data set;
establishing a principal component analysis model, eliminating the correlations among the dimensional features by means of the principal component analysis model, reducing the first fault oil chromatographic data set through parameter tuning to 8 principal features, and taking the reduced data set as a second fault oil chromatographic data set, wherein the second fault oil chromatographic data set contains 99% of the information in the first fault oil chromatographic data set, and dividing the second fault oil chromatographic data set into a training set and a test set at a ratio of 0.8:0.2;
establishing a random forest classification model with tuned parameters, training the random forest classification model on the training set, taking the trained random forest classification model as a first random forest classification model, evaluating the fault diagnosis accuracy of the first random forest classification model on the test set, and taking the first random forest classification model as the final random forest classification model when its fault diagnosis accuracy is greater than or equal to the set fault diagnosis accuracy;
and inputting newly detected transformer oil chromatographic data into the final random forest classification model for fault diagnosis to obtain a diagnosis result of whether the transformer has faults or not.
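The 0.8:0.2 split and the accuracy gate in claim 1 can be illustrated with a minimal sketch. The function names, the fixed shuffle seed, and the use of plain Python lists are assumptions made for illustration; the claim does not state how the split is drawn or how the accuracy threshold is chosen.

```python
import random


def split_train_test(samples, labels, test_ratio=0.2, seed=0):
    """Shuffle paired samples/labels and split them into training and test sets."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # hypothetical fixed seed for repeatability
    n_test = round(len(indices) * test_ratio)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = ([samples[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([samples[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test


def accuracy(predicted, actual):
    """Fraction of predicted fault codes that match the true fault codes."""
    if not actual:
        raise ValueError("cannot compute accuracy on an empty test set")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def accept_model(predicted, actual, required_accuracy):
    """True when the tested model reaches the set fault diagnosis accuracy."""
    return accuracy(predicted, actual) >= required_accuracy
```

A trained model would be promoted to the final model only when `accept_model` returns true for its predictions on the test set.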
2. The transformer fault diagnosis method based on principal component analysis and random forest fusion according to claim 1, wherein establishing the principal component analysis model, eliminating the correlations among the dimensional features by means of the principal component analysis model, reducing the first fault oil chromatographic data set through parameter tuning to 8 principal features, and taking the reduced data set as the second fault oil chromatographic data set, which contains 99% of the information in the first fault oil chromatographic data set, comprises:
inputting a sample set $X = \{x_1, x_2, \ldots, x_m\}$ of an n-dimensional space, where $x_i \in \mathbb{R}^n$, to be mapped to a k-dimensional space;
preprocessing so that the sample mean becomes 0, according to the formula:
$$x_i \leftarrow x_i - \frac{1}{m}\sum_{j=1}^{m} x_j$$
preprocessing so that the sample variance becomes 1, according to the formulas (applied separately to each dimension):
$$\sigma^2 = \frac{1}{m}\sum_{i=1}^{m} \left(x_i\right)^2$$
$$x_i \leftarrow \frac{x_i}{\sigma}$$
calculating the covariance matrix $XX^T$ and performing eigendecomposition on the covariance matrix $XX^T$;
taking the k largest eigenvalues and the k eigenvectors corresponding to them, denoted $\omega_1, \omega_2, \ldots, \omega_k$;
outputting the projection matrix $W = (\omega_1, \omega_2, \ldots, \omega_k)$, where $\omega_k \in \mathbb{R}^n$; in the parameter tuning, the number of dimensions equals the number of principal components, and the smallest k that retains 99% of the variance of the original data is selected as the number of principal components, according to the formula:
$$\frac{\frac{1}{m}\sum_{i=1}^{m}\left\| X^{(i)} - X_{\text{approx}}^{(i)} \right\|^2}{\frac{1}{m}\sum_{i=1}^{m}\left\| X^{(i)} \right\|^2} \le 0.01$$
wherein m represents the number of samples; $X^{(i)}$ represents the initial matrix; $X_{\text{approx}}^{(i)}$ represents the matrix after dimensionality reduction to k dimensions; the numerator represents the sum of the distances between the original points and their projections; the smaller the error, the more completely the data after dimensionality reduction represent the data before dimensionality reduction; and if the error is less than 0.01, the data after dimensionality reduction retain 99% of the information.
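The preprocessing and the choice of k in claim 2 can be sketched as follows. For centered data, the relative reconstruction error of a k-component projection equals one minus the share of the total variance carried by the k largest eigenvalues, so the smallest admissible k can be read from the sorted eigenvalues. The function names and the list-of-columns layout are assumptions for illustration, and the eigenvalues are taken as given rather than computed here.

```python
import math


def standardize(columns):
    """Scale each feature column to zero mean and unit variance."""
    result = []
    for col in columns:
        mean = sum(col) / len(col)
        centered = [v - mean for v in col]
        # Population variance (divide by m), matching the 1/m factor in the claim.
        std = math.sqrt(sum(v * v for v in centered) / len(col))
        # A constant column has zero variance; it is left centered at 0.
        result.append([v / std if std > 0 else 0.0 for v in centered])
    return result


def smallest_k(eigenvalues, retained=0.99):
    """Smallest k whose k largest eigenvalues keep `retained` of the total variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= retained:
            return k
    return len(ordered)
```

Under the claim's figures, `smallest_k` applied to the 36 eigenvalues of the covariance matrix would be expected to return 8, although the eigenvalues themselves are not given in the text.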
3. The transformer fault diagnosis method based on principal component analysis and random forest fusion as claimed in claim 1, wherein establishing the random forest classification model with tuned parameters comprises:
setting the impurity measure (criterion = 'mse'), whether samples are drawn with replacement, the number of features considered when limiting a split (max_features = 'sqrt'), the maximum depth of the tree, the minimum number of samples required to split a node on an attribute (min_samples_split = 5), the number of decision trees (n_estimators = 1000), and the minimum number of samples at a leaf node (min_samples_leaf = 4).
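The parameter names in claim 3 match those of scikit-learn's `RandomForestClassifier`, so a configuration sketch can assume that library, although the text does not name it. Two settings whose values are not given in the text (the sampling option and the maximum tree depth) are left at the library defaults. The classifier in scikit-learn accepts 'gini' or 'entropy' as the criterion rather than 'mse', which is a regression criterion, so 'gini' is substituted below as an explicit assumption.

```python
# Hyperparameters named in claim 3. Values the claim does not state
# (sampling option, maximum depth) are deliberately left at library defaults.
RF_PARAMS = {
    "criterion": "gini",  # assumption: replaces the claim's 'mse', a regression criterion
    "max_features": "sqrt",
    "min_samples_split": 5,
    "n_estimators": 1000,
    "min_samples_leaf": 4,
}


def build_random_forest(params=RF_PARAMS):
    """Create an untrained classifier; requires scikit-learn to be installed."""
    from sklearn.ensemble import RandomForestClassifier

    return RandomForestClassifier(**params)
```

The import is placed inside the function so that the parameter dictionary can be inspected without scikit-learn installed.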
4. The transformer fault diagnosis method based on principal component analysis and random forest fusion according to claim 1, wherein performing ratio dimension increasing on the fault oil chromatographic data set and taking the fault oil chromatographic data set after ratio dimension increasing as the first fault oil chromatographic data set comprises:
the fault oil chromatographic data set comprises the contents of hydrogen, methane, ethane, ethylene, acetylene, carbon monoxide, carbon dioxide, and total hydrocarbons, the total hydrocarbons being the sum of all the hydrocarbons;
the first fault oil chromatographic data set comprises the contents of hydrogen, methane, ethane, ethylene, acetylene, carbon monoxide, carbon dioxide, and total hydrocarbons, together with the pairwise ratios of those contents: the ratios of hydrogen to each of methane, ethane, ethylene, acetylene, carbon monoxide, carbon dioxide, and total hydrocarbons; the ratios of methane to each of ethane, ethylene, acetylene, carbon monoxide, carbon dioxide, and total hydrocarbons; the ratios of ethane to each of ethylene, acetylene, carbon monoxide, carbon dioxide, and total hydrocarbons; the ratios of ethylene to each of acetylene, carbon monoxide, carbon dioxide, and total hydrocarbons; the ratios of acetylene to each of carbon monoxide, carbon dioxide, and total hydrocarbons; the ratios of carbon monoxide to each of carbon dioxide and total hydrocarbons; and the ratio of carbon dioxide to total hydrocarbons.
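The ratio dimension increasing in claim 4 turns 8 measured contents into 8 + 28 = 36 features, which matches the 36 dimensional features named in claim 1. A minimal sketch follows. The dictionary keys, the function name, and the small `eps` guard against a zero denominator are assumptions for illustration; the claim does not say how zero readings are handled.

```python
from itertools import combinations

# Hypothetical feature keys for the eight contents listed in the claim.
GASES = ["H2", "CH4", "C2H6", "C2H4", "C2H2", "CO", "CO2", "total_hydrocarbons"]


def raise_dimension(sample, eps=1e-9):
    """Expand 8 gas contents into the 8 contents plus their 28 pairwise ratios."""
    features = {gas: sample[gas] for gas in GASES}
    # combinations() yields each unordered pair once, in the order of the claim:
    # hydrogen to every later gas, then methane to every later gas, and so on.
    for numerator, denominator in combinations(GASES, 2):
        # eps is a hypothetical guard so a zero reading does not raise an error.
        features[f"{numerator}/{denominator}"] = sample[numerator] / (sample[denominator] + eps)
    return features
```

Each pair appears once, so a ratio such as methane to hydrogen is not generated alongside hydrogen to methane.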
5. The transformer fault diagnosis method based on principal component analysis and random forest fusion as claimed in claim 1, wherein inputting newly detected transformer oil chromatographic data into the final random forest classification model for fault diagnosis to obtain a diagnosis result indicating whether the transformer has a fault comprises:
when the diagnosis result indicates that the transformer has a fault, obtaining the type and the location of the fault.
CN202111298905.8A 2021-11-04 2021-11-04 Transformer fault diagnosis method based on principal component analysis and random forest phase fusion Pending CN114019282A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111298905.8A CN114019282A (en) 2021-11-04 2021-11-04 Transformer fault diagnosis method based on principal component analysis and random forest phase fusion

Publications (1)

Publication Number Publication Date
CN114019282A true CN114019282A (en) 2022-02-08

Family

ID=80061050

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111298905.8A Pending CN114019282A (en) 2021-11-04 2021-11-04 Transformer fault diagnosis method based on principal component analysis and random forest phase fusion

Country Status (1)

Country Link
CN (1) CN114019282A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination