CN107817427A - Decision tree recognition methods based on sulfur hexafluoride gas shelf depreciation - Google Patents
- Publication number
- CN107817427A (application CN201711044243.5A)
- Authority
- CN
- China
- Prior art keywords
- sample
- attribute
- decision tree
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01R—MEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
- G01R31/00—Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere
- G01R31/12—Testing dielectric strength or breakdown voltage ; Testing or monitoring effectiveness or level of insulation, e.g. of a cable or of an apparatus, for example using partial discharge measurements; Electrostatic testing
- G01R31/1227—Testing dielectric strength or breakdown voltage ; Testing or monitoring effectiveness or level of insulation, e.g. of a cable or of an apparatus, for example using partial discharge measurements; Electrostatic testing of components, parts or materials
- G01R31/1254—Testing dielectric strength or breakdown voltage ; Testing or monitoring effectiveness or level of insulation, e.g. of a cable or of an apparatus, for example using partial discharge measurements; Electrostatic testing of components, parts or materials of gas-insulated power appliances or vacuum gaps
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/2155—Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Other Investigation Or Analysis Of Materials By Electrical Means (AREA)
Abstract
The invention discloses a decision tree recognition method based on partial discharge in sulfur hexafluoride gas. The decision tree is formed as follows. S1: judge whether the training sample set is empty; if not, go to step S2; otherwise, go to step S6. S2: judge whether the samples at the decision node contain only one class; if not, go to step S3; otherwise, go to step S6. S3: judge whether attribute A, the attribute with the largest information gain ratio in the samples, is a continuous quantity; if not, go to step S4; otherwise, go to step S6. S4: find the partition threshold of attribute A. S5: grow new nodes according to attribute A, and return to step S1. S6: take the current node as a leaf node and name it after the corresponding class. S7: form the decision tree. The beneficial effects obtained by the invention are: the safe and reliable operation of SF6 equipment is ensured, the recognition rate for all kinds of defects is improved, and the efficiency of handling insulation faults can be improved. Pattern recognition of the acquired partial discharge is carried out with the decision tree, further improving the recognition rate of partial discharge.
Description
Technical field
The present invention relates to the technical field of sulfur hexafluoride decomposition, and in particular to a decision tree recognition method based on partial discharge in sulfur hexafluoride gas.
Background technology
Sulfur hexafluoride (SF6) gas is widely used in gas-insulated equipment because of its excellent insulation and arc-extinguishing performance. However, during the manufacture, transport, installation, maintenance and operation of SF6 gas-insulated equipment (SF6 electrical equipment for short, such as gas-insulated switchgear GIS, gas-insulated circuit breakers GCB, gas-insulated transformers GIT, and gas-insulated lines or pipelines GIL), various insulation defects inevitably arise inside the equipment, such as metallic burrs on conductors, loose or poorly contacting fasteners, air gaps formed where conductors and supporting insulators peel apart, and metal particles left behind in cavities after maintenance. All of these can create insulation defects of varying severity inside SF6 equipment, distorting the internal electric field and thereby producing partial discharge (PD).
When severe PD occurs, on the one hand, PD accelerates further damage to the internal insulation, ultimately causing insulation faults and power outages; it is a potential hidden danger to operating SF6 equipment and has been called the "tumour" of insulation. On the other hand, PD is also a characteristic quantity that effectively characterizes the insulation condition: by detecting the PD of SF6 electrical equipment and performing pattern recognition on it, the insulation defects inside the equipment and their types can largely be found. Therefore, identifying the occurrence of insulation defects is of great practical significance for ensuring the safe and reliable operation of SF6 electrical equipment.
The content of the invention
In view of the drawbacks of the prior art described above, an object of the present invention is to provide a decision tree recognition method based on partial discharge in sulfur hexafluoride gas that ensures the safe and reliable operation of SF6 equipment, improves the recognition rate for all kinds of defects, and improves the efficiency of handling insulation faults.
The object of the present invention is achieved by the following technical scheme. A decision tree recognition method based on partial discharge in sulfur hexafluoride gas comprises the following decision tree formation flow:
S1: judge whether the training sample set is empty; if not, go to step S2; otherwise, go to step S6;
S2: judge whether the samples at the decision node contain only one class; if not, go to step S3; otherwise, go to step S6;
S3: judge whether attribute A, the attribute with the largest information gain ratio in the samples, is a continuous quantity; if not, go to step S4; otherwise, go to step S6;
S4: find the partition threshold of attribute A;
S5: grow new nodes according to attribute A, and return to step S1;
S6: take the current node as a leaf node and name it after the corresponding class;
S7: form the decision tree.
Further, the judgement flow of step S2 also includes:
S21: calculate the information gain ratio under each attribute in the samples;
S22: find the attribute A with the largest information gain ratio.
Further, step S6 also includes:
S61: calculate the estimated misclassification rate and perform pruning.
Further, the decision tree is generated with the C4.5 algorithm; the generation flow is as follows:
S01: let S be a set of s data samples, each belonging to one of m distinct classes $C_i$ ($i = 1, \ldots, m$);
S02: let $s_i$ be the number of samples in $C_i$; for the given sample set, the total information entropy is
$$I(s_1, s_2, \ldots, s_m) = -\sum_{i=1}^{m} p_i \log_2 p_i \quad (1)$$
where $p_i$ is the probability that an arbitrary sample belongs to $C_i$, estimated by $s_i/s$;
S03: let A be an attribute of the samples, taking v distinct values $\{a_1, a_2, \ldots, a_v\}$;
S04: attribute A divides S into v subsets $\{S_1, S_2, \ldots, S_v\}$, where $S_j$ contains the samples of S whose value of A is $a_j$;
S05: if A is selected as the test attribute, these subsets are exactly the branches grown out of the node holding sample set S.
Further, the decision tree generation also includes:
S06: let $s_{ij}$ be the number of samples in subset $S_j$ that belong to class $C_i$;
S07: the entropy of the subsets into which attribute A divides S is
$$E(A) = \sum_{j=1}^{v} \frac{s_{1j} + s_{2j} + \cdots + s_{mj}}{s} I(s_{1j}, s_{2j}, \ldots, s_{mj}) \quad (2)$$
where $(s_{1j} + s_{2j} + \cdots + s_{mj})/s$ is the weight of subset $S_j$, equal to the number of samples in $S_j$ divided by the total number of samples in S; the smaller the entropy, the higher the purity of the subset division;
S08: $I(s_{1j}, s_{2j}, \ldots, s_{mj})$ is the entropy of subset $S_j$:
$$I(s_{1j}, s_{2j}, \ldots, s_{mj}) = -\sum_{i=1}^{m} p_{ij} \log_2 p_{ij} \quad (3)$$
where $p_{ij} = s_{ij}/|S_j|$ is the probability that a sample in $S_j$ belongs to class $C_i$;
S09: the information gain obtained by dividing sample set S by attribute A is
$$\mathrm{Gain}(S, A) = I(s_1, s_2, \ldots, s_m) - E(A) \quad (4).$$
Further, the method also includes:
S010: if A is a continuous attribute, sort the samples of training set S in ascending order of their values of A;
S011: supposing A takes v distinct values in the training set, the sorted value sequence of attribute A is $\{a_1, a_2, \ldots, a_v\}$; take the average of each pair of adjacent values in order as a cut-point, giving v-1 cut-points in total;
S012: calculate the information gain ratio of each cut-point, and select the cut-point with the largest information gain ratio as the local threshold;
S013: in the sequence $\{a_1, a_2, \ldots, a_v\}$, find the value $v_{max}$ closest to but not exceeding the local threshold as the partition threshold of attribute A;
S014: select the test attribute by the information-gain-ratio method; the information gain ratio is the ratio of the information gain to the split information, i.e., the information gain ratio of dividing S by A is
$$\mathrm{GainRatio}(A) = \frac{\mathrm{Gain}(A)}{\mathrm{SplitI}(A)} \quad (5)$$
where $\mathrm{SplitI}(A) = -\sum_{j=1}^{v} \frac{|S_j|}{|S|} \log_2 \frac{|S_j|}{|S|}$ is the split information of attribute A.
Further, the method also includes pruning of the decision tree:
S611: the estimated misclassification rate of a leaf node is calculated as
$$e = \frac{f + \frac{z^2}{2N} + z\sqrt{\frac{f}{N} - \frac{f^2}{N} + \frac{z^2}{4N^2}}}{1 + \frac{z^2}{N}} \quad (6)$$
where f is the observed misclassification rate, f = E/N, E is the number of misclassified samples at the leaf node, N is the total number of samples at the current leaf, and z is the confidence limit; typically, at a confidence level of 0.25, z = 0.69. The estimated misclassification rate of the root of a subtree is the weighted average of the estimated misclassification rates of its leaf nodes, i.e.
$$e_T = \sum_{i=1}^{k} \frac{N_i}{N} e_i \quad (7)$$
where k is the number of branches and $N_i$ is the number of samples assigned to the i-th branch.
Further, the training samples in step S1 cover three kinds of defects: metallic protrusions, insulator surface air gaps, and free metal particles.
By adopting the above technical solution, the present invention has the following advantages:
(1) the safe and reliable operation of SF6 equipment is ensured, which is of important scientific and practical value for grasping the insulation condition of SF6 equipment and building a condition-based maintenance system;
(2) the recognition rate for all kinds of defects is improved, and the features of all kinds of insulation defects can be better characterized;
(3) an intelligent insulation-defect diagnosis system based on decomposition components is established using the decision tree, which can improve the efficiency of handling insulation faults;
(4) a decision tree method for insulation defect identification is founded, and the rules and conditions for identifying fault types from component ratios are found;
(5) pattern recognition is performed on the acquired partial discharge using the decision tree, further improving the recognition rate of partial discharge.
Other advantages, objects and features of the present invention will be set forth to some extent in the following description, will to some extent be apparent to those skilled in the art upon study of the text below, or may be learned from practice of the present invention. The objects and other advantages of the present invention can be realized and obtained through the following specification and claims.
Brief description of the drawings
The brief description of the drawings of the present invention is as follows:
Fig. 1 is the construction flowchart of the decision tree algorithm of the present invention.
Fig. 2 is the decision tree diagram constructed by the present invention.
Fig. 3 is the distribution diagram of the misclassified samples of the present invention.
Embodiment
The invention will be further described with reference to the accompanying drawings and examples.
Embodiment: as shown in Fig. 1 to Fig. 3, a decision tree recognition method based on partial discharge in sulfur hexafluoride gas includes the following decision tree formation flow:
S1: judge whether the training sample set is empty; if not, go to step S2; otherwise, go to step S6;
S2: judge whether the samples at the decision node contain only one class; if not, go to step S3; otherwise, go to step S6;
The judgement flow of step S2 also includes:
S21: calculate the information gain ratio under each attribute in the samples;
S22: find the attribute A with the largest information gain ratio.
S3: judge whether attribute A, the attribute with the largest information gain ratio in the samples, is a continuous quantity; if not, go to step S4; otherwise, go to step S6;
S4: find the partition threshold of attribute A;
S5: grow new nodes according to attribute A, and return to step S1;
S6: take the current node as a leaf node and name it after the corresponding class; step S6 also includes S61: calculate the estimated misclassification rate and perform pruning.
S7: form the decision tree.
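As a concrete illustration, the recursive growth loop of steps S1-S7 can be sketched in Python. This is our own minimal sketch, not the patent's implementation; the function and argument names are illustrative, and the `choose_attr` callback stands in for the information-gain-ratio selection of steps S21-S22:

```python
from collections import Counter

def grow(samples, labels, attrs, choose_attr):
    """Recursively grow a decision tree following steps S1-S7.

    samples: list of dicts mapping attribute name -> value;
    labels:  parallel list of class labels;
    choose_attr(samples, labels, attrs): returns the attribute to test,
    e.g. the one with the largest information gain ratio (S21-S22)."""
    if not samples:                                 # S1: empty set -> leaf (S6)
        return ("leaf", None)
    if len(set(labels)) == 1:                       # S2: one class -> leaf (S6)
        return ("leaf", labels[0])
    if not attrs:                                   # no attribute left: majority leaf
        return ("leaf", Counter(labels).most_common(1)[0][0])
    a = choose_attr(samples, labels, attrs)         # S2/S3: pick attribute A
    rest = [x for x in attrs if x != a]
    children = {}
    for v in sorted({s[a] for s in samples}):       # S5: one branch per value of A
        idx = [i for i, s in enumerate(samples) if s[a] == v]
        children[v] = grow([samples[i] for i in idx],
                           [labels[i] for i in idx], rest, choose_attr)
    return ("node", a, children)                    # S7: assembled subtree
```

With four samples split cleanly by one attribute, the recursion terminates after a single test node with two pure leaves.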
The present application uses the C4.5 algorithm to generate the decision tree. C4.5 is one of the most powerful and widely used decision tree algorithms at present. It is based on the ID3 algorithm proposed by Quinlan in 1986; it retains all the advantages of ID3 and makes a series of improvements to it, greatly improving the algorithm's performance.
The principle of the ID3 algorithm is that, when selecting the attribute at each level of node, the decision tree uses information entropy theory to select the attribute with the largest information gain in the current sample set as the test attribute, and establishes branches according to the distinct values of that attribute, until every subset contains only data of the same class, finally obtaining a decision tree that identifies objects.
The decision tree is generated with the C4.5 algorithm; the generation flow is as follows:
S01: let S be a set of s data samples, each belonging to one of m distinct classes $C_i$ ($i = 1, \ldots, m$);
S02: let $s_i$ be the number of samples in $C_i$; for the given sample set, the total information entropy is
$$I(s_1, s_2, \ldots, s_m) = -\sum_{i=1}^{m} p_i \log_2 p_i \quad (1)$$
where $p_i$ is the probability that an arbitrary sample belongs to $C_i$, estimated by $s_i/s$;
S03: let A be an attribute of the samples, taking v distinct values $\{a_1, a_2, \ldots, a_v\}$;
S04: attribute A divides S into v subsets $\{S_1, S_2, \ldots, S_v\}$, where $S_j$ contains the samples of S whose value of A is $a_j$;
S05: if A is selected as the test attribute, these subsets are exactly the branches grown out of the node holding sample set S.
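Equation (1) is simply the Shannon entropy of the class distribution. A minimal sketch (the function name is ours, not the patent's):

```python
import math

def total_entropy(class_counts):
    """I(s1,...,sm) = -sum_i p_i * log2(p_i), with p_i = s_i / s  (equation (1)).

    class_counts: the per-class sample counts s_1, ..., s_m."""
    s = sum(class_counts)
    return -sum(si / s * math.log2(si / s) for si in class_counts if si > 0)
```

For the three equally sized defect classes used later (8 samples each), this gives log2(3) ≈ 1.585 bits; a pure single-class set gives 0.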
The decision tree generation also includes:
S06: let $s_{ij}$ be the number of samples in subset $S_j$ that belong to class $C_i$;
S07: the entropy of the subsets into which attribute A divides S is
$$E(A) = \sum_{j=1}^{v} \frac{s_{1j} + s_{2j} + \cdots + s_{mj}}{s} I(s_{1j}, s_{2j}, \ldots, s_{mj}) \quad (2)$$
where $(s_{1j} + s_{2j} + \cdots + s_{mj})/s$ is the weight of subset $S_j$, equal to the number of samples in $S_j$ divided by the total number of samples in S; the smaller the entropy, the higher the purity of the subset division;
S08: $I(s_{1j}, s_{2j}, \ldots, s_{mj})$ is the entropy of subset $S_j$:
$$I(s_{1j}, s_{2j}, \ldots, s_{mj}) = -\sum_{i=1}^{m} p_{ij} \log_2 p_{ij} \quad (3)$$
where $p_{ij} = s_{ij}/|S_j|$ is the probability that a sample in $S_j$ belongs to class $C_i$;
S09: the information gain obtained by dividing sample set S by attribute A is
$$\mathrm{Gain}(S, A) = I(s_1, s_2, \ldots, s_m) - E(A) \quad (4).$$
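Equations (2)-(4) combine into the familiar information-gain computation: the entropy of the whole set minus the weighted entropies of the subsets induced by attribute A. A sketch under our own naming:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels (equations (1)/(3))."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def info_gain(values, labels):
    """Gain(S, A) = I(S) - E(A)  (equation (4)).

    values: the value of attribute A for each sample; labels: its class.
    E(A) is the |S_j|/|S|-weighted entropy of each induced subset S_j."""
    n = len(labels)
    e_a = 0.0
    for v in set(values):                       # each value a_j of A (S04)
        subset = [l for x, l in zip(values, labels) if x == v]
        e_a += len(subset) / n * entropy(subset)    # equation (2)
    return entropy(labels) - e_a
```

An attribute that separates the classes perfectly yields a gain equal to the full entropy; an uninformative attribute yields a gain of zero.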
The ID3 algorithm selects, at each node, the attribute with the largest information gain Gain(S, A) as the test attribute. Its advantages are that the method is simple and its learning ability is strong. Its drawback is that it tends to select attributes with many values, and in most cases the attribute with the most values is not necessarily the optimal one. In addition, the ID3 algorithm is effective only for relatively small data sets, is sensitive to noise, and the decision tree may change when the training data set grows.
The C4.5 algorithm makes a series of improvements to ID3. First, it can handle continuous-valued attributes; its basic idea is to divide the value domain of a continuous attribute into a set of discrete intervals, which also includes:
S010: if A is a continuous attribute, sort the samples of training set S in ascending order of their values of A;
S011: supposing A takes v distinct values in the training set, the sorted value sequence of attribute A is $\{a_1, a_2, \ldots, a_v\}$; take the average of each pair of adjacent values in order as a cut-point, giving v-1 cut-points in total;
S012: calculate the information gain ratio of each cut-point, and select the cut-point with the largest information gain ratio as the local threshold;
S013: in the sequence $\{a_1, a_2, \ldots, a_v\}$, find the value $v_{max}$ closest to but not exceeding the local threshold as the partition threshold of attribute A;
S014: select the test attribute by the information-gain-ratio method; the information gain ratio is the ratio of the information gain to the split information, i.e., the information gain ratio of dividing S by A is
$$\mathrm{GainRatio}(A) = \frac{\mathrm{Gain}(A)}{\mathrm{SplitI}(A)} \quad (5)$$
where $\mathrm{SplitI}(A) = -\sum_{j=1}^{v} \frac{|S_j|}{|S|} \log_2 \frac{|S_j|}{|S|}$ is the split information of attribute A.
The information gain ratio of every attribute in the current candidate attribute set is obtained by the above method, and the attribute with the highest information gain ratio is taken as the test attribute; the sample set is divided into several sub-sample sets, and each sub-sample set is further split in the same way until it is indivisible or a stopping condition is reached.
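Steps S010-S014 amount to a scan over candidate cut-points of the sorted attribute values. The simplified sketch below (all names ours) scores each midpoint cut directly by gain ratio, equation (5), and omits the S013 snap-back to the largest attribute value not exceeding the local threshold:

```python
import math
from collections import Counter

def _entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def best_threshold(values, labels):
    """S010-S012: sort by the continuous attribute, try the midpoint between
    each pair of adjacent distinct values, and keep the cut with the highest
    gain ratio (equation (5)). Returns (threshold, gain_ratio)."""
    pairs = sorted(zip(values, labels))           # S010: sort by attribute A
    xs = sorted(set(values))
    base = _entropy(labels)
    n = len(labels)
    best = (None, -1.0)
    for lo, hi in zip(xs, xs[1:]):                # S011: v-1 midpoint cut-points
        t = (lo + hi) / 2.0
        left = [l for x, l in pairs if x <= t]
        right = [l for x, l in pairs if x > t]
        w = len(left) / n
        gain = base - w * _entropy(left) - (1 - w) * _entropy(right)
        split_info = -(w * math.log2(w) + (1 - w) * math.log2(1 - w))
        ratio = gain / split_info if split_info > 0 else 0.0
        if ratio > best[1]:                       # S012: keep the best cut
            best = (t, ratio)
    return best
```

On a set that separates cleanly, the midpoint between the two class clusters wins with a gain ratio of 1.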
The method also includes pruning of the decision tree. Pruning a decision tree means replacing a whole subtree with a single leaf node: the complete tree is built first and then trimmed back. The pruning principle is: if the estimated misclassification rate of the subtree after branching is larger than the estimated misclassification rate of its root treated as a leaf before branching, the subtree is pruned; otherwise it is kept.
S611: the estimated misclassification rate of a leaf node is calculated as
$$e = \frac{f + \frac{z^2}{2N} + z\sqrt{\frac{f}{N} - \frac{f^2}{N} + \frac{z^2}{4N^2}}}{1 + \frac{z^2}{N}} \quad (6)$$
where f is the observed misclassification rate, f = E/N, E is the number of misclassified samples at the leaf node, N is the total number of samples at the current leaf, and z is the confidence limit; typically, at a confidence level of 0.25, z = 0.69. The estimated misclassification rate of the root of a subtree is the weighted average of the estimated misclassification rates of its leaf nodes, i.e.
$$e_T = \sum_{i=1}^{k} \frac{N_i}{N} e_i \quad (7)$$
where k is the number of branches and $N_i$ is the number of samples assigned to the i-th branch.
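Equations (6) and (7) translate directly to code. A sketch with our own function names, using the patent's defaults (z = 0.69 at confidence level 0.25):

```python
import math

def leaf_error(E, N, z=0.69):
    """Equation (6): pessimistic (upper-confidence) estimate of a leaf's
    error rate, from observed rate f = E/N and confidence limit z."""
    f = E / N
    return ((f + z * z / (2 * N)
             + z * math.sqrt(f / N - f * f / N + z * z / (4 * N * N)))
            / (1 + z * z / N))

def subtree_error(branches, z=0.69):
    """Equation (7): weighted average e_T of the leaf estimates e_i,
    each branch weighted by its sample share N_i / N.

    branches: list of (E_i, N_i) pairs, one per branch."""
    N = sum(n for _, n in branches)
    return sum(n / N * leaf_error(e, n, z) for e, n in branches)
```

Pruning then compares `leaf_error` of the subtree root collapsed to a leaf against `subtree_error` of its branches: if collapsing does not increase the estimate, the subtree is replaced by a leaf.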
The training samples in step S1 cover three kinds of defects: metallic protrusions, insulator surface air gaps, and free metal particles.
Because insulator surface contamination does not decompose the gas, the present application extracts the component content ratios c(SOF2)/c(SO2F2), c(CF4)/c(CO2) and c(SOF2+SO2F2)/c(CO2+CF4) as characteristic quantities. Using 24 groups of SF6 decomposition component data obtained under the three defect types (metallic protrusion, insulator surface air gap, and free metal particle) as training samples, a decision tree is established and pruned according to the principles described above; the minimum number of samples at a node is set to 2, the confidence factor is set to 0.25, and the classification accuracy of the decision tree is measured by ten-fold cross-validation.
From the decision tree generation result, of the three input characteristic quantities the final decision tree uses only the two component ratio features c(SOF2)/c(SO2F2) and c(CF4)/c(CO2). This shows that the test data have good discriminability: only two characteristic quantities are needed to identify the three kinds of defects. Following the maximum-information-gain-ratio principle, the C4.5 algorithm selected the two characteristic quantities c(SOF2)/c(SO2F2) and c(CF4)/c(CO2) and discarded c(SOF2+SO2F2)/c(CO2+CF4).
For the decision tree obtained in Fig. 2, another set of test data is used as test samples to verify its classification performance, namely 24 groups of SF6 decomposition component data under the three defect types; the recognition results are shown in Table 1.
Table 1. Decision tree recognition results
Defect type | N class | P class | G class | Total
---|---|---|---|---
Sample count | 8 | 8 | 8 | 24
Correctly identified | 8 | 6 | 7 | 21
Recognition rate | 100% | 75.0% | 87.5% | 87.5%
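The recognition rates in Table 1 follow directly from the counts; as a quick check:

```python
# Recompute the per-class and overall recognition rates of Table 1.
samples = {"N": 8, "P": 8, "G": 8}   # test samples per defect class
correct = {"N": 8, "P": 6, "G": 7}   # correctly identified per class
rates = {k: correct[k] / samples[k] for k in samples}
overall = sum(correct.values()) / sum(samples.values())   # 21 / 24
print(rates, overall)
```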
As can be seen from the recognition results, the typical insulation defects simulated in the laboratory are identified by the obtained decision tree with a comprehensive recognition rate of 87.50%, a fairly good recognition effect.
Table 2. Confusion matrix of the recognition results
Except for the N-class defects, which are all identified correctly, every other defect class has misclassified samples. Table 2 gives the confusion matrix of the recognition results: 2 groups of P-class defects are confused with G-class defects, and 1 group of G-class defects is confused with P-class defects. If these misclassified samples are marked in coordinates as shown in Fig. 3, it can be seen that the misidentified samples all lie near the boundary between the two defect types; this is because a decision tree boundary is a hard threshold, which easily causes objects near the boundary to be misidentified.
The beneficial effects of the invention are as follows. A decision tree recognition method using SF6 decomposition component content ratios as characteristic quantities is proposed, and a PD pattern recognition decision tree with c(SOF2)/c(SO2F2) and c(CF4)/c(CO2) as characteristic quantities is constructed; its decision process is shown in Fig. 2. It is further proposed that when the recognition rate is not high, c(SOF2+SO2F2)/c(CO2+CF4) can be used as an auxiliary characteristic quantity to further improve the recognition rate of partial discharge. Pattern recognition was performed with this decision tree on the partial discharge obtained in the laboratory, achieving satisfactory results. When the component contents do not exceed their limits and an ultra-high-frequency signal is present, there is an insulator surface contamination defect in the equipment; when the component contents do not exceed their limits and there is no ultra-high-frequency signal, the equipment is normal.
Finally, it should be noted that the above embodiments merely illustrate, and do not limit, the technical solution of the present invention. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art will understand that the technical solution of the present invention may be modified or equivalently substituted without departing from the purpose and scope of the technical solution, all of which should be covered by the scope of the claims of the present invention.
Claims (8)
1. A decision tree recognition method based on partial discharge in sulfur hexafluoride gas, characterised in that the decision tree formation flow is as follows:
S1: judge whether the training sample set is empty; if not, go to step S2; otherwise, go to step S6;
S2: judge whether the samples at the decision node contain only one class; if not, go to step S3; otherwise, go to step S6;
S3: judge whether attribute A, the attribute with the largest information gain ratio in the samples, is a continuous quantity; if not, go to step S4; otherwise, go to step S6;
S4: find the partition threshold of attribute A;
S5: grow new nodes according to attribute A, and return to step S1;
S6: take the current node as a leaf node and name it after the corresponding class;
S7: form the decision tree.
2. The decision tree recognition method based on partial discharge in sulfur hexafluoride gas as claimed in claim 1, characterised in that the judgement flow of step S2 also includes:
S21: calculate the information gain ratio under each attribute in the samples;
S22: find the attribute A with the largest information gain ratio.
3. The decision tree recognition method based on partial discharge in sulfur hexafluoride gas as claimed in claim 2, characterised in that step S6 also includes:
S61: calculate the estimated misclassification rate and perform pruning.
4. The decision tree recognition method based on partial discharge in sulfur hexafluoride gas as claimed in claim 3, characterised in that the decision tree is generated with the C4.5 algorithm; the generation flow is as follows:
S01: let S be a set of s data samples, each belonging to one of m distinct classes $C_i$ ($i = 1, \ldots, m$);
S02: let $s_i$ be the number of samples in $C_i$; for the given sample set, the total information entropy is
$$I(s_1, s_2, \ldots, s_m) = -\sum_{i=1}^{m} p_i \log_2 p_i \quad (1)$$
where $p_i$ is the probability that an arbitrary sample belongs to $C_i$, estimated by $s_i/s$;
S03: let A be an attribute of the samples, taking v distinct values $\{a_1, a_2, \ldots, a_v\}$;
S04: attribute A divides S into v subsets $\{S_1, S_2, \ldots, S_v\}$, where $S_j$ contains the samples of S whose value of A is $a_j$;
S05: if A is selected as the test attribute, these subsets are exactly the branches grown out of the node holding sample set S.
5. The decision tree recognition method based on partial discharge in sulfur hexafluoride gas as claimed in claim 4, characterised in that the decision tree generation also includes:
S06: let $s_{ij}$ be the number of samples in subset $S_j$ that belong to class $C_i$;
S07: the entropy of the subsets into which attribute A divides S is
$$E(A) = \sum_{j=1}^{v} \frac{s_{1j} + s_{2j} + \cdots + s_{mj}}{s} I(s_{1j}, s_{2j}, \ldots, s_{mj}) \quad (2)$$
where $(s_{1j} + s_{2j} + \cdots + s_{mj})/s$ is the weight of subset $S_j$, equal to the number of samples in $S_j$ divided by the total number of samples in S; the smaller the entropy, the higher the purity of the subset division;
S08: $I(s_{1j}, s_{2j}, \ldots, s_{mj})$ is the entropy of subset $S_j$:
$$I(s_{1j}, s_{2j}, \ldots, s_{mj}) = -\sum_{i=1}^{m} p_{ij} \log_2 p_{ij} \quad (3)$$
where $p_{ij} = s_{ij}/|S_j|$ is the probability that a sample in $S_j$ belongs to class $C_i$;
S09: the information gain obtained by dividing sample set S by attribute A is
$$\mathrm{Gain}(S, A) = I(s_1, s_2, \ldots, s_m) - E(A) \quad (4).$$
6. The decision tree recognition method based on partial discharge in sulfur hexafluoride gas as claimed in claim 5, characterised in that it also includes:
S010: if A is a continuous attribute, sort the samples of training set S in ascending order of their values of A;
S011: supposing A takes v distinct values in the training set, the sorted value sequence of attribute A is $\{a_1, a_2, \ldots, a_v\}$; take the average of each pair of adjacent values in order as a cut-point, giving v-1 cut-points in total;
S012: calculate the information gain ratio of each cut-point, and select the cut-point with the largest information gain ratio as the local threshold;
S013: in the sequence $\{a_1, a_2, \ldots, a_v\}$, find the value $v_{max}$ closest to but not exceeding the local threshold as the partition threshold of attribute A;
S014: select the test attribute by the information-gain-ratio method; the information gain ratio is the ratio of the information gain to the split information, i.e., the information gain ratio of dividing S by A is
$$\mathrm{GainRatio}(A) = \frac{\mathrm{Gain}(A)}{\mathrm{SplitI}(A)} \quad (5)$$
where $\mathrm{SplitI}(A) = -\sum_{j=1}^{v} \frac{|S_j|}{|S|} \log_2 \frac{|S_j|}{|S|}$ is the split information of attribute A.
7. The decision tree recognition method based on partial discharge in sulfur hexafluoride gas as claimed in claim 6, characterised in that it also includes pruning of the decision tree:
S611: the estimated misclassification rate of a leaf node is calculated as
$$e = \frac{f + \frac{z^2}{2N} + z\sqrt{\frac{f}{N} - \frac{f^2}{N} + \frac{z^2}{4N^2}}}{1 + \frac{z^2}{N}} \quad (6)$$
where f is the observed misclassification rate, f = E/N, E is the number of misclassified samples at the leaf node, N is the total number of samples at the current leaf, and z is the confidence limit; typically, at a confidence level of 0.25, z = 0.69. The estimated misclassification rate of the root of a subtree is the weighted average of the estimated misclassification rates of its leaf nodes, i.e.
$$e_T = \sum_{i=1}^{k} \frac{N_i}{N} e_i \quad (7)$$
where k is the number of branches and $N_i$ is the number of samples assigned to the i-th branch.
8. The decision tree recognition method based on partial discharge in sulfur hexafluoride gas as claimed in claim 1, characterised in that the training samples in step S1 cover three kinds of defects: metallic protrusions, insulator surface air gaps, and free metal particles.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711044243.5A CN107817427A (en) | 2017-10-31 | 2017-10-31 | Decision tree recognition method based on sulfur hexafluoride gas partial discharge |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107817427A true CN107817427A (en) | 2018-03-20 |
Family
ID=61603026
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711044243.5A Pending CN107817427A (en) | 2017-10-31 | 2017-10-31 | Decision tree recognition method based on sulfur hexafluoride gas partial discharge |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107817427A (en) |
2017-10-31: application CN201711044243.5A (CN), published as CN107817427A (en), status active, Pending
Non-Patent Citations (1)
Title |
---|
Liu Fan: "Decomposition Characteristics of Sulfur Hexafluoride under Partial Discharge, Discharge Type Identification, and Correction of Influencing Factors", China Doctoral Dissertations Full-text Database, Engineering Science & Technology II |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108805295A (en) * | 2018-03-26 | 2018-11-13 | 海南电网有限责任公司电力科学研究院 | Fault diagnosis method based on a decision tree algorithm |
CN109459522A (en) * | 2018-12-29 | 2019-03-12 | 云南电网有限责任公司电力科学研究院 | Transformer fault prediction method and device based on the ID3 algorithm |
CN110220602A (en) * | 2019-06-24 | 2019-09-10 | 广西电网有限责任公司电力科学研究院 | Switchgear overheating fault identification method |
CN110220602B (en) * | 2019-06-24 | 2020-08-25 | 广西电网有限责任公司电力科学研究院 | Switch cabinet overheating fault identification method |
CN112305354A (en) * | 2020-10-23 | 2021-02-02 | 海南电网有限责任公司电力科学研究院 | Method for diagnosing defect type of sulfur hexafluoride insulation electrical equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107817427A (en) | Decision tree recognition method based on sulfur hexafluoride gas partial discharge | |
CN107301296B (en) | Data-based qualitative analysis method for circuit breaker fault influence factors | |
Tang et al. | Partial discharge recognition through an analysis of SF 6 decomposition products part 2: feature extraction and decision tree-based pattern recognition | |
CN109444656B (en) | Online diagnosis method for deformation position of transformer winding | |
CN103076547B (en) | Method for identifying GIS (Gas Insulated Switchgear) local discharge fault type mode based on support vector machines | |
CN110687393B (en) | Valve short-circuit protection fault positioning method based on VMD-SVD-FCM | |
CN109142969A (en) | A kind of power transmission line fault phase selection based on Continuous Hidden Markov Model | |
CN105701470A (en) | Analog circuit fault characteristic extraction method based on optimal wavelet packet decomposition | |
CN106199351A (en) | The sorting technique of local discharge signal and device | |
CN109470985A (en) | A kind of voltage sag source identification methods based on more resolution singular value decompositions | |
Omar et al. | Fault classification on transmission line using LSTM network | |
CN106443380B (en) | A kind of distribution cable local discharge signal recognition methods and device | |
CN108805295A (en) | A kind of method for diagnosing faults based on decision Tree algorithms | |
CN112861417A (en) | Transformer fault diagnosis method based on weighted sum selective naive Bayes | |
CN115600088A (en) | Distribution transformer fault diagnosis method based on vibration signals | |
CN115618249A (en) | Low-voltage power distribution station area phase identification method based on LargeVis dimension reduction and DBSCAN clustering | |
CN114091549A (en) | Equipment fault diagnosis method based on deep residual error network | |
CN116243115A (en) | High-voltage cable mode identification method and device based on time sequence topology data analysis | |
CN112381667B (en) | Distribution network electrical topology identification method based on deep learning | |
CN113866552B (en) | Medium voltage distribution network user electricity consumption abnormality diagnosis method based on machine learning | |
CN114397569A (en) | Circuit breaker fault arc detection method based on VMD parameter optimization and sample entropy | |
CN114386024A (en) | Power intranet terminal equipment abnormal attack detection method based on ensemble learning | |
CN117171544B (en) | Motor vibration fault diagnosis method based on multichannel fusion convolutional neural network | |
CN112817954A (en) | Missing value interpolation method based on multi-method ensemble learning | |
CN116503710A (en) | GIS partial discharge type identification method based on self-adaptive convolutional neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180320 ||