CN107817427A - Decision tree recognition method based on sulfur hexafluoride gas partial discharge - Google Patents

Decision tree recognition method based on sulfur hexafluoride gas partial discharge

Info

Publication number
CN107817427A
CN107817427A (application CN201711044243.5A)
Authority
CN
China
Prior art keywords
sample
attribute
decision tree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711044243.5A
Other languages
Chinese (zh)
Inventor
邱妮
何国军
姚强
苗玉龙
唐炬
曾福平
杨华夏
籍勇亮
胡晓锐
宫林
张施令
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electric Power Research Institute of State Grid Chongqing Electric Power Co Ltd
State Grid Corp of China SGCC
Wuhan University WHU
Original Assignee
Electric Power Research Institute of State Grid Chongqing Electric Power Co Ltd
State Grid Corp of China SGCC
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electric Power Research Institute of State Grid Chongqing Electric Power Co Ltd, State Grid Corp of China SGCC, Wuhan University WHU filed Critical Electric Power Research Institute of State Grid Chongqing Electric Power Co Ltd
Priority to CN201711044243.5A priority Critical patent/CN107817427A/en
Publication of CN107817427A publication Critical patent/CN107817427A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01RMEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
    • G01R31/00Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere
    • G01R31/12Testing dielectric strength or breakdown voltage ; Testing or monitoring effectiveness or level of insulation, e.g. of a cable or of an apparatus, for example using partial discharge measurements; Electrostatic testing
    • G01R31/1227Testing dielectric strength or breakdown voltage ; Testing or monitoring effectiveness or level of insulation, e.g. of a cable or of an apparatus, for example using partial discharge measurements; Electrostatic testing of components, parts or materials
    • G01R31/1254Testing dielectric strength or breakdown voltage ; Testing or monitoring effectiveness or level of insulation, e.g. of a cable or of an apparatus, for example using partial discharge measurements; Electrostatic testing of components, parts or materials of gas-insulated power appliances or vacuum gaps
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Other Investigation Or Analysis Of Materials By Electrical Means (AREA)

Abstract

The invention discloses a decision tree recognition method based on sulfur hexafluoride gas partial discharge, comprising the following decision tree formation flow. S1: judge whether the training sample set is empty; if not, go to step S2, otherwise go to step S6. S2: judge whether the samples in the decision node belong to only one class; if not, go to step S3, otherwise go to step S6. S3: judge whether the attribute A with the maximum information gain ratio in the samples is a continuous quantity; if not, go to step S4, otherwise go to step S6. S4: find the partition threshold of attribute A. S5: grow new nodes according to attribute A, and return to step S1. S6: take the node as a leaf node and name it with the corresponding class. S7: form the decision tree. The beneficial effects obtained by the invention are: the safe and reliable operation of SF6 equipment is ensured, the recognition rate of various defect types is improved, and the efficiency of handling insulation faults can be improved. Pattern recognition is performed on the acquired partial discharge using the decision tree, further improving the recognition rate of partial discharge.

Description

Decision tree recognition method based on sulfur hexafluoride gas partial discharge
Technical field
The present invention relates to the technical field of sulfur hexafluoride decomposition, and in particular to a decision tree recognition method based on sulfur hexafluoride gas partial discharge.
Background art
Sulfur hexafluoride (SF6) gas is widely used in gas-insulated equipment owing to its excellent insulation and arc-extinguishing performance. However, during manufacture, transport, installation, maintenance and operation, SF6 gas-insulated equipment (SF6 electrical equipment for short, such as gas-insulated switchgear GIS, gas circuit breakers GCB, gas-insulated transformers GIT, and gas-insulated lines or pipelines GIL) inevitably develops various internal insulation defects, such as metallic burrs on conductors, loose or poorly contacting fasteners, air gaps formed where conductors and supporting insulators peel apart, and metal particles left in cavities after maintenance. All of these can create insulation defects of varying severity inside SF6 equipment, distorting the internal electric field and thereby giving rise to partial discharge (PD).
When severe PD occurs, on the one hand it accelerates the further destruction of the internal insulation and can ultimately cause insulation failure and power outage; for operating SF6 equipment it is a latent hazard, sometimes called the "tumour" of insulation. On the other hand, PD is also a characteristic quantity that effectively characterizes the insulation state: by detecting the PD of SF6 electrical equipment and performing pattern recognition on it, the insulation defects existing inside the equipment and their types can largely be discovered. Identifying the occurrence of insulation defects is therefore of important practical significance for ensuring the safe and reliable operation of SF6 electrical equipment.
Content of the invention
In view of the above drawbacks of the prior art, an object of the present invention is to provide a decision tree recognition method based on sulfur hexafluoride gas partial discharge, which ensures the safe and reliable operation of SF6 equipment, improves the recognition rate of various defect types, and can improve the efficiency of handling insulation faults.
The object of the present invention is achieved by the following technical scheme: a decision tree recognition method based on sulfur hexafluoride gas partial discharge, comprising a decision tree formation flow as follows:
S1: judge whether the training sample set is empty; if not, go to step S2; otherwise, go to step S6;
S2: judge whether the samples in the decision node belong to only one class; if not, go to step S3; otherwise, go to step S6;
S3: judge whether the attribute A with the maximum information gain ratio in the samples is a continuous quantity; if not, go to step S4; otherwise, go to step S6;
S4: find the partition threshold of attribute A;
S5: grow new nodes according to attribute A, and return to step S1;
S6: take the node as a leaf node and name it with the corresponding class;
S7: form the decision tree.
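To make the recursion concrete, a minimal Python sketch of the S1–S7 flow is given below. All names in it (build_tree, gain_ratio, find_threshold, is_continuous, split) are illustrative assumptions rather than part of the patent; the gain and threshold helpers are sketched after equations (1)–(5) below, and the sketch follows the usual C4.5 reading, in which the partition threshold of step S4 is sought when attribute A is continuous.

    # Minimal sketch of the S1-S7 flow; all helper names are illustrative assumptions.
    from dataclasses import dataclass, field

    @dataclass
    class TreeNode:
        attribute: str = None          # test attribute A (internal node)
        threshold: float = None        # partition threshold for a continuous A (S4)
        label: str = None              # class name (leaf node, S6)
        children: dict = field(default_factory=dict)

    def build_tree(samples, attributes):
        # S1: empty training sample set -> leaf node (S6)
        if not samples:
            return TreeNode(label="empty")
        labels = {label for _, label in samples}
        # S2: only one class left in the decision node -> leaf named after it (S6)
        if len(labels) == 1:
            return TreeNode(label=labels.pop())
        # S21/S22: attribute A with the maximum information gain ratio
        A = max(attributes, key=lambda a: gain_ratio(samples, a))
        # S3/S4: for a continuous attribute, find its partition threshold
        thr = find_threshold(samples, A) if is_continuous(samples, A) else None
        node = TreeNode(attribute=A, threshold=thr)
        # S5: grow new nodes according to A, and return to S1 (recursion)
        for value, subset in split(samples, A, thr).items():
            node.children[value] = build_tree(subset, attributes)
        return node                    # S7: the assembled decision tree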
Further, the judgement flow of step S2 also comprises:
S21: calculate the information gain ratio of each attribute in the samples;
S22: find the attribute A with the maximum information gain ratio.
Further, step S6 also comprises:
S61: calculate the estimated misclassification rate and perform pruning.
Further, the decision tree is generated using the C4.5 algorithm; the generation flow is as follows:
S01: let S be a set of s data samples, the data samples belonging to m different classes C_i (i = 1, ..., m);
S02: let s_i be the number of samples in class C_i; for a given sample set, the total information entropy is

    I(s_1, s_2, \ldots, s_m) = -\sum_{i=1}^{m} p_i \log_2 p_i    (1)

where p_i is the probability that an arbitrary sample belongs to C_i, estimated by s_i / s;
S03: let A be an attribute of the samples, attribute A having v different values {a_1, a_2, ..., a_v};
S04: divide S by attribute A into v subsets {S_1, S_2, ..., S_v}, where S_j contains the samples in S whose value of attribute A is a_j;
S05: if A is selected as the test attribute, these subsets are exactly the branches grown from the node of sample set S.
Further, the decision tree generation also comprises:
S06: let s_{ij} be the number of samples of class C_i in subset S_j;
S07: the entropy of the subsets into which attribute A divides S is

    E(A) = \sum_{j=1}^{v} \frac{s_{1j} + s_{2j} + \cdots + s_{mj}}{s} I(s_{1j}, s_{2j}, \ldots, s_{mj})    (2)

where (s_{1j} + s_{2j} + \cdots + s_{mj}) / s is the weight of subset S_j, equal to the number of samples in S_j divided by the total number of samples in S; the smaller the entropy, the higher the purity of the subset division;
S08: I(s_{1j}, s_{2j}, \ldots, s_{mj}) is the entropy of subset S_j:

    I(s_{1j}, s_{2j}, \ldots, s_{mj}) = -\sum_{i=1}^{m} p_{ij} \log_2 p_{ij}    (3)

where p_{ij} = s_{ij} / |S_j| is the probability that a sample in S_j belongs to class C_i;
S09: the information gain obtained by dividing sample set S on attribute A is

    Gain(S, A) = I(s_1, s_2, \ldots, s_m) - E(A)    (4).
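A small runnable sketch of equations (1)–(4) may help; representing each sample as a (feature dict, label) pair is an assumption of this sketch, not of the patent.

    # Runnable sketch of equations (1)-(4); the (feature_dict, label) sample
    # representation is an assumption of this sketch, not of the patent.
    import math
    from collections import Counter

    def entropy(labels):
        # Equation (1)/(3): I = -sum_i p_i * log2(p_i), with p_i estimated by s_i / s
        s = len(labels)
        return -sum((si / s) * math.log2(si / s) for si in Counter(labels).values())

    def info_gain(samples, attr):
        # S04 + equation (2): partition S by the value of A and weight each subset
        # entropy by |S_j| / |S|; equation (4): Gain(S, A) = I(S) - E(A)
        subsets = {}
        for features, label in samples:
            subsets.setdefault(features[attr], []).append(label)
        e_a = sum(len(sub) / len(samples) * entropy(sub) for sub in subsets.values())
        return entropy([label for _, label in samples]) - e_a

For instance, info_gain([({'A': 1}, 'N'), ({'A': 2}, 'P')], 'A') evaluates to 1.0, since splitting on A separates the two classes completely.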
Further, the method also comprises:
S010: if A is a continuous attribute, sort the samples of training set S from small to large according to the value of attribute A;
S011: assuming attribute A takes v different values in the training set, the sorted value sequence of attribute A is {a_1, a_2, ..., a_v}; the average of each pair of consecutive values is then taken in order as a candidate cut point, giving v - 1 cut points in total;
S012: calculate the information gain ratio of each cut point respectively, and select the cut point with the maximum information gain ratio as the local threshold;
S013: in the sequence {a_1, a_2, ..., a_v}, find the value v_max closest to but not greater than the local threshold, as the partition threshold of attribute A;
S014: select the test attribute by the method based on the information gain ratio; the information gain ratio equals the ratio of the information gain to the split information, i.e. the information gain ratio of dividing S on A is

    GainRatio(A) = Gain(A) / SplitI(A)    (5)

where SplitI(A) = -\sum_{j=1}^{v} \frac{|S_j|}{|S|} \log_2 \frac{|S_j|}{|S|} is the split information.
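Steps S010–S014 can be sketched as follows, reusing the entropy helper from the sketch above and assuming attribute A takes at least two distinct values; the split-information form used here is the standard C4.5 SplitI.

    # Sketch of S010-S014: candidate cut points are midpoints of consecutive sorted
    # values, each scored by the gain ratio of equation (5); the partition threshold
    # is the largest observed value not exceeding the best cut (S013).
    import math

    def gain_ratio_for_cut(sorted_labels, i):
        # Binary split after position i: information gain divided by split information
        n = len(sorted_labels)
        left, right = sorted_labels[:i], sorted_labels[i:]
        gain = entropy(sorted_labels) - (len(left) / n) * entropy(left) \
                                      - (len(right) / n) * entropy(right)
        split_info = -sum((len(p) / n) * math.log2(len(p) / n) for p in (left, right))
        return gain / split_info

    def partition_threshold(values, labels):
        pairs = sorted(zip(values, labels))    # S010: sort the samples by the value of A
        xs = [v for v, _ in pairs]
        ys = [l for _, l in pairs]
        best_ratio, local_threshold = -1.0, None
        for i in range(1, len(xs)):
            if xs[i - 1] == xs[i]:
                continue
            cut = (xs[i - 1] + xs[i]) / 2      # S011: midpoint of consecutive values
            ratio = gain_ratio_for_cut(ys, i)  # S012: information gain ratio of the cut
            if ratio > best_ratio:
                best_ratio, local_threshold = ratio, cut
        # S013: the largest observed value not exceeding the local threshold
        return max(v for v in xs if v <= local_threshold)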
Further, the method also comprises pruning of the decision tree:
S611: the estimated misclassification rate of a leaf node is calculated as

    e = \frac{f + \frac{z^2}{2N} + z \sqrt{\frac{f}{N} - \frac{f^2}{N} + \frac{z^2}{4N^2}}}{1 + \frac{z^2}{N}}    (6)

where f is the misclassification rate in the ordinary sense, f = E/N, E is the number of misclassified samples in the leaf node, N is the total number of samples in the current leaf, and z is the confidence limit, generally z = 0.69 for a confidence level of 0.25; the estimated misclassification rate of a subtree root node is the weighted average of the estimated misclassification rates of its leaf nodes, i.e.

    e_T = \sum_{i=1}^{k} \frac{N_i}{N} e_i    (7)

where k is the number of branches and N_i is the number of samples assigned to the i-th branch.
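Equations (6) and (7) translate directly into code; the (E_i, N_i) branch representation and the should_prune wrapper are assumptions of this sketch, with the pruning test following the standard C4.5 reading of the principle above.

    # Sketch of equations (6)-(7): pessimistic (upper-bound) error estimate of a
    # leaf, and the weighted estimate for a subtree with k branches; z = 0.69
    # corresponds to the confidence level of 0.25 used in the text.
    import math

    def leaf_error(E, N, z=0.69):
        # Equation (6): f = E / N is the ordinary misclassification rate
        f = E / N
        return (f + z * z / (2 * N)
                + z * math.sqrt(f / N - f * f / N + z * z / (4 * N * N))) / (1 + z * z / N)

    def subtree_error(branches, z=0.69):
        # Equation (7): e_T = sum_i (N_i / N) * e_i; `branches` is a list of
        # (E_i, N_i) pairs, an assumption of this sketch.
        N = sum(n for _, n in branches)
        return sum((n / N) * leaf_error(e_i, n, z) for e_i, n in branches)

    def should_prune(node_E, node_N, branches, z=0.69):
        # Prune the subtree to a leaf when the node's own estimate is not worse
        # than its leaves' combined estimate (standard C4.5 rule).
        return leaf_error(node_E, node_N, z) <= subtree_error(branches, z)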
Further, the training samples in step S1 comprise three kinds of defects: metallic protrusion, insulator surface air gap, and free metal particle.
By adopting the above technical scheme, the present invention has the following advantages:
(1) it ensures the safe and reliable operation of SF6 equipment, and is of important scientific and practical value for grasping the insulation operating condition of SF6 equipment and for building a condition-based maintenance system;
(2) it improves the recognition rate of various defect types, and can better characterize the features of various insulation defects;
(3) an intelligent insulation-defect diagnosis system based on decomposition components is established using the decision tree, which can improve the efficiency of handling insulation faults;
(4) a decision tree method of insulation defect identification is established, finding the rules and conditions for identifying fault types from component ratios;
(5) pattern recognition is performed on the acquired partial discharge using the decision tree, further improving the recognition rate of partial discharge.
Other advantages, objects and features of the present invention will be set forth to some extent in the following description and, to some extent, will be apparent to those skilled in the art upon study of the following, or may be learned from practice of the present invention. The objects and other advantages of the present invention may be realised and attained by the following specification and claims.
Brief description of the drawings
The drawings of the present invention are briefly described as follows:
Fig. 1 is the construction flow chart of the decision tree algorithm of the present invention.
Fig. 2 is the decision tree constructed by the present invention.
Fig. 3 is the distribution of the misclassified samples of the present invention.
Embodiment
The invention will be further described with reference to the accompanying drawings and examples.
Embodiment: as shown in Fig. 1 to Fig. 3, a decision tree recognition method based on sulfur hexafluoride gas partial discharge comprises the following decision tree formation flow:
S1: judge whether the training sample set is empty; if not, go to step S2; otherwise, go to step S6.
S2: judge whether the samples in the decision node belong to only one class; if not, go to step S3; otherwise, go to step S6.
The judgement flow of step S2 also comprises:
S21: calculate the information gain ratio of each attribute in the samples;
S22: find the attribute A with the maximum information gain ratio.
S3: judge whether the attribute A with the maximum information gain ratio in the samples is a continuous quantity; if not, go to step S4; otherwise, go to step S6.
S4: find the partition threshold of attribute A.
S5: grow new nodes according to attribute A, and return to step S1.
S6: take the node as a leaf node and name it with the corresponding class. Step S6 also comprises S61: calculate the estimated misclassification rate and perform pruning.
S7: form the decision tree.
The present application uses the C4.5 algorithm to generate the decision tree. The C4.5 algorithm, one of the most powerful and most widely used decision tree algorithms at present, is based on the ID3 algorithm proposed by Quinlan in 1986; it retains all the advantages of ID3 and makes a series of improvements to it, which greatly improve the performance of the algorithm.
The principle of the ID3 algorithm is that, when selecting an attribute at each level of node, the decision tree uses information entropy theory to select the attribute with the maximum information gain in the current sample set as the test attribute, and establishes branches according to the different values of that attribute, until every subset contains only data of the same class, finally obtaining a decision tree that identifies objects.
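In code, the ID3 selection rule reduces to one line over the info_gain sketch given after equation (4) above; node_samples and candidate_attrs are placeholders for the samples reaching the current node and the attributes not yet used on the path.

    # ID3's node-level rule in terms of the info_gain sketch above: take the
    # attribute with the maximum information gain as the test attribute.
    test_attribute = max(candidate_attrs, key=lambda a: info_gain(node_samples, a))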
The decision tree is generated using the C4.5 algorithm; the generation flow is as follows:
S01: let S be a set of s data samples, the data samples belonging to m different classes C_i (i = 1, ..., m);
S02: let s_i be the number of samples in class C_i; for a given sample set, the total information entropy is

    I(s_1, s_2, \ldots, s_m) = -\sum_{i=1}^{m} p_i \log_2 p_i    (1)

where p_i is the probability that an arbitrary sample belongs to C_i, estimated by s_i / s;
S03: let A be an attribute of the samples, attribute A having v different values {a_1, a_2, ..., a_v};
S04: divide S by attribute A into v subsets {S_1, S_2, ..., S_v}, where S_j contains the samples in S whose value of attribute A is a_j;
S05: if A is selected as the test attribute, these subsets are exactly the branches grown from the node of sample set S.
The decision tree generation also comprises:
S06: let s_{ij} be the number of samples of class C_i in subset S_j;
S07: the entropy of the subsets into which attribute A divides S is

    E(A) = \sum_{j=1}^{v} \frac{s_{1j} + s_{2j} + \cdots + s_{mj}}{s} I(s_{1j}, s_{2j}, \ldots, s_{mj})    (2)

where (s_{1j} + s_{2j} + \cdots + s_{mj}) / s is the weight of subset S_j, equal to the number of samples in S_j divided by the total number of samples in S; the smaller the entropy, the higher the purity of the subset division;
S08: I(s_{1j}, s_{2j}, \ldots, s_{mj}) is the entropy of subset S_j:

    I(s_{1j}, s_{2j}, \ldots, s_{mj}) = -\sum_{i=1}^{m} p_{ij} \log_2 p_{ij}    (3)

where p_{ij} = s_{ij} / |S_j| is the probability that a sample in S_j belongs to class C_i;
S09: the information gain obtained by dividing sample set S on attribute A is

    Gain(S, A) = I(s_1, s_2, \ldots, s_m) - E(A)    (4).
At each node, the ID3 algorithm selects the attribute with the maximum information gain Gain(S, A) as the test attribute. The advantages of the algorithm are that it is simple and has strong learning ability. Its shortcoming is that it tends to select attributes with many values, although in most cases the attribute with more values is not necessarily the optimal one. In addition, the ID3 algorithm is only effective for relatively small data sets, is rather sensitive to noise, and the decision tree may change as the training data set grows.
The C4.5 algorithm makes a series of improvements to the ID3 algorithm. First, it can handle continuous attributes; its basic idea is to divide the value domain of a continuous attribute into a set of discrete intervals. The handling is as follows:
S010: if A is a continuous attribute, sort the samples of training set S from small to large according to the value of attribute A;
S011: assuming attribute A takes v different values in the training set, the sorted value sequence of attribute A is {a_1, a_2, ..., a_v}; the average of each pair of consecutive values is then taken in order as a candidate cut point, giving v - 1 cut points in total;
S012: calculate the information gain ratio of each cut point respectively, and select the cut point with the maximum information gain ratio as the local threshold;
S013: in the sequence {a_1, a_2, ..., a_v}, find the value v_max closest to but not greater than the local threshold, as the partition threshold of attribute A;
S014: select the test attribute by the method based on the information gain ratio; the information gain ratio equals the ratio of the information gain to the split information, i.e. the information gain ratio of dividing S on A is

    GainRatio(A) = Gain(A) / SplitI(A)    (5)

where SplitI(A) = -\sum_{j=1}^{v} \frac{|S_j|}{|S|} \log_2 \frac{|S_j|}{|S|} is the split information.
The information gain ratios of all attributes in the current candidate attribute set are obtained by the above method, and the attribute with the highest information gain ratio is taken as the test attribute; the sample set is divided into several sub-sample sets, and the same method is applied to each sub-sample set to continue the division, until the subsets are indivisible or a stopping condition is reached.
The method also comprises pruning of the decision tree. Pruning means replacing a whole subtree with a single leaf node: the complete tree is built first and then trimmed back. The pruning principle is that if the estimated misclassification rate of the subtree root node after pruning is not greater than the combined estimated misclassification rate of its leaves before pruning, the subtree is pruned; otherwise it is not pruned.
S611: the estimated misclassification rate of a leaf node is calculated as

    e = \frac{f + \frac{z^2}{2N} + z \sqrt{\frac{f}{N} - \frac{f^2}{N} + \frac{z^2}{4N^2}}}{1 + \frac{z^2}{N}}    (6)

where f is the misclassification rate in the ordinary sense, f = E/N, E is the number of misclassified samples in the leaf node, N is the total number of samples in the current leaf, and z is the confidence limit, generally z = 0.69 for a confidence level of 0.25; the estimated misclassification rate of a subtree root node is the weighted average of the estimated misclassification rates of its leaf nodes, i.e.

    e_T = \sum_{i=1}^{k} \frac{N_i}{N} e_i    (7)

where k is the number of branches and N_i is the number of samples assigned to the i-th branch.
The training samples in step S1 comprise three kinds of defects: metallic protrusion, insulator surface air gap, and free metal particle.
Because the insulator surface contamination defect does not decompose the SF6 gas, the present application extracts the component content ratios c(SOF2)/c(SO2F2), c(CF4)/c(CO2) and c(SOF2+SO2F2)/c(CO2+CF4) as characteristic quantities. Using 24 groups of SF6 decomposition component data obtained under the three defect types (metallic protrusion, insulator surface air gap and free metal particle) as training samples, a decision tree is built and pruned according to the principles described above; the minimum number of samples per node is set to 2, the confidence factor is set to 0.25, and the classification accuracy of the decision tree is measured by ten-fold cross-validation.
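The training setup just described can be approximated with off-the-shelf tools; a hedged sketch follows. scikit-learn's DecisionTreeClassifier implements CART rather than C4.5 (so the confidence-factor pruning is not reproduced), and the data array here is a random placeholder rather than the patent's 24 groups of decomposition data.

    # Approximate reproduction of the training setup described above. Stand-ins:
    # DecisionTreeClassifier is CART, not C4.5, and X is placeholder data, not
    # the patent's SF6 decomposition component ratios.
    import numpy as np
    from sklearn.model_selection import KFold, cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X = rng.random((24, 3))            # 24 samples x 3 ratio features (placeholder)
    y = np.repeat(["N", "P", "G"], 8)  # protrusion / air gap / free particle

    clf = DecisionTreeClassifier(criterion="entropy", min_samples_leaf=2)
    scores = cross_val_score(clf, X, y,
                             cv=KFold(n_splits=10, shuffle=True, random_state=0))
    print(f"10-fold CV accuracy: {scores.mean():.3f}")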
From the decision tree generation result, of the three characteristic quantities input, the final decision tree used only the two component content ratio characteristic quantities c(SOF2)/c(SO2F2) and c(CF4)/c(CO2). This shows that the test data have good discrimination: only two characteristic quantities are needed to identify the three kinds of defects. Following the principle of maximum information gain ratio, the C4.5 algorithm selected the two characteristic quantities c(SOF2)/c(SO2F2) and c(CF4)/c(CO2), and discarded c(SOF2+SO2F2)/c(CO2+CF4).
For the decision tree of Fig. 2, another set of test data is used as test samples to verify its classification performance, namely 24 groups of SF6 decomposition component data under the three defect types. The recognition results are shown in Table 1.
Table 1. Decision tree recognition results

Defect type         N class   P class   G class   Total
Sample number       8         8         8         24
Identified number   8         6         7         21
Recognition rate    100%      75.0%     87.5%     87.5%
It can be seen from the recognition results that the typical insulation defects simulated in the laboratory are identified by the resulting decision tree with an overall recognition rate of 87.50%, achieving a fairly good recognition effect.
Table 2. Confusion matrix of the recognition results

Actual \ Identified   N class   P class   G class
N class               8         0         0
P class               0         6         2
G class               0         1         7

Except for the N-class defect, all of whose samples are identified correctly, every other defect type has misclassified samples. Table 2 gives the confusion matrix of the recognition results: 2 groups of P-class defects are confused with the G class, and 1 group of G-class defects is confused with the P class. If these misclassified samples are plotted in the coordinate plane, as in Fig. 3, it can be seen that they all lie near the boundary between the two defect types; this is because a decision tree boundary is a single hard threshold value, so objects near the boundary are easily misidentified.
The beneficial effects of the invention are as follows. A decision tree recognition method using SF6 decomposition component content ratios as characteristic quantities is proposed, and a PD pattern recognition decision tree with c(SOF2)/c(SO2F2) and c(CF4)/c(CO2) as characteristic quantities is constructed; its decision process is shown in Fig. 2. It is further proposed that, when the recognition rate is not high, c(SOF2+SO2F2)/c(CO2+CF4) can be used as an auxiliary characteristic quantity to further improve the recognition rate of partial discharge. Pattern recognition was performed with this decision tree on partial discharges obtained in the laboratory, achieving satisfactory results. In addition, when the component contents do not exceed their limits and an ultra-high-frequency signal is present, there is an insulator surface contamination defect in the equipment; when the component contents do not exceed their limits and there is no ultra-high-frequency signal, the equipment is normal.
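The final judgement described above reduces to a small rule; a sketch under the stated readings follows, with the content-exceeded branch deferring to the decision tree of Fig. 2 (an assumption of this sketch, since the patent only states the two non-exceeded cases).

    # Sketch of the auxiliary judgement above; thresholds and the fall-through
    # branch are assumptions of this sketch.
    def auxiliary_diagnosis(content_exceeded: bool, uhf_signal: bool) -> str:
        if not content_exceeded and uhf_signal:
            return "insulator surface contamination defect"
        if not content_exceeded and not uhf_signal:
            return "normal"
        return "apply the decomposition-based decision tree (Fig. 2)"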
Finally, it should be noted that the above embodiments merely illustrate the technical scheme of the present invention without limiting it. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art will understand that modifications or equivalent substitutions may be made to the technical scheme of the present invention without departing from its objective and scope, all of which shall be covered by the scope of the claims of the present invention.

Claims (8)

1. A decision tree recognition method based on sulfur hexafluoride gas partial discharge, characterised in that the decision tree formation flow is as follows:
S1: judging whether the training sample set is empty; if not, going to step S2; otherwise, going to step S6;
S2: judging whether the samples in the decision node belong to only one class; if not, going to step S3; otherwise, going to step S6;
S3: judging whether the attribute A with the maximum information gain ratio in the samples is a continuous quantity; if not, going to step S4; otherwise, going to step S6;
S4: finding the partition threshold of attribute A;
S5: growing new nodes according to attribute A, and returning to step S1;
S6: taking the node as a leaf node and naming it with the corresponding class;
S7: forming the decision tree.
2. The decision tree recognition method based on sulfur hexafluoride gas partial discharge according to claim 1, characterised in that the judgement flow of step S2 further comprises:
S21: calculating the information gain ratio of each attribute in the samples;
S22: finding the attribute A with the maximum information gain ratio.
3. The decision tree recognition method based on sulfur hexafluoride gas partial discharge according to claim 2, characterised in that step S6 further comprises:
S61: calculating the estimated misclassification rate and performing pruning.
4. The decision tree recognition method based on sulfur hexafluoride gas partial discharge according to claim 3, characterised in that the decision tree is generated using the C4.5 algorithm, the generation flow being as follows:
S01: letting S be a set of s data samples, the data samples belonging to m different classes C_i (i = 1, ..., m);
S02: letting s_i be the number of samples in class C_i, the total information entropy for a given sample set being

    I(s_1, s_2, \ldots, s_m) = -\sum_{i=1}^{m} p_i \log_2 p_i    (1)

where p_i is the probability that an arbitrary sample belongs to C_i, estimated by s_i / s;
S03: letting A be an attribute of the samples, attribute A having v different values {a_1, a_2, ..., a_v};
S04: dividing S by attribute A into v subsets {S_1, S_2, ..., S_v}, where S_j contains the samples in S whose value of attribute A is a_j;
S05: if A is selected as the test attribute, these subsets being exactly the branches grown from the node of sample set S.
5. The decision tree recognition method based on sulfur hexafluoride gas partial discharge according to claim 4, characterised in that the decision tree generation further comprises:
S06: letting s_{ij} be the number of samples of class C_i in subset S_j;
S07: the entropy of the subsets into which attribute A divides S being

    E(A) = \sum_{j=1}^{v} \frac{s_{1j} + s_{2j} + \cdots + s_{mj}}{s} I(s_{1j}, s_{2j}, \ldots, s_{mj})    (2)

where (s_{1j} + s_{2j} + \cdots + s_{mj}) / s is the weight of subset S_j, equal to the number of samples in S_j divided by the total number of samples in S, a smaller entropy meaning a higher purity of the subset division;
S08: I(s_{1j}, s_{2j}, \ldots, s_{mj}) being the entropy of subset S_j:

    I(s_{1j}, s_{2j}, \ldots, s_{mj}) = -\sum_{i=1}^{m} p_{ij} \log_2 p_{ij}    (3)

where p_{ij} = s_{ij} / |S_j| is the probability that a sample in S_j belongs to class C_i;
S09: the information gain obtained by dividing sample set S on attribute A being

    Gain(S, A) = I(s_1, s_2, \ldots, s_m) - E(A)    (4).
6. The decision tree recognition method based on sulfur hexafluoride gas partial discharge according to claim 5, characterised by further comprising:
S010: if A is a continuous attribute, sorting the samples of training set S from small to large according to the value of attribute A;
S011: assuming attribute A takes v different values in the training set, the sorted value sequence of attribute A being {a_1, a_2, ..., a_v}, and the average of each pair of consecutive values being taken in order as a candidate cut point, giving v - 1 cut points in total;
S012: calculating the information gain ratio of each cut point respectively, and selecting the cut point with the maximum information gain ratio as the local threshold;
S013: finding, in the sequence {a_1, a_2, ..., a_v}, the value v_max closest to but not greater than the local threshold, as the partition threshold of attribute A;
S014: selecting the test attribute by the method based on the information gain ratio, the information gain ratio being equal to the ratio of the information gain to the split information, i.e. the information gain ratio of dividing S on A being

    GainRatio(A) = Gain(A) / SplitI(A)    (5)

where SplitI(A) = -\sum_{j=1}^{v} \frac{|S_j|}{|S|} \log_2 \frac{|S_j|}{|S|} is the split information.
7. The decision tree recognition method based on sulfur hexafluoride gas partial discharge according to claim 6, characterised by further comprising pruning of the decision tree:
S611: the estimated misclassification rate of a leaf node being calculated as

    e = \frac{f + \frac{z^2}{2N} + z \sqrt{\frac{f}{N} - \frac{f^2}{N} + \frac{z^2}{4N^2}}}{1 + \frac{z^2}{N}}    (6)

where f is the misclassification rate in the ordinary sense, f = E/N, E is the number of misclassified samples in the leaf node, N is the total number of samples in the current leaf, and z is the confidence limit, generally z = 0.69 for a confidence level of 0.25; the estimated misclassification rate of a subtree root node being the weighted average of the estimated misclassification rates of its leaf nodes, i.e.

    e_T = \sum_{i=1}^{k} \frac{N_i}{N} e_i    (7)

where k is the number of branches and N_i is the number of samples assigned to the i-th branch.
8. The decision tree recognition method based on sulfur hexafluoride gas partial discharge according to claim 1, characterised in that the training samples in step S1 comprise three kinds of defects: metallic protrusion, insulator surface air gap, and free metal particle.
CN201711044243.5A 2017-10-31 2017-10-31 Decision tree recognition method based on sulfur hexafluoride gas partial discharge Pending CN107817427A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711044243.5A CN107817427A (en) 2017-10-31 2017-10-31 Decision tree recognition method based on sulfur hexafluoride gas partial discharge

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711044243.5A CN107817427A (en) 2017-10-31 2017-10-31 Decision tree recognition method based on sulfur hexafluoride gas partial discharge

Publications (1)

Publication Number Publication Date
CN107817427A true CN107817427A (en) 2018-03-20

Family

ID=61603026

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711044243.5A 2017-10-31 Decision tree recognition method based on sulfur hexafluoride gas partial discharge

Country Status (1)

Country Link
CN (1) CN107817427A (en)


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘帆 (LIU Fan): "Decomposition characteristics of sulfur hexafluoride under partial discharge, discharge type identification and correction of influencing factors", China Doctoral Dissertations Full-text Database, Engineering Science and Technology II *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108805295A * 2018-03-26 2018-11-13 海南电网有限责任公司电力科学研究院 A fault diagnosis method based on a decision tree algorithm
CN109459522A * 2018-12-29 2019-03-12 云南电网有限责任公司电力科学研究院 A transformer failure prediction method and device based on the ID3 algorithm
CN110220602A * 2019-06-24 2019-09-10 广西电网有限责任公司电力科学研究院 A switchgear overheating fault recognition method
CN110220602B (en) * 2019-06-24 2020-08-25 广西电网有限责任公司电力科学研究院 Switch cabinet overheating fault identification method
CN112305354A (en) * 2020-10-23 2021-02-02 海南电网有限责任公司电力科学研究院 Method for diagnosing defect type of sulfur hexafluoride insulation electrical equipment

Similar Documents

Publication Publication Date Title
CN107817427A (en) Decision tree recognition method based on sulfur hexafluoride gas partial discharge
CN107301296B (en) Data-based qualitative analysis method for circuit breaker fault influence factors
Tang et al. Partial discharge recognition through an analysis of SF6 decomposition products part 2: feature extraction and decision tree-based pattern recognition
CN109444656B (en) Online diagnosis method for deformation position of transformer winding
CN103076547B (en) Method for identifying GIS (Gas Insulated Switchgear) local discharge fault type mode based on support vector machines
CN110687393B (en) Valve short-circuit protection fault positioning method based on VMD-SVD-FCM
CN109142969A (en) A power transmission line fault phase selection method based on a continuous hidden Markov model
CN105701470A (en) Analog circuit fault characteristic extraction method based on optimal wavelet packet decomposition
CN106199351A (en) Classification method and device for partial discharge signals
CN109470985A (en) A voltage sag source identification method based on multi-resolution singular value decomposition
Omar et al. Fault classification on transmission line using LSTM network
CN106443380B (en) A distribution cable partial discharge signal recognition method and device
CN108805295A (en) A fault diagnosis method based on a decision tree algorithm
CN112861417A (en) Transformer fault diagnosis method based on weighted sum selective naive Bayes
CN115600088A (en) Distribution transformer fault diagnosis method based on vibration signals
CN115618249A (en) Low-voltage power distribution station area phase identification method based on LargeVis dimension reduction and DBSCAN clustering
CN114091549A (en) Equipment fault diagnosis method based on deep residual error network
CN116243115A (en) High-voltage cable mode identification method and device based on time sequence topology data analysis
CN112381667B (en) Distribution network electrical topology identification method based on deep learning
CN113866552B (en) Medium voltage distribution network user electricity consumption abnormality diagnosis method based on machine learning
CN114397569A (en) Circuit breaker fault arc detection method based on VMD parameter optimization and sample entropy
CN114386024A (en) Power intranet terminal equipment abnormal attack detection method based on ensemble learning
CN117171544B (en) Motor vibration fault diagnosis method based on multichannel fusion convolutional neural network
CN112817954A (en) Missing value interpolation method based on multi-method ensemble learning
CN116503710A (en) GIS partial discharge type identification method based on self-adaptive convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180320