CN106570526A - Classifier integration method for power transmission and transformation primary device load curve mining - Google Patents
- Publication number
- CN106570526A CN106570526A CN201610958899.7A CN201610958899A CN106570526A CN 106570526 A CN106570526 A CN 106570526A CN 201610958899 A CN201610958899 A CN 201610958899A CN 106570526 A CN106570526 A CN 106570526A
- Authority
- CN
- China
- Prior art keywords
- dimensional
- classification
- classification results
- tuple
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A classifier integration method for power transmission and transformation primary device load curve mining comprises the following steps: perform k iterations of sampling with replacement on a data set D to obtain k training sets Di; apply machine learning to the training sets Di to obtain classification models M and corresponding classifiers; input the data set D into the classification models M to obtain classification results Q; and classify the classification results to determine the nature of each tuple. The benefit is that the integrated (ensemble) classifier model built by this method effectively improves the accuracy of identifying abnormal equipment-load data. Using the load-abnormality ranking of primary power transmission and transformation devices produced by the method, inspection effort can be focused on the top-ranked devices, which effectively improves device operation and maintenance efficiency and saves cost for electric power companies.
Description
Technical field
The present invention relates to big-data mining applications in power systems, and in particular to the mining of abnormal objects in the load curves of primary power transmission and transformation equipment.
Background technology
Mining the load curves of primary power transmission and transformation equipment yields much useful information; one application is the identification of abnormal load data of such equipment. Faults in the primary equipment itself, failures of load-acquisition devices, communication anomalies, data-storage problems, and human factors such as electricity theft may all cause abnormalities in the load data. At present, electric power companies mostly handle this by on-site inspection, which requires substantial human resources and increases operating costs. Effective identification of abnormal load data, together with accurate judgment of the various factors causing the anomalies, enables quick localization of the link that caused the data anomaly, improves equipment operation and maintenance (O&M) efficiency, and reduces expenditure for electric power companies.
Summary of the invention
To solve the above problems, the invention provides a classifier integration method for primary power transmission and transformation equipment load curve mining. The specific design scheme is as follows:
A classifier integration method for primary power transmission and transformation equipment load curve mining, whose integration steps are:

Step 1: for a data set D containing p tuples d_j (j = 1, 2, …, p), perform k iterations of sampling with replacement on D to obtain k training sets D_i (i = 1, 2, …, k).

Step 2: apply machine learning to the k training sets D_i using n learning methods, obtaining n classifiers M_n (n = 1, 2, 3); each classifier M_n contains k classification models M_ni.

Step 3: iteratively input each tuple d_j of data set D into each classification model M_ni, obtaining p*k*n output results d_jni.

Step 4: according to the output results d_jni, iteratively apply a binary classification to each tuple d_j in the first dimension; each tuple obtains k*n one-dimensional classification results Q_jni.

Step 5: according to the one-dimensional classification results Q_jni, iteratively apply a binary classification to each tuple d_j in the second dimension; each tuple obtains n two-dimensional classification results Q_jn.

Step 6: according to the two-dimensional classification results Q_jn, apply a binary classification to each tuple d_j in the third dimension; each tuple obtains one three-dimensional classification result Q_j.

Step 7: according to the three-dimensional classification result Q_j, determine the classification of each tuple d_j in data set D.
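The seven steps above amount to a bagging (bootstrap-aggregating) ensemble with a two-level majority vote. A minimal pure-Python sketch of the pipeline follows; the toy one-dimensional data set and the trivial threshold "learners" (standing in for the decision-tree, Bayesian, and SVM models) are illustrative assumptions, not part of the patent:

```python
import random

random.seed(0)

# Toy data set D: p tuples (feature, label), label 1 = normal, -1 = abnormal.
# A hypothetical stand-in for the extracted load-curve attribute tuples.
D = [(random.gauss(0, 1), 1) for _ in range(90)] + \
    [(random.gauss(4, 1), -1) for _ in range(10)]

k = 5  # number of bootstrap training sets per learning method

def bootstrap(data):
    """Step 1: sample len(data) tuples with replacement."""
    return [random.choice(data) for _ in data]

def train_threshold(train):
    """A trivial 'learner': threshold midway between the class means."""
    norm = [x for x, y in train if y == 1]
    abno = [x for x, y in train if y == -1]
    if not norm or not abno:          # guard against a degenerate bootstrap sample
        return lambda x: 1
    t = (sum(norm) / len(norm) + sum(abno) / len(abno)) / 2
    return lambda x, t=t: 1 if x < t else -1

# Step 2: n learning methods (here n = 3 copies of the toy learner, standing in
# for decision tree, naive Bayes, and SVM), each yielding k models.
learners = [train_threshold, train_threshold, train_threshold]
classifiers = [[fit(bootstrap(D)) for _ in range(k)] for fit in learners]

def classify(x):
    """Steps 3-7: sign vote inside each classifier, then across classifiers."""
    votes_per_classifier = []
    for models in classifiers:
        s = sum(m(x) for m in models)              # step 5: within-classifier vote
        votes_per_classifier.append(1 if s > 0 else -1)
    return 1 if sum(votes_per_classifier) > 0 else -1   # step 6: final vote

errors = sum(1 for x, y in D if classify(x) != y)
print("training errors:", errors, "of", len(D))
```

Because k and n are both odd, neither vote can tie, which is why the method needs no tie-breaking rule.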
In step 4, the first dimension is the tuple dimension, and the binary classification in the first dimension proceeds as follows:

Step 4-1: take iteration indices j = 1, i = 1, n = 1 and input tuple d_1 into classification model M_11, obtaining a one-dimensional classification result d_111. The binary elements of classification model M_11 are the class element r and the class element w: when the output value for d_1 is normal it is assigned class element r, and when it is abnormal it is assigned class element w, so the classification result d_111 is one of d_111r and d_111w.

Step 4-2: set j = j + 1 (keeping i = 1, n = 1), input tuple d_2 into classification model M_11 for binary classification, and obtain a one-dimensional classification result d_211, which is one of d_211r and d_211w.

Step 4-3: repeat steps 4-1 and 4-2 for a total of p iterations, obtaining p classification results d_j11.

Step 4-4: set i = i + 1 and repeat steps 4-1 through 4-3, obtaining p*k classification results d_j1i.

Step 4-5: set n = n + 1 and repeat steps 4-1 through 4-4, obtaining p*k*n classification results d_jni.
In step 5, the second dimension is the classification-model dimension, and the binary classification in the second dimension proceeds as follows:

Step 5-1: group the one-dimensional classification results Q_jni sharing the same j value and the same n value into one-dimensional sorted groups F, obtaining p*n groups F_jn, each containing k one-dimensional classification results Q_jni.

Step 5-2: set j = 1, n = 1 and take group F_11; within F_11, set i = 1 and judge the kind of the one-dimensional classification result Q_111: if it is Q_111r, set Q_111 = 1; if it is Q_111w, set Q_111 = -1.

Step 5-3: with j = 1, within group F_11, set i = i + 1 and judge the kind of Q_112: if it is Q_112r, set Q_112 = 1; if it is Q_112w, set Q_112 = -1.

Step 5-4: repeat step 5-3 until the values of all k one-dimensional classification results in group F_11 are obtained, then sum them to obtain the two-dimensional classification value ΣQ_11. If ΣQ_11 > 0, all one-dimensional classification results Q_j1i in group F_11 yield the two-dimensional classification result Q_j1r; if ΣQ_11 < 0, they yield the two-dimensional classification result Q_j1w.

Step 5-5: set j = j + 1 (n = 1) and repeat steps 5-2 through 5-4, obtaining p two-dimensional classification results Q_j1.

Step 5-6: set n = n + 1 and repeat step 5-5, obtaining p*n two-dimensional classification results Q_jn.
In step 6, the third dimension is the classifier dimension, and the binary classification in the third dimension proceeds as follows:

Step 6-1: group the two-dimensional classification results Q_jn sharing the same j value into sorted groups F, obtaining p two-dimensional sorted groups F_j, each containing n two-dimensional classification results Q_jn.

Step 6-2: set j = 1 and take group F_1; within F_1, set n = 1 and judge the kind of the two-dimensional classification result Q_11: if it is Q_11r, set Q_11 = 1; if it is Q_11w, set Q_11 = -1.

Step 6-3: with j = 1, within group F_1, set n = n + 1 and judge the kind of Q_12: if it is Q_12r, set Q_12 = 1; if it is Q_12w, set Q_12 = -1.

Step 6-4: repeat step 6-3 until the values of all n two-dimensional classification results in group F_1 are obtained, then sum them to obtain the three-dimensional classification value ΣQ_1. If ΣQ_1 > 0, all two-dimensional classification results Q_jn in group F_1 yield the three-dimensional classification result Q_jr; if ΣQ_1 < 0, they yield the three-dimensional classification result Q_jw.

Step 6-5: set j = j + 1 and repeat steps 6-2 through 6-4, obtaining p three-dimensional classification results Q_j; each tuple d_j corresponds to one three-dimensional classification result with the same j value.

If the three-dimensional classification result is Q_jr, the tuple d_j with the same j value is judged to be a normal tuple; if it is Q_jw, the tuple d_j with the same j value is judged to be an abnormal tuple.
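Steps 4 through 6 reduce to mapping each model output r/w to ±1 and taking the sign of the sum, twice. A sketch with hypothetical one-dimensional results for a single tuple (k = 5, n = 3 here purely for readability):

```python
# Hypothetical one-dimensional results Q_jni for one tuple d_j:
# n = 3 classifiers, k = 5 models each; 'r' = normal, 'w' = abnormal.
Q_1d = {
    1: ['r', 'r', 'w', 'r', 'r'],   # classifier M_1
    2: ['w', 'w', 'r', 'w', 'w'],   # classifier M_2
    3: ['r', 'r', 'r', 'w', 'r'],   # classifier M_3
}

def sign_vote(values):
    """Map 'r' -> +1, 'w' -> -1 and return the class of the sign of the sum.

    With an odd number of votes the sum is never zero."""
    s = sum(1 if v == 'r' else -1 for v in values)
    return 'r' if s > 0 else 'w'

# Step 5: vote within each classifier (second dimension).
Q_2d = {n: sign_vote(votes) for n, votes in Q_1d.items()}

# Step 6: vote across classifiers (third dimension) for the final verdict.
Q_3d = sign_vote(Q_2d.values())

print(Q_2d, Q_3d)   # 'r' means tuple d_j is judged normal
```

Here classifier M_2 is outvoted by M_1 and M_3, so the tuple is judged normal overall.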
The sampling of data set D is with replacement; each training set D_i contains p tuples d, and a given tuple d may appear in training set D_i zero times, once, or multiple times.
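The zero/one/multiple-occurrence property of bootstrap sampling is easy to demonstrate; the data set size below is an arbitrary illustration:

```python
import random

random.seed(1)

p = 1000
D = list(range(p))                            # tuple indices in data set D
Di = [random.choice(D) for _ in range(p)]     # one bootstrap training set of size p

appears = set(Di)
missing = p - len(appears)                    # tuples that never appear in D_i
repeated = sum(1 for t in appears if Di.count(t) > 1)

# With replacement, about 1/e (roughly 36.8%) of tuples are expected
# to be absent from any given training set D_i.
print(missing / p, repeated)
```

This absent fraction is what gives each of the k training sets a different view of D, and hence gives the k models of a classifier some diversity to vote over.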
The classification models M include a decision-tree model, whose classification steps are:
treat the training set D_i as a node N;
analyze the variable partitioning schemes of training set D_i and determine multiple candidate partitions of N;
select the optimal partitioning scheme and split node N into N_1 and N_2;
apply the same partitioning method to nodes N_1 and N_2;
repeat the previous step, partitioning training set D_i level by level to form a decision tree;
test the decision tree with the remaining k-1 training sets D_i;
prune and grow nodes according to the test results to determine the final decision tree.
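At its smallest scale, the recursive splitting described above is a single-split "stump": choose the threshold that minimizes misclassification on the node's data, then recurse on the two children. A sketch on a toy one-dimensional training set (illustrative, not the patent's tree-induction algorithm):

```python
def best_split(points):
    """Exhaustively choose the threshold minimizing misclassification.

    points: list of (x, label) with label in {+1, -1}; the split predicts
    +1 below the threshold and -1 at or above it."""
    candidates = sorted(set(x for x, _ in points))
    best_t, best_err = None, len(points) + 1
    for t in candidates:
        err = sum(1 for x, y in points
                  if (1 if x < t else -1) != y)
        if err < best_err:
            best_t, best_err = t, err
    return best_t, best_err

# Toy node N: three normal (+1) and two abnormal (-1) tuples.
train = [(0.5, 1), (1.0, 1), (1.5, 1), (3.0, -1), (3.5, -1)]
t, err = best_split(train)
print(t, err)   # t splits node N into N1 (x < t) and N2 (x >= t)
```

A full tree repeats this split on N1 and N2 until a stopping rule fires, then prunes using held-out data, as the steps above describe.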
The classification models M include a Bayesian classification model, whose expression is:

P(d_H | d_X) = P(d_X | d_H) · P(d_H) / P(d_X)

where d_H and d_X are two attribute variables, P(d_H) is the prior probability of d_H, P(d_H | d_X) is the posterior probability of d_H, P(d_X) is the prior probability of d_X, and P(d_X | d_H) is the conditional probability of d_X given d_H.
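Bayes' rule can be checked numerically. The prior below reuses the example's 1.24% abnormal-object ratio; the detection and false-alarm rates are assumed figures for illustration only:

```python
# Illustrative events: d_H = "tuple is abnormal", d_X = "a model flags it".
p_H = 0.0124           # prior: abnormal-object ratio from the example data set
p_X_given_H = 0.95     # assumed detection rate of a single model
p_X_given_notH = 0.05  # assumed false-alarm rate

# Total probability of a flag, then Bayes' rule for P(d_H | d_X).
p_X = p_X_given_H * p_H + p_X_given_notH * (1 - p_H)
p_H_given_X = p_X_given_H * p_H / p_X

print(round(p_H_given_X, 4))  # posterior is far above the 1.24% prior
```

Even with these assumed rates the posterior stays well below 1, which is one motivation for aggregating many models rather than trusting a single flag.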
The classification models M include a support vector machine (SVM) model. The SVM models comprise a linearly separable SVM model and a linearly inseparable SVM model. The optimization problem of the linearly separable SVM model can be solved by convex quadratic programming. The linearly inseparable SVM model maps the input data from the low-dimensional space into a high-dimensional space through a kernel function, and then finds the optimal separating hyperplane in the new high-dimensional space. The kernel functions include the polynomial kernel, the Gaussian radial basis function (RBF) kernel, and the sigmoid kernel.
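The three kernels named above have standard closed forms; the parameter values below (degree, gamma, alpha) are arbitrary illustrations, not values specified by the patent:

```python
import math

def polynomial_kernel(x, z, c=1.0, degree=3):
    """K(x, z) = (x . z + c) ** degree"""
    return (sum(a * b for a, b in zip(x, z)) + c) ** degree

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian RBF: K(x, z) = exp(-gamma * ||x - z||**2)"""
    sq = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq)

def sigmoid_kernel(x, z, alpha=0.1, c=0.0):
    """K(x, z) = tanh(alpha * x . z + c)"""
    return math.tanh(alpha * sum(a * b for a, b in zip(x, z)) + c)

x, z = [1.0, 2.0], [2.0, 0.5]
print(polynomial_kernel(x, z), rbf_kernel(x, z), sigmoid_kernel(x, z))
```

Each kernel plays the role of an inner product in the high-dimensional space, so the SVM never has to construct that space explicitly.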
The classifier integration method for primary power transmission and transformation equipment load curve mining obtained by the above technical scheme of the present invention has the following advantages: the ensemble classifier model built by this method effectively improves the accuracy of identifying abnormal equipment-load data. Using the load-abnormality ranking produced by this method, the top-ranked devices can be inspected with priority, which effectively improves equipment O&M efficiency and reduces expenditure for electric power companies.
Description of the drawings

Fig. 1 compares the receiver operating characteristic (ROC) curves of the data-set statistics obtained by the classifier integration method of the present invention with those obtained by the decision-tree classification model;
Fig. 2 compares the corresponding detection counts and recall;
Fig. 3 compares the ROC curves of the method's data-set statistics with those obtained by the Bayesian classification model;
Fig. 4 compares the corresponding detection counts and recall;
Fig. 5 compares the ROC curves of the method's data-set statistics with those obtained by the support-vector-machine classification model;
Fig. 6 compares the corresponding detection counts and recall;
Fig. 7 is a scatter plot of the suspected-abnormality probability of the load data of all primary equipment objects, as predicted by the bagging classifier model group obtained from the SVM model;
Fig. 8 shows the distribution of the primary equipment objects with abnormal load data in the present invention.

In the figures: ctree, decision-tree classification model; nb, Bayesian classification model; svm, SVM classification model; ctreeBag, the bagging classifier model group based on the decision-tree model; nbBag, the bagging classifier model group based on the Bayesian model; svmBag, the bagging classifier model group based on the SVM model.
Specific embodiment
The present invention is specifically described below in conjunction with the accompanying drawings.
Take as an example a data set of load data for 6,200 primary power transmission and transformation equipment objects over 18 months, sampled every 30 minutes. Since this work mainly studies abnormality in long-term equipment load data, the unit of time studied is one month: the raw data set is processed to compute the monthly average load (over 30 days) of each device object, reflecting its data characteristics. This example therefore contains 6,200 × 18 = 111,600 load curves. The 6,200 power transmission and transformation equipment objects comprise 6,123 normal objects and 77 abnormal objects, an abnormal-object ratio of 1.24%. The model's input is the raw data set; its output is the degree of abnormality of the machine-load data and a ranking by suspected-abnormality probability.
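The preprocessing step reduces each device-month of 30-minute samples to one monthly-average feature. A sketch with synthetic readings, assuming 48 samples per day and the 30-day month used in the example (the load values themselves are made up):

```python
import random

random.seed(2)

SAMPLES_PER_DAY = 48    # one reading every 30 minutes
DAYS_PER_MONTH = 30     # the 30-day month used in the example

# Synthetic raw load readings for one device over one month (illustrative).
readings = [100 + random.gauss(0, 10)
            for _ in range(SAMPLES_PER_DAY * DAYS_PER_MONTH)]

monthly_average_load = sum(readings) / len(readings)
print(len(readings), round(monthly_average_load, 1))  # 1440 samples -> one feature
```

Repeating this over 6,200 devices and 18 months yields the 111,600 curves (and hence tuples) that form data set D.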
Embodiment 1

Step 1: feature extraction is performed on all load curves in this example, obtaining the attribute set D of the corresponding curves, which contains 111,600 tuples d in total.

Sampling with replacement is performed on set D for k = 31 rounds, obtaining 31 different training sets D_i (i = 1, 2, …, 31). Because sampling is with replacement, some tuples d of D may not appear in a given D_i, while other tuples may appear multiple times.

Step 2: decision-tree machine learning is applied to the 31 training sets D_i, obtaining 31 classification models M_1i, which together form classifier M_1. Naive Bayes machine learning is applied to the 31 training sets D_i, obtaining 31 classification models M_2i, which form classifier M_2. Support-vector-machine learning is applied to the 31 training sets D_i, obtaining 31 classification models M_3i, which form classifier M_3. After this machine learning, 3 × 31 = 93 classification models have been established.

The 111,600 tuples d in data set D are input into the 93 classification models, yielding 111,600 × 93 output results; each tuple d corresponds to 93 one-dimensional classification results Q_jni.
Embodiment 2

Take the first tuple d_1 and divide its 93 one-dimensional classification results into three groups by classifier: classifier M_1 contributes 31 one-dimensional results Q_11i (i = 1, 2, …, 31), classifier M_2 contributes 31 one-dimensional results Q_12i, and classifier M_3 contributes 31 one-dimensional results Q_13i.

Judge the 31 one-dimensional classification results Q_11i of the first tuple under classifier M_1. For the first result Q_111: if it indicates that tuple d_1 is numerically normal, its value is set to 1; if it indicates that d_1 is numerically abnormal, its value is set to -1. The same procedure is applied to the second result Q_112, and so on until all 31 one-dimensional classification results Q_11i have been judged. The 31 judged values are then added; because 31 is odd, the sum can only be greater than 0 or less than 0. If the sum is greater than 0, the first tuple d_1 is normal; if it is less than 0, d_1 is abnormal. This gives the first two-dimensional classification result Q_11 of the first tuple.

By the above steps, it is determined for each of the 111,600 tuples whether the result obtained through classifier M_1 is normal or abnormal, giving each tuple's two-dimensional classification result Q_j1.

In the same way, the first tuple is classified with classifier M_2 to obtain its second two-dimensional classification result Q_12, and with classifier M_3 to obtain its third two-dimensional classification result Q_13.

If two-dimensional classification result Q_11 indicates a normal value, then Q_11 = 1, otherwise Q_11 = -1; likewise Q_12 = 1 or -1, and Q_13 = 1 or -1.

The values of Q_11, Q_12, and Q_13 are added to obtain the three-dimensional classification value ΣQ_1. As the sum of three values of ±1, ΣQ_1 can only take the values -3, -1, 1, 3. When ΣQ_1 is 1 or 3, tuple d_1 is judged to be a normal tuple, i.e. Q_1r; when ΣQ_1 is -1 or -3, tuple d_1 is judged to be an abnormal tuple, i.e. Q_1w.
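The parity argument in this embodiment — an odd number of ±1 votes can never sum to zero — can be verified exhaustively:

```python
from itertools import product

# All possible sums of 31 votes of +/-1 (step 5): 2a - 31 for a = 0..31,
# which is always odd, so a within-classifier vote can never tie.
sums_31 = {sum([1] * a + [-1] * (31 - a)) for a in range(32)}

# All possible three-classifier sums (step 6): the sum of three +/-1 values.
sums_3 = sorted({sum(v) for v in product([1, -1], repeat=3)})
print(sums_3)   # even values such as -2, 0, 2 are impossible
```

This is why choosing odd k (31 models) and odd n (3 classifiers) guarantees that every tuple receives an unambiguous normal/abnormal verdict.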
Embodiment 3

Repeating the above steps for all 111,600 tuples determines whether each tuple is abnormal. The tuples whose three-dimensional classification result is Q_jw are then located, and attributes such as the time, the primary equipment object, and the sampling time of each such tuple are determined, from which the 77 abnormal equipment objects can be found.
Embodiment 4

Fig. 1 compares the ROC curves of the data-set statistics obtained by the classifier integration method of the present invention with those obtained by the decision-tree classification model; Fig. 2 compares the corresponding detection counts and recall. As shown in Figs. 1 and 2, the method of obtaining classifiers from the decision-tree classification model, producing target values by feeding the data into the classifiers, and matching the original data set against the target values effectively improves the classification performance of the decision tree and raises classification accuracy.
Embodiment 5
Fig. 3 compares the receiver operating characteristic curves of the data set statistics obtained by the
classifier ensemble method for power transmission and transformation primary equipment load curve mining
of the present invention and of the data set statistics obtained with a Bayesian classification model;
Fig. 4 compares the corresponding detection counts and recall ratios. As shown in Fig. 3 and Fig. 4, the
method of obtaining classifiers from a Bayesian classification model, importing the data into the
classifiers to output target values, and matching the original data set against the target values can
effectively improve Bayesian classification performance and raise classification accuracy.
Embodiment 6
Fig. 5 compares the receiver operating characteristic curves of the data set statistics obtained by the
classifier ensemble method for power transmission and transformation primary equipment load curve mining
of the present invention and of the data set statistics obtained with a support vector machine
classification model; Fig. 6 compares the corresponding detection counts and recall ratios; Fig. 7 is a
scatter plot of the suspected-anomaly probability of the load data of all primary equipment objects
predicted by the packaged classifier model group obtained from the support vector machine model. As shown
in Figs. 5-7, the method of obtaining classifiers from a support vector machine classification model,
importing the data into the classifiers to output target values, and matching the original data set
against the target values can effectively improve the classification performance of the support vector
machine and raise classification accuracy.
Embodiment 7
Fig. 8 shows the distribution of primary equipment objects with abnormal load data. As shown in Fig. 8,
the lower scatter plot shows the positions of abnormal primary equipment objects in the original input
data set; the abnormal objects are randomly distributed. The upper scatter plot shows the distribution of
abnormal primary equipment objects after all objects have been re-sorted by the numerical value of their
suspected-anomaly probability. It can be seen that sorting by degree of abnormality concentrates most
abnormal primary equipment objects at the front of the new data set: the first 1000 primary equipment
objects contain 65 abnormal objects, while the remaining 5200 contain only 12. Performing priority
verification on the primary equipment objects with suspected data anomalies according to this ranking
effectively improves the efficiency of equipment operation inspection.
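The re-sorting in Fig. 8 amounts to ranking objects by descending suspected-anomaly probability so that likely-abnormal equipment is verified first. A minimal sketch, with hypothetical object records and field names:

```python
def rank_by_suspicion(objects):
    """Sort primary-equipment objects by descending suspected-anomaly
    probability, concentrating likely-abnormal objects at the front."""
    return sorted(objects, key=lambda o: o["suspicion"], reverse=True)

objects = [
    {"id": "transformer-1", "suspicion": 0.02},
    {"id": "transformer-2", "suspicion": 0.91},
    {"id": "breaker-7", "suspicion": 0.40},
]
ranked = rank_by_suspicion(objects)  # transformer-2 now comes first
```

Inspection effort is then spent on the head of the ranked list, where the anomalies cluster.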
The above embodiments merely represent preferred implementations of the technical solution of the present
invention. Variations of individual parts made by those skilled in the art that embody the principle of
the present invention fall within the protection scope of the present invention.
Claims (10)
1. A classifier ensemble method for power transmission and transformation primary equipment load curve mining, characterized in that the ensemble steps are:
Step 1: for a data set D comprising p tuples dj (j = 1, 2, ..., p), perform k iterations of sampling with replacement on the data set D to obtain k training sets Di (i = 1, 2, ..., k);
Step 2: perform machine learning on the k training sets Di with n learning methods, obtaining n classifiers Mn (n = 1, 2, 3), each classifier Mn comprising k classification models Mni;
Step 3: in an iterative input manner, input each tuple dj of the data set D into each classification model Mni, obtaining p*k*n output results djni;
Step 4: according to the output results djni, in an iterative input manner, perform binary classification on each tuple dj in the first dimension; each tuple obtains k*n one-dimensional classification results Qjni;
Step 5: according to the one-dimensional classification results Qjni, in an iterative input manner, perform binary classification on each tuple dj in the second dimension; each tuple obtains n two-dimensional classification results Qjn;
Step 6: according to the two-dimensional classification results Qjn, perform binary classification on each tuple dj in the third dimension; each tuple obtains one three-dimensional classification result Qj;
Step 7: according to the three-dimensional classification result Qj, determine the classification result of each tuple dj in the data set D.
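The seven steps can be read as bagging with a nested majority vote: k bootstrap models per learning method, a vote within each classifier, then a vote across the n classifiers. A minimal sketch under that reading, with illustrative names and toy stand-in learners (not the patent's actual models):

```python
import random

def ensemble_classify(dataset, learners, k, seed=0):
    """Sketch of the seven-step scheme of claim 1 (names are illustrative).

    `learners` is a list of n training functions; each takes a training
    set and returns a model mapping a tuple to +1 (normal) or -1 (abnormal).
    """
    rng = random.Random(seed)
    p = len(dataset)
    # Step 1: k bootstrap training sets, sampled with replacement.
    training_sets = [[rng.choice(dataset) for _ in range(p)] for _ in range(k)]
    # Step 2: fit every learner on every training set -> n classifiers
    # of k classification models each.
    classifiers = [[fit(ts) for ts in training_sets] for fit in learners]
    results = []
    for d in dataset:
        per_classifier = []
        for models in classifiers:
            # Steps 3-5: k one-dimensional +/-1 results per classifier,
            # reduced by a vote to one two-dimensional result.
            votes = sum(m(d) for m in models)
            per_classifier.append(1 if votes > 0 else -1)
        # Steps 6-7: vote across the n classifiers for the final verdict.
        total = sum(per_classifier)
        results.append("normal" if total > 0 else "abnormal")
    return results

# Toy threshold learners standing in for the decision-tree / Bayesian /
# support-vector models named in claims 7-9.
learners = [
    lambda ts: (lambda d: 1 if d < 10 else -1),
    lambda ts: (lambda d: 1 if d < 12 else -1),
    lambda ts: (lambda d: 1 if d < 8 else -1),
]
out = ensemble_classify([1, 5, 20, 3, 50], learners, k=5)
```

The toy learners ignore their training set, so the sketch only illustrates the voting structure, not model fitting.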
2. The classifier ensemble method for power transmission and transformation primary equipment load curve mining according to claim 1, characterized in that in Step 4 the first dimension is the tuple dimension, and the binary classification in the first dimension proceeds as follows:
Step 4-1: set the iteration counters j = 1, i = 1, n = 1 and input tuple d1 into classification model M11, obtaining a one-dimensional classification result d111. The binary elements of classification model M11 are the class element r and the class element w: when the output value of d1 is normal, d1 is assigned to class element r; when the output value of d1 is abnormal, d1 is assigned to class element w. The classification result d111 is therefore one of the categorization results d111r and d111w.
Step 4-2: set j = j + 1 with i = 1, n = 1 and input tuple d2 into classification model M11 for binary classification, obtaining a one-dimensional classification result d211, which is one of the categorization results d211r and d211w;
Step 4-3: repeat steps 4-1 and 4-2 p times, obtaining p classification results dj11;
Step 4-4: set i = i + 1 and repeat steps 4-1 and 4-2, obtaining p*k categorization results dj1i;
Step 4-5: set n = n + 1 and repeat steps 4-1, 4-2 and 4-4, obtaining n*p*k categorization results djni.
3. The classifier ensemble method for power transmission and transformation primary equipment load curve mining according to claim 2, characterized in that in Step 5 the second dimension is the classification-model dimension, and the binary classification in the second dimension proceeds as follows:
Step 5-1: group the one-dimensional classification results Qjni that share the same j value and the same n value into one-dimensional sorted groups F, obtaining p*n one-dimensional sorted groups Fjn, each one-dimensional sorted group Fjn containing k one-dimensional classification results Qjni;
Step 5-2: set the iteration counters j = 1, n = 1 and take the one-dimensional sorted group F11; within F11, set i = 1 and examine the one-dimensional classification result Q111: if Q111 is Q111r, set Q111 = 1; if Q111 is Q111w, set Q111 = -1;
Step 5-3: with j = 1, within the one-dimensional sorted group F11, set i = i + 1 and examine the one-dimensional classification result Q112: if Q112 is Q112r, set Q112 = 1; if Q112 is Q112w, set Q112 = -1;
Step 5-4: repeat step 5-3 to obtain the values of all k one-dimensional classification results in F11, then sum these k values to obtain the two-dimensional classification value ΣQ11; if ΣQ11 > 0, judge all one-dimensional classification results Qj1i in F11 to be the two-dimensional classification result Qj1r; if ΣQ11 < 0, judge them to be the two-dimensional classification result Qj1w;
Step 5-5: set j = j + 1 with n = 1 and repeat steps 5-2, 5-3 and 5-4, obtaining p two-dimensional classification results Qj1;
Step 5-6: set n = n + 1 and repeat step 5-5, obtaining p*n two-dimensional classification results Qjn.
4. The classifier ensemble method for power transmission and transformation primary equipment load curve mining according to claim 3, characterized in that in Step 6 the third dimension is the classifier dimension, and the binary classification in the third dimension proceeds as follows:
Step 6-1: group the two-dimensional classification results Qjn that share the same j value into sorted groups F, obtaining p two-dimensional sorted groups Fj, each two-dimensional sorted group Fj containing n two-dimensional classification results Qjn;
Step 6-2: set the iteration counter j = 1 and take the two-dimensional sorted group F1; within F1, set n = 1 and examine the two-dimensional classification result Q11: if Q11 is Q11r, set Q11 = 1; if Q11 is Q11w, set Q11 = -1;
Step 6-3: with j = 1, within the two-dimensional sorted group F1, set n = n + 1 and examine the two-dimensional classification result Q12: if Q12 is Q12r, set Q12 = 1; if Q12 is Q12w, set Q12 = -1;
Step 6-4: repeat step 6-3 to obtain the values of all n two-dimensional classification results in F1, then sum these n values to obtain the three-dimensional classification value ΣQ1; if ΣQ1 > 0, judge all two-dimensional classification results Qjn in F1 to be the three-dimensional classification result Qjr; if ΣQ1 < 0, judge them to be the three-dimensional classification result Qjw;
Step 6-5: set j = j + 1 and repeat steps 6-2, 6-3 and 6-4, obtaining p three-dimensional classification results Qj, each tuple dj corresponding to the three-dimensional classification result with the same j value.
5. The classifier ensemble method for power transmission and transformation primary equipment load curve mining according to claim 4, characterized in that if the three-dimensional classification result is Qjr, the tuple dj with the same j value is judged to be a normal tuple, and if the three-dimensional classification result is Qjw, the tuple dj with the same j value is judged to be an abnormal tuple.
6. The classifier ensemble method for power transmission and transformation primary equipment load curve mining according to claim 1, characterized in that the data set D is sampled with replacement, the number of tuples d in each training set Di is p, and the number of occurrences of a given tuple d in training set Di is zero, one, or multiple.
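The sampling of claim 6 is ordinary bootstrap sampling: each training set has the same size as D, and any given tuple may be absent, appear once, or appear several times. A minimal sketch (function name illustrative):

```python
import random
from collections import Counter

def bootstrap_sample(dataset, seed=0):
    """Draw len(D) tuples from D with replacement: each tuple of D appears
    zero, one, or multiple times in the resulting training set Di."""
    rng = random.Random(seed)
    return [rng.choice(dataset) for _ in dataset]

D = list(range(10))
Di = bootstrap_sample(D)
counts = Counter(Di)  # occurrence count of each original tuple in Di
```

On average a bootstrap sample contains about 63% of the distinct tuples of D; the rest are left out of that training set.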
7. The classifier ensemble method for power transmission and transformation primary equipment load curve mining according to claim 1, characterized in that the classification model Mni includes a decision-tree model whose classification steps are:
treat the training set Di as a node N;
analyze the partitioning schemes of the variables of the training set Di and determine multiple candidate partitioning schemes for N;
select the optimal partitioning scheme and split node N into N1 and N2;
apply the partitioning method used for node N to nodes N1 and N2;
repeat the previous step, partitioning the training set Di over multiple levels to form a decision tree;
test the decision tree with the remaining k-1 training sets Di;
perform node pruning and node growing according to the test results to determine the final decision tree.
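The node-partitioning step of claim 7 can be illustrated by a single binary split on one attribute. The patent does not fix a split criterion; the Gini index below is an assumption, and all names are illustrative:

```python
def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Choose the threshold on a single attribute that minimises the
    weighted Gini impurity of the two child nodes N1 and N2."""
    best_score, best_t = float("inf"), None
    for t in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_score, best_t = score, t
    return best_t

# Perfectly separable toy data: threshold 3 splits the classes cleanly.
threshold = best_split([1, 2, 3, 8, 9, 10], [0, 0, 0, 1, 1, 1])
```

A full tree repeats this split recursively on N1 and N2, then prunes nodes against held-out training sets as the claim describes.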
8. The classifier ensemble method for power transmission and transformation primary equipment load curve mining according to claim 1, characterized in that the classification model Mni includes a Bayesian classification model whose expression is:
P(dH | dX) = P(dX | dH) · P(dH) / P(dX)
where dH and dX are two attribute variables, P(dH) is the prior probability of dH, P(dH | dX) is the posterior probability of dH, P(dX) is the prior probability of dX, and P(dX | dH) is the posterior probability of dX.
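The expression in claim 8 is Bayes' theorem; a minimal numerical sketch (the toy probabilities are illustrative, not from the patent):

```python
def bayes_posterior(p_h, p_x_given_h, p_x):
    """Posterior P(dH | dX) = P(dX | dH) * P(dH) / P(dX), as used by the
    Bayesian classification model of claim 8."""
    return p_x_given_h * p_h / p_x

# Toy numbers: P(dH) = 0.3, P(dX | dH) = 0.8, P(dX) = 0.4.
posterior = bayes_posterior(0.3, 0.8, 0.4)
```

A naive Bayes classifier compares such posteriors across classes and assigns the tuple to the class with the largest one.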
9. The classifier ensemble method for power transmission and transformation primary equipment load curve mining according to claim 1, characterized in that the classification model Mni includes a support vector machine model;
the support vector machine model includes a linearly separable support vector machine model and a linearly inseparable support vector machine model;
the optimization problem of the linearly separable support vector machine model can be solved by the method of convex quadratic programming;
the linearly inseparable support vector machine model can map the input data from a low-dimensional space to a high-dimensional space through a kernel function, and then construct the optimal separating hyperplane in the high-dimensional space.
10. The classifier ensemble method for power transmission and transformation primary equipment load curve mining according to claim 5, characterized in that the kernel function includes a polynomial kernel function, a Gaussian radial basis kernel function, and a sigmoid kernel function.
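The three kernel families named in claim 10 have standard closed forms; a minimal sketch with commonly used (but here assumed) hyperparameter defaults:

```python
import math

def poly_kernel(x, y, degree=3, c=1.0):
    """Polynomial kernel K(x, y) = (x . y + c) ** degree."""
    return (sum(a * b for a, b in zip(x, y)) + c) ** degree

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian radial basis kernel K(x, y) = exp(-gamma * ||x - y||^2)."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)

def sigmoid_kernel(x, y, alpha=0.1, c=0.0):
    """Sigmoid kernel K(x, y) = tanh(alpha * x . y + c)."""
    return math.tanh(alpha * sum(a * b for a, b in zip(x, y)) + c)

k_same = rbf_kernel([1.0, 2.0], [1.0, 2.0])  # 1.0 for identical inputs
```

Each kernel evaluates the inner product of the two inputs in an implicitly mapped high-dimensional space, which is what makes the linearly inseparable case of claim 9 tractable.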
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610958899.7A CN106570526A (en) | 2016-10-27 | 2016-10-27 | Classifier integration method for power transmission and transformation primary device load curve mining |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106570526A true CN106570526A (en) | 2017-04-19 |
Family
ID=58535782
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610958899.7A Pending CN106570526A (en) | 2016-10-27 | 2016-10-27 | Classifier integration method for power transmission and transformation primary device load curve mining |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106570526A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109034241A (en) * | 2018-07-24 | 2018-12-18 | 南京千智电气科技有限公司 | Load cluster control method and system based on support vector machines |
CN109272058A (en) * | 2018-11-27 | 2019-01-25 | 电子科技大学中山学院 | Integrated power load curve clustering method |
CN111751650A (en) * | 2020-07-06 | 2020-10-09 | 重庆大学 | Non-invasive household electric equipment on-line monitoring system and fault identification method |
CN111751650B (en) * | 2020-07-06 | 2021-06-22 | 重庆大学 | Non-invasive household electric equipment on-line monitoring system and fault identification method |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20170419 |