CN109840536A - Method and system for clustering power grid power supply reliability levels - Google Patents
- Publication number
- CN109840536A (application CN201711228891.6A)
- Authority
- CN
- China
- Prior art keywords
- principal component
- power supply
- supply reliability
- index
- clustering
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Supply And Distribution Of Alternating Current (AREA)
Abstract
The present invention relates to a method and system for clustering power grid power supply reliability levels, comprising: selecting the factors that influence power supply reliability to establish an index set; determining the final clustering indices from the index set using a correlation coefficient significance test; performing cluster analysis on the final clustering indices to obtain a clustering result that classifies power supply reliability levels; and optimizing the basic data describing the power supply reliability level according to the clustering result. The invention applies a series of processing steps to the index set before clustering: after the correlation coefficients of the data are computed, a correlation coefficient significance test is carried out, and principal component analysis is added as an auxiliary discriminant, which not only reduces the dimensionality but also increases the credibility to a certain degree.
Description
Technical field
The present invention relates to a clustering method, and in particular to a method and system for clustering power grid power supply reliability levels.
Background art
The K-means algorithm is a hard clustering algorithm and a typical representative of prototype-based objective-function clustering methods. It takes a function of the distance from the data points to the prototypes as the objective to be optimized, and derives the update rules of its iteration by finding the extremum of that function. The K-means algorithm uses Euclidean distance as its similarity measure and seeks the optimal classification for a given initial cluster center vector V, such that the evaluation index J is minimized. The algorithm uses the sum-of-squared-errors criterion function as its clustering criterion function.
The K-means algorithm is an unsupervised learning method. Taking k as a parameter, it divides n objects into k clusters so that similarity within a cluster is high and similarity between clusters is low. Similarity is computed with respect to the mean value of the objects in a cluster (regarded as the center of gravity of the cluster). The algorithm first randomly selects k objects, each of which represents the centroid of a cluster. Each remaining object is assigned to the cluster most similar to it, according to the distance between the object and each cluster centroid. The new centroid of each cluster is then computed. This process is repeated until the criterion function converges.
The K-means algorithm is a typical dynamic clustering algorithm that iterates by point-by-point modification; its key point is that the sum of squared errors serves as the criterion function. Point-by-point modification of class centers: once a pixel sample has been assigned to a class according to some rule, the mean of that class is recalculated, and the new mean serves as the aggregation center for clustering the next pixel. Batch modification of class centers: after all pixel samples have been classified according to a given set of class centers, the mean of each class is recalculated and used as the aggregation center for the next round of classification.
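The difference between the two update modes can be illustrated with a minimal sketch. The function names `nearest`, `pointwise_update`, and `batch_update` are hypothetical, and counting each initial center as one sample in the incremental mean is an assumption made only to keep the example short; the source does not specify these details.

```python
def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centers[k])))


def pointwise_update(points, centers):
    """Point-by-point modification: update a class mean right after each assignment."""
    centers = [list(c) for c in centers]  # work on a copy
    counts = [1] * len(centers)  # assumption: each initial center counts as one sample
    for x in points:
        k = nearest(x, centers)
        counts[k] += 1
        # Incremental mean: new_mean = old_mean + (x - old_mean) / n
        centers[k] = [c + (xi - c) / counts[k] for xi, c in zip(x, centers[k])]
    return centers


def batch_update(points, centers):
    """Batch modification: classify all samples first, then recompute every mean."""
    labels = [nearest(x, centers) for x in points]
    new_centers = []
    for k, old in enumerate(centers):
        members = [x for x, lab in zip(points, labels) if lab == k]
        if members:
            new_centers.append([sum(col) / len(members) for col in zip(*members)])
        else:
            new_centers.append(list(old))  # an empty class keeps its previous center
    return new_centers
```

In the point-by-point mode the centers move while the pass over the samples is still in progress, so the result can depend on the order of the samples; the batch mode uses fixed centers for the whole pass.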
The K-means algorithm is efficient and scalable on large data sets; its time complexity is close to linear, the algorithm is simple and fast, and it is well suited to mining large-scale data sets. However, the algorithm has the following problems:
1. The algorithm must continually adjust the classification of the samples and continually compute the adjusted cluster centers, so when the data volume is large and the dimensionality is high, its time overhead is very large. Moreover, when the similarity between classes differs only slightly, the method performs poorly.
2. In the K-means algorithm, an initial partition must first be determined from the initial cluster centers and then optimized. The choice of initial cluster centers has a large influence on the clustering result: if the initial values are chosen poorly, an effective clustering result may not be obtained. This is a major problem of the K-means algorithm. To address it, many algorithms use a genetic algorithm (GA); for example, the literature reports initialization with a genetic algorithm, using an internal clustering criterion as the evaluation index, but the results are unsatisfactory.
Summary of the invention
To remedy the above deficiencies of the prior art, the object of the present invention is to provide a method and system for clustering power grid power supply reliability levels that offer a practical treatment for clustering when the data volume is large and the factors are numerous. Its effectiveness is verified by an example of the power supply reliability level clustering problem.
The object of the present invention is achieved by the following technical solution:
The present invention provides a power grid power supply reliability level optimization method, the improvement of which comprises:
selecting the factors that influence power supply reliability to establish an index set;
determining the final clustering indices from the index set using a correlation coefficient significance test;
performing cluster analysis on the final clustering indices to obtain a clustering result that classifies power supply reliability levels;
optimizing the basic data describing the power supply reliability level according to the clustering result.
Further: the index set comprises basic data divided by reliability level; the basic data include indices of regional characteristics, economic level, power load conditions, grid structure and operating conditions, equipment conditions, and technical management level.
Further: determining the clustering indices from the index set using the correlation coefficient significance test comprises:
preprocessing the basic data in the index set;
carrying out the correlation coefficient significance test on the indices of the preprocessed index set using principal component analysis;
forming the final clustering indices.
Further: carrying out the correlation coefficient significance test using principal component analysis comprises:
determining the first principal component of the preprocessed index set;
successively determining the second, third, fourth, ..., P-th principal components, where P denotes the number of indices in the index set.
Further: determining the first principal component of the preprocessed index set comprises:
expressing the information carried by the first selected linear combination F1 by its variance Var(F1); when Var(F1) is the largest, the first linear combination F1 is the first principal component. The first linear combination is obtained by recombining, through linear combination, multiple correlated indices of the preprocessed index set into a new group of mutually uncorrelated composite indices.
Further: successively determining the second, third, fourth, ..., P-th principal components comprises:
if the first principal component F1 is not sufficient to represent the information of the original P indices, selecting a second linear combination F2 such that the information already present in F1 does not appear in F2, expressed mathematically as Cov(F1, F2) = 0; the second linear combination F2 is then called the second principal component;
constructing the third, fourth, ..., P-th principal components in the same way.
Further: the final clustering indices are the set m of the first, second, third, ..., P-th principal components.
Further: performing cluster analysis on the final clustering indices to obtain the clustering result that classifies power supply reliability levels comprises: performing cluster analysis on the final clustering indices using the cost function of the K-means algorithm, which is expressed as:
$$J\left(c^{(1)},\dots,c^{(m)},\mu_1,\dots,\mu_K\right)=\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}-\mu_{c^{(i)}}\right\|^{2}$$
where $\mu_1,\dots,\mu_K$ are the K cluster centers, m denotes the final clustering indices (whose data points are indexed by $i = 1,\dots,m$), $\mu_{c^{(i)}}$ denotes the center of the class to which the i-th clustering index belongs, $x^{(i)}$ is a data point of the clustering indices, and $c^{(1)},\dots,c^{(m)}$ are the classes to which the data points $x^{(1)},\dots,x^{(m)}$ belong.
Further: optimizing the basic data describing the power supply reliability level according to the clustering result comprises:
rejecting the basic data that exceed 3 times the variance of the basic data of the power supply reliability level, so that the cost function of the K-means algorithm is minimized.
Further: the cost function of the K-means algorithm is minimized according to:
$$\min_{c^{(1)},\dots,c^{(m)},\;\mu_1,\dots,\mu_K} J\left(c^{(1)},\dots,c^{(m)},\mu_1,\dots,\mu_K\right)$$
where J is the cost function of the K-means algorithm.
The present invention also provides a power grid power supply reliability level optimization system, the improvement of which comprises:
a construction module for selecting the factors that influence power supply reliability to establish an index set;
a clustering index determining module for determining the final clustering indices from the index set using a correlation coefficient significance test;
a cluster analysis module for performing cluster analysis on the final clustering indices to obtain optimized clustering indices;
an optimization module for optimizing the basic data describing the power grid power supply reliability level according to the optimized clustering indices.
Further: the clustering index determining module further comprises:
a preprocessing submodule for preprocessing the basic data in the index set;
a test submodule for carrying out the correlation coefficient significance test on the indices of the preprocessed index set using principal component analysis;
a forming submodule for forming the final clustering indices.
Further: the test submodule comprises:
a first principal component determining unit for determining the first principal component of the preprocessed index set;
a second principal component determining unit for successively determining the second, third, fourth, ..., P-th principal components, where P denotes the number of indices in the index set.
Further: the first principal component determining unit comprises:
a first selection subunit, in which the information carried by the first selected linear combination F1 is expressed by its variance Var(F1); when Var(F1) is the largest, the first linear combination F1 is the first principal component. The first linear combination is obtained by recombining, through linear combination, multiple correlated indices of the preprocessed index set into a new group of mutually uncorrelated composite indices.
Further: the second principal component determining unit comprises:
a second selection subunit for selecting a second linear combination F2 if the first principal component F1 is not sufficient to represent the information of the original P indices, such that the information already present in F1 does not appear in F2, expressed mathematically as Cov(F1, F2) = 0; the second linear combination F2 is then called the second principal component;
a third principal component determining subunit for constructing the third, fourth, ..., P-th principal components in the same way.
Further: the final clustering indices are the set m of the first, second, third, ..., P-th principal components.
Compared with the closest prior art, the technical solution provided by the present invention has the following beneficial effects:
1. The present invention selects the factors that influence power supply reliability to establish an index set; determines the final clustering indices from the index set using a correlation coefficient significance test; performs cluster analysis on the final clustering indices to obtain a clustering result that classifies power supply reliability levels; and optimizes the basic data describing the power supply reliability level according to the clustering result. Because the index set is established from the influencing factors before clustering, and a correlation coefficient significance test is carried out after the correlation processing, cluster analysis yields optimized clustering indices. When the power grid power supply reliability level is optimized according to these indices, the clustering distinguishes, as far as possible, data whose inter-class similarity differs only slightly, which improves the clustering effect.
2. In some cases, because the selected indices are all data of the same nature, the traditional K-means algorithm has difficulty distinguishing data whose similarity differs only slightly and assigns them to the same class; the present invention addresses this problem. An auxiliary method, principal component analysis, is used for corroborating discrimination, which not only reduces the dimensionality but also increases the credibility to a certain degree.
3. In some cases the output of an algorithm is applied directly in practice without rejecting actual abnormal data; even if the algorithm is accurate, abnormal real data then bias the cluster level in practical application. The present algorithm takes this engineering problem into account. For example, if the average outage time of some prefecture-level cities is far higher than that of similar cities and the average is computed over all similar cities without rejection, the result is unfair to the units that lie far below the abnormally inflated mean, and the scoring loses its accuracy.
4. In the power grid reliability level clustering evaluation process, the present invention adds the equivalent number of users of each prefecture-level city as a weighting factor in the reliability level evaluation of cities and provinces, so that city-level units can be evaluated scientifically and reasonably according to the strength of each unit's grid structure, its service capability, and its equipment level.
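The source does not give the weighting formula, so the following is only a minimal sketch of one plausible reading: a province-level figure computed as the average of city-level values weighted by each city's equivalent number of users. The function name `weighted_average` and all numbers are hypothetical illustrations, not data from the patent.

```python
def weighted_average(values, weights):
    """Average of values weighted by the matching weights (e.g. equivalent user counts)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical example: average outage time per user (hours/household) of three
# cities, weighted by a made-up equivalent number of users for each city.
outage_hours = [2.0, 4.0, 10.0]
equivalent_users = [300_000, 100_000, 100_000]
province_level = weighted_average(outage_hours, equivalent_users)
```

With weighting, a city that serves many users contributes more to the aggregate than a small city with the same outage time, which is the effect the weighting factor is meant to achieve.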
Brief description of the drawings
Fig. 1 is a general flow chart of the power grid power supply reliability level clustering method provided by the present invention;
Fig. 2 is a detailed flow chart of the power grid power supply reliability level clustering method provided by the present invention;
Fig. 3 is a schematic diagram of the added principal component analysis discrimination provided by the present invention;
Fig. 4 is a structural diagram of the K-means-based regional power grid power supply reliability level clustering system provided by the present invention.
Specific embodiments
Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
The following description and drawings illustrate specific embodiments of the invention sufficiently to enable those skilled in the art to practice them. Other embodiments may include structural, logical, electrical, process, and other changes. The embodiments represent only possible variations. Unless explicitly required, individual components and functions are optional, and the order of operations may vary. Portions and features of some embodiments may be included in, or substituted for, those of other embodiments. The scope of the embodiments of the invention includes the entire scope of the claims and all available equivalents of the claims. Herein, these embodiments may be referred to, individually or collectively, by the term "invention" merely for convenience; if more than one invention is in fact disclosed, this is not intended to automatically limit the scope of the application to any single invention or inventive concept.
Embodiment one
The present invention preprocesses the input data and carries out a correlation coefficient significance test after the correlation processing. A further important step is corroborating discrimination with an auxiliary method, principal component analysis. Clustering is then performed with the optimized cost function, so that the clustering distinguishes, as far as possible, data whose inter-class similarity differs only slightly. If the output clustering result is applied directly, without further processing of the data it is applied to (abnormal data, for example), good results are hard to obtain in engineering applications no matter how good the clustering result is. Therefore, in practical engineering applications, abnormality discrimination is also performed on the actual conditions or data to which the result is applied, so that abnormal data do not take part in the calculation for their class.
The present invention provides a power grid power supply reliability level clustering method whose flow charts are shown in Figs. 1 and 2, comprising:
1. selecting the factors that influence power supply reliability to establish an index set;
2. determining the final clustering indices from the index set using a correlation coefficient significance test;
3. performing cluster analysis on the final clustering indices using the cost function of the K-means algorithm to obtain a clustering result that classifies power supply reliability levels;
4. optimizing the basic data describing the power supply reliability level according to the clustering result.
The correlation coefficient and its significance test are an effective way to verify whether the correlation among the selected indices is significant, and they also have an important influence on whether the clustering result is optimal. This step is therefore indispensable before clustering; without it, the selected clustering factors lack feasibility and credibility. In addition, after the correlation coefficient significance test has determined the selected indices, an auxiliary discrimination by some method is needed: the final clustering indices are determined by cross-checking the results of principal component analysis against those of the correlation coefficient significance test.
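The source does not state which significance test is used. A minimal sketch, assuming the standard t-test for a Pearson correlation coefficient, is given below; the function names and the idea of passing the critical value `t_critical` in from a t-table are illustrative assumptions.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic of r under H0 (no correlation), with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)  # perfectly linear data
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def is_significant(x, y, t_critical):
    """True if |t| exceeds the critical value supplied by the caller."""
    r = pearson_r(x, y)
    return abs(t_statistic(r, len(x))) > t_critical
```

For example, with 10 observations per index and a two-sided 5% level, the critical value for 8 degrees of freedom is about 2.306, so `is_significant(index_a, index_b, 2.306)` would flag a pair of indices (here the hypothetical lists `index_a` and `index_b`) whose correlation is unlikely to be due to chance.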
1. Selecting the factors that influence power supply reliability to establish an index set: the index set comprises basic data divided by reliability level; the basic data include indices of regional characteristics, economic level, power load conditions, grid structure and operating conditions, equipment conditions, and technical management level.
2. Determining the final clustering indices from the index set using the correlation coefficient significance test comprises carrying out the test on the indices of the preprocessed index set using principal component analysis.
Principal component analysis (PCA) is a multivariate statistical method that reduces many variables to a smaller number of important variables by a linear transformation; it is also known as principal components analysis. It was first introduced by K. Pearson for non-random variables, and H. Hotelling later generalized the method to random vectors. The amount of information is usually measured by the sum of squared deviations or by the variance.
In practical problems, many variables (or factors) related to the subject are often proposed in order to analyze it comprehensively, because each variable reflects some information about the subject to a different degree. Principal component analysis recombines multiple indices that are correlated to some degree into a new group of mutually uncorrelated composite indices, formed by linear combination, which replace the original indices.
The classical approach expresses this through the variance of F1 (the first linear combination selected, i.e. the first composite index): the larger Var(F1), the more information F1 contains. The F1 selected among all linear combinations should therefore be the one with the largest variance, and F1 is called the first principal component. If the first principal component is not sufficient to represent the information of the original P indices, a second linear combination F2 is selected. To reflect the original information efficiently, the information already in F1 need not appear again in F2; expressed mathematically, this requires Cov(F1, F2) = 0, and F2 is then called the second principal component. The third, fourth, ..., P-th principal components can be constructed in the same way.
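The construction above (largest variance first, each later component uncorrelated with the earlier ones) can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the computation used in the patent: it finds the leading eigenvector of the sample covariance matrix by power iteration and removes each found component by deflation. The function names, the starting vector, and the iteration count are hypothetical choices.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix and centered data (one observation per row)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    return cov, centered


def leading_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and its unit eigenvector, found by power iteration."""
    p = len(matrix)
    # Assumption: an uneven start vector, unlikely to be orthogonal to the answer.
    v = [1.0 / (j + 1) for j in range(p)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance left in any direction
        v = [x / norm for x in w]
    value = sum(v[a] * sum(matrix[a][b] * v[b] for b in range(p)) for a in range(p))
    return value, v


def principal_components(rows, count):
    """Return (variances, loadings, scores) for the first `count` components."""
    cov, centered = covariance_matrix(rows)
    p = len(cov)
    variances, loadings = [], []
    for _ in range(count):
        value, vec = leading_eigenpair(cov)
        variances.append(value)  # Var(F_k)
        loadings.append(vec)     # coefficients of the linear combination F_k
        # Deflation: remove the component just found, so that the next one
        # is uncorrelated with it, i.e. Cov(F_k, F_next) = 0.
        cov = [[cov[a][b] - value * vec[a] * vec[b] for b in range(p)]
               for a in range(p)]
    scores = [[sum(c[j] * vec[j] for j in range(p)) for vec in loadings]
              for c in centered]
    return variances, loadings, scores
```

Each entry of `variances` is the variance of the corresponding component, so the share of the total that the first few components explain indicates how many of them are needed to represent the original P indices.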
Forming the final clustering indices: the first, second, third, fourth, ..., P-th principal components form the final clustering index set m.
3. Performing cluster analysis on the final clustering indices using the cost function of the K-means algorithm to obtain the clustering result that classifies power supply reliability levels:
K-means clustering is an important algorithm in partition-based cluster analysis, but the number of clusters must be determined in advance; in power grids especially, the number of clusters is largely prescribed manually because of business needs. Moreover, when the characteristics are very similar, even a small deviation in a data point on a class boundary may move it from one class to another. The user must choose between guaranteeing optimal clustering and meeting business needs. The present invention is an optimized clustering algorithm based on K-means: by optimizing the cost function, it distinguishes data with similar characteristics and groups data whose similarity differs only slightly into one class. The new method offers a practical treatment for clustering when the data volume is large and the factors are numerous, and its effectiveness is verified by an example of the power supply reliability level clustering problem.
The K-means algorithm is a form of unsupervised learning. Its inputs are a training data set $\{x^{(1)},\dots,x^{(m)}\}$ (where $x^{(i)}\in\mathbb{R}^{n}$) and the number of clusters K (the data are divided into K classes); its outputs are the K cluster centers $\mu_1,\mu_2,\dots,\mu_K$ and the class of each data point $x^{(i)}$. The K-means algorithm proceeds as follows:
i. Randomly initialize K cluster centers (cluster centroids): $\mu_1,\mu_2,\dots,\mu_K$;
ii. For each data point $x^{(i)}$, find the cluster center nearest to it and assign the point to that class, i.e. $c^{(i)}=\arg\min_{k}\left\|x^{(i)}-\mu_k\right\|^{2}$, where $c^{(i)}$ denotes the class of the data point $x^{(i)}$ in the clustering indices;
iii. Update each cluster center $\mu_k$ to the mean of all data points belonging to class k. Repeat steps ii and iii until convergence or until the maximum number of iterations is reached.
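Steps i to iii can be sketched as follows. This is a minimal standard-library illustration of plain K-means, not the optimized variant claimed by the patent; the names `k_means` and `cost`, the `seed` parameter, and the choice of initializing from randomly sampled data points are assumptions made for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cost(points, labels, centers):
    """Mean squared distance from each point to the center of its class."""
    total = sum(squared_distance(p, centers[c]) for p, c in zip(points, labels))
    return total / len(points)


def k_means(points, k, max_iterations=100, seed=0):
    """Return (centers, labels) after running steps i to iii."""
    rng = random.Random(seed)
    # Step i: pick K distinct data points as the initial cluster centers.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iterations):
        # Step ii: assign every point to its nearest cluster center.
        new_labels = [min(range(k), key=lambda j: squared_distance(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        # Step iii: move each center to the mean of the points assigned to it.
        for j in range(k):
            members = [p for p, c in zip(points, labels) if c == j]
            if members:  # an empty class keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels
```

Because the result depends on the randomly chosen initial centers, running the sketch with several values of `seed` and keeping the run with the lowest `cost` is a common precaution.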
4. Optimizing the basic data describing the power supply reliability level according to the clustering result:
Cluster analysis is performed on the final clustering indices using the cost function of the K-means algorithm, which is:
$$J\left(c^{(1)},\dots,c^{(m)},\mu_1,\dots,\mu_K\right)=\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}-\mu_{c^{(i)}}\right\|^{2}$$
where $\mu_1,\dots,\mu_K$ are the K cluster centers, m denotes the final clustering indices (whose data points are indexed by $i = 1,\dots,m$), $\mu_{c^{(i)}}$ denotes the center of the class to which the i-th clustering index belongs, $x^{(i)}$ is a data point of the clustering indices, and $c^{(1)},\dots,c^{(m)}$ are the classes to which the data points $x^{(1)},\dots,x^{(m)}$ belong.
Optimizing the basic data describing the power supply reliability level according to the clustering result comprises:
rejecting the basic data that exceed 3 times the variance of the basic data of the power supply reliability level, so that the cost function of the K-means algorithm is minimized.
Further: the cost function of the K-means algorithm is minimized according to:
$$\min_{c^{(1)},\dots,c^{(m)},\;\mu_1,\dots,\mu_K} J\left(c^{(1)},\dots,c^{(m)},\mu_1,\dots,\mu_K\right)$$
where J is the cost function of the K-means algorithm.
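The rejection step can be sketched as a simple filter. The text states the threshold as 3 times the variance; the sketch below instead assumes the common three-sigma reading (a value is rejected when its distance from the mean exceeds 3 standard deviations), which is an interpretation that the source does not confirm. The function name `reject_outliers` and the `threshold` parameter are hypothetical.

```python
import statistics


def reject_outliers(values, threshold=3.0):
    """Split values into (kept, rejected) by distance from the mean.

    Assumption: the distance is measured in sample standard deviations.
    """
    mean = statistics.mean(values)
    spread = statistics.stdev(values)
    if spread == 0:
        return list(values), []  # all values identical: nothing to reject
    kept = [v for v in values if abs(v - mean) <= threshold * spread]
    rejected = [v for v in values if abs(v - mean) > threshold * spread]
    return kept, rejected
```

Only the kept values would then enter the calculation of the class average. In a small class a single extreme value inflates the standard deviation so much that it can never exceed 3 standard deviations, so the rule is only meaningful for classes with enough members.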
Embodiment two
This embodiment studies how to evaluate the reliability level of the prefecture-level power supply companies of the State Grid. The evaluation covers two aspects: the average outage time per user and the average outage frequency per user. From a business perspective, the main factors influencing power supply reliability are the interconnection rate, the transferable-supply rate, the line sectionalizing rate, the ring network rate, the cabling rate, the distribution automation rate, the equipment level, and so on. However, whether these factors are related to one another, and how strongly, cannot be determined by business analysis alone; auxiliary analysis and discrimination are needed.
This study selects the factors that influence the reliability indices to establish an index set. After the index set has been collected, abnormal data are processed and corrected. Based on business analysis, the prefecture-level power supply companies within the State Grid are divided into 5 classes by reliability level, and the factors with significant correlation are clustered using the optimized K-means cost function; Figs. 1 and 2 show the algorithm flow charts.
Indices are selected along several dimensions according to the factors that influence and reflect regional distribution network reliability: economic characteristics, population characteristics, electricity consumption characteristics, equipment scale, grid structure, and management level. The collected index results are shown in Table 1 below:
Table 1: Collected index results
Through data preprocessing and the correlation coefficient significance test on the collected indices, the final clustering index factors are determined, as shown in Table 2 below. In addition, when multivariate problems are studied with statistical methods, too many variables increase the complexity of the analysis. In many cases the variables are correlated with one another; when two variables are correlated, the information they reflect about the subject overlaps to some degree. Principal component analysis removes the redundant, duplicated variables (those that are closely related) from all the variables originally proposed and builds as few new variables as possible, such that the new variables are pairwise uncorrelated and retain as much of the original information about the subject as possible.
Therefore, the indices selected by principal component analysis were compared with those selected by the correlation coefficient significance test, and the conclusions of the two methods were found to agree. The dendrogram obtained with principal component analysis is shown in Fig. 3.
Table 2: Rejected and selected indices
Note: indices marked with a colored background are selected; the others are rejected.
After data preprocessing and determination of the final clustering index factors, clustering is performed with the optimized K-means cost function, which optimizes the objective. As shown in Table 3 below, the optimized K-means algorithm is better at grouping cities with consistent characteristics into one class.
Table 3: Comparison of clustering results before and after optimization
Districts and cities' title | Province title | Cover administrative region | K-means | K-means after optimization |
Linyi | Shandong | Linyi | 1 | 2 |
Qingdao | Shandong | Qingdao | 1 | 1 |
Weihai | Shandong | Weihai | 1 | 1 |
Heze | Shandong | Heze | 1 | 2 |
Liaocheng | Shandong | Liaocheng | 1 | 2 |
Tangshan | Ji Bei | Tangshan | 1 | 1 |
Sunshine | Shandong | Sunshine | 1 | 2 |
Zaozhuang | Shandong | Zaozhuang | 1 | 2 |
Tai'an | Shandong | Tai'an | 1 | 1 |
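The optimized cost function itself is not reproduced in this text. As an illustration only, the sketch below assumes the standard K-means distortion, J = (1/m) Σ ||x^(i) − μ_{c^(i)}||², and minimizes it with plain Lloyd iterations. The function names, the explicit initial centers, and the data points (average outage time in hours, average number of outages) are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def distortion(points, labels, centers):
    """Mean squared distance from each point to the center of its class."""
    total = sum(squared_distance(p, centers[c]) for p, c in zip(points, labels))
    return total / len(points)


def kmeans(points, initial_centers, max_iterations=100):
    """Lloyd's algorithm; returns (labels, centers, cost after each iteration)."""
    centers = [tuple(c) for c in initial_centers]
    labels = None
    history = []
    for _ in range(max_iterations):
        # Assignment step: each point joins the class of its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the cost cannot drop further
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, c in zip(points, labels) if c == k]
            if members:  # an empty class keeps its previous center
                centers[k] = tuple(sum(col) / len(members) for col in zip(*members))
        history.append(distortion(points, labels, centers))
    return labels, centers, history


# Hypothetical cities: (average outage time, average number of outages).
cities = [(1.5, 0.2), (2.0, 0.4), (3.0, 0.6), (4.0, 1.0), (21.0, 5.5), (24.0, 6.0)]
labels, centers, history = kmeans(cities, initial_centers=cities[:2])
print(labels)   # class index of each city
print(history)  # distortion after each iteration
```

Each Lloyd iteration can only lower or preserve the distortion, which is why the recorded cost history is non-increasing. Any modification the patent makes to the cost function would replace the `distortion` function in this sketch.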
In addition, it should be explained that, although the optimized K-means clustering method is used here, this is only an optimization of the algorithm. In view of the application and practicality of the invention, some explanation is needed of how the algorithm is applied in practice, as follows.
1. Clustering is an intermediate result. Its purpose is to apply the cluster result in practical engineering, which in this example means separating the prefecture-level cities into classes after clustering their reliability levels. The clustering is applied to each city's average outage time per user and average number of outages per user: a city's actual outage time and number of outages are compared with the mean outage time and number of outages of the cities in the same class. However, the worked example reveals a problem. The average outage time and average number of outages of a few prefecture-level companies in a cluster are far higher than those of the other cities, while the values of the other cities are fairly concentrated. Therefore, if the mean is simply computed over all cities in the same class, the result is substantially biased and the evaluation loses its value.
For example, most prefecture-level cities of Shanxi Province in this example have an average outage time per user above 20 hours/household and an average number of outages per user above 5 times/household, whereas the other cities in the same class mostly have an average outage time of 1.5~4 hours/household and an average number of outages of 0.2~1 times/household, as shown in Table 4 below. Because the average outage time and average number of outages of some prefecture-level companies in a cluster are not representative, in practice the units whose values exceed 3 times the variance of the average outage time or number of outages of the cities in the same class are rejected and do not take part in the mean calculation.
Table 4: Mean values for some units after applying the 3-times-variance criterion
Note 1: non-empty values in the table are prefecture-level cities exceeding 3 times the variance.
Note 2: the table lists only some of the prefecture-level cities.
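The rejection rule can be sketched as follows. The text says "3 times the variance"; the sketch assumes this means the common three-sigma rule (a value is rejected when it exceeds the class mean by more than three standard deviations) and offers a `use_variance` switch for the literal reading. The function name, the one-sided threshold, and the outage figures are hypothetical.

```python
import statistics


def reject_outliers(values, k=3.0, use_variance=False):
    """Split values into (kept, rejected) using a one-sided upper threshold.

    Assumption: the threshold is mean + k * standard deviation. Set
    use_variance=True to use mean + k * variance instead.
    """
    mean = statistics.mean(values)
    if use_variance:
        spread = statistics.pvariance(values)
    else:
        spread = statistics.pstdev(values)
    threshold = mean + k * spread
    kept = [v for v in values if v <= threshold]
    rejected = [v for v in values if v > threshold]
    return kept, rejected


# Hypothetical average outage times (hours/household) for one class:
# twenty cities between 1.5 and 4.0, plus one city at 25.0.
outage_hours = [
    1.5, 1.8, 2.0, 2.2, 2.5, 2.5, 2.8, 3.0, 3.0, 3.2,
    3.5, 3.5, 3.8, 4.0, 2.0, 2.6, 3.1, 3.4, 2.9, 2.7,
    25.0,
]
kept, rejected = reject_outliers(outage_hours)
class_mean = statistics.mean(kept)  # mean used for the comparison
print(rejected, round(class_mean, 3))
```

With only a handful of cities in a class, a single extreme value inflates the standard deviation so much that it can never exceed the three-sigma threshold, so this kind of rule is only meaningful for classes with more than about ten members.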
2. This algorithm is applied to the reliability-level evaluation of the 27 units of the State Grid Corporation (Guo Wang) and more than 300 prefecture-level companies, and the differences among regions and among provincial units are pronounced. Differentiated evaluation is therefore used when scoring the clustered reliability levels of the prefecture-level cities, based mainly on the strength of the grid structure, the service capability, the equipment level, and the regional development of each region and provincial unit. In this way, prefecture-level units with a strong grid structure, good service capability, and a high equipment level are given higher scores; otherwise, they are given lower scores.
Analysis of the operating data of the online system shows that the share of the province's equivalent users held by a prefecture-level city reflects the overall level of that unit well. The city's share of equivalent users is therefore used as a weight in the comprehensive score of the city and the province, so that the evaluation more faithfully reflects the power supply reliability level of the city and the province. The table below gives the scores of the prefecture-level cities and provincial grid companies after weighting (the weight is the city's share of the province's equivalent users).
Table 5: Scores of prefecture-level cities and provincial grid companies after weighting
As shown in Table 5 above, if weights are not considered and the mean of the deviation-ratio scores of the province's cities is used directly as the province's reliability score, the scores for average outage time per user and average number of outages per user are 77.32 and 88.97 respectively. After weighting, the province's final scores for average outage time and number of outages are 86.67 and 96.61 respectively. The two sets of scores differ mainly because of the equivalent-user share of each city (district) in the province. For example, the equivalent-user share of Ninghe in Tianjin is 2.16%. If this share is not considered, its low score drags down the mean and therefore Tianjin's overall reliability level, which does not reflect the true situation. With the weighted score, a unit with few equivalent users contributes a small score, so its influence on the province's total score (the sum over its cities) is also small.
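The effect of the weighting can be shown with a short sketch. The city scores and equivalent-user counts below are hypothetical and do not come from Table 5; the function names are likewise illustrative.

```python
def unweighted_score(scores):
    """Plain mean of the city scores."""
    return sum(scores) / len(scores)


def weighted_score(scores, equivalent_users):
    """Sum of city scores weighted by each city's share of equivalent users."""
    total_users = sum(equivalent_users)
    weights = [u / total_users for u in equivalent_users]
    return sum(s * w for s, w in zip(scores, weights))


# Hypothetical province with three cities; the third has few equivalent
# users (a 5% share) and a low score.
city_scores = [90.0, 95.0, 40.0]
city_users = [500, 450, 50]

plain = unweighted_score(city_scores)              # 75.0
weighted = weighted_score(city_scores, city_users)  # about 89.75
print(plain, weighted)
```

In this sketch the small low-scoring city pulls the plain mean down by 15 points, while in the weighted score its influence is limited to its 5% share, which mirrors the Ninghe example.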
Embodiment three
Based on the same inventive concept, the present invention also provides a power grid power supply reliability level clustering system based on the K-means algorithm. Its structure is shown in Figure 4 and comprises:
a construction module, for selecting the factors that affect power supply reliability and establishing an index set;
a clustering index determination module, for determining the final clustering indices from the index set using the correlation-coefficient significance test;
a cluster analysis module, for performing cluster analysis on the final clustering indices to obtain optimized clustering indices;
an optimization module, for optimizing the basic data that describe the power grid power supply reliability level according to the optimized clustering indices.
Further, the clustering index determination module comprises:
a preprocessing submodule, for preprocessing the basic data in the index set;
a verification submodule, for performing the correlation-coefficient significance test on the indices in the preprocessed index set using principal component analysis;
a forming submodule, for forming the final clustering indices.
Further, the verification submodule comprises:
a first principal component determination unit, for determining the first principal component of the preprocessed index set;
a second principal component determination unit, for successively determining the second, third, fourth, ..., P-th principal components, where P denotes the number of indices in the index set.
Further, the first principal component determination unit comprises:
a first selection subunit, in which the variance of the first linear combination F1, chosen as the first principal component index, is denoted Var(F1); when Var(F1) is largest, the first linear combination F1 is the first principal component. The first linear combination is obtained by recombining, through linear combination, the multiple correlated indices in the preprocessed index set into a new set of mutually uncorrelated composite indices.
Further, the second principal component determination unit comprises:
a second selection subunit, for choosing a second linear combination F2 if the first principal component F1 is not sufficient to represent the information of the original P indices, such that the information already contained in F1 does not appear in F2, expressed mathematically as Cov(F1, F2) = 0; the second linear combination F2 is then called the second principal component;
a third principal component determination subunit, for constructing the third, fourth, ..., P-th principal components by analogy.
Further, the final clustering indices are the set m of the first, second, third, ..., P-th principal components.
The power grid power supply reliability level clustering method based on the K-means algorithm provided by the present invention optimizes the cost function on the basis of the original method, which improves the accuracy and credibility of the clustering. The invention also does work both before and after clustering to improve the clustering effect and its practical significance. The input data are preprocessed. After the correlation processing, a correlation-coefficient significance test is carried out. As a further important step, principal component analysis is used as an auxiliary method to confirm the index selection. Clustering is then carried out with the optimized cost function, so that data with little similarity between classes are separated as far as possible.
In practical engineering applications, the handling of abnormal data and the use of each city's share of the province's equivalent users as a weight work together to improve the credibility of the clustering and the overall effectiveness of the reliability-level evaluation, giving the results more practical application value. In practical engineering applications the invention identifies anomalies in the actual conditions or data, so that abnormal data do not take part in the calculation for their class.
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The above embodiments are intended only to illustrate the technical solution of the present invention and not to limit it. Although the present invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art can still modify or equivalently replace specific embodiments of the invention, and any modification or equivalent replacement that does not depart from the spirit and scope of the invention falls within the scope of the pending claims of the invention.
Claims (16)
1. A power grid power supply reliability level optimization method, characterized by:
selecting the factors that affect power supply reliability and establishing an index set;
determining the final clustering indices from the index set using a correlation-coefficient significance test;
performing cluster analysis on the final clustering indices to obtain a clustering result that classifies the power supply reliability level;
optimizing the basic data that describe the power supply reliability level according to the clustering result.
2. The power grid power supply reliability level optimization method as claimed in claim 1, characterized in that: the index set comprises basic data divided by reliability level, and the basic data comprise indices of regional characteristics, economic level, power load conditions, grid structure and operating conditions, equipment conditions, and technical management level.
3. The power grid power supply reliability level optimization method as claimed in claim 2, characterized in that: determining the clustering indices from the index set using the correlation-coefficient significance test comprises:
preprocessing the basic data in the index set;
performing the correlation-coefficient significance test on the indices in the preprocessed index set using principal component analysis;
forming the final clustering indices.
4. The power grid power supply reliability level optimization method as claimed in claim 3, characterized in that: performing the correlation-coefficient significance test using principal component analysis comprises:
determining the first principal component of the preprocessed index set;
successively determining the second, third, fourth, ..., P-th principal components, where P denotes the number of indices in the index set.
5. The power grid power supply reliability level clustering method as claimed in claim 4, characterized in that: determining the first principal component of the preprocessed index set comprises:
denoting the variance of the first linear combination F1, chosen as the first principal component index, as Var(F1); when Var(F1) is largest, the first linear combination F1 is the first principal component; the first linear combination is obtained by recombining, through linear combination, the multiple correlated indices in the preprocessed index set into a new set of mutually uncorrelated composite indices.
6. The power grid power supply reliability level clustering method as claimed in claim 5, characterized in that: successively determining the second, third, fourth, ..., P-th principal components comprises:
if the first principal component F1 is not sufficient to represent the information of the original P indices, choosing a second linear combination F2 such that the information already contained in F1 does not appear in F2, expressed mathematically as Cov(F1, F2) = 0; the second linear combination F2 is then called the second principal component;
constructing the third, fourth, ..., P-th principal components by analogy.
7. The power grid power supply reliability level optimization method as claimed in claim 6, characterized in that: the final clustering indices are the set m of the first, second, third, ..., P-th principal components.
8. The power grid power supply reliability level optimization method as claimed in claim 7, characterized in that: performing cluster analysis on the final clustering indices to obtain the clustering result that classifies the power supply reliability level comprises: performing cluster analysis on the final clustering indices using the cost function of the K-means algorithm, the cost function of the K-means algorithm being expressed as:
where: μ_1, ..., μ_K are the K cluster centers; m denotes the final clustering indices; μ_{c^(i)} denotes the center of the class to which the i-th clustering index belongs; x^(i) is a data point in the clustering indices; and c^(1), ..., c^(m) are the classes to which the data points x^(1), ..., x^(m) in the clustering indices belong.
9. The power grid power supply reliability level optimization method as claimed in claim 8, characterized in that: optimizing the basic data that describe the power supply reliability level according to the clustering result comprises:
rejecting the basic data that exceed 3 times the variance of the basic data of the power supply reliability level, so that the cost function of the K-means algorithm is minimized.
10. The power grid power supply reliability level optimization method as claimed in claim 9, characterized in that: the cost function of the K-means algorithm is minimized according to the following formula:
11. A power grid power supply reliability level optimization system, characterized by comprising:
a construction module, for selecting the factors that affect power supply reliability and establishing an index set;
a clustering index determination module, for determining the final clustering indices from the index set using a correlation-coefficient significance test;
a cluster analysis module, for performing cluster analysis on the final clustering indices to obtain optimized clustering indices;
an optimization module, for optimizing the basic data that describe the power grid power supply reliability level according to the optimized clustering indices.
12. The power grid power supply reliability level optimization system as claimed in claim 11, characterized in that: the clustering index determination module further comprises:
a preprocessing submodule, for preprocessing the basic data in the index set;
a verification submodule, for performing the correlation-coefficient significance test on the indices in the preprocessed index set using principal component analysis;
a forming submodule, for forming the final clustering indices.
13. The power grid power supply reliability level optimization system as claimed in claim 11, characterized in that: the verification submodule comprises:
a first principal component determination unit, for determining the first principal component of the preprocessed index set;
a second principal component determination unit, for successively determining the second, third, fourth, ..., P-th principal components, where P denotes the number of indices in the index set.
14. The power grid power supply reliability level optimization system as claimed in claim 13, characterized in that: the first principal component determination unit comprises:
a first selection subunit, in which the variance of the first linear combination F1, chosen as the first principal component index, is denoted Var(F1); when Var(F1) is largest, the first linear combination F1 is the first principal component; the first linear combination is obtained by recombining, through linear combination, the multiple correlated indices in the preprocessed index set into a new set of mutually uncorrelated composite indices.
15. The power grid power supply reliability level optimization system as claimed in claim 13, characterized in that: the second principal component determination unit comprises:
a second selection subunit, for choosing a second linear combination F2 if the first principal component F1 is not sufficient to represent the information of the original P indices, such that the information already contained in F1 does not appear in F2, expressed mathematically as Cov(F1, F2) = 0; the second linear combination F2 is then called the second principal component;
a third principal component determination subunit, for constructing the third, fourth, ..., P-th principal components by analogy.
16. The power grid power supply reliability level optimization system as claimed in claim 15, characterized in that: the final clustering indices are the set m of the first, second, third, ..., P-th principal components.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711228891.6A CN109840536A (en) | 2017-11-29 | 2017-11-29 | A kind of power grid power supply reliability horizontal clustering method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109840536A true CN109840536A (en) | 2019-06-04 |
Family
ID=66882339
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711228891.6A Pending CN109840536A (en) | 2017-11-29 | 2017-11-29 | A kind of power grid power supply reliability horizontal clustering method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109840536A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103646286A (en) * | 2013-09-02 | 2014-03-19 | 河海大学 | Data processing method for estimating efficiency of intelligent distribution network |
CN105303468A (en) * | 2015-11-20 | 2016-02-03 | 国网天津市电力公司 | Comprehensive evaluation method of smart power grid construction based on principal component cluster analysis |
CN106530134A (en) * | 2016-11-16 | 2017-03-22 | 国家电网公司 | Influence index marginal benefit analysis method and device based on reliability calculation model |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111768109A (en) * | 2020-07-02 | 2020-10-13 | 广东电网有限责任公司 | Reliability early warning method and system for power electronic medium-voltage distribution network and terminal equipment |
CN111932147A (en) * | 2020-09-02 | 2020-11-13 | 平安国际智慧城市科技股份有限公司 | Visualization method and device for overall index, electronic equipment and storage medium |
CN116956075A (en) * | 2023-09-18 | 2023-10-27 | 国网山西省电力公司营销服务中心 | Automatic identification method, system, equipment and storage medium for type of power consumer side |
CN116956075B (en) * | 2023-09-18 | 2024-01-12 | 国网山西省电力公司营销服务中心 | Automatic identification method, system, equipment and storage medium for type of power consumer side |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||