CN109840536A

CN109840536A - Method and system for clustering power grid power supply reliability levels

Info

Publication number: CN109840536A
Application number: CN201711228891.6A
Authority: CN
Inventors: 高波; 陈红森; 张鹏; 呂颖; 王宏刚; 芦晶晶; 于之虹; 胡建勇
Original assignee: State Grid Corp of China SGCC; China Electric Power Research Institute Co Ltd CEPRI
Current assignee: State Grid Corp of China SGCC; China Electric Power Research Institute Co Ltd CEPRI
Priority date: 2017-11-29
Filing date: 2017-11-29
Publication date: 2019-06-04

Abstract

The present invention relates to a method and system for clustering power grid power supply reliability levels, comprising: selecting the factors that influence power supply reliability to establish an indicator set; determining the final clustering indicators from the indicator set using a correlation coefficient significance test; performing cluster analysis on the final clustering indicators to obtain a clustering result that classifies power supply reliability levels; and optimizing the basic data describing the power supply reliability level according to the clustering result. The invention processes the indicator set before clustering: after the correlation coefficients of the data are calculated, a correlation coefficient significance test is performed and an auxiliary discrimination step, namely principal component analysis, is added, which not only reduces the dimensionality but also increases confidence to a certain degree.

Description

Method and system for clustering power grid power supply reliability levels

Technical field

The present invention relates to a clustering method, and in particular to a method and system for clustering power grid power supply reliability levels.

Background art

The K-means algorithm is a hard clustering algorithm and a typical representative of prototype-based objective-function clustering methods. It takes a function of the distance from the data points to the prototypes as the objective to be optimized, and derives the adjustment rule for the iteration by finding the extremum of that function. K-means uses the Euclidean distance as its similarity measure and seeks the optimal classification for a given initial cluster-center vector V such that the evaluation index J is minimized. The algorithm uses the sum-of-squared-errors criterion function as its clustering criterion.

K-means is an unsupervised learning method. With k as a parameter, it partitions n objects into k clusters so that similarity within a cluster is high and similarity between clusters is low. Similarity is computed with respect to the mean of the objects in a cluster (regarded as the cluster's center of gravity). The algorithm first randomly selects k objects, each representing the centroid of one cluster. Each remaining object is assigned to the most similar cluster according to its distance to each cluster centroid. The new centroid of each cluster is then computed. This process repeats until the criterion function converges.

K-means is a typical dynamic clustering algorithm based on point-by-point iterative modification; its key point is that it uses the sum of squared errors as the criterion function. Point-by-point modification of class centers: once a pixel sample has been assigned to a class according to some rule, the mean of that class is recalculated, and the new mean serves as the condensation center for clustering the next pixel. Batch modification of class centers: after all pixel samples have been classified according to a given set of class centers, the mean of each class is recalculated and used as the condensation center for the next round of classification.
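The difference between point-by-point and batch modification of a class center can be illustrated with a small sketch. It is not taken from the patent; the function names `add_point` and `batch_mean` are hypothetical, and the running-mean formula `c + (p - c) / n` is the standard incremental mean, assumed here as one way to realize the point-by-point update.

```python
def add_point(centroid, count, point):
    """Point-by-point update: adjust the class mean as soon as one sample joins."""
    new_count = count + 1
    new_centroid = [c + (p - c) / new_count for c, p in zip(centroid, point)]
    return new_centroid, new_count


def batch_mean(points):
    """Batch update: recompute the class mean from all members at once."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


# Illustrative samples (made up for this sketch).
members = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]

# Start from the first sample and add the others one at a time.
centroid, count = list(members[0]), 1
for p in members[1:]:
    centroid, count = add_point(centroid, count, p)

batch_centroid = batch_mean(members)
```

Both routes end at the same mean for a fixed membership; they differ only in when the center is refreshed during the assignment pass.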

K-means is efficient and scalable on large data sets; its time complexity is close to linear, and the algorithm is concise and fast, which makes it suitable for mining large-scale data sets. However, the algorithm has the following problems:

1. The algorithm must repeatedly adjust the classification of samples and recompute the adjusted cluster centers, so when the data volume is large and the dimensionality is high, its time overhead is very large. In addition, when the similarity between classes is small, the method performs poorly.

2. In K-means, an initial partition must first be determined from the initial cluster centers and is then optimized. The choice of initial cluster centers strongly affects the clustering result: with a poorly chosen initial value, an effective clustering result may not be obtained, which is one of the main problems of K-means. To address this problem, many algorithms use a genetic algorithm (GA); for example, the literature reports initialization by a genetic algorithm with an internal clustering criterion as the evaluation index, but the results are unsatisfactory.

Summary of the invention

To remedy the above deficiencies of the prior art, the object of the present invention is to provide a method and system for clustering power grid power supply reliability levels that offers a practical treatment when the volume of data to be clustered is large and the factors are many; its effectiveness is verified on an example of the power supply reliability level clustering problem.

The object of the present invention is achieved by the following technical solution:

The present invention provides a power grid power supply reliability level optimization method, the improvement comprising:

selecting the factors that influence power supply reliability to establish an indicator set;

determining the final clustering indicators from the indicator set using a correlation coefficient significance test;

performing cluster analysis on the final clustering indicators to obtain a clustering result that classifies power supply reliability levels;

optimizing the basic data describing the power supply reliability level according to the clustering result.

Further: the indicator set comprises basic data divided by reliability level, the basic data comprising indicators of regional characteristics, economic level, power load conditions, grid structure and operating conditions, equipment conditions, and technical management level.

Further: determining the clustering indicators from the indicator set using the correlation coefficient significance test comprises:

pre-processing the basic data in the indicator set;

performing the correlation coefficient significance test on the indicators of the pre-processed indicator set using principal component analysis;

forming the final clustering indicators.

Further: performing the correlation coefficient significance test using principal component analysis comprises:

determining the first principal component of the pre-processed indicator set;

successively determining the second principal component, the third principal component, the fourth principal component, ..., and the P-th principal component, where P denotes the number of indicators in the indicator set.

Further: determining the first principal component of the pre-processed indicator set comprises:

The first principal component is characterized by the variance Var(F1) of the first selected linear combination F1; the linear combination F1 for which Var(F1) is largest is the first principal component. The first linear combination is obtained by linearly combining multiple correlated indicators of the pre-processed indicator set into a new set of mutually uncorrelated composite indicators.

Further: successively determining the second principal component, the third principal component, the fourth principal component, ..., and the P-th principal component comprises:

If the first principal component F1 is not sufficient to represent the information of the original P indicators, a second linear combination F2 is selected such that the information already contained in F1 does not appear in F2, which is expressed mathematically as Cov(F1, F2) = 0; the second linear combination F2 is then called the second principal component;

The third, fourth, ..., and P-th principal components are constructed in the same way.

Further: the final clustering indicators are the set m of the first principal component, the second principal component, the third principal component, ..., and the P-th principal component.

Further: performing cluster analysis on the final clustering indicators to obtain the clustering result that classifies power supply reliability levels comprises: performing cluster analysis on the final clustering indicators using the cost function of the K-means algorithm, the cost function of the K-means algorithm being expressed as:

$$J\left(c^{(1)},\dots,c^{(m)},\mu_1,\dots,\mu_K\right)=\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}-\mu_{c^{(i)}}\right\|^{2}$$

where $\mu_1,\dots,\mu_K$ are the $K$ cluster centers, $m$ refers to the final clustering indicators (in the sum, the number of data points they contain), $\mu_{c^{(i)}}$ is the center of the class to which the $i$-th clustering indicator data point belongs, $x^{(i)}$ is a data point in the clustering indicators, and $c^{(1)},\dots,c^{(m)}$ are the classes to which the data points $x^{(1)},\dots,x^{(m)}$ belong.
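As an illustration of how such a cost function is evaluated, the following minimal sketch computes the mean squared distance between each data point and the center of its assigned class. The function names and the sample data are hypothetical, and the normalization by the number of points is an assumption of this sketch; dropping that factor does not change which assignment minimizes the cost.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans_cost(points, labels, centers):
    """Mean squared distance from each point x(i) to the centre of its class c(i)."""
    m = len(points)
    return sum(sq_dist(x, centers[c]) for x, c in zip(points, labels)) / m


# Illustrative data (made up): two tight groups and their centres.
points = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
centers = [[0.0, 1.0], [10.0, 1.0]]

good_cost = kmeans_cost(points, [0, 0, 1, 1], centers)  # each point with its nearest centre
bad_cost = kmeans_cost(points, [1, 1, 0, 0], centers)   # labels deliberately swapped
```

A lower cost indicates that points lie closer to the centers of the classes they were assigned to, which is the quantity the clustering step tries to reduce.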

Further: optimizing the basic data describing the power supply reliability level according to the clustering result comprises:

rejecting basic data that exceed 3 times the variance of the basic data of the power supply reliability level, so that the cost function of the K-means algorithm is minimized.

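Further: the cost function of the K-means algorithm is minimized according to the following formula:

$$\min_{c^{(1)},\dots,c^{(m)},\;\mu_1,\dots,\mu_K} J\left(c^{(1)},\dots,c^{(m)},\mu_1,\dots,\mu_K\right)$$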

The present invention also provides a power grid power supply reliability level optimization system, the improvement comprising:

a construction module, configured to select the factors that influence power supply reliability to establish an indicator set;

a clustering indicator determination module, configured to determine the final clustering indicators from the indicator set using a correlation coefficient significance test;

a cluster analysis module, configured to perform cluster analysis on the final clustering indicators to obtain optimized clustering indicators;

an optimization module, configured to optimize the basic data describing the power grid power supply reliability level according to the optimized clustering indicators.

Further: the clustering indicator determination module further comprises:

a pre-processing submodule, configured to pre-process the basic data in the indicator set;

a test submodule, configured to perform the correlation coefficient significance test on the indicators of the pre-processed indicator set using principal component analysis;

a forming submodule, configured to form the final clustering indicators.

Further: the test submodule comprises:

a first principal component determination unit, configured to determine the first principal component of the pre-processed indicator set;

a second principal component determination unit, configured to successively determine the second principal component, the third principal component, the fourth principal component, ..., and the P-th principal component, where P denotes the number of indicators in the indicator set.

Further: the first principal component determination unit comprises:

a first selection subunit, configured to characterize the first principal component by the variance Var(F1) of the first selected linear combination F1, the linear combination F1 for which Var(F1) is largest being the first principal component; the first linear combination is obtained by linearly combining multiple correlated indicators of the pre-processed indicator set into a new set of mutually uncorrelated composite indicators.

Further: the second principal component determination unit comprises:

a second selection subunit, configured to select a second linear combination F2 if the first principal component F1 is not sufficient to represent the information of the original P indicators, such that the information already contained in F1 does not appear in F2, which is expressed mathematically as Cov(F1, F2) = 0, the second linear combination F2 then being called the second principal component;

a third principal component determination subunit, configured to construct the third, fourth, ..., and P-th principal components in the same way.

Compared with the closest prior art, the technical solution provided by the invention has the following beneficial effects:

1. The invention selects the factors that influence power supply reliability to establish an indicator set; determines the final clustering indicators from the indicator set using a correlation coefficient significance test; performs cluster analysis on the final clustering indicators to obtain a clustering result that classifies power supply reliability levels; and optimizes the basic data describing the power supply reliability level according to the clustering result. Because the indicator set is established before clustering and a correlation coefficient significance test is performed after correlation processing, the optimized clustering indicators obtained through cluster analysis allow the clustering, when the power grid power supply reliability level is optimized, to distinguish as far as possible data with small inter-class similarity, which improves the clustering effect.

2. In some cases, because the selected indicators are all data of the same nature, the traditional K-means algorithm has difficulty distinguishing data with small similarity and merges them into one class; the invention addresses this problem of distinguishing such data. An auxiliary method, principal component analysis, is used for confirmatory discrimination, which not only reduces the dimensionality but also increases confidence to a certain degree.

3. In some cases the output of the algorithm is applied directly in practice without rejecting actual abnormal data; even if the algorithm is accurate, abnormal real data then bias the cluster-level results in practical application. The present algorithm takes this engineering problem into account. For example, the average outage time of some prefecture-level cities is far higher than that of similar cities; if the average outage time of all similar cities were used without rejection, units whose values are far below the abnormally inflated mean would be treated unfairly, and the scoring would lose accuracy.

4. The equivalent user number of each prefecture-level city is added as a weight factor in the evaluation of grid reliability level clustering and enters the reliability level evaluation of prefecture-level cities and provinces, so that prefecture-level units can be evaluated scientifically and reasonably according to each unit's grid strength, service capability, and equipment level.

Brief description of the drawings

Fig. 1 is a general flow chart of the power grid power supply reliability level clustering method provided by the invention;

Fig. 2 is a detailed flow chart of the power grid power supply reliability level clustering method provided by the invention;

Fig. 3 is a schematic diagram of the added principal component analysis discrimination provided by the invention;

Fig. 4 is a structural diagram of the K-means-based power grid power supply reliability level clustering system provided by the invention.

Specific embodiment

Specific embodiments of the invention are described in further detail below with reference to the accompanying drawings.

The following description and drawings sufficiently illustrate specific embodiments of the invention to enable those skilled in the art to practice them. Other embodiments may include structural, logical, electrical, process, and other changes. The embodiments merely represent possible variations. Unless explicitly required, individual components and functions are optional, and the order of operations may vary. Portions and features of some embodiments may be included in, or substituted for, those of other embodiments. The scope of the embodiments of the invention includes the full scope of the claims and all available equivalents of the claims. Herein, these embodiments may be referred to, individually or collectively, by the term "invention" merely for convenience and, if more than one invention is in fact disclosed, without intending to automatically limit the scope of this application to any single invention or inventive concept.

Embodiment 1

The invention pre-processes the input data and, after correlation processing, performs a correlation coefficient significance test. A further important step is a confirmatory discrimination using an auxiliary method, principal component analysis. Clustering is then performed with the optimized cost function so that it distinguishes, as far as possible, data with small inter-class similarity. If the output clustering result is applied directly, without further processing of the data it is applied to (abnormal data, for example), then however good the clustering result is, good results are hard to obtain in engineering applications. Therefore, in practical engineering applications the actual conditions or data concerned are also checked for anomalies, so that abnormal data do not take part in the calculation for their class.

The invention provides a method for clustering power grid power supply reliability levels, whose flow charts are shown in Figs. 1 and 2, comprising:

1. selecting the factors that influence power supply reliability to establish an indicator set;

2. determining the final clustering indicators from the indicator set using a correlation coefficient significance test;

3. performing cluster analysis on the final clustering indicators using the cost function of the K-means algorithm to obtain a clustering result that classifies power supply reliability levels;

4. optimizing the basic data describing the power supply reliability level according to the clustering result.

Correlation coefficients and their significance test are an effective way to verify whether the correlation among the selected indicators is significant, and they also strongly influence whether the clustering result is optimal. This step is therefore indispensable before clustering; otherwise the selected clustering factors lack feasibility and credibility. Moreover, after the correlation coefficient significance test has determined the selected indicators, an auxiliary discrimination by some method is needed; here the final clustering indicators are determined by cross-checking principal component analysis against the correlation coefficient significance test.
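A correlation coefficient significance test of this kind can be sketched as follows. The patent does not state which test statistic it uses; this sketch assumes the common t-test for a Pearson correlation coefficient, t = r·sqrt((n − 2)/(1 − r²)) with n − 2 degrees of freedom. The function names, the sample values, and the indicator names are hypothetical; the critical value 2.447 is the two-sided 5% value of the t distribution for 6 degrees of freedom.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: no correlation (undefined when |r| == 1)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def is_significant(x, y, t_critical):
    """True if the correlation between x and y passes the assumed t-test."""
    r = pearson_r(x, y)
    return abs(t_statistic(r, len(x))) > t_critical


# Made-up values for eight units: a candidate indicator, an outage-time
# figure, and an unrelated indicator.
cable_rate = [10, 20, 30, 40, 50, 60, 70, 80]
outage_hours = [9.1, 8.2, 7.4, 6.0, 5.5, 4.1, 3.2, 2.0]
unrelated = [5, 3, 6, 2, 7, 4, 6, 3]

T_CRIT = 2.447  # two-sided 5% critical value, n - 2 = 6 degrees of freedom

keep_cable_rate = is_significant(cable_rate, outage_hours, T_CRIT)
keep_unrelated = is_significant(cable_rate, unrelated, T_CRIT)
```

Under these assumptions, an indicator that fails the test would be a candidate for rejection before clustering, subject to the auxiliary check by principal component analysis.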

1. Selecting the factors that influence power supply reliability to establish an indicator set comprises: the indicator set comprises basic data divided by reliability level, the basic data comprising indicators of regional characteristics, economic level, power load conditions, grid structure and operating conditions, equipment conditions, and technical management level.

2. Determining the final clustering indicators from the indicator set using the correlation coefficient significance test comprises performing the correlation coefficient significance test on the indicators of the pre-processed indicator set using principal component analysis;

Principal component analysis (PCA) is a multivariate statistical method that uses a linear transformation to select a smaller number of important variables from many variables. It was first introduced by K. Pearson for non-random variables and was later generalized by H. Hotelling to random vectors. The amount of information is usually measured by the sum of squared deviations or by the variance.

In practice, to analyze a problem comprehensively, many related variables (or factors) are often proposed, because each variable reflects some information about the problem to a different degree. Principal component analysis recombines multiple correlated indicators, by linear combination, into a new set of mutually uncorrelated composite indicators that replace the original indicators.

The classical approach expresses this through the variance of F1 (the first selected linear combination, i.e. the first composite indicator): the larger Var(F1) is, the more information F1 contains. The F1 chosen among all linear combinations should therefore be the one with the largest variance, and it is called the first principal component. If the first principal component is not sufficient to represent the information of the original P indicators, a second linear combination F2 is selected. To reflect the original information effectively, information already contained in F1 need not appear again in F2; expressed mathematically, this requires Cov(F1, F2) = 0, and F2 is then called the second principal component. The third, fourth, ..., and P-th principal components are constructed in the same way.
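The two properties described above, the largest variance for F1 and zero covariance between F1 and F2, can be checked numerically. The sketch below is not the patent's implementation: it assumes the sample covariance matrix as the basis of the analysis and uses power iteration with deflation (a standard technique, swapped in here to avoid external libraries) to extract the leading directions. All function names and the sample data are hypothetical.

```python
import math


def covariance_matrix(rows):
    """Centre the columns; return (centred rows, sample covariance matrix)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(p)]
           for a in range(p)]
    return centred, cov


def dominant_eigenpair(mat, iters=500):
    """Power iteration: largest eigenvalue and eigenvector of a symmetric matrix."""
    p = len(mat)
    v = [1.0 / (j + 1) for j in range(p)]  # arbitrary non-zero start vector
    for _ in range(iters):
        w = [sum(mat[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    mv = [sum(mat[a][b] * v[b] for b in range(p)) for a in range(p)]
    return sum(v[a] * mv[a] for a in range(p)), v


def principal_components(rows, k):
    """Return [(variance, direction)] for the first k components and the scores."""
    centred, cov = covariance_matrix(rows)
    p = len(cov)
    comps = []
    for _ in range(k):
        lam, v = dominant_eigenpair(cov)
        comps.append((lam, v))
        # Deflation: remove the variance already explained by this component.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(p)] for a in range(p)]
    scores = [[sum(c[j] * v[j] for j in range(p)) for _, v in comps] for c in centred]
    return comps, scores


# Made-up data: the first two indicators are strongly correlated.
rows = [[2.0, 1.9, 0.5], [4.0, 4.1, 0.1], [6.0, 6.2, 0.9],
        [8.0, 7.8, 0.3], [10.0, 10.1, 0.7]]
components, scores = principal_components(rows, 2)

n = len(rows)
f1 = [s[0] for s in scores]
f2 = [s[1] for s in scores]
var_f1 = sum(a * a for a in f1) / (n - 1)  # scores are already centred
var_f2 = sum(b * b for b in f2) / (n - 1)
cov_f1_f2 = sum(a * b for a, b in zip(f1, f2)) / (n - 1)
```

In this sketch `var_f1` exceeds `var_f2`, and `cov_f1_f2` is zero up to rounding error, matching the requirement Cov(F1, F2) = 0.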

Forming the final clustering indicators: the first principal component, the second principal component, the third principal component, the fourth principal component, ..., and the P-th principal component form the final clustering indicator set m.

3. Performing cluster analysis on the final clustering indicators using the cost function of the K-means algorithm to obtain a clustering result that classifies power supply reliability levels:

K-means clustering is an important algorithm in partition-based cluster analysis, but the number of clusters must be determined in advance; in power grids in particular, business needs mean that the number of clusters is largely specified manually. Moreover, when characteristics are very similar, a small deviation in a data point near a boundary may move it from one class to another. The user has to choose between guaranteeing optimal clustering and meeting business needs. The invention is an optimized clustering method based on K-means which, by optimizing the cost function, distinguishes data with similar characteristics and groups data with small similarity into one class. The new method offers a practical treatment when the volume of data to be clustered is large and the factors are many, and its effectiveness is verified on an example of the power supply reliability level clustering problem.

K-means is an unsupervised learning algorithm. Its inputs are a training data set $\{x^{(1)},\dots,x^{(m)}\}$ (where $x^{(i)}\in\mathbb{R}^{n}$) and the number of clusters K (the data are divided into K classes); its outputs are the K cluster centers $\mu_1,\mu_2,\dots,\mu_K$ and the class to which each data point $x^{(i)}$ belongs. The K-means procedure is as follows:

i. Randomly initialize K cluster centroids: $\mu_1,\mu_2,\dots,\mu_K$;

ii. For each data point $x^{(i)}$, find the nearest cluster center and assign the point to that class, i.e. $c^{(i)}:=\arg\min_{k}\left\|x^{(i)}-\mu_k\right\|^{2}$, where $c^{(i)}$ denotes the class of the data point $x^{(i)}$ in the clustering indicators;

iii. Update each cluster center $\mu_k$ to the mean of all data points belonging to class k; repeat steps ii and iii until convergence or until the maximum number of iterations is reached.
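Steps i to iii can be sketched in a few lines of Python. This is a minimal illustration of the standard procedure, not the patent's optimized variant. The function name `kmeans`, the optional `init` parameter (added so that the example is reproducible), the `seed` parameter, and the sample data are all assumptions of this sketch.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans(points, k, max_iter=100, init=None, seed=0):
    """Plain K-means: returns (centres, labels)."""
    # Step i: random initialisation unless explicit starting centres are given.
    if init is None:
        rng = random.Random(seed)
        centers = [list(p) for p in rng.sample(points, k)]
    else:
        centers = [list(c) for c in init]
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Step ii: assign every point to its nearest centre.
        new_labels = [min(range(k), key=lambda j: sq_dist(x, centers[j]))
                      for x in points]
        if new_labels == labels:  # assignments stable, so the run has converged
            break
        labels = new_labels
        # Step iii: move each centre to the mean of its members.
        for j in range(k):
            members = [x for x, c in zip(points, labels) if c == j]
            if members:  # an empty class keeps its previous centre
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


# Made-up data: two well-separated groups of three points each.
points = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
          [8.0, 8.0], [8.2, 7.9], [7.9, 8.3]]
centers, labels = kmeans(points, 2, init=[[1.0, 1.0], [8.0, 8.0]])
```

With the explicit starting centers above, the first three points fall into class 0 and the last three into class 1. Calling `kmeans(points, 2)` without `init` uses the random initialization of step i, whose outcome can depend on the draw.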

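4. Optimizing the basic data describing the power supply reliability level according to the clustering result:

Cluster analysis is performed on the final clustering indicators using the cost function of the K-means algorithm; the cost function of the K-means algorithm is:

$$J\left(c^{(1)},\dots,c^{(m)},\mu_1,\dots,\mu_K\right)=\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}-\mu_{c^{(i)}}\right\|^{2}$$

Optimizing the basic data describing the power supply reliability level according to the clustering result comprises: rejecting basic data that exceed 3 times the variance of the basic data of the power supply reliability level, so that the cost function of the K-means algorithm is minimized.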

Embodiment 2

The present embodiment studies how to evaluate the reliability level of the prefecture-level power supply companies of the State Grid. The evaluation covers two aspects: the average customer outage time and the average customer outage frequency. From a business perspective, the main factors influencing power supply reliability are the interconnection rate, the transferable supply rate, the line sectionalization rate, the ring network rate, the cabling rate, the distribution automation rate, the equipment level, and so on. However, whether these factors are related, and how strongly, cannot be determined by business analysis alone; auxiliary analysis and discrimination are needed.

This study selects the factors that influence the reliability indicators to establish an indicator set. After the indicator set has been collected, abnormal data are processed and corrected. Based on business analysis, the prefecture-level power supply companies within the State Grid are divided into 5 classes by reliability level, and the factors with significant correlation are clustered using the optimized K-means cost function. Figs. 1 and 2 show the algorithm flow.

According to several factors that influence and reflect the reliability level of regional distribution networks, indicators are selected along several dimensions: economic characteristics, population characteristics, electricity consumption characteristics, equipment scale, grid structure, and management level. The collected indicators are shown in Table 1 below:

Table 1: Collected indicator results

By applying data pre-processing and the correlation coefficient significance test to the collected indicators, the indicator factors for the final clustering are determined, as shown in Table 2 below. In addition, when multivariate problems are studied with statistical methods, too many variables increase the complexity of the analysis. In many cases the variables are correlated, and when two variables are correlated they can be understood as reflecting overlapping information about the problem. Principal component analysis removes redundant, duplicated variables (closely related variables) from all the variables originally proposed and establishes as few new variables as possible, such that the new variables are pairwise uncorrelated and preserve as much of the original information about the problem as possible.

Therefore, the indicators selected by principal component analysis are compared with the indicators selected by the correlation coefficient significance test, and the conclusions of the two methods are found to agree. The dendrogram obtained by principal component analysis is shown in Fig. 3.

Table 2: Rejection and selection of indicators

Note: indicators marked with a colored background are selected; the others are rejected.

After data pre-processing and determination of the final clustering indicator factors, clustering is performed with the optimized K-means cost function to optimize the objective. As shown in Table 3 below, the optimized K-means algorithm better groups prefecture-level cities with consistent characteristics into one class.

Table 3: Comparison of clustering results before and after optimization

Prefecture-level city	Province	Administrative region covered	K-means	Optimized K-means
Linyi	Shandong	Linyi	1	2
Qingdao	Shandong	Qingdao	1	1
Weihai	Shandong	Weihai	1	1
Heze	Shandong	Heze	1	2
Liaocheng	Shandong	Liaocheng	1	2
Tangshan	Jibei	Tangshan	1	1
Rizhao	Shandong	Rizhao	1	2
Zaozhuang	Shandong	Zaozhuang	1	2
Tai'an	Shandong	Tai'an	1	1

In addition, it should be explained that although the optimized K-means clustering method is used here, this is an optimization of the algorithm only. In view of the application and practicality of the invention, some explanation of how the algorithm is applied in practice is needed, as follows.

1. Clustering is an intermediate result; its purpose is to apply the clustering result to practical engineering and daily operation. In this example, that means dividing the prefecture-level cities into classes after their reliability levels have been clustered. The clustering is applied to the average customer outage time and the average customer outage frequency of each prefecture-level city: the actual outage time and frequency of a city are compared with the mean outage time and frequency of the cities in the same class. The calculation example reveals a problem, however. The average customer outage time and outage frequency of a few prefecture-level companies in a cluster are far higher than those of the other cities, whose values are fairly concentrated. If the outage time and frequency of all cities in the class took part in the calculation of the mean, the result would deviate considerably and the evaluation would lose its value.

In this example, the average customer outage time of most prefecture-level cities in Shanxi Province exceeds 20 hours per household, and their average customer outage frequency exceeds 5 times per household, whereas the other cities in the same class mostly have an average outage time of 1.5 to 4 hours per household and an average outage frequency of 0.2 to 1 times per household, as shown in Table 4 below. Because the average outage time and outage frequency of these few prefecture-level companies are not representative of the cluster, in practical application the data of units exceeding 3 times the variance of the average customer outage time or frequency of cities in the same class are rejected and do not take part in the calculation of the mean.

Table 4: Averages of some units after applying the 3-times-variance criterion

Note 1: non-empty values in the table are prefecture-level cities exceeding 3 times the variance

Note 2: the table lists only some of the prefecture-level cities
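The rejection step can be sketched as below. The text speaks of "3 times the variance"; this sketch assumes that the familiar three-sigma rule is meant, i.e. a value is rejected when it exceeds the class mean by more than three sample standard deviations. That reading, the one-sided test, the single pass, the function name, and the sample values are all assumptions for illustration.

```python
import statistics


def reject_high_outliers(values, n_sigma=3.0):
    """Split values into (kept, rejected) using mean + n_sigma * stdev as the limit."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation
    limit = mean + n_sigma * std
    kept = [v for v in values if v <= limit]
    rejected = [v for v in values if v > limit]
    return kept, rejected


# Made-up average outage times (hours per household) for one class:
# twenty units between 1.5 and 3.875 hours, plus one unit at 22 hours.
outage_hours = [1.5 + 0.125 * i for i in range(20)] + [22.0]

kept, rejected = reject_high_outliers(outage_hours)
class_mean_all = statistics.mean(outage_hours)  # mean including the abnormal unit
class_mean_clean = statistics.mean(kept)        # mean after rejection
```

Note that with very few units in a class a single extreme value cannot exceed the three-sigma limit at all, because it inflates the standard deviation itself; the rule is only meaningful when the class is reasonably large.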

2, since this algorithm is applied to 27 units of Guo Wang company, the reliability level of more than 300 prefecture-level companies is commented Valence, each interregional, between each province's unit differentiation are obvious.This is mainly according to the strong journey of rack between each region, each province's unit Degree, service ability, equipment equipment and region development use differentiation in districts and cities' reliability level cluster score Evaluation.In this way so that districts and cities' unit that those racks are strong, service ability is good, equipment level is high is given compared with high score, instead It, then give lower score value.

By to on-line system Operational Data Analysis, the accounting situation that districts and cities' equivalent user number accounts for the province can integrate compared with Reflect the comprehensive level of its unit well, therefore is added to the districts and cities and is somebody's turn to do using districts and cities' equivalent user number accounting as weight The comprehensive scoring saved, so that evaluation result more really restores the power supply reliability level of the districts and cities, the province.The following table is ground City and net province consider the scoring event after weight (districts and cities' equivalent user number accounts for this province ratio).

5 districts and cities of table and net save scoring event after consideration weight

As shown in Table 5 above, when the weight is not considered and the average of the deviation-rate scores of the province's cities is used directly as the province's reliability-level score, the scores for average user outage time and outage frequency are 77.32 and 88.97 respectively. After the weight is considered, the province's final scores for average user outage time and outage frequency are 86.67 and 96.61 respectively. The two sets of scores differ mainly because of the influence of each city's (district's) equivalent-user share on the result. For example, the equivalent-user share of Ninghe, Tianjin is 2.16%. If this share is not considered, Ninghe's low score drags down the average and with it Tianjin's overall reliability level, which does not reflect the real situation. Under the weighted scoring, a unit with few equivalent users contributes a small score, so its influence on the province's total score (the sum over its cities) is also small.
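The difference between the two scoring schemes can be illustrated with a short sketch. The function names and the city scores below are hypothetical and are not taken from Table 5; only the 2.16% share echoes the Ninghe example. The sketch assumes the weighted provincial score is the share-weighted sum of the city scores.

```python
def unweighted_score(scores):
    """Plain average of the city scores."""
    return sum(scores) / len(scores)

def weighted_score(scores, shares):
    """Average of the city scores weighted by equivalent-user share."""
    total_share = sum(shares)
    return sum(s * w for s, w in zip(scores, shares)) / total_share

# Hypothetical scores for three cities in one province; the last city has a
# low score but holds only 2.16% of the province's equivalent users.
city_scores = [90.0, 95.0, 40.0]
user_shares = [0.60, 0.3784, 0.0216]

plain = unweighted_score(city_scores)
weighted = weighted_score(city_scores, user_shares)
```

In this illustration the low-scoring city pulls the plain average down sharply, while its small share leaves the weighted score close to that of the two large cities.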

Embodiment 3

Based on the same inventive concept, the present invention further provides a power grid power supply reliability level clustering system based on the K-means algorithm, the structure of which is shown in Figure 4, comprising:

Further, the clustering target determining module further comprises:

A forming submodule, configured to form the final clustering target.

Further, the verification submodule comprises:

Further, the first principal component determination unit comprises:

Further, the second principal component determination unit comprises:

The present invention provides a power grid power supply reliability level clustering method based on the K-means algorithm. Building on the original method, it optimizes the cost function and thereby improves the accuracy and credibility of the clustering. The invention also does work both before and after clustering to improve the clustering effect and its practical significance. The input data are preprocessed; after the correlation processing, a correlation coefficient significance test is carried out; as a further important step, an auxiliary method, principal component analysis, is used for verification and discrimination; and the clustering is then performed with the optimized cost function, so that the similarity between different clusters is as small as possible. In practical engineering applications, the handling of actual abnormal data and the addition of the city's share of the province's equivalent users as a weight work together to raise the credibility of the clustering and the overall effectiveness of the reliability-level evaluation, giving the result greater practical value. In practical engineering applications the invention also checks the actual conditions or data for anomalies, so that abnormal data do not take part in the same-class calculation.
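As a rough illustration of clustering with a K-means cost function, the sketch below runs the standard assign-and-update iteration and reports the cost. It is a sketch under assumptions, not the optimized cost function of the invention: the cost is taken to be the widely used mean squared distance between each data point and the center of its class, the function names are hypothetical, and the six sample points stand in for principal component scores.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]

def cost(points, centers, labels):
    """Assumed cost: mean squared distance from each point to its class center."""
    return sum(sq_dist(p, centers[c]) for p, c in zip(points, labels)) / len(points)

def kmeans(points, k, n_iter=100, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # initial centers drawn from the data
    labels = assign(points, centers)
    for _ in range(n_iter):
        new_centers = []
        for j in range(k):
            members = [p for p, c in zip(points, labels) if c == j]
            if members:
                new_centers.append(tuple(sum(col) / len(members)
                                         for col in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep the center of an empty class
        new_labels = assign(points, new_centers)
        centers = new_centers
        if new_labels == labels:             # assignments stable: converged
            break
        labels = new_labels
    return labels, centers, cost(points, centers, labels)

# Hypothetical two-dimensional principal component scores for six units.
scores = [(-2.1, 0.1), (-1.9, -0.1), (-2.0, 0.0),
          (2.0, 0.1), (1.9, -0.1), (2.1, 0.0)]
labels, centers, J = kmeans(scores, k=2)
```

Each iteration can only lower or keep the cost, which is why the loop stops as soon as the class assignments no longer change.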

Those skilled in the art should understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) that contain computer-usable program code.

The present application is described with reference to flowcharts and/or block diagrams of the method, the device (system), and the computer program product according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is executed on the computer or other programmable device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

The above embodiments are intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art may still modify the specific embodiments of the invention or replace them with equivalents. Any modification or equivalent replacement that does not depart from the spirit and scope of the present invention falls within the scope of the pending claims of the present invention.

Claims

1. A power grid power supply reliability level optimization method, characterized in that:

2. The power grid power supply reliability level optimization method according to claim 1, characterized in that: the index set is divided by reliability level and comprises basic data, the basic data comprising regional characteristics, economic level, power load conditions, grid structure and operating conditions, equipment conditions, and technical management level indices.

3. The power grid power supply reliability level optimization method according to claim 2, characterized in that: determining the clustering target from the index set using the correlation coefficient significance test comprises:

preprocessing the basic data in the index set;

forming the final clustering target.

4. The power grid power supply reliability level optimization method according to claim 3, characterized in that: performing the correlation coefficient significance test using principal component analysis comprises:

determining the first principal component in the preprocessed index set;

successively determining the second principal component, the third principal component, the fourth principal component, ..., and the P-th principal component, where P denotes the number of indices in the index set.

5. The power grid power supply reliability level clustering method according to claim 4, characterized in that: determining the first principal component in the preprocessed index set comprises:

denoting the variance of the first linear combination F1 chosen for the first principal component as Var(F1), wherein, when Var(F1) is maximal, the first linear combination F1 is the first principal component; the first linear combination is obtained by linearly combining the multiple correlated indices in the preprocessed index set, which are thereby recombined into a new set of mutually uncorrelated comprehensive indices.

6. The power grid power supply reliability level clustering method according to claim 5, characterized in that: successively determining the second principal component, the third principal component, the fourth principal component, ..., and the P-th principal component comprises:

if the first principal component F1 is insufficient to represent the information of the original P indices, choosing a second linear combination F2 such that the information already contained in the first principal component F1 does not appear in the second linear combination F2, expressed mathematically as Cov(F1, F2) = 0, the second linear combination F2 then being called the second principal component;

7. The power grid power supply reliability level optimization method according to claim 6, characterized in that: the final clustering target is the set m of the first principal component, the second principal component, the third principal component, ..., and the P-th principal component.

8. The power grid power supply reliability level optimization method according to claim 7, characterized in that: performing cluster analysis on the final clustering target to obtain the clustering result of the power supply reliability level classification comprises: performing cluster analysis on the final clustering target using the cost function of the K-means algorithm, the cost function of the K-means algorithm being expressed as:

Where: $\mu_1, \ldots, \mu_K$ are the K cluster centers; m denotes the final clustering target; $\mu_{c^{(i)}}$ denotes the center of the class to which the i-th clustering index belongs; $x^{(i)}$ is a data point in the clustering target; and $c^{(1)}, \ldots, c^{(m)}$ are the classes to which the data points $x^{(1)}, \ldots, x^{(m)}$ belong.

9. The power grid power supply reliability level optimization method according to claim 8, characterized in that: optimizing the basic data describing the power supply reliability level according to the clustering result comprises:

rejecting the basic data that exceed 3 times the variance of the basic data of the power supply reliability level, so that the cost function of the K-means algorithm is minimized.

10. The power grid power supply reliability level optimization method according to claim 9, characterized in that: the cost function of the K-means algorithm is minimized according to the following formula:

11. A power grid power supply reliability level optimization system, characterized by comprising:

a clustering target determining module, configured to determine the final clustering target from the index set using the correlation coefficient significance test;

an optimization module, configured to optimize the basic data describing the power grid power supply reliability level according to the optimized clustering target.

12. The power grid power supply reliability level optimization system according to claim 11, characterized in that: the clustering target determining module further comprises:

a verification submodule, configured to perform the correlation coefficient significance test, using principal component analysis, on the indices in the preprocessed index set;

a forming submodule, configured to form the final clustering target.

13. The power grid power supply reliability level optimization system according to claim 11, characterized in that: the verification submodule comprises:

a second principal component determination unit, configured to successively determine the second principal component, the third principal component, the fourth principal component, ..., and the P-th principal component, where P denotes the number of indices in the index set.

14. The power grid power supply reliability level optimization system according to claim 13, characterized in that: the first principal component determination unit comprises:

a first choosing subunit, configured to denote the variance of the first linear combination F1 chosen for the first principal component as Var(F1), wherein, when Var(F1) is maximal, the first linear combination F1 is the first principal component; the first linear combination is obtained by linearly combining the multiple correlated indices in the preprocessed index set, which are thereby recombined into a new set of mutually uncorrelated comprehensive indices.

15. The power grid power supply reliability level optimization system according to claim 13, characterized in that: the second principal component determination unit comprises:

a second choosing subunit, configured to, if the first principal component F1 is insufficient to represent the information of the original P indices, choose a second linear combination F2 such that the information already contained in the first principal component F1 does not appear in the second linear combination F2, expressed mathematically as Cov(F1, F2) = 0, the second linear combination F2 then being called the second principal component;

16. The power grid power supply reliability level optimization system according to claim 15, characterized in that: the final clustering target is the set m of the first principal component, the second principal component, the third principal component, ..., and the P-th principal component.