CN109840536A - Power grid power supply reliability level clustering method and system - Google Patents

Info

Publication number
CN109840536A
Authority
CN
China
Prior art keywords
principal component
power supply
supply reliability
index
clustering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711228891.6A
Other languages
Chinese (zh)
Inventor
高波
陈红森
张鹏
呂颖
王宏刚
芦晶晶
于之虹
胡建勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
Original Assignee
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC and China Electric Power Research Institute Co Ltd CEPRI
Priority to CN201711228891.6A
Publication of CN109840536A
Legal status: Pending

Landscapes

  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The present invention relates to a method and system for clustering power grid power supply reliability levels, comprising: selecting the factors that influence power supply reliability to establish an index set; determining the final clustering indices from the index set by a correlation coefficient significance test; performing cluster analysis on the final clustering indices to obtain a classification of power supply reliability levels; and optimizing the basic data that describe the power supply reliability level according to the clustering result. The invention processes the index set before clustering: after the correlation coefficients of the data are calculated, a significance test is applied and an auxiliary discrimination step, namely principal component analysis, is added, which both reduces dimensionality and, to a certain degree, increases confidence in the result.

Description

Power grid power supply reliability level clustering method and system
Technical field
The present invention relates to a clustering method, and in particular to a method and system for clustering power grid power supply reliability levels.
Background
The K-means algorithm is a hard clustering algorithm and a typical representative of prototype-based objective-function clustering methods. It takes a function of the distance from the data points to the prototypes as the objective to be optimized and obtains the update rules of the iteration by finding the extremum of that function. K-means uses Euclidean distance as its similarity measure and seeks the optimal classification for a given initial cluster-center vector V such that the evaluation index J is minimized. The algorithm uses the sum-of-squared-errors criterion function as its clustering criterion.
K-means is an unsupervised learning method. Taking k as a parameter, it partitions n objects into k clusters so that similarity within a cluster is high and similarity between clusters is low. Similarity is computed with respect to the mean of the objects in a cluster (regarded as the cluster's center of gravity). The algorithm first randomly selects k objects, each representing the centroid of one cluster. Each remaining object is assigned to the most similar cluster according to its distance from each cluster centroid. The new centroid of each cluster is then computed. This process repeats until the criterion function converges.
K-means is a typical dynamic clustering algorithm that iterates by point-by-point modification, with the sum of squared errors as its criterion function. In point-by-point modification of class centers, once a pixel sample has been assigned to a class according to some rule, the mean of that class is recomputed, and the new mean serves as the cluster center for the next pixel. In batch modification of class centers, all pixel samples are first classified against the current class centers, and the mean of each class is then recomputed and used as the cluster center for the next pass.
K-means is efficient and scalable on large data sets; its time complexity is nearly linear, the algorithm is simple and fast, and it is suited to mining large-scale data sets. However, the algorithm has the following problems:
1. The algorithm must repeatedly adjust the sample classification and recompute the adjusted cluster centers, so when the data volume is large and the dimensionality is high, its time overhead is very large. Moreover, when the similarity between classes is small, the method performs poorly.
2. K-means first determines an initial partition from the initial cluster centers and then optimizes that partition. The choice of initial cluster centers strongly affects the clustering result; a poor choice may yield no valid clustering, which is a major problem of K-means. To address it, many algorithms use a genetic algorithm (GA); for example, the literature describes initializing with a GA and using an internal clustering criterion as the evaluation index, but the results are unsatisfactory.
Summary of the invention
To overcome the above deficiencies of the prior art, the object of the present invention is to provide a method and system for clustering power grid power supply reliability levels that offers a practical treatment when the volume of data to be clustered is large and the factors are many; its effectiveness is verified on an example of the power supply reliability level clustering problem.
The object of the present invention is achieved by the following technical solution:
The present invention provides a power grid power supply reliability level optimization method, the improvement comprising:
selecting the factors that influence power supply reliability to establish an index set;
determining the final clustering indices from the index set by a correlation coefficient significance test;
performing cluster analysis on the final clustering indices to obtain the clustering result of the power supply reliability level classification;
optimizing the basic data that describe the power supply reliability level according to the clustering result.
Further, the index set comprises basic data divided by reliability level, the basic data including indices for regional characteristics, economic level, power load, grid structure and operating condition, equipment condition, and technical management level.
Further, determining the clustering indices from the index set by a correlation coefficient significance test comprises:
preprocessing the basic data in the index set;
applying the correlation coefficient significance test, using principal component analysis, to the indices in the preprocessed index set;
forming the final clustering indices.
Further, applying the correlation coefficient significance test using principal component analysis comprises:
determining the first principal component in the preprocessed index set;
successively determining the second, third, fourth, ..., P-th principal components, where P denotes the number of indices in the index set.
Further, determining the first principal component in the preprocessed index set comprises:
denoting the variance of the first selected linear combination F1 by Var(F1); when Var(F1) is maximal, F1 is the first principal component. The first linear combination is obtained by recombining the several correlated indices in the preprocessed index set into a new set of mutually uncorrelated composite indices through linear combination.
Further, successively determining the second, third, fourth, ..., P-th principal components comprises:
if the first principal component F1 is not sufficient to represent the information of the original P indices, choosing a second linear combination F2 in which the information already contained in F1 does not appear, expressed mathematically as Cov(F1, F2) = 0; F2 is then called the second principal component;
constructing the third, fourth, ..., P-th principal components in the same way.
Further, the final clustering indices are the set m of the first, second, third, ..., P-th principal components.
Further, performing cluster analysis on the final clustering indices to obtain the clustering result of the power supply reliability level classification comprises: performing cluster analysis on the final clustering indices using the cost function of the K-means algorithm, which is expressed as:
$$J\left(c^{(1)},\dots,c^{(m)},\mu_1,\dots,\mu_K\right)=\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}-\mu_{c^{(i)}}\right\|^{2}$$
where $\mu_1,\dots,\mu_K$ are the K cluster centers, m denotes the final clustering indices, $\mu_{c^{(i)}}$ denotes the center of the class to which the i-th clustering index belongs, $x^{(i)}$ is a data point in the clustering indices, and $c^{(1)},\dots,c^{(m)}$ are the classes to which the data points $x^{(1)},\dots,x^{(m)}$ belong.
Further, optimizing the basic data that describe the power supply reliability level according to the clustering result comprises:
rejecting basic data that exceed 3 times the variance of the power supply reliability level basic data, so that the cost function of the K-means algorithm is minimized.
Further, the cost function of the K-means algorithm is minimized according to:
$$\min_{c^{(1)},\dots,c^{(m)},\;\mu_1,\dots,\mu_K} J\left(c^{(1)},\dots,c^{(m)},\mu_1,\dots,\mu_K\right)$$
The present invention also provides a power grid power supply reliability level optimization system, the improvement comprising:
a construction module for selecting the factors that influence power supply reliability to establish an index set;
a clustering index determination module for determining the final clustering indices from the index set by a correlation coefficient significance test;
a cluster analysis module for performing cluster analysis on the final clustering indices to obtain optimized clustering indices;
an optimization module for optimizing the basic data that describe the power grid power supply reliability level according to the optimized clustering indices.
Further, the clustering index determination module comprises:
a preprocessing submodule for preprocessing the basic data in the index set;
a verification submodule for applying the correlation coefficient significance test, using principal component analysis, to the indices in the preprocessed index set;
a forming submodule for forming the final clustering indices.
Further, the verification submodule comprises:
a first principal component determination unit for determining the first principal component in the preprocessed index set;
a second principal component determination unit for successively determining the second, third, fourth, ..., P-th principal components, where P denotes the number of indices in the index set.
Further, the first principal component determination unit comprises:
a first selection subunit, in which the variance of the first selected linear combination F1 is denoted Var(F1); when Var(F1) is maximal, F1 is the first principal component. The first linear combination is obtained by recombining the several correlated indices in the preprocessed index set into a new set of mutually uncorrelated composite indices through linear combination.
Further, the second principal component determination unit comprises:
a second selection subunit for choosing a second linear combination F2 if the first principal component F1 is not sufficient to represent the information of the original P indices, such that the information already contained in F1 does not appear in F2, expressed mathematically as Cov(F1, F2) = 0; F2 is then called the second principal component;
a third principal component determination subunit for constructing the third, fourth, ..., P-th principal components in the same way.
Further, the final clustering indices are the set m of the first, second, third, ..., P-th principal components.
Compared with the closest prior art, the technical solution provided by the invention has the following beneficial effects:
1. The invention selects the factors that influence power supply reliability to establish an index set; determines the final clustering indices from the index set by a correlation coefficient significance test; performs cluster analysis on the final clustering indices to obtain the classification of power supply reliability levels; and optimizes the basic data that describe the power supply reliability level according to the clustering result. Because the index set is built before clustering and a correlation coefficient significance test is applied after correlation processing, the cluster analysis yields optimized clustering indices. When the power supply reliability level is optimized according to these indices, the clustering separates data of low similarity as far as possible, which improves the clustering effect.
2. In some cases the selected indices are all of the same nature, so the traditional K-means algorithm has difficulty distinguishing data of low similarity and merges them into one class; the invention improves the separation of such data. An auxiliary method, principal component analysis, is used to corroborate the selection, which both reduces dimensionality and, to a certain degree, increases confidence.
3. In some cases the algorithm's output is applied directly in practice without rejecting actual abnormal data; even if the algorithm is accurate, abnormal real data then bias the level assigned to a cluster. The invention takes this practical engineering problem into account. For example, if the average outage time of some cities is far higher than that of similar cities and the average over all similar cities is taken without rejection, units whose outage time is well below that average are treated unfairly and the scoring loses accuracy.
4. In the power grid reliability level clustering evaluation, the invention adds each city's equivalent user count as a weight factor in evaluating the reliability level of cities and provinces, so that city units are evaluated scientifically and reasonably according to the strength of their grid structure, their service capability, and their equipment level.
Brief description of the drawings
Fig. 1 is a general flow chart of the power grid power supply reliability level clustering method provided by the invention;
Fig. 2 is a detailed flow chart of the power grid power supply reliability level clustering method provided by the invention;
Fig. 3 is a schematic diagram of the added principal component analysis discrimination provided by the invention;
Fig. 4 is a structural diagram of the inter-regional power grid power supply reliability level clustering system based on the K-means algorithm provided by the invention.
Detailed description of the embodiments
Specific embodiments of the invention are described in further detail below with reference to the accompanying drawings.
The following description and drawings show specific embodiments of the invention sufficiently to enable those skilled in the art to practice them. Other embodiments may include structural, logical, electrical, process, and other changes. The embodiments represent only possible variations. Unless explicitly required, individual components and functions are optional, and the order of operations may vary. Parts and features of some embodiments may be included in, or substituted for, those of other embodiments. The scope of the embodiments of the invention includes the entire scope of the claims and all available equivalents of the claims. Herein, these embodiments may be referred to, individually or collectively, by the term "invention" merely for convenience; if more than one invention is in fact disclosed, this is not intended to automatically limit the scope of the application to any single invention or inventive concept.
Embodiment one
The invention preprocesses the input data and, after correlation processing, applies a correlation coefficient significance test. A further important step is corroboration by an auxiliary method, principal component analysis. Clustering is then performed with the optimized cost function so that data of low similarity are separated as far as possible. If the clustering output is applied directly, without further handling of the data it is applied to (abnormal data, for example), then however good the clustering result, it is difficult to obtain good results in engineering practice. In practical engineering applications, therefore, the actual conditions or data to which the result is applied are also screened for anomalies so that abnormal data do not take part in the class-level calculation.
The invention provides a power grid power supply reliability level clustering method, whose flow charts are shown in Figs. 1 and 2, comprising:
1. selecting the factors that influence power supply reliability to establish an index set;
2. determining the final clustering indices from the index set by a correlation coefficient significance test;
3. performing cluster analysis on the final clustering indices using the cost function of the K-means algorithm to obtain the clustering result of the power supply reliability level classification;
4. optimizing the basic data that describe the power supply reliability level according to the clustering result.
The correlation coefficient and its significance test are an effective way to verify whether the correlation among the selected indices is significant, and they strongly influence whether the clustering result is optimal. This step is therefore indispensable before clustering; without it, the selected clustering factors lack feasibility and credibility. In addition, after the significance test has determined the selected indices, an auxiliary discrimination is needed: the results of principal component analysis and of the correlation coefficient significance test are compared to determine the final clustering indices.
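The significance test on a correlation coefficient can be sketched as follows. This is a minimal illustration only: the text does not name the statistic, so the Pearson coefficient and a two-sided t-test are assumptions, and the function names, the sample data, and the chosen 5% significance level are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def is_significant(x, y, t_critical):
    """Assumed test: t = r * sqrt(n - 2) / sqrt(1 - r^2), compared with a
    caller-supplied critical value for n - 2 degrees of freedom."""
    n = len(x)
    r = pearson_r(x, y)
    if abs(r) >= 1.0:
        # Perfect correlation: the t statistic is unbounded.
        return r, math.inf, True
    t = r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)
    return r, t, abs(t) > t_critical

# Hypothetical data: a candidate index (cable rate, %) against the
# average outage time (hours per household) for 10 cities.
cable_rate = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
outage_hours = [9.5, 9.0, 8.2, 7.1, 6.8, 5.5, 4.9, 4.1, 3.0, 2.2]

# Two-sided 5% critical value of Student's t with 8 degrees of freedom.
T_CRIT_DF8 = 2.306

r, t, keep = is_significant(cable_rate, outage_hours, T_CRIT_DF8)
print(f"r = {r:.3f}, t = {t:.2f}, keep index: {keep}")
```

An index whose correlation with the reliability measure fails the test would be a candidate for rejection; under these assumptions the decision rests entirely on the critical value the caller supplies.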
1. Selecting the factors that influence power supply reliability to establish an index set: the index set comprises basic data divided by reliability level, the basic data including indices for regional characteristics, economic level, power load, grid structure and operating condition, equipment condition, and technical management level.
2. Determining the final clustering indices from the index set by a correlation coefficient significance test includes applying the correlation coefficient significance test, using principal component analysis, to the indices in the preprocessed index set.
Principal component analysis (PCA) is a multivariate statistical method that uses a linear transformation to reduce multiple variables to a smaller number of significant variables; it is also known as principal components analysis. It was first introduced by K. Pearson for non-random variables and later extended by H. Hotelling to random vectors. The amount of information is usually measured by the sum of squared deviations or the variance.
In practical studies, many variables (or factors) related to the problem are often proposed so that it can be analyzed comprehensively, because each variable reflects some information about the subject to a different degree. Principal component analysis recombines a number of correlated indices into a new set of mutually uncorrelated composite indices, formed by linear combination, which replace the original indices.
The classic approach expresses the information carried by F1 (the first selected linear combination, that is, the first composite index) by its variance: the larger Var(F1), the more information F1 contains. F1 should therefore have the largest variance among all linear combinations, and it is called the first principal component. If the first principal component is not sufficient to represent the information of the original P indices, a second linear combination F2 is chosen. To reflect the original information efficiently, information already in F1 need not appear again in F2; in mathematical terms, Cov(F1, F2) = 0. F2 is then called the second principal component, and the third, fourth, ..., P-th principal components can be constructed in the same way.
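The construction of F1 and F2 can be illustrated with a short sketch. It assumes the usual formulation in which the principal components are eigenvectors of the sample covariance matrix, found here by power iteration with deflation; the function names and the index values are hypothetical and are not taken from the patent.

```python
import math

def covariance_matrix(rows):
    """Center the columns; return (centered rows, sample covariance matrix)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    return centered, cov

def leading_eigenpair(mat, iters=500):
    """Power iteration: largest eigenvalue and unit eigenvector of a
    symmetric positive semi-definite matrix."""
    p = len(mat)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(mat[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    lam = sum(v[a] * mat[a][b] * v[b] for a in range(p) for b in range(p))
    return lam, v

def principal_components(rows, k):
    """Return [(variance, loading vector)] for the first k components and
    the component scores (F1, F2, ...) of every row."""
    centered, cov = covariance_matrix(rows)
    p = len(cov)
    components = []
    for _ in range(k):
        lam, v = leading_eigenpair(cov)
        components.append((lam, v))
        # Deflation removes the variance already explained, so the next
        # component carries no information that is already in this one.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(p)]
               for a in range(p)]
    scores = [[sum(c[j] * v[j] for j in range(p)) for _, v in components]
              for c in centered]
    return components, scores

# Hypothetical values of three indices for six cities; the first two
# indices are strongly correlated.
indices = [
    [2.0, 1.9, 0.5], [4.0, 4.2, 0.1], [6.0, 5.8, 0.9],
    [8.0, 8.1, 0.3], [10.0, 9.9, 0.7], [12.0, 12.1, 0.2],
]
components, scores = principal_components(indices, 2)
f1 = [s[0] for s in scores]
f2 = [s[1] for s in scores]
cov_f1_f2 = sum(a * b for a, b in zip(f1, f2)) / (len(f1) - 1)
print("Var(F1) =", components[0][0], "Var(F2) =", components[1][0])
print("Cov(F1, F2) =", cov_f1_f2)
```

In this sketch Var(F1) is the largest eigenvalue of the covariance matrix and Cov(F1, F2) is zero up to rounding, which matches the two conditions stated in the text.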
Forming the final clustering indices: the first, second, third, fourth, ..., P-th principal components form the final clustering index set m.
3. Performing cluster analysis on the final clustering indices using the cost function of the K-means algorithm to obtain the clustering result of the power supply reliability level classification:
K-means is an important algorithm among partition-based clustering methods, but it requires the number of clusters to be fixed in advance; in the power grid in particular, business needs mean that the number of clusters is largely set by hand. When characteristics are very similar, a small deviation in a data point near a boundary may move it from one class to another. The user must choose between optimal clustering and business needs. The invention is an optimized clustering method based on K-means that optimizes the algorithm through the cost function, distinguishes data with similar characteristics, and groups the less similar data into one class. The invention gives a practical treatment for the case where the volume of data to be clustered is large and the factors are many, and its effectiveness is verified on an example of the power supply reliability level clustering problem.
K-means is an unsupervised learning algorithm. Its inputs are a training data set $\{x^{(1)},\dots,x^{(m)}\}$ (where $x^{(i)}\in\mathbb{R}^{n}$) and the number of clusters K (the data are divided into K classes); its outputs are the K cluster centers $\mu_1,\mu_2,\dots,\mu_K$ and the class to which each data point $x^{(i)}$ belongs. The procedure is as follows:
i. Randomly initialize K cluster centers (cluster centroids) $\mu_1,\mu_2,\dots,\mu_K$;
ii. For each data point $x^{(i)}$, find the nearest cluster center and assign the point to that class, that is, $c^{(i)} := \arg\min_{k}\left\|x^{(i)}-\mu_k\right\|^{2}$, where $c^{(i)}$ denotes the class of data point $x^{(i)}$ in the clustering indices;
iii. Update each cluster center $\mu_k$ to the mean of all data points belonging to class k; repeat steps ii and iii until convergence or until the maximum number of iterations is reached.
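Steps i to iii can be sketched in a few lines. This is a plain K-means loop, not the optimized variant the invention claims; the function names, the seed, and the sample points are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Step i: pick k distinct data points as the initial cluster centers.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Step ii: assign every point to its nearest center.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments no longer change: converged
        labels = new_labels
        # Step iii: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, c in zip(points, labels) if c == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels

# Hypothetical two-dimensional index values for six cities.
cities = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8),
          (8.0, 8.2), (7.9, 8.1), (8.3, 7.8)]
centers, labels = kmeans(cities, k=2)
print(labels)
```

Because the initial centers are drawn at random, the numeric labels may swap between runs with different seeds; only the grouping itself is meaningful.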
4. Optimizing the basic data that describe the power supply reliability level according to the clustering result:
Cluster analysis is performed on the final clustering indices using the cost function of the K-means algorithm, which is:
$$J\left(c^{(1)},\dots,c^{(m)},\mu_1,\dots,\mu_K\right)=\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}-\mu_{c^{(i)}}\right\|^{2}$$
where $\mu_1,\dots,\mu_K$ are the K cluster centers, m denotes the final clustering indices, $\mu_{c^{(i)}}$ denotes the center of the class to which the i-th clustering index belongs, $x^{(i)}$ is a data point in the clustering indices, and $c^{(1)},\dots,c^{(m)}$ are the classes to which the data points $x^{(1)},\dots,x^{(m)}$ belong.
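A small sketch of how such a cost function can be evaluated, assuming the common mean-squared-distance form J = (1/m) Σ ||x(i) − μ_c(i)||²; the 1/m normalisation, the function name, and the sample points are assumptions made for illustration.

```python
def kmeans_cost(points, labels, centers):
    """Mean squared distance from each point to the center of its class."""
    m = len(points)
    total = 0.0
    for p, c in zip(points, labels):
        total += sum((x - mu) ** 2 for x, mu in zip(p, centers[c]))
    return total / m

# Hypothetical data: two tight groups and two candidate assignments.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(0.0, 1.0), (10.0, 1.0)]

good = kmeans_cost(points, [0, 0, 1, 1], centers)
bad = kmeans_cost(points, [0, 1, 0, 1], centers)
print(good, bad)
```

The assignment that keeps each point with its nearest center gives the lower cost, which is the quantity the K-means iteration drives down.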
Optimizing the basic data that describe the power supply reliability level according to the clustering result comprises:
rejecting basic data that exceed 3 times the variance of the power supply reliability level basic data, so that the cost function of the K-means algorithm is minimized.
Further, the cost function of the K-means algorithm is minimized according to:
$$\min_{c^{(1)},\dots,c^{(m)},\;\mu_1,\dots,\mu_K} J\left(c^{(1)},\dots,c^{(m)},\mu_1,\dots,\mu_K\right)$$
Embodiment two
This embodiment studies how to evaluate the reliability level of the prefecture-level power supply companies of State Grid. The evaluation covers two aspects: the average outage time per user and the average outage frequency per user. From a business perspective, the main factors influencing power supply reliability include the interconnection rate, transferable-supply rate, line sectionalization rate, ring-network rate, cable rate, distribution automation rate, and equipment level. Whether these factors are related, and how strongly, cannot be determined by business analysis alone and requires auxiliary analysis and discrimination.
This study selects the factors that influence the reliability indices to establish an index set. After the index set is collected, abnormal data are handled and corrected. Based on business analysis, the prefecture-level power supply companies within State Grid are divided into 5 classes by reliability level, and the factors with significant correlation are clustered using the optimized K-means cost function; Figs. 1 and 2 show the algorithm flow.
Indices are selected along several dimensions that influence and reflect regional distribution network reliability: economic characteristics, population characteristics, electricity consumption characteristics, equipment scale, grid structure, and management level. The collected indices are shown in Table 1 below:
Table 1 Collected indices
The collected indices undergo data preprocessing and the correlation coefficient significance test to determine the final clustering index factors, as shown in Table 2 below. In addition, when multivariate problems are studied by statistical analysis, too many variables increase the complexity of the analysis. In many cases the variables are correlated; when two variables are correlated, the information they carry about the subject overlaps to some degree. Principal component analysis takes all the variables originally proposed, removes the redundant ones (closely related variables), and builds as few new variables as possible, such that the new variables are pairwise uncorrelated and retain as much of the original information about the subject as possible.
Therefore, the indices selected by principal component analysis were compared with those selected by the correlation coefficient significance test, and the two methods were found to agree. The dendrogram obtained by principal component analysis is shown in Fig. 3.
Table 2 Rejection and selection of indices
Note: indices with a colored background are selected; the others are rejected.
After data preprocessing and determination of the final clustering index factors, clustering is performed with the optimized K-means cost function. As shown in Table 3 below, the optimized K-means algorithm better groups cities with consistent characteristics into one class.
Table 3 Comparison of clustering results before and after optimization
City | Province | Administrative region covered | K-means | K-means after optimization
Linyi | Shandong | Linyi | 1 | 2
Qingdao | Shandong | Qingdao | 1 | 1
Weihai | Shandong | Weihai | 1 | 1
Heze | Shandong | Heze | 1 | 2
Liaocheng | Shandong | Liaocheng | 1 | 2
Tangshan | Jibei | Tangshan | 1 | 1
Rizhao | Shandong | Rizhao | 1 | 2
Zaozhuang | Shandong | Zaozhuang | 1 | 2
Tai'an | Shandong | Tai'an | 1 | 1
It should further be explained that, although the K-means clustering method used here has been optimized, this is only an optimization of the algorithm. With a view to the application and practicality of the invention, some explanation of applying the algorithm in practice is needed, as follows.
1. Clustering is an intermediate result; the purpose is to apply the clustering result in engineering practice, which in this example means classifying cities after clustering their reliability levels. The clustering is applied to each city's average outage time per user and average outage frequency per user: a city's actual outage time and frequency are compared with the averages of similar cities. The example calculation, however, revealed a problem: within a cluster, the average outage time and frequency of a few city companies are far above those of the other cities, whose values are tightly concentrated. If the values of all similar cities enter the average, the result is strongly biased and the evaluation loses its value.
In this example, most cities in Shanxi Province have an average outage time above 20 hours per household and an average outage frequency above 5 times per household, whereas the other cities in the same class mostly fall in the range of 1.5 to 4 hours per household and 0.2 to 1 times per household, as shown in Table 4 below. Because the average outage time and frequency of some city companies within a cluster are not representative, units whose values exceed 3 times the variance of the similar cities' average outage time and frequency are rejected in practice and do not take part in the mean calculation.
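The rejection step can be sketched as below. The text states the threshold as "3 times the variance"; this sketch assumes the common three-sigma reading, in which a value is rejected when it exceeds the class mean by more than 3 standard deviations. That reading, the function name, and the outage figures are assumptions, not values from the patent.

```python
from statistics import mean, pstdev

def reject_outliers(values, factor=3.0):
    """Split values into (kept, rejected) using mean + factor * std as the
    upper limit. Only the high side is screened, matching 'greater than'."""
    center = mean(values)
    spread = pstdev(values)  # population standard deviation (assumed)
    limit = center + factor * spread
    kept = [v for v in values if v <= limit]
    rejected = [v for v in values if v > limit]
    return kept, rejected

# Hypothetical average outage times (hours per household) for one class:
# nineteen typical cities and one city far above the rest.
outage_hours = [1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 1.8, 2.2, 2.7, 3.1,
                3.6, 3.9, 1.6, 2.4, 2.9, 3.3, 3.8, 2.1, 2.6, 22.0]

kept, rejected = reject_outliers(outage_hours)
print("class mean with all cities:", round(mean(outage_hours), 3))
print("class mean after rejection:", round(mean(kept), 3))
print("rejected:", rejected)
```

With few cities in a class a single extreme value inflates the standard deviation enough that it may not cross the limit, so the rule is only meaningful for reasonably large classes; the `factor` parameter is left adjustable for that reason.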
Table 4: Average values of selected units after applying the 3-times-variance rule
Note 1: Non-empty values in the table are those of cities exceeding 3 times the variance.
Note 2: The table lists only some of the cities.
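The rejection rule described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function name `peer_average` and the sample values are hypothetical, and the translated phrase "3 times variance" is read here as a cutoff of the class mean plus three standard deviations, which is an assumption.

```python
from statistics import mean, pstdev

def peer_average(values, k=3.0):
    """Average the values of one cluster after dropping high outliers.

    Assumption: a value is an outlier when it exceeds the cluster mean
    plus k standard deviations. The source only says "3 times variance".
    """
    cutoff = mean(values) + k * pstdev(values)
    kept = [v for v in values if v <= cutoff]
    dropped = [v for v in values if v > cutoff]
    return mean(kept), dropped

# Hypothetical average outage durations (hours/household) for one cluster.
durations = [1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 1.8, 2.2, 2.8, 3.2, 3.6, 25.0]
average, rejected = peer_average(durations)
print(round(average, 2), rejected)
```

In very small clusters a single extreme value inflates the standard deviation and can escape this cutoff, so the rule is more dependable when each class contains a reasonable number of cities.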
2. This algorithm is applied to the reliability level evaluation of the 27 provincial units and more than 300 prefecture-level companies of State Grid, and the differences between regions and between provincial units are pronounced. Differentiated evaluation is therefore used in the clustering score of city reliability levels, based mainly on the strength of the grid structure, the service capability, the equipment level, and the regional development of each region and provincial unit. City units with a strong grid structure, good service capability, and a high equipment level thus receive higher scores, and the others receive lower scores.
Analysis of the online system's operating data shows that a city's share of the province's equivalent user count reflects the overall level of the unit comparatively well. The city's share of equivalent users is therefore used as a weight when city scores are combined into the province's overall score, so that the evaluation result more faithfully reflects the power supply reliability level of the city and the province. The following table gives the scores of the cities and the provincial grid after the weight (the city's share of the province's equivalent users) is taken into account.
Table 5: Scores of cities and provincial grids after applying the weight
As shown in Table 5 above, when the weight is not considered and the average of the deviation-ratio scores of the province's cities is used directly as the province's reliability level score, the scores for average outage duration per user and average outage frequency per user are 77.32 and 88.97, respectively. After the weight is considered, the province's final scores for average outage duration and frequency per user are 86.67 and 96.61, respectively. The two sets of scores differ mainly because of the influence of each city's (district's) share of equivalent users on the result. For example, the equivalent user share of Ninghe in Tianjin is 2.16%. If this share is ignored, Ninghe's low score pulls down the average and therefore Tianjin's overall reliability level, which does not reflect the true situation. With weighted scoring, a unit with few equivalent users contributes a small score and has a correspondingly small influence on the province's total score (the sum over its cities).
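The effect of the weighting can be illustrated with a short sketch. The city scores and shares below are invented for illustration and are not taken from Table 5; `province_score` is a hypothetical helper that combines city scores using each city's share of equivalent users as the weight.

```python
def province_score(city_scores, user_shares):
    """Weighted province score: each city's score times its share of equivalent users."""
    if len(city_scores) != len(user_shares):
        raise ValueError("one share is required per city score")
    total_share = sum(user_shares)
    # Normalize so the shares sum to 1 even if the inputs are rounded percentages.
    return sum(s * w for s, w in zip(city_scores, user_shares)) / total_share

# Hypothetical data: the third city has a low score but only 2% of the users.
scores = [90.0, 85.0, 40.0]
shares = [0.60, 0.38, 0.02]

unweighted = sum(scores) / len(scores)
weighted = province_score(scores, shares)
print(round(unweighted, 2), round(weighted, 2))
```

In this made-up case the plain average is dragged down by the small city, while the weighted score stays close to the scores of the cities that serve most users, which is the behavior the paragraph above describes.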
Embodiment 3
Based on the same inventive concept, the present invention also provides a power grid power supply reliability level clustering system based on the K-means algorithm. Its structure is shown in Figure 4, and it comprises:
a construction module, configured to select the factors that influence power supply reliability and establish an indicator set;
a clustering indicator determination module, configured to determine the final clustering indicators from the indicator set using a correlation coefficient significance test;
a cluster analysis module, configured to perform cluster analysis on the final clustering indicators to obtain optimized clustering indicators;
an optimization module, configured to optimize the basic data describing the power grid power supply reliability level according to the optimized clustering indicators.
Further, the clustering indicator determination module comprises:
a preprocessing submodule, configured to preprocess the basic data in the indicator set;
a verification submodule, configured to perform the correlation coefficient significance test on the indicators in the preprocessed indicator set using principal component analysis;
a forming submodule, configured to form the final clustering indicators.
Further, the verification submodule comprises:
a first principal component determination unit, configured to determine the first principal component in the preprocessed indicator set;
a second principal component determination unit, configured to successively determine the second principal component, the third principal component, the fourth principal component, ..., and the P-th principal component, where P denotes the number of indicators in the indicator set.
Further, the first principal component determination unit comprises:
a first selection subunit, configured to denote by Var(F1) the variance of the first linear combination F1 selected for the first principal component; when Var(F1) is maximal, the first linear combination F1 is the first principal component. The first linear combination is obtained by recombining the multiple correlated indicators in the preprocessed indicator set into a new group of mutually uncorrelated composite indicators and forming a linear combination.
Further, the second principal component determination unit comprises:
a second selection subunit, configured to select a second linear combination F2 if the first principal component F1 is insufficient to represent the information of the original P indicators. The information already contained in F1 must not appear in F2, expressed mathematically as Cov(F1, F2) = 0; the second linear combination F2 is then called the second principal component;
a third principal component determination subunit, configured to construct the third, fourth, ..., and P-th principal components by analogy.
Further, the final clustering indicators are the set m of the first principal component, the second principal component, the third principal component, ..., and the P-th principal component.
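The two conditions stated above, that Var(F1) is maximal and that Cov(F1, F2) = 0, can be checked numerically with a small sketch. It is not the patent's implementation: the function names and the sample indicator matrix are hypothetical, and power iteration with deflation is used here only as a dependency-free stand-in for a library eigenvalue routine.

```python
import math

def standardize(rows):
    """Center each indicator (column) and scale it to unit sample variance."""
    columns = []
    for col in zip(*rows):
        mu = sum(col) / len(col)
        sd = math.sqrt(sum((v - mu) ** 2 for v in col) / (len(col) - 1))
        columns.append([(v - mu) / sd for v in col])  # assumes sd > 0
    return [list(r) for r in zip(*columns)]

def covariance_matrix(rows):
    """Sample covariance matrix of rows whose columns are already centered."""
    n, p = len(rows), len(rows[0])
    return [[sum(r[a] * r[b] for r in rows) / (n - 1) for b in range(p)]
            for a in range(p)]

def mat_vec(matrix, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def leading_eigenpair(matrix, iterations=500):
    """Power iteration for the largest eigenvalue of a symmetric PSD matrix."""
    vec = [1.0 / (i + 1) for i in range(len(matrix))]  # arbitrary start vector
    norm = math.sqrt(sum(v * v for v in vec))
    vec = [v / norm for v in vec]
    for _ in range(iterations):
        nxt = mat_vec(matrix, vec)
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm < 1e-12:
            break
        vec = [v / norm for v in nxt]
    value = sum(a * b for a, b in zip(vec, mat_vec(matrix, vec)))
    return value, vec

def principal_components(rows, k):
    """Return (scores, variances): the first k component scores and Var(F_j)."""
    z = standardize(rows)
    cov = covariance_matrix(z)
    p = len(cov)
    loadings, variances = [], []
    for _ in range(k):
        value, vec = leading_eigenpair(cov)
        loadings.append(vec)
        variances.append(value)
        # Deflation removes the information already carried by this component,
        # so the next component is uncorrelated with it.
        cov = [[cov[a][b] - value * vec[a] * vec[b] for b in range(p)]
               for a in range(p)]
    scores = [[sum(x * w for x, w in zip(row, load)) for load in loadings]
              for row in z]
    return scores, variances

# Hypothetical indicator set: 6 cities, 3 indicators (the first two are correlated).
indicators = [[2.0, 4.1, 1.0], [3.0, 6.2, 0.5], [4.0, 7.9, 1.5],
              [5.0, 10.1, 0.7], [6.0, 12.0, 1.2], [7.0, 13.8, 0.9]]
scores, variances = principal_components(indicators, 2)
f1 = [s[0] for s in scores]
f2 = [s[1] for s in scores]
cov_f1_f2 = sum(a * b for a, b in zip(f1, f2)) / (len(f1) - 1)
print([round(v, 3) for v in variances], abs(cov_f1_f2) < 1e-8)
```

The component scores returned here would play the role of the final clustering indicators; how many components to keep is a separate decision that this sketch leaves to the caller through `k`.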
The present invention provides a power grid power supply reliability level clustering method based on the K-means algorithm. By optimizing the cost function on the basis of the original method, it improves the accuracy and credibility of clustering. The invention also takes steps both before and after clustering to improve the clustering effect and its practical significance: the input data are preprocessed; a correlation coefficient significance test is performed after the correlation processing; as a further important step, principal component analysis is used as an auxiliary method for verification and discrimination; and clustering is then performed with the optimized cost function, so that the clustering distinguishes data with low inter-class similarity as far as possible. In practical engineering applications, abnormal data are handled and each city's share of the province's equivalent users is added as a weight. Together, these measures raise the credibility of the clustering and the overall usefulness of the reliability level evaluation, giving the result more practical application value. The invention identifies anomalies in the actual conditions or data encountered in engineering practice so that abnormal data do not take part in the calculation for their class.
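The cost function itself is not reproduced in this text. As a hedged sketch, the following assumes the standard K-means distortion, J = (1/m) Σ_i ||x^(i) − μ_{c^(i)}||², which matches the symbols defined in the claims but is not confirmed to be the patent's optimized cost function. The function names, the deterministic initialization, and the sample points are illustrative assumptions.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_cost(points, labels, centers):
    """Mean squared distance from each point to its assigned center."""
    total = sum(squared_distance(p, centers[c]) for p, c in zip(points, labels))
    return total / len(points)

def kmeans(points, k, iterations=100):
    """Lloyd's algorithm; the first k points serve as initial centers (an assumption)."""
    centers = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        new_labels = [min(range(k), key=lambda j: squared_distance(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, c in zip(points, labels) if c == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centers

# Hypothetical two-dimensional clustering indicators for six cities.
points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (0.8, 1.2), (5.2, 4.8), (4.8, 5.2)]
labels, centers = kmeans(points, 2)
cost = kmeans_cost(points, labels, centers)
print(labels, round(cost, 4))
```

Each assignment step and each update step can only lower or keep this cost, which is why the loop stops once the assignments no longer change.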
Those skilled in the art should understand that embodiments of the present application may be provided as a method, a system, or a computer program product. The present application may therefore take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is executed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The above embodiments are intended only to illustrate the technical solution of the present invention and not to limit it. Although the present invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art may still modify or equivalently replace specific embodiments of the invention; any modification or equivalent replacement that does not depart from the spirit and scope of the invention falls within the scope of the pending claims of the present invention.

Claims (16)

1. A power grid power supply reliability level optimization method, characterized by comprising:
selecting the factors that influence power supply reliability and establishing an indicator set;
determining the final clustering indicators from the indicator set using a correlation coefficient significance test;
performing cluster analysis on the final clustering indicators to obtain a clustering result that classifies power supply reliability levels;
optimizing the basic data describing the power supply reliability level according to the clustering result.
2. The power grid power supply reliability level optimization method of claim 1, characterized in that the indicator set is divided by reliability level and includes basic data, the basic data including indicators of regional characteristics, economic level, power load conditions, grid structure and operating conditions, equipment conditions, and technical management level.
3. The power grid power supply reliability level optimization method of claim 2, characterized in that determining the clustering indicators from the indicator set using the correlation coefficient significance test comprises:
preprocessing the basic data in the indicator set;
performing the correlation coefficient significance test on the indicators in the preprocessed indicator set using principal component analysis;
forming the final clustering indicators.
4. The power grid power supply reliability level optimization method of claim 3, characterized in that performing the correlation coefficient significance test using principal component analysis comprises:
determining the first principal component in the preprocessed indicator set;
successively determining the second principal component, the third principal component, the fourth principal component, ..., and the P-th principal component, where P denotes the number of indicators in the indicator set.
5. The power grid power supply reliability level clustering method of claim 4, characterized in that determining the first principal component in the preprocessed indicator set comprises:
denoting by Var(F1) the variance of the first linear combination F1 selected for the first principal component; when Var(F1) is maximal, the first linear combination F1 is the first principal component; the first linear combination is obtained by recombining the multiple correlated indicators in the preprocessed indicator set into a new group of mutually uncorrelated composite indicators and forming a linear combination.
6. The power grid power supply reliability level clustering method of claim 5, characterized in that successively determining the second principal component, the third principal component, the fourth principal component, ..., and the P-th principal component comprises:
if the first principal component F1 is insufficient to represent the information of the original P indicators, selecting a second linear combination F2, where the information already contained in the first principal component F1 does not appear in the second linear combination F2, expressed mathematically as Cov(F1, F2) = 0; the second linear combination F2 is then called the second principal component;
constructing the third, fourth, ..., and P-th principal components by analogy.
7. The power grid power supply reliability level optimization method of claim 6, characterized in that the final clustering indicators are the set m of the first principal component, the second principal component, the third principal component, ..., and the P-th principal component.
8. The power grid power supply reliability level optimization method of claim 7, characterized in that performing cluster analysis on the final clustering indicators to obtain a clustering result that classifies power supply reliability levels comprises: performing cluster analysis on the final clustering indicators using the cost function of the K-means algorithm, the cost function of the K-means algorithm being expressed as:
where $\mu_1,\ldots,\mu_K$ are the 1st, ..., K-th cluster centers; $m$ denotes the final clustering indicators; $\mu_{c^{(i)}}$ denotes the center of the class to which the i-th clustering indicator belongs; $x^{(i)}$ is a data point in the clustering indicators; and $c^{(1)},\ldots,c^{(m)}$ are the classes to which the data points $x^{(1)},\ldots,x^{(m)}$ in the clustering indicators belong.
9. The power grid power supply reliability level optimization method of claim 8, characterized in that optimizing the basic data describing the power supply reliability level according to the clustering result comprises:
rejecting basic data that exceed 3 times the variance of the basic data of the power supply reliability level, so that the cost function of the K-means algorithm is minimized.
10. The power grid power supply reliability level optimization method of claim 9, characterized in that the cost function of the K-means algorithm is minimized according to the following formula:
11. A power grid power supply reliability level optimization system, characterized by comprising:
a construction module, configured to select the factors that influence power supply reliability and establish an indicator set;
a clustering indicator determination module, configured to determine the final clustering indicators from the indicator set using a correlation coefficient significance test;
a cluster analysis module, configured to perform cluster analysis on the final clustering indicators to obtain optimized clustering indicators;
an optimization module, configured to optimize the basic data describing the power grid power supply reliability level according to the optimized clustering indicators.
12. The power grid power supply reliability level optimization system of claim 11, characterized in that the clustering indicator determination module further comprises:
a preprocessing submodule, configured to preprocess the basic data in the indicator set;
a verification submodule, configured to perform the correlation coefficient significance test on the indicators in the preprocessed indicator set using principal component analysis;
a forming submodule, configured to form the final clustering indicators.
13. The power grid power supply reliability level optimization system of claim 11, characterized in that the verification submodule comprises:
a first principal component determination unit, configured to determine the first principal component in the preprocessed indicator set;
a second principal component determination unit, configured to successively determine the second principal component, the third principal component, the fourth principal component, ..., and the P-th principal component, where P denotes the number of indicators in the indicator set.
14. The power grid power supply reliability level optimization system of claim 13, characterized in that the first principal component determination unit comprises:
a first selection subunit, configured to denote by Var(F1) the variance of the first linear combination F1 selected for the first principal component; when Var(F1) is maximal, the first linear combination F1 is the first principal component; the first linear combination is obtained by recombining the multiple correlated indicators in the preprocessed indicator set into a new group of mutually uncorrelated composite indicators and forming a linear combination.
15. The power grid power supply reliability level optimization system of claim 13, characterized in that the second principal component determination unit comprises:
a second selection subunit, configured to select a second linear combination F2 if the first principal component F1 is insufficient to represent the information of the original P indicators, where the information already contained in the first principal component F1 does not appear in the second linear combination F2, expressed mathematically as Cov(F1, F2) = 0; the second linear combination F2 is then called the second principal component;
a third principal component determination subunit, configured to construct the third, fourth, ..., and P-th principal components by analogy.
16. The power grid power supply reliability level optimization system of claim 15, characterized in that the final clustering indicators are the set m of the first principal component, the second principal component, the third principal component, ..., and the P-th principal component.
CN201711228891.6A 2017-11-29 2017-11-29 A kind of power grid power supply reliability horizontal clustering method and system Pending CN109840536A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711228891.6A CN109840536A (en) 2017-11-29 2017-11-29 A kind of power grid power supply reliability horizontal clustering method and system

Publications (1)

Publication Number Publication Date
CN109840536A (en) 2019-06-04

Family

ID=66882339

Country Status (1)

Country Link
CN (1) CN109840536A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination