CN113222366A - Power utilization reliability evaluation method of self-adaptive k-means clustering algorithm - Google Patents

Info

Publication number
CN113222366A
Authority
CN
China
Prior art keywords
clustering
user
center number
data
power utilization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110460917.XA
Other languages
Chinese (zh)
Inventor
曾健
秦丽文
桂海涛
吴茵
李任明
吴凡
阳国燕
程向辉
韦营
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guilin Power Supply Bureau of Guangxi Power Grid Co Ltd
Original Assignee
Guilin Power Supply Bureau of Guangxi Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guilin Power Supply Bureau of Guangxi Power Grid Co Ltd
Priority to CN202110460917.XA
Publication of CN113222366A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Electricity, gas or water supply

Abstract

The invention provides a power utilization reliability evaluation method of a self-adaptive k-means clustering algorithm, comprising the following steps: acquiring user data; determining a maximum cluster center number k_max and a minimum cluster center number k_min; setting the cluster center number k = k_min and clustering the user data; determining the optimal cluster center number K_0 within the range from k_min to k_max; and selecting the clustering result under K_0 to obtain the users' electricity consumption characteristics. The invention first determines k_max and k_min, then calculates the data clustering effect evaluation index I_DBI for each value of the cluster center number k, and determines the optimal cluster center number K_0 from those I_DBI values. This handles data covering a relatively large range, determines K_0 simply and quickly, and overcomes the limitation that, on large-range data, the cluster center number of the traditional k-means clustering algorithm cannot be specified from experience.

Description

Power utilization reliability evaluation method of self-adaptive k-means clustering algorithm
Technical Field
The invention relates to the field of data processing, in particular to a power utilization reliability evaluation method of a self-adaptive k-means clustering algorithm.
Background
The operational reliability of the power system bears on the national economy, people's livelihoods, and national security. Evaluating it accurately makes it possible to give targeted guidance for system maintenance. Existing reliability research mostly focuses on the power system level or the individual user level. At the user level, the most common approach applies the k-means method to reduce the dimension of, and cluster, users' electricity consumption data, so as to analyze reliability by user type.
However, this approach has limitations. It requires the cluster center number k to be specified manually; as the scope of application grows, k can no longer be chosen from experience, and the quality of the result suffers greatly. In addition, user-level reliability evaluation currently stops at user load clustering. Its results are mostly applied in marketing systems to guide customer service; they hardly reflect the operating state of the system and cannot guide power grid planning or reliability evaluation, so considerable room for improvement remains.
Disclosure of Invention
A power utilization reliability evaluation method of a self-adaptive k-means clustering algorithm comprises the following steps:
step S1, acquiring user data;
step S2, determining the maximum cluster center number k_max and the minimum cluster center number k_min;
step S3, setting the cluster center number k = k_min and clustering the user data;
step S4, determining the optimal cluster center number K_0 within the range from k_min to k_max;
step S5, selecting the clustering result under the optimal cluster center number K_0 to obtain the users' electricity consumption characteristics.
Further, the user data specifically includes: the user electricity consumption curve, user account information, and power grid fault work order information.
Further, the step S4 specifically includes:
step S401, judging whether the cluster center number k is smaller than the maximum cluster center number k_max;
step S402, if k is smaller than k_max, calculating the data clustering effect evaluation index I_DBI, setting k = k + 1, and returning to step S3;
step S403, if k is greater than or equal to k_max, selecting the cluster center number corresponding to the minimum I_DBI calculated in step S402 as the optimal cluster center number K_0.
Further, the data clustering effect evaluation index I_DBI is an evaluation index of the data clustering, and the optimal cluster center number K_0 corresponds to the minimum data clustering effect evaluation index I_DBI.
Further, the calculation formula of the data clustering effect evaluation index I_DBI is:
Figure BDA0003042388150000021
wherein:
d_j represents the average distance from the data objects in an arbitrarily selected class j to the corresponding class center;
d_h represents the average distance from the data objects in an arbitrarily selected class h to the corresponding class center;
d_{j,h} represents the Euclidean distance between the class centers of the arbitrarily selected class j and class h.
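The formula image itself is not reproduced in this text. The definitions of d_j, d_h, and d_{j,h} match those of the Davies-Bouldin index, so, assuming the missing formula is the standard form of that index over k clusters, it would read as follows. This is a reconstruction under that assumption, not a copy of the original figure.

```latex
I_{DBI} = \frac{1}{k} \sum_{j=1}^{k} \max_{h \neq j} \frac{d_j + d_h}{d_{j,h}}
```

Under this form, a smaller I_DBI means clusters that are compact relative to the distance between their centers, which is consistent with selecting K_0 at the minimum of the index.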
Further, reliability analysis is performed by combining the users' electricity consumption characteristics with the fault work order information, so as to obtain power utilization reliability indexes.
Further, the power utilization reliability indexes include the average outage frequency, the average outage duration, the expected number of households affected by outages, the average energy not supplied during an outage, and the probability distribution of outage causes.
Further, the user electricity consumption curve is obtained by averaging the electricity consumption of users of the same type. The curve consists of the user's weekly electricity consumption data over one year, and the weekly data are obtained by taking one meter reading per user every 7 days and applying cleaning and differencing.
Further, the weekly electricity consumption curve has a length of 52 points, so the weekly electricity consumption data of each user over one year form a 52-dimensional vector.
Further, the vectors formed from each user's weekly electricity consumption data over one year together make up the full set of user electricity consumption vectors, and the algorithm clusters this full set.
Drawings
FIG. 1 is a schematic flow chart of an adaptive k-means clustering algorithm in the present invention;
FIG. 2 shows the curve of the data clustering effect evaluation index I_DBI as a function of the cluster center number k in the present invention;
FIG. 3 shows the clustering result when the cluster center number k is 14 in the present invention.
Detailed Description
The embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.
The embodiments of the present disclosure are described below with specific examples, and other advantages and effects of the present disclosure will be readily apparent to those skilled in the art from the disclosure in the specification. It is to be understood that the described embodiments are merely illustrative of some, and not restrictive, of the embodiments of the disclosure. The disclosure may be embodied or carried out in various other specific embodiments, and various modifications and changes may be made in the details within the description without departing from the spirit of the disclosure. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
Example one
A power utilization reliability evaluation method of a self-adaptive k-means clustering algorithm, as shown in FIG. 1, comprises the following steps:
Step S1, acquiring user data;
Meter reading data of users in a city in southern China are used for illustration. The data span 2019-09-30 to 2020-10-01 and comprise 54148362 records covering 892421 users. Cleaning yields one electricity consumption curve of 52 points per user. Screening the cleaned data for abnormal curves leaves 889905 valid user electricity consumption curves.
Step S2, determining the maximum cluster center number k_max and the minimum cluster center number k_min;
As shown in FIG. 2, the maximum cluster center number is set to k_max = 19 and the minimum cluster center number to k_min = 2.
Step S3, setting the cluster center number k = k_min and starting to cluster the user data;
Step S4, determining the optimal cluster center number K_0 within the range from k_min to k_max;
The data clustering effect evaluation index I_DBI is calculated. While the cluster center number k is less than or equal to k_max, the method returns to step S3; otherwise, it finds the cluster center number k corresponding to the minimum I_DBI. In this example, the curve of I_DBI as a function of the cluster center number k is shown in FIG. 2.
Step S5, selecting the clustering result under the optimal cluster center number K_0 to obtain the users' electricity consumption characteristics.
As can be seen from FIG. 2, the minimum I_DBI occurs at k = 14. The optimal cluster center number K_0 = 14 is therefore selected and the clustering is run again. As shown in FIG. 3, this yields 14 cluster centers representing 14 typical annual electricity consumption curve types.
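The text does not spell out the clustering routine used in step S3. A minimal sketch of plain Lloyd's k-means in pure Python is given below, assuming initial centers drawn at random from the data and a fixed iteration cap; both are illustrative choices, and the names `kmeans` and `squared_distance` are hypothetical. The toy points are two-dimensional, but the same code applies to 52-dimensional weekly vectors.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's k-means. Returns (labels, centers).

    Assumption: initial centers are k distinct data points picked at random.
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Toy data: two well-separated groups.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centers = kmeans(points, k=2)
print(labels)
```

In the adaptive method this routine would be called once for every candidate k between k_min and k_max.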
In a preferred embodiment of the present application, the adaptive k-means clustering algorithm first determines the maximum cluster center number k_max and the minimum cluster center number k_min and then determines the optimal cluster center number K_0 within that range. It can therefore process data covering a large range, and it overcomes the limitation that, on large-range data, the cluster center number of the traditional k-means clustering algorithm cannot be specified from experience.
Further, in a preferred embodiment of the present application, the user data specifically includes: the user electricity consumption curve, user account information, and power grid fault work order information.
In the present application, the characteristics of each typical electricity consumption type are obtained by compiling statistics on the typical electricity consumption curves, the number of users of each typical type, and each type's share of electricity consumption. These statistics give a comprehensive picture of the electricity consumption characteristics of all users.
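The per-type statistics mentioned above can be sketched as follows. This is an illustrative example only: the function name `cluster_shares`, its inputs, and the sample figures are hypothetical, and it assumes each user already has a cluster label and a total annual consumption value.

```python
from collections import Counter


def cluster_shares(labels, annual_consumption):
    """Per-cluster user count, share of users, and share of total consumption.

    labels[i] is the cluster of user i; annual_consumption[i] is that
    user's total consumption over the year (units are up to the caller).
    """
    user_counts = Counter(labels)
    energy = Counter()
    for lab, amount in zip(labels, annual_consumption):
        energy[lab] += amount
    n_users = len(labels)
    total_energy = sum(annual_consumption)
    return {
        c: {
            "users": user_counts[c],
            "user_share": user_counts[c] / n_users,
            "energy_share": energy[c] / total_energy,
        }
        for c in sorted(user_counts)
    }


# Made-up example: four users in two clusters.
stats = cluster_shares([0, 0, 1, 1], [100, 300, 200, 400])
for cluster, row in stats.items():
    print(cluster, row)
```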
Further, in a preferred embodiment of the present application, the step S4 specifically includes:
step S401, judging whether the cluster center number k is smaller than the maximum cluster center number k_max;
step S402, if k is smaller than k_max, calculating the data clustering effect evaluation index I_DBI, setting k = k + 1, and returning to step S3;
step S403, if k is greater than or equal to k_max, selecting the cluster center number corresponding to the minimum I_DBI calculated in step S402 as the optimal cluster center number K_0.
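The search over k in steps S3 and S401 to S403 can be sketched as a simple loop. The sketch below is illustrative: `select_best_k` is a hypothetical name, the clustering routine and the index are passed in as callables, and the loop evaluates every k from k_min through k_max inclusive, which is one reading of the step descriptions. The demo uses stand-in callables with made-up index values rather than real clustering.

```python
def select_best_k(data, k_min, k_max, cluster, score):
    """Return (best_k, scores), where scores maps each tried k to its index.

    cluster(data, k) -> (labels, centers)
    score(data, labels, centers) -> float, lower is better (e.g. I_DBI)
    """
    scores = {}
    k = k_min                                   # step S3 starts at k = k_min
    while k <= k_max:                           # step S401 (inclusive bound assumed)
        labels, centers = cluster(data, k)      # step S3: cluster with current k
        scores[k] = score(data, labels, centers)  # step S402: index for this k
        k += 1                                  # step S402: k = k + 1
    best_k = min(scores, key=scores.get)        # step S403: k with minimum index
    return best_k, scores


# Stand-ins for demonstration only; the index values are made up.
made_up_index = {2: 1.9, 3: 1.2, 4: 0.8, 5: 1.1}


def fake_cluster(data, k):
    return [k], []          # carries k through in place of real labels


def fake_score(data, labels, centers):
    return made_up_index[labels[0]]


best_k, scores = select_best_k(None, 2, 5, fake_cluster, fake_score)
print(best_k, scores)
```

Passing the clustering and scoring functions as arguments keeps the loop independent of any particular k-means implementation or validity index.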
Further, in a preferred embodiment of the present application, the optimal cluster center number K_0 corresponds to the minimum data clustering effect evaluation index I_DBI.
Further, the calculation formula of the data clustering effect evaluation index I_DBI is:
Figure BDA0003042388150000061
wherein:
d_j represents the average distance from the data objects in class j to the corresponding class center;
d_h represents the average distance from the data objects in class h to the corresponding class center;
d_{j,h} represents the Euclidean distance between the class centers of class j and class h.
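A minimal sketch of computing the index from these three quantities follows. It assumes the missing formula is the standard Davies-Bouldin form, that is, the average over classes j of the largest ratio (d_j + d_h) / d_{j,h} over all other classes h. The function name `davies_bouldin` is hypothetical, and the sketch requires at least two classes, none of them empty.

```python
import math


def davies_bouldin(points, labels, centers):
    """Davies-Bouldin style index from points, their labels, and class centers."""
    k = len(centers)
    # d_j: average distance from the members of class j to its center.
    scatter = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        scatter.append(sum(math.dist(p, centers[j]) for p in members) / len(members))
    # For each class j, take the worst (largest) ratio against any other class h,
    # where d_{j,h} is the Euclidean distance between the two centers.
    total = 0.0
    for j in range(k):
        total += max(
            (scatter[j] + scatter[h]) / math.dist(centers[j], centers[h])
            for h in range(k)
            if h != j
        )
    return total / k


# Worked example on one-dimensional points:
# class 0 = {0, 2} with center 1, so d_0 = 1
# class 1 = {10, 14} with center 12, so d_1 = 2
# d_{0,1} = 11, giving (1 + 2) / 11 for both classes.
points = [[0.0], [2.0], [10.0], [14.0]]
labels = [0, 0, 1, 1]
centers = [[1.0], [12.0]]
print(davies_bouldin(points, labels, centers))
```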
Further, in a preferred embodiment of the present application, the user power utilization characteristics are combined with the fault work order information to perform reliability analysis, so as to obtain a power utilization reliability index.
As shown in FIG. 3, the abscissa is the week number, ranging from 1 to 52; week 1 covers 2019-10-01 to 2019-10-07, and so on. The ordinate is the electricity consumption in kWh. In addition, clustering gives the cluster center to which each user belongs. That cluster center can serve as the annual electricity consumption curve of the corresponding typical user type and be applied in the subsequent power utilization reliability evaluation.
Further, in a preferred embodiment of the present application, the power utilization reliability indexes include the average outage frequency, the average outage duration, the expected number of households affected by outages, the average energy not supplied during an outage, and the probability distribution of outage causes; wherein:
Figure BDA0003042388150000071
where λ_i is the number of outages of user i within one year, N_R is the total number of users of the type, and R is the set of users belonging to the same type;
Figure BDA0003042388150000072
where t_i is the duration of the i-th fault, and R is the set of fault events of users belonging to the same type;
the expected number of households affected by outages equals the average outage frequency multiplied by the average outage duration multiplied by the number of users;
the average energy not supplied during an outage equals the average outage duration multiplied by the average power of the users;
and the probability distribution of outage causes is obtained by screening and counting the power grid fault repair work orders.
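The indexes above can be sketched in a few lines. The two formula images are not reproduced here, so this example assumes that the average outage frequency is the sum of λ_i over the users of a type divided by N_R, and that the average outage duration is the arithmetic mean of t_i over the fault events of that type. The function name, the argument names, the units (hours, kW), and the sample values are all hypothetical.

```python
from collections import Counter


def reliability_indices(outages_per_user, fault_durations_h, mean_power_kw, causes):
    """Reliability indexes for one user type (one cluster).

    outages_per_user: lambda_i, outages per year for each user of the type
    fault_durations_h: t_i, duration of each fault event (assumed in hours)
    mean_power_kw: average power of the users of the type (assumed in kW)
    causes: one cause label per fault work order
    """
    n_users = len(outages_per_user)
    avg_frequency = sum(outages_per_user) / n_users          # assumed form
    avg_duration = sum(fault_durations_h) / len(fault_durations_h)  # assumed form
    cause_counts = Counter(causes)
    return {
        "avg_outage_frequency": avg_frequency,
        "avg_outage_duration": avg_duration,
        # As stated in the text: frequency x duration x number of users.
        "expected_households": avg_frequency * avg_duration * n_users,
        # As stated in the text: duration x average power.
        "avg_energy_not_supplied": avg_duration * mean_power_kw,
        "cause_distribution": {c: n / len(causes) for c, n in cause_counts.items()},
    }


# Made-up inputs for a single user type.
result = reliability_indices(
    outages_per_user=[0, 1, 2, 1],
    fault_durations_h=[2.0, 4.0, 1.5, 0.5],
    mean_power_kw=1.5,
    causes=["weather", "equipment", "weather", "external"],
)
print(result)
```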
Further, in a preferred embodiment of the present application, the user electricity consumption curve is obtained by averaging the electricity consumption of users of the same type. The curve consists of the user's weekly electricity consumption data over one year, and the weekly data are obtained by taking one meter reading per user every 7 days and applying cleaning and differencing.
Further, in a preferred embodiment of the present application, the weekly electricity consumption curve has a length of 52 points, so the weekly electricity consumption data of each user over one year form a 52-dimensional vector.
Further, in a preferred embodiment of the present application, the vectors formed from each user's weekly electricity consumption data over one year together make up the full set of user electricity consumption vectors, and the algorithm clusters this full set. The size of the full set in this embodiment is calculated as follows:
the full set has m × 52 values (m vectors of 52 dimensions each), where m is the total number of users.
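The differencing step can be illustrated with a short sketch. It assumes the meter readings are cumulative and that 53 readings taken 7 days apart are available per user, so that differencing yields 52 weekly values. The cleaning rule shown (drop users with an incomplete curve or a negative weekly value) is only a stand-in, since the text does not specify the cleaning procedure; all names and sample readings are hypothetical.

```python
def weekly_consumption(cumulative_readings):
    """Difference consecutive cumulative readings taken 7 days apart."""
    return [
        later - earlier
        for earlier, later in zip(cumulative_readings, cumulative_readings[1:])
    ]


def build_user_vectors(readings_by_user, weeks=52):
    """Map each user id to a 52-dimensional weekly consumption vector."""
    vectors = {}
    for user_id, readings in readings_by_user.items():
        weekly = weekly_consumption(readings)
        # Stand-in cleaning rule: keep only complete, non-negative curves.
        if len(weekly) == weeks and all(value >= 0 for value in weekly):
            vectors[user_id] = weekly
    return vectors


# Made-up readings: 53 cumulative values per user.
readings_by_user = {
    "user_a": [100.0 + 10.0 * i for i in range(53)],
    "user_b": [50.0 + 2.5 * i for i in range(53)],
    "user_c": [10.0, 5.0] + [5.0] * 51,  # reading decreases: dropped by cleaning
}
vectors = build_user_vectors(readings_by_user)
print(len(vectors), "users x", len(vectors["user_a"]), "weeks")
```

Stacking the retained vectors row by row gives the m × 52 array that the clustering step takes as input.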
In the description of the present invention, it is to be understood that the terms "intermediate", "length", "upper", "lower", "front", "rear", "vertical", "horizontal", "inner", "outer", "radial", "circumferential", and the like, indicate orientations and positional relationships that are based on the orientations and positional relationships shown in the drawings, are used for convenience in describing the present invention and for simplicity in description, and do not indicate or imply that the referenced devices or elements must have a particular orientation, be constructed and operated in a particular orientation, and therefore, are not to be construed as limiting the present invention.
In the present invention, unless otherwise expressly stated or limited, the first feature may be "on" the second feature in direct contact with the second feature, or the first and second features may be in indirect contact via an intermediate. "plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
In the present invention, unless otherwise expressly stated or limited, the terms "mounted," "connected," "secured," and the like are to be construed broadly and can, for example, be fixedly connected, detachably connected, or integrally formed; may be mechanically coupled, may be electrically coupled or may be in communication with each other; they may be directly connected or indirectly connected through intervening media, or they may be connected internally or in any other suitable relationship, unless expressly stated otherwise. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.
The above description is for the purpose of illustrating embodiments of the invention and is not intended to limit the invention, and it will be apparent to those skilled in the art that any modification, equivalent replacement, or improvement made without departing from the spirit and principle of the invention shall fall within the protection scope of the invention.

Claims (10)

1. A power utilization reliability evaluation method of a self-adaptive k-means clustering algorithm, characterized by comprising the following steps:
step S1, acquiring user data;
step S2, determining the maximum cluster center number k_max and the minimum cluster center number k_min;
step S3, setting the cluster center number k = k_min and clustering the user data;
step S4, determining the optimal cluster center number K_0 within the range from k_min to k_max;
step S5, selecting the clustering result under the optimal cluster center number K_0 to obtain the users' electricity consumption characteristics.
2. The power utilization reliability evaluation method of the self-adaptive k-means clustering algorithm according to claim 1, wherein the user data specifically comprises: the user electricity consumption curve, user account information, and power grid fault work order information.
3. The power utilization reliability evaluation method of the self-adaptive k-means clustering algorithm according to claim 1, wherein the step S4 specifically comprises:
step S401, judging whether the cluster center number k is smaller than the maximum cluster center number k_max;
step S402, if k is smaller than k_max, calculating the data clustering effect evaluation index I_DBI, setting k = k + 1, and returning to step S3;
step S403, if k is greater than or equal to k_max, selecting the cluster center number corresponding to the minimum I_DBI calculated in step S402 as the optimal cluster center number K_0.
4. The power utilization reliability evaluation method of the self-adaptive k-means clustering algorithm according to claim 2, wherein the optimal cluster center number K_0 corresponds to the minimum data clustering effect evaluation index I_DBI.
5. The power utilization reliability evaluation method of the self-adaptive k-means clustering algorithm according to claim 3, wherein the calculation formula of the data clustering effect evaluation index I_DBI is:
Figure FDA0003042388140000021
wherein:
d_j represents the average distance from the data objects in an arbitrarily selected class j to the corresponding class center;
d_h represents the average distance from the data objects in an arbitrarily selected class h to the corresponding class center;
d_{j,h} represents the Euclidean distance between the class centers of the arbitrarily selected class j and class h.
6. The power utilization reliability evaluation method of the self-adaptive k-means clustering algorithm according to claim 1, wherein reliability analysis is performed by combining the users' electricity consumption characteristics with the fault work order information to obtain power utilization reliability indexes.
7. The power utilization reliability evaluation method of the self-adaptive k-means clustering algorithm according to claim 6, wherein the power utilization reliability indexes comprise the average outage frequency, the average outage duration, the expected number of households affected by outages, the average energy not supplied during an outage, and the probability distribution of outage causes.
8. The power utilization reliability evaluation method of the self-adaptive k-means clustering algorithm according to claim 2, wherein the user electricity consumption curve is obtained by averaging the electricity consumption of users of the same type, the curve consists of the user's weekly electricity consumption data over one year, and the weekly data are obtained by taking one meter reading per user every 7 days and applying cleaning and differencing.
9. The power utilization reliability evaluation method of the self-adaptive k-means clustering algorithm according to claim 8, wherein the user electricity consumption curve has a length of 52 points, and the weekly electricity consumption data of each user over one year form a 52-dimensional vector.
10. The power utilization reliability evaluation method of the self-adaptive k-means clustering algorithm according to claim 9, wherein the vectors formed from each user's weekly electricity consumption data over one year together make up the full set of user electricity consumption vectors, and the algorithm clusters this full set.
CN202110460917.XA 2021-04-27 2021-04-27 Power utilization reliability evaluation method of self-adaptive k-means clustering algorithm Pending CN113222366A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110460917.XA CN113222366A (en) 2021-04-27 2021-04-27 Power utilization reliability evaluation method of self-adaptive k-means clustering algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110460917.XA CN113222366A (en) 2021-04-27 2021-04-27 Power utilization reliability evaluation method of self-adaptive k-means clustering algorithm

Publications (1)

Publication Number Publication Date
CN113222366A true CN113222366A (en) 2021-08-06

Family

ID=77089242

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110460917.XA Pending CN113222366A (en) 2021-04-27 2021-04-27 Power utilization reliability evaluation method of self-adaptive k-means clustering algorithm

Country Status (1)

Country Link
CN (1) CN113222366A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107133652A (en) * 2017-05-17 2017-09-05 国网山东省电力公司烟台供电公司 Electricity customers Valuation Method and system based on K means clustering algorithms

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
高新: ""一种改进K-means聚类算法与新的聚类有效性指标研究"" *

Similar Documents

Publication Publication Date Title
CN108280479B (en) Power grid user classification method based on load characteristic index weighted clustering algorithm
Wong et al. A simple way to use interval data to segment residential customers for energy efficiency and demand response program targeting
CN110110881B (en) Power customer demand prediction analysis method and system
CN113267692B (en) Low-voltage transformer area line loss intelligent diagnosis and analysis method and system
Zhou et al. Residential demand response targeting using machine learning with observational data
CN107330540B (en) A kind of scarce power supply volume prediction technique in the distribution net platform region considering quality of voltage
CN110264107B (en) Large data technology-based abnormal diagnosis method for line loss rate of transformer area
CN107133652A (en) Electricity customers Valuation Method and system based on K means clustering algorithms
CN110322371A (en) The area Gao Suntai multiplexing electric abnormality user based on multiple linear regression analysis detects localization method
CN106096805A (en) A kind of residential electricity consumption load classification method based on entropy assessment feature selection
WO2020252785A1 (en) Abnormal electricity use recognition method and device, and computer readable storage medium
CN110490454B (en) Distribution network asset operation efficiency calculation method based on distribution network equipment operation efficiency evaluation
CN110874381B (en) Spatial density clustering-based user side load data abnormal value identification method
Jain et al. Energy efficiency in South Asia: Trends and determinants
CN107591803A (en) A kind of electric load behavior prediction method based on demand response
Wang et al. Multi-objective residential load dispatch based on comprehensive demand response potential and multi-dimensional user comfort
CN107834563B (en) Method and system for processing voltage sag
CN113222366A (en) Power utilization reliability evaluation method of self-adaptive k-means clustering algorithm
CN113112136A (en) Comprehensive evaluation method and system for reliability of power distribution network
CN112785060A (en) Lean operation and maintenance level optimization method for power distribution network
CN111144628A (en) Distributed energy supply type cooling, heating and power load prediction model system and method
CN115081893A (en) User electricity consumption data analysis method and device, electronic equipment and readable storage medium
CN109670550A (en) A kind of distribution terminal maintenance decision method and apparatus
CN115760400A (en) Mining behavior detection method based on electric power data and storage medium
Mehar et al. Analytical model for residential predicting energy consumption

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination