CN111046913A - Load abnormal value identification method - Google Patents
- Publication number: CN111046913A
- Application number: CN201911125386.8A
- Authority: CN (China)
- Prior art keywords: load, abnormal, data, value, power utilization
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
Abstract
The invention relates to the technical field of power load data mining, and in particular to a load abnormal value identification method comprising the following steps: classifying load curves by power utilization mode; classifying the load level of the load curves belonging to the normal power utilization mode; constructing an abnormal load data domain and identifying the abnormal values of the load curves belonging to the normal power utilization mode; and, using the maximum upper limit and the minimum lower limit of that abnormal load data domain, constructing an abnormal load data domain for the abnormal power utilization mode and identifying the load abnormal values of the load curves belonging to the abnormal power utilization mode. Being based on the spatial density clustering method and the K-center clustering method, the method classifies the load data reasonably and conveniently handles different power utilization modes and different load levels within the same power utilization mode. Constructing the abnormal load data domain from the central limit theorem and the interquartile range allows the range of the domain to be adjusted flexibly according to the required confidence level.
Description
Technical Field
The invention relates to the technical field of power load data mining, in particular to a load abnormal value identification method.
Background
The installation of smart meters has changed the load record of a power consumer from an energy quantity to a time-series load curve; compared with the energy quantity, the time-series load curve contains the user's electricity consumption behavior and load level at each moment. Because the load curve is an important basis on which energy suppliers provide energy management services and users carry out demand-side response, abnormal load values in the curve often have a significant impact on those decisions. Because the collected load curves contain massive amounts of data, abnormal load values are difficult to identify accurately by hand, and an abnormal load value identification method suited to large-scale data needs to be developed.
Because the distribution of the power load is unknown, the box plot is widely used for abnormal value identification. It analyzes the maximum, the minimum, the upper quartile, the median and the lower quartile of the data and constructs an abnormal data domain based on the quartiles, thereby identifying abnormal load values. The method is simple to use and does not become more complicated as the data scale grows. Its main disadvantage is that the constructed abnormal data domain does not take into account the local density characteristics of the load values at each time. To address this, the distribution followed by the load data is usually assumed in advance, and abnormal load data domains under different confidence levels are then constructed from the corresponding cumulative distribution function; alternatively, a distribution test is used to verify the distribution of the load data, and once the hypothesis is accepted, the abnormal load data domains under different confidence levels are constructed from the cumulative distribution function.
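The box-plot rule described in this paragraph can be illustrated with a minimal sketch. The function names `boxplot_fences` and `boxplot_outliers` and the multiplier `rho` are illustrative assumptions, not names from the patent; the fences Q1 - rho*IQR and Q3 + rho*IQR are the conventional box-plot cut-off points.

```python
import statistics


def boxplot_fences(values, rho=1.5):
    """Return the (lower, upper) box-plot fences Q1 - rho*IQR and Q3 + rho*IQR."""
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - rho * iqr, q3 + rho * iqr


def boxplot_outliers(values, rho=1.5):
    """Return the values that fall outside the box-plot fences."""
    lower, upper = boxplot_fences(values, rho)
    return [v for v in values if v < lower or v > upper]


# Hypothetical load readings for one time of day; 5.0 is an injected spike.
readings = [1.0, 1.1, 0.9, 1.2, 1.0, 5.0]
print(boxplot_outliers(readings))
```

Because the fences are computed from all readings at once, the rule ignores how densely the values cluster locally, which is the limitation the paragraph points out.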
The above solutions have the following problems. When the load data distribution is assumed directly, the distribution is difficult to specify accurately, because users' electricity consumption behavior is highly random and changes over time and with the external environment. When a distribution is first assumed and then tested, the load data may contain abnormal load values, so the distribution test, which runs before abnormal value identification, may be affected by the abnormal data, and its result is unreliable.
Disclosure of Invention
The present invention is directed to a load abnormal value identification method, so as to solve the problems in the background art.
In order to achieve the above object, in one aspect, the present invention provides a load abnormal value identification method, including the steps of:
step 1: based on a space density clustering method, classifying the power utilization modes of the load curves into a normal power utilization mode and an abnormal power utilization mode;
step 2: based on a K-center clustering method, carrying out load level classification on the normal power utilization mode;
step 3: under different load levels, in view of the random distribution of the load values at each moment, constructing an abnormal load data domain based on the central limit theorem and the interquartile range of the deviations of the load values from the cluster center load;
step 4: identifying abnormal load values that may exist in the normal power utilization mode by using the abnormal load data domain constructed in step 3;
step 5: on the basis of the abnormal load data domain constructed in step 3, constructing an abnormal load data domain for the abnormal power utilization mode from the maximum upper limit and the minimum lower limit of that domain, and identifying the load abnormal values in the abnormal power utilization mode.
Preferably, in the step 1, the specific steps of classifying the user electricity utilization patterns based on the space density clustering method are as follows:
step 1.1: setting the minimum number of data points $N_{\min}$ contained within a neighborhood;
step 1.2: determining the scan radius $\varepsilon$: with $N_{\min}$ selected, calculating, for each load curve, the degree of difference in power consumption pattern between it and its $N_{\min}$-th nearest load curve in the neighborhood (the formula for the degree of difference is missing from this text), where $y_a$ and $y_b$ represent two different daily load curves, each of which can be expressed in vector form as $y = (x_1, x_2, \ldots, x_n)^T$, and $x_i$ represents the load value at the $i$-th time;
step 1.3: preparing a historical load curve set $D = \{y_1, y_2, \ldots, y_M\}$, where $M$ is the total number of load curves;
step 1.4: initializing the core object set $\Omega = \varnothing$, the number of classes $c = 0$, the unclassified sample set $\Lambda = D$, and the class partition set $S = \varnothing$;
step 1.5: for each load curve $y_i$ ($i = 1, 2, \ldots, M$), finding the core objects;
step 1.6: if the core object set $\Omega = \varnothing$, stopping the classification and going to step 1.10; otherwise going to step 1.7;
step 1.7: updating the class serial number $c = c + 1$, randomly selecting a core object $o$ from the set $\Omega$, initializing the current core object queue $\Omega_c = \{o\}$ and the current class set $S_c = \{o\}$, and updating the unclassified sample set $\Lambda = \Lambda - \{o\}$;
step 1.8: if $\Omega_c = \varnothing$, the current class $S_c$ is complete: updating $S = \{S_1, S_2, \ldots, S_c\}$ and $\Omega = \Omega - S_c$, and going to step 1.6; otherwise going to step 1.9;
step 1.9: taking a core object $o'$ out of $\Omega_c$, forming its $\varepsilon$-neighborhood sample set $Z_\varepsilon(o')$, obtaining the set of samples that are unclassified and belong to $Z_\varepsilon(o')$, $\Delta = Z_\varepsilon(o') \cap \Lambda$, updating $\Omega_c = \Omega_c \cup (\Delta \cap \Omega) - \{o'\}$, $S_c = S_c \cup \Delta$ and $\Lambda = \Lambda - \Delta$, and going to step 1.8;
step 1.10: outputting $S = \{S_1, S_2, \ldots, S_c\}$ and $\Lambda$.
Preferably, in step 1.5, the step of finding the core object is as follows:
① calculating the power consumption pattern difference degrees to find the $\varepsilon$-neighborhood sample set $Z_\varepsilon(y_i)$ of $y_i$;
② if $Z_\varepsilon(y_i)$ contains more than $N_{\min}$ samples, adding the sample $y_i$ to the core object set: $\Omega = \Omega \cup \{y_i\}$.
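Steps 1.3 to 1.10 resemble the well-known DBSCAN procedure, and a minimal sketch may make the bookkeeping of the core-object set, the queue and the unclassified set easier to follow. Two points are assumptions: the patent's own difference-degree formula is not available in this text, so plain Euclidean distance (`math.dist`) stands in for it, and a curve is treated as a core object when its neighborhood holds at least `n_min` curves (the text says "more than", so the strictness is a guess). The function name `density_cluster` is hypothetical.

```python
import math
import random


def density_cluster(curves, eps, n_min, dist=math.dist, seed=0):
    """Return (clusters, unclassified): a list of index sets and a set of indices."""
    rng = random.Random(seed)
    m = len(curves)
    # epsilon-neighborhood of every curve (each curve is its own neighbor).
    neigh = [{j for j in range(m) if dist(curves[i], curves[j]) <= eps}
             for i in range(m)]
    # Step 1.5: core objects (assumption: ">=" rather than a strict ">").
    core = {i for i in range(m) if len(neigh[i]) >= n_min}
    unclassified = set(range(m))          # step 1.4: Lambda = D
    clusters = []                         # step 1.4: S is empty
    while core:                           # step 1.6: stop when Omega is empty
        o = rng.choice(sorted(core))      # step 1.7: pick a random core object
        queue, current = [o], {o}
        unclassified.discard(o)
        while queue:                      # step 1.8: expand until the queue is empty
            o2 = queue.pop(0)             # step 1.9: take a core object from the queue
            delta = neigh[o2] & unclassified
            queue.extend(sorted(delta & core))
            current |= delta
            unclassified -= delta
        clusters.append(current)          # step 1.8: class S_c is complete
        core -= current                   # step 1.8: Omega = Omega - S_c
    return clusters, unclassified         # step 1.10


# Hypothetical 2-point "load curves": two dense groups and one isolated curve.
curves = [(0, 0), (0, 0.1), (0.1, 0), (5, 5), (5, 5.1), (5.1, 5), (20, 20)]
clusters, unclassified = density_cluster(curves, eps=0.5, n_min=3)
print(clusters, unclassified)
```

In the patent's terms, the curves left in `unclassified` would be treated as the abnormal power utilization mode and the curves in `clusters` as the normal mode.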
Preferably, in step 2, the specific steps of classifying the load level of the normal power consumption mode are as follows:
step 2.1: determining the optimal cluster number K: starting from the requirement that the load levels of the data points within a sub data set be highly similar while the load levels of different sub data sets differ widely, a comprehensive evaluation index (its formula is missing from this text) is evaluated for cluster numbers $K \in [1, K_{\max}]$, and the K that minimizes the index is taken as the expected cluster number of the sub data sets; in the index, one quantity represents the distance between a load curve $y_j$ contained in the k-th sub data set and the cluster center of that sub data set, with $N_k$ the number of load curves contained in the sub data set, and another quantity represents the degree of difference in load level between the K sub data sets, computed from the cluster centers of different sub data sets;
step 2.2: using the determined load level classification number K, applying the K-center clustering method again to classify the load level of the normal power utilization mode.
Preferably, in step 2.1, the detailed steps of K-center clustering are as follows:
step 2.1.1: set of load curves S from which a load level classification is to be carried outi(i-1, 2, …, c) randomly selecting K load curves in a centralized manner to serve as initial center points, and setting the maximum iteration times;
step 2.1.2: collecting load curves S of normal power consumption modes to be classifiediLoad curve of (1), assigned to the nearest distanceA center point of (a);
step 2.1.3: starting to execute iteration to enable the sum d of Euclidean distances corresponding to the classification resultsKMinimum, which is calculated by the formula
Preferably, in step 2.1.3, the iteration is performed as follows:
step 2.1.3.1: calculating $d_K$ from the assignment result: if this is the first assignment, $d_K$ is stored directly in the variable that holds the minimum $d_K$; otherwise, $d_K$ is stored in a temporary variable; continuing with step 2.1.3.2;
step 2.1.3.2: randomly selecting a non-center point;
step 2.1.3.3: creating a set C that stores the result of the current assignment: if this is the first assignment, C is stored directly in the set that holds the optimal classification result; otherwise, the current $d_K$ is compared with the stored minimum $d_K$, and if the current value is smaller, C is stored in the optimal-result set; the result in C is then modified according to the randomly selected non-center point by exchanging that point with the corresponding center point, preparing the data for the next round of assignment;
step 2.1.3.4: judging whether the specified maximum number of iterations has been reached; if so, terminating the calculation and outputting the optimal classification result; otherwise, executing the next round of iteration and returning to step 2.1.3.1.
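The iteration in steps 2.1.1 to 2.1.3.4 reads like a K-medoids search with random swaps, which the following sketch illustrates. Several details are assumptions because the original formulas and variable names are missing: $d_K$ is taken as the sum of Euclidean distances from each curve to its assigned center, the "corresponding center point" is taken to be the center of the cluster the chosen non-center point currently belongs to, and each round starts from the best configuration found so far. The names `assign` and `k_center_cluster` are hypothetical.

```python
import math
import random


def assign(curves, medoids):
    """Assign every curve to its nearest center; return (labels, d_K)."""
    labels, total = [], 0.0
    for y in curves:
        d, k = min((math.dist(y, curves[m]), k) for k, m in enumerate(medoids))
        labels.append(k)
        total += d
    return labels, total


def k_center_cluster(curves, k, max_iter=200, seed=0):
    """Random-swap K-medoids sketch; returns (center indices, labels, d_K)."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(curves)), k)         # step 2.1.1
    best_labels, best_cost = assign(curves, medoids)    # steps 2.1.2 and 2.1.3.1
    for _ in range(max_iter):                           # step 2.1.3.4
        non_centers = [i for i in range(len(curves)) if i not in medoids]
        if not non_centers:
            break
        i = rng.choice(non_centers)                     # step 2.1.3.2
        candidate = list(medoids)
        candidate[best_labels[i]] = i                   # swap with its own center
        labels, cost = assign(curves, candidate)
        if cost < best_cost:                            # step 2.1.3.3: keep the better result
            medoids, best_labels, best_cost = candidate, labels, cost
    return medoids, best_labels, best_cost


# Hypothetical one-point "load curves" at two clearly different load levels.
curves = [(0.0,), (0.1,), (0.2,), (10.0,), (10.1,), (10.2,)]
medoids, labels, cost = k_center_cluster(curves, k=2)
print(medoids, labels, cost)
```

Unlike K-means, the centers here are always actual load curves from the data set, which is what distinguishes a K-center (medoid) method.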
Preferably, in step 3, the abnormal load data domain is constructed as follows: for the k-th sub data set, at time t, the abnormal data domain is formed from the confidence interval of the expected load value at confidence level $1-\alpha$ and the interquartile range of the deviations of the load values from the cluster center load (the formulas for the confidence interval and the abnormal data domain are missing from this text). In these formulas, $X_t$ is the load random variable; the remaining quantities are the sample mean of the load at time t, the corrected sample variance at time t, the number of samples $N_k$, and $t_{\alpha/2}(N_k - 1)$, the upper $\alpha/2$ quantile of the t-distribution with $N_k - 1$ degrees of freedom.
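Since the domain formula itself did not survive in this text, the sketch below shows only one plausible reading, and it should be treated as an assumption rather than the patented formula: the normal band at time t is the t-based confidence interval of the mean, widened on each side by `rho` times the interquartile range of the deviations from the cluster center, and everything outside the band is the abnormal domain. The function name `abnormal_domain_bounds`, the parameter `rho`, and passing the critical value `t_crit` in from a t-table are all illustrative choices.

```python
import math
import statistics


def abnormal_domain_bounds(loads_t, center_t, t_crit, rho=1.5):
    """Return (lower, upper); values outside this band are treated as abnormal.

    ASSUMED form: mean -/+ t_crit * s / sqrt(N) -/+ rho * IQR.
    """
    n = len(loads_t)
    mean = statistics.fmean(loads_t)
    s = statistics.stdev(loads_t)          # corrected (n - 1) sample standard deviation
    half_width = t_crit * s / math.sqrt(n)
    # Interquartile range of the deviations from the cluster-center load.
    deviations = [x - center_t for x in loads_t]
    q1, _, q3 = statistics.quantiles(deviations, n=4, method="inclusive")
    iqr = q3 - q1
    return mean - half_width - rho * iqr, mean + half_width + rho * iqr


# Hypothetical loads of 10 curves at one time t; 2.262 is the tabulated
# upper 0.025 quantile of the t-distribution with 9 degrees of freedom.
loads_t = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05, 1.0, 1.0, 0.98, 1.02]
print(abnormal_domain_bounds(loads_t, center_t=1.0, t_crit=2.262))
```

A smaller $\alpha$ gives a larger critical value and therefore a wider normal band, which is how the required confidence level adjusts the range of the domain.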
Compared with the prior art, the invention has the following beneficial effects: it effectively overcomes two shortcomings, namely that the box-plot identification method does not consider the local distribution characteristics of the load values, and that methods relying on an assumed distribution apply poorly because the load data distribution is unknown. Compared with constructing the abnormal load data domain by the box-plot method, the method, being based on the spatial density clustering method and the K-center clustering method, classifies the load data reasonably and conveniently handles different power utilization modes and different load levels within the same power utilization mode. Constructing the abnormal load data domain from the central limit theorem and the interquartile range allows the range of the domain to be adjusted flexibly according to the required confidence level. On the one hand, the distribution of the load values does not need to be determined in advance, which avoids assuming the load data distribution on the basis of expert experience; on the other hand, the information provided by the load data is fully used, which solves the problem that the method adapts poorly to the load data sets of different types of users because the load data distribution is uncertain.
The algorithm provided by the invention fits the practical situation of power load data processing. As smart meters become more widespread, data servers store massive amounts of load data, and external interference or the condition of the acquisition equipment inevitably affects the recording of normal load values. If load data containing abnormal load values are used directly, related decisions may be wrong and losses may result. Because traditional methods make insufficient use of the load data information and the distribution is unknown, their accuracy in identifying abnormal load values and their applicability to different types of data are poor.
When faced with load data sets of different types of power users, the algorithm provided by the invention takes the local distribution characteristics of the load data into account and constructs a flexible and reliable abnormal load data domain; it can adapt to the load data sets of different types of users and meets the requirement of processing massive data efficiently.
Drawings
FIG. 1 is a schematic diagram illustrating the classification result of the load power consumption mode of the residential community according to the present invention;
FIG. 2 is a diagram illustrating the classification result of the load level in the normal power consumption mode according to the present invention;
FIG. 3 is a schematic diagram illustrating an abnormal load value recognition result in a normal power consumption mode according to the present invention;
FIG. 4 is a schematic diagram of an abnormal load value identification result in the abnormal power consumption mode according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1-4, the present invention provides a technical solution:
the invention provides a load abnormal value identification method, which comprises the following steps:
step 1: based on a space density clustering method, classifying the power utilization modes of the load curves into a normal power utilization mode and an abnormal power utilization mode;
step 2: based on a K-center clustering method, carrying out load level classification on the normal power utilization mode;
step 3: under different load levels, in view of the random distribution of the load values at each moment, constructing an abnormal load data domain based on the central limit theorem and the interquartile range of the deviations of the load values from the cluster center load;
step 4: identifying abnormal load values that may exist in the normal power utilization mode by using the abnormal load data domain constructed in step 3;
step 5: on the basis of the abnormal load data domain constructed in step 3, constructing an abnormal load data domain for the abnormal power utilization mode from the maximum upper limit and the minimum lower limit of that domain, and identifying the load abnormal values in the abnormal power utilization mode.
In this embodiment, in step 1, the specific steps of classifying the user power consumption modes based on the spatial density clustering method are as follows:
step 1.1: setting the minimum number of data points $N_{\min}$ contained within a neighborhood, usually taken as the dimension of the load samples in the load curve plus 1;
step 1.2: determining the scan radius $\varepsilon$: with $N_{\min}$ selected, calculating, for each load curve, the degree of difference in power consumption pattern between it and its $N_{\min}$-th nearest load curve in the neighborhood (the formula for the degree of difference is missing from this text), where $y_a$ and $y_b$ represent two different daily load curves, each of which can be expressed in vector form as $y = (x_1, x_2, \ldots, x_n)^T$, and $x_i$ represents the load value at the $i$-th time; the value at which the degree of difference in power consumption pattern changes stepwise for the first time is selected as the scan radius $\varepsilon$, and with these parameters the load data set is divided into a normal power utilization mode and an abnormal power utilization mode. For the load of the residential community, the 24-hour load of each day is regarded as one load data point, so $N_{\min} = 25$ and $\varepsilon = 0.0108$;
step 1.3: preparing a historical load curve set $D = \{y_1, y_2, \ldots, y_M\}$, where $M$ is the total number of load curves;
step 1.4: initializing the core object set $\Omega = \varnothing$, the number of classes $c = 0$, the unclassified sample set $\Lambda = D$, and the class partition set $S = \varnothing$;
step 1.5: for each load curve $y_i$ ($i = 1, 2, \ldots, M$), finding the core objects as follows:
① calculating the power consumption pattern difference degrees to find the $\varepsilon$-neighborhood sample set $Z_\varepsilon(y_i)$ of $y_i$;
② if $Z_\varepsilon(y_i)$ contains more than $N_{\min}$ samples, adding the sample $y_i$ to the core object set: $\Omega = \Omega \cup \{y_i\}$;
step 1.6: if the core object set $\Omega = \varnothing$, stopping the classification and going to step 1.10; otherwise going to step 1.7;
step 1.7: updating the class serial number $c = c + 1$, randomly selecting a core object $o$ from the set $\Omega$, initializing the current core object queue $\Omega_c = \{o\}$ and the current class set $S_c = \{o\}$, and updating the unclassified sample set $\Lambda = \Lambda - \{o\}$;
step 1.8: if $\Omega_c = \varnothing$, the current class $S_c$ is complete: updating $S = \{S_1, S_2, \ldots, S_c\}$ and $\Omega = \Omega - S_c$, and going to step 1.6; otherwise going to step 1.9;
step 1.9: taking a core object $o'$ out of $\Omega_c$, forming its $\varepsilon$-neighborhood sample set $Z_\varepsilon(o')$, obtaining the set of samples that are unclassified and belong to $Z_\varepsilon(o')$, $\Delta = Z_\varepsilon(o') \cap \Lambda$, updating $\Omega_c = \Omega_c \cup (\Delta \cap \Omega) - \{o'\}$, $S_c = S_c \cup \Delta$ and $\Lambda = \Lambda - \Delta$, and going to step 1.8;
step 1.10: outputting $S = \{S_1, S_2, \ldots, S_c\}$ and $\Lambda$.
The spatial density clustering method is suitable for clustering samples of any dimensionality and finds abnormal samples while completing the clustering. When it is used to classify the power utilization modes, the unclassified load curves in the set $\Lambda$ are regarded as abnormal power utilization mode curves, and the load curves in the set $S$ belong to the normal power utilization mode. The classification result of the power utilization modes is shown in FIG. 1.
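Step 1.2 selects the scan radius as the value at which the difference to the $N_{\min}$-th nearest curve first changes stepwise. A minimal sketch of that idea follows, under stated assumptions: Euclidean distance (`math.dist`) replaces the patent's difference-degree formula, which is not available here; a curve is not counted as its own neighbor; and a "step" is detected as the first gap between consecutive sorted values that exceeds a user-chosen threshold `min_jump`. Both function names and `min_jump` are hypothetical.

```python
import math


def kth_neighbor_distances(curves, k):
    """Sorted distance from every curve to its k-th nearest other curve."""
    out = []
    for i, a in enumerate(curves):
        d = sorted(math.dist(a, b) for j, b in enumerate(curves) if j != i)
        out.append(d[k - 1])
    return sorted(out)


def first_step_radius(sorted_kdist, min_jump):
    """Return the value just before the first gap larger than min_jump."""
    for prev, nxt in zip(sorted_kdist, sorted_kdist[1:]):
        if nxt - prev > min_jump:
            return prev
    return sorted_kdist[-1]   # no step found: fall back to the largest value


# Hypothetical one-point "load curves": four close together, one far away.
curves = [(0,), (1,), (2,), (3,), (50,)]
kdist = kth_neighbor_distances(curves, k=2)
print(kdist, first_step_radius(kdist, min_jump=5))
```

In practice the sorted values are often inspected as a plot, and the threshold would need tuning to the data; the patent reports $\varepsilon = 0.0108$ for its residential-community data set with $N_{\min} = 25$.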
Further, in step 2, the specific steps of classifying the load level of the normal power consumption mode are as follows:
step 2.1: determining the optimal cluster number K: starting from the requirement that the load levels of the data points within a sub data set be highly similar while the load levels of different sub data sets differ widely, a comprehensive evaluation index (its formula is missing from this text) is evaluated for cluster numbers $K \in [1, K_{\max}]$, and the K that minimizes the index is taken as the expected cluster number of the sub data sets; in the index, one quantity represents the distance between a load curve $y_j$ contained in the k-th sub data set and the cluster center of that sub data set, with $N_k$ the number of load curves contained in the sub data set, and another quantity represents the degree of difference in load level between the K sub data sets, computed from the cluster centers of different sub data sets; the detailed steps of K-center clustering are as follows:
step 2.1.1: randomly selecting K load curves from the load curve set $S_i$ ($i = 1, 2, \ldots, c$) to be classified by load level as the initial center points, and setting the maximum number of iterations;
step 2.1.2: assigning each load curve in the normal-power-utilization-mode load curve set $S_i$ to be classified to the nearest center point;
step 2.1.3: iterating so as to minimize the sum $d_K$ of the Euclidean distances corresponding to the classification result (the formula for $d_K$ is missing from this text);
step 2.1.3.1: calculating $d_K$ from the assignment result: if this is the first assignment, $d_K$ is stored directly in the variable that holds the minimum $d_K$; otherwise, $d_K$ is stored in a temporary variable; continuing with step 2.1.3.2;
step 2.1.3.2: randomly selecting a non-center point;
step 2.1.3.3: creating a set C that stores the result of the current assignment: if this is the first assignment, C is stored directly in the set that holds the optimal classification result; otherwise, the current $d_K$ is compared with the stored minimum $d_K$, and if the current value is smaller, C is stored in the optimal-result set; the result in C is then modified according to the randomly selected non-center point by exchanging that point with the corresponding center point, preparing the data for the next round of assignment;
step 2.1.3.4: judging whether the specified maximum number of iterations has been reached; if so, terminating the calculation and outputting the optimal classification result; otherwise, executing the next round of iteration and returning to step 2.1.3.1;
step 2.2: using the determined load level classification number K, applying the K-center clustering method again to classify the load level of the normal power utilization mode. The load level classification result of the normal power utilization mode for the load of the residential community is shown in FIG. 2.
Still further, in step 3, the abnormal load data domain is constructed as follows: for the k-th sub data set, at time t, the abnormal data domain is formed from the confidence interval of the expected load value at confidence level $1-\alpha$ and the interquartile range of the deviations of the load values from the cluster center load (the formulas for the confidence interval and the abnormal data domain are missing from this text). In these formulas, $X_t$ is the load random variable; the remaining quantities are the sample mean of the load at time t, the corrected sample variance at time t, the number of samples $N_k$, and $t_{\alpha/2}(N_k - 1)$, the upper $\alpha/2$ quantile of the t-distribution with $N_k - 1$ degrees of freedom, which can be obtained by looking up a t-distribution quantile table. $\rho$ is a significance indicator: following the definition of the outlier cut-off points in the box plot, an identified abnormal load value is called a mild outlier when $\rho = 1.5$ and an extreme outlier when $\rho = 3$. The interquartile range of the deviations of the load values of all load data points at time t from the cluster center load value is calculated as the difference between the third quartile and the first quartile of the load samples at time t. The sample mean is obtained by sampling with replacement: the load samples at time t are resampled repeatedly and their mean is calculated as one load-mean sample; this is repeated 100 times (a number determined by testing), and the load-mean samples are then averaged to give the sample mean.
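The resampling procedure for the sample mean described above is a bootstrap estimate, and it can be sketched in a few lines. The function name `bootstrap_mean`, the fixed `seed` and the choice to draw as many values per round as there are load samples are illustrative assumptions; the count of 100 rounds comes from the text.

```python
import random
import statistics


def bootstrap_mean(loads_t, n_rounds=100, seed=0):
    """Average of n_rounds resampled means of the loads at one time t."""
    rng = random.Random(seed)
    means = [
        # Draw len(loads_t) values with replacement, then take their mean.
        statistics.fmean(rng.choices(loads_t, k=len(loads_t)))
        for _ in range(n_rounds)
    ]
    return statistics.fmean(means)


# Hypothetical loads of five curves at one time t.
print(bootstrap_mean([1.0, 1.2, 0.8, 1.1, 0.9]))
```

Averaging many resampled means follows the central limit theorem argument the method relies on: the resampled means are approximately normally distributed even when the raw loads are not.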
Specifically, in step 4, the constructed abnormal load data domain is used to identify abnormal load values in the load data of the normal power utilization mode: if a load value falls within the abnormal load data domain, it is considered an abnormal load value; otherwise it is a normal load value. The identification result for abnormal values in the normal-power-utilization-mode load curves of the residential community load is shown in FIG. 3.
In addition, in step 5, for the abnormal power utilization mode, the corresponding abnormal load data domain is constructed by selecting the maximum upper bound and the minimum lower bound of the abnormal data domains of the normal power utilization mode. Likewise, the constructed abnormal load data domain serves as the basis for identifying abnormal load values in the load data of the abnormal power utilization mode: if a load value falls within the abnormal data domain, it is considered an abnormal load value; otherwise it is considered a normal load value. The identification result for abnormal load values in the abnormal-power-utilization-mode load curves of the residential community load is shown in FIG. 4.
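Steps 4 and 5 can be illustrated together with a short sketch. It assumes that each load level supplies, for every time t, a pair of bounds outside which a value counts as abnormal, and that the "maximum upper bound and minimum lower bound" are taken per time point across the load levels; that per-time reading, the data layout and the names `envelope_bounds` and `flag_abnormal` are assumptions for illustration.

```python
def envelope_bounds(bounds_per_level):
    """Combine per-level bounds into one band per time point.

    bounds_per_level: one list per load level, each holding (lower, upper)
    pairs for the time points 0..n-1.
    """
    n_times = len(bounds_per_level[0])
    return [
        (min(level[t][0] for level in bounds_per_level),   # minimum lower bound
         max(level[t][1] for level in bounds_per_level))   # maximum upper bound
        for t in range(n_times)
    ]


def flag_abnormal(curve, bounds):
    """Return one boolean per time point: True if the load is outside the band."""
    return [x < lo or x > hi for x, (lo, hi) in zip(curve, bounds)]


# Hypothetical bounds for two load levels at two time points.
levels = [
    [(0.5, 1.5), (1.0, 2.0)],   # low load level
    [(2.0, 3.0), (2.5, 3.5)],   # high load level
]
band = envelope_bounds(levels)
print(band, flag_abnormal([0.2, 2.0], band))
```

For step 4, `flag_abnormal` would be called with the bounds of the curve's own load level; for step 5, it would be called with the combined band, which is wider and therefore flags only values that are abnormal relative to every normal load level.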
The foregoing shows and describes the general principles, essential features and advantages of the invention. Those skilled in the art will understand that the invention is not limited to the embodiments described above; the embodiments and the description only present preferred embodiments of the invention and are not intended to limit it. The scope of the invention is defined by the appended claims and their equivalents.
Claims (7)
1. A load abnormal value identification method comprises the following steps:
step 1: based on a space density clustering method, classifying the power utilization modes of the load curves into a normal power utilization mode and an abnormal power utilization mode;
step 2: based on a K-center clustering method, carrying out load level classification on the normal power utilization mode;
step 3: under different load levels, in view of the random distribution of the load values at each moment, constructing an abnormal load data domain based on the central limit theorem and the interquartile range of the deviations of the load values from the cluster center load;
step 4: identifying abnormal load values that may exist in the normal power utilization mode by using the abnormal load data domain constructed in step 3;
step 5: on the basis of the abnormal load data domain constructed in step 3, constructing an abnormal load data domain for the abnormal power utilization mode from the maximum upper limit and the minimum lower limit of that domain, and identifying the load abnormal values in the abnormal power utilization mode.
2. The load abnormal value identification method according to claim 1, wherein in step 1 the specific steps of classifying the user power utilization modes based on the spatial density clustering method are as follows:
step 1.1: setting the minimum number of data points $N_{min}$ contained within a neighborhood;
step 1.2: determining the scanning radius $\varepsilon$: for the selected $N_{min}$, calculating the degree of difference in power utilization mode between each load curve and its $N_{min}$-th nearest load curve in the neighborhood [symbol missing], where [symbol missing] is calculated by the formula [formula missing]; in the formula, $y_a$ and $y_b$ denote two different daily load curves, each of which can be expressed in vector form as $y = (x_1, x_2, \ldots, x_n)^T$, where $x_i$ is the load value at the $i$-th time;
step 1.3: preparing a historical load curve set $D = \{y_1, y_2, \ldots, y_M\}$, where $M$ is the total number of load curves;
step 1.4: initializing the core object set $\Omega = \varnothing$, the number of classes $c = 0$, the set of unclassified samples $\Lambda = D$, and the class partition set $S = \varnothing$;
step 1.5: for each load curve $y_i$ ($i = 1, 2, \ldots, M$), searching for core objects;
step 1.6: if the core object set $\Omega = \varnothing$, stopping classification and proceeding to step 1.10; otherwise proceeding to step 1.7;
step 1.7: updating the class serial number $c = c + 1$, randomly selecting a core object $o$ from the set $\Omega$, initializing the current core object queue $\Omega_c = \{o\}$, initializing the current class set $S_c = \{o\}$, and updating the set of unclassified samples $\Lambda = \Lambda - \{o\}$;
step 1.8: if $\Omega_c = \varnothing$, then $S_c$ is complete; updating $S = \{S_1, S_2, \ldots, S_c\}$ and $\Omega = \Omega - S_c$, and proceeding to step 1.6; otherwise proceeding to step 1.9;
step 1.9: taking a core object $o'$ out of $\Omega_c$, forming its $\varepsilon$-neighborhood sample set $Z_\varepsilon(o')$, obtaining the set of samples that are unclassified and belong to $Z_\varepsilon(o')$, $\Delta = Z_\varepsilon(o') \cap \Lambda$, updating $\Omega_c = \Omega_c \cup (\Delta \cap \Omega) - \{o'\}$, $S_c = S_c \cup \Delta$ and $\Lambda = \Lambda - \Delta$, and proceeding to step 1.8;
step 1.10: outputting $S = \{S_1, S_2, \ldots, S_c\}$ and $\Lambda$.
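The loop in steps 1.3 to 1.10 can be sketched in a few lines. The sketch below is illustrative rather than the patented implementation: the difference-degree formula of step 1.2 is not present in the text, so Euclidean distance is used as a stand-in, and a curve is treated as a core object when its neighborhood (including itself) holds at least `n_min` curves. The function names and the `seed` parameter are assumptions.

```python
import math
import random


def euclidean(a, b):
    """Stand-in difference degree between two load curves (assumption)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def density_cluster(curves, eps, n_min, dist=euclidean, seed=0):
    """Return (S, Lambda): the list of classes and the set of unclassified indices."""
    rng = random.Random(seed)
    m = len(curves)
    # eps-neighborhood of every curve, stored as index sets (each includes itself)
    neigh = [{j for j in range(m) if dist(curves[i], curves[j]) <= eps}
             for i in range(m)]
    core = {i for i in range(m) if len(neigh[i]) >= n_min}  # step 1.5
    unclassified = set(range(m))                            # Lambda = D
    classes = []                                            # S
    while core:                                             # step 1.6
        o = rng.choice(sorted(core))                        # step 1.7
        queue, current = {o}, {o}
        unclassified.discard(o)
        while queue:                                        # step 1.8
            o2 = queue.pop()                                # step 1.9
            delta = neigh[o2] & unclassified
            queue |= delta & core
            current |= delta
            unclassified -= delta
        classes.append(current)
        core -= current                                     # Omega = Omega - S_c
    return classes, unclassified                            # step 1.10


# Two dense groups of short "load curves" plus one isolated curve (made-up data).
demo = [[0, 0], [0.1, 0], [0, 0.1], [5, 5], [5.1, 5], [5, 5.1], [10, 0]]
classes, leftover = density_cluster(demo, eps=0.5, n_min=3)
```

In this reading, the classes in `S` would correspond to normal power utilization modes and the leftover set `Lambda` to curves that fit no dense group, which is presumably what the method treats as the abnormal mode.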
3. The load abnormal value identification method according to claim 2, wherein in step 1.5 the steps of searching for core objects are as follows:
① calculating the power utilization mode difference degree to find the $\varepsilon$-neighborhood sample set $Z_\varepsilon(y_i)$ of $y_i$;
② if $Z_\varepsilon(y_i)$ contains more than $N_{min}$ samples, adding the sample $y_i$ to the core object set: $\Omega = \Omega \cup \{y_i\}$.
4. The load abnormal value identification method according to claim 1, wherein in step 2 the specific steps of classifying the normal power utilization mode by load level are as follows:
step 2.1: determining the optimal number of clusters $K$: considering both high similarity of load level among the data points within each sub data set and large differences in load level between sub data sets, a comprehensive evaluation index [formula missing] is used; within the range $K \in [1, K_{max}]$, the $K$ that minimizes this index is taken as the desired number of clusters; [symbol missing] represents the distance between the load curve $y_j$ contained in the $k$-th sub data set and its cluster center [symbol missing], and $N_k$ is the number of load curves contained in that sub data set; [symbol missing] represents the degree of difference in load level between the $K$ sub data sets, where [symbol missing] and [symbol missing] each denote the cluster center of a different sub data set;
step 2.2: using the determined number of load level classes $K$, classifying the normal power utilization mode by load level again with the K-center clustering method.
5. The load abnormal value identification method according to claim 4, wherein in step 2.1 the detailed steps of K-center clustering are as follows:
step 2.1.1: randomly selecting $K$ load curves from the set of load curves $S_i$ ($i = 1, 2, \ldots, c$) to be classified by load level as the initial center points, and setting the maximum number of iterations;
step 2.1.2: assigning each load curve in the set $S_i$ of normal power utilization mode load curves to be classified to its nearest center point;
6. The load abnormal value identification method according to claim 5, wherein in step 2.1.3 the specific steps of performing the iteration are as follows:
step 2.1.3.1: calculating $d_K$ from the assignment result: if this is the first assignment, $d_K$ is calculated and stored directly in the variable [symbol missing], which holds the minimum $d_K$; if this is not the first assignment, $d_K$ is calculated and stored in the variable [symbol missing]; then continuing with step 2.1.3.2;
step 2.1.3.2: randomly selecting a non-center point;
step 2.1.3.3: creating a set $C$ that stores the assignment result of the current iteration: if this is the first assignment, set $C$ is stored directly in the set [symbol missing], which holds the optimal classification result; if not, [symbol missing] and [symbol missing] are compared, and if [condition missing], set $C$ is saved in the set [symbol missing]; the result in set $C$ is then modified according to the randomly selected non-center point by swapping that point with its corresponding center point, which prepares the data for the next round of assignment;
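The K-center procedure of claims 5 and 6 (random initial centers, nearest-center assignment, random swap of a non-center point with its corresponding center, keeping the best result) can be sketched as below. Several details are assumptions because the corresponding symbols are missing from the text: $d_K$ is taken to be the total distance of all curves to their assigned centers, Euclidean distance is used, and a swap is kept only when it lowers $d_K$. The function names are hypothetical.

```python
import math
import random


def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def assign(curves, centers, dist):
    """Assign every curve to its nearest center; return (labels, total distance)."""
    labels, cost = [], 0.0
    for y in curves:
        d, k = min((dist(y, curves[c]), k) for k, c in enumerate(centers))
        labels.append(k)
        cost += d
    return labels, cost


def k_center(curves, k, max_iter=200, dist=euclidean, seed=0):
    rng = random.Random(seed)
    best_centers = rng.sample(range(len(curves)), k)           # step 2.1.1
    best_labels, best_cost = assign(curves, best_centers, dist)  # step 2.1.2
    for _ in range(max_iter):
        non_centers = [i for i in range(len(curves)) if i not in best_centers]
        if not non_centers:
            break
        cand = rng.choice(non_centers)                         # step 2.1.3.2
        trial = list(best_centers)
        trial[best_labels[cand]] = cand    # swap with the corresponding center
        labels, cost = assign(curves, trial, dist)
        if cost < best_cost:               # keep the smaller d_K (assumed rule)
            best_centers, best_labels, best_cost = trial, labels, cost
    return best_centers, best_labels, best_cost


# Two load levels with three short curves each (made-up data).
demo = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centers, labels, d_k = k_center(demo, k=2)
```

Because every center is always one of the input curves, the resulting cluster centers are actual load curves, which keeps them interpretable as representative daily profiles for each load level.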
7. The load abnormal value identification method according to claim 1, wherein in step 3 the abnormal load data domain is constructed as follows: for the $k$-th sub data set at time $t$, the confidence interval of the load expectation at confidence level $1-\alpha$ [formula missing] and the interquartile range of the deviations of the load values from the cluster-center load [formula missing] together form the abnormal data domain [formula missing], where $X_t$ is the load random variable, [symbol missing] is the sample mean of the load at time $t$, [symbol missing] is the corrected sample variance at time $t$, $N_k$ is the number of samples, and $t_{\alpha/2}(N_k-1)$ is the upper $\alpha/2$ quantile of the t-distribution with $N_k-1$ degrees of freedom.
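The two ingredients named in claim 7 can be computed as in the sketch below. It uses the standard t-based confidence interval for a mean, which matches the quantities the claim defines (sample mean, corrected sample variance, $N_k$, and $t_{\alpha/2}(N_k-1)$), and the interquartile range of the deviations from a cluster-center value. The formula that combines them into the final data domain is missing from the text, so the sketch stops at the two ingredients. The function names are hypothetical, the t quantile is supplied by the caller, and the quartile convention is that of Python's `statistics.quantiles` default, which is an assumption.

```python
import math
import statistics


def mean_confidence_interval(samples, t_quantile):
    """t-based confidence interval for the expected load at one time step.

    t_quantile is the upper alpha/2 quantile of the t-distribution with
    len(samples) - 1 degrees of freedom, looked up by the caller.
    """
    n = len(samples)
    mean = statistics.mean(samples)
    s = statistics.stdev(samples)  # square root of the corrected sample variance
    half_width = t_quantile * s / math.sqrt(n)
    return mean - half_width, mean + half_width


def deviation_iqr(samples, center):
    """Interquartile range of the deviations of the load values from the center."""
    deviations = [x - center for x in samples]
    q1, _, q3 = statistics.quantiles(deviations, n=4)
    return q3 - q1


# Load values of five curves of one load level at one time step (made-up data).
loads_t = [10, 12, 11, 13, 14]
# 2.776 is the upper 0.025 quantile of the t-distribution with 4 degrees of freedom.
low, high = mean_confidence_interval(loads_t, t_quantile=2.776)
spread = deviation_iqr(loads_t, center=12)
```

Running the two functions once per time step and per load level would give the per-level inputs from which the data domain of step 3 is built.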
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911125386.8A CN111046913B (en) | 2019-11-18 | 2019-11-18 | Load abnormal value identification method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111046913A (en) | 2020-04-21 |
CN111046913B (en) | 2023-02-14 |
Family
ID=70232982
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911125386.8A Active CN111046913B (en) | 2019-11-18 | 2019-11-18 | Load abnormal value identification method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111046913B (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110874381A (en) * | 2019-10-30 | 2020-03-10 | 西安交通大学 | User side load data abnormal value identification method based on space density clustering |
Non-Patent Citations (2)
Title |
---|
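Zhao Tianhui et al.: "Identification and correction method for industrial load abnormal values based on nonparametric regression analysis", Automation of Electric Power Systems (《电力系统自动化》) * |
Deng Mingbin et al.: "Power utilization mode analysis method based on user load", Computer and Digital Engineering (《计算机与数字工程》) * |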
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112529061A (en) * | 2020-12-03 | 2021-03-19 | 新奥数能科技有限公司 | Identification method and device for photovoltaic power abnormal data and terminal equipment |
CN112529061B (en) * | 2020-12-03 | 2024-04-16 | 新奥数能科技有限公司 | Photovoltaic power abnormal data identification method and device and terminal equipment |
CN112686288A (en) * | 2020-12-22 | 2021-04-20 | 博锐尚格科技股份有限公司 | Power consumption behavior anomaly detection method and device and computer readable storage medium |
CN113554117A (en) * | 2021-08-16 | 2021-10-26 | 中国南方电网有限责任公司 | Abnormal load data identification method and electronic equipment |
CN114169631A (en) * | 2021-12-15 | 2022-03-11 | 中国石油大学胜利学院 | Oil field power load management and control system based on data analysis |
CN114169631B (en) * | 2021-12-15 | 2022-10-25 | 山东石油化工学院 | Oil field power load management and control system based on data analysis |
Also Published As
Publication number | Publication date |
---|---|
CN111046913B (en) | 2023-02-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111046913B (en) | Load abnormal value identification method | |
CN112699913B (en) | Method and device for diagnosing abnormal relationship of household transformer in transformer area | |
WO2018082523A1 (en) | Load cycle mode identification method | |
CN110990461A (en) | Big data analysis model algorithm model selection method and device, electronic equipment and medium | |
CN111401460B (en) | Abnormal electric quantity data identification method based on limit value learning | |
CN111553444A (en) | Load identification method based on non-invasive load terminal data | |
CN110874381B (en) | Spatial density clustering-based user side load data abnormal value identification method | |
Ma et al. | Topology identification of distribution networks using a split-EM based data-driven approach | |
CN111242161A (en) | Non-invasive non-resident user load identification method based on intelligent learning | |
CN116148753A (en) | Intelligent electric energy meter operation error monitoring system | |
CN111291782B (en) | Accumulated load prediction method based on information accumulation k-Shape clustering algorithm | |
Hartmann et al. | Suspicious electric consumption detection based on multi-profiling using live machine learning | |
CN116821832A (en) | Abnormal data identification and correction method for high-voltage industrial and commercial user power load | |
Frank et al. | Extracting operating modes from building electrical load data | |
CN109858667A (en) | It is a kind of based on thunder and lightning weather to the short term clustering method of loading effects | |
CN114519651A (en) | Intelligent power distribution method based on electric power big data | |
CN107274025B (en) | System and method for realizing intelligent identification and management of power consumption mode | |
CN113408622A (en) | Non-invasive load identification method and system considering characteristic quantity information expression difference | |
CN112528762A (en) | Harmonic source identification method based on data correlation analysis | |
CN115508662B (en) | Method for judging affiliation relationship between district ammeter and meter box | |
Yang et al. | An electricity data cluster analysis method based on SAGA-FCM algorithm | |
CN111476298A (en) | Power load state identification method in home and office environment | |
CN115545422A (en) | Platform area user variation relation identification method based on improved decision mechanism | |
Liu et al. | Research on the transformer area recognition method based on improved K-means clustering algorithm | |
Xie et al. | Energy System Time Series Data Quality Maintenance System Based on Data Mining Technology |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||