CN110991477A - Method and system for identifying users in abnormal industry and abnormal electricity utilization behaviors of power system - Google Patents
- Publication number
- CN110991477A (CN201911037248.4A)
- Authority
- CN
- China
- Prior art keywords
- peak
- electricity consumption
- power consumption
- valley
- period
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
Abstract
The invention provides a method and a system for identifying abnormal-industry users and abnormal electricity consumption behavior in a power system. The method and system collect electricity consumption data from various industries and, based on the characteristics of each industry, establish electricity consumption characteristic index items that accurately describe those characteristics, including the peak period, the valley-period electricity consumption, and the peak-period electricity consumption. From these index items they determine the user characteristic indices: the valley-to-peak load ratio and the cumulative fluctuation rates of electricity consumption in the peak and valley periods. On this basis, electricity consumption behavior is cluster-analyzed industry by industry using the AP (affinity propagation) clustering algorithm, and the clustering results identify both users whose industry attribute is wrongly recorded and users within an industry whose electricity consumption behavior is abnormal. The method and system markedly reduce the number of users clustered together, correspondingly reduce the difficulty of detecting abnormal electricity consumption and the false-alarm rate of electricity theft detection, and effectively improve the operating benefit of power supply enterprises.
Description
Technical Field
The present invention relates to the field of power analysis, and more particularly, to a method and system for identifying abnormal industry users and abnormal electricity usage behavior of a power system.
Background
In a market economy, some unscrupulous operators steal electricity in pursuit of excessive profit, which disrupts the normal order of power supply and consumption and directly causes revenue losses for power supply enterprises. Traditionally, electricity theft detection has relied mainly on manual screening. The adoption of smart electricity meters has greatly enriched the information available on users' electricity consumption, and in recent years engineering practice has identified electricity theft mainly by monitoring meter indices such as current loss and undervoltage, following rules distilled from the experience of inspection personnel. This approach has a clear physical meaning, identifies abnormal electricity consumption accurately, and plays an important role in preventing and controlling electricity theft. Its scope is limited, however: it cannot effectively detect theft methods that leave the meter's monitoring indices normal, such as bypassing the meter. Researchers have therefore recently proposed cluster analysis of electricity consumption behavior, which can identify abnormal users whether or not the meter's monitoring indices are abnormal. This approach extracts behavioral features from users' monthly electricity consumption data, such as the fluctuation interval of the load curve and indices of user variability, volatility, and trend, and then applies data mining algorithms to cluster the users and identify the abnormal ones.
It should be noted that existing cluster-analysis methods for detecting abnormal electricity consumption presuppose a sudden change in the abnormal user's electricity consumption. Although the index items take different forms, most use sudden changes in electricity consumption or in the power curve as the key index. In engineering practice, many interfering factors can cause sudden changes in consumption and load, so cluster analysis that relies on such changes as the key index often suffers from low accuracy and fails to meet application requirements.
The electricity consumption behavior of users in different industries often differs markedly. When users of different types are clustered together, the variety of user types considerably enlarges the state space over which users may be distributed, which makes it harder to identify abnormal electricity consumption by clustering. A technique is therefore needed to reduce the difficulty of identifying abnormal electricity consumption by clustering power system users.
Disclosure of Invention
To reduce the difficulty, in the prior art, of identifying abnormal electricity consumption by clustering power system users, the invention provides a method for identifying abnormal-industry users and abnormal electricity consumption behavior in a power system, the method comprising:
collecting user industry information from the electricity customer files, collecting users' electricity consumption data at a predetermined time interval, classifying the data by industry, and determining the original electricity consumption sample set $X_t = \{x_{ijk}\}$ of the users in the t-th industry, where $x_{ijk}$ is the k-th electricity consumption reading of the i-th user on day j, and $1 \le k \le N$;
processing the missing values and abnormal values in the original sample set $X_t$ of the t-th industry to generate the valid electricity consumption sample set $X'_t = \{x'_{ijk}\}$ of the t-th industry, where a missing value is a reading absent from the original sample set, an abnormal value is a reading outside the predetermined normal consumption interval, and $x'_{ijk}$ is the k-th reading of the i-th user on day j after the data in $X_t$ have been processed, with $1 \le k \le N$;
determining, from the electricity consumption data in the valid sample set $X'_t$, the peak period, the valley period, the peak-period electricity consumption $\mathrm{peak}_{ij}$, and the valley-period electricity consumption $\mathrm{valley}_{ij}$ of the i-th user on day j;
determining the valley-to-peak load ratio $\mathrm{ratio}_{ij}$ of the i-th user on day j from $\mathrm{peak}_{ij}$ and $\mathrm{valley}_{ij}$, determining the cumulative fluctuation rate of peak-period electricity consumption of the i-th user on day j from the peak-period data, and determining the cumulative fluctuation rate of valley-period electricity consumption from the valley-period data;
cluster-analyzing the users in the t-th industry, based on predetermined AP clustering algorithm parameters, according to their valley-to-peak load ratio $\mathrm{ratio}_{ij}$, cumulative fluctuation rate of peak-period electricity consumption, and cumulative fluctuation rate of valley-period electricity consumption;
and determining, from the cluster analysis result, the abnormal-industry users in the t-th industry and the users within the industry whose electricity consumption behavior is abnormal.
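The first step, building one sample set per industry, might be sketched as follows. This is only an illustration: the flat `(user_id, day, k, value)` record layout, the `industry_of` lookup, and the function name are assumptions, not part of the source.

```python
from collections import defaultdict


def build_sample_sets(readings, industry_of):
    """Group readings into per-industry sample sets X_t.

    readings:    iterable of (user_id, day, k, value) tuples (assumed layout).
    industry_of: dict mapping user_id -> industry label from the customer file.
    Returns {industry: {user_id: {day: [values ordered by k]}}}.
    """
    staged = defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))
    for user_id, day, k, value in readings:
        staged[industry_of[user_id]][user_id][day][k] = value
    # Order each day's readings by their sampling index k.
    return {
        industry: {
            user: {day: [by_k[k] for k in sorted(by_k)] for day, by_k in days.items()}
            for user, days in users.items()
        }
        for industry, users in staged.items()
    }
```

Keeping the industries in separate containers means every later step (cleaning, feature extraction, clustering) can be run on one industry at a time, which is the point of the method.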
Further, processing the missing values and abnormal values in the original sample set $X_t$ of the t-th industry to generate the valid sample set $X'_t$ of the t-th industry means that, when a missing value and/or an abnormal value exists in the original sample set $X_t$, the neighbor mean method is used to replace it.
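A minimal sketch of neighbor-mean replacement is given below. The source names the method but not its details, so the choices here are assumptions: missing readings are represented as `None`, the normal interval is `[low, high]`, and the replacement is the mean of the nearest valid reading on each side (or the single available side at the ends of the day).

```python
def neighbor_mean_fill(values, low, high):
    """Replace missing (None) or out-of-range readings with the mean of the
    nearest valid readings on either side (assumed variant of the method)."""

    def valid(v):
        return v is not None and low <= v <= high

    filled = list(values)
    for idx, v in enumerate(values):
        if valid(v):
            continue
        # Nearest valid reading to the left and to the right, if any.
        left = next((values[j] for j in range(idx - 1, -1, -1) if valid(values[j])), None)
        right = next((values[j] for j in range(idx + 1, len(values)) if valid(values[j])), None)
        neighbors = [n for n in (left, right) if n is not None]
        if not neighbors:
            raise ValueError("no valid readings to impute from")
        filled[idx] = sum(neighbors) / len(neighbors)
    return filled
```

Neighbors are always taken from the original readings, so a replaced value never feeds into another replacement.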
Further, determining the peak-period electricity consumption $\mathrm{peak}_{ij}$ and the valley-period electricity consumption $\mathrm{valley}_{ij}$ of the i-th user on day j from the electricity consumption data in the valid sample set $X'_t$ comprises:
in the valid sample set $X'_t$, summing the electricity consumption readings of the i-th user on day j over each run of n consecutive sampling intervals, where $6 \le n \le N$;
arranging the sums for the i-th user on day j in ascending order, taking the continuous period with the largest sum as the peak period and the continuous period with the smallest sum as the valley period, and taking the sum over the peak period as the peak-period electricity consumption $\mathrm{peak}_{ij}$ and the sum over the valley period as the valley-period electricity consumption $\mathrm{valley}_{ij}$.
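The window search described above can be sketched as below, together with the valley-to-peak ratio defined in the method as valley divided by peak. The function names are hypothetical, and the sketch assumes windows do not wrap around midnight, which the source does not specify.

```python
def peak_valley(day_values, n):
    """Return (peak_start, peak_sum, valley_start, valley_sum) over all windows
    of n consecutive readings within one day (no wrap-around assumed)."""
    N = len(day_values)
    if not 6 <= n <= N:
        raise ValueError("window length must satisfy 6 <= n <= N")
    sums = [sum(day_values[s:s + n]) for s in range(N - n + 1)]
    # Sort window start indices by their sums, ascending.
    order = sorted(range(len(sums)), key=lambda s: sums[s])
    valley_start, peak_start = order[0], order[-1]
    return peak_start, sums[peak_start], valley_start, sums[valley_start]


def valley_peak_ratio(day_values, n):
    """ratio_ij = valley_ij / peak_ij for one user-day."""
    _, peak, _, valley = peak_valley(day_values, n)
    return valley / peak
```

For a day of 24 hourly readings with n = 6, the search compares 19 candidate windows; a user with no consumption at all in the peak window would need separate handling to avoid division by zero.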
Further, determining the valley-to-peak load ratio $\mathrm{ratio}_{ij}$ of the i-th user on day j from $\mathrm{peak}_{ij}$ and $\mathrm{valley}_{ij}$, determining the cumulative fluctuation rate of peak-period electricity consumption of the i-th user on day j from the peak-period data, and determining the cumulative fluctuation rate of valley-period electricity consumption of the i-th user on day j from the valley-period data comprises:
determining the valley-to-peak load ratio $\mathrm{ratio}_{ij}$ of the i-th user on day j from the peak-period electricity consumption $\mathrm{peak}_{ij}$ and the valley-period electricity consumption $\mathrm{valley}_{ij}$ by the formula:
$\mathrm{ratio}_{ij} = \mathrm{valley}_{ij} / \mathrm{peak}_{ij}$;
determining the cumulative fluctuation rate of peak-period electricity consumption of the i-th user on day j from the peak-period data by a formula whose terms are: the accumulated value of the fluctuation of electricity consumption in the peak period of day j for the i-th user; the mean electricity consumption in the peak period of day j for the i-th user; the k-th electricity consumption reading in the peak period of day j for the i-th user; and n, the number of electricity consumption readings in the peak period of day j for the i-th user;
determining the cumulative fluctuation rate of valley-period electricity consumption of the i-th user on day j from the valley-period data by a formula whose terms are: the accumulated value of the fluctuation of electricity consumption in the valley period of day j for the i-th user; the mean electricity consumption in the valley period of day j for the i-th user; the k-th electricity consumption reading in the valley period of day j for the i-th user; and n, the number of electricity consumption readings in the valley period of day j for the i-th user.
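The text here names the ingredients of the cumulative fluctuation rate (an accumulated fluctuation value, a period mean, the individual readings, and their count n) but does not carry the formula itself. The sketch below therefore rests on an assumed reading, not on the source: the accumulated fluctuation is taken to be the sum of absolute differences between consecutive readings, and the rate is that sum divided by the period mean. The function name is hypothetical.

```python
def cumulative_fluctuation_rate(period_values):
    """Assumed definition: sum of |x_{k+1} - x_k| over the period, divided by
    the period mean. The source's exact formula may differ."""
    n = len(period_values)
    if n < 2:
        raise ValueError("need at least two readings")
    mean = sum(period_values) / n
    if mean == 0:
        raise ValueError("mean consumption is zero")
    accumulated = sum(abs(b - a) for a, b in zip(period_values, period_values[1:]))
    return accumulated / mean
```

Under this reading a perfectly flat period scores 0, and dividing by the mean makes the rate comparable between large and small consumers; the same function would be applied once to the peak-period readings and once to the valley-period readings.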
Further, cluster-analyzing the users in the t-th industry, based on predetermined AP clustering algorithm parameters, according to their valley-to-peak load ratio $\mathrm{ratio}_{ij}$, cumulative fluctuation rate of peak-period electricity consumption, and cumulative fluctuation rate of valley-period electricity consumption, and determining from the cluster analysis result the users in the t-th industry with abnormal electricity consumption behavior, comprises:
setting the parameters of the AP clustering algorithm, the parameters comprising the attenuation (damping) coefficient $\lambda$, the maximum number of iterations $T$, and the maximum number of iterations $T'$ over which the cluster centers remain unchanged;
calculating, from the valley-to-peak load ratio $\mathrm{ratio}_{ij}$, the cumulative fluctuation rate of peak-period electricity consumption, and the cumulative fluctuation rate of valley-period electricity consumption, the Euclidean distance $d_{i,k}$ between user $x_i$ and user $x_k$ in the valid sample set $X'_t$ by the formula:
$d_{i,k} = \|x_i - x_k\|$
determining, from the Euclidean distance $d_{i,k}$, the similarity $s(i,k)$ between user $x_i$ and user $x_k$ by the formula:
$s(i,k) = -d_{i,k}$
taking $s(i,k)$ as the element at the corresponding position of the similarity matrix, and taking the median of all the similarity elements obtained as the diagonal elements, to generate the similarity matrix;
when the valid sample set $X'_t$ contains the electricity consumption data of n users in total, initializing the attraction (responsibility) matrix R and the attribution (availability) matrix A as $n \times n$ zero matrices, so that the elements $r_0(i,k)$ of R and $a_0(i,k)$ of A are 0, and then calculating the element $r_1(i,k)$ of the attraction matrix R and the element $a_1(i,k)$ of the attribution matrix A;
iteratively updating $r_{t+1}(i,k)$ and $a_{t+1}(i,k)$ by the formulas:
$r_{t+1}(i,k) = \lambda\, r_{t-1}(i,k) + (1-\lambda)\, r_t(i,k)$
$a_{t+1}(i,k) = \lambda\, a_{t-1}(i,k) + (1-\lambda)\, a_t(i,k)$
where $\lambda$ is the preset attenuation coefficient and $1 \le t \le T$;
summing the attribution degree and the attraction degree of each data point and determining the cluster centers: for a data point i, taking the k at which $a(i,k) + r(i,k)$ attains its maximum, if $i = k$ then data point i is itself a cluster center, and if $i \ne k$ then data point k is the cluster center of data point i;
and ending the AP clustering algorithm when the cluster centers have remained unchanged for the preset $T'$ iterations or when the iteration count t equals the preset maximum number of iterations $T$.
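The clustering steps above can be sketched in plain Python as follows. The formula for the first attraction and attribution elements is not carried in this text, so the sketch assumes the standard affinity propagation update rules (responsibility and availability messages); the function names and default parameter values are illustrative only. Each user is represented by a feature tuple, and the similarity is the negative Euclidean distance with the median off-diagonal similarity on the diagonal, as described above.

```python
import math
from statistics import median


def similarity_matrix(points):
    """s(i,k) = -||x_i - x_k||; diagonal = median of the off-diagonal values."""
    n = len(points)
    s = [[-math.dist(points[i], points[k]) for k in range(n)] for i in range(n)]
    pref = median(s[i][k] for i in range(n) for k in range(n) if i != k)
    for i in range(n):
        s[i][i] = pref
    return s


def affinity_propagation(s, damping=0.5, max_iter=200, stable_iter=15):
    """Return, for each point, the index of its cluster center.

    damping     ~ attenuation coefficient lambda
    max_iter    ~ maximum number of iterations T
    stable_iter ~ iterations T' with unchanged centers before stopping
    Requires at least two points.
    """
    n = len(s)
    r = [[0.0] * n for _ in range(n)]  # attraction (responsibility) matrix R
    a = [[0.0] * n for _ in range(n)]  # attribution (availability) matrix A
    labels, unchanged = None, 0
    for _ in range(max_iter):
        # Assumed standard rule: r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        for i in range(n):
            for k in range(n):
                best = max(a[i][kk] + s[i][kk] for kk in range(n) if kk != k)
                r[i][k] = damping * r[i][k] + (1 - damping) * (s[i][k] - best)
        # Assumed standard rule: a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        # and a(k,k) = sum_{i' != k} max(0, r(i',k)).
        for k in range(n):
            pos = [max(0.0, r[ii][k]) for ii in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                new = total if i == k else min(0.0, r[k][k] + total - pos[i])
                a[i][k] = damping * a[i][k] + (1 - damping) * new
        # Each point picks the k that maximizes a(i,k) + r(i,k).
        new_labels = [max(range(n), key=lambda k: a[i][k] + r[i][k]) for i in range(n)]
        unchanged = unchanged + 1 if new_labels == labels else 0
        labels = new_labels
        if unchanged >= stable_iter:
            break
    return labels
```

A call such as `affinity_propagation(similarity_matrix(features))`, with one `(ratio, peak_rate, valley_rate)` tuple per user, yields one center index per user. Production implementations usually add a small amount of noise to the similarities to break ties, which this sketch omits.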
According to another aspect of the present invention, there is provided a system for identifying abnormal industry users and abnormal electricity usage behavior of an electric power system, the system comprising:
a data acquisition unit for collecting user industry information from the electricity customer files, collecting users' electricity consumption data at a predetermined time interval, classifying the data by industry, and determining the original electricity consumption sample set $X_t = \{x_{ijk}\}$ of the users in the t-th industry, where $x_{ijk}$ is the k-th electricity consumption reading of the i-th user on day j, and $1 \le k \le N$;
a valid sample unit for processing the missing values and abnormal values in the original sample set $X_t$ of the t-th industry to generate the valid electricity consumption sample set $X'_t = \{x'_{ijk}\}$ of the t-th industry, where a missing value is a reading absent from the original sample set, an abnormal value is a reading outside the predetermined normal consumption interval, and $x'_{ijk}$ is the k-th reading of the i-th user on day j after the data in $X_t$ have been processed, with $1 \le k \le N$;
an electricity consumption characteristic index unit for determining, from the electricity consumption data in the valid sample set $X'_t$, the peak period, the valley period, the peak-period electricity consumption $\mathrm{peak}_{ij}$, and the valley-period electricity consumption $\mathrm{valley}_{ij}$ of the i-th user on day j; determining the valley-to-peak load ratio $\mathrm{ratio}_{ij}$ of the i-th user on day j from $\mathrm{peak}_{ij}$ and $\mathrm{valley}_{ij}$; determining the cumulative fluctuation rate of peak-period electricity consumption of the i-th user on day j from the peak-period data; and determining the cumulative fluctuation rate of valley-period electricity consumption from the valley-period data;
a cluster analysis unit for cluster-analyzing the users in the t-th industry, based on predetermined AP clustering algorithm parameters, according to their valley-to-peak load ratio $\mathrm{ratio}_{ij}$, cumulative fluctuation rate of peak-period electricity consumption, and cumulative fluctuation rate of valley-period electricity consumption;
and a result output unit for determining, from the cluster analysis result, the abnormal-industry users in the t-th industry and the users within the industry whose electricity consumption behavior is abnormal.
Further, the valid sample unit processing the missing values and abnormal values in the original sample set $X_t$ of the t-th industry to generate the valid sample set $X'_t$ of the t-th industry means that, when a missing value and/or an abnormal value exists in the original sample set $X_t$, the neighbor mean method is used to replace it.
Further, the electricity consumption characteristic index unit determining the peak-period electricity consumption $\mathrm{peak}_{ij}$ and the valley-period electricity consumption $\mathrm{valley}_{ij}$ of the i-th user on day j from the electricity consumption data in the valid sample set $X'_t$ comprises:
in the valid sample set $X'_t$, summing the electricity consumption readings of the i-th user on day j over each run of n consecutive sampling intervals, where $6 \le n \le N$;
arranging the sums for the i-th user on day j in ascending order, taking the continuous period with the largest sum as the peak period and the continuous period with the smallest sum as the valley period, and taking the sum over the peak period as the peak-period electricity consumption $\mathrm{peak}_{ij}$ and the sum over the valley period as the valley-period electricity consumption $\mathrm{valley}_{ij}$.
Further, the electricity consumption characteristic index unit determining the valley-to-peak load ratio $\mathrm{ratio}_{ij}$ of the i-th user on day j from $\mathrm{peak}_{ij}$ and $\mathrm{valley}_{ij}$, determining the cumulative fluctuation rate of peak-period electricity consumption of the i-th user on day j from the peak-period data, and determining the cumulative fluctuation rate of valley-period electricity consumption of the i-th user on day j from the valley-period data comprises:
determining the valley-to-peak load ratio $\mathrm{ratio}_{ij}$ of the i-th user on day j from the peak-period electricity consumption $\mathrm{peak}_{ij}$ and the valley-period electricity consumption $\mathrm{valley}_{ij}$ by the formula:
$\mathrm{ratio}_{ij} = \mathrm{valley}_{ij} / \mathrm{peak}_{ij}$;
determining the cumulative fluctuation rate of peak-period electricity consumption of the i-th user on day j from the peak-period data by a formula whose terms are: the accumulated value of the fluctuation of electricity consumption in the peak period of day j for the i-th user; the mean electricity consumption in the peak period of day j for the i-th user; the k-th electricity consumption reading in the peak period of day j for the i-th user; and n, the number of electricity consumption readings in the peak period of day j for the i-th user;
determining the cumulative fluctuation rate of valley-period electricity consumption of the i-th user on day j from the valley-period data by a formula whose terms are: the accumulated value of the fluctuation of electricity consumption in the valley period of day j for the i-th user; the mean electricity consumption in the valley period of day j for the i-th user; the k-th electricity consumption reading in the valley period of day j for the i-th user; and n, the number of electricity consumption readings in the valley period of day j for the i-th user.
Further, the cluster analysis unit cluster-analyzing the users in the t-th industry, based on predetermined AP clustering algorithm parameters, according to their valley-to-peak load ratio $\mathrm{ratio}_{ij}$, cumulative fluctuation rate of peak-period electricity consumption, and cumulative fluctuation rate of valley-period electricity consumption, and determining from the cluster analysis result the users in the t-th industry with abnormal electricity consumption behavior, comprises:
setting the parameters of the AP clustering algorithm, the parameters comprising the attenuation (damping) coefficient $\lambda$, the maximum number of iterations $T$, and the maximum number of iterations $T'$ over which the cluster centers remain unchanged;
calculating, from the valley-to-peak load ratio $\mathrm{ratio}_{ij}$, the cumulative fluctuation rate of peak-period electricity consumption, and the cumulative fluctuation rate of valley-period electricity consumption, the Euclidean distance $d_{i,k}$ between user $x_i$ and user $x_k$ in the valid sample set $X'_t$ by the formula:
$d_{i,k} = \|x_i - x_k\|$
determining, from the Euclidean distance $d_{i,k}$, the similarity $s(i,k)$ between user $x_i$ and user $x_k$ by the formula:
$s(i,k) = -d_{i,k}$
taking $s(i,k)$ as the element at the corresponding position of the similarity matrix, and taking the median of all the similarity elements obtained as the diagonal elements, to generate the similarity matrix;
when the valid sample set $X'_t$ contains the electricity consumption data of n users in total, initializing the attraction (responsibility) matrix R and the attribution (availability) matrix A as $n \times n$ zero matrices, so that the elements $r_0(i,k)$ of R and $a_0(i,k)$ of A are 0, and then calculating the element $r_1(i,k)$ of the attraction matrix R and the element $a_1(i,k)$ of the attribution matrix A;
iteratively updating $r_{t+1}(i,k)$ and $a_{t+1}(i,k)$ by the formulas:
$r_{t+1}(i,k) = \lambda\, r_{t-1}(i,k) + (1-\lambda)\, r_t(i,k)$
$a_{t+1}(i,k) = \lambda\, a_{t-1}(i,k) + (1-\lambda)\, a_t(i,k)$
where $\lambda$ is the preset attenuation coefficient and $1 \le t \le T$;
summing the attribution degree and the attraction degree of each data point and determining the cluster centers: for a data point i, taking the k at which $a(i,k) + r(i,k)$ attains its maximum, if $i = k$ then data point i is itself a cluster center, and if $i \ne k$ then data point k is the cluster center of data point i;
and ending the AP clustering algorithm when the cluster centers have remained unchanged for the preset $T'$ iterations or when the iteration count t equals the preset maximum number of iterations $T$.
The method and system for identifying abnormal-industry users and abnormal electricity consumption behavior in a power system provided by the technical solution of the invention take the electricity consumption data of users in different industries over a period of time and analyze the consumption patterns of the different industries. Based on statistical analysis of industry characteristics, they establish electricity consumption characteristic index items that accurately describe those characteristics, including valley-period and peak-period electricity consumption, and from these items determine the user characteristic indices: the valley-to-peak load ratio and the cumulative fluctuation rates of electricity consumption in the peak and valley periods. With these indices established, users are subdivided by industry and their consumption behavior is cluster-analyzed with the Affinity Propagation clustering algorithm; the clustering results identify abnormal users whose industry attribute is wrongly recorded as well as users within an industry whose consumption behavior is abnormal. Because abnormal behavior is identified by clustering within each industry and the index items are extracted from the characteristics common to that industry's users, interference from sudden-change indices of electricity consumption is avoided. Clustering by industry also markedly reduces the number of users clustered together, which effectively narrows the range over which the clustering objects are distributed in the state space, correspondingly reduces the difficulty of detecting abnormal electricity consumption and the false-alarm rate of electricity theft detection, and effectively improves the operating benefit of power supply enterprises.
Drawings
A more complete understanding of exemplary embodiments of the present invention may be had by reference to the following drawings in which:
FIG. 1 is a flow chart of a method of identifying abnormal industry users and abnormal electricity usage behavior for an electrical power system in accordance with a preferred embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a system for identifying abnormal industry users and abnormal electricity usage behavior of an electrical power system in accordance with a preferred embodiment of the present invention.
Detailed Description
The exemplary embodiments of the present invention will now be described with reference to the accompanying drawings. The present invention may, however, be embodied in many different forms and is not limited to the embodiments described herein, which are provided so that this disclosure is thorough and complete and fully conveys the scope of the present invention to those skilled in the art. The terminology used in the exemplary embodiments illustrated in the accompanying drawings is not intended to limit the invention. In the drawings, the same units/elements are denoted by the same reference numerals.
Unless otherwise defined, terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Further, it will be understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense.
Fig. 1 is a flowchart of a method for identifying abnormal industry users and abnormal electricity usage behavior of an electric power system according to a preferred embodiment of the present invention. As shown in fig. 1, a method 100 for identifying abnormal industry users and abnormal electricity consumption behaviors of an electric power system according to the preferred embodiment starts with step 101.
In step 101, user industry information is collected based on the electricity user profile, electricity consumption data of the users are collected at a predetermined time interval, the electricity consumption data are classified according to the industry information, and an original sample set of electricity consumption data $X_t = \{x_{ijk}\}$ of the electricity users in the t-th industry is determined, where $x_{ijk}$ is the k-th electricity consumption data of the i-th user on the j-th day and $1 \le k \le N$. The sampling interval for collecting the electricity consumption data of users by industry is generally set to 30 minutes; this interval preserves the diversity of the sampled data while keeping the sampling frequency from becoming too high, which improves efficiency.
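The grouping in step 101 can be sketched as follows. This is only an illustration under stated assumptions: the function name `build_sample_sets`, the record layout `(user_id, day, k, value)` and the constant `SAMPLES_PER_DAY` are hypothetical, and N = 48 is assumed to follow from a 30-minute interval over a 24-hour day.

```python
from collections import defaultdict

# Assumption: N = 24 h / 30 min = 48 readings per user per day.
SAMPLES_PER_DAY = 48


def build_sample_sets(profiles, readings):
    """Group daily load curves by industry (one sample set X_t per industry).

    profiles: dict mapping user_id -> industry label (from the user profile).
    readings: iterable of (user_id, day, k, value) with 1 <= k <= N.
    Returns: dict industry -> dict (user_id, day) -> list of N values,
             where a reading that was never reported stays None.
    """
    sample_sets = defaultdict(dict)
    for user_id, day, k, value in readings:
        industry = profiles[user_id]
        curve = sample_sets[industry].setdefault(
            (user_id, day), [None] * SAMPLES_PER_DAY
        )
        curve[k - 1] = value  # k is 1-based in the text
    return dict(sample_sets)
```

Readings that never arrive remain `None`, which makes the missing values that step 102 has to handle explicit.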
In step 102, missing values and abnormal values in the original sample set $X_t$ of electricity consumption data of the t-th industry are processed to generate a valid sample set $X'_t = \{x'_{ijk}\}$ of electricity consumption data of the t-th industry, where a missing value is electricity consumption data absent from the original sample set, an abnormal value is electricity consumption data outside a predetermined normal electricity consumption interval, and $x'_{ijk}$ is the k-th electricity consumption data of the i-th user on the j-th day after the data in the original sample set $X_t$ have been processed, $1 \le k \le N$. Analyzing and processing the data in the collected original sample set ensures the integrity, reasonableness and accuracy of the sample data to be analyzed.
In step 103, the peak period, the valley period, the peak-period electricity consumption $\mathrm{peak}_{ij}$ and the valley-period electricity consumption $\mathrm{valley}_{ij}$ of the i-th user on the j-th day are determined from the electricity consumption data in the valid sample set $X'_t$. The electricity load of an industrial user is generally markedly higher during production than in other periods, while the electricity consumption trend outside production is relatively stable. In addition, daily peaks and valleys arise from production scheduling and daily life, obvious load fluctuation can occur when a production period ends, and some industries have no day-night daily cycle at all. For these reasons, the peak period, the valley period, $\mathrm{peak}_{ij}$ and $\mathrm{valley}_{ij}$ reflect industry characteristics well and are adopted as index characteristic items for identifying abnormal users and abnormal electricity consumption behaviors.
In step 104, the valley-to-peak load ratio $\mathrm{ratio}_{ij}$ of the i-th user on the j-th day is determined from the peak-period electricity consumption $\mathrm{peak}_{ij}$ and the valley-period electricity consumption $\mathrm{valley}_{ij}$; the peak-period cumulative fluctuation rate of electricity consumption of the i-th user on the j-th day is determined from the electricity consumption data of the peak period; and the valley-period cumulative fluctuation rate of electricity consumption is determined from the electricity consumption data of the valley period.
In step 105, based on predetermined AP clustering algorithm parameters, cluster analysis is performed on the users of the t-th industry according to their valley-to-peak load ratio $\mathrm{ratio}_{ij}$, peak-period cumulative fluctuation rate of electricity consumption and valley-period cumulative fluctuation rate of electricity consumption.
In step 106, abnormal users in the t-th industry and users with abnormal electricity consumption behavior within the industry are determined according to the cluster analysis result.
Preferably, processing the missing values and abnormal values in the original sample set $X_t$ of electricity consumption data of the t-th industry to generate the valid sample set $X'_t$ of the t-th industry means that, when a missing value and/or an abnormal value exists in the original sample set $X_t$, a neighbor mean method is used to replace that missing value and/or abnormal value.
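A minimal sketch of the neighbor mean replacement is given below. The text does not spell out which neighbors are used, so this sketch assumes the nearest valid reading on each side within the same daily curve; the function name `clean_curve` and the bounds `low` and `high` (standing in for the predetermined normal electricity consumption interval) are hypothetical.

```python
def clean_curve(curve, low, high):
    """Replace missing (None) or out-of-range readings in one daily curve.

    Assumption: "neighbor mean" is the mean of the nearest valid reading on
    each side; if only one side has a valid reading, that reading is used.
    """
    def valid(v):
        return v is not None and low <= v <= high

    cleaned = list(curve)
    for k, v in enumerate(curve):
        if valid(v):
            continue
        # Search the ORIGINAL curve so imputed values never feed other imputations.
        left = next((curve[m] for m in range(k - 1, -1, -1) if valid(curve[m])), None)
        right = next((curve[m] for m in range(k + 1, len(curve)) if valid(curve[m])), None)
        neighbours = [x for x in (left, right) if x is not None]
        if not neighbours:
            raise ValueError("curve has no valid reading to impute from")
        cleaned[k] = sum(neighbours) / len(neighbours)
    return cleaned
```

For example, `clean_curve([1.0, None, 3.0, 500.0, 5.0], 0.0, 100.0)` fills the gap with 2.0 and replaces the out-of-range 500.0 with 4.0.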
Preferably, determining the peak-period electricity consumption $\mathrm{peak}_{ij}$ and the valley-period electricity consumption $\mathrm{valley}_{ij}$ of the i-th user on the j-th day from the electricity consumption data in the valid sample set $X'_t$, which are then used to determine the valley-to-peak load ratio $\mathrm{ratio}_{ij}$ of the i-th user on the j-th day, comprises:
in the valid sample set $X'_t$, summing the n electricity consumption data of the i-th user in each continuous time period of the j-th day, where $6 \le n \le N$. When the sampling interval is 30 minutes, the 12 electricity consumption data of 6 consecutive hours are typically summed.
Arranging the summed values of the i-th user on the j-th day in ascending order, determining the continuous time period with the largest summed value as the peak period and the continuous time period with the smallest summed value as the valley period, and taking the summed value of the peak period as the peak-period electricity consumption $\mathrm{peak}_{ij}$ and the summed value of the valley period as the valley-period electricity consumption $\mathrm{valley}_{ij}$.
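The window search described above can be sketched as follows. It assumes that the continuous time periods are all contiguous windows of `window` readings inside one day, without wrapping around midnight, which the text does not state explicitly; the function name `peak_and_valley` is hypothetical.

```python
def peak_and_valley(curve, window=12):
    """Locate the peak and valley periods of one cleaned daily curve.

    curve:  list of N readings for user i on day j.
    window: number of consecutive readings per period (12 = 6 h at 30 min).
    Returns ((peak_start, peak_sum), (valley_start, valley_sum)) where the
    start values are 0-based indices into curve.
    """
    if not 0 < window <= len(curve):
        raise ValueError("window must be between 1 and len(curve)")

    # Sliding-window sums, computed incrementally.
    running = sum(curve[:window])
    sums = [(running, 0)]
    for start in range(1, len(curve) - window + 1):
        running += curve[start + window - 1] - curve[start - 1]
        sums.append((running, start))

    sums.sort()  # ascending order, as in the text
    valley_sum, valley_start = sums[0]   # smallest sum -> valley period
    peak_sum, peak_start = sums[-1]      # largest sum  -> peak period
    return (peak_start, peak_sum), (valley_start, valley_sum)
```

Because the tuples are sorted by sum and then by start index, ties resolve to the earliest window for the valley and the latest window for the peak; a different tie-breaking rule would be equally compatible with the text.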
Preferably, determining the valley-to-peak load ratio $\mathrm{ratio}_{ij}$ of the i-th user on the j-th day from the peak-period electricity consumption $\mathrm{peak}_{ij}$ and the valley-period electricity consumption $\mathrm{valley}_{ij}$, determining the peak-period cumulative fluctuation rate of electricity consumption of the i-th user on the j-th day from the electricity consumption data of the peak period, and determining the valley-period cumulative fluctuation rate of electricity consumption of the i-th user on the j-th day from the electricity consumption data of the valley period comprises:
determining the valley-to-peak load ratio $\mathrm{ratio}_{ij}$ of the i-th user on the j-th day from $\mathrm{peak}_{ij}$ and $\mathrm{valley}_{ij}$, with the calculation formula:
$\mathrm{ratio}_{ij} = \mathrm{valley}_{ij} / \mathrm{peak}_{ij}$.
The valley-to-peak load ratio $\mathrm{ratio}_{ij}$ reflects the difference between the electricity consumption of an industry user in the valley period and in the peak period; the smaller the value, the larger the difference.
The peak-period cumulative fluctuation rate of electricity consumption of the i-th user on the j-th day is determined from the electricity consumption data of the peak period. The quantities entering its calculation formula are: the accumulated value of the electricity consumption fluctuation in the peak period of the i-th user on the j-th day; the average electricity consumption in the peak period of the i-th user on the j-th day; the k-th electricity consumption data in the peak period of the i-th user on the j-th day; and n, the number of electricity consumption data in the peak period of the i-th user on the j-th day;
the valley-period cumulative fluctuation rate of electricity consumption of the i-th user on the j-th day is determined from the electricity consumption data of the valley period. The quantities entering its calculation formula are: the accumulated value of the electricity consumption fluctuation in the valley period of the i-th user on the j-th day; the average electricity consumption in the valley period of the i-th user on the j-th day; the k-th electricity consumption data in the valley period of the i-th user on the j-th day; and n, the number of electricity consumption data in the valley period of the i-th user on the j-th day.
Adopting the peak-period and valley-period cumulative fluctuation rates of electricity consumption facilitates comparison among users with different electricity consumption levels.
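The three feature items can be sketched as below. The ratio follows the formula given in the text. The fluctuation-rate formula itself is not reproduced in this text, so the sketch ASSUMES that the accumulated fluctuation value is the sum of absolute differences between consecutive readings and that the rate is this value divided by the period average; both the assumed formula and the function names are hypothetical.

```python
def cumulative_fluctuation_rate(values):
    """Accumulated fluctuation of one period, normalised by the period mean.

    ASSUMPTION (not confirmed by the source): accumulated fluctuation is
    sum(|x_{k+1} - x_k|) over the n readings of the period.
    """
    mean = sum(values) / len(values)
    if mean == 0:
        return 0.0  # a flat zero curve has no fluctuation to normalise
    accumulated = sum(abs(b - a) for a, b in zip(values, values[1:]))
    return accumulated / mean


def daily_features(peak_values, valley_values):
    """Return (ratio_ij, peak-period rate, valley-period rate) for one user-day."""
    peak = sum(peak_values)      # peak_ij
    valley = sum(valley_values)  # valley_ij
    if peak == 0:
        raise ValueError("peak-period consumption is zero; ratio undefined")
    ratio = valley / peak        # ratio_ij = valley_ij / peak_ij
    return (
        ratio,
        cumulative_fluctuation_rate(peak_values),
        cumulative_fluctuation_rate(valley_values),
    )
```

Dividing by the period mean makes the rate dimensionless, which is what allows users with very different consumption levels to be placed in the same feature space.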
Preferably, performing cluster analysis on the users of the t-th industry, based on predetermined AP clustering algorithm parameters, according to their valley-to-peak load ratio $\mathrm{ratio}_{ij}$, peak-period cumulative fluctuation rate of electricity consumption and valley-period cumulative fluctuation rate of electricity consumption, and determining the users with abnormal electricity consumption behavior in the t-th industry according to the cluster analysis result, comprises:
setting the parameters of the AP clustering algorithm, the parameters comprising the attenuation (damping) coefficient $\lambda$, the maximum number of iterations $T$, and the maximum number of iterations $T'$ for which the cluster centers remain unchanged;
calculating, according to the valley-to-peak load ratio $\mathrm{ratio}_{ij}$, the peak-period cumulative fluctuation rate of electricity consumption and the valley-period cumulative fluctuation rate of electricity consumption, the Euclidean distance $d_{i,k}$ between user $x_i$ and user $x_k$ in the valid sample set $X'_t$, with the calculation formula:
$d_{i,k} = \lVert x_i - x_k \rVert$
determining, according to the Euclidean distance $d_{i,k}$, the similarity $s(i,k)$ between user $x_i$ and user $x_k$, with the calculation formula:
$s(i,k) = -d_{i,k}$
taking $s(i,k)$ as the element at the corresponding position of the similarity matrix, and taking the median of all obtained similarity elements as the diagonal elements of the similarity matrix, to generate the similarity matrix;
when the valid sample set $X'_t$ contains the electricity consumption data of n users in total, initializing the attraction (responsibility) matrix R and the attribution (availability) matrix A as n × n zero matrices, so that the elements $r_0(i,k)$ of R and $a_0(i,k)$ of A are 0, and then calculating the elements $r_1(i,k)$ of the attraction matrix R and $a_1(i,k)$ of the attribution matrix A;
iteratively updating $r_{t+1}(i,k)$ and $a_{t+1}(i,k)$, with the calculation formulas:
$r_{t+1}(i,k) = \lambda \, r_{t-1}(i,k) + (1-\lambda) \, r_t(i,k)$
$a_{t+1}(i,k) = \lambda \, a_{t-1}(i,k) + (1-\lambda) \, a_t(i,k)$
where $\lambda$ is the preset attenuation coefficient and $1 \le t \le T$;
summing the attribution degree and the attraction degree of each data point and determining the cluster centers, wherein, for a data point i, at the k for which $a(i,k) + r(i,k)$ attains its maximum, data point i is itself a cluster center if $i = k$, and data point k is the cluster center of data point i if $i \ne k$;
and ending the AP clustering algorithm when the cluster centers have remained unchanged for the preset maximum number of iterations $T'$, or when the number of iterations t reaches the preset maximum number of iterations $T$.
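The clustering steps above can be sketched in plain Python as follows. The formulas for the attraction and attribution elements are not reproduced in this text, so the sketch uses the standard Affinity Propagation updates as an assumption, and it reads the damping rule as "new value = λ × previous value + (1 − λ) × freshly computed value". Similarity is taken as the negative Euclidean distance, and the diagonal is set to the median of the off-diagonal similarities (also an assumption about which elements enter the median). The function name and argument names are hypothetical, and at least two points are required.

```python
import math
from statistics import median


def affinity_propagation(points, damping=0.5, max_iter=500, stable_iter=50):
    """Return, for each point i, the index of its cluster center (needs n >= 2)."""
    n = len(points)
    # Similarity matrix: s(i, k) = -||x_i - x_k||.
    s = [[-math.dist(points[i], points[k]) for k in range(n)] for i in range(n)]
    pref = median(s[i][k] for i in range(n) for k in range(n) if i != k)
    for i in range(n):
        s[i][i] = pref  # reference degree on the diagonal
    r = [[0.0] * n for _ in range(n)]  # attraction (responsibility) matrix R
    a = [[0.0] * n for _ in range(n)]  # attribution (availability) matrix A
    centers, unchanged = None, 0
    for _ in range(max_iter):
        # r(i,k) <- s(i,k) - max_{k' != k} [a(i,k') + s(i,k')], then damp.
        for i in range(n):
            row = [a[i][k] + s[i][k] for k in range(n)]
            first = max(range(n), key=row.__getitem__)
            best = row[first]
            second = max(row[k] for k in range(n) if k != first)
            for k in range(n):
                new = s[i][k] - (second if k == first else best)
                r[i][k] = damping * r[i][k] + (1 - damping) * new
        # a(i,k) <- min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)));
        # a(k,k) <- sum_{i' != k} max(0, r(i',k)); then damp.
        for k in range(n):
            pos = [max(0.0, r[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                new = total if i == k else min(0.0, r[k][k] + total - pos[i])
                a[i][k] = damping * a[i][k] + (1 - damping) * new
        # Cluster center of i: the k maximising a(i,k) + r(i,k).
        current = [max(range(n), key=lambda k, i=i: a[i][k] + r[i][k])
                   for i in range(n)]
        unchanged = unchanged + 1 if current == centers else 0
        centers = current
        if unchanged >= stable_iter:  # centers unchanged for T' iterations
            break
    return centers
```

The defaults mirror λ, T and T' as named in the text. The loops are cubic in the number of users per iteration, so this form is meant for reading rather than for large data sets.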
In the preferred embodiment, flue-cured tobacco users are taken as an example. The electricity load of a flue-cured tobacco user in the tobacco leaf maturing season is markedly higher than in other periods, while electricity consumption outside the maturing season is low and its trend is stable. In addition, one curing cycle takes about 7 days, during which the load does not change sharply and no day-night daily cycle exists, so these users have very typical electricity consumption characteristic indexes.
200 dedicated-transformer flue-cured tobacco users, 4 brickyards and 16 rural public distribution transformers in a certain area are selected. To fully depict the electricity consumption characteristics and trend of flue-cured tobacco users in the tobacco leaf maturing season, 30-minute interval electricity consumption data for the 30 days from June 1 to June 30, 2016 are selected, the data of each user are preprocessed, the index characteristic items are calculated, and the users are subjected to cluster analysis with the AP clustering algorithm. The parameters of the AP clustering algorithm are set as follows: the attenuation coefficient $\lambda$ is 0.5, the maximum number of iterations $T$ is 500, the maximum number of iterations $T'$ for which the cluster centers remain unchanged is 50, and the reference degree (preference) is set to the median of all values in the similarity matrix. The cluster centers are shown in Table 1, and the users contained in each cluster and the partition accuracy are shown in Table 2.
TABLE 1 Cluster center feature item 30 day average
As shown in Table 1, for each of the 4 cluster centers other than the center of cluster 1, at least one of the peak-period cumulative fluctuation rate of electricity consumption and the valley-period cumulative fluctuation rate of electricity consumption is much greater than 1, which indicates that the users of these clusters are either users not belonging to the industry or users with abnormal electricity consumption behavior within the industry.
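The rule stated for Table 1 can be sketched as a simple screen over the cluster centers. The text only says "much greater than 1" and gives no numeric cut-off, so the `threshold` argument is a hypothetical parameter that the analyst must choose, and the center values used in any call are illustrative, not the values of Table 1.

```python
def flag_abnormal_clusters(centers, threshold):
    """Return the ids of clusters whose center looks abnormal.

    centers:   dict cluster_id -> (ratio, peak_rate, valley_rate), e.g. the
               30-day averages of the cluster-center feature items.
    threshold: hypothetical cut-off standing in for "much greater than 1".
    """
    return sorted(
        cluster_id
        for cluster_id, (_ratio, peak_rate, valley_rate) in centers.items()
        # Flag the cluster if at least one fluctuation rate exceeds the cut-off.
        if max(peak_rate, valley_rate) > threshold
    )
```

With made-up centers such as `{1: (0.8, 0.2, 0.3), 2: (0.4, 3.5, 0.9)}` and a cut-off of 2.0, only cluster 2 is flagged.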
The users contained in each cluster and the partition accuracy are shown in Table 2 below.
As shown in Table 2, cluster 1 contains 197 dedicated-transformer flue-cured tobacco users, cluster 2 accurately identifies the 16 public transformers that are not flue-cured tobacco users, cluster 3 accurately identifies all 4 brickyards that are not flue-cured tobacco users, and clusters 4 and 5 accurately identify the 2 abnormal users among the 200 flue-cured tobacco users. The identification of abnormal users not belonging to the flue-cured tobacco industry and of users with abnormal electricity consumption behavior within the industry is thus achieved with high accuracy, the classification dimensionality of the users is greatly reduced, and the false alarm rate of electricity theft detection is lowered.
In engineering practice, a large number of interfering factors can cause sudden changes in electricity consumption and load. Consequently, methods that presuppose sudden changes in the electricity consumption of abnormal users, and that cluster on such sudden changes and on the power curve as key indexes to identify abnormal electricity consumption behaviors, often suffer from low accuracy and have difficulty meeting application requirements. Moreover, the electricity consumption behavior characteristics of users in different industries often differ markedly; when users of different types are clustered together, the diversity of user types significantly enlarges the state space over which users may be distributed, which increases the difficulty of identifying abnormal electricity consumption by clustering. The method for identifying abnormal users and abnormal electricity consumption behaviors of a power system provided by the invention analyzes the electricity consumption patterns of different industries, establishes electricity consumption index characteristic items based on statistical analysis of the industry characteristics, and, after subdividing the electricity users by industry, performs cluster analysis of their electricity consumption behaviors with the Affinity Propagation clustering algorithm. It can therefore identify users whose industry attribute is wrongly identified in each industry and can also accurately identify abnormal user behavior within an industry. By combining the electricity consumption behavior characteristics of the industry, it describes user behavior accurately, and by reducing the classification dimensionality of users and the false alarm rate of electricity theft detection, it promotes the practical application of data-driven electricity theft detection.
Fig. 2 is a schematic structural diagram of a system for identifying abnormal industry users and abnormal electricity utilization behaviors of an electric power system according to a preferred embodiment of the invention. As shown in fig. 2, a system 200 for identifying abnormal industry users and abnormal electricity consumption behaviors of an electric power system according to the preferred embodiment includes:
a data acquisition unit 201, configured to collect user industry information based on the user profile, collect electricity consumption data of the users at a predetermined time interval, classify the electricity consumption data according to the industry information, and determine an original sample set of electricity consumption data $X_t = \{x_{ijk}\}$ of the electricity users in the t-th industry, where $x_{ijk}$ is the k-th electricity consumption data of the i-th user on the j-th day and $1 \le k \le N$;
a valid sample unit 202, configured to process missing values and abnormal values in the original sample set $X_t$ of electricity consumption data of the t-th industry to generate a valid sample set $X'_t = \{x'_{ijk}\}$ of electricity consumption data of the t-th industry, where a missing value is electricity consumption data absent from the original sample set, an abnormal value is electricity consumption data outside a predetermined normal electricity consumption interval, and $x'_{ijk}$ is the k-th electricity consumption data of the i-th user on the j-th day after the data in the original sample set $X_t$ have been processed, $1 \le k \le N$;
an electricity consumption characteristic index unit 203, configured to determine, from the electricity consumption data in the valid sample set $X'_t$, the peak period, the valley period, the peak-period electricity consumption $\mathrm{peak}_{ij}$ and the valley-period electricity consumption $\mathrm{valley}_{ij}$ of the i-th user on the j-th day, to determine the valley-to-peak load ratio $\mathrm{ratio}_{ij}$ of the i-th user on the j-th day from $\mathrm{peak}_{ij}$ and $\mathrm{valley}_{ij}$, to determine the peak-period cumulative fluctuation rate of electricity consumption of the i-th user on the j-th day from the electricity consumption data of the peak period, and to determine the valley-period cumulative fluctuation rate of electricity consumption from the electricity consumption data of the valley period;
a cluster analysis unit 204, configured to perform cluster analysis on the users of the t-th industry, based on predetermined AP clustering algorithm parameters, according to their valley-to-peak load ratio $\mathrm{ratio}_{ij}$, peak-period cumulative fluctuation rate of electricity consumption and valley-period cumulative fluctuation rate of electricity consumption;
and a result output unit 205, configured to determine, according to the cluster analysis result, the abnormal users in the t-th industry and the users with abnormal electricity consumption behavior within the industry.
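One way to picture how units 201 to 205 fit together is a small pipeline object in which each unit is a callable and the output of one unit is the input of the next. This is only an architectural sketch: the class name `IdentificationSystem`, the field names and the `run` method are hypothetical and are not taken from the source.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class IdentificationSystem:
    """Hypothetical composition of the five units of system 200."""

    acquire: Callable[[], Any]               # data acquisition unit 201
    validate: Callable[[Any], Any]           # valid sample unit 202
    extract_features: Callable[[Any], Any]   # characteristic index unit 203
    cluster: Callable[[Any], Any]            # cluster analysis unit 204
    report: Callable[[Any], Any]             # result output unit 205

    def run(self) -> Any:
        raw = self.acquire()                       # original sample set X_t
        valid = self.validate(raw)                 # valid sample set X'_t
        features = self.extract_features(valid)    # ratio and fluctuation rates
        clusters = self.cluster(features)          # AP clustering result
        return self.report(clusters)               # abnormal users
```

Keeping each unit behind a plain callable lets any single stage, for example the clustering stage, be swapped or tested in isolation.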
Preferably, the valid sample unit 202 processing the missing values and abnormal values in the original sample set $X_t$ of electricity consumption data of the t-th industry to generate the valid sample set $X'_t$ of the t-th industry means that, when a missing value and/or an abnormal value exists in the original sample set $X_t$, a neighbor mean method is used to replace that missing value and/or abnormal value.
Preferably, the electricity consumption characteristic index unit 203 determining the peak-period electricity consumption $\mathrm{peak}_{ij}$ and the valley-period electricity consumption $\mathrm{valley}_{ij}$ of the i-th user on the j-th day from the electricity consumption data in the valid sample set $X'_t$, which are then used to determine the valley-to-peak load ratio $\mathrm{ratio}_{ij}$ of the i-th user on the j-th day, comprises:
in the valid sample set $X'_t$, summing the n electricity consumption data of the i-th user in each continuous time period of the j-th day, where $6 \le n \le N$;
arranging the summed values of the i-th user on the j-th day in ascending order, determining the continuous time period with the largest summed value as the peak period and the continuous time period with the smallest summed value as the valley period, and taking the summed value of the peak period as the peak-period electricity consumption $\mathrm{peak}_{ij}$ and the summed value of the valley period as the valley-period electricity consumption $\mathrm{valley}_{ij}$.
Preferably, the electricity consumption characteristic index unit 203 determining the valley-to-peak load ratio $\mathrm{ratio}_{ij}$ of the i-th user on the j-th day from the peak-period electricity consumption $\mathrm{peak}_{ij}$ and the valley-period electricity consumption $\mathrm{valley}_{ij}$, determining the peak-period cumulative fluctuation rate of electricity consumption of the i-th user on the j-th day from the electricity consumption data of the peak period, and determining the valley-period cumulative fluctuation rate of electricity consumption of the i-th user on the j-th day from the electricity consumption data of the valley period comprises:
determining the valley-to-peak load ratio $\mathrm{ratio}_{ij}$ of the i-th user on the j-th day from $\mathrm{peak}_{ij}$ and $\mathrm{valley}_{ij}$, with the calculation formula:
$\mathrm{ratio}_{ij} = \mathrm{valley}_{ij} / \mathrm{peak}_{ij}$.
determining the peak-period cumulative fluctuation rate of electricity consumption of the i-th user on the j-th day from the electricity consumption data of the peak period,
the quantities entering its calculation formula being: the accumulated value of the electricity consumption fluctuation in the peak period of the i-th user on the j-th day; the average electricity consumption in the peak period of the i-th user on the j-th day; the k-th electricity consumption data in the peak period of the i-th user on the j-th day; and n, the number of electricity consumption data in the peak period of the i-th user on the j-th day;
determining the valley-period cumulative fluctuation rate of electricity consumption of the i-th user on the j-th day from the electricity consumption data of the valley period,
the quantities entering its calculation formula being: the accumulated value of the electricity consumption fluctuation in the valley period of the i-th user on the j-th day; the average electricity consumption in the valley period of the i-th user on the j-th day; the k-th electricity consumption data in the valley period of the i-th user on the j-th day; and n, the number of electricity consumption data in the valley period of the i-th user on the j-th day.
Preferably, the cluster analysis unit 204 performing cluster analysis on the users of the t-th industry, based on predetermined AP clustering algorithm parameters, according to their valley-to-peak load ratio $\mathrm{ratio}_{ij}$, peak-period cumulative fluctuation rate of electricity consumption and valley-period cumulative fluctuation rate of electricity consumption, and determining the users with abnormal electricity consumption behavior in the t-th industry according to the cluster analysis result, comprises:
setting the parameters of the AP clustering algorithm, the parameters comprising the attenuation (damping) coefficient $\lambda$, the maximum number of iterations $T$, and the maximum number of iterations $T'$ for which the cluster centers remain unchanged;
calculating, according to the valley-to-peak load ratio $\mathrm{ratio}_{ij}$, the peak-period cumulative fluctuation rate of electricity consumption and the valley-period cumulative fluctuation rate of electricity consumption, the Euclidean distance $d_{i,k}$ between user $x_i$ and user $x_k$ in the valid sample set $X'_t$, with the calculation formula:
$d_{i,k} = \lVert x_i - x_k \rVert$
determining, according to the Euclidean distance $d_{i,k}$, the similarity $s(i,k)$ between user $x_i$ and user $x_k$, with the calculation formula:
$s(i,k) = -d_{i,k}$
taking $s(i,k)$ as the element at the corresponding position of the similarity matrix, and taking the median of all obtained similarity elements as the diagonal elements of the similarity matrix, to generate the similarity matrix;
when the valid sample set $X'_t$ contains the electricity consumption data of n users in total, initializing the attraction (responsibility) matrix R and the attribution (availability) matrix A as n × n zero matrices, so that the elements $r_0(i,k)$ of R and $a_0(i,k)$ of A are 0, and then calculating the elements $r_1(i,k)$ of the attraction matrix R and $a_1(i,k)$ of the attribution matrix A;
iteratively updating $r_{t+1}(i,k)$ and $a_{t+1}(i,k)$, with the calculation formulas:
$r_{t+1}(i,k) = \lambda \, r_{t-1}(i,k) + (1-\lambda) \, r_t(i,k)$
$a_{t+1}(i,k) = \lambda \, a_{t-1}(i,k) + (1-\lambda) \, a_t(i,k)$
where $\lambda$ is the preset attenuation coefficient and $1 \le t \le T$;
summing the attribution degree and the attraction degree of each data point and determining the cluster centers, wherein, for a data point i, at the k for which $a(i,k) + r(i,k)$ attains its maximum, data point i is itself a cluster center if $i = k$, and data point k is the cluster center of data point i if $i \ne k$;
and ending the AP clustering algorithm when the cluster centers have remained unchanged for the preset maximum number of iterations $T'$, or when the number of iterations t reaches the preset maximum number of iterations $T$.
The steps by which the system identifies abnormal-industry users and abnormal electricity consumption behaviors of the power system are the same as the steps of the method for identifying abnormal-industry users and abnormal electricity consumption behaviors of the power system, and the technical effects are likewise the same, so they are not described again here.
The invention has been described with reference to a few embodiments. However, other embodiments of the invention than the one disclosed above are equally possible within the scope of the invention, as would be apparent to a person skilled in the art from the appended patent claims.
Generally, all terms used in the claims are to be interpreted according to their ordinary meaning in the technical field, unless explicitly defined otherwise herein. All references to "a/an/the [ device, component, etc ]" are to be interpreted openly as referring to at least one instance of said device, component, etc., unless explicitly stated otherwise. The steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that the above embodiments only illustrate the technical solutions of the present invention and do not limit them. Although the present invention is described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that modifications and equivalents may be made to the embodiments of the invention without departing from its spirit and scope, all of which are to be covered by the claims.
Claims (10)
1. A method for identifying abnormal industry users and abnormal electricity consumption behavior in a power system, the method comprising:
collecting user industry information from the user file, collecting electricity consumption data of the users at a predetermined time interval, classifying the electricity consumption data according to the industry information, and determining an original sample set $X_t=\{x_{ijk}\}$ of electricity consumption data for the users in the t-th industry, where $x_{ijk}$ is the k-th electricity consumption datum of the i-th user on the j-th day and $1 \le k \le N$;
processing the missing values and abnormal values in the original sample set $X_t$ of the t-th industry to generate a valid sample set $X'_t=\{x'_{ijk}\}$ of electricity consumption data for the t-th industry, where a missing value is an electricity consumption datum absent from the original sample set, an abnormal value is an electricity consumption datum outside a predetermined normal electricity consumption interval, and $x'_{ijk}$ is the k-th electricity consumption datum of the i-th user on the j-th day after the data in $X_t$ have been processed, with $1 \le k \le N$;
determining, from the electricity consumption data in the valid sample set $X'_t$, the peak period and the valley period of the i-th user on the j-th day, together with the peak-period electricity consumption $\mathrm{peak}_{ij}$ and the valley-period electricity consumption $\mathrm{valley}_{ij}$;
determining the valley-to-peak load ratio $\mathrm{ratio}_{ij}$ of the i-th user on the j-th day from $\mathrm{peak}_{ij}$ and $\mathrm{valley}_{ij}$, determining the cumulative fluctuation rate of peak-period electricity consumption of the i-th user on the j-th day from the peak-period electricity consumption data, and determining the cumulative fluctuation rate of valley-period electricity consumption from the valley-period electricity consumption data;
performing cluster analysis on the users of the t-th industry, based on predetermined AP clustering algorithm parameters, according to their valley-to-peak load ratio $\mathrm{ratio}_{ij}$, cumulative fluctuation rate of peak-period electricity consumption, and cumulative fluctuation rate of valley-period electricity consumption;
and determining, according to the cluster analysis result, the abnormal users of the t-th industry and the users in that industry with abnormal electricity consumption behavior.
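As an illustration of the first step of claim 1, the following Python sketch groups raw meter readings into per-industry sample sets. It is only a sketch under stated assumptions: the function name `build_sample_sets`, the tuple layout of the readings, and the choice to skip users with no industry on file are all hypothetical and are not taken from the claims.

```python
from collections import defaultdict


def build_sample_sets(user_industry, readings):
    """Group readings into per-industry sample sets X_t = {x_ijk}.

    user_industry: dict mapping user id -> industry label (from the user file).
    readings: iterable of (user_id, day, k, value) tuples, where k is the
              1-based index of the reading within the day.
    Returns a nested dict: industry -> (user_id, day) -> {k: value}.
    """
    sample_sets = defaultdict(lambda: defaultdict(dict))
    for user_id, day, k, value in readings:
        industry = user_industry.get(user_id)
        if industry is None:
            # Assumption of this sketch: users without industry
            # information are skipped rather than guessed at.
            continue
        sample_sets[industry][(user_id, day)][k] = value
    return sample_sets
```

Each inner dictionary then holds the readings $x_{ijk}$ of one user on one day, keyed by k, which is the shape the later steps operate on.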
2. The method of claim 1, wherein processing the missing values and abnormal values in the original sample set $X_t$ of electricity consumption data of the t-th industry to generate the valid sample set $X'_t$ of electricity consumption data of the t-th industry means that, when a missing value and/or an abnormal value exists in the original sample set $X_t$, the missing value and/or abnormal value is replaced using a neighbor mean method.
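The claim names a neighbor mean method without spelling it out. The Python sketch below shows one plausible reading, assuming that a missing or out-of-range reading is replaced by the mean of the nearest valid readings before and after it on the same day. The function name `neighbor_mean_fill`, the use of `None` for missing values, and the `low`/`high` bounds standing in for the predetermined normal interval are hypothetical.

```python
def neighbor_mean_fill(day_readings, low, high):
    """Replace missing (None) or out-of-range readings by the mean of the
    nearest valid readings on either side (assumed interpretation)."""

    def valid(v):
        return v is not None and low <= v <= high

    filled = list(day_readings)
    for idx, value in enumerate(day_readings):
        if valid(value):
            continue
        # Search the ORIGINAL series so that filled values never
        # feed into later replacements.
        before = next((day_readings[p] for p in range(idx - 1, -1, -1)
                       if valid(day_readings[p])), None)
        after = next((day_readings[p] for p in range(idx + 1, len(day_readings))
                      if valid(day_readings[p])), None)
        neighbors = [v for v in (before, after) if v is not None]
        if neighbors:
            filled[idx] = sum(neighbors) / len(neighbors)
        # If the day has no valid reading at all, the value is left as-is.
    return filled
```

At the start or end of a day only one neighbor exists, so in this sketch the replacement is simply that neighbor's value.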
3. The method of claim 1, wherein determining, from the electricity consumption data in the valid sample set $X'_t$, the peak-period electricity consumption $\mathrm{peak}_{ij}$ and the valley-period electricity consumption $\mathrm{valley}_{ij}$ of the i-th user on the j-th day, and determining the valley-to-peak load ratio $\mathrm{ratio}_{ij}$ of the i-th user on the j-th day from $\mathrm{peak}_{ij}$ and $\mathrm{valley}_{ij}$, comprises:
summing, in the valid sample set $X'_t$, the n electricity consumption data of each continuous time period of the i-th user on the j-th day, where $6 \le n \le N$;
arranging the summed values of the i-th user on the j-th day in ascending order, determining the continuous time period with the largest summed value as the peak period and the continuous time period with the smallest summed value as the valley period, and taking the summed value of the peak period as the peak-period electricity consumption $\mathrm{peak}_{ij}$ and the summed value of the valley period as the valley-period electricity consumption $\mathrm{valley}_{ij}$.
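The window search of claim 3 can be sketched in Python as follows. This is a minimal illustration, assuming that the continuous time periods are all windows of n consecutive readings within one day and that windows do not wrap around midnight; the function name `peak_valley` and the handling of a zero peak are hypothetical. The ratio uses the valley-over-peak formula stated in the claims.

```python
def peak_valley(day_readings, n):
    """Return (peak, valley, ratio) for one user-day.

    day_readings: the N processed readings x'_ijk of one user on one day.
    n: number of consecutive readings per window, with 6 <= n <= N.
    """
    N = len(day_readings)
    if not 6 <= n <= N:
        raise ValueError("window length n must satisfy 6 <= n <= N")
    # Sum every window of n consecutive readings (no wrap-around assumed).
    window_sums = [sum(day_readings[s:s + n]) for s in range(N - n + 1)]
    ordered = sorted(window_sums)  # ascending order, as in the claim
    valley, peak = ordered[0], ordered[-1]
    # Valley-to-peak load ratio; undefined when the peak sum is zero.
    ratio = valley / peak if peak else float("nan")
    return peak, valley, ratio
```

For example, 24 hourly readings with n = 6 give 19 candidate windows, of which the largest and smallest sums define the peak and valley periods.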
4. The method according to claim 3, wherein determining the valley-to-peak load ratio $\mathrm{ratio}_{ij}$ of the i-th user on the j-th day from the peak-period electricity consumption $\mathrm{peak}_{ij}$ and the valley-period electricity consumption $\mathrm{valley}_{ij}$, determining the cumulative fluctuation rate of peak-period electricity consumption of the i-th user on the j-th day from the peak-period electricity consumption data, and determining the cumulative fluctuation rate of valley-period electricity consumption of the i-th user on the j-th day from the valley-period electricity consumption data, comprises:
determining the valley-to-peak load ratio $\mathrm{ratio}_{ij}$ of the i-th user on the j-th day from $\mathrm{peak}_{ij}$ and $\mathrm{valley}_{ij}$ by the formula:
$\mathrm{ratio}_{ij} = \mathrm{valley}_{ij} / \mathrm{peak}_{ij}$;
determining the cumulative fluctuation rate of peak-period electricity consumption of the i-th user on the j-th day from the peak-period electricity consumption data by the formula: [formula not reproduced in this text]
where the quantities in the formula are the accumulated value of the fluctuation of peak-period electricity consumption of the i-th user on the j-th day, the average peak-period electricity consumption of the i-th user on the j-th day, and the k-th electricity consumption datum of the i-th user in the peak period of the j-th day, and n is the number of electricity consumption data of the i-th user in the peak period of the j-th day;
determining the cumulative fluctuation rate of valley-period electricity consumption of the i-th user on the j-th day from the valley-period electricity consumption data by the formula: [formula not reproduced in this text]
where the quantities in the formula are the accumulated value of the fluctuation of valley-period electricity consumption of the i-th user on the j-th day, the average valley-period electricity consumption of the i-th user on the j-th day, and the k-th electricity consumption datum of the i-th user in the valley period of the j-th day, and n is the number of electricity consumption data of the i-th user in the valley period of the j-th day.
5. The method according to claim 4, wherein performing cluster analysis on the users of the t-th industry, based on predetermined AP clustering algorithm parameters, according to their valley-to-peak load ratio $\mathrm{ratio}_{ij}$, cumulative fluctuation rate of peak-period electricity consumption, and cumulative fluctuation rate of valley-period electricity consumption, and determining the users with abnormal electricity consumption behavior in the t-th industry according to the cluster analysis result, comprises:
setting the parameters of the AP clustering algorithm, the parameters comprising an attenuation (damping) coefficient $\lambda$, a maximum number of iterations $T$, and a maximum number of iterations $T'$ over which the cluster centers remain unchanged;
calculating, from the valley-to-peak load ratio $\mathrm{ratio}_{ij}$, the cumulative fluctuation rate of peak-period electricity consumption, and the cumulative fluctuation rate of valley-period electricity consumption, the Euclidean distance $d_{i,k}$ between user $x_i$ and user $x_k$ in the valid sample set $X'_t$ by the formula:
$d_{i,k} = -\|x_i - x_k\|$
determining, from the Euclidean distance $d_{i,k}$, the similarity $s(i,k)$ between user $x_i$ and user $x_k$ by the formula:
$s(i,k) = -d_{i,k}$
taking $s(i,k)$ as the element at the corresponding position of the similarity matrix, and taking the median of all similarity elements obtained as the diagonal elements of the similarity matrix, to generate the similarity matrix;
when the valid sample set $X'_t$ contains electricity consumption data of n users in total, initializing the attraction (responsibility) matrix R and the attribution (availability) matrix A as $n \times n$ zero matrices, so that the elements $r_0(i,k)$ of R and $a_0(i,k)$ of A are 0, and calculating the elements $r_1(i,k)$ of the attraction matrix R and $a_1(i,k)$ of the attribution matrix A by the formula: [formula not reproduced in this text]
iteratively updating $r_{t+1}(i,k)$ and $a_{t+1}(i,k)$ by the formulas:
$r_{t+1}(i,k) = \lambda \cdot r_{t-1}(i,k) + (1-\lambda) \cdot r_t(i,k)$
$a_{t+1}(i,k) = \lambda \cdot a_{t-1}(i,k) + (1-\lambda) \cdot a_t(i,k)$
where $\lambda$ is the preset attenuation coefficient and $1 \le t \le T$;
summing the attribution degree and the attraction degree of each data point and determining the cluster centers, wherein, for a data point i, when $a(i,k) + r(i,k)$ attains its maximum value, data point i is itself a cluster center if $i = k$, and data point k is the cluster center of data point i if $i \ne k$;
and ending the AP clustering algorithm when the cluster centers have remained unchanged for the preset maximum number of iterations $T'$, or when the iteration count $t$ equals the preset maximum number of iterations $T$.
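The AP (affinity propagation) steps of claim 5 can be illustrated with the pure-Python sketch below. Several points are assumptions rather than the claimed method: the attraction and attribution update rules are the standard ones from the affinity propagation literature, used because the update formula is not reproduced in this text; the similarity is taken as the negative Euclidean distance, the usual AP convention; and damping is applied as a blend of the previous value and the newly computed value. The function name `affinity_propagation`, its parameter names, and its defaults are hypothetical, and at least two users are required.

```python
import math
from statistics import median


def affinity_propagation(points, damping=0.5, max_iter=200, stable_iter=15):
    """Return, for each point, the index of its cluster center.

    points: list of feature tuples, e.g. (ratio, peak rate, valley rate).
    damping: attenuation coefficient lambda.
    max_iter: maximum number of iterations T.
    stable_iter: iterations T' without change before stopping.
    """
    n = len(points)
    # Similarity: negative Euclidean distance (assumed sign convention).
    S = [[-math.dist(points[i], points[k]) for k in range(n)] for i in range(n)]
    # Diagonal (preference) = median of the off-diagonal similarities.
    pref = median(S[i][k] for i in range(n) for k in range(n) if i != k)
    for i in range(n):
        S[i][i] = pref
    R = [[0.0] * n for _ in range(n)]  # attraction (responsibility)
    A = [[0.0] * n for _ in range(n)]  # attribution (availability)
    last, unchanged = None, 0
    for _ in range(max_iter):
        # Attraction update (standard AP rule, assumed), with damping.
        for i in range(n):
            for k in range(n):
                best = max(A[i][kk] + S[i][kk] for kk in range(n) if kk != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best)
        # Attribution update (standard AP rule, assumed), with damping.
        for k in range(n):
            pos = [max(0.0, R[ii][k]) for ii in range(n)]
            total = sum(pos) - pos[k]  # sum over rows other than k
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
        # Cluster center of point i: the k that maximises a(i,k) + r(i,k).
        labels = [max(range(n), key=lambda k: A[i][k] + R[i][k]) for i in range(n)]
        unchanged = unchanged + 1 if labels == last else 0
        last = labels
        if unchanged >= stable_iter:
            break
    return last
```

A point whose label equals its own index is a cluster center; how clusters are then mapped to abnormal users is not specified in the claims and is left out of this sketch.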
6. A system for identifying abnormal industry users and abnormal electricity consumption behavior in a power system, the system comprising:
a data acquisition unit for collecting user industry information from the user file, collecting electricity consumption data of the users at a predetermined time interval, classifying the electricity consumption data according to the industry information, and determining an original sample set $X_t=\{x_{ijk}\}$ of electricity consumption data for the users in the t-th industry, where $x_{ijk}$ is the k-th electricity consumption datum of the i-th user on the j-th day and $1 \le k \le N$;
a valid sample unit for processing the missing values and abnormal values in the original sample set $X_t$ of the t-th industry to generate a valid sample set $X'_t=\{x'_{ijk}\}$ of electricity consumption data for the t-th industry, where a missing value is an electricity consumption datum absent from the original sample set, an abnormal value is an electricity consumption datum outside a predetermined normal electricity consumption interval, and $x'_{ijk}$ is the k-th electricity consumption datum of the i-th user on the j-th day after the data in $X_t$ have been processed, with $1 \le k \le N$;
an electricity consumption characteristic index unit for determining, from the electricity consumption data in the valid sample set $X'_t$, the peak period and the valley period of the i-th user on the j-th day, together with the peak-period electricity consumption $\mathrm{peak}_{ij}$ and the valley-period electricity consumption $\mathrm{valley}_{ij}$, determining the valley-to-peak load ratio $\mathrm{ratio}_{ij}$ of the i-th user on the j-th day from $\mathrm{peak}_{ij}$ and $\mathrm{valley}_{ij}$, determining the cumulative fluctuation rate of peak-period electricity consumption of the i-th user on the j-th day from the peak-period electricity consumption data, and determining the cumulative fluctuation rate of valley-period electricity consumption from the valley-period electricity consumption data;
a cluster analysis unit for performing cluster analysis on the users of the t-th industry, based on predetermined AP clustering algorithm parameters, according to their valley-to-peak load ratio $\mathrm{ratio}_{ij}$, cumulative fluctuation rate of peak-period electricity consumption, and cumulative fluctuation rate of valley-period electricity consumption;
and a result output unit for determining, according to the cluster analysis result, the abnormal users of the t-th industry and the users in that industry with abnormal electricity consumption behavior.
7. The system of claim 6, wherein the valid sample unit processing the missing values and abnormal values in the original sample set $X_t$ of electricity consumption data of the t-th industry to generate the valid sample set $X'_t$ of electricity consumption data of the t-th industry means that, when a missing value and/or an abnormal value exists in the original sample set $X_t$, the missing value and/or abnormal value is replaced using a neighbor mean method.
8. The system of claim 6, wherein the electricity consumption characteristic index unit determining, from the electricity consumption data in the valid sample set $X'_t$, the peak-period electricity consumption $\mathrm{peak}_{ij}$ and the valley-period electricity consumption $\mathrm{valley}_{ij}$ of the i-th user on the j-th day, and determining the valley-to-peak load ratio $\mathrm{ratio}_{ij}$ of the i-th user on the j-th day from $\mathrm{peak}_{ij}$ and $\mathrm{valley}_{ij}$, comprises:
summing, in the valid sample set $X'_t$, the n electricity consumption data of each continuous time period of the i-th user on the j-th day, where $6 \le n \le N$;
arranging the summed values of the i-th user on the j-th day in ascending order, determining the continuous time period with the largest summed value as the peak period and the continuous time period with the smallest summed value as the valley period, and taking the summed value of the peak period as the peak-period electricity consumption $\mathrm{peak}_{ij}$ and the summed value of the valley period as the valley-period electricity consumption $\mathrm{valley}_{ij}$.
9. The system according to claim 8, wherein the electricity consumption characteristic index unit determining the valley-to-peak load ratio $\mathrm{ratio}_{ij}$ of the i-th user on the j-th day from the peak-period electricity consumption $\mathrm{peak}_{ij}$ and the valley-period electricity consumption $\mathrm{valley}_{ij}$, determining the cumulative fluctuation rate of peak-period electricity consumption of the i-th user on the j-th day from the peak-period electricity consumption data, and determining the cumulative fluctuation rate of valley-period electricity consumption of the i-th user on the j-th day from the valley-period electricity consumption data, comprises:
determining the valley-to-peak load ratio $\mathrm{ratio}_{ij}$ of the i-th user on the j-th day from $\mathrm{peak}_{ij}$ and $\mathrm{valley}_{ij}$ by the formula:
$\mathrm{ratio}_{ij} = \mathrm{valley}_{ij} / \mathrm{peak}_{ij}$;
determining the cumulative fluctuation rate of peak-period electricity consumption of the i-th user on the j-th day from the peak-period electricity consumption data by the formula: [formula not reproduced in this text]
where the quantities in the formula are the accumulated value of the fluctuation of peak-period electricity consumption of the i-th user on the j-th day, the average peak-period electricity consumption of the i-th user on the j-th day, and the k-th electricity consumption datum of the i-th user in the peak period of the j-th day, and n is the number of electricity consumption data of the i-th user in the peak period of the j-th day;
determining the cumulative fluctuation rate of valley-period electricity consumption of the i-th user on the j-th day from the valley-period electricity consumption data by the formula: [formula not reproduced in this text]
where the quantities in the formula are the accumulated value of the fluctuation of valley-period electricity consumption of the i-th user on the j-th day, the average valley-period electricity consumption of the i-th user on the j-th day, and the k-th electricity consumption datum of the i-th user in the valley period of the j-th day, and n is the number of electricity consumption data of the i-th user in the valley period of the j-th day.
10. The system according to claim 6, wherein the cluster analysis unit performing cluster analysis on the users of the t-th industry, based on predetermined AP clustering algorithm parameters, according to their valley-to-peak load ratio $\mathrm{ratio}_{ij}$, cumulative fluctuation rate of peak-period electricity consumption, and cumulative fluctuation rate of valley-period electricity consumption, and determining the users with abnormal electricity consumption behavior in the t-th industry according to the cluster analysis result, comprises:
setting the parameters of the AP clustering algorithm, the parameters comprising an attenuation (damping) coefficient $\lambda$, a maximum number of iterations $T$, and a maximum number of iterations $T'$ over which the cluster centers remain unchanged;
calculating, from the valley-to-peak load ratio $\mathrm{ratio}_{ij}$, the cumulative fluctuation rate of peak-period electricity consumption, and the cumulative fluctuation rate of valley-period electricity consumption, the Euclidean distance $d_{i,k}$ between user $x_i$ and user $x_k$ in the valid sample set $X'_t$ by the formula:
$d_{i,k} = -\|x_i - x_k\|$
determining, from the Euclidean distance $d_{i,k}$, the similarity $s(i,k)$ between user $x_i$ and user $x_k$ by the formula:
$s(i,k) = -d_{i,k}$
taking $s(i,k)$ as the element at the corresponding position of the similarity matrix, and taking the median of all similarity elements obtained as the diagonal elements of the similarity matrix, to generate the similarity matrix;
when the valid sample set $X'_t$ contains electricity consumption data of n users in total, initializing the attraction (responsibility) matrix R and the attribution (availability) matrix A as $n \times n$ zero matrices, so that the elements $r_0(i,k)$ of R and $a_0(i,k)$ of A are 0, and calculating the elements $r_1(i,k)$ of the attraction matrix R and $a_1(i,k)$ of the attribution matrix A by the formula: [formula not reproduced in this text]
iteratively updating $r_{t+1}(i,k)$ and $a_{t+1}(i,k)$ by the formulas:
$r_{t+1}(i,k) = \lambda \cdot r_{t-1}(i,k) + (1-\lambda) \cdot r_t(i,k)$
$a_{t+1}(i,k) = \lambda \cdot a_{t-1}(i,k) + (1-\lambda) \cdot a_t(i,k)$
where $\lambda$ is the preset attenuation coefficient and $1 \le t \le T$;
summing the attribution degree and the attraction degree of each data point and determining the cluster centers, wherein, for a data point i, when $a(i,k) + r(i,k)$ attains its maximum value, data point i is itself a cluster center if $i = k$, and data point k is the cluster center of data point i if $i \ne k$;
and ending the AP clustering algorithm when the cluster centers have remained unchanged for the preset maximum number of iterations $T'$, or when the iteration count $t$ equals the preset maximum number of iterations $T$.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911037248.4A CN110991477A (en) | 2019-10-29 | 2019-10-29 | Method and system for identifying users in abnormal industry and abnormal electricity utilization behaviors of power system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911037248.4A CN110991477A (en) | 2019-10-29 | 2019-10-29 | Method and system for identifying users in abnormal industry and abnormal electricity utilization behaviors of power system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110991477A (en) | 2020-04-10 |
Family
ID=70082701
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911037248.4A Pending CN110991477A (en) | 2019-10-29 | 2019-10-29 | Method and system for identifying users in abnormal industry and abnormal electricity utilization behaviors of power system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110991477A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105825298A (en) * | 2016-03-14 | 2016-08-03 | 梁海东 | Electric network metering early-warning system and method based on load characteristic pre-estimation |
CN109947815A (en) * | 2018-09-30 | 2019-06-28 | 国网浙江长兴县供电有限公司 | A kind of stealing discrimination method based on outlier algorithm |
CN110097297A (en) * | 2019-05-21 | 2019-08-06 | 国网湖南省电力有限公司 | A kind of various dimensions stealing situation Intellisense method, system, equipment and medium |
Non-Patent Citations (1)
Title |
---|
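杨卫红; 赖清平; 兰宇; 王丹; 胡庆娥; 王旭阳; 刘艳茹: "Research on a clustering analysis algorithm for user electricity consumption behavior based on regulation potential indices" *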
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111680933A (en) * | 2020-06-29 | 2020-09-18 | 北京中电普华信息技术有限公司 | Method and device for analyzing power consumption behavior, readable medium and equipment |
CN111680933B (en) * | 2020-06-29 | 2023-04-18 | 北京中电普华信息技术有限公司 | Method and device for analyzing power consumption behavior, readable medium and equipment |
CN112365174A (en) * | 2020-11-16 | 2021-02-12 | 深圳供电局有限公司 | Residential electricity distribution decision method and system based on electricity consumption behavior preference |
CN112365174B (en) * | 2020-11-16 | 2023-08-25 | 深圳供电局有限公司 | Residential electricity distribution decision-making method and system based on electricity consumption behavior preference |
CN112714368A (en) * | 2020-12-07 | 2021-04-27 | 南方电网数字电网研究院有限公司 | Method and device for prompting abnormal electricity consumption, computer equipment and storage medium |
CN112714368B (en) * | 2020-12-07 | 2023-03-31 | 南方电网数字电网研究院有限公司 | Method and device for prompting abnormal electricity consumption, computer equipment and storage medium |
CN112765826A (en) * | 2021-01-27 | 2021-05-07 | 长沙理工大学 | Indoor hemp planting resident user identification method based on power consumption frequency distribution relative entropy |
CN112949700A (en) * | 2021-02-19 | 2021-06-11 | 国网北京市电力公司 | Method and device for identifying execution strength of enterprise yield limit policy |
CN115659228A (en) * | 2022-12-26 | 2023-01-31 | 国网浙江省电力有限公司宁波供电公司 | User electricity utilization stimulation method and system and readable storage medium |
CN117787572A (en) * | 2024-02-27 | 2024-03-29 | 国网山西省电力公司临汾供电公司 | Abnormal electricity utilization user identification method and device, storage medium and electronic equipment |
CN117787572B (en) * | 2024-02-27 | 2024-05-17 | 国网山西省电力公司临汾供电公司 | Abnormal electricity utilization user identification method and device, storage medium and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110991477A (en) | Method and system for identifying users in abnormal industry and abnormal electricity utilization behaviors of power system | |
CN110097297B (en) | Multi-dimensional electricity stealing situation intelligent sensing method, system, equipment and medium | |
CN110223196B (en) | Anti-electricity-stealing analysis method based on typical industry feature library and anti-electricity-stealing sample library | |
CN107742127B (en) | Improved electricity stealing prevention intelligent early warning system and method | |
Chicco et al. | Emergent electricity customer classification | |
CN111612650A (en) | Power consumer clustering method and system based on DTW distance and neighbor propagation clustering algorithm | |
CN111860600A (en) | User electricity utilization characteristic selection method based on maximum correlation minimum redundancy criterion | |
CN107248086A (en) | Advertisement putting aided analysis method based on user power utilization behavioural analysis | |
CN110109971A (en) | A kind of low-voltage platform area user power utilization Load Characteristic Analysis method | |
CN110264107B (en) | Large data technology-based abnormal diagnosis method for line loss rate of transformer area | |
CN110866841A (en) | Power consumer industry dimension power consumption pattern identification analysis method and system based on double clustering method | |
CN115660225A (en) | Electricity load prediction management method and system based on ammeter communication module | |
CN112581012A (en) | Electricity customer classification method participating in demand response | |
CN114611738A (en) | Load prediction method based on user electricity consumption behavior analysis | |
CN112614004A (en) | Method and device for processing power utilization information | |
CN115049410A (en) | Electricity stealing behavior identification method and device, electronic equipment and computer readable storage medium | |
CN107274025B (en) | System and method for realizing intelligent identification and management of power consumption mode | |
CN113902181A (en) | Short-term prediction method and equipment for common variable heavy overload | |
CN111612054A (en) | User electricity stealing behavior identification method based on non-negative matrix factorization and density clustering | |
CN111368257B (en) | Analysis and prediction method and device for coal-to-electricity load characteristics | |
CN114997470A (en) | Short-term power load prediction method based on LSTM neural network | |
CN113869601A (en) | Power consumer load prediction method, device and equipment | |
CN111915116A (en) | Electric power resident user classification method based on K-means clustering | |
Davarzani et al. | Study of missing meter data impact on domestic load profiles clustering and characterization | |
CN116910596B (en) | User electricity stealing analysis method, device and storage medium based on improved DBSCAN clustering |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200410 |