CN112101471A - Electricity stealing probability early warning analysis method - Google Patents

Info

Publication number
CN112101471A
Authority
CN
China
Prior art keywords
electricity
data
stealing
analysis
electricity stealing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010992846.3A
Other languages
Chinese (zh)
Inventor
雷振江
田小蕾
王丽霞
胡楠
刘晓强
高强
冉冉
孙岩
白韬
孙廷昊
苌一江
孟威
汤宁
张子谦
张玮
梁明
许海丰
代作松
伏广东
曹国强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Liaoning Electric Power Co Ltd
Nari Information and Communication Technology Co
Original Assignee
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Liaoning Electric Power Co Ltd
Nari Information and Communication Technology Co
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, Electric Power Research Institute of State Grid Liaoning Electric Power Co Ltd, and Nari Information and Communication Technology Co
Priority to CN202010992846.3A
Publication of CN112101471A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/10 Pre-processing; Data cleansing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • Health & Medical Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Development Economics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Game Theory and Decision Science (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to an electricity stealing probability early warning analysis method. A logistic regression analysis algorithm is first used to establish a classification model of abnormal customer electricity consumption behavior, and a cluster analysis algorithm is then used to establish a discrimination model of abnormal customer electricity consumption behavior. The method applies logistic regression analysis and K-Means cluster analysis to customers' electricity consumption behavior data, which enables on-line diagnosis of on-site electricity stealing, improves the efficiency of electricity stealing inspections, and reduces their cost. By building a big data analysis model of customer electricity stealing probability and analyzing all electricity customers along multiple dimensions, the method identifies suspected electricity stealing users and supports a systematic, standardized anti-stealing workflow covering analysis, early warning, troubleshooting, and closed-loop handling. The refined analysis of electricity stealing modes also promotes the correction of design defects in metering devices and the upgrading of their anti-stealing functions.

Description

Electricity stealing probability early warning analysis method
Technical Field
The invention relates to the field of big data, in particular to an electricity stealing probability early warning analysis method.
Background
To locate users suspected of 'default electricity use and electricity stealing' quickly and accurately, a customer electricity stealing probability analysis model is built on the large volume of customer electricity consumption information accumulated by the electricity consumption information acquisition system and the marketing business application system, taking the various electricity stealing factors into account. Big data analysis then supports whole-process management, from on-line diagnosis of on-site electricity stealing to analysis of the stealing behavior, so that anti-stealing work can be carried out flexibly and the economic losses of the power grid recovered. Customer electricity consumption information falls into two categories: static information data and dynamic information data. Static information data is mainly basic customer information, such as account name, customer region, industry classification, electricity capacity, electricity address, arrears information, and default records. Dynamic information data comprises acquisition information and metering statistics. Acquisition information mainly includes meter readings, voltage, current, and phase angle; metering statistics mainly include line loss, electric energy, and the average electricity consumption of each industry. Although electricity stealing takes many forms, it can be broadly divided into two kinds: methods that alter the hardware of the electric energy meter, and high-tech methods that leave the hardware unchanged.
The former usually produces abnormal acquisition data, so it can be detected by feature matching against various index data. The latter usually leaves individual readings looking normal, so anomalies can only be identified from trends in the data.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides an electricity stealing probability early warning analysis method. It solves the problems of complex and slow work, low accuracy, and low reliability that result from manual screening and repeated on-site operations in the prior art.
The technical solution adopted by the invention is as follows:
An electricity stealing probability early warning analysis method first uses a logistic regression analysis algorithm to establish a classification model of abnormal customer electricity consumption behavior, and then uses a cluster analysis algorithm to establish a discrimination model of abnormal customer electricity consumption behavior.
Establishing the classification model of abnormal customer electricity consumption behavior with the logistic regression analysis algorithm comprises the following steps:
Step 1: acquire typical electricity stealing case data and an equal proportion of normal electricity consumption behavior data;
Step 2: preprocess the typical electricity stealing case data and the equal proportion of normal electricity consumption behavior data in a database;
Step 3: compute descriptive statistics on the multi-dimensional features of customer abnormality;
Step 4: run a logistic regression analysis in SPSS with 50% as the prediction threshold, use the forward stepwise likelihood ratio test to select the optimal independent variables, and output the regression coefficient $\beta_i$ of each variable;
Step 5: substitute the model training result into the prediction function.
Acquiring the typical electricity stealing case data comprises acquiring, from the marketing business application system, data related to customers' illegal electricity use and electricity stealing, including electricity stealing case information, illegal electricity use and stealing information, on-site investigation and evidence information, and inspection result information.
Acquiring the equal proportion of normal electricity consumption behavior data comprises acquiring it from the marketing business application system.
The multi-dimensional features of customer abnormality comprise: whether three-phase current imbalance occurs, whether the electric energy meter stops or the electric quantity fluctuates abnormally, and whether an abnormal cover-opening record occurs.
Preprocessing the typical electricity stealing case data and the equal proportion of normal electricity consumption behavior data comprises merging data from multiple tables, deleting invalid values, filling null values, and then labeling each record: 1 for electricity stealing, 0 otherwise.
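The preprocessing described above (merge, delete invalid values, fill null values, label) can be sketched in plain Python. This is only an illustration under stated assumptions: the patent performs these steps in a database, and the field names, the fill value of 0, and the rule that an indicator must be 0 or 1 are all hypothetical.

```python
# Hypothetical input tables; field names are illustrative, not from the patent.
case_records = [
    {"customer_id": "A01", "stealing_confirmed": True},
]
meter_records = [
    {"customer_id": "A01", "current_loss": 1, "cover_opened": 1},
    {"customer_id": "B02", "current_loss": 0, "cover_opened": None},  # null value
    {"customer_id": "C03", "current_loss": -5, "cover_opened": 0},    # invalid value
]


def preprocess(meter_rows, case_rows):
    """Merge the two tables, fill nulls, drop invalid rows, and add a 0/1 label."""
    stealing_ids = {r["customer_id"] for r in case_rows if r["stealing_confirmed"]}
    cleaned = []
    for row in meter_rows:
        # Fill null values with 0 (assumed default: event not observed).
        filled = {k: (0 if v is None else v) for k, v in row.items()}
        # Delete rows whose indicator values are invalid (assumed: not 0 or 1).
        indicators = [v for k, v in filled.items() if k != "customer_id"]
        if any(v not in (0, 1) for v in indicators):
            continue
        # Merge with the case table: 1 marks electricity stealing, 0 otherwise.
        filled["label"] = 1 if filled["customer_id"] in stealing_ids else 0
        cleaned.append(filled)
    return cleaned


rows = preprocess(meter_records, case_records)
```

Here the row for `C03` is dropped as invalid, the null in `B02` is filled, and only `A01` receives label 1.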
Establishing the discrimination model of abnormal customer electricity consumption behavior with the cluster analysis algorithm comprises the following steps:
Step a: acquire customers' historical electricity consumption behavior data and user static data from the marketing business application system;
Step b: preprocess the historical electricity consumption behavior data and the user static data in a database;
Step c: normalize each customer's electric quantity, voltage, current, power, and load data, and divide the data into categories by region and electricity use type;
Step d: apply the K-Means cluster analysis algorithm to each category of data, select the number of clusters K, and judge whether the model has converged; if so, output the clustering result and go to step f;
Step e: if not, adjust the model parameters and return to step d;
Step f: generate typical electricity consumption behavior curves for each type of user from the clustering result.
Preprocessing the historical electricity consumption behavior data and the user static data comprises deleting invalid values and filling null values.
The invention has the following beneficial effects and advantages:
The method applies logistic regression analysis and K-Means cluster analysis to users' electricity consumption behavior data, which enables on-line diagnosis of on-site electricity stealing, improves the efficiency of electricity stealing inspections, and reduces their cost;
The method builds a big data analysis model of customer electricity stealing probability, analyzes all electricity customers along multiple dimensions, identifies suspected electricity stealing users, and establishes a systematic, standardized anti-stealing workflow covering analysis, early warning, troubleshooting, and closed-loop handling, which improves the effectiveness of anti-stealing work;
Based on the refined analysis of electricity stealing modes, the invention promotes the correction of design defects in metering devices and the upgrading of their anti-stealing functions.
Drawings
FIG. 1 is a flow chart of establishing the classification model of abnormal customer electricity consumption behavior with the logistic regression analysis algorithm of the present invention;
FIG. 2 is a flow chart of establishing the discrimination model of abnormal customer electricity consumption behavior with the cluster analysis algorithm of the present invention;
FIG. 3 is a graph of the sigmoid growth curve of the dependent variable in the logistic regression analysis algorithm of the present invention;
FIG. 4 is a plot of the customer daily average load case data of the present invention;
FIG. 5 is a graph of the electricity consumption behavior curves of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples.
In order to make the aforementioned objects, features and advantages of the present invention more comprehensible, embodiments accompanying the drawings are described in detail below. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein, but rather should be construed as modified in the spirit and scope of the present invention as set forth in the appended claims.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The method comprises the following steps:
FIG. 1 is a flow chart of establishing the classification model of abnormal customer electricity consumption behavior with the logistic regression analysis algorithm of the present invention.
1. A logistic regression analysis algorithm is used to establish the classification model of abnormal customer electricity consumption behavior. The process is as follows:
(1) Data acquisition: obtain typical electricity stealing case data and an equal proportion of normal electricity consumption behavior data. First, collect data on customers' illegal electricity use and electricity stealing from the marketing business application system, including electricity stealing case information, illegal electricity use and stealing information, on-site investigation and evidence information, inspection result information, and the like; second, gather typical electricity stealing case information of different types from city and county companies; third, acquire an equal proportion of normal electricity consumption behavior data from the marketing business application system;
(2) Preprocess the data in an Oracle database: merge data from multiple tables, delete invalid values, fill null values, and then label each record, 1 for electricity stealing and 0 otherwise;
(3) Based on the collected typical electricity stealing cases, and grouped by electricity stealing type, compute descriptive statistics on the multi-dimensional features of customer abnormality. This mainly consists of cross-tabulating key indicators such as whether three-phase current imbalance occurs, whether the electric energy meter stops or the electric quantity fluctuates abnormally, and whether an abnormal cover-opening record occurs;
(4) Run a logistic regression analysis in SPSS with 50% as the prediction threshold, use the forward stepwise likelihood ratio test to select the optimal independent variables, and output the regression coefficient $\beta_i$ of each variable;
(5) Substitute the model training result into the prediction function:
$$z = \beta_0 + \beta_1 \cdot \text{voltage open phase} + \beta_2 \cdot \text{differential anomaly of electrical quantity} + \beta_3 \cdot \text{abnormal fluctuation of electric quantity} + \beta_4 \cdot \text{stop of electric energy meter} + \beta_5 \cdot \text{differential power anomaly} + \beta_6 \cdot \text{CT loop} + \beta_7 \cdot \text{loss of current} + \beta_8 \cdot \text{uncovering of electric energy meter} + \beta_9 \cdot \text{opening and closing of metering gate} + \beta_{10} \cdot \text{interference of constant magnetic field}$$
$$p = \frac{1}{1 + e^{-z}}$$
Because 50% is set in advance as the prediction threshold, a p value greater than 50% indicates electricity stealing; otherwise the customer is treated as a normal user.
(6) Mine the latent features in the behavior data of electricity stealing users and build an electricity stealing user profile for anti-stealing early warning and troubleshooting.
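The prediction step in item (5) can be illustrated with a short sketch: compute the linear score z from 0/1 indicators, pass it through the sigmoid, and compare with the 50% threshold. The coefficient values and feature names below are hypothetical placeholders, not values from the patent.

```python
import math


def sigmoid(z):
    """Logistic (sigmoid) function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(beta0, betas, features, threshold=0.5):
    """Return (p, label): label is 1 (electricity stealing) when p > threshold."""
    z = beta0 + sum(betas[name] * features.get(name, 0) for name in betas)
    p = sigmoid(z)
    return p, (1 if p > threshold else 0)


# Hypothetical coefficients for three 0/1 indicators.
beta0 = -2.0
betas = {"voltage_open_phase": 1.0, "current_loss": 1.5, "meter_cover_opened": 2.5}

p_flagged, label_flagged = predict(beta0, betas, {"current_loss": 1, "meter_cover_opened": 1})
p_clean, label_clean = predict(beta0, betas, {})
```

With these placeholder numbers the first customer has z = 2.0 and is classified as stealing, while a customer with no abnormal indicators has z = -2.0 and is classified as normal.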
FIG. 2 is a flow chart of establishing the discrimination model of abnormal customer electricity consumption behavior with the cluster analysis algorithm of the present invention.
2. A cluster analysis algorithm is used to establish the discrimination model of abnormal customer electricity consumption behavior. The process is as follows:
(1) Data acquisition: the data comes from customers' historical electricity consumption behavior data and user static data in the marketing business application system;
(2) Preprocess the historical electricity consumption behavior data in an Oracle database: delete invalid values, fill null values, and so on;
(3) Divide customers into types by region and electricity use type;
(4) Normalize each customer's data, such as electric quantity, voltage, current, power, and load;
(5) Apply the K-Means cluster analysis algorithm to each type of data, select the number of clusters K based on business knowledge, and judge whether the model has converged;
(6) If the model has converged, output the typical electricity consumption behavior curves of each type of user; otherwise adjust the model parameters, judge convergence by the objective function SSE, keep adjusting K, and finally select the run with the smallest SSE as the clustering result;
(7) Draw the typical electricity consumption behavior curve of each cluster from the clustering result;
(8) Compare each customer's electricity consumption behavior curve in new data with the typical curve, and flag as abnormal any customer whose behavior does not follow the typical curve of the corresponding customer type.
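The comparison in item (8) can be sketched as a distance check between a customer's normalized curve and the typical curve of its cluster. The patent does not specify the comparison rule, so the Euclidean distance and the threshold value used here are assumptions, and the curves are shortened to four points for readability.

```python
import math


def curve_distance(curve_a, curve_b):
    """Euclidean distance between two load curves of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(curve_a, curve_b)))


def flag_abnormal(customer_curves, typical_curve, threshold):
    """Return the ids of customers whose curve is farther than threshold from the typical curve."""
    return [
        customer_id
        for customer_id, curve in customer_curves.items()
        if curve_distance(curve, typical_curve) > threshold
    ]


# Hypothetical normalized curves.
typical = [0.2, 0.8, 0.9, 0.3]
customers = {
    "C1": [0.25, 0.75, 0.85, 0.3],  # follows the typical shape
    "C2": [0.2, 0.1, 0.1, 0.3],     # daytime load unusually low
}
abnormal = flag_abnormal(customers, typical, threshold=0.5)
```

In practice the threshold would need to be tuned on known cases; a customer flagged this way is only a candidate for on-site inspection.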
The two models used above are described below:
(1) Logistic regression analysis is used to classify abnormal customer electricity consumption behavior. Logistic regression is a classification model in machine learning, mainly used for regression analysis of a categorical dependent variable; the independent variables may be categorical or continuous. It can select, from many independent variables, those that affect the dependent variable, and it yields a formula for prediction.
In the logistic regression algorithm the dependent variable follows a sigmoid growth curve, as shown in FIG. 3:
$$p = \frac{1}{1 + e^{-z}}$$
$$z = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k$$
The middle segment of the sigmoid growth curve changes rapidly, which makes the function suitable for binary classification: if the predicted value is above a preset threshold the sample is class A, otherwise it is class B. Introducing the feature vector and the parameters gives the following prediction function:
$$p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}}$$
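Meaning of $\beta_i$: for a given risk factor, $\beta_i$ is the natural logarithm of the odds ratio of the outcome when the exposure level changes, that is, when $x_i = 1$ is compared with $x_i = 0$. With $P_1$ and $P_0$ denoting the outcome probabilities at $x_i = 1$ and $x_i = 0$, and the other variables held fixed:
$$OR_i = \frac{P_1 / (1 - P_1)}{P_0 / (1 - P_0)}$$
$$\ln OR_i = \operatorname{logit}(P_1) - \operatorname{logit}(P_0) = \beta_i$$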
Likelihood ratio test:
The test compares the log-likelihood functions of two models, one containing and one not containing the factor or factors under examination. The statistic is G:
$$G = -2(\ln L_p - \ln L_k)$$
When the sample size is large, G approximately follows a $\chi^2$ distribution whose degrees of freedom equal the number of factors being tested.
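As a worked illustration of the statistic G, the sketch below evaluates it for one candidate variable and compares it with the 5% critical value of the chi-square distribution with one degree of freedom (3.841). The log-likelihood values are hypothetical, and treating $L_p$ as the model without the candidate variable and $L_k$ as the model with it is an assumption about the notation.

```python
# 0.95 quantile of the chi-square distribution with 1 degree of freedom.
CHI2_CRITICAL_DF1 = 3.841


def likelihood_ratio_statistic(loglik_without, loglik_with):
    """G = -2 (ln L_p - ln L_k), with L_p the smaller model and L_k the larger one."""
    return -2.0 * (loglik_without - loglik_with)


# Hypothetical log-likelihoods before and after adding one candidate variable.
g = likelihood_ratio_statistic(loglik_without=-120.5, loglik_with=-112.3)
keep_variable = g > CHI2_CRITICAL_DF1
```

Here G is about 16.4, well above 3.841, so a forward stepwise procedure would admit the variable at the 5% level.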
Finally, the model is trained by minimizing the logistic regression loss function. A large amount of typical electricity stealing case data, together with an equal proportion of randomly selected normal users, is fed into the model to mine the latent features in the customer electricity consumption behavior data and establish the classification model of abnormal customer electricity consumption behavior.
(2) Cluster analysis is used to discriminate abnormal customer electricity consumption behavior. Cluster analysis is a multivariate statistical method for classifying samples or indicators. It works on large numbers of samples and groups them by their own characteristics without prior knowledge. The clustering principle is that data within a cluster are highly similar, while data in different clusters are dissimilar. Given a data set of n objects or records, the partitioning method arbitrarily selects k objects as initial cluster centers and assigns each remaining object to the nearest center by distance. It then recomputes the center of each new cluster and iterates until the objective function SSE converges. The mean squared error is generally used as the measure function. The K-Means algorithm generates typical electricity consumption behavior curves for each type of user, and comparing a customer's curve in new data with the typical curve determines whether the electricity consumption behavior is abnormal.
(3) The K-Means algorithm proceeds as follows:
1. randomly select k center points;
2. traverse all the data and assign each data point to its nearest center point;
3. compute the mean of each cluster and take it as the new center point;
4. repeat steps 2 and 3 until the k center points no longer change (convergence) or a sufficient number of iterations has been performed.
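The four steps above can be sketched in plain Python as follows. This is a minimal illustration rather than the SPSS procedure used in the patent; the seed, the iteration cap, the handling of empty clusters, and the sample points are assumptions.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iter=100, seed=0):
    """Return (centers, clusters, sse) after running the four K-Means steps."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # step 1: random initial centers
    clusters = []
    for _ in range(max_iter):
        # Step 2: assign every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Step 3: the mean of each cluster becomes the new center.
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                new_centers.append(
                    tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                )
            else:
                new_centers.append(centers[i])  # assumed: keep the center of an empty cluster
        # Step 4: stop once the centers no longer change.
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min(squared_distance(p, c) for c in centers) for p in points)
    return centers, clusters, sse


# Two well-separated groups of hypothetical 2-D points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, clusters, sse = k_means(points, k=2)
```

For real load curves each point would be a 96-dimensional tuple; the function does not depend on the dimension.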
(4) Convergence of the algorithm:
With respect to the SSE, the K-Means algorithm is in effect a coordinate descent procedure. Let the objective function SSE be:
$$SSE(C_1, C_2, \dots, C_k) = \sum_{i=1}^{k} \sum_{X \in \text{cluster } i} (X - C_i)^2$$
The Euclidean distance is used as the distance measure between variables. To find the optimum for one variable $C_i$ at a time, take the partial derivative of the SSE with respect to $C_i$ and set it equal to 0, which gives
$$C_i = \frac{1}{m_i} \sum_{X \in \text{cluster } i} X$$
where $m_i$ is the number of elements in the cluster to which $C_i$ belongs.
That is, the mean of the current cluster is the optimal solution (the minimum) in the current direction, which is exactly what each K-Means iteration computes. Each iteration therefore never increases the SSE, so the SSE eventually converges.
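The minimization step for a single center can be written out explicitly. The derivation below is a sketch that assumes the cluster assignments are held fixed while $C_i$ varies, so only the terms of cluster $i$ depend on $C_i$:

```latex
\begin{aligned}
\frac{\partial\, SSE}{\partial C_i}
  &= \frac{\partial}{\partial C_i} \sum_{X \in \text{cluster } i} (X - C_i)^2
   = -2 \sum_{X \in \text{cluster } i} (X - C_i) = 0 \\
\Rightarrow\; \sum_{X \in \text{cluster } i} X &= m_i C_i
\quad\Rightarrow\quad
C_i = \frac{1}{m_i} \sum_{X \in \text{cluster } i} X
\end{aligned}
```

The second derivative is $2 m_i > 0$, so this stationary point is a minimum in the direction of $C_i$.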
Because the SSE is a non-convex function, K-Means cannot guarantee a globally optimal solution, only a locally optimal one. The algorithm can, however, be run several times, with the run that yields the smallest SSE taken as the final clustering result.
(5) 0-1 normalization:
Because the data have different dimensions (units), they are hard to compare directly. The data are therefore scaled to the range 0 to 1 and converted to dimensionless values, so that indicators with different units or orders of magnitude can be compared and weighted. The calculation is:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
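A minimal sketch of 0-1 (min-max) normalization follows. The sample voltage readings are hypothetical, and mapping a constant series to all zeros is an assumed convention for the case where the maximum equals the minimum.

```python
def min_max_normalize(values):
    """Scale values linearly so the minimum maps to 0 and the maximum maps to 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # Assumed convention: a constant series carries no shape information.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


# Hypothetical voltage readings in volts.
voltages = [218.0, 220.0, 222.0]
normalized = min_max_normalize(voltages)  # [0.0, 0.5, 1.0]
```

Applying the same function to each indicator separately puts electric quantity, voltage, current, power, and load on a common scale before clustering.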
(6) Selecting the K value:
In practice, K-Means is generally used for data preprocessing or to assist classification labeling, and K is usually not set large. K is enumerated from 2 up to a fixed value such as 10; for each K, K-Means is run several times (to avoid a poor local optimum), and the average silhouette coefficient for that K is computed. The K with the largest silhouette coefficient is selected as the final number of clusters.
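The average silhouette coefficient used to compare candidate K values can be computed as sketched below. The function assumes at least two clusters and no duplicate points, scores singleton clusters as 0, and is applied here to hypothetical one-dimensional data with two alternative labelings.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_silhouette(points, labels):
    """Average of (b - a) / max(a, b) over all points; requires at least two clusters."""
    n = len(points)
    cluster_ids = set(labels)
    total = 0.0
    for i in range(n):
        same = [euclidean(points[i], points[j])
                for j in range(n) if j != i and labels[j] == labels[i]]
        if not same:
            continue  # singleton cluster contributes a silhouette of 0
        a = sum(same) / len(same)  # mean distance within the point's own cluster
        b = min(                   # mean distance to the nearest other cluster
            sum(euclidean(points[i], points[j]) for j in range(n) if labels[j] == c)
            / labels.count(c)
            for c in cluster_ids if c != labels[i]
        )
        total += (b - a) / max(a, b)
    return total / n


points = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = mean_silhouette(points, [0, 0, 1, 1])  # natural grouping
bad = mean_silhouette(points, [0, 1, 0, 1])   # groups mixed together
```

The natural grouping scores close to 1 and the mixed grouping scores below 0, which is why the K with the largest average value is preferred.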
Example:
First, typical electricity stealing case data are fed into the logistic regression model for calculation, a latent feature curve is obtained, and an electricity stealing profile is established.
Specific data preparation:
Table 1 below lists the typical electricity stealing case data and an equal proportion of normal electricity consumption behavior data. Key fields are simplified for ease of verification, and cross statistics are computed. The implementation process is as follows:
TABLE 1 Electricity stealing case data
[Table 1: images BDA0002691330420000043 and BDA0002691330420000051 in the original publication]
Step 1: Modeling.
Run a logistic regression analysis in SPSS with 50% as the prediction threshold, use the forward stepwise likelihood ratio test to select the optimal independent variables, and output the regression coefficient $\beta_i$ of each variable.
The model output is shown in Table 2 below:
TABLE 2 Variables in the equation
[Table 2: image BDA0002691330420000052 in the original publication]
a. Variable entered in step 1: power differential anomaly.
b. Variable entered in step 2: electric energy meter cover opening.
c. Variable entered in step 3: current loss.
d. Variable entered in step 4: voltage phase loss.
TABLE 3 Variables not in the equation
[Table 3: images BDA0002691330420000053 and BDA0002691330420000061 in the original publication]
Step 2: Substitute the model training result into the prediction function.
Step 1 shows that four model variables are finally selected after four iterations. Substituting the training result into the prediction function gives:
$$z = -3.070 + 1.195 \cdot \text{voltage phase loss} + 2.381 \cdot \text{power differential anomaly} + 1.990 \cdot \text{current loss} + 3.035 \cdot \text{electric energy meter cover opening}$$
$$p = \frac{1}{1 + e^{-z}}$$
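To make the fitted function concrete, the sketch below evaluates it for two hypothetical customers and converts each coefficient into an odds ratio via the exponential. It assumes the four variables are 0/1 indicators and reads the intercept as -3.070; the dictionary keys are illustrative names for the four selected variables.

```python
import math

# Coefficients reported in step 2; the negative intercept is an assumed reading.
INTERCEPT = -3.070
COEFFICIENTS = {
    "voltage_phase_loss": 1.195,
    "power_differential_anomaly": 2.381,
    "current_loss": 1.990,
    "meter_cover_opened": 3.035,
}


def stealing_probability(indicators):
    """Probability from the fitted function; indicators maps variable name to 0 or 1."""
    z = INTERCEPT + sum(COEFFICIENTS[name] * indicators.get(name, 0) for name in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-z))


# exp(beta_i) is the odds ratio for indicator i being 1 rather than 0.
odds_ratios = {name: math.exp(beta) for name, beta in COEFFICIENTS.items()}

p_cover_only = stealing_probability({"meter_cover_opened": 1})
p_two_events = stealing_probability({"voltage_phase_loss": 1, "current_loss": 1})
```

Under these assumptions a cover-opening record alone gives a probability just under the 50% threshold, whereas voltage phase loss together with current loss pushes it just over.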
Step 3: Model verification.
Predicting the samples with the function from step 2 gives an overall model accuracy of 86%; the fit for electricity stealing users is particularly good, at 88%.
TABLE 4 Prediction results
[Table 4: image BDA0002691330420000063 in the original publication]
Second, the daily average historical load data of 1000 customers are used (load values are recorded every 15 minutes, giving 96 points per day). The data are fed into the K-Means cluster analysis algorithm to build a cluster analysis model, convergence is checked, and the curves are compared to identify customers with abnormal electricity consumption behavior.
Specific data preparation: as shown in FIG. 4.
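One way to build a 96-point daily average load curve for a customer is to average the readings slot by slot across several days, as sketched below. Interpreting "daily average" as a per-slot mean over days is an assumption, and the two sample days are hypothetical.

```python
POINTS_PER_DAY = 24 * 60 // 15  # one reading every 15 minutes gives 96 points


def daily_average_curve(days):
    """Average several days of readings slot by slot into one 96-point curve."""
    if not days or any(len(day) != POINTS_PER_DAY for day in days):
        raise ValueError("each day must contain exactly 96 readings")
    return [sum(day[slot] for day in days) / len(days) for slot in range(POINTS_PER_DAY)]


# Two hypothetical days with flat loads of 1.0 kW and 3.0 kW.
day_one = [1.0] * POINTS_PER_DAY
day_two = [3.0] * POINTS_PER_DAY
average_curve = daily_average_curve([day_one, day_two])
```

Each customer's averaged curve would then be normalized and passed to the clustering step as one 96-dimensional sample.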
Step 1: and (5) establishing a model.
And (3) the experimental data are brought into the sps for training, a heuristic method is adopted for the k value according to the business general knowledge, the final iteration number is 3, and convergence is achieved because the clustering center is not changed or is slightly changed.
TABLE 6 iteration History
Figure BDA0002691330420000064
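The convergence criterion described in step 1 (cluster centers that no longer move, or move only slightly) can be sketched in plain Python as follows. This is an illustrative sketch, not the SPSS implementation used in the text. The function names, the random initialisation, the tolerance, and the seed are all assumptions.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length curves."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(curves, k, max_iterations=100, tolerance=1e-6, seed=0):
    """Cluster equal-length load curves (for example 96 points per day).

    Returns (centers, labels, iterations, converged). Convergence means that
    no center moved by more than `tolerance` in the last update.
    """
    rng = random.Random(seed)
    centers = [list(curve) for curve in rng.sample(curves, k)]
    labels = [0] * len(curves)
    for iteration in range(1, max_iterations + 1):
        # Assignment step: each curve joins its nearest center.
        labels = [
            min(range(k), key=lambda j: squared_distance(curve, centers[j]))
            for curve in curves
        ]
        # Update step: each center becomes the point-wise mean of its members.
        new_centers = []
        for j in range(k):
            members = [c for c, label in zip(curves, labels) if label == j]
            if members:
                new_centers.append(
                    [sum(column) / len(members) for column in zip(*members)]
                )
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        shift = max(
            squared_distance(old, new) ** 0.5
            for old, new in zip(centers, new_centers)
        )
        centers = new_centers
        if shift <= tolerance:
            return centers, labels, iteration, True
    return centers, labels, max_iterations, False
```

Each returned center is itself a 96-point curve, so it can serve directly as the typical electricity consumption curve of its cluster.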
Step 2: Draw a typical electricity consumption behavior curve for each cluster from the clustering results.
The resulting electricity consumption behavior curves are shown in Fig. 5.
Step 3: Identify abnormal customers.
The electricity consumption behavior curve of each customer in the new data is compared with the typical curve. Customers whose behavior does not follow the typical trajectory of their profile type are flagged as abnormal electricity consumers.
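The text does not say how a customer's curve is compared with the typical curve. One possible approach, shown here purely as an assumption, is to min-max normalize both curves so that shape rather than magnitude is compared, and to flag customers whose root-mean-square deviation exceeds a threshold. The function names and the threshold of 0.3 are hypothetical.

```python
def normalize(curve):
    """Min-max scale a curve to [0, 1]; a flat curve maps to all zeros."""
    low, high = min(curve), max(curve)
    if high == low:
        return [0.0] * len(curve)
    return [(value - low) / (high - low) for value in curve]


def rms_deviation(a, b):
    """Root-mean-square difference between two equal-length curves."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5


def flag_abnormal(customer_curves, typical_curve, threshold=0.3):
    """Return the ids of customers whose curve shape deviates from the typical one.

    customer_curves maps a customer id to that customer's load curve.
    The threshold is an assumed value that would need tuning on real data.
    """
    reference = normalize(typical_curve)
    return [
        customer_id
        for customer_id, curve in customer_curves.items()
        if rms_deviation(normalize(curve), reference) > threshold
    ]
```

In practice the typical curve would be the cluster center for the customer's own profile type, so that each customer is compared only against peers of the same category.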

Claims (3)

1. An electricity stealing probability early warning analysis method, characterized by comprising the following steps: first, establishing a classification model of abnormal customer electricity consumption behavior using a logistic regression analysis algorithm, and then establishing a discrimination model of abnormal customer electricity consumption behavior using a cluster analysis algorithm;
wherein establishing the discrimination model of abnormal customer electricity consumption behavior using the cluster analysis algorithm comprises the following steps:
step a: acquiring the customer's historical electricity consumption behavior data and user static data from the marketing service application system;
step b: preprocessing the customer's historical electricity consumption behavior data and user static data in a database;
step c: normalizing the customer's electric quantity, voltage, current, power and load data, and dividing the data into different categories by region and electricity use type;
step d: applying the K-Means cluster analysis algorithm to each category of data, selecting the number of clusters K, and judging whether the model converges; if so, outputting the clustering result and executing step f;
step e: if not, adjusting the model parameters and returning to step d;
step f: generating typical electricity consumption behavior curves for each type of user according to the clustering result.
2. The electricity stealing probability early warning analysis method according to claim 1, characterized in that preprocessing the customer's historical electricity consumption behavior data and user static data comprises: deleting invalid values and filling null values.
3. The electricity stealing probability early warning analysis method according to claim 1, characterized in that establishing the classification model of abnormal customer electricity consumption behavior using the logistic regression analysis algorithm comprises the following steps:
step 1: acquiring typical electricity stealing case data and normal electricity consumption behavior data in equal proportions;
step 2: preprocessing the typical electricity stealing case data and the normal electricity consumption behavior data in a database;
step 3: performing descriptive statistics on the multi-dimensional features of the customer's degree of abnormality;
step 4: performing logistic regression analysis in SPSS with 50% as the prediction threshold, using the forward stepwise likelihood-ratio method to select the optimal independent variables, and outputting the regression coefficient β_i of each variable;
step 5: substituting the model training result into the prediction function.
CN202010992846.3A 2020-09-21 2020-09-21 Electricity stealing probability early warning analysis method Pending CN112101471A (en)


Publications (1)

Publication Number Publication Date
CN112101471A true CN112101471A (en) 2020-12-18



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination