CN114186844A

CN114186844A - Method and device for identifying electricity stealing clients

Info

Publication number: CN114186844A
Application number: CN202111500444.8A
Authority: CN
Inventors: 宫立华; 赵振东; 朱克; 朱龙珠; 朱静; 翟雪敏; 聂玲; 左华林; 单金宇
Original assignee: State Grid Co ltd Customer Service Center; Beijing China Power Information Technology Co Ltd
Current assignee: State Grid Co ltd Customer Service Center; Beijing China Power Information Technology Co Ltd
Priority date: 2021-12-09
Filing date: 2021-12-09
Publication date: 2022-03-15

Abstract

The invention provides a method and a device for identifying electricity stealing clients. An electricity stealing client index system is constructed to improve the comprehensiveness and accuracy of the data to be identified. Of an identification model based on an LM (Levenberg-Marquardt) neural network and an identification model based on a CART (Classification and Regression Tree) decision tree, the one with the better identification effect on electricity stealing clients is taken as the electricity stealing client identification model.

Description

Method and device for identifying electricity stealing clients

Technical Field

The invention relates to the technical field of data analysis, in particular to a method and a device for identifying a power stealing client.

Background

In recent years, electricity stealing has continued to occur, with a serious impact on normal power supply and safe electricity use.

At present, electricity stealing identification relies mainly on manual patrols combined with index analysis. Manual patrols require a large amount of manpower and are inefficient, while index analysis must process massive amounts of data, so electricity stealing identification is difficult and its accuracy is low.

Disclosure of Invention

In view of this, the invention provides a method and a device for identifying a power stealing client, which can accurately identify the power stealing client.

In order to achieve the above purpose, the invention provides the following specific technical scheme:

a power stealing client identification method, comprising:

acquiring a multi-dimensional index value of a customer to be identified under a pre-constructed electricity stealing customer index system;

inputting the multidimensional index value into a pre-constructed electricity stealing client identification model to obtain an identification result of the client to be identified, wherein the electricity stealing client identification model is whichever of an identification model based on an LM (Levenberg-Marquardt) neural network and an identification model based on a CART (Classification and Regression Tree) decision tree has the better identification effect on electricity stealing clients, the identification model based on the LM neural network and the identification model based on the CART decision tree are obtained by respectively training an LM neural network model and a CART decision tree model with training samples labeled as to whether they are electricity stealing clients, and each training sample comprises the multidimensional index values under the electricity stealing client index system.

Optionally, the obtaining a multidimensional index value of the customer to be identified under a pre-constructed electricity stealing customer index system includes:

acquiring power consumption data of the customer to be identified in a preset time period;

extracting data corresponding to the multidimensional indexes under the electricity stealing customer index system from the electricity utilization data;

data corresponding to the multidimensional indexes under the electricity stealing client index system are cleaned, and a Lagrange interpolation method is adopted to perform interpolation processing on the cleaned missing values;

and normalizing the data corresponding to the multidimensional indexes under the electricity stealing client index system after interpolation processing to obtain the multidimensional index values.

Optionally, the method further includes:

determining the number of neurons of an input layer in the LM neural network according to the number of multi-dimensional indexes in the electricity stealing client index system, and determining the number of neurons of an output layer in the LM neural network as 1;

carrying out multiple convergence training on the LM neural network by using the training samples by using different initial training parameters;

acquiring the precision of a training set and the precision of a test set after each convergence training, and respectively carrying out weighted average calculation on the precision of the training set and the precision of the test set after each convergence training according to the preset precision weight of the training set and the preset precision weight of the test set to obtain the precision value of each convergence training;

and determining the network parameter corresponding to the convergence training with the highest precision value as the optimal network parameter to obtain the identification model based on the LM neural network corresponding to the optimal network parameter.

Optionally, the method further includes:

selecting one or more multidimensional indexes from the electricity stealing client index system as the division attributes of tree nodes according to preset rules, taking each multidimensional index in the training sample as each branch of a test variable tree, repeating the process until one of preset conditions is met, stopping building the tree, and generating a CART decision tree;

pruning the generated CART decision tree by using a pruning algorithm to form a sub-tree sequence;

testing the sub-tree sequences on an independent verification data set through a cross verification method, and selecting an optimal sub-tree from the sub-tree sequences as the CART decision tree-based recognition model;

wherein the preset conditions include:

the number of samples in all leaf nodes of the CART decision tree is 1 or the samples belong to the same class;

the CART decision tree height reaches a threshold set by the user.

Optionally, the method further includes:

respectively counting the recognition accuracy of the recognition model based on the LM neural network and the recognition model based on the CART decision tree;

and determining the model with the highest identification accuracy as the electricity stealing client identification model.

Optionally, the method further includes:

respectively drawing ROC curves of the identification model based on the LM neural network and the identification model based on the CART decision tree;

and determining the model with the maximum AUC under the ROC curve as the electricity stealing customer identification model.

A power stealing client identifying device comprising:

a to-be-identified client data acquisition unit, used for acquiring a multi-dimensional index value of a to-be-identified client under a pre-constructed electricity stealing client index system;

the electricity stealing client identification unit is used for inputting the multi-dimensional index values into a pre-constructed electricity stealing client identification model to obtain the identification result of the client to be identified, wherein the electricity stealing client identification model is the model with the best identification effect on the electricity stealing client in the identification model based on the LM neural network and the identification model based on the CART decision tree, the identification model based on the LM neural network and the identification model based on the CART decision tree are obtained by respectively training the LM neural network model and the CART decision tree model by utilizing training samples which are marked as electricity stealing clients or not, and the training samples comprise the multi-dimensional index values under the electricity stealing client index system.

Optionally, the to-be-identified client data obtaining unit is specifically configured to:

Optionally, the apparatus further includes a first model building unit, specifically configured to:

Optionally, the apparatus further includes a second model building unit, specifically configured to:

wherein the preset conditions include:

the CART decision tree height reaches a threshold set by the user.

Optionally, the apparatus further comprises:

the recognition model evaluation unit is used for respectively counting the recognition accuracy of the recognition model based on the LM neural network and the recognition model based on the CART decision tree; and determining the model with the highest identification accuracy as the electricity stealing client identification model.

Optionally, the apparatus further comprises:

the identification model evaluation unit is used for respectively drawing ROC curves of the identification model based on the LM neural network and the identification model based on the CART decision tree; and determining the model with the maximum AUC under the ROC curve as the electricity stealing customer identification model.

Compared with the prior art, the invention has the following beneficial effects:

the invention discloses an electricity stealing client identification method, which improves the comprehensiveness and accuracy of the data to be identified by constructing an electricity stealing client index system, and which takes, of an identification model based on an LM (Levenberg-Marquardt) neural network and an identification model based on a CART (Classification and Regression Tree) decision tree, the model with the better identification effect on electricity stealing clients as the electricity stealing client identification model. The LM neural network combines the local convergence of the Gauss-Newton method with the global characteristic of the gradient descent method. The CART decision tree is efficient: the number of computations for each prediction does not exceed the depth of the decision tree, and the decision tree imposes no assumption about the statistical distribution of the input data. Identifying electricity stealing clients with the better of the two models therefore yields a global optimal solution with high convergence speed and high precision, thereby improving the efficiency of electricity stealing client identification.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.

Fig. 1 is a schematic flow chart of a method for identifying a power stealing client according to an embodiment of the present invention;

Fig. 2 is a schematic structural diagram of an electricity stealing client identification device according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The inventors found through research that the traditional BP neural network algorithm is a local-search optimization method: it is easily trapped in local extrema, converges slowly, and is very sensitive to the initial network weights. The logistic regression algorithm, in turn, has limited accuracy and has difficulty fitting the true distribution of the data.

On this basis, of the identification model based on the LM neural network and the identification model based on the CART decision tree, the one with the better identification effect on electricity stealing clients is used as the electricity stealing client identification model. The LM neural network combines the local convergence of the Gauss-Newton method with the global characteristic of the gradient descent method. The CART decision tree is efficient: the number of computations for each prediction does not exceed the depth of the decision tree, and the decision tree imposes no assumption about the statistical distribution of the input data. Identifying electricity stealing clients with the better of the two models therefore yields a global optimal solution with high convergence speed and high precision, which improves the efficiency of electricity stealing client identification.

Specifically, referring to fig. 1, the method for identifying a power stealing client disclosed by the embodiment of the present invention includes the following steps:

s101: acquiring a multi-dimensional index value of a customer to be identified under a pre-constructed electricity stealing customer index system;

The invention takes electricity acquisition terminals and electric energy meters as research objects and determines the data requirements from the business scenario and business content. The data are drawn from the electricity consumption information acquisition system and the marketing business application system and include: acquisition points, electric energy meters in operation, daily measurement point power curves, daily measurement point voltage curves, daily measurement point current curves, daily frozen electric energy readings of measurement points, electricity consumers, metering points, power supply units, electricity utilization addresses, and records of electricity use in breach of contract and electricity stealing, among others.

And when data is accessed, data quality inspection is carried out on the aspects of data integrity, repeatability, accuracy and the like, and cleaning and missing value interpolation processing are carried out on the data.

Data distribution analysis and data periodicity analysis are then performed on the processed data. The data distribution analysis examines how the accessed historical electricity stealing data and customer electricity utilization data are distributed over time periods, and counts the distribution of electricity stealing customers in each electricity utilization category. The data periodicity analysis plots time series trend graphs of the electricity consumption of normal customers and of electricity stealing customers, compares the periodic changes of the two types of customers, and identifies the rules and characteristics of their electricity consumption data distributions.

And constructing the following electricity stealing client index system from 7 dimensions of voltage class, current class, electric quantity class, abnormal event class, historical electricity stealing class, phase sequence class and archive class according to the analysis result, wherein the index system represents the law of the electricity stealing client behavior.

Within the abnormal event class, magnetic field abnormal events and electric energy meter uncapping events come from the electricity consumption information acquisition system and include high-frequency magnetic field abnormal events, power frequency magnetic field abnormal events, strong magnetic field abnormal events, the number of uncapping events and the mean uncapping duration. Electrical equipment defect events and electricity price execution anomalies come from the marketing business application system and include self-contained emergency power supply events, high-voltage cable exposure events, fixed-ratio or fixed-quantity execution errors, and high-voltage supply with low-voltage metering.

After an electricity stealing client index system is built, obtaining a multi-dimensional index value of a client to be identified under the pre-built electricity stealing client index system, which specifically comprises the following steps:

acquiring power consumption data of a customer to be identified in a preset time period;

extracting data corresponding to the multidimensional indexes under the electricity stealing client index system from the electricity utilization data;

and normalizing the data corresponding to the multidimensional indexes under the electricity stealing client index system after interpolation processing, mapping the data into the interval [-1, 1] to obtain the multidimensional index value.
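The normalization step can be illustrated with a minimal sketch. It assumes a linear min-max mapping onto [-1, 1]; the source does not state which mapping is used, and the function name `scale_to_unit_range` and the sample readings are hypothetical.

```python
def scale_to_unit_range(values):
    """Linearly map a list of numbers onto the interval [-1, 1] (min-max scaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption of this sketch: a constant column is mapped to 0.
        return [0.0 for _ in values]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


# Hypothetical daily electricity readings for one index (kWh).
daily_kwh = [12.0, 15.0, 9.0, 21.0]
scaled = scale_to_unit_range(daily_kwh)
print(scaled)
```

The smallest reading maps to -1 and the largest to 1, so every index lands on the same scale before it is fed to either model.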

The missing values are interpolated using the Lagrange interpolation method, as follows:

First, the dependent variable and the independent variable are determined from the original data. Second, the 5 data points before and the 5 data points after the missing value are taken out (any of these that are absent or empty are simply discarded, and only the valid data are kept), and the up to 10 data points taken out form one group. The Lagrange interpolation formula is:

$$L(x) = \sum_{i=0}^{n} y_i \prod_{j=0,\ j \neq i}^{n} \frac{x - x_j}{x_i - x_j}$$

where $(x_i, y_i)$ are the data points in the group and $x$ is the position of the missing value.

Finally, the interpolation is carried out with the function lagrange(y.index, list(y)), and the interpolation result is returned.
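The interpolation procedure above can be sketched in plain Python. This is an illustration under stated assumptions, not the patent's implementation: it evaluates the standard Lagrange polynomial directly instead of calling a library `lagrange` function, and the names `lagrange_value`, `fill_missing` and the sample series are hypothetical.

```python
def lagrange_value(xs, ys, x):
    """Evaluate the Lagrange polynomial through the points (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def fill_missing(series, k=5):
    """Return a copy of series in which None entries are replaced by estimates.

    For each gap, up to k known points on each side are used; neighbours that
    are themselves missing are skipped, as described in the text.
    """
    filled = list(series)
    for pos, value in enumerate(series):
        if value is not None:
            continue
        before = range(max(0, pos - k), pos)
        after = range(pos + 1, min(len(series), pos + 1 + k))
        xs = [i for i in list(before) + list(after) if series[i] is not None]
        ys = [series[i] for i in xs]
        if xs:
            filled[pos] = lagrange_value(xs, ys, pos)
    return filled


# Hypothetical readings lying on y = (x + 1)^2 with one gap at index 2.
readings = [1.0, 4.0, None, 16.0, 25.0]
print(fill_missing(readings))
```

Because the four known points lie on a quadratic, the interpolated value at the gap is approximately 9, the value the quadratic takes there.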

S102: inputting multidimensional index values into a pre-constructed electricity stealing client identification model to obtain an identification result of a client to be identified, wherein the electricity stealing client identification model is a model with the best identification effect on the electricity stealing client in the identification model based on the LM neural network and the identification model based on the CART decision tree, the identification model based on the LM neural network and the identification model based on the CART decision tree are obtained by respectively training the LM neural network model and the CART decision tree model by utilizing a training sample which is marked as the electricity stealing client or not, and the training sample comprises the multidimensional index values under an electricity stealing client index system.

The identification model based on the LM neural network and the identification model based on the CART decision tree are trained on the same training samples, each of which comprises the multi-dimensional index values under the electricity stealing client index system.

The LM algorithm is a fast algorithm using standard numerical optimization techniques. The LM algorithm is an improved form of the Gauss-Newton method, is a combination of gradient descent and the Gauss-Newton method, and has the local convergence of the Gauss-Newton method and the global characteristic of the gradient descent method. Because it uses approximate second derivative information, convergence speed is much faster than that of the gradient method, and the algorithm is stable.

In a BP neural network, let $x_k$ denote the vector formed by the weights and thresholds at the k-th iteration. The new vector of weights and thresholds $x_{k+1}$ is obtained by the following rule:

$$x_{k+1} = x_k + \Delta x \tag{1}$$

For Newton's method:

$$\Delta x = -[\nabla^2 E(x)]^{-1} \nabla E(x) \tag{2}$$

where $\nabla E(x)$ denotes the gradient and $\nabla^2 E(x)$ denotes the Hessian matrix of the error index function $E(x)$.

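Let the error index function be:

$$E(x) = \frac{1}{2} \sum_{i=1}^{N} e_i^2(x) \tag{3}$$

In equation (3), $e_i(x)$ is the error ($i = 1, 2, \ldots, N$). Then:

$$\nabla E(x) = J^T(x)\, e(x) \tag{4}$$

$$\nabla^2 E(x) = J^T(x) J(x) + S(x) \tag{5}$$

In equations (4) and (5), $S(x) = \sum_{i=1}^{N} e_i(x) \nabla^2 e_i(x)$ and $J(x)$ is the Jacobian matrix, namely:

$$J(x) = \begin{bmatrix} \dfrac{\partial e_1(x)}{\partial x_1} & \dfrac{\partial e_1(x)}{\partial x_2} & \cdots & \dfrac{\partial e_1(x)}{\partial x_n} \\ \dfrac{\partial e_2(x)}{\partial x_1} & \dfrac{\partial e_2(x)}{\partial x_2} & \cdots & \dfrac{\partial e_2(x)}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial e_N(x)}{\partial x_1} & \dfrac{\partial e_N(x)}{\partial x_2} & \cdots & \dfrac{\partial e_N(x)}{\partial x_n} \end{bmatrix} \tag{6}$$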

The Gauss-Newton method computes the update as:

$$\Delta x = -[J^T(x) J(x)]^{-1} J^T(x)\, e(x) \tag{7}$$

The LM algorithm is an improvement on it, namely:

$$\Delta x = -[J^T(x) J(x) + \mu I]^{-1} J^T(x)\, e(x) \tag{8}$$

In equation (8), the proportionality coefficient $\mu > 0$ is a constant and $I$ is the identity matrix. When $\mu = 0$, the Gauss-Newton method is recovered; when $\mu$ is large, the update approaches the gradient descent method.

Practice shows that the LM algorithm can be tens or even hundreds of times faster than the original gradient descent method. Moreover, because $J^T(x) J(x) + \mu I$ is positive definite, the solution of equation (8) always exists, so the LM algorithm is also better than the Gauss-Newton method.
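The two limiting cases of equation (8) can be written out explicitly. The short derivation below uses only the standard identity $\nabla E(x) = J^T(x)\, e(x)$ for the sum-of-squares error and is offered as an illustration:

$$\mu \to 0: \quad \Delta x \to -[J^T(x) J(x)]^{-1} J^T(x)\, e(x) \quad \text{(Gauss-Newton step)}$$

$$\mu \gg \lVert J^T(x) J(x) \rVert: \quad \Delta x \approx -\frac{1}{\mu} J^T(x)\, e(x) = -\frac{1}{\mu} \nabla E(x) \quad \text{(gradient descent step with step size } 1/\mu\text{)}$$

A small $\mu$ therefore gives fast convergence near a minimum, while a large $\mu$ gives short, safe steps along the negative gradient.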

The method for constructing the identification model based on the LM neural network comprises the following steps:

According to Occam's razor as applied to neural networks, a larger network should not be used in network modeling as long as a smaller network can do the job. A neural network with the minimum of 3 layers is therefore used, namely an input layer, a hidden layer and an output layer, where the hidden layer and the output layer use the ReLU activation function.

The numbers of neurons in the input layer and the output layer are determined by the types of input and output elements in the training set, respectively. The network takes 49 main influence factors of a user as input and takes whether the client is an electricity stealing client as output. The numbers of neurons in the input and output layers are thus 49 and 1, respectively.

The number of hidden neurons is usually determined by the empirical formula

$$n_1 = \sqrt{n + m} + a$$

where $n_1$ is the number of hidden neurons, $n$ is the number of neurons in the input layer, $m$ is the number of neurons in the output layer, and $a$ is a constant between 1 and 10. Starting from the calculated value of $n_1$, the number of hidden neurons is varied from small to large, and 10 hidden neurons are selected, subject to the conditions that the number of neurons does not degrade the network error and that the network is not so large relative to the training data that overtraining occurs.
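As a quick check of the hidden layer sizing, the sketch below evaluates the common empirical rule n1 = sqrt(n + m) + a for the 49 inputs and 1 output stated above. The rule is assumed here to be the formula the text refers to, and the function name `hidden_neuron_range` is hypothetical.

```python
import math


def hidden_neuron_range(n_inputs, n_outputs, a_min=1, a_max=10):
    """Return the smallest and largest integer hidden-layer sizes allowed by
    n1 = sqrt(n + m) + a with a between a_min and a_max."""
    base = math.sqrt(n_inputs + n_outputs)
    return math.ceil(base + a_min), math.floor(base + a_max)


low, high = hidden_neuron_range(49, 1)
print(low, high)
```

Under this reading the candidate sizes run from 9 to 17, so the 10 hidden neurons chosen in the text fall inside the range.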

Model training is then performed on the training set with the constructed network. The network is trained to convergence several times with different initial training parameters, and the best run is selected by a weighted average of the training set precision and the test set precision, where the weights may be set according to the amounts of training set data and test set data. The network parameters of the best run are taken as the optimal result of model training.
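The selection rule can be sketched as follows. The weights are assumed to be proportional to the sizes of the training and test sets, which is one way of setting them "according to the quantity" of data; the run records and set sizes are made-up illustrative values, not results from the patent.

```python
def weighted_precision(train_acc, test_acc, n_train, n_test):
    """Weighted average of training and test precision, weighted by set size."""
    total = n_train + n_test
    return (n_train / total) * train_acc + (n_test / total) * test_acc


# Hypothetical results of three convergence runs with different initial parameters.
runs = [
    {"seed": 0, "train_acc": 0.95, "test_acc": 0.80},
    {"seed": 1, "train_acc": 0.92, "test_acc": 0.90},
    {"seed": 2, "train_acc": 0.93, "test_acc": 0.91},
]
N_TRAIN, N_TEST = 800, 200  # assumed set sizes

best = max(
    runs,
    key=lambda r: weighted_precision(r["train_acc"], r["test_acc"], N_TRAIN, N_TEST),
)
print(best["seed"])
```

Weighting the test precision keeps a run that merely memorizes the training set (seed 0 here) from being selected over a run that generalizes better.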

The calculation steps of the LM neural network are as follows:

(1) Specify the allowable training error $\varepsilon$ and the coefficients $\beta$ and $\mu_0$, and initialize the weight and threshold vector $x_0$; let $k = 0$ and $\mu = \mu_0$.

(2) Compute the network output and the error index function $E(x_k)$.

(3) Compute the Jacobian matrix $J(x_k)$ according to equation (6).

(4) Compute $\Delta x$ according to equation (8) and $E(x_k)$ according to equation (3).

(5) If $E(x_k) < \varepsilon$, stop; otherwise, take $x_{k+1} = x_k + \Delta x$ as the weights and thresholds and compute the error index function $E(x_{k+1})$.

(6) If $E(x_{k+1}) < E(x_k)$, let $k = k + 1$ and $\mu = \mu / \beta$, and return to step (2); otherwise, do not update the weights and thresholds, let $x_{k+1} = x_k$ and $\mu = \mu \beta$, and return to step (4).
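The steps above can be sketched for the smallest possible network. This is a minimal illustration, not the patent's implementation: it assumes a single tanh neuron with one weight and one threshold instead of the 49-10-1 ReLU network, solves the 2x2 system of equation (8) by Cramer's rule, and caps the number of iterations. All names and data are hypothetical.

```python
import math


def residuals(params, xs, ys):
    """Errors e_i = network output - target for a single tanh neuron."""
    w, b = params
    return [math.tanh(w * x + b) - y for x, y in zip(xs, ys)]


def error_index(errs):
    """E(x) = 1/2 * sum of squared errors, as in equation (3)."""
    return 0.5 * sum(e * e for e in errs)


def jacobian(params, xs):
    """Rows (de_i/dw, de_i/db), as in equation (6)."""
    w, b = params
    rows = []
    for x in xs:
        d = 1.0 - math.tanh(w * x + b) ** 2  # derivative of tanh
        rows.append((d * x, d))
    return rows


def lm_step(jac, errs, mu):
    """Solve (J^T J + mu I) dx = -J^T e for two parameters (equation (8))."""
    a11 = sum(r[0] * r[0] for r in jac) + mu
    a12 = sum(r[0] * r[1] for r in jac)
    a22 = sum(r[1] * r[1] for r in jac) + mu
    g1 = sum(r[0] * e for r, e in zip(jac, errs))
    g2 = sum(r[1] * e for r, e in zip(jac, errs))
    det = a11 * a22 - a12 * a12
    return -(a22 * g1 - a12 * g2) / det, -(a11 * g2 - a12 * g1) / det


def train_lm(xs, ys, params, eps=1e-10, beta=10.0, mu=0.01, max_iter=50):
    errs = residuals(params, xs, ys)
    err = error_index(errs)
    for _ in range(max_iter):  # iteration cap is an addition of this sketch
        if err < eps:
            break
        dw, db = lm_step(jacobian(params, xs), errs, mu)
        trial = (params[0] + dw, params[1] + db)
        trial_errs = residuals(trial, xs, ys)
        trial_err = error_index(trial_errs)
        if trial_err < err:
            # Step accepted: keep the new weights and reduce the damping.
            params, errs, err = trial, trial_errs, trial_err
            mu /= beta
        else:
            # Step rejected: keep the old weights and increase the damping.
            mu *= beta
    return params, err


# Hypothetical data generated by a neuron with w = 1.0, b = 0.5.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [math.tanh(1.0 * x + 0.5) for x in xs]
start = (0.1, 0.0)
fitted, final_err = train_lm(xs, ys, start)
print(fitted, final_err)
```

Because a step is kept only when it lowers the error index, the final error can never exceed the initial one; how close the fit gets to the generating parameters depends on the starting point and on the damping schedule.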

The CART decision tree algorithm is a supervised learning algorithm. Its key point is to select, from the available attributes, the partition attribute of each node so that the resulting partition achieves the highest classification precision. Growing the tree is a process of repeatedly partitioning the data set: for each partition, the "difference" between data records assigned to the same branch should be as small as possible (that is, they belong to the same class), and the "difference" between data records assigned to different branches should be as large as possible (that is, they belong to different classes). The index that measures this "difference" is called the impurity. The CART decision tree uses the Gini index to select partition attributes, and the purity of a data set can be measured by its Gini value. For a given sample set $D$, assume that there are $K$ classes and that the number of samples in the k-th class is $|C_k|$. The Gini coefficient of the sample set $D$ is then:

$$\mathrm{Gini}(D) = 1 - \sum_{k=1}^{K} \left( \frac{|C_k|}{|D|} \right)^2$$

In particular, if the sample set $D$ is divided into two parts $D_1$ and $D_2$ according to a certain value $a$ of the feature $A$, then under the condition of feature $A$ the Gini index of $D$ is:

$$\mathrm{Gini}(D, A) = \frac{|D_1|}{|D|} \mathrm{Gini}(D_1) + \frac{|D_2|}{|D|} \mathrm{Gini}(D_2)$$

$\mathrm{Gini}(D)$ indicates the impurity of the set $D$, and $\mathrm{Gini}(D, A)$ indicates the impurity of the set $D$ after it is divided by $A = a$. The greater the Gini index, the greater the impurity of the sample set.
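The two Gini quantities can be computed in a few lines. The sketch below is illustrative: the function names `gini` and `gini_split` and the labels (1 for an electricity stealing client, 0 for a normal one) are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini(D) = 1 - sum over classes of (|C_k| / |D|)^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_split(left, right):
    """Gini(D, A): size-weighted Gini of the two parts produced by a split."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


labels = [1, 1, 0, 0, 0, 0]
print(gini(labels))                        # impurity before splitting
print(gini_split([1, 1], [0, 0, 0, 0]))    # impurity after a perfect split
```

A split that separates the two classes completely drives the weighted Gini index to zero, which is why the attribute and value with the lowest Gini index are preferred at each node.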

Before the CART decision tree is used for prediction, a training sample set must be provided to construct and evaluate it. The training sample set used by the CART decision tree has the following form:

$$L = \{X_1, X_2, \ldots, X_m, Y\}$$

$$X_1 = (x_{11}, x_{12}, \ldots, x_{1t_1}),\ \ldots,\ X_m = (x_{m1}, x_{m2}, \ldots, x_{mt_n})$$

$$Y = (Y_1, Y_2, \ldots, Y_k)$$

where $X_1, X_2, \ldots, X_m$ are called attribute vectors, whose attributes may be ordered or discrete, and $Y$ is called the label vector, whose attributes may be ordered or discrete.

The CART decision tree selects one attribute, or a combination of several attributes, from the prediction attributes as the partition attribute of a tree node, splits the test variable into branches, and repeats this process to build a sufficiently large classification tree.

The CART decision tree stops building trees when one of the following conditions is met:

1. the number of samples in all leaf nodes is 1 or the samples belong to the same class;

2. the decision tree height reaches a threshold set by the user.
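Tree growth with the two stopping conditions can be sketched as below. This is a simplified illustration, not the patent's implementation: it assumes numeric attributes, binary threshold splits and a dictionary-based tree, and the attribute values, labels and function names are hypothetical.

```python
from collections import Counter


def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return the (attribute index, threshold) with the lowest weighted Gini."""
    best, best_score = None, None
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = len(left) / n * gini(left) + len(right) / n * gini(right)
            if best_score is None or score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    majority = Counter(labels).most_common(1)[0][0]
    # Stop when the node holds one sample, is pure, or the height limit is hit.
    if len(labels) == 1 or len(set(labels)) == 1 or depth >= max_depth:
        return {"leaf": majority}
    split = best_split(rows, labels)
    if split is None:  # no attribute separates the samples
        return {"leaf": majority}
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }


def predict(tree, row):
    while "leaf" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]


# Hypothetical samples: (normalized consumption, number of uncapping events).
rows = [(0.9, 0), (0.8, 1), (0.2, 3), (0.1, 4)]
labels = [0, 0, 1, 1]  # 1 = electricity stealing client
tree = build_tree(rows, labels)
print(tree)
print(predict(tree, (0.15, 5)))
```

The `max_depth` argument plays the role of the user-set height threshold; lowering it trades training accuracy for a smaller tree before any pruning is applied.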

After the decision tree is generated, it needs to be pruned with a pruning algorithm and an optimal sub-tree selected. First, the decision tree $T_0$ produced by the generation algorithm is pruned repeatedly, starting from its bottom end and continuing up to its root node, to form a sub-tree sequence $\{T_0, T_1, T_2, T_3, \ldots, T_n\}$, where $T_0$ is the unpruned tree, $T_1$ is the tree after the first pruning, and so on. The sub-tree sequence is then tested on an independent verification data set by cross-validation, and the optimal sub-tree is selected from it as the final identification model based on the CART decision tree.

Input: the decision tree $T_0$ generated by the CART algorithm.

Output: the optimal decision tree $T_\alpha$.

The specific steps are as follows:

(1) Let $k = 0$ and $T = T_0$.

(2) Let $\alpha = +\infty$.

(3) Bottom-up, for each internal node $t$, compute $C(T_t)$, $|T_t|$ and

$$g(t) = \frac{C(t) - C(T_t)}{|T_t| - 1}, \qquad \alpha = \min(\alpha, g(t))$$

where $T_t$ denotes the sub-tree with $t$ as its root node, $C(T_t)$ is the prediction error of that sub-tree on the training data, $C(t)$ is the prediction error when $t$ is treated as a single leaf node, and $|T_t|$ is the number of leaf nodes of $T_t$.

(4) Visit the internal nodes $t$ from top to bottom; if an internal node has $g(t) = \alpha$, prune it and determine the class of the resulting leaf node $t$ by majority voting, obtaining the tree $T$.

(5) Let $k = k + 1$, $\alpha_k = \alpha$ and $T_k = T$.

(6) If $T$ is not a tree consisting of the root node alone, go back to step (4).

The sub-tree sequence is verified by cross-validation, and the optimal sub-tree $T_\alpha$ finally selected is the final identification model based on the CART decision tree.
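The construction of the sub-tree sequence can be sketched with the standard weakest-link measure g(t) = (C(t) - C(T_t)) / (|T_t| - 1). The sketch rests on several assumptions: g(t) is recomputed after every pruning round, alpha_0 is taken as 0 for the unpruned tree, each node stores the number of training errors it would make as a leaf, and the cross-validation step is omitted. The tree and its error counts are made up for illustration.

```python
import copy


def leaves(node):
    """Leaf nodes of the sub-tree rooted at node."""
    if not node.get("children"):
        return [node]
    return [leaf for child in node["children"] for leaf in leaves(child)]


def g(node):
    """g(t) = (C(t) - C(T_t)) / (|T_t| - 1) for an internal node t."""
    subtree_leaves = leaves(node)
    subtree_error = sum(leaf["errors"] for leaf in subtree_leaves)
    return (node["errors"] - subtree_error) / (len(subtree_leaves) - 1)


def internal_nodes(node):
    """Internal nodes in top-down (pre-order) order."""
    if not node.get("children"):
        return []
    found = [node]
    for child in node["children"]:
        found.extend(internal_nodes(child))
    return found


def prune_sequence(tree):
    """Return [(alpha_k, T_k)] from the full tree down to the root alone."""
    tree = copy.deepcopy(tree)
    sequence = [(0.0, copy.deepcopy(tree))]
    while tree.get("children"):
        nodes = internal_nodes(tree)
        alpha = min(g(t) for t in nodes)
        for t in nodes:
            # Collapse every remaining internal node whose g(t) equals alpha.
            if t.get("children") and g(t) == alpha:
                t["children"] = []
        sequence.append((alpha, copy.deepcopy(tree)))
    return sequence


# Hypothetical tree: "errors" is the training error if the node were a leaf.
t0 = {
    "errors": 10,
    "children": [
        {"errors": 4, "children": [{"errors": 1}, {"errors": 1}]},
        {"errors": 3},
    ],
}
for alpha, subtree in prune_sequence(t0):
    print(alpha, len(leaves(subtree)))
```

Each element of the returned sequence is a candidate model; in the method described above, the candidates would then be compared on an independent verification data set.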

After the identification model based on the LM (Levenberg-Marquardt) neural network and the identification model based on the CART (Classification and Regression Tree) decision tree have been constructed, the effect of the two models needs to be evaluated so that the better of the two can be selected as the electricity stealing client identification model. In this embodiment, two methods are mainly used for model effect evaluation: accuracy and the ROC (receiver operating characteristic) curve.

In the first method, the accuracy of the identification model based on the LM neural network and of the identification model based on the CART decision tree is calculated from the confusion matrix of each model as accuracy = number of correctly predicted samples / total number of samples. After a preliminary comparison, the model with the higher accuracy is selected for application.
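The accuracy calculation can be made concrete with a short sketch. The labels and predictions are hypothetical (1 for an electricity stealing client, 0 for a normal one), as are the function names.

```python
def confusion_matrix(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Correctly predicted samples divided by all samples."""
    return (tp + tn) / (tp + fp + fn + tn)


y_true = [1, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 0, 0]
cm = confusion_matrix(y_true, y_pred)
print(cm, accuracy(*cm))
```

Computing the same figure for both models on the same evaluation samples gives the preliminary comparison described above.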

In the second method, the ROC (receiver operating characteristic) curves of the identification model based on the LM neural network and of the identification model based on the CART decision tree are drawn, and the AUC (area under the curve) values under the two ROC curves are compared. The AUC is defined as the area enclosed by the ROC curve and the coordinate axes and serves as an index for comparing different models: the larger the value, the more likely the model is to predict correctly. The model with the larger AUC value is selected for application.
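The AUC comparison can be sketched without any plotting library. The sketch assumes that each model outputs a score per client, builds the ROC points by sweeping the decision threshold, and integrates with the trapezoidal rule; the scores and the model names in the dictionary are made up for illustration.

```python
def roc_points(y_true, scores):
    """(false positive rate, true positive rate) pairs for decreasing thresholds."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x2 - x1) * (y1 + y2) / 2.0
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )


y_true = [1, 1, 0, 1, 0, 0]
scores = {
    "lm_network": [0.9, 0.8, 0.7, 0.6, 0.4, 0.2],  # hypothetical model scores
    "cart_tree": [0.9, 0.3, 0.7, 0.6, 0.4, 0.2],
}
auc_by_model = {name: auc(roc_points(y_true, s)) for name, s in scores.items()}
selected = max(auc_by_model, key=auc_by_model.get)
print(auc_by_model, selected)
```

Unlike accuracy, the AUC does not depend on a single decision threshold, which makes it a useful second criterion when electricity stealing clients are a small minority of the samples.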

The optimal electricity stealing client identification model selected through model evaluation is then put into use to identify electricity stealing clients and issue early warnings. The application is as follows:

Based on the electricity stealing client identification model and in combination with an electricity stealing prevention monitoring system, the likelihood that a user is stealing electricity is judged comprehensively, the analysis of client electricity utilization characteristics and intelligent pushing are completed, and electricity stealing behavior among clients with abnormal electricity utilization is accurately identified. By establishing a long-term mechanism for investigating abnormal electricity utilization, marketing inspection work is supported, the accuracy of electricity stealing client identification is improved, more efficient methods and means are provided for electricity stealing prevention management, the level of electricity stealing prevention management is raised, and the company's operational risk is reduced.

Based on the above-mentioned electricity stealing client identification method disclosed in the embodiment, the embodiment correspondingly discloses an electricity stealing client identification device, please refer to fig. 2, the device includes:

the client data acquisition unit 201 to be identified is used for acquiring a multidimensional index value of a client to be identified under a pre-established electricity stealing client index system;

the electricity stealing client identification unit 202 is configured to input the multidimensional index value into a pre-constructed electricity stealing client identification model to obtain an identification result of the client to be identified, where the electricity stealing client identification model is a model with a best identification effect on the electricity stealing client in an identification model based on an LM neural network and an identification model based on a CART decision tree, the identification model based on the LM neural network and the identification model based on the CART decision tree are obtained by respectively training an LM neural network model and a CART decision tree model using a training sample labeled as whether the electricity stealing client is a electricity stealing client, and the training sample includes the multidimensional index value under the electricity stealing client index system.

Optionally, the to-be-identified client data acquisition unit 201 is specifically configured to:

wherein the preset conditions include:

the CART decision tree height reaches a threshold set by the user.
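A height threshold of this kind can be sketched as follows, assuming the preset condition acts as a stopping rule while the tree is grown and that splits are chosen by Gini impurity, as is usual for CART classification. The function names, the `max_height` parameter and the sample rows are hypothetical and serve only as an illustration.

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple

Row = Tuple[Sequence[float], int]  # (index values, label), label 1 = stealing


def gini(rows: List[Row]) -> float:
    """Gini impurity of the labels in rows."""
    counts = Counter(label for _, label in rows)
    return 1.0 - sum((n / len(rows)) ** 2 for n in counts.values())


def best_split(rows: List[Row]) -> Optional[Tuple[int, float]]:
    """Return the (feature, threshold) that lowers weighted Gini most, else None."""
    best, best_score = None, gini(rows)
    for f in range(len(rows[0][0])):
        for t in sorted({x[f] for x, _ in rows}):
            left = [r for r in rows if r[0][f] <= t]
            right = [r for r in rows if r[0][f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows: List[Row], max_height: int, depth: int = 0):
    """Grow a tree; stop splitting once the user-set height threshold is reached."""
    split = None if depth >= max_height else best_split(rows)
    if split is None:
        # Leaf node: predict the majority label of the remaining rows.
        return Counter(label for _, label in rows).most_common(1)[0][0]
    f, t = split
    left = [r for r in rows if r[0][f] <= t]
    right = [r for r in rows if r[0][f] > t]
    return (f, t,
            build_tree(left, max_height, depth + 1),
            build_tree(right, max_height, depth + 1))


def predict(tree, x: Sequence[float]) -> int:
    """Walk from the root to a leaf; at most one comparison per tree level."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree


def height(tree) -> int:
    """Number of split levels between the root and the deepest leaf."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(height(tree[2]), height(tree[3]))


# Made-up training rows with two index values each.
rows: List[Row] = [
    ((0.1, 0.2), 0), ((0.2, 0.8), 0), ((0.3, 0.3), 0),
    ((0.9, 0.2), 0), ((0.8, 0.9), 1), ((0.7, 0.7), 1),
]
shallow = build_tree(rows, max_height=1)
deep = build_tree(rows, max_height=2)
print(height(shallow), height(deep))
```

With these rows a threshold of 1 leaves one training row misclassified, while a threshold of 2 separates all of them, which shows the trade-off the user controls through the threshold.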

Optionally, the apparatus further comprises:

The electricity stealing client identification device disclosed in this embodiment improves the comprehensiveness and accuracy of the data to be identified by constructing an electricity stealing client index system, and takes whichever of an identification model based on an LM (Levenberg-Marquardt) neural network and an identification model based on a CART (Classification and Regression Tree) decision tree has the better identification effect on electricity stealing clients as the electricity stealing client identification model. The LM neural network combines the local convergence of the Gauss-Newton method with the global characteristic of the gradient descent method. The CART decision tree is highly efficient: the maximum number of computations for each prediction does not exceed the depth of the decision tree, and the decision tree makes no assumptions about the statistical distribution of the input data. Therefore, by identifying electricity stealing clients with the better of the two models, a global optimal solution with high convergence speed and high precision can be obtained, and the identification efficiency for electricity stealing clients is improved.
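The combination of Gauss-Newton and gradient descent behavior attributed to the LM algorithm can be seen from the standard Levenberg-Marquardt weight update, given here for reference only. The notation is assumed rather than taken from this passage: w is the weight vector, e the vector of output errors, J the Jacobian of e with respect to w, μ > 0 the damping factor, and J^T J is taken to be invertible in the first limit.

```latex
\[
\Delta \mathbf{w} = -\left(\mathbf{J}^{\top}\mathbf{J} + \mu \mathbf{I}\right)^{-1}\mathbf{J}^{\top}\mathbf{e},
\qquad
\begin{cases}
\mu \to 0: & \Delta \mathbf{w} \to -\left(\mathbf{J}^{\top}\mathbf{J}\right)^{-1}\mathbf{J}^{\top}\mathbf{e} \quad \text{(Gauss-Newton step)} \\[4pt]
\mu \to \infty: & \Delta \mathbf{w} \approx -\tfrac{1}{\mu}\,\mathbf{J}^{\top}\mathbf{e} \quad \text{(small gradient-descent step)}
\end{cases}
\]
```

Since the gradient of the squared error E = ½‖e‖² is J^T e, a large μ yields a short step along the negative gradient, while a small μ recovers the fast local convergence of the Gauss-Newton method.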

The embodiments in this description are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another. Since the device disclosed in the embodiment corresponds to the method disclosed in the embodiment, its description is relatively brief, and for relevant details reference may be made to the description of the method.

It is further noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.

The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

The above embodiments may be combined arbitrarily, and the features described in the embodiments of this specification may be replaced by or combined with one another, so that those skilled in the art can implement or use the present application.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A method for identifying a power stealing client, comprising:

2. The method of claim 1, wherein the obtaining of the multi-dimensional index value of the customer to be identified under the pre-constructed electricity stealing customer index system comprises:

3. The method of claim 1, further comprising:

4. The method of claim 1, further comprising:

wherein the preset conditions include:

the CART decision tree height reaches a threshold set by the user.

5. The method of claim 1, further comprising:

6. The method of claim 1, further comprising:

7. An electricity stealing client identifying apparatus, comprising:

8. The apparatus according to claim 7, wherein the client data acquiring unit to be identified is specifically configured to:

9. The apparatus according to claim 7, characterized in that the apparatus further comprises a first model construction unit, in particular for:

10. The apparatus according to claim 7, characterized in that the apparatus further comprises a second model construction unit, in particular for:

wherein the preset conditions include:

the CART decision tree height reaches a threshold set by the user.