CN117556369B - Power theft detection method and system for dynamically generated residual error graph convolution neural network - Google Patents

Publication number
CN117556369B
CN117556369B CN202410046231.XA CN202410046231A CN117556369B CN 117556369 B CN117556369 B CN 117556369B CN 202410046231 A CN202410046231 A CN 202410046231A CN 117556369 B CN117556369 B CN 117556369B
Authority
CN
China
Prior art keywords
neural network
data
matrix
residual
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410046231.XA
Other languages
Chinese (zh)
Other versions
CN117556369A (en
Inventor
庄伟�
江文
纪兆辉
樊继利
李之恒
邢发男
申义贤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Information Science and Technology
Original Assignee
Nanjing University of Information Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Information Science and Technology
Priority to CN202410046231.XA
Publication of CN117556369A
Application granted
Publication of CN117556369B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/243: Classification techniques relating to the number of classes
    • G06F18/2433: Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213: Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/042: Knowledge-based neural networks; Logical representations of neural networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/0464: Convolutional networks [CNN, ConvNet]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/048: Activation functions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses an electricity theft detection method based on a dynamically generated residual graph convolutional neural network, comprising the following steps: (1) collecting raw user electricity consumption data from a power system; (2) filling missing values and handling outliers in the raw data, and dividing the data into a training set, a validation set and a test set; (3) transforming each preprocessed one-dimensional power load curve into a two-dimensional power load feature matrix X for input to the graph convolutional neural network; (4) obtaining, through training, the optimal parameters that yield the adjacency matrix A best representing the relationships in the data; (5) feeding the feature matrix X and the adjacency matrix A into the residual graph convolutional neural network to extract latent features, and obtaining the final classification through a pooling layer and a fully connected layer. By dynamically learning the deep temporal dependencies, periodic patterns and latent features among a user's electricity consumption periods, the invention improves the accuracy of electricity theft detection. The adopted data filling method effectively alleviates the data imbalance problem found in real-world data.

Description

Electricity theft detection method and system based on a dynamically generated residual graph convolutional neural network
Technical Field
The invention relates to the technical field of electricity theft detection, and in particular to an electricity theft detection method and system based on a dynamically generated residual graph convolutional neural network.
Background
The power industry is one of the pillar industries supporting national strategic development, and its healthy and stable development bears on the smooth operation of society. Electric energy losses in power transmission and distribution strongly affect the safety and economic performance of the power system; they comprise technical losses and non-technical losses. In recent years, electricity theft detection has attracted growing research attention. Traditional methods are blind and random, have low inspection efficiency, and are limited by the technical skill and detection experience of the inspection personnel. As the smart grid continues to develop and advanced metering infrastructure (AMI) matures, smart meters, an important component of AMI, collect large amounts of electricity load data at high frequency, and electricity theft detection has gradually evolved from traditional on-site inspection and evidence collection by staff to machine-learning-based methods.
However, most machine learning models proposed in the prior art model a single power load curve directly and cannot fully capture the temporal, spatial and latent dependencies across the periods of electricity consumption data. Moreover, in practice the number of normal users far exceeds the number of electricity theft users, and most methods cannot effectively analyze the consumption habits of theft users, so they perform poorly in application and cannot be deployed in actual enterprises.
Disclosure of Invention
Purpose of the invention: the invention aims to provide an electricity theft detection method and system based on a dynamically generated residual graph convolutional neural network that accurately identify the electricity theft behavior of users.
Technical solution: the electricity theft detection method based on a dynamically generated residual graph convolutional neural network according to the invention comprises the following steps:
(1) Collecting raw user electricity consumption data from a power system;
(2) Filling missing values and handling outliers in the raw data, and dividing the data into a training set, a validation set and a test set;
(3) Transforming each preprocessed one-dimensional power load curve into a two-dimensional power load feature matrix X for input to the graph convolutional neural network;
(4) Taking each period of the data as a node, calculating the correlation between nodes with a dynamic topology graph generation method, and obtaining through training the optimal parameters that yield the adjacency matrix A best representing the relationships in the data;
(5) Feeding the feature matrix X and the adjacency matrix A into the residual graph convolutional neural network to extract latent features, and obtaining the final classification through a pooling layer and a fully connected layer.
Further, in step (1), the raw data is the daily electricity consumption record of 42,372 users over 1,035 days. The data of a single user is a 1 × 1035 time series.
Further, step (2) comprises the following steps:
(21) Handling missing electricity consumption data: users with more than 50% of their consumption data missing are deleted; otherwise the missing data is filled by linear interpolation. The three values before and the three values after each missing value are taken out, any of them that is absent or null is discarded, and the remaining values form one group from which the fill value is calculated with an interpolation formula, in which the value of the day-i electricity data is recorded as NaN if it is empty;
(22) Outliers are individual values in a sample that deviate significantly from the rest of the observations; they are identified with the three-sigma rule, which uses the mean and the standard deviation of a single sample;
(23) The data is normalized, mapping all values into the interval [0, 1];
(24) The processed data set is divided into a training set, a validation set and a test set in the ratio 6:2:2.
Further, step (3) is specifically as follows: the preprocessed one-dimensional power load curves comprise N sample users, each described by its daily electricity consumption data; each one-dimensional curve is reshaped into a two-dimensional feature matrix X, which, together with the adjacency matrix, serves as the input of the graph convolution module.
Further, step (4) adaptively learns the graph adjacency matrix with the dynamic topology graph generation method in order to capture the implicit connections within the data, and comprises the following steps:
(41) Randomly generating n node vectors, initializing a weight parameter W and a bias parameter b, and generating an initial relation measure between the nodes with a formula in which an activation function is applied and a hyperparameter controls the oscillation size of the preliminary relation measure; V_n is the hidden representation of the node vector E_n, and V_i and V_j denote the i-th and j-th vectors in the sequence;
(42) Sorting the initial relation measure by row and by column respectively, selecting the k largest values and setting them to 1, and setting the remaining values to 0, to obtain the adjacency matrix A; k is a hyperparameter chosen from [20, 25, 30, 35, 40, …, 80].
Further, the residual graph convolutional neural network in step (5) includes a 1×1 convolution module and a residual MixHop graph convolution module, and step (5) specifically comprises the following steps:
(51) Increasing the number of channels of the feature matrix X with the 1×1 convolution module;
(52) Extracting the temporal, spatial and latent features among the users' power curves with the residual MixHop graph convolution module. The module consists of several graph convolution (GC) layers; each GC layer is formed by stacking two MixHop modules, and residual connections are used between the GC layers. Given the adjacency matrix A, the information propagation step is
$$H^{(k)} = \beta H_{in} + (1-\beta)\,\tilde{A}\,H^{(k-1)},$$
where
$$\tilde{A} = \tilde{D}^{-1}(A + I).$$
Here $H^{(k)}$ is the node representation at depth k obtained by the graph convolution operation; $\beta$ is a hyperparameter controlling the proportion of the root node's original state that is retained; $H_{in}$ is the input hidden state of the first layer, obtained by multiplying the adjacency matrix by the feature matrix; $H^{(k-1)}$ is the hidden state of the previous layer, i.e. the node features of the previous layer; $\tilde{A}$ is the normalized result of adding the identity matrix I to the matrix A; $\tilde{D}$ is the degree matrix, whose diagonal elements are the node degrees; and $\tilde{D}^{-1}$ is the inverse of the degree matrix. The information selection step is
$$H_{out} = \sum_{k=0}^{K} H^{(k)} W^{(k)},$$
where K is the propagation depth, $W^{(k)}$ is a model weight parameter, and $H_{out}$ is the output hidden state of the current layer;
(53) Filtering redundant information in the network with a max pooling layer: the maximum value within each rectangular sub-region of the input feature map is selected and passed to the fully connected layer;
(54) Converting the extracted node features into prediction scores to obtain the final classification result: the fully connected layer, with model parameters W and B, is followed by a SoftMax function;
(55) Updating every parameter of the model by gradient descent, with binary cross entropy as the loss function:
$$L = -\left[\,Y \log P + (1 - Y)\log(1 - P)\,\right],$$
where Y is the actual class label, 0 or 1, and P is the probability predicted by the model that the sample belongs to the first class;
(56) Training ends when the model's loss on the training set no longer drops significantly.
The electricity theft detection system based on a dynamically generated residual graph convolutional neural network according to the invention comprises:
a collection module, used for collecting raw user electricity consumption data from a power system;
a preprocessing module, used for filling missing values and handling outliers in the raw data, and for dividing the data into a training set, a validation set and a test set;
a conversion module, used for transforming each preprocessed one-dimensional power load curve into a two-dimensional power load feature matrix for input to the graph convolutional neural network;
an adjacency matrix module, used for taking each period of the data as a node, calculating the correlation between nodes with a dynamic topology graph generation method, and finding through training the optimal parameters that yield the adjacency matrix A best representing the relationships in the data;
a classification module, used for feeding the feature matrix X and the adjacency matrix A into the residual graph convolutional neural network to extract latent features, and for obtaining the final classification through a pooling layer and a fully connected layer.
The device of the invention comprises a memory, a processor and a computer program stored in the memory and executable on the processor; when loaded into the processor, the computer program implements the electricity theft detection method based on a dynamically generated residual graph convolutional neural network.
The storage medium of the invention stores a computer program which, when executed by a processor, implements the electricity theft detection method based on a dynamically generated residual graph convolutional neural network according to any one of the above.
Beneficial effects: compared with the prior art, the invention has the following notable advantages. It uses fewer parameters while providing more accurate node relationships. A MixHop graph convolutional network is employed to extract the deep temporal dependencies, periodic patterns and latent features in user electricity consumption data. Adding residual connections increases the depth of the network and effectively alleviates the gradient explosion problem during training, which strengthens the robustness of the model.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of the dynamically generated residual graph convolutional neural network constructed in accordance with the present invention;
FIG. 3 is a schematic diagram of the residual graph convolution module of the present invention;
FIG. 4 is a convergence plot of the training and validation loss values during training of the present invention;
FIG. 5 is a comparison of the effects of the present invention with different numbers of convolution layers;
FIG. 6 is a comparison of the effects of the present invention with different numbers of residual convolution layers;
FIG. 7 is a comparison of the effects of the present invention at different electricity theft ratios.
Description of the embodiments
The technical solution of the invention is further described below with reference to the accompanying drawings.
As shown in FIG. 1, an embodiment of the present invention provides an electricity theft detection method based on a dynamically generated residual graph convolutional neural network, comprising the following steps:
(1) Collecting raw user electricity consumption data from a power system. The data set consists of actual electricity consumption data recorded by the State Grid Corporation of China (SGCC) from 2014 to 2016, covering 42,372 users over 1,035 days, of whom 3,615 are electricity theft users and 38,757 are normal users. The data of a single user is a 1 × 1035 time series.
(2) Filling missing values and handling outliers in the raw data, and dividing the data into a training set, a validation set and a test set. This comprises the following steps:
(21) Handling missing electricity consumption data: users with more than 50% of their consumption data missing are deleted; otherwise the missing data is filled by linear interpolation. The three values before and the three values after each missing value are taken out, any of them that is absent or null is discarded, and the remaining values form one group from which the fill value is calculated with an interpolation formula, in which the value of the day-i electricity data is recorded as NaN if it is empty;
(22) Outliers are individual values in a sample that deviate significantly from the rest of the observations; they are identified with the three-sigma rule, which uses the mean and the standard deviation of a single sample;
(23) The data is normalized, mapping all values into the interval [0, 1]; the processed data is presented in Table 1.
(24) The processed data set is divided into a training set, a validation set and a test set in the ratio 6:2:2.
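The preprocessing of steps (21) to (24) can be sketched in plain Python. The interpolation and three-sigma formulas are not reproduced in this text, so the sketch rests on stated assumptions: a missing value is filled with the mean of the valid values found up to three days before and after it, values above mean + 3 × standard deviation are capped at that threshold, and normalization is min-max scaling. All function names are hypothetical.

```python
import math
import random

NAN = float("nan")

def fill_missing(series, window=3):
    """Fill NaN entries with the mean of the valid values found up to
    `window` positions before and after (an assumed interpolation rule)."""
    filled = list(series)
    for i, value in enumerate(series):
        if math.isnan(value):
            lo, hi = max(0, i - window), min(len(series), i + window + 1)
            neighbours = [v for v in series[lo:hi] if not math.isnan(v)]
            filled[i] = sum(neighbours) / len(neighbours) if neighbours else 0.0
    return filled

def clip_three_sigma(series):
    """Cap values above mean + 3 * std (assumed form of the three-sigma rule)."""
    mean = sum(series) / len(series)
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / len(series))
    upper = mean + 3 * std
    return [min(v, upper) for v in series]

def min_max_scale(series):
    """Map all values into [0, 1]."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(v - lo) / (hi - lo) for v in series]

def preprocess(series, max_missing=0.5):
    """Return the cleaned series, or None if the user should be dropped."""
    missing = sum(math.isnan(v) for v in series) / len(series)
    if missing > max_missing:
        return None
    return min_max_scale(clip_three_sigma(fill_missing(series)))

def split_622(items, seed=0):
    """Shuffle and split into training, validation and test sets (6:2:2)."""
    items = list(items)
    random.Random(seed).shuffle(items)
    a, b = len(items) * 6 // 10, len(items) * 8 // 10
    return items[:a], items[a:b], items[b:]
```

For example, `preprocess([1.0, NAN, 3.0])` fills the gap with 2.0 and then scales the series to `[0.0, 0.5, 1.0]`, while a series that is more than half NaN yields `None`.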
(3) Transforming each preprocessed one-dimensional power load curve into a two-dimensional power load feature matrix X for input to the graph convolutional neural network. Specifically, the preprocessed one-dimensional power load curves comprise N sample users, each described by its daily electricity consumption data; each one-dimensional curve is reshaped into a two-dimensional feature matrix X, which, together with the adjacency matrix, serves as the input of the graph convolution module.
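A minimal sketch of the reshaping in step (3) follows. The period length and the handling of a series whose length is not a multiple of the period are not stated in this text; the sketch assumes a 7-day period and zero-padding of the tail, and `to_period_matrix` is a hypothetical name.

```python
def to_period_matrix(series, period=7):
    """Reshape a 1-D load curve into rows of `period` days each.

    The tail is zero-padded so that the length divides evenly
    (an assumption; the source does not say how the remainder is handled).
    """
    padded = list(series) + [0.0] * (-len(series) % period)
    return [padded[i:i + period] for i in range(0, len(padded), period)]
```

Under these assumptions a 1,035-day series becomes a 148 × 7 matrix, each row being one period (one graph node in step (4)).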
(4) Taking each period of the data as a node, calculating the correlation between nodes with a dynamic topology graph generation method, and finding through training the optimal parameters that yield the adjacency matrix A best representing the relationships in the data. The graph adjacency matrix is learned adaptively with the dynamic topology graph generation method in order to capture the implicit connections within the data, in the following steps:
(41) Randomly generating n node vectors, initializing a weight parameter W and a bias parameter b, and generating an initial relation measure between the nodes with a formula in which an activation function is applied and a hyperparameter controls the oscillation size of the preliminary relation measure; V_n is the hidden representation of the node vector E_n, and V_i and V_j denote the i-th and j-th vectors in the sequence;
(42) Sorting the initial relation measure by row and by column respectively, selecting the k largest values and setting them to 1, and setting the remaining values to 0, to obtain the adjacency matrix A; k is a hyperparameter chosen from [20, 25, 30, 35, 40, …, 80].
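The forward pass of steps (41) and (42) can be illustrated as below. The relation formula itself is not reproduced in this text, so the sketch assumes V = tanh(α(EW + b)) for the hidden representations and a plain dot product between V_i and V_j as the relation measure; it also applies the top-k selection per row only. The names `hidden_vectors`, `relation_scores` and `top_k_adjacency` are hypothetical, and in the described method E, W and b would be learned during training rather than fixed at their random initial values.

```python
import math
import random

def hidden_vectors(E, W, b, alpha):
    """V = tanh(alpha * (E W + b)), an assumed form of the node embedding."""
    return [[math.tanh(alpha * (sum(e[t] * W[t][j] for t in range(len(e))) + b[j]))
             for j in range(len(b))] for e in E]

def relation_scores(V):
    """Assumed relation measure: dot product between hidden vectors."""
    return [[sum(x * y for x, y in zip(vi, vj)) for vj in V] for vi in V]

def top_k_adjacency(scores, k):
    """Keep the k largest entries of every row as 1, everything else as 0."""
    A = []
    for row in scores:
        keep = set(sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k])
        A.append([1 if j in keep else 0 for j in range(len(row))])
    return A

# Toy example: 6 nodes with 4-dimensional random node vectors.
random.seed(0)
n, d = 6, 4
E = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]
b = [0.0] * d
V = hidden_vectors(E, W, b, alpha=3.0)
A = top_k_adjacency(relation_scores(V), k=2)
```

Each row of `A` then contains exactly k ones, so every node keeps its k strongest neighbours.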
(5) Feeding the feature matrix X and the adjacency matrix A into the residual graph convolutional neural network to extract latent features, and obtaining the final classification through a pooling layer and a fully connected layer. The residual graph convolutional neural network includes a 1×1 convolution module and a residual MixHop graph convolution module, and the step specifically comprises the following:
(51) Increasing the number of channels of the feature matrix X with the 1×1 convolution module;
(52) Extracting the temporal, spatial and latent features among the users' power curves with the residual MixHop graph convolution module. The module consists of several graph convolution (GC) layers; each GC layer is formed by stacking two MixHop modules, and residual connections are used between the GC layers. Given the adjacency matrix A, the information propagation step is
$$H^{(k)} = \beta H_{in} + (1-\beta)\,\tilde{A}\,H^{(k-1)},$$
where
$$\tilde{A} = \tilde{D}^{-1}(A + I).$$
Here $H^{(k)}$ is the node representation at depth k obtained by the graph convolution operation; $\beta$ is a hyperparameter controlling the proportion of the root node's original state that is retained; $H_{in}$ is the input hidden state of the first layer, obtained by multiplying the adjacency matrix by the feature matrix; $H^{(k-1)}$ is the hidden state of the previous layer, i.e. the node features of the previous layer; $\tilde{A}$ is the normalized result of adding the identity matrix I to the matrix A; $\tilde{D}$ is the degree matrix, whose diagonal elements are the node degrees; and $\tilde{D}^{-1}$ is the inverse of the degree matrix. The information selection step is
$$H_{out} = \sum_{k=0}^{K} H^{(k)} W^{(k)},$$
where K is the propagation depth, $W^{(k)}$ is a model weight parameter, and $H_{out}$ is the output hidden state of the current layer;
FIG. 3 illustrates the MixHop convolution steps. Information is first propagated horizontally and then selected vertically. The information propagation step recursively propagates node information along the given graph structure; during propagation, a certain proportion of each node's original state is retained. FIG. 3 shows the process of the GCN module in more detail than the description of the module in FIG. 2.
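The propagate-then-select behaviour described above can be sketched in plain Python. The sketch assumes the propagation rule H(k) = β·H_in + (1 − β)·Ã·H(k−1) with H(0) = H_in and Ã the row-normalized A + I, and a selection step that sums H(k)·W(k) over the hops. For brevity a GC layer here wraps a single MixHop module in a skip connection, whereas the text stacks two per layer. The helper names are hypothetical.

```python
def matmul(P, Q):
    """Dense matrix product of two lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def normalise(A):
    """Row-normalise A + I, i.e. D^-1 (A + I) with D the degree matrix of A + I."""
    n = len(A)
    A_hat = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    return [[v / sum(row) for v in row] for row in A_hat]

def mixhop(A, H_in, weights, beta):
    """Propagate H_in over len(weights) - 1 hops and mix the hops linearly."""
    A_norm = normalise(A)
    H = H_in                         # H^(0) = H_in (assumption)
    out = matmul(H, weights[0])      # selection term for hop 0
    for W in weights[1:]:
        AH = matmul(A_norm, H)
        # Keep a share beta of the original state, propagate the rest.
        H = [[beta * h0 + (1 - beta) * ah for h0, ah in zip(r0, r1)]
             for r0, r1 in zip(H_in, AH)]
        HW = matmul(H, W)
        out = [[o + x for o, x in zip(ro, rx)] for ro, rx in zip(out, HW)]
    return out

def residual_block(A, H, weights, beta):
    """One GC layer with a skip connection: output = H + MixHop(H)."""
    M = mixhop(A, H, weights, beta)
    return [[h + m for h, m in zip(rh, rm)] for rh, rm in zip(H, M)]
```

With β = 1 every hop returns the original state unchanged, and with β = 0 the original state is fully replaced by the neighbourhood average, which makes the role of the retention ratio easy to check on a two-node graph.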
(53) Filtering redundant information in the network with a max pooling layer: the maximum value within each rectangular sub-region of the input feature map is selected and passed to the fully connected layer;
(54) Converting the extracted node features into prediction scores to obtain the final classification result: the fully connected layer, with model parameters W and B, is followed by a SoftMax function;
(55) Updating every parameter of the model by gradient descent, with binary cross entropy as the loss function:
$$L = -\left[\,Y \log P + (1 - Y)\log(1 - P)\,\right],$$
where Y is the actual class label, 0 or 1, and P is the probability predicted by the model that the sample belongs to the first class;
(56) Training ends when the model's loss on the training set no longer drops significantly.
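Steps (53) to (55) can be illustrated with small stand-alone helpers. The pooling window size and stride are not given in this text, so non-overlapping square windows are assumed; the loss is the standard binary cross entropy. Function names are hypothetical.

```python
import math

def max_pool(feature_map, size):
    """Non-overlapping max pooling over size x size sub-regions (assumed layout)."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [[max(feature_map[r + dr][c + dc]
                 for dr in range(size) for dc in range(size)
                 if r + dr < rows and c + dc < cols)
             for c in range(0, cols, size)]
            for r in range(0, rows, size)]

def softmax(scores):
    """Numerically stable SoftMax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def binary_cross_entropy(y, p, eps=1e-12):
    """-(y log p + (1 - y) log(1 - p)), with p clipped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))
```

A confident correct prediction gives a small loss and a confident wrong one a large loss, which is what gradient descent in step (55) exploits.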
The invention treats electricity theft detection as a binary classification task that detects whether a user commits electricity theft. When the data is imbalanced, and especially when it is extremely skewed, accuracy alone cannot objectively assess a model, so the model is evaluated with multiple metrics to show its advantages over existing models. The invention uses the area under the curve (AUC) and mean average precision (MAP), which are commonly used in electricity theft detection, as evaluation metrics; they are explained in Table 2.
Table 2. Introduction of evaluation metrics
The metrics are calculated as follows:
where $Rank_i$ denotes the rank value of sample i, M denotes the normal user samples, N denotes the electricity theft user samples, and the samples are ranked in ascending order of their scores.
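A rank-based AUC can be computed as sketched below. The formula is not reproduced in this text; the sketch uses the standard rank-sum form, AUC = (sum of the ranks of the positive samples − M(M + 1)/2) / (M × N), and assumes that M counts the samples of the class whose ranks are summed (label 1 here) and N the remaining samples. `auc_by_rank` is a hypothetical name.

```python
def auc_by_rank(labels, scores):
    """Rank-sum AUC; ranks are 1-based, ascending by score, ties averaged."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of a tie group
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    M = sum(labels)            # samples with label 1
    N = len(labels) - M        # samples with label 0
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - M * (M + 1) / 2) / (M * N)
```

For labels `[0, 0, 1, 1]` and scores `[0.1, 0.4, 0.35, 0.8]` the positive samples receive ranks 2 and 4, giving an AUC of 0.75.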
Before MAP is evaluated, the labels of the test set must be ranked by test score; in electricity theft detection, the top 100 and top 200 labels are typically selected to evaluate performance.
First, the precision at k (denoted P@k) is defined as
$$P@k = \frac{Y_k}{k},$$
where $Y_k$ is the number of correctly predicted electricity thieves up to position k. MAP@N (with N equal to 100 or 200) then denotes the mean of all P@k values:
$$MAP@N = \frac{\sum_{i=1}^{r} P@k_i}{r},$$
where r is the number of electricity theft users among the top N labels and $k_i$ is the position of the i-th electricity theft user.
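The two definitions translate directly into code. The sketch below assumes that labels are 1 for electricity theft and 0 for normal users and that a higher score means a more suspicious user; the function names are hypothetical.

```python
def precision_at_k(ranked_labels, k):
    """P@k: fraction of theft users among the first k ranked labels."""
    return sum(ranked_labels[:k]) / k

def map_at_n(labels, scores, n):
    """MAP@N: mean of P@k over the positions k of theft users in the top N."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [labels[i] for i in order][:n]
    hits = [k for k, y in enumerate(ranked, start=1) if y == 1]
    if not hits:
        return 0.0
    return sum(precision_at_k(ranked, k) for k in hits) / len(hits)
```

With labels `[1, 0, 1, 0]`, scores `[0.9, 0.8, 0.7, 0.1]` and N = 3, theft users sit at positions 1 and 3, so MAP@3 is (1 + 2/3) / 2 = 5/6.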
The metrics of the embodiment of the invention and of other models on the test set are compared in Table 3:
Table 3. Comparison of effects with mainstream models
As can be seen from Table 3, on the test set the present model achieves an AUC of 0.932, a MAP@100 of 0.959 and a MAP@200 of 0.967, all higher than those of the existing models.
FIG. 4 shows the convergence of the method over 100 training rounds at a training ratio of 60%. The horizontal axis is the number of training rounds and the vertical axis is the loss value. As training proceeds, both the training loss and the validation loss drop significantly. Between rounds 35 and 50 the validation loss fluctuates slightly, which may be due to noise in the data, and after round 60 the loss value stabilizes.
The final number of layers was set to 6 by comparing the effects of different numbers of graph convolution layers during model training.
FIGS. 5 and 6 show how the number of convolution layers affects model performance without and with residual connections, respectively.
FIG. 5 shows the experimental results of the graph convolution module without residual connections. Model performance improves as the number of layers grows within a certain range and declines once a threshold is exceeded. For example, performance improves markedly from 1 to 3 layers and drops significantly beyond 3 layers. Initially, deepening the model increases the number of parameter iterations, but the complex operations in graph convolution cause the model to overfit beyond three layers, so performance cannot improve further.
FIG. 6 shows the experimental results with the residual graph convolution module. Here too, model performance improves as the number of layers grows within a certain range. In contrast to the model without residual connections, performance begins to decrease only when the number of layers reaches 6. This is because the residual connection lets a later layer retain the node information of the earlier layers when the convolution operation is performed, so the model can be made deeper.
Meanwhile, the best performance of the model with residual connections is higher than that of the original model. For example, with 6 layers the best AUC of the model with residual connections is 0.865, which is 0.12 higher than that of the model without residual connections. This demonstrates the effectiveness of residual connections in the model of the invention and shows that they preserve the initial information of the nodes during the graph convolution operation.
FIG. 7 compares the detection performance of the model at different electricity theft ratios. In the data set used in the experiment, electricity theft users account for 8% of all users; to show the effect of the model at different theft ratios, the proportion of theft users is increased by reducing the number of normal users. The detection accuracy of the invention rises as the theft ratio increases. Note that the invention requires a certain amount of labeled electricity theft user data: too few theft users cannot provide sufficient features. For example, at theft ratios of 2% and 4% the model cannot adequately extract the temporal correlation and periodicity in the theft data, resulting in poor accuracy.
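Raising the theft ratio by reducing the number of normal users, as described for FIG. 7, amounts to random undersampling of the majority class. The sketch below is one way to do this under stated assumptions: labels are 1 for theft and 0 for normal users, all theft users are kept, and normal users are sampled without replacement. `undersample_normal` is a hypothetical name, and the sampling procedure actually used in the experiment is not described in this text.

```python
import random

def undersample_normal(labels, theft_ratio, seed=0):
    """Return sorted indices of a subset whose share of theft users is theft_ratio.

    All theft users (label 1) are kept; normal users (label 0) are sampled
    at random. If too few normal users exist, all of them are kept.
    """
    theft = [i for i, y in enumerate(labels) if y == 1]
    normal = [i for i, y in enumerate(labels) if y == 0]
    wanted = round(len(theft) * (1 - theft_ratio) / theft_ratio)
    keep = min(len(normal), wanted)
    rng = random.Random(seed)
    return sorted(theft + rng.sample(normal, keep))
```

For a toy set of 10 theft users and 90 normal users, a target ratio of 0.2 keeps all 10 theft users and 40 normal users.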
The embodiment of the invention also provides an electricity theft detection system based on a residual graph convolutional neural network, comprising:
a collection module, used for collecting raw user electricity consumption data from a power system;
a preprocessing module, used for filling missing values and handling outliers in the raw data, and for dividing the data into a training set, a validation set and a test set;
a conversion module, used for transforming each preprocessed one-dimensional power load curve into a two-dimensional power load feature matrix for input to the graph convolutional neural network;
an adjacency matrix module, used for taking each period of the data as a node, calculating the correlation between nodes with a dynamic topology graph generation method, and finding through training the optimal parameters that yield the adjacency matrix A best representing the relationships in the data;
a classification module, used for feeding the feature matrix X and the adjacency matrix A into the residual graph convolutional neural network to extract latent features, and for obtaining the final classification through a pooling layer and a fully connected layer.
The embodiment of the invention also provides a device comprising a memory, a processor and a computer program stored in the memory and executable on the processor; when loaded into the processor, the computer program implements the electricity theft detection method based on a dynamically generated residual graph convolutional neural network.
The embodiment of the invention also provides a storage medium storing a computer program which, when executed by a processor, implements the electricity theft detection method based on a dynamically generated residual graph convolutional neural network.

Claims (8)

1. An electricity theft detection method using a dynamically generated residual graph convolutional neural network, characterized by comprising the following steps:
(1) collecting raw user electricity consumption data in a power system;
(2) filling in missing values and handling outliers in the raw data, and dividing the data into a training set, a validation set and a test set;
(3) transforming the preprocessed one-dimensional power load curve into a two-dimensional power load feature matrix X as input to the graph convolutional neural network;
(4) taking each period of the data as a node, computing the correlation among nodes using a dynamic topological graph generation method, and obtaining the optimal parameters through training so as to obtain the adjacency matrix A that best represents the relationships in the data; the dynamic topological graph generation method adaptively learns the graph adjacency matrix to capture implicit connections in the data, and comprises the following steps:
(41) randomly generating n node vectors, initializing a weight parameter W and a bias parameter b, and generating an initial relation measure between nodes by a formula in which an activation function is applied and a hyperparameter controls the oscillation magnitude of the preliminary relation measure; V_n is the implicit representation of the node vector E_n, and V_i, V_j denote the i-th and j-th vectors in the sequence;
(42) sorting the initial relation measure by row and by column respectively, setting a selected number of the largest values to 1 and the remaining values to 0 to obtain the adjacency matrix A, wherein the number of values selected is a hyperparameter taken from [20, 25, 30, 35, 40, …, 80];
(5) feeding the feature matrix X and the adjacency matrix A into the residual graph convolutional neural network to extract latent features, and obtaining the final classification through a pooling layer and a fully connected layer.
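Steps (41) and (42) can be illustrated with a small NumPy sketch. The formula for the relation measure is not reproduced in this text, so the forms used here (a tanh of the linearly transformed node vectors, followed by a tanh of their pairwise inner products) are assumptions chosen only to fit the description of an activation function and a scaling hyperparameter. The names `dynamic_adjacency`, `alpha` and `k` are hypothetical, and only row-wise selection of the largest values is shown. In training, E, W and b would be learnable parameters; this sketch covers the forward computation only.

```python
import numpy as np


def dynamic_adjacency(n_nodes, dim, k, alpha=3.0, seed=0):
    """Build a 0/1 adjacency matrix from randomly initialized node vectors."""
    rng = np.random.default_rng(seed)
    E = rng.standard_normal((n_nodes, dim))      # random node vectors
    W = rng.standard_normal((dim, dim)) * 0.1    # weight parameter
    b = np.zeros(dim)                            # bias parameter
    # Assumed form: implicit representation of each node vector.
    V = np.tanh(alpha * (E @ W + b))
    # Assumed form: pairwise relation measure between V_i and V_j.
    M = np.tanh(alpha * (V @ V.T))
    # Step (42), row-wise: keep the k largest entries of every row as 1.
    top = np.argpartition(-M, k - 1, axis=1)[:, :k]
    A = np.zeros_like(M)
    np.put_along_axis(A, top, 1.0, axis=1)
    return A


A = dynamic_adjacency(n_nodes=100, dim=16, k=20)
print(A.shape, int(A.sum(axis=1)[0]))
```

The value `k=20` is the smallest entry of the hyperparameter list in step (42); the node count and vector dimension are arbitrary choices for the example.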
2. The electricity theft detection method using a dynamically generated residual graph convolutional neural network according to claim 1, characterized in that step (1) is specifically: the raw data are daily electricity consumption records of 42372 users over 1035 days; the data of a single user form a time series of size 1×1035.
3. The electricity theft detection method using a dynamically generated residual graph convolutional neural network according to claim 1, characterized in that step (2) comprises the following steps:
(21) handling electricity consumption data with missing values: users whose consumption data are more than 50% missing are deleted; otherwise the missing data are filled by linear interpolation, in which the three values before and the three values after the missing value are taken out, any of these values that is itself missing or null is discarded, and the values taken out (six at most) form one group; the fill value is computed from this group with an interpolation formula, in which the electricity consumption on day c is written as NaN when it is empty;
(22) an outlier is an individual value in a sample that deviates significantly from the other observations; outliers are defined using the three-sigma rule, by a formula in terms of the mean of a single sample and its standard deviation;
(23) standardizing the data by normalization, mapping all data into the interval [0, 1];
(24) dividing the processed data set into a training set, a validation set and a test set in the ratio 6:2:2.
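The preprocessing steps (21) to (24) can be sketched as below. Several details are assumptions, because the interpolation and outlier formulas are not reproduced in this text: missing readings are filled with plain linear interpolation between the nearest valid days (`np.interp`) rather than the six-neighbor grouping of step (21); values outside mean ± 3·std are clipped to that bound; normalization is min-max per user; and the split is a random shuffle. The names `preprocess` and `split_indices` are hypothetical.

```python
import numpy as np


def preprocess(X):
    """X: (users, days) daily consumption, NaN marks a missing reading."""
    X = np.asarray(X, dtype=float)
    # (21) Drop users with more than 50% missing readings.
    keep = np.isnan(X).mean(axis=1) <= 0.5
    X = X[keep]  # boolean indexing copies, so the input is not modified
    days = np.arange(X.shape[1])
    for row in X:  # each row is a view, so edits are written back into X
        missing = np.isnan(row)
        if missing.any():
            # Stand-in for the claimed formula: linear interpolation
            # between the nearest valid readings.
            row[missing] = np.interp(days[missing], days[~missing], row[~missing])
        # (22) Three-sigma rule per user; clipping is an assumed treatment.
        mu, sigma = row.mean(), row.std()
        np.clip(row, mu - 3 * sigma, mu + 3 * sigma, out=row)
        # (23) Min-max normalization to [0, 1].
        span = row.max() - row.min()
        row[:] = (row - row.min()) / span if span > 0 else 0.0
    return X, keep


def split_indices(n, seed=0):
    """(24) Shuffle n sample indices and split them 6:2:2."""
    idx = np.random.default_rng(seed).permutation(n)
    n_train, n_val = n * 6 // 10, n * 2 // 10
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


rng = np.random.default_rng(1)
raw = rng.uniform(1.0, 5.0, size=(10, 30))
raw[0, :20] = np.nan   # more than 50% missing: this user is dropped
raw[1, 5] = np.nan     # a single gap: filled by interpolation
raw[2, 7] = 100.0      # an outlier: clipped by the three-sigma rule
clean, keep = preprocess(raw)
train, val, test = split_indices(len(clean))
print(clean.shape, len(train), len(val), len(test))
```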
4. The electricity theft detection method using a dynamically generated residual graph convolutional neural network according to claim 1, characterized in that step (3) is specifically: the preprocessed one-dimensional power load curve, which contains the daily electricity consumption data of N sample users, is reconstructed into a two-dimensional feature matrix, which together with the adjacency matrix serves as the input to the graph convolution module.
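The reconstruction of a one-dimensional load curve into a two-dimensional matrix can be illustrated as follows. The dimensions of the reconstructed matrix are not recoverable from this text, so the sketch assumes a period length of 7 days (one node per week) and drops the trailing days that do not fill a whole period; both choices, and the name `to_feature_matrix`, are hypothetical.

```python
import numpy as np


def to_feature_matrix(series, period=7):
    """Reshape (..., days) load curves into (..., nodes, period) matrices.

    Each row of the result is one period of the curve, i.e. one graph node.
    Days beyond the last complete period are discarded (an assumption).
    """
    series = np.asarray(series, dtype=float)
    n_nodes = series.shape[-1] // period
    usable = n_nodes * period
    return series[..., :usable].reshape(*series.shape[:-1], n_nodes, period)


single_user = np.arange(1035, dtype=float)   # one user's 1x1035 series
X = to_feature_matrix(single_user)
batch = to_feature_matrix(np.zeros((4, 1035)))
print(X.shape, batch.shape)
```

With 1035 days and a 7-day period, 147 complete periods are kept and the last 6 days are dropped; a different period length would change the number of nodes accordingly.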
5. The electricity theft detection method using a dynamically generated residual graph convolutional neural network according to claim 1, characterized in that the residual graph convolutional neural network of step (5) comprises a 1×1 convolution module and a residual MixHop graph convolution module, and step (5) specifically comprises the following steps:
(51) increasing the number of channels of the feature matrix X through the 1×1 convolution module;
(52) extracting temporal and spatial features and latent features among user power curves through the residual MixHop graph convolution module; the residual MixHop graph convolution module consists of a plurality of graph convolution (GC) layers, each GC layer is formed by superposing two MixHop modules, and residual connections are used between the GC layers; given the adjacency matrix A, the propagation is defined by formulas whose terms are:
the node representation of the k-th layer, obtained by the graph convolution operation;
a hyperparameter controlling the proportion of the root node's original state that is retained;
the input hidden state of the first layer, obtained by multiplying the adjacency matrix by the feature matrix;
the hidden state of the previous layer, i.e. the node features of the previous layer;
the normalized result of adding the identity matrix I to the matrix A;
the degree matrix, whose diagonal elements are the node degrees, and the inverse of the degree matrix;
the propagation depth K, the model weight parameters, and the output hidden state of the current layer;
(53) filtering redundant information in the network with a max pooling layer, which selects the maximum value in each rectangular sub-region and passes it to the fully connected layer; in the pooling formula, the output is the result of the max pooling operation and the input is the set of values of a rectangular sub-region of the input feature map;
(54) converting the extracted node features into prediction scores to obtain the final classification result, by a formula in which a SoftMax function is applied and W_FC, B_FC are the parameters of the fully connected layer;
(55) updating each parameter of the model by gradient descent, with binary cross entropy as the loss function:
$$L = -\left[\, Y \log P + (1 - Y)\log(1 - P) \,\right]$$
where Y is the actual class label, 0 or 1, and P is the probability predicted by the model that the sample belongs to class 1;
(56) Training is ended when the model's loss over the training set no longer drops significantly.
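The propagation in step (52) and the loss in step (55) can be sketched in NumPy. The propagation formulas are not reproduced in this text, so the sketch assumes the commonly used mix-hop form that matches the listed terms: a retain ratio `beta` mixes the input state with the propagated state, the adjacency matrix is normalized as the inverse degree matrix times (A + I), and the states of all K hops are combined through per-hop weights. Stacking the two MixHop modules one after the other with a ReLU in between is also an assumption. The 1×1 convolution, pooling and SoftMax stages are omitted, and all function names are hypothetical.

```python
import numpy as np


def normalize_adjacency(A):
    """Return D^-1 (A + I), where D is the degree matrix of A + I."""
    A_tilde = A + np.eye(A.shape[0])
    return A_tilde / A_tilde.sum(axis=1, keepdims=True)


def mixhop_propagation(H_in, A, weights, beta=0.05):
    """Assumed mix-hop form with propagation depth K = len(weights) - 1."""
    A_norm = normalize_adjacency(A)
    H = H_in
    H_out = H @ weights[0]                 # hop 0: the input state itself
    for W_k in weights[1:]:
        # Retain a share beta of the original state at every hop.
        H = beta * H_in + (1.0 - beta) * (A_norm @ H)
        H_out = H_out + H @ W_k            # combine hops via their weights
    return H_out


def residual_gc_layer(H_in, A, weights_a, weights_b, beta=0.05):
    """Two MixHop modules (stacked here, by assumption) plus a residual."""
    H = np.maximum(mixhop_propagation(H_in, A, weights_a, beta), 0.0)
    H = mixhop_propagation(H, A, weights_b, beta)
    return H + H_in                        # residual connection


def binary_cross_entropy(y, p, eps=1e-12):
    """Mean of -[y log p + (1 - y) log(1 - p)] over the samples."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(np.mean(-(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))))


rng = np.random.default_rng(0)
n, d, K = 6, 4, 2
A = (rng.random((n, n)) > 0.5).astype(float)
H_in = rng.standard_normal((n, d))
weights_a = [rng.standard_normal((d, d)) * 0.1 for _ in range(K + 1)]
weights_b = [rng.standard_normal((d, d)) * 0.1 for _ in range(K + 1)]
out = residual_gc_layer(H_in, A, weights_a, weights_b)
print(out.shape)
print(binary_cross_entropy(np.array([1.0, 0.0]), np.array([0.9, 0.2])))
```

Square weight matrices keep the feature dimension unchanged, which is what allows the residual addition without an extra projection.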
6. An electricity theft detection system using a dynamically generated residual graph convolutional neural network, characterized by comprising:
a collection module, configured to collect raw user electricity consumption data in a power system;
a preprocessing module, configured to fill in missing values in the raw data, handle outliers, and divide the data into a training set, a validation set and a test set;
a conversion module, configured to convert the preprocessed one-dimensional power load curve into a two-dimensional power load feature matrix X as input to the graph convolutional neural network;
an adjacency matrix module, configured to take each period of the data as a node, compute the correlation among nodes using a dynamic topological graph generation method, and find the optimal parameters through training so as to obtain the adjacency matrix A that best represents the relationships in the data; the dynamic topological graph generation method adaptively learns the graph adjacency matrix to capture implicit connections in the data, and comprises the following steps:
(41) randomly generating n node vectors, initializing a weight parameter W and a bias parameter b, and generating an initial relation measure between nodes by a formula in which an activation function is applied and a hyperparameter controls the oscillation magnitude of the preliminary relation measure; V_n is the implicit representation of the node vector E_n, and V_i, V_j denote the i-th and j-th vectors in the sequence;
(42) sorting the initial relation measure by row and by column respectively, setting a selected number of the largest values to 1 and the remaining values to 0 to obtain the adjacency matrix A, wherein the number of values selected is a hyperparameter taken from [20, 25, 30, 35, 40, …, 80];
a classification module, configured to feed the feature matrix X and the adjacency matrix A into the residual graph convolutional neural network to extract latent features, and to obtain the final classification through a pooling layer and a fully connected layer.
7. An electricity theft detection device using a dynamically generated residual graph convolutional neural network, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the computer program, when loaded into the processor, implements the electricity theft detection method using a dynamically generated residual graph convolutional neural network according to any one of claims 1-6.
8. A storage medium for electricity theft detection using a dynamically generated residual graph convolutional neural network, the storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the electricity theft detection method using a dynamically generated residual graph convolutional neural network according to any one of claims 1-6.
CN202410046231.XA 2024-01-12 2024-01-12 Power theft detection method and system for dynamically generated residual error graph convolution neural network Active CN117556369B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410046231.XA CN117556369B (en) 2024-01-12 2024-01-12 Power theft detection method and system for dynamically generated residual error graph convolution neural network

Publications (2)

Publication Number Publication Date
CN117556369A CN117556369A (en) 2024-02-13
CN117556369B true CN117556369B (en) 2024-04-19

Family

ID=89817108

Country Status (1)

Country Link
CN (1) CN117556369B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant