CN116628605A - Method and device for electricity stealing classification based on ResNet and DSCAttention mechanism - Google Patents


Info

Publication number
CN116628605A
Authority
CN
China
Prior art keywords
data
model
formula
time sequence
electricity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310615835.7A
Other languages
Chinese (zh)
Inventor
方立雄
张建文
刘梦爽
段志尚
何哲伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Marketing Service Center Of State Grid Xinjiang Electric Power Co ltd Capital Intensive Center Metering Center
Original Assignee
Marketing Service Center Of State Grid Xinjiang Electric Power Co ltd Capital Intensive Center Metering Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Marketing Service Center Of State Grid Xinjiang Electric Power Co ltd Capital Intensive Center Metering Center filed Critical Marketing Service Center Of State Grid Xinjiang Electric Power Co ltd Capital Intensive Center Metering Center
Priority to CN202310615835.7A priority Critical patent/CN116628605A/en
Publication of CN116628605A publication Critical patent/CN116628605A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/2431Multiple classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications


Abstract

The invention discloses a method for electricity theft classification based on ResNet and the DSCAttention mechanism, which comprises the following steps. Step 1: perform data analysis on the obtained data, obtain time series data features with strong identifiability through trend-season analysis and correlation analysis, and prepare the design and construction of the model accordingly. Step 2: enter data preprocessing, analyze and handle missing values in the data, construct a mask matrix, normalize the data with quantile transformation to reduce the model's sensitivity to outliers, and divide the training set and validation set by stratified splitting of the data set. Step 3: enter the model construction and training stage, and adjust different hyperparameters during training to find the hyperparameter combination with the best classification performance, until the final electricity theft classification model is formed. The invention can classify and identify normal users and electricity theft users in an electricity theft time series data set, and calculate the probability that a user is stealing electricity.

Description

Method and device for electricity stealing classification based on ResNet and DSCAttention mechanism
Technical Field
The invention belongs to the technical field of artificial intelligence and electricity theft detection, and particularly relates to a method and device for electricity theft classification based on ResNet and the DSCAttention mechanism.
Background
At present, methods for electricity theft identification and detection focus on the complex intrinsic features of electricity theft time series data, and few algorithms use time series correlation for identification and classification. Existing methods that do learn electricity theft time series correlation mostly adopt LSTM neural networks, so little attention has been paid to introducing an attention mechanism into the electricity theft identification and classification task.
Existing products for electricity theft identification therefore have the following drawback: on the classification task they mainly attend to the intrinsic complex features of electricity theft data, such as time series period and trend features, and most adopt convolutional neural network algorithms to extract these features for classification; learning time series correlation by introducing an attention mechanism remains underdeveloped and insufficiently researched.
Disclosure of Invention
The invention aims to provide a method and a device for electricity theft classification based on ResNet and the DSCAttention mechanism, which can classify and identify normal users and electricity theft users in an electricity theft time series data set and calculate the probability that a user is stealing electricity.
In order to achieve the above purpose, the invention adopts the following technical scheme: a method for electricity theft classification based on ResNet and the DSCAttention mechanism, comprising the following steps:
step 1, first performing data analysis on the obtained data, obtaining time series data features with strong identifiability through trend-season analysis and correlation analysis, and preparing the design and construction of the model accordingly;
step 2, entering the data preprocessing stage: analyzing and handling missing values in the data, constructing a mask matrix, normalizing the data with quantile transformation to reduce the model's sensitivity to outliers, and dividing the training set and validation set by stratified splitting of the data set;
and step 3, after the data are processed, entering the construction and training stage of the deep learning model, adjusting different hyperparameters during training to find the hyperparameter combination with the best classification performance, until the final electricity theft classification model is formed.
Furthermore, the electricity theft classification model takes the ResNet18 network as its basic structure and introduces a depthwise separable convolution enhanced self-attention layer on that basis, so that the model can extract the intrinsic complex features of electricity theft time series while also capturing the correlation between time steps.
Further, the number of output channels of the second and third convolution layers of the ResNet18 network is modified, and the channel-by-channel convolution of a depthwise separable convolution structure and a channel-by-channel self-attention mechanism are introduced; the remaining residual structure of ResNet18 is then replaced with a pointwise convolution layer and a two-layer fully connected neural network, where the pointwise convolution layer together with the preceding channel-by-channel convolution forms a complete depthwise separable convolution, and the two-layer fully connected network serves as the classifier.
Further, in step 1, the time series is decomposed with an additive model into trend term and season term features, so that its intrinsic features can be analyzed easily; the time series additive model is shown in formula (1.5):
T_t = Trend_t + Seasonal_t + Resid_t (1.5)
In formula (1.5), T_t is the original time series, Trend_t is the trend term of the additive model, Seasonal_t is the season term, and Resid_t is the residual term.
Further, the trend term Trend_t of the time series additive model is calculated as shown in formula (1.6):
Trend_t = (1/m) Σ_{i=-k}^{k} y_{t+i} (1.6)
In formula (1.6), y_{t+i} is the value at time t+i in the time series T, and m = 2k+1 denotes an m-order moving average;
a partitioning method with a 7-day period is adopted to convert the one-dimensional time series data into two-dimensional space time series data, so the order (period) of the moving average is 7, i.e. k = 3; according to the definition of the additive model in formula (1.5), the sum of the seasonal term and the residual term is the time series minus the trend term;
defining the set of t+i as {t+1, t+2, ..., t+m}, the seasonal term of the time series additive model is calculated as shown in formula (1.7):
Seasonal_t + Resid_t = T_t − Trend_t (1.7)
Further, in step 1, time series data in the Cartesian coordinate system are converted into the polar coordinate system through the Gramian Angular Field (GAF), and the time series correlation between different times is analyzed by calculating the included angles between data features at each time in polar coordinates; the GAF is calculated as shown in formula (1.8):
G_{ij} = cos(φ_i + φ_j), i, j = 1, ..., n (1.8)
In formula (1.8), G is a square matrix of size n×n, and terms of the form cos(φ_i + φ_j) are the special inner product defined by GAF; let the one-dimensional vector used to calculate GAF be X = {x_1, x_2, ..., x_n}, so the length of vector X is n, and φ_1 in the formula is arccos(x_1).
Further, the specific calculation of cos(φ_i + φ_j) is shown in formula (1.9):
cos(φ_i + φ_j) = cos(arccos(x_i) + arccos(x_j)) (1.9)
In formula (1.9), i and j are indices into the one-dimensional vector X; time correlation analysis is performed on the GAF values of normal and abnormal electricity consumption data according to formulas (1.8) and (1.9).
Further, in step 2, a mask matrix of the same size as the sample data is first created and initialized to all zeros; the positions of missing values in the sample are then determined, and a 1 is set at the same positions in the mask matrix; by stacking the sample with its mask matrix of the same size, a feature matrix with 2 channels is finally constructed.
Further, the missing values are filled according to formula (1.10):
f(x_i) = x_i, if x_i is not missing; f(x_i) = 0, if x_i is missing (1.10)
In formula (1.10), i denotes a position in the original data matrix, x_i is the value at position i, and f(x_i) is the filled value after processing according to formula (1.10);
formula (1.11) gives the construction of the mask matrix used with this filling method:
g(x_i) = 0, if x_i is not missing; g(x_i) = 1, if x_i is missing (1.11)
where i denotes a position in the original data matrix, x_i is the value at position i, and g(x_i) is the corresponding entry of the mask matrix transformed according to formula (1.11).
Further, the data are standardized using quantile transformation before being input into the model for training; meanwhile, the training set and validation set are randomly split by stratified sampling.
An apparatus for electricity theft classification based on ResNet and the DSCAttention mechanism, comprising:
the data analysis module is used for carrying out data analysis on the obtained data, obtaining time sequence data characteristics with stronger identifiability through trend season analysis and correlation analysis, and preparing the design and construction of the model according to the time sequence data characteristics;
the data processing module is used for preprocessing data, analyzing and processing missing values of the data, constructing a mask matrix, normalizing the data by using quantile transformation to reduce the sensitivity of the model to abnormal values, and dividing a training set and a verification set by using a method of splitting the data set in a layering way.
The invention has the characteristics and effects that:
(1) ResNet is used as the base network and the DSCAttention depthwise separable convolution enhanced self-attention mechanism is introduced; combining DSCAttention with the convolutional part of ResNet extracts and learns both the intrinsic complex features of electricity theft time series data and the correlation between time steps, and these features guide the classifier.
(2) Since electricity theft identification is a binary classification problem, a cross-entropy loss function guides the adjustment of each layer's neural network parameters during training, so that the model is trained optimally and reaches the best solution to the electricity theft classification problem.
(3) In addition to missing values, outliers exist in the electricity theft data. In deep learning methods the model cannot operate on null values directly, so the missing values are filled by zero padding; however, during training outliers can attract excessive attention, making it difficult for the model to learn the true distribution of the data, and the quantile transformation method effectively reduces the model's sensitivity to outliers.
(4) In real-world data, electricity theft data is often unbalanced, and the problem can be alleviated to some extent by using a method of hierarchically splitting a training set and a verification set.
In summary, the invention performs better in electricity theft classification.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
FIG. 1 is a diagram of a power theft classification model framework based on ResNet and DSCAttention mechanisms.
Fig. 2 is a schematic diagram of a depth separable convolution operation.
Fig. 3 is a schematic diagram of a channel-by-channel self-attention calculation.
Fig. 4 is a schematic diagram of the self-attention mechanism of convolution enhancement.
FIG. 5 is a schematic diagram of a depth separable convolution enhanced self-attention mechanism.
FIG. 6 is a flowchart of an overall implementation of a power theft classification algorithm.
Fig. 7 shows ROC graphs for different models.
Detailed Description
The invention will now be described in detail with reference to the drawings and specific examples.
A method for electricity theft classification based on ResNet and the DSCAttention mechanism selects the ResNet18 network as the basic network structure and introduces a depthwise separable convolution enhanced self-attention layer on that basis, so that the model can both extract the intrinsic complex features of electricity theft time series and capture the correlation between time steps, achieving better accuracy on the final classification task. This yields an electricity theft classification model based on ResNet and the DSCAttention mechanism, whose overall framework is shown in figure 1.
1. ResNet residual network structure
The invention modifies the conventional ResNet18 network by adding an attention layer and a depthwise separable convolution structure; the per-layer network parameters of the final model are shown in table 1. The model modifies the first convolution layer and the max pooling layer of ResNet18, reducing the down-sampling factor of the feature input and retaining more useful information. At the second and third convolution layers of ResNet18, in addition to modifying their numbers of output channels, the model introduces the channel-by-channel convolution of a depthwise separable convolution structure together with the channel-by-channel self-attention mechanism used in the invention. The model then replaces the remaining residual structure of ResNet18 with a pointwise convolution layer and a two-layer fully connected neural network. The pointwise convolution layer and the preceding channel-by-channel convolution form a complete depthwise separable convolution, and the two-layer fully connected network serves as the classifier.
Table 1 comparison of network parameters
1.1 Softmax function
The Softmax activation function is used for multi-class problems: it maps the inputs of multiple neurons into [0, 1] so that the mapped elements of the vector sum to 1. Its calculation formula is:
Softmax(x_i) = e^{x_i} / Σ_j e^{x_j} (1.1)
In formula (1.1), i and j index the classes of the multi-class problem, x_i is the i-th class value of the input x mapped through the exponential function, and Σ_j e^{x_j} is the sum of the exponentials of all class values.
1.2 ReLU activation function
The formula of the ReLU activation function is:
ReLU(x) = max(0, x) (1.2)
In formula (1.2), x is the input value and the ReLU function takes the maximum of 0 and x, so when the input is negative the function outputs 0 and the neuron is not activated, which makes it unsuitable when negative inputs carry information.
1.3 PReLU activation function
The formula of the PReLU activation function is:
PReLU(x) = max(0, x) + a · min(0, x) (1.3)
In formula (1.3), x is the input value and a is the slope of PReLU in the region where the input is negative. When x is positive, PReLU behaves exactly like the ReLU activation function. When x is negative, PReLU takes the minimum of 0 and x and multiplies it by the slope a of the negative region.
2. Convolution feature extraction layer and DSCAttention mechanism
2.1 Electricity theft data complex feature extraction based on convolutional neural networks
The convolutional neural network used by the invention for time series trend term and seasonal feature extraction consists of the three consecutive layers conv3-16, conv3-16 and depthwise3-16 in table 1. The first two conv3-16 layers are conventional convolution layers with kernel size 3 and 16 output channels; the final depthwise3-16 layer is a channel-by-channel convolution, which combined with the pointwise1-48 pointwise convolution of table 1 forms a complete depthwise separable convolution, whose calculation process is shown in fig. 2.
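A depthwise separable convolution of the kind shown in fig. 2 can be sketched in plain numpy. The shapes below are illustrative and do not reproduce the exact channel counts of table 1:

```python
import numpy as np

def depthwise_separable_conv(x, dw_kernels, pw_weights):
    """x: (C, H, W) feature map; dw_kernels: (C, k, k), one filter per channel
    (channel-by-channel convolution); pw_weights: (C_out, C), the 1x1
    point-by-point convolution that mixes channels. Stride 1, no padding."""
    C, H, W = x.shape
    k = dw_kernels.shape[1]
    Ho, Wo = H - k + 1, W - k + 1
    dw_out = np.zeros((C, Ho, Wo))
    for c in range(C):                       # each channel convolved separately
        for i in range(Ho):
            for j in range(Wo):
                dw_out[c, i, j] = np.sum(x[c, i:i+k, j:j+k] * dw_kernels[c])
    # pointwise 1x1 convolution: linear mix of channels at every position
    return np.tensordot(pw_weights, dw_out, axes=([1], [0]))

x = np.ones((2, 5, 5))     # 2 channels, 5x5 map (illustrative)
dw = np.ones((2, 3, 3))    # one 3x3 filter per channel
pw = np.eye(2)             # identity pointwise mixing
out = depthwise_separable_conv(x, dw, pw)
```

Splitting the spatial (depthwise) and channel-mixing (pointwise) steps is what makes this structure cheaper than a full convolution with the same receptive field.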
2.2 Power stealing timing dependency learning based on DSCAttention mechanism
The DSCAttention mechanism consists of the two layers Attention-8 and depthwise3-8 in table 1. Attention-8 is a channel-by-channel multi-head self-attention mechanism; its calculation process is shown in fig. 3. depthwise3-8 is the channel-by-channel convolution enhancement part of the channel-by-channel multi-head self-attention; its calculation is shown in fig. 4. Together, Attention-8 and depthwise3-8 are referred to as the DSCAttention depthwise separable convolution enhanced self-attention mechanism, whose overall computation is shown in fig. 5.
(1) Channel-by-channel self-attention mechanism
The attention mechanism is a method of attending to global information. Convolutional neural networks are strong at extracting local information but give little consideration to global information. In electricity theft time series data, a user's consumption behavior exhibits long-range correlation, which local convolution cannot learn.
Attention mechanisms mainly comprise three types: soft attention, hard attention, and self-attention. The attention mechanism used by the invention is self-attention, calculated as formula (1.4):
Attention(Q, K, V) = softmax(Q K^T / √d_k) V (1.4)
In formula (1.4), Q, K and V are obtained from the input matrix X through the weight parameters W_Q, W_K and W_V respectively (Q = X W_Q, K = X W_K, V = X W_V), and d_k is the dimension of K; the self-attention distribution of the input matrix X is computed by substituting Q, K and V into formula (1.4). The self-attention distribution method used here is a new multi-head self-attention mechanism combined with the channel attention idea; its calculation process is shown in fig. 3.
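Formula (1.4) in its plain single-head form can be sketched as follows; the random input shapes are illustrative, and the patent's channel-by-channel multi-head variant of fig. 3 is not reproduced here:

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention:
    Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V,
    with Q = X W_q, K = X W_k, V = X W_v."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    A = np.exp(scores)
    A = A / A.sum(axis=-1, keepdims=True)   # self-attention distribution
    return A @ V, A

rng = np.random.default_rng(0)
X = rng.standard_normal((7, 4))          # 7 time steps, 4 features
W = [rng.standard_normal((4, 4)) for _ in range(3)]
out, A = self_attention(X, *W)
```

Each row of the attention matrix A is a distribution over all time steps, which is how the mechanism captures the long-range correlation that local convolution misses.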
(2) Convolution enhanced self-attention mechanism
The invention adopts the convolution enhancement idea of the Conformer model as a whole, makes minor modifications to its structure, replaces its self-attention part with the channel-by-channel self-attention calculation used here, and optimizes the convolution enhancement part for 2D convolution operations. The convolution enhanced self-attention mechanism is shown in fig. 4.
(3) Depth separable convolution enhanced self-attention mechanism
Since the self-attention mechanism used in the invention is a channel-by-channel self-attention method, the conventional convolutional neural network in the convolution enhancement part is replaced with a channel-by-channel convolutional neural network, forming the depthwise separable convolution enhanced self-attention method DSCAttention, whose calculation process is shown in fig. 5.
In fig. 5, the input matrix X passes through the channel-by-channel self-attention calculation and outputs a self-attention distribution matrix, which represents the correlation between time steps that the model has learned from the electricity theft data. A depthwise separable convolutional neural network is then applied to the self-attention distribution matrix to compensate for the self-attention mechanism's lack of local attention to intra-period and neighboring-period correlations.
3. Electricity larceny classification algorithm implementation flow
(1) Firstly, data analysis is carried out on the obtained data, time sequence data characteristics with stronger identifiability are obtained through trend season analysis and correlation analysis, and the design and construction of a deep learning algorithm model are prepared accordingly.
The intrinsic features of non-stationary time series data generally appear in a more complex form, so that it is difficult to perform feature analysis on the original sequence data. The time series is decomposed by using an additive model, and the intrinsic characteristics of the time series are decomposed into trend term and season term characteristics, so that the intrinsic characteristics of the time series can be easily analyzed. The time series additive model is shown in formula (1.5).
T_t = Trend_t + Seasonal_t + Resid_t (1.5)
In formula (1.5), T_t is the original time series, Trend_t is the trend term of the time series additive model, Seasonal_t is the season term, and Resid_t is the residual term. The trend term Trend_t of the additive model is calculated as shown in formula (1.6):
Trend_t = (1/m) Σ_{i=-k}^{k} y_{t+i} (1.6)
In formula (1.6), y_{t+i} is the value at time t+i in the time series T, and m = 2k+1 denotes an m-order moving average. Here a 2D convolutional neural network is used for time series complex feature extraction, so a partitioning method with a 7-day period converts the one-dimensional time series data into two-dimensional space time series data; the order (period) of the moving average is therefore 7, i.e. k = 3. According to the definition of the additive model in formula (1.5), the sum of the seasonal term and the residual term is the time series minus the trend term. Defining the set of t+i as {t+1, t+2, ..., t+m}, the seasonal term of the time series additive model is calculated as shown in formula (1.7):
Seasonal_t + Resid_t = T_t − Trend_t (1.7)
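The 7-order centered moving average of formula (1.6) and the detrending step behind formula (1.7) can be sketched as follows; the 14-day linear series is synthetic, used only to make the result easy to verify:

```python
import numpy as np

def trend_moving_average(y, period=7):
    # centered m-order moving average: m = period (odd), k = (m - 1) // 2
    k = (period - 1) // 2
    kernel = np.ones(period) / period
    trend = np.full(len(y), np.nan)       # edges have no full window
    trend[k:len(y) - k] = np.convolve(y, kernel, mode="valid")
    return trend

y = np.arange(1.0, 15.0)                  # 14 days of synthetic load
trend = trend_moving_average(y, period=7)
detrended = y - trend                     # Seasonal_t + Resid_t
```

For a purely linear series the moving average reproduces the series itself away from the edges, so the detrended part is zero there.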
The Gramian Angular Field (GAF) is a method for converting time series data into spatial data. It converts time series data from the Cartesian coordinate system into the polar coordinate system, and the correlation of the time series between different times is analyzed by calculating the included angles between data features at each time in polar coordinates. The GAF calculation is shown in formula (1.8):
G_{ij} = cos(φ_i + φ_j), i, j = 1, ..., n (1.8)
In formula (1.8), G is a square matrix of size n×n, and terms of the form cos(φ_i + φ_j) are the special inner product defined by GAF. Let the one-dimensional vector used to calculate GAF be X = {x_1, x_2, ..., x_n}, so the length of vector X is n. In formula (1.8), φ_1 is arccos(x_1), and the specific calculation of cos(φ_i + φ_j) is shown in formula (1.9):
cos(φ_i + φ_j) = cos(arccos(x_i) + arccos(x_j)) (1.9)
In formula (1.9), i and j are indices into the one-dimensional vector X. Time correlation analysis is performed on the GAF values of normal and abnormal electricity consumption data according to formulas (1.8) and (1.9).
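Formulas (1.8) and (1.9) can be sketched together in a few lines of numpy; the three-point input series is illustrative and assumed to be already rescaled into [-1, 1]:

```python
import numpy as np

def gramian_angular_field(x):
    """x: 1-D series scaled into [-1, 1]. Returns the n x n GAF matrix
    G[i, j] = cos(arccos(x_i) + arccos(x_j))."""
    phi = np.arccos(np.clip(x, -1.0, 1.0))   # polar-coordinate angles
    return np.cos(phi[:, None] + phi[None, :])

G = gramian_angular_field(np.array([1.0, 0.0, -1.0]))
```

The resulting matrix is symmetric, and each entry encodes the angular relationship between two time steps, giving the 2D spatial representation the text describes.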
(2) After the feature analysis of the data, the data preprocessing stage begins. This stage mainly analyzes and handles missing values in the data and constructs a mask matrix. The data are normalized with quantile transformation to reduce the model's sensitivity to outliers, and the training set and validation set are divided by stratified splitting of the data set.
The invention handles missing sample data by filling rather than discarding. First, a mask matrix of the same size as the sample data is created and initialized to all zeros. The positions of missing values in the sample are then determined, and a 1 is set at the same positions in the mask matrix. By stacking the sample with its mask matrix of the same size, a feature matrix with 2 channels is finally constructed.
The missing value filling method is given by formula (1.10):
f(x_i) = x_i, if x_i is not missing; f(x_i) = 0, if x_i is missing (1.10)
where i denotes a position in the original data matrix, x_i is the value at position i, and f(x_i) is the filled value after processing according to formula (1.10).
Formula (1.11) gives the construction of the mask matrix used with this filling method:
g(x_i) = 0, if x_i is not missing; g(x_i) = 1, if x_i is missing (1.11)
where i denotes a position in the original data matrix, x_i is the value at position i, and g(x_i) is the corresponding entry of the mask matrix transformed according to formula (1.11).
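The zero filling of formula (1.10) and the mask construction of formula (1.11) can be sketched together; the 2x2 sample with NaN entries is illustrative:

```python
import numpy as np

def fill_and_mask(data):
    """Zero-fill NaNs (formula 1.10) and build the companion mask that is 1
    where a value was missing and 0 elsewhere (formula 1.11); stack both
    into a 2-channel feature matrix."""
    missing = np.isnan(data)
    filled = np.where(missing, 0.0, data)
    mask = missing.astype(float)
    return np.stack([filled, mask])          # shape (2, H, W)

sample = np.array([[1.0, np.nan], [np.nan, 4.0]])
features = fill_and_mask(sample)
```

The mask channel lets the model distinguish a genuine zero reading from a zero that was written in place of a missing value.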
In general, the magnitudes of the data features differ greatly; when the model is trained on raw data, features with larger values dominate the learning process and degrade the model. The raw data therefore need to be standardized before being input into the model for training. A common standardization method is maximum-minimum normalization (Maximum and Minimum Normalization), whose formula is shown as (1.12):
x'_i = (x_i − min(X)) / (max(X) − min(X)) (1.12)
where min(X) is the minimum of X, max(X) is the maximum of X, and x_i is a value in X. Because of equipment misreporting and similar causes, the recorded data often contain outliers, and the maximum-minimum normalization method is sensitive to them; to solve this sensitivity problem, the quantile transformation method of data normalization is used.
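A rank-based quantile transformation can be sketched as below; in practice a library implementation (e.g. scikit-learn's QuantileTransformer) would typically be used, and the tie handling here is deliberately simplified:

```python
import numpy as np

def quantile_transform_1d(x):
    """Map values to their empirical quantiles in [0, 1]. Because the map is
    rank-based, a single extreme outlier lands at 1.0 instead of stretching
    the whole scale the way min-max normalization does."""
    ranks = np.argsort(np.argsort(x))        # rank of each value
    return ranks / (len(x) - 1)

x = np.array([1.0, 2.0, 3.0, 1000.0])        # 1000 is an outlier
q = quantile_transform_1d(x)
```

Under min-max normalization of the same vector, the first three values would all be squeezed below 0.002; the quantile transform keeps them evenly spread.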
Because of the unbalance of the data set, the method of randomly splitting the training set and the verification set can lead to few or no negative samples of the training set, the model can not learn knowledge of the negative samples, and can also lead to few or no negative samples of the verification set, and the evaluation index of the model loses the reference meaning of the negative samples. To solve the above problem, a method of randomly splitting the training set and the validation set by hierarchical sampling (Stratified Sampling) is adopted herein, as shown in formula (1.13).
D_new-normal : D_new-outlier = D_old-normal : D_old-outlier (1.13)
(3) When the data processing is finished, the deep learning model is built and trained; different hyperparameters are adjusted during training to find the hyperparameter combination with the best classification evaluation. Finally, the model with the best AUC is compared with other models to demonstrate the advantages of the model used herein.
In summary, the self-attention layer of the Conformer model of the present invention employs a conventional multi-head self-attention mechanism, while the DSCAttention part uses a channel-by-channel multi-head self-attention mechanism to compute the self-attention distribution on each feature channel. In the convolution-enhancement part, the Conformer model uses a 1D depthwise separable convolution structure to strengthen learning of the neighborhood-period correlation of the time series, while the DSCAttention part uses a 2D depthwise separable convolution structure to strengthen learning of the correlation both within a period and across neighboring periods of the time series. Meanwhile, the improved Conformer-based DSCAttention mechanism is introduced into the ResNet residual structure, compensating for the convolutional neural network's lack of attention to temporal correlation. For electricity-theft time-series data, electricity consumption is generally recorded over a month or a year; the consumption period is long, and the locality principle of convolutional neural networks limits their ability to process long time series. Therefore, fusing into ResNet a multi-head self-attention mechanism, which excels at processing long time series, is an effective way to remedy the residual convolutional neural network's weakness on long sequences.
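The parameter saving that motivates the depthwise separable convolution structure can be illustrated with a simple weight count; the channel and kernel sizes below are illustrative, not the patent's actual layer dimensions.

```python
def conv2d_params(c_in, c_out, k):
    """Weights of a standard 2D convolution (bias ignored)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Channel-by-channel (depthwise) convolution -- one k x k filter per
    input channel -- followed by a point-by-point (1 x 1) convolution,
    as in the depthwise separable convolution structure."""
    depthwise = c_in * k * k        # channel-by-channel convolution
    pointwise = c_in * c_out        # point-by-point convolution
    return depthwise + pointwise

standard = conv2d_params(64, 128, 3)                 # 64*128*9  = 73728
separable = depthwise_separable_params(64, 128, 3)   # 576 + 8192 = 8768
```

For these illustrative sizes the separable structure needs roughly 8x fewer weights, which is why it is a natural choice for enhancing attention layers without a large parameter cost.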
The present invention compares the evaluation indexes of random forest (RF), support vector machine (Support Vector Machine, SVM) and wide & deep convolutional neural network (WDCNN) on AUC, MAP@100 and MAP@200, as shown in Table 2.
TABLE 2 classification results
As can be seen from Table 2, the classification effect of the ResNet and DSCAttention mechanism proposed by the invention is the best, achieving an AUC of 91.92%, a MAP@100 of 98.58% and a MAP@200 of 96.77%. On the best AUC index of each model, it improves by 11.84% over the random forest (RF), by 10.34% over the support vector machine (SVM), and by 11.46% over the WDCNN model.
On the MAP@100 index, the method improves by 14.33% over the random forest (RF), by 9.23% over the support vector machine (SVM), and by 31.01% over the WDCNN model.
On the MAP@200 index, the method improves by 20.62% over the random forest (RF), by 13.04% over the support vector machine (SVM), and by 29.76% over the WDCNN model.
Fig. 7 shows the ROC curves of the present invention and the other comparative models. Observing the ROC curves of the four models, the curve of the invention lies above those of the WDCNN, SVM and RF models, so the invention performs better on electricity theft classification.
The foregoing has shown and described the basic principles, principal features and advantages of the invention. It should be understood by those skilled in the art that the above embodiments do not limit the scope of the present invention in any way, and all technical solutions obtained by equivalent substitution and the like fall within the scope of the present invention. The parts of the invention not described in detail are the same as, or can be implemented using, the prior art.

Claims (10)

1. A method for classifying electricity theft based on ResNet and DSCAttention mechanisms is characterized by comprising the following steps:
step 1, firstly, carrying out data analysis on the obtained data, obtaining time sequence data characteristics with stronger identifiability through trend-season analysis and correlation analysis, and preparing for the design and construction of the model according to the time sequence data characteristics;
step 2, entering the data preprocessing stage: analyzing and processing missing values in the data, constructing a mask matrix, normalizing the data with a quantile transformation to reduce the sensitivity of the model to abnormal values, and dividing the training set and the validation set by stratified splitting of the data set;
and step 3, after the data are processed, entering the construction and training stage of the deep learning model, and adjusting different hyperparameters during model training to find the hyperparameter combination with the best classification evaluation, until the final electricity stealing classification model is formed.
2. The method for electricity theft classification based on ResNet and DSCAttention mechanisms as claimed in claim 1, wherein the electricity theft classification model uses the ResNet18 network as its basic network structure and introduces a depthwise-separable-convolution-enhanced self-attention mechanism layer on that basis, so that the model can extract the intrinsic complex characteristics of the electricity theft time series while also taking into account the correlation within the time series.
3. The method of electricity theft classification based on the ResNet and DSCAttention mechanisms of claim 2, wherein the number of output channels of the second and third convolutional layers of the ResNet18 network is modified, and a channel-by-channel convolution and a channel-by-channel self-attention mechanism from the depthwise separable convolution structure are introduced; the remaining residual network structure of ResNet18 is then replaced with a point-by-point convolution layer and a two-layer fully-connected neural network, wherein the point-by-point convolution layer and the preceding channel-by-channel convolution form a complete depthwise separable convolution structure, and the two-layer fully-connected neural network serves as the classifier.
4. The method for classifying electricity theft based on ResNet and DSCAttention mechanisms according to claim 1, wherein in step 1 the time series is decomposed using an additive model, and the intrinsic features of the time series are separated into trend-term and seasonal-term features so that they can be analyzed easily; the time-series additive model is shown in formula (1.5):
T_t = Trend_t + Seasonal_t + Resid_t (1.5)
in formula (1.5), T_t is the original time series, Trend_t is the trend term of the time-series additive model, Seasonal_t is the seasonal term, and Resid_t is the residual term of the time-series additive model.
5. The method for electricity theft classification based on ResNet and DSCAttention mechanisms of claim 4, wherein the trend term Trend_t of the time-series additive model is calculated as shown in formula (1.6):
Trend_t = (1/m) * sum_{i=-k}^{k} y_{t+i} (1.6)
in formula (1.6), y_{t+i} is the value at time t+i in the time series T, and m = 2k+1 denotes an m-order moving average;
the one-dimensional time sequence data are converted into two-dimensional time sequence data using a division with a 7-day period, so the order (period) of the moving average is 7, that is, k = 3; according to the definition of the additive model in formula (1.5), the sum of the seasonal term and the residual term equals the time series minus the trend term;
defining the set of t+i as {t+1, t+2, ..., t+m}, the seasonal-term calculation formula of the time-series additive model is shown in (1.7).
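A minimal sketch of the decomposition in formulas (1.5)-(1.7) follows. It assumes the standard centered m-order moving average (m = 2k+1 = 7) for the trend term and a per-position average of the detrended series for the seasonal term; the function name and the test series are illustrative, not from the patent.

```python
import numpy as np

def decompose_additive(y, period=7):
    """Additive decomposition: trend (centered moving average, formula 1.6),
    seasonal (mean of detrended values at each position in the period),
    and residual (what remains, per formula 1.5)."""
    y = np.asarray(y, dtype=float)
    k = period // 2
    trend = np.full_like(y, np.nan)
    for t in range(k, len(y) - k):
        trend[t] = y[t - k:t + k + 1].mean()      # m-order moving average
    detrended = y - trend                          # seasonal + residual
    seasonal = np.array([np.nanmean(detrended[i::period])
                         for i in range(period)])
    seasonal = np.tile(seasonal, len(y) // period + 1)[:len(y)]
    resid = y - trend - seasonal
    return trend, seasonal, resid

y = np.tile([1., 2., 3., 4., 3., 2., 1.], 8)      # purely 7-periodic series
trend, seasonal, resid = decompose_additive(y, period=7)
```

For a purely periodic series the trend is constant (the period mean, 16/7) wherever the window fits, and the residual vanishes, which is a quick sanity check on the decomposition.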
6. The method for classifying electricity theft based on ResNet and DSCAttention mechanisms according to claim 1, wherein in step 1, time-series data in the Cartesian coordinate system are converted into the polar coordinate system by the Gramian Angular Field (GAF), and the correlation of the time series between different times is analyzed by calculating the angles between the data features at each time in the polar coordinate system; the calculation of GAF is shown in formula (1.8):
in formula (1.8), G is a square matrix of size n x n whose entries have the form cos(phi_i + phi_j), which is the special inner product defined by GAF; let the one-dimensional vector for which GAF is calculated be X = {x_1, x_2, ..., x_n}, so the length of the vector X is exactly n, and phi_1 in the formula is arccos(x_1).
7. The method of electricity theft classification based on ResNet and DSCAttention mechanisms of claim 6, wherein cos(phi_i + phi_j) is calculated as shown in formula (1.9):
cos(phi_i + phi_j) = cos(arccos(x_i) + arccos(x_j)) (1.9)
in formula (1.9), i and j denote indices into the one-dimensional vector X; the time-correlation analysis of the GAF values of the normal and abnormal electricity consumption data is carried out according to formulas (1.8) and (1.9).
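Formulas (1.8) and (1.9) translate directly into code; the sketch below assumes the input series is already scaled to [-1, 1] so that arccos is defined (the clipping guards against floating-point drift), and the function name is illustrative.

```python
import numpy as np

def gramian_angular_field(x):
    """GAF of a 1-D series scaled to [-1, 1]:
    G[i, j] = cos(arccos(x_i) + arccos(x_j)), per formulas (1.8)-(1.9)."""
    x = np.asarray(x, dtype=float)
    phi = np.arccos(np.clip(x, -1.0, 1.0))      # polar-coordinate angles
    return np.cos(phi[:, None] + phi[None, :])  # n x n matrix of angle sums

G = gramian_angular_field([1.0, 0.0, -1.0])
```

For this toy input the angles are 0, pi/2 and pi, so the diagonal runs 1, -1, 1 and G[0, 2] = cos(pi) = -1, matching a hand calculation of formula (1.9).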
8. The method of claim 1, wherein in step 2 a mask matrix of the same size as the sample data is created and initialized as an all-zero matrix; the positions of the missing values in the sample are then determined, and a 1 is written at the same positions in the mask matrix; by combining the sample with its equally-sized mask matrix, a feature matrix with a channel number of 2 is finally constructed.
9. The method of claim 8, wherein the missing values are filled according to the following formula:
in formula (1.10), i denotes a position in the original data matrix, x_i denotes the value at position i of the original data matrix, and f(x_i) denotes the filled value obtained after processing according to formula (1.10);
formula (1.11) gives the construction of the mask matrix in the missing value filling method used; where i denotes a position in the original data matrix, x_i denotes the value at position i of the original data matrix, and g(x_i) denotes the mask matrix transformed according to formula (1.11).
10. An apparatus for power theft classification based on a res net and DSCAttention mechanism, comprising:
the data analysis module is used for carrying out data analysis on the obtained data, obtaining time sequence data characteristics with stronger identifiability through trend-season analysis and correlation analysis, and preparing for the design and construction of the model according to the time sequence data characteristics;
the data processing module is used for preprocessing the data: analyzing and processing missing values in the data, constructing a mask matrix, normalizing the data with a quantile transformation to reduce the sensitivity of the model to abnormal values, and dividing the training set and the validation set by stratified splitting of the data set.
CN202310615835.7A 2023-05-26 2023-05-26 Method and device for electricity stealing classification based on ResNet and DSCAttention mechanism Pending CN116628605A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310615835.7A CN116628605A (en) 2023-05-26 2023-05-26 Method and device for electricity stealing classification based on ResNet and DSCAttention mechanism

Publications (1)

Publication Number Publication Date
CN116628605A true CN116628605A (en) 2023-08-22

Family

ID=87597039

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310615835.7A Pending CN116628605A (en) 2023-05-26 2023-05-26 Method and device for electricity stealing classification based on ResNet and DSCAttention mechanism

Country Status (1)

Country Link
CN (1) CN116628605A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116933216A (en) * 2023-09-18 2023-10-24 湖北华中电力科技开发有限责任公司 Management system and method based on flexible load resource aggregation feature analysis
CN116933216B (en) * 2023-09-18 2023-12-01 湖北华中电力科技开发有限责任公司 Management system and method based on flexible load resource aggregation feature analysis
CN117495109A (en) * 2023-12-29 2024-02-02 国网山东省电力公司禹城市供电公司 Electricity stealing user identification system based on deep well network
CN117495109B (en) * 2023-12-29 2024-03-22 国网山东省电力公司禹城市供电公司 Power stealing user identification system based on neural network


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination