CN116628605A - Method and device for electricity stealing classification based on ResNet and DSCAttention mechanism
- Publication number: CN116628605A (application number CN202310615835.7A)
- Authority: CN (China)
- Prior art keywords: data, model, formula, time sequence, electricity
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Classifications
- G06F18/2431 - Pattern recognition; classification techniques relating to the number of classes; multiple classes
- G06N3/0464 - Neural networks; architecture; convolutional networks [CNN, ConvNet]
- G06N3/048 - Neural networks; activation functions
- G06N3/08 - Neural networks; learning methods
- G06Q50/06 - ICT specially adapted for energy or water supply
- Y04S10/50 - Systems or methods supporting power network operation or management, involving interaction with the load-side end user
Abstract
The invention discloses a method for electricity theft classification based on ResNet and DSCAttention mechanisms, which comprises the following steps: step 1, performing data analysis on the obtained data, obtaining time sequence data characteristics with stronger identifiability through trend-season analysis and correlation analysis, and preparing the design and construction of the model accordingly; step 2, entering data preprocessing: analyzing and processing missing values, constructing a mask matrix, normalizing the data with quantile transformation to reduce the sensitivity of the model to abnormal values, and dividing the training set and verification set by stratified splitting of the data set; and step 3, entering the model building and training stage, adjusting different hyperparameters during training to find the hyperparameter combination with the best classification evaluation, until the final electricity theft classification model is formed. The invention can classify normal users and electricity theft users in an electricity theft time sequence data set and calculate the probability that a user is stealing electricity.
Description
Technical Field
The invention belongs to the technical field of artificial intelligence and electricity theft detection, and particularly relates to an electricity theft classification method and device based on ResNet and DSCAttention mechanisms.
Background
At present, methods for electricity theft identification and detection focus on the complex intrinsic features of electricity theft time sequence data; algorithms that exploit time sequence correlation for identification and classification are few. Existing methods for learning electricity theft time sequence correlation mostly adopt LSTM neural networks, and little attention has been paid to introducing an attention mechanism into the electricity theft time sequence classification task.
Existing products for electricity theft identification therefore have the following drawbacks: they mainly attend to the intrinsic complex features of electricity theft data, such as time sequence period and trend features, and mostly adopt convolutional neural network algorithms to extract these features for classification; learning time sequence correlation by introducing an attention mechanism remains underdeveloped and insufficiently researched.
Disclosure of Invention
The invention aims to provide a method and a device for electricity theft classification based on ResNet and DSCAttention mechanisms, which can classify normal electricity users and electricity theft users in an electricity theft time sequence data set and calculate the probability that a user is stealing electricity.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows: a method for electricity theft classification based on ResNet and DSCAttention mechanisms comprises the following steps:
step 1, performing data analysis on the obtained data, obtaining time sequence data characteristics with stronger identifiability through trend-season analysis and correlation analysis, and preparing the design and construction of the model accordingly;
step 2, entering the data preprocessing stage: analyzing and processing missing values, constructing a mask matrix, normalizing the data with quantile transformation to reduce the sensitivity of the model to abnormal values, and dividing the training set and verification set by stratified splitting of the data set;
and step 3, after the data are processed, entering the construction and training stage of the deep learning model, adjusting different hyperparameters during training to find the hyperparameter combination with the best classification evaluation, until the final electricity theft classification model is formed.
Furthermore, the electricity theft classification model takes a ResNet18 network as the basic network structure and introduces a depthwise separable convolution enhanced self-attention mechanism layer on this basis, so that the model can both extract the intrinsic complex characteristics of electricity theft time sequences and account for the correlation between time sequences.
Further, the number of output channels of the second and third convolution layers of the ResNet18 network is modified, and the channel-by-channel convolution of a depthwise separable convolution structure and a channel-by-channel self-attention mechanism are introduced; the remaining residual network structure of ResNet18 is then replaced with a point-by-point convolution layer and a two-layer fully-connected neural network, where the point-by-point convolution layer together with the preceding channel-by-channel convolution forms a complete depthwise separable convolution structure, and the two-layer fully-connected network serves as the classifier.
Further, in step 1, the time series is decomposed with an additive model into trend term and seasonal term features, so that the intrinsic features of the time series can be analyzed easily; the time series additive model is shown in formula (1.5):

$$T_t = \mathrm{Trend}_t + \mathrm{Seasonal}_t + \mathrm{Resid}_t \qquad (1.5)$$

in formula (1.5), $T_t$ is the original time series, $\mathrm{Trend}_t$ is the trend term of the time series additive model, $\mathrm{Seasonal}_t$ is the seasonal term, and $\mathrm{Resid}_t$ is the residual term.
Further, the trend term $\mathrm{Trend}_t$ of the time series additive model is calculated as shown in formula (1.6):

$$\mathrm{Trend}_t = \frac{1}{m}\sum_{i=-k}^{k} y_{t+i}, \qquad m = 2k+1 \qquad (1.6)$$

in formula (1.6), $y_{t+i}$ is the value at time $t+i$ in the time series $T$, and $m = 2k+1$ is the order of the moving average;

a partitioning method with a 7-day period is adopted to convert the one-dimensional time sequence data into two-dimensional spatial time sequence data, so the order (period) of the moving average is 7, i.e. $k = 3$; according to the definition of the additive model in formula (1.5), the sum of the seasonal term and the residual term is the time series minus the trend term;

defining the set of $t+i$ as $\{t+1, t+2, \ldots, t+m\}$, the seasonal term of the time series additive model is computed, as in classical decomposition, by averaging the detrended series over points at the same position within the 7-day period, as shown in formula (1.7):

$$\mathrm{Seasonal}_t = \operatorname{mean}\{\,T_{t'} - \mathrm{Trend}_{t'} : t' \equiv t \ (\mathrm{mod}\ 7)\,\} \qquad (1.7)$$
Further, in step 1, time series data in Cartesian coordinates are converted into polar coordinates through the Gramian Angular Field (GAF), and the correlation of the time series between different times is analyzed by computing the angles between the data features at each time in polar coordinates; the GAF calculation is shown in formula (1.8):

$$G = \begin{bmatrix} \cos(\phi_1+\phi_1) & \cdots & \cos(\phi_1+\phi_n) \\ \vdots & \ddots & \vdots \\ \cos(\phi_n+\phi_1) & \cdots & \cos(\phi_n+\phi_n) \end{bmatrix} \qquad (1.8)$$

in formula (1.8), $G$ is a square matrix of size $n \times n$; terms of the form $\cos(\phi_i+\phi_j)$ are the special inner product defined by GAF; the one-dimensional vector for which GAF is computed is $X = \{x_1, x_2, \ldots, x_n\}$, of length exactly $n$, and $\phi_1$ in the formula is $\arccos(x_1)$.

Further, the specific calculation of $\cos(\phi_i+\phi_j)$ is shown in formula (1.9):

$$\cos(\phi_i+\phi_j) = \cos(\arccos(x_i) + \arccos(x_j)) \qquad (1.9)$$

in formula (1.9), $i, j$ are indices of the one-dimensional vector $X$; time correlation analysis is performed on the GAF values of normal and abnormal electricity consumption data according to formulas (1.8) and (1.9).
Further, in step 2, a mask matrix of the same size as the sample data is first created and initialized to all zeros; the positions of missing values in the sample are then determined, and a 1 is set at the same positions in the mask matrix; by stacking the sample with its mask matrix, a feature matrix with 2 channels is finally constructed.
Further, the missing value filling is given by formula (1.10):

$$f(x_i) = \begin{cases} 0, & x_i \text{ is missing} \\ x_i, & \text{otherwise} \end{cases} \qquad (1.10)$$

in formula (1.10), $i$ is the position of a value in the original data matrix, $x_i$ is the value at position $i$, and $f(x_i)$ is the filled value after processing according to formula (1.10);

formula (1.11) gives the construction of the mask matrix in the missing value filling method used:

$$g(x_i) = \begin{cases} 1, & x_i \text{ is missing} \\ 0, & \text{otherwise} \end{cases} \qquad (1.11)$$

where $i$ is the position of a value in the original data matrix, $x_i$ is the value at position $i$, and $g(x_i)$ is the mask value obtained according to formula (1.11).
Further, the data are standardized with quantile transformation and then input into the model for training; meanwhile, the training set and verification set are randomly split with stratified sampling.
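As an illustrative sketch only (not part of the claimed embodiment), the stratified train/verification split described above can be expressed in numpy; the function name, the 80/20 ratio and the toy class counts below are assumptions for illustration:

```python
import numpy as np

def stratified_split(X, y, val_ratio=0.2, seed=0):
    """Split (X, y) into train/validation sets while preserving the
    normal/theft class ratio in both subsets (stratified sampling)."""
    rng = np.random.default_rng(seed)
    train_idx, val_idx = [], []
    for cls in np.unique(y):
        idx = np.flatnonzero(y == cls)   # all samples of this class
        rng.shuffle(idx)
        n_val = int(round(len(idx) * val_ratio))
        val_idx.extend(idx[:n_val])
        train_idx.extend(idx[n_val:])
    return (X[np.sort(train_idx)], y[np.sort(train_idx)],
            X[np.sort(val_idx)], y[np.sort(val_idx)])

# Imbalanced toy labels: 90 normal users (0), 10 theft users (1)
y = np.array([0] * 90 + [1] * 10)
X = np.arange(100).reshape(100, 1)
Xtr, ytr, Xva, yva = stratified_split(X, y)
# Both subsets keep roughly the 9:1 class ratio
```

Because each class is sampled separately, even a heavily imbalanced electricity theft data set keeps some positive samples in the verification set.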
An apparatus for electricity theft classification based on ResNet and DSCAttention mechanisms, comprising:
a data analysis module, used for performing data analysis on the obtained data, obtaining time sequence data characteristics with stronger identifiability through trend-season analysis and correlation analysis, and preparing the design and construction of the model accordingly;
a data processing module, used for preprocessing the data: analyzing and processing missing values, constructing a mask matrix, normalizing the data with quantile transformation to reduce the sensitivity of the model to abnormal values, and dividing the training set and verification set by stratified splitting of the data set.
The invention has the following characteristics and effects:
(1) ResNet is used as the basic network, and a DSCAttention depthwise separable convolution enhanced self-attention mechanism is introduced and combined with the convolutional neural network part of ResNet to extract and learn both the intrinsic complex characteristics of electricity theft time sequence data and the correlation between time sequences; these features then guide the classifier.
(2) Because electricity theft identification is a two-class problem, the cross entropy loss function is used during training to guide the adjustment of the parameters of each network layer, so that the model is trained toward an optimal solution of the electricity theft classification problem.
(3) Both missing values and abnormal values exist in the electricity theft data. In a deep learning method the model cannot compute on null values directly, so missing values are filled with a zero padding method; and because abnormal values would otherwise receive excessive attention during training, making it hard for the model to learn the actual distribution of the data, quantile transformation is adopted to effectively reduce sensitivity to abnormal values.
(4) In real-world data, electricity theft data is often unbalanced, and the problem can be alleviated to some extent by using a method of hierarchically splitting a training set and a verification set.
In summary, the invention performs better in electricity theft classification.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
FIG. 1 is a diagram of a power theft classification model framework based on ResNet and DSCAttention mechanisms.
Fig. 2 is a schematic diagram of a depth separable convolution operation.
Fig. 3 is a schematic diagram of a channel-by-channel self-attention calculation.
Fig. 4 is a schematic diagram of the self-attention mechanism of convolution enhancement.
FIG. 5 is a schematic diagram of a depth separable convolution enhanced self-attention mechanism.
FIG. 6 is a flowchart of an overall implementation of a power theft classification algorithm.
Fig. 7 shows ROC graphs for different models.
Detailed Description
The invention will now be described in detail with reference to the drawings and specific examples.
A method for electricity theft classification based on ResNet and DSCAttention mechanisms selects the ResNet18 network as the basic network structure and introduces a depthwise separable convolution enhanced self-attention mechanism layer on this basis, so that the model can both extract the intrinsic complex features of electricity theft time sequences and account for the correlation between time sequences, obtaining better accuracy in the final classification task. This yields an electricity theft classification model based on ResNet and DSCAttention mechanisms, whose overall framework is shown in figure 1.
1. ResNet residual error network structure
The invention modifies the conventional ResNet18 network by adding an attention layer and a depthwise separable convolution structure; the per-layer network parameters of the final model are shown in table 1. The model modifies the first convolution layer and the max pooling layer of ResNet18, reducing the downsampling factor of the original feature input so as to retain more useful information. At the second and third convolution layers of ResNet18, besides modifying the number of output channels of these two layers, the model introduces the channel-by-channel convolution of a depthwise separable convolution structure and the channel-by-channel self-attention mechanism used herein. The model then replaces the remaining residual network structure of ResNet18 with a point-by-point convolution layer and a two-layer fully-connected network; the point-by-point convolution layer together with the preceding channel-by-channel convolution forms a complete depthwise separable convolution structure, and the two fully-connected layers serve as the classifier.
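As an illustrative sketch only (not the claimed implementation), the channel-by-channel plus point-by-point structure above can be written in plain numpy; the shapes mirror the depthwise3-16 / pointwise1-48 layers of table 1, and the parameter-count comparison with a standard convolution motivates the "depthwise separable" design:

```python
import numpy as np

def depthwise_conv2d(x, w):
    """Channel-by-channel (depthwise) convolution: each of the C input
    channels is convolved with its own k x k kernel. x: (C, H, W), w: (C, k, k)."""
    C, H, W = x.shape
    k = w.shape[1]
    out = np.zeros((C, H - k + 1, W - k + 1))
    for c in range(C):
        for i in range(H - k + 1):
            for j in range(W - k + 1):
                out[c, i, j] = np.sum(x[c, i:i + k, j:j + k] * w[c])
    return out

def pointwise_conv2d(x, w):
    """Point-by-point (1x1) convolution mixing channels. x: (C, H, W), w: (C_out, C)."""
    return np.tensordot(w, x, axes=([1], [0]))

x = np.random.default_rng(0).normal(size=(16, 7, 7))  # 16 channels, 7x7 feature map
dw = np.ones((16, 3, 3))   # one 3x3 kernel per channel
pw = np.ones((48, 16))     # 1x1 convolution: 16 -> 48 channels
y = pointwise_conv2d(depthwise_conv2d(x, dw), pw)
# Depthwise-separable 3x3, 16 -> 48 channels: 16*3*3 + 48*16 = 912 weights,
# versus 48*16*3*3 = 6912 for a standard 3x3 convolution.
```

The split into per-channel spatial filtering followed by 1x1 channel mixing is what makes the structure cheap enough to pair with a self-attention layer.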
Table 1 comparison of network parameters
1.1 Softmax function
The Softmax activation function is used for multi-class problems: it maps the inputs of several neurons into $[0,1]$ such that the mapped elements of the vector sum to 1. Its calculation formula is:

$$\mathrm{Softmax}(x_i) = \frac{e^{x_i}}{\sum_{j} e^{x_j}} \qquad (1.1)$$

in formula (1.1), $i$ and $j$ index the classes of the multi-class problem, $x_i$ is the $i$-th class value of the input $x$, mapped through the exponential function, and $\sum_j e^{x_j}$ is the sum of the exponentials of all class values.
1.2 Relu activation function
The formula of the Relu activation function is:

$$\mathrm{Relu}(x) = \max(0, x) \qquad (1.2)$$

in formula (1.2), $x$ is the input value; the Relu function takes the maximum of 0 and $x$, so when the input is negative the function outputs 0 and the unit is not activated.
1.3 PRelu activation function
The PRelu activation function is formulated as:

$$\mathrm{PRelu}(x) = \max(0, x) + a \cdot \min(0, x) \qquad (1.3)$$

in formula (1.3), $x$ is the input value and $a$ is the slope of PRelu in the region where the input is negative. When $x$ is positive, PRelu behaves exactly like the Relu activation function; when $x$ is negative, PRelu takes the minimum of 0 and $x$ and multiplies it by the negative-region slope $a$.
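The three activation functions (1.1)-(1.3) can be sketched in a few lines of numpy; the fixed slope $a = 0.25$ below is an illustrative assumption (in PRelu, $a$ is a learned parameter):

```python
import numpy as np

def softmax(x):
    """Formula (1.1): maps inputs into [0, 1] so that they sum to 1."""
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()

def relu(x):
    """Formula (1.2): max(0, x); negative inputs are zeroed."""
    return np.maximum(0.0, x)

def prelu(x, a=0.25):
    """Formula (1.3): max(0, x) + a * min(0, x); a is the negative-region slope."""
    return np.maximum(0.0, x) + a * np.minimum(0.0, x)

x = np.array([-2.0, 0.0, 2.0])
# relu zeroes the negative input, prelu scales it by a, softmax sums to 1
```

Unlike Relu, PRelu keeps a small gradient for negative inputs, which avoids permanently inactive units.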
2. Convolution feature extraction layer and DSCAttention mechanism
2.1 Convolutional neural network based electricity theft data feature extraction
The convolutional neural network used for time sequence trend term and seasonal feature extraction consists of the three consecutive layers conv3-16, conv3-16 and depthwise3-16 in table 1. The first two conv3-16 layers are conventional convolution layers with kernel size 3 and 16 output channels; the last depthwise3-16 layer is a channel-by-channel convolution, which combined with the pointwise1-48 point-by-point convolution in table 1 forms the depthwise separable convolution operation whose calculation process is shown in fig. 2.
2.2 Power stealing timing dependency learning based on DSCAttention mechanism
The DSCAttention mechanism consists of two layers, attention-8 and depthwise3-8 in Table 1. Attention-8 is a channel-by-channel multi-head self-Attention mechanism, and the specific calculation process of this layer is shown in FIG. 3. depthwise3-8 is a channel-by-channel convolution enhancement portion of channel-by-channel multi-headed self-attention, and the specific calculation of this layer is shown in fig. 4. While the Attention-8 and depthwise3-8 are collectively referred to as the DSCAttention depth separable convolution enhanced self-Attention mechanism, the overall computation of this part is shown in FIG. 5.
(1) Channel-by-channel self-attention mechanism
The attention mechanism is a method of taking global information into account. Convolutional neural networks are strong at extracting local information but give little consideration to global information. In electricity theft time sequence data, the electricity consumption behavior of a user has long-range correlation, which local convolution cannot learn.
Attention mechanisms mainly comprise three types: soft attention, hard attention and self-attention. The attention method used by the invention is the self-attention mechanism, calculated as in formula (1.4):

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V \qquad (1.4)$$

in formula (1.4), $Q$, $K$ and $V$ are obtained from the input matrix $X$ with the weight parameters $W_Q$, $W_K$, $W_V$ respectively, i.e. $Q = XW_Q$, $K = XW_K$, $V = XW_V$; substituting $Q$, $K$ and $V$ into formula (1.4) yields the self-attention distribution of the input matrix $X$. The self-attention distribution calculation used herein is a new multi-head self-attention method combined with the channel attention idea; its calculation process is shown in fig. 3.
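As an illustrative numpy sketch of formula (1.4) (single head, with random toy weights that are not the learned parameters of the model):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention of formula (1.4):
    Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V,
    with Q = X Wq, K = X Wk, V = X Wv."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    A = np.exp(scores)
    A /= A.sum(axis=-1, keepdims=True)             # row-wise softmax
    return A @ V, A

rng = np.random.default_rng(0)
X = rng.normal(size=(7, 4))          # 7 time steps, 4 features each
Wq, Wk, Wv = (rng.normal(size=(4, 4)) for _ in range(3))
out, A = self_attention(X, Wq, Wk, Wv)
# Each row of A is a probability distribution over the 7 time steps
```

The attention matrix $A$ is exactly the kind of time-to-time correlation the invention seeks to learn from the electricity theft sequences.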
(2) Convolution enhanced self-attention mechanism
The invention adopts the convolution enhancement idea of the Conformer model overall, makes minor modifications to its structure, replaces its self-attention part with the channel-by-channel self-attention calculation used herein, and optimizes the convolution enhancement part for 2D convolution operation. The convolution enhanced self-attention mechanism is shown in fig. 4.
(3) Depth separable convolution enhanced self-attention mechanism
Since the self-attention mechanism used in the invention is a channel-by-channel self-attention method, the conventional convolutional neural network of the convolutional enhancement part is replaced by a channel-by-channel convolutional neural network, so that a self-attention method DSCAttention based on depth separable convolutional enhancement is formed, and the calculation process of the method is shown in figure 5.
In fig. 5, the input matrix X is subjected to a channel-by-channel self-attention calculation and then outputs a self-attention distribution matrix, which is the correlation between the time sequences learned from the electricity larceny data by the model. A depth separable convolutional neural network is then used on the self-attention distribution matrix to compensate for the lack of local attention of the self-attention mechanism itself to the intra-and neighborhood periodicities.
3. Electricity larceny classification algorithm implementation flow
(1) Firstly, data analysis is carried out on the obtained data, time sequence data characteristics with stronger identifiability are obtained through trend season analysis and correlation analysis, and the design and construction of a deep learning algorithm model are prepared accordingly.
The intrinsic features of non-stationary time series data generally appear in a more complex form, so that it is difficult to perform feature analysis on the original sequence data. The time series is decomposed by using an additive model, and the intrinsic characteristics of the time series are decomposed into trend term and season term characteristics, so that the intrinsic characteristics of the time series can be easily analyzed. The time series additive model is shown in formula (1.5).
$$T_t = \mathrm{Trend}_t + \mathrm{Seasonal}_t + \mathrm{Resid}_t \qquad (1.5)$$

In formula (1.5), $T_t$ is the original time series, $\mathrm{Trend}_t$ is the trend term of the time series additive model, $\mathrm{Seasonal}_t$ is the seasonal term, and $\mathrm{Resid}_t$ is the residual term. The trend term $\mathrm{Trend}_t$ is calculated as shown in formula (1.6):

$$\mathrm{Trend}_t = \frac{1}{m}\sum_{i=-k}^{k} y_{t+i}, \qquad m = 2k+1 \qquad (1.6)$$

In formula (1.6), $y_{t+i}$ is the value at time $t+i$ in the time series $T$, and $m = 2k+1$ is the order of the moving average. Here a 2D convolutional neural network is used for time sequence complex feature extraction, so a partitioning method with a 7-day period converts the one-dimensional time sequence data into two-dimensional spatial time sequence data; the order (period) of the moving average is therefore 7, i.e. $k = 3$. According to the definition of the additive model in formula (1.5), the sum of the seasonal term and the residual term is the time series minus the trend term. Defining the set of $t+i$ as $\{t+1, t+2, \ldots, t+m\}$, the seasonal term of the time series additive model is computed by averaging the detrended series over points at the same position within the 7-day period, as shown in formula (1.7):

$$\mathrm{Seasonal}_t = \operatorname{mean}\{\,T_{t'} - \mathrm{Trend}_{t'} : t' \equiv t \ (\mathrm{mod}\ 7)\,\} \qquad (1.7)$$
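A minimal numpy sketch of the additive decomposition (1.5)-(1.7), assuming a toy 4-week series with a linear trend and a weekly seasonal pattern (the function name and sample data are illustrative only):

```python
import numpy as np

def decompose_additive(y, period=7):
    """Classical additive decomposition T_t = Trend_t + Seasonal_t + Resid_t.
    Trend: moving average of order m = period (here 7, so k = 3);
    Seasonal: mean of the detrended series at each position in the period."""
    k = period // 2
    trend = np.convolve(y, np.ones(period) / period, mode='valid')  # length n - 2k
    trend = np.concatenate([np.full(k, np.nan), trend, np.full(k, np.nan)])
    detrended = y - trend
    seasonal = np.array([np.nanmean(detrended[i::period]) for i in range(period)])
    seasonal = np.tile(seasonal, len(y) // period + 1)[:len(y)]
    resid = y - trend - seasonal
    return trend, seasonal, resid

t = np.arange(28.0)
y = 0.5 * t + np.tile([0, 1, 2, 0, -1, -2, 0], 4)  # linear trend + weekly season
trend, seasonal, resid = decompose_additive(y)
# The 7-order moving average recovers the 0.5*t trend, and the
# position-wise means recover the weekly seasonal pattern exactly.
```

On real electricity consumption data the residual term would be non-zero; here it vanishes because the toy series is exactly additive.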
The Gramian Angular Field (GAF) is a method by which time series data can be converted into spatial data. The Gramian Angular Field converts time series data from Cartesian coordinates into polar coordinates, and the correlation of the time series between different times is analyzed by computing the angles between the data features at each time in polar coordinates. The GAF calculation is shown in formula (1.8):

$$G = \begin{bmatrix} \cos(\phi_1+\phi_1) & \cdots & \cos(\phi_1+\phi_n) \\ \vdots & \ddots & \vdots \\ \cos(\phi_n+\phi_1) & \cdots & \cos(\phi_n+\phi_n) \end{bmatrix} \qquad (1.8)$$

In formula (1.8), $G$ is a square matrix of size $n \times n$, and terms of the form $\cos(\phi_i+\phi_j)$ are the special inner product defined by GAF. Let the one-dimensional vector for which GAF is computed be $X = \{x_1, x_2, \ldots, x_n\}$, of length exactly $n$; $\phi_1$ in formula (1.8) is $\arccos(x_1)$, and the specific calculation of $\cos(\phi_i+\phi_j)$ is shown in formula (1.9):

$$\cos(\phi_i+\phi_j) = \cos(\arccos(x_i) + \arccos(x_j)) \qquad (1.9)$$

In formula (1.9), $i, j$ are indices of the one-dimensional vector $X$. Time correlation analysis is performed on the GAF values of normal and abnormal electricity consumption data according to formulas (1.8) and (1.9).
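A short numpy sketch of formulas (1.8)-(1.9); note that the rescaling of the series into $[-1, 1]$ is the usual GAF preprocessing step and is assumed here (since $\arccos$ requires inputs in that range), and the sample values are illustrative only:

```python
import numpy as np

def gramian_angular_field(x):
    """Gramian Angular Field of formula (1.8): rescale x into [-1, 1],
    take phi_i = arccos(x_i), and form G[i, j] = cos(phi_i + phi_j)."""
    x = np.asarray(x, dtype=float)
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1   # rescale to [-1, 1]
    phi = np.arccos(x)
    return np.cos(phi[:, None] + phi[None, :])        # formula (1.9), vectorized

x = np.array([10.0, 20.0, 30.0, 25.0, 15.0])  # toy daily consumption values
G = gramian_angular_field(x)
# G is a 5x5 symmetric matrix encoding the angle sums between all time pairs
```

Each entry $G_{ij}$ depends on the pair $(x_i, x_j)$, so the matrix exposes correlations between all pairs of times at once, which is what a 2D convolutional network can then consume.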
(2) After the characteristic analysis of the data is carried out, a data preprocessing link is entered. The data preprocessing link mainly analyzes and processes missing values existing in data and constructs a mask matrix. The sensitivity of the model to outliers is reduced by normalizing the data using quantile transformation, and the training set and the validation set are partitioned by using a method of hierarchically splitting the data set.
The invention handles missing sample data without imputing substantive values. First, a mask matrix of the same size as the sample data is created and initialized to all zeros. The positions of missing values in the sample are then determined, and a 1 is written at the same positions in the mask matrix. By combining the sample with its same-size mask matrix, a feature matrix with 2 channels is finally constructed.
The missing value filling method is given by formula (1.10), where i denotes a position in the original data matrix, x_i denotes the value at position i of the original data matrix, and f(x_i) denotes the filled value after processing according to formula (1.10).
Formula (1.11) gives the construction of the mask matrix in the missing value filling method used here. Again i denotes a position in the original data matrix, x_i denotes the value at position i of the original data matrix, and g(x_i) denotes the mask matrix entry transformed according to formula (1.11).
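The masking scheme of formulas (1.10)-(1.11) can be sketched as follows (illustrative only; the choice of 0 as the fill value f(x_i) for missing entries is an assumption, since the formula bodies are not reproduced in this text):

```python
import numpy as np

def build_two_channel_input(data):
    """Missing-value handling sketch: fill missing entries (here: with 0,
    an assumed f(x_i)) and build a same-size mask matrix g(x_i) that is 1
    where a value was missing and 0 elsewhere, then stack both into a
    feature matrix with 2 channels."""
    data = np.asarray(data, dtype=float)
    mask = np.isnan(data).astype(float)    # g(x_i): 1 at missing positions
    filled = np.nan_to_num(data, nan=0.0)  # f(x_i): missing -> fill value
    return np.stack([filled, mask])        # shape: (2, rows, cols)
```

The mask channel lets the network distinguish "genuinely zero consumption" from "missing reading", which matters for electricity theft detection where zeros are themselves suspicious.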
In general, data features differ greatly in scale, and when a model is trained on the raw data, features with larger values dominate the learning process, which degrades model quality. Therefore, the raw data must be normalized before being input to the model for training. A common normalization method is min-max normalization (Maximum and Minimum Normalization), whose formula is shown in (1.12).
Here min(X) is the minimum of X, max(X) is the maximum of X, and x_i is a value in X. Due to equipment false alarms and similar causes, the recorded data often contains outliers, and min-max normalization is sensitive to them. To address this sensitivity, a quantile transformation is used for data normalization instead.
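The difference between min-max normalization (formula (1.12)) and a rank-based quantile transformation can be sketched as follows (an illustrative sketch; in practice a library implementation such as scikit-learn's QuantileTransformer would typically be used, and the simple rank mapping below is an assumption):

```python
import numpy as np

def min_max_normalize(x):
    """Min-max normalization, formula (1.12): (x_i - min) / (max - min)."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

def quantile_transform_1d(x):
    """Rank-based quantile transformation sketch: map each value to its
    empirical quantile in [0, 1]. The result depends only on ranks, so a
    single extreme outlier cannot compress the rest of the distribution."""
    x = np.asarray(x, dtype=float)
    ranks = np.argsort(np.argsort(x))  # rank of each element
    return ranks / (len(x) - 1)
```

On data with one outlier, e.g. [1, 2, 3, 1000], min-max squeezes the normal values near 0 while the quantile transform keeps them evenly spread, which is exactly the outlier-sensitivity problem described above.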
Because the data set is imbalanced, randomly splitting the training and validation sets can leave the training set with few or no negative samples, so that the model cannot learn from them, and can likewise leave the validation set with few or no negative samples, so that the model's evaluation metrics lose their meaning with respect to the negative class. To solve this problem, the training and validation sets are split randomly by stratified sampling (Stratified Sampling), as shown in formula (1.13).
D_{new-normal} : D_{new-outlier} = D_{old-normal} : D_{old-outlier}  (1.13)
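The stratified split of formula (1.13) can be sketched as follows (illustrative; the validation ratio and random seed are assumed parameters not given in the text):

```python
import numpy as np

def stratified_split(y, val_ratio=0.2, seed=0):
    """Stratified train/validation split sketch (formula (1.13)): sample
    the same fraction from each class so the normal:outlier ratio is
    preserved in both subsets. Returns index arrays into the data set."""
    rng = np.random.default_rng(seed)
    train_idx, val_idx = [], []
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)
        rng.shuffle(idx)
        n_val = int(round(len(idx) * val_ratio))
        val_idx.extend(idx[:n_val])
        train_idx.extend(idx[n_val:])
    return np.array(train_idx), np.array(val_idx)
```

With a 90:10 class imbalance and a 20% validation ratio, both subsets keep the 9:1 ratio, so the negative class is guaranteed to appear in each.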
(3) When data processing is finished, the deep learning model is constructed and trained; different hyperparameters must be tuned during training to find the combination with the best classification performance. Finally, the model with the best AUC is compared with other models to demonstrate the advantages of the model used here.
In summary, the self-attention layer of the Conformer model employs a conventional multi-head self-attention mechanism, while the DSCAttention part uses a channel-by-channel multi-head self-attention mechanism to compute the self-attention distribution on each feature channel. In the convolution enhancement part, the Conformer model uses a 1D depthwise separable convolution structure to strengthen the learning of neighborhood-period correlation in the time series, while the DSCAttention part uses a 2D depthwise separable convolution structure to strengthen the learning of correlation both within a period and across neighboring periods. Meanwhile, the DSCAttention mechanism, improved from Conformer, is introduced on top of the ResNet residual structure to compensate for the convolutional neural network's lack of attention to temporal correlation. For electricity theft time series data, electricity consumption is generally recorded over periods of a month or a year; these periods are long, and the locality principle of convolutional neural networks limits their ability to process long sequences. Therefore, ResNet is fused with a multi-head self-attention mechanism, which excels at processing long sequences, as an effective way to remedy the residual convolutional network's weakness on long time series.
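The 2D depthwise separable convolution that the DSCAttention part builds on can be sketched as follows (an illustrative sketch of the convolution building block only; the channel-by-channel self-attention of DSCAttention is not reproduced here, and 'valid' padding with stride 1 is an assumption):

```python
import numpy as np

def depthwise_separable_conv2d(x, dw_kernels, pw_weights):
    """2D depthwise separable convolution sketch: one spatial kernel per
    input channel (channel-by-channel convolution), followed by a 1x1
    pointwise convolution that mixes channels.
    x: (c_in, h, w); dw_kernels: (c_in, kh, kw); pw_weights: (c_out, c_in)."""
    c_in, h, w = x.shape
    kh, kw = dw_kernels.shape[1:]
    oh, ow = h - kh + 1, w - kw + 1
    # Depthwise step: each channel is convolved with its own kernel.
    dw_out = np.zeros((c_in, oh, ow))
    for c in range(c_in):
        for i in range(oh):
            for j in range(ow):
                dw_out[c, i, j] = np.sum(x[c, i:i + kh, j:j + kw] * dw_kernels[c])
    # Pointwise step: 1x1 convolution = per-pixel channel mixing.
    return np.einsum('oc,chw->ohw', pw_weights, dw_out)
```

Splitting the convolution this way needs c_in·kh·kw + c_out·c_in weights instead of c_out·c_in·kh·kw, which is the parameter saving that motivates depthwise separable structures.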
The present invention compares evaluation metrics against Random Forest (RF), Support Vector Machine (Support Vector Machine, SVM) and Wide & Deep Convolutional Neural Networks (WDCNN) on AUC, MAP@100 and MAP@200, as shown in Table 2.
TABLE 2 classification results
As can be seen from Table 2, the classification method based on the ResNet and DSCAttention mechanisms provided by the invention performs best, achieving an AUC of 91.92%, a MAP@100 of 98.58% and a MAP@200 of 96.77%. On the best AUC of each model, it improves by 11.84% over the random forest RF, 10.34% over the support vector machine SVM, and 11.46% over the WDCNN model.
On the MAP@100 metric, the method improves by 14.33% over the random forest RF, 9.23% over the support vector machine SVM, and 31.01% over the WDCNN model.
On the MAP@200 metric, the method improves by 20.62% over the random forest RF, 13.04% over the support vector machine SVM, and 29.76% over the WDCNN model.
Fig. 7 shows the ROC curves of the present invention and the other comparative models. Observing the ROC curves of the four models, the ROC curve of the invention lies above those of the WDCNN, SVM and RF models, so the invention performs better at electricity theft classification.
The foregoing has shown and described the basic principles, principal features and advantages of the invention. It should be understood by those skilled in the art that the above embodiments do not limit the scope of the present invention in any way, and all technical solutions obtained by equivalent substitution and the like fall within the scope of the present invention. Parts of the invention not described in detail are the same as, or can be implemented with, the prior art.
Claims (10)
1. A method for classifying electricity theft based on ResNet and DSCAttention mechanisms is characterized by comprising the following steps:
step 1, firstly performing data analysis on the obtained data, obtaining time series data features with stronger identifiability through trend-season analysis and correlation analysis, and preparing the design and construction of the model according to these features;
step 2, entering a data preprocessing link, analyzing and processing missing values of data, constructing a mask matrix, normalizing the data by using quantile transformation to reduce the sensitivity of a model to abnormal values, and dividing a training set and a verification set by using a method of splitting the data set in a layering way;
and step 3, after the data are processed, entering the construction and training stage of the deep learning model, and adjusting different hyperparameters during model training to find the combination with the best classification performance, until the final electricity theft classification model is formed.
2. The method for electricity theft classification based on ResNet and DSCAttention mechanisms as claimed in claim 1, wherein the electricity theft classification model uses the ResNet18 network as the basic network structure, and introduces a depthwise-separable-convolution-enhanced self-attention mechanism layer on this basis, so that the model can extract the intrinsic complex features of the electricity theft time series while taking into account correlations between time steps.
3. The method of electricity theft classification based on the ResNet and DSCAttention mechanisms of claim 2, wherein the number of output channels of the second and third convolutional layers of the ResNet18 network is modified, and a channel-by-channel convolution and a channel-by-channel self-attention mechanism in a depthwise separable convolution structure are introduced; the remaining residual network structure of ResNet18 is then replaced with a point-by-point convolution layer and a two-layer fully-connected neural network, wherein the point-by-point convolution layer and the preceding channel-by-channel convolution form a complete depthwise separable convolution structure, and the two-layer fully-connected network serves as the classifier.
4. The method for classifying electricity theft based on ResNet and DSCAttention mechanisms according to claim 1, wherein in step 1 the time series is decomposed with an additive model, its intrinsic features being separated into trend and seasonal terms so that they can be analyzed easily; the time series additive model is shown in formula (1.5):
T_t = Trend_t + Seasonal_t + Resid_t  (1.5)
in formula (1.5), T_t is the original time series, Trend_t is the trend term of the time series additive model, Seasonal_t is the seasonal term, and Resid_t is the residual term of the time series additive model.
5. The method of electricity theft classification based on ResNet and DSCAttention mechanisms of claim 4, wherein the trend term Trend_t of the time series additive model is calculated as shown in formula (1.6):
in formula (1.6), y_{t+i} is the value at time t+i in the time series T, and m = 2k+1 represents an m-order moving average;
the one-dimensional time series data is converted into two-dimensional spatial time series data by a partitioning method with a 7-day period, so the order (period) of the moving average is 7, i.e. k = 3; by the definition of the additive model in formula (1.5), the sum of the seasonal term and the residual term equals the time series minus the trend term;
defining the set of t+i as {t+1, t+2, …, t+m}, the seasonal term of the time series additive model is calculated as shown in formula (1.7).
6. The method for classifying electricity theft based on ResNet and DSCAttention mechanisms according to claim 1, wherein in step 1, time series data in the Cartesian coordinate system is converted into a polar coordinate system by the Gramian Angular Field, and the temporal correlation of the series between different times is analyzed by computing the angles between the data features at each time step in polar coordinates; the GAF calculation is shown in formula (1.8):
in formula (1.8), G is a square matrix of size n×n, and expressions of the form cos(φ_i + φ_j) are the special inner product defined by GAF; let the one-dimensional vector for which GAF is computed be X = {x_1, x_2, …, x_n}, so that the length of X is exactly n; in the formula, φ_1 is arccos(x_1).
7. The method of electricity theft classification based on ResNet and DSCAttention mechanisms of claim 6, wherein the specific calculation of cos(φ_i + φ_j) is shown in formula (1.9):
cos(φ_i + φ_j) = cos(arccos(x_i) + arccos(x_j))  (1.9)
in formula (1.9), i and j are indices into the one-dimensional vector X; temporal correlation analysis is carried out on the GAF values of normal and abnormal electricity consumption data according to formulas (1.8) and (1.9).
8. The method of claim 1, wherein in step 2, a mask matrix of the same size as the sample data is created and initialized to all zeros; the positions of missing values in the sample are then determined, and a 1 is written at the same positions in the mask matrix; by combining the sample with its same-size mask matrix, a feature matrix with 2 channels is finally constructed.
9. The method of claim 8, wherein the missing values are filled according to the following formula:
in formula (1.10), i denotes a position in the original data matrix, x_i denotes the value at position i of the original data matrix, and f(x_i) denotes the filled value after processing according to formula (1.10);
formula (1.11) gives the construction of the mask matrix in the missing value filling method used; here i denotes a position in the original data matrix, x_i denotes the value at position i, and g(x_i) denotes the mask matrix entry transformed according to formula (1.11).
10. An apparatus for power theft classification based on a res net and DSCAttention mechanism, comprising:
the data analysis module is used for performing data analysis on the obtained data, obtaining time series data features with stronger identifiability through trend-season analysis and correlation analysis, and preparing the design and construction of the model according to these features;
the data processing module is used for preprocessing the data: analyzing and handling missing values, constructing a mask matrix, normalizing the data with a quantile transformation to reduce the model's sensitivity to outliers, and dividing the training and validation sets by stratified splitting of the data set.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310615835.7A CN116628605A (en) | 2023-05-26 | 2023-05-26 | Method and device for electricity stealing classification based on ResNet and DSCAttention mechanism |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116628605A true CN116628605A (en) | 2023-08-22 |
Family
ID=87597039
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310615835.7A Pending CN116628605A (en) | 2023-05-26 | 2023-05-26 | Method and device for electricity stealing classification based on ResNet and DSCAttention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116628605A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116933216A (en) * | 2023-09-18 | 2023-10-24 | 湖北华中电力科技开发有限责任公司 | Management system and method based on flexible load resource aggregation feature analysis |
CN117495109A (en) * | 2023-12-29 | 2024-02-02 | 国网山东省电力公司禹城市供电公司 | Electricity stealing user identification system based on deep well network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | |