CN115659148A - Load decomposition method and system based on deep learning attention mechanism - Google Patents

Load decomposition method and system based on deep learning attention mechanism

Info

Publication number
CN115659148A
Authority
CN
China
Prior art keywords
layer
model
block
attention mechanism
nilm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211419435.0A
Other languages
Chinese (zh)
Inventor
周开乐
张志越
陈鸣飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology filed Critical Hefei University of Technology
Priority to CN202211419435.0A priority Critical patent/CN115659148A/en
Publication of CN115659148A publication Critical patent/CN115659148A/en
Pending legal-status Critical Current


Abstract

The invention provides a load decomposition method and system based on a deep learning attention mechanism, and relates to the technical field of load monitoring. The invention processes the total power signal through a pre-constructed TransUNet-NILM model to obtain high-resolution data. The model introduces the self-attention mechanism of a Transformer into a U-Net architecture as the encoder and comprises a down-sampling block, a Transformer block and an up-sampling block; during construction of the model, the input sequence is decoded by the up-sampling block to obtain multi-scale features, which are clipped to obtain sub-sequences. The sequence-to-subsequence method provided by the invention strikes a balance between seq2seq and seq2point to alleviate the convergence difficulty of deep neural networks, so that training is easier and the amount of computation during inference is reduced.

Description

Load decomposition method and system based on deep learning attention mechanism
Technical Field
The invention relates to the technical field of load monitoring, in particular to a load decomposition method and system based on a deep learning attention mechanism.
Background
Load disaggregation (also known as non-intrusive load monitoring, or NILM) is a computational technique for estimating the power demand of individual appliances from a single meter that measures the combined demand of multiple appliances. As the term "non-intrusive" suggests, the method involves very little intrusion on user privacy. The ultimate goals of load disaggregation are to help reduce household energy consumption, to help operators manage the grid, to identify faulty appliances, and to survey appliance usage behavior. Because information about the energy usage of each device is obtained from the total power consumption, no additional sensors are required; this reduces the cost of the sensing infrastructure while still allowing the load of electrical devices to be monitored. A load disaggregation algorithm can inform end consumers of their potential energy savings, can be used for demand response management, and also allows new, fairer pricing policies to be formulated, especially in electricity markets that favor green credits.
At present, as deep learning methods have achieved good results in various fields, research on load decomposition has also shifted from traditional signal processing methods to deep learning architectures. Within the deep learning framework, NILM is treated as a sequence-to-sequence (seq2seq), sequence-to-point (seq2point), or sequence-to-subsequence (seq2subseq) problem, with the tasks of single-label or multi-label state classification and energy usage prediction.
However, existing methods still have certain problems. For the seq2seq method, for example, when the length (time window) of the input (power) and output (appliance) sequences becomes long, training becomes difficult to converge; that is, within existing deep learning frameworks, the training process converges with difficulty.
Disclosure of Invention
Technical problem to be solved
In view of the defects of the prior art, the invention provides a load decomposition method and system based on a deep learning attention mechanism, solving the technical problem in the prior art that the training process converges with difficulty.
(II) technical scheme
In order to achieve the purpose, the invention is realized by the following technical scheme:
in a first aspect, the present invention provides a load decomposition method based on a deep learning attention mechanism, wherein the load decomposition method obtains high resolution data by processing a total power signal through a pre-constructed TransUNet-NILM model, the TransUNet-NILM model introduces a self-attention mechanism of a Transformer into a U-Net architecture and serves as an encoder, the model comprises a downsampling block, a Transformer block and an upsampling block, and the construction process of the TransUNet-NILM model comprises:
s1, acquiring a total power signal with a real label, and processing the total power signal to obtain a training set;
s2, extracting features of the training set through a down-sampling block to obtain an embedded matrix;
s3, coding the embedded matrix through a Transformer block to obtain an input sequence;
s4, decoding the input sequence through the up-sampling block to obtain multi-scale features, cutting the multi-scale features to obtain sub-sequences, and enabling the sub-sequences to pass through an output layer to obtain high-resolution data;
and S5, optimizing the TransUNet-NILM model by using the absolute value error of the real label and the high-resolution data as an optimization target.
Preferably, the downsampling block comprises a plurality of convolution layers and pooling layers from input to output in sequence;
the feature extraction is carried out on the training set through the lower sampling block to obtain an embedded matrix, and the method comprises the following steps:
increasing the hidden size of the data in the training set by using a plurality of convolutional layers, introducing position embedding, and applying a learnable L2-norm pooling operation in the pooling layer to pool the convolution output with increased hidden size, obtaining an embedded matrix.
Preferably, the Transformer block comprises a multi-head attention mechanism layer, two LayerNorm layers and a feedforward neural network;
the encoding of the embedded matrix by the transform block to obtain an input sequence includes:
S301, performing a linear transformation on the embedded matrix to obtain its Q, K and V matrices;
s302, acquiring a plurality of subspaces of a plurality of Q, K and V matrixes through a multi-head attention layer, and splicing the plurality of subspaces to obtain an output sequence;
s303, normalizing the output sequence through the first layer LayerNorm;
S304, connecting the normalized output sequence of the multi-head attention layer and the output of the position feedforward neural network through a residual connection to obtain a concatenated sequence, and normalizing the concatenated sequence through the second LayerNorm layer to obtain the input sequence.
Preferably, the upsampling block comprises, in order from input to output: a deconvolution layer, a convolution layer, a window clipping layer and an output layer;
the decoding of the input sequence by the up-sampling block to obtain the multi-scale features and the cutting of the multi-scale features to obtain the sub-sequence, the sub-sequence passing through the output layer to obtain the high resolution data, includes:
S401, using a plurality of up-sampling blocks, each comprising a one-dimensional deconvolution layer and a convolution layer: when the encoding is input to an up-sampling block, it first passes through the deconvolution layer, and the high-level features obtained by the down-sampling block are concatenated with the high-resolution features obtained by convolution in the convolution layer of the up-sampling block to obtain multi-scale features;
s402, cutting the multi-scale features through the sub-window cutting layer to obtain a subsequence;
and S403, repeating the steps S401 to S402, and finally inputting the subsequence to an output layer, wherein the output layer comprises a convolutional layer and an MLP.
Preferably, the clipping the multi-scale features through the sub-window clipping layer to obtain a subsequence includes:
the center of the clipping window in the sub-window clipping layer is aligned with the center of the main window, and W′ ≤ W/2, where W′ denotes the clipping window and W denotes the main window.
Preferably, the method further comprises:
and S6, testing the TransUNet-NILM model through a test set, and further optimizing the TransUNet-NILM model.
In a second aspect, the present invention provides a load decomposition system based on a deep learning attention mechanism, in which the load decomposition system includes a training subsystem for training the TransUNet-NILM model and a signal processing subsystem for calling the TransUNet-NILM model, the signal processing subsystem calls the TransUNet-NILM model to process a total power signal to obtain high resolution data, the TransUNet-NILM model introduces the self-attention mechanism of a Transformer into a U-Net architecture as an encoder, the model includes a downsampling block, a Transformer block and an upsampling block, and the training subsystem includes:
the training set acquisition module is used for acquiring a total power signal with a real label and processing the total power signal to obtain a training set;
the down-sampling module is used for extracting the characteristics of the training set through a down-sampling block to obtain an embedded matrix;
the transform module is used for coding the embedded matrix through the transform block to obtain an input sequence;
the up-sampling module is used for decoding the input sequence through the up-sampling block to obtain multi-scale features, cutting the multi-scale features to obtain sub-sequences, and obtaining high-resolution data through the sub-sequences through the output layer;
and the optimization module is used for optimizing the TransUNet-NILM model by using the absolute value error of the real label and the high-resolution data as an optimization target.
Preferably, the system further comprises:
and the test module is used for testing the TransUNet-NILM model through a test set and further optimizing the TransUNet-NILM model.
In a third aspect, the present invention provides a computer-readable storage medium storing a computer program for load decomposition based on a deep learning attention mechanism, wherein the computer program causes a computer to execute the load decomposition method based on the deep learning attention mechanism as described above.
In a fourth aspect, the present invention provides an electronic device comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs comprising instructions for performing the load decomposition method based on a deep learning attention mechanism as described above.
(III) advantageous effects
The invention provides a load decomposition method and system based on a deep learning attention mechanism. Compared with the prior art, the method has the following beneficial effects:
The invention processes the total power signal through a pre-constructed TransUNet-NILM model to obtain high-resolution data. The model introduces the self-attention mechanism of a Transformer into a U-Net architecture as the encoder and comprises a down-sampling block, a Transformer block and an up-sampling block. The construction process of the model comprises the following steps: acquiring a total power signal with real labels, and processing the total power signal to obtain a training set; extracting features from the training set through the down-sampling block to obtain an embedded matrix; encoding the embedded matrix through the Transformer block to obtain an input sequence; decoding the input sequence through the up-sampling block to obtain multi-scale features, clipping the multi-scale features to obtain sub-sequences, and passing the sub-sequences through an output layer to obtain high-resolution data; and optimizing the TransUNet-NILM model with the absolute error between the real labels and the high-resolution data as the optimization target. The sequence-to-subsequence method provided by the invention strikes a balance between seq2seq and seq2point to alleviate the convergence difficulty of deep neural networks, so that training is easier and the amount of computation during inference is reduced.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the prior art descriptions will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a block diagram of a load splitting method based on a deep learning attention mechanism according to an embodiment of the present invention;
FIG. 2 is a block diagram of a load splitting system based on a deep learning attention mechanism in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are clearly and completely described, and it is obvious that the described embodiments are a part of the embodiments of the present invention, but not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the application solves the technical problem that convergence is difficult in the training process of existing methods by providing a load decomposition method and system based on a deep learning attention mechanism, striking a balance between seq2seq and seq2point, alleviating the convergence difficulty of deep neural networks, and accelerating convergence.
In order to solve the technical problems, the general idea of the embodiment of the application is as follows:
Currently, many different methods have been applied to load decomposition, and many approaches based on signal processing and machine learning techniques have been proposed, such as Hidden Markov Models (HMMs) and their variants, graph signal processing, and combinatorial optimization methods; HMM-based techniques are generally inefficient and computationally complex as the number of disaggregated devices increases.
However, as deep learning methods have recently achieved good results in many different fields, research on load decomposition has also shifted from traditional signal processing methods to deep learning architectures. Applications of deep learning to load decomposition include recurrent neural networks (RNN), denoising autoencoders (dAE), long short-term memory (LSTM), convolutional neural networks (CNN), generative adversarial networks (GAN) and the like, which have proved successful in power prediction. Researchers have generally treated NILM as a sequence-to-sequence (seq2seq), sequence-to-point (seq2point), or sequence-to-subsequence (seq2subseq) problem, with the tasks of single- or multi-label state classification and energy usage prediction.
The sequence-to-sequence model is a nonlinear regression between a sequence of power readings and the appliance readings over the same time window. The sequence-to-point model, rather than training the network to predict the appliance readings for the whole window, predicts only the output signal at the midpoint of the window using a sliding-window method that exploits all the neighbors, i.e. past and future, of the input sequence; this concentrates the representational capacity of the neural network on the midpoint of the window rather than on its harder edges, thereby producing a more accurate output.
Although some recent preliminary studies have demonstrated the great potential of NILM, many challenges remain to be solved. The first is the trade-off between model computational complexity and tracking the long-term dependence of energy consumption data, since such data typically contain rich daily, seasonal, and even annual patterns. Most deep learning models require a large amount of high-quality labeled data for training, especially power consumption data for each device. The cost of data collection, such as per-device measurement, can be high. Furthermore, many users are reluctant to share their device information due to concerns about privacy violations.
For the load disaggregation task, the power consumption patterns of different devices are generally of different scales, and the total consumption of multiple devices tends to have a more complex shape, thus requiring the ability to handle scale changes. In addition to data information in the time domain, it is important to consider the context dependency of consumption patterns, since energy consumption behavior involves higher-level semantics: for example, a dryer is operated after a washing machine, or a person may turn on the microwave oven multiple times until cooking is complete.
As can be seen from the above description, the existing methods have the following drawbacks:
1. For the seq2seq method, when the length (time window) of the input (power) and output (appliance) sequences becomes long, training becomes difficult to converge; for the seq2point method, each forward pass of the model produces only one output signal, which increases the amount of computation during inference.
2. RNN-based sequence-to-sequence models are weak at capturing long-term dependencies in appliance power signals, and their back-propagation paths are too long, easily causing severe vanishing or exploding gradients; LSTM-based methods behave more like a Markov decision process and have difficulty extracting global information; CNN-based models fail to exploit the relationship between successive appliance usages, which leads to a high false positive/negative error rate in the disaggregation results. Deep learning is applied to disaggregation by using one neural network for each device in the target environment, and the number of convolution layers and the filter resolution must be increased for each device to obtain more accurate results; as the number of disaggregated devices increases, the computational cost and complexity become too high for practical application.
3. Transformer-based models lack the inductive bias (the ability to capture local features) of CNNs when extracting power consumption signal features, and do not possess translation invariance and locality; they therefore cannot generalize well to the load decomposition task, i.e., accurate decomposition is difficult when data are insufficient.
Aiming at the defects of the prior art, the embodiment of the invention provides a load decomposition method and system based on a deep learning attention mechanism.
In order to better understand the technical solution, the technical solution will be described in detail with reference to the drawings and the specific embodiments.
The embodiment of the invention provides a load decomposition method based on a deep learning attention mechanism, which is characterized in that a pre-constructed TransUNet-NILM model is used for processing a total power signal to obtain high-resolution data, the model introduces a self-attention mechanism of a Transformer in a U-Net framework and is used as an encoder, the model comprises a down-sampling block, a Transformer block and an up-sampling block, and as shown in figure 1, the construction process of the model comprises the following steps:
s1, acquiring a total power signal with a real label, and processing the total power signal to obtain a training set;
s2, extracting features of the training set through a lower sampling block to obtain an embedded matrix;
s3, coding the embedded matrix through a Transformer block to obtain an input sequence;
s4, decoding the input sequence through an up-sampling block to obtain multi-scale features, cutting the multi-scale features to obtain subsequences, and enabling the subsequences to pass through an output layer to obtain high-resolution data;
and S5, optimizing the TransUNet-NILM model by using the absolute value error of the real label and the high-resolution data as an optimization target.
The sequence-to-subsequence method provided by the embodiment of the invention strikes a balance between seq2seq and seq2point to alleviate the convergence difficulty of deep neural networks, so that training is easier and the amount of computation during inference is reduced.
The following describes each step in detail:
the pre-constructed TransUNet-NILM model of the embodiment of the invention introduces a self-attention mechanism of a Transformer in a U-Net framework and serves as an encoder, and comprises a down-sampling block, a Transformer block and an up-sampling block. The lower sampling block sequentially comprises a plurality of convolution layers and a pooling layer from input to output; the Transformer block comprises a multi-head attention mechanism layer, two LayerNorm layers and a feedforward neural network; the upsampling block comprises the following components from input to output in sequence: an deconvolution layer, a convolution layer, a window clipping layer, and an output layer. The model is constructed as follows:
in step S1, a total power signal with a real label is obtained, and the total power signal is processed to obtain a training set. The specific implementation process is as follows:
In the embodiment of the present invention, let y(t) ∈ R^(W×d) represent a group of input features derived from the total power consumption of N consumers, where W represents the window size, and X ∈ R^(W×N) represents the power signals of the relevant consumers; at time t each consumer has k states, denoted s_i(t) = {s_i(t)_1, s_i(t)_2, …, s_i(t)_k}, where s_i(t)_k ∈ {0, 1}.
S ∈ R^(W×N) represents the associated multi-label states of the N devices and X ∈ R^(W×N) is the corresponding power consumption; the training set D is then expressed as:
D = {y(t), s(t) | t = 1, 2, …, W}.
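As an illustrative sketch only (not taken from the patent text), the windowed training set D described above could be assembled along the following lines; the function name build_windows, the window length of 480 samples, and the use of NumPy with synthetic data are assumptions made for the example:

```python
import numpy as np

def build_windows(aggregate, states, window_size, stride=1):
    """Slice the aggregate power signal y(t) and the per-appliance state labels s(t)
    into overlapping windows of length W, forming the training set D."""
    samples, labels = [], []
    for start in range(0, len(aggregate) - window_size + 1, stride):
        end = start + window_size
        samples.append(aggregate[start:end])   # y(t), t = 1..W
        labels.append(states[start:end])       # s(t), a W x N state matrix
    return np.stack(samples), np.stack(labels)

# Example with synthetic data: 10,000 time steps, N = 5 appliances, binary on/off states
aggregate = np.random.rand(10_000).astype(np.float32)
states = np.random.randint(0, 2, size=(10_000, 5)).astype(np.float32)
X, S = build_windows(aggregate, states, window_size=480)
print(X.shape, S.shape)   # (9521, 480) and (9521, 480, 5)
```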
in step S2, feature extraction is performed on the training set through the downsampling block, so as to obtain an embedded matrix. The specific implementation process is as follows:
The training set is processed by the plurality of convolution layers and the pooling layer in the down-sampling block to obtain the embedded matrix. Specifically:
Before the training set is input into the Transformer block, feature extraction is carried out in the down-sampling part of the U-Net model. The hidden size of the input data is first increased by a plurality of convolution layers; a learnable L2-norm pooling operation in the pooling layer, which applies square-mean pooling to the input data to preserve features, then pools the convolution output with increased hidden size into patches patch_i(t). To encode the patch spatial information, position embedding is introduced in the embodiment of the invention and added to the patch embedding to capture the positional order, as follows:
z_0 = LPPooling(E) + E_pos
where E ∈ R^(W×d) is the patch embedding, E_pos ∈ R^(N×d) is the position embedding, LPPooling denotes the LP pooling calculation, and Conv denotes the convolution operation.
The effect of patch embedding is to convert the original two-dimensional input into a one-dimensional patch embedding, i.e., to perform a convolution. Unlike an RNN, the Transformer-based model allows parallel operation, and all power consumption data of a time period can be fed into the model simultaneously; E_pos is introduced to order the time sequence of the input power data, with the specific formulas:
PE(pos, 2i) = sin(pos / 10000^(2i/d_model))
PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))
where pos represents the position, d_model represents the vector length, and 2i and 2i+1 index the even and odd embedding dimensions.
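A minimal PyTorch sketch of such a down-sampling block is given below for illustration; the channel counts, kernel sizes and pooling stride are assumptions rather than values from the patent, and the fixed-p LPPool1d layer stands in for the learnable L2-norm pooling described above:

```python
import torch
import torch.nn as nn

def sinusoidal_position_embedding(length, d_model):
    """E_pos as in the formulas above: sine on even dimensions, cosine on odd dimensions."""
    pos = torch.arange(length, dtype=torch.float32).unsqueeze(1)
    i = torch.arange(0, d_model, 2, dtype=torch.float32)
    angle = pos / torch.pow(torch.tensor(10000.0), i / d_model)
    pe = torch.zeros(length, d_model)
    pe[:, 0::2] = torch.sin(angle)
    pe[:, 1::2] = torch.cos(angle)
    return pe

class DownsamplingBlock(nn.Module):
    """Stacked 1-D convolutions that grow the hidden size, L2 (LP) pooling,
    and an additive position embedding: z_0 = LPPooling(E) + E_pos."""
    def __init__(self, in_channels=1, hidden=64, window=480, pool=2):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv1d(in_channels, hidden // 2, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(hidden // 2, hidden, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.pool = nn.LPPool1d(norm_type=2, kernel_size=pool)  # fixed p=2 stand-in for learnable L2 pooling
        self.register_buffer("pos_embed",
                             sinusoidal_position_embedding(window // pool, hidden))

    def forward(self, y):                    # y: (batch, W) aggregate power window
        e = self.convs(y.unsqueeze(1))       # patch embedding E: (batch, hidden, W)
        e = self.pool(e).transpose(1, 2)     # (batch, W/pool, hidden)
        return e + self.pos_embed            # z_0
```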
In step S3, the embedded matrix is encoded by the transform block, and an input sequence is obtained. The specific implementation process is as follows:
The embedded matrix obtained after convolution and pooling is fed to the Transformer block of TransUNet-NILM for encoding. The encoding part of the Transformer block comprises two LayerNorm layers, a multi-head attention layer and a feedforward neural network. The specific process is as follows:
S301, performing a linear transformation on the embedded matrix to obtain its Q, K and V matrices. Specifically:
Single-head attention (scaled dot-product attention) can be represented by the Q (Query), K (Key) and V (Value) matrices obtained by linear transformation of the embedded matrix. Q is first multiplied by K^T and scaled by the dimension d_k of the K vectors, SoftAttention is then constructed by a softmax operation, and the result is multiplied by V to return a weighted matrix, as follows:
Attention(Q, K, V) = softmax(QK^T / √d_k) · V
s302, acquiring a plurality of subspaces of a plurality of Q, K and V matrixes through the multi-head attention layer, and splicing the plurality of subspaces to obtain an output sequence. The method specifically comprises the following steps:
Multi-head attention divides the hidden space into multiple subspaces with parameter matrices and performs the same calculation in each, yielding multiple Q, K and V matrices. Multi-head attention can therefore gather information from several subspaces; the output result z_i obtained in each subspace is finally concatenated to form the output sequence. The formulas can be expressed as:
MultiHead(Q, K, V) = Concat(head_1, head_2, …, head_h) W^O
head_i = Attention(Q W_i^Q, K W_i^K, V W_i^V)
where W_i^Q, W_i^K and W_i^V are trained projection weight matrices, and W^O is the jointly trained output weight matrix of the model.
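For illustration, a minimal PyTorch sketch of the multi-head self-attention described by these formulas is given below; the model dimension of 64 and the use of 4 heads are assumptions, not values from the patent:

```python
import math
import torch
import torch.nn as nn

class MultiHeadSelfAttention(nn.Module):
    """Per-head Q, K, V projections, scaled dot-product attention
    softmax(QK^T / sqrt(d_k)) V, concatenation of the heads, then W^O."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads, self.d_k = n_heads, d_model // n_heads
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)            # W^O

    def forward(self, z):                                 # z: (batch, seq, d_model)
        b, t, _ = z.shape
        def split(x):                                     # -> (batch, heads, seq, d_k)
            return x.view(b, t, self.n_heads, self.d_k).transpose(1, 2)
        q, k, v = split(self.w_q(z)), split(self.w_k(z)), split(self.w_v(z))
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_k)
        heads = torch.softmax(scores, dim=-1) @ v         # per-head outputs z_i
        heads = heads.transpose(1, 2).contiguous().view(b, t, -1)  # Concat(head_1, ..., head_h)
        return self.w_o(heads)                            # MultiHead(Q, K, V)
```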
S303, normalizing the output sequence through the first layer LayerNorm. The method specifically comprises the following steps:
The matrix MultiHead obtained by the multi-head attention layer calculation, denoted here z_MH, is first normalized; LayerNorm is used here to accelerate training and to improve its stability. The normalized output is fed to the position feedforward neural network, a simple layer consisting of two fully connected layers: the first applies a linear transformation, and the second applies a nonlinear transformation with the activation function ReLU followed by another linear transformation. Their specific function is to map the input z_MH to a higher dimension, filter it with the nonlinear function ReLU, and restore the original dimension. The specific formulas are:
z̄_MH^j = LayerNorm(z_MH^j + z_(j-1))
PFFN(z̄_MH^j) = ReLU(z̄_MH^j W_1 + b_1) W_2 + b_2
where z̄_MH^j denotes the normalized output of the multi-head attention layer in the j-th encoder; z_(j-1) denotes the input sequence obtained from the (j−1)-th encoder; PFFN denotes the feedforward neural network operation; W_1 and W_2 denote weight parameters, and b_1 and b_2 denote biases.
It should be noted that the encoding part of the Transformer block is stacked from 6 encoders, and z̄_MH^j denotes the normalized output of the multi-head attention layer in the j-th encoder.
S304, the normalized output sequence of the multi-head attention layer and the output of the position feedforward neural network are combined through a residual connection to obtain a concatenated sequence, and the concatenated sequence is normalized by the second LayerNorm layer to obtain the input sequence z_j. The specific formula is:
z_j = LayerNorm(z̄_MH^j + PFFN(z̄_MH^j))
it should be noted that, in the embodiment of the present invention, since the coding block uses 6 identical encoders, the method further includes: s305, repeating the steps S303 to S304.
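As a sketch of how one such encoder and the stack of 6 could look in PyTorch (using the built-in nn.MultiheadAttention rather than a custom implementation; the dimensions are again illustrative assumptions):

```python
import torch.nn as nn

class TransformerEncoderBlock(nn.Module):
    """One encoder: multi-head self-attention with a residual connection and the
    first LayerNorm, then a position-wise feed-forward network (linear -> ReLU ->
    linear) with a residual connection and the second LayerNorm producing z_j."""
    def __init__(self, d_model=64, n_heads=4, d_ff=256):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.pffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                                  nn.Linear(d_ff, d_model))
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, z_prev):                            # z_{j-1}: (batch, seq, d_model)
        attn_out, _ = self.attn(z_prev, z_prev, z_prev)   # multi-head self-attention
        z_mh = self.norm1(z_prev + attn_out)              # first LayerNorm
        return self.norm2(z_mh + self.pffn(z_mh))         # z_j after PFFN + second LayerNorm

# the encoding part stacks 6 identical encoders
encoder = nn.Sequential(*[TransformerEncoderBlock() for _ in range(6)])
```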
In step S4, the input sequence is decoded by the up-sampling block to obtain the multi-scale features, and the multi-scale features are cut to obtain the sub-sequence, and the sub-sequence passes through the output layer to obtain the high-resolution data. The specific implementation process is as follows:
S401, a plurality of up-sampling blocks are used, each comprising a one-dimensional deconvolution layer and a convolution layer. When the encoding is input to an up-sampling block, it first passes through the deconvolution layer, and the resulting encoding is concatenated with the high-resolution features patch_i(t) convolved by the multiple convolution layers in the up-sampling block, obtaining multi-scale features.
S402, cutting the multi-scale features through the sub-window cutting layer to obtain a subsequence.
The method specifically comprises the following steps:
Considering that the window size is W and that the part of main interest during load decomposition is the region near the midpoint of the window, the power consumption signals at t = 0 and t = W have less influence on the decomposition result while increasing training time. Therefore, weighing the data dimension against the window size, the embodiment of the invention introduces a smaller clipping window W′ whose center is aligned with the center of the main window, with W′ = W/2. The middle part of the last layer of the final decoder generates a "sub-sequence" as the final output.
S403, repeating the steps S401 to S402, and finally inputting the subsequence to an output layer, wherein the layer comprises a convolution layer and an MLP, the MLP comprises a deconvolution layer and two linear layers, and the formula is as follows:
s_t = softmax(MLP(z))
MLP(z) = Tanh(Deconv(z) W_1 + b_1) W_2 + b_2
where Tanh is the activation function, Deconv denotes the deconvolution calculation, W_1 and W_2 denote weight parameters, and b_1 and b_2 denote biases.
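The following PyTorch sketch illustrates one possible shape of an up-sampling block, the sub-window clipping and the output layer; the kernel sizes, channel counts and the number of appliance states are assumptions made for the example and only standard torch modules are used:

```python
import torch
import torch.nn as nn

class UpsamplingBlock(nn.Module):
    """One up-sampling step: a 1-D deconvolution doubles the temporal resolution,
    the result is concatenated with the matching high-resolution feature from the
    down-sampling path (skip connection), and a convolution fuses the scales."""
    def __init__(self, channels=64):
        super().__init__()
        self.deconv = nn.ConvTranspose1d(channels, channels, kernel_size=2, stride=2)
        self.fuse = nn.Conv1d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, z, skip):              # z: (batch, C, T), skip: (batch, C, 2T)
        up = self.deconv(z)                  # (batch, C, 2T)
        return torch.relu(self.fuse(torch.cat([up, skip], dim=1)))

def clip_subwindow(features, w_prime):
    """Sub-window clipping layer: keep the central W' time steps of the main window."""
    w = features.shape[-1]
    start = (w - w_prime) // 2               # clipping window centred on the main window
    return features[..., start:start + w_prime]

class OutputLayer(nn.Module):
    """Output layer: a convolution followed by an MLP (deconvolution plus two linear
    layers with Tanh), producing per-appliance state probabilities via softmax."""
    def __init__(self, channels=64, n_appliances=5, n_states=2):
        super().__init__()
        self.conv = nn.Conv1d(channels, channels, kernel_size=3, padding=1)
        self.deconv = nn.ConvTranspose1d(channels, channels, kernel_size=1)
        self.lin1 = nn.Linear(channels, channels)
        self.lin2 = nn.Linear(channels, n_appliances * n_states)
        self.n_appliances, self.n_states = n_appliances, n_states

    def forward(self, z):                    # z: (batch, C, W')
        h = self.deconv(self.conv(z)).transpose(1, 2)              # (batch, W', C)
        h = self.lin2(torch.tanh(self.lin1(h)))                    # MLP(z)
        h = h.view(h.shape[0], h.shape[1], self.n_appliances, self.n_states)
        return torch.softmax(h, dim=-1)      # s_t = softmax(MLP(z))
```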
In step S5, the TransUNet-NILM model is optimized using the absolute error between the real labels and the high-resolution data as the optimization target. The specific implementation process is as follows:
A minimum loss function is selected for the decomposition. The embodiment of the invention uses the mean absolute error as the loss function for the regressed power to ensure the accuracy of the decomposition, and uses the cross entropy between the softmax-predicted distribution of the decomposition and the true state of each device as the loss function for the device state. The specific formulas take the following form:
L_output = (1/W′) Σ_(t=1..W′) |x(t) − x̃(t)|
L_state = −(1/(W′·N)) Σ_(t=1..W′) Σ_(i=1..N) s_i(t) · log(ŝ_i(t))
where x = f(y) denotes the predicted power, x̃ denotes the true values, s and ŝ respectively denote the real state label and the predicted state, and N denotes the number of electrical devices.
Note: the length of the output sub-sequence is taken into account here, i.e., the steps above finally set the small window to W′ = W/2, so the model output is of dimension W′ × N.
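Assuming the model outputs both a regressed power sequence and softmax state probabilities over the clipped sub-sequence (as described above), the combined optimization target could be computed as in this illustrative sketch; the function name and the equal weighting of the two terms are assumptions:

```python
import torch

def transunet_nilm_loss(power_pred, power_true, state_probs, state_onehot, eps=1e-8):
    """Mean absolute error on the regressed power plus the cross entropy between the
    softmax state distribution and the true state of each appliance, both evaluated
    over the clipped sub-sequence of length W'."""
    l_output = torch.mean(torch.abs(power_pred - power_true))                     # MAE term
    l_state = -torch.mean(torch.sum(state_onehot * torch.log(state_probs + eps), dim=-1))
    return l_output + l_state

# power_pred / power_true: (batch, W', N); state_probs / state_onehot: (batch, W', N, K)
```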
It should be noted that, in the embodiment of the present invention, the method further includes:
and S6, testing the TransUNet-NILM model through a test set, and further optimizing the TransUNet-NILM model.
An embodiment of the present invention further provides a load decomposition system based on a deep learning attention mechanism. As shown in FIG. 2, the load decomposition system includes a training subsystem for training the TransUNet-NILM model and a signal processing subsystem for invoking the TransUNet-NILM model; the signal processing subsystem invokes the TransUNet-NILM model to process a total power signal to obtain high-resolution data. The model introduces the self-attention mechanism of a Transformer into a U-Net architecture as an encoder and includes a down-sampling block, a Transformer block and an up-sampling block. The training subsystem of the model includes:
the training set acquisition module is used for acquiring a total power signal with a real label and processing the total power signal to obtain a training set;
the down-sampling module is used for extracting the characteristics of the training set through the down-sampling block to obtain an embedded matrix;
the Transformer module is used for coding the embedded matrix through the Transformer block to obtain an input sequence;
the up-sampling module is used for decoding the input sequence through the up-sampling block to obtain multi-scale features, cutting the multi-scale features to obtain sub-sequences, and obtaining high-resolution data through the sub-sequences through the output layer;
and the optimization module is used for optimizing the TransUNet-NILM model by using the absolute value error of the real label and the high-resolution data as an optimization target.
It can be understood that the load decomposition system for the deep learning attention mechanism provided in the embodiment of the present invention corresponds to the load decomposition method based on the deep learning attention mechanism, and the explanations, examples, and advantageous effects of the relevant contents thereof can refer to the corresponding contents in the load decomposition method based on the deep learning attention mechanism, and are not repeated herein.
Embodiments of the present invention also provide a computer-readable storage medium storing a computer program for load decomposition in a deep learning attention mechanism, wherein the computer program causes a computer to execute the load decomposition method based on the deep learning attention mechanism as described above.
An embodiment of the present invention further provides an electronic device, including:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs comprising instructions for performing the load decomposition method based on a deep learning attention mechanism as described above.
In summary, compared with the prior art, the method has the following beneficial effects:
1. The sequence-to-subsequence method provided by the embodiment of the invention strikes a balance between seq2seq and seq2point to alleviate the convergence difficulty of deep neural networks, so that training is easier and the amount of computation during inference is reduced.
2. The embodiment of the invention combines a Transformer and a CNN, adopting a TransUNet model so that the two compensate for each other's shortcomings: the Transformer block encodes tokenized patches from the convolutional neural network feature map into an input sequence for extracting global context, and the decoder up-samples the encoded features and combines them with the high-resolution CNN feature map to achieve accurate decomposition.
3. The embodiment of the invention provides a method based on a deep learning attention mechanism that can learn the interrelations between features of long input sequences through the attention mechanism without relying entirely on the data, effectively solving the prior-art problem of over-dependence on the original data and further improving the decomposition accuracy.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of additional like elements in the process, method, article, or apparatus that comprises the element.
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A load decomposition method based on a deep learning attention mechanism is characterized in that the load decomposition method processes a total power signal through a pre-constructed TransUNet-NILM model to obtain high-resolution data, the TransUNet-NILM model introduces a self-attention mechanism of a Transformer as an encoder in a U-Net architecture, the model comprises a down-sampling block, a Transformer block and an up-sampling block, and the construction process of the TransUNet-NILM model comprises the following steps:
s1, acquiring a total power signal with a real label, and processing the total power signal to obtain a training set;
s2, extracting features of the training set through a down-sampling block to obtain an embedded matrix;
s3, coding the embedded matrix through a Transformer block to obtain an input sequence;
s4, decoding the input sequence through an up-sampling block to obtain multi-scale features, cutting the multi-scale features to obtain subsequences, and enabling the subsequences to pass through an output layer to obtain high-resolution data;
and S5, optimizing the TransUNet-NILM model by using the absolute value error of the real label and the high-resolution data as an optimization target.
2. The load decomposition method based on a deep learning attention mechanism of claim 1, wherein the downsampling block comprises a plurality of convolution layers and pooling layers in order from input to output;
the feature extraction is carried out on the training set through the down-sampling block to obtain an embedded matrix, and the method comprises the following steps:
increasing the hidden size of the data in the training set by using a plurality of convolutional layers, introducing position embedding, and applying a learnable L2-norm pooling operation in the pooling layer to pool the convolution output with increased hidden size, obtaining an embedded matrix.
3. The load decomposition method based on a deep learning attention mechanism as claimed in claim 1, wherein the Transformer block comprises a multi-head attention mechanism layer, two LayerNorm layers, and a feedforward neural network;
the encoding of the embedded matrix by the transform block to obtain an input sequence includes:
s301, carrying out linear change on the embedded matrix to obtain Q, K and V matrixes of the embedded matrix;
s302, acquiring a plurality of subspaces of a plurality of Q, K and V matrixes through a multi-head attention layer, and splicing the plurality of subspaces to obtain an output sequence;
s303, normalizing the output sequence through the first layer LayerNorm;
S304, connecting the normalized output sequence of the multi-head attention layer and the output of the position feedforward neural network through a residual connection to obtain a concatenated sequence, and normalizing the concatenated sequence through the second LayerNorm layer to obtain the input sequence.
4. The load decomposition method based on a deep learning attention mechanism of claim 1, wherein the upsampling block comprises, in order from input to output: a deconvolution layer, a convolution layer, a window clipping layer and an output layer;
the decoding of the input sequence through the up-sampling block to obtain the multi-scale features, and the cutting of the multi-scale features to obtain the sub-sequence, wherein the sub-sequence obtains the high-resolution data through the output layer, and the method comprises the following steps:
S401, using a plurality of up-sampling blocks, each comprising a one-dimensional deconvolution layer and a convolution layer: when the encoding is input to an up-sampling block, it first passes through the deconvolution layer, and the high-level features obtained by the down-sampling block are concatenated with the high-resolution features obtained by convolution in the convolution layer of the up-sampling block to obtain multi-scale features;
s402, cutting the multi-scale features through the sub-window cutting layer to obtain a subsequence;
and S403, repeating the steps S401 to S402, and finally inputting the subsequence to an output layer, wherein the output layer comprises a convolutional layer and an MLP.
5. The method for load decomposition based on deep learning attention mechanism according to claim 4, wherein the clipping the multi-scale features through the sub-window clipping layer to obtain the sub-sequence comprises:
the center of the clipping window in the sub-window clipping layer is aligned with the center of the main window, and W′ ≤ W/2, where W′ denotes the clipping window and W denotes the main window.
6. The deep learning attention mechanism-based load decomposition method according to any one of claims 1 to 5, further comprising:
and S6, testing the TransUNet-NILM model through a test set, and further optimizing the TransUNet-NILM model.
7. A load decomposition system based on a deep learning attention mechanism, wherein the load decomposition system comprises a training subsystem for training the TransUNet-NILM model and a signal processing subsystem for calling the TransUNet-NILM model, the signal processing subsystem calls the TransUNet-NILM model to process a total power signal to obtain high-resolution data, the TransUNet-NILM model introduces the self-attention mechanism of a Transformer into a U-Net architecture as an encoder, the model comprises a downsampling block, a Transformer block and an upsampling block, and the training subsystem comprises:
the training set acquisition module is used for acquiring a total power signal with a real label and processing the total power signal to obtain a training set;
the down-sampling module is used for extracting the characteristics of the training set through a down-sampling block to obtain an embedded matrix;
the Transformer module is used for coding the embedded matrix through the Transformer block to obtain an input sequence;
the up-sampling module is used for decoding the input sequence through the up-sampling block to obtain multi-scale features, cutting the multi-scale features to obtain sub-sequences, and obtaining high-resolution data through the sub-sequences through the output layer;
and the optimization module is used for optimizing the TransUNet-NILM model by using the absolute value error of the real label and the high-resolution data as an optimization target.
8. The load decomposition system based on a deep learning attention mechanism of claim 7, further comprising:
and the test module is used for testing the TransUNet-NILM model through a test set and further optimizing the TransUNet-NILM model.
9. A computer-readable storage medium storing a computer program for load decomposition based on a deep learning attention mechanism, wherein the computer program causes a computer to execute the load decomposition method based on the deep learning attention mechanism according to any one of claims 1 to 6.
10. An electronic device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs comprising instructions for performing the load decomposition method based on a deep learning attention mechanism according to any one of claims 1 to 6.
CN202211419435.0A 2022-11-14 2022-11-14 Load decomposition method and system based on deep learning attention mechanism Pending CN115659148A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211419435.0A CN115659148A (en) 2022-11-14 2022-11-14 Load decomposition method and system based on deep learning attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211419435.0A CN115659148A (en) 2022-11-14 2022-11-14 Load decomposition method and system based on deep learning attention mechanism

Publications (1)

Publication Number Publication Date
CN115659148A true CN115659148A (en) 2023-01-31

Family

ID=85020612

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211419435.0A Pending CN115659148A (en) 2022-11-14 2022-11-14 Load decomposition method and system based on deep learning attention mechanism

Country Status (1)

Country Link
CN (1) CN115659148A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination