CN115659148A - Load decomposition method and system based on deep learning attention mechanism - Google Patents

Load decomposition method and system based on deep learning attention mechanism

Info

Publication number
CN115659148A
Authority
CN
China
Prior art keywords
layer
model
block
attention mechanism
nilm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211419435.0A
Other languages
Chinese (zh)
Inventor
周开乐
张志越
陈鸣飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology filed Critical Hefei University of Technology
Priority to CN202211419435.0A priority Critical patent/CN115659148A/en
Publication of CN115659148A publication Critical patent/CN115659148A/en
Pending legal-status Critical Current


Abstract

The invention provides a load decomposition method and system based on a deep learning attention mechanism, and relates to the technical field of load monitoring. The invention processes the total power signal through a pre-constructed TransUNet-NILM model to obtain high-resolution data. The model introduces the self-attention mechanism of a Transformer into a U-Net architecture as the encoder and comprises a down-sampling block, a Transformer block and an up-sampling block; during construction of the model, the input sequence is decoded by the up-sampling block to obtain multi-scale features, which are clipped to obtain sub-sequences. The sequence-to-subsequence method provided by the invention strikes a balance between seq2seq and seq2point to alleviate the convergence difficulty of deep neural networks, so that training is easier and the amount of computation during inference is reduced.

Description

Load decomposition method and system based on deep learning attention mechanism
Technical Field
The invention relates to the technical field of load monitoring, in particular to a load decomposition method and system based on a deep learning attention mechanism.
Background
Load disaggregation (also known as non-intrusive load monitoring, or NILM) is a computational technique for estimating the power demand of individual appliances from a single meter that measures the combined demand of multiple appliances. As the term "non-intrusive" suggests, the method involves very little intrusion on user privacy. The ultimate goals of load disaggregation are to help reduce household energy consumption, to help operators manage the grid, to identify faulty appliances, and to survey appliance usage behavior. Because information about the energy usage of each device is obtained from the total power consumption, no additional sensors are required; this reduces the cost of the sensing infrastructure while still allowing the load of electrical devices to be monitored. A load disaggregation algorithm can inform end consumers of their potential energy savings, can be used for demand response management, and also allows new, fairer pricing policies to be formulated, especially in electricity markets that favor green credits.
At present, as deep learning methods have achieved good results in various fields, research on load decomposition has also shifted from traditional signal processing methods to deep learning architectures. Within the deep learning framework, NILM is treated as a sequence-to-sequence (seq2seq), sequence-to-point (seq2point), or sequence-to-subsequence (seq2subseq) problem, with the tasks of single-label or multi-label state classification and energy usage prediction.
However, existing methods still have certain problems. For the seq2seq method, for example, when the length (time window) of the input (power) and output (appliance) sequences becomes long, training becomes difficult to converge; that is, within existing deep learning frameworks, the training process converges with difficulty.
Disclosure of Invention
Technical problem to be solved
In view of the defects of the prior art, the invention provides a load decomposition method and system based on a deep learning attention mechanism, solving the technical problem in the prior art that the training process converges with difficulty.
(II) technical scheme
In order to achieve the purpose, the invention is realized by the following technical scheme:
in a first aspect, the present invention provides a load decomposition method based on a deep learning attention mechanism, wherein the load decomposition method obtains high resolution data by processing a total power signal through a pre-constructed TransUNet-NILM model, the TransUNet-NILM model introduces a self-attention mechanism of a Transformer into a U-Net architecture and serves as an encoder, the model comprises a downsampling block, a Transformer block and an upsampling block, and the construction process of the TransUNet-NILM model comprises:
s1, acquiring a total power signal with a real label, and processing the total power signal to obtain a training set;
s2, extracting features of the training set through a down-sampling block to obtain an embedded matrix;
s3, coding the embedded matrix through a Transformer block to obtain an input sequence;
s4, decoding the input sequence through the up-sampling block to obtain multi-scale features, cutting the multi-scale features to obtain sub-sequences, and enabling the sub-sequences to pass through an output layer to obtain high-resolution data;
and S5, optimizing the TransUNet-NILM model by using the absolute value error of the real label and the high-resolution data as an optimization target.
Preferably, the downsampling block comprises a plurality of convolution layers and pooling layers from input to output in sequence;
the feature extraction is carried out on the training set through the lower sampling block to obtain an embedded matrix, and the method comprises the following steps:
increasing the hidden size of the data in the training set by using a plurality of convolutional layers, introducing position embedding, and applying a learnable L2-norm pooling operation in the pooling layer to pool the convolution output with increased hidden size, obtaining an embedded matrix.
Preferably, the Transformer block comprises a multi-head attention mechanism layer, two LayerNorm layers and a feedforward neural network;
the encoding of the embedded matrix by the transform block to obtain an input sequence includes:
S301, performing a linear transformation on the embedded matrix to obtain its Q, K and V matrices;
s302, acquiring a plurality of subspaces of a plurality of Q, K and V matrixes through a multi-head attention layer, and splicing the plurality of subspaces to obtain an output sequence;
s303, normalizing the output sequence through the first layer LayerNorm;
S304, connecting the normalized output sequence of the multi-head attention layer and the output of the position feedforward neural network through a residual connection to obtain a concatenated sequence, and normalizing the concatenated sequence through the second LayerNorm layer to obtain the input sequence.
Preferably, the upsampling block comprises, in order from input to output: a deconvolution layer, a convolution layer, a window clipping layer and an output layer;
the decoding of the input sequence by the up-sampling block to obtain the multi-scale features and the cutting of the multi-scale features to obtain the sub-sequence, the sub-sequence passing through the output layer to obtain the high resolution data, includes:
S401, using a plurality of up-sampling blocks, each comprising a one-dimensional deconvolution layer and a convolution layer: when the encoding is input to an up-sampling block, it first passes through the deconvolution layer, and the high-level features obtained by the down-sampling block are concatenated with the high-resolution features obtained by convolution in the convolution layer of the up-sampling block to obtain multi-scale features;
s402, cutting the multi-scale features through the sub-window cutting layer to obtain a subsequence;
and S403, repeating the steps S401 to S402, and finally inputting the subsequence to an output layer, wherein the output layer comprises a convolutional layer and an MLP.
Preferably, the clipping the multi-scale features through the sub-window clipping layer to obtain a subsequence includes:
the center of the clipping window in the sub-window clipping layer is aligned with the center of the main window, and W′ ≤ W/2, where W′ denotes the clipping window and W denotes the main window.
Preferably, the method further comprises:
and S6, testing the TransUNet-NILM model through a test set, and further optimizing the TransUNet-NILM model.
In a second aspect, the present invention provides a load decomposition system based on a deep learning attention mechanism, in which the load decomposition system includes a training subsystem for training the TransUNet-NILM model and a signal processing subsystem for calling the TransUNet-NILM model, the signal processing subsystem calls the TransUNet-NILM model to process a total power signal to obtain high resolution data, the TransUNet-NILM model introduces the self-attention mechanism of a Transformer into a U-Net architecture as an encoder, the model includes a downsampling block, a Transformer block and an upsampling block, and the training subsystem includes:
the training set acquisition module is used for acquiring a total power signal with a real label and processing the total power signal to obtain a training set;
the down-sampling module is used for extracting the characteristics of the training set through a down-sampling block to obtain an embedded matrix;
the transform module is used for coding the embedded matrix through the transform block to obtain an input sequence;
the up-sampling module is used for decoding the input sequence through the up-sampling block to obtain multi-scale features, cutting the multi-scale features to obtain sub-sequences, and obtaining high-resolution data through the sub-sequences through the output layer;
and the optimization module is used for optimizing the TransUNet-NILM model by using the absolute value error of the real label and the high-resolution data as an optimization target.
Preferably, the system further comprises:
and the test module is used for testing the TransUNet-NILM model through a test set and further optimizing the TransUNet-NILM model.
In a third aspect, the present invention provides a computer-readable storage medium storing a computer program for load decomposition based on a deep learning attention mechanism, wherein the computer program causes a computer to execute the load decomposition method based on the deep learning attention mechanism as described above.
In a fourth aspect, the present invention provides an electronic device comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs comprising instructions for performing the load decomposition method based on a deep learning attention mechanism as described above.
(III) advantageous effects
The invention provides a load decomposition method and system based on a deep learning attention mechanism. Compared with the prior art, the method has the following beneficial effects:
The invention processes the total power signal through a pre-constructed TransUNet-NILM model to obtain high-resolution data. The model introduces the self-attention mechanism of a Transformer into a U-Net architecture as the encoder and comprises a down-sampling block, a Transformer block and an up-sampling block. The construction process of the model comprises the following steps: acquiring a total power signal with real labels, and processing the total power signal to obtain a training set; extracting features from the training set through the down-sampling block to obtain an embedded matrix; encoding the embedded matrix through the Transformer block to obtain an input sequence; decoding the input sequence through the up-sampling block to obtain multi-scale features, clipping the multi-scale features to obtain sub-sequences, and passing the sub-sequences through an output layer to obtain high-resolution data; and optimizing the TransUNet-NILM model with the absolute error between the real labels and the high-resolution data as the optimization target. The sequence-to-subsequence method provided by the invention strikes a balance between seq2seq and seq2point to alleviate the convergence difficulty of deep neural networks, so that training is easier and the amount of computation during inference is reduced.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the prior art descriptions will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a block diagram of a load splitting method based on a deep learning attention mechanism according to an embodiment of the present invention;
FIG. 2 is a block diagram of a load splitting system based on a deep learning attention mechanism in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are clearly and completely described, and it is obvious that the described embodiments are a part of the embodiments of the present invention, but not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the application solves the technical problem that convergence is difficult in the training process of existing methods by providing a load decomposition method and system based on a deep learning attention mechanism, striking a balance between seq2seq and seq2point, alleviating the convergence difficulty of deep neural networks, and accelerating convergence.
In order to solve the technical problems, the general idea of the embodiment of the application is as follows:
Currently, many different methods have been applied to load decomposition, and many approaches based on signal processing and machine learning techniques have been proposed, such as Hidden Markov Models (HMMs) and their variants, graph signal processing, and combinatorial optimization methods; HMM-based techniques are generally inefficient and computationally complex as the number of disaggregated devices increases.
However, as deep learning methods have recently achieved good results in many different fields, research on load decomposition has also shifted from traditional signal processing methods to deep learning architectures. Applications of deep learning to load decomposition include recurrent neural networks (RNN), denoising autoencoders (dAE), long short-term memory (LSTM), convolutional neural networks (CNN), generative adversarial networks (GAN) and the like, which have proved successful in power prediction. Researchers have generally treated NILM as a sequence-to-sequence (seq2seq), sequence-to-point (seq2point), or sequence-to-subsequence (seq2subseq) problem, with the tasks of single- or multi-label state classification and energy usage prediction.
The sequence-to-sequence model is a nonlinear regression between a sequence of power readings and the appliance readings over the same time window. The sequence-to-point model, rather than training the network to predict the appliance readings for the whole window, predicts only the output signal at the midpoint of the window using a sliding-window method that exploits all the neighbors, i.e. past and future, of the input sequence; this concentrates the representational capacity of the neural network on the midpoint of the window rather than on its harder edges, thereby producing a more accurate output.
Although some recent preliminary studies have demonstrated the great potential of NILM, many challenges remain to be solved. The first is the trade-off between model computational complexity and tracking the long-term dependence of energy consumption data, since such data typically contain rich daily, seasonal, and even annual patterns. Most deep learning models require a large amount of high-quality labeled data for training, especially power consumption data for each device. The cost of data collection, such as per-device measurement, can be high. Furthermore, many users are reluctant to share their device information due to concerns about privacy violations.
For the load disaggregation task, the power consumption patterns of different devices are generally of different scales, and the total consumption of multiple devices tends to have a more complex shape, thus requiring the ability to handle scale changes. In addition to data information in the time domain, it is important to consider the context dependency of consumption patterns, since energy consumption behavior involves higher-level semantics: for example, a dryer is operated after a washing machine, or a person may turn on the microwave oven multiple times until cooking is complete.
As can be seen from the above description, the existing methods have the following drawbacks:
1. For the seq2seq method, when the length (time window) of the input (power) and output (appliance) sequences becomes long, training becomes difficult to converge; for the seq2point method, each forward pass of the model produces only one output signal, which increases the amount of computation during inference.
2. RNN-based sequence-to-sequence models are weak at capturing long-term dependencies in appliance power signals, and their back-propagation paths are too long, easily causing severe vanishing or exploding gradients; LSTM-based methods behave more like a Markov decision process and have difficulty extracting global information; CNN-based models fail to exploit the relationship between successive appliance usages, which leads to a high false positive/negative error rate in the disaggregation results. Deep learning is applied to disaggregation by using one neural network for each device in the target environment, and the number of convolution layers and the filter resolution must be increased for each device to obtain more accurate results; as the number of disaggregated devices increases, the computational cost and complexity become too high for practical application.
3. Transformer-based models lack the inductive bias (the ability to capture local features) of CNNs when extracting power consumption signal features, and do not possess translation invariance and locality; they therefore cannot generalize well to the load decomposition task, i.e., accurate decomposition is difficult when data are insufficient.
Aiming at the defects of the prior art, the embodiment of the invention provides a load decomposition method and system based on a deep learning attention mechanism.
In order to better understand the technical solution, the technical solution will be described in detail with reference to the drawings and the specific embodiments.
The embodiment of the invention provides a load decomposition method based on a deep learning attention mechanism, which is characterized in that a pre-constructed TransUNet-NILM model is used for processing a total power signal to obtain high-resolution data, the model introduces a self-attention mechanism of a Transformer in a U-Net framework and is used as an encoder, the model comprises a down-sampling block, a Transformer block and an up-sampling block, and as shown in figure 1, the construction process of the model comprises the following steps:
s1, acquiring a total power signal with a real label, and processing the total power signal to obtain a training set;
s2, extracting features of the training set through a lower sampling block to obtain an embedded matrix;
s3, coding the embedded matrix through a Transformer block to obtain an input sequence;
s4, decoding the input sequence through an up-sampling block to obtain multi-scale features, cutting the multi-scale features to obtain subsequences, and enabling the subsequences to pass through an output layer to obtain high-resolution data;
and S5, optimizing the TransUNet-NILM model by using the absolute value error of the real label and the high-resolution data as an optimization target.
The sequence-to-subsequence method provided by the embodiment of the invention strikes a balance between seq2seq and seq2point to alleviate the convergence difficulty of deep neural networks, so that training is easier and the amount of computation during inference is reduced.
The following describes each step in detail:
the pre-constructed TransUNet-NILM model of the embodiment of the invention introduces a self-attention mechanism of a Transformer in a U-Net framework and serves as an encoder, and comprises a down-sampling block, a Transformer block and an up-sampling block. The lower sampling block sequentially comprises a plurality of convolution layers and a pooling layer from input to output; the Transformer block comprises a multi-head attention mechanism layer, two LayerNorm layers and a feedforward neural network; the upsampling block comprises the following components from input to output in sequence: an deconvolution layer, a convolution layer, a window clipping layer, and an output layer. The model is constructed as follows:
in step S1, a total power signal with a real label is obtained, and the total power signal is processed to obtain a training set. The specific implementation process is as follows:
In the embodiment of the present invention, let y(t) ∈ R^(W×d) represent a group of input features derived from the total power consumption of N consumers, where W represents the window size, and X ∈ R^(W×N) represents the power signals of the relevant consumers; at time t each consumer has k states, denoted s_i(t) = {s_i(t)_1, s_i(t)_2, …, s_i(t)_k}, where s_i(t)_k ∈ {0, 1}.
S ∈ R^(W×N) represents the associated multi-label states of the N devices and X ∈ R^(W×N) is the corresponding power consumption; the training set D is then expressed as:
D = {y(t), s(t) | t = 1, 2, …, W}.
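As an illustrative sketch only (not taken from the patent text), the windowed training set D described above could be assembled along the following lines; the function name build_windows, the window length of 480 samples, and the use of NumPy with synthetic data are assumptions made for the example:

```python
import numpy as np

def build_windows(aggregate, states, window_size, stride=1):
    """Slice the aggregate power signal y(t) and the per-appliance state labels s(t)
    into overlapping windows of length W, forming the training set D."""
    samples, labels = [], []
    for start in range(0, len(aggregate) - window_size + 1, stride):
        end = start + window_size
        samples.append(aggregate[start:end])   # y(t), t = 1..W
        labels.append(states[start:end])       # s(t), a W x N state matrix
    return np.stack(samples), np.stack(labels)

# Example with synthetic data: 10,000 time steps, N = 5 appliances, binary on/off states
aggregate = np.random.rand(10_000).astype(np.float32)
states = np.random.randint(0, 2, size=(10_000, 5)).astype(np.float32)
X, S = build_windows(aggregate, states, window_size=480)
print(X.shape, S.shape)   # (9521, 480) and (9521, 480, 5)
```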
in step S2, feature extraction is performed on the training set through the downsampling block, so as to obtain an embedded matrix. The specific implementation process is as follows:
The training set is processed by the plurality of convolution layers and the pooling layer in the down-sampling block to obtain the embedded matrix. Specifically:
Before the training set is input into the Transformer block, feature extraction is carried out in the down-sampling part of the U-Net model. The hidden size of the input data is first increased by a plurality of convolution layers; a learnable L2-norm pooling operation in the pooling layer, which applies square-mean pooling to the input data to preserve features, then pools the convolution output with increased hidden size into patches patch_i(t). To encode the patch spatial information, position embedding is introduced in the embodiment of the invention and added to the patch embedding to capture the positional order, as follows:
z_0 = LPPooling(E) + E_pos
where E ∈ R^(W×d) is the patch embedding, E_pos ∈ R^(N×d) is the position embedding, LPPooling denotes the LP pooling calculation, and Conv denotes the convolution operation.
The effect of patch embedding is to convert the original two-dimensional input into a one-dimensional patch embedding, i.e., to perform a convolution. Unlike an RNN, the Transformer-based model allows parallel operation, and all power consumption data of a time period can be fed into the model simultaneously; E_pos is introduced to order the time sequence of the input power data, with the specific formulas:
PE(pos, 2i) = sin(pos / 10000^(2i/d_model))
PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))
where pos represents the position, d_model represents the vector length, and 2i and 2i+1 index the even and odd embedding dimensions.
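A minimal PyTorch sketch of such a down-sampling block is given below for illustration; the channel counts, kernel sizes and pooling stride are assumptions rather than values from the patent, and the fixed-p LPPool1d layer stands in for the learnable L2-norm pooling described above:

```python
import torch
import torch.nn as nn

def sinusoidal_position_embedding(length, d_model):
    """E_pos as in the formulas above: sine on even dimensions, cosine on odd dimensions."""
    pos = torch.arange(length, dtype=torch.float32).unsqueeze(1)
    i = torch.arange(0, d_model, 2, dtype=torch.float32)
    angle = pos / torch.pow(torch.tensor(10000.0), i / d_model)
    pe = torch.zeros(length, d_model)
    pe[:, 0::2] = torch.sin(angle)
    pe[:, 1::2] = torch.cos(angle)
    return pe

class DownsamplingBlock(nn.Module):
    """Stacked 1-D convolutions that grow the hidden size, L2 (LP) pooling,
    and an additive position embedding: z_0 = LPPooling(E) + E_pos."""
    def __init__(self, in_channels=1, hidden=64, window=480, pool=2):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv1d(in_channels, hidden // 2, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(hidden // 2, hidden, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.pool = nn.LPPool1d(norm_type=2, kernel_size=pool)  # fixed p=2 stand-in for learnable L2 pooling
        self.register_buffer("pos_embed",
                             sinusoidal_position_embedding(window // pool, hidden))

    def forward(self, y):                    # y: (batch, W) aggregate power window
        e = self.convs(y.unsqueeze(1))       # patch embedding E: (batch, hidden, W)
        e = self.pool(e).transpose(1, 2)     # (batch, W/pool, hidden)
        return e + self.pos_embed            # z_0
```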
In step S3, the embedded matrix is encoded by the transform block, and an input sequence is obtained. The specific implementation process is as follows:
The embedded matrix obtained after convolution and pooling is fed to the Transformer block of TransUNet-NILM for encoding. The encoding part of the Transformer block comprises two LayerNorm layers, a multi-head attention layer and a feedforward neural network. The specific process is as follows:
S301, performing a linear transformation on the embedded matrix to obtain its Q, K and V matrices. Specifically:
Single-head attention (scaled dot-product attention) can be represented by the Q (Query), K (Key) and V (Value) matrices obtained by linear transformation of the embedded matrix. Q is first multiplied by K^T and scaled by the dimension d_k of the K vectors, SoftAttention is then constructed by a softmax operation, and the result is multiplied by V to return a weighted matrix, as follows:
Attention(Q, K, V) = softmax(QK^T / √d_k) · V
s302, acquiring a plurality of subspaces of a plurality of Q, K and V matrixes through the multi-head attention layer, and splicing the plurality of subspaces to obtain an output sequence. The method specifically comprises the following steps:
Multi-head attention divides the hidden space into multiple subspaces with parameter matrices and performs the same calculation in each, yielding multiple Q, K and V matrices. Multi-head attention can therefore gather information from several subspaces; the output result z_i obtained in each subspace is finally concatenated to form the output sequence. The formulas can be expressed as:
MultiHead(Q, K, V) = Concat(head_1, head_2, …, head_h) W^O
head_i = Attention(Q W_i^Q, K W_i^K, V W_i^V)
where W_i^Q, W_i^K and W_i^V are trained projection weight matrices, and W^O is the jointly trained output weight matrix of the model.
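For illustration, a minimal PyTorch sketch of the multi-head self-attention described by these formulas is given below; the model dimension of 64 and the use of 4 heads are assumptions, not values from the patent:

```python
import math
import torch
import torch.nn as nn

class MultiHeadSelfAttention(nn.Module):
    """Per-head Q, K, V projections, scaled dot-product attention
    softmax(QK^T / sqrt(d_k)) V, concatenation of the heads, then W^O."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads, self.d_k = n_heads, d_model // n_heads
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)            # W^O

    def forward(self, z):                                 # z: (batch, seq, d_model)
        b, t, _ = z.shape
        def split(x):                                     # -> (batch, heads, seq, d_k)
            return x.view(b, t, self.n_heads, self.d_k).transpose(1, 2)
        q, k, v = split(self.w_q(z)), split(self.w_k(z)), split(self.w_v(z))
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_k)
        heads = torch.softmax(scores, dim=-1) @ v         # per-head outputs z_i
        heads = heads.transpose(1, 2).contiguous().view(b, t, -1)  # Concat(head_1, ..., head_h)
        return self.w_o(heads)                            # MultiHead(Q, K, V)
```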
S303, normalizing the output sequence through the first layer LayerNorm. The method specifically comprises the following steps:
The matrix MultiHead obtained by the multi-head attention layer calculation, denoted here z_MH, is first normalized; LayerNorm is used here to accelerate training and to improve its stability. The normalized output is fed to the position feedforward neural network, a simple layer consisting of two fully connected layers: the first applies a linear transformation, and the second applies a nonlinear transformation with the activation function ReLU followed by another linear transformation. Their specific function is to map the input z_MH to a higher dimension, filter it with the nonlinear function ReLU, and restore the original dimension. The specific formulas are:
z̄_MH^j = LayerNorm(z_MH^j + z_(j-1))
PFFN(z̄_MH^j) = ReLU(z̄_MH^j W_1 + b_1) W_2 + b_2
where z̄_MH^j denotes the normalized output of the multi-head attention layer in the j-th encoder; z_(j-1) denotes the input sequence obtained from the (j−1)-th encoder; PFFN denotes the feedforward neural network operation; W_1 and W_2 denote weight parameters, and b_1 and b_2 denote biases.
It should be noted that the encoding part of the Transformer block is stacked from 6 encoders, and z̄_MH^j denotes the normalized output of the multi-head attention layer in the j-th encoder.
S304, the normalized output sequence of the multi-head attention layer and the output of the position feedforward neural network are combined through a residual connection to obtain a concatenated sequence, and the concatenated sequence is normalized by the second LayerNorm layer to obtain the input sequence z_j. The specific formula is:
z_j = LayerNorm(z̄_MH^j + PFFN(z̄_MH^j))
it should be noted that, in the embodiment of the present invention, since the coding block uses 6 identical encoders, the method further includes: s305, repeating the steps S303 to S304.
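As a sketch of how one such encoder and the stack of 6 could look in PyTorch (using the built-in nn.MultiheadAttention rather than a custom implementation; the dimensions are again illustrative assumptions):

```python
import torch.nn as nn

class TransformerEncoderBlock(nn.Module):
    """One encoder: multi-head self-attention with a residual connection and the
    first LayerNorm, then a position-wise feed-forward network (linear -> ReLU ->
    linear) with a residual connection and the second LayerNorm producing z_j."""
    def __init__(self, d_model=64, n_heads=4, d_ff=256):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.pffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                                  nn.Linear(d_ff, d_model))
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, z_prev):                            # z_{j-1}: (batch, seq, d_model)
        attn_out, _ = self.attn(z_prev, z_prev, z_prev)   # multi-head self-attention
        z_mh = self.norm1(z_prev + attn_out)              # first LayerNorm
        return self.norm2(z_mh + self.pffn(z_mh))         # z_j after PFFN + second LayerNorm

# the encoding part stacks 6 identical encoders
encoder = nn.Sequential(*[TransformerEncoderBlock() for _ in range(6)])
```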
In step S4, the input sequence is decoded by the up-sampling block to obtain the multi-scale features, and the multi-scale features are cut to obtain the sub-sequence, and the sub-sequence passes through the output layer to obtain the high-resolution data. The specific implementation process is as follows:
S401, a plurality of up-sampling blocks are used, each comprising a one-dimensional deconvolution layer and a convolution layer. When the encoding is input to an up-sampling block, it first passes through the deconvolution layer, and the resulting encoding is concatenated with the high-resolution features patch_i(t) convolved by the multiple convolution layers in the up-sampling block, obtaining multi-scale features.
S402, cutting the multi-scale features through the sub-window cutting layer to obtain a subsequence.
The method specifically comprises the following steps:
Considering that the window size is W and that the part of main interest during load decomposition is the region near the midpoint of the window, the power consumption signals at t = 0 and t = W have less influence on the decomposition result while increasing training time. Therefore, weighing the data dimension against the window size, the embodiment of the invention introduces a smaller clipping window W′ whose center is aligned with the center of the main window, with W′ = W/2. The middle part of the last layer of the final decoder generates a "sub-sequence" as the final output.
S403, repeating the steps S401 to S402, and finally inputting the subsequence to an output layer, wherein the layer comprises a convolution layer and an MLP, the MLP comprises a deconvolution layer and two linear layers, and the formula is as follows:
s_t = softmax(MLP(z))
MLP(z) = Tanh(Deconv(z) W_1 + b_1) W_2 + b_2
where Tanh is the activation function, Deconv denotes the deconvolution calculation, W_1 and W_2 denote weight parameters, and b_1 and b_2 denote biases.
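The following PyTorch sketch illustrates one possible shape of an up-sampling block, the sub-window clipping and the output layer; the kernel sizes, channel counts and the number of appliance states are assumptions made for the example and only standard torch modules are used:

```python
import torch
import torch.nn as nn

class UpsamplingBlock(nn.Module):
    """One up-sampling step: a 1-D deconvolution doubles the temporal resolution,
    the result is concatenated with the matching high-resolution feature from the
    down-sampling path (skip connection), and a convolution fuses the scales."""
    def __init__(self, channels=64):
        super().__init__()
        self.deconv = nn.ConvTranspose1d(channels, channels, kernel_size=2, stride=2)
        self.fuse = nn.Conv1d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, z, skip):              # z: (batch, C, T), skip: (batch, C, 2T)
        up = self.deconv(z)                  # (batch, C, 2T)
        return torch.relu(self.fuse(torch.cat([up, skip], dim=1)))

def clip_subwindow(features, w_prime):
    """Sub-window clipping layer: keep the central W' time steps of the main window."""
    w = features.shape[-1]
    start = (w - w_prime) // 2               # clipping window centred on the main window
    return features[..., start:start + w_prime]

class OutputLayer(nn.Module):
    """Output layer: a convolution followed by an MLP (deconvolution plus two linear
    layers with Tanh), producing per-appliance state probabilities via softmax."""
    def __init__(self, channels=64, n_appliances=5, n_states=2):
        super().__init__()
        self.conv = nn.Conv1d(channels, channels, kernel_size=3, padding=1)
        self.deconv = nn.ConvTranspose1d(channels, channels, kernel_size=1)
        self.lin1 = nn.Linear(channels, channels)
        self.lin2 = nn.Linear(channels, n_appliances * n_states)
        self.n_appliances, self.n_states = n_appliances, n_states

    def forward(self, z):                    # z: (batch, C, W')
        h = self.deconv(self.conv(z)).transpose(1, 2)              # (batch, W', C)
        h = self.lin2(torch.tanh(self.lin1(h)))                    # MLP(z)
        h = h.view(h.shape[0], h.shape[1], self.n_appliances, self.n_states)
        return torch.softmax(h, dim=-1)      # s_t = softmax(MLP(z))
```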
In step S5, the TransUNet-NILM model is optimized using the absolute error between the real labels and the high-resolution data as the optimization target. The specific implementation process is as follows:
A minimum loss function is selected for the decomposition. The embodiment of the invention uses the mean absolute error as the loss function for the regressed power to ensure the accuracy of the decomposition, and uses the cross entropy between the softmax-predicted distribution of the decomposition and the true state of each device as the loss function for the device state. The specific formulas take the following form:
L_output = (1/W′) Σ_(t=1..W′) |x(t) − x̃(t)|
L_state = −(1/(W′·N)) Σ_(t=1..W′) Σ_(i=1..N) s_i(t) · log(ŝ_i(t))
where x = f(y) denotes the predicted power, x̃ denotes the true values, s and ŝ respectively denote the real state label and the predicted state, and N denotes the number of electrical devices.
Note: the length of the output sub-sequence is taken into account here, i.e., the steps above finally set the small window to W′ = W/2, so the model output is of dimension W′ × N.
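Assuming the model outputs both a regressed power sequence and softmax state probabilities over the clipped sub-sequence (as described above), the combined optimization target could be computed as in this illustrative sketch; the function name and the equal weighting of the two terms are assumptions:

```python
import torch

def transunet_nilm_loss(power_pred, power_true, state_probs, state_onehot, eps=1e-8):
    """Mean absolute error on the regressed power plus the cross entropy between the
    softmax state distribution and the true state of each appliance, both evaluated
    over the clipped sub-sequence of length W'."""
    l_output = torch.mean(torch.abs(power_pred - power_true))                     # MAE term
    l_state = -torch.mean(torch.sum(state_onehot * torch.log(state_probs + eps), dim=-1))
    return l_output + l_state

# power_pred / power_true: (batch, W', N); state_probs / state_onehot: (batch, W', N, K)
```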
It should be noted that, in the embodiment of the present invention, the method further includes:
and S6, testing the TransUNet-NILM model through a test set, and further optimizing the TransUNet-NILM model.
An embodiment of the present invention further provides a load decomposition system based on a deep learning attention mechanism. As shown in FIG. 2, the load decomposition system includes a training subsystem for training the TransUNet-NILM model and a signal processing subsystem for invoking the TransUNet-NILM model; the signal processing subsystem invokes the TransUNet-NILM model to process a total power signal to obtain high-resolution data. The model introduces the self-attention mechanism of a Transformer into a U-Net architecture as an encoder and includes a down-sampling block, a Transformer block and an up-sampling block. The training subsystem of the model includes:
the training set acquisition module is used for acquiring a total power signal with a real label and processing the total power signal to obtain a training set;
the down-sampling module is used for extracting the characteristics of the training set through the down-sampling block to obtain an embedded matrix;
the Transformer module is used for coding the embedded matrix through the Transformer block to obtain an input sequence;
the up-sampling module is used for decoding the input sequence through the up-sampling block to obtain multi-scale features, cutting the multi-scale features to obtain sub-sequences, and obtaining high-resolution data through the sub-sequences through the output layer;
and the optimization module is used for optimizing the TransUNet-NILM model by using the absolute value error of the real label and the high-resolution data as an optimization target.
It can be understood that the load decomposition system for the deep learning attention mechanism provided in the embodiment of the present invention corresponds to the load decomposition method based on the deep learning attention mechanism, and the explanations, examples, and advantageous effects of the relevant contents thereof can refer to the corresponding contents in the load decomposition method based on the deep learning attention mechanism, and are not repeated herein.
Embodiments of the present invention also provide a computer-readable storage medium storing a computer program for load decomposition in a deep learning attention mechanism, wherein the computer program causes a computer to execute the load decomposition method based on the deep learning attention mechanism as described above.
An embodiment of the present invention further provides an electronic device, including:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs comprising instructions for performing the load decomposition method based on a deep learning attention mechanism as described above.
In summary, compared with the prior art, the method has the following beneficial effects:
1. The sequence-to-subsequence method provided by the embodiment of the invention strikes a balance between seq2seq and seq2point to alleviate the convergence difficulty of deep neural networks, so that training is easier and the amount of computation during inference is reduced.
2. The embodiment of the invention combines a Transformer and a CNN, adopting a TransUNet model so that the two compensate for each other's shortcomings: the Transformer block encodes tokenized patches from the convolutional neural network feature map into an input sequence for extracting global context, and the decoder up-samples the encoded features and combines them with the high-resolution CNN feature map to achieve accurate decomposition.
3. The embodiment of the invention provides a method based on a deep learning attention mechanism that can learn the interrelations between features of long input sequences through the attention mechanism without relying entirely on the data, effectively solving the prior-art problem of over-dependence on the original data and further improving the decomposition accuracy.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of additional like elements in the process, method, article, or apparatus that comprises the element.
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A load decomposition method based on a deep learning attention mechanism is characterized in that the load decomposition method processes a total power signal through a pre-constructed TransUNet-NILM model to obtain high-resolution data, the TransUNet-NILM model introduces a self-attention mechanism of a Transformer as an encoder in a U-Net architecture, the model comprises a down-sampling block, a Transformer block and an up-sampling block, and the construction process of the TransUNet-NILM model comprises the following steps:
s1, acquiring a total power signal with a real label, and processing the total power signal to obtain a training set;
s2, extracting features of the training set through a down-sampling block to obtain an embedded matrix;
s3, coding the embedded matrix through a Transformer block to obtain an input sequence;
s4, decoding the input sequence through an up-sampling block to obtain multi-scale features, cutting the multi-scale features to obtain subsequences, and enabling the subsequences to pass through an output layer to obtain high-resolution data;
and S5, optimizing the TransUNet-NILM model by using the absolute value error of the real label and the high-resolution data as an optimization target.
2. The load decomposition method based on a deep learning attention mechanism of claim 1, wherein the downsampling block comprises a plurality of convolution layers and pooling layers in order from input to output;
the feature extraction is carried out on the training set through the down-sampling block to obtain an embedded matrix, and the method comprises the following steps:
increasing the hidden size of the data in the training set by using a plurality of convolutional layers, introducing position embedding, and applying a learnable L2-norm pooling operation in the pooling layer to pool the convolution output with increased hidden size, obtaining an embedded matrix.
3. The load decomposition method based on a deep learning attention mechanism as claimed in claim 1, wherein the Transformer block comprises a multi-head attention mechanism layer, two LayerNorm layers, and a feedforward neural network;
the encoding of the embedded matrix by the transform block to obtain an input sequence includes:
s301, carrying out linear change on the embedded matrix to obtain Q, K and V matrixes of the embedded matrix;
s302, acquiring a plurality of subspaces of a plurality of Q, K and V matrixes through a multi-head attention layer, and splicing the plurality of subspaces to obtain an output sequence;
s303, normalizing the output sequence through the first layer LayerNorm;
S304, connecting the normalized output sequence of the multi-head attention layer and the output of the position feedforward neural network through a residual connection to obtain a concatenated sequence, and normalizing the concatenated sequence through the second LayerNorm layer to obtain the input sequence.
4. The load decomposition method based on a deep learning attention mechanism of claim 1, wherein the upsampling block comprises, in order from input to output: a deconvolution layer, a convolution layer, a window clipping layer and an output layer;
the decoding of the input sequence through the up-sampling block to obtain the multi-scale features, and the cutting of the multi-scale features to obtain the sub-sequence, wherein the sub-sequence obtains the high-resolution data through the output layer, and the method comprises the following steps:
S401, using a plurality of up-sampling blocks, each comprising a one-dimensional deconvolution layer and a convolution layer: when the encoding is input to an up-sampling block, it first passes through the deconvolution layer, and the high-level features obtained by the down-sampling block are concatenated with the high-resolution features obtained by convolution in the convolution layer of the up-sampling block to obtain multi-scale features;
s402, cutting the multi-scale features through the sub-window cutting layer to obtain a subsequence;
and S403, repeating the steps S401 to S402, and finally inputting the subsequence to an output layer, wherein the output layer comprises a convolutional layer and an MLP.
5. The method for load decomposition based on deep learning attention mechanism according to claim 4, wherein the clipping the multi-scale features through the sub-window clipping layer to obtain the sub-sequence comprises:
the center of the clipping window in the sub-window clipping layer is aligned with the center of the main window, and W′ ≤ W/2, where W′ denotes the clipping window and W denotes the main window.
6. The deep learning attention mechanism-based load decomposition method according to any one of claims 1 to 5, further comprising:
and S6, testing the TransUNet-NILM model through a test set, and further optimizing the TransUNet-NILM model.
7. A load decomposition system based on a deep learning attention mechanism, wherein the load decomposition system comprises a training subsystem for training the TransUNet-NILM model and a signal processing subsystem for calling the TransUNet-NILM model, the signal processing subsystem calls the TransUNet-NILM model to process a total power signal to obtain high-resolution data, the TransUNet-NILM model introduces the self-attention mechanism of a Transformer into a U-Net architecture as an encoder, the model comprises a downsampling block, a Transformer block and an upsampling block, and the training subsystem comprises:
the training set acquisition module is used for acquiring a total power signal with a real label and processing the total power signal to obtain a training set;
the down-sampling module is used for extracting the characteristics of the training set through a down-sampling block to obtain an embedded matrix;
the Transformer module is used for coding the embedded matrix through the Transformer block to obtain an input sequence;
the up-sampling module is used for decoding the input sequence through the up-sampling block to obtain multi-scale features, cutting the multi-scale features to obtain sub-sequences, and obtaining high-resolution data through the sub-sequences through the output layer;
and the optimization module is used for optimizing the TransUNet-NILM model by using the absolute value error of the real label and the high-resolution data as an optimization target.
8. The load decomposition system based on a deep learning attention mechanism of claim 7, further comprising:
and the test module is used for testing the TransUNet-NILM model through a test set and further optimizing the TransUNet-NILM model.
9. A computer-readable storage medium storing a computer program for load decomposition based on a deep learning attention mechanism, wherein the computer program causes a computer to execute the load decomposition method based on the deep learning attention mechanism according to any one of claims 1 to 6.
10. An electronic device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs comprising instructions for performing the load decomposition method based on a deep learning attention mechanism according to any one of claims 1 to 6.
CN202211419435.0A 2022-11-14 2022-11-14 Load decomposition method and system based on deep learning attention mechanism Pending CN115659148A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211419435.0A CN115659148A (en) 2022-11-14 2022-11-14 Load decomposition method and system based on deep learning attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211419435.0A CN115659148A (en) 2022-11-14 2022-11-14 Load decomposition method and system based on deep learning attention mechanism

Publications (1)

Publication Number Publication Date
CN115659148A true CN115659148A (en) 2023-01-31

Family

ID=85020612

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211419435.0A Pending CN115659148A (en) 2022-11-14 2022-11-14 Load decomposition method and system based on deep learning attention mechanism

Country Status (1)

Country Link
CN (1) CN115659148A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination