CN114781744A - Deep learning multi-step irradiance prediction method based on codec - Google Patents

Deep learning multi-step irradiance prediction method based on codec

Info

Publication number
CN114781744A
CN114781744A
Authority
CN
China
Prior art keywords
data
lstm
output
tcn
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210492332.0A
Other languages
Chinese (zh)
Inventor
谢利萍
童俊龙
张晗津
张侃健
魏海坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University filed Critical Southeast University
Priority to CN202210492332.0A priority Critical patent/CN114781744A/en
Publication of CN114781744A publication Critical patent/CN114781744A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Business, Economics & Management (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a codec-based deep learning multi-step irradiance prediction method, and belongs to the technical field of photovoltaic power generation. The prediction method comprises the following steps: S1, acquiring training data: obtaining historical irradiance data of the target area and the corresponding meteorological data, and constructing a supervised data set; S2, preprocessing the data, including meteorological feature encoding and data normalization; S3, training the codec model, in which the encoder is a cascade of a TCN and an LSTM and the decoder is a cascade of an LSTM and an MLP; the irradiance of the current period t_0~t_N is read as supervisory information, and the historical irradiance and meteorological information before time t_0 serve as input data to train the codec model; S4, prediction: the historical data are input into the codec model trained in step S3 to predict solar irradiance multiple steps into the future. The method makes full use of the historical information of the irradiance sequence, and experiments show that it effectively improves the accuracy of multi-step irradiance prediction.

Description

Deep learning multi-step irradiance prediction method based on codec
Technical Field
The invention relates to a codec-based deep learning multi-step irradiance prediction method, and belongs to the technical field of photovoltaic power generation.
Background
Solar energy is the most promising renewable resource. A survey by the International Renewable Energy Agency shows that, as of 2020, 29% of global electricity production came from renewable sources, of which solar accounted for 26.77% and is rising year by year. However, owing to the uncertainty and intermittency of irradiance, photovoltaic power generation exhibits considerable instability, which increases the difficulty of grid connection and scheduling of photovoltaic generation and restricts the wide application of solar energy resources.
At present, there are various deep-learning irradiance prediction methods, such as prediction with LSTM and CNN models. Some researchers have focused on hybrid prediction methods combining RNN and CNN models, such as the joint LSTM-CNN prediction model. However, CNN models excel at extracting spatial features and have limited ability to extract time-dependent features, while RNN models can maintain timing dependence but struggle with long input sequences. These characteristics show that deep learning models based on RNNs and CNNs can hardly accommodate both long-sequence input and long-term dependence, and such models need improvement.
Disclosure of Invention
The technical problem is as follows:
The existing deep learning models based on RNNs and CNNs can hardly accommodate both long-sequence input and long-term dependence, so such prediction models struggle to make full use of historical information for irradiance prediction, and model accuracy is hard to guarantee once the prediction step length increases.
To solve this technical problem, the invention provides a codec-based deep learning multi-step irradiance prediction method, so that irradiance prediction makes full use of historical information and the prediction effect is improved.
The technical scheme is as follows:
The invention provides a codec-based deep learning multi-step irradiance prediction method, which comprises the following steps:
1. a deep learning multi-step long radiance prediction method based on a codec is characterized by comprising the following steps:
s1, acquiring training data, acquiring historical irradiance data of a target area and meteorological data corresponding to the historical irradiance data, and manufacturing a supervision data set according to a prediction task;
s2, preprocessing data, including weather information feature coding and data normalization;
s3, training the code decoder model, using the reading current time interval t0~tNIrradiance as supervisory information, and t0Historical irradiance and meteorological information before the moment are used as input data, and a codec model is trained;
and S4, forecasting, inputting the historical data into the codec model obtained by training S3, and forecasting future multi-step solar irradiance.
Further, the step S1 includes the following contents:
(1) obtaining historical irradiance data of the target area and the corresponding meteorological data (including but not limited to temperature, humidity, air pressure and wind speed);
(2) if a segment of the historical data is missing or illegal, it is replaced with the average of the adjacent data before and after it, ensuring the continuity and authenticity of the data and thus the quality of the training data;
(3) matching the supervisory information to construct the supervised data set: the irradiance of the current period t_0~t_N is read as the supervisory information and matched with the historical irradiance and the corresponding meteorological information as the input.
Further, the step S2 specifically includes the following contents:
(1) encoding the meteorological information corresponding to the irradiance: the weather type is one-hot encoded, and numerical information uses its value directly as the encoded value;
(2) in order to ensure that the model gradient changes reasonably in the training process, normalization operation needs to be performed on input data, and a normalization formula is as follows:
a' = (a - a_min) / (a_max - a_min),
where a denotes a feature in the data set, a' the normalized feature value, and a_max and a_min the maximum and minimum of that feature in the historical data.
Further, the codec structure in step S3 is as follows:
(1) the encoder is a series structure of a temporal convolutional network TCN and a long short-term memory network LSTM: the TCN takes in the long-sequence input while preserving the timing dependence, and the compressed short sequence then maintains that dependence through the LSTM; specifically, the encoder first receives the long-sequence input through the TCN, whose number of layers depends on the input length; the feature sequence extracted by the TCN is then compressed and passed to the LSTM; finally, the LSTM output serves as the encoder's encoded output;
(2) the decoder is a series structure of a long short-term memory network LSTM and a multilayer perceptron MLP: the LSTM is unrolled over time to produce the multi-step prediction, and a loss function balances the performance of the multi-step outputs; the decoder first receives the encoder output, the LSTM performs the decoding, and the decoded result is passed to the MLP, which matches the output dimensions.
Further, the codec is characterized as follows:
(1) the encoder is a cascade of the TCN and the LSTM: the TCN receives the long-sequence input and produces an output sequence after multi-layer feature extraction, and the rear section of that sequence is intercepted as the LSTM input;
(2) the decoder is a cascade of the LSTM and the MLP: the LSTM receives the encoder state, and the MLP matches the dimensions of the LSTM output;
(3) each layer of the TCN consists of a TCN residual block, comprising a series module and a residual connection; the series module consists of two identical groups of a dilated causal convolution layer, a weight normalization layer, a ReLU activation unit and a dropout layer;
(4) balancing the prediction performance of the multi-step output by designing a loss function, wherein the loss function is designed as follows:
Loss = Σ_{i=1}^{K} α_i · loss_i + β·‖W‖²,
where K denotes the number of prediction steps, loss_i the output loss of the i-th step, α_i the weight of loss_i, W the model parameters, and β the regularization coefficient.
Further, the TCN and LSTM operation process of the codec includes:
(1) the dilated (hole) causal convolution operation is described by:
F(s) = (x *_d f)(s) = Σ_{i=0}^{k-1} f(i) · x_{s-d·i},
where *_d denotes the dilated convolution operation, d the dilation coefficient, x the input sequence, s a position in the sequence, f the convolution kernel, k the kernel size, s - d·i the element selected by the dilated convolution, and F(s) the output of the dilated causal convolution;
(2) the TCN residual block:
O_1 = dropout(ReLU(Norm(F(s)))),
O_2 = dropout(ReLU(Norm(O_1))),
O_tcn = s + O_2,
where Norm denotes weight normalization, ReLU the activation function, and dropout the dropout layer; O_tcn denotes the output of the TCN residual block, i.e. the output of each TCN layer;
(3) the LSTM includes:
the forget gate f_t:
f_t = sigmoid(W_if·x_t + b_if + W_hf·h_{t-1} + b_hf),
the input gate i_t:
i_t = sigmoid(W_ii·x_t + b_ii + W_hi·h_{t-1} + b_hi),
the activation function g_t:
g_t = tanh(W_ig·x_t + b_ig + W_hg·h_{t-1} + b_hg),
the output gate o_t:
o_t = sigmoid(W_io·x_t + b_io + W_ho·h_{t-1} + b_ho),
the memory-cell state c_t at the current time:
c_t = f_t ⊙ c_{t-1} + i_t ⊙ g_t,
and the LSTM output state h_t:
h_t = o_t ⊙ tanh(c_t),
where W_if and b_if denote the weight and bias matrices of the forget gate's external input, W_hf and b_hf those of its hidden-state input, W_ii and b_ii those of the input gate's external input, W_hi and b_hi those of its hidden-state input, W_ig, b_ig, W_hg and b_hg those of the activation function, W_io and b_io those of the output gate's external input, and W_ho and b_ho those of its hidden-state input; f_t, i_t and o_t are respectively the outputs of the forget, input and output gates at time t, h_t is the hidden state at time t, and x_t is the external input.
Advantageous effects:
The encoder of the invention exploits the TCN's long receptive field to take in long-sequence input, and feeds a short sequence intercepted from the TCN output into the LSTM, thereby capturing longer historical information while keeping the LSTM input short. The encoder can thus accept long-sequence input while maintaining the timing dependence of temporal features, making it well suited to feature extraction from time series. The decoder structure lets the model emit multiple prediction steps in sequence, with a loss function balancing the performance of the multi-step outputs. Relying only on historical information, this codec structure achieves high-accuracy irradiance prediction.
Drawings
FIG. 1 is a flow chart of irradiance prediction of the present invention;
FIG. 2 is a block diagram of a codec of the present invention;
FIG. 3 is a schematic diagram of the TCN residual block inside the codec according to the present invention;
FIG. 4 is a schematic diagram of the elements of an internal LSTM module of the codec of the present invention;
FIG. 5 is a graph showing the results of the present invention.
Detailed Description
In order to more clearly illustrate the technical solution of the present invention, the present invention is described below with reference to the accompanying drawings. The examples are given solely for the purpose of illustration and are not intended to limit the scope of the invention.
Referring to fig. 1, a codec-based deep learning multi-step irradiance prediction method includes the following steps:
s1, acquiring training data, acquiring historical irradiance data of a target area and meteorological data corresponding to the historical irradiance data, and making a supervision data set, wherein S1 comprises the following contents:
(1) obtaining historical irradiance data of a target area and corresponding meteorological data thereof, and selecting temperature T, humidity H, air pressure P, wind speed W and the like as the meteorological data in the embodiment;
(2) if a certain segment of the historical data is missing or illegal, the average value of the adjacent data before and after the historical data is used for substitution, so that the continuity and the authenticity of the data are ensured, and the quality of the training data is ensured;
(3) matching the supervisory information to construct the supervised data set: the irradiance x_0~x_N of the current period t_0~t_N is read as the supervisory information for the multi-step prediction, and the historical irradiance matched with the corresponding meteorological information serves as the input. For example, with the 24 time steps before the current time as the historical input to the model and the irradiance at 6 future time steps as the prediction target, a sample of the data set can be represented as ([x_-24, T_-24, W_-24, P_-24, H_-24, ..., x_-1, T_-1, W_-1, P_-1, H_-1]; [x_0, ..., x_5]), where [x_0, ..., x_5] denotes the supervisory information and [x_-t, T_-t, W_-t, P_-t, H_-t] denotes the irradiance and meteorological information t steps before the current time.
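As an illustration of step (2), here is a minimal NumPy sketch of the neighbour-mean imputation; treating NaN or negative samples as missing or illegal is an assumption for the example, since the patent does not define "illegal":

```python
import numpy as np

def impute_neighbor_mean(series):
    """Replace missing (NaN) or illegal (assumed here: negative) samples
    with the mean of the nearest valid samples before and after the gap."""
    x = np.asarray(series, dtype=float).copy()
    bad = np.isnan(x) | (x < 0)
    valid = np.flatnonzero(~bad)
    for i in np.flatnonzero(bad):
        before = valid[valid < i]          # nearest valid sample before i
        after = valid[valid > i]           # nearest valid sample after i
        neighbours = []
        if before.size:
            neighbours.append(x[before[-1]])
        if after.size:
            neighbours.append(x[after[0]])
        x[i] = float(np.mean(neighbours)) if neighbours else 0.0
    return x
```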
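And for step (3), a minimal NumPy sketch of the sliding-window construction of the supervised data set; the function and argument names are illustrative, not from the patent:

```python
import numpy as np

def make_supervised_set(irradiance, weather, n_in=24, n_out=6):
    """Slide a window over the history: each input sample stacks
    [x, T, W, P, H] for the n_in past steps, and the target is the
    irradiance at the n_out future steps, as in the example above.

    irradiance : (T,) array of irradiance values x_t
    weather    : (T, 4) array of [temperature, wind, pressure, humidity]
    """
    feats = np.column_stack([irradiance, weather])      # (T, 5)
    X, Y = [], []
    for t in range(n_in, len(irradiance) - n_out + 1):
        X.append(feats[t - n_in:t])                     # (n_in, 5) input window
        Y.append(irradiance[t:t + n_out])               # (n_out,) supervision
    return np.stack(X), np.stack(Y)
```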
S2, preprocessing the data, including meteorological feature encoding and data normalization, specifically comprising the following contents:
(1) the meteorological information corresponding to the irradiance is encoded: the weather type is one-hot encoded, and numerical information uses its value directly as the encoded value;
(2) in order to ensure that the model gradient changes reasonably in the training process, normalization operation needs to be performed on input data, and a normalization formula is as follows:
a' = (a - a_min) / (a_max - a_min),
where a denotes a feature in the data set, a' the normalized feature value, and a_max and a_min the maximum and minimum of that feature in the historical data. The features normalized in this embodiment are the historical irradiance and the historical meteorological information, the latter comprising temperature, humidity, wind speed and air pressure.
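A minimal sketch of this encoding and normalization; the weather-type categories are illustrative assumptions:

```python
import numpy as np

def one_hot_weather(weather_types, categories=("sunny", "cloudy", "rainy")):
    """One-hot encode the categorical weather type."""
    idx = np.array([categories.index(w) for w in weather_types])
    return np.eye(len(categories))[idx]

def min_max_normalize(a, a_min=None, a_max=None):
    """a' = (a - a_min) / (a_max - a_min), with the extremes taken from
    the historical (training) data when not supplied."""
    a = np.asarray(a, dtype=float)
    a_min = a.min() if a_min is None else a_min
    a_max = a.max() if a_max is None else a_max
    return (a - a_min) / (a_max - a_min)
```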
S3, training the codec model;
An RNN model is prone to vanishing gradients and loss of the timing dependence under long-sequence input, while a CNN has a large receptive field but, owing to its structure, struggles to fully extract timing-dependent features. Deep learning prediction models based on RNNs and CNNs therefore can hardly accommodate both long-sequence input and long-term timing dependence, so they cannot make full use of historical information for irradiance prediction, and their accuracy is hard to guarantee as the prediction step length increases.
To make full use of the useful information in the historical sequence, this embodiment adopts the codec structure proposed by the invention, which accommodates both long-sequence input and long-term timing dependence: the irradiance of the current period t_0~t_N is read as supervisory information, and the historical irradiance and meteorological information before time t_0 serve as input data to train the codec model. Referring to fig. 2, the codec described in S3 includes the following:
the encoder is a series structure of a temporal convolutional network TCN and a long short-term memory network LSTM: the TCN takes in the long-sequence input while preserving the timing dependence, and the compressed short sequence then maintains that dependence through the LSTM; specifically, the encoder receives the long-sequence input through the TCN, whose number of layers depends on the input length; the feature sequence extracted by the TCN is then compressed and passed to the LSTM; finally, the LSTM output serves as the encoder's encoded output;
the decoder is a series structure of a long short-term memory network LSTM and a multilayer perceptron MLP: the LSTM is unrolled over time to produce the multi-step prediction, and a loss function balances the performance of the multi-step outputs; the decoder first receives the encoder output, the LSTM performs the decoding, and the decoded result is passed to the MLP, which matches the output dimensions.
Further, the codec in step S3 is characterized as follows:
(1) the encoder is a cascade of the TCN and the LSTM: the TCN receives the long-sequence input and produces an output sequence after multi-layer feature extraction, and the rear section of that sequence is intercepted as the LSTM input;
(2) the decoder is a cascade of the LSTM and the MLP: the LSTM receives the encoder state, and the MLP matches the dimensions of the LSTM output;
(3) each layer of the TCN consists of a TCN residual block, comprising a series module and a residual connection; the series module consists of two identical groups of a dilated causal convolution layer, a weight normalization layer, a ReLU activation unit and a dropout layer;
(4) the prediction performance of the multi-step outputs is balanced by designing the loss function:
Loss = Σ_{i=1}^{K} α_i · loss_i + β·‖W‖²,
where K denotes the number of prediction steps, loss_i the output loss of the i-th step, α_i the weight of loss_i, W the model parameters, and β the regularization coefficient; in this embodiment the outputs at different times are regarded as equally important, so the weight coefficients α_i all take the same value.
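As a hedged PyTorch sketch of such a weighted multi-step loss: the patent specifies per-step losses loss_i, weights α_i, model parameters W and a regularization coefficient β, but not the per-step criterion or the exact penalty, so MSE and a squared L2 penalty are assumptions here:

```python
import torch

def multi_step_loss(pred, target, alphas=None, beta=0.0, model=None):
    """Weighted sum of per-step losses plus a regularization term.

    pred, target : (batch, K) multi-step outputs and supervision
    alphas       : per-step weights alpha_i (equal by default, as in
                   this embodiment)
    beta, model  : regularization coefficient and the model whose
                   parameters W are penalized
    """
    K = pred.size(1)
    if alphas is None:
        alphas = torch.full((K,), 1.0 / K, device=pred.device)
    step_losses = ((pred - target) ** 2).mean(dim=0)   # assumed MSE per step
    loss = (alphas * step_losses).sum()
    if model is not None and beta > 0:
        loss = loss + beta * sum(p.pow(2).sum() for p in model.parameters())
    return loss
```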
Further, the structures of the TCN residual block and the LSTM module in step S3 are shown in fig. 3 and fig. 4, specifically:
(1) the dilated (hole) causal convolution operation is described by:
F(s) = (x *_d f)(s) = Σ_{i=0}^{k-1} f(i) · x_{s-d·i},
where *_d denotes the dilated convolution operation, d the dilation coefficient, x the input sequence, s a position in the sequence, f the convolution kernel, k the kernel size, s - d·i the element selected by the dilated convolution, and F(s) the output of the dilated causal convolution;
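For concreteness, a minimal PyTorch sketch of one dilated causal convolution layer; realizing causality by left-padding the input is a standard implementation choice, not spelled out in the patent:

```python
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    """Dilated causal convolution: the output at position s depends only on
    x[s], x[s-d], ..., x[s-d*(k-1)], matching F(s) = sum_i f(i) * x_{s-d*i}."""
    def __init__(self, in_ch, out_ch, k, d):
        super().__init__()
        self.pad = (k - 1) * d                      # pad the past side only
        self.conv = nn.Conv1d(in_ch, out_ch, k, dilation=d)

    def forward(self, x):                           # x: (batch, in_ch, seq_len)
        x = nn.functional.pad(x, (self.pad, 0))     # left padding keeps causality
        return self.conv(x)                         # (batch, out_ch, seq_len)
```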
(2) the TCN residual block:
O_1 = dropout(ReLU(Norm(F(s)))),
O_2 = dropout(ReLU(Norm(O_1))),
O_tcn = s + O_2,
where Norm denotes weight normalization, ReLU the activation function, and dropout the dropout layer; O_tcn denotes the output of the TCN residual block, i.e. the output of each TCN layer;
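A hedged sketch of this residual block; applying weight normalization to the convolution weights (rather than to the activations) and adding a 1x1 convolution on the residual path when channel counts differ are implementation assumptions in the spirit of standard TCNs:

```python
import torch
import torch.nn as nn
from torch.nn.utils import weight_norm

class TCNResidualBlock(nn.Module):
    """Two weight-normalized dilated causal convolutions, each followed by
    ReLU and dropout (O_1, O_2), plus the residual connection O_tcn = s + O_2."""
    def __init__(self, in_ch, out_ch, k, d, p_drop=0.2):
        super().__init__()
        self.pad = (k - 1) * d
        self.conv1 = weight_norm(nn.Conv1d(in_ch, out_ch, k, dilation=d))
        self.conv2 = weight_norm(nn.Conv1d(out_ch, out_ch, k, dilation=d))
        self.relu = nn.ReLU()
        self.drop = nn.Dropout(p_drop)
        # 1x1 conv so the residual can be added when channel counts differ
        self.downsample = nn.Conv1d(in_ch, out_ch, 1) if in_ch != out_ch else None

    def forward(self, s):                                   # s: (batch, in_ch, L)
        o1 = self.drop(self.relu(self.conv1(nn.functional.pad(s, (self.pad, 0)))))
        o2 = self.drop(self.relu(self.conv2(nn.functional.pad(o1, (self.pad, 0)))))
        res = s if self.downsample is None else self.downsample(s)
        return res + o2                                     # O_tcn = s + O_2
```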
(3) the LSTM comprises:
a forget gate, which discards unimportant information: during the forgetting stage, the output f_t of the forget gate is computed by the sigmoid activation function, and its value determines whether the information of the previous time step is discarded; the forget gate is computed as:
f_t = sigmoid(W_if·x_t + b_if + W_hf·h_{t-1} + b_hf),
an input gate and an activation function, which jointly perform memory selection: their element-wise product determines which values are stored into the current state; the input gate i_t is computed as:
i_t = sigmoid(W_ii·x_t + b_ii + W_hi·h_{t-1} + b_hi),
and the activation function g_t as:
g_t = tanh(W_ig·x_t + b_ig + W_hg·h_{t-1} + b_hg),
the memory-cell state c_t at the current time, determined jointly by the input gate and the forget gate: multiplying the forget gate element-wise with the previous state discards unneeded information, while multiplying the input gate element-wise with the activation output stores the important information:
c_t = f_t ⊙ c_{t-1} + i_t ⊙ g_t,
an output gate o_t:
o_t = sigmoid(W_io·x_t + b_io + W_ho·h_{t-1} + b_ho),
which determines how much of the memory-cell state is emitted at the current time, yielding the LSTM output state h_t:
h_t = o_t ⊙ tanh(c_t),
where W_if and b_if denote the weight and bias matrices of the forget gate's external input, W_hf and b_hf those of its hidden-state input, W_ii and b_ii those of the input gate's external input, W_hi and b_hi those of its hidden-state input, W_ig, b_ig, W_hg and b_hg those of the activation function, W_io and b_io those of the output gate's external input, and W_ho and b_ho those of its hidden-state input; f_t, i_t and o_t are respectively the outputs of the forget, input and output gates at time t, h_t is the hidden state at time t, and x_t is the external input.
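To make the gate equations concrete, a minimal sketch of one LSTM step written out gate by gate; the dict-based weight layout is purely illustrative (torch.nn.LSTM fuses these matrices internally):

```python
import torch

def lstm_cell_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step, mirroring the equations above.
    W and b are dicts keyed 'if','hf','ii','hi','ig','hg','io','ho'."""
    f_t = torch.sigmoid(x_t @ W['if'].T + b['if'] + h_prev @ W['hf'].T + b['hf'])
    i_t = torch.sigmoid(x_t @ W['ii'].T + b['ii'] + h_prev @ W['hi'].T + b['hi'])
    g_t = torch.tanh(x_t @ W['ig'].T + b['ig'] + h_prev @ W['hg'].T + b['hg'])
    o_t = torch.sigmoid(x_t @ W['io'].T + b['io'] + h_prev @ W['ho'].T + b['ho'])
    c_t = f_t * c_prev + i_t * g_t        # element-wise memory-cell update
    h_t = o_t * torch.tanh(c_t)           # element-wise output state
    return h_t, c_t
```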
S4, forecasting: the historical data are input into the codec model trained in S3 to forecast the future multi-step solar irradiance.
The codec model built in this embodiment selects the following parameters:
four TCN layers and two LSTM layers are selected as the encoder, with the TCN input length twice the LSTM input length; that is, the second half of the TCN layer's output sequence is used as the input of the LSTM layer;
a single-layer LSTM and a two-layer MLP are selected as the decoder, where the number of LSTM-MLP unrolling steps matches the prediction horizon.
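Putting the pieces together, a hedged end-to-end sketch of this configuration, reusing the TCNResidualBlock class from the sketch above; the hidden width, the dilation schedule, and feeding the encoder's last output to every decoder step are assumptions where the patent is silent:

```python
import torch
import torch.nn as nn

class IrradianceCodec(nn.Module):
    """Four TCN layers + two-layer LSTM encoder; single-layer LSTM +
    two-layer MLP decoder unrolled for K prediction steps."""
    def __init__(self, n_feat=5, hid=64, K=6, k=3):
        super().__init__()
        self.tcn = nn.Sequential(*[
            TCNResidualBlock(n_feat if i == 0 else hid, hid, k, d=2 ** i)
            for i in range(4)                        # four TCN layers
        ])
        self.enc_lstm = nn.LSTM(hid, hid, num_layers=2, batch_first=True)
        self.dec_lstm = nn.LSTM(hid, hid, num_layers=1, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(hid, hid), nn.ReLU(),
                                 nn.Linear(hid, 1))  # two-layer MLP head
        self.K = K

    def forward(self, x):                            # x: (batch, seq_len, n_feat)
        h = self.tcn(x.transpose(1, 2))              # (batch, hid, seq_len)
        h = h[..., h.size(-1) // 2:]                 # rear half as short sequence
        enc_out, _ = self.enc_lstm(h.transpose(1, 2))
        ctx = enc_out[:, -1:, :]                     # encoder state for the decoder
        state, preds = None, []
        for _ in range(self.K):                      # one decoder step per horizon
            out, state = self.dec_lstm(ctx, state)
            preds.append(self.mlp(out[:, -1, :]))    # (batch, 1)
        return torch.cat(preds, dim=1)               # (batch, K)

# usage: 24 past steps of [irradiance, T, W, P, H] -> 6 future irradiance values
# model = IrradianceCodec(); y_hat = model(torch.randn(8, 24, 5))  # (8, 6)
```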
To verify the prediction performance of the method, irradiance data from a U.S. measurement and instrumentation laboratory are selected, with Oak Ridge chosen as the site of the data set. The training set spans 4 years and the test set spans 1 year. The method is compared with the conventional LSTM and TCN methods; taking one-step-ahead prediction as an example, the prediction errors are listed in Table 1 and shown in FIG. 5.
TABLE 1. Comparison of the conventional LSTM and TCN methods with the proposed method

Method      RMSE     MAE      nRMSE
LSTM        55.47    25.78    36.23%
TCN         55.16    28.03    36.03%
Proposed    54.32    23.68    35.46%
Under every evaluation metric, the proposed method yields smaller prediction errors than the conventional LSTM and TCN methods, demonstrating higher prediction accuracy.

Claims (6)

1. A codec-based deep learning multi-step irradiance prediction method, characterized by comprising the following steps:
s1, acquiring training data, acquiring historical irradiance data of a target area and meteorological data corresponding to the historical irradiance data, and manufacturing a supervision data set according to a prediction task;
s2, preprocessing data, including meteorological information characteristic coding and data normalization;
s3, training the code decoder model, using the reading current time interval t0~tNIrradiance as supervisory information, and t0Historical irradiance and meteorological information before the moment are used as input data, and a codec model is trained;
and S4, forecasting, inputting historical data into the codec model obtained by training in the step S3, and forecasting future multi-step solar irradiance.
2. The prediction method according to claim 1, wherein the step S1 includes the following contents:
(1) obtaining historical irradiance data of a target area and the corresponding meteorological data, including but not limited to temperature, humidity, air pressure and wind speed;
(2) if a segment of the historical data is missing or illegal, it is replaced with the average of the adjacent data before and after it, ensuring the continuity and authenticity of the data and thus the quality of the training data;
(3) matching the supervisory information to construct the supervised data set: the irradiance of the current period t_0~t_N is read as the supervisory information and matched with the historical irradiance and the corresponding meteorological information as the input.
3. The prediction method according to claim 1, wherein the step S2 specifically includes the following contents:
(1) encoding the meteorological information corresponding to the irradiance: the weather type is one-hot encoded, and numerical information uses its value directly as the encoded value;
(2) to ensure that the model gradient changes reasonably during training, the input data are normalized as:
a' = (a - a_min) / (a_max - a_min),
where a denotes a feature in the data set, a' the normalized feature value, and a_max and a_min the maximum and minimum of that feature in the historical data.
4. The prediction method according to claim 1, wherein the codec of step S3 comprises an encoder and a decoder, wherein:
the encoder is a series structure of a temporal convolutional network TCN and a long short-term memory network LSTM: the TCN takes in the long-sequence input while preserving the timing dependence, and the compressed short sequence maintains that dependence through the LSTM; specifically, the encoder receives the long-sequence input through the TCN, whose number of layers depends on the input length, the feature sequence extracted by the TCN is then compressed and passed to the LSTM, and finally the LSTM output serves as the encoder's encoded output;
the decoder is a series structure of a long short-term memory network LSTM and a multilayer perceptron MLP: the LSTM is unrolled over time to produce the multi-step prediction, and a loss function balances the performance of the multi-step outputs; the decoder first receives the encoder output, the LSTM performs the decoding, and the decoded result is passed to the MLP, which matches the output dimensions.
5. The prediction method of claim 4, wherein the codec is characterized by:
(1) the encoder is a cascade of the TCN and the LSTM: the TCN receives the long-sequence input and produces an output sequence after multi-layer feature extraction, and the rear section of that sequence is intercepted as the LSTM input;
(2) the decoder is a cascade of the LSTM and the MLP: the LSTM receives the encoder state, and the MLP matches the dimensions of the LSTM output;
(3) each layer of the TCN consists of a TCN residual block, comprising a series module and a residual connection; the series module consists of two identical groups of a dilated causal convolution layer, a weight normalization layer, a ReLU activation unit and a dropout layer;
(4) the prediction performance of the multi-step outputs is balanced by designing the loss function:
Loss = Σ_{i=1}^{K} α_i · loss_i + β·‖W‖²,
where K denotes the number of prediction steps, loss_i the output loss of the i-th step, α_i the weight of loss_i, W the model parameters, and β the regularization coefficient.
6. The prediction method of claim 4, wherein the TCN and LSTM operations of the codec comprise:
(1) the dilated (hole) causal convolution operation, described by:
F(s) = (x *_d f)(s) = Σ_{i=0}^{k-1} f(i) · x_{s-d·i},
where *_d denotes the dilated convolution operation, d the dilation coefficient, x the input sequence, s a position in the sequence, f the convolution kernel, k the kernel size, s - d·i the element selected by the dilated convolution, and F(s) the output of the dilated causal convolution;
(2) the TCN residual block:
O_1 = dropout(ReLU(Norm(F(s)))),
O_2 = dropout(ReLU(Norm(O_1))),
O_tcn = s + O_2,
where Norm denotes weight normalization, ReLU the activation function, and dropout the dropout layer; O_tcn denotes the output of the TCN residual block, i.e. the output of each TCN layer;
(3) the LSTM, comprising:
the forget gate f_t:
f_t = sigmoid(W_if·x_t + b_if + W_hf·h_{t-1} + b_hf),
the input gate i_t:
i_t = sigmoid(W_ii·x_t + b_ii + W_hi·h_{t-1} + b_hi),
the activation function g_t:
g_t = tanh(W_ig·x_t + b_ig + W_hg·h_{t-1} + b_hg),
the output gate o_t:
o_t = sigmoid(W_io·x_t + b_io + W_ho·h_{t-1} + b_ho),
the memory-cell state c_t at the current time:
c_t = f_t ⊙ c_{t-1} + i_t ⊙ g_t,
and the LSTM output state h_t:
h_t = o_t ⊙ tanh(c_t),
where W_if and b_if denote the weight and bias matrices of the forget gate's external input, W_hf and b_hf those of its hidden-state input, W_ii and b_ii those of the input gate's external input, W_hi and b_hi those of its hidden-state input, W_ig, b_ig, W_hg and b_hg those of the activation function, W_io and b_io those of the output gate's external input, and W_ho and b_ho those of its hidden-state input; f_t, i_t and o_t are respectively the outputs of the forget, input and output gates at time t, h_t is the hidden state at time t, and x_t is the external input.
CN202210492332.0A 2022-05-07 2022-05-07 Deep learning multi-step irradiance prediction method based on codec Pending CN114781744A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210492332.0A CN114781744A (en) 2022-05-07 2022-05-07 Deep learning multi-step long radiance prediction method based on codec

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210492332.0A CN114781744A (en) 2022-05-07 2022-05-07 Deep learning multi-step long radiance prediction method based on codec

Publications (1)

Publication Number Publication Date
CN114781744A true CN114781744A (en) 2022-07-22

Family

ID=82434237

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210492332.0A Pending CN114781744A (en) 2022-05-07 2022-05-07 Deep learning multi-step long radiance prediction method based on codec

Country Status (1)

Country Link
CN (1) CN114781744A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115618714A (en) * 2022-09-09 2023-01-17 北京思源知行科技发展有限公司 Solar radiation probability forecasting method and related equipment
CN115951430A (en) * 2023-03-15 2023-04-11 数字太空(北京)智能技术研究院有限公司 Geomagnetic index forecasting method based on LSTM
CN116991431A (en) * 2023-08-04 2023-11-03 沐曦集成电路(杭州)有限公司 GPU-based coding and decoding model static deployment method, electronic equipment and medium
CN116991431B (en) * 2023-08-04 2024-03-01 沐曦集成电路(杭州)有限公司 GPU-based coding and decoding model static deployment method, electronic equipment and medium
CN117272002A (en) * 2023-11-23 2023-12-22 中国电建集团西北勘测设计研究院有限公司 Solar radiation amount estimation method and device, electronic equipment and storage medium
CN117272002B (en) * 2023-11-23 2024-02-20 中国电建集团西北勘测设计研究院有限公司 Solar radiation amount estimation method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN114781744A (en) Deep learning multi-step irradiance prediction method based on codec
CN111899510A (en) Intelligent traffic system flow short-term prediction method and system based on divergent convolution and GAT
CN111860982A (en) Wind power plant short-term wind power prediction method based on VMD-FCM-GRU
CN111815033A (en) Offshore wind power prediction method based on RCNN and meteorological time sequence characteristics
CN110222901A (en) A kind of electric load prediction technique of the Bi-LSTM based on deep learning
CN112733444A (en) Multistep long time sequence prediction method based on CycleGAN neural network
CN111861013B (en) Power load prediction method and device
CN111027772A (en) Multi-factor short-term load prediction method based on PCA-DBILSTM
CN111191856A (en) Regional comprehensive energy system multi-energy load prediction method considering time sequence dynamic characteristics and coupling characteristics
CN115409258A (en) Hybrid deep learning short-term irradiance prediction method
CN114462718A (en) CNN-GRU wind power prediction method based on time sliding window
CN114611792A (en) Atmospheric ozone concentration prediction method based on mixed CNN-Transformer model
CN113128113A (en) Poor information building load prediction method based on deep learning and transfer learning
Li et al. Deep spatio-temporal wind power forecasting
CN114218872A (en) Method for predicting remaining service life based on DBN-LSTM semi-supervised joint model
CN113705915A (en) CNN-LSTM-ARIMA-based combined short-term power load prediction method
CN112508286A (en) Short-term load prediction method based on Kmeans-BilSTM-DMD model
CN110458341B (en) Ultra-short-term wind power prediction method and system considering meteorological characteristics
CN113449919B (en) Power consumption prediction method and system based on feature and trend perception
CN113642255A (en) Photovoltaic power generation power prediction method based on multi-scale convolution cyclic neural network
CN117498296A (en) Multi-wind power plant power prediction method based on attention space-time synchronization diagram convolution network
CN115481788B (en) Phase change energy storage system load prediction method and system
CN115759343A (en) E-LSTM-based user electric quantity prediction method and device
CN112598186B (en) Improved LSTM-MLP-based small generator fault prediction method
CN115375002A (en) Short-term power load prediction method, system, storage medium and computing device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination