Disclosure of Invention
In order to solve the problem that long-term information is easily lost when a neural network processes long-sequence information, the invention provides a Long- and Short-term Time-series Network (LSTNet) model. Thermal load is a typical time-series problem, and the hourly thermal load is periodic. The long- and short-term time-series network model provided by the invention exploits this characteristic and introduces the idea of recurrent skipping, thereby effectively solving the problem of information loss. First, the model uses a Random Forest (RF) algorithm to screen the features and reduce their dimensionality; then a thermal load prediction model based on the long- and short-term time-series network is established: short- and long-term feature information is captured by the convolutional layer and the recurrent layer, the concept of a recurrent-skip layer is introduced to capture even longer-term feature information, and an autoregressive component adds linear processing capability to the model, enhancing its robustness.
The invention adopts the following technical scheme and implementation steps:
S1, selecting meteorological data and heating data over a certain time period and constructing a data set as the input variables $X_n$;
S2, preprocessing the data, including identifying and correcting missing values and outliers, and standardizing the data;
S3, screening the input variables with the RF method and performing dimensionality reduction on the data set to obtain $X_m$; the data set is then divided into a training set and a test set at a ratio of 8:2;
S4, inputting the training set into the LSTNet model item by item and training the weights and biases of the model:
S401, first capturing short-term local feature information with the convolutional layer;
S402, utilizing the recurrent layer to capture long-term macroscopic information, with output $h_t^R$; meanwhile, the recurrent-skip layer exploits the periodic characteristics of the sequence to capture longer-term information, with output $h_t^S$;
S403, connecting the outputs of the recurrent layer and the recurrent-skip layer through a fully connected layer to obtain the output $y_t^D$;
S404, then adding a linear component for prediction by combining the output of the AR process; this also allows the model to capture scale changes of the input and enhances its robustness, giving the output $y_t^A$;
S405, the output module integrating the output of the neural network part and the output of the AR model to obtain the final prediction model;
S5, inputting the test set into the trained LSTNet model one by one to obtain the predicted values.
Advantageous effects
Compared with the prior art, the method fully considers the periodic characteristic of the ultra-short-term thermal load and compensates for the information loss that conventional neural networks suffer from vanishing gradients by introducing the concept of the recurrent-skip layer. Unlike traditional neural network algorithms, the method fully exploits the periodic characteristic of the hourly heat load; this characteristic is more representative, so the ultra-short-term heat load prediction task can be completed better.
Detailed Description
The technical features and advantages of the present invention will become more apparent from the following detailed description of the embodiments of the present invention when taken in conjunction with the accompanying drawings.
S1, selecting as many related characteristic variables as possible, which may include meteorological data, operating-condition data, heat load data and the like, to construct a heat load data set $X_n = \{x_1, x_2, \dots, x_n\}$, where n is the number of characteristic variables;
S2, after the data set is constructed, preprocessing the data:
S201, compensating for missing values, i.e. data points whose value is 0 or null, using the following formula:
$x_i = 0.4x_{i-1} + 0.4x_{i+1} + 0.2x_{i+2}$ (1)

where $x_i$ is the current missing value and $x_{i-1}$, $x_{i+1}$ and $x_{i+2}$ are the values at the previous moment, the next moment and the moment after next, respectively;
S202, treating outliers, i.e. values exceeding three or more times the predetermined range, as missing values, as in the sketch below;
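A minimal Python sketch of S201–S202, assuming a one-dimensional series and a hypothetical `limit` parameter standing in for the predetermined range; consecutive gaps would need extra handling:

```python
import numpy as np

def fill_missing(x, i):
    # Formula (1): weighted blend of the previous, next and next-next values.
    return 0.4 * x[i - 1] + 0.4 * x[i + 1] + 0.2 * x[i + 2]

def compensate(series, limit):
    # Zeros, NaNs and values beyond 3x the predetermined range are all
    # treated as missing (S202) and then compensated per S201.
    x = np.asarray(series, dtype=float).copy()
    bad = np.isnan(x) | (x == 0) | (np.abs(x) > 3 * limit)
    for i in np.where(bad)[0]:
        if 1 <= i < len(x) - 2:  # formula (1) needs three valid neighbours
            x[i] = fill_missing(x, i)
    return x
```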
S203, standardizing each dimension of the input variables with the following formula:

$y_i = \frac{x_i - \bar{x}}{s}$ (2)

where $y_i$ is the normalized value, $x_i$ is the original value, and $\bar{x}$ and $s$ are the mean and standard deviation of the raw data, respectively. The normalized data have a mean of 0, a variance of 1 and no dimension.
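The standardization of S203, sketched in NumPy; the per-feature statistics are returned so that predictions can later be mapped back to physical units:

```python
import numpy as np

def standardize(X):
    # Formula (2): column-wise z-score, zero mean and unit variance.
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    return (X - mean) / std, mean, std
```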
S3, screening the characteristic variables and reducing their dimensionality with the RF algorithm. The idea of evaluating feature importance with a random forest is simple: determine how much each feature contributes to each tree in the forest, average these contributions, and compare the contributions of different features. The importance of a feature x is denoted IMP and is calculated as follows:
S301, for each decision tree in the random forest, calculating its error on the corresponding out-of-bag (OOB) data, denoted errOOB1: the O out-of-bag samples are fed into the tree as input, the classifier's results are compared with the true labels, and the number of misclassified samples is counted as X, giving

$\mathrm{errOOB1} = X / O$ (3)
S302, randomly adding noise interference to feature x in all out-of-bag samples and computing the out-of-bag error again, denoted errOOB2;
S303, assuming there are N trees in the random forest, the importance IMP of feature x is given by formula (4):

$\mathrm{IMP} = \frac{1}{N}\sum_{i=1}^{N}\left(\mathrm{errOOB2}_i - \mathrm{errOOB1}_i\right)$ (4)

If the out-of-bag accuracy drops sharply after noise is randomly added to a feature, that feature has a large influence on the classification result of the samples, i.e. its importance is high.
The invention sorts the characteristic variables by importance in descending order using the random forest, determines a deletion ratio, and removes that proportion of unimportant indicators from the current characteristic variables, obtaining a new feature set $X_m = \{x_1, x_2, \dots, x_m\}$ with m < n; the deletion ratio is determined by the number of characteristic variables in the original data set. After dimensionality reduction, the data set is divided into a training set and a test set at a ratio of 8:2, as sketched below.
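A sketch of S3 in scikit-learn. Since the library does not expose per-tree OOB samples directly, `permutation_importance` is used here as a stand-in for the OOB computation of formulas (3)–(4); the placeholder data, `drop_ratio` and sizes are assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Placeholder arrays standing in for the preprocessed heat load data set.
rng = np.random.default_rng(0)
X, y = rng.random((500, 12)), rng.random(500)

rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Permuting one feature at a time and measuring the score drop mirrors
# the noise-injection idea of S301-S303 (here over the full sample).
imp = permutation_importance(rf, X, y, n_repeats=10, random_state=0)
order = np.argsort(imp.importances_mean)[::-1]   # descending importance

drop_ratio = 0.25                                # assumed deletion ratio
keep = order[: int(len(order) * (1 - drop_ratio))]
X_m = X[:, keep]                                 # reduced feature set X_m

# 8:2 split; shuffle=False preserves the time order of the series.
X_train, X_test, y_train, y_test = train_test_split(
    X_m, y, test_size=0.2, shuffle=False)
```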
S4, inputting the training set data into the LSTNet model item by item in time order and training the weights and biases of the model; the overall structure of the LSTNet model is shown in FIG. 1:
S401, the first module of the network is the convolutional layer, whose function is to extract features and capture local short-term feature information. The convolutional layer module consists of a number of filters of width ω and height m, where m equals the number of features. The output of the i-th filter is:
$h_i = \mathrm{ReLU}(W_i * X + b_i)$ (5)

where the output $h_i$ is a vector, ReLU is the activation function with $\mathrm{ReLU}(x) = \max(0, x)$, $*$ denotes the convolution operation, and $W_i$ and $b_i$ are the weight matrix and bias, respectively.
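A PyTorch sketch of the convolutional module with assumed sizes; each filter spans all m features and slides along the time axis:

```python
import torch
import torch.nn as nn

m, T, n_filters, omega = 12, 168, 32, 6   # assumed features, window, filters, width
conv = nn.Conv2d(1, n_filters, kernel_size=(omega, m))

x = torch.randn(8, 1, T, m)               # (batch, channel, time, features)
h = torch.relu(conv(x)).squeeze(3)        # formula (5): (batch, filters, T-omega+1)
```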
S402, the output of the convolutional layer module is fed simultaneously into the second module: the recurrent layer and the recurrent-skip layer. The recurrent layer uses the Gated Recurrent Unit (GRU), with ReLU as the activation function of the hidden-state update. The hidden-state output $h_t^R$ of the unit at time t is:

$r_t = \sigma(W_r x_t + U_r h_{t-1} + b_r)$

$z_t = \sigma(W_z x_t + U_z h_{t-1} + b_z)$

$c_t = \mathrm{ReLU}(W_c x_t + U_c (r_t \odot h_{t-1}) + b_c)$

$h_t^R = (1 - z_t) \odot h_{t-1} + z_t \odot c_t$ (6)

where $z_t$ and $r_t$ are the outputs of the update gate and the reset gate of the GRU neuron, respectively, $c_t$ is the intermediate-state output, σ is the sigmoid activation function, $x_t$ is the input of this layer at time t, $\odot$ denotes the element-wise product, and W, U and b are the weight matrices and biases of the respective gate units. The output of this layer is the hidden state at each time step.
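A sketch of the recurrent layer in PyTorch. Note that `nn.GRU` uses tanh in its internal update, so this approximates the ReLU-updated GRU of formula (6); a faithful version would require a custom cell:

```python
import torch
import torch.nn as nn

n_filters, hidden = 32, 64
gru = nn.GRU(input_size=n_filters, hidden_size=hidden, batch_first=True)

c = torch.randn(8, 163, n_filters)   # convolutional output, time along dim 1
out, _ = gru(c)                      # hidden state at every time step
h_R = out[:, -1, :]                  # h_t^R: (batch, hidden)
```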
The GRU network can capture long-term historical information, but because of vanishing gradients it cannot retain all earlier information, so correlations with longer-term information are lost. The LSTNet model solves this problem with a skipping idea: since the data are periodic, information from much earlier times is reached through the period hyperparameter p. When predicting time t, the data from one period earlier, two periods earlier and even earlier periods carry useful information. Because one period is long, such dependencies are difficult for an ordinary recurrent unit to capture; introducing a recurrent network structure with skip connections extends the time span of the information flow and obtains longer-term data information. Its output $h_t^S$ at time t is:

$h_t^S = \mathrm{GRU}(x_t, h_{t-p}^S)$ (7)

i.e. the same update as formula (6) with the hidden state $h_{t-1}$ replaced by the hidden state $h_{t-p}$ from one period earlier.
the input to this layer is the same as the recycle layer and is the output of the convolutional layer. Where p is the number of skipped hidden units, i.e. the period. The general period is easily determined, and according to engineering experience or data trend, if the data is non-periodic or the periodicity is dynamically changed, attention mechanism method can be cited to dynamically update the period p.
S403, connecting the two layers: the model combines the outputs of the two layers through a fully connected layer. The output of this layer at time t is:

$y_t^D = W^R h_t^R + \sum_{i=0}^{p-1} W_i^S h_{t-i}^S + b$ (8)

where $W^R$ and $W^S$ are the weights assigned to the recurrent layer and the recurrent-skip layer, respectively, and b is the bias.
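The combination of formula (8) reduces to a single linear layer over the concatenated states; a sketch using the shapes from the previous snippets:

```python
import torch
import torch.nn as nn

hidden, p = 64, 24
combine = nn.Linear(hidden + p * hidden, 1)   # W^R, W^S and b in one layer

h_R = torch.randn(8, hidden)                  # recurrent-layer state
h_S = torch.randn(8, p * hidden)              # recurrent-skip states
y_D = combine(torch.cat([h_R, h_S], dim=1))   # formula (8): (batch, 1)
```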
S404, in real data sets the input scale changes aperiodically, and because the neural network is insensitive to scale changes of the inputs and outputs, this significantly reduces the prediction accuracy of the neural model. To remedy this deficiency, a linear part is added to the model: a classical autoregressive (AR) model is adopted to enhance robustness. The output $y_t^A$ of the AR model at time t is:

$y_t^A = \sum_{k=0}^{q^A-1} W_k^A y_{t-k} + b^A$ (9)

where $q^A$ is the size of the input window over the input matrix.
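Likewise, the AR component of formula (9) is a linear map over the last $q^A$ values of the target series; $q^A$ here is an assumed window size:

```python
import torch
import torch.nn as nn

q_A = 24                          # assumed AR window size
ar = nn.Linear(q_A, 1)            # weights W^A and bias b^A

y_hist = torch.randn(8, q_A)      # most recent q_A load values
y_A = ar(y_hist)                  # formula (9): (batch, 1)
```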
S405, the output module integrates the output of the neural network part and the output of the AR model; the final output of the LSTNet model is:

$\hat{y}_t = y_t^D + y_t^A$ (10)

where $\hat{y}_t$ is the final predicted value of the model at time t.
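Putting the modules together, a compact sketch of the whole forward pass following S401–S405. The hyperparameter values are illustrative, the input window is assumed to cover at least one full period, and the GRU cells use PyTorch's default tanh update rather than ReLU:

```python
import torch
import torch.nn as nn

class LSTNet(nn.Module):
    """Sketch: conv -> GRU / skip-GRU -> linear combination + AR shortcut."""
    def __init__(self, m, omega=6, n_filters=32, hidden=64, p=24, q_A=24):
        super().__init__()
        self.p = p
        self.conv = nn.Conv2d(1, n_filters, kernel_size=(omega, m))
        self.gru = nn.GRU(n_filters, hidden, batch_first=True)
        self.skip_gru = nn.GRU(n_filters, hidden, batch_first=True)
        self.combine = nn.Linear(hidden + p * hidden, 1)
        self.ar = nn.Linear(q_A, 1)

    def forward(self, x, y_hist):
        # x: (batch, T, m) input window; y_hist: (batch, q_A) recent loads.
        b = x.size(0)
        c = torch.relu(self.conv(x.unsqueeze(1))).squeeze(3)  # (b, F, T')
        c = c.permute(0, 2, 1)                                # (b, T', F)
        h_R = self.gru(c)[0][:, -1, :]                        # recurrent state
        k = c.size(1) // self.p                               # whole periods
        s = c[:, -k * self.p:, :].reshape(b, k, self.p, -1)
        s = s.permute(0, 2, 1, 3).reshape(b * self.p, k, -1)
        h_S = self.skip_gru(s)[1].view(b, -1)                 # skip states
        y_D = self.combine(torch.cat([h_R, h_S], dim=1))      # formula (8)
        return y_D + self.ar(y_hist)                          # formula (10)
```

For example, `LSTNet(m=12)(torch.randn(8, 168, 12), torch.randn(8, 24))` returns an (8, 1) tensor of predictions.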
S406, during model training the mean square error (MSE) is used as the loss function:

$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i - y_i)^2$ (11)

where n is the number of valid data points, and $\hat{y}_i$ and $y_i$ are the predicted and actual test values, respectively.
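A minimal training-loop sketch with the MSE loss of formula (11), assuming the `LSTNet` sketch above and a hypothetical `train_loader` yielding (x, y_hist, y) batches:

```python
import torch

model = LSTNet(m=12)                      # LSTNet sketch from above
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = torch.nn.MSELoss()              # formula (11)

for epoch in range(50):
    for x, y_hist, y in train_loader:     # hypothetical DataLoader
        opt.zero_grad()
        loss = loss_fn(model(x, y_hist), y)
        loss.backward()
        opt.step()
```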
S5, inputting the test set into the trained LSTNet model one by one to obtain the predicted values.
To verify the effectiveness of the method, normal data from one heating season are used. The data were obtained by simulating a 120-day heating process of a heat-exchange station in a residential district of Zhengzhou, Henan, with the EnergyPlus software; the data are plotted in FIG. 2. Comparison experiments were performed with the AR, Autoregressive Integrated Moving Average (ARIMA), MLR, SVR and GRU methods; the experimental results are shown in FIG. 3, and the evaluation index results of each model are listed in Table 1.
TABLE 1 Comparison of evaluation indexes of the thermal load prediction models

Model  | RMSE (×10³) | MAE (×10³) | R-Squared
-------|-------------|------------|----------
AR     | 40.815      | 27.213     | 76.724%
ARIMA  | 33.892      | 19.028     | 83.951%
MLR    | 31.631      | 20.857     | 86.020%
SVR    | 29.220      | 18.662     | 88.070%
GRU    | 24.994      | 17.249     | 91.268%
LSTNet | 15.833      | 12.341     | 96.501%
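For reference, the three evaluation indexes of Table 1 can be computed as follows; `y_true` and `y_pred` stand for the test-set targets and model predictions:

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

def evaluate(y_true, y_pred):
    rmse = np.sqrt(mean_squared_error(y_true, y_pred))
    mae = mean_absolute_error(y_true, y_pred)
    r2 = r2_score(y_true, y_pred)         # R-Squared, closer to 1 is better
    return rmse, mae, r2
```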
The experimental results show that for hourly thermal load prediction the LSTNet model used here outperforms the other models, with an R-squared index closest to 1. Compared with the GRU model, the RMSE of the LSTNet model is reduced by 36.7% and the MAE by 28.5%, a marked improvement in model accuracy.