CN110659775A - LSTM-based improved electric power short-time load prediction algorithm
- Publication number
- CN110659775A (Application CN201910899903.0A)
- Authority
- CN
- China
- Prior art keywords
- lstm
- load prediction
- network
- prediction algorithm
- power short
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
- G06Q50/06—Energy or water supply
Abstract
The invention discloses an LSTM-based improved electric power short-time load prediction algorithm, which comprises the following steps: S1: preprocessing all the power data; S2: encoding the non-numeric weather feature with the one-hot method, so that a more regular input tensor is generated; S3: setting the input layer dimension of the network to n × f. The algorithm solves the problems that traditional load prediction methods are difficult to adapt to the new load characteristics, that application scenarios are limited, that learning capability is weak and heavily influenced by human factors, and that network training efficiency is low.
Description
Technical Field
The invention relates to the field of electric power short-time load prediction algorithms, in particular to an improved LSTM-based electric power short-time load prediction algorithm.
Background
An electric power system is an indispensable link in national construction and daily life, and the efficient use of electric energy is increasingly important. On one hand, power load prediction makes it easy to anticipate the load placed on the system, so that electrical equipment can be used reasonably and the economic, safe and reliable operation of the power system is guaranteed. On the other hand, power generation and power consumption enterprises place ever higher demands on load prediction accuracy: accurate load prediction helps each power generation enterprise arrange its scheduling plan reasonably and carry out thorough maintenance and inspection of power equipment while providing the public with high-quality power at all times, and it also helps power consumption enterprises take active measures to improve the quality and economy of their power use. Short-term power load prediction generally refers to prediction within one year, on yearly, monthly, daily and hourly scales; it is mainly used to guide the daily operation of the power sector and has far-reaching significance for power utilization planning.
Conventional power load prediction methods include exponential smoothing, time series analysis, trend extrapolation, regression analysis, and the like. In recent years, a large number of new energy industries and large-scale intermittent new energy power generation systems have come into wide use, giving the load high randomness and dynamic variability that traditional load prediction methods find difficult to accommodate.
Modern power load prediction methods include the grey prediction method, the fuzzy load prediction method, the neural network method and the like. The grey prediction method accumulates (or successively differences) the collected power data, solves a differential equation through a GM(1,1) model and gradually corrects the model parameters; the more data, the better the effect, so it handles medium- and long-term power prediction well. However, the grey model itself grows exponentially, and many different factors affect the power load in any given period, so the application scenarios of the grey prediction method are limited. The fuzzy prediction method is based on fuzzy theory: it expresses existing work experience, historical data, or a combination of the two in the form of rules and converts them into an algorithm that can run on a computer, thereby completing various tasks; however, its learning ability is weak and it is strongly influenced by human factors. The neural network method also has many models, the most classical being the BP neural network, but the traditional BP algorithm converges slowly, so network training efficiency is low. Many researchers have therefore made improvements on this basis.
With the improvement of the smart grid and of hardware equipment, massive high-quality data and strong computing power are now available for load prediction, providing a solid foundation for applying deep learning and machine learning to power loads. Neural network load prediction models that take day-characteristic factors into account further reduce the negative influence of uncertain factors such as weather on the model. This also shows that time series models that consider the influence of time factors on the overall data distribution are a powerful tool for power prediction problems.
The long short-term memory neural network (LSTM) contains a temporal memory unit, which makes it particularly suitable for processing and predicting events with time-series intervals and delays, and therefore very suitable for predicting power loads. On this basis, the invention proposes an LSTM power prediction model.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides an LSTM-based improved electric power short-time load prediction algorithm, which addresses the problems that traditional load prediction methods are difficult to adapt to the new load characteristics, that application scenarios are limited, that learning capability is weak and heavily influenced by human factors, and that network training efficiency is low.
The invention adopts the following technical scheme: an LSTM-based improved electric power short-time load prediction algorithm comprising the following steps:
S1: preprocessing all the power data;
S2: encoding the non-numeric weather feature with the one-hot method, so that a more regular input tensor is generated;
S3: setting the input layer dimension of the network;
S4: using an LSTM network as the hidden layer, which is the calculation core of the model, with the output of each layer serving as the input of the next layer;
S5: after the forward propagation of the network, integrating the calculation results of all the hidden layers with a fully connected layer and mapping them through a ReLU linear rectification function to obtain the final predicted value;
S6: adopting the error back-propagation rule to realize automatic derivation of the parameter gradients, so that the network parameters are updated automatically;
S7: adopting the mini-batch Adam optimization algorithm to realize iterative optimization training of the hyper-parameters of the model;
S8: evaluating the regression prediction model with the mean absolute percentage error and the root mean square error.
Preferably, the calculation formula of the electric power short-time load prediction algorithm based on the LSTM is as follows:
yi=π(f8(f7(…f1(xi))))
wherein xi and yi are the ith power data record and its corresponding predicted output, fi is the ith layer of the neural network in the model, and π denotes the fully connected layer.
Preferably, the update formula for θt in the Adam optimization algorithm of S7 is:
θt = θt-1 - α·m̂t/(√v̂t + ε)
where θt-1 is the parameter to be updated; m̂t and v̂t are the bias-corrected estimates obtained from mt (the biased first-order moment estimate, m0 = 0) and vt (the biased second-order moment estimate, v0 = 0); β1 and β2 are the exponential decay rates of the moment estimates; ε is a small positive number. The default values of these parameters in machine learning problems are α = 0.01, β1 = 0.9, β2 = 0.999, ε = 10-8.
Preferably, the formula for gt in the Adam optimization algorithm of S7 is:
gt = ∇θ ft(θt-1)
where θt-1 is the parameter to be updated, α is the learning rate, and gt is the gradient of the stochastic objective function ft with respect to the parameters.
Preferably, the formula for mt in the Adam optimization algorithm of S7 is:
mt = β1·mt-1 + (1 - β1)·gt
where θt-1 is the parameter to be updated; α is the learning rate; gt is the gradient of the stochastic objective function; mt is the biased first-order moment estimate, with m0 = 0; vt is the biased second-order moment estimate, with v0 = 0; β1 and β2 are the exponential decay rates of the moment estimates; ε is a small positive number. The default values of these parameters in machine learning problems are α = 0.01, β1 = 0.9, β2 = 0.999, ε = 10-8.
Preferably, the formula for vt in the Adam optimization algorithm of S7 is:
vt = β2·vt-1 + (1 - β2)·gt²
where θt-1 is the parameter to be updated; α is the learning rate; gt is the gradient of the stochastic objective function; mt is the biased first-order moment estimate, with m0 = 0; vt is the biased second-order moment estimate, with v0 = 0; β1 and β2 are the exponential decay rates of the moment estimates; ε is a small positive number. The default values of these parameters in machine learning problems are α = 0.01, β1 = 0.9, β2 = 0.999, ε = 10-8.
Preferably, the mean absolute percentage error of S8 is calculated as:
MAPE = (1/n)·Σi |(yi - ŷi)/yi| × 100%
where yi and ŷi are the actual and predicted loads and n is the number of samples.
preferably, the root mean square error of S8 is calculated as:
Preferably, the input layer dimension of the network of S3 is n × f, where n is the number of neural units in each layer of the network, corresponding to n pieces of training data, and f is the number of features included in one power data record.
Preferably, the LSTM network of S4 is an LSTM network containing 8 neural units.
The LSTM-based improved electric power short-time load prediction algorithm has the following beneficial effects:
1. Compared with the basic stochastic gradient descent algorithm, the Adam optimization algorithm converges quickly; it can be used for non-stationary objective functions/data, i.e. where the mean and covariance of the gradient change considerably; it can handle noisy and/or sparse gradients; and it does not easily get stuck in a local optimum while still updating quickly.
2. The LSTM algorithm is adopted: LSTM is an enhanced recurrent neural network with better temporal memory capability and is an ideal model for processing long-term time-series data.
Drawings
FIG. 1 is a diagram of the 8-neural-unit LSTM network of the LSTM-based improved power short-time load prediction algorithm of the present invention
FIG. 2 is a diagram of an RNN unit of the LSTM-based improved power short-time load prediction algorithm of the present invention
FIG. 3 is a diagram of the RNN calculation formulas of the LSTM-based improved power short-time load prediction algorithm of the present invention
FIG. 4 is a diagram of the cell unit structure of the LSTM of the LSTM-based improved power short-time load prediction algorithm of the present invention
Detailed Description
The following description of the embodiments of the present invention is provided to help those skilled in the art understand the invention, but it should be understood that the invention is not limited to the scope of these embodiments. For those of ordinary skill in the art, various changes may be made without departing from the spirit and scope of the invention as defined in the appended claims, and everything produced using the inventive concept is protected.
Recurrent Neural Network (RNN)
The recurrent neural network (RNN) is a neural network dedicated to processing and predicting sequence data; thanks to its unique network structure it solves the problem of preserving information, and one of its key properties is that it can connect previous information to the current task. An RNN unit is shown in FIG. 2:
RNN principle: in the network module in the figure, A represents the hidden layer at time t; it reads an input Xt at time t and outputs a value ht, and the loop allows information to be passed from the current step to the next step. The RNN differs from a traditional neural network in that the hidden layer has two input sources: one is Xt, and the other is the output At-1 of the previous hidden state.
The RNN above can be expressed by the formulas shown in FIG. 3:
St=tanh(UXt+WSt-1) (1)
Ot=softmax(VSt) (2)
Because in real applications the output value Ot is often limited to a certain range, the memory state St at the current moment is multiplied by the weight V before being output. Note that the parameters W, U and V are the same in every layer of the network.
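The recurrence in formulas (1)-(2) can be illustrated with a short NumPy sketch; the dimensions and random weights below are purely illustrative and not part of the invention.

```python
# Minimal sketch of one RNN step, assuming illustrative sizes and random weights.
import numpy as np

def rnn_step(x_t, s_prev, U, W, V):
    s_t = np.tanh(U @ x_t + W @ s_prev)   # S_t = tanh(U·X_t + W·S_{t-1}), formula (1)
    logits = V @ s_t
    o_t = np.exp(logits - logits.max())
    o_t /= o_t.sum()                      # O_t = softmax(V·S_t), formula (2)
    return s_t, o_t

rng = np.random.default_rng(0)
U = rng.normal(size=(16, 4))    # input-to-hidden weights
W = rng.normal(size=(16, 16))   # hidden-to-hidden weights, shared across time steps
V = rng.normal(size=(3, 16))    # hidden-to-output weights
s = np.zeros(16)
for x_t in rng.normal(size=(5, 4)):       # a toy sequence of 5 inputs with 4 features
    s, o = rnn_step(x_t, s, U, W, V)
```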
Long short-term memory network (LSTM)
Due to a limitation of the RNN, as the time interval increases it loses the ability to learn from information far from the current connection (the gradient vanishes); LSTM emerged to solve this problem. LSTM alleviates the gradient explosion and gradient vanishing problems of the RNN and is a great improvement over it. The cell unit structure of the LSTM is shown in FIG. 4:
The LSTM adds three gate nodes at each level of the RNN. As shown in the figure, the input gate is placed after the input Xt; it filters the input to screen out the useful information fed into the network, and is usually implemented with a sigmoid function. The forget gate is placed after the memory state St-1 of the previous moment; its role is to select which part of the previous memory state should be preserved, and it is likewise implemented with a sigmoid function, consistent with formula (1) below. The last one is the output gate, which is placed after the output of the network and automatically extracts the important part of the output information; it is also implemented with a sigmoid function.
Therefore, the calculation formula is:
ft=σ(wf·[ht-1,xt]+bf) (1)
it=σ(wi·[ht-1,xt]+bi) (2)
gt=tanh(wg·[ht-1,xt]+bg) (3)
Ct=ftCt-1+itgt (4)
ot=σ(wo·[ht-1,xt]+bo) (5)
ht=ot tanh Ct (6)
where f, i, g, C and o respectively denote the forget gate, the input gate, the candidate cell state used for the update, the updated cell state and the output gate; w and b are the corresponding weight coefficient matrices and bias terms; σ and tanh respectively denote the sigmoid activation function and the hyperbolic tangent function. When a gate is open, the earlier states of the network influence the calculation at the current node; when it is closed, the previous calculation results have no bearing on the current state. Through this mechanism the LSTM reduces the overall amount of computation in the network, and it has become the mainstream architecture among recurrent neural networks in current applications.
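As an illustration of formulas (1)-(6), the following is a minimal NumPy sketch of one LSTM cell step; the hidden size, feature count and random weights are assumptions made only for the example.

```python
# Sketch of a single LSTM cell step following formulas (1)-(6); sizes are illustrative.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, w, b):
    z = np.concatenate([h_prev, x_t])      # [h_{t-1}, x_t]
    f_t = sigmoid(w["f"] @ z + b["f"])     # forget gate, formula (1)
    i_t = sigmoid(w["i"] @ z + b["i"])     # input gate, formula (2)
    g_t = np.tanh(w["g"] @ z + b["g"])     # candidate cell state, formula (3)
    c_t = f_t * c_prev + i_t * g_t         # updated cell state, formula (4)
    o_t = sigmoid(w["o"] @ z + b["o"])     # output gate, formula (5)
    h_t = o_t * np.tanh(c_t)               # hidden output, formula (6)
    return h_t, c_t

hidden, feat = 8, 4
rng = np.random.default_rng(0)
w = {k: rng.normal(size=(hidden, hidden + feat)) for k in "figo"}
b = {k: np.zeros(hidden) for k in "figo"}
h, c = np.zeros(hidden), np.zeros(hidden)
h, c = lstm_step(rng.normal(size=feat), h, c, w, b)
```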
Analysis of related influencing factors
Calendar effect
Calendar effects refer to fluctuations in electricity usage related to the date, including the week effect, the holiday effect and the month effect. The week effect arises because the successive alternation of weekdays and weekends creates a latent 7-day periodicity, which causes power usage to exhibit a specific cyclic pattern over that period; the daily load fluctuation produced by 8-hour work and production schedules also belongs to the week effect. The holiday effect refers to the influence of holidays on power usage, and it covers the week effect. The month effect, i.e. the seasonality that cannot be explained by temperature changes, is most pronounced during the summer vacation and the Spring Festival and influences the other months to varying degrees.
The invention adopts a causal model to handle the calendar effect; its main advantage is that it effectively handles the aperiodic features of the calendar effect (lunar-calendar holidays, leap years and the like) that are difficult for a dynamic model to capture from the sample. The calendar effect model is shown in equation (7):
fi(Dt) = Σw βwi·Dwt + Σj βji·Djt + Σk βki·DMkt    (7)
In the formula: fi(Dt) represents the influence of the calendar effect on the load at hour i of day t. The first term on the right of the equals sign is the week effect: Dwt is a dummy variable equal to 1 when day t falls on weekday w, and βwi is the effect of weekday w on the power load at hour i. The second term is the holiday effect: Djt is a dummy variable equal to 1 when day t belongs to holiday type j, and βji is the effect of holiday type j on the load at hour i; when Djt = 1, Dwt = 0, i.e. the week effect is covered by the holiday effect. The third term is the month effect: DMkt is a dummy variable equal to 1 when day t belongs to month k, and βki is the effect of month k on the load at hour i.
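For illustration, the dummy variables Dwt, Djt and DMkt of equation (7) could be constructed as in the following pandas sketch; the holiday list passed in is a placeholder assumption.

```python
# Sketch of building the calendar-effect dummy variables of equation (7) with pandas.
import pandas as pd

def calendar_dummies(dates: pd.DatetimeIndex, holidays=None) -> pd.DataFrame:
    holidays = pd.DatetimeIndex(holidays if holidays is not None else [])
    week = pd.get_dummies(dates.dayofweek, prefix="D_w").set_index(dates)   # week-effect dummies
    month = pd.get_dummies(dates.month, prefix="D_M").set_index(dates)      # month-effect dummies
    hol = pd.DataFrame({"D_j": dates.isin(holidays).astype(int)}, index=dates)  # holiday dummy
    out = pd.concat([week, month, hol], axis=1)
    # The holiday effect covers the week effect: zero the weekday dummies on holidays.
    week_cols = [c for c in out.columns if str(c).startswith("D_w")]
    out.loc[out["D_j"] == 1, week_cols] = 0
    return out

dummies = calendar_dummies(pd.date_range("2018-06-01", "2018-08-30"),
                           holidays=["2018-06-18"])  # placeholder holiday date
```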
Effect of temperature
The external temperature is another factor that determines power demand, and it has a particularly significant influence on load prediction at medium and short time scales, in the following respects. First, temperature changes the direction of change, or the demand state, of power consumption: when the temperature is already high, a further rise increases power consumption through air-conditioning use, whereas at low temperatures a rise in temperature reduces power consumption. Second, temperature changes affect power demand according to the temperature elasticity of demand. Third, the calendar effect may interact with the temperature effect. The invention models the temperature effect along two dimensions; the first dimension estimates the temperature thresholds at which demand behaviour adjusts, for which a piecewise linear auxiliary regression as in equation (8) is first constructed.
In the formula: the dependent variable is the load of hour i on day t after the long-term trend has been removed with an H-P filter; τ and τ̄ are the low-temperature and high-temperature thresholds, respectively; ε′it is a random error term; Tt is the daily average air temperature on day t; the remaining variables are regression parameters. The high and low temperature thresholds are given repeatedly and the regression is run each time; the AIC criterion is then checked, and the low- and high-temperature thresholds corresponding to the minimum AIC are the estimates that meet the requirements.
It should be noted that the temperature effect is more closely related to indoor temperature, and ideally hourly indoor temperature data would be used, given that the invention models each hourly load separately. However, acquiring hourly temperature data is costly, and because the relationship between indoor and outdoor temperature is affected by the thermal performance of the building, no universal formula for computing indoor temperature from outdoor temperature can be derived; therefore only daily-frequency outdoor temperature is used in the temperature effect modelling. With such limited data, in order to better capture the real temperature effect, the invention introduces the minimum, maximum and daily average temperature into the model, together with the moving average temperature over five or more consecutive days; at the same time, to deal with the nonlinearity of the temperature elasticity, the square of the temperature is also used as an explanatory variable. In summary, the temperature effect is given by equation (9).
In the formula: gi represents the temperature effect and is a function of the temperature T; Tt is the daily average temperature, taken as the mean of the day's maximum and minimum temperatures; τ and τ̄ are the estimated low- and high-temperature thresholds, respectively; the moving-average temperature is the moving average of the daily average temperature over five or more consecutive days; At is the within-day temperature difference; the remaining variables are regression parameters.
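As an illustration, the temperature-related explanatory variables described above (daily average, moving average, within-day difference, squared term and threshold indicators) could be assembled as in the following pandas sketch; the column names t_max and t_min and the quantile-based stand-ins for the AIC-selected thresholds are assumptions.

```python
# Sketch of the temperature-effect explanatory variables, under the assumptions above.
import pandas as pd

def temperature_features(df: pd.DataFrame) -> pd.DataFrame:
    out = pd.DataFrame(index=df.index)
    out["t_mean"] = (df["t_max"] + df["t_min"]) / 2        # daily average temperature
    out["t_mean_ma5"] = out["t_mean"].rolling(5).mean()    # moving average over 5 days
    out["t_range"] = df["t_max"] - df["t_min"]             # within-day temperature difference
    out["t_mean_sq"] = out["t_mean"] ** 2                  # squared term for nonlinearity
    tau_lo, tau_hi = out["t_mean"].quantile([0.1, 0.9])    # stand-ins for the AIC-selected thresholds
    out["below_lo"] = (out["t_mean"] < tau_lo).astype(int)
    out["above_hi"] = (out["t_mean"] > tau_hi).astype(int)
    return out
```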
Daytime effect
The daytime effect is the influence of the length of daylight on electricity usage; in general, the shorter the daylight hours, the more electricity is used for lighting and the higher the electricity consumption. Because China spans wide ranges of latitude and longitude, the daylight hours differ noticeably across regions and seasons, so the model takes the daytime effect into account. Specifically, from the perspective of hour-by-hour analysis and prediction, the daytime effect is mainly reflected in the influence of sunrise and sunset times on electricity consumption in the adjacent periods, while periods of stable daylight or darkness are unaffected. Meteorological records show that sunrise in China falls between 5:00 and 7:00 Beijing time throughout the year, and sunset between 17:00 and 19:00. Therefore, for load prediction in these periods, the invention introduces the daytime effect as a high-frequency component to explain the influence of daylight on electricity usage, as shown in equation (10).
In the formula: mi denotes the influence of the daytime effect on the load at hour i and is a function of the sunrise time Srt and the sunset time Sst; the remaining variables are regression parameters.
LSTM-based power load prediction
LSTM is an enhanced recurrent neural network with good temporal memory capability and is an ideal model for long-term time-series problems; we therefore propose a power load prediction algorithm based on the long short-term memory neural network. FIG. 3 shows the overall flow of the network model. All power data are preprocessed, and the one-hot method is used to encode the non-numeric weather feature, producing a more regular input tensor. The input layer dimension of the network is n × f, where n is the number of neural units in each layer of the network, corresponding to n pieces of training data, and f is the number of features contained in one power data record. As the computational core of the model, the hidden layer is designed as an LSTM network with 8 neural units, as shown in FIG. 1; the output of each layer serves as the input of the next layer, and not all hidden layers are drawn owing to space limitations. After the forward propagation of the network, the calculation results of all hidden layers are integrated by a fully connected layer, and the final predicted value is obtained through a ReLU linear rectification mapping.
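A minimal sketch of such a stacked LSTM regression network, written against the TensorFlow/Keras API used in the experiments, is shown below; the sequence length, layer widths and feature count are illustrative assumptions rather than the exact configuration of the invention.

```python
# Sketch of a stacked LSTM + fully connected + ReLU regression model (illustrative sizes).
import tensorflow as tf

n_steps = 8      # assumed sequence length, mirroring the 8 neural units of FIG. 1
n_features = 5   # f: number of features per power-data record (placeholder value)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(n_steps, n_features)),
    tf.keras.layers.LSTM(64, return_sequences=True),   # hidden LSTM layer
    tf.keras.layers.LSTM(64),                           # output of each layer feeds the next
    tf.keras.layers.Dense(1, activation="relu"),        # fully connected layer with ReLU mapping
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01), loss="mse")
model.summary()
```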
The whole process can be formulated as:
yi=π(f8(f7(…f1(xi)))) (11)
wherein xi and yi are the ith power data record and its corresponding predicted output, fi is the ith layer of the neural network in the model, and π denotes the fully connected layer.
In order to update the network parameters automatically, the error back-propagation rule is adopted to derive the parameter gradients automatically. Traditional deep models mostly use the batch stochastic gradient descent (batch-SGD) algorithm for iterative optimization of the hyper-parameters during training, which often leads to numerical instability, and batch-SGD also places high demands on the memory available to the network. Compared with batch-SGD, the mini-batch Adam optimization algorithm has a small memory footprint, trains stably and suits high-noise data problems, so mini-batch Adam is used here in place of traditional batch-SGD.
Adam optimization algorithm
Adam is a first-order optimization algorithm that can replace the traditional stochastic gradient descent (SGD) process; it iteratively updates the neural network weights based on the training data. Adam was originally proposed by Diederik Kingma of OpenAI and Jimmy Ba of the University of Toronto in a paper submitted to ICLR 2015 (Adam: A Method for Stochastic Optimization). Compared with the basic stochastic gradient descent algorithm it converges quickly; it can be used for non-stationary objective functions/data, i.e. where the mean and covariance of the gradient change considerably; it can handle noisy and/or sparse gradients; and it does not easily get stuck in a local optimum while still updating quickly.
The parameter update procedure of the Adam algorithm is as follows:
while θt has not converged do
  t = t + 1
  gt = ∇θ ft(θt-1)
  mt = β1·mt-1 + (1 - β1)·gt
  vt = β2·vt-1 + (1 - β2)·gt²
  m̂t = mt/(1 - β1^t)
  v̂t = vt/(1 - β2^t)
  θt = θt-1 - α·m̂t/(√v̂t + ε)
end while
where θt-1 is the parameter to be updated; α is the learning rate; gt is the gradient of the stochastic objective function; mt is the biased first-order moment estimate, with m0 = 0; vt is the biased second-order moment estimate, with v0 = 0; β1 and β2 are the exponential decay rates of the moment estimates; ε is a small positive number. The default values of these parameters in machine learning problems are α = 0.01, β1 = 0.9, β2 = 0.999, ε = 10-8.
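For illustration, the update rule above can be written as a short NumPy routine; the quadratic test objective and the hypothetical function name adam_minimize are assumptions for the example, while the default hyper-parameter values follow the text.

```python
# Sketch of the Adam parameter update with the default values given above.
import numpy as np

def adam_minimize(grad_fn, theta, alpha=0.01, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=1000):
    m = np.zeros_like(theta)            # m_0 = 0, first-order moment estimate
    v = np.zeros_like(theta)            # v_0 = 0, second-order moment estimate
    for t in range(1, steps + 1):
        g = grad_fn(theta)              # g_t: gradient of the objective
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g**2
        m_hat = m / (1 - beta1**t)      # bias-corrected first moment
        v_hat = v / (1 - beta2**t)      # bias-corrected second moment
        theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta

# usage: minimize f(theta) = ||theta - 3||^2, whose gradient is 2*(theta - 3)
theta_opt = adam_minimize(lambda th: 2 * (th - 3.0), np.zeros(4))
```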
Model evaluation index
When evaluating a regression prediction model, the mean absolute percentage error (MAPE) and the root mean square error (RMSE) are generally used; their calculation formulas are:
MAPE = (1/n)·Σi |(yi - ŷi)/yi| × 100%
RMSE = √((1/n)·Σi (yi - ŷi)²)
where yi and ŷi are the actual and predicted loads and n is the number of samples.
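A small NumPy sketch of these two metrics is given below; the sample values in the usage line are arbitrary, and the true loads are assumed to be non-zero.

```python
# Sketch of the MAPE and RMSE evaluation metrics named above.
import numpy as np

def mape(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0

def rmse(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

print(mape([100.0, 120.0], [90.0, 126.0]), rmse([100.0, 120.0], [90.0, 126.0]))
```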
Experimental Environment
All models of the invention are implemented in the Python 3.7 programming language, and the LSTM-based power load prediction model is implemented with the TensorFlow deep learning library. The hardware environment for model training is Windows 10, an Intel Core i5-6500 CPU, an NVIDIA GeForce GTX 1070 Ti GPU and 12 GB of memory.
Data pre-processing
The data set used in this work consists of more than 40,000 power data records collected by the State Grid Hebei Electric Power Company from 1 June 2018 to 30 August 2018, randomly divided into a training set and a test set at a ratio of 9:1. The quality of the data features determines, to some extent, the upper limit that the model performance can reach, so the necessary data preprocessing is very important.
Since weather is not a numerical attribute, each weather type is represented by a corresponding one-hot code. Finally, to improve the prediction accuracy of the model and speed up convergence, all features are normalized.
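For illustration, the one-hot encoding and normalization steps could look like the following pandas sketch; the column names (load, temperature, weather) and the min-max scaling are assumptions, since the text does not specify the exact normalization method.

```python
# Sketch of the preprocessing: one-hot encode the weather feature, then min-max normalize.
import pandas as pd

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    encoded = pd.get_dummies(df, columns=["weather"]).astype(float)   # one-hot encode weather
    lo, hi = encoded.min(), encoded.max()
    span = (hi - lo).replace(0, 1.0)                                  # guard constant columns
    return (encoded - lo) / span                                      # min-max normalization

sample = pd.DataFrame({"load": [310.0, 295.5, 402.1],
                       "temperature": [28.4, 26.9, 31.2],
                       "weather": ["sunny", "rain", "cloudy"]})
print(preprocess(sample))
```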
Claims (10)
1. An LSTM-based improved electric power short-time load prediction algorithm, characterized by comprising the following steps:
S1: preprocessing all the power data;
S2: encoding the non-numeric weather feature with the one-hot method, so that a more regular input tensor is generated;
S3: setting the input layer dimension of the network;
S4: using an LSTM network as the hidden layer, which is the calculation core of the model, with the output of each layer serving as the input of the next layer;
S5: after the forward propagation of the network, integrating the calculation results of all the hidden layers with a fully connected layer and mapping them through a ReLU linear rectification function to obtain the final predicted value;
S6: adopting the error back-propagation rule to realize automatic derivation of the parameter gradients, so that the network parameters are updated automatically;
S7: adopting the mini-batch Adam optimization algorithm to realize iterative optimization training of the hyper-parameters of the model;
S8: evaluating the regression prediction model with the mean absolute percentage error and the root mean square error.
2. The LSTM improved power short-term load prediction algorithm according to claim 1, wherein the calculation formula of the LSTM improved power short-term load prediction algorithm is:
yi=π(f8(f7(…f1(xi))))
wherein xi and yi are the ith power data record and its corresponding predicted output, fi is the ith layer of the neural network in the model, and π denotes the fully connected layer.
3. The LSTM-based improved power short-time load prediction algorithm of claim 1, wherein the update formula for θt in the Adam optimization algorithm of S7 is:
θt = θt-1 - α·m̂t/(√v̂t + ε)
where θt-1 is the parameter to be updated; m̂t and v̂t are the bias-corrected estimates obtained from mt (the biased first-order moment estimate, m0 = 0) and vt (the biased second-order moment estimate, v0 = 0); β1 and β2 are the exponential decay rates of the moment estimates; ε is a small positive number. The default values of these parameters in machine learning problems are α = 0.01, β1 = 0.9, β2 = 0.999, ε = 10-8.
5. The LSTM-based improved power short-time load prediction algorithm according to claim 1, wherein the formula for mt in the Adam optimization algorithm of S7 is:
mt = β1·mt-1 + (1 - β1)·gt
where θt-1 is the parameter to be updated; α is the learning rate; gt is the gradient of the stochastic objective function; mt is the biased first-order moment estimate, with m0 = 0; vt is the biased second-order moment estimate, with v0 = 0; β1 and β2 are the exponential decay rates of the moment estimates; ε is a small positive number. The default values of these parameters in machine learning problems are α = 0.01, β1 = 0.9, β2 = 0.999, ε = 10-8.
6. The LSTM-based improved power short-time load prediction algorithm according to claim 1, wherein the formula for vt in the Adam optimization algorithm of S7 is:
vt = β2·vt-1 + (1 - β2)·gt²
where θt-1 is the parameter to be updated; α is the learning rate; gt is the gradient of the stochastic objective function; mt is the biased first-order moment estimate, with m0 = 0; vt is the biased second-order moment estimate, with v0 = 0; β1 and β2 are the exponential decay rates of the moment estimates; ε is a small positive number. The default values of these parameters in machine learning problems are α = 0.01, β1 = 0.9, β2 = 0.999, ε = 10-8.
9. The LSTM-based improved power short-term load prediction algorithm according to claim 1, wherein the input layer dimension of the network of S3 is n x f, where n is the number of neural units in each layer of the network, corresponding to n training data, and f is the number of features included in one power data record.
10. The LSTM improved power short-term load prediction algorithm according to claim 1, wherein the LSTM network of S4 is an LSTM network comprising 8 neural units.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910899903.0A CN110659775A (en) | 2019-09-23 | 2019-09-23 | LSTM-based improved electric power short-time load prediction algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910899903.0A CN110659775A (en) | 2019-09-23 | 2019-09-23 | LSTM-based improved electric power short-time load prediction algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110659775A true CN110659775A (en) | 2020-01-07 |
Family
ID=69038951
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910899903.0A Pending CN110659775A (en) | 2019-09-23 | 2019-09-23 | LSTM-based improved electric power short-time load prediction algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110659775A (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190279076A1 (en) * | 2018-03-09 | 2019-09-12 | Deepmind Technologies Limited | Learning from delayed outcomes using neural networks |
CN110084424A (en) * | 2019-04-25 | 2019-08-02 | 国网浙江省电力有限公司 | A kind of Methods of electric load forecasting based on LSTM and LGBM |
CN110070229A (en) * | 2019-04-26 | 2019-07-30 | 中国计量大学 | The short term prediction method of home electrical load |
Non-Patent Citations (2)
Title |
---|
DIEDERIK P. KINGMA et al.: "Adam: A Method for Stochastic Optimization", https://arxiv.org/abs/1412.6980 * |
LI Songling: "Short-term power load forecasting with a TensorFlow-based LSTM recurrent neural network", Shanghai Energy Conservation (《上海节能》) * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111210091A (en) * | 2020-02-25 | 2020-05-29 | 上海积成能源科技有限公司 | System model for predicting short-term power load based on long and short-term memory model of recurrent neural network |
CN112685900A (en) * | 2020-12-31 | 2021-04-20 | 国网浙江省电力有限公司营销服务中心 | Power load simulation method for representing impact load power characteristics |
CN112685900B (en) * | 2020-12-31 | 2023-09-26 | 国网浙江省电力有限公司营销服务中心 | Power load simulation method for representing impact load power characteristics |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113962364B (en) | Multi-factor power load prediction method based on deep learning | |
CN110298501B (en) | Electrical load prediction method based on long-time and short-time memory neural network | |
Wang et al. | A seasonal GM (1, 1) model for forecasting the electricity consumption of the primary economic sectors | |
Xiao et al. | A combined model based on multiple seasonal patterns and modified firefly algorithm for electrical load forecasting | |
CN111260136A (en) | Building short-term load prediction method based on ARIMA-LSTM combined model | |
CN110210993B (en) | Urban short-term gas load prediction method based on cyclic neural network model | |
CN113554466A (en) | Short-term power consumption prediction model construction method, prediction method and device | |
CN110212524A (en) | A kind of region Methods of electric load forecasting | |
CN114862032B (en) | XGBoost-LSTM-based power grid load prediction method and device | |
CN112149890A (en) | Comprehensive energy load prediction method and system based on user energy label | |
CN115169703A (en) | Short-term power load prediction method based on long-term and short-term memory network combination | |
CN114119273A (en) | Park comprehensive energy system non-invasive load decomposition method and system | |
Gao et al. | A novel model for the prediction of long-term building energy demand: LSTM with Attention layer | |
CN112288140A (en) | Keras-based short-term power load prediction method, storage medium and equipment | |
Girimurugan et al. | Application of deep learning to the prediction of solar irradiance through missing data | |
Chen et al. | MultiCycleNet: multiple cycles self-boosted neural network for short-term electric household load forecasting | |
Tavares et al. | Comparison of PV power generation forecasting in a residential building using ANN and DNN | |
CN116029419A (en) | Deep learning-based long-term new energy daily average generation power prediction method and system | |
CN111784019B (en) | Power load processing method and device | |
CN110659775A (en) | LSTM-based improved electric power short-time load prediction algorithm | |
Heydari et al. | Mid-term load power forecasting considering environment emission using a hybrid intelligent approach | |
Zuo | Integrated forecasting models based on LSTM and TCN for short-term electricity load forecasting | |
Alomoush et al. | Residential Power Load Prediction in Smart Cities using Machine Learning Approaches | |
CN114611757A (en) | Electric power system short-term load prediction method based on genetic algorithm and improved depth residual error network | |
Xuan et al. | A comprehensive evaluation of statistical, machine learning and deep learning models for time series prediction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200107 |