CN117592593A

CN117592593A - Short-term power load prediction method based on improved secondary modal decomposition and WOA-optimized BiLSTM-Attention

Info

Publication number: CN117592593A
Application number: CN202311366894.1A
Authority: CN
Inventors: 梅锦超; 李文武; 邵江城; 邱宪达; 杨苗; 查子健
Original assignee: China Three Gorges University CTGU
Current assignee: China Three Gorges University CTGU
Priority date: 2023-10-20
Filing date: 2023-10-20
Publication date: 2024-02-23

Abstract

A short-term power load prediction method based on improved secondary modal decomposition and WOA-optimized BiLSTM-Attention, comprising: Step1, preprocessing historical power load data; Step2, decomposing the preprocessed data by complete ensemble empirical mode decomposition to obtain a plurality of subsequence components; Step3, further decomposing by variational mode decomposition, establishing an evaluation criterion, and optimizing the number of decomposition modes and the penalty factor with a whale optimization algorithm; Step4, selecting the influence factors with the highest correlation to obtain an optimal feature set; Step5, constructing a bidirectional long short-term memory neural network for training and prediction, defining a loss function as the objective function, and optimizing the hyperparameters with a whale optimization algorithm; Step6, predicting to obtain the final prediction result; Step7, verifying the validity of the prediction model. The secondary modal decomposition adaptively decomposes the load sequence into a suitable number of simple modal components; the bidirectional processing capability of the BiLSTM neural network, the attention mechanism, and the whale optimization algorithm further improve the prediction accuracy.

Description

Short-term power load prediction method based on improved secondary modal decomposition and WOA-optimized BiLSTM-Attention

Technical Field

The invention relates to the technical field of power system prediction, in particular to a short-term power load prediction method based on improved secondary modal decomposition and WOA-optimized BiLSTM-Attention.

Background

On the time scale, power load prediction can be classified into ultra-short-term (the next 4 h), short-term (the next 72 h), and medium- and long-term prediction (the next 1 month to 1 year); by prediction mode, it can be classified into point prediction, interval prediction, and probability prediction. Short-term load prediction is an important component of load prediction, and the commonly used methods fall into three main categories: traditional mathematical statistical models, machine learning, and deep learning. Traditional methods and machine learning predict quickly but neglect the temporal relationship between samples. In recent years, deep learning has been widely used. The BP neural network back-propagates errors, trains slowly, and has low prediction accuracy. The support vector machine has good learning ability, but its generalization ability is weak and its prediction accuracy is not high. As a special variant of the recurrent neural network (RNN), the long short-term memory (LSTM) neural network is currently the most popular deep neural network in the field of time series prediction. It inherits the sequentially connected hidden-layer structure of the RNN and replaces simple hidden-layer neurons with long short-term memory units, which solves the long-term dependence problem caused by gradient explosion and gradient vanishing. However, the LSTM neural network uses only information from past time steps and does not consider information from future time steps, so it cannot learn the relationship between the forward and backward time directions of the load data.
The literature proposes a DA mechanism that adds attention mechanisms at the encoder and decoder stages respectively to highlight important information in the input and output, and applies it to short-term power load prediction so that a bidirectional long short-term memory neural network model can fully learn the coupling between the load and its influencing factors.

To overcome the drawback of randomly initialized LSTM network parameters, the network weights can be tuned with an optimization algorithm, which improves network stability and prediction accuracy. Common optimization algorithms include the genetic algorithm, the particle swarm optimization (PSO) algorithm, and the simulated annealing algorithm. The whale optimization algorithm (WOA) was proposed in 2016. WOA is a swarm intelligence optimization algorithm that simulates how a pod of whales encircles and chases its prey. The algorithm is simple in principle, easy to implement, has few parameters to set, and is superior to PSO in solution accuracy and convergence speed.

Because directly predicting a fluctuating load sequence produces large errors, decomposition is used to weaken the nonlinearity of the sequence and improve prediction accuracy. Some techniques decompose a wind power time series by variational mode decomposition, predict each subsequence, and then reconstruct the prediction results of the subsequences; others decompose a photovoltaic power time series by empirical mode decomposition and feed the result to a support vector machine for prediction. Although predicting after decomposition improves accuracy, spectrum aliasing occurs during decomposition, and the strongly non-stationary high-frequency components produced by decomposition also cause large prediction errors. Applying a secondary decomposition to the time series before predicting improves the prediction accuracy of the high-frequency components of the primary decomposition, but it generates too many subcomponents, so predicting each of them takes a long time. In addition, the high-frequency modal components still have high entropy, and the modal decomposition process lacks an evaluation criterion to guide parameter setting; parameters are often set by experience, so the decomposition effect is not ideal.

Disclosure of Invention

The technical problem to be solved by the invention is to provide a short-term power load prediction method based on improved secondary modal decomposition and WOA-optimized BiLSTM-Attention, which addresses the poor prediction accuracy of prior-art power load prediction based on artificial neural networks.

In order to solve the technical problems, the invention adopts the following technical scheme:

A short-term power load prediction method based on improved secondary modal decomposition and WOA-optimized BiLSTM-Attention, comprising the steps of:

Step1, preprocessing historical power load data, including filling missing values and removing abnormal values in the original load data;

Step2, decomposing the preprocessed data by complete ensemble empirical mode decomposition to obtain a plurality of subsequence components;

Step3, further decomposing the high-frequency components obtained in Step2 by variational mode decomposition, establishing an evaluation criterion, and optimizing the number of decomposition modes and the penalty factor with a whale optimization algorithm;

Step4, selecting the most strongly correlated influence factors by means of the maximal information coefficient to obtain an optimal feature set;

Step5, constructing a bidirectional long short-term memory neural network for each component subsequence, training and predicting with it, adding an attention mechanism, defining a loss function as the objective function, and optimizing the hyperparameters with a whale optimization algorithm;

Step6, predicting the result of each component and superposing the results to obtain the final prediction result;

Step7, verifying the validity of the prediction model.

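In Step3, the variational mode decomposition process is as follows:

Step3.1, the variational problem is to minimize the sum of the estimated bandwidths of the intrinsic mode components, subject to the constraint that the sum of the intrinsic mode components equals the original signal; the formula is as follows:

$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t) \tag{1}$$

wherein: $\{u_k\}$ is the set of intrinsic mode components; $\{\omega_k\}$ is the set of center frequencies; $\delta(t)$ is the impulse signal; $K$ is the preset number of decomposition modes; $f(t)$ is the original signal;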

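Step3.2, a penalty factor $\alpha$ and a Lagrangian multiplier $\lambda(t)$ are used to convert the constrained variational problem into an unconstrained problem, where $\alpha$ affects the reconstruction accuracy of the signal and $\lambda(t)$ maintains the strictness of the constraint, as follows:

$$L(\{u_k\},\{\omega_k\},\lambda) = \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle \tag{2}$$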

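Step3.3, the unconstrained problem is solved by the alternating direction method of multipliers, thereby effectively separating the signal frequencies; the iterative update formulas of the intrinsic mode components and the center frequencies are respectively as follows:

$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha(\omega - \omega_k)^2} \tag{3}$$

$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega \left| \hat{u}_k^{n+1}(\omega) \right|^2 d\omega}{\int_0^{\infty} \left| \hat{u}_k^{n+1}(\omega) \right|^2 d\omega} \tag{4}$$

wherein: $\hat{u}_k^{n+1}(\omega)$ is the k-th intrinsic mode component with center frequency $\omega$ at the (n+1)-th iteration, which is equivalent to Wiener filtering of the current residual signal; the time-domain component $u_k(t)$ is obtained by performing the inverse Fourier transform on $\hat{u}_k(\omega)$;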

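Step3.4, an evaluation criterion $L_{loss}$ suitable for the field of time series prediction is proposed; the formula is

$$L_{loss} = \frac{1}{N} \sum_{t=1}^{N} \left| f(t) - f'(t) \right| \tag{5}$$

wherein: $f'(t) = \sum_{k=1}^{K} u_k(t)$ is the VMD reconstructed signal; $N$ is the number of sampling points; $L_{loss}$ is the mean absolute error between the VMD reconstructed signal and the original signal;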

Step3.5, taking the minimum of the evaluation criterion $L_{loss}$ as the objective function, the whale optimization algorithm (WOA) is used to optimize the penalty factor and the number of decomposition modes K of the variational mode decomposition. The whale optimization algorithm works as follows: following the hunting behavior of whales, the prey is first encircled, and the position is then adjusted according to the prey position and a spiral movement, so as to obtain the optimal encircling strategy. After the whales shrink the encirclement, the spiral equation is used to update the next position of each whale so that suitable prey is hunted efficiently:

$$X(t+1) = D^{i}\, e^{bl} \cos(2\pi l) + X^{*}(t) \tag{6}$$

$$D^{i} = \left| X^{*}(t) - X(t) \right| \tag{7}$$

wherein: $D^{i}$ is the distance between the i-th whale and the prey, the prey being the current optimal position; $X^{*}(t)$ is the position vector of the best solution obtained at iteration t; $b$ is a constant defining the shape of the logarithmic spiral; $l$ is a random number between -1 and 1. During the update, a random probability controls whether the whales shrink the encirclement or move along the spiral, and the search range is enlarged to find other suitable prey, which strengthens the global search capability of WOA. Finally, the current optimal solution and the global optimal solution are selected through iterative updating.
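The spiral update above can be placed in a complete search loop. The following is a minimal sketch under stated assumptions: the coefficients `a`, `A`, `C`, the 0.5 switching probability, and the random-whale exploration step follow the commonly published WOA formulation and are not spelled out in the text, and the function name `woa_minimize` and the sphere test objective are hypothetical. In the method described here the objective would be $L_{loss}$ evaluated for a candidate pair (K, α).

```python
import numpy as np

def woa_minimize(objective, lower, upper, n_whales=10, n_iter=100, b=1.0, seed=0):
    """Minimal whale optimization sketch (assumed standard formulation)."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    X = rng.uniform(lower, upper, size=(n_whales, lower.size))
    fitness = np.array([objective(x) for x in X])
    best, best_val = X[fitness.argmin()].copy(), float(fitness.min())
    for t in range(n_iter):
        a = 2.0 - 2.0 * t / n_iter            # decreases linearly from 2 toward 0
        for i in range(n_whales):
            A = 2.0 * a * rng.random() - a    # step coefficient (assumption)
            C = 2.0 * rng.random()            # weighting coefficient (assumption)
            if rng.random() < 0.5:
                # |A| < 1: shrink toward the best whale; otherwise explore
                # around a randomly chosen whale
                target = best if abs(A) < 1.0 else X[rng.integers(n_whales)]
                X[i] = target - A * np.abs(C * target - X[i])
            else:
                # spiral update: X(t+1) = D^i * e^(b*l) * cos(2*pi*l) + X*(t)
                l = rng.uniform(-1.0, 1.0)
                D = np.abs(best - X[i])
                X[i] = D * np.exp(b * l) * np.cos(2.0 * np.pi * l) + best
            X[i] = np.clip(X[i], lower, upper)
            val = float(objective(X[i]))
            if val < best_val:                # track the best solution found so far
                best, best_val = X[i].copy(), val
    return best, best_val

def sphere(x):
    """Hypothetical test objective with its minimum at the origin."""
    return float(np.sum(x ** 2))

best, val = woa_minimize(sphere, [-5.0, -5.0], [5.0, 5.0])
```

This sketch updates the best solution immediately after each whale moves; updating it once per iteration is an equally common choice.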

In Step1, missing values in the input data set are filled by interpolation, and the data set is then normalized by linear function (min-max) normalization.

The prediction result obtained in Step6 is evaluated with the loss function constructed in Step5, and the parameters of the prediction network obtained in Step5 are adjusted based on the evaluation until the prediction network that minimizes the loss function is obtained; that network is used to produce the prediction result.
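A minimal sketch of this preprocessing is given below. It assumes that the interpolation is linear and that missing samples are marked as NaN; the text states neither, and the function names are hypothetical.

```python
import numpy as np

def fill_missing_linear(series):
    """Fill NaN entries by linear interpolation over the sample index."""
    values = np.asarray(series, dtype=float).copy()
    idx = np.arange(values.size)
    missing = np.isnan(values)
    # np.interp evaluates the piecewise-linear curve through the known samples
    values[missing] = np.interp(idx[missing], idx[~missing], values[~missing])
    return values

def min_max_normalize(values):
    """Linear function normalization to [0, 1]; assumes max > min."""
    lo, hi = float(values.min()), float(values.max())
    return (values - lo) / (hi - lo), (lo, hi)

load = [10.0, np.nan, 14.0, 20.0, np.nan, 30.0]
filled = fill_missing_linear(load)
scaled, (lo, hi) = min_max_normalize(filled)
```

The returned `(lo, hi)` pair is kept so that predictions can be mapped back to the original load scale.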

The loss functions described above include mean square error, root mean square error, mean absolute error, and mean absolute percentage error.

The invention provides a short-term power load prediction method based on improved secondary modal decomposition and WOA-optimized BiLSTM-Attention. The original power load sequence is decomposed by complete ensemble empirical mode decomposition with adaptive noise; the highly complex sequences are further decomposed by variational mode decomposition; an evaluation criterion is established for the decomposition result, and the hyperparameters of the variational mode decomposition are optimized with a whale optimization algorithm. The relevant factors such as weather and electricity price are then ranked by importance using the maximal information coefficient to obtain an optimal feature set. A bidirectional long short-term memory neural network is established for each modal component obtained by decomposition, combined with the feature set; an attention mechanism is added, and the hyperparameters of the neural network are optimized with the whale algorithm. Finally, the component predictions are superposed to obtain the final prediction result, which improves the prediction accuracy.

Drawings

The invention is further illustrated by the following examples in conjunction with the accompanying drawings:

FIG. 1 is the short-term power load prediction flow chart;

FIG. 2 shows the modal components obtained by CEEMDAN;

FIG. 3 shows the modal components obtained by VMD;

FIG. 4 is a comparison of the model prediction results of Example 1;

FIG. 5 is a comparison of the model prediction results of Example 2.

Detailed Description

The technical scheme of the invention is described in detail below with reference to the accompanying drawings and examples.

As shown in FIG. 1, a short-term power load prediction method based on improved secondary modal decomposition and WOA-optimized BiLSTM-Attention comprises the steps of:

Step2, decomposing the preprocessed data by complete ensemble empirical mode decomposition to obtain a plurality of subsequence components, using the CEEMDAN time-frequency decomposition method;

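wherein: $\hat{u}_k^{n+1}(\omega)$ is the k-th intrinsic mode component with center frequency $\omega$ at the (n+1)-th iteration, which is equivalent to Wiener filtering of the current residual signal; the time-domain component $u_k(t)$ is obtained by performing the inverse Fourier transform on $\hat{u}_k(\omega)$;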

Step3.4, under ideal conditions the reconstructed signal f'(t) is identical to the original signal f(t), but a decomposition loss occurs in the actual decomposition process. The decomposition loss comes from residual signals that do not meet the definition of an intrinsic mode function; they have small amplitude, fluctuate quickly, and are hard to predict, so the model often ignores them. However, the larger the decomposition loss, the more information is ignored, which hinders the improvement of prediction accuracy. Therefore, an evaluation criterion $L_{loss}$ suitable for the field of time series prediction is proposed; the formula is

$$L_{loss} = \frac{1}{N} \sum_{t=1}^{N} \left| f(t) - f'(t) \right| \tag{5}$$

wherein: $N$ is the number of sampling points; $L_{loss}$ is the mean absolute error between the VMD reconstructed signal and the original signal; a smaller value indicates a smaller decomposition loss, more information contained in the modes, and a more accurate model;
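The evaluation criterion can be sketched as follows. The two sinusoidal modes and the constant residual are made-up test data, and `decomposition_loss` is a hypothetical name; in practice the modes would come from a VMD implementation run with a candidate (K, α) pair.

```python
import numpy as np

def decomposition_loss(original, modes):
    """Mean absolute error between the original signal and the sum of its modes."""
    original = np.asarray(original, dtype=float)
    reconstructed = np.sum(np.asarray(modes, dtype=float), axis=0)
    return float(np.mean(np.abs(original - reconstructed)))

t = np.linspace(0.0, 1.0, 200, endpoint=False)
mode_low = np.sin(2 * np.pi * 2 * t)           # slow component
mode_high = 0.5 * np.sin(2 * np.pi * 20 * t)   # fast component
residual = np.full_like(t, 0.05)               # part the decomposition fails to capture
signal = mode_low + mode_high + residual

loss_partial = decomposition_loss(signal, [mode_low, mode_high])
loss_full = decomposition_loss(signal, [mode_low, mode_high, residual])
```

Here `loss_partial` equals the mean magnitude of the residual that was left out, while `loss_full` is zero up to rounding because nothing is lost.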

$$X(t+1) = D^{i}\, e^{bl} \cos(2\pi l) + X^{*}(t) \tag{6}$$

$$D^{i} = \left| X^{*}(t) - X(t) \right| \tag{7}$$

wherein: $D^{i}$ is the distance between the i-th whale and the prey, the prey being the current optimal position; $X^{*}(t)$ is the position vector of the best solution obtained at iteration t; $b$ is a constant defining the shape of the logarithmic spiral; $l$ is a random number between -1 and 1; during the update, a random probability controls whether the whales shrink the encirclement or move along the spiral, and the search range is enlarged to find other suitable prey, which strengthens the global search capability of WOA; finally, the current optimal solution and the global optimal solution are selected through iterative updating;

During model training, introducing as many features as possible lets the prediction model reflect, to some extent, the influence of complex external conditions on the load; however, features with low correlation contribute little to prediction performance, and too many feature set elements increase the input dimension of the model, so both model complexity and training time increase significantly. In machine learning, feature screening is mainly divided into wrapper methods and filter methods. A wrapper method feeds the features into the prediction model iteratively and selects features according to the final prediction result. A filter method evaluates the feature set in advance, ranks the features by the evaluation result, and filters out the low-ranked ones. Common filter methods, such as correlation analysis and variance selection, are widely applied in many fields because they are fast and computationally cheap;

Pearson correlation captures only linear relationships between data, and Spearman correlation additionally captures simple monotonic nonlinear relationships; by contrast, the maximal information coefficient (MIC) effectively measures the strength of linear or nonlinear correlation between two variables, and compensates for the nonlinear relationships between the attributes of weakly correlated variables while preserving the attribute information of strongly correlated variables;
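The difference between the first two coefficients can be illustrated with a small sketch. It does not compute the MIC itself, which needs a dedicated implementation; the test data are made up, and `spearman` is implemented here as Pearson correlation on ranks under the assumption of no tied values.

```python
import numpy as np

def pearson(x, y):
    """Pearson linear correlation coefficient."""
    return float(np.corrcoef(x, y)[0, 1])

def spearman(x, y):
    """Spearman rank correlation: Pearson on ranks (assumes no tied values)."""
    def rank(v):
        return np.argsort(np.argsort(v)).astype(float)
    return pearson(rank(x), rank(y))

x = np.linspace(0.0, 5.0, 50)
y_monotonic = np.exp(x)          # monotonic but strongly nonlinear
y_parabola = (x - 2.5) ** 2      # fully determined by x, but not monotonic

r_pearson = pearson(x, y_monotonic)       # noticeably below 1
r_spearman = spearman(x, y_monotonic)     # 1 up to rounding: order is preserved
r_parabola = pearson(x, y_parabola)       # near 0 although y depends on x
```

The parabola case is the kind of dependence a linear coefficient misses entirely, which is the motivation the text gives for using the MIC.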

The LSTM neural network replaces the simple neurons in the RNN hidden layer with long short-term memory units that retain long-term memory, which effectively solves the long-term dependence problem; a long short-term memory unit comprises 3 parts, namely a forget gate, an input gate, and an output gate, wherein the forget gate discards irrelevant information, the input gate determines the new information stored in the cell state, and the output gate controls the output of the hidden-layer node;

The LSTM neural network solves the long-term dependence problem of the RNN, but because it does not consider information from future time steps, it cannot learn the relationship between the forward and backward time directions of the load data. For this reason, the bidirectional concept is introduced to obtain information over the whole time domain. The BiLSTM neural network combines the LSTM neural network with the bidirectional RNN; it effectively solves the long-term dependence problem and trains the model with sequence information from both the past and the future directions simultaneously. Compared with the RNN and LSTM neural networks, the BiLSTM neural network has stronger nonlinear expression capability and handles uncertainty in the data better;
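The three gates can be sketched as a single forward step of an LSTM cell. This is the commonly used LSTM formulation, not a formula given in the text; the stacked weight layout, the gate order, and the random weights are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step. W: (4H, D), U: (4H, H), b: (4H,).

    Rows are stacked as forget gate, input gate, output gate, candidate state.
    """
    H = h_prev.size
    z = W @ x + U @ h_prev + b
    f = sigmoid(z[0:H])            # forget gate: how much old cell state to keep
    i = sigmoid(z[H:2 * H])        # input gate: how much new information to store
    o = sigmoid(z[2 * H:3 * H])    # output gate: how much of the cell to expose
    g = np.tanh(z[3 * H:4 * H])    # candidate cell state
    c = f * c_prev + i * g         # updated cell state
    h = o * np.tanh(c)             # hidden-layer output
    return h, c

rng = np.random.default_rng(0)
D, H = 3, 4                        # input and hidden sizes (arbitrary)
W = rng.normal(size=(4 * H, D))
U = rng.normal(size=(4 * H, H))
b = np.zeros(4 * H)

h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(5, D)):  # run over a 5-step input sequence
    h, c = lstm_step(x, h, c, W, U, b)
```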
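The bidirectional idea and the attention mechanism added on top of it can be sketched as below. To keep the example short, a plain tanh recurrent cell stands in for the LSTM cell, and the attention is a simple softmax-weighted pooling over time steps; both are simplifying assumptions, not the network structure of the invention, and all names are hypothetical.

```python
import numpy as np

def run_direction(seq, W, U):
    """Run a simple tanh recurrent cell over seq (T, D); returns states (T, H)."""
    h = np.zeros(U.shape[0])
    states = []
    for x in seq:
        h = np.tanh(W @ x + U @ h)
        states.append(h)
    return np.stack(states)

def bidirectional_states(seq, params_fwd, params_bwd):
    """Concatenate past-to-future and future-to-past states at every time step."""
    fwd = run_direction(seq, *params_fwd)
    bwd = run_direction(seq[::-1], *params_bwd)[::-1]  # re-align to original order
    return np.concatenate([fwd, bwd], axis=1)

def attention_pool(states, v):
    """Softmax attention over time steps; returns context vector and weights."""
    scores = states @ v
    weights = np.exp(scores - scores.max())            # numerically stable softmax
    weights /= weights.sum()
    return weights @ states, weights

rng = np.random.default_rng(0)
T, D, H = 6, 3, 4
seq = rng.normal(size=(T, D))

def make_params():
    return 0.5 * rng.normal(size=(H, D)), 0.5 * rng.normal(size=(H, H))

states = bidirectional_states(seq, make_params(), make_params())
context, weights = attention_pool(states, rng.normal(size=2 * H))
```

Each row of `states` combines information from both time directions, and `weights` shows how much each time step contributes to the pooled `context` vector.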

Step7, verifying the validity of the prediction model on the Australian load data set and the Electrician Cup load data set respectively.

In Step2, CEEMDAN is an a posteriori, adaptive time-frequency decomposition method suitable for smoothing non-stationary sequences; unlike wavelet decomposition, which requires a manually chosen wavelet basis, CEEMDAN adaptively decomposes a sequence into a finite number of IMFs of different time scales, denoted CIMF; by adding pairs of white noise with opposite signs to the original signal, it addresses both the mode aliasing of empirical mode decomposition and the residual white noise introduced by ensemble empirical mode decomposition; applying modal decomposition to the multivariate load sequence reduces the prediction difficulty; taking the original electrical load as an example, the basic steps of CEEMDAN decomposition are as follows:

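Step2.1, adding M pairs of white noise of opposite sign to the original electrical load sequence, i.e.

$$G_\tau^{+}(t) = G(t) + n_\tau^{+}(t), \qquad G_\tau^{-}(t) = G(t) + n_\tau^{-}(t)$$

wherein: $G(t)$ is the original electrical load sequence; $n_\tau^{+}(t)$ and $n_\tau^{-}(t)$ are the positive and negative white noise introduced at the τ-th time respectively; $G_\tau^{+}(t)$ and $G_\tau^{-}(t)$ are the new electrical load sequences after adding the positive and negative white noise respectively.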

Step2.2, performing empirical mode decomposition on the 2 new electrical load sequences to obtain 2 groups of CIMF components.

Step2.3, repeating Step2.1 and Step2.2 M times to obtain 2 groups of ensemble CIMF components, i.e.

$$F^{+} = \frac{1}{M} \sum_{\tau=1}^{M} F_\tau^{+}, \qquad F^{-} = \frac{1}{M} \sum_{\tau=1}^{M} F_\tau^{-}$$

wherein: $F^{+}$ and $F^{-}$ are the mean values of the CIMF component groups obtained by adding positive and negative white noise M times and decomposing; $F_\tau^{+}$ and $F_\tau^{-}$ are the component groups obtained by adding positive and negative white noise at the τ-th time and decomposing.

Step2.4, taking the average of $F^{+}$ and $F^{-}$ as the final decomposition result of the electrical load sequence. The basic formula of empirical mode decomposition is:

$$G(t) = \sum_{\eta=1}^{A} I_\eta(t) + R(t)$$

wherein: $I_\eta(t)$ is the η-th IMF component; $A$ is the number of CIMFs after decomposition; $R(t)$ is the residual component.
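The effect of adding noise in pairs of opposite sign can be illustrated with a toy sketch. A moving-average split stands in for empirical mode decomposition here, which is a loud simplification: `toy_decompose` is a hypothetical linear stand-in, so the paired noise cancels exactly, whereas real EMD is nonlinear and the cancellation is only approximate.

```python
import numpy as np

def toy_decompose(x, window=5):
    """Stand-in for EMD: split x into a fast part and a moving-average trend."""
    kernel = np.ones(window) / window
    trend = np.convolve(x, kernel, mode="same")
    return np.stack([x - trend, trend])          # the two rows sum back to x

def paired_noise_average(g, n_pairs=20, noise_std=0.1, seed=0):
    """Steps 2.1 to 2.4 with the toy decomposer in place of EMD."""
    rng = np.random.default_rng(seed)
    f_plus = np.zeros((2, g.size))
    f_minus = np.zeros((2, g.size))
    for _ in range(n_pairs):
        noise = noise_std * rng.normal(size=g.size)
        f_plus += toy_decompose(g + noise)       # positive-noise copy
        f_minus += toy_decompose(g - noise)      # negative-noise copy
    f_plus /= n_pairs                            # ensemble mean F+
    f_minus /= n_pairs                           # ensemble mean F-
    return 0.5 * (f_plus + f_minus)              # final decomposition result

t = np.linspace(0.0, 1.0, 96)
g = np.sin(2 * np.pi * 2 * t) + 0.3 * np.sin(2 * np.pi * 15 * t)
components = paired_noise_average(g)
```

Because the two noise copies have opposite signs, their contributions cancel in the final average and the components sum back to the original sequence.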

The embodiment of the invention evaluates the prediction result through the following errors:

1. MSE

The mean square error (MSE) is the expected value of the squared difference between the predicted value and the actual value.

2. RMSE

The root mean square error (RMSE) is the square root of the MSE and measures the deviation between the predicted value and the actual value.

3. MAE

The mean absolute error (MAE) is the average of the absolute errors between the predicted value and the actual value.

4. MAPE

The mean absolute percentage error (MAPE) is the average of the absolute differences between the predicted and actual values, each expressed as a proportion of the actual value.
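The four error measures can be written out as a short sketch using their usual definitions. The sample values are made up, and the MAPE is expressed in percent and assumes that no actual value is zero.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean square error."""
    return float(np.mean((y_true - y_pred) ** 2))

def rmse(y_true, y_pred):
    """Root mean square error."""
    return float(np.sqrt(mse(y_true, y_pred)))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

y_true = np.array([100.0, 200.0, 400.0])   # made-up actual loads
y_pred = np.array([110.0, 180.0, 400.0])   # made-up predictions

scores = {
    "MSE": mse(y_true, y_pred),
    "RMSE": rmse(y_true, y_pred),
    "MAE": mae(y_true, y_pred),
    "MAPE": mape(y_true, y_pred),
}
```

For these values the errors are -10, 20 and 0, so the MAE is 10 and the MAPE is the mean of 10%, 10% and 0%.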

Example 1:

Multidimensional load data of a region of Australia for the 7 days from January 25, 2006 to January 31, 2006 are used; the data are sampled every 30 min, giving 48 points per day. The data of the first 5 days are used as the training set and the data of the last 2 days as the test set; the test set is used only for the final model evaluation and does not participate in the training process.
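The split described above can be sketched as follows. The load values are placeholders, not the Australian data, and `chronological_split` is a hypothetical helper name.

```python
import numpy as np

POINTS_PER_DAY = 48      # one sample every 30 min
N_DAYS = 7
TRAIN_DAYS = 5

def chronological_split(series, train_days, points_per_day=POINTS_PER_DAY):
    """Split a series by whole days, keeping time order (no shuffling)."""
    cut = train_days * points_per_day
    return series[:cut], series[cut:]

load = np.arange(N_DAYS * POINTS_PER_DAY, dtype=float)   # placeholder series
train, test = chronological_split(load, TRAIN_DAYS)
```

Keeping the time order ensures that the test days lie strictly after the training days, so no future information leaks into training.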

The feature vector of the data set adopted by the invention contains 6 dimensions: load, electricity price, dew point temperature, dry bulb temperature, wet bulb temperature, and humidity; each of the 6 dimensions is sampled once every 30 min, and the characteristic parameters are shown in Table 1.

Table 1. Characteristic parameters

Parameter setting

Hyperparameters in neural networks are often difficult to determine, but they directly affect the final prediction result. The traditional way of choosing hyperparameter values is empirical; however, manual tuning tends to be time consuming and the results tend to be unsatisfactory. On the basis of feature screening, the invention uses the whale optimization method to optimize the hyperparameters of the BiLSTM-At network model.

The search range for the number of neurons of the BiLSTM-At network is (1, 30) and the search range for the number of training iterations is (100, 200); the population size is set to 10 and the maximum number of optimization iterations is 100; the optimal parameter combination found is (20, 135).

The key parameters of VMD are the number of decomposition modes K and the penalty factor; they are optimized with a particle swarm algorithm, with a search range of (1, 50) for the number of modes and (1, 20000) for the penalty factor, a swarm size of 10, and a maximum of 100 iterations; the optimal parameter combination found is (6, 11602); the optimization results are shown in the following table:

Table 2. Hyperparameter optimization results

The table shows that after particle swarm hyperparameter optimization the model performance improves by 0.14%, which verifies the effectiveness of hyperparameter optimization.

Parameter optimization-based quadratic modal decomposition

The short-term load sequence is strongly nonlinear, and modal decomposition can weaken the non-stationarity of the sequence and reduce the prediction difficulty; therefore, in the first stage the invention uses CEEMDAN, which has higher decomposition efficiency, to decompose the original sequence. To avoid mode aliasing while decomposing the sequence fully, the high-frequency component is further decomposed by VMD, and the number of VMD modes and the penalty factor are optimized by WOA with an error index as the objective function. In this way 8 and 6 modal components of different frequency scales are obtained, as shown in FIG. 2 and FIG. 3.

Feature selection based on pearson correlation coefficients

Too many or redundant feature factors can reduce the accuracy of load prediction; the invention therefore first screens the characteristic parameters using the Pearson correlation coefficient, filtering out the weakly correlated features and taking the strongly correlated features as input, which improves the final prediction accuracy of the model; the Pearson correlation coefficient analysis results and the feature screening results are shown in the following tables:

Table 3. Pearson correlation coefficient analysis results

Influencing feature	Pearson coefficient
Dew point temperature	0.32
Dry bulb temperature	0.75
Wet bulb temperature	0.72
Humidity	0.18
Electricity price	0.81

Table 4. Feature screening results

Feature ranking	Feature type
1	Electricity price
2	Dry bulb temperature
3	Wet bulb temperature
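The screening from Table 3 to Table 4 can be reproduced with a short sketch. The coefficients are those of Table 3; the cutoff of 0.5 is an assumption chosen so that the result matches Table 4, since the text does not state the threshold used, and `screen_features` is a hypothetical name.

```python
# Pearson coefficients between each feature and the load, taken from Table 3
pearson_scores = {
    "Dew point temperature": 0.32,
    "Dry bulb temperature": 0.75,
    "Wet bulb temperature": 0.72,
    "Humidity": 0.18,
    "Electricity price": 0.81,
}

def screen_features(scores, threshold=0.5):
    """Keep features with |r| >= threshold, ranked from strongest to weakest.

    The threshold value is an assumption; the source does not specify it.
    """
    kept = [(name, r) for name, r in scores.items() if abs(r) >= threshold]
    kept.sort(key=lambda pair: abs(pair[1]), reverse=True)
    return [name for name, _ in kept]

selected = screen_features(pearson_scores)
```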

As can be seen from Table 2, among the 5 neural network models, the BP neural network is easily trapped in a local optimum, so its prediction is unstable and its error is large; the LSTM neural network has a more complex network structure, its predictions fluctuate less and are more accurate, and the MAPE of the model is reduced to 3.45%; the BiLSTM neural network has the highest prediction accuracy and very stable predictions, with a MAPE of only 2.50%; compared with the BP neural network and the LSTM neural network, the prediction accuracy of the BiLSTM neural network is improved by 32.15%, 27.06% and 5.54% respectively.

Notably, the MAPE of the BiLSTM neural network is better than that of WOA-BiLSTM-At, but its R² is worse, because the BiLSTM neural network has a smaller average error but larger single-point errors; the optimized bidirectional long short-term memory neural network gives more accurate and stable predictions. The results are shown in FIG. 4.

Example 2:

Multidimensional load data from January 1, 2012 to January 10, 2015 from the Electrician Cup data set are used; the data are sampled every 30 min, giving 48 points per day. The first 80% of the days are used as the training set and the last 20% as the test set; the test set is used only for the final model evaluation and does not participate in the training process.

The feature vector of the data set contains 6 dimensions in total: load, maximum temperature, minimum temperature, average temperature, rainfall, and relative humidity.

Table 6. Comparison of prediction model results

Model	MAPE	RMSE	MAE	R²
BP	1.1414	0.2795	0.2212	0.8776
LSTM	0.8297	0.2099	0.1345	0.9053
BiLSTM	1.0517	0.2557	0.1751	0.9628
BiLSTM-At	0.8843	0.2758	0.2267	0.9912
WOA-BiLSTM-At	1.2290	0.2858	0.2142	0.9537

As shown in FIG. 5, the method provided by the invention predicts the full day of the data set. In the line graph, the purple line is the predicted load curve and the red line is the actual load curve of the day. The red line shows that the actual load fluctuates sharply, which makes prediction difficult; the purple line shows that the method follows the trend of the load fluctuation well and predicts more accurately, which is of high economic value for a power grid system. Compared with other methods, the load curve fitted by the method is closer to the true values, and its single-point accuracy is higher.

In actual prediction the load sequence fluctuates strongly, and the improved secondary modal decomposition adaptively decomposes the complex load sequence into a suitable number of simple modal components; in addition, the bidirectional processing capability of the BiLSTM neural network, the attention mechanism, and the whale optimization algorithm further improve the prediction accuracy. In summary, the practical application to the Electrician Cup data again demonstrates the superiority of the method.

Claims

1. A short-term power load prediction method based on improved secondary modal decomposition and WOA-optimized BiLSTM-Attention, comprising the steps of:

step7, verifying the validity of the prediction model.

2. The short-term power load prediction method based on improved secondary modal decomposition and WOA-optimized BiLSTM-Attention of claim 1, wherein in Step3, the variational mode decomposition process is as follows:

$$X(t+1) = D^{i}\, e^{bl} \cos(2\pi l) + X^{*}(t) \tag{6}$$

$$D^{i} = \left| X^{*}(t) - X(t) \right| \tag{7}$$

3. The short-term power load prediction method based on improved secondary modal decomposition and WOA-optimized BiLSTM-Attention of claim 2, wherein in Step1, missing values in the input data set are filled by interpolation, and the data set is then normalized by linear function normalization.

4. The short-term power load prediction method based on improved secondary modal decomposition and WOA-optimized BiLSTM-Attention of claim 3, wherein the prediction result obtained in Step6 is evaluated with the loss function constructed in Step5, and the parameters of the prediction network obtained in Step5 are adjusted based on the evaluation until the prediction network that minimizes the loss function is obtained and used to produce the prediction result.

5. The short-term power load prediction method based on improved secondary modal decomposition and WOA-optimized BiLSTM-Attention of claim 4, wherein said loss function includes mean square error, root mean square error, mean absolute error, and mean absolute percentage error.