CN113515889A

CN113515889A - Dynamic wind speed prediction model establishing method

Info

Publication number: CN113515889A
Application number: CN202110557310.3A
Authority: CN
Inventors: 李永刚; 王月; 吴滨源
Original assignee: North China Electric Power University
Current assignee: North China Electric Power University
Priority date: 2021-05-21
Filing date: 2021-05-21
Publication date: 2021-10-19
Anticipated expiration: 2041-05-21
Also published as: CN113515889B

Abstract

The invention discloses a method for establishing a dynamic wind speed prediction model, comprising the following steps: acquiring measured wind speed data of a target area and preprocessing the measured wind speed data; training and predicting on the preprocessed measured wind speed data with multiple prediction algorithms to obtain multiple wind speed prediction models, which form a Q learning model set; adding wind speed fluctuation conditions and attribute factors to the Q learning model set, selecting the optimal wind speed prediction model for each time period through a Q reinforcement learning algorithm to obtain preliminary wind speed prediction data, and calculating the wind speed prediction error; and constructing an error Q learning model library based on the wind speed prediction error, and selecting the optimal wind speed prediction error model from the error Q learning model library through a Q reinforcement learning algorithm to correct the preliminary wind speed prediction values and obtain the final wind speed prediction data. The method applies the Q reinforcement learning algorithm twice to construct the dynamic wind speed prediction model, and is characterized by strong generalization capability, good robustness and high prediction accuracy.

Description

Dynamic wind speed prediction model establishing method

Technical Field

The invention relates to the technical field of wind speed prediction, in particular to a dynamic wind speed prediction model establishing method.

Background

In recent years the energy mix has been shifting steadily toward low-carbon sources, and the grid penetration of renewable energy, represented by wind power, has increased year by year. With large-scale grid connection of renewable energy represented by wind power generation, the economy of power grid dispatching has gradually improved; however, wind speed is volatile, intermittent and of low energy density, which seriously reduces the operating reliability of the power system. Therefore, to make better use of wind power generation while maintaining power system stability, accurate short-term wind speed prediction is required. At present, however, there is no clear guidance on how to establish a prediction model or how to optimize it to improve prediction accuracy and generalization capability.

Disclosure of Invention

The invention aims to provide a dynamic wind speed prediction model establishing method, which adopts a Q reinforcement learning algorithm twice to establish a dynamic wind speed prediction model and has the characteristics of strong generalization capability, good robustness and high prediction precision.

In order to achieve the purpose, the invention provides the following scheme:

A dynamic wind speed prediction model building method comprises the following steps:

S1) acquiring measured wind speed data of a target area, and preprocessing the measured wind speed data;

S2) dividing the preprocessed measured wind speed data into a wind speed training set, a wind speed test set and a wind speed verification set, training on the wind speed training set with multiple prediction algorithms and predicting on the wind speed test set to obtain multiple wind speed prediction models, which form a Q learning model set;

S3) adding wind speed fluctuation conditions and attribute factors to the Q learning model set, selecting the optimal wind speed prediction model for each time period through a Q reinforcement learning algorithm to obtain preliminary wind speed prediction data, and calculating the wind speed prediction error from the preliminary wind speed prediction data and the corresponding measured wind speed data;

S4) constructing an error Q learning model library based on the wind speed prediction error, and selecting the optimal wind speed prediction error model from the error Q learning model library through a Q reinforcement learning algorithm to correct the preliminary wind speed prediction values and obtain the final wind speed prediction data.

Optionally, the preprocessing of the measured wind speed data in step S1) is to replace missing and abnormal values in the measured wind speed data with an adjacent data complementation method.

Optionally, the method further includes, after step S4):

S5) verifying the validity of the optimal wind speed prediction error model by using the wind speed verification set.

Optionally, to verify the validity of the optimal wind speed prediction error model, three evaluation indexes, namely the mean square error ε₁, the relative error ε₂ and the coefficient of determination R², are selected to evaluate the final wind speed prediction data, and the calculation formulas are respectively as follows:

wherein: x_t, y_t, x̄ and ȳ are respectively the measured wind speed value at time t, the final predicted wind speed value at time t, the average measured wind speed value and the average final predicted wind speed value.

Optionally, the multiple prediction algorithms adopted in the Q learning model set in step S2) are 5, including learning algorithms of LSTM, XGBoost, SVR, BP neural network, and KRR.

Optionally, the error Q learning model library in step S4) includes 5 prediction algorithms, including learning algorithms of SVR, BP neural network, GKRR, PKRR, and MHKRR.

Optionally, the calculation formula of the wind speed prediction error in step S3) and step S4) is as follows:

e＝x-ŷ

in the formula: e is the wind speed prediction error, x is the measured wind speed value, and ŷ is the preliminary wind speed predicted value.

Optionally, the final wind speed prediction data in step S4) is calculated as follows:

y＝ŷ+ê

in the formula: y is the final wind speed predicted value, ŷ is the preliminary wind speed predicted value, and ê is the predicted wind speed error used for correction.

Optionally, in step S3) and in step S4), the Q reinforcement learning algorithm uses a hybrid reward function combining error and model rank, and the calculation formula of the reward function is as follows:

R_t(s_I,a_J)＝α[RANK(M_I,t)-RANK(M_J,t+1)]+β[TIME(M_I,t)-TIME(M_J,t+1)]

in the formula: R(s, a) is the reward function; S is the state space, S＝{s₁,…,s_I,…,s_N}, where s_I is the current wind speed prediction model and N is the number of wind speed prediction models; A is the action space, A＝{a₁,…,a_J,…,a_N}, where a_J is the action of switching from the current wind speed prediction model to the next wind speed prediction model at the next prediction time step; RANK(M_I,t) and RANK(M_J,t+1) are the ranks of wind speed prediction model M_I at time t and of wind speed prediction model M_J at time t+1, respectively; TIME(M_I,t) and TIME(M_J,t+1) are the corresponding computation times of the two models; α and β are weight coefficients satisfying α+β＝1.

According to the specific embodiments provided by the invention, the invention achieves the following technical effects. The dynamic wind speed prediction model establishing method provided by the invention applies the Q reinforcement learning algorithm twice to establish the dynamic wind speed prediction model: one Q learning agent selects the optimal wind speed prediction model to produce the preliminary wind speed prediction, and the other Q learning agent, working on the calculated prediction errors fed into the error correction part, selects the optimal wind speed prediction error model, so that an optimal prediction strategy is obtained. Q learning effectively selects the optimal prediction model in both the wind speed prediction part and the error correction part. The error correction of the invention reduces the average relative prediction error by 50%, and the error correction link remains effective even for mature prediction models. Typical months in different seasons are predicted with the constructed wind speed prediction model, and the results show that the method has strong generalization capability, good robustness and high prediction accuracy; it solves the problem of reduced power system operating reliability caused by the volatility, intermittency and low energy density of wind, and can significantly improve the dispatching economy of power grids with grid-connected renewable energy and the operating safety of wind farms.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings without inventive exercise.

FIG. 1 is a flow chart of a dynamic wind speed prediction model building method according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a dynamic wind speed prediction model building method according to an embodiment of the present invention;

FIG. 3 is a diagram of the convergence of reward functions in accordance with an embodiment of the present invention;

FIG. 4 is a 2019 wind speed fluctuation diagram of an actual wind field in the northeast of China according to an embodiment of the present invention;

FIG. 5a is a diagram of typical spring month (month 3) QWSP, LSTM, BP neural network wind speed prediction data in accordance with an embodiment of the present invention;

FIG. 5b is a diagram of wind speed prediction data for QWSP, LSTM, BP neural networks in a typical summer month (month 6) according to an embodiment of the present invention;

FIG. 5c is a diagram of QWSP, LSTM and BP neural network wind speed prediction data in a typical autumn month (month 9) according to an embodiment of the present invention;

FIG. 5d is a diagram of QWSP, LSTM and BP neural network wind speed prediction data in a typical winter month (month 12) according to an embodiment of the present invention;

FIG. 6a is a diagram of DPDQ wind speed prediction data in a typical month (month 3) of spring according to an embodiment of the present invention;

FIG. 6b is a diagram of DPDQ wind speed prediction data in a typical summer month (month 6) according to an embodiment of the present invention;

FIG. 6c is a diagram of DPDQ wind speed prediction data in a typical autumn month (month 9) according to an embodiment of the present invention;

FIG. 6d is a diagram of DPDQ wind speed prediction data in a typical winter month (month 12) according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.

As shown in fig. 1 to fig. 2, the method for establishing a dynamic wind speed prediction model according to an embodiment of the present invention includes the following steps:

S1) acquiring measured wind speed data of a target area, and preprocessing the measured wind speed data;

S2) dividing the preprocessed measured wind speed data into a wind speed training set, a wind speed test set and a wind speed verification set, training on the wind speed training set with multiple prediction algorithms and predicting on the wind speed test set to obtain multiple wind speed prediction models, which form a Q learning model set;

S3) adding wind speed fluctuation conditions and attribute factors (wind speed, temperature, humidity, wind direction and turbulence speed) to the Q learning model set, selecting the optimal wind speed prediction model for each time period through a Q reinforcement learning algorithm to obtain preliminary wind speed prediction data, and calculating the wind speed prediction error from the preliminary wind speed prediction data and the corresponding measured wind speed data;

S4) constructing an error Q learning model library based on the wind speed prediction error, and selecting the optimal wind speed prediction error model from the error Q learning model library through a Q reinforcement learning algorithm to correct the preliminary wind speed prediction values and obtain the final wind speed prediction data.

In step S2), the Q learning model set adopts 5 prediction algorithms. Given the high variability of the wind speed sequence, deep learning can mine deep features from the measured wind speed data and predicts the wind speed trend better when the wind speed fluctuates violently, although it may overfit in the details. SVR and BP neural networks often achieve higher prediction accuracy in periods when the wind speed fluctuates slowly, but produce large prediction errors when the wind speed fluctuates violently. Therefore five algorithms, namely the deep learning algorithm LSTM, the ensemble learning algorithm XGBoost, and the shallow learning algorithms SVR, BP neural network and KRR, are selected as the base models of the wind speed prediction model set, so that Q learning can select the more suitable prediction model from the Q learning model set for different fluctuation conditions. Here KRR is PKRR, which is based on a polynomial kernel function.

The principle of the Q reinforcement learning algorithm is briefly described as follows:

To train the Q learning agent, a mathematical framework for reinforcement-learning-based dynamic model selection is first defined as a Markov decision process. A Q learning agent typically takes sequential actions through a series of states according to a state-action value matrix (Q matrix) until the final goal is reached, and obtains a reward by evaluating the prediction effect of the current state in order to update the Q matrix. The state space S consists of the current prediction models:

S＝{s₁,…,s_I,…,s_N}

in the formula: s_I represents the current wind speed prediction model, and N is the number of wind speed prediction models. Similarly, the action space A consists of the wind speed prediction models of the next step:

A＝{a₁,…,a_J,…,a_N}

wherein a_J represents the action of switching from the current wind speed prediction model to the next wind speed prediction model at the next prediction time step. To solve the Markov decision process successfully with Q learning, the central task is to derive the reward matrix R through an appropriate reward function R(s, a). An embodiment of the invention defines the hybrid error and model rank reward function as follows:

R_t(s_I,a_J)＝α[RANK(M_I,t)-RANK(M_J,t+1)]+β[TIME(M_I,t)-TIME(M_J,t+1)]

in the formula: RANK(M_I,t) and RANK(M_J,t+1) are the ranks of wind speed prediction model M_I at time t and of wind speed prediction model M_J at time t+1, respectively; TIME(M_I,t) and TIME(M_J,t+1) are the corresponding computation times of the two models; α and β are weight coefficients satisfying α+β＝1. Because Q learning is a model-free dynamic model selection framework, it often keeps selecting the best-ranked model for prediction; when both models are ranked 1, the rank term is 0 and the reward and penalty effect is lost. The rank term and the time term are therefore weighted together, which makes the reward function more general. After the state space, action space and reward function are defined, the Q learning dynamic prediction model is trained on the Q learning training data set T_t.
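The hybrid reward can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the helper names `ranks_from_errors`, `reward` and `reward_matrix` are hypothetical, rank 1 is assumed to denote the model with the smallest error, and the default weights α = 0.9 and β = 0.1 are the values quoted for the embodiment.

```python
from typing import List, Sequence


def ranks_from_errors(errors: Sequence[float]) -> List[int]:
    """Rank models by error; rank 1 is assumed to be the smallest error."""
    order = sorted(range(len(errors)), key=lambda i: errors[i])
    ranks = [0] * len(errors)
    for position, model_index in enumerate(order, start=1):
        ranks[model_index] = position
    return ranks


def reward(rank_i_t: float, rank_j_t1: float,
           time_i_t: float, time_j_t1: float,
           alpha: float = 0.9, beta: float = 0.1) -> float:
    """R_t(s_I, a_J) = alpha*[RANK(M_I,t) - RANK(M_J,t+1)]
                     + beta*[TIME(M_I,t) - TIME(M_J,t+1)]."""
    if abs(alpha + beta - 1.0) > 1e-9:
        raise ValueError("alpha + beta must equal 1")
    return alpha * (rank_i_t - rank_j_t1) + beta * (time_i_t - time_j_t1)


def reward_matrix(ranks_t: Sequence[float], ranks_t1: Sequence[float],
                  times_t: Sequence[float], times_t1: Sequence[float],
                  alpha: float = 0.9, beta: float = 0.1) -> List[List[float]]:
    """Entry [i][j] is the reward for switching from model i (time t)
    to model j (time t+1)."""
    n = len(ranks_t)
    return [[reward(ranks_t[i], ranks_t1[j], times_t[i], times_t1[j],
                    alpha, beta)
             for j in range(n)]
            for i in range(n)]
```

With these definitions, moving from a rank-3 model to a rank-1 model that is also faster gives a positive reward, whereas staying on a rank-1 model with unchanged computation time gives a reward of 0, which is the degenerate case the time term is meant to soften.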

The Q learning agent uses a decaying ε-greedy strategy: it takes completely random actions at the beginning and reduces the randomness through decay during learning. After N_e training episodes, the Q learning algorithm eventually converges to the optimal policy Q*, which is used to find the optimal action a* in the Q learning process. The specific steps are as follows:

(1) defining the model step length k, the prediction scale n, the model library size N_M, the Q learning data set T_t, the dynamic prediction model data set T_c, the learning rate κ that controls learning aggressiveness, the discount factor γ that weighs future returns, and the number of training episodes N_e, so as to ensure that the optimal model is selected from the N models on T_c;

(2) initializing Q(s, a), setting ω＝1, and starting training; choosing a random action a with probability ω, and otherwise selecting the action with the largest Q value, a＝argmax_a Q(s,a);

(3) calculating and updating the reward matrix R according to the reward function formula;

(4) updating Q(s, a) by:

Q(s,a)←(1-κ)Q(s,a)+κ[R(s,a)+γ max_a′ Q(s′,a′)]

where s′ is the state reached after taking action a;

(5) repeating (2)-(4) k times to find the optimal action a* each time.
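The training steps above can be sketched in Python as a tabular Q learning loop. This is a minimal sketch rather than the patented implementation. It assumes the standard Q learning update, assumes that the next state is simply the model chosen by the action, and assumes a multiplicative decay of the exploration probability ω. The names `train_q`, `best_action`, `reward_mats` and `decay`, and the default episode count, are hypothetical.

```python
import random
from typing import List

Matrix = List[List[float]]


def best_action(Q: Matrix, s: int) -> int:
    """Greedy action: index of the largest Q value in row s."""
    return Q[s].index(max(Q[s]))


def train_q(reward_mats: List[Matrix], n_models: int, kappa: float = 0.1,
            gamma: float = 0.8, n_episodes: int = 500, decay: float = 0.99,
            seed: int = 0) -> Matrix:
    """Tabular Q learning over model-switching decisions.

    reward_mats[t][i][j] is the reward for switching from model i to
    model j at decision step t (hypothetical input layout).
    """
    rng = random.Random(seed)
    Q = [[0.0] * n_models for _ in range(n_models)]
    omega = 1.0  # exploration probability, starts fully random
    for _ in range(n_episodes):
        s = rng.randrange(n_models)
        for R in reward_mats:
            if rng.random() < omega:
                a = rng.randrange(n_models)   # explore
            else:
                a = best_action(Q, s)         # exploit
            s_next = a  # assumption: the chosen model becomes the state
            target = R[s][a] + gamma * max(Q[s_next])
            Q[s][a] += kappa * (target - Q[s][a])
            s = s_next
        omega *= decay  # assumed decay schedule for the randomness
    return Q
```

For example, with three models and a reward matrix that pays +1 for switching to model 1 and -1 otherwise, the learned greedy action should be model 1 from every state.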

The calculation formula of the wind speed prediction error in step S3) and step S4) is as follows:

e＝x-ŷ

in the formula: e is the wind speed prediction error, x is the measured wind speed value, and ŷ is the preliminary wind speed predicted value.

The error Q learning model library adopted in step S4) contains 5 prediction algorithms. For the selection of the error correction model set (i.e. the error Q learning model library formed by the wind speed prediction error models), the volatility and variability of the prediction error are far smaller than those of the measured wind speed sequence, and the error sequence mainly needs to be predicted in detail; the embodiment of the invention therefore selects five more efficient models, namely SVR, BP neural network, GKRR, PKRR and MHKRR, to form the error correction model set, wherein the GKRR, PKRR and MHKRR models use different kernel functions.
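The arithmetic of the error correction stage can be sketched as follows. The sketch assumes the sign convention that the error is the measured value minus the preliminary prediction, so that the correction is additive; the function names `prediction_errors` and `corrected_forecast` are hypothetical, and the predicted errors would in practice come from whichever error model the second Q learning agent selects.

```python
from typing import List, Sequence


def prediction_errors(measured: Sequence[float],
                      preliminary: Sequence[float]) -> List[float]:
    """Error series (assumed convention: measured minus preliminary)."""
    if len(measured) != len(preliminary):
        raise ValueError("series must have the same length")
    return [x - y_hat for x, y_hat in zip(measured, preliminary)]


def corrected_forecast(preliminary: Sequence[float],
                       predicted_errors: Sequence[float]) -> List[float]:
    """Final forecast: preliminary prediction plus predicted error."""
    if len(preliminary) != len(predicted_errors):
        raise ValueError("series must have the same length")
    return [y_hat + e_hat for y_hat, e_hat in zip(preliminary, predicted_errors)]
```

If the error model predicted the error series exactly, the corrected forecast would reproduce the measured values; in practice the correction only removes the part of the error that the selected error model can predict.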

The method further comprises, after step S4): S5) verifying the validity of the optimal wind speed prediction error model by using the wind speed verification set. To verify the validity of the optimal wind speed prediction error model, three evaluation indexes, namely the mean square error ε₁, the relative error ε₂ and the coefficient of determination R², are selected to evaluate the final wind speed prediction data, wherein the optimal expected value of ε₁ and ε₂ is 0 and the optimal expected value of R² is 1; the calculation formulas are respectively as follows:

wherein: x_t, y_t, x̄ and ȳ are respectively the measured wind speed value at time t, the final predicted wind speed value at time t, the average measured wind speed value and the average final predicted wind speed value.
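The three evaluation indexes can be sketched in Python as follows. Since the exact formulas are not shown here, the sketch assumes common definitions: the mean of squared differences for ε₁, the mean of absolute differences divided by the measured value for ε₂, and the squared correlation between measured and predicted values for R² (chosen because both averages appear in the symbol list). The function names are hypothetical.

```python
from typing import Sequence


def _check(measured: Sequence[float], predicted: Sequence[float]) -> int:
    if len(measured) != len(predicted) or not measured:
        raise ValueError("series must be non-empty and of equal length")
    return len(measured)


def mean_square_error(measured: Sequence[float],
                      predicted: Sequence[float]) -> float:
    n = _check(measured, predicted)
    return sum((x - y) ** 2 for x, y in zip(measured, predicted)) / n


def mean_relative_error(measured: Sequence[float],
                        predicted: Sequence[float]) -> float:
    # Assumes no measured value is zero (calm periods would need handling).
    n = _check(measured, predicted)
    return sum(abs(x - y) / abs(x) for x, y in zip(measured, predicted)) / n


def r_squared(measured: Sequence[float],
              predicted: Sequence[float]) -> float:
    """Squared correlation between the two series (assumed definition)."""
    n = _check(measured, predicted)
    x_bar = sum(measured) / n
    y_bar = sum(predicted) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(measured, predicted))
    var_x = sum((x - x_bar) ** 2 for x in measured)
    var_y = sum((y - y_bar) ** 2 for y in predicted)
    if var_x == 0 or var_y == 0:
        raise ValueError("R^2 is undefined for a constant series")
    return cov * cov / (var_x * var_y)
```

Under these assumed definitions a perfect forecast gives ε₁ = ε₂ = 0 and R² = 1, matching the optimal expected values stated above.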

Wind speed and related attribute data for 2019 from an actual wind farm in northeast China are selected for study, and the short-term wind speed in the typical month of each quarter (March, June, September and December) is predicted. The measured wind speed data are preprocessed by replacing missing and abnormal values with the adjacent data complementation method. The sampling interval of the measured wind speed data is 10 min, giving 4320 points per month; the data of the first 20 days of each month are taken as the wind speed training set, the data of days 21-25 as the wind speed test set, and the data of days 26-30 as the wind speed verification set (used to check whether the model parameter settings are appropriate).
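The preprocessing and data split described above can be sketched in Python. This is an illustration only: the text does not specify how the adjacent data complementation method picks its replacement, so the sketch assumes that a missing or out-of-range sample is replaced by the nearest valid neighbor. The function names `fill_from_neighbors` and `split_month` and the validity bounds `lo` and `hi` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

POINTS_PER_DAY = 144  # 10-minute sampling: 6 points per hour * 24 hours


def fill_from_neighbors(series: Sequence[Optional[float]],
                        lo: float = 0.0, hi: float = 60.0) -> List[float]:
    """Replace missing (None) or out-of-range values with the nearest
    valid neighbor, preferring the earlier sample on a tie.
    lo and hi are hypothetical bounds for "abnormal" wind speeds in m/s."""
    def valid(v: Optional[float]) -> bool:
        return v is not None and lo <= v <= hi

    n = len(series)
    out: List[float] = []
    for i, v in enumerate(series):
        if valid(v):
            out.append(float(v))
            continue
        for d in range(1, n):
            left, right = i - d, i + d
            if left >= 0 and valid(series[left]):
                out.append(float(series[left]))
                break
            if right < n and valid(series[right]):
                out.append(float(series[right]))
                break
        else:
            raise ValueError("series contains no valid samples")
    return out


def split_month(series: Sequence[float]) -> Tuple[list, list, list]:
    """Days 1-20 for training, days 21-25 for testing, days 26-30 for
    verification, at 144 points per day."""
    train_end = 20 * POINTS_PER_DAY
    test_end = 25 * POINTS_PER_DAY
    verify_end = 30 * POINTS_PER_DAY
    return (list(series[:train_end]),
            list(series[train_end:test_end]),
            list(series[test_end:verify_end]))
```

For a 30-day month of 4320 points this yields 2880 training points, 720 test points and 720 verification points.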

The model hyper-parameters are set as follows:

The specific parameters of the Q learning framework of the embodiment of the present invention are set to κ＝0.1 and γ＝0.8; N_e is chosen to ensure the learning speed of dynamic model selection; and, taking full account of the future rewards and penalties of the reward function, α＝0.9 and β＝0.1 are selected. According to the actual operating conditions of the wind farm, day-ahead prediction with a step length of 6 is selected (k＝6, n＝144); that is, based on the training results on the wind speed training set data, the optimal strategy is used to make a model selection decision for the next k steps. The hyper-parameter settings of the base models are shown in Table 1.
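One way to read the k = 6, n = 144 setting is that a 144-point day-ahead horizon is covered by 24 consecutive blocks of 6 points, with one model selection decision per block. The Python sketch below illustrates that reading under stated assumptions: the names `decision_blocks` and `rollout` are hypothetical, and the learned Q matrix is assumed to be indexed as Q[current model][next model].

```python
from typing import List, Tuple


def decision_blocks(n: int = 144, k: int = 6) -> List[Tuple[int, int]]:
    """Half-open index ranges [start, end) covering an n-point horizon
    in blocks of k points."""
    return [(start, min(start + k, n)) for start in range(0, n, k)]


def rollout(Q: List[List[float]], start_model: int, n_blocks: int) -> List[int]:
    """Greedy model schedule: for each block, pick the action with the
    largest Q value in the row of the current model, then move to it."""
    schedule: List[int] = []
    s = start_model
    for _ in range(n_blocks):
        a = Q[s].index(max(Q[s]))
        schedule.append(a)
        s = a
    return schedule
```

Calling `rollout(Q, start_model, len(decision_blocks()))` would then give one selected model index per one-hour block of the day-ahead horizon.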

TABLE 1 hyper-parameter settings for different algorithms

Regarding the setting of the Q learning reward function, the adaptive error function commonly used in artificial intelligence algorithms was tried first, but during training it was found that Q learning with the adaptive error function as the reward function failed to converge. The reason is that the magnitude of a prediction evaluation index depends not only on the prediction model but also changes with time, so an action that switches from a poor model to the best model may still receive a negative reward (because the prediction evaluation index has declined). Meanwhile, whether a prediction model is mature depends not only on its prediction accuracy but also on the time cost it incurs. Another reward function is therefore proposed to evaluate the model, which jointly considers the improvement in model ranking and the prediction time of the model. The training results of the two methods are shown in fig. 3; it can be seen that the proposed reward function converges successfully and effectively avoids the time-series effect.

The wind speed fluctuation of the northeast wind farm over the whole of 2019 is shown in fig. 4. It can be seen that the wind energy density of the wind farm is high; the wind speed fluctuates violently in spring and winter, the wind speed differs greatly between moments, and the highest wind speed exceeds 25 m/s. In summer and autumn the wind speed is mostly below 10 m/s, the fluctuation is gentle, and the wind energy density is clearly lower than in spring and winter.

To check and illustrate the effectiveness of dynamic model selection based on Q learning, two single prediction models based on different prediction principles, LSTM and the BP neural network, are selected for simulation comparison with the wind speed prediction part (QWSP) of the dynamic prediction based on double Q learning (DPDQ). Day-ahead prediction with a sliding step length of 6 is performed on the typical-month wind speeds of each quarter, and the specific prediction results are shown in fig. 5a to 5d.

As can be seen from fig. 5a to 5d, QWSP performs well under the different wind speed fluctuations of different seasons, and its overall prediction effect is better than that of a single prediction model. In detail, the wind speed fluctuation in summer and autumn is relatively gentle and the wind speed relatively low, while the wind speed in spring and winter is relatively high. The enlarged views of section 121-126 in fig. 5a and section 115-120 in fig. 5b show that each model achieves a good prediction effect there. In section 61-66 of fig. 5c, the BP neural network and LSTM, lacking dynamic selection, cannot cope with all wind speed changes, and their predictions deviate considerably from the actual values. In fig. 5d the detailed prediction result is not ideal: the enlarged view on the right shows that the prediction deviation of the QWSP model is greater than that of the BP neural network, mainly because a wrong model selection by Q learning causes a larger prediction deviation. Under the reward function mechanism, the selected model is ranked low and the reward is therefore negative, so the choice is corrected at the next model selection and a better prediction result is obtained. When the dynamic selection of Q learning is correct, the prediction accuracy is generally high and the model is more robust; when a single prediction model overfits, large prediction errors occur, and thanks to the reward function mechanism the model can be corrected promptly in the next time period. The model selection strategy based on Q learning can therefore improve the overall performance of the wind speed prediction model. The typical-month wind speed prediction errors for each quarter are shown in Table 2.

TABLE 2 short term prediction error for different methods

As can be seen from the data in Table 2, the wind speed fluctuation is smallest in summer, and R² of every model reaches about 0.9; the wind speed fluctuation is large in winter, which makes prediction more difficult for the models and increases the prediction error ε₁. The model provided by the embodiment of the invention, which is based on reinforcement-learning dynamic model selection, has the R² closest to 1 in each quarter, and its ε₁ error is also the smallest of the three models.

To verify the effectiveness of the error correction link in the embodiment of the present invention, DPDQ is used for day-ahead prediction of the wind speed in each season, and the results are shown in fig. 6a to 6d. The DPDQ model achieves good predictions in every season, but for some extreme wind conditions with very large wind speed differences, for example the highest wind speed point in fig. 6c, a certain prediction error remains, which is unavoidable.

The dynamic wind speed prediction model establishing method provided by the invention applies the Q reinforcement learning algorithm twice to establish the dynamic wind speed prediction model: one Q learning agent selects the optimal wind speed prediction model to produce the preliminary wind speed prediction, and the other Q learning agent, working on the calculated prediction errors fed into the error correction part, selects the optimal wind speed prediction error model, so that an optimal prediction strategy is obtained. Q learning effectively selects the optimal prediction model in both the wind speed prediction part and the error correction part. The error correction of the invention reduces the average relative prediction error by 50%, and the error correction link remains effective even for mature prediction models. Typical months in different seasons are predicted with the constructed wind speed prediction model, and the results show that the method has strong generalization capability, good robustness and high prediction accuracy; it solves the problem of reduced power system operating reliability caused by the volatility, intermittency and low energy density of wind, and can significantly improve the dispatching economy of power grids with grid-connected renewable energy and the operating safety of wind farms.

The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims

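1. A dynamic wind speed prediction model building method, characterized by comprising the following steps:

S1) acquiring measured wind speed data of a target area, and preprocessing the measured wind speed data;

S2) dividing the preprocessed measured wind speed data into a wind speed training set, a wind speed test set and a wind speed verification set, training on the wind speed training set with multiple prediction algorithms and predicting on the wind speed test set to obtain multiple wind speed prediction models, which form a Q learning model set;

S3) adding wind speed fluctuation conditions and attribute factors to the Q learning model set, selecting the optimal wind speed prediction model for each time period through a Q reinforcement learning algorithm to obtain preliminary wind speed prediction data, and calculating the wind speed prediction error from the preliminary wind speed prediction data and the corresponding measured wind speed data;

S4) constructing an error Q learning model library based on the wind speed prediction error, and selecting the optimal wind speed prediction error model from the error Q learning model library through a Q reinforcement learning algorithm to correct the preliminary wind speed prediction values and obtain the final wind speed prediction data.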

2. The method for building a dynamic wind speed prediction model according to claim 1, wherein the preprocessing of the measured wind speed data in step S1) is to replace missing and abnormal values in the measured wind speed data by using a neighboring data complementation method.

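3. The method for building a dynamic wind speed prediction model according to claim 1, further comprising, after step S4):

S5) verifying the validity of the optimal wind speed prediction error model by using the wind speed verification set.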

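4. The method of claim 3, wherein, to verify the validity of the optimal wind speed prediction error model, three evaluation indexes, namely the mean square error ε₁, the relative error ε₂ and the coefficient of determination R², are selected to evaluate the final wind speed prediction data, and the calculation formulas are respectively as follows:

wherein: x_t, y_t, x̄ and ȳ are respectively the measured wind speed value at time t, the final predicted wind speed value at time t, the average measured wind speed value and the average final predicted wind speed value.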

5. The method for establishing the dynamic wind speed prediction model according to claim 1, wherein the plurality of prediction algorithms adopted in the Q learning model set in step S2) are 5, including learning algorithms of LSTM, XGBoost, SVR, BP neural network and KRR.

6. The method for building a dynamic wind speed prediction model according to claim 1, wherein the error Q learning model library in step S4) includes 5 prediction algorithms, including learning algorithms of SVR, BP neural network, GKRR, PKRR and MHKRR.

7. The dynamic wind speed prediction model building method according to claim 1, wherein the wind speed prediction error in step S3) is calculated as follows:

e＝x-ŷ

in the formula: e is the wind speed prediction error, x is the measured wind speed value, and ŷ is the preliminary wind speed predicted value.

8. The method for building a dynamic wind speed prediction model according to claim 1, wherein the final wind speed prediction data in step S4) is calculated as follows:

y＝ŷ+ê

in the formula: y is the final wind speed predicted value, ŷ is the preliminary wind speed predicted value, and ê is the predicted wind speed error used for correction.

9. The method for building a dynamic wind speed prediction model according to claim 1, wherein the Q reinforcement learning algorithm in step S3) and in step S4) uses a hybrid reward function combining error and model rank, and the calculation formula of the reward function is as follows:

R_t(s_I,a_J)＝α[RANK(M_I,t)-RANK(M_J,t+1)]+β[TIME(M_I,t)-TIME(M_J,t+1)]

in the formula: R(s, a) is the reward function; S is the state space, S＝{s₁,…,s_I,…,s_N}, where s_I is the current wind speed prediction model and N is the number of wind speed prediction models; A is the action space, A＝{a₁,…,a_J,…,a_N}, where a_J is the action of switching from the current wind speed prediction model to the next wind speed prediction model at the next prediction time step; RANK(M_I,t) and RANK(M_J,t+1) are the ranks of wind speed prediction model M_I at time t and of wind speed prediction model M_J at time t+1, respectively; TIME(M_I,t) and TIME(M_J,t+1) are the corresponding computation times of the two models; α and β are weight coefficients satisfying α+β＝1.