CN113515889B

CN113515889B - Dynamic wind speed prediction model building method

Info

Publication number: CN113515889B
Application number: CN202110557310.3A
Authority: CN
Inventors: 李永刚; 王月; 吴滨源
Original assignee: North China Electric Power University
Current assignee: North China Electric Power University
Priority date: 2021-05-21
Filing date: 2021-05-21
Publication date: 2023-06-13
Anticipated expiration: 2041-05-21
Also published as: CN113515889A

Abstract

The invention discloses a method for establishing a dynamic wind speed prediction model, comprising the following steps: obtaining measured wind speed data of a target area and preprocessing the data; training a plurality of prediction algorithms on the preprocessed measured wind speed data to obtain a plurality of wind speed prediction models, which form a Q learning model set; adding wind speed fluctuation conditions and attribute factors to the Q learning model set, selecting the optimal wind speed prediction model for each period through a Q reinforcement learning algorithm to obtain preliminary wind speed prediction data, and calculating the wind speed prediction error; and constructing an error Q learning model library based on the wind speed prediction error, and selecting the optimal wind speed prediction error model from the error Q learning model library through a Q reinforcement learning algorithm to correct the preliminary wind speed prediction values, so as to obtain the final wind speed prediction data. The method constructs the dynamic wind speed prediction model by applying the Q reinforcement learning algorithm twice, and offers strong generalization capability, good robustness and high prediction accuracy.

Description

Dynamic wind speed prediction model building method

Technical Field

The invention relates to the technical field of wind speed prediction, in particular to a method for establishing a dynamic wind speed prediction model.

Background

In recent years, the energy mix has been shifting steadily toward low-carbon sources, and the penetration of renewable energy, represented by wind power, in power grids has increased year by year. With the large-scale grid connection of renewable sources such as wind power, the dispatching economy of the power grid has gradually improved, but the operational reliability of the power system is seriously reduced by the fluctuation, intermittency and low energy density of wind. Therefore, to make better use of wind power while maintaining the stability of the power system, wind speed must be predicted accurately in the short term. However, there is as yet no clear guidance on how to build the prediction model or how to optimize it to improve prediction accuracy and generalization capability.

Disclosure of Invention

The invention aims to provide a method for establishing a dynamic wind speed prediction model, which constructs the dynamic wind speed prediction model by applying the Q reinforcement learning algorithm twice and offers strong generalization capability, good robustness and high prediction accuracy.

In order to achieve the above object, the present invention provides the following solutions:

A method for establishing a dynamic wind speed prediction model comprises the following steps:

S1) obtaining measured wind speed data of a target area, and preprocessing the measured wind speed data;

S2) dividing the preprocessed measured wind speed data into a wind speed training set, a wind speed test set and a wind speed checking set, training a plurality of prediction algorithms on the wind speed training set and predicting on the wind speed test set to obtain a plurality of wind speed prediction models, which form a Q learning model set;

S3) adding wind speed fluctuation conditions and attribute factors to the Q learning model set, selecting the optimal wind speed prediction model for each period through a Q reinforcement learning algorithm to obtain preliminary wind speed prediction data, and calculating a wind speed prediction error from the preliminary wind speed prediction data and the corresponding measured wind speed data;

S4) constructing an error Q learning model library based on the wind speed prediction error, and selecting the optimal wind speed prediction error model from the error Q learning model library through a Q reinforcement learning algorithm to correct the preliminary wind speed prediction values, so as to obtain final wind speed prediction data.

Optionally, the preprocessing of the measured wind speed data in step S1) refers to replacing missing and abnormal values in the measured wind speed data by adopting an adjacent data complementation method.
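The adjacent data complementation step can be illustrated with a minimal sketch. It assumes that a value is abnormal when it falls outside a plausible physical range (the bounds of 0 and 40 m/s are illustrative and not taken from the source) and that each bad sample is replaced by the preceding valid sample, or by the first valid sample that follows when the gap is at the start of the series. The function name `fill_with_adjacent` is hypothetical.

```python
import math


def fill_with_adjacent(speeds, low=0.0, high=40.0):
    """Replace missing (None/NaN) or out-of-range samples with an adjacent valid sample."""

    def is_valid(v):
        return v is not None and not math.isnan(v) and low <= v <= high

    cleaned = list(speeds)

    # Forward pass: carry the last valid sample over any bad samples that follow it.
    last_valid = None
    for i, v in enumerate(cleaned):
        if is_valid(v):
            last_valid = v
        elif last_valid is not None:
            cleaned[i] = last_valid

    # Leading gaps have no preceding sample: use the first valid sample instead.
    first_valid = next((v for v in cleaned if is_valid(v)), None)
    for i, v in enumerate(cleaned):
        if is_valid(v):
            break
        cleaned[i] = first_valid

    return cleaned


# Illustrative series: a missing start, a NaN and an implausible spike.
raw = [None, 5.0, float("nan"), 99.0, 6.0]
print(fill_with_adjacent(raw))
```

Other choices, such as averaging the two neighbours of a bad sample, would also fit the description; the source does not specify which neighbour is used.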

Optionally, the method further comprises, after step S4):

S5) verifying the validity of the optimal wind speed prediction error model by using the wind speed test set.

Optionally, the validity of the optimal wind speed prediction error model is verified by evaluating the final wind speed prediction data with three evaluation indexes: the mean square error $\varepsilon_1$, the relative error $\varepsilon_2$ and the coefficient of determination $R^2$. The indexes are computed from $x_t$, $y_t$, $\bar{x}$ and $\bar{y}$, which are respectively the measured wind speed value at time $t$, the final predicted wind speed value at time $t$, the mean of the measured wind speed values and the mean of the final predicted wind speed values.
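The three evaluation indexes can be computed as in the sketch below. The exact formulas are assumptions: the mean square error and the mean relative error follow their usual definitions, and the coefficient of determination is taken as the squared correlation between measured and predicted values, because that is the common form that uses both mean values named in the text. The function names are hypothetical.

```python
def mean_square_error(x, y):
    """Average squared difference between measured x and predicted y."""
    return sum((xt - yt) ** 2 for xt, yt in zip(x, y)) / len(x)


def mean_relative_error(x, y):
    """Average of |x_t - y_t| / |x_t|; assumes no measured value is zero."""
    return sum(abs(xt - yt) / abs(xt) for xt, yt in zip(x, y)) / len(x)


def r_squared(x, y):
    """Squared correlation between x and y; assumes neither series is constant."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    cov = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y))
    var_x = sum((xt - x_bar) ** 2 for xt in x)
    var_y = sum((yt - y_bar) ** 2 for yt in y)
    return cov ** 2 / (var_x * var_y)


# Illustrative values only, not data from the source.
measured = [2.0, 4.0, 6.0]
predicted = [2.0, 4.0, 6.0]
print(mean_square_error(measured, predicted),
      mean_relative_error(measured, predicted),
      r_squared(measured, predicted))
```

With a perfect prediction both error indexes are 0 and the coefficient of determination is 1, which matches the ideal values stated for the embodiment.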

Optionally, five prediction algorithms are used in the Q learning model set in step S2), namely LSTM, XGBoost, SVR, BP neural network and KRR.

Optionally, five prediction algorithms are used in the error Q learning model library in step S4), namely SVR, BP neural network, GKRR, PKRR and MHKRR.

Optionally, the wind speed prediction error in step S3) and in step S4) is calculated as follows:

$$e = x - \hat{x}$$

wherein: $e$ is the wind speed prediction error, $x$ is the measured wind speed value, and $\hat{x}$ is the preliminary wind speed prediction value.

Optionally, the final wind speed prediction data in step S4) is calculated as follows:

$$y = \hat{x} + \hat{e}$$

wherein: $y$ is the final wind speed prediction value, $\hat{x}$ is the preliminary wind speed prediction value, and $\hat{e}$ is the corrected wind speed prediction error.
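The relation between the prediction error and the corrected forecast can be shown with a small worked example. It assumes that the error is the measured value minus the preliminary prediction and that the final prediction adds the predicted error back to the preliminary prediction; the function names and all numbers are illustrative.

```python
def prediction_errors(measured, preliminary):
    """Error series used to train the error models: measured minus preliminary."""
    return [x - p for x, p in zip(measured, preliminary)]


def corrected_forecast(preliminary, predicted_errors):
    """Final forecast: preliminary prediction plus the predicted error."""
    return [p + e for p, e in zip(preliminary, predicted_errors)]


# Illustrative values in m/s.
measured = [5.0, 6.0]
preliminary = [4.5, 6.5]
errors = prediction_errors(measured, preliminary)         # [0.5, -0.5]

# An error model will not reproduce the true errors exactly.
predicted_errors = [0.4, -0.3]
final = corrected_forecast(preliminary, predicted_errors)
print(errors, final)
```

In this example the corrected values are closer to the measurements than the preliminary ones, which is the purpose of the error correction stage.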

Optionally, the Q reinforcement learning algorithm in step S3) and in step S4) uses a reward function that mixes error and model rank, calculated as follows:

$$R_t(s_I, a_J) = \alpha\,[\mathrm{RANK}(M_{I,t}) - \mathrm{RANK}(M_{J,t+1})] + \beta\,[\mathrm{TIME}(M_{I,t}) - \mathrm{TIME}(M_{J,t+1})]$$

wherein: $R(s, a)$ is the reward function; $S$ is the state space, $S = \{s_1, \ldots, s_I, \ldots, s_N\}$, where $s_I$ is the current wind speed prediction model and $N$ is the number of wind speed prediction models; $A$ is the action space, $A = \{a_1, \ldots, a_J, \ldots, a_N\}$, where $a_J$ is the action of switching from the current wind speed prediction model to the next wind speed prediction model at the next prediction time step; $\mathrm{RANK}(M_{I,t})$ and $\mathrm{RANK}(M_{J,t+1})$ are the rankings of the wind speed prediction models at time $t$ and time $t+1$, respectively; $\mathrm{TIME}(M_{I,t})$ and $\mathrm{TIME}(M_{J,t+1})$ are the computation times of the wind speed prediction models at time $t$ and time $t+1$, respectively; $\alpha$ and $\beta$ are weight coefficients satisfying $\alpha + \beta = 1$.
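A minimal sketch of this reward is given below. It assumes that rank 1 denotes the best model, so that moving to a better-ranked or faster model yields a positive reward; the default weights are the values α = 0.9 and β = 0.1 reported for the embodiment, and the function name and example numbers are illustrative.

```python
import math


def mixed_reward(rank_current, rank_next, time_current, time_next,
                 alpha=0.9, beta=0.1):
    """Weighted sum of the rank improvement and the computation-time saving."""
    if not math.isclose(alpha + beta, 1.0):
        raise ValueError("alpha and beta must sum to 1")
    rank_gain = rank_current - rank_next      # positive when the next model ranks better
    time_gain = time_current - time_next      # positive when the next model is faster
    return alpha * rank_gain + beta * time_gain


# Switching from the 3rd-ranked model (0.5 s) to the 1st-ranked model (0.2 s).
print(mixed_reward(3, 1, 0.5, 0.2))
# Staying on the best model: the rank term is zero, only the time term remains.
print(mixed_reward(1, 1, 0.2, 0.2))
```

The second call shows why a second weighted term is useful: when both the current and the next model are ranked first, the rank difference alone gives no signal.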

According to the specific embodiments provided by the invention, the invention discloses the following technical effects. The method for establishing a dynamic wind speed prediction model applies the Q reinforcement learning algorithm twice: one Q learning agent selects the optimal wind speed prediction model to produce the preliminary wind speed prediction, and the other Q learning agent takes the computed prediction errors as input to the error correction part and selects the optimal wind speed prediction error model from the error Q learning model library, yielding the optimal prediction strategy. Q learning effectively selects the optimal prediction model in both the wind speed prediction part and the error correction part. The error correction reduces the average relative prediction error by 50%, and the error correction step remains effective even for a mature prediction model. The wind speed prediction model built by the method was used to predict typical months in different seasons, and the results show that the method has strong generalization capability, good robustness and high prediction accuracy. It addresses the reduced operational reliability of the power system caused by the fluctuation, intermittency and low energy density of wind, and can markedly improve the dispatching economy of power grids with grid-connected renewable energy and the operational safety of wind farms.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the drawings that are needed in the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of a method for establishing a dynamic wind speed prediction model according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a dynamic wind speed prediction model building method according to an embodiment of the present invention;

FIG. 3 is a convergence diagram of the reward function according to an embodiment of the present invention;

FIG. 4 is a diagram of the wind speed fluctuation over 2019 at an actual wind farm in Northeast China according to an embodiment of the present invention;

FIG. 5a is a graph of the wind speed prediction data of QWSP, LSTM and the BP neural network for a typical spring month (March) according to an embodiment of the present invention;

FIG. 5b is a graph of the wind speed prediction data of QWSP, LSTM and the BP neural network for a typical summer month (June) according to an embodiment of the present invention;

FIG. 5c is a graph of the wind speed prediction data of QWSP, LSTM and the BP neural network for a typical autumn month (September) according to an embodiment of the present invention;

FIG. 5d is a graph of the wind speed prediction data of QWSP, LSTM and the BP neural network for a typical winter month (December) according to an embodiment of the present invention;

FIG. 6a is a graph of the DPDQ wind speed prediction data for a typical spring month (March) according to an embodiment of the present invention;

FIG. 6b is a graph of the DPDQ wind speed prediction data for a typical summer month (June) according to an embodiment of the present invention;

FIG. 6c is a graph of the DPDQ wind speed prediction data for a typical autumn month (September) according to an embodiment of the present invention;

FIG. 6d is a graph of the DPDQ wind speed prediction data for a typical winter month (December) according to an embodiment of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.

As shown in fig. 1 to 2, the method for establishing a dynamic wind speed prediction model provided by the embodiment of the invention includes the following steps:

S3) adding wind speed fluctuation conditions and attribute factors (wind speed, temperature, humidity, wind direction and turbulence speed) to the Q learning model set, selecting the optimal wind speed prediction model for each period through a Q reinforcement learning algorithm to obtain preliminary wind speed prediction data, and calculating a wind speed prediction error from the preliminary wind speed prediction data and the corresponding measured wind speed data;

S4) constructing an error Q learning model library based on the wind speed prediction error, and selecting the optimal wind speed prediction error model from the error Q learning model library through a Q reinforcement learning algorithm to correct the preliminary wind speed prediction values, the final wind speed prediction data being obtained after the error correction.

In step S2), five prediction algorithms are used in the Q learning model set. Given the high variability of the wind speed sequence, deep learning can mine deep features from the measured wind speed data and predicts the wind speed trend better when the wind speed fluctuates severely, but it may overfit in the details. SVR and BP neural networks often achieve higher prediction accuracy in periods of relatively mild wind speed fluctuation, but produce larger prediction errors when the fluctuation is severe. Therefore, five algorithms are selected as the base models of the wind speed prediction model set: the deep learning algorithm LSTM, the ensemble learning algorithm XGBoost, the shallow learning algorithm SVR, the BP neural network and KRR, so that Q learning can select the more suitable prediction model for each fluctuation condition. Here KRR is PKRR, which is based on a polynomial kernel function.

The principle of the Q reinforcement learning algorithm is briefly described as follows:

To train the Q learning agent, a mathematical framework for reinforcement-learning-based dynamic model selection is first defined as a Markov decision process. Typically, the Q learning agent takes sequential actions over a series of states according to a state-action value matrix (Q matrix) until the final goal is reached, and obtains rewards to update the Q matrix by evaluating the prediction performance. The state space consists of the current prediction model:

$$S = \{s_1, \ldots, s_I, \ldots, s_N\}$$

wherein: $s_I$ represents the current wind speed prediction model and $N$ is the number of wind speed prediction models. Likewise, the action space $A$ consists of the wind speed prediction model of the next step:

$$A = \{a_1, \ldots, a_J, \ldots, a_N\}$$

wherein $a_J$ represents the action of switching from the current wind speed prediction model to the next wind speed prediction model at the next prediction time step. To solve the Markov decision process successfully with Q learning, the central task is to derive the reward matrix $R$ through an appropriate reward function $R(s, a)$. Embodiments of the present invention define the reward function mixing error and model rank as follows:

$$R_t(s_I, a_J) = \alpha\,[\mathrm{RANK}(M_{I,t}) - \mathrm{RANK}(M_{J,t+1})] + \beta\,[\mathrm{TIME}(M_{I,t}) - \mathrm{TIME}(M_{J,t+1})]$$

wherein: $\mathrm{RANK}(M_{I,t})$ and $\mathrm{RANK}(M_{J,t+1})$ are the rankings of the wind speed prediction models at time $t$ and time $t+1$, respectively; $\mathrm{TIME}(M_{I,t})$ and $\mathrm{TIME}(M_{J,t+1})$ are the computation times of the wind speed prediction models at time $t$ and time $t+1$, respectively; $\alpha$ and $\beta$ are weight coefficients satisfying $\alpha + \beta = 1$. As a model-free framework, Q-learning-based dynamic model selection preferentially selects the best model for prediction, so when both models are ranked 1 the rank term is 0 and the reward and penalty effect is lost. A Q learning framework with two weighted terms is therefore chosen to make the reward function more general. After the state space, action space and reward function are defined, the Q learning dynamic prediction model is trained on the Q learning training data set $T_t$.

The Q learning agent uses a decaying ε-greedy method with exploration probability $\omega$: it takes completely random actions at the start and reduces the randomness through decay during learning. After $N_e$ training episodes, the Q learning algorithm eventually converges to an optimal policy $Q^*$, which is used to find the optimal action $a^*$ during Q learning. The specific steps are as follows:

(1) Define the model step length $k$, the prediction scale $N$, the model library size $N_M$, the Q learning data set $T_t$, the dynamic prediction model data set $T_c$, the learning rate $\kappa$ that controls the aggressiveness of learning, the discount factor $\gamma$ that weighs future returns, and the number of training episodes $N_e$, so that the best of the $N$ models is selected on $T_c$;

(2) Initialize $Q(s, a)$, set $\omega = 1$ and start training; with probability $\omega$ select a random action $a$, otherwise select

$$a^* = \arg\max_{a \in A} Q(s, a)$$

(3) Calculate and update the reward matrix $R$ according to the reward function;

(4) Update $Q(s, a)$ by:

$$Q(s, a) \leftarrow (1 - \kappa)\,Q(s, a) + \kappa\left[R(s, a) + \gamma \max_{a'} Q(s', a')\right]$$

where $s'$ is the state reached after taking action $a$;

(5) Repeat (2)-(4) $k$ times, finding the optimal action each time:

$$a^* = \arg\max_{a \in A} Q^*(s, a)$$
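The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the patented implementation: it assumes the standard tabular Q-learning update, a static reward matrix built from fixed model ranks (rank 1 is best, no time term), and a multiplicative decay of the exploration probability. The names `train_q` and `best_action` and the `decay` parameter are hypothetical; the defaults for the learning rate, discount factor, step length and episode count are the values reported for the embodiment.

```python
import random


def train_q(reward, episodes=100, steps=6, kappa=0.1, gamma=0.8,
            decay=0.95, seed=0):
    """Tabular Q learning where state = current model and action = next model."""
    rng = random.Random(seed)
    n = len(reward)
    q = [[0.0] * n for _ in range(n)]
    omega = 1.0                                   # start fully random
    for _ in range(episodes):
        state = rng.randrange(n)
        for _ in range(steps):
            if rng.random() < omega:              # explore
                action = rng.randrange(n)
            else:                                 # exploit the current Q matrix
                action = max(range(n), key=lambda a: q[state][a])
            next_state = action                   # the chosen model becomes current
            target = reward[state][action] + gamma * max(q[next_state])
            q[state][action] = (1 - kappa) * q[state][action] + kappa * target
            state = next_state
        omega *= decay                            # reduce randomness over time
    return q


def best_action(q, state):
    """Index of the model to switch to from `state` under the learned Q matrix."""
    return max(range(len(q)), key=lambda a: q[state][a])


# Three hypothetical models with fixed ranks; model 0 is the best one.
ranks = [1, 3, 2]
reward = [[ri - rj for rj in ranks] for ri in ranks]
q = train_q(reward)
print([best_action(q, s) for s in range(len(ranks))])
```

In the method described here the ranks change from one period to the next, so the reward matrix would be recomputed in step (3) instead of being fixed as in this sketch.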

The wind speed prediction error in step S3) and in step S4) is calculated as follows:

$$e = x - \hat{x}$$

wherein: $e$ is the wind speed prediction error, $x$ is the measured wind speed value, and $\hat{x}$ is the preliminary wind speed prediction value.

In step S4), five prediction algorithms are used in the error Q learning model library. As for the selection of the error correction model set (that is, the error Q learning model library formed by the wind speed prediction error models): because the prediction errors fluctuate and vary less severely than the measured wind speed sequence, and the error sequence needs to be predicted more accurately, five more efficient models, namely SVR, BP neural network, GKRR, PKRR and MHKRR, are selected to form the error correction model set, where GKRR, PKRR and MHKRR are KRR models using different kernel functions.

The method further comprises, after step S4): S5) verifying the validity of the optimal wind speed prediction error model by using the wind speed test set. To verify the validity of the optimal wind speed prediction error model, three evaluation indexes, the mean square error $\varepsilon_1$, the relative error $\varepsilon_2$ and the coefficient of determination $R^2$, are used to evaluate the final wind speed prediction data; the ideal value of $\varepsilon_1$ and $\varepsilon_2$ is 0 and the ideal value of $R^2$ is 1. The indexes are computed from $x_t$, $y_t$, $\bar{x}$ and $\bar{y}$, which are respectively the measured wind speed value at time $t$, the final predicted wind speed value at time $t$, the mean of the measured wind speed values and the mean of the final predicted wind speed values.

Wind speed and related attribute data for 2019 from an actual wind farm in Northeast China were selected for the study, and short-term wind speed prediction was carried out for the typical month of each quarter (March, June, September and December). The measured wind speed data were preprocessed by replacing missing and abnormal values with the adjacent data complementation method. The measured wind speed data were sampled at an interval of 10 min; for each month, the data of the first 20 days were used as the wind speed training set, the data of days 21-25 as the wind speed test set, and the data of days 26-30 as the wind speed checking set (used to check whether the model parameter settings are appropriate).
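The split described above follows directly from the 10 min sampling interval, which gives 144 samples per day. A minimal sketch, assuming the monthly series is a plain list ordered in time and starting at the first sample of day 1 (the function name `split_month` is hypothetical):

```python
SAMPLES_PER_DAY = 24 * 60 // 10   # 144 samples per day at a 10 min interval


def split_month(series):
    """Split one month into training (days 1-20), test (21-25) and checking (26-30) sets."""
    train_end = 20 * SAMPLES_PER_DAY
    test_end = 25 * SAMPLES_PER_DAY
    check_end = 30 * SAMPLES_PER_DAY
    train = series[:train_end]
    test = series[train_end:test_end]
    check = series[test_end:check_end]    # any 31st day is left unused
    return train, test, check


# Placeholder series: sample indices instead of real wind speeds.
month = list(range(31 * SAMPLES_PER_DAY))
train, test, check = split_month(month)
print(len(train), len(test), len(check))
```

Each typical month therefore contributes 2880 training samples and 720 samples each for the test and checking sets.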

The model hyperparameters were set as follows:

The parameters of the Q learning framework in this embodiment are set to $\kappa = 0.1$ and $\gamma = 0.8$, which ensures the learning speed of dynamic model selection while fully considering the future rewards and penalties of the reward function; $N_e = 100$; and $\alpha = 0.9$, $\beta = 0.1$. According to the actual operating conditions of the wind farm, day-ahead prediction with a step length of 6 is performed ($k = 6$, $N = 144$); that is, based on the training results on the wind speed training set, the optimal policy is used to make the model selection decision for the next $k$ steps. The hyperparameter settings of the base models are shown in Table 1.

Table 1 Hyperparameter settings of the different algorithms
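These settings can be collected in one place, as in the sketch below. The field names are hypothetical, and the reading that a day-ahead horizon of N = 144 samples is covered by consecutive selection windows of k = 6 samples (one hour each at 10 min sampling) is an assumption about how the step length is applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QConfig:
    kappa: float = 0.1     # learning rate
    gamma: float = 0.8     # discount factor
    episodes: int = 100    # number of training episodes N_e
    alpha: float = 0.9     # weight of the rank term in the reward
    beta: float = 0.1      # weight of the time term in the reward
    step: int = 6          # model step length k
    horizon: int = 144     # prediction scale N (one day at 10 min sampling)


def decision_windows(cfg):
    """Start and end sample indices of each model-selection window in the horizon."""
    return [(start, min(start + cfg.step, cfg.horizon))
            for start in range(0, cfg.horizon, cfg.step)]


cfg = QConfig()
windows = decision_windows(cfg)
print(len(windows), windows[0], windows[-1])
```

Under this reading the agent makes 24 model selection decisions per predicted day.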

For the Q learning reward function, the adaptive error function commonly used in artificial intelligence algorithms is the usual choice. However, training showed that Q learning with the adaptive error function as the reward function fails to converge: the value of the prediction evaluation index depends not only on the prediction model but also changes over time, so an action that switches from a poor model to the optimal model may still receive a negative return (because the prediction evaluation index has deteriorated). Moreover, whether a prediction model is mature depends not only on its prediction accuracy but also on its time cost. Another reward function is therefore proposed to evaluate model performance, which jointly considers the improvement in model ranking and the model prediction time. The training results of the two methods are shown in FIG. 3; the proposed reward function converges successfully, effectively avoiding the time-series effect.
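A small numerical example makes the convergence problem concrete. The error values below are invented for illustration, and `error_reward` is only a stand-in for an error-based reward, not the adaptive error function of the source. When the wind becomes harder to predict at the next step, every model's error rises, so switching from the poor model to the best one is still penalized by an error-based reward, while a rank-based reward stays positive.

```python
def error_reward(err_current, err_next):
    """Reward from the raw error change: positive only if the error drops."""
    return err_current - err_next


def rank_of(errors, model):
    """Rank of `model` among all models at one time step (1 = smallest error)."""
    ordered = sorted(errors, key=errors.get)
    return ordered.index(model) + 1


def rank_reward(errors_now, errors_next, current, chosen):
    """Reward from the rank change between the current and the chosen model."""
    return rank_of(errors_now, current) - rank_of(errors_next, chosen)


# Hypothetical errors at time t and t+1; the wind is more volatile at t+1.
errors_t = {"BP": 1.0, "LSTM": 0.5}
errors_t1 = {"BP": 2.0, "LSTM": 1.2}

by_error = error_reward(errors_t["BP"], errors_t1["LSTM"])    # negative
by_rank = rank_reward(errors_t, errors_t1, "BP", "LSTM")      # positive
print(by_error, by_rank)
```

Because ranks compare models at the same time step, they remove the part of the error change that is caused by the weather rather than by the model choice.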

As shown in FIG. 4, the wind speed fluctuation of the northeast wind farm throughout 2019 indicates that the wind energy density of the wind farm is high. The wind speed fluctuates more severely in spring and winter, with large differences between the wind speeds at different times and a maximum wind speed above 25 m/s. In summer and autumn the wind speed is mostly below 10 m/s, the fluctuation is gentle, and the wind energy density is markedly lower than in spring and winter.

To test and demonstrate the effectiveness of Q-learning-based dynamic model selection, two single prediction models based on different prediction principles, LSTM and the BP neural network, are compared in simulation with the wind speed prediction part (Q learning wind speed prediction, QWSP) of the dynamic prediction based on double Q learning (DPDQ) model. Day-ahead prediction with a sliding step length of 6 is performed on the wind speed of the typical month of each quarter; the prediction results are shown in FIGS. 5a to 5d.

As can be seen from FIGS. 5a to 5d, QWSP performs well under the different wind speed fluctuation conditions of each season, and its overall prediction is better than that of a single prediction model. In detail, the wind speed in summer and autumn fluctuates more gently and is lower, while the wind speed in spring and winter is relatively higher. Enlarging points 121-126 in FIG. 5a and points 115-120 in FIG. 5b shows that every model achieves a good prediction there. At points 61-66 in FIG. 5c, the BP neural network and LSTM, which make no dynamic selection, cannot cope with all wind speed changes, and their predictions deviate considerably from the actual values. The detail in FIG. 5d is different: the enlarged plot on the right shows that the prediction deviation of QWSP is larger than that of the BP neural network. This deviation is mainly caused by an incorrect model selection by Q learning; under the reward function mechanism, the selected model ranks relatively low, so the reward is negative and the selection is corrected at the next model selection to obtain a better prediction. When the dynamic selection by Q learning is correct, the prediction accuracy is generally higher and the model is more robust; when a single prediction model overfits and a larger prediction error occurs, the selection can be corrected promptly in the next period thanks to the reward function mechanism. The Q-learning-based model selection strategy can therefore improve the performance of the wind speed prediction model as a whole. The wind speed prediction errors for the typical month of each quarter are shown in Table 2.

Table 2 short-term prediction error for different methods

As can be seen from the data in Table 2, because the wind speed fluctuates least in summer, $R^2$ of every model reaches about 0.9; the wind speed fluctuates strongly in winter, which makes prediction more difficult, so the prediction error $\varepsilon_1$ also increases. For the reinforcement-learning-based dynamic model selection proposed in this embodiment, $R^2$ is closest to 1 and $\varepsilon_1$ is the smallest of the three models in every quarter.

To verify the effectiveness of the error correction step in this embodiment, the day-ahead wind speed predictions of DPDQ for each season are shown in FIGS. 6a to 6d. The predictions of the DPDQ model are good in every season, but for certain extreme wind conditions with very large wind speed differences, such as the highest wind speed point in FIG. 6c, some prediction error remains, which is unavoidable.

The method for establishing a dynamic wind speed prediction model provided by the invention applies the Q reinforcement learning algorithm twice: one Q learning agent selects the optimal wind speed prediction model to produce the preliminary wind speed prediction, and the other Q learning agent takes the computed prediction errors as input to the error correction part and selects the optimal wind speed prediction error model from the error Q learning model library, yielding the optimal prediction strategy. Q learning effectively selects the optimal prediction model in both the wind speed prediction part and the error correction part. The error correction reduces the average relative prediction error by 50%, and the error correction step remains effective even for a mature prediction model. The wind speed prediction model built by the method was used to predict typical months in different seasons, and the results show that the method has strong generalization capability, good robustness and high prediction accuracy. It addresses the reduced operational reliability of the power system caused by the fluctuation, intermittency and low energy density of wind, and can markedly improve the dispatching economy of power grids with grid-connected renewable energy and the operational safety of wind farms.

The principles and embodiments of the present invention have been described herein with reference to specific examples; the description of the above embodiments is intended only to help in understanding the method of the present invention and its core ideas. Meanwhile, those of ordinary skill in the art may modify the specific embodiments and the scope of application in light of the ideas of the present invention. In view of the foregoing, this description should not be construed as limiting the invention.

Claims

1. The method for establishing the dynamic wind speed prediction model is characterized by comprising the following steps of:

2. The method for building a dynamic wind speed prediction model according to claim 1, wherein the preprocessing of the measured wind speed data in step S1) means to replace missing and abnormal values in the measured wind speed data by an adjacent data complementation method.

3. The method for building a dynamic wind speed prediction model according to claim 1, wherein the method further comprises, after step S4):

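4. The method for building a dynamic wind speed prediction model according to claim 3, wherein the validity of the optimal wind speed prediction error model is verified by evaluating the final wind speed prediction data with three evaluation indexes: the mean square error $\varepsilon_1$, the relative error $\varepsilon_2$ and the coefficient of determination $R^2$, the indexes being computed from $x_t$, $y_t$, $\bar{x}$ and $\bar{y}$, which are respectively the measured wind speed value at time $t$, the final predicted wind speed value at time $t$, the mean of the measured wind speed values and the mean of the final predicted wind speed values.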

5. The method according to claim 1, wherein five prediction algorithms are used in the Q learning model set in step S2), namely LSTM, XGBoost, SVR, BP neural network and KRR.

6. The method according to claim 1, wherein five prediction algorithms are used in the error Q learning model library in step S4), namely SVR, BP neural network, GKRR, PKRR and MHKRR.

7. The method according to claim 1, wherein the wind speed prediction error in step S3) is calculated as follows:

$$e = x - \hat{x}$$

wherein: $e$ is the wind speed prediction error, $x$ is the measured wind speed value, and $\hat{x}$ is the preliminary wind speed prediction value.

8. The method according to claim 1, wherein the final wind speed prediction data in step S4) is calculated as follows:

$$y = \hat{x} + \hat{e}$$

wherein: $y$ is the final wind speed prediction value, $\hat{x}$ is the preliminary wind speed prediction value, and $\hat{e}$ is the corrected wind speed prediction error.

9. The method for building a dynamic wind speed prediction model according to claim 1, wherein the Q reinforcement learning algorithm in step S3) and in step S4) uses a reward function that mixes error and model rank, calculated as follows:

$$R_t(s_I, a_J) = \alpha\,[\mathrm{RANK}(M_{I,t}) - \mathrm{RANK}(M_{J,t+1})] + \beta\,[\mathrm{TIME}(M_{I,t}) - \mathrm{TIME}(M_{J,t+1})]$$

wherein: $R(s, a)$ is the reward function; $S$ is the state space, $S = \{s_1, \ldots, s_I, \ldots, s_N\}$, where $s_I$ is the current wind speed prediction model and $N$ is the number of wind speed prediction models; $A$ is the action space, $A = \{a_1, \ldots, a_J, \ldots, a_N\}$, where $a_J$ is the action of switching from the current wind speed prediction model to the next wind speed prediction model at the next prediction time step; $\mathrm{RANK}(M_{I,t})$ and $\mathrm{RANK}(M_{J,t+1})$ are the rankings of the wind speed prediction models at time $t$ and time $t+1$, respectively; $\mathrm{TIME}(M_{I,t})$ and $\mathrm{TIME}(M_{J,t+1})$ are the computation times of the wind speed prediction models at time $t$ and time $t+1$, respectively; $\alpha$ and $\beta$ are weight coefficients satisfying $\alpha + \beta = 1$.