CN111985711B

CN111985711B - Wind power probability prediction model building method

Info

Publication number: CN111985711B
Application number: CN202010834929.XA
Authority: CN
Inventors: 李永刚; 王月; 吴滨源
Original assignee: North China Electric Power University
Current assignee: North China Electric Power University
Priority date: 2020-08-19
Filing date: 2020-08-19
Publication date: 2024-02-02
Anticipated expiration: 2040-08-19
Also published as: CN111985711A

Abstract

The invention discloses a method for building a wind power probability prediction model, comprising the following steps: removing outliers from the initial data set and, based on grey relational theory, selecting the meteorological variables whose degree of association with wind power exceeds a preset threshold as the training data set of the model; establishing an improved natural gradient boosting meta-model and updating the parameter vector; and performing Blending model fusion on multiple improved natural gradient boosting meta-models, establishing a new meta-model for training, and outputting the final predicted statistical parameter vector. The method provides complete wind power uncertainty information, achieves a higher prediction interval coverage rate and a smaller average prediction interval width ratio, and provides a more accurate reference for building an efficient, intelligent new-energy power system.

Description

Wind power probability prediction model building method

Technical Field

The invention relates to the technical field of wind power prediction of wind power plants, in particular to a wind power probability prediction model building method.

Background

With the low-carbon transition of the energy mix, the grid penetration of renewable energy, represented by wind power, is increasing year by year. However, because wind energy is highly random, wind power fluctuates severely; point prediction alone cannot provide complete uncertainty information, which challenges the safe and stable operation of the power grid. Accurate probabilistic prediction of wind power is therefore critical for making better use of wind generation, adjusting dispatch schedules effectively, and strengthening the position of wind power in grid-connection bidding. However, how the probabilistic prediction model should be built, how its input variables should be selected, and how it should be optimized to improve prediction and generalization ability have not yet been settled.

Disclosure of Invention

The invention aims to provide a method for building a wind power probability prediction model that can predict the uncertainty information of wind power, helps improve the operational reliability of wind-integrated power grids, and addresses the low level of wind power accommodation and the difficulty of making dispatch schedules caused by the lack of such uncertainty information.

In order to achieve the above object, the present invention provides the following solutions:

A method for building a wind power probability prediction model comprises the following steps:

S1, data preprocessing: removing outliers from the initial data set and, based on grey relational theory, selecting the meteorological variables whose degree of association with wind power exceeds a preset threshold as the training data set of the wind power probability prediction model;

S2, establishing an improved natural gradient boosting meta-model: based on the training data set, predicting the parameter vector of the wind power probability distribution, relating the ordinary gradient to the natural gradient through the Fisher information, selecting classification and regression trees (CART) as base learners, and establishing an improved natural gradient boosting meta-model to update the parameter vector;

S3, Blending model fusion: performing Blending model fusion on multiple improved natural gradient boosting meta-models, establishing a new meta-model for training, and outputting the final predicted statistical parameter vector.

Optionally, step S1 (removing outliers from the initial data set and, based on grey relational theory, selecting the meteorological variables whose degree of association with wind power exceeds a preset threshold as the training data set of the wind power probability prediction model) specifically includes:

S101, removing outliers from the initial data using a box plot;

S102, taking wind power as the reference data series and the related meteorological variables as comparison data series, applying initial-value processing to each series, calculating correlation coefficients based on grey relational theory to characterize the degree of association between the two groups of series, and selecting the meteorological variables whose degree of association exceeds the threshold as the training data set of the prediction model.

Optionally, in step S101, removing outliers from the initial data using a box plot specifically includes:

the lower and upper outlier cutoff limits are determined by formula (1):

where min and max denote the lower and upper cutoff limits of the data; $Q_1$ and $Q_3$ denote the lower and upper quartiles, respectively; and $IQR = Q_3 - Q_1$.
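The box-plot screening of step S101 can be sketched as follows. This is a minimal illustration, not the patent's exact formula (1): the whisker factor of 1.5 is the common box-plot convention and is an assumption here, as are the linear-interpolation quantile rule, the function names, and the sample readings.

```python
def quartiles(values):
    """Return (Q1, Q3) using linear interpolation between order statistics."""
    data = sorted(values)

    def quantile(p):
        pos = p * (len(data) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(data) - 1)
        return data[lo] + (pos - lo) * (data[hi] - data[lo])

    return quantile(0.25), quantile(0.75)


def boxplot_limits(values, whisker=1.5):
    """Lower and upper cutoff limits; whisker=1.5 is an assumed convention."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    return q1 - whisker * iqr, q3 + whisker * iqr


def remove_outliers(values, whisker=1.5):
    """Keep only the values inside the box-plot cutoff limits."""
    lower, upper = boxplot_limits(values, whisker)
    return [v for v in values if lower <= v <= upper]


# Made-up wind speed readings (m/s) with one obvious outlier.
wind_speed = [5.1, 5.4, 4.9, 5.0, 5.3, 5.2, 19.7, 5.1]
cleaned = remove_outliers(wind_speed)
```

In practice the same limits would be computed per variable, and a sample would be dropped when any of its variables falls outside its limits; that policy is likewise an assumption.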

Optionally, step S102, in which the meteorological variables are screened against wind power by grey relational analysis, specifically includes:

1) Normalize the time series of each variable. Take the k-th of the n meteorological variable series as the comparison series $S^k(t)$ and the wind power series as the reference series $S^0(t)$, and calculate their absolute difference series $\Delta^k(t)$ as in formula (2), where $k = 1, \dots, n$:

$\Delta^k(t) = |S^k(t) - S^0(t)|$ (2)

2) Calculate the correlation coefficient $\eta^k(t)$:

where Min(·) and Max(·) denote the minimum and maximum values of the series, and ρ is the resolution coefficient;

3) Calculate the degree of association $\gamma^k$:

where $T_n$ is the series length;

4) Set a threshold and select the meteorological variables whose degree of association exceeds it to form the training data set.
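Steps 1) to 4) can be sketched as below, using the textbook form of the grey relational coefficient, (Min + ρ·Max) / (Δ + ρ·Max), averaged over the series to give the degree of association. That form, min-max scaling as the normalization, ρ = 0.5, the threshold 0.6, and all names and data are assumptions made for illustration; the patent's own formulas may differ in detail.

```python
def min_max(series):
    """Scale a non-constant series to [0, 1] (assumed normalization)."""
    lo, hi = min(series), max(series)
    return [(v - lo) / (hi - lo) for v in series]


def grey_relational_degrees(reference, comparisons, rho=0.5):
    """Return {variable name: degree of association with the reference}."""
    s0 = min_max(reference)
    # Absolute difference series, as in formula (2).
    deltas = {
        name: [abs(a - b) for a, b in zip(min_max(seq), s0)]
        for name, seq in comparisons.items()
    }
    flat = [d for seq in deltas.values() for d in seq]
    d_min, d_max = min(flat), max(flat)
    degrees = {}
    for name, seq in deltas.items():
        # Correlation coefficient at each time step (textbook form).
        eta = [
            (d_min + rho * d_max) / (d + rho * d_max) if d_max > 0 else 1.0
            for d in seq
        ]
        # Degree of association: mean coefficient over the series length.
        degrees[name] = sum(eta) / len(eta)
    return degrees


# Made-up series: wind speed tracks power, wind direction does not.
power = [0.0, 1.0, 2.0, 3.0, 4.0]
weather = {
    "wind_speed": [3.0, 4.0, 5.0, 6.0, 7.5],
    "wind_direction": [180.0, 20.0, 300.0, 90.0, 250.0],
}
degrees = grey_relational_degrees(power, weather)
threshold = 0.6  # hypothetical threshold
selected = [name for name, g in degrees.items() if g > threshold]
```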

Optionally, step S2 (predicting the parameter vector of the wind power probability distribution from the training data set, relating the ordinary gradient to the natural gradient through the Fisher information, selecting classification and regression trees as base learners, and establishing an improved natural gradient boosting meta-model to update the parameter vector) specifically includes:

S201, let the training data set D contain $n_D$ samples and m features, i.e. $D = \{(x_i, y_i)\}$ with $x_i \in \mathbb{R}^m$ and $y_i \in \mathbb{R}$, where $x_i$ is the feature vector of the i-th sample, $y_i$ is the label value of the i-th sample, and $i = 1, \dots, n_D$;

S202, establishing a scoring function $S(\theta, y_i)$ based on the Shannon information content of $y_i$:

$S(\theta, y_i) = -\log P_\theta(y_i)$ (8)

where $P_\theta(y_i)$ is the probability of $y_i$ under the predicted probability distribution, and θ is the parameter vector of that distribution;
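To make the scoring function of formula (8) concrete, the sketch below evaluates it for a Normal predictive distribution with θ = (μ, σ). The Normal family is an assumption (the text here does not fix the distribution family), and the numbers are made up.

```python
import math
from statistics import NormalDist


def log_score(mu, sigma, y):
    """S(theta, y) = -log P_theta(y) for an assumed Normal(mu, sigma)."""
    return -math.log(NormalDist(mu, sigma).pdf(y))


observed_power = 42.0
# A forecast centred near the observation with honest spread...
well_calibrated = log_score(40.0, 5.0, observed_power)
# ...versus a forecast that is far off and overconfident.
overconfident = log_score(30.0, 1.0, observed_power)
```

A lower score is better: the first forecast assigns far more probability density to the observed value, so `well_calibrated` is much smaller than `overconfident`.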

S203, let $f(\theta) = -\log P_\theta(y_i)$, perform a Taylor expansion of $f(\theta + d')$, and discard the remainder terms of third order and above:

where d' is an infinitesimal step vector by which θ moves along the natural gradient direction;

converting the Euclidean space into a statistical manifold and treating formula (12) in Riemannian space:

where the first-order term simplifies to:

and the remaining term is expressed as:

where $\psi(\theta)$, the Riemannian metric of the statistical manifold at θ, characterizes the Fisher information carried by $P_\theta(y_i)$:

the natural gradient is thereby calculated from the ordinary gradient;
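For orientation, the relations that step S203 describes have the following standard textbook form. This is a sketch in assumed notation (the tilde-nabla symbol for the natural gradient is not taken from the source, and no equation numbers are claimed), not a reproduction of the patent's formulas.

```latex
% Second-order Taylor expansion of the score around \theta
f(\theta + d') \approx f(\theta) + \nabla f(\theta)^{\top} d'
    + \tfrac{1}{2}\, d'^{\top} \nabla^{2} f(\theta)\, d'

% Fisher information as the Riemannian metric at \theta
\psi(\theta) = \mathbb{E}_{y \sim P_\theta}\!\left[
    \nabla_\theta S(\theta, y)\, \nabla_\theta S(\theta, y)^{\top} \right]

% Natural gradient obtained from the ordinary gradient
\tilde{\nabla} S(\theta, y_i) = \psi(\theta)^{-1}\, \nabla_\theta S(\theta, y_i)
```

The last line is the practical payoff: once the Fisher information is known for the chosen distribution family, the natural gradient costs one linear solve on top of the ordinary gradient.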

S204, selecting classification and regression trees as base learners, each iteration generating a new group of base learners along the generalized natural gradient direction, thereby establishing the improved natural gradient boosting meta-model;

S205, updating the parameter vector θ of the wind power probability distribution:

where $\theta^0$ is the initial parameter vector, $\alpha^m$ is the scale factor, β is the uniform learning rate, and $B^m$ is the unified representation of the base learners.
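Steps S204 and S205 can be sketched as a small boosting loop. This sketch assumes a Normal distribution parameterized by (μ, log σ), for which the Fisher information is diag(1/σ², 2); it uses depth-1 regression stumps on a single feature as stand-ins for CART base learners, keeps the scale factor fixed at 1 instead of line-searching it, and assumes the update has the NGBoost-style form θ = θ⁰ − β Σ αᵐ Bᵐ(x). All names and data are hypothetical.

```python
import math


def natural_gradient(mu, log_sigma, y):
    """Natural gradient of -log N(y; mu, sigma) w.r.t. (mu, log sigma).

    Ordinary gradient: ((mu - y) / var, 1 - (y - mu)^2 / var).
    Multiplying by the inverse Fisher matrix diag(var, 1/2) gives the result.
    """
    var = math.exp(2.0 * log_sigma)
    return mu - y, 0.5 * (1.0 - (y - mu) ** 2 / var)


def fit_stump(xs, targets):
    """Least-squares regression stump on one feature (stand-in for CART)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lm, rm)
    if best is None:  # only one distinct x value: fall back to a constant
        const = sum(targets) / len(targets)
        return lambda x: const
    _, threshold, lm, rm = best
    return lambda x: lm if x <= threshold else rm


def fit_ngboost(xs, ys, n_stages=50, beta=0.1, alpha=1.0):
    """Return predict(x) -> (mu, sigma) after n_stages boosting iterations."""
    mean = sum(ys) / len(ys)
    std = math.sqrt(sum((y - mean) ** 2 for y in ys) / len(ys))
    theta0 = (mean, math.log(std))  # initial parameters fit the marginal of y
    params = [theta0] * len(xs)
    stages = []
    for _ in range(n_stages):
        grads = [natural_gradient(mu, ls, y) for (mu, ls), y in zip(params, ys)]
        # One base learner per distribution parameter, fitted to the gradients.
        b_mu = fit_stump(xs, [g[0] for g in grads])
        b_ls = fit_stump(xs, [g[1] for g in grads])
        stages.append((b_mu, b_ls))
        params = [
            (mu - beta * alpha * b_mu(x), ls - beta * alpha * b_ls(x))
            for (mu, ls), x in zip(params, xs)
        ]

    def predict(x):
        mu, ls = theta0
        for b_mu, b_ls in stages:
            mu -= beta * alpha * b_mu(x)
            ls -= beta * alpha * b_ls(x)
        return mu, math.exp(ls)

    return predict


# Made-up (wind speed, power) pairs.
speeds = [3, 4, 5, 6, 7, 8, 9, 10]
powers = [10, 18, 33, 52, 70, 95, 118, 140]
model = fit_ngboost(speeds, powers)
mu_low, sigma_low = model(4)
mu_high, sigma_high = model(9)
```

Working in log σ keeps σ positive without any clipping, which is one reason this parameterization is a common choice.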

Optionally, step S3 (performing Blending model fusion on multiple improved natural gradient boosting meta-models and establishing a new meta-model for training so as to output the final predicted statistical parameter vector) specifically includes:

S301, splitting the initial data set: dividing the original training data set proportionally into a sub-training set DT and a test set DA, and defining the original prediction data set as DP;

S302, model fusion:

given the confidence level, establishing V NGBoost meta-models $MO_1, MO_2, \dots, MO_V$, learning DT with these meta-models and, after training is complete, outputting their prediction results DA_P and DP_P on DA and DP, where DA_P and DP_P are the initial statistical parameter vectors of the predicted values corresponding to DA and DP;

forming a new data set from the predicted mean determined by DA_P and the actual results DA_OUT corresponding to the original DA data, establishing a new meta-model MO_DA, training it, and obtaining its prediction output MO_DA_P, where MO_DA_P is the corrected predicted statistical parameter vector;

forming a new data set from MO_DA_P and DP_P, establishing a new meta-model MO_P, and training it to output the final predicted statistical parameter vector; from this vector, the upper and lower limits of the predicted value at the given confidence level are calculated, and the points are connected into upper-limit and lower-limit curves of the predicted value.
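The Blending flow of steps S301 and S302 can be sketched as follows. This is a deliberately simplified reading: least-squares lines stand in for the NGBoost meta-models, the two second-level meta-models MO_DA and MO_P are collapsed into a single one, σ is estimated from the hold-out residuals, and the interval is taken from a Normal distribution. All of these, along with the data and function names, are assumptions for illustration.

```python
import math
from statistics import NormalDist, mean


def fit_line(xs, ys):
    """Ordinary least squares y = a + b*x; stand-in for a trained meta-model."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: a + b * x


def fit_on_feature(feature, xs, ys):
    """Fit a line on a transformed feature to get differing level-1 models."""
    line = fit_line([feature(x) for x in xs], ys)
    return lambda x: line(feature(x))


# Made-up (wind speed, power) samples standing in for the training set.
data = [(3, 11), (4, 17), (5, 34), (6, 50), (7, 72), (8, 93), (9, 120),
        (10, 139), (3.5, 13), (4.5, 25), (5.5, 41), (6.5, 62), (7.5, 80),
        (8.5, 108)]
split = int(0.7 * len(data))          # S301: proportional split
DT, DA = data[:split], data[split:]
DP = [4.2, 6.8, 9.1]                  # inputs to be predicted

# S302, level 1: V = 2 stand-in meta-models trained on DT only.
x_dt, y_dt = [x for x, _ in DT], [y for _, y in DT]
level1 = [fit_on_feature(lambda x: x, x_dt, y_dt),
          fit_on_feature(lambda x: x * x, x_dt, y_dt)]


def level1_mean(x):
    """Predicted mean passed on to the second level."""
    return mean([m(x) for m in level1])


# Level 2: a new meta-model trained on (DA_P, DA_OUT), never on DT.
da_p = [level1_mean(x) for x, _ in DA]
da_out = [y for _, y in DA]
mo_da = fit_line(da_p, da_out)
residuals = [y - mo_da(p) for p, y in zip(da_p, da_out)]
sigma = math.sqrt(sum(r * r for r in residuals) / len(residuals))


def predict_interval(x, confidence=0.9):
    """Lower and upper limits of the predicted value at the confidence level."""
    dist = NormalDist(mo_da(level1_mean(x)), sigma)
    tail = (1.0 - confidence) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


bounds = [predict_interval(x) for x in DP]
```

Because the second-level model only ever sees hold-out predictions, no sample is used both to train a first-level model and to correct it, which is the property that distinguishes Blending from fitting everything on one set.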

According to the specific embodiments provided, the invention has the following technical effects. The method uses a natural gradient boosting meta-model based on generalized natural gradient calculation. Compared with traditional point prediction models, the proposed model remedies the shortcomings of the traditional Boosting algorithm in wind power probability prediction, generalizes better and is more robust, and achieves a higher prediction interval coverage rate and a smaller average prediction interval width ratio. The prediction effect is strengthened while model redundancy stays low, which makes the method better suited to practical engineering applications and helps improve the dispatch economy and operational safety of power grids with grid-connected renewable wind farms.
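The two interval-quality indicators named above are commonly computed as the fraction of observations falling inside the interval and the mean interval width divided by the range of the observations. The sketch below uses those common definitions, which are an assumption since the text does not define them; the function names and numbers are illustrative.

```python
def coverage_rate(y_true, lower, upper):
    """Fraction of observations that fall inside their prediction interval."""
    hits = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return hits / len(y_true)


def average_width_ratio(y_true, lower, upper):
    """Mean interval width normalized by the range of the observations."""
    span = max(y_true) - min(y_true)
    widths = [hi - lo for lo, hi in zip(lower, upper)]
    return sum(widths) / (len(widths) * span)


# Made-up observations and interval limits.
y = [10.0, 20.0, 30.0, 40.0]
lo = [8.0, 15.0, 31.0, 35.0]
hi = [12.0, 25.0, 36.0, 45.0]
coverage = coverage_rate(y, lo, hi)
width = average_width_ratio(y, lo, hi)
```

The two indicators pull against each other: widening every interval raises coverage but also raises the width ratio, so a model is only better if it improves one without giving up the other.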

Drawings

To illustrate the embodiments of the present invention and the technical solutions of the prior art more clearly, the drawings required for the embodiments are briefly described below. The drawings described below show only some embodiments of the invention; a person skilled in the art can obtain other drawings from them without inventive effort.

FIG. 1 is an initial data diagram of an embodiment of the present invention;

FIG. 2 (a) is a box plot of the meteorological variables according to an embodiment of the present invention;

FIG. 2 (b) is a stacked histogram of correlation coefficients according to an embodiment of the present invention;

FIG. 3 is a comparison of the ordinary gradient and natural gradient calculation processes according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a Blending fusion step according to an embodiment of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.

Fig. 1 is an initial data diagram of an embodiment of the present invention. As shown in Fig. 1, all data were measured in 2019 by the SCADA system of a wind farm in northeast China. The data sampling interval is 15 min, and the 960 samples from January 1 to January 10 form the initial data set. The input meteorological variables are the wind direction, temperature, humidity, air pressure, and wind speed at the wind farm's collection point, and the output variable is wind power. Fig. 1 shows that wind power and the related meteorological variables are highly random and volatile, so probabilistic prediction of wind power is needed to extract the associated uncertainty information.

The wind power probability prediction model establishment method provided by the invention comprises the following steps:

Step S1 (removing outliers from the initial data set and, based on grey relational theory, selecting the meteorological variables whose degree of association with wind power exceeds a preset threshold as the training data set of the wind power probability prediction model) specifically includes:

S101, removing outliers from the initial data using a box plot;

As shown in Fig. 2 (a), in step S101, removing outliers from the initial data using a box plot specifically includes:

the lower and upper outlier cutoff limits are determined by formula (1):

Wind power is related to meteorological variables such as temperature and wind speed. However, this relationship varies from one wind farm to another, owing to factors such as the farm's geographic location and local microclimate. To improve the generalization ability and robustness of the model, grey relational analysis is used to screen the characteristic meteorological variables of the input data; the screening result is shown in Fig. 2 (b). Step S102, in which wind power is taken as the reference data series, the related meteorological variables are taken as comparison data series, each series is given initial-value processing, correlation coefficients are calculated based on grey relational theory to characterize the degree of association between the two groups of series, and the meteorological variables whose degree of association exceeds the threshold are selected as the training data set of the prediction model, specifically includes:

1) Normalize the time series of each variable. Take the k-th of the n meteorological variable series as the comparison series $S^k(t)$ and the wind power series as the reference series $S^0(t)$, and calculate their absolute difference series $\Delta^k(t)$ as in formula (2), where $k = 1, \dots, n$:

$\Delta^k(t) = |S^k(t) - S^0(t)|$ (2)

2) Calculate the correlation coefficient $\eta^k(t)$:

3) Calculate the degree of association $\gamma^k$:

where $T_n$ is the series length;

Step S2 (predicting the parameter vector of the wind power probability distribution from the training data set, relating the ordinary gradient to the natural gradient through the Fisher information, selecting classification and regression trees as base learners, and establishing an improved natural gradient boosting meta-model to update the parameter vector) specifically includes:

S202, establishing a scoring function $S(\theta, y_i)$ based on the Shannon information content of $y_i$:

$S(\theta, y_i) = -\log P_\theta(y_i)$ (8)

where the first-order term simplifies to:

and the remaining term is expressed as:

the natural gradient is thereby calculated from the ordinary gradient;

S205, updating the parameter vector θ of the wind power probability distribution:

The scale factor $\alpha^m$ is introduced to prevent training failure caused by the local approximation straying far from the current parameter position during calculation; its value is chosen by line search. The learning rate β is usually set to 0.1 or 0.01.
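One simple way to choose the scale factor by line search is backtracking: start from 1 and halve until the mean score on the training data improves. The sketch below assumes that rule, a Normal distribution parameterized by (μ, log σ), and hypothetical names and numbers; the patent does not spell out its search procedure.

```python
import math


def mean_score(params, ys):
    """Mean of -log N(y; mu, sigma) over samples; params = [(mu, log_sigma)]."""
    total = 0.0
    for (mu, ls), y in zip(params, ys):
        var = math.exp(2.0 * ls)
        total += ls + 0.5 * math.log(2.0 * math.pi) + (y - mu) ** 2 / (2.0 * var)
    return total / len(ys)


def line_search_scale(params, steps, ys, beta=0.1, max_halvings=20):
    """Halve alpha from 1.0 until the update lowers the mean score.

    steps holds the base-learner outputs (d_mu, d_log_sigma) per sample.
    Returns 0.0 if no tested scale improves the score.
    """
    base = mean_score(params, ys)
    alpha = 1.0
    for _ in range(max_halvings):
        trial = [
            (mu - beta * alpha * d_mu, ls - beta * alpha * d_ls)
            for (mu, ls), (d_mu, d_ls) in zip(params, steps)
        ]
        if mean_score(trial, ys) < base:
            return alpha
        alpha *= 0.5
    return 0.0


# Made-up state: the full step overshoots, half the step improves the score.
current = [(50.0, 0.0)] * 3            # mu = 50, sigma = 1 for every sample
observed = [48.0, 52.0, 55.0]
proposed = [(-40.0, 0.0)] * 3          # hypothetical base-learner outputs
alpha = line_search_scale(current, proposed, observed)
```

With β = 0.1, the full step moves μ from 50 to 54 and worsens the score, while the halved step moves it to 52 and improves it, so the search settles on 0.5.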

Building on step S1, the invention analyzes the shortcomings of the traditional Boosting method in wind power probability prediction, proposes a new natural gradient calculation method accordingly, relates the ordinary gradient to the natural gradient through the Fisher information, selects classification and regression trees as base learners, and establishes the natural gradient boosting meta-model. The shortcomings of the ordinary gradient are analyzed as follows:

When the traditional Boosting method solves a point prediction problem, its goal is to find the optimal approximation function $F(x_i)$ that minimizes the expected value of the loss function L on the training set. L and $F(x_i)$ can be expressed as:

where the subscript m denotes the m-th gradient boosting stage; the weight $\gamma_m$ is the optimal step size obtained by line search; and the gradient of L with respect to $F(x_i)$ is the ordinary gradient. Each iteration of the algorithm produces a new base learner along this direction to minimize the loss function, namely:

where d is an infinitesimal step by which $F(x_i)$ moves along the ordinary gradient direction.

Unlike point prediction, the probabilistic prediction problem aims to obtain the complete probability distribution $P_\theta(y_i)$ of $y_i$. When $P_\theta(y_i)$ is re-parameterized, the ordinary gradient calculated for the move from θ to θ+d differs from the one calculated for the corresponding move in the new parameterization, so it cannot faithfully reflect how the updated parameters move in distribution space. In other words, the ordinary gradient is not invariant to re-parameterization, which makes it deficient for probabilistic prediction problems.

To lay the groundwork for describing the natural gradient, and by analogy with the loss function in the point prediction problem, the invention establishes a scoring function $S(\theta, y_i)$ based on the Shannon information content of $y_i$:

$S(\theta, y_i) = -\log P_\theta(y_i)$ (8)

where $P_\theta(y_i)$ is the probability of $y_i$ under the predicted probability distribution, and θ is the parameter vector of that distribution.

Let Q be the actual distribution of $y_i$; then formula (9) always holds.

The difference between the right-hand and left-hand sides of the inequality is the divergence $D_{KL}(Q \| P)$ under the scoring function, which represents the discrepancy between the predicted distribution and the actual distribution, namely:

The natural gradient is the gradient defined by the KL divergence on the statistical manifold; it is the direction of steepest ascent in Riemannian space and is invariant to re-parameterization. At each iteration, a new base learner is generated for θ along the natural gradient direction to obtain the greatest improvement in score, namely:

where d' is an infinitesimal step vector by which θ moves along the natural gradient direction.
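In standard notation, the inequality, the divergence, and the steepest-direction characterization described in this passage are usually written as below. This is a sketch under assumed notation (scores are written with a distribution as first argument, and the tilde-nabla symbol denotes the natural gradient); it is not taken from the patent's own formulas.

```latex
% The true distribution Q attains the best expected score
\mathbb{E}_{y \sim Q}\left[ S(Q, y) \right] \le \mathbb{E}_{y \sim Q}\left[ S(P, y) \right]

% The excess score is the KL divergence when S is the log score
D_{KL}(Q \,\|\, P) = \mathbb{E}_{y \sim Q}\left[ S(P, y) \right]
    - \mathbb{E}_{y \sim Q}\left[ S(Q, y) \right]
    = \mathbb{E}_{y \sim Q}\left[ \log \frac{Q(y)}{P(y)} \right]

% Natural gradient: steepest direction when steps are measured by KL divergence
\tilde{\nabla} S(\theta, y_i) \propto \lim_{\epsilon \to 0}
    \arg\max_{d' \,:\, D_{KL}(P_\theta \,\|\, P_{\theta + d'}) = \epsilon}
    S(\theta + d', y_i)
```

Measuring step length by KL divergence instead of Euclidean distance in θ is what makes the resulting direction independent of how the distribution is parameterized.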

Because the concepts involved in solving for the natural gradient come from information geometry, they are inconvenient to popularize and apply in practical engineering. The invention therefore focuses on improving the natural gradient solution process: it relates the ordinary gradient to the natural gradient through the Fisher information and calculates the natural gradient from the ordinary gradient.

Unlike the point prediction models proposed by other inventions, the model building method proposed here targets wind power probability prediction and focuses on predicting the parameter vector θ of the wind power probability distribution; that is, it can predict the uncertainty information of wind power. The specific calculation process is as follows:

The calculation starts from the initial parameter vector $\theta^0$, which in essence fits the marginal distribution of $y_i$. Then, at the m-th iteration, the natural gradient is calculated for each $y_i$ and its corresponding parameter vector, and fitting along this direction generates a new group of base learners, thereby updating the parameter vector.

Fusing meta-models can strengthen the learning effect without making the overall model excessively redundant. At present, only some point prediction models in the field are fused, and they use Stacking. However, Stacking fusion is overly complex, and data leakage, in which training data reference global statistics, arises during calculation, so it is not suitable for wind power probability prediction.

To overcome these shortcomings, the invention uses the simpler Blending model fusion, which avoids data leakage, to fuse multiple improved natural gradient boosting meta-models; the fusion steps are shown in Fig. 4. Step S3 (performing Blending model fusion on the multiple improved natural gradient boosting meta-models and establishing a new meta-model for training so as to output the final predicted statistical parameter vector) specifically includes:

S302, model fusion:

forming a new data set from the predicted mean determined by DA_P and the actual results DA_OUT corresponding to the original DA data, establishing a new meta-model MO_DA, training it, and obtaining its prediction output MO_DA_P, where MO_DA_P is the corrected predicted statistical parameter vector; compared with DA_P, MO_DA_P has higher accuracy and smaller sharpness, which reflects the advantage of model fusion;

The principles and embodiments of the present invention have been described herein with reference to specific examples, the description of which is intended only to assist in understanding the methods of the present invention and the core ideas thereof; also, it is within the scope of the present invention to be modified by those of ordinary skill in the art in light of the present teachings. In view of the foregoing, this description should not be construed as limiting the invention.

Claims

1. A method for building a wind power probability prediction model, characterized by comprising the following steps:

S3, Blending model fusion: performing Blending model fusion on multiple improved natural gradient boosting meta-models, establishing a new meta-model for training, and outputting the final predicted statistical parameter vector;

S101, removing outliers from the initial data using a box plot;

S102, taking wind power as the reference data series and the related meteorological variables as comparison data series, applying initial-value processing to each series, calculating correlation coefficients based on grey relational theory to characterize the degree of association between the two groups of series, and selecting the meteorological variables whose degree of association exceeds the threshold as the training data set of the prediction model;

in step S101, removing outliers from the initial data using a box plot specifically includes:

the lower and upper outlier cutoff limits are determined by formula (1):

where min and max denote the lower and upper cutoff limits of the data; $Q_1$ and $Q_3$ denote the lower and upper quartiles, respectively; and $IQR = Q_3 - Q_1$;

taking wind power as the reference data series and the related meteorological variables as comparison data series, applying initial-value processing to each series, calculating correlation coefficients based on grey relational theory to characterize the degree of association between the two groups of series, and selecting the meteorological variables whose degree of association exceeds the threshold as the training data set of the prediction model specifically includes:

$\Delta^k(t) = |S^k(t) - S^0(t)|$ (2)

2) calculating the correlation coefficient $\eta^k(t)$:

3) calculating the degree of association $\gamma^k$:

where $T_n$ is the series length;

4) setting a threshold and selecting the meteorological variables whose degree of association exceeds it to form the training data set;

$S(\theta, y_i) = -\log P_\theta(y_i)$ (8)

S203, let $f(\theta) = -\log P_\theta(y_i)$, perform a Taylor expansion of $f(\theta + d')$, and discard the remainder terms of third order and above:

where the first-order term simplifies to:

and the remaining term is expressed as:

the natural gradient is thereby calculated from the ordinary gradient;

S205, updating the parameter vector θ of the wind power probability distribution:

where $\theta^0$ is the initial parameter vector, $\alpha^m$ is the scale factor, β is the uniform learning rate, and $B^m$ is the unified representation of the base learners;

in step S3, performing Blending model fusion on the multiple improved natural gradient boosting meta-models and establishing a new meta-model for training so as to output the final predicted statistical parameter vector specifically includes:

S302, model fusion: