CN114912577A - Wind power plant short-term wind speed prediction method combining VMD and attention mechanism - Google Patents

Info

Publication number: CN114912577A
Application number: CN202210425233.0A
Authority: CN (China)
Legal status: Pending
Inventors: 季培远, 赵英男, 陈飞, 季冠岚
Original and current assignee: Nanjing University of Information Science and Technology
Application filed by Nanjing University of Information Science and Technology

Classifications

    • G06N3/045 — Combinations of networks
    • G06N3/048 — Activation functions
    • G06N3/049 — Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N3/082 — Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06Q10/04 — Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06Q50/06 — Energy or water supply
    • Y04S10/50 — Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications


Abstract

The invention discloses a wind power plant short-term wind speed prediction method combining VMD (variational mode decomposition) and an attention mechanism. The method first obtains spatio-temporal wind speed data of a target station and establishes a series of spatial wind speed matrices (SWSMs); the SWSMs are decomposed by VMD into sub-SWSMs, from which an SEnet model combining CNN with an attention mechanism extracts spatial features; the GRU model at the top layer then extracts the wind speed time-domain features and obtains the respective prediction results; all prediction results are accumulated to obtain the final predicted wind speed. The method makes full use of the spatio-temporal correlation of the wind speed, combines VMD with the attention mechanism to improve on the non-stationary character of the original wind speed, and optimizes the CNN-GRU model with the attention mechanism so that the model more easily captures long-range interdependent features in the sequence, effectively improving the wind speed prediction accuracy and ensuring reliable operation of the power system.

Description

Wind power plant short-term wind speed prediction method combining VMD and attention mechanism
Technical Field
The invention relates to the technical field of new energy power generation and deep learning, and in particular to a wind power plant short-term wind speed prediction method combining VMD (variational mode decomposition) and an attention mechanism.
Background
Demand for renewable energy as a solution to future energy shortages is increasing, and many conventional power generation systems are being replaced by renewable energy systems. Wind energy, as one of the most promising, practical, abundant and environmentally friendly renewable resources in the world, has gained wide attention and utilization worldwide. Further development of wind power generation technology is therefore required.
Accurate short-term wind speed prediction is of great significance for the operation and control of the power system: it helps to reasonably schedule wind power integration, reduces the voltage and frequency fluctuations caused by wind power variation, and improves the operational reliability of the power grid. At present, wind speed prediction technologies can be divided into three categories: physical models, statistical models and artificial intelligence models. The physical model is represented by numerical weather prediction, which uses real-time weather conditions for forecasting; because the modeling process requires a large amount of computation, it is generally used for long-term wind speed prediction over a specific area and is not suitable for short-term and ultra-short-term wind speed prediction. The statistical method learns a nonlinear mapping over historical wind speed data and realizes time series prediction. Artificial intelligence models are based on machine learning techniques; they describe the complex nonlinear relationship between system input and output from a large amount of wind speed time series data. With the rapid development of deep learning, deep learning techniques have also been quickly applied to short-term wind speed prediction; these methods combine existing wind speed prediction techniques with hybrid neural network models and obtain good prediction results.
Wind power generation is intermittent, volatile and uncertain. In practical applications, a decomposition method with reasonably controlled convergence conditions is therefore usually applied to obtain relatively stable subsequences. In addition, when the input time sequence is long, networks such as LSTM and GRU easily lose sequence information and have difficulty modeling the structural information between data, which also affects the accuracy of wind speed prediction.
Disclosure of Invention
The invention discloses a wind power plant short-term wind speed prediction method combining VMD (variational mode decomposition) and an attention mechanism, and aims to solve the technical problem that, when the input time sequence is long, networks such as LSTM and GRU easily lose sequence information and have difficulty modeling the structural information between data, which affects the accuracy of wind speed prediction.
In order to achieve the purpose, the invention adopts the following technical scheme:
A wind power plant short-term wind speed prediction method combining VMD and an attention mechanism specifically comprises the following steps:
Step 1: establishing a wind speed data set with two dimensions, time and space;
Step 2: establishing a series of SWSMs from the original data, the SWSMs comprising the time-domain and space-domain characteristics of the wind speed data set;
Step 3: performing wind speed decomposition on the SWSM of each time sequence by using the VMD to obtain sub-SWSMs consisting of the IMFs;
Step 4: combining the CNN model with the attention mechanism to obtain the SEnet model;
Step 5: applying the SEnet model obtained in step 4 to each sub-SWSM to extract the spatial-domain features of the wind speed;
Step 6: processing the spatial-domain features obtained in step 5 with a GRU model to extract time-domain features and obtain each prediction component of the wind speed;
Step 7: giving different weights to the input features through an attention layer based on the attention mechanism;
Step 8: combining the prediction results to obtain the final predicted wind speed.
In a preferred scheme, in step 1, the wind speed data set is established to include two dimensions, time and space. For the original wind speed data set, the wind speed at the predicted time and position is set as the label wind speed, and the data set and the label wind speeds are then proportionally divided, in time order, into a training set, a validation set and a test set.
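The chronological split described in this preferred scheme can be sketched as follows. This is a minimal illustration: the 70/15/15 ratio and the helper name `chronological_split` are assumptions for the example, not values fixed by the patent.

```python
import numpy as np

def chronological_split(samples, labels, ratios=(0.7, 0.15, 0.15)):
    """Split samples/labels into train/val/test sets in time order
    (no shuffling, so the test set is strictly later than training data).
    The ratio 0.7/0.15/0.15 is illustrative only."""
    n = len(samples)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    return ((samples[:n_train], labels[:n_train]),
            (samples[n_train:n_train + n_val], labels[n_train:n_train + n_val]),
            (samples[n_train + n_val:], labels[n_train + n_val:]))

# toy example: 100 time-ordered samples; the label is the wind speed
# at the next (predicted) time step
X = np.arange(100)
y = np.arange(100) + 1
train_set, val_set, test_set = chronological_split(X, y)
print(len(train_set[0]), len(val_set[0]), len(test_set[0]))  # 70 15 15
```

Splitting in time order rather than at random keeps the evaluation honest for forecasting: the model is always validated and tested on data that lies after its training window.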
In a preferred embodiment, in step 2, the SWSM is established as follows:
Assume the object of study is an array of M rows and N columns over a spatial region, which can be represented by an M × N grid; the position of each site in the array is indexed by two-dimensional rectangular coordinates (i, j) (1 ≤ i ≤ M, 1 ≤ j ≤ N), and for each site the wind speed is a one-dimensional time series. At time t, the spatial wind speed matrix (SWSM) over the sites, $X_t \in \mathbb{R}^{M \times N}$ with entries $x_t(i,j)$, is defined as

$$X_t = \begin{pmatrix} x_t(1,1) & x_t(1,2) & \cdots & x_t(1,N) \\ x_t(2,1) & x_t(2,2) & \cdots & x_t(2,N) \\ \vdots & \vdots & \ddots & \vdots \\ x_t(M,1) & x_t(M,2) & \cdots & x_t(M,N) \end{pmatrix}$$

The wind speed sequences are converted to SWSMs by the above method.
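The conversion from per-site series to SWSMs can be sketched as below. The function name and the dictionary-based input layout are assumptions for illustration; the patent only fixes the resulting M × N matrix per time step.

```python
import numpy as np

def build_swsms(site_series, M, N):
    """Assemble SWSMs from per-site one-dimensional wind speed series.

    site_series: dict mapping site coordinates (i, j), 1 <= i <= M and
                 1 <= j <= N, to a length-T wind speed sequence.
    Returns an array of shape (T, M, N): X[t] is the SWSM at time t,
    with X[t][i-1, j-1] = x_t(i, j).
    """
    T = len(next(iter(site_series.values())))
    X = np.empty((T, M, N))
    for (i, j), series in site_series.items():
        X[:, i - 1, j - 1] = series
    return X

# toy 2x2 array of sites, 3 time steps each (values in m/s)
sites = {(1, 1): [5.0, 5.2, 5.1], (1, 2): [4.8, 4.9, 5.0],
         (2, 1): [6.1, 6.0, 5.9], (2, 2): [5.5, 5.6, 5.4]}
swsm = build_swsms(sites, M=2, N=2)
print(swsm.shape)  # (3, 2, 2): one 2x2 SWSM per time step
```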
In a preferred embodiment, in step 3, the main steps of the VMD decomposition are as follows:
S21: first, construct the variational problem: each decomposed sequence should be a modal component with limited bandwidth around a center frequency, while the sum of the estimated bandwidths of all modes is minimized. Denoting the preprocessed wind speed signal by $f(t)$, the corresponding constrained variational expression is

$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\}$$

$$\text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)$$

where K is the number of modes to be decomposed (a positive integer), $\{u_k\}$ and $\{\omega_k\}$ are the k-th modal component and its center frequency after decomposition, $\delta(t)$ is the Dirac function, and $*$ denotes convolution;
S22: to solve S21, introduce a Lagrangian multiplier $\lambda$ and convert the constrained problem into an unconstrained one, obtaining the augmented Lagrangian expression

$$L(\{u_k\},\{\omega_k\},\lambda) = \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\ f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle$$

where $\alpha$ is a penalty factor used to reduce the influence of Gaussian noise;
S23: finally, solve the unconstrained variational problem with the alternating direction method of multipliers (ADMM) iterative algorithm, obtain each modal component and center frequency by optimization, search for the saddle point of the augmented Lagrangian function, and iteratively update the parameters $\{u_k\}$, $\{\omega_k\}$ and $\lambda$:

$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha (\omega - \omega_k)^2}$$

$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega\, |\hat{u}_k^{n+1}(\omega)|^2\, d\omega}{\int_0^{\infty} |\hat{u}_k^{n+1}(\omega)|^2\, d\omega}$$

$$\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega) + \gamma \left( \hat{f}(\omega) - \sum_{k} \hat{u}_k^{n+1}(\omega) \right)$$

where $\hat{f}(\omega)$, $\hat{u}_i(\omega)$, $\hat{\lambda}(\omega)$ and $\hat{u}_k^{n+1}(\omega)$ are the Fourier transforms of $f(t)$, $u_i(t)$, $\lambda(t)$ and $u_k^{n+1}(t)$ respectively, n is the iteration number, and $\gamma$ is a noise tolerance used to meet the fidelity requirement of the signal decomposition;
S24: finally, for a given decision accuracy $e > 0$, the iteration stops when

$$\sum_{k=1}^{K} \frac{\left\| \hat{u}_k^{n+1} - \hat{u}_k^{n} \right\|_2^2}{\left\| \hat{u}_k^{n} \right\|_2^2} < e$$

is satisfied; otherwise it returns to S23. Finally, the K decomposed IMF components are obtained. Decomposing each SWSM by the VMD method yields sub-SWSMs composed of the IMFs: at time t, the sub-SWSM formed by the component $\mathrm{IMF}_k$ of the sites is defined as

$$X_t^{(k)} = \begin{pmatrix} \mathrm{IMF}_k^t(1,1) & \cdots & \mathrm{IMF}_k^t(1,N) \\ \vdots & \ddots & \vdots \\ \mathrm{IMF}_k^t(M,1) & \cdots & \mathrm{IMF}_k^t(M,N) \end{pmatrix}$$
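The ADMM loop of S21–S24 can be sketched compactly in the frequency domain. This is a deliberately simplified sketch, not a production VMD: it omits boundary mirroring, tracks center frequencies on the positive half-spectrum only, uses an aggregate convergence check instead of the per-mode sum, and with `gamma=0` the Lagrangian update is disabled. All parameter defaults are illustrative assumptions.

```python
import numpy as np

def vmd(f, K=3, alpha=2000.0, gamma=0.0, e=1e-6, max_iter=500):
    """Minimal VMD sketch implementing the ADMM updates of S21-S24."""
    T = len(f)
    omega_grid = np.fft.fftfreq(T)                    # frequency axis (cycles/sample)
    f_hat = np.fft.fft(np.asarray(f, dtype=float))
    u_hat = np.zeros((K, T), dtype=complex)           # mode spectra
    omega = np.linspace(0.0, 0.5, K, endpoint=False)  # initial center frequencies
    lam_hat = np.zeros(T, dtype=complex)              # Lagrangian multiplier spectrum
    half = slice(0, T // 2)                           # positive half-spectrum
    for _ in range(max_iter):
        u_prev = u_hat.copy()
        for k in range(K):
            others = u_hat.sum(axis=0) - u_hat[k]
            # Wiener-filter update of mode k (first S23 formula)
            u_hat[k] = (f_hat - others + lam_hat / 2) / (
                1.0 + 2.0 * alpha * (omega_grid - omega[k]) ** 2)
            # power-weighted mean -> new center frequency (second S23 formula)
            power = np.abs(u_hat[k, half]) ** 2
            omega[k] = np.sum(omega_grid[half] * power) / (power.sum() + 1e-14)
        # dual ascent on the reconstruction constraint (third S23 formula)
        lam_hat = lam_hat + gamma * (f_hat - u_hat.sum(axis=0))
        # aggregate version of the S24 stopping criterion
        diff = np.sum(np.abs(u_hat - u_prev) ** 2) / (np.sum(np.abs(u_prev) ** 2) + 1e-14)
        if diff < e:
            break
    return np.real(np.fft.ifft(u_hat, axis=1)), omega  # IMFs, center frequencies

# decompose a two-tone test signal into K=2 modes
t = np.linspace(0, 1, 256, endpoint=False)
sig = np.sin(2 * np.pi * 4 * t) + 0.5 * np.sin(2 * np.pi * 24 * t)
imfs, centers = vmd(sig, K=2, alpha=500.0)
print(imfs.shape)  # (2, 256)
```

Each row of `imfs` is one band-limited component; applying this per site and per time window is what turns an SWSM series into the sub-SWSM series used downstream.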
In a preferred scheme, in step 4, the core idea of combining CNN with attention, as used in SEnet, is to automatically learn the importance of each feature channel and then, according to that importance, promote useful features and suppress features that are of little use to the current task. This function is implemented by SE blocks: the SEnet layer comprises a convolution layer, two SE block layers and then another convolution layer, and each SE block contains a convolution layer, a global average pooling layer, two activation layers and a fusion layer. The SEnet layer is constructed as follows:
S41: first, $F_{tr}$ is a transformation operation, a standard convolution with input X and output U, defined as

$$u_c = v_c * X = \sum_{s=1}^{C'} v_c^s * x^s$$

where $v_c$ denotes the c-th convolution kernel, $x^s$ the s-th input channel, $*$ the convolution operation, $u_c$ the c-th 2D matrix in the 3D matrix U, and $v_c^s$ the 2D spatial kernel of $v_c$ acting on the corresponding channel of X;
S42: next is the Squeeze operation, which is in fact a global average pooling operation that compresses the spatial features, converting the W × H × C input into a 1 × 1 × C output, where W denotes the width of a channel, H its height, and C the number of channels:

$$z_c = F_{sq}(u_c) = \frac{1}{W \times H} \sum_{i=1}^{W} \sum_{j=1}^{H} u_c(i, j)$$

S43: then comes the Excitation operation:

$$s = \sigma(g(z, W)) = \sigma\left(W_2\, \delta(W_1 z)\right)$$

where $W_1 z$ is a fully connected layer operation; $W_1$ has dimension C/r × C, with r a scaling parameter, and since z has dimension 1 × 1 × C the result $W_1 z$ has dimension 1 × 1 × C/r; $\delta$ is the ReLU function and does not change the dimension of its output; the result is then multiplied by $W_2$, also a fully connected layer operation, and since $W_2$ has dimension C × C/r the output dimension is 1 × 1 × C; finally $\sigma$, the sigmoid function, yields the weight matrix s;
S44: finally, the 3D matrix U is re-weighted through the weight matrix s:

$$H_S = s_c \cdot u_c$$

where $u_c$ is a two-dimensional matrix and $s_c$ is a weight; this equation multiplies each value in the matrix $u_c$ by $s_c$.
In a preferred scheme, this model is used to extract the spatial-domain features of the wind speed. The sub-SWSM matrices obtained by VMD decomposition are input to the SEnet layer. In the convolution layer, a convolution kernel convolves and activates the input image to obtain the feature map of the convolution block. This feature map is input to the convolution layer in the SE block, the two activation layers assign weights to the channels of the convolution kernels, and global average pooling produces a new feature map. The new feature map is flattened to reduce its dimension while keeping the spatial features; the reduced feature map can then be used as the input of the GRU layer.
In a preferred scheme, in step 6, feature extraction on the time series is implemented by the GRU layer. The GRU has two gates: a reset gate, which determines how to combine the new input information with the previous memory, and an update gate, which defines how much of the previous memory is kept at the current time step. The two gating vectors determine which information is ultimately used as the output of the gated recurrent unit; the unit can hold information over long sequences without it being cleared over time or removed for being irrelevant to the prediction.
The gates and activation functions of the GRU are computed as follows:
S61: activation function Sigmoid:

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

S62: activation function tanh:

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$

S63: update gate: $z_t = \sigma(W_z \cdot [h_{t-1}, x_t])$
S64: reset gate: $r_t = \sigma(W_r \cdot [h_{t-1}, x_t])$
S65: new memory (using the reset gate):

$$\tilde{h}_t = \tanh(W \cdot [r_t * h_{t-1}, x_t])$$

S66: output value:

$$h_t = (1 - z_t) * h_{t-1} + z_t * \tilde{h}_t$$

where $\sigma$ is the Sigmoid activation function, tanh is the tanh activation function, $z_t$ and $r_t$ are the update gate and reset gate respectively, $x_t$ is the input, $h_t$ is the output of the hidden layer, and $\tilde{h}_t$ summarizes the input $x_t$ and the past hidden layer state $h_{t-1}$; $W_z$, $W_r$ and W are the weights of the update gate, reset gate and candidate output respectively.
In a preferred scheme, in step 7, an attention layer is added after the GRU layer to give different weights to the model's input features, strengthening the influence of important information so as to avoid the problem of long-range information loss and making it easier for the model to capture long-range interdependent features in the sequence. The input of the attention layer is the activated output vector $h_t$ of the GRU layer; the probabilities corresponding to the different feature vectors are computed according to the weight allocation principle, and a better weight parameter matrix is obtained by continuous updating and iteration. The weight coefficients of the attention layer are computed as

$$e_t = u \tanh(w h_t + b)$$

$$a_t = \frac{\exp(e_t)}{\sum_{j=1}^{t} \exp(e_j)}$$

$$s_t = \sum_{i=1}^{t} a_i h_i$$

where $e_t$ is the attention probability distribution value determined by the output vector $h_t$ of the GRU network layer at time t, u and w are weight coefficients, b is a bias coefficient, and the output of the attention layer at time t is denoted $s_t$.
In a preferred embodiment, in step 8, the prediction results are combined in the output layer and computed by the fully connected layer, giving the output $Y = [y_1, y_2, \cdots, y_m]^T$ with prediction step size m. During prediction, an early stopping mechanism monitors the model: when the training error has not improved within a certain number of training epochs, training stops; otherwise training continues until the originally set number of epochs finishes. The prediction formula is

$$y_t = \mathrm{Sigmoid}(w_o s_t + b_o)$$

where $y_t$ is the predicted output value at time t, $w_o$ is a weight matrix, $b_o$ is a bias vector, and the activation function is the Sigmoid function.
Compared with the prior art, the invention has the following advantages:
(1) The VMD method is used to process the wind speed data, converting the non-stationary wind speed sequence into relatively stable subsequences and improving the wind speed prediction accuracy.
(2) Considering that irrelevant features in the data can degrade the model's performance, the attention mechanism is used to reallocate the feature weights and improve the performance of the model.
(3) The underlying architecture of the algorithm adopts a CNN-GRU model, which can process the spatio-temporal characteristics of the wind speed and predict the wind speed using its spatio-temporal correlation, improving the prediction accuracy.
Drawings
FIG. 1 is a wind speed prediction flow chart of a wind farm short-term wind speed prediction method combining a VMD and an attention mechanism according to the present invention.
FIG. 2 is a schematic diagram of a wind speed sequence converted into SWSM according to the wind farm short-term wind speed prediction method combining a VMD and an attention mechanism.
FIG. 3 is a schematic diagram of a VMD decomposed SWSM of a wind farm short-term wind speed prediction method combining a VMD and an attention mechanism according to the present invention.
Fig. 4 is an architecture diagram of a GRU network of a wind farm short-term wind speed prediction method combining a VMD and an attention mechanism according to the present invention.
FIG. 5 is a structural diagram of SENET of a wind farm short-term wind speed prediction method combining a VMD and an attention mechanism.
FIG. 6 is a schematic diagram of a wind speed prediction result with a prediction time of 20 minutes of the wind farm short-term wind speed prediction method combining a VMD and an attention mechanism provided by the invention.
FIG. 7 is a comparison graph of different model prediction errors RMSE (m/s) of the wind power plant short-term wind speed prediction method combining the VMD and the attention mechanism.
FIG. 8 is a comparison graph of different model prediction errors MAPE (m/s) of the wind farm short-term wind speed prediction method combining the VMD and the attention mechanism.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments.
As shown in FIGS. 1-8, the wind farm short-term wind speed prediction method combines VMD and an attention mechanism, adopting variational mode decomposition and an attention mechanism to improve the accuracy of wind speed prediction. First, the wind speed is denoised with the variational mode decomposition technique to obtain an optimized wind speed matrix. Then, using the wind speeds at the same time instant across the wind farm, the spatial information is converted into visual information through a gray image and processed with a deep convolutional neural network well suited to visual information; spatial features are extracted using an SEnet layer, formed by combining CNN with an attention mechanism, as a reinforcing network. Next, the GRU is combined with an attention mechanism to extract temporal features. Finally, the wind speed prediction result is obtained. The method comprises the following steps:
step 1: and establishing a wind speed data set with two dimensions of time and space.
Step 2: a series of SWSMs are built from the raw data, the SWSMs including time-domain and space-domain characteristics of the wind speed data set.
And step 3: and performing wind speed decomposition on the SWSM on each time sequence by using the VMD, and decomposing to obtain sub-SWSM consisting of the IMFs.
And 4, step 4: combining the CNN model with the attention mechanism to obtain the SENEt model.
And 5: and (4) aiming at each sub SWSM, applying the step 4 to obtain a SEnet model, and extracting the spatial domain characteristics of the wind speed.
Step 6: and (5) processing the spatial domain characteristics obtained in the step (5) by applying a GRU model, extracting time domain characteristics and obtaining each prediction component of the wind speed.
And 7: the features of the input are given different weights by the attention layer based on the attention mechanism.
And 8: and combining the prediction results and obtaining the final predicted wind speed.
In a preferred embodiment, in step 1, the established wind speed data set comprises both time and space dimensions. For the original wind speed data set, the wind speed at the predicted time and position is set as the label wind speed; the data set and label wind speeds are then proportionally divided, in time order, into a training set, a validation set and a test set.
As shown in detail in fig. 3, in a preferred embodiment, in step 2, the SWSM is established by the following method: assume the object of study is an array of M rows and N columns over a spatial region, which can be represented by an M × N grid; the position of each site in this array is indexed by two-dimensional rectangular coordinates (i, j) (1 ≤ i ≤ M, 1 ≤ j ≤ N), and the wind speed is a one-dimensional time series for each site. Thus, at time t, the spatial wind speed matrix (SWSM) over the sites, $X_t \in \mathbb{R}^{M \times N}$ with entries $x_t(i,j)$, is defined as

$$X_t = \begin{pmatrix} x_t(1,1) & \cdots & x_t(1,N) \\ \vdots & \ddots & \vdots \\ x_t(M,1) & \cdots & x_t(M,N) \end{pmatrix}$$
In a preferred embodiment, in step 3, the main steps of the VMD decomposition are as follows:
(1) First, construct the variational problem: each decomposed sequence should be a modal component with limited bandwidth around a center frequency, while the sum of the estimated bandwidths of all modes is minimized. Denoting the preprocessed wind speed signal by $f(t)$, the corresponding constrained variational expression is

$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\}$$

$$\text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)$$

where K is the number of modes to be decomposed (a positive integer), $\{u_k\}$ and $\{\omega_k\}$ are the k-th modal component and its center frequency after decomposition, $\delta(t)$ is the Dirac function, and $*$ denotes convolution;
(2) To solve (1), introduce a Lagrangian multiplier $\lambda$ and convert the constrained problem into an unconstrained one, obtaining the augmented Lagrangian expression

$$L(\{u_k\},\{\omega_k\},\lambda) = \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\ f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle$$

where $\alpha$ is a penalty factor used to reduce the effect of Gaussian noise.
(3) Finally, solve the unconstrained variational problem with the alternating direction method of multipliers (ADMM) iterative algorithm, obtain each modal component and center frequency by optimization, search for the saddle point of the augmented Lagrangian function, and iteratively update the parameters $\{u_k\}$, $\{\omega_k\}$ and $\lambda$:

$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha (\omega - \omega_k)^2}$$

$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega\, |\hat{u}_k^{n+1}(\omega)|^2\, d\omega}{\int_0^{\infty} |\hat{u}_k^{n+1}(\omega)|^2\, d\omega}$$

$$\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega) + \gamma \left( \hat{f}(\omega) - \sum_{k} \hat{u}_k^{n+1}(\omega) \right)$$

where $\hat{f}(\omega)$, $\hat{u}_i(\omega)$, $\hat{\lambda}(\omega)$ and $\hat{u}_k^{n+1}(\omega)$ are the Fourier transforms of $f(t)$, $u_i(t)$, $\lambda(t)$ and $u_k^{n+1}(t)$ respectively; n is the number of iterations; $\gamma$ is a noise tolerance for meeting the fidelity requirements of the signal decomposition.
(4) Finally, for a given decision accuracy $e > 0$, the iteration stops if

$$\sum_{k=1}^{K} \frac{\left\| \hat{u}_k^{n+1} - \hat{u}_k^{n} \right\|_2^2}{\left\| \hat{u}_k^{n} \right\|_2^2} < e$$

is satisfied; otherwise it returns to step (3). Finally, the K decomposed IMF components are obtained.
Decomposing each SWSM by the VMD method yields sub-SWSMs composed of the IMFs: at time t, the sub-SWSM formed by the component $\mathrm{IMF}_k$ of the sites is defined as

$$X_t^{(k)} = \begin{pmatrix} \mathrm{IMF}_k^t(1,1) & \cdots & \mathrm{IMF}_k^t(1,N) \\ \vdots & \ddots & \vdots \\ \mathrm{IMF}_k^t(M,1) & \cdots & \mathrm{IMF}_k^t(M,N) \end{pmatrix}$$
In a preferred embodiment, in the step 4, the core idea of CNN and attention combination used in the SENet model is to automatically acquire the importance degree of each feature channel by means of learning, and then to promote useful features and suppress features that are not useful for the current task according to the importance degree. This function is implemented by the SE block. Whereas in the sense layer one convolutional layer is included, two SE block layers are included, followed by another convolutional layer. In each SE block, a convolutional layer, a global averaging pooling layer, two active layers, and a fusion layer are provided. The structure of the SE block is shown in fig. 4.
The following is the procedure for the SENet layer construction:
(1) first F tr Is a conversion operation, which is a standard convolution operation with an input of X and an output of U, and its defining formula is as follows:
Figure BDA0003608236460000141
in the formula, v c Denotes the c-th convolution kernel, X s Represents the s-th input, represents the convolution operation. u. of c Represents the c-th 2D matrix in the 3D matrix U,
Figure BDA0003608236460000143
then the 2D spatial kernel for the X corresponding channel, which represents v c Of the single channel of (a).
(2) Then is the Squeeze operation, which is actually a global average pooling operation to compress the spatial features, converting the W × H × C input to a 1 × 1 × C output, where W denotes the width of the channel and H denotes the height of the channel, for a total of C channels. The Squeeze operation formula is as follows:
Figure BDA0003608236460000142
(3) the following is the Excitation operation, which has the following formula:
s=σ(g(z,W))=σ(W 2 δ(W 1 z))
in the formula, W 1 z is a full link layer operation, W 1 Is C/r C, where r is a scaling parameter, and W is the z dimension 1C 1 The result of z is 1 x 1C/r; δ is the ReLU function, and does not change the dimensionality of the output; then W is further mixed with 2 Multiply by and W 2 The multiplication process is also a full link layer process, W 2 The dimension of (d) is C/r, so the output dimension is 1C; finally, sigma, namely a sigmoid function, is used to obtain a weight matrix s.
(4) Finally, the 3D matrix U is reweighted by the weight matrix s according to the formula:

H_S = s_c · u_c

In the formula, u_c is a two-dimensional matrix and s_c is a weight; this equation multiplies each value in the matrix u_c by s_c.
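Under the definitions above, the Squeeze, Excitation, and reweighting steps can be sketched in a few lines of NumPy (a minimal illustration only: the function name, the random weights, and the channel-first layout are assumptions; the reduction ratio r and the shapes of W_1 and W_2 follow the formulas):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(U, W1, W2):
    """Squeeze-and-Excitation over a feature map U of shape (C, H, W)."""
    # Squeeze: global average pooling gives one descriptor z_c per channel
    z = U.mean(axis=(1, 2))                    # shape (C,)
    # Excitation: s = sigmoid(W2 · ReLU(W1 · z)); W1: (C/r, C), W2: (C, C/r)
    s = sigmoid(W2 @ np.maximum(W1 @ z, 0.0))  # shape (C,), each s_c in (0, 1)
    # Reweight: multiply every value of the 2D matrix u_c by its weight s_c
    return U * s[:, None, None]

rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
U = rng.standard_normal((C, H, W))
out = se_block(U, rng.standard_normal((C // r, C)), rng.standard_normal((C, C // r)))
print(out.shape)  # (8, 4, 4)
```

Because each s_c lies strictly between 0 and 1, channels judged unimportant are attenuated while important ones are largely preserved.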
According to the SENet model described in step 4, the spatial features of the wind speed are extracted with this model. The sub-SWSM matrices obtained by VMD decomposition are input to the SENet layer, and the convolution kernels in the convolutional layer perform convolution and activation on the input to obtain the feature map of the convolution block. This feature map is input to the convolutional layer in the SE block, the two activation layers assign weights to the channels of the convolution kernels, and global average pooling yields a new feature map. The new feature map is flattened to reduce its dimensionality while preserving the spatial features; the reduced feature map serves as the input of the GRU layer.
In a preferred embodiment, in step 6, feature extraction along the time series is implemented by the GRU layer. The GRU has two gates: a reset gate, which determines how new input information is combined with the previous memory, and an update gate, which defines how much of the previous memory is carried to the current time step. These two gating vectors determine which information is ultimately output by the gated recurrent unit; they allow information in long sequences to be retained rather than cleared over time or removed for being irrelevant to the prediction.
The gates and activation functions of the GRU are calculated as follows:
(1) Activation function Sigmoid:

σ(x) = 1 / (1 + e^{-x})

(2) Activation function tanh:

tanh(x) = (e^x − e^{-x}) / (e^x + e^{-x})

(3) Update gate: z_t = σ(W_z · [h_{t-1}, x_t])

(4) Reset gate: r_t = σ(W_r · [h_{t-1}, x_t])

(5) New memory (using the reset gate): h̃_t = tanh(W · [r_t ⊙ h_{t-1}, x_t])

(6) Output value: h_t = (1 − z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t
In the formulas, σ is the Sigmoid activation function and tanh is the tanh activation function. The update gate and the reset gate are z_t and r_t, respectively. x_t is the input and h_t is the output of the hidden layer. h̃_t is the candidate state summarizing the input x_t and the past hidden state h_{t-1}; W_z, W_r, and W are the weights of the update gate, the reset gate, and the candidate output, respectively.
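The six formulas above can be traced in a small NumPy sketch of one GRU step (illustrative only: bias terms are omitted and each weight matrix is assumed to act on the concatenation [h_{t-1}, x_t]):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x_t, h_prev, Wz, Wr, W):
    """One GRU time step following the gate equations above (biases omitted)."""
    hx = np.concatenate([h_prev, x_t])
    z = sigmoid(Wz @ hx)                                      # update gate z_t
    r = sigmoid(Wr @ hx)                                      # reset gate r_t
    h_tilde = np.tanh(W @ np.concatenate([r * h_prev, x_t]))  # new memory
    return (1.0 - z) * h_prev + z * h_tilde                   # output h_t

rng = np.random.default_rng(1)
n_in, n_hid = 3, 5
x_t = rng.standard_normal(n_in)
h_prev = np.zeros(n_hid)
Wz, Wr, W = (rng.standard_normal((n_hid, n_hid + n_in)) for _ in range(3))
h_t = gru_cell(x_t, h_prev, Wz, Wr, W)
print(h_t.shape)  # (5,)
```

The update gate z_t interpolates between keeping the previous state and adopting the candidate h̃_t, which is how the unit decides how much past memory to carry forward.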
In a preferred embodiment, in step 7, an attention layer is added after the GRU layer. By assigning different weights to the input features of the model, the influence of important information is strengthened and the loss of long-range information in the sequence is avoided, so that the model can more easily capture long-range dependencies within the sequence. The input of the attention layer is the output vector h_t produced by the GRU layer activation; the probabilities corresponding to the different feature vectors are computed according to a weight-distribution principle, and a better weight parameter matrix is obtained by continuous updating and iteration. The weight coefficients of the attention layer are computed as follows:
e_t = u tanh(w h_t + b)

a_t = exp(e_t) / Σ_{j=1}^{T} exp(e_j)

s_t = Σ_{t=1}^{T} a_t · h_t
In the formulas, e_t denotes the attention probability distribution value determined by the output vector h_t of the GRU network layer at time t; u and w are weight coefficients; b is a bias coefficient; a_t is the normalized attention weight; the output of the attention layer at time t is denoted by s_t.
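A NumPy sketch of this weighting (the shapes and parameter names are illustrative assumptions; here the normalization runs over all T time steps):

```python
import numpy as np

def attention_pool(H, u, w, b):
    """Weight GRU outputs H (T, d) by attention scores e_t = u·tanh(w·h_t + b)."""
    e = np.tanh(H @ w.T + b) @ u  # scores e_t, shape (T,)
    a = np.exp(e - e.max())
    a /= a.sum()                  # normalized weights a_t (sum to 1)
    return a @ H, a               # context vector s = sum_t a_t * h_t

rng = np.random.default_rng(2)
T, d = 6, 4
H = rng.standard_normal((T, d))
s, a = attention_pool(H, rng.standard_normal(d),
                      rng.standard_normal((d, d)), rng.standard_normal(d))
print(s.shape)  # (4,)
```

Subtracting e.max() before the exponential is a standard numerical-stability trick and does not change the normalized weights.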
In a preferred embodiment, in step 8, the prediction results are combined in the output layer and computed through the fully connected layer, predicting an output Y = [y_1, y_2, …, y_m]^T with step size m. During prediction, an early-stopping mechanism monitors the model: training stops when the training error has not improved within a set number of training epochs, and otherwise continues until the originally set number of epochs is reached. The prediction formula is as follows:

y_t = Sigmoid(w_o s_t + b_o)

In the formula, y_t denotes the predicted output value at time t; w_o is a weight matrix; b_o is the bias vector; the activation function is the Sigmoid function.
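The early-stopping monitor described above can be sketched as follows (a hypothetical helper: step_fn is assumed to run one training epoch and return the monitored error, and the patience threshold plays the role of the "set number of training epochs"):

```python
def train_with_early_stopping(step_fn, max_epochs, patience):
    """Stop training once the error has not improved for `patience` epochs."""
    best_err, wait = float("inf"), 0
    for _ in range(max_epochs):
        err = step_fn()               # one training epoch, returns monitored error
        if err < best_err:
            best_err, wait = err, 0   # improvement: reset the counter
        else:
            wait += 1
            if wait >= patience:      # no improvement for `patience` epochs
                break
    return best_err

# toy error sequence: improves, then stagnates, so training halts early
errs = iter([5.0, 4.0, 3.0, 3.5, 3.6, 3.7, 2.0])
best = train_with_early_stopping(lambda: next(errs), max_epochs=10, patience=3)
print(best)  # 3.0 (the 2.0 is never reached because training stopped)
```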
Compared with the prior art, the invention has the following advantages:
(4) The wind speed data are processed with the VMD method, converting the unstable wind speed sequence into relatively stable subsequences and improving the wind speed prediction accuracy.
(5) Considering that irrelevant features in the data degrade model performance, the attention mechanism is used to redistribute the feature weights and improve the performance of the model.
(6) The underlying architecture of the algorithm adopts a CNN-GRU model, which can process the spatio-temporal characteristics of the wind speed and exploit spatio-temporal correlation for prediction, improving the prediction accuracy.
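As a rough illustration of the VMD processing in advantage (4), the ADMM update loop can be sketched in NumPy (a simplified sketch under stated assumptions: no mirror extension of the signal, the multiplier update is effectively disabled with γ = 0, and the initial center frequencies are arbitrary; this is not the exact implementation of the invention):

```python
import numpy as np

def vmd(f, K, alpha=2000.0, gamma=0.0, tol=1e-7, n_iter=500):
    """Simplified VMD: iterate the Wiener-filter, center-frequency, and
    multiplier updates in the frequency domain until convergence."""
    T = len(f)
    f_hat = np.fft.fft(f)
    freqs = np.fft.fftfreq(T)                # normalized frequencies
    u_hat = np.zeros((K, T), dtype=complex)  # mode spectra
    omega = np.linspace(0.05, 0.45, K)       # initial center frequencies
    lam = np.zeros(T, dtype=complex)         # multiplier spectrum
    for _ in range(n_iter):
        u_prev = u_hat.copy()
        for k in range(K):
            others = u_hat.sum(axis=0) - u_hat[k]
            u_hat[k] = (f_hat - others + lam / 2) / (1 + 2 * alpha * (freqs - omega[k]) ** 2)
            pos = freqs >= 0                 # power-weighted mean frequency
            power = np.abs(u_hat[k, pos]) ** 2
            omega[k] = (freqs[pos] @ power) / (power.sum() + 1e-12)
        lam = lam + gamma * (f_hat - u_hat.sum(axis=0))
        diff = np.sum(np.abs(u_hat - u_prev) ** 2) / (np.sum(np.abs(u_prev) ** 2) + 1e-12)
        if diff < tol:
            break
    return np.real(np.fft.ifft(u_hat, axis=1))  # each row is one IMF

t = np.arange(256) / 256
f = np.sin(2 * np.pi * 8 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)
imfs = vmd(f, K=2)
print(imfs.shape)  # (2, 256)
```

On this two-tone test signal, each IMF concentrates around one of the two frequencies, which is the property the invention exploits to stabilize the wind speed subsequences.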
FIG. 6 shows the partial wind speed prediction result and the residual at a prediction time interval of 20 minutes.
As can be seen from fig. 6, the VCGA model fits the data closely and accurately reflects the trend of the true values. The residual analysis shows that the prediction residuals of the model are uniformly and randomly distributed on both sides of the zero baseline, indicating that there is no systematic error in the modeling process. The model is therefore feasible for short-term wind speed prediction.
Fig. 7 and 8 show the RMSE and MAPE results at different prediction times for different prediction models, respectively.
To verify the effect of the invention, its algorithm is compared experimentally with other traditional machine learning algorithms and with its own sub-algorithms, using two representative evaluation indexes, RMSE and MAPE. The comparison shows that, compared with other traditional deep-learning-based methods, the method of the invention achieves higher prediction accuracy and more accurate results.
The above description covers only preferred embodiments of the present invention, but the scope of the present invention is not limited thereto; any equivalent substitution or modification of the technical solutions and inventive concepts of the present invention that a person skilled in the art could readily conceive within the technical scope disclosed herein shall fall within the scope of the present invention.

Claims (9)

1. A wind power plant short-term wind speed prediction method combining a VMD and an attention mechanism is characterized by comprising the following steps:
step 1: establishing a wind speed data set with two dimensions of time and space;
step 2: establishing a series of SWSMs according to the original data, wherein the SWSMs comprise time domain and space domain characteristics of a wind speed data set;
step 3: performing wind speed decomposition on the SWSM over each time series using the VMD to obtain sub-SWSMs composed of the IMFs;
step 4: combining the CNN model with an attention mechanism to obtain the SENet model;
step 5: for each sub-SWSM, applying the SENet model obtained in step 4 to extract the spatial-domain features of the wind speed;
step 6: processing the spatial-domain features obtained in step 5 with the GRU model, extracting the time-domain features and obtaining each predicted component of the wind speed;
step 7: assigning different weights to the input features through an attention layer based on the attention mechanism;
step 8: combining the prediction results to obtain the final predicted wind speed.
2. The method for predicting the short-term wind speed of the wind farm combining the VMD and the attention mechanism according to claim 1, wherein in step 1 a wind speed data set comprising the two dimensions of time and space is established; for the original wind speed data set, the wind speed at the predicted time and position is set as the label wind speed, and the data set and the label wind speed are then divided proportionally, in time order, into a training set, a validation set, and a test set.
3. The wind farm short-term wind speed prediction method combining the VMD and the attention mechanism according to claim 1, wherein in the step 2, the establishment of the SWSM comprises the following process:
assuming that the object of study is an array of M rows and N columns over a spatial region, which can be represented by an M × N grid, the position of each site in the array can be indexed by two-dimensional rectangular coordinates (i, j) (1 ≤ i ≤ M, 1 ≤ j ≤ N); for each site, the wind speed is a one-dimensional time series; at time t, the spatial wind speed matrix SWSM formed by the sites can be defined as X_t ∈ R^{M×N}, with entries x(i, j)_t:

X_t = [ x(1,1)_t  x(1,2)_t  …  x(1,N)_t
        x(2,1)_t  x(2,2)_t  …  x(2,N)_t
        …         …         …  …
        x(M,1)_t  x(M,2)_t  …  x(M,N)_t ]

the wind speed sequences are converted to SWSMs by the above method.
4. The wind farm short-term wind speed prediction method combining the VMD and the attention mechanism according to claim 1, characterized in that in step 3, the main steps of VMD decomposition are as follows:
S21: first, the variational problem is constructed, ensuring that each decomposed sequence is a modal component with limited bandwidth around a center frequency, while minimizing the sum of the estimated bandwidths of all modes; the wind speed data are preprocessed by computing the analytic signal of each mode and shifting its one-sided spectrum to baseband:

[ (δ(t) + j/(πt)) * u_k(t) ] e^{-jω_k t}

the corresponding constrained variational expression is

min_{{u_k},{ω_k}} { Σ_{k=1}^{K} ‖ ∂_t [ (δ(t) + j/(πt)) * u_k(t) ] e^{-jω_k t} ‖_2^2 }

s.t. Σ_{k=1}^{K} u_k(t) = f(t)

In the formula, K is the number of modes to be decomposed, a positive integer; {u_k} and {ω_k} are the k-th modal component after decomposition and its center frequency, respectively; δ(t) is the Dirac function; * denotes the convolution operation;
S22: to solve S21, a Lagrangian multiplier λ is introduced to convert the constrained problem into an unconstrained one, giving the augmented Lagrangian expression:

L({u_k}, {ω_k}, λ) = α Σ_{k=1}^{K} ‖ ∂_t [ (δ(t) + j/(πt)) * u_k(t) ] e^{-jω_k t} ‖_2^2 + ‖ f(t) − Σ_{k=1}^{K} u_k(t) ‖_2^2 + ⟨ λ(t), f(t) − Σ_{k=1}^{K} u_k(t) ⟩

In the formula, α is a penalty factor used to reduce the influence of Gaussian noise;
S23: finally, the unconstrained variational problem is solved with the alternating direction method of multipliers (ADMM) iterative algorithm; each modal component and center frequency are obtained by optimization, the saddle point of the augmented Lagrangian function is sought, and the parameters {u_k}, {ω_k}, and λ are updated iteratively by the formulas:

û_k^{n+1}(ω) = [ f̂(ω) − Σ_{i≠k} û_i(ω) + λ̂(ω)/2 ] / [ 1 + 2α(ω − ω_k)^2 ]

ω_k^{n+1} = ∫_0^∞ ω |û_k(ω)|^2 dω / ∫_0^∞ |û_k(ω)|^2 dω

λ̂^{n+1}(ω) = λ̂^n(ω) + γ [ f̂(ω) − Σ_k û_k^{n+1}(ω) ]

In the formulas, f̂(ω), û_i(ω), λ̂(ω), and û_k^{n+1}(ω) denote the Fourier transforms of f(t), u_i(t), λ(t), and u_k^{n+1}(t), respectively; n is the number of iterations; γ is the noise tolerance, used to satisfy the fidelity requirement of the signal decomposition;
S24: finally, for a given discrimination accuracy e > 0, the iteration stops when

Σ_k ( ‖ û_k^{n+1} − û_k^n ‖_2^2 / ‖ û_k^n ‖_2^2 ) < e

is satisfied, and otherwise returns to S23; finally, the K decomposed IMF components are obtained; the SWSM is decomposed by the VMD method to obtain the sub-SWSMs composed of the IMFs; at time t, the components IMF_k of the sites form a sub-SWSM that can be defined as IMF_k,t ∈ R^{M×N}:

IMF_k,t = [ imf_k(1,1)_t  …  imf_k(1,N)_t
            …              …  …
            imf_k(M,1)_t  …  imf_k(M,N)_t ]
5. The method for predicting the short-term wind speed of the wind farm combining the VMD and the attention mechanism according to claim 1, wherein in step 4 the core idea of the CNN-attention combination used in SENet is to automatically learn the importance of each feature channel and then, according to that importance, promote useful features and suppress features of little use to the current task; this function is implemented by the SE block; the SENet layer comprises a convolutional layer, two SE block layers, and then another convolutional layer, and each SE block contains a convolutional layer, a global average pooling layer, two activation layers, and a fusion layer; the SENet layer is constructed as follows:
S41: first, F_tr is a transformation operation, a standard convolution with input X and output U, defined by the formula:

u_c = v_c * X = Σ_{s=1}^{C'} v_c^s * x^s

in the formula, v_c denotes the c-th convolution kernel, x^s denotes the s-th input channel, * denotes the convolution operation, u_c denotes the c-th 2D matrix in the 3D matrix U, and v_c^s is the 2D spatial kernel of v_c acting on the corresponding channel of X;
S42: next is the Squeeze operation, in fact a global average pooling operation that compresses the spatial features, converting the W × H × C input into a 1 × 1 × C output, where W denotes the width of a channel, H denotes the height of a channel, and there are C channels in total; the Squeeze operation formula is as follows:

z_c = F_sq(u_c) = (1/(W × H)) Σ_{i=1}^{W} Σ_{j=1}^{H} u_c(i, j)
S43: next is the Excitation operation, whose formula is:

s = σ(g(z, W)) = σ(W_2 δ(W_1 z))

in the formula, W_1 z is a fully connected layer operation; the dimension of W_1 is C/r × C, where r is a scaling parameter, and z has dimension 1 × 1 × C, so the result of W_1 z is 1 × 1 × C/r; δ is the ReLU function, which does not change the dimensionality of the output; the result is then multiplied by W_2, also a fully connected layer operation; the dimension of W_2 is C × C/r, so the output dimension is 1 × 1 × C; finally, σ, the sigmoid function, yields the weight matrix s;
S44: finally, the 3D matrix U is reweighted by the weight matrix s according to the formula:

H_S = s_c · u_c

in the formula, u_c is a two-dimensional matrix and s_c is a weight; this equation multiplies each value in the matrix u_c by s_c.
6. The wind farm short-term wind speed prediction method combining the VMD and the attention mechanism is characterized in that the sub-SWSM matrices obtained by the VMD decomposition are input to the SENet layer; the convolution kernels in the convolutional layer perform convolution and activation on the input to obtain the feature map of the convolution block; this feature map is input to the convolutional layer in the SE block, the two activation layers assign weights to the channels of the convolution kernels, and global average pooling yields a new feature map; the new feature map is flattened to reduce its dimensionality while preserving the spatial features, and the reduced feature map serves as the input of the GRU layer.
7. The wind farm short-term wind speed prediction method combining the VMD and the attention mechanism according to claim 1, wherein in step 6 the feature extraction along the time series is implemented by the GRU layer; the GRU has two gates, a reset gate and an update gate; the reset gate determines how new input information is combined with the previous memory, and the update gate defines how much of the previous memory is carried to the current time step; the two gating vectors determine which information is ultimately output by the gated recurrent unit, allowing information in long sequences to be retained rather than cleared over time or removed for being irrelevant to the prediction;
the gates and activation functions of the GRU are calculated as follows:
S61: activation function Sigmoid:

σ(x) = 1 / (1 + e^{-x})

S62: activation function tanh:

tanh(x) = (e^x − e^{-x}) / (e^x + e^{-x})

S63: update gate: z_t = σ(W_z · [h_{t-1}, x_t])

S64: reset gate: r_t = σ(W_r · [h_{t-1}, x_t])

S65: new memory (using the reset gate): h̃_t = tanh(W · [r_t ⊙ h_{t-1}, x_t])

S66: output value: h_t = (1 − z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t
where σ is the Sigmoid activation function and tanh is the tanh activation function; the update gate and the reset gate are z_t and r_t, respectively; x_t is the input and h_t is the output of the hidden layer; h̃_t is the candidate state summarizing the input x_t and the past hidden state h_{t-1}; W_z, W_r, and W are the weights of the update gate, the reset gate, and the candidate output, respectively.
8. The method for predicting the short-term wind speed of the wind farm combining the VMD and the attention mechanism according to claim 1, wherein in step 7 an attention layer is added after the GRU layer, assigning different weights to the input features of the model; the input of the attention layer is the output vector h_t produced by the GRU layer activation; the probabilities corresponding to the different feature vectors are computed according to a weight-distribution principle, and a better weight parameter matrix is obtained by continuous updating and iteration; the weight coefficients of the attention layer are computed as follows:
e_t = u tanh(w h_t + b)

a_t = exp(e_t) / Σ_{j=1}^{T} exp(e_j)

s_t = Σ_{t=1}^{T} a_t · h_t
in the formulas, e_t denotes the attention probability distribution value determined by the output vector h_t of the GRU network layer at time t; u and w are weight coefficients; b is a bias coefficient; the output of the attention layer at time t is denoted by s_t.
9. The method for predicting the short-term wind speed of the wind farm combining the VMD and the attention mechanism according to claim 1, wherein in step 8 the prediction results are combined in the output layer and computed through the fully connected layer, predicting an output Y = [y_1, y_2, …, y_m]^T with step size m; during prediction, an early-stopping mechanism monitors the model: training stops when the training error has not improved within a set number of training epochs, and otherwise continues until the originally set number of epochs is reached; the prediction formula is as follows:
y_t = Sigmoid(w_o s_t + b_o)
in the formula, y_t denotes the predicted output value at time t; w_o is a weight matrix; b_o is the bias vector; the activation function is the Sigmoid function.
CN202210425233.0A 2022-04-21 2022-04-21 Wind power plant short-term wind speed prediction method combining VMD and attention mechanism Pending CN114912577A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210425233.0A CN114912577A (en) 2022-04-21 2022-04-21 Wind power plant short-term wind speed prediction method combining VMD and attention mechanism


Publications (1)

Publication Number Publication Date
CN114912577A true CN114912577A (en) 2022-08-16

Family

ID=82764917

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210425233.0A Pending CN114912577A (en) 2022-04-21 2022-04-21 Wind power plant short-term wind speed prediction method combining VMD and attention mechanism

Country Status (1)

Country Link
CN (1) CN114912577A (en)

Similar Documents

Publication Publication Date Title
CN109102126B (en) Theoretical line loss rate prediction model based on deep migration learning
CN109492822B (en) Air pollutant concentration time-space domain correlation prediction method
Tian et al. Multi-step short-term wind speed prediction based on integrated multi-model fusion
CN110070226B (en) Photovoltaic power prediction method and system based on convolutional neural network and meta-learning
Tian Modes decomposition forecasting approach for ultra-short-term wind speed
Wang et al. The study and application of a novel hybrid forecasting model–A case study of wind speed forecasting in China
CN113053115B (en) Traffic prediction method based on multi-scale graph convolution network model
CN112348271A (en) Short-term photovoltaic power prediction method based on VMD-IPSO-GRU
CN112529282A (en) Wind power plant cluster short-term power prediction method based on space-time graph convolutional neural network
CN111860982A (en) Wind power plant short-term wind power prediction method based on VMD-FCM-GRU
Wu et al. Promoting wind energy for sustainable development by precise wind speed prediction based on graph neural networks
CN112766078B (en) GRU-NN power load level prediction method based on EMD-SVR-MLR and attention mechanism
CN111861013B (en) Power load prediction method and device
Shao et al. Wind speed forecast based on the LSTM neural network optimized by the firework algorithm
CN111967679A (en) Ionized layer total electron content forecasting method based on TCN model
CN113762338B (en) Traffic flow prediction method, equipment and medium based on multiple graph attention mechanism
CN112418482A (en) Cloud computing energy consumption prediction method based on time series clustering
CN115511177A (en) Ultra-short-term wind speed prediction method based on INGO-SWGMN hybrid model
CN112613657A (en) Short-term wind speed prediction method for wind power plant
Wang et al. An approach for day-ahead interval forecasting of photovoltaic power: A novel DCGAN and LSTM based quantile regression modeling method
CN114169251A (en) Ultra-short-term wind power prediction method
Huang et al. Short-Term PV Power Forecasting Based on CEEMDAN and Ensemble DeepTCN
Gungor et al. Lenard: Lightweight ensemble learner for medium-term electricity consumption prediction
Peng et al. Meteorological satellite operation prediction using a BiLSTM deep learning model
CN114912577A (en) Wind power plant short-term wind speed prediction method combining VMD and attention mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination