Background
Developing a practical ultra-short-term photovoltaic power prediction model makes it possible to reduce spinning reserve capacity effectively and to improve the safe and economic operation of the power grid. To cope with the impact of large-scale photovoltaic integration on grid stability, universities and power companies in China and abroad have successively carried out research on ultra-short-term prediction of photovoltaic power.
Existing research on ultra-short-term photovoltaic power prediction can be grouped into the following methods:
(1) Physical methods: numerical weather prediction (NWP) data (mainly solar radiation, temperature, cloud cover, rainfall, wind speed and the like) are used as input, the characteristics of the photovoltaic power generation equipment (photovoltaic modules, inverters and the like) are studied, and a mathematical model relating photovoltaic power to the NWP data is established and then used to predict the photovoltaic power. The input to a physical model has two parts. The first is dynamic information: NWP data provided by a meteorological department, and online measurements used for Model Output Statistics (MOS) to reduce residual errors. The second is static information and system data of the photovoltaic power station, such as the geographic location of the installed modules and their photoelectric conversion efficiency. Relatively mature physical models currently include the ASHRAE model, the HOTTEL model, the REST model, the Nielsen model for cloudy weather, the cloud shading coefficient model, and the like.
Physical methods depend on detailed geographic information about the power station and on accurate meteorological data, and the physical formulas themselves carry certain errors, so these models suffer from poor anti-interference capability, low robustness, low accuracy, and complicated parameter selection.
(2) Statistical methods: historical data such as solar radiation and photovoltaic output are taken as input, ill-conditioned data points are excluded, and a functional mapping between the historical data and the output power is established to find the underlying regularity and then predict the photovoltaic power, without considering the physical process by which solar radiation changes. Common statistical methods include time series methods, regression analysis, grey theory, fuzzy theory, spatio-temporal correlation methods and the like. A representative case: Mellit et al. in Italy proposed an adaptive wavelet network method to predict the hourly output of a 20 kW grid-connected photovoltaic power station in Trieste.
Statistical methods usually require a large amount of historical data to be collected and processed, which increases the difficulty of data acquisition and processing, and parameter estimation is time-consuming because it requires manual setting and tuning.
(3) Artificial intelligence methods: a large amount of historical observation data is needed to establish the input-output mapping, and the methods are mainly applied to nonlinear mapping models. Common methods include neural networks, support vector machines, Kalman filtering, Markov chains, particle swarm optimization, genetic algorithms and the like. Representative cases: Atsushi Yona et al. of the University of the Ryukyus, Japan, took the air pressure, temperature, relative humidity, wind speed and the like from the 18 hours before the prediction as neural network input, predicted the solar irradiation with a feedforward neural network, a radial basis function neural network and a recurrent neural network respectively, and from it obtained the output power of a solar power generation system; Ding Ming et al. of Hefei University of Technology, China, proposed a method for directly predicting the power of a photovoltaic power station based on a Markov chain.
Like statistical methods, artificial intelligence methods require a large amount of historical data to be collected and processed, and during training the models are prone to overfitting and to falling into local optima.
Ultra-short-term photovoltaic power prediction means predicting the power for the next 15 minutes to 4 hours at 15-minute intervals. Its main value lies in making the output of a photovoltaic power station more predictable, allowing a suitable generation plan to be drawn up for the station, providing strong technical support for optimal grid dispatching, relieving the peak-regulation and frequency-regulation pressure on the power system, and letting the grid accept as much photovoltaic energy as possible while operating safely and efficiently. Ultra-short-term photovoltaic power prediction also plays an important role in estimating station energy yield, drawing up maintenance plans, intelligent operation and maintenance, and other areas. It is therefore necessary to realize ultra-short-term photovoltaic power prediction efficiently and accurately.
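As a small illustration of the forecast horizon described above, the following sketch lists the target times of a single forecast run. The function name `forecast_times` and the example issue time are illustrative assumptions and are not part of the method.

```python
from datetime import datetime, timedelta


def forecast_times(issue_time, step_minutes=15, horizon_hours=4):
    """Return the target timestamps of one ultra-short-term forecast run."""
    n_steps = horizon_hours * 60 // step_minutes  # 16 steps for 15 min over 4 h
    return [issue_time + timedelta(minutes=step_minutes * k)
            for k in range(1, n_steps + 1)]


# Hypothetical issue time, chosen only for illustration.
times = forecast_times(datetime(2020, 6, 1, 12, 0))
print(len(times), times[0], times[-1])
```

Each run therefore yields 16 forecast points, the first 15 minutes and the last 4 hours after the issue time.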
Existing ultra-short-term photovoltaic power prediction technology has the following defects:
(1) When a prediction model is built with statistical or artificial intelligence methods, a large amount of historical photovoltaic power and radiation data is needed as the modeling basis; because historical observation data are seriously lacking, the accuracy of ultra-short-term photovoltaic power prediction is low.
(2) Prediction models built with physical methods use NWP data whose temporal and spatial resolution is relatively low (for example, the model data of the European meteorological center). Such data cannot effectively reflect the influence of minute-level or hour-level meteorological elements on the output power of a photovoltaic station, so the weather forecast data obtained for the station are inaccurate in space and time, which degrades the prediction accuracy.
(3) Most ultra-short-term photovoltaic power prediction technologies adopt a single prediction method and do not consider combining several methods. A single method cannot cover photovoltaic power prediction under all conditions, so the accuracy of ultra-short-term prediction is not high.
(4) At present, the input of an ultra-short-term photovoltaic power prediction model considers only the influence of the meteorological elements of a single station on the photovoltaic power station, and ignores the surroundings of the station, that is, regional information. Single-station meteorological elements give only a biased picture of the relationship between the weather and the photovoltaic power station, so the ultra-short-term prediction accuracy is low.
Disclosure of Invention
The technical problem to be solved by the invention is to provide an ultra-short-term photovoltaic power prediction method based on satellite cloud images and a spatio-temporal neural network that is closer to the actual production and operation of a photovoltaic power station, thereby effectively improving the accuracy of ultra-short-term photovoltaic power prediction.
To solve the above technical problem, the invention adopts the following technical solution: an ultra-short-term photovoltaic power prediction method based on satellite cloud images and a spatio-temporal neural network, comprising the following steps:
first, compensating for the shortage of historical radiation data of the photovoltaic power station by means of a satellite radiation inversion model;
then, obtaining minute-level extrapolated satellite cloud images by means of a satellite extrapolation technique, which provides high-precision short-term weather forecast data for ultra-short-term photovoltaic power prediction;
and finally, realizing ultra-short-term photovoltaic power prediction with a spatio-temporal neural network algorithm, based on the extrapolated cloud images, rapidly updated assimilation data, historical station radiation and power monitoring data, geographic information data, and solar position and time data.
Preferably, the process of the satellite inversion of solar radiation comprises the following steps:
firstly, calculating clear sky irradiance through a clear sky model;
then, calculating a cloud index by satellite data through a Heliosat model, and further calculating a clear sky index to quantify a weakening effect of a cloud layer;
and finally, obtaining solar radiation data by combining the Heliosat model with the clear sky irradiance and the clear sky index.
Preferably, the spectrum is divided into two bands at 0.7 μm: band 1 covers the ultraviolet and visible range from 0.29 μm to 0.70 μm, and band 2 covers the near-infrared range from 0.7 μm to 4 μm. Using the latest extraterrestrial spectral energy distribution and the latest solar constant of 1366.1 W/m², the extraterrestrial irradiances of the two bands at the mean sun-earth distance are $E_{0n1} = 635.4\ \mathrm{W/m^2}$ and $E_{0n2} = 709.7\ \mathrm{W/m^2}$, respectively.
For each band $i$, the band direct irradiance $E_{bni}$ is obtained from the product of the individual transmittances:
$$E_{bni} = T_{Ri}\,T_{gi}\,T_{oi}\,T_{ni}\,T_{wi}\,T_{ai}\,E_{0ni}$$
where $T_{Ri}$, $T_{gi}$, $T_{oi}$, $T_{ni}$, $T_{wi}$ and $T_{ai}$ are the transmittances for Rayleigh scattering, uniformly mixed gas absorption, ozone absorption, nitrogen dioxide absorption, water vapor absorption and aerosol extinction, respectively;
the components of the two bands are then added to obtain the direct irradiance:
$$E_{bn} = E_{bn1} + E_{bn2}$$
For each band, the backscattered component $E_{ddi}$ is obtained by taking into account the multiple reflections between the ground and the atmosphere:
$$E_{ddi} = \frac{\rho_{gi}\,\rho_{si}\,(E_{bi} + E_{dpi})}{1 - \rho_{gi}\,\rho_{si}}$$
$$\rho_{s1} = \frac{0.13363 + 0.00077358\,\alpha_1 + \beta_1\,(0.37567 + 0.22946\,\alpha_1)/(1 - 0.10832\,\alpha_1)}{1 + \beta_1\,(0.84057 + 0.68683\,\alpha_1)/(1 - 0.08158\,\alpha_1)}$$
$$\rho_{s2} = \frac{0.010191 + 0.00085547\,\alpha_2 + \beta_2\,(0.14618 + 0.062758\,\alpha_2)/(1 - 0.19402\,\alpha_2)}{1 + \beta_2\,(0.58101 + 0.17426\,\alpha_2)/(1 - 0.17568\,\alpha_2)}$$
where $E_{bi} = E_{bni}\cos Z$; the total diffuse irradiance of each band is $E_{di} = E_{dpi} + E_{ddi}$; the resulting diffuse irradiance is $E_d = E_{d1} + E_{d2}$; and the global irradiance is $E_g = E_b + E_d$, with $E_b = E_{b1} + E_{b2}$.
Preferably, the Heliosat model combines a clear sky model with a cloud index. The cloud index $n$ is defined as follows:
$$n = \frac{\rho_t - \rho_g}{\rho_c - \rho_g}$$
The clear sky index $k$ is defined as the ratio of the actual horizontal irradiance to the clear sky horizontal irradiance, namely:
$$k = \frac{G}{G_c}$$
where $\rho_t$ is the apparent reflectivity observed by the satellite, $\rho_c$ is the apparent reflectivity of the brightest cloud layer, $\rho_g$ is the surface reflectivity, and $G$ and $G_c$ are the actual surface irradiance and the clear sky irradiance, respectively;
and finally, an empirical relationship between the clear sky index and the cloud albedo is established, the specific numerical relationship being as follows:
Preferably, the satellite extrapolation technique takes the satellite cloud images captured by a geostationary satellite over a past period and extrapolates them with a satellite extrapolation algorithm to obtain the cloud movement over the next few hours at minute-level resolution, the historical satellite cloud images being filtered by means of a Fourier transform.
Preferably, the extrapolation model used by the satellite extrapolation technique is the Multi-level Correlation LSTM, which extrapolates the satellite cloud images from a historical image sequence in an encoder-decoder manner: in the first m-1 steps, the input at each step is the real satellite image from the previous time step; in the subsequent steps, the input is the model's prediction from the previous time step; finally, the predictions for steps [m-1, t] are taken as the output, a loss function is selected, and the model is optimized.
Preferably, the ultra-short-term photovoltaic power prediction is performed with an ultra-short-term photovoltaic power prediction model, with the following steps:
(1) collecting the data required by the prediction model and performing data quality control;
(2) performing feature engineering and feature selection on the quality-controlled data by combining Pearson correlation coefficient analysis with principal component analysis (PCA);
(3) realizing power prediction from a spatio-temporal perspective with the 3DConv_LSTM network structure.
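The Pearson part of step (2) can be sketched as follows. This is a minimal illustration only: the feature names, the sample values and the 0.5 threshold are assumptions made for the example, and the PCA stage is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0 or std_y == 0:
        return 0.0  # a constant series carries no linear information
    return cov / (std_x * std_y)


def select_features(features, target, threshold=0.5):
    """Keep the features whose |r| with the target reaches the threshold."""
    return [name for name, values in features.items()
            if abs(pearson(values, target)) >= threshold]


# Hypothetical sample data, for illustration only.
power = [0.0, 10.0, 20.0, 30.0, 40.0]
features = {
    "irradiance": [0.0, 100.0, 210.0, 290.0, 400.0],
    "humidity": [60.0, 30.0, 55.0, 35.0, 50.0],
    "station_id": [5.0, 5.0, 5.0, 5.0, 5.0],
}
selected = select_features(features, power)
print(selected)
```

With these sample values only the irradiance series is retained; the humidity series is weakly correlated with power and the constant series is discarded.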
Preferably, the data required by the prediction model comprise meteorological data, and the quality control of the meteorological data comprises the following steps:
(1) limit value check: a check against the climatological limit values and the regional limit values;
(2) temporal consistency check: the sampled value of each meteorological element must not change by more than a given range within a given time, and data beyond that range are suspicious;
(3) internal consistency check: using the physical relationships between the values of different meteorological elements to judge whether the elements follow the expected rules, so that the credibility of the observed value of one variable is judged from the observed value of another variable at the same time;
(4) spatial consistency check: according to the spatial distribution of meteorological elements, the data of a station are compared with the values observed by adjacent stations at the same time, or an estimate for the station under check is computed from the observations of adjacent stations by interpolation and the observation is compared with the estimate; if the deviation, whether positive or negative, exceeds the historical upper limit, the record is marked as suspicious and an alarm prompts manual confirmation, and if the deviation exceeds twice the historical upper limit, the record is treated as missing.
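The checks above can be sketched in a few lines. The following is a minimal illustration of checks (1), (2) and (4); the function names, limits and sample values are assumptions chosen for the example, not values from the method.

```python
def limit_check(values, lower, upper):
    """Check (1): flag samples outside the allowed [lower, upper] range."""
    return [not (lower <= v <= upper) for v in values]


def temporal_check(values, max_step):
    """Check (2): flag samples that changed by more than max_step."""
    flags = [False]  # the first sample has no predecessor to compare with
    for previous, current in zip(values, values[1:]):
        flags.append(abs(current - previous) > max_step)
    return flags


def spatial_check(observed, estimated, historical_limit):
    """Check (4): compare an observation with a neighbour-based estimate."""
    deviation = abs(observed - estimated)
    if deviation > 2 * historical_limit:
        return "missing"      # treated as a missing measurement
    if deviation > historical_limit:
        return "suspicious"   # flagged for manual confirmation
    return "ok"


# Hypothetical irradiance samples in W/m2 and hypothetical limits.
irradiance = [0.0, 200.0, 1500.0, 450.0, 430.0]
limit_flags = limit_check(irradiance, 0.0, 1400.0)
step_flags = temporal_check(irradiance, max_step=600.0)
print(limit_flags, step_flags, spatial_check(500.0, 420.0, 50.0))
```

Note that the temporal check flags both the jump into the spike and the jump out of it, so in practice the flags would be combined before a value is rejected.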
Preferably, the missing value is processed by a K-nearest neighbor method in data quality control.
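A K-nearest neighbor fill of a missing value might look like the sketch below. It assumes that every other column of the record is complete and on a comparable scale; the function name `knn_impute`, the column layout and the sample values are hypothetical.

```python
import math


def knn_impute(rows, target_col, k=2):
    """Fill None in target_col with the mean of the k nearest complete rows.

    Distance is Euclidean over all other columns, which this sketch
    assumes to be complete and on comparable scales.
    """
    complete = [r for r in rows if r[target_col] is not None]
    filled = []
    for row in rows:
        new_row = list(row)
        if row[target_col] is None and complete:
            def distance(other, row=row):
                return math.sqrt(sum(
                    (a - b) ** 2
                    for j, (a, b) in enumerate(zip(row, other))
                    if j != target_col))
            nearest = sorted(complete, key=distance)[:k]
            new_row[target_col] = (sum(n[target_col] for n in nearest)
                                   / len(nearest))
        filled.append(new_row)
    return filled


# Hypothetical records: (temperature, relative humidity, irradiance).
records = [
    (20.0, 50.0, 400.0),
    (21.0, 48.0, 420.0),
    (30.0, 20.0, 900.0),
    (20.5, 49.0, None),  # irradiance is missing here
]
filled = knn_impute(records, target_col=2, k=2)
print(filled[3])
```

In a real pipeline the columns would normally be standardized first, so that no single variable dominates the distance.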
Preferably, the 3DConv_LSTM network structure is divided into two parts:
(1) a 3DConv network structure, which further extracts the spatio-temporal features required for photovoltaic prediction;
(2) an LSTM unit, which receives the extracted spatio-temporal feature vector and builds a nonlinear mapping between the spatio-temporal features and the photovoltaic power, on the basis of which the power is predicted for the next 15 minutes to 4 hours.
By adopting the above technical solution, the invention has the following beneficial effects:
1. The satellite radiation inversion technique compensates for the shortage of historical observation data caused by the small number of radiation monitoring stations, and provides data support for efficient and accurate photovoltaic power prediction.
2. Forecast satellite cloud images, that is, minute-level cloud image information, are obtained through the satellite extrapolation model; this provides high-precision short-term weather forecast data and a data basis for accurate ultra-short-term photovoltaic power prediction.
3. An ultra-short-term photovoltaic power prediction model based on a complex spatio-temporal neural network algorithm is provided, which effectively improves the ultra-short-term prediction accuracy.
Therefore, with this technical solution the method is closer to the actual production and operation of a photovoltaic power station, and the accuracy of ultra-short-term photovoltaic power prediction is effectively improved.
The following detailed description and the accompanying drawings provide a further understanding of the invention.
Detailed Description
The technical solutions of the embodiments of the invention are explained and illustrated below with reference to the accompanying drawings. The following embodiments are only preferred embodiments of the invention, not all embodiments. Other embodiments obtained by those skilled in the art on the basis of these embodiments without creative effort fall within the protection scope of the invention.
The invention provides an ultra-short-term photovoltaic power prediction method based on satellite cloud images and a spatio-temporal neural network. It takes as its basis the radiation and power data collected by the micro meteorological station of a photovoltaic power station, rapidly updated assimilation data, satellite data, and theoretical information such as time, terrain and landform, and solar azimuth angle, and after data preprocessing uses an artificial intelligence algorithm to establish an ultra-short-term prediction model.
As shown in fig. 1, the main process is as follows:
(1) The shortage of historical radiation data of the photovoltaic power station is compensated for by a satellite radiation inversion model, which provides data support for subsequent modeling.
(2) Minute-level extrapolated satellite cloud images are obtained with a satellite extrapolation technique, which provides high-precision short-term weather forecast data for ultra-short-term photovoltaic power prediction.
(3) Ultra-short-term photovoltaic power prediction is realized with a spatio-temporal neural network algorithm, based on data such as the extrapolated cloud images, rapidly updated assimilation data, historical station radiation and power monitoring data, geographic information data, and solar position and time.
1. Satellite inversion radiation model
Photovoltaic power generation relies on a photoelectric conversion element to convert solar radiation energy into electrical energy, so that ultra-short-term prediction of photovoltaic power relies on a large amount of historical solar radiation monitoring data.
There are three main ways to obtain surface solar radiation: ground observation, numerical simulation and satellite remote sensing. Each has advantages and disadvantages. Ground observation is the most direct and reliable way to obtain surface solar radiation and is indispensable for research and applications with strict data quality requirements. However, surface measurements at specific points are not sufficient to describe the spatial distribution of surface solar radiation, particularly in remote areas where observation sites are rare (e.g., the sea surface). In China, for example, there are nearly ten thousand meteorological temperature monitoring stations but only 191 meteorological radiation monitoring stations, so the spatial coverage of solar radiation data cannot meet the needs of modeling.
In contrast to ground observations, numerical models (e.g., general circulation models, GCMs) can generate spatio-temporally continuous maps of surface solar radiation at regional and global scales. Their greatest advantage is completeness and consistency, which is particularly important for long-term climate monitoring and analysis. Among the surface solar radiation products based on numerical models, reanalysis products are the most common. The main disadvantage of typical reanalysis products, such as ERA-Interim from the European Centre for Medium-Range Weather Forecasts (ECMWF), is that the model has relatively large errors in simulating or predicting cloud cover. The accuracy of such surface solar radiation products, particularly those with high temporal resolution, is therefore questionable.
Satellite remote sensing can capture the spatial distribution and dynamic evolution of clouds, and so provides a unique means of monitoring and estimating surface solar radiation. Compared with numerical simulation, it is a better way to obtain surface solar radiation at regional and even global scales. On the one hand, satellite inversion can provide a photovoltaic plant with a large amount of historical solar radiation data to support the construction of a prediction model; on the other hand, solar radiation can be predicted from the data obtained by satellite extrapolation, and short-term forecasts of factors such as cloud can be obtained, which lays the foundation for photovoltaic power prediction.
In this embodiment, the satellite radiation inversion model mainly uses data from the Fengyun satellites and the Himawari-8 satellite.
After the satellite data required by the satellite solar radiation inversion model have been collected, the model is constructed.
The satellite solar radiation inversion model works mainly by simulating the passage of solar radiation from the top of the atmosphere to the earth's surface. Solar radiation travels a long path before reaching the surface, so the interactions of extraterrestrial solar radiation with the earth's atmosphere, surface and surface objects must be taken into account when building the inversion model. Clouds are the strongest modulator of surface solar radiation (far stronger than other atmospheric components) and play a key role in estimating it. Given this decisive influence, the estimation of surface solar radiation centers to a large extent on how accurately the radiation attenuation by clouds in the atmosphere (cloud scattering and cloud absorption) is accounted for.
The process of satellite inversion of solar radiation is shown in fig. 2 and mainly comprises the following steps:
First, the clear sky irradiance is calculated with a clear sky model. Then, a cloud index is calculated from the satellite data with the Heliosat model, and from it a clear sky index that quantifies the attenuating effect of the cloud layer. Finally, the Heliosat model combines the clear sky irradiance and the clear sky index to obtain the solar radiation data.
(1) Clear sky radiation
Among the atmospheric components present under clear sky conditions, ozone, water vapor and aerosol are the three that most strongly affect surface solar radiation. Atmospheric aerosol is a multiphase system consisting of the atmosphere together with the solid and liquid particles suspended in it, with particle sizes of 0.001-100 μm. In particular, aerosol optical depth (AOD) data, which describe the attenuation of light by aerosol, have a significant impact on the accuracy of the model. The AOD is defined as the vertical integral of the extinction coefficient of the atmospheric medium from the earth's surface to the top of the atmosphere; the higher the value, the stronger the attenuation of light by aerosol and the lower the atmospheric visibility.
The AOD at wavelength $\lambda$ is calculated as follows:
$$\tau_\lambda = \beta\,\lambda^{-\alpha}$$
where $\beta$ is the turbidity coefficient, which is related to the concentration of the atmospheric aerosol, and $\alpha$ is the wavelength exponent, which is related to the particle size distribution of the atmospheric aerosol.
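The AOD relation above can be evaluated directly. In the sketch below the function name `aerosol_optical_depth` and the coefficient values are illustrative assumptions, with the wavelength taken in micrometres.

```python
def aerosol_optical_depth(wavelength_um, beta, alpha):
    """AOD at a wavelength (in micrometres): tau = beta * wavelength**(-alpha)."""
    return beta * wavelength_um ** (-alpha)


# Hypothetical turbidity coefficient and wavelength exponent.
beta, alpha = 0.1, 1.3
tau_visible = aerosol_optical_depth(0.5, beta, alpha)
tau_near_ir = aerosol_optical_depth(1.0, beta, alpha)
print(tau_visible, tau_near_ir)
```

At 1 μm the AOD equals the turbidity coefficient, and for a positive wavelength exponent the AOD grows towards shorter wavelengths.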
REST2 is a high-performance model that uses atmospheric data to estimate cloudless-sky irradiance and photosynthetically active radiation (PAR). Its derivation uses a two-band scheme, and the model pays particular attention to representing accurately the effect of aerosol on clear sky radiation.
REST2 divides the spectrum at 0.7 μm. Band 1 covers the ultraviolet and visible range from 0.29 μm to 0.70 μm and is characterized by strong ozone absorption in the ultraviolet and by strong scattering by molecules and aerosols over the whole band. Band 2 covers the near-infrared range from 0.7 μm to 4 μm and is characterized by strong absorption by water vapor, carbon dioxide and other gases, and by limited scattering. This two-band approach improves the accuracy of the clear sky radiation estimate.
Using the latest extraterrestrial spectral energy distribution and the latest solar constant of 1366.1 W/m², the extraterrestrial irradiances of the two bands at the mean sun-earth distance are $E_{0n1} = 635.4\ \mathrm{W/m^2}$ and $E_{0n2} = 709.7\ \mathrm{W/m^2}$, respectively.
For each band $i$, the band direct irradiance $E_{bni}$ can be obtained from the product of the individual transmittances:
$$E_{bni} = T_{Ri}\,T_{gi}\,T_{oi}\,T_{ni}\,T_{wi}\,T_{ai}\,E_{0ni}$$
where $T_{Ri}$, $T_{gi}$, $T_{oi}$, $T_{ni}$, $T_{wi}$ and $T_{ai}$ are the transmittances for Rayleigh scattering, uniformly mixed gas absorption, ozone absorption, nitrogen dioxide absorption, water vapor absorption and aerosol extinction, respectively. The transmittance of each component is modeled by a relatively accurate parameterization that introduces only small errors.
The components of the two bands are then added to obtain the direct irradiance:
$$E_{bn} = E_{bn1} + E_{bn2}$$
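The transmittance product and the band sum can be illustrated as follows. The two extraterrestrial band irradiances are the values given in the text; the transmittance values and the function name `band_direct_irradiance` are made up for the example and do not come from any REST2 parameterization.

```python
from math import prod

# Extraterrestrial band irradiances in W/m2, as stated in the text.
E0N = {1: 635.4, 2: 709.7}


def band_direct_irradiance(band, transmittances):
    """Direct irradiance of one band: product of transmittances times E0n."""
    return prod(transmittances) * E0N[band]


# Hypothetical transmittances, in the order: Rayleigh, mixed gases,
# ozone, nitrogen dioxide, water vapor, aerosol.
t_band1 = [0.80, 1.00, 0.95, 0.99, 1.00, 0.70]
t_band2 = [0.98, 0.97, 1.00, 1.00, 0.85, 0.90]

e_bn1 = band_direct_irradiance(1, t_band1)
e_bn2 = band_direct_irradiance(2, t_band2)
e_bn = e_bn1 + e_bn2  # broadband direct irradiance
print(round(e_bn1, 1), round(e_bn2, 1), round(e_bn, 1))
```

With all transmittances equal to 1 the function returns the extraterrestrial band irradiance unchanged, which is a convenient sanity check.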
The calculation of the diffuse irradiance is based on a two-layer scattering scheme. The top layer is assumed to be the source of all Rayleigh scattering and of all ozone and mixed gas absorption. Likewise, the bottom layer is assumed to be the source of all aerosol scattering and of the aerosol, water vapor and nitrogen dioxide absorption.
In general, a backscattered contribution must be added because of the interaction between the reflecting surface of the earth and the scattering layers of the atmosphere. This contribution is usually small, but it can become particularly important over snow-covered areas. For each band, the backscattered component $E_{ddi}$ is obtained by taking into account the multiple reflections between the ground and the atmosphere:
$$E_{ddi} = \frac{\rho_{gi}\,\rho_{si}\,(E_{bi} + E_{dpi})}{1 - \rho_{gi}\,\rho_{si}}$$
$$\rho_{s1} = \frac{0.13363 + 0.00077358\,\alpha_1 + \beta_1\,(0.37567 + 0.22946\,\alpha_1)/(1 - 0.10832\,\alpha_1)}{1 + \beta_1\,(0.84057 + 0.68683\,\alpha_1)/(1 - 0.08158\,\alpha_1)}$$
$$\rho_{s2} = \frac{0.010191 + 0.00085547\,\alpha_2 + \beta_2\,(0.14618 + 0.062758\,\alpha_2)/(1 - 0.19402\,\alpha_2)}{1 + \beta_2\,(0.58101 + 0.17426\,\alpha_2)/(1 - 0.17568\,\alpha_2)}$$
where $\rho_{gi}$ is the ground albedo and $\rho_{si}$ the sky albedo of band $i$; $E_{bi} = E_{bni}\cos Z$; the total diffuse irradiance of each band is $E_{di} = E_{dpi} + E_{ddi}$; the resulting diffuse irradiance is $E_d = E_{d1} + E_{d2}$; and the global irradiance is $E_g = E_b + E_d$, with $E_b = E_{b1} + E_{b2}$.
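The backscattering formulas above can be sketched as follows. The coefficients are copied from the formulas in the text; the input irradiances, albedo and aerosol parameters are hypothetical, and the final step assumes that the global irradiance is the sum of the direct horizontal and diffuse components.

```python
def sky_albedo_band1(alpha, beta):
    """Sky albedo of band 1, with the coefficients given in the text."""
    num = (0.13363 + 0.00077358 * alpha
           + beta * (0.37567 + 0.22946 * alpha) / (1 - 0.10832 * alpha))
    den = 1 + beta * (0.84057 + 0.68683 * alpha) / (1 - 0.08158 * alpha)
    return num / den


def sky_albedo_band2(alpha, beta):
    """Sky albedo of band 2, with the coefficients given in the text."""
    num = (0.010191 + 0.00085547 * alpha
           + beta * (0.14618 + 0.062758 * alpha) / (1 - 0.19402 * alpha))
    den = 1 + beta * (0.58101 + 0.17426 * alpha) / (1 - 0.17568 * alpha)
    return num / den


def backscattered_diffuse(rho_g, rho_s, e_b, e_dp):
    """Backscattered diffuse component from ground-sky multiple reflections."""
    return rho_g * rho_s * (e_b + e_dp) / (1 - rho_g * rho_s)


def band_totals(e_bn, cos_z, e_dp, rho_g, rho_s):
    """Return (direct horizontal, total diffuse) irradiance of one band."""
    e_b = e_bn * cos_z
    e_d = e_dp + backscattered_diffuse(rho_g, rho_s, e_b, e_dp)
    return e_b, e_d


# Hypothetical inputs: aerosol parameters, zenith cosine, ground albedo,
# band direct irradiances and primary diffuse irradiances in W/m2.
alpha, beta, cos_z, rho_g = 1.3, 0.1, 0.8, 0.2
e_b1, e_d1 = band_totals(400.0, cos_z, 80.0, rho_g, sky_albedo_band1(alpha, beta))
e_b2, e_d2 = band_totals(500.0, cos_z, 30.0, rho_g, sky_albedo_band2(alpha, beta))
e_b, e_d = e_b1 + e_b2, e_d1 + e_d2
e_g = e_b + e_d  # global = direct horizontal + diffuse (assumed)
print(round(e_b, 1), round(e_d, 1), round(e_g, 1))
```

Setting the ground albedo to zero removes the backscattered term entirely, which matches the remark that the contribution matters most over bright surfaces such as snow.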
(2) Heliosat model
To estimate solar radiation efficiently and accurately under all weather conditions, many previous studies fed a large number of atmospheric and surface parameters into a radiative transfer model in order to account for the effect of clouds on radiation. Although these models describe well-defined physical processes, the many variables and the heavy computation involved limit the spatial resolution of the final result. The Heliosat method differs from these models in that it takes the cloud albedo as a single comprehensive indicator of cloud attenuation and determines the influence of cloud from the integrated properties of the whole atmosphere; it needs no assumption about the vertical structure of the atmosphere and is computationally more efficient, which is why it is widely used.
Heliosat adopts a semi-parameterized solar radiation model: it uses satellite data to identify cloud characteristics, accounts for most of the atmospheric attenuation processes of solar radiation in its calculation scheme, and takes some physical parameters as input. The scheme can therefore simulate real conditions adequately. Its estimates are generally better than radiation products such as those of the Global Energy and Water Cycle Experiment (GEWEX) and the ERA reanalysis data.
The Heliosat model combines a clear sky model with a "cloud index". The cloud index method is based on the following assumptions: the appearance of cloud over a pixel increases the reflectivity seen in the visible image; and the attenuation by the atmosphere of the downwelling shortwave irradiance at a pixel is related to the size of the change between the reflectivity that would be observed under a cloudless sky and the reflectivity currently observed. The size of this change can be characterized by introducing a cloud index and a clear sky index. The cloud index $n$ is defined as follows:
$$n = \frac{\rho_t - \rho_g}{\rho_c - \rho_g}$$
The clear sky index $k$ is defined as the ratio of the actual horizontal irradiance to the clear sky horizontal irradiance, namely:
$$k = \frac{G}{G_c}$$
where $\rho_t$ is the apparent reflectivity observed by the satellite, $\rho_c$ is the apparent reflectivity of the brightest cloud layer, $\rho_g$ is the surface reflectivity, and $G$ and $G_c$ are the actual surface irradiance and the clear sky irradiance, respectively. Finally, an empirical relationship between the clear sky index and the cloud albedo is established, as shown in fig. 3.
The specific numerical relationship is as follows:
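The cloud index idea can be sketched as follows. The mapping from cloud index to clear sky index used here is the simplest linear form, k = 1 - n, adopted purely as an assumption for illustration; it is not the numerical relationship of this document. The function names and all reflectivity and irradiance values are likewise hypothetical.

```python
def cloud_index(rho, rho_ground, rho_cloud):
    """Cloud index from observed, ground and brightest-cloud reflectivity."""
    return (rho - rho_ground) / (rho_cloud - rho_ground)


def clear_sky_index_linear(n):
    """Assumed linear mapping k = 1 - n (illustration only)."""
    return 1.0 - n


def surface_irradiance(rho, rho_ground, rho_cloud, g_clear):
    """Actual surface irradiance G = k * Gc for one pixel."""
    n = cloud_index(rho, rho_ground, rho_cloud)
    return clear_sky_index_linear(n) * g_clear


# Hypothetical pixel: ground 0.15, brightest cloud 0.65, observed 0.35,
# and a clear sky irradiance of 800 W/m2.
n = cloud_index(0.35, 0.15, 0.65)
g = surface_irradiance(0.35, 0.15, 0.65, 800.0)
print(round(n, 2), round(g, 1))
```

A pixel as dark as the ground gives n = 0 and the full clear sky irradiance, while a pixel as bright as the brightest cloud gives n = 1.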
The data required for the whole calculation process are shown in Table 1:
Table 1. Inversion model data
Using the satellite radiation inversion technique, actual cases of surface solar radiation were obtained under sunny, partly cloudy and overcast conditions, and the inversion result was evaluated with the daily root mean square error:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(P_{Mi} - P_{Pi,t}\right)^2}$$
where $P_{Mi}$ is the actual power at time $i$, $P_{Pi,t}$ is the ultra-short-term prediction for time $i$, and $N$ is the number of samples, here 144 (satellite data every 10 minutes).
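A daily root mean square error can be computed as in the sketch below. It assumes the plain, unnormalized form of the error; the function name `rmse` and the sample values are illustrative.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# One day at 10-minute resolution gives N = 144 samples.
# Hypothetical series: a constant offset of 5 between truth and forecast.
actual = [100.0] * 144
predicted = [95.0] * 144
daily_error = rmse(actual, predicted)
print(daily_error)
```

A constant offset of 5 between the two series gives an error of exactly 5, which is a quick way to check the implementation.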
Analysis of the verification results shows that the satellite inversion technique can capture the changes in solar radiation caused by minute-level weather changes, so it compensates well for the lack of station radiation data and can effectively improve the accuracy of ultra-short-term photovoltaic power prediction.
2. Satellite extrapolation model
Cloud is one of the main meteorological factors affecting solar radiation and photovoltaic power. The difficulty of ultra-short-term photovoltaic power prediction lies mainly in two points: (1) the regular motion of the sun and the fluctuations of the atmospheric state act together, so that irradiance fluctuations are complex and small-scale intraday changes are hard to capture; (2) the formation, dissipation and movement of cloud clusters cause rapid and violent changes in surface irradiance, so that photovoltaic output shows abrupt minute-level changes in cloudy weather. The ultra-short-term photovoltaic power prediction model therefore uses a satellite extrapolation technique to predict cloud images at minute level, which yields accurate short-term weather forecast data and so improves the ultra-short-term prediction accuracy.
Satellite extrapolation takes the satellite cloud images (Himawari-8 and Fengyun-4) captured by geostationary satellites over a past period and extrapolates them with a satellite extrapolation algorithm to obtain the cloud movement over the next few hours at minute-level resolution.
As shown in fig. 4, satellite extrapolation mainly comprises two steps: data quality control and construction of the extrapolation model.
2.1 data quality control
Data quality control means preprocessing the collected satellite cloud images to handle problems such as noise, bad lines and striping. If noisy satellite cloud images are input into the extrapolation model, the extrapolation accuracy is not improved and may even be degraded. It is therefore very important to perform data quality control on the collected satellite cloud images; filtering based on the Fourier transform is generally used to reduce periodic noise, spike noise, bad lines and striping in the satellite cloud images.
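One way Fourier-based filtering can suppress striping is sketched below with NumPy. The text specifies only that Fourier filtering is used, so the particular notch design (zeroing the zero-horizontal-frequency column of the spectrum apart from its lowest vertical frequencies), the function name `remove_horizontal_stripes`, and the synthetic scene are assumptions made for illustration.

```python
import numpy as np

def remove_horizontal_stripes(image, keep=1):
    """Notch out row-periodic striping, whose energy sits at zero horizontal frequency."""
    spectrum = np.fft.fft2(image)
    # Integer vertical frequencies: 0, 1, ..., -2, -1
    ky = np.fft.fftfreq(image.shape[0]) * image.shape[0]
    notch = np.abs(ky) > keep      # keep the mean and the slowest vertical variation
    spectrum[notch, 0] = 0.0       # column 0 holds zero horizontal frequency
    return np.real(np.fft.ifft2(spectrum))

rows, cols = 64, 64
y, x = np.mgrid[:rows, :cols]
scene = 1.0 + np.cos(2 * np.pi * x / cols)         # hypothetical smooth field
stripes = 0.3 * np.sin(2 * np.pi * 8 * y / rows)   # synthetic striping, 8 cycles
cleaned = remove_horizontal_stripes(scene + stripes)
print(np.allclose(cleaned, scene))  # True
```

A notch of this kind also removes any genuine image content that varies only from row to row, so in practice the `keep` parameter trades stripe suppression against loss of real structure.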
2.2 satellite cloud image extrapolation model
The most commonly used extrapolation models at present are based on the optical flow method. However, because of its three constraints (brightness constancy, small motion and spatial coherence), optical flow extrapolation can predict the motion trajectory of the satellite cloud image but cannot predict changes in cloud thickness. Recent research on artificial intelligence in the meteorological industry has found that extrapolation models based on deep learning can predict not only the trajectory but also the size and thickness of the cloud layer.
The extrapolation model mainly adopts the Multi-level Correlation LSTM (MLC-LSTM), whose core is ConvLSTM. MLC-LSTM performs satellite extrapolation from a historical time sequence of satellite cloud images in an encoder-decoder manner. In the first m-1 steps, the input at each step is the real satellite image of the previous time step; in the subsequent steps, the input is the model's own prediction from the previous time step. Finally, the predictions of steps (m-1, t) are taken as the output, and a loss function is selected to optimize the model.
ConvLSTM replaces the matrix multiplications inside the LSTM with convolution operations, that is, it has convolutional structures in both the input-to-state and state-to-state transitions, which makes it better suited to processing spatio-temporal data. Research shows that the ConvLSTM network captures spatio-temporal correlations better than, and consistently outperforms, the fully connected LSTM (FC-LSTM).
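The input schedule described above (real images first, then the model's own predictions) can be sketched as a simple rollout loop. This is not the MLC-LSTM itself: `extrapolate`, `step_fn` and the toy `shift_right` cell are hypothetical stand-ins used only to show how predictions are fed back as inputs.

```python
def extrapolate(observed, step_fn, n_future):
    """Feed real frames first, then feed back the model's own predictions.

    observed: non-empty list of real frames.
    step_fn(frame, state) -> (prediction_of_next_frame, new_state).
    """
    if not observed or n_future < 1:
        raise ValueError("need at least one observed frame and one future step")
    state, prediction = None, None
    for frame in observed:                # warm-up: inputs are real images
        prediction, state = step_fn(frame, state)
    outputs = [prediction]
    for _ in range(n_future - 1):         # afterwards: inputs are earlier predictions
        prediction, state = step_fn(outputs[-1], state)
        outputs.append(prediction)
    return outputs

# Hypothetical stand-in for a trained cell: moves a 1-D "cloud" one pixel to the right.
def shift_right(frame, state):
    return [frame[-1]] + frame[:-1], state

history = [[1, 0, 0, 0], [0, 1, 0, 0]]
print(extrapolate(history, shift_right, n_future=2))  # [[0, 0, 1, 0], [0, 0, 0, 1]]
```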
3. Photovoltaic power ultra-short term prediction model
The photovoltaic power ultra-short-term prediction model fuses satellite-extrapolated cloud images, rapid-update assimilation data, DEM elevation data, time, theoretical sun position and other data, inputs the fused data into a spatio-temporal neural network model for prediction, and finally produces ultra-short-term power predictions from 15 minutes to 4 hours ahead.
As shown in fig. 5, the photovoltaic power ultra-short term prediction mainly includes the steps of:
(1) the data required by the prediction model are collected and subjected to data quality control, so that the data used in modeling are clean and reliable and the accuracy of model prediction is ensured;
(2) for the quality-controlled data, Pearson correlation coefficient analysis is combined with principal component analysis (PCA) to perform feature engineering and feature selection, so as to extract useful features, reduce feature redundancy, reduce the structural and time complexity of the prediction model, and keep model prediction efficient;
(3) a 3DConv_LSTM network structure is adopted to perform power prediction from a spatio-temporal perspective, further ensuring the accuracy of ultra-short-term power prediction.
3.1 data quality control
The collected data set includes automatic weather station observation data and rapid-update assimilation data, and various abnormal values and missing values may exist in both the station observation data and the model data. Therefore, all data in the data set need to be preprocessed accordingly, so that the data used in modeling are clean and reliable and the accuracy of model prediction is ensured.
The total flow of the quality control of the meteorological data is shown in figure 6:
(1) Limit value check: according to its characteristics, the limit value check is divided into a climate limit value check and a region limit value check. Data outside the limit range are regarded as abnormal and are not stored in the database. How the upper limit of the range is set determines whether the limit value check is effective. If the upper limit is set too high, erroneous data are likely to be accepted as correct and then used in applications; if it is set too low, individual correct data with large values may be rejected as erroneous, even though such data are very valuable for research on extreme weather and for meteorological warnings.
(2) Temporal consistency check: meteorological elements change over time according to certain rules and do not jump sharply within a relatively short time. The temporal consistency check uses this characteristic to verify whether the changes in the element values of an automatic weather station conform to those rules. The sampled value of each meteorological element must not change by more than a given range within a given time, and data exceeding that range are suspect.
(3) Internal consistency check: based on meteorological principles, the internal consistency check uses the physical relationships between different meteorological elements to judge whether they conform to a certain rule, that is, whether the observed value of one variable is credible given the observed value of another variable at the same time. It can detect not only definite errors but also possible errors.
(4) Spatial consistency check: meteorological parameters have characteristic spatial distributions, and a station and its neighbouring stations share similar geographic environments and therefore similar meteorological elements. The spatial consistency check uses this characteristic to verify whether the meteorological elements of a station are correct. According to the spatial distribution of the element, the data of a station are compared with the values observed by neighbouring stations at the same time, or an estimate for the station under check is computed from the neighbouring observations by interpolation and then compared with the observed value. If the deviation, whether positive or negative, exceeds the historical upper limit in magnitude, the record is marked as suspect and an alarm is raised to prompt manual confirmation; if the magnitude exceeds twice the historical upper limit, the value is treated as missing.
The meteorological data are checked with the above flow, and data that do not meet the criteria are recorded as abnormal values. If there are not too many abnormal values, they can be treated as missing values, which then need further processing.
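The four checks can be sketched in Python as follows. All function names, thresholds and sample readings are hypothetical; the internal consistency example (diffuse irradiance not exceeding global irradiance) and the use of a simple neighbour mean as the interpolation method are assumptions chosen for illustration.

```python
def limit_check(value, lower, upper):
    """True if the value lies inside the climate or region limit range."""
    return lower <= value <= upper

def temporal_check(previous, current, max_step):
    """True if the change between consecutive samples is within the allowed range."""
    return abs(current - previous) <= max_step

def internal_check(global_irradiance, diffuse_irradiance):
    """Example physical relation: diffuse irradiance cannot exceed global irradiance."""
    return diffuse_irradiance <= global_irradiance

def spatial_check(value, neighbour_values, historical_limit):
    """Compare an observation with the mean of neighbouring stations."""
    estimate = sum(neighbour_values) / len(neighbour_values)
    deviation = abs(value - estimate)
    if deviation > 2 * historical_limit:
        return "missing"   # treated as a missing value
    if deviation > historical_limit:
        return "suspect"   # flagged for manual confirmation
    return "ok"

# Hypothetical irradiance readings in W/m^2.
print(limit_check(1500.0, 0.0, 1400.0))             # False
print(temporal_check(600.0, 650.0, 200.0))          # True
print(internal_check(800.0, 120.0))                 # True
print(spatial_check(900.0, [500.0, 540.0], 150.0))  # missing
print(spatial_check(700.0, [500.0, 540.0], 150.0))  # suspect
```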
The invention adopts the K-nearest-neighbor method to process missing values, which ensures high data quality and facilitates later use. The K-nearest-neighbor method estimates a missing value from its K nearest neighbors; the weight of each neighbor is derived from its distance to the sample containing the missing value, and the Euclidean distance is generally used. Missing values are handled in two cases: if the variable is discrete, a K-nearest-neighbor classifier votes and the most frequent category among the K neighbors is used for filling; if the variable is continuous, a K-nearest-neighbor regressor fills in the average value of the variable over the K neighbors.
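A minimal sketch of K-nearest-neighbor filling for both cases is given below. The function name `knn_impute`, the inverse-distance weighting used for the continuous case, and the sample data are assumptions; the text states only that neighbor weights are derived from Euclidean distance.

```python
import math
from collections import Counter

def knn_impute(target_features, complete_rows, k=3, discrete=False):
    """Estimate a missing value from the k nearest complete samples.

    complete_rows: list of (features, value) pairs without missing entries.
    """
    ranked = sorted(complete_rows, key=lambda row: math.dist(target_features, row[0]))
    neighbours = ranked[:k]
    if discrete:
        # Majority vote among the k neighbours.
        return Counter(value for _, value in neighbours).most_common(1)[0][0]
    # Inverse-distance weighted average (assumed weighting scheme).
    weights = [1.0 / (math.dist(target_features, f) + 1e-9) for f, _ in neighbours]
    return sum(w * v for w, (_, v) in zip(weights, neighbours)) / sum(weights)

# Hypothetical samples: (feature vector, observed value).
continuous = [((0.0,), 10.0), ((2.0,), 20.0), ((10.0,), 100.0)]
print(knn_impute((1.0,), continuous, k=2))

categorical = [((0.0,), "clear"), ((1.0,), "clear"), ((3.0,), "cloudy")]
print(knn_impute((0.5,), categorical, k=3, discrete=True))  # clear
```

In the continuous example the two nearest neighbors are equally distant, so the weighted average reduces to their plain mean.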
3.2 feature engineering and feature selection
If the feature variables in the existing data set are too few, or do not adequately represent the characteristics of the data, so that modeling on this small set of features cannot achieve a good prediction, feature engineering is required to generate new feature variables that represent the characteristics of the data or to transform the existing feature variables.
The invention combines Pearson correlation coefficient analysis with principal component analysis (PCA). PCA maps high-dimensional data into a low-dimensional space by linear projection such that the variance of the data along the projected dimensions is maximized, so that fewer data dimensions are used while more of the characteristics of the original data points are retained.
The PCA algorithm flow is as follows: standardize the original d-dimensional data set; construct the covariance matrix of the samples; calculate the eigenvalues of the covariance matrix and the corresponding eigenvectors; select the eigenvectors corresponding to the k largest eigenvalues, where k is the dimension of the new feature space (k ≤ d); construct a mapping matrix W from these k eigenvectors; convert the d-dimensional input data set X into the new k-dimensional feature subspace by means of the mapping matrix W. After PCA conversion, a meteorological feature no longer has a physical meaning and is merely a numerical value.
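The PCA steps listed above, preceded by a simple Pearson screening, can be sketched with NumPy. How the two analyses are combined is not detailed in the text, so screening features by their absolute correlation with the target before PCA, the threshold `min_abs_r`, and the function names are assumptions.

```python
import numpy as np

def pearson_select(X, y, min_abs_r=0.3):
    """Indices of columns whose |Pearson r| with the target reaches min_abs_r."""
    r = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    return np.flatnonzero(np.abs(r) >= min_abs_r)

def pca_transform(X, k):
    """Project the n x d data set X onto its first k principal axes."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize (columns must not be constant)
    cov = np.cov(Z, rowvar=False)              # d x d covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigendecomposition of a symmetric matrix
    order = np.argsort(eigvals)[::-1][:k]      # indices of the k largest eigenvalues
    W = eigvecs[:, order]                      # d x k mapping matrix
    return Z @ W                               # n x k representation

# Hypothetical data: column 0 tracks the target, column 1 is uncorrelated with it.
y = np.array([1.0, 2.0, 3.0, 4.0])
X = np.column_stack([2.0 * y, [1.0, -1.0, -1.0, 1.0]])
print(pearson_select(X, y))          # [0]
print(pca_transform(X, 1).shape)     # (4, 1)
```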
Table 2 shows the final features obtained by applying Pearson correlation coefficient analysis and the PCA algorithm to the raw data features. Since the features after PCA conversion have no physical meaning, they are not described in detail.
TABLE 2 ultra-short term power prediction model data sheet
3.3 ultra-short term spatio-temporal prediction model
As shown in fig. 7, the ultra-short-term prediction model takes photovoltaic power station real-time operation data, automatic weather station monitoring data, satellite cloud image extrapolation data, rapid-update assimilation model data, time and other data as input, uses a 3DConv_LSTM network structure for prediction, and outputs photovoltaic power predictions from 15 minutes to 4 hours ahead. The 3DConv_LSTM network structure is divided into two parts:
(1) a 3DConv network structure further extracts the spatio-temporal features required for photovoltaic prediction;
(2) the extracted spatio-temporal feature vectors are input into LSTM units to construct a nonlinear mapping between the spatio-temporal features and the photovoltaic power, on the basis of which the power from 15 minutes to 4 hours ahead is predicted.
(1) 3DConv network
The 3DConv network is well suited to spatio-temporal feature learning. Compared with 2DConv networks, 3DConv networks model temporal information better through 3D convolution and 3D pooling operations. In 3DConv networks, convolution and pooling are performed over space and time, whereas in 2DConv networks they are performed only over space.
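To illustrate how a 3D kernel slides over time as well as space, the following naive NumPy sketch applies a single 3D filter to a short stack of frames. It computes cross-correlation, which is what deep learning frameworks conventionally call convolution; the function name `conv3d_valid`, the frame stack and the averaging kernel are assumptions for illustration and not the network used in the invention.

```python
import numpy as np

def conv3d_valid(volume, kernel):
    """Naive 'valid' 3D cross-correlation over (time, height, width)."""
    T, H, W = volume.shape
    kt, kh, kw = kernel.shape
    out = np.zeros((T - kt + 1, H - kh + 1, W - kw + 1))
    for t in range(out.shape[0]):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                window = volume[t:t + kt, i:i + kh, j:j + kw]
                out[t, i, j] = np.sum(window * kernel)
    return out

frames = np.arange(4 * 5 * 5, dtype=float).reshape(4, 5, 5)  # 4 hypothetical 5x5 frames
kernel = np.ones((2, 3, 3)) / 18.0                            # spatio-temporal average
features = conv3d_valid(frames, kernel)
print(features.shape)  # (3, 3, 3)
```

The output keeps a time axis (here of length 3), so later layers can still reason about temporal order; a 2D convolution applied to the stacked frames would collapse that axis after the first layer.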
(2) LSTM network
The long short-term memory network (LSTM) is a variant of the recurrent neural network. By introducing linear connections and a gating mechanism, it effectively alleviates the gradient explosion or vanishing problem of simple recurrent neural networks, so that the whole network can establish longer-range temporal dependencies.
The LSTM network introduces a new internal state $c_t$ dedicated to linear recurrent information transfer, while nonlinearly outputting information to the external state $h_t$ of the hidden layer:
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$
$$h_t = o_t \odot \tanh(c_t)$$
Wherein $f_t$, $i_t$ and $o_t$ are the forget gate, the input gate and the output gate, respectively, which control the path of information transfer; $\odot$ denotes the element-wise product of vectors; $c_{t-1}$ is the memory cell at the previous time step; and $\tilde{c}_t$ is the candidate state obtained through a nonlinear function. The roles of the three gates are as follows: the forget gate $f_t$ controls how much information of the internal state $c_{t-1}$ of the previous time step needs to be forgotten; the input gate $i_t$ controls how much information of the candidate state $\tilde{c}_t$ at the current time step needs to be saved; the output gate $o_t$ controls how much information of the internal state $c_t$ at the current time step needs to be output to the external state $h_t$. When $f_t = 0$ and $i_t = 1$, the memory cell clears its history information and writes the candidate state vector $\tilde{c}_t$; even then, the memory cell $c_t$ is still related to the historical information of the previous time step. When $f_t = 1$ and $i_t = 0$, the memory cell copies the content of the previous time step and writes no new information.
The three gates are calculated as:
$$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)$$
$$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)$$
$$o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)$$
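A single LSTM time step built from these gate equations can be sketched with NumPy as follows. The sketch uses the standard LSTM state update $c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$ and assumes that the candidate state is computed with tanh from its own parameters $(W_c, U_c, b_c)$; the function names, the parameter dictionary layout and the numerical values are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; params maps 'i', 'f', 'o', 'c' to (W, U, b)."""
    def affine(name):
        W, U, b = params[name]
        return W @ x_t + U @ h_prev + b
    i_t = sigmoid(affine("i"))         # input gate
    f_t = sigmoid(affine("f"))         # forget gate
    o_t = sigmoid(affine("o"))         # output gate
    c_tilde = np.tanh(affine("c"))     # candidate state (tanh is an assumption)
    c_t = f_t * c_prev + i_t * c_tilde
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

def make_params(hidden, inputs, bias):
    return (np.zeros((hidden, inputs)), np.zeros((hidden, hidden)), np.full(hidden, bias))

# Hypothetical parameters that force f_t close to 1 and i_t close to 0 (copy mode).
params = {name: make_params(2, 3, 0.0) for name in "ifoc"}
params["f"] = make_params(2, 3, 50.0)
params["i"] = make_params(2, 3, -50.0)
c_prev = np.array([2.0, -2.0])
h_t, c_t = lstm_step(np.ones(3), np.zeros(2), c_prev, params)
print(np.allclose(c_t, c_prev))  # True
```

With the forget gate saturated near 1 and the input gate near 0, the memory cell simply carries its previous content forward, which is the copy behavior described for $f_t = 1$, $i_t = 0$.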
The hidden state h in a recurrent neural network stores historical information and can be regarded as a kind of memory. In a simple recurrent network, the hidden state is overwritten at every time step and can therefore be regarded as short-term memory. In neural networks, long-term memory can be regarded as the network parameters, which embody the experience learned from the training data and are updated far more slowly than short-term memory. In LSTM networks, the memory cell c can capture a piece of key information at a certain time and retain it for a certain time interval. The life cycle of the information stored in the memory cell c is longer than that of the short-term memory h but much shorter than that of long-term memory, hence the name long short-term memory.
The LSTM has the following advantages:
(1) within a memory block the LSTM can propagate a constant error; under the control of the input gate and the forget gate at different time steps, the error can be propagated backward over long intervals, so that the LSTM can handle the long-term dependency problem;
(2) the LSTM can handle noise, continuous inputs and distributed representations;
(3) the LSTM has strong generalization ability;
(4) fine-grained parameter tuning is not required, and the computational complexity is comparable to that of an ordinary RNN.
While the invention has been described with reference to specific embodiments, it will be understood by those skilled in the art that the present invention may be practiced without limitation to such specific embodiments. Any modification which does not depart from the functional and structural principles of the present invention is intended to be included within the scope of the claims.