CN115115125A

CN115115125A - Photovoltaic power interval probability prediction method based on deep learning fusion model

Info

Publication number: CN115115125A
Application number: CN202210823439.9A
Authority: CN
Inventors: 王开艳; 杜浩东; 贾嵘; 王颂凯; 刘恒
Original assignee: Xian University of Technology
Current assignee: Xian University of Technology
Priority date: 2022-07-13
Filing date: 2022-07-13
Publication date: 2022-09-27

Abstract

The invention discloses a photovoltaic power interval probability prediction method based on a deep learning fusion model, which specifically comprises the following steps: first, acquiring photovoltaic power and meteorological factor data, carrying out variable correlation analysis, and determining the input variables of the prediction model; selecting a clustering variable, constructing statistical features of the clustering variable, performing similar-day clustering on the photovoltaic historical data with a fuzzy C-means clustering algorithm, and normalizing the photovoltaic historical data; then dividing each similar-day data set into a training set and a test set; constructing a QR-CNN-BiLSTM interval prediction model, training it, and predicting the photovoltaic power interval; and finally, generating a photovoltaic probability prediction result on the test set. The method tracks the future photovoltaic power trend well, quantifies the uncertainty of photovoltaic power prediction with high accuracy while meeting the reliability requirement, generates the photovoltaic power prediction interval at the corresponding confidence level, and has practical application value.

Description

Photovoltaic power interval probability prediction method based on deep learning fusion model

Technical Field

The invention belongs to the technical field of photovoltaic power generation prediction, and particularly relates to a photovoltaic power interval probability prediction method based on a deep learning fusion model.

Background

In recent years, environmental pollution has become more serious and the shortage of non-renewable energy more prominent, and countries around the world are seeking new paths for energy development. China began implementing its new energy development strategy in 2014, and solar energy, an important component of the energy transition, has been developed and utilized on a large scale. By the end of 2021, the total installed photovoltaic capacity in China had reached 306 million kW, of which 53 million kW was newly installed in 2021. Large-scale grid-connected new energy generation, represented by photovoltaic generation, is an unstoppable development trend and a prominent feature of the next-generation power system. However, under the influence of various complex environmental factors, photovoltaic generation exhibits strong random fluctuation, intermittency and non-stationarity. As a high proportion of photovoltaic generation is connected to the power system, photovoltaic generation, being an uncontrollable power source, seriously threatens the safe and stable operation of the power system. Research on photovoltaic power prediction is therefore of great significance for building a new generation of power system in China that can accommodate a high proportion of renewable energy, and of great value for building an integrated security defense system for the power system and realizing risk control.

By the form of their results, existing photovoltaic power prediction techniques can be classified into deterministic prediction and uncertainty prediction. Deterministic prediction yields a single-point forecast, which is intuitive but cannot represent the uncertainty of the prediction. Uncertainty prediction gives the possible range of the photovoltaic power at a future moment together with its confidence level, and thus provides more comprehensive data support for power system scheduling and has greater engineering significance. By the type of model, prediction methods are classified into physical models and data-driven models. Physical models are built from the characteristics and installation angle of the photovoltaic modules, taking geographic conditions and meteorological elements into account; their construction is complex and they are currently seldom applied. Data-driven models mainly comprise statistical methods and artificial intelligence algorithms. Statistical methods use curve fitting and parameter estimation to establish the relation between the photovoltaic power and its influencing factors; time series methods and grey models are common examples. Artificial intelligence models, represented by neural networks and deep learning models, have strong nonlinear data processing capability and have been widely adopted in recent years.

Although scholars have done a great deal of work in the field of photovoltaic power prediction, the following problems remain at the present stage: (1) photovoltaic power prediction research concentrates on point prediction, and interval probability prediction has received less study. (2) The reliability and sharpness of existing interval probability prediction models are not high, and the performance of photovoltaic power interval probability prediction models urgently needs further improvement. (3) Most studies use example data at intervals of 1 h or 15 min; when power is predicted at 5 min intervals, the photovoltaic power fluctuation is more complex and variable, a traditional single model cannot cope with it well, and multi-model fusion is one possible solution.

Disclosure of Invention

The invention aims to provide a photovoltaic power interval probability prediction method based on a deep learning fusion model, and solves the problems of inaccurate photovoltaic power interval prediction and probability prediction results.

The technical scheme adopted by the invention is that the photovoltaic power interval probability prediction method based on the deep learning fusion model is implemented according to the following steps:

step 1, acquiring preprocessed meteorological element data and historical photovoltaic power data, performing variable correlation analysis on the photovoltaic power and meteorological factors, and determining input variables of a prediction model;

step 2, selecting clustering variables and constructing statistical characteristics of the clustering variables;

step 3, according to the clustering variables and the statistical characteristics thereof selected in the step 2, performing similar daily clustering on the photovoltaic historical data by adopting a fuzzy C-means clustering algorithm to obtain a photovoltaic similar daily data set;

step 4, carrying out normalization processing on the photovoltaic similar day data set by adopting a min-max normalization method;

step 5, dividing the similar day data sets under each weather condition into a training set and a testing set;

step 6, constructing a QR-CNN-BiLSTM interval prediction model, setting the parameters of the CNN and BiLSTM models, and setting the parameters of model training; inputting the training set of the similar-day data set into the model for training;

step 7, inputting the data of the test set into the QR-CNN-BiLSTM interval prediction model trained in step 6, and performing photovoltaic power interval prediction;

step 8, performing inverse normalization on the interval prediction result so that it has physical meaning;

step 9, generating a photovoltaic probability prediction result on the test set by a kernel density estimation algorithm optimized by cross-validation and grid search.

The present invention is also characterized in that,

in the step 1, the method specifically comprises the following steps:

step 1.1, selecting preprocessed meteorological element data and preprocessed photovoltaic power generation power data;

the time resolution of the meteorological element variable and the photovoltaic power variable is 5min, and wind speed, relative humidity, ambient temperature, total horizontal radiation, diffused horizontal radiation, rainfall, wind direction, total oblique radiation and diffused oblique radiation are selected as original meteorological element data variables;

step 1.2, measuring the degree of correlation between the photovoltaic power and each meteorological element variable by the Kendall rank correlation coefficient R;

and step 1.3, selecting the meteorological factors whose Kendall rank correlation coefficient R with the photovoltaic power has an absolute value of not less than 0.5 as inputs to the prediction model.

In the step 2, the method specifically comprises the following steps:

step 2.1, selecting the meteorological factor variable with the highest Kendall rank correlation coefficient with the photovoltaic power as the clustering variable;

and step 2.2, selecting the average value, standard deviation, maximum value, number of peaks and valleys, coefficient of variation, kurtosis and skewness of the clustering variable as statistical features.

In the step 3, the method specifically comprises the following steps:

step 3.1, calculating numerical values of 7 statistical characteristics of the clustering variables on each day according to the statistical characteristics of the clustering variables constructed in the step 2;

step 3.2, determining the number c of clustering categories, initializing the clustering centers $V_j$, setting the fuzzification parameter m, initializing the membership matrix $U^{(0)}$, and setting the termination threshold $\varepsilon$ of the algorithm;

step 3.3, calculating all clustering centers of the t-th iteration according to formula (5) to obtain the clustering center matrix:

$$V_j^{(t)} = \frac{\sum_{i=1}^{n} \left(u_{ij}^{(t-1)}\right)^{m} x_i}{\sum_{i=1}^{n} \left(u_{ij}^{(t-1)}\right)^{m}}, \quad j = 1, \dots, c \qquad (5)$$

in the formula: $u_{ij}$ is the membership degree of the i-th sample in the j-th class; $x_i$ is a sample point; m is the membership factor; n is the number of samples; t is the iteration number; c is the number of clusters;

step 3.4, updating the membership matrix $U^{(t)}$ as shown in formula (6):

$$u_{ij}^{(t)} = \left[ \sum_{k=1}^{c} \left( \frac{d_{ij}}{d_{ik}} \right)^{\frac{2}{m-1}} \right]^{-1} \qquad (6)$$

in the formula: $u_{ij}$ is the membership degree of the i-th sample in the j-th class; m is the membership factor; t is the iteration number; c is the number of clusters; $d_{ij} = \lVert x_i - V_j^{(t)} \rVert$ is the distance from the i-th sample to the j-th class center;

step 3.5, calculating $\lVert U^{(t)} - U^{(t-1)} \rVert$ and checking whether the iteration stop condition $\lVert U^{(t)} - U^{(t-1)} \rVert < \varepsilon$ is met; if so, stopping the iteration; otherwise repeating step 3.3 and step 3.4 until the condition is met, finally obtaining the similar-day data sets under each weather condition.

In step 6, the method specifically comprises the following steps:

step 6.1, constructing feature maps from the normalized similar-day data set with a sliding window, inputting them into the CNN network, and extracting feature vectors representing the dynamic change of the photovoltaic power with its convolution layer and pooling layer;

step 6.2, converting the output feature vectors into a time sequence and inputting it into the BiLSTM network;

step 6.3, introducing a QR model and fusing it with the CNN-BiLSTM model through the loss function;

let

$$\hat{Y}_t = f(X_t, \Omega)$$

denote the CNN-BiLSTM point prediction model, where $X_t$ is the model input, i.e. the independent variable, $\Omega$ denotes the model parameters of the CNN-BiLSTM, $Y_t$ is the dependent variable, and $\hat{Y}_t$ is the predicted value of $Y_t$;

the QR-CNN-BiLSTM model may be represented as

$$Q_{Y_t}(\tau \mid X_t) = f(X_t, \Omega(\tau))$$

where $\tau \in (0, 1)$ is the quantile; the estimate $\hat{\Omega}(\tau)$ of the model parameters at each quantile is obtained by minimizing the loss function

$$\hat{\Omega}(\tau) = \arg\min_{\Omega} \sum_{t} \rho_{\tau}\big(Y_t - f(X_t, \Omega)\big), \qquad \rho_{\tau}(e) = e \, \big(\tau - \mathbb{1}[e < 0]\big);$$

step 6.4, setting the parameters of the model and training the model;

the QR-CNN-BiLSTM network structure consists of a CNN layer, a max-pooling layer, three BiLSTM layers and a fully connected layer, connected in sequence;

setting the number of convolution layers as 1 layer, the number of convolution kernels as 64, the size of the convolution kernels as 4, the boundary processing mode of convolution as ' same ', the activation function as ' and the size of the pooling window as 3;

setting the number of BiLSTM layers to 3, the number of neurons to 128 and the dropout parameter to 0.2; setting the quantile start point to 0.05, the step length to 0.05 and the end point to 1 (exclusive), i.e. the quantiles 0.05, 0.10, ..., 0.95, so that the number of neurons in the fully connected layer is 19;

setting the initial learning rate to 0.01, the learning rate decay to 1.5, the minimum learning rate to 10^-4, the maximum number of iterations to 100, the batch size to 32, and the optimizer to Adam;

inputting the training set of the similar-day data set into the constructed QR-CNN-BiLSTM model for model training.

The invention has the beneficial effects that:

the photovoltaic power interval probability prediction method based on a deep learning fusion model provides a correlation analysis method based on the Kendall rank correlation coefficient, so that the model input variables can be determined and invalid information in the historical data is reduced, improving training efficiency; the meteorological factor variable most highly correlated with the photovoltaic power is selected as the clustering variable, and 7 of its statistical features, such as the average value, are selected as clustering features to comprehensively reflect the fluctuation pattern and characteristics of each day of historical data, so that the FCM (fuzzy C-means) algorithm clusters efficiently and reasonably; the QR-CNN-BiLSTM model fuses two deep learning models, CNN and BiLSTM, and has higher prediction accuracy than a traditional single deep learning prediction model; when future photovoltaic power is predicted at a fine time granularity of 5 min, where the power changes more rapidly under non-sunny weather, the model tracks the future photovoltaic power trend better, quantifies the prediction uncertainty with high accuracy while meeting the reliability requirement, generates the photovoltaic power prediction interval at the corresponding confidence level, and has practical application value.

Drawings

FIG. 1 is a flow chart of the photovoltaic power interval probability prediction method based on a deep learning fusion model according to the present invention;

FIG. 2 is a flow chart of the fuzzy C-means clustering algorithm used in the photovoltaic power interval probability prediction method based on a deep learning fusion model;

FIG. 3 is a graph of the interval prediction results of the QR-CNN-BiLSTM model used in the method of the present invention on sunny days;

FIG. 4 is a graph of the interval prediction results of the QR-LSTM model on sunny days;

FIG. 5 is a graph of the interval prediction results of the QR-BiLSTM model on sunny days;

FIG. 6 is a graph of the interval prediction results of the QR-CNN-BiLSTM model used in the method of the present invention on sunny-to-cloudy days;

FIG. 7 is a graph of the interval prediction results of the QR-LSTM model on sunny-to-cloudy days;

FIG. 8 is a graph of the interval prediction results of the QR-BiLSTM model on sunny-to-cloudy days;

FIG. 9 is a graph of the interval prediction results of the QR-CNN-BiLSTM model used in the method of the present invention on rainy days;

FIG. 10 is a graph of the interval prediction results of the QR-LSTM model on rainy days;

FIG. 11 is a graph of the interval prediction results of the QR-BiLSTM model on rainy days;

FIG. 12 is a graph of the probability prediction results of the QR-CNN-BiLSTM model combined with kernel density estimation on sunny days;

FIG. 13 is a graph of the probability prediction results of the QR-CNN-BiLSTM model combined with kernel density estimation on sunny-to-cloudy days;

FIG. 14 is a graph of the probability prediction results of the QR-CNN-BiLSTM model combined with kernel density estimation on rainy days.

Detailed Description

The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.

The photovoltaic power interval probability prediction method based on the deep learning fusion model is implemented according to the following steps as shown in fig. 1:

step 1, acquiring preprocessed meteorological element data and historical photovoltaic power data, performing variable correlation analysis on photovoltaic power and meteorological factors, and determining input variables of a prediction model; the method specifically comprises the following steps:

step 1.1, selecting preprocessed meteorological element data and preprocessed photovoltaic power generation power data, wherein time resolutions of meteorological element variables and photovoltaic power variables are kept consistent;

the time resolution of the meteorological element variable and the photovoltaic power variable is 5min, wind speed, relative humidity, ambient temperature, total horizontal radiation, diffused horizontal radiation, rainfall, wind direction, total oblique radiation and diffused oblique radiation are selected as original meteorological element data variables, the photovoltaic array does not generate electricity at night, data with the power at night being 0 are removed, and data in a period of 6: 00-19: 30 each day are reserved as example analysis data.

Missing values in the historical meteorological data and the photovoltaic power data are located and filled by interpolation; outliers in the original historical meteorological data and photovoltaic power data are located with a box plot and replaced by the upper and lower boundaries of the box plot;
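The cleaning described above can be illustrated with a minimal sketch. The function names are hypothetical, linear interpolation is assumed as the interpolation method, and the box plot boundaries are assumed to be the usual whiskers at 1.5 times the interquartile range; the source does not specify either choice.

```python
from statistics import quantiles


def fill_gaps_linear(values):
    """Fill None entries by linear interpolation between the nearest known
    neighbours; gaps at either end copy the nearest known value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no known values")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = values[left] + w * (values[right] - values[left])
    return filled


def clip_to_box_bounds(values, k=1.5):
    """Replace outliers by the box plot boundaries Q1 - k*IQR and Q3 + k*IQR
    (k = 1.5 is an assumed, conventional whisker length)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


print(fill_gaps_linear([1.0, None, 3.0]))
print(clip_to_box_bounds([1, 2, 3, 4, 100]))
```

In practice the same two functions would be applied column by column to each meteorological variable and to the power series.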

step 1.2, measuring the degree of correlation between the photovoltaic power and each meteorological element variable by the Kendall rank correlation coefficient R, which is defined as formula (1):

$$R = \frac{P - Q}{\frac{1}{2} n (n - 1)} \qquad (1)$$

in the formula: P is the number of concordant pairs; Q is the number of discordant pairs; $\frac{1}{2} n (n - 1)$ is the total number of pairs of observations, n being the number of observations. Two pairs of observations $(A_i, B_i)$ and $(A_j, B_j)$ of variables A and B are concordant if $A_i < A_j$ and $B_i < B_j$, or if $A_i > A_j$ and $B_i > B_j$; otherwise the two pairs are discordant;

the closer the absolute value of the Kendall rank correlation coefficient R is to 1, the higher the correlation between the meteorological variable and the output power. A positive coefficient indicates positive correlation, and a negative coefficient indicates negative correlation;

step 1.3, selecting the meteorological factors whose Kendall rank correlation coefficient R with the photovoltaic power has an absolute value of not less than 0.5 as inputs to the prediction model;

in this example, the Kendall rank correlation coefficient R between the photovoltaic power variable and each meteorological element variable is shown in Table 1:

TABLE 1 Meteorological factor variables

The meteorological factors highly correlated with the photovoltaic power, namely those whose Kendall rank correlation coefficient with the photovoltaic power has an absolute value of not less than 0.5, are used as inputs to the prediction model; accordingly, total horizontal radiation, diffused horizontal radiation, total oblique radiation and diffused oblique radiation are selected as the meteorological element variables of this embodiment;
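The screening of step 1 can be sketched as follows. This is an illustrative implementation only: the function names and the sample series are hypothetical, and tied pairs are assumed to count as neither concordant nor discordant, which the source does not address.

```python
from itertools import combinations


def kendall_r(a, b):
    """Kendall rank correlation R = (P - Q) / (n (n - 1) / 2).
    Tied pairs add to neither P nor Q (an assumption of this sketch)."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two series of equal length >= 2")
    p = q = 0
    for i, j in combinations(range(len(a)), 2):
        s = (a[i] - a[j]) * (b[i] - b[j])
        if s > 0:
            p += 1  # concordant pair
        elif s < 0:
            q += 1  # discordant pair
    n = len(a)
    return (p - q) / (n * (n - 1) / 2)


def select_inputs(power, factors, threshold=0.5):
    """Keep the factors whose |R| with the power is at least `threshold`."""
    return [name for name, series in factors.items()
            if abs(kendall_r(power, series)) >= threshold]


# Made-up series, for illustration only.
power = [0, 1, 2, 3, 4]
factors = {
    "total_horizontal_radiation": [0, 2, 4, 6, 8],
    "relative_humidity": [5, 4, 3, 2, 1],
    "wind_speed": [1, 3, 0, 2, 1],
}
print(select_inputs(power, factors))
```

Because the absolute value is used, a strongly negatively correlated factor is retained as well as a strongly positively correlated one.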

step 2, selecting a clustering variable, and constructing statistical characteristics of the clustering variable, wherein the statistical characteristics specifically comprise the following steps:

step 2.1, selecting the meteorological factor variable with the highest Kendall rank correlation coefficient with the photovoltaic power as the clustering variable; in this embodiment, total horizontal radiation is selected as the clustering variable;

step 2.2, selecting the average value, standard deviation, maximum value, number of peaks and valleys, coefficient of variation, kurtosis and skewness of the clustering variable as statistical features;

the coefficient of variation C, kurtosis K and skewness D are defined as shown in formulas (2) to (4), respectively:

$$C = \frac{\sigma}{\bar{X}} \qquad (2)$$

$$K = \frac{1}{M} \sum_{i=1}^{M} \frac{(X_i - \bar{X})^{4}}{\sigma^{4}} \qquad (3)$$

$$D = \frac{1}{M} \sum_{i=1}^{M} \frac{(X_i - \bar{X})^{3}}{\sigma^{3}} \qquad (4)$$

in the formulas, $\sigma$ denotes the standard deviation of the variable, $\bar{X}$ denotes the mean value of the variable, $X_i$ denotes a sample of the variable, and M denotes the total number of samples of the variable;
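As a hedged illustration of step 2.2, the sketch below computes the 7 statistical features for one day of the clustering variable. The function name is hypothetical; the population standard deviation (division by M) is assumed, and the number of peaks and valleys is assumed to be the count of interior points at which the slope changes sign, since the source does not define it.

```python
from math import sqrt


def daily_features(x):
    """Return the 7 statistical features of one day's samples `x`."""
    m = len(x)
    mean = sum(x) / m
    std = sqrt(sum((v - mean) ** 2 for v in x) / m)  # population std (assumed)
    if mean == 0 or std == 0:
        raise ValueError("features undefined for zero mean or zero spread")
    # Peaks and valleys: interior points where the slope changes sign (assumed).
    extrema = sum(
        1 for k in range(1, m - 1)
        if (x[k] - x[k - 1]) * (x[k + 1] - x[k]) < 0
    )
    return {
        "mean": mean,
        "std": std,
        "max": max(x),
        "peak_valley_count": extrema,
        "coeff_variation": std / mean,
        "kurtosis": sum((v - mean) ** 4 for v in x) / (m * std ** 4),
        "skewness": sum((v - mean) ** 3 for v in x) / (m * std ** 3),
    }


print(daily_features([1.0, 3.0, 2.0, 4.0, 0.0]))
```

Each day then becomes one 7-dimensional point, which is the form of input the clustering step operates on.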

step 3, according to the clustering variables and the statistical characteristics thereof selected in the step 2, as shown in fig. 2, similar day clustering of photovoltaic historical data is performed by adopting a Fuzzy C Mean (FCM) clustering algorithm to obtain a photovoltaic similar day data set; the method specifically comprises the following steps:

step 3.1, calculating numerical values of 7 statistical characteristics of the clustering variables on each day according to the statistical characteristics of the clustering variables constructed in the step 2, and taking the numerical values as clustering data for clustering by adopting a Fuzzy C Mean (FCM) clustering algorithm;

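step 3.2, determining the number c of clustering categories, initializing the clustering centers $V_j$, setting the fuzzification parameter m, initializing the membership matrix $U^{(0)}$, and setting the termination threshold $\varepsilon$ of the algorithm;

step 3.3, calculating all clustering centers of the t-th iteration according to formula (5) to obtain the clustering center matrix:

$$V_j^{(t)} = \frac{\sum_{i=1}^{n} \left(u_{ij}^{(t-1)}\right)^{m} x_i}{\sum_{i=1}^{n} \left(u_{ij}^{(t-1)}\right)^{m}}, \quad j = 1, \dots, c \qquad (5)$$

in the formula: $u_{ij}$ is the membership degree of the i-th sample in the j-th class; $x_i$ is a sample point; m is the membership factor; n is the number of samples; t is the iteration number; c is the number of clusters.

step 3.4, updating the membership matrix $U^{(t)}$ as shown in formula (6):

$$u_{ij}^{(t)} = \left[ \sum_{k=1}^{c} \left( \frac{d_{ij}}{d_{ik}} \right)^{\frac{2}{m-1}} \right]^{-1} \qquad (6)$$

in the formula: $u_{ij}$ is the membership degree of the i-th sample in the j-th class; m is the membership factor; t is the iteration number; c is the number of clusters; $d_{ij} = \lVert x_i - V_j^{(t)} \rVert$ is the distance from the i-th sample to the j-th class center.

step 3.5, calculating $\lVert U^{(t)} - U^{(t-1)} \rVert$ and checking whether the iteration stop condition $\lVert U^{(t)} - U^{(t-1)} \rVert < \varepsilon$ is met; if so, stopping the iteration; otherwise repeating step 3.3 and step 3.4 until the condition is met, finally obtaining the similar-day data sets under each weather condition;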
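The fuzzy C-means iteration of step 3 can be sketched as below. This is a minimal illustration rather than the implementation of the invention: the function name, the seeded random initialization of the membership matrix, the Euclidean distance and the use of the largest absolute element change as the matrix norm are all assumptions, and the demo points are 2-dimensional instead of the 7 statistical features.

```python
import random
from math import dist


def fuzzy_c_means(points, c, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Return (centres, memberships) after fuzzy C-means clustering."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial membership matrix, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([r / total for r in row])
    centres = []
    for _ in range(max_iter):
        # Centre update: membership-weighted mean with weights u_ij ** m.
        centres = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centres.append([
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)
            ])
        # Membership update from the distances to every centre.
        new_u = []
        for i in range(n):
            d = [max(dist(points[i], centres[j]), 1e-12) for j in range(c)]
            new_u.append([
                1.0 / sum((d[j] / d[k]) ** (2.0 / (m - 1.0)) for k in range(c))
                for j in range(c)
            ])
        # Stop when the membership matrix barely changes (max-abs norm, assumed).
        change = max(abs(new_u[i][j] - u[i][j])
                     for i in range(n) for j in range(c))
        u = new_u
        if change < eps:
            break
    return centres, u


pts = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centres, u = fuzzy_c_means(pts, c=2)
labels = [row.index(max(row)) for row in u]
print(labels)
```

Assigning each day to the class with the largest membership, as in the last lines, is one simple way to turn the fuzzy result into similar-day data sets.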

step 4, carrying out normalization processing on the photovoltaic similar day data set by adopting a min-max normalization method so as to eliminate the influence of different variable dimension differences on a prediction result;

the min-max normalization method is shown in formula (7):

$$x = \frac{x^{*} - x_{\min}}{x_{\max} - x_{\min}} \qquad (7)$$

in the formula, $x^{*}$ represents the data to be normalized, $x$ represents the normalized data, $x_{\max}$ represents the maximum value of the variable, and $x_{\min}$ represents the minimum value of the variable.
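A minimal sketch of min-max normalization and of the inverse normalization later applied to the interval prediction results is given below. The function names are hypothetical, and taking the minimum and maximum from the training data only is an assumption of this sketch.

```python
def fit_min_max(column):
    """Return (minimum, maximum) of one variable."""
    lo, hi = min(column), max(column)
    if hi == lo:
        raise ValueError("constant column cannot be min-max normalized")
    return lo, hi


def normalize(column, lo, hi):
    """Map values into [0, 1] using the fitted minimum and maximum."""
    return [(v - lo) / (hi - lo) for v in column]


def denormalize(scaled, lo, hi):
    """Inverse normalization: map scaled values back to physical units."""
    return [s * (hi - lo) + lo for s in scaled]


lo, hi = fit_min_max([2.0, 4.0, 6.0])
scaled = normalize([2.0, 4.0, 6.0], lo, hi)
print(scaled, denormalize(scaled, lo, hi))
```

Keeping the fitted pair `(lo, hi)` of the power variable is what allows the predicted quantiles to be mapped back to kW after prediction.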

Step 5, dividing the similar day data sets under all weather conditions into a training set and a test set;

step 6, constructing a quantile regression (QR) - convolutional neural network (CNN) - bidirectional long short-term memory (BiLSTM) neural network, namely the QR-CNN-BiLSTM interval prediction model, setting the parameters of the CNN and BiLSTM models, and setting the parameters of model training; inputting the training set of the similar-day data set into the constructed QR-CNN-BiLSTM model for model training; the method specifically comprises the following steps:

step 6.1, constructing feature maps from the normalized similar-day data set with a sliding window, inputting them into the CNN network, and extracting feature vectors representing the dynamic change of the photovoltaic power with its convolution layer and pooling layer;

the neuron output after the convolutional neural network extracts the local features is shown as a formula (8);

in the formula: o represents a neuron local output; i is neuron input; l, m and n respectively represent 3 dimensions of the output matrix; i. j and n respectively represent the length, width and depth of the convolution kernel K; b _n A threshold value representing a convolution kernel;

representing a multiplication operation of the matrix.
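The sliding-window construction of step 6.1 can be sketched as follows. The function name, the window length and the one-step-ahead target are illustrative assumptions; the source does not state the window size.

```python
def sliding_windows(rows, target, window, horizon=1):
    """Build (feature map, target) pairs.

    rows    : one feature vector per 5 min time step
    target  : photovoltaic power per time step
    window  : number of consecutive steps in each feature map
    horizon : how many steps after the window the target lies
    """
    x, y = [], []
    for start in range(len(rows) - window - horizon + 1):
        x.append(rows[start:start + window])
        y.append(target[start + window + horizon - 1])
    return x, y


rows = [[0], [1], [2], [3], [4]]
power = [10, 11, 12, 13, 14]
x, y = sliding_windows(rows, power, window=3)
print(x, y)
```

Each element of `x` is a window-by-features matrix, which is the shape a one-dimensional convolution layer would consume.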

step 6.2, converting the output feature vectors into a time sequence and inputting it into the BiLSTM network, so as to further capture the long-term dependence in the time sequence.

The calculation process of the LSTM unit in the BiLSTM network is as follows: the forget gate determines which information is to be deleted from the memory cell state, as shown in formula (9);

$$f_t = \sigma(W_f \cdot [h_{t-1}, X_t] + b_f) \qquad (9)$$

the output value of the previous moment and the input value of the current moment are fed into the input gate, and the output value of the input gate is obtained after calculation, as shown in formula (10);

$$i_t = \sigma(W_i \cdot [h_{t-1}, X_t] + b_i) \qquad (10)$$

the output value of the previous moment and the input value of the current moment are fed into the input gate, and the candidate cell state is obtained after calculation, as shown in formula (11);

$$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, X_t] + b_C) \qquad (11)$$

the current cell state is updated as shown in formula (12);

$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t \qquad (12)$$

the output value of the previous moment and the input value of the current moment are fed into the output gate, and the output value of the output gate is obtained after calculation, as shown in formula (13);

$$o_t = \sigma(W_o \cdot [h_{t-1}, X_t] + b_o) \qquad (13)$$

the output of the output gate and the cell state are combined to obtain the output value, as shown in formula (14);

$$h_t = o_t * \tanh(C_t) \qquad (14)$$

in the formulas: $f_t$ is the forget gate output at time t; $\sigma$ and tanh are activation functions; $h_{t-1}$ is the output at time t-1; $X_t$ is the input at time t; $W_f$, $W_i$, $W_C$, $W_o$ are weight coefficients; $b_f$, $b_i$, $b_C$, $b_o$ are bias parameters; $i_t$ and $\tilde{C}_t$ are the input gate output and the candidate cell state at time t; $C_{t-1}$ is the cell state at time t-1; $C_t$ is the cell state at time t; $h_t$ is the output at time t; $o_t$ is the output gate output at time t after activation by the sigmoid function.

The forward data is input to the forward LSTM layer, resulting in an output of the forward LSTM layer. And reversely inputting the data into the reverse LSTM layer to obtain reverse output, and then reversing the output again to finally obtain the output of the reverse LSTM layer. Finally, the forward LSTM layer output and the reverse LSTM layer output are linearly superposed according to a certain weight to obtain an output result;
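To make the gate calculations and the bidirectional combination concrete, the sketch below implements a single-unit LSTM with scalar weights. It is a toy illustration only: the parameter layout, the zero initial states and the equal weights 0.5 used to superpose the forward and reverse outputs are assumptions, not values from the source.

```python
from math import exp, tanh


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step; p maps gate name -> (w_h, w_x, b), all scalars."""
    def pre(gate):
        w_h, w_x, b = p[gate]
        return w_h * h_prev + w_x * x_t + b

    f_t = sigmoid(pre("f"))            # forget gate
    i_t = sigmoid(pre("i"))            # input gate
    c_hat = tanh(pre("c"))             # candidate cell state
    c_t = f_t * c_prev + i_t * c_hat   # cell state update
    o_t = sigmoid(pre("o"))            # output gate
    h_t = o_t * tanh(c_t)              # unit output
    return h_t, c_t


def run_lstm(seq, p):
    h = c = 0.0  # zero initial states (assumed)
    outputs = []
    for x in seq:
        h, c = lstm_step(x, h, c, p)
        outputs.append(h)
    return outputs


def run_bilstm(seq, p_fwd, p_bwd, w_fwd=0.5, w_bwd=0.5):
    """Forward pass, reverse pass re-reversed, then weighted superposition."""
    fwd = run_lstm(seq, p_fwd)
    bwd = run_lstm(seq[::-1], p_bwd)[::-1]
    return [w_fwd * a + w_bwd * b for a, b in zip(fwd, bwd)]


params = {"f": (0.1, 0.2, 0.0), "i": (0.3, 0.4, 0.0),
          "c": (0.5, 0.6, 0.0), "o": (0.7, 0.8, 0.0)}
print(run_bilstm([1.0, 2.0, 1.0], params, params))
```

A real BiLSTM layer uses weight matrices and many units per direction; the scalar version only shows the order of operations.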

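step 6.3, introducing a QR model and fusing it with the CNN-BiLSTM model through the loss function;

let

$$\hat{Y}_t = f(X_t, \Omega)$$

denote the CNN-BiLSTM point prediction model, where $X_t$ is the model input, i.e. the independent variable, $\Omega$ denotes the model parameters of the CNN-BiLSTM, $Y_t$ is the dependent variable, and $\hat{Y}_t$ is the predicted value of $Y_t$.

The QR-CNN-BiLSTM model may be represented as

$$Q_{Y_t}(\tau \mid X_t) = f(X_t, \Omega(\tau))$$

where $\tau \in (0, 1)$ is the quantile; the estimate $\hat{\Omega}(\tau)$ of the model parameters at each quantile is obtained by minimizing the loss function

$$\hat{\Omega}(\tau) = \arg\min_{\Omega} \sum_{t} \rho_{\tau}\big(Y_t - f(X_t, \Omega)\big), \qquad \rho_{\tau}(e) = e \, \big(\tau - \mathbb{1}[e < 0]\big);$$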
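The loss function by which quantile regression is fused with the network is commonly the pinball (check) loss; the sketch below assumes that form. The function names are hypothetical, and averaging over samples and over quantiles (rather than summing) is a choice of this sketch that does not change the minimizer.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean pinball loss at quantile level tau.
    Under-prediction costs tau per unit, over-prediction costs 1 - tau."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        e = y - q
        total += max(tau * e, (tau - 1.0) * e)
    return total / len(y_true)


def multi_quantile_loss(y_true, preds_by_tau):
    """Average the pinball loss over all predicted quantile levels.
    preds_by_tau maps each tau to its list of predictions."""
    losses = [pinball_loss(y_true, preds, tau)
              for tau, preds in preds_by_tau.items()]
    return sum(losses) / len(losses)


print(pinball_loss([1.0], [0.0], 0.9))   # under-prediction at a high quantile
print(pinball_loss([0.0], [1.0], 0.9))   # over-prediction at a high quantile
```

The asymmetry is what drives each output neuron toward its own quantile: at a level of 0.9, predicting too low is penalized nine times more heavily than predicting too high.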

step 6.4, setting the parameters of the model and training the model.

setting the number of BiLSTM layers to 3, the number of neurons to 128 and the dropout parameter to 0.2; setting the quantile start point to 0.05, the step length to 0.05 and the end point to 1 (exclusive), i.e. the quantiles 0.05, 0.10, ..., 0.95, so that the number of neurons in the fully connected layer is 19.

inputting the training set of the similar-day data set into the constructed QR-CNN-BiLSTM model for model training;
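The count of 19 output neurons follows directly from the quantile grid, as the short sketch below shows. The helper names are hypothetical; `central_interval` merely illustrates how a symmetric pair of predicted quantiles can be read off as an interval and is an assumption of this sketch, not a step stated in the source.

```python
def quantile_levels(start=0.05, step=0.05, stop=1.0):
    """Quantile levels start, start + step, ... strictly below `stop`."""
    count = round((stop - start) / step)
    return [round(start + k * step, 10) for k in range(count)]


def central_interval(levels, preds, coverage):
    """Pick the predicted quantiles bounding a central interval of the given
    nominal coverage; raises ValueError if the levels are not on the grid."""
    lo_tau = round((1.0 - coverage) / 2.0, 10)
    hi_tau = round(1.0 - lo_tau, 10)
    return preds[levels.index(lo_tau)], preds[levels.index(hi_tau)]


levels = quantile_levels()
print(len(levels), levels[0], levels[-1])

# One prediction point with made-up, increasing quantile outputs.
preds = [float(k) for k in range(len(levels))]
print(central_interval(levels, preds, 0.9))
```

With this grid, one forward pass of the network yields all 19 conditional quantiles of a prediction point at once.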

step 7, inputting the data of the test set into the QR-CNN-BiLSTM model trained in step 6, and predicting the photovoltaic power interval;

step 9, generating a photovoltaic probability prediction result on the test set by adopting a kernel density estimation algorithm optimized by a cross validation and grid search method based on the inverse normalized photovoltaic power interval prediction result data obtained in the step 8;

specifically: for a given future photovoltaic power prediction point, applying the QR-CNN-BiLSTM fusion model of step 7 yields N samples, one per conditional quantile, forming the vector

$$[x_1, x_2, \dots, x_N]$$

The probability density function can be obtained by kernel density estimation (KDE); the KDE of the vector is calculated as shown in formula (15):

$$\hat{f}(x) = \frac{1}{N B} \sum_{i=1}^{N} K\!\left(\frac{x - x_i}{B}\right) \qquad (15)$$

in the formula: N is the total number of samples; $x_i$ is the i-th sample; B is the optimal bandwidth determined by the cross-validation grid search method, with B > 0; K is the kernel function. The kernel function K used in the present invention is the Gaussian kernel, shown in formula (16):

$$K(u) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{u^{2}}{2}\right) \qquad (16)$$

the cross validation and grid search method provided by the invention optimizes the kernel density estimation, solves the problem of difficult bandwidth selection in the kernel density estimation, and can generate a high-quality probability prediction result.
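A minimal sketch of Gaussian kernel density estimation with bandwidth selection by cross-validated grid search is shown below. The function names, the candidate bandwidth grid, the sample values and the use of leave-one-out log-likelihood as the cross-validation score are assumptions of this sketch; the source states only that cross-validation and grid search are used.

```python
from math import exp, log, pi, sqrt


def gaussian_kernel(u):
    return exp(-0.5 * u * u) / sqrt(2.0 * pi)


def kde(x, samples, b):
    """Kernel density estimate at x with bandwidth b > 0."""
    return sum(gaussian_kernel((x - s) / b) for s in samples) / (len(samples) * b)


def loo_log_likelihood(samples, b):
    """Leave-one-out score: each sample is scored by the density
    estimated from all the other samples."""
    total = 0.0
    for i, s in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        total += log(max(kde(s, rest, b), 1e-300))
    return total


def select_bandwidth(samples, grid):
    """Grid search: keep the candidate with the best cross-validation score."""
    return max(grid, key=lambda b: loo_log_likelihood(samples, b))


# Made-up quantile predictions for one future time point.
samples = [1.0, 1.2, 1.5, 1.9, 2.0, 2.4, 2.8, 3.1]
grid = [0.05, 0.1, 0.2, 0.4, 0.8, 1.6]
b = select_bandwidth(samples, grid)
print(b, kde(2.0, samples, b))
```

In the method described here, `samples` would hold the 19 inverse-normalized quantile predictions of a single prediction point, and the resulting density is the probability prediction for that point.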

The photovoltaic power point prediction, interval prediction and probability prediction results are then evaluated. The root mean square error $e_{RMSE}$ and the mean absolute percentage error $e_{MAPE}$ are used to evaluate the point prediction results; the interval comprehensive evaluation index $I_{WC}$ is used to evaluate the interval prediction results; the continuous ranked probability score $P_{CRPS}$ is used to evaluate the probability prediction results; they are given by the following formulas:

$$e_{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (P_{ri} - P_{pi})^{2}}$$

$$e_{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{P_{ri} - P_{pi}}{P_{ri}} \right| \times 100\%$$

$$I_{WC} = I_{PINAW} / I_{PICP}$$

$$P_{CRPS} = \frac{1}{N} \sum_{i=1}^{N} \int_{-\infty}^{+\infty} \big[ F(P_{pi}) - H(P_{pi} - P_{ri}) \big]^{2} \, dP_{pi}$$

wherein:

$$I_{PICP} = \frac{1}{N} \sum_{n=1}^{N} S_n$$

$$I_{PINAW} = \frac{1}{N E} \sum_{i=1}^{N} (P_{upi} - P_{downi})$$

$$F(P_{pi}) = \int_{-\infty}^{P_{pi}} p(x) \, dx$$

in the formulas: $P_{ri}$ is the observed power and $P_{pi}$ is the predicted power; N is the total number of predicted future photovoltaic power points; $S_n$ is a logical value equal to 1 when the observed value falls within the prediction interval and 0 otherwise; E is the difference between the maximum and minimum observed values; $P_{upi}$ and $P_{downi}$ are the upper and lower bounds of the prediction interval; $p(x)$ is the probability density function; $F(P_{pi})$ is the cumulative distribution function of $P_{pi}$; $H(P_{pi} - P_{ri})$ is the step function.

In the photovoltaic power interval probability prediction method based on a deep learning fusion model, meteorological factors are first screened by a correlation coefficient method, which reduces the model prediction error caused by too many irrelevant features; a highly correlated meteorological factor variable is then selected as the clustering variable, its statistical features are constructed, and the FCM clustering algorithm is used to obtain the similar-day data sets, laying a foundation for further improving prediction accuracy; the QR-CNN-BiLSTM model fuses two deep learning models, CNN and BiLSTM, has higher prediction accuracy than a traditional single deep learning prediction model, and can generate interval prediction results of higher quality; the cross-validation and grid search method optimizes the kernel density estimation, solves the difficulty of bandwidth selection in kernel density estimation, and can generate high-quality probability prediction results.

Examples

Photovoltaic power data of a photovoltaic array produced by a certain manufacturer of Australian desert knowledge solar center (DKASC) Alice Springs sites in 2019-2020 and 4 meteorological factor data obtained by screening are adopted as simulation data. The time resolution of the data set was 5 min. Because the photovoltaic array does not generate electricity at night, the data with the power of 0 at night are removed, and the data in the time period of 6: 00-19: 30 each day are reserved as example analysis data.

Finally, the fuzzy C-means clustering algorithm divides the data into three categories of similar-day data sets: sunny days, sunny-to-cloudy days, and rainy days. Each similar-day data set is divided into a training set and a test set, with the training set accounting for 0.7 of the data: the test set consists of the 30 d of similar-day data closest in time, and the training set consists of the 70 d of similar-day data in the same data set that immediately precede the test set.
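The split described above can be sketched as follows, assuming that the similar days are held in chronological order and that "closest in time" means the most recent days; the function name and the integer day labels are hypothetical.

```python
def split_similar_days(days, n_test=30, n_train=70):
    """Split chronologically ordered similar days into training and test sets.

    The last `n_test` days form the test set; the `n_train` days immediately
    before them form the training set.
    """
    if len(days) < n_test + n_train:
        raise ValueError("not enough similar days for the requested split")
    test = days[-n_test:]
    train = days[-(n_test + n_train):-n_test]
    return train, test

# Hypothetical day labels 0..119 in chronological order
days = list(range(120))
train, test = split_similar_days(days)
print(len(train), len(test))
```

With 70 training days and 30 test days, the training set makes up 0.7 of the 100 days used, which matches the stated proportion.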

To illustrate the advantages of the proposed model in short-term photovoltaic power interval and probability prediction, the proposed QR-CNN-BiLSTM model is compared with the QR-LSTM and QR-BiLSTM models under the 3 weather types, and one day is randomly selected from each weather type for visual analysis. Figs. 3-11 show the point prediction results and the 95% confidence level interval prediction results of each model on sunny, sunny-to-cloudy, and rainy days. Figs. 12-14 show the probability prediction results of the QR-CNN-BiLSTM model combined with kernel density estimation for the same three weather types (9 point probability prediction results are selected at equal spacing from the 164 points). The evaluation indices are compared in Tables 2, 3 and 4.

Table 2 Evaluation indices of the models (sunny days)

Table 3 Evaluation indices of the models (sunny-to-cloudy days)

Table 4 Evaluation indices of the models (rainy days)

Table 2 shows that on sunny days the prediction interval coverage of every model is close to 100%, while the prediction interval width of the QR-CNN-BiLSTM model is clearly smaller than that of the other models: its PINAW is 72.77% lower than that of the QR-LSTM model and 66.26% lower than that of the QR-BiLSTM model. The interval comprehensive evaluation index I_WC of the QR-CNN-BiLSTM model is also the smallest, 72.62% lower than that of QR-LSTM and 66.08% lower than that of QR-BiLSTM, so its interval prediction performance is the best. Its probability evaluation index CRPS is the smallest as well, so its probability prediction performance is the best. The deterministic evaluation indices likewise show that the point prediction performance of the QR-CNN-BiLSTM model is better.

Table 3 shows that on sunny-to-cloudy days, with the PICP of every model's prediction interval above 95%, the PINAW of the QR-CNN-BiLSTM prediction interval is markedly lower: 28.34% lower than that of QR-LSTM and 25.62% lower than that of QR-BiLSTM. The interval comprehensive evaluation index I_WC of the QR-CNN-BiLSTM model is also the smallest, 28.80% lower than that of QR-LSTM and 26.09% lower than that of QR-BiLSTM, so its interval prediction performance is the best. Its probability evaluation index CRPS is the smallest as well, so its probability prediction performance is the best. The deterministic evaluation indices likewise show that the point prediction performance of the QR-CNN-BiLSTM model is better.

Table 4 shows that on rainy days, with the PICP of every model's prediction interval above 95%, the PINAW of the QR-CNN-BiLSTM prediction interval is reduced: 7.98% lower than that of QR-LSTM and 4.47% lower than that of QR-BiLSTM. The interval comprehensive evaluation index I_WC is also the smallest, 6.81% lower than that of QR-LSTM and 4.08% lower than that of QR-BiLSTM. The uncertainty of the photovoltaic power interval prediction is thus clearly reduced, so the interval prediction performance of QR-CNN-BiLSTM is the best. Its probability evaluation index CRPS is the smallest as well, so its probability prediction performance is also the best. The deterministic evaluation indices likewise show that the point prediction performance of the QR-CNN-BiLSTM model is better. The above analysis shows that the QR-CNN-BiLSTM model is superior in point prediction, interval prediction and probability prediction.

Claims

1. The photovoltaic power interval probability prediction method based on the deep learning fusion model is characterized by comprising the following steps:

step 2, selecting clustering variables, constructing statistical characteristics of the clustering variables,

2. The photovoltaic power interval probability prediction method based on the deep learning fusion model according to claim 1, wherein step 1 is specifically as follows:

3. The photovoltaic power interval probability prediction method based on the deep learning fusion model according to claim 2, wherein step 2 is specifically as follows:

4. The photovoltaic power interval probability prediction method based on the deep learning fusion model according to claim 1, wherein step 3 is specifically as follows:

in the formula: u_ij is the membership degree of the ith sample belonging to the jth class; x_i is a sample point; m is a membership factor; n is the number of samples; t is the number of iterations; c is the number of clusters;

step 3.4, updating the membership degree matrix U^(t), the calculation method being shown in formula (6):

in the formula: u_ij is the membership degree of the ith sample belonging to the jth class; x_i is a sample point; m is a membership factor; n is the number of samples; t is the number of iterations; c is the number of clusters; d_ij is the distance from the ith sample to the jth class center;

step 3.5, calculating ||U^(t) - U^(t-1)|| and verifying whether the iteration stop condition ||U^(t) - U^(t-1)|| < ε is met; if the condition is met, stopping the iteration; otherwise, repeating step 3.3 and step 3.4 until the condition is met, and finally obtaining the similar-day data sets under the various weather conditions.
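For illustration only, the iteration of steps 3.3 to 3.5 can be sketched with NumPy as below. The sketch assumes the standard fuzzy C-means updates (class centers as membership-weighted means, and u_ij proportional to d_ij^(-2/(m-1))), a random initial membership matrix, and m = 2; the function name, the default values and the synthetic data are hypothetical.

```python
import numpy as np

def fuzzy_c_means(x, c=3, m=2.0, eps=1e-5, max_iter=300, seed=0):
    """Minimal fuzzy C-means; returns (class centers, membership matrix U)."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    u = rng.random((x.shape[0], c))
    u /= u.sum(axis=1, keepdims=True)                    # each row of U sums to 1
    for _ in range(max_iter):
        um = u ** m
        centers = (um.T @ x) / um.sum(axis=0)[:, None]   # weighted class centers
        d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)                         # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=1, keepdims=True)     # membership update
        converged = np.linalg.norm(u_new - u) < eps      # ||U(t) - U(t-1)|| < eps
        u = u_new
        if converged:
            break
    return centers, u

# Synthetic two-feature samples around three hypothetical centers
rng = np.random.default_rng(1)
x = np.vstack([rng.normal(mu, 0.1, size=(30, 2)) for mu in (0.0, 1.0, 2.0)])
centers, u = fuzzy_c_means(x)
labels = u.argmax(axis=1)   # hard label = class with the largest membership
print(centers.shape, u.shape)
```

Assigning each day to the class with the largest membership degree turns the fuzzy result into one similar-day data set per class.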

5. The photovoltaic power interval probability prediction method based on the deep learning fusion model according to claim 1, wherein step 6 is specifically as follows:

let

be the predicted value of Y_t;

the QR-CNN-BiLSTM model may be represented as

the estimate of the model parameters ω(τ) at each quantile

is obtained by minimizing the loss function


step 6.4, setting the parameters of the model and training the model;

setting the number of convolution layers to 1, the number of convolution kernels to 64, the convolution kernel size to 4, the convolution boundary processing mode to 'same', the activation function to 3 and the pooling window size to 3;

setting the number of layers of the BiLSTM network to 3, the number of neurons to 128 and the dropout parameter to 0.2; setting the quantile start point to 0.05, the step length to 0.05 and the end point to 1, so that the number of neurons in the fully connected layer is 19;
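As a hedged illustration of the quantile setting above, the sketch below builds the 19 quantile levels (0.05 to 0.95, the end point 1 being excluded) and evaluates the standard quantile (pinball) loss with NumPy. The pinball loss is the loss commonly used in quantile regression and is assumed here; the exact loss function of the claim is not reproduced in this text, and the function name and array shapes are hypothetical.

```python
import numpy as np

# Quantile levels 0.05, 0.10, ..., 0.95: one output neuron per level
quantiles = np.arange(1, 20) / 20.0

def pinball_loss(y_true, y_pred, taus):
    """Mean quantile (pinball) loss.

    y_true: shape (n,); y_pred: shape (n, len(taus)), one column per quantile.
    """
    y_true = np.asarray(y_true, dtype=float)[:, None]
    y_pred = np.asarray(y_pred, dtype=float)
    taus = np.asarray(taus, dtype=float)[None, :]
    err = y_true - y_pred
    # tau * err when the prediction is too low, (tau - 1) * err when too high
    return float(np.mean(np.maximum(taus * err, (taus - 1.0) * err)))

n_outputs = len(quantiles)
print(n_outputs)
```

Because under-prediction is weighted by τ and over-prediction by 1 - τ, minimizing this loss for each τ pushes the corresponding output towards the τ-quantile of the power.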