CN115115125A - Photovoltaic power interval probability prediction method based on deep learning fusion model - Google Patents
Photovoltaic power interval probability prediction method based on deep learning fusion model
- Publication number
- CN115115125A (application CN202210823439.9A)
- Authority
- CN
- China
- Prior art keywords
- model
- photovoltaic power
- bilstm
- clustering
- cnn
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J3/00—Circuit arrangements for ac mains or ac distribution networks
- H02J3/003—Load forecast, e.g. methods or systems for forecasting future load demand
Abstract
The invention discloses a photovoltaic power interval probability prediction method based on a deep learning fusion model, which comprises the following steps: first, acquiring photovoltaic power and meteorological factor data, carrying out variable correlation analysis, and determining the input variables of the prediction model; selecting a clustering variable, constructing its statistical features, clustering the photovoltaic historical data into similar days with a fuzzy C-means clustering algorithm, and normalizing the photovoltaic historical data; then dividing each similar-day data set into a training set and a test set; constructing a QR-CNN-BiLSTM interval prediction model, training it, and predicting the photovoltaic power interval; and finally, generating a photovoltaic probability prediction result on the test set. The method tracks the future photovoltaic power trend well, quantifies the uncertainty of photovoltaic power prediction with high accuracy while meeting the reliability requirement, generates the photovoltaic power prediction interval at the corresponding confidence level, and has practical application value.
Description
Technical Field
The invention belongs to the technical field of photovoltaic power generation prediction, and particularly relates to a photovoltaic power interval probability prediction method based on a deep learning fusion model.
Background
In recent years, environmental pollution has become increasingly serious and the shortage of non-renewable energy increasingly prominent, and countries around the world are seeking new paths of energy development. China began implementing its new energy development strategy in 2014, and solar energy, an important component of the energy transition, has been developed and utilized on a large scale. By the end of 2021, the cumulative installed photovoltaic capacity in China had reached 306 million kW, of which 53 million kW was newly installed in 2021. Large-scale grid-connected new energy generation, represented by photovoltaic power generation, is an irreversible development trend and a prominent feature of the new generation of power systems. However, under the influence of various complex environmental factors, photovoltaic power generation exhibits strong random fluctuation, intermittency and non-stationarity, and as a high proportion of photovoltaic generation is connected to the power system, photovoltaic generation, being an uncontrollable power source, seriously threatens the safe and stable operation of the power system. Therefore, research on photovoltaic power prediction technology is of great significance for building a new generation of power system in China that can accommodate a high proportion of renewable energy, and of great value for building an integrated security defense system for the power system and realizing risk control.
Existing photovoltaic power prediction techniques can be classified by the form of the prediction result into deterministic prediction and uncertainty prediction. Deterministic prediction yields a single-point result, which is intuitive but cannot represent the uncertainty of the photovoltaic power prediction. Uncertainty prediction gives the possible range of variation of the photovoltaic power at a future moment together with its confidence level, and thus provides more comprehensive data support for power system scheduling and has greater engineering significance. Prediction models are classified into physical models and data-driven models. Physical models are built from the characteristics and installation angle of the photovoltaic modules and take geographic conditions, meteorological elements and the like into account; their construction mechanism is complex, and they are currently seldom applied. Data-driven models mainly comprise statistical methods and artificial intelligence algorithms. Statistical methods use curve fitting and parameter estimation to establish the relation between the photovoltaic power and its influencing factors; common examples are time-series methods and grey models. Artificial intelligence models, represented by neural networks and deep learning models, have strong nonlinear data processing capability and have been widely adopted in recent years.
Although scholars have done a great deal of work in the field of photovoltaic power prediction, the following problems remain at the present stage: (1) photovoltaic power prediction research concentrates on point prediction, and there is little research on interval probability prediction; (2) the reliability and sensitivity of existing interval probability prediction models are not high, and the performance of photovoltaic power interval probability prediction models urgently needs further improvement; (3) most studies use case data at 1 h or 15 min intervals, whereas predicting future power at 5 min intervals faces more complex and variable photovoltaic power fluctuations that a traditional single model cannot handle well, and multi-model fusion is one of the future solutions.
Disclosure of Invention
The invention aims to provide a photovoltaic power interval probability prediction method based on a deep learning fusion model, and solves the problems of inaccurate photovoltaic power interval prediction and probability prediction results.
The technical scheme adopted by the invention is that the photovoltaic power interval probability prediction method based on the deep learning fusion model is implemented according to the following steps:
Step 9, generating a photovoltaic probability prediction result on the test set by using a kernel density estimation algorithm optimized by cross-validation and grid search.
The present invention is also characterized in that,
in step 1, the method specifically comprises the following steps:
Step 1.1, selecting preprocessed meteorological element data and preprocessed photovoltaic power generation data;
the time resolution of the meteorological element variables and the photovoltaic power variable is 5 min, and wind speed, relative humidity, ambient temperature, total horizontal radiation, diffused horizontal radiation, rainfall, wind direction, total oblique radiation and diffused oblique radiation are selected as the original meteorological element variables;
Step 1.2, measuring the degree of correlation between the photovoltaic power and each meteorological element variable with the Kendall rank correlation coefficient R;
Step 1.3, selecting the meteorological factors whose Kendall rank correlation coefficient R with the photovoltaic power has an absolute value of not less than 0.5 as the inputs of the prediction model.
In step 2, the method specifically comprises the following steps:
Step 2.1, selecting the meteorological factor variable with the highest Kendall rank correlation coefficient with the photovoltaic power as the clustering variable;
Step 2.2, selecting the average value, standard deviation, maximum value, number of peaks and valleys, coefficient of variation, kurtosis and skewness of the clustering variable as statistical features.
In step 3, the method specifically comprises the following steps:
Step 3.1, calculating the values of the 7 statistical features of the clustering variable for each day according to the statistical features constructed in step 2;
Step 3.2, determining the number c of clustering categories, initializing the cluster centers $V_i$, setting the fuzzification parameter m, initializing the membership matrix $U^{(0)}$, and setting the termination criterion $\varepsilon$ of the algorithm;
Step 3.3, calculating all cluster centers of the t-th iteration according to formula (5) to obtain the cluster center matrix:
$$V_j^{(t)} = \frac{\sum_{i=1}^{n} \left(u_{ij}^{(t-1)}\right)^{m} x_i}{\sum_{i=1}^{n} \left(u_{ij}^{(t-1)}\right)^{m}}, \quad j = 1, \dots, c \tag{5}$$
in the formula: $u_{ij}$ is the membership degree of the i-th sample in the j-th class; $V_j^{(t)}$ is the center of the j-th class at iteration t; $x_i$ is a sample point; m is the membership factor; n is the number of samples; t is the iteration number; c is the number of clusters;
Step 3.4, updating the membership matrix $U^{(t)}$ as shown in formula (6):
$$u_{ij}^{(t)} = \frac{1}{\sum_{k=1}^{c} \left( d_{ij} / d_{ik} \right)^{2/(m-1)}} \tag{6}$$
in the formula: $u_{ij}$ is the membership degree of the i-th sample in the j-th class; m is the membership factor; t is the iteration number; c is the number of clusters; $d_{ij}$ is the distance from the i-th sample to the j-th class center;
Step 3.5, calculating $\|U^{(t)} - U^{(t-1)}\|$ and checking whether the iteration stop condition $\|U^{(t)} - U^{(t-1)}\| < \varepsilon$ is met; if so, stopping the iteration; otherwise, repeating step 3.3 and step 3.4 until the condition is met, finally obtaining the similar-day data set for each weather condition.
In step 6, the method specifically comprises the following steps:
Step 6.1, constructing feature maps from the normalized similar-day data set with a sliding window, inputting the feature maps into a CNN network, and extracting feature vectors representing the dynamic change of the photovoltaic power with its convolution layer and pooling layer;
Step 6.2, converting the output feature vectors into a time series and inputting it into the BiLSTM network;
Step 6.3, introducing a quantile regression (QR) model and fusing it with the CNN-BiLSTM model through the loss function;
let $\hat{Y}_t = f(X_t, \Omega)$ denote the CNN-BiLSTM point prediction model, where $X_t$ is the model input, i.e. the independent variable, $\Omega$ is the model parameter of CNN-BiLSTM, $Y_t$ is the dependent variable, and $\hat{Y}_t$ is the predicted value of $Y_t$;
the QR-CNN-BiLSTM model may be expressed as $\hat{Q}_{Y_t}(\tau \mid X_t) = f(X_t, \Omega(\tau))$, and the estimate $\hat{\Omega}(\tau)$ of the model parameters $\Omega(\tau)$ at each quantile $\tau$ is obtained by minimizing the loss function $\sum_{t} \rho_\tau\left(Y_t - f(X_t, \Omega(\tau))\right)$, where $\rho_\tau(u) = u\,(\tau - \mathbb{1}[u < 0])$ is the quantile loss;
Step 6.4, setting the model parameters and training the model;
the QR-CNN-BiLSTM network consists of a CNN layer, a max pooling layer, three BiLSTM layers and a fully connected layer connected in sequence;
setting the number of convolution layers as 1 layer, the number of convolution kernels as 64, the size of the convolution kernels as 4, the boundary processing mode of convolution as ' same ', the activation function as ' and the size of the pooling window as 3;
setting the number of BiLSTM layers to 3, the number of neurons to 128 and the dropout parameter to 0.2; setting the quantile start point to 0.05, the step to 0.05 and the end point to 1 (exclusive), which gives the 19 quantiles 0.05, 0.10, ..., 0.95, so that the number of neurons in the fully connected layer is 19;
setting the initial learning rate to 0.01, the learning rate decay to 1.5, the minimum learning rate to $10^{-4}$, the maximum number of iterations to 100, the batch size to 32, and the optimizer to Adam;
inputting the training set of each similar-day data set into the constructed QR-CNN-BiLSTM model for model training.
The invention has the beneficial effects that:
The photovoltaic power interval probability prediction method based on the deep learning fusion model provides a correlation analysis method based on the Kendall rank correlation coefficient, which determines the model input variables and reduces invalid information in the historical data so as to improve training efficiency. The meteorological factor variable most highly correlated with the photovoltaic power is selected as the clustering variable, and 7 statistical features of the clustering variable, such as its average value, are selected as clustering features that comprehensively reflect the fluctuation pattern and characteristics of each day's historical data, so that the FCM (fuzzy C-means) algorithm clusters efficiently and reasonably. The QR-CNN-BiLSTM model fuses two deep learning models, CNN and BiLSTM, and has higher prediction accuracy than a traditional single deep learning prediction model. When future photovoltaic power is predicted at a fine time granularity of 5 min, the photovoltaic power changes more rapidly under non-sunny weather conditions, and the model can still track the future photovoltaic power trend well; while meeting the reliability requirement, it quantifies the uncertainty of the photovoltaic power prediction with high accuracy, generates the photovoltaic power prediction interval at the corresponding confidence level, and has practical application value.
Drawings
FIG. 1 is a flow chart of a photovoltaic power interval probability prediction method based on a deep learning fusion model according to the present invention;
FIG. 2 is a flow chart of a fuzzy C-means clustering algorithm in the photovoltaic power interval probability prediction method based on the deep learning fusion model;
FIG. 3 is a graph of the interval prediction results on sunny days using the QR-CNN-BiLSTM model adopted in the method of the invention;
FIG. 4 is a graph of the interval prediction results on sunny days using a QR-LSTM model;
FIG. 5 is a graph of the interval prediction results on sunny days using a QR-BiLSTM model;
FIG. 6 is a graph of the interval prediction results on sunny-to-cloudy days using the QR-CNN-BiLSTM model adopted in the method of the invention;
FIG. 7 is a graph of the interval prediction results on sunny-to-cloudy days using a QR-LSTM model;
FIG. 8 is a graph of the interval prediction results on sunny-to-cloudy days using a QR-BiLSTM model;
FIG. 9 is a graph of the interval prediction results on rainy days using the QR-CNN-BiLSTM model adopted in the method of the invention;
FIG. 10 is a graph of the interval prediction results on rainy days using a QR-LSTM model;
FIG. 11 is a graph of the interval prediction results on rainy days using a QR-BiLSTM model;
FIG. 12 is a graph of the probability prediction results on sunny days using the QR-CNN-BiLSTM model combined with kernel density estimation;
FIG. 13 is a graph of the probability prediction results on sunny-to-cloudy days using the QR-CNN-BiLSTM model combined with kernel density estimation;
FIG. 14 is a graph of the probability prediction results on rainy days using the QR-CNN-BiLSTM model combined with kernel density estimation.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
The photovoltaic power interval probability prediction method based on the deep learning fusion model is implemented according to the following steps as shown in fig. 1:
Step 1.1, selecting preprocessed meteorological element data and preprocessed photovoltaic power generation data, the time resolutions of the meteorological element variables and the photovoltaic power variable being kept consistent;
the time resolution of the meteorological element variables and the photovoltaic power variable is 5 min; wind speed, relative humidity, ambient temperature, total horizontal radiation, diffused horizontal radiation, rainfall, wind direction, total oblique radiation and diffused oblique radiation are selected as the original meteorological element variables; because the photovoltaic array does not generate electricity at night, the night-time data with zero power are removed, and the data in the period 6:00-19:30 of each day are retained as the case-analysis data.
Searching for missing values in the historical meteorological data and photovoltaic power generation data and filling them by interpolation; detecting outliers in the original historical meteorological data and photovoltaic power generation data with a box plot, and replacing them with the upper and lower bounds of the box plot;
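The preprocessing described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names are hypothetical, linear interpolation is assumed because the text only says "an interpolation method", and the box-plot bounds are assumed to be the conventional Q1 - 1.5·IQR and Q3 + 1.5·IQR whiskers.

```python
from statistics import quantiles


def fill_missing_linear(values):
    """Fill None gaps by linear interpolation between the nearest known
    neighbours; gaps at either end copy the nearest known value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no known values")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def clip_to_box_plot(values, whisker=1.5):
    """Replace outliers by the box-plot bounds (assumed 1.5 * IQR whiskers)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - whisker * iqr, q3 + whisker * iqr
    return [min(max(v, lower), upper) for v in values]


power = [10, 11, 12, 13, 14, 15, 16, 17, 100]   # 100 is an obvious outlier
print(clip_to_box_plot(power))
print(fill_missing_linear([1.0, None, 3.0]))
```

Clipping rather than deleting outliers keeps the 5 min series evenly spaced, which the later sliding-window step relies on.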
Step 1.2, measuring the degree of correlation between the photovoltaic power and each meteorological element variable with the Kendall rank correlation coefficient R;
the Kendall rank correlation coefficient R is defined as formula (1):
$$R = \frac{P - Q}{\frac{1}{2} n (n - 1)} \tag{1}$$
in the formula: P denotes the number of concordant pairs; Q denotes the number of discordant pairs; $\frac{1}{2} n (n - 1)$ denotes the total number of pairs of observations, n being the number of observations. Two pairs of observations $(A_i, B_i)$ and $(A_j, B_j)$ of variables A and B are concordant when $A_i < A_j$ and $B_i < B_j$, or when $A_i > A_j$ and $B_i > B_j$; otherwise they are discordant;
the closer the absolute value of the Kendall rank correlation coefficient R is to 1, the higher the correlation between the meteorological variable and the output power; a positive coefficient indicates positive correlation and a negative coefficient indicates negative correlation;
Step 1.3, selecting the meteorological factors whose Kendall rank correlation coefficient R with the photovoltaic power has an absolute value of not less than 0.5 as the inputs of the prediction model;
in this embodiment, the Kendall rank correlation coefficient R between the photovoltaic power variable and each meteorological element variable is shown in Table 1:
TABLE 1 Meteorological factor variables
A plurality of meteorological factors highly correlated with the photovoltaic power, namely those whose Kendall rank correlation coefficient with the photovoltaic power has an absolute value of not less than 0.5, are selected as the inputs of the prediction model; total horizontal radiation, diffused horizontal radiation, total oblique radiation and diffused oblique radiation are therefore selected as the meteorological element variables of this embodiment;
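The variable screening of steps 1.2 and 1.3 can be illustrated with a short sketch. The function and variable names are hypothetical and the data are made up for illustration; tied pairs are assumed to count as neither concordant nor discordant, which the text does not specify.

```python
from itertools import combinations


def kendall_r(a, b):
    """Kendall rank correlation R = (P - Q) / (n (n - 1) / 2)."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two series of equal length >= 2")
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        sign = (a[i] - a[j]) * (b[i] - b[j])
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
    n = len(a)
    return (concordant - discordant) / (n * (n - 1) / 2)


def select_inputs(power, weather, threshold=0.5):
    """Keep the weather variables whose |R| with power is >= threshold."""
    return [name for name, series in weather.items()
            if abs(kendall_r(power, series)) >= threshold]


power = [0.0, 1.0, 2.0, 3.0, 4.0]
weather = {
    "irradiance": [0, 10, 20, 30, 40],   # rises with power
    "humidity": [5, 4, 3, 2, 1],         # falls as power rises
    "wind": [2, 1, 3, 1, 2],             # unrelated to power
}
print(select_inputs(power, weather))
```

The pairwise loop is quadratic in the number of observations; for a long 5 min series a library routine would normally be preferred.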
Step 2.1, selecting the meteorological factor variable with the highest Kendall rank correlation coefficient with the photovoltaic power as the clustering variable; in this embodiment, total horizontal radiation is selected as the clustering variable;
Step 2.2, selecting the average value, standard deviation, maximum value, number of peaks and valleys, coefficient of variation, kurtosis and skewness of the clustering variable as statistical features;
the coefficient of variation C, kurtosis K and skewness D are defined in formulas (2) to (4), respectively:
$$C = \frac{\sigma}{\bar{X}} \tag{2}$$
$$K = \frac{1}{M} \sum_{i=1}^{M} \frac{(X_i - \bar{X})^4}{\sigma^4} \tag{3}$$
$$D = \frac{1}{M} \sum_{i=1}^{M} \frac{(X_i - \bar{X})^3}{\sigma^3} \tag{4}$$
in the formula, $\sigma$ denotes the standard deviation of the variable, $\bar{X}$ denotes the mean value of the variable, $X_i$ denotes a sample of the variable, and M denotes the total number of samples of the variable;
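The seven daily statistical features of step 2.2 can be sketched as below. This is an illustration under stated assumptions: the function name is hypothetical, the standard deviation is taken as the population form (division by M), kurtosis and skewness use the plain standardized fourth and third moments, and the number of peaks and valleys is counted as the number of strict local extrema, none of which the text fixes explicitly.

```python
from math import sqrt


def daily_features(x):
    """Return the 7 clustering features for one day's samples x."""
    m = len(x)
    mean = sum(x) / m
    std = sqrt(sum((v - mean) ** 2 for v in x) / m)   # population std (assumed)
    # A strict local extremum is a point where the slope changes sign.
    extrema = sum(
        1 for i in range(1, m - 1)
        if (x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0
    )
    return {
        "mean": mean,
        "std": std,
        "max": max(x),
        "peaks_valleys": extrema,
        "cv": std / mean,
        "kurtosis": sum((v - mean) ** 4 for v in x) / (m * std ** 4),
        "skewness": sum((v - mean) ** 3 for v in x) / (m * std ** 3),
    }


print(daily_features([1.0, 3.0, 2.0, 4.0, 0.0]))
```

A day with constant or zero-mean radiation would divide by zero here; real data would need a guard for that case.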
Step 3.1, calculating the values of the 7 statistical features of the clustering variable for each day according to the statistical features constructed in step 2, and using them as the clustering data for the fuzzy C-means (FCM) clustering algorithm;
Step 3.2, determining the number c of clustering categories, initializing the cluster centers $V_i$, setting the fuzzification parameter m, initializing the membership matrix $U^{(0)}$, and setting the termination criterion $\varepsilon$ of the algorithm;
Step 3.3, calculating all cluster centers of the t-th iteration according to formula (5) to obtain the cluster center matrix:
$$V_j^{(t)} = \frac{\sum_{i=1}^{n} \left(u_{ij}^{(t-1)}\right)^{m} x_i}{\sum_{i=1}^{n} \left(u_{ij}^{(t-1)}\right)^{m}}, \quad j = 1, \dots, c \tag{5}$$
in the formula: $u_{ij}$ is the membership degree of the i-th sample in the j-th class; $V_j^{(t)}$ is the center of the j-th class at iteration t; $x_i$ is a sample point; m is the membership factor; n is the number of samples; t is the iteration number; c is the number of clusters.
Step 3.4, updating the membership matrix $U^{(t)}$ as shown in formula (6):
$$u_{ij}^{(t)} = \frac{1}{\sum_{k=1}^{c} \left( d_{ij} / d_{ik} \right)^{2/(m-1)}} \tag{6}$$
in the formula: $u_{ij}$ is the membership degree of the i-th sample in the j-th class; m is the membership factor; t is the iteration number; c is the number of clusters; $d_{ij}$ is the distance from the i-th sample to the j-th class center.
Step 3.5, calculating $\|U^{(t)} - U^{(t-1)}\|$ and checking whether the iteration stop condition $\|U^{(t)} - U^{(t-1)}\| < \varepsilon$ is met; if so, stopping the iteration; otherwise, repeating step 3.3 and step 3.4 until the condition is met, finally obtaining the similar-day data set for each weather condition;
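Steps 3.2 to 3.5 follow the usual fuzzy C-means iteration, which can be sketched as below. The sketch makes several assumptions that the text leaves open: only the membership matrix is initialized randomly (the first centers are derived from it), the distance is Euclidean, the matrix norm in the stop condition is the largest absolute element change, and `max_iter` and `seed` are hypothetical safety parameters.

```python
import random
from math import dist


def fuzzy_c_means(samples, c, m=2.0, eps=1e-5, max_iter=300, seed=0):
    """Cluster feature tuples; returns (centers, membership matrix)."""
    rng = random.Random(seed)
    n, dim = len(samples), len(samples[0])
    # Random membership matrix U(0); every row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(max_iter):
        # Cluster centers: membership-weighted means of the samples.
        centers = []
        for j in range(c):
            weights = [u[i][j] ** m for i in range(n)]
            total = sum(weights)
            centers.append(tuple(
                sum(w * samples[i][k] for i, w in enumerate(weights)) / total
                for k in range(dim)
            ))
        # Membership update from the distances to every center.
        new_u = []
        for x in samples:
            d = [dist(x, v) for v in centers]
            if any(dj == 0.0 for dj in d):
                hits = [1.0 if dj == 0.0 else 0.0 for dj in d]
                new_u.append([h / sum(hits) for h in hits])
            else:
                inv = [dj ** (-2.0 / (m - 1.0)) for dj in d]
                total = sum(inv)
                new_u.append([v / total for v in inv])
        change = max(abs(a - b)
                     for row_a, row_b in zip(new_u, u)
                     for a, b in zip(row_a, row_b))
        u = new_u
        if change < eps:          # stop condition on the membership change
            break
    return centers, u


days = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centers, u = fuzzy_c_means(days, c=2)
labels = [row.index(max(row)) for row in u]
print(labels)
```

Each day is assigned to the class with the largest membership, which yields one similar-day data set per class.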
the min-max normalization method is shown in formula (7):
$$x = \frac{x^{*} - x_{\min}}{x_{\max} - x_{\min}} \tag{7}$$
in the formula, $x^{*}$ denotes the data to be normalized, $x$ denotes the normalized data, $x_{\max}$ denotes the maximum value of the variable, and $x_{\min}$ denotes the minimum value of the variable.
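A minimal sketch of min-max normalization and its inverse is given below; the inverse is what step 9 later applies to the predicted intervals. The function names are hypothetical, and taking the minimum and maximum from a single reference series is an assumption.

```python
def min_max_fit(values):
    """Return the (minimum, maximum) pair used for scaling."""
    return min(values), max(values)


def normalize(values, lo, hi):
    """Map values into [0, 1] using the fitted bounds."""
    return [(v - lo) / (hi - lo) for v in values]


def denormalize(values, lo, hi):
    """Inverse transform back to physical units."""
    return [v * (hi - lo) + lo for v in values]


lo, hi = min_max_fit([2.0, 4.0, 6.0])
scaled = normalize([2.0, 4.0, 6.0], lo, hi)
print(scaled, denormalize(scaled, lo, hi))
```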
Step 6.1, constructing feature maps from the normalized similar-day data set with a sliding window, inputting the feature maps into a CNN (convolutional neural network), and extracting feature vectors representing the dynamic change of the photovoltaic power with its convolution layer and pooling layer;
the neuron output after the convolutional neural network extracts the local features is shown in formula (8);
in the formula: O denotes the local output of a neuron; I is the neuron input; l, m and n denote the 3 dimensions of the output matrix; i, j and n denote the length, width and depth of the convolution kernel K; $b_n$ denotes the threshold of the convolution kernel; the operator symbol in formula (8) denotes matrix multiplication.
Step 6.2, converting the output feature vectors into a time series and inputting it into the BiLSTM network so as to further capture the long-term dependence in the time series.
The calculation process of the BiLSTM network is as follows: the forget gate determines which input information is deleted from the memory cell state, as shown in formula (9);
$$f_t = \sigma(W_f \cdot [h_{t-1}, X_t] + b_f) \tag{9}$$
the output value of the previous moment and the input value of the current moment are fed into the input gate, and the output value of the input gate is computed as shown in formula (10);
$$i_t = \sigma(W_i \cdot [h_{t-1}, X_t] + b_i) \tag{10}$$
the output value of the previous moment and the input value of the current moment are fed into the input gate, and the candidate cell state is computed as shown in formula (11);
$$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, X_t] + b_C) \tag{11}$$
the current cell state is updated as shown in formula (12);
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t \tag{12}$$
the output value of the previous moment and the input value of the current moment are fed into the output gate, and the output value of the output gate is computed as shown in formula (13);
$$o_t = \sigma(W_o \cdot [h_{t-1}, X_t] + b_o) \tag{13}$$
the output of the output gate is combined with the cell state to obtain the output value, as shown in formula (14);
$$h_t = o_t * \tanh(C_t) \tag{14}$$
in the formula: $f_t$ is the forget gate output at time t; $\sigma$ and tanh are activation functions; $h_{t-1}$ is the output at time t-1; $X_t$ is the input at time t; $W_f$, $W_i$, $W_C$, $W_o$ are weight coefficients; $b_f$, $b_i$, $b_C$, $b_o$ are bias parameters; $i_t$ and $\tilde{C}_t$ are the outputs of the input gate at time t; $C_{t-1}$ is the cell state at time t-1; $C_t$ is the cell state at time t; $h_t$ is the output at time t; $o_t$ is the output of the output gate at time t after activation by the Sigmoid function.
The data are fed in forward order into the forward LSTM layer to obtain its output. The data are fed in reverse order into the backward LSTM layer, and its output is reversed again to give the output of the backward LSTM layer. Finally, the forward LSTM layer output and the backward LSTM layer output are linearly superposed with certain weights to obtain the output result;
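The gate equations (9) to (14) and the bidirectional combination can be traced with a deliberately tiny sketch that uses a scalar input and a scalar hidden state. All names are hypothetical, the parameter layout `(w_h, w_x, b)` per gate is an illustrative simplification of the weight matrices, and the equal weights 0.5/0.5 for the superposition are an assumption, since the text only says "certain weights".

```python
from math import exp, tanh


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step; p maps gate name -> (w_h, w_x, b)."""
    def gate(name, activation):
        w_h, w_x, b = p[name]
        return activation(w_h * h_prev + w_x * x_t + b)

    f_t = gate("f", sigmoid)            # forget gate, formula (9)
    i_t = gate("i", sigmoid)            # input gate, formula (10)
    c_tilde = gate("c", tanh)           # candidate cell state, formula (11)
    c_t = f_t * c_prev + i_t * c_tilde  # cell state update, formula (12)
    o_t = gate("o", sigmoid)            # output gate, formula (13)
    h_t = o_t * tanh(c_t)               # hidden output, formula (14)
    return h_t, c_t


def run_lstm(xs, p):
    """Run the cell over a sequence, starting from zero states."""
    h = c = 0.0
    outputs = []
    for x in xs:
        h, c = lstm_step(x, h, c, p)
        outputs.append(h)
    return outputs


def run_bilstm(xs, p_fwd, p_bwd, w_fwd=0.5, w_bwd=0.5):
    """Weighted sum of a forward pass and a re-reversed backward pass."""
    forward = run_lstm(xs, p_fwd)
    backward = run_lstm(xs[::-1], p_bwd)[::-1]
    return [w_fwd * a + w_bwd * b for a, b in zip(forward, backward)]


params = {"f": (0.1, 0.2, 0.0), "i": (0.3, 0.1, 0.0),
          "c": (0.2, 0.5, 0.0), "o": (0.1, 0.4, 0.0)}
print(run_bilstm([0.2, 0.5, 0.2], params, params))
```

Deep learning frameworks typically concatenate the two directions by default instead of summing them; the weighted sum here follows the wording of the text.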
step 6.3, introducing a QR model, combining with the CNN-BilSTM, and fusing the QR model and the CNN-BilSTM model in a loss function mode;
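let $\hat{Y}_t = f(X_t, \omega)$ denote the CNN-BiLSTM point prediction model, where $X_t$ is the model input (the independent variable), $\omega$ denotes the model parameters of the CNN-BiLSTM, $Y_t$ is the dependent variable, and $\hat{Y}_t$ is the predicted value of $Y_t$.
The QR-CNN-BiLSTM model may then be represented as $\hat{Q}_{Y_t}(\tau \mid X_t) = f(X_t, \omega(\tau))$, where $\tau \in (0, 1)$ is the quantile. The estimate $\hat{\omega}(\tau)$ of the model parameters at each quantile is obtained by minimizing the quantile loss function $\sum_{t} \rho_\tau\big(Y_t - f(X_t, \omega(\tau))\big)$, with $\rho_\tau(u) = u\,(\tau - I(u < 0))$;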
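To make the fusion through the loss function concrete, the sketch below implements the standard quantile (pinball) loss averaged over several quantiles. It assumes the usual quantile regression loss; the function name, array shapes and sample values are hypothetical and are not taken from the patent.

```python
import numpy as np

def pinball_loss(y_true, y_pred, taus):
    """Mean quantile (pinball) loss.

    y_true: shape (n,); y_pred: shape (n, q), one column per quantile in taus.
    """
    y_true = np.asarray(y_true, dtype=float)[:, None]
    taus = np.asarray(taus, dtype=float)[None, :]
    err = y_true - np.asarray(y_pred, dtype=float)
    # tau * err when the prediction is too low, (tau - 1) * err when it is too high
    return float(np.mean(np.maximum(taus * err, (taus - 1.0) * err)))

# Hypothetical observations and predictions at the 0.1 and 0.9 quantiles
y = np.array([1.0, 2.0, 3.0])
pred = np.array([[0.5, 1.5], [2.0, 2.5], [3.5, 3.0]])
loss = pinball_loss(y, pred, taus=[0.1, 0.9])
```

The asymmetric weighting is what pushes each output neuron toward its own quantile: under-prediction is penalized by $\tau$ and over-prediction by $1 - \tau$.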
step 6.4, setting the parameters of the model and training the model.
The QR-CNN-BiLSTM network consists of a CNN layer, a max pooling layer, three BiLSTM layers and a fully connected layer connected in sequence;
setting the number of convolution layers as 1 layer, the number of convolution kernels as 64, the size of the convolution kernels as 4, the boundary processing mode of convolution as ' same ', the activation function as ' and the size of the pooling window as 3;
setting the number of BiLSTM layers to 3, the number of neurons to 128 and the dropout parameter to 0.2; the quantiles start at 0.05 and increase in steps of 0.05 up to, but not including, the end point 1, which gives the 19 quantiles 0.05, 0.10, ..., 0.95, so the number of neurons in the fully connected layer is 19.
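The count of 19 output neurons follows from the quantile grid, as the short check below illustrates. It assumes that the end point 1 is excluded from the grid, which is the reading consistent with 19 neurons; the variable names are illustrative only.

```python
import numpy as np

start, step, end = 0.05, 0.05, 1.0
# Derive the count as an integer first to avoid floating-point drift in the grid
n_quantiles = int(round((end - start) / step))   # the end point itself is excluded
taus = np.round(start + step * np.arange(n_quantiles), 2)
```

Each entry of `taus` corresponds to one neuron of the fully connected output layer.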
setting the initial learning rate to 0.01, the learning rate decay to 1.5, the minimum learning rate to $10^{-4}$, the maximum number of iterations to 100, the batch size to 32, and the optimizer to Adam;
inputting the training set of each similar-day data set into the constructed QR-CNN-BiLSTM model for training;
step 9, based on the inverse-normalized photovoltaic power interval prediction results obtained in step 8, generating photovoltaic probability prediction results on the test set using a kernel density estimation (KDE) algorithm optimized by cross-validation and grid search;
specifically: for a given future photovoltaic power prediction point, applying the QR-CNN-BiLSTM fusion model of step 7 yields a group of N samples under N conditional quantiles, that is, a vector of N samples $(x_1, x_2, \ldots, x_N)$. Its probability density function can be obtained by kernel density estimation, and the KDE of the vector is computed as shown in formula (15):
$$\hat{f}(x) = \frac{1}{NB} \sum_{i=1}^{N} K\!\left(\frac{x - x_i}{B}\right) \qquad (15)$$
in the formula: N is the total number of samples; B is the optimal bandwidth determined by cross-validated grid search, with B > 0; K is the kernel function. The invention uses the Gaussian kernel as the kernel function K, as shown in formula (16):
$$K(u) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{u^2}{2}\right) \qquad (16)$$
the cross validation and grid search method provided by the invention optimizes the kernel density estimation, solves the problem of difficult bandwidth selection in the kernel density estimation, and can generate a high-quality probability prediction result.
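A minimal sketch of bandwidth selection for a Gaussian-kernel density estimate is given below. It is not the patented implementation: it assumes leave-one-out cross-validation with the log-likelihood as the score, and the candidate bandwidth grid and the 19 quantile forecasts are hypothetical values chosen only for illustration.

```python
import numpy as np

def gaussian_kde_pdf(x, samples, bandwidth):
    """Gaussian-kernel density estimate of `samples`, evaluated at the points `x`."""
    samples = np.asarray(samples, dtype=float)
    u = (np.asarray(x, dtype=float)[:, None] - samples[None, :]) / bandwidth
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(samples) * bandwidth * np.sqrt(2 * np.pi))

def loo_log_likelihood(samples, bandwidth):
    """Leave-one-out log-likelihood of the samples under the KDE (assumed CV score)."""
    samples = np.asarray(samples, dtype=float)
    n = len(samples)
    u = (samples[:, None] - samples[None, :]) / bandwidth
    k = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    np.fill_diagonal(k, 0.0)                  # leave each point out of its own estimate
    dens = k.sum(axis=1) / ((n - 1) * bandwidth)
    return float(np.sum(np.log(dens + 1e-300)))

def best_bandwidth(samples, grid):
    """Grid search: return the candidate bandwidth with the highest CV score."""
    scores = [loo_log_likelihood(samples, b) for b in grid]
    return float(grid[int(np.argmax(scores))])

# Hypothetical quantile forecasts for one prediction point (19 values)
samples = np.linspace(2.0, 3.0, 19)
grid = np.linspace(0.01, 0.5, 50)             # hypothetical candidate bandwidths
B = best_bandwidth(samples, grid)
xs = np.linspace(1.0, 4.0, 601)
pdf = gaussian_kde_pdf(xs, samples, B)
```

A bandwidth that is too small makes the held-out points improbable, and one that is too large flattens the density, so the cross-validated score peaks in between.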
The photovoltaic power point prediction, interval prediction and probability prediction results are then evaluated. The root mean square error $e_{RMSE}$ and the mean absolute percentage error $e_{MAPE}$ are used to evaluate the point prediction results; the comprehensive interval evaluation index $I_{WC}$ is used to evaluate the interval prediction results; and the continuous ranked probability score $P_{CRPS}$ is used to evaluate the probability prediction results, as shown in the following formulas:
$$e_{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (P_{ri} - P_{pi})^2}$$
$$e_{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{P_{ri} - P_{pi}}{P_{ri}} \right| \times 100\%$$
$$I_{WC} = I_{PINAW} / I_{PICP}$$
wherein:
$$I_{PICP} = \frac{1}{N} \sum_{n=1}^{N} S_n$$
$$I_{PINAW} = \frac{1}{NE} \sum_{i=1}^{N} (P_{upi} - P_{downi})$$
$$P_{CRPS} = \frac{1}{N} \sum_{i=1}^{N} \int_{-\infty}^{+\infty} \left[ F(P_{pi}) - H(P_{pi} - P_{ri}) \right]^2 \, dP_{pi}, \qquad F(P_{pi}) = \int_{-\infty}^{P_{pi}} p(x) \, dx$$
in the formulas: $P_{ri}$ is the observed power value, $P_{pi}$ is the predicted power value, and N is the total number of predicted future photovoltaic power points. $S_n$ is a logical value: $S_n$ is 1 when the observed value falls within the prediction interval and 0 otherwise; E is the difference between the maximum and minimum observed values; $P_{upi}$ and $P_{downi}$ are the upper and lower bounds of the prediction interval, respectively; $p(x)$ is the probability density function; $F(P_{pi})$ is the cumulative distribution function of $P_{pi}$; $H(P_{pi} - P_{ri})$ is a step function.
In the photovoltaic power interval probability prediction method based on the deep learning fusion model, meteorological factors are first screened by a correlation coefficient method, which reduces the model prediction error caused by an excess of irrelevant features. Highly correlated meteorological factor variables are then selected as clustering variables, their statistical characteristics are constructed, and the FCM clustering algorithm is applied to obtain similar-day data sets, laying a foundation for further improving prediction accuracy. The QR-CNN-BiLSTM model fuses two deep learning models, CNN and BiLSTM; compared with a traditional single deep learning prediction model, it has higher prediction accuracy and can generate higher-quality interval prediction results. Optimizing the kernel density estimation by cross-validation and grid search resolves the difficulty of bandwidth selection in kernel density estimation and can generate high-quality probability prediction results.
Examples
Photovoltaic power data for 2019-2020 from a photovoltaic array, produced by a certain manufacturer, at the Alice Springs site of the Desert Knowledge Australia Solar Centre (DKASC), together with the 4 meteorological factors retained by screening, are used as simulation data. The time resolution of the data set is 5 min. Because the photovoltaic array does not generate electricity at night, night-time data with a power of 0 are removed, and the data in the period 6:00-19:30 of each day are retained as the data for the example analysis.
Finally, the data set is divided into three categories by the fuzzy C-means clustering algorithm: similar-day data sets for sunny days, sunny-to-cloudy days and rainy days. Each similar-day data set is divided into a training set and a test set, with the training set accounting for a proportion of 0.7: the test set takes 30 d of similar-day data that are close in time, and the training set takes the 70 d of similar-day data in the same data set that are closest to the test set.
To illustrate the advantages of the proposed model in short-term photovoltaic power interval and probability prediction, the proposed QR-CNN-BiLSTM model is compared with the QR-LSTM and QR-BiLSTM models under the 3 weather types, and one day is randomly selected from each weather type for visual analysis. Figs. 3-11 show the point prediction results and the 95% confidence level interval prediction results of each model under the sunny, sunny-to-cloudy and rainy weather types. Figs. 12-14 show the probability prediction results of the QR-CNN-BiLSTM model combined with kernel density estimation under the same three weather types (probability prediction results at 9 points selected at equal spacing from 164 points). The evaluation indices are compared in Tables 2, 3 and 4.
Table 2 Evaluation indices of the models
Table 3 Evaluation indices of the models
Table 4 Evaluation indices of the models
As Table 2 shows, on sunny days the prediction interval coverage of every model is close to 100%, while the prediction interval width of the QR-CNN-BiLSTM model is clearly smaller than that of the other models: its PINAW index is 72.77% lower than that of the QR-LSTM model and 66.26% lower than that of the QR-BiLSTM model. The comprehensive interval evaluation index $I_{WC}$ of the QR-CNN-BiLSTM model is also the smallest, 72.62% lower than that of QR-LSTM and 66.08% lower than that of QR-BiLSTM, so its interval prediction performance is the best. The probability evaluation index CRPS of QR-CNN-BiLSTM is the smallest, so its probability prediction performance is also the best. The deterministic evaluation indices show that the point prediction performance of the QR-CNN-BiLSTM model is also better.
As Table 3 shows, on sunny-to-cloudy days, with the PICP of every model's prediction interval above 95%, the PINAW of the QR-CNN-BiLSTM prediction interval is markedly lower: 28.34% lower than that of QR-LSTM and 25.62% lower than that of QR-BiLSTM. The comprehensive interval evaluation index $I_{WC}$ of the QR-CNN-BiLSTM model is also the smallest, 28.80% lower than that of QR-LSTM and 26.09% lower than that of QR-BiLSTM, so its interval prediction performance is the best. The probability evaluation index CRPS of QR-CNN-BiLSTM is the smallest, so its probability prediction performance is also the best. The deterministic evaluation indices show that the point prediction performance of the QR-CNN-BiLSTM model is also better.
As Table 4 shows, on rainy days, with the PICP of every model's prediction interval above 95%, the PINAW of the QR-CNN-BiLSTM prediction interval is lower: 7.98% lower than that of QR-LSTM and 4.47% lower than that of QR-BiLSTM. Its comprehensive interval evaluation index $I_{WC}$ is also the smallest, 6.81% lower than that of QR-LSTM and 4.08% lower than that of QR-BiLSTM. The uncertainty of the photovoltaic power interval prediction is thus reduced, so the interval prediction performance of QR-CNN-BiLSTM is the best. The probability evaluation index CRPS of QR-CNN-BiLSTM is the smallest, so its probability prediction performance is also the best. The deterministic evaluation indices show that the point prediction performance of the QR-CNN-BiLSTM model is also better. From the above analysis, the QR-CNN-BiLSTM model is superior in point prediction, interval prediction and probability prediction performance.
Claims (5)
1. The photovoltaic power interval probability prediction method based on the deep learning fusion model is characterized by comprising the following steps:
step 1, acquiring preprocessed meteorological element data and historical photovoltaic power data, performing variable correlation analysis on the photovoltaic power and meteorological factors, and determining input variables of a prediction model;
step 2, selecting clustering variables and constructing statistical characteristics of the clustering variables;
step 3, according to the clustering variables and the statistical characteristics thereof selected in the step 2, performing similar daily clustering on the photovoltaic historical data by adopting a fuzzy C-means clustering algorithm to obtain a photovoltaic similar daily data set;
step 4, carrying out normalization processing on the photovoltaic similar day data set by adopting a min-max normalization method;
step 5, dividing the similar day data sets under each weather condition into a training set and a testing set;
step 6, constructing a QR-CNN-BiLSTM interval prediction model, setting the parameters of the CNN and BiLSTM models, and setting the relevant model training parameters; inputting the training set of the similar-day data set into the model for training;
step 7, inputting the test set data into the QR-CNN-BiLSTM interval prediction model trained in step 6, and performing photovoltaic power interval prediction;
step 8, performing inverse normalization on the interval prediction results so that they have physical meaning;
step 9, generating photovoltaic probability prediction results on the test set using a kernel density estimation algorithm optimized by cross-validation and grid search.
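The min-max normalization of step 4 and the inverse normalization of step 8 can be illustrated with the short sketch below. It assumes scaling to the range [0, 1] using the minimum and maximum of the data being normalized; the function names and power values are hypothetical.

```python
import numpy as np

def minmax_fit(x):
    """Return the (min, max) pair needed to normalize and later de-normalize."""
    return float(np.min(x)), float(np.max(x))

def minmax_transform(x, x_min, x_max):
    return (np.asarray(x, dtype=float) - x_min) / (x_max - x_min)

def minmax_inverse(x_norm, x_min, x_max):
    return np.asarray(x_norm, dtype=float) * (x_max - x_min) + x_min

# Hypothetical photovoltaic power values
power = np.array([0.0, 1.2, 3.5, 5.0])
p_min, p_max = minmax_fit(power)
power_norm = minmax_transform(power, p_min, p_max)
power_back = minmax_inverse(power_norm, p_min, p_max)
```

Applying the inverse with the same minimum and maximum restores predicted interval bounds to physical power units.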
2. The photovoltaic power interval probability prediction method based on the deep learning fusion model according to claim 1, wherein in the step 1, specifically:
step 1.1, selecting preprocessed meteorological element data and preprocessed photovoltaic power generation power data;
the time resolution of the meteorological element variable and the photovoltaic power variable is 5min, and wind speed, relative humidity, ambient temperature, total horizontal radiation, diffused horizontal radiation, rainfall, wind direction, total oblique radiation and diffused oblique radiation are selected as original meteorological element data variables;
step 1.2, measuring the degree of correlation among the plurality of meteorological element variables using the Kendall rank correlation coefficient R;
step 1.3, selecting the meteorological factors whose Kendall rank correlation coefficient R with the photovoltaic power has an absolute value of not less than 0.5 as inputs of the prediction model.
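The screening in steps 1.2 and 1.3 can be illustrated with the following sketch. It assumes the tau-a form of the Kendall coefficient (concordant minus discordant pairs, divided by the number of pairs, with no tie correction), which may differ from the exact variant used in the patent; the feature names and values are hypothetical.

```python
import numpy as np

def kendall_tau(x, y):
    """Kendall rank correlation, tau-a: (concordant - discordant) / number of pairs."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n = len(x)
    iu = np.triu_indices(n, k=1)                      # every pair with i < j
    s = np.sign(x[:, None] - x[None, :]) * np.sign(y[:, None] - y[None, :])
    return float(s[iu].sum() / (n * (n - 1) / 2))

def select_features(features, power, threshold=0.5):
    """Keep the features whose |tau| with the power is at least `threshold`."""
    taus = {name: kendall_tau(vals, power) for name, vals in features.items()}
    return {name: t for name, t in taus.items() if abs(t) >= threshold}

# Hypothetical data: five samples of power and three candidate factors
power = [0.0, 1.0, 2.5, 3.0, 4.2]
features = {
    "total_horizontal_radiation": [10, 200, 450, 520, 800],
    "relative_humidity": [80, 60, 65, 40, 30],
    "wind_direction": [120, 300, 90, 200, 150],
}
selected = select_features(features, power)
```

Strongly negative correlations are retained as well, since the threshold is applied to the absolute value of R.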
3. The photovoltaic power interval probability prediction method based on the deep learning fusion model according to claim 2, wherein in the step 2, specifically:
step 2.1, selecting the meteorological factor variable with the highest Kendall rank correlation coefficient with the photovoltaic power as the clustering variable;
step 2.2, selecting the mean, standard deviation, maximum, peak-to-valley number, coefficient of variation, kurtosis and skewness of the clustering variable as statistical characteristics.
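The seven statistical characteristics of step 2.2 could be computed for one day as sketched below. Several details are assumptions, since the claim does not define them: the peak-to-valley number is interpreted as the count of local peaks and valleys (sign changes of the slope), the standard deviation is the population form, and the kurtosis is the non-excess fourth standardized moment.

```python
import numpy as np

def daily_features(x):
    """Seven statistical features of one day's clustering-variable series."""
    x = np.asarray(x, dtype=float)
    mean, std = x.mean(), x.std()
    z = (x - mean) / std
    d = np.sign(np.diff(x))
    d = d[d != 0]                                    # ignore flat segments
    n_peaks_valleys = int(np.sum(d[1:] != d[:-1]))   # slope sign changes
    return {
        "mean": float(mean),
        "std": float(std),
        "max": float(x.max()),
        "peak_valley_count": n_peaks_valleys,
        "coef_variation": float(std / mean),
        "kurtosis": float(np.mean(z**4)),
        "skewness": float(np.mean(z**3)),
    }

# Hypothetical short series with two peaks and one valley
feats = daily_features([0.0, 2.0, 1.0, 3.0, 0.0])
```

Stacking one such feature vector per day gives the sample matrix that is passed to the clustering step.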
4. The photovoltaic power interval probability prediction method based on the deep learning fusion model according to claim 1, wherein in the step 3, specifically:
step 3.1, calculating numerical values of 7 statistical characteristics of the clustering variables on each day according to the statistical characteristics of the clustering variables constructed in the step 2;
step 3.2, determining the number c of data clustering categories, initializing the cluster centers $V_j$, specifying the fuzzification parameter m, initializing the membership matrix $U^{(0)}$, and specifying the termination criterion $\varepsilon$ of the algorithm;
step 3.3, calculating all cluster centers of the t-th iteration according to formula (5) to obtain the cluster center matrix:
$$V_j^{(t)} = \frac{\sum_{i=1}^{n} \left(u_{ij}^{(t-1)}\right)^m x_i}{\sum_{i=1}^{n} \left(u_{ij}^{(t-1)}\right)^m}, \quad j = 1, \ldots, c \qquad (5)$$
in the formula: $u_{ij}$ is the membership degree of the i-th sample in the j-th class; $x_i$ is a sample point; m is the membership factor; n is the number of samples; t is the iteration number; c is the number of clusters;
step 3.4, updating the membership matrix $U^{(t)}$ as shown in formula (6):
$$u_{ij}^{(t)} = \frac{1}{\sum_{k=1}^{c} \left(d_{ij} / d_{ik}\right)^{2/(m-1)}} \qquad (6)$$
in the formula: $u_{ij}$ is the membership degree of the i-th sample in the j-th class; $x_i$ is a sample point; m is the membership factor; n is the number of samples; t is the iteration number; c is the number of clusters; $d_{ij}$ is the distance from the i-th sample to the j-th cluster center;
step 3.5, calculating $\lVert U^{(t)} - U^{(t-1)} \rVert$ and checking whether the iteration stop condition $\lVert U^{(t)} - U^{(t-1)} \rVert < \varepsilon$ is met; if the condition is met, stopping the iteration; otherwise repeating step 3.3 and step 3.4 until the condition is met, finally obtaining the similar-day data sets under the various weather conditions.
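Steps 3.2 to 3.5 correspond to the standard fuzzy C-means iteration, which can be sketched as follows. This is an illustrative implementation under assumptions not stated in the claim: the membership matrix is initialized randomly (the cluster centers then follow from it), the Euclidean distance is used, and the values of m, the tolerance and the sample data are hypothetical.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, eps=1e-5, max_iter=300, seed=0):
    """Fuzzy C-means. X: (n, d) samples. Returns centers (c, d) and memberships U (n, c)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)                  # each row of U^(0) sums to 1
    for _ in range(max_iter):
        Um = U ** m
        V = (Um.T @ X) / Um.sum(axis=0)[:, None]       # cluster centers, formula (5)
        d = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                          # guard against division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)   # membership update, formula (6)
        converged = np.linalg.norm(U_new - U) < eps    # stop test of step 3.5
        U = U_new
        if converged:
            break
    return V, U

# Hypothetical feature vectors forming two well-separated groups
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
V, U = fuzzy_c_means(X, c=2)
labels = U.argmax(axis=1)
```

Assigning each day to the class with the largest membership degree, as in the last line, yields the similar-day data sets.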
5. The photovoltaic power interval probability prediction method based on the deep learning fusion model according to claim 1, wherein in the step 6, specifically:
step 6.1, constructing feature maps from the normalized similar-day data set using a sliding window, inputting the feature maps into the CNN network, and extracting feature vectors representing the dynamic change of photovoltaic power using its convolution layer and pooling layer;
step 6.2, converting the output feature vectors into a time sequence and inputting the time sequence into the BiLSTM network;
step 6.3, introducing a quantile regression (QR) model and fusing it with the CNN-BiLSTM model through the loss function;
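let $\hat{Y}_t = f(X_t, \omega)$ denote the CNN-BiLSTM point prediction model, where $X_t$ is the model input (the independent variable), $\omega$ denotes the model parameters of the CNN-BiLSTM, $Y_t$ is the dependent variable, and $\hat{Y}_t$ is the predicted value of $Y_t$;
the QR-CNN-BiLSTM model may then be represented as $\hat{Q}_{Y_t}(\tau \mid X_t) = f(X_t, \omega(\tau))$, where $\tau \in (0, 1)$ is the quantile; the estimate $\hat{\omega}(\tau)$ of the model parameters at each quantile is obtained by minimizing the quantile loss function $\sum_{t} \rho_\tau\big(Y_t - f(X_t, \omega(\tau))\big)$, with $\rho_\tau(u) = u\,(\tau - I(u < 0))$;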
step 6.4, setting the parameters of the model and training the model;
the QR-CNN-BiLSTM network consists of a CNN layer, a max pooling layer, three BiLSTM layers and a fully connected layer connected in sequence;
setting the convolution layer number to be 1, the convolution kernel number to be 64, the convolution kernel size to be 4, the convolution boundary processing mode to be 'same', the activation function to be 3 and the pooling window size to be 3;
setting the number of BiLSTM layers to 3, the number of neurons to 128 and the dropout parameter to 0.2; setting the quantile starting point to 0.05, the step length to 0.05 and the end point to 1, so that the number of neurons in the fully connected layer is 19;
setting the initial learning rate to 0.01, the learning rate decay to 1.5, the minimum learning rate to $10^{-4}$, the maximum number of iterations to 100, the batch size to 32, and the optimizer to Adam;
inputting the training set of each similar-day data set into the constructed QR-CNN-BiLSTM model for training.
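The sliding-window construction of step 6.1 can be illustrated as follows. The window length, the one-step prediction horizon and the convention that column 0 holds the photovoltaic power are assumptions made for this sketch; the claim does not specify them.

```python
import numpy as np

def sliding_windows(series, window, horizon=1):
    """Build (samples, window, features) inputs and the corresponding targets.

    series: (T, n_features) array whose column 0 is the normalized PV power.
    """
    series = np.asarray(series, dtype=float)
    n = series.shape[0] - window - horizon + 1
    X = np.stack([series[i:i + window] for i in range(n)])   # one feature map per sample
    start = window + horizon - 1
    y = series[start:start + n, 0]                           # power `horizon` steps after each window
    return X, y

# Hypothetical series: 10 time steps, 2 features
series = np.arange(20).reshape(10, 2)
X, y = sliding_windows(series, window=4)
```

Each `X[i]` is a window of consecutive time steps that serves as one feature map for the convolution layer, and `y[i]` is the power value the model is trained to predict for that window.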
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210823439.9A CN115115125A (en) | 2022-07-13 | 2022-07-13 | Photovoltaic power interval probability prediction method based on deep learning fusion model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115115125A true CN115115125A (en) | 2022-09-27 |
Family
ID=83332242
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210823439.9A Pending CN115115125A (en) | 2022-07-13 | 2022-07-13 | Photovoltaic power interval probability prediction method based on deep learning fusion model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115115125A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115144548A (en) * | 2022-08-31 | 2022-10-04 | 天津市环鉴环境检测有限公司 | Harmful gas composition real-time monitoring system and monitoring method thereof |
CN116565863A (en) * | 2023-07-10 | 2023-08-08 | 南京师范大学 | Short-term photovoltaic output prediction method based on space-time correlation |
CN116742624A (en) * | 2023-08-10 | 2023-09-12 | 华能新能源股份有限公司山西分公司 | Photovoltaic power generation amount prediction method and system |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115144548A (en) * | 2022-08-31 | 2022-10-04 | 天津市环鉴环境检测有限公司 | Harmful gas composition real-time monitoring system and monitoring method thereof |
CN116565863A (en) * | 2023-07-10 | 2023-08-08 | 南京师范大学 | Short-term photovoltaic output prediction method based on space-time correlation |
CN116565863B (en) * | 2023-07-10 | 2023-09-26 | 南京师范大学 | Short-term photovoltaic output prediction method based on space-time correlation |
CN116742624A (en) * | 2023-08-10 | 2023-09-12 | 华能新能源股份有限公司山西分公司 | Photovoltaic power generation amount prediction method and system |
CN116742624B (en) * | 2023-08-10 | 2023-11-03 | 华能新能源股份有限公司山西分公司 | Photovoltaic power generation amount prediction method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||