CN115115125A - Photovoltaic power interval probability prediction method based on deep learning fusion model - Google Patents

Photovoltaic power interval probability prediction method based on deep learning fusion model

Info

Publication number
CN115115125A
Authority
CN
China
Prior art keywords
model
photovoltaic power
bilstm
clustering
cnn
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210823439.9A
Other languages
Chinese (zh)
Inventor
王开艳
杜浩东
贾嵘
王颂凯
刘恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian University of Technology
Original Assignee
Xian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian University of Technology
Priority to CN202210823439.9A
Publication of CN115115125A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06: Energy or water supply
    • H: ELECTRICITY
    • H02: GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J: CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00: Circuit arrangements for ac mains or ac distribution networks
    • H02J3/003: Load forecast, e.g. methods or systems for forecasting future load demand

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Development Economics (AREA)
  • Power Engineering (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a photovoltaic power interval probability prediction method based on a deep learning fusion model, which comprises the following steps: first, acquiring photovoltaic power and meteorological factor data, performing variable correlation analysis, and determining the input variables of the prediction model; selecting a clustering variable, constructing its statistical features, performing similar-day clustering on the photovoltaic historical data with the fuzzy C-means clustering algorithm, and normalizing the clustered data; then dividing each similar-day data set into a training set and a test set; constructing a QR-CNN-BiLSTM interval prediction model, training it, and predicting the photovoltaic power interval; and finally generating a photovoltaic probability prediction result on the test set. The method tracks the future photovoltaic power trend well, measures the uncertainty of the photovoltaic power prediction with high accuracy while meeting the reliability requirement, generates the photovoltaic power prediction interval at the corresponding confidence level, and has practical application value.

Description

Photovoltaic power interval probability prediction method based on deep learning fusion model
Technical Field
The invention belongs to the technical field of photovoltaic power generation prediction, and particularly relates to a photovoltaic power interval probability prediction method based on a deep learning fusion model.
Background
In recent years, environmental pollution has become increasingly serious and the shortage of non-renewable energy increasingly prominent, and countries around the world are seeking new energy development paths. China began implementing its new energy development strategy in 2014, and solar energy, an important component of the energy transition, has been developed and used on a large scale. By the end of 2021, the cumulative installed photovoltaic capacity in China had reached 306 million kW, of which 53 million kW was newly installed nationwide in 2021. Large-scale grid-connected generation from new energy sources, represented by photovoltaic power, is an unstoppable development trend and a prominent feature of the new generation of power systems. However, under the influence of various complex environmental factors, photovoltaic power generation is strongly random, fluctuating, intermittent and non-stationary, and as a high proportion of photovoltaic generation is connected to the power system, it acts as an uncontrollable power source that seriously threatens the safe and stable operation of the system. Research on photovoltaic power prediction is therefore important for building a new generation of power systems in China that can accommodate a high proportion of renewable energy, and is valuable for building an integrated security defense system for the power system and for risk control.
By the form of the prediction result, existing photovoltaic power prediction techniques can be classified into deterministic prediction and uncertainty prediction. A deterministic prediction is a single-point prediction; it is intuitive, but it cannot represent the uncertainty of the photovoltaic power prediction. An uncertainty prediction gives the possible range of the photovoltaic power at a future moment together with its confidence level, and thus provides more comprehensive data support for power system scheduling, so it has greater engineering significance. By the type of model, prediction methods are classified into physical models and data-driven models. Physical models are built from the characteristics and installation angles of the photovoltaic modules and take geographic conditions and meteorological elements into account; their construction mechanism is complex, and they are currently seldom used. Data-driven models mainly comprise statistical methods and artificial intelligence algorithms. Statistical methods use curve fitting and parameter estimation to establish the relation between the photovoltaic power and its influencing factors; commonly used examples are time-series methods and the grey model. Artificial intelligence models, represented by neural networks and deep learning models, have strong nonlinear data processing capability and have been widely adopted in recent years.
Although many scholars have done a great deal of work in the field of photovoltaic power prediction, the following problems remain: (1) photovoltaic power prediction research concentrates on point prediction, and interval probability prediction has been studied less. (2) The reliability and sensitivity of existing interval probability prediction models are not high, and the performance of photovoltaic power interval probability prediction models urgently needs to be improved. (3) Most studies use example data sampled at intervals of 1 h or 15 min; when power is predicted at 5 min intervals, the photovoltaic power fluctuation is more complex and variable, a traditional single model cannot cope with it well, and multi-model fusion is one of the possible solutions.
Disclosure of Invention
The invention aims to provide a photovoltaic power interval probability prediction method based on a deep learning fusion model, which solves the problem that existing photovoltaic power interval prediction and probability prediction results are inaccurate.
The technical scheme adopted by the invention is a photovoltaic power interval probability prediction method based on a deep learning fusion model, implemented according to the following steps:
step 1, acquiring preprocessed meteorological element data and historical photovoltaic power data, performing variable correlation analysis between the photovoltaic power and the meteorological factors, and determining the input variables of the prediction model;
step 2, selecting a clustering variable and constructing its statistical features;
step 3, according to the clustering variable and statistical features selected in step 2, performing similar-day clustering on the photovoltaic historical data with the fuzzy C-means clustering algorithm to obtain photovoltaic similar-day data sets;
step 4, normalizing the photovoltaic similar-day data sets with the min-max normalization method;
step 5, dividing the similar-day data set under each weather condition into a training set and a test set;
step 6, constructing a QR-CNN-BiLSTM interval prediction model, setting the parameters of the CNN and BiLSTM models and the related training parameters, and inputting the training set of each similar-day data set into the model for training;
step 7, inputting the test set data into the QR-CNN-BiLSTM interval prediction model trained in step 6 and predicting the photovoltaic power interval;
step 8, applying inverse normalization to the interval prediction result so that it has physical meaning;
step 9, generating the photovoltaic probability prediction result on the test set with a kernel density estimation algorithm optimized by cross validation and grid search.
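The final step names a kernel density estimation whose bandwidth is tuned by cross validation and grid search. The following is only a minimal sketch of that idea, not the patent's implementation: it assumes a Gaussian kernel, a leave-one-out log-likelihood as the cross-validation score, and that the samples are the quantile predictions for one time step. All function names and the numeric values are hypothetical.

```python
import numpy as np

def gaussian_kde_pdf(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate on `grid`."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return kernel.sum(axis=1) / (samples.size * bandwidth)

def loo_log_likelihood(samples, bandwidth):
    """Leave-one-out log-likelihood, used here as the cross-validation score."""
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    z = (samples[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    np.fill_diagonal(kernel, 0.0)  # exclude each point's own contribution
    density = kernel.sum(axis=1) / ((n - 1) * bandwidth)
    return float(np.sum(np.log(density + 1e-300)))

def select_bandwidth(samples, candidates):
    """Grid search: keep the candidate bandwidth with the best score."""
    scores = [loo_log_likelihood(samples, h) for h in candidates]
    return float(candidates[int(np.argmax(scores))])

# Hypothetical quantile predictions (kW) for a single 5 min time step.
quantile_values = np.linspace(40.0, 60.0, 19)
candidates = np.linspace(0.5, 5.0, 10)      # assumed bandwidth grid
best_h = select_bandwidth(quantile_values, candidates)
grid = np.linspace(30.0, 70.0, 401)
pdf = gaussian_kde_pdf(quantile_values, grid, best_h)
```

The resulting `pdf` is a probability density over power values for that time step; a k-fold score could replace the leave-one-out score without changing the structure.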
The present invention is also characterized in that,
in step 1, the method specifically comprises the following steps:
step 1.1, selecting preprocessed meteorological element data and preprocessed photovoltaic power data;
the time resolution of the meteorological element variables and the photovoltaic power variable is 5 min, and wind speed, relative humidity, ambient temperature, total horizontal radiation, diffused horizontal radiation, rainfall, wind direction, total oblique radiation and diffused oblique radiation are selected as the original meteorological element variables;
step 1.2, measuring the degree of correlation between the photovoltaic power and the meteorological element variables with the Kendall rank correlation coefficient R;
step 1.3, selecting the meteorological factors whose Kendall rank correlation coefficient R with the photovoltaic power has an absolute value of at least 0.5 as inputs of the prediction model.
In step 2, the method specifically comprises the following steps:
step 2.1, selecting the meteorological factor variable with the highest Kendall rank correlation coefficient with the photovoltaic power as the clustering variable;
step 2.2, selecting the average value, the standard deviation, the maximum value, the number of peaks and valleys, the coefficient of variation, the kurtosis and the skewness of the clustering variable as its statistical features.
In step 3, the method specifically comprises the following steps:
step 3.1, calculating the values of the 7 statistical features of the clustering variable for each day, according to the statistical features constructed in step 2;
step 3.2, determining the number of clusters $c$, initializing the cluster centers $V_j$, specifying the fuzzification parameter $m$, initializing the membership matrix $U^{(0)}$, and specifying the termination criterion $\varepsilon$ of the algorithm;
step 3.3, calculating all cluster centers of the $t$-th iteration according to formula (5) to obtain the cluster center matrix:
$$V_j^{(t)} = \frac{\sum_{i=1}^{n} \left(u_{ij}^{(t-1)}\right)^{m} x_i}{\sum_{i=1}^{n} \left(u_{ij}^{(t-1)}\right)^{m}}, \quad j = 1, \dots, c \tag{5}$$
in the formula: $u_{ij}$ is the membership degree of the $i$-th sample in the $j$-th class; $x_i$ is a sample point; $m$ is the membership factor; $n$ is the number of samples; $t$ is the iteration number; $c$ is the number of clusters;
step 3.4, updating the membership matrix $U^{(t)}$ as shown in formula (6):
$$u_{ij}^{(t)} = \frac{1}{\sum_{k=1}^{c} \left( d_{ij} / d_{ik} \right)^{2/(m-1)}} \tag{6}$$
in the formula: $u_{ij}$ is the membership degree of the $i$-th sample in the $j$-th class; $m$ is the membership factor; $t$ is the iteration number; $c$ is the number of clusters; $d_{ij}$ is the distance from the $i$-th sample to the $j$-th class center;
step 3.5, calculating $\|U^{(t)} - U^{(t-1)}\|$ and checking whether the iteration stop condition $\|U^{(t)} - U^{(t-1)}\| < \varepsilon$ is met; if it is met, the iteration stops; otherwise steps 3.3 and 3.4 are repeated until it is met, finally giving the similar-day data set under each weather condition.
In step 6, the method specifically comprises the following steps:
step 6.1, constructing feature maps from the normalized similar-day data set with a sliding window, inputting them into the CNN network, and extracting feature vectors that represent the dynamic change of the photovoltaic power with the convolution layer and pooling layer of the network;
step 6.2, converting the output feature vectors into a time sequence and inputting it into the BiLSTM network;
step 6.3, introducing a QR (quantile regression) model and fusing it with the CNN-BiLSTM model through the loss function;
let
$$\hat{Y}_t = f(X_t, \Omega)$$
denote the CNN-BiLSTM point prediction model, where $X_t$ is the model input (the independent variable), $\Omega$ denotes the model parameters of CNN-BiLSTM, $Y_t$ is the dependent variable, and $\hat{Y}_t$ is the predicted value of $Y_t$;
the QR-CNN-BiLSTM model may be expressed as
$$Q_{Y_t}(\tau \mid X_t) = f(X_t, \Omega(\tau))$$
and the estimate $\hat{\Omega}(\tau)$ of the model parameters $\Omega(\tau)$ at each quantile $\tau$ is obtained by minimizing the loss function
$$\hat{\Omega}(\tau) = \arg\min_{\Omega(\tau)} \left[ \sum_{t:\, Y_t \ge f(X_t, \Omega(\tau))} \tau \left| Y_t - f(X_t, \Omega(\tau)) \right| + \sum_{t:\, Y_t < f(X_t, \Omega(\tau))} (1 - \tau) \left| Y_t - f(X_t, \Omega(\tau)) \right| \right];$$
step 6.4, setting the parameters of the model and training the model;
the QR-CNN-BiLSTM network consists of a CNN layer, a max pooling layer, three BiLSTM layers and a fully connected layer connected in sequence;
the number of convolution layers is set to 1, the number of convolution kernels to 64, the convolution kernel size to 4, the boundary processing (padding) mode of the convolution to 'same', the activation function to '' (the name is missing in the source text), and the pooling window size to 3;
the number of BiLSTM layers is set to 3, the number of neurons to 128 and the dropout parameter to 0.2; the quantile start point is set to 0.05, the step to 0.05 and the end point to 1 (exclusive), giving the 19 quantiles 0.05, 0.10, ..., 0.95, so the fully connected layer has 19 neurons;
the initial learning rate is set to 0.01, the learning rate decay to 1.5, the minimum learning rate to $10^{-4}$, the maximum number of iterations to 100 and the batch size to 32, and the optimizer is Adam;
the training set of each similar-day data set is input into the constructed QR-CNN-BiLSTM model for training.
The invention has the beneficial effects that:
the photovoltaic power interval probability prediction method based on the deep learning fusion model provides a correlation analysis method based on the Kendall rank correlation coefficient, so that the model input variables can be determined and invalid information in the historical data is reduced, which improves training efficiency. The meteorological factor variable most strongly correlated with the photovoltaic power is selected as the clustering variable, and 7 statistical features of it, such as the average value, are used as clustering features to comprehensively reflect the fluctuation pattern and characteristics of each day's historical data, so that the FCM (fuzzy C-means) algorithm clusters efficiently and reasonably. The QR-CNN-BiLSTM model fuses two deep learning models, CNN and BiLSTM, and achieves higher prediction accuracy than a traditional single deep learning prediction model. When future photovoltaic power is predicted at a fine time granularity of 5 min, the power changes more rapidly under non-sunny weather conditions; the method nevertheless tracks the future photovoltaic power trend well, measures the uncertainty of the photovoltaic power prediction with high accuracy while meeting the reliability requirement, generates the photovoltaic power prediction interval at the corresponding confidence level, and has practical application value.
Drawings
FIG. 1 is a flow chart of the photovoltaic power interval probability prediction method based on a deep learning fusion model according to the present invention;
FIG. 2 is a flow chart of the fuzzy C-means clustering algorithm used in the method of the present invention;
FIG. 3 shows the interval prediction results of the QR-CNN-BiLSTM model used in the method of the present invention on sunny days;
FIG. 4 shows the interval prediction results of a QR-LSTM model on sunny days;
FIG. 5 shows the interval prediction results of a QR-BiLSTM model on sunny days;
FIG. 6 shows the interval prediction results of the QR-CNN-BiLSTM model used in the method of the present invention on sunny-to-cloudy days;
FIG. 7 shows the interval prediction results of a QR-LSTM model on sunny-to-cloudy days;
FIG. 8 shows the interval prediction results of a QR-BiLSTM model on sunny-to-cloudy days;
FIG. 9 shows the interval prediction results of the QR-CNN-BiLSTM model used in the method of the present invention on rainy days;
FIG. 10 shows the interval prediction results of a QR-LSTM model on rainy days;
FIG. 11 shows the interval prediction results of a QR-BiLSTM model on rainy days;
FIG. 12 shows the probability prediction results of the QR-CNN-BiLSTM model combined with kernel density estimation on sunny days;
FIG. 13 shows the probability prediction results of the QR-CNN-BiLSTM model combined with kernel density estimation on sunny-to-cloudy days;
FIG. 14 shows the probability prediction results of the QR-CNN-BiLSTM model combined with kernel density estimation on rainy days.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
The photovoltaic power interval probability prediction method based on the deep learning fusion model is implemented, as shown in fig. 1, according to the following steps:
step 1, acquiring preprocessed meteorological element data and historical photovoltaic power data, performing variable correlation analysis between the photovoltaic power and the meteorological factors, and determining the input variables of the prediction model; the method specifically comprises the following steps:
step 1.1, selecting preprocessed meteorological element data and preprocessed photovoltaic power data, the time resolutions of the meteorological element variables and the photovoltaic power variable being kept consistent;
the time resolution of the meteorological element variables and the photovoltaic power variable is 5 min, and wind speed, relative humidity, ambient temperature, total horizontal radiation, diffused horizontal radiation, rainfall, wind direction, total oblique radiation and diffused oblique radiation are selected as the original meteorological element variables. Because the photovoltaic array does not generate electricity at night, data with zero night-time power are removed, and the data in the period 6:00-19:30 of each day are retained as the example analysis data.
Missing values in the historical meteorological data and photovoltaic power data are located and filled by interpolation; outliers in the original historical meteorological data and photovoltaic power data are identified with a box plot and replaced with the upper and lower bounds of the box plot;
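The preprocessing just described (interpolating missing values, then replacing box-plot outliers with the box-plot bounds) can be sketched as follows. The text does not state the interpolation type or the whisker length, so linear interpolation and the conventional 1.5 x IQR whiskers are assumptions here, and the function names are hypothetical.

```python
import numpy as np

def fill_missing(values):
    """Fill NaN gaps by linear interpolation over the sample index (assumed method)."""
    values = np.asarray(values, dtype=float).copy()
    idx = np.arange(values.size)
    missing = np.isnan(values)
    values[missing] = np.interp(idx[missing], idx[~missing], values[~missing])
    return values

def clip_boxplot_outliers(values, whisker=1.5):
    """Replace values outside the box-plot whiskers with the whisker bounds."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - whisker * iqr, q3 + whisker * iqr
    return np.clip(values, lower, upper)

# Hypothetical 5 min power samples with one gap and one spike.
power = np.array([10.0, 12.0, np.nan, 16.0, 15.0, 14.0, 500.0, 13.0])
filled = fill_missing(power)
cleaned = clip_boxplot_outliers(filled)
```

In this toy series the gap is filled with the midpoint of its neighbours and the spike is pulled down to the upper whisker, while the ordinary samples are left unchanged.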
step 1.2, measuring the degree of correlation between the photovoltaic power and each meteorological element variable with the Kendall rank correlation coefficient R;
the Kendall rank correlation coefficient R is defined by formula (1):
$$R = \frac{P - Q}{\frac{1}{2} n (n - 1)} \tag{1}$$
in the formula: P represents the number of concordant pairs; Q represents the number of discordant pairs; $\frac{1}{2} n (n - 1)$ represents the total number of observation pairs, $n$ being the number of observations. Two pairs of observations $(A_i, B_i)$ and $(A_j, B_j)$ of variables A and B are concordant if $A_i < A_j$ and $B_i < B_j$, or if $A_i > A_j$ and $B_i > B_j$; otherwise they are discordant;
the closer the absolute value of the Kendall rank correlation coefficient R is to 1, the stronger the correlation between the meteorological variable and the output power. A positive coefficient indicates positive correlation and a negative coefficient indicates negative correlation;
step 1.3, selecting the meteorological factors whose Kendall rank correlation coefficient R with the photovoltaic power has an absolute value of at least 0.5 as inputs of the prediction model;
in this example, the kendall rank correlation coefficient R of the photovoltaic power variable and each meteorological element variable is shown in table 1:
TABLE 1 Meteorological factor variables
Figure BDA0003743373950000093
The meteorological factors that are highly correlated with the photovoltaic power, namely those whose Kendall rank correlation coefficient with the photovoltaic power has an absolute value of at least 0.5, are selected as inputs of the prediction model; accordingly, total horizontal radiation, diffused horizontal radiation, total oblique radiation and diffused oblique radiation are selected as the meteorological element variables of this embodiment;
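The selection rule of steps 1.2 and 1.3 can be illustrated with a short sketch. It assumes the simple form of the Kendall coefficient (concordant minus discordant pairs over the total number of pairs, with no tie correction); the data values, variable names and function names are hypothetical and are not taken from Table 1.

```python
from itertools import combinations

def kendall_r(a, b):
    """Kendall rank correlation: (P - Q) / (n(n-1)/2), without tie correction."""
    n = len(a)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        s = (a[i] - a[j]) * (b[i] - b[j])
        if s > 0:
            concordant += 1      # both variables move in the same direction
        elif s < 0:
            discordant += 1      # the variables move in opposite directions
    return (concordant - discordant) / (n * (n - 1) / 2)

def select_inputs(power, factors, threshold=0.5):
    """Keep the factors whose |R| with the power is at least `threshold`."""
    return [name for name, series in factors.items()
            if abs(kendall_r(series, power)) >= threshold]

# Hypothetical samples, for illustration only.
power = [0.0, 1.2, 2.5, 3.1, 2.0, 0.4]
factors = {
    "total_horizontal_radiation": [5, 210, 480, 610, 390, 60],
    "wind_speed": [3.1, 2.0, 3.5, 1.8, 2.9, 2.4],
}
selected = select_inputs(power, factors)
```

With these toy values the radiation series is perfectly rank-correlated with the power and is kept, while the wind speed series falls below the 0.5 threshold and is dropped.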
step 2, selecting a clustering variable and constructing its statistical features, which specifically comprises the following steps:
step 2.1, selecting the meteorological factor variable with the highest Kendall rank correlation coefficient with the photovoltaic power as the clustering variable; in this embodiment, total horizontal radiation is selected as the clustering variable;
step 2.2, selecting the average value, the standard deviation, the maximum value, the number of peaks and valleys, the coefficient of variation, the kurtosis and the skewness of the clustering variable as its statistical features;
the coefficient of variation C, the kurtosis K and the skewness D are defined by formulas (2) to (4), respectively:
$$C = \frac{\sigma}{\bar{X}} \tag{2}$$
$$K = \frac{\frac{1}{M} \sum_{i=1}^{M} (X_i - \bar{X})^4}{\sigma^4} \tag{3}$$
$$D = \frac{\frac{1}{M} \sum_{i=1}^{M} (X_i - \bar{X})^3}{\sigma^3} \tag{4}$$
in the formula, $\sigma$ represents the standard deviation of the variable, $\bar{X}$ represents the mean value of the variable, $X_i$ represents a sample of the variable, and $M$ represents the total number of samples of the variable;
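A minimal sketch of the seven daily statistical features of step 2.2 follows. Several details are assumptions rather than statements from the text: the population standard deviation is used, kurtosis and skewness are taken as standardized fourth and third moments (no excess-kurtosis correction), and a peak or valley is counted wherever the sign of the first difference changes. The function names and sample values are hypothetical.

```python
import numpy as np

def count_peaks_valleys(x):
    """Count turning points: sign changes of the first difference (assumed definition)."""
    d = np.sign(np.diff(x))
    d = d[d != 0]                       # ignore flat segments
    return int(np.sum(d[1:] != d[:-1]))

def daily_features(x):
    """Return [mean, std, max, peaks+valleys, coeff. of variation, kurtosis, skewness]."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    std = x.std()                       # population standard deviation
    cv = std / mean
    kurt = np.mean((x - mean) ** 4) / std ** 4
    skew = np.mean((x - mean) ** 3) / std ** 3
    return np.array([mean, std, x.max(), count_peaks_valleys(x), cv, kurt, skew])

# Hypothetical total horizontal radiation samples for one day.
day = np.array([0.0, 100.0, 300.0, 250.0, 400.0, 200.0, 50.0])
features = daily_features(day)
```

Each day then contributes one 7-element feature vector, and the stacked vectors form the data matrix that is clustered in step 3.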
step 3, according to the clustering variable and statistical features selected in step 2, and as shown in fig. 2, performing similar-day clustering on the photovoltaic historical data with the fuzzy C-means (FCM) clustering algorithm to obtain photovoltaic similar-day data sets; the method specifically comprises the following steps:
step 3.1, calculating the values of the 7 statistical features of the clustering variable for each day, according to the statistical features constructed in step 2, and using them as the data to be clustered by the fuzzy C-means (FCM) algorithm;
step 3.2, determining the number of clusters $c$, initializing the cluster centers $V_j$, specifying the fuzzification parameter $m$, initializing the membership matrix $U^{(0)}$, and specifying the termination criterion $\varepsilon$ of the algorithm;
step 3.3, calculating all cluster centers of the $t$-th iteration according to formula (5) to obtain the cluster center matrix:
$$V_j^{(t)} = \frac{\sum_{i=1}^{n} \left(u_{ij}^{(t-1)}\right)^{m} x_i}{\sum_{i=1}^{n} \left(u_{ij}^{(t-1)}\right)^{m}}, \quad j = 1, \dots, c \tag{5}$$
in the formula: $u_{ij}$ is the membership degree of the $i$-th sample in the $j$-th class; $x_i$ is a sample point; $m$ is the membership factor; $n$ is the number of samples; $t$ is the iteration number; $c$ is the number of clusters.
Step 3.4, updating the membership matrix $U^{(t)}$ as shown in formula (6):
$$u_{ij}^{(t)} = \frac{1}{\sum_{k=1}^{c} \left( d_{ij} / d_{ik} \right)^{2/(m-1)}} \tag{6}$$
in the formula: $u_{ij}$ is the membership degree of the $i$-th sample in the $j$-th class; $m$ is the membership factor; $t$ is the iteration number; $c$ is the number of clusters; $d_{ij}$ is the distance from the $i$-th sample to the $j$-th class center.
Step 3.5, calculating $\|U^{(t)} - U^{(t-1)}\|$ and checking whether the iteration stop condition $\|U^{(t)} - U^{(t-1)}\| < \varepsilon$ is met; if it is met, the iteration stops; otherwise steps 3.3 and 3.4 are repeated until it is met, finally giving the similar-day data set under each weather condition;
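Steps 3.2 to 3.5 follow the usual fuzzy C-means iteration, which can be sketched as below. This is an illustrative implementation under stated assumptions: the standard FCM update rules, Euclidean distance, a random initial membership matrix, m = 2, and a hard assignment of each day to its highest-membership cluster at the end. None of these settings are given in the text, and the synthetic data are hypothetical.

```python
import numpy as np

def fuzzy_c_means(x, c, m=2.0, eps=1e-5, max_iter=300, seed=0):
    """Standard FCM on rows of `x`; returns (centers, membership matrix U)."""
    rng = np.random.default_rng(seed)
    u = rng.random((x.shape[0], c))
    u /= u.sum(axis=1, keepdims=True)            # each row of U sums to 1
    for _ in range(max_iter):
        um = u ** m
        # Step 3.3: cluster centers as membership-weighted means.
        centers = (um.T @ x) / um.sum(axis=0)[:, None]
        # Step 3.4: memberships from the distances to every center.
        d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                    # guard against zero distance
        inv = d ** (-2.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=1, keepdims=True)
        # Step 3.5: stop when the membership matrix barely changes.
        converged = np.linalg.norm(u_new - u) < eps
        u = u_new
        if converged:
            break
    return centers, u

# Two synthetic groups of 7-feature daily vectors (hypothetical data).
rng = np.random.default_rng(1)
features = np.vstack([rng.normal(0.0, 0.1, (20, 7)),
                      rng.normal(3.0, 0.1, (20, 7))])
centers, u = fuzzy_c_means(features, c=2)
labels = u.argmax(axis=1)                        # hard label per day
```

Grouping the days by `labels` yields one similar-day data set per cluster, which the method associates with a weather condition.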
step 4, normalizing the photovoltaic similar-day data sets with the min-max normalization method, so as to eliminate the influence of the different units and scales of the variables on the prediction result;
the min-max normalization method is shown in formula (7):
$$x = \frac{x^{*} - x_{\min}}{x_{\max} - x_{\min}} \tag{7}$$
in the formula, $x^{*}$ represents the data to be normalized, $x$ represents the normalized data, $x_{\max}$ represents the maximum value of the variable, and $x_{\min}$ represents the minimum value of the variable.
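Min-max normalization and its inverse (used later to restore physical units) can be sketched as follows. The column-wise treatment of variables and the function names are assumptions made for illustration, and the sample values are hypothetical.

```python
import numpy as np

def minmax_fit(data):
    """Column-wise minimum and maximum of the data to be normalized."""
    return data.min(axis=0), data.max(axis=0)

def minmax_transform(data, x_min, x_max):
    """Scale each column to the range [0, 1]."""
    return (data - x_min) / (x_max - x_min)

def minmax_inverse(scaled, x_min, x_max):
    """Undo the scaling to recover values in physical units."""
    return scaled * (x_max - x_min) + x_min

# Hypothetical columns: photovoltaic power and total horizontal radiation.
raw = np.array([[0.0, 200.0], [5.0, 600.0], [10.0, 1000.0]])
x_min, x_max = minmax_fit(raw)
scaled = minmax_transform(raw, x_min, x_max)
restored = minmax_inverse(scaled, x_min, x_max)
```

A common design choice, not stated in the text, is to compute `x_min` and `x_max` on the training set only and reuse them for the test set.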
Step 5, dividing the similar day data sets under all weather conditions into a training set and a test set;
step 6, constructing a quantile regression (QR) - convolutional neural network (CNN) - bidirectional long short-term memory (BiLSTM) neural network, namely the QR-CNN-BiLSTM interval prediction model, setting the parameters of the CNN and BiLSTM models and the related training parameters, and inputting the training set of each similar-day data set into the constructed QR-CNN-BiLSTM model for training; the method specifically comprises the following steps:
step 6.1, constructing feature maps from the normalized similar-day data set with a sliding window, inputting them into the CNN network, and extracting feature vectors that represent the dynamic change of the photovoltaic power with the convolution layer and pooling layer of the network;
the neuron output after the convolutional neural network extracts the local features is given by formula (8);
[Formula (8): image in the source, not reproduced in this text]
in the formula: O represents the local output of a neuron; I is the neuron input; l, m and n respectively represent the 3 dimensions of the output matrix; i, j and n respectively represent the length, width and depth of the convolution kernel K; $b_n$ is the threshold of the convolution kernel; and the operator symbol in formula (8) represents a matrix multiplication operation.
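The sliding-window construction of step 6.1 can be sketched as below. The window length, the one-step-ahead target and the array layout (samples, window, variables) are assumptions for illustration; the text does not specify them, and the function name is hypothetical.

```python
import numpy as np

def sliding_windows(features, target, window):
    """Build (samples, window, n_features) inputs and one-step-ahead targets."""
    n = features.shape[0] - window
    x = np.stack([features[i:i + window] for i in range(n)])
    y = target[window:window + n]     # value that follows each window
    return x, y

# Hypothetical normalized data: 10 time steps, 4 input variables.
features = np.arange(40, dtype=float).reshape(10, 4)
target = np.arange(10, dtype=float)
x, y = sliding_windows(features, target, window=3)
```

Each slice `x[k]` is a small time-by-variable matrix that a one-dimensional convolution layer can scan along the time axis.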
Step 6.2, converting the output feature vectors into a time sequence and inputting it into the BiLSTM network, so as to further capture the long-term dependence in the sequence.
The calculation process of each LSTM unit in the BiLSTM network is as follows. The forget gate determines which information is to be deleted from the memory cell state, as shown in formula (9):
$$f_t = \sigma(W_f \cdot [h_{t-1}, X_t] + b_f) \tag{9}$$
the output value of the previous moment and the input value of the current moment are fed into the input gate, whose output is calculated as shown in formula (10):
$$i_t = \sigma(W_i \cdot [h_{t-1}, X_t] + b_i) \tag{10}$$
the output value of the previous moment and the input value of the current moment are also used to calculate the candidate cell state, as shown in formula (11):
$$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, X_t] + b_C) \tag{11}$$
the current cell state is updated as shown in formula (12):
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t \tag{12}$$
the output value of the previous moment and the input value of the current moment are fed into the output gate, whose output is calculated as shown in formula (13):
$$o_t = \sigma(W_o \cdot [h_{t-1}, X_t] + b_o) \tag{13}$$
the output of the output gate is combined with the cell state to obtain the output value, as shown in formula (14):
$$h_t = o_t * \tanh(C_t) \tag{14}$$
in the formula: $f_t$ is the forget gate output at time $t$; $\sigma$ and $\tanh$ are activation functions; $h_{t-1}$ is the output at time $t-1$; $X_t$ is the input at time $t$; $W_f$, $W_i$, $W_C$, $W_o$ are weight coefficients; $b_f$, $b_i$, $b_C$, $b_o$ are bias parameters; $i_t$ and $\tilde{C}_t$ are the outputs of the input gate at time $t$; $C_{t-1}$ is the cell state at time $t-1$; $C_t$ is the cell state at time $t$; $h_t$ is the output at time $t$; $o_t$ is the output of the output gate at time $t$ after activation by the Sigmoid function.
The data are fed in forward order into the forward LSTM layer to obtain the forward output. The data are fed in reverse order into the backward LSTM layer, and the resulting sequence is reversed again to obtain the output of the backward LSTM layer. Finally, the forward and backward LSTM outputs are linearly superposed with certain weights to obtain the output result;
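The gate equations (9) to (14) and the bidirectional combination can be traced in a small NumPy sketch. It is a forward pass only, with random untrained weights, and is not the patent's network: the hidden size, the weight initialization and the equal superposition weights of 0.5 are assumptions, since the text only says the two outputs are superposed "with certain weights". All function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step following formulas (9) to (14)."""
    z = np.concatenate([h_prev, x_t])            # [h_{t-1}, X_t]
    f_t = sigmoid(p["W_f"] @ z + p["b_f"])       # forget gate, (9)
    i_t = sigmoid(p["W_i"] @ z + p["b_i"])       # input gate, (10)
    c_hat = np.tanh(p["W_C"] @ z + p["b_C"])     # candidate state, (11)
    c_t = f_t * c_prev + i_t * c_hat             # cell state update, (12)
    o_t = sigmoid(p["W_o"] @ z + p["b_o"])       # output gate, (13)
    h_t = o_t * np.tanh(c_t)                     # output, (14)
    return h_t, c_t

def run_lstm(seq, p, hidden):
    """Run one LSTM layer over a sequence and return all outputs h_t."""
    h, c = np.zeros(hidden), np.zeros(hidden)
    outputs = []
    for x_t in seq:
        h, c = lstm_step(x_t, h, c, p)
        outputs.append(h)
    return np.stack(outputs)

def init_params(n_in, hidden, rng):
    """Random weights and zero biases for the four gates (illustrative only)."""
    p = {}
    for gate in "fiCo":
        p[f"W_{gate}"] = rng.normal(0.0, 0.1, (hidden, hidden + n_in))
        p[f"b_{gate}"] = np.zeros(hidden)
    return p

def bilstm(seq, p_fwd, p_bwd, hidden, w_fwd=0.5, w_bwd=0.5):
    """Weighted sum of the forward pass and the re-reversed backward pass."""
    forward = run_lstm(seq, p_fwd, hidden)
    backward = run_lstm(seq[::-1], p_bwd, hidden)[::-1]
    return w_fwd * forward + w_bwd * backward

rng = np.random.default_rng(0)
seq = rng.normal(size=(6, 4))                    # 6 time steps, 4 features
out = bilstm(seq, init_params(4, 8, rng), init_params(4, 8, rng), hidden=8)
```

Many deep learning libraries concatenate the two directions instead of summing them; the weighted sum is used here only because it matches the description above.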
step 6.3, introducing a quantile regression (QR) model and fusing it with the CNN-BiLSTM model through the loss function;
Let
$\hat{Y}_t = f(X_t, \Omega)$
represent the CNN-BiLSTM point prediction model, where $X_t$ is the model input, i.e., the independent variable, $\Omega$ denotes the model parameters of the CNN-BiLSTM, $Y_t$ is the dependent variable, and $\hat{Y}_t$ is the predicted value of $Y_t$.
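The QR-CNN-BiLSTM model may be represented as
$\hat{Q}_{Y_t}(\tau \mid X_t) = f(X_t, \Omega(\tau))$
The estimate $\hat{\Omega}(\tau)$ of the model parameters $\Omega(\tau)$ at each quantile $\tau$ is obtained by minimizing the loss function
$\min_{\Omega(\tau)} \sum_{t} \rho_\tau\left(Y_t - f(X_t, \Omega(\tau))\right)$, where $\rho_\tau(u) = u\,(\tau - \mathbb{1}[u < 0])$;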
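Fusing QR with a network "in a loss function mode" is commonly done with the quantile (pinball) loss. The sketch below assumes that standard loss; the function names are hypothetical and the loss is averaged rather than summed, which does not change the minimizer.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean quantile (pinball) loss for a single quantile level tau in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        u = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * u if u >= 0 else (tau - 1.0) * u
    return total / len(y_true)

def multi_quantile_loss(y_true, preds_by_tau):
    """Average pinball loss over several quantile outputs.

    preds_by_tau maps each quantile level to its list of predictions.
    """
    losses = [pinball_loss(y_true, preds, tau) for tau, preds in preds_by_tau.items()]
    return sum(losses) / len(losses)

loss = multi_quantile_loss([1.0, 2.0], {0.1: [0.5, 1.5], 0.9: [1.5, 2.5]})
```

Because the penalty is asymmetric, minimizing it for a high quantile such as 0.9 pushes the output above most observations, which is what turns a point predictor into a set of quantile predictors.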
step 6.4, setting the parameters of the model and training the model.
The QR-CNN-BiLSTM network consists of a CNN layer, a max pooling layer, three BiLSTM layers and a fully connected layer connected in sequence;
setting the number of convolution layers as 1 layer, the number of convolution kernels as 64, the size of the convolution kernels as 4, the boundary processing mode of convolution as ' same ', the activation function as ' and the size of the pooling window as 3;
setting the number of BiLSTM layers to 3, the number of neurons to 128 and the dropout parameter to 0.2; the quantiles start at 0.05 with a step of 0.05 and an end point of 1 (exclusive), i.e., 0.05, 0.10, ..., 0.95, so the number of neurons in the fully connected layer is 19.
setting the initial learning rate to 0.01, the learning rate attenuation to 1.5, the minimum learning rate to $10^{-4}$, the maximum number of iterations to 100, the batch size to 32, and the optimizer to Adam;
inputting the training set of the similar-day data set into the constructed QR-CNN-BiLSTM model for model training;
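The count of 19 output neurons follows from the quantile grid: starting at 0.05 with a step of 0.05 and stopping before 1 gives 0.05 through 0.95. A small sketch, assuming the end point is exclusive (variable names are hypothetical):

```python
START, STEP, END = 0.05, 0.05, 1.0

# Number of grid points strictly below END; round() guards against float error.
n_quantiles = round((END - START) / STEP)
quantiles = [round(START + k * STEP, 2) for k in range(n_quantiles)]

def label_outputs(network_output):
    """Pair each output neuron of the fully connected layer with its quantile level."""
    if len(network_output) != len(quantiles):
        raise ValueError("expected one output per quantile level")
    return dict(zip(quantiles, network_output))
```

Each of the 19 outputs is then read as the predicted photovoltaic power at one quantile level.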
step 7, inputting the test set data into the QR-CNN-BiLSTM model trained in step 6 to predict the photovoltaic power interval;
step 8, performing inverse normalization on the interval prediction results so that they have physical meaning;
step 9, generating a photovoltaic probability prediction result on the test set by adopting a kernel density estimation algorithm optimized by a cross validation and grid search method based on the inverse normalized photovoltaic power interval prediction result data obtained in the step 8;
specifically: for a given future photovoltaic power prediction point, applying the QR-CNN-BiLSTM fusion model of step 7 yields a group of N samples under N conditional quantiles, i.e., the vector containing the N samples is
$[x_1, x_2, \ldots, x_N]$
Its probability density function can be obtained by the kernel density estimation (KDE) method, and the KDE of the vector is calculated as shown in formula (15):
$\hat{f}(x) = \frac{1}{NB} \sum_{i=1}^{N} K\left(\frac{x - x_i}{B}\right)$ (15)
in the formula: N is the total number of samples; $x_i$ is the i-th sample; B is the optimal bandwidth determined by the cross-validation grid search method, with B > 0; K is the kernel function. The Gaussian kernel is the kernel function K used in the present invention, as shown in formula (16):
$K(u) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{u^2}{2}\right)$ (16)
the cross validation and grid search method provided by the invention optimizes the kernel density estimation, solves the problem of difficult bandwidth selection in the kernel density estimation, and can generate a high-quality probability prediction result.
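A minimal sketch of Gaussian kernel density estimation with a grid search over the bandwidth is given below. The scoring rule (leave-one-out log-likelihood), the candidate grid and the sample values are assumptions made for illustration; the text does not specify the cross-validation scheme.

```python
import math

def gaussian_kernel(u):
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(x, samples, bandwidth):
    """Kernel density estimate at x from the quantile samples."""
    n = len(samples)
    return sum(gaussian_kernel((x - s) / bandwidth) for s in samples) / (n * bandwidth)

def loo_log_likelihood(samples, bandwidth):
    """Leave-one-out cross-validation score for one candidate bandwidth."""
    total = 0.0
    for i, x in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        # Floor the density so log() never receives zero.
        total += math.log(max(kde(x, rest, bandwidth), 1e-300))
    return total

def best_bandwidth(samples, grid):
    """Grid search: keep the bandwidth with the highest cross-validation score."""
    return max(grid, key=lambda b: loo_log_likelihood(samples, b))

# Hypothetical quantile predictions (kW) for one future time point.
samples = [1.8, 2.0, 2.1, 2.3, 2.4, 2.6, 2.9]
grid = [0.05 * k for k in range(1, 21)]
b = best_bandwidth(samples, grid)
density = kde(2.3, samples, b)
```

A small bandwidth follows the samples closely but produces a spiky density, while a large one oversmooths; the cross-validated score is one way to balance the two.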
The photovoltaic power point prediction, interval prediction and probability prediction results are then evaluated. The root mean square error $e_{RMSE}$ and the mean absolute percentage error $e_{MAPE}$ are used to evaluate the point prediction results; the comprehensive interval evaluation index $I_{WC}$ is used to evaluate the interval prediction results; the continuous ranked probability score $P_{CRPS}$ is used to evaluate the probability prediction results; as shown in the following formulas:
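$e_{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (P_{ri} - P_{pi})^2}$
$e_{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{P_{ri} - P_{pi}}{P_{ri}} \right| \times 100\%$
$I_{WC} = I_{PINAW} / I_{PICP}$
$P_{CRPS} = \frac{1}{N} \sum_{i=1}^{N} \int_{-\infty}^{+\infty} \left[ F(P_{pi}) - H(P_{pi} - P_{ri}) \right]^2 \, dP_{pi}$
wherein:
$I_{PICP} = \frac{1}{N} \sum_{n=1}^{N} S_n$
$I_{PINAW} = \frac{1}{NE} \sum_{i=1}^{N} (P_{upi} - P_{downi})$
$F(P_{pi}) = \int_{-\infty}^{P_{pi}} p(x) \, dx$
$H(P_{pi} - P_{ri}) = 0$ if $P_{pi} < P_{ri}$, and $1$ if $P_{pi} \geq P_{ri}$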
in the formula: p ri Representing observed values of power, P pi And expressing the power predicted value, wherein N is the total point number of the predicted future photovoltaic power. S n Is a logical value, S when the observed value falls within the prediction interval n Get 1, otherwise S n Taking 0; e is the difference between the maximum value and the minimum value of the observed value; p is upi And P downi Respectively an upper bound and a lower bound of the prediction interval; p (x) represents a probability density function; f (P) pi ) Represents P pi The cumulative distribution function of; h (P) pi -P ri ) Is a step function.
The photovoltaic power interval probability prediction method based on the deep learning fusion model screens the meteorological factors with a correlation coefficient method, reducing the model prediction error caused by excessive irrelevant features; it then selects the highly correlated meteorological factor variable as the clustering variable, constructs its statistical characteristics, and clusters with the FCM clustering algorithm to obtain similar-day data sets, laying a foundation for further improving prediction accuracy; the QR-CNN-BiLSTM model fuses two deep learning models, CNN and BiLSTM, and, compared with a traditional single deep learning prediction model, has higher prediction accuracy and can generate higher-quality interval prediction results; the cross-validation and grid search method optimizes the kernel density estimation, solves the difficulty of bandwidth selection in kernel density estimation, and can generate high-quality probability prediction results.
Examples
Photovoltaic power data from 2019-2020 of a photovoltaic array from one manufacturer at the Alice Springs site of the Desert Knowledge Australia Solar Centre (DKASC), together with the data of the 4 meteorological factors obtained by screening, are used as simulation data. The time resolution of the data set is 5 min. Because the photovoltaic array does not generate electricity at night, the night-time data with a power of 0 are removed, and the data from 6:00 to 19:30 each day are retained as the example analysis data.
Finally, the data set is divided into three categories by the fuzzy C-means clustering algorithm: similar-day data sets for sunny days, sunny-to-cloudy days and rainy days. Each similar-day data set is divided into a training set and a test set with a training set proportion of 0.7: the test set consists of the 30 d of similar-day data closest in time, and the training set consists of the 70 d of similar-day data in the similar-day data set that are closest to the test set.
To illustrate the advantages of the proposed model in short-term photovoltaic power interval and probability prediction, the proposed QR-CNN-BiLSTM model is compared with the QR-LSTM and QR-BiLSTM models under the 3 weather types, and one day is randomly selected from each weather type for visual analysis. Figs. 3-11 show the point prediction results of each model and the interval prediction results at the 95% confidence level under the sunny, sunny-to-cloudy and rainy weather types; Figs. 12-14 show the probability prediction results of the QR-CNN-BiLSTM model combined with kernel density estimation under the same three weather types (9 probability prediction results are selected at equal spacing from the 164 points); and the evaluation indices are compared in Tables 2, 3 and 4.
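The split can be sketched as a purely chronological cut. This is one reading of the description, assuming that "closest in time" means the most recent days; the function name and the integer day indices are hypothetical.

```python
def split_similar_days(days, n_test=30, n_train=70):
    """Chronological split of one similar-day data set.

    days: sortable day identifiers (for example ISO date strings) of one weather type.
    The most recent n_test days form the test set; the n_train days immediately
    before them form the training set.
    """
    ordered = sorted(days)
    if len(ordered) < n_test + n_train:
        raise ValueError("not enough similar days for the requested split")
    test = ordered[-n_test:]
    train = ordered[-(n_test + n_train):-n_test]
    return train, test

# 120 hypothetical similar days, identified by their index.
train_days, test_days = split_similar_days(list(range(120)))
```

Keeping the test days strictly later than the training days avoids leaking future information into training, which matters for any forecasting evaluation.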
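Table 2. Evaluation indices of each model on sunny days (table provided as image BDA0003743373950000181 in the original)
Table 3. Evaluation indices of each model on sunny-to-cloudy days (table provided as image BDA0003743373950000182 in the original)
Table 4. Evaluation indices of each model on rainy days (table provided as image BDA0003743373950000183 in the original)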
From Table 2 it can be seen that on sunny days the prediction interval coverage of every model is close to 100%, while the prediction interval width of the QR-CNN-BiLSTM model is clearly lower than that of the other models: its PINAW index is 72.77% lower than that of the QR-LSTM model and 66.26% lower than that of the QR-BiLSTM model. The comprehensive interval evaluation index $I_{WC}$ of the QR-CNN-BiLSTM model is also the smallest, 72.62% lower than that of QR-LSTM and 66.08% lower than that of QR-BiLSTM, so its interval prediction performance is the best. Its probability evaluation index CRPS is the smallest, so its probability prediction performance is the best, and the deterministic evaluation indices show that its point prediction performance is also better.
From Table 3 it can be seen that on sunny-to-cloudy days, with the PICP of every model's prediction interval above 95%, the PINAW of the QR-CNN-BiLSTM prediction interval is markedly reduced, 28.34% lower than that of QR-LSTM and 25.62% lower than that of QR-BiLSTM. The comprehensive interval evaluation index $I_{WC}$ of the QR-CNN-BiLSTM model is also the smallest, 28.80% lower than that of QR-LSTM and 26.09% lower than that of QR-BiLSTM, so its interval prediction performance is the best. Its probability evaluation index CRPS is the smallest, so its probability prediction performance is the best, and the deterministic evaluation indices show that its point prediction performance is also better.
From Table 4 it can be seen that on rainy days, with the PICP of every model's prediction interval above 95%, the PINAW of the QR-CNN-BiLSTM prediction interval is markedly reduced, 7.98% lower than that of QR-LSTM and 4.47% lower than that of QR-BiLSTM. Its comprehensive interval evaluation index $I_{WC}$ is also the smallest, 6.81% lower than that of QR-LSTM and 4.08% lower than that of QR-BiLSTM, so the uncertainty of the photovoltaic power interval prediction is clearly reduced and the interval prediction performance of QR-CNN-BiLSTM is the best. Its probability evaluation index CRPS is the smallest, so its probability prediction performance is also the best, and the deterministic evaluation indices show that its point prediction performance is better. The above analysis shows that the QR-CNN-BiLSTM model is superior in point prediction, interval prediction and probability prediction.

Claims (5)

1. The photovoltaic power interval probability prediction method based on the deep learning fusion model is characterized by comprising the following steps:
step 1, acquiring preprocessed meteorological element data and historical photovoltaic power data, performing variable correlation analysis on the photovoltaic power and meteorological factors, and determining input variables of a prediction model;
step 2, selecting clustering variables and constructing statistical characteristics of the clustering variables;
step 3, according to the clustering variables and the statistical characteristics thereof selected in the step 2, performing similar daily clustering on the photovoltaic historical data by adopting a fuzzy C-means clustering algorithm to obtain a photovoltaic similar daily data set;
step 4, carrying out normalization processing on the photovoltaic similar day data set by adopting a min-max normalization method;
step 5, dividing the similar day data sets under each weather condition into a training set and a testing set;
step 6, constructing a QR-CNN-BiLSTM interval prediction model, setting the parameters of the CNN and BiLSTM models, and setting the model training parameters; inputting the training set of the similar-day data set into the model for training;
step 7, inputting the test set data into the QR-CNN-BiLSTM interval prediction model trained in step 6 to predict the photovoltaic power interval;
step 8, performing inverse normalization on the interval prediction results so that they have physical meaning;
step 9, generating the photovoltaic probability prediction results on the test set using a kernel density estimation algorithm optimized by the cross-validation and grid search method.
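Steps 4 and 8 are inverse operations of each other. A minimal sketch of min-max normalization and its inverse, with hypothetical function names and toy power values:

```python
def minmax_fit(values):
    """Return the (minimum, maximum) used for scaling; taken from the training data."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("constant series cannot be min-max scaled")
    return lo, hi

def minmax_transform(values, lo, hi):
    """Step 4: map values to [0, 1]."""
    return [(v - lo) / (hi - lo) for v in values]

def minmax_inverse(scaled, lo, hi):
    """Step 8: map normalized predictions back to physical units."""
    return [s * (hi - lo) + lo for s in scaled]

power_kw = [0.0, 2.5, 5.0]                              # hypothetical history
lo, hi = minmax_fit(power_kw)
scaled = minmax_transform(power_kw, lo, hi)
interval_kw = minmax_inverse([0.2, 0.8], lo, hi)        # normalized interval bounds
```

The same minimum and maximum must be reused for the inverse step; otherwise the restored interval bounds would not be in the original power units.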
2. The photovoltaic power interval probability prediction method based on the deep learning fusion model according to claim 1, wherein in the step 1, specifically:
step 1.1, selecting preprocessed meteorological element data and preprocessed photovoltaic power generation power data;
the time resolution of the meteorological element variable and the photovoltaic power variable is 5min, and wind speed, relative humidity, ambient temperature, total horizontal radiation, diffused horizontal radiation, rainfall, wind direction, total oblique radiation and diffused oblique radiation are selected as original meteorological element data variables;
step 1.2, measuring the degree of correlation among the plurality of meteorological element variables using the Kendall rank correlation coefficient R;
step 1.3, selecting the meteorological factors whose Kendall rank correlation coefficient R with the photovoltaic power has an absolute value of not less than 0.5 as the inputs of the prediction model.
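The screening in steps 1.2 and 1.3 can be sketched with a direct pairwise computation of the Kendall coefficient. This version counts tied pairs as neither concordant nor discordant (the tau-a variant), which is an assumption; the factor names and values are hypothetical.

```python
def kendall_tau(x, y):
    """Kendall rank correlation (tau-a): (concordant - discordant) / number of pairs."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

def screen_factors(power, factors, threshold=0.5):
    """Keep the factors whose |R| with the photovoltaic power is at least the threshold."""
    return [name for name, series in factors.items()
            if abs(kendall_tau(series, power)) >= threshold]

power = [1.0, 2.0, 3.0, 4.0, 5.0]
factors = {
    "irradiance": [10.0, 20.0, 30.0, 40.0, 50.0],
    "humidity": [5.0, 4.0, 3.0, 2.0, 1.0],
    "wind_speed": [3.0, 1.0, 4.0, 2.0, 5.0],
}
selected = screen_factors(power, factors)
```

Using the absolute value keeps strongly negative relationships (humidity in this toy data) as well as strongly positive ones.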
3. The photovoltaic power interval probability prediction method based on the deep learning fusion model according to claim 2, wherein in the step 2, specifically:
step 2.1, selecting the meteorological factor variable with the highest Kendall rank correlation coefficient with the photovoltaic power as the clustering variable;
step 2.2, selecting the average value, the standard deviation, the maximum value, the peak-to-valley number, the variation coefficient, the kurtosis and the skewness of the clustering variable as statistical characteristics.
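The seven daily statistics can be computed as in the sketch below. Several choices here are assumptions: the "peak-to-valley" feature is interpreted as maximum minus minimum, population moments are used, and kurtosis is reported without subtracting 3. The feature names are hypothetical.

```python
import statistics

def daily_features(values):
    """Seven statistical characteristics of one day's clustering variable.

    Assumes the series is not constant and its mean is non-zero.
    """
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)                    # population standard deviation
    m3 = sum((v - mean) ** 3 for v in values) / n      # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n      # fourth central moment
    return {
        "mean": mean,
        "std": std,
        "max": max(values),
        "peak_to_valley": max(values) - min(values),   # assumed interpretation
        "cv": std / mean,                              # coefficient of variation
        "kurtosis": m4 / std ** 4,                     # non-excess kurtosis
        "skewness": m3 / std ** 3,
    }

features = daily_features([1.0, 2.0, 3.0, 4.0, 5.0])
```

One such feature vector per day is what the clustering step then groups into similar days.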
4. The photovoltaic power interval probability prediction method based on the deep learning fusion model according to claim 1, wherein in the step 3, specifically:
step 3.1, calculating numerical values of 7 statistical characteristics of the clustering variables on each day according to the statistical characteristics of the clustering variables constructed in the step 2;
step 3.2, determining the number c of data clustering categories, initializing the clustering centers $V_i$, setting the fuzzification parameter m, initializing the membership degree matrix $U^{(0)}$, and setting the termination criterion $\varepsilon$ of the algorithm;
step 3.3, calculating all clustering centers of the t-th iteration according to formula (5) to obtain the clustering center matrix:
$V_j = \frac{\sum_{i=1}^{n} u_{ij}^{m} x_i}{\sum_{i=1}^{n} u_{ij}^{m}}$ (5)
in the formula: $u_{ij}$ is the membership degree of the i-th sample to the j-th class; $x_i$ is a sample point; m is the membership factor; n is the number of samples; t is the number of iterations; c is the number of clusters;
step 3.4, updating the membership degree matrix $U^{(t)}$, calculated as shown in formula (6):
$u_{ij} = \left[ \sum_{k=1}^{c} \left( \frac{d_{ij}}{d_{ik}} \right)^{\frac{2}{m-1}} \right]^{-1}$ (6)
in the formula: $u_{ij}$ is the membership degree of the i-th sample to the j-th class; $x_i$ is a sample point; m is the membership factor; n is the number of samples; t is the number of iterations; c is the number of clusters; $d_{ij}$ is the distance from the i-th sample to the j-th class center;
step 3.5, calculating $\|U^{(t)} - U^{(t-1)}\|$ and checking whether the iteration stop condition $\|U^{(t)} - U^{(t-1)}\| < \varepsilon$ is met; if the condition is met, stopping the iteration; otherwise, repeating step 3.3 and step 3.4 until the condition is met, finally obtaining the similar-day data sets under the various weather conditions.
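Steps 3.2 to 3.5 can be sketched in a few lines. The sketch assumes the standard fuzzy C-means update rules, Euclidean distance, a deterministic initial membership matrix and the maximum absolute element change as the matrix norm; none of these details is fixed by the claim, and the function name is hypothetical.

```python
import math

def fcm(points, c, m=2.0, eps=1e-5, max_iter=100):
    """Fuzzy C-means on a list of equal-length tuples; returns (centers, memberships)."""
    n, dim = len(points), len(points[0])
    # Step 3.2: initial memberships; sample i leans towards cluster i % c.
    u = [[0.8 if j == i % c else 0.2 / (c - 1) for j in range(c)] for i in range(n)]
    centers = []
    for _ in range(max_iter):
        # Step 3.3: cluster centers as membership-weighted means.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(tuple(sum(w[i] * points[i][k] for i in range(n)) / total
                                 for k in range(dim)))
        # Step 3.4: membership update from the distances to every center.
        new_u = []
        for i in range(n):
            d = [math.dist(points[i], centers[j]) for j in range(c)]
            if min(d) == 0.0:
                # A sample lying exactly on a center belongs fully to it.
                row = [1.0 if j == d.index(0.0) else 0.0 for j in range(c)]
            else:
                row = [1.0 / sum((d[j] / d[k]) ** (2.0 / (m - 1.0)) for k in range(c))
                       for j in range(c)]
            new_u.append(row)
        # Step 3.5: stop once the membership matrix barely changes.
        delta = max(abs(new_u[i][j] - u[i][j]) for i in range(n) for j in range(c))
        u = new_u
        if delta < eps:
            break
    return centers, u

# Two well-separated toy groups of daily feature vectors.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, memberships = fcm(points, c=2)
labels = [row.index(max(row)) for row in memberships]
```

Each day receives a membership degree for every cluster; assigning the day to the cluster with the largest membership gives the similar-day data sets.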
5. The photovoltaic power interval probability prediction method based on the deep learning fusion model according to claim 1, wherein in the step 6, specifically:
step 6.1, constructing feature maps from the normalized similar-day data set using a sliding window, inputting the feature maps into the CNN, and extracting feature vectors representing the dynamic change of photovoltaic power with its convolution layer and pooling layer;
step 6.2, converting the output feature vectors into a time series and inputting it into the BiLSTM network;
step 6.3, introducing a quantile regression (QR) model and fusing it with the CNN-BiLSTM model through the loss function;
let
$\hat{Y}_t = f(X_t, \Omega)$
represent the CNN-BiLSTM point prediction model, wherein $X_t$ is the model input, i.e., the independent variable, $\Omega$ denotes the model parameters of the CNN-BiLSTM, $Y_t$ is the dependent variable, and $\hat{Y}_t$ is the predicted value of $Y_t$;
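the QR-CNN-BiLSTM model may be represented as
$\hat{Q}_{Y_t}(\tau \mid X_t) = f(X_t, \Omega(\tau))$
the estimate $\hat{\Omega}(\tau)$ of the model parameters $\Omega(\tau)$ at each quantile $\tau$ is obtained by minimizing the loss function
$\min_{\Omega(\tau)} \sum_{t} \rho_\tau\left(Y_t - f(X_t, \Omega(\tau))\right)$, where $\rho_\tau(u) = u\,(\tau - \mathbb{1}[u < 0])$;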
step 6.4, setting the parameters of the model and training the model;
the QR-CNN-BiLSTM network consists of a CNN layer, a max pooling layer, three BiLSTM layers and a fully connected layer connected in sequence;
setting the convolution layer number to be 1, the convolution kernel number to be 64, the convolution kernel size to be 4, the convolution boundary processing mode to be 'same', the activation function to be 3 and the pooling window size to be 3;
setting the number of BiLSTM layers to 3, the number of neurons to 128 and the dropout parameter to 0.2; setting the quantiles to start at 0.05 with a step of 0.05 and an end point of 1 (exclusive), i.e., 0.05, 0.10, ..., 0.95, so that the number of neurons in the fully connected layer is 19;
setting the initial learning rate to 0.01, the learning rate attenuation to 1.5, the minimum learning rate to $10^{-4}$, the maximum number of iterations to 100, the batch size to 32, and the optimizer to Adam;
inputting the training set of the similar-day data set into the constructed QR-CNN-BiLSTM model for model training.
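The sliding-window construction of step 6.1 can be sketched as follows. The window length and the choice to predict the power at the time step right after each window are assumptions, and the function name is hypothetical.

```python
def sliding_windows(rows, targets, window):
    """Build (feature map, target) pairs with a sliding window.

    rows: per-time-step feature lists (normalized meteorological factors and power).
    targets: normalized photovoltaic power per time step.
    Each feature map holds `window` consecutive rows and is paired with the
    power at the time step that immediately follows the window.
    """
    samples = []
    for end in range(window, len(rows)):
        samples.append((rows[end - window:end], targets[end]))
    return samples

rows = [[0.0], [0.1], [0.2], [0.3], [0.4]]
targets = [0.0, 0.1, 0.2, 0.3, 0.4]
samples = sliding_windows(rows, targets, window=3)
```

Each feature map is a small two-dimensional array (time steps by variables), which is the shape a one-dimensional convolution layer expects as input.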
CN202210823439.9A 2022-07-13 2022-07-13 Photovoltaic power interval probability prediction method based on deep learning fusion model Pending CN115115125A (en)


Publications (1)

Publication Number Publication Date
CN115115125A true CN115115125A (en) 2022-09-27

Family

ID=83332242


Country Status (1)

Country Link
CN (1) CN115115125A (en)



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination