CN115511170A - Multi-photovoltaic power station power prediction error modeling method - Google Patents

Info

Publication number
CN115511170A
Authority
CN
China
Prior art keywords
photovoltaic power
power
power station
photovoltaic
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211152635.4A
Other languages
Chinese (zh)
Inventor
王晨旭
马骏超
彭琰
陆承宇
王松
吴俊�
邓晖
章枫
程颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electric Power Research Institute of State Grid Zhejiang Electric Power Co Ltd
Original Assignee
Electric Power Research Institute of State Grid Zhejiang Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electric Power Research Institute of State Grid Zhejiang Electric Power Co Ltd filed Critical Electric Power Research Institute of State Grid Zhejiang Electric Power Co Ltd
Priority to CN202211152635.4A priority Critical patent/CN115511170A/en
Publication of CN115511170A publication Critical patent/CN115511170A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00 Circuit arrangements for ac mains or ac distribution networks
    • H02J3/004 Generation forecast, e.g. methods or systems for forecasting future energy generation

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Public Health (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Water Supply & Treatment (AREA)
  • Power Engineering (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a power prediction error modeling method for multiple photovoltaic power stations. The technical scheme is as follows: first, the meteorological and power historical data set of multiple photovoltaic power stations is divided into data sets for different weather types by K-means clustering; then, for the multi-station data set under each weather type, a correlation model of predicted power and measured power is constructed with a D-vine Copula function, which models the correlation of multidimensional variables accurately; finally, given the known power prediction values of the photovoltaic power stations, the D-vine correlation model is used to model the probability distribution of the power prediction error under different weather conditions, which improves the accuracy with which photovoltaic power prediction uncertainty is quantified.

Description

Multi-photovoltaic power station power prediction error modeling method
Technical Field
The invention belongs to the technical field of new energy power generation control, and particularly relates to a Vine-Copula function-based multi-photovoltaic power station power prediction error modeling method.
Background
With the continued development of new power systems dominated by new energy, integrating a high proportion of photovoltaic generation into the power system has become a clear trend in energy development. Photovoltaic generation depends on meteorological factors such as irradiance, temperature and humidity, so its output fluctuates strongly and randomly, which significantly affects peak regulation, frequency regulation and reserve scheduling of the power grid. As large-scale photovoltaic generation is connected to the grid, the risk that its output uncertainty poses to the regulation and operation of the power system becomes increasingly prominent. High-precision prediction of photovoltaic power provides important information for the planning, operation, stability analysis and market trading of the new power system, and is important for improving the overall energy efficiency of the system and promoting the accommodation of new energy. Photovoltaic power prediction has already been studied extensively. The main approach uses machine learning and deep learning models to describe the relationship between meteorological factors and generated power, so that the power prediction value of a station is obtained from a power prediction model given meteorological forecast data. However, because photovoltaic output is strongly affected by weather fluctuations, the predicted power tends to deviate from the actual output. A photovoltaic power prediction error model is therefore needed to support quantifying the influence of prediction uncertainty on system operation and reserve.
At present, power prediction error modeling for photovoltaic power stations, in China and abroad, mainly targets a single station. However, the outputs of geographically adjacent photovoltaic power stations are strongly correlated, and their power prediction errors show similar characteristics. Because error modeling for multiple stations involves correlation modeling of multidimensional variables, a method designed for a single station is difficult to transfer directly to the multi-station setting. N. Zhang et al. (Modeling Conditional Forecast Error for Wind Power in Generation Scheduling [J]. IEEE Transactions on Power Systems, 2014, 29(3): 1316-1324) modeled wind power prediction errors with a multidimensional Gaussian Copula function, and their results show that the wind power prediction value has a great influence on the probability distribution of the prediction error. Zhao Weijia et al. (A method for estimating the conditional prediction error probability distribution of photovoltaic power generation output [J]. Automation of Electric Power Systems, 2015, 39(16): 8-15) modeled photovoltaic power prediction errors under different weather types with a bivariate Gaussian Copula function, and their results show that the weather type has a great influence on the probability distribution of the photovoltaic power prediction error. However, that study only used a Gaussian Copula function describing the correlation of two variables, which cannot accurately describe the asymmetric correlation structure among multidimensional variables.
Zhao Shuqiang et al. (A day-ahead photovoltaic output prediction error distribution model based on numerical characteristic clustering [J]. Automation of Electric Power Systems, 2019, 43(13): 36-45) classified the prediction errors of a single photovoltaic station by fuzzy C-means clustering and established a Gaussian mixture model suited to describing the photovoltaic prediction error distribution. However, that work does not consider the influence of the output of geographically close photovoltaic power stations, and therefore cannot make full use of the available data to obtain a more accurate probability distribution model. Zhang Ying et al. (Wind power prediction error analysis based on clustering and nonparametric kernel density estimation [J]. Acta Energiae Solaris Sinica, 2019, 40(12): 3594-3604) grouped prediction error data with similar characteristics into one class by clustering and then obtained the probability distribution of each class by nonparametric kernel density estimation. However, that approach is only suitable for analyzing error distribution characteristics when both predicted and measured historical power data are known; it is difficult to use it to construct the probability distribution of new energy output over a future period from a known power prediction value.
Therefore, existing new energy power prediction error modeling methods can hardly meet the requirements of prediction error modeling for multiple photovoltaic power stations under different weather types, and a new modeling method is needed.
Disclosure of Invention
To address the difficulty of accurately constructing a power prediction error model for multiple photovoltaic power stations, the invention provides a multi-photovoltaic power station power prediction error modeling method based on a Vine-Copula function. Based on Vine-Copula theory, the method constructs a correlation model between the power prediction values and the measured power values of multiple photovoltaic power stations under different weather types, so that the prediction error can be modeled accurately when the power prediction value is known.
To this end, the technical scheme adopted by the invention is a multi-photovoltaic power station power prediction error modeling method comprising the following model construction steps:
1) According to historical statistical data of a plurality of photovoltaic power stations in an area, a historical data set comprising the meteorological factors, photovoltaic predicted power and photovoltaic measured power of the photovoltaic power stations is constructed;
2) From the historical data sets of step 1), taking solar irradiance, air temperature and air pressure as the basis for division, the predicted power and measured power data sets of the photovoltaic power stations under typical weather types are obtained by K-means clustering;
3) From the predicted power and measured power data sets under the typical weather types of step 2), the power prediction error of the photovoltaic power station in each time period is calculated;
4) A correlation model between the predicted power and the measured power of the photovoltaic power stations under different weather types is constructed with a Vine-Copula function, as follows:
4-a) a D-vine correlation model of the predicted power and the measured power of the photovoltaic power stations under different weather types is established using the D-vine Copula structure;
4-b) the bivariate Copula types and parameters in the D-vine correlation model are selected one by one with the Euclidean distance test, according to the photovoltaic power station output data sets under different weather types.
Further, in step 1), for photovoltaic power stations equipped with meteorological and electrical measurement devices, a historical data set containing meteorological factors and a historical data set containing photovoltaic power are constructed:
$$W_{i,j} = [I_{i,j}, T_{i,j}, V_{i,j}]$$
$$D_{i,j} = [P^{f}_{i,j}, P^{r}_{i,j}]$$
where subscript $i$ is the photovoltaic power station serial number; subscript $j$ is the historical data sequence number; $W_{i,j}$ is the $j$-th group of meteorological historical data of the $i$-th photovoltaic power station; $I_{i,j}$, $T_{i,j}$ and $V_{i,j}$ are the meteorological factors used for power prediction of the photovoltaic power station, namely solar irradiance, temperature and air pressure; $D_{i,j}$ is the $j$-th group of photovoltaic power historical data of the $i$-th photovoltaic power station; $P^{f}_{i,j}$ is the predicted power value of the $i$-th photovoltaic power station; and $P^{r}_{i,j}$ is the measured power value of the $i$-th photovoltaic power station.
Further, in step 2), the meteorological factor historical data set of the photovoltaic power station is divided into different weather types by K-means clustering. The K-means clustering method uses the Euclidean distance between samples to describe their similarity. Taking the $i$-th photovoltaic power station as an example, the Euclidean distance between data samples is:
$$\Delta w_{j,k} = \lVert W_{i,j} - W_{i,k} \rVert$$
where $\Delta w_{j,k}$ is the Euclidean distance between the $j$-th sample and the $k$-th sample, and $\lVert \cdot \rVert$ denotes the 2-norm.
Further, the K-means clustering method is a typical unsupervised learning method, and its objective function for dividing a data set is:
$$\min \sum_{c=1}^{K} \sum_{W_{i,j} \in \text{cluster } c} \lVert W_{i,j} - W_{i,c} \rVert^{2}$$
where $K$ is the number of clusters, which here is the number of weather types to be divided, and $W_{i,c}$ is the cluster center of the $c$-th cluster, determined by the expected value of the sample points belonging to that cluster. In the K-means clustering process, the number of clusters $K$ is determined first, and $K$ sample points are randomly selected as the initial cluster centers. Minimizing the Euclidean distance from the sample points to their cluster centers is taken as the optimization objective, and the cluster memberships and cluster centers are updated repeatedly until convergence, which yields the division into different weather types.
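The clustering loop described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the patent's implementation: the function name `kmeans`, the optional `init` argument, the synthetic samples, and the assumption that the three features are already scaled to comparable ranges are all introduced here for the example.

```python
import math
import random


def kmeans(samples, k, init=None, max_iter=100, seed=0):
    """Cluster equal-length feature tuples into k groups.

    Returns (centers, labels). `init` optionally fixes the starting centers;
    otherwise k samples are drawn at random, as the text describes.
    """
    rng = random.Random(seed)
    centers = list(init) if init is not None else rng.sample(samples, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each sample joins its nearest center (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: math.dist(s, centers[c])) for s in samples
        ]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels


# Hypothetical samples W_{i,j} = (irradiance, temperature, pressure), assumed to be
# scaled to [0, 1] already so that no single feature dominates the distance.
_rng = random.Random(1)


def blob(center, n=30, spread=0.05):
    return [tuple(x + _rng.uniform(-spread, spread) for x in center) for _ in range(n)]


samples = blob((0.9, 0.6, 0.5)) + blob((0.5, 0.5, 0.5)) + blob((0.1, 0.4, 0.5))
# One starting center per blob keeps the demo deterministic; omit `init` for random starts.
centers, labels = kmeans(samples, k=3, init=[samples[0], samples[30], samples[60]])
```

With random starts, K-means can settle in a poor local optimum, so in practice several restarts are usually compared by objective value; that refinement is outside what the text specifies.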
Further, K-means clustering divides the meteorological historical data set of the photovoltaic power station into sunny, cloudy, and rainy and snowy weather. The corresponding data sets are denoted $W_i^1$, $W_i^2$ and $W_i^3$, and the sample points in each data set are denoted $W^1_{i,j}$, $W^2_{i,j}$ and $W^3_{i,j}$, where superscripts 1, 2 and 3 correspond to sunny, cloudy, and rainy and snowy weather.
Further, in step 2), after K-means clustering has divided the meteorological historical data set of the photovoltaic power station into meteorological data sets for sunny, cloudy, and rainy and snowy weather, the photovoltaic power data recorded at the same times are divided according to the time sequence labels of the meteorological data. This yields photovoltaic power data sets for sunny, cloudy, and rainy and snowy weather. The corresponding data sets are denoted $D_i^1$, $D_i^2$ and $D_i^3$, and the sample points in each data set are denoted $D^1_{i,j}$, $D^2_{i,j}$ and $D^3_{i,j}$.
Further, in step 3), after the photovoltaic power data sets under different weather types are obtained, the prediction error is calculated from the predicted and measured photovoltaic power data in each data set:
$$e_{i,j} = P^{r}_{i,j} - P^{f}_{i,j}$$
where $e_{i,j}$ is the prediction error of the $j$-th sample point of the $i$-th photovoltaic power station; $P^{f}_{i,j}$ is the predicted power value of the $i$-th photovoltaic power station; and $P^{r}_{i,j}$ is the measured power value of the $i$-th photovoltaic power station.
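As a small illustration of steps 2) and 3) together, the sketch below groups power samples by the weather label of the same time index and computes the error as measured minus predicted power. The function name `errors_by_weather` and the per-unit numbers are hypothetical, chosen only for the example.

```python
from collections import defaultdict


def errors_by_weather(labels, predicted, measured):
    """Group e_j = measured_j - predicted_j by the weather label of sample j."""
    if not (len(labels) == len(predicted) == len(measured)):
        raise ValueError("the three series must share one time index")
    groups = defaultdict(list)
    for label, p_f, p_r in zip(labels, predicted, measured):
        groups[label].append(p_r - p_f)
    return dict(groups)


# Hypothetical per-unit data for one station; labels 1, 2, 3 stand for
# sunny, cloudy, and rainy and snowy weather.
labels = [1, 1, 2, 3, 2]
predicted = [0.80, 0.60, 0.40, 0.10, 0.50]
measured = [0.75, 0.65, 0.30, 0.15, 0.45]
errors = errors_by_weather(labels, predicted, measured)
```

Here the labels would come from the clustering of the meteorological samples, so the power data inherit their weather type through the shared time index.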
Further, in step 4-a), taking the data sets of two photovoltaic power stations in sunny weather as an example, assume that the predicted and measured power values of the first photovoltaic power station are $P_1^f$ and $P_1^r$, and that the predicted and measured power values of the second photovoltaic power station are $P_2^f$ and $P_2^r$. The correlation model between the variables $P_1^r$, $P_1^f$ and $P_2^f$ is then expressed as:
$$F(P_1^r, P_1^f, P_2^f) = C(F(P_1^r), F(P_1^f), F(P_2^f))$$
where $F(\cdot)$ denotes the cumulative probability distribution function of a variable and $C(\cdot)$ denotes the Copula function describing the correlation of multidimensional variables. The probability density function describing the multivariate correlation follows:
$$f(P_1^r, P_1^f, P_2^f) = c(F(P_1^r), F(P_1^f), F(P_2^f)) \cdot f(P_1^r) \cdot f(P_1^f) \cdot f(P_2^f)$$
where $f(\cdot)$ denotes the probability density function of a variable and $c(\cdot)$ denotes the Copula density function describing the correlation of multidimensional variables;
For convenience of illustration, the variables $x_1$, $x_2$ and $x_3$ are used in place of $P_1^r$, $P_1^f$ and $P_2^f$, and their joint probability density function is further written as:
$$f(x_1, x_2, x_3) = f(x_3) \cdot f(x_2 \mid x_3) \cdot f(x_1 \mid x_2, x_3)$$
where $f(x_2 \mid x_3)$ is further expressed as:
$$f(x_2 \mid x_3) = \frac{f(x_2, x_3)}{f(x_3)} = c_{23}(F(x_2), F(x_3)) \cdot f(x_2)$$
Similarly, $f(x_1 \mid x_2, x_3)$ is further expressed as:
$$f(x_1 \mid x_2, x_3) = c_{13|2}(F(x_1 \mid x_2), F(x_3 \mid x_2)) \cdot c_{12}(F(x_1), F(x_2)) \cdot f(x_1)$$
According to the above expressions, the joint probability density function of $x_1$, $x_2$ and $x_3$ is expressed as:
$$f(x_1, x_2, x_3) = c_{13|2}(F(x_1 \mid x_2), F(x_3 \mid x_2)) \cdot c_{12}(F(x_1), F(x_2)) \cdot c_{23}(F(x_2), F(x_3)) \cdot f(x_1) \cdot f(x_2) \cdot f(x_3)$$
The key to constructing the D-vine correlation model is to select a suitable Copula function type for each of $c_{12}(\cdot)$, $c_{23}(\cdot)$ and $c_{13|2}(\cdot)$ and to determine its parameters.
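To make the three-variable D-vine decomposition concrete, the sketch below evaluates the product of the three pair-copula densities and the marginal densities. It assumes, purely for illustration, that all three pair-copulas are Gaussian (one of the candidate families named in the text) and that the marginals are uniform on [0, 1] in per-unit power; the function names and parameter values are hypothetical.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used by the Gaussian pair-copulas


def gauss_copula_density(u, v, rho):
    """Density c(u, v) of a bivariate Gaussian copula with correlation rho."""
    a, b = _STD.inv_cdf(u), _STD.inv_cdf(v)
    q = (rho * rho * (a * a + b * b) - 2.0 * rho * a * b) / (2.0 * (1.0 - rho * rho))
    return math.exp(-q) / math.sqrt(1.0 - rho * rho)


def gauss_h(u, v, rho):
    """Conditional CDF F(u | v) implied by the same Gaussian copula."""
    a, b = _STD.inv_cdf(u), _STD.inv_cdf(v)
    return _STD.cdf((a - rho * b) / math.sqrt(1.0 - rho * rho))


def dvine3_copula_density(u1, u2, u3, rho12, rho23, rho13_2):
    """c_12 * c_23 * c_13|2 at u_k = F(x_k), each strictly inside (0, 1)."""
    c12 = gauss_copula_density(u1, u2, rho12)
    c23 = gauss_copula_density(u2, u3, rho23)
    # The conditional pair-copula takes F(x1 | x2) and F(x3 | x2) as arguments.
    c13_2 = gauss_copula_density(gauss_h(u1, u2, rho12), gauss_h(u3, u2, rho23), rho13_2)
    return c12 * c23 * c13_2


def dvine3_joint_density(x, marginal_cdfs, marginal_pdfs, rhos):
    """f(x1, x2, x3) = c_13|2 * c_12 * c_23 * f(x1) * f(x2) * f(x3)."""
    u = [cdf(xk) for cdf, xk in zip(marginal_cdfs, x)]
    f = [pdf(xk) for pdf, xk in zip(marginal_pdfs, x)]
    return dvine3_copula_density(*u, *rhos) * f[0] * f[1] * f[2]


# Hypothetical setup: uniform marginals on [0, 1] p.u. and made-up correlations
# in the order (rho12, rho23, rho13|2).
uniform_cdf = lambda x: x
uniform_pdf = lambda x: 1.0
value = dvine3_joint_density(
    (0.30, 0.25, 0.40), [uniform_cdf] * 3, [uniform_pdf] * 3, (0.8, 0.6, 0.2)
)
```

Other pair-copula families would replace `gauss_copula_density` and `gauss_h` with their own density and conditional distribution, while the product structure stays the same.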
Further, in step 4-b), the Euclidean distance test is used to determine, one by one, the optimal Copula function type and parameters of $c_{12}(\cdot)$, $c_{23}(\cdot)$ and $c_{13|2}(\cdot)$. Taking $c_{12}(\cdot)$ as an example, assume that the variables $x_1$ and $x_2$ have $m$ pieces of historical data. The difference between the empirical value and the theoretical value of their joint distribution function is expressed as:
$$d^{2} = \sum_{i=1}^{m} \left| C_{em}(F(x_{1,i}), F(x_{2,i})) - C(F(x_{1,i}), F(x_{2,i})) \right|^{2}$$
where $C_{em}(F(x_{1,i}), F(x_{2,i}))$ is the empirical probability value at the sample point $(x_{1,i}, x_{2,i})$, and $C(F(x_{1,i}), F(x_{2,i}))$ is the probability value at that sample point calculated from the theoretical Copula function. A suitable function type is selected from the Gaussian Copula, t Copula and Frank Copula, and its parameters are optimized, so that the difference between the empirical value and the theoretical value of the joint distribution function is minimized.
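The selection rule can be sketched as follows: compute the empirical copula at each sample point, evaluate a candidate theoretical copula at the same points, and keep the candidate with the smallest squared distance. The sketch covers only the Frank copula over a coarse parameter grid, because its distribution function has a closed form; the Gaussian and t copulas named in the text need numerical integration and are left out. The rank-based estimate of $F$, the grid of parameters and the synthetic data are assumptions made for the example.

```python
import math
import random


def pseudo_obs(xs):
    """Empirical CDF values F(x_i) = rank_i / (m + 1), kept strictly inside (0, 1)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    u = [0.0] * len(xs)
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (len(xs) + 1)
    return u


def empirical_copula(u, v):
    """C_em at each sample point: share of samples lying below and to the left of it."""
    m = len(u)
    return [
        sum(1 for k in range(m) if u[k] <= u[i] and v[k] <= v[i]) / m for i in range(m)
    ]


def frank_cdf(u, v, theta):
    """Frank copula C(u, v; theta) for theta != 0."""
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(num / math.expm1(-theta)) / theta


def squared_distance(c_em, u, v, copula):
    """Sum over samples of (empirical value - theoretical value) squared."""
    return sum((ce - copula(ui, vi)) ** 2 for ce, ui, vi in zip(c_em, u, v))


def fit_frank(x1, x2, thetas):
    """Pick the theta on the grid with the smallest squared distance."""
    u, v = pseudo_obs(x1), pseudo_obs(x2)
    c_em = empirical_copula(u, v)
    scores = {
        t: squared_distance(c_em, u, v, lambda a, b: frank_cdf(a, b, t)) for t in thetas
    }
    best = min(scores, key=scores.get)
    return best, scores[best]


# Synthetic, strongly positively dependent pair standing in for (x1, x2).
_rng = random.Random(0)
x1 = [_rng.random() for _ in range(200)]
x2 = [a + 0.1 * _rng.random() for a in x1]
best_theta, best_d2 = fit_frank(x1, x2, thetas=[-10, -5, -2, 2, 5, 10, 20])
```

A fuller version would repeat the same comparison across copula families and refine the parameter with a numerical optimizer instead of a grid.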
Further, when the power prediction value of the photovoltaic power station is known, the probability distribution of the power prediction error is estimated from the D-vine correlation model constructed in step 4). Assume that the predicted values of the first and second photovoltaic power stations are $x_2 = P_1^f$ and $x_3 = P_2^f$ respectively. The probability distribution of the actual power of the first photovoltaic power station is then expressed as:
$$f(x_1 \mid x_2, x_3) = c_{13|2}(F(x_1 \mid x_2), F(x_3 \mid x_2)) \cdot c_{12}(F(x_1), F(x_2)) \cdot f(x_1)$$
The power prediction error of the first photovoltaic power station is expressed as:
$$\Delta x = x_1 - x_2$$
so once the probability distribution of the actual power of the first photovoltaic power station is obtained, subtracting the predicted power value $P_1^f$ from it gives the probability distribution of the prediction error $\Delta x$.
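The conditional step can be illustrated by evaluating $f(x_1 \mid x_2, x_3)$ on a grid of $x_1$ values and shifting the grid by the known forecast $x_2$. As a simplifying assumption for this sketch only, all pair-copulas are Gaussian and the marginals are uniform on [0, 1] p.u., so $F(x) = x$ and $f(x) = 1$; the function names and correlation values are hypothetical. With Gaussian pair-copulas the normal score of $F(x_1 \mid x_2)$ has a closed form, which the code uses directly.

```python
import math
from statistics import NormalDist

_STD = NormalDist()


def gauss_copula_density_z(a, b, rho):
    """Gaussian copula density written in terms of the normal scores a and b."""
    q = (rho * rho * (a * a + b * b) - 2.0 * rho * a * b) / (2.0 * (1.0 - rho * rho))
    return math.exp(-q) / math.sqrt(1.0 - rho * rho)


def error_density(x2, x3, rho12, rho23, rho13_2, n=2000):
    """Return (errors, density): the density of dx = x1 - x2 given forecasts x2, x3."""
    a2, a3 = _STD.inv_cdf(x2), _STD.inv_cdf(x3)  # F(x) = x under the uniform assumption
    z3_2 = (a3 - rho23 * a2) / math.sqrt(1.0 - rho23 ** 2)  # normal score of F(x3 | x2)
    errors, density = [], []
    for k in range(n):
        x1 = (k + 0.5) / n  # midpoint grid on (0, 1)
        a1 = _STD.inv_cdf(x1)
        z1_2 = (a1 - rho12 * a2) / math.sqrt(1.0 - rho12 ** 2)  # score of F(x1 | x2)
        # f(x1 | x2, x3) = c_13|2 * c_12 * f(x1), with f(x1) = 1 here.
        f_cond = gauss_copula_density_z(z1_2, z3_2, rho13_2) * gauss_copula_density_z(
            a1, a2, rho12
        )
        errors.append(x1 - x2)  # shifting by the forecast turns power into error
        density.append(f_cond)
    return errors, density


# Hypothetical forecasts (p.u.) and pair-copula correlations.
errors, density = error_density(x2=0.2, x3=0.25, rho12=0.8, rho23=0.7, rho13_2=0.2)
total = sum(density) / len(density)  # midpoint-rule integral of the density
```

In an application, `F` and `f` would be replaced by marginal distributions estimated from the historical data of the relevant weather type, and the pair-copulas by the families chosen in step 4-b).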
The invention uses K-means clustering to divide the historical data sets of multiple photovoltaic power stations by meteorological condition; it uses a D-vine Copula function to model the correlation between the predicted and measured power of multiple photovoltaic power stations; and it uses the constructed D-vine correlation model to construct the probability distribution of the photovoltaic power prediction error accurately when the power prediction value of the photovoltaic power station is known.
The modeling method is also applicable to wind power prediction errors and to joint wind-solar prediction errors, and is particularly suitable when the outputs of multiple power stations are strongly correlated.
Drawings
FIG. 1 is a flow chart of the multi-photovoltaic power station power prediction error modeling method of the present invention;
FIG. 2 is a graph of historical data of predicted power and measured power of a photovoltaic power station in an application example of the present invention;
FIG. 3 is a graph of historical data of predicted power and measured power in sunny weather obtained by clustering in an application example of the present invention;
FIG. 4 is a graph of historical data of predicted power and measured power in cloudy weather obtained by clustering in an application example of the present invention;
FIG. 5 is a graph of historical data of predicted power and measured power in rainy and snowy weather obtained by clustering in an application example of the present invention;
FIG. 6 is a scatter diagram of historical data of predicted power and measured power in sunny weather in an application example of the present invention;
FIG. 7 is a scatter diagram of historical data of predicted power and measured power in cloudy weather in an application example of the present invention;
FIG. 8 is a scatter diagram of historical data of predicted power and measured power in rainy and snowy weather in an application example of the present invention;
FIG. 9 is a logical structure diagram of the D-vine Copula in an application example of the present invention;
FIG. 10 is a power prediction error probability distribution diagram obtained in sunny weather when the predicted power of the photovoltaic power station is 0.2 p.u.;
FIG. 11 is a power prediction error probability distribution diagram obtained in cloudy weather when the predicted power of the photovoltaic power station is 0.2 p.u.;
FIG. 12 is a power prediction error probability distribution diagram obtained in rainy and snowy weather when the predicted power of the photovoltaic power station is 0.2 p.u.
Detailed Description
In order to more specifically describe the present invention, the following detailed description of the embodiments of the present invention is provided with reference to the accompanying drawings.
Examples
This embodiment is a multi-photovoltaic power station power prediction error modeling method based on a Vine-Copula function. As shown in FIG. 1, the model construction steps are as follows:
1) According to historical statistical data of a plurality of photovoltaic power stations in an area, a historical data set comprising the meteorological factors, photovoltaic predicted power and photovoltaic measured power of the photovoltaic power stations is constructed;
2) From the historical data sets of step 1), taking factors such as solar irradiance, air temperature and air pressure together as the basis for division, the predicted power and measured power data sets of the photovoltaic power stations under typical weather types such as sunny, cloudy, and rainy and snowy weather are obtained by K-means clustering;
3) From the predicted power and measured power data sets under the typical weather types of step 2), the power prediction error of the photovoltaic power station in each time period is calculated;
4) A correlation model between the predicted power and the measured power of the photovoltaic power stations under different weather types is constructed with a Vine-Copula function, as follows:
4-a) a correlation model of the predicted power and the measured power of the photovoltaic power stations under different weather types is established using the D-vine Copula structure, where the candidate bivariate Copula functions may include the Gaussian Copula, t Copula, Frank Copula and the like;
4-b) the bivariate Copula types and parameters in the correlation model are selected one by one with the Euclidean distance test, according to the photovoltaic power station output data sets under different weather types;
When the predicted power value of the photovoltaic power station is known, the probability distribution of the prediction error at that predicted power can be obtained from the D-vine Copula correlation model constructed in step 4).
Specifically, in step 1), for photovoltaic power stations equipped with meteorological and electrical measurement devices, a historical data set containing meteorological factors and a historical data set containing photovoltaic power can be constructed:
$$W_{i,j} = [I_{i,j}, T_{i,j}, V_{i,j}]$$
$$D_{i,j} = [P^{f}_{i,j}, P^{r}_{i,j}]$$
where subscript $i$ is the photovoltaic power station serial number; subscript $j$ is the historical data sequence number; $W_{i,j}$ is the $j$-th group of meteorological historical data of the $i$-th photovoltaic power station; $I_{i,j}$, $T_{i,j}$ and $V_{i,j}$ are the meteorological factors commonly used for power prediction of the photovoltaic power station, namely solar irradiance, temperature and air pressure; $D_{i,j}$ is the $j$-th group of photovoltaic power historical data of the $i$-th photovoltaic power station; $P^{f}_{i,j}$ is the predicted power value of the $i$-th photovoltaic power station; and $P^{r}_{i,j}$ is the measured power value of the $i$-th photovoltaic power station.
Specifically, in step 2), the meteorological historical data set of the photovoltaic power station is divided into different weather types, such as sunny, cloudy, and rainy and snowy weather, by K-means clustering. The K-means clustering method uses the Euclidean distance between samples to describe their similarity. Taking the $i$-th photovoltaic power station as an example, the Euclidean distance between data samples is:
$$\Delta w_{j,k} = \lVert W_{i,j} - W_{i,k} \rVert$$
where $\Delta w_{j,k}$ is the Euclidean distance between the $j$-th sample and the $k$-th sample, and $\lVert \cdot \rVert$ denotes the 2-norm.
The K-means clustering method is a typical unsupervised learning method, and its objective function for dividing a data set is:
$$\min \sum_{c=1}^{K} \sum_{W_{i,j} \in \text{cluster } c} \lVert W_{i,j} - W_{i,c} \rVert^{2}$$
where $K$ is the number of clusters, which in this embodiment is the number of weather types to be divided, and $W_{i,c}$ is the cluster center of the $c$-th cluster, determined by the expected value of the sample points belonging to that cluster. In the K-means clustering process, the number of clusters $K$ is determined first, and $K$ sample points are randomly selected as the initial cluster centers. Minimizing the Euclidean distance from the sample points to their cluster centers is taken as the optimization objective, and the cluster memberships and cluster centers are updated repeatedly until convergence, which yields the division into different weather types.
Through K-means clustering, the meteorological historical data set of the photovoltaic power station can be divided into sunny, cloudy, and rainy and snowy weather. The corresponding data sets can be denoted $W_i^1$, $W_i^2$ and $W_i^3$, and the sample points in each data set can be denoted $W^1_{i,j}$, $W^2_{i,j}$ and $W^3_{i,j}$, where superscripts 1, 2 and 3 correspond to sunny, cloudy, and rainy and snowy weather.
Specifically, in step 2), after K-means clustering has divided the meteorological historical data set of the photovoltaic power station into meteorological data sets for sunny, cloudy, and rainy and snowy weather, the photovoltaic power data recorded at the same times can be divided according to the time sequence labels of the meteorological data. This yields photovoltaic power data sets for sunny, cloudy, and rainy and snowy weather. The corresponding data sets can be denoted $D_i^1$, $D_i^2$ and $D_i^3$, and the sample points in each data set can be denoted $D^1_{i,j}$, $D^2_{i,j}$ and $D^3_{i,j}$.
Specifically, in step 3), after the photovoltaic power data sets under different weather types are obtained, the prediction error can be calculated from the predicted and measured photovoltaic power data in each data set:
$$e_{i,j} = P^{r}_{i,j} - P^{f}_{i,j}$$
where $e_{i,j}$ is the prediction error of the $j$-th sample point of the $i$-th photovoltaic power station.
Specifically, in step 4-a), based on the photovoltaic power data sets obtained under different weather types, the correlation between the predicted and measured photovoltaic power values can be modeled. Photovoltaic power prediction aims to match the actual output of the photovoltaic power station as closely as possible, so the predicted and measured values are strongly correlated and their relationship can be modeled with Copula theory. Taking the data sets of two photovoltaic power stations in sunny weather as an example, assume that the predicted and measured power values of the first photovoltaic power station are $P_1^f$ and $P_1^r$, and that the predicted and measured power values of the second photovoltaic power station are $P_2^f$ and $P_2^r$. The correlation model between the variables $P_1^r$, $P_1^f$ and $P_2^f$ can then be expressed as:
$$F(P_1^r, P_1^f, P_2^f) = C(F(P_1^r), F(P_1^f), F(P_2^f))$$
where $F(\cdot)$ denotes the cumulative probability distribution function of a variable and $C(\cdot)$ denotes the Copula function describing the correlation of multidimensional variables. The probability density function describing the multivariate correlation follows:
$$f(P_1^r, P_1^f, P_2^f) = c(F(P_1^r), F(P_1^f), F(P_2^f)) \cdot f(P_1^r) \cdot f(P_1^f) \cdot f(P_2^f)$$
where $f(\cdot)$ denotes the probability density function of a variable and $c(\cdot)$ denotes the Copula density function describing the dependence among the multidimensional variables. Commonly used Copula functions include the Gaussian Copula, the t Copula, and the Frank Copula. For multidimensional variables the correlation structure may be complex, so a Vine-Copula function is preferably used for modeling. Vine-Copula functions usually take one of two forms, the C vine and the D vine; the D vine structure suits cases in which the pairwise correlations between the variables are of similar strength. For ease of illustration, the variables $x_1$, $x_2$ and $x_3$ are used in place of $P_1^r$, $P_1^f$ and $P_2^f$, and their joint probability density function is written as:
$$f(x_1,x_2,x_3) = f(x_3)\cdot f(x_2|x_3)\cdot f(x_1|x_2,x_3)$$
where $f(x_2|x_3)$ is further expressed as:
$$f(x_2|x_3) = c_{23}\big(F(x_2),F(x_3)\big)\cdot f(x_2)$$
Similarly, $f(x_1|x_2,x_3)$ is further expressed as:
$$f(x_1|x_2,x_3) = c_{13|2}\big(F(x_1|x_2),F(x_3|x_2)\big)\cdot c_{12}\big(F(x_1),F(x_2)\big)\cdot f(x_1)$$
From the above expressions, the joint probability density function of $x_1$, $x_2$ and $x_3$ is:
$$f(x_1,x_2,x_3) = c_{13|2}\big(F(x_1|x_2),F(x_3|x_2)\big)\cdot c_{12}\big(F(x_1),F(x_2)\big)\cdot c_{23}\big(F(x_2),F(x_3)\big)\cdot f(x_1)\cdot f(x_2)\cdot f(x_3)$$
As this expression shows, the D vine Copula function models the correlation structure among multiple variables by means of Copula functions between pairs of variables. The key to constructing the D vine correlation model is to select a suitable Copula function type for each of $c_{12}(\cdot)$, $c_{23}(\cdot)$ and $c_{13|2}(\cdot)$ and to determine its parameters.
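The pairwise structure can be made concrete with a small sketch. It assumes, purely for illustration, that all three pair Copulas are Gaussian (the invention selects the type per pair in step 4-b), and it evaluates only the Copula part of the joint density, namely the product of $c_{12}$, $c_{23}$ and $c_{13|2}$. The function names and parameter values are hypothetical.

```python
import math
from statistics import NormalDist

_STD = NormalDist()
_EPS = 1e-12


def _clip(u):
    """Keep probabilities strictly inside (0, 1) so inv_cdf is defined."""
    return min(max(u, _EPS), 1.0 - _EPS)


def gauss_pair_density(u, v, rho):
    """Bivariate Gaussian copula density c(u, v; rho), with |rho| < 1."""
    a, b = _STD.inv_cdf(_clip(u)), _STD.inv_cdf(_clip(v))
    s = 1.0 - rho * rho
    expo = -(rho * rho * (a * a + b * b) - 2.0 * rho * a * b) / (2.0 * s)
    return math.exp(expo) / math.sqrt(s)


def gauss_h(u, v, rho):
    """Conditional CDF F(u | v) implied by a Gaussian pair copula."""
    a, b = _STD.inv_cdf(_clip(u)), _STD.inv_cdf(_clip(v))
    return _STD.cdf((a - rho * b) / math.sqrt(1.0 - rho * rho))


def dvine_copula_density(u1, u2, u3, rho12, rho23, rho13_2):
    """c12 * c23 * c13|2 for a three-variable D vine ordered 1 - 2 - 3."""
    c12 = gauss_pair_density(u1, u2, rho12)
    c23 = gauss_pair_density(u2, u3, rho23)
    # The conditional pair copula takes F(x1 | x2) and F(x3 | x2) as inputs.
    c13_2 = gauss_pair_density(
        gauss_h(u1, u2, rho12), gauss_h(u3, u2, rho23), rho13_2
    )
    return c12 * c23 * c13_2


# Copula part of f(x1, x2, x3) at u_k = F(x_k); multiplying by
# f(x1) * f(x2) * f(x3) would give the joint density.
value = dvine_copula_density(0.5, 0.5, 0.5, rho12=0.6, rho23=0.8, rho13_2=0.0)
```

With all three correlation parameters set to zero the function returns 1, which corresponds to independent variables whose joint density is simply the product of the marginals.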
Specifically, in step 4-b), a Euclidean distance test is used to determine, one by one, the optimal Copula function type and parameters for $c_{12}(\cdot)$, $c_{23}(\cdot)$ and $c_{13|2}(\cdot)$. Taking $c_{12}(\cdot)$ as an example, assume that the variables $x_1$ and $x_2$ have $m$ historical data points; the difference between the empirical and theoretical values of their joint distribution function can be expressed as:
Figure BDA0003857545500000102
where $C_{em}(F(x_{1,i}),F(x_{2,i}))$ is the empirical probability value at the sample point $(x_{1,i},x_{2,i})$, and $C(F(x_{1,i}),F(x_{2,i}))$ is the probability value calculated at that sample point by the theoretical Copula function. A suitable function type is selected from the commonly used Gaussian Copula, t Copula and Frank Copula, and its parameters are optimized, so that the distance between the empirical and theoretical values of the joint distribution function is minimized. Following this process, the type of each Copula function in the D vine correlation model can be determined one by one.
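A minimal sketch of such a distance test is given below under several stated assumptions. Only the Frank Copula is evaluated, because its distribution function has a closed form; the Gaussian and t Copulas would need a numerical bivariate distribution routine that is not shown. The marginals are replaced by rank-based pseudo-observations, the distance is taken as the sum of squared differences (the exact formula of the invention is the one in the figure above), and the parameter is found by a simple grid search. All function names and data are hypothetical.

```python
import math
import random


def pseudo_obs(xs):
    """Map samples to (0, 1) by rank / (m + 1), an empirical stand-in for F(x)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    u = [0.0] * len(xs)
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (len(xs) + 1)
    return u


def empirical_copula(u, v):
    """C_em at each sample point: share of samples dominated in both margins."""
    m = len(u)
    return [
        sum(1 for k in range(m) if u[k] <= u[i] and v[k] <= v[i]) / m
        for i in range(m)
    ]


def frank_cdf(u, v, theta):
    """Frank copula C(u, v; theta) for theta != 0."""
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(num / math.expm1(-theta)) / theta


def fit_frank(u, v, grid):
    """Grid search for the theta with the smallest squared Euclidean distance."""
    c_em = empirical_copula(u, v)

    def dist2(theta):
        return sum(
            (c - frank_cdf(a, b, theta)) ** 2 for a, b, c in zip(u, v, c_em)
        )

    best = min(grid, key=dist2)
    return best, dist2(best)


# Hypothetical, strongly positively dependent data standing in for the
# measured and predicted power of one station.
rng = random.Random(1)
x = [rng.random() for _ in range(200)]
y = [xi + 0.1 * rng.gauss(0.0, 1.0) for xi in x]
grid = [t / 2 for t in range(-40, 41) if t != 0]
theta_best, dist_best = fit_frank(pseudo_obs(x), pseudo_obs(y), grid)
```

To choose between Copula types in the manner described above, the same distance would be computed for each candidate family at its best parameter, and the family with the smallest distance retained.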
When the predicted power values of the photovoltaic power stations are known, the probability distribution of the power prediction error is estimated from the D vine correlation model constructed in step 4). Assume that the predicted values of the first and second photovoltaic power stations are $x_2 = P_1^f$ and $x_3 = P_2^f$, respectively. The probability distribution of the actual power of the first photovoltaic power station can then be expressed as:
Figure BDA0003857545500000104
Further, the power prediction error of the first photovoltaic power station can be expressed as:
$$\Delta x = x_1 - x_2$$
Therefore, once the probability distribution of the actual power $x_1$ of the first photovoltaic power station has been obtained, subtracting the predicted power value $P_1^f$ from $x_1$ gives the probability distribution of the prediction error $\Delta x$. Similarly, if step 4) constructs a correlation model of the outputs of more than two photovoltaic power stations, then when the predicted power value of each station is known, the probability distribution of the actual power and of the prediction error of each station can be obtained from the correlation model.
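The estimation step can be sketched numerically as follows. The sketch evaluates the conditional density $f(x_1|x_2,x_3)$ in the form $c_{13|2}\cdot c_{12}\cdot f(x_1)$ given earlier in this description, on a grid of $x_1$ values, and then shifts the grid by $x_2$ to obtain the density of $\Delta x$. Two simplifications are assumed and are not part of the invention: all pair Copulas are Gaussian, and every marginal is uniform on [0, 1] p.u., so that $F(x)=x$ and $f(x)=1$. Function names and parameter values are hypothetical.

```python
import math
from statistics import NormalDist

_STD = NormalDist()


def _z(u):
    """Standard normal quantile, clipped away from 0 and 1."""
    return _STD.inv_cdf(min(max(u, 1e-12), 1.0 - 1e-12))


def pair_density(u, v, rho):
    """Gaussian pair copula density c(u, v; rho)."""
    a, b, s = _z(u), _z(v), 1.0 - rho * rho
    expo = -(rho * rho * (a * a + b * b) - 2.0 * rho * a * b) / (2.0 * s)
    return math.exp(expo) / math.sqrt(s)


def h(u, v, rho):
    """Conditional CDF F(u | v) under a Gaussian pair copula."""
    return _STD.cdf((_z(u) - rho * _z(v)) / math.sqrt(1.0 - rho * rho))


def error_density(x2, x3, rho12, rho23, rho13_2, n=2000):
    """Return [(delta_x, density)] on a midpoint grid of x1 in (0, 1).

    Marginals are assumed uniform on [0, 1] p.u., so F(x) = x and f(x) = 1.
    """
    h32 = h(x3, x2, rho23)  # F(x3 | x2), fixed by the two known forecasts
    curve = []
    for k in range(n):
        x1 = (k + 0.5) / n  # midpoint of the k-th grid cell
        dens = pair_density(h(x1, x2, rho12), h32, rho13_2)
        dens *= pair_density(x1, x2, rho12)
        curve.append((x1 - x2, dens))  # shift by the forecast: delta_x = x1 - x2
    return curve


curve = error_density(x2=0.2, x3=0.2, rho12=0.8, rho23=0.5, rho13_2=0.3)
area = sum(d for _, d in curve) / len(curve)  # midpoint-rule integral
```

The value of `area` should be close to 1, which is a quick check that the evaluated curve is a proper conditional density. With fitted marginals, `x1` would be mapped through its distribution function before being passed to the Copula terms, and the result multiplied by $f(x_1)$.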
Application example
The method of the invention is illustrated with historical data from two photovoltaic power stations in eastern China, and comprises the following steps:
(1) From the historical statistical data of the photovoltaic power stations in the region, a historical data set containing the meteorological factors, predicted photovoltaic power and measured photovoltaic power of each photovoltaic power station is constructed.
Fig. 2 shows the predicted and measured power curves of the first photovoltaic power station, located in eastern China. From the time-series data recorded by the meteorological and electrical measurement devices of the power station, a meteorological historical data set and a photovoltaic power historical data set can be constructed:
$$W_{1,j} = [I_{1,j}, T_{1,j}, V_{1,j}]$$
Figure BDA0003857545500000111
where $W_{1,j}$ denotes the $j$-th group of meteorological historical data of the first photovoltaic power station; $I_{1,j}$, $T_{1,j}$ and $V_{1,j}$ are the solar irradiance, air temperature and air pressure, respectively, which are the meteorological factors used for power prediction of the photovoltaic power station; $D_{1,j}$ denotes the $j$-th group of photovoltaic power historical data of the first photovoltaic power station;
Figure BDA0003857545500000112
is the predicted power value of the first photovoltaic power station; and
Figure BDA0003857545500000113
is the measured power value of the first photovoltaic power station. Likewise, for the second photovoltaic power station, the meteorological historical data set $W_{2,j}$ and the power historical data set $D_{2,j}$ can be obtained.
(2) From the historical data sets of the photovoltaic power stations, with the meteorological factors as the basis for division, the predicted and measured data of the photovoltaic power stations under typical weather types such as sunny, cloudy, and rain and snow are obtained.
Based on the meteorological historical data sets of the photovoltaic power stations, K-means clustering is used to divide the meteorological historical data set of the first photovoltaic power station into data sets for the different weather types (sunny, cloudy, and rain and snow). The corresponding data sets are denoted by the variables $W_i^1$, $W_i^2$ and $W_i^3$, and the sample points in each data set are denoted by
Figure BDA0003857545500000114
Figure BDA0003857545500000115
and
Figure BDA0003857545500000116
where types 1, 2 and 3 correspond to sunny, cloudy, and rain-and-snow weather, respectively. According to the time-sequence labels of the meteorological data, the photovoltaic power data recorded at the same moments are divided to obtain photovoltaic power data sets for the sunny, cloudy, and rain-and-snow weather types. The corresponding data sets are denoted by the variables
Figure BDA0003857545500000117
and
Figure BDA0003857545500000118
and the sample points in each data set are denoted by
Figure BDA0003857545500000119
and
Figure BDA00038575455000001110
Figs. 3 to 5 show the curves of predicted and measured photovoltaic power under the three weather conditions. The output fluctuation of the photovoltaic power station clearly differs between weather conditions. Figs. 6 to 8 are scatter diagrams of predicted versus measured photovoltaic power under the three weather conditions. On sunny days, the predicted and actual photovoltaic power values are very highly correlated and the scatter points lie close to the diagonal; on cloudy days the correlation weakens and the area covered by the scatter points grows; in rain and snow the correlation drops markedly and becomes nonlinear. The relationship between predicted and actual photovoltaic power thus differs markedly across meteorological conditions, which shows why prediction error modeling needs to take meteorological conditions into account.
(3) From the predicted and measured power data sets of the photovoltaic power stations under the different weather types,
Figure BDA0003857545500000121
and
Figure BDA0003857545500000122
the power prediction error of the photovoltaic power station in each time period is calculated:
Figure BDA0003857545500000123
where $e_{i,j}$ is the prediction error of the $j$-th sample point of the $i$-th photovoltaic power station.
(4) A Vine-Copula function is used to construct a correlation model between the predicted and measured power of the two photovoltaic power stations under the different weather types. For ease of explanation, the variables $x_1$, $x_2$ and $x_3$ are used in place of $P_1^r$, $P_1^f$ and $P_2^f$.
4-a) A correlation model of the predicted and measured power of the photovoltaic power stations under the different weather types is established using the D vine Copula function structure, in which the joint probability density function of the variables $x_1$, $x_2$ and $x_3$ can be expressed as:
$$f(x_1,x_2,x_3) = c_{13|2}\big(F(x_1|x_2),F(x_3|x_2)\big)\cdot c_{12}\big(F(x_1),F(x_2)\big)\cdot c_{23}\big(F(x_2),F(x_3)\big)\cdot f(x_1)\cdot f(x_2)\cdot f(x_3)$$
As this expression shows, the D vine Copula function models the correlation structure among multiple variables by means of Copula functions between pairs of variables. Fig. 9 shows the logical structure of the D vine Copula function, in which $c_{12}(\cdot)$, $c_{23}(\cdot)$ and $c_{13|2}(\cdot)$ are the binary Copula functions whose types are to be determined.
4-b) Based on the power historical data sets of the two photovoltaic power stations, a Euclidean distance test is used to select the optimal Copula function type and parameters for $c_{12}(\cdot)$, $c_{23}(\cdot)$ and $c_{13|2}(\cdot)$. Taking $c_{12}(\cdot)$ as an example, assume that the variables $x_1$ and $x_2$ have $m$ historical data points; the difference between the empirical and theoretical values of the joint distribution function of $x_1$ and $x_2$ can be expressed as:
Figure BDA0003857545500000125
where $C_{em}(F(x_{1,i}),F(x_{2,i}))$ is the empirical probability value at the sample point $(x_{1,i},x_{2,i})$, and $C(F(x_{1,i}),F(x_{2,i}))$ is the probability value calculated at that sample point by the theoretical Copula function. Commonly used binary Copula functions include the Gaussian Copula, the t Copula, and the Frank Copula.
(5) Using the correlation model of the predicted and measured power values of the multiple photovoltaic power stations, when the predicted power values of the photovoltaic power stations are known, the probability distribution of the prediction error at those predicted powers is obtained from the correlation model:
Figure BDA0003857545500000131
where the predicted values of the first and second photovoltaic power stations are assumed to be $x_2 = P_1^f$ and $x_3 = P_2^f$, respectively. Once the probability distribution of the actual power $x_1$ of the first photovoltaic power station has been obtained, subtracting the predicted power value $P_1^f$ from $x_1$ gives the probability distribution of the prediction error.
Figs. 10 to 12 show the probability distributions of the power prediction error of the first photovoltaic power station under the different weather types when the predicted power values of both photovoltaic power stations are 0.2 p.u. The comparison shows that the probability distribution obtained by the proposed modeling method fits the statistical results of the historical data closely. It also fits more accurately than the Beta distribution currently in common use for photovoltaic power prediction errors, which demonstrates the effectiveness of the method.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that various changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (10)

1. A multi-photovoltaic power station power prediction error modeling method is characterized in that the model construction steps are as follows:
1) According to historical statistical data of a plurality of photovoltaic power stations in the area, a historical data set containing meteorological factors, photovoltaic predicted power and photovoltaic measured power of the photovoltaic power stations is constructed;
2) Obtaining photovoltaic power station predicted power and actual measurement power data sets under typical weather types by adopting a K-means clustering method according to a plurality of photovoltaic power station historical data sets in the area in the step 1) and taking solar irradiance, air temperature and air pressure as dividing bases;
3) Calculating the power prediction error of the photovoltaic power station in each time period according to the photovoltaic power station predicted power and the actually measured power data set in the typical weather type in the step 2);
4) Constructing, with a Vine-Copula function, a correlation model between the predicted power and the measured power of the photovoltaic power stations under the different weather types, the specific process being as follows:
4-a) establishing a D vine correlation model of the predicted power and the measured power of the photovoltaic power stations under the different weather types by using a D vine Copula function structure;
4-b) selecting, one by one, the binary Copula types and parameters in the D vine correlation model by a Euclidean distance test, according to the output data sets of the photovoltaic power stations under the different weather types.
2. The multi-photovoltaic power station power prediction error modeling method of claim 1, characterized in that in step 1), for a photovoltaic power station equipped with meteorological and electrical measurement devices, a historical data set containing meteorological factors and photovoltaic power is constructed:
$$W_{i,j} = [I_{i,j}, T_{i,j}, V_{i,j}]$$
Figure FDA0003857545490000011
where the subscript $i$ denotes the serial number of the photovoltaic power station; the subscript $j$ denotes the serial number of the historical data; $W_{i,j}$ denotes the $j$-th group of meteorological historical data of the $i$-th photovoltaic power station; $I_{i,j}$, $T_{i,j}$ and $V_{i,j}$ are the solar irradiance, air temperature and air pressure, respectively, which are the meteorological factors used for power prediction of the photovoltaic power station; $D_{i,j}$ denotes the $j$-th group of photovoltaic power historical data of the $i$-th photovoltaic power station;
Figure FDA0003857545490000012
is the predicted power value of the $i$-th photovoltaic power station; and
Figure FDA0003857545490000013
is the measured power value of the $i$-th photovoltaic power station.
3. The multi-photovoltaic power station power prediction error modeling method of claim 2, characterized in that in step 2), the historical data set of the meteorological factors of the photovoltaic power station is divided into different weather types by K-means clustering, in which the similarity of samples is described by the Euclidean distance between them; taking the $i$-th photovoltaic power station as an example, the Euclidean distance between data samples is:
Figure FDA0003857545490000021
where $\Delta w_{j,k}$ denotes the Euclidean distance between the $j$-th sample and the $k$-th sample, and $\|\cdot\|$ denotes the 2-norm.
4. The multi-photovoltaic power station power prediction error modeling method of claim 2, wherein the K-means clustering method is a typical unsupervised learning method that partitions the data set with the following objective function:
Figure FDA0003857545490000022
where $K$ is the number of clusters, i.e. the number of weather types to be distinguished; $W_{i,c}$ is the cluster center of the $c$-th cluster, determined as the mean of the sample points belonging to that cluster; in the K-means clustering process, the number of clusters $K$ is first determined and $K$ sample points are randomly selected as the initial cluster centers; with minimization of the Euclidean distance from the sample points to the cluster centers as the optimization objective, the cluster memberships of the sample points and the cluster centers are updated repeatedly until convergence, thereby dividing the data into the different weather types.
5. The multi-photovoltaic power station power prediction error modeling method of claim 4, characterized in that the meteorological historical data set of the photovoltaic power station is divided by K-means clustering into sunny, cloudy, and rain-and-snow weather; the corresponding data sets are denoted by the variables $W_i^1$, $W_i^2$ and $W_i^3$, and the sample points in each data set are denoted by
Figure FDA0003857545490000023
and
Figure FDA0003857545490000024
where types 1, 2 and 3 correspond to sunny, cloudy, and rain-and-snow weather, respectively.
6. The multi-photovoltaic power station power prediction error modeling method of claim 5, wherein in step 2), the meteorological historical data set of the photovoltaic power station is divided by K-means clustering into meteorological data sets for the sunny, cloudy, and rain-and-snow weather types, and the photovoltaic power data recorded at the same moments are divided according to the time-sequence labels of the meteorological data to obtain photovoltaic power data sets for the sunny, cloudy, and rain-and-snow weather types; the corresponding data sets are denoted by the variables
Figure FDA0003857545490000025
and
Figure FDA0003857545490000026
and the sample points in each data set are denoted by
Figure FDA0003857545490000027
and
Figure FDA0003857545490000028
7. The multi-photovoltaic power station power prediction error modeling method of claim 1, wherein in step 3), after the photovoltaic power data sets under the different weather types are obtained, the prediction error is calculated from the predicted and measured photovoltaic power data in the data sets:
Figure FDA0003857545490000031
where $e_{i,j}$ is the prediction error of the $j$-th sample point of the $i$-th photovoltaic power station;
Figure FDA0003857545490000032
is the predicted power value of the $i$-th photovoltaic power station; and
Figure FDA0003857545490000033
is the measured power value of the $i$-th photovoltaic power station.
8. The multi-photovoltaic power station power prediction error modeling method of claim 1, characterized in that in step 4-a), taking the data sets of two photovoltaic power stations under the sunny weather type as an example, the predicted and measured power values of the first photovoltaic power station are assumed to be $P_1^f$ and $P_1^r$, respectively, and the predicted and measured power values of the second photovoltaic power station are $P_2^f$ and $P_2^r$, respectively; the correlation model between the variables $P_1^r$, $P_1^f$ and $P_2^f$ is then expressed as:
$$F(P_1^r, P_1^f, P_2^f) = C\big(F(P_1^r), F(P_1^f), F(P_2^f)\big)$$
where $F(\cdot)$ denotes the cumulative probability distribution function of a variable and $C(\cdot)$ denotes the Copula function describing the dependence among the multidimensional variables; the probability density function describing the multivariate correlation is further obtained:
$$f(P_1^r, P_1^f, P_2^f) = c\big(F(P_1^r), F(P_1^f), F(P_2^f)\big)\cdot f(P_1^r)\cdot f(P_1^f)\cdot f(P_2^f)$$
where $f(\cdot)$ denotes the probability density function of a variable and $c(\cdot)$ denotes the Copula density function describing the dependence among the multidimensional variables;
for ease of illustration, the variables $x_1$, $x_2$ and $x_3$ are used in place of $P_1^r$, $P_1^f$ and $P_2^f$, and their joint probability density function is further written as:
$$f(x_1,x_2,x_3) = f(x_3)\cdot f(x_2|x_3)\cdot f(x_1|x_2,x_3)$$
where $f(x_2|x_3)$ is further expressed as:
$$f(x_2|x_3) = c_{23}\big(F(x_2),F(x_3)\big)\cdot f(x_2)$$
similarly, $f(x_1|x_2,x_3)$ is further expressed as:
$$f(x_1|x_2,x_3) = c_{13|2}\big(F(x_1|x_2),F(x_3|x_2)\big)\cdot c_{12}\big(F(x_1),F(x_2)\big)\cdot f(x_1)$$
from these expressions, the joint probability density function of the variables $x_1$, $x_2$ and $x_3$ is expressed as:
$$f(x_1,x_2,x_3) = c_{13|2}\big(F(x_1|x_2),F(x_3|x_2)\big)\cdot c_{12}\big(F(x_1),F(x_2)\big)\cdot c_{23}\big(F(x_2),F(x_3)\big)\cdot f(x_1)\cdot f(x_2)\cdot f(x_3)$$
the key to constructing the D vine correlation model is to select a suitable Copula function type for each of $c_{12}(\cdot)$, $c_{23}(\cdot)$ and $c_{13|2}(\cdot)$ and to determine its parameters.
9. The multi-photovoltaic power station power prediction error modeling method of claim 8, wherein in step 4-b), a Euclidean distance test is used to determine, one by one, the optimal Copula function type and parameters for $c_{12}(\cdot)$, $c_{23}(\cdot)$ and $c_{13|2}(\cdot)$; taking $c_{12}(\cdot)$ as an example, assume that the variables $x_1$ and $x_2$ have $m$ historical data points; the difference between the empirical and theoretical values of their joint distribution function is expressed as:
Figure FDA0003857545490000041
where $C_{em}(F(x_{1,i}),F(x_{2,i}))$ is the empirical probability value at the sample point $(x_{1,i},x_{2,i})$; $C(F(x_{1,i}),F(x_{2,i}))$ is the probability value calculated at that sample point by the theoretical Copula function; a suitable function type is selected from the Gaussian Copula, t Copula and Frank Copula, and its parameters are optimized, so that the distance between the empirical and theoretical values of the joint distribution function is minimized.
10. The multi-photovoltaic power station power prediction error modeling method of claim 9, characterized in that when the predicted power values of the photovoltaic power stations are known, the probability distribution of the power prediction error is estimated from the D vine correlation model constructed in step 4); the predicted values of the first and second photovoltaic power stations are assumed to be $x_2 = P_1^f$ and $x_3 = P_2^f$, respectively, and the probability distribution of the actual power of the first photovoltaic power station is then expressed as:
Figure FDA0003857545490000042
since the power prediction error of the first photovoltaic power station is expressed as:
$$\Delta x = x_1 - x_2$$
after the probability distribution of the actual power $x_1$ of the first photovoltaic power station is obtained, subtracting the predicted power value $P_1^f$ from $x_1$ gives the probability distribution of the prediction error $\Delta x$.
CN202211152635.4A 2022-09-21 2022-09-21 Multi-photovoltaic power station power prediction error modeling method Pending CN115511170A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211152635.4A CN115511170A (en) 2022-09-21 2022-09-21 Multi-photovoltaic power station power prediction error modeling method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211152635.4A CN115511170A (en) 2022-09-21 2022-09-21 Multi-photovoltaic power station power prediction error modeling method

Publications (1)

Publication Number Publication Date
CN115511170A true CN115511170A (en) 2022-12-23

Family

ID=84503545

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211152635.4A Pending CN115511170A (en) 2022-09-21 2022-09-21 Multi-photovoltaic power station power prediction error modeling method

Country Status (1)

Country Link
CN (1) CN115511170A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117252729A (en) * 2023-11-17 2023-12-19 北京恒信启华信息技术股份有限公司 Photovoltaic power station management method and system based on big data
CN117252729B (en) * 2023-11-17 2024-04-16 北京恒信启华信息技术股份有限公司 Photovoltaic power station management method and system based on big data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination