CN112561139A - Short-term photovoltaic power generation power prediction method and system - Google Patents

Short-term photovoltaic power generation power prediction method and system

Info

Publication number
CN112561139A
Authority
CN
China
Prior art keywords
data
photovoltaic power
power generation
clustering
meteorological
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011395236.1A
Other languages
Chinese (zh)
Inventor
张倩
张金金
李国丽
王群京
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University
Original Assignee
Anhui University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University filed Critical Anhui University
Priority to CN202011395236.1A priority Critical patent/CN112561139A/en
Publication of CN112561139A publication Critical patent/CN112561139A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply

Abstract

The invention relates to a short-term photovoltaic power generation power prediction method and system. The input data are preprocessed in three steps: data cleaning removes abnormal data with the iForest (isolation forest) algorithm; feature selection computes the Pearson coefficient between each meteorological factor and the photovoltaic power generation power and keeps the strongly correlated factors as input features of the model; and normalization eliminates the adverse effect that differences in the value ranges of different types of input data have on the learning and training of the model. A K-means algorithm based on the Davies-Bouldin index then performs cluster analysis on the features. With the main network parameters given, the error-corrected short-term photovoltaic power prediction results are presented and compared with those of a standalone BP method and a standalone LSTM method, which verifies that the proposed method achieves better prediction accuracy.

Description

Short-term photovoltaic power generation power prediction method and system
Technical Field
The invention relates to the technical field of photovoltaic power generation, in particular to a short-term photovoltaic power generation power prediction method and system.
Background
Photovoltaic power generation has a strong daily cycle, and its output power is influenced by various meteorological factors. Parameters such as solar radiation intensity, atmospheric temperature, relative humidity, wind speed, wind direction and air pressure affect photovoltaic power generation to different degrees. How to select training data reasonably, while also improving prediction accuracy through different prediction models, is therefore a research topic that attracts much attention.
Disclosure of Invention
The short-term photovoltaic power generation power prediction method and system provided by the invention address the technical problem described above.
To achieve this purpose, the invention adopts the following technical scheme:
A short-term photovoltaic power generation power prediction method comprises the following steps:
S100, classifying the days to be predicted by weather type and, for each weather type, selecting as input sample data the historical meteorological data and photovoltaic power data of the closest days of the same weather type, together with the meteorological data of the prediction day;
S200, preprocessing the input data, including abnormal data cleaning, feature selection and normalization of the historical data;
S300, clustering on the selected meteorological factors with an adaptive K-means algorithm based on the Davies-Bouldin index;
S400, predicting each cluster of data with an LSTM, in combination with the corresponding historical photovoltaic power data;
S500, integrating the prediction results by time point and applying error correction to obtain the final prediction result.
Further, preprocessing the input data in S200 includes:
S201, cleaning the abnormal data present in the data used for predicting photovoltaic power generation power;
S202, performing feature selection on the factors that influence photovoltaic power generation power;
S203, normalizing the data to eliminate the effect of the different units of each kind of data;
wherein S202 specifically includes:
the relationship between each meteorological factor and the photovoltaic power is reflected by the Pearson correlation coefficient between the photovoltaic output power and that meteorological factor, calculated as follows:
Figure BDA0002814755970000021
where r is the Pearson correlation coefficient, ν is the photovoltaic output power, and γ is a meteorological factor.
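For reference, the standard definition of the Pearson correlation coefficient, written with the symbols above, is sketched here. It is assumed, not confirmed, to match the formula in the figure; the sample index i, the sample count N and the overbar for the mean are notation introduced for this sketch.

```latex
r = \frac{\sum_{i=1}^{N} (\nu_i - \bar{\nu})(\gamma_i - \bar{\gamma})}
         {\sqrt{\sum_{i=1}^{N} (\nu_i - \bar{\nu})^{2}}\;
          \sqrt{\sum_{i=1}^{N} (\gamma_i - \bar{\gamma})^{2}}}
```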
Further, in S202, the degree of correlation between a meteorological factor and the photovoltaic power can be judged from the value ranges in the following table:
Figure BDA0002814755970000022
Further, in S203, the input data are normalized according to the following formula:
Figure BDA0002814755970000023
where [Figure BDA0002814755970000024] is the normalized data, and x*, [Figure BDA0002814755970000025] and [Figure BDA0002814755970000026] are the raw input data, the maximum value of the raw input data and the minimum value of the raw input data, respectively.
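Since the stated purpose of the normalization is to map every input to the interval [0,1], the formula is presumably the usual min-max scaling. A sketch under that assumption follows; the symbol names x_norm, x, x_max and x_min are chosen here for readability and are not taken from the figures.

```latex
x_{\mathrm{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}, \qquad 0 \le x_{\mathrm{norm}} \le 1
```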
Further, in S300, clustering is performed on the selected meteorological factors with the adaptive K-means algorithm based on the Davies-Bouldin index.
The K-means algorithm comprises the following steps:
Step 1, randomly selecting k samples from the data set as the initial cluster centers {μ1, μ2, …, μk};
Step 2, calculating the Euclidean distance from each remaining sample to every cluster center and assigning the sample to the nearest cluster center, forming k clusters, where the distance measure is given in (2.3.3):
Figure BDA0002814755970000031
where n is the dimension of the space, and A_k and B_k are the k-th attributes of A and B, respectively;
Step 3, updating each cluster center to the mean of all samples belonging to that cluster;
Step 4, repeating Steps 2 and 3 until the algorithm converges.
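The distance measure referred to as (2.3.3) in Step 2 is described as the Euclidean distance between two samples A and B with n attributes. Its standard form, assumed here to be what the figure shows, is:

```latex
d(A, B) = \sqrt{\sum_{k=1}^{n} (A_k - B_k)^{2}}
```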
Further, S300 also includes:
to select the optimal number of clusters k automatically, a quantitative index is introduced to search for the best clustering of the samples. The key to the proposed adaptive process is cluster evaluation. Many evaluation indexes exist; the Davies-Bouldin index relies only on quantities and features inherent to the data set, which makes it suitable for evaluating K-means clustering;
wherein the Davies-Bouldin index is defined as follows:
Figure BDA0002814755970000032
where [Figure BDA0002814755970000033] and [Figure BDA0002814755970000034] are the average distances from the samples of cluster i and cluster j, respectively, to the corresponding cluster centers; [Figure BDA0002814755970000035] is the Euclidean distance between the centers of cluster i and cluster j. The smaller I_DBI is, the better the clustering performance, and the optimal number of clusters k_best is obtained accordingly.
To avoid generating too many clusters, the number of clusters is limited by a threshold, denoted k_max.
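As a reference point, the commonly used definition of the Davies-Bouldin index is sketched below. The symbols are assumptions made for this sketch, since the original ones appear only in the figures: \bar{S}_i is the average distance from the samples of cluster i to its center, and d_{ij} is the Euclidean distance between the centers of clusters i and j.

```latex
I_{DBI} = \frac{1}{k} \sum_{i=1}^{k} \max_{j \neq i} \frac{\bar{S}_i + \bar{S}_j}{d_{ij}},
\qquad
k_{best} = \arg\min_{2 \le k \le k_{max}} I_{DBI}(k)
```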
Further, the error correction method in S500 includes:
S501, calculating the absolute value |P_(α+1) - P_α| of the photovoltaic power difference between each 2 adjacent sampling points in the training samples, forming the historical photovoltaic power fluctuation sequence ΔP_α;
S502, dividing ΔP_α into 4 intervals, and calculating the probability distribution P_i of each interval and the historical maximum fluctuation max(P_α);
S503, performing a weighted summation over the average value [Figure BDA0002814755970000041] of each fluctuation interval to obtain the comprehensive confidence correction amount CΔP, as shown below:
Figure BDA0002814755970000042
S504, similarly, calculating the power output fluctuation sequence [Figure BDA0002814755970000043] of the prediction day; values in this sequence that exceed max(P_α) are corrected by CΔP, and the new fluctuation sequence and the final predicted output obtained after correction are denoted [Figure BDA0002814755970000044] and [Figure BDA0002814755970000045], respectively; the specific correction is given by formula (2.3.6) and formula (2.3.7):
Figure BDA0002814755970000046
Figure BDA0002814755970000047
Further, the influencing factors in step S202 include: solar radiation intensity, atmospheric temperature, relative humidity, wind speed, wind direction and air pressure.
In another aspect, the invention also discloses a short-term photovoltaic power generation power prediction system, which comprises the following units:
an input sample determining unit, configured to classify the days to be predicted by weather type and, for each weather type, select as input sample data the historical meteorological data and photovoltaic power data of the closest days of the same weather type, together with the meteorological data of the prediction day;
a data preprocessing unit, configured to preprocess the input data, including abnormal data cleaning, feature selection and normalization of the historical data;
a clustering unit, configured to cluster on the selected meteorological factors with an adaptive K-means algorithm based on the Davies-Bouldin index;
a prediction unit, configured to predict each cluster of data with an LSTM, in combination with the corresponding historical photovoltaic power data;
a correcting unit, configured to integrate the prediction results by time point and apply error correction to obtain the final prediction result.
According to the above technical scheme, the short-term photovoltaic power generation power prediction method starts from the selection of the data set and the evaluation indexes, and preprocesses the selected data set through data cleaning, feature selection and normalization. Data cleaning removes abnormal data with the iForest algorithm; feature selection computes the Pearson coefficient between each meteorological factor and the photovoltaic power generation power and keeps the strongly correlated factors as input features of the model; normalization eliminates the adverse effect that differences in the value ranges of different types of input data have on the learning and training of the model. A K-means algorithm based on the Davies-Bouldin index performs cluster analysis on the features. With the main network parameters given, the error-corrected short-term photovoltaic power prediction results are presented and compared with those of a standalone BP method and a standalone LSTM method, which verifies that the proposed method achieves better prediction accuracy.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a flow chart of K-means based on Davies-Bouldin index according to an embodiment of the present invention;
FIG. 3 is a graph of irradiance versus photovoltaic power generation power for an embodiment of the present invention;
FIG. 4 is a graph of temperature versus photovoltaic power generation power for an embodiment of the present invention;
FIG. 5 is a graph of humidity versus photovoltaic power generation power for an embodiment of the present invention;
FIG. 6 shows the Davies-Bouldin index for different clustering schemes according to an embodiment of the present invention;
FIG. 7 is a visualization of the clustering result according to an embodiment of the present invention;
FIG. 8 shows the prediction result for sunny days according to an embodiment of the present invention;
FIG. 9 shows the prediction result for cloudy days according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention.
Deep learning algorithms have a multi-layer internal structure and are trained through repeated learning, which makes them well suited to the photovoltaic prediction problem.
As shown in FIG. 1, the short-term photovoltaic power generation power prediction method of this embodiment includes the following steps:
S100, classifying the days to be predicted by weather type and, for each weather type, selecting as input sample data the historical meteorological data and photovoltaic power data of the closest days of the same weather type, together with the meteorological data of the prediction day;
S200, preprocessing the input data, including abnormal data cleaning, feature selection and normalization of the historical data;
S300, clustering on the selected meteorological factors with an adaptive K-means algorithm based on the Davies-Bouldin index;
S400, predicting each cluster of data with an LSTM, in combination with the corresponding historical photovoltaic power data;
S500, integrating the prediction results by time point and applying error correction to obtain the final prediction result.
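How the per-cluster predictions of S400 are merged back into one time series in S500 can be illustrated with a minimal Python sketch. Everything named here is hypothetical: `fit_mean_predictor` is a trivial stand-in for the per-cluster LSTM, and the sample values are invented. The error correction of S500 is left out.

```python
from collections import defaultdict
from statistics import mean


def fit_mean_predictor(samples):
    """Placeholder for the per-cluster LSTM of S400 (an assumption for
    illustration): always predicts the mean historical power of its cluster."""
    average = mean(power for _, power in samples)
    return lambda features: average


def predict_in_time_order(train, train_labels, test, test_labels, fit=fit_mean_predictor):
    # S400: group the training samples by cluster and fit one model per cluster.
    groups = defaultdict(list)
    for sample, label in zip(train, train_labels):
        groups[label].append(sample)
    models = {label: fit(samples) for label, samples in groups.items()}
    # S500 (integration part): the test points are visited in their original
    # time order, so the per-cluster predictions form one time series again.
    return [models[label](features) for features, label in zip(test, test_labels)]


# Invented (features, power) pairs; the features could be normalized irradiance and humidity.
train = [((0.9, 0.2), 50.0), ((0.8, 0.3), 46.0), ((0.2, 0.8), 10.0), ((0.1, 0.9), 6.0)]
train_labels = [0, 0, 1, 1]
test = [(0.15, 0.85), (0.85, 0.25), (0.2, 0.7)]  # prediction-day points in time order
test_labels = [1, 0, 1]
forecast = predict_in_time_order(train, train_labels, test, test_labels)
print(forecast)  # [8.0, 48.0, 8.0]
```

Passing a different `fit` callable swaps in a real sequence model without changing the grouping and merging logic.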
The principle of the invention is explained below:
1.1 Data preprocessing method
(1) Cleaning of abnormal data
The embodiment uses the iForest algorithm to clean abnormal data from the training samples.
(2) Feature selection
Many factors influence photovoltaic power generation power, and in theory the more input variables there are, the stronger the recognition capability. In practice, however, too many variables tend to cause problems such as overfitting, and invalid features can lower rather than raise prediction accuracy while making the prediction process more complicated. Accurate and carefully selected input data are therefore the key to improving prediction accuracy. The relationship between each meteorological factor and the photovoltaic power is reflected by the Pearson correlation coefficient between the photovoltaic output power and that meteorological factor, calculated as follows:
Figure BDA0002814755970000061
where r is the Pearson correlation coefficient, ν is the photovoltaic output power, and γ is a meteorological factor.
The degree of correlation between a meteorological factor and the photovoltaic power can be judged from the value ranges in Table 1.
Table 1 Degree of correlation between meteorological factors and photovoltaic power
Figure BDA0002814755970000071
In this embodiment, the meteorological factors whose correlation with the photovoltaic power is extremely strong, strong or moderate are selected as input features for subsequent processing.
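The selection rule can be sketched in Python as follows. The value ranges of Table 1 are not available in this text, so the cut-off `threshold=0.4` for "moderate or stronger" is a hypothetical value, and the data are invented to give clear-cut coefficients.

```python
import math


def pearson(v, g):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(v)
    mean_v, mean_g = sum(v) / n, sum(g) / n
    cov = sum((a - mean_v) * (b - mean_g) for a, b in zip(v, g))
    spread_v = math.sqrt(sum((a - mean_v) ** 2 for a in v))
    spread_g = math.sqrt(sum((b - mean_g) ** 2 for b in g))
    return cov / (spread_v * spread_g)


def select_features(power, factors, threshold=0.4):
    # threshold is a hypothetical cut-off; the real ranges are in Table 1.
    scores = {name: pearson(power, values) for name, values in factors.items()}
    # Keep factors by |r| so that strong negative correlations also qualify.
    return {name: r for name, r in scores.items() if abs(r) >= threshold}


power = [0.0, 2.0, 4.0, 6.0, 8.0]
factors = {
    "irradiance": [0.0, 1.0, 2.0, 3.0, 4.0],         # rises with power
    "relative_humidity": [9.0, 7.0, 5.0, 3.0, 1.0],  # falls as power rises
    "wind_direction": [1.0, -1.0, 0.0, -1.0, 1.0],   # unrelated to power
}
selected = select_features(power, factors)
print(sorted(selected))  # ['irradiance', 'relative_humidity']
```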
(3) Normalization
To eliminate the adverse effect that differences in the value ranges of different types of input data have on the learning and training of the model, the input data need to be normalized:
Figure BDA0002814755970000072
where [Figure BDA0002814755970000073] is the normalized data, and x*, [Figure BDA0002814755970000074] and [Figure BDA0002814755970000075] are the raw input data, the maximum value of the raw input data and the minimum value of the raw input data, respectively.
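A minimal Python sketch of this step, assuming the formula is ordinary min-max scaling to [0,1]; the function name and the temperature values are illustrative only.

```python
def min_max_normalize(values):
    """Scale a sequence to [0, 1] using its own minimum and maximum."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant series carries no information; map it to 0 to avoid 0/0.
        return [0.0 for _ in values]
    return [(x - lowest) / (highest - lowest) for x in values]


temperature = [12.0, 18.0, 24.0, 30.0]  # invented atmospheric temperatures
normalized = min_max_normalize(temperature)
print(normalized)
```

Each input type (irradiance, temperature, humidity) would be scaled separately, so that no factor dominates the others merely because of its unit.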
1.2 Principle of the adaptive K-means algorithm
Conventional K-means is not suitable for clustering the input data set directly, because the number of clusters cannot be chosen sensibly without prior knowledge of the data set. The embodiment therefore proposes an adaptive K-means that sets the number of clusters automatically from the input data set; its main idea is a distance-based iterative process.
The K-means algorithm comprises the following steps:
Step 1: randomly select k samples from the data set as the initial cluster centers {μ1, μ2, …, μk}.
Step 2: calculate the Euclidean distance from each remaining sample to every cluster center and assign the sample to the nearest cluster center, forming k clusters, where the distance measure is given in (2.3.3).
Figure BDA0002814755970000081
where n is the dimension of the space, and A_k and B_k are the k-th attributes of A and B, respectively.
Step 3: update each cluster center to the mean of all samples belonging to that cluster.
Step 4: repeat Steps 2 and 3 until the algorithm converges.
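The four steps can be sketched in plain Python as below. This is an illustrative implementation, not the one used in the embodiment; the convergence test (assignments stop changing), the iteration cap `max_iter` and the fixed `seed` are choices made for the sketch.

```python
import math
import random


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(samples, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Step 1: pick k distinct samples as the initial cluster centers.
    centers = rng.sample(samples, k)
    labels = None
    for _ in range(max_iter):
        # Step 2: assign every sample to its nearest center.
        new_labels = [min(range(k), key=lambda c: euclidean(s, centers[c])) for s in samples]
        # Step 4: converged once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: move each center to the mean of its members.
        for c in range(k):
            members = [s for s, label in zip(samples, labels) if label == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centers


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centers = kmeans(points, k=2)
print(labels, sorted(centers))
```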
To select the optimal number of clusters k automatically, a quantitative index is introduced to search for the best clustering of the samples. The key to the proposed adaptive process is cluster evaluation. Many evaluation indexes exist; the Davies-Bouldin index relies only on quantities and features inherent to the data set and is suitable for evaluating K-means clustering. It is defined as follows:
Figure BDA0002814755970000082
where [Figure BDA0002814755970000083] and [Figure BDA0002814755970000084] are the average distances from the samples of cluster i and cluster j, respectively, to the corresponding cluster centers, and [Figure BDA0002814755970000085] is the Euclidean distance between the centers of cluster i and cluster j. The smaller I_DBI is, the better the clustering performance, and the optimal number of clusters k_best is obtained accordingly.
To avoid generating too many clusters, the number of clusters is limited by a threshold, denoted k_max. The adaptive clustering process is shown in FIG. 2.
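The selection of k_best can be sketched in Python as below, assuming the standard Davies-Bouldin definition (mean over clusters of the worst ratio of summed within-cluster scatter to center distance). The candidate labelings are written out by hand here; in the adaptive process they would come from running K-means for each k from 2 to k_max.

```python
import math


def davies_bouldin(samples, labels):
    clusters = sorted(set(labels))
    centers, scatter = {}, {}
    for c in clusters:
        members = [s for s, label in zip(samples, labels) if label == c]
        centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
        # Average distance from the members of cluster c to its center.
        scatter[c] = sum(math.dist(m, centers[c]) for m in members) / len(members)
    total = 0.0
    for i in clusters:
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


def best_k(samples, labelings):
    """labelings maps each candidate k (2..k_max) to its cluster labels."""
    scores = {k: davies_bouldin(samples, labels) for k, labels in labelings.items()}
    return min(scores, key=scores.get), scores


# Three tight, well-separated groups of invented points.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0), (20.0, 0.0), (20.0, 2.0)]
labelings = {
    2: [0, 0, 0, 0, 1, 1],  # first two groups merged
    3: [0, 0, 1, 1, 2, 2],  # one cluster per group
}
k_best, scores = best_k(points, labelings)
print(k_best, scores)
```

With these points the three-cluster labeling has the lower index, so `k_best` is 3.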
1.3 Error correction method
The photovoltaic output power fluctuates in a characteristic way. On this basis, the predicted photovoltaic power is error-corrected according to the historical power output fluctuation; the specific process is as follows.
1) Calculate the absolute value |P_(α+1) - P_α| of the photovoltaic power difference between each 2 adjacent sampling points on the similar days (training samples), forming the historical photovoltaic power fluctuation sequence ΔP_α.
2) Divide ΔP_α into 4 intervals, and calculate the probability distribution P_i of each interval and the historical maximum fluctuation max(P_α).
3) Perform a weighted summation over the average value [Figure BDA0002814755970000086] of each fluctuation interval to obtain the comprehensive confidence correction amount CΔP, as shown below.
Figure BDA0002814755970000091
4) Similarly, calculate the power output fluctuation sequence [Figure BDA0002814755970000092] of the prediction day. Values in this sequence that exceed max(P_α) are corrected by CΔP, and the new fluctuation sequence and the final predicted output obtained after correction are denoted [Figure BDA0002814755970000093] and [Figure BDA0002814755970000094], respectively. The specific correction is given by formula (2.3.6) and formula (2.3.7).
Figure BDA0002814755970000095
Figure BDA0002814755970000096
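The procedure can be illustrated with a Python sketch, but only under explicit assumptions, because the formulas for CΔP, (2.3.6) and (2.3.7) are not available in this text. Assumed here: the 4 intervals are equal-width bins over [0, maximum fluctuation]; the weights of the summation are the interval probabilities; and an excessive jump in the forecast is replaced by a jump of size CΔP in the same direction, with the corrected output rebuilt cumulatively. All names and data are hypothetical.

```python
def confidence_correction(history, bins=4):
    # Step 1: absolute power change between adjacent sampling points.
    fluctuation = [abs(b - a) for a, b in zip(history, history[1:])]
    max_fluctuation = max(fluctuation)
    # Step 2: equal-width intervals over [0, max_fluctuation] (assumed binning).
    width = max_fluctuation / bins
    groups = [[] for _ in range(bins)]
    for value in fluctuation:
        index = min(int(value / width), bins - 1) if width > 0 else 0
        groups[index].append(value)
    # Step 3: weight each interval mean by the interval's empirical probability
    # (assumed weighting).
    c_delta_p = sum(
        (len(g) / len(fluctuation)) * (sum(g) / len(g)) for g in groups if g
    )
    return c_delta_p, max_fluctuation


def correct_forecast(forecast, c_delta_p, max_fluctuation):
    corrected = [forecast[0]]
    for previous, current in zip(forecast, forecast[1:]):
        step = current - previous
        if abs(step) > max_fluctuation:
            # Assumed rule: shrink an implausible jump to C*dP, keeping its sign.
            step = c_delta_p if step > 0 else -c_delta_p
        corrected.append(corrected[-1] + step)
    return corrected


history = [0.0, 1.0, 3.0, 6.0, 10.0]  # invented similar-day power values
c_delta_p, max_fluctuation = confidence_correction(history)
corrected = correct_forecast([0.0, 2.0, 9.0, 10.0], c_delta_p, max_fluctuation)
print(c_delta_p, max_fluctuation, corrected)  # 2.5 4.0 [0.0, 2.0, 4.5, 5.5]
```

Under the assumed probability weighting, the weighted sum of interval means is algebraically equal to the overall mean fluctuation, so the binning only matters if the original formula weights the intervals differently.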
The photovoltaic output of a certain area during the 7-day National Day holiday of 2018 is predicted. Field-measured data for 6 meteorological factors in the area (solar radiation intensity, atmospheric temperature, relative humidity, wind speed, wind direction and air pressure) are selected together with the photovoltaic power generation power data. The sampling period is 15 min and the prediction window is 6:30-17:30, i.e. 45 sampling points per day. The weather of the 7 holiday days is divided into two types, sunny and cloudy, and for each type the data of the 10 similar days closest in time to the prediction days are selected as training samples, 900 groups of data in total.
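The sample counts follow directly from the sampling scheme, as this short Python check shows. The calendar date is arbitrary and only the clock times matter; the variable names are illustrative.

```python
from datetime import datetime, timedelta

start = datetime(2018, 10, 1, 6, 30)
end = datetime(2018, 10, 1, 17, 30)

# 15-minute sampling with both end points included.
points_per_day = (end - start) // timedelta(minutes=15) + 1
training_groups = 2 * 10 * points_per_day  # 2 weather types x 10 similar days
test_groups = 7 * points_per_day           # 7 prediction days

print(points_per_day, training_groups, test_groups)  # 45 900 315
```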
As in short-term load prediction, the MAPE (mean absolute percentage error) is used to evaluate the quality of the short-term photovoltaic power generation power model, and the RMSE (root mean square error) is used to reflect the prediction accuracy.
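The two metrics are not defined further in the text; the Python sketch below uses their standard definitions, with invented power values.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Undefined where actual == 0,
    so zero-power samples would need to be excluded beforehand."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the same unit as the power values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
mape_value = mape(actual, predicted)
rmse_value = rmse(actual, predicted)
print(round(mape_value, 2), round(rmse_value, 2))  # 6.67 12.91
```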
The invention is described in detail below with reference to specific applications:
2.1 Data preprocessing
Data preprocessing refers to the processing applied to the data before the main processing. First, the abnormal data present in the data used for predicting photovoltaic power generation power are cleaned; then, feature selection is performed on the many meteorological factors that influence photovoltaic power generation; finally, normalization is applied to eliminate the effect of the different units of each kind of data.
2.2 Data cleaning
Accurate and reliable data are the basis of prediction, yet many factors can corrupt meteorological and photovoltaic power data, such as communication failures, equipment faults and manual power curtailment. Using such abnormal data directly for prediction inevitably reduces the prediction accuracy of photovoltaic power generation power and adversely affects the operation and scheduling of the power grid.
This embodiment implements the iForest algorithm with ensemble.IsolationForest from Python's sklearn package and uses it to clean abnormal data from the training samples. With the main parameter settings of the algorithm, 855 of the 900 groups of training samples remain after cleaning.
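A minimal sketch of such a cleaning step with scikit-learn is given below. The parameter values of the embodiment are not stated, so `contamination=0.05` is an assumption, chosen only because removing 45 of 900 samples corresponds to 5%; `n_estimators`, `random_state` and the synthetic data are likewise illustrative.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Synthetic stand-in for 900 training samples with 3 columns
# (for example irradiance, temperature and power after scaling).
regular = rng.normal(loc=0.0, scale=1.0, size=(855, 3))
abnormal = rng.uniform(low=8.0, high=10.0, size=(45, 3))
samples = np.vstack([regular, abnormal])

# contamination is the assumed share of abnormal samples (5% here).
forest = IsolationForest(n_estimators=100, contamination=0.05, random_state=0)
flags = forest.fit_predict(samples)  # +1 marks normal samples, -1 abnormal ones
cleaned = samples[flags == 1]

print(len(samples), "->", len(cleaned))
```

The contamination value fixes the fraction of samples flagged, so it determines how many groups remain after cleaning.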
2.3 Feature selection
Photovoltaic power generation has a strong daily cycle, and its output power is influenced by various meteorological factors. Parameters such as solar radiation intensity, atmospheric temperature, relative humidity, wind speed, wind direction and air pressure affect photovoltaic power generation to different degrees. Accurate and detailed input data are the key to improving prediction accuracy, but too much input data makes the prediction process more complicated and lowers the prediction accuracy.
The embodiment uses the training sample data to calculate the Pearson correlation coefficient between the photovoltaic output power and each meteorological factor, as shown in Table 2.
Table 2 Correlation coefficients between photovoltaic output power and meteorological factors
Figure BDA0002814755970000101
The results in Table 2 show that, over the period covered by the training samples in this area, irradiance and atmospheric temperature are strongly correlated with photovoltaic output power, relative humidity is strongly (negatively) correlated, air pressure is moderately correlated, and wind speed and wind direction are weakly correlated. The strongly correlated meteorological factors are plotted against photovoltaic power generation power in FIG. 3, FIG. 4 and FIG. 5.
FIG. 3 shows solar irradiance and photovoltaic power generation power over the course of a day. Their trends are almost identical, indicating a strong linear relationship. It is precisely the randomness and intermittency of irradiance that make photovoltaic power generation fluctuating and unstable. Because irradiance directly affects photovoltaic power generation, it serves as an important input feature for photovoltaic power prediction.
FIG. 4 shows atmospheric temperature and photovoltaic power generation power over the course of a day; their trends are similar. When there is little shading in the air, atmospheric temperature is closely related to irradiance, so with the temperature of the photovoltaic module held constant, photovoltaic power generation power increases as atmospheric temperature rises; the two are positively correlated with a strong quasi-linear relationship. Changes in temperature cause slight changes in photovoltaic power generation power and therefore have a certain influence.
FIG. 5 shows relative humidity and photovoltaic power generation power over the course of a day; the two change in roughly opposite directions. Generally, when relative humidity is high, air mobility is poor and cloud cover is dense, which reduces irradiance and hence photovoltaic power generation power, so relative humidity and photovoltaic power generation power are negatively correlated.
Based on the above analysis, the embodiment selects three meteorological factors, namely solar radiation intensity, relative humidity and atmospheric temperature, as input data of the prediction model, with photovoltaic power generation power as the output data.
Normalization
Data normalization is a basic task of data analysis; its purpose is to map the data uniformly to the interval [0,1]. Because the value ranges of the various dimensions of data used for predicting photovoltaic power generation differ greatly, normalization can both speed up the convergence of the model and greatly improve its accuracy. To this end, the embodiment processes each group of data according to equation (2.3.2) by directly calling the MATLAB function mapminmax.
Adaptive K-means clustering process
Clustering analysis is performed on the irradiation intensity, the atmospheric temperature and the relative humidity using a K-means algorithm based on the Davies-Bouldin index, so that a prediction model can be established for each category. The data set comprises 855 groups of preprocessed training data and 315 groups of test data for the prediction day, 1170 groups in total. The Davies-Bouldin index results for the different clustering schemes are shown in Fig. 6.
As can be seen from Fig. 6, the Davies-Bouldin index reaches its minimum of 1.3789 when the number of clusters is 3, which is therefore the optimal cluster number. The preprocessed data are thus divided into three clusters (342 groups in cluster 1, 364 in cluster 2 and 464 in cluster 3). For further analysis, the clustering results are visualized in Fig. 7.
As can be seen from Fig. 7, the points of the 3 clusters are almost never intermixed or overlapping, which indicates a good clustering effect; the data in cluster 1 are the most concentrated, which benefits prediction. Clusters 2 and 3 contain some scattered points, but these are few and have little effect on subsequent prediction. In general, the clustering results show that the training samples and the prediction-day samples selected by the embodiment of the invention fit each other well, which is important for improving the prediction accuracy of the photovoltaic power generation power.
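The K-means procedure recited in claim 5 (random initial centers, assignment by Euclidean distance, center update by cluster mean, repetition until convergence) can be sketched in plain Python as follows. The function names, the `seed` and `max_iter` parameters and the sample points are assumptions for illustration only.

```python
import math
import random

def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # step 1: k random samples as initial centers
    labels = None
    for _ in range(max_iter):
        # step 2: assign every sample to its nearest center
        new_labels = [min(range(k), key=lambda c: euclidean(p, centers[c]))
                      for p in points]
        if new_labels == labels:
            break  # step 4: assignments no longer change, so the algorithm has converged
        labels = new_labels
        # step 3: move each center to the mean of the samples assigned to it
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # assumption: an empty cluster keeps its previous center
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

# Made-up normalized samples (irradiance, temperature, humidity): two obvious groups.
samples = [
    (0.10, 0.10, 0.90), (0.12, 0.08, 0.88), (0.08, 0.12, 0.92),
    (0.90, 0.85, 0.20), (0.88, 0.90, 0.22), (0.92, 0.80, 0.18),
]
labels, centers = kmeans(samples, k=2)
```

Because the initial centers are random, a single run can end in a poor local optimum on harder data; running several seeds and keeping the best result is a common safeguard, though the source does not say whether the embodiment does so.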
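The selection of the cluster number by the smallest Davies-Bouldin index, bounded by the threshold k_max of claim 6, can be sketched as below. The sketch uses the standard definition of the index (mean over clusters of the worst ratio of summed within-cluster scatter to center distance); the function names, the candidate labelings and the sample points are assumptions and do not reproduce the data behind Fig. 6.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def davies_bouldin(points, labels):
    """Davies-Bouldin index of a labeling with at least two distinct clusters."""
    clusters = sorted(set(labels))
    centers, scatter = {}, {}
    for c in clusters:
        members = [p for p, lab in zip(points, labels) if lab == c]
        center = tuple(sum(col) / len(members) for col in zip(*members))
        centers[c] = center
        # average distance from the cluster's samples to its center
        scatter[c] = sum(euclidean(p, center) for p in members) / len(members)
    total = 0.0
    for i in clusters:
        total += max((scatter[i] + scatter[j]) / euclidean(centers[i], centers[j])
                     for j in clusters if j != i)
    return total / len(clusters)

def select_best_k(points, candidates, k_max):
    """Pick the k with the smallest index among candidate labelings with 2 <= k <= k_max."""
    scores = {k: davies_bouldin(points, labels)
              for k, labels in candidates.items() if 2 <= k <= k_max}
    return min(scores, key=scores.get), scores

# Made-up normalized samples forming three pairs (assumption).
points = [(0.1, 0.1, 0.9), (0.1, 0.3, 0.9),
          (0.5, 0.5, 0.5), (0.5, 0.7, 0.5),
          (0.9, 0.8, 0.1), (0.9, 1.0, 0.1)]
# Hypothetical labelings, as K-means might return them for k = 2 and k = 3.
candidates = {2: [0, 0, 0, 0, 1, 1], 3: [0, 0, 1, 1, 2, 2]}
best_k, scores = select_best_k(points, candidates, k_max=5)
```

In a full pipeline the candidate labelings would come from running K-means once per k; they are hard-coded here so that the index calculation can be read on its own.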
Short-term photovoltaic power generation power prediction simulation result and analysis
As in short-term load prediction, the number of stacked layers, the number of hidden-layer neurons, the Dropout parameter and the number of samples processed per batch in the LSTM network are tuned to improve the prediction performance of the network; the settings obtained after multiple tests are shown in Table 3 below.
Table 3. Main parameter settings of the LSTM network
Figure BDA0002814755970000121
An LSTM model is then established for each cluster and is trained and used for prediction with that cluster's meteorological data and corresponding photovoltaic power data. The prediction results of the clusters are integrated by time point and corrected with the error correction method of Section 2.3.4 to obtain the final short-term photovoltaic power prediction result.
A single BP model and a single LSTM model are used for comparison. To compare the prediction results clearly and intuitively, 3 sunny days and 4 cloudy days are predicted, and 1 day of each weather type is taken for quantitative analysis. Figs. 8 and 9 show the predicted and actual photovoltaic power curves for October 6 (sunny) and October 2 (cloudy), respectively. The prediction error statistics of the 3 methods are given in Table 4.
Table 4. Error statistics of the three methods
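The first three steps of that error correction (S501 to S503 in claim 7) can be sketched as follows. The claim text does not state how the 4 intervals are bounded, how each interval's mean value is taken, or what the weights are, so this sketch assumes equal-width intervals, uses each interval's midpoint as its mean value, and weights it by the interval's empirical probability. All names and sample values are hypothetical, and the correction of the predicted sequence itself (S504, formulas (2.3.6) and (2.3.7)) is not covered.

```python
def fluctuation_sequence(power):
    """S501: absolute power difference between adjacent sampling points."""
    return [abs(b - a) for a, b in zip(power, power[1:])]

def confidence_correction(power, n_intervals=4):
    """S502 and S503 under the stated assumptions; returns (correction, max fluctuation)."""
    deltas = fluctuation_sequence(power)
    max_delta = max(deltas)
    if max_delta == 0:
        return 0.0, 0.0
    width = max_delta / n_intervals  # ASSUMPTION: equal-width intervals over [0, max]
    counts = [0] * n_intervals
    for d in deltas:
        idx = min(int(d / width), n_intervals - 1)  # the maximum falls in the last interval
        counts[idx] += 1
    correction = 0.0
    for i, count in enumerate(counts):
        probability = count / len(deltas)  # S502: probability of interval i
        midpoint = (i + 0.5) * width       # ASSUMPTION: interval mean taken as its midpoint
        correction += probability * midpoint  # S503: weighted summation
    return correction, max_delta

# Made-up historical power samples (assumption).
history = [0.0, 1.0, 3.0, 3.0, 7.0]
c_delta_p, max_fluctuation = confidence_correction(history)
```

If the interval mean were instead the average of the fluctuations that fall inside the interval, the probability-weighted sum would reduce to the overall mean fluctuation, which is one reason the midpoint reading is used here.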
Figure BDA0002814755970000122
Figure BDA0002814755970000131
Fig. 8 shows that on a sunny day the photovoltaic output curve has a certain regularity, and all 3 prediction models perform well. Fig. 9 shows that, unlike the sunny case, the trajectory and size of clouds are hard to predict under cloudy conditions, so the prediction curves of the 3 models deviate considerably from the actual curve in some time periods. Among them, the prediction curve of the adaptive K-means-LSTM model is closest to the overall trend of the actual power curve.
As can be seen from Table 4, the prediction errors (MAPE and RMSE) of the 3 models are smaller on sunny days (days 4, 5 and 6), where the adaptive K-means-LSTM model gives the most accurate predictions. The prediction errors on cloudy days (days 1, 2, 3 and 7) are larger, but the adaptive K-means-LSTM model is clearly superior to the other two models, which shows that it achieves higher accuracy under the same weather conditions and remains suitable when weather conditions fluctuate.
In summary, the embodiment of the invention studies and analyzes a short-term photovoltaic power prediction example. Starting from the selection of the data set and the evaluation indexes, the selected data set is preprocessed by data cleaning, feature selection and normalization. Data cleaning removes abnormal data with the iForest algorithm; feature selection takes as input features of the model the meteorological factors that correlate strongly with photovoltaic power, based on the Pearson coefficient computed between each factor and the power; normalization eliminates the adverse effect that differing value ranges of the input data types have on model learning and training. A K-means algorithm based on the Davies-Bouldin index then clusters the features, the error-corrected short-term photovoltaic power prediction result is given for the stated main network parameters, and a comparison with a single BP method and a single LSTM method verifies that the proposed method predicts more accurately.
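For reference, the two error measures named above can be computed as in the following sketch, which uses their standard definitions. Skipping samples whose actual power is zero in MAPE is an assumption made here (night-time output would otherwise divide by zero); the embodiment's own formulas may normalize differently.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error in percent, skipping zero actual values."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)

def rmse(actual, predicted):
    """Root mean square error, in the same unit as the inputs."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

# Made-up actual and predicted power values (assumption).
actual_power = [100.0, 200.0]
predicted_power = [110.0, 180.0]
error_mape = mape(actual_power, predicted_power)
error_rmse = rmse(actual_power, predicted_power)
```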
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (9)

1. A short-term photovoltaic power generation power prediction method, characterized by comprising the following steps:
S100, taking into account the weather type of the day to be predicted, classifying days into different weather types, and selecting as input sample data the historical meteorological and photovoltaic power data of the closest days of the same weather type, used as reference, together with the meteorological data of the prediction day;
S200, preprocessing the input data, including abnormal data cleaning, feature selection and normalization of the historical data;
S300, clustering according to the selected meteorological factors by using an adaptive K-means algorithm based on the Davies-Bouldin index;
S400, predicting on the clustered data with an LSTM, in combination with the corresponding historical photovoltaic power data;
S500, integrating the prediction results by time point and performing error correction to obtain the final prediction result.
2. The short term photovoltaic power generation power prediction method of claim 1, characterized by: the preprocessing step of the input data in the S200 includes:
S201, performing data cleaning on the abnormal data present in the data used for photovoltaic power prediction;
S202, performing feature selection on the factors influencing the photovoltaic power generation power;
S203, performing normalization to eliminate the unit limitations of the data;
wherein, the S202 specifically includes:
the relationship between each meteorological factor and the photovoltaic power is reflected by calculating the Pearson correlation coefficient between the photovoltaic output power and each meteorological factor, and the calculation formula is as follows:
Figure FDA0002814755960000011
in the formula, r represents a Pearson correlation coefficient, ν represents photovoltaic output power, and γ represents a meteorological factor.
3. The short term photovoltaic power generation power prediction method of claim 2, characterized by: in S202, the correlation between the meteorological factors and the photovoltaic power can be determined through the following value ranges, as shown in the following table:
Figure FDA0002814755960000012
Figure FDA0002814755960000021
4. The short term photovoltaic power generation power prediction method of claim 2, characterized by: in S203, the input data is normalized, specifically according to the following formula:
Figure FDA0002814755960000022
in the formula,
Figure FDA0002814755960000023
is the normalized data, and x*,
Figure FDA0002814755960000027
and
Figure FDA0002814755960000028
are the raw input data and the maximum value and the minimum value of the raw input data, respectively.
5. The short term photovoltaic power generation power prediction method of claim 1, characterized by:
the S300, clustering by using a self-adaptive K-means algorithm of Davies Bouldin indexes according to the selected meteorological factors;
the K-means algorithm comprises the following steps:
step 1, randomly selecting k samples from the data set as the initial clustering centers {μ_1, μ_2, …, μ_k};
step 2, calculating the Euclidean distances from the remaining samples to the clustering centers, and assigning each sample to its nearest clustering center to form k clusters, wherein the distance measure is given in (2.3.3);
Figure FDA0002814755960000026
in the formula, n represents the dimension of the space, and A_k and B_k are the kth attributes of A and B, respectively;
step 3, updating each clustering center, using the distance measure, to the mean value of all samples belonging to its cluster;
step 4, repeating steps 2 and 3 until the algorithm converges.
6. The short term photovoltaic power generation power prediction method of claim 5, characterized by:
the S300 further includes:
in order to automatically select the optimal clustering number K, quantitative indexes are introduced to search the optimal clustering of the samples, the key of the proposed self-adaptive process is clustering evaluation, related indexes are numerous, and the Davies-Bouldin index uses the inherent quantity and characteristics of a data set, so that the method is suitable for K-means clustering evaluation;
wherein the Davies-Bouldin index is defined as follows:
Figure FDA0002814755960000031
in the formula,
Figure FDA0002814755960000032
and
Figure FDA0002814755960000033
respectively represent the average distance from the samples of cluster i and of cluster j to the center of the corresponding cluster;
Figure FDA0002814755960000034
represents the Euclidean distance between the centers of cluster i and cluster j; the smaller I_DBI is, the better the clustering performance, from which the optimal cluster number k_best can be obtained;
to avoid generating too many clusters, the number of clusters is limited by a threshold, denoted k_max.
7. The short term photovoltaic power generation power prediction method of claim 1, characterized by:
the error correction method in the step S500 comprises the following steps:
S501, calculating the absolute value |P_{α+1} − P_α| of the photovoltaic power difference between 2 adjacent sampling points in the training samples, forming the historical photovoltaic power fluctuation sequence ΔP_α;
S502, dividing ΔP_α into 4 intervals, and calculating the probability distribution P_i of each interval and the historical maximum fluctuation max(P_α);
S503, performing a weighted summation of the fluctuation interval mean values
Figure FDA0002814755960000035
to obtain a comprehensive confidence correction quantity CΔP, as shown below;
Figure FDA0002814755960000036
S504, similarly, calculating the fluctuation sequence of the predicted-day power output
Figure FDA0002814755960000037
correcting the values in the sequence that exceed max(P_α) by CΔP, and denoting the new fluctuation sequence and the final predicted output obtained after correction as
Figure FDA0002814755960000038
and
Figure FDA0002814755960000039
respectively; the specific correction measures are shown in formula (2.3.6) and formula (2.3.7);
Figure FDA00028147559600000310
Figure FDA00028147559600000311
8. The short term photovoltaic power generation power prediction method of claim 2, characterized by: the influencing factors in step S202 include: solar radiation intensity, atmospheric temperature, relative humidity, wind speed, wind direction, air pressure.
9. A short-term photovoltaic power generation power prediction system is characterized by comprising the following units:
the input sample determining unit is used for taking into account the weather type of the day to be predicted, classifying days into different weather types, and selecting as input sample data the historical meteorological and photovoltaic power data of the closest days of the same weather type, used as reference, together with the meteorological data of the prediction day;
the data preprocessing unit is used for preprocessing the input data, including abnormal data cleaning, feature selection and normalization of the historical data;
the clustering unit is used for clustering according to the selected meteorological factors by using an adaptive K-means algorithm based on the Davies-Bouldin index;
the prediction unit is used for predicting on the clustered data with an LSTM, in combination with the corresponding historical photovoltaic power data;
and the correcting unit is used for integrating the prediction results by time point and performing error correction to obtain the final prediction result.
CN202011395236.1A 2020-12-03 2020-12-03 Short-term photovoltaic power generation power prediction method and system Pending CN112561139A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011395236.1A CN112561139A (en) 2020-12-03 2020-12-03 Short-term photovoltaic power generation power prediction method and system

Publications (1)

Publication Number Publication Date
CN112561139A true CN112561139A (en) 2021-03-26

Family

ID=75047430

Country Status (1)

Country Link
CN (1) CN112561139A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210326