CN116702937A - Photovoltaic output day-ahead prediction method based on K-means mean value clustering and BP neural network optimization - Google Patents

Info

Publication number
CN116702937A
Authority
CN
China
Prior art keywords
day
data
photovoltaic output
neural network
historical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211610935.2A
Other languages
Chinese (zh)
Inventor
李楠
黄凯
王攀
刘黎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingmen Power Supply Co of State Grid Hubei Electric Power Co Ltd
Original Assignee
Jingmen Power Supply Co of State Grid Hubei Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingmen Power Supply Co of State Grid Hubei Electric Power Co Ltd
Priority to CN202211610935.2A
Publication of CN116702937A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/004: Artificial life, i.e. computing arrangements simulating life
    • G06N3/006: Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06N3/084: Backpropagation, e.g. using gradient descent
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/12: Computing arrangements based on biological models using genetic models
    • G06N3/126: Evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06: Energy or water supply
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04: INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S: SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00: Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50: Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Abstract

The invention relates to a photovoltaic output day-ahead prediction method based on K-means mean value clustering and BP neural network optimization, comprising the following steps: step 1, collecting historical meteorological data of the photovoltaic power stations in the region to be predicted together with the corresponding historical photovoltaic output data, and preprocessing both using an average interpolation method; step 2, calculating the Pearson correlation coefficient between the historical meteorological data and the historical photovoltaic output data, retaining the data whose correlation coefficient is larger than a threshold value, and constructing a training set; step 3, clustering the historical photovoltaic output data by K-means clustering and dividing it into similar-day data sets; step 4, optimizing the weights, thresholds and number of hidden-layer nodes of the BP neural network with a genetic algorithm and an ant colony algorithm, and constructing a photovoltaic output day-ahead prediction model; step 5, inputting the first seventy percent of each similar-day data set into the prediction model as training data, training the neural network, and saving the model with the highest photovoltaic output prediction accuracy.

Description

Photovoltaic output day-ahead prediction method based on K-means mean value clustering and BP neural network optimization
Technical Field
The invention belongs to the technical field of power systems and automation thereof, and particularly relates to a photovoltaic output day-ahead prediction method based on K-means mean value clustering and an optimized BP neural network.
Background
Distributed photovoltaic generation is a technically mature renewable energy source that is now applied to grid power generation on a large scale. Because photovoltaic output is random, fluctuating and intermittent, large-scale integration of photovoltaics into the grid can cause a series of problems for the safe and stable operation of the power system, such as voltage and frequency deviation, voltage fluctuation and disconnection from the grid. Accurate day-ahead prediction of photovoltaic power helps in making the day-ahead generation plan and reduces the harm caused by photovoltaic grid connection, so it has important value and significance.
Photovoltaic output prediction methods mainly comprise physical methods, statistical methods and deep learning methods. A physical method builds a mathematical model from the characteristics of the photovoltaic generation equipment in order to predict power; it does not need a large amount of historical data, but the equipment must be calibrated frequently. A statistical method builds a functional mapping between historical data and output power, for example regression prediction, grey theory and time series methods; such models rely on historical data and require abnormal data points to be removed from it before prediction. Deep learning methods benefit from the rapid growth of computing power and learn the mapping between input and output with artificial intelligence algorithms, mainly using nonlinear mapping models. The BP neural network is one of the most widely applied modeling methods because of its simple structure and strong nonlinear mapping capability, but it converges slowly and is easily trapped in local optima, and its weights, thresholds and number of hidden-layer nodes must be set manually from experience, without theoretical support. Photovoltaic output is strongly affected by weather fluctuations, and output curves differ greatly under different meteorological conditions; without similar-day division the prediction accuracy of the model suffers considerably, so similar-day division is also very important.
Disclosure of Invention
The invention provides a photovoltaic output day-ahead prediction method based on K-means mean clustering and BP neural network optimization, which aims to overcome the defect of low prediction precision in the photovoltaic power generation power prediction method in the prior art, and comprises the following steps:
step 1, collecting historical meteorological data of photovoltaic power stations in a region to be predicted and historical photovoltaic output data corresponding to the historical meteorological data, and preprocessing the historical meteorological data and the historical photovoltaic output data by using an average interpolation method;
step 2, calculating the correlation coefficient of the historical meteorological data and the historical photovoltaic output data by adopting the Pearson correlation coefficient, reserving the data with the correlation coefficient larger than a threshold value, and constructing a training set;
step 3, clustering the historical photovoltaic output data through K-means clustering and dividing a similar day data set;
step 4, optimizing the weight, the threshold value and the hidden layer node number of the BP neural network through a genetic algorithm and an ant colony algorithm, and constructing a photovoltaic output day-ahead prediction model;
step 5, inputting the first seventy percent of each similar day data set as training set data into a prediction model, training a neural network and storing the model with highest prediction precision;
step 6, judging the similar-day type of the day to be predicted according to the forecast weather, and inputting the day-ahead historical meteorological data in the corresponding similar-day data set into the model with the highest photovoltaic output prediction precision to obtain the photovoltaic output prediction data of the predicted day.
This example provides a photovoltaic output day-ahead prediction method based on K-means clustering and BP neural network optimization. The method models and trains on the historical meteorological data recorded by the photovoltaic power stations in the region to be predicted and the corresponding historical photovoltaic output data. The collected historical data is preprocessed by an average interpolation method; the Pearson correlation coefficient between the historical meteorological data and the historical photovoltaic output data is calculated, and, with a suitable threshold, the historical meteorological data whose correlation coefficient with the historical photovoltaic output data is larger than the threshold is selected as the training set. The historical photovoltaic output data is clustered by the K-means method, and the weights, thresholds and number of hidden-layer nodes of the BP neural network are optimized by a genetic algorithm and an ant colony algorithm, which helps keep the BP neural network from settling on a locally optimal solution. The selected historical meteorological data is then input into the optimized BP neural network to train the prediction model; the optimized network performs regression prediction on the clustered historical data, which can improve the day-ahead prediction accuracy of the photovoltaic output.
In the photovoltaic output day-ahead prediction method based on K-means mean value clustering and the BP neural network optimization, step 1 collects historical meteorological data of photovoltaic power stations in a region to be predicted and historical photovoltaic output data corresponding to the meteorological data, and preprocessing the historical meteorological data and the historical photovoltaic output data by using an average interpolation method.
In the photovoltaic output day-ahead prediction method based on K-means mean clustering and the BP neural network optimization, step 2, which calculates the correlation coefficient of the historical meteorological data and the historical photovoltaic output data by adopting the Pearson correlation coefficient, retains the data with a correlation coefficient larger than a threshold value, and constructs a training set, specifically comprises the following:
the Pearson correlation coefficient P between the historical meteorological data X and the photovoltaic output value Y is calculated as follows:
$$P = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}$$
where n is the sequence length; $x_i$ and $y_i$ are the i-th values of sequence X and sequence Y, respectively; and $\bar{x}$ and $\bar{y}$ are the means of sequence X and sequence Y, respectively. P takes values in $[-1, 1]$, and a larger absolute value of P indicates a higher degree of correlation between the two sequences.
In the photovoltaic output day-ahead prediction method based on K-means mean clustering and the BP neural network optimization, the step 3 of clustering historical photovoltaic output data through K-means clustering and dividing similar day data sets specifically comprises the following steps:
for a given data set $X = \{X_1, X_2, \ldots, X_n\}$ in which each object contains t features, X corresponds to an $n \times t$ matrix. The clustering process studies the similarity between the objects in X and, following a certain clustering criterion, partitions the samples into k different categories $C = \{C_1, C_2, \ldots, C_k\}$ that are independent of each other. To measure the similarity between objects, a distance function is introduced. The similarity between any two samples $X_e$ and $X_f$ in X can be expressed by the Euclidean distance $d_{ef}$:
$$d_{ef} = \sqrt{\sum_{l=1}^{t}\left(x_{el} - x_{fl}\right)^2}$$
where $x_{el}$ and $x_{fl}$ are the l-th features of $X_e$ and $X_f$. The more similar or closer the samples $X_e$ and $X_f$ are, the smaller $d_{ef}$ is; otherwise, the larger its value.
And carrying out K-means mean value clustering on the historical photovoltaic output data, dividing the historical photovoltaic output data with high similarity into the same similar day data set, and storing a clustering center of the similar day data set.
In the photovoltaic output day-ahead prediction method based on K-means mean clustering and BP neural network optimization, step 4 optimizes the weight, threshold and hidden layer node number of the BP neural network through a genetic algorithm and an ant colony algorithm, and the construction of the photovoltaic output day-ahead prediction model comprises the following steps:
before the GA-ACO optimized BP neural network is modeled, the GA algorithm is first run to generate an optimized solution for the weights, thresholds and number of hidden-layer nodes of the BP neural network; the pheromone distribution is then initialized so that the pheromone concentration on the path of this optimized solution is increased, which improves the convergence speed and accuracy of the subsequent ACO search.
The initialization formula of the pheromone is
$$\tau = \tau_G + c$$
where $\tau_G$ is the pheromone concentration value after GA optimization and c is a pheromone constant.
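As an illustration of the initialization τ = τ_G + c, the following minimal Python sketch builds a pheromone table in which the path found by the GA starts with a higher concentration. The table layout (one row per network parameter, one column per candidate value), the function name `init_pheromone`, and the numeric values of `tau_g` and `c` are assumptions made for the example; the source does not specify how the search space is encoded.

```python
import numpy as np

def init_pheromone(n_params, n_candidates, ga_choice, tau_g=2.0, c=1.0):
    """Build an (n_params, n_candidates) pheromone table.

    ga_choice[j] is the index of the candidate value the GA selected for
    parameter j (hypothetical encoding). Cells on the GA solution path
    receive tau_G = tau_g, all other cells receive tau_G = 0.
    """
    tau_ga = np.zeros((n_params, n_candidates))
    tau_ga[np.arange(n_params), ga_choice] = tau_g
    return tau_ga + c  # tau = tau_G + c

# Three parameters, four candidate values each; the GA picked 0, 2 and 3.
tau = init_pheromone(n_params=3, n_candidates=4, ga_choice=np.array([0, 2, 3]))
```

Every path keeps at least the constant c, so ants can still explore solutions the GA did not propose.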
The main steps of the GA-ACO optimized BP neural network are as follows:
step 4.1: initialize the BP neural network and the ant colony. The parameters to be initialized are: the connection weights $\omega_{ij}$ between the input layer and the hidden layer, the hidden-layer threshold $\alpha$, the connection weights $\omega_{jk}$ between the hidden layer and the output layer, the output-layer threshold $\beta$, and the number of hidden-layer nodes n. These parameters are denoted $p_1, p_2, \ldots, p_n$ and form the element set $I_{ni}$. The number of ants S, the pheromone volatilization coefficient $\rho$, the target error E and the other parameters of the ant colony algorithm are also initialized.
Step 4.2: the S ants start searching and update the pheromone until all S ants have completed the search. During the search, each ant updates in real time the pheromone values on all the edges it passes through, according to:
$$\tau_j^k \leftarrow (1-\rho)\,\tau_j^k + \Delta\tau_j^k$$
where $\rho$ is the pheromone volatilization coefficient; $\tau_j^k$ is the amount of pheromone of the k-th ant on the path of element j of the set in the current cycle; and $\Delta\tau_j^k$ is the pheromone increment of the k-th ant on the path of element j of the set in the current cycle.
Step 4.3: add the genetic algorithm to the ant colony algorithm by applying crossover, mutation and similar operations to the ant colony. For crossover, the most commonly used single-point crossover is chosen: a point in the gene sequence is selected at random as the crossover point, and the alleles of two different individuals on one side of that point are exchanged to produce two new gene sequences. For mutation, a normal distribution with mean $\mu$ and variance $\sigma$ is used to mutate a portion of the genes with a small probability, and the new individual is generated by
$$\sigma' = \sigma\, e^{N(0,\Delta\sigma)}$$
$$x' = x + N(0,\Delta\sigma)$$
where x is the next path node of the ant colony. A fitness function is then chosen to calculate individual fitness; here the minimum mean square error of the learning samples is used. The fitness value is calculated according to formula (1) and checked against the requirement for the current optimal solution. If the requirement is met, go to step 4.4; otherwise, return to step 4.2.
Step 4.4: take the optimization result of the ant colony algorithm from the previous step as the parameters of the BP neural network, train the neural network, and calculate the error e:
$$e_q = O_q - Y_q$$
where $O_q$ is the expected value; $Y_q$ is the predicted value; and q indexes the output neurons, q = 1, 2, …, n.
Step 4.5: update the weights, thresholds and number of hidden-layer nodes of the BP neural network according to the result of step 4.4, and judge whether the requirement is met. If so, the algorithm ends and the optimal weights, thresholds and number of hidden-layer nodes are output; otherwise, go to step 4.3.
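The ant search of step 4.2 can be sketched as follows. This is an illustrative approximation, not the patented procedure: it assumes the common evaporation-plus-deposit form of the pheromone update, (1 - ρ)·τ + Δτ, a deposit of q / (1 + cost) per visited cell, and a batch update after all ants have finished instead of the real-time update described above. The function `ant_search_step`, the pheromone-table encoding and the toy cost function are hypothetical stand-ins for the network training error.

```python
import numpy as np

def ant_search_step(tau, cost_fn, n_ants=10, rho=0.1, q=1.0, rng=None):
    """Run one colony cycle and return the updated pheromone table.

    tau has shape (n_params, n_candidates); each ant picks one candidate
    per parameter with probability proportional to the pheromone.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_params, n_candidates = tau.shape
    probs = tau / tau.sum(axis=1, keepdims=True)
    delta = np.zeros_like(tau)
    best_path, best_cost = None, np.inf
    for _ant in range(n_ants):
        path = np.array([rng.choice(n_candidates, p=probs[j])
                         for j in range(n_params)])
        cost = cost_fn(path)
        # Assumed deposit rule: cheaper paths leave more pheromone.
        delta[np.arange(n_params), path] += q / (1.0 + cost)
        if cost < best_cost:
            best_path, best_cost = path, cost
    # Assumed update form: evaporation followed by deposit.
    new_tau = (1.0 - rho) * tau + delta
    return new_tau, best_path, best_cost

# Toy cost standing in for the BP network error (hypothetical).
toy_cost = lambda path: float(np.sum(path))
rng = np.random.default_rng(0)
tau = np.ones((3, 4))
for _cycle in range(30):
    tau, best_path, best_cost = ant_search_step(tau, toy_cost, rng=rng)
```

In a full implementation the cost function would decode the path into weights, thresholds and a hidden-layer size, and return the resulting training error.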
In the photovoltaic output day-ahead prediction method based on K-means mean clustering and BP neural network optimization, step 5, which inputs the first seventy percent of each similar-day data set into the prediction model as training data, trains the neural network and saves the model with the highest prediction precision, specifically comprises the following:
within each similar-day data set obtained by K-means clustering, for every pair of adjacent days the meteorological factors of the earlier day are used as input and the photovoltaic output of the later day is used as output to train the network. The mean absolute percentage error (MAPE) and the root mean square error (RMSE) are used as the network accuracy evaluation indexes, calculated as follows:
$$\mathrm{MAPE} = \frac{100\%}{N}\sum_{i=1}^{N}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}$$
where N is the number of samples, $y_i$ is the actual photovoltaic output and $\hat{y}_i$ is the predicted photovoltaic output.
A prediction model with higher precision is obtained through repeated training and is saved.
In the photovoltaic output day-ahead prediction method based on K-means mean value clustering and the BP neural network optimization, the step 6 judges the type of the day similar to the day to be predicted according to the predicted weather, and inputs day-ahead historical meteorological data in the similar day data set into the model with highest photovoltaic output prediction precision to obtain photovoltaic output prediction data of the predicted day specifically comprises the following steps:
step 6.1: obtain the weather information of the day to be predicted from the weather forecast, calculate the Euclidean distance between this weather information and the cluster centers obtained in step 3, and determine the similar-day type of the day to be predicted.
step 6.2: input the historical meteorological data of the similar day closest to the day to be predicted into the corresponding prediction model to obtain the day-ahead photovoltaic output prediction.
Compared with the prior art, the technical scheme of the invention has the following beneficial effects: Pearson correlation coefficient analysis and K-means clustering improve the accuracy of similar-day selection and thereby reduce the difficulty of model training; optimizing the weights, thresholds and number of hidden-layer nodes of the BP neural network with a genetic algorithm and an ant colony algorithm effectively addresses the problems of local optima and overfitting of the BP neural network weights; and combining these methods improves the accuracy of the photovoltaic output day-ahead prediction.
Drawings
FIG. 1 is a flow chart of a method according to an embodiment of the present invention;
FIG. 2 is a flow chart of the GA-ACO optimized BP neural network of the invention;
FIG. 3 is a graph comparing the predicted result of the present invention with the predicted result and the true value of the basic BP neural network.
Detailed Description
The technical solutions of the embodiments of the present invention will be clearly and completely described in the following in conjunction with the embodiments of the present invention, and it is obvious that the described embodiments are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that, without conflict, the embodiments of the present invention and features of the embodiments may be combined with each other.
The invention will be further illustrated, but is not limited, by the following examples.
The embodiment relates to a photovoltaic output day-ahead prediction method based on K-means clustering and BP neural network optimization, wherein a flow chart is shown in figure 1, and the method comprises the following steps:
step 1, collecting historical meteorological data of photovoltaic power stations in a region to be predicted and historical photovoltaic output data corresponding to the meteorological data, and preprocessing the historical meteorological data and the historical photovoltaic output data by using an average interpolation method.
In the step, historical meteorological data recorded by photovoltaic power stations in a region to be predicted and corresponding historical photovoltaic output data are collected, the sampling time interval is 15 minutes, the data collected in one natural day are used as a sample, and an average interpolation method is adopted to preprocess the historical meteorological data and the corresponding photovoltaic output data.
In this embodiment, total horizontal solar radiation, air temperature, cloud opacity, atmospheric precipitation, relative humidity, snow depth, surface air pressure and wind speed are selected as the meteorological data. After preprocessing by the average interpolation method, the meteorological data is normalized as follows:
$$x^* = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x^*$ is the normalized value of x, $x_{\min}$ is the minimum value of x, and $x_{\max}$ is the maximum value of x.
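Step 1 can be illustrated with a short Python sketch. It assumes that "average interpolation" means replacing each missing sample with the mean of its nearest valid neighbours (the source does not define the method), and it uses the fact that a 15-minute sampling interval gives 96 points per natural day. The function names are hypothetical.

```python
import numpy as np

POINTS_PER_DAY = 96  # 24 hours at one sample every 15 minutes

def fill_by_neighbor_average(series):
    """Replace each NaN with the mean of the nearest valid samples on
    either side (assumed meaning of 'average interpolation')."""
    x = np.asarray(series, dtype=float).copy()
    valid = np.flatnonzero(~np.isnan(x))
    if valid.size == 0:
        raise ValueError("series has no valid samples")
    for i in np.flatnonzero(np.isnan(x)):
        left, right = valid[valid < i], valid[valid > i]
        neighbors = []
        if left.size:
            neighbors.append(x[left[-1]])
        if right.size:
            neighbors.append(x[right[0]])
        x[i] = np.mean(neighbors)
    return x

def min_max_normalize(x):
    """Scale values to [0, 1]: (x - x_min) / (x_max - x_min)."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        return np.zeros_like(x)  # constant series: avoid division by zero
    return (x - x.min()) / span

def to_daily_samples(series):
    """Reshape a 15-minute series into one row (sample) per natural day."""
    x = np.asarray(series, dtype=float)
    if x.size % POINTS_PER_DAY:
        raise ValueError("series does not cover whole days")
    return x.reshape(-1, POINTS_PER_DAY)

raw = np.array([1.0, np.nan, 3.0, 5.0])
filled = fill_by_neighbor_average(raw)   # gap filled with (1 + 3) / 2
scaled = min_max_normalize(filled)
days = to_daily_samples(np.arange(2 * POINTS_PER_DAY))
```

In practice each meteorological variable would be filled and normalized separately before the daily samples are assembled.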
and 2, calculating the correlation coefficient of the historical meteorological data and the historical photovoltaic output data by adopting the Pearson correlation coefficient, reserving the data of which the correlation coefficient is larger than a threshold value, and constructing a training set.
In this step, the meteorological factor data X calculate the pearson correlation coefficient P between the photovoltaic output values Y, and the calculation formula is as follows:
wherein n is the sequence length; x is x i And y i The ith variable of sequence X and sequence Y, respectively;and->The average of sequence X and sequence Y, respectively. The value range of P is [ -1,1]The larger the absolute value of P represents the higher degree of correlation between the two sequences.
In the present embodiment, the threshold value of the pearson correlation coefficient is set to 0.2.
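The screening in step 2 can be sketched in Python as follows. The threshold of 0.2 comes from this embodiment; comparing the absolute value of P with the threshold is an assumption (the text says only that the coefficient must be larger than the threshold, while also noting that correlation strength is given by the absolute value). The variable names and the toy data are hypothetical.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))

def select_features(weather, pv_output, threshold=0.2):
    """Keep the meteorological variables whose |P| exceeds the threshold.

    weather maps a variable name to its sequence of values.
    """
    scores = {name: pearson(values, pv_output) for name, values in weather.items()}
    kept = [name for name, p in scores.items() if abs(p) > threshold]
    return kept, scores

# Toy data (not from the patent): output rises with irradiance, falls with humidity.
pv = [0.0, 1.0, 2.0, 3.0, 4.0]
weather = {
    "irradiance": [0.0, 2.0, 4.0, 6.0, 8.0],
    "humidity": [4.0, 3.0, 2.0, 1.0, 0.0],
    "noise": [1.0, -1.0, 0.0, -1.0, 1.0],
}
kept, scores = select_features(weather, pv)
```

With these toy sequences the uncorrelated `noise` variable is dropped, while the negatively correlated `humidity` is kept because its absolute correlation is high.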
Step 3, clustering the historical photovoltaic output data through K-means clustering and dividing it into similar-day data sets.
In this step, the historical photovoltaic output data of the training set is clustered by the K-means method, with the following specific steps:
for a given data set $X = \{X_1, X_2, \ldots, X_n\}$ in which each object contains t features, X corresponds to an $n \times t$ matrix. The clustering process studies the similarity between the objects in X and, following a certain clustering criterion, partitions the samples into k different categories $C = \{C_1, C_2, \ldots, C_k\}$ that are independent of each other. To measure the similarity between objects, a distance function is introduced. The similarity between any two samples $X_e$ and $X_f$ in X can be expressed by the Euclidean distance $d_{ef}$:
$$d_{ef} = \sqrt{\sum_{l=1}^{t}\left(x_{el} - x_{fl}\right)^2}$$
where $x_{el}$ and $x_{fl}$ are the l-th features of $X_e$ and $X_f$. The more similar or closer the samples $X_e$ and $X_f$ are, the smaller $d_{ef}$ is; otherwise, the larger its value.
K-means clustering is performed on the historical photovoltaic output data, historical photovoltaic output data with high similarity is placed in the same similar-day data set, and the cluster center of each similar-day data set is saved.
In this embodiment the number of clusters k is determined by the elbow rule. Because the initial cluster centers are chosen at random, the clustering is run several times to determine suitable cluster centers.
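A minimal NumPy sketch of step 3 is given below. It implements plain K-means with Euclidean distance and random restarts, keeping the run with the smallest within-cluster sum of squared distances (SSE); the same SSE, computed for several values of k, is the quantity usually plotted for the elbow rule. The function name, the restart count and the toy data are assumptions for illustration.

```python
import numpy as np

def kmeans(X, k, n_init=10, max_iter=100, seed=0):
    """Return (centers, labels, sse) for the best of n_init random restarts."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    best = None
    for _restart in range(n_init):
        # Random initial centers drawn from the samples themselves.
        centers = X[rng.choice(len(X), size=k, replace=False)]
        for _it in range(max_iter):
            dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
            labels = dist.argmin(axis=1)
            new_centers = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        sse = float(np.sum(dist[np.arange(len(X)), labels] ** 2))
        if best is None or sse < best[2]:
            best = (centers, labels, sse)
    return best

# Toy daily profiles with two features each (not from the patent).
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
centers, labels, sse = kmeans(X, k=2)

# Elbow rule input: SSE for each candidate k, to be inspected for the bend.
sse_by_k = {k: kmeans(X, k)[2] for k in (1, 2, 3)}
```

For real data each row of `X` would be one day's photovoltaic output curve, so t equals the number of samples per day.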
Step 4, optimizing the weights, thresholds and number of hidden-layer nodes of the BP neural network through a genetic algorithm and an ant colony algorithm, and constructing a photovoltaic output day-ahead prediction model.
In this step, the weights, thresholds and number of hidden-layer nodes of the BP neural network are optimized through a genetic algorithm and an ant colony algorithm, and the photovoltaic output prediction model is constructed specifically as follows:
before the GA-ACO optimized BP neural network is modeled, the GA algorithm is first run to generate an optimized solution for the weights, thresholds and number of hidden-layer nodes of the BP neural network; the pheromone distribution is then initialized so that the pheromone concentration on the path of this optimized solution is increased, which improves the convergence speed and accuracy of the subsequent ACO search.
The initialization formula of the pheromone is
$$\tau = \tau_G + c$$
where $\tau_G$ is the pheromone concentration value after GA optimization and c is a pheromone constant.
The main steps of the GA-ACO optimized BP neural network are as follows:
step 4.1: initialize the BP neural network and the ant colony. The parameters to be initialized are: the connection weights $\omega_{ij}$ between the input layer and the hidden layer, the hidden-layer threshold $\alpha$, the connection weights $\omega_{jk}$ between the hidden layer and the output layer, the output-layer threshold $\beta$, and the number of hidden-layer nodes n. These parameters are denoted $p_1, p_2, \ldots, p_n$ and form the element set $I_{ni}$. The number of ants S, the pheromone volatilization coefficient $\rho$, the target error E and the other parameters of the ant colony algorithm are also initialized.
Step 4.2: the S ants start searching and update the pheromone until all S ants have completed the search. During the search, each ant updates in real time the pheromone values on all the edges it passes through, according to:
$$\tau_j^k \leftarrow (1-\rho)\,\tau_j^k + \Delta\tau_j^k$$
where $\rho$ is the pheromone volatilization coefficient; $\tau_j^k$ is the amount of pheromone of the k-th ant on the path of element j of the set in the current cycle; and $\Delta\tau_j^k$ is the corresponding pheromone increment.
Step 4.3: add the genetic algorithm to the ant colony algorithm by applying crossover, mutation and similar operations to the ant colony. For crossover, the most commonly used single-point crossover is chosen: a point in the gene sequence is selected at random as the crossover point, and the alleles of two different individuals on one side of that point are exchanged to produce two new gene sequences. For mutation, a normal distribution with mean $\mu$ and variance $\sigma$ is used to mutate a portion of the genes with a small probability, and the new individual is generated by
$$\sigma' = \sigma\, e^{N(0,\Delta\sigma)}$$
$$x' = x + N(0,\Delta\sigma)$$
where x is the next path node of the ant colony. A fitness function is then chosen to calculate individual fitness; here the minimum mean square error of the learning samples is used. The fitness value is calculated according to formula (1) and checked against the requirement for the current optimal solution. If the requirement is met, go to step 4.4; otherwise, return to step 4.2.
Step 4.4: Take the optimization result of the ant colony algorithm from the previous step as the parameters of the BP neural network, train the neural network, and calculate the error $e$:
$e_q = O_q - Y_q$
where $O_q$ is the expected value, $Y_q$ is the predicted value, and $q$ is the neuron index, $q = 1, 2, \ldots, n$.
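As a small illustration of the error in step 4.4 and of the mean square error used as the fitness function in step 4.3, the sketch below computes both for one set of outputs. The helper names `output_errors` and `mse_fitness` are hypothetical, and the numbers are illustrative only.

```python
import numpy as np

def output_errors(expected, predicted):
    """Per-neuron error e_q = O_q - Y_q."""
    return np.asarray(expected, dtype=float) - np.asarray(predicted, dtype=float)

def mse_fitness(expected, predicted):
    """Mean square error of the learning samples (smaller is better)."""
    e = output_errors(expected, predicted)
    return float(np.mean(e ** 2))

O = [1.0, 2.0, 3.0]   # expected values
Y = [1.5, 2.0, 2.0]   # values predicted by the network
e = output_errors(O, Y)
fitness = mse_fitness(O, Y)
```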
Step 4.5: Update the weights, thresholds and number of hidden-layer nodes of the BP neural network according to the result of step 4.4, and judge whether the requirement is met. If so, the algorithm ends and the optimal weights, thresholds and number of hidden-layer nodes are output; otherwise, go to step 4.3.
Step 5: Input the first seventy percent of each similar-day data set into the prediction model as training data, train the neural network, and save the model with the highest prediction accuracy.
The neural network is trained and the most accurate model is saved as follows:
Within each similar-day data set obtained by K-means clustering, for every pair of adjacent days the meteorological factors of the earlier day are used as input and the photovoltaic output of the later day is used as output to train the network. The mean absolute percentage error (MAPE) and the root mean square error (RMSE) are used as the accuracy evaluation indexes of the network.
A prediction model with higher accuracy is obtained through repeated training and is saved.
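The two evaluation indexes and the seventy-percent split can be sketched as follows. The formulas are not preserved in this text, so the code uses the standard definitions of MAPE and RMSE, which is an assumption about what the original formulas contained; all function names are hypothetical.

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error in percent (actual values must be nonzero)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100.0)

def rmse(actual, predicted):
    """Root mean square error, in the units of the data."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def split_first_70_percent(samples):
    """Return (training, remaining), with the first 70% used for training."""
    cut = (len(samples) * 7) // 10
    return samples[:cut], samples[cut:]

actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
mape_value = mape(actual, predicted)
rmse_value = rmse(actual, predicted)
train, rest = split_first_70_percent(list(range(10)))
```

Since photovoltaic output is zero at night, MAPE in this form is only defined for samples with nonzero actual output; how the original method handles such samples is not stated.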
Step 6: Determine the similar-day type of the day to be predicted from the forecast weather, and input the day-ahead meteorological data from the corresponding similar-day data set into the model with the highest photovoltaic output prediction accuracy to obtain the photovoltaic output prediction data for the predicted day.
The specific sub-steps are as follows:
Step 6.1: Obtain the weather information of the day to be predicted from the weather forecast, calculate the Euclidean distance between this weather information and each cluster center obtained in step 3, and determine the similar-day type of the day to be predicted.
Step 6.2: Input the historical meteorological data of the similar day closest to the predicted day into the corresponding prediction model to obtain the day-ahead photovoltaic output prediction result.
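Step 6.1 amounts to a nearest-center lookup, sketched below. The sketch assumes that the forecast vector and the stored cluster centers are expressed in the same feature space; the function name `nearest_cluster` and the two example features are hypothetical.

```python
import numpy as np

def nearest_cluster(forecast, centers):
    """Return the index of the closest cluster center and all distances."""
    forecast = np.asarray(forecast, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Euclidean distance from the forecast vector to every center.
    distances = np.linalg.norm(centers - forecast, axis=1)
    return int(np.argmin(distances)), distances

# Hypothetical centers with two features each (illustrative numbers only).
centers = np.array([[0.9, 25.0],
                    [0.3, 15.0]])
label, distances = nearest_cluster([0.8, 24.0], centers)
```

The returned label selects which saved prediction model is used in step 6.2. In practice the features would usually be scaled to comparable ranges first, otherwise the feature with the largest numeric range dominates the distance.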
To verify the effectiveness of the photovoltaic output day-ahead prediction method based on K-means clustering and the BP neural network, the following four prediction models were used and their prediction results were compared; the results are shown in Fig. 3 and Table 1.
Method 1: BP neural network prediction model;
Method 2: optimized BP neural network prediction model;
Method 3: K-means + BP neural network prediction model;
Method 4: K-means + optimized BP neural network prediction model.
Table 1. Comparison of the errors of the different models
As the comparison data in Table 1 and Fig. 3 show, after similar days are selected by Pearson correlation coefficient analysis and K-means clustering, the prediction accuracy of the model improves markedly, because the photovoltaic output fluctuates in a similar pattern within each group of similar days. In addition, optimizing the initial weights of the BP neural network with the genetic algorithm and the ant colony algorithm improves its prediction accuracy. Combining these methods therefore improves the accuracy of day-ahead photovoltaic output prediction.
The foregoing merely illustrates the preferred embodiments of the present invention and is not intended to limit its embodiments or scope. Those skilled in the art should appreciate that equivalent substitutions and obvious variations made using the teachings of the present invention are intended to be included within its scope.

Claims (6)

1. A photovoltaic output day-ahead prediction method based on K-means mean value clustering and BP neural network optimization, characterized in that the method comprises the following steps:
Step 1: collecting historical meteorological data of the photovoltaic power stations in the region to be predicted and the corresponding historical photovoltaic output data, and preprocessing the historical meteorological data and the historical photovoltaic output data by an average interpolation method;
Step 2: calculating the correlation coefficient between the historical meteorological data and the historical photovoltaic output data using the Pearson correlation coefficient, retaining the data whose correlation coefficient is greater than a threshold, and constructing a training set;
Step 3: clustering the historical photovoltaic output data by K-means clustering and dividing it into similar-day data sets;
Step 4: optimizing the weights, thresholds and number of hidden-layer nodes of the BP neural network through a genetic algorithm and an ant colony algorithm, and constructing a photovoltaic output day-ahead prediction model;
Step 5: inputting the first seventy percent of each similar-day data set into the prediction model as training data, training the neural network, and saving the model with the highest prediction accuracy;
Step 6: determining the similar-day type of the day to be predicted from the forecast weather, and inputting the day-ahead historical meteorological data from the similar-day data set into the model with the highest photovoltaic output prediction accuracy to obtain the photovoltaic output prediction data for the predicted day.
2. The photovoltaic output day-ahead prediction method based on K-means mean value clustering and BP neural network optimization, characterized in that, in step 2, calculating the correlation coefficient between the historical meteorological data and the historical photovoltaic output data using the Pearson correlation coefficient, retaining the data whose correlation coefficient is greater than a threshold, and constructing a training set specifically comprises:
calculating the Pearson correlation coefficient $P$ between the historical meteorological data $X$ and the historical photovoltaic output values $Y$ by the formula:
$$P = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^{2}}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^{2}}}$$
where $n$ is the sequence length; $x_i$ and $y_i$ are the $i$-th values of sequence $X$ and sequence $Y$, respectively; $\bar{x}$ and $\bar{y}$ are the mean values of sequence $X$ and sequence $Y$, respectively; the value range of $P$ is $[-1, 1]$, and a larger absolute value of $P$ indicates a higher degree of correlation between the two sequences.
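The correlation screening described here can be sketched as follows. The sketch compares the absolute value of $P$ with the threshold, which is an assumption (the text only says the coefficient must exceed a threshold); the function names, the feature names and the threshold value are hypothetical.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient P between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))

def select_features(features, output, threshold):
    """Keep the feature names whose |P| with the PV output exceeds the threshold."""
    # Assumption: the absolute value of P is compared with the threshold.
    return [name for name, values in features.items()
            if abs(pearson(values, output)) > threshold]

output = [1.0, 2.0, 3.0, 4.0]            # illustrative PV output values
features = {
    "irradiance": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated
    "humidity": [4.0, 3.0, 2.0, 1.0],    # perfectly anti-correlated
    "wind": [1.0, -1.0, -1.0, 1.0],      # uncorrelated
}
selected = select_features(features, output, threshold=0.5)
```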
3. The photovoltaic output day-ahead prediction method based on K-means mean value clustering and BP neural network optimization, characterized in that, in step 3, clustering the historical photovoltaic output data by K-means clustering and dividing it into similar-day data sets specifically comprises:
for a given data set $X = \{X_1, X_2, \ldots, X_n\}$ in which each object contains $t$ features, the data set $X$ corresponds to an $n \times t$ matrix; by studying the similarity between the objects in $X$ and following a given clustering criterion, the cluster analysis divides the samples in $X$ into $k$ different categories $C = \{C_1, C_2, \ldots, C_k\}$ that are independent of one another; to measure the similarity between objects a distance function is introduced, and the similarity between any two samples $X_e$ and $X_f$ in $X$ is expressed by the Euclidean distance $d_{ef}$:
$$d_{ef} = \sqrt{\sum_{m=1}^{t}\left(x_{em}-x_{fm}\right)^{2}}$$
where $x_{em}$ and $x_{fm}$ denote the $m$-th features of $X_e$ and $X_f$; the more similar or closer the samples $X_e$ and $X_f$ are, the smaller $d_{ef}$ is, and conversely the larger its value;
K-means clustering is performed on the historical photovoltaic output data, historical photovoltaic output data with high similarity are assigned to the same similar-day data set, and the cluster center of each similar-day data set is saved.
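The clustering procedure can be sketched with a plain NumPy implementation of K-means. This is a minimal illustration rather than the implementation used in the method: the random initialization, the iteration limit and the function name `kmeans` are assumptions, and each row of `X` stands for one day's feature vector.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Cluster the rows of X into k groups; return (labels, centers)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumption: initial centers are k distinct samples chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # d[e, c] is the Euclidean distance between sample e and center c.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Move each center to the mean of its members; keep it if it has none.
        new_centers = np.array([
            X[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Two well-separated groups of "days" (illustrative numbers only).
X = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centers = kmeans(X, k=2)
```

The saved `centers` are what a later nearest-center lookup would compare a new day against.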
4. The photovoltaic output day-ahead prediction method based on K-means mean value clustering and BP neural network optimization, characterized in that, in step 4, optimizing the weights, thresholds and number of hidden-layer nodes of the BP neural network through the genetic algorithm and the ant colony algorithm and constructing the photovoltaic output day-ahead prediction model specifically comprises:
Step 4.1: initializing the BP neural network and the ant colony, and setting the required parameters, namely the connection weights $\omega_{ij}$ between the input layer and the hidden layer, the hidden-layer threshold $\alpha$, the connection weights $\omega_{jk}$ between the hidden layer and the output layer, the output-layer threshold $\beta$, and the number of hidden-layer nodes $n$, these parameters being denoted $p_1, p_2, \ldots, p_n$ and forming the element set $I_{ni}$; and initializing the number of ants $S$, the pheromone volatilization coefficient $\rho$ and the target error $E$ in the ant colony algorithm;
Step 4.2: the $S$ ants start searching and update the pheromone until all $S$ ants have completed their search; while searching, each ant updates in real time the pheromone value on every edge it passes, the pheromone value being updated by a formula in which $\rho$ is the pheromone volatilization coefficient and the two remaining quantities are, respectively, the amount of pheromone of the $k$-th ant on the path of the $j$-th element of the set in the current cycle and the increment of that amount;
Step 4.3: adding the genetic algorithm to the ant colony algorithm by performing crossover and mutation operations on the ant colony, wherein the crossover adopts the commonly used single-point crossover, that is, a point in the gene sequence is chosen at random as the crossover point and the alleles of two different individuals on one side of that point are exchanged to produce two new gene sequences; and wherein the mutation uses a normal distribution with mean $\mu$ and variance $\sigma$ to mutate some of the genes with a small probability, the new individual being generated by
$\sigma' = \sigma\, e^{N(0,\Delta\sigma)}$
$x' = x + N(0,\Delta\sigma)$
where $x$ is the next path node of the ant colony; a fitness function is selected to calculate the individual fitness, the minimum mean square error of the learning samples being used as the fitness function; the fitness value is calculated according to formula (1), and it is judged whether the requirement for the current optimal solution is met; if so, proceeding to step 4.4, otherwise returning to step 4.2;
Step 4.4: taking the optimization result of the ant colony algorithm from the previous step as the parameters of the BP neural network, training the neural network, and calculating the error $e$, where
$e_q = O_q - Y_q$
in which $O_q$ is the expected value, $Y_q$ is the predicted value, and $q$ is the neuron index, $q = 1, 2, \ldots, n$;
Step 4.5: updating the weights, thresholds and number of hidden-layer nodes of the BP neural network according to the result of step 4.4, and judging whether the requirement is met; if so, ending the algorithm and outputting the optimal weights, thresholds and number of hidden-layer nodes; otherwise, going to step 4.3.
5. The photovoltaic output day-ahead prediction method based on K-means mean value clustering and BP neural network optimization, characterized in that, in step 5, inputting the first seventy percent of each similar-day data set into the prediction model as training data, training the neural network and saving the model with the highest prediction accuracy specifically comprises:
within each similar-day data set obtained by K-means clustering, for every pair of adjacent days, using the meteorological factors of the earlier day as input and the photovoltaic output of the later day as output to train the network, with the mean absolute percentage error (MAPE) and the root mean square error (RMSE) used as the accuracy evaluation indexes of the network;
and obtaining and saving a prediction model with higher accuracy through repeated training.
6. The photovoltaic output day-ahead prediction method based on K-means mean value clustering and BP neural network optimization, characterized in that, in step 6, determining the similar-day type of the day to be predicted from the forecast weather and inputting the day-ahead historical meteorological data from the similar-day data set into the model with the highest photovoltaic output prediction accuracy to obtain the photovoltaic output prediction data for the predicted day specifically comprises:
Step 6.1: obtaining the weather information of the day to be predicted from the weather forecast, calculating the Euclidean distance between this weather information and each cluster center obtained in step 3, and determining the similar-day type of the day to be predicted;
Step 6.2: inputting the meteorological factors of the similar day closest to the predicted day into the corresponding prediction model to obtain the day-ahead photovoltaic output prediction result.
CN202211610935.2A 2022-12-14 2022-12-14 Photovoltaic output day-ahead prediction method based on K-means mean value clustering and BP neural network optimization Pending CN116702937A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211610935.2A CN116702937A (en) 2022-12-14 2022-12-14 Photovoltaic output day-ahead prediction method based on K-means mean value clustering and BP neural network optimization

Publications (1)

Publication Number Publication Date
CN116702937A true CN116702937A (en) 2023-09-05

Family

ID=87831760

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211610935.2A Pending CN116702937A (en) 2022-12-14 2022-12-14 Photovoltaic output day-ahead prediction method based on K-means mean value clustering and BP neural network optimization

Country Status (1)

Country Link
CN (1) CN116702937A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116960989A (en) * 2023-09-20 2023-10-27 云南电投绿能科技有限公司 Power load prediction method, device and equipment for power station and storage medium
CN116956753A (en) * 2023-09-21 2023-10-27 国能日新科技股份有限公司 Distributed photovoltaic prediction method and device based on simulated annealing and cyclic convolution

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116960989A (en) * 2023-09-20 2023-10-27 云南电投绿能科技有限公司 Power load prediction method, device and equipment for power station and storage medium
CN116960989B (en) * 2023-09-20 2023-12-01 云南电投绿能科技有限公司 Power load prediction method, device and equipment for power station and storage medium
CN116956753A (en) * 2023-09-21 2023-10-27 国能日新科技股份有限公司 Distributed photovoltaic prediction method and device based on simulated annealing and cyclic convolution
CN116956753B (en) * 2023-09-21 2023-12-08 国能日新科技股份有限公司 Distributed photovoltaic prediction method and device based on simulated annealing and cyclic convolution

Similar Documents

Publication Publication Date Title
JP5888640B2 (en) Photovoltaic power generation prediction apparatus, solar power generation prediction method, and solar power generation prediction program
CN113282122B (en) Commercial building energy consumption prediction optimization method and system
CN110619360A (en) Ultra-short-term wind power prediction method considering historical sample similarity
CN116702937A (en) Photovoltaic output day-ahead prediction method based on K-means mean value clustering and BP neural network optimization
CN111260126B (en) Short-term photovoltaic power generation prediction method considering correlation degree of weather and meteorological factors
CN110766200A (en) Method for predicting generating power of wind turbine generator based on K-means mean clustering
CN113344288B (en) Cascade hydropower station group water level prediction method and device and computer readable storage medium
CN109242200B (en) Wind power interval prediction method of Bayesian network prediction model
CN114298140A (en) Wind power short-term power prediction correction method considering unit classification
CN112819189A (en) Wind power output prediction method based on historical predicted value
CN114330100A (en) Short-term photovoltaic power probability interval prediction method
CN115204444A (en) Photovoltaic power prediction method based on improved cluster analysis and fusion integration algorithm
CN114897204A (en) Method and device for predicting short-term wind speed of offshore wind farm
CN113095547A (en) Short-term wind power prediction method based on GRA-LSTM-ICE model
CN117374941A (en) Photovoltaic power generation power prediction method based on neural network
CN116646927A (en) Wind power prediction method based on segmented filtering and longitudinal and transverse clustering
CN115456286A (en) Short-term photovoltaic power prediction method
CN115796327A (en) Wind power interval prediction method based on VMD (vertical vector decomposition) and IWOA-F-GRU (empirical mode decomposition) -based models
CN115713144A (en) Short-term wind speed multi-step prediction method based on combined CGRU model
CN115660132A (en) Photovoltaic power generation power prediction method and system
Zhang et al. Temperature prediction and analysis based on improved GA-BP neural network
CN114234392A (en) Air conditioner load fine prediction method based on improved PSO-LSTM
CN113761023A (en) Photovoltaic power generation short-term power prediction method based on improved generalized neural network
CN116911418A (en) Photovoltaic power generation power ultra-short-term prediction method based on wavelet transformation and BP neural network optimization
CN114997475B (en) Kmeans-based fusion model photovoltaic power generation short-term prediction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination