WO2020228568A1 - Method for training power generation amount prediction model of photovoltaic power station, power generation amount prediction method and device of photovoltaic power station, training system, prediction system and storage medium

Info

Publication number
WO2020228568A1
WO2020228568A1 PCT/CN2020/088709 CN2020088709W WO2020228568A1 WO 2020228568 A1 WO2020228568 A1 WO 2020228568A1 CN 2020088709 W CN2020088709 W CN 2020088709W WO 2020228568 A1 WO2020228568 A1 WO 2020228568A1
Authority
WO
WIPO (PCT)
Prior art keywords
environmental
feature
environmental data
historical
power generation
Application number
PCT/CN2020/088709
Other languages
French (fr)
Chinese (zh)
Inventor
周希波
李慧
Original Assignee
京东方科技集团股份有限公司
Application filed by 京东方科技集团股份有限公司
Publication of WO2020228568A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Electricity, gas or water supply

Definitions

  • the embodiments of the present disclosure relate to a method for training a photovoltaic power station power generation prediction model, a photovoltaic power station power generation prediction method and device, a training system, a prediction system, and a storage medium.
  • At least one embodiment of the present disclosure provides a method for training a photovoltaic power station power generation prediction model, which includes:
  • a training set and a prediction set are constructed based on the historical power generation data and the preprocessed historical environmental data, and multiple regression algorithms are used for training respectively to construct the relationship between the historical environmental data and the historical power generation data
  • the mapping relationship is used as the power generation prediction model, and the output result of the power generation prediction model is obtained by fusing the prediction results obtained according to the multiple regression algorithms.
  • At least one embodiment of the present disclosure provides a method for predicting power generation of a photovoltaic power station, the method including:
  • the effective environmental data including one day's monitoring values of multiple preset environmental characteristics of the photovoltaic power station;
  • the power generation prediction model is trained through the following operations:
  • a training set and a prediction set are constructed based on the historical power generation data and the preprocessed historical environmental data, and multiple regression algorithms are used for training respectively to construct the relationship between the historical environmental data and the historical power generation data
  • the mapping relationship is used as the power generation prediction model, and the output result of the power generation prediction model is obtained by fusing the prediction results obtained according to the multiple regression algorithms
  • the preprocessing of the historical environmental data includes:
  • the daily historical environmental data includes monitoring values of multiple environmental characteristics
  • the feature encoding of the daily historical environmental data includes:
  • Based on the monitoring values of the multiple environmental features included in the daily historical environmental data, a statistical feature corresponding to each of the multiple environmental features is calculated, where the statistical feature includes at least one of a maximum feature, a minimum feature, an average feature, and a standard deviation feature;
  • the plurality of preset environmental characteristics include other environmental characteristics in the plurality of environmental characteristics except the filtered environmental characteristics.
  • At least one embodiment of the present disclosure provides a device for predicting power generation of a photovoltaic power station, the device including:
  • An obtaining unit configured to obtain effective environmental data, the effective environmental data including one day's monitoring values of multiple preset environmental characteristics of the photovoltaic power station;
  • a preprocessing unit configured to preprocess the effective environmental data to obtain a feature vector of effective environmental data
  • a prediction unit configured to input the effective environmental data feature vector into the pre-built power generation prediction model, and output a prediction result corresponding to the effective environmental data, where the prediction result is the daily effective power generation of the photovoltaic power station,
  • the power generation prediction model is trained through the following operations:
  • a training set and a prediction set are constructed based on the historical power generation data and the preprocessed historical environmental data, and multiple regression algorithms are used for training respectively to construct the relationship between the historical environmental data and the historical power generation data
  • the mapping relationship is used as the power generation prediction model, and the output result of the power generation prediction model is obtained by fusing the prediction results obtained according to the multiple regression algorithms
  • the preprocessing of the historical environmental data includes:
  • the daily historical environmental data includes monitoring values of multiple environmental characteristics
  • the feature encoding of the daily historical environmental data includes:
  • Based on the monitoring values of the multiple environmental features included in the daily historical environmental data, a statistical feature corresponding to each of the multiple environmental features is calculated, where the statistical feature includes at least one of a maximum feature, a minimum feature, an average feature, and a standard deviation feature;
  • the plurality of preset environmental characteristics include other environmental characteristics in the plurality of environmental characteristics except the filtered environmental characteristics.
  • At least one embodiment of the present disclosure provides a training system, which includes:
  • a memory configured to store one or more computer programs; and
  • a processor, wherein when the one or more computer programs are executed by the processor, the processor is caused to execute the method for training a photovoltaic power station power generation prediction model according to any embodiment of the present disclosure.
  • At least one embodiment of the present disclosure provides a prediction system, which includes:
  • a memory configured to store one or more computer programs; and
  • a processor, wherein when the one or more computer programs are executed by the processor, the processor is caused to execute the method for predicting power generation of a photovoltaic power station according to any embodiment of the present disclosure.
  • At least one embodiment of the present disclosure provides a non-transitory computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the method for training a photovoltaic power station power generation prediction model according to any embodiment of the present disclosure is implemented.
  • At least one embodiment of the present disclosure provides a non-transitory computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the method for predicting power generation of a photovoltaic power station according to any embodiment of the present disclosure is implemented.
  • Fig. 1 shows a schematic flow chart of a method for predicting power generation of a photovoltaic power station provided by at least one embodiment of the present disclosure
  • Fig. 2 shows a schematic flow chart of a method for constructing a power generation prediction model provided by at least one embodiment of the present disclosure
  • Fig. 3 shows a schematic structural diagram of a photovoltaic power station generating capacity prediction device provided by at least one embodiment of the present disclosure
  • Fig. 4 shows a schematic structural diagram of a preprocessing unit provided by at least one embodiment of the present disclosure
  • Fig. 5 shows a schematic structural diagram of a system provided by at least one embodiment of the present disclosure.
  • Fig. 6 shows a schematic diagram of a computer system suitable for implementing the method for training a photovoltaic power station power generation prediction model, the photovoltaic power station power generation prediction method or device, the training system, or the prediction system according to an embodiment of the present disclosure.
  • the power generation of photovoltaic power plants is also affected by factors such as sunlight, weather, and precipitation.
  • the records of monitoring environmental information hardware and photovoltaic power plant hardware may have inconsistencies in granularity, or records may be missing due to hardware equipment shutdown or operation and maintenance. The existence of these problems makes accurate prediction of photovoltaic power generation capacity face severe challenges.
  • FIG. 1 shows a schematic flowchart of a method for predicting power generation of a photovoltaic power station according to at least one embodiment of the present disclosure.
  • the method includes:
  • Step 110: Obtain effective environmental data.
  • Step 120: Preprocess the effective environmental data to obtain an effective environmental data feature vector.
  • Step 130: Input the effective environmental data feature vector into the pre-built power generation prediction model, and output a prediction result corresponding to the effective environmental data.
  • the prediction result is the daily effective power generation of the photovoltaic power station.
  • the power generation prediction model can be trained by the method of training a photovoltaic power station power generation prediction model described below, which will not be repeated here.
  • the effective environmental data includes one-day monitoring values of multiple preset environmental characteristics of the photovoltaic power station.
  • effective environmental data refers to the monitoring values of multiple preset environmental characteristics of photovoltaic power plants obtained according to the granularity period during the working hours of the day.
  • the granularity period is defined as 1 hour, 30 minutes, etc.
  • the daily working hours can be determined according to the sunshine hours in different regions. For example, in Beijing, China, the daily working hours can be expressed from 7:00 to 18:00 Beijing time.
  • the aforementioned effective environmental data may include hourly monitoring values of multiple preset environmental characteristics of the photovoltaic power station obtained between 7:00 and 18:00.
  • the aforementioned preset environmental characteristics may include multiple of the following environmental characteristics, such as wind direction, wind speed, temperature, humidity, atmospheric pressure, precipitation, total radiation, direct radiation, diffuse radiation, exposure, sunshine hours, etc.
  • the monitoring values of these environmental characteristics can be recorded by the power station environmental monitor.
  • the preset environment characteristics will be described with reference to the method for training a photovoltaic power station power generation prediction model according to at least one embodiment of the present disclosure.
  • the forecast target is the daily effective power generation of the photovoltaic power station, for example, the difference between the cumulative positive active power generation reading of the power station recorded at 00:00 of the next day and the reading recorded at 00:00 of the current day.
  • the monitoring of environmental data is performed 24 hours a day without interruption, and the effective working time of photovoltaic power plants in actual work is related to the length of sunshine.
  • the actual effective working hours of photovoltaic power plants in Beijing, China may be 7:00-18:00 daily. Therefore, when predicting the effective power generation of photovoltaic power plants, effective environmental data is one of the important factors that affect the accuracy of the prediction results.
  • the idea of using effective environmental data to predict the daily positive active power generation of a photovoltaic power station is proposed, where the effective environmental data collection granularity may be a fixed period, such as 1 hour.
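  • The daily target described above can be sketched as a difference of consecutive midnight meter readings. This is a minimal illustration, not the disclosed implementation: the function name `daily_generation`, the dictionary layout, and the sample readings are all hypothetical.

```python
from datetime import date, timedelta


def daily_generation(midnight_readings):
    """Return {day: generation}, where generation is the cumulative reading at
    00:00 of the next day minus the cumulative reading at 00:00 of that day."""
    result = {}
    for day, reading in midnight_readings.items():
        next_day = day + timedelta(days=1)
        # A day is skipped when the next midnight reading is missing.
        if next_day in midnight_readings:
            result[day] = midnight_readings[next_day] - reading
    return result


# Hypothetical cumulative positive active energy readings taken at 00:00.
readings = {
    date(2020, 5, 1): 1000.0,
    date(2020, 5, 2): 1250.0,
    date(2020, 5, 3): 1430.0,
}
daily = daily_generation(readings)
```

  • Days without a following midnight reading are left out rather than guessed, which is one possible way to handle records that are missing because of equipment shutdown or maintenance.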
  • the feature encoding of effective environmental data includes the following steps:
  • the statistical feature corresponding to each of the multiple preset environmental features is calculated respectively, and the statistical feature includes at least one of the maximum feature, the minimum feature, the average feature, and the standard deviation feature;
  • the one-day monitoring values of multiple preset environmental characteristics of the photovoltaic power station included in the effective environmental data are obtained according to the granularity period during the working hours of the day.
  • each preset environmental feature in the effective environmental data is flattened into a 12-dimensional vector of the hourly monitoring values of a single day, and the statistical features corresponding to each preset environmental feature are calculated.
  • the statistical features may include the maximum feature, minimum feature, average feature, and standard deviation feature. Each environmental feature is then integrated with its statistical features, and the resulting 16-dimensional vector, formed by the single-day hourly monitoring values, maximum feature, minimum feature, average feature, and standard deviation feature, is used as the vector of each environmental feature.
  • the principal component analysis method is used to convert multiple environmental features into a set of linearly independent effective environmental data feature vectors.
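  • The feature encoding step (12 hourly values plus four statistics, giving a 16-dimensional vector) can be sketched as follows. This is only an illustration under assumptions: the names `WORKING_HOURS` and `encode_feature` are hypothetical, the readings are made up, and the population standard deviation is used because the text does not say which variant is intended.

```python
import statistics

WORKING_HOURS = range(7, 19)  # assumed slots: 7:00 through 18:00, one per hour (12 values)


def encode_feature(hourly_values):
    """Append max, min, mean and standard deviation to one day's hourly values."""
    values = [float(v) for v in hourly_values]
    if len(values) != len(WORKING_HOURS):
        raise ValueError("expected one value per working hour")
    stats = [
        max(values),
        min(values),
        statistics.mean(values),
        statistics.pstdev(values),  # population standard deviation (an assumption)
    ]
    return values + stats


# Made-up hourly readings 0.0 .. 11.0 for a single environmental feature.
temperature = [float(h - 7) for h in WORKING_HOURS]
vector = encode_feature(temperature)
```

  • The first 12 entries of `vector` are the hourly monitoring values and the last four are the statistical features, so every environmental feature ends up with the same 16-dimensional layout.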
  • the preprocessing process of the effective environmental data in the embodiments of the present disclosure is illustrated as follows. It is assumed that the effective environmental data obtained at 8:00 am is as shown in the following table:
  • Environmental characteristics refer to wind direction, wind speed, temperature, humidity, atmospheric pressure, precipitation, total radiation, direct radiation, diffuse radiation, exposure, sunshine hours, etc.
  • Each environmental characteristic is that the environmental monitor obtains real data according to hourly monitoring, that is, the granularity period is 1 hour.
  • the environmental characteristics at other time points that are not monitored are filled with zeros to construct effective environmental data.
  • the statistical features of each environmental feature are its maximum, minimum, average, and standard deviation.
  • these statistical features are integrated with the corresponding environmental features to obtain the vector of each environmental feature.
  • the principal component analysis method is used to convert all environmental features into a set of linearly independent effective environmental data feature vectors.
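  • The zero filling of time points that have not yet been monitored can be sketched as below. The helper name `build_daily_vector`, the hour range, and the sample readings are assumptions made for illustration only.

```python
def build_daily_vector(observed, hours=range(7, 19)):
    """Return one value per working hour, using 0.0 for hours with no reading.

    `observed` maps an hour of the day (e.g. 8 for 8:00) to its monitoring value.
    """
    return [float(observed.get(hour, 0.0)) for hour in hours]


# Hypothetical situation at 8:00: only the 7:00 and 8:00 readings exist so far.
temperature = build_daily_vector({7: 18.5, 8: 20.1})
```

  • The result always has one slot per working hour, so a partially observed day can still be encoded with the same vector layout as a complete day.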
  • the effective environmental data feature vector is input to the pre-built power generation prediction model, and the prediction result corresponding to the effective environmental data is output.
  • the prediction result is the daily effective power generation of the photovoltaic power station. The method of training the power generation prediction model will be described in detail below.
  • the power generation prediction model provided in the embodiments of the present disclosure may be trained using multiple regression algorithms, for example, Bayesian Ridge regression algorithm, support vector regression algorithm, gradient boosting tree algorithm, and extreme random forest algorithm. That is, the power generation forecasting model includes sub-models constructed according to Bayesian Ridge regression algorithm, support vector regression algorithm, gradient boosting tree algorithm and extreme random forest algorithm. It should be understood that the aforementioned Bayesian ridge regression algorithm, support vector regression algorithm, gradient boosting tree algorithm, and extreme random forest algorithm are only exemplary, and the embodiments of the present disclosure are not limited thereto.
  • the output result of the power generation prediction model is obtained by fusing the prediction results obtained according to multiple regression algorithms.
  • the output result of the power generation prediction model may be a weighted average of the prediction results obtained according to multiple regression algorithms, which is not limited in the embodiment of the present disclosure.
  • the embodiment of the present disclosure unifies the environmental monitoring statistical period with the statistical period of the photovoltaic power station power generation, and performs refined preprocessing on the environmental monitoring data, so that the statistical results are more accurate.
  • FIG. 2 shows a schematic flowchart of a method for training a photovoltaic power station power generation prediction model provided by at least one embodiment of the present disclosure. As shown in Figure 2, the method includes:
  • Step 201: Obtain daily historical power generation data of the photovoltaic power station for multiple days and historical environmental data for each day of the multiple days;
  • Step 202: Preprocess the historical environmental data;
  • Step 203: Construct a training set and a prediction set based on the historical power generation data and the preprocessed historical environmental data, and use multiple regression algorithms for training respectively to construct a mapping relationship between the historical environmental data and the historical power generation data as the power generation prediction model, where the output result of the power generation prediction model is obtained by fusing the prediction results obtained according to the above-mentioned multiple regression algorithms.
  • the multiple regression algorithms may include at least two of the following algorithms: Bayesian Ridge Regression Algorithm, Support Vector Regression Algorithm, Gradient Boosting Tree Algorithm, and Extreme Random Forest Algorithm.
  • preprocessing historical environmental data may include:
  • the aforementioned daily historical environmental data includes daily monitoring values of multiple environmental characteristics.
  • the multiple environmental characteristics may include wind direction, wind speed, temperature, humidity, atmospheric pressure, precipitation, total radiation, direct radiation, diffuse radiation, exposure, and sunshine hours.
  • the embodiments of the present disclosure are not limited to this.
  • the aforementioned daily historical environmental data may be historical environmental data obtained during daily working hours according to a granularity period.
  • the aforementioned daily historical environmental data may include the monitoring values of multiple environmental characteristics obtained during daily working hours according to the granularity period.
  • the granularity period for obtaining historical environmental data should be the same as the granularity period for obtaining the effective environmental data described above.
  • the historical power generation data of the photovoltaic power station in a preset time range and the historical environmental data obtained during the daily working hours in the preset time range are used as the sample data set, and the sample data set is divided proportionally into a training set and a prediction set, for example at a ratio of 7:3.
  • the preset time range may be 120 days, 30 days, etc., for example.
  • the historical power generation of the photovoltaic power station within 30 days and the historical environmental data recorded by the power station environment monitor every day according to the granularity cycle within the 30 days are acquired.
  • the historical power generation data correspond to the daily positive active power generation of each photovoltaic power station.
  • the historical power generation can be read from the electricity meter of the photovoltaic power station.
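  • The proportional division of the sample data set can be sketched as below. The text does not say whether samples are shuffled before splitting, so this sketch simply keeps the original order; the function name `split_samples` and the sample data are hypothetical.

```python
def split_samples(samples, train_ratio=0.7):
    """Split samples into (training set, prediction set) at the given ratio."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be between 0 and 1")
    cut = round(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]


# Hypothetical 30-day sample set of (day label, daily generation) pairs.
samples = [(f"day-{i}", float(i)) for i in range(30)]
training_set, prediction_set = split_samples(samples)
```

  • With 30 days of samples and the 7:3 ratio mentioned above, this yields 21 training samples and 9 prediction samples.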
  • the feature encoding of daily historical environmental data includes:
  • the statistical features can include at least one of the maximum, minimum, average, and standard deviation features;
  • For example, statistical analysis is performed on each environmental feature in Table 1 to obtain a statistical feature corresponding to each environmental feature, and the statistical feature is the maximum, minimum, average, and standard deviation of each environmental feature. Then, these statistical features are integrated with the corresponding environmental features to obtain Table 2, where each column in Table 2 forms a vector for each environmental feature. That is, for each environmental feature, a 16-dimensional vector formed by a single day's hourly monitoring value, maximum feature, minimum feature, average feature, and standard deviation feature is used as the vector of each environmental feature.
  • filtering environmental features whose sparsity is greater than or equal to a sparsity threshold includes:
  • the ratio between the number of data equal to zero in the vector of each environmental feature and the total number of data in the vector of each environmental feature is taken as the sparsity of each environmental feature.
  • the sparsity threshold can be set according to requirements. For example, whether the sparsity is greater than or equal to the sparsity threshold can be judged according to the following formula:
  • normalizing effective environment data after feature encoding includes normalizing a vector of environment features whose sparsity is less than a sparsity threshold.
  • if the sparsity of one or more environmental features among wind direction, wind speed, temperature, humidity, atmospheric pressure, precipitation, total radiation, direct radiation, diffuse radiation, exposure, and sunshine hours is greater than or equal to the sparsity threshold, the one or more environmental features are filtered out, and the vectors of the remaining environmental features are normalized.
  • the preset environmental characteristics used in the method for predicting power generation of a photovoltaic power plant are environmental characteristics other than the filtered environmental characteristics among the multiple environmental characteristics included in the historical environmental data.
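  • The sparsity filter can be sketched as follows, with sparsity computed as the share of zero entries in a feature vector. The function names, the threshold value 0.5, and the sample vectors are hypothetical choices made for illustration.

```python
def sparsity(vector):
    """Ratio of entries equal to zero to the total number of entries."""
    if not vector:
        raise ValueError("vector must not be empty")
    return sum(1 for value in vector if value == 0) / len(vector)


def filter_sparse(features, threshold):
    """Keep only the features whose sparsity is below the threshold."""
    return {
        name: vector
        for name, vector in features.items()
        if sparsity(vector) < threshold
    }


# Made-up feature vectors; a real vector would hold 16 entries per feature.
features = {
    "temperature": [18.0, 20.0, 22.0, 21.0],
    "precipitation": [0.0, 0.0, 0.0, 1.2],
}
kept = filter_sparse(features, threshold=0.5)
```

  • Here the precipitation vector is 75% zeros, so it is dropped, while temperature is kept and would go on to normalization.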
  • converting the normalized daily historical environmental data into linearly independent historical environmental data feature vectors includes:
  • Principal component analysis is used to convert the normalized daily historical environmental data into linearly independent historical environmental data feature vectors.
  • the historical environment data feature vectors are input as training samples to multiple regression algorithms for training.
  • the multiple regression algorithms may include but are not limited to at least two of Bayesian Ridge regression algorithm, support vector regression algorithm, gradient boosting tree algorithm, and extreme random forest algorithm.
  • Bayesian Ridge Regression Algorithm is a biased estimation regression method for collinearity data analysis.
  • the parameter W obeys Gaussian distribution
  • the precision of the data noise is $\beta$
  • $\beta^{-1}$ corresponds to the variance of the Gaussian distribution of the sample set X
  • $\alpha^{-1}$ corresponds to the variance of the Gaussian distribution of the parameter W
  • h W (X)-Y obeys the Gaussian distribution
  • the logarithmic posterior probability distribution formula of the linear model is:
  • is the posterior probability under the condition of the known sample set X
  • T is the target value vector of the sample data
  • const is a constant independent of the parameter W
  • the learning process of the Bayesian Ridge Regression algorithm is: multiply the posterior probability of the previous training set by the likelihood estimate of the new test sample point to obtain the posterior probability of the new training set.
  • X is drawn from the full set of samples. As the sample size increases, the predicted value h W (X) gradually converges and stabilizes, and finally the stable predicted value h W (X) is output.
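  • For reference, the log posterior of a Bayesian linear model with a zero-mean Gaussian prior on W and Gaussian noise is commonly written in the textbook form below. The symbols $\alpha$ (precision of the prior on W), $\beta$ (precision of the data noise), and the sample index $n$ are assumptions of this sketch and may not match the notation of the formula referred to above.

```latex
\ln p(W \mid T) = -\frac{\beta}{2} \sum_{n=1}^{N} \bigl( h_W(X_n) - T_n \bigr)^2 - \frac{\alpha}{2} W^{\mathsf{T}} W + \text{const}
```

  • Maximizing this expression with respect to W is equivalent to minimizing a squared error with a quadratic penalty on W, which is why the method behaves like ridge regression.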
  • Support vector regression algorithm is also a regression method.
  • the support vector regression algorithm uses support vector ideas and Lagrangian multiplier methods to perform regression analysis on data when fitting.
  • the sequential minimal optimization algorithm can be used to find the corresponding $\alpha$, $\alpha^*$ when the above formula is minimized. The algorithm selects two variables through a two-layer loop so as to maximize the decrease of the objective function, and then updates the regression model parameters. When the termination condition is met within the accuracy range $\varepsilon$, the algorithm ends and outputs h(x) under the optimal parameters as the prediction result.
  • the gradient boosting tree (GBRT) algorithm is an iterative learning model based on the CART regression tree.
  • Extreme random forest uses all samples every time it generates a decision tree, and the decision tree split is completely random.
  • When the feature attribute is in category form, different categories are randomly grouped together as branches; when the feature attribute is in numerical form, any number between the maximum and minimum values of the feature attribute is randomly selected as the split value to establish a branch.
  • When the decision tree is split, all feature attributes in the node are traversed, the branches of all feature attributes are obtained according to the above method, the mean square error is calculated, and the feature with the largest mean square error is finally selected for splitting.
  • a proportional voting model is used to fuse the prediction results of the sub-models constructed by the above various algorithms to obtain the final prediction results.
  • using the proportional voting model to fuse the prediction results of the sub-models constructed by the above algorithms includes: computing a weighted average of the prediction results obtained by each regression algorithm, and using the weighted average as the final prediction result.
  • weighted average of the prediction results obtained by each regression algorithm is only an example of fusing the prediction results obtained by each regression algorithm.
  • the prediction results obtained by various regression algorithms can also be fused in other ways, which is not limited in the embodiments of the present disclosure.
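  • The weighted-average fusion can be sketched as below. The function name `fuse_predictions`, the sub-model labels, the equal weights, and the prediction values are all hypothetical; the text does not specify how the weights are chosen.

```python
def fuse_predictions(predictions, weights):
    """Weighted average of per-model predictions; both dicts are keyed by model name."""
    if predictions.keys() != weights.keys():
        raise ValueError("predictions and weights must cover the same models")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(predictions[name] * weights[name] for name in predictions)
    return weighted_sum / total_weight


# Made-up daily generation predictions from four sub-models.
predictions = {"bayesian_ridge": 100.0, "svr": 110.0, "gbrt": 90.0, "extra_trees": 100.0}
weights = {"bayesian_ridge": 1.0, "svr": 1.0, "gbrt": 1.0, "extra_trees": 1.0}
fused = fuse_predictions(predictions, weights)
```

  • Dividing by the sum of the weights means the weights do not need to add up to 1, so they could, for example, be derived from each sub-model's error on the prediction set.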
  • the power generation prediction model constructed in the embodiments of the present disclosure is trained using multiple regression algorithms, which can help prevent overfitting, and the final prediction result is obtained by fusing the prediction results of the multiple regression algorithms, which can improve prediction accuracy.
  • the embodiments of the present disclosure can overcome the problem of granularity inconsistency between the power generation of photovoltaic power plants and environmental data: the training features use a small unit of measurement, such as hours or minutes, while the prediction target uses a large unit of measurement, such as days. The embodiments of the present disclosure also obtain multiple prediction results through multiple different regression models and obtain the final prediction result by proportional voting, which can give better prediction accuracy without causing overfitting.
  • FIG. 3 shows a schematic structural diagram of a photovoltaic power station generating capacity prediction device 300 provided by at least one embodiment of the present disclosure.
  • the photovoltaic power station generating capacity prediction device 300 includes:
  • the obtaining unit 301 is configured to obtain effective environmental data.
  • the effective environmental data includes one day's monitoring values of multiple preset environmental characteristics of the photovoltaic power station.
  • the preprocessing unit 302 is configured to preprocess the effective environmental data to obtain an effective environmental data feature vector
  • the prediction unit 303 is configured to input the effective environmental data feature vector into a pre-built power generation prediction model, and output a prediction result corresponding to the effective environmental data.
  • the prediction result is the effective power generation of the photovoltaic power station in a day.
  • the power generation prediction model can be trained by the method of training the photovoltaic power station power generation prediction model described above, which will not be repeated here.
  • the effective environmental data refers to the monitoring values of multiple preset environmental characteristics of the photovoltaic power station obtained according to the granularity period during the working hours of a day.
  • the granularity period is defined as 1 hour, 30 minutes, etc.
  • the daily working hours can be determined according to the sunshine hours in different regions. For example, in Beijing, China, the daily working hours can be expressed from 7:00 to 18:00 Beijing time.
  • the aforementioned effective environmental data may include hourly monitoring values of multiple preset environmental characteristics of the photovoltaic power station obtained between 7:00 and 18:00.
  • the preset environmental characteristics in the aforementioned effective environmental data may include multiple environmental characteristics, such as wind direction, wind speed, temperature, humidity, atmospheric pressure, precipitation, total radiation, direct radiation, diffuse radiation, exposure, sunshine hours, etc.
  • the monitoring values of these environmental characteristics can be recorded by the power station environmental monitor.
  • the prediction target is the daily effective power generation of the photovoltaic power station, for example, the difference between the station's historical cumulative forward active power generation recorded at 0:00 of the next day and that recorded at 0:00 of the current day.
  • the monitoring of environmental data is performed 24 hours a day without interruption, and the effective working time of photovoltaic power plants in actual work is related to the length of sunshine.
  • the actual effective working hours of photovoltaic power plants in Beijing, China may be 7:00-18:00 daily. Therefore, when predicting the effective power generation of photovoltaic power plants, effective environmental data is one of the important factors that affect the accuracy of the prediction results.
  • the idea of using effective environmental data to predict the daily positive active power generation of a photovoltaic power station is proposed, where the effective environmental data collection granularity may be a fixed period, such as 1 hour.
  • for the obtaining unit 301, the preprocessing unit 302, and the prediction unit 303, please refer to the detailed description of step 110, step 120, and step 130 above, which will not be repeated in the embodiment of the present disclosure.
  • the preprocessing unit 302 may further include:
  • the feature encoding subunit 3021 is configured to perform feature encoding on the effective environmental data.
  • the normalization subunit 3022 is configured to normalize the effective environmental data after feature encoding.
  • the dimensionality reduction processing subunit 3023 is configured to convert the normalized effective environmental data into a linearly independent effective environmental data feature vector.
  • the feature encoding subunit 3021 is further configured to: based on one day's monitoring values of the multiple preset environmental features of the photovoltaic power station included in the effective environmental data, calculate the statistical feature corresponding to each of the multiple preset environmental features, the statistical feature including at least one of a maximum feature, a minimum feature, an average feature, and a standard deviation feature; filter out any environmental feature or statistical feature whose sparsity is greater than or equal to the sparsity threshold; and use the monitored value of each of the multiple preset environmental features and the corresponding statistical feature as the vector of that preset environmental feature.
  • the normalization subunit 3022 is further configured to normalize the vector of each preset environmental feature.
  • the dimensionality reduction processing subunit 3023 is further configured to adopt a principal component analysis method to convert the normalized effective environmental data into a linearly independent effective environmental data feature vector.
  • the embodiment of the present disclosure unifies the environmental monitoring statistical period with the statistical period of the photovoltaic power station power generation, and performs refined preprocessing on the environmental monitoring data, so that the statistical results are more accurate.
  • FIG. 5 shows a schematic structural diagram of a system provided by at least one embodiment of the present disclosure.
  • the system shown in FIG. 5 may be a prediction system.
  • the prediction system may include:
  • the power station environment monitoring device 401 is configured to monitor the photovoltaic power station to obtain environmental data of the photovoltaic power station;
  • the memory 402 is configured to store one or more computer programs
  • when the one or more computer programs stored in the memory 402 are executed by the processor 403, the processor 403 is caused to execute the method for predicting the power generation amount of the photovoltaic power plant described in the foregoing embodiment.
  • the power station environment monitoring device 401 may be connected to the memory 402 and the processor 403 by wired or wireless connection. Wired and wireless connections mainly implement communication functions for data transmission and control signaling transmission.
  • the power station environment monitoring device 401 may include, for example, various sensors for detecting various environmental characteristics.
  • the photovoltaic power station prediction system may not include the power station environment monitoring device 401.
  • the environmental data can be manually input or the environmental data can be stored in the memory in advance.
  • the power station environment monitoring device 401 may be configured to monitor multiple preset environmental characteristics of the photovoltaic power station to obtain one day's monitoring value of the multiple preset environmental characteristics of the photovoltaic power station.
  • the power station environment monitoring device 401 may be configured to monitor multiple preset environmental characteristics of the photovoltaic power station, so as to obtain the monitoring values of the multiple preset environmental characteristics of the photovoltaic power station according to the granularity period during the working hours of the day.
  • the memory 402 and the processor 403 may be implemented as the same device, or may exist independently.
  • the processor 403 may be implemented as a functional structure for implementing the method for predicting the power generation amount of a photovoltaic power plant described in the foregoing embodiment.
  • the system shown in FIG. 5 may also be a training system, which is used to implement the method for training a photovoltaic power station power generation prediction model described in the foregoing embodiment.
  • when the one or more computer programs stored in the memory 402 are executed by the processor 403, the processor 403 is caused to execute the method for training a photovoltaic power station power generation prediction model described in the foregoing embodiment.
  • the power station environment monitoring device 401 may be configured to monitor the photovoltaic power station to obtain daily historical environmental data of the photovoltaic power station for multiple days.
  • the power station environment monitoring device 401 may be configured to obtain historical environmental data according to the granularity period during the daily working hours of multiple days.
  • the units or modules recorded in the photovoltaic power station power generation forecasting device 300 correspond to the steps in the method described with reference to FIG. 1. Therefore, the operations and features described above for the method in FIG. 1 are also applicable to the device 300 and the units contained therein, and will not be repeated here.
  • the apparatus 300 may be implemented in a browser or other security application of an electronic device in advance, or it may be loaded into the browser of the electronic device or its security application by downloading or the like.
  • Corresponding units in the photovoltaic power station generating capacity prediction device 300 can cooperate with units in electronic equipment to implement the solutions of the embodiments of the present disclosure.
  • reference is made to FIG. 6, which shows the structure of a computer system 500 suitable for implementing a method for training a photovoltaic power station power generation prediction model, a photovoltaic power generation prediction method or device, a training system or a prediction system according to an embodiment of the present disclosure.
  • the computer system 500 includes a central processing unit (CPU) 501, which can perform various appropriate actions and processing based on a program stored in a read-only memory (ROM) 502 or a program loaded from a storage section 508 into a random access memory (RAM) 503.
  • in the RAM 503, various programs and data required for the operation of the system 500 are also stored.
  • the CPU 501, the ROM 502, and the RAM 503 are connected to each other through a bus 504.
  • An input/output (I/O) interface 505 is also connected to the bus 504.
  • the following components are connected to the I/O interface 505: an input section 506 including a keyboard, a mouse, etc.; an output section 507 including a cathode ray tube (CRT), a liquid crystal display (LCD), etc., and a speaker; a storage section 508 including a hard disk, etc.; and a communication section 509 including a network interface card such as a LAN card, a modem, etc. The communication section 509 performs communication processing via a network such as the Internet.
  • the drive 510 is also connected to the I/O interface 505 as needed.
  • a removable medium 511, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is installed on the drive 510 as needed, so that the computer program read from it is installed into the storage section 508 as needed.
  • an embodiment of the present disclosure includes a computer program product, which includes a computer program carried on a machine-readable medium, and the computer program contains program code for executing the method shown in the flowchart.
  • the computer program may be downloaded and installed from the network through the communication section 509, and/or installed from the removable medium 511.
  • CPU central processing unit
  • the computer-readable medium shown in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two.
  • the computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the above. More specific examples of computer-readable storage media may include, but are not limited to: electrical connections with one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in a baseband or as a part of a carrier wave, and a computer-readable program code is carried therein. This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium.
  • the computer-readable medium may send, propagate, or transmit the program for use by or in combination with the instruction execution system, apparatus, or device.
  • the program code contained on the computer-readable medium can be transmitted by any suitable medium, including but not limited to: wireless, wire, optical cable, RF, etc., or any suitable combination of the above.
  • each block in the flowchart or block diagram may represent a module, program segment, or part of the code, and the aforementioned module, program segment, or part of the code contains one or more executable instructions for realizing the specified logic function.
  • the functions marked in the block may also occur in a different order from the order marked in the drawings. For example, two blocks shown in succession can actually be executed substantially in parallel, and they can sometimes be executed in the reverse order, depending on the functions involved.
  • each block in the block diagram and/or flowchart, and any combination of blocks in the block diagram and/or flowchart, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units or modules involved in the embodiments described in the present disclosure can be implemented in software or hardware.
  • the described unit or module may also be provided in the processor.
  • a processor includes an acquisition unit, a preprocessing unit, and a prediction unit.
  • the names of these units or modules do not constitute a limitation on the units or modules themselves under certain circumstances.
  • the acquiring unit can also be described as "a unit for acquiring effective environmental data.”
  • the present disclosure also provides a computer-readable storage medium.
  • the computer-readable storage medium may be included in the electronic device described in the above embodiment, or it may exist alone without being assembled into the electronic device.
  • the computer-readable storage medium stores one or more programs, and the foregoing programs are used by one or more processors to execute the method for predicting the power generation amount of a photovoltaic power station described in the present disclosure.

Abstract

A method for training a power generation amount prediction model of a photovoltaic power station, a power generation amount prediction method and device of the photovoltaic power station, a training system, a prediction system and a storage medium. The power generation amount prediction method of the photovoltaic power station comprises: obtaining effective environment data, wherein the effective environment data comprises monitoring values of multiple preset environment features of the photovoltaic power station in one day; preprocessing the effective environment data to obtain an effective environment data feature vector; and inputting the effective environment data feature vector into a pre-constructed power generation amount prediction model, and outputting a prediction result corresponding to the effective environment data, wherein the prediction result is the effective power generation amount of the photovoltaic power station in one day.

Description

Method for training a power generation prediction model of a photovoltaic power station, method and device for predicting power generation of a photovoltaic power station, training system, prediction system, and storage medium
Cross-reference to related applications
This application claims priority to Chinese Patent Application No. 201910401500.9, filed on May 14, 2019, the entire disclosure of which is incorporated herein by reference as part of this application.
Technical field
The embodiments of the present disclosure relate to a method for training a power generation prediction model of a photovoltaic power station, a method and device for predicting power generation of a photovoltaic power station, a training system, a prediction system, and a storage medium.
Background
Photovoltaic power generation is increasingly widely used in energy and power systems. Accurate prediction of the power generation of a photovoltaic power station has an important impact on grid security, on power generation control and dispatch, and on the operation and maintenance management of the station.
Summary of the invention
At least one embodiment of the present disclosure provides a method for training a power generation prediction model of a photovoltaic power station, which includes:
acquiring daily historical power generation data of the photovoltaic power station for multiple days and daily historical environmental data for the multiple days;
preprocessing the historical environmental data; and
constructing a training set and a prediction set based on the historical power generation data and the preprocessed historical environmental data, and training with multiple regression algorithms respectively, so as to construct a mapping relationship between the historical environmental data and the historical power generation data as the power generation prediction model, wherein the output of the power generation prediction model is obtained by fusing the prediction results obtained by the multiple regression algorithms.
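The training step above can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of the disclosure: each day is reduced to a single preprocessed feature, the "multiple regression algorithms" are stood in for by a least-squares line and a k-nearest-neighbour regressor, the fusion is a plain average, and the "prediction set" is treated as a held-out set for checking the fused model. None of these choices is specified in this passage, and all function names and data values are hypothetical.

```python
from statistics import mean

def fit_linear(xs, ys):
    """Ordinary least squares on one feature; returns a predictor function."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept

def fit_knn(xs, ys, k=2):
    """k-nearest-neighbour regression; returns a predictor function."""
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return mean(y for _, y in nearest)
    return predict

def train_fused_model(xs, ys):
    """Train each regressor separately, then fuse by averaging (assumed rule)."""
    models = [fit_linear(xs, ys), fit_knn(xs, ys)]
    return lambda x: mean(m(x) for m in models)

# Hypothetical history: x = one preprocessed environmental feature per day,
# y = that day's effective power generation.
train_x, train_y = [1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]
held_out_x, held_out_y = [2.5], [25.0]   # the "prediction set" in this sketch

model = train_fused_model(train_x, train_y)
prediction = model(2.5)
mean_abs_error = mean(abs(model(x) - y) for x, y in zip(held_out_x, held_out_y))
```

A weighted average or a stacked model could replace the plain average without changing the overall structure.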
At least one embodiment of the present disclosure provides a method for predicting power generation of a photovoltaic power station, which includes:
acquiring effective environmental data, the effective environmental data including one day's monitoring values of multiple preset environmental features of the photovoltaic power station;
preprocessing the effective environmental data to obtain an effective environmental data feature vector; and
inputting the effective environmental data feature vector into a pre-built power generation prediction model, and outputting a prediction result corresponding to the effective environmental data, the prediction result being the effective power generation of the photovoltaic power station for the one day,
wherein the power generation prediction model is trained through the following operations:
acquiring historical power generation data of the photovoltaic power station for multiple days and daily historical environmental data for the multiple days;
preprocessing the historical environmental data; and
constructing a training set and a prediction set based on the historical power generation data and the preprocessed historical environmental data, and training with multiple regression algorithms respectively, so as to construct a mapping relationship between the historical environmental data and the historical power generation data as the power generation prediction model, wherein the output of the power generation prediction model is obtained by fusing the prediction results obtained by the multiple regression algorithms,
the preprocessing of the historical environmental data includes:
performing feature encoding on the daily historical environmental data;
normalizing the daily historical environmental data after feature encoding; and
converting the normalized daily historical environmental data into linearly independent historical environmental data feature vectors,
the daily historical environmental data includes monitoring values of multiple environmental features; and
the feature encoding of the daily historical environmental data includes:
based on the monitoring values of the multiple environmental features included in the daily historical environmental data, calculating a statistical feature corresponding to each of the multiple environmental features, the statistical feature including at least one of a maximum feature, a minimum feature, an average feature, and a standard deviation feature;
using the monitoring value of each of the multiple environmental features and the corresponding statistical feature as the vector of that environmental feature; and
filtering out, from the multiple environmental features, the environmental features whose sparsity is greater than or equal to a sparsity threshold, and
the multiple preset environmental features include the environmental features, among the multiple environmental features, other than those filtered out.
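The feature encoding and normalization steps listed above can be illustrated with a short sketch. It rests on assumptions the text does not state: sparsity is taken to be the fraction of missing readings in a feature's daily series, missing readings in retained features are filled with the mean of the present ones, and normalization is min-max scaling of each feature vector. The function names and sample values are hypothetical.

```python
from statistics import mean, pstdev

def sparsity(values):
    """Assumed definition: fraction of missing (None) readings in one day."""
    return sum(v is None for v in values) / len(values)

def encode_day(day, sparsity_threshold=0.5):
    """Map each retained feature to its readings plus [max, min, mean, std]."""
    encoded = {}
    for name, values in day.items():
        if sparsity(values) >= sparsity_threshold:
            continue  # filter out features at or above the sparsity threshold
        present = [v for v in values if v is not None]
        fill = mean(present)  # assumed: mean imputation for the gaps
        readings = [fill if v is None else v for v in values]
        stats = [max(present), min(present), mean(present), pstdev(present)]
        encoded[name] = readings + stats
    return encoded

def min_max_normalize(vector):
    """Scale one feature vector to [0, 1]; constant vectors map to zeros."""
    lo, hi = min(vector), max(vector)
    if hi == lo:
        return [0.0] * len(vector)
    return [(v - lo) / (hi - lo) for v in vector]

# Hypothetical readings for one day at three sampling instants.
day = {
    "temperature": [10.0, 20.0, 30.0],
    "precipitation": [None, None, 0.2],   # two of three readings missing
}
encoded = encode_day(day)
normalized = {name: min_max_normalize(vec) for name, vec in encoded.items()}
```

Here `precipitation` is dropped because two thirds of its readings are missing, while `temperature` keeps its three readings followed by the four statistics.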
At least one embodiment of the present disclosure provides a device for predicting power generation of a photovoltaic power station, which includes:
an obtaining unit, configured to obtain effective environmental data, the effective environmental data including one day's monitoring values of multiple preset environmental features of the photovoltaic power station;
a preprocessing unit, configured to preprocess the effective environmental data to obtain an effective environmental data feature vector; and
a prediction unit, configured to input the effective environmental data feature vector into the pre-built power generation prediction model, and output a prediction result corresponding to the effective environmental data, the prediction result being the effective power generation of the photovoltaic power station for the one day,
wherein the power generation prediction model is trained through the following operations:
acquiring daily historical power generation data of the photovoltaic power station for multiple days and daily historical environmental data for the multiple days;
preprocessing the historical environmental data; and
constructing a training set and a prediction set based on the historical power generation data and the preprocessed historical environmental data, and training with multiple regression algorithms respectively, so as to construct a mapping relationship between the historical environmental data and the historical power generation data as the power generation prediction model, wherein the output of the power generation prediction model is obtained by fusing the prediction results obtained by the multiple regression algorithms,
the preprocessing of the historical environmental data includes:
performing feature encoding on the daily historical environmental data;
normalizing the daily historical environmental data after feature encoding; and
converting the normalized daily historical environmental data into linearly independent historical environmental data feature vectors,
the daily historical environmental data includes monitoring values of multiple environmental features; and
the feature encoding of the daily historical environmental data includes:
based on the monitoring values of the multiple environmental features included in the daily historical environmental data, calculating a statistical feature corresponding to each of the multiple environmental features, the statistical feature including at least one of a maximum feature, a minimum feature, an average feature, and a standard deviation feature;
using the monitoring value of each of the multiple environmental features and the corresponding statistical feature as the vector of that environmental feature; and
filtering out, from the multiple environmental features, the environmental features whose sparsity is greater than or equal to a sparsity threshold, and
the multiple preset environmental features include the environmental features, among the multiple environmental features, other than those filtered out.
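The conversion of normalized data into linearly independent feature vectors is attributed elsewhere in the disclosure to principal component analysis, without naming an algorithm. The sketch below is one possible illustration using power iteration with deflation on the sample covariance matrix. That numerical technique, the function names, and the sample rows are all assumptions made here for illustration, and a production system would more likely rely on a numerical library.

```python
import math

def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_eigenvector(matrix, iterations=500):
    """Power iteration; the fixed start vector is a simplification and could,
    in contrived cases, be orthogonal to the leading eigenvector."""
    d = len(matrix)
    v = [float(j + 1) for j in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return eigenvalue, v

def pca(rows, n_components):
    """Project rows onto the leading principal components."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for _ in range(n_components):
        eigenvalue, v = top_eigenvector(cov)
        components.append(v)
        # Deflation: remove the component just found from the matrix.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return [[sum(r[j] * c[j] for j in range(d)) for c in components]
            for r in centered]

# Hypothetical normalized data: four days, two correlated features.
rows = [[1.0, 1.0], [2.0, 3.0], [3.0, 2.0], [4.0, 4.0]]
scores = pca(rows, n_components=2)
```

The two columns of `scores` are uncorrelated even though the two input features are correlated, which is the property the dimensionality reduction step relies on.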
At least one embodiment of the present disclosure provides a training system, which includes:
a processor; and
a memory, configured to store one or more computer programs;
wherein, when the one or more computer programs are executed by the processor, the processor is caused to execute the method for training a power generation prediction model of a photovoltaic power station according to any embodiment of the present disclosure.
At least one embodiment of the present disclosure provides a prediction system, which includes:
a processor; and
a memory, configured to store one or more computer programs;
wherein, when the one or more computer programs are executed by the processor, the processor is caused to execute the method for predicting power generation of a photovoltaic power station according to any embodiment of the present disclosure.
At least one embodiment of the present disclosure provides a non-transitory computer-readable storage medium on which a computer program is stored, wherein, when the computer program is executed by a processor, the method for training a power generation prediction model of a photovoltaic power station according to any embodiment of the present disclosure is implemented.
At least one embodiment of the present disclosure provides a non-transitory computer-readable storage medium on which a computer program is stored, wherein, when the computer program is executed by a processor, the method for predicting power generation of a photovoltaic power station according to any embodiment of the present disclosure is implemented.
Description of the drawings
In order to explain the technical solutions of the embodiments of the present disclosure more clearly, the drawings of the embodiments are briefly introduced below. The drawings described below relate only to some embodiments of the present disclosure and do not limit the present disclosure.
FIG. 1 shows a schematic flowchart of a method for predicting power generation of a photovoltaic power station provided by at least one embodiment of the present disclosure;
FIG. 2 shows a schematic flowchart of a method for constructing a power generation prediction model provided by at least one embodiment of the present disclosure;
FIG. 3 shows a schematic structural diagram of a device for predicting power generation of a photovoltaic power station provided by at least one embodiment of the present disclosure;
FIG. 4 shows a schematic structural diagram of a preprocessing unit provided by at least one embodiment of the present disclosure;
FIG. 5 shows a schematic structural diagram of a system provided by at least one embodiment of the present disclosure; and
FIG. 6 shows a schematic structural diagram of a computer system suitable for implementing the method for training a power generation prediction model of a photovoltaic power station, the method or device for predicting power generation of a photovoltaic power station, the training system, and the prediction system according to embodiments of the present disclosure.
Detailed description
In order to make the objectives, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions of the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present disclosure. All other embodiments obtained by those of ordinary skill in the art on the basis of the described embodiments without creative labor fall within the protection scope of the present disclosure.
The power generation of a photovoltaic power station is determined by hardware such as the scale of the station, the photovoltaic panel array, and the power conversion components, and is also affected by the external environment, such as sunlight, weather, and precipitation. However, the records of the environmental monitoring hardware and of the power station hardware may differ in granularity, and records may be missing because equipment was shut down or under maintenance. These problems make accurate prediction of the power generation of a photovoltaic power station very challenging.
Please refer to FIG. 1, which shows a schematic flowchart of a method for predicting power generation of a photovoltaic power station provided by at least one embodiment of the present disclosure.
As shown in FIG. 1, the method includes:
Step 110: acquire effective environmental data.
Step 120: preprocess the effective environmental data to obtain an effective environmental data feature vector.
Step 130: input the effective environmental data feature vector into the pre-built power generation prediction model, and output a prediction result corresponding to the effective environmental data. The prediction result is the daily effective power generation of the photovoltaic power station. The power generation prediction model can be trained by the method for training a power generation prediction model of a photovoltaic power station described below, which will not be repeated here.
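The data flow through steps 110 to 130 can be sketched as a small function that takes the preprocessing routine and the trained model as parameters. This is only an illustration of how the steps connect: the function names, the toy preprocessing (one mean per feature), and the toy linear model are hypothetical stand-ins, not the preprocessing or model of the disclosure.

```python
from typing import Callable, Dict, List, Sequence

DayReadings = Dict[str, List[float]]

def predict_daily_generation(
    effective_data: DayReadings,                           # result of step 110
    preprocess: Callable[[DayReadings], Sequence[float]],
    model: Callable[[Sequence[float]], float],
) -> float:
    feature_vector = preprocess(effective_data)            # step 120
    return model(feature_vector)                           # step 130

def toy_preprocess(day: DayReadings) -> List[float]:
    """Stand-in preprocessing: one mean value per feature, in name order."""
    return [sum(values) / len(values) for _, values in sorted(day.items())]

def toy_model(features: Sequence[float]) -> float:
    """Stand-in for the pre-built prediction model (hypothetical coefficient)."""
    return 100.0 * features[0]

day = {"total_radiation": [0.2, 0.6, 0.4]}   # hypothetical readings
estimate = predict_daily_generation(day, toy_preprocess, toy_model)
```

Passing the preprocessing and the model in as callables keeps the prediction path independent of how the model was trained.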
In the embodiments of the present disclosure, the effective environmental data includes one day's monitoring values of multiple preset environmental features of the photovoltaic power station. For example, the effective environmental data refers to the monitoring values of the multiple preset environmental features of the photovoltaic power station obtained at the granularity period during the working hours of one day. The granularity period may be defined as, for example, 1 hour or 30 minutes. The daily working hours can be determined according to the sunshine hours of the region. For example, in Beijing, China, the daily working hours may be 7:00 to 18:00 Beijing time. For example, when the granularity period is 1 hour, the effective environmental data may include the hourly monitoring values of the multiple preset environmental features of the photovoltaic power station obtained between 7:00 and 18:00.
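Selecting the effective environmental data from round-the-clock monitoring records can be sketched as follows. The sketch assumes that readings are `(timestamp, value)` pairs, that both ends of the 7:00 to 18:00 window are included, and that only readings falling exactly on the granularity grid are kept; the function name and the sample data are hypothetical.

```python
from datetime import datetime, time, timedelta

def select_effective_readings(readings, start=time(7, 0), end=time(18, 0),
                              period=timedelta(hours=1)):
    """Keep readings inside the working hours that fall on the period grid."""
    period_s = int(period.total_seconds())
    start_s = start.hour * 3600 + start.minute * 60 + start.second
    kept = []
    for ts, value in readings:
        if not (start <= ts.time() <= end):
            continue  # outside the assumed working hours
        seconds = ts.hour * 3600 + ts.minute * 60 + ts.second
        if (seconds - start_s) % period_s == 0:
            kept.append((ts, value))
    return kept

# Hypothetical monitor output: one reading every 30 minutes for a full day.
midnight = datetime(2020, 1, 1)
readings = [(midnight + timedelta(minutes=30 * i), float(i)) for i in range(48)]
effective = select_effective_readings(readings)
```

With an hourly granularity period, the half-hour readings and everything outside the working hours are discarded.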
上述的预设环境特征可以包括以下环境特征中的多个,例如风向、风速、温度、湿度、大气压力、降水量、总辐射、直辐射、散辐射、曝辐、日照时数等。这些环境特征的监测值可以由电站环境监测仪记录。下文中将参照根据本公开至少一个实施例的训练光伏电站发电量预测模型的方法来描述预设环境特征。The aforementioned preset environmental characteristics may include multiple of the following environmental characteristics, such as wind direction, wind speed, temperature, humidity, atmospheric pressure, precipitation, total radiation, direct radiation, diffuse radiation, exposure, sunshine hours, etc. The monitoring values of these environmental characteristics can be recorded by the power station environmental monitor. Hereinafter, the preset environment characteristics will be described with reference to the method for training a photovoltaic power station power generation prediction model according to at least one embodiment of the present disclosure.
预测目标是光伏发电站的每日有效发电量。例如,次日零点到当日零点记录的电站的历史累计正向有功发电量的差值。The forecast target is the daily effective power generation of photovoltaic power stations. For example, the difference between the historical cumulative positive active power generation of the power station recorded from the zero point of the next day to the zero point of the day.
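The daily target described above can be sketched as a difference of consecutive midnight meter readings. This is a minimal illustration only: the function name `daily_generation`, the dictionary layout, and the sample readings are assumptions and do not come from the disclosure.

```python
from datetime import date

def daily_generation(cumulative_at_midnight):
    """Map each day to (reading at the next 0:00) minus (reading at that day's 0:00)."""
    days = sorted(cumulative_at_midnight)
    result = {}
    for today, tomorrow in zip(days, days[1:]):
        # Only consecutive calendar days yield a valid one-day difference.
        if (tomorrow - today).days == 1:
            result[today] = cumulative_at_midnight[tomorrow] - cumulative_at_midnight[today]
    return result

# Hypothetical cumulative positive active generation readings taken at 0:00.
readings = {
    date(2020, 5, 1): 10500.0,
    date(2020, 5, 2): 10920.5,
    date(2020, 5, 3): 11300.5,
}
daily = daily_generation(readings)
```

The last day in the series has no following midnight reading, so it gets no target value.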
In some technical solutions, environmental data is monitored continuously 24 hours a day, whereas the effective working time of a photovoltaic power station depends on the length of sunshine. For example, the actual effective working time of a photovoltaic power station in Beijing, China may be 7:00-18:00 each day. Therefore, when predicting the effective power generation of a photovoltaic power station, the effective environmental data is one of the important factors affecting the accuracy of the prediction. The embodiments of the present disclosure propose using the effective environmental data to predict the daily positive active power generation of the photovoltaic power station, where the effective environmental data may be collected at a fixed granularity period, such as 1 hour.
After the effective environmental data is obtained, it is preprocessed, which may include the following steps:
performing feature encoding on the effective environmental data;
normalizing the feature-encoded effective environmental data;
converting the normalized effective environmental data into linearly independent effective environmental data feature vectors.
Performing feature encoding on the effective environmental data includes the following steps:
based on the one-day monitoring values of the multiple preset environmental features of the photovoltaic power station included in the effective environmental data, calculating the statistical features corresponding to each of the multiple preset environmental features, the statistical features including at least one of a maximum feature, a minimum feature, an average feature, and a standard deviation feature; and
using the monitoring values of each of the multiple preset environmental features together with the corresponding statistical features as the vector of that preset environmental feature.
For example, the one-day monitoring values of the multiple preset environmental features included in the effective environmental data are obtained at each granularity period during the working hours of the day. For example, each preset environmental feature in the effective environmental data is flattened into a 12-dimensional vector of hourly monitoring values for a single day, and the statistical features corresponding to each preset environmental feature are computed; these may include the maximum, minimum, average, and standard deviation features. Each environmental feature is then combined with its statistical features, giving a 16-dimensional vector made up of the single-day hourly monitoring values, the maximum feature, the minimum feature, the average feature, and the standard deviation feature as the vector of that environmental feature. Finally, each column of environmental features, that is, the vector of each environmental feature, is normalized, and principal component analysis is used to convert the multiple environmental features into a set of linearly independent effective environmental data feature vectors.
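The 12 + 4 = 16 dimensional encoding described above can be sketched as follows. The function name `encode_feature` and the temperature readings are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which variant is intended.

```python
import statistics

def encode_feature(hourly_values):
    """Append max, min, mean and standard deviation to 12 hourly readings."""
    if len(hourly_values) != 12:
        raise ValueError("expected 12 hourly readings (7:00-18:00)")
    stats = [
        max(hourly_values),
        min(hourly_values),
        statistics.fmean(hourly_values),
        statistics.pstdev(hourly_values),  # assumption: population standard deviation
    ]
    return list(hourly_values) + stats

# Hypothetical hourly temperature readings from 7:00 to 18:00.
temperature = [12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 23.0, 22.0, 20.0, 18.0, 15.0, 13.0]
vector = encode_feature(temperature)
```

Positions 0-11 of `vector` hold the hourly values and positions 12-15 hold the four statistics, in the order the text lists them.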
To further illustrate the preprocessing of the effective environmental data in the embodiments of the present disclosure, an example is given below. Suppose the effective environmental data obtained at 8:00 am is as shown in the following table:
Figure PCTCN2020088709-appb-000001
The environmental features are wind direction, wind speed, temperature, humidity, atmospheric pressure, precipitation, total radiation, direct radiation, diffuse radiation, radiant exposure, sunshine hours, and so on. Each environmental feature is real data measured hourly by the environmental monitor, that is, the granularity period is 1 hour.
The values of the environmental features at the other time points, which have not yet been monitored, are filled with zeros to construct the effective environmental data.
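The zero-filling of time points that have not yet been monitored can be sketched as below. The slot layout (one slot per hour from 7:00 to 18:00), the name `zero_fill`, and the wind-speed readings are assumptions made for illustration.

```python
HOURS = list(range(7, 19))  # hourly slots 7:00 ... 18:00, 12 in total

def zero_fill(readings_by_hour):
    """Return a 12-slot vector, using 0.0 for hours with no reading yet."""
    return [float(readings_by_hour.get(hour, 0.0)) for hour in HOURS]

# Hypothetical wind-speed readings available at 8:00 am.
partial = {7: 3.1, 8: 3.4}
wind_speed = zero_fill(partial)
```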
Then, statistical analysis is performed on each environmental feature to obtain the corresponding statistical features, namely the maximum, minimum, average, and standard deviation of each environmental feature. These statistical features are then combined with the corresponding environmental feature to obtain the vector of each environmental feature.
Then, after the vectors of the environmental features are normalized, principal component analysis is used to convert all environmental features into a set of linearly independent effective environmental data feature vectors.
The effective environmental data feature vector is input into the pre-built power generation prediction model, which outputs a prediction result corresponding to the effective environmental data; the prediction result is the daily effective power generation of the photovoltaic power station. The method for training the power generation prediction model is described in detail below.
The power generation prediction model provided in the embodiments of the present disclosure may be trained with multiple regression algorithms, for example the Bayesian ridge regression algorithm, the support vector regression algorithm, the gradient boosting tree algorithm, and the extremely randomized forest algorithm. That is, the power generation prediction model includes sub-models built with each of these algorithms. It should be understood that these four algorithms are only examples, and the embodiments of the present disclosure are not limited to them.
The output of the power generation prediction model is obtained by fusing the prediction results of the multiple regression algorithms. For example, the output may be a weighted average of the prediction results of the multiple regression algorithms, which is not limited in the embodiments of the present disclosure.
By unifying the statistical period of environmental monitoring with the statistical period of the photovoltaic power station's power generation, and by applying fine-grained preprocessing to the environmental monitoring data, the embodiments of the present disclosure make the statistical results more accurate.
Please refer to Figure 2, which shows a schematic flowchart of a method for training a photovoltaic power station power generation prediction model provided by at least one embodiment of the present disclosure. As shown in Figure 2, the method includes:
Step 201: obtain the historical power generation data of the photovoltaic power station for each day of multiple days and the historical environmental data for each day of the multiple days;
Step 202: preprocess the historical environmental data;
Step 203: build a training set and a prediction set from the historical power generation data and the preprocessed historical environmental data, and train with each of multiple regression algorithms to build the mapping between the historical environmental data and the historical power generation data as the power generation prediction model, the output of the power generation prediction model being obtained by fusing the prediction results of the multiple regression algorithms.
For example, the multiple regression algorithms may include at least two of the following: the Bayesian ridge regression algorithm, the support vector regression algorithm, the gradient boosting tree algorithm, and the extremely randomized forest algorithm; it should be understood, however, that the embodiments of the present disclosure are not limited in this respect.
For example, in at least one embodiment, preprocessing the historical environmental data may include:
performing feature encoding on the daily historical environmental data;
normalizing the feature-encoded daily historical environmental data; and
converting the normalized daily historical environmental data into linearly independent historical environmental data feature vectors.
For example, the daily historical environmental data includes the daily monitoring values of multiple environmental features. For example, the multiple environmental features may include wind direction, wind speed, temperature, humidity, atmospheric pressure, precipitation, total radiation, direct radiation, diffuse radiation, radiant exposure, and sunshine hours; it should be understood, however, that the embodiments of the present disclosure are not limited to these.
For example, the daily historical environmental data may be historical environmental data obtained at each granularity period during the daily working hours. Where the daily historical environmental data includes the daily monitoring values of multiple environmental features, it may include the monitoring values of those features obtained at each granularity period during the daily working hours. In addition, it should be understood that the granularity period used to obtain the historical environmental data should be the same as the granularity period used to obtain the effective environmental data described above.
In the embodiments of the present disclosure, suppose the historical power generation data of the photovoltaic power station over a preset time range, together with the historical environmental data obtained at each granularity period during the daily working hours in that range, is used as the sample data set. The sample data set is divided into a training set and a prediction set according to a ratio, for example 7:3. The preset time range may be, for example, 120 days or 30 days.
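The 7:3 division of the sample data set can be sketched as follows. The text does not say how samples are assigned to each part, so the chronological cut used here is an assumption, as is the name `split_samples`.

```python
def split_samples(samples, train_ratio=0.7):
    """Split a day-ordered list of samples into (training set, prediction set)."""
    cut = round(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]

# Hypothetical 30-day sample set; each element stands for one day's (x, y) pair.
days = list(range(1, 31))
train_days, prediction_days = split_samples(days)
```

With a 30-day range this yields 21 training days and 9 prediction days.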
Suppose the historical power generation of the photovoltaic power station over 30 days is obtained, together with the historical environmental data recorded by the station's environmental monitor at each granularity period on each of those 30 days. The historical power generation data corresponds to the daily positive active power generation of each photovoltaic power station, and can be read from the station's electricity meter. Let $Y$ denote the historical power generation data, such as $Y=\{y_1, y_2, \ldots, y_n\}$, where $n$ is the maximum value of the preset time range.
The historical environmental data is correspondingly read from the environmental monitors installed around the photovoltaic power station. Let $X$ denote the historical environmental data, such as $X=\{x_1, x_2, \ldots, x_n\}$, where $n$ is the maximum value of the preset time range, and each $x$ may be as shown in Table 1 below:
Figure PCTCN2020088709-appb-000002
Figure PCTCN2020088709-appb-000003
Table 1
For example, in at least one embodiment, performing feature encoding on the daily historical environmental data includes:
based on the monitoring values of the multiple environmental features included in the daily historical environmental data, calculating the statistical features corresponding to each of the multiple environmental features, where the statistical features may include at least one of a maximum feature, a minimum feature, an average feature, and a standard deviation feature;
using the monitoring values of each of the multiple environmental features together with the corresponding statistical features as the vector of that environmental feature; and
filtering out those of the multiple environmental features whose sparsity is greater than or equal to a sparsity threshold.
For example, statistical analysis is performed on each environmental feature in Table 1 to obtain the corresponding statistical features, namely the maximum, minimum, average, and standard deviation of each environmental feature. These statistical features are then combined with the corresponding environmental features to obtain Table 2, in which each column forms the vector of one environmental feature. That is, for each environmental feature, the 16-dimensional vector formed by the single-day hourly monitoring values, the maximum feature, the minimum feature, the average feature, and the standard deviation feature is used as the vector of that environmental feature.
Figure PCTCN2020088709-appb-000004
Figure PCTCN2020088709-appb-000005
Table 2
For example, in at least one embodiment, filtering out the environmental features whose sparsity is greater than or equal to the sparsity threshold includes:
taking the ratio of the number of zero-valued entries in the vector of each environmental feature to the total number of entries in that vector as the sparsity of that environmental feature.
The sparsity threshold $\varepsilon$ can be set as required. For example, whether the sparsity is greater than or equal to the sparsity threshold can be determined with the following formula:
Figure PCTCN2020088709-appb-000006
For example, in at least one embodiment, normalizing the feature-encoded daily historical environmental data includes: normalizing the vectors of the environmental features whose sparsity is less than the sparsity threshold.
For example, when the sparsity of one or more of wind direction, wind speed, temperature, humidity, atmospheric pressure, precipitation, total radiation, direct radiation, diffuse radiation, radiant exposure, and sunshine hours is greater than or equal to the sparsity threshold, those environmental features are filtered out and the vectors of the remaining environmental features are normalized.
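The sparsity filter described above can be sketched as follows: the sparsity of a feature is the fraction of zero entries in its vector, and a feature is kept only when that fraction is below the threshold. The function names, the threshold of 0.5, and the sample vectors are assumptions for illustration.

```python
def sparsity(vector):
    """Fraction of entries in the feature vector that are equal to zero."""
    return sum(1 for value in vector if value == 0) / len(vector)

def filter_sparse(features, threshold):
    """Keep only the features whose sparsity is below the threshold."""
    return {name: vec for name, vec in features.items() if sparsity(vec) < threshold}

# Hypothetical 16-dimensional vectors (12 hourly values followed by 4 statistics).
features = {
    "temperature": [12.0] * 12 + [12.0, 12.0, 12.0, 0.0],
    "precipitation": [0.0] * 11 + [1.2] + [1.2, 0.0, 0.1, 0.33],
}
kept = filter_sparse(features, threshold=0.5)
```

Here precipitation has 12 zeros out of 16 entries (sparsity 0.75) and is filtered out, while temperature (sparsity 1/16) is kept.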
It should be understood that the preset environmental features used in the photovoltaic power station power generation prediction method according to at least one embodiment of the present disclosure are the environmental features that remain after the filtered-out features are removed from the multiple environmental features included in the historical environmental data.
For example, in at least one embodiment, converting the normalized daily historical environmental data into linearly independent historical environmental data feature vectors includes:
using principal component analysis to convert the normalized daily historical environmental data into linearly independent historical environmental data feature vectors.
The environmental features whose sparsity is greater than or equal to the sparsity threshold are filtered out, the remaining environmental features are normalized, and principal component analysis is then used to convert all environmental features into a set of linearly independent historical environmental data feature vectors
Figure PCTCN2020088709-appb-000007
where
Figure PCTCN2020088709-appb-000008
is expressed as $t_{1,1}, t_{1,2}, \ldots, t_{1,16}$, where $t_{1,i}$ is the value of the first environmental feature at the different monitoring moments. For example, for $i$ from 1 to 12, $t_{1,i}$ is the monitoring value of the environmental feature read each hour in the 7:00-18:00 time range; for $i$ from 13 to 16, it is the maximum, minimum, average, and standard deviation of the environmental feature.
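The normalization and principal component analysis steps can be sketched with NumPy as below. Several points are assumptions rather than statements of the disclosed method: z-score standardization is used because the text does not name a normalization scheme, each row is treated as one sample and each column as one encoded value, the number of retained components (5) is arbitrary, and the random data stands in for real monitoring values.

```python
import numpy as np

def standardize(X):
    """Z-score each column; constant columns are left at zero."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return (X - mean) / std

def pca_transform(X, n_components):
    """Project centered data onto its leading principal directions via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 16))                          # hypothetical 30 samples x 16 values
X[:, 1] = 2.0 * X[:, 0] + 0.01 * rng.normal(size=30)   # a strongly correlated column
Z = pca_transform(standardize(X), n_components=5)
cov = np.cov(Z, rowvar=False)                          # covariance of the projected columns
```

The off-diagonal entries of `cov` are zero up to rounding, which is the sense in which the transformed columns are mutually uncorrelated.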
Then, the historical environmental data feature vectors are input as training samples into each of the multiple regression algorithms for training. For example, the multiple regression algorithms may include, but are not limited to, at least two of the Bayesian ridge regression algorithm, the support vector regression algorithm, the gradient boosting tree algorithm, and the extremely randomized forest algorithm.
The Bayesian ridge regression algorithm is a biased-estimation regression method for the analysis of collinear data.
Suppose the sample set is $X=\{x_1, x_2, \ldots, x_n\}$ and the parameter $W$ follows a Gaussian distribution
Figure PCTCN2020088709-appb-000009
The precision of the data noise is $\beta$; $\beta^{-1}$ corresponds to the variance of the Gaussian distribution of the sample set $X$, $\alpha^{-1}$ corresponds to the variance of the Gaussian distribution of the parameter $W$, and $h_W(X)-Y$ follows a Gaussian distribution
Figure PCTCN2020088709-appb-000010
Then the log posterior probability distribution of the linear model is:
Figure PCTCN2020088709-appb-000011
where $\theta$ is the posterior probability given the known sample set $X$, $T$ is the target value vector of the sample data, const is a constant independent of the parameter $W$, and $Y$ is the historical power generation data, such as $Y=\{y_1, y_2, \ldots, y_n\}$. The learning process of the Bayesian ridge regression algorithm is as follows: the posterior probability of the previous training set is multiplied by the likelihood of the new test sample point to obtain the posterior probability of the new training set. During training, $X$ is drawn cumulatively from the full sample set; as the sample size increases, the predicted value $h_W(X)$ gradually converges and stabilizes, and the stabilized predicted value $h_W(X)$ is finally output.
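For orientation, the textbook form of the log posterior for Bayesian linear regression is written out below. It assumes a zero-mean isotropic Gaussian prior on $W$ with precision $\alpha$ and Gaussian noise with precision $\beta$, and uses the symbols of the surrounding text; the expression shown in the original figure may be arranged differently.

```latex
\ln p(W \mid X, Y)
  = -\frac{\beta}{2} \sum_{i=1}^{n} \bigl( h_W(x_i) - y_i \bigr)^2
    - \frac{\alpha}{2} W^{T} W
    + \text{const}
```

Maximizing this expression over $W$ is equivalent to minimizing a sum-of-squares error with a quadratic penalty whose coefficient is $\alpha / \beta$, which is why the method counts as a ridge regression.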
The support vector regression algorithm is also a regression method. When fitting, it performs regression analysis on the data using the idea of support vectors together with Lagrange multipliers.
Suppose the sample set is $X=\{x_1, x_2, \ldots, x_n\}$. The support vector regression algorithm constructs the constrained optimization problem:
Figure PCTCN2020088709-appb-000012
Figure PCTCN2020088709-appb-000013
The support vector regression model iterates so that the objective function reaches its minimum under the constraints. Suppose the fitting function between the sample set and the predicted value is $y = h(x) = W^{T}X$. The error between the predicted values and the true values in the test set is taken as the objective function, and iteration drives the objective function to its minimum under the above constraints. The sequential minimal optimization algorithm can be used to find the $\alpha, \alpha^{*}$ at which the above expression is minimal. Through a two-level loop, the algorithm selects two variables at a time so that the objective function decreases as much as possible, and then updates the regression model parameters. When the termination condition is satisfied within the accuracy range $\varepsilon$, the algorithm ends, and $h(x)$ under the optimal parameters is output as the prediction result.
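As a reference point, the standard dual problem of $\varepsilon$-insensitive support vector regression, written as a minimization over $\alpha, \alpha^{*}$, is given below. The kernel $K$, the penalty parameter $C$, and the use of $y_i$ for the targets are notational assumptions; the constrained problem in the original figures may be stated in a different but equivalent form.

```latex
\min_{\alpha, \alpha^{*}} \;
  \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n}
    (\alpha_i - \alpha_i^{*})(\alpha_j - \alpha_j^{*}) K(x_i, x_j)
  + \varepsilon \sum_{i=1}^{n} (\alpha_i + \alpha_i^{*})
  - \sum_{i=1}^{n} y_i (\alpha_i - \alpha_i^{*})
\quad \text{s.t.} \quad
  \sum_{i=1}^{n} (\alpha_i - \alpha_i^{*}) = 0, \qquad
  0 \le \alpha_i, \alpha_i^{*} \le C
```

For a linear kernel $K(x_i, x_j) = x_i^{T} x_j$, the fitted weights are $W = \sum_{i} (\alpha_i - \alpha_i^{*}) x_i$.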
The gradient boosting tree (GBRT) algorithm is an iterative learning model based on CART regression trees.
Suppose the training samples are $X=\{x_1, x_2, \ldots, x_n\}$, the maximum number of iterations is $T$, and the loss function is $L$. GBRT first initializes the weak learner:
Figure PCTCN2020088709-appb-000014
where $y_i$ is a value of the historical power generation data $Y$, such as $Y=\{y_1, y_2, \ldots, y_n\}$.
In each iteration, the goal of the algorithm is to find a CART regression tree that minimizes the loss function of that round. For each iteration, the negative gradient is computed for the samples:
Figure PCTCN2020088709-appb-000015
A CART regression tree is fitted to $(X, r_t)$, with leaf node regions $R_{tj}$, $j = 1, 2, \ldots, k$, and the best fitting value is computed:
Figure PCTCN2020088709-appb-000016
The strong learner is updated according to the fitted values:
Figure PCTCN2020088709-appb-000017
After $T$ rounds of iteration, the final regression model is obtained:
Figure PCTCN2020088709-appb-000018
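The boosting loop above can be sketched in plain Python for the squared-error loss, where the negative gradient is simply the residual and the best value for a leaf is the mean residual in that leaf. The sketch makes several simplifying assumptions that are not from the disclosure: depth-one trees (stumps) on a single input feature stand in for full CART trees, a shrinkage factor `learning_rate` is applied to each update, and the data is invented.

```python
def fit_stump(xs, residuals):
    """Best single-threshold split on a 1-D feature under squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # last value excluded so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lv, rv = sum(left) / len(left), sum(right) / len(right)  # leaf means
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, threshold, lv, rv)
    _, threshold, lv, rv = best
    return lambda x: lv if x <= threshold else rv

def fit_gbrt(xs, ys, rounds=50, learning_rate=0.1):
    f0 = sum(ys) / len(ys)  # constant that minimizes the squared loss
    stumps = []
    preds = [f0] * len(xs)
    for _ in range(rounds):
        # Negative gradient of 1/2 * (y - f)^2 with respect to f is the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
    return lambda x: f0 + learning_rate * sum(s(x) for s in stumps)

# Hypothetical one-feature data: a radiation index against daily generation.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0, 2.1, 3.9, 4.2, 6.1, 5.8, 8.0, 8.2]
model = fit_gbrt(xs, ys)
```

The returned model is the initial constant plus the sum of the shrunken stump outputs, mirroring the additive form of the final regression model.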
The extremely randomized forest algorithm takes all training samples $X=\{x_1, x_2, \ldots, x_n\}$, randomly selects features according to a specified percentage, and builds $K$ CART trees; it then performs regression prediction on the test samples and takes the mean of the $K$ trees' predictions as the final prediction result. That is, $f_t(X)$ is output as the final prediction result.
The extremely randomized forest uses all samples each time it generates a decision tree, and the splits of the decision tree are chosen completely at random. When a feature attribute is categorical, different categories are randomly grouped together to form a branch; when a feature attribute is numerical, an arbitrary number between the maximum and minimum values of that attribute is randomly chosen as the split value to form the branches. When a decision tree node is split, all feature attributes in the node are traversed, a branch is obtained for each feature attribute as described above, the mean squared error is calculated, and the feature with the largest mean squared error is finally selected for splitting.
When predicting the power generation of the photovoltaic power station, an equal-proportion voting model is used to fuse the prediction results of the sub-models built by the above algorithms to obtain the final prediction result.
Using the equal-proportion voting model to fuse the prediction results of the sub-models includes: computing a weighted average of the prediction results of the individual regression algorithms and taking that weighted average as the final prediction result. It should be understood that a weighted average is only one example of fusing the prediction results of the individual regression algorithms. In other embodiments, the prediction results can be fused in other ways, which is not limited in the embodiments of the present disclosure.
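The fusion step can be sketched as a weighted average of the sub-model outputs. Equal weights are used by default, which is one reading of "equal-proportion voting"; the function name `fuse_predictions` and the sample predictions are hypothetical.

```python
def fuse_predictions(predictions, weights=None):
    """Weighted average of sub-model predictions; equal weights by default."""
    if weights is None:
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("one weight is required per prediction")
    return sum(w * p for w, p in zip(weights, predictions)) / sum(weights)

# Hypothetical daily-generation predictions from four sub-models.
sub_model_outputs = [410.0, 420.0, 400.0, 430.0]
fused = fuse_predictions(sub_model_outputs)
```

Passing explicit weights, for example `fuse_predictions([100.0, 200.0], [3, 1])`, gives the unequal-weight variant of the same average.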
The power generation prediction model built in the embodiments of the present disclosure is trained with multiple regression algorithms, which can effectively prevent overfitting, and the final prediction result is obtained by fusing the prediction results of the multiple regression algorithms, which effectively improves prediction accuracy.
The embodiments of the present disclosure can overcome the inconsistency in granularity between the power generation of a photovoltaic power station and the environmental data: the training features use a small unit of measurement, such as hours or minutes, while the prediction target uses a large unit of measurement, such as days. The embodiments of the present disclosure also obtain multiple prediction results from multiple different regression models and derive the final prediction result by equal-proportion voting, which gives good prediction accuracy without causing overfitting.
It should be noted that although the operations of the method of the present disclosure are described in a specific order in the drawings, this does not require or imply that the operations must be performed in that order, or that all of the operations shown must be performed to achieve the desired result. Rather, the steps depicted in the flowcharts may be executed in a different order. Additionally or alternatively, some steps may be omitted, multiple steps may be combined into one step, and/or one step may be split into multiple steps.
参考图3,图3示出了本公开至少一个实施例提供的光伏电站发电量预测装置300的结构示意图。如图3所示,该光伏电站发电量预测装置300包括:Referring to Fig. 3, Fig. 3 shows a schematic structural diagram of a photovoltaic power station generating capacity prediction device 300 provided by at least one embodiment of the present disclosure. As shown in Figure 3, the photovoltaic power station generating capacity prediction device 300 includes:
获取单元301,配置为获取有效环境数据。该有效环境数据包括光伏电站的多个预设环境特征的一天的监测值。The obtaining unit 301 is configured to obtain valid environmental data. The effective environmental data includes one day's monitoring values of multiple preset environmental characteristics of the photovoltaic power station.
预处理单元302,配置为预处理有效环境数据,以得到有效环境数据特征向量;The preprocessing unit 302 is configured to preprocess valid environmental data to obtain a feature vector of valid environmental data;
预测单元303,配置为将有效环境数据特征向量输入到预先构建的发电量预测模型,输出与有效环境数据对应的预测结果,该预测结果为光伏电站的一天的有效发电量。该发电量预测模型可通过上文中描述的训练光伏电站发电量预测模型的方法进行训练,在此将不再赘述。The prediction unit 303 is configured to input the effective environmental data feature vector into a pre-built power generation prediction model, and output a prediction result corresponding to the effective environmental data. The prediction result is the effective power generation of the photovoltaic power station in a day. The power generation prediction model can be trained by the method of training the photovoltaic power station power generation prediction model described above, which will not be repeated here.
In the embodiments of the present disclosure, effective environmental data refers to the monitoring values of multiple preset environmental features of the photovoltaic power station obtained at a granularity period during the working hours of one day. The granularity period may be, for example, 1 hour or 30 minutes. The daily working hours may be determined according to the sunshine hours of the region; for Beijing, China, for example, the daily working hours may be 7:00-18:00 Beijing time. With a granularity period of 1 hour, the effective environmental data may then include the hourly monitoring values of the multiple preset environmental features of the photovoltaic power station obtained between 7:00 and 18:00.
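As an illustration only, the selection of effective environmental data described above can be sketched as follows. The function names `working_hour_slots` and `select_effective_data`, the dictionary layout of the readings, and the choice to include both the 7:00 and the 18:00 sample are assumptions made for this sketch and are not taken from the disclosure.

```python
from datetime import date, datetime, time, timedelta


def working_hour_slots(day, start=time(7, 0), end=time(18, 0),
                       period=timedelta(hours=1)):
    """Return the sampling timestamps for one day, both ends included (assumed)."""
    current = datetime.combine(day, start)
    last = datetime.combine(day, end)
    slots = []
    while current <= last:
        slots.append(current)
        current += period
    return slots


def select_effective_data(readings, day, **slot_options):
    """Keep only the readings whose timestamp falls on a working-hour slot.

    `readings` maps a datetime to a dict of feature name -> monitoring value
    (a hypothetical layout for round-the-clock monitor output).
    """
    slots = set(working_hour_slots(day, **slot_options))
    return {ts: values for ts, values in readings.items() if ts in slots}
```

With the default 1-hour granularity period this keeps 12 samples per day; passing `period=timedelta(minutes=30)` keeps 23.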
The preset environmental features in the effective environmental data may include several of the following environmental features: wind direction, wind speed, temperature, humidity, atmospheric pressure, precipitation, total radiation, direct radiation, diffuse radiation, radiant exposure, sunshine hours, and so on. The monitoring values of these environmental features may be recorded by the environmental monitor of the power station.
The prediction target is the daily effective power generation of the photovoltaic power station, for example the difference between the station's historical cumulative forward active power generation recorded at 00:00 of the next day and that recorded at 00:00 of the current day.
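A minimal sketch of this target computation, assuming the cumulative meter reading at each midnight is available in a dictionary keyed by date (the name `daily_generation` and the data layout are hypothetical):

```python
from datetime import date, timedelta


def daily_generation(cumulative_at_midnight, day):
    """Daily effective generation for `day`.

    `cumulative_at_midnight` maps a date to the cumulative forward active
    power generation recorded at 00:00 of that date (assumed layout).
    """
    start = cumulative_at_midnight[day]
    end = cumulative_at_midnight[day + timedelta(days=1)]
    return end - start
```

For example, cumulative readings of 1000.0 at 00:00 on one day and 1042.5 at 00:00 on the next give a daily value of 42.5 in whatever unit the meter reports.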
In some technical solutions, environmental data is monitored continuously, 24 hours a day, whereas the effective working time of a photovoltaic power station in actual operation is tied to the length of sunshine. For example, the actual effective working hours of a photovoltaic power station in Beijing, China may be 7:00-18:00 each day. Therefore, when the effective power generation of a photovoltaic power station is predicted, the effective environmental data is one of the important factors affecting the accuracy of the prediction result. The embodiments of the present disclosure propose using effective environmental data to predict the daily forward active power generation of a photovoltaic power station, where the effective environmental data may be collected at a fixed granularity period, for example 1 hour.
For the obtaining unit 301, the preprocessing unit 302 and the prediction unit 303, refer to the detailed descriptions of step 110, step 120 and step 130 above; the embodiments of the present disclosure do not repeat them here.
Referring to FIG. 4, in at least one embodiment, the preprocessing unit 302 may further include:
a feature encoding subunit 3021, configured to perform feature encoding on the effective environmental data;
a normalization subunit 3022, configured to normalize the feature-encoded effective environmental data; and
a dimensionality reduction subunit 3023, configured to convert the normalized effective environmental data into a linearly independent effective environmental data feature vector.
For example, in at least one embodiment, the feature encoding subunit 3021 is further configured to: calculate, based on the one day's monitoring values of the multiple preset environmental features of the photovoltaic power station included in the effective environmental data, a statistical feature corresponding to each of the multiple preset environmental features, where the statistical feature includes at least one of a maximum feature, a minimum feature, a mean feature and a standard deviation feature; filter out the environmental features or statistical features whose sparsity is greater than or equal to a sparsity threshold; and use the monitoring values of each of the multiple preset environmental features together with the corresponding statistical feature as the vector of that preset environmental feature.
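The feature encoding described above might be sketched as follows. The sparsity measure (the share of zero entries in a feature's vector) follows the definition given in claim 5; the helper names, the threshold value of 0.8 and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev


def encode_feature(monitored_values):
    """Append max, min, mean and standard deviation to one feature's daily readings."""
    stats = [max(monitored_values), min(monitored_values),
             mean(monitored_values), pstdev(monitored_values)]
    return list(monitored_values) + stats


def sparsity(vector):
    """Ratio of entries equal to zero to the total number of entries."""
    return sum(1 for v in vector if v == 0) / len(vector)


def encode_and_filter(day_readings, sparsity_threshold=0.8):
    """Encode every feature, then drop those at or above the sparsity threshold.

    `day_readings` maps a feature name to its list of monitoring values for one day.
    The threshold of 0.8 is a placeholder, not a value from the disclosure.
    """
    encoded = {name: encode_feature(vals) for name, vals in day_readings.items()}
    return {name: vec for name, vec in encoded.items()
            if sparsity(vec) < sparsity_threshold}
```

A feature such as precipitation on a dry day would yield an all-zero vector with sparsity 1.0 and would therefore be filtered out under any threshold of 1.0 or less.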
For example, in at least one embodiment, the normalization subunit 3022 is further configured to normalize the vector of each preset environmental feature.
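The normalization scheme is not specified here. As one possible reading, a min-max scaling of each feature vector to the range [0, 1] could look like the sketch below; both the choice of min-max scaling and the function name `min_max_normalize` are assumptions.

```python
def min_max_normalize(vector):
    """Scale a feature vector to [0, 1]; a constant vector maps to all zeros."""
    lo, hi = min(vector), max(vector)
    if hi == lo:
        # Avoid division by zero when every entry is identical.
        return [0.0 for _ in vector]
    return [(v - lo) / (hi - lo) for v in vector]
```

Other schemes, such as scaling to zero mean and unit variance, would fit the same step equally well.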
For example, in at least one embodiment, the dimensionality reduction subunit 3023 is further configured to use principal component analysis to convert the normalized effective environmental data into the linearly independent effective environmental data feature vector.
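A minimal sketch of principal component analysis using NumPy's singular value decomposition is given below. It assumes the samples are arranged with one row per day and one column per encoded value; the function name `pca_transform` and the parameter `n_components` are illustrative and do not come from the disclosure.

```python
import numpy as np


def pca_transform(samples, n_components):
    """Project samples (rows = days, columns = encoded values) onto principal components."""
    X = np.asarray(samples, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T
```

The resulting columns are mutually uncorrelated. In a deployed predictor, the mean and the directions fitted on the historical data would normally be stored and reused on each new day's data rather than refitted, which this sketch does not show.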
In the embodiments of the present disclosure, the statistical period of environmental monitoring is aligned with the statistical period of the photovoltaic power station's power generation, and the environmental monitoring data undergoes fine-grained preprocessing, which makes the statistical results more accurate.
Referring to FIG. 5, FIG. 5 shows a schematic structural diagram of a system provided by at least one embodiment of the present disclosure. The system shown in FIG. 5 may be a prediction system. As shown in FIG. 5, the prediction system may include:
a power station environment monitoring device 401, configured to monitor the photovoltaic power station to obtain environmental data of the photovoltaic power station;
a memory 402, configured to store one or more computer programs; and
a processor 403.
When the one or more computer programs stored in the memory 402 are executed by the processor 403, the processor 403 is caused to execute the photovoltaic power station power generation prediction method described in the foregoing embodiments.
The power station environment monitoring device 401 may be connected to the memory 402 and the processor 403 by wire or wirelessly. The wired or wireless connection mainly provides communication, so that data and control signaling can be transmitted. The power station environment monitoring device 401 may include, for example, various sensors for detecting the respective environmental features.
However, it should be understood that in some embodiments the photovoltaic power station prediction system may not include the power station environment monitoring device 401. For example, the environmental data may be entered manually or may be stored in the memory in advance.
In some embodiments, the power station environment monitoring device 401 may be configured to monitor multiple preset environmental features of the photovoltaic power station to obtain one day's monitoring values of those features. For example, the power station environment monitoring device 401 may be configured to monitor the multiple preset environmental features of the photovoltaic power station so as to obtain their monitoring values at the granularity period during the working hours of one day.
The memory 402 and the processor 403 may be implemented as the same device or may exist independently of each other. The processor 403 may be implemented as a functional structure that carries out the photovoltaic power station power generation prediction method described in the foregoing embodiments.
It should be understood that the system shown in FIG. 5 may also be a training system used to carry out the method for training a photovoltaic power station power generation prediction model described in the foregoing embodiments.
For example, when the one or more computer programs stored in the memory 402 are executed by the processor 403, the processor 403 is caused to execute the method for training a photovoltaic power station power generation prediction model described in the foregoing embodiments.
In addition, the power station environment monitoring device 401 may be configured to monitor the photovoltaic power station to obtain historical environmental data for each day of multiple days. For example, the power station environment monitoring device 401 may be configured to obtain the historical environmental data at the granularity period during the working hours of each of the multiple days.
It should be understood that the units or modules described for the photovoltaic power station power generation prediction device 300 correspond to the steps of the method described with reference to FIG. 1. The operations and features described above for the method of FIG. 1 therefore also apply to the device 300 and the units it contains, and are not repeated here. The device 300 may be implemented in advance in a browser or other security application of an electronic device, or may be loaded into the browser or security application of the electronic device by downloading or similar means. The corresponding units of the photovoltaic power station power generation prediction device 300 may cooperate with units of the electronic device to implement the solutions of the embodiments of the present disclosure.
Although several modules or units are mentioned in the detailed description above, this division is not mandatory. In fact, according to the embodiments of the present disclosure, the features and functions of two or more of the modules or units described above may be embodied in a single module or unit. Conversely, the features and functions of one module or unit described above may be further divided among multiple modules or units.
Reference is now made to FIG. 6, which shows a schematic structural diagram of a computer system 500 suitable for implementing the method for training a photovoltaic power station power generation prediction model, the photovoltaic power station power generation prediction method or device, the training system or the prediction system according to the embodiments of the present disclosure.
As shown in FIG. 6, the computer system 500 includes a central processing unit (CPU) 501, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage portion 508 into a random access memory (RAM) 503. The RAM 503 also stores the various programs and data required for the operation of the system 500. The CPU 501, the ROM 502 and the RAM 503 are connected to one another through a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
The following components are connected to the I/O interface 505: an input portion 506 including a keyboard, a mouse and the like; an output portion 507 including a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker and the like; a storage portion 508 including a hard disk and the like; and a communication portion 509 including a network interface card such as a LAN card or a modem. The communication portion 509 performs communication processing via a network such as the Internet. A drive 510 is also connected to the I/O interface 505 as needed. A removable medium 511, such as a magnetic disk, an optical disk, a magneto-optical disk or a semiconductor memory, is mounted on the drive 510 as needed, so that a computer program read from it can be installed into the storage portion 508 as needed.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to FIG. 1 and FIG. 2 may be implemented as computer software programs. For example, the embodiments of the present disclosure include a computer program product comprising a computer program carried on a machine-readable medium, the computer program containing program code for executing the methods shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 509, and/or installed from the removable medium 511. When the computer program is executed by the central processing unit (CPU) 501, the above-mentioned functions defined in the system of the present disclosure are performed.
It should be noted that the computer-readable medium described in the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus or device. A computer-readable signal medium, by contrast, may include a data signal propagated in baseband or as part of a carrier wave and carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. The computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate or transmit a program for use by or in connection with an instruction execution system, apparatus or device. The program code contained on the computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, wire, optical cable, RF, or any suitable combination of the above.
The flowcharts and block diagrams in the drawings illustrate the architecture, functions and operations of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should further be noted that each block of the block diagrams and/or flowcharts, and combinations of such blocks, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units or modules described in the embodiments of the present disclosure may be implemented in software or in hardware. The described units or modules may also be provided in a processor; for example, a processor may be described as including an obtaining unit, a preprocessing unit and a prediction unit. The names of these units or modules do not, in some cases, limit the units or modules themselves; for example, the obtaining unit may also be described as "a unit for obtaining effective environmental data".
In another aspect, the present disclosure further provides a computer-readable storage medium. The computer-readable storage medium may be included in the electronic device described in the foregoing embodiments, or it may exist separately without being assembled into the electronic device. The computer-readable storage medium stores one or more programs, and the programs are used by one or more processors to execute the photovoltaic power station power generation prediction method described in the present disclosure.
The above are merely exemplary implementations of the present disclosure and are not intended to limit its scope of protection, which is determined by the appended claims.

Claims (23)

  1. A method for training a power generation prediction model of a photovoltaic power station, comprising:
    obtaining historical power generation data for each day of multiple days of the photovoltaic power station and historical environmental data for each day of the multiple days;
    preprocessing the historical environmental data; and
    constructing a training set and a prediction set based on the historical power generation data and the preprocessed historical environmental data, and performing training with each of multiple regression algorithms to construct a mapping relationship between the historical environmental data and the historical power generation data as the power generation prediction model, wherein an output result of the power generation prediction model is obtained by fusing the prediction results obtained according to the multiple regression algorithms.
  2. The method according to claim 1, wherein the multiple regression algorithms comprise at least two selected from the group consisting of a Bayesian ridge regression algorithm, a support vector regression algorithm, a gradient boosting tree algorithm and an extreme random forest algorithm.
  3. The method according to claim 1 or 2, wherein the preprocessing the historical environmental data comprises:
    performing feature encoding on the historical environmental data of each day;
    normalizing the feature-encoded historical environmental data of each day; and
    converting the normalized historical environmental data of each day into linearly independent historical environmental data feature vectors.
  4. The method according to claim 3, wherein
    the historical environmental data of each day includes monitoring values of multiple environmental features; and
    the performing feature encoding on the historical environmental data of each day comprises:
    calculating, based on the monitoring values of the multiple environmental features included in the historical environmental data of each day, a statistical feature corresponding to each of the multiple environmental features, the statistical feature including at least one of a maximum feature, a minimum feature, a mean feature and a standard deviation feature;
    using the monitoring values of each of the multiple environmental features and the corresponding statistical feature as a vector of that environmental feature; and
    filtering out, from the multiple environmental features, the environmental features whose sparsity is greater than or equal to a sparsity threshold.
  5. The method according to claim 4, wherein the filtering out, from the multiple environmental features, the environmental features whose sparsity is greater than or equal to a sparsity threshold comprises:
    taking, as the sparsity of each of the multiple environmental features, the ratio of the number of data items equal to zero in the vector of that environmental feature to the total number of data items in the vector of that environmental feature.
  6. The method according to claim 4 or 5, wherein the normalizing the feature-encoded historical environmental data of each day comprises:
    normalizing the vectors of those of the multiple environmental features whose sparsity is less than the sparsity threshold.
  7. The method according to any one of claims 3-6, wherein the converting the normalized historical environmental data of each day into linearly independent historical environmental data feature vectors comprises:
    using principal component analysis to convert the normalized historical environmental data of each day into the linearly independent historical environmental data feature vectors.
  8. The method according to any one of claims 4-7, wherein the multiple environmental features include at least two of wind direction, wind speed, temperature, humidity, atmospheric pressure, precipitation, total radiation, direct radiation, diffuse radiation, radiant exposure and sunshine hours.
  9. The method according to any one of claims 1-8, wherein
    the historical environmental data of each day is historical environmental data obtained at a granularity period during the working hours of each day.
  10. The method according to claim 9, wherein the granularity period is 1 hour or 30 minutes.
  11. A method for predicting power generation of a photovoltaic power station, comprising:
    obtaining effective environmental data, the effective environmental data including one day's monitoring values of multiple preset environmental features of the photovoltaic power station;
    preprocessing the effective environmental data to obtain an effective environmental data feature vector; and
    inputting the effective environmental data feature vector into a pre-built power generation prediction model and outputting a prediction result corresponding to the effective environmental data, the prediction result being the effective power generation of the photovoltaic power station for the one day,
    wherein the power generation prediction model is trained through the following operations:
    obtaining historical power generation data for each day of multiple days of the photovoltaic power station and historical environmental data for each day of the multiple days;
    preprocessing the historical environmental data; and
    constructing a training set and a prediction set based on the historical power generation data and the preprocessed historical environmental data, and performing training with each of multiple regression algorithms to construct a mapping relationship between the historical environmental data and the historical power generation data as the power generation prediction model, wherein an output result of the power generation prediction model is obtained by fusing the prediction results obtained according to the multiple regression algorithms,
    the preprocessing the historical environmental data comprises:
    performing feature encoding on the historical environmental data of each day;
    normalizing the feature-encoded historical environmental data of each day; and
    converting the normalized historical environmental data of each day into linearly independent historical environmental data feature vectors,
    the historical environmental data of each day includes monitoring values of multiple environmental features; and
    the performing feature encoding on the historical environmental data of each day comprises:
    calculating, based on the monitoring values of the multiple environmental features included in the historical environmental data of each day, a statistical feature corresponding to each of the multiple environmental features, the statistical feature including at least one of a maximum feature, a minimum feature, a mean feature and a standard deviation feature;
    using the monitoring values of each of the multiple environmental features and the corresponding statistical feature as a vector of that environmental feature; and
    filtering out, from the multiple environmental features, the environmental features whose sparsity is greater than or equal to a sparsity threshold, and
    the multiple preset environmental features include the environmental features of the multiple environmental features other than those filtered out.
  12. The method for predicting power generation of a photovoltaic power station according to claim 11, wherein the preprocessing the effective environmental data comprises:
    performing feature encoding on the effective environmental data;
    normalizing the feature-encoded effective environmental data; and
    converting the normalized effective environmental data into a linearly independent effective environmental data feature vector.
  13. The method for predicting power generation of a photovoltaic power station according to claim 12, wherein the performing feature encoding on the effective environmental data comprises:
    calculating, based on the one day's monitoring values of the multiple preset environmental features of the photovoltaic power station included in the effective environmental data, a statistical feature corresponding to each of the multiple preset environmental features, the statistical feature including at least one of a maximum feature, a minimum feature, a mean feature and a standard deviation feature; and
    using the monitoring values of each of the multiple preset environmental features and the corresponding statistical feature as a vector of that preset environmental feature.
  14. The method for predicting power generation of a photovoltaic power station according to claim 13, wherein normalizing the feature-encoded effective environmental data comprises:
    normalizing the vector of each preset environmental feature.
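Claim 14 does not name a normalization scheme. The sketch below assumes min-max scaling of each feature vector to the range [0, 1]; the function name `min_max_normalize` and the handling of constant vectors are likewise assumptions.

```python
def min_max_normalize(vector):
    """Scale one feature vector to [0, 1] by min-max normalization.

    Assumption: a constant vector (max == min) maps to all zeros to avoid
    division by zero.
    """
    lo, hi = min(vector), max(vector)
    if hi == lo:
        return [0.0 for _ in vector]
    return [(x - lo) / (hi - lo) for x in vector]
```

Under this assumption, `min_max_normalize([2.0, 4.0, 6.0])` yields `[0.0, 0.5, 1.0]`.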
  15. The method for predicting power generation of a photovoltaic power station according to any one of claims 12-14, wherein converting the normalized effective environmental data into a linearly independent effective environmental data feature vector comprises:
    converting the normalized effective environmental data into the linearly independent effective environmental data feature vector by principal component analysis.
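The principal component analysis step of claim 15 can be sketched in pure Python as follows. This is an illustrative implementation that uses power iteration with deflation, not the applicant's implementation; all function names, the iteration count, and the fixed seed are assumptions. Rows are days and columns are encoded, normalized feature values.

```python
import math
import random


def center_columns(rows):
    """Subtract each column's mean so the data is centered at zero."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in rows]


def covariance_matrix(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) / (n - 1)
             for j in range(d)] for i in range(d)]


def mat_vec(matrix, vector):
    """Multiply a square matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def dominant_eigenpair(matrix, iterations=500, seed=0):
    """Power iteration: (eigenvalue, unit eigenvector) of a symmetric matrix."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in matrix]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = mat_vec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # matrix is numerically zero; keep the current vector
            break
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(matrix, v)))
    return eigenvalue, v


def pca_project(rows, n_components):
    """Project rows onto their first n_components principal components."""
    centered = center_columns(rows)
    cov = covariance_matrix(centered)
    components = []
    for _ in range(n_components):
        eigenvalue, vec = dominant_eigenpair(cov)
        components.append(vec)
        # Deflation: remove the component just found before searching for the next.
        cov = [[cov[i][j] - eigenvalue * vec[i] * vec[j]
                for j in range(len(vec))] for i in range(len(vec))]
    return [[sum(a * b for a, b in zip(row, comp)) for comp in components]
            for row in centered]
```

The projected columns are mutually uncorrelated, which is the sense in which the resulting feature vector removes linear dependence between the inputs. In practice a library routine would normally replace this hand-written loop.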
  16. The method for predicting power generation of a photovoltaic power station according to any one of claims 11-15, wherein the one day's monitoring values of the plurality of preset environmental features of the photovoltaic power station comprise monitoring values of the plurality of preset environmental features of the photovoltaic power station obtained at a granularity period within the working hours of the day.
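Sampling at a granularity period within the working hours, as recited in claim 16, might look like the sketch below. The function name `sampling_times`, the inclusive end of the window, and the example hours and period are assumptions; the claim does not fix any of them.

```python
from datetime import datetime, timedelta


def sampling_times(work_start, work_end, period):
    """Return the timestamps at which monitoring values are taken.

    Assumption: both ends of the working-hours window are included.
    """
    times = []
    current = work_start
    while current <= work_end:
        times.append(current)
        current += period
    return times


# Hypothetical example: readings every 30 minutes between 08:00 and 10:00.
example = sampling_times(
    datetime(2020, 5, 6, 8, 0),
    datetime(2020, 5, 6, 10, 0),
    timedelta(minutes=30),
)
```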
  17. A device for predicting power generation of a photovoltaic power station, comprising:
    an acquisition unit configured to acquire effective environmental data, the effective environmental data including one day's monitoring values of a plurality of preset environmental features of a photovoltaic power station;
    a preprocessing unit configured to preprocess the effective environmental data to obtain an effective environmental data feature vector; and
    a prediction unit configured to input the effective environmental data feature vector into the pre-built power generation prediction model and to output a prediction result corresponding to the effective environmental data, the prediction result being the effective power generation of the photovoltaic power station for the day,
    wherein the power generation prediction model is trained through the following operations:
    acquiring historical power generation data for each day of a plurality of days of the photovoltaic power station and historical environmental data for each day of the plurality of days;
    preprocessing the historical environmental data; and
    constructing a training set and a prediction set based on the historical power generation data and the preprocessed historical environmental data, and training with a plurality of regression algorithms respectively, so as to construct a mapping relationship between the historical environmental data and the historical power generation data as the power generation prediction model, the output of the power generation prediction model being obtained by fusing the prediction results obtained with the plurality of regression algorithms,
    the preprocessing of the historical environmental data comprises:
    performing feature encoding on the daily historical environmental data;
    normalizing the feature-encoded daily historical environmental data; and
    converting the normalized daily historical environmental data into linearly independent historical environmental data feature vectors,
    the daily historical environmental data includes monitoring values of a plurality of environmental features; and
    the feature encoding of the daily historical environmental data comprises:
    calculating, based on the monitoring values of the plurality of environmental features included in the daily historical environmental data, a statistical feature corresponding to each environmental feature of the plurality of environmental features, the statistical feature including at least one of a maximum value feature, a minimum value feature, an average value feature, and a standard deviation feature;
    using the monitoring values of each environmental feature of the plurality of environmental features, together with the corresponding statistical feature, as a vector of that environmental feature; and
    filtering out, from the plurality of environmental features, the environmental features whose sparsity is greater than or equal to a sparsity threshold, and
    the plurality of preset environmental features comprises the environmental features, among the plurality of environmental features, other than those filtered out.
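Two operations recited in claim 17, sparsity filtering and fusion of the regressors' predictions, can be sketched as follows. Two assumptions are made here that the claim does not state: sparsity is taken as the fraction of missing (`None`) readings of a feature, and fusion is taken as a plain average. The function names and the example values in the usage note are hypothetical.

```python
def sparsity(values):
    """Fraction of missing readings (assumed definition of sparsity)."""
    if not values:
        return 1.0
    return sum(1 for v in values if v is None) / len(values)


def filter_sparse_features(features, threshold):
    """Keep only features whose sparsity is strictly below the threshold.

    Features with sparsity greater than or equal to the threshold are
    filtered out, matching the comparison recited in the claim.
    """
    return {name: values for name, values in features.items()
            if sparsity(values) < threshold}


def fuse_predictions(predictions):
    """Fuse the outputs of several regressors (assumed here: simple mean)."""
    return sum(predictions) / len(predictions)
```

With a threshold of 0.5, a feature with one missing reading out of four is kept, while one with three missing out of four is dropped; `fuse_predictions([100.0, 110.0, 120.0])` gives `110.0`. A weighted average or a stacked meta-model would be equally compatible with the wording "fusing".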
  18. A training system, comprising:
    a processor; and
    a memory configured to store one or more computer programs;
    wherein the one or more computer programs, when executed by the processor, cause the processor to perform the method according to any one of claims 1-10.
  19. The training system according to claim 18, further comprising a power station environment monitoring device,
    wherein the power station environment monitoring device is configured to monitor the photovoltaic power station to obtain the historical environmental data of the photovoltaic power station for each day of the plurality of days.
  20. A prediction system, comprising:
    a processor; and
    a memory configured to store one or more computer programs;
    wherein the one or more computer programs, when executed by the processor, cause the processor to perform the method according to any one of claims 11-16.
  21. The prediction system according to claim 20, further comprising a power station environment monitoring device,
    wherein the power station environment monitoring device is configured to monitor the plurality of preset environmental features of the photovoltaic power station to obtain the one day's monitoring values of the plurality of preset environmental features of the photovoltaic power station.
  22. A non-transitory computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1-10.
  23. A non-transitory computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method according to any one of claims 11-16.
PCT/CN2020/088709 2019-05-14 2020-05-06 Method for training power generation amount prediction model of photovoltaic power station, power generation amount prediction method and device of photovoltaic power station, training system, prediction system and storage medium WO2020228568A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910401500.9 2019-05-14
CN201910401500.9A CN111950752A (en) 2019-05-14 2019-05-14 Photovoltaic power station generating capacity prediction method, device and system and storage medium thereof

Publications (1)

Publication Number Publication Date
WO2020228568A1 (en) 2020-11-19

Family

ID=73288831

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/088709 WO2020228568A1 (en) 2019-05-14 2020-05-06 Method for training power generation amount prediction model of photovoltaic power station, power generation amount prediction method and device of photovoltaic power station, training system, prediction system and storage medium

Country Status (2)

Country Link
CN (1) CN111950752A (en)
WO (1) WO2020228568A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112668806B (en) * 2021-01-17 2022-09-06 中国南方电网有限责任公司 Photovoltaic power ultra-short-term prediction method based on improved random forest
CN113505927B (en) * 2021-07-14 2022-03-18 广东工业大学 Method, device, equipment and medium for selecting battery capacity of solar bird repelling equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103390199A (en) * 2013-07-18 2013-11-13 国家电网公司 Photovoltaic power generation capacity/power prediction device
CN104992248A (en) * 2015-07-07 2015-10-21 中山大学 Microgrid photovoltaic power station generating capacity combined forecasting method
CN105512763A (en) * 2015-12-07 2016-04-20 河海大学常州校区 Method and system for predicting photovoltaic power station middle-short term power generation
CN108960522A (en) * 2018-07-16 2018-12-07 浙江电腾云光伏科技有限公司 A kind of photovoltaic power generation quantity prediction analysis method
WO2019021438A1 (en) * 2017-07-27 2019-01-31 三菱電機株式会社 Solar power generation amount prediction device, solar power generation amount prediction system, prediction method, and program

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9053439B2 (en) * 2012-09-28 2015-06-09 Hewlett-Packard Development Company, L.P. Predicting near-future photovoltaic generation
CN104732296A (en) * 2015-04-01 2015-06-24 贵州电力试验研究院 Modeling method for distributed photovoltaic output power short-term prediction model
CN105303262A (en) * 2015-11-12 2016-02-03 河海大学 Short period load prediction method based on kernel principle component analysis and random forest
CN108428019B (en) * 2018-05-15 2021-09-03 阳光电源股份有限公司 Method for establishing component battery temperature calculation model and photovoltaic power prediction method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
傅美平等 (FU, MEIPING ET AL.): "基于相似日和最小二乘支持向量机的光伏发电短期预测 (Short-term, photostatic power forecasting based on similar days and least square support vector machine)", 电力系统保护与控制 (POWER SYSTEM PROTECTION AND CONTROL), vol. 40, no. 16, 16 August 2012 (2012-08-16), XP55753626, ISSN: 1674-3415, DOI: 20200623113640A *
单英浩等 (SHAN, YINGHAO ET AL.): "基于改进BP_SVM_ELM与粒_省略_SF的微电网光伏发电组合预测方法 (non-official translation: Combined forecasting of photovoltaic power generation in microgrid based on the improved BP-SVM-ELM and SOM-LSF)", 中国电机工程学报 (PROCEEDINGS OF THE CSEE), vol. 36, no. 12, 20 June 2016 (2016-06-20), XP55753630, ISSN: 0258-8013, DOI: 20200623114602A *

Also Published As

Publication number Publication date
CN111950752A (en) 2020-11-17

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20806090; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 20806090; Country of ref document: EP; Kind code of ref document: A1)