CN116629431A - Photovoltaic power generation amount prediction method and device based on variation modal decomposition and ensemble learning - Google Patents

Photovoltaic power generation amount prediction method and device based on variation modal decomposition and ensemble learning

Info

Publication number
CN116629431A
Authority
CN
China
Prior art keywords
power generation
modal
decomposition
photovoltaic power
photovoltaic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310615845.0A
Other languages
Chinese (zh)
Inventor
蹇照民
山宪武
桓露
潘红伟
麦尔旦·艾麦尔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Marketing Service Center Of State Grid Xinjiang Electric Power Co ltd Capital Intensive Center Metering Center
Original Assignee
Marketing Service Center Of State Grid Xinjiang Electric Power Co ltd Capital Intensive Center Metering Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Marketing Service Center Of State Grid Xinjiang Electric Power Co ltd Capital Intensive Center Metering Center
Priority to CN202310615845.0A
Publication of CN116629431A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/0442 Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Electricity, gas or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Economics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Marketing (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Biomedical Technology (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Quality & Reliability (AREA)
  • Development Economics (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a photovoltaic power generation prediction method and device based on variational mode decomposition (VMD) and ensemble learning. Embedded feature selection removes irrelevant or redundant features from the photovoltaic data set, which improves the prediction accuracy of the model and speeds up its training. VMD decomposes the photovoltaic power generation data into several simple modal components, which reduces the influence of noise on the prediction result. VMD is further improved with a greedy algorithm that selects modal components to form a modal component subset, which reduces the number of modal component prediction models and improves the accuracy of the final power generation prediction. Finally, the Stacking ensemble learning method constructs and combines several individual learners to complete the photovoltaic power generation prediction task. The method offers better prediction performance, high prediction accuracy and a small mean square error.

Description

Photovoltaic power generation amount prediction method and device based on variation modal decomposition and ensemble learning
Technical Field
The invention belongs to the technical fields of artificial intelligence and photovoltaic power generation, and relates to a photovoltaic power generation prediction method and device based on variational mode decomposition and ensemble learning, which can predict future photovoltaic power generation from historical power generation data and photovoltaic feature data.
Background
At present, photovoltaic power generation prediction methods based on variational mode decomposition improve the prediction accuracy of a model to some extent, but considerable room for improvement remains. Existing approaches use a single model to predict power generation directly; a single model is constrained by its own structure, which limits how far its prediction accuracy can be improved.
The specific shortcomings of existing products are as follows: they predict photovoltaic power generation with a single model built on the original variational mode decomposition; the decomposition method itself leaves considerable room for improvement on this task; and the combination of variational mode decomposition with models built by ensemble learning is still inadequate for photovoltaic power generation prediction, so its performance can be improved substantially.
Disclosure of Invention
The invention aims to provide a photovoltaic power generation prediction method and device that improves variational mode decomposition by selecting a modal component subset with a greedy algorithm, and that combines Embedded feature selection, the improved variational mode decomposition and Stacking ensemble learning.
The technical scheme adopted by the invention is as follows:
The photovoltaic power generation prediction method based on variational mode decomposition and ensemble learning comprises three parts: Embedded feature selection, improved variational mode decomposition based on modal component subset selection, and Stacking ensemble learning. Specifically, Embedded feature selection removes irrelevant or redundant features from the photovoltaic data set, which improves the prediction accuracy of the model and speeds up its training. Variational mode decomposition decomposes the photovoltaic power generation data into several simple modal components, which reduces the influence of noise on the prediction result. A greedy algorithm improves the variational mode decomposition by selecting modal components to form a modal component subset, which reduces the number of modal component prediction models and improves the accuracy of the final power generation prediction. Finally, the Stacking ensemble learning method constructs and combines several individual learners to complete the photovoltaic power generation prediction task.
Further, feature selection uses the Embedded method with extremely randomized trees (Extra-Trees) as the model evaluator. To select the photovoltaic feature subset, Extra-Trees first selects several photovoltaic feature subsets with m features each, and then chooses the subset with the smallest out-of-bag data error as the final photovoltaic feature subset.
Further, Extra-Trees selects a subset with m features from the photovoltaic feature data set as follows: the importance of each photovoltaic feature is calculated, the features that meet the feature importance requirement are retained, and the process is repeated until m features have been selected.
In the variational mode decomposition, a particle swarm search algorithm first searches for the number of decomposition modes K and the bandwidth limit α; the photovoltaic power generation data is then decomposed according to K and α to obtain several power generation modal components; finally, a greedy algorithm selects among the modal components to obtain a power generation modal component subset.
Further, a particle swarm search algorithm is used to search for the optimal combination of the number of decomposition modes K and the bandwidth limit α. First, the swarm size N, the initial positions [K, α] and the initial velocities v are initialized. Then the photovoltaic power generation sequence S is decomposed by variational mode decomposition according to K and α to obtain several modal components, and the envelope entropy of each modal component is calculated according to formula (1):
$$E_p = -\sum_{j=1}^{n} p(j)\,\log p(j), \qquad p(j) = \frac{a(j)}{\sum_{i=1}^{n} a(i)} \tag{1}$$
where a(j) is the envelope signal, obtained by Hilbert demodulation, of each of the K modal components produced by the variational mode decomposition; n is the number of sampling points; p(j) is the normalized probability distribution sequence of a(j); and E_p is the entropy of p(j), that is, the envelope entropy.
Next, the historical minimum envelope entropy of each particle is found, and from these the global minimum envelope entropy of the whole swarm; the position [K, α] and velocity v of each particle are updated using the parameters K and α corresponding to the current global minimum envelope entropy.
Finally, it is checked whether M iterations have been completed; if not, the search continues, and if so, the optimal solution [K, α] is output.
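The search loop described above can be sketched as a small particle swarm optimizer. This is a minimal illustration under stated assumptions, not the patent's implementation: the inertia weight `w`, the acceleration coefficients `c1` and `c2`, the search bounds and the `toy_objective` function are all hypothetical. In practice the objective would run the decomposition for a given [K, α] and return the minimum envelope entropy of the resulting components.

```python
import random

def pso_search(objective, bounds, n_particles=10, n_iter=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective(position) inside box bounds with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)

    def clip(value, d):
        lo, hi = bounds[d]
        return min(max(value, lo), hi)

    # Initialize positions [K, alpha] and velocities v for every particle.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # per-particle historical best
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # global best of the swarm

    for _ in range(n_iter):                      # M iterations in the text
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = clip(pos[i][d] + vel[i][d], d)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def toy_objective(position):
    """Hypothetical stand-in for 'minimum envelope entropy after VMD'."""
    k, alpha = round(position[0]), position[1]   # K must be an integer
    return (k - 6) ** 2 + ((alpha - 2000.0) / 1000.0) ** 2

best, best_val = pso_search(toy_objective, bounds=[(2, 10), (500, 5000)])
```

K is rounded inside the objective because the number of modes must be an integer, while the swarm itself moves in continuous space; other encodings are equally possible.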
Further, the variational mode decomposition algorithm proceeds as follows. The input S is the original photovoltaic power generation sequence, and the output {u_k} is the set of power generation modal components obtained by decomposition. First, u_k, ω_k, the Lagrangian multiplier λ and the iteration counter n = 0 are initialized. In each iteration, n is incremented by 1, and u_k and ω_k are updated according to formulas (2) and (3) for k = 1, 2, …, K, where K is the number of decomposition modes. λ is then updated according to formula (4). If formula (5) is satisfied, the iteration stops; otherwise the procedure returns to the step that increments n and the decomposition continues.
Formulas (2), (3) and (4) are the update functions of u_k, ω_k and λ, respectively, and formula (5) is the iteration stopping condition of the variational mode decomposition algorithm.
Here ω denotes frequency, and $\hat{u}_k^{n+1}(\omega)$, $\hat{f}(\omega)$ and $\hat{\lambda}^{n}(\omega)$ denote the Fourier transforms of $u_k^{n+1}(t)$, $f(t)$ and $\lambda^{n}(t)$, respectively.
Still further, the variables and functions involved in the greedy algorithm are C, S, solution, select and feasible. C = {IMF_0, IMF_1, …, IMF_{K-1}} is the candidate set, where IMF_i denotes the i-th power generation modal component. The select function picks from C the modal component with the largest coefficient of determination. S is empty initially, and each component chosen by select is added to S. The constraint is that the expanded S must yield a larger coefficient of determination for photovoltaic power generation prediction than the unexpanded S. The feasible function decides whether the current candidate IMF component is added to S by comparing the coefficients of determination obtained when predicting with the expanded S and with the unexpanded S.
Further, the greedy selection algorithm proceeds as follows. The input C is the candidate set of modal components, and the output S is the selected modal component subset, which is empty initially. The modal component c with the largest coefficient of determination is selected from C as the candidate and tentatively added to S. The coefficient of determination for power generation prediction is then compared between the subset S containing c and the subset S_0 without c. If the coefficient for S is smaller than that for S_0, c does not contribute to the optimal solution and is removed from S.
Further, the Stacking ensemble learning method comprises the following steps: training subsets are constructed on the training set by k-fold cross-validation; five models, RandomForest, XGBoost, LSTM, BP and GRU, are built as individual learners and trained on the training subsets to obtain the base models; a linear regression model is built as the meta-learner and trained on the outputs of the base models. Likewise, at test time the base models must first make predictions to generate a new test set, on which the meta-learner then makes the final prediction.
A photovoltaic power generation prediction device based on variational mode decomposition and ensemble learning, comprising:
a feature selection module, used to remove irrelevant or redundant features from the photovoltaic data set through feature selection, which improves the prediction accuracy of the model and speeds up its training;
a variational mode decomposition module, used to decompose the photovoltaic power generation data into several simple modal components, which reduces the influence of noise on the prediction result, and to improve the variational mode decomposition with a greedy algorithm;
and a Stacking ensemble learning module, used to construct and combine several individual learners to complete the photovoltaic power generation prediction task.
The invention has the characteristics and effects that:
(1) Feature selection is performed with the Embedded method to obtain a photovoltaic feature subset; irrelevant or redundant features are removed, which improves the training speed and prediction performance of the model.
(2) Variational mode decomposition is improved by selecting a modal component subset with a greedy algorithm. The photovoltaic power generation data is decomposed into several modal components, and the greedy algorithm selects the subset of modal components with the best prediction performance to take part in prediction, which improves the accuracy of the final power generation prediction.
(3) Individual learners are integrated with the Stacking method to build a Stacking model; prediction models for the modal component subset are trained on the photovoltaic feature subset, and the outputs of these modal component prediction models are combined to obtain the final power generation prediction result.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
Fig. 1 shows the photovoltaic power generation prediction processing framework based on variational mode decomposition and ensemble learning.
Fig. 2 is a flow chart of photovoltaic feature subset selection by Extra-Trees.
Fig. 3 is a flow chart of the PSO algorithm searching for the optimal combination of the number of decomposition modes K and the bandwidth limit α.
Fig. 4 is a flow chart of the Stacking algorithm integrating RandomForest, XGBoost, LSTM, BP and GRU.
Fig. 5 is a schematic diagram of data set partitioning for time-series cross-validation.
Detailed Description
The invention will now be described in detail with reference to the drawings and specific examples.
The invention provides a photovoltaic power generation prediction method based on variational mode decomposition (VMD) and ensemble learning. It improves VMD by selecting a modal component subset with a greedy algorithm, and combines Embedded feature selection, the improved VMD and Stacking ensemble learning into a combined prediction method. Embedded feature selection removes irrelevant or redundant features from the photovoltaic data set, which improves the prediction accuracy of the model and speeds up its training. VMD decomposes the photovoltaic power generation data into several simple modal components, which reduces the influence of noise on the prediction result. A greedy algorithm improves VMD by selecting modal components to form a modal component subset, which reduces the number of modal component prediction models and improves the accuracy of the final power generation prediction. The Stacking ensemble learning method constructs and combines several individual learners to complete the photovoltaic power generation prediction task. The overall processing framework of the invention is shown in Fig. 1.
1. Embedded feature selection
The invention uses the Embedded method for feature selection and selects extremely randomized trees (Extra-Trees) as the model evaluator. Extra-Trees can measure the importance of each feature, and the most important features are selected according to this importance measure.
The feature importance measures of a regression decision tree include the mean square error and the out-of-bag data error, among others. Out-of-bag data is the subset of samples that does not take part in training a decision tree; it can be used to evaluate the tree's performance, and the prediction error rate computed on it is called the out-of-bag data error. The feature importance based on the out-of-bag data error is given by formula (1.1):
$$fi(X) = \frac{\sum \left(\mathrm{error}_2 - \mathrm{error}_1\right)}{T} \tag{1.1}$$
where fi(X) is the feature importance of feature X; error_1 is the out-of-bag data error of a decision tree; error_2 is the out-of-bag data error after random noise has been added to feature X in all out-of-bag samples; error_2 − error_1 is the change in out-of-bag data error caused by perturbing feature X; T is the number of trees in the forest; and Σ(error_2 − error_1) is the sum of the out-of-bag data error changes over the T trees. If adding random noise to feature X significantly reduces the accuracy on the out-of-bag data, feature X plays a large role in predicting the samples, and its importance cannot be ignored.
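As a small worked example of formula (1.1), the sketch below averages the per-tree change in out-of-bag error and ranks features by the result. The feature names and error values are hypothetical, and the single-pass top-m ranking is a simplification of the repeated screening that the text describes.

```python
from statistics import mean

def feature_importance(oob_errors, noised_oob_errors):
    """fi(X) = sum(error_2 - error_1) / T over the T trees of the forest."""
    if len(oob_errors) != len(noised_oob_errors) or not oob_errors:
        raise ValueError("need one (error_1, error_2) pair per tree")
    return mean(e2 - e1 for e1, e2 in zip(oob_errors, noised_oob_errors))

def select_top_m(importances, m):
    """Return the names of the m features with the largest importance."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    return ranked[:m]

# Hypothetical out-of-bag errors for T = 3 trees (error_1), and the errors
# after noise is added to one feature at a time (error_2).
error_1 = [0.20, 0.25, 0.22]
noised = {
    "irradiance": [0.60, 0.70, 0.65],
    "temperature": [0.30, 0.33, 0.31],
    "wind_speed": [0.21, 0.25, 0.22],
}
importances = {name: feature_importance(error_1, e2) for name, e2 in noised.items()}
selected = select_top_m(importances, m=2)   # -> ["irradiance", "temperature"]
```

Perturbing `wind_speed` barely changes the error in this toy data, so it is the feature dropped when m = 2.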
To select the photovoltaic feature subset, Extra-Trees first selects several photovoltaic feature subsets with m features each, and then chooses the subset with the smallest out-of-bag data error as the final photovoltaic feature subset. The process by which Extra-Trees selects a subset with m features from the photovoltaic feature data set is shown in Fig. 2: the importance of each photovoltaic feature is calculated, the features that meet the feature importance requirement are retained, and the process is repeated until m features have been selected.
2. Improved variational mode decomposition based on modal component subset selection
The invention first uses a particle swarm search algorithm to search for the number of decomposition modes K and the bandwidth limit α, then decomposes the photovoltaic power generation data by variational mode decomposition according to K and α to obtain several power generation modal components, and finally uses a greedy algorithm to select among the modal components and obtain a power generation modal component subset.
2.1 Particle swarm search algorithm to determine the number of decomposition modes K and the bandwidth limit α
The choice of the number of decomposition modes K and the bandwidth limit α has a great influence on the decomposition quality of the VMD method. As shown in Fig. 3, the invention uses a particle swarm search algorithm to search for the optimal combination of K and α. First, the swarm size N, the initial positions [K, α] and the initial velocities v are initialized. Then the photovoltaic power generation sequence S is decomposed by variational mode decomposition according to K and α to obtain several modal components, and the envelope entropy of each modal component is calculated according to formula (2.1). Next, the historical minimum envelope entropy of each particle is found, and from these the global minimum envelope entropy of the whole swarm; the position [K, α] and velocity v of each particle are updated using the parameters K and α corresponding to the current global minimum envelope entropy. Finally, it is checked whether M iterations have been completed; if not, the search continues, and if so, the optimal solution [K, α] is output.
$$E_p = -\sum_{j=1}^{n} p(j)\,\log p(j), \qquad p(j) = \frac{a(j)}{\sum_{i=1}^{n} a(i)} \tag{2.1}$$
where a(j) is the envelope signal, obtained by Hilbert demodulation, of each of the K modal components produced by the variational mode decomposition; n is the number of sampling points; p(j) is the normalized probability distribution sequence of a(j); and E_p is the entropy of p(j), that is, the envelope entropy.
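The envelope entropy itself is simple to compute once an envelope is available. The sketch below assumes the envelope a(j) has already been obtained (for example by Hilbert demodulation of one modal component) and uses the natural logarithm, which is an assumption, since the text does not state the base.

```python
import math

def envelope_entropy(envelope):
    """Entropy of the normalized envelope sequence p(j) = a(j) / sum(a)."""
    total = sum(envelope)
    if total <= 0:
        raise ValueError("envelope must have a positive sum")
    entropy = 0.0
    for a in envelope:
        p = a / total
        if p > 0:                      # the term 0 * log(0) is taken as 0
            entropy -= p * math.log(p)
    return entropy

# A flat envelope carries maximal entropy; a spiky one is close to zero.
flat = envelope_entropy([1.0] * 8)             # equals log(8)
spiky = envelope_entropy([8.0] + [1e-9] * 7)
```

A lower envelope entropy indicates a sparser, more regular component, which is why the search above minimizes it.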
2.2 Variational mode decomposition to obtain modal components
The variational mode decomposition algorithm used in the invention is described in Table 1. The input S is the original photovoltaic power generation sequence, and the output {u_k} is the set of power generation modal components obtained by decomposition. First, u_k, ω_k, the Lagrangian multiplier λ and the iteration counter n = 0 are initialized. In each iteration, n is incremented by 1, and u_k and ω_k are updated according to formulas (2.2) and (2.3) for k = 1, 2, …, K, where K is the number of decomposition modes. λ is then updated according to formula (2.4). If formula (2.5) is satisfied, the iteration stops; otherwise the procedure returns to the step that increments n and the decomposition continues.
Table 1 Process of decomposing photovoltaic power generation data by the variational mode decomposition algorithm
Formulas (2.2), (2.3) and (2.4) are the update functions of u_k, ω_k and λ, respectively, and formula (2.5) is the iteration stopping condition of the variational mode decomposition algorithm.
Here ω denotes frequency, and $\hat{u}_k^{n+1}(\omega)$, $\hat{f}(\omega)$ and $\hat{\lambda}^{n}(\omega)$ denote the Fourier transforms of $u_k^{n+1}(t)$, $f(t)$ and $\lambda^{n}(t)$, respectively.
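For orientation, the update rules and stopping condition of the commonly published VMD algorithm take the form below. This is a sketch of that standard formulation, offered on the assumption that formulas (2.2) to (2.5) follow it; the multiplier step size τ and the tolerance ε are symbols from the standard formulation and are not named in the text. In formula (2.2), each $\hat{u}_i$ with i ≠ k stands for the most recent estimate of that mode.

$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}^{n}(\omega)/2}{1 + 2\alpha\left(\omega - \omega_k^{n}\right)^2} \tag{2.2}$$

$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega \left|\hat{u}_k^{n+1}(\omega)\right|^2 d\omega}{\int_0^{\infty} \left|\hat{u}_k^{n+1}(\omega)\right|^2 d\omega} \tag{2.3}$$

$$\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega) + \tau\left(\hat{f}(\omega) - \sum_{k=1}^{K} \hat{u}_k^{n+1}(\omega)\right) \tag{2.4}$$

$$\sum_{k=1}^{K} \frac{\left\|\hat{u}_k^{n+1} - \hat{u}_k^{n}\right\|_2^2}{\left\|\hat{u}_k^{n}\right\|_2^2} < \varepsilon \tag{2.5}$$

Formula (2.2) acts as a filter centred on ω_k whose bandwidth narrows as α grows, which is why the choice of α and K searched in section 2.1 affects the decomposition so strongly.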
2.3 Processing framework for greedy selection of modal components
The variables and functions involved in the greedy algorithm of the invention are listed in Table 2. C = {IMF_0, IMF_1, …, IMF_{K-1}} is the candidate set, where IMF_i denotes the i-th power generation modal component. The select function picks from C the modal component with the largest coefficient of determination. S is empty initially, and each component chosen by select is added to S. The constraint is that the expanded S must yield a larger coefficient of determination for photovoltaic power generation prediction than the unexpanded S. The feasible function decides whether the current candidate IMF component is added to S by comparing the coefficients of determination obtained when predicting with the expanded S and with the unexpanded S.
Table 2 Variables and functions of the greedy algorithm
The greedy selection algorithm used in the invention is described in Table 3. The input C is the candidate set of modal components, and the output S is the selected modal component subset, which is empty initially. The modal component c with the largest coefficient of determination is selected from C as the candidate and tentatively added to S. The coefficient of determination for power generation prediction is then compared between the subset S containing c and the subset S_0 without c. If the coefficient for S is smaller than that for S_0, c does not contribute to the optimal solution and is removed from S.
Table 3 Greedy algorithm process for selecting power generation modal components
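The greedy selection described in this section can be sketched as follows. The `score` callable is a hypothetical stand-in for "train a predictor on this subset of components and return its coefficient of determination"; `toy_score` and its gain values are invented purely so the example runs, and ranking candidates once by their single-component score is an assumption about how select orders them.

```python
def greedy_select(candidates, score):
    """Greedily build a subset S of component names that maximizes score(S).

    candidates: list of component names (the candidate set C).
    score: callable mapping a list of names to a coefficient of determination.
    """
    # select: visit candidates from the best single-component score downward.
    ordered = sorted(candidates, key=lambda name: score([name]), reverse=True)
    selected, best = [], float("-inf")
    for name in ordered:
        trial = selected + [name]          # tentatively expand S with c
        trial_score = score(trial)
        if trial_score > best:             # feasible: expanded S beats S_0
            selected, best = trial, trial_score
        # otherwise c is dropped and S stays as it was
    return selected, best

# Hypothetical additive gains, used only to make the example executable.
gains = {"IMF0": 0.60, "IMF1": 0.25, "IMF2": -0.05, "IMF3": 0.10}

def toy_score(subset):
    return sum(gains[name] for name in subset)

subset, r2 = greedy_select(list(gains), toy_score)
# subset -> ["IMF0", "IMF1", "IMF3"]; IMF2 lowers the score and is rejected
```

Because each candidate is evaluated once, the procedure trains at most one extra model per component, at the cost of not guaranteeing a globally optimal subset.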
3. Stacking ensemble learning
The flow of the Stacking method in the invention is shown in Fig. 4. Training subsets are constructed on the training set by k-fold cross-validation, with k = 5 here. Five models, RandomForest, XGBoost, LSTM, BP and GRU, are built as individual learners and trained on the training subsets to obtain the base models. A linear regression model is built as the meta-learner and trained on the outputs of the base models. Likewise, at test time the base models must first make predictions to generate a new test set, on which the meta-learner then makes the final prediction.
The photovoltaic data set in the invention is a time series arranged in chronological order, so the validation data must follow the training data in time. The data set partitioning for k-fold cross-validation of a time series is shown in Fig. 5; the first fold of the data set is used only for training and never for validation. As shown in formula (2.6), the data set D is divided into k mutually exclusive subsets of similar size, the training and validation sets are formed from them, and the corresponding indices are returned: if the first i folds form the training set, the (i+1)-th fold is the validation set, and the returned training index set and validation index are {1, 2, …, i} and i+1, respectively.
$$D = D_1 \cup D_2 \cup \cdots \cup D_k, \qquad D_i \cap D_j = \varnothing \ (i \neq j) \tag{2.6}$$
where D_i is the i-th data subset and k is the number of cross-validation folds.
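The fold layout described above can be sketched in a few lines. The function name and the use of zero-based sample indices are illustrative choices, not taken from the patent.

```python
def time_series_folds(n_samples, k):
    """Yield (train_indices, validation_indices) for time-ordered k-fold CV.

    The samples are cut into k contiguous folds of similar size. For
    i = 1 .. k-1, folds 1..i form the training set and fold i+1 is the
    validation set, so validation data always follows the training data.
    """
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)   # spread the remainder evenly
        folds.append(list(range(start, start + size)))
        start += size
    for i in range(1, k):
        train = [idx for fold in folds[:i] for idx in fold]
        yield train, folds[i]

splits = list(time_series_folds(10, 5))
# splits[0] -> ([0, 1], [2, 3]); the first fold is never used for validation
```

With k = 5 this produces four train/validation pairs, and the training window grows by one fold each time.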
The Stacking method trains the individual learners on the original data set using time-series cross-validation, and generates a new data set for training the meta-learner from the outputs of the individual learners. The output of each individual learner consists of its predictions on each validation fold. The outputs of all individual learners are concatenated column-wise to form the feature input of the new training set, while the labels of the original training set are kept as the labels of the new training set. Training the meta-learner on this new training set completes the combination of the individual learners. Likewise, at prediction time all base models first make predictions, a new test set is built from their outputs, and the meta-learner then predicts on this test set, which combines the prediction results of the individual learners.
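The data flow of this Stacking procedure can be illustrated with a dependency-free sketch. The two base learners here (`fit_line` and `fit_mean`) are deliberately trivial, hypothetical stand-ins for RandomForest, XGBoost, LSTM, BP and GRU; the small `ridge` term in the least-squares solver is an added numerical safeguard and is not part of the described method.

```python
def solve(a, b):
    """Solve the square linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_linear(rows, targets, ridge=1e-8):
    """Least-squares weights for features [1, *row] via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    d = len(design[0])
    ata = [[sum(x[i] * x[j] for x in design) + (ridge if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    atb = [sum(x[i] * t for x, t in zip(design, targets)) for i in range(d)]
    return solve(ata, atb)

def predict_linear(weights, rows):
    return [weights[0] + sum(w * v for w, v in zip(weights[1:], r)) for r in rows]

def fit_line(xs, ys):
    """Toy base learner: straight-line fit on a single input feature."""
    w = fit_linear([[x] for x in xs], ys)
    return lambda queries: predict_linear(w, [[x] for x in queries])

def fit_mean(xs, ys):
    """Toy base learner: always predicts the training mean."""
    mu = sum(ys) / len(ys)
    return lambda queries: [mu for _ in queries]

def stack_fit_predict(xs, ys, x_test, base_fitters, k=5):
    n = len(xs)
    cuts = [round(i * n / k) for i in range(k + 1)]
    meta_rows, meta_targets = [], []
    for i in range(1, k):                       # folds 1..i train, fold i+1 validates
        tr, va = slice(0, cuts[i]), slice(cuts[i], cuts[i + 1])
        preds = [fit(xs[tr], ys[tr])(xs[va]) for fit in base_fitters]
        meta_rows += [list(p) for p in zip(*preds)]   # column-wise combination
        meta_targets += ys[va]                        # original labels are kept
    meta_w = fit_linear(meta_rows, meta_targets)      # linear-regression meta-learner
    base_models = [fit(xs, ys) for fit in base_fitters]
    test_rows = [list(p) for p in zip(*[m(x_test) for m in base_models])]
    return predict_linear(meta_w, test_rows)

xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]                # hypothetical, noise-free series
preds = stack_fit_predict(xs, ys, [20.0, 21.0], [fit_line, fit_mean])
```

On this noise-free toy series the meta-learner puts essentially all of its weight on the line fit, so the stacked predictions follow y = 2x + 1.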
The invention compares the evaluation metrics of RandomForest, XGBoost, LSTM, BP and GRU, and of the improved variation modal decomposition Stacking model without feature selection, the variation modal decomposition Stacking model and the Stacking model, on the photovoltaic power generation amount prediction task.
The mean squared error, mean absolute error and coefficient of determination of RandomForest, XGBoost, LSTM, BP, GRU and the method of the invention on the photovoltaic power generation amount prediction task are shown in Table 4.
Table 4. Mean squared error, mean absolute error and coefficient of determination of RandomForest, XGBoost, LSTM, BP, GRU and the method of the invention
As can be seen from Table 4, the combined photovoltaic power generation amount prediction method provided by the invention, which combines feature selection, improved variation modal decomposition and ensemble learning, has the best prediction performance, the highest prediction accuracy and the smallest error. The mean squared error of the proposed method is 0.2232; the mean absolute error is 0.3387, on average 31.40% lower than that of the other methods; and the coefficient of determination is 0.9797, on average 4.68% higher than that of the other methods.
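The mean squared error, mean absolute error and coefficient of determination of the improved variation modal decomposition Stacking model without feature selection, the variation modal decomposition Stacking model, the Stacking model and the method of the invention on the photovoltaic power generation amount prediction task are shown in Table 5.
Table 5. Mean squared error, mean absolute error and coefficient of determination of the improved variation modal decomposition Stacking model without feature selection, the variation modal decomposition Stacking model, the Stacking model and the method of the invention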
As can be seen from Table 5, the combined photovoltaic power generation amount prediction method combining feature selection, improved variation modal decomposition and ensemble learning has the best prediction performance, the highest prediction accuracy and the smallest error. The mean squared error of the proposed method is 0.2232, on average 32.91% lower than that of the other methods; the mean absolute error is 0.3387, on average 16.36% lower than that of the other methods; and the coefficient of determination is 0.9797, on average 1.03% higher than that of the other methods.
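For reference, the three metrics reported in Tables 4 and 5 can be computed as in the sketch below. The definitions are the usual textbook ones; the function names `mse`, `mae` and `r2` are illustrative, and the numbers in the demo are made up and are not taken from the tables.

```python
from typing import Sequence


def _check(y_true: Sequence[float], y_pred: Sequence[float]) -> None:
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error: average of squared residuals."""
    _check(y_true, y_pred)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error: average of absolute residuals."""
    _check(y_true, y_pred)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    _check(y_true, y_pred)
    mean_true = sum(y_true) / len(y_true)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("coefficient of determination is undefined for constant y_true")
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    y_true = [1.0, 2.0, 3.0, 4.0]   # illustrative values only
    y_pred = [1.0, 2.0, 3.0, 5.0]
    print(mse(y_true, y_pred), mae(y_true, y_pred), r2(y_true, y_pred))
```

Lower mean squared error and mean absolute error, and a coefficient of determination closer to 1, indicate a better fit.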
The foregoing has shown and described the basic principles, principal features and advantages of the invention. It should be understood by those skilled in the art that the above embodiments do not limit the scope of the present invention in any way, and all technical solutions obtained by equivalent substitution and the like fall within the scope of the present invention. Parts not covered by the invention are the same as the prior art or can be implemented using the prior art.

Claims (10)

1. A photovoltaic power generation amount prediction method based on variation modal decomposition and ensemble learning, characterized by comprising three parts, namely Embedded feature selection, improved variation modal decomposition based on modal component subset selection, and Stacking ensemble learning, specifically: Embedded feature selection is used to remove irrelevant or redundant features from the photovoltaic data set, which improves the prediction accuracy of the model and accelerates model training; variation modal decomposition is used to decompose the photovoltaic power generation data into a plurality of simple modal components, which reduces the influence of noise on the power generation prediction result; the variation modal decomposition is improved by a greedy algorithm that selects modal components to form a modal component subset, which reduces the number of modal component prediction models and improves the accuracy of the final power generation prediction; and finally, a plurality of individual learners are constructed and combined by the Stacking ensemble learning method to complete the photovoltaic power generation amount prediction task.
2. The photovoltaic power generation amount prediction method based on variation modal decomposition and ensemble learning according to claim 1, wherein feature selection is performed by the Embedded method with an extreme random tree as the model evaluator; the steps by which the extreme random tree selects the photovoltaic feature subset are: a plurality of photovoltaic feature subsets, each containing m features, are selected, and the subset with the smallest out-of-bag data error is then chosen as the final photovoltaic feature subset.
3. The photovoltaic power generation amount prediction method based on variation modal decomposition and ensemble learning according to claim 2, wherein a photovoltaic feature subset is selected from the photovoltaic feature data set using the extreme random tree as follows: first, the importance of each photovoltaic feature is calculated; then the features that meet the feature importance requirement are retained; and this process is repeated until m features have been selected.
4. The photovoltaic power generation amount prediction method based on variation modal decomposition and ensemble learning according to claim 1, wherein in the variation modal decomposition, a particle swarm search algorithm is first used to search for the parameters, namely the number of decomposition modes K and the bandwidth limit α; variation modal decomposition is then performed on the photovoltaic power generation amount data according to the parameters K and α to obtain a plurality of power generation modal components; and a greedy algorithm is then used to select modal components to obtain a power generation modal component subset.
5. The photovoltaic power generation amount prediction method based on variation modal decomposition and ensemble learning according to claim 4, wherein the optimal parameter combination of the number of decomposition modes K and the bandwidth limit α is searched for by the particle swarm search algorithm: the particle swarm size N, the initial positions [K, α] and the initial velocities v are initialized first; variation modal decomposition is then performed on the photovoltaic power generation sequence S according to the number of decomposition modes K and the bandwidth limit α to obtain a plurality of modal components, and the envelope entropy of each modal component is calculated according to formula (1):
$$p(j) = \frac{a(j)}{\sum_{i=1}^{n} a(i)}, \qquad E_p = -\sum_{j=1}^{n} p(j)\,\log p(j) \tag{1}$$
where a(j) denotes the envelope signal, obtained by Hilbert demodulation, of each of the K modal components produced by variation modal decomposition, n denotes the number of sampling points, p(j) denotes the normalized probability distribution sequence of a(j), and E_p denotes the entropy of p(j), i.e., the envelope entropy;
the historical minimum envelope entropy of each particle is then found, and from these the global minimum envelope entropy of the whole particle swarm; the position [K, α] and the velocity v of each particle are updated using the parameters K and α corresponding to the current global minimum envelope entropy;
finally, it is judged whether M iterations have been completed; if not, the search continues, and if so, the optimal solution [K, α] is output.
6. The photovoltaic power generation amount prediction method based on variation modal decomposition and ensemble learning according to claim 4, wherein the variation modal decomposition algorithm proceeds as follows: the input S denotes the original photovoltaic power generation sequence, and the output {u_k} denotes the set of power generation modal components obtained by decomposition; u_k, ω_k and the Lagrangian multiplier λ are initialized and n is set to 0; n is incremented by 1 to count the iterations; starting from k = 1, u_k and ω_k are updated according to formulas (2) and (3), with k incremented by 1 after each update until k = K, where K is the number of decomposition modes; λ is then updated according to formula (4); if formula (5) is satisfied the iteration stops, otherwise the process returns to the step of incrementing n by 1 and the decomposition continues;
wherein formulas (2), (3) and (4) are the update functions of u_k, ω_k and λ respectively, and formula (5) is the iteration stopping condition of the variation modal decomposition algorithm:
where ω denotes the frequency, and $\hat{u}_k^{n+1}(\omega)$, $\hat{f}(\omega)$ and $\hat{\lambda}^{n}(\omega)$ denote the Fourier transforms of $u_k^{n+1}(t)$, $f(t)$ and $\lambda^{n}(t)$, respectively;
7. The photovoltaic power generation amount prediction method based on variation modal decomposition and ensemble learning according to claim 4, wherein the variables and functions involved in the greedy algorithm include C, S, solution, select and feasible; C = {IMF_0, IMF_1, …, IMF_{K-1}}, where IMF_i denotes the i-th power generation modal component; the select function selects from C the power generation modal component corresponding to the largest coefficient of determination; S is empty initially, and the power generation modal component selected by the select function is added to S; the constraint condition is that the coefficient of determination obtained when the expanded S is used for photovoltaic power generation prediction is larger than that obtained with the unexpanded S; and the feasible function decides whether the current candidate power generation IMF component is added to S by comparing the coefficients of determination of power generation prediction using the expanded S and the unexpanded S.
8. The photovoltaic power generation amount prediction method based on variation modal decomposition and ensemble learning according to claim 7, wherein the greedy selection algorithm proceeds as follows: the input C denotes the modal component candidate set and the output S denotes the selected modal component subset; S is empty initially; the modal component c corresponding to the largest coefficient of determination is selected from C as the candidate component and an attempt is made to add c to S; the coefficients of determination for power generation prediction are compared between the component subset S with c added and the component subset S_0 without c added; if the coefficient of determination corresponding to S is smaller than that of S_0, c does not participate in constructing the optimal solution and is deleted from S.
9. The photovoltaic power generation amount prediction method based on variation modal decomposition and ensemble learning according to claim 1, wherein the Stacking ensemble learning method comprises the following steps: training subsets are constructed on the training set through k-fold cross-validation; five models, namely RandomForest, XGBoost, LSTM, BP and GRU, are constructed as individual learners and trained on the training subsets to obtain base models; a linear regression model is constructed as the meta learner and is trained on the outputs of the base models; similarly, at test time the base models first make predictions to generate a new test set, and the meta learner then predicts on that new test set.
10. A photovoltaic power generation amount prediction device based on variation modal decomposition and ensemble learning, characterized by comprising:
a feature selection module, used for removing irrelevant or redundant features from the photovoltaic data set through feature selection, thereby improving the prediction accuracy of the model and accelerating model training;
a variation modal decomposition module, used for decomposing the photovoltaic power generation amount data into a plurality of simple modal components, reducing the influence of noise on the power generation prediction result, and improving the variation modal decomposition with a greedy algorithm;
and a Stacking ensemble learning module, used for constructing and combining a plurality of individual learners to complete the photovoltaic power generation amount prediction task.
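The greedy modal component selection of claims 7 and 8 can be illustrated with the following sketch. It rests on assumptions that the claims do not spell out: the candidates are ranked once by the coefficient of determination each achieves on its own, every candidate is tried in that order, and `score` is a hypothetical callback that would, in practice, train a prediction model on the given component subset and return its coefficient of determination. The `toy_score` function and its numbers are invented purely for the demo.

```python
from typing import Callable, List, Sequence


def greedy_select(
    candidates: Sequence[str],
    score: Callable[[List[str]], float],
) -> List[str]:
    """Greedily build a component subset S from the candidate set C.

    select:   take candidates in descending order of their individual score.
    feasible: keep a candidate only if adding it to S raises the score.
    """
    ranked = sorted(candidates, key=lambda comp: score([comp]), reverse=True)
    selected: List[str] = []
    best = float("-inf")          # score of the empty subset (assumption)
    for comp in ranked:
        trial = selected + [comp]
        trial_score = score(trial)
        if trial_score > best:    # expanded S beats unexpanded S
            selected, best = trial, trial_score
        # otherwise the component is left out of S
    return selected


if __name__ == "__main__":
    # Invented per-component contributions; a negative value mimics a noisy mode.
    gains = {"IMF0": 0.6, "IMF1": 0.3, "IMF2": -0.05, "IMF3": 0.05}

    def toy_score(subset: List[str]) -> float:
        return sum(gains[c] for c in subset)

    print(greedy_select(list(gains), toy_score))
```

In this toy run the component with a negative contribution is rejected, which mirrors the stated aim of reducing the number of modal component prediction models while keeping prediction accuracy.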
CN202310615845.0A 2023-05-26 2023-05-26 Photovoltaic power generation amount prediction method and device based on variation modal decomposition and ensemble learning Pending CN116629431A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310615845.0A CN116629431A (en) 2023-05-26 2023-05-26 Photovoltaic power generation amount prediction method and device based on variation modal decomposition and ensemble learning

Publications (1)

Publication Number Publication Date
CN116629431A true CN116629431A (en) 2023-08-22

Family

ID=87641345

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination