CN116702992A

CN116702992A - Power generation power prediction method and device, electronic equipment and storage medium

Info

Publication number: CN116702992A
Application number: CN202310731929.0A
Authority: CN
Inventors: 周兆星; 柳昭昭
Original assignee: China Telecom Corp Ltd
Current assignee: China Telecom Corp Ltd
Priority date: 2023-06-20
Filing date: 2023-06-20
Publication date: 2023-09-05

Abstract

The invention discloses a method and a device for predicting generated power, electronic equipment and a storage medium, wherein the method comprises the following steps: acquiring feature data, and determining feature importance of each feature sample in the feature data by combining random forests with out-of-band data distribution; and determining a training sample based on the feature data; setting an original prediction model based on a two-way long-short-term memory neural network, and training the original prediction model by using a training sample; optimizing and adjusting the hyper-parameters of the original prediction model by combining a firefly optimization algorithm with a chaos initialization and population advantage replacement strategy to obtain a target prediction model; grouping each feature sample in the feature data according to the feature importance to obtain a plurality of feature combinations; and inputting the feature combination into a target prediction model to obtain a predicted value, and determining the target feature combination based on the predicted value to predict. The method can effectively improve the stability and the prediction precision of the prediction model, and can be widely applied to the technical field of model training.

Description

Power generation power prediction method and device, electronic equipment and storage medium

Technical Field

The present invention relates to the field of model training technologies, and in particular, to a method and apparatus for predicting generated power, an electronic device, and a storage medium.

Background

As global warming intensifies, solar energy has become one of the most desirable renewable energy sources in the world. Traditional grid dispatching relies on a reliable power supply and a predictable load, and the reliability of grid operation can be improved by adjusting the supply side and the consumption side. Photovoltaic power generation, however, is random, intermittent and volatile under the influence of weather and environment, and these characteristics can negatively affect the stable operation of the power system. When photovoltaic power is connected at large scale, generation becomes variable and the power quality of the grid can suffer. Most grids therefore curtail photovoltaic output to reduce its impact on the grid, but curtailment limits the development of photovoltaic power generation. Accurate photovoltaic power prediction can reduce the amount of curtailment, greatly improve the development and utilization of solar energy, and reduce economic losses to the grid.

Accurately predicting the energy generated by photovoltaic power generation is therefore of great value for developing photovoltaic technology. Photovoltaic power forecasts can be divided into three types: medium-long term prediction (daily or weekly), short term prediction (hourly or daily), and ultra-short term prediction (every minute or every few minutes). Medium-long term prediction is generally used for maintenance and operation management of a photovoltaic plant, ultra-short term prediction for real-time dispatching of the power grid, and short term prediction for helping the planning and dispatching unit make a daily power generation plan and an economic dispatching scheme. Because short-term photovoltaic power prediction helps the power department arrange the daily generation plan reasonably, dispatch economically and efficiently, and promote the development and use of clean energy, it has become a current research hotspot. Existing photovoltaic power prediction methods fall mainly into those based on statistical analysis models and those based on artificial intelligence models. Both can predict photovoltaic power, and existing methods are accurate in a stable operating state, but accuracy is often not guaranteed under the influence of varying meteorological factors, and existing machine learning prediction algorithms perform poorly in this field. The overall root mean square error (RMSE) of photovoltaic power prediction is still about 6% to 10%; commercial prediction software has low accuracy and insufficient stability, falls well short of the practical requirements of power system and micro-grid operation, and struggles to meet the growing accuracy demands of photovoltaic power prediction.

Disclosure of Invention

The present invention aims to solve at least one of the technical problems in the related art to some extent. Therefore, the invention provides a method, a device, electronic equipment and a storage medium for predicting the generated power, which can obtain an excellent prediction model and realize accurate photovoltaic power generation prediction.

In one aspect, an embodiment of the present invention provides a method for predicting generated power, including:

acquiring feature data, and determining feature importance of each feature sample in the feature data by combining random forests with out-of-band data distribution; and determining a training sample based on the feature data; wherein, the characteristic data comprises a plurality of characteristic samples;

setting an original prediction model based on a two-way long-short-term memory neural network, and training the original prediction model by using a training sample; optimizing and adjusting the hyper-parameters of the original prediction model by combining a firefly optimization algorithm with a chaos initialization and population advantage replacement strategy to obtain a target prediction model;

grouping each feature sample in the feature data according to the feature importance to obtain a plurality of feature combinations;

inputting the feature combination into a target prediction model to obtain a predicted value, and determining the target feature combination based on the predicted value;

and determining predicted input data according to the target feature combination, and predicting through the target prediction model to obtain a prediction result.
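The selection of a target feature combination can be illustrated with a small sketch. It assumes that "based on the predicted value" means choosing the combination whose predictions give the lowest error; the `evaluate` callable and the error values below are hypothetical stand-ins for running the target prediction model on a validation set.

```python
def select_target_combination(combinations, evaluate):
    """Return (best_combination, best_error), where lower error is better.

    `evaluate` is a hypothetical callable that maps a feature combination
    to a validation error such as RMSE.
    """
    scored = [(evaluate(combo), combo) for combo in combinations]
    best_error, best_combo = min(scored, key=lambda item: item[0])
    return best_combo, best_error


# Illustrative placeholder errors keyed by combination size (not measured data).
fake_errors = {1: 0.12, 2: 0.08, 3: 0.09}
combos = [["irradiance"],
          ["irradiance", "temperature"],
          ["irradiance", "temperature", "humidity"]]
target, error = select_target_combination(combos, lambda c: fake_errors[len(c)])
```

In practice `evaluate` would train or run the model on the selected columns only; the stub here just keeps the example self-contained.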

Optionally, determining the feature importance of each feature sample in the feature data by combining random forest with out-of-band data distribution includes:

based on each characteristic sample in the characteristic data, establishing a plurality of first decision trees through a random forest;

self-help resampling is carried out on the first decision tree, a training set and an out-of-band data set are generated, and a second decision tree is constructed;

based on the ordering of the second decision tree to the out-of-band data set, statistically obtaining first sample data;

disturbance of each characteristic sample in the out-of-band data set is carried out, and an out-of-band data sample set corresponding to each characteristic sample is obtained; performing type estimation on the out-of-band data sample set by using a second decision tree to obtain second sample data;

and carrying out accumulation summation on the difference values of the first sample data and the second sample data, and determining the feature importance of each feature sample according to the average value of the accumulation summation.

Optionally, optimizing and adjusting the hyper-parameters of the original prediction model by combining a firefly optimization algorithm with a chaotic initialization and population advantage replacement strategy, wherein the method comprises the following steps:

generating a firefly position through a chaos initialization sequence to obtain a first firefly population; wherein, firefly position represents each corresponding super parameter in the original prediction model;

sequentially performing iterative updating processing on luciferin, firefly positions and decision radii based on the first firefly population until a preset condition is reached, so as to obtain a second firefly population;

performing population dominance replacement operation on firefly individuals in the second firefly population, and performing mutation treatment on the firefly individuals subjected to the population dominance replacement operation to obtain a target firefly population; and determining the super parameters after the optimization and adjustment of the original prediction model based on the target firefly population.

Optionally, based on the first firefly population, sequentially performing iterative updating processing of luciferin, firefly positions and decision radii until a preset condition is reached, to obtain a second firefly population, including:

acquiring fitness function values of each firefly individual in the first firefly population; the fitness function value characterizes the root mean square error of the model predicted value and the actual value; the fitness function value is inversely related to the prediction accuracy;

and sequentially carrying out iterative updating processing on luciferin, firefly positions and decision radii based on the fitness function value until the objective function value and the objective firefly position are unchanged within the preset iteration times, so as to obtain a second firefly population.

Optionally, performing a population dominance replacement operation on firefly individuals in the second firefly population, comprising:

acquiring fitness function values of each firefly individual in the second firefly population; the fitness function value characterizes the root mean square error of the model predicted value and the actual value; the fitness function value is inversely related to the prediction accuracy;

sorting the firefly individuals of the second firefly population based on the fitness function value;

and replacing firefly individuals in the front and rear parts after sequencing based on the prediction proportion.

Optionally, performing mutation treatment on firefly individuals subjected to the population dominance substitution operation, including:

and carrying out mutation treatment on firefly individuals subjected to population advantage replacement operation based on the normally distributed random vectors.

Optionally, grouping each feature sample in the feature data according to the feature importance, to obtain a number of feature combinations, including:

according to the magnitude of the numerical value of the feature importance, sequencing each feature sample in the feature data from high to low;

taking 1 as the number of combined samples, extracting the front characteristic samples with the number of combined samples in the sequence, and obtaining characteristic combinations;

and taking the number of combined samples plus 1 as the number of combined samples, and then returning to the step of extracting the front characteristic samples with the number of combined samples in the sequence until the number of combined samples reaches the total number of the characteristic samples, so as to obtain a plurality of characteristic combinations.
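The grouping procedure above amounts to building nested "top-n" prefixes of the importance ranking. A minimal sketch, assuming the importance values are available as a mapping from feature name to score (the feature names and scores are illustrative only):

```python
def feature_combinations(importance):
    """Return nested feature groups: top-1, top-2, ..., top-all by importance."""
    ranked = sorted(importance, key=importance.get, reverse=True)
    return [ranked[:n] for n in range(1, len(ranked) + 1)]


# Hypothetical importance scores, for illustration only.
scores = {"temperature": 0.31, "irradiance": 0.52, "wind_speed": 0.05, "humidity": 0.12}
combos = feature_combinations(scores)
```

With D features this yields exactly D combinations, so each one can be evaluated by the target prediction model without a combinatorial search.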

In another aspect, an embodiment of the present invention provides a generated power prediction apparatus, including:

the first module is used for acquiring the characteristic data and determining the characteristic importance of each characteristic sample in the characteristic data through random forest and out-of-band data distribution; and determining a training sample based on the feature data; wherein, the characteristic data comprises a plurality of characteristic samples;

the second module is used for setting an original prediction model based on the two-way long-short-term memory neural network and training the original prediction model by using a training sample; optimizing and adjusting the hyper-parameters of the original prediction model by combining a firefly optimization algorithm with a chaos initialization and population advantage replacement strategy to obtain a target prediction model;

the third module is used for grouping all feature samples in the feature data according to the feature importance to obtain a plurality of feature combinations;

a fourth module, configured to input the feature combination into a target prediction model, obtain a predicted value, and determine a target feature combination based on the predicted value;

and a fifth module, configured to determine predicted input data according to the target feature combination, and predict the predicted input data through a target prediction model to obtain a prediction result.

Optionally, the first module is specifically configured to:

Optionally, the second module is specifically configured to:

Optionally, based on the first firefly population, the second module sequentially performs iterative update processing of luciferin, firefly positions and decision radii until a preset condition is reached, so as to obtain a second firefly population, including:

Optionally, performing a population dominance replacement operation on firefly individuals in the second firefly population in the second module, including:

Optionally, the mutation processing is performed on firefly individuals performing the population advantage replacement operation in the second module, including:

Optionally, the third module is specifically configured to:

In another aspect, an embodiment of the present invention provides an electronic device, including: a processor and a memory; the memory is used for storing programs; the processor executes a program to implement the above-described generated power prediction method.

In another aspect, an embodiment of the present invention provides a computer storage medium in which a processor-executable program is stored, which when executed by a processor is configured to implement the above-described generated power prediction method.

In the embodiment of the invention, feature data comprising a plurality of feature samples is first acquired; the feature importance of each feature sample is determined by combining random forest with out-of-band data distribution, and a training sample is determined based on the feature data. An original prediction model is set based on a two-way long-short-term memory neural network and trained using the training sample, and its hyper-parameters are optimized and adjusted by combining a firefly optimization algorithm with a chaos initialization and population advantage replacement strategy to obtain a target prediction model. The feature samples are grouped according to the feature importance to obtain a plurality of feature combinations; each feature combination is input into the target prediction model to obtain a predicted value, and the target feature combination is determined based on the predicted value. Predicted input data is then determined according to the target feature combination and predicted through the target prediction model to obtain a prediction result. In this way, the data collected at the photovoltaic plant is first ranked by feature importance using a random forest algorithm, which preliminarily identifies the factors influencing the photovoltaic power generation power; the chaos initialization and population advantage replacement strategies are then applied to the firefly optimization algorithm; and the target feature combination of the predicted input data is determined, which strengthens the relevance of the reference data to the prediction. The embodiment of the invention can effectively improve the stability and the prediction precision of the prediction model.

Drawings

The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate and do not limit the invention.

FIG. 1 is a schematic view of an implementation environment for generated power prediction according to an embodiment of the present invention;

FIG. 2 is a schematic flow chart of a method for predicting generated power according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of an overall flow of generated power prediction according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of another overall flow of generated power prediction provided by an embodiment of the present invention;

FIG. 5 is a schematic diagram of test set prediction accuracy using BiLSTM for prediction according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of test set prediction accuracy using FA-BiLSTM for prediction according to an embodiment of the present invention;

FIG. 7 is a schematic diagram of test set prediction accuracy using RF-FA-BiLSTM for prediction according to an embodiment of the present invention;

FIG. 8 is a schematic diagram of test set prediction accuracy using FA-BiLSTM for prediction according to an embodiment of the present invention;

FIG. 9 is a schematic diagram of test set prediction accuracy using IFA-BiLSTM for prediction according to an embodiment of the present invention;

FIG. 10 is a schematic diagram comparing the prediction accuracy of various prediction methods according to an embodiment of the present invention;

FIG. 11 is a schematic structural diagram of a prediction apparatus for generating power according to an embodiment of the present invention;

FIG. 12 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.

It should be noted that although functional modules are divided in the system schematic and a logical sequence is shown in the flowchart, in some cases the steps shown or described may be performed with a different module division than in the system, or in a different order than in the flowchart. The terms first/S100, second/S200, and the like in the description, in the claims and in the above-described figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the invention. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.

It can be understood that the method for predicting the generated power provided by the embodiment of the invention can be applied to any computer equipment with data processing and calculating capabilities, and the computer equipment can be a terminal or a server. When the computer device in the embodiment is a server, the server is an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms. Alternatively, the terminal is a smart phone, a tablet computer, a notebook computer, a desktop computer, or the like, but is not limited thereto.

FIG. 1 is a schematic view of an implementation environment according to an embodiment of the invention. Referring to FIG. 1, the implementation environment includes at least one terminal 102 and a server 101. The terminal 102 and the server 101 can be connected through a network in a wireless or wired mode to complete data transmission and exchange.

The server 101 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms.

In addition, server 101 may also be a node server in a blockchain network. The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like.

The terminal 102 may be, but is not limited to, a smart phone, tablet, notebook, desktop, smart box, smart watch, etc. The terminal 102 and the server 101 may be directly or indirectly connected through wired or wireless communication, which is not limited in this embodiment of the present invention.

The embodiment of the present invention provides a method for predicting generated power based on the implementation environment shown in FIG. 1. The following description takes the case in which the method is applied to the server 101 as an example; it will be understood that the method may also be applied to the terminal 102.

Referring to fig. 2, fig. 2 is a flowchart of a power generation power prediction method applied to a server according to an embodiment of the present invention, and an execution subject of the power generation power prediction method may be any one of the foregoing computer devices. Referring to fig. 2, the method includes the steps of:

S100, acquiring feature data, and determining feature importance of each feature sample in the feature data through random forest and out-of-band data distribution; and determining a training sample based on the feature data;

wherein, the characteristic data comprises a plurality of characteristic samples; taking a photovoltaic power generation power prediction scenario as an example, the feature data includes a plurality of feature samples affecting the photovoltaic power generation power.

It should be noted that, in some embodiments, determining the feature importance of each feature sample in the feature data by combining random forest with out-of-band data distribution may include: based on each characteristic sample in the characteristic data, establishing a plurality of first decision trees through a random forest; self-help resampling is carried out on the first decision tree, a training set and an out-of-band data set are generated, and a second decision tree is constructed; based on the ordering of the second decision tree to the out-of-band data set, statistically obtaining first sample data; disturbance of each characteristic sample in the out-of-band data set is carried out, and an out-of-band data sample set corresponding to each characteristic sample is obtained; performing type estimation on the out-of-band data sample set by using a second decision tree to obtain second sample data; and carrying out accumulation summation on the difference values of the first sample data and the second sample data, and determining the feature importance of each feature sample according to the average value of the accumulation summation.

In some embodiments, feature importance analysis based on random forest method (RF) may be achieved by:

Random forest is an ensemble classifier employing Bagging that classifies samples by combining multiple decision trees. This approach effectively improves classification capability because the class of a sample is determined by the mode of the class labels output by the decision trees.

In RF, the training set of each decision tree is generated by bootstrap resampling, i.e., by randomly drawing N samples with replacement from the N samples of the original data set. Some samples may be drawn multiple times, while others may not be drawn at all. Statistically, the training set of each decision tree contains about 2/3 of the original data, and the remaining 1/3 is used to compute feature importance. When a decision tree is built, d features are randomly extracted from the full set of features, the feature with the strongest classification ability is selected as the splitting attribute according to the Gini gain optimization principle, and the samples are divided into new child nodes according to that attribute. The Gini value is a common measure of the purity of a data set D, and is calculated as follows:

$$\mathrm{Gini}(D) = 1 - \sum_{k=1}^{|y|} p_k^2$$

where $p_k$ denotes the proportion of the k-th class in the data set and $|y|$ denotes the number of classes. Gini(D) reflects the probability that two samples drawn at random from the data set D belong to different classes, so the smaller Gini(D) is, the higher the purity of the data set D. When the data set D is split according to the feature a, the Gini gain can be calculated by the following formula:

$$\mathrm{Gain}(D, a) = \mathrm{Gini}(D) - \sum_{v=1}^{V} \frac{|D^v|}{|D|}\,\mathrm{Gini}(D^v)$$

In the formula, V denotes the number of distinct values of a, and $|D^v|$ denotes the number of samples taking the v-th value. The Gini gain optimization principle selects, at each node, the candidate feature with the largest Gini gain as the splitting attribute; the larger the Gini gain, the greater the improvement in the purity of the child node data sets, indicating that the feature has the best classification characteristics. However, because of the double random mechanism of random forest, counting how often a feature appears in the decision trees is not enough to measure its importance. This embodiment therefore uses the OOB data classification accuracy, which can evaluate the importance of a feature effectively and more accurately.
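The Gini computations can be sketched in a few lines. This is a minimal illustration, assuming the standard definitions Gini(D) = 1 − Σ p_k² and gain = Gini(D) minus the size-weighted Gini of the child subsets for a discrete-valued feature; the function names are illustrative and not taken from the source.

```python
from collections import Counter


def gini(labels):
    """Gini value of a label list: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_gain(feature_values, labels):
    """Purity improvement from splitting on a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    weighted = sum(len(g) / n * gini(g) for g in groups.values())
    return gini(labels) - weighted


labels = ["a", "a", "b", "b"]
perfect_split = gini_gain([0, 0, 1, 1], labels)   # feature separates the classes
useless_split = gini_gain([0, 1, 0, 1], labels)   # feature carries no class information
```

A feature that separates the classes completely recovers the whole parent impurity as gain, while an uninformative feature yields a gain of zero.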

If there are K decision trees in the RF (corresponding to the first decision trees), the importance of feature a can be determined according to the following steps:

1) For k = 1, bootstrap resampling is used to generate a training set and an out-of-band (OOB) data set, and a decision tree $T_k$ (i.e., a second decision tree) is constructed on the training set;

2) The OOB data are classified by $T_k$, and the number of correctly classified samples is counted and recorded as $R_k$ (i.e., first sample data);

3) The values of feature a in the OOB data are perturbed to create a new OOB sample set, $T_k$ is used to perform type estimation on it, and the number of correctly classified samples is recorded as $R'_k$ (i.e., second sample data);

4) For k = 2, 3, …, K, steps 1) to 3) are repeated;

5) The importance of feature a can be calculated from the following equation:

$$\mathrm{IMP}(a) = \frac{1}{K}\sum_{k=1}^{K}\left(R_k - R'_k\right)$$

When the value of feature a is perturbed and the classification accuracy changes little before and after the perturbation, feature a plays a small role in classification, and the value of $R_k - R'_k$ is small. Therefore, the larger the value of IMP(a), the more important feature a is for classification.
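The five steps can be sketched end to end in plain Python. This is a simplified illustration under stated assumptions: each "tree" is a one-split stump rather than a full decision tree, the perturbation is a random shuffle of the feature column within the OOB set, and the data set is synthetic (feature 0 determines the label, feature 1 is noise). None of these choices come from the source.

```python
import random


def fit_stump(X, y):
    """Fit a one-split classifier (a stand-in for a full decision tree)."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for sign in (1, -1):
                pred = [int(sign * (row[f] - thr) > 0) for row in X]
                acc = sum(p == t for p, t in zip(pred, y))
                if best is None or acc > best[0]:
                    best = (acc, f, thr, sign)
    _, f, thr, sign = best
    return lambda row: int(sign * (row[f] - thr) > 0)


def count_correct(tree, X, y):
    return sum(tree(row) == t for row, t in zip(X, y))


def oob_importance(X, y, n_trees=15, seed=0):
    """IMP(a) = mean over trees of (R_k - R'_k), counted on OOB samples."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    totals, used = [0.0] * d, 0
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]        # step 1: bootstrap sample
        in_bag = set(idx)
        oob = [i for i in range(n) if i not in in_bag]
        if not oob:
            continue
        tree = fit_stump([X[i] for i in idx], [y[i] for i in idx])
        X_oob, y_oob = [X[i] for i in oob], [y[i] for i in oob]
        r_k = count_correct(tree, X_oob, y_oob)            # step 2: R_k
        for a in range(d):                                 # step 3: perturb feature a
            col = [row[a] for row in X_oob]
            rng.shuffle(col)
            X_pert = [row[:a] + [v] + row[a + 1:] for row, v in zip(X_oob, col)]
            totals[a] += r_k - count_correct(tree, X_pert, y_oob)
        used += 1
    return [t / max(used, 1) for t in totals]              # step 5: average


# Synthetic data: the label depends only on feature 0.
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(120)]
y = [int(row[0] > 0.5) for row in X]
imp = oob_importance(X, y)
```

Shuffling the informative column destroys the link between feature and label on the OOB samples, so its correct-classification count drops sharply, whereas shuffling the noise column leaves the count essentially unchanged.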

S200, setting an original prediction model based on a two-way long-short-term memory neural network, and training the original prediction model by using a training sample; optimizing and adjusting the hyper-parameters of the original prediction model by combining a firefly optimization algorithm with a chaos initialization and population advantage replacement strategy to obtain a target prediction model;

it should be noted that, in some embodiments, the optimization adjustment of the hyper-parameters of the original prediction model by combining the firefly optimization algorithm with the chaotic initialization and the population advantage replacement strategy may include: generating a firefly position through a chaos initialization sequence to obtain a first firefly population; wherein, firefly position represents each corresponding super parameter in the original prediction model; sequentially performing iterative updating processing on luciferin, firefly positions and decision radii based on the first firefly population until a preset condition is reached, so as to obtain a second firefly population; performing population dominance replacement operation on firefly individuals in the second firefly population, and performing mutation treatment on the firefly individuals subjected to the population dominance replacement operation to obtain a target firefly population; and determining the super parameters after the optimization and adjustment of the original prediction model based on the target firefly population.
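Chaotic initialization can be sketched as follows. The source does not say which chaotic sequence is used, so this example assumes the logistic map z ← μ·z·(1 − z) with μ = 4; the hyper-parameter names and bounds are hypothetical.

```python
def chaotic_population(n_fireflies, bounds, z0=0.37, mu=4.0):
    """Generate firefly positions from a logistic-map sequence (an assumption).

    Each position is one candidate hyper-parameter vector; `bounds` holds a
    (low, high) pair per hyper-parameter.
    """
    z = z0
    population = []
    for _ in range(n_fireflies):
        position = []
        for low, high in bounds:
            z = mu * z * (1.0 - z)                  # logistic map iteration, stays in [0, 1]
            position.append(low + z * (high - low))  # scale into the search range
        population.append(position)
    return population


# Hypothetical search ranges: hidden units, learning rate, dropout rate.
bounds = [(16, 256), (1e-4, 1e-2), (0.0, 0.5)]
population = chaotic_population(10, bounds)
```

Compared with uniform random sampling, a chaotic sequence is deterministic for a given seed value `z0` while still covering the search range irregularly, which is the usual motivation for this kind of initialization.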

In some embodiments, based on the first firefly population, performing iterative update processing of luciferin, firefly position and decision radius in sequence until a preset condition is reached, to obtain a second firefly population, including: acquiring fitness function values of each firefly individual in the first firefly population; the fitness function value characterizes the root mean square error of the model predicted value and the actual value; the fitness function value is inversely related to the prediction accuracy; and sequentially carrying out iterative updating processing on luciferin, firefly positions and decision radii based on the fitness function value until the objective function value and the objective firefly position are unchanged within the preset iteration times, so as to obtain a second firefly population.
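The fitness function and the stopping rule can be sketched as below. The RMSE definition is standard; the convergence check is a simplified assumption that tracks only the best fitness value over a window of iterations (the source also requires the best firefly position to be unchanged), and `patience` and `tol` are hypothetical parameter names.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between predictions and actual values."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def has_converged(best_history, patience=5, tol=1e-8):
    """True when the best fitness is unchanged over the last `patience` iterations."""
    if len(best_history) <= patience:
        return False
    recent = best_history[-(patience + 1):]
    return max(recent) - min(recent) <= tol


error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
stalled = has_converged([3.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])
too_early = has_converged([3.0, 2.0, 1.0])
```

Because a lower RMSE means a more accurate model, the optimizer minimizes this fitness value.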

In some embodiments, performing a population dominance replacement operation on firefly individuals in the second firefly population comprises: acquiring fitness function values of each firefly individual in the second firefly population; the fitness function value characterizes the root mean square error of the model predicted value and the actual value; the fitness function value is inversely related to the prediction accuracy; sorting the firefly individuals of the second firefly population based on the fitness function value; and replacing firefly individuals in the front and rear parts after sequencing based on the prediction proportion.

In some embodiments, mutating firefly individuals for a population dominance replacement operation comprises: and carrying out mutation treatment on firefly individuals subjected to population advantage replacement operation based on the normally distributed random vectors.

In some embodiments, to address the poor global search capability of the firefly optimization algorithm (FA) and its tendency to fall into local optima, the firefly optimization algorithm is improved with a chaos initialization and population advantage replacement strategy, so that the model hyper-parameters can be optimized and adjusted. The implementation steps are as follows:

In the deep learning process, many parameters of the model, such as the number of neurons, the learning rate and the convolution kernel size, are difficult to determine accurately; such parameters are called hyper-parameters. Finding good hyper-parameters for a model is often difficult, and constructing an excellent algorithm requires a large number of parameter adjustment tests, which costs a large amount of time and effort. It is therefore necessary to optimize the hyper-parameters of the deep learning model with an efficient optimization algorithm.

The purpose of deep learning hyper-parameter optimization is to find a suitable combination of hyper-parameters that makes the network model achieve the best effect on the validation set. The hyper-parameter optimization can be solved by considering the following formula:

The specific model operation process is as follows:

In the improved firefly algorithm (IFA), N fireflies are assumed to be dispersed as widely as possible in the solution space. The three main parameters of firefly $i$ at time $t$ are its position $x_i(t)$ in the search space, its luciferin level $l_i(t)$, and its decision radius; $N_i(t)$ is the current firefly position. The IFA algorithm is mainly divided into four parts: chaotic initialization of the firefly positions, updating the luciferin, updating the firefly positions, and updating the decision radius. The specific model of the algorithm is as follows:

Chaotic initialization of the firefly positions:

$$q_{n+1,d} = 1 - 2\,q_{n,d}^{2}, \quad q_{n,d} \in (-1, 1), \quad n = 0, 1, \ldots, N \tag{4}$$

where $q_n$ is the chaotic initialization sequence and $d$ is the dimension in the search space.
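The chaotic initialization can be illustrated with a minimal sketch. It reads equation (4) as the map q_{n+1} = 1 - 2 q_n^2 on (-1, 1); the function names, the per-dimension random starting value, the linear mapping of each chaotic value onto a hyper-parameter range, and the example ranges are all assumptions made for illustration and are not stated in the text.

```python
import random


def chaotic_sequence(q0, length):
    """Iterate q_{n+1} = 1 - 2 * q_n**2 (equation 4), starting from q0 in (-1, 1)."""
    seq = []
    q = q0
    for _ in range(length):
        q = 1.0 - 2.0 * q * q
        seq.append(q)
    return seq


def chaotic_init(n_fireflies, bounds, seed=0):
    """Build n_fireflies positions; bounds holds one (low, high) pair per hyper-parameter."""
    rng = random.Random(seed)
    population = [[] for _ in range(n_fireflies)]
    for low, high in bounds:
        # Assumption: one chaotic sequence per dimension, with a random start.
        q0 = rng.uniform(-0.99, 0.99)
        for i, q in enumerate(chaotic_sequence(q0, n_fireflies)):
            # Assumption: linear map from [-1, 1] onto [low, high].
            population[i].append(low + (q + 1.0) / 2.0 * (high - low))
    return population


# Hypothetical search ranges: learning rate and number of hidden units.
first_population = chaotic_init(5, [(1e-4, 1e-1), (16, 256)], seed=1)
```

Each inner list is one firefly position, that is, one candidate set of hyper-parameters; integer-valued hyper-parameters would still need rounding before use.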

Updating the luciferin value:

$$l_i(t) = (1-\rho)\,l_i(t-1) + \gamma f(x_i(t)) \tag{5}$$

where $\rho \in [0,1]$ is the luciferin decay constant, $\gamma \in [0,1]$ is the luciferin update constant, and $f(x_i(t))$ is the fitness function value. The movement probability is calculated as:

The firefly position is calculated as:

Decision radius:

where $r_s$ is the decision radius threshold, $\beta$ is a tuning constant, and $n_t$ is a threshold controlling the number of firefly neighbors.
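The update rules can be sketched as follows. Only the luciferin update follows equation (5) directly. The formulas for the movement probability and the decision radius are not reproduced in the text, so the sketch substitutes the rules of standard glowworm swarm optimization as an assumption; they may differ from the patent's own equations. All function names and default constants are hypothetical, and the fitness value is passed through unchanged because the text does not state how a lower-is-better error is mapped to luciferin.

```python
def update_luciferin(l_prev, fitness, rho=0.4, gamma=0.6):
    """Equation (5): l_i(t) = (1 - rho) * l_i(t-1) + gamma * f(x_i(t))."""
    return (1.0 - rho) * l_prev + gamma * fitness


def movement_probabilities(l_i, brighter_neighbors):
    """Assumed standard rule: p_ij is proportional to (l_j - l_i).

    brighter_neighbors lists the luciferin levels of the neighbors inside
    the decision radius whose luciferin exceeds l_i.
    """
    gaps = [l_j - l_i for l_j in brighter_neighbors]
    total = sum(gaps)
    return [g / total for g in gaps] if total > 0 else []


def update_decision_radius(r_d, n_neighbors, r_s, beta=0.08, n_t=5):
    """Assumed standard rule: r_d <- min(r_s, max(0, r_d + beta * (n_t - |N_i|)))."""
    return min(r_s, max(0.0, r_d + beta * (n_t - n_neighbors)))
```

Under the assumed radius rule, a firefly with more than `n_t` neighbors shrinks its decision radius and one with fewer neighbors enlarges it, with `r_s` as the upper limit.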

The fitness function values of the moved firefly population are recalculated, and the optimal objective function value $N_{best}(t)$ and the optimal firefly position $x_{best}(t)$ of the population are recorded. If the position remains unchanged over five iterations, the population dominance replacement operation is performed: the current firefly individuals are sorted by fitness function value, the first 15% of firefly individuals in the ranking are replaced with the last 15%, and the updated firefly individuals are then mutated using a normally distributed random vector:

$$x_i(t) = x_i(t)\,\bigl(1 + N(0,1)\bigr) \tag{9}$$

where $N(0,1)$ is a random vector obeying a normal distribution with expectation 0 and variance 1.
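A minimal sketch of the replacement and mutation step is given below. It assumes that the fitness is an error where lower is better, that the worst-ranked 15% of individuals are overwritten by copies of the best-ranked 15%, and that only the overwritten individuals are mutated with equation (9). The text does not fix these details, and the function name and signature are hypothetical.

```python
import random


def dominance_replace_and_mutate(population, fitness, ratio=0.15, seed=0):
    """population: list of position lists; fitness: error per individual (lower is better)."""
    rng = random.Random(seed)
    n = len(population)
    k = max(1, int(n * ratio))
    # Rank individuals from best (lowest error) to worst.
    order = sorted(range(n), key=lambda i: fitness[i])
    new_pop = [list(p) for p in population]
    replaced = []
    # Assumption: copies of the best k individuals overwrite the worst k.
    for best_idx, worst_idx in zip(order[:k], order[n - k:]):
        new_pop[worst_idx] = list(population[best_idx])
        replaced.append(worst_idx)
    for idx in replaced:
        # Equation (9): x_i(t) = x_i(t) * (1 + N(0, 1)), drawn per dimension.
        new_pop[idx] = [x * (1.0 + rng.gauss(0.0, 1.0)) for x in new_pop[idx]]
    return new_pop, replaced
```

The mutation keeps the copied individuals from being exact duplicates of the best ones, which is what lets the population move away from a stalled optimum.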

In the IFA algorithm, the hyper-parameters of the Bilstm (two-way long-short-term memory neural network) model correspond to the positions of the members of the firefly population, and the fitness function is defined as the root mean square error between the prediction results of the Bilstm model and the actual values on the test set, so the smaller the fitness function value, the higher the model prediction accuracy. The Bilstm model hyper-parameters giving the best prediction accuracy are thereby obtained.

It should also be noted that, the embodiment of the invention uses the Bilstm to predict the photovoltaic power generation power, and the implementation steps are as follows:

The lstm network is an improved recurrent neural network that adds memory units to a conventional RNN and introduces $C_t$ and $h_t$, thereby realizing the output of linear and nonlinear information and improving the speed and accuracy of the system.

In the lstm architecture, the three distinctive structures of the forget gate, the input gate and the output gate give it the ability to handle dependencies within both short and long time sequences. The forget gate discards data information in the original state, and the input gate stores data information from the external state so as to update the state of the unit; finally, the output gate gathers all results and computes the output of the lstm structure.

According to equation 10, the forget gate takes $x_t$ and $h_{t-1}$ as its input.

Equations 11, 12 and 13 are the input gate calculations, which add these states to the vector generated by the tanh function to complete the computation of the input gate.

The above formula combines the current memory and the long-term memory $C_{t-1}$ to form a new cell state $C_t$.

Finally, the output gate generates the final output $h_t$. The whole process is divided into two stages, shown by equations 14 and 15.

Here tanh denotes the hyperbolic tangent activation function and sigmoid denotes the sigmoid activation function; the weight coefficient matrices belong to the forget gate, the input gate, the output gate and the memory cell, and $b_f$, $b_t$, $b_o$, $b_c$ denote the corresponding bias terms of these gates. The Bilstm neural network structure combines a bidirectional recurrent neural network with the standard lstm neural network: it contains lstm networks in 2 training directions, so that the nodes of each layer hold complete past information and future information, and the 2 training directions are finally connected to the output layer.
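The gate computations and the two training directions can be illustrated with a small pure-Python sketch. Equations 10 to 15 are not reproduced in the text, so the sketch uses the standard LSTM cell formulation as an assumption; the parameter layout (a dictionary keyed by `"f"`, `"i"`, `"c"`, `"o"`) and all function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Return W @ v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + b_k for row, b_k in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One step of a standard LSTM cell; params maps gate name -> (W, b)."""
    z = h_prev + x_t  # concatenation [h_{t-1}, x_t]

    def gate(name, act):
        W, b = params[name]
        return [act(a) for a in affine(W, z, b)]

    f = gate("f", sigmoid)        # forget gate
    i = gate("i", sigmoid)        # input gate
    c_hat = gate("c", math.tanh)  # candidate memory from the tanh function
    o = gate("o", sigmoid)        # output gate
    c_t = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, c_hat)]
    h_t = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c_t)]
    return h_t, c_t


def run_lstm(xs, params, hidden):
    """Run the cell over a sequence and collect the hidden state at every step."""
    h, c = [0.0] * hidden, [0.0] * hidden
    outputs = []
    for x_t in xs:
        h, c = lstm_step(x_t, h, c, params)
        outputs.append(h)
    return outputs


def bilstm(xs, fwd_params, bwd_params, hidden):
    """Concatenate forward and backward hidden states at each time step."""
    fwd = run_lstm(xs, fwd_params, hidden)
    bwd = run_lstm(xs[::-1], bwd_params, hidden)[::-1]
    return [f + b for f, b in zip(fwd, bwd)]
```

Each element returned by `bilstm` holds the forward state, which summarizes the past, followed by the backward state, which summarizes the future; an output layer would then map this concatenation to the predicted power.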

S300, grouping each feature sample in the feature data according to the feature importance to obtain a plurality of feature combinations;

It should be noted that, in some embodiments, step S300 may include: sorting the feature samples in the feature data from high to low according to the value of the feature importance; taking 1 as the number of combined samples and extracting that number of feature samples from the front of the ranking to obtain a feature combination; and increasing the number of combined samples by 1 and returning to the extraction step, until the number of combined samples reaches the total number of feature samples, so as to obtain a plurality of feature combinations.
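The grouping in step S300 amounts to building nested "top k" subsets of the ranked features. A minimal sketch follows; the function name and the importance scores in the example are hypothetical and are not taken from the patent.

```python
def build_feature_combinations(importance):
    """importance: dict mapping feature name -> importance score."""
    # Rank features from the most to the least important.
    ranked = sorted(importance, key=importance.get, reverse=True)
    # Combination k holds the k highest-ranked features, for k = 1..len(ranked).
    return [ranked[:k] for k in range(1, len(ranked) + 1)]


# Hypothetical scores, for illustration only.
scores = {"GHI": 0.5, "air temperature": 0.3, "relative humidity": 0.2}
combinations = build_feature_combinations(scores)
```

With eleven input features this yields eleven combinations, from the single most important feature up to the full feature set.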

In some embodiments, after the eleven input parameters of air temperature, azimuth angle, cloud opacity, dew point temperature, DHI (solar scattering radiation index), DNI (solar direct radiation index), GHI (total solar horizontal radiation), GTI (fixed tilt angle radiation), GTI (tracking tilt angle radiation), atmospheric precipitation and relative humidity are input to the random forest feature importance screening module, feature combinations ranging from the first 1 feature to the first 11 features, in order of importance, are input into the prediction model in turn.

S400, inputting the feature combination into a target prediction model to obtain a predicted value, and determining the target feature combination based on the predicted value;

In some embodiments, the input features for photovoltaic power generation prediction are ranked by importance through a random forest (RF), so that the input feature combination with the higher regression prediction correlation is selected and input into the prediction model. For example, the input features comprise 11 different parameters; after the features are ranked by the random forest, the first n features (n = 1, 2, ..., 11) are selected in turn for power generation prediction and testing, and the input parameter combination with the highest measured accuracy is then selected to construct the prediction model, thereby obtaining a prediction model with optimal parameters. The feature dimension group with the highest prediction accuracy is used as the number of features input to the model; in the end, the method selects the eight features ranked highest in importance: air temperature, azimuth angle, cloud opacity, dew point temperature, DHI (solar scattered radiation index), DNI (solar direct radiation index), GHI (total solar horizontal radiation), and relative humidity.
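Selecting the target feature combination in step S400 can be sketched as a search over the candidate combinations for the lowest test error. In the sketch below, `evaluate` is a caller-supplied placeholder standing in for "run the target prediction model on this combination and return its error (for example the RMSE)"; it and the function name are assumptions, not part of the patent.

```python
def select_target_combination(combinations, evaluate):
    """Return (best_combination, best_error) over the candidate combinations.

    evaluate(combo) is assumed to return the prediction error of the target
    model when it is fed the features in combo; lower is better.
    """
    best_combo, best_err = None, float("inf")
    for combo in combinations:
        err = evaluate(combo)
        if err < best_err:
            best_combo, best_err = combo, err
    return best_combo, best_err
```

Ties are resolved in favor of the earlier, smaller combination because a later combination replaces the current best only when its error is strictly lower.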

S500, determining prediction input data according to the target feature combination, and predicting through a target prediction model to obtain a prediction result.

Specifically, the target feature combination may be the eight features determined in step S400 to have an important influence on the photovoltaic power generation power, including air temperature, azimuth angle, cloud opacity, dew point temperature, DHI (solar scattering radiation index), DNI (solar direct radiation index), GHI (total solar horizontal radiation), and relative humidity; historical data of these features is then acquired as prediction input data and input into the target prediction model for prediction analysis to obtain the final prediction result.

In order to fully explain the technical principle of the service data prediction according to the embodiment of the invention, the technical scheme of the invention is supplemented by combining some specific embodiments. It is to be understood that the following is illustrative of the principles of the present invention and is not in limitation thereof.

As shown in FIG. 3, the method of the invention predicts the photovoltaic power generation power with an RF-IFA-Bilstm model (obtained by combining a random forest and an improved firefly optimization algorithm with a two-way long-short-term memory neural network). Weather data affecting the photovoltaic power generation power is first obtained as original data and preprocessed (for example, missing data completion and abnormal data elimination). The weather data, ranked and grouped by random forest (RF) feature importance, is then taken as the input quantity, and the first n input features in ranked order are input into the Bilstm prediction model for training. The hyper-parameters are optimized by the improved firefly algorithm, which specifically comprises generating firefly positions from a chaotic initialization sequence, determining the attractiveness and relative brightness among different fireflies, iteratively updating the firefly positions, and performing population advantage replacement until the target requirement is met, so that an optimal parameter combination is obtained. The optimized Bilstm prediction model links the weather data to the photovoltaic power generation power and thereby performs regression prediction; inputs under different conditions correspond to different sets of photovoltaic power generation power prediction results, and the model has a good prediction effect. Finally, the RF-IFA-Bilstm model is used for photovoltaic generation power prediction.

In some embodiments, the main flow of the generated power prediction is shown in FIG. 4: first, the photovoltaic power generation power and related external information data are acquired; the data is then preprocessed (for example, missing data completion and abnormal data rejection); the data is then passed through the random forest feature importance analysis module (random forest (RF) feature importance ranking and grouping) to obtain a training set and a test set of photovoltaic power generation power data; the training set is used to train the IFA-Bilstm model, and the test set is used to verify the prediction accuracy of the trained IFA-Bilstm model; finally, the trained model is used for photovoltaic power generation power prediction to obtain a prediction result.

A test was carried out using the photovoltaic power generation power data of a photovoltaic electric field in the northwest. The test-related data results are as follows:

To evaluate the prediction accuracy accurately, the root mean square error (RMSE), the mean bias error (MBE) and the mean absolute error (MAE) are used as evaluation indexes, so that the effectiveness and the application value of the prediction model can be evaluated.

The root mean square error (Root Mean Square Error, RMSE) is the square root of the ratio of the sum of squared deviations between the observed values and the true values to the number of observations N. The root mean square error reflects the degree of deviation of the measured data from the true values; the smaller the root mean square error, the higher the measurement accuracy.

The mean absolute error (Mean Absolute Error, MAE), also known as L1 loss, is one of the simplest loss functions and is also an easy-to-understand evaluation index. It is calculated by taking the absolute difference between the predicted value and the actual value and averaging over the whole dataset. Mathematically, it is the arithmetic mean of the absolute errors. MAE measures only the magnitude of the errors, not their direction. The lower the MAE, the higher the accuracy of the model. Its advantages are as follows: since absolute values are used, all errors are weighted in the same proportion; if the training data has outliers, the MAE does not penalize high errors caused by outliers; and it provides an average measure of model performance.

The mean bias error (Mean Bias Error, MBE) describes the tendency of the measurement process to overestimate or underestimate the parameter value. The deviation is in one direction only and can be positive or negative. A positive deviation means that the data is overestimated, and a negative deviation means that it is underestimated. The mean bias error is the average of the differences between the predicted values and the actual values. This evaluation index quantifies the overall bias and captures the average bias in the predictions. It is almost the same as MAE, the only difference being that no absolute value is taken. This evaluation index should be handled carefully, since positive and negative errors can cancel each other out. Its advantage is that MBE is a good metric for examining the direction of the model (that is, whether there is a positive or negative bias) and for correcting model bias.
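The three evaluation indexes described above can be computed as in the following sketch, which uses the usual definitions with the error taken as predicted value minus actual value. The function names and the sample numbers are illustrative assumptions only.

```python
import math


def rmse(y_pred, y_true):
    """Root mean square error: sqrt(mean((pred - actual)^2))."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(y_pred, y_true)) / len(y_true))


def mae(y_pred, y_true):
    """Mean absolute error: mean(|pred - actual|)."""
    return sum(abs(p - a) for p, a in zip(y_pred, y_true)) / len(y_true)


def mbe(y_pred, y_true):
    """Mean bias error: mean(pred - actual); positive means overestimation."""
    return sum(p - a for p, a in zip(y_pred, y_true)) / len(y_true)


# Illustrative values: the errors are +1, -1 and 0.
predicted = [2.0, 4.0, 6.0]
actual = [1.0, 5.0, 6.0]
```

For these values the MBE is zero even though the MAE and RMSE are not, which shows the cancellation of positive and negative errors mentioned above.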

FIGS. 5 to 10 respectively show comparisons of the photovoltaic power generation power prediction accuracy for the photovoltaic electric field. As can be seen from FIGS. 5 to 10, compared with the Bilstm, FA-Bilstm, RF-FA-Bilstm and IFA-Bilstm methods, the RF-IFA-Bilstm photovoltaic power generation power prediction method of this embodiment yields a small difference between the predicted and actual photovoltaic power values, which meets practical application requirements and shows stronger universality and accuracy. In this embodiment, the test set RMSE of the RF-IFA-Bilstm photovoltaic power generation power prediction method is 0.133%, a higher prediction accuracy than the other methods. In addition, FIGS. 5 to 10 show that the accuracy of the IFA-Bilstm model is somewhat higher than that of the Bilstm and FA-Bilstm models, which illustrates the effectiveness of the IFA optimization algorithm improvement, and that the prediction accuracy of the RF-IFA-Bilstm model is clearly higher still, which illustrates the effectiveness of the RF-IFA-Bilstm model proposed by the technology.

The photovoltaic power generation power prediction system, equipment and computer readable storage medium described in detail above can help a user better understand the power generation efficiency of photovoltaic power generation, so that funds are used more effectively, electric energy is saved, and benefits are further improved. Compared with a single photovoltaic power generation power prediction model, the prediction performance is effectively improved. In selecting the influencing factors, hardware influence and meteorological influence are combined, so that the factors affecting the photovoltaic power generation capacity are selected more accurately and the influence of irrelevant variables on the machine learning model is removed, which lays a reliable basis for improving the prediction accuracy of the model. The photovoltaic power generation power data set comprises the following data: air temperature, azimuth angle, cloud opacity, dew point temperature, DHI (solar scattering radiation index), DNI (solar direct radiation index), GHI (total solar horizontal radiation), GTI (fixed tilt angle radiation), GTI (tracking tilt angle radiation), atmospheric precipitation, and relative humidity. Based on the random forest feature importance ranking and the measured accuracy of the IFA-Bilstm prediction model, the technology determines 8 variables, namely air temperature, azimuth angle, cloud opacity, dew point temperature, DHI (solar scattering radiation index), DNI (direct solar radiation index), GHI (total solar horizontal radiation) and relative humidity, so that the influence of factors such as weather and radiance on the photovoltaic power generation capacity is fully considered and a foundation is provided for constructing an accurate photovoltaic power generation power prediction model. 
In addition, the photovoltaic power generation amount is closely related to the meteorological features, so the invention further examines the effect of these factors on the photovoltaic power generation amount in order to obtain more accurate prediction results. The meteorological features have temporal characteristics that can be mined by deep learning. Therefore, the random forest method (RF) is used to analyze and rank the feature importance of the input data and to remove the data whose features have little influence, which can effectively improve the accuracy of the prediction model; meanwhile, the improved firefly algorithm (IFA) is used to optimize the parameters of the bidirectional long-short-term memory neural network (Bilstm), which further improves the accuracy and reliability of the prediction model. A photovoltaic power generation power prediction model with optimal parameter settings is thereby ensured, and the prediction precision can be remarkably improved over the traditional deep learning method. By making full use of the characteristics of the random forest algorithm, the improved firefly algorithm and the bidirectional long-short-term memory neural network, the problems of low prediction precision and poor generalization performance of a single neural network model are addressed, and the prediction performance is further improved.

In summary, according to the embodiment of the invention, the data acquired in the photovoltaic electric field is ranked by feature importance through the random forest algorithm: the eleven input parameters, namely air temperature, azimuth angle, cloud opacity, dew point temperature, DHI (solar scattering radiation index), DNI (solar direct radiation index), GHI (total solar horizontal radiation), GTI (fixed tilt angle radiation), GTI (tracking tilt angle radiation), atmospheric precipitation and relative humidity, are ranked by importance. A bidirectional long-short-term memory neural network (BiLSTM) is then adopted, through experimental comparison, for predicting the photovoltaic power generation power, and its parameters are iteratively optimized through the improved firefly optimization algorithm, so that the photovoltaic power generation power prediction model with the best effect is obtained. Finally, a short-term photovoltaic power prediction model built on eight input parameters, namely air temperature, azimuth angle, cloud opacity, dew point temperature, DHI (solar scattering radiation index), DNI (solar direct radiation index), GHI (total solar horizontal radiation) and relative humidity, is selected, and the remaining three parameters are removed. 
Secondly, to address the poor global search capability of the firefly optimization algorithm (FA) and its tendency to fall into local optima, the firefly optimization algorithm is improved with a chaos initialization and population advantage replacement strategy: firefly positions are generated from a chaos initialization sequence, which improves the global search capability of the algorithm, and the population advantage replacement strategy improves its parameter optimization capability. The resulting improved firefly optimization algorithm (IFA) efficiently optimizes the parameters of the Bilstm algorithm. The results show that, compared with a single Bilstm, a two-way long-short-term memory neural network optimized by the firefly optimization algorithm (FA-Bilstm), RF-FA-Bilstm and IFA-Bilstm, the RF-IFA-Bilstm photovoltaic power generation prediction model provided by the technology handles complex prediction problems better. This is because the RF-IFA-Bilstm model screens the data features by importance, which helps reduce the influence of irrelevant features; this is particularly important for multivariable optimization problems, effectively reduces the time required to train the model, and reduces the sensor data dimension required for predicting the photovoltaic power generation capacity. The input feature importance analysis of the random forest method (RF) and the optimization of the parameters of the two-way long-short-term memory neural network (BiLSTM) by the improved firefly optimization algorithm (IFA) together improve the stability and prediction accuracy of the proposed photovoltaic power generation power prediction model.

A short-term photovoltaic power generation prediction model is significant for improving the stability of the power grid, improving the capacity of the power grid to absorb photovoltaic power, and reducing the curtailment of light and electricity in photovoltaic power generation systems. Photovoltaic power generation is intermittent, random and volatile, which can make the power grid system unstable; as the proportion of photovoltaic power stations in the power grid system increases, the accuracy of photovoltaic power generation power prediction becomes particularly important. The higher the photovoltaic power prediction accuracy, the smaller the influence of photovoltaic grid connection on the safe operation of the power grid, and an efficient short-term photovoltaic power generation power prediction model can effectively help the power grid dispatching department make scheduling plans for the various power supplies. Meanwhile, because feature importance is screened by the random forest method, the photovoltaic power generation power can be predicted in the short term from sensor data of lower dimensionality, which reduces the number of sensors needed in the photovoltaic electric field to collect the prediction feature data and lowers the construction cost. The embodiment of the invention can rely on government and enterprise power grid projects, and the power utilization rate can be greatly improved by using the related algorithm. 
The embodiment of the invention can be combined with telecommunication equipment such as a base station, and the like, and the accurate photovoltaic power generation power prediction can ensure that the equipment such as the base station, and the like can stably supply power through a distributed photovoltaic system, so that 20% -30% of the consumption electric energy of a public power grid can be expected to be reduced.

The beneficial effects of the implementation of the invention at least comprise:

(1) The importance ranking is carried out on the predicted input features of the photovoltaic power generation through a Random Forest (RF), so that an input feature combination input prediction model with higher regression prediction correlation is selected, then an input parameter combination with highest measurement accuracy is selected for carrying out prediction model construction, and a prediction model with optimal parameters is obtained.

(2) The improved firefly optimization algorithm (IFA) is adopted to carry out super-parameter optimization on the two-way long-short-term memory neural network method (Bilstm), the chaotic initialization method is introduced into the traditional firefly optimization algorithm to obtain the improved firefly optimization algorithm, the algorithm is prevented from falling into local optimum in a population advantage replacement mode, and the algorithm has better global searching capability through comparison, so that the influence of the parameter falling into the local optimum on the effect of the prediction model is prevented.

(3) A random forest (RF) feature importance analysis module is combined with IFA-Bilstm to provide the RF-IFA-Bilstm method. The feature importance of the input data is analyzed through the random forest (RF), and the dimension of the input data is reduced by importance ranking, which improves the prediction accuracy of the model. Meanwhile, the improved firefly optimization algorithm further improves its optimizing capability through the chaos initialization and population advantage replacement strategies and improves the global search over the Bilstm hyper-parameters; the hyper-parameters found by the firefly optimization algorithm are matched with the input data after random forest dimension reduction, so that the short-term prediction accuracy of the photovoltaic power generation power is improved to the greatest extent.

On the other hand, as shown in fig. 11, an embodiment of the present invention provides a generated power prediction apparatus 1100, including: a first module 1110, configured to obtain feature data, and determine feature importance of each feature sample in the feature data by combining random forest with out-of-band data allocation; and determining a training sample based on the feature data; the characteristic data comprises a plurality of characteristic samples influencing the photovoltaic power generation power; a second module 1120, configured to set an original prediction model based on the two-way long-short-term memory neural network, and train the original prediction model with a training sample; optimizing and adjusting the hyper-parameters of the original prediction model by combining a firefly optimization algorithm with a chaos initialization and population advantage replacement strategy to obtain a target prediction model; a third module 1130, configured to group each feature sample in the feature data according to the feature importance, to obtain a plurality of feature combinations; a fourth module 1140, configured to input the feature combination into a target prediction model, obtain a predicted value, and determine a target feature combination based on the predicted value; and a fifth module, configured to determine predicted input data according to the target feature combination, and predict the predicted input data through a target prediction model to obtain a prediction result.

The content of the method embodiment of the invention is suitable for the device embodiment, the specific function of the device embodiment is the same as that of the method embodiment, and the achieved beneficial effects are the same as those of the method.

On the other hand, as shown in fig. 12, an embodiment of the present invention further provides an electronic device 1200, which includes at least one processor 1210, and at least one memory 1220 for storing at least one program; take a processor 1210 and a memory 1220 as examples.

Processor 1210 and memory 1220 may be connected by a bus or other means.

Memory 1220 acts as a non-transitory computer readable storage medium that may be used to store non-transitory software programs as well as non-transitory computer executable programs. In addition, memory 1220 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some implementations, the memory 1220 may optionally include memory located remotely from the processor, which may be connected to the device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The above described embodiments of the electronic device are merely illustrative, wherein the units described as separate components may or may not be physically separate, i.e. may be located in one place, or may be distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

It should be noted that, the computer readable medium shown in the embodiments of the present invention may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-Only Memory (ROM), an erasable programmable read-Only Memory (Erasable Programmable Read Only Memory, EPROM), flash Memory, an optical fiber, a portable compact disc read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.

The content of the method embodiments of the invention applies equally to the system embodiments; the system embodiments implement the same functions as the method embodiments and achieve the same beneficial effects.

Another aspect of the embodiments of the present invention also provides a computer-readable storage medium storing a program that is executed by a processor to implement the method described above.

The content of the method embodiments of the invention applies equally to the computer-readable storage medium embodiments; the computer-readable storage medium embodiments implement the same functions as the method embodiments and achieve the same beneficial effects.

Embodiments of the present invention also disclose a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The computer instructions may be read from a computer-readable storage medium by a processor of a computer device, and executed by the processor, to cause the computer device to perform the foregoing method.

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

It should be noted that although in the above detailed description several modules of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functions of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the invention. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.

From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present invention may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and which includes several instructions to cause a computing device (which may be a personal computer, a server, a touch terminal, a network device, etc.) to perform the method according to the embodiments of the present invention.

In some alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flowcharts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed, and in which sub-operations described as part of a larger operation are performed independently.

Furthermore, while the invention is described in the context of functional modules, it should be appreciated that, unless otherwise indicated, one or more of the functions and/or features may be integrated in a single physical device and/or software module or may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary to an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be apparent to those skilled in the art from consideration of their attributes, functions and internal relationships. Accordingly, one of ordinary skill in the art can implement the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative and are not intended to be limiting upon the scope of the invention, which is to be defined in the appended claims and their full scope of equivalents.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes over the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.

Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, a processor-containing system, or another system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.

More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium may even be paper or other suitable medium upon which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.

It is to be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques, which are well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.

In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.

While the preferred embodiment of the present invention has been described in detail, the present invention is not limited to the embodiments, and those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present invention, and the equivalent modifications or substitutions are intended to be included in the scope of the present invention as defined in the appended claims.

Claims

1. A method for predicting generated power, comprising:

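acquiring feature data, and determining the feature importance of each feature sample in the feature data by combining a random forest with an out-of-bag data distribution; and determining a training sample based on the feature data; wherein the feature data comprises a plurality of feature samples;

setting an original prediction model based on a bidirectional long short-term memory neural network, and training the original prediction model by using the training sample; optimizing and adjusting the hyperparameters of the original prediction model by combining a firefly optimization algorithm with a chaos initialization and population dominance replacement strategy to obtain a target prediction model;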

grouping each feature sample in the feature data according to the feature importance to obtain a plurality of feature combinations;

inputting the feature combination into the target prediction model to obtain a predicted value, and determining a target feature combination based on the predicted value;

and determining predicted input data according to the target feature combination, and predicting through the target prediction model to obtain a prediction result.
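The selection of the target feature combination in claim 1 can be sketched as follows. The claim states only that the target combination is determined "based on the predicted value"; this sketch assumes the combination with the lowest root mean square error against known actual values is chosen. The `Predictor` type, the function `select_target_combination`, and the toy data are hypothetical and stand in for the trained target prediction model.

```python
import math
from typing import Callable, Dict, List, Sequence

# Hypothetical wrapper: maps a feature combination to the model's predicted values.
Predictor = Callable[[List[str]], List[float]]


def select_target_combination(
    combos: Sequence[List[str]],
    predict: Predictor,
    actual: Sequence[float],
) -> List[str]:
    """Return the feature combination whose predictions have the lowest RMSE."""

    def error(combo: List[str]) -> float:
        predicted = predict(combo)
        squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
        return math.sqrt(sum(squared) / len(squared))

    return min(combos, key=error)


# Toy stand-in for the target prediction model (assumed values, for illustration only).
toy_predictions: Dict[int, List[float]] = {
    1: [0.0, 0.0],
    2: [1.0, 2.1],
    3: [1.5, 2.5],
}
toy_combos = [["a"], ["a", "b"], ["a", "b", "c"]]
target = select_target_combination(
    toy_combos, lambda combo: toy_predictions[len(combo)], actual=[1.0, 2.0]
)
```

Passing the model in as a callable keeps the selection logic independent of how the prediction model itself is built.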

2. The method for predicting generated power according to claim 1, wherein determining the feature importance of each feature sample in the feature data by combining a random forest with an out-of-bag data distribution comprises:

statistically obtaining first sample data based on the ordering of the out-of-bag data set by the second decision tree;

perturbing each feature sample in the out-of-bag data set to obtain an out-of-bag data sample set corresponding to each feature sample; performing type estimation on the out-of-bag data sample set by using the second decision tree to obtain second sample data;

and accumulating the differences between the first sample data and the second sample data, and determining the feature importance of each feature sample according to the average value of the accumulated sum.
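The importance computation in claim 2 can be illustrated with a minimal sketch. It assumes each decision tree is available as a plain predict function together with its own out-of-bag rows and labels, and it reads the "first sample data" and "second sample data" as counts of correctly classified out-of-bag samples before and after perturbation. The names `count_correct` and `oob_importance` and the shuffle-based perturbation are illustrative assumptions, not taken from the claims.

```python
import random
from typing import Callable, List, Sequence, Tuple

Row = List[float]
Tree = Callable[[Row], int]  # hypothetical stand-in for a fitted decision tree


def count_correct(tree: Tree, rows: Sequence[Row], labels: Sequence[int]) -> int:
    """Number of rows whose predicted class matches the label."""
    return sum(1 for row, label in zip(rows, labels) if tree(row) == label)


def oob_importance(
    trees_with_oob: Sequence[Tuple[Tree, List[Row], List[int]]],
    n_features: int,
    seed: int = 0,
) -> List[float]:
    """Average drop in correct out-of-bag predictions after perturbing each feature."""
    rng = random.Random(seed)
    totals = [0.0] * n_features
    for tree, rows, labels in trees_with_oob:
        baseline = count_correct(tree, rows, labels)  # "first sample data"
        for j in range(n_features):
            # Perturb feature j by shuffling its column across the out-of-bag rows.
            column = [row[j] for row in rows]
            rng.shuffle(column)
            perturbed = [
                row[:j] + [value] + row[j + 1:] for row, value in zip(rows, column)
            ]
            after = count_correct(tree, perturbed, labels)  # "second sample data"
            totals[j] += baseline - after
    # Average the accumulated differences over the trees.
    return [total / len(trees_with_oob) for total in totals]


# Toy check: a tree that only looks at feature 0, so feature 1 cannot matter.
toy_tree: Tree = lambda row: int(row[0] > 0.5)
toy_rows = [[0.0, 0.3], [1.0, 0.9], [0.0, 0.1], [1.0, 0.4], [0.0, 0.8], [1.0, 0.2]]
toy_labels = [0, 1, 0, 1, 0, 1]
importance = oob_importance([(toy_tree, toy_rows, toy_labels)], n_features=2)
```

A feature that the tree never consults keeps the count unchanged after shuffling, so its importance is zero in this sketch.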

3. The method for predicting generated power according to claim 1, wherein the optimizing and adjusting the hyperparameters of the original prediction model by combining a firefly optimization algorithm with a chaos initialization and population dominance replacement strategy comprises:

generating firefly positions through a chaos initialization sequence to obtain a first firefly population; wherein each firefly position characterizes the corresponding hyperparameters of the original prediction model;

sequentially performing iterative updating processing on luciferin, the firefly position and the decision radius based on the first firefly population until a preset condition is reached, so as to obtain a second firefly population;

performing a population dominance replacement operation on firefly individuals in the second firefly population, and performing mutation processing on the firefly individuals subjected to the population dominance replacement operation to obtain a target firefly population; and determining the optimized and adjusted hyperparameters of the original prediction model based on the target firefly population.
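The chaos initialization step of claim 3 can be sketched as below. The claims do not name the chaotic map, so a logistic map is assumed here. The function `chaotic_population`, the seed value `x0`, and the two hyperparameter ranges (hidden units and learning rate) are hypothetical choices made only for illustration.

```python
from typing import List, Sequence, Tuple


def chaotic_population(
    n_fireflies: int,
    bounds: Sequence[Tuple[float, float]],
    x0: float = 0.37,
    mu: float = 4.0,
) -> List[List[float]]:
    """Generate firefly positions by scaling a logistic-map sequence into the bounds."""
    positions: List[List[float]] = []
    x = x0
    for _ in range(n_fireflies):
        position = []
        for low, high in bounds:
            x = mu * x * (1.0 - x)  # logistic map step; stays in [0, 1] for mu = 4
            position.append(low + x * (high - low))
        positions.append(position)
    return positions


# Hypothetical search ranges: number of hidden units and learning rate.
hyper_bounds = [(16.0, 256.0), (1e-4, 1e-1)]
population = chaotic_population(n_fireflies=5, bounds=hyper_bounds)
```

Each inner list is one firefly position, that is, one candidate set of hyperparameters for the prediction model.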

4. The method for predicting generated power according to claim 3, wherein the step of sequentially performing iterative updating processing on luciferin, the firefly position, and the decision radius based on the first firefly population until a preset condition is reached, so as to obtain a second firefly population, comprises:

acquiring a fitness function value of each firefly individual in the first firefly population; the fitness function value represents the root mean square error between the model predicted value and the actual value; the fitness function value is inversely related to the prediction precision;

and based on the fitness function value, sequentially performing iterative updating processing on the luciferin, the firefly position and the decision radius until the objective function value and the objective firefly position remain unchanged for a preset number of iterations, so as to obtain the second firefly population.
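The fitness function and stopping condition of claim 4 can be sketched as follows. The root mean square error is computed as stated in the claim. The stopping check is a simplification that tracks only the best fitness value per iteration, not the firefly position. The names `rmse`, `has_converged`, and `patience` are assumptions introduced for this sketch.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean square error between predicted and actual values (lower is better)."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("predicted and actual must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def has_converged(best_history: Sequence[float], patience: int, tol: float = 0.0) -> bool:
    """True when the best fitness has not changed over the last `patience` iterations."""
    if len(best_history) <= patience:
        return False
    recent = best_history[-(patience + 1):]
    return max(recent) - min(recent) <= tol
```

Because the fitness is an error measure, a smaller value corresponds to higher prediction precision, which matches the inverse relation stated in the claim.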

5. The method of claim 3, wherein said performing a population dominance replacement operation on firefly individuals in said second firefly population comprises:

acquiring fitness function values of each firefly individual in the second firefly population; the fitness function value represents the root mean square error of the model predicted value and the actual value; the fitness function value is inversely related to the prediction precision;

sorting each of the firefly individuals of the second firefly population based on the fitness function value;

6. The method of claim 3, wherein the performing mutation processing on the firefly individuals subjected to the population dominance replacement operation comprises:

performing mutation processing, based on a normally distributed random vector, on the firefly individuals subjected to the population dominance replacement operation.
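The mutation step of claim 6 can be sketched as below. The claim specifies only that the mutation is based on a normally distributed random vector. The additive form, the scaling by the search range, the clipping to the bounds, and the names `mutate` and `sigma` are assumptions made for illustration.

```python
import random
from typing import List, Optional, Sequence, Tuple


def mutate(
    position: Sequence[float],
    bounds: Sequence[Tuple[float, float]],
    sigma: float = 0.1,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Add a normally distributed random vector to a firefly position, clipped to bounds."""
    rng = rng or random.Random()
    mutated = []
    for value, (low, high) in zip(position, bounds):
        # One standard-normal draw per dimension, scaled by the width of the range.
        step = sigma * (high - low) * rng.gauss(0.0, 1.0)
        mutated.append(min(high, max(low, value + step)))
    return mutated


# Hypothetical position: 64 hidden units, learning rate 0.01.
example_bounds = [(16.0, 256.0), (1e-4, 1e-1)]
example_mutated = mutate([64.0, 0.01], example_bounds, sigma=0.1, rng=random.Random(0))
```

Clipping keeps every mutated hyperparameter inside its search range, so the result remains a valid candidate for the prediction model.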

7. The method for predicting generated power according to claim 1, wherein the grouping each of the feature samples in the feature data according to the feature importance to obtain a number of feature combinations includes:

sorting all the feature samples in the feature data from high to low according to the value of the feature importance;

setting the number of combined samples to 1, and extracting the first feature samples in the sorted sequence, up to the number of combined samples, to obtain a feature combination;

and increasing the number of combined samples by 1, and returning to the step of extracting the first feature samples in the sorted sequence, up to the number of combined samples, until the number of combined samples reaches the total number of feature samples, so as to obtain a plurality of feature combinations.
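The grouping procedure of claim 7 amounts to taking successively longer prefixes of the importance-ranked feature list, which can be sketched as follows. The function name `feature_combinations` and the feature names and scores in the example are hypothetical.

```python
from typing import Dict, List


def feature_combinations(importance: Dict[str, float]) -> List[List[str]]:
    """Return the top-1, top-2, ..., top-N feature lists ranked by importance."""
    ranked = sorted(importance, key=lambda name: importance[name], reverse=True)
    # The number of combined samples runs from 1 up to the total number of features.
    return [ranked[:count] for count in range(1, len(ranked) + 1)]


# Hypothetical importance scores for three features.
scores = {"humidity": 0.2, "irradiance": 0.5, "temperature": 0.3}
combos = feature_combinations(scores)
```

For N features this yields exactly N combinations, so each one can be evaluated by the target prediction model in turn.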

8. A generated power prediction apparatus, comprising:

a first module, configured to acquire feature data and determine the feature importance of each feature sample in the feature data by combining a random forest with an out-of-bag data distribution; and to determine a training sample based on the feature data; wherein the feature data comprises a plurality of feature samples;

a second module, configured to set an original prediction model based on a bidirectional long short-term memory neural network, and train the original prediction model by using the training sample; and to optimize and adjust the hyperparameters of the original prediction model by combining a firefly optimization algorithm with a chaos initialization and population dominance replacement strategy to obtain a target prediction model;

a third module, configured to group each of the feature samples in the feature data according to the feature importance, to obtain a plurality of feature combinations;

a fourth module, configured to input the feature combination into the target prediction model, obtain a predicted value, and determine a target feature combination based on the predicted value;

and a fifth module, configured to determine predicted input data according to the target feature combination, and predict the predicted input data through the target prediction model to obtain a prediction result.

9. An electronic device comprising a processor and a memory;

the memory is used for storing programs;

the processor executes the program to implement the method of any one of claims 1 to 7.

10. A computer storage medium in which a processor-executable program is stored, characterized in that the processor-executable program, when executed by a processor, implements the method according to any one of claims 1 to 7.