CN114970345A - Short-term load prediction model construction method, device, equipment and readable storage medium - Google Patents


Info

Publication number
CN114970345A
CN114970345A (application CN202210583319.6A)
Authority
CN
China
Prior art keywords
model
initial
prediction
prediction model
error
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210583319.6A
Other languages
Chinese (zh)
Inventor
胡志坚
焦龄霄
李天格
刘盛辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN202210583319.6A priority Critical patent/CN114970345A/en
Publication of CN114970345A publication Critical patent/CN114970345A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • G06F30/27Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/20Ensemble learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00Circuit arrangements for ac mains or ac distribution networks
    • H02J3/003Load forecast, e.g. methods or systems for forecasting future load demand
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2203/00Indexing scheme relating to details of circuit arrangements for AC mains or AC distribution networks
    • H02J2203/20Simulating, e.g. planning, reliability check, modelling or computer assisted design [CAD]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • Operations Research (AREA)
  • Water Supply & Treatment (AREA)
  • Development Economics (AREA)
  • Computing Systems (AREA)
  • Primary Health Care (AREA)
  • Mathematical Physics (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • Power Engineering (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application relates to a short-term load prediction model construction method, device, equipment and readable storage medium in the technical field of short-term load prediction. An original feature set comprising features such as calendar rules, weather influences and historical loads is screened on the basis of the mRMR-IPSO feature selection method to obtain an initial feature data set; an initial Stacking ensemble learning model is trained based on the initial feature data set to obtain an initial prediction model; the initial feature data set is input into the initial prediction model to obtain a load prediction error sequence, an error feature set is constructed based on the load prediction error sequence, and the error feature set is screened based on the mRMR-IPSO method to obtain an error feature data set; the initial Stacking ensemble learning model is trained based on the error feature data set to obtain an error prediction model; and a short-term load prediction model is established based on the initial prediction model and the error prediction model, so that the prediction accuracy and generalization performance of the short-term load prediction model are improved.

Description

Short-term load prediction model construction method, device, equipment and readable storage medium
Technical Field
The present application relates to the field of short-term load prediction technologies, and in particular, to a method, an apparatus, a device, and a readable storage medium for constructing a short-term load prediction model.
Background
Short-term load prediction plays an important role in various aspects such as power system planning, power system scheduling and operation, power market transactions, power system energy storage regulation and control, and power system stability analysis. The essence of short-term load prediction is to deeply mine historical data, find the internal association between the load and its various influence factors, and establish a mapping relation so as to achieve accurate prediction. However, the power load is easily affected by various factors such as people's work and rest, living habits and weather conditions, and is complex and variable, so that deep mining of the load characteristics is difficult, which brings inconvenience to load prediction work.
In the related art, a conventional statistical prediction model is generally used for short-term load prediction. Statistical prediction models include the time series model, the regression analysis model, the exponential smoothing model and the like. For example, a time series model builds and estimates a model of the underlying random process according to the process characteristics exhibited by the time series, and then uses the model for prediction; such models can be classified into the autoregressive model, the moving average model and the autoregressive moving average model, which handle stationary time series, and the autoregressive integrated moving average (ARIMA) model, which handles non-stationary time series. Although the traditional statistical prediction model has the advantages of high prediction efficiency and model simplicity, its prediction accuracy for nonlinear sequences is not high and its robustness is poor.
In addition, when current load prediction models are trained, input features are often not selected, so the model training effect is poor and the prediction accuracy suffers. The current load prediction model is also generally a single prediction model, which has weak generalization performance; moreover, with the wide access of new energy and active loads in novel power systems, the randomness and volatility of the short-term load are enhanced, so that short-term load prediction faces greater challenges.
Disclosure of Invention
The application provides a short-term load prediction model construction method, device, equipment and readable storage medium, aiming to solve the problems of poor prediction accuracy and weak generalization performance of the traditional load prediction model in the related art.
In a first aspect, a short-term load prediction model construction method is provided, which includes the following steps:
acquiring an original feature set, wherein the original feature set comprises a calendar rule feature, a weather influence feature and a historical load feature, and screening the original feature set based on an mRMR-IPSO feature selection method to obtain an initial feature data set;
performing optimization training on an initial Stacking ensemble learning model based on the initial feature data set to obtain an initial prediction model;
inputting the initial characteristic data set into the initial prediction model to obtain a load prediction error sequence, constructing an error characteristic set based on the load prediction error sequence, wherein the error characteristic set comprises a calendar rule characteristic, a weather influence characteristic and a historical error characteristic, and screening the error characteristic set based on an mRMR-IPSO characteristic selection method to obtain an error characteristic data set;
performing optimization training on the initial Stacking ensemble learning model based on the error characteristic data set to obtain an error prediction model;
a short-term load prediction model is created based on the initial prediction model and the error prediction model.
In some embodiments, the screening the original feature set based on the mRMR-IPSO feature selection method to obtain an initial feature data set includes:
screening a primary selection feature subset from the original feature set based on correlation characteristics and maximum correlation minimum redundancy characteristics of an mRMR algorithm;
binary coding is carried out on the initially selected feature subset, and an initialization population is generated;
and iteratively updating the initialization population based on an IPSO algorithm to obtain an initial characteristic data set.
In some embodiments, before the step of optimally training the initial Stacking ensemble learning model based on the initial feature data set to obtain the initial prediction model, the method further includes:
respectively training a plurality of independent prediction models in a preset prediction model set based on the initial characteristic data set to obtain a plurality of trained prediction models;
screening at least one first prediction model and one second prediction model from the plurality of trained prediction models based on the correlation between the model prediction precision and the model prediction error;
an initial Stacking ensemble learning model is built based on a first prediction model and a second prediction model, the first prediction model is used as a base model of the initial Stacking ensemble learning model, and the second prediction model is used as a meta model of the initial Stacking ensemble learning model.
In some embodiments, the set of predictive models includes a LightGBM model, an LSTM model, an RF model, an XGBoost model, a GBDT model, a DBN model, an SVR model, and a KNN model.
In some embodiments, the performing optimization training on the initial Stacking ensemble learning model based on the initial feature data set to obtain an initial prediction model includes:
training each base model based on the initial feature data set and K-fold cross validation, and outputting a test result of each base model, wherein K is a positive integer;
merging the test results of all the base models to obtain a test result set;
and training the meta-model based on the test result set to generate an initial prediction model.
In some embodiments, the load prediction error sequence E is calculated by the following formula:

E = L_train − L̂_train

where L_train represents the true load values corresponding to the initial feature data set, and L̂_train represents the load prediction values corresponding to the initial feature data set.
In some embodiments, after the step of creating a short-term load prediction model based on the initial prediction model and the error prediction model, the method further comprises:
and calculating the short-term load prediction model based on the load preliminary prediction value output by the initial prediction model and the load error prediction value output by the error prediction model to obtain a short-term load prediction value.
In a second aspect, a short-term load prediction model building apparatus is provided, including:
the system comprises a first data processing unit, a second data processing unit and a third data processing unit, wherein the first data processing unit is used for acquiring an original feature set, the original feature set comprises a calendar rule feature, a weather influence feature and a historical load feature, and the original feature set is screened based on an mRMR-IPSO feature selection method to obtain an initial feature data set;
the first model training unit is used for carrying out optimization training on an initial Stacking ensemble learning model based on the initial characteristic data set to obtain an initial prediction model;
the second data processing unit is used for inputting the initial characteristic data set into the initial prediction model to obtain a load prediction error sequence, constructing an error characteristic set based on the load prediction error sequence, wherein the error characteristic set comprises a calendar rule characteristic, a weather influence characteristic and a historical error characteristic, and screening the error characteristic set based on an mRMR-IPSO characteristic selection method to obtain an error characteristic data set;
the second model training unit is used for carrying out optimization training on the initial Stacking ensemble learning model based on the error characteristic data set to obtain an error prediction model;
a model creation unit for creating a short-term load prediction model based on the initial prediction model and the error prediction model.
In a third aspect, a short-term load prediction model building device is provided, including a memory and a processor, wherein at least one instruction is stored in the memory, and the instruction is loaded and executed by the processor to implement the aforementioned short-term load prediction model construction method.
In a fourth aspect, a computer-readable storage medium is provided, which stores a computer program that, when executed by a processor, implements the aforementioned short-term load prediction model construction method.
The beneficial effects brought by the technical scheme provided by the application include: the prediction accuracy and generalization performance of the short-term load prediction model are effectively improved.
The application provides a short-term load prediction model construction method, device, equipment and readable storage medium. The method comprises: obtaining an original feature set, wherein the original feature set comprises calendar rule features, weather influence features and historical load features, and screening the original feature set based on the mRMR-IPSO feature selection method to obtain an initial feature data set; performing optimization training on an initial Stacking ensemble learning model based on the initial feature data set to obtain an initial prediction model; inputting the initial feature data set into the initial prediction model to obtain a load prediction error sequence, constructing an error feature set based on the load prediction error sequence, wherein the error feature set comprises calendar rule features, weather influence features and historical error features, and screening the error feature set based on the mRMR-IPSO feature selection method to obtain an error feature data set; performing optimization training on the initial Stacking ensemble learning model based on the error feature data set to obtain an error prediction model; and creating a short-term load prediction model based on the initial prediction model and the error prediction model.
According to the method, aiming at the characteristic that the short-term load is influenced by numerous factors and varies in time and space, load influence factors are fully considered to construct a complete original feature set, and the mRMR-IPSO feature selection method is then adopted to screen the original feature set so as to improve the quality of the model training data. Compared with a single prediction model, the method can fuse multiple single models and exert the advantages of each, giving stronger generalization performance. In addition, an error correction link is introduced on the basis of the preliminary load prediction, i.e. the preliminary load prediction is corrected based on the error prediction model, so that potential information in the errors can be mined and the model prediction error can be further reduced.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic flowchart of a short-term load prediction model construction method according to an embodiment of the present disclosure;
FIG. 2 is a flowchart of short-term load prediction based on Stacking ensemble learning and error correction provided in an embodiment of the present application;
fig. 3 is a flowchart of mRMR feature selection provided in an embodiment of the present application;
FIG. 4 is a flowchart of IPSO feature selection provided by an embodiment of the present application;
fig. 5 is a flowchart of mRMR-IPSO feature selection provided in an embodiment of the present application;
FIG. 6 is a diagram of a Stacking ensemble learning framework provided in an embodiment of the present application;
FIG. 7 is a flowchart of a specific training process of Stacking ensemble learning provided in an embodiment of the present application;
FIG. 8 is a waveform diagram of experimental data provided by an embodiment of the present application;
FIG. 9 is a comparative prediction error analysis graph for each of the independent prediction models provided in the embodiments of the present application;
FIG. 10 is a graph illustrating a correlation analysis of model prediction errors for each of the independent prediction models provided in an embodiment of the present application;
FIG. 11 is a comparison diagram of the prediction result analysis of different Stacking ensemble learning models provided in the embodiment of the present application;
FIG. 12 is a comparison graph of the prediction results of different combination prediction models provided in the embodiments of the present application;
fig. 13 is a schematic structural diagram of a short-term load prediction model building device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making creative efforts shall fall within the protection scope of the present application.
The embodiment of the application provides a short-term load prediction model construction method, device, equipment and readable storage medium, which can solve the problems of poor prediction accuracy and weak generalization performance of the traditional load prediction model in the related art.
Referring to fig. 1 and fig. 2, an embodiment of the present application provides a method for constructing a short-term load prediction model, including the following steps:
step S1: acquiring an original feature set, wherein the original feature set comprises a calendar rule feature, a weather influence feature and a historical load feature, and screening the original feature set based on an mRMR-IPSO feature selection method to obtain an initial feature data set;
further, the screening of the primitive feature set based on the mRMR-IPSO feature selection method to obtain an initial feature data set includes:
screening a primary selection feature subset from the original feature set based on correlation characteristics and maximum correlation minimum redundancy characteristics of an mRMR algorithm;
carrying out binary coding on the initially selected feature subset and generating an initialization population;
and iteratively updating the initialization population based on an IPSO algorithm to obtain an initial characteristic data set.
Exemplarily, in the present embodiment, referring to fig. 3 to fig. 5, an original feature set for short-term load prediction is constructed, which includes various factors such as calendar rules, weather influences and historical loads, and the original feature set is screened by the mRMR (max-Relevance and Min-Redundancy)-IPSO (Improved Particle Swarm Optimization) feature selection method. The screening process includes two stages. The first stage selects features based on the mRMR algorithm: the original feature set S_0 undergoes data preprocessing and mRMR feature ranking to obtain the initially selected feature subset S_f. The second stage selects features based on the IPSO algorithm: the initially selected feature subset S_f is binary coded and an initialization population is randomly generated; the fitness function in the IPSO algorithm then updates the particles in the population and outputs the selected features, which are input into a LightGBM prediction model and evaluated with a preset training set and five-fold cross validation to obtain the prediction error; the IPSO algorithm is then iteratively updated based on the number of selected features and the prediction error, finally yielding the input feature variable data set for short-term load prediction (namely the initial feature data set S_e).
Specifically, step S11: referring to table 1, a raw feature set S_0 including various factors such as calendar rules, weather influences and historical loads is constructed.
Table 1 raw feature set example
Among the calendar rule features, C_hour (0-23) denotes the hour, C_month (1-12) denotes the month, C_day (1-31) denotes the day of the month, C_week (1-7) denotes the day of the week, C_wow denotes weekend or non-weekend with 0 or 1, and C_holi denotes holiday or non-holiday with 0 or 1. Among the weather influence features, T_h denotes the temperature h hours before the predicted time, T_24d denotes the temperature at the predicted time d days before the predicted day, T_avg(t), T_max(t) and T_min(t) respectively denote the daily average, daily maximum and daily minimum temperature over the period [t−24, t], and T_t′ and T_t″ respectively denote the first and second derivatives of the temperature at the predicted time. Among the historical load features, L_h denotes the load h hours before the predicted time, L_24d denotes the load at the predicted time d days before the predicted day, L_avg(t−24), L_max(t−24) and L_min(t−24) respectively denote the daily average, daily maximum and daily minimum load over the period [t−48, t−24], and L′_(t−24) and L″_(t−24) respectively denote the first and second derivatives of the load at the same time of the day before the predicted day.
Step S12: referring to FIG. 3, the 1st feature selection is first performed, i.e. for each candidate feature x_i in the original feature set S_0, the correlation with the target variable y is calculated, and the feature s_1 with the maximum correlation is selected and added to the selected feature set S_1.

The correlation I(x; y) is calculated as the mutual information:

I(x; y) = ∬ p(x, y) · log[ p(x, y) / (p(x) p(y)) ] dx dy

The feature s_1 with the maximum correlation is:

s_1 = argmax_{x_i ∈ S_0} I(x_i; y)
Step S13: the nth feature selection is then performed, i.e. for each candidate feature s in S_0 \ S_{n−1}, the max-relevance min-redundancy value Φ is calculated, and the feature s_n with the maximum Φ value is selected and added to the selected feature set S_{n−1} to form the new selected feature set S_n.

The max-relevance min-redundancy value Φ is calculated as:

Φ(s) = I(s; y) − (1 / |S_{n−1}|) · Σ_{s_i ∈ S_{n−1}} I(s; s_i)

The feature s_n with the maximum Φ value is:

s_n = argmax_{s ∈ S_0 \ S_{n−1}} Φ(s)
step S14: step S13 is repeated until the set of selected features S n The number of the middle features reaches a set threshold value N t Further obtain the initially selected feature subset S f I.e. S f =S n (ii) a The set threshold of the number of features may be: n is a radical of t =40。
Step S15: referring to FIG. 4, the initially selected feature subset S_f is binary coded, where 1 indicates that a feature is selected and 0 indicates that it is not; after the population size and the fitness function are determined, an initialization population is randomly generated.
Step S16: the fitness value of each individual in the initialization population is calculated according to the fitness function, where the fitness function comprises:

f(x) = λ · E_MAPE + (1 − λ) · N_s / N_t

E_MAPE = (1/n) · Σ_{t=1}^{n} |(y_t − ŷ_t) / y_t| × 100%

where f(x) is the fitness function of particle x; λ is a weighting factor, which may be set to λ = 0.75; E_MAPE is the mean absolute percentage error, used as the accuracy evaluation index of the prediction model; N_s is the number of selected features; N_t = 40 is the total number of features in the initially selected feature subset; n is the total number of samples participating in the evaluation, which may be set to n = 7754; y_t and ŷ_t respectively denote the real value and the predicted value of the load at time t.
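Under the reading that the fitness combines the prediction error with the proportion of selected features, the fitness calculation can be sketched as follows (a hedged example; the function names are illustrative and the defaults mirror the settings above):

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error E_MAPE, in percent."""
    return 100.0 / len(y_true) * sum(
        abs((t - p) / t) for t, p in zip(y_true, y_pred))

def fitness(y_true, y_pred, n_selected, n_total=40, lam=0.75):
    """f(x) = lam * E_MAPE + (1 - lam) * N_s / N_t (smaller is better):
    a weighted trade-off between prediction error and feature count."""
    return lam * mape(y_true, y_pred) + (1.0 - lam) * n_selected / n_total
```

With λ = 0.75 the error term dominates, so the IPSO search prefers accuracy and only secondarily prunes features.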
Step S17: the inertia weight ω, the individual learning factor c_1 and the population learning factor c_2 of the particles are updated, and the global optimal solution G_best and the individual optimal solution P_best are updated, so as to calculate the updated particle positions and flight velocities.

The inertia weight ω of the particles is calculated as:

ω = ω_max − (ω_max − ω_min) · (t / T_max)^2

where ω_max and ω_min are respectively the maximum and minimum values of ω, which may be set to ω_max = 0.9 and ω_min = 0.4; T_max is the maximum number of iterations, which may be set to T_max = 50; t is the current iteration number.

The two learning factors of the particles are calculated as:

c_1 = c_1max − (c_1max − c_1min) · (t / T_max)

c_2 = c_2min + (c_2max − c_2min) · (t / T_max)

where c_1max and c_1min are respectively the maximum and minimum values of c_1, which may be set to c_1max = 2.5 and c_1min = 0.5; c_2max and c_2min are respectively the maximum and minimum values of c_2, which may be set to c_2max = 2.5 and c_2min = 0.5; T_max and t are as above.
The particle position is updated as:

x_i^(t+1) = x_i^t + v_i^(t+1)

where x_i^(t+1) is the position of particle i at the (t+1)-th iteration and v_i^(t+1) is the velocity of particle i at the (t+1)-th iteration.

The flight velocity is updated as:

v_i^(t+1) = ω · v_i^t + c_1 · r_1 · (P_best,i^t − x_i^t) + c_2 · r_2 · (G_best^t − x_i^t)

where r_1 and r_2 are random numbers in [0, 1], and P_best,i^t and G_best^t are respectively the individual optimal solution and the population global optimal solution at the t-th iteration; the particle population size may be set to 10.
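A minimal sketch of the particle update rules above (the `update_particle` helper and the list-based particle representation are assumptions for illustration; a complete IPSO feature selector would additionally need binary decoding of positions and fitness evaluation):

```python
import random

def inertia(t, t_max=50, w_max=0.9, w_min=0.4):
    """Quadratically decreasing inertia weight omega."""
    return w_max - (w_max - w_min) * (t / t_max) ** 2

def learning_factors(t, t_max=50, c_max=2.5, c_min=0.5):
    """c1 decreases and c2 increases over the run (assumed linear
    schedules between the stated maximum and minimum values)."""
    c1 = c_max - (c_max - c_min) * t / t_max
    c2 = c_min + (c_max - c_min) * t / t_max
    return c1, c2

def update_particle(x, v, p_best, g_best, t, t_max=50):
    """One velocity-then-position update for a single particle."""
    w = inertia(t, t_max)
    c1, c2 = learning_factors(t, t_max)
    new_x, new_v = [], []
    for xi, vi, pi, gi in zip(x, v, p_best, g_best):
        r1, r2 = random.random(), random.random()
        vij = w * vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi)
        new_v.append(vij)
        new_x.append(xi + vij)
    return new_x, new_v
```

Early iterations (large c1) favor each particle's own best; late iterations (large c2) pull the swarm toward the global best.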
Step S18: steps S16 and S17 are repeated until the maximum number of iterations is reached, yielding the final refined initial feature data set S_e.
Step S2: performing optimization training on an initial Stacking ensemble learning model based on the initial feature data set to obtain an initial prediction model;
further, before the step of performing optimization training on the initial Stacking ensemble learning model based on the initial feature data set to obtain an initial prediction model, the method further includes:
respectively training a plurality of independent prediction models in a preset prediction model set based on the initial characteristic data set to obtain a plurality of trained prediction models;
screening at least one first prediction model and at least one second prediction model from the plurality of trained prediction models based on model prediction accuracy and model prediction error correlation;
an initial Stacking ensemble learning model is built based on a first prediction model and a second prediction model, the first prediction model is used as a base model of the initial Stacking ensemble learning model, and the second prediction model is used as a meta model of the initial Stacking ensemble learning model.
Further, the set of prediction models includes a LightGBM model, an LSTM model, an RF model, an XGBoost model, a GBDT model, a DBN model, an SVR model, and a KNN model.
Further, the performing optimization training on the initial Stacking ensemble learning model based on the initial feature data set to obtain an initial prediction model includes:
training each base model based on the initial feature data set and K-fold cross validation, and outputting a test result of each base model, wherein K is a positive integer;
merging the test results of all the base models to obtain a test result set;
and training the meta-model based on the test result set to generate an initial prediction model.
Exemplarily, in the present embodiment, the initial feature data set S_e obtained by screening is used to perform optimization training on the initial Stacking ensemble learning model to obtain an initial prediction model.
Specifically, step S21: the multiple independent prediction models are optimized and trained on the screened initial feature data set S_e to obtain the prediction accuracy and prediction error correlation of each independent prediction model, and the base model and meta model combination of the Stacking ensemble learning model is then determined from the multiple independent prediction models according to these two indicators. The number of independent prediction models may be set to 8, namely: the LightGBM model, RF model, LSTM model, XGBoost model, GBDT model, DBN model, SVR model and KNN model;
the model prediction accuracy E_p, evaluated here by the mean absolute percentage error, is calculated as:

E_p = (1/n) · Σ_{i=1}^{n} |y_i − ŷ_i| / y_i × 100%

where y_i and ŷ_i are the true and predicted load of sample i, respectively;
the model prediction error correlation r_xy is calculated as:

r_xy = Σ_{i=1}^{n} (x_i − x̄)(y_i − ȳ) / √[ Σ_{i=1}^{n} (x_i − x̄)² · Σ_{i=1}^{n} (y_i − ȳ)² ]

where n is the total number of prediction samples and may be set to n = 7754; x_i and y_i are the prediction errors of two different independent prediction models; x̄ and ȳ are the mean prediction errors of the two models, respectively.
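The error correlation r_xy above is the Pearson correlation between the two models' prediction-error series, and can be computed directly; the function name below is an assumption.

```python
import numpy as np

def error_correlation(e_x, e_y):
    """Pearson correlation r_xy between the prediction-error series of
    two independent models, used to pick diverse base models."""
    e_x = np.asarray(e_x, dtype=float)
    e_y = np.asarray(e_y, dtype=float)
    dx = e_x - e_x.mean()   # deviations from the mean error
    dy = e_y - e_y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))
```

The result agrees with `np.corrcoef(e_x, e_y)[0, 1]`; a value near 1 indicates the two models make very similar errors, so at least one of them adds little diversity to the ensemble.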
Step S22: referring to fig. 6 and 7, the initial feature data set S_e is divided into K equal parts; K is set to 5 in this embodiment, and it should be noted that the specific setting of K may be determined according to actual conditions. The base models are trained by K-fold cross validation: referring to fig. 7, for the base learner in each base model, K−1 parts are taken in turn as the training set and 1 part as the test set, and training and testing are performed K times, yielding predicted values for the 5 training folds and predicted values for the 5 test folds. The predicted values of the 5 training folds of each base model are stacked to form part of the new training set in the test result set, and the predicted values of the 5 test folds of each base model are averaged to form part of the new test set in the test result set.
Step S23: the meta learner in the meta model is trained and tested with the test result set, thereby generating the optimized Stacking ensemble learning model, namely the initial prediction model.
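Steps S22 and S23 can be sketched as out-of-fold stacking with K = 5. In this minimal sketch a tiny least-squares regressor stands in for the patent's base learners (LightGBM, LSTM, DBN, SVR) and meta learner (LightGBM); the helper names are assumptions.

```python
import numpy as np

class LstSq:
    """Minimal linear least-squares model with fit/predict, used here
    only as a stand-in for a real base or meta learner."""
    def fit(self, X, y):
        A = np.c_[X, np.ones(len(X))]          # add intercept column
        self.w, *_ = np.linalg.lstsq(A, y, rcond=None)
        return self
    def predict(self, X):
        return np.c_[X, np.ones(len(X))] @ self.w

def stack_train(base_models, meta_model, X_tr, y_tr, X_te, k=5):
    """Train base models with K-fold CV, build the meta data set, then
    train the meta model on the out-of-fold predictions."""
    folds = np.array_split(np.arange(len(X_tr)), k)
    train_meta = np.zeros((len(X_tr), len(base_models)))
    test_meta = np.zeros((len(X_te), len(base_models)))
    for j, model in enumerate(base_models):
        fold_preds = []
        for te_idx in folds:
            tr_idx = np.setdiff1d(np.arange(len(X_tr)), te_idx)
            model.fit(X_tr[tr_idx], y_tr[tr_idx])
            # out-of-fold predictions become the meta model's training set
            train_meta[te_idx, j] = model.predict(X_tr[te_idx])
            # per-fold test predictions are averaged afterwards
            fold_preds.append(model.predict(X_te))
        test_meta[:, j] = np.mean(fold_preds, axis=0)
    meta_model.fit(train_meta, y_tr)
    return meta_model, test_meta
```

Stacking one column per base model keeps the meta training set the same length as the original training set, while averaging the K test-fold predictions gives one column per base model for the new test set, exactly the bookkeeping described above.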
Step S3: inputting the initial characteristic data set into the initial prediction model to obtain a load prediction error sequence, constructing an error characteristic set based on the load prediction error sequence, wherein the error characteristic set comprises a calendar rule characteristic, a weather influence characteristic and a historical error characteristic, and screening the error characteristic set based on an mRMR-IPSO characteristic selection method to obtain an error characteristic data set;
Further, the load prediction error sequence E is calculated as:

E = L_train − L̂_train

where L_train represents the true load values corresponding to the initial feature data set and L̂_train represents the load prediction values corresponding to the initial feature data set.
Exemplarily, in the present embodiment, the initial feature data set S_e is input into the initial prediction model, which outputs load prediction values; the load prediction error sequence is then obtained from the calculation formula of E, and an error feature set is constructed from the load prediction error sequence (see table 2).
Table 2 example set of error features
In the calendar rule features, C_hour (0–23) denotes the hour, C_month (1–12) the month, C_day (1–31) the day of the month, C_week (1–7) the day of the week, C_wow (0 or 1) weekend or non-weekend, and C_holi (0 or 1) holiday or non-holiday. In the weather influence features, T_h denotes the temperature h hours before the predicted time, T_24d the temperature at the same time d days before the predicted day, T_avg(t), T_max(t) and T_min(t) the average, maximum and minimum temperature within the period [t−24, t], and T_t′ and T_t″ the first and second derivatives of the temperature at the predicted time. In the historical error features, E_h denotes the error h hours before the predicted time, E_24d the error at the same time d days before the predicted day, E_avg(t−24), E_max(t−24) and E_min(t−24) the average, maximum and minimum error within the period [t−48, t−24], and E′_(t−24) and E″_(t−24) the first and second derivatives of the error at the same time of the day before the predicted day.
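A few of the error features in Table 2 can be illustrated for an hourly error series indexed by timestamp. The column names and the choice of h = 1, d = 1 below are assumptions for illustration, not the patent's exact identifiers.

```python
import numpy as np
import pandas as pd

def build_error_features(err: pd.Series) -> pd.DataFrame:
    """Derive example calendar and historical-error features from an
    hourly prediction-error series (DatetimeIndex assumed)."""
    f = pd.DataFrame(index=err.index)
    # calendar rule features
    f["C_hour"] = err.index.hour
    f["C_week"] = err.index.dayofweek + 1            # 1-7, Monday = 1
    f["C_wow"] = (err.index.dayofweek >= 5).astype(int)
    # historical error features
    f["E_1h"] = err.shift(1)                         # error 1 hour earlier
    f["E_24h"] = err.shift(24)                       # same hour, previous day
    prev_day = err.shift(24).rolling(24)
    f["E_avg"] = prev_day.mean()                     # stats over [t-48, t-24]
    f["E_max"] = prev_day.max()
    f["E_min"] = prev_day.min()
    f["E_d1"] = err.shift(24).diff()                 # first-difference proxy
    return f
```

The weather influence features would be built analogously from a temperature series aligned on the same index.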
The error feature set is screened based on the mRMR-IPSO feature selection method to obtain an error feature data set, and it should be noted that, since the screening method and principle of the error feature data set are similar to those of the initial feature data set, no further description is given here for the simplicity of description.
Step S4: performing optimization training on the initial Stacking ensemble learning model based on the error characteristic data set to obtain an error prediction model;
exemplarily, in this embodiment, referring to fig. 6 and 7, the error feature data set is divided into K equal parts; K is set to 5 in this embodiment, and the specific setting of K may be determined according to actual conditions. The base models are trained by K-fold cross validation: referring to fig. 7, for the base learner in each base model, K−1 parts are taken in turn as the training set and 1 part as the test set, and training and testing are performed K times, yielding predicted values for the 5 training folds and predicted values for the 5 test folds. The predicted values of the 5 training folds of each base model are stacked to form part of the new training set in the test result set, and the predicted values of the 5 test folds of each base model are averaged to form part of the new test set in the test result set. The meta learner in the meta model is then trained and tested with the test result set, thereby generating the optimized Stacking ensemble learning model, namely the error prediction model.
Step S5: a short-term load prediction model is created based on the initial prediction model and the error prediction model.
Further, after the step of creating a short-term load prediction model based on the initial prediction model and the error prediction model, the method further includes:
and calculating the short-term load prediction model based on the load preliminary prediction value output by the initial prediction model and the load error prediction value output by the error prediction model to obtain a short-term load prediction value.
Exemplarily, in this embodiment, the short-term load prediction model includes the initial prediction model and the error prediction model: the initial feature data set is input into the initial prediction model to obtain a preliminary load prediction value, the error feature data set is input into the error prediction model to obtain a load error prediction value, and the preliminary load prediction value and the load error prediction value are then added to obtain the final short-term load prediction value.
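The final combination step is a simple sum of the two model outputs. In this minimal sketch the model objects are stand-ins with a `predict` method; the function name is an assumption.

```python
import numpy as np

def short_term_forecast(initial_model, error_model, x_init, x_err):
    """Final short-term load forecast: preliminary load prediction plus
    predicted error, as described for the combined model."""
    load_hat = np.asarray(initial_model.predict(x_init))  # preliminary load
    err_hat = np.asarray(error_model.predict(x_err))      # predicted error
    return load_hat + err_hat
```

`x_init` and `x_err` are the (generally different) feature rows selected for the initial model and the error model, respectively.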
Therefore, in view of the multi-factor influence and spatio-temporal variability of short-term load, the method takes the load influencing factors into account, constructs a complete original feature set, and screens the input feature variables with the mRMR-IPSO feature selection method: the original feature set is first primarily screened with mRMR, and the primarily selected feature subset is then refined with the IPSO optimal-feature-subset search strategy, so that the initial feature data set and the error feature data set are obtained and the quality of the model training data is improved. A Stacking ensemble learning model is then used to deeply mine the nonlinear mapping relation between the input feature variables and the predicted load for preliminary load prediction; the load prediction error sequence is predicted again with the Stacking ensemble learning model; and finally the preliminary load prediction result and the load error prediction result are added to obtain the final prediction result. Combining feature selection with the multi-algorithm fusion advantages of the Stacking ensemble learning model effectively improves prediction accuracy. Compared with a single prediction model, the short-term load prediction model in this embodiment, formed by training the Stacking ensemble learning model, can fuse multiple single models, exploit the advantages of each model and achieve stronger generalization performance. Meanwhile, an error correction link is introduced on the basis of the preliminary load prediction, i.e., the preliminary load prediction is corrected based on the error prediction model, so that potential information in the errors can be mined and the model prediction error further reduced.
This embodiment is further explained below with reference to fig. 8 to 12.
The historical data of a certain year is taken as experimental data, and the data information is shown in fig. 8. The original feature set contains 8616 samples in total, which are divided into a training set and a test set at a ratio of 9:1, and prediction is performed with a step of 1 h.
Firstly, the mRMR-IPSO is adopted to perform feature selection on the original feature set, and the feature selection result is shown in table 3.
Table 3 mRMR-IPSO feature selection result example
Then, the base model and meta model combination for Stacking ensemble learning is determined among the selected independent prediction models (the LightGBM, LSTM, RF, XGBoost, GBDT, DBN, SVR and KNN models) according to the prediction accuracy and prediction error correlation of each independent prediction model, as shown in figs. 9 to 10 (the data in the box in fig. 10 are the error correlations between the independent models). The evaluation indexes of the prediction results are the mean absolute percentage error E_MAPE and the root mean square error E_RMSE:
E_MAPE = (1/n) · Σ_{t=1}^{n} |y_t − ŷ_t| / y_t × 100%

E_RMSE = √[ (1/n) · Σ_{t=1}^{n} (y_t − ŷ_t)² ]

where n = 862 is the number of prediction points used to compute the error values; y_t and ŷ_t represent the true value and the predicted value of the load at time t, respectively.
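The two evaluation indexes can be computed directly from the true and predicted load series; the function names below are assumptions.

```python
import numpy as np

def e_mape(y, y_hat):
    """Mean absolute percentage error E_MAPE, in percent."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return 100.0 * float(np.mean(np.abs(y - y_hat) / y))

def e_rmse(y, y_hat):
    """Root mean square error E_RMSE, in the load's units (MW here)."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))
```

E_MAPE is scale-free and so comparable across models, while E_RMSE keeps the physical units and penalizes large errors more heavily, which is why the comparison tables report both.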
The parameters of each independent model are shown in table 4.
TABLE 4 example of the hyper-parameter settings for each independent model
Among them, the LightGBM model has the smallest prediction error, with an E_MAPE of 1.45%; the RF, LSTM, XGBoost and GBDT models achieve similar prediction accuracy and show excellent prediction performance; the DBN and SVR models are less accurate; and the KNN model has the largest prediction error, with an E_MAPE as high as 4.15%.
It can be seen that, except for the DBN, SVR and KNN models, the prediction error correlations of the other models are high, because the selected models are all strong learners whose prediction accuracy is at a high level; the correlations among the LightGBM, RF, XGBoost and GBDT models exceed 0.75 because, despite certain differences in algorithm principle, these models are all tree models. The error correlations of the DBN, SVR and KNN models with the other models are lower, because these models belong to deep learning and traditional machine learning, respectively, and their training principles differ.
Therefore, considering both the prediction accuracy of each independent model and the differences among their prediction results, this embodiment selects the LightGBM and LSTM models, which have higher prediction accuracy, together with the DBN and SVR models, whose prediction error correlations with the LightGBM and LSTM models are smaller, as the base learning models for Stacking ensemble learning, and selects the LightGBM model, which has the best prediction performance, as the meta learning model.
In order to analyze the rationality of the method for selecting the Stacking ensemble learning base and meta models in this embodiment, different base learning model combinations are set to construct different Stacking ensemble learning models for comparative analysis; the prediction errors are shown in fig. 11. It can be seen that the choice of both the base learning models and the meta learning model affects the performance of the ensemble model, and the ensemble model built with the learners selected in this embodiment (i.e., Stacking4) performs best.
In order to analyze the influence of feature selection, a comparison experiment is performed with three methods: the mRMR-IPSO feature selection used in this embodiment, feature selection based on the conventional LightGBM model feature contribution degree, and feature selection by manual experience. The number of features selected by the latter two methods is set to be the same as that of the method in this embodiment. The prediction results are shown in table 5.
TABLE 5 prediction effectiveness of different feature selection methods

Feature selection method                  E_MAPE /%    E_RMSE /MW
mRMR-IPSO                                 1.50         3.819
LightGBM model feature contribution       1.60         4.046
Manual experience                         1.83         4.586
It can be seen that the input feature set selected by manual experience yields the worst prediction accuracy, feature selection based on the LightGBM model feature contribution degree performs better than manual empirical selection, and the feature selection method used in this embodiment achieves the best prediction accuracy, with an E_MAPE of 1.50%, an improvement of 18.0% over the prediction accuracy of manual empirical feature selection.
In conclusion, because the load fluctuation and the influencing factors differ from region to region, manual empirical feature selection relies only on the commonalities of load fluctuation across regions and can hardly mine the load characteristics of a specific region deeply, so its prediction accuracy is poor. Feature selection based on the LightGBM model feature contribution degree ranks features by the gains of the tree model; since the gains are related to the utilization rate of each feature during training, this method can better mine the correlation between the features and the load to be predicted, and can improve the prediction accuracy to a certain extent. The method used in this embodiment first obtains a feature subset from the correlation analysis between the input feature set and the output load to be predicted according to mathematical-statistical principles, and then further refines the feature subset from the perspective of reducing the model prediction error according to the model prediction performance, so the selected feature subset yields the best prediction accuracy.
In order to verify the rationality and the advantages of the hybrid prediction model provided by the embodiment, different models are set for comparison. The model is set as follows.
M1: prediction with an LSTM model;
M2: prediction with a Stacking ensemble learning model;
M3: mRMR-IPSO feature selection combined with Stacking ensemble learning model prediction;
M4: prediction with a Stacking ensemble learning model, with error correction by an LSTM model;
M5: the model proposed in this embodiment.
the prediction results and prediction errors for the different models are shown in fig. 12 and table 6, respectively.
TABLE 6 comparison of prediction errors for different models
Comparing the prediction evaluation indexes of the different models shows that, when the model proposed in this embodiment is used for load prediction, the prediction errors are E_MAPE = 1.12% and E_RMSE = 3.027 MW, smaller than those of the other 4 models, which fully demonstrates the rationality and superiority of the proposed hybrid prediction model. The E_MAPE values of the M1, M2, M3 and M4 models are 2.35%, 1.63%, 1.32% and 1.52%, respectively; the M1 model has the worst prediction performance, with an E_MAPE of up to 2.35% and an E_RMSE of 6.543 MW. Comparing the M1 and M5 models shows that the prediction accuracy of the proposed model is roughly double that of the LSTM model commonly used for load prediction, because the Stacking ensemble learning model adopted in this embodiment can effectively combine different models and fully exploit their advantages to alleviate the poor generalization of a single model. The Stacking ensemble learning model with feature selection predicts better than the model without it, further showing that the feature selection strategy helps the model better mine load characteristics and thus achieve higher prediction accuracy. Comparing the M3 and M5 models shows that the load prediction model with the added error correction link achieves a better prediction effect than the original load prediction model, and that the error prediction model established in this embodiment can fully mine and utilize the hidden information in the error sequence, thereby improving prediction accuracy.
The above results were further analyzed:
(1) the original load sequence has larger randomness and stronger fluctuation, and a satisfactory load prediction result is difficult to obtain only by relying on a traditional single prediction model.
(2) This embodiment proceeds from three aspects: performing feature selection to improve the quality of the training data input into the prediction model, adopting an ensemble learning combined model to improve prediction performance, and introducing an error correction link to mine prediction error information. The hybrid prediction model built on Stacking ensemble learning and error correction can effectively improve the overall prediction performance of the model in each link.
The embodiment of the present application further provides a short-term load prediction model building apparatus, including:
the system comprises a first data processing unit, a second data processing unit and a third data processing unit, wherein the first data processing unit is used for acquiring an original feature set, the original feature set comprises a calendar rule feature, a weather influence feature and a historical load feature, and the original feature set is screened based on an mRMR-IPSO feature selection method to obtain an initial feature data set;
the first model training unit is used for carrying out optimization training on an initial Stacking ensemble learning model based on the initial characteristic data set to obtain an initial prediction model;
the second data processing unit is used for inputting the initial characteristic data set into the initial prediction model to obtain a load prediction error sequence, constructing an error characteristic set based on the load prediction error sequence, wherein the error characteristic set comprises a calendar rule characteristic, a weather influence characteristic and a historical error characteristic, and screening the error characteristic set based on an mRMR-IPSO characteristic selection method to obtain an error characteristic data set;
the second model training unit is used for carrying out optimization training on the initial Stacking ensemble learning model based on the error characteristic data set to obtain an error prediction model;
a model creation unit for creating a short-term load prediction model based on the initial prediction model and the error prediction model.
According to the apparatus, in view of the multi-factor influence and spatio-temporal variability of short-term load, the load influencing factors are taken into account, a complete original feature set is constructed, and the original feature set is then screened with the mRMR-IPSO feature selection method to improve the quality of the model training data. Compared with a single prediction model, the apparatus can fuse multiple single models, exploit the advantages of each model and achieve stronger generalization performance; an error correction link is also introduced on the basis of the preliminary load prediction, i.e., the preliminary load prediction is corrected based on the error prediction model, so that potential information in the errors can be mined and the model prediction error further reduced.
Further, the first data processing unit is specifically configured to:
screening a primary selection feature subset from the original feature set based on correlation characteristics and maximum correlation minimum redundancy characteristics of an mRMR algorithm;
binary coding is carried out on the initially selected feature subset, and an initialization population is generated;
and iteratively updating the initialization population based on an IPSO algorithm to obtain an initial characteristic data set.
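The primary mRMR screening described above can be sketched as a greedy max-relevance, min-redundancy selection. The dependency measure below (absolute Pearson correlation) and the function name are assumptions; the patent's exact mRMR criterion may use mutual information instead.

```python
import numpy as np

def mrmr_select(X, y, n_select):
    """Greedy mRMR-style screening: at each step pick the feature with
    the highest relevance to y minus its mean redundancy with the
    already-selected features."""
    n_feat = X.shape[1]
    rel = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_feat)])
    selected = [int(np.argmax(rel))]            # most relevant feature first
    while len(selected) < n_select:
        best_j, best_score = None, -np.inf
        for j in range(n_feat):
            if j in selected:
                continue
            red = np.mean([abs(np.corrcoef(X[:, j], X[:, s])[0, 1])
                           for s in selected])  # redundancy with selection
            score = rel[j] - red                # relevance minus redundancy
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected
```

The resulting primarily selected subset would then be binary-coded and refined by the IPSO search, as the unit description states.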
Further, the first model training unit is further configured to:
respectively training a plurality of independent prediction models in a preset prediction model set based on the initial characteristic data set to obtain a plurality of trained prediction models;
screening at least one first prediction model and at least one second prediction model from the plurality of trained prediction models based on model prediction accuracy and model prediction error correlation;
an initial Stacking ensemble learning model is built based on a first prediction model and a second prediction model, the first prediction model is used as a base model of the initial Stacking ensemble learning model, and the second prediction model is used as a meta model of the initial Stacking ensemble learning model.
Further, the set of prediction models includes a LightGBM model, an LSTM model, an RF model, an XGBoost model, a GBDT model, a DBN model, an SVR model, and a KNN model.
Further, the first model training unit is specifically configured to:
training each base model based on the initial feature data set and K-fold cross validation, and outputting a test result of each base model, wherein K is a positive integer;
combining the test results of the base models to obtain a test result set;
and training the meta-model based on the test result set to generate an initial prediction model.
Further, the load prediction error sequence E is calculated as:

E = L_train − L̂_train

where L_train represents the true load values corresponding to the initial feature data set and L̂_train represents the load prediction values corresponding to the initial feature data set.
Further, the model creating unit is further configured to:
and calculating the short-term load prediction model based on the load preliminary prediction value output by the initial prediction model and the load error prediction value output by the error prediction model to obtain a short-term load prediction value.
It should be noted that, as will be clear to those skilled in the art, for convenience and brevity of description, the specific working processes of the apparatus and the units described above may refer to the corresponding processes in the foregoing short-term load prediction model construction method embodiment, and are not described herein again.
The apparatus provided by the above embodiment may be implemented in the form of a computer program that can be run on a short-term load prediction model building device as shown in fig. 13.
The embodiment of the present application further provides a short-term load prediction model building device, including: the short-term load prediction model building method comprises a memory, a processor and a network interface which are connected through a system bus, wherein at least one instruction is stored in the memory, and the at least one instruction is loaded and executed by the processor so as to realize all or part of the steps of the short-term load prediction model building method.
The network interface is used for performing network communication, such as sending distributed tasks. Those skilled in the art will appreciate that the architecture shown in fig. 13 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
The Processor may be a CPU, other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, discrete hardware components, etc. The general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like; the processor is the control center of the computer device, connecting the various parts of the overall computer device through various interfaces and lines.
The memory may be used to store computer programs and/or modules, and the processor implements various functions of the computer device by running or executing the computer programs and/or modules stored in the memory and by invoking data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required by at least one function (such as a video playing function or an image playing function), and the data storage area may store data created according to the use of the device (such as video data and image data). Further, the memory may include high speed random access memory, and may also include non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash Card, at least one magnetic disk storage device, a Flash memory device, or other non-volatile solid state storage device.
The embodiment of the present application further provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, all or part of the steps of the aforementioned short-term load prediction model construction method are implemented.
The embodiments of the present application may implement all or part of the foregoing processes, or may be implemented by a computer program to instruct related hardware, where the computer program may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the computer program may implement the steps of the foregoing methods. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer readable medium may include: any entity or device capable of carrying computer program code, recording medium, U-disk, removable hard disk, magnetic disk, optical disk, computer memory, Read-Only memory (ROM), Random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution media, and the like. It should be noted that the computer readable medium may contain other components which may be suitably increased or decreased as required by legislation and patent practice in jurisdictions, for example, in some jurisdictions, in accordance with legislation and patent practice, the computer readable medium does not include electrical carrier signals and telecommunications signals.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, server, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above description is merely exemplary of the present application and is presented to enable those skilled in the art to understand and practice the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A short-term load prediction model construction method is characterized by comprising the following steps:
acquiring an original feature set, wherein the original feature set comprises a calendar rule feature, a weather influence feature and a historical load feature, and screening the original feature set based on an mRMR-IPSO feature selection method to obtain an initial feature data set;
performing optimization training on an initial Stacking ensemble learning model based on the initial feature data set to obtain an initial prediction model;
inputting the initial feature data set into the initial prediction model to obtain a load prediction error sequence, and constructing an error feature set based on the load prediction error sequence, wherein the error feature set comprises a calendar rule feature, a weather influence feature, and a historical error feature, and screening the error feature set based on an mRMR-IPSO feature selection method to obtain an error feature data set;
performing optimization training on the initial Stacking ensemble learning model based on the error feature data set to obtain an error prediction model; and
creating a short-term load prediction model based on the initial prediction model and the error prediction model.
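The two-stage structure of claim 1 can be sketched in miniature: an initial model produces a preliminary forecast, an error model is trained on the initial model's residuals, and the two outputs are summed. The toy mean predictors and the hour-of-day feature below are illustrative assumptions; in the patent both stages are mRMR-IPSO-screened, optimized Stacking ensemble learning models.

```python
# Miniature sketch of the two-stage forecaster in claim 1. The "models"
# here are toy mean predictors (an assumption for illustration only).

def fit_global_mean(hours, loads):
    """Toy initial model: predicts the overall mean load for any hour."""
    mean = sum(loads) / len(loads)
    return lambda hour: mean

def fit_hourly_mean(hours, targets):
    """Toy error model: predicts the mean target per hour-of-day bucket."""
    buckets = {}
    for h, t in zip(hours, targets):
        buckets.setdefault(h % 24, []).append(t)
    means = {h: sum(v) / len(v) for h, v in buckets.items()}
    fallback = sum(targets) / len(targets)
    return lambda hour: means.get(hour % 24, fallback)

def build_two_stage(hours, loads):
    initial = fit_global_mean(hours, loads)
    # Error sequence: true load minus preliminary prediction (claim 6).
    errors = [y - initial(h) for h, y in zip(hours, loads)]
    error_model = fit_hourly_mean(hours, errors)
    # Final forecast = preliminary prediction + predicted error (claim 7).
    return lambda hour: initial(hour) + error_model(hour)
```

With loads of 10 at hour 0 and 20 at hour 1, the global-mean stage predicts 15 everywhere, and the error stage learns the ±5 hourly corrections, recovering the true pattern.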
2. The method for constructing a short-term load prediction model according to claim 1, wherein the screening of the raw feature set based on the mRMR-IPSO feature selection method to obtain an initial feature data set comprises:
screening an initially selected feature subset from the original feature set based on the correlation and maximum-relevance minimum-redundancy criteria of the mRMR algorithm;
performing binary coding on the initially selected feature subset to generate an initialization population; and
iteratively updating the initialization population based on an IPSO algorithm to obtain the initial feature data set.
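The mRMR screening stage of claim 2 can be sketched as a greedy max-relevance min-redundancy ranking. Using absolute Pearson correlation as the relevance/redundancy measure is an assumption (mRMR is usually stated with mutual information), and the binary-coded IPSO refinement of the subset is omitted here.

```python
# Greedy mRMR ranking (claim 2, first step): at each step pick the
# feature with the highest relevance-to-target minus mean redundancy
# against the already-selected set. Pearson correlation stands in for
# mutual information; the IPSO refinement step is not shown.

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb) if sa and sb else 0.0

def mrmr_rank(features, target, k):
    """features: dict of name -> value list; returns k names in mRMR order."""
    relevance = {f: abs(pearson(v, target)) for f, v in features.items()}
    selected, remaining = [], list(features)
    while remaining and len(selected) < k:
        def score(f):
            if not selected:
                return relevance[f]
            redundancy = sum(abs(pearson(features[f], features[s]))
                             for s in selected) / len(selected)
            return relevance[f] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

On toy data, a near-duplicate of an already-chosen feature is demoted below a weakly relevant but independent feature, which is exactly the redundancy penalty at work.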
3. The method for constructing a short-term load prediction model according to claim 1, wherein before the step of performing optimization training on the initial Stacking ensemble learning model based on the initial feature data set to obtain the initial prediction model, the method further comprises:
respectively training a plurality of independent prediction models in a preset prediction model set based on the initial characteristic data set to obtain a plurality of trained prediction models;
screening at least one first prediction model and one second prediction model from the plurality of trained prediction models based on model prediction accuracy and the correlation between model prediction errors;
building an initial Stacking ensemble learning model based on the first prediction model and the second prediction model, wherein the first prediction model serves as a base model of the initial Stacking ensemble learning model and the second prediction model serves as a meta model of the initial Stacking ensemble learning model.
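Claim 3 does not fix an exact selection rule, so the sketch below is one plausible reading (an assumption, not the patent's stated method): rank candidates by validation error, keep accurate models whose error sequences are weakly correlated as base models, and take the best remaining model as the meta model.

```python
# One possible base/meta selection for claim 3 (illustrative assumption):
# accurate models are preferred as bases, but a candidate whose error
# sequence is almost perfectly correlated with an already-chosen base
# model adds little diversity and is skipped.

def rmse(errors):
    return (sum(e * e for e in errors) / len(errors)) ** 0.5

def error_corr(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb) if sa and sb else 0.0

def select_base_and_meta(error_seqs, max_base=3, corr_cap=0.95):
    """error_seqs: dict of model name -> validation error sequence.
    Assumes more candidate models than max_base."""
    ranked = sorted(error_seqs, key=lambda m: rmse(error_seqs[m]))
    base = []
    for m in ranked:
        if len(base) == max_base:
            break
        if all(abs(error_corr(error_seqs[m], error_seqs[b])) < corr_cap
               for b in base):
            base.append(m)
    meta = next(m for m in ranked if m not in base)
    return base, meta
```

In the toy case below, model B is nearly a clone of model A (error correlation above the cap), so the diverse but less accurate model C is taken as the second base instead.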
4. The method of constructing a short-term load prediction model according to claim 3, wherein the set of prediction models comprises a LightGBM model, an LSTM model, an RF model, an XGBoost model, a GBDT model, a DBN model, an SVR model, and a KNN model.
5. The method for constructing a short-term load prediction model according to claim 3, wherein the performing optimization training on the initial Stacking ensemble learning model based on the initial feature data set to obtain an initial prediction model comprises:
training each base model based on the initial feature data set and K-fold cross validation, and outputting a test result of each base model, wherein K is a positive integer;
merging the test results of all the base models to obtain a test result set;
training the meta model based on the test result set to generate the initial prediction model.
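The K-fold training loop of claim 5 can be sketched as follows: each base model is refit on K−1 folds and predicts the held-out fold, the out-of-fold predictions of all base models are merged column-wise into the meta model's training set, and the meta model is trained on that set. The toy learners and the inverse-error meta weighting are illustrative assumptions, not the patent's actual models.

```python
# K-fold Stacking sketch (claim 5): out-of-fold (OOF) base-model
# predictions form the meta model's training set. Toy closures stand in
# for the patent's real base and meta learners (an assumption).

def kfold_indices(n, k):
    """Yield (train_idx, test_idx) for K contiguous folds."""
    size = n // k
    for f in range(k):
        lo = f * size
        hi = (f + 1) * size if f < k - 1 else n
        test = list(range(lo, hi))
        train = [i for i in range(n) if i < lo or i >= hi]
        yield train, test

def fit_mean(xs, ys):
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_identity(xs, ys):
    return lambda x: x  # pretends the input itself is the forecast

def fit_meta(oof_rows, ys):
    """Meta model: weight each base model by its inverse OOF error."""
    n_models = len(oof_rows[0])
    weights = []
    for j in range(n_models):
        err = sum(abs(r[j] - y) for r, y in zip(oof_rows, ys)) / len(ys)
        weights.append(1.0 / (err + 1e-9))
    total = sum(weights)
    return lambda preds: sum(w * p for w, p in zip(weights, preds)) / total

def stack_train(xs, ys, base_fits, meta_fit, k=3):
    oof = [[0.0] * len(base_fits) for _ in xs]
    for train, test in kfold_indices(len(xs), k):
        for j, fit in enumerate(base_fits):
            model = fit([xs[i] for i in train], [ys[i] for i in train])
            for i in test:
                oof[i][j] = model(xs[i])          # held-out prediction only
    meta = meta_fit(oof, ys)                      # train meta on merged OOF set
    full = [fit(xs, ys) for fit in base_fits]     # refit bases on all data
    return lambda x: meta([m(x) for m in full])
```

Predicting on held-out folds (rather than training folds) is the point of the K-fold scheme: it keeps the meta model from simply rewarding whichever base model overfits hardest.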
6. The method for constructing a short-term load prediction model according to claim 1, wherein the load prediction error sequence E is calculated by the formula:

E = L_train − L̂_train

where L_train represents the true load values corresponding to the initial feature data set, and L̂_train represents the load prediction values corresponding to the initial feature data set.
7. The method of constructing a short-term load prediction model as claimed in claim 1, further comprising, after said step of creating a short-term load prediction model based on said initial prediction model and said error prediction model:
calculating, by the short-term load prediction model, a short-term load prediction value based on the preliminary load prediction value output by the initial prediction model and the load error prediction value output by the error prediction model.
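Claims 6 and 7 reduce to element-wise arithmetic, illustrated below: the error sequence is the true training load minus the preliminary prediction, and the corrected forecast adds the predicted error back onto the preliminary one.

```python
# Claim 6: error sequence E = L_train - L̂_train (element-wise).
def error_sequence(l_true, l_pred):
    return [t - p for t, p in zip(l_true, l_pred)]

# Claim 7: final forecast = preliminary prediction + predicted error.
def corrected_forecast(l_pred, e_pred):
    return [p + e for p, e in zip(l_pred, e_pred)]
```

If the true loads are [100, 110] and the initial model predicts [98, 113], the error sequence is [2, -3]; a perfect error model would bring the corrected forecast back to [100, 110].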
8. A short-term load prediction model construction apparatus, comprising:
the system comprises a first data processing unit, a second data processing unit and a third data processing unit, wherein the first data processing unit is used for acquiring an original feature set, the original feature set comprises a calendar rule feature, a weather influence feature and a historical load feature, and the original feature set is screened based on an mRMR-IPSO feature selection method to obtain an initial feature data set;
the first model training unit is used for carrying out optimization training on an initial Stacking ensemble learning model based on the initial characteristic data set to obtain an initial prediction model;
the second data processing unit is used for inputting the initial characteristic data set into the initial prediction model to obtain a load prediction error sequence, constructing an error characteristic set based on the load prediction error sequence, wherein the error characteristic set comprises a calendar rule characteristic, a weather influence characteristic and a historical error characteristic, and screening the error characteristic set based on an mRMR-IPSO characteristic selection method to obtain an error characteristic data set;
the second model training unit is used for carrying out optimization training on the initial Stacking ensemble learning model based on the error characteristic data set to obtain an error prediction model;
a model creation unit for creating a short-term load prediction model based on the initial prediction model and the error prediction model.
9. A short-term load prediction model construction device, characterized by comprising: a memory and a processor, the memory having stored therein at least one instruction that is loaded and executed by the processor to implement the short term load prediction model construction method of any of claims 1 to 7.
10. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program which, when executed by a processor, implements the short-term load prediction model construction method of any one of claims 1 to 7.
CN202210583319.6A 2022-05-25 2022-05-25 Short-term load prediction model construction method, device, equipment and readable storage medium Pending CN114970345A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210583319.6A CN114970345A (en) 2022-05-25 2022-05-25 Short-term load prediction model construction method, device, equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210583319.6A CN114970345A (en) 2022-05-25 2022-05-25 Short-term load prediction model construction method, device, equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN114970345A true CN114970345A (en) 2022-08-30

Family

ID=82955662

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210583319.6A Pending CN114970345A (en) 2022-05-25 2022-05-25 Short-term load prediction model construction method, device, equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN114970345A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116245221A (en) * 2023-01-09 2023-06-09 上海玫克生储能科技有限公司 Load real-time prediction method and device and electronic equipment
CN116245221B (en) * 2023-01-09 2024-03-08 上海玫克生储能科技有限公司 Load real-time prediction method and device and electronic equipment
CN116011657A (en) * 2023-01-29 2023-04-25 上海交通大学 Optimization method, device and system for power distribution network load prediction model based on miniature PMU
CN116011657B (en) * 2023-01-29 2023-06-27 上海交通大学 Optimization method, device and system for power distribution network load prediction model based on miniature PMU
CN116631516A (en) * 2023-05-06 2023-08-22 海南大学 Antituberculous peptide prediction system based on integration of mixed characteristic model and lifting model

Similar Documents

Publication Publication Date Title
CN114970345A (en) Short-term load prediction model construction method, device, equipment and readable storage medium
Chaudhuri et al. A parameter-free hedging algorithm
Jovanovic et al. An ant colony optimization algorithm with improved pheromone correction strategy for the minimum weight vertex cover problem
CN106022521B (en) Short-term load prediction method of distributed BP neural network based on Hadoop architecture
CN109066710B (en) Multi-target reactive power optimization method and device, computer equipment and storage medium
Kock et al. Forecasting performances of three automated modelling techniques during the economic crisis 2007–2009
Akopov et al. A multi-agent genetic algorithm for multi-objective optimization
CN111967971A (en) Bank client data processing method and device
CN111738477A (en) Deep feature combination-based power grid new energy consumption capability prediction method
CN113821983B (en) Engineering design optimization method and device based on proxy model and electronic equipment
CN112884236B (en) Short-term load prediction method and system based on VDM decomposition and LSTM improvement
JPWO2019146189A1 (en) Neural network rank optimizer and optimization method
Singh et al. Multiobjective differential evolution using homeostasis based mutation for application in software cost estimation
CN115034430A (en) Carbon emission prediction method, device, terminal and storage medium
CN110222816B (en) Deep learning model establishing method, image processing method and device
CN113011669B (en) Method for predicting monthly stock quantity of live pigs
Yang et al. Forecasting time series with genetic programming based on least square method
CN111524348A (en) Long-short term traffic flow prediction model and method
CN112712202B (en) Short-term wind power prediction method and device, electronic equipment and storage medium
CN115619563A (en) Stock price analysis method based on neural network
Kommadath et al. Single phase multi-group teaching learning algorithm for computationally expensive numerical optimization (CEC 2016)
Zhao et al. Evolutionary design of analog circuits with a uniform-design based multi-objective adaptive genetic algorithm
CN114118608A (en) Power grid business matching optimization method and device, terminal and storage medium
CN114298319A (en) Method and device for determining joint learning contribution value, electronic equipment and storage medium
Benala et al. Genetic algorithm for optimizing neural network based software cost estimation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination