CN115879590A - Load prediction method based on wavelet feature extraction and ensemble learning model


Publication number
CN115879590A
Authority
CN
China
Prior art keywords: load, wavelet, feature, learning model, data
Prior art date
Legal status
Pending
Application number
CN202210407539.3A
Other languages
Chinese (zh)
Inventor
王书峰
李勇
安彬
许满库
赵军愉
葛硕
Current Assignee
State Grid Corp of China SGCC
Baoding Power Supply Co of State Grid Hebei Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
Baoding Power Supply Co of State Grid Hebei Electric Power Co Ltd
Priority date
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, Baoding Power Supply Co of State Grid Hebei Electric Power Co Ltd filed Critical State Grid Corp of China SGCC
Priority to CN202210407539.3A
Publication of CN115879590A
Legal status: Pending

Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a load prediction method based on wavelet feature extraction and an ensemble learning model. A wavelet transform decomposes the load sequence by frequency, the XGBoost algorithm performs feature selection on the load sequence, and a Stacking ensemble learning model carries out the load prediction. In the first stage, load data, temperature, humidity, rainfall and holiday information at a number of historical moments are collected, and the data are preprocessed and normalized. In the second stage, wavelet decomposition is applied to the processed data to obtain stationary load sequences in several frequency components, and the XGBoost algorithm selects features from each decomposed sequence, removing irrelevant features and reducing the input dimensionality. In the third stage, the constructed feature set is fed into the Stacking deep learning model for training. Compared with a single model, the method effectively improves the accuracy and generalization ability of load prediction, reduces model training time, and is highly practical.

Description

Load prediction method based on wavelet feature extraction and ensemble learning model
Technical Field
The invention relates to load prediction technology, and in particular to a load prediction method based on wavelet feature extraction and an ensemble learning model.
Background
In recent years, with the rapid development of China's economy, the demand for electric power keeps growing. The power industry is a foundational industry in China, and its accurate and effective operation underpins the country's economic and social development. Power load forecasting is one of the key technologies of the power industry, playing an important role in generation, planning, dispatching and maintenance of the power system. Accurate load forecasting helps to arrange generation plans and grid operation modes reasonably, provides a reference for drawing up power system dispatch plans, guarantees safe and reliable operation of the power system, and effectively improves its economic and social benefits.
In short-term power load forecasting, artificial neural networks are widely used: BP neural networks, recurrent neural networks and long short-term memory networks all show good predictive ability. However, each single model has a specific range of application and poor generalization ability. Ensemble learning can combine the advantages of multiple models, effectively overcoming the limitations of a single algorithm and improving prediction accuracy and stability.
Disclosure of Invention
The invention designs a load prediction method based on wavelet feature extraction and an ensemble learning model: the load is turned into stationary sequences by the wavelet transform, feature selection is performed with the XGBoost algorithm, and prediction accuracy is further improved with a Stacking ensemble deep learning model.
The invention adopts the following technical scheme:
a load prediction method based on wavelet feature extraction and an integrated learning model is characterized in that a wavelet transformation method is adopted to carry out frequency decomposition and feature extraction on a load sequence, an XGboost algorithm is used to carry out feature selection on the load sequence, and a Stacking integrated deep learning model is combined to carry out power load prediction. In the first stage of power load prediction, load data, temperature, humidity, rainfall and holiday characteristic information at a plurality of historical moments are adopted, and relevant data are preprocessed and normalized; and in the second stage, based on the data processed in the previous stage, wavelet decomposition is applied to obtain a stable load sequence with a plurality of frequency components, and the XGboost algorithm is used for selecting the characteristics of the decomposed load sequence, so that irrelevant characteristics are removed and the input dimensionality is reduced. And in the third stage, inputting the previously constructed feature set into a Stacking deep learning model for training, acquiring data at multiple moments in real time, and inputting the data into the trained ensemble learning model for prediction.
A load prediction method based on wavelet feature extraction and an ensemble learning model, characterized by comprising the following steps:
Step 1: perform abnormal-data identification, missing-value processing and normalization on the power load, temperature, humidity, rainfall and holiday information data at a number of historical moments, obtaining preprocessed data sets of power load, temperature, humidity, rainfall and holiday information at those moments;
Step 2: decompose the preprocessed power loads at the historical moments into load sequences of several frequency components by wavelet decomposition;
Step 3: perform feature selection on the load sequence of each frequency component with the XGBoost algorithm to obtain a load feature vector for each component; combine it with the preprocessed temperature, humidity, rainfall and holiday data at the historical moments to construct a feature matrix for each component; divide each component's feature matrix evenly into several groups of feature-matrix samples and label each group with its true load value;
Step 4: introduce a Stacking ensemble deep learning model; feed each group of feature-matrix samples of each frequency component into it for prediction to obtain the model's predicted load; construct a loss function from the true load values of each group of samples; obtain the optimized Stacking ensemble deep learning model through K-fold cross-validation training; and sum the prediction results of the models of all frequency components to obtain the final prediction;
Step 5: acquire power load, temperature, humidity, rainfall and holiday flag data at multiple moments in real time, and run the real-time data through steps 1 to 4 with the optimized Stacking ensemble deep learning model to obtain the predicted power load;
preferably, in step 2, the preprocessed power loads at the plurality of historical times are:
data = {load(k), k ∈ [1, L]}
where load(k) denotes the power load at the k-th sampling point and L is the number of sampling points;
In step 2, several frequency components are obtained from the original load sequence after wavelet decomposition and single-branch reconstruction.
Preferably, the feature matrix of each frequency component in step 3 is defined as follows. For each wavelet component, feature selection is performed with the XGBoost algorithm. XGBoost is a massively parallelizable algorithm whose basic building block is the CART regression tree. It constructs trees from the training features and training data, and the importance of each feature's contribution to the model is measured by the total number of times the feature is used to split a tree.
According to the feature importance scores, the N historical moments with the highest feature relevance are selected and combined into a new load feature set X_f:
X_f = [x_f^1(k), x_f^2(k), …, x_f^N(k)]
where X_f denotes the load feature set of the f-th frequency component and x_f^i(k) denotes the feature load value at the k-th sampling point for the i-th historical moment selected for the f-th component; f ∈ [1, F], F being the number of frequency components; i ∈ [1, N], N being the number of historical moments selected by the feature;
step 3, the method for defining the multiple groups of characteristic matrix samples of each frequency component comprises the following steps:
for each wavelet component, obtaining a load characteristic set X after characteristic selection f The preprocessed temperature, humidity, rainfall and holiday information data at the corresponding moment are constructed into a meteorological feature set T, the load feature set and the meteorological feature are assembled to construct a new input feature set, and 5 input feature sets, namely XA4, XD1, XD2, XD3 and XD4, can be obtained corresponding to each wavelet component.
In step 4, the base learners selected for the Stacking ensemble deep learning model are a BP (back-propagation) neural network, an RBF (radial basis function) neural network, a random forest and a long short-term memory neural network.
The beneficial effects of the invention are as follows. Compared with the prior art, the wavelet transform turns the non-stationary original sequence into stationary load sequences, allowing the characteristics of the load sequence to be analysed on different time scales. Feature-importance selection over the factors influencing the load sequence with the XGBoost algorithm removes irrelevant features through construction of a suitable feature set, while effectively reducing the dimensionality of the input variables and the training difficulty of the model. The Stacking ensemble deep learning model overcomes the shortcomings of a single model in load prediction, improves the generalization ability of the model, effectively improves prediction accuracy and stability, reduces the operating cost of the power system, and provides guidance for subsequent generation planning and power dispatching.
Drawings
FIG. 1: is a flow chart of the method of the invention;
FIG. 2: is a schematic diagram of the integrated deep learning model of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
On the basis of existing single models, the method uses the wavelet transform to extract features from the original load data: non-stationary load data are converted into stationary feature data through wavelet decomposition and reconstruction, and the XGBoost algorithm performs feature selection on the feature data, removing irrelevant data, reducing the dimensionality of the input variables and lowering the training difficulty. Through the Stacking ensemble deep learning method and the combination of several prediction models, the accuracy and stability of load prediction are effectively improved.
The invention provides a power load prediction method based on wavelet feature extraction and ensemble deep learning, shown in the flow diagram of FIG. 1, comprising the following steps:
a load prediction method based on wavelet feature extraction and integrated learning model comprises the following steps:
Step 1: perform abnormal-data identification, missing-value processing and normalization on the power load, temperature, humidity, rainfall and holiday information data at a number of historical moments, obtaining preprocessed data sets of power load, temperature, humidity, rainfall and holiday information at those moments;
the default value processing method in the step 1 comprises the following steps: in the data acquisition process, due to machine faults or personnel recording errors and the like, data loss at a certain time point can be caused, the missing data influences the integrity of a data set and the continuity of the data in time, the missing data should be supplemented, the used method is to select K adjacent data and then calculate an average value to be used as a missing data value.
The abnormal data identification and correction method in the step 1 mainly comprises a horizontal processing method and a vertical processing method:
the principle of horizontal processing is to determine the error between the time point and the adjacent time point, and usually, the power load data is continuously collected, so that the two adjacent data points should be smooth and continuous. If the difference between the collected power load data and the data of the adjacent time points before and after the collected power load data is too large at a certain moment, the data point is considered to be an abnormal data point, and the data point should be corrected, namely:
Figure BDA0003602391540000041
the correction method comprises the following steps:
Figure BDA0003602391540000042
in the formula: x (d, t) represents the data of the t time point on the d day, X (d, t-1) represents the data of the t-1 time point on the d day, X (d, t + 1) represents the data of the t +1 time point on the d day,
Figure BDA0003602391540000043
is an error threshold.
The principle of the vertical processing method is to examine the error between a data point and the data at the same time on the two adjacent days. Power load data usually carry a periodic, day-based pattern, i.e. loads on adjacent dates should vary in similar ways. If at some moment the acquired data differ greatly from the data at the same moment of the preceding and following days of the same type, the point is considered abnormal and should be corrected, namely:
|X(d, t) − X(d−1, t)| > ζ · K(d−1) and |X(d, t) − X(d+1, t)| > ζ · K(d+1)
The correction is:
X(d, t) = (X(d−1, t) + X(d+1, t)) / 2
where X(d, t) denotes the data at the t-th time point of day d, X(d−1, t) and X(d+1, t) the data at the same time point on days d−1 and d+1, K(d−1) the average load value of day d−1, K(d+1) the average load value of day d+1, and ζ the error threshold.
The data normalization method in step 1 aims to eliminate the influence of different dimension data on model training, and adopts a maximum and minimum value normalization method, wherein the formula is as follows:
x'_i = (x_i − x_min) / (x_max − x_min)
where x_i denotes the original data value, x'_i the normalized data value, x_min the minimum of the original data sample, and x_max the maximum of the original data sample.
Step 2: decomposing the preprocessed power loads at a plurality of historical moments into load sequences of a plurality of frequency components by a wavelet decomposition method;
and (3) projecting the original load signal into different scale spaces by the wavelet transformation transducer in the step 2, and decomposing the original load signal into signals with different frequency components. These decomposed "subsequences" exhibit the characteristics of the original sequence in different frequency domains, and also exhibit the periodic characteristics of the loaded sequence in different time intervals.
Step 2, the preprocessed power loads at the plurality of historical moments are as follows:
data={load(k),k∈[1,L]}
where load(k) denotes the power load at the k-th sampling point, and L = 3000 is the number of sampling points;
and 2, obtaining a plurality of frequency components by the original load sequence after fourth-order wavelet decomposition and single-branch reconstruction. The wavelet basis function used is a fourth-order Daubechies wavelet, the data after wavelet decomposition has five components, namely an approximate component A4 and detail components D1, D2, D3 and D4, and single-branch reconstruction is carried out on the decomposed frequency components.
And step 3: performing feature selection on the load sequence of each frequency component by using an XGboost algorithm to obtain a load feature vector of each frequency component, constructing a feature matrix of each frequency component by combining preprocessed temperature, humidity, rainfall and holiday data of a plurality of historical moments, uniformly dividing the feature matrix of each frequency component into a plurality of groups of feature matrix samples of each frequency component, and marking a load true value of each group of feature matrix samples of each frequency component;
the XGboost algorithm in the step 3 is a large-scale parallelizable algorithm, the basic component of which is a CART regression tree, and the estimation of the target variable is realized by establishing a series of decision trees and distributing a quantization weight to each leaf node. The XGboost has k decision trees in the training process, and the basic formula of the XGboost is as follows:
Figure BDA0003602391540000061
in the formula:
Figure BDA0003602391540000062
indicates the predicted value, f k Represents the kth decision tree, K being the total number of decision trees, and>
Figure BDA0003602391540000063
a collection space formed for all decision tree functions. The XGboost objective function is:
Figure BDA0003602391540000064
wherein
Figure BDA0003602391540000065
Is a loss function of the model>
Figure BDA0003602391540000066
Is a regularization term to prevent model overfitting, gamma and lambda are complexity parameters, T is the number of leaf nodes, | omega | | sweet silica fume 2 Is the two-norm of the leaf node weights. The XGboost constructs a classification tree according to the training characteristics and the training data. The importance of each feature to the model contribution is determined by calculating the sum of the number of times the feature is used to segment the tree.
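The split-count ("weight"-style) importance score used above can be sketched as follows (an illustrative Python sketch; the input format, a list of trees given as the features they split on, is a hypothetical stand-in for fitted CART trees, not the patent's data structure):

```python
from collections import Counter

def split_count_importance(trees):
    """Feature importance as the share of all tree splits that use each
    feature, mirroring the split-count score described in the text."""
    counts = Counter(f for tree in trees for f in tree)
    total = sum(counts.values())
    return {f: c / total for f, c in counts.items()}
```

Features whose normalized score falls below a chosen cutoff would be the "irrelevant features" dropped before model training.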
The feature matrix of each frequency component in step 3 is defined as follows: according to the feature importance scores, the N historical moments with the highest feature relevance are selected and combined into a new load feature set X_f:
X_f = [x_f^1(k), x_f^2(k), …, x_f^N(k)]
where X_f denotes the load feature set of the f-th frequency component and x_f^i(k) denotes the feature load value at the k-th sampling point for the i-th historical moment selected for the f-th component; f ∈ [1, F], with F = 5 the number of frequency components after wavelet decomposition; i ∈ [1, N], with N = 15 the number of historical moments selected by the feature;
The method of step 3 for defining the groups of feature-matrix samples of each frequency component is as follows:
for each wavelet component, the load feature set X_f obtained after feature selection and the preprocessed temperature, humidity, rainfall and holiday information data at the corresponding moments (forming a meteorological feature set T) are assembled into a new input feature set; corresponding to the wavelet components, five input feature sets XA4, XD1, XD2, XD3 and XD4 are obtained.
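The assembly of a per-component input set, top-N scored lag features plus the meteorological block T, can be sketched as follows (an illustrative Python sketch; the function and feature names are hypothetical):

```python
def build_input_set(scored_lags, weather_features, n=15):
    """Keep the n lag features with the highest importance scores and append
    the weather/holiday block T, giving one input feature set (e.g. XA4)
    per wavelet component."""
    top = sorted(scored_lags, key=lambda kv: kv[1], reverse=True)[:n]
    return [name for name, _ in top] + list(weather_features)
```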
And 4, step 4: introducing a Stacking integrated deep learning model, inputting each group of characteristic matrix samples of each frequency component into the Stacking integrated deep learning model for prediction to obtain the predicted load of the Stacking integrated deep learning model, constructing a loss function model by combining the load truth value of each group of characteristic matrix samples of each frequency component, obtaining the optimized Stacking integrated deep learning model through K-fold cross validation training, and summing the predicted results of each frequency component model to obtain the final predicted result;
the Stacking integrated deep learning model in the step 4 is a representative algorithm in integrated learning, and the basic idea is to divide an original data set into a plurality of data subsets, respectively predict by a plurality of primary learners of a first layer, take the prediction result of a base learner as the input of a prediction model of a second layer, and take the prediction result of the model of the second layer as the final output.
In step 4, the Stacking ensemble deep learning model adopts a BP neural network, an RBF neural network, a random forest and a long short-term memory neural network as the base learners.
The BP neural network uses supervised learning, optimizing the weights and thresholds of each layer by gradient descent on the error. Its training consists of two parts: forward propagation of the signal, in which the input travels through the input, hidden and output layers; and back-propagation of the error, in which the weights and thresholds of each layer are adjusted by gradient descent on the error. The BP network selected here has three layers: an input layer, a hidden layer and an output layer. The number of input-layer nodes is:
n = N + 4
where n is the number of input-layer nodes and N is the number of features selected by XGBoost; the additional 4 correspond to the temperature, humidity, rainfall and holiday features.
The number of hidden-layer nodes is:
m = √(n + l) + α
where m is the number of hidden-layer nodes, n the number of input-layer nodes, l the number of output-layer nodes, and α a constant in the range 1 to 10, taken as 8 in this embodiment.
The output layer has a single node, namely the load value at the time point to be predicted.
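The layer-sizing rules above can be sketched as follows (an illustrative Python sketch; the function name is hypothetical, and rounding of the empirical hidden-layer formula is an assumption since the patent leaves it implicit):

```python
import math

def bp_layer_sizes(n_features, alpha=8):
    """Sizing used in the embodiment: n = N + 4 inputs (N selected lag
    features plus temperature, humidity, rainfall, holiday), one output
    node, and m = sqrt(n + l) + alpha hidden nodes."""
    n = n_features + 4                    # input layer
    l = 1                                 # output layer: predicted load
    m = round(math.sqrt(n + l) + alpha)   # hidden layer (empirical rule)
    return n, m, l
```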
The RBF neural network has a structure similar to the BP network, also comprising an input layer, an output layer and a hidden layer; the difference is that the RBF network has only one hidden layer, and the activation function of its hidden nodes is a radial basis function. Its parameters are selected similarly to the BP network.
Random forest (RF) is a learning algorithm that uses the Bagging ensemble strategy with decision trees as base learners. It randomly selects samples and features during prediction, so each decision tree works on different data, and the final output combines the results of many trees.
The input sample set D is randomly sampled m times using Bootstrap sampling, i.e. partial data are drawn from the original set D with replacement, yielding m sub-training sets. The sub-training sets are used to train m decision tree models; note that each decision tree also chooses its split nodes at random during training. The outputs of the trained decision trees are combined, and the final prediction is obtained as their arithmetic mean.
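The Bootstrap sampling step can be sketched as follows (an illustrative Python sketch; the function name and fixed seed are assumptions for reproducibility, not from the patent):

```python
import random

def bootstrap_samples(data, m, seed=0):
    """Draw m sub-training sets by sampling with replacement, each the same
    size as the original set D, one per decision tree."""
    rng = random.Random(seed)
    return [[rng.choice(data) for _ in data] for _ in range(m)]
```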
The long short-term memory (LSTM) neural network is a deep neural network that controls the writing and reading of its memory through gating units. The LSTM network includes three gates, the input gate, the forget gate and the output gate, whose outputs are:
input gate: i(t) = σ(W_i · [h(t−1), x(t)] + b_i)
forget gate: f(t) = σ(W_f · [h(t−1), x(t)] + b_f)
output gate: o(t) = σ(W_o · [h(t−1), x(t)] + b_o)
cell state: c(t) = f(t) ⊙ c(t−1) + i(t) ⊙ tanh(W_c · [h(t−1), x(t)] + b_c)
current output: h(t) = o(t) ⊙ tanh(c(t))
where σ is the activation function, usually the sigmoid function; W_i, W_f, W_o, W_c are the weight matrices of the input gate, forget gate, output gate and cell state respectively; b_i, b_f, b_o, b_c are the corresponding bias vectors.
By screening signals with these learned weights, the gating units effectively relieve gradient explosion and gradient vanishing during back-propagation and address the long-term dependency problem of the data.
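One pass through the gate equations can be sketched as follows (an illustrative NumPy sketch; the dict-of-matrices layout and shapes are assumptions made for compactness, not the patent's implementation):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM cell step. W and b are dicts keyed 'i', 'f', 'o', 'c';
    each weight matrix acts on the concatenation [h(t-1), x(t)]."""
    z = np.concatenate([h_prev, x])
    i = sigmoid(W['i'] @ z + b['i'])                    # input gate
    f = sigmoid(W['f'] @ z + b['f'])                    # forget gate
    o = sigmoid(W['o'] @ z + b['o'])                    # output gate
    c = f * c_prev + i * np.tanh(W['c'] @ z + b['c'])   # cell state
    h = o * np.tanh(c)                                  # current output
    return h, c
```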
Step 5: acquire power load, temperature, humidity, rainfall and holiday flag data at multiple moments in real time, and run the real-time data through steps 1 to 4 with the optimized Stacking ensemble deep learning model to obtain the predicted power load;
for each reconstructed wavelet signal, dividing the feature set obtained in the last step into a plurality of feature subsets, and respectively inputting the sub data sets into a base learner of the first-layer prediction model. In order to ensure that the base learners are independent and have differences and the base learners have higher prediction performance, a BP neural network, an RBF neural network, a random forest and a long-short term memory neural network can be selected as the base learners. According to the selected 4 base learners, the feature set can be divided into 4 feature subsets, and each feature subset is not intersected with each other. The flow chart of model training and prediction is shown in fig. 2, and for a single basis learner (for example, a BP neural network), the training and prediction steps are as follows:
step 1: for each base learner, splitting the training set into 5 sub-training sets in a 5-fold cross validation mode, wherein one sub-training set is selected from each sub-learner to serve as a validation set.
Step 2: train 5 BP neural networks on the training-set data; use each trained network to predict its validation fold and the test set, obtaining 5 validation-fold predictions and 5 test-set predictions; recombine the 5 validation-fold predictions into the BP training-set prediction, and average the 5 test-set predictions to produce the BP test-set prediction.
Step 3: repeat the steps above with the RBF network, random forest and LSTM to obtain the RBF, RF and LSTM training-set predictions and the RBF, RF and LSTM test-set predictions, respectively.
Step 4: train the second-level model; an XGBoost model can be adopted as the second-level model, realizing the integration of the first-level models. Take the training-set predictions of the single prediction models obtained in step 3 as the input and the corresponding actual values as the output, and train the second-level model.
Step 5: take the test-set predictions of the base learners as input and predict with the trained second-level model to obtain the final prediction result.
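The two-level inference described in steps 1-5 can be sketched as follows (an illustrative Python sketch; plain callables stand in for the trained BP/RBF/RF/LSTM base learners and the XGBoost meta-model):

```python
def stacking_predict(base_models, meta_model, x):
    """Stacking inference: each trained level-1 learner predicts on x, and
    the level-2 meta-model combines their outputs into the final value."""
    level1 = [model(x) for model in base_models]
    return meta_model(level1)
```

With an averaging meta-model the scheme degrades to simple blending; a trained meta-model instead learns how much to trust each base learner.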
It should be understood that parts of the specification not set forth in detail are well within the prior art.
It should be understood that the above description of the preferred embodiments is given for clarity and not for any purpose of limitation, and that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A load prediction method based on wavelet feature extraction and an ensemble learning model, characterized by comprising the following steps:
step 1: respectively preprocessing the power load, temperature, humidity, rainfall and holiday information data at a plurality of historical moments;
step 2: decomposing the preprocessed power load sequence into load sequences with a plurality of frequency components by a wavelet decomposition method;
step 3: performing feature selection on the load sequence of each frequency component by using the XGBoost algorithm, and constructing an optimal input set by combining temperature, humidity, rainfall and holiday data;
step 4: training, verifying and predicting by using a Stacking ensemble learning method, and performing wavelet reconstruction on the prediction results of the plurality of wavelet components to obtain a final prediction result.
2. The load prediction method based on wavelet feature extraction and integrated learning model according to claim 1, wherein the preprocessing in step 1 comprises: abnormal data identification, missing value processing and normalization processing.
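Claim 2 names the three preprocessing operations without fixing their algorithms. A minimal sketch, assuming the common choices of a 3-sigma rule for abnormal data, linear interpolation for missing values, and min-max normalization (all three are assumptions, not specified by the claim):

```python
import numpy as np

def preprocess(load: np.ndarray) -> np.ndarray:
    x = load.astype(float).copy()
    # Abnormal data identification: treat points beyond 3 standard deviations as missing.
    mu, sigma = np.nanmean(x), np.nanstd(x)
    x[np.abs(x - mu) > 3 * sigma] = np.nan
    # Missing value processing: linear interpolation across the gaps.
    idx = np.arange(len(x))
    bad = np.isnan(x)
    x[bad] = np.interp(idx[bad], idx[~bad], x[~bad])
    # Normalization: min-max scaling to [0, 1].
    return (x - x.min()) / (x.max() - x.min())
```

Temperature, humidity and rainfall series would be passed through the same pipeline; holiday information is typically already a clean 0/1 flag and only needs encoding.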
3. The load prediction method based on wavelet feature extraction and integrated learning model according to claim 1, wherein the preprocessed power loads at the plurality of historical moments in step 2 are:
data = {load(k), k ∈ [1, L]}
wherein load(k) represents the power load at the k-th sampling point, and L is the number of sampling points.
4. The load prediction method based on wavelet feature extraction and integrated learning model according to claim 1, wherein in step 2 the plurality of frequency components are obtained from the original load sequence through wavelet decomposition and single-branch reconstruction.
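Single-branch reconstruction means each coefficient band is reconstructed back to the original signal length, so the frequency components sum to the original series. A self-contained sketch using a one-level Haar transform (the claim fixes neither the wavelet basis nor the decomposition level; libraries such as PyWavelets perform the same operation for arbitrary wavelets and levels):

```python
import numpy as np

def haar_components(load: np.ndarray):
    """One-level Haar decomposition with single-branch reconstruction:
    returns a low-frequency (approximation) and a high-frequency (detail)
    component, each at the original length, whose sum equals the input."""
    x = load.astype(float)
    a = (x[0::2] + x[1::2]) / 2.0    # approximation coefficients
    d = (x[0::2] - x[1::2]) / 2.0    # detail coefficients
    approx = np.repeat(a, 2)         # single-branch reconstruction of the low band
    detail = np.repeat(d, 2)
    detail[1::2] *= -1               # single-branch reconstruction of the high band
    return approx, detail
```

Deeper decompositions re-apply the split to the approximation component; the method then trains a separate predictor on each frequency component and recombines the component forecasts by wavelet reconstruction.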
5. The load prediction method based on wavelet feature extraction and integrated learning model according to claim 1, wherein the feature matrix of each frequency component in step 3 is defined as follows: for each wavelet component, after feature selection by the XGBoost algorithm, the N historical moments with the highest feature importance scores are selected and combined into a new load feature set X_f:
X_f = {x_f^i(k), i ∈ [1, N]}
wherein X_f represents the load feature set of the f-th frequency component, x_f^i(k) represents the feature-selected load value of the f-th frequency component at the i-th historical moment for the k-th sampling point, f ∈ [1, F], F represents the number of frequency components, i ∈ [1, N], and N represents the number of selected historical moments.
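The selection in claim 5 amounts to ranking candidate lags by tree feature importance. A sketch under stated assumptions: scikit-learn's GradientBoostingRegressor stands in for XGBoost (both expose the same `feature_importances_` attribute), and `max_lag`/`n_keep` are illustrative parameters, not values from the patent:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def select_lags(component: np.ndarray, max_lag: int = 24, n_keep: int = 6):
    """Rank candidate lags of one wavelet component by tree feature
    importance and keep the n_keep most relevant historical moments."""
    # Column l-1 holds the component delayed by l samples.
    X = np.column_stack(
        [component[max_lag - l : len(component) - l] for l in range(1, max_lag + 1)]
    )
    y = component[max_lag:]
    model = GradientBoostingRegressor(random_state=0).fit(X, y)
    ranked = np.argsort(model.feature_importances_)[::-1]
    return sorted(int(l) + 1 for l in ranked[:n_keep])  # lags forming X_f
```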
6. The load prediction method based on wavelet feature extraction and integrated learning model according to claim 1, wherein the plurality of feature matrix sample sets of each frequency component in step 3 are defined as follows:
for each wavelet component, the load feature set X_f obtained after feature selection is combined with a meteorological feature set T, constructed from the preprocessed temperature, humidity, rainfall and holiday information data at the corresponding moments, to build a new input feature set; one input feature set is obtained for each wavelet component.
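The assembly in claim 6 is a column-wise concatenation of the lagged load features with the aligned exogenous features. A minimal sketch; the function and variable names are illustrative, not from the patent:

```python
import numpy as np

def build_input_set(component: np.ndarray, lags, weather: np.ndarray):
    """Stack the selected lag features of one wavelet component (X_f) with the
    row-aligned meteorological/holiday features (T) into one input matrix."""
    start = max(lags)
    X_f = np.column_stack([component[start - l : len(component) - l] for l in lags])
    T = weather[start:]           # rows aligned with the lagged samples
    y = component[start:]         # target: the component value at each moment
    return np.hstack([X_f, T]), y
```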
7. The load prediction method based on wavelet feature extraction and integrated learning model according to claim 1, wherein the load truth value labeled for each feature matrix sample set of each frequency component in step 3 is the wavelet decomposition value of the manually calibrated real load.
8. The load prediction method based on wavelet feature extraction and integrated learning model according to claim 1, wherein the optimization training in step 4 is K-fold cross-validation training.
9. The load prediction method based on wavelet feature extraction and integrated learning model according to claim 1, wherein the prediction result of each frequency component model in step 4 is the prediction result obtained from the plurality of feature matrix sample sets of each frequency component through the optimized Stacking ensemble learning model.
10. A load prediction system based on wavelet transform and ensemble learning, characterized by comprising:
a data processing unit, configured to preprocess data, construct the input feature set, and send the feature set to the load prediction unit and the model construction unit;
a model construction unit, configured to construct a prediction model from the input feature set, the target feature values and the Stacking ensemble learning method; and
a load prediction unit, configured to acquire related data in real time and obtain an output result through the prediction model.
CN202210407539.3A 2022-04-19 2022-04-19 Load prediction method based on wavelet feature extraction and integrated learning model Pending CN115879590A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210407539.3A CN115879590A (en) 2022-04-19 2022-04-19 Load prediction method based on wavelet feature extraction and integrated learning model


Publications (1)

Publication Number Publication Date
CN115879590A true CN115879590A (en) 2023-03-31

Family

ID=85769359

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210407539.3A Pending CN115879590A (en) 2022-04-19 2022-04-19 Load prediction method based on wavelet feature extraction and integrated learning model

Country Status (1)

Country Link
CN (1) CN115879590A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination