CN109063911B

CN109063911B - Load aggregation grouping prediction method based on gated cycle unit network

Info

Publication number: CN109063911B
Application number: CN201810876297.6A
Authority: CN
Inventors: 王守相; 陈海文; 蔡声霞
Original assignee: Tianjin Xianghe Electric Technology Co ltd
Current assignee: Tianjin Xianghe Electric Technology Co ltd
Priority date: 2018-08-03
Filing date: 2018-08-03
Publication date: 2021-07-23
Anticipated expiration: 2038-08-03
Also published as: CN109063911A

Abstract

The invention relates to a load aggregate grouping prediction method based on a gated recurrent unit (GRU) network, which comprises the following steps: clustering user load data with an adaptive distributed spectral clustering algorithm to obtain a plurality of power utilization groups with similar load characteristics, and solving a load characteristic matrix for each group; building three GRU networks, training them on the time series characteristics extracted from each group to obtain three GRU prediction models, and fusing the three GRU networks with a random forest algorithm to obtain a load prediction model for each group; inputting the features of the moment to be predicted into the load prediction model to obtain the load prediction value of each group, and summing the prediction values of the groups to obtain the final prediction value of the load aggregate. By introducing grouping prediction, deep neural networks and model fusion, the invention can capture the load characteristics and variation patterns of users, and has high prediction accuracy and strong applicability.

Description

Load aggregation grouping prediction method based on gated cycle unit network

Technical Field

The invention belongs to the technical field of power system load prediction, and particularly relates to a load aggregate grouping prediction method based on a gated recurrent unit (GRU) network.

Background

Accurate and rapid load prediction plays a significant role in the safe and economic operation of the power system. Conventional load prediction is divided hierarchically according to the physical measurement structure of the power system, for example the system level, bus level, substation level and microgrid level, and a load prediction method developed for one level cannot be applied to the others. In recent years, with the popularization of smart meters, power companies have been able to acquire large amounts of fine-grained user load data. Smart meter data removes the limitation of the power system measurement structure: load aggregates of different scales can be divided flexibly as required and then predicted. This covers load prediction at the traditional levels, and it also allows load aggregates to be formed and predicted by region (such as building, residential district, block or plot), industry (such as residential, industrial and commercial) or electricity price type (time-of-use, peak-valley and the like), so as to meet the need for more refined load prediction.

Load aggregate prediction is a bottom-up load prediction method based on smart meters. Load aggregates can be divided as required, which makes them flexible, but different divisions lead to very different scales of the predicted object, whereas traditional load prediction methods suit only a specific load scale and do not generalize. In particular, as the load scale shrinks, the aggregation effect of small-scale loads weakens and the mean absolute percentage error (MAPE) of load prediction rises markedly, so traditional prediction methods are not suitable for load aggregates. Load aggregates are flexibly divided, variable in scale and closely tied to user load characteristics; providing a prediction method that is both widely applicable and highly accurate under these conditions is the difficulty of load aggregate prediction.

Because smart meter data is closely related to user load characteristics, cluster analysis can reveal load variation patterns shared by different users. Dividing the load aggregate into several power utilization groups according to these patterns and modeling each group separately can improve the prediction accuracy for the load aggregate.

As for prediction methods, the BP neural network, the Support Vector Machine (SVM) and similar algorithms are widely used in load prediction. These algorithms turn the dynamic time modeling problem into a static one by training a nonlinear relationship between output and input. However, load is typical time series data with dynamic characteristics: its variation depends not only on the current state but also on the course of change over a preceding period. Traditional methods mostly take historical data of similar days and typical days as input and cannot account for the temporal characteristics of load change, so the prediction error is large. Finding latent power utilization behavior patterns in users' historical data and inferring how the load will develop from the evolution of the data is the key to accurate load prediction.

With the development of deep learning, Recurrent Neural Networks (RNN), represented by long short-term memory (LSTM), can take the correlation within time series into account and describe the course of a time series more comprehensively, and they are widely applied in fields such as speech recognition and natural language processing. LSTM has been used for wind power and residential load prediction, where it has been shown to capture the internal laws of time series evolution and the essential characteristics of the series, thereby improving prediction accuracy. However, LSTM has the disadvantage of long training time, and because a load aggregate contains multiple load characteristics, different load characteristics suit different neural network structures. To address these problems, a model fusion method based on deep neural networks is provided, which integrates the advantages of different network structures and improves the prediction accuracy for load aggregates.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provides a prediction method suitable for load aggregation, and the prediction method can grasp the load characteristics and the change rule of a user by introducing a grouping prediction method, a deep neural network method and a model fusion method, and has high prediction precision and strong applicability.

The technical problem to be solved by the invention is realized by adopting the following technical scheme:

A load aggregate grouping prediction method based on a gated recurrent unit network comprises the following steps:

(1) clustering user load data by using a self-adaptive distributed spectral clustering algorithm so as to obtain a plurality of power utilization groups with similar load characteristics, and solving a load characteristic matrix of each group;

(2) building three GRU networks with different structures, training the GRU networks with the different structures by extracting the time sequence characteristics of the groups to obtain prediction models of the three GRU networks, and performing model fusion on the three GRU networks by a random forest algorithm to obtain a load prediction model of each group;

(3) inputting the characteristics of the time to be predicted into the load prediction model obtained in the step (2), respectively obtaining the load prediction value of each group, and summing the prediction values of different groups to obtain the final prediction value of the load aggregate.

In addition, the process of obtaining the power utilization groups in step (1) is as follows: the load data of each user is averaged by week and scaled to the interval [0,1] by min-max normalization, which yields a load characteristic curve for each user; all user characteristic curves are assembled into a matrix, and clustering this matrix yields the power utilization groups.
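The profile construction described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: it assumes half-hourly readings (336 points per week), a series that starts on a week boundary, and that an incomplete trailing week is dropped. The names `weekly_profile` and `profile_matrix` are hypothetical.

```python
import numpy as np

POINTS_PER_WEEK = 48 * 7  # assumed half-hourly sampling: 336 points per week


def weekly_profile(load):
    """Average a user's load series by week and min-max scale it to [0, 1]."""
    load = np.asarray(load, dtype=float)
    n_weeks = load.size // POINTS_PER_WEEK
    if n_weeks == 0:
        raise ValueError("need at least one full week of data")
    # Drop the incomplete trailing week, then stack the weeks as rows.
    weeks = load[: n_weeks * POINTS_PER_WEEK].reshape(n_weeks, POINTS_PER_WEEK)
    mean_week = weeks.mean(axis=0)
    span = mean_week.max() - mean_week.min()
    if span == 0:
        return np.zeros(POINTS_PER_WEEK)  # flat profile: nothing to scale
    return (mean_week - mean_week.min()) / span


def profile_matrix(loads_by_user):
    """Stack one profile per user into a (users x 336) matrix."""
    return np.vstack([weekly_profile(load) for load in loads_by_user])


# Synthetic readings for three users over four weeks (illustrative data only).
rng = np.random.default_rng(0)
demo_loads = [rng.random(4 * POINTS_PER_WEEK) * scale for scale in (1.0, 5.0, 20.0)]
C = profile_matrix(demo_loads)
```

Because each profile is scaled independently, users with very different consumption levels become comparable by curve shape, which is what the clustering step operates on.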

Note that, in the step (2), the GRU network is composed of an input layer, an output layer and a hidden layer, wherein the hidden layer includes a plurality of cascaded GRU units; the GRU unit comprises a reset gate and an update gate, and the output and the memory information are controlled through a gate control mechanism.

In addition, in step (2), GRU networks with three structures are built for each load group. By controlling the network depth and the number of GRU units, the networks learn the low-frequency, medium-frequency and high-frequency characteristics of the load aggregate, that is, the load variation characteristics of different frequency domains, and the outputs of the three deep neural networks are finally fused by a random forest algorithm.

The invention has the advantages and positive effects that:

1. the invention provides grouping prediction of the load aggregate according to the load characteristics, and applies the distributed spectral clustering algorithm to the grouping clustering of the load aggregate, thereby improving the clustering precision and stability compared with the traditional K-means algorithm, and overcoming the defects of low calculation speed and large memory occupation of a single-machine spectral clustering algorithm; the model fusion idea is applied to the prediction of load aggregation, GRU deep neural networks with different structures are used as element models, dynamic modeling of time sequences is realized, a plurality of element models are fused through a random forest algorithm, the characteristics of different network structures can be fully utilized, and the load prediction precision is further improved;

2. compared with conventional methods such as BP and SVM, the prediction method of the invention with the combination of the grouping prediction and the model fusion has applicability to the prediction problem of load aggregate, and has higher prediction precision under the conditions of different load scales; the prediction time scale can be flexibly adjusted through a rolling prediction mode, and the load prediction method has higher prediction precision on the prediction scale of 30min-24 h.

Drawings

The technical solutions of the present invention will be described in further detail below with reference to the accompanying drawings and examples, but it should be understood that these drawings are designed for illustrative purposes only and thus do not limit the scope of the present invention. Furthermore, unless otherwise indicated, the drawings are intended to be illustrative of the structural configurations described herein and are not necessarily drawn to scale.

FIG. 1 is a load cluster diagram;

FIG. 2 is a graph showing the variation of DB indexes of a distributed spectral clustering algorithm and a K-means algorithm with the clustering number;

FIG. 3 is a diagram of three GRU network structures;

FIG. 4 is a diagram of a predictive architecture based on GRU network and model fusion;

FIG. 5 is a graph illustrating the comparison of prediction errors in different methods for different numbers of users;

FIG. 6 is a graph of prediction accuracy MAPE versus prediction time scale for four methods;

Detailed Description

First, it should be noted that the specific structures, features, advantages, etc. of the present invention will be specifically described below by way of example, but all the descriptions are for illustrative purposes only and should not be construed as limiting the present invention in any way. Furthermore, any individual technical features described or implicit in the embodiments mentioned herein may still be continued in any combination or subtraction between these technical features (or their equivalents) to obtain still further embodiments of the invention that may not be mentioned directly herein.

It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, and furthermore, the terms "comprises" and "having", and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

In the description of the present invention, it should be noted that the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc. indicate orientations and positional relationships that are conventionally used in the products of the present invention, and are used merely for convenience in describing the present invention and for simplicity in description, but do not indicate or imply that the devices or elements referred to must have a particular orientation, be constructed in a particular orientation, and be operated, and therefore, should not be construed as limiting the present invention. Furthermore, the terms "first," "second," "third," and the like are used solely to distinguish one from another and are not to be construed as indicating or implying relative importance.

It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.

The present invention will be specifically described with reference to fig. 1 to 6.

Example 1

As shown in FIGS. 1 to 6, the load aggregate grouping prediction method based on the gated recurrent unit network provided in this embodiment is illustrated by predicting load aggregates from the London smart meter dataset. The specific process is as follows:

First, load clustering by distributed spectral clustering

First, the load data $L_m$ of each user is averaged by week and scaled to the interval [0,1] by min-max normalization, which yields one load characteristic curve per user. All user characteristic curves are assembled into a matrix $C_{M \times T}$, and clustering $C_{M \times T}$ yields the power utilization groups. With the typical 30-minute sampling interval of smart meters, each user characteristic curve contains 48 × 7 = 336 points. The data dimensionality is therefore very high and the number of users to be analyzed is large, so the accuracy, speed and stability of traditional clustering methods cannot meet the analysis requirements. Spectral clustering overcomes the shortcomings of traditional algorithms such as K-means, which can only identify convex, spherically distributed data and may fall into a local optimum: it can cluster samples of arbitrary shape and converge to the global optimum, and in essence it converts the clustering problem into an optimal graph partitioning problem. The spectral clustering algorithm flow is as follows:

From the matrix $C_{M \times T}$, the Euclidean distance matrix $H_{m,n}$ between users is computed as:

$$H_{m,n} = \sqrt{\sum_{t=1}^{T} \left(C_{m,t} - C_{n,t}\right)^2} \qquad (1)$$

where $H_{m,n}$ denotes the Euclidean distance between the m-th and n-th users; the matrix is symmetric and its diagonal elements are 0.

A Gaussian function is used to construct the similarity matrix A from $H_{m,n}$:

In the formula, $\sigma_m$ and $\sigma_n$ are adaptive scale parameters. In actual operation, self-adaptation means that several values of the scale parameter $\sigma$ are specified in advance, spectral clustering is performed for each of them, and the $\sigma$ that gives the best clustering result is selected as the parameter.

Further, a Laplacian matrix L can be constructed:

$$L = D^{-1/2} A D^{-1/2} \qquad (4)$$

According to perturbation theory, the optimal number of classes is determined by computing the eigenvalues of the similarity matrix. If the optimal number of classes is k and the corresponding k eigenvectors are $X_1, X_2, \ldots, X_k$, the eigenvector matrix is $X = (X_1, X_2, \ldots, X_k)$. The K-means method is then applied to this feature matrix to obtain the final division into power utilization groups.
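The single-machine flow above (distance matrix, Gaussian similarity, normalized matrix, eigenvectors, K-means) can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions, not the patented algorithm: it uses one global scale parameter `sigma` with a `2 * sigma**2` denominator instead of the adaptive $\sigma_m$, $\sigma_n$, the number of groups `k` is passed in rather than derived from the eigenvalues, and a small deterministic K-means replaces a library implementation. The function name `spectral_groups` is hypothetical.

```python
import numpy as np


def spectral_groups(C, k, sigma=1.0, n_iter=20):
    """Cluster the rows of C (users x time points) into k groups."""
    # Pairwise Euclidean distances H[m, n] between user curves.
    diff = C[:, None, :] - C[None, :, :]
    H = np.sqrt((diff ** 2).sum(axis=2))
    # Gaussian similarity; a single global sigma is an assumption here.
    A = np.exp(-(H ** 2) / (2.0 * sigma ** 2))
    # Normalized matrix L = D^{-1/2} A D^{-1/2}, with D = diag(row sums of A).
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    # Eigenvectors of the k largest eigenvalues (eigh sorts ascending).
    _, vecs = np.linalg.eigh(L)
    X = vecs[:, -k:]
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    # Lloyd's K-means on the rows of X with farthest-point initialization.
    centers = [X[0]]
    for _ in range(1, k):
        dist = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        to_centers = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(to_centers, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels


# Two synthetic user groups with clearly different normalized profiles.
rng = np.random.default_rng(0)
low = 0.2 + 0.02 * rng.standard_normal((10, 336))
high = 0.8 + 0.02 * rng.standard_normal((10, 336))
labels = spectral_groups(np.vstack([low, high]), k=2)
```

On this toy data the first ten rows and the last ten rows end up in different groups. Real load profiles are far less separated, which is why the text selects the scale parameter by trying several candidate values.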

The spectral clustering algorithm spends most of its computation on calculating the similarity matrix and finding the k eigenvectors, and these steps also occupy the most storage. To overcome the inefficiency of the traditional spectral clustering algorithm, the distributed spectral clustering algorithm replaces the original similarity matrix with a nearest-neighbor sparse similarity matrix and computes the eigenvectors with a distributed computing framework based on MapReduce. First, n/p rows of the matrix are stored on each of the p nodes. In each map stage, all n/p data points are given the same key, and in the reduce stage each node computes the distance between its local data and the input $x_i$:

where $x_j$ denotes the local data of the node. To ensure that the resulting distance matrix is symmetric, two keys are set at the map stage, which return the row and column numbers and the corresponding distances respectively, so that the position of every element is determined. Parallelizing the computation across the nodes reduces the complexity of the problem from (6) to (7):

O(n²d+n²logt) (6)

O(n²d/p+(n²logt)/p) (7)
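The row-block idea behind the distributed distance step can be imitated in a single process, as sketched below. This is only an illustration of the data layout: it does not use MapReduce or Spark, the loop over blocks stands in for the p nodes, and keeping the t nearest neighbors of each point is an assumed reading of the nearest-neighbor sparse matrix. The names `block_distances` and `sparse_neighbor_distances` are hypothetical.

```python
import numpy as np


def block_distances(X, p):
    """Yield (row indices, distances to all points) for each of p row blocks."""
    for rows in np.array_split(np.arange(X.shape[0]), p):
        diff = X[rows][:, None, :] - X[None, :, :]
        yield rows, np.sqrt((diff ** 2).sum(axis=2))


def sparse_neighbor_distances(X, p, t):
    """Keep only each point's t nearest neighbors, stored symmetrically."""
    n = X.shape[0]
    S = np.zeros((n, n))
    for rows, dist in block_distances(X, p):
        # Column 0 of the sorted order is the point itself (distance 0).
        nearest = np.argsort(dist, axis=1)[:, 1 : t + 1]
        for local, i in enumerate(rows):
            for j in nearest[local]:
                S[i, j] = dist[local, j]
                S[j, i] = dist[local, j]  # mirror entry keeps S symmetric
    return S


# Twelve synthetic points split over three simulated nodes.
rng = np.random.default_rng(0)
points = rng.random((12, 5))
S = sparse_neighbor_distances(points, p=3, t=2)
```

Writing both `S[i, j]` and `S[j, i]` plays the role of the two keys mentioned in the text: each computed distance is placed at its row position and at its mirrored column position.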

The same parallelization step is applied to the similarity matrix calculation shown in formula (2), which yields a sparse similarity matrix. Computing the eigenvalues of the similarity matrix is computationally expensive and memory-intensive. Because the similarity matrix in spectral clustering is sparse, the PARPACK algorithm, which solves for eigenvalues in parallel, is adopted and deployed on all computing nodes. Parallel computation reduces the complexity of the eigenvalue calculation from formula (8) to formula (9):

O(m³)+(O(nm)+O(nt))×O(m-k) (8)

O(m³)+(O(mn/p)+O(nt/p))×O(m-k) (9)

Spectral clustering forms the groups by clustering the eigenvectors, and this clustering step is implemented with a parallelized K-means algorithm. Spark has a built-in distributed K-means algorithm that can be called through the MLlib package. With p the number of computing nodes, the theoretical computational complexity of the distributed K-means algorithm is only 1/p of that of the stand-alone version.

The 2676 users in the London smart meter dataset were clustered. As can be seen from FIG. 1, different user groups differ markedly, while users within the same group share common characteristics. Group (a) has relatively stable electricity use, with the overall load at a low level. Group (b) is the opposite: even its lowest load remains at a high level, indicating that these users have more equipment running for long periods. Groups (c) and (e) both have two distinct power utilization peaks, in the morning and in the evening, but differ in peak duration and peak size. Group (d) fluctuates strongly.

The computational performance of the proposed algorithm is analyzed below. The baseline is the traditional stand-alone K-means algorithm, and the experimental environment is an 8-node distributed computing cluster (Intel Xeon E7-8850v2 x 8,16G Registered DDR3 memory 16). The Davies-Bouldin (DB) index is used to evaluate the clustering results and is defined as:

$$DB = \frac{1}{k}\sum_{i=1}^{k}\max_{j \neq i}\frac{\bar{S}_i + \bar{S}_j}{\lVert w_i - w_j \rVert_2} \qquad (10)$$

where k is the number of clusters, $\bar{S}_i$ and $\bar{S}_j$ are the intra-class mean distances of classes i and j, and $w_i$, $w_j$ are the cluster centers of the two classes. A smaller DB value means a smaller intra-class distance and a larger inter-class distance, and hence a better clustering result.

In order to ensure comparability between clustering results, the clustering quantity is manually specified, and because the selection of an initial point can influence the clustering results, 10 times of clustering is repeated for each specified clustering quantity, and corresponding DB indexes are recorded. The change condition of the DB indexes of the two methods along with the clustering quantity is shown in FIG. 2, and it can be seen from FIG. 2 that the DB values of the spectral clustering algorithm are both smaller than the DB value of the K-means algorithm corresponding to different clustering quantities, which shows that the clustering effect of the spectral clustering algorithm is superior to that of the K-means algorithm.
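The DB index used in this comparison can be computed directly from its standard definition, as in the sketch below. It assumes Euclidean distance and takes the mean of each cluster as its center; the function name `davies_bouldin` and the four-point toy data are illustrative only.

```python
import numpy as np


def davies_bouldin(X, labels):
    """Davies-Bouldin index: lower values indicate tighter, better separated clusters."""
    labels = np.asarray(labels)
    ids = np.unique(labels)
    centers = np.array([X[labels == c].mean(axis=0) for c in ids])
    # Mean distance of each cluster's members to their own center.
    scatter = np.array([
        np.linalg.norm(X[labels == c] - centers[i], axis=1).mean()
        for i, c in enumerate(ids)
    ])
    k = len(ids)
    total = 0.0
    for i in range(k):
        # Worst-case similarity between cluster i and any other cluster.
        ratios = [
            (scatter[i] + scatter[j]) / np.linalg.norm(centers[i] - centers[j])
            for j in range(k) if j != i
        ]
        total += max(ratios)
    return total / k


# Four points forming two obvious pairs, labeled well and labeled badly.
X_demo = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
db_good = davies_bouldin(X_demo, [0, 0, 1, 1])
db_bad = davies_bouldin(X_demo, [0, 1, 0, 1])
```

The labeling that keeps each nearby pair together gives a much smaller index than the labeling that splits the pairs, which matches the reading that a smaller DB value indicates a better clustering.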

Second, prediction model design based on GRU network and model fusion

The GRU network consists of an input layer, an output layer and a hidden layer. Wherein the hidden layer comprises a plurality of cascaded GRU units.

The GRU unit comprises a reset gate $r_t$ and an update gate $z_t$; output, memory and other information are controlled through a gating mechanism, and a prediction is made at the current time step. The GRU workflow is as follows. At each moment, the update gate receives the current input $x_t$ and the hidden state $h_{t-1}$ from the previous moment; after a matrix operation on this input, the activation function determines whether the neuron is activated. Similarly, the reset gate also receives $x_t$ and $h_{t-1}$, and its result determines how much past information is forgotten. The current input is combined with the output of the reset gate and passed through an activation function to form the current memory content $h'_t$. The update gate then dynamically weights the current memory $h'_t$ against the previous hidden state $h_{t-1}$ to determine the final output $h_t$ of the gated unit, and $h_t$ is also passed to the next GRU unit. The relations between the variables are as follows:

$$z_t = \sigma\left(W^{(z)} x_t + U^{(z)} h_{t-1}\right) \qquad (11)$$

$$r_t = \sigma\left(W^{(r)} x_t + U^{(r)} h_{t-1}\right) \qquad (12)$$

In the formulas, $W^{(z)}$ and $U^{(z)}$ are the weights of the update gate, $W^{(r)}$ and $U^{(r)}$ are the weights of the reset gate, W and U are the network weights used when the current memory is formed, σ(x) is the sigmoid activation function, and tanh(x) is the tanh activation function.
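A single GRU time step can be written out in NumPy as a check on the gate equations. The update and reset gates follow equations (11) and (12). The candidate memory and the final mixing step are not reproduced in the text above, so the forms used here, tanh(W x_t + r_t ⊙ U h_{t-1}) and z_t ⊙ h_{t-1} + (1 - z_t) ⊙ h'_t, are one common GRU convention and should be read as an assumption. Bias terms are omitted and the weights are random placeholders.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gru_step(x_t, h_prev, params):
    """One GRU time step; returns the new hidden state h_t."""
    Wz, Uz, Wr, Ur, W, U = params
    z_t = sigmoid(Wz @ x_t + Uz @ h_prev)            # update gate, Eq. (11)
    r_t = sigmoid(Wr @ x_t + Ur @ h_prev)            # reset gate, Eq. (12)
    h_cand = np.tanh(W @ x_t + r_t * (U @ h_prev))   # candidate memory (assumed form)
    return z_t * h_prev + (1.0 - z_t) * h_cand       # final output (assumed form)


# Run a short random sequence through the cell.
rng = np.random.default_rng(0)
n_in, n_hidden = 3, 4
params = tuple(
    rng.normal(0.0, 0.5, size=shape)
    for shape in [(n_hidden, n_in), (n_hidden, n_hidden)] * 3
)
h = np.zeros(n_hidden)
for x_t in rng.random((6, n_in)):  # six time steps of synthetic input
    h = gru_step(x_t, h, params)
```

Since the new state is a gate-weighted average of the previous state and a tanh output, every component of `h` stays within (-1, 1) when the initial state is zero.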

After the GRU network is built, it is trained with the backpropagation through time (BPTT) algorithm. The present embodiment selects the Mean Absolute Error (MAE) of the predicted load value as the loss function:

Load fluctuation differs greatly between load groups. For a relatively stable load group, a shallow GRU network with many units works well, whereas for a strongly fluctuating load group, a multilayer stacked network is used to fully extract high-frequency characteristics. To adapt to the load characteristics of different groups, GRU networks with three structures are provided for each type of load group so as to fully learn the load variation characteristics of different frequency domains, and the outputs of the three deep neural networks are finally fused by a random forest algorithm. Multi-model fusion gives the prediction model stronger applicability.

Three types of GRU network structures are shown in fig. 3. By controlling the network depth and the number of GRU units, the low-frequency, medium-frequency and high-frequency characteristics of the load aggregate are learnt in a targeted manner, and finally model fusion is realized through random forests. The deep neural network input features selected herein are as follows:

(1) the load data vector E for the past k time steps, $E = \{e_{t-k}, \ldots, e_{t-2}, e_{t-1}\}$; k is set to 6 in this embodiment, i.e. prediction is based on the load change over the past 3 hours;

(2) the time index I of the point to be predicted: the 24 hours are divided into 48 prediction points, I ∈ {1, 2, ..., 48};

(3) the day of the week D of the point to be predicted, D ∈ {1, 2, ..., 7};

(4) the working day/holiday flag H, set to 1 for a working day and 0 for a holiday;

(5) the temperature T at the previous moment;

(6) the current weather type W: 13 weather types, such as clear and light rain, as specified by the London weather bureau, W ∈ {1, 2, ..., 13}.

Since the GRU network requires inputs between 0 and 1, the vectors E and T are processed by min-max normalization, and I, D, H and W are converted to one-hot encoded form. For a categorical variable J with M classes, the one-hot encoded J comprises M bits, of which only the bit corresponding to the class of J is set to 1.

The processed features form the feature matrix X:

$$X = \{E, I, D, H, T, W\} \qquad (16)$$
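The two preprocessing operations, min-max normalization and one-hot encoding, can be sketched in plain Python as below. The helper names `min_max` and `one_hot` are hypothetical, categories are assumed to be numbered from 1 as in the feature list, and in practice the minimum and maximum would normally be taken from the training data rather than from each vector.

```python
def min_max(values):
    """Scale a list of numbers to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant input: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(value, n_classes):
    """Encode a 1-based category index as n_classes bits with a single 1."""
    if not 1 <= value <= n_classes:
        raise ValueError("category index out of range")
    bits = [0] * n_classes
    bits[value - 1] = 1
    return bits


scaled_load = min_max([2.0, 4.0, 6.0])   # past load values
day_bits = one_hot(3, 7)                 # day of week D = 3
weather_bits = one_hot(13, 13)           # weather type W = 13
```

Here `scaled_load` is `[0.0, 0.5, 1.0]` and `day_bits` has a single 1 in the third of seven positions.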

All GRU networks were built with the Keras framework, using TensorFlow as the computational back end, and were trained and tested on an NVIDIA GTX 1060 6G GPU. During GRU training, gradient descent is carried out by an optimizer; commonly used optimizers include Adagrad, Adadelta, RMSprop and Adam. Adam is adopted here because it adapts the learning rate automatically and trains efficiently. For the random forest used for multi-model fusion, the number of CART regression trees is set to 50, the maximum tree depth is not limited, and the maximum number of features used by the random forest is set to 3. The random forest model is trained after the three GRU networks have been obtained, and its training data is the validation set $L_{val}$.
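The fusion step with the stated random forest settings might look like the following scikit-learn sketch. The text does not name the random forest library, so the use of `RandomForestRegressor` is an assumption, as is the choice to feed it only the three network outputs. The GRU outputs are replaced here by synthetic noisy copies of the target, purely to make the example self-contained.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic validation-set targets (illustrative data only).
y_val = rng.random(200)
# Stand-ins for the validation-set outputs of the three trained GRU networks.
gru_outputs = np.column_stack(
    [y_val + rng.normal(0.0, noise, 200) for noise in (0.05, 0.10, 0.20)]
)

# 50 regression trees, unlimited depth, at most 3 features per split.
fusion = RandomForestRegressor(
    n_estimators=50, max_depth=None, max_features=3, random_state=0
)
fusion.fit(gru_outputs, y_val)

importances = fusion.feature_importances_  # one weight per GRU network
fused = fusion.predict(gru_outputs[:5])    # fused predictions for five samples
```

The `feature_importances_` attribute sums to 1 and gives one plausible reading of the per-network importance coefficients reported for each load group.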

After the prediction model has been trained, a given load aggregate to be predicted is divided into i load groups according to the clustering result, and each load group is predicted with the structure shown in FIG. 4. Finally, the i predicted values $e_{i-t}$ are added to obtain the predicted value $E_t$ of the load aggregate at time t.
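The final aggregation is a plain sum of the group forecasts at each time step. A minimal sketch, with a hypothetical function name and made-up group forecasts:

```python
def aggregate_forecast(group_forecasts):
    """Sum per-group forecasts time step by time step."""
    series = list(group_forecasts.values())
    if len({len(s) for s in series}) != 1:
        raise ValueError("all group forecasts must cover the same time steps")
    return [sum(step) for step in zip(*series)]


# Forecasts for three groups over two time steps (illustrative values).
forecasts = {
    "group_a": [1.0, 2.0],
    "group_b": [0.5, 0.5],
    "group_c": [2.0, 1.0],
}
total = aggregate_forecast(forecasts)
```

With these values `total` is `[3.5, 3.5]`: each entry is the aggregate load predicted for one time step.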

Third, load aggregate prediction error analysis

The number of users M of the load aggregate to be predicted is set to 2676, and the load aggregate can be subdivided into 5 types of load groups. Each load group contains 7392 data samples; the output power matrix Es has dimension 7392 × 1 and the input feature matrix X has dimension 7392 × 75. The data is divided into a training set, a validation set and a test set in the ratio 8:1:1. The training set is used to train the GRU networks, the validation set is used to train the random forest model, which outputs the importance coefficients $w_k$, and the test set is used to test the performance of the final model. In addition to MAPE, the Mean Absolute Error (MAE) of the model is calculated as a supplementary index. The prediction performance of the model on the different load groups is shown in Table 1.

TABLE 1

As can be seen from Table 1, for different load groups, the difference of the importance of the three networks is large, and the prediction precision after model fusion is superior to that of any single model, so that the model fusion method provided by the invention can fully utilize the structural characteristics of different networks and realize the automatic distribution of the weights of the networks, thereby further improving the prediction precision.
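The 8:1:1 split and the two error indexes used in this analysis can be sketched as follows. The text does not say whether the split is chronological or random; the sketch assumes a chronological split, and the function names are hypothetical. MAPE is expressed in percent and assumes no actual value is zero.

```python
def split_8_1_1(samples):
    """Split samples in order into training, validation and test parts (8:1:1)."""
    n = len(samples)
    first, second = n * 8 // 10, n * 9 // 10
    return samples[:first], samples[first:second], samples[second:]


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


train, val, test = split_8_1_1(list(range(7392)))
demo_mae = mae([100.0, 200.0], [110.0, 180.0])
demo_mape = mape([100.0, 200.0], [110.0, 180.0])
```

For 7392 samples this yields parts of 5913, 739 and 740 samples. In the two-point example the MAE is 15 and the MAPE is 10 percent.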

To demonstrate the superiority of the method, a BP neural network (three layers with 128, 256 and 128 neurons), a Support Vector Machine (SVM) with a Gaussian kernel function and a Random Forest (RF) algorithm are each used to predict the load aggregate directly. The accuracy comparison of the four methods is shown in Table 2.

TABLE 2

As can be seen from Table 2, both the mean absolute percentage error (MAPE) and the mean absolute error (MAE) of the algorithm of this embodiment are lower than those of the BP neural network, the SVM and the random forest algorithm, which indicates that its accuracy is the highest among the four methods.

Fourth, comparison of predicted performance under different scale load aggregates

Load aggregates are divided flexibly, but in essence different divisions only affect the number of users in the aggregate. To verify the applicability of the algorithm to load aggregates of different scales, the number of users M is set to 500, 1000, 1500, 2000 and 2676 by random sampling, and the prediction accuracy of the method is compared with that of traditional methods. FIG. 5 shows how the prediction accuracy of the four methods changes with the number of users. As can be seen from FIG. 5, thanks to grouping prediction, dynamic time modeling and model fusion, the proposed method achieves the highest prediction accuracy at every load scale. In terms of the MAE of each method at different user scales, the proposed method has the smallest absolute error, its error changes little as the number of users increases, and its performance is stable, so it is well suited to load aggregates of different scales. The other three methods, the SVM in particular, perform well when the number of users is small, but their absolute error rises rapidly as the number of users increases, so their prediction performance is unstable and their applicability is poor.

Fifth, rolling prediction effect

The prediction time scale of the method can be adjusted flexibly through rolling prediction. With the number of users in the load aggregate still set to 2676, the prediction horizon is extended from 30 min to 24 h ahead, and the resulting change in MAPE for the four methods is shown in FIG. 6. The curves in the figure are the results of multiple linear fits to the scatter points. As can be seen from FIG. 6, the prediction accuracy of all four methods decreases as the prediction time scale increases, with the random forest algorithm and the SVM degrading markedly. Once the prediction horizon exceeds 10 hours, the errors of the two artificial neural network methods tend to stabilize, and the proposed method retains its accuracy advantage, which is more pronounced than in ultra-short-term prediction.
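Rolling prediction can be sketched as repeatedly feeding each one-step forecast back into the input window. The sketch below is a simplified illustration: it tracks only the load window of k = 6 past values, whereas the full model would also update the time, calendar, temperature and weather features at each step, and the `one_step_model` passed in is a stand-in for the trained group model.

```python
def rolling_forecast(history, one_step_model, steps, k=6):
    """Forecast `steps` points ahead by feeding predictions back as inputs."""
    window = list(history[-k:])
    forecast = []
    for _ in range(steps):
        next_value = one_step_model(window)
        forecast.append(next_value)
        # Slide the window: drop the oldest value, append the new prediction.
        window = window[1:] + [next_value]
    return forecast


def mean_model(window):
    """Placeholder one-step model: predicts the mean of the window."""
    return sum(window) / len(window)


# 48 half-hour steps correspond to a 24 h horizon.
day_ahead = rolling_forecast([1.0] * 6, mean_model, steps=48)
```

Because later steps depend on earlier predictions rather than on measurements, errors can accumulate along the horizon, which is consistent with the accuracy decline observed as the prediction time scale grows.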

In conclusion, the invention proposes grouping prediction of the load aggregate according to load characteristics and applies a distributed spectral clustering algorithm to the grouping of the load aggregate, which improves clustering accuracy and stability compared with the traditional K-means algorithm and overcomes the slow computation and large memory occupation of a single-machine spectral clustering algorithm. The idea of model fusion is applied to load aggregate prediction: GRU deep neural networks with different structures serve as base models to realize dynamic modeling of the time series, and the base models are fused by a random forest algorithm, so that the characteristics of the different network structures are fully utilized and the load prediction accuracy is further improved. Compared with conventional methods such as BP and SVM, the prediction method of the invention, which combines grouping prediction with model fusion, is well suited to the load aggregate prediction problem and achieves higher prediction accuracy at different load scales. The prediction time scale can be flexibly adjusted through rolling prediction, and the method maintains high prediction accuracy over prediction scales from 30 min to 24 h.

The present invention has been described in detail with reference to the above examples, but the description covers only preferred examples of the present invention and should not be construed as limiting its scope. All equivalent changes and modifications made within the scope of the claims of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A load aggregation grouping prediction method based on a gated cyclic unit network, characterized in that the grouping prediction method comprises the following steps:

(2) respectively building three GRU networks with different structures for load groups, training the GRU networks with the different structures by extracting the time sequence characteristics of the groups to obtain prediction models of the three GRU networks, and performing model fusion on the three GRU networks by a random forest algorithm to obtain a load prediction model of each group;

2. The load aggregation grouping prediction method based on the gated cyclic unit network according to claim 1, wherein: the process of obtaining the power utilization groups in step (1) comprises: averaging the load data of each user by week, applying max-min normalization to scale the data to the interval [0,1], obtaining a load characteristic curve for each user, integrating all the user characteristic curves into a matrix, and clustering the matrix to obtain the power utilization groups.

3. The load aggregation grouping prediction method based on the gated cyclic unit network according to claim 1, wherein: the GRU network in step (2) is composed of an input layer, an output layer and a hidden layer, wherein the hidden layer comprises a plurality of cascaded GRU units.

4. The load aggregation grouping prediction method based on the gated cyclic unit network according to claim 3, wherein: the GRU unit comprises a reset gate and an update gate, and the output and the memory information are controlled through a gating mechanism.

5. The load aggregation grouping prediction method based on the gated cyclic unit network according to claim 1, wherein: in step (2), GRU networks with three structures are respectively built for the load groups; by controlling the network depth and the number of GRU units, the networks learn the low-frequency, medium-frequency and high-frequency characteristics of the load aggregate, that is, the load change characteristics of different frequency domains; and finally the outputs of the three deep neural networks are fused by a random forest algorithm.
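For illustration only, the load characteristic matrix construction recited in claim 2 (weekly averaging followed by max-min scaling to [0,1]) can be sketched in Python as below. The sketch assumes that "averaging by week" means averaging all complete weeks point by point into one typical weekly curve; the function names and the `points_per_week` parameter are hypothetical (for example, 336 points per week if the data were sampled every 30 min). The clustering of the resulting matrix is not shown.

```python
def weekly_profile(load, points_per_week):
    """Average all complete weeks point by point into one weekly curve
    (assumed interpretation of 'averaging by week')."""
    weeks = len(load) // points_per_week
    if weeks == 0:
        raise ValueError("need at least one complete week of data")
    return [
        sum(load[w * points_per_week + i] for w in range(weeks)) / weeks
        for i in range(points_per_week)
    ]

def min_max_scale(curve):
    """Scale a curve to the interval [0, 1] by max-min normalization."""
    lo, hi = min(curve), max(curve)
    if hi == lo:
        # A flat curve has no range; map it to all zeros (a choice made here).
        return [0.0] * len(curve)
    return [(x - lo) / (hi - lo) for x in curve]

def load_feature_matrix(user_loads, points_per_week):
    """Stack each user's normalized weekly curve as one row of a matrix.

    user_loads: dict mapping a user id to a list of load values
    (hypothetical layout). Returns (user ids, matrix rows) in the same order.
    """
    users = sorted(user_loads)
    rows = [min_max_scale(weekly_profile(user_loads[u], points_per_week))
            for u in users]
    return users, rows
```

Each row of the returned matrix is one user's load characteristic curve; a clustering algorithm would then be applied to these rows to obtain the power utilization groups.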