CN111030889A - Network traffic prediction method based on GRU model - Google Patents


Info

Publication number
CN111030889A
Authority
CN
China
Prior art keywords
model
gru
data
neural network
network
Prior art date
Legal status
Granted
Application number
CN201911343425.1A
Other languages
Chinese (zh)
Other versions
CN111030889B (en)
Inventor
赵炜
尚立
杨会峰
李井泉
江明亮
王旭蕊
刘惠
纪春华
杨杨
郭少勇
喻鹏
Current Assignee
State Grid Corp of China SGCC
Beijing University of Posts and Telecommunications
Information and Telecommunication Branch of State Grid Hebei Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
Beijing University of Posts and Telecommunications
Information and Telecommunication Branch of State Grid Hebei Electric Power Co Ltd
Priority date
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, Beijing University of Posts and Telecommunications, Information and Telecommunication Branch of State Grid Hebei Electric Power Co Ltd filed Critical State Grid Corp of China SGCC
Priority to CN201911343425.1A priority Critical patent/CN111030889B/en
Publication of CN111030889A publication Critical patent/CN111030889A/en
Application granted granted Critical
Publication of CN111030889B publication Critical patent/CN111030889B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 43/00: Arrangements for monitoring or testing data switching networks
    • H04L 43/08: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L 43/0876: Network utilisation, e.g. volume of load or congestion level
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks


Abstract

The invention discloses a network traffic prediction method based on a GRU model, relating to the technical field of information communication. A network traffic data sequence is input into a GRU neural network model to complete the prediction of network traffic; this, among other measures, improves the accuracy and effectiveness of network traffic prediction.

Description

Network traffic prediction method based on GRU model
Technical Field
The invention relates to the technical field of information communication, in particular to a network traffic prediction method based on a GRU model.
Background
The electric power data communication network is a comprehensive wide-area transmission platform and an important component of the electric power information infrastructure. With the rapid development of power data networks, the network scale continues to grow, and sufficient, reliable information support is increasingly required to ensure safe and reliable operation. Predicting the network traffic of a power data network provides important information for its safe operation; in particular, it makes it possible to sense traffic anomalies and abnormal operating states in advance, thereby safeguarding operation, so it has significant research value and application prospects. In general, network traffic data is affected by a variety of complex, random factors, but is in essence nonlinear time-series data.
The characteristics of the modern Internet make network traffic prediction important for improving network efficiency, reliability and adaptability. In recent years, many scholars have studied network traffic prediction, and many prediction methods have been proposed. Current prediction models for network traffic include time-series models, neural network models, and the like. However, because the network traffic data sequence is influenced by various uncertain factors whose data are difficult to express, the sequence exhibits complex characteristics of strong nonlinearity and non-stationarity that traditional time-series models and neural network models struggle to handle. As a result, the accuracy of network traffic prediction with a simple prediction model is low, which affects the reasonable planning and allocation of the network.
Therefore, how to improve the accuracy of network traffic prediction to improve the reliability of the network is a problem to be solved by those skilled in the art.
To assess the state of the prior art, existing patents and literature were searched, compared and analyzed, and the following technical information with high relevance to the invention was screened out:
patent scheme 1: 201510793377.1 network flow prediction method based on flow trend
That invention provides a network traffic prediction method based on traffic trends. The method comprises the following steps: extracting the network traffic trend in a time period before the current time period; predicting the network traffic trend at a future moment according to the extracted trend; calculating the error between the extracted network traffic values and their trend, and predicting the traffic error; and predicting the network traffic value at the future moment from the predicted trend and the predicted traffic error. The invention greatly reduces the number of training samples required for traffic-error prediction and traffic estimation, saving training time; and the extracted network traffic trend not only highlights the periodic characteristics of the traffic in each time period, but also preserves its local structural characteristics.
Patent scheme 2: 201611249158.8 neural network based network flow prediction system and flow prediction method thereof
The invention provides a network traffic prediction method based on a BP (back propagation) neural network, which is based on the principle that data is normalized to enable a sample data value to be between 0 and 1, parameters of the BP neural network are initialized, the BP neural network is pre-trained and optimized by using a BP algorithm, and finally the trained BP neural network is used for prediction to obtain a prediction result. The method can not only extract the characteristics of the data, but also optimize the network by using the BP algorithm, thereby solving the problem of complex network structure and difficult training and improving the accuracy of flow prediction to a certain extent. The invention can monitor, detect and analyze various backbone networks, monitor and detect network abnormal events in the backbone networks in real time, and realize early warning of network abnormal conditions.
Patent scheme 3: 201810011664.6 flow prediction method based on neural network
The invention provides a neural network-based traffic prediction method, which is characterized in that computer data are sampled according to a set sampling time period, the window length of a training set is determined, and the abnormal traffic can be prevented and detected by matching the use of data sampling, data set setting, LSTM model training and data judgment. The method comprises the following steps: sampling computer data according to a set sampling time period; dividing a training set and a verification set; substituting into the LSTM model for model training and verification; and sampling the computer flow to be predicted and then bringing the sampled computer flow into a well-trained LSTM model for prediction. The invention can realize the prevention and detection of abnormal flow by matching with data sampling, data set setting, LSTM model training and data judgment, and has the characteristics of high automation degree, high detection speed and wide application range.
The defects of the above patent scheme 1: the scheme extracts the network traffic trend of a period before the current moment and predicts the network traffic trend of a period in the future according to real-time network traffic data; then, calculating the errors of the network flow and the network flow trend in the past period and predicting the future network flow errors; finally, predicting a future network flow predicted value according to the predicted network flow trend and the predicted network flow error; in the scheme, the predefined cycle time has an important influence on the prediction of the future network traffic, so that the accuracy of the predicted value is influenced, and the prediction scheme is more difficult to express a highly complex nonlinear sequence, so that the universality of the scheme is not high.
The defects of the above patent scheme 2: the scheme provides a network flow prediction method based on a BP neural network. The BP neural network is easy to establish and train, has certain expression capacity on complex data sequences, firstly performs data normalization, then performs pre-training on the BP neural network, optimizes the BP neural network by using a BP algorithm, and finally performs prediction by using the trained BP neural network to obtain prediction. In the scheme, the BP neural network is mainly adopted to predict the network traffic data, but the BP neural network has poor memorability to the traffic data, and the improvement of the traffic prediction precision is limited.
The defect of the above patent scheme 3: the scheme provides a neural-network-based traffic prediction method that samples computer data at a set sampling period, then trains and predicts with an LSTM model. However, the scheme uses only a single LSTM model, and although the LSTM model expresses nonlinear sequences well, in practice the gradient descent method has the defect that an overly large learning rate can skip past the optimum, so there is still room to improve the prediction accuracy.
Problems with the prior art and considerations:
how to solve the technical problem of improving the accuracy and the effect of predicting the network flow.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a network traffic prediction method based on a GRU model, which improves the accuracy and effect of network traffic prediction by inputting a network traffic data sequence into the GRU neural network model and completing the prediction of network traffic and the like.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows: a network flow prediction method based on a GRU model inputs a network flow data sequence into a GRU neural network model and completes the prediction of network flow.
The further technical scheme is as follows: the GRU-SGD model is a neural network model using the SGD gradient descent algorithm, the GRU-Adam model is a neural network model using the Adam gradient descent algorithm, and the GRU-AdaGrad model is a neural network model using the AdaGrad gradient descent algorithm; prediction is performed with the GRU-SGD, GRU-Adam and GRU-AdaGrad models respectively, and the data predicted by each model are added and averaged to obtain the predicted network traffic data.
The further technical scheme is as follows: specifically comprises steps S1-S5,
s1, acquiring historical network flow data;
s2, determining training data and verification data in the historical network traffic data;
s3, bringing the training data into each GRU neural network model for training;
s4, predicting the verification data through three GRU neural network models;
and S5, adding the predicted data and averaging to obtain the predicted network traffic data.
The further technical scheme is as follows: wherein the step of S3 specifically includes the step of S31,
s31, three GRU neural network models are provided, namely the GRU-SGD model, the GRU-Adam model and the GRU-AdaGrad model; the GRU-SGD model is a neural network model using the SGD gradient descent algorithm, the GRU-Adam model is a neural network model using the Adam gradient descent algorithm, and the GRU-AdaGrad model is a neural network model using the AdaGrad gradient descent algorithm; the training data are input into each GRU neural network model respectively and are first propagated forward in the GRU neural network model.
The further technical scheme is as follows: wherein the step of S3 further comprises the step of S32,
and S32, calculating the loss function at each time step.
The further technical scheme is as follows: wherein the step of S3 further comprises the step of S33,
and S33, using reverse chain-rule derivation and iterating successively until the loss function converges.
The further technical scheme is as follows: wherein the step of S3 further comprises the step of S34,
s34, the GRU-SGD model is updated by using an SGD gradient descent algorithm, the GRU-Adam model is updated by using an Adam gradient descent algorithm, and the GRU-AdaGrad model is updated by using an AdaGrad gradient descent algorithm.
The further technical scheme is as follows: wherein the step of S3 further comprises the step of S35,
and S35, repeating steps S31-S34 with continuous updating until the loss function is less than 0.2, at which point model training is finished.
The further technical scheme is as follows: wherein the step of S34 specifically includes steps S341 to S343,
s341, calculating the reduction amount of each parameter by using an SGD gradient reduction algorithm through the GRU-SGD model, and updating;
s342, calculating the reduction amount of each parameter by using an Adam gradient reduction algorithm through the GRU-Adam model, and updating;
and S343, calculating the reduction amount of each parameter by using an AdaGrad gradient reduction algorithm through the GRU-AdaGrad model, and updating.
The further technical scheme is as follows: the method runs on a server.
The beneficial effects produced by the above technical scheme are as follows:
firstly, the accuracy and the effect of network flow prediction are improved by inputting a network flow data sequence into a GRU neural network model and completing the prediction of network flow and the like.
Secondly, the GRU-SGD model, the GRU-Adam model and the GRU-AdaGrad model are used for prediction respectively, the data predicted by each model are added to obtain the average value to obtain the predicted network traffic data, and the accuracy and the effect of network traffic prediction are further improved.
See detailed description of the preferred embodiments.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a block diagram of a GRU neural network model in the present invention;
FIG. 3 is a graph comparing predicted flow rate and actual flow rate data in the present invention.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the application, its application, or uses. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application, but the present application may be practiced in other ways than those described herein, and it will be apparent to those of ordinary skill in the art that the present application is not limited to the specific embodiments disclosed below.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
The relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present application unless specifically stated otherwise. Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description. Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate. In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
In the description of the present application, it is to be understood that the orientation or positional relationship indicated by the directional terms such as "front, rear, upper, lower, left, right", "lateral, vertical, horizontal" and "top, bottom", etc., are generally based on the orientation or positional relationship shown in the drawings, and are used for convenience of description and simplicity of description only, and in the case of not making a reverse description, these directional terms do not indicate and imply that the device or element being referred to must have a particular orientation or be constructed and operated in a particular orientation, and therefore, should not be considered as limiting the scope of the present application; the terms "inner and outer" refer to the inner and outer relative to the profile of the respective component itself.
Spatially relative terms, such as "over", "above", "on" and the like, may be used herein for ease of description to describe one device's or feature's spatial relationship to another device or feature as illustrated in the figures. It will be understood that the spatially relative terms are intended to encompass different orientations of the device in use or operation in addition to the orientation depicted in the figures. For example, if a device in the figures is turned over, devices described as "above" or "over" other devices or configurations would then be oriented "below" or "under" the other devices or configurations. Thus, the exemplary term "above" can encompass both an orientation of "above" and one of "below". The device may be otherwise oriented (rotated 90 degrees or at other orientations) and the spatially relative descriptors used herein interpreted accordingly.
It should be noted that the terms "first", "second", and the like are used to define the components, and are only used for convenience of distinguishing the corresponding components, and the terms have no special meanings unless otherwise stated, and therefore, the scope of protection of the present application is not to be construed as being limited.
As shown in fig. 1, the present invention discloses a network traffic prediction method based on a GRU model, which includes steps S1-S5, and the network traffic data sequence is input into the GRU neural network model and the prediction of the network traffic is completed, specifically as follows:
the GRU-Adam neural network model is a neural network model using an SGD gradient descent algorithm, the GRU-Adam model is a neural network model using an Adam gradient descent algorithm, the GRU-AdaGrad model is a neural network model using an AdaGrad gradient descent algorithm, the GRU-AdaGrad model is predicted by respectively using the GRU-SGD model, the GRU-Adam model and the GRU-AdaGrad model, and data predicted by each model are added to calculate an average value to obtain predicted network flow data.
And S1, acquiring historical network traffic data.
And S2, determining training data and verification data in the historical network traffic data.
And S3, carrying the training data into each GRU neural network model for training.
S31, three GRU neural network models are provided, namely the GRU-SGD model, the GRU-Adam model and the GRU-AdaGrad model; the GRU-SGD model is a neural network model using the SGD gradient descent algorithm, the GRU-Adam model is a neural network model using the Adam gradient descent algorithm, and the GRU-AdaGrad model is a neural network model using the AdaGrad gradient descent algorithm; the training data are input into each GRU neural network model respectively and are first propagated forward in the GRU neural network model.
And S32, calculating the loss function at each time step.
And S33, using reverse chain-rule derivation and iterating successively until the loss function converges.
S34, the GRU-SGD model is updated by using an SGD gradient descent algorithm, the GRU-Adam model is updated by using an Adam gradient descent algorithm, and the GRU-AdaGrad model is updated by using an AdaGrad gradient descent algorithm.
S341, the GRU-SGD model calculates the reduction amount of each parameter by using an SGD gradient reduction algorithm, and updates the reduction amount.
And S342, calculating the reduction amount of each parameter by using an Adam gradient reduction algorithm through the GRU-Adam model, and updating.
And S343, calculating the reduction amount of each parameter by using an AdaGrad gradient reduction algorithm through the GRU-AdaGrad model, and updating.
And S35, repeating the steps S31-S34, continuously updating, stopping until the loss function is less than 0.2, and finishing the model training.
And S4, predicting the verification data through three GRU neural network models.
And S5, adding the predicted data and averaging to obtain the predicted network traffic data.
The GRU neural network model and the SGD, Adam and AdaGrad gradient descent algorithms are prior art and are not described in detail here.
Description of the drawings:
first, the variables used in the GRU-based network traffic prediction method need to be explained. The variables used are as follows:
$z_t$: the update gate at time $t$;
$r_t$: the reset gate at time $t$;
$\tilde{h}_t$: the stored (memory) information at time $t$;
$h_t$: the output information of the GRU unit at time $t$;
$y_t$: the final output information at time $t$;
$\Delta\theta_t$: the gradient-descent decrement of a parameter.
The GRU-based network traffic prediction method comprises: inputting a network traffic data sequence into GRU neural networks, training different GRU neural network models with different gradient descent algorithms, and finally adding and averaging the data predicted by the models to obtain the predicted network traffic data. The solution according to the invention is explained in detail below with reference to fig. 1, using the variables defined above.
As shown in fig. 1, the steps are described as follows:
s1, acquiring historical network flow data;
s2, determining training data and verification data in the historical network traffic data;
s3, bringing the training data into a plurality of GRU models for training;
s4, predicting the verification data by a plurality of GRU neural network models;
and S5, adding the predicted data and averaging to obtain the predicted network traffic data.
Wherein, step S3 specifically includes:
and S31, inputting the training data x (t) into the GRU neural network model, wherein the GRU neural network model comprises three GRU models, namely GRU-SGD, GRU-Adam and GRU-AdaGrad, and the difference is that the used gradient descent algorithms are different. The training data is first propagated forward in the GRU neural unit.
The GRU consists of an update gate and a reset gate. The update gate $z_t$ at time step $t$ is computed as:

$z_t = \sigma(W_{hz} \cdot h_{t-1} + W_{xz} \cdot x_t)$ (1)

where $x_t$ is the input vector at the $t$-th time step, $h_{t-1}$ is the information of the previous time step $t-1$, $W_{xz}$ is the input weight matrix, $W_{hz}$ is the recurrent (update) matrix, and $\sigma$ is the sigmoid activation function, which compresses information to between 0 and 1. Its formula and derivative are:

$\sigma(z) = \dfrac{1}{1 + e^{-z}}$ (2)

$\sigma'(z) = y(1 - y)$ (3)
as shown in fig. 2, the refresh gate primarily determines how much information of the past time step can be retained until the subsequent time step.
The reset gate mainly determines how much past time-step information is forgotten. The reset gate $r_t$ is computed as:

$r_t = \sigma(W_{hr} \cdot h_{t-1} + W_{xr} \cdot x_t)$ (4)

where $W_{xr}$ is the input weight matrix and $W_{hr}$ is the recurrent (update) matrix. The memory information $\tilde{h}_t$ stores the information of past time steps through the reset gate and is computed as:

$\tilde{h}_t = \tanh(W_{hc} \cdot (r_t \odot h_{t-1}) + W_{xc} \cdot x_t)$ (5)

where $W_{xc}$ is the input weight matrix, $W_{hc}$ is the recurrent (update) matrix, $\odot$ is the Hadamard product, and $\tanh$ is the activation function, whose formula and derivative are:

$\tanh(x) = \dfrac{e^x - e^{-x}}{e^x + e^{-x}}$ (6)

$\tanh'(x) = 1 - \tanh^2(x)$ (7)
The GRU output unit $h_t$:

$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$ (8)

where $h_{t-1}$ is the output unit information of the last time step, $z_t$ is the update gate information, and $\tilde{h}_t$ is the memory information. The final output value $y_t$ passes through the sigmoid activation function again, with $W_o$ as a weight matrix:

$y_t = \sigma(W_o \cdot h_t)$ (9)
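As an illustration, the forward pass of a single GRU unit described above can be sketched in NumPy. This is a minimal sketch with made-up dimensions and random weights, not the patented implementation; biases are omitted, as in the formulas above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, W):
    """One GRU forward step; the keys of W mirror the symbols in the text."""
    z_t = sigmoid(W["hz"] @ h_prev + W["xz"] @ x_t)              # update gate z_t
    r_t = sigmoid(W["hr"] @ h_prev + W["xr"] @ x_t)              # reset gate r_t
    h_tilde = np.tanh(W["hc"] @ (r_t * h_prev) + W["xc"] @ x_t)  # memory information
    h_t = (1.0 - z_t) * h_prev + z_t * h_tilde                   # GRU output unit
    y_t = sigmoid(W["o"] @ h_t)                                  # final output y_t
    return h_t, y_t

# Toy dimensions: 1-dimensional traffic input, hidden size 4.
rng = np.random.default_rng(0)
W = {k: rng.normal(scale=0.1, size=(4, 4)) for k in ("hz", "hr", "hc")}
W.update({k: rng.normal(scale=0.1, size=(4, 1)) for k in ("xz", "xr", "xc")})
W["o"] = rng.normal(scale=0.1, size=(1, 4))

h = np.zeros(4)
for x in [0.2, 0.5, 0.3]:            # a tiny normalized traffic sequence
    h, y = gru_step(np.array([x]), h, W)
```

The three models in the patent share this forward pass and differ only in the gradient descent algorithm used in the update step.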
s32, the formulas in the forward-propagation process show that the parameters to be learned are $W_{hz}$, $W_{xz}$, $W_{hr}$, $W_{xr}$, $W_{hc}$, $W_{xc}$ and $W_o$. The final output of the output layer is $y_t$, and the loss function at a given time is:

$E_t = \frac{1}{2}(y_d - y_t)^2$ (10)

where $y_d$ is the true value. The loss of a single sequence is then:

$E = \sum_t E_t$ (11)
s33, using reverse chain-rule derivation, the derivative of the loss with respect to each parameter $W$ is obtained step by step:

$\dfrac{\partial E}{\partial W_o} = \sum_t \delta_{y,t} \cdot h_t$ (12)

$\dfrac{\partial E}{\partial W_{hz}} = \sum_t \delta_{z,t} \cdot h_{t-1}$ (13)

$\dfrac{\partial E}{\partial W_{xz}} = \sum_t \delta_{z,t} \cdot x_t$ (14)

$\dfrac{\partial E}{\partial W_{hr}} = \sum_t \delta_{r,t} \cdot h_{t-1}$ (15)

$\dfrac{\partial E}{\partial W_{xr}} = \sum_t \delta_{r,t} \cdot x_t$ (16)

$\dfrac{\partial E}{\partial W_{hc}} = \sum_t \delta_{\tilde{h},t} \cdot (r_t \odot h_{t-1})$ (17)

$\dfrac{\partial E}{\partial W_{xc}} = \sum_t \delta_{\tilde{h},t} \cdot x_t$ (18)

wherein the intermediate parameters are:

$\delta_{y,t} = (y_d - y_t) \cdot \sigma'$ (19)

$\delta_{h,t} = \delta_{y,t} W_o + \delta_{z,t+1} W_{hz} + \delta_{\tilde{h},t+1} W_{hc} \cdot r_{t+1} + \delta_{r,t+1} W_{hr} + \delta_{h,t+1} \cdot (1 - z_{t+1})$ (20)

$\delta_{z,t} = \delta_{h,t} \cdot (\tilde{h}_t - h_{t-1}) \cdot \sigma'$ (21)

$\delta_{\tilde{h},t} = \delta_{h,t} \cdot z_t \cdot \tanh'$ (22)

$\delta_{r,t} = \delta_{\tilde{h},t} \cdot (W_{hc} \cdot h_{t-1}) \cdot \sigma'$ (23)

After the partial derivative with respect to each parameter is calculated, the parameters can be updated, and the iterations are performed until the loss function converges.
And S34, updating parameters by adopting different gradient descent algorithms respectively for the three different GRU network models.
Wherein, the step of S34 includes:
s341, the GRU-SGD model calculates the decrement $\Delta\theta_t$ of each parameter by using the SGD gradient descent algorithm, thereby updating the parameter $W$:

$g_t = \nabla_\theta E(\theta_{t-1})$ (24)

$\Delta\theta_t = -\eta \cdot g_t$ (25)

where $g_t$ is the gradient of the weight, $\eta$ is the learning rate, and $\Delta\theta_t$ is the amount by which the parameter $W$ is decreased.
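The SGD decrement above can be sketched in plain Python (illustrative values only, for a single scalar parameter):

```python
def sgd_delta(g_t, lr):
    # SGD: the decrement is minus the learning rate times the gradient.
    return -lr * g_t

w = 0.5                          # a single illustrative parameter
w = w + sgd_delta(0.2, lr=0.1)   # gradient 0.2 -> w decreases to 0.48
```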
S342, the GRU-Adam model calculates the decrement $\Delta\theta_t$ of each parameter by using the Adam gradient descent algorithm, thereby updating the parameter $W$.
The Adam algorithm adjusts the learning rate of each parameter through first-order and second-order moment estimates of the gradient. Adam corrects the bias of the first-order and second-order moment estimates, so that the learning rate has a stable range in each iteration and the parameters change smoothly:

$m_t = \mu \cdot m_{t-1} + (1 - \mu) \cdot g_t$ (26)

$n_t = \nu \cdot n_{t-1} + (1 - \nu) \cdot g_t^2$ (27)

$\hat{m}_t = \dfrac{m_t}{1 - \mu^t}$ (28)

$\hat{n}_t = \dfrac{n_t}{1 - \nu^t}$ (29)

$\Delta\theta_t = -\eta \cdot \dfrac{\hat{m}_t}{\sqrt{\hat{n}_t} + \epsilon}$ (30)

where $g_t$ is the gradient of the weight, $m_t$ and $n_t$ are the first-order and second-order moment estimates of the parameter partial derivatives, $\mu$ and $\nu$ are exponential decay rates in $[0,1)$, typically 0.9 and 0.999, $\hat{m}_t$ and $\hat{n}_t$ are the bias-corrected values, $\Delta\theta_t$ is the amount by which the parameter $W$ is decreased, and $\eta$ is the learning rate.
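The Adam decrement can be sketched as follows, again for a single scalar parameter; the variable names mirror the symbols in the text and the concrete values are illustrative:

```python
import math

def adam_delta(g_t, state, lr=0.01, mu=0.9, nu=0.999, eps=1e-8):
    # First- and second-order moment estimates of the gradient.
    state["t"] += 1
    state["m"] = mu * state["m"] + (1 - mu) * g_t
    state["n"] = nu * state["n"] + (1 - nu) * g_t ** 2
    # Bias correction keeps the step size stable in early iterations.
    m_hat = state["m"] / (1 - mu ** state["t"])
    n_hat = state["n"] / (1 - nu ** state["t"])
    return -lr * m_hat / (math.sqrt(n_hat) + eps)

state = {"m": 0.0, "n": 0.0, "t": 0}
w = 0.5
for _ in range(3):                 # three identical gradients of 0.2
    w += adam_delta(0.2, state)
```

Note how the bias-corrected step stays close to the learning rate regardless of the raw gradient scale.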
S343, the GRU-AdaGrad model calculates the decrement $\Delta\theta_t$ of each parameter by using the AdaGrad gradient descent algorithm, thereby updating the parameter $W$.
The AdaGrad algorithm forms a constraint term by recursion: in the early stage, when $g_t$ is small, the constraint term is large and amplifies the gradient; in the later stage, when $g_t$ is large, the constraint term is small and constrains the gradient:

$n_t = n_{t-1} + g_t^2$ (31)

$\Delta\theta_t = -\dfrac{\eta}{\sqrt{n_t + \epsilon}} \cdot g_t$ (32)

where $g_t$ is the weight gradient, $n_t$ is the accumulated second-order moment estimate of the weight gradient, $\Delta\theta_t$ is the amount by which the weight is decreased, $\eta$ is the learning rate, and $\epsilon$ ensures the denominator is not 0.
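A corresponding sketch of the AdaGrad decrement, showing how the accumulated squared gradients shrink later steps (illustrative values):

```python
import math

def adagrad_delta(g_t, state, lr=0.1, eps=1e-8):
    # Accumulate squared gradients; the growing sum constrains later steps.
    state["n"] += g_t ** 2
    return -lr / math.sqrt(state["n"] + eps) * g_t

state = {"n": 0.0}
d1 = adagrad_delta(0.2, state)
d2 = adagrad_delta(0.2, state)   # same gradient, but a smaller step than d1
```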
And S35, repeating steps S31-S34 and continuously updating the parameters $W$ until the loss function $E$ is less than 0.2, at which point model training is finished.
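The training loop of steps S31-S35 can be illustrated with the following skeleton. The ScalarModel here is a deliberately tiny stand-in (a one-parameter linear model with an SGD update), not the GRU networks themselves; it only demonstrates the loop structure of forward pass, loss, backward pass, and update, repeated until the loss falls below 0.2:

```python
class ScalarModel:
    """Toy stand-in for one model: y = w * x, trained with SGD."""
    def __init__(self):
        self.w, self.x, self.err = 0.0, 0.0, 0.0
    def forward(self, x):
        self.x = x
        return self.w * x
    def backward(self, y_true):
        self.err = self.w * self.x - y_true     # dE/dy for squared loss
    def update(self, lr=0.1):
        self.w -= lr * self.err * self.x        # SGD step on dE/dw

def train(model, sequences, max_epochs=100, loss_threshold=0.2):
    loss = float("inf")
    epoch = 0
    while loss >= loss_threshold and epoch < max_epochs:
        loss = 0.0
        for x, y_true in sequences:
            y_pred = model.forward(x)             # S31: forward propagation
            loss += 0.5 * (y_true - y_pred) ** 2  # S32: loss at each step
            model.backward(y_true)                # S33: chain-rule gradient
            model.update()                        # S34: optimizer update
        epoch += 1                                # S35: repeat until loss < 0.2
    return loss

final_loss = train(ScalarModel(), [(1.0, 0.8), (2.0, 1.6)])
```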
Example data for the present invention are illustrated below:
S1, in the embodiment of the present invention, 14776 pieces of network traffic sequence data are collected as the data set.
S2, in the embodiment of the present invention, the first 12000 pieces of data in the network traffic sequence data set are used as a training set train (t), and the last 2776 pieces of data are used as a verification set val (t).
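The split in S1/S2 (14776 samples, the first 12000 for training and the last 2776 for validation) can be sketched as follows; the synthetic series here is a hypothetical stand-in for the collected traffic data:

```python
import numpy as np

# Hypothetical stand-in for the 14776 collected traffic samples.
rng = np.random.default_rng(0)
flow = rng.random(14776).astype(np.float32)

train = flow[:12000]    # train(t): the first 12000 samples
val = flow[12000:]      # val(t):   the remaining 2776 samples
```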
S3, in the embodiment of the invention, the training set sequence train(t) is input into three GRU neural network models for training. Each GRU neural network comprises 32 GRU units, the random batch size is set to 128, and training is carried out for 100 rounds; the GRU-SGD model performs gradient descent with SGD, the GRU-Adam model with Adam, and the GRU-AdaGrad model with AdaGrad. After training is finished, the trained models model_sgd, model_Adam and model_AdaGrad are obtained.
S4, the verification set data val(t) are input into the trained GRU neural network models model_sgd, model_Adam and model_AdaGrad, which output the predicted data pre_sgd(t), pre_Adam(t) and pre_AdaGrad(t).
S5, the predicted data of the three models are added and averaged to obtain the final predicted network traffic data:
pre(t) = (pre_sgd(t) + pre_Adam(t) + pre_AdaGrad(t)) / 3
As shown in fig. 3, the predicted flow value pre (t) is compared with the actual flow value val (t).
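The averaging in S5 can be sketched as follows; a minimal NumPy helper computing pre(t) as the mean of the three model outputs:

```python
import numpy as np

def ensemble_average(pre_sgd, pre_adam, pre_adagrad):
    """pre(t) = (pre_sgd(t) + pre_Adam(t) + pre_AdaGrad(t)) / 3."""
    preds = np.stack([np.asarray(pre_sgd),
                      np.asarray(pre_adam),
                      np.asarray(pre_adagrad)])
    return preds.mean(axis=0)
```

The elementwise mean is what makes the three differently-optimized models complement each other across scenarios.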
The purpose of the invention is as follows:
Network traffic prediction aims to accurately forecast traffic changes in a future network and to provide reliable data for network planning and maintenance. Most existing network traffic prediction models construct either a linear mathematical model or a neural network model, and such purely linear or nonlinear treatments are one-sided, so prediction accuracy and real-time performance are difficult to guarantee. To solve these problems, this patent provides a GRU-based network traffic prediction method. The network traffic data sequence is input into several constructed GRU network models, each adopting a different gradient descent algorithm; each GRU neural network model predicts separately, and the predictions are then added and averaged to obtain the predicted network traffic data. The invention integrates the GRU neural network into network traffic prediction because the GRU network has good memory and expressive capability for time series. Ordinary network traffic data are influenced by many factors that are difficult to express: the sequence is highly nonlinear and non-stationary, with different characteristics under different network environment conditions. The disclosed GRU-based method predicts the data with GRU neural network models that adopt different gradient descent algorithms, and finally adds the predictions and computes the average to obtain the predicted network traffic data. The aim is to overcome the defects of the prior art and further improve the accuracy of network traffic data sequence prediction.
The technical contribution of the invention is as follows:
Network traffic prediction is widely applied in many fields of networking. The network traffic data sequence is inherently a nonlinear time series, and the influence of many uncertain factors makes it highly unstable and difficult to characterize, which in turn makes planning and maintaining future networks difficult. For this reason, network traffic prediction is of great importance. The invention provides a GRU-based network traffic prediction method. Compared with prior work, the main contributions of the invention lie in the following aspects:
(1) the invention predicts the network flow sequence by utilizing the GRU neural network algorithm, and the GRU neural network has memory and can make better prediction on the nonlinear time sequence.
(2) In order to keep good prediction accuracy when predicting network traffic under different situations, the proposed method adds and averages the predictions of multiple models: three GRU models predict using different gradient descent algorithms, and the final prediction is the value obtained by summing the three predictions and averaging.
Description of the effects of the invention:
the method utilizes the GRU neural network to predict the network traffic data sequence, simultaneously adopts three GRU models with different gradient descent algorithms, and finally obtains the final predicted network traffic data by summing and averaging the predicted values of the three models.
The invention can memorize the change rules of past network traffic data through the GRU neural network. The GRU network is relatively simple, occupies few resources, and has a strong ability to express nonlinear network traffic sequences; predicting with each model separately further improves the prediction effect.
The invention adopts three GRU models with different gradient descent methods, and aims to adapt to prediction in different scenes.

Claims (10)

1. A network traffic prediction method based on a GRU model, characterized in that: a network traffic data sequence is input into the GRU neural network model to complete the prediction of network traffic.
2. The method of claim 1, wherein the method comprises the following steps: the GRU-SGD model is a neural network model using the SGD gradient descent algorithm, the GRU-Adam model is a neural network model using the Adam gradient descent algorithm, and the GRU-AdaGrad model is a neural network model using the AdaGrad gradient descent algorithm; prediction is performed with the GRU-SGD model, the GRU-Adam model and the GRU-AdaGrad model respectively, and the data predicted by each model are added and the average is calculated to obtain the predicted network traffic data.
3. The method of claim 1, wherein the method comprises the following steps: specifically comprises steps S1-S5,
S1, acquiring historical network flow data;
S2, determining training data and verification data in the historical network traffic data;
S3, inputting the training data into each GRU neural network model for training;
S4, predicting the verification data through the three GRU neural network models;
S5, adding the predicted data and averaging to obtain the predicted network traffic data.
4. The method of claim 3, wherein the method comprises the following steps: wherein the step of S3 specifically includes the step of S31,
S31, three GRU neural network models are provided, namely a GRU-SGD model, a GRU-Adam model and a GRU-AdaGrad model; the GRU-SGD model is a neural network model using the SGD gradient descent algorithm, the GRU-Adam model is a neural network model using the Adam gradient descent algorithm, and the GRU-AdaGrad model is a neural network model using the AdaGrad gradient descent algorithm; the training data are respectively input into each GRU neural network model and first propagated forward in the GRU neural network model.
5. The method of claim 4, wherein the GRU model-based network traffic prediction method comprises: wherein the step of S3 further comprises the step of S32,
S32, calculating the loss function at time t.
6. The method of claim 5, wherein the GRU model-based network traffic prediction method comprises: wherein the step of S3 further comprises the step of S33,
S33, performing backward chain-rule differentiation and iterating in turn until the loss function converges.
7. The method of claim 6, wherein the GRU model-based network traffic prediction method comprises: wherein the step of S3 further comprises the step of S34,
s34, the GRU-SGD model is updated by using an SGD gradient descent algorithm, the GRU-Adam model is updated by using an Adam gradient descent algorithm, and the GRU-AdaGrad model is updated by using an AdaGrad gradient descent algorithm.
8. The method of claim 7, wherein the method comprises: wherein the step of S3 further comprises the step of S35,
S35, repeating steps S31 to S34 and continuously updating until the loss function is less than 0.2, whereupon the model training is finished.
9. The method of claim 7, wherein the method comprises: wherein the step of S34 specifically includes steps S341 to S343,
S341, the GRU-SGD model calculates the reduction amount of each parameter by using the SGD gradient descent algorithm, and updates accordingly;
S342, the GRU-Adam model calculates the reduction amount of each parameter by using the Adam gradient descent algorithm, and updates accordingly;
S343, the GRU-AdaGrad model calculates the reduction amount of each parameter by using the AdaGrad gradient descent algorithm, and updates accordingly.
10. The method for predicting network traffic based on the GRU model according to any one of claims 1 to 9, characterized in that: the method runs on a server.
CN201911343425.1A 2019-12-24 2019-12-24 Network traffic prediction method based on GRU model Active CN111030889B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911343425.1A CN111030889B (en) 2019-12-24 2019-12-24 Network traffic prediction method based on GRU model


Publications (2)

Publication Number Publication Date
CN111030889A true CN111030889A (en) 2020-04-17
CN111030889B CN111030889B (en) 2022-11-01

Family

ID=70211860

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911343425.1A Active CN111030889B (en) 2019-12-24 2019-12-24 Network traffic prediction method based on GRU model

Country Status (1)

Country Link
CN (1) CN111030889B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111932010A (en) * 2020-08-10 2020-11-13 重庆大学 Shared bicycle flow prediction method based on riding context information
CN111970206A (en) * 2020-08-21 2020-11-20 北京浪潮数据技术有限公司 FC network flow control method, device and related components
CN113094860A (en) * 2021-04-29 2021-07-09 北京邮电大学 Industrial control network flow modeling method based on attention mechanism
CN113746696A (en) * 2021-08-02 2021-12-03 中移(杭州)信息技术有限公司 Network flow prediction method, equipment, storage medium and device
CN117060984A (en) * 2023-10-08 2023-11-14 中国人民解放军战略支援部队航天工程大学 Satellite network flow prediction method based on empirical mode decomposition and BP neural network
EP4320918A4 (en) * 2021-10-11 2024-09-04 Samsung Electronics Co Ltd Method and device of communication traffic prediction via continual learning with knowledge distillation

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109062901A (en) * 2018-08-14 2018-12-21 第四范式(北京)技术有限公司 Neural network training method and device and name entity recognition method and device
CN109325624A (en) * 2018-09-28 2019-02-12 国网福建省电力有限公司 A kind of monthly electric power demand forecasting method based on deep learning
CN109799533A (en) * 2018-12-28 2019-05-24 中国石油化工股份有限公司 A kind of method for predicting reservoir based on bidirectional circulating neural network
CN109816095A (en) * 2019-01-14 2019-05-28 湖南大学 Based on the network flow prediction method for improving gating cycle neural network
CN109889391A (en) * 2019-03-13 2019-06-14 南京理工大学 A kind of network short term traffic forecasting method based on built-up pattern
WO2019208998A1 (en) * 2018-04-27 2019-10-31 한국과학기술원 Gru-based cell structure design robust to missing data and noise in time series data in recurrent neural network



Also Published As

Publication number Publication date
CN111030889B (en) 2022-11-01

Similar Documents

Publication Publication Date Title
CN111030889B (en) Network traffic prediction method based on GRU model
CN110738360B (en) Method and system for predicting residual life of equipment
CN110705692B (en) Nonlinear dynamic industrial process product prediction method of space-time attention network
CN110245801A (en) A kind of Methods of electric load forecasting and system based on combination mining model
CN116757534B (en) Intelligent refrigerator reliability analysis method based on neural training network
CN110956260A (en) System and method for neural architecture search
CN107480440A (en) A kind of method for predicting residual useful life for modeling of being degenerated at random based on two benches
CN111815053B (en) Prediction method and system for industrial time sequence data
CN108879732B (en) Transient stability evaluation method and device for power system
CN116303786B (en) Block chain financial big data management system based on multidimensional data fusion algorithm
CN115587666A (en) Load prediction method and system based on seasonal trend decomposition and hybrid neural network
CN113449919B (en) Power consumption prediction method and system based on feature and trend perception
CN114166509A (en) Motor bearing fault prediction method
CN114694379B (en) Traffic flow prediction method and system based on self-adaptive dynamic graph convolution
CN115308558A (en) Method and device for predicting service life of CMOS (complementary Metal oxide semiconductor) device, electronic equipment and medium
CN115600105A (en) Water body missing data interpolation method and device based on MIC-LSTM
CN113095484A (en) Stock price prediction method based on LSTM neural network
CN114330815A (en) Ultra-short-term wind power prediction method and system based on improved GOA (generic object oriented architecture) optimized LSTM (least Square TM)
CN116628444A (en) Water quality early warning method based on improved meta-learning
CN115936236A (en) Method, system, equipment and medium for predicting energy consumption of cigarette factory
CN115759343A (en) E-LSTM-based user electric quantity prediction method and device
CN114970674A (en) Time sequence data concept drift adaptation method based on relevance alignment
CN113570129A (en) Method for predicting strip steel pickling concentration and computer readable storage medium
CN113159395A (en) Deep learning-based sewage treatment plant water inflow prediction method and system
CN112132310A (en) Power equipment state estimation method and device based on improved LSTM

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant