CN115022193B - Local area network flow prediction method based on deep learning model - Google Patents
- Publication number: CN115022193B
- Application number: CN202210561413.1A
- Authority
- CN
- China
- Prior art keywords
- flow
- module
- model
- convlstm
- matrix
- Prior art date
- Legal status: Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/147—Network analysis or design for predicting network behaviour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/50—Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Abstract
The invention discloses a local area network link flow prediction method based on an improved ConvLSTM deep learning model, and relates to the field of network flow prediction. Aiming at the problems of target network flow prediction methods based on traditional linear and nonlinear models, such as low prediction accuracy, inability to predict the spatial characteristics of the target network flow at the same time, and inability to predict the flow of all links of the target network with a single model, the invention provides a local area network flow prediction method based on an improved ConvLSTM deep learning model. The model adds a residual-like structure module and an attention-based Squeeze-and-Excitation module, so as to finally achieve accurate and rapid simulation and twinning of the target network link flow. The method improves the accuracy of local area network space-time flow matrix prediction, reduces the number of training iterations and improves the utilization rate of computing resources.
Description
Technical Field
The invention belongs to the technical fields of computer network traffic engineering, network twinning and network simulation, and particularly relates to a local area network traffic prediction method based on an improved ConvLSTM deep learning model.
Background
Network simulation and network twinning are research hotspots in the current computer network and communication fields. Besides pure software simulation methods, network simulation technology based on virtualization can be adopted to realize high-fidelity reproduction of key elements such as nodes, links and topology in the target network. For example, virtual instances in a cloud computing platform (virtual machines based on traditional host virtualization technology, or containers based on lightweight virtualization technology) can realize simulation or twin reproduction of target network nodes, while the cloud platform's underlying virtual links and virtual networks can realize simulation or twin reproduction of target network links and topology. The network simulation technology has the advantages of high simulation fidelity, flexibility and scalability, support for upper-layer protocols, direct deployment of application programs, and the like.
Network simulation and network twinning must not only realize high-fidelity reproduction of target network nodes, links and topology, but also realize real-time and accurate reproduction of the traffic transmitted in the target network. There are two main methods for reproducing target network link traffic in a simulation network or twin network. The first method is based on target network traffic logs: the target network flow can be completely recorded with packet capture tools such as tcpdump and stored as a trace file, and the flow is then reproduced in the simulation network or twin network according to that file. Although this method can accurately reproduce the target network's flow over a period of time, it cannot meet the requirement of real-time reproduction. The second method is to train a model (statistical, machine learning or deep learning) on the target network's historical flow data, predict the future flow of the target network with the obtained model, and realize flow reproduction in the simulation network or twin network on that basis. Although this method sacrifices a certain flow accuracy, it better meets the real-time requirements of network flow twinning and synchronization. Currently, flow prediction methods can be classified into the following two types:
1) The first type of method comprises the following steps: predicting network traffic based on the linear model;
2) The second type of method is as follows: and predicting network traffic based on the nonlinear model.
In one aspect, the first class of methods includes autoregressive models, moving average models, autoregressive integrated moving average models, and the like. These methods are suitable for predicting the traffic generated by network nodes over the short term, but it is difficult to use them to construct a comprehensive model that accurately predicts the complex service application traffic generated by many network nodes. On the other hand, the second class of methods includes wavelet models, fractional autoregressive integrated moving average models, support vector machines from the machine learning field, artificial neural network models, and the like; exploiting the autocorrelation of network traffic, these can predict long-term traffic, but usually cannot meet the high-fidelity requirement of short-term traffic prediction. With the development of deep learning technology in recent years, predicting target network flow with deep neural network models achieves better accuracy and has gradually become a focus of the related research fields. In particular, network traffic prediction methods based on Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network models have enabled accurate prediction of network link traffic. However, deep-learning-based methods generally only realize traffic prediction along the time dimension of individual target network links; they cannot capture the spatial characteristics of the target network's multiple links, nor simultaneously predict all link traffic in the target network together with its spatial associations.
Local area networks are the most numerous components of a typical target network and carry its most complicated traffic. Therefore, in network simulation and network twinning applications, in order to realize accurate and real-time reproduction of whole-network flow, theoretical methods and implementation technologies for accurate deep-learning-based local area network flow prediction need to be further researched, so as to effectively improve the fidelity of local area network simulation and twinning.
Disclosure of Invention
Aiming at the problems of target network flow prediction methods based on traditional linear and nonlinear models, such as low prediction accuracy, inability to predict the spatial characteristics of the target network flow at the same time, and inability to predict the flow of all links of the target network with a single model, the invention provides a local area network flow prediction method based on an improved ConvLSTM deep learning model. The model adds a residual-like structure module (Res module) and an attention-based Squeeze-and-Excitation module (SE module) to finally achieve accurate and rapid simulation and twinning of the target network link flow.
The technical scheme of the invention is realized as follows: a local area network link flow prediction method based on an improved ConvLSTM deep learning model first sets a target local area network with N three-layer nodes, comprising N_c client nodes, N_s server nodes and N_r router nodes; the method comprises the following steps:
step 1: collecting a local area network traffic matrix sequence, and performing equidistant averaging on the initial matrix sequence;
step 1.1: collecting the flow of all links of the target network, and counting the access flow reaching the N sampling nodes every second to obtain the flow data at time t:

D_t = [d_t^{ij}], D_t ∈ R^{N×N}

wherein element d_t^{ij} denotes the traffic sent from node i to node j at time t, and T is the total sampling duration;
step 1.2: collecting data for K days, and, due to the fluctuation of local area network flow data, averaging the collected data over intervals of S_s seconds, so that the traffic sequence length is T = (24×60×60×K)/S_s;
Step 2: target network traffic matrix conversion, remodelling the N×N point-to-point traffic matrix into an M×N mode traffic matrix of server nodes to user nodes; wherein M is the number of uplinks and downlinks over which users exchange data with the servers, i.e., for N_c client nodes, N_s server nodes and N_r router nodes, M = (N_s + N_r) × 2; the converted traffic matrix satisfies:

D'_t ∈ R^{M×N}
step 3: normalizing the converted flow matrix obtained in the step 2, and accelerating the speed of gradient descent to solve the optimal solution after normalization, thereby improving the prediction precision; the normalization formula is expressed as:
wherein the max-min normalization is

d̂_t^{ij} = (d_t^{ij} − d_min) / (d_max − d_min)

with d_max and d_min representing the maximum and minimum values of D'_t over all times and d̂_t^{ij} representing the normalized value; the normalized flow matrix is denoted D*_t;
Step 4: to prevent model overfitting, the Loss function Loss is defined as the mean square error between the predicted and actual flow matrices plus ε times the variance of the per-element prediction errors, as:

Loss = MSE(D*_t, D̂*_t) + ε · Var(d*_t^{ij} − d̂*_t^{ij})

wherein ε is an adjustable hyperparameter, d*_t^{ij} is the true value of the traffic matrix D*_t and d̂*_t^{ij} is its predicted value;
step 5: extracting temporal and spatial feature information with the SE-ConvLSTM-Res model from the mode flow matrix sequence obtained in step 2, wherein the SE-ConvLSTM-Res model comprises: a Res module, an SE module and a ConvLSTM module;
step 5.1: the Res module operation specific flow is as follows:
step 5.1.1: split the flow matrix D*_t into a set of per-server matrices:

D*_t → {D*_{t,1}, D*_{t,2}, …, D*_{t,n}}

wherein the elements D*_{t,k} are the split matrices and n represents the total number of split element matrices;
step 5.1.2: the elements of the set are each passed through a convolutional neural network with 2×2 convolution kernels for feature extraction, then recombined, and finally added to the output of the SE module; in this process:
O_RES denotes the output of the Res module, ε is an adjustable parameter, ω_k represents the weight of the k-th layer of the convolutional neural network, and b_k represents the bias coefficient of the convolutional neural network;
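The split of step 5.1.1 can be illustrated as follows; the assumption that the M rows group each server's uplink/downlink rows contiguously, and the function name, are the editor's illustrative reading rather than text from the patent:

```python
import numpy as np

def split_per_server(d_star, n_servers):
    """Res-module split sketch (step 5.1.1): divide the M x N mode traffic
    matrix into one submatrix per server, assuming the M rows group each
    server's uplink/downlink rows contiguously."""
    return np.split(d_star, n_servers, axis=0)

d_star = np.arange(24, dtype=float).reshape(6, 4)   # M=6 rows: 3 servers x 2 modes
parts = split_per_server(d_star, n_servers=3)
print(len(parts), parts[0].shape)                   # 3 submatrices of shape (2, 4)
```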
step 5.2: the SE module mainly comprises a compression operation and an excitation operation; the specific flow is as follows:
step 5.2.1: the compression operation is realized by global average pooling, which averages the spatial features into one global feature while retaining the channel information and temporal features:

z_c = (1/(M×N)) Σ_{i=1..M} Σ_{j=1..N} x_c^{ij}

wherein z ∈ R^C, C represents the channels, z captures the whole time series, and x_c^{ij} is the element of the i-th link to the j-th node;
step 5.2.2: the excitation operation captures the nonlinear relationships between channels; it is realized by activation functions and fully connected layers: through the dimension reduction and dimension increase of two fully connected layers, the captured channel information is mapped into (0, 1) with a sigmoid gating mechanism:

s = σ(ω_2 ⊙ ReLU(ω_1 ⊙ z))

wherein s represents the gate activation value between channels, ω_1 is the weight of fully connected layer FC_1, ω_2 is the weight of fully connected layer FC_2, σ is the sigmoid activation function, and ReLU is an activation function;
finally, the learned inter-channel gate activation value is multiplied with x_c, the matrix element of a channel, to obtain the output of the SE module as the time-channel weight O_SE:

O_SE = x_c ⊙ s;
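The squeeze and excitation operations of steps 5.2.1-5.2.2 can be sketched in NumPy as follows; the weight shapes, reduction ratio, and random data are illustrative assumptions, not parameters stated in the patent:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(x, w1, w2):
    """Squeeze-and-Excitation sketch: x has shape (C, M, N).
    Squeeze = global average pooling per channel; excitation = FC ->
    ReLU -> FC -> sigmoid gate; output = channel-reweighted x."""
    z = x.mean(axis=(1, 2))                    # squeeze -> (C,)
    s = sigmoid(w2 @ np.maximum(w1 @ z, 0.0))  # excitation gate in (0, 1)
    return x * s[:, None, None]                # reweight each channel

rng = np.random.default_rng(1)
x = rng.random((8, 4, 4))                      # C=8 channels of 4x4 maps
w1 = rng.standard_normal((2, 8)) * 0.1         # illustrative reduction ratio r=4
w2 = rng.standard_normal((8, 2)) * 0.1
out = se_block(x, w1, w2)
print(out.shape)                               # (8, 4, 4)
```

Since the gate values lie in (0, 1), each channel's features are attenuated in proportion to its learned importance.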
step 5.3: the ConvLSTM module specifically comprises the following steps:
step 5.3.1: convLSTM improves full connectivity to convolutional neural networks based on LSTM, i.e., input X of model t H from the cell state at the previous time t Merging the signals passing through the convolutional neural network module to be used as the input of the SE module;
wherein ω represents a weight set of the convolutional neural network full-connection layer;
step 5.3.2: the output of the SE module is overlapped with the output of the Res module to be used as the input of the LSTM part, namely the input O RES +O SE ;
Step 5.3.3: the LSTM portion of the ConvLSTM module performs the following calculations in order:
f_t = Sigmoid(Conv(x_t; ω_xf) + Conv(h_{t-1}; ω_hf) + b_f)   (4)
i_t = Sigmoid(Conv(x_t; ω_xi) + Conv(h_{t-1}; ω_hi) + b_i)   (5)
o_t = Sigmoid(Conv(x_t; ω_xo) + Conv(h_{t-1}; ω_ho) + b_o)   (6)
g_t = Tanh(Conv(x_t; ω_xg) + Conv(h_{t-1}; ω_hg) + b_g)      (7)
c_t = f_t ⊙ c_{t-1} + i_t ⊙ g_t                              (8)
h_t = o_t ⊙ Tanh(c_t)                                        (9)
wherein x_t is the input element of the corresponding calculation process of the LSTM part in the ConvLSTM module, and x_t = O_RES + O_SE;
ω_xf, ω_hf, ω_xi, ω_hi, ω_xo, ω_ho, ω_xg, ω_hg are the weights of the corresponding convolution layers in the convolution calculation process, b_f, b_i, b_o, b_g are the bias coefficients of the neural network, and c_{t-1} and h_{t-1} are the state values of the model at the previous time step of the calculation process;
finally, c_t and h_t serve as the input of the fully connected layer, and the fully connected layer output is the target matrix; the SE-ConvLSTM-Res model is trained with the data obtained in step 3, and flow prediction is carried out with the trained SE-ConvLSTM-Res model.
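Equations (4)-(9) can be traced with the toy step below; to keep the sketch short, the convolutions use 1×1 kernels, which reduce Conv(x; ω) to elementwise scaling — a real ConvLSTM would use larger kernels (the experiments use 3×3), and all names are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def convlstm_step(x_t, h_prev, c_prev, w, b):
    """One ConvLSTM step following equations (4)-(9), with 1x1 kernels so
    each Conv(.; w) is an elementwise scaling (a simplification)."""
    f = sigmoid(w["xf"] * x_t + w["hf"] * h_prev + b["f"])   # forget gate  (4)
    i = sigmoid(w["xi"] * x_t + w["hi"] * h_prev + b["i"])   # input gate   (5)
    o = sigmoid(w["xo"] * x_t + w["ho"] * h_prev + b["o"])   # output gate  (6)
    g = np.tanh(w["xg"] * x_t + w["hg"] * h_prev + b["g"])   # candidate    (7)
    c_t = f * c_prev + i * g                                 # cell state   (8)
    h_t = o * np.tanh(c_t)                                   # hidden state (9)
    return h_t, c_t

rng = np.random.default_rng(2)
x = rng.random((4, 4))                         # stands in for x_t = O_RES + O_SE
h = np.zeros((4, 4))
c = np.zeros((4, 4))
w = {k: 0.5 for k in ("xf", "hf", "xi", "hi", "xo", "ho", "xg", "hg")}
b = {k: 0.0 for k in ("f", "i", "o", "g")}
h, c = convlstm_step(x, h, c, w, b)
print(h.shape, c.shape)
```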
The invention has the following beneficial effects for solving the problem of accurate and efficient prediction of target local area network flow in the fields of network simulation and network twinning:
1. The deep learning method is applied to the problem of target local area network traffic prediction, and, exploiting the unique characteristics of local area network flow, the original flow matrix data structure is simplified into a data structure based on server uplinks and downlinks;
2. The variance of the per-element loss in the flow matrix is added to the loss function, effectively preventing the deep learning model from overfitting;
3. Based on the deep learning concepts of the SE module and residual connections, an SE-ConvLSTM-Res enhancement model is provided; in the time dimension, the enhancement model better captures the sensitivity of each moment, and in the spatial dimension, it learns the traffic characteristics of the local area network more quickly.
The method improves the accuracy of local area network space-time flow matrix prediction, reduces training iteration times and improves the utilization rate of computing resources.
Drawings
FIG. 1 is a block diagram of a SE-ConvLSTM-Res deep learning model of the present invention;
fig. 2 is a flowchart illustrating the operation of the lan traffic matrix prediction method according to the present invention.
DETAILED DESCRIPTION OF EMBODIMENT (S) OF INVENTION
For a detailed description of the technical scheme disclosed in the invention, the following is a further detailed description of the invention with reference to the drawings and specific embodiments of the description:
In order to solve the problem of high-precision prediction of the local area network traffic matrix, the invention provides a local area network traffic prediction method based on deep learning. Targeting the characteristics of the local area network, the proposed method simplifies the original data structure into one more amenable to deep learning. To prevent the per-node flow prediction errors in the traffic matrix from diverging excessively, the variance of the per-element loss in the flow matrix is added to the loss function, preventing the deep learning model from overfitting. The invention adds SE and Res modules on top of ConvLSTM, providing an improved SE-ConvLSTM-Res model that captures the temporal and spatial characteristics of the local area network flow matrix more quickly, improves training speed and improves prediction precision to a certain extent. Finally, this facilitates real-time synchronization and accurate reproduction of the local area network flow in the twin network.
The first step: collect a local area network traffic matrix (Traffic Matrix) sequence, and perform equidistant averaging on the initial matrix sequence:
(1.1) Collect the flow of all links of the target network and count the access flow reaching the N sampling nodes every second, obtaining the flow data at time t:

D_t = [d_t^{ij}], D_t ∈ R^{N×N}

where element d_t^{ij} denotes the traffic sent from node i to node j at time t.
(1.2) Suppose data are collected for K days. Due to the fluctuation of local area network flow data, the collected data are averaged over small intervals of S_s seconds, giving traffic sequence length T = (24×60×60×K)/S_s.
For (1.1), accessing different servers in the local area network generates different traffic patterns; for example, traffic to a web server is bursty, so to capture the web server's traffic characteristics as fully as possible, the node's data sampling period should be as short as possible. For (1.2), because local area network flow data fluctuates strongly and highly fluctuating data reduces the model's prediction precision, the sampled data obtained in (1.1) are averaged over equal intervals to improve data stability.
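The equidistant averaging of (1.2) can be sketched as follows; the function name, toy shapes and NumPy implementation are illustrative assumptions rather than text from the patent:

```python
import numpy as np

def average_traffic_sequence(samples, interval_s):
    """Average a per-second sequence of N x N traffic matrices over fixed
    intervals of `interval_s` seconds, as in step (1.2)."""
    total, n, _ = samples.shape
    t = total // interval_s                    # resulting sequence length
    trimmed = samples[: t * interval_s]        # drop any incomplete trailing interval
    return trimmed.reshape(t, interval_s, n, n).mean(axis=1)

# K days of per-second samples would give T = (24*60*60*K) / S_s averaged matrices
rng = np.random.default_rng(0)
per_second = rng.random((120, 4, 4))           # 120 s of toy data, N = 4
averaged = average_traffic_sequence(per_second, interval_s=30)
print(averaged.shape)                          # (4, 4, 4)
```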
The second step: the local area network has N three-layer nodes, comprising N_c client nodes, N_s server nodes and N_r router nodes. The end-to-end traffic of the local area network can be divided into two parts, the traffic between users and the traffic between users and servers:

D(t) = D_c(t) + D_s(t)

The user-to-user traffic is D_c(t). The traffic between users and servers consists of user-to-server and server-to-user traffic, i.e., server uplink and server downlink traffic:

D_s(t) = D_cs(t) + D_sc(t)

In a local area network, D_c(t) is small while D_s(t) is relatively large. Spatially, the large difference in magnitude between D_c(t) and D_s(t) affects the accuracy of the features extracted by the model's convolutions. Moreover, D_cs(t) and D_sc(t) within D_s(t) partially repeat, being structurally transposes of each other. The local area network traffic matrix structure is therefore remodelled: the N×N point-to-point traffic matrix is remodelled into an M×N mode traffic matrix of server nodes to user nodes, where M is the number of uplinks and downlinks over which users exchange data with the servers, M = (N_s + N_r) × 2. The converted traffic matrix satisfies:

D'_t ∈ R^{M×N}

The elements of D'_t represent the two modes of server uplink and downlink traffic; d'_t^{ij} represents the traffic sent by the i-th server to the j-th user node.
The third step: normalize the flow matrix sequence; after data normalization, gradient descent solves for the optimal solution faster, improving model precision. The invention normalizes with max-min normalization, whose formula can be expressed as:

d̂_t^{ij} = (d_t^{ij} − d_min) / (d_max − d_min)
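The max-min normalization of the third step can be sketched as follows (hypothetical helper name; NumPy stand-in for the formula):

```python
import numpy as np

def min_max_normalize(seq):
    """Max-min normalize a traffic-matrix sequence: scale values to [0, 1]
    using the global max/min over all time steps."""
    d_min, d_max = seq.min(), seq.max()
    return (seq - d_min) / (d_max - d_min), d_min, d_max

seq = np.array([[[0.0, 5.0], [10.0, 2.5]]])    # one toy 2x2 traffic matrix
norm, d_min, d_max = min_max_normalize(seq)
print(norm)                                    # all values lie in [0, 1]
```

The saved d_min/d_max allow predictions to be mapped back to real traffic volumes.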
fourth step: determining a loss function, a model training process may overstrain a certain space in MXNTraining of points, but neglecting other points in space, results in a decrease in overall prediction accuracy, although the overall traffic matrix error MSE is still in a decreasing trend. In order to prevent the model from being over fitted, the spatial overall prediction precision is synchronously improved, and the training loss function is defined as the mean square error MSE of a predicted flow matrix and an actual flow matrix plus the mean square error of the prediction error of each element of the flow matrix of epsilon times. Namely, the loss function is:
wherein epsilon is an adjustable super-parameter,is the traffic matrix D * t True value +.>Is a predicted value of (a).
The fifth step: extract temporal and spatial feature information with the SE-ConvLSTM-Res model from the mode flow matrix sequence obtained in the second step. Introduction to the SE-ConvLSTM-Res model:
(5.1) The ConvLSTM model (Convolutional LSTM Network) is an improvement of the long short-term memory neural network: based on the LSTM model, ConvLSTM replaces the full connections between inputs and weights with convolutions, and extends the original one-dimensional sequence input to two-dimensional inputs such as images. The original LSTM model can only extract temporal feature information, while ConvLSTM, thanks to convolution, can extract temporal and spatial feature information simultaneously. This suits the scenario of local area network traffic matrix prediction.
(5.2) The traffic patterns of different servers in a local area network differ greatly, e.g., between file servers, web servers, audio/video servers and users. Audio/video server traffic is continuous in time; web server traffic is bursty in time and generally smaller than audio/video traffic; file server traffic is continuous in time, has stronger spatial correlation, and is larger. Using ConvLSTM directly for local area network flow prediction mixes interference information from other servers into the predicted traffic matrix, owing to the whole-matrix convolution during training, which increases the time cost of model training.
The invention improves on the ConvLSTM model by introducing the concepts of a ResNet-like module (deep residual network) and an SE module (Squeeze-and-Excitation block), yielding the SE-ConvLSTM-Res model. The SE module is a lightweight module that uses an attention mechanism to screen the output channel information of ConvLSTM-Res, improving the model's sensitivity to channel features, strengthening the features of important channels and weakening those of unimportant channels. The introduced Res-like module separates the input information X_t per server and performs feature extraction with a convolution layer with 2×2 kernels; after extraction, the feature information of each server is recombined. The final ConvLSTM convolution output is added to γ times the server feature information, and the combination is used as the in-cell input. The SE-ConvLSTM-Res model structure is shown in FIG. 1. In the time sequence, the SE module enhances channel information at sensitive moments and weakens it at insensitive moments based on the attention mechanism; spatially, the Res module weakens the crossing of spatial information between different servers and enhances the spatial information of each single server. This speeds up model training and reduces the prediction loss.
SE-ConvLSTM-Res model flow introduction:
The specific operation flow of the improved model's Res module is as follows:
(1.1) Split the input traffic matrix D*_t into the set of per-server traffic matrices {D*_{t,1}, D*_{t,2}, …, D*_{t,n}}.
(1.2) Each element of the set is passed through a convolutional neural network with 2×2 convolution kernels for feature extraction, the results are recombined, and finally added to the output of the SE module.
The improved model's SE module mainly comprises a compression operation and an excitation operation; the specific flow is as follows:
(1.1) The compression operation is implemented by global average pooling, which averages the spatial features into one global feature per channel while retaining the channel information and temporal features:

z_c = (1/(M×N)) Σ_{i=1..M} Σ_{j=1..N} x_c^{ij}

where z ∈ R^C, C denotes the channels and z represents the whole time series.
(1.2) The excitation operation captures the nonlinear relationships between channels, implemented by activation functions (ReLU and sigmoid) and fully connected layers. Through the dimension reduction and dimension increase of two fully connected layers, the captured channel information is mapped into (0, 1) with a sigmoid gating mechanism:

s = σ(ω_2 ⊙ ReLU(ω_1 ⊙ z))

where ω_1 is the weight of fully connected layer FC_1, ω_2 is the weight of fully connected layer FC_2, and σ is the sigmoid activation function.
Finally, multiplying the learned inter-channel gate activation value with x_c adjusts the time-channel weights:

O_SE = x_c ⊙ s
the ConvLSTM module specifically executes the following steps:
(1.1) ConvLSTM improves the full connections of LSTM into convolutional neural networks; the input X_t and the hidden state h_{t-1} of the previous time step are merged and passed through the convolutional module as input of the SE module, with the formula:

I_SE = Conv([X_t, h_{t-1}]; ω)

(1.2) The outputs of the SE module and the Res module are superimposed as the input of the LSTM part, i.e., the input is O_RES + O_SE.
(1.3) The LSTM part of ConvLSTM is calculated as follows:

f_t = Sigmoid(Conv(x_t; ω_xf) + Conv(h_{t-1}; ω_hf) + b_f)   (4)
i_t = Sigmoid(Conv(x_t; ω_xi) + Conv(h_{t-1}; ω_hi) + b_i)   (5)
o_t = Sigmoid(Conv(x_t; ω_xo) + Conv(h_{t-1}; ω_ho) + b_o)   (6)
g_t = Tanh(Conv(x_t; ω_xg) + Conv(h_{t-1}; ω_hg) + b_g)      (7)
c_t = f_t ⊙ c_{t-1} + i_t ⊙ g_t                              (8)
h_t = o_t ⊙ Tanh(c_t)                                        (9)

where the ω are the weights of the corresponding convolution layers, the b are the biases, and c_{t-1} and h_{t-1} denote the state values of the model at the previous time step.
The sixth step: in order to make the structure of the resulting output consistent with that expected, the output of the SE-ConvLSTM-Res model finally passes through a fully connected layer.
In the experiment, the input dimension is set to 20 and the hidden output dimension to 16. The convolution kernel size of the ConvLSTM part of the SE-ConvLSTM-Res model is 3×3. The Res module output weight ε is set to 1, the learning rate to 10^-3, and the decay ratio γ to 0.9.
Claims (1)
1. A local area network link flow prediction method based on an improved ConvLSTM deep learning model, wherein a target local area network with N three-layer nodes is first set, comprising N_c client nodes, N_s server nodes and N_r router nodes; the method comprises the following steps:
step 1: collecting a local area network traffic matrix sequence, and performing equidistant averaging on the initial matrix sequence;
step 1.1: collecting the flow of all links of the target network, and counting the access flow reaching the N sampling nodes every second to obtain the flow data at time t:
D_t ∈ R^{N×N}
wherein: the element d_t^{ij} represents the flow from node i to node j at time t, and T is the total sampling time;
step 1.2: collecting data of K days; due to the fluctuation of local area network flow data, the collected data are averaged within each interval S_S, so that the traffic sequence length T = (24×60×60×K)/S_S;
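The equidistant averaging of step 1.2 can be sketched in NumPy. The sketch scales the K-day collection down to a toy 120-second run over a 4-node network; the function name `interval_average` and all sample values are illustrative assumptions, not part of the claims.

```python
import numpy as np

def interval_average(samples, s_s):
    """Average per-second N x N traffic matrices over non-overlapping
    windows of s_s seconds, yielding T = total_seconds / s_s matrices."""
    t_total = (samples.shape[0] // s_s) * s_s       # drop a ragged tail, if any
    windows = samples[:t_total].reshape(-1, s_s, *samples.shape[1:])
    return windows.mean(axis=1)

# Toy example: 120 "seconds" of per-second 4x4 traffic matrices.
rng = np.random.default_rng(2)
raw = rng.random((120, 4, 4))   # one N x N matrix per second
s_s = 30                        # averaging interval S_S, in seconds
seq = interval_average(raw, s_s)
# Sequence length is 120 / 30 = 4, mirroring T = (24*60*60*K) / S_S in miniature.
```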
Step 2: target network traffic momentMatrix conversion, remodelling the N multiplied by N point-to-point traffic matrix into an M multiplied by N mode traffic matrix of the server node to the user node; where M is the number of uplink and downlink for the user to transmit data with the server, i.e., for N c Each client node N s Individual server nodes and N r Each router node, m= (N) s +N r ) X 2; the post-conversion traffic matrix is as follows:
wherein:
step 3: normalizing the converted flow matrix obtained in step 2; after normalization, gradient descent converges to the optimal solution faster, which improves the prediction precision; the normalization formula is expressed as:
d*_t^{ij} = (d'_t^{ij} - d_min) / (d_max - d_min)
wherein: d_max and d_min represent the maximum and minimum values of D'_t over all times, d*_t^{ij} represents the normalized value, and the normalized flow matrix is denoted D*_t;
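The min-max normalization of step 3 can be sketched in NumPy; the function name and toy dimensions are illustrative assumptions.

```python
import numpy as np

def min_max_normalize(d):
    """Scale every element of the traffic sequence into [0, 1] using
    the global minimum d_min and maximum d_max over all time steps."""
    d_min, d_max = d.min(), d.max()
    return (d - d_min) / (d_max - d_min)

# Toy sequence: 10 time steps of M x N mode traffic matrices (M=6, N=4).
rng = np.random.default_rng(3)
d = rng.random((10, 6, 4)) * 500.0   # raw flows, arbitrary scale
d_star = min_max_normalize(d)        # normalized flow matrix sequence D*_t
```

Normalizing with the global extrema (rather than per-matrix extrema) preserves the relative magnitude of traffic across time steps, which the predictor needs.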
Step 4: to prevent model overfitting, the loss function Loss is defined as the mean square error of the predicted flow matrix plus ε times the mean square error of the actual flow matrix, so as to penalize the prediction error of each element; wherein ε is an adjustable hyperparameter, d*_t^{ij} is the true value of the traffic matrix D*_t, and d̂*_t^{ij} is its predicted value;
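The element-wise mean square error at the core of this loss can be sketched in NumPy. The exact form of the ε-weighted term is not recoverable from the text here, so the sketch covers only the MSE core; the function name and toy values are illustrative assumptions.

```python
import numpy as np

def mse_loss(pred, true):
    """Element-wise mean squared error between the predicted and true
    flow matrices (the core term of the patent's loss function)."""
    return np.mean((pred - true) ** 2)

# Toy example: a 6x4 mode traffic matrix whose prediction is off by 0.1
# in every element, so the loss is the mean of 0.1^2 = 0.01.
rng = np.random.default_rng(4)
true = rng.random((6, 4))
pred = true + 0.1
loss = mse_loss(pred, true)
```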
step 5: extracting temporal and spatial feature information from the mode flow matrix sequence obtained in step 2 by using the SE-ConvLSTM-Res model, wherein the SE-ConvLSTM-Res model comprises: a Res module, an SE module and a ConvLSTM module;
step 5.1: the specific flow of the Res module operation is as follows:
step 5.1.1: splitting the flow matrix D*_t into element matrices, wherein n represents the total number of split element matrices;
step 5.1.2: each split element is passed through a convolutional neural network with a 2×2 convolution kernel for feature extraction, and the result is finally added to the output of the SE module, wherein O_RES is the output of the Res module, ε is an adjustable parameter, ω_k represents the weight of the k-th layer of the convolutional neural network, and b_k represents the deviation coefficient of the convolutional neural network;
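The Res module's feature extraction with a 2×2 kernel and the ε output weight can be sketched in NumPy. This is a minimal single-layer sketch under toy dimensions; the naive convolution helper and the function name `res_module` are illustrative assumptions, not the patented multi-layer network.

```python
import numpy as np

def conv2d_valid(x, k):
    """Naive valid 2-D convolution (no padding): output shrinks by k-1."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def res_module(d, kernel, bias, eps):
    """One feature-extraction layer with a 2x2 kernel, scaled by the
    adjustable parameter eps; its output is later added to O_SE."""
    return eps * (conv2d_valid(d, kernel) + bias)

# Toy example: a 5x5 element matrix through a single 2x2 layer.
rng = np.random.default_rng(5)
d = rng.random((5, 5))
kernel = rng.standard_normal((2, 2))       # omega_k (illustrative values)
o_res = res_module(d, kernel, bias=0.0, eps=1.0)  # eps = 1 as in the experiments
```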
step 5.2: the SE module mainly comprises a compression operation and an excitation operation; the specific flow is as follows:
step 5.2.1: the compression operation is realized by global average pooling, which compresses the spatial features into a single global feature by a global average while retaining the channel information and real-time features:
wherein z ∈ R^C, C represents the number of channels, z is the descriptor obtained for the entire time series, and d^{ij} is the element of the i-th link to the j-th node;
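The squeeze (global average pooling) of step 5.2.1 reduces each channel's feature map to one scalar; a minimal NumPy sketch, with toy channel and spatial dimensions as illustrative assumptions:

```python
import numpy as np

def squeeze(x):
    """Global average pooling: compress each of the C channel feature
    maps (shape C x H x W) to a single scalar, giving z in R^C."""
    return x.mean(axis=(1, 2))

# Toy example: C = 8 time channels of 6x4 (M x N) features.
rng = np.random.default_rng(6)
x = rng.random((8, 6, 4))
z = squeeze(x)   # channel descriptor fed to the excitation step
```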
step 5.2.2: the excitation operation captures the nonlinear relationship between channels; it is realized by activation functions and fully connected layers: through the dimension reduction and dimension increase of two fully connected layers, the captured channel information is mapped into (0, 1) by a sigmoid gate mechanism:
s = σ(ω_2 ⊙ ReLU(ω_1 ⊙ z))
wherein s represents the inter-channel gate activation value, ω_1 is the weight of fully connected layer FC_1, ω_2 is the weight of fully connected layer FC_2, σ is the sigmoid activation function, and ReLU is the ReLU activation function;
finally, the learned inter-channel gate activation value is multiplied by x_c to obtain the output O_SE of the SE module as the time-channel weight, wherein x_c is the matrix element of a channel:
O_SE = x_c ⊙ s;
step 5.3: the ConvLSTM module specifically comprises the following steps:
step 5.3.1: convLSTM improves full connectivity to convolutional neural networks based on LSTM, i.e., input X of model t H from the cell state at the previous time t Merging the signals passing through the convolutional neural network module to be used as the input of the SE module;
wherein ω represents a weight set of the convolutional neural network full-connection layer;
step 5.3.2: the output of the SE module is superimposed with the output of the Res module as the input of the LSTM part, i.e., the input is O_RES + O_SE;
Step 5.3.3: the LSTM portion of the ConvLSTM module performs the following calculations in order:
f_t = Sigmoid(Conv(x_t; ω_xf) + Conv(h_{t-1}; ω_hf) + b_f) (4)
i_t = Sigmoid(Conv(x_t; ω_xi) + Conv(h_{t-1}; ω_hi) + b_i) (5)
o_t = Sigmoid(Conv(x_t; ω_xo) + Conv(h_{t-1}; ω_ho) + b_o) (6)
g_t = Tanh(Conv(x_t; ω_xg) + Conv(h_{t-1}; ω_hg) + b_g) (7)
c_t = f_t ⊙ c_{t-1} + i_t ⊙ g_t (8)
h_t = o_t ⊙ Tanh(c_t) (9)
wherein x_t is the input element of the corresponding calculation process of the LSTM part in the ConvLSTM module, and x_t = O_RES + O_SE;
ω_xf, ω_hf, ω_xi, ω_hi, ω_xo, ω_ho, ω_xg, ω_hg are the weights of the corresponding convolution layers in the convolution calculation process, b_f, b_i, b_o, b_g are the deviation coefficients of the neural network, and c_{t-1} and h_{t-1} are the state values of the model at the previous time step of the calculation process;
finally, c_t and h_t are taken as the input of the fully connected layer, and the output of the fully connected layer is the target matrix; the SE-ConvLSTM-Res model is trained with the data obtained in step 3, and flow prediction is performed with the trained SE-ConvLSTM-Res model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210561413.1A CN115022193B (en) | 2022-05-23 | 2022-05-23 | Local area network flow prediction method based on deep learning model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115022193A CN115022193A (en) | 2022-09-06 |
CN115022193B true CN115022193B (en) | 2024-02-02 |
Family
ID=83068177
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210561413.1A Active CN115022193B (en) | 2022-05-23 | 2022-05-23 | Local area network flow prediction method based on deep learning model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115022193B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115187154B (en) * | 2022-09-14 | 2022-12-16 | 华中科技大学 | Neural network-based regional power grid oscillation source risk prediction method and system |
CN115426201B (en) * | 2022-11-03 | 2023-01-17 | 湖南大佳数据科技有限公司 | Data acquisition method and system for network shooting range |
CN116319378B (en) * | 2023-05-16 | 2023-07-21 | 合肥工业大学 | Network traffic matrix estimation and model training method and system based on deep learning |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110851782A (en) * | 2019-11-12 | 2020-02-28 | 南京邮电大学 | Network flow prediction method based on lightweight spatiotemporal deep learning model |
CN112291808A (en) * | 2020-11-02 | 2021-01-29 | 东南大学 | Regional network flow prediction method based on deep learning |
CN112508173A (en) * | 2020-12-02 | 2021-03-16 | 中南大学 | Traffic space-time sequence multi-step prediction method, system and storage medium |
CN112669606A (en) * | 2020-12-24 | 2021-04-16 | 西安电子科技大学 | Traffic flow prediction method for training convolutional neural network by utilizing dynamic space-time diagram |
CN112910711A (en) * | 2021-02-03 | 2021-06-04 | 山东大学 | Wireless service flow prediction method, device and medium based on self-attention convolutional network |
WO2021186158A1 (en) * | 2020-03-17 | 2021-09-23 | The University Court Of The University Of Edinburgh | A distributed network traffic data decomposition method |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111130839B (en) * | 2019-11-04 | 2021-07-16 | 清华大学 | Flow demand matrix prediction method and system |
Non-Patent Citations (3)
Title |
---|
Deep Learning Based Traffic Prediction Method for Digital Twin Network; Junyu Lai; Springer; full text *
Multi-step network traffic prediction based on a full attention mechanism; Guo Jia; Yu Yongbin; Yang Chenyang; Signal Processing (No. 05); full text *
Research on network traffic prediction based on neural networks; Wu Yucong; Master's Electronic Journals; full text *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN115022193B (en) | Local area network flow prediction method based on deep learning model | |
Yu et al. | Intelligent edge: Leveraging deep imitation learning for mobile edge computation offloading | |
CN110851782A (en) | Network flow prediction method based on lightweight spatiotemporal deep learning model | |
CN112491442B (en) | Self-interference elimination method and device | |
CN112152948A (en) | Wireless communication processing method and device | |
CN111245673A (en) | SDN time delay sensing method based on graph neural network | |
WO2021036414A1 (en) | Co-channel interference prediction method for satellite-to-ground downlink under low earth orbit satellite constellation | |
CN111224905B (en) | Multi-user detection method based on convolution residual error network in large-scale Internet of things | |
Jiang et al. | Federated learning algorithm based on knowledge distillation | |
Lu et al. | Heterogeneous model fusion federated learning mechanism based on model mapping | |
Nguyen et al. | Hcfl: A high compression approach for communication-efficient federated learning in very large scale iot networks | |
Wang et al. | Deep joint source-channel coding for multi-task network | |
CN115115510A (en) | Image processing method, system, storage medium and terminal equipment | |
CN114528987A (en) | Neural network edge-cloud collaborative computing segmentation deployment method | |
Dangi et al. | 5G network traffic control: a temporal analysis and forecasting of cumulative network activity using machine learning and deep learning technologies | |
Li et al. | A unified federated DNNs framework for heterogeneous mobile devices | |
CN116362327A (en) | Model training method and system and electronic equipment | |
CN104955059B (en) | Cellular network base stations state time-varying model method for building up based on Bayesian network | |
Nakip et al. | Subspace-based emulation of the relationship between forecasting error and network performance in joint forecasting-scheduling for the Internet of Things | |
KR20220124106A (en) | Method and apparatus for graph neural network based virtual network management | |
CN111541472B (en) | Low-complexity machine learning assisted robust precoding method and device | |
Zhang et al. | Network traffic classification method based on subspace triple attention mechanism | |
CN116996401A (en) | Flow prediction method based on space-time enhanced cyclic neural network | |
Chen et al. | FUNOff: Offloading Applications At Function Granularity for Mobile Edge Computing | |
CN115102767B (en) | DDoS active defense system and method based on distributed collaborative learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||