CN111918321B

CN111918321B - Mobile flow prediction method based on space-time attention convolutional network

Info

Publication number: CN111918321B
Application number: CN202010708100.5A
Authority: CN
Inventors: 赵楠; 叶智养; 范孟林; 程一强; 刘泽华; 谭惠文
Original assignee: Hubei University of Technology
Current assignee: Hubei University of Technology
Priority date: 2020-07-22
Filing date: 2020-07-22
Publication date: 2022-08-05
Anticipated expiration: 2040-07-22
Also published as: CN111918321A

Abstract

The invention belongs to the technical field of mobile traffic prediction and discloses a mobile traffic prediction method based on a spatio-temporal attention convolutional network. The spatio-temporal attention convolutional network models the mobile traffic network over an hourly period, a daily period and a weekly period through three time components, respectively, and obtains three corresponding pieces of mobile traffic prediction information; the three pieces of mobile traffic prediction information are then fused with external interference information to obtain the final mobile traffic prediction result. The invention effectively solves the problem of mobile traffic prediction.

Description

Mobile flow prediction method based on space-time attention convolutional network

Technical Field

The invention relates to the technical field of mobile traffic prediction, in particular to a mobile traffic prediction method based on a space-time attention convolutional network.

Background

According to Cisco's latest report, global mobile data traffic is expected to reach 77 exabytes per month by 2022. Accurate prediction of mobile traffic is essential for reliable network management and mobile services. However, because mobile traffic exhibits highly nonlinear and dynamic spatio-temporal correlations, predicting it accurately remains a great challenge.

In recent years, a great deal of research has been carried out on mobile traffic prediction methods, which fall mainly into two categories: statistics-based methods and machine-learning-based methods. Statistics-based methods predict mobile traffic from statistical models such as alpha-stable distributions, the autoregressive integrated moving average (ARIMA) model and entropy theory. Most of them rely on linear statistical strategies and may not be suitable for many practical scenarios. Machine-learning-based methods predict mobile traffic with techniques such as linear regression, support vector regression (SVR), recurrent neural networks and deep transfer learning. However, most of them ignore the spatio-temporal correlation of the data and are difficult to extend to high-dimensional data. Some deep learning methods can handle high-dimensional spatio-temporal mobile traffic data, for example convolutional neural networks (CNN) for grid-based mobile traffic data and graph convolutional networks (GCN) for graph-based mobile traffic data. Nevertheless, capturing the dynamic spatio-temporal correlations of mobile traffic with deep learning algorithms remains challenging.

Disclosure of Invention

In order to overcome the defects of the prior art, the invention aims to provide a mobile traffic prediction method based on a space-time attention convolutional network.

The embodiment of the application provides a mobile traffic prediction method based on a spatio-temporal attention convolutional network, wherein the spatio-temporal attention convolutional network models the mobile traffic network over an hourly period, a daily period and a weekly period through three time components, respectively, and obtains three corresponding pieces of mobile traffic prediction information; the three pieces of mobile traffic prediction information are fused with external interference information to obtain the final mobile traffic prediction result.

Preferably, the area to be predicted is divided into a plurality of sub-areas, and each sub-area is used as a node in the mobile traffic network of the area to be predicted;

the mobile traffic network is modeled as an undirected graph G = (V, E, A), where V is the set of N nodes, E is the set of edges connecting the nodes, and A is the adjacency matrix of the undirected graph G.

Preferably, each time component comprises two spatio-temporal modules and a fully connected layer connected in sequence;

each spatio-temporal module comprises a spatio-temporal attention module and a spatio-temporal convolution module; the spatio-temporal attention module captures the dynamic spatio-temporal correlation of the mobile traffic, and the spatio-temporal convolution module captures the spatio-temporal features of the mobile traffic;

the fully connected layer ensures that the output dimension of the time component is the same as the dimension of the prediction target.

Preferably, the space-time attention module consists of a time attention mechanism and a space attention mechanism;

obtaining correlations between different time periods by the time attention mechanism; the output of the temporal attention mechanism is expressed as:

where X^(l) denotes the input of the l-th spatio-temporal module in the time component, 1 ≤ l ≤ 2;

the output of the temporal attention mechanism is the mobile traffic data after the temporal correlation has been obtained; W_t1 and W_t2 are learnable parameters of the temporal attention mechanism, B_t is its bias matrix, and σ₁ and σ₂ denote the sigmoid activation function and the normalized exponential (softmax) function, respectively;

the spatial attention mechanism then obtains, in the spatial dimension, the correlation between different nodes in this temporally attended traffic data;

the normalized attention matrix of the spatiotemporal attention module is represented as:

where A_s denotes the normalized attention matrix, W_s1 and W_s2 are learnable parameters of the spatial attention mechanism, and B_s is the bias matrix of the spatial attention mechanism.

Preferably, the three time components are an hourly period component, a daily period component and a weekly period component, respectively;

the data of Δh hours before the current time τ is taken as the input of the hourly period component, denoted X_h = (X_{τ−Δh+1}, X_{τ−Δh+2}, ..., X_τ), 1 ≤ Δh ≤ 24 − t_p; X_h serves as the input traffic of the first spatio-temporal module in the hourly period component;

the data of Δd days before time t_d = τ + t_p − 24 is taken as the input of the daily period component, denoted X_d = (X_{t_d−24×Δd+1}, X_{t_d−24×Δd+2}, ..., X_{t_d}), 1 ≤ Δd ≤ 7; X_d serves as the input traffic of the first spatio-temporal module in the daily period component;

the data of Δw weeks before time t_w = τ + t_p − 24 × 7 is taken as the input of the weekly period component, denoted X_w, with Δw ≥ 1; X_w serves as the input traffic of the first spatio-temporal module in the weekly period component.

Preferably, in the spatio-temporal convolution module, the spatio-temporal features of the mobile traffic are obtained by applying a graph convolution network and then a standard convolutional network;

the convolution operation of the graph convolution network is represented as follows:

where the output of the graph convolution network is the mobile traffic data whose information has been updated with the features of neighbouring nodes; β_k is the polynomial coefficient, I_N is the identity matrix, L is the normalized Laplacian matrix of the undirected graph G, λ_max is the largest eigenvalue of L, T_k(x) is the Chebyshev polynomial, and K is the number of terms of T_k(x);
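For orientation, the symbols defined above match the standard Chebyshev approximation of a spectral graph convolution. The generic textbook form is given below as a reference only; it is not necessarily the exact expression of the invention, which additionally involves the attention matrix A_s. The signal x and the operator name g_θ *_G are notation assumed for this illustration.

```latex
g_\theta *_G x \;=\; \sum_{k=0}^{K-1} \beta_k \, T_k(\tilde{L}) \, x,
\qquad
\tilde{L} = \frac{2}{\lambda_{\max}} L - I_N,
\qquad
T_0(x) = 1,\quad T_1(x) = x,\quad T_k(x) = 2x\,T_{k-1}(x) - T_{k-2}(x).
```

Rescaling L to L̃ maps its eigenvalues into [−1, 1], which is the interval on which the Chebyshev recurrence is well behaved.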

the output of the standard convolutional network is represented as:

where the output of the standard convolutional network is the mobile traffic data whose information has been updated with the features of adjacent time periods; * denotes the convolution operation, Φ₁ denotes the convolution-kernel parameters of the standard convolutional network, and ReLU(·) is the activation function.

Preferably, each of the spatio-temporal modules further comprises a residual network, which improves training efficiency and reduces prediction error;

the output of the residual network is represented as:

where the output of the residual network is the mobile traffic data after the residual-network convolution, and Φ₂ denotes the convolution-kernel parameters of the residual network;

for each of the time components, the output of the l-th spatio-temporal module is expressed as:

where X^(l+1) is the output of the l-th spatio-temporal module.

Preferably, the output of the fully connected layer is expressed as:

where the output of the fully connected layer is the output of the time component and represents the mobile traffic prediction information of that time component; W_f and B_f denote the weight matrix and the bias matrix of the fully connected layer, respectively, and X^(L) denotes the output of the second spatio-temporal module in the time component.

Preferably, the specific implementation manner of obtaining the final mobile traffic prediction result by fusing the three pieces of mobile traffic prediction information with the external interference information is as follows:

where the fused result denotes the total prediction information of the three time components; the outputs of the hourly period component, the daily period component and the weekly period component are specific instances of the time-component output; W_h, W_d and W_w are learnable weight matrices;

where the estimate is taken as the final mobile traffic prediction result and is obtained from the total prediction information together with the external interference information.

Preferably, the spatio-temporal attention convolutional network uses the mean square error between the predicted estimate and the true value Y as the loss function;

the loss function is expressed as:

where L(θ) is the loss function and θ denotes all learnable parameters of the spatio-temporal attention convolutional network; the network adjusts its learnable parameters by minimizing the loss function.

One or more technical solutions provided in the embodiments of the present application have at least the following technical effects or advantages:

in the embodiment of the application, the space-time attention convolutional network models the mobile traffic network of an hour period, a day period and a week period through three time components respectively, and obtains corresponding three mobile traffic prediction information; the three pieces of mobile traffic prediction information are fused with the external interference information to obtain a final mobile traffic prediction result, so that the problem of mobile traffic prediction is effectively solved.

Drawings

To illustrate the technical solution of this embodiment more clearly, the drawing used in the description of the embodiment is briefly introduced below. The drawing described below shows one embodiment of the present invention, and those skilled in the art can obtain other drawings from it without creative effort.

Fig. 1 is a schematic diagram of a framework of a time component in a mobile traffic prediction method based on a spatio-temporal attention convolutional network according to an embodiment of the present invention.

Detailed Description

The invention provides a mobile traffic prediction method based on a spatio-temporal attention convolutional network, wherein the spatio-temporal attention convolutional network models the mobile traffic network over an hourly period, a daily period and a weekly period through three time components, respectively, and obtains three corresponding pieces of mobile traffic prediction information; the three pieces of mobile traffic prediction information are fused with external interference information to obtain the final mobile traffic prediction result.

In order to better understand the technical solution, the technical solution will be described in detail with reference to the drawings and the specific embodiments.

In this embodiment, the spatio-temporal attention convolutional network models the mobile traffic data over an hourly period, a daily period and a weekly period through three time components, and each time component is composed of two spatio-temporal modules and a fully connected layer. Within each time component, the two spatio-temporal modules have the same structure: the output of the first spatio-temporal module is the input of the second, and the output of the second is the input of the fully connected layer. Two spatio-temporal modules are used in order to better extract the spatio-temporal features of the mobile traffic data and improve the accuracy of the prediction result. Each spatio-temporal module comprises a spatio-temporal attention module, which captures the dynamic spatio-temporal correlation of the mobile traffic, and a spatio-temporal convolution module, which captures the spatio-temporal features of the mobile traffic using a graph convolution and a standard convolution. In addition, to improve training efficiency, a residual network is preferably added to the spatio-temporal module. The results of the three time components are fused with the external interference factors to obtain the final mobile traffic prediction. The parameters of the spatio-temporal attention convolutional network are adjusted by minimizing the loss function.

Taking city-level mobile traffic prediction as an example, a city is divided into 100 × 100 areas, and each area is regarded as a node in the urban mobile traffic network. Analysis of mobile traffic data shows that spatio-temporal correlation exists in mobile traffic, so the urban mobile traffic network can be modeled as an undirected graph G = (V, E, A), where V is the set of N nodes, E is the set of edges connecting the nodes, and A is the adjacency matrix of the undirected graph G. The traffic of the N nodes at time t is collected in X_t, each entry being the traffic of one node i at time t, and the traffic of the N nodes over a period Δt is represented as X = (X_1, X_2, ..., X_Δt). The predicted traffic of node i over the next t_p time steps after time t is denoted Y_i. Then, given the traffic X over the past Δt period, the future traffic of the N nodes, Y = (Y_1, Y_2, ..., Y_N), can be predicted.
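As an illustration of the graph model described above, the following minimal sketch builds the adjacency matrix A for a small grid of areas. The text does not state which areas count as connected, so the 4-neighbour rule used here (each cell linked to the cells directly left, right, above and below) and the function name grid_adjacency are assumptions made for this example.

```python
def grid_adjacency(rows, cols):
    """Adjacency matrix of an undirected grid graph (4-neighbour rule, an assumption)."""
    n = rows * cols
    adj = [[0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            # Link each cell to its right and lower neighbour and mirror the
            # entry, so the matrix stays symmetric (undirected graph).
            if c + 1 < cols:
                adj[i][i + 1] = adj[i + 1][i] = 1
            if r + 1 < rows:
                adj[i][i + cols] = adj[i + cols][i] = 1
    return adj


A = grid_adjacency(3, 3)            # small 3 x 3 grid instead of 100 x 100
degrees = [sum(row) for row in A]   # number of neighbours of each node
```

For a 100 × 100 grid the dense matrix would have 10^8 entries, so a sparse representation would likely be preferable in practice.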

To reduce the influence of irrelevant historical traffic data on prediction accuracy, this embodiment uses three time components, namely an hourly period component, a daily period component and a weekly period component, to capture the time-dimension information of the traffic data. In the hourly period component, the data of Δh hours before the current time τ is taken as the input, denoted X_h = (X_{τ−Δh+1}, X_{τ−Δh+2}, ..., X_τ), 1 ≤ Δh ≤ 24 − t_p. In the daily period component, the data of Δd days before time t_d = τ + t_p − 24 is taken as the input, denoted X_d = (X_{t_d−24×Δd+1}, X_{t_d−24×Δd+2}, ..., X_{t_d}), 1 ≤ Δd ≤ 7. In the weekly period component, the data of Δw weeks before time t_w = τ + t_p − 24 × 7 is taken as the input, denoted X_w, with Δw ≥ 1.
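The three input windows can be made concrete with a small index calculation. The sketch below assumes hourly samples indexed by integers; the hourly window and the reference times t_d and t_w follow the text, the daily window follows the expression given for X_d in claim 3, and the weekly window is written by analogy with the daily one, which is an assumption. The function name component_windows is hypothetical.

```python
def component_windows(tau, t_p, dh, dd, dw):
    """Return inclusive (start, end) hour indices of the three component inputs."""
    if not 1 <= dh <= 24 - t_p:
        raise ValueError("need 1 <= dh <= 24 - t_p")
    if not 1 <= dd <= 7:
        raise ValueError("need 1 <= dd <= 7")
    if dw < 1:
        raise ValueError("need dw >= 1")
    t_d = tau + t_p - 24        # reference time of the daily component
    t_w = tau + t_p - 24 * 7    # reference time of the weekly component
    return {
        "hour": (tau - dh + 1, tau),
        "day": (t_d - 24 * dd + 1, t_d),
        # Weekly window written by analogy with the daily one (assumption).
        "week": (t_w - 24 * 7 * dw + 1, t_w),
    }


windows = component_windows(tau=1000, t_p=1, dh=3, dd=1, dw=1)
```

With τ = 1000 and t_p = 1, the daily reference time t_d = 977 is the hour exactly one day before the prediction target τ + t_p = 1001, which is why the offset is t_p − 24 rather than −24.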

In the undirected graph G of the mobile traffic network, different nodes influence the prediction result to different degrees, so the importance of each node can be represented by a weight. The spatio-temporal attention module consists of a temporal attention mechanism and a spatial attention mechanism, which together capture the spatio-temporal correlation of the mobile traffic data. In the l-th (1 ≤ l ≤ 2) spatio-temporal module, taking the hourly period component as an example, the temporal attention mechanism is used to obtain the correlation between different time periods. Given the input traffic X^(l) of the l-th spatio-temporal module (X_h, X_d and X_w are collectively denoted X), the output of the temporal attention mechanism in the time dimension is expressed as:

where the output of the temporal attention mechanism is the mobile traffic data after the temporal correlation has been obtained; W_t1 and W_t2 are learnable parameters of the temporal attention mechanism, B_t is its bias matrix, and σ₁ and σ₂ denote the sigmoid activation function and the normalized exponential (softmax) function, respectively. The spatial attention mechanism then obtains, in the spatial dimension, the correlation between different nodes in this temporally attended traffic data.

The normalized attention matrix of the spatiotemporal attention module may then be expressed as:

where W_s1 and W_s2 are learnable parameters of the spatial attention mechanism, and B_s is the bias matrix of the spatial attention mechanism.

The attention matrix, obtained by incorporating the spatial attention correlation (i.e., the correlation between different nodes), makes the network pay more attention to important neighbouring nodes and adjacent time periods.
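The normalization step of the attention mechanisms can be illustrated in isolation. The sketch below assumes that raw attention scores are first passed through the sigmoid σ₁ and then normalized row by row with the softmax σ₂; the order of the two functions, the row-wise axis and the helper names are assumptions, and the computation of the raw scores from W_s1, W_s2 and B_s is not shown.

```python
import math


def sigmoid(x):
    """sigma_1: logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(row):
    """sigma_2: normalized exponential function over one row."""
    m = max(row)                              # subtract the max for stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def normalize_scores(scores):
    """Apply the sigmoid element-wise, then the softmax to each row."""
    return [softmax([sigmoid(v) for v in row]) for row in scores]


raw_scores = [[2.0, 0.0, -2.0],
              [0.5, 0.5, 0.5]]
attention = normalize_scores(raw_scores)
```

Each row of the result sums to one, so a row can be read as the relative weight one node gives to every other node.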

In the spatio-temporal convolution module, a graph convolution network and a standard convolutional network are applied in sequence to obtain the spatio-temporal features of the mobile traffic. Since the mobile traffic network is modeled as an undirected graph G, a spectral graph convolution network can be used to obtain the features of the neighbouring nodes of each node in G. In a spectral graph convolution network, the Laplacian matrix describes the structure of the network. With a spectral filter g_θ designed from the Laplacian matrix of the graph, the graph convolution operation can be described as convolving the input mobile traffic data with the spectral filter g_θ. In the l-th spatio-temporal module, combining the input data X^(l) with the attention matrix A_s, the convolution operation of the graph convolution network can be represented by the following formula:

where the output of the graph convolution network is the mobile traffic data whose information has been updated with the features of neighbouring nodes; β_k is the polynomial coefficient, I_N is the identity matrix, L is the normalized Laplacian matrix of the undirected graph G, λ_max is the largest eigenvalue of L, T_k(x) is the Chebyshev polynomial, and K is the number of terms of T_k(x).
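A Chebyshev graph convolution with the symbols defined above can be sketched in plain Python. The sketch assumes the common formulation in which the Laplacian is rescaled to L̃ = 2L/λ_max − I_N, each term T_k(L̃) is multiplied element-wise by the attention matrix, and the weighted terms are summed; the element-wise use of the attention matrix, the scalar coefficients β_k and all function names are assumptions, since the original formula is not available here.

```python
def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def scaled_laplacian(lap, lambda_max):
    """Rescale L to 2 L / lambda_max - I_N."""
    n = len(lap)
    return [[2.0 * lap[i][j] / lambda_max - (1.0 if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def chebyshev_terms(l_tilde, K):
    """T_0 .. T_{K-1} of the scaled Laplacian via T_k = 2 L~ T_{k-1} - T_{k-2}."""
    n = len(l_tilde)
    terms = [[[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]]
    if K > 1:
        terms.append([row[:] for row in l_tilde])
    for _ in range(2, K):
        prod = matmul(l_tilde, terms[-1])
        terms.append([[2.0 * prod[i][j] - terms[-2][i][j] for j in range(n)]
                      for i in range(n)])
    return terms[:K]


def cheb_graph_conv(x, lap, beta, attn, lambda_max=2.0):
    """Sum over k of beta[k] * (T_k(L~) * attn, element-wise) @ x."""
    terms = chebyshev_terms(scaled_laplacian(lap, lambda_max), len(beta))
    n, f = len(x), len(x[0])
    out = [[0.0] * f for _ in range(n)]
    for b_k, t_k in zip(beta, terms):
        masked = [[t_k[i][j] * attn[i][j] for j in range(n)] for i in range(n)]
        y = matmul(masked, x)
        for i in range(n):
            for j in range(f):
                out[i][j] += b_k * y[i][j]
    return out


# Two connected nodes: the normalized Laplacian is [[1, -1], [-1, 1]].
lap = [[1.0, -1.0], [-1.0, 1.0]]
ones = [[1.0, 1.0], [1.0, 1.0]]          # uniform attention, for illustration
x = [[1.0], [3.0]]                       # one feature per node
out = cheb_graph_conv(x, lap, beta=[0.5, 0.5], attn=ones)
```

With K terms, each node aggregates information from nodes up to K − 1 hops away, which is why K controls the spatial reach of the filter.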

Then, to obtain the features of adjacent time periods for each node of the undirected graph G, a standard convolutional network is applied along the time dimension. By passing the output of the graph convolution network to the standard convolutional network, the output of the standard convolutional network can be obtained by the following formula:

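where the output of the standard convolutional network is the mobile traffic data whose information has been updated with the features of adjacent time periods; * denotes the convolution operation, Φ₁ denotes the convolution-kernel parameters of the standard convolutional network, and ReLU(·) is the activation function.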

To improve training efficiency and reduce prediction error, a residual network is added to each spatio-temporal module in each time component. In the l-th spatio-temporal module, given the input X^(l) of the residual network, the output of the residual network can be obtained by the following formula:

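where the output of the residual network is the mobile traffic data after the residual-network convolution, and Φ₂ denotes the convolution-kernel parameters of the residual network.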

Then, by combining the output of the standard convolutional network with the output of the residual network, the output of the l-th spatio-temporal module is given by:
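The temporal convolution and the residual branch can be sketched for the time series of a single node. The sketch assumes that the standard convolution slides a kernel Φ₁ along the time axis with zero padding and is followed by ReLU, that the residual branch convolves the module input with Φ₂, and that the two branches are added and passed through ReLU; the way the branches are combined, the padding and the function names are assumptions. As is usual in deep learning, the kernel is applied without flipping.

```python
def relu(v):
    """Rectified linear unit."""
    return v if v > 0.0 else 0.0


def temporal_conv(series, kernel):
    """1-D convolution along time with zero padding (output keeps the input length)."""
    pad = len(kernel) // 2                   # assumes an odd kernel length
    padded = [0.0] * pad + list(series) + [0.0] * pad
    return [sum(k * padded[t + j] for j, k in enumerate(kernel))
            for t in range(len(series))]


def st_block_output(gcn_out, block_in, phi1, phi2):
    """ReLU(standard-convolution branch + residual branch) for one node."""
    main = [relu(v) for v in temporal_conv(gcn_out, phi1)]   # standard convolution
    res = temporal_conv(block_in, phi2)                      # residual convolution
    return [relu(m + r) for m, r in zip(main, res)]


gcn_out = [1.0, -2.0, 3.0]      # graph-convolution output over three time steps
block_in = [0.5, 0.5, -5.0]     # input of the spatio-temporal module
block_out = st_block_output(gcn_out, block_in,
                            phi1=[0.0, 1.0, 0.0],   # identity kernel, for illustration
                            phi2=[1.0])             # kernel of length one
```

The residual branch gives the input a short path to the output, so a module can fall back to passing its input through when the convolution branch contributes little.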

To ensure that the output dimension of each time component (i.e., the hourly, daily and weekly period components) is the same as that of the prediction target, a fully connected layer is added at the end of each time component. With the output of the last (i.e., the second) spatio-temporal module in the time component denoted X^(L), the output of the fully connected layer can be obtained by the following formula:

where the output of the fully connected layer is the final mobile traffic prediction of the time component, and W_f and B_f denote the weight matrix and the bias matrix of the fully connected layer, respectively.
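The role of the fully connected layer, mapping the features of each node to the length of the prediction target, can be sketched as an affine map. The form X^(L) W_f + B_f without an activation, the use of a bias vector shared by all nodes, and the function name are assumptions made for illustration.

```python
def fully_connected(x, w_f, b_f):
    """Map each node's feature vector to the target length: x @ w_f + b_f."""
    return [[sum(xi * w_f[k][j] for k, xi in enumerate(row)) + b_f[j]
             for j in range(len(b_f))] for row in x]


x_last = [[1.0, 2.0, 3.0],      # output of the second module: 2 nodes x 3 features
          [0.0, 1.0, 0.0]]
w_f = [[1.0], [0.0], [1.0]]     # 3 features -> t_p = 1 predicted step
b_f = [0.5]
y_component = fully_connected(x_last, w_f, b_f)
```

The number of columns of w_f fixes the output length, so choosing it equal to t_p makes the component output match the prediction target.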

A framework diagram of the time component in the mobile traffic prediction method based on a spatio-temporal attention convolutional network according to this embodiment is shown in Fig. 1.

Considering that the three time components may have different degrees of influence on different regions, the present embodiment fuses the outputs of the three time components (hourly, daily, and weekly). The final output can be obtained as follows:

where the three fused terms are the outputs of the hourly, daily and weekly period components, respectively, i.e., specific instances of the time-component output; W_h, W_d and W_w are learnable weight matrices used to adjust the degree of influence of the three time components on the prediction.

In addition, since mobile traffic during holidays may differ from that on ordinary days, external factors are also considered in the prediction method of this embodiment. In the external component, certain features are manually extracted from the external data set, such as holiday events and metadata (i.e., day of week, weekday and weekend). The external features are fed into a standard convolutional network and then a fully connected layer to obtain the output of the external component. The output of the external component is then combined directly with the fused output of the three time components to obtain the final prediction result. The final predicted mobile traffic can be represented by the following equation:
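The manual extraction of external features can be illustrated with the standard library. The sketch encodes a date as a one-hot day of week plus a weekend flag and a holiday flag; the feature layout, the holiday set and the function name are assumptions, and the subsequent convolution and fully connected layer of the external component are not shown.

```python
from datetime import date


def external_features(day, holidays):
    """One-hot day of week plus weekend and holiday flags for one date."""
    dow = day.weekday()                          # Monday = 0 ... Sunday = 6
    one_hot = [1.0 if i == dow else 0.0 for i in range(7)]
    is_weekend = 1.0 if dow >= 5 else 0.0
    is_holiday = 1.0 if day in holidays else 0.0
    return one_hot + [is_weekend, is_holiday]


holidays = {date(2020, 10, 1)}                   # illustrative holiday set (assumption)
features = external_features(date(2020, 10, 1), holidays)
```

A holiday that falls on a weekday, as in this example, is exactly the case where the day-of-week features alone would mislead the model.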

Finally, the spatio-temporal attention convolutional network takes the mean square error between the predicted estimate and the true value Y as the loss function. The loss function can be defined as:

where θ denotes all learnable parameters of the spatio-temporal attention convolutional network, including W_t1, W_t2, W_s1, W_s2, W_h, W_d and W_w.

The spatio-temporal attention convolutional network adjusts its learnable parameters by minimizing the loss function.
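The mean square error used as the training objective can be written in a few lines. The sketch averages the squared differences over all nodes and time steps; whether the original formula averages or sums is not stated here, so the normalization is an assumption, as is the function name.

```python
def mse_loss(y_pred, y_true):
    """Mean of squared differences over all nodes and time steps."""
    diffs = [(p - t) ** 2
             for row_p, row_t in zip(y_pred, y_true)
             for p, t in zip(row_p, row_t)]
    return sum(diffs) / len(diffs)


y_pred = [[1.0, 2.0], [3.0, 4.0]]   # 2 nodes x 2 predicted steps
y_true = [[1.0, 0.0], [3.0, 8.0]]
loss = mse_loss(y_pred, y_true)
```

Because the errors are squared, a single large miss (here 4 versus 8) dominates the loss, which pushes training towards avoiding large deviations.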

The mobile flow prediction method based on the space-time attention convolutional network provided by the embodiment of the invention at least comprises the following technical effects:

The invention fully takes into account the high nonlinearity and complexity of mobile traffic data. The spatio-temporal attention convolutional network models the mobile traffic data over an hourly period, a daily period and a weekly period through three time components, and each time component consists of a spatio-temporal attention module and a spatio-temporal convolution module. The spatio-temporal attention module captures the dynamic spatio-temporal correlation of the mobile traffic; the spatio-temporal convolution module captures the spatio-temporal features of the mobile traffic using a graph convolution and a standard convolution. The results of the three components are fused with the external interference factors to obtain the final mobile traffic prediction, so that the problem of mobile traffic prediction can be effectively solved.

Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention has been described in detail with reference to examples, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention, which should be covered by the claims of the present invention.

Claims

1. A mobile traffic prediction method based on a space-time attention convolutional network is characterized in that the space-time attention convolutional network models mobile traffic networks of an hour period, a day period and a week period through three time components respectively and obtains corresponding three mobile traffic prediction information; fusing the three mobile traffic prediction information and external interference information to obtain a final mobile traffic prediction result;

each time component comprises two spatio-temporal modules and a fully connected layer connected in sequence; each spatio-temporal module comprises a spatio-temporal attention module and a spatio-temporal convolution module; the dynamic spatio-temporal correlation of the mobile traffic is captured by the spatio-temporal attention module, and the spatio-temporal features of the mobile traffic are captured by the spatio-temporal convolution module; the fully connected layer ensures that the output dimension of the time component is the same as the dimension of the prediction target;

the space-time attention module consists of a time attention mechanism and a space attention mechanism;

where the output of the temporal attention mechanism is the mobile traffic data after the temporal correlation has been obtained, W_t1 and W_t2 are learnable parameters of the temporal attention mechanism, B_t is its bias matrix, and σ₁ and σ₂ denote the sigmoid activation function and the normalized exponential (softmax) function, respectively;

the spatial attention mechanism obtains, in the spatial dimension, the correlation between different nodes in this temporally attended traffic data;

where A_s denotes the normalized attention matrix, W_s1 and W_s2 are learnable parameters of the spatial attention mechanism, and B_s is the bias matrix of the spatial attention mechanism.

2. The method for predicting the mobile traffic based on the spatio-temporal attention convolutional network as claimed in claim 1, wherein the region to be predicted is divided into a plurality of sub-regions, and each sub-region is used as a node in the mobile traffic network of the region to be predicted;

3. The method for predicting mobile traffic based on spatio-temporal attention convolutional network of claim 1, wherein the three time components are respectively an hour period component, a day period component, and a week period component;

the data of Δh hours before the current time τ is taken as the input of the hourly period component, denoted X_h = (X_{τ−Δh+1}, X_{τ−Δh+2}, ..., X_τ), 1 ≤ Δh ≤ 24 − t_p; X_h serves as the input traffic of the first spatio-temporal module in the hourly period component;

the data of Δd days before time t_d = τ + t_p − 24 is taken as the input of the daily period component, denoted X_d = (X_{t_d−24×Δd+1}, X_{t_d−24×Δd+2}, ..., X_{t_d}), 1 ≤ Δd ≤ 7; X_d serves as the input traffic of the first spatio-temporal module in the daily period component;

X_w serves as the input traffic of the first spatio-temporal module in the weekly period component.

4. The method for predicting the mobile traffic based on the spatio-temporal attention convolution network as claimed in claim 3, wherein in the spatio-temporal convolution module, the spatio-temporal characteristics of the mobile traffic are obtained sequentially through a graph convolution network and a standard convolution network;

where the output of the graph convolution network is the mobile traffic data whose information has been updated with the features of neighbouring nodes; β_k is the polynomial coefficient, I_N is the identity matrix, L is the normalized Laplacian matrix of the undirected graph G, λ_max is the largest eigenvalue of L, T_k(x) is the Chebyshev polynomial, and K is the number of terms of T_k(x);

the output of the standard convolutional network is represented as:

wherein,

5. The mobile traffic prediction method based on the spatio-temporal attention convolutional network of claim 4, wherein each of the spatio-temporal modules further comprises a residual network, which improves training efficiency and reduces prediction error;

the output of the residual network is represented as:

where the output of the residual network is the mobile traffic data after the residual-network convolution, and Φ₂ denotes the convolution-kernel parameters of the residual network;

where X^(l+1) is the output of the l-th spatio-temporal module.

6. The mobile traffic prediction method based on the spatio-temporal attention convolutional network of claim 5, wherein the output of the fully connected layer is expressed as:

where the output of the fully connected layer is the output of the time component and represents the mobile traffic prediction information of that time component; W_f and B_f denote the weight matrix and the bias matrix of the fully connected layer, respectively, and X^(L) denotes the output of the second spatio-temporal module in the time component.

7. The mobile traffic prediction method based on the spatio-temporal attention convolutional network as claimed in claim 6, wherein the specific implementation manner of fusing the three mobile traffic prediction information and the external interference information to obtain the final mobile traffic prediction result is as follows:

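where the fused result denotes the total prediction information of the three time components; the outputs of the hourly period component, the daily period component and the weekly period component are specific instances of the time-component output; W_h, W_d and W_w are learnable weight matrices;

where the estimate is taken as the final mobile traffic prediction result and is obtained from the total prediction information together with the external interference information.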

8. The mobile traffic prediction method based on the spatio-temporal attention convolutional network of claim 7, wherein the spatio-temporal attention convolutional network uses the mean square error between the predicted estimate and the true value Y as the loss function;

the loss function is expressed as: