CN116596109A - Traffic flow prediction model based on a gated temporal convolutional network - Google Patents
Traffic flow prediction model based on a gated temporal convolutional network
- Publication number
- CN116596109A (application number CN202310352647.XA)
- Authority
- CN
- China
- Prior art keywords
- time
- dependence
- traffic flow
- model
- tcn
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
- G08G1/065—Traffic control systems for road vehicles by counting the vehicles in a section of the road or in a parking area, i.e. comparing incoming count with outgoing count
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/40—Business processes related to the transportation industry
-
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
- G08G1/01—Detecting movement of traffic to be counted or controlled
- G08G1/0104—Measuring and analyzing of parameters relative to traffic conditions
- G08G1/0125—Traffic data processing
- G08G1/0133—Traffic data processing for classifying traffic situation
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention relates to a traffic flow prediction model based on a gated temporal convolutional network (G-TCN). The G-TCN model separately models the temporal dependence, the spatial dependence, and the long-sequence behavior of traffic flow. Temporal and spatial dependence are captured with a temporal convolutional network (TCN) and a graph convolutional network (GCN), respectively, while the STA-Block module models the spatio-temporal dependence of long sequences through a spatio-temporal attention mechanism and a gated fusion mechanism. An adaptive adjacency matrix is constructed in the G-TCN model and learned through node embeddings, so that the model can accurately capture the hidden spatio-temporal dependencies in traffic flow data.
Description
Technical Field
The invention relates to the technical field of traffic flow prediction, and in particular to a traffic flow prediction model based on a gated temporal convolutional network.
Background
Traffic flow prediction aims at forecasting future traffic conditions in a road network based on historical observations. Predicting the traffic conditions of a road section over a future horizon provides great help in signal control, traffic guidance, path planning, and related applications.
ARIMA is a classical statistical model in time series analysis and is widely used for traffic flow prediction. Researchers have extended ARIMA into the spatial domain to obtain the spatio-temporal autoregressive integrated moving average model. However, since time series analysis models are purely inductive methods, they require idealized prior assumptions; and due to the complexity and nonlinear nature of traffic data, these methods often perform poorly in practical applications.
Spatio-temporal graph modeling has wide application in complex-system problems such as traffic speed prediction. For example, in traffic speed prediction, the speed sensors on urban roads form a graph whose edge weights are calculated from the Euclidean distances between pairs of nodes. Since congestion on a road reduces the traffic speed on its entrance roads, the graph structure underlying the model is treated as prior knowledge of the interdependencies between nodes when modeling the traffic speed time series on each road.
At present, research on spatio-temporal graph modeling has two main directions: one integrates a graph convolutional network (GCN) into a recurrent neural network (RNN); the other integrates the GCN into a convolutional neural network (CNN). While these studies demonstrate the effectiveness of introducing the graph structure of the data into a model, the approaches still face two major drawbacks. (1) The graph structure assumed in these studies may be unreliable: an edge may connect two nodes that share no real interdependence, while two genuinely interdependent nodes may lack a connection. (2) Current spatio-temporal graph models are ineffective at learning temporal dependencies; RNN-based methods are prone to vanishing gradients when capturing long sequences.
The invention provides a traffic flow prediction model based on a gated temporal convolutional network (G-TCN), which consists of a gated temporal convolution module (TCN), a graph convolution module (GCN) and spatio-temporal attention modules (STA-Blocks), where each STA-Block consists of a spatial attention mechanism, a temporal attention mechanism and a gated fusion mechanism. In addition, a graph convolution layer is provided in which an adaptive adjacency matrix is learned from data through end-to-end supervised training. The invention employs stacked dilated causal convolutions to capture temporal dependence; supported by the dilated causal convolutional network, the G-TCN model can effectively and efficiently process spatio-temporal graph data with long time sequences.
Disclosure of Invention
In order to solve the above technical problems, the technical scheme provided by the invention is as follows: a traffic flow prediction model based on a gated temporal convolutional network, in which the combined prediction model (G-TCN) separately models the temporal dependence, the spatial dependence and the long-sequence behavior of traffic flow. Temporal and spatial dependence are captured by the TCN and GCN respectively, and the STA-Block module models the spatio-temporal dependence of long sequences through a spatio-temporal attention mechanism and a gated fusion mechanism. An adaptive adjacency matrix is constructed in the G-TCN model and learned through node embeddings, so that the model can accurately capture the hidden spatio-temporal dependencies in traffic flow data.
The invention has the following advantages:
1. the invention constructs an adaptive adjacency matrix that automatically discovers hidden graph structures from data without any prior knowledge guidance.
2. An efficient framework is presented to capture spatio-temporal dependencies simultaneously. The invention combines the proposed GCN with the dilated causal convolutional network, so that each graph convolution layer extracts the spatial correlations of node information at the granularity level produced by its corresponding dilated causal convolution layer.
3. A spatial attention mechanism and a temporal attention mechanism are proposed to learn the dynamic spatial correlation and nonlinear temporal dependence, respectively, in traffic flow data. In addition, a gating fusion mechanism is designed to adaptively fuse information extracted by a spatio-temporal attention mechanism to reduce error propagation during prediction.
Drawings
FIG. 1 is an overall framework diagram of the G-TCN model.
FIG. 2 is a diagram of the dilated causal convolution network.
FIG. 3 is a diagram of the Gated TCN framework.
Fig. 4 is an ST-Conv Block frame diagram.
Fig. 5 is a STA-Block framework diagram.
Detailed Description
The present invention will be described in further detail with reference to examples.
1 Model
The G-TCN model captures the temporal characteristics of traffic flow using the TCN, and adopts an encoder-decoder structure to capture its spatial characteristics; the encoder and decoder are composed of several STA-Blocks to model the influence of spatio-temporal factors on traffic conditions. The encoder encodes the input traffic flow features, and the decoder predicts the output sequence. Between the encoder and the decoder, a transform attention mechanism converts the encoded traffic features into a representation of the future time steps that serves as input to the decoder.
2 Problem definition
In the present invention, the traffic road network is defined as a graph G = (V, E), where V is the set of road nodes, E is the set of edges, and A ∈ R^(N×N) is the adjacency matrix of G. If v_i, v_j ∈ V and (v_i, v_j) ∈ E, then A_ij = 1; otherwise A_ij = 0. At each time step t, graph G has a dynamic feature matrix X^(t) ∈ R^(N×D). Given graph G, the traffic flow prediction problem aims at learning a function f that maps S historical graph signals to the next T graph signals:

[X^((t-S+1)), ..., X^((t)); G] → f → [X^((t+1)), ..., X^((t+T))]  (1)

where X^((t-S):t) ∈ R^(N×D×S) and X^((t+1):(t+T)) ∈ R^(N×D×T).
2.1G-TCN model building
The G-TCN model consists of an input layer, stacked spatio-temporal convolution layers, the ST-Conv Block, the STA-Block and an output layer; each stacked layer is connected to the ST-Conv Block through skip connections. Each spatio-temporal layer is composed of a GCN layer and a gated temporal convolution layer (Gated TCN), where the Gated TCN consists of two parallel temporal convolution layers. The ST-Conv Block contains three spatio-temporal convolution blocks, capturing many-to-one effects from three different angles corresponding to space, time and space-time. The STA-Block combines spatial and temporal attention mechanisms through gated fusion. By stacking multiple spatio-temporal layers, the G-TCN can handle spatial correlations at different temporal levels. Spatial features in historical traffic flow data are captured by the GCN, whose input h is a three-dimensional tensor of size [N, C, L], where N is the number of nodes, C the hidden-layer dimension, and L the length of the data sequence.
The present invention uses the mean absolute error (MAE) as the loss function of the G-TCN, defined over the prediction horizon as:

L = (1 / (T·N·D)) Σ_i Σ_j Σ_k |Ŷ_ijk − Y_ijk|

The G-TCN model takes the predicted sequence X̂^((t+1):(t+T)) as the overall output. The receptive field size of the G-TCN model equals the input sequence length, so in the last spatio-temporal layer the output time dimension equals 1, and the number of output channels of the last layer is set to the step size T to obtain the dimension required for the output.
2.2 Temporal convolutional network (TCN)
The present invention captures the temporal trend of each node using dilated causal convolution as the temporal convolution layer (TCN). The dilated causal convolutional network achieves a larger receptive field by increasing layer depth, can correctly handle long-term sequences, and alleviates the gradient explosion problem. The dilated causal convolution operation slides over the input while skipping values at a specific step. Mathematically, given a one-dimensional sequence input x ∈ R^T and a filter f ∈ R^K, the dilated causal convolution of x and f at step t is given by equation (2):

(x ⋆ f)(t) = Σ_{k=0}^{K−1} f(k) · x(t − d·k)  (2)

where d is the dilation factor controlling the skip distance.
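As an illustrative sketch (not part of the claimed invention), the dilated causal convolution of equation (2) can be written directly in NumPy; the function name and the left zero-padding choice are assumptions for demonstration:

```python
import numpy as np

def dilated_causal_conv(x, f, d):
    """y[t] = sum_k f[k] * x[t - d*k], with x implicitly zero-padded on the left (causal)."""
    T, K = len(x), len(f)
    y = np.zeros(T)
    for t in range(T):
        for k in range(K):
            idx = t - d * k          # skip values at distance d (the dilation factor)
            if idx >= 0:
                y[t] += f[k] * x[idx]
    return y

# With d=2 and kernel [1, 1], each output mixes x[t] and x[t-2]:
print(dilated_causal_conv(np.array([1., 2., 3., 4.]), np.array([1., 1.]), 2))  # prints [1. 2. 4. 6.]
```

Because no output depends on a future input, the operation is causal; increasing d widens the receptive field without adding parameters.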
Gated TCN: the gating mechanism can control information of each layer in the TCN, and the Gated TCN only comprises one output gate. Given input X E N×D×S The form is as follows:
h=g(θ 1 *X+b)·sigmoid(θ 2 *X+c) (4)
wherein θ 1 ,θ 2 B and c are model parameters and g (·) is the activation function of the output.
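A minimal element-wise sketch of the gating algebra in equation (4), with g chosen as tanh (an assumption; the patent leaves g(·) generic) and scalar parameters standing in for the convolution filters:

```python
import numpy as np

def gated_tcn(x, theta1, b, theta2, c):
    """h = g(theta1*x + b) * sigmoid(theta2*x + c); the sigmoid branch is the output gate."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    return np.tanh(theta1 * x + b) * sigmoid(theta2 * x + c)
```

In the full model Θ₁ and Θ₂ are convolution filters applied along the time axis; scalars are used here only to show how the gate scales the activated branch between 0 and 1.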
2.3 Graph convolutional network (GCN)
The GCN is a basic operation that extracts node features from structural information, smoothing each node's signal by aggregating and transforming its neighborhood information. Let Ã denote the normalized adjacency matrix, X ∈ R^(N×D) the input signal, Z ∈ R^(N×M) the output, and W ∈ R^(D×M) the model parameter matrix; the GCN layer is defined as:

Z = ÃXW  (5)

The diffusion process of the graph signal is modeled with K finite steps. Generalizing the diffusion convolution network yields equation (6):

Z = Σ_{k=0}^{K} P^k X W_k  (6)

where P^k denotes the power series of the transition matrix. In an undirected graph, P = A / rowsum(A). In a directed graph, the diffusion process splits into a forward and a backward direction, with forward transition matrix P_f = A / rowsum(A) and backward transition matrix P_b = Aᵀ / rowsum(Aᵀ). The diffusion graph convolution network can thus be defined as:

Z = Σ_{k=0}^{K} (P_f^k X W_{k1} + P_b^k X W_{k2})  (7)
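A sketch of the K-step diffusion convolution of equation (6) for the undirected case; the function and argument names are assumptions, and one weight matrix is passed per diffusion step:

```python
import numpy as np

def diffusion_gcn(A, X, W_list):
    """Z = sum_k P^k X W_k with P = A / rowsum(A)  (undirected case of eq. 6)."""
    P = A / A.sum(axis=1, keepdims=True)    # row-normalized transition matrix
    N = A.shape[0]
    Pk = np.eye(N)                          # P^0
    Z = np.zeros((N, W_list[0].shape[1]))
    for W in W_list:
        Z += Pk @ X @ W                     # aggregate k-hop neighborhood information
        Pk = Pk @ P                         # advance one diffusion step
    return Z
```

The directed-graph form of equation (7) would simply run this twice, once with P_f and once with P_b, and sum the results.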
2.4 adaptive adjacency matrix
In the present invention an adaptive adjacency matrix Ã_adp is proposed. The matrix requires no prior knowledge and is learned end-to-end by stochastic gradient descent: two node-embedding dictionaries with learnable parameters E₁, E₂ ∈ R^(N×c) are randomly initialized to mine the hidden spatial features of traffic flow. The adaptive adjacency matrix is given in equation (8):

Ã_adp = softmax(ReLU(E₁E₂ᵀ))  (8)

where E₁ is the source node embedding and E₂ the target node embedding; the ReLU activation eliminates weak connections, and the softmax function normalizes the adaptive adjacency matrix. By combining predefined spatial dependencies with the self-learned hidden spatial features, the invention proposes the following graph convolution layer:

Z = Σ_{k=0}^{K} (P_f^k X W_{k1} + P_b^k X W_{k2} + Ã_adp^k X W_{k3})  (9)

When the graph structure is unavailable, the invention proposes to use the adaptive adjacency matrix alone to capture hidden spatial dependencies, as shown in equation (10):

Z = Σ_{k=0}^{K} Ã_adp^k X W_k  (10)
2.5 space-time convolutional network block
The spatio-temporal convolution network block contains three convolution kernels that capture many-to-one effects from three different angles corresponding to space, time, and space-time. The temporal kernel captures the temporal dependence of traffic flow at the same location, and the spatial kernel captures the spatial dependence of traffic flow at neighboring locations within the same time step. Each spatio-temporal convolution block takes the output of the previous spatio-temporal attention block as input, and its output is calculated by equation (11), where the f×f kernel is the spatio-temporal kernel, the f×1 kernel is the temporal kernel, and the 1×f kernel is the spatial kernel; LeakyReLU(·) denotes the leaky rectified linear unit function and ⋆ denotes the convolution operation. Finally, the outputs of the three convolution kernels are concatenated and a 1×1 convolution is applied to compress the features while limiting the number of channels.
2.6 space-time attention block
The ST-Attention Block comprises a spatial attention mechanism, a temporal attention mechanism and a gated fusion mechanism. The input of the l-th block is denoted H^(l−1), and the hidden state of vertex v_i at time step t_j is denoted h_{v_i,t_j}^(l−1). The outputs of the spatial and temporal attention mechanisms in the l-th block are denoted HS^(l) and HT^(l), with the corresponding hidden states of vertex v_i at time step t_j denoted hs_{v_i,t_j}^(l) and ht_{v_i,t_j}^(l). After gated fusion, the output of the l-th block is obtained, denoted H^(l).
The invention designs a spatial attention mechanism to adaptively capture the correlations between sensors in the road network, dynamically assigning different weights to different road segments at different time steps. For vertex v_i at time step t_j, a weighted sum over all vertices is calculated:

hs_{v_i,t_j}^(l) = Σ_{v∈V} α_{v_i,v} · h_{v,t_j}^(l−1)  (12)

where V is the set of all vertices and α_{v_i,v} is the attention score; the attention scores for a vertex sum to 1.
The attention score is learned from both the traffic features and the graph structure, and scaled dot-product attention is used to compute the relevance between vertices v_i and v:

s_{v_i,v} = ⟨h_{v_i,t_j} ∥ e_{v_i,t_j}, h_{v,t_j} ∥ e_{v,t_j}⟩ / √(2D)  (13)

where ∥ denotes the concatenation operation, ⟨·,·⟩ denotes the inner product, e denotes a spatio-temporal embedding, and 2D is the dimension of the concatenated vectors. The relevance s_{v_i,v} is then normalized by the softmax function:

α_{v_i,v} = exp(s_{v_i,v}) / Σ_{v_r∈V} exp(s_{v_i,v_r})  (14)

After the attention score α_{v_i,v} is obtained, the hidden state is updated by the weighted sum in equation (12).
To stabilize the learning process, the invention extends the spatial attention mechanism to a multi-head attention mechanism. Specifically, K parallel attention heads are concatenated, each with different learnable projections f_s^(k), g_s^(k) and h_s^(k), representing three different nonlinear mappings in the k-th attention head and each producing an output of dimension d = D/K.
The present invention designs a temporal attention mechanism to adaptively model the nonlinear correlations between different time steps. The attention scores are computed with the multi-head approach by measuring the relevance between time steps from both the traffic features and the time information. For vertex v_i, the relevance between time steps t_j and t is defined as:

u_{t_j,t} = ⟨f_t(h_{v_i,t_j}), g_t(h_{v_i,t})⟩ / √d  (19)

β_{t_j,t} = exp(u_{t_j,t}) / Σ_{t_r∈N_{t_j}} exp(u_{t_j,t_r})  (20)

where u_{t_j,t} denotes the relevance between time steps t_j and t, β_{t_j,t} is the attention score of the k-th head indicating the importance of time step t to t_j, f_t(·) and g_t(·) are two different learnable transforms, and N_{t_j} denotes the set of time steps before t_j. After the attention scores are obtained, the hidden state of vertex v_i at time step t_j is updated as:

ht_{v_i,t_j}^(l) = Σ_{t∈N_{t_j}} β_{t_j,t} · h_t(h_{v_i,t})

where h_t(·) denotes a nonlinear projection; by parallel computation, the learnable parameters in equations (19), (20) and (21) are shared across all vertices and time steps.
The invention designs a gated fusion mechanism that adaptively fuses the spatial and the temporal attention mechanism. In the l-th STA-Block, the outputs of the spatial and temporal attention mechanisms are denoted HS^(l) and HT^(l), and they are fused by equation (21):

H^(l) = z ⊙ HS^(l) + (1 − z) ⊙ HT^(l),  z = sigmoid(HS^(l) W_{z,1} + HT^(l) W_{z,2} + b_z)  (21)

where W_{z,1} ∈ R^(D×D), W_{z,2} ∈ R^(D×D) and b_z ∈ R^D are learnable parameters, and z denotes the gate.
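A sketch of the gated fusion in equation (21); the argument names are assumptions, and the sigmoid gate trades off the spatial branch against the temporal branch element by element:

```python
import numpy as np

def gated_fusion(HS, HT, Wz1, Wz2, bz):
    """H = z * HS + (1 - z) * HT with gate z = sigmoid(HS @ Wz1 + HT @ Wz2 + bz)  (eq. 21)."""
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    z = sigmoid(HS @ Wz1 + HT @ Wz2 + bz)   # per-element weight on the spatial branch
    return z * HS + (1.0 - z) * HT
```

With zero weights and bias the gate sits at 0.5 and the fusion reduces to a plain average of the two branches, which shows why the learned gate is what lets the model favor one mechanism over the other.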
3. Experiment
3.1 data description
The present invention uses the METR-LA dataset and the PEMS-BAY dataset to verify the performance of the proposed G-TCN model.
METR-LA records four months of traffic speed data from 207 sensors; PEMS-BAY contains six months of traffic speed information from 325 sensors. In the experiments, each dataset was divided 7:2:1 into training, test and validation sets, and traffic speeds were predicted at horizons of 15 minutes, 30 minutes and 60 minutes.
The present invention uses min-max normalization to scale the data into [0, 1]. The normalization formula is:

x̃_i = (x_i − x_min) / (x_max − x_min)

where x_i is the i-th raw data point, x_max and x_min are the maximum and minimum values of the raw data, respectively, and x̃_i is the normalized input.
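The normalization is one line of NumPy (function name assumed):

```python
import numpy as np

def min_max_normalize(x):
    """Map raw data into [0, 1] via (x - x_min) / (x_max - x_min)."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())
```

At prediction time the stored x_min and x_max from the training data would be reused to invert the scaling back to real traffic speeds.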
3.2 Experimental setup
The invention is implemented on the PyTorch deep learning framework, with model construction and training completed in the PyCharm development environment. An 8-layer G-TCN network is used, with the dilation factor sequence 1, 2, 1, 2, 1, 2, 1, 2. Equation (6) is used as the GCN layer, with diffusion step K = 2. The model is trained using the Adam optimizer with an initial learning rate of 0.001 and a dropout rate p = 0.3.
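For stacked dilated causal convolutions, the receptive field grows as 1 + (K − 1)·Σd over the dilation sequence. Assuming a kernel size of 2 (the patent does not state the kernel size), the 8-layer sequence 1, 2, 1, 2, 1, 2, 1, 2 covers 13 time steps:

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of stacked dilated causal conv layers: 1 + (K - 1) * sum(d)."""
    return 1 + (kernel_size - 1) * sum(dilations)

print(receptive_field(2, [1, 2, 1, 2, 1, 2, 1, 2]))  # prints 13
```

This is the quantity the text in section 2.1 requires to equal the input sequence length so that the last spatio-temporal layer collapses the time dimension to 1.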
The invention evaluates the error between the actual traffic speed and the predicted result using the following metrics:
Mean absolute error: MAE = (1/n) Σ_{i=1}^{n} |y_i − ŷ_i|
Root mean square error: RMSE = √((1/n) Σ_{i=1}^{n} (y_i − ŷ_i)²)
Mean absolute percentage error: MAPE = (100%/n) Σ_{i=1}^{n} |(y_i − ŷ_i) / y_i|
where y_i and ŷ_i are the actual and predicted traffic speeds, respectively, and n is the number of observations.
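The three evaluation metrics translate directly into NumPy (function names assumed):

```python
import numpy as np

def mae(y, y_hat):
    """Mean absolute error."""
    return np.mean(np.abs(y - y_hat))

def rmse(y, y_hat):
    """Root mean square error; penalizes large deviations more than MAE."""
    return np.sqrt(np.mean((y - y_hat) ** 2))

def mape(y, y_hat):
    """Mean absolute percentage error, in percent; assumes no zero entries in y."""
    return 100.0 * np.mean(np.abs((y - y_hat) / y))
```

Note that MAPE is undefined where the true speed is zero, so near-zero observations are typically masked before evaluation.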
3.3 Baseline model
The present invention compares the G-TCN model with the following models:
(1) HA: historical average model. The average traffic information for the historical period is used as a prediction.
(2) VAR: vector autoregressions.
(3) SVR: support vector regression; a linear support vector machine is trained to learn the input-output relationship for traffic flow prediction.
(4) FNN: feedforward neural network with two hidden layers and L2 regularization.
(5) ARIMA: an autoregressive integrated moving average model with a Kalman filter.
(6) FC-LSTM: recurrent neural networks with fully connected LSTM hidden units.
(7) WaveNet: convolutional network architecture for sequence data.
(8) Graph WaveNet: combines graph convolution with dilated causal convolution.
(9) STGCN: space-time diagram convolution networks combine GCN and 1D convolutions.
(10) ASTGCN: based on the space-time diagram convolutional network of attention, a space-time attention mechanism is integrated into the STGCN for capturing dynamic space-time patterns.
(11) STSGCN: the spatio-temporal synchronous graph convolution network captures spatio-temporal correlations by stacking multiple local GCN layers with adjacency matrices along the time axis.
3.4 experimental results and analysis
The predictive performance of the G-TCN model and the baseline models on both datasets was compared at horizons of 15, 30 and 60 minutes. From Table 1 it can be observed that the deep learning methods outperform the traditional time series methods and machine learning models, demonstrating the ability of deep neural networks to model nonlinear traffic flow data. Among the deep learning methods, graph-based models such as Graph WaveNet and STGCN generally perform better than the FC-LSTM model, with Graph WaveNet achieving the best baseline results on both datasets and far surpassing HA, ARIMA and FC-LSTM. The G-TCN model of the invention outperforms the conventional convolution-based methods (such as STGCN), showing that it better captures the spatio-temporal dependence in traffic flow data.
TABLE 1 comparison of Performance of different traffic flow prediction models on two sets of data sets
In contrast to the baseline models, the G-TCN model employs stacked spatio-temporal layers that include GCN layers with different parameters. Each GCN layer can therefore focus on its own temporal input range, and the G-TCN model achieves the best predictive performance. Across the different prediction horizons, the G-TCN model maintains superior accuracy, demonstrating its effectiveness in capturing the spatio-temporal correlations of traffic flow data.
While the invention has been described in detail in the foregoing general description and with reference to specific embodiments thereof, it will be apparent to one skilled in the art that modifications and improvements can be made thereto. Accordingly, such modifications or improvements may be made without departing from the spirit of the invention and are intended to be within the scope of the invention as claimed.
Claims (2)
1. A traffic flow prediction model based on a gated temporal convolutional network, characterized in that: the combined traffic flow prediction model (G-TCN) separately models the temporal dependence, the spatial dependence and the long-sequence behavior of traffic flow; the temporal and spatial dependence of traffic flow are captured using a temporal convolutional network (TCN) and a graph convolutional network (GCN) respectively, and the STA-Block module models the spatio-temporal dependence of long sequences through a spatio-temporal attention mechanism and a gated fusion mechanism; an adaptive adjacency matrix is constructed in the G-TCN model and learned through node embeddings, so that the model can accurately capture the hidden spatio-temporal dependencies in traffic flow data.
2. The traffic flow prediction model based on a gated temporal convolutional network according to claim 1, characterized in that the combined prediction model is constructed as follows: three modules separately model the temporal dependence, the spatial dependence and the long-sequence behavior of traffic flow; temporal and spatial dependence are captured by the TCN and GCN respectively, while the STA-Block models the spatio-temporal dependence of long sequences through a spatio-temporal attention mechanism and a gated fusion mechanism; an adaptive adjacency matrix is constructed in the G-TCN model and learned through node embeddings, so that the model can accurately capture hidden spatio-temporal dependencies; finally, the prediction sequence is output through a linear layer.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310352647.XA CN116596109A (en) | 2023-04-04 | 2023-04-04 | Traffic flow prediction model based on gating time convolution network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310352647.XA CN116596109A (en) | 2023-04-04 | 2023-04-04 | Traffic flow prediction model based on gating time convolution network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116596109A (en) | 2023-08-15 |
Family
ID=87594433
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310352647.XA Pending CN116596109A (en) | 2023-04-04 | 2023-04-04 | Traffic flow prediction model based on gating time convolution network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116596109A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117290706A (en) * | 2023-10-31 | 2023-12-26 | 兰州理工大学 | Traffic flow prediction method based on space-time convolution fusion probability sparse attention mechanism |
CN117579324A (en) * | 2023-11-14 | 2024-02-20 | 湖北华中电力科技开发有限责任公司 | Intrusion detection method based on gating time convolution network and graph |
CN117579324B (en) * | 2023-11-14 | 2024-04-16 | 湖北华中电力科技开发有限责任公司 | Intrusion detection method based on gating time convolution network and graph |
CN118053518A (en) * | 2024-04-16 | 2024-05-17 | 之江实验室 | Chemical process time series data causal relationship graph construction method, device and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108205889B (en) | Method for predicting highway traffic flow based on convolutional neural network | |
CN116596109A (en) | Traffic flow prediction model based on gating time convolution network | |
CN110309732B (en) | Behavior identification method based on skeleton video | |
CN113313947B (en) | Road condition evaluation method of short-term traffic prediction graph convolution network | |
CN110851782A (en) | Network flow prediction method based on lightweight spatiotemporal deep learning model | |
Liu et al. | Time series prediction based on temporal convolutional network | |
CN110570035B (en) | People flow prediction system for simultaneously modeling space-time dependency and daily flow dependency | |
CN110737968A (en) | Crowd trajectory prediction method and system based on deep convolutional long and short memory network | |
CN115578851A (en) | Traffic prediction method based on MGCN | |
Wei et al. | Learning motion rules from real data: Neural network for crowd simulation | |
CN113239897B (en) | Human body action evaluation method based on space-time characteristic combination regression | |
CN115376317B (en) | Traffic flow prediction method based on dynamic graph convolution and time sequence convolution network | |
CN114169647A (en) | Traffic prediction method and system of continuous memory self-adaptive heterogeneous space-time diagram convolution | |
CN111242395A (en) | Method and device for constructing prediction model for OD (origin-destination) data | |
CN113505924A (en) | Information propagation prediction method and system based on cascade spatiotemporal features | |
Xu et al. | AGNP: Network-wide short-term probabilistic traffic speed prediction and imputation | |
CN114495500A (en) | Traffic prediction method based on dual dynamic space-time diagram convolution | |
CN116935649A (en) | Urban traffic flow prediction method for multi-view fusion space-time dynamic graph convolution network | |
Chen et al. | Pedestrian behavior prediction model with a convolutional LSTM encoder–decoder | |
US20220358346A1 (en) | Systems, methods, and media for generating and using spiking neural networks with improved efficiency | |
CN114912169B (en) | Industrial building heat supply autonomous optimization regulation and control method based on multisource information fusion | |
CN115762147A (en) | Traffic flow prediction method based on adaptive graph attention neural network | |
CN115840261A (en) | Typhoon precipitation short-term prediction model optimization and prediction method | |
ABBAS | A survey of research into artificial neural networks for crime prediction | |
CN115456314A (en) | Atmospheric pollutant space-time distribution prediction system and method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||