CN116307152A

CN116307152A - Traffic prediction method for space-time interactive dynamic graph attention network

Info

Publication number: CN116307152A
Application number: CN202310209521.7A
Authority: CN
Inventors: 韩雪; 丁治明; 郭黎敏; 陈雅君
Original assignee: Beijing University of Technology
Current assignee: Beijing University of Technology
Priority date: 2023-03-07
Filing date: 2023-03-07
Publication date: 2023-06-23

Abstract

The invention discloses a traffic prediction method based on a spatio-temporal interactive dynamic graph attention network. It provides an interactive learning module that performs interleaved downsampling on the acquired traffic data by time interval, generates a new dynamic graph through an interactive learning strategy, and synchronously captures the spatio-temporal dependence of the partitioned traffic data. It also provides a dynamic graph attention module that, building on a dynamic graph generation module, captures the dynamically changing spatial correlations in the traffic network. The dynamic graph generation module generates a dynamic graph structure from historical information and the input data, and this structure can deeply mine dynamic correlations in the network topology. The invention can improve the accuracy of traffic prediction, relieve traffic pressure and improve the resilience of urban traffic.

Description

Traffic prediction method for space-time interactive dynamic graph attention network

Technical Field

The invention belongs to the technical field of traffic prediction and the field of deep learning, and particularly relates to a traffic prediction method of a space-time interactive dynamic graph attention network.

Background

The control and guidance of traffic flow is an important research direction in intelligent transportation. Real-time, accurate traffic flow prediction is the basis of traffic guidance, an important means of dispersing vehicles in congested areas, and one of the core components of intelligent transportation. Short-term traffic flow prediction relies on a big-data environment: by means of artificial intelligence, cloud computing and related techniques, the traffic flow on a given road section, or even across a whole city, can be accurately predicted for a future period. This effectively improves the travel efficiency and safety of urban vehicles, and helps urban traffic management departments balance the traffic load of the road network, coordinate traffic resources, maximize the transport efficiency of the road network, and reduce congestion. Research on the theory and methods of road-network traffic flow prediction therefore has great practical value and significance.

In recent years, with continued socioeconomic development, large cities have been evolving into smart cities, and Intelligent Transportation Systems (ITS) have received considerable attention as an important component of them. Traffic prediction, as an important part of an intelligent transportation system, supports tasks such as congestion control and route planning. Existing methods for urban traffic prediction fall into three main types: statistics-based prediction models, nonlinear-theory models, and prediction models based on intelligent theory. Among statistics-based models, the historical average method is the earliest: it assumes that the traffic flow at a given point follows a certain regular pattern, averages the currently measured flow at that point with the historical data, and takes the result as the flow value at the next moment. Such methods are severely limited in modeling nonlinear data and perform poorly in practice now that data volumes are growing rapidly. Nonlinear-theory models are based on nonlinear theories such as chaos theory, synergetics and system dynamics; in traffic flow prediction they analyze the nonlinear characteristics of traffic flow with these theories and build a nonlinear model accordingly. Typical examples include models based on catastrophe theory, chaotic models, and wavelet analysis.

Prediction models based on intelligent theory rely on modern big-data analysis techniques. They have no fixed model structure and no fixed relationship between input and output; they extract valuable information by training on different patterns of input data and use the extracted information to predict traffic flow. Researchers have used KNN, autoencoders and similar methods to identify traffic patterns; GNNs, GCNs and similar models to model the road network structure and extract spatial features; and LSTM and similar models to capture the temporal correlations of traffic flow.

Although current deep learning methods show superior performance in modeling traffic flow data and extracting complex data features, some drawbacks remain: some models consider only temporal correlation and ignore spatial correlation; some models ignore the dynamic changes of traffic conditions and build only a static adjacency matrix, which reduces their predictive capability; and some models suffer from vanishing or exploding gradients, which makes training difficult.

To address these challenges, the invention provides a traffic prediction method based on a spatio-temporal interactive dynamic graph attention network, to achieve efficient and accurate prediction of urban traffic flow data.

Disclosure of Invention

To address the shortcomings of current models, the invention provides a new traffic prediction method based on a spatio-temporal interactive dynamic graph attention network. The method captures temporal and spatial correlations synchronously and learns spatio-temporal dependence through an interactive learning structure and a dynamic graph convolution network, thereby achieving accurate and effective traffic flow prediction.

In order to achieve the above purpose, the invention adopts the following technical scheme:

a traffic prediction method for a spatio-temporal interactive dynamic graph attention network, the method comprising the following steps:

step 1: acquiring historical traffic flow data of each traffic node and preprocessing it to obtain application data;

step 2: defining the traffic network information and the traffic prediction problem;

step 3: constructing an interactive learning module;

step 4: constructing a dynamic graph attention model;

step 5: dividing the collected traffic flow data set into a training set, a validation set and a test set, and evaluating the traffic prediction model of the spatio-temporal interactive dynamic graph attention network using the mean absolute error, the mean absolute percentage error and the root mean square error as evaluation indexes.
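The division into training, validation and test sets in step 5 can be sketched as follows. The text does not state the split ratios or whether the split is chronological; the 60/20/20 ratios, the chronological order and the function name `chronological_split` are assumptions made only for illustration.

```python
import numpy as np

def chronological_split(data, train_ratio=0.6, val_ratio=0.2):
    """Split a time-ordered array along axis 0 without shuffling.

    The ratios are illustrative assumptions, not values from the text.
    """
    n = data.shape[0]
    n_train = int(n * train_ratio)
    n_val = int(n * val_ratio)
    train = data[:n_train]
    val = data[n_train:n_train + n_val]
    test = data[n_train + n_val:]
    return train, val, test

# Toy data: 50 time steps, 2 traffic nodes.
flow = np.arange(100, dtype=float).reshape(50, 2)
train, val, test = chronological_split(flow)
```

Keeping the time order means the validation and test windows lie strictly after the training window, which matches how a forecasting model would be used.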

Preferably, the preprocessing of the historical traffic flow data in step 1 is as follows:

step 1: the Z-Score method is applied to the data to reduce the influence of abnormal values and extreme values, thereby obtaining the preprocessed application data.
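A minimal sketch of Z-Score standardization is given below. The text names the method but not its details, so computing a single mean and standard deviation over the whole array, and the helper names `zscore_fit`, `zscore_transform` and `zscore_inverse`, are assumptions.

```python
import numpy as np

def zscore_fit(values):
    """Return the mean and standard deviation used for standardization."""
    return values.mean(), values.std()

def zscore_transform(values, mean, std):
    """Map raw flow values to zero mean and unit variance."""
    return (values - mean) / std

def zscore_inverse(scaled, mean, std):
    """Map standardized values back to the original flow scale."""
    return scaled * std + mean

# Toy data: 4 time steps, 2 traffic nodes.
flow = np.array([[120.0, 80.0], [130.0, 90.0], [110.0, 70.0], [140.0, 100.0]])
mean, std = zscore_fit(flow)
normalized = zscore_transform(flow, mean, std)
restored = zscore_inverse(normalized, mean, std)
```

In practice the statistics would typically be fitted on the training portion only and reused for the validation and test portions, and predictions would be mapped back with the inverse transform before the errors are computed.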

Preferably, the definition of the traffic network information and the traffic prediction problem in step 2 is as follows:

step 2-1: the traffic network is defined as a weighted directed graph $G = (V, E, A)$, where $V$ is the set of nodes, $|V| = N$, each node being a road segment or sensor in the traffic network; $E$ is the set of edges, which represent the connection relations among nodes, and the weight of an edge represents the distance between its nodes;

$A \in \mathbb{R}^{N \times N}$ is the weighted adjacency matrix of graph $G$, whose element values are computed from the distance between two nodes and can be expressed as:

$$A_{ij} = \begin{cases} \exp\left(-\dfrac{d_{ij}^{2}}{\sigma^{2}}\right), & \exp\left(-\dfrac{d_{ij}^{2}}{\sigma^{2}}\right) \ge \epsilon \\ 0, & \text{otherwise} \end{cases}$$

where $d_{ij}$ is the distance between two nodes in the road network, $\epsilon$ is a hyperparameter controlling the sparsity of the adjacency matrix $A$, and $\sigma$ is the standard deviation;
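A common way to turn pairwise distances into such weights, consistent with the symbols defined here (distance $d_{ij}$, threshold $\epsilon$, standard deviation $\sigma$), is the thresholded Gaussian kernel. The sketch below assumes that form, assumes $\sigma$ is the standard deviation of all entries of the distance matrix, and uses a hypothetical function name and an illustrative threshold of 0.1.

```python
import numpy as np

def gaussian_kernel_adjacency(dist, epsilon=0.1):
    """Build a weighted adjacency matrix from an (N, N) distance matrix.

    Assumed form: A_ij = exp(-d_ij^2 / sigma^2) if that value >= epsilon, else 0.
    """
    sigma = dist.std()                       # assumption: std over all distances
    weights = np.exp(-(dist ** 2) / (sigma ** 2))
    weights[weights < epsilon] = 0.0         # larger epsilon -> sparser matrix
    return weights

# Toy road network with 3 nodes; distances in arbitrary units.
dist = np.array([[0.0, 1.0, 5.0],
                 [1.0, 0.0, 2.0],
                 [5.0, 2.0, 0.0]])
A = gaussian_kernel_adjacency(dist, epsilon=0.1)
```

With these distances, the far-apart pair (nodes 0 and 2) falls below the threshold and is cut, while the two closer pairs keep a positive weight.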

step 2-2: defining the traffic prediction problem: traffic flow prediction aims to predict, on the basis of the historical observation sequence

$$(X_{t-T+1}, X_{t-T+2}, \ldots, X_{t}),$$

the future traffic sequence of the traffic network

$$(X_{t+1}, X_{t+2}, \ldots, X_{t+T'}).$$

The traffic prediction problem is defined as:

$$(X_{t-T+1}, \ldots, X_{t};\, G) \xrightarrow{\;f\;} (X_{t+1}, \ldots, X_{t+T'})$$

where $X_{t} \in \mathbb{R}^{N \times C}$ is the observation of graph $G$ at time step $t$; $N$ is the number of nodes in the traffic network and $C$ is the number of feature channels; $T$ is the length of the given historical time series and $T'$ is the length of the time series to be predicted; $f$ is a learning function that predicts the future flow sequence from the historical observation sequence.
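To make the problem definition concrete, the sketch below slices a long recording into pairs of a history window of length $T$ and a target window of length $T'$, which is the usual way to build training samples for a learning function $f$. The array layout (time, $N$, $C$), the function name `make_windows` and the window lengths are assumptions for illustration.

```python
import numpy as np

def make_windows(series, t_in, t_out):
    """Cut a (time, N, C) series into history/target pairs.

    Returns X with shape (samples, t_in, N, C) and Y with shape
    (samples, t_out, N, C); Y[i] immediately follows X[i] in time.
    """
    n_samples = series.shape[0] - t_in - t_out + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested windows")
    xs = np.stack([series[i:i + t_in] for i in range(n_samples)])
    ys = np.stack([series[i + t_in:i + t_in + t_out] for i in range(n_samples)])
    return xs, ys

# Toy data: 20 time steps, N = 3 nodes, C = 1 feature channel.
series = np.random.default_rng(0).random((20, 3, 1))
X, Y = make_windows(series, t_in=12, t_out=4)
```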

Preferably, the construction of the interactive learning module in step 3 is as follows:

step 3: for most time series data, the interleaved downsampled subsequences still retain most of the information of the original sequence, owing to the trend and proximity properties of the series; the invention adopts interleaved downsampling so that the two subsequences obtained by the split share the parameter weights of the dynamic graph attention module; let $X \in \mathbb{R}^{C \times N \times T}$ be the input traffic flow data; the subsequences of $X$ obtained by splitting at odd and even indices are denoted $X_{odd} \in \mathbb{R}^{C \times N \times T/2}$ and $X_{even} \in \mathbb{R}^{C \times N \times T/2}$ respectively;

the outputs after the first interactive learning are respectively:

$$X'_{odd} = \tanh(\mathrm{DGAT}(\mathrm{Conv}(X_{even}))) \odot X_{odd}$$

$$X'_{even} = \tanh(\mathrm{DGAT}(\mathrm{Conv}(X_{odd}))) \odot X_{even}$$

the final outputs after the second interactive learning are respectively:

$$X_{odd\_out} = X'_{odd} + \tanh(\mathrm{DGAT}(\mathrm{Conv}(X'_{even})))$$

$$X_{even\_out} = X'_{even} + \tanh(\mathrm{DGAT}(\mathrm{Conv}(X'_{odd})))$$

where $X'_{odd} \in \mathbb{R}^{C \times N \times T/2}$ and $X'_{even} \in \mathbb{R}^{C \times N \times T/2}$ are the traffic flow subsequences obtained after the first interactive learning; $\mathbb{R}$ is the set of real numbers, $C$ is the number of feature channels, $N$ is the number of nodes in the traffic network, and $T$ is the length of the given historical time series; $X_{odd\_out}$ and $X_{even\_out}$ are the traffic flow subsequences finally obtained after the two rounds of interactive learning; $\odot$ denotes the Hadamard product, $\tanh$ is the activation function, DGAT is the dynamic graph attention model, and Conv is a one-dimensional convolution;
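The data flow of the two interaction rounds can be sketched in NumPy as below. This is only an illustration of the four update equations: `graph_mix` is a hypothetical stand-in for DGAT (a fixed node-mixing matrix instead of learned attention), `conv1d_same` is a hypothetical stand-in for Conv (a fixed smoothing filter along time), and the 0-based even/odd split and the tensor layout $(C, N, T)$ are assumptions.

```python
import numpy as np

def conv1d_same(x, kernel):
    """Stand-in for Conv: 1-D filter along the time axis, length preserved."""
    pad = len(kernel) // 2
    padded = np.pad(x, ((0, 0), (0, 0), (pad, pad)), mode="edge")
    out = np.zeros_like(x)
    for k, w in enumerate(kernel):
        out += w * padded[..., k:k + x.shape[-1]]
    return out

def graph_mix(x, adj):
    """Stand-in for DGAT: mix node features with a fixed (N, N) matrix."""
    return np.einsum("nm,cmt->cnt", adj, x)

def interactive_learning(x, adj, kernel):
    """Apply the two rounds of interactive learning to x of shape (C, N, T)."""
    x_even, x_odd = x[..., 0::2], x[..., 1::2]    # interleaved downsampling

    def branch(z):
        # The same adj and kernel serve both subsequences (shared weights).
        return np.tanh(graph_mix(conv1d_same(z, kernel), adj))

    # First round: each half is gated (Hadamard product) by the other half.
    x_odd_1 = branch(x_even) * x_odd
    x_even_1 = branch(x_odd) * x_even
    # Second round: each half receives an additive update from the other half.
    x_odd_out = x_odd_1 + branch(x_even_1)
    x_even_out = x_even_1 + branch(x_odd_1)
    return x_even_out, x_odd_out

rng = np.random.default_rng(0)
C, N, T = 2, 3, 12                                # T must be even here
x = rng.standard_normal((C, N, T))
adj = np.full((N, N), 1.0 / N)
x_even_out, x_odd_out = interactive_learning(x, adj, kernel=[0.25, 0.5, 0.25])
```

Each output keeps the shape $(C, N, T/2)$ of its input subsequence, so the two halves can later be re-interleaved or concatenated by whatever output layer follows.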

Preferably, the construction of the dynamic graph attention model in step 4 is as follows:

step 4-1: the DGAT module consists of two main modules: a graph generator module and a graph attention module; the DGAT module uses the generated graph structure to explore the spatial characteristics of the urban road network at a deeper level, improving the model's ability to capture spatial heterogeneity; the graph generator module takes the hidden features $X$ and a predefined initial adjacency matrix $A$ as input; the input $X$ is processed by the graph attention mechanism and then passed to the MLP to generate the matrix $A'$:

$$A' = \mathrm{SoftMax}(\mathrm{MLP}(\mathrm{GAT}(X, A)))$$

where $A' \in \mathbb{R}^{N \times N}$ is an adjacency matrix carrying spatio-temporal features, MLP denotes a multi-layer perceptron, GAT denotes the graph attention operation, $X$ denotes the input features of the road network, and $A \in \mathbb{R}^{N \times N}$ denotes the initial weighted adjacency matrix of the nodes in the road network;
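The composition $A' = \mathrm{SoftMax}(\mathrm{MLP}(\mathrm{GAT}(X, A)))$ can be illustrated with the sketch below. It is not the patented implementation: the GAT step uses the widely known single-head graph attention layer with a LeakyReLU slope of 0.2, the node features are assumed to be flattened to shape $(N, F)$, the MLP is assumed to have two layers mapping each node to $N$ logits, and the SoftMax is assumed to act row-wise. All parameter names are hypothetical and the weights are random rather than trained.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def gat_layer(x, adj, w, a):
    """Single-head graph attention: x (N, F), adj (N, N), w (F, H), a (2H,)."""
    h = x @ w                                        # projected features (N, H)
    hdim = h.shape[1]
    # e_ij = LeakyReLU(a^T [h_i || h_j]) split into source and target parts.
    scores = (h @ a[:hdim])[:, None] + (h @ a[hdim:])[None, :]
    scores = np.where(scores > 0, scores, 0.2 * scores)
    # Attend only over neighbours in adj, plus a self-loop for every node.
    mask = (adj > 0) | np.eye(adj.shape[0], dtype=bool)
    att = softmax(np.where(mask, scores, -1e9), axis=1)
    return att @ h

def graph_generator(x, adj, params):
    """Return A' = SoftMax(MLP(GAT(X, A))) as an (N, N) row-stochastic matrix."""
    h = gat_layer(x, adj, params["w_gat"], params["a_gat"])
    hidden = np.maximum(h @ params["w1"], 0.0)       # MLP layer 1 with ReLU
    logits = hidden @ params["w2"]                   # MLP layer 2 -> (N, N)
    return softmax(logits, axis=1)

rng = np.random.default_rng(0)
N, F, H = 4, 3, 5
x = rng.standard_normal((N, F))
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
params = {
    "w_gat": rng.standard_normal((F, H)),
    "a_gat": rng.standard_normal(2 * H),
    "w1": rng.standard_normal((H, H)),
    "w2": rng.standard_normal((H, N)),
}
a_gen = graph_generator(x, adj, params)
```

Because the generated matrix depends on the current features, it changes with the input, which is what allows it to reflect dynamic rather than purely static relations between nodes.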

step 4-2: in addition to the adjacency matrix $A'$ generated by the graph generator, the invention also defines an adaptive adjacency matrix $A_{apt}$, which is computed from two trainable parameters of the model; $\mathbb{R}$ denotes the set of real numbers, $C$ the number of feature channels, and $N$ the number of nodes in the traffic network; the initial value of $A_{apt}$ is the adjacency matrix $A$ predefined from the prior map.

step 4-3: an adaptive fusion structure is used to fuse $A_{apt}$ and $A'$, and the fused adjacency matrix $A_{dyn}$ is input into the GAT to model the dynamic correlations between nodes and to explore undiscovered node connections in the road network; the fusion operation is defined as follows:

$$A_{dyn} = \alpha A_{apt} + (1 - \alpha) A'$$

where $A_{dyn}$ is the fused dynamic adjacency matrix and $\alpha$ is a learnable adaptive parameter factor.
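The fusion is a weighted sum of the two matrices, sketched below with a fixed scalar in place of the learnable factor $\alpha$. The function name `fuse_adjacency` and the toy values are illustrative assumptions; in training, $\alpha$ would be updated by gradient descent together with the other parameters.

```python
import numpy as np

def fuse_adjacency(a_apt, a_gen, alpha):
    """Fuse the adaptive matrix and the generated matrix: alpha*A_apt + (1-alpha)*A'."""
    return alpha * a_apt + (1.0 - alpha) * a_gen

a_apt = np.array([[0.5, 0.5],
                  [0.2, 0.8]])
a_gen = np.array([[0.9, 0.1],
                  [0.4, 0.6]])
a_fused = fuse_adjacency(a_apt, a_gen, alpha=0.25)
```

When $\alpha$ lies in $[0, 1]$ the result is a convex combination, so if both inputs have rows summing to 1, the fused matrix does as well.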

step 4-4: in the definition of the GAT module, $X_{in}$ is the input data, the initial value of $A_{apt}$ is the adjacency matrix predefined from the prior map, and the module's weights are given by a parameter matrix.

Preferably, the evaluation of the traffic prediction model of the spatio-temporal interactive dynamic graph attention network in step 5 is as follows:

step 5: the invention adopts the mean absolute error MAE (the mean of the absolute errors between actual and predicted values), the mean absolute percentage error MAPE (the mean of the absolute percentage errors between actual and predicted values) and the root mean square error RMSE (the square root of the mean squared error between actual and predicted values) as the evaluation indexes of the model's prediction performance, calculated as follows:

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|Y_i - \hat{Y}_i\right|$$

$$\mathrm{MAPE} = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{Y_i - \hat{Y}_i}{Y_i}\right| \times 100\%$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(Y_i - \hat{Y}_i\right)^{2}}$$

where $N$ is the number of samples, $Y_i$ is the true traffic flow of a given node at a given moment, and $\hat{Y}_i$ is the predicted traffic flow of the same node at the same moment.
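The three evaluation indexes follow directly from their verbal definitions and can be computed as in the sketch below. Reporting MAPE in percent and skipping samples whose true value is zero (to avoid division by zero) are assumptions, since the text does not say how zero flows are handled.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return np.mean(np.abs(y_true - y_pred))

def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; zero true values are skipped."""
    mask = y_true != 0
    return np.mean(np.abs((y_true[mask] - y_pred[mask]) / y_true[mask])) * 100.0

def rmse(y_true, y_pred):
    """Root mean square error."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

y_true = np.array([100.0, 200.0, 400.0])
y_pred = np.array([110.0, 180.0, 400.0])
scores = {
    "MAE": mae(y_true, y_pred),
    "MAPE": mape(y_true, y_pred),
    "RMSE": rmse(y_true, y_pred),
}
```

For the toy values, the absolute errors are 10, 20 and 0, so the MAE is 10; RMSE is larger than MAE here because squaring gives the 20-unit error more weight.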

Compared with the prior art, the invention has the following beneficial effects and advantages:

Most current research is limited to mining temporal and spatial features separately, so the model cannot learn the characteristics of traffic flow data comprehensively. The invention provides a new traffic prediction model based on a spatio-temporal interactive dynamic graph attention network, which embeds the graph attention mechanism into an interactive learning structure and captures temporal and spatial correlations synchronously. Spatio-temporal dependence is learned through the interactive learning structure and the dynamic graph attention network, so that node features are learned comprehensively and prediction accuracy is further improved.

Drawings

FIG. 1 is a general framework of a traffic prediction model of a spatio-temporal interactive dynamic graph attention network provided by the invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention.

The method obtains a large amount of traffic flow data from sensors in the urban road network, cleans the data, and after processing obtains attributes such as longitude and latitude, flow value, prediction start time and prediction end time. Being based on the dynamic graph attention neural network, the method is broadly applicable to time series prediction fields and effectively processes complex time series data. FIG. 1 is the general framework diagram of the model of the invention, which effectively improves the overall predictive performance of the model. The specific implementation is as follows:

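step 1: acquiring historical traffic flow data of each traffic node and preprocessing it to obtain application data;

step 2: defining the traffic network information and the traffic prediction problem;

step 3: constructing an interactive learning module;

step 4: constructing a dynamic graph attention model;

step 5: dividing the collected traffic flow data set into a training set, a validation set and a test set, and evaluating the traffic prediction model using the mean absolute error, the mean absolute percentage error and the root mean square error as evaluation indexes.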

Specifically, the preprocessing of the historical traffic flow data is as follows: the Z-Score method is applied to the data to reduce the influence of abnormal values and extreme values, thereby obtaining the preprocessed application data.

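Specifically, the definition of the traffic network information and the traffic prediction problem is as follows:

(1) The traffic network is defined as a weighted directed graph $G = (V, E, A)$, where $V$ is the set of nodes, $|V| = N$, each node being a road segment or sensor in the traffic network; $E$ is the set of edges, which represent the connection relations among nodes, and the weight of an edge represents the distance between its nodes;

(2) Defining the traffic prediction problem: traffic flow prediction aims to predict the future traffic sequence of the traffic network on the basis of the historical observation sequence $(X_{t-T+1}, \ldots, X_{t})$. The traffic prediction problem is defined as:

$$(X_{t-T+1}, \ldots, X_{t};\, G) \xrightarrow{\;f\;} (X_{t+1}, \ldots, X_{t+T'})$$

where $X_{t} \in \mathbb{R}^{N \times C}$ is the observation of graph $G$ at time step $t$, $N$ is the number of nodes in the traffic network, $C$ is the number of feature channels, $T$ is the length of the given historical time series, $T'$ is the length of the time series to be predicted, and $f$ is the learning function.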

Specifically, the construction of the interactive learning module is as follows: for most time series data, the interleaved downsampled subsequences still retain most of the information of the original sequence, owing to the trend and proximity properties of the series; the invention adopts interleaved downsampling so that the two subsequences obtained by the split share the parameter weights of the dynamic graph attention module; let $X \in \mathbb{R}^{C \times N \times T}$ be the input traffic flow data, and let $X_{odd}$ and $X_{even}$ denote the subsequences of $X$ obtained by splitting at odd and even indices.

The outputs after the first interactive learning are respectively:

$$X'_{odd} = \tanh(\mathrm{DGAT}(\mathrm{Conv}(X_{even}))) \odot X_{odd}$$

$$X'_{even} = \tanh(\mathrm{DGAT}(\mathrm{Conv}(X_{odd}))) \odot X_{even}$$

The final outputs after the second interactive learning are respectively:

$$X_{odd\_out} = X'_{odd} + \tanh(\mathrm{DGAT}(\mathrm{Conv}(X'_{even})))$$

$$X_{even\_out} = X'_{even} + \tanh(\mathrm{DGAT}(\mathrm{Conv}(X'_{odd})))$$

where $X'_{odd}$ and $X'_{even}$ are the traffic flow subsequences obtained after the first interactive learning, and $X_{odd\_out}$ and $X_{even\_out}$ are the traffic flow subsequences finally obtained after the two rounds of interactive learning; $\odot$ denotes the Hadamard product, $\tanh$ is the activation function, DGAT is the dynamic graph attention model, and Conv is a one-dimensional convolution operation.

Specifically, the construction of the dynamic graph attention model is as follows:

(1) The DGAT module consists of two main modules: a graph generator module and a graph attention module; the DGAT module uses the generated graph structure to explore the spatial characteristics of the urban road network at a deeper level, improving the model's ability to capture spatial heterogeneity; the graph generator module takes the hidden features $X$ and a predefined initial adjacency matrix $A$ as input; the input $X$ is processed by the graph attention mechanism and then passed to the MLP to generate the matrix $A'$:

$$A' = \mathrm{SoftMax}(\mathrm{MLP}(\mathrm{GAT}(X, A)))$$

where $X$ denotes the input features of the road network and $A$ denotes the initial weighted adjacency matrix of the nodes in the road network.

(2) In addition to the adjacency matrix $A'$ generated by the graph generator, the invention also defines an adaptive adjacency matrix $A_{apt}$, which is computed from two trainable parameters of the model.

(3) An adaptive fusion structure is used to fuse $A_{apt}$ and $A'$, and the fused adjacency matrix $A_{dyn}$ is input into the GAT to model the dynamic correlations between nodes and to explore undiscovered node connections in the road network; the fusion operation is defined as follows:

$$A_{dyn} = \alpha A_{apt} + (1 - \alpha) A'$$

(4) The GAT module is defined with a parameter matrix as its weights.

Specifically, the evaluation of the traffic prediction model of the spatio-temporal interactive dynamic graph attention network is as follows: the invention adopts the mean absolute error MAE (the mean of the absolute errors between actual and predicted values), the mean absolute percentage error MAPE (the mean of the absolute percentage errors between actual and predicted values) and the root mean square error RMSE (the square root of the mean squared error between actual and predicted values) as the evaluation indexes of the model's prediction performance, calculated as follows:

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|Y_i - \hat{Y}_i\right|, \quad \mathrm{MAPE} = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{Y_i - \hat{Y}_i}{Y_i}\right| \times 100\%, \quad \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(Y_i - \hat{Y}_i\right)^{2}}$$

where $N$ is the number of samples, $Y_i$ is the true traffic flow value and $\hat{Y}_i$ is the corresponding predicted value.

The foregoing merely illustrates the present invention, and the scope of the invention is not limited thereto; all other embodiments obtained by those skilled in the art without inventive effort fall within the scope of the invention.

Claims

1. A traffic prediction method for a spatio-temporal interactive dynamic graph attention network, comprising the steps of:

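step 1: acquiring historical traffic flow data of each traffic node and preprocessing it to obtain application data;

step 2: defining the traffic network information and the traffic prediction problem;

step 3: constructing an interactive learning module;

step 4: constructing a dynamic graph attention model;

step 5: dividing the collected traffic flow data set into a training set, a validation set and a test set, and evaluating the traffic prediction model of the spatio-temporal interactive dynamic graph attention network using the mean absolute error, the mean absolute percentage error and the root mean square error as evaluation indexes.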

2. The traffic prediction method of a spatio-temporal interactive dynamic graph attention network of claim 1, characterized in that: in step 1, abnormal values in the traffic data acquired from the sensors are processed, and the Z-Score method is adopted to reduce the influence of abnormal values and extreme values, thereby obtaining the preprocessed application data.

3. The traffic prediction method of a spatio-temporal interactive dynamic graph attention network of claim 1, characterized in that step 2 is implemented as follows:

step 2-1: the traffic network is defined as a weighted directed graph $G = (V, E, A)$, where $V$ is the set of nodes, each node representing a road segment or sensor in the traffic network, $|V| = N$; $E$ is the set of edges, which represent the connection relations among nodes, and the weight of an edge represents the distance between its nodes;

$A \in \mathbb{R}^{N \times N}$ is the weighted adjacency matrix of graph $G$, whose element values are computed from the distance between two nodes, expressed as:

$$A_{ij} = \begin{cases} \exp\left(-\dfrac{d_{ij}^{2}}{\sigma^{2}}\right), & \exp\left(-\dfrac{d_{ij}^{2}}{\sigma^{2}}\right) \ge \epsilon \\ 0, & \text{otherwise} \end{cases}$$

where $d_{ij}$ is the distance between two nodes in the road network, $\epsilon$ is a hyperparameter controlling the sparsity of the adjacency matrix $A$, and $\sigma$ is the standard deviation.

step 2-2: defining the traffic prediction problem: traffic flow prediction aims to predict the future traffic sequence of the traffic network on the basis of the historical observation sequence $(X_{t-T+1}, \ldots, X_{t})$. The traffic prediction problem is defined as:

$$(X_{t-T+1}, \ldots, X_{t};\, G) \xrightarrow{\;f\;} (X_{t+1}, \ldots, X_{t+T'})$$

where $X_{t} \in \mathbb{R}^{N \times C}$ is the observation of graph $G$ at time step $t$; $N$ is the number of nodes in the traffic network and $C$ is the number of feature channels; $T$ is the length of the given historical time series and $T'$ is the length of the time series to be predicted; $f$ is a learning function that predicts the future flow sequence from the historical observation sequence.

4. The traffic prediction method of a spatio-temporal interactive dynamic graph attention network of claim 1, characterized in that step 3 is implemented as follows: interleaved downsampling is adopted so that the two subsequences obtained by the split share the parameter weights of the dynamic graph attention module; let $X \in \mathbb{R}^{C \times N \times T}$ be the input traffic flow data; the subsequences of $X$ obtained by splitting at odd and even indices are denoted $X_{odd}$ and $X_{even}$ respectively.

The outputs after the first interactive learning are respectively:

$$X'_{odd} = \tanh(\mathrm{DGAT}(\mathrm{Conv}(X_{even}))) \odot X_{odd}$$

$$X'_{even} = \tanh(\mathrm{DGAT}(\mathrm{Conv}(X_{odd}))) \odot X_{even}$$

The final outputs after the second interactive learning are respectively:

$$X_{odd\_out} = X'_{odd} + \tanh(\mathrm{DGAT}(\mathrm{Conv}(X'_{even})))$$

$$X_{even\_out} = X'_{even} + \tanh(\mathrm{DGAT}(\mathrm{Conv}(X'_{odd})))$$

where $X'_{odd}$ and $X'_{even}$ are the traffic flow subsequences obtained after the first interactive learning; $\mathbb{R}$ is the set of real numbers, $C$ is the number of feature channels, $N$ is the number of nodes in the traffic network, and $T$ is the length of the given historical time series; $X_{odd\_out}$ and $X_{even\_out}$ are the traffic flow subsequences finally obtained after the two rounds of interactive learning.

5. The traffic prediction method of a spatio-temporal interactive dynamic graph attention network of claim 1, characterized in that step 4 is implemented as follows:

step 4-1: the DGAT module consists of two main modules: a graph generator module and a graph attention module; the DGAT module uses the generated graph structure to explore the spatial characteristics of the urban road network at a deeper level, improving the model's ability to capture spatial heterogeneity; the graph generator module takes the hidden features $X$ and a predefined initial weighted adjacency matrix $A$ as input; the input $X$ is processed by the graph attention mechanism and then passed to the MLP to generate the matrix $A'$:

$$A' = \mathrm{SoftMax}(\mathrm{MLP}(\mathrm{GAT}(X, A)))$$

where $X$ denotes the input features of the road network and $A$ denotes the initial weighted adjacency matrix of the nodes in the road network;

step 4-2: in addition to the adjacency matrix $A'$ generated by the graph generator, an adaptive adjacency matrix $A_{apt}$ is defined, which is computed from two trainable parameters of the model;

step 4-3: an adaptive fusion structure is used to fuse $A_{apt}$ and $A'$, and the fused adjacency matrix $A_{dyn}$ is input into the GAT to model the dynamic correlations between nodes and to explore undiscovered node connections in the road network; the fusion operation is defined as follows:

$$A_{dyn} = \alpha A_{apt} + (1 - \alpha) A'$$

step 4-4: the GAT module is defined with a parameter matrix as its weights.

6. The traffic prediction method of a spatio-temporal interactive dynamic graph attention network of claim 1, characterized in that: in step 5, the collected historical traffic flow data set of each traffic node is divided into a training set, a validation set and a test set; the mean absolute error MAE (the mean of the absolute errors between actual and predicted values), the mean absolute percentage error MAPE (the mean of the absolute percentage errors between actual and predicted values) and the root mean square error RMSE (the square root of the mean squared error between actual and predicted values) are used as the evaluation indexes of the model's prediction performance, calculated as follows:

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|Y_i - \hat{Y}_i\right|, \quad \mathrm{MAPE} = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{Y_i - \hat{Y}_i}{Y_i}\right| \times 100\%, \quad \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(Y_i - \hat{Y}_i\right)^{2}}$$

where $N$ is the number of samples, $Y_i$ is the true traffic flow of a given node at a given moment, and $\hat{Y}_i$ is the predicted traffic flow of the corresponding node at the corresponding moment;

after the three evaluation indexes are calculated, the results are compared with current mainstream models, showing that the performance of the invention is at a good level.