CN113487856B - Traffic flow combination prediction model based on graph convolution network and attention mechanism - Google Patents

Info

Publication number
CN113487856B
Authority
CN
China
Prior art keywords
model
traffic
traffic flow
representing
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110621902.7A
Other languages
Chinese (zh)
Other versions
CN113487856A (en)
Inventor
张红
陈林龙
曹洁
阚苏南
赵天信
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lanzhou University of Technology
Original Assignee
Lanzhou University of Technology

Classifications

    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125 Traffic data processing
    • G08G1/0129 Traffic data processing for creating historical data or processing based on historical data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0137 Measuring and analyzing of parameters relative to traffic conditions for specific applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention relates to a traffic flow combination prediction model based on a graph convolution network and an attention mechanism. The model comprises three parts: a graph convolution network (GCN), a gated recurrent unit (GRU) and a soft attention mechanism (SoftAttention). The model can process traffic flow data directly on the original traffic network and effectively capture the spatiotemporal characteristics of traffic flow. It uses the GCN to capture the spatial correlation of traffic flow on the network, and it automatically distinguishes the importance of each traffic flow sequence to the final prediction performance, thereby improving prediction accuracy.

Description

Traffic flow combination prediction model based on graph convolution network and attention mechanism
Technical Field
The invention relates to the technical field of traffic flow prediction, and in particular to a traffic flow combination prediction model based on a graph convolution network and an attention mechanism.
Background
Traffic flow prediction is an important component of the Intelligent Transportation System (ITS). It not only gives traffic managers a scientific basis for anticipating congestion and restricting vehicle movement, but also helps urban travelers choose suitable routes and improves travel efficiency. Traffic prediction is the process of analyzing urban road traffic conditions, including flow, speed and density, mining traffic patterns, and predicting road traffic trends. However, because traffic flow has complex spatiotemporal dependencies and is influenced by external factors such as the road environment, accurate and efficient traffic flow prediction has always been a difficult task.
To date, various methods have been proposed to predict traffic flow. They can be classified into methods based on traditional statistical theory and machine learning methods based on intelligent computing. Methods based on traditional statistical theory mainly treat traffic flow data as a single time series and convert the traffic flow prediction problem into a time series prediction problem. In practice, traffic flow data is affected by many factors, so it is difficult to obtain an accurate prediction model in this way, and existing models cannot accurately describe the complex changes of traffic flow data in a real-world environment. Machine learning methods based on intelligent computing, by contrast, are taking an increasingly important position in traffic flow prediction tasks.
In recent years, with the rapid development of deep learning, more and more researchers have used deep neural networks to predict traffic flow with high accuracy. Many deep learning methods for traffic flow prediction have been proposed, such as SAE, DNN, DBN, LSTM and CNN-LSTM. Some methods consider only the temporal dependence and ignore the spatial correlation of traffic flow, so the predicted change in traffic conditions is not constrained by the road network and the traffic state cannot be predicted accurately. Some prediction methods do take spatial correlation into account, for example for short-term passenger ride demand prediction.
Although some studies introduce CNNs for spatial correlation modeling and have made great progress in traffic flow prediction, CNNs are normally applied to Euclidean data such as images and regular grids. Such models cannot work on urban road networks with complex topologies, so they cannot fundamentally describe the spatial correlation of the road network, and this approach therefore has its own limitations. Graph convolution networks (GCNs), which have developed rapidly in recent years, can process data with arbitrary graph structure and thus offer a good solution to these problems. GCNs have also achieved good results on several different types of graph-based tasks, such as sentiment classification, unsupervised learning and image classification.
Attention mechanisms have been widely used in tasks such as natural language processing, image captioning and speech recognition. With their rapid development, existing attention models can be divided into self-attention, soft attention, hard attention and other mechanisms. The goal of an attention mechanism is to select, from all inputs, the information that is most important to the current task.
Accurate traffic flow prediction is a prerequisite for intelligent transportation, but the complex spatiotemporal characteristics of traffic flow have always made prediction difficult. To capture the complex spatiotemporal correlation of traffic flow, the invention provides a traffic flow combination prediction model based on a graph convolution network and an attention mechanism.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a traffic flow combination prediction model based on a graph convolution network and an attention mechanism. The model can process traffic flow data directly on the original traffic network and effectively capture the spatiotemporal characteristics of traffic flow. It captures the spatial correlation of traffic flow on the network with a graph convolution network (GCN), learns the temporal dependence of traffic flow with a gated recurrent unit (GRU), and introduces a soft attention mechanism that adaptively assigns different degrees of attention to the traffic flow sequences at different moments and automatically distinguishes the importance of each sequence to the final prediction performance, thereby improving prediction accuracy.
To solve the above technical problem, the invention provides the following technical scheme. The traffic flow combination prediction model comprises three parts: a graph convolution network (GCN), a gated recurrent unit (GRU) and a soft attention mechanism (SoftAttention). In the resulting GGCN-SA traffic flow combination prediction model, the GCN captures the topological structure of the graph to obtain spatial correlation; the GRU captures the dynamic change of node attributes to obtain temporal correlation; and the soft attention mechanism adaptively assigns different degrees of attention to the traffic flow sequences at different moments and automatically distinguishes the importance of each sequence to the final prediction performance, thereby improving prediction accuracy. The GGCN-SA model is constructed by combining the GCN and the GRU, and n historical time series of traffic data are input into the model to obtain n hidden states with spatiotemporal characteristics.
Further, the traffic flow combination prediction model is constructed by the following steps: the GCN maps the spatial characteristics and relationships of traffic flow between observation stations into a graph, and the output of the GCN is input into a GRU model, which captures the temporal correlation of the traffic data; the hidden states are input into an attention model to determine a feature vector covering global traffic information changes; a weight is calculated for each hidden state using a multilayer perceptron followed by a Softmax function; the feature vector covering the global traffic information changes is calculated as a weighted sum; and the prediction result is output by a fully connected layer.
The invention has the following advantages. Deep learning can learn deep spatiotemporal characteristics of traffic flow from large amounts of traffic flow data, and a novel deep-learning-based traffic flow combination prediction model, GGCN-SA, is established to capture these characteristics effectively. The invention uses a graph convolution network (GCN) to capture the spatial correlation of the road network and a gated recurrent unit (GRU) to capture temporal dependence, and further introduces a soft attention mechanism (SoftAttention) to aggregate information in different neighborhood ranges and enhance the prediction performance of the model. Extensive experiments on the METR-LA and SZ-taxi data sets show that the proposed GGCN-SA model has better prediction performance than the baseline methods.
1. The invention provides a novel deep learning model (GGCN-SA) that uses a graph convolution network and a gated recurrent unit to capture complex spatial correlation and temporal dependence from traffic flow data, for traffic flow prediction on urban road networks.
2. The invention designs a new soft attention mechanism (SoftAttention) that adaptively assigns different degrees of attention to the traffic flow sequences at different times, automatically distinguishes the importance of each sequence to the final prediction performance, aggregates information in different neighborhood ranges, and automatically learns the importance of the different neighborhood ranges to traffic flow.
3. Six groups of comparison experiments are carried out on each of two traffic data sets, and the results show that the model has the best prediction performance on both data sets compared with the existing baseline methods.
Drawings
Fig. 1 shows the road network graph structure G of the invention.
Fig. 2 is a schematic diagram of the GGCN-SA model structure of the invention.
Fig. 3 is a schematic diagram of the GCN structure of the invention.
Fig. 4 shows the internal structure of a GRU network unit U of the invention.
Fig. 5 shows the GRU network structure of the invention.
Fig. 6 is a schematic diagram of the soft attention mechanism of the invention.
Fig. 7 shows the study areas: left, the Los Angeles highway; right, the Luohu District of Shenzhen.
Fig. 8 shows the convergence of the GGCN-SA model of the invention over 500 iterations.
Fig. 9 shows the traffic time series prediction results, in panels 9(a) and 9(b).
Fig. 10 shows the prediction accuracy of the GGCN-SA model of the invention and of other models.
Detailed Description
The present invention will be described in further detail with reference to examples.
1 Model
The traffic flow combination prediction model based on a graph convolution network and an attention mechanism comprises three parts: a graph convolution network (GCN), a gated recurrent unit (GRU) and a soft attention mechanism (SoftAttention). The network structure of the model is shown in Fig. 2. In the GGCN-SA model, the GCN captures the topology of the graph to obtain spatial correlation, the GRU captures the dynamic change of node attributes to obtain temporal correlation, and the soft attention mechanism adaptively assigns different degrees of attention to the traffic flow sequences at different times and automatically distinguishes the importance of each sequence to the final prediction performance, improving prediction accuracy. The GGCN-SA model is constructed by combining the GCN and the GRU, and n historical time series of traffic data are input into the model to obtain n hidden states with spatiotemporal characteristics.
2 Problem definition
In this study, the goal of traffic flow prediction is to predict traffic information over a certain period of time from historical traffic information on roads. In general, traffic conditions may refer to traffic flow, speed, and density. Without loss of generality, the present study represents traffic conditions in terms of traffic speed.
Definition 1: road network $G$. As shown in Fig. 1, an unweighted graph $G = (V, E)$ is used to describe the topology of the road network. Each sensor detection point is regarded as a node, and the connection between any two sensors is regarded as an edge between the two corresponding nodes. Here $V = \{v_1, v_2, \ldots, v_N\}$ is the set of road nodes, $N$ is the number of nodes, and $E$ is the set of edges. The traffic flow observed on $G$ is expressed as a graph signal $X \in \mathbb{R}^{N \times P}$, where $P$ is the number of node attribute features.
Definition 2: feature matrix $X^{N \times P}$. The traffic information on the road network is regarded as an attribute of the nodes in the network and is represented by a feature matrix $X \in \mathbb{R}^{N \times P}$, where $P$ is the number of node attribute features, i.e., the length of the historical time series, and $X_t \in \mathbb{R}^{N}$ represents the traffic speed on each road at time $t$.
Definition 3: adjacency matrix $A \in \mathbb{R}^{N \times N}$. The adjacency matrix $A$ represents the connections between roads and contains only the elements 0 and 1. An element is 0 if there is no link between the two roads, and 1 otherwise.
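As an illustration of Definitions 1 to 3, the following minimal NumPy sketch builds a 0/1 adjacency matrix from an edge list and allocates a feature matrix of shape N × P. The function name `build_adjacency`, the toy edge list and the assumption that road links are undirected (so that A is symmetric) are illustrative and are not taken from the patent.

```python
import numpy as np

def build_adjacency(n_nodes, edges):
    """Return an (n_nodes, n_nodes) matrix with 1 where two roads are linked.

    Assumption: links are undirected, so the matrix is symmetric.
    """
    A = np.zeros((n_nodes, n_nodes))
    for i, j in edges:
        A[i, j] = 1.0
        A[j, i] = 1.0
    return A

# Toy road network: 4 nodes connected in a chain (hypothetical data).
edges = [(0, 1), (1, 2), (2, 3)]
A = build_adjacency(4, edges)

# Feature matrix X in R^{N x P}: one row per node, one column per time step.
N, P = 4, 12
X = np.zeros((N, P))
```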
Thus, let $X^{(t)}$ denote the graph signal observed at time $t$. The traffic prediction problem aims to learn a function $f$ that, given a graph $G$, maps $T'$ historical graph signals to the next $T$ graph signals. The traffic speed over the next $T$ time steps is calculated as follows:
$$\left[X^{(t+1)}, \ldots, X^{(t+T)}\right] = f\left(G; \left(X^{(t-T'+1)}, \ldots, X^{(t)}\right)\right) \quad (1)$$
where $T'$ is the length of the historical traffic speed time series and $T$ is the length of the traffic speed time series to be predicted.
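The mapping from T' historical signals to T future signals implies that training samples are (history, target) pairs cut from the feature matrix. A minimal sketch of one way to do this with a sliding window follows; the function name `make_windows` and the stride of one time step are assumptions, not details given in the patent.

```python
import numpy as np

def make_windows(X, t_hist, t_pred):
    """Cut a feature matrix X of shape (N, P) into sliding-window samples.

    Returns inputs of shape (S, N, t_hist) and targets of shape (S, N, t_pred),
    where S is the number of windows (stride of one step is an assumption).
    """
    n_steps = X.shape[1]
    inputs, targets = [], []
    for start in range(n_steps - t_hist - t_pred + 1):
        mid = start + t_hist
        inputs.append(X[:, start:mid])           # T' historical steps
        targets.append(X[:, mid:mid + t_pred])   # next T steps
    return np.stack(inputs), np.stack(targets)

# Toy example: 3 nodes, 10 time steps (values are made up).
X = np.arange(30, dtype=float).reshape(3, 10)
hist, fut = make_windows(X, t_hist=4, t_pred=2)
```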
3 Model construction
The GCN maps the spatial characteristics and relationships of traffic flow between observation stations into a graph, and the output of the GCN is input into a GRU model, which captures the temporal correlation of the traffic data. The hidden states are then input into the attention model to determine a feature vector that covers the global changes in traffic information. The weight of each hidden state is calculated by a multilayer perceptron followed by a Softmax function, and the feature vector is calculated as a weighted sum. Finally, the prediction result is output by a fully connected layer.
3.1 Spatial correlation modeling
Given the feature matrix $X$ and the adjacency matrix $A$, the GCN replaces the convolution operation of a conventional CNN with a spectral convolution over each graph node and its first-order neighborhood, which captures the spatial features of the graph. Moreover, its layer-wise propagation rule allows multiple layers to be stacked. The GCN model is therefore used here to learn spatial features from traffic data.
The GCN structure is shown in Fig. 3. The 2-layer GCN model can be expressed as:
$$f(X, A) = \hat{A}\,\mathrm{ReLU}\left(\hat{A} X W_0\right) W_1 \quad (2)$$
where $X$ is the feature matrix, $A$ is the adjacency matrix, $\hat{A} = \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}$ is the pre-processing step, $\tilde{A} = A + I_N$ is the adjacency matrix of the graph with self-connections, $\tilde{D}$ is the degree matrix with $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$, $W_0 \in \mathbb{R}^{P \times H}$ and $W_1 \in \mathbb{R}^{H \times T}$ are the weights of the first and second layers respectively, $P$ is the time length, $H$ is the number of hidden units, $f(X, A) \in \mathbb{R}^{N \times T}$ is the output with prediction length $T$, and $\mathrm{ReLU}(\cdot)$ is the activation function.
By determining the topological relationship between a central road segment and its surrounding road segments, the GCN encodes the topological structure of the road network and the attributes of the road segments at the same time. On this basis, the spatial correlation of the road network is learned by the GCN model.
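A 2-layer GCN forward pass can be sketched in NumPy as below. This is a minimal sketch under stated assumptions: it uses the common symmetric normalization of the self-connected adjacency matrix (degree matrix raised to the power -1/2 on both sides), applies ReLU after the first layer only, and applies no activation to the output. The function names, the random weights and the toy graph are hypothetical.

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetrically normalize A with self-connections added (assumed form)."""
    A_tilde = A + np.eye(A.shape[0])          # add self-connections
    deg = A_tilde.sum(axis=1)                 # node degrees (always >= 1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_two_layer(X, A, W0, W1):
    """Two stacked graph convolutions: (N, P) -> (N, H) -> (N, T)."""
    A_hat = normalize_adjacency(A)
    hidden = np.maximum(A_hat @ X @ W0, 0.0)  # ReLU after the first layer
    return A_hat @ hidden @ W1

rng = np.random.default_rng(0)
N, P, H, T = 4, 12, 8, 3                      # toy sizes
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = rng.random((N, P))
W0 = rng.normal(scale=0.1, size=(P, H))       # untrained placeholder weights
W1 = rng.normal(scale=0.1, size=(H, T))
out = gcn_two_layer(X, A, W0, W1)
```

Each output row mixes information from a node's two-hop neighborhood, because the normalized adjacency matrix is applied once per layer.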
3.2 Time correlation modeling
The internal structure of the GRU network unit is shown in fig. 4.
The inputs to the current GRU unit are the output of the previous GRU unit and the current observation. Internal processing produces the output features, which are passed to the next GRU unit. Here $h_{t-1}$ is the hidden state at time $t-1$, $x_t$ is the traffic information at time $t$, $r_t$ is the reset gate, which controls how much of the state information from the previous moment is ignored, $u_t$ is the update gate, which controls how much of the state information from the previous moment is carried into the current state, $c_t$ is the memory content stored by the cell at time $t$, and $h_t$ is the output state at time $t$. The GRU model takes the hidden state at time $t-1$ and the current traffic information as input to obtain the traffic state at time $t$. While capturing the traffic information at the current moment, the model retains the trend of the historical traffic information and is therefore able to capture temporal correlation.
The GRU network unit is calculated as follows:
$$\begin{aligned} r_t &= \sigma\left(W_r \cdot [h_{t-1}, x_t]\right) \\ u_t &= \sigma\left(W_u \cdot [h_{t-1}, x_t]\right) \\ c_t &= \tanh\left(W_c \cdot [r_t * h_{t-1}, x_t]\right) \\ h_t &= u_t * h_{t-1} + (1 - u_t) * c_t \end{aligned} \quad (3)$$
where $r_t$ is the reset gate; the smaller its value, the less information from the previous state is written in. $u_t$ is the update gate; the larger its value, the more state information from the previous time is carried in. $x_t$ and $h_{t-1}$ are the input vector at the current time $t$ and the output vector at time $t-1$ respectively, and $y_t$ is the output vector at time $t$. $W_r$, $W_u$ and $W_c$ are the weight matrices of the unit, $[\,]$ denotes the concatenation of two vectors, and $\sigma$ is the sigmoid activation function, which controls the opening and closing of the reset gate and the update gate. By connecting a series of network units U, a complete GRU neural network is constructed, as shown in Fig. 5.
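The gate behaviour described above can be illustrated with a single GRU step in NumPy. This is a sketch under assumptions rather than the patent's exact formulation: biases are omitted, the weight names `W_r`, `W_u` and `W_c` are illustrative, and the state update is written so that a larger update gate keeps more of the previous state, matching the description of the gates in the text.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, params):
    """One GRU step: returns the new hidden state h_t."""
    concat = np.concatenate([h_prev, x_t])
    r_t = sigmoid(params["W_r"] @ concat)     # reset gate
    u_t = sigmoid(params["W_u"] @ concat)     # update gate
    # Candidate memory uses the previous state scaled by the reset gate.
    c_t = np.tanh(params["W_c"] @ np.concatenate([r_t * h_prev, x_t]))
    # Larger u_t keeps more of the previous state (assumed convention).
    return u_t * h_prev + (1.0 - u_t) * c_t

rng = np.random.default_rng(0)
d_in, d_hid = 3, 5                            # toy sizes
params = {name: rng.normal(scale=0.1, size=(d_hid, d_hid + d_in))
          for name in ("W_r", "W_u", "W_c")}

# Chain the unit over a short sequence and collect every hidden state.
h = np.zeros(d_hid)
states = []
for x_t in rng.random((6, d_in)):
    h = gru_step(x_t, h, params)
    states.append(h)
states = np.stack(states)                     # shape (6, d_hid)
```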
3.3 Soft attention mechanism
Motivated by research on soft attention, and considering the graph structure of the traffic network and the spatiotemporal correlation of traffic data, a soft attention mechanism is introduced on top of the graph convolution network (GCN) and gated recurrent unit (GRU) models to model traffic data on the network structure. In the proposed framework, the output of the GCN is input into the GRU module and a soft attention mechanism is applied to the output of the GRU. The soft attention mechanism aggregates information in different neighborhood ranges, and a feature vector that expresses the trend of traffic state changes is then calculated and used to predict future traffic.
The structure of the soft attention mechanism is shown in Fig. 6. Suppose a time series $X_i$ ($i = 1, 2, \ldots, n$), where $n$ is the time series length. First, the hidden states $H_i$ ($i = 1, 2, \ldots, n$) at the different times are calculated using the GRU model and denoted $H = \{H_1, H_2, \ldots, H_n\}$. Then, a scoring function is designed to calculate a weight for each hidden state. Finally, the output $H_o$ is calculated as a weighted average.
The score of each hidden state ($\mathrm{score}_i$) is normalized by the Softmax function to obtain the final weight ($\alpha_i$). Here $w^{(1)}$ and $w^{(2)}$ are the weights of the first and second layers, and $b^{(1)}$ and $b^{(2)}$ are the biases of the first and second layers, respectively.
$$\mathrm{score}_i = w^{(2)}\left(w^{(1)} H_i + b^{(1)}\right) + b^{(2)} \quad (4)$$
$$\alpha_i = \frac{\exp(\mathrm{score}_i)}{\sum_{k=1}^{n} \exp(\mathrm{score}_k)} \quad (5)$$
Finally, the output $H_o$ is calculated as a weighted average, as follows.
$$H_o = \sum_{i=1}^{n} \alpha_i H_i \quad (6)$$
The attention mechanism can thus be regarded as computing adaptive weights $\alpha_i$ to turn the input sequence $H_i$ into a fixed-length output $H_o$.
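The scoring, Softmax normalization and weighted aggregation can be sketched in NumPy as follows. The shapes chosen here (hidden states of size d, a first layer of width m, a scalar score per hidden state) and the function name `soft_attention` are assumptions for illustration; the two-layer score has no nonlinearity between layers, following equation (4) as written.

```python
import numpy as np

def soft_attention(H, w1, b1, w2, b2):
    """Aggregate n hidden states into one fixed-length vector.

    H:  (n, d) hidden states, one row per time step.
    w1: (d, m), b1: (m,)   first scoring layer (assumed shapes).
    w2: (m,),   b2: scalar second scoring layer.
    Returns (H_o of shape (d,), alpha of shape (n,)).
    """
    scores = (H @ w1 + b1) @ w2 + b2          # one score per hidden state
    e = np.exp(scores - scores.max())         # shift for numerical stability
    alpha = e / e.sum()                       # Softmax weights, sum to 1
    H_o = alpha @ H                           # weighted combination
    return H_o, alpha

rng = np.random.default_rng(0)
n, d, m = 5, 8, 4                             # toy sizes
H = rng.random((n, d))
w1, b1 = rng.normal(size=(d, m)), np.zeros(m)
w2, b2 = rng.normal(size=m), 0.0
H_o, alpha = soft_attention(H, w1, b1, w2, b2)
```

With all scoring parameters set to zero, every hidden state receives weight 1/n and the output reduces to the plain average of the hidden states.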
3.4 Loss function
The aim of training is to minimize the error between the actual and predicted traffic speeds in the road network. The actual and predicted traffic speeds of the different road sections are denoted by $Y$ and $\hat{Y}$, respectively. The loss function of the GGCN-SA model is therefore as follows.
$$loss = \left\| Y_t - \hat{Y}_t \right\| + \lambda \left\| w \right\|_2$$
where $Y_t$ and $\hat{Y}_t$ are the actual and predicted traffic speeds, respectively, $\lambda$ is the regularization parameter and $w$ denotes the weights. The first term minimizes the error between the actual and predicted speeds. The second term, $\|w\|_2$, is an L2 regularization term that prevents excessively large parameter values in the model and helps avoid overfitting.
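A minimal NumPy sketch of a loss with an error term plus an L2 penalty on the weights is given below. How the penalty is accumulated over several weight arrays is not stated in the text; summing the L2 norms of the individual arrays is an assumption, as are the function name `ggcn_sa_loss` and the toy values.

```python
import numpy as np

def ggcn_sa_loss(y_true, y_pred, weights, lam):
    """Error norm between actual and predicted speeds plus lam * L2 penalty."""
    data_term = np.linalg.norm(y_true - y_pred)            # Frobenius norm for 2-D input
    reg_term = sum(np.linalg.norm(w) for w in weights)     # assumed: sum of L2 norms
    return data_term + lam * reg_term

# Toy values chosen so both norms are easy to check by hand.
y_true = np.array([[3.0, 0.0], [0.0, 4.0]])
y_pred = np.zeros((2, 2))
weights = [np.array([3.0, 4.0])]
loss = ggcn_sa_loss(y_true, y_pred, weights, lam=0.1)      # 5 + 0.1 * 5
```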
4 Experiments
4.1 Data description
Two traffic data sets, the loop detector data set from Los Angeles (METR-LA) and the taxi trajectory data set from Shenzhen (SZ-taxi), are used to verify the performance of the proposed GGCN-SA model. The real traffic data sets contain attributes such as location, date, time period, speed and flow. The details of the experimental data sets are shown in Table 1:
TABLE 1 Description of the experimental data sets
Data set            METR-LA                     SZ-taxi
Data type           Time series                 Time series
Location            Los Angeles highway         Luohu District, Shenzhen
Sampling interval   5 minutes                   15 minutes
Time period         March 1 to March 7, 2012    January 1 to January 31, 2015
Attribute           Speed                       Speed
Records             207 sensors                 156 roads
The METR-LA data set comes from loop detectors on the Los Angeles highway and spans March 1 to March 7, 2012. It contains historical traffic speeds collected by 207 sensors, aggregated every 5 minutes. The SZ-taxi data set comes from the Luohu District of Shenzhen and spans January 1 to January 31, 2015. In this study, 156 major roads in Luohu District are selected as the study area, and the driving speed on each road is aggregated every 15 minutes.
The experimental data consist of two parts. One is an adjacency matrix (156 by 156 for the SZ-taxi data set) that describes the spatial relationship between roads. Each row represents a road, and the values in the adjacency matrix represent the connectivity between roads. The other is a feature matrix that describes the change in traffic speed over time on each road. Each row represents a road, and each column represents the traffic speed on the roads in a different time period.
Since the METR-LA data set contains some missing data, linear interpolation is used to fill in the missing values. Before the data is input into the prediction model, it is normalized with the min-max method, which scales the values to [0,1]. The normalization formula is
$$\hat{x}_i = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}}$$
where $x_i$ is the $i$-th original data value, $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of the original data, and $\hat{x}_i$ is the normalized input data.
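The two preprocessing steps, linear interpolation of missing values followed by min-max scaling, can be sketched in NumPy as below. The sketch assumes that missing readings are marked as NaN and that each series has at least two distinct known values; how missing values are actually encoded in METR-LA is not stated here, and the function names are illustrative.

```python
import numpy as np

def fill_missing_linear(series):
    """Fill NaN entries of a 1-D series by linear interpolation in time."""
    series = np.asarray(series, dtype=float)
    idx = np.arange(series.size)
    known = ~np.isnan(series)
    return np.interp(idx, idx[known], series[known])

def min_max_normalize(x):
    """Scale values to [0, 1]; assumes max(x) > min(x)."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min)

# Made-up speed readings with one missing value.
speeds = np.array([60.0, np.nan, 40.0, 20.0])
filled = fill_missing_linear(speeds)   # gap filled with the midpoint 50.0
scaled = min_max_normalize(filled)     # 20 maps to 0, 60 maps to 1
```

To report errors in the original speed units, predictions would need to be mapped back with the same minimum and maximum values.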
4.2 Experimental environment and parameter settings
The experiments were built and run on a Linux server (CPU: Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz, GPU: NVIDIA GeForce GTX 1080). The traffic flow prediction model was constructed and trained in the PyCharm development environment using the TensorFlow deep learning framework.
The GGCN-SA model is trained with the Adam optimizer. The initial learning rate is manually set to 0.001, and L2 regularization is used in the loss function to prevent overfitting. The model uses the rectified linear unit (ReLU) as the activation function, which avoids the vanishing gradient problem while effectively improving the computation speed of the neural network. In the experiments, every data set is divided in a ratio of 8:2 into a training set and a test set. Traffic speeds are predicted for horizons of 15, 30, 45 and 60 minutes.
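The 8:2 split and the four prediction horizons can be made concrete with the short sketch below. It assumes the split is chronological along the time axis (the text does not say whether the data is shuffled) and converts each horizon in minutes into a number of prediction steps from the sampling interval of each data set. Function names are illustrative.

```python
import numpy as np

def split_train_test(X, train_ratio=0.8):
    """Split a feature matrix (N, P) along time; earlier steps form the training set."""
    n_train = int(X.shape[1] * train_ratio)
    return X[:, :n_train], X[:, n_train:]

def horizon_steps(horizon_min, interval_min):
    """Number of prediction steps needed to cover a horizon given in minutes."""
    return horizon_min // interval_min

X = np.zeros((3, 10))                           # toy matrix: 3 roads, 10 time steps
train, test = split_train_test(X)

horizons = (15, 30, 45, 60)
steps_sz = [horizon_steps(h, 15) for h in horizons]   # SZ-taxi, 15-minute interval
steps_la = [horizon_steps(h, 5) for h in horizons]    # METR-LA, 5-minute interval
```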
4.3 Experimental results
In this study, the prediction results of the GGCN-SA model are compared with those of a historical average (HA) model, an autoregressive integrated moving average (ARIMA) model, a support vector regression (SVR) model, a graph convolution network (GCN) model and a gated recurrent unit (GRU) model.
(1) Historical average model (HA): the average traffic information over a historical period is used as the prediction.
(2) Support vector regression model (SVR): support vector regression uses a linear support vector machine to train a model that captures the relationship between input and output for traffic flow prediction.
(3) Autoregressive integrated moving average model (ARIMA): ARIMA is one of the most widely used models for time series prediction; it fits the observed time series to a parametric model in order to predict future traffic data.
(4) Graph convolution network model (GCN): the topological structure of the urban road network is captured with a graph convolution network to obtain the spatial characteristics of the traffic data.
(5) Gated recurrent unit model (GRU): the RNN is a classical deep learning method for sequence learning tasks. The GRU is one of the most popular RNN variants and can be used for time series modeling.
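Of the baselines listed above, the historical average is simple enough to sketch directly. The version below averages each road's speed over the supplied history window and repeats that value for every predicted step; the choice of window and the function name `historical_average` are assumptions, since the text does not specify how the historical period is chosen.

```python
import numpy as np

def historical_average(history, t_pred):
    """Predict t_pred future steps per road as the mean of its history.

    history: (N, t_hist) past speeds; returns an (N, t_pred) array.
    """
    mean = history.mean(axis=1, keepdims=True)
    return np.repeat(mean, t_pred, axis=1)

# Made-up history for two roads.
history = np.array([[1.0, 2.0, 3.0],
                    [4.0, 4.0, 4.0]])
ha_pred = historical_average(history, t_pred=2)
```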
The METR-LA data set is selected and the GGCN-SA model is trained for 500 iterations at the 15-minute prediction horizon; the change in the model's error as the number of iterations increases is shown in Fig. 8. The predicted and real traffic speeds over one day, for the GGCN-SA model and the comparison models on two different road sections in the METR-LA data set, are shown in Fig. 9.
Fig. 9 (a) and (b) show the prediction performance of the models as the prediction interval increases. In general, as the prediction interval becomes longer, the prediction error also increases because of error propagation. As the figure shows, a method that considers only temporal correlation, such as the GRU model, can achieve good accuracy in short-term prediction. However, as the prediction interval increases, errors keep propagating and the accuracy of the GRU model drops sharply. By contrast, the performance of the GCN-GRU model degrades more slowly, mainly because GCN-GRU captures the temporal and spatial characteristics of traffic flow simultaneously, which matters more in long-term prediction. Nevertheless, the prediction error of the GCN-GRU model increases as more time series are considered in the model. The proposed GGCN-SA model achieves better prediction performance at almost all time steps, which indicates that combining the graph convolution network and gated recurrent unit with the attention mechanism strengthens the model's ability to represent the spatiotemporal characteristics of traffic flow.
4.4 model evaluation
To better analyze the experimental results, the predictive performance of the model is evaluated, and the error between the actual traffic flow speed and the prediction result is evaluated based on the following indicators:
root Mean Square Error (RMSE):
Figure BDA0003100158480000081
determining the coefficient (R) 2 ):
Figure BDA0003100158480000082
Mean Absolute Error (MAE):
Figure BDA0003100158480000091
accuracy (Accuracy):
Figure BDA0003100158480000092
interpretable variance score (Var):
Figure BDA0003100158480000093
In the formulas, $Y_t$ and $\hat{Y}_t$ are respectively the real speed and the predicted speed of time sample j on link i; N is the number of nodes on the road; $Y$ and $\hat{Y}$ are the sets of $Y_t$ and $\hat{Y}_t$, respectively; and $\bar{Y}$ is the average value of $Y$.
Specifically, RMSE and MAE measure the prediction error: the smaller their values, the better the prediction. Accuracy measures the prediction precision: the larger the value, the better the prediction. $R^2$ and Var measure how well the prediction result fits the actual data: the larger the value, the better the prediction.
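The five indicators can be illustrated with a minimal sketch. The formulas themselves appear only as images in this text, so the definitions below are the commonly used ones and are an assumption rather than a transcription: in particular, Accuracy is taken as one minus the relative Frobenius-norm error, and the function name `evaluate` and the array layout (time samples by road nodes) are hypothetical.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Compute the five indicators for arrays shaped (time samples, road nodes).

    The definitions are the commonly used ones (an assumption, since the
    original formulas are not recoverable from this text).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred

    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    # Assumed form: 1 - ||Y - Y_hat||_F / ||Y||_F (Frobenius norm for 2-D input).
    accuracy = 1.0 - np.linalg.norm(err) / np.linalg.norm(y_true)
    # Coefficient of determination against the overall mean of Y.
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    # Explained variance score: share of the variance of Y not left in the error.
    var = 1.0 - np.var(err) / np.var(y_true)

    return {"RMSE": rmse, "MAE": mae, "Accuracy": accuracy, "R2": r2, "Var": var}
```

Under these assumed definitions, a constant offset between prediction and truth raises RMSE and MAE and lowers $R^2$ but leaves Var at 1, which is one reason several indicators are reported together.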
Tables 2 and 3 show the traffic predictions of the GGCN-SA model and the other baseline methods on the METR-LA data set and the SZ-taxi data set, respectively, for prediction intervals of 15, 30, 45 and 60 minutes. The baseline methods do not combine spatial correlation with temporal dependency; they model either the time series or the spatial topology in a coarse-grained manner. In contrast, the GGCN-SA model established by the method models the spatial topology of the observation stations and mines spatio-temporal characteristics more effectively, which strengthens its representation of the spatio-temporal characteristics of traffic flow and yields more accurate predictions; its advantage is more pronounced on the METR-LA data set than on the SZ-taxi data set.
(1) Effect of prediction Algorithm on accuracy
From Tables 2 and 3 it can be seen that the neural-network-based methods, including the MLP model, the GCN model and the GRU model, when used to model the temporal characteristics of traffic flow, have better prediction accuracy than the other methods (e.g., the HA, ARIMA and SVR models). From Table 2, on the METR-LA data set and for the 15-minute traffic flow prediction, the MAE of the GGCN-SA model, the GCN model and the GRU model is reduced by about 24.82%, 22.86% and 21.00%, respectively, and the accuracy is improved by about 4.47%, 4.35% and 4.35%, respectively, compared with the HA model. Compared with the ARIMA model, the RMSE of the GGCN-SA model, the GCN model and the GRU model is reduced by about 49.85%, 48.17% and 47.93%, respectively, and the accuracy is improved by about 10.52%, 10.16% and 10.16%, respectively. Compared with the SVR model, the MAE of the GGCN-SA model, the GCN model and the GRU model is reduced by about 15.50%, 17.83% and 17.06%, respectively, and the accuracy is improved by 1.78%, 1.45% and 1.45%, respectively. This is mainly because the HA, ARIMA and SVR methods have difficulty capturing the spatio-temporal characteristics of traffic flow. The GCN model predicts less well because it considers only spatial features and ignores the temporal correlation of traffic data.
The GGCN-SA model was also tested on the SZ-taxi data set. As shown in Table 3, for the 15-minute traffic flow prediction, the RMSE of the GGCN-SA model, the GCN model and the GRU model is reduced by about 5.68%, 0.77% and 0.93%, respectively, compared with the HA model; the accuracy of the GGCN-SA model is improved by about 2.57%, while the accuracy of the GCN model and the GRU model is slightly lower than that of the HA model. Compared with the ARIMA model, the RMSE of the GGCN-SA model, the GCN model and the GRU model is reduced by 40.40%, 37.29% and 37.40%, respectively, and the accuracy is improved by 89.95%, 85.98% and 86.24%, respectively. This is mainly because ARIMA has difficulty capturing the spatio-temporal characteristics of traffic flow, and because the ARIMA error is obtained by calculating the error of each node and averaging, so fluctuations in some of the data increase the final total error. ARIMA therefore has the lowest prediction accuracy. Across the different time series, the GGCN-SA model obtains higher prediction precision on both data sets and shows better robustness, which demonstrates its accuracy and effectiveness in traffic flow prediction.
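The percentages quoted above are presumably relative changes with respect to the baseline model; the text does not state the formula, so the following sketch is an assumption. The helper names and the numeric inputs are hypothetical and are not taken from Tables 2 or 3.

```python
def relative_reduction(baseline, model):
    """Percentage by which an error metric (RMSE, MAE) drops relative to a baseline."""
    return (baseline - model) / baseline * 100.0


def relative_improvement(baseline, model):
    """Percentage by which a score (e.g. accuracy) rises relative to a baseline."""
    return (model - baseline) / baseline * 100.0


# Hypothetical values chosen only for illustration; not taken from the tables.
baseline_mae, model_mae = 4.0, 3.0
baseline_acc, model_acc = 0.80, 0.84

print(f"MAE reduced by about {relative_reduction(baseline_mae, model_mae):.2f}%")
print(f"Accuracy improved by about {relative_improvement(baseline_acc, model_acc):.2f}%")
```

Because the change is divided by the baseline value, a very low baseline accuracy produces a very large relative improvement, which would be consistent with the large accuracy gains reported against ARIMA on the SZ-taxi data set.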
(2) Effect of spatio-temporal correlation on prediction accuracy
To verify the influence of the spatio-temporal characteristics of traffic flow on the prediction result, the GGCN-SA model is compared with the GCN model and the GRU model. As shown for the SZ-taxi data set in Table 3, for the 30-minute prediction the RMSE of the GGCN-SA model is about 1.71% lower, and its accuracy about 0.99% higher, than those of the GCN model, which considers only spatial characteristics; the GGCN-SA model therefore represents the spatial characteristics of traffic flow better. Compared with the GRU model, which considers only temporal characteristics, the RMSE of the GGCN-SA model is about 4.66% lower and its accuracy about 2.00% higher; the GGCN-SA model therefore represents the temporal characteristics of traffic flow better. In summary, as can be seen from Tables 2 and 3, the overall performance of the GGCN-SA model is better than that of the GCN model and the GRU model for the 15-, 30-, 45- and 60-minute traffic flow predictions on both data sets, which shows that the GGCN-SA model captures the spatial and temporal characteristics of traffic flow simultaneously and represents them better.
Table 2: Prediction results of the GGCN-SA model and other baseline methods on the METR-LA data set (None means that the value is too small; best results are shown in bold)
Figure BDA0003100158480000101
Table 3: prediction results of GGCN-SA model and other baseline methods on SZ-taxi dataset
Figure BDA0003100158480000111
(3) Effect of attention mechanism on prediction results
The GGCN-SA model is compared with a model without an attention mechanism (GCN-GRU) to verify the impact of the attention mechanism on traffic flow prediction results. The results are shown in Table 4. For the 15-minute prediction, the MAE of the GGCN-SA model on the METR-LA and SZ-taxi data sets is reduced by about 5.89% and 3.26%, respectively, compared with the GCN-GRU model, and the accuracy is improved by about 0.55% and 1.27%, respectively. For the 30-minute prediction, the MAE is reduced by about 1.51% and 4.11%, respectively, and the accuracy is improved by about 0.11% and 1.28%, respectively. For the 45-minute prediction, the MAE is reduced by about 3.23% and 4.29%, respectively, and the accuracy is improved by about 0.34% and 1.13%, respectively. For the 60-minute prediction, the MAE of the GGCN-SA model on the METR-LA data set is inferior to that of the GCN-GRU model, while on the SZ-taxi data set the MAE of the GGCN-SA model is reduced by about 6.31% and the accuracy is improved by about 1.28%.
Table 4: comparison of GGCN-SA model with GCN-GRU model on two data sets, METR-LA and SZ-taxi
Figure BDA0003100158480000112
Figure BDA0003100158480000121
Therefore, as can be seen from the data in Table 4 and FIG. 10, the prediction error of the GGCN-SA model proposed herein is smaller than that of the model without the attention mechanism (GCN-GRU), and its prediction accuracy is higher across the different traffic data sets and prediction intervals, so the GGCN-SA model characterizes the spatio-temporal characteristics of traffic flow better.
In conclusion, the GGCN-SA model can always obtain the best result at different time intervals, which shows that the GGCN-SA model has better representation capability on the space-time characteristics of traffic flow. The model can also capture the variation trend of the traffic speed and predict the starting time and the ending time of the traffic flow peak period. The GGCN-SA model can better capture the space-time characteristics of traffic flow, thereby proving the accuracy and the effectiveness of the GGCN-SA model in real-time traffic prediction.
Although the invention has been described in detail hereinabove with respect to a general description and specific embodiments thereof, it will be apparent to those skilled in the art that modifications or improvements may be made thereto based on the invention. Accordingly, such modifications and improvements are intended to be within the scope of the invention as claimed.

Claims (1)

1. A traffic flow combination prediction model based on a graph convolution network and an attention mechanism, characterized in that: the traffic flow combined prediction model comprises three parts, namely a graph convolution network GCN, a gated recurrent unit GRU and a soft attention mechanism SoftAttention, wherein in the resulting GGCN-SA traffic flow combined prediction model, the GCN is used for capturing the topological structure of the graph to obtain spatial correlation, the GRU is used for capturing the dynamic change of node attributes to obtain temporal correlation, and the soft attention mechanism SoftAttention aggregates information in different neighborhood ranges so as to adaptively assign different degrees of attention to the traffic flow sequences at different moments and automatically distinguish the importance of each traffic flow sequence to the final prediction performance, thereby improving the accuracy of prediction; the GGCN-SA model is constructed by combining the GCN and the GRU, and n historical time series of traffic data are input into the GGCN-SA model to obtain n hidden states with spatio-temporal characteristics;
the construction and training of the traffic flow combined prediction model combining the graph convolution network GCN, the gated recurrent unit GRU and the soft attention mechanism comprise the following steps:
1) Describing the topological structure of the road network by using an unweighted graph $G = (V, E)$, regarding each sensor detection point as a node, and regarding the connection relation of any two sensors as an edge between the two corresponding nodes; where V is the set of road nodes, $V = \{v_1, v_2, \ldots, v_N\}$, N is the number of nodes, and E is the set of edges; representing the traffic flow observed on G as a graph signal
Figure DEST_PATH_IMAGE001
wherein P represents the number of node attribute features;
2) Regarding the traffic information on the road network as the attributes of the nodes in the network, represented by the feature matrix
Figure 551998DEST_PATH_IMAGE002
where P represents the number of node attribute features, i.e., the length of the historical time series, and
Figure DEST_PATH_IMAGE003
represents the traffic information on each road at time i;
3) Using an adjacency matrix
Figure 297DEST_PATH_IMAGE004
to represent the connections between roads, the adjacency matrix containing only the elements 0 and 1; if there is no link between two roads, the element is 0, otherwise it is 1;
4) By determining the topological relation between the central road section and the peripheral road sections, the GCN encodes the topological structure and the road section attributes of the road network at the same time, and maps the spatial characteristics and the relation of the traffic flow between the observation stations into a graph;
5) Inputting the output of the GCN into a GRU model, wherein the GRU model is used for capturing the time correlation of traffic data; given a time sequence $X_i$ ($i = 1, 2, \ldots, n$), where n is the time series length, the hidden states $H_i$ ($i = 1, 2, \ldots, n$) at different times are calculated using the GRU model and denoted $H = \{H_1, H_2, \ldots, H_n\}$; then, a scoring function is designed to calculate the weight of each hidden state; finally, the output $H_o$ of the soft attention mechanism model is calculated as a weighted average; the calculation formula of the GRU network unit is as follows:
Figure 796346DEST_PATH_IMAGE006
wherein $r_t$ represents the reset gate, and the smaller the reset gate, the less information of the previous state is carried in; $u_t$ represents the update gate, and the larger its value, the more state information of the previous moment is carried in; $x_t$ and $h_{t-1}$ respectively represent the input vector at the current time t and the output vector at time t-1, and $y_t$ represents the output vector at time t; $[\,]$ denotes the concatenation of two vectors; $\sigma$ represents the sigmoid activation function, which controls the opening or closing of the reset gate and the update gate; $c_t$ represents the memory content stored in the unit at time t; $W_u$ represents the weight of the update gate; $W_r$ represents the weight of the reset gate; $b_r$ represents the bias of the reset gate; $b_u$ represents the bias of the update gate; $W_c$ represents the weight of the cell storage; $b_c$ represents the bias of the cell storage; $W_o$ represents the weight of the output; $x$ represents the input vector;
inputting the hidden states into the soft attention model, aggregating information in different neighborhood ranges by using the soft attention mechanism, and then calculating a feature vector capable of expressing the change trend of the traffic state for the future traffic prediction task; designing a scoring function to calculate the weight of each hidden state, wherein the score $score_i$ of each feature is normalized by the Softmax function to obtain the final weight $\alpha_i$ of each hidden state:
$score_i = w_2 (w_1 H_i + b_1) + b_2$
Figure 722714DEST_PATH_IMAGE008
wherein $w_1$ and $w_2$ are the weights of the first and second layers of the two-layer GCN model, and $b_1$ and $b_2$ are the biases of the first layer and the second layer, respectively;
the feature vector covering the global traffic information changes is computed as a weighted sum; the output $H_o$ of the soft attention mechanism model is calculated as a weighted average as follows:
Figure DEST_PATH_IMAGE009
finally, outputting a prediction result by using a full connection layer;
the loss function $loss$ of the GGCN-SA model is as follows:
Figure 864982DEST_PATH_IMAGE010
wherein
Figure DEST_PATH_IMAGE011
$Y_t$ and
Figure 969335DEST_PATH_IMAGE012
respectively representing an actual traffic state and a predicted traffic state;
Figure DEST_PATH_IMAGE013
is a regularization parameter;
$w$ is the weight of the graph, $m = 2$, and $w_p$ takes the values $w_1$ and $w_2$, respectively.
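The data flow recited in the claim (adjacency matrix, GCN, GRU, soft attention, fully connected output, regularized loss) can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the symmetric normalization with self-loops, the ReLU activation, the standard GRU update, flattening each hidden state before scoring, the squared-error form of the loss, and every name and layer size below are assumptions, because the corresponding formulas are not recoverable from this text.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def normalized_adjacency(edges, n_nodes):
    """Build the 0/1 adjacency matrix, then D^-1/2 (A + I) D^-1/2 (assumed normalization)."""
    A = np.zeros((n_nodes, n_nodes))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0                      # unweighted, undirected road links
    A_tilde = A + np.eye(n_nodes)                    # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn(A_hat, X, W):
    """One graph convolution: aggregate neighbours, linear map, ReLU (assumed)."""
    return np.maximum(A_hat @ X @ W, 0.0)

def gru_step(x_t, h_prev, p):
    """Standard GRU update on the concatenation [h_{t-1}, x_t] (assumed form)."""
    hx = np.concatenate([h_prev, x_t], axis=1)
    u_t = sigmoid(hx @ p["W_u"] + p["b_u"])          # update gate
    r_t = sigmoid(hx @ p["W_r"] + p["b_r"])          # reset gate
    rhx = np.concatenate([r_t * h_prev, x_t], axis=1)
    c_t = np.tanh(rhx @ p["W_c"] + p["b_c"])         # candidate memory content
    return u_t * h_prev + (1.0 - u_t) * c_t          # larger u_t keeps more of h_prev

def soft_attention(H, w1, b1, w2, b2):
    """score_i = w2 (w1 H_i + b1) + b2, Softmax over time, weighted sum of H_i."""
    flat = H.reshape(H.shape[0], -1)                 # (n, N * hidden), assumed flattening
    scores = (flat @ w1 + b1) @ w2 + b2              # one score per time step
    scores = scores - scores.max()                   # numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()
    return np.tensordot(alpha, H, axes=1), alpha     # H_o has shape (N, hidden)

def forward(X_seq, A_hat, p):
    """X_seq: (n time steps, N nodes, P features) -> one predicted value per node."""
    n, N, _ = X_seq.shape
    h = np.zeros((N, p["hidden"]))
    states = []
    for t in range(n):
        g = gcn(A_hat, X_seq[t], p["W_g"])           # spatial correlation
        h = gru_step(g, h, p)                        # temporal correlation
        states.append(h)
    H_o, alpha = soft_attention(np.stack(states), p["w1"], p["b1"], p["w2"], p["b2"])
    return H_o @ p["W_fc"] + p["b_fc"], alpha        # fully connected output layer

def loss_fn(y_true, y_pred, p, lam=1e-3):
    """Squared error plus an L2 penalty on w1 and w2 (assumed form, m = 2)."""
    reg = sum(np.sum(p[k] ** 2) for k in ("w1", "w2"))
    return np.sum((y_true - y_pred) ** 2) + lam * reg

def init_params(N, P, gcn_out=8, hidden=16, att=4):
    """Random toy parameters; all sizes are hypothetical."""
    def w(*shape):
        return rng.normal(scale=0.1, size=shape)
    return {
        "hidden": hidden,
        "W_g": w(P, gcn_out),
        "W_u": w(hidden + gcn_out, hidden), "b_u": np.zeros(hidden),
        "W_r": w(hidden + gcn_out, hidden), "b_r": np.zeros(hidden),
        "W_c": w(hidden + gcn_out, hidden), "b_c": np.zeros(hidden),
        "w1": w(N * hidden, att), "b1": np.zeros(att),
        "w2": w(att), "b2": 0.0,
        "W_fc": w(hidden, 1), "b_fc": np.zeros(1),
    }

edges = [(0, 1), (1, 2), (2, 3)]                     # toy corridor of 4 sensors
A_hat = normalized_adjacency(edges, 4)
X_seq = rng.normal(size=(12, 4, 1))                  # 12 past steps, 1 feature per node
params = init_params(N=4, P=1)
y_hat, alpha = forward(X_seq, A_hat, params)
print(y_hat.shape, alpha.round(3), loss_fn(np.zeros((4, 1)), y_hat, params))
```

The attention weights `alpha` sum to one over the n hidden states, so `H_o` is a weighted average of the states, matching the weighted-average description in the claim; training (gradient computation and parameter updates) is omitted from this sketch.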
CN202110621902.7A 2021-06-04 2021-06-04 Traffic flow combination prediction model based on graph convolution network and attention mechanism Active CN113487856B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110621902.7A CN113487856B (en) 2021-06-04 2021-06-04 Traffic flow combination prediction model based on graph convolution network and attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110621902.7A CN113487856B (en) 2021-06-04 2021-06-04 Traffic flow combination prediction model based on graph convolution network and attention mechanism

Publications (2)

Publication Number Publication Date
CN113487856A CN113487856A (en) 2021-10-08
CN113487856B true CN113487856B (en) 2022-10-14

Family

ID=77934571

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110621902.7A Active CN113487856B (en) 2021-06-04 2021-06-04 Traffic flow combination prediction model based on graph convolution network and attention mechanism

Country Status (1)

Country Link
CN (1) CN113487856B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113935555B (en) * 2021-12-15 2022-03-18 华录易云科技有限公司 Road network structure-based situation adaptive traffic prediction method and system
CN114362858B (en) * 2021-12-27 2023-09-26 天翼物联科技有限公司 Narrowband Internet of things base station load prediction method, system and medium based on graph convolution
CN115392752B (en) * 2022-09-01 2023-11-28 亿雅捷交通系统(北京)有限公司 Subway short-time passenger flow prediction method, system, electronic equipment and storage medium
CN116504076A (en) * 2023-06-19 2023-07-28 贵州宏信达高新科技有限责任公司 Expressway traffic flow prediction method based on ETC portal data

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019081623A1 (en) * 2017-10-25 2019-05-02 Deepmind Technologies Limited Auto-regressive neural network systems with a soft attention mechanism using support data patches
CN109360207A (en) * 2018-09-26 2019-02-19 江南大学 A kind of fuzzy clustering method merging neighborhood information
CN109816101A (en) * 2019-01-31 2019-05-28 中科人工智能创新技术研究院(青岛)有限公司 A kind of session sequence of recommendation method and system based on figure convolutional neural networks
CN109754605B (en) * 2019-02-27 2021-12-07 中南大学 Traffic prediction method based on attention temporal graph convolution network
CN110675623B (en) * 2019-09-06 2020-12-01 中国科学院自动化研究所 Short-term traffic flow prediction method, system and device based on hybrid deep learning
CN111063194A (en) * 2020-01-13 2020-04-24 兰州理工大学 Traffic flow prediction method
CN111680075A (en) * 2020-04-16 2020-09-18 兰州理工大学 Hadoop + Spark traffic prediction system and method based on combination of offline analysis and online prediction
CN111932010B (en) * 2020-08-10 2023-09-22 重庆大学 Shared bicycle flow prediction method based on riding context information
CN112085163A (en) * 2020-08-26 2020-12-15 哈尔滨工程大学 Air quality prediction method based on attention enhancement graph convolutional neural network AGC and gated cyclic unit GRU
CN112183862A (en) * 2020-09-29 2021-01-05 长春理工大学 Traffic flow prediction method and system for urban road network
CN112289034A (en) * 2020-12-29 2021-01-29 四川高路交通信息工程有限公司 Deep neural network robust traffic prediction method based on multi-mode space-time data
CN112668700B (en) * 2020-12-30 2023-11-28 广州大学华软软件学院 Width graph convolution network model system based on grouping attention and training method
CN112785043B (en) * 2020-12-31 2022-08-30 河海大学 Flood forecasting method based on time sequence attention mechanism
CN112818035B (en) * 2021-01-29 2022-05-17 湖北工业大学 Network fault prediction method, terminal equipment and storage medium
CN112801404B (en) * 2021-02-14 2024-03-22 北京工业大学 Traffic prediction method based on self-adaptive space self-attention force diagram convolution

Also Published As

Publication number Publication date
CN113487856A (en) 2021-10-08

Similar Documents

Publication Publication Date Title
CN113487856B (en) Traffic flow combination prediction model based on graph convolution network and attention mechanism
CN109754605B (en) Traffic prediction method based on attention temporal graph convolution network
Lu et al. A combined method for short-term traffic flow prediction based on recurrent neural network
US11657708B2 (en) Large-scale real-time traffic flow prediction method based on fuzzy logic and deep LSTM
CN111612243B (en) Traffic speed prediction method, system and storage medium
WO2020220439A1 (en) Highway traffic flow state recognition method based on deep neural network
Yu et al. Policy-based reinforcement learning for time series anomaly detection
CN112949828B (en) Graph convolution neural network traffic prediction method and system based on graph learning
CN113268916A (en) Traffic accident prediction method based on space-time graph convolutional network
CN114299723B (en) Traffic flow prediction method
CN113570859B (en) Traffic flow prediction method based on asynchronous space-time expansion graph convolution network
CN115578852B (en) DSTGCN-based traffic prediction method
CN111598325A (en) Traffic speed prediction method based on hierarchical clustering and hierarchical attention mechanism
CN111047078B (en) Traffic characteristic prediction method, system and storage medium
CN116721537A (en) Urban short-time traffic flow prediction method based on GCN-IPSO-LSTM combination model
CN113033899B (en) Unmanned adjacent vehicle track prediction method
CN113505536A (en) Optimized traffic flow prediction model based on space-time diagram convolution network
CN116187835A (en) Data-driven-based method and system for estimating theoretical line loss interval of transformer area
CN115376317A (en) Traffic flow prediction method based on dynamic graph convolution and time sequence convolution network
CN112927507B (en) Traffic flow prediction method based on LSTM-Attention
CN111141879A (en) Deep learning air quality monitoring method, device and equipment
Xiao et al. A Hidden Markov Model based unscented Kalman Filtering framework for ecosystem health prediction: A case study in Shanghai-Hangzhou Bay Urban Agglomeration
Sardinha et al. Context-aware demand prediction in bike sharing systems: Incorporating spatial, meteorological and calendrical context
CN114596726A (en) Parking position prediction method based on interpretable space-time attention mechanism
CN117494034A (en) Air quality prediction method based on traffic congestion index and multi-source data fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant