CN115620514A - Traffic flow prediction method based on adaptive generalized PageRank graph neural network - Google Patents
- Publication number
- CN115620514A (application number CN202211156320.7A)
- Authority
- CN
- China
- Prior art keywords
- time
- pagerank
- generalized
- neural network
- space
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
- G08G1/01—Detecting movement of traffic to be counted or controlled
- G08G1/0104—Measuring and analyzing of parameters relative to traffic conditions
- G08G1/0125—Traffic data processing
- G08G1/0133—Traffic data processing for classifying traffic situation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
- G08G1/01—Detecting movement of traffic to be counted or controlled
- G08G1/0104—Measuring and analyzing of parameters relative to traffic conditions
- G08G1/0108—Measuring and analyzing of parameters relative to traffic conditions based on the source of data
Abstract
The invention provides a traffic flow prediction method based on an adaptive generalized PageRank graph neural network. The method comprises the following steps: acquiring point-of-interest (POI) information from public traffic flow data and constructing a distance code; constructing a time code from time information, and concatenating the distance code and the time code into a space-time code (DTE); constructing a generalized-PageRank-based spatio-temporal graph neural network model, taking the historical time-series feature H and the DTE as its input data and training the model; and inputting a historical traffic flow sequence into the trained model, which outputs the future traffic flow sequence. The invention designs a relative-position temporal attention layer (RPTA) to adaptively model the nonlinear correlations between different time steps, and designs distance and time codes that combine the geographic and temporal information of the road network, so that road traffic flow can be predicted effectively.
Description
Technical Field
The invention relates to the technical field of traffic flow prediction, and in particular to a traffic flow prediction method based on an adaptive generalized PageRank graph neural network.
Background
Traffic flow prediction is the task of predicting future traffic conditions, including but not limited to speed, flow, and congestion, from historical traffic data and the topology of the traffic network. It is one of the key problems in Intelligent Transportation Systems (ITS). Traffic systems are influenced by various external factors; traffic flows on different road sections and at different times are mutually correlated while also exhibiting strong randomness and complexity. Accurate traffic prediction requires analyzing the traffic conditions of an urban road network, such as speed and flow, mining traffic patterns, and then predicting the network's future conditions. Accurate traffic flow prediction can help traffic management departments control traffic and reduce congestion. Besides historical traffic conditions, traffic flow prediction needs to consider external factors that influence traffic, such as weather and holidays; even information from online social platforms can help. To model historical traffic time series, earlier traffic flow prediction studies used RNNs (Recurrent Neural Networks). Traffic conditions are also affected by surrounding areas, and CNNs (Convolutional Neural Networks) have been widely used in recent research to model such spatial correlation.
As research has deepened and neural networks have matured in recent years, researchers have formulated traffic flow prediction as a graph modeling problem. They typically use recurrent neural networks, temporal convolutional networks, or attention mechanisms to capture temporal correlations; these methods ignore the effect of the relative position between different time steps on the prediction task. Such studies also apply GNN (Graph Neural Network) based methods to model the traffic-condition relationships of neighboring nodes on the road network. While recent GNN-based approaches have achieved favorable performance, they typically require a graph pre-specified from prior knowledge to model spatial correlation.
At present, prior-art methods construct graph structures from distance or similarity functions, mainly in three common ways: 1) distance-function-based methods construct the graph from the road-network distance between sensors and build the adjacency matrix with a thresholded Gaussian kernel; 2) similarity-function-based methods build the adjacency matrix from node attributes (such as point-of-interest information and traffic connectivity) or from the similarity of traffic sequences; 3) methods combining distance and similarity functions use multiple graphs to encode different types of correlations between nodes or regions. Researchers have also proposed adaptive adjacency matrices and learnable node-embedding dictionaries as supplements to these three approaches. Furthermore, recent work increasingly encodes spatial and temporal information as vectors used as additional node features: a spatial embedding of each node is learned with methods such as node2vec, temporal information is one-hot encoded, and the two are concatenated as extra features of each node.
The drawbacks of these prior-art methods of constructing graph structures include the following. They require suitable domain expertise as prior knowledge, and model performance is highly sensitive to the quality of the constructed graph. Although quite intuitive, the graphs built in the three ways above consider incomplete information, target only one particular aspect, and cannot adapt directly to a specific downstream prediction task. The generated graph structure may also introduce new biases from specific domain knowledge and is unsuitable for direct application in a domain without the proper background knowledge. In general, most existing research either defines the adjacency matrix from prior knowledge (such as road-network distance) or merely constructs an adaptive matrix to replace it; these methods cannot simultaneously account for spatial, temporal, semantic, and other correlations.
Generally speaking, the traffic condition of an area is strongly correlated with its own history, and this correlation changes nonlinearly over time. Existing models largely ignore the influence of relative temporal position on the prediction result. Meanwhile, existing models use methods such as node2vec to extract structural information as extra node features, or use one-hot coding to encode date, day of week, time of day, and similar information into vectors. However, one-hot encoding cannot reflect the relative positional relationship between time steps, node2vec cannot conveniently handle newly added nodes, and these methods are ill-suited to regression learning.
Disclosure of Invention
The embodiment of the invention provides a traffic flow prediction method based on an adaptive generalized PageRank graph neural network, so as to effectively predict road traffic flow.
In order to achieve the purpose, the invention adopts the following technical scheme.
A traffic flow prediction method based on an adaptive generalized PageRank graph neural network comprises the following steps:
acquiring public traffic flow data, and preprocessing the public traffic flow data;
acquiring point-of-interest POI information from the public traffic flow data, and constructing a distance code based on the POI information;
constructing time information into a time code to obtain a historical time sequence characteristic H;
regarding the road network as a directed weighted graph, concatenating the distance code and the time code into the space-time code DTE, and using the DTE as an additional feature of the nodes in the directed weighted graph;
constructing a generalized PageRank-based space-time diagram neural network model, taking historical time sequence characteristics H and DTE as input data of the generalized PageRank-based space-time diagram neural network model, and training the generalized PageRank-based space-time diagram neural network model;
judging whether the training effect of the space-time diagram neural network model based on the generalized PageRank meets the requirement or not by using a verification set, if so, obtaining a trained space-time diagram neural network model based on the generalized PageRank, and storing corresponding space-time diagram neural network model parameters based on the generalized PageRank;
and inputting the historical traffic flow sequence into a trained time-space diagram neural network model based on the generalized PageRank, and outputting a future traffic flow sequence based on the trained time-space diagram neural network model based on the generalized PageRank.
Preferably, the acquiring point-of-interest POI information from the public traffic flow data and constructing a distance code based on the POI information includes:
treating a road network as a directed weighted graph Is a set of vertices representing nodes on the road network, which are traffic detectors, epsilon is a set of edges representing connectivity between vertices,is a contiguous matrix, provided with weighted graphsVertex vi in (1) generates observation sequence X I,: =X I,0 ,X I,1 ,...,X I,t ∈R t×c Where C is the number of traffic conditions, the goal of traffic flow prediction is to find a functionAccording to observed historical traffic flow sequence Q = { X :,t-q+1 ,X :,t-q+2 ,…,X :,t To predict future traffic flow sequence P = { X = :,t+1 ,X :,t+2 ,...,X :,t+p The value of } is;
drawingThe distance code DE of (3) is defined as a functionUse ofIs shown in whichDesigning a function zeta according to the probability of random walk from u to v;
whereinIs a random walk matrix and is a matrix of random walks,one group ofDistance coding DE (u) aggregated into vertices u;
using Sum pooling as aggregation function AGG, poI reflects the function of a region, subsetSelecting according to PoI information, collecting PoI information from OpenStreetMap by using OSMnx, and selecting m nodes with the maximum number of nearby PoIs as a subset
Preferably, constructing the time information into the time code includes:

taking each set interval of the week as a segment, with pos the position of the time within the week, i the dimension index, 10000 a hyperparameter, and d_model the dimension of the hidden layer in the generalized-PageRank-based spatio-temporal graph neural network model, the time code TE is expressed as:

TE(pos, 2i) = sin(pos / 10000^{2i/d_model})
TE(pos, 2i+1) = cos(pos / 10000^{2i/d_model})
preferably, regarding the road network as a directed weighted graph, concatenating the distance code and the time code into a space-time code DTE, and regarding the DTE as an additional feature of a node in the directed weighted graph, the method includes:
connecting the distance code DE and the time code TE, and transforming through a full connection layer to construct a space-time code DTE, wherein the DTE comprises geographic information and time information of a road network, and the DTE is used as a vertex characteristic of a node of each layer in the directed weighted graph.
Preferably, the constructing of the generalized PageRank-based space-time graph neural network model comprises:
constructing an adaptive generalized PageRank spatio-temporal graph neural network model with an encoder-decoder architecture, wherein both the encoder and the decoder are stacks of spatio-temporal blocks (ST-Blocks) with a residual structure; each ST-Block consists of an adaptive generalized PageRank (AGP) layer, a relative-position temporal attention (RPTA) layer, and a fusion layer; the AGP layer learns a hidden graph structure and hidden features and propagates the hidden features over the hidden graph structure through generalized PageRank, and the RPTA layer captures the correlations between different time steps through relative position information;

denoting the input of the l-th ST-Block as X^{(l)} and its output as X^{(l+1)}, the DTE is concatenated with X^{(l)} as a new vertex feature H^{(l)} = concat(X^{(l)}, DTE); H^{(l)} is used as the input of the AGP layer and the RPTA layer, and their outputs are fused by a gating mechanism into the final output X^{(l+1)} of the ST-Block;

a learnable node-embedding dictionary EA ∈ R^{N×d_model} is introduced in the AGP layer, where N is the number of nodes and d_model is the dimension of the node embedding, and a normalized symmetric adjacency matrix A_sym is derived from the EA of each layer;
A_sym with self-loops is defined as Ã = A_sym + I; generalized PageRank is applied to capture the correlations between vertices: a hidden state feature Z^{(0)} = f_θ(H^{(l)}) is generated for each node, where f_θ is a fully connected layer, and the hidden state features are propagated using generalized PageRank:

Z = Σ_{k=0}^{K} γ_k Ã^k Z^{(0)}

where the weights γ_k are learnable model parameters;
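The propagation step can be sketched in a few lines of NumPy: `sym_norm` builds a self-looped normalized adjacency, and `gpr_propagate` applies the generalized PageRank polynomial Z = Σ_k γ_k Ã^k Z^(0), consistent with the learnable weights γ_k described above (here passed in as fixed numbers for the sketch; function names are illustrative).

```python
import numpy as np

def sym_norm(A):
    """Normalized symmetric adjacency with self-loops:
    A_tilde = D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return (A_hat * d_inv_sqrt[:, None]) * d_inv_sqrt[None, :]

def gpr_propagate(A_tilde, Z0, gammas):
    """Generalized PageRank propagation: Z = sum_k gammas[k] * A_tilde^k @ Z0."""
    Zk, out = Z0, gammas[0] * Z0
    for g in gammas[1:]:
        Zk = A_tilde @ Zk      # one more propagation hop
        out = out + g * Zk
    return out
```

Because the γ_k may take arbitrary (even negative) values, the model can weight near and far neighborhoods differently, which is the point of generalized PageRank over a fixed teleport factor.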
the RPTA layer is designed by using a relationship-aware self-attention mechanism, and the formula is as follows:
where T is the number of time steps,is a matrix of parameters that is a function of,is a relative position representation, the weight coefficient a ij Calculated using the equation softmax function:
whereinIs a model learnable parameter, d model Is the dimension of the hidden layer or layers,is a relative position representation;
applying the multi-head attention mechanism to the RPTA layer, the relative position representations are obtained by clipping the relative distance j-i to the range [-k, k]:

clip(j-i, k) = max(-k, min(k, j-i))
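A single-head sketch of the clipped relative-position attention, following the relation-aware self-attention style the RPTA layer is described with; `rel_k`/`rel_v` stand for the learned relative-position tables of shape (2k+1, d), and all names are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def relative_attention(X, Wq, Wk, Wv, rel_k, rel_v, k=2):
    """Self-attention over T time steps where the distance j - i is
    clipped to [-k, k] and indexes relative-position tables rel_k, rel_v."""
    T, d = X.shape
    idx = np.clip(np.arange(T)[None, :] - np.arange(T)[:, None], -k, k) + k
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # e_ij = Q_i (K_j + a^K_ij)^T / sqrt(d)
    scores = (Q @ K.T + np.einsum('id,ijd->ij', Q, rel_k[idx])) / np.sqrt(d)
    alpha = softmax(scores)
    # z_i = sum_j alpha_ij (V_j + a^V_ij)
    return alpha @ V + np.einsum('ij,ijd->id', alpha, rel_v[idx])
```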
Preferably, the constructing a time-space diagram neural network model based on the generalized PageRank further includes:
a fusion layer with a gating mechanism is used to fuse the outputs of the AGP layer and the RPTA layer; the formula of the fusion layer is as follows:

z = σ(W_0(X_s + X_t))
fusion(X_s, X_t) = z ⊙ (W_1 · X_s) + (1 - z) ⊙ (W_2 · X_t)

where X_s is the output of the AGP layer, X_t is the output of the RPTA layer, σ denotes the sigmoid function, and ⊙ denotes element-wise multiplication of matrices;
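The gated fusion maps directly to a few lines of NumPy (weight shapes and names are illustrative; in the model W_0, W_1, W_2 are learned):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(Xs, Xt, W0, W1, W2):
    """z = sigmoid(W0 (Xs + Xt));
    fusion = z * (W1 Xs) + (1 - z) * (W2 Xt), element-wise."""
    z = sigmoid((Xs + Xt) @ W0)
    return z * (Xs @ W1) + (1.0 - z) * (Xt @ W2)
```

With W0 = 0 the gate sits at 0.5 and the output is the plain average of the two transformed branches; training moves the gate toward whichever of the spatial (AGP) or temporal (RPTA) branch is more informative per element.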
a transition attention layer is designed between the encoder and the decoder, which transforms the encoded traffic features to generate representations of future traffic conditions; define DTE^h = (DTE_1, ..., DTE_Q) as the DTE codes of a node's historical time steps and DTE^p = (DTE_1, ..., DTE_P) as the DTE codes of the time steps the node needs to predict (P = Q = 12), and let H_enc ∈ R^{N×T×d_model} (notation introduced here) be the encoded node features, where N is the number of nodes, T the number of time steps, and d_model the feature dimension; the correlation between each time step to be predicted and each historical time step is calculated from the DTE codes:

α_{ij} = softmax((DTE^p_i W^Q)(DTE^h_j W^K + a^K_{ij})^T / √d_model)

where a^K_{ij} is a relative position representation and W^Q, W^K are learnable model parameters; the attention scores α_{ij} convert the encoded node features into features correlated across the historical time steps, which serve as the input of the decoder:

H_dec,i = Σ_j α_{ij} H_enc,j
preferably, the training of the spatio-temporal map neural network model based on the generalized PageRank with the historical time series characteristics H and DTE as input data of the spatio-temporal map neural network model based on the generalized PageRank includes:
constructing a space-time diagram neural network model based on the generalized PageRank, initializing model parameters, and taking historical time sequence characteristics H and DTE as input data of the space-time diagram neural network model based on the generalized PageRank;
calculating a Mean Square Error (MSE) as a loss function according to output data and actual numerical values of the time-space diagram neural network model based on the generalized PageRank, and updating network parameters of the time-space diagram neural network model based on the generalized PageRank by using an Adam algorithm;
and if the space-time diagram neural network model based on the generalized PageRank converges or the required training steps are reached, ending the training process of the space-time diagram neural network model based on the generalized PageRank, and recording the optimal parameters of the space-time diagram neural network model based on the generalized PageRank according to the model effect on the verification set.
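The training procedure above (MSE loss, Adam updates, stop at convergence or a step budget) can be illustrated on a stand-in model; the sketch below fits a linear map with a hand-rolled Adam rather than the full spatio-temporal network, so every name here is illustrative.

```python
import numpy as np

def adam_step(p, g, m, v, t, lr=1e-2, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for parameter p with gradient g (state m, v, step t)."""
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return p - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

def train_mse(X, y, steps=3000, lr=1e-2):
    """Minimize MSE(Xw, y) with Adam, standing in for the model's training loop."""
    w = np.zeros(X.shape[1])
    m, v = np.zeros_like(w), np.zeros_like(w)
    for t in range(1, steps + 1):
        err = X @ w - y
        g = 2.0 * X.T @ err / len(y)   # gradient of the mean squared error
        w, m, v = adam_step(w, g, m, v, t, lr)
    return w
```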
According to the technical scheme provided by the embodiment of the invention, the embodiment of the invention designs a relative position time attention layer RPTA to adaptively model the nonlinear correlation between different time steps, designs distance and time codes to combine the geographic information and the time information of a road network, and can effectively predict the traffic flow of a road.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a processing flow chart of a traffic flow prediction method based on an adaptive generalized PageRank graph neural network according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a historical time series feature H according to an embodiment of the present invention.
FIG. 3 is an architecture diagram of an adaptive generalized PageRank neural network model according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of an adaptive generalized PageRank layer according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
As used herein, the singular forms "a", "an", "the" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or coupled. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
For the convenience of understanding of the embodiments of the present invention, the following detailed description will be given by way of example with reference to the accompanying drawings, and the embodiments are not limited to the embodiments of the present invention.
The invention provides a novel adaptive generalized PageRank graph neural network method for traffic flow prediction and designs an RPTA (Relative Position Temporal Attention) layer to adaptively model the nonlinear correlations between different time steps; it achieves results superior to baseline methods on two public datasets, providing good guidance and reference for subsequent research and application.
The processing flow of the traffic flow prediction method based on the adaptive generalized PageRank graph neural network disclosed by the invention is shown in Fig. 1 and comprises the following processing steps:
step S1: and acquiring public traffic flow data, and preprocessing the public traffic flow data.
METR-LA: this traffic dataset contains data collected on highways (Jagadish et al., 2014). We selected 207 detectors and collected data from March 1, 2012 to June 30, 2012 (4 months) for the experiments.
PEMS-BAY: this traffic dataset was collected by the California Transportation Agencies (Caltrans) Performance Measurement System (PeMS). We selected 325 detectors in the Bay Area and collected data from January 1, 2017 to May 31, 2017 (6 months) for the experiments.
Data preprocessing: 1) Data standardization; 2) Missing data interpolation; 3) Training samples were constructed from the training data slices.
Step S2: POI (Point of Interest) information in the public traffic flow data is obtained, and distance codes are constructed based on the POI information.
The PoI may reflect the function of an area, which has a significant effect on human mobility. In the present invention, we use OSMnx to collect the PoI information from OpenStreetMap, specifically the numbers of PoIs of different classes near each node.
And step S3: the time information is constructed as a time code TE.
Fig. 2 is a schematic diagram of the historical time-series feature H according to an embodiment of the present invention. The raw data points are spliced into the historical time-series feature H by sliding a time window: given a historical period of 1 hour (one data point every 5 minutes, 12 values in total) and a predicted period also of 1 hour (one data point every 5 minutes, 12 values in total), the two are spliced into one training sample. The 288 data points of each 24-hour day constitute the historical time series.
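The sliding-window slicing described above is a short routine; assuming the raw series has shape (total steps, nodes, channels), the sketch below produces (history, target) pairs with q = p = 12 five-minute steps (names are illustrative):

```python
import numpy as np

def make_samples(series, q=12, p=12):
    """Slide a window over a (T_total, N, C) series, pairing q history
    steps with the following p prediction steps."""
    xs, ys = [], []
    for t in range(len(series) - q - p + 1):
        xs.append(series[t:t + q])          # 1 hour of history
        ys.append(series[t + q:t + q + p])  # next 1 hour to predict
    return np.stack(xs), np.stack(ys)
```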
And step S4: the road network is regarded as a directed weighted graph, the distance code and the time code are concatenated into the DTE (Distance and Temporal Encoding), and the DTE is used as an additional feature of the nodes in the directed weighted graph.
Step S5: and constructing a generalized PageRank-based space-time diagram neural network model, and training the generalized PageRank-based space-time diagram neural network model by taking the historical time sequence characteristics H and DTE as input data of the generalized PageRank-based space-time diagram neural network model.
Calculating a Mean Square Error (MSE) as a loss function according to output data and actual numerical values of the time-space diagram neural network model based on the generalized PageRank, and updating network parameters of the time-space diagram neural network model based on the generalized PageRank by using an Adam algorithm;
and if the space-time diagram neural network model based on the generalized PageRank converges or the required training steps are reached, ending the training process of the space-time diagram neural network model based on the generalized PageRank, and recording the optimal parameters of the space-time diagram neural network model based on the generalized PageRank according to the model effect on the verification set.
Step S6: and judging whether the training effect of the space-time diagram neural network model based on the generalized PageRank meets the requirement or not by using a verification set, if so, obtaining the trained space-time diagram neural network model based on the generalized PageRank, and storing corresponding space-time diagram neural network model parameters based on the generalized PageRank.
Step S7: and inputting the historical traffic flow sequence into a trained time-space diagram neural network model based on the generalized PageRank, and outputting a future traffic flow sequence based on the trained time-space diagram neural network model based on the generalized PageRank.
The traffic flow prediction problem may be defined as follows: treat the road network as a directed weighted graph G = (V, E, A), where V is the set of vertices representing points on the road network (the traffic detectors), E is the set of edges representing connectivity between vertices, and A ∈ R^{N×N} is the adjacency matrix. Each vertex v_i of the weighted graph G generates an observation sequence X_{i,:} = (X_{i,0}, X_{i,1}, ..., X_{i,t}) ∈ R^{t×C}, where C is the number of traffic conditions (e.g., traffic speed). The goal of traffic flow prediction is to find a function F that predicts the values of the future traffic flow sequence P = {X_{:,t+1}, X_{:,t+2}, ..., X_{:,t+p}} from the observed historical traffic flow sequence Q = {X_{:,t-q+1}, X_{:,t-q+2}, ..., X_{:,t}}.
Fig. 3 is an architecture diagram of the adaptive generalized PageRank spatio-temporal graph neural network model according to an embodiment of the present invention. The model is designed as an encoder-decoder framework. Both the encoder and decoder are stacks of spatio-temporal blocks (ST-Blocks) with a residual structure. Each ST-Block is composed of an adaptive generalized PageRank layer (AGP), a relative-position temporal attention layer (RPTA), and a fusion layer. The AGP layer first learns hidden graph structures and hidden features and then propagates the hidden features over the hidden graph structures via the Generalized PageRank (GPR) technique. The RPTA layer captures the complex correlations between different time steps by means of relative position information. The transition attention layer transmits the encoded traffic features to the decoder. Spatial and temporal information are integrated into the node features through the Distance and Time Encoding (DTE). Furthermore, to apply the residual structure, all layers produce outputs of the same size.
(1) Space-time coding (DTE)
Since changes in traffic conditions are strongly influenced by the road network, it is important to incorporate the geographic information of the road network into the prediction model. Previous studies, such as GMAN, typically employ node2vec or one-hot node identifiers to encode the graph structure information of vertices as vectors. However, the node embeddings produced by node2vec and one-hot identifiers are not conducive to inductive learning, and one-hot encoding is too sparse. To this end, the invention designs a distance encoding (DE) for the graph G. DE is defined as a function ζ on the vertices; for brevity, we write DE(u) for the encoding of vertex u. The function ζ is designed according to the probability of a random walk from u to v on the graph.
where RW is the random walk matrix of the graph. The distance codes ζ(u, v) over the selected subset of vertices are then aggregated into the distance encoding DE(u) of vertex u.
Here, Sum pooling is used as the aggregation function AGG. Points of interest (PoI) can reflect the function of a region and, according to previous studies, have a significant effect on human mobility; the subset of vertices is therefore selected according to PoI information. In the present invention, OSMnx is used to collect the PoI information from OpenStreetMap, and the m nodes with the largest number of nearby PoIs are selected as this subset.
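A minimal sketch of this distance-encoding idea, assuming ζ(u, ·) collects the k-step random-walk probabilities from u and AGG is sum pooling over the PoI-selected subset (the exact form of ζ is not recoverable from the text, so K and the feature layout are assumptions):

```python
import numpy as np

def distance_encoding(A, subset, K=3):
    """Hedged sketch of DE: encode each vertex u by its k-step random-walk
    probabilities (k = 1..K) of landing on the PoI-rich subset, sum-pooled
    over that subset (AGG = sum)."""
    deg = A.sum(axis=1, keepdims=True)
    RW = A / np.maximum(deg, 1e-12)  # random walk matrix D^{-1} A
    powers = [np.linalg.matrix_power(RW, k) for k in range(1, K + 1)]
    # DE(u)_k = sum over v in subset of (RW^k)_{uv}
    return np.stack([sum(Pk[:, v] for v in subset) for Pk in powers], axis=1)

# Tiny star graph: node 0 connected to nodes 1 and 2; subset = {0}.
A = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)
DE = distance_encoding(A, subset=[0])
```

Each row of `DE` is a static, inductively computable spatial signature of one vertex.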
However, distance encoding only provides a static representation of spatial information and lacks temporal information. Previous research mainly applies one-hot encoding to encode information such as date, day of week, and time of day into vectors. However, one-hot encoding cannot reflect the relative positional relationship between time steps, so the invention designs a time encoding to solve this problem:
TE(pos, 2i) = sin(pos / 10000^(2i/d_model)), TE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))

where pos is the position of the time within the week (the week is divided into 2016 segments of 5 minutes each), i is the dimension index, 10000 is a hyperparameter, and d_model is the dimension of the hidden layer in our model.
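Given the symbols pos, i, 10000 and d_model, the time encoding appears to follow the sinusoidal Transformer positional encoding; a sketch under that assumption (d_model = 64 is illustrative):

```python
import numpy as np

def time_encoding(pos, d_model=64, base=10000.0):
    """Sinusoidal time encoding sketch: pos in [0, 2016) indexes the 5-minute
    slot of the week; even dims get sin, odd dims get cos (an assumption
    following the Transformer positional encoding)."""
    i = np.arange(d_model // 2)
    angles = pos / base ** (2 * i / d_model)
    te = np.empty(d_model)
    te[0::2] = np.sin(angles)
    te[1::2] = np.cos(angles)
    return te

te = time_encoding(pos=288)  # e.g. the same clock time one day later
```

Unlike one-hot codes, nearby slots of the week get similar vectors, so relative temporal position is preserved.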
Finally, the distance encoding and time encoding are concatenated and transformed through a fully connected layer to construct our space-time encoding (DTE). As shown in fig. 3, the DTE contains the geographic and temporal information of the road network and is used as an additional vertex feature in each layer of the proposed model.
(2) Space-time block
As shown in fig. 3, a space-time block (ST-Block) includes an adaptive generalized PageRank layer, a relative position temporal attention layer, and a fusion layer. We denote the input of block l as X^(l) and its output as X^(l+1). First, the DTE described above is concatenated with X^(l) to form a new vertex feature H^(l) = concat(X^(l), DTE). Then, H^(l) is used as the input to both the adaptive generalized PageRank layer and the relative position temporal attention layer. Finally, a gating mechanism fuses the outputs of the two layers into the final output X^(l+1) of the ST-Block.
Adaptive generalized PageRank layer (AGP)
Generally, traffic conditions in one area are greatly affected by nearby areas. The invention provides an adaptive generalized PageRank layer (AGP) to adaptively capture the complex spatial dependencies between sensors in a road network. The original intent of the AGP design is to dynamically assign different edge weights to reflect the different dependencies between pairs of nodes. Therefore, the invention adds a learnable node embedding dictionary E_A ∈ R^(N×d_model) to the AGP, where N is the number of nodes and d_model is the dimension of the node embedding. A normalized symmetric adjacency matrix A_sym can be derived from the E_A of each layer.
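The text does not spell out how A_sym is derived from E_A; one common construction in adaptive-graph models (an assumption here, following work such as Graph WaveNet, not necessarily the patent's exact formula) is a row-softmax over a ReLU similarity of the embeddings:

```python
import numpy as np

def adaptive_adjacency(E_A):
    """Hedged sketch: derive a normalized adjacency from a learnable node
    embedding dictionary E_A of shape (N, d_model) via softmax(ReLU(E E^T)).
    Note this yields a row-normalized (not strictly symmetric) matrix."""
    sim = np.maximum(E_A @ E_A.T, 0.0)            # ReLU similarity
    exp = np.exp(sim - sim.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)   # row-wise softmax

rng = np.random.default_rng(0)
A_sym = adaptive_adjacency(rng.standard_normal((5, 8)))
```

Because E_A is learnable, the edge weights adapt during training rather than being fixed by a predefined graph.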
Let Ã_sym denote A_sym augmented with self-loops. Generalized PageRank (GPR) is applied to capture the correlations between vertices; fig. 4 is a schematic structural diagram of the adaptive generalized PageRank layer according to an embodiment of the present invention. As shown in fig. 4, a hidden state feature H^(0)_v = f_θ(H_v) is first generated for each node v, where f_θ is a fully connected layer. The hidden state features are then propagated with GPR; the propagation process can be expressed by the following formula:

Z = Σ_{k=0..K} γ_k · Ã_sym^k · H^(0)
where the weights γ_k are model-learnable parameters. The AGP layer proposed by the invention can thus adaptively control, at each propagation step, how the node features are spread over the learned graph structure.
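The GPR propagation Z = Σ_k γ_k · Ã_sym^k · H^(0) can be sketched as follows; the learnable weights γ_k are fixed here for illustration, and the symmetric renormalization of the self-loop-augmented matrix is one plausible reading of Ã_sym:

```python
import numpy as np

def gpr_propagate(A_sym, H0, gamma):
    """Generalized PageRank propagation: Z = sum_k gamma_k * A_hat^k * H0,
    where A_hat is A_sym with self-loops, symmetrically normalized
    (D^{-1/2} (A + I) D^{-1/2})."""
    A_tilde = A_sym + np.eye(A_sym.shape[0])   # add self-loops
    d = A_tilde.sum(axis=1)
    A_hat = A_tilde / np.sqrt(np.outer(d, d))  # symmetric normalization
    Z, Hk = gamma[0] * H0, H0
    for g in gamma[1:]:                        # k = 1 .. K
        Hk = A_hat @ Hk
        Z = Z + g * Hk
    return Z

A = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = gpr_propagate(A, np.eye(2), gamma=[0.5, 0.5])
```

Because each γ_k is a free parameter, the model can weight near and far neighborhoods differently, unlike a fixed PageRank teleport factor.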
Relative Position Temporal Attention layer (RPTA)
Generally speaking, the traffic condition of an area is strongly correlated with the historical conditions of that area, and this correlation changes nonlinearly over time. In order to model this correlation along the complex time dimension, the invention designs a relative position temporal attention layer (RPTA) to adaptively model the nonlinear correlations between different time steps. RPTA is designed using a relation-aware self-attention mechanism. The formula is as follows:

z_i = Σ_{j=1..T} a_ij · (x_j·W^V + r^V_ij)
where T is the number of time steps, W^V is a parameter matrix, and r^V_ij is a relative position representation. The weight coefficients a_ij are calculated with the softmax function:

a_ij = exp(e_ij) / Σ_{k=1..T} exp(e_ik),  with  e_ij = (x_i·W^Q)(x_j·W^K + r^K_ij)^T / √d_model
where W^Q and W^K are model-learnable parameters, d_model is the dimension of the hidden layer, and r^K_ij is a relative position representation. In order to attend to information from different representation subspaces simultaneously, the invention applies a multi-head attention mechanism to the RPTA layer. The relative position representation is based on the assumption that information beyond a certain distance is not useful, and is computed as follows:
r^K_ij = w^K_{clip(j−i,k)},  r^V_ij = w^V_{clip(j−i,k)},  clip(j−i, k) = max(−k, min(k, j−i))
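The RPTA computation can be sketched as relation-aware self-attention in the style of Shaw et al., which the layer's symbols suggest; the exact projections, masking, and multi-head handling are assumptions in this single-head sketch:

```python
import numpy as np

def relative_self_attention(X, WQ, WK, WV, wK, wV, k):
    """Single-head sketch of RPTA. wK, wV are learnable tables of 2k+1
    relative-position vectors, indexed by clip(j - i, k) shifted to [0, 2k]."""
    T, d = X.shape
    idx = np.clip(np.arange(T)[None, :] - np.arange(T)[:, None], -k, k) + k
    rK, rV = wK[idx], wV[idx]                      # (T, T, d) relative reps
    Q, K, V = X @ WQ, X @ WK, X @ WV
    e = np.einsum("id,ijd->ij", Q, K[None] + rK) / np.sqrt(d)
    a = np.exp(e - e.max(axis=1, keepdims=True))
    a /= a.sum(axis=1, keepdims=True)              # softmax over time steps
    return np.einsum("ij,ijd->id", a, V[None] + rV)

d = 4
I = np.eye(d)
Z = relative_self_attention(np.ones((3, d)), I, I, I,
                            np.zeros((5, d)), np.zeros((5, d)), k=2)
```

With zero relative tables and identity projections the layer reduces to plain averaging, which is a convenient sanity check.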
A fusion layer:
Generally, the traffic condition of a road at a given time step is related both to the traffic conditions of nearby areas and to the traffic condition of the same area at previous time steps. To this end, the invention designs a fusion layer with a gating mechanism to integrate the outputs of the two preceding layers; the formulas are as follows:
z = σ(W_0(X_s + X_t))
fusion(X_s, X_t) = z ⊙ (W_1·X_s) + (1 − z) ⊙ (W_2·X_t)
where X_s is the output of the AGP layer, X_t is the output of the RPTA layer, σ denotes the sigmoid function, and ⊙ denotes element-wise multiplication of matrix entries.
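A sketch of the gated fusion, assuming the W matrices act on the feature (last) dimension:

```python
import numpy as np

def gated_fusion(X_s, X_t, W0, W1, W2):
    """z = sigmoid(W0(X_s + X_t)); out = z * (W1 X_s) + (1 - z) * (W2 X_t).
    The orientation of the W's (acting on the last axis) is an assumption."""
    z = 1.0 / (1.0 + np.exp(-((X_s + X_t) @ W0)))  # gate in (0, 1)
    return z * (X_s @ W1) + (1.0 - z) * (X_t @ W2)

out = gated_fusion(np.zeros((2, 2)), np.zeros((2, 2)),
                   np.eye(2), np.eye(2), np.eye(2))
```

The gate z lets the model decide, per element, whether the spatial or the temporal branch dominates.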
Transform attention layer
The invention designs a transform attention layer between the encoder and the decoder, which takes the encoded traffic features as input and generates representations of future traffic conditions. Define DTE_h = (DTE_1, …, DTE_Q) as the DTE encoding of a node's historical time steps and DTE_p = (DTE_1, …, DTE_P) as the DTE encoding of the time steps the node needs to predict (P = Q = 12), and let the encoded node features have shape (N, T, d_model), where N is the number of nodes, T is the number of time steps, and d_model is the feature dimension. From the DTE encodings, the correlation between each time step to be predicted and the historical time steps can be calculated; the formula is as follows:

a_ij = softmax_j( (DTE_{p,i}·W^Q)(DTE_{h,j}·W^K + r_ij)^T / √d_model )
where r_ij is a relative position representation and W^Q, W^K are model-learnable parameters. The attention scores a_ij are then used to transform the encoded node features into features correlated across the historical time steps, which serve as the input to the decoder:

H^dec_i = Σ_{j=1..Q} a_ij · H^enc_j
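A minimal sketch of the transform attention, assuming a scaled dot-product compatibility between the DTE of each step to be predicted and the DTEs of the historical steps (the learnable projections and relative terms are omitted here):

```python
import numpy as np

def transform_attention(DTE_pred, DTE_hist, H_enc):
    """Carry encoded features across time: scores between predicted-step and
    historical-step DTEs weight the encoded node features H_enc."""
    d = DTE_pred.shape[-1]
    e = DTE_pred @ DTE_hist.T / np.sqrt(d)        # (P, Q) compatibilities
    a = np.exp(e - e.max(axis=1, keepdims=True))
    a = a / a.sum(axis=1, keepdims=True)          # softmax over history
    return a @ H_enc                              # (P, d_model) decoder input

H_dec = transform_attention(np.ones((2, 4)), np.ones((3, 4)),
                            np.array([[0.0], [3.0], [6.0]]))
```

With identical DTEs the scores are uniform, so each decoder step receives the mean of the encoded history.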
the results of the comparison of the method designed by the present invention with previous traffic flow prediction algorithms are shown in table 1. As shown in the table, the invention achieves the highest accuracy under different setting conditions.
Table 1: Comparative experimental results of the invention on different data sets
In summary, the embodiments of the present invention propose a novel adaptive generalized PageRank graph neural network architecture whose performance is comparable to the most advanced available models, without requiring prior knowledge to generate a predefined graph structure. To this end, the proposed AGP layer can dynamically assign different edge weights to reflect the different dependencies between pairs of nodes in each model layer.
Generally speaking, the traffic condition of an area is strongly correlated with its historical conditions, and this correlation changes nonlinearly over time. Existing models largely ignore the influence of relative temporal position on the prediction result. To further model the correlation along the complex time dimension, the invention designs a relative position temporal attention layer (RPTA) to adaptively model the nonlinear correlations between different time steps.
Meanwhile, existing models use methods such as node2vec to extract structural information as extra node features, or use one-hot encoding to encode information such as date, day of week, and time of day into vectors. However, one-hot encoding cannot reflect the relative positional relationship between time steps, node2vec cannot conveniently handle newly added nodes, and these methods are not suitable for inductive learning. To address these defects, the invention designs the distance and time encodings to combine the geographic and temporal information of the road network.
The experimental results of the method of the invention on two real-world public transportation data sets illustrate the value of our method. Ablation studies demonstrate the importance of each component in the architecture.
Those of ordinary skill in the art will understand that: the figures are merely schematic representations of one embodiment, and the blocks or flow diagrams in the figures are not necessarily required to practice the present invention.
From the above description of the embodiments, it is clear to those skilled in the art that the present invention can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments.
The embodiments in the present specification are described in a progressive manner; the same and similar parts among the embodiments refer to one another, and each embodiment focuses on its differences from the others. In particular, apparatus or system embodiments, which are substantially similar to the method embodiments, are described relatively briefly, and reference may be made to the descriptions of the method embodiments for related points. The above-described embodiments of the apparatus and system are merely illustrative: the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
While the invention has been described with reference to specific preferred embodiments, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention as defined in the following claims. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (7)
1. A traffic flow prediction method based on an adaptive generalized PageRank graph neural network, characterized by comprising the following steps:
acquiring public traffic flow data, and preprocessing the public traffic flow data;
acquiring information point POI information in public traffic flow data, and constructing a distance code based on the POI information;
constructing time information into a time code to obtain a historical time sequence characteristic H;
the road network is regarded as a directed weighted graph, the distance code and the time code are concatenated to form the space-time code DTE, and the DTE is used as an additional feature of the nodes in the directed weighted graph;
constructing a generalized PageRank-based space-time diagram neural network model, taking historical time sequence characteristics H and DTE as input data of the generalized PageRank-based space-time diagram neural network model, and training the generalized PageRank-based space-time diagram neural network model;
judging whether the training effect of the space-time diagram neural network model based on the generalized PageRank meets the requirement or not by using a verification set, if so, obtaining a trained space-time diagram neural network model based on the generalized PageRank, and storing corresponding space-time diagram neural network model parameters based on the generalized PageRank;
and inputting the historical traffic flow sequence into a trained time-space diagram neural network model based on the generalized PageRank, and outputting a future traffic flow sequence based on the trained time-space diagram neural network model based on the generalized PageRank.
2. The method according to claim 1, wherein the acquiring information point POI information in the public traffic flow data and constructing distance codes based on the POI information comprises:
treating a road network as a directed weighted graph G = (V, E, A), wherein V is a set of vertices representing nodes on the road network, said nodes being traffic detectors, E is a set of edges representing connectivity between the vertices, and A ∈ R^(N×N) is an adjacency matrix; each vertex v_i of the weighted graph generates an observation sequence X_{i,:} = (X_{i,0}, X_{i,1}, …, X_{i,t}) ∈ R^(t×C), where C is the number of traffic conditions; the goal of traffic flow prediction is to find a function f that, according to the observed historical traffic flow sequence Q = {X_{:,t−q+1}, X_{:,t−q+2}, …, X_{:,t}}, predicts the values of the future traffic flow sequence P = {X_{:,t+1}, X_{:,t+2}, …, X_{:,t+p}};
the distance encoding DE of the graph G is defined as a function ζ on the vertices, with DE(u) written for the encoding of vertex u; the function ζ is designed according to the probability of a random walk from u to v;
wherein RW is the random walk matrix of the graph, and the distance codes over the selected subset of vertices are aggregated into the distance encoding DE(u) of vertex u;
3. The method of claim 1, wherein constructing the time information as a time code comprises:
taking each set time interval in the week as a segment, wherein pos is the position of the time within the week, i is the dimension index, 10000 is a hyperparameter, and d_model is the dimension of the hidden layer in the generalized-PageRank-based space-time graph neural network model; the expression of the time encoding TE is:

TE(pos, 2i) = sin(pos / 10000^(2i/d_model)), TE(pos, 2i+1) = cos(pos / 10000^(2i/d_model));
4. the method of claim 3, wherein the step of treating the road network as a directed weighted graph, the step of concatenating the distance code and the time code into a space-time code DTE, and the step of treating the DTE as an additional feature of a node in the directed weighted graph comprises:
concatenating the distance code DE and the time code TE and transforming them through a fully connected layer to construct the space-time code DTE, wherein the DTE comprises geographic information and time information of the road network and is used as a vertex feature of the nodes in each layer of the directed weighted graph.
5. The method according to claim 4, wherein the constructing of the generalized PageRank-based spatiotemporal neural network model comprises:
constructing an adaptive generalized PageRank space-time graph neural network model with an encoder-decoder architecture, wherein the encoder and the decoder are formed by stacking space-time blocks ST-Blocks having residual structures, each ST-Block consists of an adaptive generalized PageRank layer AGP, a relative position temporal attention layer RPTA, and a fusion layer, the AGP layer learns a hidden graph structure and hidden features, the hidden features are propagated over the hidden graph structure through the generalized PageRank, and the RPTA layer captures the correlations between different time steps through relative position information;
denoting the input of the l-th ST-Block as X^(l) and its output as X^(l+1), concatenating the DTE with X^(l) into a new vertex feature H^(l) = concat(X^(l), DTE), using H^(l) as the input of the AGP layer and the RPTA layer, and fusing the outputs of the AGP layer and the RPTA layer using a gating mechanism into the final output X^(l+1) of the ST-Block;
adding a learnable node embedding dictionary E_A ∈ R^(N×d_model) to the AGP layer, where N is the number of nodes and d_model is the dimension of the node embedding, and inferring a normalized symmetric adjacency matrix A_sym from the E_A of each layer;
defining Ã_sym as A_sym augmented with self-loops, applying the generalized PageRank to capture the correlations between vertices, generating a hidden state feature H^(0)_v = f_θ(H_v) for each node v, wherein f_θ is a fully connected layer, and propagating the hidden state features using the generalized PageRank, the propagation process being expressed by the following formula:

Z = Σ_{k=0..K} γ_k · Ã_sym^k · H^(0)
wherein the weights γ_k are model-learnable parameters;
the RPTA layer is designed using a relation-aware self-attention mechanism, with the formula:

z_i = Σ_{j=1..T} a_ij · (x_j·W^V + r^V_ij)
where T is the number of time steps, W^V is a parameter matrix, and r^V_ij is a relative position representation, the weight coefficients a_ij being calculated with the softmax function:

a_ij = exp(e_ij) / Σ_{k=1..T} exp(e_ik),  with  e_ij = (x_i·W^Q)(x_j·W^K + r^K_ij)^T / √d_model
wherein W^Q and W^K are model-learnable parameters, d_model is the dimension of the hidden layer, and r^K_ij is a relative position representation;
applying the multi-head attention mechanism to the RPTA layer, the relative position representation being given by:
r^K_ij = w^K_{clip(j−i,k)},  r^V_ij = w^V_{clip(j−i,k)},  clip(j−i, k) = max(−k, min(k, j−i))
6. The method according to claim 5, wherein the constructing a generalized PageRank-based space-time graph neural network model further comprises:
fusing the outputs of the AGP layer and the RPTA layer using a fusion layer with a gating mechanism, the formula of the fusion layer is as follows:
z = σ(W_0(X_s + X_t))
fusion(X_s, X_t) = z ⊙ (W_1·X_s) + (1 − z) ⊙ (W_2·X_t)
wherein X_s is the output of the AGP layer, X_t is the output of the RPTA layer, σ represents the sigmoid function, and ⊙ represents element-wise multiplication of matrix entries;
a transform attention layer is designed between the encoder and the decoder to take the encoded traffic features as input and generate representations of future traffic conditions; defining DTE_h = (DTE_1, …, DTE_Q) as the DTE encoding of a node's historical time steps and DTE_p = (DTE_1, …, DTE_P) as the DTE encoding of the time steps the node needs to predict (P = Q = 12), the encoded node features having shape (N, T, d_model), where N is the number of nodes, T is the number of time steps, and d_model is the feature dimension, the correlation between each time step to be predicted and the historical time steps is calculated from the DTE encodings, with the formula:
wherein r_ij is a relative position representation and W^Q, W^K are model-learnable parameters; the attention scores a_ij are used to transform the encoded node features into features correlated across the historical time steps, which serve as the input to the decoder.
7. the method according to claim 6, wherein the training of the generalized PageRank-based spatiotemporal neural network model using the historical time-series characteristics H and DTE as input data of the generalized PageRank-based spatiotemporal neural network model comprises:
constructing a space-time diagram neural network model based on the generalized PageRank, initializing model parameters, and taking historical time sequence characteristics H and DTE as input data of the space-time diagram neural network model based on the generalized PageRank;
calculating the mean square error (MSE) between the output of the generalized-PageRank-based space-time graph neural network model and the ground-truth values as the loss function, and updating the network parameters of the model using the Adam algorithm;
and if the space-time diagram neural network model based on the generalized PageRank converges or the required training steps are reached, ending the training process of the space-time diagram neural network model based on the generalized PageRank, and recording the optimal parameters of the space-time diagram neural network model based on the generalized PageRank according to the model effect on the verification set.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211156320.7A CN115620514A (en) | 2022-09-22 | 2022-09-22 | Traffic flow prediction method based on adaptive generalized PageRank graph neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115620514A true CN115620514A (en) | 2023-01-17 |
Family
ID=84858028
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211156320.7A Pending CN115620514A (en) | 2022-09-22 | 2022-09-22 | Traffic flow prediction method based on adaptive generalized PageRank graph neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115620514A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110047291A (en) * | 2019-05-27 | 2019-07-23 | 清华大学深圳研究生院 | A kind of Short-time Traffic Flow Forecasting Methods considering diffusion process |
CN111161535A (en) * | 2019-12-23 | 2020-05-15 | 山东大学 | Attention mechanism-based graph neural network traffic flow prediction method and system |
US20210064999A1 (en) * | 2019-08-29 | 2021-03-04 | Nec Laboratories America, Inc. | Multi-scale multi-granularity spatial-temporal traffic volume prediction |
US20210209939A1 (en) * | 2020-12-08 | 2021-07-08 | Harbin Engineering University | Large-scale real-time traffic flow prediction method based on fuzzy logic and deep LSTM |
CN113673769A (en) * | 2021-08-24 | 2021-11-19 | 北京航空航天大学 | Graph neural network traffic flow prediction method based on multivariate time sequence interpolation |
- 2022-09-22: CN CN202211156320.7A patent/CN115620514A/en — status: active, Pending
Non-Patent Citations (1)
Title |
---|
HE Wenwu et al.: "Traffic flow prediction based on adaptive gated graph neural network", Application Research of Computers, vol. 39, no. 8, 5 August 2022 (2022-08-05), pages 2306-2310 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||