CN115762147A - Traffic flow prediction method based on adaptive graph attention neural network - Google Patents
Publication number: CN115762147A · Application number: CN202211386613.4A · Authority: CN (China) · Legal status: Granted
Abstract
The invention discloses a traffic flow prediction method based on an adaptive graph attention neural network, aims to predict medium- and long-term traffic flow, and belongs to the technical field of urban traffic planning and flow prediction. The method comprises the following steps. Step 1: extract road flow data and preprocess it with an attention mechanism and a spatio-temporal data embedding method to obtain a preprocessed data sequence. Step 2: extract spatio-temporal features from the resulting data sequence. Step 3: after extraction through multiple network layers, aggregate the features with an improved multi-head attention mechanism and obtain the prediction through a fully connected layer. The method adopts multi-module parallel processing and an improved convolution scheme, which reduces training time. The method of the invention predicts traffic flow more accurately and completes the prediction task better.
Description
Technical Field
The invention belongs to traffic flow prediction in the field of spatio-temporal sequence prediction, and relates to a traffic flow prediction method based on an adaptive graph attention neural network, intended specifically for medium- and long-term flow prediction tasks in a traffic system.
Background
With the continuous development of the national economy, living standards keep rising and the number of private cars keeps growing, so the pressure borne by roads increases continuously; intelligent transportation systems were proposed to address this problem. Traffic flow prediction is a very important part of an intelligent transportation system: it greatly assists traffic scheduling, is indispensable for traffic management departments to allocate road resources reasonably or to provide more effective travel strategies to the public, and is one of the effective means of addressing current traffic efficiency problems.
At present, with the wide application of intelligent transportation systems (ITS), massive traffic data can be obtained in a timely manner, which further promotes research on traffic speed prediction. Fixed-position sensors on the road record traffic data, including speed, flow and position information. A close spatio-temporal relationship exists among these traffic characteristics; the key to traffic prediction is therefore capturing the dynamic spatio-temporal correlations of the data. However, this task is challenging due to the complexity and non-linearity of traffic data.
First, the spatial dependency of the nodes is dynamic. Complex dependencies exist between nodes, and the spatial relationships between nodes are not independent but change dynamically over time; however, several existing methods fail to model traffic data dynamically in both space and time. Second, the non-linearity of traffic speed changes and the propagation of errors during training make traditional deep learning methods insufficient for long-term prediction. Most importantly, these methods are based on a predefined graph structure matrix, which limits their ability to exploit the spatial dependencies in traffic data: they extract spatio-temporal features while ignoring the dynamic correlations of the traffic data.
Disclosure of Invention
The present invention is directed to solving the above problems of the prior art by providing a traffic flow prediction method based on an adaptive graph attention neural network. The technical scheme of the invention is as follows:
a traffic flow prediction method based on an adaptive graph attention neural network comprises the following steps:
step 1: setting time slices, collecting and counting the historical traffic flow information in each time slice through detection devices installed at traffic intersections, and forming a two-dimensional traffic flow matrix; dividing the obtained historical traffic flow information into a training set and a test set;
step 2: regarding the traffic flow detection device installed at each traffic intersection as a node, where the connections among the nodes within a single time slice, together with the traffic flow data, form a road network topology graph; connecting each node in adjacent time slices with its counterparts in the preceding and following time steps, and dynamically adjusting the corresponding node weights with an attention mechanism according to the road network flow information, so as to construct a spatio-temporal network sequence and obtain a local spatio-temporal connection graph;
step 3: expanding the convolution kernel with dilated convolution to enlarge the receptive field;
step 4: constructing a spatio-temporal graph convolution network model for predicting traffic flow based on an attention mechanism, wherein the model stacks multiple modules, processes and outputs data over a time period, and uses a multi-head attention mechanism with residual connections;
step 5: after splicing the outputs, obtaining the output of the gating mechanism block through a fully connected layer, wherein a residual structure is added when passing through the fully connected layer to prevent overfitting;
step 6: testing the trained adaptive graph attention neural network model with the test set and evaluating its error; if the error is larger than a set threshold, returning to step 2 and retraining the model;
step 7: inputting the traffic flow data of the preceding N time slices of the road section to be predicted into the trained adaptive graph attention neural network model, and predicting the traffic flow of that road section for the next N time slices.
Further, step 1 specifically comprises: setting 5 minutes as a time slice, collecting and counting the historical traffic flow information in each time slice through detection devices installed at traffic intersections, and forming a two-dimensional traffic flow matrix.
The historical traffic flow data comprise the license plate numbers of the motor vehicles passing the road section in a time slice, the passing time, the passing speed, the traffic flow information and the weather conditions of the day. Repeated and invalid data are cleaned, and the remaining data are subjected to z-score standardization: x*_i = (x_i − μ_i) / σ_i, where x_i is the original datum, x*_i the new datum, μ_i the mean, and σ_i the standard deviation; n is the number of stations in the road section.
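The z-score standardization above can be sketched in a few lines of numpy; the array shape (time slices × stations) and the toy values are assumptions for illustration, not taken from the patent.

```python
import numpy as np

def z_score_normalize(flow):
    """Standardize each station's series: x* = (x - mu) / sigma."""
    mu = flow.mean(axis=0)       # mean per station
    sigma = flow.std(axis=0)     # standard deviation per station
    return (flow - mu) / sigma

# 3 time slices x 2 stations of hypothetical flow counts
flow = np.array([[10., 20.], [12., 24.], [14., 28.]])
normalized = z_score_normalize(flow)
```

After standardization each station's series has zero mean and unit standard deviation, which keeps stations with very different traffic volumes on a comparable scale during training.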
Further, the road network topology graph in step 2 comprises: the set V of all nodes, with |V| = N the number of nodes; the edge set E between nodes; and the weighted adjacency matrix A of the edges between nodes, giving a road network topology graph G = (V, E, A). According to the topological structure of the local spatio-temporal graph, the correlation between each node and its spatio-temporal neighbors can be captured directly. The construction of the spatio-temporal network sequence uses A_S ∈ R^(N×N) as the adjacency matrix of the spatial graph and A_ST ∈ R^(3N×3N) as the adjacency matrix of a local spatio-temporal graph built on three consecutive spatial graphs. For a node i in the spatial graph, its new index in the local spatio-temporal graph is computed as (t − 1)N + i, where t (0 < t ≤ 3) denotes the time step within the local spatio-temporal graph. If two nodes are connected to each other in this local spatio-temporal graph, the corresponding value in the adjacency matrix is set to 1.
Here v_i denotes node i in the local spatio-temporal graph; the adjacency matrix A_ST covers 3N nodes, its block diagonal consists of the adjacency matrices of the spatial network at three consecutive time steps, and the blocks on either side of the diagonal represent the connectivity of each node to itself at the adjacent time step.
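The construction described above can be sketched as follows; the exact block layout is an assumption inferred from the text (spatial adjacency on the block diagonal, self-connections to the adjacent time step beside it).

```python
import numpy as np

def build_local_st_adjacency(A):
    """Build the 3N x 3N local space-time adjacency from an N x N spatial one."""
    N = A.shape[0]
    A_st = np.zeros((3 * N, 3 * N))
    I = np.eye(N)
    for t in range(3):                                # spatial edges at each step
        A_st[t*N:(t+1)*N, t*N:(t+1)*N] = A
    for t in range(2):                                # node i at step t <-> i at t+1
        A_st[t*N:(t+1)*N, (t+1)*N:(t+2)*N] = I
        A_st[(t+1)*N:(t+2)*N, t*N:(t+1)*N] = I
    return A_st

A = np.array([[0., 1.], [1., 0.]])   # toy 2-node spatial graph
A_st = build_local_st_adjacency(A)
# node i at (1-based) time step t gets index (t-1)*N + i, matching the text
```

Note that only adjacent time steps are linked; a node at step 1 has no direct edge to its copy at step 3.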
Further, in step 2 the attention mechanism dynamically adjusts the corresponding node weights; the specific steps are as follows:
Each node block represents the current flow state at time step t, and different colors represent different influence weights. The channels are divided along the time dimension, with one time step per channel; the aim is to dynamically adjust the spatio-temporal correlations by assigning dynamic weights to the features at different time steps. A channel attention mechanism is used to mine the dynamic spatio-temporal correlations in the data.
Feature compression of X is first performed by a global average pool, which converts each temporal channel into a single number so that each channel has a global receptive field in the spatial dimension: X_p = f_pool(X), where T denotes the historical time steps and N the total number of sensors. Since only the speed feature is considered here, C = 1 and X_p ∈ R^T. To learn the non-linear correlations in the data, X_p is passed through two fully connected layers:
x_att = W_2 δ(W_1 X_p)
where x_att is the attention coefficient, W_1 and W_2 are trainable parameters, r is the channel reduction ratio, and δ is the ReLU activation function. Furthermore, to obtain weight values between 0 and 1, x_att is recalibrated with the sigmoid activation function:
x′_att = σ(x_att)
Then the Hadamard product of x′_att and X yields the dynamically adjusted spatio-temporal feature data:
X_att = X ⊙ x′_att
Subsequently, X_att is sent to the gated dilated convolution module and the spatial convolution module to further capture spatio-temporal features.
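A minimal numpy sketch of this channel (time-step) attention follows. The squeeze-and-excitation weight shapes with reduction ratio r, and the random initialization, are assumptions for illustration; in the model W_1 and W_2 would be trained.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
relu = lambda z: np.maximum(z, 0.0)

T, N, r = 12, 4, 4                       # time steps, sensors, reduction ratio
X = rng.normal(size=(T, N))              # traffic features with C = 1

X_p = X.mean(axis=1)                     # squeeze: global average pool over space -> R^T
W1 = rng.normal(size=(T // r, T))        # first FC layer (compression by factor r)
W2 = rng.normal(size=(T, T // r))        # second FC layer (expansion back to T)
x_att = W2 @ relu(W1 @ X_p)              # excitation: x_att = W2 * delta(W1 * X_p)
x_att_prime = sigmoid(x_att)             # recalibrate weights into (0, 1)
X_att = X * x_att_prime[:, None]         # Hadamard product: reweight each time step
```

Each time step (channel) is scaled by a learned scalar in (0, 1), which is how the module emphasizes or suppresses individual historical time slices.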
Further, step 3 expands the convolution kernel with dilated convolution to enlarge the receptive field, specifically:
In this module, convolutions with different dilation rates are applied to the input to achieve short-, medium- and long-term prediction goals. X_att ∈ R^(T×N×C) is the input of the module, where T denotes the time steps of the input sequence. To mine short-, medium- and long-term features, dilated convolutions with dilation rates D = 1, 2, 5 and 11 are applied to X_att. After convolution along the time dimension, the results are concatenated as follows:
X_cat = concat(X_att ∗ f_D=1, X_att ∗ f_D=2, X_att ∗ f_D=5, X_att ∗ f_D=11)
where X_att ∗ f_D=i (i = 1, 2, 5, 11) denotes the dilated convolution and f ∈ R^(1×2) is the convolution kernel. To keep the dimensions consistent with the spatial features, the concatenated vector is dimension-converted through a fully connected layer:
X_D = FC(X_cat)
FC denotes the fully connected layer and X_D the resulting vector. Finally, a gating mechanism consisting of two parallel activation functions is used to control the transmission of temporal information: the tanh activation function is used to overcome the vanishing-gradient problem, while the sigmoid activation function maps the data between 0 and 1 and acts as the message-transmission control. X_D passes through the gating mechanism as:
H_T = g(X_D1) ⊙ σ(X_D2)
where g(X_D1) and σ(X_D2) denote the two activation branches and H_T is the resulting output.
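The gated dilated convolution can be sketched for a single sensor as below. Causal padding, the size-2 kernel, and random weights are assumptions for illustration; the four dilation rates, the concatenation, the FC projection, and the tanh/sigmoid gate follow the text.

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def dilated_conv1d(x, kernel, dilation):
    """Causal dilated convolution along time with a size-2 kernel f in R^(1x2)."""
    out = np.zeros(len(x))
    for t in range(len(x)):
        past = t - dilation
        out[t] = kernel[1] * x[t] + (kernel[0] * x[past] if past >= 0 else 0.0)
    return out

T = 24
x = rng.normal(size=T)                          # one sensor's input series
# four dilation branches, D = 1, 2, 5, 11, then concatenation (X_cat)
branches = [dilated_conv1d(x, rng.normal(size=2), D) for D in (1, 2, 5, 11)]
X_cat = np.concatenate(branches)

# fully connected layers restore the time dimension for the two gate branches
W1 = rng.normal(size=(T, 4 * T))
W2 = rng.normal(size=(T, 4 * T))
X_D1, X_D2 = W1 @ X_cat, W2 @ X_cat

# gate: the tanh branch carries content, the sigmoid branch controls transmission
H_T = np.tanh(X_D1) * sigmoid(X_D2)
```

The dilation rate multiplies the temporal reach of each kernel tap without adding parameters, which is what lets the D = 11 branch see long-term context.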
Further, step 4 specifically comprises:
A spatial convolution module is adopted, whose convolution operator is computed as:
H_S = (D̃^(−1/2) Ã D̃^(−1/2)) X Θ, with Ã = A + I_N
where Θ is a learnable parameter; A and Ã denote the adjacency matrix and its augmented matrix with self-loops (I_N), and D̃^(−1/2) Ã D̃^(−1/2) is its normalized form.
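The text above describes a spatial graph convolution built from the augmented, normalized adjacency matrix. The patent's exact operator is shown only as an image, so the following is a sketch of the standard first-order graph convolution this appears to describe; the toy graph and weights are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def graph_conv(X, A, Theta):
    """First-order graph convolution: (D~^-1/2 (A + I) D~^-1/2) X Theta."""
    N = A.shape[0]
    A_tilde = A + np.eye(N)                      # augmented adjacency with self-loops
    d = A_tilde.sum(axis=1)                      # degree of each node
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt    # symmetric normalization
    return A_hat @ X @ Theta                     # aggregate neighbors, mix features

N, C_in, C_out = 4, 3, 2
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)        # toy 4-node path graph
X = rng.normal(size=(N, C_in))                   # node features
Theta = rng.normal(size=(C_in, C_out))           # learnable weight
H = graph_conv(X, A, Theta)
```

The symmetric normalization keeps the aggregation from being dominated by high-degree intersections, which matters on real road networks where node degrees vary.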
A self-learning graph structure matrix is generated with an attention-based method; this matrix can learn hidden spatial correlations between nodes from the input data.
Here X ∈ R^(T×N×C) is the input of the model, and V_s ∈ R^(N×N), U_1 ∈ R^C, U_2 ∈ R^(T×C) and U_3 ∈ R^T are learnable parameters. Subsequently, all values are normalized with a Softmax function to obtain the adaptive graph structure matrix, which is fed to the graph convolution layer.
Finally, the temporal and spatial units are fused by a fusion mechanism, where H_T and H_S denote the outputs of the gating mechanism and of the graph convolution layer respectively, to obtain the output of each dynamic spatio-temporal block:
Y = Z_1 H_T + Z_2 H_S
where Z_1 and Z_2 are learnable parameter matrices.
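A sketch of the adaptive graph structure and the fusion step follows. The patent's attention formula is shown only as an image, so the bilinear form below is an assumption modeled on common spatial-attention designs; only the parameter shapes (V_s ∈ R^(N×N), U_1 ∈ R^C, U_2 ∈ R^(T×C), U_3 ∈ R^T) and the final Softmax normalization and fusion Y = Z_1 H_T + Z_2 H_S are taken from the text.

```python
import numpy as np

rng = np.random.default_rng(3)

def softmax(z, axis):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

T, N, C = 6, 4, 2
X = rng.normal(size=(T, N, C))           # model input
V_s = rng.normal(size=(N, N))            # learnable parameters, shapes per the text
U1 = rng.normal(size=(C,))
U2 = rng.normal(size=(T, C))
U3 = rng.normal(size=(T,))

# hypothetical bilinear attention score between every pair of nodes
left = np.einsum('tnc,c->tn', X, U1).T @ U2      # (N, C)
right = np.einsum('t,tnc->nc', U3, X)            # (N, C)
score = left @ right.T                           # (N, N) raw attention
A_hat = softmax(V_s * np.tanh(score), axis=1)    # adaptive adjacency, rows sum to 1

# fusion of the temporal and spatial branch outputs (stand-in values)
H_T = rng.normal(size=(N,))                      # gated temporal module output
H_S = A_hat @ H_T                                # spatial output via adaptive graph
Z1 = rng.normal(size=(N, N))
Z2 = rng.normal(size=(N, N))
Y = Z1 @ H_T + Z2 @ H_S
```

Because A_hat is recomputed from X, the effective graph structure changes with the traffic conditions instead of being fixed in advance, which is the point of the adaptive design.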
Further, step 5 specifically comprises:
The output layer converts the output of the last graph convolution layer into a traffic information sequence for T′ future time steps; the input of the output layer is obtained by transposing the input and reshaping it into X_T ∈ R^(T×N×C), and fully connected layers are then used to generate the prediction.
Here F(x_1, x_2, …, x_t) denotes the prediction result at the i-th time step, and the weights of the fully connected layers are learnable parameters.
Here Y denotes the ground truth, Ŷ the prediction of the model, and δ a threshold parameter that controls the range over which the squared-error loss applies.
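The Huber loss used for training, whose δ threshold is mentioned above, is standard and can be written directly; the toy target/prediction values are illustrative.

```python
import numpy as np

def huber_loss(y_true, y_pred, delta=1.0):
    """Huber loss: squared error within delta of the target, linear beyond it."""
    err = np.abs(y_true - y_pred)
    quadratic = 0.5 * err ** 2                   # small residuals: squared error
    linear = delta * err - 0.5 * delta ** 2      # large residuals: linear penalty
    return np.where(err <= delta, quadratic, linear).mean()

y_true = np.array([1.0, 2.0, 10.0])
y_pred = np.array([1.5, 2.0, 2.0])
loss = huber_loss(y_true, y_pred, delta=1.0)
```

The third sample (residual 8) contributes only linearly, so a single anomalous flow reading, e.g. from a faulty sensor, does not dominate the gradient the way it would under pure squared error.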
Further, step 7 inputs the traffic flow data of the preceding N time slices of the road section to be predicted into the trained adaptive graph attention neural network model and predicts the traffic flow of that road section for the next N time slices, specifically:
In this step, the historical data are represented as x = (x_t, x_{t−1}, …, x_{t−T+1}), an input traffic sequence of length T, and x′ = (x_{t+1}, x_{t+2}, …, x_{t+p}) is the predicted flow data for the following time steps; the mapping from x to x′ is defined by the prediction formula, whose weights are learnable parameters.
The invention has the following advantages and beneficial effects:
First, spatio-temporal heterogeneity is fully taken into account, and the spatial dependencies of the nodes are treated as dynamic when extracting the spatial features of the graph: complex dependencies exist between nodes, and the spatial relationships between nodes are not independent but change dynamically over time. Most importantly, existing methods are based on a predefined graph structure matrix, which limits their ability to exploit the spatial dependencies in traffic data. In the experiments, a time graph with richer features is obtained by combining a spatio-temporal embedding method with an attention mechanism, and during extraction the dynamic adjustment module adjusts the structure of the relevant graph in time to extract more complete spatio-temporal features, so a better prediction effect can be achieved.
Drawings
FIG. 1 is a flow diagram of an adaptive neural network prediction module provided by the present invention;
FIG. 2 is a schematic diagram of a dynamic adjustment module;
FIG. 3 is a schematic diagram of a feature extraction module.
Detailed Description
The technical solutions in the embodiments of the present invention will be described in detail and clearly with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention.
The technical scheme for solving the technical problems is as follows:
As shown in fig. 1, a traffic flow prediction method based on an adaptive graph attention neural network includes the following steps:
Step 1: set 5 minutes as a time slice, collect and count the historical traffic flow information in each time slice through detection devices installed at traffic intersections, and form a two-dimensional traffic flow matrix.
The historical traffic flow data comprise the license plate numbers of the motor vehicles passing the road section in a time slice, the passing time, the passing speed, the traffic flow information and the weather conditions of the day. Repeated and invalid data are cleaned, and the remaining data are subjected to z-score standardization: x*_i = (x_i − μ_i) / σ_i, where x_i is the original datum, x*_i the new datum, μ_i the mean, and σ_i the standard deviation.
Step 2: regard the traffic flow detection device installed at each traffic intersection as a node; the connections among the nodes within a single time slice, together with the traffic flow data, form a road network topology graph. Connect each node in adjacent time slices with its counterparts in the preceding and following time steps, and, considering that the weight of each node differs across time steps in the road network, adjust the corresponding node weights with an attention mechanism according to the road network flow information, so as to construct a spatio-temporal network sequence and obtain a local spatio-temporal connection graph.
The road network topology graph comprises: the set V of all nodes, with |V| = N the number of nodes; the edge set E between nodes; and the weighted adjacency matrix A of the edges between nodes, giving a road network topology graph G = (V, E, A). According to the topological structure of the local spatio-temporal graph, the correlation between each node and its spatio-temporal neighbors can be captured directly. The spatio-temporal network sequence is constructed using A_S ∈ R^(N×N) as the adjacency matrix of the spatial graph and A_ST ∈ R^(3N×3N) as the adjacency matrix of a local spatio-temporal graph built on three consecutive spatial graphs. For a node i in the spatial graph, its new index in the local spatio-temporal graph is computed as (t − 1)N + i, where t (0 < t ≤ 3) denotes the time step within the local spatio-temporal graph. If two nodes are connected to each other in this local spatio-temporal graph, the corresponding value in the adjacency matrix is set to 1.
Here v_i denotes node i in the local spatio-temporal graph. The adjacency matrix A_ST covers 3N nodes; its block diagonal consists of the adjacency matrices of the spatial network at three consecutive time steps, and the blocks on either side of the diagonal represent the connectivity of each node to itself at the adjacent time step.
The node-weight dynamic adjustment module in this step works as follows:
Referring to fig. 2, each node block represents the current traffic state at time step t, and different colors represent different influence weights. The channels are divided along the time dimension, with one time step per channel; the purpose is to dynamically adjust the spatio-temporal correlations by assigning dynamic weights to the features at different time steps. We use a channel attention mechanism to mine the dynamic spatio-temporal correlations in the data.
Feature compression of X is first performed by a global average pool, which converts each temporal channel into a single number so that each channel has a global receptive field in the spatial dimension: X_p = f_pool(X), with X_p ∈ R^T. To learn the non-linear correlations in the data, X_p is passed through two fully connected layers:
x_att = W_2 δ(W_1 X_p)
where W_1 and W_2 are trainable parameters, r is the channel reduction ratio, and δ is the ReLU activation function. Furthermore, to obtain weight values between 0 and 1, x_att is recalibrated with the sigmoid activation function:
x′_att = σ(x_att)
Then the Hadamard product of x′_att and X yields the dynamically adjusted spatio-temporal feature data:
X_att = X ⊙ x′_att
Then X_att is sent to the gated dilated convolution module and the spatial convolution module to further capture spatio-temporal features.
Step 3: we use an improved gated dilated convolution to aggregate long-term features. It is difficult for a convolutional network to obtain a large field of view because it is limited by the size of the convolution kernel. To enlarge the receptive field, convolution operations generally employ one of three methods: larger convolution kernels, deeper networks, or aggregation operations before convolution. Here we use dilated convolution, which expands the convolution kernel by inserting "dilation" gaps to obtain a larger receptive field.
In this block, we apply convolutions with different dilation rates to the input to achieve short-, medium- and long-term prediction goals. X_att ∈ R^(T×N×C) is the input of the module, where T denotes the time steps of the input sequence. To mine short-, medium- and long-term features, dilated convolutions with dilation rates D = 1, 2, 5 and 11 are applied to X_att. After convolution along the time dimension, the results are concatenated as follows:
X_cat = concat(X_att ∗ f_D=1, X_att ∗ f_D=2, X_att ∗ f_D=5, X_att ∗ f_D=11)
where X_att ∗ f_D=i (i = 1, 2, 5, 11) denotes the dilated convolution and f ∈ R^(1×2) is the convolution kernel. To keep the dimensions consistent with the spatial features, the concatenated vector is dimension-converted through a fully connected layer:
X_D = FC(X_cat)
Finally, a gating mechanism consisting of two parallel activation functions is used to control the transmission of temporal information: the tanh activation function is used to overcome the vanishing-gradient problem, while the sigmoid activation function maps the data between 0 and 1 and acts as the message-passing control. X_D passes through the gating mechanism as:
H_T = g(X_D1) ⊙ σ(X_D2)
and 4, step 4: and (3) extracting the space-time characteristics of the data sequence based on the preprocessed data sequence output in the step (2). And constructing a graph convolution network model for predicting traffic flow based on an attention mechanism, wherein the network processes and outputs data in a time period in a mode of overlapping a plurality of modules, and the attention mechanism is adopted to reduce the loss of characteristics as much as possible.
In this module, in order to make the model lightweight and reduce excessive overhead, we use a spatial convolution module, and the convolution operator uses the following calculation formula:
As can be seen from the evolution of GCN, the process of graph convolution is essentially determined by the adjacency matrix. A self-learning graph structure matrix is generated by adopting a method based on attention. The matrix may learn hidden spatial correlations between nodes from the input data.
Wherein X ∈ R T*N*C Is the input of the model, V ∈ R N*N ,U 1 ∈R C ,U 2 ∈R T*C And U 3 ∈R T Are learnable parameters. Subsequently, all data are normalized by using a Softmax function to obtain an adaptive graph structure matrixThe following graph is presented to curl the layers:
and finally, fusing the time-space units through a fusion mechanism to obtain the output of each dynamic time-space block.
Y=Z 1 H T +Z 2 H S
Wherein Z 1 And Z 2 Is a learnable parameter matrix.
Step 5: splice the obtained outputs and pass them through a fully connected layer to obtain the output of the gating mechanism block, where a residual structure is added when passing through the fully connected layer to prevent overfitting.
In this step, the output layer converts the output of the last graph convolution layer into a traffic information sequence for T′ future time steps; the input of the output layer is obtained by transposing the input and reshaping it into X_T ∈ R^(T×N×C), and fully connected layers are then used to generate the prediction.
Here F(x_1, x_2, …, x_t) denotes the prediction result at the i-th time step, and the weights of the fully connected layers are learnable parameters.
Step 6: test the trained adaptive graph attention neural network model with the test set and evaluate its error; if the error is greater than a set threshold, return to step 2 and retrain the model.
In this step, we select the Huber loss (Huber 1992) as the loss function; it is less sensitive to outliers than the squared-error loss:
L(Y, Ŷ) = ½(Y − Ŷ)² if |Y − Ŷ| ≤ δ, and δ|Y − Ŷ| − ½δ² otherwise,
where Y denotes the ground truth, Ŷ the prediction of the model, and δ a threshold parameter that controls the range over which the squared-error loss applies.
Step 7: input the traffic flow data of the preceding N time slices of the road section to be predicted into the trained adaptive graph attention neural network model, and predict the traffic flow of that road section for the next N time slices.
In this step, the historical data are represented as x = (x_t, x_{t−1}, …, x_{t−T+1}), an input traffic sequence of length T, and x′ = (x_{t+1}, x_{t+2}, …, x_{t+p}) is the flow data we predict for the following time steps; the mapping is defined by the prediction formula, whose weights are learnable parameters.
the systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random-access memory (SRAM), dynamic random-access memory (DRAM), other types of random-access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.
The above examples are to be construed as merely illustrative and not limitative of the remainder of the disclosure. After reading the description of the invention, the skilled person can make various changes or modifications to the invention, and these equivalent changes and modifications also fall into the scope of the invention defined by the claims.
Claims (9)
1. A traffic flow prediction method based on a self-adaptive graph attention neural network is characterized by comprising the following steps:
step 1: setting time slices; collecting and counting, via detection devices installed at traffic intersections, the historical traffic flow information within each time slice, and forming a two-dimensional traffic flow matrix; dividing the obtained historical traffic flow information into a training set and a test set;
step 2: regarding the traffic flow detection device installed at each traffic intersection as a node, where the connections among nodes within a single time slice, together with the traffic flow data, form a road network topology graph; connecting each node in adjacent time slices to its counterpart nodes in the preceding and following time steps, and dynamically adjusting the corresponding node weights with an attention mechanism according to the road network flow information, so as to construct a space-time network sequence and obtain a local space-time graph;
step 3: enlarging the receptive field by expanding the convolution kernel with dilated convolution;
step 4: constructing an attention-based space-time graph convolutional network model for predicting traffic flow, wherein the model stacks a plurality of modules, processes and outputs data over a time period, and combines a multi-head attention mechanism with residual connections;
step 5: after the outputs are spliced, obtaining the output of the gating mechanism block through a fully connected layer, wherein a residual structure is added when the spliced output passes through the fully connected layer to prevent overfitting;
step 6: testing the trained adaptive graph attention neural network model with the test set and evaluating the model error; if the error is larger than a set threshold, returning to step 2 and retraining the model;
step 7: inputting the traffic flow data of the preceding N set time slices of the road section to be predicted into the trained adaptive graph attention neural network model, and predicting the traffic flow of that road section for the next N time slices.
2. The traffic flow prediction method based on the adaptive graph attention neural network according to claim 1, wherein step 1 specifically comprises: setting 5 minutes as one time slice, collecting and counting, via detection devices installed at traffic intersections, the historical traffic flow information within each time slice, and forming a two-dimensional traffic flow matrix;
the historical traffic flow data comprise the license plate numbers of motor vehicles passing the road section within a time slice, the passing time, the passing speed, the traffic flow information, and the weather conditions of the day; repeated data and invalid data are cleaned, and the remaining data are subjected to z-score standardization: x̃_i = (x_i − μ_i) / σ_i, where x_i is the original data, x̃_i is the new data, μ_i is the mean, σ_i is the standard deviation, and n is the number of stations in the road section.
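The z-score step above can be sketched in a few lines of numpy; the per-detector (column-wise) statistics, the toy data, and the zero-variance guard are illustrative assumptions, not part of the claim:

```python
import numpy as np

def z_score_normalize(flow):
    """Per-detector z-score normalization of a (time slices, detectors) matrix.

    A sketch of the standardization in claim 2; column-wise statistics and
    the guard for constant columns are assumptions of this example.
    """
    flow = np.asarray(flow, dtype=float)
    mu = flow.mean(axis=0)           # mean mu_i per detector
    sigma = flow.std(axis=0)         # standard deviation sigma_i per detector
    sigma[sigma == 0] = 1.0          # avoid division by zero for flat columns
    return (flow - mu) / sigma

# Toy example: 4 time slices x 2 detectors
X = np.array([[10., 100.], [12., 110.], [14., 120.], [16., 130.]])
Xn = z_score_normalize(X)
```

After normalization each detector's column has zero mean and unit variance, which is what allows flows of very different magnitudes to share one model.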
3. The traffic flow prediction method based on the adaptive graph attention neural network according to claim 1, wherein the road network topology graph in step 2 comprises: the set V of all nodes, where |V| = N denotes the number of nodes; the edge set E between nodes; and the weighted adjacency matrix A of the edges between nodes, giving the road network topology graph G = (V, E, A); the correlation between each node and its spatio-temporal neighbors is captured directly from the topological structure of the local space-time graph; the construction of the space-time network sequence uses A ∈ R^{N×N} to denote the adjacency matrix of the spatial graph and A′ ∈ R^{3N×3N} to denote the adjacency matrix of the local space-time graph constructed over three consecutive spatial graphs; for a node i in the spatial graph, its new index in the local space-time graph is computed as (t − 1)N + i, where t (0 < t ≤ 3) denotes the time step within the local space-time graph; if two nodes are connected in the local space-time graph, the corresponding entry of the adjacency matrix is set to 1; the adjacency matrix of the local space-time graph can thus be written in block form as:

A′ = [[A, I_N, 0], [I_N, A, I_N], [0, I_N, A]]

where the adjacency matrix A′ covers 3N nodes, its diagonal blocks are the adjacency matrices of the spatial network at three consecutive time steps, and the blocks beside the diagonal represent the connection of each node to itself at the adjacent time step.
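The block structure of the local space-time adjacency matrix described above can be assembled as follows; the block-tridiagonal layout with identity off-diagonal blocks is inferred from the claim's description and is an assumption of this sketch, not the patent's exact code:

```python
import numpy as np

def local_st_adjacency(A):
    """Assemble the 3N x 3N adjacency matrix of a local space-time graph.

    Diagonal blocks: the spatial adjacency A at three consecutive time steps.
    Off-diagonal blocks: identities connecting each node to itself at the
    adjacent time step, as described in claim 3.
    """
    A = np.asarray(A, dtype=float)
    N = A.shape[0]
    I = np.eye(N)
    Z = np.zeros((N, N))
    return np.block([
        [A, I, Z],
        [I, A, I],
        [Z, I, A],
    ])

# Node i at time step t (1 <= t <= 3) receives the new index (t - 1) * N + i.
A = np.array([[0., 1.], [1., 0.]])   # toy 2-node spatial graph
A_st = local_st_adjacency(A)
```

For example, node 0 at t = 1 (index 0) is linked to itself at t = 2 (index (2 − 1)·2 + 0 = 2), but not to itself at t = 3, since only adjacent time steps are connected.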
4. The traffic flow prediction method based on the adaptive graph attention neural network according to claim 3, wherein the step 2 dynamically adjusts the corresponding weights of the nodes by adopting an attention mechanism, and comprises the following specific steps:
each node block represents the current flow state at the time step t, and different colors represent different influence weights; the channels are divided in the time dimension, wherein one time step is a channel, and the aim is to dynamically adjust the space-time correlation by distributing dynamic weights to the characteristics at different time steps. Mining dynamic spatiotemporal correlations between data using a channel attention mechanism;
the feature compression of X is first performed by a global averaging pool, which converts each temporal channel into a number such that each channel has a global acceptance field in the spatial dimension;
X p 、f pool (X) indicates that each channel has a global receptive field in the spatial dimension, where T represents the historical time step and N represents the total number of sensors. Since the present study only considers the speed characteristics, C =1. Wherein X p ∈R T To learn the non-linear correlation between data, this equation is passed through two fully connected layers;
x_att = W_2 δ(W_1 X_p)
x_att denotes the attention coefficient, where W_1 ∈ R^{(T/r)×T} and W_2 ∈ R^{T×(T/r)} are trainable parameters, r denotes the scaling ratio of the channel, and δ denotes the ReLU activation function; furthermore, to obtain a weight value between 0 and 1, x_att is recalibrated using the sigmoid activation function as follows:
x′_att = σ(x_att)
then, the Hadamard product of x′_att and X is used to obtain the dynamically adjusted spatio-temporal feature data as follows:
X_att = X ⊙ x′_att
then, X_att is sent to the gated dilated convolution module and the spatial convolution module to further capture spatio-temporal features.
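A minimal numpy sketch of the channel attention of claim 4, for C = 1 so the input is a (T, N) matrix; the squeeze-and-excitation shapes of W1 and W2 and the random toy data are assumptions of this example:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def temporal_channel_attention(X, W1, W2):
    """SE-style attention over the T time channels, for C = 1.

    X: (T, N); W1 of shape (T/r, T) and W2 of shape (T, T/r) stand in for
    the trainable parameters of the two fully connected layers.
    """
    Xp = X.mean(axis=1)                    # global average pool -> X_p in R^T
    x_att = W2 @ np.maximum(W1 @ Xp, 0.0)  # two FC layers with ReLU between
    x_att = sigmoid(x_att)                 # recalibrate weights into (0, 1)
    return X * x_att[:, None]              # Hadamard product: X_att = X . x'_att

rng = np.random.default_rng(0)
T, N, r = 4, 3, 2
X = rng.normal(size=(T, N))
W1 = rng.normal(size=(T // r, T))
W2 = rng.normal(size=(T, T // r))
X_att = temporal_channel_attention(X, W1, W2)
```

Because the sigmoid output lies in (0, 1), each time step's features are attenuated rather than amplified, which is the dynamic re-weighting the claim describes.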
6. The traffic flow prediction method based on the adaptive graph attention neural network according to claim 4, wherein step 3, enlarging the receptive field by expanding the convolution kernel with dilated convolution, specifically comprises:
in this module, convolutions with different dilation rates are applied to the input to achieve short-, medium-, and long-term prediction goals; X_att ∈ R^{T×N×C} is the input of the module, where T denotes the time steps of the input sequence; to mine short-, medium-, and long-term features, dilated convolution with dilation rates D = 1, 2, 5, 11 is applied to X_att; then, after convolution along the time dimension, the results are concatenated as follows:
X_cat = concat(X_att ∗ f_{D=1}, X_att ∗ f_{D=2}, X_att ∗ f_{D=5}, X_att ∗ f_{D=11})
where X_att ∗ f_{D=i} (i = 1, 2, 5, 11) denotes dilated convolution with kernel f ∈ R^{1×2}; to remain consistent with the dimensions of the spatial features, the concatenated vector is dimension-converted through a fully connected layer:
X_D = FC(X_cat)
FC denotes the fully connected layer and X_D the resulting vector; finally, a gating mechanism is used to control the transmission of temporal information; it consists of two parallel activation functions: the tanh activation function is used to overcome the vanishing-gradient problem, and the sigmoid activation function maps the data between 0 and 1 as message-passing control; X_D passes through the gating mechanism in the following form:
H_T = g(X_{D1}) ⊙ σ(X_{D2})
where g(X_{D1}) and σ(X_{D2}) respectively denote the two activation functions, and H_T is the output.
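The dilated-convolution and gating steps of claim 5 can be sketched for a single node's flow series; the causal zero padding, the averaging kernel f = [0.5, 0.5], and the random FC stand-in are illustrative assumptions of this example:

```python
import numpy as np

def dilated_conv_1d(x, f, d):
    """Causal dilated convolution along time with a length-2 kernel f:
    y[t] = f[0] * x[t - d] + f[1] * x[t], with zero padding at the front."""
    x_pad = np.concatenate([np.zeros(d), x])
    return f[0] * x_pad[:-d] + f[1] * x_pad[d:]

def gated_dilated_block(x, kernels):
    """Concatenate dilation rates D = 1, 2, 5, 11, mix with an FC stand-in,
    then apply the tanh/sigmoid gate H_T = g(X_D1) . sigma(X_D2)."""
    feats = np.stack([dilated_conv_1d(x, f, d) for d, f in kernels.items()])
    rng = np.random.default_rng(1)
    W = rng.normal(size=(2, len(kernels)))   # FC stand-in: 4 -> 2 channels
    X_D = W @ feats                          # rows play the roles of X_D1, X_D2
    return np.tanh(X_D[0]) * (1.0 / (1.0 + np.exp(-X_D[1])))

x = np.arange(16, dtype=float)                           # one node's series
kernels = {d: np.array([0.5, 0.5]) for d in (1, 2, 5, 11)}
H_T = gated_dilated_block(x, kernels)
```

Each dilation rate widens the temporal receptive field without adding parameters, and the tanh/sigmoid product keeps the gated output bounded.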
6. The traffic flow prediction method based on the adaptive graph attention neural network according to claim 5, wherein the step 4 specifically comprises:
the signal x on the graph G is filtered with a kernel G θ using a spatial convolution module as follows:
wherein,θ、I N are learnable parameters; A.respectively representing the adjacency matrix and the augmentation matrix thereof,the matrix is normalized for it.
an attention-based method is adopted to generate a self-learned graph structure matrix; this matrix can learn the hidden spatial correlations between nodes from the input data;
where X ∈ R^{T×N×C} is the input of the model and V_s ∈ R^{N×N}, U_1 ∈ R^C, U_2 ∈ R^{T×C}, and U_3 ∈ R^T are learnable parameters; subsequently, all entries are normalized with a Softmax function to obtain the adaptive graph structure matrix, which the graph convolution layer then uses to produce the spatial output;
finally, the temporal and spatial units are fused by a fusion mechanism, where H_T and H_S respectively denote the outputs of the gating mechanism and the graph convolution layer, yielding the output of each dynamic space-time block:
Y = Z_1 H_T + Z_2 H_S
where Z_1 and Z_2 are learnable parameter matrices.
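A loose numpy sketch of the adaptive graph of claim 6 and the fusion Y = Z_1 H_T + Z_2 H_S; the simplified bilinear score (using only V_s and one time projection) is an assumption of this example and omits the claim's U_2 and U_3 terms:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_adjacency(X, Vs, U1):
    """Self-learned graph structure from the input, loosely after claim 6.

    X: (T, N) input with C = 1; Vs: (N, N) learnable mask; U1: (T,) learnable
    time projection. The score below is a simplified stand-in for the claim's
    V_s / U_1 / U_2 / U_3 bilinear form.
    """
    left = X.T @ U1                  # (N,): per-node aggregate over time
    S = Vs * np.outer(left, left)    # (N, N) raw correlation scores
    return softmax(S, axis=1)        # row-wise Softmax -> adaptive graph

def fuse(H_T, H_S, Z1, Z2):
    """Fusion of temporal and spatial outputs: Y = Z1 * H_T + Z2 * H_S."""
    return Z1 * H_T + Z2 * H_S

rng = np.random.default_rng(2)
T, N = 4, 3
X = rng.normal(size=(T, N))
A_adp = adaptive_adjacency(X, rng.normal(size=(N, N)), rng.normal(size=T))
Y = fuse(rng.normal(size=N), rng.normal(size=N),
         rng.normal(size=N), rng.normal(size=N))
```

The Softmax normalization makes each row of the adaptive matrix a distribution over neighbors, so it can be plugged in wherever a normalized adjacency is expected.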
7. The traffic flow prediction method based on the adaptive graph attention neural network according to claim 6, wherein the step 5 specifically comprises:
the output layer converts the output of the last graph convolution layer into a traffic information sequence for T′ future time steps; the input of the output layer is obtained by transposing the input and reshaping it into X^T ∈ R^{T×N×C}, and fully connected layers are then used to generate the prediction.
8. The traffic flow prediction method based on the adaptive graph attention neural network according to claim 7, wherein step 6 selects the Huber loss as the loss function:

L_δ(y, ŷ) = ½(y − ŷ)² if |y − ŷ| ≤ δ, and δ|y − ŷ| − ½δ² otherwise,

where y is the observed value, ŷ the predicted value, and δ the threshold at which the loss changes from quadratic to linear.
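The Huber loss named in claim 8 has the textbook form sketched below; the default threshold δ = 1.0 is an assumed value for illustration, not one stated in the claim:

```python
import numpy as np

def huber_loss(y_true, y_pred, delta=1.0):
    """Huber loss: quadratic within delta of the target, linear outside.

    This is the standard definition; delta = 1.0 is an assumed default.
    """
    err = np.abs(y_true - y_pred)
    quad = 0.5 * err ** 2                      # small-error branch
    lin = delta * err - 0.5 * delta ** 2       # large-error branch
    return np.where(err <= delta, quad, lin).mean()

loss_small = huber_loss(np.array([1.0]), np.array([1.5]))   # |err| <= delta
loss_large = huber_loss(np.array([1.0]), np.array([4.0]))   # |err| > delta
```

The linear branch caps the gradient for outliers, which is why Huber loss is a common choice over MSE for noisy traffic counts.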
9. The traffic flow prediction method based on the adaptive graph attention neural network according to claim 8, wherein step 7, inputting the traffic flow data of the preceding N set time slices of the road section to be predicted into the trained adaptive graph attention neural network model and predicting the traffic flow of that road section for the next N time slices, specifically comprises:
in this step, the historical data are represented as x = (x_t, x_{t−1}, ..., x_{t−T+1}), an input traffic sequence of length T, and x′ = (x_{t+1}, x_{t+2}, …, x_{t+p}) is the predicted flow data for the following time steps; the defining formula, where θ is a learnable parameter, is: (x_{t+1}, x_{t+2}, …, x_{t+p}) = F_θ(x_t, x_{t−1}, ..., x_{t−T+1}).
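The input/output mapping of claim 9 implies a sliding-window data preparation, which might look like the sketch below; the window builder and toy series are illustrative assumptions, not the model F_θ itself:

```python
import numpy as np

def make_windows(series, T, p):
    """Build (input, target) pairs from a 1-D flow series.

    Each input holds T consecutive time slices and each target the next p
    slices, mirroring x = (x_t, ..., x_{t-T+1}) -> (x_{t+1}, ..., x_{t+p}).
    """
    series = np.asarray(series, dtype=float)
    inputs, targets = [], []
    for start in range(len(series) - T - p + 1):
        inputs.append(series[start:start + T])        # history window
        targets.append(series[start + T:start + T + p])  # future window
    return np.array(inputs), np.array(targets)

flow = np.arange(10, dtype=float)     # toy flow counts per time slice
X_in, Y_out = make_windows(flow, T=4, p=2)
```

A trained model F_θ would then map each row of X_in to the corresponding row of Y_out.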
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211386613.4A CN115762147B (en) | 2022-11-07 | 2022-11-07 | Traffic flow prediction method based on adaptive graph attention neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115762147A true CN115762147A (en) | 2023-03-07 |
CN115762147B CN115762147B (en) | 2023-11-21 |
Family
ID=85357215
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211386613.4A Active CN115762147B (en) | 2022-11-07 | 2022-11-07 | Traffic flow prediction method based on adaptive graph attention neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115762147B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116153089A (en) * | 2023-04-24 | 2023-05-23 | 云南大学 | Traffic flow prediction system and method based on space-time convolution and dynamic diagram |
CN117275215A (en) * | 2023-05-04 | 2023-12-22 | Changjiang Spatial Information Technology Engineering Co., Ltd. (Wuhan) | Urban road congestion space-time prediction method based on graph process neural network |
CN118014135A (en) * | 2024-02-02 | 2024-05-10 | 哈尔滨工业大学 | Urban peak-to-time demand prediction method and system based on dynamic space-time hypergraph representation learning |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111223301A (en) * | 2020-03-11 | 2020-06-02 | 北京理工大学 | Traffic flow prediction method based on graph attention convolution network |
CN111540199A (en) * | 2020-04-21 | 2020-08-14 | 浙江省交通规划设计研究院有限公司 | High-speed traffic flow prediction method based on multi-mode fusion and graph attention machine mechanism |
CN112508173A (en) * | 2020-12-02 | 2021-03-16 | 中南大学 | Traffic space-time sequence multi-step prediction method, system and storage medium |
CN112785848A (en) * | 2021-01-04 | 2021-05-11 | 清华大学 | Traffic data prediction method and system |
US20210209938A1 (en) * | 2020-09-25 | 2021-07-08 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method, apparatus, system, and computer-readable medium for traffic pattern prediction |
CN114818515A (en) * | 2022-06-24 | 2022-07-29 | 中国海洋大学 | Multidimensional time sequence prediction method based on self-attention mechanism and graph convolution network |
Non-Patent Citations (1)
Title |
---|
JIANG Shan et al.: "A Graph Neural Network Model for Road-Network Traffic Flow Situation Prediction", Journal of Frontiers of Computer Science and Technology, vol. 15, no. 6, pages 1084-1091 *
Also Published As
Publication number | Publication date |
---|---|
CN115762147B (en) | 2023-11-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zhang et al. | Trafficgan: Network-scale deep traffic prediction with generative adversarial nets | |
Mallick et al. | Transfer learning with graph neural networks for short-term highway traffic forecasting | |
Chen et al. | A graph convolutional stacked bidirectional unidirectional-LSTM neural network for metro ridership prediction | |
CN115762147B (en) | Traffic flow prediction method based on adaptive graph attention neural network | |
Xiao et al. | Predicting urban region heat via learning arrive-stay-leave behaviors of private cars | |
CN114519932B (en) | Regional traffic condition integrated prediction method based on space-time relation extraction | |
CN110728317A (en) | Training method and system of decision tree model, storage medium and prediction method | |
Faro et al. | Evaluation of the traffic parameters in a metropolitan area by fusing visual perceptions and CNN processing of webcam images | |
CN111242395B (en) | Method and device for constructing prediction model for OD (origin-destination) data | |
CN111242292A (en) | OD data prediction method and system based on deep space-time network | |
Ren et al. | Spatio-temporal spectrum load prediction using convolutional neural network and ResNet | |
CN116011684A (en) | Traffic flow prediction method based on space-time diagram convolutional network | |
CN111047078A (en) | Traffic characteristic prediction method, system and storage medium | |
CN117116048A (en) | Knowledge-driven traffic prediction method based on knowledge representation model and graph neural network | |
Rahman et al. | A deep learning approach for network-wide dynamic traffic prediction during hurricane evacuation | |
CN116089875A (en) | Traffic flow prediction method, device and storage medium integrating multisource space-time data | |
Feng et al. | A hybrid model integrating local and global spatial correlation for traffic prediction | |
CN117636626A (en) | Heterogeneous map traffic prediction method and system for strengthening road peripheral space characteristics | |
CN112529294B (en) | Training method, medium and equipment for individual random trip destination prediction model | |
CN117593877A (en) | Short-time traffic flow prediction method based on integrated graph convolution neural network | |
CN117610734A (en) | Deep learning-based user behavior prediction method, system and electronic equipment | |
CN117392846A (en) | Traffic flow prediction method for space-time self-adaptive graph learning fusion dynamic graph convolution | |
Radi et al. | Enhanced Implementation of Intelligent Transportation Systems (ITS) based on Machine Learning Approaches | |
Xue et al. | Urban population density estimation based on spatio‐temporal trajectories | |
CN115565370A (en) | Local space-time graph convolution traffic flow prediction method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||