CN115762147A - Traffic flow prediction method based on adaptive graph attention neural network - Google Patents


Info

Publication number
CN115762147A
Authority
CN
China
Prior art date
Legal status
Granted
Application number
CN202211386613.4A
Other languages
Chinese (zh)
Other versions
CN115762147B (en)
Inventor
黄海辉
李坤鸿
常光辉
王玮晗
胡智鹏
胡诗洋
Current Assignee
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications
Priority to CN202211386613.4A
Publication of CN115762147A
Application granted
Publication of CN115762147B
Legal status: Active

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems


Abstract

The invention discloses a traffic flow prediction method based on an adaptive graph attention neural network, aims to predict medium- and long-term traffic flow, and belongs to the technical field of urban traffic planning and flow prediction. The method comprises the following steps. Step 1: extract road flow data and preprocess it with an attention mechanism and a spatio-temporal data embedding method to obtain a preprocessed data sequence. Step 2: extract spatio-temporal features from the obtained data sequence. Step 3: after extraction through multiple network layers, aggregate the results with an improved multi-head attention mechanism and obtain the prediction through a fully connected layer. The method adopts multi-module parallel processing and an improved convolution scheme, which reduces training time. The method can predict traffic flow more accurately and completes the prediction task better.

Description

Traffic flow prediction method based on adaptive graph attention neural network
Technical Field
The invention belongs to traffic flow prediction in the field of spatio-temporal sequence prediction, and relates to a traffic flow prediction method based on an adaptive graph attention neural network intended for medium- and long-term flow prediction tasks in a traffic system.
Background
With the continuous development of the national economy, living standards keep improving and the number of private cars keeps growing, so the pressure borne by roads is continuously increasing; intelligent transportation systems have been proposed to address this problem. Traffic flow prediction is a very important part of an intelligent transportation system: it greatly assists traffic scheduling, is indispensable for traffic management departments to reasonably allocate road resources and provide more effective travel strategies to the public, and is one of the effective ways to address current traffic-efficiency problems.
At present, with the wide application of intelligent transportation systems (ITS), massive traffic data can be obtained in time, which further promotes research on traffic speed prediction. Fixed-position sensors on the road record traffic data, including speed, flow, and position information. A close spatio-temporal relationship exists among these traffic characteristics; therefore, the key to traffic prediction is capturing the dynamic spatio-temporal correlations of the data. However, this task is challenging due to the complexity and nonlinearity of traffic data.
First, the spatial dependencies of the nodes are dynamic. Complex dependencies exist between nodes, and the spatial relationships between nodes are not independent but change dynamically over time; however, several existing methods fail to model traffic data dynamically in both space and time. Second, the nonlinearity of traffic speed changes and the propagation of errors during training make traditional deep learning methods insufficient for long-term prediction. Most importantly, these methods are based on a predefined graph structure matrix, which limits their ability to exploit the spatial dependencies in traffic data: they extract spatio-temporal features while ignoring the dynamic correlations of the traffic data.
Disclosure of Invention
The present invention is directed to solving the above problems of the prior art by providing a traffic flow prediction method based on an adaptive graph attention neural network. The technical scheme of the invention is as follows:
A traffic flow prediction method based on an adaptive graph attention neural network comprises the following steps:
step 1: setting time slices, collecting and counting the historical traffic flow information under each time slice through a detection device installed at a traffic intersection, and forming a two-dimensional traffic flow matrix; dividing the obtained historical traffic flow information into a training set and a test set;
step 2: regarding the traffic flow detection device installed at each traffic intersection as a node, where the connections among nodes under a single time slice, together with the traffic flow data, form a road network topology graph; connecting each node under adjacent time slices with its counterparts in the preceding and following time periods, and dynamically adjusting the corresponding node weights with an attention mechanism according to the road network flow information, so as to construct a spatio-temporal network sequence and obtain a local space-time connection graph;
step 3: expanding the convolution kernel by dilated convolution to increase the receptive field;
step 4: constructing a spatio-temporal graph convolution network model for predicting traffic flow based on an attention mechanism, wherein the model stacks multiple modules, processes and outputs data over a time period, and uses a multi-head attention mechanism with residual connections;
step 5: after the outputs are spliced, obtaining the output of the gating mechanism block through a fully connected layer, wherein a residual structure is added at the fully connected layer to prevent overfitting;
step 6: testing the trained adaptive graph attention neural network model with the test set and evaluating its error; if the error is larger than a set threshold, returning to step 2 and retraining the model;
step 7: inputting the traffic flow data of the preceding N set time slices of the road section to be predicted into the trained adaptive graph attention neural network model and predicting the traffic flow of the next N time slices of that road section.
Further, step 1 specifically comprises: setting 5 minutes as a time slice, collecting and counting the historical traffic flow information under each time slice through a detection device installed at a traffic intersection, and forming a two-dimensional traffic flow matrix.
The historical traffic flow data comprise the license plate numbers of motor vehicles passing the road section within a time slice, the passing time, the passing speed, the traffic flow information, and the weather conditions of the day; repeated and invalid data are cleaned, and the remaining data are z-score normalized:

x̂_i = (x_i - μ_i) / σ_i

where x_i is the original data, x̂_i is the normalized data, μ_i is the mean, σ_i is the standard deviation, and n is the number of stations in the road segment.
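The z-score step above can be sketched as follows; this is a minimal numpy illustration, where the toy flow matrix and the function name are assumptions rather than part of the patent.

```python
import numpy as np

def z_score_normalize(x):
    """Per-station z-score normalization of a (time, station) flow matrix.

    Each column (station i) is normalized with its own mean mu_i and
    standard deviation sigma_i, as described in step 1.
    """
    mu = x.mean(axis=0)
    sigma = x.std(axis=0)
    return (x - mu) / sigma

# Toy example: 4 time slices x 2 stations (synthetic values, not real data).
flow = np.array([[10.0, 100.0],
                 [20.0, 110.0],
                 [30.0, 120.0],
                 [40.0, 130.0]])
normalized = z_score_normalize(flow)
print(normalized.mean(axis=0))  # per-station means are ~0 after normalization
print(normalized.std(axis=0))   # per-station standard deviations are ~1
```

Normalizing per station keeps high-volume and low-volume intersections on the same scale before training.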
Further, the road network topology graph in step 2 comprises: the set V of all nodes, with |V| = N the number of nodes; the edge set E between nodes; and the weighted adjacency matrix A of the edges between nodes, giving the road network topology graph G = (V, E, A). According to the topological structure of the local space-time graph, the correlation between each node and its spatio-temporal neighbors can be captured directly. The construction of the spatio-temporal network sequence uses A ∈ R^{N×N} as the adjacency matrix of the spatial graph and A′ ∈ R^{3N×3N} as the adjacency matrix of the local space-time graph constructed over three consecutive spatial graphs. For a node i in the spatial graph, its new index in the local space-time graph is computed as (t-1)N + i, where t (0 < t ≤ 3) denotes the time step in the local space-time graph. If two nodes are connected to each other in this local space-time graph, the corresponding value in the adjacency matrix is set to 1; the adjacency matrix of the local space-time graph is therefore:

A′_{ij} = 1 if v_i and v_j are connected in the local space-time graph, and 0 otherwise,

where v_i denotes node i in the local space-time graph. The adjacency matrix A′ covers 3N nodes; its diagonal blocks are the adjacency matrices of the spatial network at three consecutive time steps, and the blocks on either side of the diagonal represent the connectivity of each node to itself at the adjacent time step.
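The indexing rule (t-1)N + i and the block structure just described can be sketched as follows; a minimal numpy construction, assuming a 2-node toy network and identity blocks for the temporal self-connections (the function name and example values are not from the patent).

```python
import numpy as np

def local_spatiotemporal_adjacency(a_spatial, steps=3):
    """Build the (steps*N x steps*N) local space-time adjacency A'.

    Diagonal blocks: the spatial adjacency A repeated for each of the
    `steps` consecutive time slices. Off-diagonal identity blocks connect
    each node to itself at the neighboring time step, as in step 2.
    Node i at time step t (1-based) gets index (t-1)*N + i.
    """
    n = a_spatial.shape[0]
    a_local = np.zeros((steps * n, steps * n))
    for t in range(steps):
        a_local[t*n:(t+1)*n, t*n:(t+1)*n] = a_spatial
        if t + 1 < steps:  # temporal self-connections between adjacent slices
            a_local[t*n:(t+1)*n, (t+1)*n:(t+2)*n] = np.eye(n)
            a_local[(t+1)*n:(t+2)*n, t*n:(t+1)*n] = np.eye(n)
    return a_local

# Toy 2-node road network: nodes 0 and 1 are connected.
a = np.array([[0.0, 1.0],
              [1.0, 0.0]])
a_st = local_spatiotemporal_adjacency(a)
print(a_st.shape)  # (6, 6)
```

Note that only adjacent time steps are linked; node 0 at t = 1 has no direct edge to its copy at t = 3.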
Further, in step 2 the attention mechanism dynamically adjusts the corresponding node weights as follows:
each node block represents the current flow state at time step t, and different colors represent different influence weights. The channels are divided along the time dimension, with one time step per channel; the aim is to dynamically adjust the spatio-temporal correlations by assigning dynamic weights to the features at different time steps, mining the dynamic spatio-temporal correlations between the data with a channel attention mechanism.
Feature compression of X is first performed by global average pooling, which converts each temporal channel into a single number so that each channel has a global receptive field in the spatial dimension:

X_p = f_pool(X) = (1/N) Σ_{i=1}^{N} X_{:,i}

where T denotes the historical time step and N denotes the total number of sensors; since only the speed feature is considered, C = 1, and X_p ∈ R^T. To learn the nonlinear correlations between the data, X_p is passed through two fully connected layers:

x_att = W_2 δ(W_1 X_p)

where x_att denotes the attention coefficients, W_1 ∈ R^{(T/r)×T} and W_2 ∈ R^{T×(T/r)} are trainable parameters, r represents the scaling ratio of the channels, and δ represents the ReLU activation function. Furthermore, to obtain weight values between 0 and 1, x_att is recalibrated with the sigmoid activation function:

x′_att = σ(x_att)

Then, the Hadamard product of x′_att and X gives the dynamically adjusted spatio-temporal feature data:

X_att = X ⊙ x′_att

Subsequently, X_att is sent to the gated dilated convolution module and the spatial convolution module to further capture spatio-temporal features.
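The pooling, two-layer excitation, sigmoid recalibration, and Hadamard product above can be sketched as follows; a minimal numpy forward pass with random toy weights, assuming C = 1 so the input is a (T, N) matrix (names and shapes are illustrative, not the patent's implementation).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """SE-style temporal channel attention over x of shape (T, N).

    Each time step is one channel: global average pooling over the N
    stations gives X_p in R^T, two dense layers (ReLU then sigmoid) give
    per-step weights x'_att in (0, 1), and the Hadamard product rescales
    the input. w1 (T/r x T) and w2 (T x T/r) play the roles of W_1, W_2.
    """
    x_p = x.mean(axis=1)                    # global average pool: (T,)
    x_att = w2 @ np.maximum(w1 @ x_p, 0.0)  # W_2 * ReLU(W_1 * X_p)
    x_att = sigmoid(x_att)                  # recalibrate weights into (0, 1)
    return x * x_att[:, None]               # X_att = X (Hadamard) x'_att

# Toy shapes: T=4 time steps, N=3 stations, reduction ratio r=2.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
w1 = rng.normal(size=(2, 4))
w2 = rng.normal(size=(4, 2))
x_out = channel_attention(x, w1, w2)
print(x_out.shape)  # (4, 3)
```

Because the sigmoid weights lie in (0, 1), the mechanism can only attenuate time steps, never amplify them.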
Further, step 3 expands the convolution kernel with dilated convolution to increase the receptive field, specifically as follows:
in this module, convolutions with different dilation rates are applied to the input to achieve short-, medium-, and long-term prediction goals. X_att ∈ R^{T×N×C} is the input of the module, where T denotes the time step of the input sequence. To mine short-, medium-, and long-term features, dilated convolution is applied to X_att with dilation rates D = 1, 2, 5, and 11; after convolution along the time dimension, the results are concatenated:

X_cat = concat(X_att * f_{D=1}, X_att * f_{D=2}, X_att * f_{D=5}, X_att * f_{D=11})

where X_att * f_{D=i} (i = 1, 2, 5, 11) denotes the dilated convolution and f ∈ R^{1×2} is the convolution kernel. To stay consistent with the dimensions of the spatial features, the concatenated vector is dimension-converted through fully connected layers:

X_D = FC(X_cat)

where FC denotes the fully connected layer and X_D is the resulting vector. Finally, a gating mechanism, consisting of two parallel activation functions, controls the transmission of the temporal information: the tanh activation function is used to overcome the vanishing-gradient problem, and the sigmoid activation function maps the data between 0 and 1 to control the message passing. X_D passes through the gating mechanism as:

H_T = g(X_D1) ⊙ σ(X_D2)

where g(X_D1) and σ(X_D2) denote the two activation functions and H_T is the output.
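The four-branch dilated convolution and the tanh/sigmoid gate can be sketched as follows; a minimal numpy version assuming a causal length-2 kernel and, for brevity, feeding the same concatenated tensor to both gate branches (the helper names and toy shapes are assumptions, not the patent's code).

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """Causal 1-D dilated convolution along the time axis of x (T, N).

    kernel has length 2 (matching f in R^{1x2}); with dilation D the
    output at step t combines x[t - D] and x[t].
    """
    t_len = x.shape[0]
    out = np.zeros_like(x)
    for t in range(dilation, t_len):
        out[t] = kernel[0] * x[t - dilation] + kernel[1] * x[t]
    return out

def gated_output(x_d1, x_d2):
    """Gating mechanism H_T = tanh(X_D1) (Hadamard) sigmoid(X_D2)."""
    return np.tanh(x_d1) * (1.0 / (1.0 + np.exp(-x_d2)))

rng = np.random.default_rng(1)
x = rng.normal(size=(16, 3))      # T=16 steps, N=3 stations
kernel = np.array([0.5, 0.5])
# Dilation rates from step 3; branch outputs are concatenated featurewise.
branches = [dilated_conv1d(x, kernel, d) for d in (1, 2, 5, 11)]
x_cat = np.concatenate(branches, axis=1)
print(x_cat.shape)  # (16, 12)
h_t = gated_output(x_cat, x_cat)
print(h_t.shape)    # (16, 12)
```

Larger dilation rates widen the receptive field without adding parameters, which is why D = 11 can capture long-term patterns with the same length-2 kernel.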
Further, step 4 specifically comprises:
a spatial convolution module is adopted, with the convolution operator computed as:

Θ ⋆_G X = θ(I_N + D^{-1/2} A D^{-1/2})X = θ(D̃^{-1/2} Ã D̃^{-1/2})X

where θ is a learnable parameter and I_N is the identity matrix; A and Ã = A + I_N denote the adjacency matrix and its augmented matrix, respectively, and D̃^{-1/2} Ã D̃^{-1/2} is its normalized form.
An attention-based method generates a self-learning graph structure matrix, which can learn hidden spatial correlations between nodes from the input data:

S = V_s σ((X U_1) U_2 (U_3 X)^T)

where X ∈ R^{T×N×C} is the input of the model and V_s ∈ R^{N×N}, U_1 ∈ R^C, U_2 ∈ R^{T×C}, and U_3 ∈ R^T are learnable parameters. Subsequently, all entries are normalized with a Softmax function to obtain the adaptive graph structure matrix:

Ã_adp = Softmax(S)

The graph convolution layer is then computed with Ã_adp in place of the normalized adjacency matrix:

H_S = θ Ã_adp X_att

Finally, the temporal and spatial units are fused by a fusion mechanism, where H_T and H_S denote the outputs of the gating mechanism and the graph convolution layer, respectively, to obtain the output of each dynamic spatio-temporal block:

Y = Z_1 H_T + Z_2 H_S

where Z_1 and Z_2 are learnable parameter matrices.
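The attention-based adaptive matrix can be sketched as follows; a minimal numpy version matching the parameter shapes listed above, where the contraction order and helper names are assumptions, since the patent's equation is reproduced only as an image.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax_rows(s):
    e = np.exp(s - s.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def adaptive_adjacency(x, v_s, u1, u2, u3):
    """Attention-based self-learning graph structure matrix.

    x has shape (T, N, C); u1 in R^C, u2 in R^{T x C}, u3 in R^T and
    v_s in R^{N x N} play the roles of the learnable parameters in
    step 4. Rows are Softmax-normalized to give the adaptive matrix.
    """
    left = np.einsum('tnc,c->nt', x, u1) @ u2   # (N, C)
    right = np.einsum('t,tnc->nc', u3, x)       # (N, C)
    s = v_s @ sigmoid(left @ right.T)           # raw scores, (N, N)
    return softmax_rows(s)

t_len, n, c = 4, 3, 2
rng = np.random.default_rng(2)
x = rng.normal(size=(t_len, n, c))
a_adp = adaptive_adjacency(x,
                           v_s=rng.normal(size=(n, n)),
                           u1=rng.normal(size=c),
                           u2=rng.normal(size=(t_len, c)),
                           u3=rng.normal(size=t_len))
print(a_adp.shape)        # (3, 3)
print(a_adp.sum(axis=1))  # each row sums to 1 after Softmax
```

Because the scores depend on the input x, the learned graph can differ between batches, which is what lets the model track correlations that a predefined adjacency matrix would miss.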
Further, step 5 specifically comprises:
the output layer converts the output of the last graph convolution layer into a traffic information sequence for T′ future time steps; the input of the output layer is obtained by transposing the input and reshaping it into X^T ∈ R^{T×N×C}, and fully connected layers are used to generate the prediction:

ŷ_i = F(x_1, x_2, …, x_t; W)

where F(x_1, x_2, …, x_t) represents the prediction result at the i-th time step and W is a learnable parameter.
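The output layer can be sketched as a fully connected head; since the patent's equation is reproduced only as an image, this is a generic sketch under the assumption that the transposed, reshaped per-node features are mapped linearly to T′ future steps (all names and shapes are illustrative).

```python
import numpy as np

def output_layer(h, w, b):
    """Fully connected prediction head (a sketch of step 5's output layer).

    h: per-node features of shape (N, T*C) after transpose/reshape;
    w: (T*C, T_out) and b: (T_out,) are learnable parameters; returns
    (N, T_out) predictions for T_out future time steps.
    """
    return h @ w + b

n, t, c, t_out = 3, 4, 2, 6
rng = np.random.default_rng(3)
# Transpose (T, N, C) -> (N, T, C), then flatten the time/feature axes.
h = rng.normal(size=(t, n, c)).transpose(1, 0, 2).reshape(n, t * c)
y_hat = output_layer(h, rng.normal(size=(t * c, t_out)), np.zeros(t_out))
print(y_hat.shape)  # (3, 6)
```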
Further, step 6 selects the Huber loss as the loss function:

L_δ(Y, Ŷ) = ½ (Y - Ŷ)²,  if |Y - Ŷ| ≤ δ
L_δ(Y, Ŷ) = δ |Y - Ŷ| - ½ δ²,  otherwise

where Y denotes the ground truth, Ŷ denotes the prediction of the model, and δ is a threshold parameter that controls the range of the squared-error loss.
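The piecewise definition above can be sketched as follows; a minimal numpy implementation of the standard Huber loss (the function name and toy values are illustrative, not from the patent).

```python
import numpy as np

def huber_loss(y_true, y_pred, delta=1.0):
    """Huber loss: squared error within |e| <= delta, linear beyond it.

    delta controls the range of the squared-error branch, making the
    loss less sensitive to outliers than pure squared error.
    """
    err = np.abs(y_true - y_pred)
    quadratic = 0.5 * err ** 2
    linear = delta * err - 0.5 * delta ** 2
    return np.where(err <= delta, quadratic, linear).mean()

y = np.array([1.0, 2.0, 3.0])
print(huber_loss(y, np.array([1.5, 2.0, 3.0])))   # small error: quadratic branch
print(huber_loss(y, np.array([10.0, 2.0, 3.0])))  # outlier: linear branch
```

The linear branch keeps a single badly mispredicted flow value from dominating the gradient during training.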
Further, step 7 inputs the traffic flow data of the preceding N set time slices of the road section to be predicted into the trained adaptive graph attention neural network model and predicts the traffic flow of the next N time slices of that road section, specifically as follows:
in this step, the historical data are denoted x = (x_t, x_{t-1}, …, x_{t-T+1}) as an input traffic sequence of length T, and x′ = (x_{t+1}, x_{t+2}, …, x_{t+p}) is the predicted flow data for the following time steps, defined as:

x′ = F(x_t, x_{t-1}, …, x_{t-T+1}; W)

where W denotes the learnable parameters.
the invention has the following advantages and beneficial effects:
first, spatio-temporal heterogeneity is fully taken into account and the spatial dependencies of the nodes are dynamic when trying to extract the spatial features of the graph. Complex dependencies exist between nodes. And the spatial relationship between nodes is not independent but dynamically changes over time. Most importantly, these methods are based on a predefined graph structure matrix, which limits the dependence in the spatial utilization traffic data. In the experiment, a time chart with more characteristics is obtained by using a space-time embedding method and combining an attention mechanism. And in the extraction process, the dynamic adjustment module is used for adjusting the structure of the relevant graph in time to extract more sufficient space-time characteristics, so that a better prediction effect can be achieved.
Drawings
FIG. 1 is a flow diagram of an adaptive neural network prediction module provided by the present invention;
FIG. 2 is a schematic diagram of a dynamic adjustment module;
FIG. 3 is a schematic diagram of a feature extraction module.
Detailed Description
The technical solutions in the embodiments of the present invention will be described in detail and clearly with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention.
The technical scheme for solving the above technical problems is as follows:
As shown in fig. 1, a traffic flow prediction method based on an adaptive graph attention neural network includes the following steps.
Step 1: set 5 minutes as a time slice, collect and count the historical traffic flow information under each time slice through a detection device installed at a traffic intersection, and form a two-dimensional traffic flow matrix.
The historical traffic flow data comprise the license plate numbers of motor vehicles passing the road section within a time slice, the passing time, the passing speed, the traffic flow information, and the weather conditions of the day; repeated and invalid data are cleaned, and the remaining data are z-score normalized:

x̂_i = (x_i - μ_i) / σ_i

where x_i is the original data, x̂_i is the normalized data, μ_i is the mean, and σ_i is the standard deviation.
Step 2: regard the traffic flow detection device installed at each traffic intersection as a node; the connections among nodes under a single time slice, together with the traffic flow data, form a road network topology graph. Connect each node under adjacent time slices with its counterparts in the preceding and following time periods and, considering that the weight of each node differs across time periods in the road network, adjust the corresponding node weights with an attention mechanism according to the road network traffic information, so as to construct a spatio-temporal network sequence and obtain a local space-time connection graph.
The road network topology graph comprises: the set V of all nodes, with |V| = N the number of nodes; the edge set E between nodes; and the weighted adjacency matrix A of the edges between nodes, giving the road network topology graph G = (V, E, A). According to the topological structure of the local space-time graph, the correlation between each node and its spatio-temporal neighbors can be captured directly. The spatio-temporal network sequence uses A ∈ R^{N×N} as the adjacency matrix of the spatial graph and A′ ∈ R^{3N×3N} as the adjacency matrix of the local space-time graph constructed over three consecutive spatial graphs. For a node i in the spatial graph, its new index in the local space-time graph is computed as (t-1)N + i, where t (0 < t ≤ 3) denotes the time step in the local space-time graph. If two nodes are connected to each other in this local space-time graph, the corresponding value in the adjacency matrix is set to 1; the adjacency matrix of the local space-time graph is therefore:

A′_{ij} = 1 if v_i and v_j are connected in the local space-time graph, and 0 otherwise,

where v_i denotes node i in the local space-time graph. The adjacency matrix A′ covers 3N nodes; its diagonal blocks are the adjacency matrices of the spatial network at three consecutive time steps, and the blocks on either side of the diagonal represent the connectivity of each node to itself at the adjacent time step.
The node weight dynamic adjustment module in this step works as follows:
referring to fig. 2, each node block represents the current traffic state at time step t, and different colors represent different influence weights. The channels are divided along the time dimension, with one time step per channel; the aim is to dynamically adjust the spatio-temporal correlations by assigning dynamic weights to the features at different time steps, using a channel attention mechanism to mine the dynamic spatio-temporal correlations between the data.
Feature compression of X is first performed by global average pooling, which converts each temporal channel into a single number so that each channel has a global receptive field in the spatial dimension:

X_p = f_pool(X) = (1/N) Σ_{i=1}^{N} X_{:,i}

where X_p ∈ R^T. To learn the nonlinear correlations between the data, equation (1) is passed through two fully connected layers:

x_att = W_2 δ(W_1 X_p)

where W_1 ∈ R^{(T/r)×T} and W_2 ∈ R^{T×(T/r)}, r represents the scaling ratio of the channels, and δ represents the ReLU activation function. Furthermore, to obtain weight values between 0 and 1, x_att is recalibrated with the sigmoid activation function:

x′_att = σ(x_att)

Then, the Hadamard product of x′_att and X gives the dynamically adjusted spatio-temporal feature data:

X_att = X ⊙ x′_att

Then, X_att is sent to the gated dilated convolution module and the spatial convolution module to further capture spatio-temporal features.
Step 3: an improved gated dilated convolution is used to aggregate long-term features. For convolutional networks it is difficult to obtain a large field of view, because it is limited by the size of the convolution kernel. To increase the receptive field, convolution operations generally employ one of three methods: larger convolution kernels, deeper networks, or aggregation operations before convolution. Here, dilated convolution is used to expand the convolution kernel by adding "dilation" to obtain a larger receptive field.
In this block, convolutions with different dilation rates are applied to the input to achieve short-, medium-, and long-term prediction goals. X_att ∈ R^{T×N×C} is the input of the module, where T represents the time step of the input sequence. To mine short-, medium-, and long-term features, dilated convolution is applied to X_att with dilation rates D = 1, 2, 5, and 11; after convolution along the time dimension, the results are concatenated:

X_cat = concat(X_att * f_{D=1}, X_att * f_{D=2}, X_att * f_{D=5}, X_att * f_{D=11})

where X_att * f_{D=i} (i = 1, 2, 5, 11) denotes the dilated convolution and f ∈ R^{1×2} is the convolution kernel. To stay consistent with the dimensions of the spatial features, the concatenated vector is dimension-converted through fully connected layers:

X_D = FC(X_cat)

Finally, a gating mechanism, consisting of two parallel activation functions, controls the transmission of the temporal information: the tanh activation function overcomes the vanishing-gradient problem, and the sigmoid activation function maps the data between 0 and 1 to control the message passing. X_D passes through the gating mechanism as:

H_T = g(X_D1) ⊙ σ(X_D2)
and 4, step 4: and (3) extracting the space-time characteristics of the data sequence based on the preprocessed data sequence output in the step (2). And constructing a graph convolution network model for predicting traffic flow based on an attention mechanism, wherein the network processes and outputs data in a time period in a mode of overlapping a plurality of modules, and the attention mechanism is adopted to reduce the loss of characteristics as much as possible.
In this module, in order to make the model lightweight and reduce excessive overhead, we use a spatial convolution module, and the convolution operator uses the following calculation formula:
Figure BDA0003930137210000101
wherein, the first and the second end of the pipe are connected with each other,
Figure BDA0003930137210000102
θ is a learnable parameter.
As can be seen from the evolution of GCN, the process of graph convolution is essentially determined by the adjacency matrix. A self-learning graph structure matrix is generated by adopting a method based on attention. The matrix may learn hidden spatial correlations between nodes from the input data.
Figure BDA0003930137210000103
Wherein X ∈ R T*N*C Is the input of the model, V ∈ R N*N ,U 1 ∈R C ,U 2 ∈R T*C And U 3 ∈R T Are learnable parameters. Subsequently, all data are normalized by using a Softmax function to obtain an adaptive graph structure matrix
Figure BDA0003930137210000104
The following graph is presented to curl the layers:
Figure BDA0003930137210000105
and finally, fusing the time-space units through a fusion mechanism to obtain the output of each dynamic time-space block.
Y=Z 1 H T +Z 2 H S
Wherein Z 1 And Z 2 Is a learnable parameter matrix.
Step 5: the obtained outputs are spliced and then passed through a fully connected layer to obtain the output of the gating mechanism block; a residual structure is added at the fully connected layer to prevent overfitting.
In this step, the output layer converts the output of the last graph convolution layer into a traffic information sequence for T′ future time steps; the input of the output layer is obtained by transposing the input and reshaping it into X^T ∈ R^{T×N×C}, and fully connected layers are used to generate the prediction:

ŷ_i = F(x_1, x_2, …, x_t; W)

where F(x_1, x_2, …, x_t) represents the prediction result at the i-th time step and W is a learnable parameter.
Step 6: test the trained adaptive graph attention neural network model with the test set and evaluate its error; if the error is greater than a set threshold, return to step 2 and retrain the model.
In this step, the Huber loss (Huber, 1992) is selected as the loss function; it is less sensitive to outliers than the squared-error loss:

L_δ(Y, Ŷ) = ½ (Y - Ŷ)²,  if |Y - Ŷ| ≤ δ
L_δ(Y, Ŷ) = δ |Y - Ŷ| - ½ δ²,  otherwise

where Y denotes the ground truth, Ŷ denotes the prediction of the model, and δ is a threshold parameter that controls the range of the squared-error loss.
Step 7: input the traffic flow data of the preceding N set time slices of the road section to be predicted into the trained adaptive graph attention neural network model and predict the traffic flow of the next N time slices of that road section.
In this step, the historical data are denoted x = (x_t, x_{t-1}, …, x_{t-T+1}) as an input traffic sequence of length T, and x′ = (x_{t+1}, x_{t+2}, …, x_{t+p}) is the predicted flow data for the following time steps, defined as:

x′ = F(x_t, x_{t-1}, …, x_{t-T+1}; W)

where W denotes the learnable parameters.
the systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in the process, method, article, or apparatus that comprises the element.
The above examples are to be construed as merely illustrative and not limitative of the remainder of the disclosure. After reading the description of the invention, the skilled person can make various changes or modifications to the invention, and these equivalent changes and modifications also fall into the scope of the invention defined by the claims.

Claims (9)

1. A traffic flow prediction method based on a self-adaptive graph attention neural network is characterized by comprising the following steps:
step 1: setting time slices, collecting and counting historical traffic flow information under each time slice through a detection device installed at a traffic intersection, and forming a two-dimensional traffic flow matrix; dividing the obtained historical traffic flow information into a training set and a testing set;
step 2: the traffic flow detection device installed at each traffic intersection is regarded as a node, and the connection among the nodes under a single time slice and traffic flow data form a road network topological graph; connecting each node under the adjacent time slices with nodes of an upper time period and a lower time period, and dynamically adjusting the corresponding weight of the nodes by adopting an attention mechanism according to the road network flow information so as to construct a space-time network sequence and obtain a local space-time connection graph;
step 3: enlarging the receptive field by expanding the convolution kernel through dilated convolution;
step 4: constructing a space-time graph convolution network model for predicting traffic flow based on an attention mechanism, wherein the space-time graph convolution network model stacks a plurality of modules, processes and outputs data over a time period, and combines a multi-head attention mechanism with residual connections;
step 5: after the outputs are spliced, obtaining the output of the gating mechanism block through a fully connected layer, wherein a residual structure is added when passing through the fully connected layer to prevent overfitting;
step 6: testing the self-adaptive graph attention neural network model after training by using a test set, evaluating the error of the model, returning to the step 2 if the error is larger than a set threshold value, and re-training the model;
step 7: inputting the traffic flow data of the preceding N set time slices of the road section to be predicted into the trained adaptive graph attention neural network model, and predicting the traffic flow of the road section for the next N time slices.
2. The traffic flow prediction method based on the adaptive graph attention neural network according to claim 1, wherein the step 1 specifically comprises: setting 5 minutes as a time slice, collecting and counting historical traffic flow information under each time slice through a detection device arranged at a traffic intersection, and forming a two-dimensional traffic flow matrix;
the historical traffic flow data comprise the license plate numbers of motor vehicles passing the road section within a time slice, the passing time, the passing speed, the traffic flow information and the weather conditions of the day; repeated data and invalid data are cleaned, and the remaining data are subjected to z-score standardization:

x′_i = (x_i − μ_i) / σ_i

wherein x_i is the original datum, x′_i is the new datum, μ_i is the mean value, σ_i is the standard deviation, and n is the number of stations in the road section.
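As a minimal sketch (outside the claim language), the z-score standardization of step 1 can be written as follows; the (T, n) layout of the traffic matrix is an assumption, with one column per station:

```python
import numpy as np

def z_score(x):
    """Per-station z-score standardization: x'_i = (x_i - mu_i) / sigma_i.

    x: (T, n) array - T time slices by n stations in the road section.
    Returns the standardized array together with mu_i and sigma_i so a
    prediction can later be mapped back to real flow values.
    """
    mu = x.mean(axis=0)      # mu_i: mean flow of each station
    sigma = x.std(axis=0)    # sigma_i: standard deviation of each station
    return (x - mu) / sigma, mu, sigma
```

Keeping μ and σ alongside the normalized data is what allows the model output to be de-standardized at prediction time.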
3. The traffic flow prediction method based on the adaptive graph attention neural network according to claim 1, wherein the road network topology map in the step 2 comprises: the set V of all nodes, with |V| = N denoting the number of nodes; the edge set E between nodes; and the weighted adjacency matrix A of the edges between nodes, giving a road network topological graph G = (V, E, A); the correlation between each node and its spatio-temporal neighbors is captured directly according to the topological structure of the local space-time graph; for the construction of the space-time network sequence, A ∈ R^{N×N} denotes the adjacency matrix of the spatial graph, and A′ ∈ R^{3N×3N} denotes the adjacency matrix of a local space-time graph constructed over three consecutive spatial graphs; for a node i in the spatial graph, its new index in the local space-time graph is computed as (t−1)N + i, where t (0 < t ≤ 3) denotes the time step within the local space-time graph; if two nodes are connected to each other in this local space-time graph, the corresponding entry of the adjacency matrix is set to 1; the adjacency matrix of the local space-time graph is represented as:

A′ =
| A_1  I    0   |
| I    A_2  I   |
| 0    I    A_3 |

wherein v_i represents node i in the local space-time graph and the adjacency matrix A′ contains 3N nodes; the diagonal blocks A_1, A_2, A_3 are the adjacency matrices of the spatial network at three consecutive time steps, and the identity blocks beside the diagonal represent the connectivity of each node to itself at the adjacent time step.
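The block structure described in claim 3 can be sketched as follows (a static spatial adjacency A is assumed for all three time steps, so A_1 = A_2 = A_3 = A):

```python
import numpy as np

def local_st_adjacency(A):
    """Build the 3N x 3N adjacency of a local space-time graph.

    The three diagonal blocks are the spatial adjacency A at three
    consecutive time steps; the identity blocks beside the diagonal link
    node i at time step t to itself at the adjacent time steps, matching
    the index rule (t-1)*N + i for node i at step t.
    """
    N = A.shape[0]
    I = np.eye(N)
    Z = np.zeros((N, N))
    return np.block([[A, I, Z],
                     [I, A, I],
                     [Z, I, A]])
```

Note there is no block linking time step 1 directly to time step 3: only adjacent time steps are connected, as the claim states.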
4. The traffic flow prediction method based on the adaptive graph attention neural network according to claim 3, wherein the step 2 dynamically adjusts the corresponding weights of the nodes by adopting an attention mechanism, and comprises the following specific steps:
each node block represents the current flow state at time step t, and different colors represent different influence weights; the channels are divided along the time dimension, one time step per channel, the aim being to dynamically adjust the spatio-temporal correlation by assigning dynamic weights to the features at different time steps; a channel attention mechanism is used to mine the dynamic spatio-temporal correlations between the data;

the features of X are first compressed by a global average pool f_pool, which converts each temporal channel into a single number so that each channel has a global receptive field in the spatial dimension:

X_p = f_pool(X) = (1/N) Σ_{i=1}^{N} X_{:,i}

wherein T represents the historical time step and N represents the total number of sensors; since only the speed feature is considered, C = 1, and X_p ∈ R^T; to learn the non-linear correlation between the data, X_p is passed through two fully connected layers:

x_att = W_2 δ(W_1 X_p)

wherein x_att denotes the attention coefficient, W_1 and W_2 are trainable parameters, r represents the scaling ratio of the channel, and δ represents the ReLU activation function; furthermore, to obtain a weight value between 0 and 1, x_att is recalibrated using the sigmoid activation function as follows:

x′_att = σ(x_att)

the Hadamard product of x′_att and X is then used to obtain the dynamically adjusted spatio-temporal feature data:

X_att = X ⊙ x′_att

X_att is then sent to the gated dilated convolution module and the spatial convolution module to further capture spatio-temporal features.
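A compact sketch of this channel-attention step follows; the shapes of W1 and W2 are an assumption based on the stated channel scaling ratio r (the claim does not give them), in the style of squeeze-and-excitation attention:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(X, W1, W2):
    """Temporal channel attention: pool, two FC layers, sigmoid, rescale.

    X:  (T, N) speed matrix - one channel per time step, C = 1.
    W1: (T // r, T) and W2: (T, T // r) stand in for the two fully
        connected layers (shapes assumed, not stated in the claim).
    """
    X_p = X.mean(axis=1)                    # f_pool: global average over the N sensors
    x_att = W2 @ np.maximum(W1 @ X_p, 0.0)  # two FC layers with ReLU (the delta above)
    x_att = sigmoid(x_att)                  # recalibrate the weights into (0, 1)
    return X * x_att[:, None]               # X_att = X (Hadamard) x'_att, per time step
```

Each time step is scaled by a single learned weight in (0, 1), which is what "dynamically adjusting the spatio-temporal correlation" amounts to here.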
5. The traffic flow prediction method based on the adaptive graph attention neural network according to claim 4, wherein the step 3 of enlarging the receptive field by expanding the convolution kernel through dilated convolution specifically comprises:
in this module, convolutions with different dilation rates are applied to the input to achieve short-, medium- and long-term prediction goals; X_att ∈ R^{T×N×C} is the input of the module, where T represents the time step of the input sequence; to mine short-, medium- and long-term features, dilated convolutions with dilation rates D = 1, 2, 5, 11 are applied to X_att; after convolution in the time dimension, the results are concatenated as follows:

X_cat = concat(X_att * f_{D=1}, X_att * f_{D=2}, X_att * f_{D=5}, X_att * f_{D=11})

wherein X_att * f_{D=i} (i = 1, 2, 5, 11) denotes the dilated convolution and f ∈ R^{1×2} is the convolution kernel; to keep the dimensions consistent with the spatial features, the concatenated vector is dimension-converted through a fully connected layer:

X_D = FC(X_cat)

wherein FC denotes the fully connected layer and X_D the resulting vector; finally, a gating mechanism is used to control the transmission of the temporal information; it consists of two parallel activation functions: the Tanh activation function is used to overcome the vanishing-gradient problem, and the sigmoid activation function maps the data between 0 and 1 as message-transfer control; X_D passes through the gating mechanism in the following form:

H_T = g(X_D1) ⊙ σ(X_D2)

wherein g(X_D1) and σ(X_D2) respectively denote the two activation functions and H_T is the output.
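The two core operations of this module — a kernel-size-2 dilated convolution along time and the Tanh/sigmoid gate — can be sketched as below; the concatenation and fully connected layer are omitted, and operating on a single 1-D series is a simplification:

```python
import numpy as np

def dilated_conv1d(x, f, d):
    """Kernel-size-2 dilated convolution along time with dilation d:
    (x * f_D)[t] = f[0]*x[t] + f[1]*x[t + d]; output is d steps shorter."""
    return f[0] * x[:-d] + f[1] * x[d:]

def gate(xd1, xd2):
    """Gating mechanism H_T = g(X_D1) (Hadamard) sigma(X_D2): the tanh
    branch carries the signal, the sigmoid branch decides how much passes."""
    return np.tanh(xd1) * (1.0 / (1.0 + np.exp(-xd2)))
```

Larger dilation rates (D = 5, 11) let the same 2-tap kernel see further back in time, which is how the module covers long-term features without a larger kernel.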
6. The traffic flow prediction method based on the adaptive graph attention neural network according to claim 5, wherein the step 4 specifically comprises:
the signal x on the graph G is filtered with a kernel g_θ using a spatial convolution module as follows:

g_θ *_G x = θ(D̃^{−1/2} Ã D̃^{−1/2})x

wherein θ is a learnable parameter, I_N is the identity matrix, A and Ã respectively denote the adjacency matrix and its augmented matrix, and D̃^{−1/2} Ã D̃^{−1/2} is its normalized form;

a self-learning graph structure matrix is generated by an attention-based method; this matrix can learn hidden spatial correlations between nodes from the input data; an attention score matrix S is computed from the input X ∈ R^{T×N×C} with the learnable parameters V_s ∈ R^{N×N}, U_1 ∈ R^C, U_2 ∈ R^{T×C} and U_3 ∈ R^T; subsequently, S is normalized with a Softmax function to obtain the adaptive graph structure matrix S̃, which is substituted into the graph convolution layer above;

finally, the space-time units are fused by a fusion mechanism; with H_T and H_S respectively denoting the outputs of the gating mechanism and of the graph convolution layer, the output of each dynamic space-time block is obtained as:

Y = Z_1 H_T + Z_2 H_S

wherein Z_1 and Z_2 are learnable parameter matrices.
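Two pieces of claim 6 lend themselves to a short sketch: the Softmax normalization that turns the learned score matrix into the adaptive graph structure matrix, and the fusion Y = Z_1 H_T + Z_2 H_S (interpreting Z_1, Z_2 as left-multiplied matrices is an assumption; the claim only calls them learnable parameter matrices):

```python
import numpy as np

def row_softmax(S):
    """Normalize a learned score matrix row-wise with Softmax, yielding an
    adaptive graph structure matrix whose rows sum to 1."""
    e = np.exp(S - S.max(axis=1, keepdims=True))  # shift rows for numerical stability
    return e / e.sum(axis=1, keepdims=True)

def fuse(H_T, H_S, Z1, Z2):
    """Fusion mechanism Y = Z1 @ H_T + Z2 @ H_S combining the gated
    temporal output and the graph convolution output of a block."""
    return Z1 @ H_T + Z2 @ H_S
```

Because every row of the normalized matrix sums to 1, it can stand in for the normalized adjacency matrix inside the graph convolution.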
7. The traffic flow prediction method based on the adaptive graph attention neural network according to claim 6, wherein the step 5 specifically comprises:
the output layer converts the output of the last graph convolution layer into a traffic information sequence of T′ future time steps; the input of the output layer is obtained by transposing and reshaping that output into X_T ∈ R^{T×N×C}, and T′ fully connected heads are used to generate the prediction: the prediction F(x_1, x_2, …, x_t) at the i-th future time step is produced by two fully connected layers with the learnable parameters W_1^(i) and W_2^(i).
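A sketch of such an output layer follows; the ReLU between the two layers and the flattened input shape are assumptions, since the claim only names the parameters W_1^(i) and W_2^(i):

```python
import numpy as np

def output_layer(X_T, heads):
    """T' independent two-layer fully connected heads, one per future step.

    X_T:   flattened feature vector from the last graph convolution layer.
    heads: list of (W1_i, W2_i) pairs, one pair per predicted time step.
    """
    return np.stack([W2 @ np.maximum(W1 @ X_T, 0.0)  # ReLU is assumed
                     for W1, W2 in heads])
```

Giving each future time step its own head lets the model specialize per horizon instead of sharing one decoder across all T′ steps.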
8. The traffic flow prediction method based on the adaptive graph attention neural network according to claim 7, wherein said step 6 selects the Huber loss as the loss function:

L_δ(Y, Ŷ) = (1/2)(Y − Ŷ)²       if |Y − Ŷ| ≤ δ
L_δ(Y, Ŷ) = δ|Y − Ŷ| − (1/2)δ²   otherwise

wherein Y denotes the ground truth, Ŷ denotes the prediction of the model, and δ is a threshold parameter that controls the range of the squared error loss.
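The Huber loss of claim 8 can be sketched directly:

```python
import numpy as np

def huber_loss(y, y_hat, delta=1.0):
    """Huber loss: squared error inside |e| <= delta, linear outside,
    so outlier flow values do not dominate training."""
    e = np.abs(y - y_hat)
    return np.where(e <= delta,
                    0.5 * e ** 2,                  # quadratic region
                    delta * e - 0.5 * delta ** 2   # linear region
                    ).mean()
```

The two branches meet with equal value and slope at |e| = δ, which is why Huber keeps the smooth gradients of squared error for small residuals while staying robust like absolute error for large ones.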
9. The traffic flow prediction method based on the adaptive graph attention neural network according to claim 8, wherein the step 7 of inputting the traffic flow data of the preceding N set time slices of the road section to be predicted into the trained adaptive graph attention neural network model and predicting the traffic flow of the road section for the next N time slices specifically comprises:
in this step, the historical data x = (x_t, x_{t−1}, …, x_{t−T+1}) serves as an input traffic sequence of length T, and x′ = (x_{t+1}, x_{t+2}, …, x_{t+p}) is the predicted flow data for the following time steps; the defining formula, with θ a learnable parameter, is: (x_{t+1}, x_{t+2}, …, x_{t+p}) = F_θ(x_t, x_{t−1}, …, x_{t−T+1}).
CN202211386613.4A 2022-11-07 2022-11-07 Traffic flow prediction method based on self-adaptive graph attention neural network Active CN115762147B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211386613.4A CN115762147B (en) 2022-11-07 2022-11-07 Traffic flow prediction method based on self-adaptive graph attention neural network


Publications (2)

Publication Number Publication Date
CN115762147A true CN115762147A (en) 2023-03-07
CN115762147B CN115762147B (en) 2023-11-21

Family

ID=85357215

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211386613.4A Active CN115762147B (en) 2022-11-07 2022-11-07 Traffic flow prediction method based on self-adaptive graph attention neural network

Country Status (1)

Country Link
CN (1) CN115762147B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116153089A (en) * 2023-04-24 2023-05-23 云南大学 Traffic flow prediction system and method based on space-time convolution and dynamic diagram

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111223301A (en) * 2020-03-11 2020-06-02 北京理工大学 Traffic flow prediction method based on graph attention convolution network
CN111540199A (en) * 2020-04-21 2020-08-14 浙江省交通规划设计研究院有限公司 High-speed traffic flow prediction method based on multi-mode fusion and graph attention machine mechanism
CN112508173A (en) * 2020-12-02 2021-03-16 中南大学 Traffic space-time sequence multi-step prediction method, system and storage medium
CN112785848A (en) * 2021-01-04 2021-05-11 清华大学 Traffic data prediction method and system
US20210209938A1 (en) * 2020-09-25 2021-07-08 Beijing Baidu Netcom Science And Technology Co., Ltd. Method, apparatus, system, and computer-readable medium for traffic pattern prediction
CN114818515A (en) * 2022-06-24 2022-07-29 中国海洋大学 Multidimensional time sequence prediction method based on self-attention mechanism and graph convolution network


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JIANG Shan et al.: "A Graph Neural Network Model for Road Network Traffic Flow Situation Prediction", Journal of Frontiers of Computer Science and Technology, vol. 15, no. 6, pages 1084 - 1091 *


Also Published As

Publication number Publication date
CN115762147B (en) 2023-11-21

Similar Documents

Publication Publication Date Title
Zhang et al. Trafficgan: Network-scale deep traffic prediction with generative adversarial nets
Mallick et al. Transfer learning with graph neural networks for short-term highway traffic forecasting
CN110263280B (en) Multi-view-based dynamic link prediction depth model and application
CN107180530B (en) A kind of road network trend prediction method based on depth space-time convolution loop network
Chen et al. A graph convolutional stacked bidirectional unidirectional-LSTM neural network for metro ridership prediction
CN110473592B (en) Multi-view human synthetic lethal gene prediction method
Lee et al. Real-time optimization for adaptive traffic signal control using genetic algorithms
CN114519932B (en) Regional traffic condition integrated prediction method based on space-time relation extraction
Faro et al. Evaluation of the traffic parameters in a metropolitan area by fusing visual perceptions and CNN processing of webcam images
CN111242395B (en) Method and device for constructing prediction model for OD (origin-destination) data
CN111047078B (en) Traffic characteristic prediction method, system and storage medium
CN111242292A (en) OD data prediction method and system based on deep space-time network
CN115204478A (en) Public traffic flow prediction method combining urban interest points and space-time causal relationship
Wu Spatiotemporal dynamic forecasting and analysis of regional traffic flow in urban road networks using deep learning convolutional neural network
CN116011684A (en) Traffic flow prediction method based on space-time diagram convolutional network
CN115762147B (en) Traffic flow prediction method based on self-adaptive graph attention neural network
Jiang et al. Bi‐GRCN: A Spatio‐Temporal Traffic Flow Prediction Model Based on Graph Neural Network
Rahman et al. A deep learning approach for network-wide dynamic traffic prediction during hurricane evacuation
Feng et al. A hybrid model integrating local and global spatial correlation for traffic prediction
CN117610734A (en) Deep learning-based user behavior prediction method, system and electronic equipment
CN111079900B (en) Image processing method and device based on self-adaptive connection neural network
CN112529294A (en) Training method, medium and equipment for individual random trip destination prediction model
Cho et al. An image generation approach for traffic density classification at large-scale road network
CN115565370A (en) Local space-time graph convolution traffic flow prediction method and system
Xue et al. Urban population density estimation based on spatio‐temporal trajectories

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant