CN116682271A - Traffic flow prediction method based on U-shaped multi-scale space-time graph convolutional network - Google Patents


Info

Publication number: CN116682271A
Application number: CN202310674025.9A
Authority: CN (China)
Other languages: Chinese (zh)
Prior art keywords: time, space, gating, network, traffic flow
Legal status: Pending
Inventors: 张帅, 余望之, 宋小波, 姚家渭, 张文宇
Current and original assignees: Hangzhou Half Cloud Technology Co. Ltd; Zhejiang University of Finance and Economics
Application filed by Hangzhou Half Cloud Technology Co. Ltd and Zhejiang University of Finance and Economics

Classifications

    • G08G1/065 — Traffic control systems for road vehicles: counting the vehicles in a section of the road or in a parking area
    • G06N3/0464 — Neural networks: convolutional networks [CNN, ConvNet]
    • G06N3/048 — Neural networks: activation functions
    • G06N3/08 — Neural networks: learning methods
    • G08G1/0129 — Traffic data processing for creating historical data or processing based on historical data
    • G08G1/0137 — Measuring and analyzing of parameters relative to traffic conditions for specific applications
    • Y02T10/40 — Engine management systems (climate change mitigation technologies related to transportation)

Abstract

The application discloses a traffic flow prediction method based on a U-shaped multi-scale space-time graph convolutional network comprising a space-time encoder and a space-time decoder. Raw feature data comprising a channel dimension, a node dimension and a time dimension are constructed from the acquired historical traffic flow data of each node in the traffic network over a preset time period. The raw feature data are input into the space-time encoder to extract space-time features, and the extracted features are then input into the space-time decoder, each decoding layer of which is skip-connected to the corresponding encoding layer in the space-time encoder, finally yielding the prediction result. The method comprehensively captures space-time dependencies at different scales and obtains better prediction performance over multiple prediction time-point steps.

Description

Traffic flow prediction method based on U-shaped multi-scale space-time graph convolutional network
Technical Field
The application belongs to the technical field of traffic management, and in particular relates to a traffic flow prediction method based on a U-shaped multi-scale space-time graph convolutional network.
Background
In recent years, intelligent traffic systems (ITS) have become an indispensable component of traffic networks. Accurate traffic flow prediction enables efficient and reliable traffic management and helps travelers plan optimal routes, and is therefore of great significance. However, traffic flow data have complex space-time dependencies, which are typically captured by space-time modules and stacked in the channel dimension. When capturing implicit patterns, implicit fine-grained features and multi-scale space-time dependencies, existing models often ignore channel semantics, making it difficult to represent the space-time dependencies comprehensively. Accurate traffic flow prediction therefore remains a challenge.
Currently, deep-learning-based models are widely used for traffic prediction, and they mainly fall into three branches: models based on convolutional neural networks (CNN), models based on recurrent neural networks (RNN), and models based on graph convolutional networks (GCN). In the spatial dimension, CNN-based models treat the traffic prediction problem as an image-learning problem, thereby capturing the spatial relationships between sensors. RNN-based models capture temporal features iteratively in the time dimension to handle time-series tasks. However, CNN-based models tend to ignore the non-Euclidean structure of traffic networks, while RNN-based models suffer from time-consuming iterative propagation and gradient explosion/vanishing problems when capturing long-distance sequences. GCN-based models treat the traffic network as a graph structure, so the non-Euclidean structure of the traffic network can be captured effectively, and CNNs can still be used to extract temporal features efficiently. Nevertheless, GCN-based models still have some drawbacks. First, although some GCN-based studies have addressed traffic prediction with graph-based approaches, most GCN-based models capture space-time dependencies that are insufficiently multi-scale to fully represent the complex relationships between sensors. Second, although attention mechanisms have been widely used in GCN-based traffic prediction models, they have not been used to capture the implicit patterns and implicit fine-grained features stacked in the channel dimension.
Disclosure of Invention
The application aims to provide a traffic flow prediction method based on a U-shaped multi-scale space-time graph convolutional network, so as to overcome the problems encountered by existing deep learning models in traffic prediction.
In order to achieve the above purpose, the technical scheme of the application is as follows:
a traffic flow prediction method based on a U-shaped multi-scale space-time diagram convolutional network, the U-shaped multi-scale space-time diagram convolutional network comprising a space-time encoder and a space-time decoder, the traffic flow prediction method based on the U-shaped multi-scale space-time diagram convolutional network comprising:
acquiring historical traffic flow data of each node in a traffic network in a preset time period, and constructing original characteristic data comprising channel dimension, node dimension and time dimension;
inputting original characteristic data into a space-time encoder, wherein the space-time encoder comprises M coding layers which are sequentially connected, in each coding layer, capturing time characteristics by a time gating encoder, and further capturing space characteristics by a static image gating image rolling network and a self-adaptive image gating image rolling network sequentially to obtain first characteristics; the other path of the input feature passes through a channel modeling module, the attention scores of different channels are calculated in the channel modeling module, and normalization and random discarding processing are carried out to obtain a second feature; then adding the first feature and the second feature to obtain an output feature of the coding layer;
inputting the output of the space-time encoder into a space-time decoder to obtain a prediction result, wherein the space-time decoder comprises M decoding layers which are sequentially connected, each decoding layer is in jump connection with a corresponding encoding layer in the space-time encoder, in each encoding layer, one path of input characteristics is decoded by a time gating decoder, additional fine granularity time characteristics are captured, and then adaptive spatial characteristic decoding is realized by an adaptive graph gating graph rolling network to capture fine granularity spatial characteristics, so that a third characteristic is obtained; the other path of the input features passes through a channel modeling module to effectively capture the implicit mode and the implicit fine granularity features among channels to obtain fourth features; and then adding the third feature and the fourth feature to obtain the output feature of the decoding layer.
Further, the channel dimension of the features includes four features: the traffic flow, the timestamp, the longitude of the node, and the latitude of the node.
Further, the time-gated encoder adopts the following formulas:
e = Γ(Cconv(X_e) ⊙ g(X_e) + r(X_e))
Cconv(X_e) = X_e W_ec + b_ec
g(X_e) = σ(X_e W_eg + b_eg)
r(X_e) = X_e W_er + b_er
where Cconv(X_e) is a one-dimensional causal convolution, and W_ec and b_ec are its parameters; g(X_e) is the gate controlling the output of the causal convolution, and W_eg and b_eg are the gating parameters; r(X_e) is a residual connection used to avoid network degradation, and W_er and b_er are its parameters; e is the output of the time-gated encoder; Γ is the activation function; σ is the Sigmoid function; ⊙ is the element-wise Hadamard product; and X_e is the input to the time-gated encoder.
Further, the static-graph gated graph convolutional network adopts the following formula:
GGCN-FG(X) = BN((A_FG X W_1) ⊙ σ(A_FG X W_2)) + X
where GGCN-FG(X) is the output of the static-graph gated graph convolutional network, W_1 and W_2 are linear layers, σ and ⊙ are the Sigmoid function and the element-wise Hadamard product used to form the gate, BN is batch normalization, X is the input to the static-graph gated graph convolutional network, and A_FG is the static graph.
Further, the adaptive-graph gated graph convolutional network adopts the following formula:
GGCN-AG(X) = BN((A_AG X W_1) ⊙ σ(A_AG X W_2)) + X
where GGCN-AG(X) is the output of the adaptive-graph gated graph convolutional network, W_1 and W_2 are linear layers, σ and ⊙ are the Sigmoid function and the element-wise Hadamard product used to form the gate, BN is batch normalization, X is the input to the adaptive-graph gated graph convolutional network, and A_AG is the adaptive graph.
Further, the channel modeling module adopts the following formulas:
AS = SoftMax(Q K^T / √C_d) V
CSAM = Dropout(LN(AS))
where AS is the attention score, CSAM is the output of the channel modeling module, Dropout is random dropout, LN is layer normalization, C_d is the channel dimension of the self-attention mechanism, Q is the query vector obtained from the input through the query mapping, K is the key vector obtained from the input through the key mapping, and V is the value vector obtained from the input through the value mapping.
Further, the time-gated decoder adopts the following formulas:
d = Γ(Tconv(X_d) ⊙ g(X_d) + r(X_d))
Tconv(X_d) = X_d W_dc + b_dc
g(X_d) = σ(X_d W_dg + b_dg)
r(X_d) = X_d W_dr + b_dr
where Tconv(X_d) is a one-dimensional transposed convolution, and W_dc and b_dc are its parameters; g(X_d) is the gate controlling the output of the transposed convolution, and W_dg and b_dg are the gating parameters; r(X_d) is a residual connection used to avoid network degradation, and W_dr and b_dr are its parameters; d is the output of the time-gated decoder; Γ is the ReLU activation function; σ is the Sigmoid activation function; ⊙ is the element-wise Hadamard product; and X_d is the input to the time-gated decoder.
According to the traffic flow prediction method based on the U-shaped multi-scale space-time graph convolutional network provided by the application, the novel time encoder-decoder module with causal convolution and transposed convolution effectively alleviates the gradient explosion/vanishing problem caused by capturing long-distance sequences during encoding and decoding. A new channel self-attention mechanism effectively captures the implicit patterns and implicit fine-grained features among channels, strengthening the channel semantics and the capability to represent space-time dependencies. Through the fully convolutional, skip-connected U-shaped multi-scale space-time graph convolutional network, space-time dependencies at different scales are captured comprehensively, and good prediction performance is obtained over multiple prediction time-point steps.
Drawings
FIG. 1 is a schematic diagram of the U-shaped multi-scale space-time graph convolutional network model of the present application.
Fig. 2 is a flow chart of the traffic flow prediction method based on the U-shaped multi-scale space-time graph convolutional network.
Fig. 3 is a schematic diagram of an encoding-layer structure of the space-time encoder according to an embodiment of the present application.
FIG. 4 is a schematic diagram of a time-gated encoder according to an embodiment of the present application.
FIG. 5 is a schematic diagram of a static-graph gated graph convolutional network according to an embodiment of the present application.
Fig. 6 is a schematic diagram of an adaptive-graph gated graph convolutional network according to an embodiment of the present application.
FIG. 7 is a schematic diagram of a channel modeling module according to an embodiment of the present application.
Fig. 8 is a schematic diagram of a decoding layer structure of a space-time decoder according to an embodiment of the present application.
FIG. 9 is a schematic diagram of a time-gated decoder according to an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The traffic network G = (V, E, A) is a directed or undirected graph representing the spatial relationships between different nodes, where V represents the N nodes observed on traffic paths, E represents the set of edges describing the connectivity between nodes, and A ∈ R^(N × N) is a weighted adjacency matrix representing the degree of connectivity between nodes. Traffic flow refers to the flow detected by the sensor at a node per unit of time. The purpose of traffic flow prediction is to find a function that uses T historical time-point steps of the space-time signal and the traffic network graph G to predict the space-time signal at T′ future time-point steps. This mapping relationship can be expressed as formula (1):
[Y_(t+1), Y_(t+2), ..., Y_(t+T′)] = f([X_(t−T+1), X_(t−T+2), ..., X_t]; G) (1)
where T and T′ are the historical and future time-point steps respectively, X ∈ R^(N × D) and Y ∈ R^(N × D) are the historical and future space-time signals respectively, and D is the number of input features of the model. The application considers four features, namely the traffic flow sequence, the timestamp, the longitude of the sensor and the latitude of the sensor, where the timestamp is a temporal feature with periodic attributes (daily, monthly and yearly periods) used to enhance the capability to represent the periodic attributes of the traffic flow.
The application provides a U-shaped multi-scale space-time graph convolutional network model for traffic prediction. As shown in FIG. 1, the proposed model (GSTC-Unet) comprises a space-time encoder and a space-time decoder with skip connections between them, so that multi-scale space-time dependencies can be captured effectively.
The space-time encoder comprises a time-gated encoder (TGE), a spatial modeling module and a channel modeling module; the space-time decoder comprises a time-gated decoder (TGD), an adaptive-graph gated graph convolutional network (GGCN-AG) and a channel modeling module.
The application adopts the time-gated encoder (TGE) and the time-gated decoder (TGD) to extract temporal features efficiently and to alleviate the gradient explosion/vanishing problem arising when capturing long-distance sequences. The spatial modeling module captures spatial features using a static-graph gated graph convolutional network (GGCN-FG) and an adaptive-graph gated graph convolutional network (GGCN-AG). The channel modeling module captures the implicit patterns and implicit fine-grained features among channels using a channel self-attention mechanism (CSAM), thereby strengthening the capability to represent the space-time dependencies of the traffic network.
In one embodiment, as shown in fig. 2, the traffic flow prediction method based on the U-shaped multi-scale space-time graph convolutional network provided by the application comprises the following steps:
Step S1, acquiring historical traffic flow data of each node in the traffic network over a preset time period, and constructing raw feature data comprising a channel dimension, a node dimension and a time dimension.
The application predicts traffic flow at T′ future time points using traffic flow at T historical time points and the traffic network graph G. For the proposed U-shaped multi-scale space-time graph convolutional network model, the raw input feature data of the network model are constructed first.
The raw feature data constructed in this embodiment are denoted X ∈ R^(C_ie × N × T_ie), where C_ie is the channel dimension, N is the number of nodes of the traffic network, and T_ie is the input time dimension. The channel dimension includes features such as the traffic flow, the timestamp, the longitude of the node and the latitude of the node.
Step S2, inputting the raw feature data into the space-time encoder, which comprises M sequentially connected encoding layers. In each encoding layer, one path of the input features is passed through the time-gated encoder to capture temporal features, and then through the static-graph gated graph convolutional network and the adaptive-graph gated graph convolutional network in sequence to further capture spatial features, yielding a first feature; the other path of the input features is passed through the channel modeling module, in which the attention scores of the different channels are calculated and normalization and random-dropout processing are applied, yielding a second feature; the first feature and the second feature are then added to obtain the output features of the encoding layer.
In a specific embodiment, the space-time encoder comprises 4 sequentially connected encoding layers. As shown in fig. 3, each encoding layer comprises two branches, and the output features of the two branches are added to obtain the output features of the encoding layer.
Specifically, the first branch of the encoding layer comprises the time-gated encoder TGE, the static-graph gated graph convolutional network GGCN-FG and the adaptive-graph gated graph convolutional network GGCN-AG; the second branch is the channel modeling module CSAM.
In a specific embodiment, the time-gated encoder (TGE) is shown in fig. 4. The TGE employs one-dimensional causal convolution to fully capture the temporal features of the traffic network and stacks them in the channel dimension, thereby downsampling the temporal features. In addition, the TGE uses a gating module that effectively captures time-dependent outputs to adaptively control the effective output. The TGE thus constructs the causal relationships of time-series prediction.
Given an input X_e ∈ R^(C_ie × N × T_ie), where C_ie is the input channel dimension, N is the number of nodes and T_ie is the input time dimension, the TGE can be expressed as follows:
e = Γ(Cconv(X_e) ⊙ g(X_e) + r(X_e)) (2)
Cconv(X_e) = X_e W_ec + b_ec (3)
g(X_e) = σ(X_e W_eg + b_eg) (4)
r(X_e) = X_e W_er + b_er (5)
where Cconv(X_e) is a one-dimensional causal convolution used to encode the temporal features, and W_ec and b_ec are its parameters; g(X_e) is the gate controlling the output of the causal convolution, and W_eg and b_eg are the gating parameters; r(X_e) is a residual connection used to avoid network degradation, and W_er and b_er are its parameters; e ∈ R^(C_oe × N × T_oe) is the output of the TGE, where C_oe is the output channel dimension and T_oe the output time dimension; Γ is the activation function; ⊙ is the element-wise Hadamard product; and the gate adaptively controls the output.
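As an illustration, the gated causal convolution underlying the TGE can be sketched in numpy for a single node's sequence (the 2-D time-by-channel layout, the parameter shapes and the choice Γ = ReLU are assumptions of this sketch, not taken from the patent):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def causal_conv1d(x, w, b):
    """1-D causal convolution along the time axis.

    x: (T, C_in) sequence; w: (K, C_in, C_out) kernel; b: (C_out,).
    Left-pads K-1 zeros so the output at step t depends only on steps <= t.
    """
    K, C_in, C_out = w.shape
    T = x.shape[0]
    xp = np.vstack([np.zeros((K - 1, C_in)), x])   # causal left padding
    out = np.zeros((T, C_out))
    for t in range(T):
        # window covers original steps t-K+1 .. t
        out[t] = np.tensordot(xp[t:t + K], w, axes=([0, 1], [0, 1])) + b
    return out

def time_gated_encoder(x, w_c, b_c, w_g, b_g, w_r, b_r):
    """TGE sketch: Gamma(Cconv(x) * sigmoid-gate(x) + residual(x))."""
    conv = causal_conv1d(x, w_c, b_c)              # Cconv branch, eq. (3)
    gate = sigmoid(causal_conv1d(x, w_g, b_g))     # gating branch, eq. (4)
    res = x @ w_r + b_r                            # residual branch, eq. (5)
    return np.maximum(conv * gate + res, 0.0)      # Gamma taken as ReLU
```

Because the convolution is causal, perturbing the input at some step t leaves all outputs before t unchanged, which is the causal relationship the TGE constructs.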
In a specific embodiment, the static-graph gated graph convolutional network (GGCN-FG) is shown in FIG. 5, where the static graph is derived from the distances between different nodes. Generally, different nodes have different geographical location distributions, while neighboring nodes have higher similarity. The application therefore combines the longitude and latitude positions of the nodes with a threshold-based Gaussian kernel, calculating the Euclidean distances between nodes in the traffic network to represent the static graph. The static graph (FG) is given by formula (6):
A_FG[i][j] = exp(−d(i,j)² / σ²) if exp(−d(i,j)² / σ²) ≥ ε, and 0 otherwise, with d(i,j) = √((X_i − X_j)² + (Y_i − Y_j)²) (6)
where X_i and Y_i are the latitude and longitude of node i, X_j and Y_j are the latitude and longitude of node j, σ is the standard deviation of the Euclidean distances, and ε is the threshold of the Gaussian kernel.
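A minimal numpy sketch of building the static graph from node coordinates with a thresholded Gaussian kernel, as described above (the parameter values are illustrative, and σ is passed in directly rather than computed from the distances):

```python
import numpy as np

def static_graph(coords, sigma, eps):
    """Static graph A_FG from node coordinates via a thresholded Gaussian kernel.

    coords: (N, 2) array of (latitude, longitude) per node.
    Entries exp(-d_ij^2 / sigma^2) below the threshold eps are zeroed;
    in the patent sigma is the standard deviation of the pairwise distances.
    """
    diff = coords[:, None, :] - coords[None, :, :]   # (N, N, 2) pairwise offsets
    d2 = (diff ** 2).sum(axis=-1)                    # squared Euclidean distances
    A = np.exp(-d2 / sigma ** 2)
    A[A < eps] = 0.0                                 # Gaussian-kernel threshold
    return A
```

The result is symmetric with unit diagonal, and node pairs farther apart than the threshold allows contribute no edge.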
Assume A_FG ∈ R^(N × N) represents the generated static graph; X ∈ R^(C_in × N × T_in) represents the input data, where C_in is the input channel dimension and T_in the input time dimension; and W ∈ R^(C_in × C_out) represents a linear layer, where C_in is the input channel dimension and C_out the output channel dimension. The GCN can therefore be defined as formula (7):
g = A_FG X W (7)
As shown in the GGCN-FG of fig. 5, on the basis of formula (7), the output of the GCN is further adaptively controlled through gating, more statistical information is obtained through a regularization method, and network degradation is avoided through a residual connection. Based on FG, the GGCN-FG expressed by formula (8) can be obtained:
GGCN-FG(X) = BN((A_FG X W_1) ⊙ σ(A_FG X W_2)) + X (8)
where GGCN-FG(X) ∈ R^(C_out × N × T_out) is the GGCN output, with C_out the output channel dimension and T_out the time dimension; W_1 and W_2 are linear layers; σ and ⊙ are the Sigmoid function and the element-wise Hadamard product used to form the gate; and BN is batch normalization.
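A numpy sketch of the gated graph convolution described above: the GCN output g = A X W is gated by a Sigmoid branch, batch-normalized, and given a residual connection. The exact composition order and the 2-D node-by-channel layout are assumptions of this sketch:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def batch_norm(x, eps=1e-5):
    # per-feature standardization, standing in for a trained BatchNorm layer
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def gated_gcn(X, A, W1, W2):
    """Gated graph convolution sketch: gate the GCN output, then BN + residual.

    X: (N, C) node features; A: (N, N) adjacency (static or adaptive);
    W1, W2: (C, C) linear layers.
    """
    g1 = A @ X @ W1                    # GCN branch, g = A X W
    g2 = sigmoid(A @ X @ W2)           # Sigmoid gating branch
    return batch_norm(g1 * g2) + X     # gated output, BN, residual connection
```

The same function serves for both GGCN-FG and GGCN-AG, with A set to the static graph A_FG or the adaptive graph A_AG respectively.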
In a specific embodiment, the adaptive-graph gated graph convolutional network (GGCN-AG) is shown in FIG. 6. In addition to the static graph generated from the geographical location distribution, the adaptive graph is also very important, as it can represent the hidden spatial dependencies between nodes. The adaptive graph (AG) is independent of prior knowledge and can be defined according to formula (9):
A_AG = SoftMax(ReLU(E_1 E_2^T)) (9)
where E_1, E_2 ∈ R^(N × C) are randomly initialized vectors that automatically capture the hidden spatial dependencies between nodes during model training, C is the length of the vectors, and the ReLU and SoftMax functions are used to improve the speed and effect of model training.
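The adaptive graph can be sketched directly in numpy (here the embeddings are random stand-ins for what training would learn):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_graph(E1, E2):
    """Adaptive graph A_AG = SoftMax(ReLU(E1 @ E2.T)).

    E1, E2: (N, C) randomly initialized node embeddings (learned during
    training); each row of the result is normalized to sum to 1.
    """
    return softmax(np.maximum(E1 @ E2.T, 0.0), axis=1)
```

The SoftMax row-normalizes the hidden adjacency, so every node distributes a unit of attention over all nodes.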
As shown in the GGCN-AG of fig. 6, on the basis of formula (7), the output of the GCN is further adaptively controlled through gating, more statistical information is obtained through a regularization method, and network degradation is avoided through a residual connection. Based on the adaptive graph AG, the GGCN-AG expressed by formula (10) can be obtained:
GGCN-AG(X) = BN((A_AG X W_1) ⊙ σ(A_AG X W_2)) + X (10)
where GGCN-AG(X) ∈ R^(C_out × N × T_out) is the GGCN output, with C_out the output channel dimension and T_out the time dimension; W_1 and W_2 are linear layers; σ and ⊙ are the Sigmoid function and the element-wise Hadamard product used to form the gate; and BN is batch normalization.
In a specific embodiment, the channel modeling module is shown in FIG. 7. When capturing the space-time dependencies of traffic flow data, different channels represent different space-time semantics, so channel conversion is often required. To improve the capability to represent space-time dependencies among channels and to capture the implicit patterns and implicit fine-grained features among channels, the application proposes a novel CSAM to realize semantic interaction among channels. As shown in the channel modeling module of fig. 7, the CSAM employs a self-attention mechanism to calculate the attention score (AS) of the different channels, as shown in formula (14). The CSAM can comprehensively represent the implicit patterns and implicit fine-grained features of the space-time dependencies stacked in the channel dimension. In addition, this embodiment also employs layer normalization (LN) and random dropout (Dropout) to accelerate network convergence and prevent overfitting. The CSAM can thus be expressed as formula (15):
AS = SoftMax(Q K^T / √C_d) V (14)
CSAM = Dropout(LN(AS)) (15)
where Q ∈ R^(C_in × C_d), K ∈ R^(C_in × C_d) and V ∈ R^(C_in × T_in) represent the query, key and value vectors, obtained from the input through the query, key and value mappings respectively, and serve as the inputs of the self-attention mechanism; C_in is the input channel dimension, C_d is the channel dimension of the self-attention mechanism, T_in is the input time dimension, and SoftMax is the activation function; CSAM ∈ R^(C_out × T_out) is the output of the CSAM, with C_out the output channel dimension and T_out the output time dimension.
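A numpy sketch of the channel self-attention for a single node, with channels as rows and time as columns; the mapping shapes and the √C_d scaling are assumptions of this sketch, and Dropout is omitted since it is inactive at inference:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(x.var(axis=-1, keepdims=True) + eps)

def channel_self_attention(X, Wq, Wk, Wv):
    """CSAM sketch: self-attention over the channel axis.

    X: (C_in, T) features with channels as rows; Wq, Wk: (T, C_d) query/key
    mappings; Wv: (T, T) value mapping (shapes are illustrative assumptions).
    """
    Q = X @ Wq                                        # (C_in, C_d) queries
    K = X @ Wk                                        # (C_in, C_d) keys
    V = X @ Wv                                        # (C_in, T) values
    AS = softmax(Q @ K.T / np.sqrt(Q.shape[1]), axis=1) @ V
    return layer_norm(AS)                             # LN of the attention score
```

Each output channel is thus a weighted mixture of all channels, which is how the channel semantics interact.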
Step S3, inputting the output of the space-time encoder into the space-time decoder to obtain the prediction result. The space-time decoder comprises M sequentially connected decoding layers, each skip-connected to the corresponding encoding layer in the space-time encoder. In each decoding layer, one path of the input features is decoded by the time-gated decoder to capture additional fine-grained temporal features, and adaptive spatial decoding is then performed by the adaptive-graph gated graph convolutional network to capture fine-grained spatial features, yielding a third feature; the other path of the input features is passed through the channel modeling module to effectively capture the implicit patterns and implicit fine-grained features among channels, yielding a fourth feature; the third feature and the fourth feature are then added to obtain the output features of the decoding layer.
In traffic flow prediction, the extraction of time-dependent relationships is indispensable, especially when dealing with long-distance sequences, which are prone to gradient explosion/vanishing problems during training. Accordingly, the application proposes a novel time encoder-decoder module comprising the TGE and the TGD, thereby effectively extracting the temporal features of traffic flow data.
The space-time decoder of this embodiment also comprises four decoding layers. As shown in fig. 8, each decoding layer likewise has two branches, and the output features of the two branches are added to obtain the output features of the decoding layer.
Specifically, the first branch of the decoding layer comprises the time-gated decoder TGD and the adaptive-graph gated graph convolutional network GGCN-AG; the second branch is the channel modeling module CSAM.
In a specific embodiment, the time-gated decoder (TGD) is shown in fig. 9. To upsample the temporal features captured by the TGE and restore them to the original time scale, the TGD employs one-dimensional transposed convolution to capture additional temporal features and map them to a higher-dimensional space, thereby alleviating the gradient explosion/vanishing problem caused by the conventional zero-filling method. The TGD also employs gating to adaptively control the model output. The TGD thus alleviates the problem of under-utilization of features.
In this embodiment, each decoding layer of the space-time decoder is skip-connected to the corresponding encoding layer of the space-time encoder, so that the input of the first decoding layer is obtained by splicing the output of the space-time encoder with the output of the fourth encoding layer; the input of the second decoding layer is obtained by splicing the output of the first decoding layer with the output of the third encoding layer, and so on.
For any decoding layer, the given input features X_d ∈ R^(C_id × N × T_id) and the output features of the corresponding encoding layer X_e ∈ R^(C_ie × N × T_id) are spliced to obtain X ∈ R^((C_id + C_ie) × N × T_id), where C_id is the channel dimension of the input features, C_ie is the channel dimension of the encoding-layer output features, and T_id is the input time dimension.
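The splicing step is simply a concatenation along the channel axis, illustrated in numpy (the shapes below are illustrative, not taken from the patent):

```python
import numpy as np

# Skip-connection splice: concatenate the decoder-path features with the
# matching encoding-layer output along the channel axis of the
# (channel, node, time) layout.
X_d = np.zeros((32, 170, 3))   # decoder-path input     (C_id, N, T_id)
X_e = np.zeros((32, 170, 3))   # encoding-layer output  (C_ie, N, T_id)
X_cat = np.concatenate([X_d, X_e], axis=0)   # (C_id + C_ie, N, T_id)
```

Only the channel dimension grows; the node and time dimensions must already agree, which is why each decoding layer pairs with the encoding layer at the same scale.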
The time-gated decoder TGD can be defined as follows:
d = Γ(Tconv(X_d) ⊙ g(X_d) + r(X_d)) (16)
Tconv(X_d) = X_d W_dc + b_dc (17)
g(X_d) = σ(X_d W_dg + b_dg) (18)
r(X_d) = X_d W_dr + b_dr (19)
where Tconv(X_d) is a one-dimensional transposed convolution used to decode the temporal features, and W_dc and b_dc are its parameters; g(X_d) is the gate controlling the output of the transposed convolution, and W_dg and b_dg are the gating parameters; r(X_d) is a residual connection used to avoid network degradation, and W_dr and b_dr are its parameters; d ∈ R^(C_od × N × T_od) is the output of the TGD, where C_od is the output channel dimension and T_od the output time dimension; Γ is the ReLU activation function; σ is the Sigmoid activation function; and ⊙ is the element-wise Hadamard product.
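A numpy sketch of the transposed-convolution upsampling behind the TGD; as an assumption of this sketch, all three paths (Tconv, gate and residual) are implemented as transposed convolutions so that their upsampled output shapes agree:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def transposed_conv1d(x, w, b, stride=2):
    """1-D transposed convolution along time, upsampling T to (T-1)*stride + K.

    x: (T, C_in); w: (K, C_in, C_out); b: (C_out,).
    """
    T = x.shape[0]
    K, C_in, C_out = w.shape
    T_out = (T - 1) * stride + K
    out = np.tile(b, (T_out, 1)).astype(float)
    for t in range(T):
        for k in range(K):
            out[t * stride + k] += x[t] @ w[k]   # scatter each input step
    return out

def time_gated_decoder(x, w_c, b_c, w_g, b_g, w_r, b_r, stride=2):
    """TGD sketch: ReLU(Tconv(x) * sigmoid-gate(x) + residual(x))."""
    conv = transposed_conv1d(x, w_c, b_c, stride)            # eq. (17)
    gate = sigmoid(transposed_conv1d(x, w_g, b_g, stride))   # eq. (18)
    res = transposed_conv1d(x, w_r, b_r, stride)             # eq. (19)
    return np.maximum(conv * gate + res, 0.0)                # Gamma = ReLU
```

With stride 2 the time dimension roughly doubles per layer, mirroring the factor-2 compression on the encoder side.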
The application also verifies the model through experiments, testing the performance of the proposed model on two datasets, PeMSD4 and PeMSD8, both from the Caltrans Performance Measurement System (PeMS). The PeMSD4 dataset has 180 sensors located in San Francisco. The PeMSD8 dataset has 175 sensors located in San Bernardino. Both datasets contain traffic flow data from June 1, 2017 to August 1, 2017, with 5 minutes as the time-point step. Following the study by Li et al. (2017), the traffic flow data were normalized with the Z-Score method, and the data were partitioned into training, validation and test sets at a 6:2:2 ratio.
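The normalization and 6:2:2 split can be sketched as follows (computing the Z-Score statistics on the training portion only is an assumption of this sketch, following common practice):

```python
import numpy as np

def zscore_split(data, ratios=(0.6, 0.2, 0.2)):
    """Z-Score normalize, then split 6:2:2 along the first (time) axis."""
    n = len(data)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    mean = data[:n_train].mean()        # statistics from the training portion
    std = data[:n_train].std()
    norm = (data - mean) / std
    return norm[:n_train], norm[n_train:n_train + n_val], norm[n_train + n_val:]
```

The training portion then has zero mean by construction, while the validation and test portions keep whatever offset they have relative to the training statistics.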
Traffic flow over 12 historical time steps (60 minutes) was used to predict traffic flow over the next 12 consecutive time steps, and three statistical metrics were used to evaluate model performance: Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Mean Absolute Percentage Error (MAPE). Let y_p and ŷ_p (1 ≤ p ≤ P) denote the p-th true value and predicted value, respectively, and let P be the size of the test sample; the three metrics are computed as follows:

MAE = (1/P) Σ_{p=1}^{P} |y_p − ŷ_p|
RMSE = sqrt((1/P) Σ_{p=1}^{P} (y_p − ŷ_p)²)
MAPE = (100%/P) Σ_{p=1}^{P} |y_p − ŷ_p| / y_p
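These three standard metrics can be computed directly from the prediction and ground-truth arrays; a minimal sketch:

```python
import numpy as np

def mae(y, y_hat):
    """Mean Absolute Error."""
    return float(np.mean(np.abs(y - y_hat)))

def rmse(y, y_hat):
    """Root Mean Square Error."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def mape(y, y_hat):
    """Mean Absolute Percentage Error, in percent.
    Assumes no zero ground-truth values."""
    return float(np.mean(np.abs((y - y_hat) / y)) * 100.0)
```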
adam was chosen as the optimizer for the experiment, with an initial learning rate of 0.001, training round number of 150, and batch size of 64. In TGE and TGD, one-dimensional causal convolution, one-dimensional transpose convolution, gating, and residual concatenation all use a convolution kernel with a filter size of 3 and a time point step of 1 as the convolution kernel. In the spatial modeling module, σ and ε are both set to 0.5, E 1 and E2 The length of (2) is set to 10. In the channel modeling module, the channel dimension C of the self-attention mechanism d Let 8 be the number. The number of coding layers and decoding layers/is set to 4. Input time dimension T in Let 12 be the output time dimension T out Set to 12.t represents the time feature scale of each compression of TGE, set to 2. Input channel dimension C in Channel dimension C of 32, output out 32.
Table 1 lists the average prediction results of the proposed GSTC-Unet and the other baseline models for 3 time steps (15 minutes), 6 time steps (30 minutes) and 12 time steps (60 minutes) into the future on the PeMSD4 and PeMSD8 datasets. The comparative experimental results show that:
(1) The deep learning models (LSTM, STGCN, DCRNN, ASTGNN, MTGNN, Graph WaveNet and GSTC-Unet) generally outperform the traditional statistical model (HA). It follows that complex, self-learning deep models are better suited to modeling traffic flow data with spatiotemporal relationships.
(2) Among the deep learning models, the GCN-based models (STGCN, DCRNN, ASTGNN, MTGNN, Graph WaveNet, and GSTC-Unet) perform better than the model without a GCN structure (LSTM), which shows that GCN-based models are better at handling traffic flow data with non-Euclidean features.
(3) Compared with STGCN, DCRNN, ASTGNN, MTGNN and Graph WaveNet, GSTC-Unet performs better because the U-MSSGCN it adopts can effectively capture the multi-scale spatiotemporal dependencies of traffic flow data.
(4) Compared to the RNN-based models (i.e., LSTM and DCRNN), the CNN-based models (i.e., STGCN, MTGNN, Graph WaveNet and GSTC-Unet) perform better, especially for predictions 30 and 60 minutes into the future, because the exploding/vanishing-gradient problem of RNNs limits their ability to capture long-range sequences. Furthermore, GSTC-Unet performs best among all CNN-based models, because its temporal encoder-decoder module captures temporal features with a CNN-based TGE, which constructs causal relationships for time-series prediction, and a TGD, which alleviates the under-utilization of features. GSTC-Unet can therefore effectively avoid the exploding/vanishing-gradient problem when capturing long-range sequences, achieving excellent performance in long-range prediction.
(5) ASTGNN adopts a trend-aware self-attention mechanism in the spatiotemporal dimensions and can adaptively learn the dynamic spatiotemporal characteristics of traffic flow data, but it ignores features in the channel dimension. Compared with ASTGNN, GSTC-Unet adopts the novel CSAM in the channel dimension, which captures the implicit patterns and implicit fine-grained features between channels, strengthening the model's representation of spatiotemporal dependencies; GSTC-Unet therefore performs better than ASTGNN.
(6) In summary, the GSTC-Unet proposed by the application obtains the best prediction results on most metrics, especially for long-range prediction 30 and 60 minutes into the future.
Table 1 Results of comparative experiments of different models on the two datasets
The GSTC-Unet proposed by the application can comprehensively capture long-range sequences, implicit patterns, implicit fine-grained features, and multi-scale spatiotemporal dependencies, thereby achieving accurate traffic flow prediction. First, the application proposes a novel temporal encoder-decoder based on causal convolution and transposed convolution, which constructs causal relationships for time-series prediction and alleviates the under-utilization of features, benefiting long-range sequence prediction. Second, the application proposes a new channel self-attention mechanism that enhances channel semantics, effectively captures the implicit patterns and implicit fine-grained features between channels, and strengthens the model's representation of spatiotemporal dependencies. Finally, the application proposes a novel U-shaped multi-scale spatiotemporal graph convolutional network with full convolution and skip connections, which comprehensively captures spatiotemporal dependencies at different scales. Extensive experimental results show that GSTC-Unet is significantly superior in traffic flow prediction.
The above examples illustrate only a few embodiments of the application, which are described in detail and are not to be construed as limiting the scope of the application. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the application, which are all within the scope of the application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims (7)

1. The traffic flow prediction method based on the U-shaped multi-scale space-time diagram convolution network is characterized in that the U-shaped multi-scale space-time diagram convolution network comprises a space-time encoder and a space-time decoder, and the traffic flow prediction method based on the U-shaped multi-scale space-time diagram convolution network comprises the following steps:
acquiring historical traffic flow data of each node in a traffic network in a preset time period, and constructing original characteristic data comprising channel dimension, node dimension and time dimension;
inputting the original characteristic data into the space-time encoder, wherein the space-time encoder comprises M sequentially connected encoding layers; in each encoding layer, one path of the input features has its temporal features captured by a time gating encoder, and then its spatial features captured sequentially by a static graph gating graph convolutional network and an adaptive graph gating graph convolutional network, obtaining a first feature; the other path of the input features passes through a channel modeling module, in which the attention scores of different channels are calculated and normalization and random-discarding processing are performed, obtaining a second feature; the first feature and the second feature are then added to obtain the output feature of the encoding layer;
inputting the output of the space-time encoder into the space-time decoder to obtain a prediction result, wherein the space-time decoder comprises M sequentially connected decoding layers, each decoding layer being in skip connection with the corresponding encoding layer of the space-time encoder; in each decoding layer, one path of the input features is decoded by a time gating decoder to capture additional fine-grained temporal features, and adaptive spatial feature decoding is then performed by an adaptive graph gating graph convolutional network to capture fine-grained spatial features, obtaining a third feature; the other path of the input features passes through a channel modeling module to effectively capture the implicit patterns and implicit fine-grained features between channels, obtaining a fourth feature; the third feature and the fourth feature are then added to obtain the output feature of the decoding layer.
2. The traffic flow prediction method based on the U-shaped multi-scale space-time diagram convolutional network according to claim 1, wherein the channel dimension of the features comprises four characteristics: traffic flow, time stamp, node longitude, and node latitude.
3. The traffic flow prediction method based on the U-shaped multi-scale space-time diagram convolutional network according to claim 1, wherein the time gating encoder adopts the following formula:
e = Γ(Cconv(X_e) ⊙ g(X_e) + r(X_e))
Cconv(X_e) = X_e W_ec + b_ec
g(X_e) = σ(X_e W_eg + b_eg)
r(X_e) = X_e W_er + b_er
wherein Cconv(X_e) is a one-dimensional causal convolution, and W_ec and b_ec are its parameters; g(X_e) is the gating that controls the output of the one-dimensional causal convolution, with gating parameters W_eg and b_eg; r(X_e) is a residual connection used to avoid network degradation, with parameters W_er and b_er; e is the output of the time gating encoder; Γ is the ReLU activation function; σ is the Sigmoid activation function; ⊙ denotes the element-wise Hadamard product; and X_e is the input to the time gating encoder.
4. The traffic flow prediction method based on the U-shaped multi-scale space-time diagram convolutional network according to claim 1, wherein the static graph gating graph convolutional network is expressed by the following formula:
wherein the left-hand side is the output of the static graph gating graph convolutional network; W_1 and W_2 are linear layers; σ and ⊙ are the Sigmoid function and the element-wise Hadamard product used to form the gating; BN is batch normalization; X is the input to the static graph gating graph convolutional network; and A_FG is the static graph.
5. The traffic flow prediction method based on the U-shaped multi-scale space-time diagram convolutional network according to claim 1, wherein the adaptive graph gating graph convolutional network is expressed by the following formula:
wherein the left-hand side is the output of the adaptive graph gating graph convolutional network; W_1 and W_2 are linear layers; σ and ⊙ are the Sigmoid function and the element-wise Hadamard product used to form the gating; BN is batch normalization; X is the input to the adaptive graph gating graph convolutional network; and A_AG is the adaptive graph.
6. The traffic flow prediction method based on the U-shaped multi-scale space-time diagram convolutional network according to claim 1, wherein the channel modeling module is expressed by the following formula:
CSAM=Dropout(LN(AS))
wherein AS is the attention score; CSAM is the output of the channel modeling module; Dropout is random discarding; LN is normalization; Q is the query vector obtained from the input through a query mapping; K is the key vector obtained from the input through a key mapping; and V is the value vector obtained from the input through a value mapping.
7. The traffic flow prediction method based on the U-shaped multi-scale space-time diagram convolutional network according to claim 1, wherein the time gating decoder adopts the following formula:
d = Γ(Tconv(X_d) ⊙ g(X_d) + r(X_d))
Tconv(X_d) = X_d W_dc + b_dc
g(X_d) = σ(X_d W_dg + b_dg)
r(X_d) = X_d W_dr + b_dr
wherein Tconv(X_d) is a one-dimensional transposed convolution, and W_dc and b_dc are its parameters; g(X_d) is the gating that controls the output of the one-dimensional transposed convolution, with gating parameters W_dg and b_dg; r(X_d) is a residual connection used to avoid network degradation, with parameters W_dr and b_dr; d is the output of the time gating decoder; Γ is the ReLU activation function; σ is the Sigmoid activation function; ⊙ denotes the element-wise Hadamard product; and X_d is the input to the time gating decoder.
CN202310674025.9A 2023-06-07 2023-06-07 Traffic flow prediction method based on U-shaped multi-scale space-time diagram convolutional network Pending CN116682271A (en)

Publications (1)

CN116682271A, published 2023-09-01
