CN113657645B - Space-time connection enhanced 3DCNN traffic prediction method based on attention mechanism - Google Patents
- Publication number
- CN113657645B (CN202110801710.4A)
- Authority
- CN
- China
- Prior art keywords
- time
- module
- component
- heterogeneity
- space
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
-
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
- G08G1/01—Detecting movement of traffic to be counted or controlled
- G08G1/0104—Measuring and analyzing of parameters relative to traffic conditions
- G08G1/0125—Traffic data processing
-
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
- G08G1/01—Detecting movement of traffic to be counted or controlled
- G08G1/0104—Measuring and analyzing of parameters relative to traffic conditions
- G08G1/0125—Traffic data processing
- G08G1/0129—Traffic data processing for creating historical data or processing based on historical data
Abstract
The invention discloses an attention-based space-time connection enhanced 3DCNN traffic prediction method. It designs a space-time connection enhanced 3DCNN model framework consisting of a periodic component, a trend component and a near-term component. The periodic and trend components extract the space-time features of traffic flow data at longer time intervals; the near-term component extracts the space-time features of recent data and further learns from the information extracted by the other two components, improving prediction accuracy. Building on the spatial and temporal dimensions of the traditional attention mechanism, the method also takes the channel dimension into account and constructs a space-time influence attention module that captures the degree of influence of space-time features, thereby quantifying space-time heterogeneity and further improving prediction accuracy. Finally, the multiple attention modules are connected so that the information obtained by lower-layer modules becomes more visible to higher-layer modules. The invention thus further improves the prediction accuracy for traffic flow data.
Description
Technical Field
The invention belongs to the field of traffic flow prediction, and relates to a space-time connection enhanced 3DCNN traffic flow prediction model based on an attention mechanism.
Background
With accelerating urbanization and rising living standards, the number of cars in cities has grown rapidly. While cars make daily life faster and more convenient, they also bring serious environmental pollution, traffic congestion and other problems. During the morning and evening commuting peaks in particular, and during severe weather, large gatherings or statutory holidays, urban trunk roads are often heavily congested. This greatly reduces road capacity, inconveniences travelers, poses major challenges for traffic management departments, and lowers the operating efficiency of the city. The construction of Intelligent Transportation Systems (ITS) aims to improve traffic operating conditions and efficiency and to relieve congestion. Accurate and efficient traffic flow prediction is a key part of solving the congestion problem in ITS. It provides important guidance for urban road planning, route guidance for residents, and the management and dispersal work of traffic departments, and it has become both a research hotspot and a difficult problem in the intelligent transportation field.
Over the past decades, many scholars have conducted extensive research on traffic prediction and achieved considerable results. Initially, traffic flow prediction was treated as a time series prediction problem, and studies generally built on models such as the autoregressive integrated moving average (ARIMA) model or recurrent neural networks (RNN). However, these models consider only temporal features and neglect the influence of spatial information from nearby areas, so they do not model traffic flow data adequately. To learn the spatial information in traffic flow data, convolutional neural networks (CNN) began to be applied to traffic prediction, but their weak ability to extract temporal information leaves the model unable to learn temporal features adequately. For this reason, more and more studies combine models with different characteristics, such as RNN and CNN, to extract and process space-time feature information comprehensively, but this brings problems such as complex model structures and difficult optimization. How to extract space-time features effectively with a structurally simple model, without sacrificing prediction accuracy, therefore remains a key and difficult problem in traffic prediction. In addition, most studies neglect the heterogeneity of space-time features in traffic flow data; how to identify and quantify the heterogeneity of space-time contributions is another problem the field must consider.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides an attention-based space-time connection enhanced 3DCNN traffic prediction model (STC3DCNN) that predicts large-scale traffic flow data efficiently and accurately, providing a basis for intelligent traffic control and guidance. The main content is as follows: (1) An STC3DCNN model framework consisting of a periodic component, a trend component and a near-term component is designed. The periodic and trend components extract the space-time features of traffic flow data at longer time intervals, and the near-term component extracts the space-time features of recent data and further learns from the information extracted by the other two components, improving prediction accuracy. (2) Building on the spatial and temporal dimensions of the traditional attention mechanism, the channel dimension is also considered, and a space-time influence attention module is constructed to capture the degree of influence of space-time features, thereby quantifying space-time heterogeneity and further improving prediction accuracy. (3) Multiple attention modules are usually connected in series, so the information obtained by lower-layer modules is poorly visible to higher-layer modules. The invention instead adopts attention connection: the information extracted by a lower-layer attention module is passed into the higher-layer attention module for further learning, improving model performance.
The invention comprises two steps: constructing the attention-based space-time connection enhanced 3DCNN traffic prediction model, and training and testing that model.
Step 1: 3DCNN model construction based on attention mechanism and space-time connection enhancement
Constructing the attention-based space-time connection enhanced 3DCNN model comprises 4 steps: periodic component construction, trend component construction, recent component construction, and component combination;
step 1-1: periodic component building
The periodic component consists of a residual module and a spatial influence attention module;
the residual module comprises two convolution operations for extracting the space-time features of the traffic flow data, each consisting of a 3D convolution, 3D batch normalization and an activation operation; the residual module formula is as follows:
where $f$ is a convolution operation, $X_l$ is the output of the layer-$l$ residual module and also the input of the layer-$(l+1)$ residual module, $X'_l$ is the intermediate value after one convolution operation, and $X_{l+1}$ is the output of the layer-$(l+1)$ residual module;
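A minimal sketch of a residual formula consistent with the definitions of f, X_l, X'_l and X_{l+1} given here. The identity skip connection and its position after the second convolution operation are assumptions based on the usual form of a residual block, not a reproduction of the patent's own formula:

```latex
% Assumption: identity skip connection added after the second convolution operation.
% f denotes one convolution operation (3D convolution, 3D batch normalization, activation).
\begin{aligned}
X'_l    &= f(X_l), \\
X_{l+1} &= f(X'_l) + X_l
\end{aligned}
```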
the features extracted by the residual module are input into the spatial influence attention module to quantify the heterogeneity of the spatial features; the spatial influence attention module comprises pooling, convolution and activation operations, and its formula is as follows:
$S_p = \sigma(\mathrm{conv}(\mathrm{concat}(\mathrm{MP}(X_T), \mathrm{AP}(X_T))))$ [2]
where $X_T$ is the input of the module; MP and AP are the max pooling and average pooling operations, which compress the time dimension and retain only the spatial-dimension features, extracting the salient feature information and the background feature information while quantifying the heterogeneity of the spatial features; concat is matrix fusion, conv is the convolution operation, $\sigma$ is the activation operation, and $S_p$ is the spatial feature heterogeneity matrix derived by the periodic component;
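Formula [2] can be illustrated with a small sketch in plain Python. It assumes a single-channel input of shape T × H × W, a sigmoid as the activation σ, and a 1 × 1 convolution over the two concatenated pooled maps (which reduces to a weighted sum of the two maps). The function name and the weights `w_max`, `w_avg` and `bias` are hypothetical and stand in for learned convolution parameters:

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def spatial_influence_attention(x, w_max=1.0, w_avg=1.0, bias=0.0):
    """x is a nested list of shape [T][H][W]; returns an H x W map with values in (0, 1)."""
    t_len, height, width = len(x), len(x[0]), len(x[0][0])
    s_map = []
    for i in range(height):
        row = []
        for j in range(width):
            series = [x[t][i][j] for t in range(t_len)]
            mp = max(series)            # MP: max pooling over the time dimension
            ap = sum(series) / t_len    # AP: average pooling over the time dimension
            # concat(MP, AP) followed by an assumed 1x1 convolution is a
            # weighted sum of the two pooled values at this grid cell.
            row.append(sigmoid(w_max * mp + w_avg * ap + bias))
        s_map.append(row)
    return s_map


# Toy input: 2 time steps on a 2 x 2 grid.
x = [
    [[0.0, 1.0], [2.0, 3.0]],
    [[4.0, 1.0], [0.0, 3.0]],
]
s_p = spatial_influence_attention(x)
```

Each entry of `s_p` is a weight for one grid cell, so cells with stronger pooled activity receive values closer to 1.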
step 1-2: trend component construction
The trend component consists of a residual module, a spatial influence attention module and a time influence attention module; the space-time features of the traffic flow data are obtained through the residual module, and the feature information is input into the spatial influence attention module and the time influence attention module respectively to quantify the heterogeneity of the spatial features and of the temporal features; the operation of the residual module is the same as in the periodic component;
the spatial influence attention module formula is as follows:
$S_{tr} = \sigma(\mathrm{conv}(\mathrm{concat}(\mathrm{MP}(X_T), \mathrm{AP}(X_T))))$ [3]
where $S_{tr}$ is the spatial feature heterogeneity matrix derived by the trend component;
the time influence attention module formula is as follows:
$T_{tr} = \sigma(\mathrm{conv}(\mathrm{concat}(\zeta(\mathrm{MP}(X_T)), \zeta(\mathrm{AP}(X_T)))))$ [4]
as with the spatial influence attention module, the max pooling and average pooling operations are used to quantify the heterogeneity of the temporal features, finally yielding the temporal feature heterogeneity matrix $T_{tr}$; $\zeta$ represents an error back propagation operation;
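The temporal counterpart in formula [4] can be sketched the same way. This sketch assumes a single-channel input of shape T × H × W, pooling over the spatial dimensions so that one value per time step remains, a sigmoid activation, and a 1 × 1 convolution written as a weighted sum. The operation ζ is omitted, and the function name and weights are hypothetical:

```python
import math


def time_influence_attention(x, w_max=1.0, w_avg=1.0, bias=0.0):
    """x is a nested list of shape [T][H][W]; returns T weights with values in (0, 1)."""
    weights = []
    for frame in x:
        cells = [v for row in frame for v in row]
        mp = max(cells)                 # MP: max pooling over the spatial dimensions
        ap = sum(cells) / len(cells)    # AP: average pooling over the spatial dimensions
        z = w_max * mp + w_avg * ap + bias   # assumed 1x1 convolution on concat(MP, AP)
        weights.append(1.0 / (1.0 + math.exp(-z)))
    return weights


# Toy input: a quiet time step followed by a busy one, on a 2 x 2 grid.
x = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[4.0, 2.0], [2.0, 0.0]],
]
t_tr = time_influence_attention(x)
```

The busy time step receives the larger weight, which is the sense in which the module quantifies heterogeneity across time points.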
step 1-3: recent component build
The near-term component consists of a residual module, a spatial influence attention module and a time influence attention module; the component adopts an attention connection mechanism to enhance the performance of the attention modules, and the spatial influence attention module formula is as follows:
where the formula involves the spatial feature heterogeneity matrix extracted by the layer-$l$ spatial influence attention module, the input of the layer-$(l+1)$ spatial influence attention module, and the spatial feature heterogeneity matrix preliminarily extracted by the layer-$(l+1)$ spatial influence attention module; $w_1$ and $w_2$ are custom parameters initialized to 0; the heterogeneity matrix extracted by the lower-layer attention module is multiplied element-wise by the input of the higher-layer spatial influence attention module, the heterogeneity features of the lower layer are extracted by pooling and convolution operations, and the result is combined with the heterogeneity matrix preliminarily extracted by the current layer to obtain the spatial feature heterogeneity matrix of that layer;
the time influence attention module formula is as follows:
where the formula involves the temporal feature heterogeneity matrix extracted by the layer-$l$ time influence attention module, the input of the layer-$(l+1)$ time influence attention module, and the temporal feature heterogeneity preliminarily extracted by the layer-$(l+1)$ time influence attention module; $q_1$ and $q_2$ are custom parameters initialized to 0; the output is the temporal feature heterogeneity matrix of the layer-$(l+1)$ time influence attention module;
after the spatial feature heterogeneity matrix and the temporal feature heterogeneity matrix are obtained, they are multiplied by the input, and the result is used as the input of the next-layer residual module; the formula is as follows:
the final result obtained by the near-term component is as follows:
where $L_m$ is the maximum number of residual module layers, $Y$ is the result of the near-term component, i.e. the result of the prediction model, and res denotes a residual layer;
step 1-4: assembly combination
The results of the periodic component and the trend component are input into the near-term component for further learning, as follows:
where the formula combines the heterogeneity matrices quantified by the first-layer spatial influence attention module and time influence attention module of the near-term component with $S_{tr}$ and $T_{tr}$, the spatial and temporal feature heterogeneity matrices quantified by the trend component, and $S_p$, the spatial feature heterogeneity matrix obtained by the periodic component;
after combination, the near-term component produces the final prediction result; the steps of the near-term component are given in step 1-3;
step 2: STC3DCNN model training and testing
The training and testing of the STC3DCNN model comprises two steps: model training and model testing;
step 2-1: model training
The traffic flow data is divided into a training set and a test set; the training set data is sorted in chronological order, and in each training round part of the data is extracted as input according to the requirements of each component: the periodic component takes periodic data at m time points as input, the near-term component takes the n most recent time points as input, and the trend component takes n time points one week apart from the recent data as input;
the Adam optimization algorithm is used to adjust the parameters during model training; training stops when the number of training rounds reaches the preset value k;
step 2-2: model testing
The trained model is used to predict on the test set, and the predicted values are compared with the real observed values to obtain the prediction accuracy.
The evaluation indexes of the model performance are as follows, where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value and $N$ is the number of values compared:
mean absolute error (MAE): the average of the absolute errors between the actual and predicted values, $\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} |y_i - \hat{y}_i|$;
mean absolute percentage error (MAPE): the average of the absolute percentage errors between the actual and predicted values, $\mathrm{MAPE} = \frac{1}{N}\sum_{i=1}^{N} \left|\frac{y_i - \hat{y}_i}{y_i}\right|$;
root mean square error (RMSE): the arithmetic square root of the mean square error between the actual and predicted values, $\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}$;
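The three evaluation indexes can be computed as in the following sketch, which uses the standard definitions of MAE, MAPE and RMSE on flat lists of values. The function names and the toy numbers are illustrative only, and MAPE is returned as a fraction and assumes that no actual value is zero:

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction; actual values must be non-zero."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Toy example with three observations.
actual = [10.0, 20.0, 40.0]
predicted = [12.0, 18.0, 40.0]
scores = (mae(actual, predicted), mape(actual, predicted), rmse(actual, predicted))
```

In practice, grid cells with zero flow would need to be masked or thresholded before computing MAPE; how the patent handles them is not stated.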
After the three evaluation indexes are obtained for the model, they are compared with those of current mainstream models; the comparison shows the model's performance to be in the leading position.
With this method, the attention-based space-time connection enhanced 3DCNN traffic prediction model uses 3D convolution and can therefore extract the space-time features of traffic flow data more effectively without increasing the complexity of the model structure. A high-precision prediction result can be obtained from the trained model parameters. The invention also has the following characteristics:
1) The residual module is built from three-dimensional convolutions. A single three-dimensional convolution captures the temporal and spatial features of traffic flow data simultaneously without increasing the complexity of the model structure, and because the residual module stacks several three-dimensional convolutions, the model can extract space-time features over a larger range and take more urban regions and more time points into account.
2) Most current research does not consider the heterogeneity of space-time features, i.e. the fact that the features of each urban region and each time point differ. The invention constructs a space-time influence attention module to quantify this heterogeneity.
3) Traffic flow data is huge in scale, with training sets generally in the millions or more. Faced with data of this magnitude, attention modules arranged as in the prior art cannot accurately analyze the features of all the data, so the invention uses attention connection to improve the performance of the attention modules, further improving the prediction accuracy for traffic flow data.
Drawings
FIG. 1 is a general flow chart of the present invention;
fig. 2 is a network configuration diagram of the present invention.
Detailed Description
As shown in FIG. 1, the invention comprises two steps: constructing the attention-based space-time connection enhanced 3DCNN traffic prediction model, and training and testing that model.
Step 1: space-time connection enhanced 3DCNN traffic prediction model construction based on attention mechanism (step 1 in FIG. 1)
As shown in step1 in fig. 1, the space-time connection enhanced 3DCNN traffic prediction model construction based on the attention mechanism includes 4 steps: periodic component construction, trend component construction, recent component construction, and component combination.
Step 1-1: periodic component building
As shown in step1 of FIG. 2, the periodic component is composed of one residual module layer and one spatial influence attention module layer.
The residual module contains two convolution operations for extracting the space-time features of the traffic flow data, each consisting of a 3D convolution (conv), 3D batch normalization (BN), and an activation operation. The residual module formula is as follows:
where $\sigma$ is the activation operation, $X_l$ is the output of the layer-$l$ residual module and also the input of the layer-$(l+1)$ residual module, $X'_l$ is the intermediate value after one convolution operation, and $X_{l+1}$ is the output of the layer-$(l+1)$ residual module.
The features extracted by the residual module are input into the spatial influence attention module to quantify the heterogeneity of the spatial features. The spatial influence attention module comprises pooling, convolution and activation operations, and its formula is as follows:
where $X_T$ is the input of the module; MP and AP are the max pooling and average pooling operations, which compress the time dimension, retain only the spatial-dimension features, and extract the salient feature information and the background feature information; the products in the formula are element-wise multiplications. $X_{m(s)}$ and $X_{a(s)}$ are the feature information after the two pooling operations, and their per-channel components are the spatial features of each channel, where $k$ is the channel index and $C_m$ is the maximum number of channels. Two custom matrices (initialized to 0) are multiplied element-wise with these per-channel spatial features to quantify the spatial feature heterogeneity in each channel; concat is matrix fusion, and $S_p$ is the spatial feature heterogeneity matrix derived by the periodic component.
Step 1-2: trend component construction
As shown in step2 of FIG. 2, the trend component is composed of two layers of residual modules, one layer of spatial influence attention module, and one layer of temporal influence attention module. And obtaining space-time characteristics of traffic flow data through a residual error module, and respectively inputting the characteristic information into a space influence attention module and a time influence attention module to quantify the heterogeneity of the space characteristics and the heterogeneity of the time characteristics. The operation of the residual module is consistent with the periodic component and will not be described in detail here.
The spatial impact attention module formula is as follows:
where $S_{tr}$ is the spatial feature heterogeneity matrix derived by the trend component.
The time influence attention module formula is as follows:
where $X_{m(tr)}$ and $X_{a(tr)}$ are the information produced by the max pooling and average pooling operations, which reduce the spatial dimensions and retain only the time-dimension features. As in the spatial influence attention module, the per-channel features are multiplied element-wise by custom matrices to quantify the heterogeneity of the temporal features, finally yielding the temporal feature heterogeneity matrix $T_{tr}$.
Step 1-3: recent component build
As shown in step3 of fig. 2, the near term component is composed of a three-layer residual module, a three-layer spatial influence attention module, and a three-layer temporal influence attention module. The component adopts an attention connection mechanism to enhance the performance of an attention module, and a space influence attention module formula is as follows:
where the formula involves the spatial feature heterogeneity information extracted by the layer-$l$ spatial influence attention module, the input of the layer-$(l+1)$ spatial influence attention module, and the spatial feature heterogeneity preliminarily extracted by the layer-$(l+1)$ spatial influence attention module; $w_1$ and $w_2$ are custom parameters initialized to 0. The heterogeneity information extracted by the lower-layer attention module is multiplied by the input of the higher-layer spatial influence attention module, the heterogeneity features of the lower layer are extracted by pooling and convolution operations, and the result is combined with the heterogeneity preliminarily extracted by the current layer to give the heterogeneity of that layer.
The time influence attention module formula is as follows:
where the formula involves the temporal feature heterogeneity information extracted by the layer-$l$ time influence attention module, the input of the layer-$(l+1)$ time influence attention module, and the temporal feature heterogeneity preliminarily extracted by the layer-$(l+1)$ time influence attention module; $q_1$ and $q_2$ are custom parameters initialized to 0. The output is the temporal feature heterogeneity matrix of the layer-$(l+1)$ time influence attention module.
After the spatial characteristic heterogeneity and the temporal characteristic heterogeneity are obtained, multiplying the spatial characteristic heterogeneity and the temporal characteristic heterogeneity by an input to be used as the input of a residual error module of the next layer, and the formula is as follows:
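A minimal sketch of this reweighting step, assuming that the spatial map (H × W) and the temporal weights (length T) are both broadcast over a single-channel input of shape T × H × W and multiplied element-wise. The exact combination used in the patent's formula is not recoverable from this text, so the function below is a hypothetical illustration only:

```python
def apply_attention(x, s_map, t_weights):
    """Scale x[t][i][j] by the spatial weight s_map[i][j] and the temporal weight t_weights[t]."""
    return [
        [
            [x[t][i][j] * s_map[i][j] * t_weights[t] for j in range(len(x[0][0]))]
            for i in range(len(x[0]))
        ]
        for t in range(len(x))
    ]


# Toy input: T = 2 time steps on a 1 x 2 grid.
x = [[[1.0, 2.0]], [[3.0, 4.0]]]
s_map = [[0.5, 1.0]]      # spatial feature heterogeneity (one weight per grid cell)
t_weights = [1.0, 0.5]    # temporal feature heterogeneity (one weight per time step)
next_input = apply_attention(x, s_map, t_weights)
```

The result keeps the shape of the input, so it can be passed directly to the next residual module layer.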
the final results obtained by the near term components are as follows:
After processing by the three layers of residual modules and spatial and time influence attention modules, the prediction result finally obtained by the STC3DCNN model is Y.
Step 1-4: assembly combination
As shown in step4 of FIG. 2, the results of the periodic component and the trend component are input into the near-term component for further learning, as follows:
where the formula combines the heterogeneity quantified by the first-layer spatial influence attention module and time influence attention module of the near-term component with $S_{tr}$ and $T_{tr}$, the spatial and temporal feature heterogeneity quantified by the trend component, and $S_p$, the spatial feature heterogeneity obtained by the periodic component. $u_1$, $u_2$ and $u_3$ are custom variables initialized to 0.
After combination, the near-term component produces the final prediction result; the steps of the near-term component are given in step 1-3.
Step 2: space-time connection enhanced 3DCNN traffic prediction model training and testing based on attention mechanism
As shown in step2 in fig. 1, the training and testing of the space-time connection enhanced 3DCNN traffic prediction model based on the attention mechanism is divided into two steps: model training and model testing.
Step 2-1: model training
The data set consists of New York City taxi trip data beginning on January 1, 2015. The data set divides New York City into a 10 × 20 grid map, with data collected every half hour. The data is of two types, inflow and outflow, so the data at each time step has the form $\mathbb{R}^{10\times 20\times 2}$. The data is ordered by time step to form three-dimensional data, so each component takes input of the form $\mathbb{R}^{T\times 10\times 20\times 2}$, where $T$ is the total number of time steps.
The data of the first 40 days in the data set is used as the training set, and in each training round part of the data is extracted as input according to the requirements of each component. The periodic component takes periodic data at 4 time points as input (T = 4), the near-term component takes the 16 most recent time points as input (T = 16), and the trend component takes 16 time points one week apart from the recent data as input (T = 16).
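The selection of input time steps for the three components can be sketched as follows. The half-hour sampling interval gives 48 time steps per day. The sketch assumes that the trend window is the recent window shifted back by exactly one week and that the periodic data is taken at the same time of day on the preceding days; the patent text does not spell out either detail, and the function name is hypothetical:

```python
STEPS_PER_DAY = 48                   # one time step every half hour
STEPS_PER_WEEK = 7 * STEPS_PER_DAY


def component_indices(target, n_recent=16, n_trend=16, m_period=4):
    """Return the time-step indices fed to each component when predicting step `target`."""
    recent = list(range(target - n_recent, target))
    # Assumption: the trend window is the recent window shifted back by one week.
    trend = [i - STEPS_PER_WEEK for i in recent][-n_trend:]
    # Assumption: the periodic data is the same time of day on the previous m_period days.
    period = [target - d * STEPS_PER_DAY for d in range(m_period, 0, -1)]
    if min(recent + trend + period) < 0:
        raise ValueError("not enough history before the target time step")
    return {"recent": recent, "trend": trend, "period": period}


idx = component_indices(target=400)
```

For a target step of 400, the recent window covers steps 384 to 399, the trend window covers steps 48 to 63, and the periodic input uses steps 208, 256, 304 and 352.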
During model training the parameters are adjusted with the Adam optimization algorithm, with the first-order exponential decay rate set to 0.9, the second-order exponential decay rate set to 0.999, the learning rate set to 1e-3, and the weight decay set to 0.005. Training stops when the number of training rounds reaches the preset value of 600.
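To show how these hyperparameters enter the update, the following sketch applies the standard Adam update rule to a single scalar parameter. The toy loss, the value of `eps`, and the treatment of weight decay as an L2 term added to the gradient are assumptions; the patent does not state how weight decay is applied:

```python
import math


def adam_step(param, grad, m, v, step, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=0.005):
    """One Adam update for a scalar parameter; returns the new (param, m, v)."""
    grad = grad + weight_decay * param           # assumption: weight decay as an L2 penalty
    m = beta1 * m + (1 - beta1) * grad           # first-moment estimate (decay rate 0.9)
    v = beta2 * v + (1 - beta2) * grad * grad    # second-moment estimate (decay rate 0.999)
    m_hat = m / (1 - beta1 ** step)              # bias correction
    v_hat = v / (1 - beta2 ** step)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Toy problem: minimise (p - 3)^2 for 600 update steps.
p, m, v = 0.0, 0.0, 0.0
for step in range(1, 601):
    grad = 2.0 * (p - 3.0)
    p, m, v = adam_step(p, grad, m, v, step)
```

Because Adam normalises the gradient, each update moves the parameter by roughly the learning rate, so after 600 steps at 1e-3 the parameter has moved only part of the way towards the minimum.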
Step 2-2: model testing
The data of the first 40 days in the data set is used as a test set; the trained model is used to predict on the test set, and the predicted values are compared with the actual observed values to obtain the prediction accuracy. MAE, MAPE and RMSE are adopted as the evaluation indexes of model performance. After the three evaluation indexes are obtained, they are compared with those of current mainstream models. The STC3DCNN model proposed by the invention leads on most indexes, with MAE, MAPE and RMSE values of 6.12, 0.36 and 15.05 respectively, and the experimental results show that the STC3DCNN model is effective.
Claims (2)
1. The space-time connection enhanced 3DCNN traffic prediction method based on the attention mechanism is characterized by comprising the following steps of:
step 1: 3DCNN model construction based on attention mechanism and space-time connection enhancement
Constructing the attention-based space-time connection enhanced 3DCNN model comprises 4 steps: periodic component construction, trend component construction, recent component construction, and component combination;
step 1-1: periodic component building
The periodic component consists of a residual module and a spatial influence attention module;
the residual module comprises two convolution operations for extracting the space-time features of traffic flow data, each convolution operation consisting of a 3D convolution, 3D batch normalization and an activation operation; the residual module formula is as follows:
where f is a convolution operation, $X_l$ is the output of the layer-$l$ residual module and the input of the layer-$(l+1)$ residual module, $X'_l$ is the intermediate value after one convolution operation, and $X_{l+1}$ is the output of the layer-$(l+1)$ residual module;
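One reading consistent with these definitions, assuming a standard identity skip connection around the two convolution operations, is sketched below. The placement of the skip term is an assumption and is not confirmed by the text.

```latex
X'_l = f(X_l), \qquad X_{l+1} = f(X'_l) + X_l
```

Under this reading each $f$ stands for one 3D convolution followed by 3D batch normalization and an activation.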
the features extracted by the residual module are input into the spatial influence attention module to quantify the heterogeneity of the spatial features; the spatial influence attention module comprises pooling, convolution and activation operations, and its formula is as follows:
$$S_p = \sigma(\mathrm{conv}(\mathrm{concat}(\mathrm{MP}(X_T), \mathrm{AP}(X_T)))) \tag{2}$$
where $X_T$ is the input of the module; MP and AP are the max pooling and average pooling operations, which compress the time dimension and preserve only the spatial-dimension features, extracting salient feature information and background feature information to quantify the heterogeneity of the spatial features; concat is matrix fusion, conv is a convolution operation, $\sigma$ is an activation operation, and $S_p$ is the spatial feature heterogeneity matrix obtained by the periodic component;
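The spatial influence attention computation can be illustrated in plain Python on a small `[t][h][w]` nested list. This is a sketch under assumptions: the function name `spatial_attention` is hypothetical, the activation σ is taken to be a sigmoid, and the convolution is reduced to a 1x1 convolution over the two pooled channels with illustrative weights `w_max`, `w_avg` and `bias`; the text does not specify the kernel size or the activation.

```python
import math


def spatial_attention(x, w_max=1.0, w_avg=1.0, bias=0.0):
    """x is indexed [t][h][w]; returns an [h][w] map with values in (0, 1)."""
    n_t, n_h, n_w = len(x), len(x[0]), len(x[0][0])
    out = []
    for i in range(n_h):
        row = []
        for j in range(n_w):
            series = [x[k][i][j] for k in range(n_t)]
            mp = max(series)          # MP: max pooling over the time axis
            ap = sum(series) / n_t    # AP: average pooling over the time axis
            # concat + conv, reduced here to an assumed 1x1 convolution
            # over the two pooled channels.
            z = w_max * mp + w_avg * ap + bias
            row.append(1.0 / (1.0 + math.exp(-z)))  # assumed sigmoid activation
        out.append(row)
    return out


# Two time steps on a 1 x 2 grid.
s = spatial_attention([[[0.0, 1.0]], [[2.0, 1.0]]])
```

The time axis is collapsed by both poolings, so the result has one weight per grid cell, which is the shape a spatial heterogeneity matrix needs.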
step 1-2: trend component construction
The trend component consists of a residual module, a spatial influence attention module and a temporal influence attention module; the space-time features of traffic flow data are obtained by the residual module and are input separately into the spatial influence attention module and the temporal influence attention module to quantify the heterogeneity of the spatial features and of the temporal features; the residual module operates as in the periodic component;
the spatial influence attention module formula is as follows:
$$S_{tr} = \sigma(\mathrm{conv}(\mathrm{concat}(\mathrm{MP}(X_T), \mathrm{AP}(X_T)))) \tag{3}$$
where $S_{tr}$ is the spatial feature heterogeneity matrix obtained by the trend component;
the temporal influence attention module formula is as follows:
$$T_{tr} = \sigma(\mathrm{conv}(\mathrm{concat}(\zeta(\mathrm{MP}(X_T)), \zeta(\mathrm{AP}(X_T))))) \tag{4}$$
as in the spatial influence attention module, the max pooling and average pooling operations are used to quantify the heterogeneity of the temporal features, finally yielding the temporal feature heterogeneity matrix $T_{tr}$; $\zeta$ denotes an error back-propagation operation;
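The temporal counterpart can be sketched the same way, pooling over the spatial axes so that one value per time step remains. The function name `temporal_attention` is hypothetical, and several points are assumptions: the operation ζ is passed in as a caller-supplied function (identity by default, because the text does not define its form), σ is taken to be a sigmoid, and the convolution is reduced to a 1x1 convolution with illustrative weights.

```python
import math


def temporal_attention(x, zeta=lambda values: values,
                       w_max=1.0, w_avg=1.0, bias=0.0):
    """x is indexed [t][h][w]; returns one weight in (0, 1) per time step."""
    mp, ap = [], []
    for frame in x:
        cells = [c for row in frame for c in row]
        mp.append(max(cells))               # MP: max pooling over space
        ap.append(sum(cells) / len(cells))  # AP: average pooling over space
    # zeta is a placeholder for the operation the text calls zeta.
    mp, ap = zeta(mp), zeta(ap)
    # concat + conv (assumed 1x1) followed by an assumed sigmoid activation.
    return [1.0 / (1.0 + math.exp(-(w_max * a + w_avg * b + bias)))
            for a, b in zip(mp, ap)]


# Two time steps on a 1 x 2 grid.
t_weights = temporal_attention([[[0.0, 2.0]], [[1.0, 1.0]]])
```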
step 1-3: recent component construction
The recent component consists of a residual module, a spatial influence attention module and a temporal influence attention module; the component adopts an attention connection mechanism to enhance the performance of the attention modules, and the spatial influence attention module formula is as follows:
where $S_c^{l}$ is the spatial feature heterogeneity matrix extracted by the layer-$l$ spatial influence attention module; the formula also takes the input of the layer-$(l+1)$ spatial influence attention module; $S'^{\,l+1}_c$ is the spatial feature heterogeneity matrix preliminarily extracted by the layer-$(l+1)$ spatial influence attention module; and $w_1$ and $w_2$ are custom parameters initialized to 0. The heterogeneity matrix $S_c^{l}$ extracted by the lower-layer attention module is multiplied element-wise with the input of the higher-layer spatial influence attention module, the lower-layer heterogeneity features are extracted by pooling and convolution operations, and the result is combined with the preliminarily extracted feature heterogeneity matrix $S'^{\,l+1}_c$ to obtain the spatial feature heterogeneity matrix $S_c^{l+1}$ of this layer;
the temporal influence attention module formula is as follows:
where $T_c^{l}$ is the temporal feature heterogeneity matrix extracted by the layer-$l$ temporal influence attention module; the formula also takes the input of the layer-$(l+1)$ temporal influence attention module; $T'^{\,l+1}_c$ is the temporal feature heterogeneity matrix preliminarily extracted by the layer-$(l+1)$ temporal influence attention module; $q_1$ and $q_2$ are custom parameters initialized to 0; and $T_c^{l+1}$ is the temporal feature heterogeneity matrix output by the layer-$(l+1)$ temporal influence attention module;
after the spatial feature heterogeneity matrix and the temporal feature heterogeneity matrix are obtained, they are multiplied with the input, and the product is used as the input of the next-layer residual module; the formula is as follows:
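This re-weighting step can be sketched in plain Python, assuming the spatial matrix holds one weight per grid cell, the temporal matrix holds one weight per time step, and both are applied to the input by element-wise multiplication with broadcasting. The function name `apply_heterogeneity` and the `[t][h][w]` nested-list layout are illustrative choices, not taken from the text.

```python
def apply_heterogeneity(x, s, t):
    """Scale x[k][i][j] by the spatial weight s[i][j] and the temporal weight t[k]."""
    return [
        [[x[k][i][j] * s[i][j] * t[k] for j in range(len(x[k][i]))]
         for i in range(len(x[k]))]
        for k in range(len(x))
    ]


# Two time steps on a 1 x 2 grid.
x = [[[1.0, 2.0]], [[3.0, 4.0]]]
weighted = apply_heterogeneity(x, s=[[0.5, 1.0]], t=[1.0, 0.5])
```

The output keeps the shape of the input, so it can be fed directly to the next residual layer.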
the final result obtained by the recent component is as follows:
where $L_m$ is the maximum number of layers of the residual module, $Y$ is the result of the recent component, i.e., the result of the prediction model, and res denotes a residual layer;
step 1-4: component combination
The results of the periodic component and the trend component are input into the recent component for further learning, as follows:
where $S_c^{1}$ and $T_c^{1}$ are the heterogeneity matrices quantified by the first-layer spatial influence attention module and temporal influence attention module of the recent component, $S_{tr}$ and $T_{tr}$ are the spatial feature and temporal feature heterogeneity matrices quantified by the trend component, and $S_p$ is the spatial feature heterogeneity matrix obtained by the periodic component;
after combination, the final prediction result is obtained by the recent component, whose steps are given in step 1-3;
step 2: STC3DCNN model training and testing
The training and testing of the STC3DCNN model comprises two steps: model training and model testing;
step 2-1: model training
The traffic flow data is divided into a training set and a test set, the training set data is sorted in chronological order, and in each training round part of the data is extracted as input according to the requirements of each component; the periodic component extracts periodic data at m time points as input, the recent component extracts n time points of recent data as input, and the trend component extracts n time points that are one week apart from the recent data as input;
the Adam optimization algorithm is adopted to adjust the parameters during model training; when the number of training rounds reaches the preset value k, the model stops training;
step 2-2: model testing
The trained model is used for prediction on the test set, and the predicted values are compared with the actual observed values to obtain the prediction accuracy.
2. The space-time connection enhanced 3DCNN traffic prediction method based on an attention mechanism according to claim 1, wherein the performance evaluation indexes of the model test are as follows:
mean absolute error MAE: the average of the absolute errors between the actual and predicted values, with the formula:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$
where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $n$ is the number of samples;
mean absolute percentage error MAPE: the average of the absolute percentage errors between the actual and predicted values, with the formula:
$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$$
root mean square error RMSE: the arithmetic square root of the mean square error between the actual and predicted values, with the formula:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
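The three indexes follow directly from their definitions and can be computed in plain Python. The function names `mae`, `mape` and `rmse` are illustrative, and the MAPE is returned as a fraction rather than a percentage, which is an assumption about the intended scale; actual values must be nonzero for the MAPE to be defined.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction; y_true must be nonzero."""
    return sum(abs((y - p) / y) for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(
        sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)
    )


# Toy values for illustration only; these are not results from the patent.
actual = [10.0, 20.0, 40.0]
predicted = [12.0, 18.0, 44.0]
scores = (mae(actual, predicted), mape(actual, predicted), rmse(actual, predicted))
```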
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110801710.4A CN113657645B (en) | 2021-07-15 | 2021-07-15 | Space-time connection enhanced 3DCNN traffic prediction method based on attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113657645A (en) | 2021-11-16 |
CN113657645B (en) | 2023-09-26 |
Family
ID=78489507
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110801710.4A Active CN113657645B (en) | 2021-07-15 | 2021-07-15 | Space-time connection enhanced 3DCNN traffic prediction method based on attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113657645B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115482666B (en) * | 2022-09-13 | 2024-05-07 | Hangzhou Dianzi University | Multi-graph convolutional neural network traffic prediction method based on data fusion |
CN117831301B (en) * | 2024-03-05 | 2024-05-07 | Southwest Forestry University | Traffic flow prediction method combining three-dimensional residual convolutional neural network and space-time attention mechanism |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111210633A (en) * | 2020-02-09 | 2020-05-29 | Beijing University of Technology | Short-term traffic flow prediction method based on deep learning |
CN112801404A (en) * | 2021-02-14 | 2021-05-14 | Beijing University of Technology | Traffic prediction method based on self-adaptive spatial self-attention-seeking convolution |
Non-Patent Citations (1)
Title |
---|
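Short-term traffic flow prediction based on a combined model of a convolutional neural network and a bidirectional long short-term memory network; Xu Xianfeng et al.; Industrial Instrumentation & Automation (No. 1); pp. 13-18 * |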
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN109285346B (en) | Urban road network traffic state prediction method based on key road sections | |
CN110570651B (en) | Road network traffic situation prediction method and system based on deep learning | |
CN111210633B (en) | Short-term traffic flow prediction method based on deep learning | |
WO2023056696A1 (en) | Urban rail transit short-term passenger flow forecasting method based on recurrent neural network | |
CN108197739B (en) | Urban rail transit passenger flow prediction method | |
CN113657645B (en) | Space-time connection enhanced 3DCNN traffic prediction method based on attention mechanism | |
CN102074124B (en) | Dynamic bus arrival time prediction method based on support vector machine (SVM) and H-infinity filtering | |
CN110322695A (en) | A kind of Short-time Traffic Flow Forecasting Methods based on deep learning | |
CN108597227A (en) | Road traffic flow forecasting method under freeway toll station | |
CN110310479A (en) | A kind of Forecast of Urban Traffic Flow forecasting system and method | |
CN112863182B (en) | Cross-modal data prediction method based on transfer learning | |
CN106355905A (en) | Control method for overhead signal based on checkpoint data | |
CN112966853A (en) | Urban road network short-term traffic flow prediction method based on space-time residual error mixed model | |
CN111462492B (en) | Key road section detection method based on Rich flow | |
CN112884014A (en) | Traffic speed short-time prediction method based on road section topological structure classification | |
CN104916124B (en) | Public bicycle system regulation and control method based on Markov model | |
CN109377761A (en) | Traffic factor network establishing method based on Markov-chain model | |
CN112633602A (en) | Traffic congestion index prediction method and device based on GIS map information | |
CN115828121A (en) | Traffic flow prediction method based on adjacent DBSCAN fusion time-varying multi-graph volume network | |
CN110070720B (en) | Calculation method for improving fitting degree of traffic capacity model of intersection road occupation construction area | |
CN116913088A (en) | Intelligent flow prediction method for expressway | |
CN111341109A (en) | City-level signal recommendation system based on space-time similarity | |
CN112950926A (en) | Urban trunk road speed prediction method based on big data and deep learning | |
CN115966107A (en) | Airport traffic flow prediction method based on graph neural network | |
CN103020733A (en) | Method and system for predicting single flight noise of airport based on weight |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |