CN113657645B - Space-time connection enhanced 3DCNN traffic prediction method based on attention mechanism

Info

Publication number: CN113657645B
Application number: CN202110801710.4A
Authority: CN
Inventors: 王兴起; 赵一鸣; 邵艳利; 方景龙; 魏丹; 陈滨
Original assignee: Hangzhou Dianzi University
Current assignee: Hangzhou Dianzi University
Priority date: 2021-07-15
Filing date: 2021-07-15
Publication date: 2023-09-26
Anticipated expiration: 2041-07-15
Also published as: CN113657645A

Abstract

The invention discloses an attention-based, space-time connection enhanced 3DCNN traffic prediction method. It designs a space-time connection enhanced 3DCNN model framework consisting of a periodic component, a trend component and a near-term component, wherein the periodic component and the trend component extract the space-time features of traffic flow data at longer time intervals, and the near-term component extracts the space-time features of recent data and further learns from the information extracted by the other two components to improve prediction accuracy. On top of the space-time dimensions of the traditional attention mechanism, the channel dimension is also taken into account, and a space-time influence attention module is constructed to capture the degree of influence of the space-time features, thereby quantifying space-time heterogeneity and further improving prediction accuracy. The multiple attention modules are connected so that the information obtained by lower-layer modules becomes more visible to higher-layer modules. The invention further improves the prediction accuracy for traffic flow data.

Description

Space-time connection enhanced 3DCNN traffic prediction method based on attention mechanism

Technical Field

The invention belongs to the field of traffic flow prediction, and relates to a space-time connection enhanced 3DCNN traffic flow prediction model based on an attention mechanism.

Background

With accelerating urbanization and rising living standards, the number of automobiles in cities has grown rapidly. While this makes daily life faster and more convenient, it also brings serious environmental pollution, traffic congestion and other problems. Especially during the morning and evening commuting peaks, and during severe weather, large gatherings or statutory holidays, urban arterial roads are often heavily congested, so that road capacity is greatly reduced, travel becomes difficult, traffic management departments face great challenges, and the operating efficiency of the city declines. Intelligent Transportation System (ITS) construction aims to improve traffic operating conditions and efficiency and to relieve traffic congestion. Accurate and efficient traffic flow prediction is an important link in solving the congestion problem in ITS. It has important guiding significance for urban road planning, travel route guidance for residents, and the management and traffic dispersion work of traffic departments, and it has become both a research hotspot and a difficult problem in the intelligent transportation field.

Over the past decades, many scholars have conducted extensive research in the field of traffic prediction and achieved considerable results. Initially, traffic flow prediction was treated as a time series prediction problem, and research generally used models such as the Autoregressive Integrated Moving Average (ARIMA) model or Recurrent Neural Networks (RNN) as a framework. However, these models consider only temporal features and neglect the influence of spatial information from neighbouring areas, so they do not adequately model traffic flow data. To learn the spatial information in traffic flow data, Convolutional Neural Networks (CNN) began to be applied in the traffic prediction field, but their weak ability to extract temporal information in turn leaves the model unable to learn temporal features adequately. For this reason, more and more studies combine models with different characteristics, such as RNN and CNN, to extract and process space-time feature information comprehensively, but this also brings problems such as complex model structures and difficult optimization. Therefore, how to use a structurally simple model to extract space-time features effectively without harming prediction accuracy remains a key and difficult point in the traffic prediction field. In addition, most studies neglect the heterogeneity of space-time features in traffic flow data, and how to identify and quantify the heterogeneity of space-time contributions is another problem the field needs to consider.

Disclosure of Invention

To address the shortcomings of the prior art, the invention provides an attention-based, space-time connection enhanced 3DCNN traffic prediction model (STC3DCNN) that predicts large-scale traffic flow data efficiently and accurately, thereby providing a basis for intelligent traffic control and guidance. The main content is as follows: (1) An STC3DCNN model framework consisting of a periodic component, a trend component and a near-term component is designed, wherein the periodic component and the trend component extract the space-time features of traffic flow data at longer time intervals, and the near-term component extracts the space-time features of recent data and further learns from the information extracted by the other two components to improve prediction accuracy. (2) On top of the space-time dimensions of the traditional attention mechanism, the channel dimension is also considered, and a space-time influence attention module is constructed to capture the degree of influence of the space-time features, thereby quantifying space-time heterogeneity and further improving prediction accuracy. (3) Multiple attention modules are usually connected in series, so the information obtained by lower-layer modules is not very visible to higher-layer modules. The invention adopts attention connections, i.e., the information extracted by a lower-layer attention module is passed into the higher-layer attention module for further learning, which improves model performance.

The invention comprises two steps: constructing the attention-based, space-time connection enhanced 3DCNN traffic prediction model, and training and testing that model.

Step 1: 3DCNN model construction based on attention mechanism and space-time connection enhancement

The 3DCNN model construction of the spatiotemporal connection enhancement based on the attention mechanism comprises 4 steps: periodic component construction, trend component construction, recent component construction, and component combination;

step 1-1: periodic component building

The periodic component consists of a residual module and a spatial influence attention module;

the residual module comprises two convolution operations for extracting the space-time features of the traffic flow data, wherein each convolution operation comprises a 3D convolution, 3D batch normalization and an activation operation; the residual module formula is as follows:

where $f$ is a convolution operation, $X^{l}$ is the output of the layer-$l$ residual module and also the input of the layer-$(l+1)$ residual module, $X'^{l}$ is the intermediate value after one convolution operation, and $X^{l+1}$ is the output of the layer-$(l+1)$ residual module;

the features extracted by the residual module are input into the spatial influence attention module to quantify the heterogeneity of the spatial features, wherein the spatial influence attention module comprises pooling, convolution and activation operations; the formula of the spatial influence attention module is as follows:

$$S_p = \sigma(\mathrm{conv}(\mathrm{concat}(\mathrm{MP}(X_T), \mathrm{AP}(X_T)))) \qquad [2]$$

wherein $X_T$ is the input of the module; MP and AP are the max pooling and average pooling operations, which compress the time dimension and retain only the spatial-dimension features, extracting the salient feature information and the background feature information while quantifying the heterogeneity of the spatial features; concat is matrix concatenation, conv is a convolution operation, $\sigma$ is the activation operation, and $S_p$ is the spatial feature heterogeneity matrix obtained by the periodic component;
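Equation [2] can be illustrated with a small pure-Python sketch. It is not the patented implementation: it assumes a single-channel input stored as a nested list of shape [T][H][W], stands in for the convolution with a 1×1 convolution (a weighted sum of the two pooled maps using the hypothetical weights `w_max`, `w_avg` and `bias`), and takes the sigmoid as the activation σ.

```python
import math


def sigmoid(value):
    """Logistic activation, used here as the sigma of equation [2]."""
    return 1.0 / (1.0 + math.exp(-value))


def spatial_attention(x, w_max=1.0, w_avg=1.0, bias=0.0):
    """Return an H x W attention map for x given as a nested list [T][H][W].

    The time axis is compressed by max pooling (MP) and average pooling (AP);
    the two pooled maps are then mixed by an assumed 1x1 convolution and
    squashed into (0, 1) by the sigmoid.
    """
    steps, height, width = len(x), len(x[0]), len(x[0][0])
    attention = []
    for i in range(height):
        row = []
        for j in range(width):
            series = [x[t][i][j] for t in range(steps)]
            pooled_max = max(series)            # MP over the time dimension
            pooled_avg = sum(series) / steps    # AP over the time dimension
            # concat + 1x1 conv: weighted sum of the two pooled values
            row.append(sigmoid(w_max * pooled_max + w_avg * pooled_avg + bias))
        attention.append(row)
    return attention


# Two time steps on a 1 x 2 grid.
s_p = spatial_attention([[[0.0, 1.0]], [[2.0, -1.0]]])
```

Each entry of `s_p` lies strictly between 0 and 1, so it can be read as a per-cell weight that scales the corresponding spatial location of the feature map.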

step 1-2: trend component construction

The trend component consists of a residual module, a spatial influence attention module and a temporal influence attention module; the space-time features of the traffic flow data are obtained through the residual module, and the feature information is input into the spatial influence attention module and the temporal influence attention module respectively to quantify the heterogeneity of the spatial features and of the temporal features; the operation of the residual module is the same as in the periodic component;

the spatial influence attention module formula is as follows:

$$S_{tr} = \sigma(\mathrm{conv}(\mathrm{concat}(\mathrm{MP}(X_T), \mathrm{AP}(X_T)))) \qquad [3]$$

wherein $S_{tr}$ is the spatial feature heterogeneity matrix obtained by the trend component;

the temporal influence attention module formula is as follows:

$$T_{tr} = \sigma(\mathrm{conv}(\mathrm{concat}(\zeta(\mathrm{MP}(X_T)), \zeta(\mathrm{AP}(X_T))))) \qquad [4]$$

as in the spatial influence attention module, the max pooling and average pooling operations are used to quantify the heterogeneity of the temporal features, finally yielding the temporal feature heterogeneity matrix $T_{tr}$; $\zeta$ represents an error back propagation operation;

step 1-3: recent component build

The near-term component consists of a residual module, a spatial influence attention module and a temporal influence attention module; this component adopts an attention connection mechanism to enhance the performance of the attention modules, and the spatial influence attention module formula is as follows:

wherein $S_c^{l}$ is the spatial feature heterogeneity matrix extracted by the layer-$l$ spatial influence attention module, $S'^{l+1}_c$ is the spatial feature heterogeneity matrix preliminarily extracted by the layer-$(l+1)$ spatial influence attention module, and $w_1$ and $w_2$ are custom parameters initialized to 0; the heterogeneity matrix $S_c^{l}$ extracted by the lower-layer attention module is multiplied element-wise by the input of the higher-layer (layer $l+1$) spatial influence attention module, pooling and convolution operations then extract the heterogeneity features of the lower layer, and the result is combined with the feature heterogeneity matrix $S'^{l+1}_c$ preliminarily extracted by this layer to obtain the spatial feature heterogeneity matrix $S_c^{l+1}$ of this layer;

the temporal influence attention module formula is as follows:

wherein $T_c^{l}$ is the temporal feature heterogeneity matrix extracted by the layer-$l$ temporal influence attention module, $T'^{l+1}_c$ is the temporal feature heterogeneity matrix preliminarily extracted by the layer-$(l+1)$ temporal influence attention module, and $q_1$ and $q_2$ are custom parameters initialized to 0; as in the spatial case, $T_c^{l}$ is combined with the input of the layer-$(l+1)$ temporal influence attention module, and $T_c^{l+1}$ is the temporal feature heterogeneity matrix output by the layer-$(l+1)$ temporal influence attention module;

after the spatial feature heterogeneity matrix and the temporal feature heterogeneity matrix are obtained, they are multiplied by the input, and the product serves as the input of the next-layer residual module, with the formula as follows:

the final result obtained by the near-term component is as follows:

wherein $L_m$ represents the maximum number of layers of the residual module, $Y$ is the result of the near-term component, i.e., the result of the prediction model, and res denotes the residual layer;

step 1-4: component combination

The results of the periodic component and the trend component are input into the near-term component for further learning, as follows:

wherein $S_c^{1}$ and $T_c^{1}$ are the heterogeneity matrices quantified by the first-layer spatial influence attention module and temporal influence attention module of the near-term component, $S_{tr}$ and $T_{tr}$ are the spatial feature and temporal feature heterogeneity matrices quantified by the trend component, and $S_p$ is the spatial feature heterogeneity matrix obtained by the periodic component;

after combination, the final prediction result is obtained by the near-term component, whose steps are given in step 1-3;

step 2: STC3DCNN model training and testing

The training and testing of the STC3DCNN model comprises two steps: model training and model testing;

step 2-1: model training

The traffic flow data is divided into a training set and a test set, the training set data is sorted in chronological order, and in each training round part of the data is extracted as input according to the requirements of each component; the periodic component takes periodic data at m time points as input, the recent component takes the data at n recent time points as input, and the trend component takes the data at n time points that are one week apart from the recent data as input;

the Adam optimization algorithm is adopted to adjust parameters during model training; when the training times reach the set k value, the model stops training;

step 2-2: model testing

The trained model is used to predict on the test set, and the predicted values are compared with the actual observed values to obtain the prediction accuracy.

The evaluation indexes of model performance are as follows, where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value and $n$ is the number of samples:

mean absolute error (Mean Absolute Error, MAE): the average of the absolute errors between the actual and predicted values, with the formula as follows:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

mean absolute percentage error (Mean Absolute Percentage Error, MAPE): the average of the absolute percentage errors between the actual and predicted values, as follows:

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$$

root mean square error (Root Mean Square Error, RMSE): the arithmetic square root of the mean square error between the actual and predicted values, as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
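The three indexes can be computed as in the following sketch. It assumes the standard textbook definitions of MAE, MAPE and RMSE, plain Python lists of equal length, and non-zero actual values for MAPE; the function names are illustrative and do not come from the patent.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, as a fraction (actual values must be non-zero)."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


observed = [2.0, 4.0]
forecast = [1.0, 6.0]
scores = (mae(observed, forecast), mape(observed, forecast), rmse(observed, forecast))
```

Because RMSE squares each error before averaging, it penalizes large individual errors more heavily than MAE, which is why the two are usually reported together.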

After the three evaluation indexes are obtained, the model is compared with current mainstream models to show that its performance is in a leading position.

Because the attention-based, space-time connection enhanced 3DCNN traffic prediction model uses 3D convolution, it can extract the space-time features of traffic flow data more effectively without increasing the complexity of the model structure. Based on the trained model parameters, a high-precision prediction result can be obtained. In addition, the invention has the following characteristics:

1) The residual module is built from three-dimensional convolutions. A single three-dimensional convolution captures the temporal and spatial features of traffic flow data at the same time without increasing the complexity of the model structure, and because the residual module integrates several three-dimensional convolutions, the model can extract space-time features over a larger range and take more areas of the city and more time points into account.

2) At present, most studies do not consider the heterogeneity of space-time features, i.e., the fact that the features of each region in a city and of each time point differ.

3) Traffic flow data is huge in scale, with training sets generally exceeding millions of samples. Faced with data of this magnitude, attention modules arranged as in the prior art cannot accurately analyze the features of all the data, so the invention uses attention connections to improve the performance of the attention modules, thereby further improving the prediction accuracy for traffic flow data.

Drawings

FIG. 1 is a general flow chart of the present invention;

fig. 2 is a network configuration diagram of the present invention.

Detailed Description

As shown in FIG. 1, the present invention comprises two steps: constructing the attention-based, space-time connection enhanced 3DCNN traffic prediction model, and training and testing that model.

Step 1: space-time connection enhanced 3DCNN traffic prediction model construction based on attention mechanism (step 1 in FIG. 1)

As shown in step1 in fig. 1, the space-time connection enhanced 3DCNN traffic prediction model construction based on the attention mechanism includes 4 steps: periodic component construction, trend component construction, recent component construction, and component combination.

Step 1-1: periodic component building

As shown in step 1 of FIG. 2, the periodic component is composed of one layer of residual module and one layer of spatial influence attention module.

The residual module contains two convolution operations for extracting the space-time features of the traffic flow data, each convolution operation containing a 3D convolution (conv), a 3D batch normalization (BN), and an activation operation. The residual module formula is as follows:

wherein $\sigma$ is the activation operation, $X^{l}$ is the output of the layer-$l$ residual module and also the input of the layer-$(l+1)$ residual module, $X'^{l}$ is the intermediate value after one convolution operation, and $X^{l+1}$ is the output of the layer-$(l+1)$ residual module.
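For orientation, one form of the residual module that is consistent with the definitions above can be written as follows. This is only a sketch: the text fixes the two conv, BN and activation operations and the symbols $X^{l}$, $X'^{l}$ and $X^{l+1}$, while the position of the skip connection (added here after the second activation) is an assumption.

```latex
\begin{aligned}
X'^{l}  &= \sigma\big(\mathrm{BN}(\mathrm{conv}(X^{l}))\big) \\
X^{l+1} &= \sigma\big(\mathrm{BN}(\mathrm{conv}(X'^{l}))\big) + X^{l}
\end{aligned}
```

Under this reading, the skip term $X^{l}$ lets each layer learn a correction to its input rather than a full mapping, which is the usual motivation for residual connections.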

The features extracted by the residual module are input into the spatial influence attention module to quantify the heterogeneity of the spatial features. The spatial influence attention module comprises pooling, convolution and activation operations, and its formula is as follows:

wherein $X_T$ is the input of the module; MP and AP are the max pooling and average pooling operations, which compress the time dimension, retain only the spatial-dimension features, and extract the salient feature information and the background feature information; the multiplications in the formula are element-wise. $X_{m(s)}$ and $X_{a(s)}$ are the feature information after the two pooling operations, each of which is split into the spatial features of the individual channels, where $k$ is the channel index and $C_m$ is the maximum number of channels. Custom matrices (initialized to 0) are multiplied element-wise with these per-channel spatial features to quantify the spatial feature heterogeneity in each channel; concat is matrix concatenation, and $S_p$ is the spatial feature heterogeneity matrix obtained by the periodic component.

Step 1-2: trend component construction

As shown in step2 of FIG. 2, the trend component is composed of two layers of residual modules, one layer of spatial influence attention module, and one layer of temporal influence attention module. And obtaining space-time characteristics of traffic flow data through a residual error module, and respectively inputting the characteristic information into a space influence attention module and a time influence attention module to quantify the heterogeneity of the space characteristics and the heterogeneity of the time characteristics. The operation of the residual module is consistent with the periodic component and will not be described in detail here.

The spatial influence attention module formula is as follows:

wherein $S_{tr}$ is the spatial feature heterogeneity matrix obtained by the trend component.

The temporal influence attention module formula is as follows:

wherein $X_{m(tr)}$ and $X_{a(tr)}$ are the information after the max pooling and average pooling operations, which reduce the spatial dimensions and retain only the time-dimension features. As in the spatial influence attention module, the per-channel features are multiplied element-wise with custom matrices to quantify the heterogeneity of the temporal features, finally yielding the temporal feature heterogeneity matrix $T_{tr}$.

Step 1-3: recent component build

As shown in step 3 of FIG. 2, the near-term component is composed of three layers of residual modules, three layers of spatial influence attention modules, and three layers of temporal influence attention modules. The component adopts an attention connection mechanism to enhance the performance of the attention modules, and the spatial influence attention module formula is as follows:

wherein $S_c^{l}$ is the spatial feature heterogeneity information extracted by the layer-$l$ spatial influence attention module, $S'^{l+1}_c$ is the spatial feature heterogeneity preliminarily extracted by the layer-$(l+1)$ spatial influence attention module, and $w_1$ and $w_2$ are custom parameters initialized to 0. The heterogeneity information $S_c^{l}$ extracted by the lower-layer attention module is multiplied element-wise by the input of the higher-layer (layer $l+1$) spatial influence attention module, pooling and convolution operations then extract the heterogeneity features of the lower layer, and the result is combined with the heterogeneity $S'^{l+1}_c$ preliminarily extracted by this layer to give the heterogeneity $S_c^{l+1}$ of this layer.
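The attention connection described in this paragraph can be written out as a short sketch. The form below is only one reading consistent with the prose: the symbol $X^{l+1}$ for the input of the layer-$(l+1)$ module, the symbol $\otimes$ for element-wise multiplication, the reuse of the pooling, convolution and activation chain of the spatial influence attention module, and the way $w_1$ and $w_2$ weight the two terms are all assumptions, not the patent's stated equation.

```latex
\begin{aligned}
\tilde{S}^{\,l+1}_c &= \sigma\Big(\mathrm{conv}\big(\mathrm{concat}\big(\mathrm{MP}(S^{l}_c \otimes X^{l+1}),\ \mathrm{AP}(S^{l}_c \otimes X^{l+1})\big)\big)\Big) \\
S^{l+1}_c &= w_1\,\tilde{S}^{\,l+1}_c + w_2\,S'^{l+1}_c
\end{aligned}
```

The first line re-weights the higher layer's input by what the lower layer already judged important; the second line mixes that connected signal with the layer's own preliminary estimate, with the mixing weights learned during training.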

The temporal influence attention module formula is as follows:

wherein $T_c^{l}$ is the temporal feature heterogeneity information extracted by the layer-$l$ temporal influence attention module, $T'^{l+1}_c$ is the temporal feature heterogeneity preliminarily extracted by the layer-$(l+1)$ temporal influence attention module, and $q_1$ and $q_2$ are custom parameters initialized to 0. As in the spatial case, $T_c^{l}$ is combined with the input of the layer-$(l+1)$ temporal influence attention module, and $T_c^{l+1}$ is the temporal feature heterogeneity matrix output by the layer-$(l+1)$ temporal influence attention module.

After the spatial feature heterogeneity and the temporal feature heterogeneity are obtained, they are multiplied by the input, and the product serves as the input of the next-layer residual module, with the formula as follows:

The final result obtained by the near-term component is as follows:

After processing by the three layers of residual modules and of spatial and temporal influence attention modules, the prediction result finally obtained by the STC3DCNN model is $Y$.

Step 1-4: assembly combination

As shown in step4 of FIG. 2, the result input recent component of the period component and the trend component learn further as follows:

wherein , and />Heterogeneity quantified for the near term module first tier spatial impact attention module and time impact attention module, S _tr and T_tr Spatial feature and temporal feature heterogeneity quantified for trend components, S _p Spatial feature heterogeneity obtained for periodic components. u (u) ₁ 、u ₂ and u₃ For the custom variable, initialize to 0.

After combination, the final prediction result is obtained by the recent component, and the steps of the recent component are shown in steps 1-3.

Step 2: space-time connection enhanced 3DCNN traffic prediction model training and testing based on attention mechanism

As shown in step2 in fig. 1, the training and testing of the space-time connection enhanced 3DCNN traffic prediction model based on the attention mechanism is divided into two steps: model training and model testing.

Step 2-1: model training

The data set consists of New York City taxi trip data from 2015, beginning on January 1. The data set divides New York City into a 10 × 20 grid map, with data collected every half hour. The data is of two types, inflow data and outflow data, so the data at each time step has the format $\mathbb{R}^{10 \times 20 \times 2}$. The data is stacked in time-step order to form three-dimensional data, so the input to each component has the form $\mathbb{R}^{T \times 10 \times 20 \times 2}$, where $T$ is the total number of time steps.

The data of the first 40 days in the data set is used as the training set, and in each training round part of the data is extracted as input according to the requirements of each component. The periodic component takes periodic data at 4 time points as input ($T = 4$), the near-term component takes the data at 16 recent time points as input ($T = 16$), and the trend component takes the data at 16 time points one week apart from the recent data as input ($T = 16$).
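The sampling scheme above can be sketched as an index computation. The half-hour sampling interval and the window lengths 4, 16 and 16 come from the text; everything else is an assumption made for illustration: the function name `component_indices`, a daily period for the periodic component, and the reading that the trend window is the recent window shifted back by exactly one week.

```python
STEPS_PER_DAY = 48                  # one sample every half hour
STEPS_PER_WEEK = 7 * STEPS_PER_DAY  # 336 steps


def component_indices(target, n_recent=16, n_trend=16, m_period=4,
                      period=STEPS_PER_DAY):
    """Return the time-step indices fed to each component to predict step `target`.

    Assumptions: the periodic component looks at the same time of day on the
    m_period preceding days, and the trend component looks at the window that
    ends one week before the recent window.
    """
    recent = list(range(target - n_recent, target))
    trend = [step - STEPS_PER_WEEK for step in recent[-n_trend:]]
    periodic = [target - k * period for k in range(m_period, 0, -1)]
    if min(recent + trend + periodic) < 0:
        raise ValueError("not enough history before the target step")
    return {"periodic": periodic, "trend": trend, "recent": recent}


windows = component_indices(400)
```

For target step 400 the recent window covers steps 384 to 399, the trend window covers steps 48 to 63, and the periodic window is `[208, 256, 304, 352]`. Indexing a stored array of shape T × 10 × 20 × 2 with these lists would yield the three component inputs.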

During model training, the parameters are adjusted with the Adam optimization algorithm, with the first-order exponential decay rate set to 0.9, the second-order exponential decay rate set to 0.999, the learning rate set to 1e-3, and the weight decay set to 0.005. Training stops when the number of training rounds reaches the preset value of 600.
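To make these settings concrete, the following sketch applies the Adam update rule with the quoted hyperparameters to a toy one-parameter loss. It is an illustration only: the loss $(w - 3)^2$, the function names, the value of `EPS`, and the treatment of weight decay as an L2 term added to the gradient are assumptions, since the text does not specify them.

```python
import math

# Hyperparameters quoted in the text.
LR, BETA1, BETA2, WEIGHT_DECAY, ROUNDS = 1e-3, 0.9, 0.999, 0.005, 600
EPS = 1e-8  # assumed; not given in the text


def adam_step(params, grads, m, v, t):
    """Apply one Adam update to flat lists; m and v are updated in place.

    t is the 1-based step count used for bias correction.
    """
    updated = []
    for i, (p, g) in enumerate(zip(params, grads)):
        g = g + WEIGHT_DECAY * p                 # assumption: L2-style weight decay
        m[i] = BETA1 * m[i] + (1 - BETA1) * g    # first-moment estimate
        v[i] = BETA2 * v[i] + (1 - BETA2) * g * g  # second-moment estimate
        m_hat = m[i] / (1 - BETA1 ** t)
        v_hat = v[i] / (1 - BETA2 ** t)
        updated.append(p - LR * m_hat / (math.sqrt(v_hat) + EPS))
    return updated


def fit_toy(rounds=ROUNDS):
    """Minimise the toy loss (w - 3)^2 from w = 0 for a fixed number of rounds."""
    w, m, v = [0.0], [0.0], [0.0]
    for t in range(1, rounds + 1):
        grad = [2.0 * (w[0] - 3.0)]
        w = adam_step(w, grad, m, v, t)
    return w[0]


w_after_training = fit_toy()
```

With a learning rate of 1e-3, each Adam step moves a parameter by at most about 1e-3 while the gradient keeps its sign, so 600 rounds move the toy parameter only part of the way from 0 towards 3. This shows why the learning rate and the number of rounds have to be chosen together.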

Step 2-2: model testing

The data after the first 40 days in the data set is used as the test set. The trained model is used to predict on the test set, the predicted values are compared with the actual observed values to obtain the prediction accuracy, and MAE, MAPE and RMSE are adopted as the evaluation indexes of model performance. After the three evaluation indexes are obtained, they are compared with those of current mainstream models. The STC3DCNN model provided by the invention is in a leading position on most indexes, with MAE, MAPE and RMSE of 6.12, 0.36 and 15.05 respectively, and the experimental results show that the STC3DCNN model is effective.

Claims

1. The space-time connection enhanced 3DCNN traffic prediction method based on the attention mechanism is characterized by comprising the following steps of:

step 1-1: periodic component building

$$S_p = \sigma(\mathrm{conv}(\mathrm{concat}(\mathrm{MP}(X_T), \mathrm{AP}(X_T)))) \qquad [2]$$

wherein $X_T$ is the input of the module; MP and AP are the max pooling and average pooling operations, which compress the time dimension and retain only the spatial-dimension features, extracting the salient feature information and the background feature information while quantifying the heterogeneity of the spatial features; concat is matrix concatenation, conv is a convolution operation, $\sigma$ is the activation operation, and $S_p$ is the spatial feature heterogeneity matrix obtained by the periodic component;

step 1-2: trend component construction

the spatial influence attention module formula is as follows:

$$S_{tr} = \sigma(\mathrm{conv}(\mathrm{concat}(\mathrm{MP}(X_T), \mathrm{AP}(X_T)))) \qquad [3]$$

the temporal influence attention module formula is as follows:

$$T_{tr} = \sigma(\mathrm{conv}(\mathrm{concat}(\zeta(\mathrm{MP}(X_T)), \zeta(\mathrm{AP}(X_T))))) \qquad [4]$$

as in the spatial influence attention module, the max pooling and average pooling operations are used to quantify the heterogeneity of the temporal features, finally yielding the temporal feature heterogeneity matrix $T_{tr}$; $\zeta$ represents an error back propagation operation;

step 1-3: recent component build

wherein $S_c^{l}$ is the spatial feature heterogeneity matrix extracted by the layer-$l$ spatial influence attention module, $S'^{l+1}_c$ is the spatial feature heterogeneity matrix preliminarily extracted by the layer-$(l+1)$ spatial influence attention module, and $w_1$ and $w_2$ are custom parameters initialized to 0; the heterogeneity matrix $S_c^{l}$ extracted by the lower-layer attention module is multiplied element-wise by the input of the higher-layer (layer $l+1$) spatial influence attention module, pooling and convolution operations then extract the heterogeneity features of the lower layer, and the result is combined with the feature heterogeneity matrix $S'^{l+1}_c$ preliminarily extracted by this layer to obtain the spatial feature heterogeneity matrix $S_c^{l+1}$ of this layer;

the temporal influence attention module formula is as follows:

wherein $T_c^{l}$ is the temporal feature heterogeneity matrix extracted by the layer-$l$ temporal influence attention module, $T'^{l+1}_c$ is the temporal feature heterogeneity matrix preliminarily extracted by the layer-$(l+1)$ temporal influence attention module, and $q_1$ and $q_2$ are custom parameters initialized to 0; as in the spatial case, $T_c^{l}$ is combined with the input of the layer-$(l+1)$ temporal influence attention module, and $T_c^{l+1}$ is the temporal feature heterogeneity matrix output by the layer-$(l+1)$ temporal influence attention module;

the final result obtained by the near-term component is as follows:

step 1-4: component combination

step 2: STC3DCNN model training and testing

step 2-1: model training

step 2-2: model testing

2. The attention-mechanism-based space-time connection enhanced 3DCNN traffic prediction method according to claim 1, wherein the performance evaluation indexes of the model test are as follows:

mean absolute error MAE: the average of the absolute errors between the actual and predicted values, with the formula as follows:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

wherein $y_i$ represents the actual value, $\hat{y}_i$ represents the predicted value, and $n$ is the number of samples;

mean absolute percentage error MAPE: the average of the absolute percentage errors between the actual and predicted values, as follows:

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$$

root mean square error RMSE: the arithmetic square root of the mean square error between the actual and predicted values, as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$