CN115034478B

CN115034478B - Traffic flow prediction method based on domain adaptation and knowledge transfer

Info

Publication number: CN115034478B
Application number: CN202210665488.4A
Authority: CN
Inventors: 杨燕; 欧阳小草; 江永全; 张熠玲; 周威
Original assignee: Southwest Jiaotong University
Current assignee: Southwest Jiaotong University
Priority date: 2022-06-14
Filing date: 2022-06-14
Publication date: 2023-06-23
Anticipated expiration: 2042-06-14
Also published as: CN115034478A

Abstract

The invention discloses a traffic flow prediction method based on domain adaptation and knowledge transfer, and belongs to the technical field of data mining. First, the spatiotemporal features of a source domain and a target domain are extracted while shared spatiotemporal knowledge based on latent spatiotemporal patterns is learned, so that the spatiotemporal dependence is captured and the mined patterns help improve prediction performance in the target domain. Second, a knowledge attention module is constructed to extract transferable spatiotemporal information from the shared spatiotemporal knowledge, yielding a finer feature representation for each spatiotemporal feature. Finally, the extracted spatiotemporal features are fused with the transferable spatiotemporal information to produce the final traffic flow prediction. The invention can be used in real scenarios, captures the spatiotemporal features in traffic flow data, and can be extended to prediction tasks on other spatiotemporal data, so it has good generality.

Description

Traffic flow prediction method based on domain adaptation and knowledge transfer

Technical Field

The invention belongs to the technical field of data mining, and particularly relates to a cross-domain knowledge transfer method.

Background

Smart city construction has greatly improved the living standards of urban residents, and traffic flow prediction plays an important role in it. Accurate traffic flow prediction helps guide efforts to relieve urban congestion and to plan road networks. Because of this practical value, traffic flow prediction has been studied intensively. In recent years, with the rapid growth in the volume of traffic flow data, deep learning has been widely applied to traffic flow prediction in smart cities. In particular, the recurrent neural network (RNN) and its variants, the gated recurrent unit (GRU) and the long short-term memory network (LSTM), perform well at capturing temporal dependence. For urban traffic flow prediction on grid data, convolutional neural networks (CNN) are often used to extract local features and model spatiotemporal correlations. However, the data acquisition sensors in a city are typically distributed irregularly and form a non-Euclidean structure, which CNNs have difficulty processing. The graph convolutional network (GCN) handles non-Euclidean data well and has greatly improved traffic flow prediction performance. Models that combine GCN, CNN, and RNN have also achieved remarkable results. Although existing work predicts well, the following problems remain. First, deep learning methods rely heavily on the amount of training data; with only a small amount of training data, their predictive performance degrades greatly.
In practice, however, data are often scarce, for example in newly built smart cities or in cities that can release only a small amount of data because of privacy constraints, which makes these deep learning models very difficult to train. Second, most existing traffic flow prediction models based on transfer learning target cities with grid data, and transfer learning on graph structures is rarely considered. Third, the differences in data distribution between cities are ignored, which may lead to negative transfer or unstable transfer results. It is therefore worthwhile to explore traffic flow prediction methods based on graph neural networks that can reduce these distribution differences.

1. Traffic flow prediction

A search of existing patents and related technology found the following methods related to traffic flow prediction:

(1) Zhang Xu, Zhang Langwen, Xie Wei, Wang Yaochu, Ran Jielong, Sun Jinhui, Chen Zhile. A traffic flow prediction method based on dynamic spatiotemporal correlation [P]. Guangdong Province: CN113112793A, 2021-07-13. This patent proposes a traffic flow prediction method based on dynamic spatiotemporal correlation. The method learns the dynamic similarity between traffic monitoring points through a flow gating mechanism and introduces an attention transfer mechanism to handle temporal transitions, thereby predicting regional traffic flow.

(2) He Hong, Wang Xinfeng, Sun Xiaoxiao, dong. Multi-directional traffic flow prediction method based on point-of-interest spatiotemporal residual neural network [P]. Zhejiang Province: CN114154740A, 2022-03-08. This patent proposes a spatiotemporal residual neural network model based on points of interest. The model enhances the spatiotemporal features by adding time signals and point-of-interest signals, then extracts spatiotemporal features with a 3D convolutional neural network, and finally compresses all the spatiotemporal feature information by a weighting method, thereby predicting regional traffic flow.

Current traffic flow prediction methods based on graph neural networks thoroughly consider the spatiotemporal dependence in traffic flow data, but do not consider how to predict traffic flow accurately when data are scarce.

2. Transfer learning

A search of existing patents and related technology found the following traffic flow prediction methods related to transfer learning:

(1) Wang Senzhang, Yin Chengyu, Miao Hao. A joint prediction method of cross-city traffic flow based on deep transfer learning [P]. Jiangsu Province: CN110148296A, 2019-08-20. This patent proposes a deep knowledge transfer model. The model uses the convolutional long short-term memory network ConvLSTM together with conditional maximum mean discrepancy, and applies the idea of transfer learning to urban traffic flow prediction, thereby predicting urban traffic flow on grid data.

(2) Xuan Fan, Xu Cui, strong nest country, Liu Xincheng, Zhou Guodong. Traffic flow prediction method based on deep transfer fusion learning [P]. Jiangsu Province: CN112862084B, 2021-11-30. This patent proposes a traffic flow prediction method based on deep transfer fusion learning. The method combines spatiotemporal features, feature transformation, a deep neural network, and a transfer learning algorithm to predict traffic flow.

However, most of the above methods are based on grid data, so they are not directly applicable to traffic flow prediction on graph data. They also do not consider the difference in data distribution between the source domain and the target domain. We therefore take the domain difference into account and, in combination with a graph neural network, propose a traffic flow prediction method based on domain adaptation and knowledge transfer.

Disclosure of Invention

The invention aims to provide a domain-adaptive traffic flow prediction method that effectively solves the technical problem that traffic flow cannot be predicted accurately in cities with insufficient data.

The technical solution of the invention is as follows:

A traffic flow prediction method based on domain adaptation and knowledge transfer, comprising the following steps:

Step 1, preprocessing traffic flow data, including:

1.1 Sensors in the urban road network record the traffic flow over a historical period. A city whose traffic flow data span months or years is taken as the source domain, and a city with fewer than thirty days of traffic flow data is taken as the target domain. The traffic flow data are defined as:

$$X_S = (X_S^1, X_S^2, \ldots, X_S^{P_S}) \in \mathbb{R}^{P_S \times N_S \times D} \quad (1)$$

$$X_C = (X_C^1, X_C^2, \ldots, X_C^{P_C}) \in \mathbb{R}^{P_C \times N_C \times D} \quad (2)$$

where $\mathbb{R}$ denotes the real numbers; $X_S$ is the traffic flow data of the whole city serving as the source domain; $X_S^t \in \mathbb{R}^{N_S \times D}$ is the traffic flow data of the source domain at time t, where t denotes the current time; $N_S$ is the number of traffic flow sensors in the source-domain city; $P_S$ is the total number of time slices of the source-domain traffic flow data; $X_C$ is the traffic flow data of the whole city serving as the target domain; $X_C^t \in \mathbb{R}^{N_C \times D}$ is the traffic flow data of the target domain at time t; $N_C$ is the number of traffic flow sensors in the target domain; $P_C$ is the total number of time slices of the target-domain traffic flow data; and D is the feature dimension of the traffic flow data;

1.2 Missing values in the traffic flow data are filled by linear interpolation, and the filled data are then normalized with zero-mean (z-score) normalization;
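Step 1.2 can be sketched as follows. This is a minimal illustration for a single sensor series, assuming missing readings are encoded as NaN and that the normalization statistics are computed over the whole series; the function names `fill_missing_linear` and `zero_mean_normalize` are hypothetical and do not come from the patent.

```python
import numpy as np

def fill_missing_linear(series: np.ndarray) -> np.ndarray:
    """Fill NaN entries of a 1-D series by linear interpolation over time."""
    series = series.astype(float).copy()
    missing = np.isnan(series)
    idx = np.arange(series.size)
    # np.interp interpolates between observed neighbours and holds the
    # nearest observed value at either end of the series.
    series[missing] = np.interp(idx[missing], idx[~missing], series[~missing])
    return series

def zero_mean_normalize(x: np.ndarray):
    """Zero-mean (z-score) normalization; also returns the statistics used."""
    mean, std = x.mean(), x.std()
    return (x - mean) / std, mean, std

# Toy series with three missing readings (assumed NaN encoding).
flow = np.array([10.0, np.nan, 14.0, 16.0, np.nan, np.nan, 22.0])
filled = fill_missing_linear(flow)
normalized, mean, std = zero_mean_normalize(filled)
```

Keeping `mean` and `std` makes it possible to map model outputs back to the original traffic flow scale after prediction.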

1.3 Building the model input data: the traffic flow data of the source domain and the target domain are arranged into samples, where P is the number of historical time steps fed to the model and Q is the number of future time steps to be predicted. The P historical time steps up to the current time t are stacked in chronological order, giving the historical traffic flow data of the source domain, $X_S^{(t-P+1):t} \in \mathbb{R}^{P \times N_S \times D}$, and the historical traffic flow data of the target domain, $X_C^{(t-P+1):t} \in \mathbb{R}^{P \times N_C \times D}$. These P-step histories of the source and target domains are the model input, and the model output is the traffic flow of the source and target domains for the Q future time steps after time t;
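The window construction of step 1.3 can be sketched as below. The sketch assumes a time-first array layout (time, sensors, features) and treats index t as the last history step; the helper name `build_windows` and the values P = 12, Q = 3 are illustrative choices, not taken from the patent.

```python
import numpy as np

def build_windows(data: np.ndarray, P: int, Q: int):
    """Slice a (T, N, D) array into (history, future) training pairs.

    Returns inputs of shape (samples, P, N, D) and
    targets of shape (samples, Q, N, D).
    """
    T = data.shape[0]
    inputs, targets = [], []
    # t is the index of the current time step (the last step of the history).
    for t in range(P - 1, T - Q):
        inputs.append(data[t - P + 1:t + 1])   # P steps up to and including t
        targets.append(data[t + 1:t + 1 + Q])  # Q steps after t
    return np.stack(inputs), np.stack(targets)

# Toy data: 20 time slices, 1 sensor, 1 feature; the value equals the time index.
data = np.arange(20, dtype=float).reshape(20, 1, 1)
x, y = build_windows(data, P=12, Q=3)
```

The same function would be applied separately to the source-domain and target-domain arrays, since the two cities have different sensor counts.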

Step 2, setting the maximum number of training epochs to E = 100;

Step 3, constructing the spatiotemporal sub-network: this network extracts the spatiotemporal features and the shared spatiotemporal knowledge. It is built from a gated recurrent unit (GRU) and a graph convolutional network (GCN) to model the temporal and spatial correlations, and it learns an adaptive shared spatiotemporal knowledge. All model parameters of the spatiotemporal sub-network are denoted $\theta_f$. The specific process comprises the following steps:

3.1 The historical data of the source domain and the target domain are concatenated as the input of the spatiotemporal sub-network, written $X = [X_S; X_C]$, where X is the concatenation of the traffic flow data of the source and target domains. First, a fully connected layer FC maps the original features into a feature space. Second, to capture the temporal dependence in the data, the mapped embeddings are fed into the GRU. Finally, the rectified linear unit ReLU is applied as the activation function. The process is described as:

$$H^{(l)} = \mathrm{ReLU}(\mathrm{GRU}(\mathrm{FC}(X))) \quad (3)$$

where $H^{(l)}$ is the hidden feature representation of the l-th layer of the spatiotemporal sub-network;

3.2 To further mine the spatial dependence in the data, the spatial correlations are extracted with a graph convolutional network GCN:

$$H^{(l+1)} = \mathrm{GCN}(H^{(l)}) \quad (4)$$

where $H^{(l+1)}$ is the hidden feature representation output by the (l+1)-th layer of the spatiotemporal sub-network;
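Equation (4) does not state which form of graph convolution is used. The sketch below assumes the widely used propagation rule with self-loops and symmetric degree normalization, $\hat{A} = \tilde{D}^{-1/2}(A + I)\tilde{D}^{-1/2}$ followed by $\hat{A} H W$; this choice, the names `normalize_adjacency` and `gcn_layer`, and the toy road graph are all assumptions for illustration.

```python
import numpy as np

def normalize_adjacency(adj: np.ndarray) -> np.ndarray:
    """Add self-loops and apply symmetric degree normalization."""
    a_hat = adj + np.eye(adj.shape[0])
    deg = a_hat.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Element-wise form of D^{-1/2} (A + I) D^{-1/2}.
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(h: np.ndarray, adj_norm: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """One graph convolution: aggregate neighbour features, then project."""
    return adj_norm @ h @ weight

rng = np.random.default_rng(0)
# Toy road graph: three sensors connected in a line (0 - 1 - 2).
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
h = rng.normal(size=(3, 4))   # hidden features H^(l): 3 sensors, 4 channels
w = rng.normal(size=(4, 2))   # learnable projection (random here)
adj_norm = normalize_adjacency(adj)
h_next = gcn_layer(h, adj_norm, w)  # plays the role of H^(l+1)
```

In a trained model the projection `w` would be learned as part of the sub-network parameters rather than drawn at random.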

3.3 To mine the latent spatiotemporal patterns in the source and target domains, an adaptive shared spatiotemporal knowledge is learned. First, the shared spatiotemporal knowledge is randomly initialized and parameterized, written $K \in \mathbb{R}^{G_k \times d_k}$, where $G_k$ is the number of classes of latent spatiotemporal patterns and $d_k$ is the feature dimension of the shared spatiotemporal knowledge. Then, as the spatiotemporal sub-network is trained iteratively, the parameterized shared spatiotemporal knowledge K is updated by gradient descent;

Step 4, constructing the domain discriminator: to obtain domain-invariant spatiotemporal features, a domain adaptation method is used to reduce the difference in data distribution between the source domain and the target domain. Specifically:

A binary domain discriminator is defined, and all its model parameters are denoted $\theta_{adv}$. The true domain label of the source domain is set to 1 and that of the target domain to 0. The domain discriminator predicts the domain label probability of each input feature, and the domain-adaptive loss function $L_{adv}$ is computed from the predicted probabilities and the true domain labels:

$$L_{adv} = -\frac{1}{n}\sum_{j=1}^{n}\left[d_j \log q_j + (1 - d_j)\log(1 - q_j)\right]$$

where $q_j$ is the predicted domain label probability of the j-th sample, $d_j$ is the true domain label of the j-th sample, n is the total number of samples, and $j \in \{1, 2, \ldots, n\}$;
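The domain-adaptive loss can be illustrated numerically. The formula image is not legible in this text, so the sketch assumes the standard binary cross-entropy between the predicted probabilities $q_j$ and the labels $d_j$ (1 for source, 0 for target); the function name and the toy probabilities are hypothetical.

```python
import numpy as np

def domain_adversarial_loss(q: np.ndarray, d: np.ndarray, eps: float = 1e-12) -> float:
    """Binary cross-entropy between predicted domain probabilities and labels."""
    q = np.clip(q, eps, 1.0 - eps)  # guard against log(0)
    return float(-np.mean(d * np.log(q) + (1.0 - d) * np.log(1.0 - q)))

# Two source samples (label 1) followed by two target samples (label 0).
q = np.array([0.9, 0.8, 0.2, 0.4])
d = np.array([1.0, 1.0, 0.0, 0.0])
loss = domain_adversarial_loss(q, d)

# A discriminator that cannot tell the domains apart outputs 0.5 everywhere.
confused_loss = domain_adversarial_loss(np.full(4, 0.5), d)
```

Under this assumption, a loss near log 2 means the discriminator is at chance level, which is the situation domain-invariant features aim for.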

Step 5, constructing the predictor: the predictor consists of a knowledge attention module and a fully connected layer, and all its model parameters are denoted $\theta_g$. The specific process is as follows:

5.1 First, the spatiotemporal feature is linearly transformed through a fully connected layer:

$$\tilde{h}_i^t = \mathrm{FC}(h_i^t)$$

where i indexes the traffic flow sensors, $i \in \{1, 2, \ldots, (N_S + N_C)\}$; $h_i^t \in \mathbb{R}^{d_h}$ is the spatiotemporal feature of the i-th traffic flow sensor at time t; $d_h$ is the hidden feature dimension; and $\tilde{h}_i^t$ is the result of the linear transformation of the feature of the i-th traffic flow sensor at time t;

5.2 To obtain the transferable spatiotemporal information for each traffic flow sensor, a knowledge attention module is constructed. The spatiotemporal feature after the linear transformation of step 5.1, written $\tilde{h}_i^t$, serves as the query for the transferable information in the shared spatiotemporal knowledge K. The attention module is built as follows:

$$\alpha_{i,g_k}^t = \frac{\exp\left(s_{i,g_k}^t\right)}{\sum_{g'_k=1}^{G_k}\exp\left(s_{i,g'_k}^t\right)}, \qquad z_i^t = \sum_{g_k=1}^{G_k}\alpha_{i,g_k}^t K_{g_k}$$

where $s_{i,g_k}^t$ is the similarity score between the feature of the i-th traffic flow sensor at time t and the $g_k$-th latent spatiotemporal pattern, obtained as the scaled product of $\tilde{h}_i^t$ and $K_{g_k}$; $g'_k$ is the index running over the latent spatiotemporal patterns; $\exp(\cdot)$ is the exponential function with the natural constant e as its base; and $z_i^t$ is the transferable spatiotemporal information of the i-th traffic flow sensor at time t;
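The knowledge attention step can be sketched as below. The exact scaling factor is not recoverable from this text, so the sketch assumes the common choice of dividing by the square root of the knowledge dimension $d_k$; the projection matrix `w_q`, the function name, and all array sizes are illustrative assumptions.

```python
import numpy as np

def knowledge_attention(h: np.ndarray, w_q: np.ndarray, knowledge: np.ndarray):
    """Query the shared knowledge with per-sensor spatiotemporal features.

    h:         (n, d_h)   spatiotemporal features, one row per sensor
    w_q:       (d_h, d_k) linear transformation applied to the features
    knowledge: (G_k, d_k) shared spatiotemporal knowledge K
    """
    query = h @ w_q                                    # linear transformation
    scores = query @ knowledge.T / np.sqrt(knowledge.shape[1])  # assumed scaling
    scores = scores - scores.max(axis=1, keepdims=True)         # numerical stability
    alpha = np.exp(scores)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)   # SoftMax over the patterns
    transferable = alpha @ knowledge                   # weighted sum of patterns
    return transferable, alpha

rng = np.random.default_rng(0)
h = rng.normal(size=(5, 8))          # 5 sensors, d_h = 8
w_q = rng.normal(size=(8, 4))        # projects to d_k = 4
knowledge = rng.normal(size=(3, 4))  # G_k = 3 latent patterns
transferable, alpha = knowledge_attention(h, w_q, knowledge)
```

Each row of `alpha` shows how strongly one sensor draws on each latent pattern, and `transferable` holds the resulting per-sensor information passed on to the predictor.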

5.3 The spatiotemporal feature $h_i^t$ and the transferable spatiotemporal information $z_i^t$ are taken together as the input of the final prediction module, and the fully connected layer FC then outputs the prediction result $\hat{y}_i^t$, the predicted value of the i-th traffic flow sensor at time t;

5.4 From the prediction results, the prediction loss function of the source domain, $L_S$, and the prediction loss function of the target domain, $L_C$, are computed, where $y_{S,i}^t$ is the true value of the i-th traffic flow sensor in the source domain at time t, $\hat{y}_{S,i}^t$ is the predicted value of the i-th traffic flow sensor in the source domain at time t, $y_{C,i}^t$ is the true value of the i-th traffic flow sensor in the target domain at time t, and $\hat{y}_{C,i}^t$ is the predicted value of the i-th traffic flow sensor in the target domain at time t;

Step 6, calculating the final loss function L, which combines the prediction loss functions with the domain-adaptive loss function, where λ is a hyperparameter that balances the prediction loss functions against the domain-adaptive loss function, λ ∈ (0, 1);

Step 7, updating the model parameters $\theta_f$ of the spatiotemporal sub-network, the model parameters $\theta_{adv}$ of the domain discriminator, and the model parameters $\theta_g$ of the predictor by gradient descent, where $\partial$ denotes the partial derivative and μ is the learning rate of the gradient descent algorithm;

Step 8, repeating steps 3, 4, 5, 6 and 7 until the number of training epochs reaches E, and finally outputting the trained traffic flow prediction model based on domain adaptation and knowledge transfer.
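The control flow of steps 2, 7 and 8 (a fixed budget of E epochs, each applying a gradient descent update with learning rate μ) can be illustrated on a toy problem. The quadratic loss below is only a stand-in for the model's final loss L, and the values of `mu` and the starting point are arbitrary assumptions; the sketch shows the generic update rule, not the patent's adversarial update.

```python
def loss(theta: float) -> float:
    """Toy stand-in for the final loss L (not the patent's loss)."""
    return (theta - 3.0) ** 2

def grad(theta: float) -> float:
    """Derivative of the toy loss with respect to theta."""
    return 2.0 * (theta - 3.0)

E = 100      # maximum number of training epochs, as set in step 2
mu = 0.1     # learning rate (illustrative value)
theta = 0.0  # arbitrary initial parameter

history = []
for epoch in range(E):
    # Gradient descent update: move against the gradient, scaled by mu.
    theta = theta - mu * grad(theta)
    history.append(loss(theta))
```

In the actual method the same loop would update the sub-network, discriminator, and predictor parameters together on each pass through steps 3 to 7.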

Compared with the prior art, the invention has the following advantages and effects:

(1) To address the problem that traffic flow cannot be predicted accurately when a city has little data, the invention provides a domain-adaptive model that captures the spatial and temporal features of spatiotemporal data simultaneously, with small prediction error and higher accuracy. (2) The shared spatiotemporal knowledge provided by the invention effectively captures latent spatiotemporal knowledge and enables latent spatiotemporal patterns to be transferred between different cities. (3) The framework provided by the invention can be extended to other urban spatiotemporal data fields to solve similar problems, and is therefore general.

Drawings

FIG. 1 is a block diagram of the framework of the invention.

FIG. 2 is a block diagram of the knowledge attention module of the invention.

FIG. 3 is a schematic diagram of the relationship among the model input values, predicted values, and true values in the invention.

Detailed Description

The invention is described in further detail below with reference to the accompanying drawings.

1. Traffic flow prediction model framework based on domain adaptation and knowledge transfer:

The overall framework of the invention, a traffic flow prediction model based on domain adaptation and knowledge transfer, is shown in FIG. 1. It consists of three parts. (i) The spatiotemporal sub-network extracts the spatiotemporal features of the traffic flow and learns the shared spatiotemporal knowledge. (ii) The domain discriminator classifies the spatiotemporal features generated from the source domain and the target domain, which constrains the spatiotemporal sub-network to extract domain-invariant spatiotemporal features and reduces the difference in feature distribution between the source and target domains. (iii) The predictor consists of two parts: the knowledge attention module and the fully connected layer. The knowledge attention module mines the transferable spatiotemporal information in the shared spatiotemporal knowledge for use in target-domain prediction. The fully connected layer linearly maps the extracted hidden representation to obtain the final target-domain prediction result.

2. Knowledge attention module:

As shown in FIG. 2, the knowledge attention module works as follows. First, the spatiotemporal feature $h_i^t$ of the i-th traffic flow sensor at time t is input together with the shared spatiotemporal knowledge K of the $G_k$ latent spatiotemporal patterns. The spatiotemporal feature is linearly transformed through the fully connected layer to obtain the spatiotemporal feature embedding $\tilde{h}_i^t$. Next, $\tilde{h}_i^t$ and K are multiplied, scaled, and passed through the normalized exponential function SoftMax to obtain the attention scores $\alpha_i^t$. Finally, the scores are multiplied by the shared spatiotemporal knowledge to obtain the final transferable spatiotemporal information $z_i^t$.

3. Model input value, predicted value, true value:

as shown in fig. 3, the model inputs include a source domain input and a target domain input, the model predictions include a source domain predicted value and a target domain predicted value, and the true values include a source domain true value and a target domain true value.

Examples

The embodiment carries out steps 1 to 8 as set out in the Disclosure of Invention above: the same preprocessing of the traffic flow data (step 1), the same maximum number of training epochs E = 100 (step 2), the same spatiotemporal sub-network with equations (3) and (4) (step 3), the same domain discriminator (step 4), the same predictor (steps 5.1 to 5.4), the same final loss function L (step 6), and the same gradient descent update of the model parameters $\theta_f$, $\theta_{adv}$ and $\theta_g$ (step 7).

It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The scope of the invention is indicated by the appended claims rather than by the foregoing description, and all changes that come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.

Claims

1. A traffic flow prediction method based on domain adaptation and knowledge transfer, comprising the following steps:

Step 1, preprocessing traffic flow data as in steps 1.1 to 1.3 of the Disclosure of Invention, which yield the historical traffic flow data for P time steps of the source domain and of the target domain;

Step 2, setting the maximum number of training epochs to E = 100;

Steps 3 to 5, constructing the spatiotemporal sub-network, including equations (3) and (4) and the adaptive shared spatiotemporal knowledge of step 3.3, the domain discriminator, and the predictor of steps 5.1 to 5.4, including the prediction loss functions of the source domain and the target domain, each as set out in the Disclosure of Invention;

Step 6, calculating a final loss function L: