CN115953902A - Traffic flow prediction method based on multi-view space-time graph convolutional network - Google Patents

Traffic flow prediction method based on multi-view space-time graph convolutional network

Info

Publication number
CN115953902A
CN115953902A
Authority
CN
China
Prior art keywords
graph
time
convolution
view
module
Prior art date
Legal status
Pending
Application number
CN202310132783.8A
Other languages
Chinese (zh)
Inventor
Gu Junhua
Ji Zhenlei
Zhang Yajuan
Jiang Jiahai
Jin Jianming
Guo Ruizhe
Current Assignee
Hebei University of Technology
Original Assignee
Hebei University of Technology
Priority date
Filing date
Publication date
Application filed by Hebei University of Technology
Priority to CN202310132783.8A
Publication of CN115953902A

Classifications

    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T - CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 - Road transport of goods or passengers
    • Y02T 10/10 - Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 - Engine management systems

Landscapes

  • Traffic Control Systems (AREA)

Abstract

The invention relates to a traffic flow prediction method based on a multi-view space-time graph convolutional network, which takes the traffic data on a road network, a static graph adjacency matrix and a trend similarity graph adjacency matrix as input, alternately extracts temporal and spatial features, and then fuses the features with an output module, thereby achieving accurate traffic flow prediction. The method extracts the spatial dependencies in the road network from different angles, capturing them from both global and local perspectives, and provides a multi-view space-time graph convolutional network for traffic flow prediction. The dependencies among road observation points are modeled from different angles, and the spatial features are then extracted by a multi-view graph convolution module, which strengthens the module's ability to capture spatial dependencies and improves the prediction performance of the model.

Description

Traffic flow prediction method based on multi-view space-time graph convolutional network
Technical Field
The invention belongs to the field of traffic flow prediction and relates to a traffic flow prediction method based on a multi-view space-time graph convolutional network. The prediction method uses graph convolutional neural networks and convolutional neural networks to predict traffic flow.
Background
With the development of sensor technology, traffic data has grown explosively and traffic systems have entered the era of traffic big data. Intelligent transportation systems use traffic big data to control urban traffic. In recent years, supported by deep learning, intelligent transportation systems have been very successful, and traffic flow prediction, one of their basic components, has attracted increasing attention from researchers. The patterns in traffic data change over both time and space: the flow at an observation point is influenced not only by its own historical flow but also by the flow at surrounding observation points, and handling these influencing factors properly is one of the main challenges in accurate traffic flow prediction.
With the rise of deep learning, models such as convolutional neural networks and graph convolutional networks have been used to mine the complex spatial dependencies in traffic data, while models such as recurrent neural networks are mainly used for the temporal dependencies. Although existing deep learning models achieve good results on traffic flow prediction tasks, some problems remain. Conventional traffic flow prediction methods generally describe the spatial dependencies of roads with an adjacency matrix; MTGNN, for example, captures spatial dependencies with an adaptive adjacency matrix, but an adaptive adjacency matrix tends to reflect only the local spatial dependencies in the roads and ignores the global ones. In the real world, additional spatial relationships may exist between road segments because of regional function (for example, residential and work areas may be closely related). To better capture both local and global spatial dependencies in a road network and to address the shortcomings of the MTGNN model, the invention proposes a multi-view space-time graph convolutional network model, MVSTGCN.
Disclosure of Invention
The invention provides a traffic flow prediction method based on a multi-view space-time graph convolutional network, aimed at the problem of traffic flow prediction on a road network. Specifically, the invention introduces a new spatio-temporal feature extraction method: in the spatial dimension, complex spatial dependencies are extracted by considering the interactions among road observation points from multiple angles; in the temporal dimension, temporal dependencies are extracted with a gated time convolution module. To reduce the data redundancy caused by extracting spatial dependencies from multiple angles and to improve the robustness of the model, a feature-separation idea is introduced. Finally, extensive experiments on two real traffic data sets verify the feasibility and effectiveness of the designed network.
The technical scheme of the invention is as follows:
A traffic flow prediction method based on a multi-view space-time graph convolutional network, characterized by comprising the following steps:
S1, acquiring a traffic data set and constructing a static graph adjacency matrix A_road of the road network; the traffic data set comprises the traffic data of each observation point in the road network at different moments;
S2, acquiring a trend similarity graph adjacency matrix A_dtw;
S3, constructing an MVSTGCN model:
the MVSTGCN model comprises an output module and a plurality of space-time feature extraction layers connected in series; space-time features at different scales are obtained by stacking the space-time feature extraction layers; the output of the current space-time feature extraction layer serves as the input of the next one and is also recorded into the output module; when all space-time feature extraction layers have extracted their features, the output module gives the prediction result;
each space-time feature extraction layer comprises a gated time convolution module and a multi-view graph convolution module connected in series, the multi-view graph convolution module extracting spatial features and the gated time convolution module extracting temporal features;
the input of the gated time convolution module is the traffic data processed by a linear conversion layer, and its output is connected to the multi-view graph convolution module; meanwhile, the static graph adjacency matrix A_road and the trend similarity graph adjacency matrix A_dtw also serve as inputs of the multi-view graph convolution module; a residual connection links the input of the gated time convolution module with the output of the multi-view graph convolution module;
the output module comprises at least two full connection layers (FC), and the activation function of the output module is ELU;
and S4, training the MVSTGCN model by using the traffic data set to finally obtain the MVSTGCN model for traffic flow prediction.
The specific process of step S1 is as follows:
S11, obtaining graph structure information G = (V, E) of the road network, wherein V = {v_1, v_2, ..., v_N} represents the set of observation points in the road network and E represents the set of edges, corresponding to the connection relationships between observation points; if <v_i, v_j> ∈ E, observation points v_i and v_j are directly connected by a road, otherwise the two observation points have no direct relation; this forms the static graph adjacency matrix A_road;
S12, acquiring traffic data of the road network and cleaning the data;
S13, dividing the cleaned traffic data into time slices, the traffic data of all observation points in one time slice corresponding to one feature matrix X^(t), expressed as X^(t) = [x_1^(t), x_2^(t), ..., x_N^(t)], where x_i^(t) represents the traffic data of observation point v_i in the t-th time slice and N is the number of observation points;
S14, stacking the data of the time slices to obtain the final traffic data set X = [X^(1), X^(2), ..., X^(Len)], where Len is the total number of time slices and is far larger than the input window length T of the MVSTGCN model.
The calculation process of the gated time convolution module is formula (2):
Z = Tanh(TCN(X′)) * σ(TCN(X′))   (2)
wherein: TCN denotes a dilated causal convolutional network; X′ denotes the feature matrix of the traffic data processed by the linear conversion layer; Z denotes the output of the gated time convolution module; Tanh and σ denote the Tanh and Sigmoid activation functions, respectively; * denotes element-wise multiplication;
the output module comprises two full connection layers (FC) and carries out nonlinear transformation on the space-time characteristics under different scales to obtain a final prediction result.
The multi-view graph convolution module consists of three multi-scale graph convolutional networks and an aggregation layer; the three multi-scale graph convolutional networks MS-GCN extract, respectively, the spatial features specific to the static graph, the spatial features specific to the trend similarity graph, and the common spatial features on both graphs; the operation is expressed as:
Q_rs = MS-GCN1(Z, A_road)   (7)
Q_ss = MS-GCN2(Z, A_dtw)   (8)
Q_rc = MS-GCN3(Z, A_road)   (9)
Q_sc = MS-GCN3(Z, A_dtw)   (10)
wherein: MS-GCN1, MS-GCN2 and MS-GCN3 denote the three multi-scale graph convolutional networks; Z denotes the output of the gated time convolution module; A_dtw and A_road denote the trend similarity graph adjacency matrix and the static graph adjacency matrix; Q_rs, Q_ss, Q_rc and Q_sc denote, respectively, the spatial features specific to the static graph, the spatial features specific to the trend similarity graph, the common spatial features of the static graph and the common spatial features of the trend similarity graph;
the multi-scale graph convolutional network is expressed as formula (6):
Q = Σ_{l=0}^{m} θ_l H^(l)   (6)
wherein θ_l denotes the weight of the l-th layer, obtained by the attention module; when l = 0, θ_0 = 1, representing the residual relationship of layer 0; Q denotes the output of the multi-scale graph convolutional network MS-GCN; m denotes the number of graph convolution layers; H^(l) denotes the output of the l-th layer of the graph convolutional network;
after the spatial features are obtained, they are spliced together and the spliced feature vector is input into the aggregation layer for feature aggregation, as in formula (11):
Q′ = Agg(Concat(Q_rs, Q_ss, Q_rc, Q_sc))   (11)
wherein: Concat denotes the vector splicing operation; Agg denotes the feature aggregation operation; Q′ is the output of the multi-view graph convolution module; the aggregation layer is a fully connected layer or channel attention.
The total loss L of the MVSTGCN model is the sum of the loss function L_gcn of the multi-view graph convolution module and the prediction loss L_model, i.e. L = L_model + L_gcn, where L_model is computed as:
L_model(Θ) = (1/(S·N·D)) Σ_{s=1}^{S} Σ_{i=1}^{N} Σ_{u=1}^{D} | Ŷ_{i,u}^{(t+s)} - Y_{i,u}^{(t+s)} |
wherein: S denotes the predicted time steps; N denotes the number of observation points; D denotes the dimension of the feature to be predicted, fixed to 1 in the traffic flow prediction task and representing the flow feature; Ŷ is the prediction result; Y is the real result; t denotes the t-th time slice; s denotes the predicted s-th time slice; Θ denotes the set of parameters of MVSTGCN;
the loss function L_gcn of the multi-view graph convolution module is defined as:
L_gcn = αL_1 - βL_2   (14)
wherein α and β denote hyperparameters; L_1 denotes the similarity of the common spatial features of the static graph and the trend similarity graph; L_2 denotes the sum of the similarity between the specific and common spatial features of the static graph and the similarity between the specific and common spatial features of the trend similarity graph; the similarities are computed with the F norm.
In S2, a dynamic time warping algorithm (DTW) is used to compute the similarity of the data sequences between every two observation points and to construct a trend similarity matrix Sim, where Sim_ij corresponds to the minimum alignment distance between observation points v_i and v_j;
a threshold function then selects the connectivity relation: when Sim_ij is smaller than the threshold, observation points v_i and v_j are set as directly connected, otherwise as not connected, finally giving the trend similarity graph adjacency matrix A_dtw;
alternatively, a top-K value is set and, for each row of the trend similarity matrix, the positions corresponding to the top K values of Sim_ij are set to 1 in the adjacency matrix and the rest to 0.
The dynamic time warping algorithm (DTW) requires a search step length to be set, limiting how far the DTW searches when finding the alignment relationship.
The invention also protects a computer-readable storage medium storing a computer program which, when loaded by a computer, is adapted to perform the traffic flow prediction method based on the multi-view space-time graph convolutional network.
Compared with the prior art, the invention has the beneficial effects that:
the invention designs a multi-view space-time graph convolution network for traffic flow prediction by utilizing an algorithm of a deep convolution neural network. The method takes traffic data, a static graph adjacency matrix and a trend similar graph adjacency matrix on a road network as input, extracts time features and space features alternately, and then uses an output module to perform feature fusion, thereby realizing accurate prediction of traffic flow. Compared with the STG-NCDE with the best prediction precision in the embodiment, the method obtains better prediction performance on the real traffic data sets PeMS03 and PeMS 08. Specifically, in the PeMS03 dataset, there was a 5.33% decrease in MAE, a 3.91% decrease in MAPE, and a 6.34% decrease in RMSE; on the PeMS08 dataset, MAE decreased by 1.49%, MAPE decreased by 1.51%, and RMSE decreased by 1.57% (the smaller the values of the three evaluation indices the better).
The invention extracts the spatial dependencies in the road network from different angles, capturing them from both global and local perspectives, and provides a multi-view space-time graph convolutional network for traffic flow prediction. The dependencies among road observation points are modeled from different angles, and the spatial features are then extracted by the multi-view graph convolution module, which strengthens the module's ability to capture spatial dependencies and improves the prediction performance of the model.
The invention provides a multi-view space-time graph convolutional network that can extract latent spatial dependencies and strengthen the prediction effect. The gated time convolution module consists mainly of two dilated causal convolutional networks activated by different activation functions; it is simple in structure, has no sequential dependency in its computation, supports parallel computation, trains quickly and accumulates no error.
Drawings
FIG. 1 is a schematic diagram of the structure of the dilated causal convolutional network TCN.
FIG. 2 is a schematic diagram of a gated time convolution module.
FIG. 3 is a schematic diagram of the structure of a multi-scale graph convolutional network.
FIG. 4 is a schematic diagram of the structure of the multi-view convolution module.
FIG. 5 is a schematic structural diagram of MVSTGCN model.
FIG. 6 is a graph of comparison results of single-step prediction of MAE.
FIG. 7 is a graph of comparison results of single-step prediction of MAPE.
FIG. 8 is a graph of the comparison results of single step prediction of RMSE.
Detailed Description
To make the technical scheme of the invention clearer, the invention is further explained below with reference to the drawings.
The method extracts the latent spatial dependency features in the traffic data with a multi-view graph convolution module and the latent temporal dependency features with a gated time convolution module; temporal and spatial dependency features are extracted alternately to obtain space-time dependencies at different scales, which are input into the output module for nonlinear data conversion, finally giving the prediction result. The multi-scale graph convolutional network performs message passing based on an adjacency matrix to capture spatial features. The multi-view graph convolution module consists of three multi-scale graph convolutional networks which extract specific and common spatial features from the traffic data based on the adjacency matrices, playing the role of feature separation; the adjacency matrices reflect the spatial dependencies. The output module consists of several fully connected layers and activation functions and performs nonlinear data conversion on the input space-time features.
The invention is realized by the following steps:
the method comprises the steps that firstly, traffic data of a certain area are obtained through a crawler script or an API interface provided by a traffic management department, preprocessing is carried out on the obtained traffic data, and then a traffic data set is constructed;
the road network is represented as follows: the invention relates to a traffic flow prediction problem, in particular to a multi-element time series prediction problem, aiming at predicting the traffic flow in a future period of time by historical traffic flow data N×N A adjacency matrix representing an undirected graph G, where A ij Represents an observation point v i And v j When in an adjacent relation of<v i ,v j >When E belongs to, the observation point v is indicated i And v j Have direct road communication between them, do not need to pass through other observation points, A ij The value of (1) is 1 or the distance between two observation points, otherwise is 0, which indicates that there is no direct relationship between the two observation points. X (t) ∈R N×D The characteristic matrix is formed by data at each observation point at the t-th moment, D represents the characteristic number of the data at the observation point, and the characteristics usually comprise flow, speed and occupancy rate;
and the traffic data at each observation point in the road network at different moments form a traffic data set.
The road network refers to the system of interconnected, meshed roads within a certain area. For a traffic prediction task, the area to be predicted is selected; the connection relationships among the roads in this area form the road network, the positions where traffic data are collected in the road network are called observation points, the connection relationships among observation points can be read off the road network, and the static graph adjacency matrix A_road then represents the connection relationships between the observation points.
Second, determining the input and output of the model: for the traffic flow prediction task, the input of the model is the historical traffic data X^{(t-T+1):(t)} ∈ R^{N×D×T} of the N observation points over the past T time steps; the goal is to learn a mapping function f that outputs the traffic flow data X^{(t+1):(t+S)} ∈ R^{N×D×S} of the N observation points over the future S time steps, expressed as:
X^{(t+1):(t+S)} = f(X^{(t-T+1):(t)}, G)   (1)
Traffic flow prediction thus predicts the traffic data of the future S time steps from the traffic data of the past T time steps; since the historical data end at time t, the output starts at time t+1.
The third step: dividing the traffic data set and normalizing the data. Following the common split ratio, 60% of the data are used for training, 20% for validation and 20% for testing; the data are normalized with the Z-Score method and recorded as Temp_train, Temp_val and Temp_test respectively. For Temp_train, the traffic flow data of the first T time steps are taken as the input of sample 1 and the data of time steps T+1 to T+S as the real result of sample 1; the data of time steps T+1 to 2T are taken as the input of sample 2 and those of time steps 2T+1 to 2T+S as the real result of sample 2; and so on, giving the training set X_train and its corresponding real results Y. Temp_val and Temp_test are processed in the same way to obtain the validation set X_val and the test set X_test with their real results Y_val and Y_test. Each sample corresponds to the static graph adjacency matrix A_road and the trend similarity graph adjacency matrix A_dtw.
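For illustration, the following Python sketch builds the 60/20/20 split and the sliding-window samples described above; it is not part of the patent, and the array shapes, helper names and the choice of normalization statistics are assumptions:

```python
import numpy as np

def split_and_normalize(data):
    """60/20/20 split with Z-Score normalization; the statistics are taken
    from the whole series here for brevity (fitting them on the training
    part only is the usual alternative)."""
    n = len(data)
    train = data[: int(0.6 * n)]
    val = data[int(0.6 * n): int(0.8 * n)]
    test = data[int(0.8 * n):]
    mean, std = data.mean(), data.std()
    z = lambda part: (part - mean) / std
    return z(train), z(val), z(test)      # Temp_train, Temp_val, Temp_test

def make_samples(part, T=12, S=12):
    """Slice a (Len, N, D) array into (input, target) pairs: each sample
    uses T steps as input and the following S steps as the real result,
    with a stride of T as in the text."""
    xs, ys = [], []
    for t in range(0, len(part) - T - S + 1, T):
        xs.append(part[t:t + T])          # past T time steps
        ys.append(part[t + T:t + T + S])  # future S time steps
    return np.stack(xs), np.stack(ys)     # X_* and corresponding Y_*
```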
The fourth step: constructing the MVSTGCN model, i.e. the multi-view space-time graph convolutional network model:
the MVSTGCN model comprises an output module and a plurality of space-time feature extraction layers connected in series; space-time features at different scales are obtained by stacking the space-time feature extraction layers; the output of the current space-time feature extraction layer serves as the input of the next one and is also recorded into the output module; when all space-time feature extraction layers have extracted their features, the output module gives the prediction result;
each space-time feature extraction layer comprises a gated time convolution module and a multi-view graph convolution module connected in series, the multi-view graph convolution module extracting spatial features and the gated time convolution module extracting temporal features;
the input of the gated time convolution module is the traffic data processed by the linear conversion layer, and its output is connected to the multi-view graph convolution module; meanwhile, the static graph adjacency matrix A_road and the trend similarity graph adjacency matrix A_dtw also serve as inputs of the multi-view graph convolution module; the input of the gated time convolution module is connected to the output of the multi-view graph convolution module by a residual connection.
The output module comprises at least two full connection layers (FC), the activation function of the output module is ELU, and the final prediction result is obtained by carrying out nonlinear transformation on the space-time characteristics under different scales.
The overall structure of the MVSTGCN model is shown in FIG. 5. In FIG. 5, the traffic data, A_road and A_dtw are the inputs of the model; Linear denotes a linear conversion layer that performs a dimensional transformation on the traffic data so that it matches the input dimension of the model. Residual connection means that the original input and the output of each feature extraction layer are connected, preventing vanishing gradients during training, i.e. O = Z′ + g(Z′), where O denotes the final output of the space-time feature extraction layer, Z′ its input, and g(Z′) the output of Z′ after the gated time convolution module and the multi-view graph convolution module. ELU denotes the ELU activation function, written P(x) and computed as:
P(x) = x, if x > 0; P(x) = α(e^x - 1), if x ≤ 0
where x denotes the input of the ELU and α is a positive constant, usually 1. FC denotes a fully connected layer; two ELUs and two FCs constitute the output module. + denotes element-wise addition. The incomplete dashed box in the figure represents the stacked space-time feature extraction layers.
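The layer composition and the output module can be sketched in Python/PyTorch as follows; GatedTCN and MultiViewGCN refer to the module sketches given later in this description, all sizes are illustrative assumptions, and the shape bookkeeping between the temporal and spatial views is omitted for brevity:

```python
import torch.nn as nn

class STLayer(nn.Module):
    """One space-time feature extraction layer: gated time convolution
    followed by multi-view graph convolution, with the residual
    connection O = Z' + g(Z') described above."""
    def __init__(self, dim):
        super().__init__()
        self.tcn = GatedTCN(dim)      # temporal features (sketched below)
        self.gcn = MultiViewGCN(dim)  # spatial features (sketched below)

    def forward(self, z, a_road, a_dtw):
        return z + self.gcn(self.tcn(z), a_road, a_dtw)

class OutputModule(nn.Module):
    """Two ELU activations and two fully connected layers, applied to the
    accumulated skip-connection sum of all layer outputs."""
    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        self.net = nn.Sequential(nn.ELU(), nn.Linear(in_dim, hidden_dim),
                                 nn.ELU(), nn.Linear(hidden_dim, out_dim))

    def forward(self, skip):
        return self.net(skip)
```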
To capture the temporal dependencies in the traffic data, the invention uses a gated time convolution module to extract temporal features. The gated time convolution module (see FIG. 2) consists of two dilated causal convolutional networks TCN of identical structure, activated respectively by a Tanh and a Sigmoid activation function: the Tanh branch filters the output of the dilated causal convolution, while the Sigmoid branch acts as a gate controlling the proportion of information passed to the next layer. Dilated convolution lets the model process sufficiently long sequences with few parameters, and causal convolution guarantees that the model cannot see future information. The computation of the gated time convolution module is:
Z = Tanh(TCN(X′)) * σ(TCN(X′))   (2)
wherein: TCN denotes a dilated causal convolutional network; X′ denotes the feature matrix of the input traffic data processed by the linear conversion layer; Z denotes the output of the gated time convolution module; Tanh and σ denote the Tanh and Sigmoid activation functions, respectively; * denotes element-wise multiplication.
The structure of the dilated causal convolutional network is shown in FIG. 1. Each row in FIG. 1 represents the input of the current layer. Taking the input layer as an example, when a sequence of length 8 is input, features are extracted with a one-dimensional dilated causal convolution kernel of size 2 and dilation rate 1; the extracted data occupy the last seven positions of the second-to-last row, the first position being a padding bit (not needed in an actual implementation), and the first three positions of the third row are likewise padding bits. Some connections are omitted to keep the picture uncluttered; for example, the circle in the third column of the second-to-last row is connected to the circles in the second and third columns of the last row, and this connection is not drawn. Causal convolution means a connection exists only to the current position and preceding positions, which prevents future data from being seen. The dilation rate is the spacing between the input positions sampled by the convolution kernel (the number of skipped positions plus 1). The output layer is the first row; in the second row, because a dilated causal convolution kernel of size 2 and dilation rate 4 is used, the output layer receives only the computation results of the data in columns 8 and 4, the data of columns 5 to 7 being skipped. The computation at each position is:
TCN(x)_t = Σ_{a=0}^{k-1} r_a · x_{(t-d×a)}
wherein x ∈ R^T denotes the input and T the length of the input sequence (8 in the figure); r ∈ R^k denotes the parameter set of the convolution kernel, k being the kernel size (2 in the figure); t denotes the currently computed sequence position; d is the dilation rate of the convolution kernel; r_a denotes the a-th parameter of the kernel, indices starting at 0; x_{(t-d×a)} denotes position t-d×a of the input, t denoting the t-th time slice.
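A minimal PyTorch sketch of the gated time convolution module of formula (2), assuming the input is laid out as (batch, channels, time); the channel sizes and the single fixed dilation are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedTCN(nn.Module):
    """Gated dilated causal convolution: Z = Tanh(TCN(X')) * Sigmoid(TCN(X')),
    two dilated causal 1-D convolutions of identical structure (formula 2)."""
    def __init__(self, channels, kernel_size=2, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation  # left padding keeps it causal
        self.filter_conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.gate_conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x):               # x: (batch, channels, T)
        x = F.pad(x, (self.pad, 0))     # pad only the past side, never the future
        return torch.tanh(self.filter_conv(x)) * torch.sigmoid(self.gate_conv(x))
```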
To capture the latent spatial dependencies in the traffic data, a multi-view graph convolution module is designed. Graph convolutional networks generalize traditional convolutional neural networks from Euclidean data to graphs with non-Euclidean structure, enabling message passing on the graph and hence the extraction of spatial dependencies. The flow of the multi-view graph convolution module is as follows:
(1) Constructing the trend similarity graph: a trend similarity graph is built from the similarity of the flow change trends between observation points, as follows. The distances between every pair of observation points are computed; since the interactions of different observation points have a time lag, common measures such as cosine similarity and Euclidean distance cannot align the time series well. In this embodiment, the distance between every two observation points is computed with the dynamic time warping algorithm (DTW). Dynamic time warping aligns relations across time steps and handles well the sequence misalignment caused by the time-lag relation. It uses dynamic programming to compute the minimum distance between sequences, with the transfer equation:
γ(w, h) = dis(q_w, c_h) + min(γ(w-1, h-1), γ(w-1, h), γ(w, h-1))   (3)
wherein γ(w, h) denotes the minimum alignment distance between the time series (0, w) of observation point v_q and the time series (0, h) of observation point v_c; dis(q_w, c_h) denotes the distance (flow difference) between the w-th time step of observation point v_q and the h-th time step of observation point v_c, w and h denoting time steps.
With this algorithm the minimum distance between observation point sequences is computed, forming the trend similarity matrix Sim; a threshold function then selects the connectivity relation: when Sim_ij is smaller than the threshold, the corresponding position in the adjacency matrix is set to 1, otherwise 0, finally giving the trend similarity graph adjacency matrix A_dtw.
Most common distance measures use cosine similarity or Euclidean distance; DTW is used here to align relations across time steps, because traffic data exhibit time-lag relations from the perspective of spatial variation. Conventional DTW has time complexity O(n²) when computing the distance between sequences, which is expensive for time series.
In this embodiment, the search step length of the DTW when finding the alignment relation is preferably limited, for example to 12 time steps ahead, which reduces the complexity to O(n). Since the interactions between observation points in a traffic sequence all occur within a limited time range, limiting the search step length does not affect the final performance; the specific step length should be chosen experimentally, case by case.
As an example of the minimum distance between the i-th and j-th observation point sequences in the road network: if the time series of the two observation points both have length 12, then Sim_ij is γ(12, 12). The threshold is a hyperparameter of the network, since different thresholds yield matrices of different sparsity.
Alternatively, for each row of the trend similarity matrix, the positions in the adjacency matrix corresponding to the top K values of Sim_ij can be set to 1 and the rest to 0; K is also a hyperparameter, and different K likewise determines the sparsity of the matrix.
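The following Python sketch shows windowed DTW following transfer equation (3) and the two ways of turning the resulting matrix Sim into A_dtw; the window size, threshold and K are the hyperparameters discussed above, and the helper names are illustrative:

```python
import numpy as np

def dtw_distance(q, c, window=12):
    """Minimum alignment distance gamma(n, m) of equation (3), with the
    search restricted to `window` steps to bring the cost towards O(n)."""
    n, m = len(q), len(c)
    g = np.full((n + 1, m + 1), np.inf)
    g[0, 0] = 0.0
    for w in range(1, n + 1):
        for h in range(max(1, w - window), min(m, w + window) + 1):
            cost = abs(q[w - 1] - c[h - 1])  # dis(q_w, c_h): flow difference
            g[w, h] = cost + min(g[w - 1, h - 1], g[w - 1, h], g[w, h - 1])
    return g[n, m]

def trend_adjacency(series, threshold=None, top_k=None):
    """Build A_dtw from pairwise DTW distances, either by threshold or by
    keeping the K most similar (smallest-distance) neighbours per row."""
    N = len(series)
    sim = np.array([[dtw_distance(series[i], series[j]) for j in range(N)]
                    for i in range(N)])      # the trend similarity matrix Sim
    a = np.zeros((N, N))
    if top_k is not None:
        nearest = np.argsort(sim, axis=1)[:, :top_k]
        for i in range(N):
            a[i, nearest[i]] = 1.0
    else:
        a[sim < threshold] = 1.0
    return a
```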
(2) Constructing the multi-scale graph convolutional network MS-GCN: given an adjacency matrix A and a feature matrix Z, a graph convolutional network GCN extracts spatial features from Z; in each graph convolution layer, an attention module assigns a weight factor according to the information content of the current layer. The operation of a graph convolution layer is:
H^(l+1) = f(H^(l), A) = ReLU(A H^(l) W^(l+1) + b^(l+1))   (4)
H^(0) = Z ∈ R^{N×D}   (5)
wherein H^(l) denotes the output of the l-th layer of the graph convolutional network (l an integer between 0 and m), which is also the input of the (l+1)-th layer; W^(l+1) and b^(l+1) denote the learnable model parameters of layer l+1; ReLU denotes the ReLU activation function. To dynamically adjust the influence weights of different layers, the attention module assigns different weight factors according to the information content of the current layer; the structure of the multi-scale graph convolutional network MS-GCN is shown in FIG. 3, where the nodes on each circle represent the neighbour information to be aggregated at the current layer. The computation of the multi-scale graph convolutional network is finally expressed as:
Q = Σ_{l=0}^{m} θ_l H^(l)   (6)
wherein θ_l denotes the weight of the l-th layer, obtained by the attention module; when l = 0, θ_0 = 1, representing the residual relationship of layer 0; Q denotes the output of the multi-scale graph convolutional network MS-GCN; m denotes the number of graph convolution layers.
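A PyTorch sketch of the multi-scale graph convolutional network of formulas (4) to (6); the exact form of the attention module that produces θ_l is not spelled out in the text, so the scoring used here is only one plausible choice:

```python
import torch
import torch.nn as nn

class MSGCN(nn.Module):
    """m stacked graph convolution layers H^(l+1) = ReLU(A H^(l) W + b),
    combined as Q = sum_l theta_l * H^(l) with theta_0 = 1 (formula 6)."""
    def __init__(self, dim, m=3):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(dim, dim) for _ in range(m)])
        self.score = nn.Linear(dim, 1)  # assumed attention scoring per layer

    def forward(self, z, a):            # z: (N, dim), a: (N, N)
        outs, h = [z], z
        for layer in self.layers:
            h = torch.relu(layer(a @ h))             # formula (4)
            outs.append(h)
        thetas = [torch.ones(())]                    # theta_0 = 1, layer-0 residual
        thetas += [self.score(o.mean(dim=0)).squeeze() for o in outs[1:]]
        return sum(t * o for t, o in zip(thetas, outs))
```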
(3) Constructing the multi-view graph convolution module: on the basis of the multi-scale graph convolutional network, a multi-view graph convolution module is constructed; its structure is shown in FIG. 4. Specifically, the module consists of three multi-scale graph convolutional networks and an aggregation layer. The three multi-scale graph convolutional networks MS-GCN extract, respectively, the spatial features specific to the static graph, the spatial features specific to the trend similarity graph, and the common spatial features on both graphs:
Q_rs = MS-GCN1(Z, A_road)   (7)
Q_ss = MS-GCN2(Z, A_dtw)   (8)
Q_rc = MS-GCN3(Z, A_road)   (9)
Q_sc = MS-GCN3(Z, A_dtw)   (10)
wherein: MS-GCN1, MS-GCN2 and MS-GCN3 denote the three aforementioned multi-scale graph convolutional networks; Z denotes the feature matrix formed by the traffic data processed by the gated time convolution module; A_dtw and A_road denote the trend similarity graph adjacency matrix and the static graph adjacency matrix; Q_rs, Q_ss, Q_rc and Q_sc denote, respectively, the static-graph-specific spatial features, the trend-similarity-graph-specific spatial features, the static-graph common spatial features and the trend-similarity-graph common spatial features (subscript c stands for common, subscript r for the static graph, a first-position subscript s for the trend similarity graph, and a second-position subscript s for specific).
After the spatial features are obtained, they are spliced together, and the spliced feature vector is input into the aggregation layer for feature aggregation:
Q′ = Agg(Concat(Q_rs, Q_ss, Q_rc, Q_sc))   (11)
wherein: Concat denotes the vector splicing operation; Agg denotes the feature aggregation operation, which may be implemented by a fully connected layer or by channel attention.
(4) Setting the loss function: to ensure that the three multi-scale graph convolutional networks MS-GCN actually perform feature separation, corresponding loss functions must be defined to constrain what the model learns. Let Q_rc ∈ R^{N×D×T}, Q_sc ∈ R^{N×D×T}, Q_rs ∈ R^{N×D×T} and Q_ss ∈ R^{N×D×T} denote the static graph common spatial features, the trend similarity graph common spatial features, the static graph specific spatial features and the trend similarity graph specific spatial features, respectively. First the two common spatial features Q_rc and Q_sc are dimension-converted and then normalized, giving Q′_rc ∈ R^{N×DT} and Q′_sc ∈ R^{N×DT}. Since Q_rc and Q_sc represent the common spatial features of the static graph and of the trend similarity graph respectively, the two matrices should be highly similar; the measure is computed with the F norm:
L_1 = ||Q′_rc - Q′_sc||_F^2   (12)
The same measure also quantifies the difference between the spatially specific and the spatially common information: specific information and common information should have different characteristics, so a large difference should exist between the corresponding matrices. With Q′_rs and Q′_ss denoting the static graph specific and trend similarity graph specific spatial features after dimension conversion and normalization, the measure is again computed with the F norm:
L_2 = ||Q′_rs - Q′_rc||_F^2 + ||Q′_ss - Q′_sc||_F^2   (13)
The loss function L_gcn of the multi-view graph convolution module is finally defined as:
L_gcn = αL_1 - βL_2   (14)
wherein α and β denote hyperparameters used to control the influence of this loss on the model parameters; L_1 measures the closeness of the common spatial features of the static graph and the trend similarity graph, and L_2 the sum of the distances between the specific and common spatial features of the static graph and of the trend similarity graph.
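Under the reconstruction of formulas (12) and (13) above (F-norm distances between the flattened, normalized feature matrices), the separation loss can be sketched as:

```python
import torch
import torch.nn.functional as F

def separation_loss(q_rs, q_ss, q_rc, q_sc, alpha=1.0, beta=1.0):
    """L_gcn = alpha * L1 - beta * L2 (formula 14): pull the two common
    feature matrices together, push specific and common features apart."""
    flatten = lambda q: F.normalize(q.reshape(q.shape[0], -1), dim=-1)
    q_rs, q_ss, q_rc, q_sc = map(flatten, (q_rs, q_ss, q_rc, q_sc))
    l1 = torch.norm(q_rc - q_sc) ** 2                                 # formula (12)
    l2 = torch.norm(q_rs - q_rc) ** 2 + torch.norm(q_ss - q_sc) ** 2  # formula (13)
    return alpha * l1 - beta * l2
```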
The fifth step: training the model parameters. From Temp_train the trend similarity graph is constructed, giving the trend similarity graph adjacency matrix A_dtw. Then X_train, the trend similarity graph adjacency matrix A_dtw and the static graph adjacency matrix A_road are used as the inputs of the MVSTGCN model to obtain its prediction result (the two adjacency matrices take effect only in the multi-view graph convolution module and do not participate in the computation of the other modules); the loss is measured between the prediction result and the real result Y, and the MVSTGCN model parameters are updated with the back-propagation algorithm. The specific data flow is:
(1) The traffic data of the training set X_train undergo a dimension transformation, giving X′, which is input into the first space-time feature extraction layer;
(2) The space-time feature extraction layer first extracts temporal features from its input with the gated time convolution module, giving the temporal features Z; the multi-view graph convolution module then extracts spatial features from Z based on the trend similarity graph adjacency matrix A_dtw and the static graph adjacency matrix A_road, giving the space-time features Q′^(p) output by the current layer, Q′^(p) denoting the output of the p-th space-time feature extraction layer;
(3) The space-time features Q′^(p) are input into the next space-time feature extraction layer and simultaneously recorded in a variable Skip;
(4) Steps (2) and (3) are repeated until the data have passed through all space-time feature extraction layers;
(5) The variable Skip is input into the output module, and the output of the output module is the prediction result Ŷ of the model;
(6) The prediction loss L_model of the model is computed from the prediction result Ŷ and the real result Y; the loss function of the multi-view graph convolution module is added to the prediction loss to obtain the final model loss L = L_model + L_gcn, where L_model is computed as:
L_model(Θ) = (1/(S·N·D)) Σ_{s=1}^{S} Σ_{i=1}^{N} Σ_{u=1}^{D} | Ŷ_{i,u}^{(t+s)} - Y_{i,u}^{(t+s)} |
wherein: S denotes the predicted time steps; N denotes the number of observation points; D denotes the dimension of the feature to be predicted, fixed to 1 in the traffic flow prediction task and representing the flow feature; t denotes the t-th time slice; s denotes the predicted s-th time slice; Θ denotes all parameters of MVSTGCN; i denotes the i-th observation point; u denotes the u-th feature.
(7) And updating the parameters of the model by using a back propagation algorithm.
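One training iteration of the fifth step might look as follows; the MAE form of L_model follows the reconstruction above, and `model.layers` exposing each layer's cached multi-view features is an assumption carried over from the earlier sketches:

```python
import torch

def train_epoch(model, loader, a_road, a_dtw, optimizer, alpha=1.0, beta=1.0):
    """Forward pass, total loss L = L_model + L_gcn, back-propagation."""
    model.train()
    for x, y in loader:                             # inputs and real results
        optimizer.zero_grad()
        y_hat = model(x, a_road, a_dtw)
        l_model = torch.mean(torch.abs(y_hat - y))  # prediction loss (MAE)
        l_gcn = sum(separation_loss(*layer.gcn.features, alpha, beta)
                    for layer in model.layers)      # assumed module layout
        (l_model + l_gcn).backward()
        optimizer.step()
```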
Sixth step: verifying the performance of the model: the validation set X_val is input into the MVSTGCN model, and the prediction loss L_model of the MVSTGCN model on the validation set is computed.
Seventh step: the fifth and sixth steps are repeated until the prediction loss L_model on the validation set is smaller than a set threshold or the maximum number of repetitions is reached.
Eighth step: testing the prediction performance of the MVSTGCN model: the test set X_test is input into the MVSTGCN model, the traffic flow is predicted, and the performance of the final MVSTGCN model is evaluated.
(1) MVSTGCN model overall performance assessment
The invention performs experiments on two real traffic flow data sets, PeMS03 and PeMS08. The detailed information of the data sets is shown in Table 1; both data sets use 5 minutes as one time step.
Table 1: data set information
(Table content not reproduced in the available text.)
The effectiveness and the accuracy of the traffic flow prediction method based on the multi-view space-time graph convolutional network are verified through comparison experiments on a real data set, and the experimental results are shown in table 2.
The invention is compared with other reference methods, including FC-LSTM, DCRNN, STGCN, ASTGCN, STG2Seq, Graph WaveNet, STSGCN, STFGNN, ZGCNETS and STG-NCDE (all English abbreviations of models in the published literature). Three measures are used for evaluation: mean absolute error (MAE), root mean square error (RMSE) and mean absolute percentage error (MAPE); smaller values of all three indices are better. The corresponding formulas are:
MAE = (1/S) Σ_{s=1}^{S} | y_s - ŷ_s |
RMSE = sqrt( (1/S) Σ_{s=1}^{S} (y_s - ŷ_s)^2 )
MAPE = (100%/S) Σ_{s=1}^{S} | (y_s - ŷ_s) / y_s |
wherein: y_s is the observed traffic flow value, ŷ_s is the predicted traffic flow value, S denotes the length of the time series to be predicted, and s denotes the time step of the s-th prediction, ranging over (1, S).
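Using the definitions above, the three metrics can be computed as follows; the small epsilon guarding division by zero flow is an implementation detail, not part of the patent:

```python
import numpy as np

def evaluate(y, y_hat, eps=1e-8):
    """MAE, RMSE and MAPE over all predicted steps and observation points."""
    mae = np.mean(np.abs(y - y_hat))
    rmse = np.sqrt(np.mean((y - y_hat) ** 2))
    mape = np.mean(np.abs((y - y_hat) / (y + eps))) * 100.0
    return mae, rmse, mape
```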
From Table 2 it can be seen that the prediction performance of the MVSTGCN model is superior to the other baseline models on all three indicators on both data sets. By extracting spatial dependencies from multiple angles, MVSTGCN obtains spatial features with richer semantics, which gives the model better prediction performance.
Table 2: performance comparison results of MVSTGCN and reference model
(Table content not reproduced in the available text.)
To further verify the performance of MVSTGCN, single-step prediction errors were compared on the PeMS08 data set against the two most advanced models, ZGCNETS and STG-NCDE, and the comparison results were plotted as line graphs. The single-step prediction results for MAE, MAPE and RMSE are shown in FIGS. 6 to 8.
(2) To assess and understand the effects and performance of key components in the MVSTGCN model proposed by the present invention, an ablation study was performed on the PeMS03 dataset, naming the variants of MVSTGCN as follows:
MVSTGCN/w.o.R: only the static graph adjacency matrix is used and the multi-view graph convolution module is removed, leaving an ordinary graph convolutional network; dependencies constructed from other angles are not considered.
MVSTGCN/w.o.C: on the basis of a single adjacency matrix, the multi-scale graph convolutional network replaces the ordinary graph convolutional network, i.e. an attention module is added to the ordinary graph convolutional network of MVSTGCN/w.o.R, forming the multi-scale graph convolutional network.
MVSTGCN/w.o.D: both the static graph adjacency matrix and the trend similarity graph are used for spatial dependency extraction, but without the shared-parameter part of FIG. 4, i.e. only two MS-GCNs are used.
The results of the ablation experiments are shown in table 3.
Table 3: ablation experiment
(Table content not reproduced in the available text.)
The experimental results show that the multi-view graph convolution module, the multi-scale graph convolutional network and the feature-separation idea are all important to the performance of MVSTGCN. The fusion of the spatial features on the trend similarity graph and the static graph proves the effectiveness of multi-view feature extraction, and the multi-scale graph convolutional network dynamically adjusts the weights of the model's intermediate outputs, achieving an adaptive effect. The MVSTGCN model achieves better traffic flow prediction results.
Anything not described in detail in this specification belongs to the prior art known to those skilled in the art.

Claims (8)

1. A traffic flow prediction method based on a multi-view space-time graph convolutional network, characterized by comprising the following steps:
S1, acquiring a traffic data set and constructing a static graph adjacency matrix A_road of the road network; the traffic data set comprises the traffic data of each observation point in the road network at different moments;
S2, acquiring a trend similarity graph adjacency matrix A_dtw;
S3, constructing an MVSTGCN model:
the MVSTGCN model comprises an output module and a plurality of space-time feature extraction layers connected in series; space-time features at different scales are obtained by stacking the space-time feature extraction layers; the output of the current space-time feature extraction layer serves as the input of the next one and is also recorded into the output module; when all space-time feature extraction layers have extracted their features, the output module gives the prediction result;
each space-time feature extraction layer comprises a gated time convolution module and a multi-view graph convolution module connected in series, the multi-view graph convolution module extracting spatial features and the gated time convolution module extracting temporal features;
the input of the gated time convolution module is the traffic data processed by the linear conversion layer, and its output is connected to the multi-view graph convolution module; meanwhile, the static graph adjacency matrix A_road and the trend similarity graph adjacency matrix A_dtw also serve as inputs of the multi-view graph convolution module; a residual connection links the input of the gated time convolution module with the output of the multi-view graph convolution module;
the output module comprises at least two full connection layers (FC), and the activation function of the output module is ELU;
S4, training the MVSTGCN model with the traffic data set, finally obtaining the MVSTGCN model for traffic flow prediction.
2. The traffic flow prediction method based on the multi-view space-time graph convolutional network of claim 1, wherein the specific process of step S1 is:
S11, obtaining graph structure information G = (V, E) of the road network, wherein V = {v_1, v_2, ..., v_N} represents the set of observation points in the road network and E represents the set of edges, corresponding to the connection relationships between observation points; if <v_i, v_j> ∈ E, observation points v_i and v_j are directly connected by a road, otherwise the two observation points have no direct relation; this forms the static graph adjacency matrix A_road;
S12, acquiring traffic data of the road network and cleaning the data;
S13, dividing the cleaned traffic data into time slices, the traffic data of all observation points in one time slice corresponding to one feature matrix X^(t), expressed as X^(t) = [x_1^(t), x_2^(t), ..., x_N^(t)], where x_i^(t) represents the traffic data of observation point v_i in the t-th time slice and N is the number of observation points;
S14, stacking the data of the time slices to obtain the final traffic data set X = [X^(1), X^(2), ..., X^(Len)], where Len is the total number of time slices and is far larger than the input window length T of the MVSTGCN model.
3. The traffic flow prediction method based on the multi-view space-time graph convolutional network of claim 1, wherein the calculation process of the gated time convolution module is formula (2):
Z = Tanh(TCN(X′)) * σ(TCN(X′))   (2)
wherein: TCN denotes a dilated causal convolutional network; X′ denotes the feature matrix of the traffic data processed by the linear conversion layer; Z denotes the output of the gated time convolution module; Tanh and σ denote the Tanh and Sigmoid activation functions, respectively; * denotes element-wise multiplication;
the output module comprises two full connection layers (FC) and performs a nonlinear transformation on the space-time features at different scales to obtain the final prediction result.
4. The traffic flow prediction method based on the multi-view space-time graph convolutional network of claim 1, wherein
the multi-view graph convolution module consists of three multi-scale graph convolutional networks and an aggregation layer; the three multi-scale graph convolutional networks MS-GCN extract, respectively, the spatial features specific to the static graph, the spatial features specific to the trend similarity graph, and the common spatial features on both graphs; the operation is expressed as:
Q_rs = MS-GCN1(Z, A_road)   (7)
Q_ss = MS-GCN2(Z, A_dtw)   (8)
Q_rc = MS-GCN3(Z, A_road)   (9)
Q_sc = MS-GCN3(Z, A_dtw)   (10)
wherein: MS-GCN1, MS-GCN2 and MS-GCN3 denote the three multi-scale graph convolutional networks; Z denotes the output of the gated time convolution module; A_dtw and A_road denote the trend similarity graph adjacency matrix and the static graph adjacency matrix; Q_rs, Q_ss, Q_rc and Q_sc denote, respectively, the spatial features specific to the static graph, the spatial features specific to the trend similarity graph, the common spatial features of the static graph and the common spatial features of the trend similarity graph;
the multi-scale graph convolutional network is expressed as formula (6):
Q = Σ_{l=0}^{m} θ_l H^(l)   (6)
wherein θ_l denotes the weight of the l-th layer, obtained by the attention module; when l = 0, θ_0 = 1, representing the residual relationship of layer 0; Q denotes the output of the multi-scale graph convolutional network MS-GCN; m denotes the number of graph convolution layers; H^(l) denotes the output of the l-th layer of the graph convolutional network;
after the spatial features are obtained, they are spliced together and the spliced feature vector is input into the aggregation layer for feature aggregation, as in formula (11):
Q′ = Agg(Concat(Q_rs, Q_ss, Q_rc, Q_sc))   (11)
wherein: Concat denotes the vector splicing operation; Agg denotes the feature aggregation operation; Q′ is the output of the multi-view graph convolution module; the aggregation layer is a fully connected layer or channel attention.
5. The method according to claim 4, wherein the total loss L of the MVSTGCN model is the sum of the loss function L_gcn of the multi-view graph convolution module and the prediction loss L_model, i.e. L = L_model + L_gcn, where L_model is computed as:
L_model(Θ) = (1/(S·N·D)) Σ_{s=1}^{S} Σ_{i=1}^{N} Σ_{u=1}^{D} | Ŷ_{i,u}^{(t+s)} - Y_{i,u}^{(t+s)} |
wherein: S denotes the predicted time steps; N denotes the number of observation points; D denotes the dimension of the feature to be predicted, fixed to 1 in the traffic flow prediction task and representing the flow feature; Ŷ is the prediction result; Y is the real result; t denotes the t-th time slice; s denotes the predicted s-th time slice; Θ denotes the set of parameters of MVSTGCN;
the loss function L_gcn of the multi-view graph convolution module is defined as:
L_gcn = αL_1 - βL_2   (14)
wherein α and β denote hyperparameters; L_1 denotes the similarity of the common spatial features of the static graph and the trend similarity graph; L_2 denotes the sum of the similarity between the specific and common spatial features of the static graph and the similarity between the specific and common spatial features of the trend similarity graph; the similarities are computed with the F norm.
6. The traffic flow prediction method based on the multi-view space-time graph convolutional network of claim 1, wherein in S2 a dynamic time warping algorithm (DTW) is used to compute the similarity of the data sequences between every two observation points and to construct a trend similarity matrix Sim, where Sim_ij corresponds to the minimum alignment distance between observation points v_i and v_j;
a threshold function then selects the connectivity relation: when Sim_ij is smaller than the threshold, observation points v_i and v_j are set as directly connected, otherwise as not connected, finally giving the trend similarity graph adjacency matrix A_dtw;
alternatively, a top-K value is set and, for each row of the trend similarity matrix, the positions corresponding to the top K values of Sim_ij are set to 1 in the adjacency matrix and the rest to 0.
7. The traffic flow prediction method based on the multi-view space-time graph convolutional network of claim 6, wherein a search step length must be set for the dynamic time warping algorithm (DTW), limiting the DTW when finding the alignment relationship.
8. A computer-readable storage medium having stored therein a computer program adapted, when loaded by a computer, to perform the traffic flow prediction method based on the multi-view space-time graph convolutional network of any one of claims 1 to 7.
CN202310132783.8A 2023-02-20 2023-02-20 Traffic flow prediction method based on multi-view space-time graph convolutional network Pending CN115953902A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310132783.8A CN115953902A (en) 2023-02-20 2023-02-20 Traffic flow prediction method based on multi-view space-time graph convolutional network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310132783.8A CN115953902A (en) 2023-02-20 2023-02-20 Traffic flow prediction method based on multi-view space-time graph convolutional network

Publications (1)

Publication Number Publication Date
CN115953902A true CN115953902A (en) 2023-04-11

Family

ID=87289454

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310132783.8A Pending CN115953902A (en) Traffic flow prediction method based on multi-view space-time graph convolutional network

Country Status (1)

Country Link
CN (1) CN115953902A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117236492A (en) * 2023-09-06 2023-12-15 西南交通大学 Traffic demand prediction method based on dynamic multi-scale graph learning
CN117236492B (en) * 2023-09-06 2024-03-12 西南交通大学 Traffic demand prediction method based on dynamic multi-scale graph learning


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination