CN116129646B - Traffic prediction method of a graph convolutional neural network based on feature crossing

Traffic prediction method of a graph convolutional neural network based on feature crossing

Info

Publication number: CN116129646B (granted from application CN202310161842.4A; also published as CN116129646A)
Original language: Chinese (zh)
Inventors: 胡海兵, 韩恺, 吴本伟
Assignee: University of Science and Technology of China (USTC)
Legal status: Active (granted)

Classifications

    • G08G 1/0129 — Traffic data processing for creating historical data or processing based on historical data
    • G06N 3/08 — Neural networks; learning methods
    • G08G 1/0137 — Measuring and analysing of parameters relative to traffic conditions for specific applications
    • G08G 1/065 — Traffic control systems for road vehicles by counting the vehicles in a section of the road
    • Y02T 10/40 — Engine management systems (climate-change mitigation tagging)


Abstract

The invention relates to the technical field of traffic condition prediction and discloses a traffic prediction method using a graph convolutional neural network based on feature crossing. The current traffic situation is predicted by a traffic prediction model comprising an encoding module, a transform-attention layer and a decoding module; a traffic-domain data set reflecting traffic conditions is processed by the encoding module, the transform-attention layer and the decoding module in sequence to obtain the prediction result. In both the temporal and spatial dimensions, the method explicitly exploits the second- and third-order cross features of neighboring nodes, and can therefore capture the nonlinear characteristics of different neighbors explicitly. Through the end-to-end encoder-decoder framework, the spatial and temporal embedding vectors contribute not only first-order linearly weighted features but also pairwise second-order and third-order feature interactions, so that nonlinear relations in space and time are captured better.

Description

Traffic prediction method of a graph convolutional neural network based on feature crossing
Technical Field
The invention relates to the technical field of traffic condition prediction, and in particular to a traffic prediction method using a graph convolutional neural network based on feature crossing.
Background
Traffic prediction is a fundamental and critical problem in the field of intelligent transportation systems. The main task is to predict traffic flow and vehicle speed over different future time periods from the historical spatio-temporal network sequence data of a region, which matters for reducing traffic accidents and improving public safety. As vehicle use grows, people depend increasingly on intelligent navigation and intelligent transportation systems, and traffic prediction can save time and help avoid congestion. However, historical traffic data involves complex network nodes and is high-dimensional, highly complex and spatio-temporally dependent, which makes predicting parameters such as traffic flow and speed difficult.
Traffic prediction has received sustained attention for many years, and especially since deep learning became popular the related methods have grown more diverse. The current mainstream traffic prediction methods include the following:
(1) Methods based on traditional machine learning, such as the k-nearest-neighbor algorithm and the SVM. These methods train quickly. However, being shallow models they cannot exploit high-order nonlinear features well and cannot capture the dependencies in spatio-temporal data, so their accuracy is poor.
(2) Methods based on convolutional neural networks (CNNs) can capture higher-order nonlinear features to some extent, but are poorly suited to non-Euclidean road network data.
(3) Methods based on graph neural networks (GNNs), of which there are currently two types: one operates in the spectral domain and the other in the spatial (vertex) domain. The diffusion convolutional recurrent neural network (DCRNN) replaces the fully connected layers of a GRU with diffusion graph convolutions. The structure-learning convolutional neural network (SLCNN) extends the traditional CNN model with spatio-temporal graph structure learning and learns a dynamic graph structure. The spatio-temporal synchronous graph convolutional network (STSGCN) effectively captures complex local spatio-temporal correlations through a carefully designed spatio-temporal synchronous modeling mechanism. The attention-based spatio-temporal graph convolutional network (ASTGCN) uses a spatio-temporal attention mechanism to learn the spatio-temporal correlations of dynamic traffic data. The graph multi-attention network (GMAN) adopts an encoder-decoder structure overall. Graph WaveNet is built by stacking dilated one-dimensional convolutions and can handle very long sequences. However, none of these methods uses cross-feature information, so they find it harder to capture explicit high-order nonlinear features.
Disclosure of Invention
To solve the above technical problems, the invention provides a new aggregation operator for graph neural networks that captures higher-order crossing information and explicitly exploits the second- and third-order cross features between nodes, so that indicators such as speed and flow in the region over a future period (for example, 15, 30 and 60 minutes) are estimated more accurately during traffic prediction.
In order to solve the technical problems, the invention adopts the following technical scheme:
In the traffic prediction method of a graph convolutional neural network based on feature crossing, the current traffic situation is predicted by a traffic prediction model comprising an encoding module, a transform-attention layer and a decoding module; the encoding module and the decoding module each comprise k cross spatio-temporal attention layers, and each cross spatio-temporal attention layer comprises L attention heads; a traffic-domain data set reflecting traffic conditions is processed by the encoding module, the transform-attention layer and the decoding module in sequence to obtain the prediction result.
The method specifically comprises the following steps:
Step one: construct a matrix of dimension (P+Q)×N×C from the traffic-domain data set, where P is the number of steps of the historical time series in the data set, Q is the number of steps of the future time series to be predicted, N is the number of nodes, N = |V|, V is the set of all nodes in the data set, and C is the attribute dimension of the data set. Construct a network graph G = (V, E, A) from the data set, where A is the adjacency matrix and E is the set of connected edges. Obtain a spatial embedding for each node v_i with the Node2vec algorithm and map it through two fully connected neural network layers to e^S_{v_i} ∈ R^D, where R^D denotes the set of real vectors of dimension D; obtain a temporal embedding e^T_{t_j} ∈ R^D from the time features of the data set; concatenate and map e^S_{v_i} and e^T_{t_j} into the spatio-temporal joint vector e_{v_i,t_j} ∈ R^D, where v_i is the i-th node of the network graph and t_j the j-th time, with 1 ≤ i ≤ N and 1 ≤ j ≤ P+Q. The dimension of the matrix input to the encoding module is P×N×D.
inputting the matrix into a coding module, and sequentially performing a second step, a third step and a fourth step;
Step two: obtain the weights between each node v_i (1 ≤ i ≤ N) of the network graph G and its neighbor nodes:

For node v_i, collect each neighbor node v ∈ N(v_i), where N(v_i) is the set of all neighbors of v_i. After initializing the hidden state vectors of v_i and its neighbors v, iterate through the k cross spatio-temporal attention layers of the encoding module to obtain the hidden state vectors h^{(k)}_{v_i,t_j} and h^{(k)}_{v,t_j} after k iterations.

In each iteration, concatenate the current hidden state vector of v_i (respectively v) with the spatio-temporal joint vector e_{v_i,t_j} (respectively e_{v,t_j}) to obtain a concatenated vector; then compute, through the L attention heads, the similarity s^{(l)}_{v_i,v} between the concatenated vector of v_i and that of each neighbor v; finally normalize s^{(l)}_{v_i,v} to a weight a^{(l)}_{v_i,v} ∈ (0, 1).
Step three: update the spatial embedding of node v_i by aggregating the hidden state vectors of its neighbors v with the weights a^{(l)}_{v_i,v}, realizing aggregation of v_i in the spatial dimension; the updated spatial embedding of v_i is denoted H_S.
Step four: obtain weights by computing the similarity between node v_i at time t_j and node v_i at each historical time t (t < t_j), and from them the updated temporal embedding of v_i, realizing aggregation of v_i in the temporal dimension; the updated temporal embedding of v_i is denoted H_T.
Step five: fuse the updated spatial embedding H_S and the updated temporal embedding H_T of node v_i into the spatio-temporal vector H^{(k)} of v_i. Through the multi-head attention of the transform-attention layer, compute the correlation between the spatio-temporal joint vectors e_{v_i,t} of the historical time series (t = t_1, …, t_P) and e_{v_i,t_j} of the future time series (t_j = t_{P+1}, …, t_{P+Q}), obtain the normalized weights, and weight-sum the spatio-temporal vectors H^{(k)} of all historical times of v_i to obtain H^{(k+1)}; the dimension of the matrix changes from P×N×D to Q×N×D.
Input the matrix output by the transform-attention layer into the decoding module and repeat steps two, three and four to obtain the spatio-temporal vector H^{(2k+1)} ∈ R^{Q×N×D} of the future time series; a two-layer fully connected neural network then yields the prediction Ŷ of the future time series.
Step one specifically comprises the following steps:

Obtain the spatial embedding of each node v_i of the network graph G with the Node2vec algorithm and map it through two fully connected neural network layers to e^S_{v_i} ∈ R^D.

One-hot encode the time features of the traffic-domain data set, concatenate the resulting vectors of different dimensions, and feed the concatenation into a fully connected neural network to obtain the temporal embedding e^T_{t_j} ∈ R^D.

Concatenate the spatial embedding e^S_{v_i} and the temporal embedding e^T_{t_j} and map the result through a fully connected neural network into the spatio-temporal joint vector e_{v_i,t_j} = f(e^S_{v_i} || e^T_{t_j}), where f is the mapping function and || denotes vector concatenation.
Specifically, in step two, during the (k−1)-th iteration the l-th (1 ≤ l ≤ L) attention head computes the similarity s^{(k-1,l)}_{v_i,v} between the concatenated vector of node v_i at the current time t_j and that of each neighbor v, where:

the difference term is the elementwise difference between the hidden state vector concatenated with the spatio-temporal joint vector of v_i and that of v at time t_j after k−1 iterations; the product term is their elementwise (dot-wise) product; h^{(k-1)}_{v_i,t_j} and h^{(k-1)}_{v,t_j} are the hidden state vectors of v_i and v at time t_j after k−1 iterations; e_{v,t_j} is the spatio-temporal joint vector of v; g_{v_i} = h^{(k-1)}_{v_i,t_j} || e_{v_i,t_j} and g_v = h^{(k-1)}_{v,t_j} || e_{v,t_j} are the concatenation results, with || denoting vector concatenation; W^{(l)}_{s,1} and W^{(l)}_{s,2} are learnable parameters and d = D/L.
Specifically, in step two, the similarity s^{(k-1,l)}_{v_i,v} is normalized by a softmax function to the weight a^{(k-1,l)}_{v_i,v} ∈ (0, 1).
Step three specifically comprises:

Step S31: for each node v_i, weight the hidden state vectors of all its neighbors v through the linear weighting function F and add them elementwise to obtain F = ||_{l=1..L} Σ_{v∈N(v_i)} a^{(k-1,l)}_{v_i,v} h^{(k-1)}_{v,t_j}, where W_F is a learnable parameter, || denotes vector concatenation across the L heads, and the result represents the spatial hidden state vector of v_i at time t_j after k iterations.

Step S32: for each node v_i, multiply the hidden state vectors of all its neighbors pairwise elementwise through the function T and sum, obtaining the second-order cross feature vector T = W_1 Σ_{u≠v; u,v∈N(v_i)} (h^{(k-1)}_{u,t_j} ⊙ h^{(k-1)}_{v,t_j}) / C_2, where C_2 = n(n−1)/2 is the total number of pairwise crossings of v_i, n = |N(v_i)| is the number of neighbors of v_i, ⊙ denotes elementwise multiplication, and W_1 is a learnable parameter.

Step S33: for each node v_i, multiply the embedding vectors of all its neighbors three at a time elementwise through the function E, use the HOFM algorithm to obtain the explicit third-order cross features, and sum, obtaining the third-order cross feature vector E = W_2 Σ_{u,v,w∈N(v_i); u≠v && v≠w && u≠w} (h^{(k-1)}_{u,t_j} ⊙ h^{(k-1)}_{v,t_j} ⊙ h^{(k-1)}_{w,t_j}) / C_3, where ⊙ multiplies corresponding positions of the vectors, "&&" means the conditions on both sides hold simultaneously, "!=" (≠) means the elements on both sides differ, C_3 is the number of triple crossings, and W_2 is a learnable parameter.

Step S34: blend the functions F, T and E through the hyperparameters α_1, α_2, α_3 to obtain the updated spatial embedding of node v_i: H_S = α_1 F + α_2 T + α_3 E, where H^{(k-1)} denotes the collection of all h^{(k-1)}_{v_i,t_j} after k−1 iterations.
Step four comprises the following steps:

Step S41: during the (k−1)-th iteration, compute through the l-th (1 ≤ l ≤ L) attention head the similarity s^{(k-1,l)}_{t,t_j} between node v_i at time t_j and node v_i at each historical time t, where:

the difference term is the elementwise difference between the hidden state vector concatenated with the spatio-temporal joint vector of v_i at time t_j and that at time t; the product term is their elementwise product; h^{(k-1)}_{v_i,t_j} and h^{(k-1)}_{v_i,t} are the hidden state vectors of v_i at times t_j and t after k−1 iterations; e_{v_i,t} is the spatio-temporal joint vector of v_i at time t; g_{t_j} = h^{(k-1)}_{v_i,t_j} || e_{v_i,t_j} and g_t = h^{(k-1)}_{v_i,t} || e_{v_i,t} are the concatenation results, with || denoting vector concatenation; W^{(l)}_{t,1} and W^{(l)}_{t,2} are learnable parameters and d = D/L.

Step S42: normalize the similarity s^{(k-1,l)}_{t,t_j} by a softmax function over all historical times t < t_j to the weight b^{(k-1,l)}_{t,t_j} ∈ (0, 1).

Step S43: for each node v_i, weight the hidden state vectors of its historical times through the linear weighting function F_t and add them elementwise to obtain F_t = ||_{l=1..L} Σ_{t<t_j} b^{(k-1,l)}_{t,t_j} h^{(k-1)}_{v_i,t}, where W_{F_t} is a learnable parameter and the result represents the temporal hidden state vector of v_i at time t_j after k iterations.

Step S44: for each node v_i, multiply the hidden state vectors of its historical times pairwise elementwise through the function T_t and sum to obtain T_t = W_3 Σ_{t≠t'; t,t'<t_j} (h^{(k-1)}_{v_i,t} ⊙ h^{(k-1)}_{v_i,t'}) / C_2, where C_2 = m(m−1)/2 is the number of pairwise crossings at time t_j, m is the number of time steps before t_j, ⊙ denotes elementwise multiplication, and W_3 is a learnable parameter.

Step S45: for each node v_i, multiply the hidden state vectors of its historical times three at a time elementwise and sum, using the HOFM algorithm to obtain the explicit third-order cross feature vector E_t = W_4 Σ_{t≠t', t'≠t'', t≠t''; t,t',t''<t_j} (h^{(k-1)}_{v_i,t} ⊙ h^{(k-1)}_{v_i,t'} ⊙ h^{(k-1)}_{v_i,t''}) / C_3, where C_3 is the number of triple crossings at time t_j and W_4 is a learnable parameter.

Step S46: blend the functions F_t, T_t and E_t through the hyperparameters β_1, β_2, β_3 into the updated temporal embedding of node v_i: H_T = β_1 F_t + β_2 T_t + β_3 E_t, where H^{(k-1)} denotes the collection of all hidden state vectors after k−1 iterations.
Step five comprises the following steps:

Step S51: for each node v_i, fuse H_S and H_T through the gate ζ to obtain the spatio-temporal vector H^{(k)} of v_i: H^{(k)} = ζ ⊙ H_S + (1−ζ) ⊙ H_T, with ζ = σ(H_S W_{η,1} + H_T W_{η,2} + b_η), where σ denotes the sigmoid function and W_{η,1}, W_{η,2}, b_η are learnable parameters.

Step S52: for node v_i, compute through an attention head the correlation λ^{(l)}_{t,t_j} between the spatio-temporal joint vectors e_{v_i,t} of the historical time series (t = t_1, …, t_P) and e_{v_i,t_j} of the future time series (t_j = t_{P+1}, …, t_{P+Q}), λ^{(l)}_{t,t_j} = ⟨f^{(l)}_1(e_{v_i,t}), f^{(l)}_2(e_{v_i,t_j})⟩ / √d, and obtain the normalized weight γ^{(l)}_{t,t_j} by softmax, where f^{(l)}_1 and f^{(l)}_2 are learnable projections and d = D/L.

Step S53: weight-sum the spatio-temporal vectors H^{(k)} of all historical times of node v_i with the weights γ^{(l)}_{t,t_j} to obtain H^{(k+1)} = ||_{l=1..L} Σ_{t=t_1}^{t_P} γ^{(l)}_{t,t_j} H^{(k)}_{v_i,t}; H^{(k+1)} denotes the collection over the (k+1)-th iteration, and the dimension of the matrix changes from P×N×D to Q×N×D.
Further, the MAE loss function L = (1/Q) Σ_{t=t_{P+1}}^{t_{P+Q}} |Y_t − Ŷ_t| is used to train the traffic prediction model, where Y_t is the ground-truth label of the training data set and Ŷ_t is the prediction on the training data set.
Compared with the prior art, the invention has the beneficial technical effects that:
The method models the nonlinear cross features of temporal and spatial nodes, captures high-order cross features through explicit second- and third-order crossing of neighboring nodes, and uses gated fusion to exploit spatial and temporal information adaptively, so node representations are more accurate and the accuracy of traffic prediction improves.
The invention provides a novel node aggregation technique for graph neural networks that achieves better representations by using nonlinear features explicitly, and an end-to-end technique using a multi-head attention mechanism that makes network training more stable.
Experimental results show that the traffic prediction method effectively improves the accuracy of traffic prediction across multiple metrics and multiple prediction horizons.
Drawings
FIG. 1 is a schematic diagram of a traffic prediction model according to the present invention.
Detailed Description
A preferred embodiment of the present invention will be described in detail with reference to the accompanying drawings.
As shown in FIG. 1, in the traffic prediction method of the invention the current traffic situation is predicted by a traffic prediction model comprising an encoding module, a transform-attention layer and a decoding module; the encoding module and the decoding module each comprise k cross spatio-temporal attention layers, and each cross spatio-temporal attention layer comprises L attention heads; the traffic-domain data set reflecting traffic conditions is processed by the encoding module, the transform-attention layer and the decoding module in sequence to obtain the prediction result.
The method specifically comprises the following steps:
S1, construct a matrix of dimension (P+Q)×N×C from a traffic-domain data set (such as METR-LA, PEMS-BAY, PEMS-04 or PEMS-08), where P is the number of steps of the historical time series, Q the number of steps of the future time series to be predicted, N the number of nodes, N = |V|, V the set of all nodes, and C the attribute dimension of the data set. Construct a network graph G = (V, E, A) from the data set, where A is the adjacency matrix, A_ij = 1 indicates an edge between node i and node j, A_ij = 0 indicates no edge, and E is the set of all connected edges.

Obtain the spatial embedding of each node v_i with the Node2vec algorithm and map it through two fully connected neural network layers to e^S_{v_i} ∈ R^D, where R^D denotes the set of real vectors of dimension D; obtain the temporal embedding e^T_{t_j} ∈ R^D from the time features of the data set; concatenate and map e^S_{v_i} and e^T_{t_j} into the spatio-temporal joint vector e_{v_i,t_j} ∈ R^D, where v_i is the i-th node and t_j the j-th time, with 1 ≤ i ≤ N and 1 ≤ j ≤ P+Q. The dimension of the matrix input to the encoding module is P×N×D.
The matrix is input into the encoding module, and steps S2, S3 and S4 are performed in sequence; these steps constitute the processing of the cross spatio-temporal attention layer.
Step S1 specifically comprises the following steps:

Obtain the spatial embedding of each node v_i of the network graph G with the Node2vec algorithm and map it through two fully connected neural network layers to e^S_{v_i} ∈ R^D.

One-hot encode the time features of the traffic-domain data set, concatenate the resulting vectors of different dimensions, and feed the concatenation into a fully connected neural network to obtain the temporal embedding e^T_{t_j} ∈ R^D. For example, one-hot encoding the time-of-day and day-of-week features yields an R^T vector and an R^7 vector, which are concatenated into an R^{T+7} vector.

Concatenate the spatial embedding e^S_{v_i} and the temporal embedding e^T_{t_j} and map the result through a fully connected neural network into the spatio-temporal joint vector e_{v_i,t_j} = f(e^S_{v_i} || e^T_{t_j}), where f is the mapping function and || denotes vector concatenation.
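As a concrete illustration of step S1, the sketch below builds the spatial, temporal and joint embeddings in plain NumPy. The Node2vec output is replaced by a random stand-in, and the sizes (N=4, D=8, T=288 slots per day), the untrained layer weights and all variable names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: N nodes, model dimension D, T time slots per day.
N, D, T = 4, 8, 288

def fc(x, w, b):
    """One fully connected layer with ReLU, used to map embeddings into R^D."""
    return np.maximum(x @ w + b, 0.0)

# Spatial embeddings: stand-in for Node2vec output (random here), mapped
# through two fully connected layers to R^D as in step S1.
node2vec_emb = rng.normal(size=(N, 16))
w1, b1 = rng.normal(size=(16, D)), np.zeros(D)
w2, b2 = rng.normal(size=(D, D)), np.zeros(D)
e_spatial = fc(fc(node2vec_emb, w1, b1), w2, b2)             # (N, D)

# Temporal embedding: one-hot time-of-day (R^T) and day-of-week (R^7),
# concatenated to R^(T+7) and mapped to R^D by a fully connected layer.
def time_embedding(slot, weekday, w, b):
    tod = np.eye(T)[slot]
    dow = np.eye(7)[weekday]
    return fc(np.concatenate([tod, dow]), w, b)              # (D,)

wt, bt = rng.normal(size=(T + 7, D)), np.zeros(D)
e_time = time_embedding(slot=17, weekday=2, w=wt, b=bt)

# Spatio-temporal joint vector: concatenate the spatial and temporal
# embeddings and map back to R^D with another fully connected layer.
wj, bj = rng.normal(size=(2 * D, D)), np.zeros(D)
e_joint = fc(np.concatenate([e_spatial[0], e_time]), wj, bj)  # (D,)
```

In a trained model the weights would be learned; the sketch only shows the shapes flowing through the construction.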
S2, obtain the weights between each node v_i (1 ≤ i ≤ N) of the network graph G and its neighbor nodes:

For node v_i, collect each neighbor node v ∈ N(v_i), where N(v_i) is the set of all neighbors of v_i. After initializing the hidden state vectors of v_i and its neighbors v, iterate through the k cross spatio-temporal attention layers of the encoding module to obtain the hidden state vectors h^{(k)}_{v_i,t_j} and h^{(k)}_{v,t_j} after k iterations.

In each iteration, concatenate the current hidden state vector of v_i (respectively v) with the spatio-temporal joint vector e_{v_i,t_j} (respectively e_{v,t_j}) to obtain a concatenated vector; then compute, through the L attention heads, the similarity s^{(l)}_{v_i,v} between the concatenated vector of v_i and that of each neighbor v; finally normalize s^{(l)}_{v_i,v} to a weight a^{(l)}_{v_i,v} ∈ (0, 1).
Specifically, in step S2, during the (k−1)-th iteration the l-th (1 ≤ l ≤ L) attention head computes the similarity s^{(k-1,l)}_{v_i,v} between the concatenated vector of node v_i at the current time t_j and that of each neighbor v, where:

the difference term is the elementwise difference between the hidden state vector concatenated with the spatio-temporal joint vector of v_i and that of v at time t_j after k−1 iterations; the product term is their elementwise (dot-wise) product; h^{(k-1)}_{v_i,t_j} and h^{(k-1)}_{v,t_j} are the hidden state vectors of v_i and v at time t_j after k−1 iterations; e_{v,t_j} is the spatio-temporal joint vector of v; g_{v_i} = h^{(k-1)}_{v_i,t_j} || e_{v_i,t_j} and g_v = h^{(k-1)}_{v,t_j} || e_{v,t_j} are the concatenation results, with || denoting vector concatenation; W^{(l)}_{s,1} and W^{(l)}_{s,2} are learnable parameters and d = D/L.
Specifically, in step S2, the similarity s^{(k-1,l)}_{v_i,v} is normalized by a softmax function to the weight a^{(k-1,l)}_{v_i,v} ∈ (0, 1).
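A minimal single-head sketch of the attention weighting in step S2, under the assumption (the exact formula is not fully legible in the text) that the similarity is a scaled dot product of learnable projections of the concatenated vectors h || e; all sizes, weights and names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
D, L = 8, 2          # model dimension and number of attention heads
d = D // L           # per-head dimension, d = D / L

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

# Hidden states h and spatio-temporal joint vectors e for node v_i (row 0)
# and its three neighbors (random stand-ins for the iterated values).
n_neighbors = 3
h = rng.normal(size=(1 + n_neighbors, D))
e = rng.normal(size=(1 + n_neighbors, D))
g = np.concatenate([h, e], axis=1)           # concatenation h || e, (4, 2D)

# One head: project both sides of each (v_i, neighbor) pair and take a
# scaled dot product as the similarity, then softmax over the neighbors.
w_q = rng.normal(size=(2 * D, d))            # stand-in for W_{s,1}
w_k = rng.normal(size=(2 * D, d))            # stand-in for W_{s,2}
q = g[0] @ w_q                               # query from node v_i
k = g[1:] @ w_k                              # keys from its neighbors
sim = (k @ q) / np.sqrt(d)                   # similarity per neighbor
att = softmax(sim)                           # weights normalized to (0, 1)
```

The L heads would each use their own projections and be concatenated downstream, as the per-head dimension d = D/L suggests.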
S3, update the spatial embedding of node v_i by aggregating the hidden state vectors of its neighbors v with the weights a^{(k-1,l)}_{v_i,v}, realizing aggregation of v_i in the spatial dimension; the updated spatial embedding of v_i is denoted H_S.

Step S3 specifically comprises the following steps:

Step S31: for each node v_i, weight the hidden state vectors of all its neighbors v through the linear weighting function F and add them elementwise to obtain F = ||_{l=1..L} Σ_{v∈N(v_i)} a^{(k-1,l)}_{v_i,v} h^{(k-1)}_{v,t_j}, where W_F is a learnable parameter, || denotes vector concatenation across the L heads, and the result represents the spatial hidden state vector of v_i at time t_j after k iterations.

Step S32: for each node v_i, multiply the hidden state vectors of all its neighbors pairwise elementwise through the function T and sum, obtaining the second-order cross feature vector T = W_1 Σ_{u≠v; u,v∈N(v_i)} (h^{(k-1)}_{u,t_j} ⊙ h^{(k-1)}_{v,t_j}) / C_2, where C_2 = n(n−1)/2 is the total number of pairwise crossings of v_i, n = |N(v_i)| is the number of neighbors of v_i, ⊙ denotes elementwise multiplication, and W_1 is a learnable parameter.

Step S33: for each node v_i, multiply the embedding vectors of all its neighbors three at a time elementwise through the function E, use the HOFM algorithm to obtain the explicit third-order cross features, and sum, obtaining the third-order cross feature vector E = W_2 Σ_{u,v,w∈N(v_i); u≠v && v≠w && u≠w} (h^{(k-1)}_{u,t_j} ⊙ h^{(k-1)}_{v,t_j} ⊙ h^{(k-1)}_{w,t_j}) / C_3, where ⊙ multiplies corresponding positions of the vectors, "&&" means the conditions on both sides hold simultaneously, "!=" (≠) means the elements on both sides differ, C_3 is the number of triple crossings, and W_2 is a learnable parameter.

Step S34: blend the functions F, T and E through the hyperparameters α_1, α_2, α_3 to obtain the updated spatial embedding of node v_i: H_S = α_1 F + α_2 T + α_3 E, where H^{(k-1)} denotes the collection of all h^{(k-1)}_{v_i,t_j} after k−1 iterations.
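The three aggregation branches of steps S31-S34 (first-order weighting, pairwise crossing, triple crossing) can be sketched by direct enumeration; HOFM would compute the same high-order sums more efficiently. Weights, hyperparameters and variable names below are illustrative stand-ins:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(2)
D = 8
neighbors = rng.normal(size=(4, D))          # hidden states of 4 neighbors
att = np.full(4, 0.25)                       # attention weights from step S2
W1 = rng.normal(size=(D, D))                 # learnable, second-order branch
W2 = rng.normal(size=(D, D))                 # learnable, third-order branch
a1, a2, a3 = 1.0, 0.5, 0.25                  # hyperparameters alpha_1..3

# F: first-order linear weighting of the neighbors by the attention weights.
H_F = (att[:, None] * neighbors).sum(axis=0)

# T: explicit second-order crossing — elementwise products of all neighbor
# pairs, averaged over the n(n-1)/2 pairs, then projected by W1.
pairs = list(combinations(range(len(neighbors)), 2))
H_T2 = np.mean([neighbors[i] * neighbors[j] for i, j in pairs], axis=0) @ W1

# E: explicit third-order crossing over all neighbor triples, projected by W2
# (direct enumeration; HOFM reaches the same sums without enumerating).
triples = list(combinations(range(len(neighbors)), 3))
H_E3 = np.mean([neighbors[i] * neighbors[j] * neighbors[k]
                for i, j, k in triples], axis=0) @ W2

# S34: blend the three orders with the hyperparameters alpha.
H_S = a1 * H_F + a2 * H_T2 + a3 * H_E3
```

The temporal branch (steps S43-S46) is structurally identical, with historical time steps of one node playing the role of the neighbors.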
S4, obtaining weight by calculating similarity between a node v i at time t j and a node v i at history time t, t < t j And further obtain a time embedded vector updated by the node v i, realize the aggregation of the node v i in the time dimension, and record the time embedded vector updated by the node v i as/>
Specifically, step S4 specifically includes the steps of:
Step S41: in the (k-1) th iteration process, calculating the similarity of the node v i at the time t j and the node v i at the history time through the (L, 1, L, L) attention mechanism modules
Differences between hidden state vectors and space-time joint vectors of the node v i at the time t j and the node v i at the time t,/>Point multiplication of hidden state vector and space-time joint vector of node v i at time t j and node v i at time t respectively,/>The hidden state vectors are respectively a node v i at the time t j and a node v i at the time t and are subjected to k-1 times of iteration; /(I)A space-time joint vector of a node v i at the time t; /(I)For/>Post-splice results,/>For/>The result after stitching, | represents vector stitching,/>And/>D=d/L as a learnable parameter.
Step S42: similarity is determined by a softmax functionNormalized to a weight of 0 to 1/>
Wherein,
Step S43: for each node v i, the hidden state vectors at each historical moment of the node v i are weighted and added bit by bit through a linear weighting function F to obtain vectors
Is a learnable parameter,/>Representing the time hidden state vector of node v i after k iterations at time t j. Step S44: for each node v i, multiplying the hidden state vectors of the historical moment of the node v i bit by two and summing the hidden state vectors through a function T t to obtain a vector/>
Wherein,Represents the number of crossings at time t j,/>Indicating the number of time steps before time t j, +..
Step S45: for each node v i, multiplying the hidden state vector of the historical moment of the node v i by three bits and three bits, summing, and obtaining a third-order display cross feature vector by adopting HOFM algorithm
Wherein the method comprises the steps ofRepresenting the number of crossings at time t j, W 4 represents a learnable parameter.
Step S46: the outputs of the functions F_t, T_t and E_t are blended through the hyperparameters β_1, β_2, β_3 to obtain the updated time embedding vector H_T of node v_i, where H^(k-1) denotes the collection of all hidden state vectors after k-1 iterations.
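Step S46 reduces to a scalar blend of the three temporal feature vectors. The β values below mirror the α_1..α_3 = 0.3/0.4/0.3 setting reported later for the spatial branch; the actual β values are not given numerically in this excerpt, so they are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
D = 64
f_t = rng.normal(size=D)   # output of F_t (first-order weighted sum)
t_t = rng.normal(size=D)   # output of T_t (second-order cross feature)
e_t = rng.normal(size=D)   # output of E_t (third-order cross feature)

beta = (0.3, 0.4, 0.3)     # assumed blending hyperparameters beta_1..beta_3
h_T = beta[0] * f_t + beta[1] * t_t + beta[2] * e_t   # updated time embedding
```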
S5, the updated spatial embedding vector and the updated temporal embedding vector of node v_i are blended to obtain the space-time vector H^(k) of node v_i; the multi-head attention mechanism of the attention mechanism transformation layer calculates the correlation between the space-time joint vectors of the historical time series of node v_i (t = t_1, …, t_P) and those of the future time series (t_j = t_{P+1}, …, t_{P+Q}) and obtains normalized weights; the space-time vectors H^(k) at all historical times of node v_i are then weighted and summed by these weights to give H^(k+1), changing the matrix dimension from P×N×D to Q×N×D.
The step S5 specifically comprises the following steps:
Step S51: for each node v_i, H_S is blended with H_T through a gate ζ to obtain the space-time vector H^(k) of node v_i, where ζ = σ(W_{η,1} H_S + W_{η,2} H_T + b_η), σ represents the sigmoid function, and W_{η,1}, W_{η,2}, b_η represent learnable parameters.
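A sketch of the gated fusion in step S51. The gate form ζ = σ(H_S·W₁ + H_T·W₂ + b) is a reconstruction consistent with the sigmoid and the W_η parameters named in the text (the gate used in GMAN-style models), not a verbatim quote of the patent's formula.

```python
import numpy as np

rng = np.random.default_rng(4)
D = 64
H_S = rng.normal(size=D)            # updated spatial embedding of v_i
H_T = rng.normal(size=D)            # updated temporal embedding of v_i
W1 = rng.normal(size=(D, D)) * 0.1  # hypothetical learnable parameters
W2 = rng.normal(size=(D, D)) * 0.1
b = np.zeros(D)

zeta = 1.0 / (1.0 + np.exp(-(H_S @ W1 + H_T @ W2 + b)))  # element-wise gate in (0, 1)
H_fused = zeta * H_S + (1.0 - zeta) * H_T                # space-time vector H^(k)
```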
Step S52: for node v_i, an attention mechanism module calculates the correlation between the space-time joint vector of the historical time series of node v_i (t = t_1, …, t_P) and the space-time joint vector of the future time series (t_j = t_{P+1}, …, t_{P+Q}), and obtains normalized weights; the projection matrices are all learnable parameters, and d = D/L.
Step S53: the space-time vectors H^(k) at all historical times of node v_i are weighted by these normalized weights and summed to give H^(k+1), denoted as the collection after k+1 iterations; the dimensions of the matrix change from P×N×D to Q×N×D.
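Steps S52-S53 amount to cross attention in which the future time steps query the historical ones, re-weighting the encoder output so a P×D slice becomes a Q×D slice (per node). A single-node, single-head sketch, with projection matrices omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(5)
P, Q, D = 12, 12, 64
e_hist = rng.normal(size=(P, D))   # space-time joint vectors, t_1..t_P
e_fut = rng.normal(size=(Q, D))    # space-time joint vectors, t_{P+1}..t_{P+Q}
H_k = rng.normal(size=(P, D))      # encoder space-time vectors H^(k) for one node

scores = e_fut @ e_hist.T / np.sqrt(D)                 # (Q, P) correlations
w = np.exp(scores - scores.max(axis=1, keepdims=True))
w /= w.sum(axis=1, keepdims=True)                      # row-wise softmax weights
H_next = w @ H_k                                       # (Q, D): weighted history sum
```

Applied to all N nodes, this is exactly the P×N×D → Q×N×D change the text describes.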
S6, the matrix output by the attention mechanism transformation layer is input into the decoding module, and the second, third and fourth steps are repeated to obtain the embedding vector H^(2k+1) ∈ R^{Q×N×D} of the future time series; the prediction result of the future time series is then obtained through a two-layer fully connected neural network.
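The output head in S6 can be sketched as a two-layer fully connected network mapping the decoder's Q×N×D tensor to a Q×N×C prediction. The hidden width and ReLU nonlinearity are assumptions; the text only states that two fully connected layers are used.

```python
import numpy as np

rng = np.random.default_rng(6)
Q, N, D, C = 12, 30, 64, 1
H = rng.normal(size=(Q, N, D))         # decoder output H^(2k+1)
W1 = rng.normal(size=(D, D)) * 0.1     # hypothetical learnable parameters
b1 = np.zeros(D)
W2 = rng.normal(size=(D, C)) * 0.1
b2 = np.zeros(C)

hidden = np.maximum(H @ W1 + b1, 0.0)  # first fully connected layer + ReLU
Y_hat = hidden @ W2 + b2               # predicted traffic values, Q x N x C
```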
Further, the MAE (mean absolute error) loss function L = (1/Q) Σ_{t=t_{P+1}}^{t_{P+Q}} |Y_t − Ŷ_t| is adopted to train the traffic prediction model, where Y_t is the actual label of the training dataset and Ŷ_t is the prediction result of the training dataset.
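A minimal MAE computation matching the training loss described above: the mean absolute error between labels and predictions over the predicted steps (averaging over nodes as well, a common convention).

```python
import numpy as np

Y = np.array([[1.0, 2.0], [3.0, 4.0]])      # toy labels, shape (Q, N)
Y_hat = np.array([[1.5, 2.0], [2.0, 4.5]])  # toy predictions

mae = np.abs(Y - Y_hat).mean()              # (0.5 + 0.0 + 1.0 + 0.5) / 4 = 0.5
```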
When the traffic prediction model is trained, the following parameters are adopted:
the number of steps P of the historical time series is equal to the number of steps Q of the future time series: p=q=12;
The number of cross spatiotemporal attention mechanism layers k=3;
dimension d=64 of the attribute feature;
α1=0.3,α2=0.4,α3=0.3;
the number of training rounds is 500, the early-stopping round number is 20, the learning rate is 0.0001, and the optimizer is the Adam optimizer.
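The training setup listed above can be gathered into a configuration, with the early-termination rule (stop after 20 rounds without validation improvement) spelled out. The validation-loss sequence is synthetic and the optimizer step itself is omitted; only the reported hyperparameters come from the text.

```python
# Hyperparameters reported for training the traffic prediction model.
config = {
    "P": 12, "Q": 12, "k": 3, "D": 64,
    "alpha": (0.3, 0.4, 0.3),
    "epochs": 500, "patience": 20,
    "lr": 1e-4, "optimizer": "Adam",
}

def epochs_run(val_losses, patience):
    """Return how many epochs run before early stopping triggers."""
    best, wait = float("inf"), 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, wait = loss, 0        # improvement: reset the patience counter
        else:
            wait += 1                   # no improvement this round
            if wait >= patience:
                return epoch
    return len(val_losses)

# With a loss that stops improving after epoch 5, training halts at 5 + patience.
losses = [1.0, 0.8, 0.6, 0.5, 0.4] + [0.4] * 100
stopped_at = epochs_run(losses, config["patience"])
```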
The invention provides a new aggregation operator for graph neural networks that explicitly exploits the second-order and third-order cross features of neighboring nodes in both the temporal and spatial dimensions, so that the nonlinear characteristics of different neighboring nodes are captured explicitly. Through an end-to-end framework of an encoding module and a decoding module, the invention uses not only first-order linear weighted features but also pairwise second-order and triple-wise third-order feature interactions for the spatial and temporal embedding vectors, so that nonlinear relations in space and time are captured better. The encoding module and the decoding module each contain k cross space-time attention mechanism layers, and each cross space-time attention mechanism layer comprises a cross spatial attention mechanism, a cross temporal attention mechanism and an attention fusion gate. An attention mechanism transformation layer is designed between the encoding module and the decoding module, converting the output of the encoding module into the input of the decoding module. In addition, the invention blends the graph structure and the time information through a space-time embedding layer.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.
Furthermore, it should be understood that although the present disclosure is described in terms of embodiments, not every embodiment contains only a single independent technical solution; this manner of description is adopted for clarity only, and those skilled in the art should take the description as a whole, as the technical solutions in the embodiments may be combined appropriately to form other embodiments.

Claims (8)

1. A traffic prediction method of a graph convolution neural network based on feature intersection, characterized in that traffic conditions are predicted through a traffic prediction model comprising an encoding module, an attention mechanism transformation layer and a decoding module; the encoding module and the decoding module each comprise k cross space-time attention mechanism layers, and each cross space-time attention mechanism layer comprises L attention mechanism modules; a traffic field dataset reflecting traffic conditions is processed by the encoding module, the attention mechanism transformation layer and the decoding module in sequence to obtain a prediction result;
The method specifically comprises the following steps:
Step one, regarding each place in the traffic dataset as a node and the road between two places as an edge between the two nodes; if no road directly connects two places, the two nodes have no edge connection; constructing a matrix with dimensions (P+Q)×N×C from the traffic field dataset, where P is the number of steps of the historical time series in the traffic field dataset, Q is the number of steps of the future time series to be predicted, N is the number of all nodes in the traffic field dataset, N = |V|, V is the set of all nodes in the traffic field dataset, and C is the attribute dimension in the traffic field dataset; constructing a network graph G = (V, E, A) from the traffic field dataset, where A represents the adjacency matrix, A_ij = 1 indicates that there is an edge connection between node i and node j, A_ij = 0 indicates that there is no edge connection between node i and node j, and E is the set of all connected edges; obtaining the spatial embedding vector of each node v_i in the network graph according to the Node2vec algorithm, and obtaining the refined spatial embedding vector in R^D through two fully connected neural network layers, where R^D represents the set of real-valued vectors of dimension D; obtaining the time embedding vector according to the time characteristics of the traffic field dataset; splicing the spatial embedding vector and the time embedding vector and mapping them to the space-time joint vector, where v_i is the i-th node in the network graph and t_j represents the j-th time, with 1 ≤ i ≤ N and 1 ≤ j ≤ P+Q; the dimension of the matrix input to the encoding module is P×N×D;
inputting the matrix into a coding module, and sequentially performing a second step, a third step and a fourth step;
Step two, obtaining the weights between node v_i (1 ≤ i ≤ N) of the network graph and its neighbor nodes:
acquiring each neighbor node v of node v_i from the set of all neighbor nodes of node v_i; after the hidden state vectors of node v_i and its neighbor node v are initialized, iterating in sequence through the k cross space-time attention mechanism layers of the encoding module to obtain the hidden state vectors of node v_i and node v after k iterations;
in each iteration, the hidden state vector of node v_i and that of node v at the current time are each spliced with the corresponding space-time joint vector to obtain spliced vectors; the similarity between the spliced vector corresponding to node v_i and the spliced vector corresponding to each neighbor node v is then calculated through the L attention mechanism modules; finally the similarities are normalized to weights between 0 and 1;
Step three, aggregating the hidden state vectors of the neighbor nodes v through these weights to update the spatial embedding vector of node v_i, realizing the aggregation of node v_i in the spatial dimension; the updated spatial embedding vector of node v_i is recorded as H_S;
Step four, obtaining weights by calculating the similarity between node v_i at time t_j and node v_i at each historical time t, t < t_j, and further obtaining the updated time embedding vector of node v_i, realizing the aggregation of node v_i in the time dimension; the updated time embedding vector of node v_i is recorded as H_T;
Step five, blending the updated spatial embedding vector and the updated time embedding vector of node v_i to obtain the space-time vector H^(k) of node v_i; calculating, through the multi-head attention mechanism of the attention mechanism transformation layer, the correlation between the space-time joint vectors of the historical time series of node v_i and those of the future time series and obtaining normalized weights; weighting and summing the space-time vectors H^(k) at all historical times of node v_i by these weights to obtain H^(k+1), the dimension of the matrix changing from P×N×D to Q×N×D;
Step six, inputting the matrix output by the attention mechanism transformation layer into the decoding module, repeating steps two, three and four to obtain the space-time vector H^(2k+1) ∈ R^{Q×N×D} of the future time series, and obtaining the prediction result of the future time series through a two-layer fully connected neural network.
2. The traffic prediction method of a graph convolution neural network based on feature intersection according to claim 1, characterized in that step one specifically comprises the following steps:
obtaining the spatial embedding vector of each node v_i in the network graph through the Node2vec algorithm, and obtaining the refined spatial embedding vector through two fully connected neural network layers;
performing one-hot coding on the time features in the traffic field dataset, splicing the obtained vectors of different dimensions, and inputting the spliced vector into a fully connected neural network to obtain the time embedding vector;
after splicing the spatial embedding vector and the time embedding vector, mapping them through a fully connected neural network f into the space-time joint vector, where f is the mapping function.
3. The traffic prediction method of a graph convolution neural network based on feature intersection according to claim 1, characterized in that in step two, during the k-th iteration (using the results of iteration k-1), the l-th attention mechanism module calculates the similarity between the spliced vector corresponding to node v_i at the current time t_j and the spliced vector corresponding to each neighbor node v, wherein:
the similarity is built from the element-wise differences and point multiplications between the hidden state vectors (after k-1 iterations) and the space-time joint vectors of node v_i and node v at time t_j; the hidden state vectors of node v_i and node v at time t_j after k-1 iterations; the space-time joint vector of node v; and the results of splicing each hidden state vector with its space-time joint vector, where || represents vector splicing; the projection matrices are learnable parameters, and d = D/L.
4. The traffic prediction method of a graph convolution neural network based on feature intersection according to claim 3, characterized in that in step two, the similarities are normalized by a softmax function into weights between 0 and 1.
5. The traffic prediction method of a graph convolution neural network based on feature intersection according to claim 3, characterized in that step three specifically comprises the following steps:
Step S31: for each node v_i, the hidden state vectors of all neighbor nodes v of node v_i are weighted through a linear weighting function F and added bit by bit, the weighting matrices being learnable parameters and || representing vector splicing; the result is the spatial hidden state vector of node v_i at time t_j after k iterations;
Step S32: for each node v_i, the hidden state vectors of all neighbor nodes of node v_i are multiplied bit by bit, two at a time, through a function T and summed to obtain the second-order cross feature vector, where the total number of crossings of node v_i is determined by the number of neighbor nodes of v_i, ⊙ represents multiplying the corresponding positions in the vectors, and W_1 represents a learnable parameter;
Step S33: for each node v_i, the embedding vectors of all neighbor nodes of node v_i are multiplied bit by bit, three at a time, through a function E, the third-order explicit cross features being obtained with the HOFM algorithm and summed into the third-order cross feature vector; here ⊙ represents multiplying the corresponding positions in the vectors, && means that the conditions before and after the symbol hold simultaneously, != means that the elements before and after the symbol are different, and W_2 represents a learnable parameter;
Step S34: the outputs of the function F, the function T and the function E are blended through the hyperparameters α_1, α_2, α_3 to obtain the updated spatial embedding vector of node v_i, where H^(k-1) denotes the collection of all hidden state vectors after k-1 iterations.
6. The traffic prediction method of a graph convolution neural network based on feature intersection according to claim 1, characterized in that step four specifically comprises the following steps:
Step S41: in the k-th iteration, using the results of iteration k-1, the l-th (1 ≤ l ≤ L) attention mechanism module calculates the similarity between node v_i at time t_j and node v_i at each historical time t, t < t_j;
the similarity is built from the element-wise differences and point multiplications between the hidden state vectors (after k-1 iterations) and the space-time joint vectors of node v_i at times t_j and t; the hidden state vectors of node v_i at times t_j and t after k-1 iterations; the space-time joint vector of node v_i at time t; and the results of splicing each hidden state vector with its space-time joint vector, where || represents vector splicing; the projection matrices are learnable parameters, and d = D/L;
Step S42: the similarities are normalized by a softmax function over all historical times into weights between 0 and 1;
Step S43: for each node v_i, the hidden state vectors at each historical time of node v_i are weighted by these normalized weights through a linear weighting function F_t and added bit by bit; the weighting matrices are learnable parameters, and the result is the temporal hidden state vector of node v_i at time t_j after k iterations;
Step S44: for each node v_i, the hidden state vectors of the historical times of node v_i are multiplied bit by bit, two at a time, through a function T_t and summed to obtain the second-order cross feature vector, where the number of crossings at time t_j is determined by the number of time steps before time t_j, ⊙ represents multiplying the corresponding positions in the vectors, and W_3 represents a learnable parameter;
Step S45: for each node v_i, the hidden state vectors of the historical times of node v_i are multiplied bit by bit, three at a time, and summed, the HOFM algorithm being adopted to obtain the third-order explicit cross feature vector; the number of crossings at time t_j is determined by the time steps before t_j, and W_4 represents a learnable parameter;
Step S46: the outputs of the functions F_t, T_t and E_t are blended through the hyperparameters β_1, β_2, β_3 into the updated time embedding vector of node v_i, where H^(k-1) denotes the collection of all hidden state vectors after k-1 iterations.
7. The traffic prediction method of a graph convolution neural network based on feature intersection according to claim 1, characterized in that step five specifically comprises the following steps:
Step S51: for each node v_i, H_S is blended with H_T through a gate ζ to obtain the space-time vector H^(k) of node v_i, where ζ = σ(W_{η,1} H_S + W_{η,2} H_T + b_η), σ represents the sigmoid function, W_{η,1}, W_{η,2}, b_η represent learnable parameters, and ⊙ represents multiplying the corresponding positions in the vectors;
Step S52: for node v_i, an attention mechanism module calculates the correlation between the space-time joint vector of the historical time series of node v_i and the space-time joint vector of the future time series, and obtains normalized weights; the projection matrices are all learnable parameters, and d = D/L;
Step S53: the space-time vectors H^(k) at all historical times of node v_i are weighted by these normalized weights and summed to give H^(k+1), denoted as the collection after k+1 iterations; the dimensions of the matrix change from P×N×D to Q×N×D.
8. The traffic prediction method of a graph convolution neural network based on feature intersection according to claim 1, characterized in that the MAE (mean absolute error) loss function L = (1/Q) Σ |Y_t − Ŷ_t| is adopted to train the traffic prediction model, where Y_t is the actual label of the training dataset and Ŷ_t is the prediction result of the training dataset.
CN202310161842.4A 2023-02-21 2023-02-21 Traffic prediction method of graph convolution neural network based on feature intersection Active CN116129646B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310161842.4A CN116129646B (en) 2023-02-21 2023-02-21 Traffic prediction method of graph convolution neural network based on feature intersection


Publications (2)

Publication Number Publication Date
CN116129646A CN116129646A (en) 2023-05-16
CN116129646B true CN116129646B (en) 2024-05-10

Family

ID=86311735

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310161842.4A Active CN116129646B (en) 2023-02-21 2023-02-21 Traffic prediction method of graph convolution neural network based on feature intersection

Country Status (1)

Country Link
CN (1) CN116129646B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112989539A (en) * 2021-04-07 2021-06-18 上海交通大学 Traffic prediction method based on intersection transfer calculation
CN115273464A (en) * 2022-07-05 2022-11-01 湖北工业大学 Traffic flow prediction method based on improved space-time Transformer
CN115587454A (en) * 2022-10-24 2023-01-10 北京工商大学 Traffic flow long-term prediction method and system based on improved Transformer model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11842271B2 (en) * 2019-08-29 2023-12-12 Nec Corporation Multi-scale multi-granularity spatial-temporal traffic volume prediction


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A traffic flow prediction model based on sequence-to-sequence spatio-temporal attention learning; Du Shengdong, Li Tianrui, Yang Yan, Wang Hao, Xie Peng, Hong Xijin; Journal of Computer Research and Development; 2020-08-06 (No. 08); full text *


Similar Documents

Publication Publication Date Title
CN110991713B (en) Irregular area flow prediction method based on multi-graph convolution sum GRU
CN113313947B (en) Road condition evaluation method of short-term traffic prediction graph convolution network
CN113905391B (en) Integrated learning network traffic prediction method, system, equipment, terminal and medium
CN114299723B (en) Traffic flow prediction method
CN111783540B (en) Method and system for recognizing human body behaviors in video
CN115240425A (en) Traffic prediction method based on multi-scale space-time fusion graph network
CN113762595B (en) Traffic time prediction model training method, traffic time prediction method and equipment
CN115204478A (en) Public traffic flow prediction method combining urban interest points and space-time causal relationship
CN114969298A (en) Video question-answering method based on cross-modal heterogeneous graph neural network
CN113159414A (en) Traffic speed prediction method based on timing diagram neural network
CN114118375A (en) Continuous dynamic network characterization learning method based on time sequence diagram Transformer
CN113487018A (en) Global context enhancement graph neural network method based on session recommendation
CN114495500A (en) Traffic prediction method based on dual dynamic space-time diagram convolution
Tang et al. Spatio-temporal latent graph structure learning for traffic forecasting
CN115376317A (en) Traffic flow prediction method based on dynamic graph convolution and time sequence convolution network
CN116596109A (en) Traffic flow prediction model based on gating time convolution network
CN114238765A (en) Block chain-based position attention recommendation method
CN114385910A (en) Knowledge tracking based online learning content recommendation method and system
CN116129646B (en) Traffic prediction method of graph convolution neural network based on feature intersection
CN115456314B (en) Atmospheric pollutant space-time distribution prediction system and method
CN117131979A (en) Traffic flow speed prediction method and system based on directed hypergraph and attention mechanism
Zhou et al. ST-Attn: Spatial-temporal attention mechanism for multi-step citywide crowd flow prediction
CN112633607B (en) Dynamic space-time event prediction method and system
CN114911930A (en) Global and local complementary bidirectional attention video question-answering method and system
CN115146844A (en) Multi-mode traffic short-time passenger flow collaborative prediction method based on multi-task learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant