CN115762183A - Traffic speed prediction method based on geometric algebra and hypergraph

Info

Publication number: CN115762183A
Application number: CN202211370158.9A
Authority: CN
Inventors: 臧笛; 雷俊涛; 崔哲; 程久军; 张军旗
Original assignee: Tongji University
Current assignee: Tongji University
Priority date: 2022-11-03
Filing date: 2022-11-03
Publication date: 2023-03-07

Abstract

A traffic speed prediction method based on geometric algebra and hypergraphs. Step 1: traffic speed data are input into a model, and a hypergraph is constructed in the spatial feature extraction module by combining the K-means clustering result pre-trained on the whole training set with the traffic road network graph. Step 2: K layers of spatio-temporal feature extraction modules are constructed. Step 3: periodic information indicating the day of the week and the time of day to which the traffic data belong is embedded through two linear layers, the spatio-temporal features extracted by each layer of modules are combined, and the future traffic speed is predicted from the spatio-temporal features of the current input data through linear layers. Step 4: an optimized loss function combining two common loss functions is used, and the network parameters are continuously optimized through back propagation and gradient descent to minimize the loss function and finally obtain an optimal model. Applied in real scenarios, the invention can help traffic management departments strengthen traffic demand management, improve the comprehensive treatment of urban traffic congestion, and keep urban traffic flowing more smoothly.

Description

Traffic speed prediction method based on geometric algebra and hypergraph

Technical Field

The invention relates to the field of intelligent traffic, in particular to a traffic speed prediction method based on geometric algebra and hypergraph.

Background

The traffic system is one of the most important infrastructures of a modern city and supports the daily trips of millions of people. With rapid urbanization and population growth, traffic systems are becoming more complex. Modern transportation systems include road vehicles, rail transit, and the various shared travel modes that have emerged in recent years. Ever-expanding cities face a number of traffic-related problems, including air pollution and traffic congestion. Early intervention based on traffic prediction can be key to improving the efficiency of a traffic system and relieving related problems such as congestion.

Traffic speed prediction methods are mainly data-driven, and most of them rely on historical speed data. Traffic prediction is more challenging than other time series prediction problems because it involves large volumes of high-dimensional data and a variety of dynamic characteristics, including emergencies such as traffic accidents. These can make the traffic time series non-stationary, which makes long-term prediction difficult.

The traffic state at a specific location depends on both time and space. Traditional linear time series models, such as the autoregressive integrated moving average (ARIMA) model, do not handle this spatio-temporal prediction problem well. In recent years, machine learning and deep learning techniques have been introduced into the field to improve prediction accuracy, for example by modeling an entire city as a grid and applying a convolutional neural network. However, approaches based on convolutional neural networks are not an optimal solution for traffic networks, which have a graph structure.

In recent years, graph neural networks have moved to the frontier of deep learning research and have shown excellent performance in a variety of applications. Graph neural networks are well suited to traffic prediction because they can capture spatial dependencies and represent them with non-Euclidean graphs. Sensors in a road network generally have complex relationships: even two sensors that are very close in Euclidean space may behave very differently. A road network is therefore naturally a non-Euclidean graph, with road intersections as nodes and road connections as edges. By taking the graph as input, many models based on graph neural networks outperform traditional methods on problems such as road traffic flow and speed prediction.

Although state-of-the-art models can extract the spatio-temporal features of traffic data by combining a temporal feature extraction method with a graph convolution network, they still have shortcomings. For temporal information extraction, methods based on recurrent neural networks or convolutional neural networks are commonly used. The former are prone to vanishing or exploding gradients, and variants of the recurrent neural network such as the long short-term memory (LSTM) model consume excessive resources, are difficult to train, and cannot handle large numbers of longer-term sequence predictions. The latter avoid these problems of recurrent neural networks, but an ordinary convolutional neural network ignores the internal dependencies between different time slices and cannot model the external dependencies between the convolution kernel and the time slices, which limits its ability to mine and analyze massive, high-dimensional, correlated traffic data. For spatial information extraction, previous work often captured spatial features through a fixed graph structure. However, traffic prediction is a time series problem: the relationships between sensor nodes may change over time, and a fixed graph structure cannot reflect such changes. In addition, most current models are based on the conventional non-Euclidean graph structure, in which an edge can connect only two vertices. The relationships between road network sensors in traffic prediction are not purely pairwise; to a large extent they involve the joint action of several nodes, a high-order relationship that a conventional graph structure cannot capture.

Geometric algebra is a unified, covariant algebraic framework and an extension of vector algebra. By introducing the concepts of the multivector and the geometric product, geometric algebra can operate on higher-dimensional subspaces, expresses information in high-dimensional space and the interactions between such information uniformly and efficiently, and extends to spaces of arbitrary dimension. These properties make it suitable for encoding different time slices of traffic data, modeling the internal dependencies between time slices, and constructing the convolution kernels used in the convolution operation.

A hypergraph is a generalization of a graph. Unlike a simple graph, in which an edge connects exactly two nodes, each hyperedge of a hypergraph can connect any number of nodes. Because the degree of a hyperedge can exceed that of an edge in a simple graph, a hypergraph has a clear advantage over a pairwise graph structure when modeling correlations in real data.

Disclosure of Invention

To address the shortcomings of existing traffic speed prediction methods, the invention provides a traffic speed prediction method based on geometric algebra and hypergraphs. In the temporal dimension, it relies on the way the geometric algebra framework encodes high-dimensional data and expresses dependencies through rotation; in the spatial dimension, it relies on hypergraphs, which can represent high-order interactions. A group of hyperedges constructed by applying a multi-level clustering method to the data is combined with a group of hyperedges constructed from the road network structure to form a multi-dimensional high-order hypergraph, and hypergraph convolution is combined with diffusion graph convolution on the conventional traffic graph to extract higher-order spatial information. Multi-dimensional deep mining of the traffic speed data enables long-term prediction of traffic speed and thereby improves prediction accuracy.

The technical scheme to be protected in the invention is as follows:

A traffic speed prediction method based on geometric algebra and hypergraphs comprises the following steps:

Step 1, inputting traffic speed data into a model, raising the dimension of the speed data through a linear layer to convert each scalar into a vector, obtaining clusters through the unsupervised K-means clustering method, and constructing a hypergraph in the spatial feature extraction module by combining the clustering result pre-trained on the whole training set with the traffic road network graph.

The traffic speed data are input into the model and raised in dimension through a linear layer, converting each scalar into a vector so that the model can extract richer information. The unsupervised K-means clustering method is applied to the input data to obtain a clustering result, and the clustering result obtained by pre-training a multi-level K-means clustering method on the whole training set is used as one group of hyperedges of the hypergraph. At the same time, each node in the traffic network structure is combined with its neighbor nodes to construct a second group of hyperedges. The multi-level hypergraph formed by these two groups therefore contains both hyperedges based on traffic data and hyperedges based on the road network structure, and it is used in the hypergraph convolution of the spatio-temporal extraction module.

Step 2, constructing K layers of spatio-temporal feature extraction modules. In each module, a gated geometric algebra temporal convolution network based on the geometric algebra framework is constructed to extract the temporal features of the traffic data, and a multi-dimensional graph convolution network combining diffusion graph convolution and multi-level hypergraph convolution is used to extract the spatial features.

In the temporal information extraction module, different time slices are first encoded into multivectors of geometric algebra using a sliding time window, so that the model can capture the internal dependencies between time slices during convolution. The convolution kernel of the temporal convolution network is constructed from the description of multivector rotation in geometric algebra, so that the model can capture the external dependencies between the temporal data and the convolution kernel and mine temporal features more effectively. In the spatial information extraction module, two types of graph convolution at different levels are used. One is diffusion graph convolution based on an adaptive adjacency matrix and the conventional road network structure, which models the spatial relationships between pairs of nodes. The other is hypergraph convolution based on the multi-level hypergraph constructed in step 1, which models the spatial relationships among multiple nodes.

Step 3, embedding, through two linear layers, the periodic information indicating the day of the week and the time of day to which the traffic data belong, combining the spatio-temporal features extracted by each layer of modules, and predicting the future traffic speed from the spatio-temporal features of the current input data through linear layers.

Traffic data are strongly periodic: changes in traffic speed correlate strongly with the day of the week and the time of day. This periodicity can be embedded by constructing a network of two linear layers over the specific time information of the traffic data, and the embedding is used as auxiliary data for predicting future traffic speed. The spatio-temporal features extracted by the multiple layers of modules are combined so that the model does not suffer from vanishing or exploding gradients. The periodic features are then combined with the extracted recent spatio-temporal features, and the future traffic speed is predicted through linear layers.

Step 4, using an optimized loss function that combines two common loss functions, and continuously optimizing the network parameters through back propagation and gradient descent to minimize the loss function and finally obtain an optimal model.

The mean absolute error (MAE) and the root mean square error (RMSE) are combined: the loss for time slices in the real traffic speed data that show congestion or a sudden change is computed with RMSE, while the loss for the remaining slices with normal speed is computed with MAE. In this way, congested time slices are fitted better without the normal-speed time slices being affected by outliers. Finally, back propagation and gradient descent are performed to obtain an optimal model.

Drawings

FIG. 1 is a system flow diagram of a traffic prediction method based on geometric algebra and hypergraph.

FIG. 2 is a diagram of an application scenario of the present invention.

FIG. 3 is a diagram of an overall model structure based on geometric algebra and hypergraph in the present invention.

FIG. 4 is an example of a hypergraph.

FIG. 5 is a visual representation of bivectors in geometric algebra.

FIG. 6 is a visual representation of reflections in geometric algebra.

FIG. 7 is a graph showing the fitting of the predicted value and the true value of METR-LA on node 0.

FIG. 8 is a graph showing the fitting of the predicted value and the true value of METR-LA on node 22.

Detailed Description

To address the shortcomings of existing traffic speed prediction methods, the invention provides a traffic speed prediction method based on geometric algebra and hypergraphs. In the temporal dimension, it relies on the way the geometric algebra framework encodes high-dimensional data and expresses dependencies through rotation; in the spatial dimension, it relies on hypergraphs, which can represent high-order interactions. A group of hyperedges constructed by applying a multi-level clustering method to the data is combined with a group of hyperedges constructed from the road network structure to form a multi-dimensional high-order hypergraph, and hypergraph convolution is combined with diffusion graph convolution on the conventional traffic graph to extract higher-order spatial information. Multi-dimensional deep mining of the traffic speed data enables long-term prediction of traffic speed and thereby improves prediction accuracy. The specific procedure and its logical relationships are shown in FIG. 1.

Applied in real scenarios, the invention can help traffic management departments strengthen traffic demand management, improve the comprehensive treatment of urban traffic congestion, keep urban traffic flowing more smoothly, and make travel more comfortable for the public. The specific application scenario is shown in FIG. 2.

The invention provides a traffic speed prediction method based on geometric algebra and hypergraphs. Its overall architecture is shown in FIG. 3, and it comprises the following specific steps:

Step 1, inputting traffic speed data sampled by road speed sensors into a model, raising the dimension of the speed data through a linear layer to convert each scalar into a vector, obtaining clusters through the unsupervised K-means clustering method, and constructing a hypergraph in the spatial feature extraction module by combining the clustering result pre-trained on the whole training set with the traffic road network graph.

Step 1.1: Traffic speed data are sampled at regular time intervals by speed sensors on the road, and each speed sensor is a node in the non-Euclidean graph or hypergraph. The invention models the spatial dependence in traffic speed data using both a conventional non-Euclidean graph structure and a hypergraph structure. First, a hypergraph is constructed from the speed data and the traffic network structure.

A hypergraph is a generalization of a graph. Unlike a simple graph, in which an edge connects exactly two nodes, each hyperedge of a hypergraph can connect any number of nodes. Because the degree of a hyperedge can exceed that of an edge in a simple graph, a hypergraph has a clear advantage over a pairwise graph structure when modeling correlations in real data. An example of a hypergraph is shown in FIG. 4.

A hypergraph is constructed from the results of pre-training the K-means clustering method on the whole training set and the results of applying K-means clustering to the current input traffic speed data, with each cluster taken as a hyperedge.

K-means is an unsupervised clustering method that groups nodes with similar characteristics in the traffic data into the same cluster. The hyperedges constructed with it model nodes whose data are highly similar in the spatial dimension. Given a number of clusters K, the algorithm divides the data into K groups through repeated iteration. When K-means is pre-trained on the whole training set, several different cluster counts can be set to obtain different clustering results, which yields hyperedges of different granularities.

In addition, each traffic speed sensor node is combined with its neighbor nodes in the traffic network structure to form a hyperedge, which yields a group of hyperedges based on the traffic network structure. The two types of hyperedges together form a hypergraph that contains both data-based hyperedges and road-network-based hyperedges and can model spatial dependence in a multi-dimensional, high-order way. This hypergraph is used in the hypergraph convolution of the spatio-temporal feature extraction layer.
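The multi-granularity clustering described above can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the function names `kmeans` and `cluster_hyperedges`, the fixed seed, the iteration count, and the toy sensor features are all assumptions made for the example.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [points[i] for i in range(len(points)) if labels[i] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_hyperedges(points, cluster_counts):
    """One hyperedge (set of node indices) per non-empty cluster, for each K."""
    hyperedges = []
    for k in cluster_counts:
        labels = kmeans(points, k)
        for c in range(k):
            edge = {i for i, lab in enumerate(labels) if lab == c}
            if edge:
                hyperedges.append(edge)
    return hyperedges

# Six hypothetical sensors, each described by its mean speed in two periods.
speeds = [[60.0, 58.0], [62.0, 59.0], [61.0, 60.0],
          [20.0, 25.0], [22.0, 24.0], [21.0, 23.0]]
edges = cluster_hyperedges(speeds, cluster_counts=[2, 3])
```

Because each cluster count partitions the sensors, every node ends up in exactly one hyperedge per granularity, so coarse and fine groupings coexist in the same hypergraph.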
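Combining the two groups of hyperedges can be sketched as follows. The helper names, the toy adjacency matrix, and the example clusters are hypothetical; the patent does not specify how the hypergraph is stored, so the incidence-matrix representation used here is an assumption.

```python
def neighborhood_hyperedges(adjacency):
    """One hyperedge per node: the node plus its road-network neighbours."""
    n = len(adjacency)
    return [{i} | {j for j in range(n) if adjacency[i][j] != 0} for i in range(n)]

def incidence_matrix(num_nodes, hyperedges):
    """H[i][e] = 1 when node i belongs to hyperedge e, else 0."""
    return [[1 if i in edge else 0 for edge in hyperedges] for i in range(num_nodes)]

# Hypothetical 4-sensor chain 0 - 1 - 2 - 3 and two data-driven clusters.
adjacency = [[0, 1, 0, 0],
             [1, 0, 1, 0],
             [0, 1, 0, 1],
             [0, 0, 1, 0]]
cluster_edges = [{0, 3}, {1, 2}]
hyperedges = cluster_edges + neighborhood_hyperedges(adjacency)
H = incidence_matrix(4, hyperedges)
```

Each row of `H` describes one sensor and each column one hyperedge, so data-based and structure-based hyperedges sit side by side in a single matrix.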

Step 1.2: The model parameters are initialized, including the random initialization of two node embeddings $E_1$ and $E_2$ used to construct the adaptive adjacency matrix. The input traffic data are then raised in dimension through a fully connected layer, so that each speed value is converted from a scalar into a vector and the model can extract richer information from the data.

Step 2.1: Temporal features in the traffic data are extracted using a gated geometric algebra temporal convolution network based on the geometric algebra framework.

Geometric algebra is a unified, covariant algebraic framework and an extension of vector algebra. By introducing the concepts of the multivector and the geometric product, geometric algebra can operate on higher-dimensional subspaces, expresses information in high-dimensional space and the interactions between such information uniformly and efficiently, and extends to spaces of arbitrary dimension. These properties make it suitable for encoding different time slices of traffic data, modeling the internal dependencies between time slices, and constructing the convolution kernels used in the convolution operation.

Geometric algebra first introduces an operator called the outer product, denoted by the symbol $\wedge$. Given two vectors $a$ and $b$, the outer product $a \wedge b$ yields an oriented two-dimensional subspace, which is called a bivector in geometric algebra, as shown in FIG. 5.

A vector can be decomposed into a linear combination of basis vectors. Consider two vectors in the Euclidean plane $\mathbb{R}^2$, $a = (\alpha_1, \alpha_2) = \alpha_1 e_1 + \alpha_2 e_2$ and $b = (\beta_1, \beta_2) = \beta_1 e_1 + \beta_2 e_2$. Their outer product is:

$$a \wedge b = (\alpha_1 e_1 + \alpha_2 e_2) \wedge (\beta_1 e_1 + \beta_2 e_2)$$

By the distributive and anticommutative laws of the outer product, and the fact that the outer product of a vector with itself is zero, this simplifies to:

$$a \wedge b = (\alpha_1 \beta_2 - \alpha_2 \beta_1)\, e_1 \wedge e_2$$

In the Euclidean plane, one may write $I = e_{12} = e_1 \wedge e_2$. Geometric algebra thus defines a set of basis elements for two-dimensional space, $\{1, e_1, e_2, e_{12}\}$. Just as a vector can be decomposed into a combination of basis vectors, geometric algebra generalizes this idea to the multivector: a multivector is a linear combination of the different basis elements. In $\mathbb{R}^2$, a multivector can be decomposed into a scalar part, a vector part and a bivector part:

$$A = \alpha_1 + \alpha_2 e_1 + \alpha_3 e_2 + \alpha_4 I$$

The $\alpha_i$ are real numbers that represent the components of the multivector and may be zero. As linear combinations of subspaces, multivectors can express many different geometric concepts, and the definition above extends to higher-dimensional spaces. Geometric algebra also defines the geometric product, which combines the dot product and the outer product. For two vectors $a$ and $b$ it is calculated as:

$$ab = a \cdot b + a \wedge b$$

In geometric algebra, calculations in each subspace can be simplified by building a multiplication table of the basis elements from the rules of the outer product and the dot product. For example, the multiplication table of the basis elements of $\mathbb{R}^2$ (row element on the left of the product, column element on the right) is as follows:

	1	e₁	e₂	I
1	1	e₁	e₂	I
e₁	e₁	1	I	e₂
e₂	e₂	-I	1	-e₁
I	I	-e₂	e₁	-1

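Applying the geometric product to two multivectors $A = \alpha_1 + \alpha_2 e_1 + \alpha_3 e_2 + \alpha_4 I$ and $B = \beta_1 + \beta_2 e_1 + \beta_3 e_2 + \beta_4 I$ in $\mathbb{R}^2$ and simplifying with the multiplication table of the basis elements gives:

$$\begin{aligned} AB ={} & (\alpha_1\beta_1 + \alpha_2\beta_2 + \alpha_3\beta_3 - \alpha_4\beta_4) \\ & + (\alpha_1\beta_2 + \alpha_2\beta_1 - \alpha_3\beta_4 + \alpha_4\beta_3)\, e_1 \\ & + (\alpha_1\beta_3 + \alpha_3\beta_1 + \alpha_2\beta_4 - \alpha_4\beta_2)\, e_2 \\ & + (\alpha_1\beta_4 + \alpha_4\beta_1 + \alpha_2\beta_3 - \alpha_3\beta_2)\, I \end{aligned}$$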
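As a small check of the $\mathbb{R}^2$ basis multiplication table, the geometric product can be coded directly from it. The storage layout (a list of four coefficients in the order scalar, $e_1$, $e_2$, $I$) and the function names are choices made for this sketch only.

```python
# Multivectors in R^2 are stored as [scalar, e1, e2, I] coefficient lists.
# TABLE[i][j] = (sign, index) of the product basis[i] * basis[j],
# transcribed from the basis multiplication table of R^2.
TABLE = [
    [(1, 0), (1, 1), (1, 2), (1, 3)],     # 1  * {1, e1, e2, I}
    [(1, 1), (1, 0), (1, 3), (1, 2)],     # e1 * {1, e1, e2, I}
    [(1, 2), (-1, 3), (1, 0), (-1, 1)],   # e2 * {1, e1, e2, I}
    [(1, 3), (-1, 2), (1, 1), (-1, 0)],   # I  * {1, e1, e2, I}
]

def geometric_product(a, b):
    """Geometric product of two R^2 multivectors."""
    out = [0.0] * 4
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            sign, k = TABLE[i][j]
            out[k] += sign * ai * bj
    return out

def outer_product(a, b):
    """Coefficient of e1^e2 in the outer product of plain vectors a and b."""
    return a[0] * b[1] - a[1] * b[0]

a = [0.0, 2.0, 1.0, 0.0]   # 2 e1 + e2
b = [0.0, 1.0, 3.0, 0.0]   # e1 + 3 e2
ab = geometric_product(a, b)           # scalar part: a . b, I part: a ^ b
I_squared = geometric_product([0, 0, 0, 1], [0, 0, 0, 1])
```

For two plain vectors the result has only a scalar part (the dot product) and an $I$ part (the outer product), which matches the description of the geometric product as the combination of the two.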

A rotation in geometric algebra can be implemented by two reflections, and by specifying a plane of rotation such a rotation can be applied in a space of any dimension. The invention uses rotation in three-dimensional space, so reflection and rotation in three-dimensional space are briefly described here.

First, a reflection maps a vector to the other side of a bivector, which can be understood simply as the other side of a Euclidean plane. Suppose we have a bivector $U$ whose dual $U^*$ is the normal vector $u$ of that plane. Taking the geometric product of a vector $a$ with $u$ and $-u$, then decomposing $a$ and simplifying, gives the following result (see also FIG. 6):

$$a' = -u\, a\, u^{-1} = a_{\perp} - a_{\parallel}$$

where $a_{\parallel}$ and $a_{\perp}$ are the components of $a$ parallel and perpendicular to $u$.

Rotation is built on top of such reflections. Taking rotation in $\mathbb{R}^2$ as an example, let $s$ and $t$ be two unit normal vectors. A rotation in geometric algebra is the composition of two reflections, $v' = -t(-s v s^{-1})t^{-1} = t s v s^{-1} t^{-1}$. Since $s$ and $t$ are unit vectors, $v' = t s v s t$, where the geometric product symbol is omitted for convenience. Define $R = ts$, so that $\tilde{R} = st$ is the conjugate of $R$; therefore

$$v' = R\, v\, \tilde{R}$$

From the definition of the geometric product, $R$ consists of a scalar and a bivector. To rotate a vector $v$ by an angle $\theta$ in $\mathbb{R}^2$, $R$ needs to be defined as:

$$R = \cos\frac{\theta}{2} - A \sin\frac{\theta}{2}$$

where the vector is rotated in the Euclidean plane $A$, that is, $A$ is a unit bivector oriented in the sense of the rotation, and $R$ is called a rotor in $\mathbb{R}^2$. The proof is involved, so only the conclusion is given here.
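The claim that two reflections compose into a rotation can be checked numerically. The sketch below uses the vector identity $-n v n = v - 2 (v \cdot n) n$ for a unit vector $n$, so no multivector machinery is needed; the function names and the 45-degree example are illustrative assumptions.

```python
import math

def reflect(v, n):
    """Reflect v in the hyperplane with unit normal n: -n v n = v - 2 (v . n) n."""
    d = sum(vi * ni for vi, ni in zip(v, n))
    return [vi - 2.0 * d * ni for vi, ni in zip(v, n)]

def rotate_by_reflections(v, s, t):
    """Apply the reflection in s, then the reflection in t: v' = t s v s t."""
    return reflect(reflect(v, s), t)

half = math.radians(45.0)                 # angle between the two unit normals
s = [1.0, 0.0]
t = [math.cos(half), math.sin(half)]
v_rot = rotate_by_reflections([1.0, 0.0], s, t)   # e1 turned by 2 * 45 degrees
```

The rotation angle is twice the angle between $s$ and $t$, which is why the rotor is written in terms of the half angle $\theta/2$.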

For rotations in higher-dimensional geometric algebras, rotating any vector or multivector by an angle $\theta$ likewise requires defining a rotor $R$ consisting of a scalar and bivectors. The invention uses rotation in three-dimensional space, where the rotor is defined as:

The rotor $R$ in three-dimensional space consists of four parts, a scalar and the three bivectors (rotation planes) of three-dimensional space, and the rotation is still defined as $v' = R\, v\, \tilde{R}$, where $\tilde{R}$ is the conjugate of $R$.

Let:

The rotation in three-dimensional space can then be written as a matrix multiplied by a vector:

After entering the spatio-temporal feature extraction module, the invention constructs the convolution kernel of the temporal convolution network from this matrix expression of rotation in geometric algebra and extracts the temporal information from the dimension-raised traffic data.

The two convolution kernels of the gated temporal convolution network must first be initialized. An angle $\theta$ is randomly initialized in the range $[-\pi, \pi]$, and three matrices $v_b$, $v_c$, $v_d$ are randomly sampled from the uniform distribution on $[0, 1]$ and normalized. The four parts of the convolution kernel of the temporal convolution network are then calculated as:

wherein

is a matrix obtained by random sampling. The four matrices obtained by this initialization can be assembled into a rotation matrix of geometric algebra in $\mathbb{R}^3$:

This gives the matrix form of a rotation in geometric algebra, which is also the convolution kernel of the temporal convolution network. According to the rotation formula of geometric algebra, a rotation is performed by multiplying on the left by the rotor and on the right by its conjugate. Once this is reduced to a single left multiplication by a matrix and the matrix is used as the convolution kernel, one convolution operation achieves the effect of two, and the kernel is continuously optimized during model training.
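The reduction of the two-sided rotor product to a single matrix multiplication can be illustrated as follows. This sketch is an assumption-laden stand-in: it parametrizes the rotor like a unit quaternion $(w, x, y, z)$ with scalar part $\cos(\theta/2)$ and a normalized triple scaled by $\sin(\theta/2)$, and it samples scalars, whereas the patent initializes matrices $v_b$, $v_c$, $v_d$ with formulas that are not reproduced in this text. The names `init_rotor`, `rotor_to_matrix` and `apply` are hypothetical.

```python
import math
import random

def init_rotor(rng):
    """Sample theta in [-pi, pi] and a normalized triple from U[0, 1] (assumed scheme)."""
    theta = rng.uniform(-math.pi, math.pi)
    b, c, d = rng.random(), rng.random(), rng.random()
    norm = math.sqrt(b * b + c * c + d * d)
    s = math.sin(theta / 2.0)
    return math.cos(theta / 2.0), s * b / norm, s * c / norm, s * d / norm

def rotor_to_matrix(w, x, y, z):
    """3x3 matrix M with M v = q v q* for the unit quaternion q = (w, x, y, z)."""
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def apply(m, v):
    """Matrix-vector product for a 3x3 matrix."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

# A 90-degree turn about the third axis sends e1 to e2.
quarter = rotor_to_matrix(math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
e1_rotated = apply(quarter, [1.0, 0.0, 0.0])

# A randomly initialized rotor still preserves vector length.
kernel = rotor_to_matrix(*init_rotor(random.Random(0)))
v_rotated = apply(kernel, [3.0, 4.0, 12.0])
```

A single left multiplication by the matrix replaces the left and right products of the sandwich form, which is the property the text relies on when it uses the matrix as a convolution kernel.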

The dimension-raised traffic data are divided into time slices with a sliding window. The model uses a time window of size 4, so that four time slices are encoded into one multivector of three-dimensional geometric algebra. The data are then convolved with two randomly initialized geometric algebra convolution kernels, each being a matrix expression of a rotation in three-dimensional geometric algebra, and a gating mechanism finally filters the extracted information. Encoding different time slices into multivectors lets the model learn the internal dependencies between time slices of the series, and constructing the convolution kernel as the matrix description of a rotation lets the model learn the external dependencies between the time series and the kernel. The temporal feature extraction is described as follows:

$$h = g(W_1 * X + b) \odot \sigma(W_2 * X + c)$$

where $X$ is the dimension-raised traffic data input to the spatio-temporal feature extraction module, $W_1$ and $W_2$ are two different convolution kernels, $*$ denotes the temporal convolution operation, and $\odot$ denotes element-wise multiplication. $g(\cdot)$ is the activation function of the output, set to tanh in this model, and $\sigma(\cdot)$ is the sigmoid function, which decides what proportion of the information passes to the next layer. The resulting $h$ is used as the input of the spatial feature extraction module.
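The gating formula can be illustrated on a one-dimensional series. This is a simplified sketch: it uses ordinary scalar kernels of width 4 (matching the window size stated above) rather than geometric algebra rotation kernels, and all names and numeric values are illustrative assumptions.

```python
import math

def conv1d(x, w, bias):
    """Valid 1-D convolution (cross-correlation form) of series x with kernel w."""
    k = len(w)
    return [sum(w[j] * x[t + j] for j in range(k)) + bias
            for t in range(len(x) - k + 1)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gated_temporal_conv(x, w1, w2, b=0.0, c=0.0):
    """h = tanh(W1 * X + b) (element-wise product) sigmoid(W2 * X + c)."""
    filt = conv1d(x, w1, b)
    gate = conv1d(x, w2, c)
    return [math.tanh(f) * sigmoid(g) for f, g in zip(filt, gate)]

x = [0.1, 0.4, 0.3, 0.8, 0.5, 0.2]
# A zero gate kernel gives sigmoid(0) = 0.5, so half of each filter output passes.
h = gated_temporal_conv(x, w1=[0.5, -0.5, 0.5, -0.5], w2=[0.0, 0.0, 0.0, 0.0])
```

The sigmoid branch acts as a soft switch in the range 0 to 1 that scales how much of the tanh branch is passed on to the spatial module.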

Step 2.2: Spatial features are extracted using a multi-dimensional graph convolution network based on diffusion graph convolution and multi-level hypergraphs.

The adaptive adjacency matrix is computed from the node embeddings $E_1$ and $E_2$ initialized in step 1 and is combined with the static adjacency matrix to perform bidirectional diffusion graph convolution. The specific process is as follows:

where $A_{adp}$ is the adaptive adjacency matrix, which is continuously optimized during back propagation. $K$ is the number of random diffusion steps, and $W_{k1}$, $W_{k2}$, $W_{k3}$ are learnable weight matrices of the model. Because the traffic graph is directed, the diffusion graph convolution is also bidirectional: $P_f$ and $P_p$ are the forward and backward transition probability matrices, respectively, and both are calculated from the static adjacency matrix $A$ by the following formula:

where $\mathrm{rowsum}(\cdot)$ computes the sum of each row of a matrix.
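A bidirectional diffusion step can be sketched as below. The sketch assumes the common formulation in which the forward transition matrix is $A$ divided by its row sums and the backward one is $A^{T}$ divided by its row sums, with one scalar weight per diffusion step instead of weight matrices; the adaptive adjacency term is omitted. These simplifications and all names are assumptions, since the exact formulas are not reproduced in this text.

```python
def row_normalize(a):
    """Divide each row by its sum, giving a transition probability matrix."""
    out = []
    for row in a:
        s = sum(row)
        out.append([v / s if s else 0.0 for v in row])
    return out

def transpose(a):
    return [list(col) for col in zip(*a)]

def matvec(m, x):
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]

def diffusion_conv(a, x, w_fwd, w_bwd):
    """Sum over k of w_fwd[k] * P_f^k x + w_bwd[k] * P_b^k x (scalar weights)."""
    p_f = row_normalize(a)
    p_b = row_normalize(transpose(a))
    z = [0.0] * len(x)
    xf, xb = list(x), list(x)
    for wf, wb in zip(w_fwd, w_bwd):
        z = [zi + wf * f + wb * b for zi, f, b in zip(z, xf, xb)]
        xf, xb = matvec(p_f, xf), matvec(p_b, xb)
    return z

# Hypothetical directed 3-node road graph 0 -> 1 -> 2 with self-loops.
A = [[1, 1, 0],
     [0, 1, 1],
     [0, 0, 1]]
x = [30.0, 60.0, 90.0]
z = diffusion_conv(A, x, w_fwd=[0.0, 1.0], w_bwd=[0.0, 0.0])   # one forward step
```

With only the one-step forward weight switched on, each node receives the average speed of itself and its downstream neighbour, which shows how direction enters the convolution.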

High-order spatial dependency information is extracted from the data through the multi-level hypergraph constructed in step 1. The first step forms the information of each hyperedge by aggregating the information of the nodes inside it, and the second step updates the information of each node by aggregating the information of the hyperedges connected to it. The specific process is as follows:

where $h_e$ is the hidden hyperedge feature obtained by aggregating the feature values of the nodes inside the hyperedge, $W_e$ is a learnable weight matrix, $d_i$ is the degree of node $i$, that is, the number of hyperedges to which node $i$ is connected, and $d_e$ is the average degree of hyperedge $e$, that is, the mean of the degrees of the nodes in the hyperedge.
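The two aggregation steps can be sketched with scalar node features. This sketch assumes plain mean aggregation in both directions and a single scalar weight in place of $W_e$; the patent's exact normalization by $d_i$ and $d_e$ is not reproduced in this text, so the formula below is an illustrative stand-in.

```python
def hypergraph_conv(features, hyperedges, edge_weight=1.0):
    """Node -> hyperedge -> node aggregation with mean pooling (assumed form)."""
    # Step 1: each hyperedge aggregates the features of its member nodes.
    edge_feats = [edge_weight * sum(features[i] for i in e) / len(e)
                  for e in hyperedges]
    # Step 2: each node aggregates its incident hyperedges, divided by its degree d_i.
    out = []
    for i in range(len(features)):
        incident = [edge_feats[k] for k, e in enumerate(hyperedges) if i in e]
        out.append(sum(incident) / len(incident) if incident else 0.0)
    return out

features = [10.0, 20.0, 30.0, 40.0]      # hypothetical speeds at four sensors
hyperedges = [{0, 1}, {1, 2, 3}]
updated = hypergraph_conv(features, hyperedges)
```

Node 1 belongs to both hyperedges, so its updated value mixes information from all four sensors in one pass, something a pairwise edge cannot do.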

The results of the diffusion graph convolution and the hypergraph convolution are added, and the input of the layer is added through a residual connection, which gives the output of the spatio-temporal layer.

The transition probability graphs used in the diffusion graph convolution are conventional graph structures and model pairwise relationships between nodes. They include the static traffic graph structure and the dynamic traffic graph structure computed from the node embeddings. The hypergraph used in the hypergraph convolution models the joint relationships among multiple nodes. It comprises a group of static hyperedges obtained by pre-training several K-means clusterings with different cluster counts on the whole training set, a group of static hyperedges constructed from the conventional traffic graph structure, and dynamic hyperedges obtained by applying K-means clustering to the traffic speed data each time it is input to the model. Combining the conventional graph structure with the high-order graph structure, and static features with dynamic ones, takes the particular nature of the traffic network structure into account and allows the spatial features of the traffic data to be mined more deeply.

The stacked spatio-temporal feature extraction modules extract the spatio-temporal features of the input traffic data, and the outputs of all layers are summed so that vanishing or exploding gradients do not occur.

Traffic data are strongly periodic: changes in traffic speed correlate strongly with the day of the week and the time of day. These periodic features serve as auxiliary data for predicting future traffic speed. They are embedded through two linear layers and concatenated with the extracted spatio-temporal features of the traffic data, and the prediction for the current input traffic speed data is obtained through multiple linear layers.

In the loss function part, the model combines the average absolute error (MAE) and the Root Mean Square Error (RMSE), loss is calculated by using the RMSE for time segments generating sudden change or congestion in real traffic speed data, the difference between a prediction result and the real traffic speed is amplified, and the loss is calculated by using the MAE for the other parts with normal speed, so that the time segments generating congestion or sudden change can be better fitted under the condition of ensuring that the time segments with normal speed are not influenced by abnormal values. Optimizing the prediction accuracy of the traffic speed through the optimized loss function, wherein the optimized loss function formula is as follows:

where α is the change threshold for identifying an abrupt change in the real traffic speed data, β is the speed threshold for identifying congestion in the real traffic speed data, and T is the length of the predicted traffic speed time sequence.
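The idea behind the combined loss can be sketched in code. This is only a hedged illustration, not the formula of the invention: it assumes that a time step counts as an abrupt change when the real speed differs from the previous real speed by more than α, that it counts as congested when the real speed is below β, and that the RMSE part and the MAE part are simply added. The function name `combined_loss` is hypothetical.

```python
import math


def combined_loss(y_true, y_pred, alpha, beta):
    """Hybrid MAE/RMSE loss over one sequence of length T (illustrative only).

    Assumptions, not taken from the source:
      * step t is 'abrupt' when |y_true[t] - y_true[t-1]| > alpha,
      * step t is 'congested' when y_true[t] < beta,
      * the RMSE part and the MAE part are summed.
    """
    sq_errors, abs_errors = [], []
    for t, (y, p) in enumerate(zip(y_true, y_pred)):
        abrupt = t > 0 and abs(y - y_true[t - 1]) > alpha
        congested = y < beta
        if abrupt or congested:
            sq_errors.append((y - p) ** 2)   # penalized via RMSE
        else:
            abs_errors.append(abs(y - p))    # penalized via MAE
    rmse = math.sqrt(sum(sq_errors) / len(sq_errors)) if sq_errors else 0.0
    mae = sum(abs_errors) / len(abs_errors) if abs_errors else 0.0
    return rmse + mae


# Toy sequence: the speed drops sharply at step 2 and stays low at step 3.
loss = combined_loss([60, 62, 30, 28], [58, 62, 40, 30], alpha=10, beta=35)
```

In the toy sequence, steps 2 and 3 fall into the RMSE part and steps 0 and 1 into the MAE part, so the large error at the sudden drop dominates the loss.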

Gradients are then computed and back-propagated, and training is repeated so that the model gradually converges to an optimum.

To demonstrate the effectiveness of the method, it is applied to the METR-LA dataset, and recent, well-performing models based on graph neural networks are selected as baselines for the experiment. The METR-LA dataset records four months of traffic speeds from 207 road speed sensors on the Los Angeles highways, each of which can serve as a node in a graph neural network or a hypergraph neural network. All models use the same preprocessing: five minutes is taken as one time window, the model input is the traffic speed of the previous hour, and the model output is the predicted traffic speed of the next hour. So that the model covers the whole time sequence, the invention uses a 5-layer space-time feature extraction module. For visualization, the real and predicted traffic speeds at two road speed sensors are compared; the results are shown in Fig. 7 and Fig. 8.
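The preprocessing described above can be sketched as a sliding-window split. With five-minute windows, one hour corresponds to 12 time steps, so each sample pairs 12 input steps with the 12 steps that follow. The function name `make_windows` and the synthetic series are assumptions for illustration; the sketch handles a single sensor, whereas the dataset contains 207.

```python
def make_windows(series, n_in=12, n_out=12):
    """Split a speed series into (input, target) pairs.

    With 5-minute windows, n_in=12 and n_out=12 correspond to one hour of
    history and a one-hour forecast horizon.
    """
    samples = []
    for start in range(len(series) - n_in - n_out + 1):
        x = series[start:start + n_in]
        y = series[start + n_in:start + n_in + n_out]
        samples.append((x, y))
    return samples


# Synthetic series of 30 five-minute readings (placeholder values).
series = list(range(30))
samples = make_windows(series)
```

Each target window starts directly after its input window, so no future value leaks into the input.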

Compared with the other models, the present invention achieves the best MAE and RMSE at the 30-minute and one-hour horizons, and its 15-minute MAE differs from that of the most advanced model by only 0.01. On the MAPE metric, the method obtains results close to those of the most advanced model. The experimental results are shown in Table 1:

Table 1. Comparison of the prediction performance of the present invention with existing models on the METR-LA dataset
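For reference, the three evaluation metrics named above can be computed as in the following sketch, which uses their standard definitions. Skipping zero ground-truth values in the MAPE is an assumption made here to avoid division by zero; the masking actually used in the experiments is not stated in the text. The sample values are placeholders, not experimental results.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent.

    Entries with a zero ground-truth value are skipped (assumption).
    """
    pairs = [(y, p) for y, p in zip(y_true, y_pred) if y != 0]
    return 100.0 * sum(abs(y - p) / abs(y) for y, p in pairs) / len(pairs)


# Placeholder speeds, for illustration only.
truth = [50.0, 100.0, 0.0]
pred = [45.0, 110.0, 3.0]
scores = (mae(truth, pred), rmse(truth, pred), mape(truth, pred))
```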

The references are as follows:

[1] Y. Li, R. Yu, C. Shahabi, and Y. Liu, “Diffusion convolutional recurrent neural network: Data-driven traffic forecasting,” in Proc. of ICLR, 2018.

[2] Z. Wu, S. Pan, G. Long, J. Jiang, and C. Zhang, “Graph WaveNet for deep spatial-temporal graph modeling,” in Proc. of IJCAI, 2019.

[3] C. Zheng, X. Fan, C. Wang, and J. Qi, “GMAN: A graph multi-attention network for traffic prediction,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 1234–1241, Apr. 2020.

[4] Z. Wu, S. Pan, G. Long, J. Jiang, X. Chang, and C. Zhang, “Connecting the dots: Multivariate time series forecasting with graph neural networks,” Aug. 2020, pp. 753–763.

[5] F. Li, J. Feng, H. Yan, et al., “Dynamic graph convolutional recurrent network for traffic prediction: Benchmark and solution,” ACM Transactions on Knowledge Discovery from Data (TKDD), 2021.

Innovation Points

In the field of traffic speed prediction, the temporal feature extraction of existing methods is limited in three respects: the expression of high-dimensional data objects, the extraction of internal dependencies among time segments, and the lack of external dependencies with the convolution kernels. To address these limitations, the invention combines geometric algebra with a temporal convolutional network to encode high-dimensional information and operate on it in a structured way. In spatial feature extraction, existing methods can only model the pairwise relations among nodes in a conventional graph structure and cannot model the joint relations among multiple nodes. To address this problem, the invention combines a multi-level hypergraph with the conventional graph structure to extract higher-order spatial information from the traffic graph.

Following the definition of multivectors in geometric algebra, the method encodes traffic speed data as multivectors and gives it a unified expression. The correlations among different time segments are extracted through the geometric-algebra expression of vector rotation, so that the time-dimension information in the traffic data is mined more deeply while the number of parameters is greatly reduced. The spatial-dimension information is extracted at a higher order by combining the conventional graph structure with the higher-order graph structure, combining static with dynamic structures, and applying multi-level bidirectional diffusion graph convolution together with hypergraph convolution, so that the space-time correlations contained in the traffic data are fully mined. By combining the outputs of all layers, the invention ensures that vanishing or exploding gradients do not occur during back propagation. It also takes the characteristic periodicity of traffic speed data into account by adding a periodic time-information embedding at the output layer. Finally, the invention combines two loss functions so that the final prediction fits the true values better, which improves the accuracy and performance of traffic speed prediction.

Claims

1. A traffic speed prediction method based on geometric algebra and hypergraph, characterized by comprising the following steps:

step 1, inputting traffic speed data into a model, raising the dimension of the speed data through a linear layer to convert each scalar into a vector, obtaining clusters through a K-means unsupervised clustering method, and constructing a hypergraph in a spatial feature extraction module by combining the pre-trained K-means clustering result of the whole training set and the traffic network graph;

step 2, constructing a K-layer space-time feature extraction module; in each module, a gated geometric algebra temporal convolutional network based on the geometric algebra framework is constructed to extract temporal features from the traffic data, and a multidimensional graph convolutional network combining diffusion graph convolution and multi-level hypergraph convolution is used to extract spatial features;

step 3, embedding, through two linear layers, the periodic information indicating the day of the week and the time of day to which the traffic data belongs, combining the space-time features extracted by each layer of modules, and predicting the future traffic speed from the space-time features of the current input data through linear layers;

step 4, using an optimized loss function that combines two common loss functions, and continuously optimizing the network parameters through back propagation and gradient descent to minimize the loss function and finally obtain an optimal model.

2. The prediction method of claim 1, wherein in step 1, the traffic speed data is input into the model and the dimension of the speed data is raised through a linear layer, so that each speed value is converted from a scalar into a vector and the model can extract richer information; a clustering result is obtained by applying a K-means unsupervised clustering method to the input data and is combined with the clustering results obtained by pre-training a multi-level K-means clustering method on the whole training set to form one group of hyperedges in the hypergraph, and each node in the traffic network structure is combined with its neighbor nodes to construct another group of hyperedges; the multi-level hypergraph constructed from the two groups of hyperedges thus comprises both hyperedges based on traffic data and hyperedges based on the road network structure, and is used in the hypergraph convolution of the space-time feature extraction module.

3. The prediction method as claimed in claim 1, wherein in step 2, in the temporal information extraction module, different time slices are first encoded into multivectors in geometric algebra using a sliding time window, so that the model can model the internal dependencies among time slices; a convolution kernel of the temporal convolutional network is constructed based on the description of multivector rotation in geometric algebra, so that the model can model the external dependencies between the temporal data and the convolution kernel; in the spatial information extraction module, two types of graph convolution at different levels are used: one is diffusion graph convolution based on an adaptive adjacency matrix and the conventional road network structure, which models the spatial relations between pairs of nodes; the other is hypergraph convolution based on the multi-level hypergraph constructed in step 1, which models the spatial relations among multiple nodes.

4. The prediction method of claim 1, wherein in step 3, the traffic data is strongly periodic, and the variation of traffic speed is strongly correlated with the day of the week and the time of day; the periodicity is embedded by constructing a network consisting of two linear layers for the specific time information of the traffic data and is used as auxiliary data for predicting future traffic speed; the space-time features extracted by the multiple layers of modules are combined to ensure that vanishing or exploding gradients do not occur in the model; and the periodic features are combined with the extracted recent space-time features to predict the future traffic speed through a linear layer.

5. The prediction method as set forth in claim 1, wherein in step 4, the mean absolute error (MAE) and the root mean square error (RMSE) are combined, such that the loss for time slices of the real traffic speed data in which congestion or an abrupt change occurs is calculated using the RMSE, and the loss for the remaining portions with normal speed is calculated using the MAE, so that the time slices in which congestion occurs are better fitted; and finally, back propagation and gradient descent are performed to obtain an optimal model.

6. The prediction method of claim 1, wherein the traffic speed data is input into the model by a speed sensor;

each speed sensor acts as a node in a graph neural network or a hypergraph neural network.