CN115700628A - Traffic flow prediction method and system containing missing data - Google Patents

Traffic flow prediction method and system containing missing data

Publication number
CN115700628A
Authority
CN
China
Prior art keywords
data
traffic flow
matrix
time
traffic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211301263.7A
Other languages
Chinese (zh)
Inventor
金雅妮
刘彩苹
谢鲲
文吉刚
张大方
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University
Original Assignee
Hunan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University filed Critical Hunan University
Priority to CN202211301263.7A priority Critical patent/CN115700628A/en
Publication of CN115700628A publication Critical patent/CN115700628A/en
Pending legal-status Critical Current

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a traffic flow prediction method for data containing missing values, comprising the following steps: acquiring a traffic data set of an area, the traffic data set containing missing data, and reconstructing the traffic data set into a traffic flow data matrix X; inputting X into the orthogonal non-negative matrix factorization (ONMF) module of a trained spatio-temporal prediction model to form K clusters; filling the missing data in each cluster with the generalized matrix factorization (GMF) module of the spatio-temporal prediction model to obtain a data-filled traffic flow data matrix; standardizing the filled traffic flow data matrix; modelling the standardized traffic flow data matrix as a three-dimensional tensor according to the history step length H and the prediction window W; and inputting the three-dimensional tensor into the graph convolutional recurrent neural network (GCRNN) of the trained spatio-temporal prediction model to obtain the prediction data Y'. The invention is general with respect to traffic prediction on missing data and learns spatio-temporal features at a finer granularity, achieving more effective traffic flow prediction.

Description

Traffic flow prediction method and system containing missing data
Technical Field
The invention belongs to the technical field of deep learning and intelligent traffic in artificial intelligence, and particularly relates to a traffic flow prediction method and system with missing data, which are realized by using a Fine-grained filling Graph Convolution neural Network (FCGCRN).
Background
In recent years, as sensors and monitoring systems have collected massive amounts of data, prediction tasks have been widely studied in fields such as climate, finance and traffic. Traffic prediction is a classic application and an indispensable component of an Intelligent Transportation System (ITS); it plays an important role in alleviating traffic congestion, reducing traffic accidents and improving the quality of urban traffic services. The task is to predict the future state of traffic flow given the historical traffic flow and the existing road information. However, each future traffic flow value depends not only on the historical values of the same traffic flow sequence but also on other traffic flow sequences. Meanwhile, traffic data may be lost while sensors collect traffic flow, owing to network jitter, equipment failure and the like. Therefore, accurately predicting the future state of traffic flow that contains missing data is a challenging problem.
Existing research on traffic flow prediction mainly falls into three types of algorithms. The first type is statistical methods, which assume that each traffic flow is a stationary sequence and fit the traffic data with a linear algorithm, such as Historical Average (HA), Autoregressive Integrated Moving Average (ARIMA) and Gaussian Process (GP); the second type is single neural network methods, which adopt a Recurrent Neural Network (RNN) or its variants, Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU), and can process long-range time series traffic data in a short time; the third type is hybrid neural network methods, which fuse a Convolutional Neural Network (CNN) or a Graph Convolutional Network (GCN) with a Recurrent Neural Network (RNN) to capture, respectively, the complex spatial dependencies between traffic flows and the long-term temporal dependencies within individual traffic sequences.
However, the above existing traffic flow prediction methods all have some non-negligible technical problems: firstly, the traditional method and the single neural network method only consider the characteristics of traffic flow data in a time dimension, and do not explicitly model the interdependence relation between different time sequences, so that the prediction performance is low; secondly, the CNN in the hybrid neural network method encapsulates the interaction between traffic flows into a global hidden state, and is limited to processing a regular grid structure to capture spatial correlation, so that the characterization capability of the CNN is weak when processing a non-grid structure spatial relationship, and the prediction accuracy is further influenced; third, GCN in the hybrid neural network approach relies on predefined graphs, making the model less versatile; fourthly, the mixed method is lack of a proper parameter learning mode, so that the space-time correlation cannot be represented in fine granularity, and the prediction precision is further influenced; fifth, the three methods are highly sensitive to the loss of traffic data, which makes the model highly susceptible to noise while learning features, thereby degrading prediction performance.
Disclosure of Invention
In view of the above defects or improvement requirements of the prior art, the invention provides a traffic flow prediction method and system for data containing missing values, which aim to solve the following technical problems: existing traffic flow prediction methods cannot capture the spatial correlation of traffic data and therefore cannot faithfully reflect the future state of the traffic flow; spatial characterization is limited to grid data because complex spatial correlations in non-Euclidean space cannot be represented; hybrid neural network methods lack generality because their spatial characterization is constrained by a predefined graph; traffic flow prediction accuracy suffers because traffic flows cannot be characterized at a fine granularity and node-specific patterns cannot be captured; and data loss caused by network or equipment faults degrades feature learning, which in turn impairs the learning of temporal and spatial correlations.
To achieve the above object, according to one aspect of the present invention, there is provided a traffic flow prediction method including missing data, including the steps of:
(1) Acquiring a traffic data set of a certain area, wherein the traffic data set comprises missing data, and reconstructing the traffic data set into a traffic flow data matrix;
(2) Inputting the traffic flow data matrix X obtained in step (1) into the orthogonal non-negative matrix factorization (ONMF) module of the trained spatio-temporal prediction model to form K clusters; filling the missing data in each cluster with the generalized matrix factorization (GMF) module of the spatio-temporal prediction model to obtain a data-filled traffic flow data matrix; standardizing the filled traffic flow data matrix to obtain a standardized traffic flow data matrix; and modelling the standardized traffic flow data matrix as a three-dimensional tensor according to the history step length H and the prediction window W;
(3) Inputting the three-dimensional tensor obtained in step (2) into the graph convolutional recurrent neural network (GCRNN) of the trained spatio-temporal prediction model to obtain the prediction data Y'.
Preferably, the traffic data is three-dimensional tensor data { time, node, traffic characteristics }, wherein the time refers to the time when the node acquires the traffic characteristics, the node refers to a single sensor, and the traffic characteristics comprise a vehicle speed characteristic, a traffic flow characteristic and a person number characteristic;
preferably, in step (1), the traffic data set Z is composed of the traffic flow data matrices of the individual nodes, where the matrix of the n-th node records the traffic flow of that node, among all nodes (i.e., sensors) arranged on all streets in the area, at T time instants, n ∈ [1, N], t ∈ [1, T], T is any positive integer, and N is the total number of sensors arranged on all streets in the area; all entries are non-negative real numbers;
wherein each entry is the value of the c-th feature of the n-th node at time t, c ∈ [1, C], and C denotes the number of traffic feature categories.
Preferably, the process of inputting the traffic flow data matrix X obtained in the step (1) into the ONMF module to form K clusters in the step (2) specifically includes:
(2-1) initializing matrix factors F and G of the traffic flow data matrix X to random values within (0,1);
(2-2) starting from the matrix factors F and G initialized in step (2-1), iteratively updating F and G with their respective update rules (given as formula images in the original publication and not reproduced here), using only the observable data in the matrix, and stopping the iteration when the error $\|X - FG^{T}\|$ converges, thereby obtaining the updated matrix factor G;
(2-3) clustering the traffic flow data matrix X into K clusters according to the updated matrix factor G obtained in the step (2-2);
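Steps (2-1) to (2-3) can be sketched as follows. This is a minimal NumPy illustration, not the patent's implementation: the original update rules are not reproduced in this text, so the sketch substitutes the standard mask-weighted multiplicative updates for non-negative matrix factorization and omits the orthogonality constraint on G. The function name `masked_nmf_cluster`, the boolean `mask` of observed entries and the iteration count are assumptions made for illustration.

```python
import numpy as np

def masked_nmf_cluster(X, mask, K, n_iter=200, eps=1e-9, seed=0):
    """Factor X (T x N) as F @ G.T on observed entries only, then cluster the N nodes.

    Assumption: plain mask-weighted multiplicative NMF updates; the orthogonality
    constraint G.T @ G = I of ONMF is NOT enforced in this sketch.
    """
    rng = np.random.default_rng(seed)
    T, N = X.shape
    # Step (2-1): random initial factors with values inside (0, 1).
    F = rng.uniform(0.01, 1.0, size=(T, K))
    G = rng.uniform(0.01, 1.0, size=(N, K))
    M = mask.astype(float)
    Xo = np.where(mask, X, 0.0)  # unobserved entries never enter the updates
    errors = []
    for _ in range(n_iter):
        # Step (2-2): update both factors using observed entries only.
        F *= (Xo @ G) / ((M * (F @ G.T)) @ G + eps)
        G *= (Xo.T @ F) / ((M * (F @ G.T)).T @ F + eps)
        errors.append(np.linalg.norm(Xo - M * (F @ G.T)))
    # Step (2-3): node n joins the cluster with the largest entry in row n of G.
    labels = G.argmax(axis=1)
    return F, G, labels, errors

rng = np.random.default_rng(1)
X_demo = rng.uniform(1.0, 5.0, size=(30, 8))        # toy T x N traffic matrix
mask_demo = rng.uniform(size=X_demo.shape) > 0.2    # roughly 20 % of entries missing
F, G, labels, errors = masked_nmf_cluster(X_demo, mask_demo, K=2)
```

In practice the loop would stop once the change in `errors` falls below a tolerance rather than after a fixed number of iterations.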
preferably, the process in step (2) of filling the traffic flow data matrix X in each cluster with the GMF module to obtain the data-filled traffic flow data matrix, standardizing the filled traffic flow data matrix to obtain the standardized traffic flow data matrix, and modelling the standardized traffic flow data matrix as a three-dimensional tensor according to the history step length H and the prediction window W comprises the following sub-steps:
(2-4) in each cluster obtained in step (2-3), dividing the traffic flow data matrix X into an observable data set and an unobservable data set; reconstructing the observable data set into a time vector $v_p \in \mathbb{R}^{m}$, a node vector $v_q \in \mathbb{R}^{m}$ and a traffic flow vector $v_y \in \mathbb{R}^{m}$; and reconstructing the unobservable data set into a time vector $v'_p \in \mathbb{R}^{m'}$ and a node vector $v'_q \in \mathbb{R}^{m'}$, where $m$ is the total number of observable data in the observable data set and $m'$ is the total number of unobservable data in the unobservable data set;
(2-5) inputting the vectors $v_p$ and $v_q$ obtained in step (2-4) into the embedding layer of the GMF module to obtain the output of the embedding layer, i.e., the time matrix factor P and the node matrix factor Q:
$P = e_1(v_p)$,
$Q = e_2(v_q)$,
where $P \in \mathbb{R}^{m \times a}$ and $Q \in \mathbb{R}^{m \times a}$ are the time matrix factor and the node matrix factor respectively, $a = 16$ is the number of latent factors, and $e_1(\cdot)$ and $e_2(\cdot)$ are embedding functions, both implemented with torch.nn.Embedding() in the PyTorch framework;
(2-6) inputting the time matrix factor P and the node matrix factor Q obtained in step (2-5) into the decomposition layer of the GMF module to obtain the output result f(P, Q):
$f(P, Q) = P \odot Q$,
where $\odot$ denotes the element-wise product;
(2-7) inputting the output result f(P, Q) obtained in step (2-6) into the filling layer of the GMF module, whose output is the filling result g(P, Q):
$g(P, Q) = a_{out}(W^{T}(P \odot Q) + b)$,
where $a_{out}$ is the ReLU activation function, and W and b are the learnable weight and bias parameters of the GMF module, respectively;
(2-8) measuring the filling result of step (2-7) with the mean square error (MSE), i.e., computing the error value MSE between the traffic flow vector $v_y$ of step (2-4) and the filling result $g(P, Q)$ of step (2-7):
$\mathrm{MSE} = \frac{1}{m}\sum_{i=1}^{m}\left(v_{y,i} - g(P, Q)_i\right)^{2}$;
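The data split of step (2-4) and the forward pass of steps (2-5) to (2-8) can be illustrated with a short NumPy sketch. It is only an illustration under stated assumptions: missing entries are encoded as NaN, the embedding tables `E_time` and `E_node` stand in for the torch.nn.Embedding layers, all parameters are randomly initialised and left untrained, and the names used are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy cluster matrix (T = 3 time steps, N = 3 nodes); NaN marks missing data (assumption).
X = np.array([[1.0, np.nan, 3.0],
              [np.nan, 5.0, 6.0],
              [7.0, 8.0, np.nan]])
T, N = X.shape
a = 16  # number of latent factors, as stated in step (2-5)

# Step (2-4): index vectors and values of the observable / unobservable entries.
observed = ~np.isnan(X)
v_p, v_q = np.nonzero(observed)             # time and node indices of observed data
v_y = X[v_p, v_q]                           # observed traffic flow values
v_p_miss, v_q_miss = np.nonzero(~observed)  # indices of the entries to be filled

# Step (2-5): embedding tables play the role of e1 and e2 (random, untrained).
E_time = rng.normal(0.0, 0.1, size=(T, a))
E_node = rng.normal(0.0, 0.1, size=(N, a))
W = rng.normal(0.0, 0.1, size=(a,))
b = 0.0

def gmf_forward(idx_time, idx_node):
    P = E_time[idx_time]                    # (m, a) time matrix factor
    Q = E_node[idx_node]                    # (m, a) node matrix factor
    f = P * Q                               # step (2-6): element-wise product
    return np.maximum(f @ W + b, 0.0)       # step (2-7): ReLU(W^T (P ⊙ Q) + b)

pred = gmf_forward(v_p, v_q)
mse = np.mean((v_y - pred) ** 2)            # step (2-8): mean square error
filled_guess = gmf_forward(v_p_miss, v_q_miss)  # what the filling step would query
```

With untrained parameters the predictions are meaningless; the sketch only shows the data flow that the optimizer of the later steps would act on.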
(2-9) updating the weight W and the bias parameter b which can be learned in the GMF module through an Adam optimizer;
(2-10) repeating the steps (2-8) to (2-9) until the MSE is smaller than the threshold value or the training times reach a preset turn, thereby obtaining a trained GMF module;
(2-11) inputting the unobservable data set obtained in step (2-4) into the GMF module trained in step (2-10) to obtain a traffic flow vector $v'_y \in \mathbb{R}^{m'}$;
(2-12) assembling the data-filled traffic flow data matrix from the time vector $v_p$, node vector $v_q$ and traffic flow vector $v_y$ of the observable data set obtained in step (2-4), the time vector $v'_p$ and node vector $v'_q$ of the unobservable data set, and the traffic flow vector $v'_y$ obtained in step (2-11);
(2-13) standardizing the data-filled traffic flow data matrix of step (2-12) to obtain the standardized traffic flow data matrix:
$X_{\mathrm{std}} = (X_{\mathrm{filled}} - \mu) / \sigma$,
where $X_{\mathrm{filled}}$ denotes the data-filled traffic flow data matrix, $X_{\mathrm{std}}$ the standardized traffic flow data matrix, $\mu$ the mean of $X_{\mathrm{filled}}$ and $\sigma$ its standard deviation;
(2-14) reconstructing, according to the history step length H and the prediction window W, the standardized traffic flow data matrix obtained in step (2-13) into a three-dimensional input tensor and the three-dimensional tensor $Y \in \mathbb{R}^{(T-H-W+1) \times N \times W}$.
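The standardization and windowing of steps (2-13) and (2-14) can be sketched in NumPy as follows. The shape of the target tensor, (T-H-W+1) × N × W, is taken from the text; the shape of the input tensor, (T-H-W+1) × N × H, and the choice to cut the targets from the standardized matrix are assumptions, as are the function names.

```python
import numpy as np

def standardize(X_filled):
    """Z-score standardization with the global mean and standard deviation."""
    mu, sigma = X_filled.mean(), X_filled.std()
    return (X_filled - mu) / sigma, mu, sigma

def make_windows(X_std, H, W):
    """Slide a window over time: H history steps as input, the next W steps as target."""
    T, N = X_std.shape
    S = T - H - W + 1  # number of samples
    inputs = np.stack([X_std[s:s + H].T for s in range(S)])           # (S, N, H)
    targets = np.stack([X_std[s + H:s + H + W].T for s in range(S)])  # (S, N, W)
    return inputs, targets

X_demo = np.arange(60, dtype=float).reshape(20, 3)   # toy filled matrix, T = 20, N = 3
X_std, mu, sigma = standardize(X_demo)
inputs, targets = make_windows(X_std, H=5, W=2)
```

Predictions made on the standardized scale can be mapped back to the original units with `pred * sigma + mu`.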
Preferably, the GCRNN network is trained by the following steps:
(3-1) dividing the three-dimensional input tensor obtained in step (2-14) and the tensor Y into a training set and a test set in a 6:4 ratio;
(3-2) performing adaptive graph learning in the GCRNN through a parameter E to obtain the adjacency matrix (the calculation formula is given as a formula image in the original publication and is not reproduced here),
where $E \in \mathbb{R}^{N \times e}$ is a learnable parameter matrix, the parameter E is initialized with the torch.FloatTensor() function of the PyTorch framework, and the best tuned values of the dimension e are 2 and 10;
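As an illustration of adaptive graph learning from a node embedding matrix E, the sketch below uses a row-wise softmax over ReLU(E Eᵀ). This particular formula is an assumption: it is a common choice in the adaptive-graph literature, and the patent's own formula is not recoverable from this text. The function name and the random initialisation are likewise hypothetical.

```python
import numpy as np

def adaptive_adjacency(E):
    """Row-normalised adjacency from node embeddings E (N x e): softmax(ReLU(E E^T))."""
    S = np.maximum(E @ E.T, 0.0)              # ReLU keeps non-negative similarities
    S = S - S.max(axis=1, keepdims=True)      # shift rows for numerical stability
    expS = np.exp(S)
    return expS / expS.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
N, e = 5, 2                                   # e = 2 is one of the tuned values in the text
E = rng.normal(size=(N, e))                   # stands in for the learnable parameter E
A = adaptive_adjacency(E)
```

Because E is learned together with the other parameters, the graph is derived from the data rather than from a predefined road network.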
(3-3) inputting the training-set data at time t of the three-dimensional input tensor obtained in step (3-1), together with the adjacency matrix obtained in step (3-2), into the graph convolutional neural network to obtain the graph convolution result $H_K \in \mathbb{R}^{N \times h}$ of the K-th cluster (the calculation formula is given as a formula image in the original publication and is not reproduced here),
where $G_K \in \mathbb{R}^{N \times N}$ is the Laplacian matrix of the K-th cluster, $\theta_K \in \mathbb{R}^{H \times h}$ and $b_K \in \mathbb{R}^{h}$ are the learnable parameters of the K-th cluster, and h is the number of hidden-layer neurons in the graph convolutional recurrent neural network;
(3-4) inputting the training-set data at time t of the three-dimensional input tensor obtained in step (3-1) into the recurrent neural network to obtain the characterization result $h_t$ at time t;
(3-5) inputting the output $h_t$ obtained in step (3-4-4) into the two-dimensional convolution layer of the GCRNN (as shown in FIG. 3) to obtain the final traffic flow prediction result $Y' \in \mathbb{R}^{N \times W}$;
The specific implementation of this step is as follows:
$Y' = h_t \star f_{1 \times h}$,
where the number of channels of the two-dimensional convolution layer is 1, the output size is W = 12, and $f_{1 \times h}$ denotes a convolution kernel of size (1, h) in the two-dimensional convolution layer, with h = 64;
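The output layer of step (3-5) can be pictured with a small NumPy sketch. It assumes one input channel and W output channels, each with its own (1, h) kernel; under that reading the kernel spans the whole hidden width, so the convolution reduces to one dot product per node and output step. The names `conv_output_layer`, `kernels` and `bias` are hypothetical.

```python
import numpy as np

def conv_output_layer(h_t, kernels, bias):
    """h_t: (N, h); kernels: (W, h), one (1, h) kernel per output step; returns (N, W)."""
    return h_t @ kernels.T + bias

rng = np.random.default_rng(0)
N, h, W = 4, 64, 12                     # h = 64 and W = 12 as stated in the text
h_t = rng.normal(size=(N, h))
kernels = rng.normal(size=(W, h))
bias = np.zeros(W)
Y_pred = conv_output_layer(h_t, kernels, bias)

# Explicit cross-correlation for comparison: each kernel fits exactly once per node row.
Y_loop = np.array([[np.sum(h_t[n] * kernels[w]) + bias[w] for w in range(W)]
                   for n in range(N)])
```

The matrix product and the explicit loop give the same result, which is why a single layer can emit all W future steps for every node at once.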
(3-6) calculating the loss value O(Y, Y′) between the prediction result Y′ obtained in step (3-5) and the tensor Y obtained in step (2-14) using the L1 loss function, i.e., the absolute error between Y and Y′ (the formula is given as a formula image in the original publication and is not reproduced here);
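A minimal NumPy sketch of an L1 loss is given below. Averaging over all entries is an assumption (it matches the default reduction of common deep learning libraries); the patent's formula may instead sum the absolute errors.

```python
import numpy as np

def l1_loss(Y, Y_pred):
    """Mean absolute error between targets Y and predictions Y_pred."""
    return np.mean(np.abs(Y - Y_pred))

Y = np.array([[1.0, 2.0], [3.0, 4.0]])
Y_pred = np.array([[1.0, 1.0], [5.0, 4.0]])
loss = l1_loss(Y, Y_pred)  # (0 + 1 + 2 + 0) / 4 = 0.75
```

Compared with a squared error, the absolute error is less sensitive to occasional large deviations in the traffic data.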
(3-7) using the loss function of step (3-6) and the Adam optimizer of the PyTorch framework to iteratively update the learnable parameter E of step (3-2) and the learnable parameters of steps (3-4-1) to (3-4-4), i.e., the learnable parameters of the update gate, the reset gate and the transmission state;
(3-8) repeating the training process of steps (3-6) and (3-7) until the number of iterations of step (3-7) reaches its preset value (100 in the invention) or the loss value O(Y, Y′) of step (3-6) falls below a set threshold, at which point training ends and a preliminarily trained GCRNN model is obtained;
(3-9) verifying the GCRNN model preliminarily trained in step (3-8) with the test set obtained in step (3-1) until the prediction error is optimal, thereby obtaining the trained spatio-temporal prediction model.
Preferably, step (3-4) comprises the sub-steps of:
(3-4-1) inputting the training-set data at time t of the three-dimensional input tensor obtained in step (3-1), the adjacency matrix obtained in step (3-2) and the characterization result $h_{t-1}$ at the previous time t-1 into the GCRNN to obtain the update gate $z_t \in \mathbb{R}^{N \times h}$ at time t;
Specifically, the calculation formula of this step is given as a formula image in the original publication and is not reproduced here,
where $h_0 \in \mathbb{R}^{N \times h}$ is the initial state, an all-zero matrix, [ , ] denotes the splicing (concatenation) operation, the learnable parameters are those of the update gate $z_t$ at time t in the K-th cluster, and $\sigma(\cdot)$ is the sigmoid activation function;
(3-4-2) inputting the training-set data at time t of the three-dimensional input tensor obtained in step (3-1), the adjacency matrix obtained in step (3-2) and the characterization result $h_{t-1}$ at the previous time t-1 into the GCRNN to obtain the reset gate $r_t \in \mathbb{R}^{N \times h}$ at time t;
The calculation formula of this step is given as a formula image in the original publication and is not reproduced here,
where the learnable parameters are those of the reset gate $r_t$ at time t in the K-th cluster;
(3-4-3) inputting the training-set data at time t of the three-dimensional input tensor obtained in step (3-1), the reset gate $r_t$ obtained in step (3-4-2), the adjacency matrix obtained in step (3-2) and the characterization result $h_{t-1}$ at the previous time t-1 into the GCRNN to obtain the transmission state at time t;
The calculation formula of this step is given as a formula image in the original publication and is not reproduced here,
where $\odot$ denotes the element-wise product and the learnable parameters are those of the transmission state of the K-th cluster at time t;
(3-4-4) inputting the update gate $z_t$ obtained in step (3-4-1), the characterization result $h_{t-1}$ at the previous time t-1 and the transmission state obtained in step (3-4-3) into the GCRNN to obtain the characterization result $h_t \in \mathbb{R}^{N \times h}$ at the current time t;
The calculation formula of this step is given as a formula image in the original publication and is not reproduced here,
where $z_t \odot h_{t-1}$ performs selective forgetting of the information $h_{t-1}$ of the previous time t-1, and the remaining term performs selective memory of the transmission state, which contains the information of the current time t.
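Sub-steps (3-4-1) to (3-4-4) follow the structure of a gated recurrent unit whose linear maps are replaced by graph convolutions. The NumPy sketch below illustrates one such recurrent step under several assumptions that are not confirmed by this text: a single-hop propagation `A @ X @ W` stands in for the graph convolution, tanh is used for the transmission state, the new information is weighted by (1 - z_t), and one shared parameter set is used instead of cluster-specific parameters. All names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gcrnn_step(x_t, h_prev, A, p):
    """One recurrent step. x_t: (N, H), h_prev: (N, h), A: (N, N) adjacency."""
    xh = np.concatenate([x_t, h_prev], axis=1)         # splice [x_t, h_{t-1}]
    z_t = sigmoid(A @ xh @ p["W_z"] + p["b_z"])        # (3-4-1) update gate
    r_t = sigmoid(A @ xh @ p["W_r"] + p["b_r"])        # (3-4-2) reset gate
    xrh = np.concatenate([x_t, r_t * h_prev], axis=1)  # reset gate filters the old state
    h_cand = np.tanh(A @ xrh @ p["W_h"] + p["b_h"])    # (3-4-3) transmission state
    return z_t * h_prev + (1.0 - z_t) * h_cand         # (3-4-4) new characterization

rng = np.random.default_rng(0)
N, H, h = 5, 12, 8
A = np.full((N, N), 1.0 / N)                           # toy row-normalised adjacency
p = {k: rng.normal(0.0, 0.1, size=(H + h, h)) for k in ("W_z", "W_r", "W_h")}
p.update({k: np.zeros(h) for k in ("b_z", "b_r", "b_h")})
x_t = rng.normal(size=(N, H))
h_0 = np.zeros((N, h))                                 # all-zero initial state
h_1 = gcrnn_step(x_t, h_0, A, p)
h_2 = gcrnn_step(x_t, h_1, A, p)
```

In a cluster-specific variant, each cluster would hold its own copy of the parameters in `p`, and every node would be processed with the parameters of its cluster.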
According to another aspect of the present invention, there is provided a traffic flow prediction system including missing data, comprising:
a first module for acquiring a traffic data set of a certain area, the traffic data set containing missing data, and reconstructing the traffic data set into a traffic flow data matrix;
a second module for inputting the traffic flow data matrix X obtained by the first module into the orthogonal non-negative matrix factorization (ONMF) module of the trained spatio-temporal prediction model to form K clusters, filling the missing data in each cluster with the generalized matrix factorization (GMF) module of the spatio-temporal prediction model to obtain a data-filled traffic flow data matrix, standardizing the filled traffic flow data matrix to obtain a standardized traffic flow data matrix, and modelling the standardized traffic flow data matrix as a three-dimensional tensor according to the history step length H and the prediction window W;
a third module for inputting the three-dimensional tensor obtained by the second module into the graph convolutional recurrent neural network (GCRNN) of the trained spatio-temporal prediction model to obtain the prediction data Y'.
In general, compared with the prior art, the above technical solution contemplated by the present invention can achieve the following beneficial effects:
(1) Because the invention adopts the step (3), GCN and GRU are fused to form a Graph Convolution Recurrent Neural Network (GCRNN for short), the module can approximate GCN to capture spatial correlation by utilizing Chebyshev polynomial, and the mode can improve the calculation performance; meanwhile, a GRU model is adopted to enable the GRU model to selectively memorize key characteristics so as to learn long-range time characteristics. Therefore, the technical problem that the future state of the traffic flow cannot be faithfully reflected due to the fact that only the time relation is represented by the existing method can be solved;
(2) Since the step (3) is adopted in the invention, when the spatial correlation is captured, the adopted GCN depends on graph Laplacian decomposition to process irregular graph data, so that the technical problem that the spatial representation is limited to regular data can be solved;
(3) Because the invention adopts the step (3), the graph learning network is designed, and the graph structure can be learned in a data-driven mode instead of relying on the predefined graph structure, thereby solving the technical problem of poor universality of the hybrid neural network method;
(4) Because the invention adopts step (2), a cluster parameter learning mechanism based on Orthogonal Non-negative Matrix Factorization (ONMF) is designed; combined with step (3), it learns parameters that are shared within a cluster and specific between clusters, and captures the spatio-temporal dependencies of traffic flow sequences at a fine granularity. The mechanism gives traffic flow data from the same cluster a common parameter space and traffic flows from different clusters independent parameter spaces, thereby solving the technical problem that coarse-grained characterization degrades traffic flow prediction accuracy;
(5) Because the invention adopts the step (2), a Generalized Matrix Factorization filling module (GMF for short) is designed, and the invention is a simple and efficient deep learning method, and fills the lost traffic flow sequence by learning implicit cross correlation. The module learns the time node interaction function in each traffic flow cluster in a mode of element product rather than inner product, not only inherits the advantage of matrix decomposition, but also fully excavates the nonlinear intrinsic correlation of the traffic flow in different clusters, thereby making up the technical problem that the characteristic learning performance is influenced by the data loss problem caused by network or equipment faults.
(6) Because the invention adopts the cluster parameter learning mechanism in the step (2), the parameter quantity of the parameter learning mode is greatly reduced compared with the node-specific mode by controlling the number of clusters, namely the relationship between the fine-grained characterization learning and the parameter quantity is balanced; the clustering method in the module is well suitable for missing traffic data, and the number of nodes in each cluster is relatively balanced.
(7) Because the invention adopts step (2), the influence of missing data on the model is avoided by updating only the observable data; meanwhile, the ONMF module in step (2) offers uniqueness of the solution and interpretability of the clusters: the uniqueness of the solution ensures the stability of the algorithm, and the clustering module in step (2) is interpretable owing to its equivalence to the K-means method.
(8) The ONMF module, the GMF module and the GCRNN module are independent and can be used independently or jointly to adapt to the existing space-time data prediction model so as to improve the prediction performance.
Drawings
FIG. 1 illustrates a multi-modal characterization of a traffic data set;
FIG. 2 is a framework of the spatio-temporal prediction model of the present invention;
FIG. 3 is a block diagram of a GCRNN module designed by the present invention;
FIG. 4 shows an ablation experiment of the spatio-temporal prediction model FCGCRN of the present invention on the PEMS04 data set with a missing rate of 10%, wherein FIG. 4 (a), FIG. 4 (b) and FIG. 4 (c) are the experimental results on three error metrics: mean absolute error (MAE), root mean square error (RMSE) and mean absolute percentage error (MAPE), respectively;
FIG. 5 shows an ablation experiment of the spatio-temporal prediction model FCGCRN of the present invention on the PEMS08 data set with a missing rate of 10%, wherein FIG. 5 (a), FIG. 5 (b) and FIG. 5 (c) are the experimental results on the three error metrics MAE, RMSE and MAPE, respectively;
FIG. 6 shows an ablation experiment of the spatio-temporal prediction model FCGCRN of the present invention on the PEMS04 data set with a missing rate of 30%, wherein FIG. 6 (a), FIG. 6 (b) and FIG. 6 (c) are the experimental results on the three error metrics MAE, RMSE and MAPE, respectively;
FIG. 7 shows an ablation experiment of the spatio-temporal prediction model FCGCRN of the present invention on the PEMS08 data set with a missing rate of 30%, wherein FIG. 7 (a), FIG. 7 (b) and FIG. 7 (c) are the experimental results on the three error metrics MAE, RMSE and MAPE, respectively;
FIG. 8 shows a parameter analysis experiment of the spatio-temporal prediction model FCGCRN of the present invention on the cluster parameter K, wherein FIG. 8 (a), FIG. 8 (b), FIG. 8 (c) and FIG. 8 (d) are the analysis experiments of FCGCRN on the four data sets PEMS04 (MR = 10), PEMS08 (MR = 10), PEMS04 (MR = 30) and PEMS08 (MR = 30), respectively, measured by mean absolute percentage error (MAPE) and root mean square error (RMSE), where MR denotes the missing rate in percent;
FIG. 9 is a flowchart of the traffic flow prediction method containing missing data according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The basic idea of the invention is to provide a traffic flow prediction method and system for data containing missing values, which clusters the traffic data set with orthogonal non-negative matrix factorization, fills the missing traffic data within each cluster with generalized matrix factorization, learns the graph adaptively, and combines a graph convolutional recurrent neural network with a parameter learning mechanism that shares parameters within a cluster and keeps them specific between clusters, so as to characterize the temporal and spatial features of the traffic data at a fine granularity; finally, a two-dimensional convolution layer completes the prediction for the whole traffic data set.
As shown in fig. 9, the present invention provides a traffic flow prediction method containing missing data, which specifically includes the following steps:
(1) Acquiring a traffic data set of a certain area, wherein the traffic data set comprises missing data, and reconstructing the traffic data set into a traffic flow data matrix;
specifically, in this step, traffic data (which is three-dimensional tensor data) including missing data of each street in a certain area is acquired by using a sensor arranged on the street, and a traffic data set is formed by the traffic data including the missing data of all the streets, and then data dimension reduction processing is performed on the traffic data set to acquire a traffic flow data matrix.
The three-dimensional tensor data refers to time, nodes, traffic characteristics. The time refers to the time when the node acquires the traffic characteristics, the node refers to a single sensor, and the traffic characteristics comprise a vehicle speed characteristic, a traffic flow characteristic and a person number characteristic.
In this step, a traffic data set
Figure BDA0003904208390000121
Wherein
Figure BDA0003904208390000122
It represents the traffic flow data matrix of the nth node in all nodes (i.e. sensors) set on all streets in the area at T moments, n is [1,N ]],t∈[1,T]T is any positive integer, N is the total number of sensors arranged on all streets in the area, and
Figure BDA0003904208390000123
Figure BDA0003904208390000124
indicating that the data is non-negative;
wherein
Figure BDA0003904208390000125
is the c-th characteristic value of the n-th node at moment t, $c \in [1, C]$, where C denotes the number of traffic characteristic categories; the present invention has 3 traffic characteristics (vehicle speed characteristic, traffic flow characteristic and head count characteristic), and thus C = 3;
After obtaining the traffic data set Z, the invention sets C to 1 and performs data dimension reduction with the numpy.squeeze() function of the NumPy library in Python; the resulting traffic flow data matrix is
Figure BDA0003904208390000131
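By way of illustration only, the dimension reduction of this step can be sketched as follows. The sizes T and N, the random values, and the use of NaN to mark a missing reading are assumptions of the example and are not taken from the embodiment.

```python
import numpy as np

# Hypothetical sizes: T time steps, N sensors, C = 1 retained characteristic.
T, N, C = 6, 4, 1
rng = np.random.default_rng(0)
Z = rng.random((T, N, C))     # traffic data set as a three-dimensional tensor
Z[2, 1, 0] = np.nan           # one missing reading, marked with NaN (assumption)

# Remove the singleton characteristic axis to obtain the T x N matrix X.
X = np.squeeze(Z, axis=-1)
```

Passing `axis=-1` makes the call fail loudly if the last axis is not of length 1, which guards against squeezing a data set that still holds several characteristics.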
(2) Inputting the traffic flow data matrix X obtained in step (1) into the Orthogonal Non-negative Matrix Factorization (ONMF) module of a trained spatio-temporal prediction model to form K clusters (see steps (2-1) to (2-3) below), and filling data in each cluster using the Generalized Matrix Factorization (GMF) module of the spatio-temporal prediction model to obtain the data-filled traffic flow data matrix
Figure BDA0003904208390000132
(see the following steps (2-4) to (2-12)) for the filled traffic flow data matrix
Figure BDA0003904208390000133
Carrying out standardization processing to obtain a standardized traffic flow data matrix
Figure BDA0003904208390000134
(see step (2-13) below), and, according to the historical step length H and the prediction window W, the standardized traffic flow data matrix
Figure BDA0003904208390000135
is modeled as a three-dimensional tensor
Figure BDA0003904208390000136
(see the following steps (2-14) for details);
Specifically, the ONMF module mentioned in step (2) is the first part shown in fig. 2. It is a clustering module that clusters the traffic flow data matrix obtained in step (1) into K clusters. ONMF adds an orthogonality constraint to matrix factorization and is formally expressed as: $\min_{F \ge 0,\, G \ge 0} \|X - FG^{T}\|^{2}$, s.t. $G^{T}G = I$, where
Figure BDA0003904208390000137
is the traffic flow data matrix (R denotes the real numbers; X is a non-negative real matrix with T rows and N columns),
Figure BDA0003904208390000138
and
Figure BDA0003904208390000139
are two matrix factors of the traffic flow data matrix X,
Figure BDA00039042083900001310
is an identity matrix, and the value range of K is 2 to 5, preferably 2 or 3;
The GMF module is the second part shown in fig. 2, i.e., the filling module of the present invention, comprising an embedding layer, a decomposition layer and a filling layer. According to the clustering result obtained in step (2-3), the GMF module fills the missing part of the data in each cluster based on the observable data.
The process of inputting the traffic flow data matrix X obtained in the step (1) into the ONMF module to form K clusters in the step (2) specifically comprises the following steps:
(2-1) initializing matrix factors F and G of the traffic flow data matrix X to random values within (0,1);
(2-2) initializing the obtained matrix factors F and G according to the step (2-1) and adopting an updating rule
Figure BDA0003904208390000141
And
Figure BDA0003904208390000142
to update over the observable data in the matrix; when the error of $X - FG^{T}$ converges, the iteration stops, thereby obtaining the updated matrix factor G;
Because the traffic flow data matrix obtained in step (1) contains missing entries, the ONMF module uses only the observable data in the traffic flow data matrix.
The advantage of this step is that using only the observable data in the traffic flow data matrix prevents the missing data (i.e., unobservable data) from affecting the clustering result. Meanwhile, the matrix factors F and G obtained with the update rules of this step have a unique solution, i.e., the clustering result is the same every time as long as the parameters are the same.
(2-3) clustering the traffic flow data matrix X into K clusters according to the updated matrix factor G obtained in the step (2-2);
The advantage of this step is that it is interpretable, i.e., it is equivalent to the K-means clustering algorithm. Concretely, if the k-th probability of a node is the largest, the node belongs to the k-th cluster; in this way, the traffic flow data matrix is clustered into K clusters. Moreover, the data volumes of the K clusters obtained in this step are balanced, which provides favorable conditions for better characterizing the spatio-temporal correlation subsequently. Steps (2-1) to (2-3) above are the detailed process by which the ONMF module clusters the traffic flow data matrix; afterwards, the parameter K is adjusted continuously so that, when fused with the GCRNN module, the model is optimal, i.e., the prediction error is minimized;
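The masked error of step (2-2) and the cluster assignment of step (2-3) can be sketched as follows. This is a minimal illustration under stated assumptions: missing entries are marked with NaN, the toy factors F and G are hand-picked rather than produced by the update rules of the invention (which are not reproduced here), and the function names are hypothetical.

```python
import numpy as np

def masked_error(X, F, G, mask):
    """Squared error ||X - F G^T||^2 evaluated over observable entries only."""
    residual = np.where(mask, X - F @ G.T, 0.0)
    return float(np.sum(residual ** 2))

def assign_clusters(G):
    """Assign each node (row of G) to the cluster with the largest entry."""
    return np.argmax(G, axis=1)

# Toy example: T = 3 time steps, N = 4 nodes, K = 2 clusters.
X = np.array([[1.0, 2.0, np.nan, 4.0],
              [2.0, 4.0, 6.0, np.nan],
              [1.0, 2.0, 3.0, 4.0]])
mask = ~np.isnan(X)                       # True where the entry is observable
F = np.array([[1.0, 1.0], [2.0, 2.0], [1.0, 1.0]])              # T x K
# Hand-picked G (N x K); it is not orthogonal, it only illustrates the argmax.
G = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.3, 0.7]])
labels = assign_clusters(G)
err = masked_error(X, F, G, mask)
```

Here nodes 0 and 1 fall into the first cluster and nodes 2 and 3 into the second, and the two missing entries contribute nothing to the error.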
filling a traffic flow data matrix X in each cluster by using a GMF module in the step (2) to obtain a traffic flow data matrix after data filling
Figure BDA0003904208390000143
For the filled traffic flow data matrix
Figure BDA0003904208390000144
Carrying out standardization processing to obtain a standardized traffic flow data matrix
Figure BDA0003904208390000145
And standardizing the traffic flow data matrix according to the historical step length H and the prediction window W
Figure BDA0003904208390000146
Modelling as a three-dimensional tensor
Figure BDA0003904208390000147
This process comprises the following sub-steps:
(2-4) In each cluster obtained in step (2-3), dividing the traffic flow data matrix X into an observable data set (i.e., the set of non-missing data in X) and an unobservable data set (i.e., the set of missing data in X); reconstructing the observable data set to obtain a time vector $v_p \in \mathbb{R}^{m}$, a node vector $v_q \in \mathbb{R}^{m}$ and a traffic flow vector $v_y \in \mathbb{R}^{m}$, and reconstructing the unobservable data set to obtain a time vector $v'_p \in \mathbb{R}^{m'}$ and a node vector $v'_q \in \mathbb{R}^{m'}$, where m is the total number of observable data in the observable data set and m' is the total number of unobservable data in the unobservable data set;
(2-5) Inputting the two vectors $v_p$ and $v_q$ obtained in step (2-4) into the embedding layer of the GMF module to obtain the output of the embedding layer, i.e., the time matrix factor P and the node matrix factor Q (shown in fig. 2):
$P = e_1(v_p)$,
$Q = e_2(v_q)$,
where $P \in \mathbb{R}^{m \times a}$ and $Q \in \mathbb{R}^{m \times a}$ are the time matrix factor and the node matrix factor, respectively, a = 16 is the number of latent factors, and $e_1(\cdot)$ and $e_2(\cdot)$ denote embedding functions, both implemented with torch.nn.Embedding() in the PyTorch framework; in the specific implementation, the two embedding functions have different parameters;
(2-6) inputting the time matrix factor P and the node matrix factor Q obtained in the step (2-5) into a decomposition layer of the GMF module to obtain an output result f (P, Q):
$f(P, Q) = P \odot Q$,
where $\odot$ denotes the element-wise product.
The advantage of this step is that replacing the matrix inner product with the element-wise product in the decomposition layer retains the advantages of matrix factorization while fully mining the nonlinear intrinsic correlation of the data in different clusters.
(2-7) inputting the output result f (P, Q) obtained in the step (2-6) into a filling layer of the GMF module, wherein the obtained output is a filling result g (P, Q):
$g(P, Q) = a_{out}(W^{T}(P \odot Q) + b)$,
where $a_{out}$ is the ReLU activation function, and W and b denote the learnable weight and bias parameters in the GMF module, respectively;
The advantage of this step is that the filling layer is a simple and efficient neural network through which the missing traffic data can be filled.
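The forward pass through the embedding, decomposition and filling layers of steps (2-5) to (2-7) can be sketched in NumPy as follows. This is an illustration only: the embodiment uses torch.nn.Embedding(), whereas the sketch replaces the embedding functions with plain lookup tables, and the table names, sizes and random initialization are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, a = 5, 3, 16                      # a = 16 latent factors, as in the text

# Lookup tables standing in for the embedding functions e_1 and e_2.
time_table = rng.normal(size=(T, a))
node_table = rng.normal(size=(N, a))
W = rng.normal(size=a)                  # learnable weight of the filling layer
b = 0.0                                 # learnable bias of the filling layer

def gmf_forward(v_p, v_q):
    """Return g(P, Q) = ReLU(W^T (P * Q) + b) for index vectors v_p and v_q."""
    P = time_table[v_p]                 # embedding layer, shape m x a
    Q = node_table[v_q]                 # embedding layer, shape m x a
    f = P * Q                           # decomposition layer: element-wise product
    return np.maximum(f @ W + b, 0.0)   # filling layer with ReLU activation

v_p = np.array([0, 1, 4])               # time indices of three entries
v_q = np.array([2, 0, 1])               # node indices of the same entries
pred = gmf_forward(v_p, v_q)            # one filled value per (time, node) pair
```

Because the output passes through ReLU, every filled value is non-negative, which matches the non-negativity of the traffic data.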
(2-8) Evaluating the filling result of step (2-7) with the Mean Square Error (MSE), i.e., computing the error value MSE between the traffic flow vector $v_y$ of step (2-4) and the filling result g(P, Q) of step (2-7), by the formula:
Figure BDA0003904208390000161
(2-9) updating the weight W and the bias parameter b which can be learned in the GMF module through an Adam optimizer;
(2-10) Repeating steps (2-8) to (2-9) above until the MSE is less than the threshold ($10^{-6}$ in the present invention) or the number of training rounds reaches the preset number (100 in the present invention), thereby obtaining the trained GMF module;
(2-11) Inputting the unobservable data set obtained in step (2-4) into the GMF module trained in step (2-10) to obtain the traffic flow vector $v'_y \in \mathbb{R}^{m'}$.
At this point, the data filling is complete.
(2-12) Obtaining the data-filled traffic flow data matrix from the time vector $v_p$, the node vector $v_q$ and the traffic flow vector $v_y$ of the observable data set and the time vector $v'_p$ and the node vector $v'_q$ of the unobservable data set obtained in step (2-4), together with the traffic flow vector $v'_y$ obtained in step (2-11):
Figure BDA0003904208390000162
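The bookkeeping of steps (2-4) and (2-12), i.e., splitting the matrix into index vectors and writing the filled values back, can be sketched as follows. The NaN marking of missing entries and the constant placeholder standing in for the trained GMF output are assumptions of the example.

```python
import numpy as np

X = np.array([[10.0, np.nan, 30.0],
              [np.nan, 50.0, 60.0]])        # T = 2 time steps, N = 3 nodes

observed = ~np.isnan(X)
v_p, v_q = np.nonzero(observed)             # time / node indices, observable set
v_y = X[v_p, v_q]                           # observed traffic flow values
v_p_miss, v_q_miss = np.nonzero(~observed)  # time / node indices, unobservable set

# Placeholder for the trained GMF output on the unobservable set (hypothetical).
v_y_miss = np.full(v_p_miss.shape, 42.0)

X_filled = X.copy()
X_filled[v_p_miss, v_q_miss] = v_y_miss     # write the filled values into the gaps
```

The observable entries of `X_filled` are left untouched; only the positions listed in the unobservable index vectors are overwritten.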
(2-13) standardizing the traffic flow data matrix filled with data in the step (2-12) to obtain a standardized traffic flow data matrix
Figure BDA0003904208390000163
Figure BDA0003904208390000164
where μ is the mean and σ is the standard deviation of the data-filled traffic flow data matrix;
This step has two advantages. First, it removes the differences in data scale, which speeds up the convergence of the GCRNN module and thus reduces the computational cost. Second, it prevents gradient explosion during module training.
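A minimal sketch of the standardization follows, assuming that the mean and standard deviation are taken over the whole filled matrix (the text states only that they belong to the matrix). The inverse mapping is added for illustration and is not described in the embodiment.

```python
import numpy as np

X_filled = np.array([[10.0, 20.0],
                     [30.0, 40.0]])
mu = X_filled.mean()                 # mean over all entries (assumption)
sigma = X_filled.std()               # standard deviation over all entries
X_norm = (X_filled - mu) / sigma     # standardized traffic flow data matrix

# Values on the standardized scale can be mapped back to the original scale.
X_back = X_norm * sigma + mu
```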
(2-14) According to the historical step length H and the prediction window W, the standardized traffic flow data matrix obtained in step (2-13)
Figure BDA0003904208390000167
is reconstructed as a three-dimensional tensor
Figure BDA0003904208390000168
and a three-dimensional tensor $Y \in \mathbb{R}^{(T-H-W+1) \times N \times W}$.
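The sliding-window reconstruction can be sketched as follows. The shape of Y follows the text; the shape (T-H-W+1) x N x H assumed for the history tensor, and the function and variable names, are assumptions of the example.

```python
import numpy as np

def make_windows(X_norm, H, W):
    """Slice a T x N matrix into T-H-W+1 samples of history and target."""
    T, N = X_norm.shape
    S = T - H - W + 1                        # number of samples
    X_hist = np.empty((S, N, H))
    Y = np.empty((S, N, W))
    for s in range(S):
        X_hist[s] = X_norm[s:s + H].T        # H past steps for every node
        Y[s] = X_norm[s + H:s + H + W].T     # W future steps for every node
    return X_hist, Y

data = np.arange(20.0).reshape(10, 2)        # T = 10, N = 2
X_hist, Y = make_windows(data, H=4, W=3)
```

With T = 10, H = 4 and W = 3 there are 10 - 4 - 3 + 1 = 4 samples, and the last target window ends exactly at the final time step.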
The spatio-temporal prediction model is a Fine-grained Clustered Graph Convolution Recurrent Network (FCGCRN for short), and comprises an ONMF module, a GMF module and a GCRNN module.
(3) The three-dimensional tensor obtained by modeling in step (2)
Figure BDA0003904208390000171
is input into the Graph Convolution Recurrent Neural Network (GCRNN for short) of the trained spatio-temporal prediction model to obtain the prediction data Y', which has a low error rate.
As shown in fig. 3, the GCRNN of the present invention fuses a graph learning neural network, a graph convolution neural network, and a recurrent neural network under a cluster parameter learning mechanism.
The GCRNN network is obtained by training the following steps:
(3-1) The three-dimensional tensors obtained in step (2-14),
Figure BDA0003904208390000172
and Y, are divided into a training set and a test set at a ratio of 6:4;
(3-2) performing adaptive graph learning on the GCRNN through the parameter E to obtain the adjacency matrix
Figure BDA0003904208390000173
The calculation formula of the step is as follows:
Figure BDA0003904208390000174
where $E \in \mathbb{R}^{N \times e}$ is a learnable parameter matrix, initialized with the torch.FloatTensor() function in the PyTorch framework; in the invention, the best values of the parameter e found by parameter tuning are 2 and 10;
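The calculation formula of step (3-2) is not reproduced in the text above. Purely as an illustration of adaptive graph learning from a node embedding matrix E, the sketch below uses the row-wise softmax(ReLU(E E^T)) form that is common in the adaptive graph learning literature; whether the invention uses exactly this form is an assumption, and the function name is hypothetical.

```python
import numpy as np

def adaptive_adjacency(E):
    """Row-normalized adjacency softmax(ReLU(E E^T)) built from embeddings E."""
    scores = np.maximum(E @ E.T, 0.0)                      # ReLU of similarities
    scores = scores - scores.max(axis=1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)    # softmax over each row

rng = np.random.default_rng(0)
E = rng.normal(size=(5, 2))      # N = 5 nodes, embedding size e = 2
A = adaptive_adjacency(E)
```

Each row of `A` sums to 1, so the learned graph depends only on the trainable embeddings and not on a predefined road topology.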
(3-3) From the three-dimensional tensor obtained in step (3-1),
Figure BDA0003904208390000175
the training-set data at time t,
Figure BDA0003904208390000176
Figure BDA00039042083900001711
and the adjacency matrix obtained in step (3-2),
Figure BDA0003904208390000177
are input into the graph convolution neural network to obtain the graph convolution result $H_K \in \mathbb{R}^{N \times h}$ of the K-th cluster.
The calculation formula of the step is as follows:
Figure BDA0003904208390000178
where $G_K \in \mathbb{R}^{N \times N}$ is the Laplace matrix of the K-th cluster, $\theta_K \in \mathbb{R}^{H \times h}$ and $b_K \in \mathbb{R}^{h}$ are the learnable parameters of the K-th cluster, and h is the number of hidden-layer neurons in the graph convolution recurrent neural network.
Further, the invention adopts the Chebyshev polynomial expansion
Figure BDA0003904208390000179
to approximate the Laplace matrix $G_K$, where
Figure BDA00039042083900001710
In the specific implementation, $T_0 = 1$,
Figure BDA0003904208390000181
In the invention, parameter tuning finally determined that the prediction effect is best when the polynomial parameter d is 2 and the cluster parameter K is 2 or 3.
This step has two advantages. First, approximating the Laplace matrix with Chebyshev polynomials allows high-dimensional features to be represented at lower computational cost. Second, the graph convolution neural network adopted can strongly characterize spatial correlation (in both Euclidean and non-Euclidean space).
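The Chebyshev terms can be generated with the standard recurrence $T_0 = I$, $T_1 = L$, $T_k = 2LT_{k-1} - T_{k-2}$. The sketch below applies this recurrence to a small symmetric matrix; treating the matrix argument as an already scaled Laplacian, and the function name, are assumptions of the example.

```python
import numpy as np

def chebyshev_basis(lap, d):
    """Return [T_0(lap), ..., T_d(lap)] via T_k = 2 lap T_{k-1} - T_{k-2}."""
    n = lap.shape[0]
    basis = [np.eye(n)]                    # T_0 is the identity matrix
    if d >= 1:
        basis.append(lap.copy())           # T_1 is the matrix itself
    for _ in range(2, d + 1):
        basis.append(2.0 * lap @ basis[-1] - basis[-2])
    return basis

lap = np.array([[0.0, 0.5],
                [0.5, 0.0]])
T0, T1, T2 = chebyshev_basis(lap, d=2)     # d = 2, the value reported as best
```

For this matrix, `lap @ lap` equals 0.25 times the identity, so `T2` is 2 * 0.25 * I - I, i.e., -0.5 times the identity.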
(3-4) From the three-dimensional tensor obtained in step (3-1),
Figure BDA0003904208390000182
the training-set data at time t,
Figure BDA0003904208390000183
is input into a recurrent neural network (which is used for characterizing the temporal dependence) to obtain the characterization result $h_t$ at time t.
Specifically, the invention fuses the graph convolution neural network into the recurrent neural network to form the GCRNN so as to jointly characterize temporal and spatial correlation: the multi-layer perceptron (MLP) in the recurrent neural network GRU is replaced by the graph convolution network (GCN). The specific steps are as follows:
(3-4-1) From the three-dimensional tensor obtained in step (3-1),
Figure BDA0003904208390000184
the training-set data at time t,
Figure BDA0003904208390000185
the adjacency matrix obtained in step (3-2),
Figure BDA0003904208390000186
and the characterization result $h_{t-1}$ at the previous time t-1 (see step (3-4-4) below for details) are input into the GCRNN to obtain the update gate $z_t \in \mathbb{R}^{N \times h}$ at time t.
Specifically, the calculation formula in this step is:
Figure BDA0003904208390000187
where $h_0 \in \mathbb{R}^{N \times h}$ denotes the initial state and is an all-zero matrix, $[\cdot,\cdot]$ denotes the concatenation operation,
Figure BDA0003904208390000188
and
Figure BDA0003904208390000189
are the learnable parameters of the update gate $z_t$ at time t in the K-th cluster, and $\sigma(\cdot)$ is the sigmoid activation function;
(3-4-2) From the three-dimensional tensor obtained in step (3-1),
Figure BDA00039042083900001810
the training-set data at time t,
Figure BDA00039042083900001811
the adjacency matrix obtained in step (3-2),
Figure BDA00039042083900001812
and the characterization result $h_{t-1}$ at the previous time t-1 (see step (3-4-4) below for details) are input into the GCRNN to obtain the reset gate $r_t \in \mathbb{R}^{N \times h}$ at time t.
The calculation formula of the step is as follows:
Figure BDA00039042083900001813
where
Figure BDA00039042083900001814
and
Figure BDA00039042083900001815
are the learnable parameters of the reset gate $r_t$ at time t in the K-th cluster;
(3-4-3) From the three-dimensional tensor obtained in step (3-1),
Figure BDA00039042083900001816
the training-set data at time t,
Figure BDA00039042083900001817
the reset gate $r_t$ obtained in step (3-4-2), the adjacency matrix obtained in step (3-2),
Figure BDA0003904208390000191
and the characterization result $h_{t-1}$ at the previous time t-1 (see step (3-4-4) below) are input into the GCRNN to obtain the transmission state at time t
Figure BDA0003904208390000192
The calculation formula of the step is as follows:
Figure BDA0003904208390000193
where $\odot$ denotes the element-wise product,
Figure BDA0003904208390000194
and
Figure BDA0003904208390000195
are the learnable parameters of the transmission state
Figure BDA0003904208390000196
of the K-th cluster at time t;
(3-4-4) The update gate $z_t$ obtained in step (3-4-1), the characterization result $h_{t-1}$ at the previous time t-1 and the transmission state obtained in step (3-4-3)
Figure BDA0003904208390000197
are input into the GCRNN to obtain the characterization result $h_t \in \mathbb{R}^{N \times h}$ at the current time t.
The calculation formula of the step is as follows:
Figure BDA0003904208390000198
where $z_t \odot h_{t-1}$ represents selectively forgetting the information $h_{t-1}$ of the previous time t-1, and
Figure BDA0003904208390000199
Figure BDA00039042083900001910
represents selectively memorizing the information
Figure BDA00039042083900001911
which contains the current time t;
In steps (3-4-1) to (3-4-4), feature learning is performed at a single time t. In the specific implementation, long-term correlation is characterized by iterating over the information at T times, and the characterization result accumulated at the last time is used to predict the future traffic flow.
In addition, due to the excellent characterization capability of the graph convolution neural network, the number of layers of the graph neural network in the invention is only 2, and a lower error result can be achieved, namely the number of layers of the GCRNN is 2.
The above steps (3-1) to (3-4) have two advantages. First, the temporal and spatial correlations in the traffic data are captured in a fine-grained manner and high-precision prediction is achieved. In prior-art implementations, the parameter matrix θ is shared by all nodes (i.e., roads); in traffic prediction, however, not all nodes follow the same pattern. As shown in fig. 1, road 1 exhibits a morning-peak pattern, roads 2 and 4 exhibit an evening-peak pattern, and road 3 exhibits a morning-and-evening-peak pattern. The invention therefore adopts a parameter learning mode that is shared within a cluster and specific between clusters to characterize the spatio-temporal correlation at fine granularity, i.e., the GCRNN module is implemented in K clusters. Second, compared with other recurrent neural networks, the adopted recurrent neural network GRU can learn features over a long time range with fewer parameters and lower computational cost, while its learning capability is comparable to theirs.
(3-5) Inputting the output $h_t$ obtained in step (3-4-4) into the two-dimensional convolution layer of the GCRNN network (as shown in FIG. 3) to obtain the final traffic flow prediction result $Y' \in \mathbb{R}^{N \times W}$.
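The gating logic of steps (3-4-1) to (3-4-4) can be sketched as follows. This is a simplified illustration under stated assumptions: the graph convolution is reduced to the first-order form A X θ + b instead of the Chebyshev expansion, tanh is assumed for the transmission state, a single parameter set is used rather than one set per cluster, and the names `graph_conv`, `gcrnn_cell` and `d_in` are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def graph_conv(A, X, theta, b):
    """Simplified first-order graph convolution A X theta + b (stand-in)."""
    return A @ X @ theta + b

def gcrnn_cell(x_t, h_prev, A, params):
    """One gated recurrent step in which graph convolutions replace the MLPs."""
    xh = np.concatenate([x_t, h_prev], axis=1)                        # [x_t, h_{t-1}]
    z = sigmoid(graph_conv(A, xh, params["theta_z"], params["b_z"]))  # update gate
    r = sigmoid(graph_conv(A, xh, params["theta_r"], params["b_r"]))  # reset gate
    xrh = np.concatenate([x_t, r * h_prev], axis=1)
    h_cand = np.tanh(graph_conv(A, xrh, params["theta_h"], params["b_h"]))
    return z * h_prev + (1.0 - z) * h_cand     # forget old state, memorize new

rng = np.random.default_rng(0)
N, d_in, h = 4, 3, 8                           # nodes, input width, hidden width
A = np.full((N, N), 1.0 / N)                   # toy row-normalized adjacency
params = {}
for gate in ("z", "r", "h"):
    params["theta_" + gate] = rng.normal(scale=0.1, size=(d_in + h, h))
    params["b_" + gate] = np.zeros(h)

h_t = np.zeros((N, h))                         # h_0 is an all-zero matrix
for _ in range(5):                             # iterate over successive times
    h_t = gcrnn_cell(rng.normal(size=(N, d_in)), h_t, A, params)
```

In a cluster-specific variant, each cluster would hold its own entries in `params`, and a node would use the parameters of the cluster it was assigned to in step (2-3).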
The specific implementation of this step is as follows:
$Y' = h_t \star f_{1 \times h}$,
where the number of channels of the two-dimensional convolution layer is 1, the output size is W = 12, $f_{1 \times h}$ denotes the convolution kernel of the two-dimensional convolution layer, whose size is (1, h), and in the invention the prediction effect is best when h = 64;
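One way to read this output layer is that each of the W kernels of size (1, h) spans the whole hidden dimension of a node, so that a single kernel position per node yields one predicted step. The sketch below follows that reading; the omission of a bias term and the names used are assumptions, and the sketch is not the PyTorch layer of the embodiment.

```python
import numpy as np

def conv_1xh(h_t, kernels):
    """Apply W kernels of size (1, h) to an N x h feature map, giving N x W."""
    N, h = h_t.shape
    W = kernels.shape[0]
    out = np.zeros((N, W))
    for n in range(N):
        for w in range(W):
            out[n, w] = np.sum(h_t[n] * kernels[w])   # one kernel position per node
    return out

rng = np.random.default_rng(0)
h_t = rng.normal(size=(5, 64))        # N = 5 nodes, h = 64 hidden units
kernels = rng.normal(size=(12, 64))   # W = 12 kernels, one per predicted step
Y_pred = conv_1xh(h_t, kernels)
```

Under this reading the layer is equivalent to the matrix product `h_t @ kernels.T`, which maps the N x h characterization result directly to the N x W prediction.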
(3-6) calculating a loss value O (Y, Y ') between the prediction result Y' obtained in the step (3-5) and the tensor Y (which refers to the training set) obtained in the step (2-14) by using an L1 loss function;
Specifically, the L1 loss function used in this step is:
Figure BDA0003904208390000201
(3-7) Using the loss function of step (3-6) and the Adam optimizer in the PyTorch framework, the learnable parameter E of step (3-2) and the learnable parameters of steps (3-4-1) to (3-4-4)
Figure BDA0003904208390000202
Figure BDA0003904208390000203
and
Figure BDA0003904208390000204
are iteratively updated;
(3-8) Repeating the training process of steps (3-6) to (3-7) until the number of iterations of step (3-7) reaches the preset number (100 in the present invention) or the loss value O(Y, Y') of step (3-6) is less than the set threshold ($10^{-6}$ in the present invention); training then ends and a preliminarily trained GCRNN model is obtained;
(3-9) Verifying the GCRNN model preliminarily trained in step (3-8) with the test set obtained in step (3-1) until the prediction error is optimal, thereby obtaining the trained spatio-temporal prediction model.
In summary, through the above description of the present invention, the main advantages of the present invention include:
1. The traffic flow prediction method containing missing data can fill traffic flow data containing missing data, characterizes the complex, long-range spatio-temporal correlation of historical traffic flow at fine granularity, and achieves high-precision prediction of the future traffic flow state.
2. By partitioning the traffic flow data, the GCRNN can extract the features of the traffic flow data with a parameter learning mechanism that is shared within a cluster and specific between clusters. Clustering uses the orthogonality-constrained non-negative matrix factorization algorithm ONMF, which adapts well to traffic flow data with missing entries because it iterates only over observable data. Moreover, the nodes are evenly distributed across the clusters in the clustering result, which provides favorable conditions for better characterizing the spatio-temporal relationship subsequently. In addition, clustering is performed before data filling, so the cluster structure is accurate and reliable.
3. The generalized matrix factorization module GMF fills the missing data within each cluster, which effectively exploits the high correlation within a cluster, while a simple neural network learns the nonlinear relationships among the data.
4. The invention adopts the graph convolution neural network to learn the spatial correlation among data, the graph structure is not limited by Euclidean space, and the spatial relationship can be well represented. Meanwhile, the graph convolution neural network adopts a Chebyshev polynomial to approximate the traditional graph convolution, and the calculation cost is reduced under the condition of ensuring the representation effect. The invention replaces the predefined graph with adaptive graph learning in the GCRNN module, so that the graph depends on data and not on a predefined structure.
5. The traffic flow data is modeled into tensors and is subjected to standardization processing, so that the difference between data scales can be effectively eliminated, and the influence on the representation is eliminated.
Results of the experiment
The invention performs experiments on two real traffic flow data sets, PEMS04 and PEMS08, with missing rates (MR) of 10% and 30%. The PEMS04 data set has 307 nodes and 16992 time steps, covering 59 days in total; the PEMS08 data set has 170 nodes and 17856 time steps, covering 62 days in total. Both data sets use 5 minutes as one time step.
The validity and accuracy of the traffic flow prediction method containing missing data provided by the invention are verified through comparative experiments on real data sets, as shown in Tables 1 and 2 and figs. 4 to 7. The spatio-temporal prediction model FCGCRN is compared with eight baseline methods, namely HA, ARIMA, Logistic Regression (LR for short), LSTM, DCRNN, MTGNN, DMSTGCN and STGODE, using three metrics: Mean Absolute Error (MAE), Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE). The smaller an error metric, the better the prediction performance on data containing missing values. Tables 1 and 2 show the experimental results for the two missing rates on the two real data sets. The tables show that FCGCRN outperforms all other baselines on the MAE and RMSE metrics and outperforms most baselines on the MAPE metric, and that its advantage is more pronounced at the high missing rate. Table 1 also shows that the deep learning methods (LR, LSTM, DCRNN, DMSTGCN and STGODE) predict better than the conventional statistical methods (HA and ARIMA). It can further be seen that the adaptive graph learning models (MTGNN and STGODE) are strongly affected by missing data; in contrast, the model of the invention, although also a graph learning model, performs more stably when data are missing and the missing rate is high.
The ablation experiments of figs. 4 to 7 demonstrate the effectiveness of the key modules of the invention. The GMF ensures the integrity of the data by filling the missing data; the cluster parameter learning mechanism ensures the fine granularity of the model by sharing parameters within a cluster and keeping them specific between clusters; and the data-driven graph learning in the GCRNN module makes the model more general for spatio-temporal sequence prediction tasks. FIG. 8 analyzes the key parameter of the invention, the number of clusters K, which determines the diversity of the parameters in the GCRNN module and affects the number of parameters. As can be seen, the model works best when K is 2 or 3, indicating that the GCRNN learns 2 to 3 classes of specific parameters. This result corresponds to traffic sequences with morning-peak, evening-peak and morning-and-evening-peak patterns. In conclusion, the method offers high stability, fine granularity and generality on spatio-temporal prediction tasks containing missing data.
TABLE 1 Comparative experimental results of the spatio-temporal prediction model FCGCRN of the present invention and the eight baseline methods on the PEMS04 and PEMS08 data sets (missing rate 10%)
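For reference, the three metrics can be computed as sketched below. The function names and the sample values are illustrative and are not results of the invention; skipping entries whose true value is zero in MAPE is an assumption of the example.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mape(y_true, y_pred):
    """MAPE in percent; entries whose true value is zero are skipped."""
    keep = y_true != 0
    ratio = np.abs((y_true[keep] - y_pred[keep]) / y_true[keep])
    return float(np.mean(ratio) * 100.0)

y_true = np.array([100.0, 200.0, 400.0])   # hypothetical true flows
y_pred = np.array([110.0, 180.0, 400.0])   # hypothetical predictions
```

For these sample values the absolute errors are 10, 20 and 0, so the MAE is 10; all three metrics are lower-is-better.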
Figure BDA0003904208390000231
TABLE 2 Comparative experimental results of the spatio-temporal prediction model FCGCRN of the present invention and the eight baseline methods on the PEMS04 and PEMS08 data sets (missing rate 30%)
Figure BDA0003904208390000232
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (8)

1. A traffic flow prediction method containing missing data is characterized by comprising the following steps:
(1) Acquiring a traffic data set of a certain area, wherein the traffic data set comprises missing data, and reconstructing the traffic data set into a traffic flow data matrix;
(2) Inputting the traffic flow data matrix X obtained in step (1) into the orthogonal non-negative matrix factorization (ONMF) module of a trained spatio-temporal prediction model to form K clusters, and filling data in each cluster using the generalized matrix factorization (GMF) module of the spatio-temporal prediction model to obtain the data-filled traffic flow data matrix
Figure FDA0003904208380000016
For the filled traffic flow data matrix
Figure FDA0003904208380000017
Carrying out standardization processing to obtain a standardized traffic flow data matrix
Figure FDA00039042083800000110
And standardizing the traffic flow data matrix according to the historical step length H and the prediction window W
Figure FDA0003904208380000019
Modelling as a three-dimensional tensor
Figure FDA0003904208380000018
(3) The three-dimensional tensor obtained by modeling in step (2)
Figure FDA00039042083800000111
is input into the graph convolution recurrent neural network GCRNN of the trained spatio-temporal prediction model to obtain the prediction data Y'.
2. The method for predicting the traffic flow containing the missing data according to claim 1, wherein the traffic data is three-dimensional tensor data { time, node, traffic characteristics }, wherein the time refers to the time when the node acquires the traffic characteristics, the node refers to a single sensor, and the traffic characteristics comprise a vehicle speed characteristic, a traffic flow characteristic and a people number characteristic.
3. The method for predicting the traffic flow containing the missing data according to claim 1 or 2, wherein in the step (1), the traffic data set
Figure FDA0003904208380000011
Wherein
Figure FDA0003904208380000012
It represents the traffic flow data matrix of the n-th node among all nodes (i.e., sensors) arranged on all streets in the area over T moments, $n \in [1, N]$, $t \in [1, T]$, T is any positive integer, N is the total number of sensors arranged on all streets in the area, and
Figure FDA0003904208380000013
Figure FDA0003904208380000014
indicating that the data is non-negative;
wherein
Figure FDA0003904208380000015
is the c-th characteristic value of the n-th node at moment t, $c \in [1, C]$, where C denotes the number of traffic characteristic categories.
4. The method for predicting the traffic flow containing the missing data according to any one of claims 1 to 3, wherein the process of inputting the traffic flow data matrix X obtained in the step (1) into the ONMF module to form K clusters in the step (2) specifically comprises:
(2-1) initializing matrix factors F and G of the traffic flow data matrix X to random values within (0,1);
(2-2) initializing the obtained matrix factors F and G according to the step (2-1) and adopting an updating rule
Figure FDA0003904208380000021
And
Figure FDA0003904208380000022
to update over the observable data in the matrix; when the error of $X - FG^{T}$ converges, the iteration stops, thereby obtaining the updated matrix factor G;
and (2-3) clustering the traffic flow data matrix X into K clusters according to the updated matrix factor G obtained in the step (2-2).
5. The method for predicting the traffic flow containing the missing data according to claim 4, wherein the GMF module is used to fill the traffic flow data matrix X in each cluster in the step (2) to obtain the traffic flow data matrix after data filling
Figure FDA0003904208380000023
For the filled traffic flow data matrix
Figure FDA0003904208380000024
Carrying out standardization processing to obtain a standardized traffic flow data matrix
Figure FDA0003904208380000025
And standardizing the traffic flow data matrix according to the historical step length H and the prediction window W
Figure FDA0003904208380000026
Modelling as a three-dimensional tensor
Figure FDA0003904208380000027
This process comprises the following sub-steps:
(2-4) In each cluster obtained in step (2-3), dividing the traffic flow data matrix X into an observable data set and an unobservable data set; reconstructing the observable data set to obtain a time vector $v_p \in \mathbb{R}^{m}$, a node vector $v_q \in \mathbb{R}^{m}$ and a traffic flow vector $v_y \in \mathbb{R}^{m}$, and reconstructing the unobservable data set to obtain a time vector $v'_p \in \mathbb{R}^{m'}$ and a node vector $v'_q \in \mathbb{R}^{m'}$, where m is the total number of observable data in the observable data set and m' is the total number of unobservable data in the unobservable data set;
(2-5) Inputting the two vectors $v_p$ and $v_q$ obtained in step (2-4) into the embedding layer of the GMF module to obtain the output of the embedding layer, i.e., the time matrix factor P and the node matrix factor Q:
$P = e_1(v_p)$,
$Q = e_2(v_q)$,
where $P \in \mathbb{R}^{m \times a}$ and $Q \in \mathbb{R}^{m \times a}$ are the time matrix factor and the node matrix factor, respectively, a = 16 is the number of latent factors, and $e_1(\cdot)$ and $e_2(\cdot)$ denote embedding functions, both implemented with torch.nn.Embedding() in the PyTorch framework;
(2-6) inputting the time matrix factor P and the node matrix factor Q obtained in the step (2-5) into a decomposition layer of the GMF module to obtain an output result f (P, Q):
$f(P, Q) = P \odot Q$,
where $\odot$ denotes the element-wise product;
(2-7) inputting the output result f (P, Q) obtained in the step (2-6) into a filling layer of the GMF module, wherein the obtained output is a filling result g (P, Q):
$g(P, Q) = a_{out}(W^{T}(P \odot Q) + b)$,
where $a_{out}$ is the ReLU activation function, and W and b denote the learnable weight and bias parameters in the GMF module, respectively;
(2-8) Evaluating the filling result of step (2-7) with the mean square error MSE, i.e., computing the error value MSE between the traffic flow vector $v_y$ of step (2-4) and the filling result g(P, Q) of step (2-7), by the formula:
Figure FDA0003904208380000031
(2-9) updating the learnable weight W and bias parameter b of the GMF module with an Adam optimizer;
(2-10) repeating steps (2-8) to (2-9) until the MSE is smaller than a threshold or the number of training rounds reaches a preset value, thereby obtaining a trained GMF module;
(2-11) inputting the unobservable data set obtained in step (2-4) into the GMF module trained in step (2-10) to obtain a traffic flow vector v′_y ∈ R^{m′};
(2-12) using the time vector v_p, the node vector v_q and the traffic flow vector v_y of the observable data set obtained in step (2-4), the time vector v′_p and the node vector v′_q of the unobservable data set, and the traffic flow vector v′_y obtained in step (2-11), obtaining the data-filled traffic flow data matrix
Figure FDA0003904208380000032
(2-13) standardizing the traffic flow data matrix filled with data in the step (2-12) to obtain a standardized traffic flow data matrix
Figure FDA0003904208380000041
Figure FDA0003904208380000042
wherein μ is the mean of the traffic flow data matrix
Figure FDA0003904208380000043
and
Figure FDA0003904208380000044
denotes its standard deviation;
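The standardization of step (2-13) can be sketched as a z-score transform. The formula image is not reproduced here, so the sketch assumes that a single mean and a single (population) standard deviation are computed over all entries of the filled matrix; the names `standardize`, `mu` and `sigma` are illustrative.

```python
import numpy as np

def standardize(X_filled):
    """Z-score the filled matrix with one global mean and standard deviation (assumed)."""
    mu = X_filled.mean()
    sigma = X_filled.std()
    return (X_filled - mu) / sigma, mu, sigma

X_filled = np.array([[1.0, 2.0],
                     [3.0, 4.0]])
X_std, mu, sigma = standardize(X_filled)
# Predictions made on the standardized scale can be mapped back with x * sigma + mu.
```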
(2-14) according to the history step length H and the prediction window W, reconstructing the standardized traffic flow data matrix obtained in step (2-13)
Figure FDA0003904208380000045
into a three-dimensional tensor
Figure FDA0003904208380000046
and a three-dimensional tensor Y ∈ R^{(T-H-W+1)×N×W}.
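The windowing of step (2-14) can be sketched as a sliding window over the time axis. Only the shape of Y is stated in the text; the sketch assumes that the input tensor has shape (T-H-W+1)×N×H and that the standardized matrix is laid out as time × node. The function name `make_windows` is hypothetical.

```python
import numpy as np

def make_windows(X_std, H, W):
    """Cut a (T, N) matrix into history windows of length H and targets of length W."""
    T, N = X_std.shape
    S = T - H - W + 1                                   # number of samples
    X = np.stack([X_std[s:s + H].T for s in range(S)])              # (S, N, H), assumed shape
    Y = np.stack([X_std[s + H:s + H + W].T for s in range(S)])      # (S, N, W)
    return X, Y

T, N, H, W = 10, 3, 4, 2
X_std = np.arange(T * N, dtype=float).reshape(T, N)
X, Y = make_windows(X_std, H, W)
```

With T = 10, H = 4 and W = 2 this yields 5 samples; the first sample of node 0 uses time steps 0 to 3 as history and time steps 4 and 5 as the target.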
6. The method for predicting the traffic flow containing the missing data according to claim 5, wherein the GCRNN network is obtained by training through the following steps:
(3-1) dividing the three-dimensional tensors obtained in step (2-14)
Figure FDA0003904208380000047
and Y into a training set and a test set at a ratio of 6:4;
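A 6:4 split as in step (3-1) can be sketched as below. The claim does not say whether the split is chronological or random; the sketch assumes a chronological split, which is common for time series, and the function name `split_train_test` is hypothetical.

```python
def split_train_test(X, Y, train_parts=6, total_parts=10):
    """Chronological split (assumed): the first 6/10 of the samples form the training set."""
    n_train = len(X) * train_parts // total_parts   # integer arithmetic avoids float rounding
    return (X[:n_train], Y[:n_train]), (X[n_train:], Y[n_train:])

samples = list(range(10))
targets = [2 * s for s in samples]
(train_X, train_Y), (test_X, test_Y) = split_train_test(samples, targets)
```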
(3-2) performing adaptive graph learning on the GCRNN through the parameter E to obtain the adjacency matrix
Figure FDA0003904208380000048
The calculation formula of the step is as follows:
Figure FDA0003904208380000049
wherein E ∈ R^{N×e} is a learnable parameter matrix, the parameter E is initialized with the torch.FloatTensor() function in the PyTorch framework, and the best tuned values of the embedding dimension e are 2 and 10;
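The adjacency formula of step (3-2) is an image that is not reproduced in this text. The sketch below therefore assumes a form that is common in adaptive graph learning, a row-wise softmax over ReLU(E Eᵀ); this is an assumption and not necessarily the formula of the claim. The random initialization and the name `adaptive_adjacency` are likewise illustrative.

```python
import numpy as np

def adaptive_adjacency(E):
    """Assumed form: row-wise softmax(ReLU(E @ E.T)); not confirmed by the source."""
    scores = np.maximum(E @ E.T, 0.0)                  # ReLU of node-embedding similarities
    scores -= scores.max(axis=1, keepdims=True)        # subtract row max for numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
N, e = 5, 2                                            # e = 2 is one of the tuned values in the text
E = rng.normal(size=(N, e))                            # hypothetical initial values
A = adaptive_adjacency(E)
```

Under this assumed form each row of the learned adjacency matrix sums to 1, so it can be read as a normalized set of neighbor weights for the corresponding node.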
(3-3) taking, from the three-dimensional tensor obtained in step (3-1)
Figure FDA00039042083800000410
the training-set data at time t
Figure FDA00039042083800000411
Figure FDA00039042083800000412
together with the adjacency matrix obtained in step (3-2)
Figure FDA00039042083800000413
and inputting them into the graph convolutional neural network to obtain the graph convolution result H_K ∈ R^{N×h} of the K-th cluster.
The calculation formula of the step is as follows:
Figure FDA00039042083800000414
wherein G_K ∈ R^{N×N} is the Laplacian matrix of the K-th cluster, θ_K ∈ R^{H×h} and b_K ∈ R^h are the learnable parameters of the K-th cluster, and h is the number of hidden-layer neurons in the graph convolutional recurrent neural network;
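The graph convolution formula of step (3-3) is an image that is not reproduced here. From the stated shapes (G_K is N×N, θ_K is H×h, b_K has length h, and the result is N×h), one consistent reading is H_K = G_K X_t θ_K + b_K with X_t of shape N×H. The sketch below implements that reading as an assumption; any activation function in the original formula is unknown and omitted.

```python
import numpy as np

def graph_conv(G, X_t, theta, b):
    """Assumed form G @ X_t @ theta + b, inferred only from the stated shapes."""
    return G @ X_t @ theta + b

rng = np.random.default_rng(0)
N, H, h = 4, 12, 64                     # nodes, history length (assumed), hidden size
G = np.eye(N)                           # placeholder for the Laplacian matrix G_K
X_t = rng.normal(size=(N, H))
theta = rng.normal(size=(H, h))
b = np.zeros(h)
H_K = graph_conv(G, X_t, theta, b)      # (N, h)
```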
(3-4) taking, from the three-dimensional tensor obtained in step (3-1)
Figure FDA00039042083800000416
the training-set data at time t
Figure FDA00039042083800000415
and inputting it into the recurrent neural network to obtain the characterization result h_t at time t;
(3-5) inputting the output h_t obtained in step (3-4) into the two-dimensional convolution layer of the GCRNN network (as shown in FIG. 3) to obtain the final traffic flow prediction result Y′ ∈ R^{N×W}.
The specific implementation of this step is as follows:
Y′ = h_t ★ f_{1×h},
wherein the number of channels of the two-dimensional convolution layer is 1, the output length is W = 12, and f_{1×h} denotes a convolution kernel of size (1, h) in the two-dimensional convolution layer, where h = 64;
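The readout of step (3-5) can be sketched as follows. A kernel of size (1, h) sliding over an N×h input produces exactly one value per node, so the convolution reduces to a dot product along the hidden axis. The sketch assumes one input channel and W = 12 output channels (one kernel per predicted step), which is one reading of the claim; `kernels`, `bias` and `conv_readout` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(0)
N, h, W = 4, 64, 12                       # nodes (assumed), hidden size h = 64, horizon W = 12
h_t = rng.normal(size=(N, h))             # characterization result at time t
kernels = rng.normal(size=(W, h))         # one (1, h) kernel per output step (assumed)
bias = np.zeros(W)

def conv_readout(h_t, kernels, bias):
    # Each (1, h) kernel covers a full row of h_t, so the convolution output
    # for a node is the dot product of that node's row with the kernel.
    return h_t @ kernels.T + bias         # (N, W)

Y_pred = conv_readout(h_t, kernels, bias)
```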
(3-6) calculating a loss value O (Y, Y ') between the prediction result Y' obtained in the step (3-5) and the tensor Y obtained in the step (2-14) by using an L1 loss function;
specifically, the L1 loss function used in this step is:
Figure FDA0003904208380000051
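The loss formula of step (3-6) is an image that is not reproduced here. A minimal sketch, assuming the L1 loss is the mean absolute difference between Y and Y′ over all entries, is given below; the function name `l1_loss` is hypothetical.

```python
import numpy as np

def l1_loss(Y, Y_pred):
    """Mean absolute error over all entries (assumed reduction)."""
    Y = np.asarray(Y, dtype=float)
    Y_pred = np.asarray(Y_pred, dtype=float)
    return float(np.mean(np.abs(Y - Y_pred)))

loss = l1_loss([[1.0, 2.0], [3.0, 4.0]],
               [[1.5, 2.0], [2.0, 6.0]])   # (0.5 + 0 + 1 + 2) / 4
```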
(3-7) using the loss function of step (3-6) and the Adam optimizer in the PyTorch framework, iteratively updating the learnable parameter E of step (3-2) and the following learnable parameters of steps (3-4-1) to (3-4-4):
Figure FDA0003904208380000052
Figure FDA0003904208380000053
and
Figure FDA0003904208380000054
(3-8) repeating the training process of steps (3-6) to (3-7) until the number of iterations of step (3-7) reaches a preset value (100 in the present invention) or the loss value O(Y, Y′) of step (3-6) falls below a set threshold, at which point training ends, thereby obtaining a preliminarily trained GCRNN model;
(3-9) validating the GCRNN model preliminarily trained in step (3-8) with the test set obtained in step (3-1) until the prediction error is optimal, thereby obtaining the trained space-time prediction model.
7. The traffic flow prediction method containing missing data according to claim 6, characterized in that step (3-4) includes the following substeps:
(3-4-1) taking, from the three-dimensional tensor obtained in step (3-1)
Figure FDA0003904208380000055
the training-set data at time t
Figure FDA0003904208380000056
together with the adjacency matrix obtained in step (3-2)
Figure FDA0003904208380000057
and the characterization result h_{t-1} at the previous time t-1, and inputting them into the GCRNN to obtain the update gate z_t ∈ R^{N×h} at time t.
Specifically, the calculation formula in this step is:
Figure FDA0003904208380000061
wherein h_0 ∈ R^{N×h} represents the initial state and is an all-zero matrix, [·,·] denotes the concatenation operation,
Figure FDA0003904208380000062
and
Figure FDA0003904208380000063
are the learnable parameters of the update gate z_t at time t in the K-th cluster, and σ(·) is the sigmoid activation function;
(3-4-2) taking, from the three-dimensional tensor obtained in step (3-1)
Figure FDA0003904208380000064
the training-set data at time t
Figure FDA0003904208380000065
together with the adjacency matrix obtained in step (3-2)
Figure FDA0003904208380000066
and the characterization result h_{t-1} at the previous time t-1, and inputting them into the GCRNN to obtain the reset gate r_t ∈ R^{N×h} at time t.
The calculation formula of the step is as follows:
Figure FDA0003904208380000067
wherein
Figure FDA0003904208380000068
and
Figure FDA0003904208380000069
are the learnable parameters of the reset gate r_t at time t in the K-th cluster;
(3-4-3) taking, from the three-dimensional tensor obtained in step (3-1)
Figure FDA00039042083800000610
the training-set data at time t
Figure FDA00039042083800000611
together with the reset gate r_t obtained in step (3-4-2), the adjacency matrix obtained in step (3-2)
Figure FDA00039042083800000612
and the characterization result h_{t-1} at the previous time t-1, and inputting them into the GCRNN to obtain the transmission state at time t
Figure FDA00039042083800000613
The calculation formula of the step is as follows:
Figure FDA00039042083800000614
wherein ⊙ denotes the element-wise product, and
Figure FDA00039042083800000615
and
Figure FDA00039042083800000616
are the learnable parameters of the transmission state
Figure FDA00039042083800000617
of the K-th cluster at time t;
(3-4-4) inputting the update gate z_t obtained in step (3-4-1), the characterization result h_{t-1} at the previous time t-1, and the transmission state obtained in step (3-4-3)
Figure FDA00039042083800000618
into the GCRNN to obtain the characterization result h_t ∈ R^{N×h} at the current time t.
The calculation formula of the step is as follows:
Figure FDA00039042083800000619
wherein z_t ⊙ h_{t-1} represents selective forgetting of the information h_{t-1} from the previous time t-1, and
Figure FDA00039042083800000620
Figure FDA00039042083800000621
represents selective memorizing of
Figure FDA00039042083800000622
which contains the information of the current time t.
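Substeps (3-4-1) to (3-4-4) follow the structure of a gated recurrent unit whose linear maps are preceded by multiplication with the adjacency matrix. Since the formula images are not reproduced here, the sketch below assumes the usual GRU form: sigmoid gates over the concatenation [x_t, h_{t-1}], a tanh transmission state computed from [x_t, r_t ⊙ h_{t-1}], and the update h_t = z_t ⊙ h_{t-1} + (1 - z_t) ⊙ (transmission state), which matches the forgetting and memorizing terms described in the text. The parameter names, the input width d and the uniform adjacency matrix are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def graph_gru_step(x_t, h_prev, A, params):
    """One recurrent step; the exact formulas are assumed, not taken from the source."""
    Wz, bz, Wr, br, Wc, bc = params
    xh = np.concatenate([x_t, h_prev], axis=1)        # [x_t, h_{t-1}]
    z = sigmoid(A @ xh @ Wz + bz)                     # update gate z_t
    r = sigmoid(A @ xh @ Wr + br)                     # reset gate r_t
    xc = np.concatenate([x_t, r * h_prev], axis=1)    # reset gate applied to the old state
    h_cand = np.tanh(A @ xc @ Wc + bc)                # transmission (candidate) state
    return z * h_prev + (1.0 - z) * h_cand            # forget old, memorize new

rng = np.random.default_rng(0)
N, d, h = 4, 3, 8                                     # nodes, input width, hidden size (all assumed)
A = np.full((N, N), 1.0 / N)                          # placeholder adjacency matrix
params = tuple(p for _ in range(3)
               for p in (rng.normal(scale=0.1, size=(d + h, h)), np.zeros(h)))
h_state = np.zeros((N, h))                            # h_0 is an all-zero matrix
for x_t in rng.normal(size=(5, N, d)):
    h_state = graph_gru_step(x_t, h_state, A, params)
```

Because h_0 is zero and each step is a convex combination of the previous state and a tanh output, every entry of the state stays strictly between -1 and 1.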
8. A traffic flow prediction system including missing data, comprising:
a first module for acquiring a traffic data set of a certain area, the traffic data set containing missing data, and reconstructing the traffic data set into a traffic flow data matrix;
a second module for inputting the traffic flow data matrix X obtained by the first module into the orthogonal non-negative matrix factorization (ONMF) module of the trained space-time prediction model to form K clusters, and filling data in each cluster using the generalized matrix factorization (GMF) module of the space-time prediction model to obtain the data-filled traffic flow data matrix
Figure FDA0003904208380000071
performing standardization on the data-filled traffic flow data matrix
Figure FDA0003904208380000072
to obtain a standardized traffic flow data matrix
Figure FDA0003904208380000073
and, according to the history step length H and the prediction window W, modelling the standardized traffic flow data matrix
Figure FDA0003904208380000074
as a three-dimensional tensor
Figure FDA0003904208380000075
a third module for inputting the three-dimensional tensor obtained by the second module
Figure FDA0003904208380000076
into the graph convolutional recurrent neural network (GCRNN) of the trained space-time prediction model to obtain prediction data Y′.
CN202211301263.7A 2022-10-24 2022-10-24 Traffic flow prediction method and system containing missing data Pending CN115700628A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211301263.7A CN115700628A (en) 2022-10-24 2022-10-24 Traffic flow prediction method and system containing missing data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211301263.7A CN115700628A (en) 2022-10-24 2022-10-24 Traffic flow prediction method and system containing missing data

Publications (1)

Publication Number Publication Date
CN115700628A true CN115700628A (en) 2023-02-07

Family

ID=85120935

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211301263.7A Pending CN115700628A (en) 2022-10-24 2022-10-24 Traffic flow prediction method and system containing missing data

Country Status (1)

Country Link
CN (1) CN115700628A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115964621A (en) * 2023-03-17 2023-04-14 中国科学技术大学 Regional road network exhaust emission data completion method

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115964621A (en) * 2023-03-17 2023-04-14 中国科学技术大学 Regional road network exhaust emission data completion method
CN115964621B (en) * 2023-03-17 2023-05-09 中国科学技术大学 Regional road network tail gas emission data complement method

Similar Documents

Publication Publication Date Title
Wang et al. Modeling inter-station relationships with attentive temporal graph convolutional network for air quality prediction
CN114303174A (en) Performance testing of robotic systems
CN112765896A (en) LSTM-based water treatment time sequence data anomaly detection method
CN114944053B (en) Traffic flow prediction method based on space-time hypergraph neural network
Qin et al. Simulating and Predicting of Hydrological Time Series Based on TensorFlow Deep Learning.
CN111612243A (en) Traffic speed prediction method, system and storage medium
CN113591380B (en) Traffic flow prediction method, medium and equipment based on graph Gaussian process
CN112419710A (en) Traffic congestion data prediction method, traffic congestion data prediction device, computer equipment and storage medium
CN111047078B (en) Traffic characteristic prediction method, system and storage medium
CN112270355A (en) Active safety prediction method based on big data technology and SAE-GRU
CN116187555A (en) Traffic flow prediction model construction method and prediction method based on self-adaptive dynamic diagram
Medel Anomaly detection using predictive convolutional long short-term memory units
CN115984586A (en) Multi-target tracking method and device under aerial view angle
James Citywide estimation of travel time distributions with Bayesian deep graph learning
CN115700628A (en) Traffic flow prediction method and system containing missing data
Liang et al. Mixed-order relation-aware recurrent neural networks for spatio-temporal forecasting
Wang et al. Reconstruction of missing trajectory data: a deep learning approach
Feng et al. Multi-step ahead traffic speed prediction based on gated temporal graph convolution network
Yan et al. A quantum-inspired online spiking neural network for time-series predictions
Li et al. Autonomous vehicle trajectory combined prediction model based on CC-LSTM
Yalçın Weather parameters forecasting with time series using deep hybrid neural networks
CN114399901B (en) Method and equipment for controlling traffic system
Vlontzos et al. Causal future prediction in a minkowski space-time
CN115830865A (en) Vehicle flow prediction method and device based on adaptive hypergraph convolution neural network
CN115063972A (en) Traffic speed prediction method and system based on graph convolution and gate control cyclic unit

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination