CN112668797A

CN112668797A - Long-term and short-term traffic prediction method

Info

Publication number: CN112668797A
Application number: CN202011641479.9A
Authority: CN
Inventors: 刘玉葆; 黄楚茵
Original assignee: Sun Yat Sen University
Current assignee: Sun Yat Sen University
Priority date: 2020-12-31
Filing date: 2020-12-31
Publication date: 2021-04-16
Anticipated expiration: 2040-12-31
Also published as: CN112668797B

Abstract

The application discloses a long-term and short-term traffic prediction method. The method comprises: obtaining first historical traffic data of nodes in a constructed traffic network graph; performing convolution processing on the first historical traffic data through a first convolution layer in a preset traffic prediction model; performing traffic prediction on the convolved first historical traffic data through a first iterative RNN operator in the model and outputting a traffic prediction result for the first time step; inputting the traffic prediction result for the first time step into the next iterative RNN operator for traffic prediction, and so on, until the T_p-th iterative RNN operator outputs the traffic prediction result for the T_p-th time step; splicing the traffic prediction results of the T_p time steps through a splicing module in the model; and inputting the spliced results into a second convolution layer, which performs convolution processing and outputs a final traffic prediction result. The method solves the technical problems that existing traffic prediction methods accumulate large prediction errors and cannot achieve both long-term and short-term prediction accuracy.

Description

Long-term and short-term traffic prediction method

Technical Field

The application relates to the technical field of traffic prediction, in particular to a long-term and short-term traffic prediction method.

Background

Traffic prediction is a classic spatio-temporal prediction problem with wide practical application, for example in smart-city road network planning, intelligent travel route planning, and urban public transport systems. Because traffic data are highly non-linear and complex, the prior art mainly addresses traffic prediction through deep learning. Existing traffic prediction methods, however, accumulate large prediction errors and cannot achieve both long-term and short-term prediction accuracy.

Disclosure of Invention

The application provides a long-term and short-term traffic prediction method, which solves the technical problems that existing traffic prediction methods accumulate large prediction errors and cannot achieve both long-term and short-term prediction accuracy.

In view of the above, a first aspect of the present application provides a long-term and short-term traffic prediction method, including:

constructing a traffic network into a graph structure to obtain a traffic network graph;

acquiring first historical traffic data of nodes in the traffic network graph;

inputting the first historical traffic data into a preset traffic prediction model comprising a first convolution layer, T_p iterative RNN operators, a splicing module and a second convolution layer, so that the first convolution layer performs convolution processing on the first historical traffic data; the first iterative RNN operator performs traffic prediction on the convolved first historical traffic data and outputs a traffic prediction result for the first time step; the traffic prediction result for the first time step is input to the next iterative RNN operator for traffic prediction, and so on, until the T_p-th iterative RNN operator outputs the traffic prediction result for the T_p-th time step; the splicing module splices the traffic prediction results of the T_p time steps; and the second convolution layer performs convolution processing on the spliced traffic prediction results and outputs a final traffic prediction result.

Optionally, the iterative RNN operator includes a gated linear unit, a diffusion convolution layer, and a fully connected layer;

the first iterative RNN operator performing traffic prediction on the convolved first historical traffic data and outputting a traffic prediction result for the first time step includes:

the gated linear unit extracts the temporal dependency of the convolved first historical traffic data and outputs a first feature;

the diffusion convolution layer extracts the spatial dependency of the first feature and outputs a second feature;

and the fully connected layer performs traffic prediction on the second feature and outputs the traffic prediction result for the first time step.

Optionally, the gated linear unit extracting the temporal dependency of the convolved first historical traffic data and outputting a first feature includes:

the gated linear unit performs one-dimensional convolution processing on the convolved first historical traffic data to obtain a first convolution feature and a second convolution feature;

the gated linear unit activates the second convolution feature through an activation function, and calculates the Hadamard product of the first convolution feature and the activated second convolution feature;

and the gated linear unit forms a residual connection between the convolved first historical traffic data and the Hadamard product, and outputs the first feature.

Optionally, the diffusion convolution layer extracting the spatial dependency of the first feature and outputting a second feature includes:

the diffusion convolution layer performs diffusion convolution feature extraction with an adaptive matrix on the first feature and outputs the second feature.

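Optionally, the second feature is:

$$Z = \sum_{k=0}^{K}\left(P_f^{k} X W_{k1} + P_b^{k} X W_{k2} + \tilde{A}_{adp}^{k} X W_{k3}\right)$$

where Z is the second feature; P_f = A/rowsum(A) is the forward transition matrix and A is the weight adjacency matrix; P_b = A^T/rowsum(A^T) is the backward transition matrix; X is the first feature; W denotes the parameter matrices of the diffusion convolution layer; and K is a constant;

$$\tilde{A}_{adp} = \mathrm{SoftMax}\left(\mathrm{ReLU}\left(E_1 E_2^{T}\right)\right)$$

is the adaptive matrix, where E₁ and E₂ are the source node embedding and the target node embedding, respectively.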

Optionally, the method further includes:

calculating the weight adjacency matrix based on the traffic network graph, wherein the calculation formula of the weights in the weight adjacency matrix is as follows:

where a_ij is the weight of the edge between neighboring nodes i and j in the traffic network graph, d_ij is the distance between neighboring nodes i and j, δ is a threshold parameter, and ε is a hyperparameter.

Optionally, before the first historical traffic data is input into the preset traffic prediction model comprising the first convolution layer, the T_p iterative RNN operators, the splicing module and the second convolution layer, the method further includes:

and preprocessing the first historical traffic data.

Optionally, the configuration process of the preset traffic prediction model is as follows:

acquiring second historical traffic data of nodes in the traffic network graph;

and training a traffic prediction network through the second historical traffic data to obtain the preset traffic prediction model.

Optionally, the loss function of the traffic prediction network is:

where the predicted traffic data for the T_p time steps are compared with the real traffic data for the future T_p time steps, and W_θ denotes the training parameters of the traffic prediction network.

As can be seen from the above technical solutions, the present application has the following advantages:

the application provides a long-term and short-term traffic prediction method, which comprises the following steps: constructing a traffic network into a graph structure to obtain a traffic network graph; acquiring first historical traffic data of nodes in the traffic network graph; inputting the first historical traffic data into a preset traffic prediction model comprising a first convolution layer, T_p iterative RNN operators, a splicing module and a second convolution layer, so that the first convolution layer performs convolution processing on the first historical traffic data; the first iterative RNN operator performs traffic prediction on the convolved first historical traffic data and outputs a traffic prediction result for the first time step; the traffic prediction result for the first time step is input to the next iterative RNN operator for traffic prediction, and so on, until the T_p-th iterative RNN operator outputs the traffic prediction result for the T_p-th time step; the splicing module splices the traffic prediction results of the T_p time steps; and the second convolution layer performs convolution processing on the spliced traffic prediction results and outputs a final traffic prediction result.

In the traffic prediction method of the present application, the acquired first historical traffic data is input into a preset traffic prediction model. Each iterative RNN operator predicts one time step; each traffic prediction result is input into the next iterative RNN operator for the prediction of the next time step and, at the same time, forms part of the final output. Because each iterative RNN operator predicts only one time step, effective long-term and short-term spatio-temporal traffic information can be extracted, error accumulation is reduced, and short-term prediction accuracy is improved. The splicing module splices the traffic prediction results of all time steps, and the spliced results are then convolved to output the final traffic prediction result, which covers both the long term and the short term. Both long-term and short-term prediction accuracy are thereby achieved, which solves the technical problems that existing traffic prediction methods accumulate large prediction errors and cannot achieve both long-term and short-term prediction accuracy.

Drawings

In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without inventive effort.

Fig. 1 is a schematic flow chart of a long-term and short-term traffic prediction method according to an embodiment of the present application;

Fig. 2 is a schematic diagram of a traffic network sensor distribution according to an embodiment of the present application;

Fig. 3 is a schematic diagram of an RNN structure according to an embodiment of the present application;

Fig. 4 is a schematic structural diagram of a gated linear unit according to an embodiment of the present application;

Fig. 5 is a schematic structural diagram of a preset traffic prediction model according to an embodiment of the present application;

Fig. 6 is a schematic flow chart of traffic prediction performed by the first iterative RNN operator according to an embodiment of the present application.

Detailed Description

In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

For ease of understanding, referring to Fig. 1, an embodiment of the long-term and short-term traffic prediction method provided by the present application includes:

Step 101, constructing a traffic network into a graph structure to obtain a traffic network graph.

In the embodiment of the present application, the traffic network may be constructed as a directed graph structure to obtain a traffic network graph G = (V, E, A), where V is the node set, that is, the set of sensors on the traffic network (see Fig. 2); each sensor records historical traffic data such as speed and traffic flow. E is the set of edges between nodes, and A ∈ R^{N×N} is the weight adjacency matrix of the traffic network graph G, where N is the number of nodes.

Further, the calculation formula of the weights in the weight adjacency matrix is:

where a_ij is the weight of the edge between neighboring nodes i and j in the traffic network graph; d_ij is the distance between neighboring nodes i and j; δ is a threshold parameter, which can be set to 0.1; and ε is a hyperparameter whose value can be set according to the actual situation. The weight adjacency matrix A is formed by the weights of the edges between all pairs of neighboring nodes i and j in the traffic network graph.
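
As an illustration of how such a weight adjacency matrix could be built, the sketch below uses a thresholded Gaussian kernel, a form commonly used for sensor graphs. This exact form is an assumption, as are the function name `weight_adjacency` and the roles given to `eps` (kernel width) and `delta` (cut-off on the weight); only the symbols d_ij, δ = 0.1 and ε come from the text.

```python
import numpy as np

def weight_adjacency(dist, eps, delta=0.1):
    """Thresholded Gaussian kernel (assumed form, not quoted from the text).

    dist  : (N, N) array of distances d_ij (np.inf where there is no edge)
    eps   : kernel width, standing in for the hyperparameter epsilon
    delta : weights smaller than this threshold are set to zero
    """
    dist = np.asarray(dist, dtype=float)
    w = np.exp(-(dist ** 2) / (eps ** 2))   # closer nodes get larger weights
    w[w < delta] = 0.0                      # drop weak connections
    return w

# Toy distances between three sensors.
dist = np.array([[0.0, 1.0, 4.0],
                 [1.0, 0.0, 2.0],
                 [4.0, 2.0, 0.0]])
A = weight_adjacency(dist, eps=2.0, delta=0.1)
```

With these toy values the pair at distance 4 falls below the threshold and receives weight zero, while the nearer pairs keep a non-zero weight.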

Step 102, acquiring first historical traffic data of nodes in the traffic network graph.

After the traffic network graph G is obtained, the first historical traffic data may be acquired from its nodes.

Let x_t^i denote the traffic data of the i-th node at the t-th time step, and x_t ∈ R^N the traffic data of all nodes at the t-th time step. In the embodiment of the present application, the traffic data of the future T_p time steps are predicted from τ historical traffic data. The first historical traffic data acquired in the embodiment of the present application may therefore be expressed as:

X = (x₁, x₂, …, x_τ) ∈ R^{N×τ};

the traffic data to be predicted for the future T_p time steps correspondingly form a matrix in R^{N×T_p}.

Step 103, inputting the first historical traffic data into a preset traffic prediction model comprising a first convolution layer, T_p iterative RNN operators, a splicing module and a second convolution layer, so that the first convolution layer performs convolution processing on the first historical traffic data; the first iterative RNN operator performs traffic prediction on the convolved first historical traffic data and outputs a traffic prediction result for the first time step; the traffic prediction result for the first time step is input to the next iterative RNN operator for traffic prediction, and so on, until the T_p-th iterative RNN operator outputs the traffic prediction result for the T_p-th time step; the splicing module splices the traffic prediction results of the T_p time steps; and the second convolution layer performs convolution processing on the spliced traffic prediction results and outputs a final traffic prediction result.

An RNN is a class of neural network used to process sequence data. Unlike basic neural networks, which establish connections only between layers, an RNN also establishes connections between the neurons within a layer, because later data in a sequence are closely related to earlier data, as shown in Fig. 3.

Based on the RNN idea, the embodiment of the present application treats one iterative prediction process as one iterative RNN operator. The prediction result it outputs forms part of the final output and, at the same time, this single-step prediction result is input into the next iterative RNN operator; T_p iterative RNN operators are thus used to predict the traffic data of the future T_p time steps. The iterative RNN operator in the embodiment of the present application differs from the conventional RNN structure: in a conventional RNN every RNN operator shares the same parameters, whereas the embodiment of the present application does not share parameters. Each iterative RNN operator comprises a temporal module and a spatial module (namely a gated linear unit and a diffusion convolution layer), which extract the temporal dependency and the spatial dependency respectively to complete a single-step prediction, and the parameters of different iterative RNN operators are different.

The preset traffic prediction model in the embodiment of the present application comprises a first convolution layer, T_p iterative RNN operators, a splicing module and a second convolution layer (see Fig. 5). The acquired first historical traffic data is input into the preset traffic prediction model to predict the traffic data of the future T_p time steps. Specifically, the first convolution layer performs convolution processing on the first historical traffic data; the first iterative RNN operator performs traffic prediction on the convolved first historical traffic data and outputs the traffic prediction result for the first time step; the traffic prediction result for the first time step is input into the next iterative RNN operator for traffic prediction, and so on, until the T_p-th iterative RNN operator outputs the traffic prediction result for the T_p-th time step; the splicing module splices the traffic prediction results of the T_p time steps; and the second convolution layer performs convolution processing on the spliced traffic prediction results and outputs the final traffic prediction result.
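
The data flow through the model can be sketched as follows. This is a toy illustration under stated assumptions, not the model itself: each operator is replaced by a hypothetical linear map with its own weights, the two convolution layers are replaced by a scalar scaling and a matrix product, and the way a prediction is handed to the next operator (sliding the input window by one step) is assumed, since the text only states that the prediction is passed on.

```python
import numpy as np

rng = np.random.default_rng(0)
N, tau, T_p = 4, 12, 3          # toy sizes: nodes, history length, horizon

def make_operator():
    """Hypothetical stand-in for one iterative operator with its own weights."""
    w = rng.normal(scale=0.1, size=(tau, 1))
    return lambda window: window @ w          # (N, tau) -> (N, 1)

def forecast(X, operators, w_first, w_second):
    h = X * w_first                           # stand-in for the first convolution layer
    steps = []
    for op in operators:                      # one operator per future time step
        y = op(h)                             # single-step prediction, shape (N, 1)
        steps.append(y)                       # kept as part of the final output
        h = np.concatenate([h[:, 1:], y], axis=1)   # assumed hand-over: slide the window
    spliced = np.concatenate(steps, axis=1)   # splicing module, shape (N, T_p)
    return spliced @ w_second                 # stand-in for the second convolution layer

operators = [make_operator() for _ in range(T_p)]   # parameters are not shared
X = rng.normal(size=(N, tau))
Y_hat = forecast(X, operators, w_first=1.0, w_second=np.eye(T_p))
```

The point of the sketch is the structure: T_p operators with separate parameters run in sequence, every single-step result is both passed on and retained, and the retained results are spliced into an N × T_p array before the final layer.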

Further, before the first historical traffic data is input into the preset traffic prediction model comprising the first convolution layer, the T_p iterative RNN operators, the splicing module and the second convolution layer, the method further comprises preprocessing the first historical traffic data. Specifically, missing values in the first historical traffic data may be filled by linear interpolation, and outliers may be removed by the Z-Score method.
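
A minimal sketch of this preprocessing for a single sensor series is given below. The function names, the cut-off `z_max = 3.0`, and the choice to replace removed outliers by interpolated values are assumptions made for illustration; the text only names linear interpolation and the Z-Score method.

```python
import numpy as np

def fill_missing(series):
    """Fill NaN entries of a 1-D series by linear interpolation."""
    series = np.asarray(series, dtype=float).copy()
    idx = np.arange(series.size)
    known = ~np.isnan(series)
    series[~known] = np.interp(idx[~known], idx[known], series[known])
    return series

def remove_outliers(series, z_max=3.0):
    """Replace points whose |Z-score| exceeds z_max by interpolated values."""
    series = np.asarray(series, dtype=float).copy()
    std = series.std()
    if std == 0:
        return series                      # constant series: nothing to remove
    z = (series - series.mean()) / std
    series[np.abs(z) > z_max] = np.nan     # mark outliers as missing
    return fill_missing(series)

# Toy speed readings with one missing value and one spike.
raw = np.array([60.0, np.nan, 62.0, 61.0, 500.0, 63.0,
                62.0, 61.0, 60.0, 62.0, 61.0, 63.0])
clean = remove_outliers(fill_missing(raw))
```

Here the missing reading is filled from its two neighbors, and the spike is flagged by its Z-score and then replaced in the same way.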

Further, the iterative RNN operator in the embodiment of the present application includes a gated linear unit, a diffusion convolution layer, and a fully connected layer. Referring to Fig. 6, the specific steps by which the first iterative RNN operator performs traffic prediction on the convolved first historical traffic data and outputs the traffic prediction result for the first time step include:

S1031, the gated linear unit extracts the temporal dependency of the convolved first historical traffic data and outputs a first feature.

In the embodiment of the present application, the gated linear unit performs one-dimensional convolution processing on the convolved first historical traffic data to obtain a first convolution feature and a second convolution feature; the gated linear unit activates the second convolution feature through an activation function and calculates the Hadamard product of the first convolution feature and the activated second convolution feature; and the gated linear unit forms a residual connection between the convolved first historical traffic data and the Hadamard product, and outputs the first feature.

The gated linear unit in the embodiment of the present application captures the dynamic information of the first historical traffic data in the time dimension through a convolution structure (see Fig. 4). Specifically, the one-dimensional convolution processing is performed by a one-dimensional causal convolution layer whose convolution kernel has width K_t; under the action of the convolution kernel, the time-sequence length of the input data is shortened by K_t − 1 after convolution. Let C_i be the feature dimension of the convolved first historical traffic data that is input to the gated linear unit. The convolution kernel of the causal convolution layer maps the C_i input features to 2C_i output features over a window of K_t time steps, so that after the gated linear unit performs one-dimensional convolution processing on this input, the convolution result has 2C_i features.

The convolution result is divided into a front half and a rear half to obtain the first convolution feature P and the second convolution feature Q, each of which has the same number of features as the input before convolution.

In the embodiment of the present application, a sigmoid activation function σ is preferably used to activate the second convolution feature Q, which gates the first convolution feature P; that is, the Hadamard product P ⊙ σ(Q) of the first convolution feature and the activated second convolution feature is calculated, where ⊙ denotes the Hadamard product. The gated linear unit then forms a residual connection between the convolved first historical traffic data and the calculated Hadamard product, and outputs the first feature.
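
The gating described above can be sketched for a single node as follows. The array layout (time by feature), the function name, the bias term, and the cropping of the input to its last M − K_t + 1 steps so that the residual connection has matching shape are all assumptions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_linear_unit(x, kernel, bias):
    """x: (M, C) time-by-feature input; kernel: (K_t, C, 2C); bias: (2C,)."""
    M, C = x.shape
    K_t = kernel.shape[0]
    out_len = M - K_t + 1                      # time length shrinks by K_t - 1
    conv = np.empty((out_len, 2 * C))
    for t in range(out_len):
        window = x[t:t + K_t]                  # current and K_t - 1 earlier steps
        conv[t] = np.tensordot(window, kernel, axes=([0, 1], [0, 1])) + bias
    P, Q = conv[:, :C], conv[:, C:]            # front half and rear half
    residual = x[K_t - 1:]                     # assumed crop to align with the output
    return P * sigmoid(Q) + residual           # Hadamard product plus residual

rng = np.random.default_rng(0)
x = rng.normal(size=(12, 4))                   # M = 12 time steps, C = 4 features
kernel = rng.normal(scale=0.1, size=(3, 4, 8)) # K_t = 3
out = gated_linear_unit(x, kernel, np.zeros(8))
```

Each output step depends only on the aligned input step and the K_t − 1 steps before it, which is what makes the convolution causal.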

S1032, the diffusion convolution layer extracts the spatial dependency of the first feature and outputs a second feature.

The diffusion convolution layer in the embodiment of the present application is a diffusion convolution layer with an adaptive matrix; it performs diffusion convolution feature extraction with the adaptive matrix on the first feature and outputs the second feature.

Given the node structure information, the diffusion process of a signal can be simulated with K finite steps to extract node features. The diffusion convolution feature obtained by diffusion convolution processing is:

$$Z = \sum_{k=0}^{K} P^{k} X W_{k}$$

where P^k is the k-th power of the transition matrix, and the transition matrix is calculated from the weight adjacency matrix A. In the embodiment of the present application, since the traffic network structure is a directed graph, the diffusion process has two directions, with a forward transition matrix P_f = A/rowsum(A) and a backward transition matrix P_b = A^T/rowsum(A^T). The diffusion convolution feature can therefore be expressed as:

$$Z = \sum_{k=0}^{K}\left(P_f^{k} X W_{k1} + P_b^{k} X W_{k2}\right)$$

where X is the input first feature and W denotes the parameter matrices of the diffusion convolution layer.

Given the node structure information, hidden spatial information can be acquired through an adaptive matrix, which is:

$$\tilde{A}_{adp} = \mathrm{SoftMax}\left(\mathrm{ReLU}\left(E_1 E_2^{T}\right)\right)$$

In this formula, the adaptive matrix Ã_adp includes two randomly initialized learnable parameters E₁, E₂ ∈ R^{N×c}. Specifically, E₁ is the source node embedding and E₂ is the target node embedding. Multiplying E₁ and E₂ gives the spatial dependency weights between source nodes and target nodes; the ReLU activation function eliminates weak connections among these weights; and the SoftMax function normalizes the weights processed by ReLU. The adaptive matrix serves as the transition matrix of a hidden diffusion process.

In the embodiment of the present application, explicit graph structure associations are captured by diffusion convolution and implicit graph structure associations are captured by the adaptive matrix; combining the two captures the spatial dependency, namely the graph structure information of the given nodes. The finally output second feature is:

$$Z = \sum_{k=0}^{K}\left(P_f^{k} X W_{k1} + P_b^{k} X W_{k2} + \tilde{A}_{adp}^{k} X W_{k3}\right)$$
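
A minimal sketch of a diffusion convolution with an adaptive matrix is given below. It assumes that the second feature is a sum, over diffusion steps k = 0..K, of a forward term, a backward term and an adaptive term, each with its own parameter matrix; the weight layout `W[k, m]` and all function names are hypothetical.

```python
import numpy as np

def row_normalize(A):
    """Divide each row by its sum, as in A / rowsum(A)."""
    s = A.sum(axis=1, keepdims=True)
    s[s == 0] = 1.0                          # leave all-zero rows unchanged
    return A / s

def softmax_rows(M):
    e = np.exp(M - M.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def diffusion_conv(X, A, E1, E2, W):
    """X: (N, C_in); A: (N, N); E1, E2: (N, c); W: (K + 1, 3, C_in, C_out)."""
    P_f = row_normalize(A)                               # forward transition matrix
    P_b = row_normalize(A.T)                             # backward transition matrix
    A_adp = softmax_rows(np.maximum(E1 @ E2.T, 0.0))     # SoftMax(ReLU(E1 E2^T))
    Z = np.zeros((X.shape[0], W.shape[-1]))
    for m, P in enumerate((P_f, P_b, A_adp)):
        Pk_X = X.copy()                                  # P^0 X
        for k in range(W.shape[0]):
            Z += Pk_X @ W[k, m]                          # add the k-th diffusion step
            Pk_X = P @ Pk_X                              # advance to P^(k+1) X
    return Z

rng = np.random.default_rng(0)
N, C, K, c = 5, 2, 2, 3
A = rng.random((N, N))
X = rng.normal(size=(N, C))
E1, E2 = rng.normal(size=(N, c)), rng.normal(size=(N, c))
W = rng.normal(size=(K + 1, 3, C, C))
Z = diffusion_conv(X, A, E1, E2, W)
```

Computing P^k X by repeated multiplication avoids forming the matrix powers explicitly, which keeps the cost at K matrix products per transition matrix.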

S1033, the fully connected layer performs traffic prediction on the second feature and outputs the traffic prediction result for the first time step.

The fully connected layer performs traffic prediction on the input second feature and outputs the traffic prediction result y₁ for the first time step.

The processing procedure of the other iterative RNN operators is similar to that of the first iterative RNN operator; the operators differ only in their input data and output data.

Further, the configuration process of the preset traffic prediction model in the embodiment of the present application is as follows: acquiring second historical traffic data of nodes in the traffic network graph; and training the traffic prediction network through the second historical traffic data to obtain a preset traffic prediction model.

For the second historical traffic data, the time interval may be set to 5 minutes, which gives 12 historical traffic data points per hour and 288 per day. The second historical traffic data may be preprocessed; specifically, missing values in the second historical traffic data are filled by linear interpolation, and outliers are removed by the Z-Score method.

The second historical traffic data dates from further back than the first historical traffic data. For example, if the sensors in a traffic network have recorded 62 days of historical traffic data, the data of the first 50 days can be used as the training set to obtain the second historical traffic data; the data from day 51 to day 56 can be used as the validation set to obtain third historical traffic data; and the data of the last 6 days can be used as the test set to obtain the first historical traffic data. Those skilled in the art can of course divide the data flexibly according to the actual situation, and the division is not specifically limited herein.
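
The example split can be sketched as a chronological slice of the recorded series. The function name and the use of a plain index array in place of real sensor readings are assumptions for illustration.

```python
import numpy as np

STEPS_PER_DAY = 24 * 60 // 5        # 288 samples per day at a 5-minute interval

def split_by_days(data, train_days=50, val_days=6, test_days=6):
    """Chronological split along the first (time) axis."""
    n_train = train_days * STEPS_PER_DAY
    n_val = val_days * STEPS_PER_DAY
    n_test = test_days * STEPS_PER_DAY
    train = data[:n_train]                                   # second historical data
    val = data[n_train:n_train + n_val]                      # third historical data
    test = data[n_train + n_val:n_train + n_val + n_test]    # first historical data
    return train, val, test

data = np.arange(62 * STEPS_PER_DAY)     # stand-in for 62 days of readings
train, val, test = split_by_days(data)
```

Keeping the split chronological, rather than shuffling, prevents later observations from leaking into the training set.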

The traffic prediction network is trained with the second historical traffic data, and training stops when a preset number of iterations is reached, which yields the preset traffic prediction model. During training, the third historical traffic data is used for model validation.

During training, the loss value can be calculated through the loss function, and the parameters of each layer of the traffic prediction network are updated through back propagation of the loss value. The loss function of the traffic prediction network compares the predicted traffic data for the T_p time steps with the real traffic data for the same T_p future time steps, and is minimized over the training parameters W_θ of the traffic prediction network.
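
As one possible concrete loss, the sketch below uses the mean absolute error over all nodes and all T_p predicted time steps. The choice of mean absolute error is an assumption made for illustration, since it is a common loss in traffic forecasting; the text does not fix the formula, and the function name is hypothetical.

```python
import numpy as np

def mae_loss(y_pred, y_true):
    """Mean absolute error over all nodes and all T_p predicted time steps."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    return float(np.mean(np.abs(y_pred - y_true)))

# Toy example: 2 nodes, T_p = 3 time steps.
y_true = np.array([[60.0, 61.0, 62.0],
                   [30.0, 31.0, 32.0]])
y_pred = y_true + np.array([[1.0, -2.0, 3.0],
                            [0.0, 0.0, 0.0]])
loss = mae_loss(y_pred, y_true)
```

Because every one of the T_p steps contributes to the loss, each single-step prediction is corrected against its real value during training.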

The long-term and short-term traffic prediction method in the embodiment of the present application combines the advantages of iterative prediction and one-shot prediction, which ensures the accuracy of long-term and short-term traffic prediction results. In addition, because the error of each step's prediction result is adjusted against the real value during training, the preset traffic prediction model can achieve both long-term and short-term prediction accuracy while reducing the accumulation of iterative prediction errors.

In the traffic prediction method of the embodiment of the present application, the acquired first historical traffic data is input into a preset traffic prediction model. Each iterative RNN operator predicts one time step; each traffic prediction result is input into the next iterative RNN operator for the prediction of the next time step and, at the same time, forms part of the final output. Because each iterative RNN operator predicts only one time step, effective long-term and short-term spatio-temporal traffic information can be extracted, error accumulation is reduced, and short-term prediction accuracy is improved. The splicing module splices the traffic prediction results of all time steps, and the spliced results are then convolved to output the final traffic prediction result, which covers both the long term and the short term. Both long-term and short-term prediction accuracy are thereby achieved, which solves the technical problems that existing traffic prediction methods accumulate large prediction errors and cannot achieve both long-term and short-term prediction accuracy.

In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims

1. A long-term and short-term traffic prediction method, comprising:

acquiring first historical traffic data of nodes in the traffic network graph;

2. The long-term and short-term traffic prediction method of claim 1, characterized in that the iterative RNN operator comprises gated linear units, diffusion convolutional layers and fully-connected layers;

3. The long-short term traffic prediction method according to claim 2, wherein the gated linear unit extracts the time dependency relationship of the first historical traffic data after convolution processing, and outputs a first feature, including:

4. The long-term and short-term traffic prediction method of claim 3, wherein the diffusion convolutional layer extracts the spatial dependency of the first feature and outputs a second feature, comprising:

5. The long and short term traffic prediction method according to claim 4, characterized in that the second feature is:

is the adaptive matrix, and E₁ and E₂ are the source node embedding and the target node embedding, respectively.

6. The long and short term traffic prediction method according to claim 5, characterized in that the method further comprises:

7. The long-term and short-term traffic prediction method according to any of claim 1, characterized in that, before the inputting of the first historical traffic data into the preset traffic prediction model comprising the first convolution layer, the T_p iterative RNN operators, the splicing module and the second convolution layer, the method further comprises:

and preprocessing the first historical traffic data.

8. The long and short term traffic prediction method according to any of claims 1-7, characterized in that the preset traffic prediction model is configured by the following process:

acquiring second historical traffic data of nodes in the traffic network graph;

9. The long and short term traffic prediction method according to claim 8, characterized in that the loss function of the traffic prediction network is:

wherein ,

is the predicted traffic data for the T_p time steps,