CN116486611A

CN116486611A - Urban road vehicle speed prediction method

Info

Publication number: CN116486611A
Application number: CN202310415345.2A
Authority: CN
Inventors: 樊子德; 黄飞龙; 孙熠; 李肖赫; 邓雅文
Original assignee: Aerospace Information Research Institute of CAS
Current assignee: Aerospace Information Research Institute of CAS
Priority date: 2023-04-18
Filing date: 2023-04-18
Publication date: 2023-07-25

Abstract

The invention relates to a method for predicting vehicle speed on urban roads, belongs to the technical field of vehicle speed prediction, and solves the prior-art problem of low prediction accuracy caused by neglecting the spatiotemporal correlation of vehicle speed. The method comprises the following steps: collecting the vehicle speed of each node in the urban road network, constructing a sample set, and dividing a training set from the sample set; constructing a dynamic adjacency matrix according to the network topology structure of each node in the training set and the daily vehicle speed probability; dividing the training set into multiple time periods according to the time sequence, constructing a network model for each time period, and fusing the outputs of the network models to obtain a prediction model; training the prediction model based on the training set and the dynamic adjacency matrix to obtain a trained prediction model; and collecting the vehicle speed of the node to be predicted, selecting the samples to be tested in each period according to the period to be predicted and the preset number of samples in each period, and passing them into the trained prediction model to obtain the vehicle speed in the period to be predicted. Accurate prediction of the vehicle speed is thereby achieved.

Description

Urban road vehicle speed prediction method

Technical Field

The invention relates to the technical field of vehicle speed prediction, in particular to a method for predicting the speed of an urban road vehicle.

Background

Traffic congestion greatly reduces the quality of life of urban residents, lengthens commutes, and causes irritability; meanwhile, the fuel burned continuously during congestion not only wastes a large amount of energy but also produces a large amount of exhaust that pollutes the environment. To address this problem, many cities have begun to explore and study Intelligent Transportation Systems (ITS). Vehicle speed prediction is a fundamental topic in ITS research, and many traffic problems ultimately depend on accurate estimates of traffic speed; however, vehicle speed prediction remains extremely challenging because of its highly complex spatial and temporal correlations.

In recent years, with the rapid rise of deep learning and artificial intelligence, deep neural network models have attracted attention because they capture the spatiotemporal dynamics of traffic data well. Recurrent neural networks (RNNs) and long short-term memory (LSTM) networks have been successfully used for traffic prediction. To better account for spatial features, researchers have also introduced CNNs into their models, such as ST-ResNet. However, since CNNs were originally designed for Euclidean-structured data, these methods first convert the traffic network into a regular grid, and this conversion loses much of the topology information of the traffic network itself. To address this limitation of CNNs, researchers have introduced convolution in the spectral domain and applied graph neural networks to traffic flow prediction, for example in the models DCRNN, TGCN and ASTGCN.

An analysis of current research at home and abroad shows that speed prediction for road traffic flow requires modeling historical traffic data in both space and time, and many researchers process spatiotemporal data with various methods, such as the combined use of GCN and RNN. However, these methods still handle the dynamic spatiotemporal correlation of traffic data inadequately, and their prediction accuracy is not high. In summary: first, on the temporal scale, some existing studies deliberately ignore data from holidays and weekends, and the lack of data diversity limits the generalization ability of the model; second, on the spatial scale, some studies use only the adjacency matrix to model the road network as a static graph and ignore the dynamic spatial dependencies between roads.

Disclosure of Invention

In view of the above analysis, the embodiment of the invention aims to provide a method for predicting the speed of an urban road vehicle, which is used for solving the problem of low prediction accuracy caused by neglecting the space-time correlation of the speed of the vehicle.

The embodiment of the invention provides a method for predicting the speed of an urban road vehicle, which comprises the following steps:

collecting the vehicle speed of each node in the urban road network, constructing a sample set, and dividing a training set from the sample set; constructing a dynamic adjacency matrix according to the network topology structure of each node in the training set and the daily vehicle speed probability;

dividing the training set into multiple time periods according to the time sequence, constructing multiple network models according to each time period, and fusing the outputs of the multiple network models to obtain a prediction model; training a prediction model based on the training set and the dynamic adjacency matrix to obtain a trained prediction model;

and collecting the vehicle speed of the node to be predicted, selecting the samples to be tested in each period according to the period to be predicted and the preset number of samples in each period, and passing them into the trained prediction model to obtain the vehicle speed in the period to be predicted.

Based on further improvement of the method, the dynamic adjacency matrix is constructed according to the network topology structure of each node in the training set and the daily vehicle speed probability, and the method comprises the following steps:

constructing a static adjacency matrix according to the network topology structure of each node in the training set;

according to the daily vehicle speed ratio of each node in the training set, the daily vehicle speed probability of each node is obtained, and the vehicle speed probability distribution of each node is formed; the Wasserstein distance is adopted, the cosine distance is used as a cost function of probability distribution transfer, the probability distribution distance of the vehicle speed of any two nodes is calculated, and a distance matrix is constructed;

and multiplying and superposing element values in the distance matrix and the static adjacency matrix with preset distance matrix weights and static adjacency matrix weights respectively, and then carrying out binarization processing on the superposed values according to a threshold value to obtain the dynamic adjacency matrix.

Based on a further improvement of the method, dividing the training set into multiple time periods according to the time sequence comprises: according to the prediction period, dividing off training samples of the adjacent period, the daily period, the weekly period and the holiday period according to the preset number of samples for each period; wherein the adjacent period is a period within [1.5, 2.5] hours before the prediction period; the daily period is the same time of day as the prediction period on each day before the prediction day; the weekly period is the same time of day as the prediction period on the same weekday of each week before the prediction day; and, when the prediction day is a holiday, the holiday period is the same time of day as the prediction period on the same date in historical years.

Based on a further improvement of the method, constructing a network model for each period and fusing the outputs of the network models to obtain a prediction model comprises: the network model corresponding to the adjacent period comprises a spatial graph convolution layer, a gated recurrent unit and a first fully connected layer connected in sequence; the network models corresponding to the daily, weekly and holiday periods are identical, each comprising a spatial graph convolution layer, a temporal attention layer and a temporal convolution layer connected in sequence; and the outputs of the four network models are fused by weighting through a second fully connected layer to obtain the prediction model.

Based on a further improvement of the method, the spatial graph convolution layer comprises a spatial attention layer and a spatial graph convolutional neural network connected in sequence. The spatial attention layer calculates a spatial attention score matrix according to the differences between the vehicle speed vectors of the nodes in the corresponding period; the spatial attention score matrix is dynamically smoothed according to the feature smoothness and then passed into the spatial graph convolutional neural network. The spatial graph convolutional neural network uses Chebyshev polynomials as the convolution kernel and performs a convolution operation on the training samples of the corresponding period based on the dynamic adjacency matrix and the smoothed spatial attention score matrix.

Based on a further improvement of the method, the temporal attention layer calculates a temporal attention score matrix according to the differences between the vehicle speed vectors of the time slices in the corresponding period, dynamically weights the output of the corresponding spatial graph convolution layer according to the temporal attention score matrix, and passes the result to the temporal convolution layer; the temporal convolution layer is a convolutional neural network which convolves and merges the outputs of the temporal attention layer and then maps them to the output using a LeakyReLU function.

Based on a further improvement of the above method, the element values of the spatial attention score matrix are calculated according to the following formula

where the two vectors in the formula are the vehicle speed vectors of node i and node j, respectively, over the τ time slices within the corresponding period; $V_s$, $b_s$, $W_{s1}$ and $W_{s2}$ denote weight parameters; σ denotes the sigmoid activation function; and N denotes the total number of nodes.

Based on a further improvement of the above method, the element values of the temporal attention score matrix are calculated according to the following formula

where the two vectors in the formula are the vehicle speed vectors of the N nodes at time slices $t_i$ and $t_j$, respectively, within the corresponding period; $V_q$, $b_q$, $W_{q1}$ and $W_{q2}$ denote weight parameters; σ denotes the sigmoid activation function; N denotes the total number of nodes; and τ denotes the number of time slices in the corresponding period.

Based on a further improvement of the above method, dynamically smoothing the spatial attention score matrix according to the feature smoothness, comprising:

according to the sum of the differences of the vehicle speed vectors of each node and the neighboring nodes, calculating the average value of the sum of the differences, and taking the average value as the characteristic smoothness of the corresponding time period;

obtaining the quantity to be reserved according to the characteristic smoothness and the preset multiple;

and, according to the number to be reserved, reserving the element values of the spatial attention score matrix in descending order of value and setting the remaining element values to zero, to obtain the smoothed spatial attention score matrix.

Based on the further improvement of the method, based on the training set and the dynamic adjacency matrix, training the prediction model to obtain a trained prediction model, comprising:

introducing a curriculum learning approach: in each training round, selecting the training samples of all adjacent periods and feeding them into the network model corresponding to the adjacent period; and selecting training samples according to the training schedulers corresponding to the daily period, the weekly period and the holiday period, respectively, and feeding them into the corresponding network models;

and outputting predicted vehicle speed by each network model according to the input training sample and the dynamic adjacency matrix, comparing the predicted vehicle speed with the corresponding actual vehicle speed, selecting a mean square error as a loss function of the prediction model, and obtaining the trained prediction model after the mean square error reaches a preset error value and training is finished.

Compared with the prior art, the invention has at least one of the following beneficial effects:

1. based on historical sequence data, the similarity between nodes is measured by the difference between probability distributions, so that the dynamic connectivity defined between road network nodes is closer to the actual connectivity;

2. the attention mechanism is improved based on the feature smoothness, so that useful information brought by neighbors is amplified by the nodes, and corresponding negative interference is restrained;

3. data from the four most relevant periods are selected, which increases data diversity; under the training scheduling strategy based on curriculum learning, data from periods that are difficult to train are added gradually, so that the model has different emphases at different stages of training, which improves the training speed and generalization performance of the model as well as the prediction accuracy.

In the invention, the technical schemes can be mutually combined to realize more preferable combination schemes. Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written description and drawings.

Drawings

The drawings are only for purposes of illustrating particular embodiments and are not to be construed as limiting the invention, like reference numerals being used to refer to like parts throughout the several views.

FIG. 1 is a flow chart of a method for predicting the speed of an urban road vehicle according to an embodiment of the invention;

FIG. 2 is a schematic diagram of a prediction model according to an embodiment of the present invention.

Detailed Description

Preferred embodiments of the present invention will now be described in detail with reference to the accompanying drawings, which form a part hereof, and together with the description serve to explain the principles of the invention, and are not intended to limit the scope of the invention.

In one embodiment of the present invention, a method for predicting the speed of an urban road vehicle is disclosed, as shown in fig. 1, comprising the steps of:

S11, collecting the vehicle speed of each node in the urban road network, constructing a sample set, and dividing a training set from the sample set; and constructing a dynamic adjacency matrix according to the network topology structure of each node in the training set and the probability distribution of the vehicle speed.

It should be noted that, in this embodiment, an undirected graph $G$ is used to abstract the topology of the urban road network. Each sensor on a road is regarded as a node; $V = \{v_1, v_2, \dots, v_N\}$ denotes the set of nodes, N in total; each pair of connected nodes forms an edge, and E denotes the set of edges; the dynamic adjacency matrix of the graph is defined over these N nodes.

The acquisition period is divided into time slices at a preset time interval, and the average vehicle speed of each node within each interval is taken as its vehicle speed at that time slice. The vehicle speed of node i at time slice t is taken as one sample. The vehicle speeds of n nodes at time slice t form a vehicle speed vector; the vehicle speeds of node i over τ time slices, in time-slice order, form the vehicle speed vector of node i; and the vehicle speeds of n nodes over τ time slices form a set which, when τ equals the total number of time slices and n = N, contains all vehicle speeds of all nodes in all time slices. To increase the convergence speed, the collected vehicle speeds are normalized to the interval [0, 1], and the sample set is constructed.

Illustratively, data are collected from 207 sensors on 1515 roads, i.e., the total number of nodes N is 207, from March 2012 to June 30, 2012 (119 days); each day is divided into 288 time slices at 5-minute intervals; values not acquired by a sensor are filled in by linear interpolation, so that each node finally has 119 × 288 = 34272 time slices of data, each time slice corresponding to the vehicle speed data of 207 nodes. The sample set is divided proportionally into a training set and a test set, and a validation set may also be divided off; this is conventional practice and is not limited by this embodiment. Illustratively, the training set, the test set and the validation set are partitioned at 7:2:1.
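The preprocessing described above (linear interpolation of missing readings, normalization to [0, 1], and a 7:2:1 split) can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the helper names are hypothetical, min-max scaling is assumed as the normalization, and a chronological split is assumed.

```python
SLICES_PER_DAY = 24 * 60 // 5          # 288 five-minute slices per day
TOTAL_SLICES = 119 * SLICES_PER_DAY    # 34272 slices per node


def fill_linear(series):
    """Fill None gaps by linear interpolation between the nearest known values."""
    filled = list(series)
    known = [k for k, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    for k, v in enumerate(series):
        if v is not None:
            continue
        left = max((p for p in known if p < k), default=None)
        right = min((p for p in known if p > k), default=None)
        if left is None:            # gap at the start: copy the first known value
            filled[k] = series[right]
        elif right is None:         # gap at the end: copy the last known value
            filled[k] = series[left]
        else:
            w = (k - left) / (right - left)
            filled[k] = series[left] + w * (series[right] - series[left])
    return filled


def min_max(series):
    """Scale values into [0, 1] (assumed form of the normalization)."""
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0
    return [(v - lo) / span for v in series]


def split_7_2_1(n):
    """Return (train, test, validation) sizes for n time slices."""
    n_train = n * 7 // 10
    n_test = n * 2 // 10
    return n_train, n_test, n - n_train - n_test
```

With 34272 time slices, `split_7_2_1` gives 23990 training, 6854 test and 3428 validation slices.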

Further, constructing a dynamic adjacency matrix according to the network topology structure and the vehicle speed probability distribution of each node in the training set, including:

(1) constructing a static adjacency matrix according to the network topology structure of each node in the training set;

(2) according to the daily vehicle speed ratio of each node in the training set, obtaining the vehicle speed probability distribution of each node; the Wasserstein distance is adopted, the cosine distance is used as a cost function of probability distribution transfer, the probability distribution distance of the vehicle speed of any two nodes is calculated, and a distance matrix is constructed;

(3) and multiplying and superposing element values in the distance matrix and the adjacency matrix with preset distance matrix weights and adjacency matrix weights respectively, and then carrying out binarization processing on the superposed values according to a threshold value to obtain the dynamic adjacency matrix.

Specifically, the static adjacency matrix $A$ describes the static connectivity between nodes; in this embodiment, when two nodes are adjacent, the corresponding element of the static adjacency matrix is 1, and otherwise it is 0.

The vehicle speeds of each node in the training set are aggregated by day, giving a vehicle speed vector for node i on day d, where $d \in [1, D]$ and D is the total number of days of training-set data; the probability of node i on day d is calculated by the following formula:

In formula (1), the daily vehicle speed vector of each node is converted into a probability based on the ratio of the vehicle speed of node i on that day to the vehicle speed of node i over all days of the training set. The daily probabilities of each node constitute its probability distribution $P_i$, as given in formula (2).
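The conversion from daily vehicle speed vectors to a per-node probability distribution can be sketched as follows. The text only states that the probability is the ratio of the day's vehicle speed to the vehicle speed over all days; summing each day's vector to a scalar before taking the ratio is an assumption made here, and the function name is hypothetical.

```python
def daily_probabilities(daily_vectors):
    """Turn one node's daily speed vectors into a probability per day.

    daily_vectors: list of D lists, one list of speeds per day.
    Assumption: each day is reduced to the sum of its speeds.
    """
    totals = [sum(day) for day in daily_vectors]
    grand_total = sum(totals)
    if grand_total == 0:
        raise ValueError("all speeds are zero; the ratio is undefined")
    return [t / grand_total for t in totals]
```

Because each value is a share of the same total, the daily probabilities of a node sum to 1.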

and then, taking the cosine distance as a cost function of probability distribution transfer between nodes, calculating the probability distribution distance of any two nodes by adopting the Wasserstein distance, and constructing a distance matrix for representing the spatial correlation between the nodes.

Specifically, the transfer cost between the vehicle speed vector of node i on day $d_1$ and the vehicle speed vector of node j on day $d_2$ is calculated by the following formula:

based on the cost function, the probability distribution distance of any two nodes is calculated through the following formula:

where inf denotes the infimum, i.e., the solution with the smallest cumulative moving distance among all ways of converting the probability distribution $P_i$ into the probability distribution $P_j$; γ is a joint probability distribution in $\Pi[P_i, P_j]$ whose marginal distributions are $P_i$ and $P_j$; and the two vectors in the cost term are the feature vector of node i on day x and the feature vector of node j on day y.
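For reference, the cosine-distance cost and the Wasserstein distance can be written in their standard textbook forms as below. This is a sketch consistent with the definitions above rather than a reproduction of the patent's own formulas; the symbols $X_i^{x}$ and $X_j^{y}$ (feature vector of node i on day x, of node j on day y) and the discrete sum over days are notation assumed here.

```latex
\mathrm{cost}\left(X_i^{x}, X_j^{y}\right)
  = 1 - \frac{X_i^{x} \cdot X_j^{y}}{\lVert X_i^{x} \rVert \, \lVert X_j^{y} \rVert}

D_{\mathrm{Wasserstein}}[i,j]
  = \inf_{\gamma \in \Pi[P_i, P_j]} \sum_{x} \sum_{y}
    \gamma(x, y)\, \mathrm{cost}\left(X_i^{x}, X_j^{y}\right)
```

Under this reading, $\gamma(x, y)$ is the amount of probability mass moved from day x of node i to day y of node j, and the constraint $\gamma \in \Pi[P_i, P_j]$ requires its row and column sums to equal $P_i$ and $P_j$.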

Further, the element values of the distance matrix and of the static adjacency matrix are multiplied by the preset distance matrix weight and static adjacency matrix weight, respectively, and superposed; the superposed values are then binarized according to a threshold to obtain the dynamic adjacency matrix. The formulas are as follows:

$M[i,j] = W_1 \times D_{\mathrm{Wasserstein}}[i,j] + W_2 \times A[i,j]$  formula (5)

where η denotes the threshold, $W_1$ denotes the distance matrix weight, and $W_2$ denotes the adjacency matrix weight; preferably, $W_1$ is set to 0.45 and $W_2$ to 0.55.
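The weighted superposition of formula (5) followed by binarization can be sketched as follows. The weights 0.45 and 0.55 come from the text; the threshold value `eta=0.5` and the direction of the comparison (an element becomes 1 when the superposed value is at least η) are assumptions made for illustration, and the function name is hypothetical.

```python
def dynamic_adjacency(distance, static_adj, w1=0.45, w2=0.55, eta=0.5):
    """Combine a distance matrix and a static adjacency matrix, then binarize.

    distance, static_adj: N x N nested lists.
    eta and the '>=' rule are assumed; the source only says 'according to a threshold'.
    """
    n = len(static_adj)
    combined = [
        [w1 * distance[i][j] + w2 * static_adj[i][j] for j in range(n)]
        for i in range(n)
    ]
    return [[1 if combined[i][j] >= eta else 0 for j in range(n)] for i in range(n)]
```

For example, a pair of adjacent nodes with distance 0.2 gives 0.45 × 0.2 + 0.55 × 1 = 0.64, which is kept as 1 under the assumed threshold.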

In the prior art, the adjacency matrix is constructed from the topological relations of the traffic network. In practice, however, similar urban functional areas can cause the traffic data of road network nodes to be correlated regardless of whether those nodes are connected; moreover, the road network is affected by various random factors such as road closures and traffic accidents. The dynamic adjacency matrix constructed from time-series data therefore better reflects the actual connectivity.

S12, dividing the training set into multiple time periods according to the time sequence, constructing multiple network models according to each time period, and fusing the outputs of the multiple network models to obtain a prediction model; based on the training set and the dynamic adjacency matrix, training the prediction model to obtain a trained prediction model.

It should be noted that dividing the training set into multiple time periods according to the time sequence comprises: according to the prediction period, dividing off training samples of the adjacent period, the daily period, the weekly period and the holiday period according to the preset number of samples for each period; wherein the adjacent period is a period within [1.5, 2.5] hours before the prediction period, preferably the 2 hours before the prediction period; the daily period is the same time of day as the prediction period on each day before the prediction day; the weekly period is the same time of day as the prediction period on the same weekday of each week before the prediction day; and, when the prediction day is a holiday, the holiday period is the same time of day as the prediction period on the same date in historical years.

Specifically, training samples for the adjacent, daily, and weekly periods are obtained according to the following formula:

where $t_p$ denotes the start time of the prediction period; $T_q$ denotes the prediction window, i.e., the number of predicted time slices; $T_R$, $T_D$ and $T_W$ denote the number of samples in the adjacent period, the daily period and the weekly period, respectively; and m denotes the number of time slices per day. Illustratively, with time slices divided at 5-minute intervals, m = 288; if the current time is 2012/5/1 7:00 and $T_q = 12$, then 12 vehicle speeds, one every 5 minutes, are predicted for the period 2012/5/1 7:00-8:00. Setting $T_R = 24$, $T_D = 12$ and $T_W = 24$ means that the adjacent period selects the training samples of the 24 time slices in 2012/5/1 5:00-7:00, the daily period selects the training samples of the 12 time slices in 2012/4/30 7:00-8:00, and the weekly period selects the training samples of the 24 time slices in 2012/4/17 7:00-8:00 and 2012/4/24 7:00-8:00.

For training samples of the holiday period, when the prediction period is identified as belonging to a holiday, training samples of the same date and time are extracted from historical years into $X_H$ according to the number of samples $T_H$ in the holiday period. Illustratively, with $T_H = 12$, the training samples of the 12 time slices in 2011/5/1 7:00-8:00 are selected; when the prediction period is not identified as belonging to a holiday, no training samples are selected for the holiday period.
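The selection of adjacent, daily and weekly samples in the worked example can be sketched with time-slice indices as follows. The sketch assumes that $T_D$ and $T_W$ are integer multiples of $T_q$, which holds for the example values but is not stated as a requirement in the text; the function name and the index convention (slice 0 is the first slice of the data) are hypothetical.

```python
def period_indices(t_p, T_q, T_R, T_D, T_W, m=288):
    """Return time-slice indices for the adjacent, daily and weekly periods.

    t_p: index of the first predicted time slice.
    m:   number of time slices per day (288 at 5-minute intervals).
    """
    # Adjacent period: the T_R slices immediately before the prediction window.
    adjacent = list(range(t_p - T_R, t_p))

    # Daily period: the same time of day on each of the previous T_D / T_q days.
    daily = []
    for d in range(T_D // T_q, 0, -1):
        start = t_p - d * m
        daily.extend(range(start, start + T_q))

    # Weekly period: the same time of day in each of the previous T_W / T_q weeks.
    weekly = []
    for w in range(T_W // T_q, 0, -1):
        start = t_p - 7 * w * m
        weekly.extend(range(start, start + T_q))

    return adjacent, daily, weekly
```

With $T_q = 12$, $T_R = 24$, $T_D = 12$ and $T_W = 24$, this yields 24 adjacent slices (the two hours before $t_p$), 12 slices from the previous day, and 12 slices from each of the two preceding weeks, matching the example above.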

Further, as shown in fig. 2, constructing a network model for each period and fusing the outputs of the network models to obtain a prediction model comprises: the network model corresponding to the adjacent period comprises a spatial graph convolution layer, a gated recurrent unit and a first fully connected layer connected in sequence; the network models corresponding to the daily, weekly and holiday periods are identical, each comprising a spatial graph convolution layer, a temporal attention layer and a temporal convolution layer connected in sequence; and the outputs of the four network models are fused by weighting through a second fully connected layer to obtain the prediction model.

The structure and function of the layers in the network model are specifically described below.

(I) spatial graph convolution layer

It should be noted that the spatial graph convolution layers in the network models corresponding to the four periods have the same structure. Based on an attention mechanism using feature smoothness and on Chebyshev polynomials, the spatial graph convolution layer comprises a spatial attention layer and a spatial graph convolutional neural network connected in sequence. The spatial attention layer calculates a spatial attention score matrix according to the differences between the vehicle speed vectors of the nodes in the corresponding period; the spatial attention score matrix is dynamically smoothed according to the feature smoothness and then passed into the spatial graph convolutional neural network. The spatial graph convolutional neural network uses Chebyshev polynomials as the convolution kernel and performs a convolution operation on the training samples of the corresponding period based on the dynamic adjacency matrix and the smoothed spatial attention score matrix.

Specifically, the spatial attention layer, based on an attention mechanism, learns intrinsic attention from the vehicle speed vectors rather than from an external factor vector. Given the vehicle speed vectors of node i and node j over the τ time slices in the corresponding period, the spatial attention score $s_{ij}$ between the nodes is calculated from the difference between the two vectors by the following formula; it expresses the degree of influence of node j on node i and measures the similarity between the neighbor's road condition and the current road condition:

where $V_s$, $b_s$, $W_{s1}$ and $W_{s2}$ are learnable weight parameters, σ denotes the sigmoid activation function, and N denotes the total number of nodes; $s_{ij}$ is normalized by softmax to obtain the normalized spatial attention scores, which form the attention score matrix.
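The softmax normalization of the raw scores can be written in its standard form as below. The symbol $s'_{ij}$ for the normalized score and the choice of normalizing over the second index (all nodes k for a fixed node i) are assumptions made here for illustration.

```latex
s'_{ij} = \frac{\exp\left(s_{ij}\right)}{\sum_{k=1}^{N} \exp\left(s_{ik}\right)}
```

Under this form, each row of the attention score matrix is non-negative and sums to 1.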

It should be noted that the existing graph attention network (GAT) calculates the attention score from the concatenated feature vectors of the two nodes. The present embodiment, however, considers that the less smooth the features are, the less similar a node is to its neighbors, which means that the neighbors can contribute a larger information gain. The attention mechanism in this embodiment therefore uses the difference between the feature vectors of node i and node j to calculate the attention score, so that a larger attention score results when node i is more dissimilar to its neighbor j, and a smaller one when it is more similar.

Further, the neighborhood can provide both positive information and negative interference for a particular task, so simply aggregating the feature vectors of neighbors generally does not achieve better performance. Therefore, to amplify the useful information brought by neighbors and suppress the corresponding negative interference, surrounding information must be aggregated selectively. The present embodiment updates the attention score matrix according to the feature smoothness of the corresponding period.

Specifically, according to the average value of the sum of the differences of the vehicle speed vectors of each node and the neighboring nodes, calculating the characteristic smoothness of the corresponding time period through a formula (9), and according to the characteristic smoothness and a preset multiple, obtaining the quantity to be reserved through a formula (10); according to the number to be reserved, reserving corresponding element values according to the sequence from big to small of the element values in the attention score matrix, and setting the rest element values to be zero to obtain a smoothed attention score matrix:

where $\xi_s$ denotes the feature smoothness of the period, $r_s$ denotes the number to be reserved, V denotes the node set of the urban road network, N denotes the total number of nodes in V, $N_i$ denotes the set of neighbor nodes of node i, e denotes the total number of edges of the urban road network, $\lceil \cdot \rceil$ denotes rounding up, and ρ denotes the preset multiple, preferably set to 2.

It should be noted that a large $\xi_s$ indicates that the feature signal of the graph has a higher frequency, meaning that the feature vectors of two connected nodes are more likely to differ and contain more useful information. The neighbor information that is discarded, by contrast, is considered to contain more interference, so setting it to zero helps preserve the node's own characteristics and suppresses the corresponding negative interference.
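The smoothing step (compute a feature smoothness, derive the number of entries to keep, keep the largest scores and zero the rest) can be sketched as follows. The retention step follows the text directly. The exact forms of formulas (9) and (10) are assumptions here: the smoothness is taken as the L1 norm of each node's summed neighbor differences, averaged over the number of edges, and the number to keep is taken as ⌈ρ · ξ⌉. All function names are hypothetical.

```python
import math


def feature_smoothness(x, neighbors, num_edges):
    """Assumed form of formula (9): mean over edges of the summed neighbor differences.

    x:         list of per-node vehicle speed vectors (equal length).
    neighbors: neighbors[i] is the list of neighbor indices of node i.
    """
    total = 0.0
    for i, nbrs in enumerate(neighbors):
        diff_sum = [sum(x[i][t] - x[j][t] for j in nbrs) for t in range(len(x[i]))]
        total += sum(abs(v) for v in diff_sum)
    return total / num_edges


def number_to_keep(xi, rho=2):
    """Assumed form of formula (10): round rho * xi up to an integer."""
    return math.ceil(rho * xi)


def keep_top_r(scores, r):
    """Keep the r largest entries of the score matrix and set the rest to zero."""
    ranked = sorted(
        ((v, i, j) for i, row in enumerate(scores) for j, v in enumerate(row)),
        key=lambda entry: entry[0],
        reverse=True,
    )
    kept = {(i, j) for _, i, j in ranked[:r]}
    return [
        [v if (i, j) in kept else 0.0 for j, v in enumerate(row)]
        for i, row in enumerate(scores)
    ]
```

If r exceeds the number of entries, `keep_top_r` leaves the matrix unchanged; a real implementation would probably also need a rule for tied scores, which the text does not specify.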

Further, the spatial graph convolutional neural network introduces an attention mechanism, uses Chebyshev polynomials as the convolution kernel, and performs a convolution operation on the training samples of the corresponding period, based on the dynamic adjacency matrix and the smoothed spatial attention score matrix, by the following formulas:

where X denotes the training samples of the corresponding period; the remaining matrix symbols in the formulas denote, respectively, the dynamic adjacency matrix, the smoothed spatial attention score matrix, and the parameters updated continuously during training; $\lambda_{max}$ denotes the largest eigenvalue of the Laplacian matrix; $I_N$ denotes the identity matrix; D denotes the degree matrix; $\star_G$ denotes the graph convolution operation; and $T_k(\cdot)$ is the Chebyshev polynomial of order k, with k preferably set to 4.
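A Chebyshev graph convolution modulated by an attention matrix can be sketched with NumPy as follows. The sketch assumes the common attention-based form in which each Chebyshev term is multiplied element-wise by the attention matrix, a symmetric adjacency matrix with at least one edge, and the unnormalized Laplacian L = D − A; none of these details is confirmed by the text, and the function names are hypothetical.

```python
import numpy as np


def chebyshev_polynomials(lap_scaled, order):
    """Return [T_0, ..., T_{order-1}] via T_k = 2 L T_{k-1} - T_{k-2}."""
    n = lap_scaled.shape[0]
    polys = [np.eye(n)]
    if order > 1:
        polys.append(lap_scaled)
    for _ in range(2, order):
        polys.append(2.0 * lap_scaled @ polys[-1] - polys[-2])
    return polys


def cheb_conv(x, adj, att, theta):
    """Attention-modulated Chebyshev graph convolution (assumed form).

    x:     (N, F) node features for one time slice.
    adj:   (N, N) symmetric dynamic adjacency matrix.
    att:   (N, N) smoothed spatial attention score matrix.
    theta: list of K weight matrices, each (F, F_out).
    """
    n = adj.shape[0]
    degree = np.diag(adj.sum(axis=1))
    lap = degree - adj                               # graph Laplacian
    lam_max = np.linalg.eigvalsh(lap).max()          # largest eigenvalue
    lap_scaled = 2.0 * lap / lam_max - np.eye(n)     # rescale spectrum to [-1, 1]
    out = np.zeros((n, theta[0].shape[1]))
    for t_k, theta_k in zip(chebyshev_polynomials(lap_scaled, len(theta)), theta):
        out += (t_k * att) @ x @ theta_k             # element-wise attention on T_k
    return out
```

With the preferred setting of k = 4, `theta` would hold four weight matrices, one per Chebyshev term.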

(II) gated recurrent unit and first fully connected layer

It should be noted that a recurrent neural network is a multilayer perceptron with hidden states: by introducing hidden states it stores past information together with the current input to determine the current output. However, problems such as numerical instability, vanishing gradients and exploding gradients can occur, and its ability to capture information from distant time slices is very limited. This embodiment therefore adopts a variant of the recurrent neural network, the gated recurrent unit (GRU), for the adjacent period; it uses a gating mechanism to retain more historical information, has a simple internal structure, and trains quickly.

Specifically, the mathematical expression of the gated recurrent unit is as follows:

R_t = σ(Ψ W_xr + H_{t-1} W_hr + b_r)    Formula (14)

Z_t = σ(Ψ W_xz + H_{t-1} W_hz + b_z)    Formula (15)

where Ψ denotes the output of the spatial graph convolution layer, H_{t-1} denotes the hidden state at time slice t-1, R_t denotes the reset gate, Z_t denotes the update gate, H̃_t is the candidate hidden state at time slice t, H_t denotes the output state at time slice t, and W_xr, W_hr, W_xz, W_hz, W_xh, W_hh, b_r, b_z and b_h are learnable parameters.

By combining the hidden state at time slice t-1 with the current vehicle speed, the gated recurrent unit captures the vehicle speed of the current time slice and better captures the dependence on the vehicle speeds of distant time slices.
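The gating step can be illustrated with a short NumPy sketch. The reset and update gates follow Formulas (14) and (15); the candidate state and the output state use the standard GRU form, which is an assumption here because only the two gate formulas appear in this text. The function name `gru_step` and the parameter dictionary are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(psi, h_prev, p):
    """One GRU step. psi: (N, F) graph-convolution output, h_prev: (N, H)."""
    r = sigmoid(psi @ p["W_xr"] + h_prev @ p["W_hr"] + p["b_r"])  # Formula (14)
    z = sigmoid(psi @ p["W_xz"] + h_prev @ p["W_hz"] + p["b_z"])  # Formula (15)
    # Assumed standard GRU form for the candidate hidden state:
    h_cand = np.tanh(psi @ p["W_xh"] + (r * h_prev) @ p["W_hh"] + p["b_h"])
    # Assumed standard GRU form for the output state: the update gate
    # interpolates between the previous state and the candidate state.
    return z * h_prev + (1.0 - z) * h_cand

def init_params(n_features, n_hidden, rng):
    """Randomly initialised parameters with the names used in the text."""
    p = {}
    for gate in ("r", "z", "h"):
        p[f"W_x{gate}"] = rng.normal(scale=0.1, size=(n_features, n_hidden))
        p[f"W_h{gate}"] = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
        p[f"b_{gate}"] = np.zeros(n_hidden)
    return p

rng = np.random.default_rng(0)
params = init_params(n_features=3, n_hidden=4, rng=rng)
h = np.zeros((5, 4))
for psi_t in rng.normal(size=(6, 5, 3)):  # six time slices, five nodes
    h = gru_step(psi_t, h, params)
```

An update gate close to 1 keeps the previous hidden state almost unchanged, which is how the unit carries information across many time slices.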

Further, the first fully connected layer weights the output of the gated recurrent unit and maps it through a sigmoid function to the output Y_R of the network model corresponding to the adjacent period, expressed as:

where W_r denotes the learnable weight parameters, σ denotes the sigmoid activation function, and the input to this layer is the output of the gated recurrent unit.

(III) temporal attention layer and temporal convolution layer

It should be noted that the preceding spatial graph convolution layer first aggregates the vehicle speeds of each node's neighboring nodes. Attention scores between time slices are then computed from the vehicle speeds of the time slices within the corresponding period, to measure how strongly each time slice in the period influences the vehicle speed in the prediction window. Finally, the temporal convolution layer aggregates the information from the different time points to obtain the output for the corresponding period.

Specifically, the temporal attention layer calculates a temporal attention score matrix from the differences between the vehicle speed vectors of the time slices within the corresponding period; the element values of the temporal attention score matrix are calculated according to the following formula:

where the two vectors in the formula are the vehicle speed vectors at the N nodes for time slices t_i and t_j within the corresponding period; V_q, b_q, W_q1 and W_q2 denote weight parameters; σ denotes the sigmoid activation function; N denotes the total number of nodes; and τ denotes the number of time slices in the corresponding period. The temporal attention scores are normalized by softmax so that the temporal attention weights sum to 1.

Each element of the temporal attention score matrix represents the temporal attention score between time point t_i and time point t_j, that is, the degree of dependence of time t_j on time t_i. The output of the spatial graph convolution layer is then dynamically weighted by this matrix to obtain the output of the temporal attention layer, expressed as:

Further, the output of the temporal attention layer is passed to the temporal convolution layer, which is a convolutional neural network; after convolving and merging the output of the temporal attention layer, it maps the result to the output using a LeakyReLU function, expressed as:

where Φ denotes the parameters of the convolution kernels in the time dimension; for example, the convolution kernels have a stride of 3 and there are 64 of them.
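The flow of this part (softmax normalization of the scores, dynamic weighting of the spatial output, then a temporal convolution with LeakyReLU) can be sketched as follows. The raw score matrix is taken as given rather than computed from the attention formula, the softmax is applied along each row, and a single kernel shared by all nodes is used instead of 64; all of these, along with the function names, are simplifying assumptions for illustration.

```python
import numpy as np

def softmax(scores, axis):
    shifted = scores - scores.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def apply_temporal_attention(features, raw_scores):
    """features: (N, tau) per-node series, raw_scores: (tau, tau) unnormalised."""
    weights = softmax(raw_scores, axis=1)  # every row sums to 1
    # out[n, i] = sum_j weights[i, j] * features[n, j]
    return features @ weights.T

def leaky_relu(a, slope=0.01):
    return np.where(a > 0, a, slope * a)

def temporal_conv(features, kernel, stride=1):
    """Slide one kernel along the time axis (no padding), then apply LeakyReLU."""
    width = kernel.shape[0]
    starts = range(0, features.shape[1] - width + 1, stride)
    out = np.stack([features[:, s:s + width] @ kernel for s in starts], axis=1)
    return leaky_relu(out)

features = np.arange(8, dtype=float).reshape(2, 4)   # 2 nodes, 4 time slices
weighted = apply_temporal_attention(features, np.zeros((4, 4)))
conv = temporal_conv(features, np.array([1.0, 1.0, 1.0]))
```

With all raw scores equal, every time slice receives the same weight, so each node's weighted series collapses to its mean; unequal scores shift the weighting toward the time slices that matter more for the prediction window.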

(IV) second fully connected layer

It should be noted that the four periods correspond to four network models, which yield four outputs Y_R, Y_D, Y_W and Y_H. The importance of these four outputs to the prediction result is dynamic: for roads with pronounced daily peak hours, the output of the daily period is more important; for roads that differ significantly between Monday and the weekend, the output of the weekly period is relatively more critical; for roads whose traffic flow rises on holidays, the output of the holiday period is key; and for roads without periodic variation, the outputs of the daily and weekly periods are less important. The four outputs are therefore combined by weighting. Their weights are not fixed: they are randomly initialized in [0, 1] and then updated iteratively during training. The second fully connected layer thus performs a weighted fusion of the four outputs to produce the final prediction result Y, expressed as:

Y = W_R ⊙ Y_R + W_D ⊙ Y_D + W_W ⊙ Y_W + W_H ⊙ Y_H    Formula (23)

where W_R, W_D, W_W and W_H are learnable weight parameters that reflect the degree to which each of the four outputs influences the prediction result.
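Formula (23) is an element-wise weighted sum, which a few lines of NumPy can illustrate. The array shapes (nodes by prediction steps), the random stand-in outputs and the dictionary keys are assumptions for the example; in the method the weights would be updated by training rather than left at their initial values.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, horizon = 5, 12

# Stand-ins for the four component outputs Y_R, Y_D, Y_W and Y_H.
outputs = {name: rng.uniform(20.0, 60.0, size=(n_nodes, horizon)) for name in "RDWH"}
# Fusion weights W_R, W_D, W_W and W_H, randomly initialised in [0, 1].
weights = {name: rng.uniform(0.0, 1.0, size=(n_nodes, horizon)) for name in "RDWH"}

def fuse(outputs, weights):
    """Y = sum over components of W_k ⊙ Y_k (element-wise products)."""
    return sum(weights[k] * outputs[k] for k in outputs)

y = fuse(outputs, weights)
```

Because the weights have the same shape as the outputs, each node and each prediction step can lean on a different component, which matches the point that the importance of the four outputs varies from road to road.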

After the prediction model is built, it is trained based on the training set and the dynamic adjacency matrix to obtain the trained prediction model, which includes:

Each network model outputs a predicted vehicle speed from the input training samples and the dynamic adjacency matrix, and the prediction is compared with the corresponding actual vehicle speed, with the mean square error (MSE) selected as the loss function of the prediction model. Training ends once the mean square error reaches a preset error value, yielding the trained prediction model.

It should be noted that, across the training samples of the adjacent period, daily period, weekly period and holiday period divided in this embodiment, the specificity of the samples increases step by step and the corresponding vehicle speed fluctuates more frequently. If the same data distribution were kept throughout training, the prediction model would focus excessively on data that is small in quantity and large in fluctuation; training would then be harder and would take a long time to reach a good result, and the model would also generalize poorly. This embodiment therefore introduces the idea of curriculum learning.

Curriculum learning is a training strategy that can improve the generalization ability and convergence speed of various models in a wide range of scenarios. From the perspective of model optimization, it first provides a smoothed objective for which a global minimum is easy to find, and then gradually considers less smoothed objectives while tracking the local minimum throughout training. From the perspective of data distribution, it advocates learning from simple samples first and gradually adding samples with greater diversity and information content. The core components of curriculum learning are the difficulty measurer and the training scheduler. In this embodiment, the difference between periods is used as the difference in difficulty: the training samples of the adjacent period are defined as the easiest, and the training samples of the daily period, weekly period and holiday period are of gradually increasing difficulty. Three training schedulers φ_D(c), φ_W(c) and φ_H(c) are used respectively, so that the number of training samples increases as the number of training rounds increases.

Specifically, the training scheduler φ_D(c) for the training samples of the daily period, the training scheduler φ_W(c) for the training samples of the weekly period, and the training scheduler φ_H(c) for the training samples of the holiday period are expressed by the following formulas:

where c denotes the current training round, C denotes the total number of training rounds, and the upper scheduling limit of each scheduler is a hyperparameter in [0, 1] with a preferred preset value.

In each training round, all training samples of the adjacent period are selected and passed into the network model corresponding to the adjacent period; for each of the other periods, the value of the corresponding training scheduler is taken as the selection proportion, and that proportion of the corresponding training samples is randomly selected and passed into the corresponding network model. In the early stage of training, therefore, essentially only the training samples of the adjacent period are used, and training is fast. In the middle stage, training samples of the other three periods are gradually added; the model then begins to focus on the periods with larger fluctuations in vehicle speed, with emphasis on improving the prediction for those periods. In the late stage, the complete data of all four periods is used, so that the model grasps the overall trend of the vehicle speed better, generalizes better, approximates the periods with larger fluctuations more accurately, and improves in overall performance.
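The per-round sample selection can be sketched as below. The scheduler formulas themselves are not reproduced in this text, so `linear_scheduler` is a hypothetical pacing function (a linear ramp that saturates at the upper limit); the ramp length, the upper limits and the period names are likewise illustrative assumptions.

```python
import numpy as np

def linear_scheduler(c, ramp_rounds, upper_limit):
    """Hypothetical scheduler: share grows linearly with round c, then saturates."""
    return upper_limit * min(1.0, c / ramp_rounds)

def select_round_samples(samples, c, ramp_rounds, upper_limits, rng):
    """samples maps a period name to an array of training samples.

    The adjacent period is always used in full; every other period
    contributes a random subset whose size is set by its scheduler.
    """
    chosen = {"adjacent": samples["adjacent"]}
    for period, limit in upper_limits.items():
        share = linear_scheduler(c, ramp_rounds, limit)
        count = int(round(share * len(samples[period])))
        idx = rng.choice(len(samples[period]), size=count, replace=False)
        chosen[period] = samples[period][idx]
    return chosen

rng = np.random.default_rng(0)
samples = {
    "adjacent": np.arange(10),
    "daily": np.arange(100),
    "weekly": np.arange(100),
    "holiday": np.arange(100),
}
upper_limits = {"daily": 1.0, "weekly": 1.0, "holiday": 0.5}  # illustrative values
early = select_round_samples(samples, 0, 10, upper_limits, rng)
middle = select_round_samples(samples, 5, 10, upper_limits, rng)
late = select_round_samples(samples, 20, 10, upper_limits, rng)
```

Sampling without replacement keeps each round's subset free of duplicates, and drawing a fresh subset every round lets the harder samples rotate through training even before their full share is reached.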

Preferably, after training of the prediction model is finished, the test set is used to evaluate the performance of the prediction model, so as to guard against the network overfitting or underfitting the training data set. The root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE) are calculated from the prediction results on the test set; if the results do not meet the requirement, the prediction model is retrained until they do, thereby obtaining the trained prediction model.
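The three evaluation metrics can be computed as in the following sketch. The function name `evaluate` is hypothetical, and the MAPE line assumes that no actual speed is zero.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Return RMSE, MAE and MAPE (in percent) for predicted vehicle speeds."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    return {
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAE": float(np.mean(np.abs(err))),
        # Assumes y_true contains no zeros; otherwise mask those entries first.
        "MAPE": float(np.mean(np.abs(err / y_true)) * 100.0),
    }

metrics = evaluate([50.0, 40.0], [45.0, 44.0])
```

RMSE penalizes large errors more heavily than MAE, while MAPE expresses the error relative to the actual speed, so the three together give a fuller picture than any one alone.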

S13, collecting the vehicle speed of the node to be predicted and, according to the period to be predicted and the preset number of samples in each period, selecting the samples to be tested in each period and passing them into the trained prediction model to obtain the vehicle speed in the period to be predicted.

During actual prediction, the dynamic adjacency matrix of the trained prediction model is used. According to the period to be predicted and the preset number of samples in each period, the samples to be tested in each period are automatically selected from the collected vehicle speeds of the node to be predicted and passed into the trained prediction model, which fuses the outputs of the individual periods to obtain the vehicle speed in the period to be predicted.

Illustratively, the period to be predicted is 2022/4/12 8:00-9:00, the time interval is 5 minutes, and the prediction window is T_q = 12. If the period to be predicted does not fall on a holiday, then according to T_R = 24, T_D = 12 and the preset T_W, the samples to be tested in the corresponding periods are selected and passed into the prediction model trained by the method of this embodiment, giving 12 prediction results, one every 5 minutes: 2022/4/12 8:05, 2022/4/12 8:10, ..., 2022/4/12 9:00.
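The time slices in this example can be enumerated with the standard library. The sketch labels each 5-minute slice by its end time, which matches the listed prediction results, and assumes that the adjacent period consists of the T_R = 24 slices immediately before the period to be predicted; the helper name `slice_end_times` is hypothetical.

```python
from datetime import datetime, timedelta

def slice_end_times(start, count, minutes=5):
    """End times of `count` consecutive slices that begin at `start`."""
    step = timedelta(minutes=minutes)
    return [start + step * (i + 1) for i in range(count)]

period_start = datetime(2022, 4, 12, 8, 0)

# T_q = 12 prediction slices covering 8:00-9:00.
prediction_times = slice_end_times(period_start, 12)

# T_R = 24 adjacent slices, i.e. the two hours before the predicted period.
adjacent_times = slice_end_times(period_start - timedelta(minutes=5 * 24), 24)

print(prediction_times[0].strftime("%Y/%m/%d %H:%M"),
      prediction_times[-1].strftime("%Y/%m/%d %H:%M"))
```

Twenty-four slices of 5 minutes span 2 hours, which falls inside the [1.5, 2.5] hour range given for the adjacent period in claim 3.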

Compared with the prior art, the urban road vehicle speed prediction method provided by this embodiment uses historical sequence data and measures the similarity between nodes by the difference between their probability distributions, thereby defining a dynamic connectivity between road network nodes that is closer to the actual connectivity. The attention mechanism is improved based on feature smoothness, so that each node amplifies the useful information brought by its neighbors and suppresses the corresponding negative interference. Data from the four most relevant periods are selected, which increases the diversity of the data, and a training scheduling strategy based on the idea of curriculum learning gradually adds data from the periods that are harder to train, so that the model has a different emphasis at different stages of training. This improves the training speed and generalization performance of the model as well as its prediction accuracy.

Those skilled in the art will appreciate that all or part of the flow of the methods of the embodiments described above may be accomplished by way of a computer program to instruct associated hardware, where the program may be stored on a computer readable storage medium. Wherein the computer readable storage medium is a magnetic disk, an optical disk, a read-only memory or a random access memory, etc.

The present invention is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present invention are intended to be included in the scope of the present invention.

Claims

1. A method for predicting the speed of an urban road vehicle, comprising the steps of:

2. The urban road vehicle speed prediction method according to claim 1, wherein the constructing a dynamic adjacency matrix according to the network topology of each node in the training set and the daily vehicle speed probability comprises:

3. The urban road vehicle speed prediction method according to claim 1, wherein the multi-period division of the training set in time sequence comprises: according to the prediction time period, respectively dividing training samples of adjacent time periods, daily time periods, weekly time periods and holiday time periods according to the preset sample quantity of each time period; wherein the adjacent period is a period within [1.5,2.5] hours before the prediction period, the daily period is a period of time identical to the prediction period every day before the prediction day, the weekly period is a period of time identical to the prediction period every other week before the prediction day, and when the prediction day belongs to the holiday, the holiday period is a period of time identical to the prediction period on the day of the prediction day in the history year.

4. The urban road vehicle speed prediction method according to claim 3, wherein the constructing a plurality of network models according to each period and fusing the outputs of the plurality of network models to obtain the prediction model comprises: the network model corresponding to the adjacent period comprises a spatial graph convolution layer, a gated recurrent unit and a first fully connected layer connected in sequence; the network models corresponding to the daily period, the weekly period and the holiday period are identical, each comprising a spatial graph convolution layer, a temporal attention layer and a temporal convolution layer connected in sequence; and the outputs of the four network models are weighted and fused by a second fully connected layer to obtain the prediction model.

5. The urban road vehicle speed prediction method according to claim 4, wherein the spatial graph convolution layer comprises a spatial attention layer and a spatial graph convolutional neural network connected in sequence; the spatial attention layer calculates a spatial attention score matrix from the differences between the vehicle speed vectors of the nodes within the corresponding period, and the spatial attention score matrix is dynamically smoothed according to feature smoothness and then passed into the spatial graph convolutional neural network; the spatial graph convolutional neural network uses Chebyshev polynomials as convolution kernels and performs the convolution operation on the training samples of the corresponding period based on the dynamic adjacency matrix and the smoothed spatial attention score matrix.

6. The urban road vehicle speed prediction method according to claim 4, wherein the temporal attention layer calculates a temporal attention score matrix from the differences between the vehicle speed vectors of the time slices within the corresponding period, dynamically weights the output of the corresponding spatial graph convolution layer according to the temporal attention score matrix, and outputs the result to the temporal convolution layer; the temporal convolution layer is a convolutional neural network which, after convolving and merging the output of the temporal attention layer, maps the result to the output using a LeakyReLU function.

7. The urban road vehicle speed prediction method according to claim 5, characterized in that the element values of the spatial attention score matrix are calculated according to the following formula:

where the two vectors in the formula are the vehicle speed vectors of node i and node j over the τ time slices within the corresponding period; V_s, b_s, W_s1 and W_s2 denote weight parameters; σ denotes the sigmoid activation function; and N denotes the total number of nodes.

8. The urban road vehicle speed prediction method according to claim 6, wherein the element values of the temporal attention score matrix are calculated according to the following formula:

where the two vectors in the formula are the vehicle speed vectors at the N nodes for time slices t_i and t_j within the corresponding period; V_q, b_q, W_q1 and W_q2 denote weight parameters; σ denotes the sigmoid activation function; N denotes the total number of nodes; and τ denotes the number of time slices in the corresponding period.

9. The urban road vehicle speed prediction method according to claim 5, wherein the dynamically smoothing the spatial attention score matrix according to the characteristic smoothness comprises:

10. The method for predicting urban road vehicle speed according to claim 3, wherein the training the prediction model based on the training set and the dynamic adjacency matrix to obtain a trained prediction model comprises: