CN115293237A - Vehicle track prediction method based on deep learning - Google Patents

Vehicle track prediction method based on deep learning Download PDF

Info

Publication number
CN115293237A
CN115293237A (application CN202210791760.3A)
Authority
CN
China
Prior art keywords
vehicle
target vehicle
track
module
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210791760.3A
Other languages
Chinese (zh)
Inventor
杨璐
李培鑫
任凤雷
刘佳琦
赵晨阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University of Technology
Original Assignee
Tianjin University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University of Technology filed Critical Tianjin University of Technology
Priority to CN202210791760.3A priority Critical patent/CN115293237A/en
Publication of CN115293237A publication Critical patent/CN115293237A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29: Geographical information databases
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06N3/084: Backpropagation, e.g. using gradient descent

Abstract

The invention discloses a vehicle track prediction method based on deep learning, which comprises the following steps: S1, constructing a vehicle track database for deep learning, the database comprising a plurality of groups of vehicle track data, each group comprising characteristic information of a target vehicle and of the related vehicles around it; S2, constructing a vehicle track prediction model formed by a convolution pooling module, an encoding module and a decoding module; S3, training the vehicle track prediction model constructed in step S2 with the data set constructed in step S1; S4, obtaining the same kind of vehicle characteristic information as in step S1 and inputting it into the trained model to predict the vehicle track. The method extracts the characteristic information of the target vehicle and of the related vehicles around it, and constructs a vehicle track prediction model with strong information-extraction capability, so that the track of the target vehicle can be predicted effectively; the prediction results are accurate and reliable, and the method can be widely applied to various road scenes.

Description

Vehicle track prediction method based on deep learning
Technical Field
The invention relates to the technical field of automatic driving, in particular to a vehicle track prediction method based on deep learning.
Background
With the advance of technology, unmanned-vehicle technology has developed rapidly. By continuously integrating environment perception, target detection, obstacle prediction, decision planning and control modules, and relying on artificial intelligence, a vehicle can complete automatic driving under some conditions.
Safety has always been a central concern in the field of automatic driving. Sensing and predicting surrounding obstacles is a key problem for the whole autonomous vehicle, particularly for dynamic obstacles such as other vehicles in the road: it is necessary not only to obtain their current positions but also to predict their future motion. Accurately predicting the future tracks of surrounding obstacles leaves more reaction time for the planning and control module of the driving vehicle, makes the vehicle run more smoothly, and greatly improves the riding experience of the automatic-driving vehicle, bringing a comfortable and safe ride. Traditional track prediction methods include physics-based track prediction and dynamics-based track prediction; they generally establish a dynamic equation or fit a polynomial equation to the target track, and their flexibility is poor. Methods such as hidden Markov models and Bayesian formulations were developed later, but these models need a large number of parameters, and subjective parameters must be set in the application stage, so their applicability is limited. Therefore, a vehicle trajectory prediction method based on deep learning is proposed herein.
Disclosure of Invention
The invention aims to provide a vehicle track prediction method based on deep learning, which can accurately extract the motion characteristics of a vehicle and the interaction information of the vehicle and surrounding vehicles so as to improve the safety of an automatic driving automobile.
Therefore, the technical scheme of the invention is as follows:
a vehicle track prediction method based on deep learning comprises the following steps:
S1, constructing a vehicle track database for deep learning, the database comprising a plurality of groups of vehicle track data; each group of vehicle track data is obtained by the following steps:
S101, acquiring historical track information of a plurality of running vehicles on a plurality of lanes by means of database calling or real-time data acquisition;
S102, setting a target vehicle x, and determining the related vehicles around the target vehicle x according to the position information of the target vehicle x at the current time i; wherein the surrounding related vehicles of the target vehicle x are defined as: based on the driving lane of the target vehicle x at the current time i, all vehicles whose distance from the target vehicle x is less than or equal to L, on the driving lane of the target vehicle x and on the adjacent lanes to its left and right, are the surrounding related vehicles y = {y_1, y_2, …, y_{n−1}, y_n};
S103, acquiring the historical positions of the target vehicle x and of the surrounding related vehicles y at the current time i and at the historical times within t_h seconds before time i, as the known historical position information; acquiring the positions of the target vehicle x within the time range from time i to time i + t_pred, as the future position information of the target vehicle;
S104, establishing a vehicle coordinate system with the position of the target vehicle x at the current time i as the origin, the road-width direction as the x axis and the vehicle travel direction as the y axis; then normalizing all the position information of the target vehicle x and the surrounding related vehicles y obtained in step S103, i.e., expressing the position coordinates of every vehicle at every time uniformly in this vehicle coordinate system;
S105, rasterizing the road range determined when the surrounding related vehicles of the target vehicle x were acquired in step S102, with the lane width as the grid width and the body length of the target vehicle x as the grid length, so that the target vehicle and each surrounding related vehicle are each located in one grid cell; then mapping the target vehicle x and all the surrounding related vehicles at time i into the corresponding grid cells and representing them by a mask vector:
mask = (X_ab),
where each grid cell is denoted X_ab, a being the row number and b the column number of the road grid; values are then assigned according to whether a vehicle occupies the cell: the grid position is assigned 1 when a vehicle is in the cell and 0 when no vehicle is in the cell;
S106, based on the data obtained in step S104, obtaining:
1) Historical time-series track information Traj_h of the target vehicle, expressed as: Traj_h = {vehicleid, (p_0, p_1, …, p_i)};
2) Future time-series track information Traj_pred of the target vehicle, expressed as: Traj_pred = {vehicleid, (p_{i+1}, p_{i+2}, …, p_{i+t_pred})};
3) Historical time-series track information Traj_nbrs of any surrounding related vehicle, expressed as: Traj_nbrs = {vehicleid, (p_0, p_1, …, p_i)};
where vehicleid represents the ID number of the vehicle; p_i is the position information of the vehicle at time i, p_i = {x_i, y_i, t_i, laneid}; t_i denotes time i; x_i is the abscissa and y_i the ordinate of the vehicle at time i; and laneid identifies the driving lane of the vehicle at time i;
S107, performing code mapping on the historical time-series track information Traj_h of the target vehicle and the historical time-series track information Traj_nbrs of the surrounding related vehicles obtained in step S106, respectively, to obtain the historical time-series track coding information f_h of the target vehicle and the time-series track coding information f_nbrs of the surrounding vehicles; wherein,
(1) For the target vehicle: based on the coordinates (x_i, y_i) of the target vehicle at time i, the high-dimensional vector is expressed as:
f_hi = F(x_i, y_i; W_loc, B_loc),
where F is a fully connected neural network layer, W_loc ∈ R^{2×D_k} represents the weight parameters of the network and B_loc ∈ R^{D_k} represents its bias parameters, R denoting the real space and D_k the spatial feature dimension;
(2) For the other surrounding vehicles: based on the position (x_i, y_i) of any surrounding vehicle at time i, the high-dimensional vector is expressed as:
f_nbrsi = F(x_i, y_i; W_loc, B_loc),
where F is a fully connected neural network layer, W_loc ∈ R^{2×D_m} represents the weight parameters of the network and B_loc ∈ R^{D_m} represents its bias parameters, R denoting the real space and D_m the spatial feature dimension;
S2, constructing a vehicle track prediction model based on a self-attention Transformer network, the model consisting of a convolution pooling module, an encoding module and a decoding module; the encoding module adopts the encoder of a self-attention Transformer network, with an LSTM encoding module or a Fully Connected Layer module added at its output; the decoding module consists of a first Fully Connected Layer module, a LayerNorm module, a ReLU module, a second Fully Connected Layer module and a third Fully Connected Layer module connected in sequence; the output of the convolution pooling module and the output of the encoder module are connected to the first Fully Connected Layer module through a vector fusion module, which fuses the two vectors by splicing;
S3, training the vehicle track prediction model constructed in step S2 with the data set obtained by processing the deep-learning vehicle track database constructed in step S1;
wherein, in the training process, for each group of vehicle track data: 1) the mask vector of the positional relationship between the target vehicle and each surrounding related vehicle at time i obtained in step S105, and the time-series track coding information f_nbrs of the surrounding vehicles within the time range from i − t_h to time i obtained in step S107, are taken as input data and fed into the convolution pooling module of the vehicle track prediction model; 2) the historical time-series track coding information f_h of the target vehicle within the time range from i − t_h to time i obtained in step S107 is taken as input data and fed into the encoding module of the vehicle track prediction model; 3) the future time-series track information Traj_pred of the target vehicle from time i to time i + t_pred obtained in step S106 is taken as the output data of the decoding module;
S4, obtaining, in the same way as in step S1: information (1), the mask vector of the positional relationship between the new target vehicle and each surrounding related vehicle at the current time i; information (2), the time-series track coding information f_nbrs of the surrounding vehicles within the time range from i − t_h to time i; information (3), the historical time-series track coding information f_h of the new target vehicle within the time range from i − t_h to time i; then feeding information (1) and information (2) into the convolution pooling module of the vehicle track prediction model and information (3) into the encoding module of the vehicle track prediction model; after the model runs, the output of the decoding module is the future time-series track information Traj_pred of the new target vehicle from time i to time i + t_pred.
Further, in step S102, L has a value ranging from 18.4m to 36.8m.
Further, in step S103, t_h ranges from 2 s to 5 s, t_pred ranges from 3 s to 5 s, and the vehicle position information is sampled once every 0.2 s.
Further, in step S107, D_k = 512 and D_m = 32.
Further, in step S3, the specific implementation steps are as follows:
S301, randomly dividing the plurality of groups of vehicle track data in the deep-learning vehicle track database constructed in step S1 into a training set, a verification set and a test set according to the proportion of 7;
S302, inputting the input data of the vehicle track data as the model's input, taking the output data as the model's target output, feeding them into the vehicle track prediction model constructed in step S2, and calculating the loss value of the current error; then optimizing the network parameters by error back-propagation according to the current loss value until training is finished.
Further, in step S302, the loss function adopts the root mean square error RMSE, and the parameter learning rate is set to the fixed value lr = 0.0001.
Compared with the prior art, the vehicle track prediction method based on deep learning has the following beneficial effects:
1) In extracting vehicle characteristics, a vehicle track prediction model with stronger information-extraction capability is constructed by combining a recurrent neural network, an attention mechanism and a spatial information fusion mechanism, and the convolution pooling module is used to extract the interaction tensor between the target vehicle and the surrounding vehicles. The coding information of the target vehicle is then fused with its interaction tensor to represent the spatio-temporal information of the target vehicle: the encoding module encodes the characteristics of the target vehicle, and the decoding module then achieves effective prediction of the target vehicle's track;
2) The vehicle track prediction model is trained on actual driving data, so the prediction results are more credible, and the observed prediction performance confirms their high accuracy;
3) The prediction method has broad application scenarios: it can be applied well to expressway scenes and to other complex road networks, and has good market prospects.
Drawings
FIG. 1 is a flow chart of a deep learning based vehicle trajectory prediction method of the present invention;
fig. 2 (a) is a schematic diagram illustrating the selection manner of the target vehicle and the related vehicles around in step S102 of the deep learning-based vehicle trajectory prediction method according to the present invention;
fig. 2 (b) is a schematic diagram of rasterizing the target vehicle and the related vehicles around in step S105 of the deep learning-based vehicle trajectory prediction method of the present invention;
fig. 3 is a schematic structural diagram of a vehicle trajectory prediction model in step S2 of the deep learning-based vehicle trajectory prediction method of the present invention.
Detailed Description
The invention will be further described with reference to the following figures and specific examples, which are not intended to limit the invention in any way.
As shown in fig. 1, the vehicle trajectory prediction method based on deep learning includes the following steps:
S1, constructing a vehicle track database for deep learning;
S101, obtaining historical track information of a plurality of running vehicles on a plurality of lanes by means of database calling or real-time data acquisition;
S102, setting a target vehicle x and determining the related vehicles around it according to the position information of the target vehicle x at the current time i; specifically, as shown in fig. 2 (a), based on the driving lane A of the target vehicle x at the current time i, the adjacent lane on the left of lane A is the first related lane B and the adjacent lane on the right of lane A is the second related lane C; then, on lanes A, B and C, all vehicles whose distance from the target vehicle x is less than or equal to L are defined as the surrounding related vehicles of the target vehicle x, y = {y_1, y_2, …, y_{n−1}, y_n};
wherein L ranges from 18.4 m to 36.8 m; in this embodiment, L is 27.6 m; on this basis, as shown in fig. 2 (a), the target vehicle x has five surrounding related vehicles;
S103, acquiring the historical positions of the target vehicle x and of the surrounding related vehicles y at the current time i and at the historical times within t_h seconds before time i, as the known historical position information; acquiring the positions of the target vehicle x within the time range from time i to time i + t_pred, as the future position information of the target vehicle, used as training data for the model and, during testing, as data for verifying the model's effectiveness; wherein t_h ranges from 2 s to 5 s and t_pred ranges from 3 s to 5 s;
In this embodiment, t_h is 3 s, t_pred is 5 s, and the vehicle position information is sampled once every 0.2 s; accordingly, in the acquired known historical position information, 16 position samples are collected for the target vehicle and for each surrounding related vehicle within the range from time i − 3 s to time i; in the future position information of the target vehicle, 25 position samples are collected from time i (excluding time i) to time i + 5 s;
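For illustration, a minimal sketch of this windowing, assuming trajectories are stored as chronologically ordered lists sampled every 0.2 s (the helper name and arguments are assumptions, not part of the original disclosure):

```python
# Slice a trajectory sampled every 0.2 s into the 3 s history window
# (16 points, time i inclusive) and the 5 s future window (25 points,
# time i exclusive), matching the counts stated in the embodiment.
def split_trajectory(points, i_idx, t_h=3.0, t_pred=5.0, dt=0.2):
    """points: chronologically ordered position samples of one vehicle;
    i_idx: index of the sample taken at the current time i."""
    n_hist = int(round(t_h / dt)) + 1   # 15 intervals + 1 -> 16 points
    n_fut = int(round(t_pred / dt))     # 25 points
    history = points[i_idx - n_hist + 1 : i_idx + 1]
    future = points[i_idx + 1 : i_idx + 1 + n_fut]
    return history, future
```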
S104, establishing a vehicle coordinate system with the position of the target vehicle x at the current time i as the origin, the road-width direction as the x axis and the vehicle travel direction as the y axis; then normalizing the position information of the target vehicle x and the surrounding related vehicles y within the historical time range set in step S103, i.e., expressing the position coordinates of every vehicle at every time uniformly in this vehicle coordinate system;
S105, as shown in fig. 2 (b), rasterizing the road range determined when the surrounding related vehicles of the target vehicle x were acquired in step S102, i.e., driving lanes A, B and C, with the lane width as the grid width and the average passenger-car body length of 4.6 m as the grid length, so that the target vehicle and each surrounding related vehicle are each located in one grid cell; then mapping the target vehicle x and all the surrounding related vehicles at time i into the corresponding grid cells, thereby obtaining the relative positional relationships among the vehicles, and representing them by a mask vector: the grid position is assigned 1 when a vehicle is in the cell and 0 when no vehicle is in the cell;
In this embodiment, with the lane width as the grid width and the body length of the target vehicle x as the grid length, lane A, lane B and lane C are rasterized into a 13 × 3 grid, i.e., the grid has 13 rows and 3 columns; the target vehicle x and the five surrounding related vehicles are located in six separate grid cells; correspondingly, the mask vector of the positional relationship between the target vehicle and each surrounding related vehicle at time i is represented as:
(the concrete 13 × 3 binary matrix is given as an image in the original publication)
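For illustration, a minimal sketch of constructing such an occupancy mask (the helper and the example cell indices are assumptions, not part of the original disclosure):

```python
import torch

# Build the 13 x 3 occupancy mask of step S105: 13 rows along the driving
# direction, 3 columns for {left lane, own lane, right lane}.
def build_mask(occupied_cells, rows=13, cols=3):
    """occupied_cells: iterable of (row, col) indices of grid cells occupied
    by the target vehicle and its surrounding related vehicles."""
    mask = torch.zeros(rows, cols)
    for a, b in occupied_cells:
        mask[a, b] = 1.0          # 1 where a vehicle is present, 0 otherwise
    return mask

# e.g. the target vehicle plus five neighbours occupy six cells:
mask = build_mask([(6, 1), (4, 0), (5, 2), (8, 1), (3, 1), (9, 0)])
```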
S106, based on the data obtained in step S104, obtaining:
1) Historical time-series track information Traj_h of the target vehicle, expressed as:
Traj_h = {vehicleid, (p_0, p_1, …, p_i)};
2) Future time-series track information Traj_pred of the target vehicle, expressed as:
Traj_pred = {vehicleid, (p_{i+1}, p_{i+2}, …, p_{i+t_pred})};
3) Historical time-series track information Traj_nbrs of any surrounding related vehicle, expressed as:
Traj_nbrs = {vehicleid, (p_0, p_1, …, p_i)};
where vehicleid represents the ID number of the vehicle; p_i is the position information of the vehicle at time i, p_i = {x_i, y_i, t_i, laneid}; t_i denotes time i; x_i is the abscissa and y_i the ordinate of the vehicle at time i; and laneid identifies the driving lane of the vehicle at time i;
S107, based on the historical time-series track information Traj_h of the target vehicle and the historical time-series track information Traj_nbrs of the surrounding related vehicles obtained in step S106, performing code mapping respectively to obtain the historical time-series track coding information f_h of the target vehicle and the time-series track coding information f_nbrs of the surrounding vehicles; specifically,
(1) For the target vehicle: the position coordinates of the target vehicle at the historical times are converted into a high-dimensional vector representation, i.e., the historical time-series track coding information f_h of the target vehicle.
Taking the coordinates (x_i, y_i) of the target vehicle at time i as an example, the high-dimensional vector is expressed as:
f_hi = F(x_i, y_i; W_loc, B_loc),
where F is a fully connected neural network layer, W_loc ∈ R^{2×D_k} represents the weight parameters of the network and B_loc ∈ R^{D_k} represents its bias parameters, R denoting the real space and D_k the spatial feature dimension;
In this embodiment, D_k = 512; since the target vehicle has 16 historical sampling points, the historical time-series track coding information f_h of the target vehicle consists of 16 high-dimensional vectors;
(2) For the other surrounding vehicles: the position coordinates of each surrounding related vehicle at the historical times are converted into a high-dimensional vector representation, i.e., the time-series track coding information f_nbrs of the surrounding vehicles; let the number of surrounding vehicles be n; in this embodiment, n = 5.
Taking the position (x_i, y_i) of any surrounding related vehicle at time i as an example, the high-dimensional vector is expressed as:
f_nbrsi = F(x_i, y_i; W_loc, B_loc),
where F is a fully connected neural network layer, W_loc ∈ R^{2×D_m} represents the weight parameters of the network and B_loc ∈ R^{D_m} represents its bias parameters, R denoting the real space and D_m the spatial feature dimension;
In this embodiment, D_m = 32; since there are 16 historical sampling points, each surrounding vehicle contributes 16 high-dimensional vectors, and the time-series track coding information f_nbrs of the surrounding vehicles comprises 5 × 16 high-dimensional vectors in total.
S2, constructing a vehicle track prediction model based on a self-attention Transformer network;
the vehicle track prediction model is composed of a convolution Pooling module (a Convolutional Social Pooling module), an encoding module (an Encoder module) and a decoding module (a Decoder module).
The convolution pooling module (Convolutional Social Pooling module), abbreviated CSP module, processes the mask vector of the positional relationship between the target vehicle and each surrounding related vehicle at time i obtained in step S105 and the time-series track coding information f_nbrs of the surrounding vehicles obtained in step S107; these two parts of data are the input of the convolution pooling module;
specifically, the data processing principle of the convolution pooling module is as follows:
(1) Processing of the time-series track coding information f_nbrs of the surrounding vehicles obtained in step S107; taking some surrounding vehicle y_m as an example:
h_t^m = LSTM(h_{t−1}^m, f_nbrs^m; W_m),
where h_t^m ∈ R^{D_n} is the hidden state vector of the LSTM encoding module at time t, W_m denotes the parameters of the LSTM encoding module, D_n is the hidden-state dimension of the LSTM encoding module, and the LSTM encoding modules of all surrounding vehicles share the same parameters W_m;
(2) Processing of the mask vector of the positional relationship between the target vehicle and each surrounding related vehicle at time i obtained in step S105:
The position of each vehicle is determined from the mask vector, and the hidden state vectors h_i^m extracted in the previous step are filled in according to the spatial position of each vehicle, yielding a tensor of size D_n × 13 × 3; taking D_n = 64 as an example, each cell now holds a vector of fused temporal features. This tensor first passes through the convolution torch.nn.Conv2d(64, 64, 3), becoming 64 × 11 × 1; it then passes through torch.nn.Conv2d(64, 16, (3, 1)), becoming 16 × 9 × 1; finally, one pooling operation torch.nn.MaxPool2d((2, 1), padding=(1, 0)) yields a 16 × 5 × 1 tensor. After the tensor dimensions are flattened, the final output is obtained: the target-vehicle interaction tensor S_t of dimension D_s = 80;
The convolution pooling module provides the interaction information between the target vehicle and the surrounding vehicles and thereby assists the generation of the predicted track of the target vehicle; its role is therefore to improve the accuracy of the predicted future track of the vehicle.
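For illustration, the shape arithmetic of the convolution pooling pipeline above can be checked with the following sketch; the layer calls follow the text, while the wrapper class and the ReLU activations between layers are assumptions:

```python
import torch
import torch.nn as nn

class ConvSocialPooling(nn.Module):
    def __init__(self, d_n=64):
        super().__init__()
        self.conv1 = nn.Conv2d(d_n, 64, 3)               # 64x13x3 -> 64x11x1
        self.conv2 = nn.Conv2d(64, 16, (3, 1))           # -> 16x9x1
        self.pool = nn.MaxPool2d((2, 1), padding=(1, 0)) # -> 16x5x1

    def forward(self, social_tensor):                    # (B, 64, 13, 3)
        x = torch.relu(self.conv1(social_tensor))
        x = torch.relu(self.conv2(x))
        x = self.pool(x)
        return x.flatten(1)                              # (B, 80) = D_s

s_t = ConvSocialPooling()(torch.randn(1, 64, 13, 3))
print(s_t.shape)  # torch.Size([1, 80])
```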
The encoding module (Encoder module) is obtained by improving the encoder module of a self-attention Transformer network; specifically, an LSTM encoding module or a Fully Connected Layer module is added at the data output of the original encoder module. The new encoding module obtained by adding an LSTM encoding module at the output of the encoder module of the self-attention Transformer network is abbreviated TFL module; the new encoding module obtained by adding a Fully Connected Layer module at the output of the encoder module of the self-attention Transformer network is abbreviated TFF module;
specifically, the data processing principle of the encoding module is as follows:
the input data of the encoding module is the historical time series track information Traj of the target vehicle obtained in step S106 h And step S107, obtaining the historical time-series track coding information f of the target vehicle h (ii) a Wherein the content of the first and second substances,
obtaining the historical time-series track code information f of the target vehicle for step S107 h In other words, the characteristic dimension is f h =H×D k H is the total time number of the historical track points of the vehicle;
however, due to the historical time series track information f of the target vehicle h There is no information in the time dimension, so in order to enable the model to learn the characteristics in the time sequence, a position embedding (Positional Encoding) method is adopted to add the vehicle time sequence track vector and the position embedding vector in the historical time sequence track information Trajh of the target vehicle obtained in step S106; the specific formula is as follows:
PE(pos, 2i) = sin(pos / 10000^{2i/D_k}),
PE(pos, 2i+1) = cos(pos / 10000^{2i/D_k}),
where pos is the current time step and i indexes the feature dimension, ranging from 0 to 512;
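For illustration, a sketch of this standard sinusoidal position embedding with d_model = D_k = 512 (the function name is an assumption):

```python
import torch

def positional_encoding(seq_len, d_model=512):
    pos = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)  # (T, 1)
    i = torch.arange(0, d_model, 2, dtype=torch.float32)           # even dims
    angle = pos / torch.pow(10000.0, i / d_model)                  # (T, d/2)
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(angle)   # even feature dimensions
    pe[:, 1::2] = torch.cos(angle)   # odd feature dimensions
    return pe

# added elementwise to the 16 x 512 track coding of the target vehicle:
f_h_with_pos = torch.randn(16, 512) + positional_encoding(16)
```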
After the position embedding vectors are added, three matrices are obtained through three fully connected layers: the query matrix Q ∈ R^{H×D_k}, the key matrix K ∈ R^{H×D_k} and the value matrix V ∈ R^{H×D_k}, where H is the number of observed historical track samples and D_k is the feature vector dimension. A Softmax layer yields the attention weight of each point and is multiplied by V to obtain the final output; the dot products are divided by √D_k so that the gradient does not vanish during back-propagation. The attention is expressed as:
Attention(Q, K, V) = softmax(QK^T / √D_k) V,
namely, the attention of each time sampling point of the target vehicle to all times is obtained; the result then passes through a residual module and a Layer Normalization module. Layer Normalization is added so that, when the data pass through the activation function, they do not fall into its saturation region and slow down network learning; that is, the data are normalized so that the input has zero mean and unit variance. The expression of Layer Normalization is:
x̂ = (x − μ_B) / √(σ_B² + ε),
where μ_B is the mean and σ_B² the variance computed from the data, and ε is a small constant for numerical stability. The output and the input are then connected together through the residual module. The FFN module comprises two fully connected layers followed by dropout to prevent overfitting, and the feature dimension of the target vehicle remains H × D_k; meanwhile, to fuse further along the time dimension, an LSTM encoding module or a fully connected layer is added at the data output of the original encoder module, and the final output feature V_t of dimension D_k is obtained.
The decoding module (Decoder module) is a data-processing module redesigned from existing operation modules; specifically, it consists of a first Fully Connected Layer module, a LayerNorm module, a ReLU module, a second Fully Connected Layer module and a third Fully Connected Layer module connected in sequence, with the output of the convolution pooling module and the output of the encoder module connected to the first Fully Connected Layer module through a vector fusion module; that is, the input of the decoding module is the vector fusion of the output data of the convolution pooling module and the output data of the encoding module; wherein,
the vector fusion module is realized by splicing (concatenating) the two vectors; the fused feature vector is expressed as:
e_t = V_t ‖ S_t, e_t ∈ R^{D_k + D_s},
where ‖ denotes the concatenation operation and R the real number space; the fused feature vector represents the fused spatio-temporal features at the current time.
The decoding module (also abbreviated MLP module) predicts the motion track of the target vehicle; its input is the result of the vector fusion of the interaction tensor S_t output by the convolution pooling module and the feature V_t output by the encoding module, and its output is the prediction result;
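For illustration, a sketch of the decoding module as described (FC, LayerNorm, ReLU, FC, FC on the spliced vector); the hidden width and the output layout of 25 future (x, y) points are assumptions consistent with the embodiment:

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    def __init__(self, d_in=512 + 80, hidden=256, n_pred=25):
        super().__init__()
        self.fc1 = nn.Linear(d_in, hidden)       # first Fully Connected Layer
        self.norm = nn.LayerNorm(hidden)         # LayerNorm module
        self.fc2 = nn.Linear(hidden, hidden)     # second Fully Connected Layer
        self.fc3 = nn.Linear(hidden, n_pred * 2) # third: (x, y) per future point

    def forward(self, v_t, s_t):
        e_t = torch.cat([v_t, s_t], dim=-1)      # vector fusion by splicing
        h = torch.relu(self.norm(self.fc1(e_t))) # ReLU module
        return self.fc3(self.fc2(h)).view(-1, 25, 2)

traj = Decoder()(torch.randn(1, 512), torch.randn(1, 80))  # -> (1, 25, 2)
```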
based on this, the vehicle trajectory prediction model of the present application may specifically adopt: 1) A TFF-MLP-CSP model composed of a CSP module, a TFF module and an MLP module, or 2) a TFL-MLP-CSP model composed of a CSP module, a TFL module and an MLP module.
S3, utilizing a data set obtained by processing the vehicle track database constructed in the step S1 and used for deep learning to train the vehicle track prediction model constructed in the step S2;
specifically, the specific implementation steps of step S3 are:
S301, randomly dividing the plurality of groups of vehicle track data in the deep-learning vehicle track database constructed in step S1 into a training set, a verification set and a test set according to the proportion of 7;
S302, for each group of vehicle track data: the mask vector of the positional relationship between the target vehicle and each surrounding related vehicle at time i obtained in step S105, and the time-series track coding information f_nbrs of the surrounding vehicles within the time range from i − t_h to time i obtained in step S107, are the input data fed into the convolution pooling module of the vehicle track prediction model; the historical time-series track coding information f_h of the target vehicle within the time range from i − t_h to time i obtained in step S107 is the input data fed into the encoding module of the vehicle track prediction model; and the future time-series track information Traj_pred of the target vehicle from time i to time i + t_pred obtained in step S106 is the output data of the decoding module of the vehicle track prediction model;
Based on the above, in the training process, the input data of any group of vehicle track data are fed into the vehicle track prediction model constructed in step S2 as model inputs, the output data serve as the model's target outputs, and the loss value of the current error is calculated; the network parameters are then optimized by error back-propagation according to the current loss value until the training of the vehicle track prediction model is completed;
In the training process, the loss function adopts the root mean square error (RMSE), expressed as:
RMSE = √( (1/N) Σ_{t=1}^{N} [ (x_t^pred − x_t^true)² + (y_t^pred − y_t^true)² ] ),
where N is the number of predicted position points; the parameter learning rate is set to the fixed value lr = 0.0001;
s4, adopting the same method as the step S1Obtaining: information (1) mask vectors of the position relations between the new target vehicle and each surrounding related vehicle at the current i moment; information (2) surrounding relevant vehicles at i-t h Time sequence track coding information f of other vehicles around within corresponding time range from moment to moment i nbrs (ii) a Information (3) New target vehicle at i-t h Historical time sequence track coding information f of new target vehicle in corresponding time range from time to time i h (ii) a Then, the information (1) and the information (2) are introduced into a convolution pooling module of a vehicle track prediction model, and the information (3) is introduced into a coding module of the vehicle track prediction model; after operation, the output of the decoding module is the new target vehicle from the moment i to the moment i + i pred Future time sequence track information Traj of target vehicle in time-corresponding time range h‘’
The number of predicted target-vehicle position points in this embodiment is N = 25, i.e., the vehicle track information within the next 5 seconds: the position coordinates (x, y) of 25 position samples.
To evaluate the prediction accuracy of this deep-learning-based vehicle track prediction method, three indexes are used for measurement: the average displacement error (ADE), the farthest displacement error (FDE) and the root mean square error (RMSE).
The three indexes are calculated as follows:
ADE = (1/(N_s·T_pred)) Σ_{n=1}^{N_s} Σ_{t=1}^{T_pred} ‖p̂_t^n − p_t^n‖,
FDE = (1/N_s) Σ_{n=1}^{N_s} ‖p̂_t^n − p_t^n‖ with t = T_pred,
RMSE = √( (1/(N_s·T_pred)) Σ_{n=1}^{N_s} Σ_{t=1}^{T_pred} ‖p̂_t^n − p_t^n‖² ),
where p̂_t^n and p_t^n are the predicted and true positions of test sample n at time t, N_s is the number of test samples and T_pred is the prediction horizon.
the essence of the three indexes is to measure the error between the predicted value and the true value, so that the smaller the numerical value corresponding to the calculation result is, the higher the precision is.
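For illustration, the three indexes can be computed as in the following sketch for batched position tensors of shape (N_s, T_pred, 2); the function and variable names are assumptions:

```python
import torch

def ade(pred, true):
    return (pred - true).norm(dim=-1).mean()          # mean over all points

def fde(pred, true):
    return (pred - true)[:, -1].norm(dim=-1).mean()   # error at t = T_pred

def rmse(pred, true):
    return ((pred - true) ** 2).sum(-1).mean().sqrt()

pred, true = torch.randn(100, 25, 2), torch.randn(100, 25, 2)
print(ade(pred, true), fde(pred, true), rmse(pred, true))
```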
Meanwhile, for ease of comparison, three existing prediction models are selected as comparison models when evaluating the accuracy of this prediction method, and their accuracy is evaluated in the same way, so as to show the prediction effect of the method of the present application.
The details of the model involved in the prediction accuracy evaluation are shown in table 1 below.
Table 1:
(Table 1 is given as an image in the original publication; it lists the details of the compared models.)
The deep-learning vehicle track database constructed in step S1 of this embodiment is substituted into each of the four models, and model training is completed in the training manner of step S3 of this embodiment; the database is then called again to obtain historical track data of a target vehicle and its surrounding related vehicles; the track data of the first three seconds are used as historical data and substituted into each trained model, and the track data of the last five seconds are used as the future ground truth; the prediction accuracy of each model is computed from the prediction results and the real data, as shown in table 2 below.
Table 2:
(Table 2 is given as an image in the original publication.)
In table 2 above, the unit of every figure is m. Based on the principle that, for all three indexes (ADE, FDE and RMSE), a smaller value means better prediction accuracy of the corresponding model, the track prediction results of the method of the present application outperform the other models in overall accuracy for position predictions 1 s to 5 s ahead, especially in long-horizon prediction: at the 5th second, compared with Transformer SE, the present method improves the ADE index by 27.9%, the FDE index by 31.3% and the RMSE index by 28.2%.
Meanwhile, for further comparison, the applicant also evaluated, in the same way, the prediction accuracy of the four models involved in the preferred model construction process of the present application, namely: (1) a TFF-MLP model consisting only of a TFF module and an MLP module; (2) a TFL-MLP model consisting only of a TFL module and an MLP module; (3) a TFF-MLP-CSP model consisting of a CSP module, a TFF module and an MLP module; and (4) a TFL-MLP-CSP model consisting of a CSP module, a TFL module and an MLP module.
The results of the prediction accuracy evaluation obtained by the above prediction experiments for the four models are shown in table 3 below.
Table 3:
(Table 3 is given as an image in the original publication.)
In table 3, the unit of every figure is m. The evaluation results show that, when the surrounding-vehicle input is used without the convolution pooling module, the model consisting of the TFF module and the MLP module yields, after identical training, more accurate vehicle track predictions than the model consisting of the TFL module and the MLP module. When a convolution pooling (CSP) module is further added to extract the interaction information between the surrounding vehicles and the target vehicle, the prediction accuracy improves again: the TFL-MLP-CSP model predicts more accurately than the TFL-MLP model, and the TFF-MLP-CSP model more accurately than the TFF-MLP model. The vehicle track prediction model constructed in the present application therefore takes into account, during construction and training, both the spatio-temporal information in vehicle prediction (including the change of the vehicle state along the time dimension) and the interaction information between the vehicle and the other surrounding vehicles, so that the trained model learns the change of the vehicle motion state while also accounting for the influence of surrounding vehicles, and thus predicts the vehicle track accurately. The vehicle track prediction method also has broad application prospects and suits a variety of scenes, with potential market and application value. In the present application, the TFF-MLP-CSP model is the preferred vehicle track prediction model; after training, it yields the track prediction closest to the real vehicle track.

Claims (6)

1. A vehicle track prediction method based on deep learning is characterized by comprising the following steps:
S1, constructing a vehicle track database for deep learning, the database comprising a plurality of groups of vehicle track data; each group of vehicle track data is obtained by the following steps:
S101, acquiring historical track information of a plurality of running vehicles on a plurality of lanes by means of database calling or real-time data acquisition;
S102, setting a target vehicle x, and determining the related vehicles around the target vehicle x according to the position information of the target vehicle x at the current time i; wherein the surrounding related vehicles of the target vehicle x are defined as: based on the driving lane of the target vehicle x at the current time i, all vehicles whose distance from the target vehicle x is less than or equal to L, on the driving lane of the target vehicle x and on the adjacent lanes to its left and right, are the surrounding related vehicles y = {y_1, y_2, …, y_{n−1}, y_n};
S103, acquiring the historical positions of the target vehicle x and of the surrounding related vehicles y at the current time i and at the historical times within t_h seconds before time i, as the known historical position information; acquiring the positions of the target vehicle x within the time range from time i to time i + t_pred, as the future position information of the target vehicle;
S104, establishing a vehicle coordinate system with the position of the target vehicle x at the current time i as the origin, the road-width direction as the x axis and the vehicle travel direction as the y axis; then normalizing all the position information of the target vehicle x and the surrounding related vehicles y obtained in step S103, i.e., expressing the position coordinates of every vehicle at every time uniformly in this vehicle coordinate system;
S105, rasterizing the road range determined when the surrounding related vehicles of the target vehicle x were acquired in step S102, with the lane width as the grid width and the body length of the target vehicle x as the grid length, so that the target vehicle and each surrounding related vehicle are each located in one grid cell; then mapping the target vehicle x and all the surrounding related vehicles at time i into the corresponding grid cells and representing them by a mask vector: the grid position is assigned 1 when a vehicle is in the cell and 0 when no vehicle is in the cell;
S106, based on the data obtained in step S104, obtaining:
1) Historical time-series track information Traj_h of the target vehicle, expressed as: Traj_h = {vehicleid, (p_0, p_1, …, p_i)};
2) Future time-series track information Traj_pred of the target vehicle, expressed as: Traj_pred = {vehicleid, (p_{i+1}, p_{i+2}, …, p_{i+t_pred})};
3) Historical time-series track information Traj_nbrs of any surrounding related vehicle, expressed as: Traj_nbrs = {vehicleid, (p_0, p_1, …, p_i)};
where vehicleid represents the ID number of the vehicle; p_i is the position information of the vehicle at time i, p_i = {x_i, y_i, t_i, laneid}; t_i denotes time i; x_i is the abscissa and y_i the ordinate of the vehicle at time i; and laneid identifies the driving lane of the vehicle at time i;
S107, performing code mapping on the historical time-series track information Traj_h of the target vehicle and the historical time-series track information Traj_nbrs of the surrounding related vehicles obtained in step S106, respectively, to obtain the historical time-series track coding information f_h of the target vehicle and the time-series track coding information f_nbrs of the surrounding vehicles; wherein,
(1) For the target vehicle: based on the coordinates (x_i, y_i) of the target vehicle at time i, the high-dimensional vector is expressed as:
f_hi = F(x_i, y_i; W_loc, B_loc),
where F is a fully connected neural network layer, W_loc ∈ R^{2×D_k} represents the weight parameters of the network and B_loc ∈ R^{D_k} represents its bias parameters, R denoting the real space and D_k the spatial feature dimension;
(2) For the other surrounding vehicles: based on the position (x_i, y_i) of any surrounding vehicle at time i, the high-dimensional vector is expressed as:
f_nbrsi = F(x_i, y_i; W_loc, B_loc),
where F is a fully connected neural network layer, W_loc ∈ R^{2×D_m} represents the weight parameters of the network and B_loc ∈ R^{D_m} represents its bias parameters, R denoting the real space and D_m the spatial feature dimension;
S2, constructing a vehicle track prediction model based on a self-attention Transformer network, the model consisting of a convolution pooling module, an encoding module and a decoding module; the encoding module adopts the encoder of a self-attention Transformer network, with an LSTM encoding module or a Fully Connected Layer module added at its output; the decoding module consists of a first Fully Connected Layer module, a LayerNorm module, a ReLU module, a second Fully Connected Layer module and a third Fully Connected Layer module connected in sequence; the output of the convolution pooling module and the output of the encoder module are connected to the first Fully Connected Layer module through a vector fusion module, which fuses the two vectors by splicing;
S3, training the vehicle track prediction model constructed in step S2 with the data set obtained by processing the deep-learning vehicle track database constructed in step S1;
wherein, in the training process, for each group of vehicle track data: 1) the mask vector of the positional relationship between the target vehicle and each surrounding related vehicle at time i obtained in step S105, and the time-series track coding information f_nbrs of the surrounding vehicles within the time range from i − t_h to time i obtained in step S107, are taken as input data and fed into the convolution pooling module of the vehicle track prediction model; 2) the historical time-series track coding information f_h of the target vehicle within the time range from i − t_h to time i obtained in step S107 is taken as input data and fed into the encoding module of the vehicle track prediction model; 3) the future time-series track information Traj_pred of the target vehicle from time i to time i + t_pred obtained in step S106 is taken as the output data of the decoding module;
S4, obtaining, in the same way as in step S1: information (1), the mask vector of the positional relationship between the new target vehicle and each surrounding related vehicle at the current time i; information (2), the time-series track coding information f_nbrs of the surrounding vehicles within the time range from i − t_h to time i; information (3), the historical time-series track coding information f_h of the new target vehicle within the time range from i − t_h to time i; then feeding information (1) and information (2) into the convolution pooling module of the vehicle track prediction model and information (3) into the encoding module of the vehicle track prediction model; after the model runs, the output of the decoding module is the future time-series track information Traj_pred of the new target vehicle from time i to time i + t_pred.
2. The vehicle track prediction method based on deep learning of claim 1, wherein in step S102, a value range of L is 18.4m to 36.8m.
3. The deep learning-based vehicle track prediction method according to claim 1, wherein in step S103, t_h ranges from 2 s to 5 s, t_pred ranges from 3 s to 5 s, and the vehicle position information is sampled once every 0.2 s.
4. The deep learning-based vehicle track prediction method according to claim 1, wherein in step S107, D_k = 512 and D_m = 32.
5. The vehicle track prediction method based on deep learning of claim 1, wherein in step S3, the detailed implementation steps are as follows:
S301, randomly dividing the plurality of groups of vehicle track data in the deep-learning vehicle track database constructed in step S1 into a training set, a verification set and a test set according to the proportion of 7;
S302, inputting the input data of the vehicle track data as the model's input, taking the output data as the model's target output, feeding them into the vehicle track prediction model constructed in step S2, and calculating the loss value of the current error; then optimizing the network parameters by error back-propagation according to the current loss value until training is finished.
6. The vehicle track prediction method based on deep learning of claim 1, wherein in step S302, the loss function adopts the root mean square error RMSE, and the parameter learning rate is set to the fixed value lr = 0.0001.
CN202210791760.3A 2022-07-05 2022-07-05 Vehicle track prediction method based on deep learning Pending CN115293237A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210791760.3A CN115293237A (en) 2022-07-05 2022-07-05 Vehicle track prediction method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210791760.3A CN115293237A (en) 2022-07-05 2022-07-05 Vehicle track prediction method based on deep learning

Publications (1)

Publication Number Publication Date
CN115293237A true CN115293237A (en) 2022-11-04

Family

ID=83821765

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210791760.3A Pending CN115293237A (en) 2022-07-05 2022-07-05 Vehicle track prediction method based on deep learning

Country Status (1)

Country Link
CN (1) CN115293237A (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116558541A (en) * 2023-07-11 2023-08-08 新石器慧通(北京)科技有限公司 Model training method and device, and track prediction method and device
CN116558541B (en) * 2023-07-11 2023-09-22 新石器慧通(北京)科技有限公司 Model training method and device, and track prediction method and device
CN117314592A (en) * 2023-11-30 2023-12-29 南京欧思其软件服务有限公司 Vehicle behavior prediction method based on deep learning
CN117314592B (en) * 2023-11-30 2024-02-20 南京欧思其软件服务有限公司 Vehicle behavior prediction method based on deep learning

Similar Documents

Publication Publication Date Title
CN111400620B (en) User trajectory position prediction method based on space-time embedded Self-orientation
Zhang et al. Short-term prediction of passenger demand in multi-zone level: Temporal convolutional neural network with multi-task learning
CN115293237A (en) Vehicle track prediction method based on deep learning
CN112734808B (en) Trajectory prediction method for vulnerable road users in vehicle driving environment
CN115829171B (en) Pedestrian track prediction method combining space-time information and social interaction characteristics
CN113335291B (en) Man-machine driving-sharing control right decision method based on man-vehicle risk state
CN114372116B (en) Vehicle track prediction method based on LSTM and space-time attention mechanism
CN112465273B (en) Unmanned vehicle track prediction method based on local attention mechanism
CN113076599A (en) Multimode vehicle trajectory prediction method based on long-time and short-time memory network
CN115510174A (en) Road network pixelation-based Wasserstein generation countermeasure flow data interpolation method
CN114399743A (en) Method for generating future track of obstacle
CN115049009A (en) Track prediction method based on semantic fusion representation
CN115042798A (en) Traffic participant future trajectory prediction method and system, and storage medium
CN116503446A (en) Multi-mode vehicle track prediction method for target driving and distribution thermodynamic diagram output
CN115827335A (en) Time sequence data missing interpolation system and method based on modal crossing method
CN114445465A (en) Track prediction method based on fusion inverse reinforcement learning
CN113554060A (en) DTW-fused LSTM neural network trajectory prediction method
CN114898550B (en) Pedestrian track prediction method and system
CN115700509A (en) Guideline generation method and system based on simulation feedback data
CN115018134A (en) Pedestrian trajectory prediction method based on three-scale spatiotemporal information
CN114312831B (en) Vehicle track prediction method based on spatial attention mechanism
Chen et al. Spatial-temporal attention networks for vehicle trajectory prediction
CN117333847B (en) Track prediction method and system based on vehicle behavior recognition
CN117408406A (en) Planning coupling multi-target vehicle track prediction method and system
Barua et al. PTIN: Enriching Pedestrian Safety with an LSTM-GRU-Transformer Based Trajectory Imputation System for Autonomous Vehicles

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination