CN108803617B - Trajectory prediction method and apparatus - Google Patents

Trajectory prediction method and apparatus

Info

Publication number
CN108803617B
CN108803617B (application CN201810752554.5A)
Authority
CN
China
Prior art keywords
vehicle
information
track
video sequence
trajectory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810752554.5A
Other languages
Chinese (zh)
Other versions
CN108803617A (en)
Inventor
邹文斌
周长源
吴迪
王振楠
唐毅
李霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen University
Original Assignee
Shenzhen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen University
Priority to CN201810752554.5A
Publication of CN108803617A
Application granted
Publication of CN108803617B
Legal status: Active

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/02 Control of position or course in two dimensions
    • G05D1/021 Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0231 Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means
    • G05D1/0246 Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means using a video camera in combination with image processing means
    • G05D1/0251 Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means using a video camera in combination with image processing means extracting 3D information from a plurality of images taken from different locations, e.g. stereo vision
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/02 Control of position or course in two dimensions
    • G05D1/021 Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0212 Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory
    • G05D1/0221 Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory involving a learning process
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/02 Control of position or course in two dimensions
    • G05D1/021 Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0276 Control of position or course in two dimensions specially adapted to land vehicles using signals provided by a source external to the vehicle

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Electromagnetism (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The embodiment of the invention provides a trajectory prediction method and apparatus, relating to the field of local navigation for robots and intelligent vehicles and applied to vehicles equipped with a vehicle-mounted camera. The method comprises the following steps: photographing the surrounding environment with the vehicle-mounted camera to obtain a video sequence including surrounding vehicles and the vehicle background; locating the surrounding vehicles in the video sequence, extracting their historical trajectory information, and taking scene semantic information obtained by image segmentation of the video sequence as auxiliary information; and inputting the historical trajectory information and the auxiliary information into a neural network model to obtain the predicted trajectories of the surrounding vehicles. The trajectory prediction method can improve the accuracy of vehicle trajectory prediction.

Description

Trajectory prediction method and apparatus
Technical Field
The invention relates to the field of local navigation of robots and intelligent vehicles, and in particular to a trajectory prediction method and apparatus.
Background
During vehicle travel, predicting the future trajectories of other traffic participants is important for preventing an autonomous vehicle from colliding with other vehicles. Under the assumptions that all traffic participants obey traffic regulations and that human drivers can subconsciously anticipate a target's future trajectory, modeling methods are typically employed to predict the future trajectories of other traffic participants for autonomous vehicles.
However, most current work either extracts visual semantic information from static images or learns a driving network with an end-to-end architecture. The former ignores the temporal continuity of the driving situation, and the latter lacks interpretability of the trained network, which leads to low accuracy in predicting vehicle trajectories.
Disclosure of Invention
The invention mainly aims to provide a trajectory prediction method and apparatus that can improve the accuracy of vehicle trajectory prediction.
The track prediction method provided by the first aspect of the embodiment of the invention is applied to a vehicle provided with a vehicle-mounted camera, and comprises the following steps: shooting the surrounding environment by using a vehicle-mounted camera to obtain a video sequence comprising surrounding vehicles and a vehicle background; positioning the surrounding vehicles from the video sequence, extracting historical track information of the surrounding vehicles, and taking scene semantic information obtained by image segmentation of the video sequence as auxiliary information; and inputting the historical track information and the auxiliary information into a neural network model to obtain the predicted track of the surrounding vehicle.
A trajectory prediction apparatus provided in a second aspect of an embodiment of the present invention is applied to a vehicle provided with a vehicle-mounted camera, and includes: the acquisition module is used for shooting the surrounding environment by utilizing the vehicle-mounted camera to acquire a video sequence comprising surrounding vehicles and a vehicle background; the extraction and segmentation module is used for positioning the surrounding vehicles from the video sequence, extracting historical track information of the surrounding vehicles, and taking scene semantic information obtained by image segmentation of the video sequence as auxiliary information; and the output module is used for inputting the historical track information and the auxiliary information into a neural network model to obtain the predicted track of the surrounding vehicle.
In this embodiment, a video sequence including surrounding vehicles and the vehicle background is acquired by the vehicle-mounted camera, scene semantic information is obtained by image segmentation of the video sequence, and the scene semantic information and the historical trajectory information are then input into the neural network model to obtain the predicted trajectory, instead of analyzing scene semantic information extracted from a static image. This preserves the temporal continuity of the neural network model in this embodiment and improves the accuracy of the predicted vehicle trajectory.
Drawings
Fig. 1 is a schematic flow chart illustrating an implementation of a trajectory prediction method according to a first embodiment of the present invention;
FIG. 2 is a schematic flow chart of an implementation of a trajectory prediction method according to a second embodiment of the present invention;
FIG. 3 is a diagram of a neural network model of a trajectory prediction method according to a second embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating an application of a trajectory prediction method according to a second embodiment of the present invention;
fig. 5 is a schematic structural diagram of a trajectory prediction apparatus according to a third embodiment of the present invention.
Detailed Description
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a schematic diagram illustrating a flow chart of a track prediction method according to a first embodiment of the present invention, where the method is applied to a vehicle with a vehicle-mounted camera. As shown in fig. 1, the trajectory prediction method mainly includes the following steps:
101. and shooting the surrounding environment by using the vehicle-mounted camera to obtain a video sequence comprising surrounding vehicles and a vehicle background.
Specifically, during automatic driving, and assuming that all traffic participants obey the traffic rules, a modeling method is adopted to predict the future trajectories of other traffic participants. Building the model requires information about the surrounding environment, so the surrounding environment is photographed with the vehicle-mounted camera to obtain a video sequence including the surrounding vehicles and the vehicle background. The frame rate of the video sequence can be selected according to the actual situation. Surrounding vehicles are vehicles within a certain range of the camera-equipped vehicle that may potentially influence it; the range may be, for example, 30 meters around the camera-equipped vehicle. A capture sketch is given below.
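As an illustration only, the capture step above might be sketched with OpenCV as follows; the camera index, window length, and hand-off point are assumptions, not values fixed by this embodiment.

```python
import cv2
from collections import deque

SEQ_LEN = 20  # frames kept per window; an assumed value, chosen per the frame rate

cap = cv2.VideoCapture(0)        # index 0 stands in for the vehicle-mounted camera
frames = deque(maxlen=SEQ_LEN)   # sliding window over the most recent frames

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    frames.append(frame)         # one frame: surrounding vehicles plus background
    if len(frames) == SEQ_LEN:
        break                    # hand the full window to localization / segmentation
cap.release()
```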
102. Locate the surrounding vehicles in the video sequence, extract their historical track information, and take scene semantic information obtained by image segmentation of the video sequence as auxiliary information.
Specifically, motion in a video sequence is an artifact produced by displaying frames in quick succession; each frame is a static image. The surrounding vehicles are located in each frame, and their track information can be observed across consecutive frames. Therefore, for the current frame, the historical track information of the surrounding vehicles is obtained from the past several frames, as sketched below.
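A minimal sketch, assuming per-frame detection and tracking already yield an (x, y, w, h) bounding box per surrounding vehicle; the history length and the data layout are illustrative assumptions.

```python
from collections import deque
import numpy as np

HIST_LEN = 8       # number of past frames used as history; an assumed value

track_history = {}  # vehicle id -> deque of per-frame bounding boxes

def update_history(vehicle_id: int, box_xywh: tuple) -> np.ndarray:
    """Append this frame's (x, y, w, h) box; return the stacked history so far."""
    buf = track_history.setdefault(vehicle_id, deque(maxlen=HIST_LEN))
    buf.append(np.asarray(box_xywh, dtype=np.float32))
    return np.stack(buf)          # (n_frames, 4) historical track for this vehicle
```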
The scene semantic information obtained by image segmentation of each frame is used as auxiliary information. Image segmentation means that objects in each frame are segmented according to semantic categories and labeled with scene semantic information, such as pedestrians, surrounding vehicles, buildings, sky, vegetation, road barriers, lane lines, road sign information, and traffic signal information, so as to identify the drivable area in the current frame. Using scene semantic information as auxiliary information gives the method a certain robustness to appearance changes of the target.
Optionally, since regions corresponding to different semantic categories are distinct feature regions whose boundaries are edges, each frame may be segmented using edge detection to extract the desired target. An edge marks the end of one feature region and the beginning of another: the internal features or attributes of the desired target, such as gray scale, color, or texture, are consistent within the target but differ from those of other feature regions. A sketch follows.
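For illustration only, the optional edge-detection step could look like the following OpenCV sketch; Canny is one possible detector, and the thresholds are assumed values.

```python
import cv2

def edge_map(frame_bgr, low_thresh=50, high_thresh=150):
    """Boundary map of one frame: edges mark where one feature region ends
    and another begins, helping isolate the desired target."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    return cv2.Canny(gray, low_thresh, high_thresh)
```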
103. Input the historical track information and the auxiliary information into a neural network model to obtain the predicted trajectories of the surrounding vehicles.
Specifically, a neural network is a complex network system formed by a large number of simple, widely interconnected neurons. It is a highly complex nonlinear dynamical learning system with large-scale parallelism, distributed storage and processing, and self-organizing, self-adaptive, and self-learning capabilities. A neural network model is therefore obtained by building a mathematical model with a neural network, and the obtained historical track information and auxiliary information are input into this model to obtain the predicted trajectories of the surrounding vehicles.
In this embodiment of the invention, a video sequence including surrounding vehicles and the vehicle background is obtained by the vehicle-mounted camera, scene semantic information is obtained by image segmentation of the video sequence, and the scene semantic information and historical track information are then input into the neural network model to obtain the predicted trajectory, instead of analyzing scene semantic information extracted from a static image. This preserves the temporal continuity of the neural network model in this embodiment and improves the accuracy of the predicted vehicle trajectory.
Referring to fig. 2, fig. 2 is a schematic diagram of a flow chart of a track prediction method according to a second embodiment of the present invention, where the method is applied to a vehicle with a vehicle-mounted camera. As shown in fig. 2, the trajectory prediction method mainly includes the following steps:
201. and shooting the surrounding environment by using the vehicle-mounted camera to obtain a video sequence comprising surrounding vehicles and a vehicle background.
202. Locate the surrounding vehicles in the video sequence, extract their historical track information, and take scene semantic information obtained by image segmentation of the video sequence as auxiliary information.
203. Input the auxiliary information into the convolutional neural network to obtain spatial feature information.
Specifically, the neural network model comprises a convolutional neural network, a first-layer long short-term memory network, a second-layer long short-term memory network, and a fully connected layer.
A convolutional neural network is a kind of feedforward neural network. The video sequence is image-segmented and annotated to obtain the scene semantic information, which serves as auxiliary information and is input into the convolutional neural network to obtain the spatial feature information. The auxiliary information is image information and may be one-hot encoded, with the number of channels equal to the number of semantic categories. The encoded auxiliary information is input into a four-layer convolutional neural network, whose convolution kernel may be 3 x 4, to obtain the spatial feature information, which is expressed as a 6-dimensional vector.
As shown in fig. 3, the convolutional neural network includes convolutional layers, linear rectification units, pooling layers, and a Dropout layer. The convolutional layers extract features from the auxiliary information. The linear rectification units introduce non-linearity. The pooling layers compress the input auxiliary information and extract its main features. The Dropout layer may be used to alleviate over-fitting. A sketch of this branch is given below.
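A minimal PyTorch sketch of this convolutional branch under the description above (one-hot channels equal to the semantic category count, four convolutional layers with a 3 x 4 kernel, rectification, pooling, Dropout, 6-dimensional output); the channel widths, dropout rate, and pooling sizes are assumptions not given in the embodiment.

```python
import torch
import torch.nn as nn

def one_hot_semantics(label_map: torch.Tensor, num_classes: int) -> torch.Tensor:
    """(B, H, W) integer class labels -> (B, num_classes, H, W) one-hot channels."""
    return nn.functional.one_hot(label_map, num_classes).permute(0, 3, 1, 2).float()

class SpatialCNN(nn.Module):
    """Four-layer convolutional branch: one-hot semantic map -> 6-D spatial feature."""
    def __init__(self, num_classes: int):
        super().__init__()
        widths = [num_classes, 16, 32, 64, 64]   # channel widths are assumptions
        blocks = []
        for cin, cout in zip(widths[:-1], widths[1:]):
            blocks += [
                nn.Conv2d(cin, cout, kernel_size=(3, 4), padding=1),  # 3x4 kernel per the text
                nn.ReLU(inplace=True),    # linear rectification unit
                nn.MaxPool2d(2),          # pooling compresses input, keeps main features
                nn.Dropout2d(p=0.25),     # Dropout layer alleviates over-fitting
            ]
        self.features = nn.Sequential(*blocks)
        self.head = nn.LazyLinear(6)      # 6-dimensional spatial feature vector

    def forward(self, seg_onehot: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(seg_onehot).flatten(1))
```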
204. Input the historical track information into the first-layer long short-term memory network to obtain temporal feature information, then input the spatial feature information and the temporal feature information into the second-layer long short-term memory network to obtain joint feature information.
Specifically, a Long Short-Term Memory (LSTM) network is a time-recursive network. The historical track information has a temporal order, and its positions are contextually correlated; that is, as a sequence input it requires learning of position features across preceding and following frames. The LSTM network is therefore used to train on the historical track information, connecting the track information of historical frames to estimate the track information of the current frame.
As shown in fig. 3, the historical track information is input into the first-layer LSTM network to obtain temporal feature information, and the temporal feature information together with the spatial feature information obtained in step 203 is input into the second-layer LSTM network to obtain joint feature information. Because the dimension of the three-dimensional occupancy grid is 6, the first-layer LSTM network not only learns the temporal feature information but also makes its dimension consistent with that of the spatial feature information. In practical applications, the first-layer LSTM network may have 100 units, and the second layer may consist of two stacked LSTM layers of 300 units each.
205. Input the joint feature information into a fully connected layer to obtain the predicted trajectory.
Specifically, each node of the fully connected layer is connected to all nodes of the previous layer and integrates all the features extracted by the previous layers. The joint feature information is therefore input into the fully connected layer, and a series of matrix multiplications yields the output of the neural network model: the predicted trajectory J over T time steps. In practical applications, the prediction horizon may be 1.6 s. A sketch of steps 204 and 205 follows.
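Putting steps 204 and 205 together, a minimal PyTorch sketch of the two LSTM layers and the fully connected output head, with the unit counts quoted above; how the per-frame temporal and spatial features are merged (here, concatenation after a linear alignment to 6 dimensions) is an assumption.

```python
import torch
import torch.nn as nn

class SegLSTMHead(nn.Module):
    """History (B, t, 6) + spatial feature (B, 6) -> (B, T, 6) predicted trajectory."""
    def __init__(self, pred_steps: int):
        super().__init__()
        self.temporal = nn.LSTM(input_size=6, hidden_size=100,
                                batch_first=True)      # first layer, 100 units
        self.align = nn.Linear(100, 6)   # match the 6-D spatial feature dimension
        self.joint = nn.LSTM(input_size=12, hidden_size=300,
                             num_layers=2, batch_first=True)  # two stacked 300-unit layers
        self.fc = nn.Linear(300, pred_steps * 6)       # fully connected output layer

    def forward(self, history: torch.Tensor, spatial: torch.Tensor) -> torch.Tensor:
        h, _ = self.temporal(history)                  # temporal feature information
        h = self.align(h)                              # (B, t, 6)
        s = spatial.unsqueeze(1).expand_as(h)          # repeat spatial feature per frame
        j, _ = self.joint(torch.cat([h, s], dim=-1))   # joint feature information
        out = self.fc(j[:, -1])                        # last step through the FC layer
        return out.view(history.size(0), -1, 6)        # T future steps, 6 values each
```

For example, SegLSTMHead(pred_steps=16) would cover a 1.6 s horizon at an assumed 10 frames per second.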
The neural network model comprises the following formulas:

J ← M_p(h, a): H × A;

H_p = {h_p^1, h_p^2, ..., h_p^t};

A_p = {a_p^1, a_p^2, ..., a_p^t};

J_p = {j_p^(t+1), j_p^(t+2), ..., j_p^(t+T)};

where J denotes the predicted trajectory; M denotes the mapping between H × A and J; H denotes the historical track information; A denotes the auxiliary information; p denotes a surrounding vehicle; h_p^t denotes the position information of vehicle p in the t-th frame of the video sequence; a_p^t denotes the scene semantic information of vehicle p in the t-th frame; j denotes the predicted position information of vehicle p in the frames from t+1 to t+T; and t indexes the frames.
As shown in fig. 3, this embodiment proposes an image-Segmentation Long Short-Term Memory network (SEG-LSTM) to fuse multiple streams of historical frames and predict the future trajectories of surrounding vehicles.
The number of LSTM layers, the number of units in each LSTM layer, the number of convolutional layers, and the convolution kernel size are network hyper-parameters determined through cross-validation. Cross-validation determines the optimal hyper-parameters while avoiding model over-fitting. Illustratively, the data set is first divided into a training set and a test set at a ratio of 5:1. The training set is then divided into 5 parts; each part in turn serves as the validation set while the other 4 parts serve as the training set, giving 5 rounds of training and validation. The average accuracy obtained with each hyper-parameter setting is compared, and the setting with the best result is selected, as sketched below.
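A minimal sketch of this selection procedure using scikit-learn's KFold; the hyper-parameter candidates and the train_and_score callback are placeholders for the model-specific training loop.

```python
import numpy as np
from sklearn.model_selection import KFold, train_test_split

def select_hyperparams(X, y, candidates, train_and_score):
    """Return the candidate hyper-parameter set with the best mean validation score."""
    # 5:1 split into training and test sets, as in the text.
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=1 / 6, shuffle=True)
    best_hp, best_score = None, -np.inf
    for hp in candidates:                  # e.g. LSTM depth/units, conv layers, kernel size
        scores = []
        for tr, val in KFold(n_splits=5).split(X_tr):   # 5-fold rotation over the training set
            scores.append(train_and_score(X_tr[tr], y_tr[tr], X_tr[val], y_tr[val], hp))
        if np.mean(scores) > best_score:
            best_hp, best_score = hp, float(np.mean(scores))
    return best_hp, (X_te, y_te)           # held-out test set for the final evaluation
```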
As shown in fig. 4, the video sequence is divided by frame into multiple sequences of time-step length; position information is obtained by detection and tracking on each frame, and semantic information is obtained by image segmentation. The position information and semantic information of the same frame are then input into the LSTM network for training, and the sequences of the historical frames and the current frame are trained to obtain the predicted trajectory.
206. and respectively acquiring the minimum relative distance between the vehicle and each surrounding vehicle through the depth camera. And converting the two-dimensional space prediction track into a three-dimensional space prediction track according to the minimum relative distance.
Specifically, the predicted trajectory is a two-dimensional spatial predicted trajectory, and a depth camera is also provided in the vehicle.
The two-dimensional spatial predicted trajectory is converted into the three-dimensional spatial predicted trajectory according to the minimum relative distance by the following formula:

(x_r, y_r, w_r, h_r) = (d_min / f) · (x, y, w, h)

where x, y, w, h respectively denote the elements of the two-dimensional spatial predicted trajectory in the pixel bounding box of each frame of the video sequence; x_r, y_r, w_r, h_r respectively denote the elements of the three-dimensional spatial predicted trajectory in the pixel bounding box of each frame; f denotes the focal length of the depth camera; and d_min denotes the minimum relative distance between the vehicle and each surrounding vehicle.
If the subscript p is ignored, the historical track information and the predicted trajectory can be defined over a three-dimensional space occupancy grid, that is,

H, J ∈ R^6 = {x, y, w, h, d_min, d_max}

where d_max denotes the maximum distance between the vehicle and each surrounding vehicle.
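For illustration, the conversion can be written as a small helper; the function name is illustrative, and the exact form of the original (image-only) formula is reconstructed above from the similar-triangles relation between pixel and metric coordinates.

```python
import numpy as np

def box_2d_to_3d(box_xywh, focal_length, d_min):
    """Scale a 2-D pixel box (x, y, w, h) into 3-D space by the factor d_min / f."""
    scale = d_min / focal_length                       # similar-triangles factor
    x_r, y_r, w_r, h_r = np.asarray(box_xywh, dtype=float) * scale
    return x_r, y_r, w_r, h_r
```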
In this embodiment of the invention, a video sequence including surrounding vehicles and the vehicle background is first obtained by the vehicle-mounted camera, scene semantic information is obtained by image segmentation of the video sequence, and the scene semantic information and historical track information are then input into the neural network model to obtain the predicted trajectory, instead of analyzing scene semantic information extracted from a static image. This preserves the temporal continuity of the neural network model in this embodiment and improves the accuracy of vehicle trajectory prediction. In addition, the convolutional neural network and the LSTM network improve the robustness of tracking surrounding vehicles, and obtaining scene semantic information through image segmentation improves the interpretability of the training process.
Referring to fig. 5, fig. 5 is a schematic structural diagram of a trajectory prediction device according to a third embodiment of the present invention, and the trajectory prediction device is applied to a vehicle with a vehicle-mounted camera. As shown in fig. 5, the trajectory prediction apparatus mainly includes:
an obtaining module 301, configured to photograph the surrounding environment with the vehicle-mounted camera and obtain a video sequence including the surrounding vehicles and the vehicle background.
An extracting and segmenting module 302, configured to locate surrounding vehicles in the video sequence, extract historical track information of the surrounding vehicles, and use scene semantic information obtained by image segmentation of the video sequence as auxiliary information.
An output module 303, configured to input the historical trajectory information and the auxiliary information into the neural network model to obtain the predicted trajectories of the surrounding vehicles.
Further, the neural network model includes a convolutional neural network, a first-layer long short-term memory network, a second-layer long short-term memory network, and a fully connected layer.
the output module 303 is further configured to input the auxiliary information to the convolutional neural network to obtain spatial feature information.
The output module 303 is further configured to input the historical trajectory information into the first-tier long-short term memory network to obtain the time characteristic information.
The output module 303 is further configured to input the spatial feature information and the temporal feature information into the second layer long-term and short-term memory network to obtain the joint feature information.
The output module 303 is further configured to input the joint feature information into the full connection layer to obtain a predicted track.
Further, the neural network model includes the following formulas:

J ← M_p(h, a): H × A;

H_p = {h_p^1, h_p^2, ..., h_p^t};

A_p = {a_p^1, a_p^2, ..., a_p^t};

J_p = {j_p^(t+1), j_p^(t+2), ..., j_p^(t+T)};

where J denotes the predicted trajectory; M denotes the mapping between H × A and J; H denotes the historical track information; A denotes the auxiliary information; p denotes a surrounding vehicle; h_p^t denotes the position information of vehicle p in the t-th frame of the video sequence; a_p^t denotes the scene semantic information of vehicle p in the t-th frame; j denotes the predicted position information of vehicle p in the frames from t+1 to t+T; and t indexes the frames.
Further, the predicted trajectory is a two-dimensional spatial predicted trajectory, and a depth camera is also provided in the vehicle.

The obtaining module 301 is further configured to acquire, through the depth camera, the minimum relative distance between the vehicle and each surrounding vehicle.

The apparatus further comprises a conversion module 304, configured to convert the two-dimensional spatial predicted trajectory into a three-dimensional spatial predicted trajectory according to the minimum relative distance.
Further, the conversion module 304 is configured to convert the two-dimensional spatial predicted trajectory into the three-dimensional spatial predicted trajectory according to the minimum relative distance by the following formula:

(x_r, y_r, w_r, h_r) = (d_min / f) · (x, y, w, h)

where x, y, w, h respectively denote the elements of the two-dimensional spatial predicted trajectory in the pixel bounding box of each frame of the video sequence; x_r, y_r, w_r, h_r respectively denote the elements of the three-dimensional spatial predicted trajectory in the pixel bounding box of each frame; f denotes the focal length of the depth camera; and d_min denotes the minimum relative distance between the vehicle and each surrounding vehicle.
The processes by which the above modules implement their functions are described in detail in the embodiments shown in fig. 1 to fig. 4 and are not repeated here.
In this embodiment of the invention, a video sequence including surrounding vehicles and the vehicle background is obtained by the vehicle-mounted camera, scene semantic information is obtained by image segmentation of the video sequence, and the scene semantic information and historical track information are then input into the neural network model to obtain the predicted trajectory, instead of analyzing scene semantic information extracted from a static image. This preserves the temporal continuity of the neural network model in this embodiment and improves the accuracy of the predicted vehicle trajectory.
In the embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other ways. For example, the above-described embodiments are merely illustrative: the division into modules is merely a logical division, and other divisions may be used in practice; multiple modules or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication links shown or discussed may be indirect couplings or communication links through interfaces between modules, and may be electrical, mechanical, or of other forms.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, the functional modules in the embodiments of the present invention may be integrated into one processing module. Each module may exist alone physically, or two or more modules may be integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.
It should be noted that, for the sake of simplicity, the above-mentioned method embodiments are described as a series of acts or combinations, but those skilled in the art should understand that the present invention is not limited by the described order of acts, as some steps may be performed in other orders or simultaneously according to the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no acts or modules are necessarily required of the invention.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In view of the above description of the trajectory prediction method and apparatus, the terminal, and the computer-readable storage medium provided by the present invention, those skilled in the art will recognize that there may be variations in the specific embodiments and applications; the content of this specification should therefore not be construed as limiting the invention.

Claims (6)

1. A track prediction method is applied to a vehicle provided with a vehicle-mounted camera, and is characterized by comprising the following steps:
shooting the surrounding environment by using a vehicle-mounted camera to obtain a video sequence comprising surrounding vehicles and a vehicle background;
positioning the surrounding vehicles from the video sequence, extracting historical track information of the surrounding vehicles, and taking scene semantic information obtained by image segmentation of the video sequence as auxiliary information;
inputting the historical track information and the auxiliary information into a neural network model to obtain a predicted track of the surrounding vehicle;
the predicted trajectory is a two-dimensional spatial predicted trajectory, and a depth camera is further disposed in the vehicle, then the method further includes:
respectively acquiring the minimum relative distance between the vehicle and each surrounding vehicle through the depth camera;
converting the two-dimensional space prediction track into a three-dimensional space prediction track according to the minimum relative distance;
converting the two-dimensional spatial predicted track into the three-dimensional spatial predicted track according to the minimum relative distance by the following formula:

(x_r, y_r, w_r, h_r) = (d_min / f) · (x, y, w, h)

wherein x, y, w, h respectively represent elements of the two-dimensional spatial predicted track in the pixel bounding box of each frame of the video sequence; x_r, y_r, w_r, h_r respectively represent elements of the three-dimensional spatial predicted track in the pixel bounding box of each frame; f represents the focal length of the depth camera; and d_min represents the minimum relative distance between the vehicle and each surrounding vehicle.
2. The trajectory prediction method of claim 1, wherein the neural network model comprises a convolutional neural network, a first layer long short term memory network, a second layer long short term memory network, and a fully connected layer, and the inputting the historical trajectory information and the auxiliary information into the neural network model to obtain the predicted trajectory of the surrounding vehicle comprises:
inputting the auxiliary information to the convolutional neural network to obtain spatial characteristic information;
inputting the historical track information into the first layer long and short term memory network to obtain time characteristic information;
inputting the spatial characteristic information and the temporal characteristic information into the second layer long-short term memory network to obtain joint characteristic information;
and inputting the joint characteristic information into a full-connection layer to obtain the predicted track.
3. The trajectory prediction method of claim 1, wherein the neural network model comprises the following formulas:

J ← M_p(h, a): H × A;

H_p = {h_p^1, h_p^2, ..., h_p^t};

A_p = {a_p^1, a_p^2, ..., a_p^t};

J_p = {j_p^(t+1), j_p^(t+2), ..., j_p^(t+T)};

wherein J represents the predicted trajectory; M represents a mapping relationship between H × A and J; H represents the historical track information; A represents the auxiliary information; p represents the surrounding vehicle; h_p^t represents position information of the vehicle p in the t-th frame of the video sequence; a_p^t represents scene semantic information of the vehicle p in the t-th frame; j represents predicted position information of the vehicle p in the frames from t+1 to t+T; and t represents each frame.
4. A trajectory prediction device applied to a vehicle provided with an onboard camera, the device comprising:
the acquisition module is used for shooting the surrounding environment by utilizing the vehicle-mounted camera to acquire a video sequence comprising surrounding vehicles and a vehicle background;
the extraction and segmentation module is used for positioning the surrounding vehicles from the video sequence, extracting historical track information of the surrounding vehicles, and taking scene semantic information obtained by image segmentation of the video sequence as auxiliary information;
the output module is used for inputting the historical track information and the auxiliary information into a neural network model to obtain the predicted track of the surrounding vehicle;
the predicted track is a two-dimensional spatial predicted track, and a depth camera is further arranged in the vehicle;
the acquisition module is further configured to acquire, through the depth camera, minimum relative distances between the vehicle and each of the surrounding vehicles, respectively;
the apparatus may further comprise a conversion module for,
the conversion module is used for converting the two-dimensional space prediction track into a three-dimensional space prediction track according to the minimum relative distance;
the conversion module is further configured to convert the two-dimensional spatial prediction trajectory into a three-dimensional spatial prediction trajectory according to the minimum relative distance by using the following formula:
Figure FDA0002166441740000031
wherein x, y, w, h respectively represent elements of the two-dimensional spatial prediction track in the pixel bounding box of each frame of the video sequence, and xr,yr,wr,hrRespectively representing the elements of a three-dimensional spatial prediction track in a pixel bounding box in each frame of a video sequence, f representing the focal length of the depth camera, dminExpressed as the minimum relative distance of the vehicle from each of the surrounding vehicles.
5. The trajectory prediction device of claim 4, wherein the neural network model includes a convolutional neural network, a first layer long short term memory network, a second layer long short term memory network, and a fully connected layer,
the output module is further configured to input the auxiliary information to the convolutional neural network to obtain spatial feature information;
the output module is further used for inputting the historical track information into the first layer long-short term memory network to obtain time characteristic information;
the output module is further configured to input the spatial feature information and the temporal feature information into the second layer long-short term memory network to obtain joint feature information;
and the output module is also used for inputting the combined characteristic information into a full connection layer to obtain the predicted track.
6. The trajectory prediction device of claim 4, wherein the neural network model comprises the following formulas:

J ← M_p(h, a): H × A;

H_p = {h_p^1, h_p^2, ..., h_p^t};

A_p = {a_p^1, a_p^2, ..., a_p^t};

J_p = {j_p^(t+1), j_p^(t+2), ..., j_p^(t+T)};

wherein J represents the predicted trajectory; M represents a mapping relationship between H × A and J; H represents the historical track information; A represents the auxiliary information; p represents the surrounding vehicle; h_p^t represents position information of the vehicle p in the t-th frame of the video sequence; a_p^t represents scene semantic information of the vehicle p in the t-th frame; j represents predicted position information of the vehicle p in the frames from t+1 to t+T; and t represents each frame.
CN201810752554.5A 2018-07-10 2018-07-10 Trajectory prediction method and apparatus Active CN108803617B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810752554.5A CN108803617B (en) 2018-07-10 2018-07-10 Trajectory prediction method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810752554.5A CN108803617B (en) 2018-07-10 2018-07-10 Trajectory prediction method and apparatus

Publications (2)

Publication Number | Publication Date
CN108803617A (en) | 2018-11-13
CN108803617B (en) | 2020-03-20

Family

ID=64075916

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810752554.5A Active CN108803617B (en) 2018-07-10 2018-07-10 Trajectory prediction method and apparatus

Country Status (1)

Country Link
CN (1) CN108803617B (en)

Families Citing this family (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020010517A1 (en) * 2018-07-10 2020-01-16 深圳大学 Trajectory prediction method and apparatus
JP2020095612A (en) * 2018-12-14 2020-06-18 株式会社小松製作所 Transport vehicle management system and transport vehicle management method
CN109631915B (en) * 2018-12-19 2021-06-29 百度在线网络技术(北京)有限公司 Trajectory prediction method, apparatus, device and computer readable storage medium
CN109523574B (en) * 2018-12-27 2022-06-24 联想(北京)有限公司 Walking track prediction method and electronic equipment
WO2020164089A1 (en) * 2019-02-15 2020-08-20 Bayerische Motoren Werke Aktiengesellschaft Trajectory prediction using deep learning multiple predictor fusion and bayesian optimization
CN109583151B (en) * 2019-02-20 2023-07-21 阿波罗智能技术(北京)有限公司 Method and device for predicting running track of vehicle
CN111738037B (en) * 2019-03-25 2024-03-08 广州汽车集团股份有限公司 Automatic driving method, system and vehicle thereof
CN109885066B (en) * 2019-03-26 2021-08-24 北京经纬恒润科技股份有限公司 Motion trail prediction method and device
WO2020191642A1 (en) * 2019-03-27 2020-10-01 深圳市大疆创新科技有限公司 Trajectory prediction method and apparatus, storage medium, driving system and vehicle
CN110007675B (en) * 2019-04-12 2021-01-15 北京航空航天大学 Vehicle automatic driving decision-making system based on driving situation map and training set preparation method based on unmanned aerial vehicle
CN110223318A (en) * 2019-04-28 2019-09-10 驭势科技(北京)有限公司 A kind of prediction technique of multi-target track, device, mobile unit and storage medium
CN110262486B (en) * 2019-06-11 2020-09-04 北京三快在线科技有限公司 Unmanned equipment motion control method and device
CN112078592B (en) * 2019-06-13 2021-12-24 魔门塔(苏州)科技有限公司 Method and device for predicting vehicle behavior and/or vehicle track
CN110275531B (en) * 2019-06-21 2020-11-27 北京三快在线科技有限公司 Obstacle trajectory prediction method and device and unmanned equipment
CN110852342B (en) * 2019-09-26 2020-11-24 京东城市(北京)数字科技有限公司 Road network data acquisition method, device, equipment and computer storage medium
CN110834645B (en) * 2019-10-30 2021-06-29 中国第一汽车股份有限公司 Free space determination method and device for vehicle, storage medium and vehicle
US11351996B2 (en) 2019-11-01 2022-06-07 Denso International America, Inc. Trajectory prediction of surrounding vehicles using predefined routes
CN112784628B (en) * 2019-11-06 2024-03-19 北京地平线机器人技术研发有限公司 Track prediction method, neural network training method and device for track prediction
US11650072B2 (en) 2019-11-26 2023-05-16 International Business Machines Corporation Portable lane departure detection
CN111114554B (en) * 2019-12-16 2021-06-11 苏州智加科技有限公司 Method, device, terminal and storage medium for predicting travel track
WO2021134354A1 (en) * 2019-12-30 2021-07-08 深圳元戎启行科技有限公司 Path prediction method and apparatus, computer device, and storage medium
CN111260122A (en) * 2020-01-13 2020-06-09 重庆首讯科技股份有限公司 Method and device for predicting traffic flow on expressway
CN111114543B (en) * 2020-03-26 2020-07-03 北京三快在线科技有限公司 Trajectory prediction method and device
CN111523643B (en) * 2020-04-10 2024-01-05 商汤集团有限公司 Track prediction method, device, equipment and storage medium
CN111595352B (en) * 2020-05-14 2021-09-28 陕西重型汽车有限公司 Track prediction method based on environment perception and vehicle driving intention
WO2022033650A1 (en) 2020-08-10 2022-02-17 Dr. Ing. H.C. F. Porsche Aktiengesellschaft Device for and method of predicting a trajectory for a vehicle
CN112562331A (en) * 2020-11-30 2021-03-26 的卢技术有限公司 Vision perception-based other-party vehicle track prediction method
CN112558608B (en) * 2020-12-11 2023-03-17 重庆邮电大学 Vehicle-mounted machine cooperative control and path optimization method based on unmanned aerial vehicle assistance
CN113554060B (en) * 2021-06-24 2023-06-20 福建师范大学 LSTM neural network track prediction method integrating DTW
CN114387782B (en) * 2022-01-12 2023-06-27 智道网联科技(北京)有限公司 Method and device for predicting traffic state and electronic equipment
CN114460943B (en) * 2022-02-10 2023-07-28 山东大学 Self-adaptive target navigation method and system for service robot
CN115881286B (en) * 2023-02-21 2023-06-16 创意信息技术股份有限公司 Epidemic prevention management scheduling system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105700538A (en) * 2016-01-28 2016-06-22 武汉光庭信息技术股份有限公司 A track following method based on a neural network and a PID algorithm
CN106873580A (en) * 2015-11-05 2017-06-20 福特全球技术公司 Based on perception data autonomous driving at the intersection
CN106952303A (en) * 2017-03-09 2017-07-14 北京旷视科技有限公司 Vehicle distance detecting method, device and system
CN107144285A (en) * 2017-05-08 2017-09-08 深圳地平线机器人科技有限公司 Posture information determines method, device and movable equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11086334B2 (en) * 2016-07-21 2021-08-10 Mobileye Vision Technologies Ltd. Crowdsourcing a sparse map for autonomous vehicle navigation

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106873580A (en) * 2015-11-05 2017-06-20 福特全球技术公司 Based on perception data autonomous driving at the intersection
CN105700538A (en) * 2016-01-28 2016-06-22 武汉光庭信息技术股份有限公司 A track following method based on a neural network and a PID algorithm
CN106952303A (en) * 2017-03-09 2017-07-14 北京旷视科技有限公司 Vehicle distance detecting method, device and system
CN107144285A (en) * 2017-05-08 2017-09-08 深圳地平线机器人科技有限公司 Posture information determines method, device and movable equipment

Also Published As

Publication number Publication date
CN108803617A (en) 2018-11-13

Similar Documents

Publication Publication Date Title
CN108803617B (en) Trajectory prediction method and apparatus
WO2020244653A1 (en) Object identification method and device
CN111563446B (en) Human-machine interaction safety early warning and control method based on digital twin
Ramos et al. Detecting unexpected obstacles for self-driving cars: Fusing deep learning and geometric modeling
CN109726627B (en) Neural network model training and universal ground wire detection method
US11455813B2 (en) Parametric top-view representation of complex road scenes
EP3561727A1 (en) A device and a method for extracting dynamic information on a scene using a convolutional neural network
Saha et al. Enabling spatio-temporal aggregation in birds-eye-view vehicle estimation
CN107545263B (en) Object detection method and device
CN113936139A (en) Scene aerial view reconstruction method and system combining visual depth information and semantic segmentation
EP3822852B1 (en) Method, apparatus, computer storage medium and program for training a trajectory planning model
WO2021218786A1 (en) Data processing system, object detection method and apparatus thereof
Wulff et al. Early fusion of camera and lidar for robust road detection based on U-Net FCN
CN111563415A (en) Binocular vision-based three-dimensional target detection system and method
Kim et al. Vision-based real-time obstacle segmentation algorithm for autonomous surface vehicle
Bhalla et al. Simulation of self-driving car using deep learning
CN111860269A (en) Multi-feature fusion tandem RNN structure and pedestrian prediction method
CN111098850A (en) Automatic parking auxiliary system and automatic parking method
CN110942037A (en) Action recognition method for video analysis
CN116597270A (en) Road damage target detection method based on attention mechanism integrated learning network
CN116740424A (en) Transformer-based timing point cloud three-dimensional target detection
CN114049532A (en) Risk road scene identification method based on multi-stage attention deep learning
CN113012191A (en) Laser mileage calculation method based on point cloud multi-view projection graph
CN112288702A (en) Road image detection method based on Internet of vehicles
CN115147450B (en) Moving target detection method and detection device based on motion frame difference image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant