CN114742317A - Vehicle track prediction method and device, electronic equipment and storage medium - Google Patents

Vehicle track prediction method and device, electronic equipment and storage medium

Info

Publication number
CN114742317A
CN114742317A (application number CN202210515263.0A)
Authority
CN
China
Prior art keywords
feature
track
future
sequence
trajectory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210515263.0A
Other languages
Chinese (zh)
Inventor
林华东
范圣印
李雪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Yihang Yuanzhi Intelligent Technology Co ltd
Original Assignee
Suzhou Yihang Yuanzhi Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Yihang Yuanzhi Intelligent Technology Co., Ltd.
Priority claimed from application CN202210515263.0A
Publication of CN114742317A
Current legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/25: Fusion techniques
    • G06F18/253: Fusion techniques of extracted features
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G06N3/08: Learning methods

Abstract

The present disclosure provides a vehicle trajectory prediction method, including: acquiring a first future track feature sequence of a target vehicle; acquiring observed image features of the target vehicle to generate a feature map; acquiring a second future track feature sequence of the target vehicle based on the feature map, and acquiring track multi-modal features of the target vehicle based on the feature map; reversing the two sequences to obtain a first future reversed track feature sequence and a second future reversed track feature sequence; performing first fusion processing on the first and second future reversed track feature sequences to obtain a future track fusion feature sequence; performing second fusion processing on each future track fusion feature in the future track fusion feature sequence and the track multi-modal features to obtain final future track features for generating a final future predicted track; and correcting the final future predicted track based on the lane lines in the semantic map to obtain a corrected track. The disclosure also provides a vehicle track prediction device and an electronic device.

Description

Vehicle track prediction method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of computer vision and automatic driving technologies, and in particular, to a vehicle trajectory prediction method and apparatus, an electronic device, and a storage medium.
Background
While a vehicle is driving, a human driver can make correct decisions by judging the motion states of surrounding vehicles, thereby avoiding traffic accidents. It is difficult, however, for an automatic driving system to make reasonable decisions by merely detecting and tracking vehicles (i.e., vehicles in the vicinity of the host vehicle).
For example, if a vehicle ahead is detected to have stopped and traffic congestion is increasing, the wrong choice of continuing forward may result in a collision. Reasonably predicting the future tracks of surrounding vehicles can therefore improve the safety and smoothness of the traffic system. Vehicle track prediction nevertheless remains a great challenge: the topology of roads, traffic signs and signal lights, and the interactions between vehicles and surrounding agents all affect the prediction.
The technical scheme for predicting the vehicle track in the prior art is as follows:
scheme 1: the WACV paper "Uncertainty-aware Short-term Motion Prediction of Traffic indicators for Autonomous Driving" in 2020 proposes the use of a rasterized high-precision map, i.e., a grid map, to provide more comprehensive and fine map information. The model takes the state (position, speed and acceleration) of the vehicle at a certain moment as input, and then predicts the future track by combining with the grid map processed by the convolutional neural network.
Scheme 2: the CVPR paper "Multimodal Motion Prediction with Stacked transformations" in 2021 performs trajectory Prediction based on Stacked transformations. And respectively extracting track information, high-precision map information and interaction information by using three stacked transformers, and generating a plurality of suggested track characteristics. The Decoder of the first transform is initialized with learnable parameters, and the input of each subsequent Decoder is the output of the previous Decoder. Each proposed trajectory feature is then decoded to generate a future trajectory and confidence. Meanwhile, the future prediction track is divided into a plurality of independent areas, and optimization is carried out in different areas. The stacked transformers independently process the semantic map and the track data, and fusion is carried out in a stacking mode, so that mutual interference among different types of features is reduced.
Scheme 3: chinese patent document CN114022847A "an intelligent agent trajectory prediction method, system, device and storage medium" combines a graph neural network and a generative network, expresses the interaction between intelligent agents by the graph neural network, extracts history and future information by a recurrent neural network, and obtains a probability model of a trajectory by the generative network. However, the generative network is less interpretable for trajectory multimodal.
Scheme 4: chinese patent document CN114004406A, "vehicle trajectory prediction method, device, storage medium, and electronic apparatus", proposes a trajectory correction method. Firstly, simultaneously predicting the tracks of a target vehicle and surrounding vehicles to obtain an initial predicted track; and finally, correcting the initial predicted track of the target vehicle based on the track correction amount to obtain a final predicted track. The trajectory correction method based on interaction avoids vehicle collision to a certain extent, and can improve the accuracy of vehicle trajectory prediction in a dense traffic scene. However, the corrected trajectory may not satisfy the constraint of the lane line.
At present, vehicle track prediction faces the following three difficulties, which the papers and patents of the prior art can hardly resolve in full.
First, multi-modal inputs are poorly fused. The input of vehicle track prediction is multi-modal, typically including a high-precision map and historical tracks. A high-precision map can be expressed as an ordinary semantic map, a rasterized semantic map, a vectorized map, and so on. The Transformer is widely applied in the field of sequence prediction, but migrating it to the field of track prediction requires fusing different types of feature data. A conventional Transformer, however, can only process a single type of data, such as a text sequence or image data directly, so the existing Transformer-based fusion methods cannot fully exploit the high-precision map.
Second, track multimodality is poorly interpretable. The predicted track is multi-modal, i.e., the future track has many possible outcomes. The multimodality includes direction multimodality and speed multimodality. In direction, a vehicle at an intersection may turn left, go straight or turn right, or may swerve under the influence of surrounding vehicles. In speed, a vehicle may accelerate, decelerate or brake suddenly due to traffic lights, surrounding vehicles, and so on. Current multi-modal track generation approaches include generative models and two-stage prediction. GANs (generative adversarial networks) and CVAEs (conditional variational autoencoders) are commonly used generative models that produce tracks by sampling different noise; such tracks exhibit a certain multimodality but are poorly interpretable. The two-stage method first predicts the end point and then predicts the track based on the end point; by taking the constraint of the end point into account, it better matches the reaction of a human driver in a real scene. The Transformer has outstanding advantages in the field of sequence prediction, but its inherent framework makes it difficult to take end-point constraints into account.
Third, the predicted track may not satisfy the constraint of the lane line. In a real traffic scene, a driver generally drives along the lane line, so the real track generally satisfies the constraint of the lane line. The multiple tracks predicted by a model, however, have a certain randomness and may not satisfy the constraints of the lane lines, or satisfy them only weakly.
Disclosure of Invention
In order to solve at least one of the above technical problems, the present disclosure provides a vehicle trajectory prediction method, apparatus, electronic device, storage medium, and program product.
According to an aspect of the present disclosure, there is provided a vehicle trajectory prediction method including:
s110, acquiring a first future track characteristic sequence of the target vehicle based on the acquired observation track sequence of the target vehicle;
s120, acquiring the observation image characteristics of the target vehicle based on a semantic map with the target vehicle as the center to generate a characteristic map;
s130, acquiring a second future trajectory feature sequence of the target vehicle based on the feature map, and acquiring at least one trajectory multi-modal feature of the target vehicle based on the feature map;
s140, respectively inverting the first future track feature sequence and the second future track feature sequence to obtain a first future inverted track feature sequence and a second future inverted track feature sequence;
s150, performing first fusion processing on the first future reversal track characteristic sequence and the second future reversal track characteristic sequence to obtain a future track fusion characteristic sequence;
and S160, carrying out second fusion processing on each future track fusion feature in the future track fusion feature sequence and the at least one track multi-modal feature to obtain at least one final future track feature for generating at least one final future predicted track.
The vehicle trajectory prediction method according to at least one embodiment of the present disclosure further includes:
s170, correcting the final future predicted track based on the lane lines in the semantic map to obtain a corrected track.
According to the vehicle track prediction method of at least one embodiment of the present disclosure, S110, acquiring a first future track feature sequence of a target vehicle based on an acquired observation track sequence of the target vehicle, includes:
s111, embedding the observation track sequence of the target vehicle, and performing position coding to generate a pre-coding feature;
s112, coding the pre-coding features by using a coder of a first Transformer model to obtain a first feature vector;
s113, decoding the first feature vector by using a decoder of a first transform model to obtain the first future track feature sequence.
According to the vehicle track prediction method of at least one embodiment of the present disclosure, S120, acquiring an observation image feature of the target vehicle based on a semantic map centering on the target vehicle to generate a feature map, includes:
and performing feature extraction on the semantic map by using a backbone network based on CNN to acquire the observed image features of the target vehicle so as to generate a feature map.
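The backbone of S120 is only identified as "CNN-based"; the sketch below shows the elementary operation such a backbone repeats many times: a 2D "valid" convolution (implemented as cross-correlation, as deep-learning frameworks do) sliding a kernel over the rasterized, target-vehicle-centred semantic map. The 12x12 single-channel raster and the 3x3 edge kernel are illustrative assumptions:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive single-channel 'valid' 2D convolution (cross-correlation) --
    a stand-in for one layer of the CNN backbone that turns the rasterized
    semantic map into a feature map. A real backbone stacks many such
    layers with learned multi-channel kernels and nonlinearities."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# Hypothetical 12x12 ego-centred semantic raster with one vertical lane line.
semantic_map = np.zeros((12, 12))
semantic_map[:, 6] = 1.0
kernel = np.array([[-1., 0., 1.],
                   [-1., 0., 1.],
                   [-1., 0., 1.]])                # responds to vertical edges
feature_map = conv2d_valid(semantic_map, kernel)
print(feature_map.shape)                          # (10, 10)
```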
According to the vehicle track prediction method of at least one embodiment of the present disclosure, in S130, obtaining a second future track feature sequence of the target vehicle based on the feature map includes:
s131, segmenting the feature graph into a plurality of feature subgraphs to obtain a feature subgraph sequence;
s132, carrying out position coding processing on the characteristic subgraph sequence to obtain pre-coding characteristics;
s133, coding the pre-coding features by using a coder of a second Transformer model to obtain a second feature vector;
and S134, decoding the second feature vector by using a decoder of a second transform model to obtain the second future track feature sequence.
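S131's segmentation of the feature map into a feature sub-graph sequence can be sketched as a ViT-style patchify: cut the map into non-overlapping patches and flatten each patch into one token, giving the sequence that the second Transformer's encoder consumes. The 8x8x4 feature map and 4x4 patch size below are assumptions for illustration:

```python
import numpy as np

def split_into_patches(feature_map, patch):
    """S131: cut an (H, W, C) feature map into non-overlapping patch
    'feature sub-graphs' and flatten each into a token vector."""
    h, w, c = feature_map.shape
    assert h % patch == 0 and w % patch == 0
    grid = feature_map.reshape(h // patch, patch, w // patch, patch, c)
    # Reorder so each patch is contiguous, then flatten patch contents.
    tokens = grid.transpose(0, 2, 1, 3, 4).reshape(-1, patch * patch * c)
    return tokens  # (num_patches, patch * patch * C)

feature_map = np.arange(8 * 8 * 4, dtype=float).reshape(8, 8, 4)
tokens = split_into_patches(feature_map, patch=4)
print(tokens.shape)  # (4, 64): a 2x2 grid of patches, 64 values each
```

Position coding (S132) would then add a per-patch position code to each token before the second Transformer encoder.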
According to the vehicle trajectory prediction method of at least one embodiment of the present disclosure, in S130, obtaining at least one trajectory multi-modal feature of the target vehicle based on the feature map includes:
and S135, decoding the second feature vector by using a decoder of a third Transformer model to obtain at least one track multi-modal feature of the target vehicle.
According to the vehicle trajectory prediction method of at least one embodiment of the present disclosure, the decoder of the first Transformer model is a decoder initialized by a learnable embedding and subjected to track feature learning.
According to the vehicle trajectory prediction method of at least one embodiment of the present disclosure, the decoder of the second Transformer model is a decoder initialized by a learnable embedding and subjected to track feature learning.
According to the vehicle trajectory prediction method of at least one embodiment of the present disclosure, the decoder of the third Transformer model is a decoder initialized by a learnable embedding and subjected to multi-modal feature learning.
According to the vehicle trajectory prediction method of at least one embodiment of the present disclosure, S160, performing a second fusion process on each future trajectory fusion feature in the future trajectory fusion feature sequence and the at least one trajectory multi-modal feature to obtain at least one final future trajectory feature for generating at least one final future predicted trajectory, includes:
and splicing each track multi-modal feature with each future track fusion feature in the future track fusion feature sequence to obtain a final future track feature based on each track multi-modal feature.
According to the vehicle trajectory prediction method of at least one embodiment of the present disclosure, S160, performing a second fusion process on each future trajectory fusion feature in the future trajectory fusion feature sequence and the at least one trajectory multi-modal feature to obtain at least one final future trajectory feature for generating at least one final future predicted trajectory, further includes:
processing based on a Self-Attention mechanism (Self-Attention) is carried out on each final future track characteristic, and a final future predicted track based on each final future track characteristic is obtained through a multi-layer perceptron.
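A minimal sketch of the splice-then-decode part of S160 (the self-attention refinement is omitted for brevity): each of the K track multi-modal features is concatenated onto every feature of the future track fusion sequence, and a toy multi-layer perceptron with random stand-in weights maps each result to a 2D point, yielding K candidate trajectories. All dimensions (12 future steps, 16-dimensional features, K = 3) are assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
t_pred, d, k = 12, 16, 3   # assumed: 12 future steps, dim-16 features, K = 3 modes

fusion_seq = rng.normal(size=(t_pred, d))   # future track fusion feature sequence
modal_feats = rng.normal(size=(k, d))       # K track multi-modal features

# S160 splice: pair every multi-modal feature with every fusion feature
# -> (K, T, 2d) final future track features.
final_feats = np.concatenate(
    [np.broadcast_to(modal_feats[:, None, :], (k, t_pred, d)),
     np.broadcast_to(fusion_seq[None, :, :], (k, t_pred, d))], axis=-1)

# Toy MLP head (random weights stand in for the trained perceptron).
W1 = rng.normal(size=(2 * d, 32)) * 0.1
W2 = rng.normal(size=(32, 2)) * 0.1
hidden = np.maximum(final_feats @ W1, 0.0)  # ReLU
predicted_tracks = hidden @ W2              # (K, T, 2): K candidate trajectories
print(predicted_tracks.shape)               # (3, 12, 2)
```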
According to the vehicle track prediction method of at least one embodiment of the present disclosure, S170, correcting the final future predicted track based on a lane line in a semantic map to obtain a corrected track, includes:
s171, for each final future predicted track, obtaining a lane line closest to the future predicted track;
s172, carrying out coding processing based on GRU on the final future prediction track to obtain a final future prediction track coding sequence; carrying out coding processing based on GRU on the lane line closest to the final future predicted track to obtain a lane line coding sequence, and carrying out position coding on the lane line coding sequence to obtain a final lane line coding sequence;
s173, processing the final future prediction track coding sequence and the final lane line coding sequence based on a Multi-head Attention mechanism (Multi-head Attention) to obtain a corrected track characteristic;
and S174, decoding the corrected track characteristics based on the multilayer perceptron to obtain the corrected track.
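Steps S172 and S173 can be sketched as below, with several stated simplifications: a minimal GRU cell (random weights stand in for trained ones, and bias terms are dropped) encodes both the predicted track and its nearest lane line, and single-head cross-attention replaces the multi-head attention, with the track codes as queries and the lane codes as keys and values; the lane-line position coding of S172 and the perceptron decode of S174 are omitted:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_encode(seq, Wz, Uz, Wr, Ur, Wh, Uh, d):
    """Minimal bias-free GRU encoder; returns the hidden state at every
    step, i.e. the 'coding sequence' of S172."""
    h = np.zeros(d)
    outs = []
    for x in seq:
        z = sigmoid(Wz @ x + Uz @ h)             # update gate
        r = sigmoid(Wr @ x + Ur @ h)             # reset gate
        h_tilde = np.tanh(Wh @ x + Uh @ (r * h)) # candidate state
        h = (1 - z) * h + z * h_tilde
        outs.append(h)
    return np.stack(outs)

def cross_attention(q, kv):
    """Single-head cross-attention (simplified stand-in for S173's
    multi-head attention): track codes attend to lane-line codes."""
    d = q.shape[-1]
    scores = q @ kv.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ kv

d_in, d = 2, 8
rng = np.random.default_rng(2)
mk = lambda *s: rng.normal(size=s) * 0.3         # random stand-in weights
Wz, Wr, Wh = mk(d, d_in), mk(d, d_in), mk(d, d_in)
Uz, Ur, Uh = mk(d, d), mk(d, d), mk(d, d)

pred_track = rng.normal(size=(12, 2))            # a final future predicted track
lane_line = rng.normal(size=(20, 2))             # its nearest lane line (S171)
track_codes = gru_encode(pred_track, Wz, Uz, Wr, Ur, Wh, Uh, d)  # S172
lane_codes = gru_encode(lane_line, Wz, Uz, Wr, Ur, Wh, Uh, d)    # S172
corrected_feat = cross_attention(track_codes, lane_codes)        # S173
print(corrected_feat.shape)                      # (12, 8)
```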
According to another aspect of the present disclosure, there is provided a vehicle trajectory prediction apparatus including:
a first future track feature sequence acquisition module, which acquires a first future track feature sequence of the target vehicle based on the acquired observation track sequence of the target vehicle;
a feature map acquisition module that acquires an observation image feature of the target vehicle based on a semantic map centered on the target vehicle to generate a feature map;
a second future trajectory feature sequence acquisition module that acquires a second future trajectory feature sequence of the target vehicle based on the feature map;
a trajectory multi-modal feature acquisition module that acquires at least one trajectory multi-modal feature of the target vehicle based on the feature map;
the first inversion processing module performs sequence inversion on the first future trajectory feature sequence to obtain a first future inversion trajectory feature sequence;
the second inversion processing module performs sequence inversion on the second future trajectory feature sequence to obtain a second future inversion trajectory feature sequence;
the first fusion processing module is used for carrying out first fusion processing on the first future reversal track characteristic sequence and the second future reversal track characteristic sequence to obtain a future track fusion characteristic sequence;
the second fusion processing module is used for carrying out second fusion processing on each future track fusion feature in the future track fusion feature sequence and the at least one track multi-modal feature to obtain at least one final future track feature;
a final future predicted trajectory generation module that generates at least one final future predicted trajectory based on the final future trajectory features.
The vehicle track prediction apparatus according to at least one embodiment of the present disclosure further includes:
a correction module that corrects the final future predicted trajectory based on lane lines in a semantic map to obtain a corrected trajectory.
According to the vehicle trajectory prediction apparatus of at least one embodiment of the present disclosure, the first future trajectory feature sequence acquisition module includes:
the embedded processing and position coding processing module is used for embedding an observation track sequence of the target vehicle and carrying out position coding processing to generate a pre-coding characteristic;
the first Transformer model encoder is used for encoding the pre-encoding characteristics to obtain a first characteristic vector;
and the first Transformer model decoder is used for decoding the first feature vector to obtain the first future track feature sequence.
According to a vehicle trajectory prediction device of at least one embodiment of the present disclosure, the feature map acquisition module includes:
a CNN-based backbone network (backbone) module, wherein the CNN-based backbone network (backbone) module performs feature extraction on the semantic map to acquire the observation image features of the target vehicle so as to generate a feature map.
According to the vehicle trajectory prediction apparatus of at least one embodiment of the present disclosure, the second future trajectory feature sequence acquisition module includes:
the feature map segmentation module segments the feature map into a plurality of feature sub-graphs to obtain a feature sub-graph sequence;
the position coding processing module carries out position coding processing on the feature sub-graph sequence to obtain pre-coding features;
the second Transformer model encoder is used for encoding the pre-encoding characteristics to obtain a second characteristic vector;
and the second Transformer model decoder is used for decoding the second feature vector to obtain the second future track feature sequence.
According to the vehicle trajectory prediction apparatus of at least one embodiment of the present disclosure, the trajectory multimodal feature acquisition module includes:
and the third Transformer model decoder is used for decoding the second feature vector to obtain at least one track multi-modal feature of the target vehicle.
According to a vehicle trajectory prediction device of at least one embodiment of the present disclosure, the final future predicted trajectory generation module includes: the system comprises a self-attention mechanism module and a multilayer perceptron module;
inputting each final future trajectory feature into the Self-Attention mechanism (Self-Attention) module for processing, wherein the output of the Self-Attention mechanism module is used as the input of the multi-layer perceptron module, and a final future predicted trajectory based on each final future trajectory feature is obtained through the processing of the multi-layer perceptron.
According to the vehicle track prediction device of at least one embodiment of the present disclosure, the correction module includes:
the lane line acquisition module acquires a lane line of the closest distance of each final future predicted track from the semantic map;
a first GRU module, wherein the first GRU module performs GRU-based encoding processing on the final future predicted trajectory to obtain a final future predicted trajectory encoding sequence;
a second GRU module, which performs GRU-based coding processing on the lane line of the closest distance of each final future predicted track to obtain a lane line coding sequence;
the position coding module is used for carrying out position coding on the lane line coding sequence to obtain a final lane line coding sequence;
a Multi-head Attention mechanism module, wherein the Multi-head Attention mechanism module performs processing based on a Multi-head Attention mechanism (Multi-head Attention) on the final future prediction track coding sequence and the final lane line coding sequence to obtain a corrected track characteristic;
and the multilayer perceptron module decodes the corrected track characteristics to obtain the corrected track.
According to yet another aspect of the present disclosure, there is provided an electronic device including: a memory storing execution instructions; and a processor executing execution instructions stored by the memory to cause the processor to perform the vehicle trajectory prediction method of any of the embodiments of the present disclosure.
According to yet another aspect of the present disclosure, there is provided a readable storage medium having stored therein execution instructions for implementing the vehicle trajectory prediction method of any one of the embodiments of the present disclosure when executed by a processor.
According to yet another aspect of the present disclosure, there is provided a computer program product comprising computer programs/instructions which, when executed by a processor, implement the vehicle trajectory prediction method of any one of the embodiments of the present disclosure.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate exemplary embodiments of the disclosure and together with the description serve to explain the principles of the disclosure.
Fig. 1 is a flowchart illustrating a vehicle trajectory prediction method according to an embodiment of the present disclosure.
Fig. 2 is a flowchart illustrating a vehicle trajectory prediction method according to still another embodiment of the present disclosure.
Fig. 3 is a flowchart illustrating a method for acquiring a first future trajectory feature sequence according to an embodiment of the disclosure.
Fig. 4 is a flowchart illustrating a second future trajectory feature sequence acquisition method according to an embodiment of the present disclosure.
Fig. 5 is a flowchart illustrating a method for acquiring a trajectory multi-modal feature according to an embodiment of the present disclosure.
FIG. 6 is a flow chart illustrating a method for modifying a final future predicted trajectory according to one embodiment of the present disclosure.
Fig. 7 is a block diagram illustrating a structure of a network model implementing the vehicle trajectory prediction method of the present disclosure according to an embodiment of the present disclosure.
FIG. 8 is a flow chart illustrating the steps of the loss function calculation in the model training process according to an embodiment of the present disclosure.
Fig. 9 is a block diagram schematic structure of a vehicle trajectory prediction device employing a hardware implementation of a processing system according to an embodiment of the present disclosure.
Fig. 10 is a block diagram schematically illustrating the structure of a vehicle trajectory prediction device using a hardware implementation of a processing system according to still another embodiment of the present disclosure.
Description of the reference numerals
1000 vehicle trajectory prediction device
1002 first future track characteristic sequence acquisition module
1004 feature map acquisition module
1006 second future trajectory feature sequence acquisition module
1008 track multi-modal feature acquisition module
1010 first inversion processing module
1012 second inversion processing module
1014 first fusion processing module
1016 second fusion processing Module
1018 final future predicted trajectory generation module
1020 correction module
1100 bus
1200 processor
1300 memory
1400 other circuits
Detailed Description
The present disclosure will be described in further detail with reference to the drawings and embodiments. It is to be understood that the specific embodiments described herein are for purposes of illustration only and are not to be construed as limitations of the present disclosure. It should be further noted that, for the convenience of description, only the portions relevant to the present disclosure are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict. Technical solutions of the present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Unless otherwise indicated, the illustrated exemplary embodiments/examples are to be understood as providing exemplary features of various details of some ways in which the technical concepts of the present disclosure may be practiced. Accordingly, unless otherwise indicated, features of the various embodiments may be additionally combined, separated, interchanged, and/or rearranged without departing from the technical concept of the present disclosure.
The use of cross-hatching and/or shading in the drawings is generally used to clarify the boundaries between adjacent components. As such, unless otherwise noted, the presence or absence of cross-hatching or shading does not convey or indicate any preference or requirement for a particular material, material property, size, proportion, commonality between the illustrated components and/or any other characteristic, attribute, property, etc., of a component. Further, in the drawings, the size and relative sizes of components may be exaggerated for clarity and/or descriptive purposes. While example embodiments may be practiced differently, the specific process sequence may be performed in a different order than that described. For example, two processes described consecutively may be performed substantially simultaneously or in reverse order to that described. In addition, like reference numerals denote like parts.
When an element is referred to as being "on" or "over," "connected to" or "coupled to" another element, it can be directly on, connected or coupled to the other element or intervening elements may be present. However, when an element is referred to as being "directly on," "directly connected to" or "directly coupled to" another element, there are no intervening elements present. For purposes of this disclosure, the term "connected" may refer to physically, electrically, etc., and may or may not have intermediate components.
The terminology used herein is for the purpose of describing particular embodiments and is not intended to be limiting. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Furthermore, when the terms "comprises" and/or "comprising" and variations thereof are used in this specification, the presence of stated features, integers, steps, operations, elements, components and/or groups thereof are stated but does not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof. It is also noted that, as used herein, the terms "substantially," "about," and other similar terms are used as approximate terms and not as degree terms, and as such, are used to interpret inherent deviations in measured values, calculated values, and/or provided values that would be recognized by one of ordinary skill in the art.
The following describes the vehicle trajectory prediction method and the vehicle trajectory prediction apparatus of the present disclosure in detail with reference to fig. 1 to 10.
Fig. 1 is a flowchart illustrating a vehicle trajectory prediction method according to an embodiment of the present disclosure.
Referring to fig. 1, a vehicle trajectory prediction method S100 of the present embodiment includes:
s110, acquiring a first future track characteristic sequence of the target vehicle based on the acquired observation track sequence of the target vehicle;
s120, acquiring the observation image characteristics of the target vehicle based on the semantic map with the target vehicle as the center to generate a characteristic map;
s130, acquiring a second future track feature sequence of the target vehicle based on the feature map, and acquiring at least one track multi-mode feature of the target vehicle based on the feature map (wherein the number of the track multi-mode features is K, the number of corresponding predicted tracks is K, and K is a natural number greater than or equal to 1);
s140, respectively inverting the first future track characteristic sequence and the second future track characteristic sequence to obtain a first future inverted track characteristic sequence and a second future inverted track characteristic sequence;
s150, performing first fusion processing (preferably Concat operation) on the first future inverted trajectory feature sequence and the second future inverted trajectory feature sequence to obtain a future trajectory fusion feature sequence; and
and S160, carrying out second fusion processing on each future track fusion feature in the future track fusion feature sequence and at least one track multi-mode feature to obtain at least one final future track feature for generating at least one final future predicted track.
In the present disclosure, a vehicle observation trajectory sequence and a future ground-truth trajectory sequence are defined, with specific expressions shown in equations (1) and (2), where N denotes the total number of trajectories, X_i denotes the i-th observation trajectory (i.e., the observation trajectory of the i-th target vehicle), t_obs denotes the duration of the observation trajectory, Y_i denotes the i-th future ground-truth trajectory, and t_pred denotes the duration of the future trajectory:

X_i = {(x_i^t, y_i^t, v_i^t, a_i^t, ω_i^t, θ_i^t) | t = 1, ..., t_obs} (1)

Y_i = {(x_i^t, y_i^t) | t = t_obs + 1, ..., t_obs + t_pred} (2)

where x_i^t, y_i^t, v_i^t, a_i^t, ω_i^t, and θ_i^t denote, respectively, the abscissa, ordinate, velocity, acceleration, yaw rate, and yaw angle of trajectory i at time t. X is the set of observation trajectories and Y is the set of future ground-truth trajectories, as shown in equations (3) and (4):

X = {X_1, X_2, ..., X_N} (3)

Y = {Y_1, Y_2, ..., Y_N} (4)
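As a minimal concrete sketch (the array sizes and random values are invented for illustration; only the 6-feature layout and the index conventions follow equations (1)-(4)), the trajectory sets can be held as NumPy arrays:

```python
import numpy as np

N, t_obs, t_pred = 4, 8, 12   # toy sizes; the patent leaves N, t_obs, t_pred free

# X: N observation trajectories, each with t_obs steps of
# (x, y, v, a, yaw_rate, yaw_angle) -- equation (1)
X = np.random.randn(N, t_obs, 6)

# Y: N future ground-truth trajectories of t_pred (x, y) points -- equation (2)
Y = np.random.randn(N, t_pred, 2)

# X_i / Y_i index the i-th trajectory, matching the sets in equations (3) and (4)
X_i, Y_i = X[0], Y[0]
```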
The number of predicted trajectories is defined as K. The expression of the future predicted trajectory is shown in equation (5), where Ŷ_i^k denotes the k-th predicted trajectory of the i-th data item, and Ŷ_i is the set of its K predicted trajectories, as shown in equation (6):

Ŷ_i^k = {(x̂_i^t, ŷ_i^t) | t = t_obs + 1, ..., t_obs + t_pred} (5)

Ŷ_i = {Ŷ_i^1, Ŷ_i^2, ..., Ŷ_i^K} (6)
The semantic map is defined as I; a rasterized semantic map is preferably used. The semantic map is centered on the target vehicle with the vehicle head facing straight ahead, and the actual distances covered on the left, right, front, and rear are a, b, c, and d meters, respectively; illustratively, a = 40, b = 25, c = 25, d = 10. The image size of the map I is (3, H, W), i.e., 3 channels with height H and width W; illustratively, H = W = 200.
The semantic map described in this disclosure is an electronic map: a high-precision map with an accuracy of 20-50 cm that contains various road-surface and spatial properties. Those skilled in the art can select from various semantic maps in the prior art according to the teachings of the present disclosure, all of which fall within the protection scope of the present disclosure.
Fig. 2 shows a vehicle trajectory prediction method S100 of a preferred embodiment of the present disclosure, which includes:
s110, acquiring a first future track characteristic sequence of the target vehicle based on the acquired observation track sequence of the target vehicle;
s120, acquiring the observation image characteristics of the target vehicle based on the semantic map with the target vehicle as the center to generate a characteristic map;
s130, acquiring a second future track feature sequence of the target vehicle based on the feature map, and acquiring at least one track multi-mode feature of the target vehicle based on the feature map;
s140, respectively inverting the first future track characteristic sequence and the second future track characteristic sequence to obtain a first future inverted track characteristic sequence and a second future inverted track characteristic sequence;
s150, performing first fusion processing on the first future inversion track characteristic sequence and the second future inversion track characteristic sequence to obtain a future track fusion characteristic sequence;
s160, performing second fusion processing on each future track fusion feature in the future track fusion feature sequence and at least one track multi-modal feature to obtain at least one final future track feature for generating at least one final future predicted track;
and S170, correcting the final future predicted track based on the lane lines in the semantic map to obtain a corrected track.
For the vehicle trajectory prediction method S100 of each of the above embodiments, preferably, the step S110 of obtaining a first future trajectory feature sequence of the target vehicle based on the obtained observation trajectory sequence of the target vehicle includes:
s111, embedding the observation track sequence of the target vehicle, and performing position coding to generate a pre-coding feature; preferably, the precoding characteristics are obtained by:
X' = XE + E_pos;

where X is the observation trajectory sequence of the target vehicle, E denotes the embedding processing (Embedding), E_pos denotes the position encoding, and X' is the precoding feature.
S112, coding the pre-coding features by using a coder of the first Transformer model to obtain a first feature vector; preferably, the first feature vector is obtained by:
h_1 = Encoder(X'; W_X);

where Encoder denotes the encoder of the first Transformer model, W_X is the corresponding parameter, and h_1 is the first feature vector.
S113, decoding the first feature vector by using a decoder of the first Transformer model to obtain a first future track feature sequence; preferably, the first future trajectory feature sequence is obtained by:
T_1 = Decoder(h_1; W_DX);

where Decoder denotes the decoder of the first Transformer model, W_DX is the corresponding parameter, and T_1 is the first future trajectory feature sequence with length t_pred.

The decoder of the first Transformer model is initialized by a learnable embedding and performs trajectory feature learning.
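The precoding step X' = XE + E_pos can be sketched as follows. This is a hedged illustration: the patent does not specify the embedding or the position-encoding type, so a linear embedding (here a random matrix standing in for a learned one) and the standard sinusoidal encoding are assumptions:

```python
import numpy as np

def sinusoidal_position_encoding(seq_len, d_model):
    """Standard sinusoidal E_pos; the patent does not fix the encoding type,
    so this particular choice is an assumption."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

t_obs, d_in, d_model = 8, 6, 32
X = np.random.randn(t_obs, d_in)          # one observation trajectory (equation (1))
E = np.random.randn(d_in, d_model) * 0.1  # embedding matrix (learned in practice)

X_prime = X @ E + sinusoidal_position_encoding(t_obs, d_model)  # X' = XE + E_pos
```

X_prime is then what the first Transformer encoder consumes.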
Fig. 3 shows a flow of a method for acquiring a first future trajectory feature sequence according to an embodiment of the present disclosure.
For the vehicle trajectory prediction method S100 of each of the above embodiments, preferably, S120, acquiring the observed image feature of the target vehicle based on the semantic map centered on the target vehicle to generate the feature map, includes:
Feature extraction is performed on the semantic map using a CNN-based backbone network (backbone) to acquire the observed image features of the target vehicle and generate a feature map.

The backbone network preferably uses ResNet18, and the size of the feature map is (D_img, h, w), where D_img denotes the number of channels and h and w denote the height and width of the feature map.
In some embodiments of the present disclosure, steps S121 to S123 described in Chinese patent application CN202111249188X may also be adopted to acquire the observation image features of the target vehicle based on the observation trajectory sequence of the target vehicle to generate the feature map, which is not described in detail in this disclosure.
For the vehicle trajectory prediction method S100 of each of the above embodiments, after obtaining the feature map, the present disclosure obtains a second future trajectory feature sequence of the target vehicle based on the feature map in S130, preferably, including:
S131, dividing the feature map into a plurality of feature subgraphs to obtain a feature subgraph sequence;

the number of feature subgraphs is h × w, and the dimension of each feature subgraph is D_img, so the dimension of the feature subgraph sequence P is (hw, D_img).
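The split in S131 is a pure reshape of the backbone output; a minimal sketch (the sizes D_img = 64, h = w = 7 are invented for illustration):

```python
import numpy as np

D_img, h, w = 64, 7, 7
feature_map = np.random.randn(D_img, h, w)   # backbone output of size (D_img, h, w)

# S131: split into h*w feature subgraphs -> sequence P of shape (hw, D_img),
# one D_img-dimensional subgraph per spatial location
P = feature_map.reshape(D_img, h * w).T
```

Each row of P corresponds to one spatial cell of the feature map, which is what the second Transformer encoder treats as a token.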
S132, carrying out position coding processing on the characteristic subgraph sequence to obtain pre-coding characteristics; preferably, the precoding characteristic P' is obtained by:
P' = PE + E_pos;

where E denotes the embedding processing (Embedding) and E_pos denotes the position encoding.
S133, coding the pre-coding features by using a coder of a second Transformer model to obtain a second feature vector; preferably, the second feature vector is obtained by:
h_2 = Encoder(P'; W_I);

where Encoder denotes the encoder of the second Transformer model and W_I is the corresponding parameter.
S134, decoding the second feature vector by using a decoder of the second transform model to obtain a second future track feature sequence; preferably, the second future trajectory feature sequence is obtained by:
T_2 = Decoder(h_2; W_DI);

where Decoder denotes the decoder of the second Transformer model, W_DI is the corresponding parameter, and T_2 has length t_pred.

The decoder of the second Transformer model is a decoder that is initialized by a learnable embedding and performs trajectory feature learning.
Fig. 4 shows a flow of a second future trajectory feature sequence acquisition method of an embodiment of the present disclosure.
FIG. 5 shows a flow of a method for obtaining multi-modal trajectory features according to an embodiment of the present disclosure.
Referring to fig. 5, in S130, obtaining at least one track multi-modal feature of the target vehicle based on the feature map (the number of track multi-modal features is K, and the number of corresponding predicted tracks is K) includes:
s135, decoding the second feature vector by using a decoder of the third transform model to obtain at least one track multi-modal feature of the target vehicle; preferably, the trajectory multi-modal features are obtained by:
M = Decoder(h_2; W_DM);

where Decoder denotes the decoder of the third Transformer model, W_DM is the corresponding parameter, and the length of M equals the number K of predicted trajectories.

The decoder of the third Transformer model is initialized by a learnable embedding and performs multi-modal feature learning.
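For intuition only, the third decoder can be approximated by K learnable query vectors cross-attending over the encoded feature-map memory h_2 (a DETR-style reading of the text; the single-head NumPy attention and the random stand-in weights are assumptions, not the patent's implementation):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

hw, d_model, K_modes = 49, 32, 6
h2 = np.random.randn(hw, d_model)            # encoder memory from the feature map
queries = np.random.randn(K_modes, d_model)  # learnable embeddings (DETR-style init)

# One trajectory multi-modal feature per mode: each query attends to its own
# region of the semantic-map features
M = attention(queries, h2, h2)
```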
For the vehicle trajectory prediction method S100 of each of the above embodiments, in S140, the first future trajectory feature sequence and the second future trajectory feature sequence are respectively inverted to obtain a first future inverted trajectory feature sequence and a second future inverted trajectory feature sequence. Preferably, the sequences obtained by inverting the first future trajectory feature sequence T_1 and the second future trajectory feature sequence T_2 are T_1^rev and T_2^rev, and T_1^rev and T_2^rev are spliced (Concat) to obtain the future trajectory fusion feature sequence T^rev.
In some embodiments of the present disclosure, S160, performing a second fusion process on each future trajectory fusion feature in the future trajectory fusion feature sequence T^rev and the at least one trajectory multi-modal feature to obtain at least one final future trajectory feature for generating at least one final future predicted trajectory, includes:

splicing each trajectory multi-modal feature with each future trajectory fusion feature in the future trajectory fusion feature sequence T^rev to obtain a final future trajectory feature based on each trajectory multi-modal feature.

Each trajectory multi-modal feature corresponds to one trajectory; to align with the future trajectory length, each trajectory multi-modal feature is repeated t_pred times and spliced with the future trajectory fusion feature sequence T^rev to obtain the final future trajectory feature T.
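The inversion (S140), splicing (S150), and multi-modal alignment (S160) steps can be sketched directly with array operations (sizes invented; feature contents random):

```python
import numpy as np

t_pred, d, K_modes = 12, 32, 6
T1 = np.random.randn(t_pred, d)   # first future trajectory feature sequence
T2 = np.random.randn(t_pred, d)   # second future trajectory feature sequence
M = np.random.randn(K_modes, d)   # K trajectory multi-modal features

# S140: invert each sequence along the time axis
T1_rev, T2_rev = T1[::-1], T2[::-1]

# S150: Concat the two inverted sequences feature-wise -> T^rev of shape (t_pred, 2d)
T_rev = np.concatenate([T1_rev, T2_rev], axis=-1)

# S160: repeat each multi-modal feature t_pred times and splice with T^rev,
# giving one final future trajectory feature T per mode
M_rep = np.repeat(M[:, None, :], t_pred, axis=1)                       # (K, t_pred, d)
T = np.concatenate(
    [np.broadcast_to(T_rev, (K_modes, t_pred, 2 * d)), M_rep], axis=-1
)                                                                      # (K, t_pred, 3d)
```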
In some embodiments of the present disclosure, preferably, the S160, performing a second fusion process on each future trajectory fusion feature in the future trajectory fusion feature sequence and at least one trajectory multi-modal feature to obtain at least one final future trajectory feature for generating at least one final future predicted trajectory, further includes:
Each final future trajectory feature T is processed based on a self-attention mechanism (Self-Attention) to obtain a processed feature T', and the final future predicted trajectory is then obtained from each final future trajectory feature through a multi-layer perceptron.

That is, preferably, the present disclosure applies self-attention to the final future trajectory feature T to enhance the fusion; the self-attention is expressed by the following formula:

Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V;

where Q, K, and V denote the Query, Key, and Value vectors, respectively, and d_k is the dimension of the K vector. For the final future trajectory feature T, Q = K = V = T, so the attention expression is Attention(T, T, T).

In the present disclosure, the final future predicted trajectory is obtained through the multi-layer perceptron:

Ŷ = δ(T'; W_1);

where δ denotes the multi-layer perceptron and W_1 denotes the corresponding parameters of the multi-layer perceptron.
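A sketch of the self-attention fusion and the MLP head (the one-hidden-layer ReLU perceptron standing in for δ(·; W_1) is an assumption; the patent only says "multi-layer perceptron"):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(T):
    """Attention(T, T, T) = softmax(T T^T / sqrt(d_k)) T, with Q = K = V = T."""
    d_k = T.shape[-1]
    return softmax(T @ T.T / np.sqrt(d_k)) @ T

t_pred, d = 12, 16
T = np.random.randn(t_pred, d)       # one final future trajectory feature
T_prime = self_attention(T)          # fusion-enhanced feature T'

# delta(T'; W_1): a one-hidden-layer perceptron mapping features to (x, y) points
W1_hidden = np.random.randn(d, 32) * 0.1
W1_out = np.random.randn(32, 2) * 0.1
Y_hat = np.maximum(T_prime @ W1_hidden, 0.0) @ W1_out   # (t_pred, 2) predicted points
```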
For the vehicle trajectory prediction method S100 of each of the above embodiments, preferably, S170, the final future predicted trajectory is corrected based on the lane lines in the semantic map to obtain a corrected trajectory, and fig. 6 shows a flow of a method of correcting the final future predicted trajectory according to an embodiment of the present disclosure.
The method for correcting the final future predicted trajectory according to the embodiment includes:
S171, for each final future predicted trajectory, obtaining the lane line closest to the future predicted trajectory;

preferably, for the trajectory Ŷ_i^k, at each time t the lane segment L_t closest to the coordinate (x̂_i^t, ŷ_i^t) is found, yielding the lane segment sequence L = {L_t | t = t_obs + 1, ..., t_obs + t_pred}.
S172, carrying out coding processing based on GRU on the final future prediction track to obtain a final future prediction track coding sequence; carrying out coding processing based on GRU on the lane line closest to the final future predicted track to obtain a lane line coding sequence, and carrying out position coding on the lane line coding sequence to obtain a final lane line coding sequence;
preferably, the predicted trajectory is encoded with a GRU to obtain a final future predicted trajectory encoding sequence, i.e. the predicted trajectory encoding sequence
Figure BDA0003639260960000137
Encoding each lane segment of the lane line sequence L by using GRUs to obtain a final lane line encoding sequence, which is shown as the following formula:
L′={GRU(Lt)|t=tobs+1,...,tobs+tpred)};
the method can acquire lane line information from a high-precision map (electronic map), cut lane lines into lane line segments with fixed length, wherein the length of each lane line segment is L meters, the lane line segments are discretized into M points at intervals of d meters, and M lane line segments closest to a vehicle are selected as the lane lines closest to each final future predicted track.
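The nearest-lane-segment lookup of S171 can be sketched as a brute-force distance search over discretized segments (the segment layout and sizes are invented for illustration):

```python
import numpy as np

def nearest_lane_segments(traj_xy, segments):
    """For each trajectory point pick the lane segment whose points lie closest.
    traj_xy: (t_pred, 2); segments: (S, M, 2) -- S segments of M discretized points."""
    # distance from every trajectory point to every point of every segment
    d = np.linalg.norm(traj_xy[:, None, None, :] - segments[None, :, :, :], axis=-1)
    seg_dist = d.min(axis=2)          # (t_pred, S): closest approach per segment
    idx = seg_dist.argmin(axis=1)     # index of nearest segment per time step
    return segments[idx]              # (t_pred, M, 2) sequence of lane segments L_t

traj = np.array([[0.0, 0.0], [0.0, 2.0]])   # two predicted points near x = 0
segments = np.stack([
    np.stack([np.full(5, -1.0), np.linspace(0, 4, 5)], axis=1),  # segment near x = -1
    np.stack([np.full(5, 10.0), np.linspace(0, 4, 5)], axis=1),  # segment far at x = 10
])
L_seq = nearest_lane_segments(traj, segments)
```

Both trajectory points resolve to the nearby segment, which is the sequence L that the GRU of S172 then encodes.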
S173, processing the final future prediction track coding sequence and the final lane line coding sequence based on a Multi-head Attention mechanism (Multi-head Attention) to obtain a corrected track characteristic;
preferably, the track is used as Query, the lane line is used as Key and Value, and Multi-head Attention is performed on the track and the lane line. The Mulit-head Attention expression is as follows:
MultiHead(Q,K,V)=Concat(head1,...,headh)WO
headi=Attention(QWi Q,KWi K,VWi V);
wherein h represents the number of attribute heads, Q, K and V respectively represent query, key and value vectors, and Wi Q,Wi K,Wi V,Wi OA linear mapping is represented. The track characteristics obtained after the Mulit-head orientation are
Figure BDA0003639260960000141
I.e. correct the trajectoryAnd (5) carrying out characterization.
S174, decoding the corrected track characteristics based on the multilayer perceptron to obtain a corrected track;
preferably, the corrected trajectory is obtained based on the following formula:

Ŷ_corr = δ(F; W_3);

where F denotes the corrected trajectory features, δ denotes the multi-layer perceptron, and W_3 denotes the corresponding parameters of the multi-layer perceptron.
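The multi-head attention correction of S173, with the GRU-encoded trajectory as Query and the GRU-encoded lane segments as Key/Value, can be sketched as follows (random matrices stand in for the learned projections W_i^Q, W_i^K, W_i^V, W^O, and pre-encoded random vectors stand in for the GRU outputs):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_attention(Q, K, V, n_heads, rng):
    """Concat(head_1, ..., head_h) W^O with per-head linear maps."""
    d = Q.shape[-1]
    d_h = d // n_heads
    heads = []
    for _ in range(n_heads):
        Wq, Wk, Wv = (rng.standard_normal((d, d_h)) * 0.1 for _ in range(3))
        q, k, v = Q @ Wq, K @ Wk, V @ Wv
        heads.append(softmax(q @ k.T / np.sqrt(d_h)) @ v)
    W_O = rng.standard_normal((d, d)) * 0.1
    return np.concatenate(heads, axis=-1) @ W_O

rng = np.random.default_rng(0)
t_pred, d = 12, 16
traj_enc = rng.standard_normal((t_pred, d))   # GRU-encoded predicted trajectory (Query)
lane_enc = rng.standard_normal((t_pred, d))   # GRU-encoded lane segments (Key/Value)

F = multi_head_attention(traj_enc, lane_enc, lane_enc, n_heads=4, rng=rng)
```

F is the corrected trajectory feature that the multi-layer perceptron of S174 decodes into the corrected trajectory.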
Fig. 7 shows a schematic network model structure diagram of a vehicle trajectory prediction method implementing the present disclosure according to an embodiment of the present disclosure.
Referring to fig. 7, the network model (i.e., the vehicle trajectory prediction apparatus) for executing the vehicle trajectory prediction method of the present disclosure generates a confidence when acquiring the trajectory multi-modal features of the target vehicle during the training process.
Preferably, the multi-modal features M are decoded with a multi-layer perceptron (MLP), resulting in a confidence C. Specifically, C = δ(M; W_2), where δ denotes the multi-layer perceptron and W_2 denotes the corresponding parameters of the multi-layer perceptron.
In the training process, further, a loss function calculating step S180 is further included, as shown in fig. 8.
S181, calculating the probability loss. First, a softmax is applied to the score C_i, converting it into the probability distribution P_i:

P_i = softmax(C_i);

where C_i and P_i denote, respectively, the confidences and the probability distribution corresponding to the K trajectories generated by the i-th data item.

Further, a softmax is applied to the error between each predicted trajectory and the ground-truth trajectory, as shown in the following formula:

λ(Y_i) = softmax({−D(Ŷ_i^k, Y_i) | k = 1, ..., K});

where D(·,·) denotes the L2 loss of the predicted trajectory; the negative sign is applied so that small errors correspond to large probabilities.

Finally, the KL divergence between the probability distribution P_i and the desired distribution λ(Y_i) is calculated, as shown in the following formula:

L_conf = Σ_{i=1}^{N} KL(λ(Y_i) ‖ P_i).
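The probability (confidence) loss of S181 for a single sample can be sketched as two softmaxes and a KL divergence (the example confidences and errors are invented):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def confidence_loss(C_i, errors_i):
    """KL divergence between the target distribution lambda(Y_i), a softmax over
    the negated per-mode L2 errors, and the predicted distribution P_i = softmax(C_i)."""
    P = softmax(C_i)
    lam = softmax(-errors_i)     # small error -> large target probability
    return float(np.sum(lam * np.log(lam / P)))

C_i = np.array([2.0, 0.5, -1.0])       # confidences of K = 3 modes
errors_i = np.array([0.2, 3.0, 5.0])   # L2 error of each mode vs ground truth
loss = confidence_loss(C_i, errors_i)
```

When the predicted confidences already match the error-derived target distribution, the loss is zero; otherwise it is strictly positive.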
S182, the training process of the model is divided into two stages. In the first stage, the fine-tuning module is fixed and only the main body is trained, with the trajectory loss of S183. In the second stage, the main body is fixed and only the fine-tuning module is trained, with the trajectory loss of S184.
S183, calculating the diversity loss function of the predicted trajectories. The diversity loss is the L2 loss of whichever of the K predicted trajectories has the smallest L2 loss with respect to the ground-truth trajectory. When training the main body, the predicted trajectories Ŷ_i^k of the main body are selected to calculate the diversity loss, as shown in the following formula:

L_reg = Σ_{i=1}^{N} min_{k=1,...,K} D(Ŷ_i^k, Y_i).
S184, when training the fine-tuning module, the corrected trajectories predicted by the fine-tuning module are selected to calculate the diversity loss, in the same form as S183 with the corrected trajectories substituted for the main-body predictions:

L_reg = Σ_{i=1}^{N} min_{k=1,...,K} D(Ŷ_corr,i^k, Y_i).
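The winner-takes-all diversity loss can be sketched as follows (a per-point L2 distance averaged over time is assumed for D(·,·); the patent does not fix the exact reduction):

```python
import numpy as np

def diversity_loss(Y_hat, Y):
    """Winner-takes-all: for each sample keep only the L2 loss of the best of
    the K predicted trajectories, then sum over the N samples.
    Y_hat: (N, K, t_pred, 2); Y: (N, t_pred, 2)."""
    per_mode = np.linalg.norm(Y_hat - Y[:, None], axis=-1).mean(axis=-1)  # (N, K)
    return float(per_mode.min(axis=1).sum())

Y = np.zeros((1, 4, 2))                       # one sample, 4 future points
Y_hat = np.stack([Y[0] + 1.0, Y[0]])[None]    # K = 2 modes: one off by 1 m, one exact
loss = diversity_loss(Y_hat, Y)               # best mode is exact, so the loss is 0
```

Only the best mode per sample is penalized, which is what lets the other modes stay diverse.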
the loss function comprises two parts, namely a confidence loss function and a diversity loss function. The total loss is expressed by the following formula:
Loss=σ1Lreg2Lconf
where { σ12Is a weight parameter.
The present disclosure also provides a vehicle trajectory prediction device, a vehicle trajectory prediction device 1000 according to an embodiment of the present disclosure, including:
the first future track feature sequence acquisition module 1002, the first future track feature sequence acquisition module 1002 acquires a first future track feature sequence of the target vehicle based on the acquired observation track sequence of the target vehicle;
a feature map acquisition module 1004, wherein the feature map acquisition module 1004 acquires the observation image features of the target vehicle based on the semantic map with the target vehicle as the center to generate a feature map;
a second future trajectory feature sequence acquisition module 1006, wherein the second future trajectory feature sequence acquisition module 1006 acquires a second future trajectory feature sequence of the target vehicle based on the feature map;
the track multi-modal feature acquisition module 1008, wherein the track multi-modal feature acquisition module 1008 acquires at least one track multi-modal feature of the target vehicle based on the feature map;
the first inversion processing module 1010, the first inversion processing module 1010 performs sequence inversion on the first future trajectory feature sequence to obtain a first future inverted trajectory feature sequence;
a second inversion processing module 1012, wherein the second inversion processing module 1012 performs sequence inversion on the second future trajectory feature sequence to obtain a second future inverted trajectory feature sequence;
the first fusion processing module 1014, the first fusion processing module 1014 performs the first fusion processing on the first future inverted trajectory feature sequence and the second future inverted trajectory feature sequence to obtain a future trajectory fusion feature sequence;
the second fusion processing module 1016 is used for performing second fusion processing on each future track fusion feature in the future track fusion feature sequence and at least one track multi-modal feature by the second fusion processing module 1016 to obtain at least one final future track feature;
a final future predicted trajectory generation module 1018, the final future predicted trajectory generation module 1018 generating at least one final future predicted trajectory based on the final future trajectory features.
The vehicle trajectory prediction apparatus of the present disclosure may be implemented based on a computer software program architecture, or may be implemented by adopting a hardware implementation manner of a processing system, referring to fig. 9.
According to a preferred embodiment of the present disclosure, referring to fig. 10, the vehicle track prediction apparatus 1000 of the present disclosure further includes:
and a correction module 1020, wherein the correction module 1020 corrects the final future predicted trajectory based on the lane lines in the semantic map to obtain a corrected trajectory.
In some embodiments of the present disclosure, the vehicle trajectory prediction apparatus 1000 of the present disclosure, preferably, the first future trajectory feature sequence acquisition module 1002 includes:
the embedding processing and position coding processing module is used for embedding the observation track sequence of the target vehicle and carrying out position coding processing to generate precoding characteristics;
the first Transformer model encoder is used for encoding the pre-encoding characteristics to obtain a first characteristic vector;
and the first Transformer model decoder is used for decoding the first feature vector to obtain a first future track feature sequence.
In some embodiments of the present disclosure, the vehicle trajectory prediction apparatus 1000 of the present disclosure preferably includes the feature map acquisition module 1004:
and the CNN-based backbone network (backbone) module is used for carrying out feature extraction on the semantic map to acquire the observation image features of the target vehicle so as to generate a feature map.
In some embodiments of the present disclosure, the vehicle trajectory prediction apparatus 1000 of the present disclosure, preferably, the second future trajectory feature sequence acquisition module 1006 includes:
the characteristic graph segmentation module segments the characteristic graph into a plurality of characteristic subgraphs to obtain a characteristic subgraph sequence;
the position coding processing module is used for carrying out position coding processing on the characteristic sub-graph sequence to obtain pre-coding characteristics;
the second Transformer model encoder is used for encoding the pre-encoding characteristics to obtain a second characteristic vector;
and the second transform model decoder is used for decoding the second characteristic vector to obtain a second future track characteristic sequence.
In some embodiments of the present disclosure, the vehicle trajectory prediction apparatus 1000 of the present disclosure, preferably, the trajectory multimodal feature acquisition module 1008 includes:
and the third Transformer model decoder is used for decoding the second feature vector to obtain at least one track multi-modal feature of the target vehicle.
In some embodiments of the present disclosure, the vehicle trajectory prediction apparatus 1000 of the present disclosure, preferably the final future predicted trajectory generation module 1018, includes: the system comprises a self-attention mechanism module and a multi-layer perceptron module;
and inputting each final future track characteristic into a Self-Attention mechanism (Self-Attention) module for processing, taking the output of the Self-Attention mechanism module as the input of a multi-layer perceptron module, and obtaining a final future predicted track based on each final future track characteristic through the processing of the multi-layer perceptron.
In some embodiments of the present disclosure, the vehicle trajectory prediction apparatus 1000 of the present disclosure preferably includes the modification module 1020:
the lane line acquisition module acquires a lane line of the closest distance of each final future predicted track from the semantic map;
the first GRU module is used for carrying out encoding processing based on GRU on the final future prediction track to obtain a final future prediction track encoding sequence;
the second GRU module is used for carrying out coding processing based on GRU on the lane line with the shortest distance of each final future predicted track to obtain a lane line coding sequence;
the position coding module is used for carrying out position coding on the lane line coding sequence to obtain a final lane line coding sequence;
a Multi-head Attention mechanism module, wherein the Multi-head Attention mechanism module carries out processing based on a Multi-head Attention mechanism (Multi-head Attention) on the final future prediction track coding sequence and the final lane line coding sequence to obtain a corrected track characteristic;
and the multilayer perceptron module decodes the corrected track characteristics to obtain the corrected track.
As can be seen from the above description of the vehicle trajectory prediction method/apparatus of the present disclosure, the present disclosure is based on a DETR-style improved Transformer, i.e., a Transformer decoder that is initialized with learnable parameters and decodes all objects at once. To address the problem of poor multi-modal input fusion, a high-precision map and the historical trajectory are used to predict the future trajectory separately, and the results are then fused. Meanwhile, in order to introduce an endpoint constraint into the Transformer model, the sequence generated by the decoder is inverted; after inversion, the sequence is predicted from the endpoint backward. To address the poor interpretability of trajectory multi-modality, the present disclosure adopts a Transformer-based multi-modal generator so that different modes attend to different areas of the semantic map, and trajectories are generated based on the different modes. For trajectories that do not satisfy lane line constraints, the present disclosure provides a fine-tuning module that corrects the predicted trajectory toward the nearest lane line using an attention mechanism.
The vehicle trajectory prediction method/apparatus of the present disclosure uses a high-precision map and the historical trajectory to predict the future trajectory separately before fusion, which improves the multi-modal fusion effect. By inverting the trajectory predicted by the Transformer, the endpoint constraint is indirectly introduced into the Transformer, improving prediction accuracy. The Transformer-based multi-modal generator (the decoder of the third Transformer structure) makes different modes attend to different areas of the map, increasing the interpretability and accuracy of the multi-modal predictions. Finally, by providing a universal fine-tuning module that corrects the trajectory toward the nearest lane line through an attention mechanism, the lane line constraint on the trajectory is strengthened.
The vehicle trajectory prediction apparatus of the present disclosure may include corresponding modules that perform each or several of the steps of the flowcharts described above. Thus, each step or several steps in the above-described flow charts may be performed by a respective module, and the apparatus may comprise one or more of these modules. The modules may be one or more hardware modules specifically configured to perform the respective steps, or implemented by a processor configured to perform the respective steps, or stored within a computer-readable medium for implementation by a processor, or by some combination.
The hardware architecture may be implemented using a bus architecture. The bus architecture may include any number of interconnecting buses and bridges depending on the specific application of the hardware and the overall design constraints. The bus 1100 couples various circuits including the one or more processors 1200, the memory 1300, and/or the hardware modules together. The bus 1100 may also connect various other circuits 1400, such as peripherals, voltage regulators, power management circuits, external antennas, and the like.
The bus 1100 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one connecting line is shown, but this does not mean that there is only one bus or one type of bus.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present disclosure includes other implementations in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the implementations of the present disclosure. The processor performs the various methods and processes described above. For example, method embodiments in the present disclosure may be implemented as a software program tangibly embodied in a machine-readable medium, such as a memory. In some embodiments, some or all of the software program may be loaded and/or installed via memory and/or a communication interface. When the software program is loaded into memory and executed by a processor, one or more steps of the method described above may be performed. Alternatively, in other embodiments, the processor may be configured to perform one of the methods described above by any other suitable means (e.g., by means of firmware).
The logic and/or steps represented in the flowcharts or otherwise described herein may be embodied in any readable storage medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions.
For the purposes of this description, a "readable storage medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection (an electronic device) having one or more wires, a portable computer diskette (a magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). The readable storage medium may even be paper or another suitable medium upon which the program is printed, as the program can be captured electronically, for instance via optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and then stored in a memory.
It should be understood that portions of the present disclosure may be implemented in hardware, software, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be implemented by a program instructing the relevant hardware; the program may be stored in a readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing module, or each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in hardware or as a software functional module. If the integrated module is implemented as a software functional module and sold or used as a standalone product, it may also be stored in a readable storage medium. The storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
The present disclosure also provides an electronic device, comprising: a memory storing execution instructions; and a processor executing the execution instructions stored by the memory such that the processor performs the vehicle trajectory prediction method S100 of any one of the embodiments of the present disclosure.
The present disclosure also provides a readable storage medium having stored therein executable instructions for implementing the vehicle trajectory prediction method S100 of any one of the embodiments of the present disclosure when executed by a processor.
The present disclosure also provides a computer program product comprising computer programs/instructions which, when executed by a processor, implement the vehicle trajectory prediction method S100 of any of the embodiments of the present disclosure.
In the description herein, reference to the terms "one embodiment/implementation", "some embodiments/implementations", "an example", "a specific example", "some examples", and the like means that a particular feature, structure, material, or characteristic described in connection with the embodiment/implementation or example is included in at least one embodiment/implementation or example of the present application. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment/implementation or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments/implementations or examples. In addition, those skilled in the art may combine the different embodiments/implementations or examples described in this specification, and the features thereof, provided they do not conflict.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
It will be understood by those skilled in the art that the foregoing embodiments are merely for clarity of illustration of the disclosure and are not intended to limit the scope of the disclosure. Other variations or modifications may occur to those skilled in the art, based on the foregoing disclosure, and are still within the scope of the present disclosure.

Claims (10)

1. A vehicle trajectory prediction method, characterized by comprising:
S110, acquiring a first future trajectory feature sequence of a target vehicle based on an acquired observation trajectory sequence of the target vehicle;
S120, acquiring observation image features of the target vehicle based on a semantic map centered on the target vehicle, to generate a feature map;
S130, acquiring a second future trajectory feature sequence of the target vehicle based on the feature map, and acquiring at least one trajectory multi-modal feature of the target vehicle based on the feature map;
S140, reversing the first future trajectory feature sequence and the second future trajectory feature sequence respectively, to obtain a first future reversed trajectory feature sequence and a second future reversed trajectory feature sequence;
S150, performing first fusion processing on the first future reversed trajectory feature sequence and the second future reversed trajectory feature sequence to obtain a future trajectory fusion feature sequence; and
S160, performing second fusion processing on each future trajectory fusion feature in the future trajectory fusion feature sequence and the at least one trajectory multi-modal feature, to obtain at least one final future trajectory feature for generating at least one final future predicted trajectory.
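Expressed outside the claim language, the reversal-and-fusion steps S140 to S160 can be sketched as follows. The concrete fusion operators used here (elementwise averaging for the first fusion, concatenation for the second) are illustrative assumptions; the claim leaves the exact fusion operations unspecified.

```python
import numpy as np

def fuse_future_features(seq_a, seq_b, modal_feats):
    """Sketch of S140-S160: reverse both future feature sequences,
    fuse them step-wise, then combine each fused feature with every
    trajectory multi-modal feature."""
    # S140: reverse each feature sequence along the time axis
    rev_a = seq_a[::-1]
    rev_b = seq_b[::-1]
    # S150: first fusion -- elementwise mean is an assumed operator
    fused_seq = [(a + b) / 2.0 for a, b in zip(rev_a, rev_b)]
    # S160: second fusion -- concatenate each fused step with each
    # multi-modal feature, yielding one final sequence per modality
    finals = []
    for m in modal_feats:
        finals.append([np.concatenate([f, m]) for f in fused_seq])
    return finals

T, D, K = 4, 8, 3  # horizon, feature dim, number of modalities
seq_a = [np.random.randn(D) for _ in range(T)]
seq_b = [np.random.randn(D) for _ in range(T)]
modal = [np.random.randn(D) for _ in range(K)]
out = fuse_future_features(seq_a, seq_b, modal)
print(len(out), len(out[0]), out[0][0].shape)  # K sequences of T features, each 2*D wide
```

One multi-modal feature thus produces one candidate future trajectory feature sequence, which is what allows K distinct predicted trajectories downstream.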
2. The vehicle trajectory prediction method according to claim 1, characterized by further comprising:
S170, correcting the final future predicted trajectory based on lane lines in the semantic map to obtain a corrected trajectory.
3. The vehicle trajectory prediction method according to claim 1 or 2, wherein step S110 of acquiring a first future trajectory feature sequence of the target vehicle based on the acquired observation trajectory sequence of the target vehicle comprises:
S111, performing embedding processing and position encoding on the observation trajectory sequence of the target vehicle to generate pre-encoding features;
S112, encoding the pre-encoding features with an encoder of a first Transformer model to obtain a first feature vector; and
S113, decoding the first feature vector with a decoder of the first Transformer model to obtain the first future trajectory feature sequence.
4. The vehicle trajectory prediction method according to claim 1 or 2, wherein step S120 of acquiring observation image features of the target vehicle based on the semantic map centered on the target vehicle to generate a feature map comprises:
performing feature extraction on the semantic map with a CNN-based backbone network to acquire the observation image features of the target vehicle, so as to generate the feature map.
5. The vehicle trajectory prediction method according to claim 4, wherein step S130 of acquiring a second future trajectory feature sequence of the target vehicle based on the feature map comprises:
S131, splitting the feature map into a plurality of feature sub-maps to obtain a feature sub-map sequence;
S132, performing position encoding on the feature sub-map sequence to obtain pre-encoding features;
S133, encoding the pre-encoding features with an encoder of a second Transformer model to obtain a second feature vector; and
S134, decoding the second feature vector with a decoder of the second Transformer model to obtain the second future trajectory feature sequence.
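The splitting of step S131 resembles the patch sequencing used by vision Transformers; a minimal sketch follows. The patch size is an assumed hyper-parameter, not something the claim fixes.

```python
import numpy as np

def split_feature_map(fmap, patch):
    """S131: split a (C, H, W) feature map into a sequence of
    flattened feature sub-maps (patches), ViT-style."""
    c, h, w = fmap.shape
    assert h % patch == 0 and w % patch == 0
    seq = []
    for r in range(0, h, patch):            # row-major patch order
        for col in range(0, w, patch):
            seq.append(fmap[:, r:r + patch, col:col + patch].reshape(-1))
    return np.stack(seq)                    # (num_patches, C*patch*patch)

fmap = np.random.randn(8, 16, 16)   # e.g. output of the CNN backbone
seq = split_feature_map(fmap, patch=4)
print(seq.shape)  # (16, 128)
```

Each flattened sub-map then receives a position encoding (S132) before entering the second Transformer encoder.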
6. The vehicle trajectory prediction method according to claim 5, wherein step S130 of acquiring at least one trajectory multi-modal feature of the target vehicle based on the feature map comprises:
S135, decoding the second feature vector with a decoder of a third Transformer model to obtain at least one trajectory multi-modal feature of the target vehicle;
preferably, the decoder of the first Transformer model is initialized by learnable embedding and performs trajectory feature learning;
preferably, the decoder of the second Transformer model is initialized by learnable embedding and performs trajectory feature learning;
preferably, the decoder of the third Transformer model is initialized by learnable embedding and performs multi-modal feature learning;
preferably, step S160 of performing second fusion processing on each future trajectory fusion feature in the future trajectory fusion feature sequence and the at least one trajectory multi-modal feature, to obtain at least one final future trajectory feature for generating at least one final future predicted trajectory, comprises:
concatenating each trajectory multi-modal feature with each future trajectory fusion feature in the future trajectory fusion feature sequence, to obtain a final future trajectory feature based on each trajectory multi-modal feature;
preferably, step S160 further comprises:
processing each final future trajectory feature with a self-attention mechanism (Self-Attention), and obtaining, through a multi-layer perceptron, a final future predicted trajectory based on each final future trajectory feature;
preferably, step S170 of correcting the final future predicted trajectory based on the lane lines in the semantic map to obtain a corrected trajectory comprises:
S171, for each final future predicted trajectory, acquiring the lane line closest to the future predicted trajectory;
S172, performing GRU-based encoding on the final future predicted trajectory to obtain a final future predicted trajectory encoding sequence; performing GRU-based encoding on the lane line closest to the final future predicted trajectory to obtain a lane line encoding sequence, and performing position encoding on the lane line encoding sequence to obtain a final lane line encoding sequence;
S173, processing the final future predicted trajectory encoding sequence and the final lane line encoding sequence based on a multi-head attention mechanism (Multi-head Attention) to obtain corrected trajectory features; and
S174, decoding the corrected trajectory features with a multi-layer perceptron to obtain the corrected trajectory.
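The nearest-lane selection of step S171 can be sketched as below. The distance metric (mean of each trajectory point's distance to its nearest lane-line point) is an illustrative assumption; the claim only requires "the lane line closest" without fixing a metric.

```python
import numpy as np

def nearest_lane_line(pred_track, lane_lines):
    """S171: pick the index of the lane line closest to a predicted
    trajectory, both given as (N, 2) arrays of (x, y) points."""
    def dist(track, lane):
        # pairwise distances between track points and lane points,
        # then for each track point keep its nearest lane point
        d = np.linalg.norm(track[:, None, :] - lane[None, :, :], axis=-1)
        return d.min(axis=1).mean()
    scores = [dist(pred_track, lane) for lane in lane_lines]
    return int(np.argmin(scores))

track = np.array([[0.0, 0.0], [1.0, 0.1], [2.0, 0.2]])
lanes = [np.array([[0.0, 5.0], [2.0, 5.0]]),   # distant lane line
         np.array([[0.0, 0.0], [2.0, 0.0]])]   # nearby lane line
print(nearest_lane_line(track, lanes))  # index of the nearer lane
```

The selected lane line is then GRU-encoded alongside the predicted trajectory (S172) before the multi-head-attention correction of S173.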
7. A vehicle trajectory prediction device, characterized by comprising:
a first future trajectory feature sequence acquisition module, which acquires a first future trajectory feature sequence of a target vehicle based on an acquired observation trajectory sequence of the target vehicle;
a feature map acquisition module, which acquires observation image features of the target vehicle based on a semantic map centered on the target vehicle, to generate a feature map;
a second future trajectory feature sequence acquisition module, which acquires a second future trajectory feature sequence of the target vehicle based on the feature map;
a trajectory multi-modal feature acquisition module, which acquires at least one trajectory multi-modal feature of the target vehicle based on the feature map;
a first reversal processing module, which reverses the first future trajectory feature sequence to obtain a first future reversed trajectory feature sequence;
a second reversal processing module, which reverses the second future trajectory feature sequence to obtain a second future reversed trajectory feature sequence;
a first fusion processing module, which performs first fusion processing on the first future reversed trajectory feature sequence and the second future reversed trajectory feature sequence to obtain a future trajectory fusion feature sequence;
a second fusion processing module, which performs second fusion processing on each future trajectory fusion feature in the future trajectory fusion feature sequence and the at least one trajectory multi-modal feature to obtain at least one final future trajectory feature; and
a final future predicted trajectory generation module, which generates at least one final future predicted trajectory based on the final future trajectory features;
preferably, the device further comprises:
a correction module, which corrects the final future predicted trajectory based on lane lines in the semantic map to obtain a corrected trajectory;
preferably, the first future trajectory feature sequence acquisition module comprises:
an embedding and position encoding module, which performs embedding processing and position encoding on the observation trajectory sequence of the target vehicle to generate pre-encoding features;
a first Transformer model encoder, which encodes the pre-encoding features to obtain a first feature vector; and
a first Transformer model decoder, which decodes the first feature vector to obtain the first future trajectory feature sequence;
preferably, the feature map acquisition module comprises:
a CNN-based backbone network (backbone) module, which performs feature extraction on the semantic map to acquire the observation image features of the target vehicle, so as to generate the feature map;
preferably, the second future trajectory feature sequence acquisition module comprises:
a feature map splitting module, which splits the feature map into a plurality of feature sub-maps to obtain a feature sub-map sequence;
a position encoding module, which performs position encoding on the feature sub-map sequence to obtain pre-encoding features;
a second Transformer model encoder, which encodes the pre-encoding features to obtain a second feature vector; and
a second Transformer model decoder, which decodes the second feature vector to obtain the second future trajectory feature sequence;
preferably, the trajectory multi-modal feature acquisition module comprises:
a third Transformer model decoder, which decodes the second feature vector to obtain at least one trajectory multi-modal feature of the target vehicle;
preferably, the final future predicted trajectory generation module comprises a self-attention mechanism module and a multi-layer perceptron module;
each final future trajectory feature is input into the self-attention mechanism (Self-Attention) module for processing, the output of the self-attention mechanism module serves as the input of the multi-layer perceptron module, and a final future predicted trajectory based on each final future trajectory feature is obtained through the processing of the multi-layer perceptron module;
preferably, the correction module comprises:
a lane line acquisition module, which acquires, from the semantic map, the lane line closest to each final future predicted trajectory;
a first GRU module, which performs GRU-based encoding on the final future predicted trajectory to obtain a final future predicted trajectory encoding sequence;
a second GRU module, which performs GRU-based encoding on the lane line closest to each final future predicted trajectory to obtain a lane line encoding sequence;
a position encoding module, which performs position encoding on the lane line encoding sequence to obtain a final lane line encoding sequence;
a multi-head attention mechanism module, which processes the final future predicted trajectory encoding sequence and the final lane line encoding sequence based on a multi-head attention mechanism (Multi-head Attention) to obtain corrected trajectory features; and
a multi-layer perceptron module, which decodes the corrected trajectory features to obtain the corrected trajectory.
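The core of the correction module, where the trajectory encoding sequence queries the lane-line encoding sequence, can be sketched as scaled dot-product cross-attention. A single attention head is shown for brevity (the device uses multi-head attention), and the random feature arrays stand in for the GRU outputs.

```python
import numpy as np

def cross_attention(q, k, v):
    """Single-head scaled dot-product cross-attention: the trajectory
    code sequence (queries) attends to the lane-line code sequence
    (keys/values), as in the multi-head attention mechanism module."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                       # (Tq, Tk)
    # numerically stable row-wise softmax over the key axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                                  # (Tq, D)

Tq, Tk, D = 5, 7, 16
traj_codes = np.random.randn(Tq, D)   # stand-in: GRU-encoded predicted track
lane_codes = np.random.randn(Tk, D)   # stand-in: GRU + position-encoded lane line
corrected = cross_attention(traj_codes, lane_codes, lane_codes)
print(corrected.shape)  # (5, 16)
```

Each corrected trajectory feature is a lane-aware mixture of lane-line codes, which the multi-layer perceptron module then decodes into the corrected trajectory.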
8. An electronic device, comprising:
a memory storing execution instructions; and
a processor executing execution instructions stored by the memory to cause the processor to perform the vehicle trajectory prediction method of any one of claims 1 to 6.
9. A readable storage medium having stored therein executable instructions for implementing the vehicle trajectory prediction method of any one of claims 1 to 6 when executed by a processor.
10. A computer program product comprising computer programs/instructions, characterized in that the computer programs/instructions, when executed by a processor, implement the vehicle trajectory prediction method of any one of claims 1 to 6.
CN202210515263.0A 2022-05-11 2022-05-11 Vehicle track prediction method and device, electronic equipment and storage medium Pending CN114742317A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210515263.0A CN114742317A (en) 2022-05-11 2022-05-11 Vehicle track prediction method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210515263.0A CN114742317A (en) 2022-05-11 2022-05-11 Vehicle track prediction method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114742317A true CN114742317A (en) 2022-07-12

Family

ID=82285580

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210515263.0A Pending CN114742317A (en) 2022-05-11 2022-05-11 Vehicle track prediction method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114742317A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117132002A (en) * 2023-10-26 2023-11-28 深圳前海中电慧安科技有限公司 Multi-mode space-time track prediction method, device, equipment and medium



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination