CN114742317A - Vehicle track prediction method and device, electronic equipment and storage medium - Google Patents

Vehicle track prediction method and device, electronic equipment and storage medium

Info

Publication number
CN114742317A
CN114742317A (application number CN202210515263.0A)
Authority
CN
China
Prior art keywords
feature
track
future
sequence
trajectory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210515263.0A
Other languages
Chinese (zh)
Inventor
林华东
范圣印
李雪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Yihang Yuanzhi Intelligent Technology Co ltd
Original Assignee
Suzhou Yihang Yuanzhi Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Yihang Yuanzhi Intelligent Technology Co., Ltd.
Priority claimed from application CN202210515263.0A
Publication of CN114742317A
Current legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/25: Fusion techniques
    • G06F18/253: Fusion techniques of extracted features
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G06N3/08: Learning methods

Abstract

The present disclosure provides a vehicle trajectory prediction method, including: acquiring a first future track feature sequence of a target vehicle; acquiring observed image features of the target vehicle to generate a feature map; acquiring a second future track feature sequence of the target vehicle based on the feature map, and acquiring track multi-modal features of the target vehicle based on the feature map; reversing the two sequences to obtain a first future reversed track feature sequence and a second future reversed track feature sequence; performing first fusion processing on the first and second future reversed track feature sequences to obtain a future track fusion feature sequence; performing second fusion processing on each future track fusion feature in the future track fusion feature sequence and the track multi-modal features to obtain final future track features for generating a final future predicted track; and correcting the final future predicted track based on the lane lines in the semantic map to obtain a corrected track. The disclosure also provides a vehicle track prediction device and an electronic device.

Description

Vehicle track prediction method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of computer vision and automatic driving technologies, and in particular, to a vehicle trajectory prediction method and apparatus, an electronic device, and a storage medium.
Background
While a vehicle is driving, a human driver can make correct decisions by judging the motion states of surrounding vehicles, thereby avoiding traffic accidents. It is difficult, however, for an automatic driving system to make reasonable decisions by merely detecting and tracking vehicles (i.e., vehicles in the vicinity of the host vehicle).
For example, if a vehicle ahead is detected to have stopped and traffic congestion is increasing, the wrong choice of continuing forward may result in a collision. Reasonably predicting the future tracks of surrounding vehicles can therefore improve the safety and smoothness of the traffic system. Vehicle track prediction nevertheless remains a great challenge: the topology of roads, traffic signs and signal lights, and the interactions between vehicles and surrounding agents all affect the prediction.
The technical scheme for predicting the vehicle track in the prior art is as follows:
scheme 1: the WACV paper "Uncertainty-aware Short-term Motion Prediction of Traffic indicators for Autonomous Driving" in 2020 proposes the use of a rasterized high-precision map, i.e., a grid map, to provide more comprehensive and fine map information. The model takes the state (position, speed and acceleration) of the vehicle at a certain moment as input, and then predicts the future track by combining with the grid map processed by the convolutional neural network.
Scheme 2: the CVPR paper "Multimodal Motion Prediction with Stacked transformations" in 2021 performs trajectory Prediction based on Stacked transformations. And respectively extracting track information, high-precision map information and interaction information by using three stacked transformers, and generating a plurality of suggested track characteristics. The Decoder of the first transform is initialized with learnable parameters, and the input of each subsequent Decoder is the output of the previous Decoder. Each proposed trajectory feature is then decoded to generate a future trajectory and confidence. Meanwhile, the future prediction track is divided into a plurality of independent areas, and optimization is carried out in different areas. The stacked transformers independently process the semantic map and the track data, and fusion is carried out in a stacking mode, so that mutual interference among different types of features is reduced.
Scheme 3: chinese patent document CN114022847A "an intelligent agent trajectory prediction method, system, device and storage medium" combines a graph neural network and a generative network, expresses the interaction between intelligent agents by the graph neural network, extracts history and future information by a recurrent neural network, and obtains a probability model of a trajectory by the generative network. However, the generative network is less interpretable for trajectory multimodal.
Scheme 4: chinese patent document CN114004406A, "vehicle trajectory prediction method, device, storage medium, and electronic apparatus", proposes a trajectory correction method. Firstly, simultaneously predicting the tracks of a target vehicle and surrounding vehicles to obtain an initial predicted track; and finally, correcting the initial predicted track of the target vehicle based on the track correction amount to obtain a final predicted track. The trajectory correction method based on interaction avoids vehicle collision to a certain extent, and can improve the accuracy of vehicle trajectory prediction in a dense traffic scene. However, the corrected trajectory may not satisfy the constraint of the lane line.
At present, vehicle track prediction faces the following three difficulties, which the papers and patents of the prior art can hardly resolve in full.
First, multi-modal inputs are poorly fused. The input of vehicle track prediction is multi-modal, typically including a high-precision map and historical tracks. A high-precision map can be expressed as an ordinary semantic map, a rasterized semantic map, a vectorized map, and so on. The Transformer is widely applied in the field of sequence prediction, but migrating it to the field of track prediction requires fusing different types of feature data. A conventional Transformer, however, can only process a single type of data, such as a text sequence or image data directly, so the existing Transformer-based fusion methods cannot fully exploit the high-precision map.
Second, track multimodality is poorly interpretable. The predicted track is multi-modal, i.e., the future track has many possible outcomes. The multimodality includes direction multimodality and speed multimodality. In direction, a vehicle at an intersection may turn left, go straight or turn right, or may swerve under the influence of surrounding vehicles. In speed, a vehicle may accelerate, decelerate or brake suddenly due to traffic lights, surrounding vehicles, and so on. Current multi-modal track generation approaches include generative models and two-stage prediction. GANs (generative adversarial networks) and CVAEs (conditional variational autoencoders) are commonly used generative models that produce tracks by sampling different noise; such tracks exhibit a certain multimodality but are poorly interpretable. The two-stage method first predicts the end point and then predicts the track based on the end point; by taking the constraint of the end point into account, it better matches the reaction of a human driver in a real scene. The Transformer has outstanding advantages in the field of sequence prediction, but its inherent framework makes it difficult to take end-point constraints into account.
Third, the predicted track may not satisfy the constraint of the lane line. In a real traffic scene, a driver generally drives along the lane line, so the real track generally satisfies the constraint of the lane line. The multiple tracks predicted by a model, however, have a certain randomness and may not satisfy the constraints of the lane lines, or satisfy them only weakly.
Disclosure of Invention
In order to solve at least one of the above technical problems, the present disclosure provides a vehicle trajectory prediction method, apparatus, electronic device, storage medium, and program product.
According to an aspect of the present disclosure, there is provided a vehicle trajectory prediction method including:
s110, acquiring a first future track characteristic sequence of the target vehicle based on the acquired observation track sequence of the target vehicle;
s120, acquiring the observation image characteristics of the target vehicle based on a semantic map with the target vehicle as the center to generate a characteristic map;
s130, acquiring a second future trajectory feature sequence of the target vehicle based on the feature map, and acquiring at least one trajectory multi-modal feature of the target vehicle based on the feature map;
s140, respectively inverting the first future track feature sequence and the second future track feature sequence to obtain a first future inverted track feature sequence and a second future inverted track feature sequence;
s150, performing first fusion processing on the first future reversal track characteristic sequence and the second future reversal track characteristic sequence to obtain a future track fusion characteristic sequence;
and S160, carrying out second fusion processing on each future track fusion feature in the future track fusion feature sequence and the at least one track multi-modal feature to obtain at least one final future track feature for generating at least one final future predicted track.
The vehicle trajectory prediction method according to at least one embodiment of the present disclosure further includes:
s170, correcting the final future predicted track based on the lane lines in the semantic map to obtain a corrected track.
According to the vehicle track prediction method of at least one embodiment of the present disclosure, S110, acquiring a first future track feature sequence of a target vehicle based on an acquired observation track sequence of the target vehicle, includes:
s111, embedding the observation track sequence of the target vehicle, and performing position coding to generate a pre-coding feature;
s112, coding the pre-coding features by using a coder of a first Transformer model to obtain a first feature vector;
s113, decoding the first feature vector by using a decoder of a first transform model to obtain the first future track feature sequence.
According to the vehicle track prediction method of at least one embodiment of the present disclosure, S120, acquiring an observation image feature of the target vehicle based on a semantic map centering on the target vehicle to generate a feature map, includes:
and performing feature extraction on the semantic map by using a backbone network based on CNN to acquire the observed image features of the target vehicle so as to generate a feature map.
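The backbone of S120 is only identified as "CNN-based"; the sketch below shows the elementary operation such a backbone repeats many times: a 2D "valid" convolution (implemented as cross-correlation, as deep-learning frameworks do) sliding a kernel over the rasterized, target-vehicle-centred semantic map. The 12x12 single-channel raster and the 3x3 edge kernel are illustrative assumptions:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive single-channel 'valid' 2D convolution (cross-correlation) --
    a stand-in for one layer of the CNN backbone that turns the rasterized
    semantic map into a feature map. A real backbone stacks many such
    layers with learned multi-channel kernels and nonlinearities."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# Hypothetical 12x12 ego-centred semantic raster with one vertical lane line.
semantic_map = np.zeros((12, 12))
semantic_map[:, 6] = 1.0
kernel = np.array([[-1., 0., 1.],
                   [-1., 0., 1.],
                   [-1., 0., 1.]])                # responds to vertical edges
feature_map = conv2d_valid(semantic_map, kernel)
print(feature_map.shape)                          # (10, 10)
```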
According to the vehicle track prediction method of at least one embodiment of the present disclosure, in S130, obtaining a second future track feature sequence of the target vehicle based on the feature map includes:
s131, segmenting the feature graph into a plurality of feature subgraphs to obtain a feature subgraph sequence;
s132, carrying out position coding processing on the characteristic subgraph sequence to obtain pre-coding characteristics;
s133, coding the pre-coding features by using a coder of a second Transformer model to obtain a second feature vector;
and S134, decoding the second feature vector by using a decoder of a second transform model to obtain the second future track feature sequence.
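S131's segmentation of the feature map into a feature sub-graph sequence can be sketched as a ViT-style patchify: cut the map into non-overlapping patches and flatten each patch into one token, giving the sequence that the second Transformer's encoder consumes. The 8x8x4 feature map and 4x4 patch size below are assumptions for illustration:

```python
import numpy as np

def split_into_patches(feature_map, patch):
    """S131: cut an (H, W, C) feature map into non-overlapping patch
    'feature sub-graphs' and flatten each into a token vector."""
    h, w, c = feature_map.shape
    assert h % patch == 0 and w % patch == 0
    grid = feature_map.reshape(h // patch, patch, w // patch, patch, c)
    # Reorder so each patch is contiguous, then flatten patch contents.
    tokens = grid.transpose(0, 2, 1, 3, 4).reshape(-1, patch * patch * c)
    return tokens  # (num_patches, patch * patch * C)

feature_map = np.arange(8 * 8 * 4, dtype=float).reshape(8, 8, 4)
tokens = split_into_patches(feature_map, patch=4)
print(tokens.shape)  # (4, 64): a 2x2 grid of patches, 64 values each
```

Position coding (S132) would then add a per-patch position code to each token before the second Transformer encoder.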
According to the vehicle trajectory prediction method of at least one embodiment of the present disclosure, in S130, obtaining at least one trajectory multi-modal feature of the target vehicle based on the feature map includes:
and S135, decoding the second feature vector by using a decoder of a third Transformer model to obtain at least one track multi-modal feature of the target vehicle.
According to the vehicle trajectory prediction method of at least one embodiment of the present disclosure, the decoder of the first Transformer model is a decoder initialized by a learnable embedding and subjected to track feature learning.
According to the vehicle trajectory prediction method of at least one embodiment of the present disclosure, the decoder of the second Transformer model is a decoder initialized by a learnable embedding and subjected to track feature learning.
According to the vehicle trajectory prediction method of at least one embodiment of the present disclosure, the decoder of the third Transformer model is a decoder initialized by a learnable embedding and subjected to multi-modal feature learning.
According to the vehicle trajectory prediction method of at least one embodiment of the present disclosure, S160, performing a second fusion process on each future trajectory fusion feature in the future trajectory fusion feature sequence and the at least one trajectory multi-modal feature to obtain at least one final future trajectory feature for generating at least one final future predicted trajectory, includes:
and splicing each track multi-modal feature with each future track fusion feature in the future track fusion feature sequence to obtain a final future track feature based on each track multi-modal feature.
According to the vehicle trajectory prediction method of at least one embodiment of the present disclosure, S160, performing a second fusion process on each future trajectory fusion feature in the future trajectory fusion feature sequence and the at least one trajectory multi-modal feature to obtain at least one final future trajectory feature for generating at least one final future predicted trajectory, further includes:
processing based on a Self-Attention mechanism (Self-Attention) is carried out on each final future track characteristic, and a final future predicted track based on each final future track characteristic is obtained through a multi-layer perceptron.
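A minimal sketch of the splice-then-decode part of S160 (the self-attention refinement is omitted for brevity): each of the K track multi-modal features is concatenated onto every feature of the future track fusion sequence, and a toy multi-layer perceptron with random stand-in weights maps each result to a 2D point, yielding K candidate trajectories. All dimensions (12 future steps, 16-dimensional features, K = 3) are assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
t_pred, d, k = 12, 16, 3   # assumed: 12 future steps, dim-16 features, K = 3 modes

fusion_seq = rng.normal(size=(t_pred, d))   # future track fusion feature sequence
modal_feats = rng.normal(size=(k, d))       # K track multi-modal features

# S160 splice: pair every multi-modal feature with every fusion feature
# -> (K, T, 2d) final future track features.
final_feats = np.concatenate(
    [np.broadcast_to(modal_feats[:, None, :], (k, t_pred, d)),
     np.broadcast_to(fusion_seq[None, :, :], (k, t_pred, d))], axis=-1)

# Toy MLP head (random weights stand in for the trained perceptron).
W1 = rng.normal(size=(2 * d, 32)) * 0.1
W2 = rng.normal(size=(32, 2)) * 0.1
hidden = np.maximum(final_feats @ W1, 0.0)  # ReLU
predicted_tracks = hidden @ W2              # (K, T, 2): K candidate trajectories
print(predicted_tracks.shape)               # (3, 12, 2)
```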
According to the vehicle track prediction method of at least one embodiment of the present disclosure, S170, correcting the final future predicted track based on a lane line in a semantic map to obtain a corrected track, includes:
s171, for each final future predicted track, obtaining a lane line closest to the future predicted track;
s172, carrying out coding processing based on GRU on the final future prediction track to obtain a final future prediction track coding sequence; carrying out coding processing based on GRU on the lane line closest to the final future predicted track to obtain a lane line coding sequence, and carrying out position coding on the lane line coding sequence to obtain a final lane line coding sequence;
s173, processing the final future prediction track coding sequence and the final lane line coding sequence based on a Multi-head Attention mechanism (Multi-head Attention) to obtain a corrected track characteristic;
and S174, decoding the corrected track characteristics based on the multilayer perceptron to obtain the corrected track.
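Steps S172 and S173 can be sketched as below, with several stated simplifications: a minimal GRU cell (random weights stand in for trained ones, and bias terms are dropped) encodes both the predicted track and its nearest lane line, and single-head cross-attention replaces the multi-head attention, with the track codes as queries and the lane codes as keys and values; the lane-line position coding of S172 and the perceptron decode of S174 are omitted:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_encode(seq, Wz, Uz, Wr, Ur, Wh, Uh, d):
    """Minimal bias-free GRU encoder; returns the hidden state at every
    step, i.e. the 'coding sequence' of S172."""
    h = np.zeros(d)
    outs = []
    for x in seq:
        z = sigmoid(Wz @ x + Uz @ h)             # update gate
        r = sigmoid(Wr @ x + Ur @ h)             # reset gate
        h_tilde = np.tanh(Wh @ x + Uh @ (r * h)) # candidate state
        h = (1 - z) * h + z * h_tilde
        outs.append(h)
    return np.stack(outs)

def cross_attention(q, kv):
    """Single-head cross-attention (simplified stand-in for S173's
    multi-head attention): track codes attend to lane-line codes."""
    d = q.shape[-1]
    scores = q @ kv.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ kv

d_in, d = 2, 8
rng = np.random.default_rng(2)
mk = lambda *s: rng.normal(size=s) * 0.3         # random stand-in weights
Wz, Wr, Wh = mk(d, d_in), mk(d, d_in), mk(d, d_in)
Uz, Ur, Uh = mk(d, d), mk(d, d), mk(d, d)

pred_track = rng.normal(size=(12, 2))            # a final future predicted track
lane_line = rng.normal(size=(20, 2))             # its nearest lane line (S171)
track_codes = gru_encode(pred_track, Wz, Uz, Wr, Ur, Wh, Uh, d)  # S172
lane_codes = gru_encode(lane_line, Wz, Uz, Wr, Ur, Wh, Uh, d)    # S172
corrected_feat = cross_attention(track_codes, lane_codes)        # S173
print(corrected_feat.shape)                      # (12, 8)
```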
According to another aspect of the present disclosure, there is provided a vehicle trajectory prediction apparatus including:
a first future track feature sequence acquisition module, which acquires a first future track feature sequence of the target vehicle based on the acquired observation track sequence of the target vehicle;
a feature map acquisition module that acquires an observation image feature of the target vehicle based on a semantic map centered on the target vehicle to generate a feature map;
a second future trajectory feature sequence acquisition module that acquires a second future trajectory feature sequence of the target vehicle based on the feature map;
a trajectory multi-modal feature acquisition module that acquires at least one trajectory multi-modal feature of the target vehicle based on the feature map;
the first inversion processing module performs sequence inversion on the first future trajectory feature sequence to obtain a first future inversion trajectory feature sequence;
the second inversion processing module performs sequence inversion on the second future trajectory feature sequence to obtain a second future inversion trajectory feature sequence;
the first fusion processing module is used for carrying out first fusion processing on the first future reversal track characteristic sequence and the second future reversal track characteristic sequence to obtain a future track fusion characteristic sequence;
the second fusion processing module is used for carrying out second fusion processing on each future track fusion feature in the future track fusion feature sequence and the at least one track multi-modal feature to obtain at least one final future track feature;
a final future predicted trajectory generation module that generates at least one final future predicted trajectory based on the final future trajectory features.
The vehicle track prediction apparatus according to at least one embodiment of the present disclosure further includes:
a correction module that corrects the final future predicted trajectory based on lane lines in a semantic map to obtain a corrected trajectory.
According to the vehicle trajectory prediction apparatus of at least one embodiment of the present disclosure, the first future trajectory feature sequence acquisition module includes:
the embedded processing and position coding processing module is used for embedding an observation track sequence of the target vehicle and carrying out position coding processing to generate a pre-coding characteristic;
the first Transformer model encoder is used for encoding the pre-encoding characteristics to obtain a first characteristic vector;
and the first Transformer model decoder is used for decoding the first feature vector to obtain the first future track feature sequence.
According to a vehicle trajectory prediction device of at least one embodiment of the present disclosure, the feature map acquisition module includes:
a CNN-based backbone network (backbone) module, wherein the CNN-based backbone network (backbone) module performs feature extraction on the semantic map to acquire the observation image features of the target vehicle so as to generate a feature map.
According to the vehicle trajectory prediction apparatus of at least one embodiment of the present disclosure, the second future trajectory feature sequence acquisition module includes:
the feature map segmentation module segments the feature map into a plurality of feature sub-graphs to obtain a feature sub-graph sequence;
the position coding processing module carries out position coding processing on the feature sub-graph sequence to obtain pre-coding features;
the second Transformer model encoder is used for encoding the pre-encoding characteristics to obtain a second characteristic vector;
and the second Transformer model decoder is used for decoding the second feature vector to obtain the second future track feature sequence.
According to the vehicle trajectory prediction apparatus of at least one embodiment of the present disclosure, the trajectory multimodal feature acquisition module includes:
and the third Transformer model decoder is used for decoding the second feature vector to obtain at least one track multi-modal feature of the target vehicle.
According to a vehicle trajectory prediction device of at least one embodiment of the present disclosure, the final future predicted trajectory generation module includes: the system comprises a self-attention mechanism module and a multilayer perceptron module;
inputting each final future trajectory feature into the Self-Attention mechanism (Self-Attention) module for processing, wherein the output of the Self-Attention mechanism module is used as the input of the multi-layer perceptron module, and a final future predicted trajectory based on each final future trajectory feature is obtained through the processing of the multi-layer perceptron.
According to the vehicle track prediction device of at least one embodiment of the present disclosure, the correction module includes:
the lane line acquisition module acquires a lane line of the closest distance of each final future predicted track from the semantic map;
a first GRU module, wherein the first GRU module performs GRU-based encoding processing on the final future predicted trajectory to obtain a final future predicted trajectory encoding sequence;
a second GRU module, which performs GRU-based coding processing on the lane line of the closest distance of each final future predicted track to obtain a lane line coding sequence;
the position coding module is used for carrying out position coding on the lane line coding sequence to obtain a final lane line coding sequence;
a Multi-head Attention mechanism module, wherein the Multi-head Attention mechanism module performs processing based on a Multi-head Attention mechanism (Multi-head Attention) on the final future prediction track coding sequence and the final lane line coding sequence to obtain a corrected track characteristic;
and the multilayer perceptron module decodes the corrected track characteristics to obtain the corrected track.
According to yet another aspect of the present disclosure, there is provided an electronic device including: a memory storing execution instructions; and a processor executing execution instructions stored by the memory to cause the processor to perform the vehicle trajectory prediction method of any of the embodiments of the present disclosure.
According to yet another aspect of the present disclosure, there is provided a readable storage medium having stored therein execution instructions for implementing the vehicle trajectory prediction method of any one of the embodiments of the present disclosure when executed by a processor.
According to yet another aspect of the present disclosure, there is provided a computer program product comprising computer programs/instructions which, when executed by a processor, implement the vehicle trajectory prediction method of any one of the embodiments of the present disclosure.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate exemplary embodiments of the disclosure and together with the description serve to explain the principles of the disclosure.
Fig. 1 is a flowchart illustrating a vehicle trajectory prediction method according to an embodiment of the present disclosure.
Fig. 2 is a flowchart illustrating a vehicle trajectory prediction method according to still another embodiment of the present disclosure.
Fig. 3 is a flowchart illustrating a method for acquiring a first future trajectory feature sequence according to an embodiment of the disclosure.
Fig. 4 is a flowchart illustrating a second future trajectory feature sequence acquisition method according to an embodiment of the present disclosure.
Fig. 5 is a flowchart illustrating a method for acquiring a trajectory multi-modal feature according to an embodiment of the present disclosure.
FIG. 6 is a flow chart illustrating a method for modifying a final future predicted trajectory according to one embodiment of the present disclosure.
Fig. 7 is a block diagram illustrating a structure of a network model implementing the vehicle trajectory prediction method of the present disclosure according to an embodiment of the present disclosure.
FIG. 8 is a flow chart illustrating the steps of the loss function calculation in the model training process according to an embodiment of the present disclosure.
Fig. 9 is a block diagram schematic structure of a vehicle trajectory prediction device employing a hardware implementation of a processing system according to an embodiment of the present disclosure.
Fig. 10 is a block diagram schematically illustrating the structure of a vehicle trajectory prediction device using a hardware implementation of a processing system according to still another embodiment of the present disclosure.
Description of the reference numerals
1000 vehicle trajectory prediction device
1002 first future track characteristic sequence acquisition module
1004 feature map acquisition module
1006 second future trajectory feature sequence acquisition module
1008 track multi-modal feature acquisition module
1010 first inversion processing module
1012 second inversion processing module
1014 first fusion processing module
1016 second fusion processing Module
1018 final future predicted trajectory generation module
1020 correction module
1100 bus
1200 processor
1300 memory
1400 other circuits
Detailed Description
The present disclosure will be described in further detail with reference to the drawings and embodiments. It is to be understood that the specific embodiments described herein are for purposes of illustration only and are not to be construed as limitations of the present disclosure. It should be further noted that, for the convenience of description, only the portions relevant to the present disclosure are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict. Technical solutions of the present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Unless otherwise indicated, the illustrated exemplary embodiments/examples are to be understood as providing exemplary features of various details of some ways in which the technical concepts of the present disclosure may be practiced. Accordingly, unless otherwise indicated, features of the various embodiments may be additionally combined, separated, interchanged, and/or rearranged without departing from the technical concept of the present disclosure.
The use of cross-hatching and/or shading in the drawings is generally used to clarify the boundaries between adjacent components. As such, unless otherwise noted, the presence or absence of cross-hatching or shading does not convey or indicate any preference or requirement for a particular material, material property, size, proportion, commonality between the illustrated components and/or any other characteristic, attribute, property, etc., of a component. Further, in the drawings, the size and relative sizes of components may be exaggerated for clarity and/or descriptive purposes. While example embodiments may be practiced differently, the specific process sequence may be performed in a different order than that described. For example, two processes described consecutively may be performed substantially simultaneously or in reverse order to that described. In addition, like reference numerals denote like parts.
When an element is referred to as being "on" or "over," "connected to" or "coupled to" another element, it can be directly on, connected or coupled to the other element or intervening elements may be present. However, when an element is referred to as being "directly on," "directly connected to" or "directly coupled to" another element, there are no intervening elements present. For purposes of this disclosure, the term "connected" may refer to physically, electrically, etc., and may or may not have intermediate components.
The terminology used herein is for the purpose of describing particular embodiments and is not intended to be limiting. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Furthermore, when the terms "comprises" and/or "comprising" and variations thereof are used in this specification, the presence of stated features, integers, steps, operations, elements, components and/or groups thereof are stated but does not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof. It is also noted that, as used herein, the terms "substantially," "about," and other similar terms are used as approximate terms and not as degree terms, and as such, are used to interpret inherent deviations in measured values, calculated values, and/or provided values that would be recognized by one of ordinary skill in the art.
The following describes the vehicle trajectory prediction method and the vehicle trajectory prediction apparatus of the present disclosure in detail with reference to fig. 1 to 10.
Fig. 1 is a flowchart illustrating a vehicle trajectory prediction method according to an embodiment of the present disclosure.
Referring to fig. 1, a vehicle trajectory prediction method S100 of the present embodiment includes:
s110, acquiring a first future track characteristic sequence of the target vehicle based on the acquired observation track sequence of the target vehicle;
s120, acquiring the observation image characteristics of the target vehicle based on the semantic map with the target vehicle as the center to generate a characteristic map;
s130, acquiring a second future track feature sequence of the target vehicle based on the feature map, and acquiring at least one track multi-mode feature of the target vehicle based on the feature map (wherein the number of the track multi-mode features is K, the number of corresponding predicted tracks is K, and K is a natural number greater than or equal to 1);
s140, respectively inverting the first future track characteristic sequence and the second future track characteristic sequence to obtain a first future inverted track characteristic sequence and a second future inverted track characteristic sequence;
s150, performing first fusion processing (preferably Concat operation) on the first future inverted trajectory feature sequence and the second future inverted trajectory feature sequence to obtain a future trajectory fusion feature sequence; and
and S160, carrying out second fusion processing on each future track fusion feature in the future track fusion feature sequence and at least one track multi-mode feature to obtain at least one final future track feature for generating at least one final future predicted track.
In the present disclosure, a vehicle observation trajectory sequence and a future ground-truth trajectory sequence are defined, with specific expressions shown in equations (1) and (2), where N denotes the total number of trajectories, X_i denotes the i-th observation trajectory (i.e., the observation trajectory of the i-th target vehicle), t_obs denotes the duration of the observation trajectory, Y_i denotes the i-th future ground-truth trajectory, and t_pred denotes the duration of the future trajectory:

X_i = {(x_i^t, y_i^t, v_i^t, a_i^t, ω_i^t, θ_i^t) | t = 1, ..., t_obs} (1)

Y_i = {(x_i^t, y_i^t) | t = t_obs + 1, ..., t_obs + t_pred} (2)

where x_i^t, y_i^t, v_i^t, a_i^t, ω_i^t, and θ_i^t denote, respectively, the abscissa, ordinate, velocity, acceleration, yaw rate, and yaw angle of trajectory i at time t. X is the set of observation trajectories and Y is the set of future ground-truth trajectories, as shown in equations (3) and (4):

X = {X_1, X_2, ..., X_N} (3)

Y = {Y_1, Y_2, ..., Y_N} (4)
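As a minimal concrete sketch (the array sizes and random values are invented for illustration; only the 6-feature layout and the index conventions follow equations (1)-(4)), the trajectory sets can be held as NumPy arrays:

```python
import numpy as np

N, t_obs, t_pred = 4, 8, 12   # toy sizes; the patent leaves N, t_obs, t_pred free

# X: N observation trajectories, each with t_obs steps of
# (x, y, v, a, yaw_rate, yaw_angle) -- equation (1)
X = np.random.randn(N, t_obs, 6)

# Y: N future ground-truth trajectories of t_pred (x, y) points -- equation (2)
Y = np.random.randn(N, t_pred, 2)

# X_i / Y_i index the i-th trajectory, matching the sets in equations (3) and (4)
X_i, Y_i = X[0], Y[0]
```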
The number of predicted trajectories is defined as K. The expression of the future predicted trajectory is shown in equation (5), where Ŷ_i^k denotes the k-th predicted trajectory of the i-th data item, and Ŷ_i is the set of its K predicted trajectories, as shown in equation (6):

Ŷ_i^k = {(x̂_i^t, ŷ_i^t) | t = t_obs + 1, ..., t_obs + t_pred} (5)

Ŷ_i = {Ŷ_i^1, Ŷ_i^2, ..., Ŷ_i^K} (6)
The semantic map is defined as I; a rasterized semantic map is preferably used. The semantic map is centered on the target vehicle with the vehicle head facing straight ahead, and the actual distances covered on the left, right, front, and rear are a, b, c, and d meters, respectively; illustratively, a = 40, b = 25, c = 25, d = 10. The image size of the map I is (3, H, W), i.e., 3 channels with height H and width W; illustratively, H = W = 200.
The semantic map described in this disclosure is an electronic map: a high-precision map with an accuracy of 20-50 cm that contains various road-surface and spatial properties. Those skilled in the art can select from various semantic maps in the prior art according to the teachings of the present disclosure, all of which fall within the protection scope of the present disclosure.
Fig. 2 shows a vehicle trajectory prediction method S100 of a preferred embodiment of the present disclosure, which includes:
s110, acquiring a first future track characteristic sequence of the target vehicle based on the acquired observation track sequence of the target vehicle;
s120, acquiring the observation image characteristics of the target vehicle based on the semantic map with the target vehicle as the center to generate a characteristic map;
s130, acquiring a second future track feature sequence of the target vehicle based on the feature map, and acquiring at least one track multi-mode feature of the target vehicle based on the feature map;
s140, respectively inverting the first future track characteristic sequence and the second future track characteristic sequence to obtain a first future inverted track characteristic sequence and a second future inverted track characteristic sequence;
s150, performing first fusion processing on the first future inversion track characteristic sequence and the second future inversion track characteristic sequence to obtain a future track fusion characteristic sequence;
s160, performing second fusion processing on each future track fusion feature in the future track fusion feature sequence and at least one track multi-modal feature to obtain at least one final future track feature for generating at least one final future predicted track;
and S170, correcting the final future predicted track based on the lane lines in the semantic map to obtain a corrected track.
For the vehicle trajectory prediction method S100 of each of the above embodiments, preferably, the step S110 of obtaining a first future trajectory feature sequence of the target vehicle based on the obtained observation trajectory sequence of the target vehicle includes:
s111, embedding the observation track sequence of the target vehicle, and performing position coding to generate a pre-coding feature; preferably, the precoding characteristics are obtained by:
X' = XE + E_pos;

where X is the observation trajectory sequence of the target vehicle, E denotes the embedding processing (Embedding), E_pos denotes the position encoding, and X' is the precoding feature.
S112, coding the pre-coding features by using a coder of the first Transformer model to obtain a first feature vector; preferably, the first feature vector is obtained by:
h_1 = Encoder(X'; W_X);

where Encoder denotes the encoder of the first Transformer model, W_X is the corresponding parameter, and h_1 is the first feature vector.
S113, decoding the first feature vector by using a decoder of the first Transformer model to obtain a first future track feature sequence; preferably, the first future trajectory feature sequence is obtained by:
T_1 = Decoder(h_1; W_DX);

where Decoder denotes the decoder of the first Transformer model, W_DX is the corresponding parameter, and T_1 is the first future trajectory feature sequence with length t_pred.

The decoder of the first Transformer model is initialized by a learnable embedding and performs trajectory feature learning.
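The precoding step X' = XE + E_pos can be sketched as follows. This is a hedged illustration: the patent does not specify the embedding or the position-encoding type, so a linear embedding (here a random matrix standing in for a learned one) and the standard sinusoidal encoding are assumptions:

```python
import numpy as np

def sinusoidal_position_encoding(seq_len, d_model):
    """Standard sinusoidal E_pos; the patent does not fix the encoding type,
    so this particular choice is an assumption."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

t_obs, d_in, d_model = 8, 6, 32
X = np.random.randn(t_obs, d_in)          # one observation trajectory (equation (1))
E = np.random.randn(d_in, d_model) * 0.1  # embedding matrix (learned in practice)

X_prime = X @ E + sinusoidal_position_encoding(t_obs, d_model)  # X' = XE + E_pos
```

X_prime is then what the first Transformer encoder consumes.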
Fig. 3 shows a flow of a method for acquiring a first future trajectory feature sequence according to an embodiment of the present disclosure.
For the vehicle trajectory prediction method S100 of each of the above embodiments, preferably, S120, acquiring the observed image feature of the target vehicle based on the semantic map centered on the target vehicle to generate the feature map, includes:
Feature extraction is performed on the semantic map using a CNN-based backbone network (backbone) to acquire the observed image features of the target vehicle and generate a feature map.

The backbone network preferably uses ResNet18, and the size of the feature map is (D_img, h, w), where D_img denotes the number of channels and h and w denote the height and width of the feature map.
In some embodiments of the present disclosure, steps S121 to S123 described in Chinese patent application CN202111249188X may also be adopted to acquire the observation image features of the target vehicle based on the observation trajectory sequence of the target vehicle to generate the feature map, which is not described in detail in this disclosure.
For the vehicle trajectory prediction method S100 of each of the above embodiments, after obtaining the feature map, the present disclosure obtains a second future trajectory feature sequence of the target vehicle based on the feature map in S130, preferably, including:
S131, dividing the feature map into a plurality of feature subgraphs to obtain a feature subgraph sequence;

the number of feature subgraphs is h × w, and the dimension of each feature subgraph is D_img, so the dimension of the feature subgraph sequence P is (hw, D_img).
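The split in S131 is a pure reshape of the backbone output; a minimal sketch (the sizes D_img = 64, h = w = 7 are invented for illustration):

```python
import numpy as np

D_img, h, w = 64, 7, 7
feature_map = np.random.randn(D_img, h, w)   # backbone output of size (D_img, h, w)

# S131: split into h*w feature subgraphs -> sequence P of shape (hw, D_img),
# one D_img-dimensional subgraph per spatial location
P = feature_map.reshape(D_img, h * w).T
```

Each row of P corresponds to one spatial cell of the feature map, which is what the second Transformer encoder treats as a token.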
S132, carrying out position coding processing on the characteristic subgraph sequence to obtain pre-coding characteristics; preferably, the precoding characteristic P' is obtained by:
P' = PE + E_pos;

where E denotes the embedding processing (Embedding) and E_pos denotes the position encoding.
S133, coding the pre-coding features by using a coder of a second Transformer model to obtain a second feature vector; preferably, the second feature vector is obtained by:
h_2 = Encoder(P'; W_I);

where Encoder denotes the encoder of the second Transformer model and W_I is the corresponding parameter.
S134, decoding the second feature vector by using a decoder of the second transform model to obtain a second future track feature sequence; preferably, the second future trajectory feature sequence is obtained by:
T_2 = Decoder(h_2; W_DI);

where Decoder denotes the decoder of the second Transformer model, W_DI is the corresponding parameter, and T_2 has length t_pred.

The decoder of the second Transformer model is a decoder that is initialized by a learnable embedding and performs trajectory feature learning.
Fig. 4 shows a flow of a second future trajectory feature sequence acquisition method of an embodiment of the present disclosure.
FIG. 5 shows a flow of a method for obtaining multi-modal trajectory features according to an embodiment of the present disclosure.
Referring to fig. 5, in S130, obtaining at least one track multi-modal feature of the target vehicle based on the feature map (the number of track multi-modal features is K, and the number of corresponding predicted tracks is K) includes:
s135, decoding the second feature vector by using a decoder of the third transform model to obtain at least one track multi-modal feature of the target vehicle; preferably, the trajectory multi-modal features are obtained by:
M = Decoder(h_2; W_DM);

where Decoder denotes the decoder of the third Transformer model, W_DM is the corresponding parameter, and the length of M equals the number K of predicted trajectories.

The decoder of the third Transformer model is initialized by a learnable embedding and performs multi-modal feature learning.
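For intuition only, the third decoder can be approximated by K learnable query vectors cross-attending over the encoded feature-map memory h_2 (a DETR-style reading of the text; the single-head NumPy attention and the random stand-in weights are assumptions, not the patent's implementation):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

hw, d_model, K_modes = 49, 32, 6
h2 = np.random.randn(hw, d_model)            # encoder memory from the feature map
queries = np.random.randn(K_modes, d_model)  # learnable embeddings (DETR-style init)

# One trajectory multi-modal feature per mode: each query attends to its own
# region of the semantic-map features
M = attention(queries, h2, h2)
```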
For the vehicle trajectory prediction method S100 of each of the above embodiments, in S140, the first future trajectory feature sequence and the second future trajectory feature sequence are respectively inverted to obtain a first future inverted trajectory feature sequence and a second future inverted trajectory feature sequence. Preferably, the sequences obtained by inverting the first future trajectory feature sequence T_1 and the second future trajectory feature sequence T_2 are T_1^rev and T_2^rev, and T_1^rev and T_2^rev are spliced (Concat) to obtain the future trajectory fusion feature sequence T^rev.
In some embodiments of the present disclosure, S160, performing a second fusion process on each future trajectory fusion feature in the future trajectory fusion feature sequence T^rev and the at least one trajectory multi-modal feature to obtain at least one final future trajectory feature for generating at least one final future predicted trajectory, includes:

splicing each trajectory multi-modal feature with each future trajectory fusion feature in the future trajectory fusion feature sequence T^rev to obtain a final future trajectory feature based on each trajectory multi-modal feature.

Each trajectory multi-modal feature corresponds to one trajectory; to align with the future trajectory length, each trajectory multi-modal feature is repeated t_pred times and spliced with the future trajectory fusion feature sequence T^rev to obtain the final future trajectory feature T.
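The inversion (S140), splicing (S150), and multi-modal alignment (S160) steps can be sketched directly with array operations (sizes invented; feature contents random):

```python
import numpy as np

t_pred, d, K_modes = 12, 32, 6
T1 = np.random.randn(t_pred, d)   # first future trajectory feature sequence
T2 = np.random.randn(t_pred, d)   # second future trajectory feature sequence
M = np.random.randn(K_modes, d)   # K trajectory multi-modal features

# S140: invert each sequence along the time axis
T1_rev, T2_rev = T1[::-1], T2[::-1]

# S150: Concat the two inverted sequences feature-wise -> T^rev of shape (t_pred, 2d)
T_rev = np.concatenate([T1_rev, T2_rev], axis=-1)

# S160: repeat each multi-modal feature t_pred times and splice with T^rev,
# giving one final future trajectory feature T per mode
M_rep = np.repeat(M[:, None, :], t_pred, axis=1)                       # (K, t_pred, d)
T = np.concatenate(
    [np.broadcast_to(T_rev, (K_modes, t_pred, 2 * d)), M_rep], axis=-1
)                                                                      # (K, t_pred, 3d)
```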
In some embodiments of the present disclosure, preferably, the S160, performing a second fusion process on each future trajectory fusion feature in the future trajectory fusion feature sequence and at least one trajectory multi-modal feature to obtain at least one final future trajectory feature for generating at least one final future predicted trajectory, further includes:
Each final future trajectory feature T is processed based on a self-attention mechanism (Self-Attention) to obtain a processed feature T', and the final future predicted trajectory is then obtained from each final future trajectory feature through a multi-layer perceptron.

That is, preferably, the present disclosure applies self-attention to the final future trajectory feature T to enhance the fusion; the self-attention is expressed by the following formula:

Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V;

where Q, K, and V denote the Query, Key, and Value vectors, respectively, and d_k is the dimension of the K vector. For the final future trajectory feature T, Q = K = V = T, so the attention expression is Attention(T, T, T).

In the present disclosure, the final future predicted trajectory is obtained through the multi-layer perceptron:

Ŷ = δ(T'; W_1);

where δ denotes the multi-layer perceptron and W_1 denotes the corresponding parameters of the multi-layer perceptron.
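A sketch of the self-attention fusion and the MLP head (the one-hidden-layer ReLU perceptron standing in for δ(·; W_1) is an assumption; the patent only says "multi-layer perceptron"):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(T):
    """Attention(T, T, T) = softmax(T T^T / sqrt(d_k)) T, with Q = K = V = T."""
    d_k = T.shape[-1]
    return softmax(T @ T.T / np.sqrt(d_k)) @ T

t_pred, d = 12, 16
T = np.random.randn(t_pred, d)       # one final future trajectory feature
T_prime = self_attention(T)          # fusion-enhanced feature T'

# delta(T'; W_1): a one-hidden-layer perceptron mapping features to (x, y) points
W1_hidden = np.random.randn(d, 32) * 0.1
W1_out = np.random.randn(32, 2) * 0.1
Y_hat = np.maximum(T_prime @ W1_hidden, 0.0) @ W1_out   # (t_pred, 2) predicted points
```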
For the vehicle trajectory prediction method S100 of each of the above embodiments, preferably, S170, the final future predicted trajectory is corrected based on the lane lines in the semantic map to obtain a corrected trajectory, and fig. 6 shows a flow of a method of correcting the final future predicted trajectory according to an embodiment of the present disclosure.
The method for correcting the final future predicted trajectory according to the embodiment includes:
S171, for each final future predicted trajectory, obtaining the lane line closest to the future predicted trajectory;

preferably, for the trajectory Ŷ_i^k, at each time t the lane segment L_t closest to the coordinate (x̂_i^t, ŷ_i^t) is found, yielding the lane segment sequence L = {L_t | t = t_obs + 1, ..., t_obs + t_pred}.
S172, carrying out coding processing based on GRU on the final future prediction track to obtain a final future prediction track coding sequence; carrying out coding processing based on GRU on the lane line closest to the final future predicted track to obtain a lane line coding sequence, and carrying out position coding on the lane line coding sequence to obtain a final lane line coding sequence;
preferably, the predicted trajectory is encoded with a GRU to obtain a final future predicted trajectory encoding sequence, i.e. the predicted trajectory encoding sequence
Figure BDA0003639260960000137
Encoding each lane segment of the lane line sequence L by using GRUs to obtain a final lane line encoding sequence, which is shown as the following formula:
L′={GRU(Lt)|t=tobs+1,...,tobs+tpred)};
the method can acquire lane line information from a high-precision map (electronic map), cut lane lines into lane line segments with fixed length, wherein the length of each lane line segment is L meters, the lane line segments are discretized into M points at intervals of d meters, and M lane line segments closest to a vehicle are selected as the lane lines closest to each final future predicted track.
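The nearest-lane-segment lookup of S171 can be sketched as a brute-force distance search over discretized segments (the segment layout and sizes are invented for illustration):

```python
import numpy as np

def nearest_lane_segments(traj_xy, segments):
    """For each trajectory point pick the lane segment whose points lie closest.
    traj_xy: (t_pred, 2); segments: (S, M, 2) -- S segments of M discretized points."""
    # distance from every trajectory point to every point of every segment
    d = np.linalg.norm(traj_xy[:, None, None, :] - segments[None, :, :, :], axis=-1)
    seg_dist = d.min(axis=2)          # (t_pred, S): closest approach per segment
    idx = seg_dist.argmin(axis=1)     # index of nearest segment per time step
    return segments[idx]              # (t_pred, M, 2) sequence of lane segments L_t

traj = np.array([[0.0, 0.0], [0.0, 2.0]])   # two predicted points near x = 0
segments = np.stack([
    np.stack([np.full(5, -1.0), np.linspace(0, 4, 5)], axis=1),  # segment near x = -1
    np.stack([np.full(5, 10.0), np.linspace(0, 4, 5)], axis=1),  # segment far at x = 10
])
L_seq = nearest_lane_segments(traj, segments)
```

Both trajectory points resolve to the nearby segment, which is the sequence L that the GRU of S172 then encodes.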
S173, processing the final future prediction track coding sequence and the final lane line coding sequence based on a Multi-head Attention mechanism (Multi-head Attention) to obtain a corrected track characteristic;
preferably, the track is used as Query, the lane line is used as Key and Value, and Multi-head Attention is performed on the track and the lane line. The Mulit-head Attention expression is as follows:
MultiHead(Q,K,V)=Concat(head1,...,headh)WO
headi=Attention(QWi Q,KWi K,VWi V);
wherein h represents the number of attribute heads, Q, K and V respectively represent query, key and value vectors, and Wi Q,Wi K,Wi V,Wi OA linear mapping is represented. The track characteristics obtained after the Mulit-head orientation are
Figure BDA0003639260960000141
I.e. correct the trajectoryAnd (5) carrying out characterization.
S174, decoding the corrected track characteristics based on the multilayer perceptron to obtain a corrected track;
preferably, the corrected trajectory is obtained based on the following formula:

Ŷ_corr = δ(F; W_3);

where F denotes the corrected trajectory features, δ denotes the multi-layer perceptron, and W_3 denotes the corresponding parameters of the multi-layer perceptron.
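The multi-head attention correction of S173, with the GRU-encoded trajectory as Query and the GRU-encoded lane segments as Key/Value, can be sketched as follows (random matrices stand in for the learned projections W_i^Q, W_i^K, W_i^V, W^O, and pre-encoded random vectors stand in for the GRU outputs):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_attention(Q, K, V, n_heads, rng):
    """Concat(head_1, ..., head_h) W^O with per-head linear maps."""
    d = Q.shape[-1]
    d_h = d // n_heads
    heads = []
    for _ in range(n_heads):
        Wq, Wk, Wv = (rng.standard_normal((d, d_h)) * 0.1 for _ in range(3))
        q, k, v = Q @ Wq, K @ Wk, V @ Wv
        heads.append(softmax(q @ k.T / np.sqrt(d_h)) @ v)
    W_O = rng.standard_normal((d, d)) * 0.1
    return np.concatenate(heads, axis=-1) @ W_O

rng = np.random.default_rng(0)
t_pred, d = 12, 16
traj_enc = rng.standard_normal((t_pred, d))   # GRU-encoded predicted trajectory (Query)
lane_enc = rng.standard_normal((t_pred, d))   # GRU-encoded lane segments (Key/Value)

F = multi_head_attention(traj_enc, lane_enc, lane_enc, n_heads=4, rng=rng)
```

F is the corrected trajectory feature that the multi-layer perceptron of S174 decodes into the corrected trajectory.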
Fig. 7 shows a schematic network model structure diagram of a vehicle trajectory prediction method implementing the present disclosure according to an embodiment of the present disclosure.
Referring to fig. 7, the network model (i.e., the vehicle trajectory prediction apparatus) for executing the vehicle trajectory prediction method of the present disclosure generates a confidence when acquiring the trajectory multi-modal features of the target vehicle during the training process.
Preferably, the multi-modal features M are decoded with a multi-layer perceptron (MLP), resulting in a confidence C. Specifically, C = δ(M; W_2), where δ denotes the multi-layer perceptron and W_2 denotes the corresponding parameters of the multi-layer perceptron.
In the training process, further, a loss function calculating step S180 is further included, as shown in fig. 8.
S181, calculating the probability loss. First, a softmax is applied to the score C_i, converting it into the probability distribution P_i:

P_i = softmax(C_i);

where C_i and P_i denote, respectively, the confidences and the probability distribution corresponding to the K trajectories generated by the i-th data item.

Further, a softmax is applied to the error between each predicted trajectory and the ground-truth trajectory, as shown in the following formula:

λ(Y_i) = softmax({−D(Ŷ_i^k, Y_i) | k = 1, ..., K});

where D(·,·) denotes the L2 loss of the predicted trajectory; the negative sign is applied so that small errors correspond to large probabilities.

Finally, the KL divergence between the probability distribution P_i and the desired distribution λ(Y_i) is calculated, as shown in the following formula:

L_conf = Σ_{i=1}^{N} KL(λ(Y_i) ‖ P_i).
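The probability (confidence) loss of S181 for a single sample can be sketched as two softmaxes and a KL divergence (the example confidences and errors are invented):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def confidence_loss(C_i, errors_i):
    """KL divergence between the target distribution lambda(Y_i), a softmax over
    the negated per-mode L2 errors, and the predicted distribution P_i = softmax(C_i)."""
    P = softmax(C_i)
    lam = softmax(-errors_i)     # small error -> large target probability
    return float(np.sum(lam * np.log(lam / P)))

C_i = np.array([2.0, 0.5, -1.0])       # confidences of K = 3 modes
errors_i = np.array([0.2, 3.0, 5.0])   # L2 error of each mode vs ground truth
loss = confidence_loss(C_i, errors_i)
```

When the predicted confidences already match the error-derived target distribution, the loss is zero; otherwise it is strictly positive.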
S182, the training process of the model is divided into two stages. In the first stage, the fine-tuning module is fixed and only the main body is trained, with the trajectory loss of S183. In the second stage, the main body is fixed and only the fine-tuning module is trained, with the trajectory loss of S184.
S183, calculating the diversity loss function of the predicted trajectories. The diversity loss is the L2 loss of whichever of the K predicted trajectories has the smallest L2 loss with respect to the ground-truth trajectory. When training the main body, the predicted trajectories Ŷ_i^k of the main body are selected to calculate the diversity loss, as shown in the following formula:

L_reg = Σ_{i=1}^{N} min_{k=1,...,K} D(Ŷ_i^k, Y_i).
S184, when training the fine-tuning module, the corrected trajectories predicted by the fine-tuning module are selected to calculate the diversity loss, in the same form as S183 with the corrected trajectories substituted for the main-body predictions:

L_reg = Σ_{i=1}^{N} min_{k=1,...,K} D(Ŷ_corr,i^k, Y_i).
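The winner-takes-all diversity loss can be sketched as follows (a per-point L2 distance averaged over time is assumed for D(·,·); the patent does not fix the exact reduction):

```python
import numpy as np

def diversity_loss(Y_hat, Y):
    """Winner-takes-all: for each sample keep only the L2 loss of the best of
    the K predicted trajectories, then sum over the N samples.
    Y_hat: (N, K, t_pred, 2); Y: (N, t_pred, 2)."""
    per_mode = np.linalg.norm(Y_hat - Y[:, None], axis=-1).mean(axis=-1)  # (N, K)
    return float(per_mode.min(axis=1).sum())

Y = np.zeros((1, 4, 2))                       # one sample, 4 future points
Y_hat = np.stack([Y[0] + 1.0, Y[0]])[None]    # K = 2 modes: one off by 1 m, one exact
loss = diversity_loss(Y_hat, Y)               # best mode is exact, so the loss is 0
```

Only the best mode per sample is penalized, which is what lets the other modes stay diverse.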
the loss function comprises two parts, namely a confidence loss function and a diversity loss function. The total loss is expressed by the following formula:
Loss=σ1Lreg2Lconf
where { σ12Is a weight parameter.
The present disclosure also provides a vehicle trajectory prediction device, a vehicle trajectory prediction device 1000 according to an embodiment of the present disclosure, including:
the first future track feature sequence acquisition module 1002, the first future track feature sequence acquisition module 1002 acquires a first future track feature sequence of the target vehicle based on the acquired observation track sequence of the target vehicle;
a feature map acquisition module 1004, wherein the feature map acquisition module 1004 acquires the observation image features of the target vehicle based on the semantic map with the target vehicle as the center to generate a feature map;
a second future trajectory feature sequence acquisition module 1006, wherein the second future trajectory feature sequence acquisition module 1006 acquires a second future trajectory feature sequence of the target vehicle based on the feature map;
the track multi-modal feature acquisition module 1008, wherein the track multi-modal feature acquisition module 1008 acquires at least one track multi-modal feature of the target vehicle based on the feature map;
the first inversion processing module 1010, the first inversion processing module 1010 performs sequence inversion on the first future trajectory feature sequence to obtain a first future inverted trajectory feature sequence;
a second inversion processing module 1012, wherein the second inversion processing module 1012 performs sequence inversion on the second future trajectory feature sequence to obtain a second future inverted trajectory feature sequence;
the first fusion processing module 1014, the first fusion processing module 1014 performs the first fusion processing on the first future inverted trajectory feature sequence and the second future inverted trajectory feature sequence to obtain a future trajectory fusion feature sequence;
the second fusion processing module 1016 is used for performing second fusion processing on each future track fusion feature in the future track fusion feature sequence and at least one track multi-modal feature by the second fusion processing module 1016 to obtain at least one final future track feature;
a final future predicted trajectory generation module 1018, the final future predicted trajectory generation module 1018 generating at least one final future predicted trajectory based on the final future trajectory features.
The vehicle trajectory prediction apparatus of the present disclosure may be implemented based on a computer software program architecture, or may be implemented by adopting a hardware implementation manner of a processing system, referring to fig. 9.
According to a preferred embodiment of the present disclosure, referring to fig. 10, the vehicle track prediction apparatus 1000 of the present disclosure further includes:
and a correction module 1020, wherein the correction module 1020 corrects the final future predicted trajectory based on the lane lines in the semantic map to obtain a corrected trajectory.
In some embodiments of the present disclosure, the vehicle trajectory prediction apparatus 1000 of the present disclosure, preferably, the first future trajectory feature sequence acquisition module 1002 includes:
the embedding processing and position coding processing module is used for embedding the observation track sequence of the target vehicle and carrying out position coding processing to generate precoding characteristics;
the first Transformer model encoder is used for encoding the pre-encoding characteristics to obtain a first characteristic vector;
and the first Transformer model decoder is used for decoding the first feature vector to obtain a first future track feature sequence.
In some embodiments of the present disclosure, the vehicle trajectory prediction apparatus 1000 of the present disclosure preferably includes the feature map acquisition module 1004:
and the CNN-based backbone network (backbone) module is used for carrying out feature extraction on the semantic map to acquire the observation image features of the target vehicle so as to generate a feature map.
In some embodiments of the present disclosure, the vehicle trajectory prediction apparatus 1000 of the present disclosure, preferably, the second future trajectory feature sequence acquisition module 1006 includes:
the characteristic graph segmentation module segments the characteristic graph into a plurality of characteristic subgraphs to obtain a characteristic subgraph sequence;
the position coding processing module is used for carrying out position coding processing on the characteristic sub-graph sequence to obtain pre-coding characteristics;
the second Transformer model encoder is used for encoding the pre-encoding characteristics to obtain a second characteristic vector;
and the second transform model decoder is used for decoding the second characteristic vector to obtain a second future track characteristic sequence.
In some embodiments of the present disclosure, the vehicle trajectory prediction apparatus 1000 of the present disclosure, preferably, the trajectory multimodal feature acquisition module 1008 includes:
and the third Transformer model decoder is used for decoding the second feature vector to obtain at least one track multi-modal feature of the target vehicle.
In some embodiments of the present disclosure, the vehicle trajectory prediction apparatus 1000 of the present disclosure, preferably the final future predicted trajectory generation module 1018, includes: the system comprises a self-attention mechanism module and a multi-layer perceptron module;
and inputting each final future track characteristic into a Self-Attention mechanism (Self-Attention) module for processing, taking the output of the Self-Attention mechanism module as the input of a multi-layer perceptron module, and obtaining a final future predicted track based on each final future track characteristic through the processing of the multi-layer perceptron.
In some embodiments of the present disclosure, the vehicle trajectory prediction apparatus 1000 of the present disclosure preferably includes the modification module 1020:
the lane line acquisition module acquires a lane line of the closest distance of each final future predicted track from the semantic map;
the first GRU module is used for carrying out encoding processing based on GRU on the final future prediction track to obtain a final future prediction track encoding sequence;
the second GRU module is used for carrying out coding processing based on GRU on the lane line with the shortest distance of each final future predicted track to obtain a lane line coding sequence;
the position coding module is used for carrying out position coding on the lane line coding sequence to obtain a final lane line coding sequence;
a Multi-head Attention mechanism module, wherein the Multi-head Attention mechanism module carries out processing based on a Multi-head Attention mechanism (Multi-head Attention) on the final future prediction track coding sequence and the final lane line coding sequence to obtain a corrected track characteristic;
and the multilayer perceptron module decodes the corrected track characteristics to obtain the corrected track.
As can be seen from the above description of the vehicle trajectory prediction method/apparatus of the present disclosure, the present disclosure is based on a DETR-style improved Transformer, i.e., a Transformer decoder that is initialized with learnable parameters and decodes all objects at once. To address the problem of poor multi-modal input fusion, a high-precision map and the historical trajectory are used to predict the future trajectory separately, and the results are then fused. Meanwhile, in order to introduce an endpoint constraint into the Transformer model, the sequence generated by the decoder is inverted; after inversion, the sequence is predicted from the endpoint backward. To address the poor interpretability of trajectory multi-modality, the present disclosure adopts a Transformer-based multi-modal generator so that different modes attend to different areas of the semantic map, and trajectories are generated based on the different modes. For trajectories that do not satisfy lane line constraints, the present disclosure provides a fine-tuning module that corrects the predicted trajectory toward the nearest lane line using an attention mechanism.
The vehicle trajectory prediction method/apparatus of the present disclosure uses a high-precision map and the historical trajectory to predict the future trajectory separately before fusion, which improves the multi-modal fusion effect. By inverting the trajectory predicted by the Transformer, the endpoint constraint is indirectly introduced into the Transformer, improving prediction accuracy. The Transformer-based multi-modal generator (the decoder of the third Transformer structure) makes different modes attend to different areas of the map, increasing the interpretability and accuracy of the multi-modal predictions. Finally, by providing a universal fine-tuning module that corrects the trajectory toward the nearest lane line through an attention mechanism, the lane line constraint on the trajectory is strengthened.
The vehicle trajectory prediction apparatus of the present disclosure may include corresponding modules that perform each or several of the steps of the flowcharts described above. Thus, each step or several steps in the above-described flow charts may be performed by a respective module, and the apparatus may comprise one or more of these modules. The modules may be one or more hardware modules specifically configured to perform the respective steps, or implemented by a processor configured to perform the respective steps, or stored within a computer-readable medium for implementation by a processor, or by some combination.
The hardware architecture may be implemented using a bus architecture. The bus architecture may include any number of interconnecting buses and bridges depending on the specific application of the hardware and the overall design constraints. The bus 1100 couples various circuits including the one or more processors 1200, the memory 1300, and/or the hardware modules together. The bus 1100 may also connect various other circuits 1400, such as peripherals, voltage regulators, power management circuits, external antennas, and the like.
The bus 1100 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one connecting line is shown, but this does not mean that there is only one bus or one type of bus.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present disclosure includes other implementations in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the implementations of the present disclosure. The processor performs the various methods and processes described above. For example, method embodiments in the present disclosure may be implemented as a software program tangibly embodied in a machine-readable medium, such as a memory. In some embodiments, some or all of the software program may be loaded and/or installed via memory and/or a communication interface. When the software program is loaded into memory and executed by a processor, one or more steps of the method described above may be performed. Alternatively, in other embodiments, the processor may be configured to perform one of the methods described above by any other suitable means (e.g., by means of firmware).
The logic and/or steps represented in the flowcharts or otherwise described herein may be embodied in any readable storage medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions.
For the purposes of this description, a "readable storage medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection (an electronic device) having one or more wires, a portable computer diskette (a magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). The readable storage medium may even be paper or another suitable medium upon which the program is printed, as the program can be captured electronically, for instance via optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and then stored in a memory.
It should be understood that portions of the present disclosure may be implemented in hardware, software, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be implemented by a program instructing the relevant hardware; the program may be stored in a readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing module, or each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in hardware or as a software functional module. If the integrated module is implemented as a software functional module and sold or used as a standalone product, it may also be stored in a readable storage medium. The storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
The present disclosure also provides an electronic device, comprising: a memory storing execution instructions; and a processor executing the execution instructions stored by the memory such that the processor performs the vehicle trajectory prediction method S100 of any one of the embodiments of the present disclosure.
The present disclosure also provides a readable storage medium having stored therein executable instructions for implementing the vehicle trajectory prediction method S100 of any one of the embodiments of the present disclosure when executed by a processor.
The present disclosure also provides a computer program product comprising computer programs/instructions which, when executed by a processor, implement the vehicle trajectory prediction method S100 of any of the embodiments of the present disclosure.
In the description herein, reference to the terms "one embodiment/implementation", "some embodiments/implementations", "an example", "a specific example", "some examples", and the like means that a particular feature, structure, material, or characteristic described in connection with the embodiment/implementation or example is included in at least one embodiment/implementation or example of the present application. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment/implementation or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments/implementations or examples. In addition, those skilled in the art may combine the different embodiments/implementations or examples described in this specification, and the features thereof, provided they do not conflict.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
It will be understood by those skilled in the art that the foregoing embodiments are merely for clarity of illustration of the disclosure and are not intended to limit the scope of the disclosure. Other variations or modifications may occur to those skilled in the art, based on the foregoing disclosure, and are still within the scope of the present disclosure.

Claims (10)

1. A vehicle trajectory prediction method, characterized by comprising:
S110, acquiring a first future trajectory feature sequence of a target vehicle based on an acquired observation trajectory sequence of the target vehicle;
S120, acquiring observation image features of the target vehicle based on a semantic map centered on the target vehicle, to generate a feature map;
S130, acquiring a second future trajectory feature sequence of the target vehicle based on the feature map, and acquiring at least one trajectory multi-modal feature of the target vehicle based on the feature map;
S140, reversing the first future trajectory feature sequence and the second future trajectory feature sequence respectively, to obtain a first future reversed trajectory feature sequence and a second future reversed trajectory feature sequence;
S150, performing first fusion processing on the first future reversed trajectory feature sequence and the second future reversed trajectory feature sequence to obtain a future trajectory fusion feature sequence; and
S160, performing second fusion processing on each future trajectory fusion feature in the future trajectory fusion feature sequence and the at least one trajectory multi-modal feature, to obtain at least one final future trajectory feature for generating at least one final future predicted trajectory.
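Expressed outside the claim language, the reversal-and-fusion steps S140 to S160 can be sketched as follows. The concrete fusion operators used here (elementwise averaging for the first fusion, concatenation for the second) are illustrative assumptions; the claim leaves the exact fusion operations unspecified.

```python
import numpy as np

def fuse_future_features(seq_a, seq_b, modal_feats):
    """Sketch of S140-S160: reverse both future feature sequences,
    fuse them step-wise, then combine each fused feature with every
    trajectory multi-modal feature."""
    # S140: reverse each feature sequence along the time axis
    rev_a = seq_a[::-1]
    rev_b = seq_b[::-1]
    # S150: first fusion -- elementwise mean is an assumed operator
    fused_seq = [(a + b) / 2.0 for a, b in zip(rev_a, rev_b)]
    # S160: second fusion -- concatenate each fused step with each
    # multi-modal feature, yielding one final sequence per modality
    finals = []
    for m in modal_feats:
        finals.append([np.concatenate([f, m]) for f in fused_seq])
    return finals

T, D, K = 4, 8, 3  # horizon, feature dim, number of modalities
seq_a = [np.random.randn(D) for _ in range(T)]
seq_b = [np.random.randn(D) for _ in range(T)]
modal = [np.random.randn(D) for _ in range(K)]
out = fuse_future_features(seq_a, seq_b, modal)
print(len(out), len(out[0]), out[0][0].shape)  # K sequences of T features, each 2*D wide
```

One multi-modal feature thus produces one candidate future trajectory feature sequence, which is what allows K distinct predicted trajectories downstream.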
2. The vehicle trajectory prediction method according to claim 1, characterized by further comprising:
S170, correcting the final future predicted trajectory based on lane lines in the semantic map to obtain a corrected trajectory.
3. The vehicle trajectory prediction method according to claim 1 or 2, wherein step S110 of acquiring a first future trajectory feature sequence of the target vehicle based on the acquired observation trajectory sequence of the target vehicle comprises:
S111, performing embedding processing and position encoding on the observation trajectory sequence of the target vehicle to generate pre-encoding features;
S112, encoding the pre-encoding features with an encoder of a first Transformer model to obtain a first feature vector; and
S113, decoding the first feature vector with a decoder of the first Transformer model to obtain the first future trajectory feature sequence.
4. The vehicle trajectory prediction method according to claim 1 or 2, wherein step S120 of acquiring observation image features of the target vehicle based on the semantic map centered on the target vehicle to generate a feature map comprises:
performing feature extraction on the semantic map with a CNN-based backbone network to acquire the observation image features of the target vehicle, so as to generate the feature map.
5. The vehicle trajectory prediction method according to claim 4, wherein step S130 of acquiring a second future trajectory feature sequence of the target vehicle based on the feature map comprises:
S131, splitting the feature map into a plurality of feature sub-maps to obtain a feature sub-map sequence;
S132, performing position encoding on the feature sub-map sequence to obtain pre-encoding features;
S133, encoding the pre-encoding features with an encoder of a second Transformer model to obtain a second feature vector; and
S134, decoding the second feature vector with a decoder of the second Transformer model to obtain the second future trajectory feature sequence.
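The splitting of step S131 resembles the patch sequencing used by vision Transformers; a minimal sketch follows. The patch size is an assumed hyper-parameter, not something the claim fixes.

```python
import numpy as np

def split_feature_map(fmap, patch):
    """S131: split a (C, H, W) feature map into a sequence of
    flattened feature sub-maps (patches), ViT-style."""
    c, h, w = fmap.shape
    assert h % patch == 0 and w % patch == 0
    seq = []
    for r in range(0, h, patch):            # row-major patch order
        for col in range(0, w, patch):
            seq.append(fmap[:, r:r + patch, col:col + patch].reshape(-1))
    return np.stack(seq)                    # (num_patches, C*patch*patch)

fmap = np.random.randn(8, 16, 16)   # e.g. output of the CNN backbone
seq = split_feature_map(fmap, patch=4)
print(seq.shape)  # (16, 128)
```

Each flattened sub-map then receives a position encoding (S132) before entering the second Transformer encoder.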
6. The vehicle trajectory prediction method according to claim 5, wherein step S130 of acquiring at least one trajectory multi-modal feature of the target vehicle based on the feature map comprises:
S135, decoding the second feature vector with a decoder of a third Transformer model to obtain at least one trajectory multi-modal feature of the target vehicle;
preferably, the decoder of the first Transformer model is initialized by learnable embedding and performs trajectory feature learning;
preferably, the decoder of the second Transformer model is initialized by learnable embedding and performs trajectory feature learning;
preferably, the decoder of the third Transformer model is initialized by learnable embedding and performs multi-modal feature learning;
preferably, step S160 of performing second fusion processing on each future trajectory fusion feature in the future trajectory fusion feature sequence and the at least one trajectory multi-modal feature, to obtain at least one final future trajectory feature for generating at least one final future predicted trajectory, comprises:
concatenating each trajectory multi-modal feature with each future trajectory fusion feature in the future trajectory fusion feature sequence, to obtain a final future trajectory feature based on each trajectory multi-modal feature;
preferably, step S160 further comprises:
processing each final future trajectory feature with a self-attention mechanism (Self-Attention), and obtaining, through a multi-layer perceptron, a final future predicted trajectory based on each final future trajectory feature;
preferably, step S170 of correcting the final future predicted trajectory based on the lane lines in the semantic map to obtain a corrected trajectory comprises:
S171, for each final future predicted trajectory, acquiring the lane line closest to the future predicted trajectory;
S172, performing GRU-based encoding on the final future predicted trajectory to obtain a final future predicted trajectory encoding sequence; performing GRU-based encoding on the lane line closest to the final future predicted trajectory to obtain a lane line encoding sequence, and performing position encoding on the lane line encoding sequence to obtain a final lane line encoding sequence;
S173, processing the final future predicted trajectory encoding sequence and the final lane line encoding sequence based on a multi-head attention mechanism (Multi-head Attention) to obtain corrected trajectory features; and
S174, decoding the corrected trajectory features with a multi-layer perceptron to obtain the corrected trajectory.
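The nearest-lane selection of step S171 can be sketched as below. The distance metric (mean of each trajectory point's distance to its nearest lane-line point) is an illustrative assumption; the claim only requires "the lane line closest" without fixing a metric.

```python
import numpy as np

def nearest_lane_line(pred_track, lane_lines):
    """S171: pick the index of the lane line closest to a predicted
    trajectory, both given as (N, 2) arrays of (x, y) points."""
    def dist(track, lane):
        # pairwise distances between track points and lane points,
        # then for each track point keep its nearest lane point
        d = np.linalg.norm(track[:, None, :] - lane[None, :, :], axis=-1)
        return d.min(axis=1).mean()
    scores = [dist(pred_track, lane) for lane in lane_lines]
    return int(np.argmin(scores))

track = np.array([[0.0, 0.0], [1.0, 0.1], [2.0, 0.2]])
lanes = [np.array([[0.0, 5.0], [2.0, 5.0]]),   # distant lane line
         np.array([[0.0, 0.0], [2.0, 0.0]])]   # nearby lane line
print(nearest_lane_line(track, lanes))  # index of the nearer lane
```

The selected lane line is then GRU-encoded alongside the predicted trajectory (S172) before the multi-head-attention correction of S173.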
7. A vehicle trajectory prediction device, characterized by comprising:
a first future trajectory feature sequence acquisition module, which acquires a first future trajectory feature sequence of a target vehicle based on an acquired observation trajectory sequence of the target vehicle;
a feature map acquisition module, which acquires observation image features of the target vehicle based on a semantic map centered on the target vehicle, to generate a feature map;
a second future trajectory feature sequence acquisition module, which acquires a second future trajectory feature sequence of the target vehicle based on the feature map;
a trajectory multi-modal feature acquisition module, which acquires at least one trajectory multi-modal feature of the target vehicle based on the feature map;
a first reversal processing module, which reverses the first future trajectory feature sequence to obtain a first future reversed trajectory feature sequence;
a second reversal processing module, which reverses the second future trajectory feature sequence to obtain a second future reversed trajectory feature sequence;
a first fusion processing module, which performs first fusion processing on the first future reversed trajectory feature sequence and the second future reversed trajectory feature sequence to obtain a future trajectory fusion feature sequence;
a second fusion processing module, which performs second fusion processing on each future trajectory fusion feature in the future trajectory fusion feature sequence and the at least one trajectory multi-modal feature to obtain at least one final future trajectory feature; and
a final future predicted trajectory generation module, which generates at least one final future predicted trajectory based on the final future trajectory features;
preferably, the device further comprises:
a correction module, which corrects the final future predicted trajectory based on lane lines in the semantic map to obtain a corrected trajectory;
preferably, the first future trajectory feature sequence acquisition module comprises:
an embedding and position encoding module, which performs embedding processing and position encoding on the observation trajectory sequence of the target vehicle to generate pre-encoding features;
a first Transformer model encoder, which encodes the pre-encoding features to obtain a first feature vector; and
a first Transformer model decoder, which decodes the first feature vector to obtain the first future trajectory feature sequence;
preferably, the feature map acquisition module comprises:
a CNN-based backbone network (backbone) module, which performs feature extraction on the semantic map to acquire the observation image features of the target vehicle, so as to generate the feature map;
preferably, the second future trajectory feature sequence acquisition module comprises:
a feature map splitting module, which splits the feature map into a plurality of feature sub-maps to obtain a feature sub-map sequence;
a position encoding module, which performs position encoding on the feature sub-map sequence to obtain pre-encoding features;
a second Transformer model encoder, which encodes the pre-encoding features to obtain a second feature vector; and
a second Transformer model decoder, which decodes the second feature vector to obtain the second future trajectory feature sequence;
preferably, the trajectory multi-modal feature acquisition module comprises:
a third Transformer model decoder, which decodes the second feature vector to obtain at least one trajectory multi-modal feature of the target vehicle;
preferably, the final future predicted trajectory generation module comprises a self-attention mechanism module and a multi-layer perceptron module;
each final future trajectory feature is input into the self-attention mechanism (Self-Attention) module for processing, the output of the self-attention mechanism module serves as the input of the multi-layer perceptron module, and a final future predicted trajectory based on each final future trajectory feature is obtained through the processing of the multi-layer perceptron module;
preferably, the correction module comprises:
a lane line acquisition module, which acquires, from the semantic map, the lane line closest to each final future predicted trajectory;
a first GRU module, which performs GRU-based encoding on the final future predicted trajectory to obtain a final future predicted trajectory encoding sequence;
a second GRU module, which performs GRU-based encoding on the lane line closest to each final future predicted trajectory to obtain a lane line encoding sequence;
a position encoding module, which performs position encoding on the lane line encoding sequence to obtain a final lane line encoding sequence;
a multi-head attention mechanism module, which processes the final future predicted trajectory encoding sequence and the final lane line encoding sequence based on a multi-head attention mechanism (Multi-head Attention) to obtain corrected trajectory features; and
a multi-layer perceptron module, which decodes the corrected trajectory features to obtain the corrected trajectory.
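The core of the correction module, where the trajectory encoding sequence queries the lane-line encoding sequence, can be sketched as scaled dot-product cross-attention. A single attention head is shown for brevity (the device uses multi-head attention), and the random feature arrays stand in for the GRU outputs.

```python
import numpy as np

def cross_attention(q, k, v):
    """Single-head scaled dot-product cross-attention: the trajectory
    code sequence (queries) attends to the lane-line code sequence
    (keys/values), as in the multi-head attention mechanism module."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                       # (Tq, Tk)
    # numerically stable row-wise softmax over the key axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                                  # (Tq, D)

Tq, Tk, D = 5, 7, 16
traj_codes = np.random.randn(Tq, D)   # stand-in: GRU-encoded predicted track
lane_codes = np.random.randn(Tk, D)   # stand-in: GRU + position-encoded lane line
corrected = cross_attention(traj_codes, lane_codes, lane_codes)
print(corrected.shape)  # (5, 16)
```

Each corrected trajectory feature is a lane-aware mixture of lane-line codes, which the multi-layer perceptron module then decodes into the corrected trajectory.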
8. An electronic device, comprising:
a memory storing execution instructions; and
a processor executing execution instructions stored by the memory to cause the processor to perform the vehicle trajectory prediction method of any one of claims 1 to 6.
9. A readable storage medium having stored therein executable instructions for implementing the vehicle trajectory prediction method of any one of claims 1 to 6 when executed by a processor.
10. A computer program product comprising computer programs/instructions, characterized in that the computer programs/instructions, when executed by a processor, implement the vehicle trajectory prediction method of any one of claims 1 to 6.
CN202210515263.0A 2022-05-11 2022-05-11 Vehicle track prediction method and device, electronic equipment and storage medium Pending CN114742317A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210515263.0A CN114742317A (en) 2022-05-11 2022-05-11 Vehicle track prediction method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210515263.0A CN114742317A (en) 2022-05-11 2022-05-11 Vehicle track prediction method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114742317A true CN114742317A (en) 2022-07-12

Family

ID=82285580

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210515263.0A Pending CN114742317A (en) 2022-05-11 2022-05-11 Vehicle track prediction method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114742317A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117132002A (en) * 2023-10-26 2023-11-28 深圳前海中电慧安科技有限公司 Multi-mode space-time track prediction method, device, equipment and medium



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination