CN113033364A - Trajectory prediction method, trajectory prediction device, travel control method, travel control device, electronic device, and storage medium

Info

Publication number
CN113033364A
Authority
CN
China
Prior art keywords
target
information
sample
trajectory
feature matrix
Prior art date
Legal status
Pending
Application number
CN202110275813.1A
Other languages
Chinese (zh)
Inventor
张景淮
张世权
方良骥
蒋沁宏
刘毅成
周博磊
李樊
Current Assignee
Bozhi Perceptual Interaction Research Center Co ltd
Sensetime Group Ltd
Original Assignee
Bozhi Perceptual Interaction Research Center Co ltd
Sensetime Group Ltd
Priority date
Filing date
Publication date
Application filed by Bozhi Perceptual Interaction Research Center Co ltd and Sensetime Group Ltd
Priority to CN202110275813.1A
Publication of CN113033364A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/22 - Matching criteria, e.g. proximity measures
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/25 - Fusion techniques
    • G06F18/253 - Fusion techniques of extracted features
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 - Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/759 - Region-based matching
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/50 - Context or environment of the image
    • G06V20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle

Abstract

The present disclosure provides a trajectory prediction method, a travel control method, devices, an electronic device, and a storage medium, the method comprising: acquiring historical trajectory information of a detection object included in a target scene and traffic road information of the target scene; generating a first fusion feature matrix corresponding to the detection object based on the historical trajectory information of the detection object and a trained initial feature matrix, wherein the initial feature matrix is used for characterizing the trajectory distribution feature of each of a plurality of modalities and each modality is used for characterizing a driving direction and/or a driving distance; generating a second fusion feature matrix corresponding to the detection object based on the traffic road information and the first fusion feature matrix corresponding to the detection object; and determining predicted trajectory information of a target object under the plurality of modalities based on the second fusion feature matrix corresponding to the detection object, wherein the target object is one of the detection objects.

Description

Trajectory prediction method, trajectory prediction device, travel control method, travel control device, electronic device, and storage medium
Technical Field
The present disclosure relates to the technical field of deep learning, and in particular to a trajectory prediction method and device, a travel control method and device, an electronic device, and a storage medium.
Background
With rapid technological development, vehicles occupy an important place in daily life. To improve the safety and intelligence of vehicle driving, vehicles can be equipped with intelligent driving systems such as automatic driving systems and driver-assistance systems. In such systems, accurate trajectory prediction for movable objects on the road plays a vital role in the safe driving of the vehicle on which the system is installed.
Generally, in a trajectory prediction task, a movable object has a plurality of possible future trajectories at the same moment. Owing to this multi-modal characteristic of trajectory prediction, how to predict the trajectory of a movable object is an important and difficult problem.
Disclosure of Invention
In view of the above, the present disclosure provides at least a trajectory prediction method and device, a travel control method and device, an electronic device, and a storage medium.
In a first aspect, the present disclosure provides a trajectory prediction method, including:
acquiring historical track information of a detection object included in a target scene and traffic road information of the target scene;
generating a first fusion feature matrix corresponding to the detection object based on the historical track information of the detection object and the trained initial feature matrix; wherein the initial feature matrix is used for characterizing the trajectory distribution feature of each of a plurality of modalities, each modality being used for characterizing a driving direction and/or a driving distance;
generating a second fusion feature matrix corresponding to the detection object based on the traffic road information and the first fusion feature matrix corresponding to the detection object;
and determining predicted trajectory information of a target object under the plurality of modalities based on the second fusion feature matrix corresponding to the detection objects, wherein the target object is one of the detection objects.
In the above method, a trained initial feature matrix is determined, where the initial feature matrix is used for characterizing the trajectory distribution feature of each of a plurality of modalities and each modality characterizes a driving direction and/or a driving distance. The acquired historical trajectory information of the detection object is fused into the initial feature matrix to generate a first fusion feature matrix; the acquired traffic road information is then fused into the first fusion feature matrix to generate a second fusion feature matrix. Because the second fusion feature matrix contains the historical trajectory information of the detection object, the traffic road information, and the trajectory distribution features of each of the plurality of modalities, the predicted trajectory information of the target object under the plurality of modalities can be accurately determined based on the second fusion feature matrix corresponding to the detection object. Meanwhile, generating the predicted trajectory information of the target object under the multiple modalities means generating at least one predicted trajectory line of the target object over multiple driving directions and/or driving distances, so that many predicted trajectory lines are generated and the predicted trajectory information is rich.
In a possible implementation manner, the initial feature matrix includes a preset number of initial feature vectors corresponding to each of the plurality of modalities; the predicted trajectory information in the plurality of modalities includes first information of the preset number of predicted trajectory lines in each modality of the plurality of modalities and second information used for representing confidence degrees corresponding to the predicted trajectory lines.
In the above embodiment, the number of predicted trajectory lines included in the predicted trajectory information is related to the number of initial feature vectors included in the initial feature matrix; that is, each initial feature vector may correspond to one predicted trajectory line, so the number of predicted trajectory lines can be set flexibly. The predicted trajectory information includes first information of each predicted trajectory line, such as its position information, and second information of each predicted trajectory line, such as its score, so the generated information for each predicted trajectory line is rich, and the occurrence probability of a predicted trajectory line can be accurately judged from its second information.
In one possible embodiment, the detection object is determined according to the following steps:
determining the area range of the target object in the real scene;
and taking a movable object including the target object in the area range as the detection object.
Considering that the historical trajectory information of other detection objects in the area around the target object may affect the predicted trajectory line of the target object, the area range where the target object is located in the target scene may be determined, and the movable objects in that area range, including the target object, may be used as the detection objects; the predicted trajectory information of the target object can then be generated more accurately based on the historical trajectory information of the at least one detection object.
In one possible embodiment, the trajectory prediction method is performed by a target neural network for performing trajectory prediction.
In a possible implementation manner, the generating a first fused feature matrix corresponding to the detection object based on the historical trajectory information of the detection object and a trained initial feature matrix includes:
performing feature extraction processing on the historical track information of the detection object to generate first feature information corresponding to the historical track information of the detection object;
and taking the initial feature matrix as a proposal parameter of a first decoder in the target neural network, and inputting the first feature information into the first decoder to obtain a first fusion feature matrix of the detection object output by the first decoder.
In one possible embodiment, the generating a second fused feature matrix corresponding to the detection object based on the traffic road information and the first fused feature matrix corresponding to the detection object includes:
carrying out feature extraction processing on the traffic road information to generate second feature information;
and taking the first fusion feature matrix as a proposal parameter of a second decoder in the target neural network, and inputting the second feature information into the second decoder to obtain a second fusion feature matrix of the detection object output by the second decoder.
In a possible implementation manner, in a case that there are a plurality of detection objects, the determining predicted trajectory information of the target object in the plurality of modalities based on the second fused feature matrix corresponding to the detection objects includes:
for each detection object other than the target object, performing feature fusion on the fusion feature vectors of the second fusion feature matrix corresponding to that detection object to obtain an intermediate feature vector corresponding to that detection object;
generating a third fusion feature matrix corresponding to the target object based on the intermediate feature vectors corresponding to the other detection objects and the second fusion feature matrix of the target object;
and determining predicted trajectory information of the target object under the plurality of modalities based on the third fusion feature matrix corresponding to the target object.
In the above embodiment, when there are multiple detection objects, the second fusion feature matrix corresponding to each detection object may be determined separately. Feature fusion is performed on the fusion feature vectors of the second fusion feature matrix corresponding to each other detection object, yielding an intermediate feature vector containing high-dimensional semantic information for that detection object. A third fusion feature matrix corresponding to the target object is then generated based on the intermediate feature vectors corresponding to the other detection objects and the second fusion feature matrix of the target object, so that the third fusion feature matrix contains the historical trajectory information of the other detection objects; the predicted trajectory information of the target object under the multiple modalities can then be determined more accurately based on this third fusion feature matrix.
In a possible implementation manner, the generating a third fused feature matrix corresponding to the target object based on the intermediate feature vector corresponding to the other detected object and the second fused feature matrix of the target object includes:
and taking the second fusion feature matrix of the target object as a proposal parameter of a third decoder in the target neural network, and inputting the intermediate feature vector into the third decoder to obtain a third fusion feature matrix of the target object output by the third decoder.
In one possible embodiment, the target neural network for performing trajectory prediction is trained by the following method:
acquiring sample data;
obtaining the predicted track information of a target sample object in the sample data based on a target neural network to be trained and the sample data;
training the target neural network to be trained based on the predicted trajectory information corresponding to the target sample object and the sample trajectory information of the target sample object included in the sample data until the target neural network to be trained meets a preset condition, and obtaining the target neural network for performing trajectory prediction.
In the above embodiment, the trajectory prediction method is executed by the trained target neural network, so that the predicted trajectory information of the target object can be determined more efficiently and accurately.
In a possible embodiment, the method further comprises:
dividing a scene area corresponding to the sample data into a plurality of local areas, and determining a mode matched with each local area;
determining a target modality corresponding to the sample data based on the local region where the real trajectory of the target sample object indicated by the sample data is located and the modality matched with each local region;
the obtaining of the predicted trajectory information of the target sample object in the sample data based on the target neural network to be trained and the sample data comprises:
and obtaining the predicted trajectory information of the target sample object under the target mode based on the target neural network to be trained, the sample data and the target mode corresponding to the sample data.
In a possible embodiment, the obtaining, based on the target neural network to be trained, the sample data, and a target modality corresponding to the sample data, predicted trajectory information of the target sample object in the target modality includes:
performing feature processing on the sample data by using the target neural network to be trained to generate a sample fusion feature matrix corresponding to the target sample object, wherein the sample fusion feature matrix comprises at least one sample fusion feature vector corresponding to each mode;
determining at least one target fusion feature vector corresponding to the target modality from the sample fusion feature matrix;
and obtaining the predicted trajectory information of the target sample object in the target modality based on the at least one target fusion feature vector corresponding to the target modality.
In the above embodiment, the target modality corresponding to the sample data is determined; according to the determined target modality, at least one target fusion feature vector corresponding to the target modality is selected from the sample fusion feature matrix corresponding to the target sample object, and the predicted trajectory information of the target sample object in the target modality is determined based on this at least one target fusion feature vector. This realizes region-wise training of the target neural network to be trained and improves its training efficiency. Meanwhile, when the predicted trajectory information of the target object is determined by the trained target neural network, a higher value of the second information of a predicted trajectory line in a given modality indicates a higher probability that the target object follows that predicted trajectory line in that modality.
In one possible implementation manner, the training the target neural network to be trained based on the predicted trajectory information corresponding to the target sample object and the sample trajectory information of the target sample object included in the sample data until the target neural network to be trained satisfies a preset condition to obtain the target neural network for performing trajectory prediction includes:
generating a loss value of the current training based on the predicted trajectory information of the target sample object and the sample trajectory information of the target sample object included in the sample data; the predicted trajectory information corresponding to the target sample object comprises first information of predicted trajectory lines and second information used for representing the confidence of each predicted trajectory line; and the loss value includes at least one of: a regression loss value representing the deviation of the predicted trajectory; a deviation loss value representing the mismatch between the confidence and the deviation of the predicted trajectory end; and a classification loss value representing the deviation of the modality class corresponding to the predicted trajectory;
and training the target neural network to be trained based on the loss value until the target neural network to be trained meets a preset condition, to obtain the target neural network for performing trajectory prediction.
Here, the loss value used for training may include a regression loss value, a classification loss value, and a deviation loss value. By setting multiple kinds of loss values, the target neural network to be trained can be trained more accurately, so that the trained target neural network performs better.
In one possible embodiment, when the target sample object corresponds to a plurality of predicted trajectory lines and the loss value includes the deviation loss value, the generating the loss value of the current training based on the predicted trajectory information of the target sample object and the sample trajectory information of the target sample object included in the sample data includes:
determining deviation information between the position information of the trajectory end of each predicted trajectory line corresponding to the target sample object and the position information of the trajectory end of the real trajectory line in the sample trajectory information corresponding to the target sample object;
for each predicted trajectory line corresponding to the target sample object, determining a deviation proportion corresponding to the predicted trajectory line based on the deviation information of the predicted trajectory line and the deviation information of other predicted trajectory lines except the predicted trajectory line in the plurality of predicted trajectory lines corresponding to the target sample object; generating a confidence coefficient proportion corresponding to the predicted trajectory line based on the second information corresponding to the predicted trajectory line and the second information corresponding to the other predicted trajectory lines;
and generating the deviation loss value of the training based on the deviation proportion and the confidence coefficient proportion respectively corresponding to the plurality of predicted trajectory lines of the target sample object.
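The text above fixes only the ingredients of the deviation loss (a deviation proportion over trajectory ends and a confidence proportion over the second information), not its exact form. A minimal sketch of one plausible realization, assuming the deviation proportion is a softmax over negative endpoint deviations, the confidence proportion is a softmax over the scores, and the two are aligned by cross-entropy:

```python
import torch
import torch.nn.functional as F

def deviation_loss(pred_trajs, scores, gt_traj):
    # pred_trajs: (K, T, 2) predicted trajectory lines of the target sample object
    # scores:     (K,) second information (confidence logits), one per line
    # gt_traj:    (T, 2) real trajectory line from the sample trajectory information
    end_dev = torch.norm(pred_trajs[:, -1] - gt_traj[-1], dim=-1)  # endpoint deviation, (K,)
    dev_ratio = F.softmax(-end_dev, dim=0)   # deviation proportion: smaller deviation, larger share
    log_conf = F.log_softmax(scores, dim=0)  # confidence proportion (log form)
    return -(dev_ratio * log_conf).sum()     # penalize mismatch between the two proportions
```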
In one possible embodiment, when the loss value includes a classification loss value, the generating a loss value for the current training based on predicted trajectory information corresponding to the target sample object and sample trajectory information of the target sample object included in the sample data includes:
determining a sum of second information of at least one predicted trajectory line contained in each modality based on predicted trajectory information of the target sample object in the modality;
and generating the classification loss value of the training based on the sum of the second information respectively corresponding to each mode and the target mode corresponding to the sample data.
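Similarly, a minimal sketch of the classification loss, under the assumption that the per-modality sums of second information are treated as logits of a modality classifier scored against the target modality:

```python
import torch
import torch.nn.functional as F

def classification_loss(scores, modality_ids, target_modality):
    # scores:          (K,) second information of all predicted trajectory lines
    # modality_ids:    (K,) long tensor, modality index of each line
    # target_modality: int, modality of the local region containing the real trajectory
    num_modalities = int(modality_ids.max()) + 1
    per_mode = torch.zeros(num_modalities).index_add_(0, modality_ids, scores)  # sum per modality
    return F.cross_entropy(per_mode.unsqueeze(0), torch.tensor([target_modality]))
```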
In a possible embodiment, in a case where the loss value includes the regression loss value, the deviation loss value, and the classification loss value, the training the target neural network to be trained based on the loss value until the target neural network to be trained satisfies a preset condition includes:
determining a total loss value based on a first loss weight to be trained corresponding to the regression loss value, a second loss weight to be trained corresponding to the deviation loss value, a third loss weight to be trained corresponding to the classification loss value, the regression loss value, the deviation loss value and the classification loss value;
and training the target neural network to be trained based on the total loss value until the target neural network to be trained meets a preset condition.
Each loss value corresponds to a loss weight to be trained. When the target neural network is trained, the first, second, and third loss weights are trained simultaneously; with the trained loss weights together with the regression loss value, the deviation loss value, and the classification loss value, the total loss value can be determined more accurately, so that the target neural network trained on this total loss value has better performance.
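The text states only that the three loss weights are trained together with the network; a minimal sketch assuming the common uncertainty-weighting form, in which each weight is parameterized by a learnable log-variance:

```python
import torch
import torch.nn as nn

class TotalLoss(nn.Module):
    def __init__(self):
        super().__init__()
        # one trainable log-variance per loss term (first/second/third loss weights)
        self.log_vars = nn.Parameter(torch.zeros(3))

    def forward(self, reg_loss, dev_loss, cls_loss):
        losses = torch.stack([reg_loss, dev_loss, cls_loss])
        # exp(-s) acts as the loss weight; the + s term regularizes the weights
        return (torch.exp(-self.log_vars) * losses + self.log_vars).sum()
```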
In a second aspect, the present disclosure provides a travel control method including:
acquiring historical track information of at least one moving object acquired by a driving device in the driving process and traffic road information of the driving device in the driving process;
generating predicted trajectory information corresponding to each moving object based on the traffic road information, the trajectory information of the at least one moving object, and the trajectory prediction method of any one of the first aspect;
and controlling the running device based on the predicted track information corresponding to each moving object.
The following descriptions of the effects of the apparatus, the electronic device, and the like refer to the description of the above method, and are not repeated here.
In a third aspect, the present disclosure provides a trajectory prediction device, including:
the system comprises a first acquisition module, a second acquisition module and a third acquisition module, wherein the first acquisition module is used for acquiring historical track information of a detection object in a target scene and traffic road information of the target scene;
the first generation module is used for generating a first fusion feature matrix corresponding to the detection object based on the historical track information of the detection object and the trained initial feature matrix; wherein the initial feature matrix is used for characterizing the trajectory distribution feature of each of a plurality of modalities, each modality being used for characterizing a driving direction and/or a driving distance;
the second generation module is used for generating a second fusion feature matrix corresponding to the detection object based on the traffic road information and the first fusion feature matrix corresponding to the detection object;
a determining module, configured to determine, based on the second fusion feature matrix corresponding to the detection object, predicted trajectory information of a target object in the multiple modalities, where the target object is one of the detection objects.
In a fourth aspect, the present disclosure provides a running control apparatus including:
the second acquisition module is used for acquiring historical track information of at least one moving object acquired by a driving device in the driving process and traffic road information of the driving device in the driving process;
a third generating module, configured to generate predicted trajectory information corresponding to each of the moving objects based on the traffic road information, the trajectory information of the at least one moving object, and the trajectory prediction method according to any one of the first aspect;
and the control module is used for controlling the running device based on the predicted track information corresponding to each moving object.
In a fifth aspect, the present disclosure provides an electronic device comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating via the bus when the electronic device is running, the machine-readable instructions when executed by the processor performing the steps of the trajectory prediction method according to the first aspect or any one of the embodiments; or the steps of the running control method according to the second aspect described above.
In a sixth aspect, the present disclosure provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the trajectory prediction method according to the first aspect or any one of the embodiments described above; or the steps of the running control method according to the second aspect described above.
In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings required in the embodiments are briefly described below. The drawings here are incorporated into and form a part of the specification; they illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the technical solutions of the present disclosure. It is appreciated that the following drawings depict only certain embodiments of the disclosure and are therefore not to be considered limiting of its scope; those skilled in the art can derive other related drawings from them without inventive effort.
Fig. 1 is a schematic flow chart illustrating a trajectory prediction method provided by an embodiment of the present disclosure;
fig. 2 is a schematic diagram illustrating a target scene in a trajectory prediction method provided by an embodiment of the present disclosure;
fig. 3 is a schematic structural diagram illustrating a target neural network in a trajectory prediction method provided by an embodiment of the present disclosure;
fig. 4 is a flow chart illustrating a driving control method according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram illustrating an architecture of a trajectory prediction apparatus provided in an embodiment of the present disclosure;
fig. 6 is a schematic diagram illustrating an architecture of a driving control device provided in an embodiment of the present disclosure;
fig. 7 shows a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure;
fig. 8 shows a schematic structural diagram of another electronic device provided in the embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, not all of the embodiments. The components of the embodiments of the present disclosure, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure, presented in the figures, is not intended to limit the scope of the claimed disclosure, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the disclosure without making creative efforts, shall fall within the protection scope of the disclosure.
Generally, in a trajectory prediction task, a movable object has a plurality of possible future trajectories at the same moment. Owing to this multi-modal characteristic of trajectory prediction, how to predict the trajectory of a movable object is an important and difficult problem. To alleviate this problem, embodiments of the present disclosure provide a trajectory prediction method.
The above-mentioned drawbacks were identified by the inventors after practical and careful study; therefore, the discovery process of the above problems and the solutions the present disclosure proposes for them should be regarded as the inventors' contribution to the present disclosure.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
For the convenience of understanding the embodiments of the present disclosure, a trajectory prediction method and a travel control method disclosed in the embodiments of the present disclosure will be described in detail first. The execution subject of the trajectory prediction method and the travel control method provided by the embodiments of the present disclosure is generally a computer device with certain computing power, and the computer device includes: a terminal device, which may be a User Equipment (UE), a mobile device, a User terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, a vehicle mounted device, a wearable device, or a server or other processing device. In some possible implementations, the trajectory prediction method, the travel control method, may be implemented by a processor calling computer readable instructions stored in a memory.
Referring to fig. 1, a schematic flow chart of a trajectory prediction method provided in the embodiment of the present disclosure is shown, the method includes S101-S104, where:
S101: acquiring historical trajectory information of a detection object included in a target scene and traffic road information of the target scene;
S102: generating a first fusion feature matrix corresponding to the detection object based on the historical trajectory information of the detection object and a trained initial feature matrix, where the initial feature matrix is used for characterizing the trajectory distribution feature of each of a plurality of modalities and each modality is used for characterizing a driving direction and/or a driving distance;
S103: generating a second fusion feature matrix corresponding to the detection object based on the traffic road information and the first fusion feature matrix corresponding to the detection object;
S104: determining predicted trajectory information of a target object under the plurality of modalities based on the second fusion feature matrix corresponding to the detection object, where the target object is one of the detection objects.
In the above method, a trained initial feature matrix is determined, where the initial feature matrix is used for characterizing the trajectory distribution feature of each of a plurality of modalities and each modality is used for characterizing a driving direction and/or a driving distance. The acquired historical trajectory information of the detection object is fused into the initial feature matrix to generate a first fusion feature matrix; the acquired traffic road information is then fused into the first fusion feature matrix to generate a second fusion feature matrix. Because the second fusion feature matrix contains the historical trajectory information of the detection object, the traffic road information, and the trajectory distribution features of each of the plurality of modalities, the predicted trajectory information of the target object under the plurality of modalities can be accurately determined based on the second fusion feature matrix corresponding to the detection object. Meanwhile, generating the predicted trajectory information of the target object under the multiple modalities means generating at least one predicted trajectory line of the target object over multiple driving directions and/or driving distances, so that many predicted trajectory lines are generated and the predicted trajectory information is rich.
The following description will be made specifically for S101 to S104.
For S101:
here, the target scene may be any road scene, and the detection object may be a motor vehicle, a non-motor vehicle, a pedestrian, or the like that travels on a road.
In an alternative embodiment, the detection object may be determined according to the following steps:
firstly, determining the region range of a target object in a real scene;
and secondly, taking a movable object including the target object in the area range as a detection object.
Here, the target object may be any vehicle, pedestrian, or similar object determined in the real scene. The area range where the target object is located may be a range centered on the target object with a preset size as the radius, where the preset size may be set according to the actual situation. At least one movable object within the determined area range (including the target object) may then be used as a detection object. The number of detection objects included in the target scene may be one or more: when there is one detection object, it is the target object; when there are multiple detection objects, the target object is one of them.
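A minimal sketch of this selection step, assuming objects expose planar coordinates and using an illustrative 50 m radius (the preset size is left open in the text):

```python
import numpy as np

def select_detection_objects(target, movable_objects, radius=50.0):
    # keep every movable object within `radius` of the target object;
    # the target itself is kept when it appears in movable_objects
    return [obj for obj in movable_objects
            if np.hypot(obj.x - target.x, obj.y - target.y) <= radius]
```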
Each detection object has a historical trajectory line of its movement in the target scene. A plurality of trajectory position points can be sampled from the historical trajectory line, and the position information of these trajectory position points on the image corresponding to the target scene is used as the historical trajectory information of the detection object. Alternatively, fitting parameters of the historical trajectory line of the detection object in the target scene may be determined and used as the historical trajectory information of the detection object.
The traffic road information of the target scene may be used to characterize the traffic road conditions in the target scene. In implementation, a neural network may be used to extract a road line from the image corresponding to the target scene, or a road line may be extracted from that image in response to a road-line determination operation; a plurality of road position points are then extracted from the road line, and the position information of these road position points on the image corresponding to the target scene is determined as the traffic road information of the target scene.
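Both the historical trajectory information and the traffic road information above reduce to position points sampled from a line. The text does not fix the sampling scheme; a minimal sketch assuming evenly spaced resampling along the polyline:

```python
import numpy as np

def resample_polyline(points: np.ndarray, num_samples: int) -> np.ndarray:
    # points: (N, 2) ordered positions on a historical trajectory line or road line
    seg_len = np.linalg.norm(np.diff(points, axis=0), axis=1)
    arc = np.concatenate([[0.0], np.cumsum(seg_len)])  # cumulative arc length
    targets = np.linspace(0.0, arc[-1], num_samples)   # evenly spaced stations
    # interpolate x and y independently over arc length
    return np.stack([np.interp(targets, arc, points[:, i]) for i in (0, 1)], axis=1)
```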
Considering that the historical trajectory information of other detection objects in the area around the target object may affect the predicted trajectory line of the target object, the area range where the target object is located in the target scene may be determined, and the movable objects in that area range, including the target object, may be used as the detection objects; the predicted trajectory information of the target object can then be generated more accurately based on the historical trajectory information of the at least one detection object.
For S102:
the trajectory prediction method may be performed by a target neural network for performing trajectory prediction, and the trained initial feature matrix may be a network parameter included in the trained target neural network. Wherein the initial feature matrix is used to characterize the trajectory distribution features of each of the plurality of modalities. Each mode is used to characterize a driving direction and/or a driving distance.
For example, when each mode represents a driving direction, the plurality of modes may include a mode corresponding to straight movement, a mode corresponding to right turn, a mode corresponding to left turn, a mode corresponding to turning around, and the like; where each modality characterizes a direction of travel and a distance traveled, the plurality of modalities may include: the first mode of the first travel distance in the straight direction, the second mode of the second travel distance in the straight direction, the first mode of the first travel distance in the left-turn direction, the second mode of the second travel distance in the left-turn direction, the first mode of the first travel distance in the right-turn direction, and the second mode of the second travel distance in the right-turn direction, wherein the distance range corresponding to the first travel distance is adjacent to and does not overlap the distance range corresponding to the second travel distance. Wherein, the number of the modes can be set according to actual needs.
Generally, the initial feature matrix includes a preset number of initial feature vectors corresponding to each of the plurality of modalities. In a specific implementation, each initial feature vector may correspond to one predicted trajectory in the prediction result. For example, the initial feature matrix may be a 36 × 128 matrix covering 6 modalities with 6 initial feature vectors per modality, i.e., 36 initial feature vectors, each a 128-dimensional vector. The prediction result then contains 6 predicted trajectories per modality, 36 in total: 6 predicted trajectories in the straight-ahead modality, 6 in the u-turn modality, 6 in the left-turn modality, and so on.
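Using the sizes quoted above, a minimal sketch of the initial feature matrix as a learnable network parameter (treating it as an nn.Parameter is an assumption, consistent with it being a network parameter of the trained target neural network):

```python
import torch
import torch.nn as nn

NUM_MODALITIES = 6          # e.g. straight ahead, left turn, right turn, u-turn, ...
PROPOSALS_PER_MODALITY = 6  # preset number of initial feature vectors per modality
FEATURE_DIM = 128

# 36 x 128 trained initial feature matrix: one row per predicted trajectory line
initial_feature_matrix = nn.Parameter(
    torch.randn(NUM_MODALITIES * PROPOSALS_PER_MODALITY, FEATURE_DIM))

# rows 0-5 belong to modality 0, rows 6-11 to modality 1, and so on
per_modality = initial_feature_matrix.view(
    NUM_MODALITIES, PROPOSALS_PER_MODALITY, FEATURE_DIM)
```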
For example, referring to the schematic diagram of a target scene shown in fig. 2, the target scene area in fig. 2 may be divided into 7 local regions, R1 to R7, and each local region may correspond to one modality; that is, the target scene may correspond to 7 modalities. Illustratively, when the end of a predicted trajectory line falls in region R5, the predicted trajectory belongs to the modality corresponding to R5; when the end falls in region R6, it belongs to the modality corresponding to R6.
In an alternative embodiment, in S102, generating a first fused feature matrix corresponding to the detection object based on the historical trajectory information of the detection object and the trained initial feature matrix may include:
S1021: performing feature extraction processing on the historical trajectory information of the detection object to generate first feature information corresponding to the historical trajectory information of the detection object;
S1022: taking the initial feature matrix as a proposal parameter of a first decoder in the target neural network, and inputting the first feature information into the first decoder to obtain a first fusion feature matrix of the detection object output by the first decoder.
In S1021, the target neural network may include a first feature extraction network, which performs feature extraction processing on the historical trajectory information of the detection object to generate the first feature information corresponding to that historical trajectory information. The first feature extraction network may be any network structure capable of feature extraction; for example, it may be a network structure formed by at least one convolutional layer, or the Encoder structure of a Transformer network.
In S1022, the initial feature matrix may be used as the proposal parameter (proposal) of the first Decoder in the trained target neural network; the first feature information is input into the first Decoder and fused into the initial feature matrix, yielding the first fusion feature matrix of the detection object output by the first Decoder. The number of first fusion feature vectors included in the first fusion feature matrix may be the same as the number of initial feature vectors included in the initial feature matrix.
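The internals of the first decoder are not spelled out beyond the proposal parameter acting as its query-side input; a minimal sketch using a standard Transformer decoder, with the proposals as queries and the encoded history as memory (layer count and head count are assumptions):

```python
import torch
import torch.nn as nn

layer = nn.TransformerDecoderLayer(d_model=128, nhead=8, batch_first=True)
first_decoder = nn.TransformerDecoder(layer, num_layers=2)

def fuse_history(initial_feature_matrix, first_feature_info):
    # initial_feature_matrix: (1, 36, 128) trained proposal parameter (queries)
    # first_feature_info:     (1, T, 128) encoded historical trajectory (memory)
    # returns the (1, 36, 128) first fusion feature matrix, one row per proposal
    return first_decoder(tgt=initial_feature_matrix, memory=first_feature_info)
```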
For S103:
here, the traffic road information may be merged into the first merged feature matrix corresponding to the detection object, and the second merged feature matrix corresponding to the detection object may be generated.
In an optional embodiment, in S103, generating a second fused feature matrix corresponding to the detection object based on the traffic road information and the first fused feature matrix corresponding to the detection object may include:
S1031: performing feature extraction processing on the traffic road information to generate second feature information;
S1032: taking the first fusion feature matrix as a proposal parameter of a second decoder in the target neural network, and inputting the second feature information into the second decoder to obtain a second fusion feature matrix of the detection object output by the second decoder.
In S1031, the target neural network may include a second feature extraction network, which performs feature extraction processing on the traffic road information to generate the second feature information. The second feature extraction network may be any network structure capable of feature extraction; for example, it may be a network structure formed by at least one convolutional layer, or the Encoder structure of a Transformer network.
In S1032, the first fusion feature matrix may be used as the proposal parameter (proposal) of the second Decoder in the trained target neural network; the second feature information is input into the second Decoder and fused into the first fusion feature matrix, yielding the second fusion feature matrix of the detection object output by the second Decoder. The number of second fusion feature vectors included in the second fusion feature matrix may be the same as the number of first fusion feature vectors included in the first fusion feature matrix.
When there are multiple detection objects, S1021 and S1022 may be performed for each detection object in the target scene to generate the first fusion feature matrix corresponding to that detection object based on its historical trajectory information and the trained initial feature matrix; S1031 and S1032 are then performed to generate the second fusion feature matrix corresponding to that detection object based on the traffic road information and its first fusion feature matrix. A second fusion feature matrix corresponding to each of the multiple detection objects can thereby be obtained.
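Chaining S1021-S1032 over all detection objects then looks roughly as follows; encode_history and encode_map are hypothetical stand-ins for the first and second feature extraction networks, and first_decoder, second_decoder, and initial_feature_matrix continue the sketches above:

```python
# a sketch only: encode_history / encode_map are hypothetical helper names
second_fused = {}
for obj in detection_objects:
    first_info = encode_history(obj.history)                 # (1, T, 128)
    first_fused = first_decoder(tgt=initial_feature_matrix.unsqueeze(0),
                                memory=first_info)           # (1, 36, 128)
    second_info = encode_map(traffic_road_info)              # (1, M, 128)
    second_fused[obj] = second_decoder(tgt=first_fused,
                                       memory=second_info)   # (1, 36, 128)
```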
For S104:
when the number of detection objects included in the target scene is one, the detection objects are target objects, and the predicted trajectory information of the target objects in the multiple modalities may be determined based on the second fusion feature matrix corresponding to the target objects. For example, the second fusion feature matrix may be subjected to feature extraction processing to obtain a third fusion feature matrix; inputting the third fusion characteristic matrix into a fourth decoder included in the trained target neural network to obtain predicted trajectory information of the target object under multiple modes; and the number of the third fusion feature vectors included in the third fusion feature matrix is the same as the number of the initial feature vectors included in the initial fusion feature matrix. Or, the second fusion feature matrix corresponding to the target object may be input to a fourth decoder included in the trained target neural network, so as to obtain predicted trajectory information of the target object in multiple modalities.
In an optional embodiment, the initial feature matrix includes a preset number of initial feature vectors corresponding to each of the plurality of modalities; the predicted trajectory information in the plurality of modalities includes first information of a preset number of predicted trajectory lines in each modality of the plurality of modalities and second information used for representing confidence degrees corresponding to the predicted trajectory lines.
Here, the number of predicted trajectory lines in each modality is consistent with the number of initial feature vectors corresponding to each modality in the initial feature matrix, for example, when the initial feature matrix includes 6 (a preset number) initial feature vectors corresponding to each modality in 6 modalities, the generated predicted trajectory information includes first information and second information of the 6 predicted trajectory lines in each modality in the 6 modalities, that is, the first information of the 36 predicted trajectory lines and the second information of the 36 confidence degrees representing the corresponding predicted trajectory lines can be obtained.
For example, the first information of the predicted trajectory line may be coordinate information of a plurality of trajectory position points on the predicted trajectory line in the target scene; the second information for characterizing the confidence level of the predicted trajectory line correspondence may be a score value of the predicted trajectory line correspondence.
In the above embodiment, the number of predicted trajectory lines included in the predicted trajectory information is related to the number of initial feature vectors included in the initial feature matrix; that is, each initial feature vector may correspond to one predicted trajectory line, so the number of predicted trajectory lines can be set flexibly. The predicted trajectory information includes first information of each predicted trajectory line, such as its position information, and second information of each predicted trajectory line, such as its score, so the generated information for each predicted trajectory line is rich, and the occurrence probability of a predicted trajectory line can be accurately judged from its second information.
In an optional implementation manner, in the case that there are a plurality of detection objects, in S104, determining predicted trajectory information of the target object in a plurality of modalities based on the second fused feature matrix corresponding to the detection objects may include:
S1041: for each detection object other than the target object, performing feature fusion on the fusion feature vectors of the second fusion feature matrix corresponding to that detection object to obtain an intermediate feature vector corresponding to that detection object;
S1042: generating a third fusion feature matrix corresponding to the target object based on the intermediate feature vectors corresponding to the other detection objects and the second fusion feature matrix of the target object;
S1043: determining predicted trajectory information of the target object under the multiple modalities based on the third fusion feature matrix corresponding to the target object.
In S1041, when there are multiple detection objects, i.e., when the target scene contains other detection objects in addition to the target object, feature fusion may be performed, for each detection object other than the target object, on the fusion feature vectors of the second fusion feature matrix corresponding to that detection object, fusing the matrix into one feature vector with high-dimensional semantics (the intermediate feature vector). When there are multiple other detection objects, the intermediate feature vector corresponding to each of them is obtained. A third fusion feature matrix corresponding to the target object is then generated based on the intermediate feature vectors corresponding to the other detection objects and the second fusion feature matrix of the target object, so that the third fusion feature matrix contains the historical trajectory information of the other detection objects; the predicted trajectory information of the target object under the multiple modalities can then be determined more accurately based on this third fusion feature matrix.
For example, the second fusion feature matrix corresponding to each other detection object may be input into a trained multi-layer perceptron (MLP) to obtain the intermediate feature vector corresponding to that detection object.
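A minimal sketch of that MLP, assuming it flattens an object's 36 × 128 second fusion feature matrix and projects it down to a single 128-dimensional intermediate feature vector (the hidden size is an assumption):

```python
import torch
import torch.nn as nn

class IntermediateFusion(nn.Module):
    def __init__(self, num_proposals=36, dim=128, hidden=512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(num_proposals * dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, second_fused):             # (36, 128)
        # collapse the whole matrix into one vector with high-dimensional semantics
        return self.mlp(second_fused.flatten())  # (128,)
```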
In S1042, in an optional implementation, generating a third fused feature matrix corresponding to the target object based on the intermediate feature vectors corresponding to the other detected objects and the second fused feature matrix of the target object includes: and taking the second fusion feature matrix of the target object as a proposal parameter of a third decoder in the target neural network, and inputting the intermediate feature vector into the third decoder to obtain a third fusion feature matrix of the target object output by the third decoder.
The second fusion feature matrix of the target object may be used as the proposal parameter (proposal) of the third Decoder in the trained target neural network; the intermediate feature vectors corresponding to the at least one other detection object are input into the third Decoder and fused into the second fusion feature matrix of the target object, yielding the third fusion feature matrix of the target object output by the third Decoder.
In S1043, the third fusion feature matrix corresponding to the target object is input to a fourth decoder included in the trained target neural network, so as to obtain predicted trajectory information of the target object in multiple modalities. When the predicted trajectory information includes first information of the predicted trajectory line and second information of the predicted trajectory line, the fourth decoder may include a trajectory generator branch for performing trajectory prediction and a trajectory selector branch for generating a score, that is, the third fusion feature matrix corresponding to the target object is input to the trained trajectory generator branch and the trained trajectory selector branch, respectively, so as to obtain first information of at least one predicted trajectory line of the target object in each modality and second information corresponding to each predicted trajectory line.
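A minimal sketch of the fourth decoder's two branches, assuming plain linear heads and a fixed prediction horizon (both assumptions; the text only names the generator and selector branches):

```python
import torch
import torch.nn as nn

class ProposedFeatureDecoder(nn.Module):
    def __init__(self, dim=128, horizon=30):
        super().__init__()
        self.horizon = horizon
        self.generator = nn.Linear(dim, horizon * 2)  # trajectory generator branch
        self.selector = nn.Linear(dim, 1)             # trajectory selector branch

    def forward(self, fused):  # (36, 128) third fusion feature matrix
        trajs = self.generator(fused).view(-1, self.horizon, 2)  # first information, (36, T, 2)
        scores = self.selector(fused).squeeze(-1)                # second information, (36,)
        return trajs, scores
```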
Referring to fig. 3, a schematic structural diagram of the target neural network in the trajectory prediction method is shown; the trajectory prediction method is described below with reference to fig. 3. The figure includes a map corresponding to the target scene, from which the traffic road information of the target scene can be acquired; the figure also includes the historical trajectory lines of a plurality of detection objects traveling in the target scene, i.e., the historical trajectory information of at least one detection object can be acquired.
The target neural network includes a stacked full-attention network (Stacked Transformers) and a proposal feature decoder (Proposed Feature Decoder), where the Proposed Feature Decoder is the fourth decoder described above and may include a trajectory generator branch (Predicted Trajectories) for performing trajectory prediction and a trajectory selector branch (Predicted Scores) for generating scores. The Stacked Transformers include a behavior extractor (Motion Extractor), a map aggregator (Map Aggregator), a social constructor (Social Constructor), and a multi-layer perceptron (MLP). The structures corresponding to the Motion Extractor, the Map Aggregator, and the Social Constructor may each be a Transformer structure; that is, the Motion Extractor may include a first encoder and a first decoder, the Map Aggregator may include a second encoder and a second decoder, and the Social Constructor may include a third encoder and a third decoder.
In the first step, for each detection object, the acquired historical trajectory information of the detection object may be input into the first encoder of the Motion Extractor for feature extraction to obtain the first feature information corresponding to the detection object, and the first feature information and the initial feature matrix (i.e., the trained trajectory proposal parameters) are input into the first decoder to obtain the first fusion feature matrix corresponding to the detection object. In the second step, the traffic road information is input into the second encoder of the Map Aggregator for feature extraction to obtain the second feature information corresponding to the detection object, and the second feature information and the first fusion feature matrix corresponding to the detection object are input into the second decoder to obtain the second fusion feature matrix corresponding to the detection object. In this way, the second fusion feature matrix corresponding to each detection object can be obtained.
In the third step, for other detection objects except the target object among the plurality of detection objects, the second fusion feature matrices corresponding to the other detection objects may be input into the MLP for feature fusion to obtain an intermediate feature vector corresponding to each other detection object; the intermediate feature vector corresponding to each other detection object and the second fusion feature matrix corresponding to the target object are then input into the third decoder of the Social Constructor to obtain a third fusion feature matrix corresponding to the target object.
In the fourth step, the third fusion feature matrix corresponding to the target object is input into the Proposal Feature Decoder, that is, the Predicted Trajectories branch and the Predicted Scores branch in the Proposal Feature Decoder respectively perform feature extraction on the third fusion feature matrix, so as to generate first information (position information) of at least one predicted trajectory line of the target object in each modality and second information (score information) of each predicted trajectory line.
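As an illustration of the four-step data flow above, the following is a minimal PyTorch-style sketch. All module names, tensor shapes, layer sizes, and the mean-pooling of neighbor features are assumptions introduced for this example, not the concrete implementation of the disclosure:

```python
import torch
import torch.nn as nn

class StackedTransformerPredictor(nn.Module):
    # Illustrative data flow only; embedding layers, attention depths, and
    # proposal counts are design choices not specified by the text.
    def __init__(self, d_model=128, n_heads=8, n_proposals=36, horizon=30):
        super().__init__()
        self.horizon = horizon
        # trained trajectory proposal parameters (the initial feature matrix)
        self.proposals = nn.Parameter(torch.randn(n_proposals, d_model))
        self.motion_enc = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.motion_dec = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.map_enc = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.map_dec = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.social_dec = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.social_mlp = nn.Sequential(nn.Linear(d_model, d_model), nn.ReLU())
        self.traj_generator = nn.Linear(d_model, horizon * 2)  # first information: (x, y) per step
        self.traj_selector = nn.Linear(d_model, 1)             # second information: score

    def forward(self, history, road, neighbor_f2):
        # history:     (B, T, d_model)   embedded historical trajectory of the target object
        # road:        (B, L, d_model)   embedded road position points
        # neighbor_f2: (B, A, K, d_model) second fusion matrices of other detection objects
        B = history.size(0)
        q = self.proposals.unsqueeze(0).expand(B, -1, -1)
        f1 = self.motion_dec(q, self.motion_enc(history))  # step 1: first fusion matrix
        f2 = self.map_dec(f1, self.map_enc(road))          # step 2: second fusion matrix
        inter = self.social_mlp(neighbor_f2.mean(dim=2))   # step 3: intermediate vectors (B, A, d)
        f3 = self.social_dec(f2, inter)                    # step 3: third fusion matrix
        trajs = self.traj_generator(f3).view(B, -1, self.horizon, 2)  # step 4: trajectory lines
        scores = self.traj_selector(f3).squeeze(-1)                   # step 4: confidence scores
        return trajs, scores
```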
The trajectory prediction method may be performed by a target neural network for performing trajectory prediction; the following illustrates an example of the training process for the target neural network.
In an alternative embodiment, the target neural network may be trained by:
step A1, acquiring sample data;
step A2, obtaining the predicted track information of the target sample object in the sample data based on the target neural network to be trained and the sample data;
step A3, training a target neural network to be trained based on the predicted trajectory information corresponding to the target sample object and the sample trajectory information of the target sample object included in the sample data until the target neural network to be trained meets a preset condition, and obtaining the target neural network for performing trajectory prediction.
In the above embodiment, the trajectory prediction method is executed by the trained target neural network, so that the predicted trajectory information of the target object can be determined more efficiently and accurately.
In step A1, the sample data may include sample trajectory information of at least one sample object and sample road information of the scene where the sample object is located, and the sample trajectory information of the sample object may include historical trajectory information and real trajectory information of the sample object; the historical trajectory information and the real trajectory information are trajectory line information of the continuous travel of the sample object. When there is one sample object, the sample object is the target sample object; when there are multiple sample objects, the sample objects include the target sample object and other sample objects, and the other sample objects may be traffic participants around the target sample object, for example, pedestrians, motor vehicles, non-motor vehicles, and the like.
The sample road information of the scene in which the sample object is located may be information of a traffic route included in the scene in which the sample object is located. For example, the sample road information may be position information of a plurality of road position points on the traffic route.
In step a2, feature extraction processing may be performed on the historical trajectory information indicated by the sample trajectory information of the sample object, so as to generate first sample feature information corresponding to the sample object; and inputting the first sample feature information corresponding to the sample object and the feature matrix generated by initialization into a first decoder of the target neural network to generate a first sample fusion feature matrix corresponding to the sample object.
Then, feature extraction processing can be carried out on the sample road information to generate second sample feature information; and inputting the second sample characteristic information and the first sample fusion characteristic matrix corresponding to the sample object into a second decoder of the target neural network to generate a second sample fusion characteristic matrix corresponding to the sample object.
When there is one sample object, feature extraction is performed on the second sample fusion feature matrix corresponding to the target sample object to generate a third sample fusion feature matrix corresponding to the target sample object; the third sample fusion feature matrix corresponding to the target sample object is then input into the proposal feature decoder to obtain the predicted trajectory information of the target sample object in the sample data, i.e., the predicted trajectory information of the target sample object in multiple modalities.
When there are multiple sample objects, for other sample objects except the target sample object, the second sample fusion feature matrices corresponding to the other sample objects may be input into the MLP of the target neural network, and feature fusion may be performed on each second sample fusion feature vector in the second sample fusion feature matrices of the other sample objects to obtain intermediate sample feature vectors corresponding to the other sample objects.
The intermediate sample feature vector corresponding to each other sample object and the second sample fusion feature matrix corresponding to the target sample object are input into the third decoder to generate a third sample fusion feature matrix corresponding to the target sample object. Finally, the third sample fusion feature matrix corresponding to the target sample object is input into the proposal feature decoder to obtain the predicted trajectory information of the target sample object in multiple modalities.
In an alternative embodiment, the method further comprises:
step B1, dividing the scene area corresponding to the sample data into a plurality of local areas, and determining the mode matched with each local area;
step B2, determining a target modality corresponding to the sample data based on the local region where the real trajectory of the target sample object indicated by the sample data is located and the modality matched with each local region.
In step B1, the scene region corresponding to the sample data may be divided into a plurality of local regions in response to a region dividing operation; alternatively, the real trajectory lines of a plurality of sample objects may be obtained, the real trajectory lines clustered by a clustering algorithm to obtain a plurality of trajectory line sets, the region corresponding to each trajectory line set determined, and the scene region thereby divided into a plurality of local regions. For example, as shown in fig. 2, the scene region corresponding to the sample data may be divided; the division of the local regions may be set as needed, and each local region matches one modality.
In step B2, the local region where the real trajectory of the target sample object indicated by the sample data is located may be determined; the target modality corresponding to the sample data is determined based on the local region where the real trajectory of the target sample object is located and the modality matched with each local region. For example, if the local region corresponding to the real trajectory of the target sample object is the R6 region in fig. 2, the target modality corresponding to the sample data may be the R6 modality.
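As a hedged sketch of the clustering-based region division in step B1 and the target-modality lookup in step B2, the following uses K-means as one possible clustering algorithm; both the choice of algorithm and the use of trajectory endpoints to define regions are assumptions for illustration:

```python
import numpy as np
from sklearn.cluster import KMeans

def build_local_regions(real_trajectories, n_modalities=6):
    # real_trajectories: array of shape (num_samples, T, 2) with (x, y) points
    endpoints = real_trajectories[:, -1, :]  # one possible region criterion: trajectory ends
    return KMeans(n_clusters=n_modalities, n_init=10).fit(endpoints)

def target_modality(regions, real_trajectory):
    # regions: fitted KMeans model; real_trajectory: (T, 2) real trajectory line
    # the target modality is the one matched to the local region containing the
    # end of the real trajectory of the target sample object
    return int(regions.predict(real_trajectory[-1:, :])[0])
```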
In an optional implementation manner, in a2, obtaining predicted trajectory information of a target sample object in sample data based on a target neural network to be trained and the sample data includes:
step A21, performing feature processing on sample data by using a target neural network to be trained to generate a sample fusion feature matrix corresponding to a target sample object, wherein the sample fusion feature matrix comprises at least one sample fusion feature vector corresponding to each mode;
step A22, determining at least one target fusion characteristic vector corresponding to a target mode from a sample fusion characteristic matrix;
step A23, obtaining the predicted trajectory information of the target sample object in the target mode based on at least one target fusion feature vector corresponding to the target mode.
In the above embodiment, the target modality corresponding to the sample data is determined. According to the determined target modality, at least one target fusion feature vector corresponding to the target modality is determined from the sample fusion feature matrix corresponding to the target sample object, and the predicted trajectory information of the target sample object in the target modality is determined based on the at least one target fusion feature vector, thereby realizing regional training of the target neural network to be trained and improving its training efficiency. Meanwhile, when the predicted trajectory information of the target object is determined by the trained target neural network, a higher value of the second information of at least one predicted trajectory line in a modality indicates a higher probability that the target object follows the at least one predicted trajectory line in that modality.
The sample fusion feature matrix may be the third sample fusion feature matrix, and the generation process of the third sample fusion feature matrix may refer to the generation process of the third fusion feature matrix, which is not described in detail herein. The sample fusion feature matrix includes at least one sample fusion feature vector corresponding to each modality. For example, if the modalities include the R1 modality, the R2 modality, the R3 modality, the R4 modality, the R5 modality, and the R6 modality, the sample fusion feature matrix contains at least one sample fusion feature vector corresponding to each of the R1 to R6 modalities.
If the target modality corresponding to the real trajectory of the target sample object is the R6 modality, at least one target fusion feature vector under the R6 modality can be determined from the sample fusion feature matrix corresponding to the target sample object, and the predicted trajectory information of the target sample object in the target modality (the R6 modality) is determined based on the at least one target fusion feature vector under the R6 modality. The predicted trajectory information includes first information of at least one predicted trajectory line in the target modality and second information of each predicted trajectory line in the target modality. That is, the Predicted Trajectories branch and the Predicted Scores branch are respectively used to perform feature extraction on the at least one target fusion feature vector under the R6 modality, so as to generate first information of at least one predicted trajectory line and second information of each predicted trajectory line of the target sample object in the R6 modality.
For example, among the sample fusion feature vectors in the sample fusion feature matrix corresponding to the target sample object, the sample fusion feature vectors other than the at least one target fusion feature vector under the target modality may be determined, and these other sample fusion feature vectors are input into the Predicted Scores branch to obtain second information of the at least one predicted trajectory line in each modality other than the target modality.
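A small sketch of how the target fusion feature vectors under the target modality might be separated from the remaining sample fusion feature vectors; the assumption that the vectors are stored contiguously per modality is introduced here purely for illustration:

```python
import torch

def split_by_modality(sample_fusion, n_modalities, target_mod):
    # sample_fusion: (K, D) third sample fusion feature matrix,
    # assumed grouped contiguously per modality, K = n_modalities * per_mod
    per_mod = sample_fusion.size(0) // n_modalities
    modality_id = torch.arange(sample_fusion.size(0)) // per_mod
    target_vecs = sample_fusion[modality_id == target_mod]  # fed to both branches
    other_vecs = sample_fusion[modality_id != target_mod]   # scored by the selector only
    return target_vecs, other_vecs
```

Under this layout, target_vecs would be passed to both the Predicted Trajectories and the Predicted Scores branches, while other_vecs would only be scored by the Predicted Scores branch, matching the regional training described above.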
In step a3, in an optional implementation manner, training the target neural network to be trained based on the predicted trajectory information corresponding to the target sample object and the sample trajectory information of the target sample object included in the sample data until the target neural network to be trained satisfies a preset condition, to obtain the target neural network for performing trajectory prediction, which may include:
step A31, generating a loss value of the training based on the predicted track information corresponding to the target sample object and the sample track information of the target sample object included in the sample data; the predicted track information corresponding to the target sample object comprises first information of predicted track lines and second information used for representing the confidence degree of each predicted track line; the loss values include: at least one of a regression loss value used for representing the deviation of the predicted trajectory, a deviation loss value used for representing the deviation of the confidence coefficient and the deviation of the tail end of the predicted trajectory and a classification loss value used for representing the deviation of the modal class corresponding to the predicted trajectory;
and A32, training the target neural network to be trained based on the loss value until the target neural network to be trained meets a preset condition, and obtaining the target neural network for track prediction.
Here, the loss value of the training may include a regression loss value, a classification loss value, and a deviation loss value. By setting multiple types of loss values, the target neural network to be trained can be trained more accurately, so that the trained target neural network has better performance.
In step A31, when the loss value includes a regression loss value, the regression loss value may be determined according to the following formula (1):

L_reg = (1/N) Σ_{i=1}^{N} Huber(S_i, S_gt); (1)

where N is the number of predicted trajectory lines generated for the target sample object; S_i is the first information of the i-th predicted trajectory line; S_gt is the real trajectory information indicated by the sample trajectory information of the target sample object (i.e., the information of the real trajectory line); and Huber denotes the Huber loss function commonly used in neural networks.
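A minimal PyTorch sketch of equation (1); using the smooth L1 loss as the Huber variant with delta = 1 is an assumption about the exact Huber parameterization:

```python
import torch
import torch.nn.functional as F

def regression_loss(pred_trajs, gt_traj):
    # pred_trajs: (N, T, 2) first information of the N predicted trajectory lines
    # gt_traj:    (T, 2)    real trajectory line of the target sample object
    gt = gt_traj.unsqueeze(0).expand_as(pred_trajs)
    # Huber loss averaged over the N predicted trajectory lines, as in equation (1)
    return F.smooth_l1_loss(pred_trajs, gt)
```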
In the case where the predicted trajectory information corresponding to the target sample object includes a plurality of predicted trajectory lines and the loss value includes a deviation loss value, generating the loss value of the current training based on the predicted trajectory information corresponding to the target sample object and the sample trajectory information of the target sample object included in the sample data includes:
step A311, determining deviation information between the position information of the trajectory end of each predicted trajectory line corresponding to the target sample object and the position information of the trajectory end of the real trajectory line in the sample trajectory information corresponding to the target sample object;
step A312, for each predicted trajectory line corresponding to the target sample object, determining a deviation ratio corresponding to the predicted trajectory line based on the deviation information of the predicted trajectory line and the deviation information of other predicted trajectory lines except the predicted trajectory line in the plurality of predicted trajectory lines corresponding to the target sample object; generating a confidence coefficient proportion corresponding to the predicted trajectory line based on the second information corresponding to the predicted trajectory line and the second information corresponding to other predicted trajectory lines;
step A313, generating a deviation loss value of the training based on the deviation proportion and the confidence proportion respectively corresponding to the plurality of predicted trajectory lines of the target sample object.
The deviation ratio φ(S, S_gt) corresponding to each predicted trajectory line may be determined according to the following equations (2) and (3):

D_i = ||s_{i,T} − s_{gt,T}||_2; (2)

φ_i(S, S_gt) = exp(−D_i) / Σ_{j=1}^{N} exp(−D_j); (3)

where s_{i,T} is the position information of the trajectory end of the i-th predicted trajectory line, and s_{gt,T} is the position information of the trajectory end indicated by the real trajectory line of the target sample object.
The confidence ratio is determined according to the following equation (4):

Θ_i(Y_region) = exp(C(y_i)) / Σ_{j=1}^{N} exp(C(y_j)); (4)

where Y_region = {y_1, ..., y_N} is the sample fusion feature matrix, y_1, ..., y_N are the sample fusion feature vectors included in the sample fusion feature matrix, and C(y_i) is the second information (e.g., the score) generated based on each third sample fusion feature vector, i.e., the second information of each predicted trajectory line corresponding to the target sample object.
The deviation loss value may be determined according to the following equation (5):

L_conf = KL(Θ(Y_region), φ(S, S_gt)); (5)

where KL may be the loss function corresponding to the Kullback-Leibler divergence.
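The deviation loss of equations (2) through (5) can be sketched as follows; the Euclidean endpoint distance and the softmax forms follow the reconstruction above and should be read as assumptions:

```python
import torch
import torch.nn.functional as F

def deviation_loss(pred_trajs, gt_traj, scores):
    # pred_trajs: (N, T, 2); gt_traj: (T, 2); scores: (N,) second information C(y_i)
    end_dist = torch.norm(pred_trajs[:, -1, :] - gt_traj[-1, :], dim=-1)  # D_i, eq. (2)
    phi = F.softmax(-end_dist, dim=0)         # deviation ratio, eq. (3)
    log_theta = F.log_softmax(scores, dim=0)  # confidence ratio in log space, eq. (4)
    # F.kl_div(log_q, p) computes KL(p || q): the confidence distribution is
    # pulled toward the deviation-based target distribution, eq. (5)
    return F.kl_div(log_theta, phi, reduction='sum')
```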
Under the condition that the loss value comprises a classification loss value, generating the loss value of the training based on the predicted track information corresponding to the target sample object and the sample track information of the target sample object included in the sample data, wherein the method comprises the following steps:
step A314, based on the predicted track information of the target sample object in each mode, determining the sum of the second information of at least one predicted track line contained in the mode;
step A315, generating a classification loss value of the training based on the sum of the second information respectively corresponding to each mode and the target mode corresponding to the sample data.
The classification loss value may be generated according to the following equations (6) and (7):

R(C)_j = Σ_{y_i ∈ R_j} C(y_i), j ∈ {1, …, M−1}; (6)

L_cls = CE(R(C), GT); (7)

where R_j denotes the j-th local region (modality), R(C) contains the sum of the second information of the at least one predicted trajectory line in each modality, CE is the cross-entropy loss function, GT is the target modality corresponding to the sample data (for example, when the target modality is the R6 modality, the value of GT may be 6), and L_cls is the classification loss value.
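A sketch of equations (6) and (7); grouping the K scores contiguously per modality is, as above, an assumption about the layout of the predicted trajectory lines:

```python
import torch
import torch.nn.functional as F

def classification_loss(scores, n_modalities, gt_modality):
    # scores: (K,) second information of all predicted trajectory lines,
    # assumed grouped contiguously per modality, K = n_modalities * per_mod
    per_mod = scores.size(0) // n_modalities
    region_scores = scores.view(n_modalities, per_mod).sum(dim=1)  # R(C), eq. (6)
    # cross-entropy against the 0-based target modality index GT, eq. (7)
    return F.cross_entropy(region_scores.unsqueeze(0), torch.tensor([gt_modality]))
```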
In step a32, when the loss value includes a regression loss value, a deviation loss value, and a classification loss value, training the target neural network to be trained based on the loss value until the target neural network to be trained satisfies a preset condition, to obtain a target neural network for performing trajectory prediction, which may include:
step A321, determining a total loss value based on a first loss weight to be trained corresponding to the regression loss value, a second loss weight to be trained corresponding to the deviation loss value, and a third loss weight to be trained corresponding to the classification loss value, the regression loss value, the deviation loss value and the classification loss value;
step A322, training the target neural network to be trained based on the total loss value until the target neural network to be trained meets a preset condition, and obtaining the target neural network for performing the trajectory prediction.
The total loss value may be calculated according to the following equation (8):

L_total = σ_1 · L_reg + σ_2 · L_conf + σ_3 · L_cls; (8)

where σ_1 is the first loss weight, σ_2 is the second loss weight, and σ_3 is the third loss weight.
Training the target neural network to be trained by using the total loss value until the target neural network to be trained meets a preset condition, for example, until the total loss value of the target neural network to be trained is smaller than a set loss threshold; or until the accuracy of the target neural network to be trained meets an accuracy threshold, etc.
Each loss value corresponds to a loss weight to be trained. When the target neural network is trained, the first loss weight, the second loss weight, and the third loss weight are trained simultaneously. Through the trained first, second, and third loss weights, together with the regression loss value, the deviation loss value, and the classification loss value, the total loss value can be determined more accurately, so that when the target neural network to be trained is trained based on the total loss value, the trained target neural network can have better performance.
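Since the text only specifies that the three loss weights are trained together with the network, the sketch below assumes an uncertainty-style parameterization (a plain weighted sum with free weights would collapse toward zero during training); the exact formula is therefore an assumption, not the disclosed equation (8):

```python
import torch
import torch.nn as nn

class TotalLoss(nn.Module):
    def __init__(self):
        super().__init__()
        # log-space parameters keep the trained weights sigma_1..sigma_3 positive
        self.log_sigma = nn.Parameter(torch.zeros(3))

    def forward(self, l_reg, l_conf, l_cls):
        losses = torch.stack([l_reg, l_conf, l_cls])
        inv_var = torch.exp(-2.0 * self.log_sigma)  # 1 / sigma_i^2
        # per-loss weighting plus a log-sigma regularizer that prevents collapse
        return (0.5 * inv_var * losses + self.log_sigma).sum()
```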
Referring to fig. 4, a flow chart of a driving control method provided in the embodiment of the present disclosure is shown, where the method includes:
s401, acquiring historical track information of at least one moving object acquired by a driving device in the driving process and traffic road information of the driving device in the driving process;
s402, generating predicted track information corresponding to each moving object based on the traffic road information, the track information of at least one moving object and the track prediction method disclosed by the embodiment;
at S403, the traveling apparatus is controlled based on the predicted trajectory information corresponding to each moving object.
For example, the traveling device may be an autonomous vehicle, a vehicle equipped with an Advanced Driver Assistance System (ADAS), a robot, or the like. The moving object may be any object that may move and appear on the road; for example, the moving object may be a vehicle, a pedestrian, or the like.
In the method, the traffic road information of the road on which the traveling device travels may also be collected, and the acquired historical trajectory information of the at least one moving object and the traffic road information may be input into the target neural network used by the above trajectory prediction method to generate predicted trajectory information corresponding to each moving object. Further, the traveling device may be controlled based on the predicted trajectory information corresponding to each moving object.
When the driving device is controlled, the driving device can be controlled to accelerate, decelerate, turn, brake and the like, or voice prompt information can be played to prompt a driver to control the driving device to accelerate, decelerate, turn, brake and the like.
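Purely as an illustration of how predicted trajectory information could drive such a control decision (the thresholds, the collision test, and the command set below are hypothetical and not the control logic of the disclosure):

```python
def control_command(predictions, planned_path, score_thresh=0.5, safety_radius=2.0):
    # predictions: per moving object, a list of (trajectory, score) pairs, where
    # trajectory is a sequence of (x, y) points and score is its confidence
    for trajs in predictions:
        for traj, score in trajs:
            if score < score_thresh:
                continue  # ignore low-confidence predicted trajectory lines
            for (px, py) in planned_path:
                if any(((x - px) ** 2 + (y - py) ** 2) ** 0.5 < safety_radius
                       for (x, y) in traj):
                    return "decelerate"  # or play a voice prompt to the driver
    return "maintain"
```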
In the method, the predicted track information of each moving object can be accurately determined by using the track prediction method provided by the embodiment, the determined predicted track information is rich, and further, the driving device can be accurately controlled based on the predicted track information corresponding to each moving object, so that the driving safety of the driving device is improved.
It will be understood by those skilled in the art that, in the above method of the present disclosure, the order in which the steps are written does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of the steps should be determined by their functions and possible internal logic.
Based on the same concept, an embodiment of the present disclosure further provides a trajectory prediction apparatus, as shown in fig. 5, an architecture diagram of the trajectory prediction apparatus provided in the embodiment of the present disclosure includes a first obtaining module 501, a first generating module 502, a second generating module 503, and a determining module 504, specifically:
a first obtaining module 501, configured to obtain historical track information of a detection object included in a target scene and traffic road information of the target scene;
a first generating module 502, configured to generate a first fused feature matrix corresponding to the detected object based on the historical trajectory information of the detected object and the trained initial feature matrix; wherein the initial feature matrix is used for characterizing the trajectory distribution feature of each of a plurality of modalities, each modality being used for characterizing a driving direction and/or a driving distance;
a second generating module 503, configured to generate a second fusion feature matrix corresponding to the detection object based on the traffic road information and the first fusion feature matrix corresponding to the detection object;
a determining module 504, configured to determine, based on the second fused feature matrix corresponding to the detection object, predicted trajectory information of a target object in the multiple modalities, where the target object is one of the detection objects.
In a possible implementation manner, the initial feature matrix includes a preset number of initial feature vectors corresponding to each of the plurality of modalities; the predicted trajectory information in the plurality of modalities includes first information of the preset number of predicted trajectory lines in each modality of the plurality of modalities and second information used for representing confidence degrees corresponding to the predicted trajectory lines.
In a possible implementation, the first obtaining module 501 is configured to determine the at least one detection object according to the following steps:
determining the area range of the target object in the real scene;
and taking a movable object including the target object in the area range as the detection object.
In one possible embodiment, the trajectory prediction method is performed by a target neural network for performing trajectory prediction.
In one possible implementation, the first generating module 502, when generating the first fused feature matrix corresponding to the detected object based on the historical track information of the detected object and the trained initial feature matrix, is configured to:
performing feature extraction processing on the historical track information of the detection object to generate first feature information corresponding to the historical track information of the detection object;
and taking the initial feature matrix as a proposal parameter of a first decoder in the target neural network, and inputting the first feature information into the first decoder to obtain a first fusion feature matrix of the detection object output by the first decoder.
In one possible implementation manner, the second generating module 503, when generating the second fused feature matrix corresponding to the detection object based on the traffic road information and the first fused feature matrix corresponding to the detection object, is configured to:
carrying out feature extraction processing on the traffic road information to generate second feature information;
and taking the first fusion feature matrix as a proposal parameter of a second decoder in the target neural network, and inputting the second feature information into the second decoder to obtain a second fusion feature matrix of the detection object output by the second decoder.
In a possible implementation manner, in a case where there are a plurality of detection objects, when determining predicted trajectory information of a target object among the detection objects in the plurality of modalities based on the second fusion feature matrices respectively corresponding to the detection objects, the determining module 504 is configured to:
performing feature fusion on each fusion feature vector of the second fusion feature matrix corresponding to other detection objects aiming at other detection objects except the target object to obtain intermediate feature vectors corresponding to the other detection objects;
generating a third fusion feature matrix corresponding to the target object based on the intermediate feature vectors corresponding to the other detection objects and the second fusion feature matrix of the target object;
and determining predicted trajectory information of the target object under the plurality of modalities based on the third fusion feature matrix corresponding to the target object.
In a possible implementation manner, the determining module 504, when generating a third fused feature matrix corresponding to the target object based on the intermediate feature vectors corresponding to the other detected objects and the second fused feature matrix of the target object, is configured to:
and taking the second fusion feature matrix of the target object as a proposal parameter of a third decoder in the target neural network, and inputting the intermediate feature vector into the third decoder to obtain a third fusion feature matrix of the target object output by the third decoder.
In a possible embodiment, the apparatus further comprises: a training module 505 for:
acquiring sample data;
obtaining the predicted track information of a target sample object in the sample data based on a target neural network to be trained and the sample data;
training the target neural network to be trained based on the predicted trajectory information corresponding to the target sample object and the sample trajectory information of the target sample object included in the sample data until the target neural network to be trained meets a preset condition, and obtaining the target neural network for performing trajectory prediction.
In a possible embodiment, the apparatus further comprises: a partitioning module 506 to:
dividing a scene area corresponding to the sample data into a plurality of local areas, and determining a mode matched with each local area;
determining a target modality corresponding to the sample data based on the local region where the real trajectory of the target sample object indicated by the sample data is located and the modality matched with each local region;
the training module 505, when obtaining the predicted trajectory information of the target sample object in the sample data based on the target neural network to be trained and the sample data, is configured to:
and obtaining the predicted trajectory information of the target sample object under the target mode based on the target neural network to be trained, the sample data and the target mode corresponding to the sample data.
In a possible implementation manner, the training module 505, when obtaining the predicted trajectory information of the target sample object under the target modality based on the target neural network to be trained, the sample data, and the target modality corresponding to the sample data, is configured to:
performing feature processing on the sample data by using the target neural network to be trained to generate a sample fusion feature matrix corresponding to the target sample object, wherein the sample fusion feature matrix comprises at least one sample fusion feature vector corresponding to each mode;
determining at least one target fusion feature vector corresponding to the target modality from the sample fusion feature matrix;
and obtaining the predicted trajectory information of the target sample object in the target modality based on the at least one target fusion feature vector corresponding to the target modality.
In a possible implementation manner, the training module 505, when training the target neural network to be trained based on the predicted trajectory information corresponding to the target sample object and the sample trajectory information of the target sample object included in the sample data, until the target neural network to be trained satisfies a preset condition, and obtaining the target neural network for trajectory prediction, is configured to:
generating a loss value of the training based on the predicted track information of the target sample object and the sample track information of the target sample object included in the sample data; the predicted track information corresponding to the target sample object comprises first information of predicted track lines and second information used for representing the confidence degree of each predicted track line; the loss values include: at least one of a regression loss value used for representing the deviation of the predicted trajectory, a deviation loss value used for representing the deviation of the confidence coefficient and the deviation of the tail end of the predicted trajectory and a classification loss value used for representing the deviation of the modal class corresponding to the predicted trajectory;
and training the target neural network to be trained based on the loss value until the target neural network to be trained meets a preset condition to obtain the target neural network for predicting the track.
In one possible embodiment, when the predicted trajectory line corresponding to the target sample object includes a plurality of predicted trajectory lines and the loss value includes the deviation loss value, the training module 505, when generating the loss value for the current training based on the predicted trajectory information corresponding to the target sample object and the sample trajectory information of the target sample object included in the sample data, is configured to:
determining position information of the tail end of the track of each predicted track line corresponding to the target sample object and deviation information between the position information of the tail end of the track of the real track line in the sample track information corresponding to the target sample object;
for each predicted trajectory line corresponding to the target sample object, determining a deviation proportion corresponding to the predicted trajectory line based on the deviation information of the predicted trajectory line and the deviation information of other predicted trajectory lines except the predicted trajectory line in the plurality of predicted trajectory lines corresponding to the target sample object; generating a confidence coefficient proportion corresponding to the predicted trajectory line based on the second information corresponding to the predicted trajectory line and the second information corresponding to the other predicted trajectory lines;
and generating the deviation loss value of the training based on the deviation proportion and the confidence coefficient proportion respectively corresponding to the plurality of predicted trajectory lines of the target sample object.
In one possible embodiment, when the loss value includes a classification loss value, the training module 505, when generating the loss value of the current training based on the predicted trajectory information corresponding to the target sample object and the sample trajectory information of the target sample object included in the sample data, is configured to:
determining a sum of second information of at least one predicted trajectory line contained in each modality based on predicted trajectory information of the target sample object in the modality;
and generating the classification loss value of the training based on the sum of the second information respectively corresponding to each mode and the target mode corresponding to the sample data.
In a possible implementation manner, in a case that the loss value includes the regression loss value, the deviation loss value, and the classification loss value, the training module 505, when training the target neural network to be trained based on the loss value until the target neural network to be trained satisfies a preset condition to obtain the target neural network for trajectory prediction, is configured to:
determining a total loss value based on a first loss weight to be trained corresponding to the regression loss value, a second loss weight to be trained corresponding to the deviation loss value, a third loss weight to be trained corresponding to the classification loss value, the regression loss value, the deviation loss value and the classification loss value;
and training the target neural network to be trained based on the total loss value until the target neural network to be trained meets a preset condition, so as to obtain the target neural network for predicting the track.
Based on the same concept, an embodiment of the present disclosure further provides a driving control device, as shown in fig. 6, which is an architecture schematic diagram of the driving control device provided in the embodiment of the present disclosure, and includes a second obtaining module 601, a third generating module 602, and a control module 603, specifically:
a second obtaining module 601, configured to obtain historical track information of at least one moving object collected by a driving device in a driving process;
a third generating module 602, configured to generate predicted trajectory information corresponding to each of the at least one moving object based on the trajectory information of the at least one moving object and the trajectory prediction method described in the foregoing embodiment;
a control module 603 configured to control the driving device based on the predicted trajectory information corresponding to each of the moving objects.
In some embodiments, the functions of the apparatus provided in the embodiments of the present disclosure, or the modules included therein, may be used to execute the methods described in the above method embodiments; for specific implementation, reference may be made to the description of the above method embodiments, and for brevity, details are not repeated here.
Based on the same technical concept, the embodiment of the disclosure also provides an electronic device. Referring to fig. 7, a schematic structural diagram of an electronic device provided in the embodiment of the present disclosure includes a processor 701, a memory 702, and a bus 703. The memory 702 is used for storing execution instructions and includes a memory 7021 and an external memory 7022; the memory 7021 is also referred to as an internal memory, and is used to temporarily store operation data in the processor 701 and data exchanged with an external memory 7022 such as a hard disk, the processor 701 exchanges data with the external memory 7022 through the memory 7021, and when the electronic device 700 is operated, the processor 701 and the memory 702 communicate with each other through the bus 703, so that the processor 701 executes the following instructions:
acquiring historical track information of a detection object included in a target scene and traffic road information of the target scene;
generating a first fusion feature matrix corresponding to the detection object based on the historical track information of the detection object and the trained initial feature matrix; wherein the initial feature matrix is used for characterizing the trajectory distribution feature of each of a plurality of modalities, each modality being used for characterizing a driving direction and/or a driving distance;
generating a second fusion feature matrix corresponding to the detection object based on the traffic road information and the first fusion feature matrix corresponding to the detection object;
and determining predicted trajectory information of a target object under the plurality of modalities based on the second fusion feature matrix corresponding to the detection objects, wherein the target object is one of the detection objects.
Based on the same technical concept, the embodiment of the disclosure also provides an electronic device. Referring to fig. 8, a schematic structural diagram of an electronic device provided in the embodiment of the present disclosure includes a processor 801, a memory 802, and a bus 803. The memory 802 is used for storing execution instructions and includes a memory 8021 and an external memory 8022; the memory 8021 is also referred to as an internal memory, and is used for temporarily storing operation data in the processor 801 and data exchanged with an external memory 8022 such as a hard disk, the processor 801 exchanges data with the external memory 8022 through the memory 8021, and when the electronic device 800 operates, the processor 801 communicates with the memory 802 through the bus 803, so that the processor 801 executes the following instructions:
acquiring historical track information of at least one moving object acquired by a driving device in the driving process and traffic road information of the driving device in the driving process;
generating predicted trajectory information corresponding to each moving object based on the traffic road information, the trajectory information of the at least one moving object, and the trajectory prediction method described in any one of the above embodiments;
and controlling the running device based on the predicted track information corresponding to each moving object.
Furthermore, the disclosed embodiments also provide a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to execute the steps of the trajectory prediction method and the travel control method described in the above method embodiments. The storage medium may be a volatile or non-volatile computer-readable storage medium.
The embodiments of the present disclosure also provide a computer program product, where the computer program product carries a program code, and instructions included in the program code may be used to execute the steps of the trajectory prediction method and the driving control method in the foregoing method embodiments, which may be referred to specifically for the foregoing method embodiments, and are not described herein again.
The computer program product may be implemented by hardware, software or a combination thereof. In an alternative embodiment, the computer program product is embodied in a computer storage medium, and in another alternative embodiment, the computer program product is embodied in a Software product, such as a Software Development Kit (SDK), or the like.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above are only specific embodiments of the present disclosure, but the scope of the present disclosure is not limited thereto, and any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the present disclosure, and shall be covered by the scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (20)

1. A trajectory prediction method, comprising:
acquiring historical track information of a detection object included in a target scene and traffic road information of the target scene;
generating a first fusion feature matrix corresponding to the detection object based on the historical track information of the detection object and the trained initial feature matrix; wherein the initial feature matrix is used for characterizing the trajectory distribution feature of each of a plurality of modalities, each modality being used for characterizing a driving direction and/or a driving distance;
generating a second fusion feature matrix corresponding to the detection object based on the traffic road information and the first fusion feature matrix corresponding to the detection object;
and determining predicted trajectory information of a target object under the plurality of modalities based on the second fusion feature matrix corresponding to the detection objects, wherein the target object is one of the detection objects.
2. The method according to claim 1, wherein the initial feature matrix comprises a preset number of initial feature vectors corresponding to each of the plurality of modalities;
the predicted trajectory information in the plurality of modalities includes first information of the preset number of predicted trajectory lines in each modality of the plurality of modalities and second information used for representing confidence degrees corresponding to the predicted trajectory lines.
3. The method according to claim 1 or 2, characterized in that the detection object is determined according to the following steps:
determining the area range of the target object in the target scene;
and taking a movable object including the target object in the area range as the detection object.
4. The method according to any one of claims 1 to 3, wherein the trajectory prediction method is performed by a target neural network for performing trajectory prediction.
5. The method of claim 4, wherein generating a first fused feature matrix corresponding to the detection object based on the historical trajectory information of the detection object and a trained initial feature matrix comprises:
performing feature extraction processing on the historical track information of the detection object to generate first feature information corresponding to the historical track information of the detection object;
and taking the initial feature matrix as a proposal parameter of a first decoder in the target neural network, and inputting the first feature information into the first decoder to obtain a first fusion feature matrix of the detection object output by the first decoder.
6. The method according to claim 4 or 5, wherein the generating a second fused feature matrix corresponding to the detection object based on the traffic road information and the first fused feature matrix corresponding to the detection object comprises:
carrying out feature extraction processing on the traffic road information to generate second feature information;
and taking the first fusion feature matrix as a proposal parameter of a second decoder in the target neural network, and inputting the second feature information into the second decoder to obtain a second fusion feature matrix of the detection object output by the second decoder.
7. The method according to any one of claims 4 to 6, wherein, when there are a plurality of detection objects, the determining predicted trajectory information of the target object in the plurality of modalities based on the second fused feature matrix corresponding to the detection objects includes:
performing feature fusion on each fusion feature vector of the second fusion feature matrix corresponding to other detection objects aiming at other detection objects except the target object to obtain intermediate feature vectors corresponding to the other detection objects;
generating a third fusion feature matrix corresponding to the target object based on the intermediate feature vectors corresponding to the other detection objects and the second fusion feature matrix of the target object;
and determining predicted trajectory information of the target object under the plurality of modalities based on the third fusion feature matrix corresponding to the target object.
8. The method according to claim 7, wherein the generating a third fused feature matrix corresponding to the target object based on the intermediate feature vectors corresponding to the other detected objects and the second fused feature matrix of the target object comprises:
and taking the second fusion feature matrix of the target object as a proposal parameter of a third decoder in the target neural network, and inputting the intermediate feature vector into the third decoder to obtain a third fusion feature matrix of the target object output by the third decoder.
9. The method according to any one of claims 4 to 8, wherein the target neural network for trajectory prediction is trained by using the following method:
acquiring sample data;
obtaining the predicted track information of a target sample object in the sample data based on a target neural network to be trained and the sample data;
training the target neural network to be trained based on the predicted trajectory information corresponding to the target sample object and the sample trajectory information of the target sample object included in the sample data until the target neural network to be trained meets a preset condition, and obtaining the target neural network for performing trajectory prediction.
10. The method of claim 9, further comprising:
dividing a scene area corresponding to the sample data into a plurality of local areas, and determining a mode matched with each local area;
determining a target modality corresponding to the sample data based on a local region where a real trajectory of a target sample object included in the sample data is located and a modality matched with each local region;
the obtaining of the predicted trajectory information of the target sample object in the sample data based on the target neural network to be trained and the sample data comprises:
and obtaining the predicted trajectory information of the target sample object under the target mode based on the target neural network to be trained, the sample data and the target mode corresponding to the sample data.
11. The method according to claim 10, wherein the obtaining predicted trajectory information of the target sample object in the target modality based on the target neural network to be trained, the sample data, and the target modality corresponding to the sample data comprises:
performing feature processing on the sample data by using the target neural network to be trained to generate a sample fusion feature matrix corresponding to the target sample object, wherein the sample fusion feature matrix comprises at least one sample fusion feature vector corresponding to each mode;
determining at least one target fusion feature vector corresponding to the target modality from the sample fusion feature matrix;
and obtaining the predicted trajectory information of the target sample object in the target modality based on the at least one target fusion feature vector corresponding to the target modality.
12. The method according to any one of claims 9 to 11, wherein the training the target neural network to be trained based on the predicted trajectory information corresponding to the target sample object and the sample trajectory information of the target sample object included in the sample data until the target neural network to be trained satisfies a preset condition to obtain the target neural network for trajectory prediction includes:
generating a loss value of the training based on the predicted track information of the target sample object and the sample track information of the target sample object included in the sample data; the predicted track information corresponding to the target sample object comprises first information of predicted track lines and second information used for representing the confidence degree of each predicted track line; the loss values include: at least one of a regression loss value used for representing the deviation of the predicted trajectory, a deviation loss value used for representing the deviation of the confidence coefficient and the deviation of the tail end of the predicted trajectory and a classification loss value used for representing the deviation of the modal class corresponding to the predicted trajectory;
and training the target neural network to be trained based on the loss value until the target neural network to be trained meets a preset condition to obtain the target neural network for predicting the track.
13. The method according to claim 12, wherein in a case that the predicted trajectory corresponding to the target sample object includes a plurality of paths and the loss value includes the deviation loss value, the generating a loss value for the current training based on the predicted trajectory information of the target sample object and the sample trajectory information of the target sample object included in the sample data includes:
determining position information of the tail end of the track of each predicted track line corresponding to the target sample object and deviation information between the position information of the tail end of the track of the real track line in the sample track information corresponding to the target sample object;
for each predicted trajectory line corresponding to the target sample object, determining a deviation proportion corresponding to the predicted trajectory line based on the deviation information of the predicted trajectory line and the deviation information of other predicted trajectory lines except the predicted trajectory line in the plurality of predicted trajectory lines corresponding to the target sample object; generating a confidence coefficient proportion corresponding to the predicted trajectory line based on the second information corresponding to the predicted trajectory line and the second information corresponding to the other predicted trajectory lines;
and generating the deviation loss value of the training based on the deviation proportion and the confidence coefficient proportion respectively corresponding to the plurality of predicted trajectory lines of the target sample object.
14. The method of claim 12, wherein in a case that the loss value includes a classification loss value, the generating a loss value for the current training based on the predicted trajectory information of the target sample object and the sample trajectory information of the target sample object included in the sample data comprises:
determining a sum of second information of at least one predicted trajectory line contained in each modality based on predicted trajectory information of the target sample object in the modality;
and generating the classification loss value of the training based on the sum of the second information respectively corresponding to each mode and the target mode corresponding to the sample data.
15. The method according to any one of claims 12 to 14, wherein in a case where the loss value includes the regression loss value, the deviation loss value, and the classification loss value, the training the target neural network to be trained based on the loss value until the target neural network to be trained satisfies a preset condition includes:
determining a total loss value based on a first loss weight to be trained corresponding to the regression loss value, a second loss weight to be trained corresponding to the deviation loss value, a third loss weight to be trained corresponding to the classification loss value, the regression loss value, the deviation loss value and the classification loss value;
and training the target neural network to be trained based on the total loss value until the target neural network to be trained meets a preset condition.
16. A travel control method characterized by comprising:
acquiring historical track information of at least one moving object acquired by a driving device in the driving process and traffic road information of the driving device in the driving process;
generating predicted trajectory information corresponding to each moving object based on the traffic road information, the historical trajectory information of the at least one moving object, and the trajectory prediction method according to any one of claims 1 to 15;
and controlling the travel device based on the predicted trajectory information corresponding to each moving object.
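To make the control loop of claim 16 concrete, here is a hypothetical sketch; `sensors`, `predictor`, and `controller` (and all their methods) are invented placeholders standing in for the trajectory prediction method of claims 1 to 15 and for whatever planner and actuator the travel device uses.

```python
def travel_control_step(sensors, predictor, controller):
    # Historical trajectories of the moving objects observed while travelling.
    histories = sensors.get_object_histories()
    # Traffic road information of the travel device (lanes, boundaries, signals).
    road_info = sensors.get_road_information()
    # Predicted trajectory information for each moving object (claims 1-15).
    predictions = predictor(histories, road_info)
    # Control the travel device, e.g. brake if a predicted path crosses our own.
    controller.plan_and_actuate(predictions)
```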
17. A trajectory prediction device, comprising:
a first acquisition module configured to acquire historical trajectory information of a detection object in a target scene and traffic road information of the target scene;
a first generation module configured to generate a first fusion feature matrix corresponding to the detection object based on the historical trajectory information of the detection object and a trained initial feature matrix, wherein the initial feature matrix is used for characterizing a trajectory distribution feature of each of a plurality of modalities, and each modality is used for characterizing a driving direction and/or a driving distance;
a second generation module configured to generate a second fusion feature matrix corresponding to the detection object based on the traffic road information and the first fusion feature matrix corresponding to the detection object;
and a determination module configured to determine, based on the second fusion feature matrix corresponding to the detection object, predicted trajectory information of a target object in the plurality of modalities, wherein the target object is one of the detection objects.
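The module structure of claim 17 maps naturally onto a small neural network. The sketch below is illustrative only: the claim fixes the data flow (history + initial feature matrix gives the first fusion; adding road information gives the second fusion; decoding gives multimodal predictions) but not the encoders, fusion operators, or tensor shapes, all of which are assumptions here.

```python
import torch
import torch.nn as nn

class TrajectoryPredictor(nn.Module):
    def __init__(self, num_modalities=6, hidden=128, horizon=30):
        super().__init__()
        # Trained initial feature matrix: one trajectory-distribution embedding
        # per modality (each modality ~ a driving direction and/or distance).
        self.modal_embed = nn.Parameter(torch.randn(num_modalities, hidden))
        self.history_enc = nn.GRU(input_size=2, hidden_size=hidden, batch_first=True)
        self.road_enc = nn.Linear(4, hidden)            # toy road-info descriptor
        self.decoder = nn.Linear(hidden, 2 * horizon)   # future (x, y) points
        self.horizon = horizon

    def forward(self, history, road):
        # history: (B, T, 2) past positions; road: (B, 4) road descriptor.
        _, h = self.history_enc(history)                          # (1, B, hidden)
        # First fusion feature matrix: history features + modality embeddings.
        fused1 = h.squeeze(0).unsqueeze(1) + self.modal_embed     # (B, K, hidden)
        # Second fusion feature matrix: inject traffic road information.
        fused2 = fused1 + self.road_enc(road).unsqueeze(1)        # (B, K, hidden)
        # One predicted trajectory line per modality.
        return self.decoder(fused2).view(history.size(0), -1, self.horizon, 2)
```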
18. A travel control device characterized by comprising:
a second acquisition module configured to acquire historical trajectory information of at least one moving object collected by a travel device during travel and traffic road information of the travel device during travel;
a third generation module configured to generate predicted trajectory information corresponding to each moving object based on the traffic road information, the historical trajectory information of the at least one moving object, and the trajectory prediction method according to any one of claims 1 to 15;
and a control module configured to control the travel device based on the predicted trajectory information corresponding to each moving object.
19. An electronic device, comprising a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the electronic device operates, the processor communicates with the memory via the bus, and the machine-readable instructions, when executed by the processor, perform the steps of the trajectory prediction method according to any one of claims 1 to 15, or the steps of the travel control method according to claim 16.
20. A computer-readable storage medium, in which a computer program is stored, wherein the computer program, when executed by a processor, performs the steps of the trajectory prediction method according to any one of claims 1 to 15, or the steps of the travel control method according to claim 16.
CN202110275813.1A 2021-03-15 2021-03-15 Trajectory prediction method, trajectory prediction device, travel control method, travel control device, electronic device, and storage medium Pending CN113033364A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110275813.1A CN113033364A (en) 2021-03-15 2021-03-15 Trajectory prediction method, trajectory prediction device, travel control method, travel control device, electronic device, and storage medium

Publications (1)

Publication Number Publication Date
CN113033364A 2021-06-25

Family

ID=76468751

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110275813.1A Pending CN113033364A (en) 2021-03-15 2021-03-15 Trajectory prediction method, trajectory prediction device, travel control method, travel control device, electronic device, and storage medium

Country Status (1)

Country Link
CN (1) CN113033364A (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020164089A1 (en) * 2019-02-15 2020-08-20 Bayerische Motoren Werke Aktiengesellschaft Trajectory prediction using deep learning multiple predictor fusion and bayesian optimization
CN111401233A (en) * 2020-03-13 2020-07-10 商汤集团有限公司 Trajectory prediction method, apparatus, electronic device, and medium
CN111523643A (en) * 2020-04-10 2020-08-11 商汤集团有限公司 Trajectory prediction method, apparatus, device and storage medium
CN111930110A (en) * 2020-06-01 2020-11-13 西安理工大学 Intention trajectory prediction method combining a social generative adversarial network
CN111942407A (en) * 2020-07-31 2020-11-17 商汤集团有限公司 Trajectory prediction method, apparatus, device and storage medium
CN112000756A (en) * 2020-08-21 2020-11-27 上海商汤智能科技有限公司 Method and device for predicting track, electronic equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wei Hanbing; Chen Yao; Jia Zhijie; Lai Feng: "Multi-object detection and tracking algorithm fusing historical trajectories for intelligent vehicles in complex urban environments", Journal of Xi'an Jiaotong University, no. 10 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023010827A1 (en) * 2021-08-02 2023-02-09 北京三快在线科技有限公司 Migration scene-based trajectory prediction model training method and apparatus
CN113740837A (en) * 2021-09-01 2021-12-03 广州文远知行科技有限公司 Obstacle tracking method, device, equipment and storage medium
CN113888601A (en) * 2021-10-26 2022-01-04 北京易航远智科技有限公司 Target trajectory prediction method, electronic device, and storage medium
CN113888601B (en) * 2021-10-26 2022-05-24 北京易航远智科技有限公司 Target trajectory prediction method, electronic device, and storage medium
CN114355839A (en) * 2022-03-18 2022-04-15 浙江西图盟数字科技有限公司 Motion trajectory processing method, device, equipment and storage medium
CN114355839B (en) * 2022-03-18 2022-07-29 浙江西图盟数字科技有限公司 Motion trajectory processing method, device, equipment and storage medium
CN115540893A (en) * 2022-11-30 2022-12-30 广汽埃安新能源汽车股份有限公司 Vehicle path planning method and device, electronic equipment and computer readable medium
CN115540893B (en) * 2022-11-30 2023-03-14 广汽埃安新能源汽车股份有限公司 Vehicle path planning method and device, electronic equipment and computer readable medium
CN116680656A (en) * 2023-07-31 2023-09-01 合肥海普微电子有限公司 Autonomous driving motion planning method and system based on a generative pre-trained transformer
CN116680656B (en) * 2023-07-31 2023-11-07 合肥海普微电子有限公司 Autonomous driving motion planning method and system based on a generative pre-trained transformer

Similar Documents

Publication Publication Date Title
CN113033364A (en) Trajectory prediction method, trajectory prediction device, travel control method, travel control device, electronic device, and storage medium
CN111971574B (en) Deep learning based feature extraction for LIDAR localization of autonomous vehicles
CN111771135B (en) LIDAR positioning using RNN and LSTM for time smoothing in autonomous vehicles
Zhao et al. Where are you heading? Dynamic trajectory prediction with expert goal examples
EP3807837A1 (en) Vehicle re-identification techniques using neural networks for image analysis, viewpoint-aware pattern recognition, and generation of multi-view vehicle representations
CN110325935A (en) The lane guide line based on Driving Scene of path planning for automatic driving vehicle
CN109491375A (en) The path planning based on Driving Scene for automatic driving vehicle
Wang et al. LTP: Lane-based trajectory prediction for autonomous driving
CN111771141A (en) LIDAR positioning in autonomous vehicles using 3D CNN networks for solution inference
CN111062405A (en) Method and device for training image recognition model and image recognition method and device
CN112381227B (en) Neural network generation method and device, electronic equipment and storage medium
Kawewong et al. Position-invariant robust features for long-term recognition of dynamic outdoor scenes
CN113537445A (en) Trajectory prediction method, apparatus, device and storage medium
CN113673533A (en) Model training method and related equipment
CN108304852B (en) Method and device for determining road section type, storage medium and electronic device
Kawasaki et al. Multimodal trajectory predictions for autonomous driving without a detailed prior map
CN112926461A (en) Neural network training and driving control method and device
KR20200075053A (en) Apparatus for managing driving pattern based on object recognition, vehicle driving controlling apparatus using the same and method thereof
EP4095812A1 (en) Method for predicting a trajectory of an agent in a vicinity of a self-driving vehicle based on ranking
CN112912894B (en) Road boundary identification method and device
Gross et al. Route and stopping intent prediction at intersections from car fleet data
JP2023036795A (en) Image processing method, model training method, apparatus, electronic device, storage medium, computer program, and self-driving vehicle
CN116467615A (en) Clustering method and device for vehicle tracks, storage medium and electronic device
CN110728769B (en) Vehicle driving state recognition method and device, storage medium and electronic equipment
CN113119996B (en) Trajectory prediction method and apparatus, electronic device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code
Ref country code: HK
Ref legal event code: DE
Ref document number: 40049893
Country of ref document: HK