CN111639591A - Trajectory prediction model generation method and device, readable storage medium and electronic equipment - Google Patents


Info

Publication number
CN111639591A
CN111639591A
Authority
CN
China
Prior art keywords
initial model
track
image sequence
model
initial
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010469558.XA
Other languages
Chinese (zh)
Other versions
CN111639591B (en)
Inventor
范坤
陈迈越
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Horizon Robotics Science and Technology Co Ltd
Original Assignee
Shenzhen Horizon Robotics Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Horizon Robotics Science and Technology Co Ltd filed Critical Shenzhen Horizon Robotics Science and Technology Co Ltd
Priority to CN202010469558.XA priority Critical patent/CN111639591B/en
Publication of CN111639591A publication Critical patent/CN111639591A/en
Application granted granted Critical
Publication of CN111639591B publication Critical patent/CN111639591B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/41Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V20/42Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The embodiment of the disclosure discloses a method and a device for generating a track prediction model, wherein the method comprises the following steps: acquiring a sample image sequence, wherein a sample image in the sample image sequence comprises a movable object; performing track prediction on the sample image sequence by using each initial model in a preset initial model set to obtain a predicted track tensor corresponding to each initial model, wherein the predicted track tensor is used for representing the moving track of the movable object; for each initial model in the initial model set, determining the distance from the predicted track tensor corresponding to the initial model to the predicted track tensors corresponding to other initial models respectively; determining a loss value of the initial model based on the obtained distances; and training the initial model based on the loss value to obtain a track prediction model. The embodiment of the disclosure can improve the trajectory prediction performance of the model, is beneficial to reducing the complexity of the trajectory prediction model, and improves the efficiency of model training and model application.

Description

Trajectory prediction model generation method and device, readable storage medium and electronic equipment
Technical Field
The disclosure relates to the technical field of computers, and in particular to a trajectory prediction model generation method and device, a readable storage medium and an electronic device.
Background
Most existing prediction algorithms for movable objects such as vehicles are based on deep neural network models, which use captured video to make various types of predictions, such as trajectories and obstacles. Because these predictions operate on video and the computation involved is very complex, the tasks are typically handled by complex neural network models.
Disclosure of Invention
The embodiment of the disclosure provides a track prediction model generation method and device, a computer-readable storage medium and electronic equipment.
The embodiment of the present disclosure provides a trajectory prediction model generation method, including: acquiring a sample image sequence, wherein a sample image in the sample image sequence comprises a movable object; performing track prediction on the sample image sequence by using each initial model in a preset initial model set to obtain a predicted track tensor corresponding to each initial model, wherein the predicted track tensor is used for representing the moving track of the movable object; for each initial model in the initial model set, determining the distance from the predicted track tensor corresponding to the initial model to the predicted track tensors corresponding to other initial models respectively; determining a loss value of the initial model based on the obtained distances; and training the initial model based on the loss value to obtain a track prediction model.
According to another aspect of the embodiments of the present disclosure, there is provided a trajectory prediction model generation apparatus including: an acquisition module, configured to acquire a sample image sequence, where a sample image in the sample image sequence includes a movable object; the prediction module is used for performing track prediction on the sample image sequence by using each initial model in a preset initial model set to obtain a predicted track tensor corresponding to each initial model respectively, wherein the predicted track tensor is used for representing the moving track of the movable object; the training module is used for determining the distance from the predicted track tensor corresponding to the initial model to the predicted track tensors corresponding to other initial models respectively for each initial model in the initial model set; determining a loss value of the initial model based on the obtained distances; and training the initial model based on the loss value to obtain a track prediction model.
According to another aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium storing a computer program for executing the above-described trajectory prediction model generation method.
According to another aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including: a processor; a memory for storing processor-executable instructions; and the processor is used for reading the executable instructions from the memory and executing the instructions to realize the track prediction model generation method.
Based on the trajectory prediction model generation method, apparatus, computer-readable storage medium, and electronic device provided by the above embodiments of the present disclosure, a sample image sequence is acquired, and track prediction is performed on it by each initial model in a preset initial model set to obtain a predicted track tensor corresponding to each initial model. For each initial model, the distances between its predicted track tensor and those of the other initial models are determined, a loss value is determined based on the obtained distances, and the initial model is trained based on the loss value, finally yielding a plurality of trajectory prediction models. This realizes the simultaneous training of multiple models. During training, the loss function of each model is linked to the other models, so the mutual learning of the models improves their trajectory prediction performance, helps reduce the complexity of the trajectory prediction model, and improves the efficiency of model training and model application.
The technical solution of the present disclosure is further described in detail by the accompanying drawings and examples.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent by describing in more detail embodiments of the present disclosure with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the principles of the disclosure and not to limit the disclosure. In the drawings, like reference numbers generally represent like parts or steps.
Fig. 1 is a system diagram to which the present disclosure is applicable.
Fig. 2 is a flowchart illustrating a trajectory prediction model generation method according to an exemplary embodiment of the disclosure.
Fig. 3 is a schematic diagram of an application scenario of a trajectory prediction model generation method according to an embodiment of the disclosure.
Fig. 4 is a flowchart illustrating a trajectory prediction model generation method according to another exemplary embodiment of the present disclosure.
Fig. 5 is a schematic structural diagram of a trajectory prediction model generation apparatus according to an exemplary embodiment of the present disclosure.
Fig. 6 is a schematic structural diagram of a trajectory prediction model generation apparatus according to another exemplary embodiment of the present disclosure.
Fig. 7 is a block diagram of an electronic device provided in an exemplary embodiment of the present disclosure.
Detailed Description
Hereinafter, example embodiments according to the present disclosure will be described in detail with reference to the accompanying drawings. It is to be understood that the described embodiments are merely a subset of the embodiments of the present disclosure and not all embodiments of the present disclosure, with the understanding that the present disclosure is not limited to the example embodiments described herein.
It should be noted that: the relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.
It will be understood by those of skill in the art that the terms "first," "second," and the like in the embodiments of the present disclosure are used merely to distinguish one element from another, and are not intended to imply any particular technical meaning or any necessary logical order between them.
It is also understood that in embodiments of the present disclosure, "a plurality" may refer to two or more and "at least one" may refer to one, two or more.
It is also to be understood that any reference to any component, data, or structure in the embodiments of the disclosure, may be generally understood as one or more, unless explicitly defined otherwise or stated otherwise.
In addition, the term "and/or" in the present disclosure describes only an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, both A and B exist, or B exists alone. In addition, the character "/" in the present disclosure generally indicates that the former and latter associated objects are in an "or" relationship.
It should also be understood that the description of the various embodiments of the present disclosure emphasizes the differences between the various embodiments, and the same or similar parts may be referred to each other, so that the descriptions thereof are omitted for brevity.
Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
The disclosed embodiments may be applied to electronic devices such as terminal devices, computer systems, servers, etc., which are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known terminal devices, computing systems, environments, and/or configurations that may be suitable for use with electronic devices, such as terminal devices, computer systems, servers, and the like, include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputer systems, mainframe computer systems, distributed cloud computing environments that include any of the above systems, and the like.
Electronic devices such as terminal devices, computer systems, servers, etc. may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types. The computer system/server may be practiced in distributed cloud computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.
Summary of the application
Existing prediction models contain a large number of parameters because they use complex neural networks. This significantly limits both model training and prediction speed. The problem is especially serious at inference time, since applying a prediction model in a real scenario imposes very strict speed requirements; otherwise safety cannot be guaranteed in scenarios such as automatic driving.
Exemplary System
Fig. 1 illustrates an exemplary system architecture 100 of a trajectory prediction model generation method or apparatus to which embodiments of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include a terminal device 101, a network 102, a server 103, and a camera 104. The network 102 is a medium for providing communication links between the terminal apparatus 101, the server 103, and the camera 104. Network 102 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
A user may use terminal device 101 to interact with server 103 over network 102 to receive or send messages and the like. Various communication client applications, such as a monitoring application, a map application, an image processing application, and the like, may be installed on the terminal device 101.
The terminal device 101 may be various electronic devices including, but not limited to, devices such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a vehicle-mounted terminal (e.g., a car navigation terminal), and the like.
The server 103 may be a server that provides various services, such as a background model training server that trains a trajectory prediction model using an image sequence acquired by the terminal device 101 or the camera 104. The background model training server can train each initial model in the initial model set by using the obtained sample image sequence to obtain a plurality of track prediction models.
It should be noted that the trajectory prediction model generation method provided in the embodiment of the present disclosure may be executed by the server 103 or the terminal device 101, and accordingly, the trajectory prediction model generation apparatus may be provided in the server 103 or the terminal device 101.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Exemplary method
Fig. 2 is a flowchart illustrating a trajectory prediction model generation method according to an exemplary embodiment of the disclosure. The embodiment can be applied to an electronic device (such as the terminal device 101 or the server 103 shown in fig. 1), and as shown in fig. 2, the method includes the following steps:
step 201, a sample image sequence is acquired.
In this embodiment, the electronic device may obtain the sample image sequence locally or remotely. For example, the sample image sequence may be captured by the camera 104 as shown in fig. 1. The sample image sequence may include a plurality of sample images arranged in a time sequence, and the sample images in the sample image sequence may include a movable object. It should be understood that the movable object may be imagery of various actual movable objects (e.g., vehicles, pedestrians, etc.) mapped into the sample image. It should be noted that the number of movable objects in each sample image may be at least one.
Step 202, performing track prediction on the sample image sequence by using each initial model in a preset initial model set to obtain a predicted track tensor corresponding to each initial model respectively.
In this embodiment, the electronic device may perform track prediction on the sample image sequence by using each initial model in a preset initial model set, so as to obtain a predicted track tensor corresponding to each initial model. Wherein the predicted track tensor is used for representing the moving track of the movable object. As an example, the predicted trajectory tensor can include a probability that each pixel point in the sample image belongs to a movable object.
The initial model in the initial model set may be a deep neural network model, and the initial model may include, but is not limited to, at least one of: convolutional neural networks, cyclic neural networks, and the like. The initial model can directly receive the sample image sequence for trajectory prediction, and can also receive the image obtained by format conversion of the sample image sequence for trajectory prediction.
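As an illustration of what a predicted track tensor of per-pixel probabilities might look like in practice (all shapes and the sigmoid mapping below are assumptions made for this sketch, not specified by the disclosure):

```python
import numpy as np

# Illustrative predicted track tensor: for T predicted frames, an
# H x W map of per-pixel probabilities that the pixel belongs to a
# movable object. T, H, W and the sigmoid are illustrative only.
T, H, W = 5, 32, 32
rng = np.random.default_rng(0)
scores = rng.normal(size=(T, H, W))           # raw model outputs
track_tensor = 1.0 / (1.0 + np.exp(-scores))  # sigmoid -> probabilities
```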
Step 203, for each initial model in the initial model set, determining the distance from the predicted track tensor corresponding to the initial model to the predicted track tensors corresponding to other initial models respectively; determining a loss value of the initial model based on the obtained distances; and training the initial model based on the loss value to obtain a track prediction model.
In this embodiment, for each initial model in the initial model set, the electronic device may perform the following steps:
step 2031, determining the distance from the predicted track tensor corresponding to the initial model to the predicted track tensors corresponding to other initial models respectively. Wherein the distance is used for representing the similarity degree of the two predicted track tensors. By way of example, the distance may include, but is not limited to, at least one of: euclidean distance, cosine distance, etc.
In some alternative implementations, the distance is an information divergence value. Information divergence is also called relative entropy, KL divergence (Kullback-Leibler divergence), or KL distance; it measures the difference between two probability distributions over the same event space.
Assuming two probability distributions P1 and P2, the KL distance from P1 to P2 is given by the following formula (1):

D_KL(P1 ∥ P2) = Σx P1(x) · log(P1(x) / P2(x))    (1)
Since the KL distance is asymmetric, the KL distance from P1 to P2 and the KL distance from P2 to P1 are generally unequal. By using the information divergence as the distance between two predicted track tensors, the probability distributions of the two predicted track tensors tend toward consistency during training, so that the prediction effects of the models become approximately consistent. Mutual learning of the models can thus be realized, improving the prediction accuracy of the trained trajectory prediction models.
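As a rough numerical illustration (the example distributions and the small eps smoothing term are assumptions made for this sketch, not part of the disclosure), the KL distance of formula (1) and its asymmetry can be checked directly:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    # D_KL(p || q) = sum_x p(x) * log(p(x) / q(x)), per formula (1);
    # eps avoids division by zero for zero-probability entries
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

# two example probability distributions over the same event space
p1 = np.array([0.7, 0.2, 0.1])
p2 = np.array([0.5, 0.3, 0.2])

d12 = kl_divergence(p1, p2)  # KL distance from P1 to P2
d21 = kl_divergence(p2, p1)  # KL distance from P2 to P1
```

Since the KL distance is asymmetric, `d12` and `d21` generally differ, while the distance from a distribution to itself is zero.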
Step 2032, based on the obtained distances, determining a loss value of the initial model.
Specifically, for a certain initial model, a function related to each obtained distance may be added on the basis of a loss function of the initial model.
As an example, assume that the initial model set includes initial models Θ1, Θ2, …, ΘK (K ≥ 2). For an initial model Θk, let its original loss function be Lk. Then, after adding the function associated with each of the obtained distances, the new loss function is shown in the following equation (2):

Lk' = Lk + w · Σ_{l≠k} D_KL(pl ∥ pk)    (2)

where pl and pk denote the predicted track tensors (probability distributions) output by models Θl and Θk, respectively.
w is a weight, and may be a fixed value, such as 1, 0.1, 0.2, etc., or may be related to the number of initial models, such as 1/(K-1). According to the new loss function, a loss value can be calculated.
Step 2033, based on the loss value, training the initial model to obtain a trajectory prediction model.
Specifically, the electronic device may adjust parameters of the initial model by using a gradient descent algorithm and a back propagation algorithm to minimize a loss value, and determine the trained initial model as the trajectory prediction model when the initial model satisfies a preset condition. Wherein, the preset condition may include but is not limited to at least one of the following: the training times exceed the preset times, the training time exceeds the preset time, and the loss value is smaller than the preset loss value threshold.
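The training procedure described above — iterate gradient-based parameter updates to minimize the loss value, stopping once a preset condition is met (step budget exhausted or loss below a threshold) — can be sketched with a toy quadratic loss standing in for the initial model (everything below is an illustrative stand-in, not the disclosed model):

```python
import numpy as np

def train_initial_model(max_steps=1000, loss_threshold=1e-6, lr=0.1):
    # Toy stand-in for the training step: gradient descent on a
    # quadratic loss, stopping when a preset condition is met
    # (training-count budget exceeded or loss below a threshold)
    theta = np.array([4.0])   # stand-in model parameter
    target = np.array([1.0])  # stand-in optimum
    for step in range(max_steps):
        loss = float(np.sum((theta - target) ** 2))
        if loss < loss_threshold:      # preset loss-value condition
            return theta, loss, step
        grad = 2.0 * (theta - target)  # gradient of the quadratic loss
        theta = theta - lr * grad      # gradient-descent parameter update
    return theta, float(np.sum((theta - target) ** 2)), max_steps

theta, final_loss, steps = train_initial_model()
```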
It should be understood that steps 2031 to 2033 are performed for each initial model in the initial model set, i.e. each initial model corresponds to a trained trajectory prediction model.
When the trajectory prediction model is used, a sequence of images taken of a movable object may be input into it. The trajectory prediction model may perform trajectory prediction on a specified movable object (each movable object, or an individual movable object) in the image sequence to obtain predicted trajectory information for that object.
In some alternative implementations, the electronic device can determine the loss value of the initial model according to the following steps:
first, an average value of the obtained respective distances is determined. As an example, when w in the above formula (2) is 1/(K-1), an average value of the respective distances can be obtained.
Then, based on the average, a loss value for the initial model is determined. As an example, the average value may be added to, multiplied with, or otherwise combined with the loss value determined by the original loss function of the initial model to obtain its loss value. In this optional implementation, the average reflects the interrelationship of the distances; deriving the loss value from the average of the distances therefore lets the training process more accurately reflect the interrelationship of the initial models, further improving the prediction accuracy of the trained trajectory prediction model.
In some optional implementations, based on the average value described in the above optional implementations, the electronic device may determine the loss value of the initial model according to the following steps:
first, an initial loss value is determined based on a preset loss function. As an example, as shown in the above equation (2), a loss value may be obtained based on a predetermined loss function corresponding to the initial model
Figure BDA0002513854440000071
Then, the initial loss value is added to the average value to obtain the loss value of the initial model.
As an example, the loss value may be derived based on the following loss function:

Lk' = Lk + (1/(K−1)) · Σ_{l≠k} D_KL(pl ∥ pk)
the new loss value is obtained by adding the average value of the distances and the initial loss value, and the interrelation of each initial model can be further embodied by the average value of the distances on the basis of the initial loss value, so that the prediction accuracy of the track prediction model obtained by training is further improved.
In some optional implementations, after step 203, the electronic device may further select, from the obtained trajectory prediction models, a trajectory prediction model that meets a preset condition as the model for predicting the motion trajectory of the movable object in real time. As an example, the preset condition may include, but is not limited to, at least one of: the trajectory prediction model has the highest accuracy after testing, or it is the trajectory prediction model specified by the user. In an actual application scenario, the selected trajectory prediction model may be used as the deployed model. Because the trajectory prediction models are trained simultaneously and their predictive performance improves together during training, the model selected from them achieves higher prediction accuracy in the actual trajectory prediction scenario.
Referring to fig. 3, fig. 3 is a schematic diagram of an application scenario of the trajectory prediction model generation method according to the present embodiment. In the application scenario of fig. 3, the electronic device 301 first acquires a sample image sequence 303, captured by a camera 302 installed above a road, of a vehicle (i.e., a movable object) traveling on the road. Then, the electronic device 301 obtains an initial model set that includes an initial model A (shown as 304) and an initial model B (shown as 305). The electronic device 301 performs track prediction on the sample image sequence using initial model A and initial model B to obtain their respective predicted track tensors P1 and P2. Then, the electronic device 301 trains initial model A and initial model B. During training, for initial model A, the KL distance D_KL(p2 ∥ p1) is added on the basis of its loss function L_A, and the loss value L_A' of initial model A is calculated; for initial model B, the KL distance D_KL(p1 ∥ p2) is added on the basis of its loss function L_B, and the loss value L_B' of initial model B is calculated. Finally, initial model A and initial model B are trained using a back propagation algorithm and a gradient descent algorithm, yielding a trajectory prediction model C (shown as 306) and a trajectory prediction model D (shown as 307).
According to the method provided by this embodiment of the disclosure, a sample image sequence is acquired, and track prediction is performed on it by each initial model in a preset initial model set to obtain a predicted track tensor corresponding to each initial model. For each initial model, the distances to the other initial models' predicted track tensors are determined, a loss value is determined based on the obtained distances, and the initial model is trained based on the loss value, finally yielding a plurality of trajectory prediction models and thereby realizing the simultaneous training of multiple models.
With further reference to FIG. 4, a flowchart of yet another embodiment of a trajectory prediction model generation method is shown. As shown in fig. 4, based on the embodiment shown in fig. 2, step 202 may include the following steps:
step 2021, converting the sample images in the sample image sequence into track images for representing the motion track of the movable object, so as to obtain a track image sequence.
In this embodiment, the trajectory images are used to characterize the motion trajectory of the movable object; that is, each trajectory image highlights the position of the movable object. As an example, the position of the movable object may be detected from the sample images in the sample image sequence by a preset target detection model (e.g., a pnet network). The electronic device may then highlight each movable object, for example by setting the color of the pixels covered by the movable object to white and the color of the other pixels to black.
In some alternative implementations, step 2021 may be performed as follows:
The sample images in the sample image sequence are converted into occupancy grid images, and the resulting occupancy grid image sequence is used as the track image sequence. Each pixel of an occupancy grid image may correspond to a probability characterizing whether it belongs to a target movable object. In this implementation, each movable object may be mapped into the occupancy grid image as a smooth two-dimensional Gaussian kernel that is bright in the middle and dark at the edges. Converting the sample images into occupancy grid images represents the positions of the movable objects in the sample images more accurately, so that the trained model can predict the tracks of the movable objects more accurately.
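A hedged sketch of how object positions could be rendered into such an occupancy grid, with each object mapped to a smooth 2-D Gaussian kernel that is bright in the middle and dark at the edges (the grid size, σ, and the clipping choice below are assumptions for illustration):

```python
import numpy as np

def render_occupancy_grid(height, width, centers, sigma=2.0):
    # Each object center becomes a smooth 2-D Gaussian kernel:
    # high probability (bright) at the center, fading toward the edges
    ys, xs = np.mgrid[0:height, 0:width]
    grid = np.zeros((height, width))
    for cy, cx in centers:
        grid += np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2)
                       / (2.0 * sigma ** 2))
    return np.clip(grid, 0.0, 1.0)  # keep values as probabilities

# one movable object at the center of a 16 x 16 grid
grid = render_occupancy_grid(16, 16, [(8, 8)])
```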
Step 2022, inputting the track image sequence into each initial model in the preset initial model set, and obtaining a predicted track tensor corresponding to each initial model respectively.
In this embodiment, each initial model in the initial model set may predict a future travel track of the movable object by using an actual travel track of the movable object represented by the track image sequence, so as to obtain a predicted track tensor. The initial model in this step is used to represent the corresponding relationship between the track image sequence and the predicted track tensor. The initial model in the initial model set may be a deep neural network model, and the initial model may include, but is not limited to, at least one of: convolutional neural networks, cyclic neural networks, and the like.
In the method provided by the embodiment corresponding to fig. 4, the sample image is converted into the track image, and the track image is used for track prediction, so that the track image can accurately represent the track that the movable object has traveled, and therefore, the model can more accurately predict the track according to the track image.
Exemplary devices
Fig. 5 is a schematic structural diagram of a trajectory prediction model generation apparatus according to an exemplary embodiment of the present disclosure. The embodiment can be applied to an electronic device, and as shown in fig. 5, the trajectory prediction model generation apparatus includes: an obtaining module 501, configured to obtain a sample image sequence, where a sample image in the sample image sequence includes a movable object; the prediction module 502 is configured to perform track prediction on the sample image sequence by using each initial model in a preset initial model set to obtain a predicted track tensor corresponding to each initial model, where the predicted track tensor is used to represent a moving track of the movable object; a training module 503, configured to determine, for each initial model in the initial model set, distances from a predicted trajectory tensor corresponding to the initial model to predicted trajectory tensors corresponding to other initial models, respectively; determining a loss value of the initial model based on the obtained distances; and training the initial model based on the loss value to obtain a track prediction model.
In this embodiment, the obtaining module 501 may acquire the sample image sequence locally or remotely. For example, the sample image sequence may be captured by the camera 104 shown in fig. 1. The sample image sequence may include a plurality of sample images arranged in time order, and the sample images in the sample image sequence may include a movable object. It should be understood that the movable object may be imagery of various actual movable objects (e.g., vehicles, pedestrians, etc.) mapped into the sample image. It should be noted that the number of movable objects in each sample image may be at least one.
In this embodiment, the predicting module 502 may perform track prediction on the sample image sequence by using each initial model in a preset initial model set, so as to obtain a predicted track tensor corresponding to each initial model respectively. Wherein the predicted track tensor is used for representing the moving track of the movable object. As an example, the predicted trajectory tensor can include a probability that each pixel point in the sample image belongs to a movable object.
The initial model in the initial model set may be a deep neural network model, and the initial model may include, but is not limited to, at least one of: convolutional neural networks, cyclic neural networks, and the like.
In this embodiment, for each initial model in the initial model set, the training module 503 may perform the following steps:
step 5031, determining the distance from the predicted track tensor corresponding to the initial model to the predicted track tensors corresponding to other initial models respectively. Wherein the distance is used for representing the similarity degree of the two predicted track tensors. By way of example, the distance may include, but is not limited to, at least one of: euclidean distance, cosine distance, etc.
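As a minimal sketch of step 5031 (the function name is illustrative; the disclosure also permits cosine or other distances), the Euclidean distances from one model's predicted track tensor to the other models' predicted track tensors could be computed as:

```python
import numpy as np

def distances_to_others(pred_tensors, i):
    """Euclidean distances from model i's predicted track tensor to
    every other model's predicted track tensor."""
    return [float(np.linalg.norm(pred_tensors[i] - pred_tensors[j]))
            for j in range(len(pred_tensors)) if j != i]
```

A larger distance indicates that model i's prediction disagrees more with the other models' predictions.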
Based on the obtained distances, a loss value of the initial model is determined, step 5032.
Specifically, for a given initial model, a term that is a function of each obtained distance may be added to the initial model's own loss function.
Step 5033, training the initial model based on the loss value to obtain a trajectory prediction model.
Specifically, the training module 503 may adjust the parameters of the initial model using a gradient descent algorithm and a back propagation algorithm to minimize the loss value, and when the initial model meets a preset condition, determine the trained initial model as the trajectory prediction model. The preset condition may include, but is not limited to, at least one of the following: the number of training iterations exceeds a preset number, the training time exceeds a preset duration, and the loss value is smaller than a preset loss value threshold.
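The training flow described above can be sketched as a single mutual-learning update step (a non-limiting illustration assuming PyTorch; the function and argument names are hypothetical): each model's loss combines its own base loss with the mean distance from its predicted track tensor to the other models' detached predictions, and the parameters are updated by back propagation and gradient descent.

```python
import torch

def train_step(models, optimizers, track_seq, base_loss_fn):
    """One mutual-learning update: each model's loss is its own base
    loss plus the mean distance from its predicted track tensor to the
    other models' (detached) predicted track tensors."""
    preds = [m(track_seq) for m in models]
    for i, (model, opt) in enumerate(zip(models, optimizers)):
        # Detach the other models' predictions so only model i is updated here.
        others = [preds[j].detach() for j in range(len(preds)) if j != i]
        dist = torch.stack([torch.norm(preds[i] - o) for o in others]).mean()
        loss = base_loss_fn(preds[i]) + dist
        opt.zero_grad()
        loss.backward()
        opt.step()
```

Repeating such a step until each model satisfies the preset condition would yield one trained trajectory prediction model per initial model.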
It should be understood that the above steps 5031-5033 are performed for each initial model in the initial model set, that is, each initial model corresponds to a trained trajectory prediction model.
When the trajectory prediction model is used, a sequence of images captured of a movable object may be input to it. The trajectory prediction model may then perform trajectory prediction on a specified movable object in the image sequence (each movable object or an individual movable object), obtaining predicted trajectory information of that movable object.
Referring to fig. 6, fig. 6 is a schematic structural diagram of a trajectory prediction model generation apparatus according to another exemplary embodiment of the present disclosure.
In some alternative implementations, the prediction module 502 may include: a conversion unit 5021, configured to convert a sample image in the sample image sequence into a track image for representing a motion track of a movable object, so as to obtain a track image sequence; the predicting unit 5022 is used for inputting the track image sequence into each initial model in a preset initial model set to obtain a predicted track tensor corresponding to each initial model.
In some alternative implementations, the conversion unit 5021 may be further configured to: convert the sample images in the sample image sequence into occupancy grid images, and take the resulting occupancy grid image sequence as the track image sequence.
In some optional implementations, the apparatus may further include: and a selecting module 504, configured to select, from the obtained trajectory prediction models, a trajectory prediction model that meets a preset condition as a trajectory prediction model for predicting a motion trajectory of the movable object in real time.
In some alternative implementations, the distance is an information divergence (e.g., Kullback-Leibler divergence) value.
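As an illustrative sketch of this alternative (function name and the `eps` smoothing term are assumptions, not specified by the disclosure), the information divergence between two predicted track tensors, each normalized into a probability distribution, could be computed as:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-8):
    """Information (Kullback-Leibler) divergence between two predicted
    track tensors, each normalized to a probability distribution."""
    p = p.ravel() / p.sum()
    q = q.ravel() / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))
```

The divergence is zero when the two predictions agree exactly and grows as they diverge, so it can serve the same role as the Euclidean or cosine distances mentioned above.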
In some alternative implementations, the training module 503 may include: a first determining unit 5031 for determining an average value of the obtained respective distances; a second determining unit 5032, configured to determine a loss value of the initial model based on the average value.
In some alternative implementations, the second determining unit 5032 may include: a first determining sub-unit 50321, configured to determine an initial loss value based on a preset loss function; a second determining subunit 50322, configured to add the initial loss value and the average value to obtain a loss value of the initial model.
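The combination performed by the sub-units 50321 and 50322 can be sketched in one line (the function name is illustrative): the loss of an initial model is its initial loss plus the average of its distances to the other models' predictions.

```python
def model_loss(initial_loss, distances):
    """Loss for one initial model: its own initial loss (from the preset
    loss function) plus the average distance to the other models'
    predicted track tensors."""
    return initial_loss + sum(distances) / len(distances)
```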
The trajectory prediction model generation device provided by the above embodiment of the present disclosure acquires a sample image sequence and performs trajectory prediction on it with each initial model in a preset initial model set, obtaining a predicted trajectory tensor corresponding to each initial model. For each initial model, the device determines the distances from its predicted trajectory tensor to those of the other initial models, determines a loss value based on the obtained distances, and trains the initial model based on that loss value, finally obtaining a plurality of trajectory prediction models. A plurality of models are thereby trained simultaneously, and during training the loss function of each model is linked to the other models; that is, the trajectory prediction performance of the models is improved through mutual learning among the plurality of models, which helps to reduce the complexity of the trajectory prediction models and to improve the efficiency of model training and model application.
Exemplary electronic device
Next, an electronic apparatus according to an embodiment of the present disclosure is described with reference to fig. 7. The electronic device may be either or both of the terminal device 101 and the server 103 as shown in fig. 1, or a stand-alone device separate from them, which may communicate with the terminal device 101 and the server 103 to receive the collected input signals therefrom.
FIG. 7 illustrates a block diagram of an electronic device in accordance with an embodiment of the disclosure.
As shown in fig. 7, the electronic device 700 includes one or more processors 701 and memory 702.
The processor 701 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 700 to perform desired functions.
Memory 702 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. Volatile memory can include, for example, Random Access Memory (RAM), cache memory, and the like. The non-volatile memory may include, for example, Read Only Memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on a computer-readable storage medium and executed by the processor 701 to implement the trajectory prediction model generation methods of the various embodiments of the present disclosure above and/or other desired functions. Various contents such as an input signal, a signal component, a noise component, etc. may also be stored in the computer-readable storage medium.
In one example, the electronic device 700 may further include: an input device 703 and an output device 704, which are interconnected by a bus system and/or other form of connection mechanism (not shown).
For example, when the electronic device is the terminal device 101 or the server 103, the input device 703 may be a mouse, a keyboard, a camera, or the like, and is used for inputting the sample image. When the electronic device is a stand-alone device, the input means 703 may be a communication network connector for receiving the input sample images from the terminal device 101 and the server 103.
The output device 704 may output various information to the outside, including a trajectory prediction model. The output devices 704 may include, for example, a display, speakers, a printer, and a communication network and remote output devices connected thereto, among others.
Of course, for simplicity, only some of the components of the electronic device 700 relevant to the present disclosure are shown in fig. 7, omitting components such as buses, input/output interfaces, and the like. In addition, electronic device 700 may include any other suitable components depending on the particular application.
Exemplary computer program product and computer-readable storage Medium
In addition to the methods and apparatus described above, embodiments of the present disclosure may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps in a trajectory prediction model generation method according to various embodiments of the present disclosure described in the "exemplary methods" section above of this specification.
The computer program product may write program code for carrying out operations of embodiments of the present disclosure in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present disclosure may also be a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, cause the processor to perform steps in a trajectory prediction model generation method according to various embodiments of the present disclosure described in the "exemplary methods" section above in this specification.
The computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing describes the general principles of the present disclosure in conjunction with specific embodiments, however, it is noted that the advantages, effects, etc. mentioned in the present disclosure are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present disclosure. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the disclosure is not intended to be limited to the specific details so described.
In the present specification, the embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts in the embodiments are referred to each other. For the system embodiment, since it basically corresponds to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The block diagrams of devices, apparatuses, and systems referred to in this disclosure are given only as illustrative examples and are not intended to require or imply that the connections, arrangements, and configurations must be made in the manner shown in the block diagrams. These devices, apparatuses, and systems may be connected, arranged, or configured in any manner, as will be appreciated by those skilled in the art. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The word "or" as used herein means, and is used interchangeably with, the term "and/or," unless the context clearly dictates otherwise. The word "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".
The methods and apparatus of the present disclosure may be implemented in a number of ways. For example, the methods and apparatus of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order for the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise. Further, in some embodiments, the present disclosure may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
It is also noted that in the devices, apparatuses, and methods of the present disclosure, each component or step can be decomposed and/or recombined. These decompositions and/or recombinations are to be considered equivalents of the present disclosure.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the disclosure to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims (10)

1. A trajectory prediction model generation method, comprising:
acquiring a sample image sequence, wherein a sample image in the sample image sequence comprises a movable object;
performing track prediction on the sample image sequence by using each initial model in a preset initial model set to obtain a predicted track tensor corresponding to each initial model, wherein the predicted track tensor is used for representing the moving track of the movable object;
for each initial model in the initial model set, determining the distance between the predicted track tensor corresponding to the initial model and the predicted track tensors corresponding to other initial models in the initial model set respectively; determining a loss value of the initial model based on the obtained distances; and training the initial model based on the loss value to obtain a track prediction model corresponding to the initial model.
2. The method of claim 1, wherein the performing trajectory prediction on the sample image sequence using each initial model in a preset set of initial models comprises:
converting the sample images in the sample image sequence into track images for representing the motion track of the movable object to obtain a track image sequence;
and inputting the track image sequence into each initial model in a preset initial model set to obtain a predicted track tensor corresponding to each initial model respectively.
3. The method of claim 2, wherein the converting the sample images in the sample image sequence into track images for characterizing the motion track of the movable object, resulting in a track image sequence, comprises:
and converting the sample images in the sample image sequence into occupancy grid images to obtain an occupancy grid image sequence serving as the track image sequence.
4. The method of claim 1, wherein the method further comprises:
and selecting a track prediction model meeting preset conditions from the obtained track prediction models as a track prediction model for predicting the motion track of the movable object in real time.
5. The method according to one of claims 1 to 4, wherein said determining a loss value for the initial model based on the respective distances obtained comprises:
determining an average value of the obtained distances;
based on the average, a loss value for the initial model is determined.
6. The method of claim 5, wherein said determining a loss value for the initial model based on said average value comprises:
determining an initial loss value based on a preset loss function;
and adding the initial loss value and the average value to obtain the loss value of the initial model.
7. A trajectory prediction model generation apparatus comprising:
an acquisition module configured to acquire a sample image sequence, wherein a sample image in the sample image sequence includes a movable object;
the prediction module is used for performing track prediction on the sample image sequence by using each initial model in a preset initial model set to obtain a predicted track tensor corresponding to each initial model, wherein the predicted track tensor is used for representing the moving track of the movable object;
the training module is used for determining the distance from the predicted track tensor corresponding to the initial model to the predicted track tensors corresponding to other initial models respectively for each initial model in the initial model set; determining a loss value of the initial model based on the obtained distances; and training the initial model based on the loss value to obtain a track prediction model.
8. The apparatus of claim 7, wherein the prediction module comprises:
the conversion unit is used for converting the sample images in the sample image sequence into track images for representing the motion track of the movable object to obtain a track image sequence;
and the prediction unit is used for inputting the track image sequence into each initial model in a preset initial model set to obtain a predicted track tensor corresponding to each initial model respectively.
9. A computer-readable storage medium, the storage medium storing a computer program for performing the method of any of the preceding claims 1-6.
10. An electronic device, the electronic device comprising:
a processor;
a memory for storing the processor-executable instructions;
the processor is configured to read the executable instructions from the memory and execute the instructions to implement the method of any one of claims 1-6.
CN202010469558.XA 2020-05-28 2020-05-28 Track prediction model generation method and device, readable storage medium and electronic equipment Active CN111639591B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010469558.XA CN111639591B (en) 2020-05-28 2020-05-28 Track prediction model generation method and device, readable storage medium and electronic equipment


Publications (2)

Publication Number Publication Date
CN111639591A true CN111639591A (en) 2020-09-08
CN111639591B CN111639591B (en) 2023-06-30

Family

ID=72328912

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010469558.XA Active CN111639591B (en) 2020-05-28 2020-05-28 Track prediction model generation method and device, readable storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN111639591B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112373471A (en) * 2021-01-12 2021-02-19 禾多科技(北京)有限公司 Method, device, electronic equipment and readable medium for controlling vehicle running
CN114283576A (en) * 2020-09-28 2022-04-05 华为技术有限公司 Vehicle intention prediction method and related device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109902678A (en) * 2019-02-12 2019-06-18 北京奇艺世纪科技有限公司 Model training method, character recognition method, device, electronic equipment and computer-readable medium
CN110737968A (en) * 2019-09-11 2020-01-31 北京航空航天大学 Crowd trajectory prediction method and system based on deep convolutional long and short memory network
CN110751683A (en) * 2019-10-28 2020-02-04 北京地平线机器人技术研发有限公司 Trajectory prediction method and device, readable storage medium and electronic equipment
WO2020087974A1 (en) * 2018-10-30 2020-05-07 北京字节跳动网络技术有限公司 Model generation method and device




Similar Documents

Publication Publication Date Title
US10002309B2 (en) Real-time object analysis with occlusion handling
CN108229419B (en) Method and apparatus for clustering images
US20210117760A1 (en) Methods and apparatus to obtain well-calibrated uncertainty in deep neural networks
CN111626219B (en) Track prediction model generation method and device, readable storage medium and electronic equipment
EP3637310A1 (en) Method and apparatus for generating vehicle damage information
CN109961032B (en) Method and apparatus for generating classification model
CN110751683A (en) Trajectory prediction method and device, readable storage medium and electronic equipment
CN111639591B (en) Track prediction model generation method and device, readable storage medium and electronic equipment
US20200160060A1 (en) System and method for multiple object tracking
CN115393592A (en) Target segmentation model generation method and device, and target segmentation method and device
CN113435409A (en) Training method and device of image recognition model, storage medium and electronic equipment
CN114565812A (en) Training method and device of semantic segmentation model and semantic segmentation method of image
CN112381868A (en) Image depth estimation method and device, readable storage medium and electronic equipment
CN109359727B (en) Method, device and equipment for determining structure of neural network and readable medium
CN114139630A (en) Gesture recognition method and device, storage medium and electronic equipment
CN114037990A (en) Character recognition method, device, equipment, medium and product
EP3579182A1 (en) Image processing device, image recognition device, image processing program, and image recognition program
CN110796003B (en) Lane line detection method and device and electronic equipment
CN111797665B (en) Method and apparatus for converting video
US20220237884A1 (en) Keypoint based action localization
EP4332910A1 (en) Behavior detection method, electronic device, and computer readable storage medium
JP2023036795A (en) Image processing method, model training method, apparatus, electronic device, storage medium, computer program, and self-driving vehicle
CN113111692B (en) Target detection method, target detection device, computer readable storage medium and electronic equipment
Rudol et al. Evaluation of human body detection using deep neural networks with highly compressed videos for UAV Search and rescue missions
CN112149426B (en) Reading task processing method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant