WO2021134169A1 - Trajectory prediction method and related device - Google Patents

Trajectory prediction method and related device

Info

Publication number
WO2021134169A1
Authority
WO
WIPO (PCT)
Prior art keywords
skeleton node
information
target object
trajectory
skeleton
Prior art date
Application number
PCT/CN2019/129820
Other languages
English (en)
French (fr)
Inventor
刘亚林
宋晓琳
施惠杰
曹昊天
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2019/129820 (WO2021134169A1, zh)
Priority to EP19958212.3A (EP4074563A4, en)
Priority to CN201980064064.9A (CN112805730A, zh)
Publication of WO2021134169A1 (zh)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W40/00 Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models
    • B60W40/02 Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models related to ambient conditions
    • B60W40/04 Traffic conditions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/0442 Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/09 Supervised learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/62 Extraction of image or video features relating to a temporal dimension, e.g. time-based feature extraction; Pattern tracking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/23 Recognition of whole body movements, e.g. for sport training
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2554/00 Input parameters relating to objects
    • B60W2554/40 Dynamic objects, e.g. animals, windblown objects
    • B60W2554/402 Type
    • B60W2554/4029 Pedestrians
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2556/00 Input parameters relating to data
    • B60W2556/10 Historical data

Definitions

  • This application relates to the field of artificial intelligence, and in particular to a trajectory prediction method and related equipment.
  • the present application provides a trajectory prediction method and related equipment.
  • the behavior intention of a pedestrian is used as an input feature of the pedestrian trajectory prediction, which can reduce the prediction error of the pedestrian trajectory and improve the prediction performance of the pedestrian trajectory.
  • the present application provides a trajectory prediction method, including: obtaining posture information and historical motion trajectory information of a target object; obtaining, based on the posture information, target information representing the behavioral intention of the target object; and, using the target information and the historical trajectory information as input, obtaining the predicted trajectory of the target object through a trajectory prediction model.
  • the behavioral intention includes at least one of the following:
  • the target object includes a plurality of skeleton nodes
  • the posture information includes the position of each skeleton node at multiple moments.
  • the target object includes a first skeleton node
  • the posture information includes a first skeleton node position, a second skeleton node position, and a third skeleton node position
  • the first skeleton node position is the position of the first skeleton node of the target object at the first moment
  • the third skeleton node position is the position of the first skeleton node of the target object at the third moment
  • the second skeleton node position corresponds to the position of the first skeleton node of the target object at the second moment; the first moment, the second moment, and the third moment are successively adjacent moments in the time dimension
  • the second skeleton node position is related to the first skeleton node position and/or the third skeleton node position.
  • the second skeleton node position is a center position of the first skeleton node position and the third skeleton node position.
  • the method further includes:
  • The positions of each skeleton node at multiple moments are sampled at time intervals to obtain processed posture information, and the processed posture information is used to obtain the target information representing the behavioral intention of the target object.
  • the method further includes:
  • the acquiring target information representing the behavioral intention of the target object based on the posture information includes:
  • target information representing the behavioral intention of the target object is obtained.
  • the historical trajectory information includes historical trajectory point positions at multiple moments, each moment corresponding to one historical trajectory point position, wherein the historical trajectory point positions at the multiple moments are expressed in a preset coordinate system
  • the historical trajectory point positions at the multiple moments include a first trajectory point position, and the first trajectory point position is located at a preset position in the preset coordinate system.
  • the first track point position may be the initial track position among multiple historical track positions, or the last historical track position in time.
  • the preset position in the preset coordinate system may be the coordinate origin in the preset coordinate system, or any other preset coordinate position in the preset coordinate system.
  • the position of the historical track point is a position relative to the ground.
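  • A minimal sketch of how historical track points might be expressed in a preset coordinate system so that a chosen track point (the initial one or the last one in time) lands on a preset position such as the origin. The function name `align_trajectory` and its parameters are hypothetical, and the use of a pure translation is an assumption; the source only states where the first track point is located.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) position relative to the ground


def align_trajectory(
    points: List[Point],
    anchor_index: int = 0,
    preset: Point = (0.0, 0.0),
) -> List[Point]:
    """Translate every point so that points[anchor_index] lands on `preset`.

    anchor_index=0 anchors the initial track point; anchor_index=-1 anchors
    the last track point in time. Translation only is an assumption here.
    """
    if not points:
        return []
    anchor_x, anchor_y = points[anchor_index]
    dx, dy = preset[0] - anchor_x, preset[1] - anchor_y
    return [(x + dx, y + dy) for x, y in points]


history = [(12.0, 5.0), (12.5, 5.5), (13.0, 6.5)]
aligned_first = align_trajectory(history)                  # initial point at origin
aligned_last = align_trajectory(history, anchor_index=-1)  # last point at origin
```

  • Anchoring one point at a fixed position makes trajectories observed at different places comparable, since only the relative displacements remain.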
  • the obtaining the predicted trajectory of the target object through a trajectory prediction model using the target information and the historical trajectory information as input includes:
  • the predicted trajectory of the target object is obtained through a recurrent neural network model.
  • this application provides an execution device, including:
  • the acquisition module is used to acquire the posture information and historical motion trajectory information of the target object
  • the prediction module is configured to use the target information and the historical trajectory information as input to obtain the predicted trajectory of the target object through a trajectory prediction model.
  • the behavioral intention includes at least one of the following:
  • the target object includes a plurality of skeleton nodes
  • the posture information includes the position of each skeleton node at multiple moments.
  • the target object includes a first skeleton node
  • the posture information includes a first skeleton node position, a second skeleton node position, and a third skeleton node position
  • the first skeleton node position is the position of the first skeleton node of the target object at the first moment
  • the third skeleton node position is the position of the first skeleton node of the target object at the third moment
  • the second skeleton node position corresponds to the position of the first skeleton node of the target object at the second moment; the first moment, the second moment, and the third moment are successively adjacent moments in the time dimension
  • the second skeleton node position is related to the first skeleton node position and/or the third skeleton node position.
  • the second skeleton node position is a center position of the first skeleton node position and the third skeleton node position.
  • the sampling module is used to sample, at time intervals, the positions of each skeleton node at multiple moments to obtain processed posture information, and the processed posture information is used to obtain target information representing the behavioral intention of the target object.
  • the normalization module is used to normalize the positions of each skeleton node at multiple moments to obtain normalized posture information, and the normalized posture information is used to obtain target information representing the behavioral intention of the target object.
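  • A minimal sketch of the sampling and normalization steps, assuming each moment's posture is a flattened list (x1, y1, ..., xk, yk). The names `sample_every` and `normalize_frame` are hypothetical, and per-frame min-max scaling is only one possible normalization; the source does not specify the scheme.

```python
from typing import List

Frame = List[float]  # flattened (x1, y1, ..., xk, yk) for one moment


def sample_every(frames: List[Frame], step: int) -> List[Frame]:
    """Time-interval sampling: keep one frame out of every `step` frames."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return frames[::step]


def _min_max(values: List[float]) -> List[float]:
    """Scale values into [0, 1]; a constant list maps to all zeros."""
    low, high = min(values), max(values)
    span = high - low
    return [0.0 if span == 0 else (v - low) / span for v in values]


def normalize_frame(frame: Frame) -> Frame:
    """Min-max normalize x and y coordinates separately (assumed scheme)."""
    xs, ys = _min_max(frame[0::2]), _min_max(frame[1::2])
    normalized: Frame = []
    for x, y in zip(xs, ys):
        normalized.extend((x, y))
    return normalized


frames = [
    [100.0, 200.0, 120.0, 240.0],
    [102.0, 202.0, 122.0, 242.0],
    [104.0, 204.0, 124.0, 244.0],
    [106.0, 206.0, 126.0, 246.0],
]
sampled = sample_every(frames, step=2)
normalized = [normalize_frame(f) for f in sampled]
```

  • Sampling reduces redundancy between adjacent frames, and normalization removes the dependence on where the pedestrian appears in the image and how large they appear.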
  • the acquisition module is specifically used for:
  • target information representing the behavioral intention of the target object is obtained.
  • the historical trajectory information includes historical trajectory point positions at multiple moments, each moment corresponding to one historical trajectory point position, wherein the historical trajectory point positions at the multiple moments are expressed in a preset coordinate system
  • the historical trajectory point positions at the multiple moments include a first trajectory point position, and the first trajectory point position is located at a preset position in the preset coordinate system.
  • the position of the historical track point is a position relative to the ground.
  • the prediction module is specifically used for:
  • the output vector is mapped to the predicted trajectory of the target object.
  • an embodiment of the present invention provides a terminal device, including a memory, a communication interface, and a processor coupled to the memory and the communication interface; the memory is used to store instructions, and the processor is used to execute the instructions
  • the communication interface is used to communicate with other devices under the control of the processor; wherein, when the processor executes the instructions, the method described in the first aspect or the possible embodiments of the first aspect is executed.
  • a computer-readable storage medium stores program codes for vehicle control.
  • the program code includes instructions for executing the method described in the foregoing first aspect or possible embodiments of the first aspect.
  • a computer program product including instructions, which when run on a computer, causes the computer to execute the method described in the first aspect or the possible embodiments of the first aspect.
  • the embodiment of the application provides a trajectory prediction method, including: obtaining posture information and historical motion trajectory information of a target object; obtaining, based on the posture information, target information representing the behavioral intention of the target object; and, using the target information and the historical trajectory information as input, obtaining the predicted trajectory of the target object through the trajectory prediction model.
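  • A minimal sketch of the data flow in this method: posture information yields target information (the behavioral intention), which is combined with the historical trajectory and passed to a trajectory prediction model. Everything named here is hypothetical. In particular, appending the target information to each historical point is an assumed input layout, and the two placeholder models stand in for trained networks only so the example runs; the placeholder predictor is a constant-velocity extrapolation that ignores the intention columns.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Frame = Sequence[float]


def build_model_input(
    target_info: Sequence[float], history: Sequence[Point]
) -> List[List[float]]:
    """One feature row per historical moment: [x, y, *target_info] (assumed)."""
    return [[x, y, *target_info] for x, y in history]


def predict_trajectory(
    posture: Sequence[Frame],
    history: Sequence[Point],
    intention_model: Callable[[Sequence[Frame]], Sequence[float]],
    trajectory_model: Callable[[List[List[float]]], List[Point]],
) -> List[Point]:
    """Posture -> target information -> (target information + history) -> trajectory."""
    target_info = intention_model(posture)
    return trajectory_model(build_model_input(target_info, history))


def placeholder_intention_model(posture: Sequence[Frame]) -> List[float]:
    """Stand-in for a trained intention network: returns fixed class scores."""
    return [1.0, 0.0]


def placeholder_trajectory_model(rows: List[List[float]], horizon: int = 2) -> List[Point]:
    """Stand-in for a trained predictor: constant-velocity extrapolation."""
    (x0, y0), (x1, y1) = rows[-2][:2], rows[-1][:2]
    vx, vy = x1 - x0, y1 - y0
    return [(x1 + vx * k, y1 + vy * k) for k in range(1, horizon + 1)]


history = [(0.0, 0.0), (1.0, 0.5)]
rows = build_model_input([1.0, 0.0], history)
predicted = predict_trajectory(
    [], history, placeholder_intention_model, placeholder_trajectory_model
)
```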
  • FIG. 1A is a schematic structural diagram of the main framework of artificial intelligence.
  • FIG. 1B is a schematic structural diagram of a possible terminal device of this application.
  • FIG. 1C is a schematic structural diagram of another possible terminal device of this application.
  • FIG. 1D is a schematic diagram of a possible scenario according to an embodiment of the application.
  • FIG. 2 is a schematic diagram of an embodiment of a trajectory prediction method provided by an embodiment of this application.
  • FIG. 3 is a schematic diagram of a pedestrian behavior intention provided by an embodiment of the application.
  • FIG. 4 is a schematic diagram of historical trajectory information processing provided by an embodiment of this application.
  • FIG. 5 is a schematic diagram of a confusion matrix provided by an embodiment of this application.
  • FIG. 6 is a schematic diagram of a possible structure of the terminal device involved in this embodiment.
  • FIG. 7 is a schematic structural diagram of another terminal device 700 provided by an embodiment of this application.
  • FIG. 8 is a schematic structural diagram of an execution device provided by an embodiment of this application.
  • FIG. 9 is a schematic diagram of a structure of a chip provided by an embodiment of the application.
  • the embodiments of the present application provide a trajectory prediction method and related equipment, which are used to reduce the prediction error of the pedestrian trajectory and improve the prediction performance of the pedestrian trajectory.
  • Figure 1A shows a schematic diagram of the main framework of artificial intelligence.
  • the following describes the "intelligent information chain" (horizontal axis) and the "IT value chain" (vertical axis)
  • the "intelligent information chain" reflects a series of processes from data acquisition to processing. For example, it can be the general process of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision-making, and intelligent execution and output. In this process, the data undergoes the condensation process of "data-information-knowledge-wisdom".
  • the "IT value chain", running from the underlying infrastructure of artificial intelligence and information (the technologies that provide and process it) to the industrial ecology of the system, reflects the value that artificial intelligence brings to the information technology industry.
  • the infrastructure provides computing power support for the artificial intelligence system, realizes communication with the outside world, and realizes support through the basic platform.
  • smart chips (hardware acceleration chips such as CPU, NPU, GPU, ASIC, and FPGA)
  • basic platforms include distributed computing frameworks, networks, and related platform guarantees and support, which can include cloud storage and computing, interconnection networks, etc.
  • sensors communicate with the outside to obtain data, and these data are provided to the smart chip in the distributed computing system provided by the basic platform for calculation.
  • the data in the upper layer of the infrastructure is used to represent the data source in the field of artificial intelligence.
  • the data involves graphics, images, voice, and text, as well as the Internet of Things data of traditional devices, including business data of existing systems and sensory data such as force, displacement, liquid level, temperature, and humidity.
  • Data processing usually includes data training, machine learning, deep learning, search, reasoning, decision-making and other methods.
  • machine learning and deep learning can symbolize and formalize data for intelligent information modeling, extraction, preprocessing, training, etc.
  • Reasoning refers to the process of simulating human intelligent reasoning in a computer or intelligent system, using formal information to conduct machine thinking and solving problems based on reasoning control strategies.
  • the typical function is search and matching.
  • Decision-making refers to the process of making decisions after intelligent information is reasoned, and usually provides functions such as classification, ranking, and prediction.
  • some general capabilities can be formed based on the results of the data processing, such as an algorithm or a general system, for example, translation, text analysis, computer vision processing, speech recognition, image recognition, and so on.
  • Intelligent products and industry applications refer to the products and applications of artificial intelligence systems in various fields. They encapsulate the overall artificial intelligence solution, turning intelligent information decision-making into products and putting it into practical use. Application fields mainly include intelligent terminals, intelligent manufacturing, intelligent transportation, smart home, smart medical care, smart security, autonomous driving, safe city, etc.
  • Fig. 1B shows a schematic structural diagram of a possible terminal device of the present application.
  • the terminal device includes an environment perception module 102, a planning decision module 104, and a control processing module 106.
  • the environment perception module 102 collects obstacle information, surrounding environment information where the terminal device is located, and driving information of the vehicle where the terminal device is located mainly through peripheral systems (such as sensors, cameras, etc.).
  • the obstacle information includes, but is not limited to, information such as the geographic location of the obstacle, the movement speed of the obstacle, the movement direction of the obstacle, the movement acceleration of the obstacle, the variance of the movement direction of the obstacle, the variance of the movement speed of the obstacle, etc.
  • the obstacles include, but are not limited to, vehicles, pedestrians, animate living obstacles, inanimate obstacles, and the like. This application will take the obstacle as a pedestrian as an example, and specifically describe some embodiments involved in this application.
  • the surrounding environment information includes but is not limited to information such as map information, weather information, intersection type, lane line, number of lanes, whether the road is congested, traffic flow speed, traffic flow acceleration, and the distance between the terminal device and the obstacle.
  • the driving information includes, but is not limited to, the geographic location, driving speed, driving direction, driving acceleration, distance between the vehicle and obstacles, and the like of the vehicle.
  • the terminal equipment includes, but is not limited to, vehicles such as automobiles, trains, trucks, and cars, and communication equipment installed on the vehicles, such as vehicle-mounted equipment.
  • the planning decision module 104 includes a behavior prediction module and a planning module.
  • the behavior prediction module is mainly used to predict the behavior intention of the obstacle (for example, the behavior result of the pedestrian described later in this application) and the motion trajectory corresponding to the behavior intention (i.e., the obstacle trajectory).
  • the planning module is used to obtain a corresponding control strategy according to the behavior intention under the premise of ensuring safety, so as to subsequently use the control strategy to control the vehicle for safe driving.
  • the control strategy is pre-customized on the user side or the terminal device side, or generated according to the behavior intention, which will be described in detail below.
  • the control strategy is used to instruct the vehicle to adjust the corresponding vehicle parameters, so as to realize the safe driving of the vehicle.
  • the control processing module is used to control and adjust the vehicle according to the control strategy obtained by the planning decision module, so as to avoid a collision between the vehicle and the obstacle. For example, it controls vehicle parameters such as the steering wheel angle, the driving speed, whether to brake, and whether to press the accelerator pedal. How the safe driving of the vehicle is controlled according to the behavior result (i.e., behavior intention) of the pedestrian is described in detail below.
  • FIG. 1C shows a schematic structural diagram of another possible terminal device of the present application.
  • the terminal device 100 may include: a baseband chip 110, a memory 115 (including one or more computer-readable storage media), a radio frequency (RF) module 116, and a peripheral system 117. These components may communicate on one or more communication buses 114.
  • the peripheral system 117 is mainly used to implement the interactive function between the terminal device 100 and the user (such as a pedestrian) or the external environment, and mainly includes the input and output devices of the terminal device 100.
  • the peripheral system 117 may include: a touch screen controller 118, a camera controller 119, an audio controller 120, and a sensor management module 121.
  • each controller can be coupled with its corresponding peripheral devices, such as a touch screen 123, a camera 124, an audio circuit 125, and a sensor 126.
  • the gesture sensor in the sensor 126 may be used to receive gesture control operations input by the user.
  • the speed sensor in the sensor 126 may be used to collect the driving speed of the terminal device itself or to collect the movement speed of obstacles in the environment.
  • the touch screen 123 can be used as a prompting device, which is mainly used to prompt obstacles through screen display, projection, etc., for example, when a pedestrian is crossing a road, the display screen displays text to prompt the pedestrian to speed up walking.
  • the peripheral system 117 may also include other prompting devices such as lights, displays, etc., for interactive prompting between the vehicle and the pedestrian, so as to avoid collisions between the vehicle and the pedestrian. It should be noted that the peripheral system 117 may also include other I/O peripherals.
  • the baseband chip 110 may integrate: one or more processors 111, a clock module 112, and a power management module 113.
  • the clock module 112 integrated in the baseband chip 110 is mainly used to generate a clock required for data transmission and timing control for the processor 111.
  • the power management module 113 integrated in the baseband chip 110 is mainly used to provide a stable and high-precision voltage for the processor 111, the radio frequency module 116, and the peripheral system.
  • the radio frequency (RF) module 116 is used to receive and transmit radio frequency signals, and mainly integrates the receiver and transmitter of the terminal 100.
  • the radio frequency (RF) module 116 communicates with a communication network and other communication devices through radio frequency signals.
  • the radio frequency (RF) module 116 may include, but is not limited to: an antenna system, an RF transceiver, one or more amplifiers, a tuner, one or more oscillators, a digital signal processor, a CODEC chip, a SIM card, and Storage media, etc.
  • the radio frequency (RF) module 116 may be implemented on a separate chip.
  • the memory 115 is coupled with the processor 111, and is used to store various software programs and/or multiple sets of instructions.
  • the memory 115 may include a high-speed random access memory, and may also include a non-volatile memory, such as one or more magnetic disk storage devices, flash memory devices, or other non-volatile solid-state storage devices.
  • the memory 115 may store an operating system, such as an embedded operating system such as ANDROID, IOS, WINDOWS, or LINUX.
  • the memory 115 may also store a network communication program, which may be used to communicate with one or more additional devices, one or more terminal devices, and so on.
  • FIG. 1D introduces a schematic diagram of a possible scenario to which this application is applicable.
  • a pedestrian is located at point P1 at the edge of the road on which the vehicle is travelling.
  • the vehicle needs to predict the pedestrian's trajectory and control its own safe driving based on the predicted trajectory.
  • the vehicle can be controlled to slow down, or even brake to wait, so as to avoid the pedestrian.
  • FIG. 2 is a schematic diagram of an embodiment of a trajectory prediction method provided by an embodiment of the present application.
  • a trajectory prediction method provided by an embodiment of the present application includes:
  • the terminal device may obtain posture information through peripheral systems (such as sensors, cameras, etc.).
  • the posture information described therein can be used to predict the impending behavior intention of pedestrians.
  • the terminal device may use the posture information as input, and obtain the target information representing the behavioral intention of the target object through a recurrent neural network.
  • posture information of a target object can be acquired, where the target object includes a plurality of skeleton nodes, and the posture information includes the position of each skeleton node at multiple moments.
  • the video stream of the pedestrian may be obtained first and processed, and convolutional neural network (CNN) skeleton fitting may be performed to obtain the pedestrian's skeleton information at multiple moments (in other words, in multiple frames of the video stream), hereinafter also referred to as a skeleton pixel coordinate sequence. The skeleton information at each moment may include multiple skeleton node positions. Specifically, the posture information can be expressed as:
  • i is the index of the skeleton node
  • (x_i, y_i) are the pixel abscissa and pixel ordinate, respectively, of the skeleton node with index i in the video stream. It should be noted that, in one embodiment, because head joint points are relatively likely to be lost, the skeleton information related to the pedestrian's head joint points may be excluded.
  • the skeleton node may be a human joint such as an elbow joint and an ankle joint, and the type of the skeleton node is not limited here.
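  • A minimal sketch of how one frame of skeleton nodes might be flattened into a pixel coordinate sequence (x1, y1, ..., xk, yk) with head joint points excluded. The function name `flatten_skeleton`, the joint names, and the coordinate values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple

PixelPoint = Tuple[float, float]  # (pixel abscissa, pixel ordinate)


def flatten_skeleton(
    nodes: Dict[str, PixelPoint],
    order: List[str],
    excluded: Set[str],
) -> List[float]:
    """Flatten skeleton nodes into (x1, y1, ..., xk, yk) in a fixed order,
    skipping excluded nodes such as head joint points."""
    coords: List[float] = []
    for name in order:
        if name in excluded:
            continue
        x, y = nodes[name]
        coords.extend((x, y))
    return coords


# Hypothetical frame with three joints; real skeletons have more nodes.
frame_nodes = {"head": (50.0, 10.0), "elbow": (40.0, 60.0), "ankle": (48.0, 150.0)}
node_order = ["head", "elbow", "ankle"]
flat = flatten_skeleton(frame_nodes, node_order, excluded={"head"})
```

  • Keeping a fixed node order matters because the index i identifies the same joint in every frame.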
  • the terminal device obtains the position of each skeleton node of the target object at multiple moments.
  • the target object includes a first skeleton node
  • the posture information includes a first skeleton node position, a second skeleton node position, and a third skeleton node position.
  • the first skeleton node position is the position of the first skeleton node of the target object at the first moment
  • the third skeleton node position is the position of the first skeleton node of the target object at the third moment
  • the second skeleton node position corresponds to the position of the first skeleton node of the target object at the second moment
  • the first moment, the second moment, and the third moment are successively adjacent moments in the time dimension
  • the second skeleton node position is related to the first skeleton node position and/or the third skeleton node position.
  • the second skeleton node position is the center position of the first skeleton node position and the third skeleton node position.
  • when the position of a skeleton node is missing in a frame, the positions of the same skeleton node in the preceding and following frames can be used to fill it in by interpolation; for example, the average of those two positions, or the result of another operation, is taken as the position of the missing skeleton node.
  • the posture information includes the first skeleton node position of the first skeleton node of the target object at the first moment, the second skeleton node position at the second moment, and the third skeleton node position at the third moment.
  • the first skeleton node may be a pedestrian elbow joint node
  • the second skeleton node position is related to the first skeleton node position and/or the third skeleton node position.
  • the second skeleton node position is the center position of the first skeleton node position and the third skeleton node position, that is, the second skeleton node position is the average of the positions of the same skeleton node in the previous and next frames (the first skeleton node position and the third skeleton node position).
  • for example, in the previous frame (frame A), the pedestrian skeleton information is:
  • X_Ai (x_A1, y_A1, x_A2, y_A2, ..., x_Ak, y_Ak, ..., x_A12, y_A12);
  • in the current frame (frame B), the pedestrian skeleton information is:
  • X_Bi (x_B1, y_B1, x_B3, y_B3, ..., x_Bk, y_Bk, ..., x_B12, y_B12);
  • in the next frame (frame C), the pedestrian skeleton information is:
  • X_Ci (x_C1, y_C1, x_C2, y_C2, ..., x_Ck, y_Ck, ..., x_C12, y_C12);
  • the skeleton information of the elbow joint is missing one frame of skeleton node position, namely (x_B2, y_B2). To supplement the missing position (x_B2, y_B2), the elbow joint positions (x_A2, y_A2) and (x_C2, y_C2) in the previous and next frames can be used.
  • in particular, consistent with the averaging described above, the missing skeleton node position can be supplemented as: (x_B2, y_B2) = ((x_A2 + x_C2) / 2, (y_A2 + y_C2) / 2).
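  • the averaging rule for a missing skeleton node position can be illustrated with a minimal sketch. The function name, the use of None to mark a missing position, and the sample pixel values are assumptions made for illustration only; the text also allows other interpolation operations.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def fill_missing_node(track: List[Optional[Point]]) -> List[Optional[Point]]:
    """Fill single-frame gaps in one skeleton node's pixel track.

    A missing position (None) is replaced by the midpoint of the positions
    in the previous and next frames, when both of them are available.
    """
    filled = list(track)
    for t in range(1, len(track) - 1):
        prev_pos, next_pos = track[t - 1], track[t + 1]
        if track[t] is None and prev_pos is not None and next_pos is not None:
            filled[t] = ((prev_pos[0] + next_pos[0]) / 2.0,
                         (prev_pos[1] + next_pos[1]) / 2.0)
    return filled


# Elbow joint in frames A, B, C, with the frame-B position missing.
elbow = [(100.0, 200.0), None, (104.0, 206.0)]
print(fill_missing_node(elbow))  # [(100.0, 200.0), (102.0, 203.0), (104.0, 206.0)]
```

  • gaps at the first or last frame, or gaps longer than one frame, are left untouched by this sketch because they have no pair of adjacent known positions.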
  • the terminal device may sample the skeleton node positions of each skeleton node at time intervals over the multiple moments to obtain processed posture information, and the processed posture information is used to obtain target information representing the behavioral intention of the target object.
  • the posture information is sampled at equal intervals over the multiple moments to enlarge the difference between the skeleton information of different behavioral intention categories.
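  • equal-interval sampling can be sketched as follows. The step size of 3 and the helper name are hypothetical; the text does not state which interval is used.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sample_at_equal_intervals(frames: Sequence[T], step: int) -> List[T]:
    """Keep every `step`-th frame, starting from the first one."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return list(frames[::step])


frames = list(range(10))  # stand-in for 10 per-frame skeleton vectors
print(sample_at_equal_intervals(frames, 3))  # [0, 3, 6, 9]
```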
  • the terminal device may normalize the position of each skeleton node at multiple moments to obtain normalized posture information.
  • the normalized posture information is used to obtain target information representing the behavioral intention of the target object.
  • the time series of each skeleton node can be normalized within the sequence as follows:
  • x_min is the minimum value of the horizontal (or vertical) pixel coordinate of a given node over the sequence
  • x_max is the maximum value of the horizontal (or vertical) pixel coordinate of a given node over the sequence.
  • the skeleton information is normalized within the sequence to expand the difference of different types of skeleton information, thereby improving the prediction performance of pedestrian intent.
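  • the normalization formula itself is not reproduced in this text. A minimal sketch, assuming the usual min-max form (x - x_min) / (x_max - x_min) applied separately to each coordinate sequence of each node, could look like this; the function name and the handling of constant sequences are assumptions.

```python
from typing import List


def normalize_within_sequence(values: List[float]) -> List[float]:
    """Min-max normalize one coordinate sequence of one skeleton node.

    Assumed form: (x - x_min) / (x_max - x_min), mapping the sequence to [0, 1].
    A constant sequence is mapped to zeros to avoid division by zero
    (this fallback is a choice made for the sketch).
    """
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        return [0.0 for _ in values]
    return [(x - x_min) / span for x in values]


xs = [120.0, 130.0, 140.0, 160.0]  # hypothetical pixel abscissas of one node
print(normalize_within_sequence(xs))  # [0.0, 0.25, 0.5, 1.0]
```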
  • the historical movement trajectory information of the target object can also be obtained.
  • the historical movement trajectory information of pedestrians can be collected by vehicle sensors (such as millimeter wave radar, etc.).
  • the pedestrian's trajectory coordinates in the historical motion trajectory information can take the form (x_1, y_1, x_2, y_2, ..., x_i, y_i, ...), where (x_i, y_i) are the coordinates of the pedestrian's location at the i-th moment.
  • the historical trajectory information includes historical trajectory point positions at multiple moments, and each moment corresponds to one historical trajectory point position, where the historical trajectory point positions at the multiple moments are represented in a preset coordinate system, the historical trajectory point positions at the multiple moments include a first trajectory point position, the first trajectory point position is the trajectory point position at the initial moment among the multiple moments, and the first trajectory point position is located at a preset position in the preset coordinate system.
  • the position of the historical track point is a position relative to the ground.
  • the historical trajectory of the original pedestrian can be preprocessed to obtain the position of the historical trajectory point in the new coordinate system.
  • the historical trajectory point positions acquired by the terminal device may describe a movement trajectory relative to the vehicle, that is, one generated by the pedestrian's displacement relative to the vehicle.
  • the vehicle's displacement can be added to the pedestrian's original trajectory, that is, vehicle displacement compensation is performed on the pedestrian's historical trajectory to obtain historical trajectory information relative to the ground.
  • the pedestrian trajectory prediction starting point can be used as the reference origin to perform the pedestrian trajectory coordinate transformation.
  • the historical trajectory information includes the positions of the historical trajectory points at multiple times.
  • the historical trajectory point positions at the multiple moments include the first trajectory point position. The first trajectory point position may be the trajectory point position at the initial moment among the multiple moments, or the trajectory point position at the last moment among the multiple moments, which is not limited here. The first trajectory point position is located at the preset position in the preset coordinate system, where the preset position may be the origin (0, 0) of the preset coordinate system.
  • FIG. 4 is a schematic diagram of historical trajectory information processing provided by an embodiment of this application.
  • O_1 is the origin of the vehicle's coordinate reference system
  • O_2 is the pedestrian's trajectory point position at the initial moment, and the historical trajectory information of the pedestrian can be expressed in a coordinate system with point O_2 as the coordinate origin.
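  • the two preprocessing steps (vehicle displacement compensation, then moving the origin to the first trajectory point) can be sketched as follows. The sketch assumes that the vehicle displacement is already expressed in the same axes as the pedestrian positions and ignores vehicle rotation; the function names and sample values are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def to_ground_frame(relative: List[Point], vehicle_disp: List[Point]) -> List[Point]:
    """Add the vehicle's accumulated displacement to each vehicle-relative point."""
    return [(rx + vx, ry + vy) for (rx, ry), (vx, vy) in zip(relative, vehicle_disp)]


def shift_to_first_point(track: List[Point]) -> List[Point]:
    """Translate the track so that its first point becomes the origin (0, 0)."""
    x0, y0 = track[0]
    return [(x - x0, y - y0) for x, y in track]


relative = [(10.0, 2.0), (9.0, 2.5), (8.0, 3.0)]     # pedestrian as seen from the vehicle
vehicle_disp = [(0.0, 0.0), (1.5, 0.0), (3.0, 0.0)]  # vehicle displacement since the first moment
ground = to_ground_frame(relative, vehicle_disp)
print(shift_to_first_point(ground))  # [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]
```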
  • the terminal device may use the posture information as input and obtain the target information representing the behavioral intention of the target object through a recurrent neural network model.
  • the recurrent neural network model may be a long short-term memory (LSTM) network or a gated recurrent unit (GRU).
  • the following takes a long short-term memory (LSTM) network as an example of the recurrent neural network model.
  • LSTM algorithm is a specific form of recurrent neural network (recurrent neural network, RNN), and RNN is a general term for a series of neural networks capable of processing sequence data.
  • RNN will encounter huge difficulties when dealing with long-term dependencies (nodes that are far away in the time series), because the calculation of the links between nodes that are far away involves multiple multiplications of the Jacobian matrix.
  • the most widespread solution is the gated RNN, and LSTM is the best-known kind of gated RNN.
  • leaky units allow RNNs to accumulate long-term connections between distant nodes by designing the weight coefficients between connections. Gated RNNs generalize this idea, allowing the coefficient to change at different moments and allowing the network to forget the information accumulated so far.
  • LSTM is such a gated RNN. LSTM makes the weight of the self-loop variable by adding an input gate, a forget gate, and an output gate. In this way, even when the model parameters are fixed, the integration scale at different moments can change dynamically, thereby avoiding the problem of vanishing or exploding gradients.
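  • the gating idea can be illustrated with one step of a scalar LSTM cell in its standard textbook form. This is a sketch for intuition, not the network used in the embodiment; the weight layout and the sample weights are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, w):
    """One step of a scalar LSTM cell; `w` maps gate name -> (w_x, w_h, b)."""
    def pre(name):
        w_x, w_h, b = w[name]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre("input"))    # input gate
    f = sigmoid(pre("forget"))   # forget gate: weight of the self-loop
    o = sigmoid(pre("output"))   # output gate
    g = math.tanh(pre("cell"))   # candidate cell value
    c = f * c_prev + i * g       # cell state keeps or drops its history
    h = o * math.tanh(c)
    return h, c


# Fixed parameters, yet the self-loop weight depends on the current input x.
w = {"input": (0.0, 0.0, -10.0), "forget": (5.0, 0.0, 0.0),
     "output": (0.0, 0.0, 0.0), "cell": (1.0, 0.0, 0.0)}
_, c_keep = lstm_step(2.0, 0.0, 1.0, w)     # forget gate near 1: state retained
_, c_forget = lstm_step(-2.0, 0.0, 1.0, w)  # forget gate near 0: state dropped
print(c_keep, c_forget)
```

  • with the same parameters, the cell state is almost fully kept for one input and almost fully discarded for the other, which is the dynamically changing integration scale described above.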
  • the terminal device can determine the pedestrian's behavioral intention based on the acquired posture information of the target object, where the behavioral intention can include at least one of the following: changing from a moving state to a stationary state (stopping), maintaining a moving state (crossing), changing from a stationary state to a moving state (starting), and changing from a first moving state to a second moving state (bending), where the first moving state can be understood as moving in a certain area (for example, moving on the sidewalk near the edge of the lane), and the second moving state can be understood as entering the lane area (for example, crossing the lane).
  • the behavioral intentions are described in detail below:
  • the behavioral intention of the pedestrian can be changing from a moving state to a stationary state (stopping), where the moving state can be understood as the pedestrian moving from the sidewalk to the edge of the lane, and the stationary state can be understood as the pedestrian stopping on reaching the edge of the lane
  • FIG. 3 is a schematic diagram of pedestrian behavioral intentions provided by an embodiment of this application. As shown in FIG. 3, for the behavioral intention of changing from a moving state to a stationary state (stopping), the pedestrian can walk toward the edge of the lane and stop moving on reaching it.
  • the behavioral intention of the pedestrian may be maintaining a moving state (crossing), where the moving state can be understood as the pedestrian continuing to move, for example, moving from the sidewalk to the edge of the lane and continuing into the lane area. As shown in FIG. 3, for the behavioral intention of maintaining a moving state (crossing), the pedestrian can move from the sidewalk to the lane and cross the lane.
  • the behavioral intention of the pedestrian can be changing from a stationary state to a moving state (starting), where the stationary state can be understood as the pedestrian standing still in a certain area (for example, at the edge of the lane on the sidewalk side), and
  • the moving state can be understood as the pedestrian starting to move from that area (for example, walking into the lane area). As shown in FIG. 3, for the behavioral intention of changing from a stationary state to a moving state (starting), the pedestrian can start walking from the edge of the lane and cross the lane.
  • the behavioral intention of the pedestrian can be changing from the first moving state to the second moving state (bending), where the first moving state can be understood as moving in a certain area (for example, moving near the edge of a lane), and
  • the second moving state can be understood as entering the lane area (for example, walking into the lane area).
  • for this behavioral intention, the pedestrian can go from lingering near the edge of the lane to crossing the lane.
  • the following takes as an example the case where the behavioral intention includes four categories: changing from a moving state to a stationary state, maintaining a moving state, changing from a stationary state to a moving state, and changing from a first moving state to a second moving state:
  • the input of the LSTM network is the preprocessed pedestrian skeleton node information sequence (posture information).
  • after the preprocessed skeleton node information sequence (posture information) passes through the LSTM network, the following skeleton sequence feature vector is obtained, namely:
  • o_i is the component of the skeleton feature vector out corresponding to the i-th type of intent
  • P_i is the probability that the skeleton sequence feature vector belongs to the i-th type of intent.
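  • the formula that maps the components o_i to the probabilities P_i is not reproduced in this text. A common choice for this step is the softmax function, which is assumed in the sketch below; the sample values of out are hypothetical.

```python
import math
from typing import List


def softmax(out: List[float]) -> List[float]:
    """Map a feature vector to probabilities that sum to 1."""
    shift = max(out)  # subtract the maximum for numerical stability
    exps = [math.exp(o - shift) for o in out]
    total = sum(exps)
    return [e / total for e in exps]


out = [0.2, 0.1, 2.5, 0.4]  # hypothetical components o_1..o_4
probs = softmax(out)
print(probs)
```

  • the largest component of out always receives the largest probability, so the ranking of intent categories is unchanged by this mapping.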
  • the terminal device may obtain target information representing the behavioral intention of the target object based on the posture information.
  • the target information may be an identifier indicating the behavioral intention of the target object or other information that can uniquely indicate the behavioral intention.
  • when training the LSTM network for predicting the behavioral intention of pedestrians, cross entropy can be used as the loss, that is, the LSTM network is trained with the following loss function:
  • p(x) and q(x) can respectively represent the true probability distribution and the predicted probability distribution of the input x.
  • the above loss function is only an illustration, and does not constitute a limitation of the application.
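  • the cross entropy between a true distribution p(x) and a predicted distribution q(x) is conventionally H(p, q) = -Σ p(x) log q(x). A minimal sketch of that standard definition follows; the small eps term and the sample distributions are assumptions added for illustration.

```python
import math
from typing import List


def cross_entropy(p: List[float], q: List[float], eps: float = 1e-12) -> float:
    """H(p, q) = -sum_x p(x) * log q(x); eps guards against log(0)."""
    return -sum(pi * math.log(qi + eps) for pi, qi in zip(p, q))


p = [0.0, 0.0, 1.0, 0.0]      # true intent as a one-hot distribution
q = [0.05, 0.05, 0.85, 0.05]  # hypothetical predicted distribution
print(cross_entropy(p, q))
```

  • with a one-hot true distribution, the loss reduces to the negative log of the probability assigned to the true category, so it shrinks toward zero as that probability approaches 1.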
  • the terminal device may use the target information and the historical trajectory information as input, and obtain the predicted trajectory of the target object through the trajectory prediction model.
  • the terminal device may use the target information and the historical trajectory information as input to obtain the predicted trajectory of the target object through a recurrent neural network model.
  • the following takes an LSTM network as an example of the recurrent neural network model:
  • the input of the LSTM network is historical trajectory information and target information.
  • the above-mentioned input information is transformed into the output vector out_b of the encoding layer by the encoding layer of the LSTM network.
  • the decoding layer of the LSTM network maps the output vector out_b of the encoding layer to the predicted trajectory of the pedestrian.
  • the predicted trajectory can include the predicted trajectory coordinates, for example, the predicted trajectory can be (x 1 ,y 1 ,x 2 ,y 2 ,...,x i ,y i ,...x p ,y p ), where (x i ,y i ) is the position coordinate of the pedestrian predicted at the i-th time in the future, and p is the time length of the predicted trajectory in the future.
  • the mean square error can be used as the loss function when training the LSTM network for predicting the trajectory of pedestrians.
  • the expression of the mean square error can be the following function:
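  • the expression itself is not reproduced in this text. A minimal sketch of a mean square error over trajectory points is given below, assuming the error of each point is its squared Euclidean distance to the true position and the average is taken over points; the patent's exact normalization may differ.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def mean_squared_error(pred: List[Point], truth: List[Point]) -> float:
    """Average squared Euclidean distance between predicted and true points."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    total = sum((px - tx) ** 2 + (py - ty) ** 2
                for (px, py), (tx, ty) in zip(pred, truth))
    return total / len(pred)


pred = [(0.0, 0.0), (1.0, 1.0)]   # hypothetical predicted positions
truth = [(0.0, 1.0), (1.0, 3.0)]  # hypothetical true positions
print(mean_squared_error(pred, truth))  # 2.5
```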
  • the preprocessed pedestrian skeleton information sequence needs to go through a trained LSTM pedestrian intention recognition model to obtain the predicted pedestrian intention.
  • the LSTM pedestrian intention recognition model can be a single-layer LSTM model with input data of dimension (batch_size, 32, 24), where batch_size is the number of samples in a batch (in this embodiment, batch_size can be equal to 1), 32 is the history length of the skeleton features, and 24 is the skeleton feature dimension at each moment. The prediction accuracy of this model on 219 test samples is 95.43%, and the resulting confusion matrix is shown in FIG. 5, which is a schematic diagram of a confusion matrix provided by this application.
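  • the input shape can be made concrete with a small sketch. It assumes that the 24 features per moment are the pixel abscissa and ordinate of 12 skeleton nodes, which matches the coordinates x_1 to x_12 listed earlier but is not stated explicitly; the helper names and the synthetic frames are hypothetical.

```python
from typing import List, Tuple

NUM_NODES = 12  # skeleton nodes kept per frame (assumed from x_1..x_12)
NUM_STEPS = 32  # history length of the skeleton features


def frame_to_features(nodes: List[Tuple[float, float]]) -> List[float]:
    """Flatten 12 (x, y) nodes into one 24-dimensional feature vector."""
    if len(nodes) != NUM_NODES:
        raise ValueError("expected 12 skeleton nodes per frame")
    return [coord for node in nodes for coord in node]


def build_batch(frames: List[List[Tuple[float, float]]]) -> List[List[List[float]]]:
    """Wrap 32 frames into a nested list shaped (batch_size=1, 32, 24)."""
    if len(frames) != NUM_STEPS:
        raise ValueError("expected 32 frames")
    return [[frame_to_features(f) for f in frames]]


frames = [[(float(k), float(t)) for k in range(NUM_NODES)] for t in range(NUM_STEPS)]
batch = build_batch(frames)
print(len(batch), len(batch[0]), len(batch[0][0]))  # 1 32 24
```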
  • the behavioral intention category with the highest probability is regarded as the behavioral intention of the target object.
  • the behavior intention category obtained by the skeleton information sequence in this embodiment through the intention recognition model is 3 (wherein, bending is represented as 1, crossing is represented as 2, starting is represented as 3, and stopping is represented as 4).
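  • selecting the category with the highest probability and mapping it to the numbering used in this embodiment can be sketched as follows; the probability values are hypothetical.

```python
from typing import List

# Numbering of the intent categories as given in this embodiment.
INTENT_LABELS = {1: "bending", 2: "crossing", 3: "starting", 4: "stopping"}


def predict_intent(probs: List[float]) -> int:
    """Return the 1-based category index with the highest probability."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best + 1


probs = [0.03, 0.05, 0.90, 0.02]  # hypothetical P_1..P_4
category = predict_intent(probs)
print(category, INTENT_LABELS[category])  # 3 starting
```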
  • the predicted target information representing the pedestrian's behavioral intention and the corresponding preprocessed pedestrian historical trajectory information are used as the input of the trained LSTM pedestrian trajectory prediction model.
  • the input intent feature information and historical trajectory information undergo the LSTM encoding process to obtain the output vector out_b of the encoding layer.
  • the decoding layer of the LSTM network maps the output vector out_b of the encoding layer to the predicted trajectory of the pedestrian, thereby obtaining the predicted position coordinates of the pedestrian at 16 moments in the future.
  • the predicted position coordinates of the pedestrian at the 16 future moments can be shown in Table 1.
  • the mean absolute error (MAE) can be used to measure the prediction performance of the trajectory prediction model. Specifically, the number of test samples M is taken as 1, and the MAE corresponding to each of the 16 moments is shown in Table 2.
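  • a per-moment MAE over M test samples can be sketched as follows. The sketch assumes that the error of one sample at one moment is the Euclidean distance between the predicted and true positions; the text does not give the exact expression, and the sample data are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def mae_per_moment(preds: List[List[Point]], truths: List[List[Point]]) -> List[float]:
    """For each future moment, average the position error over all samples.

    The error of one sample at one moment is taken here as the Euclidean
    distance between predicted and true positions (an assumption).
    """
    num_samples = len(preds)
    horizon = len(preds[0])
    return [
        sum(math.dist(preds[m][t], truths[m][t]) for m in range(num_samples)) / num_samples
        for t in range(horizon)
    ]


preds = [[(0.0, 0.0), (3.0, 4.0)]]   # M = 1 sample, 2 moments for brevity
truths = [[(0.0, 0.0), (0.0, 0.0)]]
print(mae_per_moment(preds, truths))  # [0.0, 5.0]
```

  • with M = 1, as in this embodiment, the MAE at each moment is simply the error of the single test sample at that moment.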
  • the embodiment of this application provides a trajectory prediction method, including: obtaining posture information and historical motion trajectory information of a target object; obtaining target information representing the behavioral intention of the target object based on the posture information; and using the target information and the historical trajectory information as input to obtain the predicted trajectory of the target object through the trajectory prediction model.
  • taking the pedestrian's behavioral intention as one of the input features of pedestrian trajectory prediction can reduce the prediction error of the pedestrian trajectory and improve prediction performance. This addresses the problem of inaccurate pedestrian motion trajectory prediction in the prior art, thereby improving the safety of vehicle control.
  • the terminal device includes a hardware structure and/or software module corresponding to each function.
  • the embodiments of the present invention can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is executed by hardware or computer software-driven hardware depends on the specific application and design constraint conditions of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered as going beyond the scope of the technical solutions of the embodiments of the present invention.
  • the embodiment of the present invention may divide the functional units of the terminal device according to the foregoing method examples.
  • each functional unit may be divided corresponding to each function, or two or more functions may be integrated into one processing unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit. It should be noted that the division of units in the embodiment of the present invention is illustrative, and is only a logical function division, and there may be other division methods in actual implementation.
  • FIG. 6 shows a schematic diagram of a possible structure of the terminal device involved in the foregoing embodiment.
  • the terminal device 600 includes: an acquisition module 601, configured to acquire the posture information and historical motion trajectory information of the target object, and to obtain, based on the posture information, the target information representing the behavioral intention of the target object; and a prediction module 602, configured to use the target information and the historical trajectory information as input and obtain the predicted trajectory of the target object through a trajectory prediction model.
  • the behavior intention includes at least one of the following: changing from a moving state to a static state, maintaining a moving state, changing from a static state to a moving state, and changing from a first moving state to a second moving state.
  • the target object includes a plurality of skeleton nodes
  • the posture information includes the position of each skeleton node at multiple moments.
  • the target object includes a first skeleton node
  • the posture information includes a first skeleton node position, a second skeleton node position, and a third skeleton node position
  • the first skeleton node position is the position of the first skeleton node of the target object at the first moment, and the third skeleton node position is the position of the first skeleton node of the target object at the third moment
  • the second skeleton node position corresponds to the position of the first skeleton node of the target object at the second moment; the first moment, the second moment, and the third moment are successively adjacent moments in the time dimension, and the second skeleton node position is related to the first skeleton node position and/or the third skeleton node position.
  • the second skeleton node position is a center position of the first skeleton node position and the third skeleton node position.
  • the terminal device 600 further includes: a sampling module, configured to sample the skeleton node positions of each skeleton node at time intervals over the multiple moments to obtain processed posture information, where the processed posture information is used to obtain target information representing the behavioral intention of the target object.
  • the terminal device 600 further includes: a normalization module, configured to normalize the position of each skeleton node at multiple moments to obtain normalized posture information.
  • the normalized posture information is used to obtain target information representing the behavioral intention of the target object.
  • the acquisition module 601 is specifically configured to: use the posture information as input to obtain target information representing the behavioral intention of the target object through a long short-term memory (LSTM) network.
  • the historical trajectory information includes historical trajectory point positions at multiple moments, each moment corresponding to one historical trajectory point position, where the historical trajectory point positions at the multiple moments are represented in a preset coordinate system, the historical trajectory point positions at the multiple moments include a first trajectory point position, the first trajectory point position is the trajectory point position at the initial moment among the multiple moments, and the first trajectory point position is located at a preset position in the preset coordinate system.
  • the historical track point position is a position relative to the ground.
  • the prediction module 602 is specifically configured to: use the target information and the historical trajectory information as input to obtain an output vector through a recurrent neural network model; and map the output vector to the predicted trajectory of the target object.
  • the terminal device 600 may further include a storage unit for storing program codes and data of the terminal device 600.
  • the aforementioned acquisition module 601 and prediction module 602 may be integrated in a processing module, where the processing module may be a processor or a controller, for example, a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or a field programmable gate array (FPGA).
  • the processor may also be a combination for realizing computing functions, for example, including a combination of one or more microprocessors, a combination of a DSP and a microprocessor, and so on.
  • FIG. 7 is a schematic structural diagram of another terminal device 700 provided by an embodiment of this application.
  • the terminal device 700 includes: a processor 712, a communication interface 713, and a memory 77.
  • the terminal device 700 may further include a bus 714.
  • the communication interface 713, the processor 712, and the memory 77 may be connected to each other through a bus 714;
  • the bus 714 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like.
  • the bus 714 can be divided into an address bus, a data bus, a control bus, and so on. For ease of presentation, only one thick line is used in FIG. 7, but this does not mean that there is only one bus or one type of bus.
  • processor 712 may perform the following steps:
  • the predicted trajectory of the target object is obtained through a trajectory prediction model.
  • the behavior intention includes at least one of the following:
  • the target object includes a plurality of skeleton nodes
  • the posture information includes the position of each skeleton node at multiple moments.
  • the target object includes a first skeleton node
  • the posture information includes a first skeleton node position, a second skeleton node position, and a third skeleton node position
  • the first skeleton node position is the position of the first skeleton node of the target object at the first moment, and the third skeleton node position is the position of the first skeleton node of the target object at the third moment
  • the second skeleton node position corresponds to the position of the first skeleton node of the target object at the second moment; the first moment, the second moment, and the third moment are successively adjacent moments in the time dimension, and the second skeleton node position is related to the first skeleton node position and/or the third skeleton node position.
  • the second skeleton node position is a center position of the first skeleton node position and the third skeleton node position.
  • processor 712 may perform the following steps:
  • the skeleton node positions of each skeleton node are sampled at time intervals over the multiple moments to obtain processed posture information, and the processed posture information is used to obtain target information representing the behavioral intention of the target object.
  • processor 712 may perform the following steps:
  • processor 712 may perform the following steps:
  • target information representing the behavioral intention of the target object is obtained.
  • the historical trajectory information includes historical trajectory point positions at multiple moments, each moment corresponding to one historical trajectory point position, where the historical trajectory point positions at the multiple moments are represented in a preset coordinate system, the historical trajectory point positions at the multiple moments include a first trajectory point position, the first trajectory point position is the trajectory point position at the initial moment among the multiple moments, and the first trajectory point position is located at a preset position in the preset coordinate system.
  • the historical track point position is a position relative to the ground.
  • processor 712 may perform the following steps:
  • the output vector is mapped to the predicted trajectory of the target object.
  • FIG. 8 is a schematic structural diagram of an execution device provided by an embodiment of this application.
  • the terminal device described in the embodiment corresponding to FIG. 6 or FIG. 7 may be deployed on the execution device 800 to implement the function of the terminal device in the embodiment corresponding to FIG. 6 and FIG. 7.
  • the execution device 800 includes: a receiver 801, a transmitter 802, a processor 803, and a memory 804 (the number of processors 803 in the execution device 800 may be one or more, and one processor is taken as an example in FIG. 8), where the processor 803 may include an application processor 8031 and a communication processor 8032.
  • the receiver 801, the transmitter 802, the processor 803, and the memory 804 may be connected by a bus or other methods.
  • the memory 804 may include a read-only memory and a random access memory, and provides instructions and data to the processor 803. A part of the memory 804 may also include a non-volatile random access memory (NVRAM).
  • the memory 804 stores operating instructions for the processor, executable modules, or data structures, or a subset of them, or an extended set of them.
  • the operating instructions may include various operating instructions for implementing various operations.
  • the processor 803 controls the operation of the execution device.
  • the various components of the execution device are coupled together through a bus system, where the bus system may include a power bus, a control bus, and a status signal bus in addition to a data bus.
  • for clarity of description, the various buses are all referred to as the bus system in the figure.
  • the method disclosed in the foregoing embodiment of the present application may be applied to the processor 803 or implemented by the processor 803.
  • the processor 803 may be an integrated circuit chip with signal processing capability. In the implementation process, each step of the above method can be completed by an integrated logic circuit of hardware in the processor 803 or instructions in the form of software.
  • the aforementioned processor 803 may be a general-purpose processor, a digital signal processor (DSP), a microprocessor, or a microcontroller, and may further include an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the processor 803 may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present application.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers.
  • the storage medium is located in the memory 804, and the processor 803 reads the information in the memory 804, and completes the steps of the foregoing method in combination with its hardware.
  • the receiver 801 can be used to receive input digital or character information, and to generate signal input related to the relevant settings and function control of the execution device.
  • the transmitter 802 can be used to output digital or character information through the first interface; the transmitter 802 can also be used to send instructions to the disk group through the first interface to modify the data in the disk group.
  • the embodiment of the present application also provides a product including a computer program, which when running on a computer, causes the computer to execute the steps performed by the terminal device in the method described in the embodiment shown in FIG. 6 or FIG. 7.
  • the embodiment of the present application also provides a computer-readable storage medium that stores a program for signal processing; when the program runs on a computer, it causes the computer to execute the steps performed by the terminal device in the method described in the embodiment shown in FIG. 6 or FIG. 7 above.
  • the execution device or terminal device provided by the embodiment of the present application may specifically be a chip.
  • the chip includes a processing unit and a communication unit.
  • the processing unit may be, for example, a processor, and the communication unit may be, for example, an input/output interface, a pin, or a circuit.
  • the processing unit can execute the computer-executable instructions stored in the storage unit, so that the chip in the execution device executes the trajectory prediction method described in the embodiment shown in FIG. 2 above.
  • the storage unit may be a storage unit inside the chip, such as a register or a cache; it may also be a storage unit located outside the chip in the wireless access device, such as a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM).
  • Figure 9 is a schematic structural diagram of a chip provided by an embodiment of the application.
  • the chip may be represented as a neural network processor NPU 900, which is mounted on the main CPU (Host CPU) as a coprocessor; the Host CPU assigns tasks to it.
  • the core part of the NPU is an arithmetic circuit, and the arithmetic circuit 903 is controlled by the controller 904 to extract matrix data from the memory and perform multiplication operations.
  • the arithmetic circuit 903 includes multiple processing units (Process Engine, PE). In some implementations, the arithmetic circuit 903 is a two-dimensional systolic array. The arithmetic circuit 903 may also be a one-dimensional systolic array or other electronic circuits capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuit 903 is a general-purpose matrix processor.
  • the arithmetic circuit fetches the data corresponding to matrix B from the weight memory 902 and caches it on each PE in the arithmetic circuit.
  • the arithmetic circuit takes the matrix A data from the input memory 901, performs matrix operations with matrix B, and stores the partial or final results of the resulting matrix in an accumulator 908.
  • the unified memory 906 is used to store input data and output data.
  • the weight data is transferred directly to the weight memory 902 through the direct memory access controller (DMAC) 905.
  • the input data is also transferred to the unified memory 906 through the DMAC.
  • the bus interface unit (BIU) 910 is used for interaction between the AXI bus and both the DMAC and the instruction fetch buffer (IFB) 909.
  • the bus interface unit 910 is used by the instruction fetch buffer 909 to obtain instructions from an external memory, and by the DMAC 905 to obtain the original data of the input matrix A or the weight matrix B from the external memory.
  • the DMAC is mainly used to transfer the input data in the external memory DDR to the unified memory 906 or to transfer the weight data to the weight memory 902 or to transfer the input data to the input memory 901.
  • the vector calculation unit 907 includes multiple arithmetic processing units, and further processes the output of the arithmetic circuit if necessary, such as vector multiplication, vector addition, exponential operation, logarithmic operation, size comparison, and so on. It is mainly used in the calculation of non-convolutional/fully connected layer networks in neural networks, such as Batch Normalization, pixel-level summation, and upsampling of feature planes.
  • the vector calculation unit 907 can store the processed output vector to the unified memory 906.
  • the vector calculation unit 907 may apply a linear function and/or a nonlinear function to the output of the arithmetic circuit 903, for example linearly interpolating the feature plane extracted by a convolutional layer, or applying the function to a vector of accumulated values, to generate activation values.
  • the vector calculation unit 907 generates normalized values, pixel-level summed values, or both.
  • the processed output vector can be used as an activation input to the arithmetic circuit 903, for example for use in a subsequent layer in a neural network.
  • the instruction fetch buffer 909 connected to the controller 904 is used to store instructions used by the controller 904.
  • the unified memory 906, the input memory 901, the weight memory 902, and the instruction fetch buffer 909 are all on-chip memories.
  • the external memory is private to the NPU hardware architecture.
  • the processor mentioned in any of the foregoing may be a general-purpose central processing unit, a microprocessor, an ASIC, or one or more integrated circuits used to control the execution of the program of the method in the first aspect.
  • the device embodiments described above are only illustrative: the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place or distributed across multiple network units.
  • some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the connection relationship between the modules indicates that they have a communication connection between them, which may be specifically implemented as one or more communication buses or signal lines.
  • this application can be implemented by means of software plus the necessary general-purpose hardware.
  • it can also be implemented by dedicated hardware, including dedicated integrated circuits, dedicated CPUs, dedicated memory, and dedicated components.
  • all functions completed by computer programs can be easily implemented with corresponding hardware.
  • the specific hardware structures used to achieve the same function can also be diverse, such as analog circuits, digital circuits, or special-purpose circuits.
  • for this application, however, a software implementation is the better choice in most cases.
  • the technical solution of this application, in essence or in the part that contributes to the existing technology, can be embodied in the form of a software product; the computer software product is stored in a readable storage medium, such as a computer floppy disk, USB flash drive, removable hard disk, ROM, RAM, magnetic disk, or optical disc, and includes several instructions that cause a computer device (which can be a personal computer, training device, or network device, etc.) to execute the methods described in the embodiments of this application.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • the computer instructions may be transmitted from one website, computer, training device, or data center to another website, computer, training device, or data center through wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave).
  • the computer-readable storage medium may be any usable medium that a computer can store data on, or a data storage device, such as a training device or a data center, that integrates one or more usable media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, and a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Psychiatry (AREA)
  • Probability & Statistics with Applications (AREA)
  • Human Computer Interaction (AREA)
  • Social Psychology (AREA)
  • Tourism & Hospitality (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • General Business, Economics & Management (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Automation & Control Theory (AREA)

Abstract

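A trajectory prediction method, applied to autonomous driving in the field of artificial intelligence, includes: obtaining posture information and historical motion trajectory information of a target object (201); obtaining, based on the posture information, target information representing the behavioral intention of the target object (202); and taking the target information and the historical trajectory information as input to a trajectory prediction model to obtain a predicted trajectory of the target object (203). Because the method uses the pedestrian's behavioral intention as an input feature for pedestrian trajectory prediction, it can reduce the prediction error and improve the prediction performance for pedestrian trajectories.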

Description

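Trajectory prediction method and related device
Technical Field
This application relates to the field of artificial intelligence, and in particular to a trajectory prediction method and a related device.
Background
As the economy develops and living standards rise, the number of vehicles grows year by year, which leads to traffic congestion and traffic accidents. To make driving safer, autonomous driving technology is now applied in vehicles so that they can drive automatically.
During autonomous driving, a vehicle needs to predict the trajectories of nearby pedestrians. A commonly used approach predicts a pedestrian's trajectory from the pedestrian's historical trajectory information. However, walking behavior involves many subjective factors, so predicting a pedestrian's future trajectory from historical trajectory information alone performs poorly.
Summary
This application provides a trajectory prediction method and a related device that use the pedestrian's behavioral intention as an input feature for pedestrian trajectory prediction, which can reduce the prediction error and improve the prediction performance for pedestrian trajectories.
In a first aspect, this application provides a trajectory prediction method, including: obtaining posture information and historical motion trajectory information of a target object; obtaining, based on the posture information, target information representing the behavioral intention of the target object; and taking the target information and the historical trajectory information as input to a trajectory prediction model to obtain a predicted trajectory of the target object.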
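In an optional design of the first aspect, the behavioral intention includes at least one of the following:
changing from a moving state to a stationary state, remaining in a moving state, changing from a stationary state to a moving state, and changing from a first moving state to a second moving state.
In an optional design of the first aspect, the target object includes multiple skeleton nodes, and the posture information includes the skeleton node position of each skeleton node at multiple moments.
In an optional design of the first aspect, the target object includes a first skeleton node, and the posture information includes a first skeleton node position, a second skeleton node position, and a third skeleton node position; the first skeleton node position is the position of the first skeleton node of the target object at a first moment, the third skeleton node position is the position of the first skeleton node of the target object at a third moment, and the second skeleton node position corresponds to the position of the first skeleton node of the target object at a second moment; the first moment, the second moment, and the third moment are successively adjacent in the time dimension; and the second skeleton node position is related to the first skeleton node position and/or the third skeleton node position.
In an optional design of the first aspect, the second skeleton node position is the center position between the first skeleton node position and the third skeleton node position.
In an optional design of the first aspect, the method further includes:
sampling the skeleton node positions of each skeleton node at the multiple moments at intervals in time to obtain processed posture information, where the processed posture information is used to obtain the target information representing the behavioral intention of the target object.
In an optional design of the first aspect, the method further includes:
normalizing the skeleton node positions of each skeleton node at the multiple moments to obtain normalized posture information, where the normalized posture information is used to obtain the target information representing the behavioral intention of the target object.
In an optional design of the first aspect, obtaining, based on the posture information, the target information representing the behavioral intention of the target object includes:
taking the posture information as input to a recurrent neural network model to obtain the target information representing the behavioral intention of the target object.
In an optional design of the first aspect, the historical trajectory information includes historical trajectory point positions at multiple moments, with one historical trajectory point position per moment; the historical trajectory point positions at the multiple moments are expressed in a preset coordinate system and include a first trajectory point position, which is located at a preset position in the preset coordinate system.
The first trajectory point position may be the initial trajectory position among the historical trajectory positions, or the latest one in time. The preset position in the preset coordinate system may be the coordinate origin or any other preset coordinate position in that coordinate system.
In an optional design of the first aspect, the historical trajectory point positions are positions relative to the ground.
In an optional design of the first aspect, taking the target information and the historical trajectory information as input to the trajectory prediction model to obtain the predicted trajectory of the target object includes:
taking the target information and the historical trajectory information as input to a recurrent neural network model to obtain the predicted trajectory of the target object.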
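In a second aspect, this application provides an execution device, including:
an acquisition module, configured to obtain posture information and historical motion trajectory information of a target object,
and to obtain, based on the posture information, target information representing the behavioral intention of the target object; and
a prediction module, configured to take the target information and the historical trajectory information as input to a trajectory prediction model to obtain a predicted trajectory of the target object.
In an optional design of the second aspect, the behavioral intention includes at least one of the following:
changing from a moving state to a stationary state, remaining in a moving state, changing from a stationary state to a moving state, and changing from a first moving state to a second moving state.
In an optional design of the second aspect, the target object includes multiple skeleton nodes, and the posture information includes the skeleton node position of each skeleton node at multiple moments.
In an optional design of the second aspect, the target object includes a first skeleton node, and the posture information includes a first skeleton node position, a second skeleton node position, and a third skeleton node position; the first skeleton node position is the position of the first skeleton node of the target object at a first moment, the third skeleton node position is the position of the first skeleton node of the target object at a third moment, and the second skeleton node position corresponds to the position of the first skeleton node of the target object at a second moment; the first moment, the second moment, and the third moment are successively adjacent in the time dimension; and the second skeleton node position is related to the first skeleton node position and/or the third skeleton node position.
In an optional design of the second aspect, the second skeleton node position is the center position between the first skeleton node position and the third skeleton node position.
In an optional design of the second aspect, the device further includes:
a sampling module, configured to sample the skeleton node positions of each skeleton node at the multiple moments at intervals in time to obtain processed posture information, where the processed posture information is used to obtain the target information representing the behavioral intention of the target object.
In an optional design of the second aspect, the device further includes:
a normalization module, configured to normalize the skeleton node positions of each skeleton node at the multiple moments to obtain normalized posture information, where the normalized posture information is used to obtain the target information representing the behavioral intention of the target object.
In an optional design of the second aspect, the acquisition module is specifically configured to:
take the posture information as input to a recurrent neural network model to obtain the target information representing the behavioral intention of the target object.
In an optional design of the second aspect, the historical trajectory information includes historical trajectory point positions at multiple moments, with one historical trajectory point position per moment; the historical trajectory point positions at the multiple moments are expressed in a preset coordinate system and include a first trajectory point position, which is located at a preset position in the preset coordinate system.
In an optional design of the second aspect, the historical trajectory point positions are positions relative to the ground.
In an optional design of the second aspect, the prediction module is specifically configured to:
take the target information and the historical trajectory information as input to a recurrent neural network model to obtain an output vector; and
map the output vector to the predicted trajectory of the target object.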
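In a third aspect, an embodiment of the present invention provides a terminal device, including a memory, a communication interface, and a processor coupled to the memory and the communication interface; the memory is configured to store instructions, the processor is configured to execute the instructions, and the communication interface is configured to communicate with other devices under the control of the processor; when executing the instructions, the processor performs the method described in the first aspect or in any possible embodiment of the first aspect.
In a fourth aspect, a computer-readable storage medium is provided that stores program code for vehicle control. The program code includes instructions for performing the method described in the first aspect or in any possible embodiment of the first aspect.
In a fifth aspect, a computer program product including instructions is provided; when it runs on a computer, it causes the computer to perform the method described in the first aspect or in any possible embodiment of the first aspect.
An embodiment of this application provides a trajectory prediction method, including: obtaining posture information and historical motion trajectory information of a target object; obtaining, based on the posture information, target information representing the behavioral intention of the target object; and taking the target information and the historical trajectory information as input to a trajectory prediction model to obtain a predicted trajectory of the target object. Using the pedestrian's behavioral intention as an input feature for pedestrian trajectory prediction in this way can reduce the prediction error and improve the prediction performance for pedestrian trajectories. It addresses the inaccurate prediction of pedestrian motion trajectories in the prior art and thereby improves the safety of vehicle control.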
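Brief Description of the Drawings
FIG. 1A is a schematic structural diagram of the main framework of artificial intelligence;
FIG. 1B is a schematic structural diagram of a possible terminal device of this application;
FIG. 1C is a schematic structural diagram of another possible terminal device of this application;
FIG. 1D is a schematic diagram of a possible scenario of an embodiment of this application;
FIG. 2 is a schematic diagram of an embodiment of a trajectory prediction method provided by an embodiment of this application;
FIG. 3 is a schematic diagram of pedestrian behavioral intentions provided by an embodiment of this application;
FIG. 4 is a schematic diagram of historical trajectory information processing provided by an embodiment of this application;
FIG. 5 is a schematic diagram of a confusion matrix provided by an embodiment of this application;
FIG. 6 is a schematic diagram of a possible structure of the terminal device involved in this embodiment;
FIG. 7 is a schematic structural diagram of another terminal device 700 provided by an embodiment of this application;
FIG. 8 is a schematic structural diagram of an execution device provided by an embodiment of this application;
FIG. 9 is a schematic structural diagram of a chip provided by an embodiment of this application.
Detailed Description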
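The embodiments of this application provide a trajectory prediction method and a related device, which are used to reduce the prediction error and improve the prediction performance for pedestrian trajectories.
The embodiments of this application are described below with reference to the accompanying drawings. A person of ordinary skill in the art will appreciate that, as technology develops and new scenarios emerge, the technical solutions provided by the embodiments of this application apply equally to similar technical problems.
The terms "first", "second", and so on in the specification, the claims, and the above drawings of this application are used to distinguish similar objects and do not necessarily describe a specific order or sequence. It should be understood that terms used in this way are interchangeable where appropriate; they are merely the way objects with the same attributes are distinguished when the embodiments of this application are described. In addition, the terms "include" and "have" and any variations of them are intended to cover non-exclusive inclusion, so that a process, method, system, product, or device that includes a series of units is not necessarily limited to those units, but may include other units that are not explicitly listed or that are inherent to the process, method, product, or device.
The overall workflow of an artificial intelligence system is described first. Refer to FIG. 1A, which is a schematic structural diagram of the main framework of artificial intelligence. The framework is explained below along two dimensions: the "intelligent information chain" (horizontal axis) and the "IT value chain" (vertical axis). The "intelligent information chain" reflects the series of processes from data acquisition to data processing. For example, it can be the general process of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision-making, and intelligent execution and output. In this process, data is progressively refined from data to information to knowledge to wisdom. The "IT value chain" runs from the underlying infrastructure of artificial intelligence, through information (providing and processing technology implementations), to the industrial ecosystem of the system, and reflects the value that artificial intelligence brings to the information technology industry.
(1) Infrastructure
The infrastructure provides computing power for the artificial intelligence system, enables communication with the outside world, and provides support through the basic platform. Communication with the outside world takes place through sensors; computing power is provided by smart chips (hardware acceleration chips such as CPUs, NPUs, GPUs, ASICs, and FPGAs); the basic platform includes the distributed computing framework, the network, and related platform guarantees and support, and may include cloud storage and computing, interconnection networks, and so on. For example, sensors communicate with the outside world to obtain data, and the data is provided to the smart chips in the distributed computing system offered by the basic platform for computation.
(2) Data
The data at the layer above the infrastructure represents the data sources in the field of artificial intelligence. The data involves graphics, images, speech, and text, as well as Internet of Things data from traditional devices, including business data of existing systems and sensed data such as force, displacement, liquid level, temperature, and humidity.
(3) Data processing
Data processing usually includes data training, machine learning, deep learning, search, reasoning, decision-making, and so on.
Machine learning and deep learning can perform symbolic and formalized intelligent information modeling, extraction, preprocessing, and training on data.
Reasoning is the process in which a computer or intelligent system simulates human intelligent reasoning and, following a reasoning control strategy, uses formalized information to think and solve problems; its typical functions are search and matching.
Decision-making is the process of making decisions after intelligent information has been reasoned over, and usually provides functions such as classification, ranking, and prediction.
(4) General capabilities
After the data has gone through the data processing described above, some general capabilities can be formed from the results, such as an algorithm or a general-purpose system, for example translation, text analysis, computer vision processing, speech recognition, and image recognition.
(5) Smart products and industry applications
Smart products and industry applications are the products and applications of artificial intelligence systems in various fields. They package the overall artificial intelligence solution, turn intelligent information decision-making into products, and put it into practical use. The application fields mainly include smart terminals, smart manufacturing, smart transportation, smart homes, smart healthcare, smart security, autonomous driving, safe cities, and so on.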
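The technical solutions in the embodiments of the present invention are described in detail below with reference to the drawings of the present invention.
First, the structure of a terminal device to which this application applies is introduced. FIG. 1B is a schematic structural diagram of a possible terminal device of this application. As shown in FIG. 1B, the terminal device includes an environment perception module 102, a planning and decision module 104, and a control processing module 106. The environment perception module 102 mainly uses peripheral systems (such as sensors and cameras) to collect obstacle information, information about the surroundings of the terminal device, and driving information of the vehicle in which the terminal device is located. The obstacle information includes, but is not limited to, the obstacle's geographic location, speed, direction of motion, acceleration, variance of the direction of motion, and variance of the speed. Obstacles include, but are not limited to, vehicles, pedestrians, living obstacles, and inanimate obstacles. This application takes a pedestrian as the example obstacle to explain the embodiments involved.
The surrounding environment information includes, but is not limited to, map information, weather information, intersection type, lane lines, number of lanes, whether the road is congested, traffic speed, traffic acceleration, and the distance between the terminal device and the obstacle.
The driving information includes, but is not limited to, the vehicle's geographic location, driving speed, driving direction, driving acceleration, and the distance between the vehicle and the obstacle. The terminal device includes, but is not limited to, vehicles such as automobiles, trains, trucks, and cars, as well as communication devices installed on vehicles, such as in-vehicle devices.
The planning and decision module 104 includes a behavior prediction module and a planning module. The behavior prediction module is mainly used to predict, from the information collected by the environment perception module, the behavioral intention of the obstacle (for example, the pedestrian behavior result described later in this application) and the motion trajectory corresponding to that behavioral intention (that is, the obstacle trajectory). The planning module is used to obtain, on the premise of ensuring safety, a control strategy corresponding to the behavioral intention, so that the control strategy can subsequently be used to control the vehicle to drive safely. The control strategy is either customized in advance on the user side or the terminal device side, or generated according to the behavioral intention, as detailed later. The control strategy indicates how the corresponding vehicle parameters are to be adjusted so that the vehicle drives safely.
The control processing module is used to control and adjust the vehicle according to the control strategy obtained by the planning and decision module, so as to avoid a collision between the vehicle and the obstacle. For example, it controls vehicle parameters such as the steering wheel angle, the driving speed, whether to brake, and whether to press the accelerator pedal. How the vehicle is controlled to drive safely according to the pedestrian's behavior result (that is, the behavioral intention) is explained in detail later.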
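FIG. 1C is a schematic structural diagram of another possible terminal device of this application. As shown in FIG. 1C, the terminal device 100 may include a baseband chip 110, a memory 115 (including one or more computer-readable storage media), a radio frequency (RF) module 116, and a peripheral system 117. These components can communicate over one or more communication buses 114.
The peripheral system 117 is mainly used to implement interaction between the terminal device 100 and the user (such as a pedestrian) or the external environment, and mainly includes the input and output devices of the terminal device 100. In a specific implementation, the peripheral system 117 may include a touch screen controller 118, a camera controller 119, an audio controller 120, and a sensor management module 121. Each controller can be coupled to its corresponding peripheral device, such as a touch screen 123, a camera 124, an audio circuit 125, and a sensor 126. In some embodiments, a gesture sensor in the sensor 126 can be used to receive gesture control operations input by the user. A speed sensor in the sensor 126 can be used to measure the driving speed of the terminal device itself or the speed of obstacles in the environment. The touch screen 123 can serve as a prompting device that alerts obstacles by means of screen display, projection, and so on, for example by displaying text that prompts a pedestrian crossing the road to walk faster. Optionally, the peripheral system 117 may also include other prompting devices such as lights and displays for interactive prompts between the vehicle and pedestrians, to avoid collisions between them. It should be noted that the peripheral system 117 may also include other I/O peripherals.
The baseband chip 110 may integrate one or more processors 111, a clock module 112, and a power management module 113. The clock module 112 integrated in the baseband chip 110 is mainly used to generate the clocks that the processor 111 needs for data transmission and timing control. The power management module 113 integrated in the baseband chip 110 is mainly used to supply a stable, high-precision voltage to the processor 111, the radio frequency module 116, and the peripheral system.
The radio frequency (RF) module 116 is used to receive and transmit radio frequency signals and mainly integrates the receiver and the transmitter of the terminal device 100. The RF module 116 communicates with communication networks and other communication devices through radio frequency signals. In a specific implementation, the RF module 116 may include, but is not limited to, an antenna system, an RF transceiver, one or more amplifiers, a tuner, one or more oscillators, a digital signal processor, a CODEC chip, a SIM card, and a storage medium. In some embodiments, the RF module 116 may be implemented on a separate chip.
The memory 115 is coupled to the processor 111 and is used to store various software programs and/or multiple sets of instructions. In a specific implementation, the memory 115 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic disk storage devices, flash memory devices, or other non-volatile solid-state storage devices. The memory 115 may store an operating system, for example an embedded operating system such as ANDROID, IOS, WINDOWS, or LINUX. The memory 115 may also store a network communication program, which can be used to communicate with one or more additional devices or one or more terminal devices.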
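Next, a possible scenario to which this application applies is introduced. As shown in FIG. 1D, on a road with vehicle traffic, a pedestrian is located at point P1 at the edge of the road. To prevent a collision with the pedestrian, the vehicle needs to predict the pedestrian's motion trajectory and control its own driving based on the predicted trajectory. For example, if the predicted trajectory after point P1 is motion trajectory b (crossing the road), the vehicle can be controlled to slow down, or even to brake and wait, so as to give way to the pedestrian. If instead the prediction is that the pedestrian follows motion trajectory a (walking straight along the road) or stops at point P1, no control operation on the vehicle is needed and the vehicle and the pedestrian will not collide.
In the course of proposing this application, the applicant found that the prior art predicts a pedestrian's motion trajectory from the pedestrian's historical trajectory information and that the results are inaccurate, which in turn makes vehicle control less reliable and less safe. To solve this problem, this application proposes a corresponding trajectory prediction method, which is explained in detail below.
Refer to FIG. 2, which is a schematic diagram of an embodiment of a trajectory prediction method provided by an embodiment of this application. As shown in FIG. 2, the trajectory prediction method provided by the embodiment of this application includes: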
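201. Obtain posture information and historical motion trajectory information of the target object.
Taking a pedestrian as the target object, in the embodiment of this application the terminal device can obtain the pedestrian's posture information through peripheral systems (such as sensors and cameras). The posture information can be used to predict the pedestrian's upcoming behavioral intention. In the embodiment of this application, the terminal device can take the posture information as input to a recurrent neural network to obtain target information representing the behavioral intention of the target object.
In the embodiment of this application, the posture information of the target object can be obtained, where the target object includes multiple skeleton nodes and the posture information includes the skeleton node position of each skeleton node at multiple moments.
In the embodiment of this application, a video stream of the pedestrian can be obtained first and then processed: convolutional neural network (CNN) skeleton fitting is performed to obtain the pedestrian's skeleton information at multiple moments (in other words, in multiple frames of the video stream). The skeleton information is also referred to below as the skeleton pixel coordinate sequence. The pedestrian's skeleton information at each moment can include multiple skeleton node positions. Specifically, the posture information can be expressed as:
(x_1, y_1, x_2, y_2, …, x_i, y_i, …);
where i is the skeleton node index and (x_i, y_i) are the pixel abscissa and pixel ordinate in the video stream of the skeleton node with index i. It should be noted that, in one embodiment, because the head joint points are relatively likely to be lost, the skeleton information related to the pedestrian's head joint points can be removed.
It should be noted that a skeleton node can be a human joint such as an elbow joint or an ankle joint; the type of skeleton node is not limited here.
For example, a pedestrian is identified in 12 captured frames of a video stream, and CNN skeleton fitting is performed on the video stream to obtain the skeleton information X = (X_1, …, X_i, …, X_12), where X_i denotes the positions of the pedestrian's skeleton nodes at moment i, X_i = (x_1, y_1, x_2, y_2, …, x_k, y_k, …, x_12, y_12), and k is the skeleton node index.
That is, in the embodiment of this application, the terminal device obtains the skeleton node position of each skeleton node of the target object at multiple moments.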
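Optionally, in one embodiment, the target object includes a first skeleton node, and the posture information includes a first skeleton node position, a second skeleton node position, and a third skeleton node position; the first skeleton node position is the position of the first skeleton node of the target object at a first moment, the third skeleton node position is the position of the first skeleton node of the target object at a third moment, and the second skeleton node position corresponds to the position of the first skeleton node of the target object at a second moment; the first moment, the second moment, and the third moment are successively adjacent in the time dimension; and the second skeleton node position is related to the first skeleton node position and/or the third skeleton node position. In particular, the second skeleton node position is the center position between the first skeleton node position and the third skeleton node position.
In the embodiment of this application, when the skeleton information of one or more frames of the captured video stream is lost, the skeleton node positions in the preceding and following frames can be used to fill the gap by interpolation. The interpolation can take the mean of the positions of the corresponding skeleton node in the preceding and following frames, or apply some other operation to them, and use the result as the missing skeleton node position.
Specifically, the posture information includes the first skeleton node position of the first skeleton node of the target object at the first moment, the second skeleton node position at the second moment, and the third skeleton node position at the third moment. The first skeleton node can be, for example, the pedestrian's elbow joint node. The second skeleton node position is related to the first skeleton node position and/or the third skeleton node position. For example, the second skeleton node position is the center position between the first and third skeleton node positions, that is, it is obtained by averaging the positions of the corresponding skeleton node in the preceding and following frames (the first skeleton node position and the third skeleton node position).
For example, at moment A the pedestrian's skeleton information is:
X_A = (x_A1, y_A1, x_A2, y_A2, …, x_Ak, y_Ak, …, x_A12, y_A12);
at moment B the pedestrian's skeleton information is:
X_B = (x_B1, y_B1, x_B3, y_B3, …, x_Bk, y_Bk, …, x_B12, y_B12);
at moment C the pedestrian's skeleton information is:
X_C = (x_C1, y_C1, x_C2, y_C2, …, x_Ck, y_Ck, …, x_C12, y_C12);
It can be seen that at moment B the skeleton node position (x_B2, y_B2) of the elbow joint is missing from the skeleton information. To fill in the missing position (x_B2, y_B2), the elbow joint's skeleton node positions in the preceding and following frames, (x_A2, y_A2) and (x_C2, y_C2), can be used. In particular, the missing skeleton node position can be filled in according to the following formula: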
$$x_{B2} = \frac{x_{A2} + x_{C2}}{2}, \qquad y_{B2} = \frac{y_{A2} + y_{C2}}{2}$$
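The mean-of-neighbours fill-in described above can be sketched as follows. This is a minimal illustration only: the function name `fill_missing_nodes`, the use of `None` for a lost position, and the sample coordinates are assumptions of this sketch and do not come from the application.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def fill_missing_nodes(track: List[Optional[Point]]) -> List[Optional[Point]]:
    """Fill single-frame gaps in one skeleton node's track.

    A missing position (None) is replaced by the midpoint of the positions in
    the immediately preceding and following frames. Gaps at either end of the
    track, or gaps longer than one frame, are left as None.
    """
    filled = list(track)
    for t in range(1, len(track) - 1):
        prev_pos, cur_pos, next_pos = track[t - 1], track[t], track[t + 1]
        if cur_pos is None and prev_pos is not None and next_pos is not None:
            filled[t] = ((prev_pos[0] + next_pos[0]) / 2.0,
                         (prev_pos[1] + next_pos[1]) / 2.0)
    return filled


# Hypothetical elbow positions at moments A, B (lost) and C, in pixels.
elbow_track = [(100.0, 200.0), None, (104.0, 206.0)]
filled_elbow = fill_missing_nodes(elbow_track)
```

Longer gaps are deliberately left unfilled here, because the text only describes averaging the two adjacent frames.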
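Optionally, in one embodiment, the terminal device can sample the skeleton node positions of each skeleton node at the multiple moments at intervals in time to obtain processed posture information, which is used to obtain the target information representing the behavioral intention of the target object.
For example, in this embodiment the posture information can be sampled at equal intervals over the multiple moments. Let (x_1, y_1, x_2, y_2, …, x_i, y_i, …, x_n, y_n) be the skeleton pixel coordinate sequence of a certain skeleton node, where n may be even and (x_i, y_i) are the skeleton pixel coordinates of that node at moment i. With equal-interval sampling at t = 2, the skeleton pixel coordinate sequence becomes (x_1, y_1, x_3, y_3, …, x_{n-1}, y_{n-1}).
In the embodiment of this application, when the captured video frames have low resolution and the pedestrian appears small in some frames, the skeleton information of different behavioral intention categories differs only slightly, which sharply reduces the accuracy of pedestrian intention prediction. To overcome this, the posture information can be sampled at equal intervals over the multiple moments so as to enlarge the differences between the skeleton information of different behavioral intention categories.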
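As a small illustration of the equal-interval sampling with t = 2, the sketch below keeps every second position of one node's track. The helper name `sample_every` and the sample values are assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def sample_every(track: Sequence[Point], step: int = 2) -> List[Point]:
    """Keep every `step`-th position of a node track, starting with the first."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return list(track[::step])


# Made-up positions of one node at moments 1..8 (n = 8, an even length).
node_track = [(float(i), float(10 * i)) for i in range(1, 9)]
sampled_track = sample_every(node_track, step=2)  # keeps moments 1, 3, 5, 7
```

With n = 8 the last kept position is the one at moment n - 1 = 7, which matches the sampled sequence given in the text.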
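Optionally, in one embodiment, the terminal device can normalize the skeleton node positions of each skeleton node at the multiple moments to obtain normalized posture information, which is used to obtain the target information representing the behavioral intention of the target object.
For example, the time series of each skeleton node can be normalized internally as follows: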
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
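where x_min is the minimum pixel abscissa (or ordinate) of a given node over the sequence, and x_max is the maximum pixel abscissa (or ordinate) of that node over the sequence.
In the embodiment of this application, when the captured video frames have low resolution and the pedestrian appears small in some frames, the skeleton information of different behavioral intention categories differs only slightly, which sharply reduces the accuracy of pedestrian intention prediction. To overcome this, the skeleton information can be normalized within each sequence so as to enlarge the differences between categories and thereby improve the performance of pedestrian intention prediction.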
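The within-sequence normalization can be sketched as a min-max rescaling of one coordinate of one node over the sequence. This is an illustrative reading of the x_min and x_max definitions above; the function name and the all-zeros fallback for a constant sequence are assumptions of the sketch.

```python
from typing import List, Sequence


def normalize_sequence(values: Sequence[float]) -> List[float]:
    """Min-max normalize one coordinate sequence of one node to [0, 1].

    A constant sequence has no spread, so it is mapped to all zeros here;
    that fallback is a choice of this sketch, not something the text states.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up pixel abscissas of one node over four moments.
elbow_x = [120.0, 124.0, 128.0, 136.0]
elbow_x_norm = normalize_sequence(elbow_x)
```

Because the minimum and maximum are taken per node and per sequence, a small pedestrian in a low-resolution frame is stretched to the same [0, 1] range as a large one, which is the effect the paragraph above is after.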
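In the embodiment of this application, the historical motion trajectory information of the target object can also be obtained. In particular, the pedestrian's historical motion trajectory information can be collected by vehicle sensors (such as millimeter-wave radar). The pedestrian's trajectory coordinates in the historical motion trajectory information can take the form (x_1, y_1, x_2, y_2, …, x_i, y_i, …), where (x_i, y_i) are the pedestrian's position coordinates at moment i.
Optionally, in one embodiment, the historical trajectory information includes historical trajectory point positions at multiple moments, with one historical trajectory point position per moment; the historical trajectory point positions at the multiple moments are expressed in a preset coordinate system and include a first trajectory point position, which is the trajectory point position at the initial moment among the multiple moments and is located at a preset position in the preset coordinate system. In particular, the historical trajectory point positions are positions relative to the ground.
In the embodiment of this application, the pedestrian's original historical trajectory can be preprocessed to obtain the historical trajectory point positions in a new coordinate system. Specifically, the historical trajectory point positions obtained by the terminal device may form a trajectory of displacement relative to the vehicle. To remove the interference of vehicle motion and restore the pedestrian's true historical trajectory coordinates, the vehicle's displacement can be added to the pedestrian's original trajectory, that is, the pedestrian's historical trajectory is compensated for vehicle displacement, which yields historical trajectory information relative to the ground.
In the embodiment of this application, to improve the accuracy of pedestrian trajectory prediction, the pedestrian trajectory coordinates can be transformed with the starting point of the trajectory prediction as the reference origin. Specifically, the historical trajectory information includes historical trajectory point positions at multiple moments, with one historical trajectory point position per moment; these positions are expressed in a preset coordinate system and include a first trajectory point position. The first trajectory point position can be the trajectory point position at the initial moment or at the last moment among the multiple moments, which is not limited here. The first trajectory point position is located at a preset position in the preset coordinate system, and the preset position can be the origin (0, 0) of the preset coordinate system.
For example, refer to FIG. 4, which is a schematic diagram of historical trajectory information processing provided by an embodiment of this application. As shown in FIG. 4, O_1 is the origin of the vehicle's coordinate reference frame and O_2 is the pedestrian's trajectory point position at the initial moment; the pedestrian's historical trajectory information can be expressed in a coordinate system whose origin is O_2.
In particular, to reduce the influence of sensor noise on the pedestrian trajectory, the obtained pedestrian historical trajectory information can also be smoothed.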
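The three preprocessing steps for the historical trajectory (vehicle displacement compensation, shifting a reference point to the origin, and smoothing) can be sketched as below. Several things here are assumptions of the sketch rather than statements of the application: the vehicle is assumed to translate without changing heading, the smoothing filter is a simple moving average because the text does not name one, and all function names and sample values are made up.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def compensate_vehicle_motion(relative: Sequence[Point],
                              vehicle_shift: Sequence[Point]) -> List[Point]:
    """Add the vehicle's accumulated displacement to each relative position.

    Assumes a pure translation of the vehicle (no heading change).
    """
    return [(px + vx, py + vy)
            for (px, py), (vx, vy) in zip(relative, vehicle_shift)]


def shift_to_origin(track: Sequence[Point], ref_index: int = 0) -> List[Point]:
    """Re-express the track so that track[ref_index] sits at (0, 0)."""
    ox, oy = track[ref_index]
    return [(x - ox, y - oy) for x, y in track]


def moving_average(track: Sequence[Point], window: int = 3) -> List[Point]:
    """Smooth with a centered moving average, shrinking the window at the ends."""
    half = window // 2
    smoothed = []
    for i in range(len(track)):
        chunk = track[max(0, i - half):i + half + 1]
        smoothed.append((sum(p[0] for p in chunk) / len(chunk),
                         sum(p[1] for p in chunk) / len(chunk)))
    return smoothed


relative_track = [(10.0, 2.0), (9.0, 2.0), (8.0, 2.0)]  # pedestrian as seen from the vehicle
vehicle_shift = [(0.0, 0.0), (1.5, 0.0), (3.0, 0.0)]    # vehicle motion since the first frame
ground_track = compensate_vehicle_motion(relative_track, vehicle_shift)
local_track = shift_to_origin(ground_track, ref_index=0)
smooth_track = moving_average(local_track, window=3)
```

Passing `ref_index=-1` to `shift_to_origin` would instead place the last historical point at the origin, the other option the text allows.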
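202. Obtain, based on the posture information, target information representing the behavioral intention of the target object.
In the embodiment of this application, the terminal device can take the posture information as input to a recurrent neural network model to obtain the target information representing the behavioral intention of the target object. The recurrent neural network model can be a long short-term memory (LSTM) network or a gated recurrent unit (GRU). The following description takes an LSTM network as the example.
About LSTM:
The LSTM algorithm is a specific form of recurrent neural network (RNN), and RNN is the general term for a family of neural networks that can process sequence data. RNNs have many variants, such as the bidirectional RNN. However, RNNs have great difficulty with long-term dependencies (nodes far apart in the time series), because relating distant nodes involves multiplying Jacobian matrices many times, which causes vanishing gradients (common) or exploding gradients (less common). The most widely used remedy is the gated RNN, of which LSTM is the best known. Leaky units allow an RNN to accumulate long-term connections between distant nodes by designing the weight coefficients on the connections; gated RNNs generalize this idea by allowing the coefficient to change at each moment and allowing the network to forget information it has already accumulated. LSTM is such a gated RNN. By adding an input gate, a forget gate, and an output gate, LSTM makes the weight of the self-loop variable, so that with fixed model parameters the integration scale can change dynamically from moment to moment, which avoids the vanishing or exploding gradient problem.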
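To make the gating idea concrete, here is a minimal scalar LSTM cell step in plain Python. It follows the standard textbook LSTM equations rather than anything specific to this application; the weight layout, the function names, and the numeric values are assumptions chosen only for illustration.

```python
import math
from typing import Dict, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x: float, h_prev: float, c_prev: float,
              w: Dict[str, Tuple[float, float, float]]) -> Tuple[float, float]:
    """One step of a scalar LSTM cell.

    w[name] = (input weight, recurrent weight, bias) for each of the
    forget gate "f", input gate "i", output gate "o" and candidate "g".
    """
    def pre(name: str) -> float:
        wx, wh, b = w[name]
        return wx * x + wh * h_prev + b

    f = sigmoid(pre("f"))       # how much of the old cell state to keep
    i = sigmoid(pre("i"))       # how much new information to write
    o = sigmoid(pre("o"))       # how much of the cell state to expose
    g = math.tanh(pre("g"))     # candidate content
    c = f * c_prev + i * g      # the self-loop weight f changes at every step
    h = o * math.tanh(c)
    return h, c


weights = {name: (0.5, 0.5, 0.0) for name in ("f", "i", "o", "g")}  # arbitrary values
h, c = 0.0, 0.0
for x_t in (1.0, 0.5, -0.5):
    h, c = lstm_step(x_t, h, c, weights)
```

The line `c = f * c_prev + i * g` is where the variable self-loop weight appears: when the forget gate is near 1 and the input gate near 0, the cell state passes through a step almost unchanged.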
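In the embodiment of this application, pedestrians are important traffic participants, and predicting their behavioral intentions and future positions is very important for an intelligent vehicle's assessment of risks ahead. Specifically, the terminal device can determine, from the obtained posture information of the target object, which behavioral intention the corresponding pedestrian has. The behavioral intention can include at least one of the following: changing from a moving state to a stationary state (stopping), remaining in a moving state (crossing), changing from a stationary state to a moving state (starting), and changing from a first moving state to a second moving state (bending). The first moving state can be understood as moving within a certain area (for example, on the sidewalk near the edge of the lane), and the second moving state can be understood as entering the lane area (for example, crossing the lane). Each behavioral intention is described in detail below:
1. Changing from a moving state to a stationary state (stopping):
In the embodiment of this application, the pedestrian's behavioral intention can be to change from a moving state to a stationary state. The moving state can be understood as the pedestrian moving from the sidewalk to the edge of the lane, and the stationary state as the pedestrian stopping on reaching the edge of the lane. Refer to FIG. 3, which is a schematic diagram of pedestrian behavioral intentions provided by an embodiment of this application. As shown in FIG. 3, for the stopping intention, the pedestrian walks toward the edge of the lane and stops moving on reaching the edge of the road.
2. Remaining in a moving state (crossing):
In the embodiment of this application, the pedestrian's behavioral intention can be to remain in a moving state (crossing). The moving state can be understood as the pedestrian continuing to move, for example moving from the sidewalk to the edge of the lane and continuing on into the lane area. As shown in FIG. 3, for the crossing intention, the pedestrian moves from the sidewalk into the lane and crosses it.
3. Changing from a stationary state to a moving state (starting):
In the embodiment of this application, the pedestrian's behavioral intention can be to change from a stationary state to a moving state (starting). The stationary state can be understood as the pedestrian standing still in a certain area (for example, at the edge of the lane on the sidewalk side), and the moving state as the pedestrian starting to move from that area (for example, walking into the lane area). As shown in FIG. 3, for the starting intention, the pedestrian starts walking from the edge of the lane and crosses the lane.
4. Changing from a first moving state to a second moving state (bending):
In the embodiment of this application, the pedestrian's behavioral intention can be to change from a first moving state to a second moving state (bending). The first moving state can be understood as moving within a certain area (for example, near the edge of the lane), and the second moving state as entering the lane area (for example, walking into the lane area). As shown in FIG. 3, for the bending intention, the pedestrian wanders along the edge of the lane and then crosses the lane.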
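As an example, suppose the behavioral intentions comprise the four intentions of changing from a moving state to a stationary state, remaining in a moving state, changing from a stationary state to a moving state, and changing from a first moving state to a second moving state:
The input to the LSTM network is the preprocessed pedestrian skeleton node information sequence (the posture information). The LSTM network processes the preprocessed skeleton node information sequence to obtain the following skeleton sequence feature vector:
out = (o_1, o_2, o_3, o_4);
where (o_1, o_2, o_3, o_4) are the components that the LSTM network produces from the skeleton node information sequence for the four pedestrian behavioral intentions (changing from a first moving state to a second moving state, bending; remaining in a moving state, crossing; changing from a stationary state to a moving state, starting; and changing from a moving state to a stationary state, stopping). A softmax function converts the skeleton sequence feature vector out into a probability distribution P = (p_1, p_2, p_3, p_4) over the four pedestrian behavioral intentions. For example, the probability distribution P obtained through the softmax function can be expressed by the following formula: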
$$p_i = \frac{e^{o_i}}{\sum_{j=1}^{4} e^{o_j}}$$
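where o_i is the component of the skeleton feature vector out for the i-th intention category, and p_i is the probability that the skeleton sequence feature vector belongs to the i-th intention category. The behavioral intention category with the largest probability in the distribution (p_1, p_2, p_3, p_4) is taken as the behavioral intention of the target object.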
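The softmax step and the selection of the most probable intention can be sketched as follows. The label order (bending, crossing, starting, stopping) follows the numbering used later in this description; the score values are made up, and subtracting the maximum before exponentiating is a common numerical safeguard added by this sketch.

```python
import math
from typing import List, Sequence

INTENTS = ("bending", "crossing", "starting", "stopping")


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw scores into a probability distribution."""
    peak = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_intent(scores: Sequence[float]) -> str:
    """Return the intent label whose probability is largest."""
    probs = softmax(scores)
    return INTENTS[probs.index(max(probs))]


example_scores = [0.2, 1.1, 3.0, -0.5]      # made-up values for (o_1, ..., o_4)
example_probs = softmax(example_scores)
example_intent = predict_intent(example_scores)
```

Since softmax preserves the ordering of the scores, the predicted category is simply the one with the largest component of out.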
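In the embodiment of this application, the terminal device can obtain, based on the posture information, the target information representing the behavioral intention of the target object. The target information can be an identifier representing the behavioral intention of the target object, or other information that uniquely indicates the behavioral intention.
In the embodiment of this application, cross entropy can be used as the loss when training the LSTM network that predicts the pedestrian's behavioral intention, that is, the LSTM network is trained with the following loss function: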
$$H(p, q) = -\sum_{x} p(x) \log q(x)$$
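where p(x) and q(x) represent the true probability distribution and the predicted probability distribution for input x, respectively. The closer the predicted distribution is to the true distribution, the smaller the loss. It should be noted that the above loss function is only an illustration and does not limit this application.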
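A cross-entropy loss between a true and a predicted distribution can be sketched as below. The standard definition is used; the small `eps` term, the function name, and the example distributions are assumptions of the sketch.

```python
import math
from typing import Sequence


def cross_entropy(true_dist: Sequence[float], pred_dist: Sequence[float],
                  eps: float = 1e-12) -> float:
    """H(p, q) = -sum_x p(x) * log q(x); eps guards against log(0)."""
    return -sum(p * math.log(q + eps) for p, q in zip(true_dist, pred_dist))


one_hot_starting = [0.0, 0.0, 1.0, 0.0]      # true label: starting
good_prediction = [0.05, 0.05, 0.85, 0.05]
poor_prediction = [0.40, 0.30, 0.20, 0.10]
loss_good = cross_entropy(one_hot_starting, good_prediction)
loss_poor = cross_entropy(one_hot_starting, poor_prediction)
```

Here `loss_good` is smaller than `loss_poor`, in line with the statement that a prediction closer to the true distribution gives a smaller loss.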
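203. Take the target information and the historical trajectory information as input to a trajectory prediction model to obtain the predicted trajectory of the target object.
In the embodiment of this application, after obtaining the target information representing the behavioral intention of the target object and the historical motion trajectory information, the terminal device can take the target information and the historical trajectory information as input to a trajectory prediction model to obtain the predicted trajectory of the target object.
Specifically, in the embodiment of this application, the terminal device can take the target information and the historical trajectory information as input to a recurrent neural network model to obtain the predicted trajectory of the target object. The following description takes an LSTM network as the recurrent neural network:
The input to the LSTM network is the historical trajectory information and the target information. The encoding layer of the LSTM network converts this input into the encoding layer's output vector out_b, and the decoding layer of the LSTM network maps out_b to the pedestrian's predicted trajectory. The predicted trajectory can include predicted trajectory coordinates; for example, it can be (x_1, y_1, x_2, y_2, …, x_i, y_i, …, x_p, y_p), where (x_i, y_i) are the pedestrian's predicted position coordinates at future moment i and p is the number of future moments for which the trajectory is predicted.
In the embodiment of this application, the mean square error (MSE) can be used as the loss function when training the LSTM network that predicts the pedestrian's trajectory. For example, the mean square error can be expressed by the following function: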
Figure PCTCN2019129820-appb-000005
Figure PCTCN2019129820-appb-000006
The above loss function is only an illustration and does not limit this application.
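Since the equation images are not reproduced in this text, the sketch below uses one common form of the mean square error for trajectories: the squared Euclidean distance between predicted and actual points, averaged over the time steps. This form, the function name, and the sample points are assumptions and may differ from the expression in the original figures.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def trajectory_mse(predicted: Sequence[Point], actual: Sequence[Point]) -> float:
    """Mean squared Euclidean distance between predicted and actual points."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("trajectories must be non-empty and equally long")
    squared = [(px - ax) ** 2 + (py - ay) ** 2
               for (px, py), (ax, ay) in zip(predicted, actual)]
    return sum(squared) / len(squared)


predicted_track = [(0.0, 0.0), (1.0, 0.0)]   # made-up predicted points
actual_track = [(0.0, 0.0), (1.0, 2.0)]      # made-up ground-truth points
mse_value = trajectory_mse(predicted_track, actual_track)
```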
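As an illustration, an example of pedestrian trajectory prediction is described next:
Let the number of historical moments be 32 (2 s) and the number of future moments be 16 (1 s). CNN skeleton fitting is performed on the pedestrian video stream to obtain the pedestrian's skeleton node information sequence X = (X_1, …, X_i, …, X_32), where X_i denotes the positions of the pedestrian's skeleton nodes at moment i, X_i = (x_1, y_1, x_2, y_2, …, x_k, y_k, …, x_12, y_12), and k is the skeleton node index.
Missing pedestrian skeleton information is filled in by interpolation as described in the above embodiment; the pedestrian skeleton information of the sequence is sampled at intervals with t = 2 as described in the above embodiment; and the pedestrian skeleton information sequence is normalized within the sequence as described in the above embodiment.
The preprocessed pedestrian skeleton information sequence is passed through a trained LSTM pedestrian intention recognition model to obtain the predicted pedestrian intention. This model can be a single-layer LSTM model whose input data has dimensions (batch_size, 32, 24), where batch_size is the batch size (it can equal 1 in this embodiment), 32 is the historical time length of the skeleton features, and 24 is the skeleton feature dimension at each moment. On 219 test samples the model's prediction accuracy is 95.43%, and the resulting confusion matrix is shown in FIG. 5, which is a schematic diagram of a confusion matrix provided by an embodiment of this application. The output that the LSTM pedestrian intention prediction network produces from the preprocessed skeleton information sequence is the pedestrian intention probability distribution P = (p_1, p_2, p_3, p_4), and the behavioral intention category with the largest probability in (p_1, p_2, p_3, p_4) is taken as the behavioral intention of the target object. In this embodiment, the behavioral intention category obtained for the skeleton information sequence from the intention recognition model is 3 (bending is denoted 1, crossing 2, starting 3, and stopping 4).
Next, the pedestrian historical trajectory information corresponding to the pedestrian skeleton information sequence undergoes vehicle displacement compensation, trajectory smoothing, and trajectory coordinate transformation, which yields the pedestrian's historical coordinates in the new coordinate system, Y = (x_1, y_1, x_2, y_2, …, x_k, y_k, …, x_32, y_32).
The predicted target information intention, which represents the pedestrian's behavioral intention, and the corresponding preprocessed pedestrian historical trajectory information together form the input of the trained LSTM pedestrian trajectory prediction model. The joint input of the historical trajectory information and the target information intention can take the form (x_1, y_1, intention, …, x_i, y_i, intention, …, x_32, y_32, intention), so the input data has dimensions (batch_size, 32, 3), where batch_size is the batch size (batch_size = 1 in this embodiment), 32 is the historical time length of the trajectory information, and 3 is the data dimension of the coordinates and the intention feature at each moment. The input intention feature information and historical trajectory information pass through the LSTM encoding process to give the encoding layer's output vector out_b, and the decoding layer of the LSTM network maps out_b to the pedestrian's predicted trajectory, which gives the pedestrian's predicted position coordinates at the 16 future moments. These predicted position coordinates can be as shown in Table 1.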
Table 1: Predicted future trajectory coordinates of a pedestrian with the starting intention
Moment   1        2        3        4        13       14       15       16
X (m)    -0.0650  -0.1377  -0.2089  -0.2784  -0.9294  -1.0080  -1.0872  -1.1668
Y (m)    -0.0016  -0.0068  -0.0049  -0.0014  0.0098   0.0144   0.0196   0.0255
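The joint input of shape (batch_size, 32, 3) described before Table 1 can be assembled as in the sketch below, which attaches the intention label to every historical trajectory point. The function name and the made-up history values are assumptions; only the layout (x_i, y_i, intention) per moment is taken from the text.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def build_joint_input(history: Sequence[Point], intention: int) -> List[List[float]]:
    """Attach the intention label to every time step: rows of [x_i, y_i, intention]."""
    return [[x, y, float(intention)] for x, y in history]


history = [(-0.01 * i, 0.0) for i in range(32)]         # made-up 32-step history
joint_input = build_joint_input(history, intention=3)   # 3 denotes starting
batch = [joint_input]                                   # shape (batch_size=1, 32, 3)
```

Repeating the same label at every moment lets a sequence model see the intention alongside each coordinate pair, which is one straightforward way to realise the joint input form given above.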
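In this embodiment, the mean absolute error (MAE) can be used to measure the prediction performance of the trajectory prediction model. Specifically, with the number of test samples M set to 1, the MAE for the 16 moments is as shown in Table 2.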
Table 2: Mean absolute error of pedestrian trajectory prediction
Moment    1       2       3       4       13      14      15      16
MAE (m)   0.0158  0.0496  0.0899  0.1133  0.1398  0.1416  0.1599  0.1796
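A per-moment error of the kind reported in Table 2 can be computed as sketched below. The text does not give the exact MAE formula, so this sketch assumes the error at each moment is the Euclidean distance between predicted and actual positions, averaged over the M test samples; the function name and sample values are also assumptions.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def per_step_mae(predicted: Sequence[Sequence[Point]],
                 actual: Sequence[Sequence[Point]]) -> List[float]:
    """For each future time step, average the position error over all samples."""
    steps = len(predicted[0])
    errors = []
    for t in range(steps):
        dists = [math.hypot(p[t][0] - a[t][0], p[t][1] - a[t][1])
                 for p, a in zip(predicted, actual)]
        errors.append(sum(dists) / len(dists))
    return errors


pred_samples = [[(0.0, 0.0), (3.0, 4.0)]]   # M = 1 sample, 2 future steps (made up)
true_samples = [[(0.0, 0.0), (0.0, 0.0)]]
mae_by_step = per_step_mae(pred_samples, true_samples)
```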
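An embodiment of this application provides a trajectory prediction method, including: obtaining posture information and historical motion trajectory information of a target object; obtaining, based on the posture information, target information representing the behavioral intention of the target object; and taking the target information and the historical trajectory information as input to a trajectory prediction model to obtain a predicted trajectory of the target object. Using the pedestrian's behavioral intention as one of the input features for pedestrian trajectory prediction in this way can reduce the prediction error and improve the prediction performance for pedestrian trajectories. It addresses the inaccurate prediction of pedestrian motion trajectories in the prior art and thereby improves the safety of vehicle control.
The foregoing mainly describes the solutions provided by the embodiments of the present invention from the perspective of the terminal device. It can be understood that, to implement the above functions, the terminal device includes hardware structures and/or software modules for performing each function. In combination with the units and algorithm steps of the examples described in the embodiments disclosed in the present invention, the embodiments of the present invention can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be considered to go beyond the scope of the technical solutions of the embodiments of the present invention.
In the embodiments of the present invention, the terminal device can be divided into functional units according to the above method examples. For example, each functional unit can correspond to one function, or two or more functions can be integrated into one processing unit. The integrated unit can be implemented in the form of hardware or in the form of a software functional unit. It should be noted that the division of units in the embodiments of the present invention is schematic and is merely a division by logical function; other divisions are possible in actual implementation.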
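Where an integrated unit is used, FIG. 6 shows a schematic diagram of a possible structure of the terminal device involved in the above embodiments. The terminal device 600 includes: an acquisition module 601, configured to obtain posture information and historical motion trajectory information of a target object, and to obtain, based on the posture information, target information representing the behavioral intention of the target object; and a prediction module 602, configured to take the target information and the historical trajectory information as input to a trajectory prediction model to obtain a predicted trajectory of the target object.
Optionally, the behavioral intention includes at least one of the following: changing from a moving state to a stationary state, remaining in a moving state, changing from a stationary state to a moving state, and changing from a first moving state to a second moving state.
Optionally, the target object includes multiple skeleton nodes, and the posture information includes the skeleton node position of each skeleton node at multiple moments.
Optionally, the target object includes a first skeleton node, and the posture information includes a first skeleton node position, a second skeleton node position, and a third skeleton node position; the first skeleton node position is the position of the first skeleton node of the target object at a first moment, the third skeleton node position is the position of the first skeleton node of the target object at a third moment, and the second skeleton node position corresponds to the position of the first skeleton node of the target object at a second moment; the first moment, the second moment, and the third moment are successively adjacent in the time dimension; and the second skeleton node position is related to the first skeleton node position and/or the third skeleton node position.
Optionally, the second skeleton node position is the center position between the first skeleton node position and the third skeleton node position.
Optionally, the terminal device 600 further includes a sampling module, configured to sample the skeleton node positions of each skeleton node at the multiple moments at intervals in time to obtain processed posture information, where the processed posture information is used to obtain the target information representing the behavioral intention of the target object.
Optionally, the terminal device 600 further includes a normalization module, configured to normalize the skeleton node positions of each skeleton node at the multiple moments to obtain normalized posture information, where the normalized posture information is used to obtain the target information representing the behavioral intention of the target object.
Optionally, the acquisition module 601 is specifically configured to take the posture information as input to a long short-term memory (LSTM) network to obtain the target information representing the behavioral intention of the target object.
Optionally, the historical trajectory information includes historical trajectory point positions at multiple moments, with one historical trajectory point position per moment; the historical trajectory point positions at the multiple moments are expressed in a preset coordinate system and include a first trajectory point position, which is the trajectory point position at the initial moment among the multiple moments and is located at a preset position in the preset coordinate system.
Optionally, the historical trajectory point positions are positions relative to the ground.
Optionally, the prediction module 602 is specifically configured to take the target information and the historical trajectory information as input to a recurrent neural network model to obtain an output vector, and to map the output vector to the predicted trajectory of the target object.
Optionally, the terminal device 600 may further include a storage unit for storing the program code and data of the terminal device 600.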
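In the embodiment of this application, the prediction module 602 can be integrated in a processing module, where the processing module can be a processor or a controller, for example a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It can implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the disclosure of the present invention. The processor can also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
Refer to FIG. 7, which is a schematic structural diagram of another terminal device 700 provided by an embodiment of this application.
As shown in FIG. 7, the terminal device 700 includes a processor 712, a communication interface 713, and a memory 77. Optionally, the terminal device 700 may further include a bus 714. The communication interface 713, the processor 712, and the memory 77 can be connected to one another through the bus 714; the bus 714 can be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The bus 714 can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in FIG. 7, but this does not mean that there is only one bus or one type of bus.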
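The processor 712 can perform the following steps:
obtaining posture information and historical motion trajectory information of a target object;
obtaining, based on the posture information, target information representing the behavioral intention of the target object; and
taking the target information and the historical trajectory information as input to a trajectory prediction model to obtain a predicted trajectory of the target object.
Optionally, the behavioral intention includes at least one of the following:
changing from a moving state to a stationary state, remaining in a moving state, changing from a stationary state to a moving state, and changing from a first moving state to a second moving state.
Optionally, the target object includes multiple skeleton nodes, and the posture information includes the skeleton node position of each skeleton node at multiple moments.
Optionally, the target object includes a first skeleton node, and the posture information includes a first skeleton node position, a second skeleton node position, and a third skeleton node position; the first skeleton node position is the position of the first skeleton node of the target object at a first moment, the third skeleton node position is the position of the first skeleton node of the target object at a third moment, and the second skeleton node position corresponds to the position of the first skeleton node of the target object at a second moment; the first moment, the second moment, and the third moment are successively adjacent in the time dimension; and the second skeleton node position is related to the first skeleton node position and/or the third skeleton node position.
Optionally, the second skeleton node position is the center position between the first skeleton node position and the third skeleton node position.
Optionally, the processor 712 can perform the following step:
sampling the skeleton node positions of each skeleton node at the multiple moments at intervals in time to obtain processed posture information, where the processed posture information is used to obtain the target information representing the behavioral intention of the target object.
Optionally, the processor 712 can perform the following step:
normalizing the skeleton node positions of each skeleton node at the multiple moments to obtain normalized posture information, where the normalized posture information is used to obtain the target information representing the behavioral intention of the target object.
Optionally, the processor 712 can perform the following step:
taking the posture information as input to a recurrent neural network model to obtain the target information representing the behavioral intention of the target object.
Optionally, the historical trajectory information includes historical trajectory point positions at multiple moments, with one historical trajectory point position per moment; the historical trajectory point positions at the multiple moments are expressed in a preset coordinate system and include a first trajectory point position, which is the trajectory point position at the initial moment among the multiple moments and is located at a preset position in the preset coordinate system.
Optionally, the historical trajectory point positions are positions relative to the ground.
Optionally, the processor 712 can perform the following steps:
taking the target information and the historical trajectory information as input to a recurrent neural network model to obtain an output vector; and
mapping the output vector to the predicted trajectory of the target object.
For the specific implementation of the terminal device shown in FIG. 7, reference can also be made to the corresponding descriptions of the foregoing embodiments, which are not repeated here.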
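Next, an execution device provided by an embodiment of this application is introduced. Refer to FIG. 8, which is a schematic structural diagram of an execution device provided by an embodiment of this application. The terminal device described in the embodiment corresponding to FIG. 6 or FIG. 7 can be deployed on the execution device 800 to implement the functions of the terminal device in those embodiments. Specifically, the execution device 800 includes a receiver 801, a transmitter 802, a processor 803, and a memory 804 (the execution device 800 can have one or more processors 803; FIG. 8 takes one processor as an example), where the processor 803 can include an application processor 8031 and a communication processor 8032. In some embodiments of this application, the receiver 801, the transmitter 802, the processor 803, and the memory 804 can be connected by a bus or in other ways.
The memory 804 can include read-only memory and random access memory, and provides instructions and data to the processor 803. Part of the memory 804 can also include non-volatile random access memory (NVRAM). The memory 804 stores operation instructions for the processor, executable modules, or data structures, or a subset or an extended set of them, where the operation instructions can include various operation instructions for implementing various operations.
The processor 803 controls the operation of the execution device. In a specific application, the components of the execution device are coupled together through a bus system, which can include a power bus, a control bus, a status signal bus, and so on in addition to a data bus. For clarity, however, the various buses are all referred to as the bus system in the figure.
The method disclosed in the above embodiments of this application can be applied to the processor 803 or implemented by the processor 803. The processor 803 can be an integrated circuit chip with signal processing capability. During implementation, the steps of the above method can be completed by integrated logic circuits of hardware in the processor 803 or by instructions in the form of software. The processor 803 can be a general-purpose processor, a digital signal processor (DSP), a microprocessor, or a microcontroller, and can further include an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor 803 can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of this application. The general-purpose processor can be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of this application can be directly embodied as being executed and completed by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module can be located in a storage medium that is well established in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or a register. The storage medium is located in the memory 804; the processor 803 reads the information in the memory 804 and completes the steps of the above method in combination with its hardware.
The receiver 801 can be used to receive input digital or character information and to generate signal input related to the settings and function control of the execution device. The transmitter 802 can be used to output digital or character information through a first interface; the transmitter 802 can also be used to send instructions to a disk group through the first interface to modify the data in the disk group.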
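An embodiment of this application also provides a computer program product which, when run on a computer, causes the computer to perform the steps performed by the terminal device in the method described in the embodiment shown in FIG. 6 or FIG. 7.
An embodiment of this application also provides a computer-readable storage medium that stores a program for signal processing; when the program runs on a computer, it causes the computer to perform the steps performed by the terminal device in the method described in the embodiment shown in FIG. 6 or FIG. 7.
The execution device or terminal device provided by the embodiments of this application can specifically be a chip. The chip includes a processing unit and a communication unit; the processing unit can be, for example, a processor, and the communication unit can be, for example, an input/output interface, a pin, or a circuit. The processing unit can execute computer-executable instructions stored in a storage unit, so that the chip in the execution device performs the trajectory prediction method described in the embodiment shown in FIG. 2. Optionally, the storage unit is a storage unit inside the chip, such as a register or a cache; the storage unit can also be a storage unit located outside the chip in the wireless access device, such as a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM).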
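Specifically, refer to FIG. 9, which is a schematic structural diagram of a chip provided by an embodiment of this application. The chip can be represented as a neural network processor NPU 900. The NPU 900 is mounted on the main CPU (Host CPU) as a coprocessor, and the Host CPU assigns tasks to it. The core part of the NPU is the arithmetic circuit 903; the controller 904 controls the arithmetic circuit 903 to fetch matrix data from memory and perform multiplication.
In some implementations, the arithmetic circuit 903 contains multiple processing units (Process Engine, PE). In some implementations, the arithmetic circuit 903 is a two-dimensional systolic array. The arithmetic circuit 903 can also be a one-dimensional systolic array or another electronic circuit capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuit 903 is a general-purpose matrix processor.
For example, suppose there is an input matrix A, a weight matrix B, and an output matrix C. The arithmetic circuit fetches the data corresponding to matrix B from the weight memory 902 and caches it on each PE in the arithmetic circuit. The arithmetic circuit then fetches the matrix A data from the input memory 901, performs matrix operations with matrix B, and stores the partial or final results of the resulting matrix in an accumulator 908.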
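The role of the accumulator in computing C = A x B can be illustrated in software with the sketch below, which adds one partial product per step into an accumulator matrix. This is only a plain-Python analogy of accumulating partial results; it does not model the systolic array, the PEs, or the memories of the NPU, and the function name and sample matrices are assumptions.

```python
from typing import List

Matrix = List[List[float]]


def matmul_accumulate(a: Matrix, b: Matrix) -> Matrix:
    """Compute C = A x B by accumulating one rank-1 partial result per step of k."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions of A and B do not match")
    accumulator = [[0.0] * cols for _ in range(rows)]   # holds partial results
    for k in range(inner):                              # one slice of A and B at a time
        for r in range(rows):
            for c in range(cols):
                accumulator[r][c] += a[r][k] * b[k][c]
    return accumulator


input_a = [[1.0, 2.0], [3.0, 4.0]]     # stands in for input matrix A
weight_b = [[5.0, 6.0], [7.0, 8.0]]    # stands in for weight matrix B
output_c = matmul_accumulate(input_a, weight_b)
```

After the first pass over k the accumulator holds only partial results; it holds the final matrix C once all slices have been added.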
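The unified memory 906 is used to store input data and output data. The weight data is transferred directly to the weight memory 902 through the direct memory access controller (DMAC) 905. The input data is also transferred to the unified memory 906 through the DMAC.
BIU stands for Bus Interface Unit, that is, the bus interface unit 910, which is used for interaction between the AXI bus and both the DMAC and the instruction fetch buffer (IFB) 909.
The bus interface unit 910 (BIU) is used by the instruction fetch buffer 909 to obtain instructions from an external memory, and by the DMAC 905 to obtain the original data of the input matrix A or the weight matrix B from the external memory.
The DMAC is mainly used to transfer input data from the external memory (DDR) to the unified memory 906, to transfer weight data to the weight memory 902, or to transfer input data to the input memory 901.
The vector calculation unit 907 includes multiple arithmetic processing units and, where necessary, further processes the output of the arithmetic circuit, for example with vector multiplication, vector addition, exponential operations, logarithmic operations, and magnitude comparison. It is mainly used for computation in the non-convolutional/fully connected layers of a neural network, such as batch normalization, pixel-level summation, and upsampling of feature planes.
In some implementations, the vector calculation unit 907 can store the processed output vector in the unified memory 906. For example, the vector calculation unit 907 can apply a linear function and/or a nonlinear function to the output of the arithmetic circuit 903, for example linearly interpolating the feature plane extracted by a convolutional layer, or applying the function to a vector of accumulated values, to generate activation values. In some implementations, the vector calculation unit 907 generates normalized values, pixel-level summed values, or both. In some implementations, the processed output vector can be used as an activation input to the arithmetic circuit 903, for example for use in a subsequent layer of the neural network.
The instruction fetch buffer 909 connected to the controller 904 is used to store instructions used by the controller 904.
The unified memory 906, the input memory 901, the weight memory 902, and the instruction fetch buffer 909 are all on-chip memories. The external memory is private to the NPU hardware architecture.
The processor mentioned in any of the above can be a general-purpose central processing unit, a microprocessor, an ASIC, or one or more integrated circuits used to control the execution of the program of the method in the first aspect.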
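It should also be noted that the device embodiments described above are only illustrative. The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they can be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments. In addition, in the drawings of the device embodiments provided by this application, a connection between modules indicates that they have a communication connection, which can be implemented as one or more communication buses or signal lines.
From the description of the above implementations, a person skilled in the art can clearly understand that this application can be implemented by software plus the necessary general-purpose hardware, and of course also by dedicated hardware, including dedicated integrated circuits, dedicated CPUs, dedicated memory, and dedicated components. In general, any function performed by a computer program can easily be implemented with corresponding hardware, and the specific hardware structures used to implement the same function can be diverse, such as analog circuits, digital circuits, or dedicated circuits. For this application, however, a software implementation is the better choice in most cases. On this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, USB flash drive, removable hard disk, ROM, RAM, magnetic disk, or optical disc, and includes several instructions that cause a computer device (which can be a personal computer, a training device, a network device, or the like) to perform the methods described in the embodiments of this application.
The above embodiments can be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When software is used, they can be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are produced in whole or in part. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions can be transmitted from one website, computer, training device, or data center to another website, computer, training device, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave). The computer-readable storage medium can be any usable medium that a computer can store data on, or a data storage device, such as a training device or a data center, that integrates one or more usable media. The usable medium can be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).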

Claims (24)

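  1. A trajectory prediction method, characterized in that the method comprises:
    obtaining posture information and historical motion trajectory information of a target object;
    obtaining, based on the posture information, target information representing a behavioral intention of the target object;
    using the target information and the historical trajectory information as input, obtaining a predicted trajectory of the target object through a trajectory prediction model.
  2. The method according to claim 1, characterized in that the behavioral intention comprises at least one of the following:
    changing from a moving state to a stationary state, remaining in a moving state, changing from a stationary state to a moving state, or changing from a first moving state to a second moving state.
  3. The method according to claim 1 or 2, characterized in that the target object comprises a plurality of skeleton nodes, and the posture information comprises a skeleton node position of each skeleton node at a plurality of moments.
  4. The method according to claim 3, characterized in that the target object comprises a first skeleton node; the posture information comprises a first skeleton node position, a second skeleton node position, and a third skeleton node position; the first skeleton node position is the position of the first skeleton node of the target object at a first moment; the third skeleton node position is the position of the first skeleton node of the target object at a third moment; the second skeleton node position corresponds to the position of the first skeleton node of the target object at a second moment; the first moment, the second moment, and the third moment are successively adjacent moments in the time dimension; and the second skeleton node position is related to the first skeleton node position and/or the third skeleton node position.
  5. The method according to claim 4, characterized in that the second skeleton node position is the center position between the first skeleton node position and the third skeleton node position.
  6. The method according to claim 3 or 4, characterized in that the method further comprises:
    performing interval sampling over the moments on the skeleton node positions of each skeleton node at the plurality of moments to obtain processed posture information, wherein the processed posture information is used to obtain the target information representing the behavioral intention of the target object.
  7. The method according to any one of claims 3 to 5, characterized in that the method further comprises:
    normalizing the skeleton node positions of each skeleton node at the plurality of moments to obtain normalized posture information, wherein the normalized posture information is used to obtain the target information representing the behavioral intention of the target object.
  8. The method according to any one of claims 1 to 7, characterized in that the obtaining, based on the posture information, target information representing a behavioral intention of the target object comprises:
    using the posture information as input, obtaining, through a recurrent neural network model, the target information representing the behavioral intention of the target object.
  9. The method according to any one of claims 1 to 8, characterized in that the historical trajectory information comprises historical trajectory point positions at a plurality of moments, each moment corresponding to one historical trajectory point position, wherein the historical trajectory point positions at the plurality of moments are expressed in a preset coordinate system and comprise a first trajectory point position, the first trajectory point position being located at a preset position in the preset coordinate system.
  10. The method according to claim 9, characterized in that the historical trajectory point positions are positions relative to the ground.
  11. The method according to any one of claims 1 to 10, characterized in that the using the target information and the historical trajectory information as input, obtaining a predicted trajectory of the target object through a trajectory prediction model comprises:
    using the target information and the historical trajectory information as input, obtaining the predicted trajectory of the target object through a recurrent neural network model.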
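  12. An execution device, characterized by comprising:
    an obtaining module, configured to obtain posture information and historical motion trajectory information of a target object;
    and to obtain, based on the posture information, target information representing a behavioral intention of the target object;
    a prediction module, configured to use the target information and the historical trajectory information as input and obtain a predicted trajectory of the target object through a trajectory prediction model.
  13. The execution device according to claim 12, characterized in that the behavioral intention comprises at least one of the following:
    changing from a moving state to a stationary state, remaining in a moving state, changing from a stationary state to a moving state, or changing from a first moving state to a second moving state.
  14. The execution device according to claim 12 or 13, characterized in that the target object comprises a plurality of skeleton nodes, and the posture information comprises a skeleton node position of each skeleton node at a plurality of moments.
  15. The execution device according to claim 14, characterized in that the target object comprises a first skeleton node; the posture information comprises a first skeleton node position, a second skeleton node position, and a third skeleton node position; the first skeleton node position is the position of the first skeleton node of the target object at a first moment; the third skeleton node position is the position of the first skeleton node of the target object at a third moment; the second skeleton node position corresponds to the position of the first skeleton node of the target object at a second moment; the first moment, the second moment, and the third moment are successively adjacent moments in the time dimension; and the second skeleton node position is related to the first skeleton node position and/or the third skeleton node position.
  16. The execution device according to claim 15, characterized in that the second skeleton node position is the center position between the first skeleton node position and the third skeleton node position.
  17. The execution device according to claim 14 or 15, characterized by further comprising:
    a sampling module, configured to perform interval sampling over the moments on the skeleton node positions of each skeleton node at the plurality of moments to obtain processed posture information, wherein the processed posture information is used to obtain the target information representing the behavioral intention of the target object.
  18. The execution device according to any one of claims 14 to 16, characterized by further comprising:
    a normalization module, configured to normalize the skeleton node positions of each skeleton node at the plurality of moments to obtain normalized posture information, wherein the normalized posture information is used to obtain the target information representing the behavioral intention of the target object.
  19. The execution device according to any one of claims 12 to 18, characterized in that the obtaining module is specifically configured to:
    use the posture information as input and obtain, through a recurrent neural network model, the target information representing the behavioral intention of the target object.
  20. The execution device according to any one of claims 12 to 19, characterized in that the historical trajectory information comprises historical trajectory point positions at a plurality of moments, each moment corresponding to one historical trajectory point position, wherein the historical trajectory point positions at the plurality of moments are expressed in a preset coordinate system and comprise a first trajectory point position, the first trajectory point position being located at a preset position in the preset coordinate system.
  21. The execution device according to claim 20, characterized in that the historical trajectory point positions are positions relative to the ground.
  22. The execution device according to any one of claims 12 to 21, characterized in that the prediction module is specifically configured to:
    use the target information and the historical trajectory information as input and obtain the predicted trajectory of the target object through a recurrent neural network model.
  23. A terminal device, characterized by comprising a memory, a communication interface, and a processor coupled to the memory and the communication interface; the memory is configured to store instructions, the processor is configured to execute the instructions, and the communication interface is configured to communicate with a target vehicle under the control of the processor; wherein the processor, when executing the instructions, performs the method according to any one of claims 1 to 11.
  24. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the method according to any one of claims 1 to 11.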
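The preprocessing and prediction steps recited in claims 1 and 4 to 9 can be sketched in plain Python. This is only an illustration under several assumptions: the claims do not fix a normalization formula, so min-max scaling is assumed; the "first trajectory point" is assumed to be selectable by an index (`anchor_index`) and moved to the preset position by pure translation; and the two models are caller-supplied callables standing in for the recurrent neural network models. All function and parameter names are hypothetical.

```python
def fill_midpoint(positions):
    """Replace a missing (None) position by the midpoint of its two neighbours (claim 5)."""
    filled = list(positions)
    for i in range(1, len(filled) - 1):
        prev_p, next_p = filled[i - 1], filled[i + 1]
        if filled[i] is None and prev_p is not None and next_p is not None:
            filled[i] = tuple((a + b) / 2 for a, b in zip(prev_p, next_p))
    return filled


def interval_sample(positions, step=2):
    """Keep every step-th moment (claim 6)."""
    if step < 1:
        raise ValueError("step must be >= 1")
    return positions[::step]


def min_max_normalize(positions):
    """Assumed normalization (claim 7): scale each coordinate to [0, 1]."""
    dims = range(len(positions[0]))
    lows = [min(p[d] for p in positions) for d in dims]
    highs = [max(p[d] for p in positions) for d in dims]
    return [
        tuple(
            (p[d] - lows[d]) / (highs[d] - lows[d]) if highs[d] > lows[d] else 0.0
            for d in dims
        )
        for p in positions
    ]


def anchor_trajectory(track, anchor_index=0, preset=(0.0, 0.0)):
    """Translate a 2D track so that track[anchor_index] lands on preset (claim 9)."""
    ax, ay = track[anchor_index]
    return [(x - ax + preset[0], y - ay + preset[1]) for x, y in track]


def predict_trajectory(pose, history, intent_model, trajectory_model):
    """Two-stage flow of claim 1; both models are supplied by the caller."""
    target_info = intent_model(pose)
    return trajectory_model(target_info, history)
```

A possible usage, with trivial stand-in models in place of trained networks:

```python
pose = min_max_normalize(fill_midpoint([(0.0, 0.0), None, (2.0, 4.0)]))
history = anchor_trajectory([(3.0, 4.0), (4.0, 6.0)])
predicted = predict_trajectory(
    pose,
    history,
    intent_model=lambda p: "keep_moving",            # placeholder intention label
    trajectory_model=lambda info, h: h + [h[-1]],    # placeholder: repeat last point
)
```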
PCT/CN2019/129820 2019-12-30 2019-12-30 Trajectory prediction method and related device WO2021134169A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
PCT/CN2019/129820 WO2021134169A1 (zh) 2019-12-30 2019-12-30 Trajectory prediction method and related device
EP19958212.3A EP4074563A4 (en) 2019-12-30 2019-12-30 TRAJECTORY PREDICTION METHOD AND ASSOCIATED DEVICE
CN201980064064.9A CN112805730A (zh) 2019-12-30 2019-12-30 Trajectory prediction method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/129820 WO2021134169A1 (zh) 2019-12-30 2019-12-30 Trajectory prediction method and related device

Publications (1)

Publication Number Publication Date
WO2021134169A1 (zh) 2021-07-08

Family

ID=75804002

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/129820 WO2021134169A1 (zh) Trajectory prediction method and related device 2019-12-30 2019-12-30

Country Status (3)

Country Link
EP (1) EP4074563A4 (zh)
CN (1) CN112805730A (zh)
WO (1) WO2021134169A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106428000A (zh) * 2016-09-07 2017-02-22 清华大学 Vehicle speed control device and method
US20190205629A1 (en) * 2018-01-04 2019-07-04 Beijing Kuangshi Technology Co., Ltd. Behavior predicton method, behavior predicton system, and non-transitory recording medium
CN109969172A (zh) * 2017-12-26 2019-07-05 华为技术有限公司 Vehicle control method, device, and computer storage medium
CN110293968A (zh) * 2019-06-18 2019-10-01 百度在线网络技术(北京)有限公司 Control method, apparatus, and device for an autonomous vehicle, and readable storage medium
CN110622226A (zh) * 2017-05-16 2019-12-27 日产自动车株式会社 Motion prediction method and motion prediction device for a driving assistance device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
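JP6388205B2 (ja) * 2014-02-25 2018-09-12 パナソニックIpマネジメント株式会社 Display control program, display control device, and display device
KR102070527B1 (ko) * 2017-06-22 2020-01-28 바이두닷컴 타임즈 테크놀로지(베이징) 컴퍼니 리미티드 Evaluation framework for predicted trajectories in autonomous driving vehicle traffic prediction
CN109029446B (zh) * 2018-06-22 2020-11-20 北京邮电大学 Pedestrian position prediction method, apparatus, and device
CN109523574B (zh) * 2018-12-27 2022-06-24 联想(北京)有限公司 Walking trajectory prediction method and electronic device
CN109635793A (zh) * 2019-01-31 2019-04-16 南京邮电大学 Pedestrian trajectory prediction method for unmanned driving based on a convolutional neural network
CN110223318A (zh) * 2019-04-28 2019-09-10 驭势科技(北京)有限公司 Multi-target trajectory prediction method and apparatus, vehicle-mounted device, and storage medium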

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4074563A4 *

Also Published As

Publication number Publication date
CN112805730A (zh) 2021-05-14
EP4074563A1 (en) 2022-10-19
EP4074563A4 (en) 2022-12-28

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 19958212; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2019958212; Country of ref document: EP; Effective date: 20220713)
NENP Non-entry into the national phase (Ref country code: DE)