CN112734808A - Trajectory prediction method for vulnerable road users in vehicle driving environment - Google Patents

Trajectory prediction method for vulnerable road users in vehicle driving environment

Info

Publication number
CN112734808A
Authority
CN
China
Prior art keywords
vru
prediction
sequence
behavior pattern
steps
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110069140.4A
Other languages
Chinese (zh)
Other versions
CN112734808B (en)
Inventor
游子诺
李克强
熊辉
许庆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN202110069140.4A priority Critical patent/CN112734808B/en
Publication of CN112734808A publication Critical patent/CN112734808A/en
Application granted granted Critical
Publication of CN112734808B publication Critical patent/CN112734808B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W60/00Drive control systems specially adapted for autonomous road vehicles
    • B60W60/001Planning or execution of driving tasks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30241Trajectory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30248Vehicle exterior or interior
    • G06T2207/30252Vehicle exterior; Vicinity of vehicle

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Multimedia (AREA)
  • Automation & Control Theory (AREA)
  • Human Computer Interaction (AREA)
  • Transportation (AREA)
  • Mechanical Engineering (AREA)
  • Traffic Control Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a trajectory prediction method for vulnerable road users (VRUs) in a vehicle driving environment, which comprises the following steps: calculating a semantic vector from the first-N-step VRU image-frame sequence, the first-N-step VRU motion-trajectory sequence and the first-N-step ego-vehicle driving odometer sequence in the training data set; predicting VRU behavior features; generating the prior and posterior behavior-pattern distributions of the VRU by continuous iterative computation with a behavior-pattern prediction network, according to the semantic vector, the subsequent-M-step VRU motion-trajectory sequence and the subsequent-M-step ego-vehicle driving odometer sequence of the training data set, and predicting the VRU motion trajectory by continuous iterative computation with a trajectory prediction network; calculating a behavior-pattern objective function, a trajectory-prediction objective function and a behavior-feature objective function; carrying out supervised learning through back-propagation to obtain a VRU motion-trajectory prediction model that accepts a planned ego-vehicle driving odometer sequence as input; and an online trajectory prediction stage. The method can be used for behavior prediction and safety protection of vulnerable road users in advanced driver-assistance systems and can support decision-making for autonomous vehicles.

Description

Trajectory prediction method for vulnerable road users in vehicle driving environment
Technical Field
The invention relates to the technical fields of computer vision and intelligent vehicles, and in particular to a trajectory prediction method for vulnerable road users in a vehicle driving environment.
Background
Vulnerable road users (VRUs for short) in traffic are the two classes of traffic participants consisting mainly of pedestrians and riders. Predicting the VRU motion trajectory is one of the key technologies of intelligent-vehicle perception, and the predicted trajectory together with an uncertainty estimate of the prediction result can provide a reference for the subsequent planning and decision-making of the intelligent vehicle.
The behavior patterns of VRUs are usually diverse and change dynamically, which poses a great challenge to prediction accuracy; decision systems therefore often adopt a conservative approach to reduce the potential risk to other traffic participants, but this is detrimental to the stability of the surrounding traffic environment and to the riding experience inside the vehicle. In recent years, multi-trajectory prediction methods based on random generative models have attracted attention. These methods assume that the behavior pattern of a target follows some prior probability distribution (e.g., a Gaussian distribution) that can be inferred from observed variables, and they characterize different behavior patterns by enumerating or sampling several values of that distribution, thereby predicting and generating trajectories under the various behavior patterns. Depending on the observed variables, they can be divided into methods that infer by modeling the interaction among similar targets and methods that infer by modeling the static environment in which the target is located. Multi-trajectory prediction accords with the common understanding that individual behavior is diverse, and achieves better prediction accuracy than deterministic prediction (whose output for a given observation is unique), which helps provide a better reference for planning and decision-making. Nevertheless, when applied to VRU motion-trajectory prediction in traffic environments, these methods leave room for improvement, with the following specific problems:
the lack of modeling of causal associations of VRU behavior patterns and human-vehicle interactions: the intelligent vehicle is mainly divided into three systems of perception, decision and control, VRU motion trail prediction belongs to a perception part, a decision system selects a driving strategy and plans actions by means of perception information, and the rationality of selection and planning is related to the accuracy of perception prediction. In a traffic scene, the behavior of the VRU is influenced by various variables such as internal attributes (such as characters and habits) and external attributes (such as human-vehicle interaction and barriers), but because the self-vehicle is the only independently controllable individual in the invention scene, the human-vehicle interaction is the main controllable variable capable of influencing the behavior mode of the VRU, the driving system of the self-vehicle can select different driving strategies, the human-vehicle interaction variable is influenced in the subsequent time, and the behavior of the VRU is further influenced, in other words, the diversity of the VRU behavior is associated with the behavior of vehicles in the future. If the VRU behavior mode after the self vehicle adopts a certain driving strategy can be effectively modeled, the intelligent vehicle decision system can be beneficial to more accurately evaluating the candidate driving strategy.
Although existing random-method trajectory prediction acknowledges the diversity of VRU behavior, it infers the VRU behavior pattern only once from the observed variables and then predicts the trajectory. In a highly dynamic human-vehicle interaction scene, however, a behavior pattern inferred only from observations is likely to change during the prediction period because of human-vehicle interaction. In other words, existing methods establish only an implicit association between the future driving strategy of the vehicle and the VRU behavior pattern and cannot associate them explicitly: the existing prediction result is the sum of the VRU predictions under all possible driving strategies, and no subset of the result distribution corresponds one-to-one to a driving strategy. Prediction based on a single behavior-pattern inference is therefore unsuited to predicting the VRU motion trajectory in a human-vehicle interaction environment. What is needed is a trajectory prediction method that continuously models the human-vehicle interaction during prediction, continuously adjusts the VRU behavior pattern accordingly, and makes more accurate trajectory predictions, so that the behavior prediction of the VRU can be explicitly associated with different candidate driving strategies.
Lack of modeling and measurement of prediction uncertainty: the prediction of VRU behavior and trajectories serves as a reference for the decision system of the intelligent vehicle, and the high safety and reliability demanded of a driving system set a high standard for the reliability of the prediction method. Although randomly generated predictions address diversity, current methods provide no uncertainty measure for each prediction result, so the method's confidence in its predictions cannot be quantified, which works against the high-safety requirements of a driving system.
No suitable data set for supervised training: existing data sets lack ego-vehicle driving odometer sequence annotations, lack VRU annotations and images, or cover only a single scene, and are therefore unsuited to a deep-learning VRU motion-trajectory prediction method that accounts for human-vehicle interaction across multiple scenes.
Lack of the prior knowledge humans use when judging VRU behavior: the traffic environment is complex and rich in prior rule knowledge; for example, a human driver judges VRU behavior from salient visual features such as head orientation, vehicle type and gestures, and the performance of a prediction method that does not incorporate such prior knowledge may suffer.
Disclosure of Invention
It is an object of the present invention to provide a method for trajectory prediction of vulnerable road users in a driving environment of a vehicle that overcomes or at least alleviates at least one of the above-mentioned drawbacks of the prior art.
To achieve the above object, the invention provides a method for predicting the trajectory of vulnerable road users in a vehicle driving environment, the method comprising:
step 1, establishing a VRU data set which is divided into a training data set and a testing data set;
step 2, preprocessing the various data in the VRU data set and passing the preprocessed data to steps 3 and 4;
step 3, an off-line training stage, which specifically comprises:
step 31, calculating a semantic vector from the first-N-step VRU image-frame sequence, the first-N-step VRU motion-trajectory sequence and the first-N-step ego-vehicle driving odometer sequence in the training data set;
step 32, predicting the VRU behavior features from the first-N-step VRU image-frame sequence in the training data set, the VRU behavior features comprising the head orientation angle of the VRU and a vehicle-type probability vector;
step 33, generating the prior and posterior behavior-pattern distributions of the VRU by continuous iterative computation with a behavior-pattern prediction network, according to the semantic vector, the subsequent-M-step VRU motion-trajectory sequence and the subsequent-M-step ego-vehicle driving odometer sequence, and predicting the VRU motion trajectory by continuous iterative computation with a trajectory prediction network;
step 34, calculating a behavior-pattern objective function according to the prior and posterior behavior-pattern distributions output in step 33;
step 35, calculating a trajectory-prediction objective function according to the VRU motion trajectory output in step 33 and the ground-truth VRU motion trajectory of the subsequent M steps;
step 36, calculating a behavior-feature objective function according to the first-N-step VRU behavior features output in step 32 and the first-N-step VRU behavior features in the training data set;
step 37, carrying out supervised learning of the behavior-pattern, trajectory-prediction and behavior-feature objective functions through back propagation, obtaining a VRU motion-trajectory prediction model that accepts a planned ego-vehicle driving odometer sequence as input;
step 4, in the online track prediction stage, the method specifically comprises the following steps:
step 41, calculating a semantic vector from the first-N-step VRU image-frame sequence, the first-N-step VRU motion trajectory and the first-N-step ego-vehicle driving odometer sequence acquired online;
step 42, predicting the VRU motion-trajectory distribution of the future M steps by continuous iterative computation with the VRU motion-trajectory prediction model obtained in step 37, according to the semantic vector output in step 41 and the subsequent-M-step ego-vehicle driving odometer sequence generated under the driving strategy selected by the decision module.
The invention provides a VRU motion-trajectory prediction method based on a conditional variational auto-encoder. When predicting each VRU, the method infers the initial VRU behavior-pattern distribution from the observed VRU local visual features, VRU motion features and ego-vehicle motion features, then predicts the VRU motion trajectory from the observed features and the behavior pattern. During prediction, the method can dynamically update the behavior pattern according to different assumed human-vehicle interaction scenarios; that is, the intelligent vehicle can obtain the prediction result under the corresponding interaction scenario by inputting different planned strategies. The trajectory prediction method of the invention can be used for behavior prediction and safety protection of vulnerable road users (VRUs) in advanced driver-assistance systems, and can also support decision-making for autonomous vehicles.
Drawings
FIG. 1 is a flowchart of an off-line training process in the method according to the embodiment of the present invention.
Fig. 2 is a schematic structural diagram of the VRU behavior feature sequence prediction network in fig. 1.
Fig. 3 is a schematic diagram of posterior behavior pattern distribution generation and trajectory prediction in fig. 1.
FIG. 4 is a flowchart of an on-line test in the method according to the embodiment of the present invention.
FIG. 5 is a schematic diagram of the prior behavior pattern distribution generation and trajectory prediction of FIG. 4.
Fig. 6 is a schematic view of an application scenario of the present invention.
Detailed Description
The invention is described in detail below with reference to the figures and examples.
In order to make the implementation objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be described in more detail below with reference to the accompanying drawings in the embodiments of the present invention.
Specifically, the trajectory prediction method of the embodiment of the invention comprises the following steps:
step 1, establishing a VRU data set.
The VRU data set is obtained by labeling signals sampled in the driving environment by an onboard vision sensor, an inertial measurement unit (IMU) and a positioning sensor (GPS). The temporal frequency of the data set is a set value (for example, 10 frames per second), the data are ordered as a time series, and each step of data comprises the following parts:
the first part is a driving odometer of the self-vehicle, which is obtained by combining the degrees of freedom measured by the IMU sensor with GPS positioning correction, including position, speed and steering angle, and is used for representing the driving characteristics of the self-vehicle.
The second part is the features of the various VRU types in the visual scene, including pedestrians, riders of low-speed two-wheelers (e.g., cyclists) and riders of high-speed two-wheelers (e.g., motorcyclists). These features are obtained by manually labeling the sampled data and include position, image bounding-box coordinates, head orientation angle and vehicle type. The vehicle types are walking, bicycle, motorcycle and electric bicycle.
The third part is the original monocular image semantic information obtained from the vision-sensor image signals; combined with the VRU bounding-box position, it is used mainly to extract and encode the visual features of the VRU.
Specifically, two-dimensional (x-y) coordinates are used for the positions of the VRU and the ego vehicle, and the rotation angle about the axis perpendicular to the road plane is used for the VRU head orientation angle and the ego-vehicle steering angle. The choice of reference frames is explained in detail in the data preprocessing stages of offline training and online testing.
The VRU data set is split into a training data set and a test data set according to the data-collection scene. The cumulative number of time steps of the training data set exceeds 15000; it is used only in the offline training stage, for training the trajectory prediction model (the trainable part in FIG. 1). The cumulative number of time steps of the test data set exceeds 2500; it is used only in the online trajectory prediction stage.
Step 2: each kind of data in the VRU data set is preprocessed according to its type.
Specifically, step 2 comprises:
the length of a data observation time interval window is set to be N, the length of a prediction time interval window is set to be M, and the observation time interval is positioned before the prediction time interval, so that the former N steps and the later M steps are carried out. Then, one training data includes a feature sequence with a length of N + M of the same VRU and the sequence of the driving odometer of the own vehicle in the corresponding time window. Specifically, the training data set in this embodiment may select a first N-step VRU behavior feature sequence, a first N-step VRU image frame sequence, a first N + M-step VRU motion trajectory sequence, and a first N + M-step driving odometer sequence.
Based on the difference in the kind of data in the data set, the corresponding preprocessing is performed as listed below.
The first type is the first-N-step VRU behavior feature sequence, which includes the VRU head-orientation sequence S = {s_1, s_2, …, s_N} and the VRU vehicle type T, where s_N denotes the VRU head orientation angle in the N-th image.
Since the image signal is captured by a camera that moves with the vehicle, and the subsequent prediction model uses only the image signal when predicting the orientation angle, the VRU head orientation angle in the data set must be converted from the world coordinate system to the image coordinate system. Specifically, the reference-frame transformation subtracts the heading angle of the ego vehicle in world coordinates from the VRU head orientation angle in world coordinates; in the image coordinate system, the ego-vehicle heading is 0° at every step. The world reference frame takes the ego-vehicle position at the first data step as its origin; the X direction of the coordinates is the heading direction, the Y direction is perpendicular to the heading, and angles are measured clockwise with the heading direction as 0°. The VRU vehicle type T is encoded as a one-hot vector with exactly one entry equal to 1 and the rest 0.
The second type is the first-N-step VRU image-frame sequence.

First, the bounding-box position of the VRU at each step is used to crop the VRU image frames from the original monocular images.

Then, the VRU image frames undergo YUV color coding, histogram equalization and similar processing, and are uniformly scaled to a predetermined size, yielding a VRU image-frame sequence B = {b_1, b_2, …, b_N} that can be fed to the subsequent convolutional neural network (CNN), where b_N denotes the VRU image frame at step N. The predetermined size may, for example, be (224, 224, 3), the first dimension (224) denoting width, the second (224) height, and the third (3) the number of image channels.
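For illustration, this preprocessing (together with the angle-frame conversion of the first data type) can be sketched as below. This is a hedged sketch, not code from the patent: the OpenCV calls, the BGR input format and all helper names are assumptions.

```python
import cv2
import numpy as np

TARGET_SIZE = (224, 224)  # (width, height) expected by the CNN encoder

def preprocess_vru_crop(frame_bgr, box):
    """Crop the VRU with its bounding box, convert to YUV, equalize the
    luminance histogram, and scale to the predetermined size."""
    x1, y1, x2, y2 = box
    crop = frame_bgr[y1:y2, x1:x2]
    yuv = cv2.cvtColor(crop, cv2.COLOR_BGR2YUV)
    yuv[..., 0] = cv2.equalizeHist(yuv[..., 0])   # equalize the Y channel only
    return cv2.resize(yuv, TARGET_SIZE)           # shape (224, 224, 3)

def head_angle_to_image_frame(s_world, ego_heading_world):
    """World-frame head angle minus ego heading, as described above;
    in the image frame the ego heading is always 0 degrees."""
    return (s_world - ego_heading_world) % (2 * np.pi)

def one_hot_vehicle_type(idx, num_types=4):
    """One-hot vehicle type T: walking / bicycle / motorcycle / e-bicycle."""
    return np.eye(num_types, dtype=np.float32)[idx]
```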
The third type is the N+M-step VRU motion-trajectory sequence, given as a 2D coordinate sequence P = {p_1, p_2, …, p_{N+M}} = {(x_1, y_1), (x_2, y_2), …, (x_{N+M}, y_{N+M})}, where p_{N+M} denotes the position of the VRU in the world coordinate system at step N+M, and x_{N+M} and y_{N+M} denote its X- and Y-axis coordinates at that step.
The fourth type is the N+M-step driving odometer sequence of the ego vehicle, which includes its trajectory, velocity and steering angle and is written as:

    O = {o_1, o_2, …, o_{N+M}} = {(x^e_1, y^e_1, v^e_1, θ^e_1), …, (x^e_{N+M}, y^e_{N+M}, v^e_{N+M}, θ^e_{N+M})}

where x^e_1 denotes the X-axis coordinate of the ego vehicle in the world coordinate system at step 1, y^e_1 its Y-axis coordinate at step 1, v^e_1 its velocity along the X axis of the world coordinate system at step 1, and θ^e_1 its steering angle in the world coordinate system at step 1; likewise x^e_{N+M}, y^e_{N+M}, v^e_{N+M} and θ^e_{N+M} denote the corresponding quantities at step N+M. The superscript marks ego-vehicle features, distinguishing them from the features of the VRU.
Overall, since the invention involves many feature types in both the observation and prediction phases, to express the conditional probability distributions more simply, F = {f_1, f_2, …, f_{N+M}} denotes the sequence of all features in the data set, where f_i denotes all known observed features and predicted features corresponding to step i.
Step 3, an off-line training stage, which specifically comprises:
and step 31, calculating a semantic vector according to the first N steps of VRU image frame sequence, the first N steps of VRU motion track sequence and the first N steps of self-driving odometer sequence in the training data set.
Specifically, step 31 includes:
Step 311: the first type of data, i.e. the preprocessed first-N-step VRU image-frame sequence B = {b_1, b_2, …, b_N}, is taken as input and encoded by the visual feature encoder (Visual-Encoder) to obtain a visual feature vector sequence. Specifically, the Visual-Encoder comprises a convolutional neural network (CNN) and a visual timing encoding network LSTM_vis (an LSTM recurrent neural network). The CNN converts the image signals into the visual feature vector sequence Box_Feature = {bf_1, bf_2, …, bf_N}; LSTM_vis then computes the visual timing features of this sequence, and the hidden state of its last step is output as the visual timing feature vector.

For example, as shown in FIG. 2, VGG16 is a convolutional neural network commonly used for image processing, supporting an input size of (224, 224, 3). The original model consists of 5 blocks followed by 3 fully connected layers, each block comprising 2-4 convolutional layers and 1 pooling layer. The invention uses only the 5 blocks and the first 2 fully connected layers of VGG16 (with 4096 and 4096 neurons, respectively); the VRU image-frame sequence B = {b_1, b_2, …, b_N} is passed through VGG16 to compute the visual feature vector sequence Box_Feature = {bf_1, bf_2, …, bf_N} of dimension 4096, where bf_N denotes the visual feature vector of the N-th VRU image frame.

LSTM_vis has a cell dimension of 512; at each computation step its input is a 4096-dimensional visual feature vector (encoded by VGG16) and its output is a 512-dimensional hidden state. As shown in equations (1) and (2), the hidden state of the last step (t = N) is taken as the visual timing feature vector c_vis:

    h_vis(t) = LSTM_vis(h_vis(t-1), bf_t)    (1)
    c_vis = h_vis(N)    (2)

where h_vis(t) denotes the hidden state output by the visual timing encoding network LSTM_vis at step t, h_vis(t-1) its hidden state at step t-1, and bf_t the VRU visual feature vector at step t.
Step 312: the second type of data, i.e. the preprocessed first-N-step VRU motion-trajectory sequence, is encoded by the VRU motion-trajectory timing feature encoder LSTM_traj (an LSTM recurrent neural network) to extract the timing features of the sequence. LSTM_traj has an output dimension of 64; at each computation step its input is the 2-dimensional position vector p_t of the trajectory and its output is a 64-dimensional hidden state. As shown in equations (3) and (4), the hidden state of the last step (t = N) is taken as the VRU motion-trajectory timing feature vector c_traj:

    h_traj(t) = LSTM_traj(h_traj(t-1), p_t)    (3)
    c_traj = h_traj(N)    (4)

where h_traj(t) denotes the hidden state output by LSTM_traj at step t, h_traj(t-1) its hidden state at step t-1, and p_t the VRU trajectory position vector at step t.
Step 313: the third type of data, the first-N-step ego-vehicle driving odometer sequence representing the driving features, is encoded by the ego-vehicle driving timing feature encoder LSTM_odom (an LSTM recurrent neural network) to extract the timing features of the sequence. LSTM_odom has an output dimension of 64; at each computation step its input is the 4-dimensional odometer vector o_t and its output is a 64-dimensional hidden state. As shown in equations (5) and (6), the hidden state of the last step (t = N) is taken as the ego-vehicle driving timing feature vector c_odom:

    h_odom(t) = LSTM_odom(h_odom(t-1), o_t)    (5)
    c_odom = h_odom(N)    (6)

where h_odom(t) denotes the hidden state output by LSTM_odom at step t, h_odom(t-1) its hidden state at step t-1, and o_t the ego-vehicle odometer vector at step t.
Step 314: the semantic vector is obtained from the visual timing feature vector, the VRU motion-trajectory timing feature vector and the ego-vehicle driving timing feature vector. Concatenating c_vis, c_traj and c_odom (of dimensions 512, 64 and 64, respectively), obtained by encoding the observed part of the training data, yields the semantic vector C shown in equation (7). C has dimension 640 and jointly represents the VRU visual timing features, the VRU motion-trajectory features and the ego-vehicle driving odometer features of the first N steps:

    C = concat(c_vis, c_traj, c_odom)    (7)
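For illustration, the encoder pipeline of steps 311 to 314 can be sketched in PyTorch as below. This is a sketch under stated assumptions: the use of torchvision's VGG16, the batch-first tensor shapes and all class and variable names are illustrative, not taken from the patent.

```python
import torch
import torch.nn as nn
import torchvision

class SemanticEncoder(nn.Module):
    """Steps 311-314: truncated VGG16 + LSTM_vis over image crops, LSTM_traj
    over 2-D positions, LSTM_odom over 4-D odometry; the three final hidden
    states are concatenated into the 640-D semantic vector C."""
    def __init__(self):
        super().__init__()
        vgg = torchvision.models.vgg16()
        # 5 conv blocks plus the first two fully connected layers (4096, 4096)
        self.cnn = nn.Sequential(vgg.features, vgg.avgpool, nn.Flatten(),
                                 *list(vgg.classifier.children())[:5])
        self.lstm_vis = nn.LSTM(4096, 512, batch_first=True)
        self.lstm_traj = nn.LSTM(2, 64, batch_first=True)
        self.lstm_odom = nn.LSTM(4, 64, batch_first=True)

    def forward(self, frames, traj, odom):
        # frames: (B, N, 3, 224, 224); traj: (B, N, 2); odom: (B, N, 4)
        B, N = frames.shape[:2]
        bf = self.cnn(frames.flatten(0, 1)).view(B, N, 4096)  # Box_Feature
        _, (h_vis, _) = self.lstm_vis(bf)      # last hidden, eq. (1)-(2)
        _, (h_traj, _) = self.lstm_traj(traj)  # eq. (3)-(4)
        _, (h_odom, _) = self.lstm_odom(odom)  # eq. (5)-(6)
        return torch.cat([h_vis[-1], h_traj[-1], h_odom[-1]], dim=-1)  # 640-D C
```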
It is again emphasized that, in the offline training stage, the ground-truth odometer values of the M steps after the observation window are also used as input in the subsequent prediction; in the online testing stage these ground-truth values are replaced by the M-step odometer planned by the decision module under the selected driving strategy.
Step 32: the VRU behavior features are predicted from the first-N-step VRU image-frame sequence in the training data set; they comprise the head orientation angle of the VRU and the vehicle-type probability vector.
In the world reference frame, the ego-vehicle position at the first data step can be chosen as the origin, with the heading direction of the vehicle as the X direction, the direction perpendicular to the heading as the Y direction, and the heading direction as 0°; angles increase clockwise, with values in [0, 2π).
The specific method of step 32 for predicting the head orientation angle of the VRU from the first-N-step VRU image-frame sequence comprises the following steps:

Step 321: following the MultiBin algorithm, the head-orientation range s ∈ [0, 2π) is divided equally into several bins; for example, [0, 2π) may be divided into 16 bins, each covering 22.5°.
Step 322: taking the visual feature vector sequence obtained in step 311 as input, the bin containing the VRU head orientation angle is predicted with a classification method, and the deviation of the angle from the bin's center angle is predicted with a regression method.
For example: the probability vector conf_i over the bins containing the VRU head orientation angle is computed with fully connected layer FC1 and a Softmax normalization function, and the sine and cosine of the offset angle within the bin are obtained with fully connected layer FC2 and L2 regularization:

    conf_i = Softmax(FullyConnected(bf_i))
    (sin Δs_i, cos Δs_i) = NormalizationL2(FullyConnected(bf_i))

where bf_i denotes the visual feature vector of the VRU image frame at step i, FullyConnected(·) denotes fully connected layer FC1 or FC2, Softmax(·) denotes the Softmax normalization function, and NormalizationL2(·) denotes L2 normalization.
Step 323: the VRU head orientation angle is computed by adding the regressed in-bin offset to the center angle of the bin predicted in step 322.
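A minimal sketch of these MultiBin-style heads follows, assuming PyTorch; the layer names mirror FC1/FC2 above, while the bin-center convention (k + 0.5)·width and the single shared (sin, cos) offset are assumptions not fixed by the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_BINS = 16  # 360° / 22.5° per bin, as in step 321

class HeadOrientationHead(nn.Module):
    """Predicts the bin of the VRU head angle (classification, FC1) and
    the in-bin offset as a unit (sin, cos) pair (regression, FC2)."""
    def __init__(self, feat_dim=4096):
        super().__init__()
        self.fc1 = nn.Linear(feat_dim, NUM_BINS)   # bin confidences
        self.fc2 = nn.Linear(feat_dim, 2)          # (sin, cos) offset

    def forward(self, bf):                         # bf: (B, feat_dim)
        conf = F.softmax(self.fc1(bf), dim=-1)     # probability per bin
        sincos = F.normalize(self.fc2(bf), dim=-1) # L2-normalize to unit circle
        return conf, sincos

def decode_angle(conf, sincos, bin_width=2 * torch.pi / NUM_BINS):
    """Step 323: center angle of the most probable bin plus the offset."""
    k = conf.argmax(dim=-1)
    center = (k.float() + 0.5) * bin_width         # assumed bin-center convention
    offset = torch.atan2(sincos[..., 0], sincos[..., 1])
    return (center + offset) % (2 * torch.pi)
```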
In one embodiment, predicting the vehicle-type probability vector of the VRU behavior features from the first-N-step VRU image-frame sequence in step 32 specifically comprises:

Step 325: the vehicle-type probability vector is predicted from bf_i with a classification method. For example, the probability vector over vehicle types is computed with fully connected layer FC3 and a Softmax normalization function:

    conf_T(i) = Softmax(FullyConnected(bf_i))

where FullyConnected(·) denotes fully connected layer FC3.
Step 33: according to the semantic vector, the subsequent-M-step VRU motion-trajectory sequence and the subsequent-M-step ego-vehicle driving odometer sequence, the prior and posterior behavior-pattern distributions of the VRU are generated by continuous iterative computation with a behavior-pattern prediction network (Behavior Predictor), and the VRU motion trajectory is predicted by continuous iterative computation with a trajectory prediction network (Trajectory Predictor).
As shown in FIG. 3, the behavior-pattern prediction network and the trajectory prediction network together constitute a conditional variational auto-encoder (CVAE), representing the dynamically changing, interaction-affected VRU behavior pattern by latent codes that can be generated continuously. The behavior-pattern distribution prediction network comprises a prior network (Prior Behavior Predictor) and a posterior network (Posterior Behavior Predictor). In the offline training stage, the posterior network generates the VRU behavior-pattern distribution that is fed into the trajectory prediction network to participate in iterative prediction, and the prior network is trained so that its output matches that of the posterior network as closely as possible. In the online testing stage, the posterior network plays no role, and only the behavior-pattern distribution generated by the prior network participates in iterative prediction.
Generating the prior and posterior behavior-pattern distributions of the VRU by continuous iterative computation with the behavior-pattern prediction network, according to the semantic vector, the subsequent-M-step VRU motion trajectory and the subsequent-M-step ego-vehicle driving odometer sequence, specifically comprises the following steps:
step 331, set two LSTM recurrent neural networks: prior behavior pattern distribution prediction network LSTMpriorAnd posterior behavioral pattern distribution prediction network LSTMpost. The two networks are identical in that the unit dimension is 64, and the semantic vector C is calculated by the multilayer perceptron MLP to be used as an initial hidden state (the initial subscript is set to start from the last observation time t ═ N). The two networks differ in that the LSTM is used when the distribution of the behavior pattern of N +1 ≦ t ≦ N + M in the predicted tth steppostCoding the time sequence characteristics of the time interval of 1 ≤ i ≤ t, and generating posterior distribution, LSTM, of the t-th walking as modepriorAnd coding the time sequence characteristics of the time period with the time interval of more than or equal to 1 and less than t, and generating the prior distribution of the t-th walking as the mode. The former is used only for the offline training phase, the latter for the offline training and online testing phases, respectively.
After being processed by the multilayer perceptron, the semantic vector serves as the initial hidden state of both the prior behavior-pattern distribution prediction network LSTM_prior and the posterior network LSTM_post, as shown in equation (8):

    h_prior(N) = h_post(N) = MLP(C)    (8)

where h_prior(N) and h_post(N) denote the initial hidden states of LSTM_prior and LSTM_post (both networks start at global time t = N, hence the subscript N), MLP(·) denotes the multilayer perceptron, and C denotes the semantic vector.
Step 332: the subsequent-M-step VRU motion trajectory and the subsequent-M-step ego-vehicle odometer sequence are split into M steps in temporal order and fed to the behavior-pattern distribution prediction networks step by step; at each step the networks update their hidden states from the input. At step t, the prior network LSTM_prior receives the data of step t-1, while the posterior network LSTM_post receives the data of step t, as shown in equations (9) and (10):

    h_post(t) = LSTM_post(h_post(t-1), (p_t, o_t))    (9)
    h_prior(t) = LSTM_prior(h_prior(t-1), (p_{t-1}, o_{t-1}))    (10)

where h_post(t) and h_prior(t) denote the hidden states of the posterior and prior behavior-pattern distribution prediction networks at step t, LSTM_post(·) and LSTM_prior(·) denote the two networks, p_t denotes the coordinate position of the VRU at step t, and o_t denotes the ego-vehicle driving odometer at step t.
Step 333: the prior and posterior behavior-pattern distributions can be set as one-dimensional Gaussian distributions (but are not limited thereto), whose means and variances are predicted by the model. The hidden states of the two behavior-pattern prediction networks from step N+1 to N+M are processed by the multilayer perceptron to predict and generate the prior and posterior behavior-pattern distributions of the VRU at the corresponding steps, as shown in equations (11) and (12):

    (μ_post(t), σ_post(t)) = MLP(h_post(t))    (11)
    (μ_prior(t), σ_prior(t)) = MLP(h_prior(t))    (12)

where μ_post(t) and σ_post(t) denote the mean and standard deviation of the posterior behavior-pattern distribution at step t, μ_prior(t) and σ_prior(t) denote those of the prior distribution at step t, and MLP(·) denotes the multilayer perceptron.
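A sketch of one behavior-pattern distribution network follows; the prior and posterior share this structure but receive differently shifted inputs, cf. equations (9) and (10). The hidden sizes follow the text, while the tanh initialization and the softplus parameterization of the standard deviation are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BehaviorPatternNet(nn.Module):
    """An LSTM cell over (p_t, o_t) inputs whose hidden state is mapped
    to the mean and std of a 1-D Gaussian over the latent code z_t."""
    def __init__(self, sem_dim=640, in_dim=6, hidden=64):  # in: 2-D p + 4-D o
        super().__init__()
        self.init_mlp = nn.Linear(sem_dim, hidden)  # semantic vector -> h(N)
        self.cell = nn.LSTMCell(in_dim, hidden)
        self.head = nn.Linear(hidden, 2)            # -> (mu, raw_sigma)

    def init_state(self, C):
        h = torch.tanh(self.init_mlp(C))            # eq. (8), assumed activation
        return h, torch.zeros_like(h)

    def step(self, x, state):
        h, c = self.cell(x, state)                  # eq. (9) / (10)
        mu, raw_sigma = self.head(h).chunk(2, dim=-1)
        return mu, F.softplus(raw_sigma), (h, c)    # positive std, eq. (11)/(12)
```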
In the offline training stage, the posterior behavior-pattern distribution is fed to the trajectory prediction network as the behavior-pattern distribution participating in trajectory prediction. In online testing, however, the data required to compute the posterior distribution cannot be obtained in real time, so the prior distribution is used as the behavior-pattern distribution participating in trajectory prediction; accordingly, in the offline training stage the prior behavior-pattern prediction network is trained, using only the data available online, to fit the behavior of the posterior network.
Predicting the VRU motion trajectory by continuous iterative computation with the trajectory prediction network (Trajectory Predictor), according to the semantic vector, the subsequent-M-step VRU motion trajectory and the subsequent-M-step ego-vehicle driving odometer sequence, specifically comprises the following steps:
steps 331 to 333 are the same as those provided in the above embodiment.
Step 334: an LSTM recurrent neural network LSTM_traj with a unit dimension of 64 is set up to predict the trajectory. After being processed by the multilayer perceptron, the semantic vector serves as the initial hidden state of the trajectory prediction network LSTM_traj, as shown in equation (13):

    h_traj(N) = MLP(C)    (13)

where h_traj(N) denotes the initial hidden state of the trajectory prediction network (the network starts at global time t = N, hence the subscript N).
Step 335: the posterior behavior-pattern distribution is used as the behavior-pattern distribution P(z_t | f_{≤t}) participating in trajectory prediction, and the behavior vector z_t is sampled from it with the reparameterization method as the input of the trajectory prediction network LSTM_traj at the corresponding step, as shown in equation (14):

    z_t ~ N(μ_post(t), σ_post(t)²)    (14)

where N(·,·) denotes the normal distribution determined by the mean and standard-deviation parameters.
Step 336: the trajectory prediction network takes the behavior vector as input and updates its own hidden state, as shown in equation (15):

    h_traj(t) = LSTM_traj(h_traj(t-1), z_t)    (15)

where h_traj(t) denotes the hidden state of the trajectory prediction network at step t and LSTM_traj(·) denotes the trajectory prediction network.
Step 337: the hidden states of the trajectory prediction network from step N+1 to N+M are processed by the multilayer perceptron to predict the VRU trajectory distribution; the prediction result is set as a two-dimensional Gaussian distribution so that prediction uncertainty can be quantified.

Specifically, at each step the input of the LSTM network is a behavior-pattern sample value (a vector of dimension 1) and the output is a 64-dimensional hidden-state vector, which the MLP maps to a 5-dimensional trajectory prediction distribution: a 2-dimensional mean, a 2-dimensional standard deviation and a 1-dimensional correlation coefficient, as shown in equations (16) to (18):

    (μ_x(t), μ_y(t), σ_x(t), σ_y(t), ρ(t)) = MLP(h_traj(t))    (16)
    μ(t) = (μ_x(t), μ_y(t)),  Σ(t) = [ σ_x(t)², ρ(t)σ_x(t)σ_y(t) ; ρ(t)σ_x(t)σ_y(t), σ_y(t)² ]    (17)
    p̂_t ~ N(μ(t), Σ(t))    (18)
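A matching sketch of the trajectory prediction network of steps 334 to 337, with the reparameterized sample of equation (14) as input and the 5-parameter bivariate Gaussian of equations (16) to (18) as output; the softplus and tanh squashings of the standard deviations and correlation coefficient are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TrajectoryDecoder(nn.Module):
    """Consumes a reparameterized behavior sample z_t and emits a bivariate
    Gaussian over the step-t trajectory point (mean, std, correlation)."""
    def __init__(self, sem_dim=640, z_dim=1, hidden=64):
        super().__init__()
        self.init_mlp = nn.Linear(sem_dim, hidden)      # C -> h_traj(N), eq. (13)
        self.cell = nn.LSTMCell(z_dim, hidden)
        self.head = nn.Linear(hidden, 5)                # eq. (16)

    def init_state(self, C):
        h = torch.tanh(self.init_mlp(C))
        return h, torch.zeros_like(h)

    def step(self, mu_z, sigma_z, state):
        z = mu_z + sigma_z * torch.randn_like(sigma_z)  # reparameterization, eq. (14)
        h, c = self.cell(z, state)                      # eq. (15)
        out = self.head(h)
        mean = out[..., :2]
        std = F.softplus(out[..., 2:4])                 # positive standard deviations
        rho = torch.tanh(out[..., 4:5])                 # correlation kept in (-1, 1)
        return mean, std, rho, (h, c)
```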
Step 34: the behavior-pattern objective function Loss_Behavior is calculated from the prior and posterior behavior-pattern distributions output in step 33. As shown in equation (19), Loss_Behavior uses as objective the KL divergence between the posterior and prior distributions computed by LSTM_post and LSTM_prior at the same step; the KL divergence measures the similarity between two distributions:

    Loss_Behavior = Σ_{t=N+1}^{N+M} D_KL( P_post(z_t | f_{≤t}) || P_prior(z_t | f_{<t}) )    (19)

where D_KL denotes the KL divergence (also called relative entropy), f_{≤t} denotes all known observed features and predicted features up to step t, P(·) denotes a probability distribution, and || is the separator between the two distributions in the KL-divergence notation.
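Since both distributions are one-dimensional Gaussians (step 333), the KL term in equation (19) can be evaluated in closed form; this is a standard identity, not spelled out in the patent:

    D_KL( N(μ_post, σ_post²) || N(μ_prior, σ_prior²) )
        = log(σ_prior / σ_post) + ( σ_post² + (μ_post − μ_prior)² ) / ( 2 σ_prior² ) − 1/2

with μ_post(t), σ_post(t) the posterior and μ_prior(t), σ_prior(t) the prior parameters at step t.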
Step 35: the trajectory-prediction objective function Loss_Trajectory is calculated from the VRU motion trajectory predicted in step 33 and the ground-truth VRU motion trajectory of the subsequent M steps. As shown in equation (20), Loss_Trajectory uses maximum-likelihood estimation as the objective, approximated with L = 20 samples:

    Loss_Trajectory = − Σ_{t=N+1}^{N+M} (1/L) Σ_{l=1}^{L} log P( p_t | z_t^(l), f_{<t} )    (20)

where z_t^(l) denotes the l-th sample of the posterior behavior-pattern distribution at step t and | denotes the conditioning symbol of the conditional probability distribution.
Step 36: the behavior-feature objective function Loss_Feature is calculated from the first-N-step VRU behavior features output in step 32 and the first-N-step VRU behavior features in the training data set. As shown in equation (21), Loss_Feature consists of the head-orientation objective L_orientation and the vehicle-classification objective L_transportation:

    Loss_Feature = L_orientation + L_transportation    (21)

with, following the MultiBin formulation of step 32,

    L_orientation = Σ_{t=1}^{N} [ CrossEntropy(conf_t, θ_t) − cos( s_t − mean(θ_t) − Δŝ_t ) ]
    L_transportation = CrossEntropy(conf_T, T)

where s_t denotes the true head orientation angle of the VRU at step t, Δŝ_t denotes the predicted in-bin offset angle at step t, conf_t denotes the predicted probability vector over the head-orientation bins at step t, θ_t denotes the one-hot vector of the bin containing the true value at step t, mean(θ_t) denotes the center angle of the bin containing the true orientation angle, and CrossEntropy(·) denotes the cross-entropy function.
As shown in equation (22), the behavior-pattern, trajectory-prediction and behavior-feature objective functions constitute the loss function L, which the Adam optimizer minimizes over the trainable networks through back-propagation:

    L = α × Loss_Feature + β × Loss_Behavior + Loss_Trajectory    (22)

where α and β are hyper-parameters whose values can be determined by enumerating candidates and comparing experimental results.
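A minimal sketch of the resulting training step, assuming the three objectives above are returned by a combined model; the learning rate is an illustrative value not given in the patent.

```python
import torch

def training_step(model, optimizer, batch, alpha, beta):
    """One supervised update of step 37: the weighted sum of the three
    objectives (eq. (22)) is minimized by Adam via back-propagation."""
    loss_feature, loss_behavior, loss_trajectory = model(batch)
    loss = alpha * loss_feature + beta * loss_behavior + loss_trajectory
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# usage sketch: optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```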
Step 37: supervised learning of the behavior-pattern, trajectory-prediction and behavior-feature objective functions is carried out through back-propagation, yielding a VRU motion-trajectory prediction model that accepts a planned ego-vehicle driving odometer sequence as input.
As shown in fig. 4, step 4, the online trajectory prediction stage specifically includes:
and step 41, calculating a semantic vector according to the VRU image frame sequence in the first N steps, the VRU motion track in the first N steps and the self-driving odometer sequence in the first N steps acquired on line. Specifically, three types of sequences in online test data are coded by using a visual characteristic encoder, a VRU motion track time sequence characteristic encoder and a self-driving characteristic time sequence characteristic encoder which are trained in an off-line stage to form a time sequence characteristic vector cvis,ctraj,codemAnd splicing to obtain a semantic vector C as shown in formulas (23) to (26).
cvis=Visual_Encoder(B) (23)
ctraj=Trajectory_Encoder(P) (24)
codom=Odometry_Encoder(O) (25)
C=concat(cvis,ctraj,codom) (26)
Step 42: according to the semantic vector output in step 41 and the subsequent-M-step ego-vehicle driving odometer sequence generated under the driving strategy selected by the decision module, the VRU motion-trajectory distribution of the future M steps is predicted by continuous iterative computation with the VRU motion-trajectory prediction model obtained in step 37; the prediction result provides a reference for the subsequent driving decisions of the intelligent vehicle.
In general, the intelligent vehicle has several candidate driving strategies in the scene described by a given piece of observation data; for example, when driving up to an intersection to prepare a right turn and finding a pedestrian about to cross the street, the candidate strategies include, but are not limited to, braking sharply, creeping slowly, and passing at constant speed. Because human-vehicle interaction in traffic scenes is often highly dynamic, different driving strategies may produce different human-vehicle interactions within the subsequent prediction window, and hence different changes in the VRU behavior pattern and different trajectories.
In reality the VRU motion trajectory is influenced by many factors, but in an autonomous-driving scene the ego vehicle is the only controllable individual, and its influence on VRU behavior acts mainly through changes in the human-vehicle interaction between the VRU and the ego vehicle. The trajectory prediction of the invention is therefore developed under the assumption that the driving strategy influences the human-vehicle interaction and thereby the VRU trajectory.
To let the prediction network support changes of the VRU behavior pattern within the prediction window, and to strengthen the causal link between the predicted VRU trajectory and human-vehicle interaction so as to improve the accuracy and usability of prediction, the driving strategy that shapes the interaction must be fed to the prediction network in some form. Two considerations govern this form: it must be an input form the model supports (the strategy-related variable used in training is the odometer), and the VRU, as a third party, can judge the human-vehicle interaction only from observable variables (so the strategy must enter the network in a form observable by the VRU). The odometer is therefore chosen to represent the driving strategy. During online prediction, the decision module inputs each candidate driving strategy to the model as the VRU-observable ego-vehicle driving odometer sequence, and the model then models the human-vehicle interaction under the selected strategy and predicts the behavior pattern and trajectory of the VRU.
Let D denote the selected driving strategy, and let the odometer O^D = {o^D_{N+1}, …, o^D_{N+M}} denote the planned, VRU-observable driving features generated under D; its format is consistent with the ego-vehicle driving odometer in the data set. Step 42 specifically comprises:
Step 421: after the semantic vector is processed by the multilayer perceptron, it serves as the initial hidden state of the prior behavior-pattern distribution prediction network LSTM_prior and of the trajectory prediction network LSTM_traj.
Step 422: according to the driving strategy D specified by the intelligent-vehicle decision module, the corresponding odometer O^D is generated by planning and used as input to the VRU motion-trajectory prediction model.
Step 423: as shown in FIG. 5, with the prediction horizon set to M steps, for the prediction of step t (N+1 ≤ t ≤ N+M) the prior behavior-pattern distribution prediction network LSTM_prior takes the generated ego-vehicle driving odometer sequence and the trajectory point predicted in the preceding step (for t = N+1, the ground-truth trajectory point of step N) as input and computes the current prior behavior-pattern distribution P(z_t | f_{<t}).
Step 424: a sample is drawn from this distribution with the reparameterization method, and the trajectory prediction network LSTM_traj predicts the trajectory-point distribution of step t from the sampled behavior vector.
Step 425, as shown in equation (27), take the two-dimensional mean of the trajectory point distribution, denoted $\hat{p}_t$, as the input of LSTM_prior in the next iteration, and perform the behavior pattern distribution and trajectory prediction of step t+1 in an autoregressive manner:

$$\hat{p}_t = \hat{\mu}_{traj}(t), \qquad p_t \sim \mathcal{N}\big(\hat{\mu}_{traj}(t), \hat{\Sigma}_{traj}(t)\big) \tag{27}$$

where $\hat{\mu}_{traj}(t)$ denotes the mean of the predicted trajectory distribution at step t, and $\hat{\Sigma}_{traj}(t)$ denotes the covariance matrix of the predicted trajectory distribution at step t.
Step 426, integrate the VRU trajectory points predicted at each step into a sequence to obtain the predicted VRU motion trajectory distribution under the corresponding driving strategy D; each predicted step is represented by a two-dimensional Gaussian distribution, and the uncertainty of the prediction result is quantified by the covariance matrix of that distribution. The above is the result of a single execution. Keeping the semantic vector obtained in step 41 and the generated self-vehicle odometer O_D unchanged, the substeps of step 42 may be executed multiple times; owing to the randomness of the resampling method, a variety of feasible VRU motion trajectory distributions can be generated, as shown in equation (29) (a code sketch of this loop follows the equation):
$$P\big(\hat{p}_{N+1:N+M} \mid f_{1:N}, O_D\big) \tag{29}$$

where P(·) denotes a conditional probability distribution, $\hat{p}_{N+1:N+M}$ denotes the predicted VRU trajectory from step N+1 to step N+M, and $f_{1:N}$ denotes all known observed features and the features generated by the prediction from step 1 to step N.
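To make the autoregressive loop of steps 421-426 concrete, the following is a minimal PyTorch-style sketch. The module names (OnlinePredictor, init_prior, and so on), all tensor sizes, and the odometer format of one (x, y, yaw, v) row per step are illustrative assumptions rather than the patent's actual implementation, and the covariance head is reduced to raw parameters for brevity.

```python
import torch
import torch.nn as nn

# Assumed sizes: behavior vector, hidden state, odometer row (x, y, yaw, v), 2D point, semantic vector.
Z_DIM, H_DIM, ODO_DIM, PT_DIM, C_DIM = 16, 64, 4, 2, 128

class OnlinePredictor(nn.Module):
    """Sketch of step 42: autoregressive VRU rollout under a planned odometer O_D."""
    def __init__(self):
        super().__init__()
        self.init_prior = nn.Linear(C_DIM, H_DIM)     # step 421: C -> initial hidden states
        self.init_traj = nn.Linear(C_DIM, H_DIM)
        self.lstm_prior = nn.LSTMCell(PT_DIM + ODO_DIM, H_DIM)
        self.lstm_traj = nn.LSTMCell(Z_DIM, H_DIM)
        self.to_prior = nn.Linear(H_DIM, 2 * Z_DIM)   # -> (mu_prior, log_sigma_prior)
        self.to_traj = nn.Linear(H_DIM, 5)            # -> 2D mean + 3 covariance parameters

    def forward(self, C, p_N, odo_D):
        # C: (1, C_DIM) semantic vector; p_N: (2,) last observed point; odo_D: (M, ODO_DIM).
        h_p = self.init_prior(C); c_p = torch.zeros_like(h_p)
        h_t = self.init_traj(C);  c_t = torch.zeros_like(h_t)
        p_prev, traj = p_N, []
        for o_t in odo_D:                                       # steps 423-425
            inp = torch.cat([p_prev, o_t]).unsqueeze(0)
            h_p, c_p = self.lstm_prior(inp, (h_p, c_p))
            mu_z, log_sig = self.to_prior(h_p).chunk(2, dim=-1)
            z = mu_z + log_sig.exp() * torch.randn_like(mu_z)   # step 424: reparameterized sample
            h_t, c_t = self.lstm_traj(z, (h_t, c_t))
            out = self.to_traj(h_t).squeeze(0)
            mu_xy, cov_raw = out[:2], out[2:]                   # per-step 2D Gaussian parameters
            traj.append((mu_xy, cov_raw))                       # step 426: collect distributions
            p_prev = mu_xy                                      # eq. (27): feed the mean back
        return traj
```

Calling forward repeatedly with the same C and odo_D draws fresh behavior vectors z each time, so the set of rollouts approximates the strategy-conditioned distribution of equation (29).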
Fig. 6 illustrates an application example of the present invention; for simplicity of description, the example contains a single VRU target. Consider an intersection without a dedicated right-turn signal light, where the vehicle plans to turn right while a pedestrian intends to cross the road.
If a conventional multi-trajectory prediction method is used, all possible actions of the pedestrian are predicted from the observed known information alone, as shown in panel a of fig. 6.
If the method provided by the present invention is used, the planned subsequent path and behavior of the vehicle are input in addition to the observed known information, and the predictions associate different driving strategies with different VRU trajectories, for example:
when the vehicle keeps turning to the right, the pedestrian will tend to wait on site or try to go forward and turn back in a behavior pattern, as shown by b in fig. 6.
When the vehicle slows down to a stop, the pedestrian will tend to cross the intersection directly, as shown in panel c of fig. 6.
When the vehicle sounds its horn as a warning and keeps turning right at speed, the pedestrian will tend to wait in place without attempting to cross the road, as shown in panel d of fig. 6.
The prediction can therefore reflect the outcome under a given driving strategy more accurately, and the upper-layer decision system can evaluate and select strategies more precisely.
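As a hedged illustration of how an upper-layer decision system might consume these strategy-conditioned predictions, the sketch below scores each candidate strategy by a crude proximity risk between the predicted VRU means and the planned ego path. The OnlinePredictor from the earlier sketch, the dictionary inputs, and the risk metric are all assumptions, not part of the patent.

```python
import torch

def evaluate_strategies(predictor, C, p_N, strategies, ego_paths, n_samples=10):
    """Rank candidate driving strategies by predicted VRU proximity risk (illustrative).

    strategies: name -> planned odometer sequence O_D, shape (M, 4)
    ego_paths:  name -> planned ego (x, y) positions, shape (M, 2)
    """
    scores = {}
    for name, odo_D in strategies.items():
        risks = []
        for _ in range(n_samples):                         # eq. (29): multiple rollouts
            traj = predictor(C, p_N, odo_D)
            mus = torch.stack([mu for mu, _ in traj])      # predicted VRU means, (M, 2)
            d_min = (mus - ego_paths[name]).norm(dim=-1).min()
            risks.append((1.0 / (d_min + 1e-3)).item())    # closer approach -> higher risk
        scores[name] = sum(risks) / n_samples
    best = min(scores, key=scores.get)                      # lowest-risk strategy
    return best, scores
```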
Finally, it should be pointed out that the above examples are intended only to illustrate the technical solutions of the present invention, not to limit them. Those of ordinary skill in the art will understand that modifications may be made to the technical solutions described in the foregoing embodiments, or some technical features may be equivalently replaced, and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (7)

1. A method for predicting the trajectory of a vulnerable road user (VRU) in a vehicle driving environment, comprising:
step 1, establishing a VRU data set which is divided into a training data set and a testing data set;
step 2, preprocessing various data in the VRU data set, and transmitting the various data to the step 3 and the step 4;
step 3, an off-line training stage, which specifically comprises:
step 31, calculating a semantic vector from the first-N-step VRU image frame sequence, the first-N-step VRU motion trajectory sequence and the first-N-step self-vehicle driving odometer sequence in the training data set;
step 32, predicting VRU behavior features from the first-N-step VRU image frame sequence in the training data set, the VRU behavior features comprising the head orientation angle of the VRU and a vehicle-type probability vector;
step 33, predicting and generating the prior behavior pattern distribution and the posterior behavior pattern distribution of the VRU by continuous iterative computation using the behavior pattern prediction networks, according to the semantic vector, the subsequent-M-step VRU motion trajectory sequence and the subsequent-M-step self-vehicle driving odometer sequence, and predicting the VRU motion trajectory by continuous iterative computation using the trajectory prediction network;
step 34, calculating a behavior pattern objective function from the prior and posterior behavior pattern distributions output in step 33;
step 35, calculating a trajectory prediction objective function from the VRU motion trajectory predicted in step 33 and the ground-truth VRU motion trajectory of the subsequent M steps;
step 36, calculating a behavior feature objective function from the first-N-step VRU behavior features output in step 32 and the first-N-step VRU behavior features in the training data set;
step 37, performing supervised learning on the behavior pattern objective function, the trajectory prediction objective function and the behavior feature objective function through back propagation, to obtain a VRU motion trajectory prediction model that supports input of a planned self-vehicle driving odometer sequence;
step 4, an online trajectory prediction stage, which specifically comprises:
step 41, calculating the semantic vector from the first-N-step VRU image frame sequence, the first-N-step VRU motion trajectory and the first-N-step self-vehicle driving odometer sequence obtained online;
and step 42, predicting the VRU motion trajectory distribution of the future M steps by continuous iterative computation using the VRU motion trajectory prediction model obtained in step 37, according to the semantic vector output in step 41 and the subsequent-M-step self-vehicle driving odometer sequence generated under the driving strategy selected by the decision module.
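Steps 33-37 of claim 1 train three objectives jointly. The patent names the behavior pattern, trajectory prediction and behavior feature objective functions but (in this excerpt) not their exact forms, so the sketch below assumes the standard CVAE-style choices: a KL divergence from posterior to prior, a Gaussian negative log-likelihood on the trajectory, and a mean-squared behavior feature loss, with assumed weights.

```python
import torch
import torch.distributions as D

def training_loss(prior_params, post_params, traj_params, traj_gt,
                  behavior_pred, behavior_gt, w_kl=1.0, w_feat=1.0):
    """Sketch of steps 34-37: joint objective for back-propagation (assumed loss forms)."""
    kl = nll = 0.0
    for (mu_pr, sig_pr), (mu_po, sig_po), (mu_xy, cov), p_gt in zip(
            prior_params, post_params, traj_params, traj_gt):
        # Step 34: behavior pattern objective, per-step KL(posterior || prior).
        kl = kl + D.kl_divergence(D.Normal(mu_po, sig_po),
                                  D.Normal(mu_pr, sig_pr)).sum()
        # Step 35: trajectory objective, NLL of the ground truth under the 2D Gaussian.
        nll = nll - D.MultivariateNormal(mu_xy, covariance_matrix=cov).log_prob(p_gt).sum()
    # Step 36: behavior feature objective (assumed mean-squared error).
    feat = torch.nn.functional.mse_loss(behavior_pred, behavior_gt)
    # Step 37: single scalar minimized by back-propagation.
    return nll + w_kl * kl + w_feat * feat
```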
2. The method according to claim 1, wherein the step 31 comprises:
step 311, taking the first-N-step VRU image frame sequence as input and obtaining a visual feature vector sequence through encoding by a convolutional neural network and an LSTM recurrent neural network: after the image signals are converted into a visual feature vector sequence by the convolutional neural network, the temporal features of the sequence are computed by the LSTM recurrent neural network, and the hidden state output at the last step of the LSTM recurrent neural network is selected as the visual time-sequence feature vector;
step 312, selecting the first-N-step VRU motion trajectory sequence as the motion trajectory, extracting the temporal features of the sequence using a VRU motion trajectory temporal feature encoder composed of an LSTM recurrent neural network, and using the hidden state output at the last step of the encoder as the VRU motion trajectory time-sequence feature vector;
step 313, selecting the first-N-step self-vehicle driving odometer sequence to represent the driving features, extracting the temporal features of the sequence using a self-vehicle driving feature temporal encoder composed of an LSTM recurrent neural network, and using the hidden state output at the last step of the encoder as the self-vehicle driving time-sequence feature vector;
and step 314, obtaining the semantic vector by concatenating the visual time-sequence feature vector, the VRU motion trajectory time-sequence feature vector and the self-vehicle driving time-sequence feature vector.
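A minimal sketch of the three temporal encoders and the concatenation in claim 2 follows, assuming a toy CNN backbone, LSTM sizes and input shapes; all names and dimensions are illustrative, not the patent's.

```python
import torch
import torch.nn as nn

class SemanticEncoder(nn.Module):
    """Claim 2 sketch: encode frames, VRU track and ego odometer, then concatenate."""
    def __init__(self, img_feat=64, hid=32, pt_dim=2, odo_dim=4):
        super().__init__()
        # Step 311: a per-frame CNN, then an LSTM over the frame features.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(), nn.Linear(16 * 16, img_feat))
        self.visual_lstm = nn.LSTM(img_feat, hid, batch_first=True)
        # Steps 312 / 313: trajectory and odometer temporal encoders.
        self.traj_lstm = nn.LSTM(pt_dim, hid, batch_first=True)
        self.odo_lstm = nn.LSTM(odo_dim, hid, batch_first=True)

    def forward(self, frames, track, odo):
        # frames: (B, N, 3, H, W); track: (B, N, 2); odo: (B, N, 4)
        B, N = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(B, N, -1)
        _, (h_vis, _) = self.visual_lstm(feats)   # last hidden state = visual feature
        _, (h_trk, _) = self.traj_lstm(track)
        _, (h_odo, _) = self.odo_lstm(odo)
        # Step 314: concatenate the three last hidden states into the semantic vector C.
        return torch.cat([h_vis[-1], h_trk[-1], h_odo[-1]], dim=-1)  # (B, 3*hid)
```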
3. The method for predicting the trajectory of a vulnerable road user in a vehicle driving environment according to claim 1, wherein predicting the head orientation angle of the VRU among the VRU behavior features from the first-N-step VRU image frame sequence in step 32 specifically comprises:
step 321, according to the MultiBin algorithm, equally dividing the VRU head orientation angle range s ∈ [0, 2π] into several intervals;
step 322, from the visual time-sequence feature vector obtained in step 311, calculating the probability vector of the interval containing the VRU head orientation angle using the fully connected layer FC1 and the Softmax normalization function, and obtaining the sine and cosine of the offset angle within the interval using the fully connected layer FC2 and L2 regularization:

$$c_{bin} = \mathrm{Softmax}\big(\mathrm{FullyConnected}(b_{f_i})\big)$$
$$\big(\sin\Delta\theta, \cos\Delta\theta\big) = \mathrm{Normalization}_{L2}\big(\mathrm{FullyConnected}(b_{f_i})\big)$$

where $b_{f_i}$ denotes the visual feature vector of the VRU image frame at step i, FullyConnected(·) denotes the fully connected layer FC1 or FC2, Softmax(·) denotes the Softmax normalization function, and $\mathrm{Normalization}_{L2}$(·) denotes L2 regularization;
step 323, calculating the VRU head orientation angle using the MultiBin algorithm.
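A hedged sketch of the MultiBin decoding in steps 321-323: pick the most probable interval, then add the in-interval offset recovered from the predicted sine and cosine. The bin count and layer sizes below are assumptions.

```python
import torch
import torch.nn as nn

N_BINS, FEAT = 8, 96                       # assumed interval count and feature size

fc1 = nn.Linear(FEAT, N_BINS)              # interval classification head (FC1)
fc2 = nn.Linear(FEAT, 2 * N_BINS)          # per-interval (sin, cos) offsets (FC2)

def head_orientation(b_f: torch.Tensor) -> torch.Tensor:
    """Decode the head orientation angle from a visual feature vector b_f of shape (B, FEAT)."""
    probs = torch.softmax(fc1(b_f), dim=-1)              # step 322: interval probabilities
    sin_cos = fc2(b_f).view(-1, N_BINS, 2)
    sin_cos = nn.functional.normalize(sin_cos, dim=-1)   # L2 regularization to a unit vector
    k = probs.argmax(dim=-1)                             # most probable interval
    sc = sin_cos[torch.arange(len(k)), k]                # its (sin, cos) offset
    bin_center = (k.float() + 0.5) * (2 * torch.pi / N_BINS)
    offset = torch.atan2(sc[..., 0], sc[..., 1])         # step 323: in-interval offset
    return (bin_center + offset) % (2 * torch.pi)

# usage: angle = head_orientation(torch.randn(1, FEAT))
```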
4. The method for predicting the trajectory of a vulnerable road user in a vehicle driving environment according to claim 3, wherein predicting the vehicle-type probability vector among the VRU behavior features from the first-N-step VRU image frame sequence in step 32 specifically comprises:
step 325, from $b_{f_i}$, calculating the vehicle-type probability vector using the fully connected layer FC3 and the Softmax normalization function:

$$c_{veh} = \mathrm{Softmax}\big(\mathrm{FullyConnected}(b_{f_i})\big)$$

where FullyConnected(·) here denotes the fully connected layer FC3.
5. The method for predicting the trajectory of a vulnerable road user in a vehicle driving environment according to any one of claims 1 to 4, wherein predicting and generating the prior and posterior behavior pattern distributions of the VRU by continuous iterative computation using the behavior pattern prediction networks, according to the semantic vector, the subsequent-M-step VRU motion trajectory and the subsequent-M-step self-vehicle driving odometer sequence in step 33, specifically comprises:
step 331, passing the semantic vector through a multilayer perceptron (MLP) and using the result as the initial hidden states of the prior behavior pattern distribution prediction network LSTM_prior and the posterior behavior pattern distribution prediction network LSTM_post:

$$h^{prior}_N = \mathrm{MLP}(C), \qquad h^{post}_N = \mathrm{MLP}(C)$$

where $h^{prior}_N$ denotes the initial hidden state of the prior behavior pattern distribution prediction network, $h^{post}_N$ denotes the initial hidden state of the posterior behavior pattern distribution prediction network, MLP(·) denotes a multilayer perceptron network, and C denotes the semantic vector;
step 332, dividing the subsequent-M-step VRU motion trajectory and the subsequent-M-step self-vehicle odometer sequence into M steps in time order as the successive inputs of the behavior pattern distribution prediction networks, and updating the hidden state of each network with its input at every step, where at step t the prior behavior pattern distribution prediction network LSTM_prior takes the data of time t-1 as input and the posterior behavior pattern distribution prediction network LSTM_post takes the data of time t as input:

$$h^{prior}_t = \mathrm{LSTM}_{prior}\big(h^{prior}_{t-1}, [p_{t-1}, o_{t-1}]\big)$$
$$h^{post}_t = \mathrm{LSTM}_{post}\big(h^{post}_{t-1}, [p_t, o_t]\big)$$

where $h^{post}_t$ denotes the hidden state of the posterior behavior pattern distribution prediction network at step t, $h^{prior}_t$ denotes the hidden state of the prior behavior pattern distribution prediction network at step t, LSTM_post(·) denotes the posterior behavior pattern distribution prediction network, LSTM_prior(·) denotes the prior behavior pattern distribution prediction network, $p_t$ denotes the coordinate position of the VRU at step t, and $o_t$ denotes the self-vehicle driving odometer at step t;
step 333, passing the hidden states of the two behavior pattern prediction networks from step N+1 to step N+M through a multilayer perceptron to predict and generate the prior and posterior behavior pattern distributions of the VRU at the corresponding times:

$$\big[\mu_{post}(t), \sigma_{post}(t)\big] = \mathrm{MLP}\big(h^{post}_t\big)$$
$$\big[\mu_{prior}(t), \sigma_{prior}(t)\big] = \mathrm{MLP}\big(h^{prior}_t\big)$$

where μ_post(t) denotes the mean and σ_post(t) the standard deviation of the posterior behavior pattern distribution at step t, MLP(·) denotes a multilayer perceptron network, and μ_prior(t) denotes the mean and σ_prior(t) the standard deviation of the prior behavior pattern distribution at step t.
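A hedged sketch of claim 5's paired recurrent networks follows: the prior cell sees data up to step t-1, the posterior cell also sees step t, and both map their hidden states to Gaussian parameters. Dimensions and names are assumptions.

```python
import torch
import torch.nn as nn

class BehaviorPatternNets(nn.Module):
    """Claim 5 sketch: prior/posterior behavior pattern distributions over M steps."""
    def __init__(self, c_dim=96, hid=64, z_dim=16, in_dim=6):  # in_dim = |p_t| + |o_t|
        super().__init__()
        self.init_mlp = nn.Linear(c_dim, 2 * hid)        # step 331: C -> initial hidden states
        self.prior_cell = nn.LSTMCell(in_dim, hid)
        self.post_cell = nn.LSTMCell(in_dim, hid)
        self.prior_head = nn.Linear(hid, 2 * z_dim)      # step 333: h -> (mu, log_sigma)
        self.post_head = nn.Linear(hid, 2 * z_dim)

    def forward(self, C, x):
        # C: (B, c_dim); x: (M+1, B, in_dim), each row the concatenated [p_t, o_t].
        h_pr, h_po = self.init_mlp(C).chunk(2, dim=-1)
        c_pr, c_po = torch.zeros_like(h_pr), torch.zeros_like(h_po)
        dists = []
        for t in range(1, x.shape[0]):
            # Step 332: prior consumes step t-1, posterior consumes step t.
            h_pr, c_pr = self.prior_cell(x[t - 1], (h_pr, c_pr))
            h_po, c_po = self.post_cell(x[t], (h_po, c_po))
            mu_pr, ls_pr = self.prior_head(h_pr).chunk(2, dim=-1)
            mu_po, ls_po = self.post_head(h_po).chunk(2, dim=-1)
            dists.append(((mu_pr, ls_pr.exp()), (mu_po, ls_po.exp())))
        return dists  # per-step (prior, posterior) Gaussian parameters
```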
6. The method for predicting the trajectory of a vulnerable road user in a vehicle driving environment according to any one of claims 1 to 4, wherein predicting the VRU motion trajectory by continuous iterative computation using the trajectory prediction network, according to the semantic vector, the subsequent-M-step VRU motion trajectory and the subsequent-M-step self-vehicle driving odometer sequence in step 33, specifically comprises:
step 331, passing the semantic vector through a multilayer perceptron and using the result as the initial hidden states of the prior behavior pattern distribution prediction network LSTM_prior and the posterior behavior pattern distribution prediction network LSTM_post:

$$h^{prior}_N = \mathrm{MLP}(C), \qquad h^{post}_N = \mathrm{MLP}(C)$$
step 332, dividing the subsequent-M-step VRU motion trajectory and the subsequent-M-step self-vehicle odometer sequence into M steps in time order as the successive inputs of the behavior pattern distribution prediction networks, and updating the hidden state of each network with its input at every step, where at step t the prior behavior pattern distribution prediction network LSTM_prior takes the data of time t-1 as input and the posterior behavior pattern distribution prediction network LSTM_post takes the data of time t as input:

$$h^{prior}_t = \mathrm{LSTM}_{prior}\big(h^{prior}_{t-1}, [p_{t-1}, o_{t-1}]\big)$$
$$h^{post}_t = \mathrm{LSTM}_{post}\big(h^{post}_{t-1}, [p_t, o_t]\big)$$
step 333, passing the hidden states of the two behavior pattern prediction networks from step N+1 to step N+M through a multilayer perceptron to predict and generate the prior and posterior behavior pattern distributions of the VRU at the corresponding times:

$$\big[\mu_{post}(t), \sigma_{post}(t)\big] = \mathrm{MLP}\big(h^{post}_t\big), \qquad \big[\mu_{prior}(t), \sigma_{prior}(t)\big] = \mathrm{MLP}\big(h^{prior}_t\big)$$
step 334, passing the semantic vector through a multilayer perceptron and using the result as the initial hidden state of the trajectory prediction network LSTM_traj:

$$h^{traj}_N = \mathrm{MLP}(C)$$

where $h^{traj}_N$ denotes the initial hidden state of the trajectory prediction network;
step 335, using the posterior behavior pattern distribution as the behavior pattern distribution P(z_t | f_{1:t}) participating in trajectory prediction, and sampling a behavior vector z_t from this distribution with the resampling method as the input of the trajectory prediction network LSTM_traj at the corresponding time:

$$z_t \sim \mathcal{N}\big(\mu_{post}(t), \sigma_{post}(t)\big)$$

where $\mathcal{N}(\mu, \sigma)$ denotes the normal distribution determined by the mean and standard deviation parameters;
step 336, the trajectory prediction network taking the behavior vector as input and updating its own hidden state through computation:

$$h^{traj}_t = \mathrm{LSTM}_{traj}\big(h^{traj}_{t-1}, z_t\big)$$

where $h^{traj}_t$ denotes the hidden state of the trajectory prediction network at step t, and LSTM_traj(·) denotes the trajectory prediction network;
and step 337, passing the hidden states of the trajectory prediction network from step N+1 to step N+M through a multilayer perceptron to predict the VRU trajectory distribution, the prediction at each step being set as a two-dimensional Gaussian distribution in order to quantify prediction uncertainty.
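A hedged sketch of claim 6's steps 334-337 follows: sample z_t from the posterior with the reparameterization trick, advance the trajectory LSTM, and emit per-step 2D Gaussian parameters. Sizes, names and the covariance parameterization are assumptions.

```python
import torch
import torch.nn as nn

class TrajectoryDecoder(nn.Module):
    """Claim 6 sketch: behavior-vector-driven trajectory distribution decoder."""
    def __init__(self, c_dim=96, hid=64, z_dim=16):
        super().__init__()
        self.init_mlp = nn.Linear(c_dim, hid)   # step 334: C -> initial hidden state
        self.cell = nn.LSTMCell(z_dim, hid)
        self.head = nn.Linear(hid, 5)           # step 337: mean (2) + covariance params (3)

    def forward(self, C, posterior_params):
        # posterior_params: per-step (mu_z, sigma_z) pairs, e.g. from BehaviorPatternNets.
        h = self.init_mlp(C); c = torch.zeros_like(h)
        steps = []
        for mu_z, sigma_z in posterior_params:
            z = mu_z + sigma_z * torch.randn_like(mu_z)  # step 335: reparameterized sample
            h, c = self.cell(z, (h, c))                  # step 336: update hidden state
            out = self.head(h)
            mu_xy = out[..., :2]                          # 2D mean of the Gaussian
            log_sx, log_sy = out[..., 2], out[..., 3]
            rho = torch.tanh(out[..., 4])                 # correlation kept in (-1, 1)
            steps.append((mu_xy, log_sx.exp(), log_sy.exp(), rho))
        return steps  # per-step 2D Gaussian: mean, std_x, std_y, correlation
```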
7. The method according to any one of claims 1 to 4, wherein step 42 comprises:
step 421, passing the semantic vector through a multilayer perceptron and using the result as the initial hidden states of the prior behavior pattern distribution prediction network LSTM_prior and the trajectory prediction network LSTM_traj;
step 422, according to the driving strategy D specified by the smart-car decision module, planning and generating the corresponding odometer O_D as the input of the VRU motion trajectory prediction model;
step 423, setting the prediction horizon to M steps, and for the prediction at step t, computing the current prior behavior pattern distribution with the prior behavior pattern distribution prediction network LSTM_prior, which takes as input the generated self-vehicle driving odometer sequence and the trajectory points predicted in the preceding steps:

$$P(z_t) = \mathcal{N}\big(\mu_{prior}(t), \sigma_{prior}(t)\big)$$
step 424, sampling from this distribution with the reparameterization method, and predicting the trajectory point distribution of step t with the trajectory prediction network LSTM_traj based on the sampled behavior vector;
step 425, selecting the two-dimensional mean of the trajectory point distribution as the input of LSTM_prior in the next iteration, and performing the behavior pattern distribution and trajectory prediction of step t+1 in an autoregressive manner:

$$\hat{p}_t = \hat{\mu}_{traj}(t), \qquad p_t \sim \mathcal{N}\big(\hat{\mu}_{traj}(t), \hat{\Sigma}_{traj}(t)\big)$$

where $\hat{\mu}_{traj}(t)$ denotes the mean of the predicted trajectory distribution at step t, and $\hat{\Sigma}_{traj}(t)$ denotes the covariance matrix of the predicted trajectory distribution at step t;
and step 426, integrating the VRU trajectory point distributions predicted at each step into a sequence to obtain the predicted VRU motion trajectory distribution under the corresponding driving strategy D, each predicted coordinate point being represented by a two-dimensional Gaussian distribution whose covariance matrix quantifies the uncertainty of the prediction result; the above being the result of one execution, the substeps of step 42 may be executed multiple times while keeping the semantic vector obtained in step 41 and the generated self-vehicle odometer O_D unchanged, and owing to the randomness of the resampling method, a variety of feasible VRU motion trajectory distributions can be generated:

$$P\big(\hat{p}_{N+1:N+M} \mid f_{1:N}, O_D\big)$$

where P(·) denotes a conditional probability distribution, $\hat{p}_{N+1:N+M}$ denotes the predicted VRU trajectory from step N+1 to step N+M, and $f_{1:N}$ denotes all known observed features and the features generated by the prediction from step 1 to step N.
CN202110069140.4A 2021-01-19 2021-01-19 Trajectory prediction method for vulnerable road users in vehicle driving environment Active CN112734808B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110069140.4A CN112734808B (en) 2021-01-19 2021-01-19 Trajectory prediction method for vulnerable road users in vehicle driving environment

Publications (2)

Publication Number Publication Date
CN112734808A true CN112734808A (en) 2021-04-30
CN112734808B CN112734808B (en) 2022-10-14

Family

ID=75592423

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110069140.4A Active CN112734808B (en) 2021-01-19 2021-01-19 Trajectory prediction method for vulnerable road users in vehicle driving environment

Country Status (1)

Country Link
CN (1) CN112734808B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109635793A (en) * 2019-01-31 2019-04-16 南京邮电大学 A kind of unmanned pedestrian track prediction technique based on convolutional neural networks
CN110415266A (en) * 2019-07-19 2019-11-05 东南大学 A method of it is driven safely based on this vehicle surrounding vehicles trajectory predictions
CN110599521A (en) * 2019-09-05 2019-12-20 清华大学 Method for generating trajectory prediction model of vulnerable road user and prediction method

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113344069B (en) * 2021-05-31 2023-01-24 成都快眼科技有限公司 Image classification method for unsupervised visual representation learning based on multi-dimensional relation alignment
CN113344069A (en) * 2021-05-31 2021-09-03 成都快眼科技有限公司 Image classification method for unsupervised visual representation learning based on multi-dimensional relation alignment
CN113705431A (en) * 2021-08-26 2021-11-26 山东大学 Method and system for track instance level segmentation and multi-motion visual mileage measurement
CN113705431B (en) * 2021-08-26 2023-08-08 山东大学 Track instance level segmentation and multi-motion visual mileage measurement method and system
CN113902776A (en) * 2021-10-27 2022-01-07 北京易航远智科技有限公司 Target pedestrian trajectory prediction method and device, electronic equipment and storage medium
CN113902776B (en) * 2021-10-27 2022-05-17 北京易航远智科技有限公司 Target pedestrian trajectory prediction method and device, electronic equipment and storage medium
CN114418159A (en) * 2021-10-29 2022-04-29 中国科学院宁波材料技术与工程研究所 Method and system for predicting limb movement locus and prediction error thereof and electronic device
CN114067371A (en) * 2022-01-18 2022-02-18 之江实验室 Cross-modal pedestrian trajectory generation type prediction framework, method and device
CN114626598A (en) * 2022-03-08 2022-06-14 南京航空航天大学 Multi-modal trajectory prediction method based on semantic environment modeling
CN114821812B (en) * 2022-06-24 2022-09-13 西南石油大学 Deep learning-based skeleton point action recognition method for pattern skating players
CN114821812A (en) * 2022-06-24 2022-07-29 西南石油大学 Deep learning-based skeleton point action recognition method for pattern skating players
CN115923847A (en) * 2023-03-15 2023-04-07 安徽蔚来智驾科技有限公司 Preprocessing method and device for perception information of automatic driving vehicle and vehicle
CN115923847B (en) * 2023-03-15 2023-06-02 安徽蔚来智驾科技有限公司 Preprocessing method and device for perception information of automatic driving vehicle and vehicle

Also Published As

Publication number Publication date
CN112734808B (en) 2022-10-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant