EP4036892A1 - Erkennungssystem zur vorhersage von informationen über fussgänger - Google Patents

Erkennungssystem zur vorhersage von informationen über fussgänger

Info

Publication number
EP4036892A1
Authority
EP
European Patent Office
Prior art keywords
pedestrian
data
prediction
module
tracked
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP21154871.4A
Other languages
English (en)
French (fr)
Inventor
Lukas HAHN
Maximilian SCHÄFER
Kun Zhao
Frederik HASECKE
Yvonne SCHNICKMANN
André Paus
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aptiv Technologies AG
Original Assignee
Aptiv Technologies Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Aptiv Technologies Ltd filed Critical Aptiv Technologies Ltd
Priority to EP21154871.4A priority Critical patent/EP4036892A1/de
Priority to US17/649,672 priority patent/US20220242453A1/en
Priority to CN202210116190.8A priority patent/CN114838734A/zh
Publication of EP4036892A1 publication Critical patent/EP4036892A1/de

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/26 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
    • G01C21/34 Route searching; Route guidance
    • G01C21/3453 Special cost functions, i.e. other than distance or default speed limit of road segments
    • G01C21/3492 Special cost functions, i.e. other than distance or default speed limit of road segments employing speed data or traffic data, e.g. real-time or historical
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W60/00 Drive control systems specially adapted for autonomous road vehicles
    • B60W60/001 Planning or execution of driving tasks
    • B60W60/0027 Planning or execution of driving tasks using trajectory prediction for other traffic participants
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W50/00 Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
    • B60W50/0097 Predicting future conditions
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/10 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration
    • G01C21/12 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
    • G01C21/16 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation
    • G01C21/165 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments
    • G01C21/1652 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments with ranging devices, e.g. LIDAR or RADAR
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/26 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
    • G01C21/34 Route searching; Route guidance
    • G01C21/3407 Route searching; Route guidance specially adapted for specific applications
    • G01C21/3415 Dynamic re-routing, e.g. recalculating the route when the user deviates from calculated route or after detecting real-time traffic data or accidents
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00 Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/02 Systems using reflection of radio waves, e.g. primary radar systems; Analogous systems
    • G01S13/50 Systems of measurement based on relative movement of target
    • G01S13/58 Velocity or trajectory determination systems; Sense-of-movement determination systems
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00 Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/86 Combinations of radar systems with non-radar systems, e.g. sonar, direction finder
    • G01S13/867 Combination of radar systems with cameras
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00 Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/88 Radar or analogous systems specially adapted for specific applications
    • G01S13/93 Radar or analogous systems specially adapted for specific applications for anti-collision purposes
    • G01S13/931 Radar or analogous systems specially adapted for specific applications for anti-collision purposes of land vehicles
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S19/00 Satellite radio beacon positioning systems; Determining position, velocity or attitude using signals transmitted by such systems
    • G01S19/38 Determining a navigation solution using signals transmitted by a satellite radio beacon positioning system
    • G01S19/39 Determining a navigation solution using signals transmitted by a satellite radio beacon positioning system the satellite radio beacon positioning system transmitting time-stamped messages, e.g. GPS [Global Positioning System], GLONASS [Global Orbiting Navigation Satellite System] or GALILEO
    • G01S19/42 Determining position
    • G01S19/48 Determining position by combining or switching between position solutions derived from the satellite radio beacon positioning system and position solutions derived from a further system
    • G01S19/49 Determining position by combining or switching between position solutions derived from the satellite radio beacon positioning system and position solutions derived from a further system whereby the further system is an inertial position system, e.g. loosely-coupled
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/16 Anti-collision systems
    • G08G1/166 Anti-collision systems for active traffic, e.g. moving vehicles, pedestrians, bikes
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2554/00 Input parameters relating to objects
    • B60W2554/40 Dynamic objects, e.g. animals, windblown objects
    • B60W2554/402 Type
    • B60W2554/4029 Pedestrians
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2556/00 Input parameters relating to data
    • B60W2556/35 Data fusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks

Definitions

  • the present disclosure relates to a system and a method of predicting information related to a pedestrian, typically for autonomous driving.
  • An autonomous or self-driving vehicle driving in an urban environment is highly likely to operate near vulnerable road users like pedestrians and cyclists.
  • the autonomous vehicle can significantly reduce its velocity close to the vulnerable road user.
  • the driver of a human-driven vehicle present in the surroundings may be surprised by such a behaviour of the autonomous vehicle and cause an accident between the two vehicles.
  • a velocity reduction is a problem as it is likely to increase the number of accidents.
  • When driving a motor vehicle, a human driver considers in real time a multi-actor scene including multiple traffic actors (vehicle, pedestrian, bicycle, or any other potentially moving object) in an operating area surrounding the vehicle and takes maneuver decisions based on a current environment and a short-term prediction of how the traffic actors may behave.
  • the human driver can generally predict the trajectory of a pedestrian based on an observation of the behavior of the pedestrian and act depending on his prediction of the pedestrian's trajectory.
  • the human driver significantly reduces the vehicle's velocity close to a pedestrian only when the driver anticipates that the pedestrian may have a dangerous behavior in a short-term period, which is actually a rare situation.
  • the present disclosure concerns a system for prediction of information related to a pedestrian having
  • the present prediction system takes into account how the pedestrian behaves in order to predict more accurately what the pedestrian intends to do.
  • the system is able to predict intent and behavior of the pedestrians in many situations, allowing typical urban velocities at short distances to the pedestrians.
  • the prediction system further includes a prediction fusion module that performs a fusion of a prediction of information related to the tracked pedestrian based on information determined by the tracking module and the prediction of information related to the tracked pedestrian performed by the machine-learning module.
  • the fusion of the two kinds of prediction improves the accuracy and robustness of the final prediction.
  • the predicted information related to the tracked pedestrian can include at least one of a pedestrian trajectory at future times and an information on a pedestrian intention at future times.
  • the prediction can be reflected in two modes: in a first mode, the abstract intention of the pedestrian is predicted (e.g., stays on boardwalk, intends to cross road, waits for vehicle to pass, etc.), and in a second mode, a concrete trajectory of the pedestrian is predicted.
  • the machine-learning prediction module has a deep neural network.
  • the deep neural network is a convolutional neural network.
  • the pedestrian behavior assessment module includes a key point detection block that detects body key points of the pedestrian from pedestrian data.
  • the body key points of a pedestrian are an important feature to gain knowledge about the pedestrian's pose or behavior. They can be detected using data related to bounding boxes representing the detected pedestrian, provided by the tracking module.
  • the behavior assessment module comprises at least one of
  • the behavior assessment module detects that the pedestrian is looking at his smartphone and is consequently very distracted and not aware of the traffic situation.
  • the behavior assessment module detects that the pedestrian has eye contact with the self-driving vehicle and is consequently aware that the self-driving vehicle is approaching.
  • the behavior assessment module is an encoder of an encoder-decoder neural network.
  • the behavior assessment module and the machine-learning prediction module can be trained jointly.
  • the tracking module includes:
  • the tracking module can use Kalman filtering.
  • the tracking module can also use a projection-based joint probabilistic data association (PJPDA) filter.
  • the tracking module transmits a flow of data packets related to the tracked pedestrian to the pedestrian behavior assessment module, each data packet including
  • the present disclosure further concerns a vehicle integrating the prediction system previously defined and at least one sensor.
  • the present disclosure also concerns a computer-implemented method for predicting information related to a pedestrian, including the following steps carried out by computer hardware components:
  • the present disclosure also concerns a computer program including instructions which, when the program is executed by a computer, cause the computer to carry out the above defined method.
  • the present disclosure concerns a prediction system, or an architecture, 100 for predicting information related to a pedestrian.
  • a prediction system 100 can be installed in a vehicle 200, such as a self-driving (in other words: autonomous) vehicle, to detect one or more pedestrians in the surroundings of the vehicle 200 and predict (evaluate) accurately what the pedestrian will do in a short-term future.
  • the predicted information is for example a predicted trajectory of the pedestrian or a predicted intention of the pedestrian (i.e., what the pedestrian intends to do).
  • the prediction can concern a short-term future period of a few seconds from the present time (e.g., between 1 and 5 seconds, typically 2 or 3 seconds). However, a longer prediction could be possible.
  • Figure 1 shows a particular and illustrative first embodiment of the prediction system 100 installed in the vehicle 200.
  • the reference 300 represents a set of data inputs 301 to 304 into the prediction system 100 coming from different sources that can be inside or outside the vehicle 200, as described below.
  • the system 100 has access to real-time data streams coming from one or more sensors such as camera(s), LIDAR(s) or radar(s), installed aboard the vehicle 200.
  • the sensor data inputs can include:
  • the system 100 also has access to a real-time automotive navigation data stream 303 coming from an automotive navigation system, typically a GPS ('Global Positioning System') receiver of the vehicle 200 that provides real-time geolocation (in other words: position) and time information of the vehicle 200.
  • the system 100 also has access to map data 304.
  • the map data can be stored in a memory of the vehicle 200 and/or be accessible on a distant server through a communication network. It can be high-definition (HD) map data.
  • the HD map is a detailed map of a region where the vehicle 200 operates or is expected to operate.
  • the HD map is provided in a specific file format, for example an openDRIVE ® format.
  • the system 100 has different functional modules or blocks.
  • the system 100 has the following functional modules:
  • the tracking module 110 has the function of detecting and tracking, in real time, traffic actors (also called objects) located in an operating area, from sensor data coming from the sensors 301, 302.
  • a traffic actor can be a pedestrian, a bicycle, a car, a truck, a bus, or any potentially moving object.
  • An operating area is defined as an area surrounding the vehicle 200, within a range R around the vehicle 200.
  • the range R is a value between 50 m and 100 m.
  • the tracking module 110 can have two functional blocks: an object detection block 111 and a fusion and tracking block 115.
  • the object detection block 111 performs real-time object detection. It receives sensor data from different sensors of the vehicle 200 (here: camera data 301, LIDAR data 302) in different sensor domains (here: the camera domain and the LIDAR domain), and detects objects (traffic actors) independently in the different sensor domains, in real time.
  • the object detection block 111 uses an object detection algorithm 112 like YOLO ® to detect objects in image data (in other words: camera data).
  • the object detection block 111 has a clustering software component 113 that performs a cluster analysis of LIDAR points to group sets of LIDAR points considered similar, and a classification software component 114 that classifies the object detections into object classes.
  • the object detection block 111 can make multiple object detections, independently, of the same object captured in different sensor domains and/or by different sensors.
  • the term "independently" means that, for the same object (or group of objects, as explained later), there are multiple detections of the object (or group of objects), respectively in different sensor domains or by different sensors.
  • the object detection block 111 makes two object detections for the same detected object, like a pedestrian: a first object detection in the image (camera) domain and a second object detection in the LIDAR domain. Each object detection is transmitted independently to the fusion and tracking block 115 via an independent flow of information.
  • the fusion and tracking block 115 fuses the different object detections of the same detected object made in the different sensor domains (here, in the image domain and in the LIDAR domain), and tracks each detected object (for example a pedestrian).
  • One unique identifier ID is assigned to each detected object.
  • the fusion and tracking block 115 tracks each detected object independently.
  • the fusion and tracking block 115 can also track a group of objects which are close to each other (e.g., the distance between objects of the same group is less than a predetermined distance threshold) and share a similar movement. In that case, the block 115 can update state vectors of individual objects in the group jointly.
  • the fusion and tracking block 115 can also predict information related to a detected object at future times, typically a trajectory of the detected object in the next time frames, from the tracking information. For example, the block 115 determines a predicted trajectory of the detected object by extrapolation of its past trajectory and based on dynamic model(s), as known by the person skilled in the art.
  • the dynamic models are used to predict the future states of a pedestrian.
  • the state of a pedestrian consists of a state vector with different features related to the pedestrian such as position, velocity, acceleration, box size (height, width, length), and a corresponding covariance matrix.
  • different dynamic models can be used, e.g., a constant velocity model and a constant acceleration model.
  • the fusion and tracking block 115 can use Kalman filter(s).
  • the fusion and tracking block 115 can use a JPDA (joint probabilistic data association) filter, such as a PJPDA (projection-based joint probabilistic data association) filter, which is a more specific implementation of a Kalman filter.
  • Detections can be assigned to tracks by applying Projection-based Joint Probabilistic Data Association (PJPDA). It allows all tracks to be considered jointly in the association of detections to tracks (like the JPDAF, or JPDA filter) while also running in real time for a high number of pedestrians or tracks.
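  • Purely as an illustration of such a model-based extrapolation (the 2-D constant-velocity model, the process-noise choice and all names below are assumptions, not taken from the patent), a Kalman-style prediction step could look like this minimal sketch:

```python
import numpy as np

def predict_cv(x, P, dt, q=0.5):
    """Propagate a 2-D constant-velocity state [px, py, vx, vy] and its
    covariance P one time step of dt seconds ahead (prediction only)."""
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)
    G = np.array([[0.5 * dt**2, 0],
                  [0, 0.5 * dt**2],
                  [dt, 0],
                  [0, dt]])
    Q = q * G @ G.T          # white-noise acceleration process noise (assumed)
    return F @ x, F @ P @ F.T + Q

# Extrapolate a short-term trajectory, e.g. 3 seconds at 10 Hz.
x = np.array([2.0, 5.0, 1.2, 0.0])   # position (m) and velocity (m/s)
P = np.eye(4) * 0.1
trajectory = []
for _ in range(30):
    x, P = predict_cv(x, P, dt=0.1)
    trajectory.append(x[:2].copy())
```

A constant-acceleration variant would extend the state vector and transition matrix accordingly; the PJPDA association itself is not shown in this sketch.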
  • the map processing module 120 has access to the high-definition (HD) map data 304 describing the vehicle's surroundings.
  • the map processing module 120 has also access to real time position data 303 of the vehicle 200.
  • the module 120 prepares the data to be usable as input for the machine-learning module 140. For example, it produces a rasterized image representation using predefined value ranges.
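  • Only to make the idea of a rasterized map representation concrete (grid size, resolution and the helper name are assumptions, not taken from the patent), road-element polylines could be burned into a fixed-size grid centred on the vehicle:

```python
import numpy as np

def rasterize_polylines(polylines, ego_xy, size=200, resolution=0.5):
    """Render road-element polylines (lists of (x, y) points in metres,
    vehicle frame) into a size x size grid around the ego vehicle;
    each cell covers `resolution` metres."""
    grid = np.zeros((size, size), dtype=np.float32)
    half = size // 2
    for line in polylines:
        for x, y in line:
            col = int((x - ego_xy[0]) / resolution) + half
            row = int((y - ego_xy[1]) / resolution) + half
            if 0 <= row < size and 0 <= col < size:
                grid[row, col] = 1.0
    return grid

lane_marking = [(float(x), 3.5) for x in range(-40, 40)]   # dummy polyline
image = rasterize_polylines([lane_marking], ego_xy=(0.0, 0.0))
```

A real implementation would typically interpolate between polyline points and use one channel (or predefined value range) per type of static road element.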
  • a function of the map processing module 120 is to generate simplified map information of an operating area of the vehicle 200 (surrounding the vehicle 200), describing for example the following illustrative and non-exhaustive list of static road elements:
  • the pedestrian behavior assessment module 130 determines additional data of the tracked pedestrian representative of a real time behavior of the pedestrian.
  • the additional data of the pedestrian is different from the pedestrian data determined by the tracking module.
  • the pedestrian behavior assessment module 130 has a key point detection block 131, an action recognition block 132 and a traffic awareness detection block 133.
  • the key point detection block 131 detects body key points of a tracked pedestrian from data output by the tracking module 110.
  • the body key points can be selected from the illustrative and non-exhaustive list including the following body elements: nose, left eye, right eye, left ear, right ear, left shoulder, right shoulder, left elbow, right elbow, left wrist, right wrist, left hip, right hip, left knee, right knee, left ankle, right ankle.
  • the key point detection block 131 can use a neural network, for example an encoder-decoder convolutional neural network (CNN).
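  • The output format of such a network is not specified here; assuming a common heatmap-based design (one heatmap per body key point), decoding the heatmaps into pixel coordinates could look like this sketch:

```python
import numpy as np

def decode_keypoints(heatmaps, min_score=0.3):
    """heatmaps: array of shape (K, H, W), one channel per body key point
    (nose, eyes, ears, shoulders, ...). Returns K rows of (x, y, score);
    joints below min_score are marked invalid with a zero score."""
    keypoints = []
    for hm in heatmaps:
        row, col = np.unravel_index(np.argmax(hm), hm.shape)
        score = float(hm[row, col])
        keypoints.append((col, row, score) if score >= min_score else (0, 0, 0.0))
    return np.array(keypoints, dtype=np.float32)

kp = decode_keypoints(np.random.rand(17, 64, 48))   # 17 COCO-style key points
```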
  • training data can include:
  • the action recognition block 132 has the function of identifying an action class of a tracked pedestrian.
  • the action class can be selected from the illustrative and non-exhaustive list including standing, walking, running and lying.
  • the action recognition block 132 can use a neural network, such as a recurrent neural network (RNN) or a long short-term memory (LSTM).
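  • As an illustrative sketch only (layer sizes, sequence length and the class set are assumptions), an LSTM over a short sequence of flattened body key points could classify the action:

```python
import torch
import torch.nn as nn

class ActionClassifier(nn.Module):
    """Classify a pedestrian action (e.g. standing, walking, running, lying)
    from a sequence of 17 body key points per frame."""
    def __init__(self, num_keypoints=17, hidden=64, num_classes=4):
        super().__init__()
        self.lstm = nn.LSTM(input_size=num_keypoints * 2,
                            hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, keypoint_seq):           # (batch, time, 17 * 2)
        _, (h_n, _) = self.lstm(keypoint_seq)
        return self.head(h_n[-1])              # class logits

logits = ActionClassifier()(torch.randn(1, 20, 34))   # 20 frames of key points
```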
  • training data can include:
  • the traffic awareness detection block 133 has the function of identifying a class of awareness state of the tracked pedestrian with respect to a traffic situation around the pedestrian.
  • the awareness state class defines a level of awareness of the pedestrian with respect to the traffic around the pedestrian.
  • the awareness state class can be selected from the non-exhaustive list including eye contact with the vehicle 200, peripheral sight, unaware, distracted.
  • the traffic awareness detection block 133 can use a neural network, for example a convolutional neural network (CNN).
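  • The internals of this classifier are left open; as a rough, purely geometric illustration (thresholds and the COCO-style key-point order are assumptions, and a learned network would normally replace such a heuristic), whether the pedestrian's face is turned towards the sensor can be estimated from the detected key points:

```python
import numpy as np

def coarse_awareness(kp, score_thr=0.3):
    """kp: (17, 3) key points as (x, y, score), COCO order assumed
    (0 nose, 1/2 eyes, 3/4 ears). Returns a coarse awareness label."""
    nose, l_eye, r_eye = kp[0], kp[1], kp[2]
    eyes_visible = l_eye[2] > score_thr and r_eye[2] > score_thr
    nose_visible = nose[2] > score_thr
    if eyes_visible and nose_visible:
        return "facing_sensor"        # frontal view, possible eye contact
    if eyes_visible or nose_visible:
        return "peripheral_sight"
    return "unaware"

label = coarse_awareness(np.random.rand(17, 3))
```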
  • training data can include:
  • the two blocks 132, 133 receive the data output from the key point detection block 131, as input data.
  • the blocks 131, 132 and 133 can receive and use the output of the object detection block 112 and/or of the blocks 113-114 as additional information.
  • the pedestrian behavior assessment module 130 can use one or more neural networks to perform the tasks of detecting body key points, recognizing an action of the pedestrian and identifying a state of awareness of the pedestrian.
  • the machine-learning prediction module 140 has the function of performing a prediction of information at future times related to a tracked pedestrian. It takes, as input data, tracking data of the tracked pedestrian transmitted by the fusion and tracking module 115 and map data of the operating area of the vehicle 200 transmitted by the map processing module 120.
  • the prediction module 140 uses a machine-learning algorithm to perform the prediction.
  • the machine-learning prediction module 140 can use a neural network, for example a convolutional neural network (CNN) or a recurrent neural network (RNN).
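  • To make the data flow of this module concrete, a minimal sketch is given below (the architecture, feature sizes and names are assumptions, not the patented design): a small CNN encodes the rasterized map, a recurrent encoder consumes the pedestrian's past states, and a fully connected head regresses future positions.

```python
import torch
import torch.nn as nn

class TrajectoryPredictor(nn.Module):
    def __init__(self, state_dim=6, behavior_dim=8, horizon=30):
        super().__init__()
        self.map_cnn = nn.Sequential(                 # rasterized map, 1 x 200 x 200
            nn.Conv2d(1, 8, 5, stride=2), nn.ReLU(),
            nn.Conv2d(8, 16, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())    # -> 16 map features
        self.track_rnn = nn.GRU(state_dim, 32, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(16 + 32 + behavior_dim, 64), nn.ReLU(),
            nn.Linear(64, horizon * 2))               # (x, y) per future step
        self.horizon = horizon

    def forward(self, map_img, track_hist, behavior_feats):
        m = self.map_cnn(map_img)
        _, h = self.track_rnn(track_hist)
        z = torch.cat([m, h[-1], behavior_feats], dim=-1)
        return self.head(z).view(-1, self.horizon, 2)

pred = TrajectoryPredictor()(torch.randn(1, 1, 200, 200),   # map raster
                             torch.randn(1, 20, 6),          # past states
                             torch.randn(1, 8))              # behavior features
```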
  • the training data can include:
  • the two modules 130, 140 are separate modules.
  • the neural network of the pedestrian behavior assessment module 130 and the neural network of the machine-learning prediction module 140 are trained separately and independently.
  • the predicted information related to the tracked pedestrian can include a pedestrian trajectory at future times and/or an information on a pedestrian intention at future times.
  • the "future times" can represent a few seconds following a current time.
  • the "future times” can be within a period of time starting from a current time and lasting between 1 and 10 seconds, preferably between 1 and 5 seconds, for example 2 or 3 seconds.
  • the period of time of prediction could be longer.
  • the pedestrian intention can be to stay on the boardwalk, to cross the road, to wait for the vehicle 200 to pass, or any other predictable pedestrian intention.
  • the prediction fusion module 150 has the function of performing a fusion of a first prediction of information related to the tracked pedestrian performed by the machine-learning module 140 and a second prediction of information related to the tracked pedestrian based on information determined by the tracking module 110.
  • the output module 160 formats the predicted information related to the tracked pedestrian in accordance with requirements of the vehicle system.
  • a method of predicting information related to a pedestrian, corresponding to the operation of the prediction system 100 is illustrated in figure 2 and will now be described.
  • the object detection module 111 receives a first real time flow of camera data 301, here from the front camera of the vehicle 200, and a second real time flow of LIDAR data 302, here from the 360° rotating LIDAR installed on the roof of the vehicle 200.
  • Relevant objects (pedestrians, cars, trucks, bicycles, etc.) are detected independently in the camera domain (in other words: image domain), by the block 112, and in the LIDAR domain, by the blocks 113 and 114, within the operating area of the vehicle 200.
  • Each object detection in the camera domain includes the following elements:
  • Each object detection in the LIDAR domain includes the following elements:
  • the object detection block 111 transmits an independent flow of information to the fusion and tracking block 115.
  • the flow of information transmitted in a step S4 includes a camera identifier and the 2D bounding box coordinates in the camera coordinate system.
  • the flow of information transmitted in a step S5 includes a LIDAR identifier and the 3D bounding box coordinates in the vehicle coordinate system.
  • the fusion and tracking block 115 performs a fusion of the independent object detections from different sensor domains (here in the camera domain and in the LIDAR domain).
  • Any technique of fusion of sensor data can be used.
  • the sensor data fusion uses one or more Kalman filter(s). Each object detection is assigned to a track and each detected object is tracked in real time. A unique identifier is assigned to each tracked object.
  • the sensor fusion is done using, for example, an extended Kalman filter. Detections of a pedestrian in the sensor data are assigned to a corresponding track. Assigned detections from the different sensors are used separately to update the current state of a pedestrian or track. This allows the advantages of each sensor to be exploited and missing detections from single sensors to be handled. The detections from the different sensors are separately assigned to the corresponding track.
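  • The PJPDA filter itself is not reproduced here; only as a much simplified stand-in (single-hypothesis assignment instead of joint probabilistic association, with assumed names), detections could be matched to existing tracks by gated Mahalanobis distance:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(tracks, detections, gate=9.21):
    """tracks: list of (z_pred, S) with predicted measurement z_pred (2,)
    and innovation covariance S (2, 2); detections: (M, 2) positions.
    Returns (track_idx, det_idx) pairs that fall inside the chi-square gate."""
    cost = np.full((len(tracks), len(detections)), 1e6)
    for i, (z_pred, S) in enumerate(tracks):
        S_inv = np.linalg.inv(S)
        for j, z in enumerate(detections):
            d = z - z_pred
            m2 = float(d @ S_inv @ d)      # squared Mahalanobis distance
            if m2 < gate:                  # ~99% gate for 2 degrees of freedom
                cost[i, j] = m2
    rows, cols = linear_sum_assignment(cost)
    return [(i, j) for i, j in zip(rows, cols) if cost[i, j] < gate]
```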
  • the fusion and tracking block 115 performs a first prediction to predict a trajectory of the tracked object in future times, typically in a short-term future of a few seconds after the present time.
  • the first prediction is based on tracking data of the object, typically by extrapolation of the past trajectory of the object and using dynamic model(s).
  • the first prediction can be performed by the Kalman filter(s). It can also use an interacting multiple model estimator.
  • In parallel to the object detection and tracking, the map processing module 120 generates simplified map data of the operating area of the vehicle 200, in a step S7, and transmits the map data to the machine-learning prediction module 140, in a step S8.
  • the fusion and tracking block 115 transmits
  • the first flow of information can include past state vectors of a tracked pedestrian containing data related to the pedestrian such as position, velocity, acceleration and/or orientation, for the n last seconds.
  • the first flow of information can also include past state vectors of surrounding traffic actors with respect to the pedestrian (in other words: from the perspective of the considered pedestrian).
  • the second flow of information contains, for each tracked pedestrian, the unique identifier of the pedestrian, and real time data of the detected pedestrian, here data related to each bounding box detected.
  • the bounding box data includes first real time data related to the 2D bounding box detected in the camera domain and second real time data related to the 3D bounding box detected in the LIDAR domain.
  • the first real time data can include the coordinates of the 2D bounding box and image data of a cropped image captured by the camera inside the 2D bounding box.
  • the second real time data can include the coordinates of the 3D bounding box and the LIDAR point cloud captured by the LIDAR inside the 3D bounding box.
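  • Illustratively (the exact packet layout is not specified in this form; the field names below are assumptions), the per-pedestrian content of this second flow of information could be grouped as follows:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class PedestrianPacket:
    """Per-pedestrian data sent from the tracking module to the pedestrian
    behavior assessment module (illustrative field layout only)."""
    track_id: int               # unique identifier of the tracked pedestrian
    bbox_2d: np.ndarray         # (4,)   2D bounding box, camera coordinate system
    image_crop: np.ndarray      # (H, W, 3) camera pixels inside the 2D box
    bbox_3d: np.ndarray         # (7,)   3D bounding box, vehicle coordinate system
    point_cloud: np.ndarray     # (N, 3) LIDAR points inside the 3D box
```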
  • the third flow of information contains a first prediction of information related to the pedestrian based on the tracking data and on one or more dynamic models.
  • the first prediction can include different predictions, here from the Kalman filters, using respectively different dynamic models (for example, and only to illustrate, a "Model 1" and a "Model 2").
  • the third flow of information can include state vectors of the pedestrian of each dynamic model with their respective likelihood.
  • the pedestrian behavior assessment module 130 processes the second flow of information coming from the tracking module 110.
  • the key point detection 131 detects body key points in the bounding boxes transmitted.
  • the body key points are an important feature to gain knowledge of a pedestrian's behavior. In the first embodiment, they are detected based on the data related to the bounding boxes transmitted by the tracking module 110. This allows a smaller network architecture and saves computation time and resources.
  • the key point detection is only carried out on objects that have been classified as pedestrians with a high probability.
  • the data from block 112 and/or blocks 113-114 could be used as an additional source of information.
  • the detection of body key points is optional. Additionally, or alternatively, the sensor data (here, the image data and/or the LIDAR data) might be used by blocks 132 and 133 as well.
  • the action recognition block 132 determines a class of action performed by the tracked pedestrian (walking, running).
  • the block 132 classifies the pedestrian behavior into specific actions (e.g., walking, running, crossing, waiting, etc.).
  • the classification is performed using physical models and based on the body key points.
  • the sensor data might be used additionally or alternatively.
  • the action classification can be made by a dedicated neural network.
  • the traffic awareness detection block 133 detects a state of awareness of the pedestrian with respect to the surrounding traffic situation.
  • the block 133 assesses the pedestrian behavior, especially in terms of estimating a risk that the pedestrian steps onto the road carelessly by detecting the pedestrian's awareness of the current situation around the pedestrian.
  • the block 133 can detect pedestrians that are looking at their phone or something else, and distinguish them as less predictable.
  • the awareness state detection can be made by a dedicated neural network. Such a neural network can make a classification based on the pedestrian image information and on the pedestrian body and head orientation output from the key point detection in step S120 (and/or from sensor data).
  • the pedestrian behavior assessment module 130 transmits to the machine-learning prediction module 140 data packets each including the following elements:
  • each data packet transmitted by the pedestrian behavior assessment module 130 to the machine-learning prediction module 140 can include information on a pedestrian body and/or head orientation relative to the vehicle 200, and possibly raw body key point information.
  • the machine-learning prediction module 140 performs a second prediction of information related to the tracked pedestrian at future times, by using the machine-learning algorithm, from input data including:
  • the prediction fusion module 150 performs a fusion of the first prediction of information related to the tracked pedestrian based on information determined by the tracking module 110 and the second prediction of information related to the tracked pedestrian performed by the machine-learning prediction module 140.
  • the first prediction is conventional and based on a tracking and extrapolation approach and using one or more model(s).
  • the first prediction uses Kalman filter(s).
  • the first prediction can include different predictions from the Kalman filters, using (in other words: based on) different dynamic models respectively (Model 1 and Model 2 in the illustrative example).
  • the second prediction is based on a machine-learning approach.
  • the prediction fusion module 150 performs a fusion of the predictions from the Kalman filters using different dynamic models (for example Model 1 and Model 2), and the machine-learning based model prediction.
  • the fusion improves the accuracy and robustness of the prediction.
  • the prediction based on the tracking and extrapolation approach may not be sufficient and the prediction based on the machine-learning approach may be a better prediction.
  • the first prediction based on the tracking and extrapolation approach may be relevant and more stable than the second prediction based on the machine-learning approach.
  • the fusion of the first and second predictions can be performed by data fusion, for example in a mathematical framework such as an interacting multiple model estimator.
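  • As a minimal illustration of such a fusion (IMM-style mixing reduced here to a single likelihood-weighted average; the weighting scheme is an assumption), the per-model predicted trajectories and the machine-learning prediction could be combined as follows:

```python
import numpy as np

def fuse_predictions(trajectories, likelihoods):
    """trajectories: (n_models, horizon, 2) predicted (x, y) positions, e.g.
    two Kalman dynamic models plus the machine-learning prediction;
    likelihoods: (n_models,) non-negative model weights.
    Returns the likelihood-weighted mean trajectory."""
    w = np.asarray(likelihoods, dtype=float)
    w = w / w.sum()                                  # normalise model weights
    return np.tensordot(w, np.asarray(trajectories, dtype=float), axes=1)

rng = np.random.default_rng(0)
cv_traj, ca_traj, ml_traj = (rng.normal(size=(30, 2)) for _ in range(3))  # dummy inputs
fused = fuse_predictions([cv_traj, ca_traj, ml_traj], likelihoods=[0.3, 0.2, 0.5])
```

A full interacting multiple model estimator would additionally maintain per-model states, mixing probabilities and covariances rather than a single averaged trajectory.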
  • the predicted information generated by the prediction fusion module 150 can include the following elements of information:
  • the predicted information generated by the prediction fusion module 150 is transmitted to the output module 160.
  • the module 160 formats the information in accordance with requirements of the autonomous driving system of the vehicle 200, in a step S16.
  • the method for predicting information related to a pedestrian is a computer-implemented method.
  • the present disclosure also concerns a computer program including instructions which, when the program is executed by a computer, cause the computer to carry out the method for predicting information related to a pedestrian, as described.
  • the program instructions can be stored in a storage module related to the vehicle, such as non-volatile memory and/or a volatile memory.
  • the memory can be permanently or removably integrated in the vehicle or connectable to the vehicle (e.g., via a 'cloud') and the stored program instructions can be executed by a computer, or a calculator, of the vehicle, such as one or more modules of electronic control units (ECUs).
  • the second embodiment is based on the first embodiment, but differs from it by the fact that the behavior assessment module 130 is an encoder part of an encoder-decoder neural network. It has the function of encoding the input data into a latent code (also called feature, or latent variables, or latent representation). In other words, the module 130 is one component (the encoder) of a whole neural network system. The latent code is part of the input of the prediction module 140 (or prediction network).
  • the behavior assessment module 130 does not classify the pedestrian action and awareness state. However, it has a function of pedestrian behavior assessment and plays the role of an encoder of the whole neural network architecture.
  • the pedestrian behavior assessment module 130 and the machine-learning prediction module 140 can be trained jointly.
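  • As a sketch of this second embodiment only (module sizes, inputs and the training loop are assumptions, not the patented design), the behavior encoder can produce a latent code that is concatenated with the other inputs of the prediction network, and both parts are optimised with a single loss:

```python
import torch
import torch.nn as nn

behavior_encoder = nn.Sequential(          # stands in for module 130 (encoder)
    nn.Linear(34, 64), nn.ReLU(), nn.Linear(64, 16))
prediction_net = nn.Sequential(            # stands in for module 140 (predictor)
    nn.Linear(16 + 32, 64), nn.ReLU(), nn.Linear(64, 30 * 2))

optimizer = torch.optim.Adam(
    list(behavior_encoder.parameters()) + list(prediction_net.parameters()))

keypoints = torch.randn(8, 34)             # flattened body key points (dummy batch)
track_feats = torch.randn(8, 32)           # encoded track history (dummy)
target = torch.randn(8, 30 * 2)            # ground-truth future trajectory (dummy)

optimizer.zero_grad()
latent = behavior_encoder(keypoints)       # latent code, not an explicit action class
pred = prediction_net(torch.cat([latent, track_feats], dim=-1))
loss = nn.functional.mse_loss(pred, target)
loss.backward()                            # gradients flow through both modules
optimizer.step()
```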

Landscapes

  • Engineering & Computer Science (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Automation & Control Theory (AREA)
  • Software Systems (AREA)
  • Mechanical Engineering (AREA)
  • Transportation (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Electromagnetism (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Traffic Control Systems (AREA)
EP21154871.4A 2021-02-02 2021-02-02 Erkennungssystem zur vorhersage von informationen über fussgänger Pending EP4036892A1 (de)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP21154871.4A EP4036892A1 (de) 2021-02-02 2021-02-02 Erkennungssystem zur vorhersage von informationen über fussgänger
US17/649,672 US20220242453A1 (en) 2021-02-02 2022-02-01 Detection System for Predicting Information on Pedestrian
CN202210116190.8A CN114838734A (zh) 2021-02-02 2022-02-07 用于预测与行人相关的信息的检测系统

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
EP21154871.4A EP4036892A1 (de) 2021-02-02 2021-02-02 Erkennungssystem zur vorhersage von informationen über fussgänger

Publications (1)

Publication Number Publication Date
EP4036892A1 true EP4036892A1 (de) 2022-08-03

Family

ID=74550457

Family Applications (1)

Application Number Title Priority Date Filing Date
EP21154871.4A Pending EP4036892A1 (de) 2021-02-02 2021-02-02 Erkennungssystem zur vorhersage von informationen über fussgänger

Country Status (3)

Country Link
US (1) US20220242453A1 (de)
EP (1) EP4036892A1 (de)
CN (1) CN114838734A (de)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11878684B2 (en) * 2020-03-18 2024-01-23 Toyota Research Institute, Inc. System and method for trajectory prediction using a predicted endpoint conditioned network
US11651583B2 (en) * 2021-07-08 2023-05-16 Cyngn, Inc. Multi-channel object matching
US11904906B2 (en) * 2021-08-05 2024-02-20 Argo AI, LLC Systems and methods for prediction of a jaywalker trajectory through an intersection
CN116453205A (zh) * 2022-11-22 2023-07-18 深圳市旗扬特种装备技术工程有限公司 一种营运车辆滞站揽客行为识别方法、装置及系统
CN116259176B (zh) * 2023-02-17 2024-02-06 安徽大学 一种基于意图随机性影响策略的行人轨迹预测方法

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102014201159A1 (de) * 2014-01-23 2015-07-23 Robert Bosch Gmbh Verfahren und Vorrichtung zum Klassifizieren eines Verhaltens eines Fußgängers beim Überqueren einer Fahrbahn eines Fahrzeugs sowie Personenschutzsystem eines Fahrzeugs
US10055652B2 (en) * 2016-03-21 2018-08-21 Ford Global Technologies, Llc Pedestrian detection and motion prediction with rear-facing camera
US9760806B1 (en) * 2016-05-11 2017-09-12 TCL Research America Inc. Method and system for vision-centric deep-learning-based road situation analysis
US10394245B2 (en) * 2016-11-22 2019-08-27 Baidu Usa Llc Method and system to predict vehicle traffic behavior for autonomous vehicles to make driving decisions
EP3514494A1 (de) * 2018-01-19 2019-07-24 Zenuity AB Erstellung und aktualisierung einer verhaltensschicht einer mehrschichtigen hochdefinierten digitalen karte eines strassennetzes
US10950130B2 (en) * 2018-03-19 2021-03-16 Derq Inc. Early warning and collision avoidance
CN109063559B (zh) * 2018-06-28 2021-05-11 东南大学 一种基于改良区域回归的行人检测方法
CN110378281A (zh) * 2019-07-17 2019-10-25 青岛科技大学 基于伪3d卷积神经网络的组群行为识别方法
DE102020200911B3 (de) * 2020-01-27 2020-10-29 Robert Bosch Gesellschaft mit beschränkter Haftung Verfahren zum Erkennen von Objekten in einer Umgebung eines Fahrzeugs

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190236958A1 (en) * 2016-06-27 2019-08-01 Nissan Motor Co., Ltd. Object Tracking Method and Object Tracking Device
US20200272148A1 (en) * 2019-02-21 2020-08-27 Zoox, Inc. Motion prediction based on appearance
US20200283016A1 (en) * 2019-03-06 2020-09-10 Robert Bosch Gmbh Movement prediction of pedestrians useful for autonomous driving
US20210009163A1 (en) * 2019-07-08 2021-01-14 Uatc, Llc Systems and Methods for Generating Motion Forecast Data for Actors with Respect to an Autonomous Vehicle and Training a Machine Learned Model for the Same

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BECKER STEFAN ET AL: "An RNN-Based IMM Filter Surrogate", 12 May 2019, ADVANCES IN DATABASES AND INFORMATION SYSTEMS; [LECTURE NOTES IN COMPUTER SCIENCE; LECT.NOTES COMPUTER], SPRINGER INTERNATIONAL PUBLISHING, CHAM, PAGE(S) 387 - 398, ISBN: 978-3-319-10403-4, XP047519695 *
VAN WYK B J ET AL: "A projection-based joint probabilistic data association algorithm", AFRICON, 2004. 7TH AFRICON CONFERENCE IN AFRICA GABORONE, BOTSWANA SEPT. 15-17, 2004, PISCATAWAY, NJ, USA,IEEE, vol. 1, 15 September 2004 (2004-09-15), pages 313 - 317, XP010780423, ISBN: 978-0-7803-8605-1, DOI: 10.1109/AFRICON.2004.1406682 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4361961A1 (de) * 2022-10-28 2024-05-01 Aptiv Technologies AG Verfahren zur bestimmung von informationen bezüglich eines verkehrsteilnehmers

Also Published As

Publication number Publication date
US20220242453A1 (en) 2022-08-04
CN114838734A (zh) 2022-08-02

Similar Documents

Publication Publication Date Title
EP4036892A1 (de) Erkennungssystem zur vorhersage von informationen über fussgänger
US11175145B2 (en) System and method for precision localization and mapping
CN111670468B (zh) 移动体行为预测装置以及移动体行为预测方法
US20200209857A1 (en) Multimodal control system for self driving vehicle
US20200211395A1 (en) Method and Device for Operating a Driver Assistance System, and Driver Assistance System and Motor Vehicle
US10849543B2 (en) Focus-based tagging of sensor data
US11003928B2 (en) Using captured video data to identify active turn signals on a vehicle
US20220105959A1 (en) Methods and systems for predicting actions of an object by an autonomous vehicle to determine feasible paths through a conflicted area
CN112363494A (zh) 机器人前进路径的规划方法、设备及存储介质
US11577732B2 (en) Methods and systems for tracking a mover's lane over time
CN113227712A (zh) 用于确定车辆的环境模型的方法和系统
CN116745195A (zh) 车道外安全驾驶的方法和系统
US10759449B2 (en) Recognition processing device, vehicle control device, recognition control method, and storage medium
CN117416344A (zh) 自主驾驶系统中校车的状态估计
US20230048926A1 (en) Methods and Systems for Predicting Properties of a Plurality of Objects in a Vicinity of a Vehicle
CN116524454A (zh) 物体追踪装置、物体追踪方法及存储介质
Guo et al. Understanding surrounding vehicles in urban traffic scenarios based on a low-cost lane graph
US11881031B2 (en) Hierarchical processing of traffic signal face states
EP4361961A1 (de) Verfahren zur bestimmung von informationen bezüglich eines verkehrsteilnehmers
US20230252638A1 (en) Systems and methods for panoptic segmentation of images for autonomous driving
Panda et al. RECENT DEVELOPMENTS IN LANE DEPARTURE WARNING SYSTEM: AN ANALYSIS
US20240025440A1 (en) State estimation and response to active school vehicles in a self-driving system
US20240185437A1 (en) Computer-Implemented Method and System for Training a Machine Learning Process
Kaida et al. Study on behavior prediction using multi-object recognition and map information in road environment
Khalfin et al. Developing, Analyzing, and Evaluating Vehicular Lane Keeping Algorithms Under Dynamic Lighting and Weather Conditions Using Electric Vehicles

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN PUBLISHED

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20230131

RBV Designated contracting states (corrected)

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

RAP3 Party data changed (applicant data changed or rights of an application transferred)

Owner name: APTIV TECHNOLOGIES LIMITED

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: EXAMINATION IS IN PROGRESS

17Q First examination report despatched

Effective date: 20240205

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: APTIV TECHNOLOGIES AG