US11587329B2 - Method and apparatus for predicting intent of vulnerable road users - Google Patents
- Publication number: US11587329B2 (application US16/727,926)
- Authority
- US
- United States
- Prior art keywords
- vrus
- estimating
- video frames
- scene
- estimated
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60W—CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
- B60W30/00—Purposes of road vehicle drive control systems not related to the control of a particular sub-unit, e.g. of systems using conjoint control of vehicle sub-units
- B60W30/08—Active safety systems predicting or avoiding probable or impending collision or attempting to minimise its consequences
- B60W30/095—Predicting travel path or likelihood of collision
- B60W30/0956—Predicting travel path or likelihood of collision the prediction being responsive to traffic or environmental parameters
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60W—CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
- B60W40/00—Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models
- B60W40/02—Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models related to ambient conditions
- B60W40/04—Traffic conditions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/2431—Multiple classes
-
- G06K9/628—
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/49—Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/23—Recognition of whole body movements, e.g. for sport training
-
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
- G08G1/16—Anti-collision systems
- G08G1/166—Anti-collision systems for active traffic, e.g. moving vehicles, pedestrians, bikes
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60W—CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
- B60W2420/00—Indexing codes relating to the type of sensors based on the principle of their operation
- B60W2420/40—Photo, light or radio wave sensitive means, e.g. infrared sensors
- B60W2420/403—Image sensing, e.g. optical camera
-
- B60W2420/42—
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60W—CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
- B60W2554/00—Input parameters relating to objects
- B60W2554/40—Dynamic objects, e.g. animals, windblown objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30236—Traffic on road, railway or crossing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30241—Trajectory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30248—Vehicle exterior or interior
- G06T2207/30252—Vehicle exterior; Vicinity of vehicle
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/75—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
- G06V10/751—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
Definitions
- aspects of the disclosure relate to automated driving and more specifically to predicting intention of a user sharing the road with a vehicle.
- Motor vehicles are being equipped with increasing amounts of sensor technology designed to assist drivers in manually handling their vehicles in a variety of situations.
- These sensors enable a variety of features such as parking assist, lane departure warning, and blind spot detection, all of which are now available as add-ons to standard vehicle configurations.
- Some of these sensors are also being used in connection with automated and autonomous driving.
- Autonomous driving technology has experienced rapid development in recent years, but is still a long way from being able to operate without human control in all situations.
- Pedestrians move on urban roads with specific targets and goals in mind. While navigating the road, pedestrians directly interact with other road users and make decisions accordingly.
- An automated or autonomous vehicle needs to identify and estimate trajectories of all the other vehicles, pedestrians, humans riding bikes, scooters and other devices in order to safely navigate the road.
- the method includes obtaining, by a computer system of a vehicle equipped with one or more sensors, a sequence of video frames corresponding to a scene external to the vehicle.
- the computer system detects one or more VRUs in the sequence of video frames.
- the detecting may include estimating pose of each of the detected one or more VRUs.
- the computer system further generates a segmentation map of the scene using one or more of the video frames and estimates one or more intention probabilities using estimated pose of the one or more VRUs and the segmentation map.
- Each intention probability may correspond to one of the detected one or more VRUs.
- the computer system further adjusts one or more automated driving actions based on the estimated one or more intention probabilities.
- the computer system estimates one or more behavior states based at least on the estimated pose of the VRUs, and estimates future trajectories of the one or more VRUs using the estimated one or more behavior states.
- the computer system may use past states of the VRU and detected pose and bounding boxes to estimate the future trajectories.
- Each behavior state may correspond to one of the detected VRUs.
- the steps of detecting, generating and estimating may be performed using a holistic deep learning neural network model by sequentially correlating the estimated pose of the one or more VRUs and their corresponding behavior state with a segmented scene in the vicinity of each of the VRUs.
- the computer system further estimates the one or more intention probabilities by estimating the one or more behavior states based at least on the estimated pose of the one or more VRUs using a multi-task model, and estimating the one or more intention probabilities based on the estimated one or more behavior states.
- the computer system estimates the one or more behavior states by applying a neural network architecture to a continuous sequence of poses of each of the one or more VRUs to select a behavior state for the VRU among a plurality of predefined behavior states.
- the computer system generates the segmentation map by applying a neural network architecture to one or more of the video frames to classify each pixel in the video frames as one of a plurality of predefined classes. Each class may correspond to a segment in the segmentation map.
- the computer system selects at least one high-risk VRU from among the one or more VRUs based on the predicted behavior state and proximity of each VRU to the vehicle.
- the computer system may then notify a driver of the vehicle of the risky behavior or change trajectory of an autonomous vehicle to avoid a future accident involving the high-risk VRU.
- the computer system estimates the intention probabilities of the one or more VRUs by analyzing social interactions between the one or more VRUs and one or more classes corresponding to objects in the segmentation map.
- a computer system including at least one processor and a memory coupled to the at least one processor.
- the at least one processor is configured to obtain a sequence of video frames corresponding to a scene external to a vehicle captured by one or more sensors.
- the at least one processor is further configured to detect one or more VRUs in the sequence of video frames and estimate pose of each of the detected one or more VRUs, generate a segmentation map of the scene using one or more of the video frames, estimate one or more intention probabilities using estimated pose of the one or more VRUs and the segmentation map, and adjust one or more automated driving actions based on the estimated one or more intention probabilities.
- Each intention probability may correspond to one of the detected one or more VRUs.
- a computer-readable storage medium stores instructions that, when executed by one or more processors of a vehicle computer system, cause the one or more processors to obtain a sequence of video frames corresponding to a scene external to the vehicle.
- the sequence of video frames is captured using one or more sensors.
- the instructions further cause the one or more processors to detect one or more VRUs in the sequence of video frames, wherein the detection comprises estimating pose of each of the detected one or more VRUs, generate a segmentation map of the scene using one or more of the video frames, estimate one or more intention probabilities using estimated pose of the one or more VRUs and the segmentation map, each intention probability corresponding to one of the detected one or more VRUs, and adjust one or more automated driving actions based on the estimated one or more intention probabilities.
- FIG. 1 is a simplified block diagram of a vehicle system that may utilize the disclosed intent prediction system, in accordance with certain embodiments of the present disclosure.
- FIG. 2 illustrates an example high-level block diagram of the VRU intent prediction system, in accordance with certain embodiments of the present disclosure.
- FIG. 3 illustrates an example perception module, in accordance with certain embodiments of the present disclosure.
- FIG. 4 illustrates an example behavior prediction module, in accordance with certain embodiments of the present disclosure.
- FIG. 5 illustrates an example block diagram of an intent prediction module, in accordance with certain embodiments of the present disclosure.
- FIG. 6 illustrates an example flow diagram of the proposed method, in accordance with certain embodiments of the present disclosure.
- FIGS. 7 A and 7 B illustrate example trajectory estimation and intent prediction results on two example images, in accordance with certain embodiments of the present disclosure.
- VRU: Vulnerable Road User
- Examples of VRUs include pedestrians, cyclists, humans on motorbikes, humans riding scooters, and the like.
- VRU refers to any human on or around a roadway that directly interacts with vehicles on roads.
- VRUs have a potentially higher risk of accident than a person sitting inside a vehicle.
- the present disclosure relates to techniques for detecting and identifying vulnerable road users.
- the embodiments described herein may be used in vehicles that offer various degrees of automated driving capabilities, ranging from partial driver assistance to full automation of all aspects of the driving task.
- the National Highway Traffic Safety Administration (NHTSA) and Society of Automotive Engineers (SAE) International define levels of vehicle autonomy as follows: Level 0, where the driver is in full control of the vehicle; Level 1, where a driver assistance system controls steering or acceleration/deceleration; Level 2, where the driver assistance system controls steering and acceleration/deceleration, and where the driver performs all other aspects of the driving task; Level 3, where all aspects of driving are performed by the driver assistance system, but where the driver may have to intervene if special circumstances occur that the automated vehicle is unable to safely handle; Level 4, where all aspects of driving are performed by the driver assistance system, even in situations where the driver does not appropriately respond when requested to intervene; and Level 5, where the vehicle drives fully autonomously in all driving situations, with or without a passenger.
- the term "automated driving" is used to refer to any driving action that is performed by an automated driving system.
- the actions performed by a lane keep assistant (e.g., an automated driving system) are examples of automated driving actions, as opposed to driving actions performed by a human driver.
- in level 1 through level 3 of automation, some form of automated driving actions may be performed when the driver assistance system controls at least some aspects of driving. In levels 1 through 3, some input from a human driver can still be expected.
- the term "autonomous vehicle" is used to refer to a vehicle using level 4 or 5 of automation, where the system performs automated driving actions most or all of the time and there is little or no intervention by a human driver.
- Advanced perception and path planning systems are at the core of any autonomous vehicle. Autonomous vehicles need to understand their surroundings and the intentions of other road users for safe motion planning. For urban use cases, it is very important to perceive and predict the intentions of pedestrians and other VRUs.
- Certain embodiments disclose a system for estimating and predicting intentions of one or more VRUs in the surroundings of a vehicle. The intention of a VRU is estimated using a combination of the VRU's current activities, its interactions with other vehicles and VRUs, and long-term trajectories defining future motion of the VRU.
- the intent prediction system utilizes an end-to-end trained deep neural network model that classifies activities of the VRUs and forecasts their future trajectories using sequences of video frames as input.
- Certain embodiments present a VRU intent prediction system that detects/estimates gait, speed, head and body pose, actions (carrying objects, pushing carts, holding a child, etc.) and awareness/distraction (talking on the phone, wearing a headset, etc.) levels of humans on the road, and utilizes these behavioral patterns to predict future trajectories of the humans in or around the road.
- the VRU intent prediction system uses artificial intelligence and is trained on video sequences to recognize activities of VRUs on urban roads, and predict their trajectories.
- the combination of short-term discrete activity recognition and future continuous trajectory prediction summarizes the intention for VRUs and provides an accurate input to a path-planning module in the autonomous vehicle.
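As a rough illustration of how the combined output (a discrete activity plus a crossing probability per VRU) might feed a path-planning module, consider the following Python sketch. All names, fields, and the 0.5 threshold are illustrative assumptions, not details from the patent:

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical structures sketching the intent -> path-planning handoff.

@dataclass
class VRU:
    track_id: int
    keypoints: List[Tuple[float, float]]   # 2D pose key points
    activity: str = "walking"              # short-term discrete activity
    intention_prob: float = 0.0            # P(intended path crosses vehicle path)

def adjust_driving_action(vrus: List[VRU], threshold: float = 0.5) -> str:
    """Choose a conservative action if any VRU is likely to cross our path."""
    if any(v.intention_prob >= threshold for v in vrus):
        return "slow_down"
    return "maintain_speed"

vrus = [VRU(0, [(10.0, 20.0)], intention_prob=0.8),
        VRU(1, [(50.0, 60.0)], intention_prob=0.1)]
print(adjust_driving_action(vrus))  # -> slow_down
```

A real planner would of course consume full trajectories rather than a single scalar per VRU; the sketch only shows the interface shape.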
- Certain embodiments take advantage of low-level features for each VRU in the scene and use a data driven deep learning approach to learn state and behavioral interactions of the VRUs with the overall scene.
- the method disclosed herein perceives and understands human behaviors and temporally predicts continuous trajectories, weighing the past history of state and spatial inputs.
- FIG. 1 is a simplified block diagram of a vehicle system 100 that may utilize the disclosed intent prediction system, according to certain embodiments.
- the vehicle system 100 may be an automated or autonomous vehicle.
- the vehicle system 100 includes a vehicle control subsystem 110 , one or more I/O devices 120 , one or more sensors 130 , and one or more communications interfaces 140 .
- Vehicle control subsystem 110 comprises a computer system that includes one or more vehicle control units 112 (e.g., electronic control units or ECUs).
- vehicle control units 112 may include any number of embedded systems that each control one or more sensors, electrical systems or other subsystems of a vehicle.
- vehicle control units 112 include, without limitation, an engine control unit, a power steering control unit, a powertrain control module, a speed control unit, a telematics control unit, a transmission control unit, a brake control module, a camera control module, a LIDAR control module or any other type of control module.
- vehicle control units 112 may comprise one or more processors and one or more non-transitory computer-readable media storing processor-executable instructions.
- a vehicle control unit 112 may include a processor configured to execute a software application that processes sensor information to determine an automated driving operation (e.g., determining trajectories of VRUs surrounding the vehicle and taking action if their trajectories cross the vehicle's path) or to generate output for a vehicle occupant or driver via an I/O device 120 .
- Sensors 130 may comprise any number of devices that provide information about the vehicle in which vehicle system 100 is deployed and/or an environment external to the vehicle.
- sensors 130 include, without limitation, a camera, a microphone, a radar sensor, an ultrasonic sensor, a LIDAR sensor, a global positioning system (GPS) sensor, a steering angle sensor, and/or a motion sensor (e.g., an accelerometer and/or gyroscope).
- vehicle system 100 may be equipped with one or more cameras that can be used to detect and localize VRUs in vicinity of the vehicle.
- vehicle control subsystem 110 includes an advanced driver assistance system (ADAS) 114 .
- ADAS system 114 may include an automated cruise control system, a blind spot detection system, a parking assistance system, an emergency braking system or any other type of automated system.
- the ADAS system may include a VRU intent prediction module 116 and a path planning module 118 , as described herein.
- the ADAS system 114 may comprise hardware (e.g., an actuator) and/or software that enables autonomous performance of an advanced driver assistance system.
- ADAS system 114 may comprise a set of instructions that coordinate between one or more vehicle control units 112 (e.g., a power steering control unit and/or a powertrain control module) and one or more sensors 130 (e.g., a camera, a radar sensor, an ultrasonic sensor, and/or a LIDAR sensor) to identify VRUs and their trajectories, detect an imminent collision and actuate automatic emergency braking.
- I/O device 120 can include audio output devices, haptic output devices, displays and/or other devices that can be operated to generate output for a vehicle occupant in connection with a manual or an automated operation.
- Communications interface 140 includes a wireless communications interface configured to send messages to, and receive messages from, other vehicles and other devices. Vehicle messages can be transmitted using V2X or DSRC, or can be compliant with any other wireless communications protocol. Communications interface 140 may further include a transceiver configured to communicate with one or more components of a global positioning system (e.g., a satellite or a local assistance server).
- FIG. 2 illustrates an example high-level block diagram of the VRU intent prediction system 200 , in accordance with certain embodiments of the present disclosure.
- the VRU intent prediction system 200 can implement VRU intent prediction module 116 in FIG. 1 and includes a perception module 210 , a behavior prediction module 220 , and an intent prediction module 230 .
- the perception module 210 detects, identifies and localizes VRUs in the scene. Furthermore, the perception module estimates a two-dimensional (2D) pose and a 3D bounding box for each detected VRU in the scene, and tracks the detected VRUs in the 3D scene.
- the perception module utilizes a segmentation deep neural network that classifies each pixel of an input image as belonging to one of several known classes of objects. In one example, the pixel classification may be done using a semantic scene segmentation technique by passing the input images through an encoder-decoder architecture to generate a scene description. Outputs of the perception module may include 2D bounding boxes, key points, a scene segmentation mask and the like. In addition, the perception module 210 detects objects in the scene using an image or video frame as an input.
- the behavior prediction module 220 receives the scene description and pose estimations from the perception module, and detects activity and state of each VRU in the scene. In addition, the behavior prediction module 220 receives a history of past locations of one or more VRUs and outputs the future possible pixel locations of all the VRUs in the scene.
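The patent's trajectory predictor is a learned model; as a minimal stand-in that only shows the expected input/output shape (a history of past pixel locations in, future pixel locations out), a constant-velocity extrapolation can be sketched:

```python
import numpy as np

# Illustrative baseline only: the disclosed system uses a trained neural
# network; constant-velocity extrapolation shows the data flow, not the method.

def predict_trajectory(past_xy: np.ndarray, horizon: int) -> np.ndarray:
    """past_xy: (N, 2) past pixel locations, N >= 2.
    Returns (horizon, 2) future pixel locations."""
    velocity = (past_xy[-1] - past_xy[0]) / (len(past_xy) - 1)  # mean step
    steps = np.arange(1, horizon + 1).reshape(-1, 1)
    return past_xy[-1] + steps * velocity

past = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]])
future = predict_trajectory(past, horizon=2)
print(future)  # [[3. 6.] [4. 8.]]
```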
- the intent prediction module 230 receives estimated trajectory of the VRUs and their activity state as an input and outputs a probability that the VRU's intended path will cross the vehicle's path.
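To make the intent module's interface concrete, here is a hedged geometric stand-in: a crossing probability computed from the fraction of predicted trajectory points inside an assumed vehicle corridor, scaled by an assumed per-activity risk weight. The patent's module is learned end-to-end; none of these weights or the corridor come from the source.

```python
import numpy as np

# Assumed activity weights, for illustration only.
ACTIVITY_RISK = {"standing": 0.2, "walking": 0.6, "running": 0.9}

def crossing_probability(traj: np.ndarray, corridor_x=(-1.0, 1.0),
                         activity: str = "walking") -> float:
    """traj: (T, 2) predicted points. Returns the fraction of points whose
    x-coordinate lies inside the vehicle corridor, scaled by activity risk."""
    inside = (traj[:, 0] >= corridor_x[0]) & (traj[:, 0] <= corridor_x[1])
    return float(inside.mean() * ACTIVITY_RISK[activity])

traj = np.array([[2.0, 5.0], [0.5, 4.0], [0.0, 3.0], [-0.2, 2.0]])
print(crossing_probability(traj, activity="walking"))  # 0.75 * 0.6 = 0.45
```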
- FIG. 3 illustrates an example perception module 300 , in accordance with certain embodiments of the present disclosure.
- the perception module 300 can implement the perception module 210 in FIG. 2 .
- the perception module 300 may include a semantic segmentation module 310 , and an object detection and human pose estimation module 320 .
- the semantic segmentation module 310 runs in parallel with the object detection module 320 to generate an understanding of the scene.
- the semantic segmentation module 310 associates all the pixels of the scene with their respective classes and outputs a full scene description that can be correlated to the 2D spatial locations of the persons in the scene.
- the semantic segmentation module 310 utilizes an encoder-decoder architecture.
- the semantic segmentation module 310 may use a VGG or ResNet deep neural network model as an encoder that is pre-trained on known datasets such as ImageNet, along with a U-Net or fully convolutional network (FCN) decoder neural network.
- the model takes an image as input and uses 2D CNN layers with some pooling layers and batch normalization to encode the scene. Furthermore, the model uses a decoder to reconstruct a full resolution segmentation mask. The model is trained against annotated semantic segmentation data to match each pixel to a proper class.
- the output includes classification of each pixel into a set of predefined classes, such as persons, landmarks, cars, roads, curbs, traffic signs, etc.
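The per-pixel classification described above can be sketched as follows. This is a minimal NumPy illustration of decoding a segmentation mask from per-pixel class scores; the class list, array shapes, and argmax decoding are illustrative assumptions, not the patent's actual trained network:

```python
import numpy as np

# Illustrative class set drawn from the disclosure's examples.
CLASSES = ["person", "landmark", "car", "road", "curb", "traffic_sign"]

def decode_segmentation(logits):
    """Turn per-pixel class scores (H, W, C) into a segmentation mask (H, W).

    Each pixel is assigned the class with the highest score, mirroring the
    'classify each pixel into a set of predefined classes' step.
    """
    assert logits.shape[-1] == len(CLASSES)
    return np.argmax(logits, axis=-1)

# Toy 2x2 "image" with one score per class at each pixel (random stand-ins
# for decoder outputs).
rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 2, len(CLASSES)))
mask = decode_segmentation(logits)
print(mask.shape)  # (2, 2): one class index per pixel
```

In the real pipeline these logits would come from the encoder-decoder network at full image resolution; the decoding step itself is unchanged.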
- the object detection module 320 includes a pre-trained object detection network, and a 2D human pose estimation network that are used to encode all of the visual cues (features) for each individual in the scene.
- Two-dimensional spatial location, 2D key points, and pose for each pedestrian in the scene provide low-level features describing body and head orientation in each image and relative limb movement across a sequence of images. This information is much richer than using just the 2D or 3D location of an object in pixel or world coordinates, respectively.
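The low-level pose cues described above can be sketched as simple feature extraction over a keypoint sequence. The root-joint centering (a crude body-orientation proxy) and the frame-to-frame deltas (a crude limb-movement proxy) are illustrative assumptions, not the disclosed network's internals:

```python
import numpy as np

def pose_features(keypoints):
    """keypoints: (T, K, 2) array of 2D key points over T frames.

    Returns poses centered on the first key point (per-frame body layout)
    and frame-to-frame deltas (relative limb movement across images).
    """
    centered = keypoints - keypoints[:, :1, :]  # pose relative to a root joint
    motion = np.diff(keypoints, axis=0)         # limb movement between frames
    return centered, motion

T, K = 5, 17  # 5 frames, 17 key points (e.g., a COCO-style skeleton; an assumption)
kp = np.random.default_rng(1).normal(size=(T, K, 2))
centered, motion = pose_features(kp)
print(centered.shape, motion.shape)  # (5, 17, 2) (4, 17, 2)
```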
- FIG. 4 illustrates an example behavior prediction module 400 , according to one aspect of the present disclosure.
- the behavior prediction module 400 may implement behavior prediction module 220 in FIG. 2 and includes an activity/state prediction module 410 and a trajectory prediction module 420 .
- the activity prediction module 410 receives scene description and pose estimation of each of the detected VRUs from the perception module 300 .
- the activity prediction module 410 uses a history of 2D VRU poses and bounding boxes for each VRU over the past N frames to recognize a set of predefined activity classes or states.
- the activity prediction module 410 transforms sequential inputs of 2D bounding boxes, 2D poses, and relative poses across a sequence of video frames into object-level feature representations.
- the activity prediction module 410 passes its input values through linear embedding layers and recurrent neural network (RNN) layers to perform spatial and temporal transformation.
- Fused scene and object encoding are passed through final dense layers to generate outputs of activity classes.
- the model learns to recognize the activities/states of all the pedestrians in the scene.
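The fusion-plus-dense-layers step above can be sketched as follows. The weights here are random stand-ins (the real model is trained), and the encoding sizes and four-class output are illustrative assumptions:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def activity_head(scene_enc, object_enc, w, b):
    """Fuse scene and object encodings, then apply a final dense layer +
    softmax to produce per-VRU activity-class probabilities."""
    fused = np.concatenate([scene_enc, object_enc], axis=-1)  # (N, Ds + Do)
    return softmax(fused @ w + b)                             # (N, num_classes)

rng = np.random.default_rng(2)
n_vrus, d_scene, d_obj, n_classes = 3, 8, 8, 4  # e.g., gait/attention/facing/crossing
scene_enc = rng.normal(size=(n_vrus, d_scene))
object_enc = rng.normal(size=(n_vrus, d_obj))
w = rng.normal(size=(d_scene + d_obj, n_classes))
b = np.zeros(n_classes)
probs = activity_head(scene_enc, object_enc, w, b)
print(probs.shape)  # (3, 4); each row is a probability distribution
```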
- the disclosed system learns multiple classes of activities and trajectories along with final VRU intentions, hence it is a multi-task learning model.
- the trajectory prediction module 420 estimates trajectories of the VRUs that are detected in the scene.
- the network uses convolutional neural network encoding layers to encode both 2D poses and 2D/3D bounding boxes from the perception module.
- Sequential object encoding and scene encoding are fused and passed to a decoder with recurrent units (e.g., LSTM) to output future 2D pixel locations of each individual VRU in the scene.
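The recurrent decoding step, rolling a hidden state forward to emit future pixel locations, can be sketched with a vanilla RNN cell. The disclosure names LSTM units; the vanilla cell, displacement-based output, and layer sizes here are simplifying assumptions:

```python
import numpy as np

def rnn_decode(h, last_xy, w_h, w_x, w_out, steps):
    """Roll a recurrent decoder forward for `steps` future frames.

    h:       (D,) fused scene+object encoding used as the initial hidden state
    last_xy: (2,) last observed pixel location of the VRU
    Returns (steps, 2) predicted future pixel locations.
    """
    preds = []
    xy = last_xy
    for _ in range(steps):
        h = np.tanh(w_h @ h + w_x @ xy)  # recurrent state update
        xy = xy + w_out @ h              # emit a displacement, accumulate position
        preds.append(xy)
    return np.stack(preds)

rng = np.random.default_rng(3)
D = 16
h0 = rng.normal(size=D)
w_h = rng.normal(size=(D, D)) * 0.1
w_x = rng.normal(size=(D, 2)) * 0.1
w_out = rng.normal(size=(2, D)) * 0.1
future = rnn_decode(h0, np.array([120.0, 240.0]), w_h, w_x, w_out, steps=8)
print(future.shape)  # (8, 2): pixel locations for frames t+1 ... t+8
```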
- the output xy pixel locations are trained against ground truth trajectory values using a squared L2 loss.
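The squared L2 trajectory loss mentioned above is straightforward; a minimal sketch (the summing convention over time steps is an assumption):

```python
import numpy as np

def trajectory_l2_loss(pred_xy, gt_xy):
    """Squared L2 loss between predicted and ground-truth pixel trajectories.

    pred_xy, gt_xy: (T, 2) arrays of future xy pixel locations.
    """
    return float(np.sum((pred_xy - gt_xy) ** 2))

pred = np.array([[1.0, 2.0], [3.0, 4.0]])
gt = np.array([[1.0, 1.0], [2.0, 4.0]])
print(trajectory_l2_loss(pred, gt))  # 2.0: (2-1)^2 + (3-2)^2
```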
- the trajectory prediction module predicts the possible future pixel locations of all the VRUs in the scene for the next n frames (i.e., up to frame t+n).
- the trajectory prediction module 420 utilizes the same neural network model as the activity prediction model with a separate branch for estimating trajectories.
- the trajectory prediction module 420 uses a recurrent encoder-decoder model that is trained on outputs from object detection module.
- VRUs in the scene interact with other objects (other VRUs, vehicles, etc.) and move with specific defined goals in mind.
- Academic models such as Social GAN (Socially Acceptable Trajectories with Generative Adversarial Networks) and Social LSTM present the idea of social learning using feature pooling for pedestrians.
- however, these models are restricted to modeling interactions between pedestrians only.
- Certain embodiments present methods to model social interactions among pedestrians, as well as interactions between pedestrians, other objects, and the scene.
- Certain embodiments predict the interactions between pedestrians and other objects in the scene by including rich encoded scene semantics as input features to the trajectory prediction module to identify and detect the potential interaction of pedestrians with the scene.
- Certain embodiments uniquely qualify and model the dynamics of pedestrians walking in a group, carrying something, physically holding other users or objects, and the like. It should be noted that behavioral intent differs significantly across these cases.
- annotated labels may be added to the model that identify whether each VRU belongs to a group or is an individual. Including this supervised learning capability in the model enables the system to react differently when the pedestrians/VRUs have different group dynamics. Social Pooling of encoding layers is used to learn the interaction between pedestrians.
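Social pooling of encoding layers, as referenced above, can be sketched as aggregating neighboring pedestrians' encodings. The radius-based neighborhood, the mean-pooling choice, and the array shapes are assumptions; Social-LSTM-style models typically use a grid of pooling cells instead:

```python
import numpy as np

def social_pool(positions, encodings, radius):
    """For each pedestrian, mean-pool encodings of neighbors within `radius`.

    positions: (N, 2) pixel locations; encodings: (N, D) per-pedestrian features.
    Returns (N, D) social-context features; zeros when there are no neighbors.
    """
    n, d = encodings.shape
    pooled = np.zeros((n, d))
    for i in range(n):
        dist = np.linalg.norm(positions - positions[i], axis=1)
        neighbors = (dist < radius) & (dist > 0)  # exclude the pedestrian itself
        if neighbors.any():
            pooled[i] = encodings[neighbors].mean(axis=0)
    return pooled

pos = np.array([[0.0, 0.0], [1.0, 0.0], [100.0, 100.0]])  # two close, one far away
enc = np.array([[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
ctx = social_pool(pos, enc, radius=5.0)
print(ctx)  # pedestrians 0 and 1 pool each other; pedestrian 2 gets zeros
```

The pooled context would then be concatenated with each pedestrian's own encoding before the recurrent layers, so the model can learn group dynamics.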
- FIG. 5 illustrates an example intention probability prediction module 500 , in accordance with certain embodiments of the present disclosure.
- trajectory predictions and activity predictions are input to final dense layers of the DNN model to estimate/predict the final intention with probabilities for each VRU.
- Certain embodiments use weighted cross-entropy losses for training each individual class labels for behavior module and a separate ridge regression loss function is used for training trajectory models.
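The two training losses named above can be sketched as follows. The class weights and ridge penalty coefficient are illustrative assumptions:

```python
import numpy as np

def weighted_cross_entropy(probs, labels, class_weights):
    """Weighted cross-entropy over per-VRU behavior-class probabilities.

    probs: (N, C) predicted probabilities; labels: (N,) integer class ids.
    """
    picked = probs[np.arange(len(labels)), labels]
    return float(-(class_weights[labels] * np.log(picked)).mean())

def ridge_trajectory_loss(pred_xy, gt_xy, weights, alpha=0.01):
    """Squared trajectory error plus an L2 (ridge) penalty on model weights."""
    return float(np.sum((pred_xy - gt_xy) ** 2) + alpha * np.sum(weights ** 2))

probs = np.array([[0.9, 0.1], [0.2, 0.8]])
labels = np.array([0, 1])
cw = np.array([1.0, 2.0])  # up-weight the rarer class (assumption)
print(round(weighted_cross_entropy(probs, labels, cw), 4))  # 0.2758

w_model = np.array([0.5, -0.5])
pred = np.array([[1.0, 2.0]])
gt = np.array([[1.0, 1.0]])
print(ridge_trajectory_loss(pred, gt, w_model))  # 1.005
```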
- the intention probability prediction module receives VRU activity states (e.g., gait, attention, facing, crossing, etc.) and VRU trajectory as an input.
- the intention probability prediction module estimates a probability for intention of each of the VRUs.
- the intention probability prediction module estimates the probability that the VRU is going to cross the future trajectory of the vehicle.
- in one example, the intent probability of crossing the roadway for a first pedestrian will be high.
- the crossing intention probability for a second pedestrian may be lower than that of the first pedestrian (perhaps the second pedestrian is waiting to meet a friend at the intersection).
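The combination of activity state and predicted trajectory into a per-VRU crossing probability can be sketched with a logistic head. The feature choice, weights, and bias below are purely illustrative; the patent trains final dense layers for this step:

```python
import math

def crossing_probability(facing_road, walking, dist_to_curb_px,
                         w=(2.0, 1.5, -0.02), b=-1.0):
    """Toy logistic head: probability that a VRU intends to cross.

    facing_road, walking: 0/1 activity-state flags from the behavior module.
    dist_to_curb_px: pixel distance of the predicted trajectory endpoint to the curb.
    All weights are hypothetical stand-ins for trained dense-layer parameters.
    """
    z = w[0] * facing_road + w[1] * walking + w[2] * dist_to_curb_px + b
    return 1.0 / (1.0 + math.exp(-z))

# A pedestrian facing the road, walking, and near the curb scores high ...
p1 = crossing_probability(1, 1, 10)
# ... while one standing still, far from the curb, scores low.
p2 = crossing_probability(0, 0, 200)
print(p1 > p2)  # True
```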
- the VRU intent prediction system provides current state and future predictions as input to the planning module or to the warning system on ADAS/automated driving (AD) capable vehicles.
- the VRU intent prediction system may serve as a warning system that lets the driver behind the wheel take control, or that alerts the driver when the system senses an anomalous or risky intention from pedestrians in the 360-degree scene.
- the autonomous car needs to correctly estimate the intentions of road users (e.g., VRUs, cars, etc.) to plan its trajectory and maneuvers accordingly.
- the robot has to constantly engage with pedestrians and cyclists on the curb and navigate around them.
- FIG. 6 illustrates an example flow diagram of the proposed method, in accordance with certain embodiments of the present disclosure.
- a computer system of a vehicle equipped with one or more sensors obtains a sequence of video frames corresponding to a scene external to the vehicle.
- the sequence of video frames may be captured using at least one of the one or more sensors.
- the sequence of video frames may be captured using one or more fisheye cameras.
- the computer system detects one or more VRUs in the sequence of video frames.
- the computer system identifies one or more VRUs in the scene, generates a bounding box for each of the VRUs in the scene and estimates pose of each of the detected one or more VRUs.
- the computer system generates a segmentation map of the scene using one or more of the video frames.
- the computer system classifies each segment of the scene to one of multiple classes of objects in the scene.
- the computer system generates the segmentation map by applying a neural network architecture to the sequence of video frames to classify each pixel in the sequence of video frames as one of a plurality of predefined classes, each class corresponding to a segment in the segmentation map.
- the segmentation map is generated for each video frame or image.
- the computer system estimates one or more intention probabilities of the one or more VRUs using estimated pose of the one or more VRUs and the segmentation map. Each intention probability may correspond to one of the detected one or more VRUs.
- the computer system estimates the one or more intention probabilities by analyzing social interactions between the one or more VRUs and one or more classes corresponding to objects in the segmentation map.
- the computer system estimates the one or more intention probabilities by first estimating the one or more behavior states based at least on the estimated pose of the one or more VRUs using a multi-task model, and utilizing the estimated behavior states to estimate the one or more intention probabilities.
- the computer system may estimate one or more behavior states based at least on the estimated pose of the VRUs.
- Each behavior state may correspond to one of the detected VRUs.
- the computer system may estimate the one or more behavior states by applying a neural network architecture to a continuous sequence of poses of each of the one or more VRUs to select a behavior state for the VRU from among a plurality of predefined behavior states.
- the predefined behavior states may be gait, attention, facing, crossing, and the like.
- the computer system may then estimate future trajectories of the one or more VRUs using the estimated one or more behavior states.
- the above-mentioned steps of detecting, generating, and estimating are performed using a holistic deep learning neural network model by sequentially correlating the estimated pose of the one or more VRUs and their corresponding behavior states with a segmented scene in the vicinity of each VRU.
- the computer system adjusts one or more automated driving actions based on the estimated one or more intention probabilities.
- the automated driving action might be generating a warning for the driver of the vehicle of an imminent crash with a VRU that is about to enter the roadway and cross paths with the vehicle.
- the automated driving action might be changing the trajectory of an automated or autonomous vehicle to avoid hitting the pedestrian that is about to enter the roadway.
- the action might be activating the automatic emergency braking system to avoid hitting the pedestrian. It should be noted that any other automated driving action may fall within the scope of the present disclosure.
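The escalation among the actions listed above (warning, trajectory change, emergency braking) can be sketched as a simple threshold policy. The thresholds, the time-to-collision feature, and the action names are assumptions for illustration; a real planner would weigh many more inputs:

```python
def driving_action(crossing_prob, time_to_collision_s):
    """Map an estimated crossing intention to an automated driving action.

    crossing_prob: intention probability from the prediction module.
    time_to_collision_s: hypothetical estimate of time until paths cross.
    """
    if crossing_prob < 0.5:
        return "monitor"
    if time_to_collision_s < 1.0:
        return "emergency_brake"    # imminent: trigger AEB
    if time_to_collision_s < 3.0:
        return "adjust_trajectory"  # near-term: replan around the VRU
    return "warn_driver"            # early: alert the human driver

print(driving_action(0.97, 0.8))  # emergency_brake
print(driving_action(0.82, 5.0))  # warn_driver
```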
- the computer system may select at least one high-risk VRU from among the one or more VRUs based on the predicted behavior state and proximity of each VRU to the vehicle. The computer system may then notify the driver or the automated driving system of the presence of the high-risk VRU (e.g., a child who is about to run into the road and cross the trajectory of the vehicle, etc.)
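High-risk VRU selection based on behavior state and proximity can be sketched as follows; the risk criteria (a probability threshold combined with a distance cutoff) and the record layout are illustrative assumptions:

```python
def select_high_risk(vrus, prob_threshold=0.7, max_dist_m=20.0):
    """Flag VRUs that are both likely to cross and close to the vehicle.

    vrus: list of dicts with 'id', 'crossing_prob', 'distance_m'
    (hypothetical fields standing in for the prediction module's outputs).
    """
    return [v["id"] for v in vrus
            if v["crossing_prob"] >= prob_threshold
            and v["distance_m"] <= max_dist_m]

vrus = [
    {"id": "child_1", "crossing_prob": 0.95, "distance_m": 8.0},
    {"id": "adult_2", "crossing_prob": 0.30, "distance_m": 5.0},
    {"id": "adult_3", "crossing_prob": 0.90, "distance_m": 60.0},
]
print(select_high_risk(vrus))  # ['child_1']: likely to cross AND nearby
```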
- FIGS. 7 A and 7 B illustrate example outputs of the intent prediction system on two example images, in accordance with certain embodiments of the present disclosure.
- FIG. 7 A illustrates an example image that is marked with outputs of the activity prediction module.
- two pedestrians are walking on or towards a roadway.
- the first pedestrian 710 is facing left and is walking while distracted.
- the probability of this pedestrian entering the roadway and crossing paths with the vehicle is 0.97.
- the other pedestrian 720 is still on the sidewalk, is holding a device, is facing left, is walking, and is aware of his surroundings. The probability of this pedestrian intending to cross the road within the next few time steps is 0.82.
- FIG. 7 B illustrates another example of pedestrians walking in the vicinity of a vehicle.
- walking trajectories of four pedestrians are shown. These trajectories are then used by the intent probability estimation system to estimate the probability of each of these pedestrians (e.g., VRU) crossing the roadway.
- An automated system may then use the estimated probability in its path planning system to estimate its own trajectory to prevent any accidents.
- the VRU intention prediction system presented herein improves accuracy of estimation of future path of pedestrians and other road users.
- the intent prediction system can predict the intention of each of the VRUs in the near future for crossing the roadway or staying on the sidewalk.
- An automated or autonomous vehicle can utilize the VRU intention prediction system to improve its overall operation safety while driving on urban roads.
- the intention prediction system improves safety of VRUs that share the road with the vehicle.
- when the automated or autonomous vehicle detects that a VRU is about to cross the road, it may reduce its speed and/or stop to yield to the VRU (e.g., if the VRU has the right of way).
- alternatively, the vehicle may continue along its path while paying extra attention to any VRU in the scene that is marked as high-risk, to prevent a future accident (e.g., if the VRU decides to walk into the roadway).
- the VRU intention prediction system disclosed herein has several advantages. First, by understanding the intentions of pedestrians and other classes of VRUs, any autonomous vehicle or robot on urban roads can achieve a naturalistic driving behavior, similar to how humans drive and interact with VRUs in the scene. In addition, by using the low-level information about VRU pose and 3D positions in the scene and temporally correlating the changes across a sequence of frames, the VRU intent prediction model achieves about 98 percent accuracy in recognizing activities such as gait, awareness, and distraction (as trained and evaluated on annotated data).
- certain embodiments achieve the task of predicting behaviors, future trajectories and intentions with much smaller (e.g., 30-40 percent) computing and memory requirements. This is because the network takes advantage of weight sharing and cross-correlating the significance of low-level features, behaviors and predicted trajectories. This leads to significant improvement in the quality and accuracy of activity recognition, trajectory prediction and intention prediction.
- Certain embodiments use the disclosed VRU intention prediction method on images from fisheye cameras, and/or 360-degree view cocoon cameras (e.g., one camera in front of the vehicle, one camera in rear of the vehicle, and two cameras on the sides of the vehicle) to achieve 360-degree detection and prediction capability for VRUs surrounding the vehicle.
- the disclosed system not only helps with front collision warning and motion planning, but also with rear driving modes (e.g., while backing out of parking spots), and it improves the prediction horizon for rear AEB (automatic emergency braking). A control system can thereby initiate the braking process much earlier by predicting future states of VRUs.
- the system learns and predicts trajectories and activities of VRUs, by considering the physical interactions and causality between current behaviors of VRUs and different elements of the scene.
- the proposed system understands and predicts that pedestrians or cyclists cannot go through cars or buildings in the scene and accurately predicts trajectories around such elements.
- social behavior understanding between individuals or a group of VRUs, and VRUs and other objects in the scene is improved.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Computing Systems (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mechanical Engineering (AREA)
- Transportation (AREA)
- Automation & Control Theory (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Psychiatry (AREA)
- Social Psychology (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Traffic Control Systems (AREA)
- Image Analysis (AREA)
Abstract
Description
Claims (15)
Priority Applications (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/727,926 US11587329B2 (en) | 2019-12-27 | 2019-12-27 | Method and apparatus for predicting intent of vulnerable road users |
CN202080090578.4A CN115039142A (en) | 2019-12-27 | 2020-12-21 | Method and apparatus for predicting intent of vulnerable road user |
PCT/US2020/066310 WO2021133706A1 (en) | 2019-12-27 | 2020-12-21 | Method and apparatus for predicting intent of vulnerable road users |
KR1020227025914A KR20220119720A (en) | 2019-12-27 | 2020-12-21 | Methods and devices for predicting the intentions of vulnerable road users |
EP20842837.5A EP4081931A1 (en) | 2019-12-27 | 2020-12-21 | Method and apparatus for predicting intent of vulnerable road users |
JP2022539182A JP7480302B2 (en) | 2019-12-27 | 2020-12-21 | Method and apparatus for predicting the intentions of vulnerable road users |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/727,926 US11587329B2 (en) | 2019-12-27 | 2019-12-27 | Method and apparatus for predicting intent of vulnerable road users |
Publications (2)
Publication Number | Publication Date |
---|---|
US20210201052A1 US20210201052A1 (en) | 2021-07-01 |
US11587329B2 true US11587329B2 (en) | 2023-02-21 |
Family
ID=74191923
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/727,926 Active 2040-01-30 US11587329B2 (en) | 2019-12-27 | 2019-12-27 | Method and apparatus for predicting intent of vulnerable road users |
Country Status (6)
Country | Link |
---|---|
US (1) | US11587329B2 (en) |
EP (1) | EP4081931A1 (en) |
JP (1) | JP7480302B2 (en) |
KR (1) | KR20220119720A (en) |
CN (1) | CN115039142A (en) |
WO (1) | WO2021133706A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210366269A1 (en) * | 2020-05-22 | 2021-11-25 | Wipro Limited | Method and apparatus for alerting threats to users |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11734907B2 (en) * | 2020-04-24 | 2023-08-22 | Humanising Autonomy Limited | Tracking vulnerable road users across image frames using fingerprints obtained from image analysis |
EP4172854A1 (en) * | 2020-06-24 | 2023-05-03 | Humanising Autonomy Limited | Appearance and movement based model for determining risk of micro mobility users |
US11682272B2 (en) * | 2020-07-07 | 2023-06-20 | Nvidia Corporation | Systems and methods for pedestrian crossing risk assessment and directional warning |
KR20220039903A (en) * | 2020-09-21 | 2022-03-30 | 현대자동차주식회사 | Apparatus and method for controlling autonomous driving of vehicle |
US11724641B2 (en) * | 2021-01-26 | 2023-08-15 | Ford Global Technologies, Llc | Hazard condition warning for package delivery operation |
DE102021201535A1 (en) * | 2021-02-18 | 2022-08-18 | Robert Bosch Gesellschaft mit beschränkter Haftung | Method for trajectory prediction for a vehicle |
US20230062158A1 (en) * | 2021-09-02 | 2023-03-02 | Waymo Llc | Pedestrian crossing intent yielding |
US20230196817A1 (en) * | 2021-12-16 | 2023-06-22 | Adobe Inc. | Generating segmentation masks for objects in digital videos using pose tracking data |
WO2023152422A1 (en) * | 2022-02-11 | 2023-08-17 | Teknologian Tutkimuskeskus Vtt Oy | Light-emitting device |
DE102022212869B3 (en) | 2022-11-30 | 2024-03-28 | Volkswagen Aktiengesellschaft | Method for operating at least one autonomously operated vehicle, vehicle guidance system, and vehicle |
WO2024186028A1 (en) * | 2023-03-03 | 2024-09-12 | 엘지전자 주식회사 | Method of determining travel route by device in wireless communication system and device therefor |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9760806B1 (en) | 2016-05-11 | 2017-09-12 | TCL Research America Inc. | Method and system for vision-centric deep-learning-based road situation analysis |
US20170329332A1 (en) | 2016-05-10 | 2017-11-16 | Uber Technologies, Inc. | Control system to adjust operation of an autonomous vehicle based on a probability of interference by a dynamic object |
US20180096595A1 (en) | 2016-10-04 | 2018-04-05 | Street Simplified, LLC | Traffic Control Systems and Methods |
US20180253595A1 (en) | 2015-09-29 | 2018-09-06 | Sony Corporation | Information processing apparatus, information processing method, and program |
US20190124232A1 (en) * | 2017-10-19 | 2019-04-25 | Ford Global Technologies, Llc | Video calibration |
US20190171871A1 (en) * | 2017-12-03 | 2019-06-06 | Facebook, Inc. | Systems and Methods for Optimizing Pose Estimation |
WO2019116099A1 (en) | 2017-12-13 | 2019-06-20 | Humanising Autonomy Limited | Systems and methods for predicting pedestrian intent |
KR101958868B1 (en) | 2017-02-23 | 2019-07-02 | 계명대학교 산학협력단 | System for predicting pedestrian intention for vehicle in night time and method thereof |
US20200023842A1 (en) * | 2019-09-27 | 2020-01-23 | David Gomez Gutierrez | Potential collision warning system based on road user intent prediction |
US10565880B2 (en) | 2018-03-19 | 2020-02-18 | Derq Inc. | Early warning and collision avoidance |
US10824155B2 (en) | 2018-08-22 | 2020-11-03 | Ford Global Technologies, Llc | Predicting movement intent of objects |
US20210070322A1 (en) * | 2019-09-05 | 2021-03-11 | Humanising Autonomy Limited | Modular Predictions For Complex Human Behaviors |
US20210081715A1 (en) * | 2019-09-13 | 2021-03-18 | Toyota Research Institute, Inc. | Systems and methods for predicting the trajectory of an object with the aid of a location-specific latent map |
US20210103742A1 (en) * | 2019-10-08 | 2021-04-08 | Toyota Research Institute, Inc. | Spatiotemporal relationship reasoning for pedestrian intent prediction |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2560387B (en) * | 2017-03-10 | 2022-03-09 | Standard Cognition Corp | Action identification using neural networks |
DE102017216000A1 (en) * | 2017-09-11 | 2019-03-14 | Conti Temic Microelectronic Gmbh | Gesture control for communication with an autonomous vehicle based on a simple 2D camera |
JP6969254B2 (en) * | 2017-09-22 | 2021-11-24 | 株式会社アイシン | Image processing equipment and programs |
JP6917878B2 (en) * | 2017-12-18 | 2021-08-11 | 日立Astemo株式会社 | Mobile behavior prediction device |
DE102018104270A1 (en) * | 2018-02-26 | 2019-08-29 | Connaught Electronics Ltd. | Method for predicting the behavior of at least one pedestrian |
CN108319930B (en) * | 2018-03-09 | 2021-04-06 | 百度在线网络技术(北京)有限公司 | Identity authentication method, system, terminal and computer readable storage medium |
CN110135304A (en) * | 2019-04-30 | 2019-08-16 | 北京地平线机器人技术研发有限公司 | Human body method for recognizing position and attitude and device |
- 2019
- 2019-12-27 US US16/727,926 patent/US11587329B2/en active Active
- 2020
- 2020-12-21 WO PCT/US2020/066310 patent/WO2021133706A1/en unknown
- 2020-12-21 JP JP2022539182A patent/JP7480302B2/en active Active
- 2020-12-21 KR KR1020227025914A patent/KR20220119720A/en unknown
- 2020-12-21 EP EP20842837.5A patent/EP4081931A1/en active Pending
- 2020-12-21 CN CN202080090578.4A patent/CN115039142A/en active Pending
Patent Citations (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180253595A1 (en) | 2015-09-29 | 2018-09-06 | Sony Corporation | Information processing apparatus, information processing method, and program |
US20170329332A1 (en) | 2016-05-10 | 2017-11-16 | Uber Technologies, Inc. | Control system to adjust operation of an autonomous vehicle based on a probability of interference by a dynamic object |
US9760806B1 (en) | 2016-05-11 | 2017-09-12 | TCL Research America Inc. | Method and system for vision-centric deep-learning-based road situation analysis |
US20180096595A1 (en) | 2016-10-04 | 2018-04-05 | Street Simplified, LLC | Traffic Control Systems and Methods |
KR101958868B1 (en) | 2017-02-23 | 2019-07-02 | 계명대학교 산학협력단 | System for predicting pedestrian intention for vehicle in night time and method thereof |
US20190124232A1 (en) * | 2017-10-19 | 2019-04-25 | Ford Global Technologies, Llc | Video calibration |
US20190171871A1 (en) * | 2017-12-03 | 2019-06-06 | Facebook, Inc. | Systems and Methods for Optimizing Pose Estimation |
US10913454B2 (en) | 2017-12-13 | 2021-02-09 | Humanising Autonomy Limited | Systems and methods for predicting pedestrian intent |
WO2019116099A1 (en) | 2017-12-13 | 2019-06-20 | Humanising Autonomy Limited | Systems and methods for predicting pedestrian intent |
US10565880B2 (en) | 2018-03-19 | 2020-02-18 | Derq Inc. | Early warning and collision avoidance |
US10854079B2 (en) | 2018-03-19 | 2020-12-01 | Derq Inc. | Early warning and collision avoidance |
US10824155B2 (en) | 2018-08-22 | 2020-11-03 | Ford Global Technologies, Llc | Predicting movement intent of objects |
US20210070322A1 (en) * | 2019-09-05 | 2021-03-11 | Humanising Autonomy Limited | Modular Predictions For Complex Human Behaviors |
US20210081715A1 (en) * | 2019-09-13 | 2021-03-18 | Toyota Research Institute, Inc. | Systems and methods for predicting the trajectory of an object with the aid of a location-specific latent map |
US20200023842A1 (en) * | 2019-09-27 | 2020-01-23 | David Gomez Gutierrez | Potential collision warning system based on road user intent prediction |
US20210103742A1 (en) * | 2019-10-08 | 2021-04-08 | Toyota Research Institute, Inc. | Spatiotemporal relationship reasoning for pedestrian intent prediction |
Non-Patent Citations (5)
Title |
---|
International Preliminary Report on Patentability issued in corresponding Application No. PCT/US2020/066310, dated Jun. 28, 2022 (6 pages). |
Kooij et al. "Context-based path prediction for targets with switching dynamics." International Journal of Computer Vision 127.3 (2018): 239-262. (Year: 2018). * |
Xu et al. "Semantic Part RCNN for Real-World Pedestrian Detection." CVPR Workshops. Jan. 2019. (Year: 2019). * |
Zhang et al. "Integration convolutional neural network for person re-identification in camera networks." IEEE Access 6 (2018): 36887-36896. (Year: 2018). * |
Zhao, Yun, et al. "Joint Holistic and Partial CNN for Pedestrian Detection." BMVC. 2018. (Year: 2018). * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210366269A1 (en) * | 2020-05-22 | 2021-11-25 | Wipro Limited | Method and apparatus for alerting threats to users |
US12002345B2 (en) * | 2020-05-22 | 2024-06-04 | Wipro Limited | Environment-based-threat alerting to user via mobile phone |
Also Published As
Publication number | Publication date |
---|---|
JP7480302B2 (en) | 2024-05-09 |
CN115039142A (en) | 2022-09-09 |
EP4081931A1 (en) | 2022-11-02 |
WO2021133706A9 (en) | 2021-08-12 |
US20210201052A1 (en) | 2021-07-01 |
JP2023508986A (en) | 2023-03-06 |
WO2021133706A1 (en) | 2021-07-01 |
KR20220119720A (en) | 2022-08-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11587329B2 (en) | Method and apparatus for predicting intent of vulnerable road users | |
Gupta et al. | Deep learning for object detection and scene perception in self-driving cars: Survey, challenges, and open issues | |
JP6833936B2 (en) | Systems and methods for future vehicle positioning based on self-centered video | |
CN111050116B (en) | System and method for online motion detection using a time recursive network | |
US11042156B2 (en) | System and method for learning and executing naturalistic driving behavior | |
US11427210B2 (en) | Systems and methods for predicting the trajectory of an object with the aid of a location-specific latent map | |
US20210009121A1 (en) | Systems, devices, and methods for predictive risk-aware driving | |
Ghorai et al. | State estimation and motion prediction of vehicles and vulnerable road users for cooperative autonomous driving: A survey | |
CN113128326A (en) | Vehicle trajectory prediction model with semantic map and LSTM | |
JP7072030B2 (en) | Systems and methods for predicting the future using action priors | |
CN115135548A (en) | Combined tracking confidence and classification model | |
US11458987B2 (en) | Driver-centric risk assessment: risk object identification via causal inference with intent-aware driving models | |
US11460857B1 (en) | Object or person attribute characterization | |
US11648965B2 (en) | Method and system for using a reaction of other road users to ego-vehicle actions in autonomous driving | |
US12072678B2 (en) | Systems and methods for providing future object localization | |
CN116802098A (en) | System and method for determining future intent of an object | |
CN113460080A (en) | Vehicle control device, vehicle control method, and storage medium | |
CN117836184A (en) | Complementary control system for autonomous vehicle | |
CN115761431A (en) | System and method for providing spatiotemporal cost map inferences for model predictive control | |
JP7464425B2 (en) | Vehicle control device, vehicle control method, and program | |
US11970164B1 (en) | Adverse prediction planning | |
US20240157977A1 (en) | Systems and methods for modeling and predicting scene occupancy in the environment of a robot | |
US20240208546A1 (en) | Predictive models for autonomous vehicles based on object interactions | |
US20220306160A1 (en) | System and method for providing long term and key intentions for trajectory prediction | |
Reddy | Artificial Superintelligence: AI Creates Another AI Using A Minion Approach |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
FEPP | Fee payment procedure |
Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
AS | Assignment |
Owner name: VALEO SCHALTER UND SENSOREN GMBH, GERMANY Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:VALEO NORTH AMERICA, INC.;REEL/FRAME:054695/0343 Effective date: 20191223 |
|
AS | Assignment |
Owner name: VALEO NORTH AMERICA, INC., MICHIGAN Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:RANGA, ADITHYA;BHANUSHALI, JAGDISH;REEL/FRAME:054705/0250 Effective date: 20191223 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: ADVISORY ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |