EP4355626A1 - Devices and methods for predicting collisions, predicting intersection violations, and/or determining region of interest for object detection in camera images - Google Patents
- Publication number
- EP4355626A1 (application EP22825506.3A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- driver
- processing unit
- pose
- vehicle
- collision
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60W—CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
- B60W30/095—Predicting travel path or likelihood of collision
- B60W30/0953—Predicting travel path or likelihood of collision, the prediction being responsive to vehicle dynamic parameters
- B60W30/0956—Predicting travel path or likelihood of collision, the prediction being responsive to traffic or environmental parameters
- B60W50/00—Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
- B60W50/14—Means for informing the driver, warning the driver or prompting a driver intervention
- B60W2040/0818—Inactivity or incapacity of driver
- B60W2040/0827—Inactivity or incapacity of driver due to sleepiness
- B60W2050/0075—Automatic parameter input, automatic initialising or calibrating means
- B60W2050/0083—Setting, resetting, calibration
- B60W2050/143—Alarm means
- B60W2540/225—Direction of gaze
- B60W2540/229—Attention level, e.g. attentive to driving, reading or sleeping
Definitions
- the field relates to devices for assisting operation of vehicles, and more particularly, to devices and methods for predicting collisions, predicting intersection violations, or both.
- Cameras have been used in vehicles to capture images of road conditions outside the vehicles.
- a camera may be installed in a subject vehicle for monitoring a traveling path of the subject vehicle or for monitoring other vehicles surrounding the subject vehicle.
- New techniques for determining and tracking the risk of collision and/or the risk of intersection violation are described herein, along with new techniques for providing a control signal to operate a warning generator that warns a driver of such risks.
- An apparatus includes: a first camera configured to view an environment outside a vehicle; a second camera configured to view a driver of the vehicle; and a processing unit configured to receive a first image from the first camera, and a second image from the second camera; wherein the processing unit is configured to determine first information indicating a risk of collision with the vehicle based at least partly on the first image; wherein the processing unit is configured to determine second information indicating a state of the driver based at least partly on the second image; and wherein the processing unit is configured to determine whether to provide a control signal for operating a device or not based on (1) the first information indicating the risk of collision with the vehicle, and (2) the second information indicating the state of the driver.
- the processing unit is configured to predict the collision at least 3 seconds before an expected occurrence time for the predicted collision.
- the processing unit is configured to predict the collision with sufficient lead time for a brain of the driver to process input and for the driver to perform an action to mitigate the risk of the collision.
- the sufficient lead time is dependent on the state of the driver.
- the first information indicating the risk of collision comprises a predicted collision, wherein the processing unit is configured to determine an estimated time it will take for the predicted collision to occur, and wherein the processing unit is configured to provide the control signal if the estimated time it will take for the predicted collision to occur is below a threshold.
- the device comprises a warning generator, and wherein the processing unit is configured to provide the control signal to cause the device to provide a warning for the driver if the estimated time it will take for the predicted collision to occur is below the threshold.
- the device comprises a vehicle control
- the processing unit is configured to provide the control signal to cause the device to control the vehicle if the estimated time it will take for the predicted collision to occur is below the threshold.
- the threshold is variable based on the second information indicating the state of the driver.
- the processing unit is configured to repeatedly evaluate the estimated time against the variable threshold as the predicted collision approaches and the estimated time it will take for the predicted collision to occur correspondingly decreases.
- the threshold is variable in real time based on the state of the driver.
- the processing unit is configured to increase the threshold if the state of the driver indicates that the driver is distracted or is not attentive to a driving task.
- the processing unit is configured to at least temporarily hold off in providing the control signal if the estimated time it will take for the predicted collision to occur is higher than the threshold.
- the processing unit is configured to determine a level of the risk of the collision, and wherein the processing unit is configured to adjust the threshold based on the determined level of the risk of the collision.
- the state of the driver comprises a distracted state
- the processing unit is configured to determine a level of a distracted state of the driver, and wherein the processing unit is configured to adjust the threshold based on the determined level of the distracted state of the driver.
- the threshold has a first value if the state of the driver indicates that the driver is attentive to a driving task, and wherein the threshold has a second value higher than the first value if the state of the driver indicates that the driver is distracted or is not attentive to the driving task.
- the threshold is also based on sensor information indicating that the vehicle is being operated to mitigate the risk of the collision.
- the processing unit is configured to determine whether to provide the control signal or not based on (1) the first information indicating the risk of collision with the vehicle, (2) the second information indicating the state of the driver, and (3) sensor information indicating that the vehicle is being operated to mitigate the risk of the collision.
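The variable-threshold decision described above can be sketched as follows. This is an illustrative sketch only; the names (DriverState, should_signal) and the threshold values are assumptions for the example, not figures from the claims:

```python
# Hypothetical sketch of the driver-state-dependent TTC threshold logic.
from dataclasses import dataclass

@dataclass
class DriverState:
    distracted: bool          # e.g. derived from pose classification
    distraction_level: float  # 0.0 (attentive) .. 1.0 (highly distracted)

BASE_THRESHOLD_S = 3.0  # attentive driver: signal when TTC drops below 3 s
MAX_EXTRA_S = 2.0       # extra lead time granted to a distracted driver

def warning_threshold(state: DriverState) -> float:
    """Return the TTC threshold, increased when the driver is distracted."""
    if not state.distracted:
        return BASE_THRESHOLD_S
    return BASE_THRESHOLD_S + MAX_EXTRA_S * state.distraction_level

def should_signal(ttc_seconds: float, state: DriverState,
                  mitigating: bool = False) -> bool:
    """Decide whether to emit the control signal.

    Holds off while the estimated time-to-collision exceeds the
    driver-state-dependent threshold, or while sensor information
    indicates the driver is already mitigating (e.g. braking).
    """
    if mitigating:
        return False
    return ttc_seconds < warning_threshold(state)
```

In this sketch the evaluation would be repeated as new images arrive, so a collision first predicted with a large TTC eventually crosses the (possibly raised) threshold and triggers the signal.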
- the apparatus further includes a non-transitory medium storing a first model, wherein the processing unit is configured to process the first image based on the first model to determine the risk of the collision.
- the first model comprises a neural network model.
- the non-transitory medium is configured to store a second model, and wherein the processing unit is configured to process the second image based on the second model to determine the state of the driver.
- the processing unit is configured to determine metric values for multiple respective pose classifications, and wherein the processing unit is configured to determine whether the driver is engaged with a driving task or not based on one or more of the metric values.
- the pose classifications comprise two or more of: looking-down pose, looking-up pose, looking-left pose, looking-right pose, cellphone-using pose, smoking pose, holding-object pose, hand(s)-not-on-the-wheel pose, not-wearing-seatbelt pose, eye(s)-closed pose, looking-straight pose, one-hand-on-wheel pose, and two-hands-on-wheel pose.
- the processing unit is configured to compare the metric values with respective thresholds for the respective pose classifications.
- the processing unit is configured to determine the driver as belonging to one of the pose classifications if the corresponding one of the metric values meets or surpasses the corresponding one of the thresholds.
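The per-pose metric comparison described above can be sketched as follows. The pose names, threshold values, and function names are illustrative assumptions, not part of the disclosure:

```python
# Hypothetical sketch: compare per-pose metric values (e.g. neural-network
# confidences) against per-pose thresholds to classify the driver's pose.

POSE_THRESHOLDS = {
    "looking-down": 0.6,
    "cellphone-using": 0.5,
    "eyes-closed": 0.7,
    "looking-straight": 0.6,
}

# Poses treated here as indicating disengagement from the driving task.
DISTRACTED_POSES = {"looking-down", "cellphone-using", "eyes-closed"}

def classify_poses(metrics: dict[str, float]) -> list[str]:
    """Return the pose classifications whose metric value meets or
    surpasses the corresponding threshold."""
    return [pose for pose, value in metrics.items()
            if value >= POSE_THRESHOLDS.get(pose, 1.0)]

def driver_engaged(metrics: dict[str, float]) -> bool:
    """Driver is considered engaged with the driving task if no
    distracted-pose metric meets its threshold."""
    return not any(p in DISTRACTED_POSES for p in classify_poses(metrics))
```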
- the first camera, the second camera, and the processing unit are integrated as parts of an aftermarket device for the vehicle.
- the processing unit is configured to determine the second information by processing the second image to determine whether an image of the driver meets a pose classification or not; and wherein the processing unit is configured to determine whether the driver is engaged with a driving task or not based on the image of the driver meeting the pose classification or not.
- the processing unit is configured to process the second image based on a neural network model to determine the state of the driver.
- the processing unit is configured to determine whether an object in the first image or a bounding box of the object overlaps a region of interest.
- the region of interest has a geometry that is variable in correspondence with a shape of a road or a lane in which the vehicle is traveling.
- the processing unit is configured to determine a centerline of a road or a lane in which the vehicle is traveling, and wherein the region of interest has a shape that is based on the centerline.
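A centerline-shaped region of interest and the bounding-box overlap test can be sketched as below. The geometry is an assumed simplification (per-row horizontal bounds offset from the centerline, in image pixels); function names are illustrative:

```python
# Hypothetical sketch: build a region of interest by offsetting a detected
# lane centerline, then test whether an object's bounding box overlaps it.

def roi_from_centerline(centerline, half_width):
    """centerline: list of (x, y) image points, one per image row of
    interest. Returns a dict mapping y -> (x_left, x_right) ROI bounds,
    so the ROI's shape follows the road's curvature."""
    return {y: (x - half_width, x + half_width) for x, y in centerline}

def bbox_overlaps_roi(bbox, roi):
    """bbox: (x_min, y_min, x_max, y_max). True if any ROI row within the
    box's vertical span intersects the box's horizontal span."""
    x_min, y_min, x_max, y_max = bbox
    for y, (left, right) in roi.items():
        if y_min <= y <= y_max and x_min <= right and x_max >= left:
            return True
    return False
```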
- the processing unit is configured to determine a distance between the vehicle and a physical location based on a y-coordinate of the physical location in a camera image provided by the first camera, wherein the y-coordinate is with respect to an image coordinate frame.
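The y-coordinate-to-distance mapping can be sketched under a flat-road pinhole-camera assumption, where distance is inversely proportional to the image row's offset below the horizon. The calibration constants below (focal length, camera height, horizon row) are assumed example values, not figures from the patent:

```python
# Hypothetical sketch: estimate ground-plane distance from the image-row
# (y) coordinate of a physical location, assuming a flat road and a
# calibrated pinhole camera.

FOCAL_PX = 1000.0    # focal length, in pixels (assumed calibration)
CAM_HEIGHT_M = 1.4   # camera height above the road, meters (assumed)
HORIZON_Y = 360.0    # image row of the horizon (assumed calibration)

def distance_from_y(y: float) -> float:
    """Distance to the road point imaged at row y (rows below the horizon
    have larger y in image coordinates and map to nearer points)."""
    if y <= HORIZON_Y:
        raise ValueError("row at or above the horizon maps to no ground point")
    return FOCAL_PX * CAM_HEIGHT_M / (y - HORIZON_Y)
```

Under this model a monotonic mapping from y-coordinate to distance exists, which is what lets the processing unit read distance off a single camera image.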
- a method performed by an apparatus includes: obtaining a first image generated by a first camera, wherein the first camera is configured to view an environment outside a vehicle; obtaining a second image generated by a second camera, wherein the second camera is configured to view a driver of the vehicle; determining first information indicating a risk of collision with the vehicle based at least partly on the first image; determining second information indicating a state of the driver based at least partly on the second image; and determining whether to provide a control signal for operating a device or not based on (1) the first information indicating the risk of collision with the vehicle, and (2) the second information indicating the state of the driver.
- the first information is determined by predicting the collision, and wherein the collision is predicted at least 3 seconds before an expected occurrence time for the predicted collision.
- the first information is determined by predicting the collision, and wherein the collision is predicted with sufficient lead time for a brain of the driver to process input and for the driver to perform an action to mitigate the risk of the collision.
- the sufficient lead time is dependent on the state of the driver.
- the first information indicating the risk of collision comprises a predicted collision, wherein the method further comprises determining an estimated time it will take for the predicted collision to occur, and wherein the control signal is provided to operate the device if the estimated time it will take for the predicted collision to occur is below a threshold.
- the device comprises a warning generator, and wherein the control signal is provided to cause the device to provide a warning for the driver if the estimated time it will take for the predicted collision to occur is below a threshold.
- the device comprises a vehicle control, and wherein the control signal is provided to cause the device to control the vehicle if the estimated time it will take for the predicted collision to occur is below the threshold.
- the threshold is variable based on the second information indicating the state of the driver.
- the estimated time is repeatedly evaluated against the variable threshold as the predicted collision approaches and the estimated time it will take for the predicted collision to occur correspondingly decreases.
- the threshold is variable in real time based on the state of the driver.
- the method further includes increasing the threshold if the state of the driver indicates that the driver is distracted or is not attentive to a driving task.
- the method further includes at least temporarily holding off in generating the control signal if the estimated time it will take for the predicted collision to occur is higher than the threshold.
- the method further includes determining a level of the risk of the collision, and adjusting the threshold based on the determined level of the risk of the collision.
- the state of the driver comprises a distracted state
- the method further comprises determining a level of a distracted state of the driver, and adjusting the threshold based on the determined level of the distracted state of the driver.
- the threshold has a first value if the state of the driver indicates that the driver is attentive to a driving task, and wherein the threshold has a second value higher than the first value if the state of the driver indicates that the driver is distracted or is not attentive to the driving task.
- the threshold is also based on sensor information indicating that the vehicle is being operated to mitigate the risk of the collision.
- the act of determining whether to provide the control signal for operating the device or not is performed also based on sensor information indicating that the vehicle is being operated to mitigate the risk of the collision.
- the act of determining the first information indicating the risk of the collision comprises processing the first image based on a first model.
- the first model comprises a neural network model.
- the act of determining the second information indicating the state of the driver comprises processing the second image based on a second model.
- the method further includes determining metric values for multiple respective pose classifications, and determining whether the driver is engaged with a driving task or not based on one or more of the metric values.
- the pose classifications comprise two or more of: looking-down pose, looking-up pose, looking-left pose, looking-right pose, cellphone-using pose, smoking pose, holding-object pose, hand(s)-not-on-the-wheel pose, not-wearing-seatbelt pose, eye(s)-closed pose, looking-straight pose, one-hand-on-wheel pose, and two-hands-on-wheel pose.
- the method further includes comparing the metric values with respective thresholds for the respective pose classifications.
- the method further includes determining the driver as belonging to one of the pose classifications if the corresponding one of the metric values meets or surpasses the corresponding one of the thresholds.
- the method is performed by an aftermarket device, and wherein the first camera and the second camera are integrated as parts of the aftermarket device.
- the second information is determined by processing the second image to determine whether an image of the driver meets a pose classification or not; and wherein the method further comprises determining whether the driver is engaged with a driving task or not based on the image of the driver meeting the pose classification or not.
- the act of determining the second information indicating the state of the driver comprises processing the second image based on a neural network model.
- the method further includes determining whether an object in the first image or a bounding box of the object overlaps a region of interest.
- the region of interest has a geometry that is variable in correspondence with a shape of a road or a lane in which the vehicle is traveling.
- the method further includes determining a centerline of a road or a lane in which the vehicle is traveling, and wherein the region of interest has a shape that is based on the centerline.
- the method further includes determining a distance between the vehicle and a physical location based on a y-coordinate of the physical location in a camera image provided by the first camera, wherein the y-coordinate is with respect to an image coordinate frame.
- FIG. 1 illustrates an apparatus in accordance with some embodiments.
- FIG. 2A illustrates a block diagram of the apparatus of FIG. 1 in accordance with some embodiments.
- FIG. 2B illustrates an example of a processing scheme for the apparatus of FIG. 2A.
- FIG. 3 illustrates an example of an image captured by a camera of the apparatus of FIG. 2.
- FIG. 4 illustrates an example of a classifier.
- FIG. 5 illustrates a method in accordance with some embodiments.
- FIG. 6 illustrates an example of an image captured by the camera of FIG. 1, and various classifier outputs.
- FIG. 7 illustrates another example of an image captured by the camera of FIG. 1, and various classifier outputs.
- FIG. 8 illustrates another example of an image captured by the camera of FIG. 1, and various classifier outputs.
- FIG. 9 illustrates an example of a processing architecture having a first model and a second model coupled in series.
- FIG. 10 illustrates an example of feature information received by the second model.
- FIG. 11 illustrates another example of feature information received by the second model.
- FIG. 12 illustrates a method of detecting drowsiness performed by the apparatus of FIG. 2 in accordance with some embodiments.
- FIG. 13 illustrates an example of a processing architecture that may be implemented using the apparatus of FIG. 2A.
- FIG. 14 illustrates examples of object detection in accordance with some embodiments.
- FIG. 15A illustrates another example of object identifiers, particularly showing each object identifier being a box representing a leading vehicle.
- FIG. 15B illustrates another example of object identifiers, particularly showing each object identifier being a horizontal line.
- FIG. 15C illustrates an example of lead vehicle detection in accordance with some embodiments.
- FIG. 15D illustrates a technique of determining a region of interest based on centerline detection.
- FIG. 15E illustrates an advantage of using the region of interest of FIG. 15D for detecting an object that is at risk of collision.
- FIG. 16 illustrates three exemplary scenarios involving collision with a lead vehicle.
- FIG. 17 illustrates another example of object detection in which the objects being detected are humans.
- FIG. 18 illustrates examples of human movement prediction.
- FIGS. 19A-19B illustrate other examples of object detection in which the objects being detected are associated with an intersection.
- FIG. 19C illustrates a concept of braking distance.
- FIG. 20 illustrates a technique of determining a distance between the subject vehicle and a location in front of the vehicle.
- FIG. 21 illustrates an example of a technique for generating a control signal for controlling a vehicle and/or for causing a generation of an alert for a driver.
- FIG. 22A illustrates a method that involves a prediction of collision in accordance with some embodiments.
- FIG. 22B illustrates a method that involves a prediction of intersection violation in accordance with some embodiments.
- FIG. 23 illustrates a technique of determining a model for use by the apparatus of FIG. 2A in accordance with some embodiments.
- FIG. 24 illustrates a specialized processing system for implementing one or more electronic devices described herein.
DESCRIPTION OF THE EMBODIMENTS
- FIG. 1 illustrates an apparatus 200 in accordance with some embodiments.
- the apparatus 200 is configured to be mounted to a vehicle, such as to a windshield of the vehicle, to the rear-view mirror of the vehicle, etc.
- the apparatus 200 includes a first camera 202 configured to view outside the vehicle, and a second camera 204 configured to view inside a cabin of the vehicle.
- the apparatus 200 is in the form of an after-market device that can be installed in a vehicle (i.e., offline from the manufacturing process of the vehicle).
- the apparatus 200 may include a connector configured to couple the apparatus 200 to the vehicle.
- the connector may be a suction cup, an adhesive, a clamp, one or more screws, etc.
- the connector may be configured to detachably secure the apparatus 200 to the vehicle, in which case, the apparatus 200 may be selectively removed from and/or coupled to the vehicle as desired.
- the connector may be configured to permanently secure the apparatus 200 to the vehicle.
- the apparatus 200 may be a component of the vehicle that is installed during a manufacturing process of the vehicle. It should be noted that the apparatus 200 is not limited to having the configuration shown in the example, and that the apparatus 200 may have other configurations in other embodiments. For example, in other embodiments, the apparatus 200 may have a different form factor.
- the apparatus 200 may be an end-user device, such as a mobile phone, a tablet, etc., that has one or more cameras.
- the apparatus 200 includes the first camera 202 and the second camera 204. As shown in the figure, the apparatus 200 also includes a processing unit 210 coupled to the first camera 202 and the second camera 204, a non- transitory medium 230 configured to store data, a communication unit 240 coupled to the processing unit 210, and a speaker 250 coupled to the processing unit 210.
- the first camera 202, the second camera 204, the processing unit 210, the non-transitory medium 230, the communication unit 240, and the speaker 250 may be integrated as parts of an aftermarket device for the vehicle.
- the first camera 202, the second camera 204, the processing unit 210, the non-transitory medium 230, the communication unit 240, and the speaker 250 may be integrated with the vehicle, and may be installed in the vehicle during a manufacturing process of the vehicle.
- the processing unit 210 is configured to obtain images from the first camera 202 and images from the second camera 204, and process the images from the first and second cameras 202, 204.
- the images from the first camera 202 may be processed by the processing unit 210 to monitor an environment outside the vehicle (e.g., for collision detection, collision prevention, driving environment monitoring, etc.).
- the images from the second camera 204 may be processed by the processing unit 210 to monitor a driving behavior of the driver (e.g., whether the driver is distracted, drowsy, focused, etc.).
- the processing unit 210 may process images from the first camera 202 and/or the second camera 204 to determine a risk of collision, to predict the collision, to provision alerts for the driver, etc.
- the apparatus 200 may not include the first camera 202. In such cases, the apparatus 200 is configured to monitor only the environment inside a cabin of the vehicle.
- the processing unit 210 of the apparatus 200 may include hardware, software, or a combination of both.
- hardware of the processing unit 210 may include one or more processors and/or one or more integrated circuits.
- the processing unit 210 may be implemented as a module and/or may be a part of any integrated circuit.
- the non-transitory medium 230 is configured to store data relating to operation of the processing unit 210.
- the non-transitory medium 230 is configured to store a model, which the processing unit 210 can access and utilize to identify pose(s) of a driver as appeared in images from the camera 204, and/or to determine whether the driver is engaged with a driving task or not.
- the model may configure the processing unit 210 so that it has the capability to identify pose(s) of the driver and/or to determine whether the driver is engaged with a driving task or not.
- the non-transitory medium 230 may also be configured to store image(s) from the first camera 202, and/or image(s) from the second camera 204.
- the non-transitory medium 230 may also be configured to store data generated by the processing unit 210.
- the model stored in the non-transitory medium 230 may be any computational model or processing model, including but not limited to a neural network model.
- the model may include feature extraction parameters, based upon which, the processing unit 210 can extract features from images provided by the camera 204 for identification of objects, such as a driver’s head, a hat, a face, a nose, an eye, a mobile device, etc.
- the model may include program instructions, commands, scripts, etc.
- the model may be in a form of an application that can be received wirelessly by the apparatus 200.
- the communication unit 240 of the apparatus 200 is configured to receive data wirelessly from a network, such as a cloud, the Internet, Bluetooth network, etc.
- the communication unit 240 may also be configured to transmit data wirelessly.
- images from the first camera 202, images from the second camera 204, data generated by the processing unit, or any combination of the foregoing may be transmitted by the communication unit 240 to another device (e.g., a server, an accessory device such as a mobile phone, another apparatus 200 in another vehicle, etc.) via a network, such as a cloud, the Internet, Bluetooth network, etc.
- the communication unit 240 may include one or more antennas.
- the communication unit 240 may include a first antenna configured to provide long-range communication, and a second antenna configured to provide near-field communication (such as via Bluetooth).
- the communication unit 240 may be configured to transmit and/or receive data physically through a cable or electrical contacts.
- the communication unit 240 may include one or more communication connectors configured to couple with a data transmission device.
- the communication unit 240 may include a connector configured to couple with a cable, a USB slot configured to receive a USB drive, a memory-card slot configured to receive a memory card, etc.
- the speaker 250 of the apparatus 200 is configured to provide audio alert(s) and/or message(s) to a driver of the vehicle.
- the processing unit 210 may be configured to detect an imminent collision between the vehicle and an object outside the vehicle. In such cases, in response to the detection of the imminent collision, the processing unit 210 may generate a control signal to cause the speaker 250 to output an audio alert and/or message.
- the processing unit 210 may be configured to determine whether the driver is engaged with a driving task or not.
- the processing unit 210 may generate a control signal to cause the speaker 250 to output an audio alert and/or message.
- although the apparatus 200 is described as having the first camera 202 and the second camera 204, in other embodiments, the apparatus 200 may include only the second camera (cabin camera) 204, and not the first camera 202. Also, in other embodiments, the apparatus 200 may include multiple cameras configured to view the cabin inside the vehicle.
- the processing unit 210 also includes a driver monitoring module 211, an object detector 216, a collision predictor 218, an intersection violation predictor 222, and a signal generation controller 224.
- the driver monitoring module 211 is configured to monitor the driver of the vehicle based on one or more images provided by the second camera 204.
- the driver monitoring module 211 is configured to determine one or more poses of the driver.
- the driver monitoring module 211 may be configured to determine a state of the driver, such as whether the driver is alert, drowsy, attentive to a driving task, etc. In some cases, a pose of a driver itself may also be considered to be a state of the driver.
- the object detector 216 is configured to detect one or more objects in the environment outside the vehicle based on one or more images provided by the first camera 202.
- the object(s) being detected may be a vehicle (e.g., car, motorcycle, etc.), a lane boundary, a human, a bicycle, an animal, a road sign (e.g., stop sign, street sign, no turn sign, etc.), a traffic light, a road marking (e.g., stop line, lane divider, text painted on road, etc.), etc.
- the vehicle being detected may be a lead vehicle, which is a vehicle in front of the subject vehicle that is traveling in the same lane as the subject vehicle.
- the collision predictor 218 is configured to determine a risk of a collision based on output from the object detector 216. For example, in some embodiments, the collision predictor 218 may determine that there is a risk of collision with a lead vehicle, and outputs information indicating the risk of such collision. In some embodiments, the collision predictor 218 may optionally also obtain sensor information indicating a state of the vehicle, such as the speed of the vehicle, the acceleration of the vehicle, a turning angle of the vehicle, a turning direction of the vehicle, a braking of the vehicle, a traveling direction of the vehicle, or any combination of the foregoing.
- the collision predictor 218 may be configured to determine the risk of the collision based on the output from the object detector 216, and also based on the obtained sensor information. Also, in some embodiments, the collision predictor 218 may be configured to determine a relative speed between the subject vehicle and an object (e.g., a lead vehicle), and determine that there is a risk of collision based on the determined relative speed. In some embodiments, the collision predictor 218 may be configured to determine a speed of the subject vehicle, a speed of a moving object, a traveling path of the subject vehicle, and a traveling path of the moving object, and determine that there is a risk of the collision based on these parameters.
- the collision predictor 218 may determine that there is a risk of collision.
- the object that may collide with the subject vehicle is not limited to a moving object (e.g., car, motorcycle, bicycle, pedestrian, animal, etc.), and the collision predictor 218 may be configured to determine the risk of collision with a non-moving object, such as a parked car, a street sign, a light post, a building, a tree, a mailbox, etc.
- the collision predictor 218 may be configured to determine a time it will take for the predicted collision to occur, and compare the time with a threshold time. If the time is less than the threshold time, the collision predictor 218 may determine that there is a risk of collision with the subject vehicle.
- the threshold time for identifying the risk of collision may be at least: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 seconds, or higher.
- the collision predictor 218 may consider the speed, acceleration, traveling direction, braking operation, or any combination of the foregoing, of the subject vehicle.
- the collision predictor 218 may also consider the speed, acceleration, traveling direction, or any combination of the foregoing, of the detected object predicted to collide with the vehicle.
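As a sketch only, the TTC-based risk check described above can be expressed as follows; the function names, the 3-second default threshold, and the assumption of a constant closing speed are illustrative, not taken from the embodiments:

```python
def time_to_collision(gap_m, subject_speed_mps, lead_speed_mps):
    """Estimate seconds until collision with a lead vehicle.

    Assumes constant speeds. Returns None when the subject vehicle
    is not closing the gap, so no collision is predicted.
    """
    closing_speed = subject_speed_mps - lead_speed_mps  # relative speed
    if closing_speed <= 0:
        return None  # gap is constant or growing
    return gap_m / closing_speed

def collision_risk(gap_m, subject_speed_mps, lead_speed_mps, threshold_s=3.0):
    """Flag a risk when the predicted TTC falls below the action threshold."""
    ttc = time_to_collision(gap_m, subject_speed_mps, lead_speed_mps)
    return ttc is not None and ttc < threshold_s
```

For example, a 20 m gap closed at 10 m/s gives a 2-second TTC, which is below the illustrative 3-second threshold, so a risk would be flagged.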
- the intersection violation predictor 222 is configured to determine a risk of an intersection violation based on output from the object detector 216. For example, in some embodiments, the intersection violation predictor 222 may determine that there is a risk that the subject vehicle may not be able to stop at a target area associated with a stop sign or a red light, and outputs information indicating the risk of such intersection violation. In some embodiments, the intersection violation predictor 222 may optionally also obtain sensor information indicating a state of the vehicle, such as the speed of the vehicle, the acceleration of the vehicle, a turning angle of the vehicle, a turning direction of the vehicle, a braking of the vehicle, or any combination of the foregoing. In such cases, the intersection violation predictor 222 may be configured to determine the risk of the intersection violation based on the output from the object detector 216, and also based on the obtained sensor information.
- the intersection violation predictor 222 may be configured to determine a target area (e.g., a stop line) at which the subject vehicle is expected to stop, determine a distance between the subject vehicle and the target area, and compare the distance with a threshold distance. If the distance is less than the threshold distance, the intersection violation predictor 222 may determine that there is a risk of intersection violation.
- the intersection violation predictor 222 may be configured to determine a target area (e.g., a stop line) at which the subject vehicle is expected to stop, determine a time it will take for the vehicle to reach the target area, and compare the time with a threshold time. If the time is less than the threshold time, the intersection violation predictor 222 may determine that there is a risk of intersection violation.
- the threshold time for identifying the risk of intersection violation may be at least: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 seconds, or higher.
- the intersection violation predictor 222 may consider the speed, acceleration, traveling direction, braking operation, or any combination of the foregoing, of the subject vehicle.
- it should be noted that the intersection violation is not limited to stop sign and red-light violations, and that the intersection violation predictor 222 may be configured to determine the risk of other intersection violations, such as the vehicle moving into a wrong-way street, the vehicle turning at an intersection with a “no turning on red light” sign, etc.
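The distance and time comparisons performed by the intersection violation predictor 222 might be sketched as follows; the names, the 3-second default, and the constant-speed assumption are illustrative only:

```python
def time_to_crossing(distance_to_stop_m, speed_mps):
    """Seconds until the vehicle reaches the target stop area, at constant speed."""
    if speed_mps <= 0:
        return float("inf")  # stopped: no crossing predicted
    return distance_to_stop_m / speed_mps

def intersection_violation_risk(distance_to_stop_m, speed_mps,
                                threshold_s=3.0, threshold_m=None):
    """Risk when the distance or the time to the stop area is too small."""
    if threshold_m is not None and distance_to_stop_m < threshold_m:
        return True  # distance-based check
    return time_to_crossing(distance_to_stop_m, speed_mps) < threshold_s
```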
- the signal generation controller 224 is configured to determine whether to generate a control signal based on output from the collision predictor 218, and output from the driver monitoring module. Alternatively or additionally, the signal generation controller 224 is configured to determine whether to generate a control signal based on output from the intersection violation predictor 222, and optionally also based on output from the driver monitoring module. In some embodiments, the signal generation controller 224 is configured to determine whether to generate the control signal also based on sensor information provided by one or more sensors at the vehicle.
- the control signal is configured to cause a device (e.g., a warning generator) to provide a warning for the driver if the estimated time it will take for the predicted collision to occur is below a threshold (action threshold).
- the warning generator may output an audio signal, a visual signal, a mechanical vibration (e.g., shaking the steering wheel), or any combination of the foregoing, to alert the driver.
- the control signal is configured to cause a device (e.g., a vehicle control) to control the vehicle if the estimated time it will take for the predicted collision to occur is below the threshold (action threshold).
- the vehicle control may automatically apply the brake of the vehicle, automatically disengage the gas pedal, automatically activate hazard lights, or any combination of the foregoing.
- the signal generation controller 224 may be configured to provide a first control signal to cause a warning to be provided for the driver. If the driver does not take any action to mitigate the risk of collision, the signal generation controller 224 may then provide a second control signal to cause the vehicle control to control the vehicle, such as to automatically apply brake of the vehicle.
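The two-stage escalation above (a first control signal to warn, then a second to control the vehicle if the driver does not mitigate the risk) might be sketched as follows; the thresholds and return labels are hypothetical:

```python
def respond_to_risk(ttc_s, warn_threshold_s=5.0, brake_threshold_s=2.0,
                    driver_reacted=False):
    """Escalate from a warning to automatic braking as the TTC shrinks.

    A warning is issued when the TTC drops below warn_threshold_s; if the
    driver takes no mitigating action and the TTC drops further below
    brake_threshold_s, the vehicle control applies the brake.
    """
    if ttc_s < brake_threshold_s and not driver_reacted:
        return "apply_brake"
    if ttc_s < warn_threshold_s:
        return "warn_driver"
    return "no_action"
```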
- the signal generation controller 224 may be a separate component (e.g., module) from the collision predictor 218 and the intersection violation predictor 222. In other embodiments, the signal generation controller 224 or at least a part of the signal generation controller 224 may be implemented as a part of the collision predictor 218 and/or the intersection violation predictor 222. Also, in some embodiments, the collision predictor 218 and the intersection violation predictor 222 may be integrated together.
- the apparatus 200 is coupled to a vehicle such that the first camera 202 is viewing outside the vehicle, and the second camera 204 is viewing a driver inside the vehicle. While the driver operates the vehicle, the first camera 202 captures images outside the vehicle, and the second camera 204 captures images inside the vehicle.
- FIG. 2B illustrates an example of a processing scheme for the apparatus 200. As shown in the figure, during use of the apparatus 200, the second camera 204 provides images as input to the driver monitoring module 211. The driver monitoring module 211 analyzes the images to determine one or more poses for the driver of the subject vehicle.
- the one or more poses may include looking-down pose, looking-up pose, looking-left pose, looking-right pose, cellphone-using pose, smoking pose, holding-object pose, hand(s)-not-on-the wheel pose, not-wearing-seatbelt pose, eye(s)-closed pose, looking-straight pose, one-hand- on-wheel pose, and two-hands-on-wheel pose.
- the driver monitoring module 211 may be configured to determine one or more states of the driver based on the determined pose(s) of the driver. For example, the driver monitoring module 211 may determine whether the driver is distracted or not based on one or more determined poses for the driver.
- the driver monitoring module 211 may determine whether the driver is drowsy or not based on one or more determined poses for the driver. In some embodiments, if the driver has certain pose (e.g., cellphone-using pose), then the driver monitoring module 211 may determine that the driver is distracted. Also, in some embodiments, the driver monitoring module 211 may analyze a sequence of pose classifications for the driver over a period to determine if the driver is drowsy or not.
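One plausible way to analyze a sequence of pose classifications over a period, as described above, is a sliding-window test; the window size, ratio, and label names below are assumptions for illustration:

```python
def is_drowsy(pose_sequence, window=10, closed_ratio=0.5):
    """Hypothetical drowsiness test over the most recent pose classifications.

    Flags the driver as drowsy when at least closed_ratio of the last
    `window` pose labels are eye(s)-closed poses.
    """
    recent = pose_sequence[-window:]
    if not recent:
        return False  # no classifications yet
    closed = sum(1 for p in recent if p == "eyes_closed")
    return closed / len(recent) >= closed_ratio
```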
- the first camera 202 provides images as input to the object detector 216, which analyzes the images to detect one or more objects in the images.
- the object detector 216 comprises different detectors configured to detect different types of objects.
- the object detector 216 has a vehicle detector 260 configured to detect vehicles outside the subject vehicle, a vulnerable object detector 262 configured to detect vulnerable objects, such as humans, bicycles with bicyclists, animals, etc., and an intersection detector 264 configured to detect one or more items (e.g., stop sign, traffic light, crosswalk marking, etc.) for identifying an intersection.
- the object detector 216 may be configured to determine different types of objects based on different respective models.
- the vehicle detector 260 is configured to detect vehicles outside the subject vehicle, and provide information (such as vehicle identifiers, vehicle positions, etc.) regarding the detected vehicles to module 221.
- the module 221 may be the collision predictor 218 and/or the intersection violation predictor 222.
- the module 221 includes an object tracker 266 configured to track one or more of the detected vehicles, a course predictor 268 configured to determine a course of a predicted collision, and a time to collision / crossing (TTC) module 269 configured to estimate a time it will take for the estimated collision to occur.
- the object tracker 266 is configured to identify a leading vehicle that is traveling in front of the subject vehicle.
- the course predictor 268 is configured to determine the course of the predicted collision based on the identified leading vehicle and sensor information from the sensor(s) 225. For example, based on the speed of the subject vehicle, and a direction of traveling of the subject vehicle, the course predictor 268 may determine a course of a predicted collision.
- the TTC module 269 is configured to calculate a time it will take for the estimated collision to occur based on information regarding the predicted course of collision and sensor information from the sensor(s) 225. For example, the TTC module 269 may calculate a TTC (time-to-collision) based on a distance of the collision course and a relative speed between the leading vehicle and the subject vehicle.
- the signal generation controller 224 is configured to determine whether to generate a control signal to operate a warning generator to provide a warning for the driver, and/or to operate a vehicle control to control the vehicle (e.g., to automatically disengage the gas pedal operation, to apply brake, etc.), based on output from the module 221 and output from the TTC module 269.
- in some embodiments, if the TTC is less than a threshold (e.g., 3 seconds), then the signal generation controller 224 generates the control signal to operate the warning generator and/or the vehicle control.
- the threshold may be adjustable based on the output from the driver monitoring module 211. For example, if the output from the driver monitoring module 211 indicates that the driver is distracted or not attentive to a driving task, then the signal generation controller 224 may increase the threshold (e.g., making the threshold to be 5 seconds). This way, the signal generation controller 224 will provide the control signal when the TTC with the leading vehicle is less than 5 seconds.
- the module 221 is not limited to predicting collision with a leading vehicle, and that the module 221 may be configured to predict collision with other vehicles.
- the module 221 may be configured to detect a vehicle that is traveling towards a path of the subject vehicle, such as a vehicle approaching an intersection, a vehicle merging towards the lane of the subject vehicle, etc.
- the course predictor 268 determines the course of the subject vehicle, as well as the course of the other vehicle, and also determines the intersection between the two courses.
- the TTC module 269 is configured to determine the TTC based on the location of the intersection, the speed of the other vehicle, and the speed of the subject vehicle.
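The course-crossing computation above can be sketched with straight-line courses in a 2D plane; the positions, unit heading vectors, and the 2-second arrival margin are illustrative assumptions, not the patented method:

```python
def course_intersection(p1, v1, p2, v2):
    """Solve p1 + d1*v1 == p2 + d2*v2 for the travel distances d1, d2.

    p1, p2 are (x, y) positions; v1, v2 are unit heading vectors.
    Returns (d1, d2), or None when the courses are parallel.
    """
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    if abs(cross) < 1e-9:
        return None  # parallel courses never intersect
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    d1 = (dx * v2[1] - dy * v2[0]) / cross
    d2 = (v1[1] * dx - v1[0] * dy) / cross
    return d1, d2

def crossing_collision_risk(p1, v1, speed1, p2, v2, speed2, margin_s=2.0):
    """Risk when both vehicles reach the crossing point at about the same time."""
    hit = course_intersection(p1, v1, p2, v2)
    if hit is None or hit[0] < 0 or hit[1] < 0:
        return False  # no crossing point ahead of both vehicles
    t1, t2 = hit[0] / speed1, hit[1] / speed2
    return abs(t1 - t2) < margin_s
```

For instance, a subject vehicle heading north and another vehicle heading east toward the same point would be flagged when their arrival times nearly coincide.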
- the vulnerable object detector 262 is configured to detect vulnerable objects outside the subject vehicle, and provide information (such as object identifiers, object positions, etc.) regarding the detected objects to module 221.
- the vulnerable object detector 262 may detect humans outside the subject vehicle, and provide information regarding the detected humans to the module 221.
- the module 221 includes an object tracker 266 configured to track one or more of the detected objects (e.g., humans), a course predictor 268 configured to determine a course of a predicted collision, and a time to collision / crossing (TTC) module 269 configured to estimate a time it will take for the estimated collision to occur.
- the course predictor 268 is configured to determine a box surrounding the image of the detected object for indicating possible positions of the object. In some embodiments, the course predictor 268 is configured to determine the course of the predicted collision based on the box surrounding the identified object (e.g., human), and sensor information from the sensor(s) 225. For example, based on the speed of the subject vehicle, a direction of traveling of the subject vehicle, and the box surrounding the identified object, the course predictor 268 may determine that the current traveling path of the subject vehicle will intersect the box.
- the TTC module 269 is configured to calculate a time it will take for the estimated collision to occur based on information regarding the predicted course of collision and sensor information from the sensor(s) 225. For example, the TTC module 269 may calculate a TTC (time-to-collision) based on a distance of the collision course and a relative speed between the subject vehicle and the human.
- the signal generation controller 224 is configured to determine whether to generate a control signal to operate a warning generator to provide a warning for the driver, and/or to operate a vehicle control to control the vehicle (e.g., to automatically disengage the gas pedal operation, to apply brake, etc.), based on output from the module 221 and output from the TTC module 269. In some embodiments, if the TTC is less than a threshold (e.g., 3 seconds), then the signal generation controller 224 generates the control signal to operate the warning generator and/or the vehicle control. Also, in some embodiments, the threshold may be adjustable based on the output from the driver monitoring module 211.
- the signal generation controller 224 may increase the threshold (e.g., making the threshold to be 5 seconds). This way, the signal generation controller 224 will provide the control signal when the TTC with the object is less than 5 seconds.
- the module 221 is not limited to predicting collision with a human, and that the module 221 may be configured to predict collision with other objects.
- the module 221 may be configured to detect animals, bicyclists, roller-skaters, skateboarders, etc.
- the course predictor 268 may be configured to determine the course of the subject vehicle, as well as the course of the detected object (if the object is moving in one direction, such as a bicyclist), and also to determine the intersection between the two courses.
- the course predictor 268 may determine the path of the subject vehicle, and a box encompassing a range of possible positions of the object, and may determine the intersection between the path of the subject vehicle and the box, as similarly discussed.
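The path-versus-box intersection test described above might be sketched as follows, assuming a vehicle-centered frame in which the subject vehicle travels straight ahead along the +y axis (an illustrative simplification):

```python
def path_intersects_box(path_half_width_m, box):
    """Check whether the subject vehicle's swept path overlaps a box of
    possible object positions.

    The vehicle travels from the origin along +y, sweeping the strip
    |x| <= path_half_width_m, y >= 0. `box` is (xmin, xmax, ymin, ymax)
    in the same vehicle-centered frame.
    """
    xmin, xmax, ymin, ymax = box
    overlaps_x = xmin <= path_half_width_m and xmax >= -path_half_width_m
    overlaps_y = ymax >= 0  # box is not entirely behind the vehicle
    return overlaps_x and overlaps_y
```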
- the TTC module 269 is configured to determine the TTC based on the location of the intersection and the speed of the subject vehicle.
- the intersection detector 264 is configured to detect one or more objects outside the subject vehicle indicating an intersection, and provide information (such as type of intersection, required stop location for the vehicle, etc.) regarding the intersection to module 221.
- the one or more objects indicating an intersection may include a traffic light, a stop sign, a road marking, etc., or any combination of the foregoing.
- the intersections that can be detected by the intersection detector 264 may include a stop-sign intersection, a traffic-light intersection, an intersection with a train railroad, etc.
- the module 221 includes a course predictor 268 configured to determine a course of a predicted intersection violation, and a time to collision / crossing (TTC) module 269 configured to estimate a time it will take for the estimated intersection violation to occur.
- the TTC module 269 is configured to calculate a time it will take for the estimated intersection violation to occur based on the required stop location for the vehicle and sensor information from the sensor(s) 225. For example, the TTC module 269 may calculate a TTC (time-to-crossing) based on a distance of the course (e.g., a distance between the current position of the vehicle and the required stop location for the vehicle), and a speed of the subject vehicle.
- the required stop location may be determined by the object detector 216 detecting a stop line marking on the road. In other embodiments, there may not be a stop line marking on the road. In such cases, the course predictor 268 may determine an imaginary line or a graphical line indicating the required stop location.
- the signal generation controller 224 is configured to determine whether to generate a control signal to operate a warning generator to provide a warning for the driver, and/or to operate a vehicle control to control the vehicle (e.g., to automatically disengage the gas pedal operation, to apply brake, etc.), based on output from the module 221 and output from the TTC module 269.
- in some embodiments, if the TTC is less than a threshold (e.g., 3 seconds), then the signal generation controller 224 generates the control signal to operate the warning generator and/or the vehicle control.
- the threshold may be adjustable based on the output from the driver monitoring module 211. For example, if the output from the driver monitoring module 211 indicates that the driver is distracted or not attentive to a driving task, then the signal generation controller 224 may increase the threshold (e.g., making the threshold to be 5 seconds). This way, the signal generation controller 224 will provide the control signal when the time to crossing the intersection is less than 5 seconds.
- the signal generation controller 224 may be configured to apply different values of threshold for generating the control signal based on the type of state of the driver indicated by the output of the driver monitoring module 211. For example, if the output of the driver monitoring module 211 indicates that the driver is looking at a cell phone, then the signal generation controller 224 may generate the control signal to operate the warning generator and/or to operate the vehicle control in response to the TTC meeting or being less than a threshold of 5 seconds.
- the signal generation controller 224 may generate the control signal to operate the warning generator and/or to operate the vehicle control in response to the TTC being below a threshold of 8 seconds (e.g., longer than the threshold for the case in which the driver is using a cell phone).
- a longer time threshold (for comparison with the TTC value) may be needed to alert the driver and/or to control the vehicle because certain states of the driver (such as the driver being sleepy or drowsy) may cause the driver to take longer to react to an imminent collision. Accordingly, the signal generation controller 224 will alert the driver and/or may operate the vehicle control earlier in response to a predicted collision in these circumstances.
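The state-dependent thresholds described above (for example, 5 seconds for cell phone use and 8 seconds for drowsiness) suggest a simple lookup; the state labels and values below are illustrative assumptions:

```python
# Hypothetical per-state action thresholds in seconds; real values would be tuned.
ACTION_THRESHOLDS_S = {
    "attentive": 3.0,
    "using_cellphone": 5.0,
    "drowsy": 8.0,
}

def should_signal(ttc_s, driver_state):
    """Generate the control signal earlier for slower-reacting driver states."""
    threshold = ACTION_THRESHOLDS_S.get(driver_state, 3.0)  # default: attentive
    return ttc_s <= threshold
```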
- the second camera 204 is configured for viewing a driver inside the vehicle. While the driver operates the vehicle, the first camera 202 captures images outside the vehicle, and the second camera 204 captures images inside the vehicle.
- FIG. 3 illustrates an example of an image 300 captured by the second camera 204 of the apparatus 200 of FIG. 2.
- the image 300 from the second camera 204 may include an image of a driver 310 operating the subject vehicle (the vehicle with the apparatus 200).
- the processing unit 210 is configured to process image(s) (e.g., the image 300) from the camera 204, and to determine whether the driver is engaged with a driving task or not.
- a driving task may be paying attention to a road or environment in front of the subject vehicle, having hand(s) on steering wheel, etc.
- the processing unit 210 is configured to process the image 300 of the driver from the camera 204, and to determine whether the driver belongs to certain pose classification(s).
- the pose classification(s) may be one or more of: looking-down pose, looking-up pose, looking-left pose, looking-right pose, cellphone-using pose, smoking pose, holding-object pose, hand(s)-not-on-the wheel pose, not-wearing-seatbelt pose, eye(s)-closed pose, looking-straight pose, one-hand-on-wheel pose, and two-hands-on- wheel pose.
- the processing unit 210 is configured to determine whether the driver is engaged with a driving task or not based on one or more pose classifications. For example, if the driver’s head is “looking” down, and the driver is holding a cell phone, then the processing unit 210 may determine that the driver is not engaged with a driving task (i.e., the driver is not paying attention to the road or to an environment in front of the vehicle). As another example, if the driver’s head is “looking” to the right or left, and if the angle of head turn has passed a certain threshold, then the processing unit 210 may determine that the driver is not engaged with a driving task.
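- The rule-style combination of pose classifications described above can be sketched as follows. This is a minimal illustration, not the patent's implementation; the pose names, the 45-degree head-turn limit, and the function names are assumptions.

```python
# Illustrative sketch: deciding engagement from pose classifications.
# Pose names and the head-turn threshold are assumed for illustration.

HEAD_TURN_LIMIT_DEG = 45.0  # assumed angle threshold

def is_engaged(poses: set, head_turn_deg: float = 0.0) -> bool:
    """Return False when the detected poses indicate a distracted driver."""
    if {"looking-down", "cellphone-using"} <= poses:
        return False  # looking down while holding a cell phone
    if abs(head_turn_deg) > HEAD_TURN_LIMIT_DEG:
        return False  # head turned too far to the left or right
    return True
```
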
- the processing unit 210 is configured to determine whether the driver is engaged with a driving task or not based on one or more pose(s) of the driver as it appears in the image without a need to determine a gaze direction of an eye of the driver. This feature is advantageous because a gaze direction of an eye of the driver may not be captured in an image, or may not be determined accurately.
- a driver of the vehicle may be wearing a hat that prevents his / her eyes from being captured by the vehicle camera.
- the driver may also be wearing sun glasses that obstruct the view of the eyes.
- the frame of the glasses may also obstruct the view of the eyes, and/or the lens of the glasses may make detection of the eyes inaccurate. Accordingly, determining whether the driver is engaged with a driving task or not without a need to determine the gaze direction of the eye of the driver is advantageous, because even if the eye(s) of the driver cannot be detected and/or if the eye’s gazing direction cannot be determined, the processing unit 210 can still determine whether the driver is engaged with a driving task or not.
- [00135] In some embodiments, the processing unit 210 may use context-based classification to determine whether the driver is engaged with a driving task or not.
- based on the context-based classification, the processing unit 210 may determine that the driver is not engaged with a driving task. The processing unit 210 may make such a determination even if the driver’s eyes cannot be detected (e.g., because they may be blocked by a cap like that shown in FIG. 3). The processing unit 210 may also use context-based classification to determine one or more poses for the driver. For example, if the driver’s head is facing downward, then the processing unit 210 may determine that the driver is looking downward even if the eyes of the driver cannot be detected.
- as another example, if the driver’s head is facing upward, the processing unit 210 may determine that the driver is looking upward even if the eyes of the driver cannot be detected. As a further example, if the driver’s head is facing towards the right, then the processing unit 210 may determine that the driver is looking right even if the eyes of the driver cannot be detected. As a further example, if the driver’s head is facing towards the left, then the processing unit 210 may determine that the driver is looking left even if the eyes of the driver cannot be detected.
- the processing unit 210 may be configured to use a model to identify one or more poses for the driver, and to determine whether the driver is engaged with a driving task or not.
- the model may be used by the processing unit 210 to process images from the camera 204.
- the model may be stored in the non-transitory medium 230.
- the model may be transmitted from a server, and may be received by the apparatus 200 via the communication unit 240.
- the model may be a neural network model.
- the neural network model may be trained based on images of other drivers.
- the neural network model may be trained using images of drivers to identify different poses, such as looking-down pose, looking-up pose, looking-left pose, looking-right pose, cellphone-using pose, smoking pose, holding-object pose, hand(s)-not-on-the-wheel pose, not-wearing-seatbelt pose, eye(s)-closed pose, looking-straight pose, one-hand-on-wheel pose, two-hands-on-wheel pose, etc.
- the neural network model may be trained to identify the different poses even without detection of the eyes of the persons in the images.
- the neural network model may identify different poses and/or determine whether a driver is engaged with a driving task or not based on context (e.g., based on information captured in the image regarding the state of the driver other than a gazing direction of the eye(s) of the driver).
- the model may be any other type of model that is different from a neural network model.
- the neural network model may be trained to classify pose(s) and/or to determine whether the driver is engaged with a driving task or not, based on context. For example, if the driver is holding a cell phone, and has a head pose that is facing downward towards the cell phone, then the neural network model may determine that the driver is not engaged with a driving task (e.g., is not looking at the road or the environment in front of the vehicle) without the need to detect the eyes of the driver.
- deep learning or artificial intelligence may be used to develop a model that identifies pose(s) for the driver and/or determines whether the driver is engaged with a driving task or not.
- a model can distinguish a driver who is engaged with a driving task from a driver who is not.
- the model utilized by the processing unit 210 to identify pose(s) for the driver may be a convolutional neural network model. In other embodiments, the model may be simply any mathematical model.
- FIG. 5 illustrates an algorithm 500 for determining whether a driver is engaged with a driving task or not.
- the algorithm 500 may be utilized for determining whether a driver is paying attention to the road or environment in front of the vehicle.
- the algorithm 500 may be implemented and/or performed using the processing unit 210 in some embodiments.
- the processing unit 210 processes an image from the camera 204 to attempt to detect a face of a driver based on the image (item 502). If the face of the driver cannot be detected in the image, the processing unit 210 may then determine that it is unknown as to whether the driver is engaged with a driving task or not. On the other hand, if the processing unit 210 determines that a face of the driver is present in the image, the processing unit 210 may then determine whether the eye(s) of the driver is closed (item 504). In one implementation, the processing unit 210 may be configured to determine eye visibility based on a model, such as a neural network model.
- if the processing unit 210 determines that the eye(s) of the driver is closed, the processing unit 210 may determine that the driver is not engaged with a driving task. On the other hand, if the processing unit 210 determines that the eye(s) of the driver is not closed, the processing unit 210 may then attempt to detect a gaze of the eye(s) of the driver based on the image (item 506).
- the processing unit 210 may then determine a direction of the gaze (item 510). For example, the processing unit 210 may analyze the image to determine a pitch (e.g., up-down direction) and/or a yaw (e.g., left-right direction) of the gazing direction of the eye(s) of the driver. If the pitch of the gazing direction is within a prescribed pitch range, and if the yaw of the gazing direction is within a prescribed yaw range, then the processing unit 210 may determine that the user is engaged with a driving task (i.e., the driver is paying attention to the road or the environment in front of the vehicle).
- otherwise, the processing unit 210 may determine that the user is not engaged with a driving task (item 514).
- if a gaze of the eye(s) of the driver cannot be detected based on the image, the processing unit 210 may then determine whether the driver is engaged with a driving task or not without requiring a determination of a gaze direction of the eye(s) of the driver (item 520).
- the processing unit 210 may be configured to use a model to make such determination based on context (e.g., based on information captured in the image regarding the state of the driver other than a gazing direction of the eye(s) of the driver).
- the model may be a neural network model that is configured to perform context-based classification for determining whether the driver is engaged with a driving task or not.
- the model is configured to process the image to determine whether the driver belongs to one or more pose classifications. If the driver is determined as belonging to one or more pose classifications, then the processing unit 210 may determine that the driver is not engaged with a driving task (item 522). If the driver is determined as not belonging to one or more pose classifications, the processing unit 210 may then determine that the driver is engaged with a driving task or that it is unknown whether the driver is engaged with a driving task or not (item 524).
- [00145] In some embodiments, the above items 502, 504, 506, 510, 520 may be repeatedly performed by the processing unit 210 to process multiple images in a sequence provided by the camera 204, thereby performing real-time monitoring of the driver while the driver is operating the vehicle.
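- The branching order of algorithm 500 (items 502 through 524) described above can be sketched as follows. This is a minimal sketch under assumptions: the face, eye, and gaze detector outputs are taken as hypothetical inputs, and the pitch/yaw ranges are assumed values that the patent leaves as "prescribed ranges".

```python
# Illustrative sketch of the decision flow of algorithm 500.
# Detector outputs are hypothetical inputs; only the branching order
# follows the description above.

def classify_engagement(face_found, eyes_closed, gaze, distraction_pose):
    """Return 'engaged', 'not engaged', or 'unknown'.

    gaze: (pitch, yaw) in degrees, or None if gaze cannot be detected.
    distraction_pose: True if a context-based pose classification matched.
    """
    if not face_found:                               # item 502
        return "unknown"
    if eyes_closed:                                  # item 504
        return "not engaged"
    if gaze is not None:                             # items 506 / 510
        pitch, yaw = gaze
        if -15 <= pitch <= 15 and -30 <= yaw <= 30:  # assumed ranges
            return "engaged"
        return "not engaged"                         # item 514
    # Gaze unavailable: context-based classification (items 520-524).
    return "not engaged" if distraction_pose else "engaged"
```
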
- the algorithm 500 is not limited to the example described, and that the algorithm 500 implemented using the processing unit 210 may have other features and/or variations.
- the algorithm 500 may not include item 502 (detection of a face of a driver).
- the algorithm 500 may not include item 504 (detection of closed-eye condition).
- the algorithm 500 may not include item 506 (attempt to detect gaze) and/or item 510 (determination of gaze direction).
- the processing unit 210 may still perform context-based classification to determine whether the driver belongs to one or more poses.
- the pose classification(s) may be used by the processing unit 210 to confirm a gaze direction of the eye(s) of the driver.
- the gaze direction of the eye(s) of the driver may be used by the processing unit 210 to confirm one or more pose classifications for the driver.
- the processing unit 210 is configured to determine whether the driver belongs to one or more pose classifications based on image(s) from the camera 204, and to determine whether the driver is engaged with a driving task or not based on the one or more pose classifications. In some embodiments, the processing unit 210 is configured to determine metric values for multiple respective pose classifications, and to determine whether the driver is engaged with a driving task or not based on one or more of the metric values.
- FIG. 6 illustrates examples of classification outputs 602 provided by the processing unit 210 based on the image 604a. In the example, the classification outputs 602 include metric values for the respective different pose classifications.
- the gaze direction of the driver can be determined by the processing unit 210.
- the gaze direction is represented by a graphical object superimposed on the nose of the driver in the image.
- the graphical object may include a vector or a line that is parallel to a gaze direction. Alternatively or additionally, the graphical object may include one or more vectors or one or more lines that are perpendicular to the gaze direction.
- FIG. 7 illustrates other examples of classification outputs 602 provided by the processing unit 210 based on image 604b.
- the metric value for the “looking down” pose has a relatively high value (e.g., higher than 0.6), indicating that the driver has a “looking down” pose.
- the metric values for the other poses have relatively low values, indicating that the driver in the image 604b does not meet these pose classifications.
- FIG. 8 illustrates other examples of classification outputs 602 provided by the processing unit 210 based on image 604c.
- the metric value for the “looking left” pose has a relatively high value (e.g., higher than 0.6), indicating that the driver has a “looking left” pose.
- the metric values for the other poses have relatively low values, indicating that the driver in the image 604c does not meet these pose classifications.
- the processing unit 210 is configured to compare the metric values with respective thresholds for the respective pose classifications. In such cases, the processing unit 210 is configured to determine the driver as belonging to one of the pose classifications if the corresponding one of the metric values meets or surpasses the corresponding one of the thresholds.
- the thresholds for the different pose classifications may be set to 0.6. In such cases, if any of the metric values for any of the pose classifications exceeds 0.6, then the processing unit 210 may determine that the driver has a pose belonging to the pose classification (i.e., the one with the metric value exceeding 0.6).
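- The per-pose threshold comparison described above can be sketched as follows. The metric values are assumed model outputs in the range 0.0 to 1.0 (per the text), and the pose names are illustrative; the 0.6 threshold is the example value from the description.

```python
# Illustrative sketch: comparing per-pose metric values against per-pose
# thresholds (here all set to the example value 0.6). In general each pose
# classification may have its own tuned threshold.

THRESHOLDS = {
    "looking-down": 0.6,
    "looking-left": 0.6,
    "cellphone-using": 0.6,
}

def matched_poses(metrics: dict) -> list:
    """Return the pose classifications whose metric exceeds its threshold."""
    return [pose for pose, value in metrics.items()
            if value > THRESHOLDS.get(pose, 0.6)]
```
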
- if any of the metric values for any of the pose classifications surpasses the pre-set threshold (e.g., 0.6), the processing unit 210 may determine that the driver is not engaged with a driving task.
- the same pre-set threshold is implemented for the different respective pose classifications. In other embodiments, at least two of the thresholds for the at least two respective pose classifications may have different values.
- the metric values for the pose classifications have a range from 0.0 to 1.0, with 1.0 being the highest. In other embodiments, the metric values for the pose classifications may have other ranges. Also, in other embodiments, the convention of the metric values may be reversed in that a lower metric value may indicate that the driver is meeting a certain pose classification, and a higher metric value may indicate that the driver is not meeting a certain pose classification.
- the thresholds for the different pose classifications may be tuned in a tuning procedure, so that the different pose classifications will have their respective tuned thresholds for allowing the processing unit 210 to determine whether an image of a driver belongs to certain pose classification(s) or not.
- a single model may be utilized by the processing unit 210 to provide multiple pose classifications.
- the multiple pose classifications may be outputted by the processing unit 210 in parallel or in sequence.
- the model may comprise multiple sub-models, with each sub-model being configured to detect a specific classification of pose.
- the thresholds for the respective pose classifications are configured to determine whether a driver’s image meets the respective pose classifications.
- the thresholds for the respective pose classifications may be configured to allow the processing unit 210 to determine whether the driver is engaging with a driving task or not. In such cases, if one or more metric values for one or more respective pose classifications meet or surpass the respective one or more thresholds, then the processing unit 210 may determine whether the driver is engaged with the driving task or not.
- the pose classifications may belong to a “distraction” class.
- if the determined pose classification(s) belong to the “distraction” class, the processing unit 210 may determine that the driver is not engaged with the driving task (e.g., the driver is distracted).
- Examples of pose classifications belonging to “distraction” class include “looking-left” pose, “looking-right” pose, “looking-up” pose, “looking-down” pose, “cell phone holding” pose, etc.
- the pose classifications may belong to an “attention” class.
- pose classifications belonging to “attention” class include “looking-straight” pose, “hand(s) on wheel” pose, etc.
- context-based classification is advantageous because it allows the processing unit 210 to identify a driver who is not engaged with a driving task even if a gaze direction of the eyes of the driver cannot be detected.
- even if the driver’s eyes cannot be detected, context-based identification will still allow the processing unit 210 to identify a driver who is not engaged with a driving task.
- Aftermarket products may be mounted in different positions, making it difficult to detect eyes and gaze.
- the features described herein are advantageous because they allow determination of whether the driver is engaged with driving task or not even if the apparatus 200 is mounted in such a way that the driver’s eyes and gaze cannot be detected.
- the processing unit 210 is not limited to using a neural network model to determine pose classification(s) and/or whether a driver is engaged with a driving task or not, and the processing unit 210 may utilize any processing technique, algorithm, or processing architecture to determine pose classification(s) and/or whether a driver is engaged with a driving task or not.
- the processing unit 210 may utilize equations, regression, classification, neural networks (e.g., convolutional neural networks, deep neural networks), heuristics, selection (e.g., from a library, graph, or chart), instance-based methods (e.g., nearest neighbor), correlation methods, regularization methods (e.g., ridge regression), decision trees, Bayesian methods, kernel methods, probability, deterministic methods, or a combination of two or more of the above, to process image(s) from the camera 204 to determine pose classification(s) and/or whether a driver is engaged with a driving task or not.
- a pose classification can be a binary classification or binary score (e.g., looking up or not), a score (e.g., continuous or discontinuous), a classification (e.g., high, medium, low), or be any other suitable measure of pose classification.
- the processing unit 210 is not limited to detecting poses indicating that the driver is not engaged with driving task (e.g., poses belonging to “distraction” class). In other embodiments, the processing unit 210 may be configured to detect poses indicating that the driver is engaged with driving task (e.g., poses belonging to “attention” class). In further embodiments, the processing unit 210 may be configured to detect both (1) poses indicating that the driver is not engaged with driving task, and (2) poses indicating that the driver is engaged with driving task.
- the processing unit 210 may be further configured to determine a collision risk based on whether the driver is engaged with a driving task or not. In some embodiments, the processing unit 210 may be configured to determine the collision risk based solely on whether the driver is engaged with a driving task or not. For example, the processing unit 210 may determine that the collision risk is “high” if the driver is not engaged with a driving task, and may determine that the collision risk is “low” if the driver is engaged with a driving task. In other embodiments, the processing unit 210 may be configured to determine the collision risk based on additional information.
- the processing unit 210 may be configured to keep track of how long the driver is not engaged with a driving task, and may determine a level of collision risk based on a duration of the “lack of engagement with a driving task” condition.
- the processing unit 210 may process images from the first camera 202 to determine whether there is an obstacle (e.g., a vehicle, a pedestrian, etc.) in front of the subject vehicle, and may determine the collision risk based on a detection of such obstacle and in combination of the pose classification(s).
- camera images from the camera 204 are utilized to monitor the driver’s engagement with the driving task.
- camera images from the camera 202 may also be utilized as well.
- the camera images capturing the outside environment of the vehicle may be processed by the processing unit 210 to determine whether the vehicle is turning left, moving straight, or turning right. Based on the direction in which the vehicle is travelling, the processing unit 210 may then adjust one or more thresholds for pose classifications of the driver, and/or one or more thresholds for determining whether the driver is engaged with driving task or not.
- if the vehicle is turning left, the processing unit 210 may then adjust the threshold for the “looking-left” pose classification, so that a driver who is looking left will not be classified as not engaged with driving task.
- the threshold for “looking-left” pose classification may have a value of 0.6 for a straight-travelling vehicle, and may have a value of 0.9 for a left-turning vehicle.
- thus, if the metric value for the “looking-left” pose is 0.7 while the vehicle is travelling straight, the processing unit 210 may determine that the driver is not engaged with the driving task (because the metric value of 0.7 surpasses the threshold of 0.6 for a straight-travelling vehicle).
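- The direction-dependent threshold example above (0.6 when travelling straight, 0.9 when turning left) can be sketched as follows. The direction labels and function names are assumptions for illustration.

```python
# Illustrative sketch: raising the "looking-left" threshold while the
# vehicle turns left, so a left-looking driver is not flagged as distracted.
# Threshold values 0.6 / 0.9 come from the example in the description.

def looking_left_threshold(vehicle_direction: str) -> float:
    return 0.9 if vehicle_direction == "turning_left" else 0.6

def flags_distraction(looking_left_metric: float, vehicle_direction: str) -> bool:
    return looking_left_metric > looking_left_threshold(vehicle_direction)
```

- With a metric value of 0.7, the driver is flagged as distracted when travelling straight but not when turning left, matching the example above.
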
- a pose classification (e.g., “looking-left” pose) may belong to “distraction” class in one situation, and may belong to “attention” class in another situation.
- the processing unit 210 is configured to process images of the external environment from the camera 202 to obtain an output, and adjust one or more thresholds based on the output.
- the output may be a classification of driving condition, a classification of the external environment, a determined feature of the environment, a context of an operation of the vehicle, etc.
- the processing unit 210 may also be configured to process images (e.g., the image 300) from the camera 204, and to determine whether the driver is drowsy or not based on the processing of the images. In some embodiments, the processing unit 210 may also process images from the camera 204 to determine whether the driver is distracted or not. In further embodiments, the processing unit 210 may also process images from the camera 202 to determine a collision risk.
- the driver monitoring module 211 of the processing unit 210 may include a first model and a second model that are configured to operate together to detect drowsiness of the driver.
- FIG. 9 illustrates an example of a processing architecture having the first model 212 and the second model 214 coupled in series.
- the first and second models 212, 214 are in the processing unit 210, and/or may be considered as parts of the processing unit 210 (e.g., a part of the driver monitoring module 211).
- the models 212, 214 are shown schematically to be in the processing unit 210, in some embodiments, the models 212, 214 may be stored in the non-transitory medium 230.
- the models 212, 214 may still be considered as a part of the processing unit 210.
- a sequence of images 400a-400e from the camera 204 are received by the processing unit 210.
- the first model 212 of the processing unit 210 is configured to process the images 400a-400e.
- the first model 212 is configured to determine one or more poses for a corresponding one of the images 400a-400e.
- the first model 212 may analyze the image 400a and may determine that the driver has an “opened-eye(s)” pose and a “head-straight” pose.
- the first model 212 may analyze the image 400b and may determine that the driver has a “closed-eye(s)” pose.
- the first model 212 may analyze the image 400c and may determine that the driver has a “closed-eye(s)” pose.
- the first model 212 may analyze the image 400d and may determine that the driver has a “closed-eye(s)” pose and a “head-down” pose.
- the first model 212 may analyze the image 400e and may determine that the driver has a “closed-eye(s)” pose and a “head-straight” pose.
- the sequence of images received by the first model 212 may include more than five images.
- the camera 204 may have a frame rate of at least 10 frames per second (e.g., 15 fps), and the first model 212 may continue to receive images from the camera 204 at that rate for the duration of the operation of the vehicle by the driver.
- the first model may be a single model utilized by the processing unit 210 to provide multiple pose classifications.
- the multiple pose classifications may be outputted by the processing unit 210 in parallel or in sequence.
- the first model may comprise multiple sub-models, with each sub-model being configured to detect a specific classification of pose. For example, there may be a sub-model that detects face, a sub-model that detects head-up pose, a sub-model that detects head-down pose, a sub-model that detects closed-eye(s) pose, a sub-model that detects head-straight pose, a sub-model that detects opened-eye(s) pose, etc.
- the first model 212 of the processing unit 210 is configured to determine metric values for multiple respective pose classifications.
- the first model 212 of the processing unit 210 is also configured to compare the metric values with respective thresholds for the respective pose classifications.
- the processing unit 210 is configured to determine the driver as belonging to one of the pose classifications if the corresponding one of the metric values meets or surpasses the corresponding one of the thresholds.
- the thresholds for the different pose classifications may be set to 0.6. In such cases, if any of the metric values for any of the pose classifications exceeds 0.6, then the processing unit 210 may determine that the driver has a pose belonging to the pose classification (i.e., the one with the metric value exceeding 0.6).
- the same pre-set threshold is implemented for the different respective pose classifications.
- at least two of the thresholds for the at least two respective pose classifications may have different values.
- the metric values for the pose classifications have a range from 0.0 to 1.0, with 1.0 being the highest.
- the metric values for the pose classifications may have other ranges.
- the convention of the metric values may be reversed in that a lower metric value may indicate that the driver is meeting a certain pose classification, and a higher metric value may indicate that the driver is not meeting a certain pose classification.
- the first model 212 is configured to process images of the driver from the camera 204, and to determine whether the driver belongs to certain pose classifications.
- the pose classifications may belong to a “drowsiness” class, in which each of the pose classifications may indicate sign of drowsiness.
- the pose classification(s) in the “drowsiness” class may be one or more of: head-down pose, closed-eye(s), etc., or any of other poses that would be helpful in determining whether the driver is drowsy.
- the pose classifications may belong to an “alertness” class, in which each of the pose classifications may indicate sign of alertness.
- the pose classification(s) may be one or more of: cellphone-usage pose, etc., or any of other poses that would be helpful in determining whether the driver is drowsy or not.
- certain poses may belong to both “drowsiness” class and “alertness” class.
- head-straight and open-eye(s) pose may belong to both classes.
- the pose identifications may be outputted by the first model 212 as feature information.
- the second model 214 obtains the feature information from the first model 212 as input, and processes the feature information to determine whether the driver is drowsy or not.
- the second model 214 also generates an output indicating whether the driver is drowsy or not.
- the feature information outputted by the first model 212 may be a time series of data.
- the time series of data may be pose classifications of the driver for the different images 400 at the different respective times.
- the first model 212 processes the images sequentially one-by-one to determine pose(s) for each image.
- as pose classification(s) is determined for each image by the first model 212, the determined pose classification(s) for that image is then outputted by the first model 212 as feature information.
- feature information for the respective images are also outputted one-by-one sequentially by the first model 212.
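- The series architecture of FIG. 9 (a per-image first model feeding a temporal second model) can be sketched as follows. Both "models" here are simple stand-in rules, not the patent's trained neural networks; the eye-openness input, the 0.5 cutoff, and the three-consecutive-frame rule are assumptions for illustration.

```python
# Illustrative sketch: per-image first model produces "O"/"C" pose labels,
# and a second model consumes the resulting time series of labels.
# Both models are stand-in rules, assumed for illustration.

def first_model(eye_openness: float) -> str:
    """Stand-in per-image classifier: 'C' (closed-eye) or 'O' (opened-eye)."""
    return "C" if eye_openness < 0.5 else "O"  # assumed cutoff

def second_model(pose_series: list) -> bool:
    """Stand-in temporal model: drowsy if 3+ consecutive closed-eye frames."""
    run = best = 0
    for label in pose_series:
        run = run + 1 if label == "C" else 0
        best = max(best, run)
    return best >= 3

def pipeline(frames: list) -> bool:
    # First model runs image-by-image; its outputs form the feature
    # information (a time series) consumed by the second model.
    return second_model([first_model(f) for f in frames])
```
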
- FIG. 10 illustrates an example of feature information received by the second model 214.
- the feature information includes pose classifications for the different respective images in a sequence, wherein “O” indicates that the driver has an “opened-eye(s)” pose in the image, and “C” indicates that the driver has a “closed-eye(s)” pose in the image.
- the second model 214 analyzes the feature information to determine whether the driver is drowsy or not.
- the second model 214 may be configured (e.g., programmed, made, trained, etc.) to analyze the pattern of the feature information, and determine whether it is a pattern that is associated with drowsiness (e.g., a pattern indicating drowsiness). For example, the second model 214 may be configured to determine blink rate, eye closure duration, time taken to achieve eyelid closure, PERCLOS, or any of other metric(s) that measures or indicates alertness or drowsiness, based on the time series of feature information.
- if the blink rate exceeds a certain threshold, the processing unit 210 may determine that the driver is drowsy.
- if the eye closure duration exceeds a certain threshold, the processing unit 210 may determine that the driver is drowsy. A person who is drowsy may have a longer eye closure duration compared to a person who is alert.
- if the time taken to achieve eyelid closure exceeds a certain threshold, the processing unit 210 may determine that the driver is drowsy. It should be noted that the time taken to achieve eyelid closure is a time interval between a state of the eyes being substantially opened (e.g., at least 80% opened, at least 90% opened, 100% opened, etc.) until the eyelids are substantially closed (e.g., at least 70% closed, at least 80% closed, at least 90% closed, 100% closed, etc.). It is a measure of the speed of the closing of the eyelids. A person who is drowsy tends to have a slower speed of eyelid closure compared to a person who is alert.
- PERCLOS is a drowsiness metric that indicates the proportion of time in a minute that the eyes are at least 80 percent closed.
- PERCLOS is the percentage of eyelid closure over the pupil over time and reflects slow eyelid closures rather than blinks.
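- The PERCLOS definition above can be sketched as a direct computation over a window of per-frame eyelid closure measurements. The closure-fraction input representation (0.0 fully open, 1.0 fully closed) is an assumption; the 80% criterion follows the definition in the text.

```python
# Illustrative sketch: computing PERCLOS from a time series of per-frame
# eyelid closure fractions (0.0 = fully open, 1.0 = fully closed), i.e.,
# the proportion of time the eyes are at least 80 percent closed.

def perclos(closure_fractions: list) -> float:
    """Fraction of frames in the window with eyes >= 80% closed."""
    if not closure_fractions:
        return 0.0
    closed = sum(1 for c in closure_fractions if c >= 0.8)
    return closed / len(closure_fractions)
```

- In practice the window would cover about a minute of frames, per the definition above; slow eyelid closures raise PERCLOS while ordinary blinks contribute little.
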
- the feature information provided by the first model 212 to the second model 214 is not limited to the examples of pose classifications described in FIG. 10, and that the feature information utilized by the second model 214 for detecting drowsiness may include other pose classifications.
- FIG. 11 illustrates another example of feature information received by the second model 214.
- the feature information includes pose classifications for the different respective images in a sequence, wherein “S” indicates that the driver has a “head straight” pose in the image, and “D” indicates that the driver has a “head down” pose in the image.
- the second model 214 analyzes the feature information to determine whether the driver is drowsy or not.
- for example, if the “head down” pose repeats in a periodic pattern in the feature information, the processing unit 210 may determine that the driver is drowsy.
- the second model 214 may be configured (e.g., programmed, made, trained, etc.) to analyze the pattern of the feature information, and determine whether it is a pattern that is associated with drowsiness (e.g., a pattern indicating drowsiness).
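- Detecting a repeating head-down pattern in the "S"/"D" feature series of FIG. 11 (head nodding as a sign of drowsiness) can be sketched as follows. The nod-count threshold and function names are assumptions for illustration, not the patent's trained model.

```python
# Illustrative sketch: counting head "nods" (S -> D transitions) in the
# time series of head-pose labels, where "S" = head-straight and
# "D" = head-down. A repeating pattern suggests drowsiness.

def count_head_nods(series: list) -> int:
    """Count S->D transitions (head dropping) in the label series."""
    return sum(1 for a, b in zip(series, series[1:]) if a == "S" and b == "D")

def looks_drowsy(series: list, min_nods: int = 2) -> bool:
    # min_nods is an assumed threshold for a "periodic" nodding pattern.
    return count_head_nods(series) >= min_nods
```
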
- the feature information provided by the first model 212 to the second model 214 may have a data structure that allows different pose classifications to be associated with different time points. Also, in some embodiments, such data structure may also allow one or more pose classifications to be associated with a particular time point.
- the output of the first model 212 may be a numerical vector (e.g., a low dimensional numerical vector, such as embedding) that provides a numerical representation of pose(s) detected by the first model 212.
- the numerical vector may not be interpretable by a human, but may provide information regarding detected pose(s).
- the first model 212 may be a neural network model.
- the neural network model may be trained based on images of other drivers.
- the neural network model may be trained using images of drivers to identify different poses, such as head-down pose, head-up pose, head-straight pose, closed-eye(s) pose, opened-eye(s) pose, cellphone-usage pose, etc.
- the first model 212 may be any other type of model that is different from a neural network model.
- the second model 214 may be a neural network model.
- the neural network model may be trained based on feature information.
- the feature information may be any information indicating a state of a driver, such as pose classification.
- the neural network model may be trained using feature information output by the first model 212.
- the second model 214 may be any other type of model that is different from a neural network model.
- the first model 212 utilized by the processing unit 210 to identify pose(s) for the driver may be a convolutional neural network model. In other embodiments, the first model 212 may simply be any mathematical model. Also, in some embodiments, the second model 214 utilized by the processing unit 210 to determine whether the driver is drowsy or not may be a convolutional neural network model. In other embodiments, the second model 214 may simply be any mathematical model.
- the first model 212 may be a first neural network model trained to classify pose(s) based on context. For example, if the driver’s head is facing down, then the neural network model may determine that the driver is not looking straight even if the eyes of the driver cannot be detected (e.g., because the eyes may be blocked by a hat / cap).
- the second model 214 may be a second neural network model trained to determine whether the driver is drowsy or not based on context. For example, if the blink rate exceeds a certain threshold, and/or if the head-down pose and head-straight pose repeat in a periodic pattern, then the neural network model may determine that the driver is drowsy. As another example, if the time it took to achieve eyelid closure exceeds a certain threshold, then the neural network model may determine that the driver is drowsy.
- deep learning or artificial intelligence may be used to develop one or more models that identify pose(s) for the driver and/or determine whether the driver is drowsy or not.
- Such model(s) can distinguish a driver who is drowsy from a driver who is alert.
- processing unit 210 is not limited to using neural network model(s) to determine pose classification(s) and/or whether a driver is drowsy or not, and that the processing unit 210 may utilize any processing technique, algorithm, or processing architecture to determine pose classification(s) and/or whether a driver is drowsy or not.
- the processing unit 210 may utilize equations, regression, classification, neural networks (e.g., convolutional neural networks, deep neural networks), heuristics, selection (e.g., from a library, graph, or chart), instance-based methods (e.g., nearest neighbor), correlation methods, regularization methods (e.g., ridge regression), decision trees, Bayesian methods, kernel methods, probabilistic methods, deterministic methods, or a combination of two or more of the above, to process image(s) from the camera 204 to determine pose classification(s) and/or to process time series of feature information to determine whether a driver is drowsy or not.
- a pose classification can be a binary classification or binary score (e.g., head down or not), a score (e.g., continuous or discontinuous), a classification (e.g., high, medium, low), or may be any other suitable measure of pose classification.
- a drowsiness classification can be a binary classification or binary score (e.g., drowsy or not), a score (e.g., continuous or discontinuous), a classification (e.g., high, medium, low), or may be any other suitable measure of drowsiness.
- the determination of whether a driver is drowsy or not may be accomplished by analyzing a pattern of pose classifications of the driver that occur over a period, such as a period that is at least: a fraction of a second, 1 second, 2 seconds, 5 seconds, 10 seconds, 12 seconds, 15 seconds, 20 seconds, 30 seconds, 1 minute, 2 minutes, 5 minutes, 10 minutes, 15 minutes, 20 minutes, 25 minutes, 30 minutes, 40 minutes, etc.
- the period may be any pre-determined time duration of a moving window or moving box (for identifying data that was generated in the last time duration, e.g., data in the last fraction of a second, 1 second, 2 seconds, 5 seconds, 10 seconds, 12 seconds, 15 seconds, 20 seconds, 30 seconds, 1 minute, 2 minutes, 5 minutes, 10 minutes, 15 minutes, 20 minutes, 25 minutes, 30 minutes, 40 minutes, etc.).
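A moving window of this kind can be sketched as a time-stamped buffer that discards entries older than the window duration. The class name and window length below are assumptions for illustration.

```python
from collections import deque

# Illustrative moving-window buffer for pose classifications: keeps only
# entries generated within the last `window_s` seconds. The class name
# and the window length are assumptions for the sketch.

class PoseWindow:
    def __init__(self, window_s=30.0):
        self.window_s = window_s
        self.buffer = deque()  # (timestamp, pose) pairs

    def add(self, timestamp, pose):
        self.buffer.append((timestamp, pose))
        # Drop entries older than the window duration
        while self.buffer and timestamp - self.buffer[0][0] > self.window_s:
            self.buffer.popleft()

    def poses(self):
        return [p for _, p in self.buffer]

w = PoseWindow(window_s=10.0)
for t, p in [(0.0, "S"), (4.0, "D"), (9.0, "S"), (15.0, "D")]:
    w.add(t, p)
print(w.poses())  # entries at t=0.0 and t=4.0 have aged out
```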
- the first model 212 and the second model 214 may be configured to operate together to detect a “micro sleep” event, such as slow eyelid closure that occurs over a sub-second duration, between 1 and 1.5 seconds, or over more than 2 seconds.
- the first model 212 and the second model 214 may be configured to operate together to detect early sign(s) of drowsiness based on images captured in a longer period, such as a period that is longer than 10 seconds, 12 seconds, 15 seconds, 20 seconds, 30 seconds, 1 minute, 2 minutes, 5 minutes, 10 minutes, 15 minutes, 20 minutes, 25 minutes, 30 minutes, 40 minutes, etc.
- the technique of combining the use of (1) the first model to process camera images (one-by-one as each camera image is generated) to identify the driver’s poses, and (2) the second model to process feature information resulting from the processing of camera images by the first model obviates the need for the processing unit 210 to collect a sequence of images in a batch, and to process the batch of camera images (video) together. This saves significant computational resources and memory space.
- the second model does not process images from the camera. Instead, the second model receives feature information as output from the first model, and processes the feature information to determine whether the driver is drowsy or not.
- context-based classification is advantageous because it allows the processing unit 210 to identify different poses of the driver accurately. In some cases, even if the apparatus 200 is mounted at a significantly off angle with respect to the vehicle (which may result in the driver appearing at odd angles and/or positions in the camera images), context-based identification will still allow the processing unit 210 to correctly identify poses of the driver. Aftermarket products may be mounted in different positions. The features described herein are also advantageous because they allow determination of whether the driver is drowsy or not even if the apparatus 200 is mounted at different angles.
- the processing unit 210 is not limited to detecting poses indicating that the driver is drowsy (e.g., poses belonging to “drowsiness” class). In other embodiments, the processing unit 210 may be configured to detect poses indicating that the driver is alert (e.g., poses belonging to “alertness” class). In further embodiments, the processing unit 210 may be configured to detect both (1) poses indicating that the driver is drowsy, and (2) poses indicating that the driver is alert.
- the processing unit 210 may obtain (e.g., by receiving or determining) additional parameter(s) for determining whether the driver is drowsy or not.
- the processing unit 210 may be configured to obtain acceleration of the vehicle, deceleration of the vehicle, vehicle position with respect to the driving lane, information regarding driver participation in the driving, etc.
- one or more of the above parameters may be obtained by the second model 214, which then determines whether the driver is drowsy or not based on the output from the first model 212, as well as based on such parameter(s). It should be noted that acceleration, deceleration, and information regarding driver participation are indicators of whether the driver is actively driving or not.
- sensors built within the vehicle may provide acceleration and deceleration information.
- the processing unit 210 may be hardwired to the vehicle system for receiving such information.
- the processing unit 210 may be configured to receive such information wirelessly.
- the apparatus 200 comprising the processing unit 210 may optionally further include an accelerometer for detecting acceleration and deceleration.
- the second model 214 may be configured to obtain the acceleration and/or deceleration information from the accelerometer.
- information regarding driver participation may be any information indicating that the driver is or is not operating the vehicle.
- information regarding driver participation may be information regarding driver participation that occurs within a certain past duration of time (e.g., within the last 10 seconds or longer, last 20 seconds or longer, last 30 seconds or longer, last 1 minute or longer, etc.).
- the vehicle position with respect to the driving lane may be determined by the processing unit 210 processing images from the external facing camera 202.
- the processing unit 210 may be configured to determine whether the vehicle is traveling within a certain threshold from a center line of the lane. If the vehicle is traveling within the certain threshold from the center line of the lane, that means the driver is actively participating in the driving. On the other hand, if the vehicle is drifting away from the center line of the lane past the threshold, that means the driver may not be actively participating in the driving.
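The center-line check described above can be sketched as a simple threshold comparison on the vehicle's lateral offset from the lane center; the offset values and the 0.5 m threshold below are illustrative assumptions.

```python
# Hypothetical check of driver participation from lane position: the
# vehicle's lateral offset from the lane center line is compared against
# a threshold. Offset and threshold values are illustrative assumptions.

def actively_driving(lateral_offset_m, threshold_m=0.5):
    """True if the vehicle stays within `threshold_m` of the lane center."""
    return abs(lateral_offset_m) <= threshold_m

print(actively_driving(0.2))   # True: close to the center line
print(actively_driving(-0.8))  # False: drifting past the threshold
```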
- the second model 214 may be configured to receive images from the first camera 202, and to determine whether the vehicle is traveling within a certain threshold from the center line of the lane.
- another module may be configured to provide this feature. In such cases, the output of the module is input to the second model 214 for allowing the model 214 to determine whether the driver is drowsy or not based on the output of the module.
- the processing unit 210 may be further configured to determine a collision risk based on whether the driver is drowsy or not. In some embodiments, the processing unit 210 may be configured to determine the collision risk based solely on whether the driver is drowsy or not. For example, the processing unit 210 may determine that the collision risk is “high” if the driver is drowsy, and may determine that the collision risk is “low” if the driver is not drowsy (e.g., alert). In other embodiments, the processing unit 210 may be configured to determine the collision risk based on additional information. For example, the processing unit 210 may be configured to keep track of how long the driver has been drowsy, and may determine a level of collision risk based on a duration of the drowsiness.
- the processing unit 210 may process images from the first camera 202 to determine an output, and may determine the collision risk based on such output and in combination of the pose classification(s) and/or drowsiness determination.
- the output may be a classification of driving condition, a classification of the external environment, a determined feature of the environment, a context of an operation of the vehicle, etc.
- the camera images capturing the outside environment of the vehicle may be processed by the processing unit 210 to determine whether the vehicle is turning left, moving straight, turning right, whether there is an obstacle (e.g., a vehicle, a pedestrian, etc.) in front of the subject vehicle, etc. If the vehicle is turning, and/or if there is an obstacle detected in the travelling path of the vehicle, while drowsiness is detected, the processing unit 210 may then determine that the collision risk is high.
- the second model 214 of the processing unit 210 is not limited to receiving only output from the first model 212.
- the second model 214 may be configured to receive other information (as input(s)) that are in addition to the output from the first model 212.
- the second model 214 may be configured to receive sensor signals from one or more sensors mounted to a vehicle, wherein the sensor(s) is configured to sense information about movement characteristic(s) and/or operation characteristic(s) of the vehicle.
- the sensor signals obtained by the second model 214 may be accelerometer signals, gyroscope signals, speed signals, location signals (e.g., GPS signals), etc., or any combination of the foregoing.
- the processing unit 210 may include a processing module that processes the sensor signals.
- the second model 214 may be configured to receive the processed sensor signals from the processing module.
- the second model 214 may be configured to process the sensor signals (provided by the sensor(s)) or the processed sensor signals (provided from the processing module) to determine a collision risk.
- the determination of the collision risk may be based on drowsiness detection and the sensor signals. In other embodiments, the determination of the collision risk may be based on drowsiness detection, the sensor signals, and images of surrounding environment outside the vehicle captured by the camera 202.
- the processing unit 210 may include a facial landmark(s) detection module configured to detect one or more facial landmarks of the driver as captured in images of the camera 204.
- the second model 214 may be configured to receive output from the facial landmark(s) detection module.
- the output from the facial landmark(s) detection module may be utilized by the second model 214 to determine drowsiness and/or alertness.
- the output from the facial landmark(s) detection module may be used to train the second model 214.
- the processing unit 210 may include an eye landmark(s) detection module configured to detect one or more eye landmarks of the driver as captured in images of the camera 204.
- the second model 214 may be configured to receive output from the eye landmark(s) detection module.
- the output from the eye landmark(s) detection module may be utilized by the second model 214 to determine drowsiness and/or alertness.
- the output from the eye landmark(s) detection module may be used to train the second model 214.
- An eye landmark may be a pupil, an eyeball, an eyelid, etc., or any feature associated with an eye of a driver.
- the second model 214 may be configured to receive the one or more items of information, and the output from the first model 212, in parallel. This allows different information to be received by the second model 214 independently and/or simultaneously.
- FIG. 12 illustrates a method 650 performed by the apparatus 200 of FIG. 2A in accordance with some embodiments.
- the method 650 includes: generating, by the camera, images of a driver of a vehicle (item 652); processing the images by the first model of the processing unit to obtain feature information (item 654); providing, by the first model, the feature information (item 656); obtaining, by the second model, the feature information from the first model (item 658); and processing, by the second model, the feature information to obtain an output that indicates whether the driver is drowsy or not (item 660).
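The items of method 650 can be sketched as a minimal pipeline with stand-in models: `first_model` maps each camera image to feature information (a pose label), and `second_model` maps the accumulated feature information to a drowsiness output. Both stand-ins are placeholders for the trained models, not the patented implementation.

```python
# Minimal sketch of the two-model pipeline of method 650, with stand-in
# models. The dict-based "image" and both model bodies are illustrative
# assumptions standing in for camera frames and trained networks.

def first_model(image):
    # Stand-in pose classifier: a real implementation would be a trained
    # neural network processing the camera image (items 652/654).
    return "D" if image.get("eyes_closed") else "S"

def second_model(features):
    # Stand-in drowsiness model over the feature sequence (items 658/660).
    return features.count("D") / max(len(features), 1) > 0.5

features = []
for image in [{"eyes_closed": True}, {"eyes_closed": True}, {"eyes_closed": False}]:
    features.append(first_model(image))  # feature info provided per image
drowsy = second_model(features)
print(drowsy)  # True: head-down pose in 2 of 3 frames
```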
- the poses that can be determined by the driver monitoring module 211 are not limited to the examples described, and that the driver monitoring module 211 may determine other poses or behaviors of the driver.
- the driver monitoring module 211 may be configured to detect talking, singing, eating, daydreaming etc., or any combination of the foregoing, of the driver. Detecting cognitive distraction (e.g., talking) is advantageous because even if the driver is looking at the road, the risk of intersection violation and/or the risk of collision may be higher if the driver is cognitively distracted (compared to if the driver is attentive to driving).
- FIG. 13 illustrates an example of a processing architecture 670 in accordance with some embodiments. At least part(s) of the processing architecture 670 may be implemented using the apparatus of FIG. 2A in some embodiments.
- the processing architecture 670 includes a calibration module 671 configured to determine a region of interest for detecting object(s) in an image that may be at risk of collision with the subject vehicle, a vehicle detector 672 configured to detect vehicles, and a vehicle state module 674 configured to obtain information regarding one or more states of the subject vehicle.
- the processing architecture 670 also includes a collision predictor 675 having a tracker 676 and a time-to-collision (TTC) computation unit 680.
- the processing architecture 670 further includes a driver monitoring module 678 configured to determine whether the driver of the subject vehicle is distracted or not.
- the processing architecture 670 also includes an event trigger module 682 configured to generate a control signal 684 in response to detection of certain event(s) based on output provided by the collision predictor 675, and a contextual event module 686 configured to provide a contextual alert 688 based on output provided by the driver monitoring module 678.
- the vehicle detector 672 may be implemented by the object detector 216, and/or may be considered as an example of the object detector 216.
- the collision predictor 675 may be an example of the collision predictor 218 of the processing unit 210 in some embodiments.
- the driver monitoring module 678 may be implemented by the driver monitoring module 211 of the processing unit 210 in some embodiments.
- the event trigger module 682 may be implemented using the signal generation controller 224 of the processing unit 210, and/or may be considered as examples of the signal generation controller 224.
- the calibration module 671 is configured to determine a region of interest for the first camera 202 for detecting vehicle(s) that may be at risk of collision with the subject vehicle.
- the calibration module 671 will be described further in reference to FIGS. 15A-15C.
- the vehicle detector 672 is configured to identify vehicles in camera images provided by the first camera 202.
- the vehicle detector 672 is configured to detect vehicles in images based on a model, such as a neural network model that has been trained to identify vehicles.
- the driver monitoring module 678 is configured to determine whether the driver of the subject vehicle is distracted or not. In some embodiments, the driver monitoring module 678 may determine one or more poses of the driver based on images provided by the second camera 204. The driver monitoring module 678 may determine whether the driver is distracted or not based on the poses of the driver. In some cases, the driver monitoring module 678 may determine one or more poses of the driver based on a model, such as a neural network model that has been trained to identify poses of drivers.
- the collision predictor 675 is configured to select one or more of the vehicles detected by the vehicle detector 672 as possible candidates for collision prediction.
- the collision predictor 675 is configured to select a vehicle for collision prediction if the image of the vehicle intersects the region of interest (determined by the calibration module 671) in an image frame.
- the collision predictor 675 is also configured to track the state of the selected vehicle (by the tracker 676).
- the state of the selected vehicle being tracked may be: a position of the vehicle, a speed of the vehicle, an acceleration or deceleration of the vehicle, a movement direction of the vehicle, etc., or any combination of the foregoing.
- the tracker 676 may be configured to determine if a detected vehicle is in a collision course with the subject vehicle based on a traveling path of the subject vehicle and/or a traveling path of the detected vehicle. Also, in some embodiments, the tracker 676 may be configured to determine that a vehicle is a leading vehicle if an image of the detected vehicle as it appears in an image frame from the first camera 202 intersects a region of interest in the image frame.
- the TTC unit 680 of the collision predictor 675 is configured to calculate an estimated time it will take for the selected vehicle to collide with the subject vehicle for the predicted collision based on the tracked state of the selected vehicle and the state of the subject vehicle (provided by the vehicle state module 674). For example, if the tracked state of the selected vehicle indicates that the vehicle is in the path of the subject vehicle, and is travelling slower than the subject vehicle, the TTC unit 680 then determines the estimated time it will take for the selected vehicle to collide with the subject vehicle. As another example, if the tracked state of the selected vehicle indicates that the vehicle is a leading vehicle that is in front of the subject vehicle, the TTC unit 680 then determines the estimated time it will take for the selected vehicle to collide with the subject vehicle. In some embodiments, the TTC unit 680 may determine the estimated time to the predicted collision based on the relative speed between the two vehicles and/or a distance between the two vehicles. The TTC unit 680 is configured to provide the estimated time (TTC parameter) as output.
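The TTC computation from relative speed and distance can be sketched as follows; the function and variable names, and the example values, are illustrative assumptions.

```python
# Illustrative time-to-collision (TTC) calculation from the distance
# between the two vehicles and their relative (closing) speed, as
# described above. Names and values are assumptions for the sketch.

def time_to_collision(distance_m, subject_speed_mps, lead_speed_mps):
    """TTC in seconds; None if the gap is not closing."""
    closing_speed = subject_speed_mps - lead_speed_mps
    if closing_speed <= 0:
        return None  # leading vehicle is as fast or faster: no collision
    return distance_m / closing_speed

print(time_to_collision(40.0, 25.0, 15.0))  # 4.0 seconds
print(time_to_collision(40.0, 15.0, 25.0))  # None: gap is opening
```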
- the collision predictor 675 is not limited to predicting collision between a leading vehicle and the subject vehicle, and that the collision predictor 675 may be configured to predict other types of collisions.
- the collision predictor 675 may be configured to predict collision between the subject vehicle and another vehicle that are traveling in two different respective roads (e.g., intersecting roads) and that are heading towards an intersection.
- the collision predictor 675 may be configured to predict collision between the subject vehicle and another vehicle traveling in a next lane that is merging or drifting into the lane of the subject vehicle.
- the event triggering module 682 is configured to provide a control signal based on output provided by the collision predictor 675 and output provided by the driver monitoring module 678. In some embodiments, the event trigger module 682 is configured to continuously or periodically monitor the state of the driver based on output provided by the driver monitoring module 678. The event trigger module 682 also monitors the TTC parameter in parallel. If the TTC parameter indicates that the estimated time it will take for the predicted collision to occur is below a certain threshold (e.g., 8 seconds, 7 seconds, 6 seconds, 5 seconds, 4 seconds, 3 seconds, etc.), and if the output by the driver monitoring module 678 indicates that the driver is distracted or not attentive to a driving task, then the event triggering module 682 will generate the control signal 684.
- control signal 684 from event triggering module 682 may be transmitted to a warning generator that is configured to provide a warning for the driver.
- control signal 684 from the event triggering module 682 may be transmitted to a vehicle control that is configured to control the vehicle (e.g., to automatically disengage the gas pedal operation, to apply brake, etc.).
- the threshold is variable based on the output from the driver monitoring module 678. For example, if the output from the driver monitoring module 678 indicates that the driver is not distracted and/or is attentive to a driving task, then the event trigger module 682 may generate the control signal 684 to operate the warning generator and/or to operate the vehicle control in response to the TTC meeting or being below a first threshold (e.g., 3 seconds).
- the event trigger module 682 may generate the control signal 684 to operate the warning generator and/or to operate the vehicle control in response to the TTC meeting or being below a second threshold (e.g., 5 seconds) that is higher than the first threshold.
- the event trigger module 682 may be configured to apply different values of threshold for generating the control signal 684 based on the type of state of the driver indicated by the output of the driver monitoring module 678. For example, if the output of the driver monitoring module 678 indicates that the driver is looking at a cell phone, then the event trigger module 682 may generate the control signal 684 to operate the warning generator and/or to operate the vehicle control in response to the TTC meeting or being below a threshold of 5 seconds.
- the event trigger module 682 may generate the control signal 684 to operate the warning generator and/or to operate the vehicle control in response to the TTC meeting or being below a threshold of 8 seconds (e.g., longer than the threshold for the case in which the driver is using a cell phone).
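The state-dependent thresholds in the examples above (3 seconds when attentive, 5 seconds for cell-phone use, 8 seconds for drowsiness) can be sketched as a lookup table; the state names and the trigger logic are illustrative assumptions.

```python
# Hypothetical mapping of driver state to a TTC threshold for generating
# the control signal. The threshold values follow the examples in the
# text; the state names and fallback are illustrative assumptions.

TTC_THRESHOLDS_S = {
    "attentive": 3.0,   # driver not distracted: shortest threshold
    "cellphone": 5.0,   # driver looking at a cell phone
    "drowsy": 8.0,      # drowsy driver needs more time to react
}

def should_trigger(ttc_s, driver_state):
    """True if the TTC meets or falls below the state's threshold."""
    threshold = TTC_THRESHOLDS_S.get(driver_state, 3.0)
    return ttc_s <= threshold

print(should_trigger(6.0, "drowsy"))     # True: 6 s <= 8 s threshold
print(should_trigger(6.0, "attentive"))  # False: 6 s > 3 s threshold
```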
- a longer time threshold may be needed to alert the driver and/or to control the vehicle because in certain states of the driver (such as being sleepy or drowsy), it may take longer for the driver to react to an imminent collision.
- the event triggering module 682 will alert the driver and/or may operate the vehicle control earlier in response to a predicted collision in these circumstances.
- the TTC unit 680 is configured to determine a TTC value for a predicted collision, and then keep track of the passage of time with respect to the TTC value. For example, if the TTC unit 680 determines that the TTC for a predicted collision is 10 seconds, then the TTC unit 680 may perform a countdown of time for the 10 seconds. As the TTC unit 680 is doing the countdown, the TTC unit 680 periodically outputs the TTC to let the event triggering module 682 know the current TTC value.
- the TTC outputted by the TTC unit 680 at different respective times for the predicted collision will have different respective values based on the countdown.
- the TTC unit 680 is configured to repeatedly determine the TTC values for the predicted collision based on images from the first camera 202. In such cases, the TTC outputted by the TTC unit 680 at different respective times for the predicted collision will have different respective values computed by the TTC unit 680 based on the images from the first camera 202.
- the collision predictor 675 may continue to monitor the other vehicle and/or the state of the subject vehicle after a collision has been predicted. For example, if the other vehicle has moved out of the path of the subject vehicle, and/or if the distance between the two vehicles is increasing (e.g., because the other vehicle has accelerated, and/or the subject vehicle has decelerated), then the collision predictor 675 may provide an output indicating that there is no longer any risk of collision. In some embodiments, the TTC unit 680 may output a signal indicating to the event triggering module 682 that it does not need to generate the control signal 684.
- the TTC unit 680 may output a predetermined arbitrary TTC value that is very high (e.g., 2000 seconds), or a TTC having a negative value, so that when the event triggering module 682 processes the TTC value, it will not result in generation of the control signal 684.
- Embodiments of the collision predictor 675 (example of collision predictor 218) and embodiments of the event triggering module 682 (example of the signal generation controller 224) will be described further below.
- the contextual event module 686 is configured to provide a contextual alert 688 based on output provided by the driver monitoring module 678. For example, if the output of the driver monitoring module 678 indicates that the driver has been distracted for a duration that exceeds a duration threshold, or in a frequency that exceeds a frequency threshold, then the contextual event module 686 may generate an alert to warn the driver. Alternatively or additionally, the contextual event module 686 may generate a message to inform a fleet manager, insurance company, etc. In other embodiments, the contextual event module 686 is optional, and the processing architecture 670 may not include the contextual event module 686.
- item 672 may be a human detector, and the processing architecture 670 may be configured to predict collision with humans, and to generate a control signal based on the predicted collision and the state of the driver outputted by the driver monitoring module 678, as similarly described herein.
- item 672 may be an object detector configured to detect object(s) associated with an intersection
- the processing architecture 670 may be configured to predict intersection violation, and to generate a control signal based on the predicted intersection violation and the state of the driver outputted by the driver monitoring module 678, as similarly described herein.
- item 672 may be an object detector configured to detect multiple classes of objects, such as vehicles, humans, and objects associated with an intersection, etc.
- the processing architecture 670 may be configured to predict vehicle collision, predict human collision, predict intersection violation, etc., and to generate a control signal based on any one of these predicted events, and based on the state of the driver outputted by the driver monitoring module 678, as similarly described herein.
- FIG. 14 illustrates examples of object detection in accordance with some embodiments.
- the objects being detected are vehicles captured in the images provided by the first camera 202.
- the detection of the objects may be performed by the object detector 216.
- the identified vehicles are provided respective identifiers (e.g., in the form of bounding boxes to indicate the spatial extents of the respective identified vehicle).
- the object detector 216 is not limited to providing identifiers that are rectangular bounding boxes for the identified vehicles, and that the object detector 216 may be configured to provide other forms of identifiers for the respective identified vehicles.
- the object detector 216 may distinguish vehicle(s) that are leading vehicle(s) from other vehicle(s) that are not leading vehicle(s).
- the processing unit 210 may keep track of identified leading vehicles, and may determine a region of interest based on a spatial distribution of such identified leading vehicles. For example, as shown in FIG. 15A, the processing unit 210 may use the identifiers 750 (in the form of bounding boxes in the example) of leading vehicles that were identified over a period (e.g., the previous 5 seconds, the previous 10 seconds, the previous 1 minute, the previous 2 minutes, etc.), and form a region of interest based on the spatial distribution of the identifiers 750.
- the region of interest has a certain dimension, and location with respect to the camera image frame (wherein the location is towards the bottom of the image frame, and is approximately centered horizontally).
- the processing unit 210 may utilize horizontal lines 752 to form the region of interest (FIG. 15B).
- each horizontal line 752 represents an identified leading vehicle that has been identified over a period.
- the horizontal line 752 may be considered as an example of identifier of an identified leading vehicle.
- the horizontal lines 752 may be obtained by extracting only the bottom sides of the bounding boxes (e.g., such as the ones 750 shown in FIG. 15A).
- the distribution of the horizontal lines 752 forms a region of interest (represented by the area filled in by the horizontal lines 752) having an approximate triangular shape or trapezoidal shape.
- the processing unit 210 may utilize such region of interest as a detection zone to detect future leading vehicles. For example, as shown in FIG. 15C, the processing unit 210 may use the identifiers (e.g., lines 752) of the identified leading vehicles to form the region of interest 754, which has a triangular shape in the example. The region of interest 754 may then be utilized by the object detector 216 to identify leading vehicles. In the example shown in the figure, the object detector 216 detects a vehicle 756. Because at least a part of the detected vehicle 756 is located in the region of interest 754, the object detector 216 may determine that the identified vehicle is a leading vehicle.
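The region-of-interest technique described above can be sketched as follows: bottom edges of previously identified leading-vehicle bounding boxes accumulate into a set of horizontal segments, and a new detection is treated as a leading vehicle if its box overlaps any segment. The box format (x1, y1, x2, y2), the coordinates, and the tolerance are illustrative assumptions.

```python
# Sketch of forming a region of interest from the bottom edges of past
# leading-vehicle bounding boxes, then testing whether a new detection
# intersects it. Box format (x1, y1, x2, y2), example coordinates, and
# the vertical tolerance are illustrative assumptions.

def bottom_segment(box):
    x1, y1, x2, y2 = box
    return (y2, x1, x2)  # horizontal line at the box's bottom edge

def intersects_roi(box, roi_segments, y_tol=10):
    """True if the box overlaps any accumulated bottom-edge segment."""
    x1, y1, x2, y2 = box
    for sy, sx1, sx2 in roi_segments:
        if y1 - y_tol <= sy <= y2 + y_tol and x1 <= sx2 and sx1 <= x2:
            return True
    return False

# ROI built from previously identified leading vehicles
history = [(300, 400, 340, 430), (280, 350, 360, 420), (250, 300, 390, 410)]
roi = [bottom_segment(b) for b in history]

candidate = (290, 380, 350, 425)
print(intersects_roi(candidate, roi))  # True: candidate lies in the ROI
```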
- the region of interest 754 may be determined by a calibration module in the processing unit 210 during a calibration process. Also, in some embodiments, the region of interest 754 may be updated periodically during use of the apparatus 200. It should be noted that the region of interest 754 for detecting leading vehicles is not limited to the example described, and that the region of interest 754 may have other configurations (e.g., size, shape, location, etc.) in other embodiments. Also, in other embodiments, the region of interest 754 may be determined using other techniques. For example, in other embodiments, the region of interest 754 for detecting leading vehicles may be pre-determined (e.g., programmed during manufacturing) without using the distribution of previously detected leading vehicles.
- the region of interest 754 has a triangular shape that may be determined during a calibration process. In other embodiments, the region of interest 754 may have other shapes, and may be determined based on a detection of a centerline of a lane.
- the processing unit 210 may include a centerline detection module configured to determine a centerline of a lane or road in which the subject vehicle is traveling. In some embodiments, the centerline detection module may be configured to determine the centerline by processing images from the first camera 202. In one implementation, the centerline detection module analyzes images from the first camera 202 to determine the centerline of the lane or road based on a model.
- the model may be a neural network model that has been trained to determine centerline based on images of various road conditions.
- the model may be any of other types of model, such as a mathematical model, an equation, etc.
- FIG. 15D illustrates an example of the centerline detection module having determined a centerline, and an example of a region of interest that is based on the detected centerline.
- the centerline detection module determines a set of points 757a-757e that represent a centerline of the lane or road in which the subject vehicle is traveling. Although five points 757a-757e are shown, in other examples, the centerline detection module may determine more than five points 757 or fewer than five points 757 representing the centerline.
- the processing unit 210 may determine a set of left points 758a-758e, and a set of right points 759a-759e.
- the processing unit 210 may also determine a first set of lines connecting the left points 758a-758e, and a second set of lines connecting the right points 759a-759e.
- the first set of lines form a left boundary of a region of interest 754, and the second set of lines form a right boundary of the region of interest 754.
- the processing unit 210 is configured to determine the left point 758a as having the same y-coordinate as the centerline point 757a, and an x-coordinate that is a distance d1 to the left of the x-coordinate of the centerline point 757a. Also, the processing unit 210 is configured to determine the right point 759a as having the same y-coordinate as the centerline point 757a, and an x-coordinate that is a distance d1 to the right of the x-coordinate of the centerline point 757a. Thus, the left point 758a, the centerline point 757a, and the right point 759a are horizontally aligned.
- the processing unit 210 is configured to determine the left points 758b-758e as having the same respective y-coordinates as the respective centerline points 757b-757e, and having respective x-coordinates that are at respective distances d2-d5 to the left of the respective x-coordinates of the centerline points 757b-757e.
- the processing unit 210 is also configured to determine the right points 759b-759e as having the same respective y-coordinates as the respective centerline points 757b-757e, and having respective x-coordinates that are at respective distances d2-d5 to the right of the respective x-coordinates of the centerline points 757b-757e.
- setting d1 > d2 > d3 > d4 > d5 results in the region of interest 754 having a tapering shape that corresponds with the shape of the road as it appears in the camera images.
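The offsetting of the centerline points 757 by the distances d1-d5 to obtain the left points 758 and right points 759 can be sketched as follows; the function name and the representation of points as (x, y) tuples are assumptions for illustration.

```python
def roi_from_centerline(centerline, half_widths):
    """Build the ROI polygon from detected centerline points (image
    coordinates, nearest point first) and per-point lateral half-widths
    d1 > d2 > ... so that the region tapers with the road."""
    left = [(x - d, y) for (x, y), d in zip(centerline, half_widths)]
    right = [(x + d, y) for (x, y), d in zip(centerline, half_widths)]
    # walk outward along the left boundary, then back along the right
    return left + right[::-1]
```

Because the centerline points are re-detected on each frame, re-running this construction lets the ROI curve with the road, matching the variable taper described above.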
- the processing unit 210 repeatedly determines the centerline and the left and right boundaries of the region of interest 754 based on the centerline.
- the region of interest 754 has a tapering shape that is variable (e.g., the curvature of the tapering of the region of interest 754 is variable) in correspondence with a changing shape of the road as it appears in the camera images.
- the shape of the region of interest 754 is variable in correspondence with the shape of the road in which the vehicle is traveling.
- FIG. 15E illustrates an advantage of using the region of interest 754 of FIG. 15D in the detection of objects that present a risk of collision.
- the right side of the figure shows the region of interest 754 that is determined based on centerline of the road or lane in which the subject vehicle is traveling, as described with reference to FIG. 15D.
- the left side of the figure shows another region of interest 754 that is determined based on camera calibration like that described with reference to FIGS. 15A- 15C, and has a shape that is independent of the centerline (e.g., curvature of the centerline) of the road / lane.
- in the left diagram, the processing unit 210 may incorrectly detect a pedestrian as a possible risk of collision because the pedestrian intersects the region of interest 754.
- similarly, the region of interest 754 in the left diagram may cause the processing unit 210 to incorrectly detect a parked vehicle that is outside the subject lane as an object that presents a risk of collision.
- the processing unit 210 may be configured to perform additional processing to address the issue of false positives (e.g., falsely detecting an object as a risk of collision).
- the region of interest 754 on the right side is advantageous because it does not have the above issue of false positives.
- the processing unit 210 may include a road or lane boundary module configured to identify left and right boundaries of the lane or road in which the subject vehicle is traveling. The processing unit 210 may also determine one or more lines to fit the left boundary, and one or more lines to fit the right boundary, and may determine the region of interest 754 based on the determined lines.
- the processing unit 210 may be configured to determine both (1) a first region of interest (such as the triangular region of interest 754 described with reference to FIGS. 15A-15C), and (2) a second region of interest like the region of interest 754 described with reference to FIG. 15D.
- the first region of interest may be used by the processing unit 210 for cropping camera images. For example, certain parts of a camera image that are away from the first region of interest, or that are at certain distance away from the first region of interest may be cropped to reduce an amount of image data that will need to be processed.
- the second region of interest may be used by the processing unit 210 for determining whether a detected object poses a risk of collision.
- the processing unit 210 may determine that there is a risk of collision with the detected object.
- the first region of interest may also be used by the processing unit 210 to detect leading vehicles in camera images.
- the widths of the detected vehicles and their corresponding positions with respect to the coordinate system of the images may be used by the processing unit 210 to determine y-to-distance mapping, which will be described in further detail below with reference to FIG. 20.
- the collision predictor 218 may be configured to determine whether the region of interest 754 (e.g., the polygon created based on the centerline) intersects with a bounding box of a detected object, such as a lead vehicle, a pedestrian, etc. If so, then the collision predictor 218 may determine that there is a risk of collision, and the object corresponding to the bounding box is considered eligible for TTC computation.
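One way to sketch the intersection test between a detected object's bounding box and the region of interest 754 is a ray-casting point-in-polygon check. This is an approximation (it does not catch overlaps where only edges cross), and the function names are illustrative, not from the disclosure.

```python
def point_in_polygon(pt, poly):
    """Ray-casting test: count edge crossings of a ray cast to the
    right from pt; an odd count means the point is inside."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def bbox_intersects_roi(bbox, poly):
    """Approximate overlap test between a bounding box
    (xmin, ymin, xmax, ymax) and the ROI polygon: checks box corners
    against the polygon and polygon vertices against the box."""
    xmin, ymin, xmax, ymax = bbox
    corners = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    if any(point_in_polygon(c, poly) for c in corners):
        return True
    return any(xmin <= px <= xmax and ymin <= py <= ymax
               for px, py in poly)
```

An object whose box passes this test would then be considered eligible for TTC computation, as described above.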
- the collision predictor 218 may be configured to predict collision with leading vehicles in at least three different scenarios.
- FIG. 16 illustrates three exemplary scenarios involving collision with a lead vehicle.
- in the first diagram, the subject vehicle (the left vehicle) is traveling while the leading vehicle (the right vehicle) is stationary (speed Vpov = 0).
- in the middle diagram, the subject vehicle (the left vehicle) is traveling at non-zero speed Vsv, and the leading vehicle (the right vehicle) is traveling at non-zero speed Vpov that is less than speed Vsv.
- the collision predictor 218 is configured to predict collisions between the subject vehicle and the leading vehicle that may occur in any of the three scenarios shown in FIG. 16. In one implementation, the collision predictor 218 may analyze a sequence of images from the first camera 202 to determine a relative speed between the subject vehicle and the leading vehicle.
- the collision predictor 218 may obtain sensor information indicating the relative speed between the subject vehicle and the leading vehicle. For example, the collision predictor 218 may obtain a sequence of sensor information indicating distances between the subject vehicle and the leading vehicle over a period. By analyzing the change in distance over the period, the collision predictor 218 may determine the relative speed between the subject vehicle and the leading vehicle. Also, in some embodiments, the collision predictor 218 may obtain a speed of the subject vehicle, such as from a speed sensor of the subject vehicle, from a GPS system, or from a separate speed sensor that is different from that of the subject vehicle.
- the collision predictor 218 may be configured to predict a collision between the subject vehicle and the leading vehicle based on the relative speed between the subject vehicle and the leading vehicle, the speed of the subject vehicle, the speed of the leading vehicle, or any combination of the foregoing. For example, in some cases, the collision predictor 218 may determine that there is a risk of collision if (1) the object detector 216 detects a leading vehicle, (2) the relative speed between the leading vehicle and the subject vehicle is non-zero, and (3) the distance between the leading vehicle and the subject vehicle is decreasing. In some cases, criteria (2) and (3) may be combined to indicate whether the subject vehicle is traveling faster than the leading vehicle or not. In such cases, the collision predictor 218 may determine that there is a risk of collision if (1) the object detector 216 detects a leading vehicle, and (2) the subject vehicle is traveling faster than the leading vehicle (such that the subject vehicle is moving towards the leading vehicle).
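The criteria above can be sketched as a small predicate that also yields the time-to-collision; the function signature and the units (meters, meters per second) are assumptions for illustration.

```python
def predict_collision(lead_detected, distance_m, v_subject, v_lead):
    """Predict a collision with a leading vehicle. A risk exists only
    when a lead is detected and the subject vehicle is closing in on it
    (v_subject > v_lead covers all three FIG. 16 scenarios, including a
    stopped lead with v_lead = 0). Returns the time-to-collision in
    seconds, or None when there is no risk."""
    if not lead_detected or v_subject <= v_lead:
        return None
    return distance_m / (v_subject - v_lead)
```

Note that only the closing (relative) speed matters here, which is why the predictor can also work directly from a relative speed estimated from a sequence of images or distance readings.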
- the collision predictor 218 may obtain other information for use to determine whether there is a risk of collision.
- the collision predictor 218 may obtain information (e.g., camera images, detected light, etc.) indicating that the leading vehicle is braking, operation parameters (such as information indicating the acceleration, deceleration, turning, etc.) of the subject vehicle, operation parameters (such as information indicating the acceleration, deceleration, turning, etc.) of the leading vehicle, or any combination of the foregoing.
- the collision predictor 218 is configured to predict the collision at least 3 seconds or more before an expected occurrence time for the predicted collision.
- the collision predictor 218 may be configured to predict the collision at least: 3 seconds, 4 seconds, 5 seconds, 6 seconds, 7 seconds, 8 seconds, 9 seconds, 10 seconds, 11 seconds, 12 seconds, 13 seconds, 14 seconds, 15 seconds, etc., before the expected occurrence time for the predicted collision.
- the collision predictor 218 is configured to predict the collision with sufficient lead time for a brain of the driver to process input and for the driver to perform an action to mitigate the risk of the collision.
- the sufficient lead time may be dependent on the state of the driver, as determined by the driver monitoring module 211.
- the object detector 216 may be configured to detect humans in some embodiments. In such cases, the collision predictor 218 may be configured to predict a collision with a human.
- FIG. 17 illustrates another example of object detection in which the object(s) being detected is human. As shown in the figure, the objects being detected are humans captured in the images provided by the first camera 202. The detection of the objects may be performed by the object detector 216. In the illustrated example, the identified humans are provided respective identifiers (e.g., in the form of bounding boxes 760 to indicate the spatial extents of the respective identified humans).
- it should be noted that the object detector 216 is not limited to providing identifiers that are rectangular bounding boxes 760 for the identified humans, and that the object detector 216 may be configured to provide other forms of identifiers for the respective identified humans.
- the object detector 216 may distinguish human(s) that are in front of the subject vehicle (e.g., in the path of the vehicle) from other human(s) that are not in the path of the subject vehicle.
- the same region of interest 754 described previously for detecting leading vehicles may be utilized by the object detector 216 to detect a human who is in the path of the subject vehicle.
- the collision predictor 218 may be configured to determine a direction of movement of a detected human by analyzing a sequence of images of the human provided by the first camera 202.
- the collision predictor 218 may also be configured to determine a speed of movement (e.g., how fast the human is walking or running) of the detected human by analyzing the sequence of images of the human.
- the collision predictor 218 may also be configured to determine whether there is a risk of collision with a human based on a traveling path of the subject vehicle and also based on a movement direction of the detected human. Such a feature may be desirable to prevent collision with a human who is not in the path of the vehicle, but may be located at a sidewalk moving towards the path of the subject vehicle.
- the collision predictor 218 is configured to determine an area next to the detected human indicating a possible position of the human in some future time (e.g., next 0.5 second, next 1 second, next 2 seconds, next 3 seconds, etc.) based on the speed and direction of movement of the detected human. The collision predictor 218 may then determine whether the subject vehicle will traverse the determined area (e.g., in a box) indicating the predicted position of the human based on the speed of the subject vehicle. In one implementation, the collision predictor 218 may determine whether the determined area intersects the region of interest 754. If so, then the collision predictor 218 may determine that there is a risk of collision with the human, and may generate an output indicating the predicted collision.
- FIG. 18 illustrates examples of predicted positions of a human based on the human’s walking speed and direction. Because human movement is somewhat less predictable in nature, in some embodiments, even if a detected human is standing (e.g., a pedestrian standing next to a roadway), the collision predictor 218 may determine an area with respect to the human indicating possible positions of the human (e.g., in case the human starts walking or running). For example, the collision predictor 218 may determine the bounding box 760 (e.g., a rectangular box) surrounding the detected human, and may then increase the dimension(s) of the bounding box 760 to account for uncertainty in the future predicted positions of the human, wherein the enlarged box defines an area indicating the predicted positions of the human.
- the collision predictor 218 may then determine whether the subject vehicle will traverse the determined area indicating the predicted positions of the human. In one implementation, the collision predictor 218 may determine whether the determined area of the enlarged box intersects the region of interest 754. If so, then the collision predictor 218 may determine that there is a risk of collision with the human, and may generate an output indicating the predicted collision.
- the collision predictor 218 may be configured to predict collision with humans in at least three different scenarios.
- in a first scenario, the detected human (or the bounding box 760 surrounding the detected human) intersects the region of interest 754, indicating that the human is already in the traveling path of the subject vehicle.
- in a second scenario, the detected human is not in the traveling path of the subject vehicle, and is standing next to the traffic roadway.
- the collision predictor 218 may use the area of an enlarged bounding box of the detected human to determine whether there is a risk of collision, as described above. If the enlarged bounding box intersects the region of interest 754 (for detecting collision), then the collision predictor 218 may determine that there is a risk of collision with the standing human.
- the collision predictor 218 may use area of predicted positions of the human to determine whether there is a risk of collision, as described above. If the area of predicted positions intersects the region of interest 754, then the collision predictor 218 may determine that there is a risk of collision with the human.
- the enlarged bounding box may have a dimension that is based on the dimension of the detected object plus an additional length, wherein the length is predetermined to account for uncertainty of movement of the object.
- the enlarged bounding box may be determined based on prediction of the object location.
- a detected object may have an initial bounding box 760.
- the processing unit 210 may predict the location of the moving object.
- a box 762a may be determined by the processing unit 210 that represents possible locations for the object at 0.3 sec in the future from now.
- the processing unit 210 may also determine box 762b representing possible locations for the object at 0.7 sec in the future from now, and box 762c representing possible locations for the object at 1 sec in the future from now.
- the collision predictor 218 may continue to predict the future positions of a detected object (e.g., human) at certain future time, and determine if the path of the subject vehicle will intersect any of these positions. If so, then the collision predictor 218 may determine that there is a risk of collision with the object.
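The predicted-position boxes 762a-762c can be sketched as follows, assuming the human's image-plane velocity (vx, vy) has been estimated from the image sequence; the growth rate of the uncertainty margin is an assumed parameter, not a value from the disclosure.

```python
def predicted_boxes(bbox, vx, vy, horizons=(0.3, 0.7, 1.0), growth=10.0):
    """For each future time t, shift the detected box
    (xmin, ymin, xmax, ymax) by the estimated velocity (vx, vy) and
    inflate it by growth*t per side, so that uncertainty about the
    human's movement grows with the prediction horizon
    (cf. boxes 762a-762c)."""
    xmin, ymin, xmax, ymax = bbox
    boxes = []
    for t in horizons:
        m = growth * t  # uncertainty margin grows with the horizon
        boxes.append((xmin + vx * t - m, ymin + vy * t - m,
                      xmax + vx * t + m, ymax + vy * t + m))
    return boxes
```

Each returned box can then be tested against the region of interest 754 (or against the subject vehicle's path) to decide whether a collision risk exists at that horizon.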
- the region of interest 754 may be enlarged in response to the driver monitoring module 211 detecting the driver being distracted.
- the region of interest 754 may be widened in response to the driver monitoring module 211 detecting the driver being distracted. This has the benefit of considering objects that are outside the road or lane as possible risks of collision. For example, if the driver is distracted, the processing unit 210 then widens the region of interest 754. This has the effect of relaxing the threshold for detecting overlapping of a detected object with the region of interest 754. If a bicyclist is riding on the edge of the lane, the processing unit 210 may detect the bicyclist as a possible risk of collision because the bicyclist may overlap the enlarged region of interest 754.
- on the other hand, if the driver is not distracted, the region of interest 754 will be smaller, and the bicyclist may not intersect the region of interest 754. Accordingly, in this scenario, the processing unit 210 may not consider the bicyclist as presenting a risk of collision, which makes sense because an attentive driver is likely going to avoid a collision with the bicyclist.
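The widening of the region of interest for a distracted driver can be sketched as a horizontal scaling of the ROI polygon about the image center; the scale factor and function name are assumptions for illustration.

```python
def widen_roi(poly, center_x, scale):
    """Scale the x-coordinates of an ROI polygon away from a
    horizontal center center_x. A scale > 1 widens the region, which
    relaxes the overlap threshold when the driver monitoring module
    reports a distracted driver."""
    return [(center_x + (x - center_x) * scale, y) for x, y in poly]
```

Applying this with, say, scale = 1.5 when distraction is detected makes near-lane objects such as the bicyclist overlap the enlarged region, while the unscaled polygon is used for an attentive driver.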
- the collision predictor 218 may not determine risk of collision for all of the detected humans in an image. For example, in some embodiments, the collision predictor 218 may exclude detected humans who are inside vehicles, humans who are standing at bus stops, humans who are sitting outside, etc. In other embodiments, the collision predictor 218 may consider all detected humans for collision prediction.
- the object detector 216 may utilize one or more models to detect various objects, such as cars (as illustrated in the figure), motorcycles, pedestrian, animals, lane dividers, street signs, traffic signs, traffic lights, etc.
- the model(s) utilized by the object detector 216 may be a neural network model that has been trained to identify various objects.
- the model(s) may be any of other types of models, such as mathematical model(s), configured to identify objects.
- the model(s) utilized by the object detector 216 may be stored in the non-transitory medium 230, and/or may be incorporated as a part of the object detector 216.
- FIGS. 19A-19B illustrate other examples of object detection in which the objects being detected by the object detector 216 are associated with an intersection.
- the object detector 216 may be configured to detect traffic lights 780.
- the object detector 216 may be configured to detect stop sign 790.
- the object detector 216 may also be configured to detect other items associated with an intersection, such as a road marking, a corner of a curb, a ramp, etc.
- the intersection violation predictor 222 is configured to detect an intersection based on the object(s) detected by the object detector 216.
- the object detector 216 may detect a stop line at an intersection indicating an expected stop location of the subject vehicle.
- in the context of intersection violations, the TTC may refer to a time-to-crossing, i.e., the time before the subject vehicle crosses the detected intersection.
- the intersection violation predictor 222 may estimate a location of the expected stopping based on the detected objects at the intersection. For example, the intersection violation predictor 222 may estimate a location of the expected stopping based on known relative position between the expected stop location and surrounding objects, such as stop sign, traffic light, etc.
- the intersection violation predictor 222 may be configured to determine time-to-brake (TTB) based on the location of the stop line and the speed of the subject vehicle.
- TTB measures the time the driver has left, at the current speed, in order to initiate a braking maneuver to safely stop at or before the required stopping location associated with the intersection.
- the intersection violation predictor 222 may determine a distance d between the subject vehicle and the location of the stop line, and calculate the TTB based on the current speed of the subject vehicle.
- the intersection violation predictor 222 may be configured to determine a braking distance BD indicating a distance required for a vehicle to come to a complete stop based on the speed of the vehicle, and to determine the TTB based on the braking distance.
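The TTB computation from the distance d, the current speed, and the braking distance BD can be sketched as follows; the signature and units (meters, meters per second) are assumptions for illustration.

```python
def time_to_brake(distance_m, speed_mps, braking_distance_m):
    """Time (at the current speed) before the driver must begin
    braking: the vehicle may travel (d - BD) before the last point at
    which a braking maneuver still stops it at the stop location."""
    return (distance_m - braking_distance_m) / speed_mps
```

For example, with 100 m to the stop line, a speed of 10 m/s, and a braking distance of 20 m, the driver has 8 seconds before braking must begin.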
- the braking distance is longer for a traveling vehicle with higher speed.
- the braking distance may also be based on road conditions in some embodiments. For example, for the same given speed of the vehicle, braking distance may be longer for wet road condition compared to dry road condition.
- FIG. 19C illustrates the different braking distances required for different vehicle speeds and different road conditions.
- FIG. 19C also shows how much the vehicle would have travelled based on a driver’s reaction time of 1.5 seconds. For example, for a vehicle traveling at 40 km/hr, it would travel 17 meters in about 1.5 seconds (driver’s reaction time) before the driver applies the brake. Thus, the total distance it would take for a vehicle traveling at 40 km/hr to stop (and considering reaction time of the driver) will be 26 meters in dry road condition and 30 meters in wet road condition.
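A stopping-distance sketch consistent with the FIG. 19C figures for 40 km/hr follows. The deceleration constants (about 6.9 m/s^2 dry and 4.8 m/s^2 wet) are assumed values chosen so that the totals match the 26 m / 30 m examples above; they are not values from the disclosure.

```python
def stopping_distance(speed_kmh, reaction_s=1.5, wet=False):
    """Total distance to stop: the distance covered during the
    driver's reaction time plus the braking distance v^2 / (2a),
    with a longer braking distance assumed for wet roads."""
    v = speed_kmh / 3.6                # convert km/h to m/s
    decel = 4.8 if wet else 6.9        # assumed decelerations, m/s^2
    reaction = v * reaction_s          # distance before brake applied
    braking = v * v / (2 * decel)
    return reaction + braking
```

At 40 km/hr this yields roughly 26 m on a dry road and 30 m on a wet road, matching the example; a table lookup keyed by speed and road condition, as described below, would serve the same purpose.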
- if the TTB falls below a threshold reaction time, the intersection violation predictor 222 may generate a control signal to operate a device to provide a warning to the driver, and/or to operate a device to automatically control the vehicle, as described herein.
- the threshold reaction time may be 1.5 seconds or more, 2 seconds or more, 2.5 seconds or more, 3 seconds or more, 4 seconds or more, etc.
- the threshold reaction time may be variable based on a state of the driver as determined by the driver monitoring module 211. For example, in some embodiments, if the driver monitoring module 211 determines that the driver is distracted, then the processing unit 210 may increase the threshold reaction time (e.g., changing it from 2 seconds for a non-distracted driver to 4 seconds for a distracted driver, etc.). In addition, in some embodiments, the threshold reaction time may have different values for different states of the driver. For example, if the driver is alert but distracted, the threshold reaction time may be 4 seconds, and if the driver is drowsy, the threshold reaction time may be 6 seconds.
- the intersection violation predictor 222 may be configured to determine the distance d between the subject vehicle and the stop location by analyzing image(s) from the first camera 202. Alternatively, or additionally, the intersection violation predictor 222 may receive information from a GPS system indicating a position of the subject vehicle, and a location of an intersection. In such cases, the intersection violation predictor 222 may determine the distance d based on the position of the subject vehicle and the location of the intersection.
- in some embodiments, the intersection violation predictor 222 may determine the braking distance BD by looking up a table that maps different vehicle speeds to respective braking distances.
- the intersection violation predictor 222 may determine the braking distance BD by performing a calculation based on a model (e.g., equation) that receives the speed of the vehicle as input, and outputs braking distance.
- the processing unit 210 may receive information indicating a road condition, and may determine the braking distance BD based on the road condition. For example, in some embodiments, the processing unit 210 may receive output from a moisture sensor indicating that there is rain. In such cases, the processing unit 210 may determine a higher value for the braking distance BD.
- the control signal may operate a device to generate a warning for the driver, and/or may operate a device to control the vehicle, as described herein.
- the distance threshold may be adjusted based on a state of the driver. For example, if the driver monitoring module 211 determines that the driver is distracted, then the processing unit 210 may increase the distance threshold to account for the longer distance for the driver to react.
- the processing unit 210 may be configured to determine a distance d that is between the subject vehicle and a location in front of the vehicle, wherein the location may be a location of an object (e.g., a lead vehicle, a pedestrian, etc.) as captured in an image from the first camera 202, an expected stop position for the vehicle, etc.
- a distance d may be determined using various techniques.
- the processing unit 210 may be configured to determine the distance d based on a Y-to-d mapping, wherein Y represents a y-coordinate in an image frame, and d represents the distance between the subject vehicle and the location corresponding to the y-coordinate in the image frame.
- the higher y-coordinate values correspond to larger widths of the bounding boxes. This is because a vehicle detected closer to the camera will be larger (having a larger corresponding bounding box) and will appear closer to the bottom of the camera image, compared to another vehicle that is further away from the camera.
- a width (or a horizontal dimension) in a coordinate system of a camera image is related to the real world distance d based on homography principles.
- the width parameter in the top graph of FIG. 20 may be converted into real world distance d based on perspective projection geometry in some embodiments.
- the width-to-distance mapping may be obtained empirically by performing calculation based on the perspective projection geometry.
- the width-to-distance mapping may be obtained by measuring actual distance d between the camera and an object at a location, and determining a width of the object in the coordinate system of a camera image that captures the object at the distance d from the camera.
- the y-to-d mapping may be determined by measuring actual distance d between the camera and a location L in the real world, and determining the y-coordinate of the location L in the coordinate system of a camera image.
- Information in the lower graph of FIG. 20 can be used by the processing unit 210 to determine the distance d in some embodiments.
- the information relating the y-coordinate to the distance d may be stored in a non-transitory medium. The information may be an equation of the curve relating distances d to different y-coordinates, a table containing different y-coordinates and their corresponding distances d, etc.
- the processing unit 210 may detect an object (e.g., a vehicle) in a camera image from the first camera 202.
- the image of the detected object as it appears in the camera image has a certain coordinate (x, y) with respect to a coordinate system of the camera image. For example, if the y-coordinate of the detected object has a value of 510, then based on the curve of FIG. 20, the distance of the detected object from the camera / subject vehicle is about 25 meters.
- the processing unit 210 may determine a location in the camera image representing a desired stopping position for the subject vehicle.
- the location in the camera image has a certain coordinate (x, y) with respect to a coordinate system of the camera image. For example, if the y-coordinate of the location (representing the desired position for the subject vehicle) has a value of 490, then based on the curve of FIG. 20, the distance d between the desired stopping position (e.g., actual intersection stop line, or an artificially created stop line) and the camera / subject vehicle is about 50 meters.
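The y-to-d lookup can be sketched as linear interpolation over a stored table, consistent with storing the mapping either as an equation or as a table. The table values here are hypothetical, chosen only to be consistent with the examples above (y = 510 maps to about 25 meters, y = 490 to about 50 meters).

```python
# hypothetical calibration table of (y_coordinate, distance_m) pairs;
# larger y means the location appears lower in the image, i.e. closer
Y_TO_D = [(490, 50.0), (510, 25.0), (530, 15.0), (560, 8.0)]

def y_to_distance(y, table=Y_TO_D):
    """Linearly interpolate the real-world distance d from the
    y-coordinate of a detection, clamping outside the table range."""
    table = sorted(table)
    if y <= table[0][0]:
        return table[0][1]
    if y >= table[-1][0]:
        return table[-1][1]
    for (y0, d0), (y1, d1) in zip(table, table[1:]):
        if y0 <= y <= y1:
            return d0 + (y - y0) * (d1 - d0) / (y1 - y0)
```

In practice the table (or curve equation) would come from the calibration described above and be stored in the non-transitory medium.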
- the technique for determining the distance d is not limited to the example described, and that the processing unit 210 may utilize other techniques for determining distance d.
- the processing unit 210 may receive distance information from a distance sensor, such as a sensor that utilizes time-of-flight technique for distance determination.
- the signal generation controller 224 is configured to generate a control signal for operating a warning generator and/or for causing a vehicle control to control the subject vehicle based on output from the collision predictor 218 or from the intersection violation predictor 222, and also based on output from the driver monitoring module 211 indicating a state of the driver.
- the output from the collision predictor 218 or the intersection violation predictor 222 may be a TTC value indicating a time-to-collision (with another vehicle or another object) or a time-to-crossing a detected intersection.
- the signal generation controller 224 is configured to compare the TTC value (as it changes in correspondence with passage of time) with a threshold (threshold time), and determine whether to generate the control signal based on a result of the comparison.
- the threshold utilized by the signal generation controller 224 of the processing unit 210 to determine whether to generate the control signal (in response to a predicted collision or predicted intersection violation) may have a minimum value that is at least 2 seconds, or 3 seconds, or 4 seconds, or 5 seconds, or 6 seconds, or 7 seconds, or 8 seconds or 9 seconds, or 10 seconds.
- the threshold is variable based on the state of the driver as indicated by the information provided by the driver monitoring module 211.
- the processing unit 210 may adjust the threshold by increasing the threshold time from its minimum value (e.g., if the minimum value is 3 seconds, then the threshold may be adjusted to be 5 seconds).
- the processing unit 210 may adjust the threshold so that it is 7 seconds (i.e., more than 5 seconds in the example), for example. This is because it may take a driver who is in a drowsy state longer to notice the collision risk or stopping requirement, and to take action to mitigate the risk of collision.
- FIG. 21 illustrates an example of a technique for generating a control signal for controlling a vehicle and/or for causing a generation of an alert for a driver.
- the collision predictor 218 determines that the TTC is 10 seconds.
- the x-axis in the graph indicates the time that has elapsed since the determination of the TTC.
- the initial TTC of 10 seconds was determined by the collision predictor 218.
- the TTC represented by the y-axis
- the TTC = 10 - t, where 10 is the initially determined time-to-collision TTC of 10 sec.
- the processing unit 210 utilizes a first threshold TH1 of 3 seconds for providing a control signal (to warn the driver and/or to automatically operate a vehicle control to mitigate the risk of collision) when the state of the driver as output by the driver monitoring module 211 indicates that the driver is not distracted. Also, in the illustrated example, the processing unit 210 utilizes a second threshold TH2 of 5 seconds for providing a control signal (to warn the driver and/or to automatically operate a vehicle control to mitigate the risk of collision) when the state of the driver as output by the driver monitoring module 211 indicates that the driver is distracted.
- the processing unit 210 utilizes a third threshold TH3 of 8 seconds for providing a control signal (to warn the driver and/or to automatically operate a vehicle control to mitigate the risk of collision) when the state of the driver as output by the driver monitoring module 211 indicates that the driver is drowsy.
- the signal generation controller 224 provides the control signal CS. Therefore, in the situation in which the driver is distracted, the signal generation controller 224 will provide the control signal earlier to cause a warning to be provided to the driver and/or to operate the vehicle.
- the signal generation controller 224 may be configured to consider the state of the driver within a temporal window before the threshold (e.g., 1.5 sec, 2 sec, etc. before TH2) to determine whether to use the threshold for determining the generation of the control signal. In other embodiments, the signal generation controller 224 may be configured to consider the state of the driver at the time of the threshold to determine whether to use the threshold.
- the signal generation controller 224 provides the control signal CS.
- the signal generation controller 224 will provide the control signal even earlier (i.e., earlier than when the driver is alert but is distracted) to cause a warning to be provided to the driver and/or to operate the vehicle.
- the threshold is variable in real time based on the state of the driver as determined by the driver monitoring module 211.
- the signal generation controller 224 may hold off in providing the control signal.
- the TTC value will indicate time-to-crossing the intersection.
- the same thresholds TH1, TH2, TH3 for determining when to provide control signal (to operate a warning generator and/or to operate a vehicle control) for collision prediction may also be used for intersection violation prediction.
- the thresholds TH1, TH2, TH3 for determining when to provide the control signal for collision prediction may be different from the thresholds TH1, TH2, TH3 for determining when to provide the control signal for intersection violation prediction.
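The variable-threshold behavior described above may be sketched as follows. The TH1/TH2/TH3 values (3, 5, and 8 seconds) are taken from the illustrated example; the function and dictionary names are hypothetical, not from the patent:

```python
# Sketch: selecting a time-to-collision (TTC) warning threshold based on
# the driver state reported by a driver monitoring module, per the
# TH1/TH2/TH3 example (3 s attentive, 5 s distracted, 8 s drowsy).
THRESHOLDS = {"attentive": 3.0, "distracted": 5.0, "drowsy": 8.0}  # seconds

def should_signal(ttc_remaining: float, driver_state: str) -> bool:
    """Return True when the control signal CS should be provided.

    ttc_remaining is the current TTC (initial TTC minus elapsed time);
    driver_state is the latest output of the driver monitoring module.
    """
    threshold = THRESHOLDS.get(driver_state, THRESHOLDS["attentive"])
    return ttc_remaining <= threshold
```

With an initial TTC of 10 seconds, the control signal CS would thus be provided after about 7 seconds of elapsed time for an attentive driver, after 5 seconds for a distracted driver, and after only 2 seconds for a drowsy driver.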
- the collision predictor 218 is configured to determine an estimated time it will take for the predicted collision to occur, and the signal generation controller 224 of the processing unit 210 is configured to provide the control signal to operate a device if the estimated time it will take for the predicted collision to occur is below a threshold.
- the device comprises a warning generator, and the signal generation controller 224 of the processing unit 210 is configured to provide the control signal to cause the device to provide a warning for the driver if the estimated time it will take for the predicted collision to occur is below the threshold.
- the device may include a vehicle control
- the signal generation controller 224 of the processing unit 210 is configured to provide the control signal to cause the device to control the vehicle if the estimated time it will take for the predicted collision to occur is below the threshold.
- the signal generation controller 224 of the processing unit 210 is configured to repeatedly evaluate the estimated time (TTC) with respect to the variable threshold, as the predicted collision / intersection violation is temporally approaching in correspondence with a decrease of the estimated time it will take for the predicted collision / intersection violation to occur.
- the processing unit 210 (e.g., the signal generation controller 224 of the processing unit 210) is configured to increase the threshold if the state of the driver indicates that the driver is distracted or is not attentive to a driving task.
- the signal generation controller 224 of the processing unit 210 is configured to at least temporarily hold off in providing the control signal if the estimated time it will take for the predicted collision to occur is higher than the threshold.
- the threshold has a first value if the state of the driver indicates that the driver is attentive to a driving task, and wherein the threshold has a second value higher than the first value if the state of the driver indicates that the driver is distracted or is not attentive to the driving task.
- the threshold is also based on sensor information indicating that the vehicle is being operated to mitigate the risk of the collision. For example, if the sensor(s) 225 provides sensor information indicating that the driver is applying brake of the vehicle, then the processing unit 210 may increase the threshold to a higher value.
- the signal generation controller 224 of the processing unit 210 is configured to determine whether to provide the control signal or not based on (1) the first information indicating the risk of collision with the vehicle, (2) the second information indicating the state of the driver, and (3) sensor information indicating that the vehicle is being operated to mitigate the risk of the collision.
- the processing unit 210 is configured to determine a level of the risk of the collision, and the processing unit 210 (e.g., the signal generation controller 224 of the processing unit 210) is configured to adjust the threshold based on the determined level of the risk of the collision.
- the state of the driver comprises a distracted state
- the processing unit 210 is configured to determine a level of a distracted state of the driver, wherein the processing unit 210 (e.g., the signal generation controller 224 of the processing unit 210) is configured to adjust the threshold based on the determined level of the distracted state of the driver.
- different alerts may be provided at different thresholds, and based on whether the driver is attentive or not.
- the processing unit 210 may control a device to provide a first alert with a first characteristic if there is a risk of collision (with a vehicle, pedestrian, etc.) and if the driver is attentive, and may control the device to provide a second alert with a second characteristic if there is a risk of collision and if the driver is distracted.
- the first characteristic of the first alert may be a first alert volume
- the second characteristic of the second alert may be a second alert volume that is higher than the first alert volume.
- the processing unit 210 may control the device to provide a more intense alert (e.g., an alert with a higher volume, and/or with higher frequency of beeps).
- a gentle alert may be provided when the subject vehicle is approaching an object, and a more intense alert may be provided when the subject vehicle is getting closer to the object.
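A minimal sketch of the alert-escalation idea above; the characteristic values (volumes, beep frequencies) and function name are illustrative assumptions, not values from the patent:

```python
# Sketch: choosing alert characteristics based on driver attentiveness
# and proximity, so that a distracted driver receives a more intense
# alert (higher volume, higher beep frequency) than an attentive one,
# and the alert intensifies as the subject vehicle gets closer.
def alert_characteristics(driver_distracted: bool, closing: bool) -> dict:
    volume = 0.5          # first alert volume (gentle), hypothetical scale 0..1
    beep_hz = 1.0         # beeps per second
    if driver_distracted:
        volume = 0.9      # second alert volume, higher than the first
        beep_hz = 3.0
    if closing:           # subject vehicle is getting closer to the object
        beep_hz *= 2.0
    return {"volume": volume, "beep_hz": beep_hz}
```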
- the processing unit 210 may control a device to provide a first alert with a first characteristic if there is a risk of intersection violation and if the driver is attentive, and may control the device to provide a second alert with a second characteristic if there is a risk of intersection violation and if the driver is distracted.
- the first characteristic of the first alert may be a first alert volume
- the second characteristic of the second alert may be a second alert volume that is higher than the first alert volume.
- the processing unit 210 may control the device to provide a more intense alert (e.g., an alert with a higher volume, and/or with higher frequency of beeps).
- the apparatus 200 is advantageous because it considers the state of the driver when determining whether to generate a control signal to operate a device to provide warning and/or to operate a device to control the vehicle. Because the state of the driver may be used to adjust monitoring threshold(s), the apparatus 200 may provide warning to the driver and/or may control the vehicle to mitigate a risk of collision and/or a risk of intersection violation earlier to account for certain state of the driver (e.g., when driver is distracted, drowsy, etc.).
- the apparatus 200 may provide warning to the driver and/or may control the vehicle as early as 2 seconds before the predicted risk, or even earlier, such as at least 3 seconds, 4 seconds, 5 seconds, 6 seconds, 7 seconds, 8 seconds, 9 seconds, 10 seconds, 11 seconds, 12 seconds, 13 seconds, 14 seconds, 15 seconds, etc., before the predicted risk (e.g., risk of collision or risk of intersection violation).
- in existing solutions, higher precision is built into the systems in order to avoid false positives, at the expense of reduced sensitivity.
- the apparatus 200 may be configured to operate on lower sensitivity (e.g., lower than, or equal to, existing solutions), and the sensitivity of the apparatus 200 may be increased only if the driver is inattentive.
- the increase in sensitivity based on the state of the driver may be achieved by adjusting one or more thresholds based on the state of the driver, such as adjusting a threshold for determining time-to-collision, a threshold for determining time-to-crossing an intersection, a threshold for determining time-to-brake, a threshold for determining whether an object intersects a region of interest (e.g., a camera calibration ROI, a ROI determined based on centerline detection, etc.), and/or a threshold on the confidence of object detection.
- the processing unit 210 may also be configured to consider the scenario in which the subject vehicle is tailgating.
- tailgating may be determined (e.g., measured) by time-to-headway, which is defined as the distance to the lead vehicle divided by the speed of the subject vehicle (ego-vehicle).
- the speed of the subject vehicle may be obtained from the speed sensing system of the vehicle. In other embodiments, the speed of the subject vehicle may be obtained from a GPS system. In further embodiments, the speed of the subject vehicle may be determined by the processing unit 210 processing external images received from the first camera 202 of the apparatus 200.
- the distance to the lead vehicle may be determined by the processing unit 210 processing external images received from the first camera 202.
- the distance to the lead vehicle may be obtained from a distance sensor, such as a sensor employing time-of-flight technology.
- the processing unit 210 may determine that there is tailgating if the time-to-headway is less than a tailgate threshold.
- the tailgate threshold may be 2 seconds or less, 1.5 seconds or less, 1 second or less, 0.8 second or less, 0.6 second or less, 0.5 second or less, etc.
- the processing unit 210 may be configured to determine that there is a risk of collision if the subject vehicle is tailgating, and if the driver monitoring module 211 determines that the driver is distracted. The processing unit 210 may then generate a control signal to cause a device (e.g., a warning generator) to provide a warning for the driver, and/or to cause a device (e.g., a vehicle control) to control the vehicle, as described herein.
- the vehicle control may automatically apply the brake of the vehicle, automatically disengage the gas pedal, automatically activate hazard lights, or any combination of the foregoing.
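The tailgating determination described above can be sketched directly from its definition; the 2-second tailgate threshold is one of the example values given in the text, and the function names are hypothetical:

```python
# Sketch: tailgating detection via time-to-headway, defined as the
# distance to the lead vehicle divided by the speed of the subject
# vehicle (ego-vehicle). Tailgating is flagged when time-to-headway
# falls below a tailgate threshold (e.g., 2 seconds).
def time_to_headway(distance_m: float, ego_speed_mps: float) -> float:
    if ego_speed_mps <= 0.0:
        return float("inf")  # stationary ego-vehicle: no tailgating
    return distance_m / ego_speed_mps

def is_tailgating(distance_m: float, ego_speed_mps: float,
                  tailgate_threshold_s: float = 2.0) -> bool:
    return time_to_headway(distance_m, ego_speed_mps) < tailgate_threshold_s
```

For example, following a lead vehicle at 30 m while traveling 20 m/s gives a time-to-headway of 1.5 s, which is below the 2-second example threshold.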
- the processing unit 210 may include a rolling-stop module configured to detect a rolling stop maneuver.
- the rolling-stop module may be implemented as a part of the intersection violation predictor 222 in some embodiments.
- the processing unit 210 may detect an intersection that requires the vehicle to stop (e.g., the processing unit 210 may identify a stop sign, a red light, etc., based on processing of image(s) from the first camera 202).
- the rolling-stop module may monitor one or more parameters indicating operation of the vehicle to determine if the vehicle is making a rolling stop maneuver for the intersection.
- the rolling-stop module may obtain a parameter indicating a speed of the vehicle, a braking of the vehicle, a deceleration of the vehicle, etc., or any combination of the foregoing.
- the rolling-stop module may determine that there is a rolling-stop maneuver by analyzing the speed profile of the vehicle over a period as the vehicle is approaching the intersection. For example, if the vehicle has slowed down (indicating that the driver is aware of the intersection), and if the vehicle’s speed does not further decrease within a certain period, then the rolling-stop module may determine that the driver is performing a rolling-stop maneuver.
- the intersection violation predictor 222 may determine that there is a risk of intersection violation. In response to the determined risk of intersection violation, the rolling-stop module may then generate a control signal to operate a device.
- the control signal may operate a communication device to send a message wirelessly to a server system (e.g., a cloud system).
- the server system may be utilized by a fleet management system for coaching of the driver, or may be utilized by an insurance company to identify risky drivers.
- the control signal may operate a warning system to provide a warning to the driver, which may serve as a way of coaching the driver.
- the control signal may operate a braking system of the vehicle to control the vehicle so that it will come to a complete stop.
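The speed-profile heuristic described above may be sketched as follows; the slow-down factor and stop tolerance are illustrative assumptions, not values from the patent:

```python
# Sketch: flagging a rolling-stop maneuver from the vehicle's speed
# profile while approaching an intersection that requires a stop. The
# driver slowed down (indicating awareness of the intersection), but
# the speed never decreased to essentially zero within the period.
def is_rolling_stop(speeds_mps, slow_factor=0.5, stop_eps=0.3):
    """speeds_mps: speed samples over the approach period, oldest first."""
    if not speeds_mps:
        return False
    initial, minimum = speeds_mps[0], min(speeds_mps)
    slowed_down = minimum <= slow_factor * initial  # driver reacted
    came_to_stop = minimum <= stop_eps              # essentially 0 m/s
    return slowed_down and not came_to_stop
```

For example, a profile of 15 → 10 → 6 → 4 → 5 m/s shows a clear slow-down without a complete stop, and would be flagged; a profile ending near 0 m/s would not.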
- FIG. 22A illustrates a method 800 performed by the apparatus 200 of FIG. 2A in accordance with some embodiments.
- the method 800 includes: obtaining a first image generated by a first camera, wherein the first camera is configured to view an environment outside a vehicle (item 802); obtaining a second image generated by a second camera, wherein the second camera is configured to view a driver of the vehicle (item 804); determining first information indicating a risk of collision with the vehicle based at least partly on the first image (item 806); determining second information indicating a state of the driver based at least partly on the second image (item 808); and determining whether to provide a control signal for operating a device or not based on (1) the first information indicating the risk of collision with the vehicle, and (2) the second information indicating the state of the driver (item 810).
- the first information is determined by predicting the collision, and wherein the collision is predicted at least 3 seconds or more before an expected occurrence time for the predicted collision.
- the first information is determined by predicting the collision, and wherein the collision is predicted with sufficient lead time for a brain of the driver to process input and for the driver to perform an action to mitigate the risk of the collision.
- the sufficient lead time is dependent on the state of the driver.
- the first information indicating the risk of collision comprises a predicted collision
- the method further comprises determining an estimated time it will take for the predicted collision to occur, and wherein the control signal is provided to operate the device if the estimated time it will take for the predicted collision to occur is below a threshold.
- the device comprises a warning generator, and wherein the control signal is provided to cause the device to provide a warning for the driver if the estimated time it will take for the predicted collision to occur is below a threshold.
- the device comprises a vehicle control, and wherein the control signal is provided to cause the device to control the vehicle if the estimated time it will take for the predicted collision to occur is below the threshold.
- the threshold is variable based on the second information indicating the state of the driver.
- the estimated time is repeatedly evaluated with respect to the variable threshold, as the predicted collision is temporally approaching in correspondence with a decrease of the estimated time it will take for the predicted collision to occur.
- the threshold is variable in real time based on the state of the driver.
- the method 800 further includes increasing the threshold if the state of the driver indicates that the driver is distracted or is not attentive to a driving task.
- the method 800 further includes at least temporarily holding off in generating the control signal if the estimated time it will take for the predicted collision to occur is higher than the threshold.
- the method 800 further includes determining a level of the risk of the collision, and adjusting the threshold based on the determined level of the risk of the collision.
- the state of the driver comprises a distracted state
- the method further comprises determining a level of a distracted state of the driver, and adjusting the threshold based on the determined level of the distracted state of the driver.
- the threshold has a first value if the state of the driver indicates that the driver is attentive to a driving task, and wherein the threshold has a second value higher than the first value if the state of the driver indicates that the driver is distracted or is not attentive to the driving task.
- the threshold is also based on sensor information indicating that the vehicle is being operated to mitigate the risk of the collision.
- the act of determining whether to provide the control signal for operating the device or not is performed also based on sensor information indicating that the vehicle is being operated to mitigate the risk of the collision.
- the act of determining the first information indicating the risk of the collision comprises processing the first image based on a first model.
- the first model comprises a neural network model.
- the act of determining the second information indicating the state of the driver comprises processing the second image based on a second model.
- the method 800 further includes determining metric values for multiple respective pose classifications, and determining whether the driver is engaged with a driving task or not based on one or more of the metric values.
- the pose classifications comprise two or more of: looking-down pose, looking-up pose, looking-left pose, looking-right pose, cellphone-using pose, smoking pose, holding-object pose, hand(s)-not-on-the-wheel pose, not-wearing-seatbelt pose, eye(s)-closed pose, looking-straight pose, one-hand-on-wheel pose, and two-hands-on-wheel pose.
- the method 800 further includes comparing the metric values with respective thresholds for the respective pose classifications.
- the method 800 further includes determining the driver as belonging to one of the pose classifications if the corresponding one of the metric values meets or surpasses the corresponding one of the thresholds.
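The metric-versus-threshold comparison described in the preceding items can be sketched as follows; the class names, threshold values, and engagement rule are illustrative assumptions, not the patent's trained values:

```python
# Sketch: mapping per-class metric values (e.g., confidences output by a
# neural network model) to pose classifications by comparing each value
# against its class-specific threshold, then deriving a simple
# engagement decision from the classifications that fired.
POSE_THRESHOLDS = {
    "looking-down": 0.6,
    "cellphone-using": 0.7,
    "eyes-closed": 0.5,
    "looking-straight": 0.6,
}

def classify_poses(metrics: dict) -> list:
    """Return pose classes whose metric meets or surpasses its threshold."""
    return [pose for pose, value in metrics.items()
            if value >= POSE_THRESHOLDS.get(pose, 1.0)]

def is_engaged(poses: list) -> bool:
    """Hypothetical rule: engaged if no distraction pose classification fired."""
    distraction = {"looking-down", "cellphone-using", "eyes-closed"}
    return not distraction.intersection(poses)
```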
- the method 800 is performed by an aftermarket device, and wherein the first camera and the second camera are integrated as parts of the aftermarket device.
- the second information is determined by processing the second image to determine whether an image of the driver meets a pose classification or not; and wherein the method further comprises determining whether the driver is engaged with a driving task or not based on the image of the driver meeting the pose classification or not.
- the act of determining the second information indicating the state of the driver comprises processing the second image based on a neural network model.
- FIG. 22B illustrates a method 850 performed by the apparatus 200 of FIG. 2A in accordance with some embodiments.
- the method 850 includes: obtaining a first image generated by a first camera, wherein the first camera is configured to view an environment outside a vehicle (item 852); obtaining a second image generated by a second camera, wherein the second camera is configured to view a driver of the vehicle (item 854); determining first information indicating a risk of intersection violation based at least partly on the first image (item 856); determining second information indicating a state of the driver based at least partly on the second image (item 858); and determining whether to provide a control signal for operating a device or not based on (1) the first information indicating the risk of the intersection violation, and (2) the second information indicating the state of the driver (item 860).
- the first information is determined by predicting the intersection violation, and wherein the predicted intersection violation is predicted at least 3 seconds or more before an expected occurrence time for the predicted intersection violation.
- the first information is determined by predicting the intersection violation, and wherein the intersection violation is predicted with sufficient lead time for a brain of the driver to process input and for the driver to perform an action to mitigate the risk of the intersection violation.
- the sufficient lead time is dependent on the state of the driver.
- the first information indicating the risk of the intersection violation comprises a predicted intersection violation, wherein the method further comprises determining an estimated time it will take for the predicted intersection violation to occur, and wherein the control signal is provided to operate the device if the estimated time it will take for the predicted intersection violation to occur is below a threshold.
- the device comprises a warning generator, and wherein the control signal is provided to cause the device to provide a warning for the driver if the estimated time it will take for the predicted intersection violation to occur is below a threshold.
- the device comprises a vehicle control, and wherein the control signal is provided to cause the device to control the vehicle if the estimated time it will take for the predicted intersection violation to occur is below the threshold.
- the threshold is variable based on the second information indicating the state of the driver.
- the estimated time is repeatedly evaluated with respect to the variable threshold, as the predicted intersection violation is temporally approaching in correspondence with a decrease of the estimated time it will take for the predicted intersection violation to occur.
- the threshold is variable in real time based on the state of the driver.
- the method further includes increasing the threshold if the state of the driver indicates that the driver is distracted or is not attentive to a driving task.
- the method further includes at least temporarily holding off in generating the control signal if the estimated time it will take for the predicted intersection violation to occur is higher than the threshold.
- the method further includes determining a level of the risk of the intersection violation, and adjusting the threshold based on the determined level of the risk of the intersection violation.
- the state of the driver comprises a distracted state
- the method further comprises determining a level of a distracted state of the driver, and adjusting the threshold based on the determined level of the distracted state of the driver.
- the threshold has a first value if the state of the driver indicates that the driver is attentive to a driving task, and wherein the threshold has a second value higher than the first value if the state of the driver indicates that the driver is distracted or is not attentive to the driving task.
- the threshold is also based on sensor information indicating that the vehicle is being operated to mitigate the risk of the intersection violation.
- the act of determining whether to provide the control signal for operating the device or not is performed also based on sensor information indicating that the vehicle is being operated to mitigate the risk of the intersection violation.
- the act of determining the first information indicating the risk of the intersection violation comprises processing the first image based on a first model.
- the first model comprises a neural network model.
- the act of determining the second information indicating the state of the driver comprises processing the second image based on a second model.
- the method further includes determining metric values for multiple respective pose classifications, and determining whether the driver is engaged with a driving task or not based on one or more of the metric values.
- the pose classifications comprise two or more of: looking-down pose, looking-up pose, looking-left pose, looking-right pose, cellphone-using pose, smoking pose, holding-object pose, hand(s)-not-on-the-wheel pose, not-wearing-seatbelt pose, eye(s)-closed pose, looking-straight pose, one-hand-on-wheel pose, and two-hands-on-wheel pose.
- the method further includes comparing the metric values with respective thresholds for the respective pose classifications.
- the method further includes determining the driver as belonging to one of the pose classifications if the corresponding one of the metric values meets or surpasses the corresponding one of the thresholds.
- the method is performed by an aftermarket device, and wherein the first camera and the second camera are integrated as parts of the aftermarket device.
- the second information is determined by processing the second image to determine whether an image of the driver meets a pose classification or not; and wherein the method further comprises determining whether the driver is engaged with a driving task or not based on the image of the driver meeting the pose classification or not.
- the act of determining the second information indicating the state of the driver comprises processing the second image based on a neural network model.
- Model generation and incorporation
- FIG. 23 illustrates a technique of determining a model for use by the apparatus 200 in accordance with some embodiments.
- Each of the apparatuses 200a-200d may have the configuration and features described with reference to the apparatus 200 of FIG. 2A.
- the apparatuses 200b-200d include cameras (both external viewing cameras and internal viewing cameras) for generating images.
- the images are transmitted, directly or indirectly, to a server 920 via a network (e.g., a cloud, the Internet, etc.).
- the server 920 includes a processing unit 922 configured to process the images from the apparatuses 200b-200d in the vehicles 910b-910d to determine a model 930, and one or more models 932.
- the model 930 may be configured to detect poses of drivers, and the model(s) 932 may be configured to detect different types of objects in camera images.
- the models 930, 932 may then be stored in a non-transitory medium 924 in the server 920.
- the server 920 may transmit the models 930, 932 directly or indirectly, to the apparatus 200a in the vehicle 910a via a network (e.g., a cloud, the Internet, etc.).
- the apparatus 200a can then use the model 930 to process images received by the camera of the apparatus 200a to detect different poses of the driver of the vehicle 910a. Also, the apparatus 200a can then use the model(s) 932 to process images received by the camera of the apparatus 200a to detect different objects outside the vehicle 910a and/or to determine a region of interest for the camera of the apparatus 200a.
- in the example of FIG. 23, there are three apparatuses 200b-200d in three respective vehicles 910b-910d for providing images. In other examples, there may be more than three apparatuses 200 in more than three respective vehicles 910 for providing images to the server 920, or there may be fewer than three apparatuses 200 in fewer than three vehicles 910 for providing images to the server 920.
- the model 930 provided by the server 920 may be a neural network model.
- the model(s) 932 provided by the server 920 may also be one or more neural network model(s).
- the server 920 may be a neural network, or a part of a neural network, and the images from the apparatuses 200b-200d may be utilized by the server 920 to configure the model 930 and/or the model(s) 932.
- the processing unit 922 of the server 920 may configure the model 930 and/or the model(s) 932 by training the model 930 via machine learning.
- the images from the different apparatuses 200b-200d form a rich data set from different cameras mounted at different positions with respect to the corresponding vehicles, which will be useful in training the model 930 and/or the model(s) 932.
- the term “neural network” refers to any computing device, system, or module made up of a number of interconnected processing elements, which process information by their dynamic state response to input.
- the neural network may have deep learning capability and/or artificial intelligence.
- the neural network may be simply any computing element that can be trained using one or more data sets.
- the neural network may be a perceptron, a feedforward neural network, a radial basis neural network, a deep-feed forward neural network, a recurrent neural network, a long/short term memory neural network, a gated recurrent unit, an auto encoder neural network, a variational auto encoder neural network, a denoising auto encoder neural network, a sparse auto encoder neural network, a Markov chain neural network, a Hopfield neural network, a Boltzmann machine, a restricted Boltzmann machine, a deep belief network, a convolutional network, a deconvolutional network, a deep convolutional inverse graphics network, a generative adversarial network, a liquid state machine, an extreme learning machine, an echo state network, a deep residual network, a Kohonen network, a support vector machine, a neural turing machine, a modular neural network, a sequence-to-sequence model, etc., or any combination of the foregoing.
- the processing unit 922 of the server 920 uses the images to configure (e.g., to train) the model 930 to identify certain poses of drivers.
- the model 930 may be configured to identify whether a driver is in a looking-down pose, a looking-up pose, a looking-left pose, a looking-right pose, a cellphone-using pose, a smoking pose, a holding-object pose, a hand(s)-not-on-the-wheel pose, a not-wearing-seatbelt pose, an eye(s)-closed pose, a looking-straight pose, a one-hand-on-wheel pose, a two-hands-on-wheel pose, etc.
- the processing unit 922 of the server 920 may use the images to configure the model to determine whether a driver is engaged with a driving task or not.
- the determination of whether a driver is engaged with a driving task or not may be accomplished by a processing unit processing pose classifications of the driver.
- pose classifications may be output provided by a neural network model.
- the pose classifications output by the neural network model may be passed to a processing unit, which determines whether the driver is engaged with a driving task or not based on the pose classifications from the neural network model.
- the processing unit receiving the pose classifications may be another (e.g., second) neural network model.
- the first neural network model is configured to output pose classifications, and the second neural network model is configured to determine whether a driver is engaged with a driving task or not based on the pose classifications output by the first neural network model.
- the model 930 may be considered as having both a first neural network model and a second neural network model.
- the model 930 may be a single neural network model that is configured to receive images as input, and to provide an output indicating whether a driver is engaged with a driving task or not.
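The two-stage arrangement described above (a first model outputting pose classifications, and a second stage deciding engagement from those classifications) can be sketched as follows. This is an illustrative sketch only: the function names, pose labels treated as "disengaged", and the threshold are assumptions for illustration, not details from the patent.

```python
# Hypothetical sketch of the two-stage arrangement: a first model maps
# an image to pose classifications, and a second stage maps those
# classifications to an engagement determination.
from typing import Dict

# Pose labels modeled on the examples listed in the description.
POSE_LABELS = [
    "looking-down", "looking-up", "looking-left", "looking-right",
    "cellphone-using", "smoking", "holding-object",
    "hands-not-on-wheel", "not-wearing-seatbelt", "eyes-closed",
    "looking-straight", "one-hand-on-wheel", "two-hands-on-wheel",
]

# Poses that, for this sketch only, are treated as evidence of
# disengagement from the driving task (an illustrative assumption).
DISENGAGED_POSES = {
    "looking-down", "looking-up", "looking-left", "looking-right",
    "cellphone-using", "smoking", "eyes-closed",
}

def classify_poses(image) -> Dict[str, float]:
    """Stand-in for the first neural network model: returns a
    confidence score per pose label for the given image."""
    # A real implementation would run a trained network here.
    return {label: 0.0 for label in POSE_LABELS}

def is_engaged(pose_scores: Dict[str, float], threshold: float = 0.6) -> bool:
    """Stand-in for the second stage: decides engagement from the
    pose classifications produced by the first stage."""
    return not any(pose_scores.get(p, 0.0) >= threshold for p in DISENGAGED_POSES)
```

In this sketch the second stage is a simple rule; in the embodiments described above it may instead be a second neural network model, or the two stages may be combined into a single model that maps an image directly to an engagement determination.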
- the processing unit 922 of the server 920 uses the images to configure (e.g., to train) the model(s) 932 to detect different objects.
- the model(s) 932 may be configured to detect vehicles, humans, animals, bicycles, traffic lights, road signs, road markings, curb sides, centerlines of roadways, etc.
- the model 930 and/or the model(s) 932 may not be a neural network model, and may be any of other types of model.
- the configuring of the model 930 and/or the model(s) 932 by the processing unit 922 may not involve any machine learning, and/or images from the apparatuses 200b-200d may not be needed.
- the configuring of the model 930 and/or the model(s) 932 by the processing unit 922 may be achieved by the processing unit 922 determining (e.g., obtaining, calculating, etc.) processing parameters (such as feature extraction parameters) for the model 930 and/or the model(s) 932.
- the model 930 and/or the model(s) 932 may include program instructions, commands, scripts, parameters (e.g., feature extraction parameters), etc.
- the model 930 and/or the model(s) 932 may be in a form of an application that can be received wirelessly by the apparatus 200.
- the models 930, 932 are then available for use by apparatuses 200 in different vehicles 910 to identify objects in camera images. As shown in the figure, the models 930, 932 may be transmitted from the server 920 to the apparatus 200a in the vehicle 910a. The models 930, 932 may also be transmitted from the server 920 to the apparatuses 200b-200d in the respective vehicles 910b-910d.
- the processing unit in the apparatus 200a may then process images generated by the camera (internal viewing camera) of the apparatus 200a based on the model 930 to identify poses of drivers, and/or to determine whether drivers are engaged with a driving task or not, as described herein, and may process images generated by the camera (external viewing camera) of the apparatus 200a based on the model(s) 932 to detect objects outside the vehicle 910a.
- the transmission of the models 930, 932 from the server 920 to the apparatus 200 may be performed by the server 920 “pushing” the models 930, 932, so that the apparatus 200 is not required to request the models 930, 932.
- the transmission of the models 930, 932 from the server 920 may be performed by the server 920 in response to a signal generated and sent by the apparatus 200.
- the apparatus 200 may generate and transmit a signal after the apparatus 200 is turned on, or after the vehicle with the apparatus 200 has been started. The signal may be received by the server 920, which then transmits the models 930, 932 for reception by the apparatus 200.
- the apparatus 200 may include a user interface, such as a button, which allows a user of the apparatus 200 to send a request for the models 930, 932.
- the apparatus 200 transmits a request for the models 930, 932 to the server 920.
- the server 920 transmits the models 930, 932 to the apparatus 200.
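The request-based transmission described above (the apparatus signaling the server after power-on, or after a user presses a button, and the server replying with the models) can be sketched as below. The classes, message format, and model payloads are illustrative assumptions; the patent does not specify a transport or protocol.

```python
# Hypothetical sketch of the request/response flow between apparatus 200
# and server 920. JSON messages stand in for whatever signaling the
# actual embodiments use.
import json

class ModelServer:
    """Stand-in for server 920 holding the models 930 and 932."""
    def __init__(self):
        self.models = {
            "930": {"type": "driver-pose-model"},
            "932": {"type": "object-detection-models"},
        }

    def handle_request(self, request: str) -> str:
        # Transmit the models only when the apparatus asks for them;
        # a "push" embodiment would send them unprompted instead.
        if json.loads(request).get("want") == "models":
            return json.dumps(self.models)
        return json.dumps({})

class Apparatus:
    """Stand-in for apparatus 200; requests the models after power-on."""
    def __init__(self, server: ModelServer):
        self.server = server
        self.models = None

    def on_power_on(self):
        reply = self.server.handle_request(json.dumps({"want": "models"}))
        self.models = json.loads(reply)
```

The same `on_power_on` logic could equally be triggered by a button press on the apparatus's user interface, matching the embodiment in which the user explicitly requests the models.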
- the server 920 of FIG. 23 is not limited to being one server device, and may be more than one server device.
- the processing unit 922 of the server 920 may include one or more processors, one or more processing modules, etc.
- the images obtained by the server 920 may not be generated by the apparatuses 200b-200d. Instead, the images used by the server 920 to determine (e.g., to train, to configure, etc.) the models 930, 932 may be recorded using other device(s), such as mobile phone(s), camera(s) in other vehicles, etc. Also, in other embodiments, the images used by the server 920 to determine (e.g., to train, to configure, etc.) the models 930, 932 may be downloaded to the server 920 from a database, such as from a database associated with the server 920, or a database owned by a third party.
- FIG. 24 illustrates a specialized processing system for implementing one or more electronic devices described herein.
- the processing system 1600 may implement the apparatus 200, or at least a part of the apparatus 200, such as the processing unit 210 of the apparatus 200.
- Processing system 1600 includes a bus 1602 or other communication mechanism for communicating information, and a processor 1604 coupled with the bus 1602 for processing information.
- the processor system 1600 also includes a main memory 1606, such as a random access memory (RAM) or other dynamic storage device, coupled to the bus 1602 for storing information and instructions to be executed by the processor 1604.
- the main memory 1606 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by the processor 1604.
- the processor system 1600 further includes a read only memory (ROM) 1608 or other static storage device coupled to the bus 1602 for storing static information and instructions for the processor 1604.
- a data storage device 1610 such as a magnetic disk or optical disk, is provided and coupled to the bus 1602 for storing information and instructions.
- the processor system 1600 may be coupled via the bus 1602 to a display 1612, such as a screen or a flat panel, for displaying information to a user.
- An input device 1614 is coupled to the bus 1602 for communicating information and command selections to processor 1604.
- Another type of user input device is a cursor control 1616, such as a touchpad, a touchscreen, a trackball, or cursor direction keys, for communicating direction information and command selections to the processor 1604 and for controlling cursor movement on the display 1612.
- This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.
- the processor system 1600 can be used to perform various functions described herein. According to some embodiments, such use is provided by processor system 1600 in response to processor 1604 executing one or more sequences of one or more instructions contained in the main memory 1606. Those skilled in the art will know how to prepare such instructions based on the functions and methods described herein. Such instructions may be read into the main memory 1606 from another processor-readable medium, such as storage device 1610. Execution of the sequences of instructions contained in the main memory 1606 causes the processor 1604 to perform the process steps described herein. One or more processors in a multi-processing arrangement may also be employed to execute the sequences of instructions contained in the main memory 1606. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions to implement the various embodiments described herein. Thus, embodiments are not limited to any specific combination of hardware circuitry and software.
- processor-readable medium refers to any medium that participates in providing instructions to the processor 1604 for execution. Such a medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media.
- Non-volatile media includes, for example, optical or magnetic disks, such as the storage device 1610.
- a non-volatile medium may be considered an example of non-transitory medium.
- Volatile media includes dynamic memory, such as the main memory 1606.
- a volatile medium may be considered an example of non-transitory medium.
- Transmission media includes cables, wires, and fiber optics, including the wires that comprise the bus 1602. Transmission media can also take the form of acoustic or light waves, such as those generated during radio wave and infrared data communications.
- processor-readable media include, for example, a hard disk, a magnetic medium, a CD-ROM, any other optical medium, a RAM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave as described hereinafter, or any other medium from which a processor can read.
- processor-readable media may be involved in carrying one or more sequences of one or more instructions to the processor 1604 for execution.
- the instructions may initially be carried on a storage of a remote computer or remote device.
- the remote computer or device can send the instructions over a network, such as the Internet.
- a receiving unit local to the processing system 1600 can receive the data from the network, and provide the data on the bus 1602.
- the bus 1602 carries the data to the main memory 1606, from which the processor 1604 retrieves and executes the instructions.
- the instructions received by the main memory 1606 may optionally be stored on the storage device 1610 either before or after execution by the processor 1604.
- the processing system 1600 also includes a communication interface 1618 coupled to the bus 1602.
- the communication interface 1618 provides a two-way data communication coupling to a network link 1620 that is connected to a local network 1622.
- the communication interface 1618 may be an integrated services digital network (ISDN) card to provide a data communication connection.
- the communication interface 1618 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN.
- Wireless links may also be implemented.
- the communication interface 1618 sends and receives electrical, electromagnetic or optical signals that carry data streams representing various types of information.
- the network link 1620 typically provides data communication through one or more networks to other devices.
- the network link 1620 may provide a connection through local network 1622 to a host computer 1624 or to equipment 1626.
- the data streams transported over the network link 1620 can comprise electrical, electromagnetic or optical signals.
- the signals through the various networks and the signals on the network link 1620 and through the communication interface 1618, which carry data to and from the processing system 1600, are exemplary forms of carrier waves transporting the information.
- the processing system 1600 can send messages and receive data, including program code, through the network(s), the network link 1620, and the communication interface 1618.
- image is not limited to an image that is displayed, and may refer to an image that is displayed or not displayed (e.g., an image in data or digital form that is stored).
- graphical element or any of other similar terms, such as “graphical identifier”, may refer to an item that is displayed or not displayed. The item may be a computational element, an equation representing the graphical element / identifier, or one or more geometric parameters associated with the graphical element / identifier.
- model may refer to one or more algorithms, one or more equations, one or more processing applications, one or more variables, one or more criteria, one or more parameters, or any combination of two or more of the foregoing.
- the phrase “determine whether the driver is engaged with a driving task or not”, or any of other similar phrases, does not necessarily require both (1) “driver is engaged with a driving task” and (2) “driver is not engaged with a driving task” to be possible determination outcomes. Rather, such a phrase and similar phrases are intended to cover (1) “driver is engaged with a driving task” as a possible determination outcome, or (2) “driver is not engaged with a driving task” as a possible determination outcome, or (3) both “driver is engaged with a driving task” and “driver is not engaged with a driving task” as possible determination outcomes. Also, the above phrase and other similar phrases do not exclude other determination outcomes, such as an outcome indicating that a state of the driver is unknown.
- the above phrase or other similar phrases cover an embodiment in which a processing unit is configured to determine that (1) the driver is engaged with a driving task, or (2) it is unknown whether the driver is engaged with a driving task, as two possible processing outcomes (because the first part of the phrase mentions the determination outcome (1)).
- the above phrase or other similar phrases cover an embodiment in which a processing unit is configured to determine that (1) the driver is not engaged with a driving task, or (2) it is unknown whether the driver is not engaged with a driving task, as two possible processing outcomes (because the later part of the phrase mentions the determination outcome (2)).
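The three possible determination outcomes discussed above (engaged, not engaged, or unknown) can be made concrete with a small sketch. The enum, the confidence inputs, and the decision rule are illustrative assumptions, not the patent's method.

```python
# Hypothetical sketch of a determination with three possible outcomes,
# including an "unknown" outcome when confidence is insufficient.
from enum import Enum

class Engagement(Enum):
    ENGAGED = "engaged"
    NOT_ENGAGED = "not engaged"
    UNKNOWN = "unknown"

def determine_engagement(confidence_engaged: float,
                         confidence_not_engaged: float,
                         min_confidence: float = 0.5) -> Engagement:
    """Returns UNKNOWN when neither outcome reaches the confidence floor."""
    if (confidence_engaged >= min_confidence
            and confidence_engaged >= confidence_not_engaged):
        return Engagement.ENGAGED
    if confidence_not_engaged >= min_confidence:
        return Engagement.NOT_ENGAGED
    return Engagement.UNKNOWN
```

A processing unit covered by the phrase might implement only a subset of these outcomes, e.g., distinguishing only between "engaged" and "unknown", as the passage above explains.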
- signal may refer to one or more signals.
- a signal may include one or more data items, one or more pieces of information, one or more signal values, one or more discrete values, etc.
Landscapes
- Engineering & Computer Science (AREA)
- Automation & Control Theory (AREA)
- Transportation (AREA)
- Mechanical Engineering (AREA)
- Human Computer Interaction (AREA)
- Traffic Control Systems (AREA)
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/348,715 US11945435B2 (en) | 2021-06-15 | 2021-06-15 | Devices and methods for predicting collisions and/or intersection violations |
US17/348,732 US20210309221A1 (en) | 2021-06-15 | 2021-06-15 | Devices and methods for determining region of interest for object detection in camera images |
US17/348,727 US20210312193A1 (en) | 2021-06-15 | 2021-06-15 | Devices and methods for predicting intersection violations and/or collisions |
PCT/US2022/029312 WO2022265776A1 (en) | 2021-06-15 | 2022-05-13 | Devices and methods for predicting collisions, predicting intersection violations, and/or determining region of interest for object detection in camera images |
Publications (1)
Publication Number | Publication Date |
---|---|
EP4355626A1 true EP4355626A1 (en) | 2024-04-24 |
Family
ID=84527603
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP22825506.3A Pending EP4355626A1 (en) | 2021-06-15 | 2022-05-13 | Devices and methods for predicting collisions, predicting intersection violations, and/or determining region of interest for object detection in camera images |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP4355626A1 (en) |
JP (1) | JP2024525153A (en) |
WO (1) | WO2022265776A1 (en) |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7428449B2 (en) * | 2006-03-14 | 2008-09-23 | Temic Automotive Of North America, Inc. | System and method for determining a workload level of a driver |
US7710248B2 (en) * | 2007-06-12 | 2010-05-04 | Palo Alto Research Center Incorporated | Human-machine-interface (HMI) customization based on collision assessments |
US8577550B2 (en) * | 2009-10-05 | 2013-11-05 | Ford Global Technologies, Llc | System for vehicle control to mitigate intersection collisions and method of using the same |
JP5657809B2 (en) * | 2011-10-06 | 2015-01-21 | 本田技研工業株式会社 | Armpit detector |
US8989914B1 (en) * | 2011-12-19 | 2015-03-24 | Lytx, Inc. | Driver identification based on driving maneuver signature |
US10246014B2 (en) * | 2016-11-07 | 2019-04-02 | Nauto, Inc. | System and method for driver distraction determination |
US11403857B2 (en) * | 2018-11-19 | 2022-08-02 | Nauto, Inc. | System and method for vehicle localization |
US11945435B2 (en) * | 2021-06-15 | 2024-04-02 | Nauto, Inc. | Devices and methods for predicting collisions and/or intersection violations |
2022
- 2022-05-13 JP JP2023577297A patent/JP2024525153A/en active Pending
- 2022-05-13 WO PCT/US2022/029312 patent/WO2022265776A1/en active Application Filing
- 2022-05-13 EP EP22825506.3A patent/EP4355626A1/en active Pending
Also Published As
Publication number | Publication date |
---|---|
JP2024525153A (en) | 2024-07-10 |
WO2022265776A1 (en) | 2022-12-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11945435B2 (en) | Devices and methods for predicting collisions and/or intersection violations | |
KR102351592B1 (en) | Default preview area and gaze-based driver distraction detection | |
US20230166743A1 (en) | Devices and methods for assisting operation of vehicles based on situational assessment fusing expoential risks (safer) | |
US20210312193A1 (en) | Devices and methods for predicting intersection violations and/or collisions | |
US9524643B2 (en) | Orientation sensitive traffic collision warning system | |
WO2020100539A1 (en) | Information processing device, moving device, method, and program | |
CN103987577B (en) | Method for monitoring the traffic conditions in the surrounding environment with signalling vehicle | |
US10336252B2 (en) | Long term driving danger prediction system | |
AU2019337091A1 (en) | Systems and methods for classifying driver behavior | |
JP2019533609A (en) | Near-crash determination system and method | |
JP2016001463A (en) | Processor, processing system, processing program, and processing method | |
JP2016001464A (en) | Processor, processing system, processing program, and processing method | |
Köhler et al. | Autonomous evasive maneuvers triggered by infrastructure-based detection of pedestrian intentions | |
KR20110067359A (en) | Method and apparatus for collision avoidance of vehicle | |
CN114340970A (en) | Information processing device, mobile device, information processing system, method, and program | |
WO2023179494A1 (en) | Danger early warning method and apparatus, and vehicle | |
US20210309221A1 (en) | Devices and methods for determining region of interest for object detection in camera images | |
Altaf et al. | A survey on autonomous vehicles in the field of intelligent transport system | |
JP2021136001A (en) | Driving support device | |
Kondyli et al. | A 3D experimental framework for exploring drivers' body activity using infrared depth sensors | |
EP4355626A1 (en) | Devices and methods for predicting collisions, predicting intersection violations, and/or determining region of interest for object detection in camera images | |
JP7298351B2 (en) | State determination device, in-vehicle device, driving evaluation system, state determination method, and program | |
Jebamani et al. | AR Upgraded Windshield |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
17P | Request for examination filed |
Effective date: 20231211 |
AK | Designated contracting states |
Kind code of ref document: A1 |
Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
RAP3 | Party data changed (applicant data changed or rights of an application transferred) |
Owner name: NAUTO, INC. |
RIN1 | Information on inventor provided before grant (corrected) |
Inventor name: HORNSTEIN, ILAN |
Inventor name: BELKIN, RUSLAN |
Inventor name: HECK, STEFAN |
Inventor name: KWAN, GARY |
Inventor name: MARSCHKE, JEREMY |
Inventor name: CHANDRA, PIYUSH |
Inventor name: WU, ALEXANDER |
Inventor name: MAHMUD, TAHMIDA |
Inventor name: ALPERT, BENJAMIN |
DAV | Request for validation of the european patent (deleted) |
DAX | Request for extension of the european patent (deleted) |