WO2019161766A1 - Method for distress and road rage detection - Google Patents

Method for distress and road rage detection

Info

Publication number
WO2019161766A1
Authority
WO
WIPO (PCT)
Prior art keywords
driver
estimate
normal driving
time
sensors
Application number
PCT/CN2019/075558
Other languages
French (fr)
Inventor
Fatih Porikli
Yuzhu WU
Luis Bill
Original Assignee
Huawei Technologies Co., Ltd.
Application filed by Huawei Technologies Co., Ltd.
Priority to CN201980014866.9A (CN111741884B)
Priority to EP19756593.0A (EP3755597B1)
Publication of WO2019161766A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W40/00 Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models
    • B60W40/08 Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models related to drivers or passengers
    • B60W40/09 Driving style or behaviour
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W50/00 Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
    • B60W50/08 Interaction between the driver and the control system
    • B60W50/14 Means for informing the driver, warning the driver or prompting a driver intervention
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133 Distances to prototypes
    • G06F18/24143 Distances to neighbourhood prototypes, e.g. restricted Coulomb energy networks [RCEN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/10 Image acquisition
    • G06V10/12 Details of acquisition arrangements; Constructional details thereof
    • G06V10/14 Optical characteristics of the device performing the acquisition or on the illumination arrangements
    • G06V10/143 Sensing or illuminating at different wavelengths
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454 Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/59 Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G06V20/597 Recognising the driver's state or behaviour, e.g. attention or drowsiness
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174 Facial expression recognition
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W40/00 Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models
    • B60W40/08 Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models related to drivers or passengers
    • B60W2040/0818 Inactivity or incapacity of driver
    • B60W2040/0863 Inactivity or incapacity of driver due to erroneous selection or response of the driver
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W50/00 Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
    • B60W50/08 Interaction between the driver and the control system
    • B60W50/14 Means for informing the driver, warning the driver or prompting a driver intervention
    • B60W2050/143 Alarm means
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W50/00 Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
    • B60W50/08 Interaction between the driver and the control system
    • B60W50/14 Means for informing the driver, warning the driver or prompting a driver intervention
    • B60W2050/146 Display means
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2540/00 Input parameters relating to occupants
    • B60W2540/22 Psychological state; Stress level or workload
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2540/00 Input parameters relating to occupants
    • B60W2540/26 Incapacity

Definitions

  • the present disclosure relates to automated systems for vehicle safety, and in particular to systems and methods for detection of driver and passenger distress and road rage.
  • a method for determining distress of a driver of a vehicle comprises receiving inputs from a plurality of sensors by one or more processors, the sensors including interior vehicle image sensors, an interior vehicle audio sensor, vehicle data sensors, and Global Positioning System (GPS) data sensors, and processing the received inputs to obtain a driver heat change estimate, a driver expression estimate, a driver gesture estimate, an on-board diagnostics (OBD) estimate, and a GPS estimate.
  • the estimates are stored in a memory, and the stored estimates are used to generate deviation scores for each of the driver heat change estimate, the driver expression estimate, the driver gesture estimate, the OBD estimate, and the GPS estimate.
  • a machine learning algorithm is executed by the one or more processors to classify driver behavior as normal or impaired based on the deviation scores, and to generate a warning based on the classification indicating impaired driver behavior.
  • generating the deviation score for the driver or passenger heat change estimate includes: generating a normal driving model offline using normal driving thermal images of the driver or a passenger; comparing the normal driving model with real-time thermal image data of the driver or the passenger to obtain a comparison result; and applying a probability density function (PDF) to the comparison result to obtain the deviation score for the driver or passenger heat change estimate.
  • PDF probability density function
  • generating the deviation score for the driver or passenger expression estimate includes: using detection-tracking-validation (DTV) to localize frontal face images of the driver or a passenger; constructing a face stream frame from a partitioned face region of the frontal face images; applying a fully convolutional network (FCN) to the face stream frame using an encoder, including using multiple convolutional, pooling, batch normalization, and rectified linear unit (ReLU) layers; reshaping a feature map of a last layer of the encoder into vector form to obtain an output, and applying the output to a recurrent neural network (RNN) to obtain a normal driving expression model using a Gaussian mixture model (GMM) ; and comparing a real-time driver or passenger expression with the normal driving expression model to calculate the deviation score for the driver or passenger expression estimate.
  • DTV detection-tracking-validation
  • FCN fully convolutional network
  • ReLU rectified linear unit
  • generating the deviation score for the driver or passenger gesture estimate includes: detecting driver or passenger gestures to obtain an image of a hands region of the driver or passenger; constructing a two-layer hand stream from the image and normalizing the two-layer hand stream for size adjustment; applying a fully convolutional network (FCN) to the two-layer hand stream using an encoder, including using multiple convolutional, pooling, batch normalization, and rectified linear unit (ReLU) layers; reshaping a feature map of a last layer of the encoder into vector form to obtain an output, and applying the output to a recurrent neural network (RNN) to obtain a normal driving gesture model using a Gaussian mixture model (GMM); and comparing a real-time driver or passenger gesture with the normal driving gesture model to calculate the deviation score for the driver or passenger gesture estimate.
  • FCN fully convolutional network
  • ReLU rectified linear unit
  • generating the deviation score for the OBD estimate includes: collecting normal driving data from OBD related to two or more of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision; using the normal driving data to generate a normal driving model for each of the two or more of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision; and comparing real-time data to the normal driving model for each of the two or more of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision to generate a deviation score for each of the two or more of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision.
  • the warning includes a visual alert.
  • the warning includes an audio output.
  • the warning includes a suggested corrective driver action using a display.
  • using the processor to execute the machine learning algorithm to classify the driver behavior as normal or impaired includes using a Gaussian mixture model (GMM) .
  • GMM Gaussian mixture model
  • expectation maximization is used to estimate model parameters of the GMM.
  • the processor is configured to generate a normal driving model offline for comparison to real-time driving data.
  • a system for determining distress of a driver of a vehicle comprises a plurality of sensors, including interior vehicle image sensors, an interior vehicle audio sensor, vehicle data sensors, and Global Positioning System (GPS) data sensors, and a processor in communication with the plurality of sensors.
  • the processor is configured to: receive inputs from the plurality of sensors, process the received inputs to obtain a driver or passenger heat change estimate, a driver or passenger expression estimate, a driver or passenger gesture estimate, an on-board diagnostics (OBD) estimate, and a GPS estimate, and store the estimates in a memory.
  • GPS Global Positioning System
  • the stored estimates are used to generate deviation scores for each of the driver or passenger heat change estimate, the driver or passenger expression estimate, the driver gesture estimate, the OBD estimate, and the GPS estimate.
  • a machine learning algorithm is executed to classify driver behavior as normal or impaired based on the deviation scores, and a warning is generated based on the classification indicating impaired driver or passenger behavior.
  • the plurality of sensors further includes exterior-facing sensors of the vehicle.
  • the processor is further configured to receive a traffic information input, including at least one of a speed limit and a lane direction.
  • the warning includes a suggested corrective driver action using a display.
  • a non-transitory computer-readable medium storing computer instructions to determine distress of a driver of a vehicle and provide a warning, that when executed by one or more processors, cause the one or more processors to perform steps of: receiving inputs from a plurality of sensors, including interior vehicle image sensors, an interior vehicle audio sensor, vehicle data sensors, and Global Positioning System (GPS) data sensors; processing the received inputs to obtain a driver or passenger heat change estimate, a driver or passenger expression estimate, a driver gesture estimate, an on-board diagnostics (OBD) estimate, and a GPS estimate; storing the estimates in a memory; using the stored estimates to generate deviation scores for each of the driver or passenger heat change estimate, the driver or passenger expression estimate, the driver gesture estimate, the OBD estimate, and the GPS estimate; executing a machine learning algorithm to classify driver behavior as normal or impaired based on the deviation scores; and generating the warning based on the classification indicating impaired driver or passenger behavior.
  • GPS Global Positioning System
  • generating the deviation score for the driver or passenger heat change estimate includes: generating a normal driving model offline using normal driving thermal images of the driver or a passenger; comparing the normal driving model with real-time thermal image data of the driver or the passenger to obtain a comparison result; and applying a probability density function (PDF) to the comparison result to obtain the deviation score for the driver or passenger heat change estimate.
  • PDF probability density function
  • generating the deviation score for the driver or passenger expression estimate includes: using detection-tracking-validation (DTV) to localize frontal face images of the driver or a passenger; constructing a face stream frame from a partitioned face region of the frontal face images; applying a fully convolutional network (FCN) to the face stream frame using an encoder, including using multiple convolutional, pooling, batch normalization, and rectified linear unit (ReLU) layers; reshaping a feature map of a last layer of the encoder into vector form to obtain an output, and applying the output to a recurrent neural network (RNN) to obtain a normal driving expression model using a Gaussian mixture model (GMM) ; and comparing a real-time driver or passenger expression with the normal driving expression model to calculate the deviation score for the driver or passenger expression estimate.
  • DTV detection-tracking-validation
  • FCN fully convolutional network
  • ReLU rectified linear unit
  • generating the deviation score for the driver or passenger gesture estimate includes: detecting driver gestures to obtain an image of a hands region of the driver or passenger; constructing a two-layer hand stream from the image and normalizing the two-layer hand stream for size adjustment; applying a fully convolutional network (FCN) to the two-layer hand stream using an encoder, including using multiple convolutional, pooling, batch normalization, and rectified linear unit (ReLU) layers; reshaping a feature map of a last layer of the encoder into vector form to obtain an output, and applying the output to a recurrent neural network (RNN) to obtain a normal driving or passenger gesture model using a Gaussian mixture model (GMM) ; and comparing a real-time driver or passenger gesture with the normal driving or passenger gesture model to calculate the deviation score for the driver or passenger gesture estimate.
  • FCN fully convolutional network
  • ReLU rectified linear unit
  • generating the deviation score for the OBD estimate includes: collecting normal driving data from OBD related to two or more of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision; using the normal driving data to generate a normal driving model for each of the two or more of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision; and comparing real-time data to the normal driving model for each of the two or more of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision to generate a deviation score for each of the two or more of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision.
  • FIGS. 1A-1B are block diagrams illustrating systems for detection of driver and passenger distress, according to various embodiments.
  • FIG. 2 is a flow diagram illustrating a method for detection of driver and passenger distress, according to various embodiments.
  • FIG. 3 is a graph illustrating density of occurrences of driver hand gestures, according to various embodiments.
  • FIG. 4 is a block diagram illustrating a system for detection of driver and passenger distress, according to various embodiments.
  • FIG. 5 is a block diagram illustrating calculation of a deviation score for a driver or passenger heat change estimate in a system for detection of driver and passenger distress, according to various embodiments.
  • FIGS. 6A-6B are block diagrams illustrating detection of driver or passenger expression and calculation of a deviation score for a driver or passenger expression estimation, according to various embodiments.
  • FIGS. 7A-7C are graphs illustrating calculation of a deviation score for a driver gesture estimate in a system for detection of driver and passenger distress, according to various embodiments.
  • FIG. 8 is a flow diagram illustrating calculation of a deviation score for mel-frequency cepstral coefficients (MFCC) , according to various embodiments.
  • FIGS. 9A-9B are flow diagrams illustrating methods for associating hands and a face with a driver using an image sensor stream, according to various embodiments.
  • FIGS. 10A-10B illustrate an image sensor of the present subject matter and an example of a captured image from the sensor, according to various embodiments.
  • FIG. 11 is a flow diagram illustrating a method for detection of driver and passenger distress, according to various embodiments.
  • FIG. 12 illustrates a system for detection of driver and passenger distress, according to various embodiments.
  • FIG. 13 is a diagram illustrating circuitry for implementing devices to perform methods according to an example embodiment.
  • FIG. 14 is a schematic diagram illustrating circuitry for implementing devices to perform methods according to example embodiments.
  • Embodiments of the present subject matter monitor distress and road rage in real time as part of a driver assistance system.
  • the recognition of distress and road rage typically relies on interpretation of very subtle cues, which may vary among individuals. Therefore, embodiments of the present subject matter monitor a plurality of modalities (such as facial expressions, hand gestures, vehicle speed, etc. ) in order to create a robust system, which can be used to detect changes in driver temperament.
  • Road rage can be classified into four stages: in stage 1, when a driver is annoyed by somebody, they usually start making non-threatening gestures or facial expressions to show annoyance; in stage 2, after showing their dissatisfaction, angry drivers can escalate the situation by honking, flashing lights, braking maliciously, tailgating, and blocking vehicles; in stage 3, aggressive drivers might curse, yell, and threaten another driver; and in stage 4, in the worst case, drivers might fire a gun, hit a vehicle with objects, chase a vehicle, or run a vehicle off the road.
  • the present subject matter provides a distress and road rage monitoring system, which can monitor a driver or passenger to detect levels of distress and road rage and provide a notification if distress or road rage is detected.
  • the system incorporates, but is not limited to, thermal imaging, speech, and visual information, as well as other modalities such as driving performance and hand gestures, in various embodiments.
  • the inputs to a processing unit can be information originating from audio sensors, image sensors (e.g., near-infrared reflectance (NIR) cameras or thermal cameras) , and overall vehicle data.
  • the system can then assist the driver or passenger to reduce the possibility of an incident.
  • the present system can obtain important information that otherwise cannot be obtained when relying on just a single source of information.
  • Each modality can provide information that may not be found in a different modality (e.g., image information from an image sensor vs. sound information from a sound transducer) .
  • embodiments of the present subject matter use neural networks, reinforcement learning, and other machine learning techniques in order for the system to learn which features about the driver and the vehicle can be useful when detecting road rage and stress.
  • a system for determining distress of a driver of a vehicle comprising a plurality of sensors, including, but not limited to, interior vehicle image sensors, an interior vehicle audio sensor, vehicle data sensors, and Global Positioning System (GPS) data sensors.
  • the system also includes a processor configured to receive inputs from the plurality of sensors, and process the received inputs to obtain a driver or passenger heat change estimate, a driver or passenger expression estimate, a driver gesture estimate, an on-board diagnostics (OBD) estimate, and a GPS estimate.
  • GPS Global Positioning System
  • the processor is further configured to store the estimates in a memory, use the stored estimates to generate deviation scores for each of the estimates, execute a machine learning algorithm to classify driver behavior as normal or impaired based on the deviation scores, and generate a warning if the classification indicates impaired driver behavior.
  • the functions or algorithms described herein may be implemented in software in one embodiment.
  • the software may consist of computer-executable instructions stored on computer-readable media or a computer-readable storage device such as one or more non-transitory memories or other types of hardware-based storage devices, either local or networked.
  • the functions may correspond to modules, which may be software, hardware, firmware, or any combination thereof. Multiple functions may be performed in one or more modules as desired, and the embodiments described are merely examples.
  • the software may be executed on a digital signal processor, application-specific integrated circuit (ASIC) , microprocessor, or other type of processor operating on a computer system, such as a personal computer, server, or other computer system, turning such a computer system into a specifically programmed machine.
  • ASIC application-specific integrated circuit
  • a method for determining distress of a driver of a vehicle comprises receiving inputs from a plurality of sensors, including interior vehicle image sensors, an interior vehicle audio sensor, vehicle data sensors, and Global Positioning System (GPS) data sensors, and processing the received inputs to obtain a driver or passenger heat change estimate, a driver or passenger expression estimate, a driver or passenger gesture estimate, an on-board diagnostics (OBD) estimate, and a GPS estimate.
  • the estimates are stored in a memory, and the stored estimates are used to generate deviation scores for each of the driver or passenger heat change estimate, the driver or passenger expression estimate, the driver or passenger gesture estimate, the OBD estimate, and the GPS estimate.
  • a machine learning algorithm is executed to classify driver behavior as normal or impaired based on the deviation scores, and to generate a warning if the classification indicates impaired driver or passenger behavior.
  • a computer-implemented system determines driving information of the driver (and passengers if available) , the driving information being based on sensor information collected by image sensors, location sensors, sound sensors, and vehicle sensors, which can then be used in order to understand the driver’s and passengers’ states, so that the system can further determine if distress or road rage is present in the driver’s state.
  • the system uses machine learning and pre-trained models in order to learn how to predict distress and road rage, in various embodiments, and stores this model in memory.
  • Machine learning techniques, such as reinforcement learning, allow the system to adapt to the driver’s distress/road rage driving performance and non-distress/road rage driving performance, in various embodiments.
  • Systems and methods of the present subject matter generate a prediction model of the driver’s distress and road rage level.
  • Various embodiments of a method include identifying the driver and passengers inside the vehicle, identifying the hands and faces of the driver and passengers, tracking the hands and faces, and using this information in order to detect facial expressions, gestures, thermal states, and activities that are indicators of distress.
  • the method further includes identifying the state of the environment, such as traffic conditions, objects near the vehicle, sounds around the vehicle (such as other vehicles honking) , road conditions, and the speed limit, in various embodiments.
  • the method also includes obtaining driving performance data, such as acceleration, speed, steering angle, and other embedded sensor data. Other inputs can be used without departing from the scope of the present subject matter.
  • the method includes fusing the aforementioned indicators, states, and data to determine if the driver is enraged or distressed, in various embodiments.
  • Various embodiments of the present system use a multimodal approach (e.g., multiple data streams, such as images, audio, vehicle data, etc. ) , such as described with respect to FIG. 4 below, where each modality can be used to detect features that help the system understand the driver’s distress and road rage levels.
  • the system can adapt and learn the ways different drivers may display rage and distress expressions, and determine driver preferences for how warnings and driving assistance are to be provided. For example, some drivers prefer frequent and repetitive warnings, which will provide assistance until the driver calms down, while other drivers prefer short warnings, because these drivers may be distracted by the alarms and warnings.
  • driver assistance may include reducing or limiting the speed of the vehicle, applying the brakes of the vehicle, or vibrating the steering wheel.
  • Other types of driver assistance can be used without departing from the scope of the present subject matter.
  • Various embodiments of the present system also accept driver feedback using reinforcement learning, which allows the system to continuously adapt to the driver. Advantages of the technical improvements of the present subject matter include that the present systems provide the desired warnings without requiring invasive sensing, such as blood pressure cuffs or other special equipment.
  • FIGS. 1A-1B are block diagrams illustrating systems for detection of driver and passenger distress, according to various embodiments.
  • the depicted embodiment includes a plurality of sensors 100 including at least one image sensor 101, at least one audio sensor 102, an OBD access device 103 for obtaining vehicle data, and a GPS input 104 for obtaining vehicle location.
  • Various embodiments include a processing unit 10, and an output (such as generation of an audible or visual alert or taking control of the vehicle) 20 generated by the processing unit 10 based on the condition of a driver 5.
  • Various embodiments also include an outside-facing image sensor 105 that records information about the environment outside the vehicle, as shown in FIG. 1B.
  • the processing unit 10 can include any platform that has capabilities to run neural processing computations, such as existing vehicle hardware, a mobile phone, or a dedicated device that is connected to vehicle OBD and GPS.
  • the processing unit 10 can include a rage and distress detector 21, a driver performance analyzer 22, a surrounding environment processor 23, a distress and road rage management processor 24, a module for reinforcement 25 and an input for driver feedback 26 in various embodiments.
  • the rage and distress detector 21 uses statistical models that allow the system to use statistical classification to determine distress and road rage levels. Since there are different levels of distress and road rage, the system can use a reference point for each of the modalities that are used as input.
  • the system uses a statistical distribution model that determines how far from normal or from the average the currently detected distress and road rage are.
  • the system can learn offline or in real time a normal driving baseline for a particular driver, and depending on how far the driving performance has deviated from the normal driving performance, the system determines if the driver’s distress and road rage levels are acceptable.
  • Some indicators of normal driving can include, but are not limited to: driving at or under the speed limit based on GPS information; word usage that does not include offensive language, as well as normal sound levels of the voice; and hand gestures that may not be included in the category of offensive hand gestures.
  • the system adapts the model in real time in order to accommodate a driver’s normal driving performance, including learning the regular driving speed, the regular body parts heat signatures, and the normal noise levels inside the cabin. Then, using reinforcement learning techniques, the system readjusts the parameters and models that are currently in use to determine if the levels of distress and road rage are within a normal range of driving performance for the particular driver.
  • FIG. 1B illustrates a sample data flow and sample data analysis schematics using multiple modalities.
  • the system automatically detects and tracks the driver’s face and eyes using image-sensor 101 (e.g., thermal and NIR) streams as input, in an embodiment.
  • the system can recognize the heat change and facial expressions of the driver.
  • the system detects the driver’s hands using the image sensor stream, in various embodiments. Using the hand regions as spatial anchors, the system recognizes the driver’s gestures.
  • the system can also use the audio stream acquired from a microphone inside the vehicle as an input to analyze the driver’s voice and sounds from inside the vehicle, in an embodiment.
  • the rage and distress detector 21 analyses the inputs to understand the driver’s driving performance.
  • the audio sensor 102 in FIGS. 1A-1B can be an in-vehicle microphone or a smartphone microphone, in various embodiments.
  • the audio sensor 102 can be used to record various audio features, including, but not limited to, speech recognition, as there are certain key words and tone intensities that indicate that the driver is distressed; speech volume (whether the driver is speaking or there are passengers’ voices in the audio signal) ; or whether the driver is hitting/banging a part of the vehicle’s cabin with their hands, during a moment of distress and rage.
  • Other factors may be part of the environment outside the vehicle, such as other vehicles honking or other drivers shouting at the driver. Sounds outside the vehicle may also be factors that can increase distress on the driver, and this distress may lead to road rage.
  • the system may learn what specific and repetitive sounds may lead to increases in distress and road rage levels for a driver.
  • the OBD access device or vehicle data device 103 in FIGS. 1A-1B receives, processes, and stores sensor and driving information, and provides such sensor and driving information to the rage and distress detector 21 and the driver performance analyzer 22.
  • the OBD access device 103 can be manufactured by the vehicle’s original equipment manufacturer (OEM) , or can be an aftermarket device.
  • the OBD access device 103 can have access to a controller area network (CAN) bus, for instance, through an OBD logger, and can access sensors, such as an accelerometer, a gyroscope, a GPS sensor, and other types of sensors, and further can communicate with user devices, such as smartphones, using a wired or wireless connection, in various embodiments.
  • CAN controller area network
  • the driver performance analyzer 22 is used to evaluate driving performance impairment under distress and road rage. When the driver is distressed or enraged, he/she typically reacts more erratically (and at times with a slower reaction time) .
  • a two-level model of performance impairment may be used in this system. In the first level, which represents relatively minor degradation, drivers are generally able to control the vehicle accurately, and there is no significant reduction of driving performance. In the second level, as impairment becomes more severe, drivers become less able to maintain the same driving performance.
  • the surrounding environment processor 23 in FIG. 1B can use the video frames coming from the outside-facing image sensor 105, as well as the GPS data from the GPS input 104, to detect road conditions such as potholes, lane markers, and road curvature, and surrounding objects such as other vehicles, pedestrians, motorcycles, bicycles, and traffic signs or lights. Other road conditions and surrounding objects can be detected without departing from the scope of the present subject matter.
  • the driver feedback 26 can be used with reinforcement 25 learning algorithms by updating the distress/road rage detector models using the buffered streams.
  • the distress and road rage management processor 24 generates warnings and suggests corrective actions for the driver.
  • FIG. 2 is a flow diagram illustrating a method 200 for detection of driver and passenger distress, according to various embodiments.
  • a processor is used to receive inputs from a plurality of sensors, including interior vehicle image sensors, an interior vehicle audio sensor, vehicle data sensors, and GPS data sensors. The processor is used to process the received inputs to obtain a driver or passenger heat change estimate, a driver or passenger expression estimate, a driver gesture estimate, an OBD estimate, and a GPS estimate, at 210.
  • the processor is used to store the estimates in a memory, and at 220, the processor and the stored estimates are used to generate deviation scores for each of the driver or passenger heat change estimate, the driver or passenger expression estimate, the driver gesture estimate, the OBD estimate, and the GPS estimate.
  • the processor is used to execute a machine learning algorithm to classify driver behavior as normal or impaired based on the deviation scores, and at 230, the processor is used to generate a warning if the classification indicates impaired driver behavior.
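  • As an illustrative aid only, the following Python sketch outlines one way the FIG. 2 flow (reference numerals 210-230) could be organized in software; the class name DistressMonitor, the caller-supplied per-modality estimators and scorers, and the warning text are hypothetical and are not part of the disclosed claims.

```python
# Hypothetical orchestration of the FIG. 2 flow (210: estimates, 220: deviation
# scores, 230: classification and warning). The per-modality estimators,
# scorers, and classifier are supplied by the caller.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class DistressMonitor:
    estimators: Dict[str, Callable]          # modality name -> raw input -> estimate
    deviation_scorers: Dict[str, Callable]   # modality name -> estimate -> deviation score
    classifier: Callable                     # deviation scores -> "normal" or "impaired"
    history: List[Dict] = field(default_factory=list)

    def step(self, sensor_inputs: Dict[str, object]) -> str:
        # 210: process the received inputs into per-modality estimates
        estimates = {name: est(sensor_inputs[name]) for name, est in self.estimators.items()}
        self.history.append(estimates)       # store the estimates in memory
        # 220: use the stored estimates to generate deviation scores
        scores = {name: scorer(estimates[name]) for name, scorer in self.deviation_scorers.items()}
        # 230: classify driver behavior and warn if impaired
        label = self.classifier(scores)
        if label == "impaired":
            print("WARNING: possible distress or road rage detected", scores)
        return label

# Toy usage with trivial stand-in callables.
monitor = DistressMonitor(
    estimators={"speed": lambda x: x},
    deviation_scorers={"speed": lambda v: abs(v - 60) / 10.0},
    classifier=lambda s: "impaired" if max(s.values()) > 2.0 else "normal",
)
print(monitor.step({"speed": 95.0}))
```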
  • FIG. 3 is a graph illustrating density of occurrences of driver hand gestures, according to various embodiments.
  • the graph depicts a sample normal distribution model of normal driving performance, which shows how normal driving gestures 302 accumulate toward the middle of the distribution (more common or repetitive), and how less common gestures 304 tend to accumulate at the tails of the distribution (less repetitive or less common).
  • Common hand gestures include holding the steering wheel, while less common hand gestures include a fist gesture or a middle finger gesture by the driver, as shown in the depicted embodiment.
  • FIG. 4 is a block diagram illustrating a system for detection of driver and passenger distress, according to various embodiments.
  • the depicted embodiment shows details of the rage and distress detector 21 from FIG. 1A, as it processes data through several streams.
  • the system receives inputs from the vehicle’s cabin image sensors 2101 including images of the driver 2005, inputs from audio sensors 2102, inputs from vehicle data 2103, and inputs from a GPS sensor 2104.
  • the cabin image sensor 2101 input is processed by a face detector 2111, heat change comparator 2301, expression estimator 2202 and expression density estimator 2302, and further processed by a hand detector 2112, gesture detector 2203 and gesture density estimator 2303, in various embodiments.
  • the audio sensor 2102 input is processed for mel-frequency cepstral coefficients (MFCC) features 2204, MFCC feature density estimator 2304, and by natural language processing (NLP) detector 2205 and NLP density estimator 2305, in various embodiments.
  • the vehicle data 2103 input is processed by OBD measurement generator 2206 and OBD density estimator 2306, and the GPS sensor 2104 input is processed 2207 by GPS features density estimator 2307.
  • MFCC mel-frequency cepstral coefficients
  • NLP natural language processing
  • a normal driving model will be pre-trained using a probabilistic model, such as a Gaussian mixture model and density estimators.
  • expectation maximization (EM) is used to estimate the mixture model’s parameters, including using maximum likelihood estimation techniques, which seek to maximize the probability, or likelihood, of the observed data given the model parameters.
  • the fitted model can be used to perform various forms of inference, in various embodiments.
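  • The following sketch illustrates, under stated assumptions, how such a normal driving density model could be fit by expectation maximization and queried for a deviation score using scikit-learn's GaussianMixture; the feature contents and the negative log-likelihood score mapping are illustrative choices, not the patent's exact design.

```python
# Minimal sketch of a pre-trained "normal driving" density model: a Gaussian
# mixture fit by EM on normal-driving features, then used to score new samples.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
normal_features = rng.normal(loc=0.0, scale=1.0, size=(5000, 4))  # stand-in for normal-driving features

gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=0)
gmm.fit(normal_features)                      # EM maximizes the likelihood of the observed data

def deviation_score(x: np.ndarray) -> float:
    """Higher score = further from the normal-driving density."""
    log_density = gmm.score_samples(x.reshape(1, -1))[0]
    return float(-log_density)                # negative log-likelihood as a simple deviation measure

print(deviation_score(np.zeros(4)))           # near the normal mode -> low score
print(deviation_score(np.full(4, 6.0)))       # far from normal -> high score
```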
  • the deviation scores include, but are not limited to, a heat change deviation score δH from heat change deviation score generator 2401, an expression deviation score δE from expression deviation score generator 2402, a gesture deviation score δG from gesture deviation score generator 2403, an MFCC deviation score δMFCC from MFCC deviation score generator 2404, an NLP deviation score δNLP from NLP deviation score generator 2405, and vehicle OBD deviation scores (such as a vehicle speed deviation score δsp, a steering wheel deviation score δsw, a steering wheel error deviation score δswe, a time-to-lane-crossing deviation score δttl, a time-to-collision deviation score δttc, etc.).
  • these deviation scores will be inputs to a fusion layer 2500, the output of which is used by classifier 2600 to classify the driver state as normal driving behavior or road rage and distress driving behavior, in various embodiments.
  • μi are the component means of the mixture model
  • Σi are the component variances/co-variances of the mixture model
  • FIG. 5 is a block diagram illustrating calculation of a deviation score for a driver or passenger heat change estimate in a system for detection of driver and passenger distress, according to various embodiments.
  • a normal driving model generator 311 collects normal driving thermal images 312 and pre-processes the images for a normal driving model 313.
  • the normal driving model 313 can be generated offline using a statistical analysis method, such as a Gaussian mixture model (GMM) .
  • the pre-processing uses a sequence of continuous image sensor frames from the normal driving model, and obtains the mean reading for each pixel. This mean is then compared with the real-time input of the image sensor to obtain a deviation score.
  • GMM Gaussian mixture model
  • the normal driving model 313 is compared with real-time thermal images 314 using mathematical manipulation 315, such as subtraction.
  • the comparison result output from comparison system 301 is an input to a heat change deviation score generator, which can use a probability density function (PDF) 401 to generate the heat change deviation score 402, δH.
  • PDF probability density function
  • the heat change deviation score 402, δH, can be an input to the fusion layer 2500, as shown in FIG. 4.
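  • A minimal sketch of the FIG. 5 heat-change path is shown below, assuming a per-pixel mean/standard-deviation model of normal-driving thermal frames and a Gaussian PDF applied to the frame difference; the frame size and the scoring rule are assumptions for illustration.

```python
# Offline: build a per-pixel normal-driving thermal model.
# Online: subtract the model from the live frame and score the difference with a PDF.
import numpy as np
from scipy.stats import norm

normal_frames = np.random.rand(200, 32, 32).astype(np.float32)  # stand-in for offline thermal frames
mu = normal_frames.mean(axis=0)            # per-pixel mean of the normal driving model
sigma = normal_frames.std(axis=0) + 1e-6   # per-pixel spread (avoid divide-by-zero)

def heat_change_deviation(live_frame: np.ndarray) -> float:
    diff = live_frame - mu                              # mathematical manipulation (subtraction)
    pixel_likelihood = norm.pdf(diff, loc=0.0, scale=sigma)
    # Low likelihood under the normal model means a large heat change.
    return float(-np.log(pixel_likelihood + 1e-12).mean())

live = np.random.rand(32, 32).astype(np.float32)
print("deviation =", heat_change_deviation(live))
```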
  • the present disclosure also incorporates thermal imaging as part of the multimodal approach.
  • a thermal imaging sensor can be used in order to understand the stress state and the emotional state of a driver, as the skin’s temperature changes based on the activity being performed, and also changes based on the emotional state of a person.
  • Because the skin’s temperature can change not only due to stress, but also based on other factors, such as physical activity, the present subject matter uses additional modalities to determine the stress level of a driver or passenger.
  • the temperature of the driver’s hands is also taken into account, since hand temperature is also a good indicator of emotions and distress states.
  • the present system’s multimodal approach makes use of activity recognition, voice recognition, and all the other aforementioned modalities. Combining all these modalities alongside the thermal signature of both the driver’s face and hands produces a more generic and more robust model resistant to false positives.
  • FIGS. 6A-6B are block diagrams illustrating detection of driver or passenger expression and calculation of a deviation score for a driver or passenger expression estimation 2700, according to various embodiments.
  • the system receives an input from the cabin image sensors 2101, and uses the face detector 2111, a face validator 2702, and a face tracker 2704 to build a face stream 2706.
  • the system uses a real-time human face detection and tracking technique called detection-tracking-validation (DTV) , in various embodiments.
  • DTV detection-tracking-validation
  • the offline trained face detector 2111 localizes frontal faces, and the online trained face validator 2702 decides whether the tracked face corresponds to the driver.
  • a face stream 2706 frame is constructed from the partitioned face/eye regions, and normalized for size adjustment, in various embodiments.
  • a two-dimensional (2D) fully convolutional network (FCN) 2708 with multiple convolutional, pooling, batch normalization, and rectified linear unit (ReLU) layers is applied.
  • the feature map of the last layer of the encoder is reshaped into vector form 2710, and the output is applied to a recurrent neural network (RNN), such as RNN1 2712, which may be a long short-term memory (LSTM) network.
  • RNN recurrent neural network
  • This network is trained offline using back-propagation with facial expression data, in various embodiments.
  • a normal driving expression model is pre-trained using a Gaussian mixture model (GMM) , in various embodiments. While the driver is driving, the real-time expression is compared with the normal driving model, using an expression detector 2714. The calculated expression deviation score is used as an input to the fusion layer 2500, as shown in FIG. 4.
  • GMM Gaussian mixture model
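  • The following PyTorch sketch shows one plausible shape for the expression branch described above: a small convolutional encoder with batch normalization, ReLU, and pooling, the last feature map reshaped to a vector, and an LSTM over the face-stream sequence. All layer sizes, the 64x64 crop size, and the seven-expression head are assumptions, since the disclosure does not fix them; the GMM comparison against the normal driving expression model would follow the pattern shown earlier.

```python
# Illustrative FCN encoder + RNN over a face stream (not the patent's exact network).
import torch
import torch.nn as nn

class ExpressionEncoderRNN(nn.Module):
    def __init__(self, n_expressions: int = 7):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.rnn = nn.LSTM(input_size=32 * 16 * 16, hidden_size=128, batch_first=True)
        self.head = nn.Linear(128, n_expressions)

    def forward(self, face_stream: torch.Tensor) -> torch.Tensor:
        # face_stream: (batch, time, 1, 64, 64) grayscale face crops
        b, t = face_stream.shape[:2]
        feats = self.encoder(face_stream.reshape(b * t, 1, 64, 64))
        feats = feats.reshape(b, t, -1)          # last feature map reshaped into vector form
        out, _ = self.rnn(feats)
        return self.head(out[:, -1])             # expression logits for the sequence

logits = ExpressionEncoderRNN()(torch.randn(2, 8, 1, 64, 64))
print(logits.shape)  # torch.Size([2, 7])
```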
  • FIGS. 7A-7C are graphs illustrating calculation of a deviation score for a driver gesture estimate in a system for detection of driver and passenger distress, according to various embodiments.
  • the system first detects driver gestures, such as a clenched fist, holding the steering wheel, waving hands, pointing at something, holding a smart phone, slapping, or a middle finger gesture.
  • the gesture detector 2203 of FIG. 4 receives an image, hand regions are partitioned, and a two-layer hand stream is constructed, in various embodiments.
  • the hand stream is normalized for size adjustment, and a 2D FCN with multiple convolutional, pooling, batch normalization, and ReLU layers is applied, in various embodiments.
  • the feature map of the last layer of the encoder is reshaped into vector form, and applied to the RNN, in various embodiments.
  • the network is trained offline using back-propagation with gesture data.
  • a normal driving gesture model is pre-trained, and during driving, real-time gestures (shown in FIG. 7B) are compared with the normal driving gesture model, as shown in FIG. 7C, to obtain the gesture deviation score 2403 which is used as an input to the fusion layer 2500.
  • FIG. 7A demonstrates the distribution of gestures detected inside a vehicle.
  • the middle of the graph (the mean or expected gesture) indicates what is considered normal.
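  • As an illustration of the two-layer hand stream construction mentioned above, the sketch below crops the two detected hand regions, resizes them to a common size, and stacks them as two channels ready for the 2D FCN; the bounding-box format and the 64x64 target size are assumptions.

```python
# Build a normalized two-layer hand stream from a frame and two hand boxes.
import numpy as np
import cv2  # OpenCV, assumed available for resizing

def build_hand_stream(frame: np.ndarray, hand_boxes: list, size: int = 64) -> np.ndarray:
    """frame: HxW grayscale image; hand_boxes: [(x, y, w, h), (x, y, w, h)]."""
    layers = []
    for (x, y, w, h) in hand_boxes[:2]:
        crop = frame[y:y + h, x:x + w]
        layers.append(cv2.resize(crop, (size, size)).astype(np.float32) / 255.0)
    while len(layers) < 2:                       # pad if only one hand is visible
        layers.append(np.zeros((size, size), dtype=np.float32))
    return np.stack(layers, axis=0)              # shape (2, size, size)

frame = (np.random.rand(480, 640) * 255).astype(np.uint8)
stream = build_hand_stream(frame, [(100, 200, 80, 90), (400, 210, 85, 95)])
print(stream.shape)  # (2, 64, 64)
```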
  • FIG. 8 is a flow diagram illustrating calculation of a deviation score for mel-frequency cepstral coefficients (MFCC) , according to various embodiments.
  • a time domain audio signal is processed by a sampling step, windowing, and a de-noising step to obtain a speech signal 802, from which the MFCC are then calculated.
  • An MFCC calculator 800 incorporates a fast Fourier transform (FFT) 806, mel scale filtering 808, a logarithmic function 810, a discrete cosine transform 812, and derivatives 814 to obtain a feature vector 804.
  • FFT fast Fourier transform
  • a normal driving MFCC model will be pre-trained using GMM or density estimators. During driving, the MFCC will be compared with the normal driving MFCC model to generate the MFCC deviation score δMFCC (2404).
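  • A hedged sketch of the MFCC feature path (FFT, mel-scale filtering, logarithm, discrete cosine transform, plus first and second derivatives) is shown below using librosa; the sample rate, frame parameters, and 13-coefficient choice are illustrative assumptions, and the resulting feature vector would be scored against a normal driving GMM as described earlier.

```python
# MFCC plus delta features for one second of (stand-in) cabin speech.
import numpy as np
import librosa

sr = 16000
signal = np.random.randn(sr).astype(np.float32)            # stand-in for a de-noised speech signal

mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13)    # FFT, mel filterbank, log, DCT internally
delta = librosa.feature.delta(mfcc)                        # first derivative
delta2 = librosa.feature.delta(mfcc, order=2)              # second derivative
feature_vector = np.concatenate([mfcc, delta, delta2], axis=0)
print(feature_vector.shape)                                 # (39, n_frames)
```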
  • the system uses natural language processing (NLP) to detect cursing and abusing words.
  • NLP natural language processing
  • a normal driving NLP model will be pre-trained using GMM and density estimators, in various embodiments.
  • the driver’s words will be compared with the normal driving NLP model, and the NLP deviation score δNLP (2405) will be calculated as one of the inputs to the fusion layer 2500.
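  • A deliberately simple sketch of such an NLP check is shown below; the offensive-word list, the upstream speech-to-text step, and the rate-based scoring rule are placeholders, since the disclosure only specifies that cursing and abusive words are detected and compared against a normal driving NLP model.

```python
# Toy keyword-based NLP deviation: excess offensive-word rate over the normal-driving rate.
OFFENSIVE_WORDS = {"idiot", "moron"}          # hypothetical, trimmed word list

def nlp_deviation(transcript: str, normal_rate: float = 0.0) -> float:
    words = transcript.lower().split()
    if not words:
        return 0.0
    rate = sum(w in OFFENSIVE_WORDS for w in words) / len(words)
    return max(0.0, rate - normal_rate)        # deviation: excess over the normal-driving rate

print(nlp_deviation("get out of my lane you idiot"))
```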
  • driving performance measurements can be used to generate OBD deviation scores, which include, but are not limited to, vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision.
  • a multi-channel deviation score generator can be used for OBD data, in an embodiment.
  • normal driving OBD data is collected and used to generate measurements, including pre-training a normal driving model to compare with real-time data.
  • Each of the multiple channels is used to calculate a deviation score, such as a vehicle speed deviation score δsp, a steering wheel deviation score δsw, a steering wheel error deviation score δswe, a time-to-lane-crossing deviation score δttl, a time-to-collision deviation score δttc, etc.
  • the deviation scores will be inputs to the fusion layer 2500.
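  • The following sketch illustrates a possible multi-channel OBD deviation generator, with one set of normal-driving statistics per channel and an absolute z-score as the per-channel deviation; the z-score rule and the synthetic statistics are assumptions, as the disclosure only requires a per-channel normal model and a comparison to real-time data.

```python
# Per-channel OBD deviation scores from normal-driving statistics.
import numpy as np

normal_obd = {                                   # offline-collected normal driving data (stand-ins)
    "speed": np.random.normal(60, 5, 10000),
    "steering_angle": np.random.normal(0, 3, 10000),
    "steering_angle_error": np.random.normal(0, 1, 10000),
    "time_to_lane_crossing": np.random.normal(4, 0.5, 10000),
    "time_to_collision": np.random.normal(6, 1.0, 10000),
}
models = {k: (v.mean(), v.std()) for k, v in normal_obd.items()}

def obd_deviation_scores(live: dict) -> dict:
    return {k: abs(live[k] - mu) / sd for k, (mu, sd) in models.items() if k in live}

print(obd_deviation_scores({"speed": 95.0, "time_to_collision": 1.2}))
```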
  • the present system uses GPS data since a vehicle’s location at a given time can offer useful information regarding a driver’s distress and road rage level.
  • Traffic information (such as speed limit, lane direction, no parking zones, location, etc.) is obtained and compared with the vehicle data to compute an initial traffic violation indicator.
  • the system could also use outside-facing sensors (e.g., SONAR, image sensors, LIDAR, etc. ) to detect driving environment factors such as the vehicle’s distance to nearby objects, the location of road lane markers, or traffic signs, as additional sources of information.
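  • A small sketch of an initial traffic violation indicator of the kind described above is given below; the field names and the weighting of speed and lane-direction violations are illustrative assumptions.

```python
# Compare GPS-derived traffic information with vehicle data to get a violation indicator.
def traffic_violation_indicator(traffic: dict, vehicle: dict) -> float:
    score = 0.0
    if "speed_limit" in traffic and vehicle["speed"] > traffic["speed_limit"]:
        score += (vehicle["speed"] - traffic["speed_limit"]) / traffic["speed_limit"]
    if traffic.get("lane_direction") and traffic["lane_direction"] != vehicle.get("heading_lane"):
        score += 1.0                              # driving against the posted lane direction
    return score

print(traffic_violation_indicator({"speed_limit": 50, "lane_direction": "north"},
                                   {"speed": 72, "heading_lane": "north"}))
```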
  • each modality processing module outputs a deviation score as an input to the fusion layer 2500.
  • the deviation score δ indicates the amount of deviation from normal for the modality output.
  • the deviation score can include the dispersion from a standard or average measurement (i.e., how different the current measurements are from what is considered normal) . For example, if a passenger is screaming, then the deviation scores for the current measurements of the noise modalities are going to be high.
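  • To make the fusion and classification step concrete, the sketch below assembles per-modality deviation scores into one vector and applies a trained classifier; logistic regression and the synthetic training data are stand-ins for whatever fusion layer 2500 and classifier 2600 actually learn.

```python
# Fuse per-modality deviation scores and classify normal vs. road rage / distress.
import numpy as np
from sklearn.linear_model import LogisticRegression

MODALITIES = ["heat", "expression", "gesture", "mfcc", "nlp", "speed", "steering", "ttc"]

# Stand-in training data: low deviation scores labelled normal (0), high labelled impaired (1).
X = np.vstack([np.random.rand(200, len(MODALITIES)) * 0.5,
               np.random.rand(200, len(MODALITIES)) * 0.5 + 1.0])
y = np.array([0] * 200 + [1] * 200)
fusion_classifier = LogisticRegression().fit(X, y)

def classify(deviation_scores: dict) -> str:
    vec = np.array([[deviation_scores.get(m, 0.0) for m in MODALITIES]])
    return "road rage / distress" if fusion_classifier.predict(vec)[0] == 1 else "normal driving"

print(classify({"heat": 1.8, "gesture": 2.2, "nlp": 1.5, "speed": 1.9}))
```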
  • the cabin image sensors 2101 can include at least one cabin-facing image sensor, such as an NIR camera, a thermal IR camera, and/or a smartphone camera.
  • the image sensors are used to extract visual features that help the system determine if there is road rage and distress for the driver. Some of these visual features may include, but are not limited to, thermal features such as changes in the driver’s face temperature and changes in the driver’s hand temperature.
  • temperature measurements can come from instantaneous changes in temperature (e.g., temperature at a specific time) , or they may also be tracked over time (e.g., temperature change over an hour of measurement) .
  • the present system is capable of observing the changes in temperature over time.
  • Changes in temperature over time can help the system determine, in combination with other features, if distress is building over time. For example, if the temperature is increasing over time inside the vehicle’s cabin then the driver’s temperature may also increase. This increase can be measured by the camera and then be used as an indication of future distress.
  • the present system also uses the image sensors in order to understand other visual cues, which may include, but are not limited to, facial expressions, as well as hand gestures and activities inside the cabin.
  • For example, after detecting the hands and face of the driver, the image sensors may capture images in which the driver is waving his/her fist, while at the same time the face and hand temperatures are rising and the mouth of the driver is wide open (e.g., screaming).
  • these circumstances can be understood as potential indications of distress and road rage in the driver.
  • FIGS. 9A-9B are flow diagrams illustrating methods for associating hands and a face with a driver using an image sensor stream, according to various embodiments.
  • When the present system detects multiple people inside a vehicle’s cabin, the system learns to match specific hands to specific faces, so the system knows which hands and which face to track.
  • FIG. 9A illustrates an embodiment to match hands and faces from an image sensor stream 902.
  • the depicted method includes algorithms to detect hands at 904, measure the distance between all detected hands and a detected face at 906, match the closest hand(s) to the currently detected face and assign an identification tag (ID) to the face/hand pair at 908, and use that information to build a hand stream at 910.
  • ID identification tag
  • FIG. 9B illustrates another embodiment, where a driver’s skeleton is detected and assigned an identification tag (operation 950), which is used to obtain the positions of the face and hands relative to the skeleton in three-dimensional space (operation 952).
  • the embodiment proceeds with using a hand detector 954 to build a hand stream 956 and a face detector 958 to build a face stream 960, to associate each hand and face with the people inside the vehicle.
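  • The following sketch illustrates the FIG. 9A association rule under simple assumptions (box centers and Euclidean distance): the closest detected hand(s) are matched to each detected face and the pair receives an identification tag.

```python
# Match the nearest detected hands to each detected face and tag the pair.
import math

def box_center(box):                      # box = (x, y, w, h)
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)

def associate_hands_to_faces(face_boxes, hand_boxes):
    pairs = {}
    for face_id, face in enumerate(face_boxes):
        fc = box_center(face)
        ranked = sorted(hand_boxes, key=lambda hb: math.dist(fc, box_center(hb)))
        pairs[face_id] = {"face": face, "hands": ranked[:2]}   # closest hand(s) to this face
    return pairs

faces = [(300, 100, 80, 80)]
hands = [(250, 300, 60, 60), (420, 310, 60, 60), (40, 320, 60, 60)]
print(associate_hands_to_faces(faces, hands))
```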
  • the present subject matter uses machine learning methods to further refine the procedure by adapting to activities between the driver and passengers.
  • the system can learn that distress and road rage may be caused not only by other drivers in the environment, but also by a combination of environmental factors inside the cabin (e.g., kids screaming) .
  • the image sensors can also be used in order to detect hand gestures such as cursing gestures, and other gestures which may have different meanings (e.g., country/culture-dependent gestures) .
  • the system uses stored data to predict whether distress and road rage are occurring, or whether they may occur in the near future.
  • Image sensors embedded in vehicles are becoming common, and some vehicles on the market already include not only external image sensors, but also internal image sensors that can capture the entire vehicle’s cabin.
  • FIGS. 10A-10B illustrate an image sensor of the present subject matter and an example of a captured image from the sensor, according to various embodiments.
  • the image sensor may include an NIR camera 1001 mounted on or inside the dashboard and directed toward the face of a driver 1002, to produce an image 1010 of the driver’s face for further processing.
  • FIG. 11 is a flow diagram illustrating a method for detection of driver and passenger distress, according to various embodiments.
  • the depicted embodiment shows the multimodal approach of the present subject matter, which gathers information from several sources, including but not limited to gesture inputs 1101, emotion inputs 1102, driving behavior inputs 1103, traffic condition inputs 1104, speech inputs 1105, OBD/vehicle status data 1106, and GPS data 1107, to detect driver distress and road rage 1110.
  • Alternative embodiments for this system may also include biosignal sensors that can be attached to the steering wheel, and other points of contact in a vehicle.
  • the transmission clutch for a vehicle may have sensors embedded in the fabric that can measure heartbeats and hand temperature.
  • these biosignal sensors can be embedded in other parts of the vehicle, such as the radio buttons and the control panel buttons.
  • the steering wheel is one of the most-touched parts of the vehicle, so the steering wheel can include one or more biosignal sensors to help better understand the current status of a driver, in various embodiments.
  • the data gathered from these touch sensors embedded in the vehicle’s fabric and equipment can be obtained from the OBD port located inside the vehicle, in an embodiment.
  • Further embodiments may include using a radar, capacitive, or inductive sensor attached to or within a seat of the vehicle, and configured to sense a heartbeat of the occupant. These seat sensors can function in a touchless manner, in an embodiment.
  • Alternative embodiments may also include using the image sensors inside a vehicle in order to perform remote photoplethysmography (rPPG) .
  • Remote photoplethysmography is a technique that uses an image sensor to detect changes in the skin caused, for example, by changes in blood pressure that are a direct consequence of changes in heart rate.
  • Because this is a touchless technology, the same image sensor that is used for detecting facial expressions and activity recognition can also be used to perform photoplethysmography.
  • the image sensor choice could be an RGB imaging sensor, or a near-infrared imaging sensor, in various embodiments.
  • the additional information provided by rPPG can also be combined with the information obtained from a thermal camera.
  • the system can further learn to identify changes in the driver’s skin that are related to stress levels and also to road rage.
  • other methods can be used to detect changes in blood flow in a driver’s face, including the use of the Eulerian video magnification method in order to amplify subtle changes in a person’s face. This can further help the machine learning algorithm to track the changes over time, and predict if the driver will present distress and be prone to road rage.
  • FIG. 12 illustrates a system for detection of driver and passenger distress, according to various embodiments.
  • a mobile phone 1201 is mounted to the windshield as part of the system.
  • the present subject matter uses a number of sensors outside of the mobile phone 1201 for inputs, and is therefore not limited to the onboard sensors of the mobile phone 1201.
  • the present subject matter can use the processor in the mobile phone 1201 as the main computational device, or can use an embedded processor in the vehicle’s computational unit, a designated unit, or a combination of these.
  • FIG. 13 is a schematic diagram illustrating circuitry for implementing devices to perform methods according to example embodiments. Not all components need be used in various embodiments. For example, the computing devices may each use a different set of components and storage devices.
  • One example computing device in the form of a computer 1300 may include a processing unit 1302, memory 1303, removable storage 1310, and non-removable storage 1312.
  • the computing device may be in different forms in different embodiments.
  • the computing device may instead be a smartphone, a tablet, a smartwatch, or another computing device including elements the same as or similar to those illustrated and described with regard to FIG. 13.
  • Devices such as smartphones, tablets, and smartwatches are generally collectively referred to as “mobile devices. ”
  • although the various data storage elements are illustrated as part of the computer 1300, the storage may also or alternatively include cloud-based storage accessible via a network, such as the Internet, or server-based storage.
  • the various components of computer 1300 are connected with a system bus 1320.
  • the memory 1303 may include volatile memory 1314 and/or non-volatile memory 1308.
  • the computer 1300 may include –or have access to a computing environment that includes –a variety of computer-readable media, such as the volatile memory 1314 and/or the non-volatile memory 1308, the removable storage 1310, and/or the non-removable storage 1312.
  • Computer storage includes random access memory (RAM) , read only memory (ROM) , erasable programmable read-only memory (EPROM) or electrically erasable programmable read-only memory (EEPROM) , flash memory or other memory technologies, compact disc read-only memory (CD ROM) , digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium capable of storing computer-readable instructions.
  • the computer 1300 may include or have access to a computing environment that includes an input device 1306, an output device 1304, and a communication interface 1316.
  • the communication interface 1316 includes a transceiver and an antenna.
  • the output device 1304 may include a display device, such as a touchscreen, that also may serve as an input device.
  • the input device 1306 may include one or more of a touchscreen, a touchpad, a mouse, a keyboard, a camera, one or more device-specific buttons, and other input devices.
  • Various embodiments include one or more sensors 1307 integrated within or coupled via wired or wireless data connections to the computer 1300.
  • the computer 1300 may operate in a networked environment using a communication connection to connect to one or more remote computers, such as database servers.
  • the remote computer may include a personal computer (PC) , server, router, network PC, peer device or other common network node, or the like.
  • the communication connection may include a Local Area Network (LAN) , a Wide Area Network (WAN) , a cellular network, a WiFi network, a Bluetooth network, or other networks.
  • Computer-readable instructions, e.g., a program 1318, comprise instructions stored on a computer-readable medium that are executable by the processing unit 1302 of the computer 1300.
  • a hard drive, CD-ROM, or RAM are some examples of articles including a non-transitory computer-readable medium, such as a storage device.
  • the terms “computer-readable medium” and “storage device” do not include carrier waves to the extent that carrier waves are deemed too transitory.
  • Storage can also include networked storage such as a storage area network (SAN) .
  • FIG. 14 is a schematic diagram illustrating circuitry for implementing devices to perform methods according to example embodiments.
  • One example computing device in the form of a computer 1400 may include a processing unit 1402, memory 1403 where programs run, a general storage component 1410, and deep learning model storage 1411.
  • although the example computing device is illustrated and described as computer 1400, the computing device may be in different forms in different embodiments.
  • the computing device may instead be a smartphone, a tablet, a smartwatch, an embedded platform, or other computing device including the same or similar elements as illustrated and described with regard to FIG. 14.
  • the various components of computer 1400 are connected with a system bus 1420.
  • Memory 1403 may include storage for programs including, but not limited to, face detection program 1431, and gesture detection program 1432, as well as storage for audio data processing 1433, and sensor data 1434.
  • Computer 1400 may include or have access to a computing environment that includes inputs 1406, system output 1404, and a communication interface 1416.
  • the communication interface 1416 includes a transceiver and an antenna, as well as ports, such as OBD ports.
  • System output 1404 may include a display device, such as a touchscreen, that also may serve as an input device. The system output 1404 may provide an audible or visual warning, in various embodiments.
  • the inputs 1406 may include one or more of a touchscreen, touchpad, mouse, keyboard, camera, microphone, one or more device-specific buttons, and/or one or more sensor inputs such as image sensor input 1461, audio signal input 1462, vehicle data input 1463, and GPS data input 1464. Additional inputs may be used without departing from the scope of the present subject matter.
  • Computer-readable instructions, i.e., a program such as the face detection program 1431, comprise instructions stored on a computer-readable medium that are executable by the processing unit 1402 of the computer 1400.
  • a computer program may be stored or distributed on a suitable medium, such as an optical storage medium or a solid-state medium supplied together with, or as part of, other hardware, but may also be distributed in other forms, such as via the Internet or other wired or wireless telecommunication systems.
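The hand-to-face association outlined above for FIG. 9A can be illustrated with a short sketch. This is a minimal example under assumed inputs (detector outputs reduced to 2-D centers, a fixed distance threshold, one ID per face); it is not the patented implementation.

```python
import numpy as np

def assign_hands_to_faces(face_centers, hand_centers, max_dist=250.0):
    """Nearest-neighbour association of detected hands to detected faces.

    face_centers, hand_centers: lists of (x, y) pixel coordinates from the
    face and hand detectors. Returns {face_id: [hand indices]} so that a
    per-person hand stream can be built from the matched detections.
    """
    faces = np.asarray(face_centers, dtype=float)
    hands = np.asarray(hand_centers, dtype=float)
    pairs = {i: [] for i in range(len(faces))}
    if len(faces) == 0 or len(hands) == 0:
        return pairs
    # Distance matrix between every detected hand and every detected face.
    dists = np.linalg.norm(hands[:, None, :] - faces[None, :, :], axis=-1)
    for h in range(len(hands)):
        f = int(np.argmin(dists[h]))          # closest face for this hand
        if dists[h, f] <= max_dist:           # ignore implausible matches
            pairs[f].append(h)
    return pairs

# Example: two faces, three detected hands.
faces = [(200, 180), (520, 190)]
hands = [(160, 340), (250, 330), (560, 350)]
print(assign_hands_to_faces(faces, hands))   # {0: [0, 1], 1: [2]}
```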

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Human Computer Interaction (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Automation & Control Theory (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mechanical Engineering (AREA)
  • Transportation (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Traffic Control Systems (AREA)

Abstract

A system for determining distress of a driver of a vehicle is provided, comprising a plurality of sensors, including interior vehicle image sensors, an interior vehicle audio sensor, vehicle data sensors, and Global Positioning System (GPS) data sensors. The system also includes one or more processors configured to receive inputs from the plurality of sensors, and process the received inputs to obtain a driver heat change estimate, a driver expression estimate, a driver gesture estimate, an on-board diagnostics (OBD) estimate, and a GPS estimate. The one or more processors are further configured to store the estimates in a memory, use the stored estimates to generate deviation scores for each of the estimates, execute a machine learning algorithm to classify driver behavior as normal or impaired based on the deviation scores, and generate a warning based on the classification indicating impaired driver behavior.

Description

METHOD FOR DISTRESS AND ROAD RAGE DETECTION
Cross-reference to Related Applications
This application claims priority to U.S. Application 15/902,729, filed on February 22, 2018, and entitled “Method for Distress and Road Rage Detection, ” which is hereby incorporated by reference in its entirety.
Technical Field
The present disclosure relates to automated systems for vehicle safety, and in particular to systems and methods for detection of driver and passenger distress and road rage.
Background
Automated systems for vehicle safety have been adapted for collision avoidance. Previous systems for detection of road rage by a driver of a vehicle have focused on invasive systems such as blood pressure and heart rate monitoring, and noninvasive systems that use mainly images and vocal recording. In addition, previous invasive systems use extensive sensor installation and complicated data collection, while noninvasive systems rely on interpretation of subtle cues, which may vary among individual drivers.
Summary
Methods, apparatus, and systems are provided for detection of driver and passenger distress and road rage. Various examples are now described to introduce a selection of concepts in a simplified form that are further described below in the detailed description. The Summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
According to one aspect of the present disclosure, a method for determining distress of a driver of a vehicle is provided. The method comprises receiving inputs from a plurality of sensors by one or more processors, the sensors including interior vehicle image sensors, an interior vehicle audio sensor, vehicle data sensors, and Global Positioning System (GPS) data sensors, and processing the received inputs to obtain a driver heat change estimate, a driver expression estimate, a driver gesture estimate, an on-board diagnostics (OBD) estimate, and a GPS estimate. The estimates are stored in a memory, and the stored estimates are used to generate deviation scores for each of the driver heat change estimate, the driver expression estimate, the driver gesture estimate, the OBD estimate, and the GPS estimate. A  machine learning algorithm is executed by the one or more processors to classify driver behavior as normal or impaired based on the deviation scores, and to generate a warning based on the classification indicating impaired driver behavior.
Optionally, in any of the preceding aspects, generating the deviation score for the driver or passenger heat change estimate includes: generating a normal driving model offline using normal driving thermal images of the driver or a passenger; comparing the normal driving model with real-time thermal image data of the driver or the passenger to obtain a comparison result; and applying a probability density function (PDF) to the comparison result to obtain the deviation score for the driver or passenger heat change estimate.
Optionally, in any of the preceding aspects, generating the deviation score for the driver or passenger expression estimate includes: using detection-tracking-validation (DTV) to localize frontal face images of the driver or a passenger; constructing a face stream frame from a partitioned face region of the frontal face images; applying a fully convolutional network (FCN) to the face stream frame using an encoder, including using multiple convolutional, pooling, batch normalization, and rectified linear unit (ReLU) layers; reshaping a feature map of a last layer of the encoder into vector form to obtain an output, and applying the output to a recurrent neural network (RNN) to obtain a normal driving expression model using a Gaussian mixture model (GMM) ; and comparing a real-time driver or passenger expression with the normal driving expression model to calculate the deviation score for the driver or passenger expression estimate.
Optionally, in any of the preceding aspects, generating the deviation score for the driver or passenger gesture estimate includes: detecting driver or passenger gestures to obtain an image of a hands region of the driver or passenger; constructing a two-layer hand stream from the image and normalizing the two-layer hand stream for size adjustment; applying a fully convolutional network (FCN) to the two-layer hand stream using an encoder, including using multiple convolutional, pooling, batch normalization, and rectified linear unit (ReLU) layers; reshaping a feature map of a last layer of the encoder into vector form to obtain an output, and applying the output to a recurrent neural network (RNN) to obtain a normal driving gesture model using a Gaussian mixture model (GMM) ; and comparing a real-time driver or passenger gesture with the normal driving or passenger gesture model to calculate the deviation score for the driver or passenger gesture estimate.
Optionally, in any of the preceding aspects, generating the deviation score for the OBD estimate includes: collecting normal driving data from OBD related to two or more  of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision; using the normal driving data to generate a normal driving model for each of the two or more of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision; and comparing real-time data to the normal driving model for each of the two or more of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision to generate a deviation score for each of the two or more of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision.
Optionally, in any of the preceding aspects, the warning includes a visual alert. Optionally, in any of the preceding aspects, the warning includes an audio output. Optionally, in any of the preceding aspects, the warning includes a suggested corrective driver action using a display. Optionally, in any of the preceding aspects, using the processor to execute the machine learning algorithm to classify the driver behavior as normal or impaired includes using a Gaussian mixture model (GMM) . Optionally, in any of the preceding aspects, expectation maximization is used to estimate model parameters of the GMM. Optionally, in any of the preceding aspects, the processor is configured to generate a normal driving model offline for comparison to real-time driving data.
According to another aspect of the present disclosure, a system for determining distress of a driver of a vehicle is provided. The system comprises a plurality of sensors, including interior vehicle image sensors, an interior vehicle audio sensor, vehicle data sensors, and Global Positioning System (GPS) data sensors, and a processor in communication with the plurality of sensors. The processor is configured to: receive inputs from the plurality of sensors, process the received inputs to obtain a driver or passenger heat change estimate, a driver or passenger expression estimate, a driver or passenger gesture estimate, an on-board diagnostics (OBD) estimate, and a GPS estimate, and store the estimates in a memory. The stored estimates are used to generate deviation scores for each of the driver or passenger heat change estimate, the driver or passenger expression estimate, the driver gesture estimate, the OBD estimate, and the GPS estimate. A machine learning algorithm is executed to classify driver behavior as normal or impaired based on the deviation scores, and a warning is generated based on the classification indicating impaired driver or passenger behavior.
Optionally, in any of the preceding aspects, the plurality of sensors further includes exterior-facing sensors of the vehicle. Optionally, in any of the preceding aspects, the processor is further configured to receive a traffic information input, including at least  one of a speed limit and a lane direction. Optionally, in any of the preceding aspects, the warning includes a suggested corrective driver action using a display.
According to another aspect of the present disclosure, a non-transitory computer-readable medium is provided, the medium storing computer instructions to determine distress of a driver of a vehicle and provide a warning, that when executed by one or more processors, cause the one or more processors to perform steps of: receiving inputs from a plurality of sensors, including interior vehicle image sensors, an interior vehicle audio sensor, vehicle data sensors, and Global Positioning System (GPS) data sensors; processing the received inputs to obtain a driver or passenger heat change estimate, a driver or passenger expression estimate, a driver gesture estimate, an on-board diagnostics (OBD) estimate, and a GPS estimate; storing the estimates in a memory; using the stored estimates to generate deviation scores for each of the driver or passenger heat change estimate, the driver or passenger expression estimate, the driver gesture estimate, the OBD estimate, and the GPS estimate; executing a machine learning algorithm to classify driver behavior as normal or impaired based on the deviation scores; and generating the warning based on the classification indicating impaired driver or passenger behavior.
Optionally, in any of the preceding aspects, generating the deviation score for the driver or passenger heat change estimate includes: generating a normal driving model offline using normal driving thermal images of the driver or a passenger; comparing the normal driving model with real-time thermal image data of the driver or the passenger to obtain a comparison result; and applying a probability density function (PDF) to the comparison result to obtain the deviation score for the driver or passenger heat change estimate.
Optionally, in any of the preceding aspects, generating the deviation score for the driver or passenger expression estimate includes: using detection-tracking-validation (DTV) to localize frontal face images of the driver or a passenger; constructing a face stream frame from a partitioned face region of the frontal face images; applying a fully convolutional network (FCN) to the face stream frame using an encoder, including using multiple convolutional, pooling, batch normalization, and rectified linear unit (ReLU) layers; reshaping a feature map of a last layer of the encoder into vector form to obtain an output, and applying the output to a recurrent neural network (RNN) to obtain a normal driving expression model using a Gaussian mixture model (GMM) ; and comparing a real-time driver or passenger expression with the normal driving expression model to calculate the deviation score for the driver or passenger expression estimate.
Optionally, in any of the preceding aspects, generating the deviation score for the driver or passenger gesture estimate includes: detecting driver gestures to obtain an image of a hands region of the driver or passenger; constructing a two-layer hand stream from the image and normalizing the two-layer hand stream for size adjustment; applying a fully convolutional network (FCN) to the two-layer hand stream using an encoder, including using multiple convolutional, pooling, batch normalization, and rectified linear unit (ReLU) layers; reshaping a feature map of a last layer of the encoder into vector form to obtain an output, and applying the output to a recurrent neural network (RNN) to obtain a normal driving or passenger gesture model using a Gaussian mixture model (GMM) ; and comparing a real-time driver or passenger gesture with the normal driving or passenger gesture model to calculate the deviation score for the driver or passenger gesture estimate.
Optionally, in any of the preceding aspects, generating the deviation score for the OBD estimate includes: collecting normal driving data from OBD related to two or more of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision; using the normal driving data to generate a normal driving model for each of the two or more of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision; and comparing real-time data to the normal driving model for each of the two or more of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision to generate a deviation score for each of the two or more of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision.
This Summary is an overview of some of the teachings of the present application and not intended to be an exclusive or exhaustive treatment of the present subject matter. Further details about the present subject matter are found in the detailed description and appended claims. The scope of the present inventive subject matter is defined by the appended claims and their legal equivalents.
Brief Description of the Drawings
FIGS. 1A-1B are block diagrams illustrating systems for detection of driver and passenger distress, according to various embodiments.
FIG. 2 is a flow diagram illustrating a method for detection of driver and passenger distress, according to various embodiments.
FIG. 3 is a graph illustrating density of occurrences of driver hand gestures, according to various embodiments.
FIG. 4 is a block diagram illustrating a system for detection of driver and passenger distress, according to various embodiments.
FIG. 5 is a block diagram illustrating calculation of a deviation score for a driver or passenger heat change estimate in a system for detection of driver and passenger distress, according to various embodiments.
FIGS. 6A-6B are block diagrams illustrating detection of driver or passenger expression and calculation of a deviation score for a driver or passenger expression estimation, according to various embodiments.
FIGS. 7A-7C are graphs illustrating calculation of a deviation score for a driver gesture estimate in a system for detection of driver and passenger distress, according to various embodiments.
FIG. 8 is a flow diagram illustrating calculation of a deviation score for mel-frequency cepstral coefficients (MFCC) , according to various embodiments.
FIGS. 9A-9B are flow diagrams illustrating methods for associating hands and a face with a driver using an image sensor stream, according to various embodiments.
FIGS. 10A-10B illustrate an image sensor of the present subject matter and an example of a captured image from the sensor, according to various embodiments.
FIG. 11 is a flow diagram illustrating a method for detection of driver and passenger distress, according to various embodiments.
FIG. 12 illustrates a system for detection of driver and passenger distress, according to various embodiments.
FIG. 13 is a diagram illustrating circuitry for implementing devices to perform methods according to an example embodiment.
FIG. 14 is a schematic diagram illustrating circuitry for implementing devices to perform methods according to example embodiments.
Detailed Description
In the following description, reference is made to the accompanying drawings that form a part hereof, and in which are shown by way of illustration specific embodiments which may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the inventive subject matter, and it is to be understood that other embodiments may be utilized and that structural, logical, and electrical changes may be made without departing from the scope of the present disclosure. The following description  of example embodiments is, therefore, not to be taken in a limiting sense, and the scope of the present disclosure is defined by the appended claims.
When drivers are angry or distressed, they become more aggressive and less attentive, which can lead to accidents. Embodiments of the present subject matter monitor distress and road rage in real time as part of a driver assistance system. The recognition of distress and road rage typically relies on interpretation of very subtle cues, which may vary among individuals. Therefore, embodiments of the present subject matter monitor a plurality of modalities (such as facial expressions, hand gestures, vehicle speed, etc. ) in order to create a robust system, which can be used to detect changes in driver temperament.
Road rage can be classified into four stages: in stage 1, when a driver is annoyed by somebody, they usually start making non-threatening gestures or facial expressions to show annoyance; in stage 2, after showing their dissatisfaction, angry drivers can escalate the situation by honking, flashing lights, braking maliciously, tailgating, and blocking vehicles; in stage 3, aggressive drivers might curse, yell, and threaten another driver; in stage 4, a worst case is that some drivers might fire a gun, hit a vehicle with objects, chase a vehicle, or run a vehicle off the road.
The present subject matter provides a distress and road rage monitoring system, which can monitor a driver or passenger to detect levels of distress and road rage and provide a notification if distress or road rage is detected. The system incorporates, but is not limited to, thermal imaging, speech, and visual information together, as well as other modalities, such as driving performance and hand gestures, in various embodiments. The inputs to a processing unit can be information originating from audio sensors, image sensors (e.g., near-infrared reflectance (NIR) cameras or thermal cameras) , and overall vehicle data. The system can then assist the driver or passenger to reduce the possibility of an incident. By using a multimodal approach, the present system can obtain important information that otherwise cannot be obtained when relying on just a single source of information. Each modality can provide information that may not be found in a different modality (e.g., image information from an image sensor vs. sound information from a sound transducer) .
In addition, embodiments of the present subject matter use neural networks, reinforcement learning, and other machine learning techniques in order for the system to learn which features about the driver and the vehicle can be useful when detecting road rage and stress.
The present disclosure relates to automated systems for vehicle safety, and in particular to systems and methods for detection of driver and passenger distress and road rage.  While examples are provided for driver detection, the systems can also be used for passenger detection, in various embodiments. In one embodiment, a system for determining distress of a driver of a vehicle is provided, comprising a plurality of sensors, including, but not limited to, interior vehicle image sensors, an interior vehicle audio sensor, vehicle data sensors, and Global Positioning System (GPS) data sensors. The system also includes a processor configured to receive inputs from the plurality of sensors, and process the received inputs to obtain a driver or passenger heat change estimate, a driver or passenger expression estimate, a driver gesture estimate, an on-board diagnostics (OBD) estimate, and a GPS estimate. The processor is further configured to store the estimates in a memory, use the stored estimates to generate deviation scores for each of the estimates, execute a machine learning algorithm to classify driver behavior as normal or impaired based on the deviation scores, and generate a warning if the classification indicates impaired driver behavior.
The functions or algorithms described herein may be implemented in software in one embodiment. The software may consist of computer-executable instructions stored on computer-readable media or a computer-readable storage device such as one or more non-transitory memories or other types of hardware-based storage devices, either local or networked. Further, such functions correspond to modules, which may be software, hardware, firmware, or any combination thereof. Multiple functions may be performed in one or more modules as desired, and the embodiments described are merely examples. The software may be executed on a digital signal processor, application-specific integrated circuit (ASIC) , microprocessor, or other type of processor operating on a computer system, such as a personal computer, server, or other computer system, turning such a computer system into a specifically programmed machine.
According to one aspect of the present disclosure, a method for determining distress of a driver of a vehicle is provided. The method comprises receiving inputs from a plurality of sensors, including interior vehicle image sensors, an interior vehicle audio sensor, vehicle data sensors, and Global Positioning System (GPS) data sensors, and processing the received inputs to obtain a driver or passenger heat change estimate, a driver or passenger expression estimate, a driver or passenger gesture estimate, an on-board diagnostics (OBD) estimate, and a GPS estimate. The estimates are stored in a memory, and the stored estimates are used to generate deviation scores for each of the driver or passenger heat change estimate, the driver or passenger expression estimate, the driver or passenger gesture estimate, the OBD estimate, and the GPS estimate. A machine learning algorithm is executed to classify  driver behavior as normal or impaired based on the deviation scores, and to generate a warning if the classification indicates impaired driver or passenger behavior.
In various embodiments, a computer-implemented system determines driving information of the driver (and passengers if available) , the driving information being based on sensor information collected by image sensors, location sensors, sound sensors, and vehicle sensors, which can then be used in order to understand the driver’s and passengers’ states, so that the system can further determine if distress or road rage is present in the driver’s state. The system uses machine learning and pre-trained models in order to learn how to predict distress and road rage, in various embodiments, and stores this model in memory. Machine learning techniques, such as reinforcement learning, allow the system to adapt to the driver’s distress/road rage driving performance, and non-distress/road rage driving performance, in various embodiments.
Systems and methods of the present subject matter generate a prediction model of the driver’s distress and road rage level. Various embodiments of a method include identifying the driver and passengers inside the vehicle, identifying the hands and faces of the driver and passengers, tracking the hands and faces, and using this information in order to detect facial expressions, gestures, thermal states, and activities that are indicators of distress. The method further includes identifying the state of the environment, such as traffic conditions, objects near the vehicle, sounds around the vehicle (such as other vehicles honking) , road conditions, and the speed limit, in various embodiments. In various embodiments, the method also includes obtaining driving performance data, such as acceleration, speed, steering angle, and other embedded sensor data. Other inputs can be used without departing from the scope of the present subject matter. The method includes fusing the aforementioned indicators, states, and data to determine if the driver is enraged or distressed, in various embodiments.
Various embodiments of the present system use a multimodal approach (e.g., multiple data streams, such as images, audio, vehicle data, etc. ) , such as described with respect to FIG. 4 below, where each modality can be used to detect features that help the system understand the driver’s distress and road rage levels. In various embodiments, the system can adapt and learn ways different drivers may display rage and distress expressions, and determine driver preferences for how warning and driving assistance are to be provided. For example, some drivers prefer frequent and repetitive warnings, which will provide assistance until the driver calms down, while other drivers prefer short warnings, because these drivers may be distracted by the alarms and warnings. In various embodiments, driver  assistance may include reducing or limiting the speed of the vehicle, applying the brakes of the vehicle, or vibrating the steering wheel. Other types of driver assistance can be used without departing from the scope of the present subject matter. Various embodiments of the present system also accept driver feedback using reinforcement learning, which allows the system to continuously adapt to the driver. Advantages of the technical improvements of the present subject matter include that the present systems provide the desired warnings without requiring invasive sensing, such as blood pressure cuffs or other special equipment.
FIGS. 1A-1B are block diagrams illustrating systems for detection of driver and passenger distress, according to various embodiments. The depicted embodiment includes a plurality of sensors 100 including at least one image sensor 101, at least one audio sensor 102, an OBD access device 103 for obtaining vehicle data, and a GPS input 104 for obtaining vehicle location. Various embodiments include a processing unit 10, and an output (such as generation of an audible or visual alert or taking control of the vehicle) 20 generated by the processing unit 10 based on the condition of a driver 5. Various embodiments also include an outside-facing image sensor 105 that records information about the environment outside the vehicle, as shown in FIG. 1B. In various embodiments, the processing unit 10 can include any platform that has capabilities to run neural processing computations, such as existing vehicle hardware, a mobile phone, or a dedicated device that is connected to vehicle OBD and GPS. The processing unit 10 can include a rage and distress detector 21, a driver performance analyzer 22, a surrounding environment processor 23, a distress and road rage management processor 24, a module for reinforcement 25 and an input for driver feedback 26 in various embodiments. The rage and distress detector 21 uses statistical models that allow the system to use statistical classification to determine distress and road rage levels. Since there are different levels of distress and road rage, the system can use a reference point for each of the modalities that are used as input. For example, the system uses a statistical distribution model that determines how far from normal or from the average the currently detected distress and road rage are. The system can learn offline or in real time a normal driving baseline for a particular driver, and depending on how far the driving performance has deviated from the normal driving performance, the system determines if the driver’s distress and road rage levels are acceptable. Some indicators of normal driving can include, but are not limited to: driving at or under the speed limit based on GPS information; word usage that does not include offensive language, as well as normal sound levels of the voice; and hand gestures that may not be included in the category of offensive hand gestures. In various embodiments, the system adapts the model in real time in order to accommodate a  driver’s normal driving performance, including learning the regular driving speed, the regular body parts heat signatures, and the normal noise levels inside the cabin. Then, using reinforcement learning techniques, the system readjusts the parameters and models that are currently in use to determine if the levels of distress and road rage are within a normal range of driving performance for the particular driver.
FIG. 1B illustrates a sample data flow and sample data analysis schematics using multiple modalities. The system automatically detects and tracks the driver’s face and eyes using image-sensor 101 (e.g., thermal and NIR) streams as input, in an embodiment. Using the face and eyes, the system can recognize the heat change and facial expressions of the driver. The system detects the driver’s hands using the image sensor stream, in various embodiments. Using the hand regions as spatial anchors, the system recognizes the driver’s gestures. The system can also use the audio stream acquired from a microphone inside the vehicle as an input to analyze the driver’s voice and sounds from inside the vehicle, in an embodiment. The rage and distress detector 21 analyses the inputs to understand the driver’s driving performance.
The audio sensor 102 in FIGS. 1A-1B can be an in-vehicle microphone or a smartphone microphone, in various embodiments. The audio sensor 102 can be used to record various audio features, including, but not limited to, speech recognition, as there are certain key words and tone intensities that indicate that the driver is distressed; speech volume (whether the driver is speaking or there are passengers’ voices in the audio signal) ; or whether the driver is hitting/banging a part of the vehicle’s cabin with their hands, during a moment of distress and rage. Other factors may be part of the environment outside the vehicle, such as other vehicles honking or other drivers shouting at the driver. Sounds outside the vehicle may also be factors that can increase distress on the driver, and this distress may lead to road rage. Using machine learning, the system may learn what specific and repetitive sounds may lead to increases in distress and road rage levels for a driver.
The OBD access device or vehicle data device 103 in FIGS. 1A-1B receives, processes, and stores sensor and driving information, and provides such sensor and driving information to the rage and distress detector 21 and the driver performance analyzer 22. The OBD access device 103 can be manufactured by the vehicle’s original equipment manufacturer (OEM) , or can be an aftermarket device. The OBD access device 103 can have access to a controller area network (CAN) bus, for instance, through an OBD logger, and can access sensors, such as an accelerometer, a gyroscope, a GPS sensor, and other types of  sensors, and further can communicate with user devices, such as smartphones, using a wired or wireless connection, in various embodiments.
The driver performance analyzer 22 is used to evaluate driving performance impairment under distress and road rage. When the driver is distressed or enraged, he/she typically reacts more erratically (and at times with a slower reaction time) . A two-level model of performance impairment may be used in this system. In the first level, which represents relatively minor degradation, drivers are generally able to control the vehicle accurately, and there is no significant reduction of driving performance. In the second level, as impairment becomes more severe, drivers become less able to maintain the same driving performance.
The surrounding environment processor 23 in FIG. 1B can use the video frames coming from the outside-facing image sensor 105, as well as the GPS data from the GPS input 104, to detect road conditions such as potholes, lane markers, and road curvature, and surrounding objects such as other vehicles, pedestrians, motorcycles, bicycles, and traffic signs or lights. Other road conditions and surrounding objects can be detected without departing from the scope of the present subject matter. The driver feedback 26 can be used with reinforcement 25 learning algorithms by updating the distress/road rage detector models using the buffered streams. In various embodiments, the distress and road rage management processor 24 generates warnings and suggests corrective actions for the driver.
FIG. 2 is a flow diagram illustrating a method 200 for detection of driver and passenger distress, according to various embodiments. At 205, a processor is used to receive inputs from a plurality of sensors, including interior vehicle image sensors, an interior vehicle audio sensor, vehicle data sensors, and GPS data sensors. The processor is used to process the received inputs to obtain a driver or passenger heat change estimate, a driver or passenger expression estimate, a driver gesture estimate, an OBD estimate, and a GPS estimate, at 210. At 215, the processor is used to store the estimates in a memory, and at 220, the processor and the stored estimates are used to generate deviation scores for each of the driver or passenger heat change estimate, the driver or passenger expression estimate, the driver gesture estimate, the OBD estimate, and the GPS estimate. At 225, the processor is used to execute a machine learning algorithm to classify driver behavior as normal or impaired based on the deviation scores, and at 230, the processor is used to generate a warning if the classification indicates impaired driver behavior.
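The overall flow of FIG. 2 can be sketched as a single processing cycle. The sketch below is illustrative only: the per-modality estimators are collapsed into single numeric readings, the normal driving models are reduced to (mean, standard deviation) pairs, and the classifier is a simple threshold on the averaged deviation scores — all assumptions made for the example.

```python
import numpy as np

def deviation(value, mean, std):
    """Per-modality deviation score: distance from the normal-model mean in std units."""
    return abs(value - mean) / std

def run_cycle(readings, normal_models, threshold=2.0):
    """One pass of the FIG. 2 flow: receive inputs, score each estimate, classify, warn."""
    scores = {m: deviation(readings[m], *normal_models[m]) for m in readings}   # steps 205-220
    impaired = np.mean(list(scores.values())) > threshold                       # step 225 (stand-in classifier)
    if impaired:                                                                 # step 230
        print("WARNING: possible distress/road rage", scores)
    return scores, impaired

# Hypothetical normal-driving models (mean, std) and one real-time reading per modality.
normal = {"face_temp": (34.0, 0.4), "voice_level": (55.0, 6.0), "speed": (60.0, 8.0)}
reading = {"face_temp": 35.6, "voice_level": 82.0, "speed": 95.0}
print(run_cycle(reading, normal))
```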
FIG. 3 is a graph illustrating density of occurrences of driver hand gestures, according to various embodiments. The graph depicts a sample of a normal distribution  model of normal driving performance which shows how normal driving gestures 302 accumulate towards the middle of the distribution (more common or repetitive) , and how gestures that are not that common 304 tend to accumulate on the sides of the distribution (less repetitive or less common) . Common hand gestures include holding the steering wheel, while less common hand gestures include a fist gesture or a middle finger gesture by the driver, as shown in the depicted embodiment.
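The density-based notion of "normal" illustrated by FIG. 3 — and the pre-trained Gaussian mixture models described below for FIG. 4 — can be illustrated with a short scikit-learn sketch that fits a mixture model with expectation maximization to normal-driving feature vectors and converts the log-likelihood of a new observation into a deviation score. The features, component count, and baseline are assumptions made for the example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Hypothetical 2-D normal-driving features (e.g., mean face temperature, voice level).
normal_features = np.column_stack([rng.normal(34.0, 0.3, 2000),
                                   rng.normal(55.0, 5.0, 2000)])

# Pre-train the normal-driving model offline; EM estimates weights, means, covariances.
gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=0)
gmm.fit(normal_features)

def deviation_score(x, model, baseline):
    """Deviation = how much less likely the observation is than typical normal driving."""
    return float(baseline - model.score_samples(np.atleast_2d(x))[0])

baseline = float(np.median(gmm.score_samples(normal_features)))  # typical log-likelihood
print(deviation_score([34.1, 56.0], gmm, baseline))   # near 0: close to normal driving
print(deviation_score([36.5, 85.0], gmm, baseline))   # large: far from normal driving
```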
FIG. 4 is a block diagram illustrating a system for detection of driver and passenger distress, according to various embodiments. The depicted embodiment shows details of the rage and distress detector 21 from FIG. 1A, as it processes data through several streams. In various embodiments, the system receives inputs from the vehicle’s cabin image sensors 2101 including images of the driver 2005, inputs from audio sensors 2102, inputs from vehicle data 2103, and inputs from a GPS sensor 2104. The cabin image sensor 2101 input is processed by a face detector 2111, heat change comparator 2301, expression estimator 2202 and expression density estimator 2302, and further processed by a hand detector 2112, gesture detector 2203 and gesture density estimator 2303, in various embodiments. The audio sensor 2102 input is processed for mel-frequency cepstral coefficients (MFCC) features 2204, MFCC feature density estimator 2304, and by natural language processing (NLP) detector 2205 and NLP density estimator 2305, in various embodiments. The vehicle data 2103 input is processed by OBD measurement generator 2206 and OBD density estimator 2306, and the GPS sensor 2104 input is processed 2207 by GPS features density estimator 2307. For each aspect, a normal driving model will be pre-trained using a probabilistic model, such as a Gaussian mixture model and density estimators. For learning the model, expectation maximization (EM) is used to estimate the mixture model’s parameters, including using maximum likelihood estimation techniques, which seek to maximize the probability, or likelihood, of the observed data given the model parameters. Then the fitted model can be used to perform various forms of inference, in various embodiments. While the driver 2005 is driving, the real-time driving model will be compared with the pre-trained normal driving model, and a deviation score will be calculated for each of the estimates. The deviation scores include, but are not limited to, a heat change deviation score σ H from heat change deviation score generator 2401, an expression deviation score σ E from expression deviation score generator 2402, a gesture deviation score σ G from gesture deviation score generator 2403, an MFCC deviation score σ MFCC from MFCC deviation score generator 2404, an NLP deviation score σ NLP from NLP deviation score generator 2405, vehicle OBD deviation scores (such as a vehicle speed deviation score σ sp, a steering wheel deviation score σ sw, a steering wheel error deviation score σ swe, a time-to-lane-crossing deviation score σ ttl, a time-to-collision deviation score σ ttc, etc.) from OBD deviation score generator 2406, and a GPS deviation score σ GPS (which can be useful when comparing a vehicle’s speed to the current location’s speed limit, for instance) from GPS deviation score generator 2407. In various embodiments, these deviation scores are inputs to a fusion layer 2500, the output of which is used by classifier 2600 to classify the driver state as normal driving behavior or road rage and distress driving behavior.
For a Gaussian mixture model p(x) with M components:

$$p(x) = \sum_{i=1}^{M} a_i \, \mathcal{N}\!\left(x \mid \mu_i, \Sigma_i\right)$$

$$\mathcal{N}\!\left(x \mid \mu_i, \Sigma_i\right) = \frac{1}{\sqrt{(2\pi)^{d}\,\lvert \Sigma_i \rvert}} \exp\!\left(-\tfrac{1}{2}\,(x-\mu_i)^{\top}\Sigma_i^{-1}(x-\mu_i)\right)$$

where
a_i are the mixture component weights (with $\sum_{i=1}^{M} a_i = 1$),
μ_i are the component means, and
Σ_i are the component variances/co-variances.
FIG. 5 is a block diagram illustrating calculation of a deviation score for a driver or passenger heat change estimate in a system for detection of driver and passenger distress, according to various embodiments. To calculate the heat change deviation score 402, a normal driving model generator 311 collects normal driving thermal images 312 and pre-processes the images for a normal driving model 313. In various embodiments, the normal driving model 313 can be generated offline using a statistical analysis method, such as a Gaussian mixture model (GMM) . In one embodiment, the pre-processing step uses a sequence of continuous image sensor frames from the normal driving model, and obtains the mean reading for each pixel. This mean is then compared with the real-time input of the image sensor to obtain a deviation score. In various embodiments, the normal driving model 313 is compared with real-time thermal images 314 using mathematical manipulation 315, such as subtraction. The comparison result output from comparison system 301 is an input to a heat change deviation score generator, which can use a probability density function (PDF) 401 to generate the heat change deviation score 402 σ H. The heat change deviation score 402 σ H can be an input to the fusion layer 2500, as shown in FIG. 4.
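A minimal sketch of the FIG. 5 heat-change scoring follows. It assumes a per-pixel mean/standard-deviation summary as a stand-in for the pre-processed normal driving model 313 and a Gaussian density for the PDF 401; the frame sizes and temperature values are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)
# Offline: stack of normal-driving thermal frames (values in degrees Celsius).
normal_frames = rng.normal(34.0, 0.5, size=(200, 32, 32))
pixel_mean = normal_frames.mean(axis=0)                 # stand-in for normal driving model 313
pixel_std = normal_frames.std(axis=0) + 1e-6

def heat_change_score(frame):
    """Compare a real-time thermal frame with the normal model (subtraction), then
    turn the per-pixel Gaussian density into a single deviation score."""
    diff = frame - pixel_mean                            # mathematical manipulation (subtraction)
    # Gaussian PDF of each pixel under the normal model; low density = unusual heat.
    pdf = np.exp(-0.5 * (diff / pixel_std) ** 2) / (pixel_std * np.sqrt(2 * np.pi))
    return float(-np.mean(np.log(pdf + 1e-12)))          # higher = larger deviation

print(heat_change_score(rng.normal(34.0, 0.5, (32, 32))))   # close to normal
print(heat_change_score(rng.normal(36.5, 0.5, (32, 32))))   # elevated face temperature
```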
The present disclosure also combines thermal imaging as part of the multimodal approach. A thermal imaging sensor can be used in order to understand the stress  state and the emotional state of a driver, as the skin’s temperature changes based on the activity being performed, and also changes based on the emotional state of a person. However, because the skin’s temperature can change not only due to stress, but also based on other factors, such as physical activity, the present subject matter uses additional modalities to determine the stress level of a driver or passenger. Thus, the temperature of the driver’s hands is also taken into account, since hand temperature is also a good indicator of emotions and distress states. The present system’s multimodal approach makes use of activity recognition, voice recognition, and all the other aforementioned modalities. Combining all these modalities alongside the thermal signature of both the driver’s face and hands produces a more generic and more robust model resistant to false positives.
FIGS. 6A-6B are block diagrams illustrating detection of driver or passenger expression and calculation of a deviation score for a driver or passenger expression estimation 2700, according to various embodiments. To calculate the expression deviation score 2402 σ E, the system receives an input from the cabin image sensors 2101, and uses the face detector 2111, a face validator 2702, and a face tracker 2704 to build a face stream 2706. The system uses a real-time human face detection and tracking technique called detection-tracking-validation (DTV) , in various embodiments. The offline trained face detector 2111 localizes frontal faces, and the online trained face validator 2702 decides whether the tracked face corresponds to the driver. Using each image, a face stream 2706 frame is constructed from the partitioned face/eye regions, and normalized for size adjustment, in various embodiments. A two-dimensional (2D) fully convolutional network (FCN) 2708 with multiple convolutional, pooling, batch normalization, and rectified linear unit (ReLU) layers is applied. In various embodiments, the feature map of the last layer of the encoder is reshaped into vector form 2710, and the output is applied to a recurrent neural network (RNN) such as RNN1 2712, e.g., a long short-term memory (LSTM) network. This network is trained offline using back-propagation with facial expression data, in various embodiments. A normal driving expression model is pre-trained using a Gaussian mixture model (GMM) , in various embodiments. While the driver is driving, the real-time expression is compared with the normal driving model, using an expression detector 2714. The calculated expression deviation score is used as an input to the fusion layer 2500, as shown in FIG. 4.
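The encoder/RNN stages of the expression pipeline can be sketched roughly as below. This is an illustrative PyTorch stand-in with invented layer sizes; the disclosure's actual network (an FCN encoder followed by an LSTM trained with back-propagation on facial expression data) is not reproduced here.

```python
import torch
import torch.nn as nn

class FaceStreamEncoder(nn.Module):
    """Conv + batch norm + ReLU + pooling encoder applied per frame, followed by an
    LSTM over the per-frame feature vectors, loosely mirroring the FIG. 6 pipeline."""
    def __init__(self, feat_dim=64, hidden=128, n_expressions=7):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, feat_dim, 3, padding=1), nn.BatchNorm2d(feat_dim), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                 # reduce the last feature map per frame
        )
        self.rnn = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_expressions)

    def forward(self, face_stream):                  # (batch, time, 1, H, W)
        b, t = face_stream.shape[:2]
        feats = self.encoder(face_stream.flatten(0, 1))   # encode every frame
        feats = feats.view(b, t, -1)                      # reshape feature maps to vectors
        out, _ = self.rnn(feats)
        return self.head(out[:, -1])                      # expression logits for the clip

model = FaceStreamEncoder()
dummy_clip = torch.randn(2, 16, 1, 64, 64)       # 2 clips, 16 face frames each
print(model(dummy_clip).shape)                   # torch.Size([2, 7])
```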
FIGS. 7A-7C are graphs illustrating calculation of a deviation score for a driver gesture estimate in a system for detection of driver and passenger distress, according to various embodiments. To calculate the gesture deviation score 2403 σ G, the system first detects driver gestures, such as a clenched fist, holding the steering wheel, waving hands, pointing at something, holding a smart phone, slapping, or a middle finger gesture. The gesture detector 2203 of FIG. 4 receives an image, hand regions are partitioned, and a two-layer hand stream is constructed, in various embodiments. The hand stream is normalized for size adjustment, and a 2D FCN with multiple convolutional, pooling, batch normalization, and ReLU layers is applied, in various embodiments. The feature map of the last layer of the encoder is reshaped into vector form, and applied to the RNN, in various embodiments. In various embodiments, the network is trained offline using back-propagation with gesture data. A normal driving gesture model is pre-trained, and during driving, real-time gestures (shown in FIG. 7B) are compared with the normal driving gesture model, as shown in FIG. 7C, to obtain the gesture deviation score 2403, which is used as an input to the fusion layer 2500. FIG. 7A demonstrates the distribution of gestures detected inside a vehicle; the middle of the graph (the mean or expected gesture) indicates what is considered normal.
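A small sketch of how a two-layer hand stream frame might be assembled and normalized for size adjustment is given below; OpenCV is assumed for resizing, and the hand boxes are hypothetical detector outputs. The downstream FCN/RNN stages would mirror the expression sketch above.

```python
import numpy as np
import cv2  # assumed available for resizing

def build_hand_stream_frame(frame, left_box, right_box, out_size=(64, 64)):
    """Crop the two detected hand regions, resize them to a fixed size, and stack
    them into one two-layer frame normalized to [0, 1]."""
    layers = []
    for (x, y, w, h) in (left_box, right_box):
        crop = frame[y:y + h, x:x + w]
        crop = cv2.resize(crop, out_size)            # size adjustment
        layers.append(crop.astype(np.float32) / 255.0)
    return np.stack(layers, axis=0)                  # shape: (2, 64, 64)

# Synthetic grayscale cabin frame and hypothetical hand detections (x, y, w, h).
frame = np.random.randint(0, 256, (480, 640), dtype=np.uint8)
stream_frame = build_hand_stream_frame(frame, (100, 300, 80, 80), (400, 310, 90, 85))
print(stream_frame.shape, stream_frame.min(), stream_frame.max())
```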
FIG. 8 is a flow diagram illustrating calculation of a deviation score for mel-frequency cepstral coefficients (MFCC) , according to various embodiments. A time domain audio signal is processed by a sampling step, windowing, and a de-noising step to obtain a speech signal 802, from which the MFCC are then calculated. An MFCC calculator 800 incorporates a fast Fourier transform (FFT) 806, mel scale filtering 808, a logarithmic function 810, a discrete cosine transform 812, and derivatives 814 to obtain a feature vector 804. In various embodiments, a normal driving MFCC model will be pre-trained using GMM or density estimators. During driving, the MFCC will be compared with the normal driving MFCC model to generate the MFCC deviation score 2404 σ MFCC as one of the inputs to the fusion layer 2500.
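The MFCC pipeline of FIG. 8 (FFT, mel-scale filtering, log, DCT, derivatives) can be reproduced compactly with librosa, as sketched below; librosa is an assumed toolchain here, and the sampling rate and coefficient count are illustrative.

```python
import numpy as np
import librosa

sr = 16000
# Synthetic one-second "speech" signal standing in for the de-noised cabin audio 802.
t = np.linspace(0, 1.0, sr, endpoint=False)
speech = 0.1 * np.sin(2 * np.pi * 220 * t) + 0.01 * np.random.randn(sr)

# FFT -> mel filter bank -> log -> DCT are wrapped inside librosa's MFCC routine.
mfcc = librosa.feature.mfcc(y=speech.astype(np.float32), sr=sr, n_mfcc=13)
delta = librosa.feature.delta(mfcc)                       # first-order derivatives
feature_vector = np.vstack([mfcc, delta]).mean(axis=1)    # per-utterance summary
print(feature_vector.shape)   # (26,) -> compared with the normal driving MFCC model
```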
In various embodiments, the system uses natural language processing (NLP) to detect cursing and abusive words. A normal driving NLP model will be pre-trained using GMM and density estimators, in various embodiments. During driving, the driver’s words will be compared with the normal driving NLP model, and the NLP deviation score 2405 σ NLP will be calculated as one of the inputs to the fusion layer 2500.
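A deliberately simple sketch of how the NLP channel could score a transcript window against a normal-driving baseline is given below; the lexicon, window, and baseline rate are illustrative assumptions, not values from the disclosure.

```python
ABUSIVE = {"idiot", "moron", "damn"}          # placeholder lexicon
NORMAL_RATE = 0.01                            # abusive-word rate assumed during normal driving

def nlp_deviation(transcript_window):
    """Deviation of the current abusive-word rate from the normal-driving baseline."""
    words = transcript_window.lower().split()
    if not words:
        return 0.0
    rate = sum(w.strip(".,!?") in ABUSIVE for w in words) / len(words)
    return max(0.0, rate - NORMAL_RATE)

print(nlp_deviation("please merge after me"))          # 0.0
print(nlp_deviation("you idiot get out of the lane"))  # positive deviation
```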
In addition, driving performance measurements can be used to generate OBD deviation scores, which include, but are not limited to, vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision. A multi-channel deviation score generator can be used for OBD data, in an embodiment. According to various embodiments, normal driving OBD data is collected and used to generate measurements, including pre-training a normal driving model to compare with real-time data. Each of the multiple channels is used to calculate a deviation score such as a vehicle speed  deviation score σ sp, a steering wheel deviation score σ sw, a steering wheel error deviation score σ swe, a time-to-lane-crossing deviation score σ ttl, a time-to-collision deviation score σ ttc, etc. In various embodiments, the deviation scores will be inputs to the fusion layer 2500.
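One way to sketch the multi-channel OBD deviation-score generator is to keep a per-channel mean and standard deviation from normal driving and report the absolute z-score of the current reading; the channel set and the example readings below are assumptions.

```python
import numpy as np

normal_obd = {                                  # per-channel (mean, std) collected during normal driving
    "vehicle_speed": (65.0, 8.0),               # km/h
    "steering_wheel_angle": (0.0, 5.0),         # degrees
    "time_to_lane_crossing": (6.0, 1.5),        # seconds
    "time_to_collision": (9.0, 2.0),            # seconds
}

def obd_deviation_scores(readings):
    """Absolute z-score of each real-time OBD reading against its normal driving model."""
    scores = {}
    for channel, value in readings.items():
        mean, std = normal_obd[channel]
        scores[channel] = abs(value - mean) / std
    return scores

print(obd_deviation_scores({
    "vehicle_speed": 112.0,
    "steering_wheel_angle": 22.0,
    "time_to_lane_crossing": 1.2,
    "time_to_collision": 2.5,
}))
```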
According to various embodiments, the present system uses GPS data since a vehicle’s location at a given time can offer useful information regarding a driver’s distress and road rage level. Traffic information (such as speed limit, lane direction, no parking zones, location, etc.) is obtained and compared with the vehicle data to compute an initial traffic violation indicator. In conjunction with GPS data, the system could also use outside-facing sensors (e.g., SONAR, image sensors, LIDAR, etc.) to detect driving environment factors such as the vehicle’s distance to nearby objects, the location of road lane markers, or traffic signs, as additional sources of information.
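A toy sketch of the initial traffic violation indicator under these assumptions compares GPS-derived speed and position with map-provided traffic information; the speed margin and the inputs are illustrative, and a real system would query a map or traffic service rather than receive flags directly.

```python
SPEED_MARGIN_KMH = 5.0      # assumed tolerance before speeding contributes to the indicator

def traffic_violation_indicator(gps_speed_kmh, speed_limit_kmh, in_no_parking_zone, is_parked):
    """Combine a speeding term and a no-parking-zone term into a single indicator."""
    over_speed = max(0.0, gps_speed_kmh - (speed_limit_kmh + SPEED_MARGIN_KMH))
    parking_violation = 1.0 if (is_parked and in_no_parking_zone) else 0.0
    return over_speed / max(speed_limit_kmh, 1.0) + parking_violation

print(traffic_violation_indicator(128.0, 100.0, False, False))  # speeding contributes
print(traffic_violation_indicator(0.0, 50.0, True, True))       # illegal parking contributes
```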
In various embodiments, each modality processing module outputs a deviation score as an input to the fusion layer 2500. The deviation score indicates the amount of deviation from normal for the modality output. In the case of a statistical distribution, the deviation score can include the dispersion from a standard or average measurement (i.e., how different the current measurements are from what is considered normal). For example, if a passenger is screaming, then the deviation scores for the current measurements of the noise modalities are going to be high.
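A minimal sketch of the fusion step, assuming the per-modality deviation scores are simply stacked into a feature vector: a logistic-regression classifier stands in for the fusion layer 2500 here, and the training data is synthetic and purely illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
normal = rng.exponential(0.3, size=(200, 5))            # low deviation scores during normal driving
impaired = rng.exponential(1.5, size=(200, 5)) + 1.0    # high deviation scores during distress
X = np.vstack([normal, impaired])
y = np.array([0] * 200 + [1] * 200)                     # 0 = normal, 1 = impaired

fusion = LogisticRegression().fit(X, y)                 # stand-in for the fusion layer
current_scores = np.array([[0.2, 0.4, 2.8, 3.1, 2.5]])  # e.g. [heat, expression, gesture, MFCC, OBD]
print(fusion.predict(current_scores))                   # [1] -> trigger a warning
```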
As shown in FIG. 4, the cabin image sensors 2101 can include at least one cabin-facing image sensor, such as an NIR camera, a thermal IR camera, and/or a smartphone camera. The image sensors are used to extract visual features that help the system determine if there is road rage and distress for the driver. Some of these visual features may include, but are not limited to, thermal features such as changes in the driver’s face temperature and changes in the driver’s hand temperature. In various embodiments, temperature measurements can come from instantaneous changes in temperature (e.g., temperature at a specific time) , or they may also be tracked over time (e.g., temperature change over an hour of measurement) . The present system is capable of observing the changes in temperature over time. Changes in temperature over time can help the system determine, in combination with other features, if distress is building over time. For example, if the temperature is increasing over time inside the vehicle’s cabin then the driver’s temperature may also increase. This increase can be measured by the camera and then be used as an indication of future distress.
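Tracking temperature change over time can be sketched as a sliding buffer of per-frame face temperatures reduced to a least-squares slope (degrees per minute); the buffer length and sampling rate below are assumptions.

```python
import numpy as np
from collections import deque

SAMPLES_PER_MINUTE = 60                            # assumed one face-temperature reading per second
temps = deque(maxlen=5 * SAMPLES_PER_MINUTE)       # keep the last five minutes of readings

def temperature_trend(new_reading_c):
    """Slope of face temperature over the buffer, in degrees C per minute."""
    temps.append(new_reading_c)
    if len(temps) < 2:
        return 0.0
    t = np.arange(len(temps)) / SAMPLES_PER_MINUTE   # time axis in minutes
    slope, _ = np.polyfit(t, np.array(temps), 1)
    return slope

for reading in 34.0 + 0.02 * np.arange(120):         # simulated slow warming trend
    trend = temperature_trend(reading)
print(round(trend, 3))                                # about 1.2 C per minute
```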
According to various embodiments, the present system also uses the image sensors in order to understand other visual cues, which may include, but are not limited to, facial expressions, as well as hand gestures and activities inside the cabin. For example, the image sensors, after detecting the hands and face of the driver, sense images in which the driver is waving his/her fist while the face and hand temperatures are rising and the driver’s mouth is wide open (e.g., screaming). These circumstances can be understood as potential indications of distress and road rage in the driver.
FIGS. 9A-9B are flow diagrams illustrating methods for associating hands and a face with a driver using an image sensor stream, according to various embodiments. When the present system detects multiple people inside a vehicle’s cabin, the system learns to match specific hands to specific faces, so the system knows which hands and which face to track. FIG. 9A illustrates an embodiment to match hands and faces from an image sensor stream 902. The depicted method includes algorithms to detect hands at 904, measure the distance between all detected hands and a detected face at 906, match the closest hand (s) to the current detected face and assign an identification tag (ID) to the face/hand pair at 908, and use that information to build a hand stream at 910. FIG. 9B illustrates another embodiment, where a driver’s skeleton is detected and assigned an identification tag (operation 950), which is used to obtain the positions of the face and hands relative to the skeleton in three-dimensional space (operation 952). The embodiment proceeds with using a hand detector 954 to build a hand stream 956 and a face detector 958 to build a face stream 960, to associate each hand and face with the people inside the vehicle.
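A compact sketch of the distance-based matching of FIG. 9A, assuming detections are given as (x, y) centers and the detectors themselves are out of scope: each detected face claims its nearest detected hands, and the pair shares an identification tag.

```python
import numpy as np

def match_hands_to_faces(face_centers, hand_centers, max_hands_per_face=2):
    """Assign each face ID the indices of its closest unclaimed hand detections."""
    pairs = {}
    hands = list(enumerate(hand_centers))
    for face_id, face in enumerate(face_centers):
        hands.sort(key=lambda h: np.hypot(h[1][0] - face[0], h[1][1] - face[1]))
        claimed, hands = hands[:max_hands_per_face], hands[max_hands_per_face:]
        pairs[face_id] = [hand_idx for hand_idx, _ in claimed]
    return pairs

faces = [(320, 180), (540, 200)]                  # e.g. driver and passenger face centers
hands = [(300, 330), (350, 340), (560, 350)]      # detected hand centers
print(match_hands_to_faces(faces, hands))         # {0: [0, 1], 1: [2]}
```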
The present subject matter uses machine learning methods to further refine the procedure by adapting to activities between the driver and passengers. For example, the system can learn that distress and road rage may be caused not only by other drivers in the environment, but also by a combination of factors inside the cabin (e.g., kids screaming). The image sensors can also be used to detect hand gestures such as cursing gestures, and other gestures which may have different meanings (e.g., country/culture-dependent gestures). The system uses stored data to predict if distress and road rage are happening, or if they may happen in the near future. Image sensors embedded in vehicles are becoming common, and some vehicles on the market already include not only external image sensors, but also internal image sensors that can capture the entire vehicle’s cabin.
FIGS. 10A-10B illustrate an image sensor of the present subject matter and an example of a captured image from the sensor, according to various embodiments. The image sensor may include an NIR camera 1001 mounted on or inside the dashboard and directed  toward the face of a driver 1002, to produce an image 1010 of the driver’s face for further processing.
FIG. 11 is a flow diagram illustrating a method for detection of driver and passenger distress, according to various embodiments. The depicted embodiment shows the multimodal approach of the present subject matter, which gathers information from several sources, including but not limited to gesture inputs 1101, emotion inputs 1102, driving behavior inputs 1103, traffic condition inputs 1104, speech inputs 1105, OBD/vehicle status data 1106, and GPS data 1107, to detect driver distress and road rage 1110.
Alternative embodiments for this system may also include biosignal sensors that can be attached to the steering wheel and other points of contact in a vehicle. For instance, the transmission clutch for a vehicle may have sensors embedded in the fabric that can measure heartbeats and hand temperature. These biosignal sensors can also be embedded in other parts of the vehicle, such as the radio buttons and the control panel buttons. During driving, the steering wheel is one of the most-touched parts of the vehicle, so the steering wheel can include one or more biosignal sensors to help better understand the current status of a driver, in various embodiments. Moreover, the data gathered from these touch sensors embedded in the vehicle’s fabric and equipment can be obtained from the OBD port located inside the vehicle, in an embodiment. Further embodiments may include using a radar, capacitive, or inductive sensor attached to or within a seat of the vehicle, and configured to sense a heartbeat of the occupant. These seat sensors can function in a touchless manner, in an embodiment. Alternative embodiments may also include using the image sensors inside a vehicle to perform remote photoplethysmography (rPPG). Remote photoplethysmography is a technique that uses an image sensor to detect changes in the skin caused, for example, by changes in blood pressure that follow changes in the heartbeat rate. Because this is a touchless technology, the same image sensor that is used for detecting facial expressions and activity recognition can also be used to perform photoplethysmography. The image sensor choice could be an RGB imaging sensor or a near-infrared imaging sensor, in various embodiments. The additional information provided by rPPG can also be combined with the information obtained from a thermal camera. Using machine learning algorithms, the system can further learn to identify changes in the driver’s skin that are related to stress levels and also to road rage. Moreover, besides techniques such as rPPG, other methods can be used to detect changes in blood flow in a driver’s face, including the use of the Eulerian video magnification method to amplify subtle changes in a person’s face. This can further help the machine learning algorithm to track the changes over time, and predict if the driver will present distress and be prone to road rage.
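A rough rPPG sketch under strong simplifying assumptions: the mean green-channel value of a face region is collected per frame, band-pass filtered to a typical heart-rate band, and the dominant spectral peak is read out as beats per minute. Real pipelines add skin segmentation, detrending, and motion rejection; this only shows the signal path.

```python
import numpy as np
from scipy.signal import butter, filtfilt

fps = 30.0
t = np.arange(0, 20, 1 / fps)                              # 20 s of frames (assumed)
green_trace = 0.5 * np.sin(2 * np.pi * 1.2 * t) + np.random.randn(t.size) * 0.2  # simulated ~72 bpm pulse

b, a = butter(3, [0.7 / (fps / 2), 4.0 / (fps / 2)], btype="band")   # heart-rate band 0.7-4 Hz
pulse = filtfilt(b, a, green_trace - green_trace.mean())

freqs = np.fft.rfftfreq(pulse.size, d=1 / fps)
spectrum = np.abs(np.fft.rfft(pulse))
band = (freqs >= 0.7) & (freqs <= 4.0)
bpm = 60.0 * freqs[band][np.argmax(spectrum[band])]        # dominant peak as heart rate
print(round(bpm, 1))                                       # close to 72 bpm
```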
FIG. 12 illustrates a system for detection of driver and passenger distress, according to various embodiments. In the depicted embodiment, a mobile phone 1201 is mounted to the windshield as part of the system. Unlike previous applications, however, the present subject matter uses a number of sensors outside of the mobile phone 1201 for inputs, and is therefore not limited to the onboard sensors of the mobile phone 1201. In addition, the present subject matter can use the processor in the mobile phone 1201 as the main computational device, or can use an embedded processor in the vehicle’s computational unit, a designated unit, or a combination of these.
FIG. 13 is a schematic diagram illustrating circuitry for implementing devices to perform methods according to example embodiments. Not all components need be used in various embodiments. For example, the computing devices may each use a different set of components and storage devices.
One example computing device in the form of a computer 1300 may include a processing unit 1302, memory 1303, removable storage 1310, and non-removable storage 1312. Although the example computing device is illustrated and described as the computer 1300, the computing device may be in different forms in different embodiments. For example, the computing device may instead be a smartphone, a tablet, a smartwatch, or another computing device including elements the same as or similar to those illustrated and described with regard to FIG. 13. Devices such as smartphones, tablets, and smartwatches are generally collectively referred to as “mobile devices. ” Further, although the various data storage elements are illustrated as part of the computer 1300, the storage may also or alternatively include cloud-based storage accessible via a network, such as the Internet, or server-based storage. According to one embodiment, the various components of computer 1300 are connected with a system bus 1320.
The memory 1303 may include volatile memory 1314 and/or non-volatile memory 1308. The computer 1300 may include, or have access to a computing environment that includes, a variety of computer-readable media, such as the volatile memory 1314 and/or the non-volatile memory 1308, the removable storage 1310, and/or the non-removable storage 1312. Computer storage includes random access memory (RAM), read only memory (ROM), erasable programmable read-only memory (EPROM) or electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD ROM), digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium capable of storing computer-readable instructions.
The computer 1300 may include or have access to a computing environment that includes an input device 1306, an output device 1304, and a communication interface 1316. In various embodiments, the communication interface 1316 includes a transceiver and an antenna. The output device 1304 may include a display device, such as a touchscreen, that also may serve as an input device. The input device 1306 may include one or more of a touchscreen, a touchpad, a mouse, a keyboard, a camera, one or more device-specific buttons, and other input devices. Various embodiments include one or more sensors 1307 integrated within or coupled via wired or wireless data connections to the computer 1300. The computer 1300 may operate in a networked environment using a communication connection to connect to one or more remote computers, such as database servers. The remote computer may include a personal computer (PC) , server, router, network PC, peer device or other common network node, or the like. The communication connection may include a Local Area Network (LAN) , a Wide Area Network (WAN) , a cellular network, a WiFi network, a Bluetooth network, or other networks.
Computer-readable instructions, e.g., a program 1318, comprise instructions stored on a computer-readable medium that are executable by the processing unit 1302 of the computer 1300. A hard drive, CD-ROM, or RAM are some examples of articles including a non-transitory computer-readable medium, such as a storage device. The terms “computer-readable medium” and “storage device” do not include carrier waves to the extent that carrier waves are deemed too transitory. Storage can also include networked storage such as a storage area network (SAN) .
FIG. 14 is a schematic diagram illustrating circuitry for implementing devices to perform methods according to example embodiments. One example computing device in the form of a computer 1400 may include a processing unit 1402, memory 1403 where programs run, a general storage component 1410, and deep learning model storage 1411. Although the example computing device is illustrated and described as computer 1400, the computing device may be in different forms in different embodiments. For example, the computing device may instead be a smartphone, a tablet, a smartwatch, an embedded platform, or other computing device including the same or similar elements as illustrated and described with regard to FIG. 14. According to one embodiment, the various components of computer 1400 are connected with a system bus 1420.
Memory 1403 may include storage for programs including, but not limited to, a face detection program 1431 and a gesture detection program 1432, as well as storage for audio data processing 1433 and sensor data 1434. Computer 1400 may include or have access to a computing environment that includes inputs 1406, system output 1404, and a communication interface 1416. In various embodiments, communication interface 1416 includes a transceiver and an antenna, as well as ports, such as OBD ports. System output 1404 may include a display device, such as a touchscreen, that also may serve as an input device. The system output 1404 may provide an audible or visual warning, in various embodiments. The inputs 1406 may include one or more of a touchscreen, touchpad, mouse, keyboard, camera, microphone, one or more device-specific buttons, and/or one or more sensor inputs such as image sensor input 1461, audio signal input 1462, vehicle data input 1463, and GPS data input 1464. Additional inputs may be used without departing from the scope of the present subject matter. Computer-readable instructions, i.e., a program such as the face detection program 1431, comprise instructions stored on a computer-readable medium that are executable by the processing unit 1402 of the computer 1400.
The disclosure has been described in conjunction with various embodiments. However, other variations and modifications to the disclosed embodiments can be understood and effected from a study of the drawings, the disclosure, and the appended claims, and such variations and modifications are to be interpreted as being encompassed by the appended claims. In the claims, the word “comprising” does not exclude other elements or steps, and the indefinite article “a” or “an” does not exclude a plurality. A single processor or other unit may fulfill the functions of several items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate, preclude, or suggest that a combination of these measures cannot be used. A computer program may be stored or distributed on a suitable medium, such as an optical storage medium or a solid-state medium supplied together with, or as part of, other hardware, but may also be distributed in other forms, such as via the Internet or other wired or wireless telecommunication systems.
Although a few embodiments have been described in detail above, other modifications are possible. For example, the logic flows depicted in the figures do not require the particular order shown, or sequential order, to achieve desirable results. Other steps may be provided in, or steps may be eliminated from, the described flows, and other components may be added to, or removed from, the described systems. Other embodiments may be within the scope of the following claims.

Claims (20)

  1. A method for determining distress of a driver of a vehicle, the method comprising:
    receiving inputs by one or more processors from a plurality of sensors, including interior vehicle image sensors, an interior vehicle audio sensor, vehicle data sensors, and Global Positioning System (GPS) data sensors;
    processing the received inputs by the one or more processors to obtain a driver heat change estimate, a driver expression estimate, a driver gesture estimate, an on-board diagnostics (OBD) estimate, and a GPS estimate;
    storing by the one or more processors the estimates in a memory;
    using the stored estimates by the one or more processors to generate deviation scores for each of the driver heat change estimate, the driver expression estimate, the driver gesture estimate, the OBD estimate, and the GPS estimate;
    executing a machine learning algorithm by the one or more processors to classify driver behavior as normal or impaired based on the deviation scores; and
    generating a warning by the one or more processors based on the classification indicating impaired driver behavior.
  2. The method of claim 1, wherein generating the deviation score for the driver heat change estimate includes:
    generating a normal driving model offline using normal driving thermal images of the driver;
    comparing the normal driving model with real-time thermal image data of the driver to obtain a comparison result; and
    applying a probability density function (PDF) to the comparison result to obtain the deviation score for the driver heat change estimate.
  3. The method of claim 1, wherein generating the deviation score for the driver expression estimate includes:
    using detection-tracking-validation (DTV) to localize frontal face images of the driver;
    constructing a face stream frame from a partitioned face region of the frontal face images;
    applying a fully convolutional network (FCN) to the face stream frame using an encoder, including using multiple convolutional, pooling, batch normalization, and rectified linear unit (ReLU) layers;
    reshaping a feature map of a last layer of the encoder into vector form to obtain an output, and applying the output to a recurrent neural network (RNN) to obtain a normal driving expression model using a Gaussian mixture model (GMM) ; and
    comparing a real-time driver or passenger expression with the normal driving expression model to calculate the deviation score for the driver expression estimate.
  4. The method of claim 1, wherein generating the deviation score for the driver gesture estimate includes:
    detecting driver gestures to obtain an image of a hands region of the driver;
    constructing a two-layer hand stream from the image and normalizing the two-layer hand stream for size adjustment;
    applying a fully convolutional network (FCN) to the two-layer hand stream using an encoder, including using multiple convolutional, pooling, batch normalization, and rectified linear unit (ReLU) layers;
    reshaping a feature map of a last layer of the encoder into vector form to obtain an output, and applying the output to a recurrent neural network (RNN) to obtain a normal driving gesture model using a Gaussian mixture model (GMM) ; and
    comparing a real-time driver gesture with the normal driving gesture model to calculate the deviation score for the driver gesture estimate.
  5. The method of claim 1, wherein generating the deviation score for the OBD estimate includes:
    collecting normal driving data from OBD related to two or more of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision;
    using the normal driving data to generate a normal driving model for each of the two or more of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision; and
    comparing real-time data to the normal driving model for each of the two or more of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision to generate a deviation score for each of the two or more of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision.
  6. The method of claim 1, wherein the warning includes a visual alert.
  7. The method of claim 1, wherein the warning includes an audio output.
  8. The method of claim 1, wherein the warning includes a suggested corrective driver action using a display.
  9. The method of claim 1, wherein using the processor to execute the machine learning algorithm to classify the driver behavior as normal or impaired includes using a Gaussian mixture model (GMM) .
  10. The method of claim 9, wherein expectation maximization is used to estimate model parameters of the GMM.
  11. The method of claim 1, wherein the processor is configured to generate a normal driving model offline for comparison to real-time driving data.
  12. A system for determining distress of a driver of a vehicle, the system comprising:
    a plurality of sensors, including interior vehicle image sensors, an interior vehicle audio sensor, vehicle data sensors, and Global Positioning System (GPS) data sensors; and
    one or more processors in communication with the plurality of sensors, the one or more processors configured to:
    receive inputs from the plurality of sensors;
    process the received inputs to obtain a driver heat change estimate, a driver expression estimate, a driver gesture estimate, an on-board diagnostics (OBD) estimate, and a GPS estimate;
    store the estimates in a memory;
    use the stored estimates to generate deviation scores for each of the driver heat change estimate, the driver expression estimate, the driver gesture estimate, the OBD estimate, and the GPS estimate;
    execute a machine learning algorithm to classify driver behavior as normal or impaired based on the deviation scores; and
    generate a warning based on the classification indicating impaired driver behavior.
  13. The system of claim 12, wherein the plurality of sensors further includes exterior-facing sensors of the vehicle.
  14. The system of claim 12, wherein the processor is further configured to receive a traffic information input, including at least one of a speed limit and a lane direction.
  15. The system of claim 12, wherein the warning includes a suggested corrective driver action using a display.
  16. A non-transitory computer-readable medium storing computer instructions to determine distress of a driver of a vehicle and provide a warning, that when executed by one or more processors, cause the one or more processors to perform steps of:
    receiving inputs from a plurality of sensors, including interior vehicle image sensors, an interior vehicle audio sensor, vehicle data sensors, and Global Positioning System (GPS) data sensors;
    processing the received inputs to obtain a driver heat change estimate, a driver expression estimate, a driver gesture estimate, an on-board diagnostics (OBD) estimate, and a GPS estimate;
    storing the estimates in a memory;
    using the stored estimates to generate deviation scores for each of the driver heat change estimate, the driver expression estimate, the driver gesture estimate, the OBD estimate, and the GPS estimate;
    executing a machine learning algorithm to classify driver behavior as normal or impaired based on the deviation scores; and
    generating the warning based on the classification indicating impaired driver behavior.
  17. The computer-readable medium of claim 16, wherein generating the deviation score for the driver heat change estimate includes:
    generating a normal driving model offline using normal driving thermal images of the driver;
    comparing the normal driving model with real-time thermal image data of the driver to obtain a comparison result; and
    applying a probability density function (PDF) to the comparison result to obtain the deviation score for the driver heat change estimate.
  18. The computer-readable medium of claim 16, wherein generating the deviation score for the driver expression estimate includes:
    using detection-tracking-validation (DTV) to localize frontal face images of the driver;
    constructing a face stream frame from a partitioned face region of the frontal face images;
    applying a fully convolutional network (FCN) to the face stream frame using an encoder, including using multiple convolutional, pooling, batch normalization, and rectified linear unit (ReLU) layers;
    reshaping a feature map of a last layer of the encoder into vector form to obtain an output, and applying the output to a recurrent neural network (RNN) to obtain a normal driving expression model using a Gaussian mixture model (GMM) ; and
    comparing a real-time driver expression with the normal driving expression model to calculate the deviation score for the driver expression estimate.
  19. The computer-readable medium of claim 16, wherein generating the deviation score for the driver gesture estimate includes:
    detecting driver gestures to obtain an image of a hands region of the driver;
    constructing a two-layer hand stream from the image and normalizing the two-layer hand stream for size adjustment;
    applying a fully convolutional network (FCN) to the two-layer hand stream using an encoder, including using multiple convolutional, pooling, batch normalization, and rectified linear unit (ReLU) layers;
    reshaping a feature map of a last layer of the encoder into vector form to obtain an output, and applying the output to a recurrent neural network (RNN) to obtain a normal driving gesture model using a Gaussian mixture model (GMM) ; and
    comparing a real-time driver gesture with the normal driving gesture model to calculate the deviation score for the driver gesture estimate.
  20. The computer-readable medium of claim 16, wherein generating the deviation score for the OBD estimate includes:
    collecting normal driving data from OBD related to two or more of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision;
    using the normal driving data to generate a normal driving model for each of the two or more of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision; and
    comparing real-time data to the normal driving model for each of the two or more of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision to generate a deviation score for each of the two or more of vehicle speed, steering wheel angle, steering wheel angle error, time to lane crossing, and time to collision.
PCT/CN2019/075558 2018-02-22 2019-02-20 Method for distress and road rage detection WO2019161766A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201980014866.9A CN111741884B (en) 2018-02-22 2019-02-20 Traffic distress and road rage detection method
EP19756593.0A EP3755597B1 (en) 2018-02-22 2019-02-20 Method for distress and road rage detection

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US15/902,729 US10322728B1 (en) 2018-02-22 2018-02-22 Method for distress and road rage detection
US15/902,729 2018-02-22

Publications (1)

Publication Number Publication Date
WO2019161766A1 true WO2019161766A1 (en) 2019-08-29

Family

ID=66826167

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/075558 WO2019161766A1 (en) 2018-02-22 2019-02-20 Method for distress and road rage detection

Country Status (4)

Country Link
US (1) US10322728B1 (en)
EP (1) EP3755597B1 (en)
CN (1) CN111741884B (en)
WO (1) WO2019161766A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2597944A (en) * 2020-08-11 2022-02-16 Daimler Ag A method for predicting a user of a user with an autoencoder algorithm, as well as electronic computing device
GB2618313A (en) * 2022-04-22 2023-11-08 Continental Automotive Tech Gmbh A method and system for detecting a state of abnormality within a cabin

Families Citing this family (58)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11887352B2 (en) 2010-06-07 2024-01-30 Affectiva, Inc. Live streaming analytics within a shared digital environment
US12076149B2 (en) 2010-06-07 2024-09-03 Affectiva, Inc. Vehicle manipulation with convolutional image processing
US11935281B2 (en) 2010-06-07 2024-03-19 Affectiva, Inc. Vehicular in-cabin facial tracking using machine learning
US11823055B2 (en) * 2019-03-31 2023-11-21 Affectiva, Inc. Vehicular in-cabin sensing using machine learning
US10952682B2 (en) 2015-07-19 2021-03-23 Sanmina Corporation System and method of a biosensor for detection of health parameters
US10194871B2 (en) 2015-09-25 2019-02-05 Sanmina Corporation Vehicular health monitoring system and method
US10736580B2 (en) 2016-09-24 2020-08-11 Sanmina Corporation System and method of a biosensor for detection of microvascular responses
US10932727B2 (en) 2015-09-25 2021-03-02 Sanmina Corporation System and method for health monitoring including a user device and biosensor
US9636457B2 (en) 2015-07-19 2017-05-02 Sanmina Corporation System and method for a drug delivery and biosensor patch
US10744261B2 (en) 2015-09-25 2020-08-18 Sanmina Corporation System and method of a biosensor for detection of vasodilation
US10888280B2 (en) 2016-09-24 2021-01-12 Sanmina Corporation System and method for obtaining health data using a neural network
US10321860B2 (en) 2015-07-19 2019-06-18 Sanmina Corporation System and method for glucose monitoring
US10750981B2 (en) 2015-09-25 2020-08-25 Sanmina Corporation System and method for health monitoring including a remote device
US10973470B2 (en) 2015-07-19 2021-04-13 Sanmina Corporation System and method for screening and prediction of severity of infection
US9788767B1 (en) 2015-09-25 2017-10-17 Sanmina Corporation System and method for monitoring nitric oxide levels using a non-invasive, multi-band biosensor
US10945676B2 (en) 2015-09-25 2021-03-16 Sanmina Corporation System and method for blood typing using PPG technology
CN109803583A (en) * 2017-08-10 2019-05-24 北京市商汤科技开发有限公司 Driver monitoring method, apparatus and electronic equipment
US11373460B2 (en) * 2017-08-28 2022-06-28 Cox Communications, Inc. Remote asset detection system
EP3493116B1 (en) 2017-12-04 2023-05-10 Aptiv Technologies Limited System and method for generating a confidence value for at least one state in the interior of a vehicle
US10466783B2 (en) 2018-03-15 2019-11-05 Sanmina Corporation System and method for motion detection using a PPG sensor
WO2019191506A1 (en) * 2018-03-28 2019-10-03 Nvidia Corporation Detecting data anomalies on a data interface using machine learning
US11074434B2 (en) * 2018-04-27 2021-07-27 Microsoft Technology Licensing, Llc Detection of near-duplicate images in profiles for detection of fake-profile accounts
US10655978B2 (en) * 2018-06-27 2020-05-19 Harman International Industries, Incorporated Controlling an autonomous vehicle based on passenger behavior
WO2020069517A2 (en) * 2018-09-30 2020-04-02 Strong Force Intellectual Capital, Llc Intelligent transportation systems
US10696305B2 (en) * 2018-11-15 2020-06-30 XMotors.ai Inc. Apparatus and method for measuring physiological information of living subject in vehicle
US11200438B2 (en) 2018-12-07 2021-12-14 Dus Operating Inc. Sequential training method for heterogeneous convolutional neural network
US10635917B1 (en) * 2019-01-30 2020-04-28 StradVision, Inc. Method and device for detecting vehicle occupancy using passenger's keypoint detected through image analysis for humans' status recognition
US11068069B2 (en) * 2019-02-04 2021-07-20 Dus Operating Inc. Vehicle control with facial and gesture recognition using a convolutional neural network
US11887383B2 (en) 2019-03-31 2024-01-30 Affectiva, Inc. Vehicle interior object management
US11068701B2 (en) * 2019-06-13 2021-07-20 XMotors.ai Inc. Apparatus and method for vehicle driver recognition and applications of same
JP7247790B2 (en) * 2019-07-02 2023-03-29 株式会社デンソー Driving environment monitoring device, driving environment monitoring system, and driving environment monitoring program
WO2021044566A1 (en) * 2019-09-05 2021-03-11 三菱電機株式会社 Physique determination device and physique determination method
CN110705370B (en) * 2019-09-06 2023-08-18 中国平安财产保险股份有限公司 Road condition identification method, device, equipment and storage medium based on deep learning
JP7534076B2 (en) * 2019-09-10 2024-08-14 株式会社Subaru Vehicle control device
EP3796209A1 (en) * 2019-09-17 2021-03-24 Aptiv Technologies Limited Method and system for determining an activity of an occupant of a vehicle
EP3795441A1 (en) * 2019-09-17 2021-03-24 Aptiv Technologies Limited Method and device for determining an estimate of the capability of a vehicle driver to take over control of a vehicle
US20230245238A1 (en) * 2019-10-02 2023-08-03 BlueOwl, LLC Cloud-based vehicular telematics systems and methods for generating hybrid epoch driver predictions using edge-computing
CN110525447A (en) * 2019-10-09 2019-12-03 吉林大学 A kind of the man-machine of anti-commercial vehicle driver road anger drives system altogether
US11769056B2 (en) 2019-12-30 2023-09-26 Affectiva, Inc. Synthetic data for neural network training using vectors
CN113393664A (en) * 2020-02-26 2021-09-14 株式会社斯巴鲁 Driving support device
US11494865B2 (en) 2020-04-21 2022-11-08 Micron Technology, Inc. Passenger screening
US11091166B1 (en) 2020-04-21 2021-08-17 Micron Technology, Inc. Driver screening
CN111605556B (en) * 2020-06-05 2022-06-07 吉林大学 Road rage prevention recognition and control system
US11418863B2 (en) 2020-06-25 2022-08-16 Damian A Lynch Combination shower rod and entertainment system
US11753047B2 (en) * 2020-06-29 2023-09-12 Micron Technology, Inc. Impaired driver assistance
US11840246B2 (en) * 2020-06-29 2023-12-12 Micron Technology, Inc. Selectively enable or disable vehicle features based on driver classification
JP7286022B2 (en) * 2020-07-09 2023-06-02 三菱電機株式会社 Occupant state detection device and occupant state detection method
US20220073085A1 (en) * 2020-09-04 2022-03-10 Waymo Llc Knowledge distillation for autonomous vehicles
KR20220074566A (en) * 2020-11-27 2022-06-03 현대자동차주식회사 Apparatus for vehicle video recording and method thereof
US11753019B2 (en) * 2020-11-30 2023-09-12 Sony Group Corporation Event determination for vehicles and occupants of mobility provider on MaaS platform
CN112906617B (en) * 2021-03-08 2023-05-16 济南中凌电子科技有限公司 Method and system for identifying abnormal behavior of driver based on hand detection
JP2022187273A (en) * 2021-06-07 2022-12-19 トヨタ自動車株式会社 Information processing device and driving evaluation system
CN113415286B (en) * 2021-07-14 2022-09-16 重庆金康赛力斯新能源汽车设计院有限公司 Road rage detection method and equipment
KR20230019334A (en) * 2021-07-30 2023-02-08 현대자동차주식회사 Vehicle path control device, controlling method of vehicle path
WO2024022963A1 (en) 2022-07-29 2024-02-01 Saint-Gobain Glass France Arrangement for driver assistance system
CN115223104B (en) * 2022-09-14 2022-12-02 深圳市睿拓新科技有限公司 Method and system for detecting illegal operation behaviors based on scene recognition
US20240233913A9 (en) * 2022-10-20 2024-07-11 Curio Digital Therapeutics, Inc. User analysis and predictive techniques for digital therapeutic systems
EP4421757A1 (en) * 2023-02-24 2024-08-28 Rockwell Collins, Inc. Audio-visual pilot activity recognition system and method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070063854A1 (en) * 2005-08-02 2007-03-22 Jing Zhang Adaptive driver workload estimator
US20090040054A1 (en) * 2007-04-11 2009-02-12 Nec Laboratories America, Inc. Real-time driving danger level prediction
US20140112556A1 (en) 2012-10-19 2014-04-24 Sony Computer Entertainment Inc. Multi-modal sensor based emotion recognition and emotional interface
US20160311440A1 (en) * 2015-04-22 2016-10-27 Motorola Mobility Llc Drowsy driver detection
US20180025240A1 (en) * 2016-07-21 2018-01-25 Gestigon Gmbh Method and system for monitoring the status of the driver of a vehicle

Family Cites Families (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS6156629A (en) 1984-08-28 1986-03-22 アイシン精機株式会社 Cardiac pulse meter for car
US7027621B1 (en) * 2001-03-15 2006-04-11 Mikos, Ltd. Method and apparatus for operator condition monitoring and assessment
DE102006011481A1 (en) * 2006-03-13 2007-09-20 Robert Bosch Gmbh A method and apparatus for assisting in guiding a vehicle
US20110187518A1 (en) * 2010-02-02 2011-08-04 Ford Global Technologies, Llc Steering wheel human/machine interface system and method
US9493130B2 (en) * 2011-04-22 2016-11-15 Angel A. Penilla Methods and systems for communicating content to connected vehicle users based detected tone/mood in voice input
TWI447039B (en) * 2011-11-25 2014-08-01 Driving behavior analysis and warning system and method thereof
US20150213555A1 (en) 2014-01-27 2015-07-30 Hti Ip, Llc Predicting driver behavior based on user data and vehicle data
CN106575454A (en) 2014-06-11 2017-04-19 威尔蒂姆Ip公司 System and method for facilitating user access to vehicles based on biometric information
CN104112335A (en) * 2014-07-25 2014-10-22 北京机械设备研究所 Multi-information fusion based fatigue driving detecting method
US10065652B2 (en) * 2015-03-26 2018-09-04 Lightmetrics Technologies Pvt. Ltd. Method and system for driver monitoring by fusing contextual data with event data to determine context as cause of event
WO2016195474A1 (en) * 2015-05-29 2016-12-08 Charles Vincent Albert Method for analysing comprehensive state of a subject
US10932727B2 (en) 2015-09-25 2021-03-02 Sanmina Corporation System and method for health monitoring including a user device and biosensor
US10244983B2 (en) * 2015-07-20 2019-04-02 Lg Electronics Inc. Mobile terminal and method for controlling the same
US20170091872A1 (en) 2015-09-24 2017-03-30 Renesas Electronics Corporation Apparatus and method for evaluating driving ability, and program for causing computer to perform method
JP6985005B2 (en) 2015-10-14 2021-12-22 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカPanasonic Intellectual Property Corporation of America Emotion estimation method, emotion estimation device, and recording medium on which the program is recorded.
US20170105667A1 (en) 2015-10-15 2017-04-20 Josh WEI Stress and Heart Rate Trip Monitoring System and Method
JP2017121286A (en) 2016-01-05 2017-07-13 富士通株式会社 Emotion estimation system, emotion estimation method, and emotion estimation program
JP6971017B2 (en) 2016-04-12 2021-11-24 ソニーモバイルコミュニケーションズ株式会社 Detection device, detection method, and program
CN106114516A (en) * 2016-08-31 2016-11-16 合肥工业大学 The angry driver behavior modeling of a kind of drive automatically people's characteristic and tampering devic
CN107235045A (en) * 2017-06-29 2017-10-10 吉林大学 Consider physiology and the vehicle-mounted identification interactive system of driver road anger state of manipulation information
CN206885034U (en) * 2017-06-29 2018-01-16 吉林大学 Consider physiology with manipulating the vehicle-mounted identification interactive system of driver road anger state of information

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070063854A1 (en) * 2005-08-02 2007-03-22 Jing Zhang Adaptive driver workload estimator
US20090040054A1 (en) * 2007-04-11 2009-02-12 Nec Laboratories America, Inc. Real-time driving danger level prediction
US20140112556A1 (en) 2012-10-19 2014-04-24 Sony Computer Entertainment Inc. Multi-modal sensor based emotion recognition and emotional interface
US20160311440A1 (en) * 2015-04-22 2016-10-27 Motorola Mobility Llc Drowsy driver detection
US20180025240A1 (en) * 2016-07-21 2018-01-25 Gestigon Gmbh Method and system for monitoring the status of the driver of a vehicle

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3755597A4

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2597944A (en) * 2020-08-11 2022-02-16 Daimler Ag A method for predicting a user of a user with an autoencoder algorithm, as well as electronic computing device
GB2618313A (en) * 2022-04-22 2023-11-08 Continental Automotive Tech Gmbh A method and system for detecting a state of abnormality within a cabin

Also Published As

Publication number Publication date
EP3755597A4 (en) 2021-04-21
CN111741884A (en) 2020-10-02
EP3755597A1 (en) 2020-12-30
EP3755597B1 (en) 2024-09-04
US10322728B1 (en) 2019-06-18
CN111741884B (en) 2022-11-04

Similar Documents

Publication Publication Date Title
EP3755597B1 (en) Method for distress and road rage detection
US11375338B2 (en) Method for smartphone-based accident detection
US20190370580A1 (en) Driver monitoring apparatus, driver monitoring method, learning apparatus, and learning method
US11535280B2 (en) Method and device for determining an estimate of the capability of a vehicle driver to take over control of a vehicle
EP3588372B1 (en) Controlling an autonomous vehicle based on passenger behavior
US20210357701A1 (en) Evaluation device, action control device, evaluation method, and evaluation program
CN110765807B (en) Driving behavior analysis and processing method, device, equipment and storage medium
Doshi et al. A comparative exploration of eye gaze and head motion cues for lane change intent prediction
Sathyanarayana et al. Information fusion for robust ‘context and driver aware’active vehicle safety systems
JP7303901B2 (en) Suggestion system that selects a driver from multiple candidates
JPWO2021053780A1 (en) Cognitive function estimation device, learning device, and cognitive function estimation method
JP2022033805A (en) Method, device, apparatus, and storage medium for identifying passenger's status in unmanned vehicle
Jafarnejad et al. Non-intrusive Distracted Driving Detection based on Driving Sensing Data.
Andriyanov et al. Eye recognition system to prevent accidents on the road
Lashkov et al. Aggressive behavior detection based on driver heart rate and hand movement data
CN110641468B (en) Controlling autonomous vehicles based on passenger behavior
P Mathai A New Proposal for Smartphone-Based Drowsiness Detection and Warning System for Automotive Drivers
Kumar et al. Early Detection of Driver Drowsiness Detection using Automated Deep Artificial Intelligence Learning (ADAI)
Abut et al. Vehicle Systems and Driver Modelling: DSP, human-to-vehicle interfaces, driver behavior, and safety
Sakethram et al. Inattentional Driver Detection Using Faster R-CNN and Resnet
Rimal et al. Driver Monitoring System using an Embedded Computer Platform
Sen et al. Passive Monitoring of Dangerous Driving Behaviors Using mmWave Radar
CN117542028A (en) Method and device for detecting driving behavior, vehicle and storage medium
Abut et al. Intelligent Vehicles and Transportation
Khare et al. Multimodal interaction in modern automobiles

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19756593

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2019756593

Country of ref document: EP

Effective date: 20200922