GB2621863A - Pose classification and in-cabin monitoring methods and associated systems - Google Patents
- Publication number
- GB2621863A GB2621863A GB2212345.9A GB202212345A GB2621863A GB 2621863 A GB2621863 A GB 2621863A GB 202212345 A GB202212345 A GB 202212345A GB 2621863 A GB2621863 A GB 2621863A
- Authority
- GB
- United Kingdom
- Prior art keywords
- face
- key point
- head
- class
- image data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/59—Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
- G06V20/597—Recognising the driver's state or behaviour, e.g. attention or drowsiness
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Abstract
Method for determining a vehicle occupant's face or head pose, comprising: extracting key point coordinate data 18 indicative of a body part from an image 14; determining a head or face region in the image from the key point data and cropping the face or head region from the image 20 to obtain a cropped face image 22; classifying 24 the cropped face or head image to determine a pose class 26. The vehicle occupant may be a vehicle driver. Key point data may be extracted by a machine learning model (convolutional neural network) which also provides a key point label selected from the following group: hands, elbows, shoulders, nose, eyes, ears, mouth and chin. The face/head pose class may be determined by a machine learning model (artificial neural network). A body pose class 30 may be obtained by classifying the key point data 18 by a tree-based machine learning model 28. A control means may use the duration that a face/head classification 26 and body pose classification 30 are displayed to determine a control signal for the vehicle.
Description
DESCRIPTION
Pose classification and in-cabin monitoring methods and associated systems
TECHNICAL FIELD
The invention relates to computer-implemented methods for classifying face/head poses and in-cabin monitoring of vehicle occupants.
BACKGROUND
Driver assistance systems for vehicle drivers have become more sophisticated recently and increasingly rely on machine learning models, feature extraction and classification for monitoring the driver or other occupants of the vehicle. The systems are able to detect certain states, such as fatigue, drowsiness or consciousness of the driver, for example, and may generate control signals to the vehicle. These may range from a simple warning message to the driver, or broadcasting a hazardous state to other traffic participants, up to an emergency braking procedure to prevent collisions with other vehicles or persons.
US 10 089 543 B2 discloses a computer-implemented method for detecting a head pose in a vehicle that includes receiving images of a vehicle occupant located in the vehicle from an imaging device and selecting facial feature points from a plurality of facial feature points extracted from the images. The method includes calculating a head pose point based on normalizing the selected facial feature points, determining the head pose based on a change in position of the head pose point over a period of time T, and controlling one or more vehicle systems of the vehicle based on the head pose.
Long Chen et al., "Driver Fatigue Detection Based on Facial Key Points and LSTM", Security and Communication Networks, Volume 2021, Article ID 5383573, 9 pages, https://doi.org/10.1155/2021/5383573, published 14 June 2021, discloses a fatigue state recognition algorithm based on a multitask convolutional neural network (MTCNN) to detect the human face; subsequently an open-source software library, such as DLIB, is used to locate facial key points to extract a fatigue feature vector for each frame.
EP 3 690 729 A1 discloses a method for warning by detecting an abnormal state of a driver of a vehicle based on deep learning. The method includes steps of a driver state detecting device inputting an interior image of the vehicle into a drowsiness detecting network, to detect a facial part of the driver, detect an eye part from the facial part, detect a blinking state of an eye to determine a drowsiness state, and inputting the interior image into a pose matching network, to detect body key points of the driver and determine whether the body key points match one of preset driving postures, to determine the abnormal state.
SUMMARY OF THE INVENTION
It is the object of the invention to improve in-cabin monitoring systems for vehicles.
The object is achieved by the subject-matter of the independent claims. Preferred embodiments are subject-matter of the dependent claims.
The invention provides a computer-implemented method for determining a face/head pose class of a vehicle occupant from unlabeled image data that at least partially includes the vehicle occupant, the method comprising: a) capturing image data that at least partially include the vehicle occupant; b) extracting key point data from the image data, wherein the key point data include at least one key point, that is indicative of a specific body part of the vehicle occupant, and key point coordinates for each key point; c) determining from the key point data obtained in step b) a face/head region within the image data, wherein the face/head region includes a face portion and/or head portion of the vehicle occupant, and cropping the face/head region from the image data in order to obtain cropped image data that only include the face/head region; d) determining a face/head pose class by classifying the cropped image data into one output face/head pose class of a predetermined set of face/head pose classes.
Preferably, the vehicle occupant is a driver of the vehicle.
Preferably, in step b) the key point data is extracted by a machine learning model that includes a convolutional neural network.
Preferably, in step b) each extracted key point is labelled with a body part label chosen from a group comprising or consisting of hands, elbows, shoulders, nose, eyes, ears, mouth, chin.
Preferably, in step d) the face/head pose class is determined by a machine learning model that includes an artificial neural network that is trained to determine a probability score for each face/head pose class of the predetermined set, and the face/head pose class with the highest probability is selected as the output face/head pose class.
Preferably, the method includes the step: e) determining a body pose class by classifying the key point data into one output body pose class of a predetermined set of body pose classes.
Preferably, in step e) the body pose class is determined by a tree-based machine learning model that is configured to determine the output body pose class based only on the key point data.
The invention provides an in-cabin monitoring method for monitoring at least one vehicle occupant within a vehicle, the method comprising: a) performing a previously described method; b) evaluating the output face/head pose class and optionally the output body pose class and generating a control signal for the vehicle based on the evaluation of the respective output pose class.
Preferably, in step b) the evaluation includes a time measurement of how long a specific output face/head pose class and optionally output body pose class are displayed by the vehicle occupant, and generating the control signal also based on the time measurement.
Preferably, the control signal brings the vehicle into a safe state by slowly driving the vehicle to a side or (hard) shoulder of the road.
Preferably, the control signal causes the vehicle to perform an emergency braking procedure.

The invention provides a computer program comprising instructions which, when the program is executed by a computer, cause the computer to carry out a previously described method.
The invention provides a computer-readable medium having stored thereon or a data carrier signal carrying the computer program.
The invention provides a classification device that is configured for determining an output face/head pose class of a vehicle occupant from unlabeled image data that at least partially includes the vehicle occupant, the device comprising: a) an imaging device configured for capturing image data that at least partially include the vehicle occupant; b) a key point extraction means that is configured for extracting key point data from the image data, wherein the key point data include at least one key point, that is indicative of a specific body part of the vehicle occupant, and key point coordinates for each key point; c) an image data cropping means that is configured for determining from the key point data obtained in step b) a face/head region within the image data, wherein the face/head region includes a face portion and/or head portion of the vehicle occupant, and cropping the face/head region from the image data in order to obtain cropped image data that only include the face/head region; d) a face/head pose classification means configured for determining a face/head pose class by classifying the cropped image data into one output face/head pose class of a predetermined set of face/head pose classes.
Preferably, the key point extraction means include a machine learning model that includes a convolutional neural network that is configured for extracting the key point data.
Preferably, the key point extraction means are configured to label each extracted key point with a body part label chosen from a group comprising or consisting of hands, elbows, shoulders, nose, eyes, ears, mouth, chin.
Preferably, the face/head classification means are configured to determine the face/head pose class by using a machine learning model that includes an artificial neural network that is trained to determine a probability score for each face/head pose class of the predetermined set, and the face/head pose class with the highest probability is selected as the output face/head pose class.
Preferably, the device comprises e) a body pose classification means configured for determining a body pose class by classifying the key point data into one output body pose class of a predetermined set of body pose classes.
Preferably, the body pose classification means are configured to determine the body pose class by using a tree-based machine learning model that is configured to determine the output body pose class based only on the key point data.
The invention provides an in-cabin vehicle monitoring system configured for monitoring at least one vehicle occupant within a vehicle, the device comprising: a) means for performing a preferred classification method; b) control means configured for evaluating the output face/head pose class and optionally the output body pose class and generating a control signal for the vehicle based on the evaluation of the respective output pose class.
Preferably, the control means are configured for performing a time measurement of how long a specific output face/head pose class and optionally output body pose class are displayed by the vehicle occupant, and for generating the control signal also based on the time measurement.
Preferably, the control signal brings the vehicle into a safe state by slowly driving the vehicle to a side or (hard) shoulder of the road.
Preferably, the control signal causes the vehicle to perform an emergency braking procedure.
This solution can be used to detect the state of the driver of a vehicle based on his body and face/head pose. The body and head poses are extracted from an image, e.g., from a near-infrared (NIR) or RGB camera. The idea includes three different machine learning (ML) models.
One model is configured to detect body key points of the driver inside the image. The key point model can be based on a convolutional neural network (CNN) that is trained to output key point confidence maps and part affinity fields. In a postprocessing step, the confidence maps and part affinity fields may be used to calculate the body key point coordinates relative to the input image size. The output of the key point model typically comprises the coordinates of certain body key points, e.g., hands, elbows, shoulders, and also facial/head key points for nose, eyes, ears and neck.
An example of a key point extraction model is the Qualcomm pose estimation model (TensorFlow) that is available on GitHub: https://github.com/quic/aimet-modelzoo/blob/develop/zoo_tensorflow/Docs/PoseEstimation.md. The model takes input data from images of size 224x400x3, normalized by (x/256)-0.5, i.e., pixel values from -0.5 to 0.5. Examples of suitable training parameters include optimizer: Adam, learning rate: 0.001, mini batch size: 16, epochs: 10.
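The normalization scheme described above can be sketched as follows; the function name and dummy frame are illustrative, only the (x/256)-0.5 formula and the 224x400x3 input size come from the text:

```python
import numpy as np

def normalize_input(image: np.ndarray) -> np.ndarray:
    """Map raw 8-bit pixel values into roughly [-0.5, 0.5),
    following the (x/256) - 0.5 scheme described above."""
    return image.astype(np.float32) / 256.0 - 0.5

# A dummy frame with the 224x400x3 input size mentioned above.
frame = np.zeros((224, 400, 3), dtype=np.uint8)
out = normalize_input(frame)
```

A pixel value of 0 maps to -0.5 and a value of 255 maps to just under 0.5, so the whole 8-bit range stays inside the stated interval.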
The model is preferably configured as a two-branch multi-stage CNN, as is known from Cao et al., "Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields", arXiv:1611.08050v2. Each stage in the first branch predicts confidence maps, and each stage in the second branch predicts affinity fields. After each stage, the predictions from the two branches, along with the image features, are concatenated for the next stage. This architecture is able to simultaneously predict detection confidence maps and affinity fields that encode part-to-part association.
Each branch is preferably configured as an iterative prediction architecture, which can refine predictions over successive stages, preferably with intermediate supervision at each stage. The image is first analyzed by a convolutional network generating a set of feature maps F that is input to the first stage of each branch. At the first stage, the network produces a set of detection confidence maps and a set of part affinity fields. In each subsequent stage, the predictions from both branches in the previous stage, along with the original image features, are concatenated and used to produce refined predictions.
The model is preferably trained with the COCO dataset (https://cocodataset.org/) that includes images labeled with body and face key points. The key point extraction model generates part affinity fields and heatmaps. These are postprocessed by applying non-maximum suppression to the heatmaps and assigning each detected key point in a heatmap to a specific person by applying the Hungarian algorithm to the part affinity fields. The result of the key point extraction model are (2D) key point coordinates (e.g., nose, eyes, etc.), i.e., the positions of the key points within the input image.
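The non-maximum suppression step on the heatmaps can be illustrated with a minimal sketch; the 3x3 neighbourhood and the 0.1 confidence threshold are illustrative assumptions, and the person-assignment step via the Hungarian algorithm is omitted:

```python
import numpy as np

def heatmap_peaks(heatmap: np.ndarray, threshold: float = 0.1):
    """Simple non-maximum suppression: keep pixels that are local maxima
    within their 3x3 neighbourhood and above a confidence threshold.
    Returns (row, col, score) tuples, i.e. candidate key point positions."""
    h, w = heatmap.shape
    peaks = []
    for y in range(h):
        for x in range(w):
            v = heatmap[y, x]
            if v <= threshold:
                continue  # suppress low-confidence responses
            window = heatmap[max(0, y - 1):y + 2, max(0, x - 1):x + 2]
            if v >= window.max():
                peaks.append((y, x, float(v)))
    return peaks

hm = np.zeros((8, 8), dtype=np.float32)
hm[2, 3] = 0.9   # one confident key point
hm[6, 6] = 0.05  # below threshold, suppressed
peaks = heatmap_peaks(hm)
```

In a full pipeline one peak list per key point type would then be matched to persons using the part affinity fields.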
Another idea is a model for classifying the key points into a specific body pose. The body pose classification model may take some or all of the key point coordinates as input features. The body pose classifier can be a tree-based ML model, for example XGBoost or Random Forest, that can be trained to classify different body poses, e.g., body normal, body leaning left or body leaning right. The architecture of the XGBoost tree model, as is known, is basically a series of if...then...else statements that are true for a particular body pose defined by the key points.
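The if...then...else structure of such a tree can be illustrated with a toy hand-written rule; the key point names, the nose-versus-shoulder-midpoint feature and the 0.15 threshold are illustrative assumptions rather than a trained model:

```python
def classify_body_pose(keypoints: dict) -> str:
    """Toy illustration of the if...then...else structure of a tree-based
    body pose classifier. Compares the nose x-coordinate against the
    shoulder midpoint in normalized image coordinates."""
    nose_x = keypoints["nose"][0]
    mid_shoulder_x = (keypoints["left_shoulder"][0]
                      + keypoints["right_shoulder"][0]) / 2.0
    offset = nose_x - mid_shoulder_x
    if offset < -0.15:
        return "body leaning left"
    elif offset > 0.15:
        return "body leaning right"
    else:
        return "body normal"

kp = {"nose": (0.50, 0.20),
      "left_shoulder": (0.40, 0.50),
      "right_shoulder": (0.60, 0.50)}
pose = classify_body_pose(kp)
```

A trained XGBoost model learns many such splits automatically instead of a single hand-picked threshold.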
The body pose classification model preferably takes as input data the (2D) key point coordinates. An example model architecture is the XGBoost tree model. The model is trained on the key point coordinates extracted from the COCO dataset. It is also possible to use a different dataset. Each collection of key point coordinates in a single image is labelled with a pose class. Suitable training parameters for the model include max_depth: 5, min_child_weight: 1, learning_rate: 0.25, subsample: 0.85, colsample_bytree: 0.45, gamma: 0.4, reg_alpha: 0.08, n_estimators: 300.
The model outputs a probability score for each possible body pose class. It is preferred that the body pose class with the highest probability score is selected and output as the output body pose class.
To detect the face/head pose, first the facial and head key point coordinates from the output of the key point model are reused to calculate the region where the drivers face is located. This region is cropped from the original image and then used as the input to a CNN classifier. The CNN classifier can be trained to detect different face/head poses, e.g., face looking up, face looking straight, face looking down or face looking sidewards.
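Deriving the face region from the reused facial key points can be sketched as follows; the 0.3 margin factor and the example coordinates are illustrative assumptions to ensure the whole head is included:

```python
import numpy as np

def crop_face_region(image: np.ndarray, face_keypoints, margin: float = 0.3):
    """Compute a bounding box around facial key point coordinates
    (e.g. nose, eyes, ears), enlarge it by a margin, and crop it
    from the original image."""
    xs = [x for x, y in face_keypoints]
    ys = [y for x, y in face_keypoints]
    x0, x1 = min(xs), max(xs)
    y0, y1 = min(ys), max(ys)
    dx = (x1 - x0) * margin
    dy = (y1 - y0) * margin
    h, w = image.shape[:2]
    x0 = max(0, int(x0 - dx)); x1 = min(w, int(x1 + dx))
    y0 = max(0, int(y0 - dy)); y1 = min(h, int(y1 + dy))
    return image[y0:y1, x0:x1]

img = np.zeros((480, 640, 3), dtype=np.uint8)
# eyes, nose and ears in pixel coordinates (illustrative values)
face_kps = [(300, 180), (340, 180), (320, 200), (280, 190), (360, 190)]
crop = crop_face_region(img, face_kps)
```

The resulting crop is what the CNN classifier receives, so no separate face detector is needed.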
The face classification model takes as input data the image pixel values of the face/head region that was cropped using the key points. The pixel values may be normalized by (x/127.5)-1.0, i.e., to values from -1 to 1. The face classification model may use MobileNet V2, for example, which is publicly available. The architecture of MobileNet V2, as is known from Sandler et al., "MobileNetV2: Inverted Residuals and Linear Bottlenecks", arXiv:1801.04381v4, has as its basic building block a bottleneck depth-separable convolution with residuals. The architecture contains an initial fully convolutional layer with 32 filters, followed by 19 residual bottleneck layers. Preferably, ReLU6 is used as the non-linearity because of its robustness. The kernel size is chosen to be 3x3. During training, dropout and batch normalization can be utilized.
The model is trained using face images that are (manually) cropped from the COCO dataset. Again, other training sets are possible. An example set of suitable training parameters includes optimizer: Adam, learning rate: 0.00005, mini batch size: 16, epochs: 25. The model outputs a probability score for each possible face/head pose class. The face/head pose class with the highest probability score may be selected to be output as the output face/head pose class.
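The final class selection described above amounts to a softmax over the classifier outputs followed by an argmax; the class names come from the text, while the example logit values are illustrative:

```python
import numpy as np

FACE_POSE_CLASSES = ["face looking up", "face looking straight",
                     "face looking down", "face looking sidewards"]

def select_pose_class(logits: np.ndarray) -> str:
    """Convert classifier logits into per-class probability scores via
    softmax and return the class with the highest score."""
    e = np.exp(logits - logits.max())  # shift for numerical stability
    probs = e / e.sum()
    return FACE_POSE_CLASSES[int(np.argmax(probs))]

chosen = select_pose_class(np.array([0.1, 2.3, 0.4, -1.0]))
```

The same selection rule applies to the body pose classifier, whose output is likewise a probability score per class.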
The body pose and the face/head pose that were separately determined are then combined to get a more comprehensive estimation of the driver state. For example, if the body is leaning sidewards for a short period of time, but the face is still looking straight, it could still be considered a normal driving position.
With this configuration the output of the key point model is reused to detect the face. Consequently, a separate face detector can be avoided, thereby saving time and computing resources. Furthermore, the classification involves the full face/head instead of only certain parts, such as the eyes. The accuracy of the driver state detection may be increased. In addition, the driver state may be detected more robustly.
A driver monitoring system that runs the system described above may warn the driver in case of not being in a normal driving condition or position. The system may also intervene in the control of the vehicle, e.g., triggering an emergency brake.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments of the invention are described in more detail with reference to the accompanying schematic drawings.
The only Fig. depicts an embodiment of an in-cabin monitoring system of a vehicle.
DETAILED DESCRIPTION OF EMBODIMENT
Referring to the Fig., an in-cabin monitoring system 10 for a vehicle is depicted. The monitoring system 10 is generally configured to monitor the state of a vehicle occupant based on body pose, head/face pose and additional information, such as time duration etc. The monitoring system 10 comprises an imaging device 12. The imaging device 12 is configured to capture image data 14 of a vehicle occupant, e.g., the driver. The imaging device 12 may be an RGB or NIR camera. The imaging device 12 is typically arranged to capture an upper body portion including the head and face of the vehicle occupant.
The monitoring system 10 comprises a key point extraction means 16. The key point extraction means 16 receive the image data 14 captured by the imaging device 12. The key point extraction means 16 include a machine learning model that is trained to determine key point confidence maps and part affinity fields from the image data 14. The machine learning model is preferably configured as a convolutional neural network that is known per se.
Each key point is characterized by key point coordinates, that are indicative of the position of the key point within the image data 14, and a key point label, that is indicative of the body part that the key point represents, such as hands, elbows, shoulders and also facial/head key points for nose, eyes, ears and the like. The key point extraction means 16 is configured to output key point data 18 that includes the respective key point coordinates and the corresponding key point label for each key point.
The monitoring system 10 comprises an image data cropping means 20. The image data cropping means 20 receives the image data 14 captured by the imaging device 12 and the key point data 18 that was extracted by the key point extraction means 16. The image data cropping means 20 determines a face/head region in the image data 14 based on the key point data 18, wherein the face/head region contains the face of the vehicle occupant. The image data cropping means 20 crops the face/head region out of the captured image data 14 in order to obtain cropped image data 22.
The monitoring system 10 comprises a face/head pose classification means 24. The face/head pose classification means 24 receive the cropped image data 22. The face/head pose classification means 24 includes a machine learning model that is trained to classify the face/head pose of the vehicle occupant by determining a probability score for each member of a predetermined set of face/head pose classes. The face/head pose classification means 24 selects the face/head pose with the highest probability score as the output face/head pose class 26 that the vehicle occupant had at the time of capturing the image data 14.
The monitoring system 10 comprises a body pose classification means 28. The body pose classification means 28 processes only the key point data 18 that was determined by the key point extraction means 16, i.e., no image data is processed. The body pose classification means 28 includes a tree-based machine learning model that is configured to classify the body pose of the vehicle occupant at the time of capturing the image data 14 and to output it as the output body pose class 30.
The monitoring system 10 comprises a control means 32. The control means 32 receive the output face/head pose class 26 and the output body pose class 30. The control means 32 may determine, for how long a specific face/head pose class 26 and/or body pose class 30 is exhibited by the vehicle occupant. The control means 32 may include a database of combinations of pose classes 26, 30 and associated time durations that are considered an abnormal or hazardous state of the vehicle occupant. The control means 32 evaluates the pose classes 26, 30 that are received with the stored combinations and performs a predetermined action, when the control means 32 determines that an abnormal or hazardous state is present. The control means 32 may issue a control signal that causes the vehicle to issue a warning, e.g., for the driver or other vehicle occupants using the interior signaling devices, and/or other traffic participants, e.g., by using the vehicle exterior lighting. If the state is severe enough, e.g., the control means 32 determines that the driver is incapacitated, the control means 32 may issue a control signal that causes the vehicle to perform an (emergency) braking procedure and/or, if the equipment allows, guiding the vehicle towards a safe position, e.g., near the curb or hard shoulder.
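The duration-based evaluation performed by the control means 32 can be sketched as a small state tracker; the specific pose combinations, duration thresholds and signal names below are illustrative assumptions, not values from the patent:

```python
# Illustrative combinations of pose classes and duration thresholds
# (in seconds) that are treated as hazardous.
HAZARDOUS_AFTER = {
    ("face looking down", "body normal"): 2.0,
    ("face looking sidewards", "body leaning left"): 1.0,
}

class ControlMeans:
    """Sketch of the control means: track how long a pose-class
    combination is exhibited and emit a control signal once the
    associated duration threshold is exceeded."""

    def __init__(self):
        self._current = None   # currently observed (face, body) combination
        self._since = 0.0      # timestamp when that combination first appeared

    def update(self, face_pose: str, body_pose: str, now: float) -> str:
        combo = (face_pose, body_pose)
        if combo != self._current:
            self._current, self._since = combo, now  # combination changed
        duration = now - self._since
        limit = HAZARDOUS_AFTER.get(combo)
        if limit is not None and duration >= limit:
            return "warn_driver"  # could escalate to emergency braking
        return "no_action"

ctrl = ControlMeans()
first = ctrl.update("face looking down", "body normal", now=0.0)
second = ctrl.update("face looking down", "body normal", now=2.5)
```

A real system would escalate through several signal levels (interior warning, exterior signaling, safe stop, emergency braking) as described above.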
With this monitoring system 10, the state of the driver or other occupant of a vehicle can be determined based on his body and face/head pose. The body and head poses are extracted from an image that is processed by multiple machine learning models. A first model detects body key points of the driver inside the image. A second model classifies these key points into a specific body pose. A CNN may be used and trained to output key point confidence maps and part affinity fields. In a postprocessing step, the confidence maps and part affinity fields can be used to calculate the body key point coordinates, preferably relative to the input image size. Further, the output of the key point model is key point data that includes the coordinates of certain body key points, for instance hands, elbows, shoulders, and also facial/head key points for nose, eyes, ears and the like. The body pose classification model takes all key point coordinates as input features. This may be a tree-based ML model, for instance XGBoost or a Random Forest, that can be trained to classify different body poses, for instance body normal, body leaning left or body leaning right.
Further, to detect the face/head pose, the facial and head key point coordinates from the output of the key point model are reused to calculate the region of the vehicle occupant's face. The calculated area is then cropped from the original image and used as the input to a CNN classifier. The CNN classifier can be trained to detect different face/head poses, for instance face looking up, face looking straight, face looking down or face looking sidewards. The body pose and the face/head pose can be combined to get a more comprehensive estimation of the driver state. This is done within the control means 32. For instance, if the body is leaning sidewards for a short period of time, but the face is still looking straight, it could still be considered a normal driving position. Therefore, the present approach reuses the output of the key point model to detect the face instead of applying a separate face detector, which saves time and other computing resources.
In order to improve in-cabin monitoring systems for vehicles, the invention proposes a computer-implemented method for determining a face/head pose class of a vehicle occupant from unlabeled image data (14). Initially, image data (14) that at least partially include the vehicle occupant are captured. Key point data (18) are extracted from the image data (14), wherein the key point data (18) include at least one key point, that is indicative of a specific body part of the vehicle occupant. From the key point data (18), a face/head region within the image data (14) is determined, wherein the face/head region includes a face portion and/or head portion of the vehicle occupant. The original image data (14) are cropped in order to obtain cropped image data (22) that only include the face/head region. A face/head pose class is determined by classifying only the cropped image data (22) into one output face/head pose class (26) of a predetermined set of face/head pose classes.
REFERENCE SIGNS
10 monitoring system
12 imaging device
14 image data
16 key point extraction means
18 key point data
20 image data cropping means
22 cropped image data
24 face/head pose classification means
26 output face/head pose class
28 body pose classification means
30 output body pose class
32 control means
Claims (15)
- CLAIMS1. A computer-implemented method for determining a face/head pose class of a vehicle occupant from unlabeled image data (14) that at least partially includes the vehicle occupant, the method comprising: a) capturing image data (14) that at least partially include the vehicle occupant; b) extracting key point data (18) from the image data (14), wherein the key point data (18) include at least one key point, that is indicative of a specific body part of the vehicle occupant, and key point coordinates for each key point; c) determining from the key point data (18) obtained in step b) a face/head region within the image data (14), wherein the face/head region includes a face portion and/or head portion of the vehicle occupant, and cropping the face/head region from the image data (14) in order to obtain cropped image data (22) that only include the face/head region; d) determining a face/head pose class by classifying the cropped image data (22) into one output face/head pose class (26) of a predetermined set of face/head pose classes.
- 2. The method according to claim 1, characterized in that the vehicle occupant is a driver of the vehicle.
- 3. The method according to any of the preceding claims, characterized in that in step b) the key point data (18) is extracted by a machine learning model that includes a convolutional neural network.
- 4. The method according to any of the preceding claims, characterized in that in step b) each extracted key point is labelled with a body part label chosen from a group comprising or consisting of hands, elbows, shoulders, nose, eyes, ears, mouth, chin.
- 5. The method according to any of the preceding claims, characterized in that in step d) the face/head pose class is determined by a machine learning model that includes an artificial neural network that is trained to determine a probability score for each face/head pose class of the predetermined set, and the face/head pose class with the highest probability is selected as the output face/head pose class (26).
- 6. The method according to any of the preceding claims, characterized by the step: e) determining a body pose class by classifying the key point data (18) into one output body pose class (30) of a predetermined set of body pose classes.
- 7. The method according to claim 6, characterized in that in step e) the body pose class is determined by a tree-based machine learning model that is configured to determine the output body pose class (30) based only on the key point data (18).
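Claim 7's body pose classifier operates only on the key point data, using a tree-based model. A toy stand-in with hand-written split rules illustrates the decision-tree idea; a real system would learn the tree (or an ensemble of trees) from labelled key point data, and every label, class name, and threshold below is invented for illustration:

```python
def body_pose_class(key_points):
    """Toy stand-in for the claimed tree-based body pose classifier.

    key_points: dict mapping a body-part label -> (x, y) pixel coordinates.
    The split rules mimic a depth-2 decision tree; in practice the rules
    and thresholds would be learned from training data, not hand-coded.
    """
    ls = key_points.get("left_shoulder")
    rs = key_points.get("right_shoulder")
    lw = key_points.get("left_wrist")
    if ls is None or rs is None:
        return "unknown"            # not enough key points to classify
    tilt = ls[1] - rs[1]            # vertical offset between the shoulders
    if abs(tilt) > 30:              # first split: strong shoulder tilt
        return "leaning"
    if lw is not None and lw[1] < min(ls[1], rs[1]):
        return "hand_raised"        # second split: wrist above shoulder line
    return "upright"
```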
- 8. An in-cabin monitoring method for monitoring at least one vehicle occupant within a vehicle, the method comprising:
a) performing a method according to any of the preceding claims;
b) evaluating the output face/head pose class (26) and optionally the output body pose class (30), and generating a control signal for the vehicle based on the evaluation of the respective output pose class.
- 9. The method according to claim 8, characterized in that in step b) the evaluation includes a time measurement of how long a specific output face/head pose class (26) and optionally output body pose class (30) are displayed by the vehicle occupant, and in that the control signal is also generated based on the time measurement.
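The time measurement of claim 9 amounts to tracking how long the current output pose class has persisted and gating the control signal on a dwell threshold. A minimal sketch, with an assumed threshold and assumed alert-worthy class names:

```python
class PoseDwellTimer:
    """Track how long the same output pose class has been displayed and
    raise a control signal once a dwell threshold is exceeded (claim 9).
    The 2-second threshold and the alert class names are assumptions."""

    def __init__(self, alert_after_s=2.0, alert_classes=("down", "away")):
        self.alert_after_s = alert_after_s
        self.alert_classes = set(alert_classes)
        self._current = None   # pose class currently being displayed
        self._since = None     # timestamp when that class first appeared

    def update(self, pose_class, timestamp_s):
        """Feed one classification result; return (dwell_time, alert)."""
        if pose_class != self._current:
            # Class changed: restart the time measurement.
            self._current = pose_class
            self._since = timestamp_s
        dwell = timestamp_s - self._since
        alert = pose_class in self.alert_classes and dwell >= self.alert_after_s
        return dwell, alert
```

A control unit would call `update()` once per classified frame and translate `alert=True` into the vehicle control signal (e.g. a driver warning).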
- 10. A computer program comprising instructions which, when the program is executed by a computer, cause the computer to carry out the method of any of the preceding claims.
- 11. A computer-readable medium having stored thereon the computer program according to claim 10, or a data carrier signal carrying the computer program according to claim 10.
- 12. A classification device that is configured for determining an output face/head pose class (26) of a vehicle occupant from unlabeled image data (14) that at least partially include the vehicle occupant, the device comprising:
a) an imaging device (12) configured for capturing image data (14) that at least partially include the vehicle occupant;
b) key point extraction means (16) that are configured for extracting key point data (18) from the image data (14), wherein the key point data (18) include at least one key point that is indicative of a specific body part of the vehicle occupant, and key point coordinates for each key point;
c) image data cropping means (20) that are configured for determining from the key point data (18) a face/head region within the image data (14), wherein the face/head region includes a face portion and/or head portion of the vehicle occupant, and for cropping the face/head region from the image data (14) in order to obtain cropped image data (22) that only include the face/head region;
d) face/head pose classification means (24) configured for determining a face/head pose class by classifying the cropped image data (22) into one output face/head pose class (26) of a predetermined set of face/head pose classes.
- 13. The device according to claim 12, characterized by e) body pose classification means (28) configured for determining a body pose class by classifying the key point data (18) into one output body pose class (30) of a predetermined set of body pose classes.
- 14. An in-cabin vehicle monitoring system (10) configured for monitoring at least one vehicle occupant within a vehicle, the system comprising:
a) means for performing a method according to any of claims 1 to 7;
b) control means (32) configured for evaluating the output face/head pose class (26) and optionally the output body pose class (30), and for generating a control signal for the vehicle based on the evaluation of the respective output pose class.
- 15. The system according to claim 14, characterized in that the control means (32) are configured for performing a time measurement of how long a specific output face/head pose class (26) and optionally output body pose class (30) are displayed by the vehicle occupant, and for generating the control signal also based on the time measurement.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB2212345.9A GB2621863A (en) | 2022-08-25 | 2022-08-25 | Pose classification and in-cabin monitoring methods and associated systems |
PCT/EP2023/068484 WO2024041790A1 (en) | 2022-08-25 | 2023-07-05 | Pose classification and in-cabin monitoring methods and associated systems |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB2212345.9A GB2621863A (en) | 2022-08-25 | 2022-08-25 | Pose classification and in-cabin monitoring methods and associated systems |
Publications (2)
Publication Number | Publication Date |
---|---|
GB202212345D0 GB202212345D0 (en) | 2022-10-12 |
GB2621863A true GB2621863A (en) | 2024-02-28 |
Family
ID=83931667
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
GB2212345.9A Pending GB2621863A (en) | 2022-08-25 | 2022-08-25 | Pose classification and in-cabin monitoring methods and associated systems |
Country Status (2)
Country | Link |
---|---|
GB (1) | GB2621863A (en) |
WO (1) | WO2024041790A1 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111259719A (en) * | 2019-10-28 | 2020-06-09 | 浙江零跑科技有限公司 | Cab scene analysis method based on multi-view infrared vision system |
US20200218883A1 (en) * | 2017-12-25 | 2020-07-09 | Beijing Sensetime Technology Development Co., Ltd. | Face pose analysis method, electronic device, and storage medium |
CN111616718A (en) * | 2020-07-30 | 2020-09-04 | 苏州清研微视电子科技有限公司 | Method and system for detecting fatigue state of driver based on attitude characteristics |
CN113128295A (en) * | 2019-12-31 | 2021-07-16 | 湖北亿咖通科技有限公司 | Method and device for identifying dangerous driving state of vehicle driver |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10089543B2 (en) | 2016-07-29 | 2018-10-02 | Honda Motor Co., Ltd. | System and method for detecting distraction and a downward vertical head pose in a vehicle |
US10713948B1 (en) * | 2019-01-31 | 2020-07-14 | StradVision, Inc. | Method and device for alerting abnormal driver situation detected by using humans' status recognition via V2V connection |
JP7415469B2 (en) * | 2019-11-15 | 2024-01-17 | 株式会社アイシン | Physique estimation device and posture estimation device |
- 2022-08-25: GB application GB2212345.9A, publication GB2621863A (en), status: active, Pending
- 2023-07-05: PCT application PCT/EP2023/068484, publication WO2024041790A1 (en), status: unknown
Also Published As
Publication number | Publication date |
---|---|
GB202212345D0 (en) | 2022-10-12 |
WO2024041790A1 (en) | 2024-02-29 |
Similar Documents
Publication | Title |
---|---|
CN108960065B (en) | Driving behavior detection method based on vision |
CN111439170B (en) | Child state detection method and device, electronic equipment and storage medium |
JP2021510225A | Behavior recognition method using video tube |
CN111434553B (en) | Brake system, method and device, and fatigue driving model training method and device |
Sathasivam et al. | Drowsiness detection system using eye aspect ratio technique |
Choi et al. | Driver drowsiness detection based on multimodal using fusion of visual-feature and bio-signal |
Garg | Drowsiness detection of a driver using conventional computer vision application |
Yan et al. | Recognizing driver inattention by convolutional neural networks |
CN114092922A | Driver emotion recognition and behavior intervention method based on specificity |
JP2020042785A | Method, apparatus, device and storage medium for identifying passenger state in unmanned vehicle |
CN115937830A | Special vehicle-oriented driver fatigue detection method |
Ribarić et al. | A neural-network-based system for monitoring driver fatigue |
Khan et al. | Real time eyes tracking and classification for driver fatigue detection |
CN116965781B (en) | Method and system for monitoring vital signs and driving behaviors of driver |
GB2621863A | Pose classification and in-cabin monitoring methods and associated systems |
Koesdwiady et al. | Driver inattention detection system: A PSO-based multiview classification approach |
Tarba et al. | The driver's attention level |
KR20210105141A | Method for monitering number of passengers in vehicle using camera |
Berri et al. | A 3D vision system for detecting use of mobile phones while driving |
Evstafev et al. | Controlling driver behaviour in ADAS with emotions recognition system |
Thanh et al. | A driver drowsiness and distraction warning system based on raspberry Pi 3 Kit |
Babu et al. | Comparative Analysis of Drowsiness Detection Using Deep Learning Techniques |
WO2022025088A1 | Vehicle safety support system |
Srivastava | Driver's drowsiness identification using eye aspect ratio with adaptive thresholding |
Vinodhini et al. | A behavioral approach to detect somnolence of CAB drivers using convolutional neural network |