CN112183220A - Driver fatigue detection method and system and computer storage medium - Google Patents
- Publication number
- CN112183220A (application CN202010918289.0A)
- Authority
- CN
- China
- Prior art keywords
- driver
- mouth
- feature points
- image
- face
- Prior art date
- Legal status (assumed, not a legal conclusion): Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/59—Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
- G06V20/597—Recognising the driver's state or behaviour, e.g. attention or drowsiness
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/166—Detection; Localisation; Normalisation using acquisition arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
- G06V40/171—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
Abstract
The invention relates to a driver fatigue detection method, a driver fatigue detection system, and a computer storage medium. The method comprises the following steps: periodically acquiring a current frame face image of the driver; detecting the current frame face image with a preset progressive calibration network to locate the feature points in it; determining left-eye, right-eye, and mouth region images according to the feature-point positioning result, extracting target feature points in those region images, and sparsely representing the target feature points to obtain sparse feature vectors; processing the sparse feature vectors with a pre-trained neural network classification model to output eye and mouth classification results for the current frame; and judging whether the driver is fatigued according to the eye and mouth classification results of consecutive multi-frame face images of the driver within one time sequence. The invention improves both the accuracy and the real-time performance of driver fatigue detection.
Description
Technical Field
The invention relates to the technical field of safe driving of automobiles, in particular to a method and a system for detecting fatigue of a driver and a computer storage medium.
Background
Existing driver fatigue detection methods mainly judge whether a driver is driving while fatigued from facial information, particularly eye image information. During actual driving, however, the lighting conditions inside the vehicle and the driver's head pose are complex and changeable. Conventional face detection and feature point positioning methods cannot adapt to these conditions when applied to in-vehicle driver face recognition, so their detection accuracy and real-time performance still need improvement.
Disclosure of Invention
The invention aims to provide a driver fatigue detection method, a system and a computer storage medium thereof, so as to improve the accuracy and the real-time performance of the driver fatigue detection.
To achieve the above object, according to a first aspect, an embodiment of the present invention proposes a driver fatigue detection method, including:
step S1, periodically acquiring a current frame face image of the driver;
step S2, detecting the current frame face image by using a preset progressive calibration network so as to position the feature points in the current frame face image; wherein the feature points comprise left and right eye feature points and mouth feature points;
step S3, determining a left eye region image, a right eye region image and a mouth region image according to the positioning result of the feature points, extracting target feature points in the left eye region image, the right eye region image and the mouth region image, and performing sparse representation on the target feature points to obtain sparse feature vectors;
step S4, processing the sparse feature vector by using a pre-trained neural network classification model, and outputting an eye classification result and a mouth classification result of the current frame face image;
step S5, judging whether the driver is tired according to the eye classification result and the mouth classification result of the continuous multiframe human face images of the driver in a time sequence; wherein, a plurality of frames of face images of the driver are periodically acquired in one time sequence.
Optionally, the step S1 includes:
step S11, acquiring a current frame original image output by a vehicle camera;
step S12, reducing the size of the original image of the current frame according to a preset proportion;
step S13, reducing the size of the largest face in the original image of the current frame according to a preset proportion;
step S14, determining a face search area in the current frame original image from the face position in the previous frame original image, and performing face detection on that search area with a sliding window to obtain the current frame face image.
Optionally, the step S2 includes:
step S21, carrying out image transformation on the current frame face image obtained in step S1 according to the following formula;
Ri(x,y) = log[Ii(x,y)/Li(x,y)] = log Ii(x,y) − log[F(x,y) * Ii(x,y)]
where Ii(x,y) is the i-th color component of the current frame face image, Ri(x,y) is the reflected-light information of the i-th color component, Li(x,y) is the illumination information of the i-th color component, "*" denotes the convolution operation, F(x,y) is the surround function with scale factor σ, and K is a normalization constant such that F(x,y) satisfies ∬F(x,y)dxdy = 1;
step S22, detecting the current frame face image transformed in step S21 with the preset progressive calibration network, so as to locate the feature points in the current frame face image.
Optionally, the step S3 includes:
s31, constructing a target area feature point set according to the positioning result of the feature points;
step S32, according to the target region feature point set, defining and describing the relative coordinate position relationship between each point of the left and right eye target regions and the mouth target region on a face plane coordinate system, and determining the peripheral rectangular frame with the minimum area of the left and right eye target regions and the mouth target region according to the relative coordinate position relationship;
step S33, performing inclination angle correction on the peripheral rectangular frame with the minimum area of the left and right eye target areas and the mouth target area, and acquiring left and right eye area images and mouth area images according to the peripheral rectangular frame with the minimum area of the left and right eye target areas and the mouth target area after inclination angle correction;
step S34, extracting the target feature points in the left-eye, right-eye, and mouth region images, and sparsely representing the target feature points to obtain sparse feature vectors.
Optionally, the step S31 includes:
if feature points are missing from the current frame face image, supplementing the missing eye and mouth region feature points according to a pre-constructed "three-section, five-eye" (三庭五眼) facial-proportion digital model of the driver; the model comprises image information of the driver's face viewed frontally and at various yaw and pitch angles;
wherein:
the points of the left and right eye target areas comprise: the corners (canthi) of the left and right eyes, the centers of the left and right eyeballs, and the centers of the upper and lower eyelid contours of both eyes;
the points of the mouth target area comprise: the left and right corners of the mouth, the center of the upper lip contour, and the center of the lower lip contour.
Optionally, wherein:
the eye classification result includes: both eyes closed, both eyes slightly closed and both eyes open;
the mouth classification result includes: mouth closed, mouth half open, and mouth wide open as in yawning.
Optionally, the neural network classification model includes 3 convolutional layers, 3 pooling layers connected to the 3 convolutional layers in a one-to-one correspondence, a first fully-connected layer connected to the 3 pooling layers, and a second fully-connected layer connected to the first fully-connected layer; the 3 convolutional layers are respectively used for carrying out convolution processing on the left eye region image, the right eye region image and the mouth region image to obtain a feature vector which adopts sparse representation, the 3 pooling layers are respectively used for pooling convolution results of the 3 convolutional layers, and the first full-connection layer and the second full-connection layer are used for classifying outputs of the 3 pooling layers to obtain a classification result.
Optionally, the step S5 includes:
calculating a value f = Nc/Nt × 100% from the eye classification results of consecutive multi-frame face images of the driver in one time sequence, and judging whether the driver's eyes are closed by comparing f with a preset threshold; here Nc is the number of frames in which the driver's eyes are in the closed state within the period, and Nt is the number of frames within the period in which the driver's eye state was validly recognized;
determining the duration of the opening of the mouth of the driver according to the mouth classification result of continuous multiframe face images of the driver in a time sequence, and judging that the yawning of the driver is performed when the duration is greater than or equal to the preset duration and the relative position relation of the coordinates of the feature points of the mouth in the yawning state is met.
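The eye-state criterion above is a PERCLOS-style ratio. A minimal sketch, assuming simple string labels for the eye states and an illustrative 40% threshold (the patent does not fix a numeric threshold):

```python
def perclos(eye_states, closed_label="closed"):
    """f = Nc / Nt * 100%: share of closed-eye frames among the validly
    recognised frames in one time sequence (None marks invalid frames)."""
    valid = [s for s in eye_states if s is not None]   # Nt: valid frames
    if not valid:
        return 0.0
    nc = sum(1 for s in valid if s == closed_label)    # Nc: closed frames
    return nc / len(valid) * 100.0

def eyes_fatigued(eye_states, threshold=40.0):
    # threshold is an assumed illustrative value, not from the patent
    return perclos(eye_states) >= threshold
```

The mouth criterion would be analogous: count consecutive "mouth wide open" frames against a preset duration before declaring a yawn.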
In a second aspect, an embodiment of the present invention provides a driver fatigue detection system for performing the driver fatigue detection method of the first aspect, the system including:
the image acquisition unit is used for periodically acquiring a current frame face image of the driver;
the characteristic point detection unit is used for detecting the current frame face image by using a preset progressive calibration network so as to position the characteristic points in the current frame face image; wherein the feature points comprise left and right eye feature points and mouth feature points;
the region image acquisition unit is used for determining a left eye region image, a right eye region image and a mouth region image according to the positioning result of the feature points, extracting target feature points in the left eye region image, the right eye region image and the mouth region image, and performing sparse representation on the target feature points to obtain sparse feature vectors;
the classification unit is used for processing the sparse feature vector by utilizing a pre-trained neural network classification model and outputting an eye classification result and a mouth classification result of the current frame face image;
a fatigue determination unit for determining whether the driver is tired based on the eye classification result and the mouth classification result of the continuous multi-frame face images of the driver in one time sequence; wherein, a plurality of frames of face images of the driver are periodically acquired in one time sequence.
According to a third aspect, an embodiment of the invention proposes a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the driver fatigue detection method according to the first aspect.
The embodiments of the invention provide a driver fatigue detection method, a corresponding system, and a computer storage medium. Multi-frame face images of the driver within one time sequence, acquired by a camera, are detected by a cascade of several convolutional neural networks to locate the feature points. Left-eye, right-eye, and mouth target region images are then extracted from the images according to the positioning result. A neural network classification model extracts sparse-representation-based feature vectors from those region images and classifies them, and finally whether the driver is fatigued is judged from the classification results of the multi-frame images in the time sequence. Cascaded multi-network detection greatly accelerates image processing, and the sparse feature vectors allow the method to adapt to the complex and changeable in-vehicle illumination and driver head poses encountered during actual driving, improving both the accuracy and the real-time performance of driver fatigue detection compared with the prior art.
Additional features and advantages of the invention will be set forth in the description which follows.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in their description are briefly introduced below. The drawings described below show only some embodiments of the invention; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart illustrating a method for detecting fatigue of a driver according to an embodiment of the present invention.
Fig. 2 is an original face image in the non-ideal environment in the present embodiment.
Fig. 3 is the face image of Fig. 2 after the processing of step S2.
Fig. 4 is a schematic diagram of the principle of the progressive calibration network (PCN) in this embodiment.
Fig. 5 is a schematic structural diagram of a neural network classification model in this embodiment.
FIG. 6 is a block diagram of a driver fatigue detection system according to another embodiment of the present invention.
Detailed Description
Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In addition, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present invention. It will be understood by those skilled in the art that the present invention may be practiced without some of these specific details. In some instances, well known means have not been described in detail so as not to obscure the present invention.
Referring to fig. 1, an embodiment of the present invention provides a method for detecting driver fatigue, including:
step S1, periodically acquiring a current frame face image of the driver;
step S2, detecting the current frame face image by using a preset progressive calibration network so as to position the feature points in the current frame face image; wherein the feature points comprise left and right eye feature points and mouth feature points; specifically, localization refers to marking feature points in an image;
step S3, determining a left eye region image, a right eye region image and a mouth region image according to the positioning result of the feature points, extracting target feature points in the left eye region image, the right eye region image and the mouth region image, and performing sparse representation on the target feature points to obtain sparse feature vectors;
step S4, processing the sparse feature vector by using a pre-trained neural network classification model, and outputting an eye classification result and a mouth classification result of the current frame face image;
step S5, judging whether the driver is tired according to the eye classification result and the mouth classification result of the continuous multiframe human face images of the driver in a time sequence; wherein, a plurality of frames of face images of the driver are periodically acquired in one time sequence.
In the method of this embodiment, multi-frame face images of the driver within one time sequence, acquired by a camera, are detected by a cascade of several convolutional neural networks to locate the feature points. Left-eye, right-eye, and mouth target region images are then extracted according to the positioning result; a neural network classification model extracts sparse-representation-based feature vectors from those region images and classifies them; and finally whether the driver is fatigued is judged from the classification results of the multi-frame images in the time sequence. Cascaded multi-network detection greatly accelerates image processing, and the sparse feature vectors allow the method to adapt to the complex and changeable in-vehicle illumination and driver head poses encountered in actual driving, improving both the accuracy and the real-time performance of driver fatigue detection compared with the prior art.
Optionally, step S1 in this embodiment includes:
step S11, acquiring a current frame original image output by a vehicle camera; where the original image includes the driver's face and other factors in the vehicle, such as other "irrelevant people".
Step S12, reducing the size of the current frame original image according to a preset ratio; this lowers the number of image pyramid levels and accelerates image processing in the initial state.
Step S13, reducing the size of the largest face in the original image of the current frame according to a preset proportion;
specifically, the purpose of reducing the maximum face size is to reduce the size of the face image of the driver, and facilitate the use of a sliding window for rapid detection. The reason is that when there are many possible persons in the current vehicle, the acquired face image may not be the face image of the driver, and the "driver face image" acquired in step S11 may also include "irrelevant persons" who enter the camera view at the back, so that the driver is the image with the largest face area, and after the position of the real driver face image is located, the face image in the area needs to be scaled down, which facilitates the sliding window search for the face features in step S14.
Step S14, determining a face search area in the current frame original image from the face position in the previous frame original image, and performing face detection on that search area with a sliding window to obtain the current frame face image. Specifically, the face region of the previous frame original image can be suitably enlarged and used as the sliding-window search area of the current frame; this simplifies the detection area, narrows the feature extraction range, and further accelerates feature extraction.
Illustratively, the face frame of the previous frame original image is enlarged 1.3 times horizontally and 1.5 times vertically to serve as the face search range of the current frame original image.
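The previous-frame enlargement can be sketched as follows. The 1.3×/1.5× factors come from the example above; the (x, y, w, h) box format and the clipping to image bounds are assumptions:

```python
def expand_face_box(box, img_w, img_h, kx=1.3, ky=1.5):
    """Enlarge the previous-frame face box (x, y, w, h) about its centre to
    form the current-frame sliding-window search region, clipped to the image."""
    x, y, w, h = box
    cx, cy = x + w / 2, y + h / 2          # box centre stays fixed
    nw, nh = w * kx, h * ky                # horizontal / vertical enlargement
    nx, ny = max(0.0, cx - nw / 2), max(0.0, cy - nh / 2)
    return (nx, ny, min(nw, img_w - nx), min(nh, img_h - ny))
```

Restricting the sliding window to this region is what narrows the feature extraction range between consecutive frames.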
Optionally, step S2 in this embodiment includes:
step S21, carrying out image transformation on the current frame face image obtained in step S1 according to the following formula;
let the acquired face image be represented as I(x,y) = L(x,y) · R(x,y);
where R(x,y) is the reflected-light (reflectance) information of the face and L(x,y) is the illumination information;
to reduce the computational load of the image and accelerate face image preprocessing, the face image is transformed as follows:
log I(x,y) = log L(x,y) + log R(x,y);
since R(x,y) and L(x,y) carry different information and vary at clearly different spatial frequencies, based on Retinex theory, the face output of the i-th color component is:
Ri(x,y) = log[Ii(x,y)/Li(x,y)] = log Ii(x,y) − log[F(x,y) * Ii(x,y)]
where Ii(x,y) is the i-th color component of the current frame face image, Ri(x,y) is the reflected-light information of the i-th color component, Li(x,y) is the illumination information of the i-th color component, "*" denotes the convolution operation, F(x,y) is the surround function with scale factor σ, and K is a normalization constant such that F(x,y) satisfies ∬F(x,y)dxdy = 1.
Fig. 2 is an original face image in a non-ideal environment, and fig. 3 is a face image processed in step S2.
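A minimal single-scale Retinex sketch of the transform above. The patent leaves F(x,y) implicit, so the choice of a normalised Gaussian surround and the σ value are assumptions:

```python
import numpy as np

def gaussian_kernel_1d(sigma, radius=None):
    radius = radius or int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()   # normalisation plays the role of K: the kernel sums to 1

def single_scale_retinex(channel, sigma=30.0):
    """R_i = log I_i - log(F * I_i) for one colour component; F * I_i is
    computed as a separable Gaussian blur (rows then columns)."""
    k = gaussian_kernel_1d(sigma)
    blurred = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 0, channel)
    blurred = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, blurred)
    eps = 1e-6   # avoid log(0)
    return np.log(channel + eps) - np.log(blurred + eps)
```

On a uniform image the output is zero away from the borders, since subtracting the log of the local surround removes the (smooth) illumination component and keeps only reflectance detail.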
Step S22, detecting the current frame face image transformed in step S21 with the preset progressive calibration network, so as to locate the feature points in the current frame face image.
Specifically, the head pose of a non-frontal face reduces the accuracy of the face detection algorithm, so the key points are positioned inaccurately and the regions of interest cannot be extracted precisely. To address this problem, this step detects the face and positions the feature points with a progressive calibration network (PCN).
In this embodiment, the PCN cascades 3 CNNs to predict the face bounding box and face rotation angle from coarse to fine. As shown in fig. 4, PCN-1 performs a binary classification of the face angle while predicting the face bounding box, turning faces in [−180°, 180°] into [−90°, 90°]. Similarly, PCN-2 performs a three-way classification of the face angle, limiting its range to [−45°, 45°]. PCN-3 predicts the precise angle by angle-deviation regression. The final face offset angle is the sum of the angles predicted by the 3 networks: Θ_RIP = θ1 + θ2 + θ3. The resulting offset angle is used to rotate the deflected face back to the frontal pose, solving the problem that a non-frontal head pose degrades face detection accuracy, key-point positioning, and region-of-interest extraction.
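The coarse-to-fine decomposition can be simulated as below. This is a toy sketch with ideal (oracle) stage decisions derived from a known ground-truth angle; the real PCN-1/2/3 predict each component from image crops:

```python
def pcn_angle_estimate(true_rip):
    """Simulate Theta_RIP = theta1 + theta2 + theta3 with perfect stages."""
    # Stage 1 (PCN-1): binary decision -- rotate upside-down faces by 180
    # degrees, bringing the in-plane angle into [-90, 90].
    theta1 = 180.0 if abs(true_rip) > 90 else 0.0
    r = (true_rip + theta1 + 180) % 360 - 180   # remaining angle, wrapped
    # Stage 2 (PCN-2): three-way classification into {-90, 0, 90},
    # narrowing the residual to [-45, 45].
    theta2 = min((-90.0, 0.0, 90.0), key=lambda c: abs(r - c))
    r -= theta2
    # Stage 3 (PCN-3): fine regression of the remaining deviation.
    theta3 = r
    return theta1 + theta2 + theta3
```

Modulo 360°, the sum of the three per-stage angles recovers the original face offset, which is then used to rotate the face to the frontal pose.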
Optionally, step S3 in this embodiment includes:
s31, constructing a target area feature point set according to the positioning result of the feature points;
step S32, according to the target region feature point set, defining and describing the relative coordinate position relationship between each point of the left and right eye target regions and the mouth target region on a face plane coordinate system, and determining the peripheral rectangular frame with the minimum area of the left and right eye target regions and the mouth target region according to the relative coordinate position relationship;
step S33, performing inclination angle correction on the peripheral rectangular frame with the minimum area of the left and right eye target areas and the mouth target area, and acquiring left and right eye area images and mouth area images according to the peripheral rectangular frame with the minimum area of the left and right eye target areas and the mouth target area after inclination angle correction;
Step S34, extracting the target feature points in the left-eye, right-eye, and mouth region images, and sparsely representing the target feature points to obtain sparse feature vectors.
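A simplified sketch of steps S32 and S33: estimate the region's tilt from its feature points and take the bounding box in the de-tilted frame. A production version would fit a true minimum-area rectangle (e.g. rotating calipers); the point ordering and angle convention here are assumptions:

```python
import math

def tilt_corrected_bbox(points):
    """Approximate the minimum-area peripheral rectangle of a feature-point
    set: rotate the points so the first-to-last axis (e.g. the eye-corner
    line) is horizontal, then take the axis-aligned bounding box."""
    (x0, y0), (x1, y1) = points[0], points[-1]
    angle = math.atan2(y1 - y0, x1 - x0)          # tilt of the region
    c, s = math.cos(-angle), math.sin(-angle)     # inverse rotation
    rot = [(x * c - y * s, x * s + y * c) for x, y in points]
    xs, ys = [p[0] for p in rot], [p[1] for p in rot]
    return angle, (min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys))
```

The returned angle is the tilt correction applied to the rectangle before cropping the eye or mouth region image.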
Optionally, step S31 in this embodiment includes:
if feature points are missing from the current frame face image, supplementing the missing eye and mouth region feature points according to the pre-constructed "three-section, five-eye" facial-proportion digital model of the driver;
the model is constructed as follows: face images of the driver are acquired frontally and at different yaw angles, pitch angles, and illumination conditions, and a "three-section, five-eye" digital model of the driver's face, frontal and at the various yaw and pitch angles, is built from them.
It should be noted that each driver has his or her own "three-section, five-eye" digital model. By combining this pre-constructed model with the subset of feature points that were detected and positioned, missing feature points can be supplemented by simulation when the driver's head is strongly deflected or pitched, or when few feature points are visible due to partial occlusion, which greatly improves recognition accuracy.
Based on this, the embodiment realizes accurate detection of the driver's face and accurate and rapid positioning of the feature points under the conditions of different illumination conditions, large-amplitude human face deflection, human face pitching and partial shielding in a complex in-vehicle specific environment, and improves the processing speed while ensuring the accuracy.
Wherein:
the points of the left and right eye target areas comprise: the corners (canthi) of the left and right eyes, the centers of the left and right eyeballs, and the centers of the upper and lower eyelid contours of both eyes;
the points of the mouth target area comprise: the left and right corners of the mouth, the center of the upper lip contour, and the center of the lower lip contour.
Optionally, in this embodiment:
the eye classification result includes: both eyes closed, both eyes slightly closed and both eyes open;
the mouth classification result includes: mouth closed, mouth half open, and mouth wide open as in yawning.
Optionally, referring to fig. 5, in this embodiment, the neural network classification model includes 3 convolutional layers, 3 pooling layers connected to the 3 convolutional layers in a one-to-one correspondence, a first fully-connected layer connected to the 3 pooling layers, and a second fully-connected layer connected to the first fully-connected layer; the 3 convolutional layers are respectively used for carrying out convolution processing on the left eye region image, the right eye region image and the mouth region image to obtain a feature vector which adopts sparse representation, the 3 pooling layers are respectively used for pooling convolution results of the 3 convolutional layers, and the first full-connection layer and the second full-connection layer are used for classifying outputs of the 3 pooling layers to obtain a classification result.
Specifically, of the 3 convolutional layers, layers 1 and 2 contain 32 convolution kernels each and layer 3 contains 64, with a kernel size of 8. Of the 3 pooling layers, pooling layer 1 uses max pooling, taking the maximum value in each local receptive field, while pooling layers 2 and 3 use average pooling, taking the mean value in each local receptive field. The first and second fully-connected layers contain 64 and 2 neurons respectively, where the number of neurons in the second fully-connected layer equals the number of output categories, namely the eye classification result and the mouth classification result.
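The two pooling variants named above can be sketched directly. The 2×2 window with stride 2 is an assumption; the patent does not state the pooling window size.

```python
# Sketch of the pooling step: layer 1 keeps the maximum of each local
# receptive field, layers 2 and 3 keep the average. A 2x2 window with
# stride 2 is assumed here; the patent does not specify it.

def pool2d(feature_map, window=2, mode="max"):
    """Pool a 2-D feature map (list of lists) with a square window."""
    h, w = len(feature_map), len(feature_map[0])
    out = []
    for i in range(0, h - window + 1, window):
        row = []
        for j in range(0, w - window + 1, window):
            patch = [feature_map[i + di][j + dj]
                     for di in range(window) for dj in range(window)]
            row.append(max(patch) if mode == "max" else sum(patch) / len(patch))
        out.append(row)
    return out

fm = [[1.0, 2.0, 5.0, 6.0],
      [3.0, 4.0, 7.0, 8.0],
      [0.0, 0.0, 1.0, 1.0],
      [0.0, 4.0, 1.0, 1.0]]
print(pool2d(fm, mode="max"))   # [[4.0, 8.0], [4.0, 1.0]]
print(pool2d(fm, mode="avg"))   # [[2.5, 6.5], [1.0, 1.0]]
```

Max pooling preserves the strongest local response (useful early, when edges matter most), while average pooling smooths the deeper feature maps, which matches the layer ordering described in the embodiment.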
Illustratively, during training, feature vectors are constructed by sparse representation: for classifying faces processed according to Retinex theory, features that describe the face image category are extracted. First, the face is partitioned, i.e., the face image is divided into regions according to the three-section, five-eye proportions, and the eye feature points and mouth feature points are extracted from the middle and lower sections of the face respectively. Then a sparse representation algorithm extracts features from each face sub-block, yielding the classification feature set used for face recognition.
Specifically, a sampling operation is performed on a face sub-block Bb, i.e., the eye feature points or mouth feature points in Bb are collected, and all sampling results are combined into a vector vb ∈ R^d (d = m × n). The vectors of all sampling results form the feature vector of the sub-block, so that the face image can be sparsely represented.
For all face image training samples, the feature dictionary formed by the same face sub-blocks is specifically as follows:
Ab = [v1,1, v1,2, …, v1,n1, v2,1, …, v2,n2, …, vk,1, …, vk,nk] ∈ R^(d×N);
where vi,j is the sparse-representation feature of sub-block Bb of the j-th face in the i-th class, N is the number of face image training samples, k is the number of face classes, and d is the feature dimension.
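Assembling the dictionary Ab can be sketched as follows. The sub-block contents are synthetic; in the patent each column would be the sampled eye or mouth feature points of one training face.

```python
# Sketch: assemble the feature dictionary Ab for one face sub-block Bb.
# Each training face contributes one column v in R^d (d = m*n), ordered
# class by class as in the formula above. The data here is synthetic.

def sample_subblock(block):
    """Flatten an m x n sub-block of sampled feature values into a vector."""
    return [value for row in block for value in row]

def build_dictionary(classes):
    """classes: one entry per face class, each a list of m x n sub-blocks.
    Returns the dictionary columns and the class label of each column."""
    columns, labels = [], []
    for class_id, blocks in enumerate(classes):
        for block in blocks:
            columns.append(sample_subblock(block))
            labels.append(class_id)
    return columns, labels

class0 = [[[0.1, 0.2], [0.3, 0.4]]]                             # n1 = 1 sample
class1 = [[[0.9, 0.8], [0.7, 0.6]], [[0.5, 0.5], [0.5, 0.5]]]   # n2 = 2 samples
Ab, labels = build_dictionary([class0, class1])
print(len(Ab), len(Ab[0]), labels)  # 3 columns of dimension 4, labels [0, 1, 1]
```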
Let yb denote the feature vector of sub-block Bb in a test face image y; its sparse representation is obtained by solving:
min ||x||0  s.t.  yb = Ab·x;
where ||·||0 denotes the l0 norm.
Since the l0 problem is hard to solve directly, it is converted to the l1-norm problem:
min ||x||1  s.t.  yb = Ab·x;
After the solution x is obtained, the residual with respect to the i-th face class is calculated:
ri(yb) = ||yb − Ab·δi(x)||2;
where δi(x) ∈ R^N is the vector obtained from x by keeping the coefficients associated with class i and setting all others to zero.
On this basis, the approximate feature vector of the test face can be obtained from the features of the training samples.
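The residual rule above can be sketched with a toy dictionary. The sparse code x is assumed to have been found already (e.g. by an l1 solver); solving the minimization itself is outside this sketch, and the data is synthetic.

```python
import math

# Sketch of the class-residual rule r_i(yb) = ||yb - Ab * delta_i(x)||_2:
# delta_i(x) keeps only the coefficients of x whose dictionary columns
# belong to class i, and the class with the smallest residual wins.
# Ab is stored row-major here (d rows, N columns).

def residual(yb, Ab, labels, x, class_id):
    d = len(yb)
    recon = [0.0] * d
    for col, (label, coef) in enumerate(zip(labels, x)):
        if label == class_id and coef != 0.0:
            for row in range(d):
                recon[row] += coef * Ab[row][col]
    return math.sqrt(sum((yb[r] - recon[r]) ** 2 for r in range(d)))

def classify(yb, Ab, labels, x):
    return min(set(labels), key=lambda c: residual(yb, Ab, labels, x, c))

# Two-column toy dictionary: column 0 belongs to class 0, column 1 to class 1.
Ab = [[1.0, 0.0],
      [0.0, 1.0]]
labels = [0, 1]
yb = [0.0, 2.0]
x = [0.0, 2.0]   # sparse code reconstructing yb from class 1 only
print(classify(yb, Ab, labels, x))  # 1
```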
The feature vectors and the corresponding face classes are then labeled, and the training and test sample sets for face recognition are combined, with the feature vectors serving as the input of the classification model and the face classes as its output.
Optionally, step S5 in this embodiment includes:
step S51, calculating a value f from the eye classification results of consecutive frames of the driver's face images within one time sequence, using the formula f = Nc/Nt × 100%, and judging whether the driver's eyes are closed by comparing f with a preset threshold; wherein Nc is the number of frames in which the driver's eyes are in the closed state during the period, and Nt is the number of valid eye-state recognition frames during the period;
preferably, the preset threshold value in this embodiment is 0.5.
Specifically, in a given frame, if both eyes are recognized as closed, the frame counts as a valid closed-eye frame; if the two eyes give inconsistent results, the frame counts as an intermediate valid frame; if both eyes are recognized as open, the frame counts as a valid open-eye frame.
The number of valid recognition frames Nt is therefore the sum of valid closed-eye frames, intermediate valid frames, and valid open-eye frames.
Step S52, determining how long the driver's mouth remains open from the mouth classification results of consecutive frames within one time sequence; when this duration is greater than or equal to a preset duration and the relative coordinate positions of the mouth feature points match those of the yawning state, the driver is judged to be yawning.
Preferably, in this embodiment, the preset time period is 2 seconds.
Preferably, in this embodiment, one time sequence of valid eye-state recognition frames is defined as 3 seconds.
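Step S52 can be sketched as a consecutive-run check over the per-frame mouth states. The frame rate and state labels are illustrative assumptions; the feature-point geometry check mentioned above is omitted here.

```python
# Sketch of step S52: flag a yawn when the mouth stays in the wide-open
# state for at least the preset duration (2 s in this embodiment).
# The frame rate and the state labels are illustrative assumptions.

def detect_yawn(mouth_states, fps, min_duration_s=2.0):
    """mouth_states: per-frame labels, e.g. 'closed', 'half', 'wide'."""
    needed = int(min_duration_s * fps)  # frames required for a yawn
    run = 0
    for state in mouth_states:
        run = run + 1 if state == "wide" else 0
        if run >= needed:
            return True
    return False

fps = 10
states = ["closed"] * 5 + ["wide"] * 25 + ["half"] * 5   # 2.5 s wide open
print(detect_yawn(states, fps))          # True
print(detect_yawn(["wide"] * 10, fps))   # False: only 1 s wide open
```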
Referring to fig. 6, another embodiment of the present invention provides a driver fatigue detection system for performing the driver fatigue detection method according to the above embodiment, the system including:
the image acquisition unit 1 is used for periodically acquiring the current frame face image of the driver;
the feature point detection unit 2 is configured to detect the current frame face image by using a preset progressive calibration network, so as to locate feature points in the current frame face image; wherein the feature points comprise left and right eye feature points and mouth feature points;
the region image acquisition unit 3 is configured to determine left and right eye region images and mouth region images according to the positioning result of the feature points, extract target feature points in the left and right eye region images and mouth region images, and perform sparse representation on the target feature points to obtain sparse feature vectors;
the classification unit 4 is used for processing the sparse feature vector by using a pre-trained neural network classification model and outputting an eye classification result and a mouth classification result of the current frame face image;
a fatigue determination unit 5 for determining whether the driver is fatigued based on the eye classification results and mouth classification results of consecutive frames of the driver's face images within one time sequence; wherein multiple frames of the driver's face image are acquired periodically within one time sequence.
The above-described system embodiments are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
It should be noted that the system described in the foregoing embodiment corresponds to the method described in the foregoing embodiment, and therefore, portions of the system described in the foregoing embodiment that are not described in detail can be obtained by referring to the content of the method described in the foregoing embodiment, and details are not described here.
Also, the driver fatigue detection system according to the above-described embodiment, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in a computer-readable storage medium.
Illustratively, the computer-readable storage medium may include any entity or device capable of carrying the computer program code, such as a recording medium, USB flash drive, removable hard disk, magnetic disk, optical disk, computer memory, read-only memory (ROM), random access memory (RAM), electrical carrier signal, telecommunications signal, or software distribution medium.
Another embodiment of the present invention also proposes a computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the driver fatigue detection method of the above-mentioned embodiment.
Specifically, the computer-readable storage medium may include any entity or device capable of carrying the computer program code, such as a recording medium, USB flash drive, removable hard disk, magnetic disk, optical disk, computer memory, read-only memory (ROM), random access memory (RAM), electrical carrier signal, telecommunications signal, or software distribution medium.
Having described embodiments of the present invention, the foregoing description is intended to be exemplary rather than exhaustive, and is not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or improvements over technologies in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Claims (10)
1. A driver fatigue detection method, characterized by comprising:
step S1, periodically acquiring a current frame face image of the driver;
step S2, detecting the current frame face image by using a preset progressive calibration network so as to position the feature points in the current frame face image; wherein the feature points comprise left and right eye feature points and mouth feature points;
step S3, determining a left eye region image, a right eye region image and a mouth region image according to the positioning result of the feature points, extracting target feature points in the left eye region image, the right eye region image and the mouth region image, and performing sparse representation on the target feature points to obtain sparse feature vectors;
step S4, processing the sparse feature vector by using a pre-trained neural network classification model, and outputting an eye classification result and a mouth classification result of the current frame face image;
step S5, judging whether the driver is tired according to the eye classification result and the mouth classification result of the continuous multiframe human face images of the driver in a time sequence; wherein, a plurality of frames of face images of the driver are periodically acquired in one time sequence.
2. The driver fatigue detection method according to claim 1, wherein the step S1 includes:
step S11, acquiring a current frame original image output by a vehicle camera;
step S12, reducing the size of the original image of the current frame according to a preset proportion;
step S13, reducing the size of the largest face in the original image of the current frame according to a preset proportion;
and step S14, determining a face search area in the current frame original image by using the face position in the previous frame original image, and performing face detection on the face search area by using a sliding window to obtain the current frame face image.
3. The driver fatigue detection method according to claim 1, wherein the step S2 includes:
step S21, carrying out image transformation on the current frame face image obtained in step S1 according to the following formula;
Ri(x,y)=log[Ii(x,y)/Li(x,y)]=log Ii(x,y)-log[F(x,y)*Ii(x,y)]
where Ii(x, y) is the i-th color component of the current frame face image, Ri(x, y) is the reflected-light information of the i-th color component, Li(x, y) is the illumination information of the i-th color component, * denotes the convolution operation, σ is the scale factor of the surround function F(x, y), and K is a constant chosen so that F(x, y) satisfies ∫∫ F(x, y) dx dy = 1;
and step S22, detecting the current frame face image transformed in the step S21 by using a preset progressive calibration network so as to position the feature points in the current frame face image.
4. The driver fatigue detection method according to claim 1, wherein the step S3 includes:
s31, constructing a target area feature point set according to the positioning result of the feature points;
step S32, according to the target region feature point set, defining and describing the relative coordinate position relationship between each point of the left and right eye target regions and the mouth target region on a face plane coordinate system, and determining the peripheral rectangular frame with the minimum area of the left and right eye target regions and the mouth target region according to the relative coordinate position relationship;
step S33, performing inclination angle correction on the peripheral rectangular frame with the minimum area of the left and right eye target areas and the mouth target area, and acquiring left and right eye area images and mouth area images according to the peripheral rectangular frame with the minimum area of the left and right eye target areas and the mouth target area after inclination angle correction;
and step S34, extracting target feature points in the left eye region image, the right eye region image and the mouth region image, and performing sparse representation on the target feature points to obtain sparse feature vectors.
5. The driver fatigue detection method according to claim 4, wherein the step S31 includes:
if feature points are missing from the current frame face image, supplementing the missing feature points of the eye and mouth regions according to a pre-constructed three-section, five-eye digital model of the driver; wherein the three-section, five-eye digital model comprises image information of the driver's face in the frontal pose and at various deflection and pitch angles;
wherein:
the points of the left and right eye target areas include: the canthi of the left and right eyes, the centers of the left and right eyeballs, and the centers of the upper and lower eyelid contours of the left and right eyes;
the various points of the mouth target area include: left and right corners of the mouth, center of contour of the upper lip, and center of contour of the lower lip.
6. The driver fatigue detection method according to claim 1, wherein:
the eye classification result includes: both eyes closed, both eyes slightly closed and both eyes open;
the mouth classification result includes: mouth closed, mouth half open, and mouth wide open as when yawning.
7. The driver fatigue detection method according to claim 6, wherein the neural network classification model includes 3 convolutional layers, 3 pooling layers connected in one-to-one correspondence with the 3 convolutional layers, a first fully-connected layer connected to the 3 pooling layers, and a second fully-connected layer connected to the first fully-connected layer; the 3 convolutional layers are respectively used for carrying out convolution processing on the sparse feature vector, the 3 pooling layers are respectively used for pooling convolution results of the 3 convolutional layers, and the first full-connection layer and the second full-connection layer are used for classifying outputs of the 3 pooling layers to obtain classification results.
8. The driver fatigue detection method according to claim 6, wherein the step S5 includes:
calculating a value f from the eye classification results of consecutive frames of the driver's face images within one time sequence, using the formula f = Nc/Nt × 100%, and judging whether the driver's eyes are closed by comparing f with a preset threshold; wherein Nc is the number of frames in which the driver's eyes are in the closed state during the period, and Nt is the number of valid eye-state recognition frames during the period;
determining the duration for which the driver's mouth remains open from the mouth classification results of consecutive frames of the driver's face images within one time sequence, and judging that the driver is yawning when the duration is greater than or equal to a preset duration and the relative coordinate positions of the mouth feature points match those of the yawning state.
9. A driver fatigue detection system for performing the method of any of claims 1-8, the system comprising:
the image acquisition unit is used for periodically acquiring the current frame face image of the driver;
the characteristic point detection unit is used for detecting the current frame face image by using a preset progressive calibration network so as to position the characteristic points in the current frame face image; wherein the feature points comprise left and right eye feature points and mouth feature points;
the region image acquisition unit is used for determining a left eye region image, a right eye region image and a mouth region image according to the positioning result of the feature points, extracting target feature points in the left eye region image, the right eye region image and the mouth region image, and performing sparse representation on the target feature points to obtain sparse feature vectors;
the classification unit is used for processing the sparse feature vector by utilizing a pre-trained neural network classification model and outputting an eye classification result and a mouth classification result of the current frame face image;
a fatigue determination unit for determining whether the driver is tired based on the eye classification result and the mouth classification result of the continuous multi-frame face images of the driver in one time sequence; wherein, a plurality of frames of face images of the driver are periodically acquired in one time sequence.
10. A computer-readable storage medium having stored thereon a computer program, characterized in that: the computer program, when executed by a processor, implements the driver fatigue detection method of any of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010918289.0A CN112183220A (en) | 2020-09-04 | 2020-09-04 | Driver fatigue detection method and system and computer storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112183220A true CN112183220A (en) | 2021-01-05 |
Family
ID=73924129
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010918289.0A Pending CN112183220A (en) | 2020-09-04 | 2020-09-04 | Driver fatigue detection method and system and computer storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112183220A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113095146A (en) * | 2021-03-16 | 2021-07-09 | 深圳市雄帝科技股份有限公司 | Mouth state classification method, device, equipment and medium based on deep learning |
CN113723339A (en) * | 2021-09-08 | 2021-11-30 | 西安联乘智能科技有限公司 | Fatigue driving detection method, storage medium, and electronic device |
CN117282038A (en) * | 2023-11-22 | 2023-12-26 | 杭州般意科技有限公司 | Light source adjusting method and device for eye phototherapy device, terminal and storage medium |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104952209A (en) * | 2015-04-30 | 2015-09-30 | 广州视声光电有限公司 | Driving prewarning method and device |
CN106485214A (en) * | 2016-09-28 | 2017-03-08 | 天津工业大学 | A kind of eyes based on convolutional neural networks and mouth state identification method |
US20170119298A1 (en) * | 2014-09-02 | 2017-05-04 | Hong Kong Baptist University | Method and Apparatus for Eye Gaze Tracking and Detection of Fatigue |
CN106682603A (en) * | 2016-12-19 | 2017-05-17 | 陕西科技大学 | Real time driver fatigue warning system based on multi-source information fusion |
CN107133595A (en) * | 2017-05-11 | 2017-09-05 | 南宁市正祥科技有限公司 | The eyes opening and closing detection method of infrared image |
CN107704805A (en) * | 2017-09-01 | 2018-02-16 | 深圳市爱培科技术股份有限公司 | method for detecting fatigue driving, drive recorder and storage device |
CN109063545A (en) * | 2018-06-13 | 2018-12-21 | 五邑大学 | A kind of method for detecting fatigue driving and device |
US20190126935A1 (en) * | 2014-09-22 | 2019-05-02 | Brian K. Phillips | Method and system for impaired driving detection, monitoring and accident prevention with driving habits |
CN110119676A (en) * | 2019-03-28 | 2019-08-13 | 广东工业大学 | A kind of Driver Fatigue Detection neural network based |
CN110532976A (en) * | 2019-09-03 | 2019-12-03 | 湘潭大学 | Method for detecting fatigue driving and system based on machine learning and multiple features fusion |
WO2020024395A1 (en) * | 2018-08-02 | 2020-02-06 | 平安科技(深圳)有限公司 | Fatigue driving detection method and apparatus, computer device, and storage medium |
CN111291590A (en) * | 2018-12-06 | 2020-06-16 | 广州汽车集团股份有限公司 | Driver fatigue detection method, driver fatigue detection device, computer equipment and storage medium |
CN111582086A (en) * | 2020-04-26 | 2020-08-25 | 湖南大学 | Fatigue driving identification method and system based on multiple characteristics |
Non-Patent Citations (5)
Title |
---|
KENING LI等: "A Fatigue Driving Detection Algorithm Based on Facial Multi-Feature Fusion", IEEE ACCESS, vol. 8, pages 101244 - 101259, XP011792330, DOI: 10.1109/ACCESS.2020.2998363 * |
ZHONGMIN LIU等: "Driver fatigue detection based on deeply-learned facial expression representation", JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, vol. 71, pages 1 - 7 * |
刘小双等: "基于眼口状态的疲劳检测系统", 传感器与微系统, no. 10, pages 113 - 115 * |
朱玉斌等: "基于级联宽度学习的疲劳驾驶检测", 计算机工程与设计, vol. 41, no. 2, pages 537 - 541 * |
高宁: "面向驾驶人疲劳检测的人脸分析方法研究", 中国博士学位论文全文数据库 工程科技II辑, no. 8, pages 035 - 15 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||