CN115050012A - Method for detecting fatigue driving of driver wearing mask based on lightweight model - Google Patents


Info

Publication number
CN115050012A
CN115050012A (application CN202210660238.1A)
Authority
CN
China
Prior art keywords
driver
fatigue driving
feature
network
method comprises
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210660238.1A
Other languages
Chinese (zh)
Inventor
刘强
谢谦
郑国鸿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University filed Critical Sun Yat Sen University
Priority to CN202210660238.1A priority Critical patent/CN115050012A/en
Publication of CN115050012A publication Critical patent/CN115050012A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/50: Context or environment of the image
    • G06V20/59: Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G06V20/597: Recognising the driver's state or behaviour, e.g. attention or drowsiness
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18: Eye characteristics, e.g. of the iris

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Ophthalmology & Optometry (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to the technical field of fatigue driving detection, and in particular to a method for detecting fatigue driving of a driver wearing a mask based on a lightweight model. The method comprises the following steps: S1: acquire a face image of the driver and preprocess it; S2: pass the face image data into a lightweight GAN to enhance image quality; S3: pass the driver face image data into a lightweight target detection network, extract features with an improved GhostNet backbone feature network, fuse the features by combining the SPP and PANet structures, and finally classify and regress with a Yolo Head to obtain the states and position coordinates of the driver's eyes, pupils and head; S4: judge whether the driver is in a fatigue driving state by combining the driver's percentage of eye closure, gaze movement speed and nodding frequency. Compared with the traditional approach of simply increasing the number of network layers, the method integrates multiple identical backbones with fewer layers each and applies channel shuffling and splitting, improving both detection speed and accuracy.

Description

Method for detecting fatigue driving of driver wearing mask based on lightweight model
Technical Field
The invention relates to the technical field of fatigue driving detection, in particular to a method for detecting fatigue driving of a driver wearing a mask based on a lightweight model.
Background
Research shows that more than 20% of traffic accidents each year are caused by fatigue driving, which has become one of the main causes of driving accidents.
Vision-based driver fatigue detection is currently the most efficient and objective way to analyze a driver's fatigue state. It is non-subjective, breaking with the past in which fatigue could only be sensed through the driver's own physiology, and it judges the fatigue state more accurately and objectively. Assessing the driver's fatigue state more accurately and directly can, to a certain extent, reduce the traffic accidents caused by driver fatigue.
Most existing fatigue driving detection is built on facial feature points. For example, Chinese patent application CN2021115482505 provides a multi-index fatigue driving detection method based on image recognition: 68 facial feature points are obtained with a Histogram of Oriented Gradients (HOG) face detection algorithm and a CE-CLM model; relevant feature-point information is then selected and, through coordinate-system conversion and calculation, the driver's eye opening/closing state, gaze direction and head-turning information are obtained; fatigue criteria and thresholds are set and compared against these indices to judge the driver's fatigue level. However, this method relies too heavily on face localization and cannot stay robust when the face is occluded by, for example, bangs, sunglasses or a mask.
Chinese patent application CN2021115900351 provides a driver fatigue detection method and system that can detect fatigue while the driver wears a mask: by calculating the degree of opening and closing of the driver's eyes and mouth, it can largely judge whether the driver is in a fatigue driving state and issue an effective early warning. However, the method feeds the masked face image into a GAN to generate a mask-free face image, and the GAN model has a large number of parameters and a long training time, so it cannot be readily deployed on vehicle-mounted mobile detection equipment. Moreover, judging fatigue only from the degree of eye and mouth opening and closing leads to low recognition accuracy.
Disclosure of Invention
The invention provides a method for detecting fatigue driving of a driver wearing a mask based on a lightweight model, aiming to solve the above problems in the prior art.
To achieve this, the technical solution provided by the invention is as follows:
A method for detecting fatigue driving of a driver wearing a mask based on a lightweight model, characterized by comprising the following steps:
s1: acquiring a face image of a driver in a driving state, and preprocessing the face image;
s2: transmitting the facial image data of the driver into a lightweight GAN, and enhancing the image quality;
s3: the enhanced driver face image data are transmitted into a lightweight target detection network, feature extraction is carried out through an improved GhostNet main feature network, feature fusion is carried out by combining an SPP structure and a PANet structure, and finally, a Yolo Head is used for classification and regression to obtain the state and position coordinates of the eyes, the pupils and the Head of a driver;
s4: and judging whether the driver is in a fatigue driving state or not by integrating the percentage of eyes closed, the moving speed of the sight line and the nodding frequency of the driver.
Preferably, in step S2, the lightweight GAN includes a generator constructed based on ShuffleNetV2 and a discriminator constructed based on PatchGAN, and performs low-light improvement and deblurring on the input video stream to generate well-lit, sharp images.
Preferably, the driver face images are preprocessed in step S1 and uniformly resized to 224 × 224 × 3;
in step S3, features are extracted through the improved GhostNet backbone feature network as follows:
the input 224 × 224 × 3 image passes through an ordinary 16-channel 1 × 1 convolution block to obtain a 7 × 7 × 160 feature layer; a 1 × 1 convolution block then adjusts the number of channels to obtain a 7 × 7 × 960 feature layer; finally, global average pooling and a 1 × 1 convolution give a 1 × 1 × 1280 feature layer for fully connected classification. Two identical GhostNet trunks are combined and connected, the output of each stage of the first trunk serving as part of the input that flows to the parallel stage of the next trunk through adjacent high-level combination; depthwise separable convolutions generate the redundant feature maps, and channel shuffling and splitting reduce the computation. The backbone feature extraction network yields three effective feature layers of sizes 76 × 76 × 256, 38 × 38 × 512 and 19 × 19 × 1024.
Preferably, in step S3, each eye is detected separately, and the PERCLOS value is calculated as the proportion of time per unit time during which the driver's eyelids cover more than 80% of the pupil area.
Preferably, in step S3, the gaze movement speed is calculated from the change in distance between the pupil-centre coordinates and the eye-centre coordinates, and the nodding frequency is calculated by counting the number of nods per unit time.
Preferably, the SPP structure is applied to the three effective feature layers to enlarge the receptive field and separate out salient features, and PANet performs repeated feature extraction; the three resulting feature layers have sizes 76 × 76 × 33, 38 × 38 × 33 and 19 × 19 × 33;
finally the layers are passed to the Yolo Head for prediction and decoding, transforming them into 76 × 76 × 11, 38 × 38 × 11 and 19 × 19 × 11, where the 11 channels of the last dimension represent x_offset, y_offset, h, w, the confidence and the classification result; adding the x_offset and y_offset of each grid point gives the centre of the prediction box, and combining the prior (anchor) boxes with h and w gives its width and height, yielding the states and position coordinates of the driver's eyes, pupils and head.
Preferably, in steps S2 and S3, the datasets used by the lightweight target detection network come from the GoPro dataset, the ZJU dataset, the YawDD dataset and a self-made dataset shot in real conditions.
Preferably, the method further comprises the following step:
s5: if the driver is detected to be in a fatigued state, a warning signal is issued.
The invention has the beneficial effects that:
1. Compared with the traditional approach of labeling 68 facial feature points with the Dlib library, the method uses a target detection network that skips the face-localization step and directly detects the states and position coordinates of the eyes, pupils and head. This simplifies the detection pipeline and avoids missed detections caused by face loss when the driver wears a mask, turns the head or lowers the head;
2. The lightweight GAN and GhostNet backbones, together with depthwise separable convolutions, greatly reduce the number of network parameters, making the model small and highly real-time and thus suitable for deployment on vehicle-mounted mobile equipment;
3. For human eye and pupil detection, the invention introduces a stacked vignetting enhancement method at the data level and integrates several identical trunks at the network level, improving the detection of eyes and pupils; single-eye detection solves the problem that the eyes cannot be detected when the face turns sideways;
4. The GAN is built for driving environments with dark, blurred images, such as low illumination and bumpy roads, improving image quality and thereby the robustness of the model.
Drawings
FIG. 1 is a flow chart of the operation of the present invention;
FIG. 2 is a schematic diagram of a target detection network;
FIG. 3 is a schematic diagram of a cascading vignetting enhancement method in a target detection network;
FIG. 4 is a schematic diagram of eye state and position detection in the present invention;
FIG. 5 is a schematic diagram of the pupil state and position detection in the present invention;
fig. 6 is a schematic diagram of head state and position detection in the present invention.
Detailed Description
The invention will be further elucidated with reference to accompanying Figures 1 to 6:
A method for detecting fatigue driving of a driver wearing a mask based on a lightweight model comprises the following steps:
s1: acquiring a face image of a driver in a driving state, and preprocessing the face image;
s2: transmitting the driver face image data into a lightweight generated countermeasure Network (GAN for short) to enhance the image quality;
s3: the enhanced driver face image data is transmitted into a lightweight target detection Network, feature extraction is carried out through an improved GhostNet main feature Network, feature fusion is carried out by combining a Spatial Pyramid Pooling (SPP) structure and a Path Aggregation Network (PANET) structure, and finally a detection Head (Yolo Head) is used for classification and regression to obtain the state and position coordinates of the eyes, the pupils and the Head of a driver;
s4: judging whether the driver is in a fatigue driving state or not by integrating the Percentage of closed eyes of the driver (PERCLOS), the over Time, the sight line moving speed and the head-nodding frequency;
s5: if the driver is detected to be in a fatigue state, a warning signal is sent out.
The above-mentioned overall detection process is shown in FIG. 1.
In step S1, an auto-focus camera with a frame rate of 60 fps is used, in contrast to an ordinary fixed-focus 30 fps camera. The auto-focus function keeps the driver's face in focus as far as possible during driving, providing the network with sharp images and details; the higher frame rate supplies roughly twice as many real-time face images per unit time, capturing more facial detail, reducing the chance of motion smear and improving recognition accuracy.
In step S2, the lightweight GAN (Generative Adversarial Network) consists of a generator built on ShuffleNetV2 and a discriminator built on PatchGAN, so that the GAN keeps good performance while its computation is reduced; it performs low-light improvement and deblurring on the input video stream and generates well-lit, sharp images, thereby improving image quality.
In step S2, the training process of the lightweight GAN is as follows: the generator receives low-light and blurred image samples and generates well-lit, sharp images; the discriminator receives the generated images and judges them real or fake; the iteration repeats until the generator and discriminator losses converge to a minimum. The datasets used by the lightweight target detection network come from the GoPro dataset, the ZJU dataset, the YawDD dataset and a self-made dataset shot in real conditions.
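The alternating update scheme described here can be sketched as follows in PyTorch. The Generator and PatchDiscriminator below are deliberately minimal stand-ins for the ShuffleNetV2-based generator and PatchGAN discriminator, whose exact architectures the patent does not specify; only the training loop is the point.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Stand-in for the ShuffleNetV2-based generator: maps a dark or blurred
    frame to an enhanced frame of the same size."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid())

    def forward(self, x):
        return self.net(x)

class PatchDiscriminator(nn.Module):
    """Stand-in for the PatchGAN discriminator: one real/fake logit per
    overlapping image patch instead of a single global score."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 4, stride=2, padding=1))

    def forward(self, x):
        return self.net(x)

G, D = Generator(), PatchDiscriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(degraded, clean):
    """One iteration: D learns real vs. generated, then G learns to fool D."""
    fake = G(degraded)

    # Discriminator update on real and (detached) generated images.
    d_real, d_fake = D(clean), D(fake.detach())
    loss_d = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator update: push D's verdict on the fake towards 'real'.
    d_fake = D(fake)
    loss_g = bce(d_fake, torch.ones_like(d_fake))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()
```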
In step S3, the network takes frames from the input video stream and performs feature extraction on each picture in the backbone network; the features extracted in this process, i.e. the feature set of the frame, are referred to as feature layers. The images obtained in steps S1 and S2 are normalized to 224 × 224 × 3. The backbone feature extraction network of the target detection network is constructed based on GhostNet, as shown in Fig. 2: with a 224 × 224 × 3 input, an ordinary 16-channel 1 × 1 convolution block leads to a 7 × 7 × 160 feature layer; a 1 × 1 convolution block then adjusts the number of channels to obtain a 7 × 7 × 960 feature layer; finally, global average pooling and a 1 × 1 convolution give a 1 × 1 × 1280 feature layer for fully connected classification. The number of layers in each single sub-backbone is reduced, but several identical backbones are integrated: two identical GhostNet trunks are combined and connected, the output of each stage of the first trunk serving as part of the input that flows to the parallel stage of the next trunk through Adjacent High-Level Combination (AHLC). Depthwise separable convolutions generate the redundant feature maps, and channel shuffling and splitting reduce the computation; that is, at each backbone stage 1/3 of the channels are deliberately dropped and the order of the remaining channels is shuffled. The backbone feature extraction network yields three effective feature layers of sizes 76 × 76 × 256, 38 × 38 × 512 and 19 × 19 × 1024. The SPP structure then enlarges the receptive field and separates out the salient features, and PANet performs repeated feature extraction, i.e. feature fusion and enhancement. The three resulting feature layers have sizes 76 × 76 × 33, 38 × 38 × 33 and 19 × 19 × 33. They are finally passed to the Yolo Head for prediction and decoding, which transforms them into 76 × 76 × 11, 38 × 38 × 11 and 19 × 19 × 11; the 11 channels of the last dimension decompose as 4 + 1 + 6, representing x_offset, y_offset, h, w, the confidence and the classification result. The classification result covers six labels, including open eyes, closed eyes, pupil, head up and head down. Adding the x_offset and y_offset of each grid point gives the centre of the prediction box, and combining the prior (anchor) boxes with h and w gives its width and height, yielding the states and position coordinates of the driver's eyes, pupils and head.
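To make the decoding step concrete, the sketch below decodes one Yolo-Head output of shape S × S × 11 under the channel layout just described. The function name, the anchor sizes and the sigmoid/exp conventions are illustrative assumptions (the patent only states that offsets are added to the grid points and that h and w scale the prior box):

```python
import torch

def decode_head(pred, anchor_w, anchor_h):
    """Decode one S x S x 11 Yolo-Head output. Channel layout per the text:
    x_offset, y_offset, h, w, confidence, then six class scores."""
    S = pred.shape[0]
    gy, gx = torch.meshgrid(torch.arange(S), torch.arange(S), indexing="ij")

    # Grid point + predicted offset gives the box centre (in grid units).
    cx = gx + torch.sigmoid(pred[..., 0])
    cy = gy + torch.sigmoid(pred[..., 1])

    # Prior (anchor) box rescaled by the predicted h and w terms.
    bh = anchor_h * torch.exp(pred[..., 2])
    bw = anchor_w * torch.exp(pred[..., 3])

    conf = torch.sigmoid(pred[..., 4])                    # objectness confidence
    label = pred[..., 5:].softmax(dim=-1).argmax(dim=-1)  # one of six state labels
    return cx / S, cy / S, bw, bh, conf, label            # centres normalised to [0, 1]

cx, cy, bw, bh, conf, label = decode_head(torch.randn(19, 19, 11), 0.2, 0.3)
```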
It should be noted that reducing the number of layers of each single sub-backbone while integrating several identical backbones, generating the redundant feature maps with depthwise separable convolutions, and cutting computation through channel shuffling and splitting (i.e. deliberately dropping 1/3 of the channels and shuffling the rest at each backbone stage) greatly reduces the model size compared with a traditional target detection framework while maintaining, and even slightly improving, performance: the parameter count drops by about 63%. The lighter model can therefore be deployed on mobile and embedded devices and meets the real-time requirement of fatigue detection. Meanwhile, the facial features of the target are handled uniformly: a single framework and detection method covers a variety of facial features such as the eyes, mouth, nose and head pose, making full use of the face information, improving detection efficiency, simplifying the detection pipeline and leaving room for upgrades.
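A minimal PyTorch sketch of two of the backbone ingredients named above, assuming the standard GhostNet formulation: a Ghost-style module that derives part of its output channels with a cheap depthwise convolution, plus channel shuffling and a split that drops 1/3 of the channels. The dimensions and module interface are illustrative, not taken from the patent.

```python
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    """Ghost-style block: a small dense convolution produces the 'intrinsic'
    feature maps, then a cheap depthwise convolution derives the remaining
    ('redundant') maps, which are concatenated to form the output."""
    def __init__(self, in_ch, out_ch, ratio=2):
        super().__init__()
        init_ch = out_ch // ratio            # channels from the dense conv
        cheap_ch = out_ch - init_ch          # channels from the cheap conv
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, init_ch, 1, bias=False),
            nn.BatchNorm2d(init_ch), nn.ReLU(inplace=True))
        self.cheap = nn.Sequential(          # depthwise: one 3x3 filter per channel
            nn.Conv2d(init_ch, cheap_ch, 3, padding=1, groups=init_ch, bias=False),
            nn.BatchNorm2d(cheap_ch), nn.ReLU(inplace=True))

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)

def channel_shuffle(x, groups=3):
    """Permute channels across groups so information mixes between branches."""
    b, c, h, w = x.shape
    x = x.view(b, groups, c // groups, h, w).transpose(1, 2)
    return x.reshape(b, c, h, w)

feat = torch.randn(1, 96, 56, 56)
mixed = channel_shuffle(feat, groups=3)
kept, _ = torch.split(mixed, [64, 32], dim=1)  # drop 1/3 of the channels
out = GhostModule(64, 128)(kept)               # -> shape (1, 128, 56, 56)
```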
In step S3, the backbone network uses the Mish activation function:
Mish(x) = x · tanh(ln(1 + e^x)) (1)
The other modules use the Leaky ReLU activation function:
Leaky ReLU(x) = max(αx, x) (2)
where α = 0.01.
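Both activations are standard; the snippet below is a direct transcription of equations (1) and (2) (PyTorch also ships them built in as F.mish and F.leaky_relu):

```python
import torch
import torch.nn.functional as F

def mish(x):
    # Mish(x) = x * tanh(ln(1 + e^x)); softplus(x) = ln(1 + e^x)
    return x * torch.tanh(F.softplus(x))

def leaky_relu(x, alpha=0.01):
    # Leaky ReLU(x) = max(alpha * x, x), with alpha = 0.01 as in the text
    return torch.maximum(alpha * x, x)

x = torch.linspace(-3.0, 3.0, steps=7)
print(mish(x))
print(leaky_relu(x))
```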
The target detection network outputs the states and position coordinates of the driver's eyes, pupils and head. Each eye is detected separately to obtain its state (more than 80% closed, or open); the PERCLOS value is calculated as the proportion of time per unit time during which the driver's eyelids cover more than 80% of the pupil area, implemented in the system as the ratio of frames with the eyes more than 80% closed to the total number of frames. The gaze movement speed is computed from the change in distance between the pupil-centre and eye-centre coordinates, and the nodding frequency by counting the number of nods per unit time. Because the target detection network detects the two eyes independently, the loss of targets in side-face poses that plagues traditional feature-point localization is avoided: eye-feature acquisition is unaffected by viewing angle, the miss and false-alarm rates are reduced, and system stability is improved.
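A minimal sketch of the three indicators, assuming per-frame state labels ('closed', 'head_up', 'head_down') and centre coordinates coming from the detector; the window conventions and label strings are illustrative:

```python
import math

def perclos(eye_states):
    """PERCLOS (P80): fraction of frames in the window whose eye state is
    'closed', i.e. the eyelids cover more than 80% of the pupil area."""
    return sum(s == "closed" for s in eye_states) / len(eye_states)

def gaze_speed(pupil_xy, eye_xy, prev_offset, dt):
    """Gaze movement speed from the change in pupil-centre-to-eye-centre
    distance between two frames dt seconds apart."""
    offset = math.dist(pupil_xy, eye_xy)
    return abs(offset - prev_offset) / dt, offset  # speed, new offset

def nod_frequency(head_states, fps):
    """Nods per second: count head-up -> head-down transitions in the window."""
    nods = sum(a == "head_up" and b == "head_down"
               for a, b in zip(head_states, head_states[1:]))
    return nods / (len(head_states) / fps)

# 60 fps window: 2 s of frames, eyes closed in a quarter of them
states = ["open"] * 90 + ["closed"] * 30
print(perclos(states))  # 0.25
```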
In step S4, the target detection network skips face localization and searches directly for the fatigue-feature targets in the occluded face, as shown in Figs. 4, 5 and 6: Fig. 4 shows recognition of the eye-closing action, Fig. 5 recognition of pupil movement, and Fig. 6 recognition of the nodding (head up/down) motion. The system judges whether the driver is in a fatigue state according to a fatigue judgment rule, namely a four-level rule formulated from the eye state, the gaze movement speed and the head movement state, with the levels not fatigued, mildly fatigued, fatigued and severely fatigued, as shown in the following table:
[Table in the original filing (image only): the four-level fatigue judgment rule combining eye state, gaze movement speed and head movement state.]
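Since the table survives only as an image, the numeric cut-offs in the sketch below are hypothetical placeholders; only the four-level structure and the three inputs follow the description:

```python
def fatigue_level(perclos, gaze_speed, nods_per_minute):
    """Four-level judgment per the description. All numeric thresholds are
    hypothetical placeholders: the real cut-offs are in the table image."""
    score = 0
    if perclos > 0.15:            # placeholder PERCLOS cut-off
        score += 1
    if perclos > 0.40:            # placeholder: prolonged eye closure
        score += 1
    if gaze_speed < 5.0:          # placeholder: sluggish gaze, units px/s
        score += 1
    if nods_per_minute > 6:       # placeholder nodding-frequency cut-off
        score += 1
    levels = ["not fatigued", "mildly fatigued", "fatigued", "severely fatigued"]
    return levels[min(score, 3)]

print(fatigue_level(perclos=0.2, gaze_speed=3.0, nods_per_minute=8))  # 'severely fatigued'
```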
in S5, when it is detected that the driver is in a tired state, a warning sound is emitted through a speaker of the in-vehicle portable device until the driver is awake. When the driver is awake, the warning sound stops in time, and the driver is reminded to stop at the side or search a service area and a rest area nearby for rest in time.
As shown in Fig. 3, to address the target detection network's weak performance on small targets, namely the human eyes and pupils, the method adopts stacked vignetting (halation) enhancement: splicing with random scaling, random cropping and random arrangement is added, and six pictures are weighted, fused and arranged. Specifically, six pictures are read at a time and individually flipped, scaled and colour-gamut shifted, then placed in six different positions; by processing the dataset in this way and boosting the contrast around the eyes to make them stand out, detection of the eye and pupil targets is greatly improved.
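A sketch of this six-picture augmentation with PIL, under the assumption of a 3 × 2 tiling (the text only says "six different directions"); the eye-region contrast boost requires eye annotations and is approximated here by a global contrast jitter:

```python
import random
from PIL import Image, ImageEnhance, ImageOps

def stacked_mosaic(paths, cell=224):
    """Read six pictures, flip/scale/colour-shift each at random, and tile
    them into a 3 x 2 mosaic; arrangement and jitter ranges are assumptions."""
    canvas = Image.new("RGB", (cell * 3, cell * 2))
    for i, path in enumerate(random.sample(paths, 6)):
        img = Image.open(path).convert("RGB")
        if random.random() < 0.5:                    # random horizontal flip
            img = ImageOps.mirror(img)
        scale = random.uniform(0.8, 1.2)             # random zoom
        img = img.resize((int(cell * scale), int(cell * scale)))
        img = ImageOps.fit(img, (cell, cell))        # crop back to cell size
        img = ImageEnhance.Color(img).enhance(random.uniform(0.7, 1.3))     # colour-gamut shift
        img = ImageEnhance.Contrast(img).enhance(random.uniform(0.9, 1.3))  # contrast jitter
        canvas.paste(img, ((i % 3) * cell, (i // 3) * cell))
    return canvas
```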

Claims (8)

1. A method for detecting fatigue driving of a driver wearing a mask based on a lightweight model, characterized by comprising the following steps:
s1: acquiring a face image of a driver in a driving state, and preprocessing the face image;
s2: transmitting the facial image data of the driver into a lightweight GAN, and enhancing the image quality;
s3: the enhanced driver face image data are transmitted into a lightweight target detection network, feature extraction is carried out through an improved GhostNet main feature network, feature fusion is carried out by combining an SPP structure and a PANet structure, and finally, a Yolo Head is used for classification and regression to obtain the state and position coordinates of the eyes, the pupils and the Head of a driver;
s4: and judging whether the driver is in a fatigue driving state or not by integrating the percentage of eyes closed, the moving speed of the sight line and the nodding frequency of the driver.
2. The method for detecting fatigue driving of a driver wearing a mask according to claim 1, wherein: in step S2, the lightweight GAN includes a generator constructed based on ShuffleNetV2 and a discriminator constructed based on PatchGAN, and performs low-light improvement and deblurring on the input video stream to generate well-lit, sharp images.
3. The method for detecting fatigue driving of a driver wearing a mask according to claim 1, wherein:
the driver face images are preprocessed in step S1 and uniformly resized to 224 × 224 × 3;
in step S3, features are extracted through the improved GhostNet backbone feature network as follows:
the input 224 × 224 × 3 image passes through an ordinary 16-channel 1 × 1 convolution block to obtain a 7 × 7 × 160 feature layer; a 1 × 1 convolution block then adjusts the number of channels to obtain a 7 × 7 × 960 feature layer; finally, global average pooling and a 1 × 1 convolution give a 1 × 1 × 1280 feature layer for fully connected classification. Two identical GhostNet trunks are combined and connected, the output of each stage of the first trunk serving as part of the input that flows to the parallel stage of the next trunk through adjacent high-level combination; depthwise separable convolutions generate the redundant feature maps, and channel shuffling and splitting reduce the computation. The backbone feature extraction network yields three effective feature layers of sizes 76 × 76 × 256, 38 × 38 × 512 and 19 × 19 × 1024.
4. The method for detecting fatigue driving of a driver wearing a mask according to claim 1, wherein: in step S3, each eye is detected separately, and the PERCLOS value is calculated as the proportion of time per unit time during which the driver's eyelids cover more than 80% of the pupil area.
5. The method for detecting fatigue driving of a driver wearing a mask according to claim 1, wherein: in step S3, the gaze movement speed is calculated from the change in distance between the pupil-centre coordinates and the eye-centre coordinates, and the nodding frequency is calculated by counting the number of nods per unit time.
6. The method for detecting fatigue driving of a driver wearing a mask according to claim 3, wherein: the SPP structure is applied to the three effective feature layers to enlarge the receptive field and separate out salient features, and PANet performs repeated feature extraction; the three resulting feature layers have sizes 76 × 76 × 33, 38 × 38 × 33 and 19 × 19 × 33;
finally the layers are passed to the Yolo Head for prediction and decoding, transforming them into 76 × 76 × 11, 38 × 38 × 11 and 19 × 19 × 11, where the 11 channels of the last dimension represent x_offset, y_offset, h, w, the confidence and the classification result; adding the x_offset and y_offset of each grid point gives the centre of the prediction box, and combining the prior (anchor) boxes with h and w gives its width and height, yielding the states and position coordinates of the driver's eyes, pupils and head.
7. The method for detecting fatigue driving of a driver wearing a mask according to claim 1, wherein: in steps S2 and S3, the datasets used by the lightweight target detection network come from the GoPro dataset, the ZJU dataset, the YawDD dataset and a self-made dataset shot in real conditions.
8. The method for detecting fatigue driving of a driver wearing a mask according to claim 1, further comprising the following step:
s5: if the driver is detected to be in a fatigued state, a warning signal is issued.
CN202210660238.1A 2022-06-13 2022-06-13 Method for detecting fatigue driving of driver wearing mask based on lightweight model Pending CN115050012A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210660238.1A CN115050012A (en) 2022-06-13 2022-06-13 Method for detecting fatigue driving of driver wearing mask based on lightweight model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210660238.1A CN115050012A (en) 2022-06-13 2022-06-13 Method for detecting fatigue driving of driver wearing mask based on lightweight model

Publications (1)

Publication Number Publication Date
CN115050012A true CN115050012A (en) 2022-09-13

Family

ID=83160847

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210660238.1A Pending CN115050012A (en) 2022-06-13 2022-06-13 Method for detecting fatigue driving of driver wearing mask based on lightweight model

Country Status (1)

Country Link
CN (1) CN115050012A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115578719A (en) * 2022-10-13 2023-01-06 中国矿业大学 YM _ SSH-based fatigue state detection method for lightweight target detection
CN115578719B (en) * 2022-10-13 2024-05-17 中国矿业大学 YM_SSH-based fatigue state detection method for lightweight target detection
CN117351648A (en) * 2023-10-08 2024-01-05 海南大学 Driver fatigue monitoring and early warning method and system

Similar Documents

Publication Publication Date Title
CN108830252B (en) Convolutional neural network human body action recognition method fusing global space-time characteristics
CN109819208B (en) Intensive population security monitoring management method based on artificial intelligence dynamic monitoring
CN115050012A (en) Method for detecting fatigue driving of driver wearing mask based on lightweight model
CN103020992B (en) A kind of video image conspicuousness detection method based on motion color-associations
Niu et al. View-invariant human activity recognition based on shape and motion features
CN111832413B (en) People flow density map estimation, positioning and tracking method based on space-time multi-scale network
CN109614882A (en) A kind of act of violence detection system and method based on human body attitude estimation
CN111062292B (en) Fatigue driving detection device and method
CN103824070A (en) Rapid pedestrian detection method based on computer vision
CN103079034A (en) Perception shooting method and system
CN111126223B (en) Video pedestrian re-identification method based on optical flow guide features
KR101872811B1 (en) Apparatus and method for action pattern recognition, and method for generating of action pattern classifier
CN106886778B (en) License plate character segmentation and recognition method in monitoring scene
CN111985348B (en) Face recognition method and system
CN106295583B (en) Method and device for reminding user of driving mode
CN106529494A (en) Human face recognition method based on multi-camera model
CN106056624A (en) Unmanned aerial vehicle high-definition image small target detecting and tracking system and detecting and tracking method thereof
CN110378234A (en) Convolutional neural networks thermal imagery face identification method and system based on TensorFlow building
CN114973412A (en) Lip language identification method and system
CN113657195A (en) Face image recognition method, face image recognition equipment, electronic device and storage medium
KR20130106640A (en) Apparatus for trace of wanted criminal and missing person using image recognition and method thereof
CN115719457A (en) Method for detecting small target in unmanned aerial vehicle scene based on deep learning
CN116883883A (en) Marine ship target detection method based on generation of anti-shake of countermeasure network
CN115035159A (en) Video multi-target tracking method based on deep learning and time sequence feature enhancement
CN115116137A (en) Pedestrian detection method based on lightweight YOLO v5 network model and space-time memory mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination