WO2020237940A1

WO2020237940A1 - Fatigue detection method and device based on human eye state identification

Info

Publication number: WO2020237940A1
Application number: PCT/CN2019/108073
Authority: WO
Inventors: 李源; 王晋玮; 侯喆
Original assignee: 初速度（苏州）科技有限公司
Priority date: 2019-05-29
Filing date: 2019-09-26
Publication date: 2020-12-03
Also published as: CN110956068B; CN110956068A

Abstract

Disclosed by the embodiment of the present invention are a fatigue detection method and device based on human eye state identification. The method comprises: obtaining a face image acquired when an image acquisition device shoots a target person, the face image containing the face of the target person; detecting the face image to obtain facial feature points and eyelid feature points of upper and lower eyelids of human eyes in the face; constructing a target three-dimensional face model corresponding to the target person on the basis of a preset three-dimensional face model, the facial feature points and the eyelid feature points, wherein the target three-dimensional face model comprises upper and lower eyelids of human eyes constructed on the basis of the eyelid feature points; determining the current opening and closing length between the upper eyelids and the lower eyelids of the human eyes based on the upper eyelids and the lower eyelids of the human eyes in the target three-dimensional face model; and determining the current fatigue degree of the target person based on the current opening and closing length. The method can determine the spatial information of human eyes, improve the accuracy of the detection result of human eye state, and further improve the accuracy of the detection result of fatigue degree of the target person.

Description

Fatigue detection method and device based on human eye state recognition

Technical field

The present invention relates to the technical field of video surveillance, and in particular to a fatigue detection method and device based on human eye state recognition.

Background technique

People are prone to operating errors when they are fatigued. For example, when the driver is fatigued, he is prone to accidents. In order to reduce the occurrence of dangerous situations caused by fatigue of personnel to a certain extent, fatigue testing is generally performed on personnel. The related fatigue detection process is generally: obtain the face image collected for the target person, detect the face image through the pre-trained eye state detection model, and detect the open and closed state of the target person’s eyes, that is, detection Whether the eyes of the target person are in a closed state, according to the detection result, determine whether the target person is fatigued. If it is detected that the eyes of the target person are in a closed state, it is determined that the target person is fatigued and an alarm is issued. The trained human eye state detection model is a neural network model trained on sample images marked with human eyes in a closed state and human eyes in an open state.

In the above fatigue detection process, before training the model, when labeling the sample image, the labeling standards for the closed state and open state of the eyes in the sample image cannot be unified, such as the labeling of half-opened eyes. In order to open the state, some annotators mark the closed state, which causes the pre-trained eye state detection model to blur the detection boundary between the closed state and the open state of the human eye in the image, which leads to insufficient accuracy of the detection result.

Summary of the invention

The present invention provides a fatigue detection method and device based on human eye state recognition, so as to determine the spatial information of the human eye, improve the accuracy of the detection result of the human eye state, and further improve the fatigue of the target person The accuracy of the test results. The specific technical solutions are as follows:

In the first aspect, embodiments of the present invention provide a fatigue detection method based on human eye state recognition, including:

Obtaining a face image containing the face of the target person collected by the image capture device for shooting the target person;

The face image is detected, and the facial feature points of the face in the face image and the eyelid feature points of the upper and lower eyelids of the human eyes in the face are detected, wherein the facial feature points are: Describe the feature points of each part of the face in the face image;

Based on the preset three-dimensional face model, the facial feature points, and the eyelid feature points, constructing a target three-dimensional face model corresponding to the target person, wherein the target three-dimensional face model includes: based on the eyelid feature The upper and lower eyelids of the human eye constructed by points;

Determining the current opening and closing length between the upper and lower eyelids of the human eye based on the three-dimensional position information of the upper and lower eyelids of the human eye in the target three-dimensional face model;

Based on the current opening and closing length, the current fatigue degree of the target person is determined.

Optionally, the step of detecting the face image and detecting the facial feature points of the face in the face image and the eyelid feature points of the upper and lower eyelids of the human eyes in the face includes:

Detecting the face image, and detecting facial feature points of the face in the face image;

Based on the facial feature points, determine and cut out the area where the human eyes in the face are located from the human face image, as a human eye image;

Using a preset eyelid feature point detection model, the eyelid feature points of the upper and lower eyelids of the human eye are detected from the human eye image, where the preset eyelid feature point detection model is: A model trained on sample images of the eyelid feature points of the upper and lower eyelids.

Optionally, the human eye image includes a left eye image and a right eye image;

Before the step of using a preset eyelid feature point detection model to detect the eyelid feature points of the upper and lower eyelids of the human eye from the human eye image, the method further includes:

Performing mirror image processing on the first image to obtain a mirror image, wherein the first image is the left eye image or the right eye image;

Stitching the mirror image and the image that has not been mirrored in the human eye image to obtain a stitched image;

The step of using a preset eyelid feature point detection model to detect the eyelid feature points of the upper and lower eyelids of the human eye from the human eye image includes:

Using a preset eyelid feature point detection model, from the stitched image, the eyelid feature points of the upper and lower eyelids of the human eye in the mirror image, and the upper and lower eyelids of the human eye in the image without mirror processing are detected. Eyelid feature points;

Mirror image processing is performed on the eyelid feature points of the upper and lower eyelids of the human eye in the mirror image to obtain the eyelid feature points after mirroring, so as to obtain the eyelid feature points of the upper and lower eyelids of the human eye in the human eye image.

Optionally, before the step of performing mirror processing on the first image to obtain a mirror image, the method further includes:

The left-eye image and the right-eye image are subjected to normalization processing to obtain a corrected left-eye image and a normalized right-eye image, wherein the normalization processing is: making the two eye corner points in the image to be processed The line is parallel to the coordinate axis of the preset image coordinate system, and the image to be processed is the left-eye image and the right-eye image;

The step of performing mirror image processing on the first image to obtain a mirror image includes:

Perform mirror image processing on the converted first image to obtain a mirror image.

Optionally, the step of constructing a target three-dimensional face model corresponding to the target person based on a preset three-dimensional face model, the face feature points, and the eyelid feature points includes:

From the preset three-dimensional face model, the spatial position information of the spatial point at the preset face position is determined as the spatial position information of the spatial point to be processed, wherein the spatial point to be processed and the image feature point exist Corresponding relationship, the image feature points are: the facial feature points and the eyelid feature points;

Using the weak perspective projection matrix and the spatial position information of each spatial point to be processed to determine the projection position information of the projection point of each spatial point to be processed in the face image;

Based on the projection position information of the projection point of each spatial point to be processed and the imaging position information of the image feature point corresponding to each spatial point to be processed, a target three-dimensional face model corresponding to the target person is constructed.

Optionally, the step of determining the current opening and closing length between the upper and lower eyelids of the human eye based on the three-dimensional position information of the upper and lower eyelids of the human eye in the target three-dimensional face model is achieved by the following two Realize in any one of the ways:

The first way to achieve:

From the target three-dimensional face model, detecting the three-dimensional position information of the first center point of the upper eyelid and the three-dimensional position information of the second center point of the lower eyelid of the human eye;

Based on the three-dimensional position information of the first center point and the three-dimensional position information of the second center point, determine the distance between the first center point and the second center point as the upper and lower eyelids of the human eye The current opening and closing length between;

The second way to achieve:

From the target three-dimensional face model, determine the three-dimensional position information of the human eye spatial point corresponding to the human eye;

Performing spherical fitting based on the three-dimensional position information of the human eye space point to obtain a sphere model representing the human eye;

Detecting, from the target three-dimensional face model, the three-dimensional position information of the first center point of the upper eyelid and the three-dimensional position information of the second center point of the lower eyelid of the human eye;

Based on the three-dimensional position information of the first center point and the three-dimensional position information of the second center point, from the sphere model, determine the three-dimensional position information and the three-dimensional position information of the first spherical point corresponding to the first center point. The three-dimensional position information of the second spherical point corresponding to the second center point;

Based on the three-dimensional position information of the first spherical point and the three-dimensional position information of the second spherical point, determine the distance between the first spherical point and the second spherical point as the distance between the upper and lower eyelids of the human eye The current opening and closing length.

Optionally, the step of determining the current fatigue degree of the target person based on the current opening and closing length includes:

Obtaining the historical opening and closing length of the human eye of the target person determined within a preset time period;

Based on the current opening and closing length and the historical opening and closing length, the current fatigue degree of the target person is determined.

Optionally, the step of determining the current fatigue degree of the target person based on the current opening and closing length and the historical opening and closing length includes:

Comparing each opening and closing length with a preset length threshold to obtain a comparison result, where the opening and closing length includes the current opening and closing length and the historical opening and closing length;

Statistically obtain the first result quantity representing the comparison result whose opening and closing length is less than the preset length threshold;

Based on the current opening and closing length and the total number of historical opening and closing lengths and the first result number, determine the current fatigue level of the target person.

Optionally, after the step of determining the current fatigue level of the target person based on the current opening and closing length, the method further includes:

If it is determined that the current fatigue degree of the target person is fatigue, an alarm message is generated and sent.

In the second aspect, an embodiment of the present invention provides a fatigue detection device based on human eye state recognition, including:

The first obtaining module is configured to obtain a face image containing the face of the target person collected by the image capture device for shooting the target person;

The detection module is configured to detect the face image, and detect the facial feature points of the face in the face image and the eyelid feature points of the upper and lower eyelids of the human eyes in the face, wherein the facial feature points Is: used to characterize the feature points of each part of the face in the face image;

The construction module is configured to construct a target three-dimensional face model corresponding to the target person based on a preset three-dimensional face model, the facial feature points and the eyelid feature points, wherein the target three-dimensional face model includes : The upper and lower eyelids of the human eye constructed based on the eyelid feature points;

A first determining module configured to determine the current opening and closing length between the upper and lower eyelids of the human eye based on the three-dimensional position information of the upper and lower eyelids of the human eye in the target three-dimensional face model;

The second determining module is configured to determine the current fatigue degree of the target person based on the current opening and closing length.

Optionally, the detection module includes:

The first detection unit is configured to detect the face image, and detect facial feature points of the face in the face image;

The determining and intercepting unit is configured to determine and intercept the area where the human eye in the face is located from the face image based on the facial feature point, as a human eye image;

The second detection unit is configured to use a preset eyelid feature point detection model to detect the eyelid feature points of the upper and lower eyelids of the human eye from the human eye image, wherein the preset eyelid feature point detection The model is a model trained based on sample images marked with feature points of the upper and lower eyelids of a human eye.

Optionally, the human eye image includes a left eye image and a right eye image; the device may further include:

The mirroring module is configured to perform mirroring processing on the first image before detecting the eyelid feature points of the upper and lower eyelids of the human eye from the human eye image using the preset eyelid feature point detection model to obtain A mirror image, wherein the first image is the left eye image or the right eye image;

A splicing module configured to splice the mirror image and the image that has not been mirrored in the human eye image to obtain a spliced image;

The second detection unit is specifically configured to: use a preset eyelid feature point detection model to detect, from the stitched image, the eyelid feature points of the upper and lower eyelids of the human eye in the mirror image, and the The eyelid feature points of the upper and lower eyelids of the human eye in the mirror image processed; the eyelid feature points of the upper and lower eyelids of the human eye in the mirror image are mirrored to obtain the eyelid feature points after mirroring to obtain the human eye image The characteristic points of the upper and lower eyelids of the human eye.

Optionally, the detection module further includes:

The normalization unit is configured to perform normalization processing on the left-eye image and the right-eye image before the mirror image processing is performed on the first image to obtain the mirror image, to obtain a normalized left-eye image and a normalized right-eye image. Eye image, wherein the correction processing is: making the line of two eye corner points in the image to be processed parallel to the coordinate axis of the preset image coordinate system, and the image to be processed is the left eye image and the right eye image. Eye image

The mirroring unit is specifically configured to perform mirroring processing on the converted first image to obtain a mirrored image.

Optionally, the construction module is specifically configured to determine the spatial position information of the spatial point at the preset face position from the preset three-dimensional face model, as the spatial position information of the spatial point to be processed, Wherein, there is a corresponding relationship between the spatial points to be processed and image feature points, and the image feature points are: the facial feature points and the eyelid feature points; the weak perspective projection matrix and the spatial position of each spatial point to be processed are used Information, determine the projection position information of the projection point of each spatial point to be processed in the face image; based on the projection position information of the projection point of each spatial point to be processed and the image feature point corresponding to each spatial point to be processed To construct a target three-dimensional face model corresponding to the target person.

Optionally, the first determining module is specifically configured to: detect, from the target three-dimensional face model, the three-dimensional position information of the first center point of the upper eyelid of the human eye and the second The three-dimensional position information of the center point; based on the three-dimensional position information of the first center point and the three-dimensional position information of the second center point, determine the distance between the first center point and the second center point as The current opening and closing length between the upper and lower eyelids of the human eye.

Optionally, the first determining module is specifically configured to: determine, from the target three-dimensional face model, three-dimensional position information of a human eye space point corresponding to the human eye; based on the human eye space point Perform spherical fitting to obtain a sphere model that characterizes the human eye; from the target three-dimensional face model, detect the three-dimensional position information of the first center point of the upper eyelid of the human eye and the lower The three-dimensional position information of the second center point of the eyelid; based on the three-dimensional position information of the first center point and the three-dimensional position information of the second center point, it is determined from the sphere model that the first center point corresponds to The three-dimensional position information of the first spherical point and the three-dimensional position information of the second spherical point corresponding to the second center point; based on the three-dimensional position information of the first spherical point and the three-dimensional position information of the second spherical point, The distance between the first spherical point and the second spherical point is determined as the current opening and closing length between the upper and lower eyelids of the human eye.

Optionally, the second determining module includes:

An obtaining unit configured to obtain the historical opening and closing length of the human eye of the target person determined within a preset time period;

The determining unit is configured to determine the current fatigue degree of the target person based on the current opening and closing length and the historical opening and closing length.

Optionally, the determining unit is specifically configured as

Optionally, the device may further include:

The generating and sending module is configured to generate and send alarm information if the current fatigue degree of the target person is determined to be fatigue after the current fatigue degree of the target person is determined based on the current opening and closing length.

It can be seen from the above content that the fatigue detection method and device based on human eye state recognition provided by the embodiments of the present invention can obtain the face image containing the face of the target person collected by the image capture device for shooting the target person; The face image is detected, and the facial feature points of the face in the face image and the eyelid feature points of the upper and lower eyelids of the human eye in the face are detected. The facial feature points are: the feature points used to represent each part of the face in the face image ; Based on the preset three-dimensional face model, facial feature points and eyelid feature points, construct a target three-dimensional face model corresponding to the target person, where the target three-dimensional face model includes: the upper and lower eyelids of the human eye constructed based on the eyelid feature points; Based on the three-dimensional position information of the upper and lower eyelids of the human eye in the target three-dimensional face model, the current opening and closing length between the upper and lower eyelids of the human eye is determined; based on the current opening and closing length, the current fatigue degree of the target person is determined.

With the application of the embodiment of the present invention, the upper and lower eyelids of the target person’s eyes corresponding to the target person can be constructed based on the facial feature points and the eyelid feature points in the face image containing the target person’s face and the preset three-dimensional face model The target three-dimensional face model, which constructs the spatial information of the human eye of the target person. Based on this spatial information, the spatial distance between the upper and lower eyelids of the human eye can be determined with higher accuracy, that is, the open and closed state of the human eye Furthermore, based on the more accurate spatial distance between the upper and lower eyelids of the human eye, the current fatigue level of the target person can be determined more accurately. In the embodiment of the present invention, the pre-trained human eye state detection model is no longer solely dependent on the detection result of the open and closed state of the human eye in the two-dimensional image, so as to realize the determination of the fatigue degree of the target person and avoid the pre-trained human eye. The state detection model blurs the detection boundary between the closed state and the open state of the human eye in the image, which leads to the occurrence of insufficient detection results. It is possible to determine the spatial information of the human eye, thereby improving the accuracy of the detection result of the human eye state, and the accuracy of the detection result of the current fatigue degree of the target person. Of course, implementing any product or method of the present invention does not necessarily need to achieve all the advantages described above at the same time.

The innovative points of the embodiments of the present invention include:

1. Based on the facial feature points and eyelid feature points in the face image containing the target person’s face and the preset three-dimensional face model, construct a three-dimensional face model of the upper and lower eyelid target corresponding to the target person’s eyes including the target person’s eyes , That is, the spatial information of the human eye of the target person is constructed. Based on the spatial information, the spatial distance between the upper and lower eyelids of the human eye can be determined with higher accuracy, that is, the open and closed state of the human eye. The spatial distance between the upper and lower eyelids of the more flexible human eyes can more accurately determine the current fatigue level of the target person. In the embodiment of the present invention, the pre-trained human eye state detection model is no longer solely dependent on the detection result of the open and closed state of the human eye in the two-dimensional image, so as to realize the determination of the fatigue degree of the target person and avoid the pre-trained human eye. The state detection model blurs the detection boundary between the closed state and the open state of the human eye in the image, which leads to the occurrence of insufficient detection results. It is possible to determine the spatial information of the human eye, thereby improving the accuracy of the detection result of the state of the human eye, and the accuracy of the detection result of the current fatigue degree of the target person.

2. Cut out the area where the human eye is located in the face from the face image, that is, the human eye image, and then use the preset eyelid feature point detection model to detect the eyelid feature points of the upper and lower eyelids of the human eye from the human eye image. Improve the accuracy of the detected eyelid feature points, thereby improving the accuracy of the upper and lower eyelids of the human eye in the target three-dimensional face model constructed based on the eyelid feature points, so as to better improve the fatigue of the target person The accuracy of the test results.

3. Perform mirror image processing on the first image, that is, the left-eye image or the right-eye image to obtain a mirror image, and then stitch the mirror image and the image that has not been mirrored in the human eye image to obtain a stitched image; the subsequent can use the preset The eyelid feature point detection model in the stitched image simultaneously detects the eyelid feature points in the two human eyes in the stitched image, that is, through one detection, the eyelid feature points of the upper and lower eyelids of the two human eyes in the stitched image can be detected. Simplifies the detection process of eyelid feature points using the preset eyelid feature point detection model.

4. The left-eye image and the right-eye image are corrected to obtain the corrected left-eye image and the corrected right-eye image, and then the corrected left-eye image or the corrected right-eye image is subjected to subsequent processing, so that To a certain extent, the detection burden of the preset eyelid feature point detection model can be reduced, and the detection result of eyelid feature points can be improved to a certain extent.

5. When calculating the current opening and closing length between the upper and lower eyelids of the human eye, the first implementation method is to combine the three-dimensional position information of the first center point of the upper eyelid of the human eye in the target three-dimensional face model and the lower eyelid The three-dimensional position information of the second center point, the determined three-dimensional distance between the upper and lower eyelids, is used as the current opening and closing length between the upper and lower eyelids of the human eye to ensure the accuracy of the determined current opening and closing length between the upper and lower eyelids At the same time, the calculation process is simplified. The second implementation method, considering that the actual human eye is spherical, the three-dimensional position information of the human eye space point corresponding to the human eye is determined from the target three-dimensional face model, and spherical fitting is performed to obtain more The sphere model that accurately represents the real human eye, and the distance between the first sphere point corresponding to the first center point of the upper eyelid and the second sphere point corresponding to the second center point of the lower eyelid in the sphere model is determined as a person The current opening and closing length between the upper and lower eyelids of the eye can better improve the accuracy of the current opening and closing length, thereby improving the accuracy of the detection result of the fatigue degree.

Description of the drawings

In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the following will briefly introduce the drawings that need to be used in the description of the embodiments or the prior art. Obviously, the drawings in the following description are only some embodiments of the present invention. For those of ordinary skill in the art, other drawings can be obtained from these drawings without creative work.

FIG. 1 is a schematic flowchart of a fatigue detection method based on eye state recognition provided by an embodiment of the present invention;

2A is a schematic flow chart of determining the current opening and closing length between the upper and lower eyelids of a human eye according to an embodiment of the present invention;

2B is a schematic diagram of another flow chart for determining the current opening and closing length between the upper and lower eyelids of a human eye according to an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of a fatigue detection device based on human eye state recognition provided by an embodiment of the present invention.

Detailed ways

The technical solutions in the embodiments of the present invention will be clearly and completely described below in conjunction with the drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, rather than all the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.

It should be noted that the terms "including" and "having" in the embodiments of the present invention and the drawings and any variations thereof are intended to cover non-exclusive inclusions. For example, the process, method, system, product or device that contains a series of steps or units is not limited to the listed steps or units, but optionally also includes unlisted steps or units, or optionally also includes Other steps or units inherent to these processes, methods, products or equipment.

The present invention provides a fatigue detection method and device based on human eye state recognition, so as to determine the spatial information of the human eye, thereby improving the accuracy of the detection result of the human eye state, and improving the fatigue of the target person The accuracy of the test results. The embodiments of the present invention will be described in detail below.

FIG. 1 is a schematic flowchart of a fatigue detection method based on eye state recognition provided by an embodiment of the present invention. The method can include the following steps:

S101: Obtain a face image containing the face of the target person collected by the image capture device for shooting the target person.

In the embodiment of the present invention, the method can be applied to any type of electronic device, where the electronic device can be a server or a terminal device. In one case, the electronic device may be an image acquisition device. Correspondingly, the electronic device may directly obtain the face image including the face of the target person collected by itself, and then execute the facial image provided by the embodiment of the present invention for the face image. Fatigue detection process based on human eye status recognition. In another case, the electronic device may be a non-image acquisition device, and correspondingly, the electronic device may communicate with the image acquisition device that shoots for the target person. Wherein, the electronic device can communicate with one or more image acquisition devices to obtain facial images collected by one or more image acquisition devices, and then implement the embodiments of the present invention for the facial images collected by each image acquisition device The provided fatigue detection process based on human eye state recognition, in which different image acquisition devices can target different target persons.

In one implementation, the image acquisition device can be set in the vehicle, and correspondingly, the target person is the driver of the vehicle, the image acquisition device can photograph the face of the driver in the vehicle in real time, and the electronic device can obtain the image acquisition device A facial image containing the driver's face collected by shooting for the driver. In one case, the image acquisition device can directly acquire a face image containing only the driver's face, and then send it to the electronic device. In another case, in addition to the driver’s face, the image captured by the image capture device may also include information such as the seat of the vehicle or the driver’s body. After the electronic device obtains the image captured by the image capture device, it can directly The obtained image is used as a face image for subsequent processes; it can also be based on a preset face detection algorithm to detect the image of the area where the face is located from the obtained image, and use the image of the area where the face is obtained from the obtained image. The image is cut out to obtain a face image containing only the driver's face, so as to improve the detection accuracy of subsequent facial feature points and eyelid feature points, and reduce the amount of detection calculation to a certain extent. Among them, the preset face detection algorithm can be: eigenface method (Eigenface) and face detection algorithm based on neural network model, face detection algorithm based on neural network model can be: FasterR-CNN (Faster Region-Convolutional Neural Networks, fast area-convolutional neural network) detection algorithm, this is all possible. The embodiment of the present invention does not limit the specific type of the preset face detection algorithm. The vehicle may be a private car, a truck, a bus, etc. The embodiment of the present invention does not limit the vehicle type of the vehicle.

In another implementation, the image capture device can also monitor the passing vehicles on the road in real time. Correspondingly, the target person can be the target driver, and the electronic device can obtain multiple image capture devices to take pictures of the target driver. The collected face image containing the face of the target driver. In one case, the image acquisition device can directly acquire a face image containing only the face of the target driver, and then send it to the electronic device. In another case, in addition to the face of the target driver, the image captured by the image capture device may also include information such as the window and front of the vehicle. After the electronic device obtains the image captured by the image capture device, it can directly The image of the face is used as the face image for the subsequent process; it can also be based on the preset face detection algorithm to detect the image of the area where the face is located from the image, and cut the image of the area where the face is located from the image to obtain only A face image containing the face of the target driver.

In another implementation, the image capture device can monitor indoor household personnel in real time. Accordingly, the target person can be the target household person, and the electronic device can obtain the target captured by the image capture device for shooting the target household person. Face image of the face of the householder.

S102: Detect the face image, and detect the facial feature points of the face in the face image and the eyelid feature points of the upper and lower eyelids of the human eyes in the face.

Among them, the facial feature points are: feature points used to represent various parts of the face in the face image.

In this step, the first feature point detection model established in advance can be used to detect the face image, and the facial feature points of the face in the face image and the eyelid feature points of the upper and lower eyelids of the human eyes in the face can be detected. In one case, the pre-established first feature point detection model is: a neural network model obtained by training based on a first sample image calibrated with facial feature points and eyelid feature points.

In one case, the embodiment of the present invention may also include a process of training to obtain a pre-established first feature point detection model. Specifically: the electronic device may first obtain an initial first feature point detection model, and the initial first feature The point detection model includes a feature extraction layer and a feature classification layer; a first sample image is obtained, and each first sample image includes a human face; and calibration information corresponding to each first sample image is obtained, wherein the calibration information includes the first sample image. The sample image contains the calibration position information of the calibration feature points of the face, and the calibration feature points include: facial feature points representing various parts of the face and eyelid feature points in the upper and lower eyelids of the human eye.

The facial feature points of each part may include: each feature point in the face that characterizes the position of the nose, such as nose wings, nose bridge, and nose tip; it may also include various feature points that characterize the position of the lips, such as lips. The feature points of the edge of the lip line; it can also include the feature points that characterize the position of the eyebrows, such as the feature points of the edge of the eyebrow; it can also include the feature points that characterize the location of the human eye, such as the corner of the eye feature point, the eye socket feature Points and pupil feature points, etc.; can also include feature points that characterize the position of the mandible, such as feature points on the contour of the mandible, that is, feature points on the chin contour, etc.; can also include features that characterize the position of the ear Feature points, such as each feature point on each contour of the ear. The calibration information can be manually calibrated or calibrated through a specific calibration procedure.

The electronic device inputs each first sample image into the feature extraction layer of the initial first feature point detection model to obtain the image feature of each first sample image; input the image feature of each first sample image into the initial The feature classification layer of the first feature point detection model of the first feature point to obtain the current location information of the calibration feature point in each first sample image; the current location information of the calibration feature point in each first sample image and its corresponding calibration location Information is matched; if the matching is successful, the first feature point detection model including the feature extraction layer and the feature classification layer is obtained, that is, the pre-established first feature point detection model is obtained; if the matching is not successful, the feature extraction layer and features are adjusted Classification layer parameters, return to the step of executing the feature extraction layer of inputting each first sample image into the initial feature point detection model to obtain the image features of each first sample image; until the matching is successful, the feature extraction is obtained The first feature point detection model of the layer and feature classification layer.

Wherein, the above process of matching the current position information of the calibration feature point in each first sample image with the corresponding calibration position information may be: calculating the current position information of each calibration feature point by using a preset loss function Determine whether the first loss value is less than the first preset loss threshold value between the corresponding calibration position information; if it is determined that the first loss value is less than the first preset loss threshold value, it is determined that the matching is successful. It can be determined that the initial feature point detection model converges, that is, it is determined that the training of the initial feature point detection model is completed, and the pre-established feature point detection model is obtained; if it is determined that the first loss value is not less than the first preset loss threshold, It is determined that the matching is unsuccessful.

Wherein, each first sample image has a corresponding relationship with the current position information of the calibration feature point, and each first sample image has a corresponding relationship with the calibration position information of the calibration feature point in the calibration information, then the calibration feature point There is a corresponding relationship between the current position information and the calibration position information of the calibration feature points in the calibration information.

After training to obtain the pre-established first feature point detection model, the electronic device can detect the obtained face image based on the pre-established first feature point detection model, and detect the facial feature points of the face in the face image And the eyelid characteristic points of the upper and lower eyelids of the human eye in the face.

S103: Based on the preset three-dimensional face model, facial feature points, and eyelid feature points, construct a target three-dimensional face model corresponding to the target person.

Among them, the target three-dimensional face model includes: the upper and lower eyelids of a human eye constructed based on eyelid feature points.

In this step, a preset three-dimensional face model is prestored locally or in a storage device connected to the electronic device, and the electronic device determines the facial feature points of the face in the face image and the upper and lower eyelids of the human eyes in the face. After the feature points, a target three-dimensional face model corresponding to the target person can be constructed based on the preset three-dimensional face model, facial feature points, and eyelid feature points. Among them, 3DMM (3D Morphable Models, three-dimensional deformation model) technology can be used to construct a target three-dimensional face model corresponding to the target person based on preset three-dimensional face models, facial feature points, and eyelid feature points.

In an implementation manner, the S103 may include:

From the preset three-dimensional face model, the spatial position information of the spatial point at the preset facial position is determined as the spatial position information of the spatial point to be processed. Among them, there is a corresponding relationship between the spatial point to be processed and the image feature point. The feature points are: facial feature points and eyelid feature points;

In one implementation manner, the electronic device may receive a user selection instruction, where the user selection instruction carries a preset face position of a spatial point that needs to be selected, and the electronic device may, based on the preset face position carried by the user selection instruction, In the preset three-dimensional face model, the spatial position information of the spatial point at the preset face position is determined as the spatial position information of the spatial point to be processed. In another implementation manner, the electronic device may prestore the preset face position, and the electronic device may read the preset face position from the corresponding storage location, and then determine from the preset three-dimensional face model The spatial position information of the spatial point at the preset face position is used as the spatial position information of the spatial point to be processed.

Among them, there is a corresponding relationship between the spatial points to be processed and the image feature points, and the image feature points are: facial feature points and eyelid feature points, and the to-be-processed spatial points have a one-to-one correspondence with the image feature points. In one case, the preset face position may be set based on the position of the calibration feature point of the face contained in the first sample image.

In one case, the preset three-dimensional face model can be expressed by the following formula (1):

Among them, S represents the preset three-dimensional face model,

Represents the preset average face, A _id represents the shape information of the human face, A _exp represents the expression information of the human face, α _id represents the weight of the shape information of the human face, which can be called the shape weight, α _exp The weight of the expression information representing the human face can be called the expression weight.

The electronic device can draw the three-dimensional face model it represents based on the above formula (1), and the three-dimensional face model is composed of point clouds. The electronic device can determine the spatial point at the preset face position from the drawn three-dimensional face model as the spatial point to be processed, and obtain the spatial position information of the spatial point to be processed.

After the electronic device determines the spatial position information of the spatial point to be processed, it can project each spatial point to be processed into the face image based on the preset weak perspective projection matrix, that is, use the weak perspective projection matrix and each to be processed The spatial position information of the spatial point determines the projection position information of the projection point of each spatial point to be processed in the face image. Based on the projection position information of the projection point of each spatial point to be processed and the imaging position information of the image feature point corresponding to each spatial point to be processed, a target three-dimensional face model corresponding to the target person is constructed. Wherein, the imaging location information of the image feature point is the location information of the image feature point in the face image.

Among them, the above-mentioned process of constructing the target three-dimensional face model corresponding to the target person based on the projection position information of the projection point of each spatial point to be processed and the imaging position information of the image feature point corresponding to each spatial point to be processed may be: Based on the projection position information of the projection point of each spatial point to be processed and the imaging position information of the image feature point corresponding to each spatial point to be processed, the distance error of each spatial point to be processed and its corresponding image feature point is determined, based on The principle of least squares and the distance error of each spatial point to be processed and its corresponding image feature point are used to construct the objective function. When the solution minimizes the function value of the objective function or satisfies a preset constraint condition, a solution of the corresponding unknown quantity in the objective function is obtained based on the solution to obtain a target three-dimensional face model corresponding to the target person.

In one case, the preset weak perspective projection matrix can be expressed by the following formula (2):

s _i2d =fPR(α,β,γ)(S _i +t _3d ); (2)

Among them, s _i2d represents the projection position information of the projection point of the i-th spatial point to be processed, where i can be an integer in [1, n], where n represents the number of spatial points to be processed, f represents the scale factor, and R (α, β, γ) represents a 3*3 rotation matrix, α represents the rotation angle of the preset three-dimensional face model under the horizontal axis in the preset space rectangular coordinate system, and β represents the preset three-dimensional face The rotation angle of the model under the vertical axis in the preset space rectangular coordinate system, γ represents the rotation angle of the preset three-dimensional face model under the vertical axis in the preset space rectangular coordinate system, and t _3d represents the translation vector; S _i represents the spatial position information of the i-th spatial point to be processed, and the rotation matrix and translation vector are used to: convert the preset three-dimensional face model from the preset spatial rectangular coordinate system where it is located to the image acquisition device In the device coordinate system.

The objective function can be expressed by the following formula (3):

Among them, P represents the function value of the objective function, s _i2dt represents the imaging position information of the image feature point corresponding to the i-th spatial point to be processed, ‖·‖ represents the modulus of the vector, and the vector represents: the i-th spatial point to be processed The distance error between the imaging position information of the corresponding image feature point and the projection position information of the projection point of the i-th spatial point to be processed.

In the embodiment of the present invention, the specific values of f, R(α, β, γ), t _3d , α _id , and α _exp can be continuously adjusted through an iterative method to minimize P or satisfy preset constraints. Condition, the preset constraint condition may be that P is not greater than a preset distance error threshold. Obtain the specific values of f,R(α,β,γ),t _3d ,α _id ,α _exp when P reaches the minimum or makes P meet the preset constraints, as the final value, α _id ,α The final value of _exp is substituted into formula (1) to obtain the target three-dimensional face model corresponding to the target person.

S104: Determine the current opening and closing length between the upper and lower eyelids of the human eye based on the three-dimensional position information of the upper and lower eyelids of the human eye in the target three-dimensional face model.

S105: Determine the current fatigue level of the target person based on the current opening and closing length.

Among them, the opening and closing state of the human eye of a person, that is, the state of the human eye, can represent the fatigue degree of the person to a certain extent, and the opening and closing state of the human eye can be measured by the opening and closing length between the upper and lower eyelids of the human eye Logo. Among them, the distance between the upper and lower eyelids of the human eyes will be relatively small when the average person is in a fatigue state, and the distance between the upper and lower eyelids of the human eye will be relatively large when the person is in a non-fatigue state. In the embodiment of the present invention, the target three-dimensional face model includes the upper and lower eyelids of the human eye of the target person. Through the upper and lower eyelids in the target three-dimensional face model, the three-dimensional distance between the upper and lower eyelids can be determined and used as the current opening and closing length. Based on the current opening and closing length, the current fatigue level of the target person is determined.

In one case, it can be based on the three-dimensional position information of the upper and lower eyelids of any human eye in the target three-dimensional face model, such as the three-dimensional position information of the upper and lower eyelids of the left eye or the three-dimensional position information of the upper and lower eyelids of the right eye, to determine the upper and lower eyelids The current opening and closing length between the two, and then determine the current state of the target person.

In another case, it can be: for the three-dimensional position information of the upper and lower eyelids of the two human eyes of the target person, such as the three-dimensional position information of the upper and lower eyelids of the left and right eyes, determine the current opening and closing length between the upper and lower eyelids, and then , To determine the current status of the target personnel. Among them, it can be the three-dimensional position information of the upper and lower eyelids of each eye of the target person to determine the opening and closing length between the upper and lower eyelids of each eye, and then calculating the average of the opening and closing lengths between the upper and lower eyelids of the two eyes The value is used as the current opening and closing length between the upper and lower eyelids to determine the current state of the target person.

With the application of the embodiment of the present invention, the upper and lower eyelids of the target person’s eyes corresponding to the target person can be constructed based on the facial feature points and the eyelid feature points in the face image containing the target person’s face and the preset three-dimensional face model The target three-dimensional face model, which constructs the spatial information of the human eye of the target person. Based on this spatial information, the spatial distance between the upper and lower eyelids of the human eye can be determined with higher accuracy, that is, the open and closed state of the human eye Furthermore, based on the more accurate spatial distance between the upper and lower eyelids of the human eye, the current fatigue level of the target person can be determined more accurately. The embodiment of the present invention no longer only relies on the detection result of the closed state of the human eye in the two-dimensional image by using the pre-trained human eye state detection model to realize the determination of the fatigue degree of the target person and avoid the pre-trained eye state The detection model blurs the detection boundary between the closed state and the open state of the human eye in the image, which leads to the occurrence of insufficient detection results. It is possible to determine the spatial information of the human eye, thereby improving the accuracy of the detection result of the human eye state, and the accuracy of the detection result of the current fatigue degree of the target person.

In another embodiment of the present invention, the S102 may include:

Detect the face image, and detect the facial feature points of the face in the face image;

Based on the facial feature points, determine and cut out the area where the eyes of the face are located from the face image, and use it as the eye image;

Using the preset eyelid feature point detection model, the eyelid feature points of the upper and lower eyelids of the human eye are detected from the human eye image. Among them, the preset eyelid feature point detection model is: a model trained based on sample images marked with the eyelid feature points of the upper and lower eyelids of a human eye.

The face image contains the characteristics of the entire face of the target person, and the eyelid point of the eyelid of the human eye is directly detected in the face image. It is inevitable that the detection is not accurate enough. In this embodiment, the face image can be detected first, and the facial feature points that can represent the various parts of the target person’s face in the face image are detected, and then, based on the facial feature points, the face is determined from the face image The area where the human eye is located is used as the human eye image, and the human eye image is cut out from the face image. Furthermore, based on the preset eyelid feature point detection model, the eyelid feature points of the upper and lower eyelids of the human eye are detected from the human eye image containing the human eye. In order to improve the accuracy of the detected eyelid feature points of the human eye to a certain extent.

Wherein, the preset eyelid feature point detection model is: a model trained based on sample images marked with eyelid feature points of the upper and lower eyelids of a human eye. The preset eyelid feature point detection model may be a neural network model. For the training process of the preset eyelid feature point detection model, refer to the training process of the first feature point detection model established in advance. It can be understood that, for clear layout, the sample image required by the preset eyelid feature point detection model can be called the second sample image, which is different from the first sample image of the first feature point detection model established in advance. The second sample image is an image marking the eyelid feature points of the upper and lower eyelids of a human eye, and the calibration information corresponding to the second sample image includes the calibration position information of the eyelid feature points of the upper and lower eyelids of the human eye. Wherein, the eyelid feature points of the upper and lower eyelids of the human eye marked by the second sample image may be eyelid feature points calibrated manually or through a specific calibration procedure.

The above detection of the face image, the detection of facial feature points that can represent each part of the target person’s face in the face image, can be: based on the pre-established second feature point detection model, the face image is detected, and the detection is obtained The face image can represent the facial feature points of each part of the target person’s face. The pre-established second feature point detection model is: a nerve trained on the third sample image marked with facial feature points that can represent each part of the face Network model. For the training process of the pre-established second feature point detection model, refer to the above-mentioned training process of the pre-established first feature point detection model. Different from the first sample image of the pre-established first feature point detection model, the third sample image required by the pre-established second feature point detection model is an image marked with facial feature points that can represent various parts of the face, and The calibration information corresponding to the third sample image includes calibration position information that can characterize facial feature points of various parts of the face.

Furthermore, based on the two-dimensional position information of each feature point representing the location of the human eye in the facial feature point, the area where the human eye of the target person is located is determined and cut out from the face image, as the human eye image. Among them, it can be based on the two-dimensional position information of each feature point representing the location of the human eye in the facial feature point, determining the smallest rectangular area containing the human eye of the target person, taking the rectangular area as the area of the human eye, and Cut out to get the human eye image. It may be that the images of the area where the target person is located are respectively intercepted for the two eyes of the target person to obtain the human eye image.

In another embodiment of the present invention, the human eye image includes a left eye image and a right eye image;

Before the step of using the preset eyelid feature point detection model to detect the eyelid feature points of the upper and lower eyelids of the human eye from the human eye image (S102), the method may further include:

Performing mirror image processing on the first image to obtain a mirror image, where the first image is a left-eye image or a right-eye image;

Splicing the mirror image and the image that has not been mirrored in the human eye image to obtain a spliced image;

The S102 may include:

Using a preset eyelid feature point detection model, from the stitched image, detect the eyelid feature points of the upper and lower eyelids of the human eye in the mirror image, and the eyelid feature points of the upper and lower eyelid of the human eye in the image without mirror processing;

Mirror image processing is performed on the eyelid feature points of the upper and lower eyelids of the human eye in the mirror image to obtain the eyelid feature points after mirroring to obtain the eyelid feature points of the upper and lower eyelids of the human eye in the human eye image.

Among them, the human eye image includes: an image containing the left eye of the target person, which may be called a left eye image; and an image containing the right eye of the target person, which may be called a right eye image. In order to reduce the complexity of using the preset eyelid feature point detection model to detect the target person’s eyelid feature point to a certain extent, and shorten the use of the preset eyelid feature point detection model to detect the target person’s eyelid feature point. The detection time. In this embodiment, mirror image processing may be performed on the first image to obtain a mirror image, that is, mirror image processing is performed on the left eye image or the right eye image to obtain a mirror image. Then stitch the mirror image and the image that has not been mirrored in the human eye image to obtain the stitched image; input the stitched image into the preset eyelid feature point detection model to use the preset eyelid feature point detection model to extract , To detect the eyelid feature points of the upper and lower eyelids of the human eye in the mirror image, and the eyelid feature points of the upper and lower eyelids of the human eye in the image without mirror processing. The preset eyelid feature point detection model can simultaneously detect the mirror image and the image without mirror processing, which can shorten the detection time required to detect the eyelid feature points of the target person by using the preset eyelid feature point detection model.

Wherein, if the right-eye image is mirrored, the image that has not been mirrored is the left-eye image; if the left-eye image is mirrored, the image that has not been mirrored is the right-eye image.

Mirroring the left-eye image or the right-eye image can make the left-eye image mirror the right-eye image corresponding to the left-eye image, or make the right-eye image mirror the left-eye image corresponding to the right-eye image, to a certain extent Reduce the complexity of using the preset eyelid feature point detection model to detect the eyelid feature points of the target person.

It is understandable that after training to obtain the aforementioned preset eyelid feature point detection model, the required second sample image may include the left eye image of the sample person and the left eye image obtained by mirroring the right eye image of the sample person. Or include the right eye image of the sample person’s right eye image and the right eye image of the sample person’s left eye image. If the second sample image required by the above-mentioned preset eyelid feature point detection model is obtained through training, it contains the left eye image of the sample person and the left eye image obtained by mirroring the right eye image of the sample person, and the subsequent, in the detection process , The first image is the right eye image of the target person, that is, the right eye image of the target person needs to be mirrored. If the second sample image required by the above-mentioned preset eyelid feature point detection model is obtained by training, it contains the right eye image of the sample person and the right eye image obtained by mirroring the left eye image of the sample person, and then, in the detection process , The first image is the left eye image of the target person, that is, the left eye image of the target person needs to be mirrored.

When the above-mentioned preset eyelid feature point detection model is obtained by training, the right eye image or left eye image of the sample person is mirrored. To a certain extent, it can also increase the training to obtain the above-mentioned preset eyelid feature point detection model. The number of second sample images.

The above process of splicing the mirror image and the image that has not been mirrored in the human eye image to obtain the spliced image can be: splicing the mirror image and the image that has not been mirrored in the human eye image in the spatial dimension or channel dimension. Splicing, where the splicing of the spatial dimension may be: splicing the mirror image and the image that has not been mirrored in the human eye image left and right spliced or spliced up and down. Left and right splicing can be: the right side of the mirror image is spliced with the left side of the image that is not mirrored in the human eye image, and the left side of the mirror image is the right side of the image that is not mirrored in the human eye image. Make splicing. Top and bottom splicing can be: the upper side of the mirror image is spliced with the lower side of the image that is not mirrored in the human eye image, and the lower side of the mirror image is the upper side of the image that is not mirrored in the human eye image. Make splicing.

In another embodiment of the present invention, before the step of performing mirror image processing on the first image to obtain a mirror image, the method may further include:

The left-eye image and the right-eye image are corrected to obtain the corrected left-eye image and the corrected right-eye image, where the normalized processing is: making the line between the two eye corner points in the image to be processed and the preset image The coordinate axes of the coordinate system are parallel, and the images to be processed are the left eye image and the right eye image;

The step of performing mirror image processing on the first image to obtain a mirror image may include:

In one case, the head of the target person may be tilted. In this embodiment, in order to improve the accuracy of the detection result of the eyelid feature points, before mirroring the left eye image and the right eye image, The left-eye image and the right-eye image can be corrected first, that is, the connection between the two corner points of the left-eye image is parallel to the horizontal axis of the preset image coordinate system, and the two corner points of the right-eye image are connected. The line is parallel to the horizontal axis of the preset image coordinate system; or, making the line between the two corner points of the left eye image parallel to the vertical axis of the preset image coordinate system, and making the line between the two corner points of the right eye image Parallel to the longitudinal axis of the preset image coordinate system, this is all possible.

Subsequent, mirror image processing can be performed on the left-eye image after normalization or the right-eye image after normalization to obtain a mirror image.

Wherein, the preset image coordinate system may be the image coordinate system of the image acquisition device.

In another embodiment of the present invention, as shown in FIG. 2A, the S104 may include the following steps:

S201A: From the target three-dimensional face model, detect the three-dimensional position information of the first center point of the upper eyelid and the three-dimensional position information of the second center point of the lower eyelid of the human eye.

S202A: Based on the three-dimensional position information of the first center point and the three-dimensional position information of the second center point, determine the distance between the first center point and the second center point as the current opening and closing length between the upper and lower eyelids of the human eye.

In this embodiment, in order to ensure the accuracy of the determined opening and closing lengths between the upper and lower eyelids of the human eye, and at the same time reduce the computational burden of the electronic device, the human eye can be directly detected from the target three-dimensional face model. The first center point of the upper eyelid and the second center point of the lower eyelid are detected to obtain the bisecting point of the upper eyelid and the bisecting point of the lower eyelid of the human eye; further, the spatial position information of the first center point and The spatial position information of the second center point, that is, the three-dimensional position information of the first center point and the three-dimensional position information of the second center point of the lower eyelid. Based on the three-dimensional position information of the first center point and the three-dimensional position information of the second center point of the lower eyelid, the distance between the first center point and the second center point is determined as the current opening between the upper and lower eyelids of the human eye Closed length. Specifically, the distance between the first center point and the second center point can be expressed as:

Among them, (x ₁ , y ₁ , z ₁ ) represents the three-dimensional position information of the first center point, and (x ₂ , y ₂ , z ₂ ) represents the three-dimensional position information of the second center point.

In another embodiment of the present invention, as shown in FIG. 2B, the S104 may include the following steps:

S201B: Determine the three-dimensional position information of the human eye space point corresponding to the human eye from the target three-dimensional face model.

S202B: Perform spherical fitting based on the three-dimensional position information of the spatial point of the human eye to obtain a sphere model representing the human eye.

S203B: From the target three-dimensional face model, detect the three-dimensional position information of the first center point of the upper eyelid and the three-dimensional position information of the second center point of the lower eyelid of the human eye.

S204B: Based on the three-dimensional position information of the first center point and the three-dimensional position information of the second center point, determine the three-dimensional position information of the first spherical point corresponding to the first center point and the second center point corresponding to the second center point from the sphere model The three-dimensional position information of the two spherical points.

S205B: Based on the three-dimensional position information of the first spherical point and the three-dimensional position information of the second spherical point, determine the distance between the first spherical point and the second spherical point as the current opening and closing length between the upper and lower eyelids of the human eye.

In this embodiment, in view of the consideration of the actual shape of the eyeball of the human eye, in order to further improve the accuracy of the determined opening and closing length between the upper and lower eyelids of the human eye, it is possible to first determine from the target three-dimensional face model Human eye space points corresponding to human eyes, for example: eyeball space points representing eyeballs; based on the three-dimensional position information of the eye space points in the target three-dimensional face model, spherical fitting is performed to obtain a spherical model representing human eyes. Furthermore, based on the three-dimensional position information of the first center point and the three-dimensional position information of the second center point, the three-dimensional position information of the first spherical point corresponding to the first center point and the second center point corresponding to the second center point are determined from the sphere model. The three-dimensional position information of the two spherical points, based on the three-dimensional position information of the first spherical point and the three-dimensional position information of the second spherical point, determine the distance between the first spherical point and the second spherical point as the upper and lower eyelids of the human eye The current opening and closing length between.

In one case, based on the three-dimensional position information of the first center point and the three-dimensional position information of the second center point, the three-dimensional position information of the first spherical point corresponding to the first center point and the second center point are determined from the sphere model. The process of the three-dimensional position information of the second spherical point corresponding to the center point may be: based on the three-dimensional position information of the first center point and the position information of the optical center of the image acquisition device, the optical center and the first center point of the image acquisition device The line between the line and the two intersection points of the sphere model, the intersection point closest to the first center point is taken as the first sphere point corresponding to the first center point, and the first sphere point is determined based on the sphere model The three-dimensional position information; based on the three-dimensional position information of the second center point and the position information of the optical center of the image acquisition device, make the connection between the optical center of the image acquisition device and the second center point, and connect the connection with the sphere model Among the two intersection points, the intersection point closest to the second center point is used as the second spherical point corresponding to the second center point, and the three-dimensional position information of the second spherical point is determined based on the sphere model.

In this embodiment, the spatial points of the human eye in the target three-dimensional face model are spherically fitted to obtain a sphere model that characterizes the human eye, so that the obtained human eye shape is closer to the shape of the real human eye, and is based on the sphere. The three-dimensional position information of the first spherical point and the three-dimensional position information of the second spherical point in the model can determine the opening and closing length between the upper and lower eyelids of the human eye with higher accuracy.

In another embodiment of the present invention, the S105 may include:

Obtain the historical opening and closing length of the human eye of the target person determined within the preset time period;

Based on the current opening and closing length and the historical opening and closing length, the current fatigue level of the target person is determined.

In this embodiment, after determining the current opening and closing length between the upper and lower eyelids of the human eye of the target person, the time dimension information, that is, the historical opening and closing length of the human eye, can be combined to determine the current fatigue degree of the target person.

Wherein, in order to ensure the timeliness of the determined fatigue degree of the target person, the electronic device can obtain the face image containing the face of the target person collected at the current moment when the image capture device is shooting the target person. The preset time length may be a time length preset by the user, or a time length independently set by the electronic device, both of which are possible. Wherein, the historical opening and closing length of the eyes of the target person determined within the preset time period may include: the historical opening and closing length of the eyes of the target person determined within the preset time period ahead of the current moment, that is, The historical opening and closing length of the human eye of the target person determined within the latest preset time period at the current moment.

In one case, the electronic device can store the historical opening and closing length of the human eye of the target person locally or in the storage device connected to it. After calculating the current opening and closing length of the human eye, the electronic device can download the corresponding storage location Obtain the historical opening and closing length of the target person’s eye. Wherein, the historical opening and closing length of the human eye of the target person is determined based on the face image before the face image collected when the image acquisition device shoots the target person. The process of determining the historical opening and closing length of the target person's eyes is similar to the process of determining the current opening and closing length of the target person's eyes, and will not be repeated here.

In the embodiment of the present invention, through the target three-dimensional face model, a more accurate opening and closing length of the human eye can be determined, that is, the physical length of the opening and closing of the human eye. Furthermore, combined with the time dimension, the target can be monitored more flexibly and accurately. The fatigue of the personnel.

In another embodiment of the present invention, the step of determining the current fatigue degree of the target person based on the current opening and closing length and the historical opening and closing length may include:

Count the number of first results that characterize the comparison results whose opening and closing length is less than the preset length threshold;

Based on the current opening and closing length and the total number of historical opening and closing lengths and the first result number, the current fatigue level of the target person is determined.

In this embodiment, the electronic device can obtain a preset length threshold set in advance, and compare each opening and closing length, that is, the current opening and closing length and the historical opening and closing length, with the preset length threshold, respectively, to compare each opening and closing length. The size of the length and the preset length threshold is used to obtain the comparison result; further, the number of comparison results indicating that the opening and closing length is less than the preset length threshold is obtained by statistics, as the first result quantity; subsequent, based on the current opening and closing length and historical opening and closing The total number of lengths and the number of first results determine the current fatigue level of the target person. Among them, the process of determining the current fatigue degree of the target person based on the current opening and closing length and the total number of historical opening and closing lengths and the number of first results may be: calculating the ratio of the number of first results to the total number, if the ratio is greater than The preset ratio determines that the current fatigue degree of the target person is fatigue; if the ratio is not greater than the preset ratio, the current fatigue degree of the target person is determined to be non-fatigue. It can also be: Calculate the difference between the total quantity and the first result quantity, if the difference is less than the preset difference, determine the current fatigue degree of the target person as fatigue; if the difference is not less than the preset difference, then determine The current fatigue level of the target person is not fatigued.

For example: The historical opening and closing length of the human eye of the target person determined within the preset time is 99; that is, the current opening and closing length and the historical opening and closing length are 100. If the statistics show that the opening and closing length is less than the preset length The first result of the comparison result of the threshold is 80. At this time, it can be determined that the current fatigue degree of the target person is fatigue.

In another implementation manner, after the number of first results indicating that the opening and closing length is less than the preset length threshold is obtained by statistics, the first number can be directly compared with the preset number, and if the number of first results is greater than The preset number determines that the current fatigue level of the target person is fatigue; if the first result number is not greater than the preset number, it is determined that the current fatigue level of the target person is not fatigued.

In another embodiment of the present invention, after the step of determining the current fatigue degree of the target person based on the current opening and closing length, the method may further include:

In the embodiment of the present invention, if the target person is the driver, in order to reduce the occurrence of car accidents caused by fatigue driving to a certain extent, when the fatigue degree of the target person is determined to be fatigue, warning information can be generated, To remind the user that the target person is in a state of fatigue, so that the user can take corresponding measures for this situation, so as to reduce the occurrence of car accidents caused by fatigue driving to a certain extent.

In another case, if the target person is the driver, the driver can also be prompted to enter the automatic driving mode, or the driving mode control signal can be sent to control the vehicle to automatically enter the automatic driving mode, so as to reduce the fatigue caused by driving to a certain extent Of the car accident.

In another embodiment of the present invention, if the target person is a householder, a household control signal of the household equipment can be generated and sent. The household control signal can be to control the playback volume of the TV to decrease or turn off the TV; it can be: control The current setting temperature of the air conditioner is within the preset temperature range, and so on.

Corresponding to the foregoing method embodiment, the embodiment of the present invention provides a fatigue detection device based on human eye state recognition, as shown in FIG. 3, which may include:

The first obtaining module 310 is configured to obtain a face image containing the face of the target person collected by the image capturing device for shooting the target person;

The detection module 320 is configured to detect the face image, and detect facial feature points of the face in the face image and eyelid feature points of the upper and lower eyelids of the human eyes in the face, wherein the facial feature Points are: feature points used to characterize various parts of the face in the face image;

The construction module 330 is configured to construct a target three-dimensional face model corresponding to the target person based on a preset three-dimensional face model, the facial feature points and the eyelid feature points, wherein the target three-dimensional face model Including: the upper and lower eyelids of the human eye constructed based on the eyelid feature points;

The first determining module 340 is configured to determine the current opening and closing length between the upper and lower eyelids of the human eye based on the three-dimensional position information of the upper and lower eyelids of the human eye in the target three-dimensional face model;

The second determining module 350 is configured to determine the current fatigue degree of the target person based on the current opening and closing length.

In another embodiment of the present invention, the detection module 320 includes:

In another embodiment of the present invention, the human eye image includes a left eye image and a right eye image; the apparatus may further include:

The mirroring module (not shown in the figure) is configured to detect the eyelid feature points of the upper and lower eyelids of the human eye from the human eye image using the preset eyelid feature point detection model, Performing mirror image processing on an image to obtain a mirror image, wherein the first image is the left eye image or the right eye image;

A splicing module (not shown in the figure), configured to splice the mirror image and the image that has not been mirrored in the human eye image to obtain a spliced image;

The second detection unit is specifically configured to use a preset eyelid feature point detection model to detect the eyelid feature points of the upper and lower eyelids of the human eye in the mirror image from the stitched image, and the The eyelid feature points of the upper and lower eyelids of the human eye in the mirror image processed; the eyelid feature points of the upper and lower eyelids of the human eye in the mirror image are mirrored to obtain the eyelid feature points after the mirror image to obtain the human eye image The characteristic points of the upper and lower eyelids of the human eye.

In another embodiment of the present invention, the detection module 320 may further include:

In another embodiment of the present invention, the construction module 330 is specifically configured to determine the spatial position information of the spatial point at the preset face position from the preset three-dimensional face model, as the to-be-processed The spatial position information of the spatial point, wherein the spatial point to be processed and the image feature point have a corresponding relationship, and the image feature point is: the facial feature point and the eyelid feature point; using a weak perspective projection matrix and each The spatial position information of the spatial point to be processed, the projection position information of the projection point of each spatial point to be processed in the face image is determined; the projection position information of the projection point of each spatial point to be processed and each to-be-processed The imaging position information of the image feature point corresponding to the spatial point is used to determine the distance error of each spatial point to be processed and its corresponding image feature point; to determine whether the distance error is less than the preset error; if it is less, the corresponding target person is obtained The target three-dimensional face model; if it is not smaller than, adjust the spatial position information of the spatial point to be processed in the preset three-dimensional face model; return to execute the use of the weak perspective projection matrix and the spatial position of each spatial point to be processed Information, determining the projection position information of the projection point of each spatial point to be processed in the face image.

In another embodiment of the present invention, the first determining module 340 is specifically configured to: detect the three-dimensional position of the first center point of the upper eyelid of the human eye from the target three-dimensional face model Information and the three-dimensional position information of the second center point of the lower eyelid; determine the first center point and the second center based on the three-dimensional position information of the first center point and the three-dimensional position information of the second center point The distance between the points is used as the current opening and closing length between the upper and lower eyelids of the human eye.

In another embodiment of the present invention, the first determining module 340 is specifically configured to determine, from the target three-dimensional face model, the three-dimensional position information of the human eye spatial point corresponding to the human eye; Based on the three-dimensional position information of the human eye space point, perform spherical fitting to obtain a sphere model that characterizes the human eye; from the target three-dimensional face model, detect the first center of the upper eyelid of the human eye The three-dimensional position information of a point and the three-dimensional position information of the second center point of the lower eyelid; based on the three-dimensional position information of the first center point and the three-dimensional position information of the second center point, determine from the sphere model The three-dimensional position information of the first spherical point corresponding to the first center point and the three-dimensional position information of the second spherical point corresponding to the second center point; based on the three-dimensional position information of the first spherical point and the second The three-dimensional position information of the spherical point determines the distance between the first spherical point and the second spherical point as the current opening and closing length between the upper and lower eyelids of the human eye.

In another embodiment of the present invention, the second determining module 350 includes:

In another embodiment of the present invention, the determining unit is specifically configured to

In another embodiment of the present invention, the device may further include:

Generating a sending module (not shown in the figure), configured to determine the current fatigue level of the target person based on the current opening and closing length, if it is determined that the current fatigue level of the target person is fatigue , Generate and send alarm information.

The foregoing device embodiment corresponds to the method embodiment, and has the same technical effect as the method embodiment. For specific description, refer to the method embodiment. The device embodiment is obtained based on the method embodiment, and the specific description can be found in the method embodiment part, which will not be repeated here.

Those of ordinary skill in the art can understand that the drawings are only schematic diagrams of an embodiment, and the modules or processes in the drawings are not necessarily necessary for implementing the present invention.

A person of ordinary skill in the art can understand that the modules in the device in the embodiment may be distributed in the device in the embodiment according to the description of the embodiment, or may be located in one or more devices different from this embodiment with corresponding changes. The modules of the above-mentioned embodiments can be combined into one module or further divided into multiple sub-modules.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that: The technical solutions recorded in the foregoing embodiments are modified, or some of the technical features are equivalently replaced; these modifications or replacements do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

A fatigue detection method based on human eye state recognition, which is characterized in that it includes:

Obtaining a face image containing the face of the target person collected by the image capture device for shooting the target person;

The face image is detected, and the facial feature points of the face in the face image and the eyelid feature points of the upper and lower eyelids of the human eyes in the face are detected, wherein the facial feature points are: Describe the feature points of each part of the face in the face image;

Based on the preset three-dimensional face model, the facial feature points, and the eyelid feature points, constructing a target three-dimensional face model corresponding to the target person, wherein the target three-dimensional face model includes: based on the eyelid feature The upper and lower eyelids of the human eye constructed by points;

Determining the current opening and closing length between the upper and lower eyelids of the human eye based on the three-dimensional position information of the upper and lower eyelids of the human eye in the target three-dimensional face model;

Based on the current opening and closing length, the current fatigue degree of the target person is determined.
The method according to claim 1, wherein the detection of the face image, the detection of facial feature points of the face in the face image and the eyelid features of the upper and lower eyelids of the human eye in the face The steps to point include:

Detecting the face image, and detecting facial feature points of the face in the face image;

Based on the facial feature points, determine and cut out the area where the human eyes in the face are located from the human face image, as a human eye image;

Using a preset eyelid feature point detection model, the eyelid feature points of the upper and lower eyelids of the human eye are detected from the human eye image, where the preset eyelid feature point detection model is: A model trained on sample images of the eyelid feature points of the upper and lower eyelids.
3. The method according to claim 2, wherein the human eye image comprises a left eye image and a right eye image;

Before the step of using a preset eyelid feature point detection model to detect the eyelid feature points of the upper and lower eyelids of the human eye from the human eye image, the method further includes:

Performing mirror image processing on the first image to obtain a mirror image, wherein the first image is the left eye image or the right eye image;

Stitching the mirror image and the image that has not been mirrored in the human eye image to obtain a stitched image;

The step of using a preset eyelid feature point detection model to detect the eyelid feature points of the upper and lower eyelids of the human eye from the human eye image includes:

Using a preset eyelid feature point detection model, from the stitched image, the eyelid feature points of the upper and lower eyelids of the human eye in the mirror image, and the upper and lower eyelids of the human eye in the image without mirror processing are detected. Eyelid feature points;

Mirror processing is performed on the eyelid feature points of the upper and lower eyelids of the human eye in the mirror image to obtain the eyelid feature points after mirroring, so as to obtain the eyelid feature points of the upper and lower eyelids of the human eye in the human eye image.
The method according to claim 3, wherein before the step of performing mirror image processing on the first image to obtain a mirror image, the method further comprises:

The left-eye image and the right-eye image are subjected to normalization processing to obtain a corrected left-eye image and a normalized right-eye image, wherein the normalization processing is: making the two eye corner points in the image to be processed The line is parallel to the coordinate axis of the preset image coordinate system, and the image to be processed is the left-eye image and the right-eye image;

The step of performing mirror image processing on the first image to obtain a mirror image includes:

Perform mirror image processing on the converted first image to obtain a mirror image.
The method of claim 1, wherein the method of constructing a target three-dimensional face model corresponding to the target person is based on a preset three-dimensional face model, the face feature points, and the eyelid feature points The steps include:

From the preset three-dimensional face model, the spatial position information of the spatial point at the preset face position is determined as the spatial position information of the spatial point to be processed, wherein the spatial point to be processed and the image feature point exist Corresponding relationship, the image feature points are: the facial feature points and the eyelid feature points;

Using the weak perspective projection matrix and the spatial position information of each spatial point to be processed to determine the projection position information of the projection point of each spatial point to be processed in the face image;

Based on the projection position information of the projection point of each spatial point to be processed and the imaging position information of the image feature point corresponding to each spatial point to be processed, a target three-dimensional face model corresponding to the target person is constructed.
The method according to any one of claims 1 to 5, wherein the determination is based on the three-dimensional position information of the upper and lower eyelids of the human eye in the target three-dimensional face model. The steps of the current opening and closing length of the space are implemented in either of the following two implementations:

The first way to achieve:

From the target three-dimensional face model, detecting the three-dimensional position information of the first center point of the upper eyelid and the three-dimensional position information of the second center point of the lower eyelid of the human eye;

Based on the three-dimensional position information of the first center point and the three-dimensional position information of the second center point, determine the distance between the first center point and the second center point as the upper and lower eyelids of the human eye The current opening and closing length between;

The second way to achieve:

From the target three-dimensional face model, determine the three-dimensional position information of the human eye spatial point corresponding to the human eye;

Performing spherical fitting based on the three-dimensional position information of the human eye space point to obtain a sphere model representing the human eye;

Detecting, from the target three-dimensional face model, the three-dimensional position information of the first center point of the upper eyelid and the three-dimensional position information of the second center point of the lower eyelid of the human eye;

Based on the three-dimensional position information of the first center point and the three-dimensional position information of the second center point, from the sphere model, determine the three-dimensional position information and the three-dimensional position information of the first spherical point corresponding to the first center point. The three-dimensional position information of the second spherical point corresponding to the second center point;

Based on the three-dimensional position information of the first spherical point and the three-dimensional position information of the second spherical point, determine the distance between the first spherical point and the second spherical point as the distance between the upper and lower eyelids of the human eye The current opening and closing length.
The method according to any one of claims 1 to 5, wherein the step of determining the current fatigue degree of the target person based on the current opening and closing length comprises:

Obtaining the historical opening and closing length of the human eye of the target person determined within a preset time period;

Based on the current opening and closing length and the historical opening and closing length, the current fatigue degree of the target person is determined.
8. The method of claim 7, wherein the step of determining the current fatigue level of the target person based on the current opening and closing length and the historical opening and closing length comprises:

Comparing each opening and closing length with a preset length threshold to obtain a comparison result, where the opening and closing length includes the current opening and closing length and the historical opening and closing length;

Statistically obtain the first result quantity representing the comparison result whose opening and closing length is less than the preset length threshold;

Based on the current opening and closing length and the total number of historical opening and closing lengths and the first result number, determine the current fatigue level of the target person.
8. The method according to any one of claims 1-8, wherein after the step of determining the current fatigue level of the target person based on the current opening and closing length, the method further comprises:

If it is determined that the current fatigue degree of the target person is fatigue, an alarm message is generated and sent.
A fatigue detection device based on human eye state recognition, characterized in that it comprises:

The first obtaining module is configured to obtain a face image containing the face of the target person collected by the image capture device for shooting the target person;

The detection module is configured to detect the face image, and detect the facial feature points of the face in the face image and the eyelid feature points of the upper and lower eyelids of the human eyes in the face, wherein the facial feature points Is: used to characterize the feature points of each part of the face in the face image;

The construction module is configured to construct a target three-dimensional face model corresponding to the target person based on a preset three-dimensional face model, the facial feature points and the eyelid feature points, wherein the target three-dimensional face model includes : The upper and lower eyelids of the human eye constructed based on the eyelid feature points;

A first determining module configured to determine the current opening and closing length between the upper and lower eyelids of the human eye based on the three-dimensional position information of the upper and lower eyelids of the human eye in the target three-dimensional face model;

The second determining module is configured to determine the current fatigue degree of the target person based on the current opening and closing length.