WO2022134337A1 - Face occlusion detection method and system, device, and storage medium - Google Patents

Face occlusion detection method and system, device, and storage medium

Info

Publication number
WO2022134337A1
Authority
WO
WIPO (PCT)
Prior art keywords
face
image
key point
organ
occlusion detection
Prior art date
Application number
PCT/CN2021/082571
Other languages
French (fr)
Chinese (zh)
Inventor
陈丹
陆进
陈斌
刘玉宇
Original Assignee
平安科技(深圳)有限公司
Priority date
2020-12-21 (Chinese application No. 202011520261.8, per the priority claim in the description)
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2022134337A1

Classifications

    • G06V 40/171: Local features and components; facial parts; occluding parts, e.g. glasses; geometrical relationships (under G06V 40/16, human faces; G06V 40/168, feature extraction and face representation)
    • G06V 40/161: Human faces; detection; localisation; normalisation
    • G06F 18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06T 7/11: Image analysis; region-based segmentation
    • G06T 7/136: Image analysis; segmentation or edge detection involving thresholding

Definitions

  • The present application relates to the technical field of artificial intelligence, and in particular to a face occlusion detection method, system, device and storage medium.
  • With the development of artificial intelligence technology, face recognition and liveness detection play a vital role in fields such as building access control and financial authentication, and occlusion in a face image directly affects the results of face recognition and liveness detection. Face occlusion detection is therefore an indispensable link in a face recognition system.
  • Existing face occlusion detection solutions fall mainly into two directions. The first uses traditional methods that distinguish skin color and texture information by hue and texture to judge whether a face image is occluded. The second trains a deep neural network to judge whether the face is occluded: either a single-task classification method determines whether the entire face is occluded, or a multi-task method fused with a detection model simultaneously detects the facial organs and the types and positions of occluders to judge the occlusion state of the face.
  • However, regarding the above approaches, the inventor found that traditional methods are affected by the complexity of facial features and the diversity of occluders, lack universality and generalize poorly; single-task classification methods cannot be precise to specific organs, so their deployment scenarios are limited; and when a multi-task method directly locates occluders while simultaneously detecting organs, the task is difficult and accuracy is hard to guarantee.
  • The main purpose of the present application is to provide a face occlusion detection method, system, computer device and computer-readable storage medium, which are used to overcome the defects of the prior art that traditional methods lack universality and generalize poorly, that single-task classification methods cannot be precise to specific organs and have limited deployment scenarios, and that multi-task methods make organ detection difficult with accuracy hard to guarantee.
  • A first aspect of the present application provides a face occlusion detection method, and the face occlusion detection method includes:
  • acquiring a face image to be detected;
  • performing key point detection on the face image to obtain key point information of the face organs in the face image;
  • performing face organ block segmentation on the face image according to the key point information to obtain a corresponding face organ block image;
  • preprocessing the face organ block image, inputting the preprocessed face organ block image into a pre-trained face occlusion detection model to perform face occlusion detection, and outputting a corresponding mask image;
  • performing binarization processing on the mask image to obtain a binarized target mask image;
  • calculating the occlusion ratio of each face organ according to the pixel values of the target mask image.
  • A second aspect of the present application provides a face occlusion detection device, the face occlusion detection device including: a memory, a processor, and a face occlusion detection program stored in the memory and executable on the processor, where the processor implements the following steps when executing the face occlusion detection program:
  • acquiring a face image to be detected;
  • performing key point detection on the face image to obtain key point information of the face organs in the face image;
  • performing face organ block segmentation on the face image according to the key point information to obtain a corresponding face organ block image;
  • preprocessing the face organ block image, inputting the preprocessed face organ block image into a pre-trained face occlusion detection model to perform face occlusion detection, and outputting a corresponding mask image;
  • performing binarization processing on the mask image to obtain a binarized target mask image;
  • calculating the occlusion ratio of each face organ according to the pixel values of the target mask image.
  • A third aspect of the present application provides a storage medium, namely a computer-readable storage medium, in which computer instructions are stored; when the computer instructions are run on a computer, the computer is caused to perform the following steps:
  • acquiring a face image to be detected;
  • performing key point detection on the face image to obtain key point information of the face organs in the face image;
  • performing face organ block segmentation on the face image according to the key point information to obtain a corresponding face organ block image;
  • preprocessing the face organ block image, inputting the preprocessed face organ block image into a pre-trained face occlusion detection model to perform face occlusion detection, and outputting a corresponding mask image;
  • performing binarization processing on the mask image to obtain a binarized target mask image;
  • calculating the occlusion ratio of each face organ according to the pixel values of the target mask image.
  • a fourth aspect of the present application provides a face occlusion detection system, where the face occlusion detection system includes:
  • the acquisition module is used to acquire the face image to be detected
  • a first detection module configured to perform key point detection on the face image to obtain key point information of the face organs in the face image
  • a segmentation module configured to perform facial organ block segmentation on the face image according to the key point information to obtain a corresponding face organ block image
  • a second detection module, configured to preprocess the face organ block image, input the preprocessed face organ block image into a pre-trained face occlusion detection model to perform face occlusion detection, and output the corresponding mask image;
  • a processing module configured to perform binarization processing on the mask image to obtain the binarized target mask image
  • the calculation module is configured to calculate the occlusion ratio of each face organ according to the pixel value of the target mask image.
  • According to the face occlusion detection method, system, computer device and computer-readable storage medium provided by the present application, a face image to be detected is acquired; key point detection is performed on the face image to obtain key point information of the face organs in the face image; face organ block segmentation is performed on the face image according to the key point information to obtain corresponding face organ block images; the face organ block images are preprocessed, the preprocessed face organ block images are input into a pre-trained face occlusion detection model for face occlusion detection, and corresponding mask images are output; the mask images are binarized to obtain target mask images; and the occlusion ratio of each face organ is calculated according to the pixel values of the target mask images.
  • By performing pixel-level semantic segmentation with the face organs treated as blocks, the specific occlusion position of each organ and the occlusion percentage of each face organ can be accurately calculated, which not only reduces the complexity of face occlusion detection but also divides the face precisely into individual organs, greatly improving the accuracy of face occlusion detection.
  • FIG. 1 is a schematic flowchart of steps of a method for detecting face occlusion provided by the present application
  • FIG. 2 is a schematic flow chart of step refinement of step S200 in FIG. 1 provided by the present application;
  • FIG. 3 is a schematic grayscale image of facial organ block segmentation provided by the application.
  • FIG. 4 is a schematic flowchart of step refinement of step S300 in FIG. 1 provided by the present application;
  • FIG. 5 is a schematic effect diagram of facial organ block segmentation provided by the present application;
  • FIG. 6 is a schematic flowchart of step refinement of step S400 in FIG. 1 provided by the present application;
  • FIG. 7 is a schematic flowchart of step refinement of step S500 in FIG. 1 provided by the present application.
  • FIG. 8 is a schematic flow chart of step refinement of the training method for a face occlusion detection model in the face occlusion detection method provided by the present application;
  • FIG. 9 is a schematic flowchart of step refinement of step S600 in FIG. 1 provided by the present application.
  • FIG. 10 is a schematic diagram of an optional program module of the face occlusion detection system provided by the application.
  • FIG. 11 is a schematic diagram of an optional hardware architecture of the computer device provided by the present application.
  • The embodiments of the present application provide a face occlusion detection method, system, device, and storage medium.
  • By performing pixel-level semantic segmentation with the face organs treated as blocks, the specific occlusion position of each organ and the occlusion percentage of each face organ can be accurately calculated, which not only reduces the complexity of face occlusion detection but also divides the face precisely into individual organs, greatly improving the accuracy of face occlusion detection.
  • FIG. 1 a schematic flowchart of steps of a face occlusion detection method provided by an embodiment of the present application is shown. It can be understood that the flowcharts in the embodiments of the present application are not used to limit the order of executing steps.
  • The following takes a computer device as the execution subject for exemplary description; the computer device may include mobile terminals such as smart phones, tablet personal computers and laptop computers, as well as fixed terminals such as desktop computers. The details are as follows:
  • Step S100 acquiring a face image to be detected.
  • Specifically, the face image to be detected by the model can be obtained by photographing a face with a camera device, capturing a face with a video monitoring device, or collecting images with a web crawler.
  • Step S200 performing key point detection on the face image to obtain key point information of the face organs in the face image.
  • key point detection is performed by inputting the face image to be detected into a preset key point model, and corresponding key point information is obtained, thereby determining the key point information of the face organs.
  • Specifically, step S200 may include:
  • Step S201 inputting the face image into a preset key point model to perform the key point detection, obtaining a preset number of key points on the two-dimensional plane of the face image, where the key point information includes the coordinates of the key points and the serial numbers corresponding to the key points;
  • Step S202 according to the preset number of key points and the position of each face organ in the face image, determining the key point information of each face organ, where the face organs include the forehead, left eyebrow, right eyebrow, left eye, right eye, nose and mouth.
  • Specifically, the face image to be detected is input into a preset key point model for key point detection and calibration: 68 key points are marked on the face image to be detected, the serial number corresponding to each key point is also marked, and the corresponding key point information is obtained to determine the coordinate point information of the corresponding face organs.
  • FIG. 3 is a schematic grayscale image of facial organ block (Patch) segmentation.
  • the serial numbers corresponding to the coordinates of the key points are 36, 37, 38, 39, 40, and 41, respectively, and the area enclosed by the coordinates of the key points represents the left eye.
  • the serial numbers corresponding to the key point coordinates of the left eyebrow are 17, 18, 19, 20, and 21, respectively, and the serial numbers corresponding to the key point coordinates of the right eyebrow are 22, 23, 24, 25, and 26.
  • The horizontal line through the two key points with serial numbers 19 and 24 is used as the lower boundary of the forehead; extending upward by one fifth of the face frame height gives the upper boundary of the forehead; and the vertical lines through the key points with serial numbers 17 and 26 serve as the left and right boundaries of the forehead, together enclosing a rectangular area that is taken as the forehead.
  • The height of the face frame is the distance between the highest point among the key point coordinates of the eyebrows and the lowest point among the key point coordinates of the face contour.
  • the human cheek can also be divided by the 68 key point information.
  • The serial numbers corresponding to the key point coordinates are 1, 2, 3, 4, 5, 6, 7, 31, 40, 41 and 48, and the area enclosed by these 11 key points is the left cheek.
  • the face contour can also be divided by the 68 key point information.
  • The serial numbers corresponding to the key point coordinates are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 and 26, and the area surrounded by these 27 key points is the face contour.
  • the embodiment of the present application obtains the key point information of the face image by performing key point detection on the face image, thereby accurately obtaining the corresponding face organs.
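  • As an illustrative aside (a sketch under stated assumptions, not part of the patent text), the 68-point detection and organ grouping described above can be reproduced with dlib's publicly available 68-point shape predictor. The predictor file name and the helper function below are assumptions; the organ index ranges follow the serial numbers cited in this embodiment.

```python
# Hypothetical sketch of 68-point landmark detection with dlib.
# The patent's "preset key point model" is unspecified; dlib's predictor
# is used here only because it shares the 68-point layout.
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

# Serial-number ranges for each organ, as enumerated in this embodiment.
ORGAN_POINTS = {
    "left_eye":      list(range(36, 42)),   # serial numbers 36-41
    "left_eyebrow":  list(range(17, 22)),   # 17-21
    "right_eyebrow": list(range(22, 27)),   # 22-26
    "left_cheek":    [1, 2, 3, 4, 5, 6, 7, 31, 40, 41, 48],
    "face_contour":  list(range(0, 27)),    # 0-26
}

def detect_keypoints(image):
    """Return {organ: [(x, y), ...]} for the first detected face."""
    faces = detector(image, 1)
    if not faces:
        return {}
    shape = predictor(image, faces[0])
    pts = [(shape.part(i).x, shape.part(i).y) for i in range(68)]
    return {organ: [pts[i] for i in idx] for organ, idx in ORGAN_POINTS.items()}
```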
  • Step S300 according to the key point information, perform face organ Patch segmentation on the face image to obtain a corresponding face organ Patch image.
  • Specifically, Patch (block) segmentation is performed on the face image, and the minimum circumscribed rectangular area containing each face organ is taken to obtain the corresponding face organ Patch image.
  • the step S300 may include:
  • Step S301 according to the key point information and a preset division rule, determine the minimum circumscribed rectangle corresponding to each face organ.
  • Step S302 according to the minimum circumscribed rectangle corresponding to each face organ, perform Patch segmentation on the face image to obtain a face organ Patch image corresponding to each face organ.
  • Specifically, a set of division rules is designed according to the key point information. The rules are as follows: the specific position of a face organ is determined from the area enclosed by the key point coordinates and the serial numbers corresponding to the key points. Since polygon computation is relatively redundant and contributes little to discriminating occlusion, the minimum circumscribed rectangle of the face organ, determined from its uppermost, lowermost, leftmost and rightmost coordinate points, is extracted as the face organ Patch image for calculation.
  • the serial numbers corresponding to the key point coordinates are 36, 37, 38, 39, 40, 41, respectively, and the area surrounded by the key point coordinates represents the left eye.
  • the human cheek can also be divided by the 68 key point information.
  • The serial numbers corresponding to the key point coordinates are 1, 2, 3, 4, 5, 6, 7, 31, 40, 41 and 48, and the area enclosed by these 11 key point coordinates represents the left cheek.
  • the smallest rectangle that can contain the left cheek is taken as the left cheek Patch.
  • the face contour can also be divided by the 68 key point information, and the smallest rectangle is taken as the face contour Patch through the serial numbers 0, 8, 16, 19, and 24 corresponding to the key point coordinates.
  • In this way, the computational complexity is reduced compared with the traditional polygon calculation, and the occlusion ratio of each face organ is more convenient to compute, as sketched below.
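  • A minimal sketch of the minimum-circumscribed-rectangle rule described above, assuming keypoints in (x, y) form such as those produced by the detector sketched earlier; the function name is illustrative only.

```python
import numpy as np

def organ_patch(image, organ_points):
    """Cut the minimal axis-aligned rectangle enclosing one organ's keypoints.

    organ_points: list of (x, y) tuples for one face organ.
    Returns the Patch image and its upper-left corner (x1, y1).
    """
    pts = np.asarray(organ_points)
    x1, y1 = pts.min(axis=0)   # leftmost and uppermost coordinates
    x2, y2 = pts.max(axis=0)   # rightmost and lowermost coordinates
    return image[y1:y2 + 1, x1:x2 + 1], (int(x1), int(y1))
```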
  • Step S400 preprocessing the face organ Patch image, inputting the preprocessed face organ Patch image into a pre-trained face occlusion detection model to perform face occlusion detection, and outputting the corresponding mask (Mask) image.
  • FIG. 5 is a schematic effect diagram of face organ Patch segmentation. The left side of FIG. 5 is the preprocessed face input image, and the right side is the mask image output by the face occlusion detection model, in which the black part is the background and the white part is the face area.
  • the step S400 may include:
  • Step S401 Fill the face organ Patch image and adjust the size of the filled image to obtain a square Patch image of the corresponding size.
  • Step S402 Input the square patch image into the pre-trained face occlusion detection model to perform face occlusion detection to obtain the corresponding mask image.
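  • A sketch of the preprocessing of step S401, assuming OpenCV; the zero-padding and the 128*128 target size follow the description of the second detection module later in this text, while the centering of the padding is an assumption.

```python
import cv2

def preprocess_patch(patch, size=128):
    """Zero-pad a face organ Patch to a square, then resize it to size*size."""
    h, w = patch.shape[:2]
    side = max(h, w)
    top = (side - h) // 2
    left = (side - w) // 2
    squared = cv2.copyMakeBorder(
        patch, top, side - h - top, left, side - w - left,
        borderType=cv2.BORDER_CONSTANT, value=0)  # padding with zeros
    return cv2.resize(squared, (size, size))
```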
  • Table 1 is the network structure table of the face occlusion detection model.
  • Specifically, the square Patch image first passes through the left half of the face occlusion detection model, namely the first to fourth layers, for feature extraction; this belongs to the downsampling stage. It then passes through the right half of the model, namely layers 5, 7 and 10, which belong to the upsampling stage; this stage involves the fusion of feature maps of different scales, and the fusion method is as shown in Table 1. The Concat function is used to stack feature maps along the depth (channel) dimension. The last layer is a filter with a size of 1*1*128 and a depth of 1, and the face occlusion detection model outputs a mask image with a size of 128*128.
  • In this embodiment, the face organ Patch image is preprocessed and input into the face occlusion detection model, and the mask image of the face organ is then obtained through operations such as feature extraction, image fusion and convolution, so that face organs and skin are accurately separated from occluders, making the calculation of the occlusion ratio of the face organs more accurate.
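  • Table 1 itself is not reproduced in this text; as a non-authoritative sketch of the encoder-decoder shape it describes (a downsampling half, an upsampling half with Concat fusion of feature maps of different scales, and a final 1*1 filter of depth 1 producing a 128*128 mask), a PyTorch version might look as follows. All intermediate channel counts are assumptions.

```python
import torch
import torch.nn as nn

def block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True))

class OcclusionNet(nn.Module):
    """Schematic encoder-decoder: the left half downsamples for feature
    extraction, the right half upsamples and fuses feature maps of
    different scales via Concat, and a 1*1 filter of depth 1 outputs
    a 128*128 mask."""
    def __init__(self):
        super().__init__()
        self.down1, self.down2 = block(3, 16), block(16, 32)
        self.down3, self.down4 = block(32, 64), block(64, 128)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.dec3 = block(128 + 64, 64)
        self.dec2 = block(64 + 32, 32)
        self.dec1 = block(32 + 16, 128)
        self.head = nn.Conv2d(128, 1, kernel_size=1)  # 1*1 filter, depth 1

    def forward(self, x):                    # x: (N, 3, 128, 128)
        f1 = self.down1(x)                   # 128*128 feature map
        f2 = self.down2(self.pool(f1))       # 64*64
        f3 = self.down3(self.pool(f2))       # 32*32
        f4 = self.down4(self.pool(f3))       # 16*16
        u3 = self.dec3(torch.cat([self.up(f4), f3], dim=1))  # Concat fusion
        u2 = self.dec2(torch.cat([self.up(u3), f2], dim=1))
        u1 = self.dec1(torch.cat([self.up(u2), f1], dim=1))
        return torch.sigmoid(self.head(u1))  # (N, 1, 128, 128) mask
```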
  • Step S500 performing binarization processing on the mask image to obtain the binarized target mask image.
  • the mask image is first subjected to grayscale processing to obtain a corresponding grayscale image, and then the obtained grayscale image is subjected to binarization processing according to a preset pixel threshold to obtain the binarized target mask image.
  • Specifically, step S500 may include:
  • Step S501 performing grayscale processing on the mask image to obtain a grayscale image;
  • Step S502 comparing the pixel value of each pixel of the grayscale image with a preset pixel threshold;
  • Step S503 when the pixel value of a pixel is higher than the preset pixel threshold, setting the pixel value of that pixel to a preset pixel value;
  • Step S504 completing the binarization processing of the mask image to obtain the binarized target mask image.
  • Specifically, each pixel of the mask image output by the model lies between 0 and 1, and the preset pixel threshold is set to 0.75: pixels larger than the preset pixel threshold are set to 1 (representing an occluded area), and other pixels are set to 0 (representing a non-occluded area), yielding the binarized target mask image.
  • the preset pixel threshold can be freely set according to the actual situation, which is not limited here.
  • the binarized target mask image is obtained by performing a binarization process on the mask image, so that the target face region in the image is distinguished from the background, and the result of the model is more accurate.
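  • A one-line sketch of the binarization under the parameters given above (model outputs in [0, 1], preset pixel threshold 0.75); numpy is an assumption, and the grayscale step is omitted because the sketched model already emits a single-channel mask.

```python
import numpy as np

def binarize_mask(mask, threshold=0.75):
    """Pixels above the preset threshold become 1 (occluded area),
    all others become 0 (non-occluded area)."""
    return (mask > threshold).astype(np.uint8)
```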
  • FIG. 8 is an exemplary flowchart of the steps of the training method of the face occlusion detection model.
  • the training method of the face occlusion detection model includes:
  • Step S511 obtaining face training image samples and occluder samples
  • Step S512 performing key point detection on the face training image sample to obtain key point information of the face organs in the face training image sample;
  • Step S513 according to the key point information, perform face organ Patch segmentation on the face training image sample to obtain a corresponding face organ Patch image;
  • Step S514 randomly adding the occluder samples to preset positions of the face organ Patch images, so as to replace the pixels at the preset positions of the face organ Patch images with the pixels of the occluder samples, obtaining face occluder training image samples;
  • Step S515 preprocessing the face occluder training image samples, and inputting the preprocessed samples into the face occlusion detection model to complete the training of the face occlusion detection model.
  • Specifically, key point detection is performed on the face training image samples through a key point model to obtain key point information of the face organs in the face training image samples; then, according to the key point information, the face training image samples are subjected to face organ Patch segmentation to obtain corresponding face organ Patch images; and the occluder samples are randomly added to preset positions of the face organ Patch images, so that the pixels at the preset positions are replaced with the pixels of the occluder samples, yielding face occluder training image samples in which the pixel values of the regions where occluder samples were added are those of the occluder samples.
  • The occluder samples are collected by web crawlers or captured and extracted manually, and include fingers, pens, fans, cups, masks, cosmetics, microphones, and the like.
  • In this embodiment, the coordinates on the two-dimensional plane of the region where an occluder sample is added to a face training image sample are [x1:x2, y1:y2], where x1 and x2 are the abscissas and y1 and y2 the ordinates bounding the face organ region in the mask image. An all-zero matrix L with a size of 128*128 is first initialized, and then all the pixels in the [x1:x2, y1:y2] region are modified to 1.
  • the modified matrix is the supervision label used in training.
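  • An illustrative sketch of this sample-synthesis step: an occluder image is pasted into the [x1:x2, y1:y2] region of the organ Patch, and the supervision label is the 128*128 all-zero matrix L with that region set to 1. All names are hypothetical, and note that numpy indexes rows (y) first.

```python
import numpy as np

def synthesize_sample(patch, occluder, x1, y1):
    """Replace a preset region of a face organ Patch with the occluder's
    pixels and build the corresponding supervision label.

    Assumes the Patch has already been preprocessed to 128*128 so that
    the label matrix aligns with it.
    """
    h, w = occluder.shape[:2]
    x2, y2 = x1 + w, y1 + h
    sample = patch.copy()
    sample[y1:y2, x1:x2] = occluder                 # pixel replacement

    label = np.zeros((128, 128), dtype=np.float32)  # all-zero matrix L
    label[y1:y2, x1:x2] = 1.0                       # occluded region marked 1
    return sample, label
```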
  • The face occlusion detection model is trained with the segmentation loss function IOU Loss, so that the pixel values predicted on the face organ Patch image approach the pixel values at the corresponding positions of the supervision label L: the pixel values of regions containing an occluder approach 1, and the pixel values of other regions approach 0. The gradient descent methods commonly used in deep learning are then applied until the face occlusion detection model converges, that is, until the Loss value no longer decreases. When the pixel values of the mask image output by the face occlusion detection model are sufficiently close to the pixel values of the supervision label, training is complete.
  • The loss function is the commonly used segmentation loss IOU Loss, calculated from the mask image and the supervision label L (the modified all-zero matrix). A sketch of one common formulation follows.
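  • The patent names the loss only as IOU Loss; one common soft-IoU formulation consistent with that description, sketched in PyTorch as an assumption:

```python
import torch

def iou_loss(pred, target, eps=1e-6):
    """Soft IoU segmentation loss between the predicted mask and the
    supervision label; minimized by gradient descent until convergence."""
    inter = (pred * target).sum(dim=(-2, -1))
    union = pred.sum(dim=(-2, -1)) + target.sum(dim=(-2, -1)) - inter
    return 1.0 - ((inter + eps) / (union + eps)).mean()
```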
  • In this way, various types of occluders are randomly added to random face regions of the face training image samples, and a large number of face occluder training image samples are then input into the face occlusion detection model for training, making the face occlusion detection model increasingly sensitive to occluders and thereby able to detect arbitrary occluders.
  • Step S600 Calculate the occlusion ratio of each face organ according to the pixel value of the target mask image.
  • the pixel value of the target mask image is compared with the preset pixel threshold, and all points higher than the preset pixel threshold are counted, and then the occlusion ratio of each face organ is calculated.
  • Specifically, step S600 may include:
  • Step S601 according to the pixel values of the target mask image, counting the number of pixels equal to the preset pixel value within each face organ region to obtain the total number of occluded pixels;
  • Step S602 calculating the ratio of the total number of occluded pixels to the total number of pixels of the corresponding face organ to obtain the occlusion ratio of each face organ.
  • Specifically, for each face organ Patch image, the proportion of pixels in the corresponding region of the mask image that exceed the preset pixel threshold is calculated; this proportion is the occlusion percentage of the face organ.
  • The formula for calculating the occlusion percentage of an organ is as follows:
  • $ratio = \dfrac{1}{h \times w} \sum_{(i,j) \in R} \delta_{ij}$
  • where R is the h*w rectangular region of the face organ in the mask image, x1 and y1 are the coordinates of its upper-left corner, h and w are the height and width of the face organ in the mask image, and δij denotes the pixel value of the binarized mask image at position (i, j): δij takes the value 1 if the pixel at coordinate (i, j) in the mask image is 1, and 0 otherwise.
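  • A sketch of this calculation, taking the binarized target mask as the matrix of δ values; variable names mirror the formula above.

```python
import numpy as np

def occlusion_ratio(target_mask, x1, y1, h, w):
    """Fraction of occluded (value-1) pixels inside the h*w region of a
    face organ whose upper-left corner is (x1, y1) in the target mask."""
    region = target_mask[y1:y1 + h, x1:x1 + w]  # the delta_ij values
    return float(region.sum()) / (h * w)
```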
  • In this embodiment, key point detection is performed on the face image to obtain the key point information of the corresponding face organs; Patch segmentation is then performed on the face organs to obtain the corresponding face organ Patch images, which are preprocessed and input into the pre-trained face occlusion detection model for face occlusion detection to obtain the corresponding mask images; finally, the corresponding occlusion ratio of each facial organ is obtained by calculation. This not only reduces the complexity of face occlusion detection, but also divides the face precisely into individual organs, greatly improving the accuracy of face occlusion detection.
  • FIG. 10 a schematic diagram of program modules of a face occlusion detection system 700 according to an embodiment of the present application is shown.
  • the face occlusion detection system 700 can be applied to computer equipment, and the computer equipment can be a mobile phone, a tablet personal computer, a laptop computer, or other equipment with a data transmission function.
  • The face occlusion detection system 700 may include or be divided into one or more program modules, and the one or more program modules are stored in a storage medium and executed by one or more processors to complete the embodiments of the present application and implement the above-mentioned face occlusion detection system 700.
  • the program modules referred to in the embodiments of the present application refer to a series of computer program instruction segments capable of completing specific functions, and are more suitable for describing the execution process of the face occlusion detection system 700 in the storage medium than the programs themselves.
  • the face occlusion detection system 700 includes an acquisition module 701, a first detection module 702, a segmentation module 703, a second detection module 704, a processing module 705 and a calculation module 706. The following description will specifically introduce the functions of each program module in the embodiments of the present application:
  • the acquiring module 701 is used for acquiring the face image to be detected.
  • Specifically, the acquisition module 701 acquires the face image to be detected by the model by photographing a face with a camera device, capturing a face with a video monitoring device, or collecting images with a web crawler.
  • the first detection module 702 is configured to perform key point detection on the face image to obtain key point information of the face organs in the face image.
  • the first detection module 702 performs key point detection by inputting the face image to be detected into a preset key point model to obtain corresponding key point information, thereby determining the key points of the face organs information.
  • the first detection module 702 is specifically configured to:
  • Specifically, the face image to be detected is input into a preset key point model for key point detection and calibration: 68 key points are marked on the face image to be detected, the serial number corresponding to each key point is also marked, and the corresponding key point information is obtained to determine the coordinate point information of the corresponding face organs.
  • FIG. 3 is a schematic grayscale image of a face organ Patch segmentation.
  • the serial numbers corresponding to the coordinates of the key points are 36, 37, 38, 39, 40, and 41, respectively, and the area enclosed by the coordinates of the key points represents the left eye.
  • the serial numbers corresponding to the key point coordinates of the left eyebrow are 17, 18, 19, 20, and 21, respectively, and the serial numbers corresponding to the key point coordinates of the right eyebrow are 22, 23, 24, 25, and 26.
  • The horizontal line through the two key points with serial numbers 19 and 24 is used as the lower boundary of the forehead; extending upward by one fifth of the face frame height gives the upper boundary of the forehead; and the vertical lines through the key points with serial numbers 17 and 26 serve as the left and right boundaries of the forehead, together enclosing a rectangular area that is taken as the forehead.
  • The height of the face frame is the distance between the highest point among the key point coordinates of the eyebrows and the lowest point among the key point coordinates of the face contour.
  • the human cheek can also be divided by the 68 key point information.
  • The serial numbers corresponding to the key point coordinates are 1, 2, 3, 4, 5, 6, 7, 31, 40, 41 and 48, and the area enclosed by these 11 key points is the left cheek.
  • the face contour can also be divided by the 68 key point information.
  • The serial numbers corresponding to the key point coordinates are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 and 26, and the area surrounded by these 27 key points is the face contour.
  • the key point information of the face image is obtained by performing key point detection on the face image, thereby accurately obtaining the corresponding face organs.
  • the segmentation module 703 is configured to perform face organ Patch segmentation on the face image according to the key point information to obtain a corresponding face organ Patch image.
  • Specifically, the segmentation module 703 performs Patch segmentation on the face image according to the key point information detected by the key point model and the preset division rules, taking the smallest circumscribed rectangular area containing each face organ to obtain the corresponding face organ Patch image.
  • the segmentation module 703 is specifically used for:
  • Patch segmentation is performed on the face image according to the minimum circumscribed rectangle corresponding to each face organ to obtain a face organ Patch image corresponding to each face organ.
  • Specifically, the segmentation module 703 designs a set of division rules according to the key point information. The rules are as follows: the specific position of a face organ is determined from the area enclosed by the key point coordinates and the serial numbers corresponding to the key points. Since polygon computation is relatively redundant and contributes little to discriminating occlusion, the minimum circumscribed rectangle of the face organ, determined from its uppermost, lowermost, leftmost and rightmost coordinate points, is extracted as the face organ Patch image for calculation.
  • the serial numbers corresponding to the key point coordinates are 36, 37, 38, 39, 40, 41, respectively, and the area surrounded by the key point coordinates represents the left eye.
  • the human cheek can also be divided by the 68 key point information.
  • The serial numbers corresponding to the key point coordinates are 1, 2, 3, 4, 5, 6, 7, 31, 40, 41 and 48, and the area enclosed by these 11 key point coordinates represents the left cheek.
  • the smallest rectangle that can contain the left cheek is taken as the left cheek Patch.
  • the face contour can also be divided by the 68 key point information, and the smallest rectangle is taken as the face contour Patch through the serial numbers 0, 8, 16, 19, and 24 corresponding to the key point coordinates.
  • the complexity of calculation is reduced compared with the traditional polygon calculation, and the calculation of the occlusion ratio of the face organ is more convenient.
  • the second detection module 704 is configured to preprocess the face organ Patch image, and input the preprocessed face organ Patch image into a pre-trained face occlusion detection model to perform face occlusion detection, And output the corresponding mask image.
  • Specifically, the second detection module 704 first preprocesses the segmented face organ Patch images to obtain images usable by the face occlusion detection model; after the face occlusion detection model has been pre-trained, the preprocessed images are input into the face occlusion detection model for face occlusion detection, and the corresponding mask images are output.
  • FIG. 5 is a schematic effect diagram of face organ Patch segmentation. The left side of FIG. 5 is the preprocessed face input image, and the right side is the mask image output by the face occlusion detection model, in which the black part is the background and the white part is the face area.
  • the second detection module 704 is specifically configured to:
  • The face organ Patch image is padded, and the padded image is resized to obtain a square Patch image of the corresponding size;
  • Specifically, the second detection module 704 pads the face organ Patch image area with zeros (Padding) into a square, and then resizes the padded area to 128*128, obtaining a 128*128 square Patch image.
  • Table 1 is the network structure table of the face occlusion detection model.
  • Specifically, the square Patch image first passes through the left half of the face occlusion detection model, namely the first to fourth layers, for feature extraction; this belongs to the downsampling stage. It then passes through the right half of the model, namely layers 5, 7 and 10, which belong to the upsampling stage; this stage involves the fusion of feature maps of different scales, and the fusion method is as shown in Table 1. The Concat function is used to stack feature maps along the depth (channel) dimension. The last layer is a filter with a size of 1*1*128 and a depth of 1, and the face occlusion detection model outputs a mask image with a size of 128*128.
  • the second detection module 704 preprocesses the face organ Patch image and inputs it into the face occlusion detection model, and then obtains the face organ through operations such as feature extraction, image fusion, and convolution. mask image, so as to accurately distinguish face organs, skin and occluders, and make the calculation of the occlusion ratio of face organs more accurate.
  • the processing module 705 is configured to perform binarization processing on the mask image to obtain the binarized target mask image.
  • Specifically, the processing module 705 first performs grayscale processing on the mask image to obtain a corresponding grayscale image, and then performs binarization processing on the obtained grayscale image according to a preset pixel threshold to obtain the binarized target mask image.
  • The processing module 705 is specifically configured to:
  • perform grayscale processing on the mask image to obtain a grayscale image;
  • compare the pixel value of each pixel of the grayscale image with a preset pixel threshold, and when the pixel value of a pixel is higher than the preset pixel threshold, set the pixel value of that pixel to a preset pixel value;
  • complete the binarization processing of the mask image to obtain the binarized target mask image.
  • Specifically, the processing module 705 performs binarization processing on the mask image, in which each pixel lies between 0 and 1, with the preset pixel threshold set to 0.75: pixels greater than the preset pixel threshold are set to 1 (representing an occluded area), and other pixels are set to 0 (representing a non-occluded area), so as to obtain the binarized target mask image.
  • the preset pixel threshold can be freely set according to the actual situation, which is not limited here.
  • the binarized target mask image is obtained by performing a binarization process on the mask image, so that the target face region in the image is distinguished from the background, and the result of the model is more accurate.
  • In an exemplary embodiment, the face occlusion detection system 700 further includes a training module for the face occlusion detection model, which is configured to:
  • obtain face training image samples and occluder samples;
  • perform key point detection on the face training image samples to obtain key point information of the face organs in the face training image samples;
  • perform face organ Patch segmentation on the face training image samples according to the key point information to obtain corresponding face organ Patch images;
  • randomly add the occluder samples to preset positions of the face organ Patch images to replace the pixels at those positions with the pixels of the occluder samples, obtaining face occluder training image samples;
  • preprocess the face occluder training image samples, and input the preprocessed face organ Patch images into the face occlusion detection model to complete the training of the face occlusion detection model.
  • Specifically, the training module of the face occlusion detection model performs key point detection on the face training image samples through a key point model to obtain key point information of the face organs in the face training image samples; then, according to the key point information, the face training image samples are subjected to face organ Patch segmentation to obtain corresponding face organ Patch images; and the occluder samples are randomly added to preset positions of the face organ Patch images, so that the pixels at the preset positions are replaced with the pixels of the occluder samples, yielding face occluder training image samples in which the pixel values of the regions where occluder samples were added are those of the occluder samples.
  • The occluder samples are collected by web crawlers or captured and extracted manually, and include fingers, pens, fans, cups, masks, cosmetics, microphones, and the like.
  • In this embodiment, the coordinates on the two-dimensional plane of the region where an occluder sample is added to a face training image sample are [x1:x2, y1:y2], where x1 and x2 are the abscissas and y1 and y2 the ordinates bounding the face organ region in the mask image. An all-zero matrix L with a size of 128*128 is first initialized, and then all the pixels in the [x1:x2, y1:y2] region are modified to 1.
  • the modified matrix is the supervision label used in training.
  • The face occlusion detection model is trained with the segmentation loss function IOU Loss, so that the pixel values predicted on the face organ Patch image approach the pixel values at the corresponding positions of the supervision label L: the pixel values of regions containing an occluder approach 1, and the pixel values of other regions approach 0. The gradient descent methods commonly used in deep learning are then applied until the face occlusion detection model converges, that is, until the Loss value no longer decreases. When the pixel values of the mask image output by the face occlusion detection model are sufficiently close to the pixel values of the supervision label, training is complete.
  • The loss function is the commonly used segmentation loss IOU Loss, calculated from the mask image and the supervision label L (the modified all-zero matrix).
  • In this way, the face occlusion detection system 700 randomly adds various types of occluders to random face regions of the face training image samples, and then inputs a large number of face occluder training image samples into the face occlusion detection model for training, making the face occlusion detection model increasingly sensitive to occluders and thereby able to detect arbitrary occluders.
  • the calculation module 706 is configured to calculate the occlusion ratio of each face organ according to the pixel value of the target mask image.
  • the pixel value of the target mask image is compared with the preset pixel threshold, and all points higher than the preset pixel threshold are counted, and then the occlusion ratio of each face organ is calculated.
  • The calculation module 706 is specifically configured to:
  • count, according to the pixel values of the target mask image, the number of pixels equal to the preset pixel value within each face organ region to obtain the total number of occluded pixels;
  • calculate the ratio of the total number of occluded pixels to the total number of pixels of the corresponding face organ to obtain the occlusion ratio of each face organ.
  • Specifically, for each face organ Patch image, the calculation module 706 calculates the proportion of pixels in the corresponding region of the mask image that exceed the preset pixel threshold; this proportion is the occlusion percentage of the face organ.
  • The formula for calculating the occlusion percentage of an organ is as follows:
  • $ratio = \dfrac{1}{h \times w} \sum_{(i,j) \in R} \delta_{ij}$
  • where R is the h*w rectangular region of the face organ in the mask image, x1 and y1 are the coordinates of its upper-left corner, h and w are the height and width of the face organ in the mask image, and δij denotes the pixel value of the binarized mask image at position (i, j): δij takes the value 1 if the pixel at coordinate (i, j) in the mask image is 1, and 0 otherwise.
  • In this embodiment, the face occlusion detection system 700 performs key point detection on the face image to obtain the key point information of the corresponding face organs; Patch segmentation is then performed on the face organs to obtain the corresponding face organ Patch images, which are preprocessed and input into the pre-trained face occlusion detection model for face occlusion detection to obtain the corresponding mask images; finally, the corresponding occlusion ratio of each facial organ is obtained by calculation. This not only reduces the complexity of face occlusion detection, but also divides the face precisely into individual organs, greatly improving the accuracy of face occlusion detection.
  • Referring to FIG. 11, an embodiment of the present application further provides a computer device 800; FIG. 11 shows a schematic diagram of its hardware architecture.
  • The computer device 800 may be a smart phone, tablet computer, notebook computer, desktop computer, rack server, blade server, tower server or cabinet server (including an independent server, or a server cluster composed of multiple servers) capable of executing programs.
  • the computer device 800 is a device that can automatically perform numerical calculation and/or information processing according to pre-set or stored instructions.
  • The computer device 800 at least includes, but is not limited to, a memory 801, a processor 802, and a network interface 803 that can communicate with each other through a device bus. Specifically:
  • The memory 801 includes at least one type of computer-readable storage medium, and the readable storage medium includes a flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, and the like.
  • the memory 801 may be an internal storage unit of the computer device 800 , such as a hard disk or a memory of the computer device 800 .
  • The memory 801 may also be an external storage device of the computer device 800, for example, a pluggable hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the computer device 800.
  • the memory 801 may also include both the internal storage unit of the computer device 800 and its external storage device.
  • The memory 801 is generally used to store the operating system and various application software installed in the computer device 800, such as the program code of the face occlusion detection system 700.
  • the memory 801 can also be used to temporarily store various types of data that have been output or will be output.
  • The processor 802 may, in some embodiments, be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip.
  • the processor 802 is generally used to control the overall operation of the computer device 800 .
  • The processor 802 is configured to run the program code or process the data stored in the memory 801, for example, to run the program code of the face occlusion detection system 700, so as to implement the face occlusion detection method described above.
  • the network interface 803 may include a wireless network interface or a wired network interface, and the network interface 803 is generally used to establish a communication connection between the computer device 800 and other electronic devices.
  • the network interface 803 is used to connect the computer device 800 with an external terminal through a network, and establish a data transmission channel and a communication connection between the computer device 800 and the external terminal.
  • The network may be a wireless or wired network such as an intranet, the Internet, a Global System for Mobile communications (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, or Wi-Fi.
  • FIG. 11 only shows a computer device 800 having components 801-803, but it should be understood that implementation of all shown components is not required, and that more or less components may be implemented instead.
  • the face occlusion detection system 700 stored in the memory 801 may also be divided into one or more program modules, and the one or more program modules are stored in the memory 801 and are configured by One or more processors (the processor 802 in this embodiment) are executed to complete the face occlusion detection method of the present application.
  • Embodiments of the present application further provide a computer-readable storage medium, which may be non-volatile or volatile, such as a flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disc, server, or App application mall, on which a computer program is stored; when the program is executed by a processor, the corresponding function is realized.
  • the computer-readable storage medium of the embodiment of the present application is used to store the face occlusion detection system 700, so as to implement the face occlusion detection method of the present application when executed by the processor.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present application relates to the field of artificial intelligence, and provides a face occlusion detection method and system. The method comprises: acquiring a face image to be detected; performing keypoint detection on the face image to obtain keypoint information of face organs in the face image; according to the keypoint information, performing face organ block segmentation on the face image to obtain corresponding face organ block images; pre-processing the face organ block images, inputting the pre-processed face organ block images into a pre-trained face occlusion detection model to perform face occlusion detection, and outputting corresponding mask images; performing binarization processing on the mask images to obtain the binarized target mask images; and calculating occlusion ratios of various face organs according to pixel values of the target mask images. According to the present application, the occlusion percentages corresponding to various face organs can be accurately calculated, thereby greatly improving the accuracy of face occlusion detection.

Description

Face occlusion detection method, system, device and storage medium

This application claims priority to the Chinese patent application filed with the Chinese Patent Office on December 21, 2020, with application number 202011520261.8 and invention title "Face occlusion detection method, system, device and storage medium", the entire contents of which are incorporated in this application by reference.

Technical Field

The present application relates to the technical field of artificial intelligence, and in particular to a face occlusion detection method, system, device and storage medium.

Background Art
With the development of artificial intelligence technology, face recognition and liveness detection play a vital role in fields such as building access control and financial authentication, and occlusion in a face image directly affects the results of face recognition and liveness detection. Face occlusion detection is therefore an indispensable link in a face recognition system.

Existing face occlusion detection solutions fall mainly into two directions. The first uses traditional methods that distinguish skin color and texture information by hue and texture to judge whether a face image is occluded. The second trains a deep neural network to judge whether the face is occluded: either a single-task classification method determines whether the entire face is occluded, or a multi-task method fused with a detection model simultaneously detects the facial organs and the types and positions of occluders to judge the occlusion state of the face.

However, regarding the above approaches, the inventor found that traditional methods are affected by the complexity of facial features and the diversity of occluders, lack universality and generalize poorly; single-task classification methods cannot be precise to specific organs, so their deployment scenarios are limited; and when a multi-task method directly locates occluders while simultaneously detecting organs, the task is difficult and accuracy is hard to guarantee.

Summary of the Invention

The main purpose of the present application is to provide a face occlusion detection method, system, computer device and computer-readable storage medium, which are used to overcome the defects of the prior art that traditional methods lack universality and generalize poorly, that single-task classification methods cannot be precise to specific organs and have limited deployment scenarios, and that multi-task methods make organ detection difficult with accuracy hard to guarantee.
A first aspect of the present application provides a face occlusion detection method, the face occlusion detection method including:

acquiring a face image to be detected;

performing key point detection on the face image to obtain key point information of the face organs in the face image;

performing face organ block segmentation on the face image according to the key point information to obtain corresponding face organ block images;

preprocessing the face organ block images, inputting the preprocessed face organ block images into a pre-trained face occlusion detection model to perform face occlusion detection, and outputting corresponding mask images;

performing binarization processing on the mask images to obtain binarized target mask images;

calculating the occlusion ratio of each face organ according to the pixel values of the target mask images.
A second aspect of the present application provides a face occlusion detection device, the face occlusion detection device including: a memory, a processor, and a face occlusion detection program stored in the memory and executable on the processor, where the processor implements the following steps when executing the face occlusion detection program:

acquiring a face image to be detected;

performing key point detection on the face image to obtain key point information of the face organs in the face image;

performing face organ block segmentation on the face image according to the key point information to obtain corresponding face organ block images;

preprocessing the face organ block images, inputting the preprocessed face organ block images into a pre-trained face occlusion detection model to perform face occlusion detection, and outputting corresponding mask images;

performing binarization processing on the mask images to obtain binarized target mask images;

calculating the occlusion ratio of each face organ according to the pixel values of the target mask images.
A third aspect of the present application provides a storage medium, namely a computer-readable storage medium, in which computer instructions are stored; when the computer instructions are run on a computer, the computer is caused to perform the following steps:

acquiring a face image to be detected;

performing key point detection on the face image to obtain key point information of the face organs in the face image;

performing face organ block segmentation on the face image according to the key point information to obtain corresponding face organ block images;

preprocessing the face organ block images, inputting the preprocessed face organ block images into a pre-trained face occlusion detection model to perform face occlusion detection, and outputting corresponding mask images;

performing binarization processing on the mask images to obtain binarized target mask images;

calculating the occlusion ratio of each face organ according to the pixel values of the target mask images.
A fourth aspect of the present application provides a face occlusion detection system, the system comprising:
an acquisition module, configured to acquire a face image to be detected;
a first detection module, configured to perform key point detection on the face image to obtain key point information of the face organs in the face image;
a segmentation module, configured to segment the face image into face organ blocks according to the key point information to obtain corresponding face organ block images;
a second detection module, configured to preprocess the face organ block images and input the preprocessed face organ block images into a pre-trained face occlusion detection model to perform face occlusion detection and output a corresponding mask image;
a processing module, configured to binarize the mask image to obtain a binarized target mask image; and
a calculation module, configured to calculate the occlusion ratio of each face organ according to the pixel values of the target mask image.
According to the face occlusion detection method, system, computer device, and computer-readable storage medium provided by the present application, a face image to be detected is acquired; key point detection is performed on the face image to obtain key point information of the face organs in the face image; the face image is segmented into face organ blocks according to the key point information to obtain corresponding face organ block images; the face organ block images are preprocessed and input into a pre-trained face occlusion detection model for face occlusion detection, and a corresponding mask image is output; and the occlusion ratio of each face organ is calculated according to the pixel values of the binarized target mask image. By performing pixel-level semantic segmentation on face organs treated as blocks, the present application can accurately compute the specific occluded position of each organ and the occlusion percentage of each face organ. This not only reduces the complexity of face occlusion detection, but also divides the face precisely into individual organs, greatly improving the accuracy of face occlusion detection.
Description of Drawings
FIG. 1 is a schematic flowchart of the steps of the face occlusion detection method provided by the present application;
FIG. 2 is a schematic flowchart of the sub-steps of step S200 in FIG. 1;
FIG. 3 is a schematic grayscale image of face organ block segmentation provided by the present application;
FIG. 4 is a schematic flowchart of the sub-steps of step S300 in FIG. 1;
FIG. 5 is a schematic rendering of the face organ block segmentation effect provided by the present application;
FIG. 6 is a schematic flowchart of the sub-steps of step S400 in FIG. 1;
FIG. 7 is a schematic flowchart of the sub-steps of step S500 in FIG. 1;
FIG. 8 is a schematic flowchart of the sub-steps of the method for training the face occlusion detection model in the face occlusion detection method provided by the present application;
FIG. 9 is a schematic flowchart of the sub-steps of step S600 in FIG. 1;
FIG. 10 is a schematic diagram of optional program modules of the face occlusion detection system provided by the present application;
FIG. 11 is a schematic diagram of an optional hardware architecture of the computer device provided by the present application.
Detailed Description of the Embodiments
The embodiments of the present application provide a face occlusion detection method, system, device, and storage medium. By performing pixel-level semantic segmentation on face organs treated as blocks, the specific occluded position of each organ and the occlusion percentage of each face organ can be accurately computed. This not only reduces the complexity of face occlusion detection, but also divides the face precisely into individual organs, greatly improving the accuracy of face occlusion detection.
The terms "first", "second", "third", "fourth", and the like (if any) in the description, the claims, and the above drawings of the present application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used may be interchanged where appropriate, so that the embodiments described herein can be implemented in an order other than that illustrated or described herein. Furthermore, the terms "comprising" and "having", and any variants thereof, are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device comprising a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to such a process, method, product, or device.
The embodiments of the present application are described below with reference to the accompanying drawings.
Embodiment 1
Referring to FIG. 1, a schematic flowchart of the steps of a face occlusion detection method provided by an embodiment of the present application is shown. It should be understood that the flowcharts in the embodiments of the present application are not intended to limit the order in which the steps are executed. The following exemplary description takes a computer device as the execution subject; the computer device may include mobile terminals such as smart phones, tablet personal computers, and laptop computers, as well as fixed terminals such as desktop computers. The details are as follows:
Step S100: acquiring a face image to be detected.
Specifically, the face image to be detected may be acquired by taking a photograph of a face with a camera device, capturing a face with a video surveillance device, or scraping images with a web crawler.
Step S200: performing key point detection on the face image to obtain key point information of the face organs in the face image.
Specifically, the face image to be detected is input into a preset key point model for key point detection, and the corresponding key point information is obtained, from which the key point information of the face organs is determined.
In an exemplary embodiment, as shown in FIG. 2, which is a detailed flowchart of step S200, step S200 may include:
Step S201: inputting the face image into a preset key point model for key point detection to obtain a preset number of key points of the face image on a two-dimensional plane, where the key point information includes the key point coordinates and the index corresponding to each key point;
Step S202: determining the key point information of each face organ according to the preset number of key points and the position of each face organ in the face image, where the face organs include the forehead, left eyebrow, right eyebrow, left eye, right eye, nose, and mouth.
Specifically, the face image to be detected is input into the preset key point model for key point detection and calibration: 68 key points are marked on the face image to be detected, and the index corresponding to each key point is annotated as well, yielding the corresponding key point information, from which the coordinate information of the face organs is determined.
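As an illustrative sketch only (the original disclosure does not name a specific landmark library), the 68-point detection described above could be performed with an off-the-shelf predictor such as dlib's; the model file name below is an assumption:

```python
import dlib

# A hypothetical stand-in for the "preset key point model": dlib's
# pre-trained 68-point landmark predictor (model file name assumed).
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def detect_landmarks(image):
    """Return {index: (x, y)} for the 68 landmarks of the first detected face."""
    faces = detector(image)
    if not faces:
        return {}
    shape = predictor(image, faces[0])
    return {i: (shape.part(i).x, shape.part(i).y) for i in range(68)}
```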
Exemplarily, as shown in FIG. 3, which is a schematic grayscale image of face organ block (Patch) segmentation: taking the left eye as an example, the indices of its key points are 36, 37, 38, 39, 40, and 41, and the region enclosed by these key point coordinates represents the left eye. Taking the forehead as an example, the indices of the key points of the left eyebrow are 17, 18, 19, 20, and 21, and those of the right eyebrow are 22, 23, 24, 25, and 26. The horizontal line through points 19 and 24 serves as the lower boundary of the forehead; taking this line as the reference and extending upward by one fifth of the face frame height gives the upper boundary of the forehead; the left and right boundaries of the forehead are the vertical lines through points 17 and 26, respectively; and the rectangular region thus formed is taken as the forehead. Here, the face frame height is the distance from the largest of the eyebrow key point coordinates to the smallest of the face contour key point coordinates.
Referring again to FIG. 3, the cheeks can also be delineated from the 68 key points. Taking the left cheek as an example, the indices of its key points are 1, 2, 3, 4, 5, 6, 7, 31, 40, 41, and 48; the region enclosed by these 11 key points is the left cheek. The face contour can likewise be delineated from the 68 key points: the indices of its key points are 0 through 26, and the region enclosed by these 27 key points is the face contour.
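To make the forehead rule above concrete, the following sketch (not from the original disclosure) computes the forehead rectangle from a landmark dictionary; where the text is ambiguous, e.g., how the line through points 19 and 24 is chosen when their y coordinates differ, the choices below are assumptions:

```python
def forehead_rect(pts, face_frame_height):
    """Forehead rectangle (x1, y1, x2, y2) from 68 landmarks.

    Lower boundary: horizontal line through points 19 and 24 (mean y assumed);
    upper boundary: one fifth of the face frame height above the lower one;
    left/right boundaries: vertical lines through points 17 and 26.
    """
    lower_y = (pts[19][1] + pts[24][1]) // 2
    upper_y = lower_y - face_frame_height // 5  # y grows downward in image coords
    return pts[17][0], upper_y, pts[26][0], lower_y
```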
By performing key point detection on the face image, the embodiments of the present application obtain the key point information of the face image and thereby locate the corresponding face organs accurately.
Step S300: segmenting the face image into face organ Patches according to the key point information to obtain corresponding face organ Patch images.
Specifically, Patch segmentation is performed on the face image according to the key point information detected by the key point model and preset division rules, taking the minimal circumscribed rectangular region containing each face organ to obtain the corresponding face organ Patch image.
In an exemplary embodiment, as shown in FIG. 4, which is a detailed flowchart of step S300, step S300 may include:
Step S301: determining the minimal circumscribed rectangle corresponding to each face organ according to the key point information and preset division rules;
Step S302: performing Patch segmentation on the face image according to the minimal circumscribed rectangle corresponding to each face organ to obtain the face organ Patch image corresponding to each face organ.
Specifically, a set of division rules is designed based on the key point information, as follows: the specific position of each face organ is determined from the region enclosed by the key point coordinates and the indices of those key points. Since polygon computation is relatively redundant and contributes little to the occlusion judgment, the minimal circumscribed rectangle of each face organ is determined from its topmost, bottommost, leftmost, and rightmost coordinate points and extracted as the face organ Patch image for subsequent calculation.
Referring again to FIG. 3, taking the left eye as an example, the indices of its key points are 36, 37, 38, 39, 40, and 41, and the region enclosed by these key point coordinates represents the left eye; the smallest rectangle that can contain the left eye is taken as the left eye Patch. The cheeks can also be delineated from the 68 key points: taking the left cheek as an example, the indices of its key points are 1, 2, 3, 4, 5, 6, 7, 31, 40, 41, and 48, and the region enclosed by these 11 key point coordinates represents the left cheek; the smallest rectangle that can contain the left cheek is taken as the left cheek Patch. The face contour can likewise be delineated from the 68 key points, with the smallest rectangle determined by the key points with indices 0, 8, 16, 19, and 24 taken as the face contour Patch.
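The minimal-circumscribed-rectangle rule lends itself to a direct implementation. The following sketch, with the landmark index sets taken from the text above, crops an organ Patch from the image; for example, organ_patch(image, pts, LEFT_EYE) yields the left eye Patch:

```python
LEFT_EYE = [36, 37, 38, 39, 40, 41]
LEFT_CHEEK = [1, 2, 3, 4, 5, 6, 7, 31, 40, 41, 48]
FACE_CONTOUR = [0, 8, 16, 19, 24]  # indices used for the contour Patch

def organ_patch(image, pts, indices):
    """Crop the minimal axis-aligned rectangle containing the given landmarks."""
    xs = [pts[i][0] for i in indices]
    ys = [pts[i][1] for i in indices]
    return image[min(ys):max(ys) + 1, min(xs):max(xs) + 1]
```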
By taking the minimal circumscribed region of each face organ as the face organ Patch image, the embodiments of the present application reduce computational complexity compared with traditional polygon computation and make the calculation of the organ occlusion ratio more convenient.
Step S400: preprocessing the face organ Patch images, and inputting the preprocessed face organ Patch images into a pre-trained face occlusion detection model to perform face occlusion detection and output the corresponding mask images.
Specifically, the segmented face organ Patch images are first preprocessed to obtain images usable by the face occlusion detection model; after the face occlusion detection model has been trained in advance, the preprocessed images are input into it for face occlusion detection, and the corresponding mask images are output.
Exemplarily, as shown in FIG. 5, which is a schematic rendering of the face organ Patch segmentation effect: the image on the left of FIG. 5 is the preprocessed face input image, and the image on the right is the mask image output by the face occlusion detection model, in which the black part is the background and the white part is the face region.
In an exemplary embodiment, as shown in FIG. 6, which is a detailed flowchart of step S400, step S400 may include:
Step S401: padding the face organ Patch image and resizing the padded image to obtain a square Patch image of the corresponding size;
Step S402: inputting the square Patch image into the pre-trained face occlusion detection model for face occlusion detection to obtain the corresponding mask image.
Specifically, a zero-padding function (Padding 0) is called to fill the face organ Patch image region into a square, and a resize function is then called to adjust the region to 128×128, yielding a 128×128 square Patch image.
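A minimal sketch of this preprocessing step, assuming OpenCV and top-left placement of the patch within the square canvas (the placement is not specified in the text):

```python
import cv2
import numpy as np

def pad_and_resize(patch, size=128):
    """Zero-pad a Patch to a square canvas, then resize it to size x size."""
    h, w = patch.shape[:2]
    side = max(h, w)
    canvas = np.zeros((side, side) + patch.shape[2:], dtype=patch.dtype)
    canvas[:h, :w] = patch  # top-left placement is an assumption
    return cv2.resize(canvas, (size, size))
```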
Specifically, Table 1 shows the network structure of the face occlusion detection model. The square Patch image first passes through the left half of the model, i.e., the first to fourth layers, for feature extraction; this is the downsampling stage. It then passes through the right half of the model, i.e., the 5th, 7th, and 10th layers; this is the upsampling stage, in which feature maps of different scales are fused by the Concat operation shown in Table 1, which stacks the feature maps along the channel (depth) dimension. The last layer is a single filter of size 1×1×128 with depth 1; after this convolution layer, the face occlusion detection model outputs a mask image of size 128×128.
In the embodiments of the present application, the face organ Patch image is preprocessed and input into the face occlusion detection model, and the mask image of the face organ is then obtained through feature extraction, feature fusion, and convolution. Face organs and skin are thereby accurately distinguished from occluders, making the calculated organ occlusion ratios more precise.
Table 1. Network structure of the face occlusion detection model (the table body is reproduced as an image in the original publication).
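Since the table body is not recoverable, the following PyTorch sketch only illustrates the overall topology described above: a downsampling path for feature extraction, an upsampling path fused via Concat, and a final 1×1 convolution producing a 128×128 mask. The layer count and channel widths are assumptions, not the patent's actual configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OcclusionNet(nn.Module):
    """Encoder-decoder sketch consistent with the description: downsampling
    for feature extraction, Concat fusion during upsampling, and a final
    1x1 convolution yielding a 128x128 mask. Channel widths are assumed."""

    def __init__(self):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.enc3 = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())
        self.dec2 = nn.Sequential(nn.Conv2d(64 + 32, 32, 3, padding=1), nn.ReLU())
        self.dec1 = nn.Sequential(nn.Conv2d(32 + 16, 16, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(16, 1, kernel_size=1)  # 1x1 filter, depth 1

    def forward(self, x):                            # x: (N, 3, 128, 128)
        f1 = self.enc1(x)                            # 128x128
        f2 = self.enc2(F.max_pool2d(f1, 2))          # 64x64
        f3 = self.enc3(F.max_pool2d(f2, 2))          # 32x32
        u2 = F.interpolate(f3, scale_factor=2)       # back to 64x64
        d2 = self.dec2(torch.cat([u2, f2], dim=1))   # Concat fusion
        u1 = F.interpolate(d2, scale_factor=2)       # back to 128x128
        d1 = self.dec1(torch.cat([u1, f1], dim=1))   # Concat fusion
        return torch.sigmoid(self.head(d1))          # (N, 1, 128, 128) mask
```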
Step S500: binarizing the mask image to obtain the binarized target mask image.
Specifically, the mask image is first converted to grayscale to obtain a corresponding grayscale image, and the grayscale image is then binarized according to a preset pixel threshold to obtain the binarized target mask image.
In an exemplary embodiment, as shown in FIG. 7, step S500 may include:
Step S501: converting the mask image to grayscale to obtain a grayscale image;
Step S502: comparing the pixel value of each pixel of the grayscale image with a preset pixel threshold;
Step S503: when the pixel value of a pixel is higher than the preset pixel threshold, setting the pixel value of that pixel to a preset pixel value;
Step S504: completing the binarization of the mask image to obtain the binarized target mask image.
Specifically, the mask image is binarized as follows: each pixel of the mask image lies between 0 and 1, and the preset pixel threshold is set to 0.75; pixels above the preset pixel threshold are set to 1 (representing the occluded region), and the other pixels are set to 0 (representing the non-occluded region), yielding the binarized target mask image. The preset pixel threshold can be set freely according to the actual situation and is not limited here.
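A minimal sketch of this thresholding step, assuming the mask is a NumPy array with values in [0, 1]:

```python
import numpy as np

def binarize_mask(mask, threshold=0.75):
    """Set pixels above the threshold to 1 (occluded) and the rest to 0."""
    return (mask > threshold).astype(np.uint8)
```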
By binarizing the mask image to obtain the binarized target mask image, the embodiments of the present application separate the target face region in the image from the background, making the model's results more accurate.
In an exemplary embodiment, as shown in FIG. 8, which is an exemplary flowchart of the steps of the method for training the face occlusion detection model, the training method of the face occlusion detection model includes:
Step S511: acquiring face training image samples and occluder samples;
Step S512: performing key point detection on the face training image samples to obtain key point information of the face organs in the face training image samples;
Step S513: segmenting the face training image samples into face organ Patches according to the key point information to obtain corresponding face organ Patch images;
Step S514: randomly adding the occluder samples at preset positions of the face organ Patch images, so as to replace the pixels at the preset positions of the face organ Patch images with the pixels of the occluder samples, thereby obtaining face occluder training image samples;
Step S515: preprocessing the face occluder training image samples, and inputting the preprocessed face organ Patch images into the face occlusion detection model to complete the training of the face occlusion detection model.
Specifically, key point detection is performed on the face training image samples via the key point model to obtain the key point information of the face organs in the samples; the samples are then segmented into face organ Patches according to the key point information to obtain the corresponding face organ Patch images; and the occluder samples are randomly added at preset positions of the face organ Patch images, replacing the pixel values of the regions where the occluder samples are added with the pixel values of the occluder samples, thereby obtaining the face occluder training image samples. The occluder samples are obtained by web crawling and by self-capture and extraction, and include fingers, pens, fans, cups, masks, cosmetics, microphones, and the like.
Exemplarily, suppose the region where an occluder sample is added to a face training image sample has coordinates [x1:x2, y1:y2] on the two-dimensional plane, where x1 and x2 correspond to the horizontal coordinates and y1 and y2 to the vertical coordinates of the face organ in the mask image. An all-zero matrix L of size 128×128 is first initialized, and all pixels in the [x1:x2, y1:y2] region are then set to 1; the modified matrix is the supervision label used in training.
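A sketch of the label construction just described; whether the first slice indexes rows or columns is left open by the text, so the convention below is an assumption:

```python
import numpy as np

def make_label(x1, x2, y1, y2, size=128):
    """All-zero 128x128 matrix L with the occluder region [x1:x2, y1:y2] set to 1."""
    label = np.zeros((size, size), dtype=np.float32)
    label[x1:x2, y1:y2] = 1.0  # slice order as written in the text (assumed)
    return label
```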
Specifically, the face occlusion detection model is trained with the segmentation loss function IOU Loss, which drives the pixel values predicted for the face organ Patch image toward the pixel values at the corresponding positions of the label matrix L: pixel values in occluded regions approach 1 and pixel values elsewhere approach 0. Training then proceeds with the gradient descent method commonly used in deep learning until the face occlusion detection model converges, i.e., the loss value no longer decreases; at this point the pixel values of the mask image output by the model are arbitrarily close to those of the supervision label, and training is complete. Here, the loss function is the commonly used segmentation loss IOU Loss, computed from the mask image and the label matrix L.
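The text names IOU Loss without giving its expression; one common soft-IoU formulation, shown here as an assumption rather than the patent's exact loss, is:

```python
import torch

def iou_loss(pred, target, eps=1e-6):
    """Soft-IoU loss for (N, 1, H, W) predictions and labels in [0, 1]."""
    inter = (pred * target).sum(dim=(1, 2, 3))
    union = pred.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3)) - inter
    return (1.0 - inter / (union + eps)).mean()  # 0 when prediction matches label
```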
By randomly adding occluders of various types to random face regions of the face training image samples and then feeding a large number of face occluder training image samples into the face occlusion detection model for training, the embodiments of the present application make the model increasingly sensitive to occluders, so that occluders of any kind can be detected.
Step S600: calculating the occlusion ratio of each face organ according to the pixel values of the target mask image.
Specifically, the pixel values of the target mask image are compared with the preset pixel threshold, all points above the preset pixel threshold are counted, and the occlusion ratio of each face organ is then calculated.
In an exemplary embodiment, as shown in FIG. 9, step S600 may include:
Step S601: counting, according to the pixel values of the target mask image, the number of pixels in each face organ that are at the preset pixel value, to obtain the total number of occluded pixels;
Step S602: calculating, from the total number of occluded pixels, its ratio to the total number of pixels of the corresponding face organ, to obtain the occlusion ratio of each face organ.
Specifically, according to the pixel values of the target mask image, the proportion of pixels in the mask image corresponding to each face organ Patch image that are at the preset pixel value is calculated; this proportion is the occlusion percentage of that face organ. The organ occlusion percentage is computed as follows:
$$\text{ratio} = \frac{\sum_{i=x_1}^{x_1+h}\ \sum_{j=y_1}^{y_1+w} \mathbf{1}(\sigma_{ij}=1)}{h \times w}$$
where x1 and y1 are the coordinates of the upper-left corner of the face organ in the mask image, h and w are the height and width, respectively, of the face organ in the mask image, σ_ij denotes the pixel value at position (i, j) of the binarized mask image, and the indicator 1(σ_ij = 1) takes the value 1 if the pixel at coordinate (i, j) of the mask image is 1, and 0 otherwise.
According to the face occlusion detection method provided by the embodiments of the present application, key point detection is performed on the face image to obtain the key point information of the corresponding face organs; the face organs are then segmented into Patches to obtain the corresponding face organ Patch images, which, after preprocessing, are input into the pre-trained face occlusion detection model for detection to obtain the corresponding mask images; finally, the corresponding face organ occlusion ratios are calculated. This not only reduces the complexity of face occlusion detection, but also divides the face precisely into individual organs, greatly improving the accuracy of face occlusion detection.
Embodiment 2
Referring to FIG. 10, a schematic diagram of the program modules of a face occlusion detection system 700 according to an embodiment of the present application is shown. The face occlusion detection system 700 can be applied to a computer device, which may be a mobile phone, a tablet personal computer, a laptop computer, or another device with a data transmission function. In the embodiments of the present application, the face occlusion detection system 700 may include or be divided into one or more program modules, which are stored in a storage medium and executed by one or more processors to complete the embodiments of the present application and implement the face occlusion detection system 700 described above. The program modules referred to in the embodiments of the present application are a series of computer program instruction segments capable of performing specific functions, and are better suited than the program itself for describing the execution of the face occlusion detection system 700 in the storage medium. In an exemplary embodiment, the face occlusion detection system 700 includes an acquisition module 701, a first detection module 702, a segmentation module 703, a second detection module 704, a processing module 705, and a calculation module 706. The following description details the functions of each program module of the embodiments of the present application:
The acquisition module 701 is configured to acquire a face image to be detected.
Specifically, the acquisition module 701 may acquire the face image to be detected by taking a photograph of a face with a camera device, capturing a face with a video surveillance device, or scraping images with a web crawler.
The first detection module 702 is configured to perform key point detection on the face image to obtain key point information of the face organs in the face image.
Specifically, the first detection module 702 inputs the face image to be detected into a preset key point model for key point detection to obtain the corresponding key point information, from which the key point information of the face organs is determined.
In an exemplary embodiment, the first detection module 702 is specifically configured to:
input the face image into a preset key point model for key point detection to obtain a preset number of key points of the face image on a two-dimensional plane, where the key point information includes the key point coordinates and the index corresponding to each key point; and
determine the key point information of each face organ according to the preset number of key points and the position of each face organ in the face image, where the face organs include the forehead, left eyebrow, right eyebrow, left eye, right eye, nose, and mouth.
Specifically, the face image to be detected is input into the preset key point model for key point detection and calibration: 68 key points are marked on the face image to be detected, and the index corresponding to each key point is annotated as well, yielding the corresponding key point information, from which the coordinate information of the face organs is determined.
Exemplarily, as shown in FIG. 3, which is a schematic grayscale image of face organ Patch segmentation: taking the left eye as an example, the indices of its key points are 36, 37, 38, 39, 40, and 41, and the region enclosed by these key point coordinates represents the left eye. Taking the forehead as an example, the indices of the key points of the left eyebrow are 17, 18, 19, 20, and 21, and those of the right eyebrow are 22, 23, 24, 25, and 26. The horizontal line through points 19 and 24 serves as the lower boundary of the forehead; taking this line as the reference and extending upward by one fifth of the face frame height gives the upper boundary of the forehead; the left and right boundaries of the forehead are the vertical lines through points 17 and 26, respectively; and the rectangular region thus formed is taken as the forehead. Here, the face frame height is the distance from the largest of the eyebrow key point coordinates to the smallest of the face contour key point coordinates.
Referring again to FIG. 3, the cheeks can also be delineated from the 68 key points. Taking the left cheek as an example, the indices of its key points are 1, 2, 3, 4, 5, 6, 7, 31, 40, 41, and 48; the region enclosed by these 11 key points is the left cheek. The face contour can likewise be delineated from the 68 key points: the indices of its key points are 0 through 26, and the region enclosed by these 27 key points is the face contour.
By performing key point detection on the face image, the embodiments of the present application obtain the key point information of the face image and thereby locate the corresponding face organs accurately.
The segmentation module 703 is configured to segment the face image into face organ Patches according to the key point information to obtain corresponding face organ Patch images.
Specifically, the segmentation module 703 performs Patch segmentation on the face image according to the key point information detected by the key point model and the preset division rules, taking the minimal circumscribed rectangular region containing each face organ to obtain the corresponding face organ Patch image.
In an exemplary embodiment, the segmentation module 703 is specifically configured to:
determine the minimal circumscribed rectangle corresponding to each face organ according to the key point information and the preset division rules; and
perform Patch segmentation on the face image according to the minimal circumscribed rectangle corresponding to each face organ to obtain the face organ Patch image corresponding to each face organ.
Specifically, the segmentation module 703 designs a set of division rules based on the key point information, as follows: the specific position of each face organ is determined from the region enclosed by the key point coordinates and the indices of those key points. Since polygon computation is relatively redundant and contributes little to the occlusion judgment, the minimal circumscribed rectangle of each face organ is determined from its topmost, bottommost, leftmost, and rightmost coordinate points and extracted as the face organ Patch image for subsequent calculation.
Referring again to FIG. 3, taking the left eye as an example, the indices of its key points are 36, 37, 38, 39, 40, and 41, and the region enclosed by these key point coordinates represents the left eye; the smallest rectangle that can contain the left eye is taken as the left eye Patch. The cheeks can also be delineated from the 68 key points: taking the left cheek as an example, the indices of its key points are 1, 2, 3, 4, 5, 6, 7, 31, 40, 41, and 48, and the region enclosed by these 11 key point coordinates represents the left cheek; the smallest rectangle that can contain the left cheek is taken as the left cheek Patch. The face contour can likewise be delineated from the 68 key points, with the smallest rectangle determined by the key points with indices 0, 8, 16, 19, and 24 taken as the face contour Patch.
By taking the minimal circumscribed region of each face organ as the face organ Patch image, the embodiments of the present application reduce computational complexity compared with traditional polygon computation and make the calculation of the organ occlusion ratio more convenient.
The second detection module 704 is configured to preprocess the face organ Patch images and input the preprocessed face organ Patch images into the pre-trained face occlusion detection model to perform face occlusion detection and output the corresponding mask images.
Specifically, the second detection module 704 first preprocesses the segmented face organ Patch images to obtain images usable by the face occlusion detection model; after the model has been trained in advance, the preprocessed images are input into it for face occlusion detection, and the corresponding mask images are output.
Exemplarily, as shown in FIG. 5, which is a schematic rendering of the face organ Patch segmentation effect: the image on the left of FIG. 5 is the preprocessed face input image, and the image on the right is the mask image output by the face occlusion detection model, in which the black part is the background and the white part is the face region.
In an exemplary embodiment, the second detection module 704 is specifically configured to:
pad the face organ Patch image and resize the padded image to obtain a square Patch image of the corresponding size; and
input the square Patch image into the pre-trained face occlusion detection model for face occlusion detection to obtain the corresponding mask image.
Specifically, the second detection module 704 calls the zero-padding function (Padding 0) to fill the face organ Patch image region into a square and then calls the resize function to adjust the region to 128×128, yielding a 128×128 square Patch image.
Specifically, Table 1 shows the network structure of the face occlusion detection model. The square Patch image first passes through the left half of the model, i.e., the first to fourth layers, for feature extraction; this is the downsampling stage. It then passes through the right half of the model, i.e., the 5th, 7th, and 10th layers; this is the upsampling stage, in which feature maps of different scales are fused by the Concat operation shown in Table 1, which stacks the feature maps along the channel (depth) dimension. The last layer is a single filter of size 1×1×128 with depth 1; after this convolution layer, the face occlusion detection model outputs a mask image of size 128×128.
In an exemplary embodiment, the second detection module 704 preprocesses the face organ Patch image and inputs it into the face occlusion detection model; the mask image of the face organ is then obtained through feature extraction, feature fusion, and convolution, so that face organs and skin are accurately distinguished from occluders and the calculated organ occlusion ratios are more precise.
The processing module 705 is configured to binarize the mask image to obtain the binarized target mask image.
Specifically, the processing module 705 first converts the mask image to grayscale to obtain a corresponding grayscale image, and then binarizes the grayscale image according to a preset pixel threshold to obtain the binarized target mask image.
In an exemplary embodiment, the processing module 705 is specifically configured to:
convert the mask image to grayscale to obtain a grayscale image;
compare the pixel value of each pixel of the grayscale image with a preset pixel threshold;
when the pixel value of a pixel is higher than the preset pixel threshold, set the pixel value of that pixel to a preset pixel value; and
complete the binarization of the mask image to obtain the binarized target mask image.
Specifically, the processing module 705 binarizes the mask image as follows: each pixel of the mask image lies between 0 and 1, and the preset pixel threshold is set to 0.75; pixels above the preset pixel threshold are set to 1 (representing the occluded region), and the other pixels are set to 0 (representing the non-occluded region), yielding the binarized target mask image. The preset pixel threshold can be set freely according to the actual situation and is not limited here.
By binarizing the mask image to obtain the binarized target mask image, the embodiments of the present application separate the target face region in the image from the background, making the model's results more accurate.
The face occlusion detection system 700 provided by the present application includes a training module for the face occlusion detection model, configured to:
acquire face training image samples and occluder samples;
perform key point detection on the face training image samples to obtain key point information of the face organs in the face training image samples;
segment the face training image samples into face organ Patches according to the key point information to obtain corresponding face organ Patch images;
randomly add the occluder samples at preset positions of the face organ Patch images, so as to replace the pixels at the preset positions of the face organ Patch images with the pixels of the occluder samples, thereby obtaining face occluder training image samples; and
preprocess the face occluder training image samples, and input the preprocessed face organ Patch images into the face occlusion detection model to complete the training of the face occlusion detection model.
Specifically, the training module of the face occlusion detection model performs key point detection on the face training image samples via the key point model to obtain the key point information of the face organs in the samples; segments the samples into face organ Patches according to the key point information to obtain the corresponding face organ Patch images; and randomly adds the occluder samples at preset positions of the face organ Patch images, replacing the pixel values of the regions where the occluder samples are added with the pixel values of the occluder samples, thereby obtaining the face occluder training image samples. The occluder samples are obtained by web crawling and by self-capture and extraction, and include fingers, pens, fans, cups, masks, cosmetics, microphones, and the like.
Exemplarily, suppose the region where an occluder sample is added to a face training image sample has coordinates [x1:x2, y1:y2] on the two-dimensional plane, where x1 and x2 correspond to the horizontal coordinates and y1 and y2 to the vertical coordinates of the face organ in the mask image. An all-zero matrix L of size 128×128 is first initialized, and all pixels in the [x1:x2, y1:y2] region are then set to 1; the modified matrix is the supervision label used in training.
Specifically, the face occlusion detection model is trained with the segmentation loss function IOU Loss, which drives the pixel values predicted for the face organ Patch image toward the pixel values at the corresponding positions of the label matrix L: pixel values in occluded regions approach 1 and pixel values elsewhere approach 0. Training then proceeds with the gradient descent method commonly used in deep learning until the face occlusion detection model converges, i.e., the loss value no longer decreases; at this point the pixel values of the mask image output by the model are arbitrarily close to those of the supervision label, and training is complete. Here, the loss function is the commonly used segmentation loss IOU Loss, computed from the mask image and the label matrix L.
In an exemplary embodiment, the face occlusion detection system 700 randomly adds occluders of various types to random face regions of the face training image samples and then feeds a large number of face occluder training image samples into the face occlusion detection model for training, making the model increasingly sensitive to occluders, so that occluders of any kind can be detected.
The calculation module 706 is configured to calculate the occlusion ratio of each face organ according to the pixel values of the target mask image.
Specifically, the pixel values of the target mask image are compared with the preset pixel threshold, all points above the preset pixel threshold are counted, and the occlusion ratio of each face organ is then calculated.
In an exemplary embodiment, the calculation module 706 is specifically configured to:
count, according to the pixel values of the target mask image, the number of pixels in each face organ that are at the preset pixel value, to obtain the total number of occluded pixels; and
calculate, from the total number of occluded pixels, its ratio to the total number of pixels of the corresponding face organ, to obtain the occlusion ratio of each face organ.
Specifically, according to the pixel values of the target mask image, the calculation module 706 calculates the proportion of pixels in the mask image corresponding to each face organ Patch image that are at the preset pixel value; this proportion is the occlusion percentage of that face organ. The organ occlusion percentage is computed as follows:
$$\text{ratio} = \frac{\sum_{i=x_1}^{x_1+h}\ \sum_{j=y_1}^{y_1+w} \mathbf{1}(\sigma_{ij}=1)}{h \times w}$$
where x1 and y1 are the coordinates of the upper-left corner of the face organ in the mask image, h and w are the height and width, respectively, of the face organ in the mask image, σ_ij denotes the pixel value at position (i, j) of the binarized mask image, and the indicator 1(σ_ij = 1) takes the value 1 if the pixel at coordinate (i, j) of the mask image is 1, and 0 otherwise.
The face occlusion detection system 700 provided by the embodiments of the present application performs key point detection on the face image to obtain the key point information of the corresponding face organs; segments the face organs into Patches to obtain the corresponding face organ Patch images, which, after preprocessing, are input into the pre-trained face occlusion detection model for detection to obtain the corresponding mask images; and finally calculates the corresponding face organ occlusion ratios. This not only reduces the complexity of face occlusion detection, but also divides the face precisely into individual organs, greatly improving the accuracy of face occlusion detection.
Embodiment 3
参阅图11,本申请实施例还提供一种计算机设备800的硬件架构示意图。如可以执行程序的智能手机、平板电脑、笔记本电脑、台式计算机、机架式服务器、刀片式服务器、塔式服务器或机柜式服务器(包括独立的服务器,或者多个服务器所组成的服务器集群)等。在本申请实施例中,所述计算机设备800是一种能够按照事先设定或者存储的指令,自动进行数值计算和/或信息处理的设备。如图所示,所述计算机设备800至少包括,但不限于,可通过装置总线相互通信连接存储器801、处理器802、网络接口803。其中:Referring to FIG. 11 , an embodiment of the present application further provides a schematic diagram of a hardware architecture of a computer device 800 . Such as smart phones, tablet computers, notebook computers, desktop computers, rack servers, blade servers, tower servers or rack servers (including independent servers, or server clusters composed of multiple servers) that can execute programs, etc. . In this embodiment of the present application, the computer device 800 is a device that can automatically perform numerical calculation and/or information processing according to pre-set or stored instructions. As shown in the figure, the computer device 800 at least includes, but is not limited to, a memory 801, a processor 802, and a network interface 803 that can communicate with each other through a device bus. in:
本申请实施例中,存储器801至少包括一种类型的计算机可读存储介质,所述可读存储介质包括闪存、硬盘、多媒体卡、卡型存储器(例如,SD或DX存储器等)、随机访问存储器(RAM)、静态随机访问存储器(SRAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、可编程只读存储器(PROM)、磁性存储器、磁盘、光盘等。在一些发明实施例中,存储器801可以是计算机设备800的内部存储单元,例如所述计算机设备800的硬盘或内存。在另一些发明实施例中,存储器801也可以是计算机设备800的外部存储设备,例如所述计算机设备800上配备的插接式硬盘,智能存储卡(Smart Media Card,SMC),安全数字(Secure Digital,SD)卡,闪存卡(Flash Card)等。当然,存储器801还可以既包括计算机设备800的内部存储单元也包括其外部存储设备。本申请实施例中,存储器801通常用于存储安装于计算机设备800的操作装置和各类应用软件,例如所述人脸遮挡检测系统700的程序代码等。此外,存储器801还可以用于暂时地存储已经输出或者将要输出的各类数据。In this embodiment of the present application, the memory 801 includes at least one type of computer-readable storage medium, and the readable storage medium includes a flash memory, a hard disk, a multimedia card, a card-type memory (for example, SD or DX memory, etc.), and random access memory. (RAM), static random access memory (SRAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), programmable read only memory (PROM), magnetic memory, magnetic disk, optical disk, and the like. In some inventive embodiments, the memory 801 may be an internal storage unit of the computer device 800 , such as a hard disk or a memory of the computer device 800 . In other embodiments of the invention, the memory 801 may also be an external storage device of the computer device 800, for example, a pluggable hard disk, a smart memory card (Smart Media Card, SMC), a secure digital (Secure Digital) device equipped on the computer device 800 Digital, SD) card, flash card (Flash Card), etc. Of course, the memory 801 may also include both the internal storage unit of the computer device 800 and its external storage device. In the embodiment of the present application, the memory 801 is generally used to store the operating device installed in the computer device 800 and various application software, such as the program code of the face occlusion detection system 700 and the like. In addition, the memory 801 can also be used to temporarily store various types of data that have been output or will be output.
In some embodiments, the processor 802 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 802 is generally used to control the overall operation of the computer device 800. In this embodiment of the present application, the processor 802 is configured to run the program code stored in the memory 801 or to process data, for example to run the program code of the face occlusion detection system 700, so as to implement the face occlusion detection method of the foregoing embodiments.
The network interface 803 may include a wireless network interface or a wired network interface, and is generally used to establish a communication connection between the computer device 800 and other electronic devices. For example, the network interface 803 is used to connect the computer device 800 with an external terminal through a network, and to establish a data transmission channel and a communication connection between the computer device 800 and the external terminal. The network may be a wireless or wired network such as an intranet, the Internet, a Global System for Mobile communications (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, or Wi-Fi.
It should be noted that FIG. 11 only shows a computer device 800 with components 801-803, but it should be understood that not all of the illustrated components are required; more or fewer components may be implemented instead.
In this embodiment of the present application, the face occlusion detection system 700 stored in the memory 801 may also be divided into one or more program modules, which are stored in the memory 801 and executed by one or more processors (the processor 802 in this embodiment) to complete the face occlusion detection method of the present application.
Embodiment 4
An embodiment of the present application further provides a computer-readable storage medium, which may be non-volatile or volatile, such as flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, server, or app store, on which a computer program is stored that implements the corresponding function when executed by a processor. The computer-readable storage medium of this embodiment of the present application is used to store the face occlusion detection system 700, which, when executed by a processor, implements the face occlusion detection method of the present application.
The serial numbers of the above embodiments of the present application are for description only and do not represent the relative merits of the embodiments.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the preferred implementation.
The above are only preferred embodiments of the present application and are not intended to limit the patent scope of the present application. Any equivalent structural or process transformation made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present application.

Claims (20)

  1. A face occlusion detection method, wherein the method comprises:
    acquiring a face image to be detected;
    performing key point detection on the face image to obtain key point information of face organs in the face image;
    performing face organ block segmentation on the face image according to the key point information to obtain corresponding face organ block images;
    preprocessing the face organ block images, inputting the preprocessed face organ block images into a pre-trained face occlusion detection model for face occlusion detection, and outputting corresponding mask images;
    performing binarization processing on the mask images to obtain binarized target mask images; and
    calculating an occlusion ratio of each face organ according to pixel values of the target mask images.
  2. The face occlusion detection method according to claim 1, wherein performing key point detection on the face image to obtain the key point information of the face organs in the face image comprises:
    inputting the face image into a preset key point model for key point detection to obtain a preset number of pieces of key point information of the face image on a two-dimensional plane, wherein the key point information comprises key point coordinates and the sequence numbers corresponding to the key points; and
    determining the key point information of each face organ according to the preset number of pieces of key point information and the position of each face organ in the face image, wherein the face organs comprise the forehead, the left eyebrow, the right eyebrow, the left eye, the right eye, the nose, and the mouth.
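For illustration only (not part of the claims), one way to group a preset number of key points by organ, assuming a dlib-style 68-point landmark layout; the claim itself does not fix the number or ordering of points, and the forehead, which has no dedicated points in that layout, would typically be extrapolated from the eyebrow points.

```python
# Assumed 68-point landmark layout; the sequence-number ranges are illustrative.
ORGAN_POINTS = {
    "left_eyebrow":  range(17, 22),
    "right_eyebrow": range(22, 27),
    "nose":          range(27, 36),
    "left_eye":      range(36, 42),
    "right_eye":     range(42, 48),
    "mouth":         range(48, 68),
}

def organ_keypoints(landmarks):
    """landmarks: sequence of (x, y) tuples indexed by key point sequence number."""
    return {organ: [landmarks[i] for i in idx]
            for organ, idx in ORGAN_POINTS.items()}
```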
  3. The face occlusion detection method according to claim 1, wherein performing face organ block segmentation on the face image according to the key point information to obtain the corresponding face organ block images comprises:
    determining a minimum bounding rectangle corresponding to each face organ according to the key point information and a preset division rule; and
    performing block segmentation on the face image according to the minimum bounding rectangle corresponding to each face organ to obtain the face organ block image corresponding to each face organ.
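For illustration only, a sketch of the block-segmentation step of claim 3: each organ's minimum bounding rectangle is computed from its key points and cropped from the face image; the margin parameter stands in for the preset division rule and is an assumption.

```python
import numpy as np

def organ_blocks(face_image, organ_keypoints, margin=4):
    """Crop the minimum bounding rectangle (plus an assumed margin) per organ."""
    h, w = face_image.shape[:2]
    blocks = {}
    for organ, pts in organ_keypoints.items():
        pts = np.asarray(pts, dtype=int)
        x0 = max(int(pts[:, 0].min()) - margin, 0)
        y0 = max(int(pts[:, 1].min()) - margin, 0)
        x1 = min(int(pts[:, 0].max()) + margin, w)
        y1 = min(int(pts[:, 1].max()) + margin, h)
        blocks[organ] = face_image[y0:y1, x0:x1]
    return blocks
```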
  4. The face occlusion detection method according to claim 1, wherein preprocessing the face organ block images, inputting the preprocessed face organ block images into the pre-trained face occlusion detection model for face occlusion detection, and outputting the corresponding mask images comprises:
    padding the face organ block images and resizing the padded images to obtain square block images of a corresponding size; and
    inputting the square block images into the pre-trained face occlusion detection model for face occlusion detection to obtain the corresponding mask images.
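For illustration only, a sketch of the preprocessing step of claim 4: the organ block is padded to a square with a constant border and resized to the model input size, where the 112-pixel size and zero padding are assumptions.

```python
import cv2

def pad_to_square_and_resize(patch, size=112, pad_value=0):
    """Pad an organ block to a square, then resize it to the model input size."""
    h, w = patch.shape[:2]
    side = max(h, w)
    top, left = (side - h) // 2, (side - w) // 2
    padded = cv2.copyMakeBorder(patch, top, side - h - top, left, side - w - left,
                                cv2.BORDER_CONSTANT, value=pad_value)
    return cv2.resize(padded, (size, size))
```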
  5. The face occlusion detection method according to claim 1, wherein performing binarization processing on the mask images to obtain the binarized target mask images comprises:
    performing grayscale processing on the mask images to obtain grayscale images;
    comparing the pixel value of each pixel of the grayscale images with a preset pixel threshold;
    setting the pixel value of a pixel to a preset pixel value when the pixel value of the pixel is higher than the preset pixel threshold; and
    completing the binarization processing of the mask images to obtain the binarized target mask images.
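For illustration only, a sketch of the binarization of claim 5, with 127 as the preset pixel threshold and 255 as the preset pixel value (both assumptions):

```python
import cv2

def binarize_mask(mask_image, threshold=127, preset_value=255):
    """Grayscale the mask image, then threshold it into the target mask image."""
    gray = cv2.cvtColor(mask_image, cv2.COLOR_BGR2GRAY)
    _, target_mask = cv2.threshold(gray, threshold, preset_value, cv2.THRESH_BINARY)
    return target_mask
```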
  6. The face occlusion detection method according to claim 1 or 4, wherein the training method of the face occlusion detection model comprises:
    acquiring face training image samples and occluder samples;
    performing key point detection on the face training image samples to obtain key point information of face organs in the face training image samples;
    performing face organ block segmentation on the face training image samples according to the key point information to obtain corresponding face organ block images;
    randomly adding the occluder samples to preset positions of the face organ block images so as to replace the pixels at the preset positions of the face organ block images with the pixels of the occluder samples, thereby obtaining face occluder training image samples; and
    preprocessing the face occluder training image samples, and inputting the preprocessed face organ block images into the face occlusion detection model to complete the training of the face occlusion detection model.
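For illustration only, a sketch of the training-sample synthesis of claim 6: an occluder sample is pasted over a randomly chosen position of an organ block, replacing the underlying pixels; the placement bounds and the down-scaling of oversized occluders are assumptions.

```python
import random
import cv2

def add_occluder(organ_block, occluder):
    """Return a copy of the organ block with the occluder pasted at a random spot."""
    h, w = organ_block.shape[:2]
    oh, ow = min(occluder.shape[0], h), min(occluder.shape[1], w)
    occ = cv2.resize(occluder, (ow, oh))          # fit the occluder into the block
    y, x = random.randint(0, h - oh), random.randint(0, w - ow)
    sample = organ_block.copy()
    sample[y:y + oh, x:x + ow] = occ              # replace pixels with occluder pixels
    return sample
```

Since the pasted region is known, it can also serve as the ground-truth mask when supervising the model.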
  7. The face occlusion detection method according to claim 1, wherein calculating the occlusion ratio of each face organ according to the pixel values of the target mask images comprises:
    counting, according to the pixel values of the target mask images, the number of pixels with the preset pixel value in each face organ to obtain a total number of occluded pixels; and
    calculating the ratio of the total number of occluded pixels to the total number of pixels of the corresponding face organ to obtain the occlusion ratio of each face organ.
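For illustration only, a sketch of the ratio computation of claim 7 on a binarized target mask, with 255 assumed as the preset pixel value marking occluded pixels:

```python
import numpy as np

def occlusion_ratio(target_mask, preset_value=255):
    """Ratio of occluded pixels to all pixels of the organ's target mask."""
    occluded = np.count_nonzero(target_mask == preset_value)
    return occluded / target_mask.size
```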
  8. A face occlusion detection device, wherein the face occlusion detection device comprises a memory, a processor, and a face occlusion detection program stored in the memory and executable on the processor, and the processor, when executing the face occlusion detection program, implements the following steps:
    acquiring a face image to be detected;
    performing key point detection on the face image to obtain key point information of face organs in the face image;
    performing face organ block segmentation on the face image according to the key point information to obtain corresponding face organ block images;
    preprocessing the face organ block images, inputting the preprocessed face organ block images into a pre-trained face occlusion detection model for face occlusion detection, and outputting corresponding mask images;
    performing binarization processing on the mask images to obtain binarized target mask images; and
    calculating an occlusion ratio of each face organ according to pixel values of the target mask images.
  9. The face occlusion detection device according to claim 8, wherein the processor executes the face occlusion detection program to implement performing key point detection on the face image to obtain the key point information of the face organs in the face image, comprising:
    inputting the face image into a preset key point model for key point detection to obtain a preset number of pieces of key point information of the face image on a two-dimensional plane, wherein the key point information comprises key point coordinates and the sequence numbers corresponding to the key points; and
    determining the key point information of each face organ according to the preset number of pieces of key point information and the position of each face organ in the face image, wherein the face organs comprise the forehead, the left eyebrow, the right eyebrow, the left eye, the right eye, the nose, and the mouth.
  10. The face occlusion detection device according to claim 8, wherein the processor executes the face occlusion detection program to implement performing face organ block segmentation on the face image according to the key point information to obtain the corresponding face organ block images, comprising:
    determining a minimum bounding rectangle corresponding to each face organ according to the key point information and a preset division rule; and
    performing block segmentation on the face image according to the minimum bounding rectangle corresponding to each face organ to obtain the face organ block image corresponding to each face organ.
  11. The face occlusion detection device according to claim 8, wherein the processor executes the face occlusion detection program to implement preprocessing the face organ block images, inputting the preprocessed face organ block images into the pre-trained face occlusion detection model for face occlusion detection, and outputting the corresponding mask images, comprising:
    padding the face organ block images and resizing the padded images to obtain square block images of a corresponding size; and
    inputting the square block images into the pre-trained face occlusion detection model for face occlusion detection to obtain the corresponding mask images.
  12. The face occlusion detection device according to claim 8, wherein the processor executes the face occlusion detection program to implement performing binarization processing on the mask images to obtain the binarized target mask images, comprising:
    performing grayscale processing on the mask images to obtain grayscale images;
    comparing the pixel value of each pixel of the grayscale images with a preset pixel threshold;
    setting the pixel value of a pixel to a preset pixel value when the pixel value of the pixel is higher than the preset pixel threshold; and
    completing the binarization processing of the mask images to obtain the binarized target mask images.
  13. The face occlusion detection device according to claim 8 or 11, wherein the processor executes the face occlusion detection program to implement the training method of the face occlusion detection model, comprising:
    acquiring face training image samples and occluder samples;
    performing key point detection on the face training image samples to obtain key point information of face organs in the face training image samples;
    performing face organ block segmentation on the face training image samples according to the key point information to obtain corresponding face organ block images;
    randomly adding the occluder samples to preset positions of the face organ block images so as to replace the pixels at the preset positions of the face organ block images with the pixels of the occluder samples, thereby obtaining face occluder training image samples; and
    preprocessing the face occluder training image samples, and inputting the preprocessed face organ block images into the face occlusion detection model to complete the training of the face occlusion detection model.
  14. The face occlusion detection device according to claim 8, wherein the processor executes the face occlusion detection program to implement calculating the occlusion ratio of each face organ according to the pixel values of the target mask images, comprising:
    counting, according to the pixel values of the target mask images, the number of pixels with the preset pixel value in each face organ to obtain a total number of occluded pixels; and
    calculating the ratio of the total number of occluded pixels to the total number of pixels of the corresponding face organ to obtain the occlusion ratio of each face organ.
  15. A computer-readable storage medium storing computer instructions which, when run on a computer, cause the computer to perform the following steps:
    acquiring a face image to be detected;
    performing key point detection on the face image to obtain key point information of face organs in the face image;
    performing face organ block segmentation on the face image according to the key point information to obtain corresponding face organ block images;
    preprocessing the face organ block images, inputting the preprocessed face organ block images into a pre-trained face occlusion detection model for face occlusion detection, and outputting corresponding mask images;
    performing binarization processing on the mask images to obtain binarized target mask images; and
    calculating an occlusion ratio of each face organ according to pixel values of the target mask images.
  16. The computer-readable storage medium according to claim 15, wherein the computer instructions are executed to implement performing key point detection on the face image to obtain the key point information of the face organs in the face image, comprising:
    inputting the face image into a preset key point model for key point detection to obtain a preset number of pieces of key point information of the face image on a two-dimensional plane, wherein the key point information comprises key point coordinates and the sequence numbers corresponding to the key points; and
    determining the key point information of each face organ according to the preset number of pieces of key point information and the position of each face organ in the face image, wherein the face organs comprise the forehead, the left eyebrow, the right eyebrow, the left eye, the right eye, the nose, and the mouth.
  17. The computer-readable storage medium according to claim 15, wherein the computer instructions are executed to implement performing face organ block segmentation on the face image according to the key point information to obtain the corresponding face organ block images, comprising:
    determining a minimum bounding rectangle corresponding to each face organ according to the key point information and a preset division rule; and
    performing block segmentation on the face image according to the minimum bounding rectangle corresponding to each face organ to obtain the face organ block image corresponding to each face organ.
  18. The computer-readable storage medium according to claim 15, wherein the computer instructions are executed to implement preprocessing the face organ block images, inputting the preprocessed face organ block images into the pre-trained face occlusion detection model for face occlusion detection, and outputting the corresponding mask images, comprising:
    padding the face organ block images and resizing the padded images to obtain square block images of a corresponding size; and
    inputting the square block images into the pre-trained face occlusion detection model for face occlusion detection to obtain the corresponding mask images.
  19. The computer-readable storage medium according to claim 15, wherein the computer instructions are executed to implement performing binarization processing on the mask images to obtain the binarized target mask images, comprising:
    performing grayscale processing on the mask images to obtain grayscale images;
    comparing the pixel value of each pixel of the grayscale images with a preset pixel threshold;
    setting the pixel value of a pixel to a preset pixel value when the pixel value of the pixel is higher than the preset pixel threshold; and
    completing the binarization processing of the mask images to obtain the binarized target mask images.
  20. A face occlusion detection system, wherein the face occlusion detection system comprises:
    an acquisition module, configured to acquire a face image to be detected;
    a first detection module, configured to perform key point detection on the face image to obtain key point information of face organs in the face image;
    a segmentation module, configured to perform face organ block segmentation on the face image according to the key point information to obtain corresponding face organ block images;
    a second detection module, configured to preprocess the face organ block images, input the preprocessed face organ block images into a pre-trained face occlusion detection model for face occlusion detection, and output corresponding mask images;
    a processing module, configured to perform binarization processing on the mask images to obtain binarized target mask images; and
    a calculation module, configured to calculate an occlusion ratio of each face organ according to pixel values of the target mask images.
PCT/CN2021/082571 2020-12-21 2021-03-24 Face occlusion detection method and system, device, and storage medium WO2022134337A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011520261.8A CN112633144A (en) 2020-12-21 2020-12-21 Face occlusion detection method, system, device and storage medium
CN202011520261.8 2020-12-21

Publications (1)

Publication Number Publication Date
WO2022134337A1 true WO2022134337A1 (en) 2022-06-30

Family

ID=75321947

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/082571 WO2022134337A1 (en) 2020-12-21 2021-03-24 Face occlusion detection method and system, device, and storage medium

Country Status (2)

Country Link
CN (1) CN112633144A (en)
WO (1) WO2022134337A1 (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113111817B (en) * 2021-04-21 2023-06-27 中山大学 Semantic segmentation face integrity measurement method, system, equipment and storage medium
JP2022171150A (en) * 2021-04-30 2022-11-11 パナソニックIpマネジメント株式会社 Information processing device, information processing method, and program
CN113284041B (en) * 2021-05-14 2023-04-18 北京市商汤科技开发有限公司 Image processing method, device and equipment and computer storage medium
CN113221767B (en) * 2021-05-18 2023-08-04 北京百度网讯科技有限公司 Method for training living body face recognition model and recognizing living body face and related device
CN113222973B (en) * 2021-05-31 2024-03-08 深圳市商汤科技有限公司 Image processing method and device, processor, electronic equipment and storage medium
CN113284219A (en) * 2021-06-10 2021-08-20 北京字跳网络技术有限公司 Image processing method, device, equipment and storage medium
CN113469187B (en) * 2021-07-15 2022-08-23 长视科技股份有限公司 Object shielding ratio calculation method and system based on target detection
CN113537054B (en) * 2021-07-15 2022-11-01 重庆紫光华山智安科技有限公司 Face shielding degree calculation method and device, electronic equipment and computer readable storage medium
CN113743195B (en) * 2021-07-23 2024-05-17 北京眼神智能科技有限公司 Face shielding quantitative analysis method and device, electronic equipment and storage medium
CN113505736B (en) * 2021-07-26 2024-09-20 浙江大华技术股份有限公司 Object identification method and device, storage medium and electronic device
CN113723310B (en) * 2021-08-31 2023-09-05 平安科技(深圳)有限公司 Image recognition method and related device based on neural network
CN113747112B (en) * 2021-11-04 2022-02-22 珠海视熙科技有限公司 Processing method and processing device for head portrait of multi-person video conference
CN114399813B (en) * 2021-12-21 2023-09-26 马上消费金融股份有限公司 Face shielding detection method, model training method, device and electronic equipment
CN114093012B (en) * 2022-01-18 2022-06-10 荣耀终端有限公司 Face shielding detection method and detection device
CN114155561B (en) * 2022-02-08 2022-09-09 杭州迪英加科技有限公司 Helicobacter pylori positioning method and device
CN115100714A (en) * 2022-06-27 2022-09-23 平安银行股份有限公司 Living body detection method and device based on face image and server
CN115941529B (en) * 2022-11-28 2024-07-05 国网江苏省电力工程咨询有限公司 Cable tunnel detection method and system based on robot
CN115938023B (en) * 2023-03-15 2023-05-02 深圳市皇家金盾智能科技有限公司 Intelligent door lock face recognition unlocking method and device, medium and intelligent door lock
CN117392015A (en) * 2023-10-16 2024-01-12 深圳牛学长科技有限公司 Image restoration method, device, computer equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108319953B (en) * 2017-07-27 2019-07-16 腾讯科技(深圳)有限公司 Occlusion detection method and device, electronic equipment and the storage medium of target object

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107909065A (en) * 2017-12-29 2018-04-13 百度在线网络技术(北京)有限公司 The method and device blocked for detecting face
CN109840477A (en) * 2019-01-04 2019-06-04 苏州飞搜科技有限公司 Face identification method and device are blocked based on eigentransformation
CN111191616A (en) * 2020-01-02 2020-05-22 广州织点智能科技有限公司 Face shielding detection method, device, equipment and storage medium
CN111428581A (en) * 2020-03-05 2020-07-17 平安科技(深圳)有限公司 Face shielding detection method and system
CN111523480A (en) * 2020-04-24 2020-08-11 北京嘀嘀无限科技发展有限公司 Method and device for detecting face obstruction, electronic equipment and storage medium
CN111814569A (en) * 2020-06-12 2020-10-23 深圳禾思众成科技有限公司 Method and system for detecting human face shielding area

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117727075A (en) * 2023-04-15 2024-03-19 书行科技(北京)有限公司 Face material fusion method, device, equipment and storage medium
CN116311553A (en) * 2023-05-17 2023-06-23 武汉利楚商务服务有限公司 Human face living body detection method and device applied to semi-occlusion image
CN116311553B (en) * 2023-05-17 2023-08-15 武汉利楚商务服务有限公司 Human face living body detection method and device applied to semi-occlusion image

Also Published As

Publication number Publication date
CN112633144A (en) 2021-04-09

Similar Documents

Publication Publication Date Title
WO2022134337A1 (en) Face occlusion detection method and system, device, and storage medium
CN109359575B (en) Face detection method, service processing method, device, terminal and medium
JP6636154B2 (en) Face image processing method and apparatus, and storage medium
WO2020199931A1 (en) Face key point detection method and apparatus, and storage medium and electronic device
EP3916627A1 (en) Living body detection method based on facial recognition, and electronic device and storage medium
WO2018086607A1 (en) Target tracking method, electronic device, and storage medium
WO2022078041A1 (en) Occlusion detection model training method and facial image beautification method
WO2018188453A1 (en) Method for determining human face area, storage medium, and computer device
WO2021051611A1 (en) Face visibility-based face recognition method, system, device, and storage medium
CN109271930B (en) Micro-expression recognition method, device and storage medium
CN108428214B (en) Image processing method and device
CN111428581A (en) Face shielding detection method and system
KR20200118076A (en) Biometric detection method and device, electronic device and storage medium
WO2020248848A1 (en) Intelligent abnormal cell determination method and device, and computer readable storage medium
WO2021051547A1 (en) Violent behavior detection method and system
CN111626163B (en) Human face living body detection method and device and computer equipment
CN111598038B (en) Facial feature point detection method, device, equipment and storage medium
CN110046574A (en) Safety cap based on deep learning wears recognition methods and equipment
US20240013572A1 (en) Method for face detection, terminal device and non-transitory computer-readable storage medium
CN111191521B (en) Face living body detection method and device, computer equipment and storage medium
WO2021139167A1 (en) Method and apparatus for facial recognition, electronic device, and computer readable storage medium
US20210012201A1 (en) Center-biased machine learning techniques to determine saliency in digital images
WO2024001095A1 (en) Facial expression recognition method, terminal device and storage medium
CN111222433A (en) Automatic face auditing method, system, equipment and readable storage medium
CN110008943B (en) Image processing method and device, computing equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21908349

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21908349

Country of ref document: EP

Kind code of ref document: A1