WO2022134337A1 - Method and system for face occlusion detection, device and storage medium - Google Patents

Method and system for face occlusion detection, device and storage medium

Info

Publication number
WO2022134337A1
WO2022134337A1 (PCT/CN2021/082571, CN2021082571W)
Authority
WO
WIPO (PCT)
Prior art keywords
face
image
key point
organ
occlusion detection
Prior art date
Application number
PCT/CN2021/082571
Other languages
English (en)
Chinese (zh)
Inventor
陈丹
陆进
陈斌
刘玉宇
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 filed Critical 平安科技(深圳)有限公司
Publication of WO2022134337A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/171Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/136Segmentation; Edge detection involving thresholding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation

Definitions

  • the present application relates to the technical field of artificial intelligence, and in particular, to a method, system, device and storage medium for face occlusion detection.
  • face recognition and living body detection play a vital role in building access control, financial authentication and other fields, and the occlusion of face images has a direct impact on the results of face recognition and living body detection. Therefore, face occlusion detection is an indispensable link in a face system.
  • the existing face occlusion detection technical solutions mainly follow two directions: one uses traditional methods that distinguish skin from occluders based on hue and texture information and then judge whether the face image is occluded; the other trains deep neural networks.
  • among the deep-learning approaches, the single-task classification method is mainly used to determine whether the entire face is occluded, or the multi-task method is integrated with a detection model to detect the types and positions of the various facial organs and occluders simultaneously, so as to determine the occlusion of the face.
  • the inventor found that the traditional method is affected by the complexity of face features and the diversity of occluders; it is not universal and has weak generalization ability.
  • the single-task classification method cannot be accurate to specific organs, so its deployment scenarios are limited; and when the multi-task method directly locates the occluder, the simultaneous organ detection task is difficult and its accuracy is hard to guarantee.
  • the main purpose of this application is to provide a face occlusion detection method, system, computer device and computer-readable storage medium, which are used to solve the problems in the prior art that traditional methods are not universal and generalize poorly, that the single-task classification method cannot be accurate to specific organs and is limited in deployment scenarios, and that the multi-task method struggles with organ detection and cannot guarantee accuracy.
  • a first aspect of the present application provides a face occlusion detection method, and the face occlusion detection method includes:
  • acquiring a face image to be detected;
  • performing key point detection on the face image to obtain key point information of the face organs in the face image;
  • performing face organ block segmentation on the face image according to the key point information to obtain a corresponding face organ block image;
  • preprocessing the face organ block image, inputting the preprocessed face organ block image into a pre-trained face occlusion detection model to perform face occlusion detection, and outputting a corresponding mask image;
  • performing binarization processing on the mask image to obtain a binarized target mask image;
  • calculating the occlusion ratio of each face organ according to the pixel value of the target mask image.
  • a second aspect of the present application provides a face occlusion detection device; the face occlusion detection device includes a memory, a processor, and a face occlusion detection program stored in the memory and executable on the processor, and the processor implements the following steps when executing the face occlusion detection program:
  • acquiring a face image to be detected;
  • performing key point detection on the face image to obtain key point information of the face organs in the face image;
  • performing face organ block segmentation on the face image according to the key point information to obtain a corresponding face organ block image;
  • preprocessing the face organ block image, inputting the preprocessed face organ block image into a pre-trained face occlusion detection model to perform face occlusion detection, and outputting a corresponding mask image;
  • performing binarization processing on the mask image to obtain a binarized target mask image;
  • calculating the occlusion ratio of each face organ according to the pixel value of the target mask image.
  • a third aspect of the present application provides a computer-readable storage medium, where computer instructions are stored in the computer-readable storage medium, and when the computer instructions are executed on a computer, the computer is caused to perform the following steps:
  • acquiring a face image to be detected;
  • performing key point detection on the face image to obtain key point information of the face organs in the face image;
  • performing face organ block segmentation on the face image according to the key point information to obtain a corresponding face organ block image;
  • preprocessing the face organ block image, inputting the preprocessed face organ block image into a pre-trained face occlusion detection model to perform face occlusion detection, and outputting a corresponding mask image;
  • performing binarization processing on the mask image to obtain a binarized target mask image;
  • calculating the occlusion ratio of each face organ according to the pixel value of the target mask image.
  • a fourth aspect of the present application provides a face occlusion detection system, where the face occlusion detection system includes:
  • the acquisition module is used to acquire the face image to be detected
  • a first detection module configured to perform key point detection on the face image to obtain key point information of the face organs in the face image
  • a segmentation module configured to perform facial organ block segmentation on the face image according to the key point information to obtain a corresponding face organ block image
  • a second detection module configured to preprocess the face organ block image, input the preprocessed face organ block image into a pre-trained face occlusion detection model to perform face occlusion detection, and output the corresponding mask image;
  • a processing module configured to perform binarization processing on the mask image to obtain the binarized target mask image
  • the calculation module is configured to calculate the occlusion ratio of each face organ according to the pixel value of the target mask image.
  • the face occlusion detection method, system, computer device and computer-readable storage medium provided by the present application acquire a face image to be detected; perform key point detection on the face image to obtain key point information of the face organs; perform face organ block segmentation on the face image according to the key point information to obtain corresponding face organ block images; preprocess the face organ block images and input them into a pre-trained face occlusion detection model to perform face occlusion detection and output corresponding mask images; binarize the mask images to obtain target mask images; and calculate the occlusion ratio of each face organ according to the pixel values of the target mask images.
  • the specific occlusion position and the occlusion percentage of each face organ can be accurately calculated; this not only reduces the complexity of face occlusion detection, but also makes the face division accurate to each face organ, which greatly improves the accuracy of face occlusion detection.
  • FIG. 1 is a schematic flowchart of the steps of a face occlusion detection method provided by the present application;
  • FIG. 2 is a schematic flowchart of the refinement of step S200 in FIG. 1 provided by the present application;
  • FIG. 3 is a schematic grayscale image of facial organ block segmentation provided by the present application;
  • FIG. 4 is a schematic flowchart of the refinement of step S300 in FIG. 1 provided by the present application;
  • FIG. 5 is a schematic effect diagram of facial organ block segmentation provided by the present application;
  • FIG. 6 is a schematic flowchart of the refinement of step S400 in FIG. 1 provided by the present application;
  • FIG. 7 is a schematic flowchart of the refinement of step S500 in FIG. 1 provided by the present application;
  • FIG. 8 is a schematic flowchart of the refinement of the training method for the face occlusion detection model in the face occlusion detection method provided by the present application;
  • FIG. 9 is a schematic flowchart of the refinement of step S600 in FIG. 1 provided by the present application;
  • FIG. 10 is a schematic diagram of optional program modules of the face occlusion detection system provided by the present application;
  • FIG. 11 is a schematic diagram of an optional hardware architecture of the computer device provided by the present application.
  • the embodiments of the present application provide a face occlusion detection method, system, device, and storage medium.
  • by treating face organs as blocks to perform pixel-level semantic segmentation, the specific occlusion position and the occlusion percentage of each face organ can be accurately calculated; this not only reduces the complexity of face occlusion detection, but also makes the face division accurate to each face organ, which greatly improves the accuracy of face occlusion detection.
  • referring to FIG. 1, a schematic flowchart of the steps of a face occlusion detection method provided by an embodiment of the present application is shown. It can be understood that the flowcharts in the embodiments of the present application are not used to limit the order of executing steps.
  • the following is an exemplary description with a computer device as the execution subject; the computer device may include mobile terminals such as smart phones, tablet personal computers and laptop computers, as well as fixed terminals such as desktop computers. The details are as follows:
  • Step S100 acquiring a face image to be detected.
  • the face image to be detected can be obtained by taking a photo of the face with a camera device, capturing a face with a video monitoring device, or collecting images with a web crawler.
  • Step S200 performing key point detection on the face image to obtain key point information of the face organs in the face image.
  • key point detection is performed by inputting the face image to be detected into a preset key point model, and corresponding key point information is obtained, thereby determining the key point information of the face organs.
  • step S200 may include:
  • Step S201, inputting the face image into a preset key point model to perform the key point detection, to obtain a preset number of key point information on the two-dimensional plane of the face image, wherein the key point information includes the coordinates of the key points and the serial numbers corresponding to the key points;
  • Step S202, determining the key point information of each face organ according to the preset number of key point information and the position of each face organ in the face image, wherein the face organs include the forehead, left eyebrow, right eyebrow, left eye, right eye, nose and mouth.
  • the face image to be detected is input into a preset key point model for key point detection and calibration; 68 key points are marked on the face image to be detected, together with their corresponding serial numbers, and the corresponding key point information is obtained to determine the coordinate point information of the corresponding face organs.
  • FIG. 3 is a schematic grayscale image of facial organ block (Patch) segmentation.
  • the serial numbers corresponding to the coordinates of the key points are 36, 37, 38, 39, 40, and 41, respectively, and the area enclosed by the coordinates of the key points represents the left eye.
  • the serial numbers corresponding to the key point coordinates of the left eyebrow are 17, 18, 19, 20, and 21, respectively, and the serial numbers corresponding to the key point coordinates of the right eyebrow are 22, 23, 24, 25, and 26.
  • the horizontal line on which the two points with serial numbers 19 and 24 lie is taken as the lower boundary of the forehead; extending upward by one-fifth of the face frame height gives the upper boundary of the forehead; and the vertical lines corresponding to serial numbers 17 and 26 form the left and right boundaries, so that the enclosed rectangular area is taken as the forehead.
  • the height of the face frame is the distance between the topmost point among the eyebrow key point coordinates and the bottommost point among the face contour key point coordinates.
  • the human cheek can also be divided by the 68 key point information.
  • the serial numbers corresponding to the key point coordinates are 1, 2, 3, 4, 5, 6, 7, 31, 40, 41 and 48, and the area enclosed by these 11 key points is the left cheek.
  • the face contour can also be divided by the 68 key point information.
  • the serial numbers corresponding to the key point coordinates are 0, 1, 2, …, 26, and the area surrounded by these 27 key points is the face contour; a code sketch of these groupings follows below.
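  • The snippet below is a minimal illustration of the organ groupings just described, assuming the standard 68-point landmark convention referenced above (e.g., as produced by dlib's shape predictor) with landmarks supplied as a (68, 2) array of (x, y) coordinates; ORGAN_POINTS and organ_keypoints are illustrative names, not part of the original disclosure.

```python
import numpy as np

# Index groups per the serial numbers described above (68-point convention).
ORGAN_POINTS = {
    "left_eye":      list(range(36, 42)),   # serial numbers 36-41
    "left_eyebrow":  list(range(17, 22)),   # serial numbers 17-21
    "right_eyebrow": list(range(22, 27)),   # serial numbers 22-26
    "left_cheek":    [1, 2, 3, 4, 5, 6, 7, 31, 40, 41, 48],
    "face_contour":  list(range(0, 27)),    # serial numbers 0-26
}

def organ_keypoints(landmarks: np.ndarray) -> dict:
    """Split a (68, 2) array of key point coordinates into per-organ arrays."""
    return {name: landmarks[idx] for name, idx in ORGAN_POINTS.items()}
```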
  • the embodiment of the present application obtains the key point information of the face image by performing key point detection on the face image, thereby accurately obtaining the corresponding face organs.
  • Step S300 according to the key point information, perform face organ Patch segmentation on the face image to obtain a corresponding face organ Patch image.
  • in this embodiment, Patch segmentation is performed on the face image, and the minimum circumscribed rectangular area containing each face organ is taken to obtain the corresponding face organ Patch image.
  • the step S300 may include:
  • Step S301 according to the key point information and a preset division rule, determine the minimum circumscribed rectangle corresponding to each face organ.
  • Step S302 according to the minimum circumscribed rectangle corresponding to each face organ, perform Patch segmentation on the face image to obtain a face organ Patch image corresponding to each face organ.
  • a set of division rules is designed, and the rules are as follows: the specific position of each face organ is determined according to the area enclosed by the key point coordinates and the serial numbers corresponding to the key points. Since polygon calculation is relatively redundant and contributes little to discriminating occlusion, the minimum circumscribed rectangle of each face organ is determined from its uppermost, lowermost, leftmost and rightmost coordinate points, and this rectangle is extracted as the face organ Patch image for calculation.
  • the serial numbers corresponding to the key point coordinates are 36, 37, 38, 39, 40, 41, respectively, and the area surrounded by the key point coordinates represents the left eye.
  • the human cheek can also be divided by the 68 key point information.
  • the serial numbers corresponding to the key point coordinates are 1, 2, 3, 4, 5, 6, 7, 31, 40, 41 and 48, and the area enclosed by these 11 key point coordinates represents the left cheek.
  • the smallest rectangle that can contain the left cheek is taken as the left cheek Patch.
  • the face contour can also be divided by the 68 key point information, and the smallest rectangle is taken as the face contour Patch through the serial numbers 0, 8, 16, 19, and 24 corresponding to the key point coordinates.
  • in this way, the computational complexity is reduced compared with traditional polygon calculation, and the occlusion ratio of each face organ is more convenient to calculate; a minimal cropping sketch follows.
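  • A minimal sketch of the minimum-circumscribed-rectangle cropping described above, assuming key points are given as (x, y) pixel coordinates; min_bounding_rect and crop_patch are illustrative names.

```python
import numpy as np

def min_bounding_rect(points: np.ndarray):
    """Axis-aligned minimum circumscribed rectangle (x, y, w, h) determined
    by the uppermost, lowermost, leftmost and rightmost key points."""
    x1, y1 = points.min(axis=0)
    x2, y2 = points.max(axis=0)
    return int(x1), int(y1), int(x2 - x1), int(y2 - y1)

def crop_patch(image: np.ndarray, points: np.ndarray) -> np.ndarray:
    """Extract the face organ Patch image from the face image."""
    x, y, w, h = min_bounding_rect(points)
    return image[y:y + h, x:x + w]
```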
  • Step S400, preprocessing the face organ Patch image, inputting the preprocessed face organ Patch image into a pre-trained face occlusion detection model to perform face occlusion detection, and outputting the corresponding mask (Mask) image.
  • FIG. 5 is a schematic effect diagram of a face organ Patch segmentation.
  • the left side of Figure 5 is the preprocessed face input image, and the right side is the mask image output by the face occlusion detection model, in which the black part is the background and the white part is the face area.
  • the step S400 may include:
  • Step S401 Fill the face organ Patch image and adjust the size of the filled image to obtain a square Patch image of the corresponding size.
  • Step S402 Input the square patch image into the pre-trained face occlusion detection model to perform face occlusion detection to obtain the corresponding mask image.
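  • A minimal sketch of the preprocessing in steps S401-S402, assuming OpenCV is available and the target size is the 128*128 used by the model described below; preprocess_patch is an illustrative name.

```python
import cv2
import numpy as np

def preprocess_patch(patch: np.ndarray, size: int = 128) -> np.ndarray:
    """Zero-pad the organ Patch to a square, then resize to size x size."""
    h, w = patch.shape[:2]
    side = max(h, w)
    top = (side - h) // 2
    left = (side - w) // 2
    squared = cv2.copyMakeBorder(patch, top, side - h - top,
                                 left, side - w - left,
                                 cv2.BORDER_CONSTANT, value=0)
    return cv2.resize(squared, (size, size))
```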
  • Table 1 is the network structure table of the face occlusion detection model.
  • the square Patch image first passes through the left half of the face occlusion detection model, namely the first to fourth layers, for feature extraction; this is the downsampling stage. It then passes through the right half of the model, namely layers 5, 7 and 10, for the upsampling stage, which involves the fusion of feature maps of different scales. The fusion method is as shown in Table 1: the Concat operation is used to accumulate the thickness (channel depth) of the feature maps. The last layer is a filter of size 1*1*128 with depth 1, and the face occlusion detection model outputs a mask image with a size of 128*128.
  • in the embodiment of the present application, the face organ Patch image is preprocessed and input into the face occlusion detection model, and the mask image of the face organ is then obtained through operations such as feature extraction, image fusion and convolution, so that face organs and skin are accurately separated from occluders, which makes the calculated occlusion ratio of the face organs more accurate.
  • Step S500 performing binarization processing on the mask image to obtain the binarized target mask image.
  • the mask image is first subjected to grayscale processing to obtain a corresponding grayscale image, and then the obtained grayscale image is subjected to binarization processing according to a preset pixel threshold to obtain the binarized target mask image.
  • step S500 may include:
  • Step S501 performing grayscale processing on the mask image to obtain a grayscale image
  • Step S502 comparing the pixel value of each pixel of the grayscale image with a preset pixel threshold
  • Step S503 when the pixel value of the pixel point is higher than the preset pixel threshold, the pixel value of the pixel point is set to a preset pixel value;
  • Step S504 complete the binarization processing of the mask image, and obtain the binarized target mask image.
  • the mask image is binarized: each pixel of the mask image lies between 0 and 1, the preset pixel threshold is set to 0.75, pixels larger than the preset pixel threshold are set to 1 (representing an occluded area), and other pixels are set to 0 (representing a non-occluded area), to obtain the binarized target mask image.
  • the preset pixel threshold can be freely set according to the actual situation, which is not limited here.
  • the binarized target mask image is obtained by performing a binarization process on the mask image, so that the target face region in the image is distinguished from the background, and the result of the model is more accurate.
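  • A minimal sketch of the binarization in steps S501-S504, assuming the model's mask output has pixel values in [0, 1] and using the preset pixel threshold of 0.75 mentioned above; binarize_mask is an illustrative name.

```python
import numpy as np

def binarize_mask(mask: np.ndarray, threshold: float = 0.75) -> np.ndarray:
    """Binarize the mask: pixels above the preset threshold become 1
    (occluded area), all others become 0 (non-occluded area)."""
    if mask.ndim == 3:            # collapse channels to grayscale if present
        mask = mask.mean(axis=2)
    return (mask > threshold).astype(np.uint8)
```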
  • FIG. 8 is an exemplary flowchart of the steps of the training method of the face occlusion detection model.
  • the training method of the face occlusion detection model includes:
  • Step S511 obtaining face training image samples and occluder samples
  • Step S512 performing key point detection on the face training image sample to obtain key point information of the face organs in the face training image sample;
  • Step S513 according to the key point information, perform face organ Patch segmentation on the face training image sample to obtain a corresponding face organ Patch image;
  • Step S514, randomly adding the occluder sample to a preset position of the face organ Patch image, to replace the pixels at the preset position of the face organ Patch image with the pixels of the occluder sample, obtaining face occluder training image samples;
  • Step S515 preprocessing the face occlusion training image sample, and inputting the preprocessed face organ Patch image into the face occlusion detection model to complete the training of the face occlusion detection model.
  • specifically, key point detection is performed on the face training image sample through a key point model to obtain the key point information of the face organs in the face training image sample; then, according to the key point information, face organ Patch segmentation is performed on the face training image sample to obtain a corresponding face organ Patch image; and the occluder sample is randomly added to a preset position of the face organ Patch image, so that the pixels at the preset position of the face organ Patch image are replaced with the pixels of the occluder sample, and the face occluder training image samples are obtained.
  • the occluder samples are collected by web crawlers or captured and extracted manually, and include fingers, pens, fans, cups, masks, cosmetics, microphones, and the like.
  • the coordinates on the two-dimensional plane of the region where the occluder sample is added to the face training image sample are [x1:x2, y1:y2], where x1, x2 and y1, y2 correspond to the horizontal and vertical coordinates of the face organ in the mask image. An all-zero matrix L with a size of 128*128 is first initialized, and then all the pixels in the [x1:x2, y1:y2] area are modified to 1; the modified matrix is the supervision label used in training. A sketch of this augmentation and label construction follows.
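  • A minimal sketch of the occluder augmentation and supervision-label construction just described, assuming the Patch has already been preprocessed to 128*128 and the occluder fits inside it; add_occluder is an illustrative name. Note that numpy indexes rows (y) first, so the [x1:x2, y1:y2] region in the text is written as [y1:y2, x1:x2] here.

```python
import numpy as np

def add_occluder(patch: np.ndarray, occluder: np.ndarray,
                 x1: int, y1: int):
    """Paste the occluder sample into the Patch and build the 128*128
    supervision label: 1 inside the occluded region, 0 elsewhere."""
    h, w = occluder.shape[:2]
    x2, y2 = x1 + w, y1 + h
    sample = patch.copy()
    sample[y1:y2, x1:x2] = occluder                  # replace with occluder pixels
    label = np.zeros((128, 128), dtype=np.float32)   # all-zero matrix L
    label[y1:y2, x1:x2] = 1.0                        # occluded region set to 1
    return sample, label
```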
  • the face occlusion detection model is trained with the segmentation loss function IOU Loss, so that the pixel values of the model's output for the face organ Patch image approach the pixel values at the corresponding positions of the matrix L (the supervision label); that is, the pixel values of areas with occluders approach 1 and the pixel values of other areas approach 0. The gradient descent method commonly used in deep learning is then applied for training until the face occlusion detection model converges, that is, until the Loss value no longer decreases. When the pixel values of the mask image output by the face occlusion detection model are infinitely close to the pixel values of the supervision label, the training is completed.
  • the Loss function is the commonly used segmentation loss function IOU loss, which is calculated from the mask image and the matrix L; a sketch of a common formulation follows.
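  • The IOU Loss itself is not written out in the text; the snippet below sketches one common soft-IoU formulation under the assumption of a PyTorch training setup, with pred being the model's output mask and label the supervision matrix L.

```python
import torch

def iou_loss(pred: torch.Tensor, label: torch.Tensor,
             eps: float = 1e-6) -> torch.Tensor:
    """Soft IoU segmentation loss: 1 - |pred ∩ label| / |pred ∪ label|.
    Minimizing it pushes occluded pixels toward 1 and others toward 0."""
    inter = (pred * label).sum()
    union = pred.sum() + label.sum() - inter
    return 1.0 - inter / (union + eps)
```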
  • in the embodiment of the present application, various types of occluders are randomly added to random face areas of the face training image samples, and a large number of face occlusion training image samples are then input into the face occlusion detection model for training, so that the face occlusion detection model becomes increasingly sensitive to occlusions and can detect arbitrary occluders.
  • Step S600 Calculate the occlusion ratio of each face organ according to the pixel value of the target mask image.
  • the pixel value of the target mask image is compared with the preset pixel threshold, and all points higher than the preset pixel threshold are counted, and then the occlusion ratio of each face organ is calculated.
  • step S600 may include:
  • Step S601, counting, according to the pixel values of the target mask image, the number of pixels with the preset pixel value in each face organ, to obtain the total number of occluded pixels;
  • Step S602, calculating, according to the total number of occluded pixels, the ratio of the total number of occluded pixels to the total number of pixels of the corresponding face organ, to obtain the occlusion ratio of each face organ.
  • specifically, the proportion of pixels reaching the preset pixel value in the mask image region corresponding to each face organ Patch image is calculated; this proportion is the face organ occlusion percentage.
  • the formula for calculating the organ occlusion percentage is as follows:

$$ \text{ratio} = \frac{\sum_{i=x_1}^{x_1+h} \sum_{j=y_1}^{y_1+w} \delta_{ij}}{h \times w} $$

  • where x1 and y1 are the coordinates of the upper left corner of the face organ in the mask image, h and w are the height and width of the face organ in the mask image, and δij denotes the pixel value of the binarized mask image at position (i, j): δij takes 1 if the pixel at coordinate (i, j) in the mask image is 1, and 0 otherwise. A code sketch of this calculation follows.
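  • A minimal sketch of the occlusion-ratio formula above, applied to the binarized target mask; occlusion_ratio is an illustrative name.

```python
import numpy as np

def occlusion_ratio(target_mask: np.ndarray,
                    x1: int, y1: int, h: int, w: int) -> float:
    """Fraction of occluded (value 1) pixels inside the organ rectangle
    whose upper left corner is (x1, y1) with height h and width w."""
    region = target_mask[y1:y1 + h, x1:x1 + w]
    return float(region.sum()) / float(h * w)
```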
  • to sum up, the key point information of the corresponding face organs is obtained by performing key point detection on the face image; Patch segmentation is then performed on the face organs to obtain the corresponding face organ Patch images; after preprocessing, these are input into the pre-trained face occlusion detection model for face occlusion detection to obtain the corresponding mask images; and finally the occlusion ratio of each face organ is obtained by calculation. This not only reduces the complexity of face occlusion detection, but also makes the face division accurate to each face organ, which greatly improves the accuracy of face occlusion detection.
  • referring to FIG. 10, a schematic diagram of the program modules of a face occlusion detection system 700 according to an embodiment of the present application is shown.
  • the face occlusion detection system 700 can be applied to computer equipment, and the computer equipment can be a mobile phone, a tablet personal computer, a laptop computer, or other equipment with a data transmission function.
  • the face occlusion detection system 700 may include or be divided into one or more program modules, which are stored in a storage medium and executed by one or more processors to complete the embodiments of the present application and implement the above-mentioned face occlusion detection system 700.
  • the program modules referred to in the embodiments of the present application refer to a series of computer program instruction segments capable of completing specific functions, and are more suitable for describing the execution process of the face occlusion detection system 700 in the storage medium than the programs themselves.
  • the face occlusion detection system 700 includes an acquisition module 701, a first detection module 702, a segmentation module 703, a second detection module 704, a processing module 705 and a calculation module 706. The following description will specifically introduce the functions of each program module in the embodiments of the present application:
  • the acquiring module 701 is used for acquiring the face image to be detected.
  • the acquisition module 701 acquires the face image to be detected by taking a photo of the face with a camera device, capturing a face with a video monitoring device, or collecting images with a web crawler.
  • the first detection module 702 is configured to perform key point detection on the face image to obtain key point information of the face organs in the face image.
  • the first detection module 702 performs key point detection by inputting the face image to be detected into a preset key point model to obtain corresponding key point information, thereby determining the key points of the face organs information.
  • the first detection module 702 is specifically configured to:
  • the face image to be detected is input into a preset key point model for key point detection and calibration; 68 key points are marked on the face image to be detected, together with their corresponding serial numbers, and the corresponding key point information is obtained to determine the coordinate point information of the corresponding face organs.
  • FIG. 3 is a schematic grayscale image of a face organ Patch segmentation.
  • the serial numbers corresponding to the coordinates of the key points are 36, 37, 38, 39, 40, and 41, respectively, and the area enclosed by the coordinates of the key points represents the left eye.
  • the serial numbers corresponding to the key point coordinates of the left eyebrow are 17, 18, 19, 20, and 21, respectively, and the serial numbers corresponding to the key point coordinates of the right eyebrow are 22, 23, 24, 25, and 26.
  • the horizontal line on which the two points with serial numbers 19 and 24 lie is taken as the lower boundary of the forehead; extending upward by one-fifth of the face frame height gives the upper boundary of the forehead; and the vertical lines corresponding to serial numbers 17 and 26 form the left and right boundaries, so that the enclosed rectangular area is taken as the forehead.
  • the height of the face frame is the distance between the topmost point among the eyebrow key point coordinates and the bottommost point among the face contour key point coordinates.
  • the human cheek can also be divided by the 68 key point information.
  • the serial numbers corresponding to the key point coordinates are 1, 2, 3, 4, 5, 6, 7, 31, 40, 41 and 48, and the area enclosed by these 11 key points is the left cheek.
  • the face contour can also be divided by the 68 key point information.
  • the serial numbers corresponding to the key point coordinates are 0, 1, 2, …, 26, and the area surrounded by these 27 key points is the face contour.
  • the key point information of the face image is obtained by performing key point detection on the face image, thereby accurately obtaining the corresponding face organs.
  • the segmentation module 703 is configured to perform face organ Patch segmentation on the face image according to the key point information to obtain a corresponding face organ Patch image.
  • the segmentation module 703 performs Patch segmentation on the face image according to the key point information detected by the key point model and the preset division rules, and takes the smallest circumscribed rectangular area containing each face organ to obtain the corresponding human face Face Organ Patch Image.
  • the segmentation module 703 is specifically used for:
  • Patch segmentation is performed on the face image according to the minimum circumscribed rectangle corresponding to each face organ to obtain a face organ Patch image corresponding to each face organ.
  • the segmentation module 703 designs a set of division rules according to the key point information, and the rules are as follows: the specific position of each face organ is determined according to the area enclosed by the key point coordinates and the serial numbers corresponding to the key points. Since polygon calculation is relatively redundant and contributes little to discriminating occlusion, the minimum circumscribed rectangle of each face organ is determined from its uppermost, lowermost, leftmost and rightmost coordinate points, and this rectangle is extracted as the face organ Patch image for calculation.
  • the serial numbers corresponding to the key point coordinates are 36, 37, 38, 39, 40, 41, respectively, and the area surrounded by the key point coordinates represents the left eye.
  • the human cheek can also be divided by the 68 key point information.
  • the serial numbers corresponding to the key point coordinates are 1, 2, 3, 4, 5, 6, 7, 31, 40, 41 and 48, and the area enclosed by the coordinates of these 11 key points represents the left cheek.
  • the smallest rectangle that can contain the left cheek is taken as the left cheek Patch.
  • the face contour can also be divided by the 68 key point information, and the smallest rectangle is taken as the face contour Patch through the serial numbers 0, 8, 16, 19, and 24 corresponding to the key point coordinates.
  • in this way, the computational complexity is reduced compared with traditional polygon calculation, and the occlusion ratio of each face organ is more convenient to calculate.
  • the second detection module 704 is configured to preprocess the face organ Patch image, and input the preprocessed face organ Patch image into a pre-trained face occlusion detection model to perform face occlusion detection, And output the corresponding mask image.
  • the second detection module 704 first preprocesses the divided face organ Patch images to obtain images usable by the face occlusion detection model; after the face occlusion detection model is pre-trained, the preprocessed images are input into the face occlusion detection model for face occlusion detection, so as to output the corresponding mask images.
  • FIG. 5 is a schematic effect diagram of a face organ Patch segmentation.
  • the left side of Figure 5 is the preprocessed face input image, and the right side is the mask image output by the face occlusion detection model, in which the black part is the background and the white part is the face area.
  • the second detection module 704 is specifically configured to:
  • filling the face organ Patch image and resizing the filled image to obtain a square Patch image of the corresponding size;
  • the second detection module 704 pads the face organ Patch image area with zeros (Padding) into a square, and then calls the resize function to adjust the face organ Patch image area to 128*128, obtaining a 128*128 square Patch image.
  • Table 1 is the network structure table of the face occlusion detection model.
  • the square Patch image first passes through the left half of the face occlusion detection model, namely the first to fourth layers, for feature extraction; this is the downsampling stage. It then passes through the right half of the model, namely layers 5, 7 and 10, for the upsampling stage, which involves the fusion of feature maps of different scales. The fusion method is as shown in Table 1: the Concat operation is used to accumulate the thickness (channel depth) of the feature maps. The last layer is a filter of size 1*1*128 with depth 1.
  • the face occlusion detection model outputs a mask image with a size of 128*128.
  • the second detection module 704 preprocesses the face organ Patch image and inputs it into the face occlusion detection model, and then obtains the mask image of the face organ through operations such as feature extraction, image fusion and convolution, so as to accurately distinguish face organs and skin from occluders, making the calculation of the occlusion ratio of the face organs more accurate.
  • the processing module 705 is configured to perform binarization processing on the mask image to obtain the binarized target mask image.
  • the processing module 705 first performs grayscale processing on the mask image to obtain a corresponding grayscale image, and then performs binarization processing on the obtained grayscale image according to a preset pixel threshold to obtain the binarized target mask image.
  • the processing module 705 is specifically configured to:
  • perform grayscale processing on the mask image to obtain a grayscale image;
  • compare the pixel value of each pixel of the grayscale image with a preset pixel threshold;
  • when the pixel value of a pixel is higher than the preset pixel threshold, set the pixel value of the pixel to a preset pixel value;
  • complete the binarization processing of the mask image to obtain the binarized target mask image.
  • specifically, the processing module 705 performs binarization processing on the mask image so that each pixel of the mask image lies between 0 and 1; the preset pixel threshold is set to 0.75, pixels larger than the preset pixel threshold are set to 1 (representing an occluded area), and other pixels are set to 0 (representing a non-occluded area), to obtain the binarized target mask image.
  • the preset pixel threshold can be freely set according to the actual situation, which is not limited here.
  • the binarized target mask image is obtained by performing a binarization process on the mask image, so that the target face region in the image is distinguished from the background, and the result of the model is more accurate.
  • the face occlusion detection system 700 further includes a training module of the face occlusion detection model, which is configured to:
  • obtain face training image samples and occluder samples;
  • perform key point detection on the face training image samples to obtain key point information of the face organs in the face training image samples;
  • perform face organ Patch segmentation on the face training image samples according to the key point information to obtain corresponding face organ Patch images;
  • randomly add the occluder samples to preset positions of the face organ Patch images, to replace the pixels at the preset positions with the pixels of the occluder samples, obtaining face occluder training image samples;
  • preprocess the face occluder training image samples, and input the preprocessed face organ Patch images into the face occlusion detection model to complete the training of the face occlusion detection model.
  • specifically, the training module of the face occlusion detection model performs key point detection on the face training image sample through a key point model to obtain the key point information of the face organs in the face training image sample; then, according to the key point information, face organ Patch segmentation is performed on the face training image sample to obtain a corresponding face organ Patch image; and the occluder sample is randomly added to a preset position of the face organ Patch image, so that the pixels at the preset position are replaced with the pixels of the occluder sample, and the face occluder training image sample is obtained.
  • the occluder samples are collected by web crawlers or captured and extracted manually, and include fingers, pens, fans, cups, masks, cosmetics, microphones, and the like.
  • the coordinates on the two-dimensional plane of the region where the occluder sample is added to the face training image sample are [x1:x2, y1:y2], where x1, x2 and y1, y2 correspond to the horizontal and vertical coordinates of the face organ in the mask image. An all-zero matrix L with a size of 128*128 is first initialized, and then all the pixels in the [x1:x2, y1:y2] area are modified to 1.
  • the modified matrix is the supervision label used in training.
  • the face occlusion detection model is trained with the segmentation loss function IOU Loss, so that the pixel values of the model's output for the face organ Patch image approach the pixel values at the corresponding positions of the matrix L (the supervision label); that is, the pixel values of areas with occluders approach 1 and the pixel values of other areas approach 0. The gradient descent method commonly used in deep learning is then applied for training until the face occlusion detection model converges, that is, until the Loss value no longer decreases. When the pixel values of the mask image output by the face occlusion detection model are infinitely close to the pixel values of the supervision label, the training is completed.
  • the Loss function is the commonly used segmentation loss function IOU loss, which is calculated from the mask image and the matrix L.
  • the face occlusion detection system 700 randomly adds various types of occluders to random face regions of the face training image samples, and then inputs a large number of face occlusion training image samples into the face occlusion detection model for training, making the model increasingly sensitive to occlusions so that it can detect arbitrary occluders.
  • the calculation module 706 is configured to calculate the occlusion ratio of each face organ according to the pixel value of the target mask image.
  • the pixel value of the target mask image is compared with the preset pixel threshold, and all points higher than the preset pixel threshold are counted, and then the occlusion ratio of each face organ is calculated.
  • the computing module 706 is specifically configured to:
  • count, according to the pixel values of the target mask image, the number of pixels with the preset pixel value in each face organ, to obtain the total number of occluded pixels;
  • calculate the ratio of the total number of occluded pixels to the total number of pixels of the corresponding face organ, to obtain the occlusion ratio of each face organ.
  • specifically, the calculation module 706 calculates, according to the pixel values of the target mask image, the proportion of pixels reaching the preset pixel value in the mask image region corresponding to each face organ Patch image; this proportion is the occlusion percentage of the face organ.
  • the formula for calculating the organ occlusion percentage is as follows:

$$ \text{ratio} = \frac{\sum_{i=x_1}^{x_1+h} \sum_{j=y_1}^{y_1+w} \delta_{ij}}{h \times w} $$

  • where x1 and y1 are the coordinates of the upper left corner of the face organ in the mask image, h and w are the height and width of the face organ in the mask image, and δij denotes the pixel value of the binarized mask image at position (i, j): δij takes 1 if the pixel at coordinate (i, j) in the mask image is 1, and 0 otherwise.
  • the face occlusion detection system 700 obtains the key point information of the corresponding face organs by performing key point detection on the face image, performs Patch segmentation on the face organs to obtain the corresponding face organ Patch images, inputs the preprocessed images into the pre-trained face occlusion detection model for face occlusion detection to obtain the corresponding mask images, and finally calculates the corresponding occlusion ratio of each face organ. This not only reduces the complexity of face occlusion detection, but also makes the face division accurate to each face organ, which greatly improves the accuracy of face occlusion detection.
  • referring to FIG. 11, an embodiment of the present application further provides a schematic diagram of a hardware architecture of a computer device 800.
  • the computer device 800 may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a rack server, a blade server, a tower server, or a server cluster (composed of an independent server, or of multiple servers) capable of executing programs. The computer device 800 is a device that can automatically perform numerical calculation and/or information processing according to preset or stored instructions.
  • the computer device 800 at least includes, but is not limited to, a memory 801, a processor 802, and a network interface 803 that can communicate with each other through a system bus, wherein:
  • the memory 801 includes at least one type of computer-readable storage medium, and the readable storage medium includes a flash memory, a hard disk, a multimedia card, a card-type memory (for example, SD or DX memory), a random access memory (RAM), a static random access memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like.
  • the memory 801 may be an internal storage unit of the computer device 800 , such as a hard disk or a memory of the computer device 800 .
  • the memory 801 may also be an external storage device of the computer device 800, for example, a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the computer device 800.
  • the memory 801 may also include both the internal storage unit of the computer device 800 and its external storage device.
  • the memory 801 is generally used to store the operating system installed in the computer device 800 and various application software, such as the program code of the face occlusion detection system 700.
  • the memory 801 can also be used to temporarily store various types of data that have been output or will be output.
  • in some embodiments, the processor 802 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip.
  • the processor 802 is generally used to control the overall operation of the computer device 800 .
  • the processor 802 is configured to run the program code or process the data stored in the memory 801, for example, to run the program code of the face occlusion detection system 700, so as to implement the face occlusion detection method described above.
  • the network interface 803 may include a wireless network interface or a wired network interface, and the network interface 803 is generally used to establish a communication connection between the computer device 800 and other electronic devices.
  • the network interface 803 is used to connect the computer device 800 with an external terminal through a network, and establish a data transmission channel and a communication connection between the computer device 800 and the external terminal.
  • the network may be a wireless or wired network such as an intranet (Intranet), the Internet, a Global System for Mobile communications (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, or Wi-Fi.
  • FIG. 11 only shows a computer device 800 having components 801-803, but it should be understood that implementation of all shown components is not required, and that more or fewer components may be implemented instead.
  • the face occlusion detection system 700 stored in the memory 801 may also be divided into one or more program modules, and the one or more program modules are stored in the memory 801 and are configured by One or more processors (the processor 802 in this embodiment) are executed to complete the face occlusion detection method of the present application.
  • embodiments of the present application further provide a computer-readable storage medium, which may be non-volatile or volatile, such as a flash memory, a hard disk, a multimedia card, a card-type memory (for example, SD or DX memory), a random access memory (RAM), a static random access memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disc, a server, an application store, etc.; a computer program is stored thereon, and when the program is executed by the processor, the corresponding function is realized.
  • the computer-readable storage medium of the embodiment of the present application is used to store the face occlusion detection system 700, so as to implement the face occlusion detection method of the present application when executed by the processor.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present application relates to the field of artificial intelligence and concerns a face occlusion detection method and system. The method comprises: acquiring a face image to be detected; performing key point detection on the face image to obtain key point information of face organs in the face image; according to the key point information, performing face organ block segmentation on the face image to obtain corresponding face organ block images; preprocessing the face organ block images, inputting the preprocessed face organ block images into a pre-trained face occlusion detection model to perform face occlusion detection, and outputting corresponding mask images; performing binarization processing on the mask images to obtain binarized target mask images; and calculating the occlusion ratios of the various face organs according to the pixel values of the target mask images. According to the present application, the occlusion percentages corresponding to the various face organs can be accurately calculated, which considerably improves the accuracy of face occlusion detection.
PCT/CN2021/082571 2020-12-21 2021-03-24 Method and system for face occlusion detection, device and storage medium WO2022134337A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011520261.8A CN112633144A (zh) 2020-12-21 2020-12-21 人脸遮挡检测方法、系统、设备及存储介质
CN202011520261.8 2020-12-21

Publications (1)

Publication Number Publication Date
WO2022134337A1 (fr)

Family

ID=75321947

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/082571 WO2022134337A1 (fr) 2020-12-21 2021-03-24 Method and system for face occlusion detection, device and storage medium

Country Status (2)

Country Link
CN (1) CN112633144A (fr)
WO (1) WO2022134337A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116311553A (zh) * 2023-05-17 2023-06-23 武汉利楚商务服务有限公司 应用于半遮挡图像下的人脸活体检测方法及装置

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113111817B (zh) * 2021-04-21 2023-06-27 中山大学 语义分割的人脸完整度度量方法、系统、设备及存储介质
JP2022171150A (ja) * 2021-04-30 2022-11-11 パナソニックIpマネジメント株式会社 情報処理装置、情報処理方法、及び、プログラム
CN113284041B (zh) * 2021-05-14 2023-04-18 北京市商汤科技开发有限公司 一种图像处理方法、装置、设备及计算机存储介质
CN113221767B (zh) * 2021-05-18 2023-08-04 北京百度网讯科技有限公司 训练活体人脸识别模型、识别活体人脸的方法及相关装置
CN113222973B (zh) * 2021-05-31 2024-03-08 深圳市商汤科技有限公司 图像处理方法及装置、处理器、电子设备及存储介质
CN113284219A (zh) * 2021-06-10 2021-08-20 北京字跳网络技术有限公司 图像处理方法、装置、设备及存储介质
CN113537054B (zh) * 2021-07-15 2022-11-01 重庆紫光华山智安科技有限公司 人脸遮挡程度计算方法、装置、电子设备及计算机可读存储介质
CN113469187B (zh) * 2021-07-15 2022-08-23 长视科技股份有限公司 基于目标检测的物体遮挡比例计算方法与系统
CN113743195B (zh) * 2021-07-23 2024-05-17 北京眼神智能科技有限公司 人脸遮挡定量分析方法、装置、电子设备及存储介质
CN113505736A (zh) * 2021-07-26 2021-10-15 浙江大华技术股份有限公司 对象的识别方法及装置、存储介质、电子装置
CN113723310B (zh) * 2021-08-31 2023-09-05 平安科技(深圳)有限公司 基于神经网络的图像识别方法及相关装置
CN113747112B (zh) * 2021-11-04 2022-02-22 珠海视熙科技有限公司 一种多人视频会议头像的处理方法及处理装置
CN114399813B (zh) * 2021-12-21 2023-09-26 马上消费金融股份有限公司 人脸遮挡检测方法、模型训练方法、装置及电子设备
CN114093012B (zh) * 2022-01-18 2022-06-10 荣耀终端有限公司 人脸遮挡的检测方法和检测装置
CN114155561B (zh) * 2022-02-08 2022-09-09 杭州迪英加科技有限公司 一种幽门螺杆菌定位方法及装置
CN115941529A (zh) * 2022-11-28 2023-04-07 国网江苏省电力工程咨询有限公司 一种基于机器人的电缆隧道检测方法和系统
CN115938023B (zh) * 2023-03-15 2023-05-02 深圳市皇家金盾智能科技有限公司 智能门锁人脸识别解锁方法、装置、介质及智能门锁

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107909065A (zh) * 2017-12-29 2018-04-13 百度在线网络技术(北京)有限公司 用于检测人脸遮挡的方法及装置
CN109840477A (zh) * 2019-01-04 2019-06-04 苏州飞搜科技有限公司 基于特征变换的受遮挡人脸识别方法及装置
CN111191616A (zh) * 2020-01-02 2020-05-22 广州织点智能科技有限公司 一种人脸遮挡检测方法、装置、设备及存储介质
CN111428581A (zh) * 2020-03-05 2020-07-17 平安科技(深圳)有限公司 人脸遮挡检测方法及系统
CN111523480A (zh) * 2020-04-24 2020-08-11 北京嘀嘀无限科技发展有限公司 一种面部遮挡物的检测方法、装置、电子设备及存储介质
CN111814569A (zh) * 2020-06-12 2020-10-23 深圳禾思众成科技有限公司 一种人脸遮挡区域的检测方法及系统

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108319953B (zh) * 2017-07-27 2019-07-16 腾讯科技(深圳)有限公司 目标对象的遮挡检测方法及装置、电子设备及存储介质

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107909065A (zh) * 2017-12-29 2018-04-13 百度在线网络技术(北京)有限公司 用于检测人脸遮挡的方法及装置
CN109840477A (zh) * 2019-01-04 2019-06-04 苏州飞搜科技有限公司 基于特征变换的受遮挡人脸识别方法及装置
CN111191616A (zh) * 2020-01-02 2020-05-22 广州织点智能科技有限公司 一种人脸遮挡检测方法、装置、设备及存储介质
CN111428581A (zh) * 2020-03-05 2020-07-17 平安科技(深圳)有限公司 人脸遮挡检测方法及系统
CN111523480A (zh) * 2020-04-24 2020-08-11 北京嘀嘀无限科技发展有限公司 一种面部遮挡物的检测方法、装置、电子设备及存储介质
CN111814569A (zh) * 2020-06-12 2020-10-23 深圳禾思众成科技有限公司 一种人脸遮挡区域的检测方法及系统

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116311553A (zh) * 2023-05-17 2023-06-23 武汉利楚商务服务有限公司 应用于半遮挡图像下的人脸活体检测方法及装置
CN116311553B (zh) * 2023-05-17 2023-08-15 武汉利楚商务服务有限公司 应用于半遮挡图像下的人脸活体检测方法及装置

Also Published As

Publication number Publication date
CN112633144A (zh) 2021-04-09

Similar Documents

Publication Publication Date Title
WO2022134337A1 (fr) Method and system for face occlusion detection, device and storage medium
CN109359575B (zh) 人脸检测方法、业务处理方法、装置、终端及介质
JP6636154B2 (ja) 顔画像処理方法および装置、ならびに記憶媒体
WO2020199931A1 (fr) Procédé et appareil de détection de points clés de visage, et support de stockage et dispositif électronique
US9547908B1 (en) Feature mask determination for images
EP3916627A1 (fr) Procédé de détection de corps vivant basé sur une reconnaissance faciale, et dispositif électronique et support de stockage
WO2018086607A1 (fr) Procédé de suivi de cible, dispositif électronique et support d'informations
US10534957B2 (en) Eyeball movement analysis method and device, and storage medium
WO2018188453A1 (fr) Procédé de détermination d'une zone de visage humain, support de stockage et dispositif informatique
WO2022078041A1 (fr) Procédé d'entraînement de modèle de détection d'occlusion et procédé d'embellissement d'image faciale
WO2021051611A1 (fr) Procédé de reconnaissance faciale basé sur la visibilité faciale, système, dispositif et support de stockage
WO2019237567A1 (fr) Procédé de détection de chute fondé sur un réseau neuronal à convolution
WO2019061658A1 (fr) Procédé et dispositif de localisation de lunettes, et support d'informations
CN111428581A (zh) 人脸遮挡检测方法及系统
CN108428214B (zh) 一种图像处理方法及装置
CN109271930B (zh) 微表情识别方法、装置与存储介质
KR20200118076A (ko) 생체 검출 방법 및 장치, 전자 기기 및 저장 매체
CN111626163B (zh) 一种人脸活体检测方法、装置及计算机设备
CN111598038B (zh) 脸部特征点检测方法、装置、设备及存储介质
CN108416291B (zh) 人脸检测识别方法、装置和系统
WO2021051547A1 (fr) Procédé et système de détection de comportement violent
US20240013572A1 (en) Method for face detection, terminal device and non-transitory computer-readable storage medium
WO2020248848A1 (fr) Procédé et dispositif de détermination intelligente de cellule anormale, et support d'informations lisible par ordinateur
CN110008943B (zh) 一种图像处理方法及装置、一种计算设备及存储介质
CN112396050B (zh) 图像的处理方法、设备以及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21908349; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21908349; Country of ref document: EP; Kind code of ref document: A1)