CN116309643A - Face occlusion score determination method, electronic device and medium

Face occlusion score determination method, electronic device and medium

Info

Publication number
CN116309643A
Authority
CN
China
Prior art keywords: face, region, trained, occlusion, segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310298673.9A
Other languages
Chinese (zh)
Inventor
何方
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Yuncong Enterprise Development Co ltd
Original Assignee
Shanghai Yuncong Enterprise Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Yuncong Enterprise Development Co ltd filed Critical Shanghai Yuncong Enterprise Development Co ltd
Priority to CN202310298673.9A priority Critical patent/CN116309643A/en
Publication of CN116309643A publication Critical patent/CN116309643A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N 3/00 Computing arrangements based on biological models
            • G06N 3/02 Neural networks
              • G06N 3/08 Learning methods
        • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
          • G06T 7/00 Image analysis
            • G06T 7/0002 Inspection of images, e.g. flaw detection
              • G06T 7/0012 Biomedical image inspection
            • G06T 7/10 Segmentation; Edge detection
              • G06T 7/11 Region-based segmentation
          • G06T 2207/00 Indexing scheme for image analysis or image enhancement
            • G06T 2207/20 Special algorithmic details
              • G06T 2207/20021 Dividing image into blocks, subimages or windows
              • G06T 2207/20081 Training; Learning
              • G06T 2207/20084 Artificial neural networks [ANN]
            • G06T 2207/30 Subject of image; Context of image processing
              • G06T 2207/30196 Human being; Person
                • G06T 2207/30201 Face
        • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
          • G06V 10/00 Arrangements for image or video recognition or understanding
            • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
              • G06V 10/764 using classification, e.g. of video objects
              • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
                • G06V 10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
              • G06V 10/82 using neural networks
          • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
            • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
              • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
                • G06V 40/161 Detection; Localisation; Normalisation
                  • G06V 40/166 Detection; Localisation; Normalisation using acquisition arrangements

Abstract

The invention relates to computer vision, in particular to a face occlusion score determination method, an electronic device and a medium, and aims to solve the problem that existing methods cannot accurately reflect the degree of face occlusion. To this end, the invention inputs a face image into a trained face segmentation model to obtain an at least partial face region segmentation, together with either an occlusion region segmentation or a non-occlusion region segmentation within that face region; it then respectively counts the pixels of the at least partial face region segmentation and the pixels of the occlusion region segmentation (or of the non-occlusion region segmentation) within the at least partial face region; finally, the face occlusion score of the face image is determined from these pixel counts.

Description

Face occlusion score determination method, electronic device and medium
Technical Field
The invention relates to the technical field of computer vision, and in particular provides a face occlusion score determination method, an electronic device and a medium.
Background
In a face recognition system, face quality assessment is an indispensable preprocessing step: its purpose is to filter out poor-quality face data before recognition so as to improve recognition accuracy. Occlusion is one factor affecting face quality and can strongly degrade face recognition, so it needs to be estimated accurately. However, the occlusion score is not an intuitive target and a specific occlusion score value cannot be read off directly (for example, with the face occlusion score ranging over 0-1, 0 indicating no occlusion and 1 indicating complete occlusion); the occlusion score usually has to be constructed indirectly by other means.
Existing face occlusion estimation techniques either use a classification method to decide whether a face is an occluded face, or partition the face detection box region into sub-regions, estimate the occlusion degree of each sub-region, and obtain the occlusion degree of the whole face by weighting. In both cases, the way the face occlusion score is constructed cannot accurately reflect the degree of face occlusion: the confidence of the classification method only reflects how certain the classifier is that the face is (or is not) occluded, and the face detection box contains non-face areas whose occlusion state should not be a factor in judging whether the face is occluded, so the partition-based weighting is also biased. Occlusion scores estimated on this basis therefore carry additional error.
Accordingly, there is a need in the art for a new face occlusion score determination method to address the above-described problems.
Disclosure of Invention
The invention aims to solve the technical problem that existing face occlusion score construction methods cannot accurately reflect the degree of face occlusion, which degrades the user experience.
In order to achieve the above object, in a first aspect, the present invention provides a face occlusion score determination method, the method comprising the steps of:
acquiring a face image, and inputting the face image into a trained face segmentation model to obtain at least partial face region segmentation, and occlusion region segmentation within the at least partial face region or non-occlusion region segmentation within the at least partial face region;
respectively acquiring the number of pixels of the at least partial face region segmentation and the number of pixels of the occlusion region segmentation within the at least partial face region or the number of pixels of the non-occlusion region segmentation within the at least partial face region;
and determining the face occlusion score of the face image based on the number of pixels of the at least partial face region segmentation and the number of pixels of the occlusion region segmentation within the at least partial face region or the number of pixels of the non-occlusion region segmentation within the at least partial face region.
In an optional technical solution of the above face occlusion score determination method, the step of determining the face occlusion score of the face image based on the number of pixels of the at least partial face region segmentation and the number of pixels of the occlusion region segmentation within the at least partial face region or the number of pixels of the non-occlusion region segmentation within the at least partial face region includes:
acquiring a first ratio of the number of pixels of the occlusion region segmentation within the at least partial face region to the number of pixels of the at least partial face region segmentation, and taking the first ratio as the face occlusion score of the face image;
or, acquiring a second ratio of the number of pixels of the non-occlusion region segmentation within the at least partial face region to the number of pixels of the at least partial face region segmentation, and taking the second ratio as the face occlusion score of the face image.
In an optional technical solution of the above face occlusion score determination method, the face segmentation model is trained based on at least the following steps:
acquiring a first face training image, labeling the first face training image, and taking the labeled first face training image as a training sample of the face segmentation model; constructing a neural network model, and taking the neural network model as the face segmentation model to be trained, wherein the face segmentation model to be trained comprises a backbone network to be trained, a neck network to be trained and a head network to be trained;
and inputting the training sample into the face segmentation model to be trained for training, so as to obtain the trained face segmentation model.
In an optional technical solution of the above face occlusion score determination method, the step of "labeling the first face training image" includes:
labeling at least a partial face region segmentation based on the first face training image, and occlusion region segmentation within at least a partial face region or non-occlusion region segmentation within at least a partial face region.
In an optional technical solution of the above face occlusion score determination method, the step of labeling at least partial face region segmentation, and occlusion region segmentation within the at least partial face region or non-occlusion region segmentation within the at least partial face region, includes:
acquiring a second face training image based on the first face training image, wherein the second face training image is an image comprising only the face detection frame region of the first face training image;
labeling the at least partial face region segmentation, and the occlusion region segmentation within the at least partial face region or the non-occlusion region segmentation within the at least partial face region, based on the second face training image and a preset labeling region requirement, wherein the preset labeling region requirement comprises labeling all of the face region or labeling part of the face region.
In an optional technical solution of the above face occlusion score determination method, the step of inputting the training sample into the face segmentation model to be trained for training, so as to obtain the trained face segmentation model, includes:
S1, inputting the training sample into the backbone network to be trained for feature extraction, so as to obtain a first feature map of the training sample;
S2, inputting the first feature map of the training sample into the neck network to be trained for feature fusion, so as to obtain a second feature map of the training sample;
S3, inputting the second feature map of the training sample into the head network to be trained for region segmentation, so as to obtain a region segmentation result of the training sample, wherein the region segmentation result is at pixel level or grid level;
S4, acquiring a loss function based on the region segmentation result of the training sample and the training sample itself, feeding the loss function back to step S1, and repeating steps S1-S4 until the loss function converges, wherein the loss function at least comprises a cross entropy loss function.
In an optional technical solution of the above face occlusion score determination method, the head network to be trained comprises a face region segmentation head network to be trained, and an occlusion region segmentation head network to be trained or a non-occlusion region segmentation head network to be trained (both operating within the face region), and the step of inputting the second feature map of the training sample into the head network to be trained for region segmentation to obtain the region segmentation result of the training sample includes:
inputting the second feature map of the training sample into the face region segmentation head network to be trained, so as to obtain the at least partial face region segmentation;
inputting the second feature map of the training sample into the occlusion region segmentation head network to be trained, so as to obtain the occlusion region segmentation within the at least partial face region;
or inputting the second feature map of the training sample into the non-occlusion region segmentation head network to be trained, so as to obtain the non-occlusion region segmentation within the at least partial face region.
In an optional technical solution of the above face occlusion score determination method, the method further includes:
constructing the backbone network to be trained based on at least one of ResNet, MobileNet and HRNet;
and constructing the neck network to be trained based on at least one of FPN, PANet and Bi-FPN.
In a second aspect, the present invention also provides an electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the face occlusion score determination method described above when executing the computer program.
In a third aspect, the present invention also provides a readable storage medium having stored therein a plurality of program codes adapted to be loaded and executed by a processor to perform the face occlusion score determination method of any of the above.
As can be appreciated by those skilled in the art, in the technical solution of the present invention, a face image is acquired and input into a trained face segmentation model to obtain the at least partial face region segmentation, and the occlusion region segmentation within the at least partial face region or the non-occlusion region segmentation within the at least partial face region; the number of pixels of the at least partial face region segmentation and the number of pixels of the occlusion region segmentation (or of the non-occlusion region segmentation) within the at least partial face region are acquired respectively; and the face occlusion score of the face image is determined from these pixel counts. This arrangement reflects the degree of face occlusion more accurately, thereby further improving the accuracy of face recognition and the user experience.
Further, the method further comprises: constructing the backbone network to be trained based on at least one of ResNet, MobileNet and HRNet; and constructing the neck network to be trained based on at least one of FPN, PANet and Bi-FPN. The face segmentation model can thus be built according to the user's actual requirements, so as to better balance accuracy against computation time and further improve the user experience.
Drawings
The present disclosure will become more readily understood with reference to the accompanying drawings. As will be readily appreciated by those skilled in the art: the drawings are for illustrative purposes only and are not intended to limit the scope of the present invention. Moreover, like numerals in the figures are used to designate like parts, wherein:
FIG. 1 is a flow chart of the main steps of a face occlusion score determination method according to an embodiment of the present invention;
FIG. 2 is a flow chart of the main steps of training a face segmentation model according to one embodiment of the present invention;
FIG. 3 is a schematic image of a labeled at least partial face region segmentation, according to one embodiment of the invention;
FIG. 4 is a schematic image of a labeled occlusion region segmentation within an at least partial face region, according to one embodiment of the invention;
FIG. 5 is a flow chart of the main steps of inputting training samples into a face segmentation model to be trained for training according to one embodiment of the present invention;
fig. 6 is a schematic diagram of a main structure of an electronic device for performing the face occlusion score determination method of the present invention.
Detailed Description
Some embodiments of the invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are merely for explaining the technical principles of the present invention, and are not intended to limit the scope of the present invention.
In the description of the present invention, a "module," "processor" may include hardware, software, or a combination of both. A module may comprise hardware circuitry, various suitable sensors, communication ports, memory, or software components, such as program code, or a combination of software and hardware. The processor may be a central processor, a microprocessor, an image processor, a digital signal processor, or any other suitable processor. The processor has data and/or signal processing functions. The processor may be implemented in software, hardware, or a combination of both. Non-transitory computer readable storage media include any suitable medium that can store program code, such as magnetic disks, hard disks, optical disks, flash memory, read-only memory, random access memory, and the like. The term "at least one A or B" or "at least one of A and B" has a meaning similar to "A and/or B" and may include A alone, B alone or A and B. The singular forms "a", "an" and "the" include plural referents.
As described in the background section, the present invention provides a face occlusion score determination method to address the problem that existing face occlusion score construction methods cannot accurately reflect the degree of face occlusion, which degrades the user experience.
Referring to fig. 1, fig. 1 is a schematic flow chart of the main steps of a face occlusion score determination method according to an embodiment of the present invention. As shown in fig. 1, the method comprises the following steps:
Step S101: acquiring a face image, and inputting the face image into a trained face segmentation model to obtain at least partial face region segmentation, and occlusion region segmentation within the at least partial face region or non-occlusion region segmentation within the at least partial face region.
Specifically, the face region generally includes the forehead, eyebrows, eyelids, canthi, eye sockets, nose bridge, nose wings, nose tip, nasolabial folds, cheeks, lips, upper jaw, lower jaw, and so on. In face recognition, recognition may be based on all of these face sub-regions or only on some of them; therefore, the trained face segmentation model may output at least partial face region segmentation, together with occlusion region segmentation or non-occlusion region segmentation within that at least partial face region. For example, since the forehead area above the eyebrows has essentially no effect on face recognition, the at least partial face region may be a face region that excludes the forehead and includes only the area below the eyebrows. This choice of the at least partial face region is merely illustrative and may be made according to actual needs in practical applications.
Step S102: respectively acquiring the number of pixels of the at least partial face region segmentation, and the number of pixels of the occlusion region segmentation within the at least partial face region or the number of pixels of the non-occlusion region segmentation within the at least partial face region.
Step S103: determining the face occlusion score of the face image based on the number of pixels of the at least partial face region segmentation, and the number of pixels of the occlusion region segmentation within the at least partial face region or the number of pixels of the non-occlusion region segmentation within the at least partial face region.
Through the above steps S101 to S103, the face image is acquired and input into the trained face segmentation model to obtain the at least partial face region segmentation, and the occlusion region segmentation or the non-occlusion region segmentation within the at least partial face region; the corresponding pixel counts are then acquired respectively; and the face occlusion score of the face image is determined from these pixel counts. This arrangement reflects the degree of face occlusion more accurately, thereby further improving the accuracy of face recognition and the user experience.
In some embodiments, determining the face occlusion score of the face image based on the number of pixels of the at least partial face region segmentation and the number of pixels of the occlusion region segmentation within the at least partial face region or the number of pixels of the non-occlusion region segmentation within the at least partial face region comprises the following steps:
Step S1031: acquiring a first ratio of the number of pixels of the occlusion region segmentation within the at least partial face region to the number of pixels of the at least partial face region segmentation, and taking the first ratio as the face occlusion score of the face image.
Step S1032: or, acquiring a second ratio of the number of pixels of the non-occlusion region segmentation within the at least partial face region to the number of pixels of the at least partial face region segmentation, and taking the second ratio as the face occlusion score of the face image.
That is, the first ratio is obtained as first ratio = (number of pixels of the occlusion region segmentation within the at least partial face region) / (number of pixels of the at least partial face region segmentation), and this first ratio is used as the face occlusion score of the face image; or, the second ratio is obtained as second ratio = (number of pixels of the non-occlusion region segmentation within the at least partial face region) / (number of pixels of the at least partial face region segmentation), and this second ratio is used as the face occlusion score of the face image.
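As a minimal illustration of this computation (the function name, the NumPy mask representation and the empty-region handling are assumptions made for the sketch, not part of the patent), the two ratios could be computed from binary segmentation masks as follows:

```python
import numpy as np

def face_occlusion_score(face_mask, occluded_mask=None, visible_mask=None):
    """Compute the face occlusion score of a face image from binary masks.

    face_mask     : H x W array, 1 where a pixel belongs to the at least
                    partial face region segmentation, 0 elsewhere.
    occluded_mask : H x W array, 1 where a face-region pixel is occluded.
    visible_mask  : H x W array, 1 where a face-region pixel is not occluded.
    Exactly one of occluded_mask / visible_mask is expected, depending on
    which segmentation the trained model outputs.
    """
    face_pixels = int(np.count_nonzero(face_mask))
    if face_pixels == 0:
        raise ValueError("no face-region pixels were segmented")
    if occluded_mask is not None:
        # first ratio: occluded pixels / face-region pixels
        return np.count_nonzero(occluded_mask) / face_pixels
    # second ratio: non-occluded pixels / face-region pixels
    return np.count_nonzero(visible_mask) / face_pixels
```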
Referring to fig. 2, fig. 2 is a flowchart illustrating main steps for training a face segmentation model according to an embodiment of the present invention. As shown in fig. 2, the method trains the model based at least on the following steps:
step S201: and acquiring a first face training image, marking the first face training image, and taking the marked first face training image as a training sample of the face segmentation model.
In some embodiments, the first face training image includes a face detection frame region and a non-face detection frame region, and labeling the first face training image includes: labeling at least a partial face region segmentation based on the first face training image, and either an occlusion region segmentation within at least a partial face region or a non-occlusion region segmentation within at least a partial face region.
Specifically, the at least partial face region refers to some or all of the region belonging to the face in the first face training image, and it can be represented either by the set of points on the region boundary or by the set of points inside the region; the occlusion region within the at least partial face region refers to the part of that region which is occluded, represented in the same way; and the non-occlusion region within the at least partial face region refers to the part of that region which is not occluded, again represented in the same way. For example, the region segmentation of the first face training image may be performed at pixel level, that is, each pixel of the first face training image is assigned to the region it belongs to. This arrangement of the region segmentation is merely illustrative and may be chosen according to actual needs in practical applications.
In some embodiments, labeling the at least partial face region segmentation, and the occlusion region segmentation or the non-occlusion region segmentation within the at least partial face region, comprises: acquiring a second face training image based on the first face training image, wherein the second face training image is an image comprising only the face detection frame region of the first face training image; and labeling the at least partial face region segmentation, and the occlusion region segmentation or the non-occlusion region segmentation within the at least partial face region, based on the second face training image and a preset labeling region requirement, wherein the preset labeling region requirement comprises labeling all of the face region or labeling part of the face region.
Specifically, the first face training image contains both a face detection frame region and a non-face detection frame region; the non-face detection frame region may be, for example, the neck or shoulders of the person. Therefore, to label the first face training image more accurately, a second face training image containing only the face detection frame region of the first face training image can be obtained first, and the labeling is then performed on the second face training image. During labeling, the at least partial face region refers to some or all of the region belonging to the face in the second face training image; the occlusion region within it is the part that is occluded, and the non-occlusion region is the part that is not occluded; each region can be represented by the set of points on its boundary or the set of points inside it. If there is no occluded area within the at least partial face region, the occlusion region segmentation within the at least partial face region is an empty set.
Illustratively, the face detection frame region in the first face training image may be represented by the coordinates (x1, y1) of its upper-left corner and (x2, y2) of its lower-right corner: the upper-left corner of the first face training image is taken as the origin of coordinates, the positive x-axis points horizontally to the right and the positive y-axis points vertically downward, and under this coordinate system the upper-left corner of the face detection frame region is (x1, y1) and its lower-right corner is (x2, y2). This representation of the face detection frame region is merely illustrative and may be chosen according to actual needs in practical applications.
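For illustration, cropping the second face training image out of the first one under the coordinate convention above might look like this (a sketch; the helper name and the half-open slicing convention are assumptions):

```python
import numpy as np

def crop_face_box(first_image: np.ndarray, box) -> np.ndarray:
    """Return the second face training image: the face detection frame
    region cut out of the first face training image.

    first_image : H x W x C array, origin at the top-left corner, x to the
                  right, y downward (as described above).
    box         : (x1, y1, x2, y2) upper-left and lower-right corners of the
                  face detection frame region.
    """
    x1, y1, x2, y2 = box
    # rows are indexed by y, columns by x; half-open slicing is assumed
    return first_image[y1:y2, x1:x2]
```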
In some embodiments, the at least partial face region may be defined as required, i.e., it may include the whole face region or only part of it, so that either the whole face region or only part of it is labeled. For example, since the forehead area above the eyebrows has essentially no effect on face recognition, the at least partial face region may be defined as the face region below the eyebrows, excluding the forehead. Under this definition, the labeled at least partial face region segmentation based on the second face training image may be as shown in fig. 3, and the labeled occlusion region segmentation within the at least partial face region may be as shown in fig. 4. The above definition of the at least partial face region and the labeled segmentation images are merely illustrative and may be chosen according to actual needs in practical applications.
Step S202: and constructing a neural network model, and taking the neural network model as a face segmentation model to be trained, wherein the face segmentation model to be trained comprises a backbone network to be trained, a neck network to be trained and a head network to be trained.
Step S203: and inputting the training sample into the face segmentation model to be trained for training so as to obtain the face segmentation model which is trained.
S203 is further described below.
Referring to fig. 5, fig. 5 is a flowchart illustrating main steps of inputting training samples into a face segmentation model to be trained for training according to an embodiment of the present invention. As shown in fig. 5, in some embodiments, inputting a training sample into a face segmentation model to be trained for training, to obtain a trained face segmentation model includes the following steps:
step S2031: and inputting the training sample into a backbone network to be trained for feature extraction so as to obtain a first feature map of the training sample.
Step S2032: and inputting the first feature map of the training sample into a neck network to be trained for feature fusion so as to obtain a second feature map of the training sample.
Step S2033: and inputting the second feature map of the training sample into a head network to be trained for region segmentation to obtain a region segmentation result of the training sample, wherein the region segmentation result is a pixel level or a grid level.
Step S2034: acquiring a loss function based on the region segmentation result of the training sample and the training sample itself, feeding the loss function back to step S2031, and repeating steps S2031 to S2034 until the loss function converges, wherein the loss function at least comprises a cross entropy loss function.
Specifically, feature extraction is the foundation of computer vision tasks, and a good feature extraction network can significantly improve the performance of an algorithm; the network that extracts features from an image is called the backbone network. The receptive field is the size of the area on the original image that a pixel on the output feature map corresponds to. Earlier object detectors ran detection directly on the coarse feature map output by the backbone network, which has a large receptive field: this is friendly to large objects, but an overly large receptive field easily causes small objects to become "out of focus". To avoid this problem, feature maps at multiple scales need to be fused from bottom to top before detection, i.e., the neck network fuses the first feature maps extracted by the backbone network. After the neck network has fused the first feature maps, the head network performs detection and localization on the fused features, i.e., it realizes the region segmentation of the training sample.
Depending on how the backbone network, neck network and head network to be trained are structured, the region segmentation result can be at pixel level or at grid level. Pixel-level region segmentation assigns each pixel of the training sample to a region; grid-level region segmentation divides the training sample into a number of grid cells and assigns each cell to a region. Pixel-level segmentation is more accurate than grid-level segmentation but also more time-consuming, so the level of region segmentation to adopt can be chosen in practice by weighing accuracy against computation time.
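A rough sketch of one pass through steps S2031 to S2034 is given below, assuming the three sub-networks are PyTorch modules and the head produces a pixel-level class map (module names, the optimizer choice and the fixed epoch count are illustrative assumptions; the patent only requires iterating until the loss converges):

```python
import torch
import torch.nn.functional as F

def train_face_segmentation(backbone, neck, head, loader, epochs=10, lr=1e-3):
    """Steps S2031-S2034: extract features, fuse them, segment, and
    back-propagate a cross-entropy loss over repeated iterations."""
    params = list(backbone.parameters()) + list(neck.parameters()) + list(head.parameters())
    optimizer = torch.optim.SGD(params, lr=lr, momentum=0.9)

    for _ in range(epochs):
        for images, labels in loader:            # labels: per-pixel region indices
            feats = backbone(images)             # S2031: first feature map(s)
            fused = neck(feats)                  # S2032: second (fused) feature map
            logits = head(fused)                 # S2033: region segmentation logits
            # resize to the label resolution for pixel-level supervision
            logits = F.interpolate(logits, size=labels.shape[-2:],
                                   mode="bilinear", align_corners=False)
            loss = F.cross_entropy(logits, labels)   # S2034: cross entropy loss
            optimizer.zero_grad()
            loss.backward()                      # feed the loss back towards S2031
            optimizer.step()
```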
The loss function adjusts the weights of the face segmentation model to be trained by measuring, at each iteration, the difference between the region segmentation result and the labels of the training sample. The cross entropy loss function is computed as:

L = -\frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{M} y_{ic} \log\left(p_{ic}\right)

where N is the number of samples selected in one training batch, M is the number of classes, y_{ic} is 1 if the true class of sample i is c and 0 otherwise, and p_{ic} is the predicted probability that sample i belongs to class c.
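Written out directly, the formula above corresponds to the following sketch (in practice a library routine such as torch.nn.functional.cross_entropy would normally be used instead):

```python
import numpy as np

def cross_entropy_loss(p, y):
    """Cross entropy over N samples and M classes, matching the formula above.

    p : N x M array of predicted probabilities p_ic.
    y : N x M one-hot array, y_ic = 1 if the true class of sample i is c, else 0.
    """
    eps = 1e-12                                   # guard against log(0)
    return -np.mean(np.sum(y * np.log(p + eps), axis=1))
```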
In some embodiments, the head network to be trained comprises a face region segmentation head network to be trained, and an occlusion region segmentation head network to be trained or a non-occlusion region segmentation head network to be trained (both operating within the face region), and inputting the second feature map of the training sample into the head network to be trained for region segmentation to obtain the region segmentation result of the training sample comprises the following steps:
Step S20331: inputting the second feature map of the training sample into the face region segmentation head network to be trained, so as to obtain the at least partial face region segmentation.
Step S20332: inputting the second feature map of the training sample into the occlusion region segmentation head network to be trained, so as to obtain the occlusion region segmentation within the at least partial face region.
Step S20333: or, inputting the second feature map of the training sample into the non-occlusion region segmentation head network to be trained, so as to obtain the non-occlusion region segmentation within the at least partial face region.
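Concretely, the parallel head networks might each be a small convolutional branch applied to the same fused (second) feature map, roughly as sketched below (the layer sizes and the 256-channel input are assumptions, not taken from the patent):

```python
import torch.nn as nn

class SegmentationHead(nn.Module):
    """A minimal per-pixel binary segmentation head; one instance is used
    for the face region and another for the occlusion (or non-occlusion)
    region within the face region."""
    def __init__(self, in_channels: int = 256, hidden: int = 64):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_channels, hidden, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, 1, kernel_size=1),   # one-channel logit map
        )

    def forward(self, fused_features):
        return self.block(fused_features)

face_head = SegmentationHead()        # at least partial face region
occlusion_head = SegmentationHead()   # occlusion region within the face region
```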
In some embodiments, the method further comprises: constructing the backbone network to be trained based on at least one of ResNet, MobileNet and HRNet; and constructing the neck network to be trained based on at least one of FPN, PANet and Bi-FPN.
Specifically, resNet (Residual Network) was proposed by He Kaiming et al of Microsoft laboratories in 2015, which is mainly characterized by having an ultra-deep Network structure, providing Residual structure modules, and using Batch Normalization (batch normalization) acceleration training. MobileNet (mobile network) is based on a streamlined architecture, using depth separable convolution to build lightweight depth neural networks for mobile and embedded vision applications; the network introduces two simple global hyper-parameters-width multiplier and resolution multiplier, which can effectively trade off between delay and accuracy. HRNet (High-Resolution Net) is proposed for 2D human body pose estimation tasks, and the network is primarily for pose estimation of a single individual (i.e., there should be only one human body target in the image of the input network).
FPN (Feature Pyramid Network) consists of a bottom-up path and a top-down path; the bottom-up path is an ordinary convolutional feature extraction network. Going from bottom to top, the spatial resolution decreases, more high-level structures are detected, and the semantic value of the network layers increases accordingly. PANet (Path Aggregation Network) strengthens the whole feature hierarchy with accurate low-level localization signals through bottom-up path augmentation, shortening the information path between low-level and top-level features; it also proposes adaptive feature pooling, which connects the feature grid with all feature levels so that the useful information in every feature level propagates directly to the following proposal sub-network. Bi-FPN (bidirectional feature pyramid network) introduces learnable weights to learn the importance of different input features while repeatedly applying top-down and bottom-up multi-scale feature fusion.
The backbone network to be trained comprises one of ResNet, MobileNet and HRNet, and the neck network to be trained comprises one of FPN, PANet and Bi-FPN. The face segmentation model can thus be built according to the user's actual requirements, so as to better balance accuracy against computation time and further improve the user experience.
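As an illustration of how such a configurable choice could be wired together with an off-the-shelf library (a sketch assuming a recent torchvision release; only the ResNet-18 plus FPN combination is shown, and MobileNet or HRNet backbones and PANet or Bi-FPN necks would be substituted analogously):

```python
import torch.nn as nn
import torchvision
from torchvision.models.feature_extraction import create_feature_extractor
from torchvision.ops import FeaturePyramidNetwork

def build_backbone(name: str = "resnet18") -> nn.Module:
    """Return a backbone that outputs a dict of multi-scale feature maps."""
    if name == "resnet18":
        net = torchvision.models.resnet18(weights=None)
        # expose the four residual stages (output strides 4, 8, 16, 32)
        return create_feature_extractor(
            net, return_nodes={"layer1": "c2", "layer2": "c3",
                               "layer3": "c4", "layer4": "c5"})
    raise ValueError(f"unsupported backbone: {name}")

def build_neck(out_channels: int = 256) -> nn.Module:
    """An FPN neck matched to the ResNet-18 stage widths (64/128/256/512)."""
    return FeaturePyramidNetwork([64, 128, 256, 512], out_channels)
```

The dict of feature maps produced by this backbone can be passed directly to the FPN neck, whose highest-resolution output would then feed the segmentation heads; a lighter backbone or neck trades accuracy for speed, as discussed above.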
In some embodiments, the face image is the same size as the training samples. Specifically, the size of the image may be represented by (H, W), where H is the height of the image and W is the width of the image, and the face image and the training sample are represented by the same (H, W).
It should be noted that, the user images according to the embodiments of the present disclosure (including, but not limited to, the first face training image and the second face training image used for training, the face image in the actual environment, etc.) are all images authorized by the user or sufficiently authorized by the parties.
The actions such as image acquisition related in the embodiments of the present disclosure are performed after user and object authorization or after full authorization by each party.
It should be noted that, although the foregoing embodiments describe the steps in a specific order, it will be understood by those skilled in the art that, in order to achieve the effects of the present invention, the steps are not necessarily performed in such an order, and may be performed simultaneously (in parallel) or in other orders, and these variations are within the scope of the present invention.
The invention further provides electronic equipment.
Referring to fig. 6, fig. 6 is a schematic block diagram of a main structure of an electronic device for performing the face occlusion score determining method of the present invention. As shown in fig. 6, the present invention further provides an electronic device for executing the face occlusion score determining method of the present invention, where the electronic device includes: a processor 11, a memory 12 and a computer program 13 stored in the memory 12 and executable on the processor 11. The steps of the various method embodiments described above are implemented by the processor 11 when executing the computer program 13. Alternatively, the processor 11 implements the functions of the modules/units in the above-described embodiments when executing the computer program 13.
The processor 11 may be, for example, a central processing unit (Central Processing Unit, CPU) or other general purpose processor, digital signal processor (Digital Signal Processor, DSP), application specific integrated circuit (Application Specific Integrated Circuit, ASIC), field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 12 may be an internal storage unit of the electronic device, for example, a hard disk or a memory of the electronic device; the memory 12 may also be an external storage device of the electronic device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the electronic device. Further, the memory 12 may also include both internal storage units and external storage devices of the electronic device. The memory 12 is used for storing computer programs and other programs and data required by the electronic device, and the memory 12 may also be used for temporarily storing data that has been output or is to be output.
In some possible implementations, the electronic device may include multiple processors 11 and memories 12. The program for executing the face occlusion score determination method of the above method embodiment may be divided into multiple sub-programs, each of which may be loaded and executed by a processor 11 to execute different steps of the face occlusion score determination method of the above method embodiment. Specifically, the sub-programs may be stored in different memories 12, and each processor 11 may be configured to execute the programs in one or more memories 12, so that the processors 11 jointly implement the face occlusion score determination method of the above method embodiment, that is, each processor 11 executes different steps of the method to implement it jointly.
The plurality of processors 11 may be processors disposed on the same device, for example, the electronic device may be a high-performance device composed of a plurality of processors, and the plurality of processors 11 may be processors configured on the high-performance device. The plurality of processors 11 may be processors disposed on different devices, for example, the electronic device may be a server cluster, and the plurality of processors 11 may be processors on different servers in the server cluster.
The electronic device may be a desktop computer, a notebook computer, a palm computer, a cloud server, or the like. The electronic device may include, but is not limited to, a processor 11 and a memory 12. It will be appreciated by those skilled in the art that fig. 6 is merely an example of an electronic device and is not meant to be limiting, and may include more or fewer components than shown, or may combine certain components, or different components, e.g., an electronic device may also include an input-output device, a network access device, a bus, etc.
Further, the invention also provides a computer readable storage medium. In one embodiment of the computer readable storage medium according to the present invention, the computer readable storage medium may be configured to store a program for performing the face occlusion score determination method of the above method embodiment, which may be loaded and executed by a processor to implement the above face occlusion score determination method. For convenience of explanation, only those portions relevant to the embodiments of the present invention are shown; for specific technical details which are not disclosed, please refer to the method portions of the embodiments of the present invention. The computer readable storage medium may be a storage device formed by various electronic devices; optionally, the computer readable storage medium in the embodiments of the present invention is a non-transitory computer readable storage medium.
Further, it should be understood that, since the respective modules are merely set to illustrate the functional units of the apparatus of the present invention, the physical devices corresponding to the modules may be the processor itself, or a part of software in the processor, a part of hardware, or a part of a combination of software and hardware. Accordingly, the number of individual modules in the figures is merely illustrative.
Those skilled in the art will appreciate that the various modules in the apparatus may be adaptively split or combined. Such splitting or combining of specific modules does not cause the technical solution to deviate from the principle of the present invention, and therefore, the technical solution after splitting or combining falls within the protection scope of the present invention.
Thus far, the technical solution of the present invention has been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of protection of the present invention is not limited to these specific embodiments. Equivalent modifications and substitutions for related technical features may be made by those skilled in the art without departing from the principles of the present invention, and such modifications and substitutions will fall within the scope of the present invention.

Claims (10)

1. A method for determining a face occlusion score, the method comprising the steps of:
acquiring a face image, and inputting the face image into a trained face segmentation model to obtain at least partial face region segmentation, and occlusion region segmentation within the at least partial face region or non-occlusion region segmentation within the at least partial face region;
respectively acquiring the number of pixels of the at least partial face region segmentation and the number of pixels of the occlusion region segmentation within the at least partial face region or the number of pixels of the non-occlusion region segmentation within the at least partial face region;
and determining the face occlusion score of the face image based on the number of pixels of the at least partial face region segmentation and the number of pixels of the occlusion region segmentation within the at least partial face region or the number of pixels of the non-occlusion region segmentation within the at least partial face region.
2. The face occlusion score determination method according to claim 1, wherein the step of determining the face occlusion score of the face image based on the number of pixels of the at least partial face region segmentation and the number of pixels of the occlusion region segmentation within the at least partial face region or the number of pixels of the non-occlusion region segmentation within the at least partial face region includes:
acquiring a first ratio of the number of pixels of the occlusion region segmentation within the at least partial face region to the number of pixels of the at least partial face region segmentation, and taking the first ratio as the face occlusion score of the face image;
or, acquiring a second ratio of the number of pixels of the non-occlusion region segmentation within the at least partial face region to the number of pixels of the at least partial face region segmentation, and taking the second ratio as the face occlusion score of the face image.
3. The face occlusion score determination method of claim 1, wherein said method trains said model based on at least the steps of:
acquiring a first face training image, labeling the first face training image, and taking the labeled first face training image as a training sample of the face segmentation model; constructing a neural network model, and taking the neural network model as the face segmentation model to be trained, wherein the face segmentation model to be trained comprises a backbone network to be trained, a neck network to be trained and a head network to be trained;
and inputting the training sample into the face segmentation model to be trained for training so as to obtain a face segmentation model which is trained.
4. A face occlusion score determination method according to claim 3, wherein the first face training image comprises a face detection box area and a non-face detection box area, and the step of labeling the first face training image comprises:
labeling at least a partial face region segmentation based on the first face training image, and occlusion region segmentation within at least a partial face region or non-occlusion region segmentation within at least a partial face region.
5. The face occlusion score determination method of claim 4, wherein the step of labeling at least partial face region segmentation, and occlusion region segmentation within the at least partial face region or non-occlusion region segmentation within the at least partial face region, comprises:
acquiring a second face training image based on the first face training image, wherein the second face training image is an image comprising only the face detection frame region of the first face training image;
labeling the at least partial face region segmentation, and the occlusion region segmentation within the at least partial face region or the non-occlusion region segmentation within the at least partial face region, based on the second face training image and a preset labeling region requirement, wherein the preset labeling region requirement comprises labeling all of the face region or labeling part of the face region.
6. The face occlusion score determination method according to claim 3, wherein the step of inputting the training sample into the face segmentation model to be trained to obtain the trained face segmentation model comprises:
S1, inputting the training sample into the backbone network to be trained for feature extraction, so as to obtain a first feature map of the training sample;
S2, inputting the first feature map of the training sample into the neck network to be trained for feature fusion, so as to obtain a second feature map of the training sample;
S3, inputting the second feature map of the training sample into the head network to be trained for region segmentation, so as to obtain a region segmentation result of the training sample, wherein the region segmentation result is at pixel level or grid level;
S4, acquiring a loss function based on the region segmentation result of the training sample and the training sample itself, feeding the loss function back to step S1, and repeating steps S1-S4 until the loss function converges, wherein the loss function at least comprises a cross entropy loss function.
7. The face occlusion score determination method of claim 6, wherein the head network to be trained comprises a face region segmentation head network to be trained, and an occlusion region segmentation head network to be trained or a non-occlusion region segmentation head network to be trained (both operating within the face region), and the step of inputting the second feature map of the training sample into the head network to be trained for region segmentation to obtain the region segmentation result of the training sample comprises:
inputting the second feature map of the training sample into the face region segmentation head network to be trained, so as to obtain the at least partial face region segmentation;
inputting the second feature map of the training sample into the occlusion region segmentation head network to be trained, so as to obtain the occlusion region segmentation within the at least partial face region;
or inputting the second feature map of the training sample into the non-occlusion region segmentation head network to be trained, so as to obtain the non-occlusion region segmentation within the at least partial face region.
8. A face occlusion score determination method according to claim 3, further comprising:
constructing the backbone network to be trained based on at least one of ResNet, MobileNet and HRNet;
and constructing the neck network to be trained based on at least one of FPN, PANet and Bi-FPN.
9. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the face occlusion score determination method of any of claims 1 to 8 when the computer program is executed.
10. A readable storage medium having stored therein a plurality of program codes, wherein the program codes are adapted to be loaded and executed by a processor to perform the face occlusion score determination method of any of claims 1 to 8.
CN202310298673.9A 2023-03-23 2023-03-23 Face shielding score determining method, electronic equipment and medium Pending CN116309643A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310298673.9A CN116309643A (en) 2023-03-23 2023-03-23 Face shielding score determining method, electronic equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310298673.9A CN116309643A (en) 2023-03-23 2023-03-23 Face shielding score determining method, electronic equipment and medium

Publications (1)

Publication Number Publication Date
CN116309643A true CN116309643A (en) 2023-06-23

Family

ID=86793987

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310298673.9A Pending CN116309643A (en) 2023-03-23 2023-03-23 Face shielding score determining method, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN116309643A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116883670A (en) * 2023-08-11 2023-10-13 智慧眼科技股份有限公司 Anti-shielding face image segmentation method


Similar Documents

Publication Publication Date Title
CN108121986B (en) Object detection method and device, computer device and computer readable storage medium
CN109934065B (en) Method and device for gesture recognition
CN112528831B (en) Multi-target attitude estimation method, multi-target attitude estimation device and terminal equipment
CN111860398B (en) Remote sensing image target detection method and system and terminal equipment
CN108734210B (en) Object detection method based on cross-modal multi-scale feature fusion
CN110363817B (en) Target pose estimation method, electronic device, and medium
CN110765860A (en) Tumble determination method, tumble determination device, computer apparatus, and storage medium
WO2020244075A1 (en) Sign language recognition method and apparatus, and computer device and storage medium
CN112328715B (en) Visual positioning method, training method of related model, related device and equipment
CN112446919A (en) Object pose estimation method and device, electronic equipment and computer storage medium
CN110991513A (en) Image target recognition system and method with human-like continuous learning capability
WO2023151237A1 (en) Face pose estimation method and apparatus, electronic device, and storage medium
CN110222572A (en) Tracking, device, electronic equipment and storage medium
CN112633084A (en) Face frame determination method and device, terminal equipment and storage medium
CN112085701A (en) Face ambiguity detection method and device, terminal equipment and storage medium
CN111985458A (en) Method for detecting multiple targets, electronic equipment and storage medium
CN116309643A (en) Face shielding score determining method, electronic equipment and medium
CN115018999A (en) Multi-robot-cooperation dense point cloud map construction method and device
CN110796108A (en) Method, device and equipment for detecting face quality and storage medium
CN111353325A (en) Key point detection model training method and device
CN110598647B (en) Head posture recognition method based on image recognition
WO2023109086A1 (en) Character recognition method, apparatus and device, and storage medium
CN109241942B (en) Image processing method and device, face recognition equipment and storage medium
WO2020244076A1 (en) Face recognition method and apparatus, and electronic device and storage medium
CN111104911A (en) Pedestrian re-identification method and device based on big data training

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination