CN115862097A - Method and device for recognizing occluded faces based on multi-attention and multi-scale feature learning
- Publication number
- CN115862097A CN202211493911.3A
- Authority
- CN
- China
- Prior art keywords
- face
- occlusion
- attention
- feature
- face image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention discloses a method and a device for recognizing occluded faces based on multi-attention and multi-scale feature learning. The method uses channel attention and spatial attention mechanisms to reduce the influence of occluders on face recognition, addressing the drop in recognition accuracy when faces are occluded. First, a multi-level attention network is added to a conventional convolutional neural network to extract a channel attention map and a spatial attention map of the face image. Second, a multi-scale feature fuser is constructed to integrate local and global information of the face image. Next, an occlusion mask generator locates the occluded region and generates an occlusion mask that reduces the influence of that region. Finally, a multi-task learning network performs occlusion category classification and face identity classification to obtain the final face recognition result. The method is simple to implement, uses a shallow model at low cost, and maintains recognition efficiency while accurately recognizing occluded faces.
Description
Technical Field
The invention belongs to the field of artificial-intelligence face recognition, and particularly relates to a method and a device for recognizing occluded faces based on multi-attention and multi-scale feature learning.
Background
As a non-invasive means of identity verification, face recognition is more popular and more widely accepted than other biometric technologies. As recognition technology has advanced, face recognition has been widely deployed in scenarios such as surveillance systems, security systems, industrial production, and home monitoring, bringing convenience to many aspects of daily life.
The accuracy of face recognition depends largely on how well the model extracts key facial features, and whether the face region is complete strongly affects feature extraction. During the global COVID-19 pandemic, wearing a mask became a requirement for going out. As an external interference factor, a mask occludes part of the face image and corrupts some of its features. Under these conditions, ordinary face recognition algorithms lose their usual high accuracy and cannot complete the masked face recognition task. A new algorithm for occluded face recognition is therefore urgently needed.
Disclosure of Invention
The invention aims to provide a method and a device for recognizing occluded faces based on multi-attention and multi-scale feature learning. By extracting and fusing multi-level features of the face image, and by using channel attention and spatial attention mechanisms to suppress the influence of the occluded region, the method effectively improves the accuracy of occluded face recognition. The method has simple logic and a clear effect: it masks out the adverse effect of partial occlusion on face recognition while still supporting face recognition in unoccluded scenes.
The invention is realized by adopting the following technical scheme:
A method for recognizing occluded faces based on multi-attention and multi-scale feature learning: first, a multi-level attention network is added to a convolutional neural network to extract a channel attention map and a spatial attention map of the face image; second, a multi-scale feature fuser is constructed to integrate local and global information of the face image and obtain more robust face features; then, an occlusion mask generator locates the occluded region and generates an occlusion mask to reduce the influence of that region; finally, a multi-task learning network classifies occlusion categories and face identities simultaneously to obtain the best generalization in face recognition.
The method for recognizing occluded faces based on multi-attention and multi-scale feature learning specifically comprises the following steps:
face image feature extraction: a residual neural network is used as the face image feature extraction module to extract the face image features;
channel attention map of the face image: after the face image features are obtained, attention weights are calculated for the feature map of each channel, giving the degree of correlation between each channel and the key information;
spatial attention map of the face image: the feature map refined by the channel attention map is taken as input, and the degree of correlation between each pixel and the key information is calculated to obtain the multi-level features of the face image;
multi-scale feature fusion of the face image: for the multi-level features of the face image, a multi-scale feature fuser is constructed with a three-layer deconvolution structure, and feature maps of different scales are added element by element to obtain multi-scale feature information of the face image with different resolutions and semantic strengths;
occlusion mask generation: a feature mask that is highly sensitive to the occlusion position in the input image is learned, the weights corresponding to the occluded region are calculated, and the influence of corrupted features on face recognition is eliminated by assigning different weights to the features;
occlusion classification: the feature mask learned by the occlusion mask generator is taken as input and classified into an occlusion category, which supervises the learning of the occlusion mask generator;
face classification: the region weights obtained by the occlusion mask generator are multiplied by the multi-scale feature information of the face image to obtain the cleaned face features, which are fed to the face classifier to obtain the face classification result.
In a further improvement of the invention, obtaining the channel attention map of the face image specifically comprises:
performing average pooling and max pooling separately on the extracted face features to aggregate spatial information, and feeding the results into two shared fully connected layers that fit the correlation among channel features, yielding two channel feature maps;
adding the corresponding elements of the two channel feature maps and applying a Sigmoid activation function to obtain the channel attention map of the face image, in which each weight reflects the degree of correlation between a channel and the key information.
In a further improvement of the invention, obtaining the spatial attention map of the face image specifically comprises:
performing max pooling and average pooling on the extracted face features along the channel dimension to obtain two spatial feature maps;
concatenating the two spatial feature maps and fitting the feature correlation in the spatial dimension with a convolution operation to obtain the spatial attention map, in which each weight reflects the degree of correlation between a pixel and the key information.
In a further improvement of the invention, the multi-scale feature fusion of the face image specifically comprises:
using the face image feature extraction module as the backbone of the multi-scale feature fuser, and constructing a pyramid structure model with a top-down architecture with lateral connections;
the input of the pyramid structure model is the preprocessed face image, and multi-scale feature information of the face image with different resolutions and semantic strengths is obtained through convolution and upsampling operations.
In a further improvement of the invention, the occlusion mask generation specifically comprises:
taking as input the face features that contain different scales and global information, and obtaining the final occlusion mask through a convolutional network combined with a PReLU activation function, a batch normalization layer, and a Sigmoid function; the mask is used to clean the original face features corrupted by partial occlusion.
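The mask generation step can be illustrated with a minimal pure-Python sketch. The text names the components (convolution, PReLU, batch normalization, Sigmoid) but not their order, kernel size, or parameters, so the ordering conv, PReLU, batch normalization, Sigmoid, the 1x1 kernel, the PReLU slope, and the fixed normalization statistics below are all assumptions made for illustration; the function names are hypothetical.

```python
import math

def prelu(x, slope=0.25):
    # PReLU: identity for x >= 0, scaled by a slope for x < 0 (0.25 is a placeholder).
    return x if x >= 0 else slope * x

def batch_norm(x, mean, var, eps=1e-5):
    # Inference-mode batch normalization with fixed statistics (assumed, no scale/shift).
    return (x - mean) / math.sqrt(var + eps)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def occlusion_mask(feat, weight, bn_mean, bn_var):
    """feat: C_in x H x W nested lists; weight: C_out x C_in (1x1 convolution, assumed).

    Returns a C_out x H x W mask with every entry in (0, 1).
    """
    h, w = len(feat[0]), len(feat[0][0])
    mask = []
    for o, row in enumerate(weight):
        channel = []
        for i in range(h):
            out_row = []
            for j in range(w):
                z = sum(wt * feat[c][i][j] for c, wt in enumerate(row))  # 1x1 conv
                z = batch_norm(prelu(z), bn_mean[o], bn_var[o])
                out_row.append(sigmoid(z))
            channel.append(out_row)
        mask.append(channel)
    return mask

def clean_features(feat, mask):
    # Element-wise product: positions with small mask weights are suppressed.
    return [[[p * m for p, m in zip(fr, mr)] for fr, mr in zip(fc, mc)]
            for fc, mc in zip(feat, mask)]

# Toy input: one channel, 2 x 2, with hand-picked parameters.
features = [[[2.0, -1.0], [0.5, 3.0]]]
mask = occlusion_mask(features, weight=[[1.0]], bn_mean=[0.0], bn_var=[1.0])
cleaned = clean_features(features, mask)
```

In a trained model the mask weights would be learned so that occluded positions receive values near 0; here they only show the data flow.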
In a further improvement of the invention, the occlusion classification specifically comprises:
dividing the face image into a grid of rectangular cells, simulating occluded regions as rectangular combinations of cells, each of which defines an occlusion category, and building from these regions an occlusion dictionary of all occlusion categories, which also includes the unoccluded case;
selecting different types of mask pictures as occluders, choosing the occluder center at random, and compositing the occluder onto the face image;
calculating the corresponding occlusion matrix according to whether each grid cell is occluded, and looking up the matching occlusion category in the occlusion dictionary to serve as the label of the occluded face image;
feeding the labeled occluded face image into the occlusion mask generator to learn a mask related to the occlusion category;
feeding the learned mask into the occlusion classifier, with cross entropy as the loss function, to supervise the learning of the occlusion mask generator and obtain a more accurate occlusion mask.
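The labeling procedure above can be sketched as follows. The text does not give the grid size, how partially covered cells are treated, or how categories are numbered, so this sketch assumes a K x K grid, counts a cell as occluded if the occluder overlaps it at all, and numbers categories in enumeration order with 0 reserved for the unoccluded case. All function names are hypothetical.

```python
import random

def build_occlusion_dictionary(k):
    """Map every axis-aligned rectangle of grid cells to a category id.

    Keys are (top, left, bottom, right) cell indices, inclusive; None is the
    unoccluded case and gets id 0.
    """
    dictionary = {None: 0}
    for top in range(k):
        for bottom in range(top, k):
            for left in range(k):
                for right in range(left, k):
                    dictionary[(top, left, bottom, right)] = len(dictionary)
    return dictionary

def random_occluder_box(image_size, occ_w, occ_h, rng):
    """Pick a random center and return the occluder's pixel box, clipped to the image."""
    cx, cy = rng.randrange(image_size), rng.randrange(image_size)
    left, top = cx - occ_w // 2, cy - occ_h // 2
    return (max(0, left), max(0, top),
            min(image_size, left + occ_w), min(image_size, top + occ_h))

def occlusion_matrix(box, image_size, k):
    """box = (left, top, right, bottom) in pixels, right/bottom exclusive."""
    left, top, right, bottom = box
    cell = image_size / k
    return [[int(top < (i + 1) * cell and bottom > i * cell and
                 left < (j + 1) * cell and right > j * cell)
             for j in range(k)] for i in range(k)]

def label_for(matrix, dictionary):
    """Look up the category of the rectangle covered by the occluded cells."""
    cells = [(i, j) for i, row in enumerate(matrix) for j, v in enumerate(row) if v]
    if not cells:
        return dictionary[None]
    rows = [i for i, _ in cells]
    cols = [j for _, j in cells]
    return dictionary[(min(rows), min(cols), max(rows), max(cols))]

occ_dict = build_occlusion_dictionary(3)
box = random_occluder_box(120, occ_w=60, occ_h=40, rng=random.Random(0))
label = label_for(occlusion_matrix(box, 120, 3), occ_dict)
```

Because the occluder is a rectangle, the set of overlapped cells is always a rectangle of cells, so every occlusion matrix has an entry in the dictionary.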
In a further improvement of the invention, the face classification specifically comprises:
taking the face features processed by the occlusion mask as input, and supervising the model with the margin-based loss function LMCL so that it learns identity-related face features;
finally, adding the loss function of the face recognition task to the loss function of the occlusion category recognition task to form the final loss function, which supervises the model so that it converges faster and completes the face classification.
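A minimal sketch of the joint loss follows, assuming LMCL refers to the large margin cosine loss, in which features and class weights are L2-normalized, a margin m is subtracted from the target-class cosine, and the result is scaled by s before softmax cross entropy. The values s = 30 and m = 0.35 are placeholder hyperparameters, not taken from the text, and the unweighted sum of the two losses follows the description above.

```python
import math

def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cross_entropy(logits, target):
    # Numerically stable softmax cross entropy for one sample.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]

def lmcl_loss(feature, class_weights, target, s=30.0, m=0.35):
    """Large margin cosine loss for one sample (s and m are assumed values)."""
    f = l2_normalize(feature)
    cosines = [sum(a * b for a, b in zip(f, l2_normalize(w))) for w in class_weights]
    # The margin is subtracted only from the cosine of the true identity.
    logits = [s * (c - m if j == target else c) for j, c in enumerate(cosines)]
    return cross_entropy(logits, target)

def total_loss(face_loss, occlusion_loss):
    # Final loss: face recognition loss plus occlusion classification loss.
    return face_loss + occlusion_loss

# Toy example: 2-D cleaned face feature, 2 identities, 3 occlusion categories.
identity_weights = [[1.0, 0.0], [0.0, 1.0]]
face = lmcl_loss([1.0, 0.2], identity_weights, target=0)
occlusion = cross_entropy([2.0, 0.1, -1.0], target=0)
loss = total_loss(face, occlusion)
```

The margin makes the target-class logit smaller than plain cosine softmax would, so the model has to separate identities by at least m in cosine space to drive the loss down.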
An occluded face recognition device based on multi-attention and multi-scale feature learning comprises:
the face image feature extraction module, which uses a residual neural network to extract the face image features;
the channel attention map construction module, which, after the face image features are obtained, calculates attention weights for the feature map of each channel to obtain the degree of correlation between each channel and the key information;
the spatial attention map construction module, which takes the feature map refined by the channel attention map as input and calculates the degree of correlation between each pixel and the key information to obtain the multi-level features of the face image;
the multi-scale feature fusion module, which, for the multi-level features of the face image, constructs a multi-scale feature fuser with a three-layer deconvolution structure and adds feature maps of different scales element by element to obtain multi-scale feature information of the face image with different resolutions and semantic strengths;
the occlusion mask generation module, which learns a feature mask highly sensitive to the occlusion position in the input image, calculates the weights corresponding to the occluded region, and eliminates the influence of corrupted features on face recognition by assigning different weights to the features;
the occlusion classification module, which takes the feature mask learned by the occlusion mask generator as input and classifies it into an occlusion category to supervise the learning of the occlusion mask generator;
and the face classification module, which multiplies the region weights obtained by the occlusion mask generator by the multi-scale feature information of the face image to obtain the cleaned face features, and feeds them to the face classifier to obtain the face classification result.
The invention has at least the following beneficial technical effects:
The invention provides a method for recognizing occluded faces based on multi-attention and multi-scale feature learning. First, a multi-level attention mechanism is added to a face feature extraction network to extract a channel attention map and a spatial attention map of the face image; second, a multi-scale feature fuser is constructed to integrate local and global information of the image; next, an occlusion mask generator locates the occluded region and generates an occlusion mask to reduce the influence of that region; finally, a multi-task learning network performs occlusion category classification and face identity classification to obtain the final face recognition result. Compared with ordinary face recognition algorithms, the method reduces the influence of masks and other occluders on face recognition accuracy. Experiments show that the method achieves 97.76% accuracy on the occluded face recognition task, outperforming existing occluded face recognition algorithms. On unoccluded face recognition, its accuracy is comparable to that of existing methods.
The application also provides an occluded face recognition device based on multi-attention and multi-scale feature learning, comprising seven modules: a face feature extraction module, a channel attention map construction module, a spatial attention map construction module, a multi-scale feature fusion module, an occlusion mask generation module, an occlusion classification module, and a face classification module. The face feature extraction module provides deep face features to the subsequent modules; the channel attention and spatial attention modules extract the channel attention map and the spatial attention map; the multi-scale feature fusion module fuses local and global information of the image to obtain multi-scale feature information of the face image that is better suited to recognition; the occlusion mask generation module locates the occluded region and generates an occlusion mask, reducing the influence of that region and improving classification accuracy; the occlusion classification module classifies the occlusion mask and thereby supervises its generation; and the face classification module takes the cleaned face features as input to classify face identities effectively and improve classification efficiency.
Drawings
FIG. 1 is an overall process flow diagram of an occlusion face recognition algorithm based on an attention mechanism;
FIG. 2 is a schematic diagram of a channel attention map extraction process;
FIG. 3 is a schematic diagram of a spatial attention map extraction process;
FIG. 4 is a schematic diagram of a multi-level feature fusion module;
FIG. 5 is a confusion matrix for different algorithms on LFW and Occ-LFW data sets;
FIG. 6 is a functional block diagram of the occluded face recognition device based on multi-attention and multi-scale feature learning according to the present invention;
FIG. 7 is a schematic structural diagram of an electronic device implementing the occluded face recognition algorithm based on multi-attention and multi-scale feature learning according to the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art. It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict. The present invention will be described in detail below with reference to the embodiments with reference to the attached drawings.
Referring to fig. 1, in the method for recognizing occluded faces based on multi-attention and multi-scale feature learning, features are first extracted from the input face image, and the channel attention map and spatial attention map of the face image are computed by the multi-level attention mechanism; feature information with different resolutions and semantic strengths is then obtained through the multi-scale feature fuser; the occlusion category recognition branch learns the occlusion mask of the face image through the occlusion mask generator and feeds the mask into the occlusion classifier, which supervises the learning of the mask generator; and the face recognition branch applies the occlusion mask learned by the occlusion category recognition branch to the original face features to eliminate the adverse effect of occlusion on recognition and complete the occluded face recognition task. The system specifically comprises the following modules:
1. Multi-level feature extraction of the face image: comprises face image feature extraction, channel attention map acquisition, and spatial attention map acquisition, as follows:
step1, face image feature extraction: a residual neural network is used as the face image feature extraction network to extract the face image features;
step2, obtaining the channel attention map of the face image: referring to fig. 2, after the face image features are obtained, attention weights are calculated for the feature map of each channel, giving the degree of correlation between each channel and the key information. The specific steps are as follows:
First, average pooling and max pooling are applied separately to the input $F$ to aggregate spatial information, giving the feature maps $F^{c}_{avg}$ and $F^{c}_{max}$. Then $F^{c}_{avg}$ and $F^{c}_{max}$ are fed into two shared fully connected layers, denoted $\mathrm{FC}(\cdot)$, to fit the correlation among the channel features:
$\mathrm{FC}(F^{c}_{avg})$ (1)
$\mathrm{FC}(F^{c}_{max})$ (2)
In the formulas, $(W_0, b_0, W_1, b_1)$ are the weights and biases of the two fully connected layers, and $r$ is the compression ratio used to reduce the number of parameters.
Finally, the corresponding elements of expressions (1) and (2) are added and passed through a Sigmoid activation function $\sigma$ to obtain the final channel attention map $M_c(F)$:
$M_c(F) = \sigma\left(\mathrm{FC}(F^{c}_{avg}) + \mathrm{FC}(F^{c}_{max})\right)$ (3)
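The channel attention computation described in step2 can be sketched in pure Python. The text specifies two shared fully connected layers with parameters (W_0, b_0, W_1, b_1) and a compression ratio r; the ReLU between the two layers and the hidden width C / r are assumptions borrowed from common channel attention designs, and the random weights stand in for learned ones.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def shared_mlp(v, w0, b0, w1, b1):
    """Two fully connected layers; the ReLU in between is an assumption."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, v)) + b)
              for row, b in zip(w0, b0)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w1, b1)]

def channel_attention(feat, w0, b0, w1, b1):
    """feat: C x H x W nested lists. Returns one weight in (0, 1) per channel."""
    flat = [[p for row in ch for p in row] for ch in feat]
    avg = [sum(ch) / len(ch) for ch in flat]   # average pooling over H x W
    mx = [max(ch) for ch in flat]              # max pooling over H x W
    a = shared_mlp(avg, w0, b0, w1, b1)        # same weights for both branches
    m = shared_mlp(mx, w0, b0, w1, b1)
    return [sigmoid(x + y) for x, y in zip(a, m)]

# Toy example: C = 4 channels, 3 x 3 maps, compression ratio r = 2.
rng = random.Random(0)
C, H, W, r = 4, 3, 3, 2
feat = [[[rng.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
w0 = [[rng.uniform(-1, 1) for _ in range(C)] for _ in range(C // r)]
b0 = [0.0] * (C // r)
w1 = [[rng.uniform(-1, 1) for _ in range(C // r)] for _ in range(C)]
b1 = [0.0] * C

weights = channel_attention(feat, w0, b0, w1, b1)
# Refine the features by scaling each channel with its attention weight.
refined = [[[p * wt for p in row] for row in ch] for ch, wt in zip(feat, weights)]
```

The refined feature map is what the spatial attention step takes as input.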
Step3, obtaining the spatial attention map of the face image: referring to fig. 3, the feature map refined by the channel attention map is taken as input, and the degree of correlation between each pixel and the key information is calculated. The specific steps are as follows:
the module takes as input the feature map $F$ refined by the channel attention map and outputs a spatial weight map $M_s$. First, max pooling and average pooling are performed on $F$ along the channel dimension, giving the feature maps $F^{s}_{avg}$ and $F^{s}_{max}$. Then the average pooling result $F^{s}_{avg}$ and the max pooling result $F^{s}_{max}$ are concatenated into a new tensor with 2 channels, which is fed into a single convolutional layer to fit the feature correlation in the spatial dimension. (Equation (4) is missing from the source text.)
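A minimal sketch of the spatial attention step follows. The text specifies pooling along the channel dimension, concatenation into 2 channels, and one convolutional layer; the 3 x 3 kernel, zero padding, and the final Sigmoid that maps the result to (0, 1) are assumptions, and the kernel values are arbitrary stand-ins for learned weights.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def conv2d_same(maps, kernel):
    """maps: in_ch x H x W; kernel: in_ch x k x k. Stride 1, zero padding."""
    h, w = len(maps[0]), len(maps[0][0])
    k = len(kernel[0])
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for c, m in enumerate(maps):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < h and 0 <= x < w:
                            total += kernel[c][di][dj] * m[y][x]
            out[i][j] = total
    return out

def spatial_attention(feat, kernel):
    """feat: C x H x W nested lists. Returns an H x W weight map."""
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    # Pool along the channel dimension: one value per pixel.
    avg = [[sum(feat[ch][i][j] for ch in range(c)) / c for j in range(w)]
           for i in range(h)]
    mx = [[max(feat[ch][i][j] for ch in range(c)) for j in range(w)]
          for i in range(h)]
    logits = conv2d_same([avg, mx], kernel)  # concatenated 2-channel input
    return [[sigmoid(v) for v in row] for row in logits]

# Toy example: 2 channels, 3 x 3 maps, and a 2 x 3 x 3 kernel.
feat = [[[0.1, 0.5, 0.2], [0.9, 0.3, 0.4], [0.0, 0.7, 0.6]],
        [[0.8, 0.2, 0.1], [0.4, 0.6, 0.5], [0.3, 0.3, 0.9]]]
kernel = [[[0.1] * 3 for _ in range(3)], [[-0.1] * 3 for _ in range(3)]]
m_s = spatial_attention(feat, kernel)
```

Multiplying each channel of the input element-wise by this map would weight pixels by their estimated relevance.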
2. Multi-scale feature fusion of the face image: referring to fig. 4, the face image feature extraction module from step1 serves as the backbone of the multi-scale feature fuser, and a pyramid structure is built with a top-down architecture with lateral connections. The model takes the preprocessed face image as input and outputs face features $x_1$, $x_2$, $x_3$ at different scales, where $x_1$ is the base face recognition feature that needs to be cleaned, and $x_2$, $x_3$ contain local and global information at different scales. The process can be written formally as
$x_2 = \mathrm{conv}(\mathrm{upsample}(\mathrm{conv}(x_1)) + \mathrm{conv}(C_2))$ (5)
$x_3 = \mathrm{conv}(\mathrm{upsample}(\mathrm{conv}(x_2)) + \mathrm{conv}(C_3))$ (6)
where conv is the convolution operation and upsample is the upsampling operation.
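The structure of equations (5) and (6) can be sketched with nested lists. The text does not state kernel sizes, the upsampling method, or what C_2 and C_3 are; this sketch assumes 1x1 convolutions, nearest-neighbor 2x upsampling, and that the lateral inputs are backbone feature maps at twice the resolution of the previous level. The names `fuse`, `c2`, and the identity weights are illustrative only.

```python
def conv1x1(feat, weight):
    """feat: C_in x H x W; weight: C_out x C_in. Mixes channels at each pixel."""
    h, w = len(feat[0]), len(feat[0][0])
    return [[[sum(wt * feat[c][i][j] for c, wt in enumerate(row))
              for j in range(w)] for i in range(h)] for row in weight]

def upsample2x(feat):
    """Nearest-neighbor upsampling by a factor of 2 (assumed method)."""
    return [[[ch[i // 2][j // 2] for j in range(2 * len(ch[0]))]
             for i in range(2 * len(ch))] for ch in feat]

def add(a, b):
    """Element-wise sum of two feature maps with identical shapes."""
    return [[[p + q for p, q in zip(ra, rb)] for ra, rb in zip(ca, cb)]
            for ca, cb in zip(a, b)]

def fuse(x_prev, lateral, w_top, w_lat, w_out):
    """One pyramid level: conv(upsample(conv(x_prev)) + conv(lateral))."""
    top_down = upsample2x(conv1x1(x_prev, w_top))
    return conv1x1(add(top_down, conv1x1(lateral, w_lat)), w_out)

# Toy example: 2 channels; x1 is 2 x 2 and the lateral map c2 is 4 x 4.
identity = [[1.0, 0.0], [0.0, 1.0]]
x1 = [[[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5], [0.5, 0.5]]]
c2 = [[[1.0] * 4 for _ in range(4)] for _ in range(2)]
x2 = fuse(x1, c2, identity, identity, identity)  # shape 2 x 4 x 4
```

Applying `fuse` again to `x2` with a lateral map at 8 x 8 would give the third level in the same way.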
3. Occlusion mask generation: a feature mask that is highly sensitive to the occlusion position in the input image is learned, the weights corresponding to the occluded region are calculated, and the influence of corrupted features on face recognition is eliminated by assigning different weights to the features.
4. Multi-task occluded face classification model: comprises two subtasks, face classification and occlusion classification, as follows:
step1, occlusion classification: the feature mask learned by the occlusion mask generator is taken as input and classified into an occlusion category, which supervises the learning of the occlusion mask generator.
Step2, face classification: the region weights calculated by the occlusion mask generator are multiplied by the multi-scale feature information of the face image to obtain the cleaned face features, which are fed to the face classifier to obtain the face classification result.
Referring to Table 1, compared with other face recognition algorithms, the proposed algorithm performs essentially on par with ArcFace on unoccluded face comparison and improves on the FROM algorithm by 0.4%. On occluded face comparison, its accuracy is 1.2% higher than that of ArcFace and 1% higher than that of the occluded face recognition algorithm FROM. This demonstrates the effectiveness of the attention-based occluded face recognition algorithm for face comparison.
Table 1: Face comparison accuracy of the invention and of the face recognition algorithms ArcFace and FROM on the LFW and Occ-LFW data sets.
To further compare the algorithm with the others, we also computed the confusion matrices of the three algorithms on the LFW and Occ-LFW data sets; see fig. 5. On the LFW data set, FROM is the most likely to confuse same-face pairs with different-face pairs, while ArcFace is the least likely. For same-face pairs, the invention outperforms FROM and is close to ArcFace; for different-face pairs, the invention clearly outperforms FROM but is slightly behind ArcFace. On the Occ-LFW data set, ArcFace is the most likely to confuse same-face pairs with different-face pairs, while the invention is the least likely. For both same-face and different-face pairs, the invention outperforms FROM and ArcFace, which demonstrates the effectiveness of the attention-based occluded face recognition algorithm.
The invention provides an occluded face recognition device based on multi-attention and multi-scale feature learning, comprising a face feature extraction module, a channel attention map construction module, a spatial attention map construction module, a multi-scale feature fusion module, an occlusion mask generation module, an occlusion classification module, and a face classification module.
1. The face image feature extraction module uses a residual neural network to extract the face image features;
2. the channel attention map construction module, after the face image features are obtained, calculates attention weights for the feature map of each channel to obtain the degree of correlation between each channel and the key information;
3. the spatial attention map construction module takes the feature map refined by the channel attention map as input and calculates the degree of correlation between each pixel and the key information to obtain the multi-level features of the face image;
4. the multi-scale feature fusion module, for the multi-level features of the face image, constructs a multi-scale feature fuser with a three-layer deconvolution structure and adds feature maps of different scales element by element to obtain multi-scale feature information of the face image with different resolutions and semantic strengths;
5. the occlusion mask generation module learns a feature mask highly sensitive to the occlusion position in the input image, calculates the weights corresponding to the occluded region, and eliminates the influence of corrupted features on face recognition by assigning different weights to the features;
6. the occlusion classification module takes the feature mask learned by the occlusion mask generator as input and classifies it into an occlusion category to supervise the learning of the occlusion mask generator;
7. and the face classification module multiplies the region weights obtained by the occlusion mask generator by the multi-scale feature information of the face image to obtain the cleaned face features, and feeds them to the face classifier to obtain the face classification result.
Although the invention has been described in detail hereinabove with respect to a general description and specific embodiments thereof, it will be apparent to those skilled in the art that modifications or improvements may be made thereto based on the invention. Accordingly, such modifications and improvements are intended to be within the scope of the invention as claimed.
Claims (9)
1. The method is characterized in that a multi-level attention network is added on the basis of a convolutional neural network, and a channel attention diagram and a space attention diagram of a face image are extracted; secondly, constructing a multi-scale feature fusion device, integrating local and global information of the face image, and obtaining face features with stronger robustness; then, positioning an occlusion region based on an occlusion mask generator and generating an occlusion mask to reduce the influence of the occlusion region; and finally, simultaneously classifying the shielding categories and the face identities through a multi-task learning network, and obtaining the best face identification generalization effect.
2. The method for recognizing occluded faces based on multi-attention and multi-scale feature learning according to claim 1, characterized in that the method specifically comprises the following steps:
face image feature extraction: using a residual neural network as the face image feature extraction module to extract the face image features;
obtaining the channel attention map of the face image: after the face image features are obtained, computing an attention weight for the feature map of each channel to obtain the degree of correlation between each channel and the key information;
obtaining the spatial attention map of the face image: taking the feature map refined by the channel attention map as input, and computing the degree of correlation between different pixels and the key information to obtain the multi-level features of the face image;
multi-scale feature fusion of the face image: for the multi-level features of the face image, constructing a multi-scale feature fuser with a three-layer deconvolution structure, and adding feature maps of different scales element by element to obtain multi-scale feature information of the face image covering different resolutions and semantic strengths;
occlusion mask generation: learning a feature mask that is highly sensitive to the occlusion position in the input image, computing the weights corresponding to the occluded region, and eliminating the influence of corrupted features on face recognition by assigning different weights to the features;
occlusion category classification: taking the feature mask learned by the occlusion mask generator as input and classifying it into an occlusion category, so as to supervise the learning of the occlusion mask generator;
face category classification: multiplying the region weights obtained by the occlusion mask generator with the multi-scale feature information of the face image to obtain the cleaned face features, and feeding these features to the face category classifier to obtain the face classification result.
3. The method for recognizing occluded faces based on multi-attention and multi-scale feature learning according to claim 2, wherein obtaining the channel attention map of the face image specifically comprises:
applying average pooling and max pooling separately to the extracted face features to aggregate spatial information, and feeding the two results into two shared fully connected layers that fit the correlation among channel features, yielding two channel feature maps;
adding the corresponding elements of the two channel feature maps and applying a Sigmoid activation function to obtain the channel attention map of the face image, wherein each weight in the map reflects the degree of correlation between a channel and the key information.
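The two steps of claim 3 can be sketched in NumPy as follows. This is a minimal illustration, not the patented implementation: the function name `channel_attention`, the reduction ratio `r`, the ReLU between the two shared layers, and the random weights are all assumptions introduced here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """feat: (C, H, W); w1: (C // r, C); w2: (C, C // r). Returns (C,) weights."""
    avg = feat.mean(axis=(1, 2))  # average pooling over the spatial dims -> (C,)
    mx = feat.max(axis=(1, 2))    # max pooling over the spatial dims -> (C,)

    def shared_mlp(v):
        # Two fully connected layers shared by both branches.
        # The ReLU in between is an assumption; the claim does not name it.
        return w2 @ np.maximum(w1 @ v, 0.0)

    # Element-wise sum of the two branches, then Sigmoid.
    return sigmoid(shared_mlp(avg) + shared_mlp(mx))

rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2  # hypothetical sizes
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1

attn = channel_attention(feat, w1, w2)
refined = feat * attn[:, None, None]  # re-weight each channel
```

Multiplying `feat` by the per-channel weights gives the channel-refined feature map that the spatial attention step takes as input.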
4. The method for recognizing occluded faces based on multi-attention and multi-scale feature learning according to claim 2, wherein obtaining the spatial attention map of the face image specifically comprises:
applying max pooling and average pooling to the extracted face features along the channel dimension to obtain two spatial feature maps;
concatenating the two spatial feature maps and fitting the feature correlation in the spatial dimension through a convolution operation to obtain the spatial attention map, wherein each weight in the map reflects the degree of correlation between a pixel and the key information.
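A rough NumPy sketch of the spatial attention step in claim 4 is given below. The 7x7 kernel size, the final Sigmoid, and the helper names `conv2d_same` and `spatial_attention` are assumptions for illustration; the claim only specifies pooling along the channel dimension, concatenation, and a convolution.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv2d_same(x, kernel):
    """Zero-padded sliding-window filter (cross-correlation, as in most
    deep learning libraries). x: (Cin, H, W); kernel: (Cin, k, k), k odd."""
    _, k, _ = kernel.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1:]
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(xp[:, i:i + k, j:j + k] * kernel)
    return out

def spatial_attention(feat, kernel):
    """feat: (C, H, W); kernel: (2, k, k). Returns an (H, W) attention map."""
    mx = feat.max(axis=0)    # max pooling along the channel dimension
    avg = feat.mean(axis=0)  # average pooling along the channel dimension
    stacked = np.stack([mx, avg], axis=0)  # concatenate -> (2, H, W)
    # The Sigmoid is an assumption; the claim does not state the activation.
    return sigmoid(conv2d_same(stacked, kernel))

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 6, 6))            # hypothetical feature map
kernel = rng.standard_normal((2, 7, 7)) * 0.05   # hypothetical 7x7 kernel
attn = spatial_attention(feat, kernel)
refined = feat * attn[None, :, :]  # re-weight each pixel
```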
5. The method for recognizing occluded faces based on multi-attention and multi-scale feature learning according to claim 2, wherein the multi-scale feature fusion of the face image specifically comprises:
taking the obtained face image feature extraction module as the backbone of the multi-scale feature fuser, and constructing a pyramid model using a top-down architecture with lateral connections;
the input of the pyramid model is the preprocessed face image, and multi-scale feature information of the face image covering different resolutions and semantic strengths is obtained through convolution and up-sampling operations.
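The top-down pathway with lateral connections can be illustrated with the following NumPy sketch. It is a simplification under stated assumptions: nearest-neighbor up-sampling stands in for the deconvolution layers mentioned in claim 2, the lateral connections are modeled as 1x1 convolutions, and the names `upsample2x`, `lateral`, and `top_down_fuse` as well as all sizes are hypothetical.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbor 2x up-sampling of a (C, H, W) map
    (a stand-in for the deconvolution layers)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def lateral(x, w):
    """1x1 convolution that maps (Cin, H, W) to (Cout, H, W); w: (Cout, Cin)."""
    return np.einsum('oc,chw->ohw', w, x)

def top_down_fuse(levels, weights):
    """levels: backbone maps ordered fine to coarse, each half the size of
    the previous one. Returns fused maps in the same order."""
    laterals = [lateral(x, w) for x, w in zip(levels, weights)]
    fused = [laterals[-1]]  # start from the coarsest, most semantic level
    for lat in reversed(laterals[:-1]):
        # Element-wise addition of the lateral map and the up-sampled coarser map.
        fused.append(lat + upsample2x(fused[-1]))
    return fused[::-1]

rng = np.random.default_rng(0)
levels = [rng.standard_normal((4, 16, 16)),
          rng.standard_normal((8, 8, 8)),
          rng.standard_normal((16, 4, 4))]
weights = [rng.standard_normal((4, c)) * 0.1 for c in (4, 8, 16)]
pyramid = top_down_fuse(levels, weights)
```

Each fused level keeps its own resolution while receiving semantic information from the coarser levels above it.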
6. The method for recognizing occluded faces based on multi-attention and multi-scale feature learning according to claim 5, wherein the occlusion mask generation specifically comprises:
taking as input the face features that contain different scales and global information, obtaining the final occlusion mask through a convolutional network combined with a PReLU activation function, a batch normalization layer, and a Sigmoid function, and using this mask to clean the original face features corrupted by partial occlusion.
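A minimal NumPy sketch of such a mask generator follows. The layer order (convolution, batch normalization, PReLU, Sigmoid), the use of a single 1x1 convolution, the PReLU slope of 0.25, and the function names are assumptions made for illustration; the claim lists the components but not their arrangement.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def prelu(x, a=0.25):
    """PReLU with a fixed slope; in training the slope would be learned."""
    return np.where(x > 0, x, a * x)

def batch_norm(x, eps=1e-5):
    """Normalize each channel over the batch and spatial dims (no affine part)."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def mask_generator(feat, w):
    """feat: (N, C, H, W); w: (C, C) weights of a 1x1 convolution.
    Returns a mask of the same shape with values in (0, 1)."""
    z = np.einsum('oc,nchw->nohw', w, feat)
    return sigmoid(prelu(batch_norm(z)))

rng = np.random.default_rng(0)
feat = rng.standard_normal((2, 4, 8, 8))  # hypothetical fused face features
w = rng.standard_normal((4, 4)) * 0.5
mask = mask_generator(feat, w)
cleaned = feat * mask  # down-weight features corrupted by occlusion
```

Because every mask value lies between 0 and 1, the multiplication can only attenuate a feature, never amplify it.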
7. The method for recognizing occluded faces based on multi-attention and multi-scale feature learning according to claim 2, wherein the occlusion category classification specifically comprises:
dividing the face picture into a number of rectangular grid cells, simulating occluded regions with combinations of rectangles and constructing a new occlusion category for each, and building from these regions an occlusion dictionary of all occlusion categories, wherein the occlusion dictionary also includes the no-occlusion case;
selecting different types of mask pictures as occluders, and randomly choosing the occluder center to paste the occluder picture onto the face picture;
computing the corresponding occlusion matrix according to whether each grid cell is occluded, and looking up the matching occlusion category in the generated occlusion dictionary to serve as the label of the occluded face picture;
feeding the labeled occluded face picture into the occlusion mask generator to learn a mask related to the occlusion category;
feeding the learned mask into the occlusion category classifier for classification, and using cross entropy as the loss function to supervise the learning of the occlusion mask generator, so as to obtain a more accurate occlusion mask.
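The dictionary construction and labeling steps of claim 7 can be sketched as follows. The 3x3 grid, the 90-pixel image size, the rule that a cell counts as occluded if the occluder overlaps it at all, and the function names are assumptions for illustration; the claim does not fix these details.

```python
import itertools
import numpy as np

def build_occlusion_dictionary(k):
    """Map every rectangular block of cells in a k x k grid to a category id.
    Category 0 is reserved for the no-occlusion case."""
    dictionary = {tuple([0] * (k * k)): 0}
    label = 1
    spans = list(itertools.combinations_with_replacement(range(k), 2))
    for r0, r1 in spans:
        for c0, c1 in spans:
            m = np.zeros((k, k), dtype=int)
            m[r0:r1 + 1, c0:c1 + 1] = 1
            dictionary[tuple(m.ravel())] = label
            label += 1
    return dictionary

def occlusion_matrix(img_size, k, center, occ_size):
    """Mark each grid cell that the occluder box overlaps (assumed criterion)."""
    cell = img_size / k
    (cy, cx), (oh, ow) = center, occ_size
    top, bottom = max(cy - oh / 2, 0), min(cy + oh / 2, img_size)
    left, right = max(cx - ow / 2, 0), min(cx + ow / 2, img_size)
    m = np.zeros((k, k), dtype=int)
    for i in range(k):
        for j in range(k):
            rows = top < (i + 1) * cell and bottom > i * cell
            cols = left < (j + 1) * cell and right > j * cell
            m[i, j] = int(rows and cols)
    return m

dictionary = build_occlusion_dictionary(3)
# Occluder of 20 x 20 pixels pasted at the center of a 90 x 90 face picture.
matrix = occlusion_matrix(90, 3, center=(45, 45), occ_size=(20, 20))
label = dictionary[tuple(matrix.ravel())]
```

Since the occluder is an axis-aligned rectangle, the set of cells it overlaps is always a rectangular block, so the lookup succeeds whenever the occluder center lies inside the picture.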
8. The method for recognizing occluded faces based on multi-attention and multi-scale feature learning according to claim 2, wherein the face category classification specifically comprises:
taking as input the face features processed by the occlusion mask, and using the margin-based loss function LMCL (large margin cosine loss) to supervise the model in learning identity-related face features;
finally, adding the loss function of the face recognition task and the loss function of the occlusion category recognition task to form the final loss function, which supervises the model so that it converges faster and completes the face category classification.
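The joint objective can be illustrated with the NumPy sketch below. LMCL is written in its usual form, where the cosine of the target class is reduced by a margin `m` and all cosines are scaled by `s` before a softmax cross entropy. The values `s=30.0` and `m=0.35`, the unweighted sum of the two losses, and all shapes and names are assumptions for illustration.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean softmax cross entropy; logits: (N, K), labels: (N,) int."""
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(labels)), labels].mean()

def lmcl_loss(features, weights, labels, s=30.0, m=0.35):
    """Large margin cosine loss. features: (N, D); weights: (K, D)."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    logits = s * (f @ w.T)  # scaled cosine similarities, (N, K)
    logits[np.arange(len(labels)), labels] -= s * m  # margin on the target class
    return cross_entropy(logits, labels)

rng = np.random.default_rng(0)
face_feat = rng.standard_normal((4, 16))     # cleaned face features
class_w = rng.standard_normal((10, 16))      # one weight vector per identity
face_labels = np.array([0, 3, 3, 7])
occ_logits = rng.standard_normal((4, 37))    # occlusion classifier outputs
occ_labels = np.array([0, 5, 12, 36])

face_loss = lmcl_loss(face_feat, class_w, face_labels)
occ_loss = cross_entropy(occ_logits, occ_labels)
total_loss = face_loss + occ_loss  # final loss: sum of the two task losses
```

The margin makes the target class harder to satisfy during training, which pushes features of the same identity closer together in angle.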
9. A device for recognizing occluded faces based on multi-attention and multi-scale feature learning, characterized by comprising:
a face image feature extraction module, which uses a residual neural network to extract the face image features;
a channel attention map construction module, which, after the face image features are obtained, computes an attention weight for the feature map of each channel to obtain the degree of correlation between each channel and the key information;
a spatial attention map construction module, which takes the feature map refined by the channel attention map as input and computes the degree of correlation between different pixels and the key information to obtain the multi-level features of the face image;
a multi-scale feature fusion module, which, for the multi-level features of the face image, constructs a multi-scale feature fuser with a three-layer deconvolution structure and adds feature maps of different scales element by element to obtain multi-scale feature information of the face image covering different resolutions and semantic strengths;
an occlusion mask generation module, which learns a feature mask that is highly sensitive to the occlusion position in the input image, computes the weights corresponding to the occluded region, and eliminates the influence of corrupted features on face recognition by assigning different weights to the features;
an occlusion category classification module, which takes the feature mask learned by the occlusion mask generator as input and classifies it into an occlusion category, so as to supervise the learning of the occlusion mask generator;
a face category classification module, which multiplies the region weights obtained by the occlusion mask generator with the multi-scale feature information of the face image to obtain the cleaned face features, and feeds these features to the face category classifier to obtain the face classification result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211493911.3A CN115862097A (en) | 2022-11-25 | 2022-11-25 | Method and device for identifying shielding face based on multi-attention and multi-scale feature learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211493911.3A CN115862097A (en) | 2022-11-25 | 2022-11-25 | Method and device for identifying shielding face based on multi-attention and multi-scale feature learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115862097A true CN115862097A (en) | 2023-03-28 |
Family
ID=85666718
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211493911.3A Pending CN115862097A (en) | 2022-11-25 | 2022-11-25 | Method and device for identifying shielding face based on multi-attention and multi-scale feature learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115862097A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116912632A (en) * | 2023-09-12 | 2023-10-20 | 深圳须弥云图空间科技有限公司 | Target tracking method and device based on shielding |
CN116912632B (en) * | 2023-09-12 | 2024-04-12 | 深圳须弥云图空间科技有限公司 | Target tracking method and device based on shielding |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108537743B (en) | Face image enhancement method based on generation countermeasure network | |
CN110084156B (en) | Gait feature extraction method and pedestrian identity recognition method based on gait features | |
CN111444881A (en) | Fake face video detection method and device | |
CN103605972B (en) | Non-restricted environment face verification method based on block depth neural network | |
WO2019227479A1 (en) | Method and apparatus for generating face rotation image | |
CN108268859A (en) | A kind of facial expression recognizing method based on deep learning | |
CN112418041B (en) | Multi-pose face recognition method based on face orthogonalization | |
CN112801015B (en) | Multi-mode face recognition method based on attention mechanism | |
JP2016538656A (en) | Method and system for facial image recognition | |
Han et al. | Visual hand gesture recognition with convolution neural network | |
CN110222718B (en) | Image processing method and device | |
CN114511798B (en) | Driver distraction detection method and device based on transformer | |
Soni et al. | Hybrid meta-heuristic algorithm based deep neural network for face recognition | |
Sang et al. | Multi-scale context attention network for stereo matching | |
CN112766217A (en) | Cross-modal pedestrian re-identification method based on disentanglement and feature level difference learning | |
CN115731574A (en) | Cross-modal pedestrian re-identification method based on parameter sharing and feature learning of intermediate modes | |
CN115862097A (en) | Method and device for identifying shielding face based on multi-attention and multi-scale feature learning | |
CN113807232A (en) | Fake face detection method, system and storage medium based on double-flow network | |
Aslam et al. | Wavelet-based convolutional neural networks for gender classification | |
CN113688715A (en) | Facial expression recognition method and system | |
CN113706404A (en) | Depression angle human face image correction method and system based on self-attention mechanism | |
Chui et al. | Capsule networks and face recognition | |
Sang et al. | Image recognition based on multiscale pooling deep convolution neural networks | |
Jiashu | Performance analysis of facial recognition: A critical review through glass factor | |
Zhou et al. | Design of an Intelligent Laboratory Facial Recognition System Based on Expression Keypoint Extraction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||