CN115457624A - Mask wearing face recognition method, device, equipment and medium with local and overall face features cross-fused - Google Patents

Mask wearing face recognition method, device, equipment and medium with local and overall face features cross-fused

Info

Publication number
CN115457624A
Authority
CN
China
Prior art keywords
face
mask
image
feature
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210990521.0A
Other languages
Chinese (zh)
Other versions
CN115457624B (en)
Inventor
陈岸明
温峻峰
林群雄
洪小龙
孙全忠
李鑫
杜海江
罗海涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongke Tianwang Guangdong Technology Co ltd
Original Assignee
Zhongke Tianwang Guangdong Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongke Tianwang Guangdong Technology Co ltd filed Critical Zhongke Tianwang Guangdong Technology Co ltd
Priority to CN202210990521.0A priority Critical patent/CN115457624B/en
Publication of CN115457624A publication Critical patent/CN115457624A/en
Application granted granted Critical
Publication of CN115457624B publication Critical patent/CN115457624B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/169Holistic features and representations, i.e. based on the facial image taken as a whole
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/30Noise filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/171Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Human Computer Interaction (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the invention discloses a mask-wearing face recognition method, device, equipment and medium in which local and overall face features are cross-fused. An image of the user to be detected is acquired by a front-end camera; the image is input into a mask face detection model, which outputs position information of the face and the mask; according to this position information, the image is cropped to obtain the complete face region and the face region not occluded by the mask, and image denoising or enhancement is applied; the face region and the un-occluded region are input into a mask face feature extraction network, where the face region enters a trunk network to extract overall contour features, the un-occluded region enters a branch network to extract local eyebrow features, and a feature fusion module integrates the two kinds of information and outputs the fused mask face feature; finally, the mask face feature is input into a classifier to obtain the identification result of the identity of the user to be detected. The detection network can locate the face region and the mask region at the same time.

Description

Mask wearing face recognition method, device, equipment and medium with local and overall face features cross-fused
Technical Field
The embodiment of the invention relates to the technical field of face recognition, and in particular to a mask-wearing face recognition method, device, equipment and medium in which local and overall face features are cross-fused.
Background
Driven by advances in deep learning, face recognition technology has made breakthrough progress and matured rapidly, and it is now widely applied in daily life, for example in face payment, community access control and station security inspection systems. However, these applications usually require users to operate under controlled conditions; under unconstrained conditions such as poor ambient lighting or facial occlusion, recognition accuracy is often degraded.
Mask-wearing face recognition belongs to the category of occluded face recognition, and aims to solve the drop in recognition accuracy caused by the loss of facial features under mask occlusion. Two approaches are currently common: the first makes effective use of the facial features in the un-occluded region, and the second restores the occluded region, i.e. the features hidden under the mask. The former is often effective because the occlusion caused by a mask is relatively fixed in position, but it ignores the contribution of the overall face features to recognition; the latter usually employs generative adversarial networks to restore the image both locally and globally, but such methods are typically hard to train and cannot easily guarantee consistency between the restored region and the other regions. Researchers have also proposed feature-fusion methods that use human body context information to assist face recognition; such models are stable, and the fused features can effectively reduce the influence of mask occlusion on classification. However, how to enhance the discriminability of the fused information, reduce its redundancy, and thereby provide a more efficient and accurate mask face recognition method remains a problem to be addressed.
Disclosure of Invention
The embodiment of the invention aims to provide a mask-wearing face recognition method, device, equipment and medium to solve the problems described in the background art.
In order to achieve the above object, the embodiments of the present invention mainly provide the following technical solutions: a mask-wearing face recognition method in which local and overall face features are cross-fused, characterized by comprising the following steps:
acquiring an image of a user to be detected through a front-end camera;
inputting the user image to be detected into a face detection model of the mask, and outputting position information about the face and the mask;
according to the position information of the face and the mask, the image of the user to be detected is cut, a complete face area and a face area which is not shielded by the mask in the image are obtained, and image denoising or enhancing processing is carried out;
inputting the face region and the face region which is not shielded by the mask into a mask face feature extraction network, wherein the face region enters a main path network to extract overall contour features, the non-mask shielded region enters a branch path network to extract local eyebrow features, and finally outputting fused mask face features after integrating the information of the face region and the non-mask shielded region by a feature fusion module;
and inputting the facial features of the mask into a classifier to obtain an identification result of the identity of the user to be detected.
Preferably, the network structure of the mask face detection model consists of a trunk, a neck and a detection head. The trunk adopts the general-purpose feature extraction network ResNet; the neck adopts an FPN to refine the original feature maps and aggregate semantic information from different levels; the detection head adopts the SSD algorithm, and a context attention module is added to the detection head so that the network focuses on the face and mask regions;
the context attention module consists of a context-aware module and a CBAM attention module. The context-aware module has three branches containing 1, 2 and 3 stacked 3×3 convolution kernels respectively; the outputs of the three branches are merged into one feature map by a channel concatenation operation and input into the CBAM attention module;
the mask face detection model is trained with face data of users wearing masks. Each image in the face data has a label file annotated with the face position and the mask position. After an image is input into the model, the model outputs a prediction according to the extracted features, comprising the face coordinates and face confidence as well as the mask coordinates and mask confidence. A loss value between the prediction and the ground truth in the label file is calculated by a preset mask face loss function, and the mask face detection model is trained with loss reduction as the optimization objective;
the mask face loss function adopts multitask loss and consists of face position offset loss and confidence loss, and mask position offset loss and confidence loss, and the expression is as follows:
Figure RE-GDA0003933120480000041
wherein, L represents the loss value of the mask face detection model; l is a radical of an alcohol conf (. Cndot.) and L loc (. Cndot.) represents a confidence penalty function and a location offset penalty function, respectively;
Figure RE-GDA0003933120480000042
indicating whether a face exists (1 if a face exists, 0 if no face exists),
Figure RE-GDA0003933120480000043
indicating whether a mask is present (1 if a mask is present, 0 if not),
Figure RE-GDA0003933120480000044
the coordinates of the region representing the face of a person,
Figure RE-GDA0003933120480000045
coordinates representing a mask area; p is fc Representing the confidence of the predicted presence of a face, P mc Confidence, P, that mask is predicted to be present fl Representing predicted face region coordinates, P fl Representing predicted mask area coordinates; α represents a confidence penalty term factor and β represents a position offset penalty term factor.
Preferably, the image of the user to be detected is cropped according to the output of the mask face detection model. A face region is cropped from the image according to the predicted face region coordinates, and, taking the upper edge of the mask region as the boundary according to the predicted mask region coordinates, the un-occluded eyebrow region above this boundary is cropped at the same time. After cropping, the corresponding face region and eyebrow region images are obtained. To improve image quality and ensure the accuracy of subsequent feature extraction and matching, the face and eyebrow region images are further denoised or enhanced.
Preferably, the face region and eyebrow region images are input into the mask face feature extraction network to obtain the mask face feature. In order to exploit as much of the descriptive information in the masked face image as possible, the mask face feature extraction network adopts a parallel design of a trunk network and a branch network, which extract the overall face contour feature and the local eyebrow feature respectively, and the mask face feature is finally obtained through an overall-local feature fusion module. To ensure feature extraction efficiency, the trunk network and the branch network adopt two lightweight networks, InceptionV3 and MobileNet, respectively. Meanwhile, to make the trunk network focus more on the contour and appearance characteristics of the face, a CBAM attention module is connected after InceptionV3. The trunk network and the branch network are finally connected to the overall-local feature fusion module.
the integral-local feature fusion module has two stages, the first stage is an information interaction stage, the second stage is an information integration stage, and the integral profile feature respectively output by the main network and the branch network is set as F o ∈R C×H×W Local eyebrow feature is F l ∈R C×H×W Wherein R is C×H×W The spatial dimension representing the feature map is composed of the number of channels C, height H and width W, respectively, when F o And F l After the integral-local feature fusion module is input, the facial features F of the mask are obtained through an information interaction stage and an information integration stage in sequence m ∈R C×H×W
In the information interaction stage, the overall contour feature F o And local brow feature F l After the channel dimensions are compressed by one 1 × 1 convolution kernel respectively, the two are merged into a characteristic F by utilizing bilinear fusion operation b ∈R C×H×W Characteristic F b And characteristic F o And F l After channel cascade, obtaining the weight W through a 1 multiplied by 1 convolution kernel and a softmax function o ∈R C×H×W And W l ∈R C×H×W . Weight W o And W l Respectively with feature F o And F l Multiplying, and respectively passing through a 1 × 1 convolution kernel to obtain characteristic F o ′∈R C×H×W And F l ′∈R C×H×W . Finally, F is mixed b 、 F o ' and F l ' these three features are cascaded in channels to obtain the output of the first stageCharacteristic F s1 ∈R 3C×H×W . The above process can be expressed by the following formula:
F b =bilinear(conv 1×1 (F o ),conv 1×1 (F l ))
W o ,W l =softmax(conv 1×1 (cat(F o ,F b ,F l )))
F s1 =cat(conv 1×1 (F o ⊙W o ),F b ,conv 1×1 (F l ⊙W l ))
wherein, conv 1×1 (. Smallcircle.) denotes a 1 × 1 convolution operation; bilinear (·) represents a bilinear fusion operation; softmax (·) denotes a softmax function; an indication that the matrix is multiplied by an element; cat (·) denotes channel cascade operation;
in the information integration phase, feature F s1 Respectively entering an identity branch and a residual error branch. The identity branch is composed of only one 1 × 1 convolution kernel, and the residual branch is composed of a 1 × 1 convolution kernel, a 3 × 3 depth separable convolution kernel, a ReLU activation function, and a 1 × 1 convolution kernel connected in sequence. Adding the outputs of the constant branch and the residual branch, and performing regularization treatment to obtain a fused face feature F of the mask m (ii) a The above process can be represented by the following formula:
F indentity =conv 1×1 (F s1 )
F residual =conv 1×1 (relu(DWconv 3×3 (conv 1×1 (F s1 ))))
Figure RE-GDA0003933120480000061
wherein, F identity ∈R C×H×W And F residual ∈R C×H×W Respectively representing the output characteristics of the identity branch and the residual error branch; DWconv 3×3 (. Smallcircle.) represents a 3 × 3 depth separable convolution operation; reLU (·) denotes the ReLU activation function; norm (·) represents a regularization operation;
the mask face feature extraction network adopts face data of a wearing mask to train, adopts a GAN network for expanding the existing face recognition data set of the wearing mask, generates a mask for the public face recognition data set, and increases sample diversity by simulating the face data set of the mask. And meanwhile, performing similar cutting processing on the data set to obtain a human face area image in the eyebrow area and a corresponding person identity label. In the training stage, the tail end of the mask face feature extraction network is connected with a full connection layer, classification is carried out according to the extracted mask face features, a loss value between a classification result and a real value is calculated by adopting a triple loss function, and the network is trained to converge by taking a reduced loss value as a target. In the model operation stage, the mask face feature extraction network without the full connection layer is utilized to extract the mask face feature of the user to be detected from the input face area and eyebrow area images.
Preferably, the mask face feature of the user to be detected is input into a trained classifier to obtain a prediction of the user's identity. The classifier is trained in advance with registered-user samples whose features have been extracted by the mask face feature extraction network, so that it can correctly match the user's identity information according to similar mask face features.
A mask-wearing face recognition system in which local and overall face features are cross-fused, the system comprising:
the user image acquisition module is used for acquiring image data of a user to be detected so as to identify the identity; meanwhile, the method is also used for collecting images of registered users so as to train a classifier for matching identity information;
the mask face detection module is used for accurately positioning a face area and a mask area in the user image and outputting position information of the corresponding areas;
the image preprocessing module is used for cutting the user image according to the position information output by the mask face detection module and simultaneously carrying out preprocessing such as denoising and enhancing on the cut image;
the mask face feature extraction module is used for extracting the whole contour feature and the local eyebrow feature from the face region image and the eyebrow region image respectively, and fusing the information of the whole contour feature and the local eyebrow feature to obtain robust mask face features;
and the mask face matching module is used for classifying according to the extracted mask face features and matching the identity of the person registered in the database.
A computer device comprising a processor and a memory storing a program executable by the processor, wherein the processor implements the above mask-wearing face recognition method when executing the program stored in the memory.
A storage medium storing a program which, when executed by a processor, implements the above mask-wearing face recognition method.
The technical scheme provided by the embodiment of the invention at least has the following advantages:
1. The invention introduces a mask positioning task on the basis of an existing face detection network and, through multi-task learning, gives the network the ability to detect the face region and the mask region at the same time, providing a good front end for the subsequent feature extraction task;
2. In order to make fuller use of the identifiable information of a masked face and reduce the influence of mask occlusion as much as possible, the invention fuses the overall face contour features with the local eyebrow features, thereby obtaining robust mask face features and improving the accuracy of face recognition;
3. In order to strengthen the generalization ability of the mask face feature extraction network, the invention uses a GAN network to convert publicly available conventional face datasets into mask-wearing face datasets, increasing the diversity and number of data samples.
Drawings
Fig. 1 is a flowchart of a mask wearing face recognition method in which local and global face features are fused according to embodiment 1 of the present invention.
Fig. 2 is a schematic structural diagram of a mask face detection network according to embodiment 1 of the present invention.
Fig. 3 is a schematic structural diagram of the mask face feature extraction network according to embodiment 1 of the present invention.
Fig. 4 is a schematic structural diagram of a global-local feature fusion module according to embodiment 1 of the present invention.
Fig. 5 is a block diagram showing a structure of a face recognition system for a mask in embodiment 2 of the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be readily apparent to those skilled in the art from the disclosure of the present invention.
In the following description, for purposes of explanation and not limitation, specific details are set forth such as particular system structures, interfaces, techniques, etc. in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
Referring to fig. 1-5, four embodiments of the present invention are provided:
example 1:
in this embodiment, the network model is built with the PyTorch deep learning framework based on the Python programming language, and model training is completed under an Ubuntu system. The operating environment is Ubuntu 18.04.3, and the GPU is a GeForce RTX 2080 Ti.
As shown in fig. 1, the present embodiment discloses a mask-wearing face recognition method in which local and overall face features are cross-fused, which specifically includes the following steps:
s101, obtaining an image of a user to be detected through a front-end camera.
In the registration stage, the front-end camera collects image data of registered users to train the classifier used for matching user identities; in the later use stage, the camera acquires the image data of the current user to be detected.
S102, inputting the image of the user to be detected into a face detection model of the mask, and outputting position information about the face and the mask;
as shown in fig. 2, the network structure of the mask face detection model consists of a trunk, a neck and a detection head. The trunk adopts the general-purpose feature extraction network ResNet; the neck adopts an FPN to refine the original feature maps and aggregate semantic information from different levels; the detection head adopts the SSD algorithm, and a context attention module is added to the detection head so that the network focuses on the face and mask regions.
Specifically, the context attention module consists of a context-aware module and a CBAM attention module. The context-aware module has three branches containing 1, 2 and 3 stacked 3×3 convolution kernels respectively; the outputs of the three branches are merged into one feature map by a channel concatenation operation and input into the CBAM attention module.
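For reference, the following is a minimal PyTorch sketch of one possible implementation of the above context attention module; the class names, the reduction ratio and the kernel-size settings are illustrative assumptions and not limitations of the present embodiment.

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    # Channel attention of CBAM: global avg/max pooling -> shared MLP -> sigmoid gate.
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )

    def forward(self, x):
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        return x * torch.sigmoid(avg + mx)

class SpatialAttention(nn.Module):
    # Spatial attention of CBAM: channel-wise avg/max maps -> 7x7 conv -> sigmoid gate.
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):
        avg = torch.mean(x, dim=1, keepdim=True)
        mx, _ = torch.max(x, dim=1, keepdim=True)
        return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class ContextAttention(nn.Module):
    # Three context-aware branches with 1, 2 and 3 stacked 3x3 convolutions,
    # channel-concatenated and then refined by a CBAM block.
    def __init__(self, in_channels, branch_channels):
        super().__init__()
        def convs(n):
            layers, c = [], in_channels
            for _ in range(n):
                layers += [nn.Conv2d(c, branch_channels, 3, padding=1), nn.ReLU(inplace=True)]
                c = branch_channels
            return nn.Sequential(*layers)
        self.branches = nn.ModuleList([convs(1), convs(2), convs(3)])
        fused = branch_channels * 3
        self.channel_att = ChannelAttention(fused)
        self.spatial_att = SpatialAttention()

    def forward(self, x):
        out = torch.cat([b(x) for b in self.branches], dim=1)
        return self.spatial_att(self.channel_att(out))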
The mask face detection model is trained with face data of users wearing masks. Each image in the face data has a label file annotated with the face position and the mask position. After an image is input into the model, the model outputs a prediction according to the extracted features, comprising the face coordinates (upper-left and lower-right corners) and face confidence, as well as the mask coordinates (upper-left and lower-right corners) and mask confidence. A loss value between the prediction and the ground truth in the label file is calculated by a preset mask face loss function, and the mask face detection model is trained with loss reduction as the optimization objective.
The mask face loss function adopts a multi-task loss composed of the face position offset loss and confidence loss together with the mask position offset loss and confidence loss, and its expression is:

L = α·[L_conf(I_f, P_fc) + L_conf(I_m, P_mc)] + β·[L_loc(G_f, P_fl) + L_loc(G_m, P_ml)]

where L denotes the loss value of the mask face detection model; L_conf(·) and L_loc(·) denote the confidence loss function and the position offset loss function, respectively; I_f indicates whether a face is present (1 if a face exists, 0 otherwise); I_m indicates whether a mask is present (1 if a mask exists, 0 otherwise); G_f denotes the ground-truth face region coordinates and G_m the ground-truth mask region coordinates; P_fc denotes the predicted face confidence, P_mc the predicted mask confidence, P_fl the predicted face region coordinates, and P_ml the predicted mask region coordinates; α is the confidence loss weighting factor and β is the position offset loss weighting factor.
S103, cutting the image of the user to be detected according to the position information of the face and the mask to obtain a complete face area and a face area which is not shielded by the mask in the image, and denoising or enhancing the image;
specifically, the image of the user to be detected is cropped according to the output of the mask face detection model. A face region is cropped from the image according to the predicted face region coordinates, and, taking the upper edge of the mask region as the boundary according to the predicted mask region coordinates, the un-occluded eyebrow region above this boundary is cropped at the same time. After cropping, the corresponding face region and eyebrow region images are obtained. To improve image quality and ensure the accuracy of subsequent feature extraction and matching, the face and eyebrow region images are further denoised or enhanced.
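The cropping step can be sketched as follows, assuming the detection model outputs boxes as [x1, y1, x2, y2] pixel coordinates and using OpenCV non-local means denoising as one example of the denoising/enhancement processing.

import cv2

def crop_face_and_eyebrow(image, face_box, mask_box):
    """Crop the full face region and the un-occluded (eyebrow) region.

    Boxes are assumed to be [x1, y1, x2, y2] pixel coordinates predicted by
    the mask face detection model.
    """
    fx1, fy1, fx2, fy2 = [int(v) for v in face_box]
    face_region = image[fy1:fy2, fx1:fx2]

    # The upper edge of the mask box is the boundary: everything above it
    # inside the face box is the un-occluded eyebrow/eye region.
    mask_top = int(mask_box[1])
    eyebrow_region = image[fy1:min(mask_top, fy2), fx1:fx2]

    # Light denoising before feature extraction (one possible choice).
    face_region = cv2.fastNlMeansDenoisingColored(face_region, None, 5, 5, 7, 21)
    eyebrow_region = cv2.fastNlMeansDenoisingColored(eyebrow_region, None, 5, 5, 7, 21)
    return face_region, eyebrow_region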
S104, inputting the face region and the face region which is not shielded by the mask into a mask face feature extraction network, wherein the face region enters a main network to extract overall contour features, the non-mask shielded region enters a branch network to extract local eyebrow features, and finally, integrating information of the face region and the non-mask shielded region through a feature fusion module, and outputting fused mask face features;
as shown in fig. 3, in order to utilize all descriptive features in the mask face image as much as possible, the mask face feature extraction network adopts a parallel design of a main network and a branch network, and is respectively used for extracting the whole face contour feature and the local eyebrow feature, and finally obtaining the mask face feature through a whole-local feature fusion module. In order to ensure the feature extraction efficiency, the main network and the branch network respectively adopt two lightweight networks, namely Inception V3 and MobileNet. Meanwhile, in order to make the main road network focus more on the outline and appearance characteristics of the human face, a CBAM attention module is connected after the inclusion v 3. The main road network and the branch road network are finally connected with an integral-local feature fusion module.
As shown in fig. 4, the overall-local feature fusion module has two stages: the first is an information interaction stage and the second is an information integration stage. Let the overall contour feature output by the trunk network be F_o ∈ R^(C×H×W) and the local eyebrow feature output by the branch network be F_l ∈ R^(C×H×W), where R^(C×H×W) denotes a feature map whose dimensions are given by the channel number C, height H and width W. After F_o and F_l are input into the overall-local feature fusion module, they pass through the information interaction stage and the information integration stage in turn to obtain the mask face feature F_m ∈ R^(C×H×W).
In the information interaction stage, the overall contour feature F_o and the local eyebrow feature F_l each have their channel dimension compressed by a 1×1 convolution kernel, and the two results are merged into a feature F_b ∈ R^(C×H×W) by a bilinear fusion operation. Feature F_b is channel-concatenated with F_o and F_l, and the result passes through a 1×1 convolution kernel and a softmax function to obtain the weights W_o ∈ R^(C×H×W) and W_l ∈ R^(C×H×W). The weights W_o and W_l are multiplied element-wise with F_o and F_l respectively, and each product passes through a 1×1 convolution kernel to give F_o′ ∈ R^(C×H×W) and F_l′ ∈ R^(C×H×W). Finally, F_b, F_o′ and F_l′ are concatenated along the channel dimension to obtain the output feature of the first stage, F_s1 ∈ R^(3C×H×W). The above process can be expressed by the following formulas:

F_b = bilinear(conv_1×1(F_o), conv_1×1(F_l))
W_o, W_l = softmax(conv_1×1(cat(F_o, F_b, F_l)))
F_s1 = cat(conv_1×1(F_o ⊙ W_o), F_b, conv_1×1(F_l ⊙ W_l))

where conv_1×1(·) denotes a 1×1 convolution operation; bilinear(·) denotes the bilinear fusion operation; softmax(·) denotes the softmax function; ⊙ denotes element-wise multiplication; and cat(·) denotes the channel concatenation operation.
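A PyTorch sketch of the information interaction stage is given below; the bilinear fusion is approximated here by an element-wise product of the channel-compressed maps, which is an assumption, since the concrete bilinear operation is not detailed above.

import torch
import torch.nn as nn

class InteractionStage(nn.Module):
    """First stage of the overall-local feature fusion module (a sketch).

    Takes the overall contour feature F_o and the local eyebrow feature F_l,
    both of shape (B, C, H, W), and returns F_s1 of shape (B, 3C, H, W).
    """
    def __init__(self, channels):
        super().__init__()
        self.compress_o = nn.Conv2d(channels, channels, 1)
        self.compress_l = nn.Conv2d(channels, channels, 1)
        self.weight_conv = nn.Conv2d(3 * channels, 2 * channels, 1)
        self.proj_o = nn.Conv2d(channels, channels, 1)
        self.proj_l = nn.Conv2d(channels, channels, 1)

    def forward(self, f_o, f_l):
        # Bilinear fusion approximated by an element-wise product of the
        # channel-compressed maps (assumption).
        f_b = self.compress_o(f_o) * self.compress_l(f_l)

        # Weights W_o, W_l from a 1x1 conv over the concatenation, normalised
        # with a softmax across the two branches.
        w = self.weight_conv(torch.cat([f_o, f_b, f_l], dim=1))
        w_o, w_l = torch.chunk(w, 2, dim=1)
        w = torch.softmax(torch.stack([w_o, w_l], dim=0), dim=0)
        w_o, w_l = w[0], w[1]

        # F_s1 = cat(conv(F_o ⊙ W_o), F_b, conv(F_l ⊙ W_l))
        return torch.cat([self.proj_o(f_o * w_o), f_b, self.proj_l(f_l * w_l)], dim=1)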
In the information integration stage, feature F_s1 enters an identity branch and a residual branch. The identity branch consists of a single 1×1 convolution kernel, while the residual branch consists of a 1×1 convolution kernel, a 3×3 depthwise separable convolution kernel, a ReLU activation function and another 1×1 convolution kernel connected in sequence. The outputs of the identity branch and the residual branch are added and regularized to obtain the fused mask face feature F_m. The above process can be expressed by the following formulas:

F_identity = conv_1×1(F_s1)
F_residual = conv_1×1(ReLU(DWconv_3×3(conv_1×1(F_s1))))
F_m = norm(F_identity + F_residual)

where F_identity ∈ R^(C×H×W) and F_residual ∈ R^(C×H×W) denote the output features of the identity branch and the residual branch, respectively; DWconv_3×3(·) denotes a 3×3 depthwise separable convolution operation; ReLU(·) denotes the ReLU activation function; and norm(·) denotes the regularization operation.
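A corresponding sketch of the information integration stage follows; reading norm(·) as channel-wise L2 normalisation is an assumption, and a batch normalisation layer would be an equally plausible choice.

import torch.nn as nn
import torch.nn.functional as F

class IntegrationStage(nn.Module):
    """Second stage of the overall-local feature fusion module (a sketch).

    Reduces F_s1 (B, 3C, H, W) back to the fused mask face feature
    F_m (B, C, H, W) via an identity branch plus a residual branch.
    """
    def __init__(self, channels):
        super().__init__()
        c3 = 3 * channels
        self.identity = nn.Conv2d(c3, channels, 1)
        self.residual = nn.Sequential(
            nn.Conv2d(c3, c3, 1),
            nn.Conv2d(c3, c3, 3, padding=1, groups=c3),  # 3x3 depthwise convolution
            nn.ReLU(inplace=True),
            nn.Conv2d(c3, channels, 1),                  # pointwise projection
        )

    def forward(self, f_s1):
        f_m = self.identity(f_s1) + self.residual(f_s1)
        # norm(.) interpreted as L2 normalisation over the channel dimension (assumption).
        return F.normalize(f_m, dim=1)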
The mask face feature extraction network is trained with masked-face data. To expand the existing masked-face recognition datasets, a GAN network is used to synthesize masks on public face recognition datasets, and the simulated masked-face data increases sample diversity. The datasets are cropped in the same way as above to obtain face region and eyebrow region images together with the corresponding identity labels. In the training stage, a fully connected layer is attached to the end of the mask face feature extraction network and performs classification based on the extracted mask face features; a triplet loss function is used to compute the loss between the classification result and the ground truth, and the network is trained to convergence with loss reduction as the objective. In the operation stage, the mask face feature extraction network without the fully connected layer is used to extract the mask face feature of the user to be detected from the input face region and eyebrow region images.
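A simplified training step under the triplet loss may look as follows; the margin value and the assumption that the feature network returns a pooled embedding for a (face image, eyebrow image) pair are illustrative.

import torch.nn as nn

# Assumed component: `feature_net` is the mask face feature extraction network
# followed by global pooling, mapping (face_img, eyebrow_img) -> embedding.
triplet_loss = nn.TripletMarginLoss(margin=0.3)

def train_step(feature_net, optimizer, anchor, positive, negative):
    """One triplet-loss update. Each argument is a (face_img, eyebrow_img) pair."""
    optimizer.zero_grad()
    emb_a = feature_net(*anchor)
    emb_p = feature_net(*positive)
    emb_n = feature_net(*negative)
    loss = triplet_loss(emb_a, emb_p, emb_n)
    loss.backward()
    optimizer.step()
    return loss.item()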
And S105, inputting the facial features of the mask into a classifier to obtain the identification result of the identity of the user to be detected.
Specifically, the mask face feature of the user to be detected is input into a trained classifier to obtain a prediction of the user's identity. The classifier is trained in advance with registered-user samples whose features have been extracted by the mask face feature extraction network, so that it can correctly match the user's identity information according to similar mask face features.
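The identity matching can be sketched as a nearest-neighbour search over the registered feature gallery, for example by cosine similarity; the rejection threshold is an illustrative assumption.

import torch
import torch.nn.functional as F

def match_identity(query_feature, gallery_features, gallery_ids, threshold=0.6):
    """Match a mask face feature against registered users by cosine similarity.

    query_feature:    (D,) embedding of the user to be identified.
    gallery_features: (N, D) embeddings extracted at registration time.
    gallery_ids:      list of N identity labels.
    """
    sims = F.cosine_similarity(query_feature.unsqueeze(0), gallery_features, dim=1)
    score, idx = torch.max(sims, dim=0)
    if score.item() < threshold:
        return None, score.item()          # unknown user
    return gallery_ids[idx.item()], score.item()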
Those skilled in the art will appreciate that all or part of the steps in the method for implementing the above embodiments may be implemented by a program to instruct associated hardware, and the corresponding program may be stored in a computer-readable storage medium.
It should be noted that although the method operations of the above-described embodiments are depicted in the drawings in a particular order, this does not require or imply that these operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Rather, the depicted steps may change the order of execution. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.
Example 2:
as shown in fig. 5, the present embodiment provides a mask wearing face recognition system with local and global face features fused, the system includes a user image acquisition module 501, a mask face detection module 502, an image preprocessing module 503, a mask face feature extraction module 504 and a mask face matching module 505, wherein:
a user image acquisition module 501, configured to acquire image data of a user to be detected to identify an identity; and meanwhile, the method is also used for collecting images of registered users so as to train a classifier for matching identity information.
A mask face detection module 502, configured to accurately position a face region and a mask region in a user image, and output position information of the corresponding region;
the image preprocessing module 503 cuts the user image according to the position information output by the mask face detection module, and performs preprocessing such as denoising and enhancing on the cut image;
the mask face feature extraction module 504 is configured to extract an overall contour feature and a local eyebrow feature from the face region image and the eyebrow region image, respectively, and fuse information of the overall contour feature and the local eyebrow feature to obtain a robust mask face feature;
the mask face matching module 505 classifies the facial features of the mask according to the extracted facial features of the mask, and matches the identities of the persons registered in the library.
The specific implementation of each module in this embodiment may refer to embodiment 1 and is not repeated here. It should be noted that the system provided in this embodiment is only illustrated by the division of the above functional modules; in practical applications, the functions may be assigned to different functional modules as required, that is, the internal structure may be divided into different functional modules to complete all or part of the functions described above.
Example 3:
the present embodiment provides a computer device, which may be a computer, comprising a processor, a memory, an input device, a display and a network interface connected by a system bus. The processor provides computing and control capabilities; the memory includes a nonvolatile storage medium and an internal memory; the nonvolatile storage medium stores an operating system, a computer program and a database, and the internal memory provides an environment for running the operating system and the computer program in the nonvolatile storage medium. When the processor executes the computer program stored in the memory, the mask-wearing face recognition method of embodiment 1 is implemented, as follows:
acquiring an image of a user to be detected through a front-end camera;
inputting the user image to be detected into a face detection model of the mask, and outputting position information about the face and the mask;
according to the position information of the face and the mask, the image of the user to be detected is cut, a complete face area and a face area which is not shielded by the mask in the image are obtained, and image denoising or enhancement processing is carried out;
inputting the face region and the face region which is not shielded by the mask into a mask face feature extraction network, wherein the face region enters a main path network to extract overall contour features, the non-mask shielded region enters a branch path network to extract local eyebrow features, and finally outputting fused mask face features after integrating the information of the face region and the non-mask shielded region by a feature fusion module;
and inputting the facial features of the mask into a classifier to obtain an identification result of the identity of the user to be detected.
Example 4:
the present embodiment provides a storage medium, which is a computer-readable storage medium, and stores a computer program, and when the computer program is executed by a processor, the method for recognizing a face of a mask according to embodiment 1 is implemented as follows:
acquiring an image of a user to be detected through a front-end camera;
inputting the user image to be detected into a face detection model of the mask, and outputting position information about the face and the mask;
according to the position information of the face and the mask, the image of the user to be detected is cut, a complete face area and a face area which is not shielded by the mask in the image are obtained, and image denoising or enhancement processing is carried out;
inputting the face region and the face region which is not shielded by the mask into a mask face feature extraction network, wherein the face region enters a main path network to extract overall contour features, the non-mask shielded region enters a branch path network to extract local eyebrow features, and finally outputting fused mask face features after integrating the information of the face region and the non-mask shielded region by a feature fusion module;
and inputting the facial features of the mask into a classifier to obtain an identification result of the identity of the user to be detected. It should be noted that the computer readable storage medium of the present embodiment may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The above embodiments further describe the objects, technical solutions and advantages of the present invention in detail. It should be understood that they are only exemplary embodiments of the present invention and are not intended to limit its scope; any modifications, equivalent substitutions, improvements and the like made on the basis of the technical solutions of the present invention shall fall within the protection scope of the present invention.

Claims (8)

1. A mask-wearing face recognition method in which local and overall face features are cross-fused, characterized by comprising the following steps:
acquiring an image of a user to be detected through a front-end camera;
inputting the user image to be detected into a face detection model of the mask, and outputting position information about the face and the mask;
according to the position information of the face and the mask, the image of the user to be detected is cut, a complete face area and a face area which is not shielded by the mask in the image are obtained, and image denoising or enhancement processing is carried out;
inputting the face region and the face region which is not shielded by the mask into a mask face feature extraction network, wherein the face region enters a main path network to extract overall contour features, the non-mask shielded region enters a branch path network to extract local eyebrow features, and finally outputting fused mask face features after integrating the information of the face region and the non-mask shielded region by a feature fusion module;
and inputting the facial features of the mask into a classifier to obtain an identification result of the identity of the user to be detected.
2. The mask-wearing face recognition method according to claim 1, characterized in that: the network structure of the mask face detection model consists of a trunk, a neck and a detection head; the trunk adopts the general-purpose feature extraction network ResNet; the neck adopts an FPN to refine the original feature maps and aggregate semantic information from different levels; the detection head adopts the SSD algorithm, and a context attention module is added to the detection head so that the network focuses on the face and mask regions;
the context attention module consists of a context-aware module and a CBAM attention module; the context-aware module has three branches containing 1, 2 and 3 stacked 3×3 convolution kernels respectively, and the outputs of the three branches are merged into one feature map by a channel concatenation operation and input into the CBAM attention module;
the mask face detection model is trained with face data of users wearing masks; each image in the face data has a label file annotated with the face position and the mask position; after an image is input into the model, the model outputs a prediction according to the extracted features, comprising the face coordinates and face confidence as well as the mask coordinates and mask confidence; a loss value between the prediction and the ground truth in the label file is calculated by a preset mask face loss function, and the mask face detection model is trained with loss reduction as the optimization objective;
the mask face loss function adopts a multi-task loss composed of the face position offset loss and confidence loss together with the mask position offset loss and confidence loss, and its expression is:

L = α·[L_conf(I_f, P_fc) + L_conf(I_m, P_mc)] + β·[L_loc(G_f, P_fl) + L_loc(G_m, P_ml)]

wherein L denotes the loss value of the mask face detection model; L_conf(·) and L_loc(·) denote the confidence loss function and the position offset loss function, respectively; I_f indicates whether a face is present (1 if a face exists, 0 otherwise); I_m indicates whether a mask is present (1 if a mask exists, 0 otherwise); G_f denotes the ground-truth face region coordinates and G_m the ground-truth mask region coordinates; P_fc denotes the predicted face confidence, P_mc the predicted mask confidence, P_fl the predicted face region coordinates, and P_ml the predicted mask region coordinates; α is the confidence loss weighting factor and β is the position offset loss weighting factor.
3. The mask-wearing face recognition method according to claim 1, characterized in that: the image of the user to be detected is cropped according to the output of the mask face detection model; a face region is cropped from the image according to the predicted face region coordinates, and, taking the upper edge of the mask region as the boundary according to the predicted mask region coordinates, the un-occluded eyebrow region above this boundary is cropped at the same time; after cropping, the corresponding face region and eyebrow region images are obtained; to improve image quality and ensure the accuracy of subsequent feature extraction and matching, the face and eyebrow region images are further denoised or enhanced.
4. The mask-wearing face recognition method according to claim 1, characterized in that: the face region and eyebrow region images are input into the mask face feature extraction network to obtain the mask face feature; in order to exploit as much of the descriptive information in the masked face image as possible, the mask face feature extraction network adopts a parallel design of a trunk network and a branch network, which extract the overall face contour feature and the local eyebrow feature respectively, and the mask face feature is finally obtained through an overall-local feature fusion module; to ensure feature extraction efficiency, the trunk network and the branch network adopt two lightweight networks, InceptionV3 and MobileNet, respectively; meanwhile, to make the trunk network focus more on the contour and appearance characteristics of the face, a CBAM attention module is connected after InceptionV3; the trunk network and the branch network are finally connected to the overall-local feature fusion module;
the overall-local feature fusion module has two stages: the first is an information interaction stage and the second is an information integration stage; let the overall contour feature output by the trunk network be F_o ∈ R^(C×H×W) and the local eyebrow feature output by the branch network be F_l ∈ R^(C×H×W), where R^(C×H×W) denotes a feature map whose dimensions are given by the channel number C, height H and width W; after F_o and F_l are input into the overall-local feature fusion module, they pass through the information interaction stage and the information integration stage in turn to obtain the mask face feature F_m ∈ R^(C×H×W);
in the information interaction stage, the overall contour feature F_o and the local eyebrow feature F_l each have their channel dimension compressed by a 1×1 convolution kernel, and the two results are merged into a feature F_b ∈ R^(C×H×W) by a bilinear fusion operation; feature F_b is channel-concatenated with F_o and F_l, and the result passes through a 1×1 convolution kernel and a softmax function to obtain the weights W_o ∈ R^(C×H×W) and W_l ∈ R^(C×H×W); the weights W_o and W_l are multiplied element-wise with F_o and F_l respectively, and each product passes through a 1×1 convolution kernel to give F_o′ ∈ R^(C×H×W) and F_l′ ∈ R^(C×H×W); finally, F_b, F_o′ and F_l′ are concatenated along the channel dimension to obtain the output feature of the first stage, F_s1 ∈ R^(3C×H×W); the above process can be expressed by the following formulas:

F_b = bilinear(conv_1×1(F_o), conv_1×1(F_l))
W_o, W_l = softmax(conv_1×1(cat(F_o, F_b, F_l)))
F_s1 = cat(conv_1×1(F_o ⊙ W_o), F_b, conv_1×1(F_l ⊙ W_l))

wherein conv_1×1(·) denotes a 1×1 convolution operation; bilinear(·) denotes the bilinear fusion operation; softmax(·) denotes the softmax function; ⊙ denotes element-wise multiplication; and cat(·) denotes the channel concatenation operation;
in the information integration stage, feature F_s1 enters an identity branch and a residual branch; the identity branch consists of a single 1×1 convolution kernel, while the residual branch consists of a 1×1 convolution kernel, a 3×3 depthwise separable convolution kernel, a ReLU activation function and another 1×1 convolution kernel connected in sequence; the outputs of the identity branch and the residual branch are added and regularized to obtain the fused mask face feature F_m; the above process can be expressed by the following formulas:

F_identity = conv_1×1(F_s1)
F_residual = conv_1×1(ReLU(DWconv_3×3(conv_1×1(F_s1))))
F_m = norm(F_identity + F_residual)

wherein F_identity ∈ R^(C×H×W) and F_residual ∈ R^(C×H×W) denote the output features of the identity branch and the residual branch, respectively; DWconv_3×3(·) denotes a 3×3 depthwise separable convolution operation; ReLU(·) denotes the ReLU activation function; and norm(·) denotes the regularization operation;
the mask face feature extraction network is trained with masked-face data; to expand the existing masked-face recognition datasets, a GAN network is used to synthesize masks on public face recognition datasets, and the simulated masked-face data increases sample diversity; the datasets are cropped in the same way as above to obtain face region and eyebrow region images together with the corresponding identity labels; in the training stage, a fully connected layer is attached to the end of the mask face feature extraction network and performs classification based on the extracted mask face features, a triplet loss function is used to compute the loss between the classification result and the ground truth, and the network is trained to convergence with loss reduction as the objective; in the operation stage, the mask face feature extraction network without the fully connected layer is used to extract the mask face feature of the user to be detected from the input face region and eyebrow region images.
5. The mask-wearing face recognition method according to claim 1, wherein: the mask face feature of the user to be identified is input into a trained classifier to obtain a prediction of the user's identity. The classifier is trained in advance on registered-user samples whose features have been extracted by the mask face feature extraction network, so that it can correctly match similar mask face features to the corresponding user identity information.
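One simple way to realize such a classifier is nearest-neighbour matching over the registered users' stored features, for example by cosine similarity as sketched below. The similarity measure, the rejection threshold and the function names are illustrative assumptions; the claim only requires a classifier trained on registered-user features.

```python
import numpy as np


def match_identity(query_feat: np.ndarray, gallery: dict, threshold: float = 0.5):
    """Return the registered identity whose stored feature best matches the query,
    or None if no similarity reaches the threshold."""
    def cosine(a: np.ndarray, b: np.ndarray) -> float:
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

    best_id, best_sim = None, -1.0
    for user_id, feat in gallery.items():  # gallery: identity -> registered feature
        sim = cosine(query_feat, feat)
        if sim > best_sim:
            best_id, best_sim = user_id, sim
    return best_id if best_sim >= threshold else None


# Example: a two-person gallery and a query closest to user "alice".
if __name__ == "__main__":
    gallery = {"alice": np.array([1.0, 0.0]), "bob": np.array([0.0, 1.0])}
    print(match_identity(np.array([0.9, 0.1]), gallery))  # -> "alice"
```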
6. A mask-wearing face recognition device with cross-fusion of local and overall face features, comprising:
a user image acquisition module, used for acquiring image data of the user to be identified for identity recognition, and also for collecting images of registered users to train the classifier that matches identity information;
a mask face detection module, used for accurately locating the face region and the mask region in the user image and outputting the position information of the corresponding regions;
an image preprocessing module, used for cropping the user image according to the position information output by the mask face detection module, and for performing preprocessing such as denoising and enhancement on the cropped image;
a mask face feature extraction module, used for extracting the overall contour feature and the local eyebrow feature from the face region image and the eyebrow region image respectively, and fusing the information of the two features to obtain a robust mask face feature;
and a mask face matching module, used for classifying according to the extracted mask face feature and matching the identity of a person registered in the database.
7. A computer device, comprising a processor and a memory for storing a program executable by the processor, wherein when the processor executes the program stored in the memory, the above mask-wearing face recognition method is implemented.
8. A storage medium storing a program which, when executed by a processor, implements the above mask-wearing face recognition method.
CN202210990521.0A 2022-08-18 2022-08-18 Face recognition method, device, equipment and medium for wearing mask by cross fusion of local face features and whole face features Active CN115457624B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210990521.0A CN115457624B (en) 2022-08-18 2022-08-18 Face recognition method, device, equipment and medium for wearing mask by cross fusion of local face features and whole face features

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210990521.0A CN115457624B (en) 2022-08-18 2022-08-18 Face recognition method, device, equipment and medium for wearing mask by cross fusion of local face features and whole face features

Publications (2)

Publication Number Publication Date
CN115457624A true CN115457624A (en) 2022-12-09
CN115457624B CN115457624B (en) 2023-09-01

Family

ID=84298205

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210990521.0A Active CN115457624B (en) 2022-08-18 2022-08-18 Face recognition method, device, equipment and medium for wearing mask by cross fusion of local face features and whole face features

Country Status (1)

Country Link
CN (1) CN115457624B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116311553A (en) * 2023-05-17 2023-06-23 武汉利楚商务服务有限公司 Human face living body detection method and device applied to semi-occlusion image
CN116883670A (en) * 2023-08-11 2023-10-13 智慧眼科技股份有限公司 Anti-shielding face image segmentation method
CN116895091A (en) * 2023-07-24 2023-10-17 山东睿芯半导体科技有限公司 Facial recognition method and device for incomplete image, chip and terminal
CN117152575A (en) * 2023-10-26 2023-12-01 吉林大学 Image processing apparatus, electronic device, and computer-readable storage medium


Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111461028A (en) * 2020-04-02 2020-07-28 杭州视在科技有限公司 Mask detection model training and detection method, medium and device in complex scene
CN111914630A (en) * 2020-06-19 2020-11-10 北京百度网讯科技有限公司 Method, apparatus, device and storage medium for generating training data for face recognition
WO2021174880A1 (en) * 2020-09-01 2021-09-10 平安科技(深圳)有限公司 Feature extraction model training method, facial recognition method, apparatus, device and medium
JP2022053158A (en) * 2020-09-24 2022-04-05 エヌ・ティ・ティ・コミュニケーションズ株式会社 Information processor, information processing method, and information processing program
CN112597867A (en) * 2020-12-17 2021-04-02 佛山科学技术学院 Face recognition method and system for mask, computer equipment and storage medium
CN112560828A (en) * 2021-02-25 2021-03-26 佛山科学技术学院 Lightweight mask face recognition method, system, storage medium and equipment
CN113158883A (en) * 2021-04-19 2021-07-23 汇纳科技股份有限公司 Face recognition method, system, medium and terminal based on regional attention
CN113283405A (en) * 2021-07-22 2021-08-20 第六镜科技(北京)有限公司 Mask detection method and device, computer equipment and storage medium
CN113807332A (en) * 2021-11-19 2021-12-17 珠海亿智电子科技有限公司 Mask robust face recognition network, method, electronic device and storage medium
CN114220143A (en) * 2021-11-26 2022-03-22 华南理工大学 Face recognition method for wearing mask
CN114360033A (en) * 2022-03-18 2022-04-15 武汉大学 Mask face recognition method, system and equipment based on image convolution fusion network

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
AHMAD ALZUBI et al.: "Masked face recognition using deep learning: A review", vol. 10, no. 21, pages 1-35 *
MINGYUAN XU et al.: "Mask wearing detection method based on SSD-mask algorithm", pages 138-143 *
尹斌军 (Yin Binjun): "Design and FPGA verification of a traffic sign detection and recognition algorithm based on convolutional neural networks", no. 2020, pages 035-256 *
郭富海 (Guo Fuhai): "Mask-wearing detection based on YOLOv4 block weight pruning and its embedded implementation", no. 2022, pages 138-515 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116311553A (en) * 2023-05-17 2023-06-23 武汉利楚商务服务有限公司 Human face living body detection method and device applied to semi-occlusion image
CN116311553B (en) * 2023-05-17 2023-08-15 武汉利楚商务服务有限公司 Human face living body detection method and device applied to semi-occlusion image
CN116895091A (en) * 2023-07-24 2023-10-17 山东睿芯半导体科技有限公司 Facial recognition method and device for incomplete image, chip and terminal
CN116883670A (en) * 2023-08-11 2023-10-13 智慧眼科技股份有限公司 Anti-shielding face image segmentation method
CN116883670B (en) * 2023-08-11 2024-05-14 智慧眼科技股份有限公司 Anti-shielding face image segmentation method
CN117152575A (en) * 2023-10-26 2023-12-01 吉林大学 Image processing apparatus, electronic device, and computer-readable storage medium
CN117152575B (en) * 2023-10-26 2024-02-02 吉林大学 Image processing apparatus, electronic device, and computer-readable storage medium

Also Published As

Publication number Publication date
CN115457624B (en) 2023-09-01

Similar Documents

Publication Publication Date Title
CN115457624A (en) Mask wearing face recognition method, device, equipment and medium with local and overall face features cross-fused
CN110175527B (en) Pedestrian re-identification method and device, computer equipment and readable medium
CN109145742B (en) Pedestrian identification method and system
Steffens et al. Personspotter-fast and robust system for human detection, tracking and recognition
CN109819208A (en) A kind of dense population security monitoring management method based on artificial intelligence dynamic monitoring
CN110751022A (en) Urban pet activity track monitoring method based on image recognition and related equipment
CN112668475B (en) Personnel identity identification method, device, equipment and readable storage medium
CN112085010A (en) Mask detection and deployment system and method based on image recognition
CN111860169B (en) Skin analysis method, device, storage medium and electronic equipment
CN111626243B (en) Mask face shielding identity recognition method and device and storage medium
CN110378221A (en) A kind of power grid wire clamp detects and defect identification method and device automatically
CN103237201A (en) Case video studying and judging method based on social annotation
CN111639602B (en) Pedestrian shielding and orientation detection method
Ryumina et al. A novel method for protective face mask detection using convolutional neural networks and image histograms
He et al. Semi-supervised skin detection by network with mutual guidance
CN112597867A (en) Face recognition method and system for mask, computer equipment and storage medium
CN113657355A (en) Global and local perception pedestrian re-identification method fusing segmentation information
CN114663807A (en) Smoking behavior detection method based on video analysis
Li et al. Distracted driving detection by combining ViT and CNN
Hu et al. Hierarchical attention vision transformer for fine-grained visual classification
Guo et al. Design of a smart art classroom system based on Internet of Things
CN110929711A (en) Method for automatically associating identity information and shape information applied to fixed scene
Peng et al. Masked face detection based on locally nonlinear feature fusion
CN114241556A (en) Non-perception face recognition attendance checking method and device
Oufqir et al. Deep Learning for the Improvement of Object Detection in Augmented Reality

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant