CN114612980A - Deformed face detection based on multi-azimuth fusion attention - Google Patents
- Publication number
- CN114612980A
- Authority
- CN
- China
- Prior art keywords
- attention
- face
- azimuth
- convolution
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computational Linguistics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Evolutionary Biology (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a morphed face detection method based on multi-directional fusion attention, aimed at face morphing detection, comprising the following steps: 1) segment and normalize the face in the image according to the eye coordinates detected by the dlib landmark detector; 2) to recover the positional information that channel attention ignores, propose a new attention module; 3) fuse a dual-branch convolutional network to improve detection accuracy; 4) classify the final feature map with an SVM.
Description
Technical Field
The invention relates to the field of face morphing attack detection, and in particular to a morphed face detection technique based on multi-directional fusion attention.
Background
Face recognition technology has achieved great success in the security field. Over the past few years, however, researchers have identified various potential weaknesses of biometric systems. Recently, vulnerabilities of face and fingerprint recognition to morphed biometric images and templates have been demonstrated. Morphing techniques can be used to create artificial biometric samples that resemble the biometric information of two (or more) data subjects in both the image and feature domains. If an image or template containing the morphed feature information of several individuals is enrolled in a biometric system, each subject contributing to the morph can successfully authenticate against the single enrolled template. The unique link between an individual and his or her biometric reference data is thereby broken.
Such attacks pose a serious security risk to biometric systems, in particular to widely deployed border control systems and electronic travel documents. Several commercial face recognition systems have been shown to be highly vulnerable to such attacks. Because of the high intra-class variability of faces, face recognition systems are operated at false match rates (FMR) as high as 0.1% in order to achieve acceptable false non-match rates (FNMR). Automatic detection of morphed face images is therefore crucial to the security of an operational face recognition system.
To address these potential weaknesses of face recognition systems, the detection of face morphing attacks has become an urgent problem. Existing face morphing attack detection methods fall mainly into four algorithmic categories: morphing detection based on texture features, on image quality, on deep learning, and on hybrid features. Texture-based methods capture the changes in micro-texture introduced by the morphing process to detect morphed faces; image-quality-based methods detect morphed faces by quantifying differences in compression artifacts and noise introduced during morphing; recent deep-learning-based methods extract features with pre-trained CNN architectures and detect face morphs by classification. However, these methods still suffer from high error rates, poor robustness, and high network complexity.
Disclosure of Invention
In view of the above disadvantages of the prior art, the present invention provides a method for detecting morphed faces based on multi-directional fusion attention. The method aims to solve the problems of high error rates, poor robustness, and large parameter counts in existing methods.
To achieve the above object, the present invention provides a framework based on multi-directional fusion attention, comprising the following steps:
A1, preprocessing the input image;
A2, passing through a dual-branch convolutional network module;
A3, passing through a multi-directional fusion attention module;
A4, classification.
The invention provides a morphed face detection method based on multi-directional fusion attention. Compared with the prior art, it has the following beneficial effects:
The scheme adopts a deep learning approach and detects morphed faces mainly by combining a multi-directional fusion attention module with a dual-branch convolutional network. Morphed face detection with conventional attention mechanisms has already had some success. Unlike channel attention, which compresses the feature tensor into a single feature vector by 2-D global pooling, multi-directional fusion attention factorizes channel attention into two 1-D feature-encoding processes that aggregate features along the two spatial directions. In this way, long-range dependencies can be captured along one spatial direction while precise positional information is preserved along the other. The resulting feature maps are then encoded into a pair of direction-aware and position-sensitive attention maps, which are applied complementarily to the input feature map to enhance the representation of the object of interest. Combined with the dual-branch convolutional network, more informative features can be captured, the differences between genuine and morphed face images are better distinguished, and morphed faces can be detected reliably.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description show only some embodiments of the present invention, and that those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a block diagram of morphed face detection based on multi-directional fusion attention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The invention is described in detail below with reference to the drawings and the detailed description. As shown in Fig. 1, the morphed face detection method based on multi-directional fusion attention includes steps A1-A4:
A1, preprocessing the input image;
A2, passing through a dual-branch convolutional network module;
A3, passing through a multi-directional fusion attention module;
A4, classification.
Each step is described in detail below.
In step A1, note that in a face morphing attack the face region is usually located at the center of the image. To extract features accurately, only the largest central region of the image is retained. In the preprocessing stage, the face is segmented from the image and normalized according to the eye coordinates detected by the dlib landmark detector.
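A minimal sketch of this normalization step is given below. It assumes the eye coordinates have already been obtained (e.g. from dlib's 68-point landmark predictor, averaging each eye's six points); the target eye positions and the nearest-neighbour warp are illustrative choices, not the patent's exact procedure, while the 224×224 output size follows claim 2.

```python
import numpy as np

def normalize_face(image: np.ndarray, left_eye, right_eye, out_size=224):
    """Rotate, scale, and crop a face so the eyes land at fixed positions.

    image: H x W x 3 uint8 array; left_eye/right_eye: (x, y) pixel
    coordinates of the detected eye centers.
    """
    left_eye = np.asarray(left_eye, dtype=float)
    right_eye = np.asarray(right_eye, dtype=float)
    # Desired eye locations in the output crop (fractions of out_size).
    dst_left = np.array([0.35 * out_size, 0.40 * out_size])
    dst_right = np.array([0.65 * out_size, 0.40 * out_size])

    # Similarity transform (rotation + uniform scale + translation)
    # mapping the detected eye pair onto the target eye pair.
    src_vec = right_eye - left_eye
    dst_vec = dst_right - dst_left
    angle = np.arctan2(dst_vec[1], dst_vec[0]) - np.arctan2(src_vec[1], src_vec[0])
    scale = np.linalg.norm(dst_vec) / np.linalg.norm(src_vec)
    c, s = scale * np.cos(angle), scale * np.sin(angle)
    R = np.array([[c, -s], [s, c]])
    t = dst_left - R @ left_eye

    # Inverse-warp with nearest-neighbour sampling (keeps the sketch
    # dependency-free; a real pipeline would use cv2.warpAffine).
    Rinv = np.linalg.inv(R)
    ys, xs = np.mgrid[0:out_size, 0:out_size]
    src = np.einsum('ij,jhw->ihw', Rinv, np.stack([xs - t[0], ys - t[1]]))
    sx = np.clip(np.round(src[0]).astype(int), 0, image.shape[1] - 1)
    sy = np.clip(np.round(src[1]).astype(int), 0, image.shape[0] - 1)
    return image[sy, sx]
```

The resulting 224×224 crop is what the dual-branch network in step A2 consumes.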
In step A2, given the input feature map X, two feature maps U1 and U2 are generated from it by a 3×3 grouped convolution and a 3×3 dilated convolution (5×5 receptive field), respectively. The two feature maps are then added to generate a new feature map, which is passed through the multi-directional fusion attention module and then through the two functions a and b, and the resulting function values are multiplied with the original U1 and U2. Since the function values of a and b sum to 1, this assigns weights to the feature maps of the two branches; and because the convolution kernels of the branches differ in size, the network can select a suitable kernel on its own. (The matrices A and B in the functions a and b are initialized before training and both have size C×d; z is the feature map output by the multi-directional fusion attention module before entering the a and b functions.) Here we have:
a_c = e^(A_c z) / (e^(A_c z) + e^(B_c z)) (1)
b_c = e^(B_c z) / (e^(A_c z) + e^(B_c z)) (2)
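The branch-weighting step above can be sketched in a few lines of numpy. The convolutions themselves are outside the sketch; as an assumption, z is stood in for by a global average pool of U1 + U2 with d = C, so that A @ z and B @ z give the per-channel logits whose softmax yields a and b with a + b = 1:

```python
import numpy as np

def selective_fusion(U1, U2, A, B):
    """SK-style two-branch fusion with channel-wise soft attention.

    U1, U2 : (C, H, W) feature maps from the grouped-3x3 and dilated-3x3
             branches.
    A, B   : (C, d) matrices as in the text (here d = C); z is a
             d-dimensional channel descriptor standing in for the
             attention-module output.
    """
    z = (U1 + U2).mean(axis=(1, 2))        # (C,) global channel descriptor
    la, lb = A @ z, B @ z                  # per-channel logits for each branch
    m = np.maximum(la, lb)                 # stabilise the two-way softmax
    ea, eb = np.exp(la - m), np.exp(lb - m)
    a = ea / (ea + eb)                     # a + b == 1 per channel
    b = eb / (ea + eb)
    # Weight each branch's feature map and sum them.
    return a[:, None, None] * U1 + b[:, None, None] * U2, a, b
```

Because a and b are a two-way softmax per channel, the network effectively chooses, channel by channel, between the 3×3 and the dilated (5×5-receptive-field) kernel.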
in step A3, a multi-orientation blending attention module is designed. The method comprises the following steps: given an input X, each channel is first encoded along a horizontal and vertical coordinate, respectively, using a posing kernel of size (H,1) or (1, W). Thus, the output of the c-th channel of height h can be expressed as:
Likewise, the output of the c-th channel at width w can be written as:
z_c^w(w) = (1/H) Σ_{0≤j<H} x_c(j, w) (4)
the 2 transformations respectively aggregate features along two spatial directions to obtain a pair of direction-sensing feature maps. After passing through the transform in the information embedding, the section subjects the above transform to a convert operation, and then subjects it to a transform operation using a convolution transform function:
f = δ(F1([z^h, z^w])) (5)
where [·, ·] denotes the concatenation operation along the spatial dimension, δ is a nonlinear activation function, and f is the intermediate feature map encoding spatial information in the horizontal and vertical directions. f is then split along the spatial dimension into two separate tensors f^h ∈ R^(C/r×H) and f^w ∈ R^(C/r×W), where r is the reduction ratio, as in the SE block. Another two 1×1 convolution transformations, F_h and F_w, transform f^h and f^w into tensors with the same number of channels as the input X, yielding:
g^h = σ(F_h(f^h)) (6)
g^w = σ(F_w(f^w)) (7)
where σ is the sigmoid activation function. To reduce the complexity and computational overhead of the model, a suitable reduction ratio r (e.g., 16) is typically used here to reduce the number of channels of f. The outputs g^h and g^w are then expanded and used as attention weights, respectively.
Finally, the output of the multi-directional fusion attention module can be written as:
y_c(i, j) = x_c(i, j) × g_c^h(i) × g_c^w(j) (8)
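Steps (3)-(8) can be sketched end to end in numpy. As simplifying assumptions, the 1×1 convolutions F1, F_h, F_w are written as plain matrix multiplications over the channel axis (biases, batch dimension, and any normalization omitted), and δ is taken as ReLU:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def multi_directional_attention(X, W1, Wh, Ww):
    """Direction-factorized (coordinate-style) attention, per the description.

    X  : (C, H, W) input feature map.
    W1 : (C//r, C) weights of the shared 1x1 conv F1.
    Wh : (C, C//r) weights of F_h;  Ww : (C, C//r) weights of F_w.
    """
    C, H, W = X.shape
    zh = X.mean(axis=2)                   # (C, H): average over width  -> eq. (3)
    zw = X.mean(axis=1)                   # (C, W): average over height -> eq. (4)
    z = np.concatenate([zh, zw], axis=1)  # (C, H+W): concat along spatial dim
    f = np.maximum(W1 @ z, 0.0)           # (C//r, H+W): F1 then ReLU  -> eq. (5)
    fh, fw = f[:, :H], f[:, H:]           # split back into the two directions
    gh = sigmoid(Wh @ fh)                 # (C, H) height attention    -> eq. (6)
    gw = sigmoid(Ww @ fw)                 # (C, W) width attention     -> eq. (7)
    # y_c(i, j) = x_c(i, j) * g_c^h(i) * g_c^w(j)                      -> eq. (8)
    return X * gh[:, :, None] * gw[:, None, :]
```

Since g^h and g^w lie in (0, 1), the module can only attenuate the input, re-weighting each position by its row and column attention.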
in step a4, the last 1 key step of the present invention is to determine the human face by finding the optimal classification model through a high-discrimination machine learning algorithm. The support vector machine containing the radial basis function is selected as the classifier, and the classifier not only has high classification accuracy, but also is widely applied to research subjects such as face recognition and the like. And sending the features subjected to the dimensionality reduction in the last step into the SVM, and finishing the detection of the deformed human face according to the output data of the SVM.
The invention provides a morphed face detection method based on multi-directional fusion attention; its innovations include the following:
a method for detecting a deformed human face based on multi-azimuth fusion attention is provided. The method carries out the detection of the deformed human face by the interaction of a multi-azimuth convergence attention module and a double-branch convolution network. The new attention module can better capture the difference between real and deformed human face images and is beneficial to reliably detecting deformed human faces.
A method is proposed to compensate for channel attention's neglect of positional information. Conventional channel attention only re-weights the importance of each channel using inter-channel relationships and ignores positional information, yet positional information is important for generating spatially selective attention maps. A new attention block is therefore introduced that considers not only inter-channel relationships but also the positional information of the feature space.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention, and all modifications and equivalents of the present invention, which are made by the contents of the present specification and the accompanying drawings, or directly/indirectly applied to other related technical fields, are included in the scope of the present invention.
Claims (5)
1. A morphed face detection method based on multi-directional fusion attention, characterized in that the method is executed by a computer and comprises the following steps:
A1, preprocessing the input image;
A2, passing through a dual-branch convolutional network module;
A3, passing through a multi-directional fusion attention module;
A4, classification.
2. The multi-directional fusion attention detection method as claimed in claim 1, wherein the normalized region is cropped to 224×224 pixels to ensure that the morph-detection algorithm is applied only to the face region, and A1 is implemented as follows: in a face morphing attack, the face region is usually located at the center of the image. To extract features accurately, only the largest central region of the image is retained. In the preprocessing stage, the face is segmented from the image and normalized according to the eye coordinates detected by the dlib landmark detector.
3. The multi-directional fusion attention detection method as claimed in claim 1, wherein A2 is implemented as follows: given the input feature map X, two feature maps U1 and U2 are generated from it by a 3×3 grouped convolution and a 3×3 dilated convolution (5×5 receptive field), respectively. The two feature maps are then added to generate a new feature map, which is passed through the multi-directional fusion attention module and then through the two functions a and b, and the resulting function values are multiplied with the original U1 and U2. Since the function values of a and b sum to 1, this assigns weights to the feature maps of the two branches; and because the convolution kernels of the branches differ in size, the network can select a suitable kernel on its own (the matrices A and B in the functions a and b are initialized before training and both have size C×d; z is the feature map output by the multi-directional fusion attention module before entering the a and b functions):
a_c = e^(A_c z) / (e^(A_c z) + e^(B_c z)) (1)
b_c = e^(B_c z) / (e^(A_c z) + e^(B_c z)) (2)
4. The multi-directional fusion attention detection method as claimed in claim 1, wherein A3 is implemented as follows: given an input X, each channel is first encoded along the horizontal and the vertical coordinate using pooling kernels of size (H, 1) and (1, W), respectively. The output of the c-th channel at height h can thus be expressed as:
z_c^h(h) = (1/W) Σ_{0≤i<W} x_c(h, i) (3)
Likewise, the output of the c-th channel at width w can be written as:
z_c^w(w) = (1/H) Σ_{0≤j<H} x_c(j, w) (4)
the 2 transformations respectively aggregate features along two spatial directions to obtain a pair of direction-sensing feature maps. After passing through the transform in the information embedding, the section subjects the above transform to a convert operation, and then subjects it to a transform operation using a convolution transform function:
f = δ(F1([z^h, z^w])) (5)
In the formula, [·, ·] denotes the concatenation operation along the spatial dimension, δ is a nonlinear activation function, and f is the intermediate feature map encoding spatial information in the horizontal and vertical directions. f is then split along the spatial dimension into two separate tensors f^h ∈ R^(C/r×H) and f^w ∈ R^(C/r×W), where r is the reduction ratio, as in the SE block. Another two 1×1 convolution transformations, F_h and F_w, transform f^h and f^w into tensors with the same number of channels as the input X, yielding:
g^h = σ(F_h(f^h)) (6)
g^w = σ(F_w(f^w)) (7)
where σ is the sigmoid activation function. To reduce the complexity and computational overhead of the model, a suitable reduction ratio r (e.g., 16) is typically used here to reduce the number of channels of f. The outputs g^h and g^w are then expanded and used as attention weights, respectively.
Finally, the output of the multi-directional fusion attention module can be written as:
y_c(i, j) = x_c(i, j) × g_c^h(i) × g_c^w(j) (8)
5. The multi-directional fusion attention detection method as claimed in claim 1, wherein the dimensionality-reduced features are classified by an SVM, and A4 is implemented as follows: the last key step of the invention is to find an optimal classification model through a highly discriminative machine-learning algorithm in order to decide whether a face is genuine or morphed. A support vector machine with a radial basis function kernel is chosen as the classifier; it not only achieves high classification accuracy but is also widely used in research topics such as face recognition. The dimensionality-reduced features from the previous step are fed into the SVM, and the detection of a morphed face is completed according to the SVM output.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210235051.7A CN114612980A (en) | 2022-03-11 | 2022-03-11 | Deformed face detection based on multi-azimuth fusion attention |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210235051.7A CN114612980A (en) | 2022-03-11 | 2022-03-11 | Deformed face detection based on multi-azimuth fusion attention |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114612980A true CN114612980A (en) | 2022-06-10 |
Family
ID=81863062
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210235051.7A Pending CN114612980A (en) | 2022-03-11 | 2022-03-11 | Deformed face detection based on multi-azimuth fusion attention |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114612980A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115035315A (en) * | 2022-06-17 | 2022-09-09 | 佛山科学技术学院 | Tile color difference grading detection method and system based on attention mechanism |
CN117523636A (en) * | 2023-11-24 | 2024-02-06 | 北京远鉴信息技术有限公司 | Face detection method and device, electronic equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11669607B2 (en) | ID verification with a mobile device | |
US10956719B2 (en) | Depth image based face anti-spoofing | |
CN108229427B (en) | Identity security verification method and system based on identity document and face recognition | |
JP6403233B2 (en) | User authentication method, apparatus for executing the same, and recording medium storing the same | |
CN102902959B (en) | Face recognition method and system for storing identification photo based on second-generation identity card | |
CN111191655B (en) | Object identification method and device | |
Fourati et al. | Anti-spoofing in face recognition-based biometric authentication using image quality assessment | |
US8160880B2 (en) | Generalized object recognition for portable reading machine | |
WO2015149534A1 (en) | Gabor binary pattern-based face recognition method and device | |
CN105550658A (en) | Face comparison method based on high-dimensional LBP (Local Binary Patterns) and convolutional neural network feature fusion | |
CN114612980A (en) | Deformed face detection based on multi-azimuth fusion attention | |
CN109800643A (en) | A kind of personal identification method of living body faces multi-angle | |
CN111144366A (en) | Strange face clustering method based on joint face quality assessment | |
CN107220627B (en) | Multi-pose face recognition method based on collaborative fuzzy mean discrimination analysis | |
US11238271B2 (en) | Detecting artificial facial images using facial landmarks | |
CN106485253B (en) | A kind of pedestrian of maximum particle size structured descriptor discrimination method again | |
CN110427972B (en) | Certificate video feature extraction method and device, computer equipment and storage medium | |
CN105760815A (en) | Heterogeneous human face verification method based on portrait on second-generation identity card and video portrait | |
CN111767877A (en) | Living body detection method based on infrared features | |
Querini et al. | Facial biometrics for 2D barcodes | |
Antil et al. | A two stream face anti-spoofing framework using multi-level deep features and ELBP features | |
Li et al. | A robust framework for multiview age estimation | |
Günay Yılmaz et al. | Face presentation attack detection performances of facial regions with multi-block LBP features | |
CN110276263B (en) | Face recognition system and recognition method | |
CN111368803A (en) | Face recognition method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||