CN114612980A - Deformed face detection based on multi-azimuth fusion attention - Google Patents

Deformed face detection based on multi-azimuth fusion attention

Info

Publication number: CN114612980A
Application number: CN202210235051.7A
Authority: CN (China)
Legal status: Pending (an assumption, not a legal conclusion)
Prior art keywords: attention, face, azimuth, convolution, image
Inventors: Peng Yefan (彭烨凡), Long Min (龙敏), Xu Qihang (徐启航)
Applicant and current assignee: Changsha University of Science and Technology
Priority/filing date: 2022-03-11
Publication date: 2022-06-10
Original language: Chinese (zh)

Classifications

    • G06F18/2411 — Classification techniques based on the proximity to a decision surface, e.g. support vector machines
    • G06F18/253 — Fusion techniques of extracted features
    • G06N3/045 — Neural network architectures; combinations of networks
    • G06N3/048 — Activation functions


Abstract

The invention provides a deformed face detection method based on multi-azimuth fusion attention, aimed at face deformation detection, which comprises the following steps: 1) segment and normalize the face region of the image according to the eye coordinates detected by the dlib landmark detector; 2) propose a new attention module that accounts for the position information neglected by channel attention; 3) fuse the attention module with a double-branch convolution network to improve detection precision; 4) classify the final feature map with an SVM.

Description

Deformed face detection based on multi-azimuth fusion attention
Technical Field
The invention relates to the field of face fusion attack detection, in particular to a deformed face detection technology based on multi-azimuth fusion attention.
Background
Face recognition technology has achieved great success in the security field. Over the past few years, however, researchers have identified various potential deficiencies of biometric systems. Recently, vulnerabilities of face and fingerprint recognition to deformed (morphed) biometric images and templates have been demonstrated. Morphing techniques can be used to create artificial biometric samples that resemble the biometric information of two (or more) data subjects in both the image and the feature domain. If an image or template containing morphed feature information is infiltrated into a biometric identification system, each of the subjects contributing to the morph can successfully authenticate against the single enrollment template. The unique association between an individual and his biometric reference data is thereby broken.
Such attacks pose a serious security risk to biometric systems, in particular to widely deployed border control systems and electronic travel documents. Different commercial face recognition systems have been found to be highly vulnerable to such attacks: because of the high intra-class variability of faces, face recognition systems must achieve acceptable false non-match rates (FNMR) at false match rates (FMR) as low as 0.1%, a tolerance wide enough for a well-made morph to pass. Automatic detection of deformed face images is therefore crucial to the security of operational face recognition systems.
In order to address these potential defects of face recognition systems, the detection of face deformation attacks has become an urgent problem. Existing face deformation attack detection methods fall mainly into four algorithm types: deformation detection based on texture features, on image quality, on deep learning, and on hybrid features. Texture-based methods capture the changes in micro-texture introduced by the morphing process; image-quality-based methods detect deformed faces by quantifying differences in the compression artifacts and noise introduced during morphing; recent deep-learning-based methods extract features with pre-trained CNN architectures and detect face deformation by classification. However, these methods still suffer from high error rates, poor robustness, and high network complexity.
Disclosure of Invention
In view of the above disadvantages of the prior art, the present invention provides a method for detecting deformed faces based on multi-azimuth fusion attention. The method aims to solve the problems of high error rate, poor robustness, and large parameter count in existing methods.
In order to achieve the above object, the present invention provides a deformed face detection method based on multi-azimuth fusion attention, comprising the following steps:
a1, preprocessing an input image;
a2, passing through a double-branch convolution network module;
a3, passing through a multi-azimuth fusion attention module;
a4, classification.
The invention provides a deformed face detection method based on multi-azimuth fusion attention. Compared with the prior art, the method has the following beneficial effects:
The scheme adopts a deep learning approach and detects deformed faces by combining a multi-azimuth fusion attention module with a double-branch convolution network. Deformed face detection methods built on conventional attention mechanisms have already shown some success. Unlike channel attention, which compresses the feature tensor into a single feature vector by 2-dimensional global pooling, multi-azimuth fusion attention factorizes channel attention into two 1-dimensional feature-encoding processes that aggregate features along the two spatial directions. In this way, long-range dependencies can be captured along one spatial direction while precise position information is preserved along the other. The resulting feature maps are then encoded into a pair of direction-aware and position-sensitive attention maps, which are applied complementarily to the input feature map to enhance the representation of the object of interest. Combined with the double-branch convolution network, the model captures more of the information that distinguishes real from deformed face images, which favors reliable detection of deformed faces.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present invention; for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a block diagram of deformed face detection based on multi-azimuth fusion attention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The invention is described in detail below with reference to the drawings and the detailed description. As shown in fig. 1, the method for detecting a deformed face based on multi-azimuth fusion attention includes steps A1-A4:
A1, preprocessing the input image;
A2, passing through the double-branch convolution network module;
A3, passing through the multi-azimuth fusion attention module;
A4, classification.
Each step is described in detail below.
In step a1, in a face morphing attack, the face region is generally centered in the image. To accurately extract features from the image, only the largest central region of the image is retained. In the pre-processing stage, the face of the image is segmented and normalized according to the eye coordinates detected by the dlib landmark point detector.
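As a rough illustration, the preprocessing of step A1 can be sketched as follows. Only the eye-coordinate-driven cropping and the 224 × 224 normalized size (claim 2) come from the text; the margin factor around the eye midpoint and the nearest-neighbour resize are our own illustrative choices, and the eye coordinates themselves are assumed to be supplied by the dlib landmark detector.

```python
import numpy as np

def align_crop(image, left_eye, right_eye, out_size=224):
    """Crop a square face region centred between the eyes and rescale it to
    out_size x out_size. The margin factor (1.5 x inter-pupil distance) and
    the nearest-neighbour resize are illustrative assumptions."""
    cx = (left_eye[0] + right_eye[0]) / 2.0
    cy = (left_eye[1] + right_eye[1]) / 2.0
    ipd = abs(right_eye[0] - left_eye[0])      # inter-pupil distance
    half = int(1.5 * ipd)                      # crop half-width: assumption
    h, w = image.shape[:2]
    y0, y1 = max(0, int(cy) - half), min(h, int(cy) + half)
    x0, x1 = max(0, int(cx) - half), min(w, int(cx) + half)
    crop = image[y0:y1, x0:x1]
    # nearest-neighbour resize to the normalized out_size x out_size region
    yi = np.arange(out_size) * crop.shape[0] // out_size
    xi = np.arange(out_size) * crop.shape[1] // out_size
    return crop[yi][:, xi]
```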
In step A2, given the input feature map X, two feature maps U1 and U2 are generated from the original map by a 3 × 3 grouped convolution and a 3 × 3 dilated convolution (5 × 5 receptive field), respectively. The two feature maps are then added to form a new feature map, which is passed through the multi-azimuth fusion attention module and then through the two functions a and b; the resulting function values are multiplied by the original U1 and U2. Since the values of a and b sum to 1, this assigns weights to the feature maps of the two branches; because the branches use convolution kernels of different sizes, the network can thereby select a suitable kernel by itself. (The matrices A and B in the functions a and b are initialized before training and are each of size C × d; z is the feature obtained after the multi-azimuth fusion attention, before the A and B transforms.) Here we have:
a_c = e^(A_c z) / (e^(A_c z) + e^(B_c z)) (1)
b_c = e^(B_c z) / (e^(A_c z) + e^(B_c z)) (2)
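A minimal numpy sketch of the branch weighting described above. U1 and U2 stand for the outputs of the 3 × 3 grouped and 3 × 3 dilated branches; the text leaves the mapping from the attention output to the vector z implicit, so the global-average-pooled descriptor used here (with d = C) is an assumption.

```python
import numpy as np

def select_branches(u1, u2, A, B):
    """Weight the two branch outputs with the softmax functions a and b of
    equations (1) and (2), so that a + b = 1 per channel.
    u1, u2: (C, H, W) branch feature maps; A, B: trainable C x d matrices.
    Deriving z by global average pooling of the fused map (so d = C here)
    is an assumption; the patent leaves that mapping implicit."""
    u = u1 + u2                               # fused feature map
    z = u.mean(axis=(1, 2))                   # global descriptor z, shape (d,)
    ea, eb = np.exp(A @ z), np.exp(B @ z)     # per-channel branch logits
    a = ea / (ea + eb)                        # eq (1)
    b = eb / (ea + eb)                        # eq (2); a + b = 1 elementwise
    return a[:, None, None] * u1 + b[:, None, None] * u2
```

Note that when A equals B, the softmax gives a = b = 0.5, so the output falls back to the plain average of the two branches.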
In step A3, a multi-azimuth fusion attention module is designed as follows: given an input X, each channel is first encoded along the horizontal and the vertical coordinate using a pooling kernel of size (H, 1) or (1, W), respectively. The output of the c-th channel at height h can thus be expressed as:
z^h_c(h) = (1/W) Σ_{0 ≤ i < W} x_c(h, i) (3)
Likewise, the output of the c-th channel at width w can be written as:
z^w_c(w) = (1/H) Σ_{0 ≤ j < H} x_c(j, w) (4)
These 2 transformations aggregate features along the two spatial directions, yielding a pair of direction-aware feature maps. After this information-embedding step, the two maps are concatenated and passed through a shared 1 × 1 convolutional transform:
f = δ(F_1([z^h, z^w])) (5)
where [·, ·] is the concatenation along the spatial dimension, δ is a nonlinear activation function, and f is the intermediate feature map encoding spatial information in both the horizontal and vertical directions. f is then split along the spatial dimension into 2 separate tensors f^h ∈ R^(C/r × H) and f^w ∈ R^(C/r × W), where r is the reduction ratio, as in the SE block. Another two 1 × 1 convolution transforms F_h and F_w convert f^h and f^w into tensors with the same number of channels as the input X, yielding:
g^h = σ(F_h(f^h)) (6)
g^w = σ(F_w(f^w)) (7)
where σ is the sigmoid activation function. To reduce the complexity and computational overhead of the model, a suitable reduction ratio r (e.g., 16) is used here to reduce the number of channels of f. The outputs g^h and g^w are then expanded and used as attention weights.
Finally, the output of the multi-azimuth attentiveness module may be written as:
y_c(i, j) = x_c(i, j) × g^h_c(i) × g^w_c(j) (8)
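The whole of step A3, equations (3)-(8), can be sketched per image in plain numpy. W1, Wh and Ww play the roles of the 1 × 1 convolutions F_1, F_h and F_w; ReLU is assumed for δ, since the text only says "nonlinear activation function".

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def coord_fusion_attention(x, W1, Wh, Ww):
    """Multi-azimuth fusion attention for one (C, H, W) feature map.
    W1: (C//r, C) plays F_1; Wh, Ww: (C, C//r) play F_h, F_w.
    ReLU is assumed for the nonlinearity delta."""
    C, H, W = x.shape
    zh = x.mean(axis=2)                        # eq (3): pool over width  -> (C, H)
    zw = x.mean(axis=1)                        # eq (4): pool over height -> (C, W)
    f = np.maximum(0.0, W1 @ np.concatenate([zh, zw], axis=1))  # eq (5)
    fh, fw = f[:, :H], f[:, H:]                # split: (C//r, H) and (C//r, W)
    gh = sigmoid(Wh @ fh)                      # eq (6): attention over height, (C, H)
    gw = sigmoid(Ww @ fw)                      # eq (7): attention over width,  (C, W)
    return x * gh[:, :, None] * gw[:, None, :]  # eq (8)
```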
In step A4, the final key step of the invention is to classify the face by finding an optimal classification model with a highly discriminative machine learning algorithm. A support vector machine with a radial basis function kernel is selected as the classifier; it not only achieves high classification accuracy but is also widely used in research areas such as face recognition. The dimension-reduced features from the previous step are fed into the SVM, and the detection of the deformed face is completed according to the SVM output.
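A sketch of the classification stage of step A4 with scikit-learn's RBF-kernel SVC. The two synthetic Gaussian clusters below stand in for the dimension-reduced features of bona fide and deformed faces; they are illustrative data, not the network's actual output.

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic stand-ins for the dimension-reduced features:
# label 0 = bona fide face, label 1 = deformed face.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(50, 16)),
               rng.normal(1.5, 1.0, size=(50, 16))])
y = np.array([0] * 50 + [1] * 50)

clf = SVC(kernel="rbf", gamma="scale")  # RBF kernel, as selected in step A4
clf.fit(X, y)
pred = clf.predict(X)                   # deformation decision per sample
```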
The invention provides a deformed face detection method based on multi-azimuth fusion attention; its innovations include the following:
A method for detecting deformed faces based on multi-azimuth fusion attention is proposed. The method detects deformed faces through the interaction of a multi-azimuth fusion attention module and a double-branch convolution network. The new attention module better captures the differences between real and deformed face images, favoring reliable detection of deformed faces.
A method compensating for the position information ignored by channel attention is proposed. Conventional channel attention only re-weights the importance of each channel via inter-channel relationships and ignores position information, yet position information is important for generating spatially selective attention maps. A new attention block is therefore introduced that considers not only the inter-channel relationships but also the position information of the feature space.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention, and all modifications and equivalents of the present invention, which are made by the contents of the present specification and the accompanying drawings, or directly/indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (5)

1. A deformed face detection method based on multi-azimuth fusion attention, characterized in that the method is executed by a computer and comprises the following steps:
a1, preprocessing an input image;
a2, passing through a double-branch convolution network module;
a3, passing through a multi-azimuth fusion attention module;
a4, classification.
2. The deformed face detection method based on multi-azimuth fusion attention as claimed in claim 1, wherein the normalized region is cropped to 224 × 224 pixels to ensure that the deformation detection algorithm is applied only to the face region, and A1 is specifically implemented as follows: in a face morphing attack, the face region is usually located at the center of the image. To accurately extract features from the image, only the largest central region of the image is retained. In the preprocessing stage, the face is segmented and normalized according to the eye coordinates detected by the dlib landmark detector.
3. The deformed face detection method as claimed in claim 1, wherein A2 is implemented as follows: given the input feature map X, two feature maps U1 and U2 are generated from the original map by a 3 × 3 grouped convolution and a 3 × 3 dilated convolution (5 × 5 receptive field), respectively. The two feature maps are then added to form a new feature map, which is passed through the multi-azimuth fusion attention module and then through the two functions a and b; the resulting function values are multiplied by the original U1 and U2. Since the values of a and b sum to 1, this assigns weights to the feature maps of the two branches; because the branches use convolution kernels of different sizes, the network can thereby select a suitable kernel by itself. (The matrices A and B in the functions a and b are initialized before training and are each of size C × d; z is the feature obtained after the multi-azimuth fusion attention, before the A and B transforms.) Here:
a_c = e^(A_c z) / (e^(A_c z) + e^(B_c z)) (1)
b_c = e^(B_c z) / (e^(A_c z) + e^(B_c z)) (2)
4. The method as claimed in claim 1, wherein A3 is implemented as follows: given an input X, each channel is first encoded along the horizontal and the vertical coordinate using a pooling kernel of size (H, 1) or (1, W), respectively. The output of the c-th channel at height h can thus be expressed as:
z^h_c(h) = (1/W) Σ_{0 ≤ i < W} x_c(h, i) (3)
Likewise, the output of the c-th channel at width w can be written as:
z^w_c(w) = (1/H) Σ_{0 ≤ j < H} x_c(j, w) (4)
These 2 transformations aggregate features along the two spatial directions, yielding a pair of direction-aware feature maps. After this information-embedding step, the two maps are concatenated and passed through a shared 1 × 1 convolutional transform:
f = δ(F_1([z^h, z^w])) (5)
where [·, ·] is the concatenation along the spatial dimension, δ is a nonlinear activation function, and f is the intermediate feature map encoding spatial information in both the horizontal and vertical directions. f is then split along the spatial dimension into 2 separate tensors f^h ∈ R^(C/r × H) and f^w ∈ R^(C/r × W), where r is the reduction ratio, as in the SE block. Another two 1 × 1 convolution transforms F_h and F_w convert f^h and f^w into tensors with the same number of channels as the input X, yielding:
g^h = σ(F_h(f^h)) (6)
g^w = σ(F_w(f^w)) (7)
where σ is the sigmoid activation function. To reduce the complexity and computational overhead of the model, a suitable reduction ratio r (e.g., 16) is used here to reduce the number of channels of f. The outputs g^h and g^w are then expanded and used as attention weights.
Finally, the output of the multi-azimuth fusion attention module may be written as:
y_c(i, j) = x_c(i, j) × g^h_c(i) × g^w_c(j) (8)
5. The deformed face detection method based on multi-azimuth fusion attention as claimed in claim 1, wherein the dimension-reduced features are classified by an SVM, and A4 is specifically implemented as follows: the final key step is to find an optimal classification model through a highly discriminative machine learning algorithm to judge the face. A support vector machine with a radial basis function kernel is selected as the classifier; it achieves high classification accuracy and is widely used in research areas such as face recognition. The dimension-reduced features from the previous step are fed into the SVM, and the detection of the deformed face is completed according to the SVM output.
CN202210235051.7A 2022-03-11 2022-03-11 Deformed face detection based on multi-azimuth fusion attention Pending CN114612980A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210235051.7A CN114612980A (en) 2022-03-11 2022-03-11 Deformed face detection based on multi-azimuth fusion attention


Publications (1)

Publication Number Publication Date
CN114612980A (en) 2022-06-10

Family

ID=81863062

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210235051.7A Pending CN114612980A (en) 2022-03-11 2022-03-11 Deformed face detection based on multi-azimuth fusion attention

Country Status (1)

Country Link
CN (1) CN114612980A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115035315A (en) * 2022-06-17 2022-09-09 佛山科学技术学院 Tile color difference grading detection method and system based on attention mechanism
CN117523636A (en) * 2023-11-24 2024-02-06 北京远鉴信息技术有限公司 Face detection method and device, electronic equipment and storage medium



Legal Events

Date Code Title Description
PB01 Publication