CN115984949B - Low-quality face image recognition method and equipment with attention mechanism - Google Patents

Publication number: CN115984949B
Application number: CN202310272773.4A
Inventor: 梁海丽
Assignee: Weihai Vocational College
Filing date: 2023-03-21
Legal status: Active
Abstract

The invention discloses a low-quality face image recognition method and equipment with an attention mechanism, relating to the technical fields of artificial neural networks and face recognition. The feature information extraction layer combines convolution operations with different kernel sizes and multiple feature fusion modes, together with several strided convolution layers and pooling layers, to strengthen image feature extraction. Tests show that the network model proposed by the invention recognizes low-quality face images well and is a clear advance over the prior art.

Description

Low-quality face image recognition method and equipment with attention mechanism
Technical Field
The invention belongs to the technical field of artificial neural networks and face recognition, and particularly relates to a low-quality face image recognition method and equipment with an attention mechanism.
Background
Face recognition has long been an important research direction in the field of image technology. With the development of deep learning, many neural-network-based models achieve over 99% recognition accuracy on high-quality face image datasets. However, low-quality face image recognition remains very difficult; in particular, for low-resolution face images the existing algorithms are still immature, and recognition accuracy needs further improvement.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides a low-quality face image recognition method and equipment with an attention mechanism so as to improve the recognition accuracy of the low-quality face image.
In order to achieve the above object, the present invention adopts the following solutions: a method for recognizing a low quality face image with an attention mechanism, comprising the steps of:
s10, acquiring an unknown face image, and acquiring a trained face image recognition network model;
the face image recognition network model comprises a trunk and a global pooling processing layer, wherein the global pooling processing layer is connected to the tail of the trunk, the trunk comprises a plurality of feature information extraction layers connected in sequence, and the mathematical model of the feature information extraction layers is as follows:
F0 = PreRes(X)
F1 = ReLU1(Conv3×3,s1(F0)),  F2 = ReLU2(Conv5×5,s1(F0))
Fa = F1 + F2,  Fm = F1 ⊙ F2
F3 = ReLU3(Conv3×3,s2(Fa)),  F4 = ReLU4(Conv5×5,s2(Fm))
F5 = MaxPool1(Fa),  F6 = MaxPool2(Fm)
M = CACU(F3, F4, F5, F6)
Y = ReLU5(Conv1×1,s1(Concat(F3, F4, F5, F6))) ⊙ M
wherein X represents the input of the feature information extraction layer; PreRes represents the pre-residual module and F0 the feature map it outputs; Conv3×3,s1 and Conv5×5,s1 represent convolution operations with kernel sizes 3×3 and 5×5 and stride 1; Conv3×3,s2 and Conv5×5,s2 represent convolution operations with kernel sizes 3×3 and 5×5 and stride 2; MaxPool1 and MaxPool2 both represent max pooling operations with a 2×2 pooling window and stride 2; Conv1×1,s1 represents a convolution operation with kernel size 1×1 and stride 1; ⊙ represents the element-corresponding product operation; Concat represents concatenation of the feature maps; ReLU1 to ReLU5 all represent the activation function ReLU; F1 and F2 represent the feature maps generated after activation of the 3×3 and 5×5 convolutions; Fa represents the feature map generated by adding F1 and F2, and Fm the feature map generated by their element-corresponding product; F3 and F4 represent the feature maps generated after activation of the stride-2 convolutions, and F5 and F6 the feature maps generated by the pooling operations; CACU represents the compound attention calibration unit and M the attention map it generates and outputs; Y represents the feature map output by the feature information extraction layer;
s20, inputting the unknown face image into the face image recognition network model, each feature information extraction layer performing a feature extraction operation in turn as the image information is transmitted along the trunk, until the last feature information extraction layer outputs an abstract feature map;
s30, inputting the abstract feature map into the global pooling processing layer, performing global pooling operation on each layer of the abstract feature map by using the global pooling processing layer, and outputting to obtain a face feature vector;
s40, calculating the distance between the face feature vector and all target feature vectors in the search library, wherein the identity corresponding to the target feature vector which is closest to the face feature vector and meets the threshold condition is the identity of the unknown face image.
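The matching rule of step S40 can be sketched in a few lines (a minimal illustration: the function name, the toy three-dimensional vectors, and the threshold value are all hypothetical; the embodiment below uses Euclidean distance with length-768 vectors):

```python
import math

def match_identity(query, gallery, threshold):
    """Return the identity whose gallery vector is nearest to `query`
    by Euclidean distance, or None when even the nearest distance
    fails the threshold condition."""
    best_id, best_dist = None, float("inf")
    for identity, target in gallery.items():
        dist = math.dist(query, target)
        if dist < best_dist:
            best_id, best_dist = identity, dist
    return best_id if best_dist < threshold else None

# Toy 3-dimensional "face feature vectors".
gallery = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
print(match_identity([0.9, 0.1, 0.0], gallery, threshold=0.5))  # alice
print(match_identity([5.0, 5.0, 5.0], gallery, threshold=0.5))  # None
```

When no gallery vector satisfies the threshold condition, the query is rejected rather than forced onto the nearest identity, which matches the open-set behaviour described in the embodiment.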
Further, the global pooling processing layer is a global average pooling layer.
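What the global average pooling layer computes can be sketched as follows (the nested-list feature map representation and the function name are illustrative, not from the patent):

```python
def global_average_pool(feature_map):
    """Collapse a C x H x W feature map (nested lists) into a length-C
    vector by averaging each channel over all its spatial positions."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature_map]

fmap = [[[1.0, 3.0], [5.0, 7.0]],   # channel 0, mean 4.0
        [[2.0, 2.0], [2.0, 2.0]]]   # channel 1, mean 2.0
print(global_average_pool(fmap))  # [4.0, 2.0]
```

Each channel of the abstract feature map thus contributes exactly one component of the face feature vector, which is why a 768-channel output yields a length-768 vector.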
Further, the mathematical model of the compound attention calibration unit is:
G = Conv3×3,s1(Concat(F3, F4, F5, F6))
v = σ1(GMPs(G))
P = Concat(GMPc(F3), GMPc(F4), GMPc(F5), GMPc(F6))
M = σ2(Conv1×1,s1(v ⊙ P))
wherein the compound attention calibration unit takes the four branch feature maps F3, F4, F5 and F6 of the feature information extraction layer as input; GMPs and GMPc both represent global max pooling operations on feature maps, GMPs operating on a feature map in the spatial direction and GMPc operating on a feature map in the channel direction; Concat represents concatenation of the feature maps; Conv3×3,s1 represents a convolution operation with kernel size 3×3 and stride 1, and Conv1×1,s1 a convolution operation with kernel size 1×1 and stride 1; ⊙ represents the element-corresponding product operation; σ1 and σ2 both represent the sigmoid activation function; G represents the feature map generated by the convolution operation on the concatenated inputs; v represents the vector generated after sigmoid activation; P represents the feature map obtained by performing global max pooling on F3, F4, F5 and F6 in the channel direction respectively and concatenating the results; M represents the attention map output by the compound attention calibration unit.
Further, the mathematical model of the pre-residual module is:
H = ReLU1(Conv1(X))
F0 = ReLU2(H + Conv2(H))
wherein X represents the input of the pre-residual module; ReLU1 and ReLU2 both represent the activation function ReLU; Conv1 and Conv2 both represent convolution operations with kernel size 3×3 and stride 1; H represents the feature map generated after activation of the first convolution; F0 represents the feature map output by the pre-residual module.
The invention also provides low-quality face image recognition equipment with an attention mechanism, comprising a processor and a memory, the memory storing a computer program, the processor being configured to perform the low-quality face image recognition method with an attention mechanism described above by loading the computer program.
The beneficial effects of the invention are as follows:
(1) The prior art shows that for clear, high-resolution face images, the required feature information can be fully extracted by several simply stacked convolution layers. For low-quality (e.g. low-resolution) face images, however, the useful face information is very limited and is often mixed with a large amount of interference, so conventional artificial neural networks handle low-quality inputs poorly and fit the data badly. To improve the utilization of the original input image, the invention not only arranges convolution layers with different kernel sizes (3×3 and 5×5) in each feature information extraction layer, but also fuses the feature maps output by the two convolutions in two ways (matrix addition and element-corresponding product). The feature information extraction layer can thus extract features from different angles, and these extractions complement and corroborate one another, strengthening the image feature extraction effect;
(2) In conventional face recognition neural networks, feature extraction units and pooling layers alternate; since the whole network is a serial structure, there is only one pooling layer between two feature extraction units, the information pooling mode is single, and the feature extraction effect of the whole network on low-quality images is limited. The invention instead sets several strided convolutions (3×3 and 5×5 with stride 2) and pooling layers (2×2 max pooling) that operate on the feature maps of the different branches separately, so that while the length and width of the feature maps are reduced, the information loss caused by a single pooling mode is well alleviated;
(3) The feature maps generated by the multiple branches are input in parallel to the compound attention calibration unit. Compared with the conventional single feature map input, the compound attention calibration unit can learn the importance of different information from different angles, and the attention map finally obtained calibrates the information more effectively.
Drawings
FIG. 1 is a schematic diagram of a face image recognition network model according to an embodiment;
FIG. 2 is a schematic structural diagram of a feature information extraction layer in the face image recognition network model shown in FIG. 1;
FIG. 3 is a schematic structural diagram of the compound attention calibration unit in the feature information extraction layer shown in FIG. 2;
fig. 4 is a schematic structural view of a feature information extraction layer in the comparative example;
in the accompanying drawings: 1 - unknown face image, 2 - feature information extraction layer, 21 - pre-residual module, 3 - global pooling processing layer, 4 - compound attention calibration unit, 5 - face feature vector.
Detailed Description
Examples:
As shown in the drawings, Fig. 1, Fig. 2 and Fig. 3 are, respectively, schematic diagrams of the face image recognition network model structure, the feature information extraction layer 2 and the compound attention calibration unit 4 of this embodiment. The global pooling processing layer 3 is implemented as a global average pooling layer, and four feature information extraction layers 2 are arranged in the backbone of the network; the feature information extraction layer 2 can be expressed as the following mathematical model:
F0 = PreRes(X)
F1 = ReLU1(Conv3×3,s1(F0)),  F2 = ReLU2(Conv5×5,s1(F0))
Fa = F1 + F2,  Fm = F1 ⊙ F2
F3 = ReLU3(Conv3×3,s2(Fa)),  F4 = ReLU4(Conv5×5,s2(Fm))
F5 = MaxPool1(Fa),  F6 = MaxPool2(Fm)
M = CACU(F3, F4, F5, F6)
Y = ReLU5(Conv1×1,s1(Concat(F3, F4, F5, F6))) ⊙ M
wherein X represents the input of the feature information extraction layer; PreRes represents the pre-residual module and F0 the feature map it outputs; Conv3×3,s1 and Conv5×5,s1 represent convolution operations with kernel sizes 3×3 and 5×5 and stride 1; Conv3×3,s2 and Conv5×5,s2 represent convolution operations with kernel sizes 3×3 and 5×5 and stride 2; MaxPool1 and MaxPool2 both represent max pooling operations with a 2×2 pooling window and stride 2; Conv1×1,s1 represents a convolution operation with kernel size 1×1 and stride 1; ⊙ represents the element-corresponding product operation; Concat represents concatenation of the feature maps; ReLU1 to ReLU5 all represent the activation function ReLU; F1 and F2 represent the feature maps generated after activation of the 3×3 and 5×5 convolutions; Fa represents the feature map generated by adding F1 and F2, and Fm the feature map generated by their element-corresponding product; F3 and F4 represent the feature maps generated after activation of the stride-2 convolutions, and F5 and F6 the feature maps generated by the pooling operations; CACU represents the compound attention calibration unit and M the attention map it generates and outputs; Y represents the feature map output by the feature information extraction layer.
For the first feature information extraction layer 2, the input is the unknown face image 1 (with 3 channels). After the first convolution operation in the pre-residual module 21, a feature map with 48 channels is output (its length and width equal those of the unknown face image 1); the second convolution operation keeps the size of the feature map unchanged. For the latter three feature information extraction layers 2, both convolution operations in the pre-residual module 21 keep the length, width and channel number of the feature map unchanged.
In all feature information extraction layers 2, the two stride-1 convolutions (3×3 and 5×5) keep the length, width and channel number of the feature map unchanged, so the two branch feature maps they produce have the same size and can be added to give one fused map and multiplied element-wise to give another. By filling appropriate padding values, the two stride-2 convolutions and the two max pooling operations produce four feature maps whose length and width are half those of the pre-residual module's output (within the same feature information extraction layer 2) and whose channel numbers are equal to it. The four feature maps are then fused by concatenation and the 1×1 convolution; the finally output feature map has twice the channel number of the pre-residual module's output (within the same feature information extraction layer 2) and half its length and width.
For the compound attention calibration unit 4, the four input feature maps are processed in two ways after input. On the one hand, they are concatenated and passed through the 3×3 convolution, obtaining a feature map with 4 channels; a global max pooling operation is then performed in the spatial direction, and after sigmoid activation a vector of length 4 is obtained. On the other hand, a global max pooling operation is performed in the channel direction on each of the four input feature maps separately, and the resulting four two-dimensional matrices are concatenated to obtain a feature map with 4 channels. The element-corresponding product of the length-4 vector and this 4-channel feature map is then computed, the vector modulating each layer of the map; finally the 1×1 convolution compresses the channel number to 1, and sigmoid activation generates the attention map. The attention map is a two-dimensional matrix whose length and width equal those of the input feature maps; taking the element-corresponding product of the attention map and the feature map obtained after activation calibrates the feature information at different spatial positions of that feature map. Because the compound attention calibration unit 4 modulates each layer of the channel-pooled feature map with the vector, it exploits not only the feature relations at different positions in the spatial direction of the four input feature maps but also the feature relations in the channel direction. The unit therefore makes full use of the input information, can more thoroughly mine and learn the relative relations among the information at different positions of the four input feature maps, and achieves more effective calibration after the element-corresponding product operation.
After the image information is transmitted along the network backbone, the abstract feature map output by the last feature information extraction layer 2 has 768 channels; after global average pooling of each layer of the abstract feature map, a face feature vector 5 of length 768 is generated. All target feature vectors in the search library also have length 768 and are obtained by inputting high-definition face images into the trained face image recognition network model. In this embodiment, the Euclidean distances between the face feature vector 5 and all target feature vectors in the search library are calculated; the identity corresponding to the target feature vector that is closest to the face feature vector 5 and whose Euclidean distance is smaller than the threshold is taken as the identity of the unknown face image 1. If the Euclidean distances between the face feature vector 5 and all target feature vectors in the search library are greater than or equal to the threshold, the unknown face image 1 is judged not to be in the search library.
In this embodiment, the SCface dataset is used to train and test the network model. The SCface dataset contains high-definition and low-resolution face images of 130 persons: each person has only 1 high-definition face image and 15 low-resolution face images, the latter taken by 5 cameras at three different distances (1 m, 2.6 m and 4.2 m). In this implementation, 3 images are randomly drawn from the 5 images taken at each distance, forming a training set of 1170 images in total, and the remaining 780 low-resolution face images serve as the test set. During training, a triplet loss function is used to optimize the network model. After training, the high-definition images of the 130 persons are input into the network model, and the output feature vectors form the target feature vectors of the search library; the images of the test set are then input into the trained network model for testing. For comparison, this embodiment also trains the state-of-the-art low-resolution face recognition model MIND-Net with the same training set and tests it on the same test set; the comparison results are shown in Table 1.
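The triplet ("ternary") loss used for training can be sketched as follows (a minimal form; the margin value and the toy two-dimensional embeddings are hypothetical):

```python
import math

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss: pull the anchor toward the positive sample and push
    it at least `margin` farther from the negative sample."""
    d_ap = math.dist(anchor, positive)
    d_an = math.dist(anchor, negative)
    return max(d_ap - d_an + margin, 0.0)

a = [1.0, 0.0]
p = [1.0, 0.1]  # same identity, slightly different embedding
n = [0.0, 1.0]  # different identity
print(triplet_loss(a, p, n))  # 0.0: already separated by more than the margin
```

The loss is zero once the anchor-negative distance exceeds the anchor-positive distance by the margin, so training focuses on triplets that still violate the separation.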
Table 1. Test results of the embodiment and the MIND-Net model on the test set (the table data appear as an image in the original document)
Comparing the final recognition accuracies shows that, for face images of every resolution, the recognition accuracy of the invention is higher than that of MIND-Net, and for the lowest-resolution images (shot at 4.2 m) the improvement is especially large.
Comparative example:
This comparative example illustrates more fully the role of the proposed compound attention calibration unit 4 in the whole model. Here the compound attention calibration unit 4 of the embodiment is removed and a CBAM module is used instead to calibrate the feature maps; the structure of the new feature information extraction layer 2 is shown in Fig. 4. The rest of the network model remains unchanged, and the training and testing procedures are identical to those of the embodiment. The test results of the modified network model are shown in Table 2.
Table 2. Test results of the comparative example on the test set (the table data appear as an image in the original document)
Comparing the data of the two tables shows that at relatively high resolution the feature information in the original input image is rich and the performance gain from the compound attention calibration unit 4 is limited. At relatively low resolution, however, making full use of the originally scarce high-value information becomes particularly important, and the compound attention calibration unit 4 improves network performance very markedly.
The foregoing examples merely illustrate specific embodiments of the invention, which are described in greater detail and are not to be construed as limiting the scope of the invention. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the invention, which are all within the scope of the invention.

Claims (4)

1. A low-quality face image recognition method with an attention mechanism, characterized by comprising the following steps:
s10, acquiring an unknown face image, and acquiring a trained face image recognition network model;
the face image recognition network model comprises a trunk and a global pooling processing layer, wherein the global pooling processing layer is connected to the tail of the trunk, the trunk comprises a plurality of feature information extraction layers connected in sequence, and the mathematical model of the feature information extraction layers is as follows:
F0 = PreRes(X)
F1 = ReLU1(Conv3×3,s1(F0)),  F2 = ReLU2(Conv5×5,s1(F0))
Fa = F1 + F2,  Fm = F1 ⊙ F2
F3 = ReLU3(Conv3×3,s2(Fa)),  F4 = ReLU4(Conv5×5,s2(Fm))
F5 = MaxPool1(Fa),  F6 = MaxPool2(Fm)
M = CACU(F3, F4, F5, F6)
Y = ReLU5(Conv1×1,s1(Concat(F3, F4, F5, F6))) ⊙ M
wherein X represents the input of the feature information extraction layer; PreRes represents the pre-residual module and F0 the feature map it outputs; Conv3×3,s1 and Conv5×5,s1 represent convolution operations with kernel sizes 3×3 and 5×5 and stride 1; Conv3×3,s2 and Conv5×5,s2 represent convolution operations with kernel sizes 3×3 and 5×5 and stride 2; MaxPool1 and MaxPool2 both represent max pooling operations with a 2×2 pooling window and stride 2; Conv1×1,s1 represents a convolution operation with kernel size 1×1 and stride 1; ⊙ represents the element-corresponding product operation; Concat represents concatenation of the feature maps; ReLU1 to ReLU5 all represent the activation function ReLU; F1 and F2 represent the feature maps generated after activation of the 3×3 and 5×5 convolutions; Fa represents the feature map generated by adding F1 and F2, and Fm the feature map generated by their element-corresponding product; F3 and F4 represent the feature maps generated after activation of the stride-2 convolutions, and F5 and F6 the feature maps generated by the pooling operations; CACU represents the compound attention calibration unit and M the attention map it generates and outputs; Y represents the feature map output by the feature information extraction layer;
s20, inputting the unknown face image into the face image recognition network model, each feature information extraction layer performing a feature extraction operation in turn as the image information is transmitted along the trunk, until the last feature information extraction layer outputs an abstract feature map;
s30, inputting the abstract feature map into the global pooling processing layer, performing global pooling operation on each layer of the abstract feature map by using the global pooling processing layer, and outputting to obtain a face feature vector;
s40, calculating the distance between the face feature vector and all target feature vectors in a search library, wherein the identity corresponding to the target feature vector which is closest to the face feature vector and meets the threshold condition is the identity of the unknown face image;
the mathematical model of the composite attention calibration unit is:
G = Conv3×3,s1(Concat(F3, F4, F5, F6))
v = σ1(GMPs(G))
P = Concat(GMPc(F3), GMPc(F4), GMPc(F5), GMPc(F6))
M = σ2(Conv1×1,s1(v ⊙ P))
wherein the compound attention calibration unit takes the four branch feature maps F3, F4, F5 and F6 of the feature information extraction layer as input; GMPs and GMPc both represent global max pooling operations on feature maps, GMPs operating on a feature map in the spatial direction and GMPc operating on a feature map in the channel direction; Concat represents concatenation of the feature maps; Conv3×3,s1 represents a convolution operation with kernel size 3×3 and stride 1, and Conv1×1,s1 a convolution operation with kernel size 1×1 and stride 1; ⊙ represents the element-corresponding product operation; σ1 and σ2 both represent the sigmoid activation function; G represents the feature map generated by the convolution operation on the concatenated inputs; v represents the vector generated after sigmoid activation; P represents the feature map obtained by performing global max pooling on F3, F4, F5 and F6 in the channel direction respectively and concatenating the results; M represents the attention map output by the compound attention calibration unit.
2. The method for recognizing a low-quality face image with an attention mechanism according to claim 1, characterized in that: the global pooling processing layer is a global average pooling layer.
3. The method for recognizing a low-quality face image with an attention mechanism according to claim 1, characterized in that: the mathematical model of the prepositive residual module is:
[formula rendered only as images in the source and not recoverable here]
wherein one symbol represents the input of the prepositive residual module; two symbols both represent the ReLU activation function; two symbols both represent convolution operations with a convolution kernel of size 3×3 and a step size of 1; one symbol represents the feature map generated after activation; and the final symbol represents the feature map output by the prepositive residual module.
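The pre-activation arrangement described for the prepositive residual module (an activation followed by a 3×3 stride-1 convolution, applied twice, added to the module input) can be sketched as follows; the single-channel layout and the kernels are assumptions for illustration, since the source formula survives only as images:

```python
# Assumed sketch of a pre-activation residual block; not the patented model.
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def conv3x3(x, k):
    """3x3 convolution, stride 1, zero padding, single channel: (H, W) -> (H, W)."""
    h, w = x.shape
    xp = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(xp[i:i + 3, j:j + 3] * k)
    return out

def pre_residual_block(x, k1, k2):
    """Two ReLU + 3x3 conv stages plus a skip connection back to the input."""
    y = conv3x3(relu(x), k1)  # first activation, then convolution
    y = conv3x3(relu(y), k2)  # second activation, then convolution
    return x + y              # residual (skip) connection
```

The skip connection is what lets the module refine its input rather than replace it, which is the usual motivation for residual blocks in feature-extraction backbones.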
4. A low-quality face image recognition device with an attention mechanism, characterized by: comprising a processor and a memory, said memory storing a computer program, and said processor being configured to perform, by loading said computer program, the low-quality face image recognition method with an attention mechanism of any one of claims 1-3.
CN202310272773.4A 2023-03-21 2023-03-21 Low-quality face image recognition method and equipment with attention mechanism Active CN115984949B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310272773.4A CN115984949B (en) 2023-03-21 2023-03-21 Low-quality face image recognition method and equipment with attention mechanism


Publications (2)

Publication Number Publication Date
CN115984949A CN115984949A (en) 2023-04-18
CN115984949B true CN115984949B (en) 2023-07-04

Family

ID=85958600


Country Status (1)

Country Link
CN (1) CN115984949B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116311479B (en) * 2023-05-16 2023-07-21 四川轻化工大学 Face recognition method, system and storage medium for unlocking automobile

Citations (2)

Publication number Priority date Publication date Assignee Title
CN112200161A (en) * 2020-12-03 2021-01-08 北京电信易通信息技术股份有限公司 Face recognition detection method based on mixed attention mechanism
CN114998958A (en) * 2022-05-11 2022-09-02 华南理工大学 Face recognition method based on lightweight convolutional neural network

Family Cites Families (11)

Publication number Priority date Publication date Assignee Title
US11636328B2 (en) * 2018-03-28 2023-04-25 University Of Maryland, College Park L2 constrained softmax loss for discriminative face verification
CN110781784A (en) * 2019-10-18 2020-02-11 高新兴科技集团股份有限公司 Face recognition method, device and equipment based on double-path attention mechanism
GB2596777A (en) * 2020-05-13 2022-01-12 Huawei Tech Co Ltd Facial re-enactment
CN112949565B (en) * 2021-03-25 2022-06-03 重庆邮电大学 Single-sample partially-shielded face recognition method and system based on attention mechanism
CN113361495B (en) * 2021-07-27 2024-04-09 平安科技(深圳)有限公司 Method, device, equipment and storage medium for calculating similarity of face images
CN113724203B (en) * 2021-08-03 2024-04-23 唯智医疗科技(佛山)有限公司 Model training method and device applied to target feature segmentation in OCT image
CN113688783B (en) * 2021-09-10 2022-06-28 一脉通(深圳)智能科技有限公司 Face feature extraction method, low-resolution face recognition method and equipment
CN114360030A (en) * 2022-01-17 2022-04-15 重庆锐云科技有限公司 Face recognition method based on convolutional neural network
CN115100720A (en) * 2022-07-04 2022-09-23 威海职业学院(威海市技术学院) Low-resolution face recognition method
CN115439329B (en) * 2022-11-10 2023-01-24 四川轻化工大学 Face image super-resolution reconstruction method and computer-readable storage medium
CN115661911B (en) * 2022-12-23 2023-03-17 四川轻化工大学 Face feature extraction method, device and storage medium



Similar Documents

Publication Publication Date Title
Wang et al. Detect globally, refine locally: A novel approach to saliency detection
CN108520535B (en) Object classification method based on depth recovery information
WO2022267641A1 (en) Image defogging method and system based on cyclic generative adversarial network
CN108648197B (en) Target candidate region extraction method based on image background mask
CN114202672A (en) Small target detection method based on attention mechanism
US10032093B2 (en) Method and device for determining the shape of an object represented in an image, corresponding computer program product and computer-readable medium
WO2020108336A1 (en) Image processing method and apparatus, device, and storage medium
CN111553267B (en) Image processing method, image processing model training method and device
CN110689599A (en) 3D visual saliency prediction method for generating countermeasure network based on non-local enhancement
CN110059728B (en) RGB-D image visual saliency detection method based on attention model
CN111882026A (en) Optimization of unsupervised generative confrontation networks by latent spatial regularization
WO2022205755A1 (en) Texture generation method and apparatus, device, and storage medium
CN115984949B (en) Low-quality face image recognition method and equipment with attention mechanism
CN112233012A (en) Face generation system and method
CN111460876A (en) Method and apparatus for identifying video
US20230289608A1 (en) Optimizing Supervised Generative Adversarial Networks via Latent Space Regularizations
CN113781324A (en) Old photo repairing method
CN116205820A (en) Image enhancement method, target identification method, device and medium
JP2023526899A (en) Methods, devices, media and program products for generating image inpainting models
CN115331259A (en) Three-dimensional human body posture estimation method, system and storage medium
CN110163095B (en) Loop detection method, loop detection device and terminal equipment
CN117671509B (en) Remote sensing target detection method and device, electronic equipment and storage medium
US20220101122A1 (en) Energy-based variational autoencoders
CN114663880A (en) Three-dimensional target detection method based on multi-level cross-modal self-attention mechanism
CN116385667B (en) Reconstruction method of three-dimensional model, training method and device of texture reconstruction model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant