CN115984949A - Low-quality face image recognition method and device with attention mechanism - Google Patents

Low-quality face image recognition method and device with attention mechanism

Info

Publication number
CN115984949A
Authority
CN
China
Prior art keywords
feature
face image
characteristic
map
convolution
Prior art date
Legal status: Granted
Application number
CN202310272773.4A
Other languages
Chinese (zh)
Other versions
CN115984949B (en)
Inventor
梁海丽
Current Assignee
Weihai Vocational College
Original Assignee
Weihai Vocational College
Priority date
Filing date
Publication date
Application filed by Weihai Vocational College
Priority to CN202310272773.4A
Publication of CN115984949A
Application granted
Publication of CN115984949B
Legal status: Active
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a low-quality face image recognition method and device with an attention mechanism, relating to the technical fields of artificial neural networks and face recognition. The feature information extraction layer of the invention adopts convolution operations with different kernel sizes together with several feature fusion modes, combined with multiple strided convolutions and pooling layers, to strengthen image feature extraction. Tests show that the proposed network model recognizes low-quality face images well and marks a clear advance over the prior art.

Description

Low-quality face image recognition method and device with attention mechanism
Technical Field
The invention belongs to the technical field of artificial neural networks and face recognition, and particularly relates to a low-quality face image recognition method and device with an attention mechanism.
Background
Face recognition is an important research direction in the field of image technology, and with the development of deep learning many neural-network-based models can already achieve over 99% recognition accuracy on high-quality face image data sets. Recognizing low-quality face images, however, remains very difficult; especially for low-resolution face images, existing algorithms are not yet mature and their recognition accuracy still needs to be improved.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a low-quality face image recognition method and device with an attention mechanism, so as to improve the recognition accuracy of low-quality face images.
To achieve this purpose, the invention adopts the following solution: a low-quality face image recognition method with an attention mechanism, comprising the following steps:
s10, acquiring an unknown face image, and acquiring a trained face image recognition network model;
the face image recognition network model comprises a backbone and a global pooling processing layer connected to the tail of the backbone; the backbone comprises a plurality of feature information extraction layers connected in sequence, and the mathematical model of each feature information extraction layer is:

F0 = R(X)
F1 = ReLU1(Conv3x3_s1(F0)), F2 = ReLU2(Conv5x5_s1(F0))
F3 = F1 + F2, F4 = F1 ⊙ F2
F5 = ReLU3(Conv3x3_s2(F3)), F6 = ReLU4(Conv5x5_s2(F4))
F7 = MaxPool1(F3), F8 = MaxPool2(F4)
M = CACU(F5, F6, F7, F8)
Y = M ⊙ ReLU5(Conv1x1_s1(Concat(F5, F6, F7, F8)))

wherein X represents the input of the feature information extraction layer and R represents the pre-residual module; Conv3x3_s1 represents a convolution operation with kernel size 3×3 and stride 1, Conv5x5_s1 a convolution with kernel size 5×5 and stride 1, Conv3x3_s2 a convolution with kernel size 3×3 and stride 2, and Conv5x5_s2 a convolution with kernel size 5×5 and stride 2; MaxPool1 and MaxPool2 each represent a max pooling operation with a 2×2 pooling window and stride 2; Conv1x1_s1 represents a convolution with kernel size 1×1 and stride 1; ⊙ denotes the element-wise product and Concat denotes splicing feature maps together; ReLU1 to ReLU5 each represent a ReLU activation function; F0 is the feature map output by the pre-residual module; F1 and F2 are the feature maps generated after activating the stride-1 convolutions of F0; F3 is the feature map generated by adding F1 and F2, and F4 the feature map generated by their element-wise product; F5 and F6 are the feature maps generated after activating the stride-2 convolutions of F3 and F4; F7 and F8 are the feature maps generated by pooling F3 and F4; CACU represents the composite attention calibration unit and M the attention map it generates and outputs; and Y is the feature map output by the feature information extraction layer;
s20, inputting the unknown face image into the face image recognition network model, and carrying out feature extraction operation on image information by each feature information extraction layer in sequence along with the transfer of the image information along the backbone until an abstract feature image is output by the last feature information extraction layer;
s30, inputting the abstract feature map into the global pooling processing layer, performing global pooling operation on each layer of the abstract feature map by using the global pooling processing layer, and outputting to obtain a face feature vector;
s40, calculating the distance between the face feature vector and all target feature vectors in a retrieval library, wherein the identity corresponding to the target feature vector which is closest to the face feature vector and meets the threshold condition is the identity of the unknown face image.
Further, the global pooling processing layer is a global average pooling layer.
Further, the mathematical model of the composite attention calibration unit is:
U = Conv3x3_s1(Concat(F5, F6, F7, F8))
v = Sigmoid1(GMP_s(U))
W = Concat(GMP_c(F5), GMP_c(F6), GMP_c(F7), GMP_c(F8))
M = Sigmoid2(Conv1x1_s1(v ⊙ W))

wherein the composite attention calibration unit takes the feature maps F5, F6, F7 and F8 as input; GMP_s and GMP_c both represent global max pooling operations on a feature map, GMP_s operating in the spatial direction and GMP_c in the channel direction; Concat represents splicing feature maps together; Conv3x3_s1 represents a convolution operation with kernel size 3×3 and stride 1, and Conv1x1_s1 a convolution operation with kernel size 1×1 and stride 1; ⊙ represents the element-wise product; Sigmoid1 and Sigmoid2 each represent a sigmoid activation function; U is the feature map generated by the convolution operation on the spliced inputs; v is the vector generated after function activation; W is the feature map obtained by subjecting F5, F6, F7 and F8 respectively to global max pooling in the channel direction and splicing the results; and M is the attention map output by the composite attention calibration unit.
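A minimal PyTorch sketch of this unit follows; it assumes (consistent with the embodiment below) that the four branch maps share the same spatial size and channel count, and the class and argument names are illustrative:

```python
import torch
import torch.nn as nn

class CompositeAttentionCalibration(nn.Module):
    """Sketch: four branch feature maps in, one spatial attention map out."""

    def __init__(self, branch_channels: int):
        super().__init__()
        # 3x3 stride-1 convolution on the spliced branches, compressed to
        # 4 channels (one per branch)
        self.conv3 = nn.Conv2d(4 * branch_channels, 4, kernel_size=3, padding=1)
        # 1x1 stride-1 convolution compressing 4 channels to 1
        self.conv1 = nn.Conv2d(4, 1, kernel_size=1)

    def forward(self, f5, f6, f7, f8):
        u = self.conv3(torch.cat([f5, f6, f7, f8], dim=1))  # U
        # global max pooling in the spatial direction -> length-4 vector v
        v = torch.sigmoid(u.amax(dim=(2, 3), keepdim=True))
        # global max pooling in the channel direction per branch, spliced
        # into the 4-channel map W
        w = torch.cat([t.amax(dim=1, keepdim=True)
                       for t in (f5, f6, f7, f8)], dim=1)
        return torch.sigmoid(self.conv1(v * w))  # attention map M
```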
Further, the mathematical model of the pre-residual module is:
T = ReLU_a(Conv_a(X))
F0 = T + ReLU_b(Conv_b(T))

wherein X represents the input of the pre-residual module; ReLU_a and ReLU_b each represent a ReLU activation function; Conv_a and Conv_b each represent a convolution operation with kernel size 3×3 and stride 1; T is the feature map generated after activation of the first convolution; and F0 is the feature map output by the pre-residual module.
The invention also provides a low-quality face image recognition device with an attention mechanism, comprising a processor and a memory, the memory storing a computer program and the processor being configured to perform the above low-quality face image recognition method with an attention mechanism by loading the computer program.
The invention has the beneficial effects that:
(1) The prior art shows that, for a clear high-resolution face image, the required feature information can be fully extracted through a simple stack of convolution layers. In a low-quality (e.g., low-resolution) face image, however, the useful face information is very limited and is usually buried in a large amount of interference, so existing artificial neural networks cope poorly with low-quality inputs and fit the data badly. To improve the utilization of the original input image, the invention not only places convolution layers with different kernel sizes (Conv3x3_s1 and Conv5x5_s1) in each feature information extraction layer, but also fuses the feature maps output by the two convolutions in different ways (matrix addition and element-wise product). The feature information extraction layer thus performs feature extraction from different angles that supplement and corroborate one another, strengthening the image feature extraction effect;
(2) In a conventional face recognition neural network, feature extraction units and pooling layers are arranged alternately, the whole network has a serial structure, and only one pooling layer sits between two feature extraction units; the information is therefore pooled in a single way, which limits the feature extraction effect of the whole network on low-quality images. The invention applies multiple strided convolutions (Conv3x3_s2 and Conv5x5_s2) and pooling layers (MaxPool1 and MaxPool2) to the feature maps of the different branches respectively, so that the length and width of the feature maps are reduced while the information loss caused by a single pooling mode is well alleviated;
(3) The feature maps generated by the multiple branches (F5, F6, F7 and F8) are input in parallel into the composite attention calibration unit. Compared with the conventional single feature map input, the composite attention calibration unit can learn the importance of different information from different angles, and the attention map M it produces calibrates the information better.
Drawings
FIG. 1 is a schematic structural diagram of the face image recognition network model according to the embodiment;
FIG. 2 is a schematic structural diagram of the feature information extraction layer in the face image recognition network model shown in FIG. 1;
FIG. 3 is a schematic structural diagram of the composite attention calibration unit in the feature information extraction layer shown in FIG. 2;
FIG. 4 is a schematic structural diagram of the feature information extraction layer in the comparative example;
In the drawings: 1 - unknown face image; 2 - feature information extraction layer; 21 - pre-residual module; 3 - global pooling processing layer; 4 - composite attention calibration unit; 5 - face feature vector.
Detailed Description
Embodiment:
As shown in the drawings of the specification, FIG. 1, FIG. 2 and FIG. 3 are, respectively, a schematic structural diagram of the face image recognition network model, of the feature information extraction layer 2, and of the composite attention calibration unit 4 according to this embodiment. The global pooling processing layer 3 is realized as a global average pooling layer, four feature information extraction layers 2 are arranged in the network backbone, and each feature information extraction layer 2 can be expressed by the following mathematical model:
F0 = R(X)
F1 = ReLU1(Conv3x3_s1(F0)), F2 = ReLU2(Conv5x5_s1(F0))
F3 = F1 + F2, F4 = F1 ⊙ F2
F5 = ReLU3(Conv3x3_s2(F3)), F6 = ReLU4(Conv5x5_s2(F4))
F7 = MaxPool1(F3), F8 = MaxPool2(F4)
M = CACU(F5, F6, F7, F8)
Y = M ⊙ ReLU5(Conv1x1_s1(Concat(F5, F6, F7, F8)))

wherein the symbols have the same meanings as in the disclosure above: X is the layer input, R the pre-residual module, F0 its output, F1 to F8 the branch feature maps, CACU the composite attention calibration unit, M the attention map, and Y the layer output.
For the first feature information extraction layer 2, the unknown face image 1 (with 3 channels) is input; after the first convolution operation in the pre-residual module 21 (Conv_a), a feature map with 48 channels is output (its length and width equal those of the unknown face image 1), and the size of the feature map remains unchanged before and after the second convolution operation (Conv_b). For the following three feature information extraction layers 2, the length, width and channel number of the feature map remain unchanged before and after both convolution operations (Conv_a and Conv_b) in the pre-residual module 21.
In all feature information extraction layers 2, the length, width and channel number of the feature map remain unchanged before and after the Conv3x3_s1 and Conv5x5_s1 convolution operations, so within the same feature information extraction layer 2 the feature maps F1 and F2 have the same size; F1 and F2 can therefore be added to obtain F3 and multiplied element-wise to obtain F4. By filling appropriate padding values for the Conv3x3_s2, Conv5x5_s2, MaxPool1 and MaxPool2 operations, the feature maps F5, F6, F7 and F8 obtained from them have length and width equal to half those of F3 (within the same feature information extraction layer 2), and their channel numbers are equal to that of F3. F5, F6, F7 and F8 are then fused by splicing followed by the Conv1x1_s1 convolution operation, and the finally output feature map Y has twice the channel number of F0 and half the length and width of F0 (within the same feature information extraction layer 2).
For the composite attention calibration unit 4, after F5, F6, F7 and F8 are input, on the one hand they are spliced and passed through the Conv3x3_s1 convolution to obtain a feature map U with 4 channels; a global max pooling operation is then performed on U in the spatial direction and, after sigmoid activation, a vector v of length 4 is obtained. On the other hand, F5, F6, F7 and F8 are each subjected to a global max pooling operation in the channel direction, and the resulting four two-dimensional matrices are spliced to obtain a feature map W with 4 channels. The vector v and the feature map W are then multiplied element-wise, each component of v modulating one layer of W; finally, the Conv1x1_s1 convolution compresses the channel number to 1 and, after sigmoid activation, the attention map M is generated. M is a two-dimensional matrix whose length and width are equal to those of the input feature maps F5 to F8; taking the element-wise product of M and the feature map obtained after activation of the fused branches calibrates the feature information at different positions in the spatial direction of the feature map. Because the composite attention calibration unit 4 uses the vector v to modulate each layer of W, it exploits not only the feature relations of F5, F6, F7 and F8 at different positions in the spatial direction but also their feature relations in the channel direction. The composite attention calibration unit 4 therefore makes high use of the input information, more fully mines and learns the relative relations of the information at different positions of F5, F6, F7 and F8, and realizes more effective calibration after the element-wise product operation.
After the image information has been transferred along the network backbone, the abstract feature map output by the last feature information extraction layer 2 has 768 channels, and after global average pooling of each layer of the abstract feature map, a face feature vector 5 of length 768 is generated. All target feature vectors in the retrieval library also have length 768 and are obtained by inputting high-definition face images into the trained face image recognition network model. In this embodiment, the Euclidean distances between the face feature vector 5 and all target feature vectors in the retrieval library are calculated; the identity corresponding to the target feature vector that is closest to the face feature vector 5 and whose Euclidean distance is smaller than the threshold is the identity of the unknown face image 1. If the Euclidean distances between the face feature vector 5 and all target feature vectors in the retrieval library are greater than or equal to the threshold, it is judged that the unknown face image 1 is not in the retrieval library.
In this embodiment, the SCface data set is used to train and test the network model. The SCface data set contains high-definition and low-resolution face images of 130 persons: each person has only 1 high-definition face image and 15 low-resolution face images, the latter captured by 5 cameras arranged at three different distances (1 meter, 2.6 meters and 4.2 meters). In this implementation, 3 images are randomly drawn from the 5 images captured at each distance to form a training set of 1170 images, and the remaining 780 low-resolution face images serve as the test set. During training, the network model is optimized with a triplet loss function. After training, the high-definition images of the 130 persons are input into the network model and the output feature vectors form the target feature vectors of the retrieval library; the images of the test set are then input into the trained network model for testing. For comparison, this embodiment also trains MIND-Net, a state-of-the-art low-resolution face recognition model, on the same training set and tests it on the same test set; the comparison results are shown in Table 1.
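A minimal sketch of the triplet-loss optimization described here (the `model` and `train_loader` are assumed to exist, and the margin and learning rate are illustrative values, not taken from the patent):

```python
import torch
import torch.nn as nn

triplet_loss = nn.TripletMarginLoss(margin=0.2)  # illustrative margin
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for anchor, positive, negative in train_loader:  # batches of image triplets
    # anchor and positive share an identity; negative is a different person
    loss = triplet_loss(model(anchor), model(positive), model(negative))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```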
Table 1. Test results of the embodiment and the MIND-Net model on the test set
Comparing the final recognition accuracies, the embodiment's accuracy on face images of every resolution is higher than that of MIND-Net; in particular, for the lowest-resolution images (captured at a distance of 4.2 meters), the recognition accuracy is greatly improved.
Comparative example:
This comparative example is intended to more fully illustrate the role of the composite attention calibration unit 4 proposed by the invention in the overall model. In the comparative example, the composite attention calibration unit 4 of the embodiment is removed and a CBAM module is used instead to calibrate the feature map; the structure of the new feature information extraction layer 2 is shown in FIG. 4. The rest of the network model remains unchanged and the training and testing process is completely consistent with the embodiment; the test results of the modified network model are shown in Table 2.
Table 2. Test results of the comparative example on the test set
Comparing the data in the two tables above shows that when the resolution is relatively high, the feature information in the original input image is relatively rich and the performance improvement brought by the composite attention calibration unit 4 is relatively limited. When the resolution is relatively low, however, fully exploiting the originally scarce high-value information becomes important, and the effect of the composite attention calibration unit 4 on improving network performance is very obvious.
The above embodiments only express specific implementations of the present invention, and while their description is relatively specific and detailed, they are not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art can make various changes and modifications without departing from the spirit of the invention, and all such changes and modifications fall within the scope of the invention.

Claims (5)

1. A low-quality face image recognition method with an attention mechanism, characterized by comprising the following steps:
s10, acquiring an unknown face image, and acquiring a trained face image recognition network model;
the face image recognition network model comprises a backbone and a global pooling processing layer connected to the tail of the backbone; the backbone comprises a plurality of feature information extraction layers connected in sequence, and the mathematical model of each feature information extraction layer is:

F0 = R(X)
F1 = ReLU1(Conv3x3_s1(F0)), F2 = ReLU2(Conv5x5_s1(F0))
F3 = F1 + F2, F4 = F1 ⊙ F2
F5 = ReLU3(Conv3x3_s2(F3)), F6 = ReLU4(Conv5x5_s2(F4))
F7 = MaxPool1(F3), F8 = MaxPool2(F4)
M = CACU(F5, F6, F7, F8)
Y = M ⊙ ReLU5(Conv1x1_s1(Concat(F5, F6, F7, F8)))

wherein X represents the input of the feature information extraction layer and R represents the pre-residual module; Conv3x3_s1 represents a convolution operation with kernel size 3×3 and stride 1, Conv5x5_s1 a convolution with kernel size 5×5 and stride 1, Conv3x3_s2 a convolution with kernel size 3×3 and stride 2, and Conv5x5_s2 a convolution with kernel size 5×5 and stride 2; MaxPool1 and MaxPool2 each represent a max pooling operation with a 2×2 pooling window and stride 2; Conv1x1_s1 represents a convolution with kernel size 1×1 and stride 1; ⊙ denotes the element-wise product and Concat denotes splicing feature maps together; ReLU1 to ReLU5 each represent a ReLU activation function; F0 is the feature map output by the pre-residual module; F1 and F2 are the feature maps generated after activating the stride-1 convolutions of F0; F3 is the feature map generated by adding F1 and F2, and F4 the feature map generated by their element-wise product; F5 and F6 are the feature maps generated after activating the stride-2 convolutions of F3 and F4; F7 and F8 are the feature maps generated by pooling F3 and F4; CACU represents the composite attention calibration unit and M the attention map it generates and outputs; and Y is the feature map output by the feature information extraction layer;
s20, inputting the unknown face image into the face image recognition network model, and carrying out feature extraction operation on image information by each feature information extraction layer in sequence along with the transfer of the image information along the backbone until an abstract feature image is output by the last feature information extraction layer;
s30, inputting the abstract feature map into the global pooling processing layer, performing global pooling operation on each layer of the abstract feature map by using the global pooling processing layer, and outputting to obtain a face feature vector;
s40, calculating the distances between the face feature vectors and all target feature vectors in a retrieval library, wherein the identity corresponding to the target feature vector which is closest to the face feature vector and meets the threshold condition is the identity of the unknown face image.
2. The low-quality face image recognition method with an attention mechanism according to claim 1, wherein the global pooling processing layer is a global average pooling layer.
3. The low-quality face image recognition method with an attention mechanism according to claim 1, wherein the mathematical model of the composite attention calibration unit is:

U = Conv3x3_s1(Concat(F5, F6, F7, F8))
v = Sigmoid1(GMP_s(U))
W = Concat(GMP_c(F5), GMP_c(F6), GMP_c(F7), GMP_c(F8))
M = Sigmoid2(Conv1x1_s1(v ⊙ W))

wherein the composite attention calibration unit takes the feature maps F5, F6, F7 and F8 as input; GMP_s and GMP_c both represent global max pooling operations on a feature map, GMP_s operating in the spatial direction and GMP_c in the channel direction; Concat represents splicing feature maps together; Conv3x3_s1 represents a convolution operation with kernel size 3×3 and stride 1, and Conv1x1_s1 a convolution operation with kernel size 1×1 and stride 1; ⊙ represents the element-wise product; Sigmoid1 and Sigmoid2 each represent a sigmoid activation function; U is the feature map generated by the convolution operation on the spliced inputs; v is the vector generated after function activation; W is the feature map obtained by subjecting F5, F6, F7 and F8 respectively to global max pooling in the channel direction and splicing the results; and M is the attention map output by the composite attention calibration unit.
4. The low-quality face image recognition method with an attention mechanism according to claim 1, wherein the mathematical model of the pre-residual module is:

T = ReLU_a(Conv_a(X))
F0 = T + ReLU_b(Conv_b(T))

wherein X represents the input of the pre-residual module; ReLU_a and ReLU_b each represent a ReLU activation function; Conv_a and Conv_b each represent a convolution operation with kernel size 3×3 and stride 1; T is the feature map generated after activation of the first convolution; and F0 is the feature map output by the pre-residual module.
5. A low-quality face image recognition device with an attention mechanism, comprising a processor and a memory, the memory storing a computer program and the processor being configured to perform the low-quality face image recognition method with an attention mechanism according to any one of claims 1-4 by loading the computer program.
CN202310272773.4A 2023-03-21 2023-03-21 Low-quality face image recognition method and equipment with attention mechanism Active CN115984949B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310272773.4A CN115984949B (en) 2023-03-21 2023-03-21 Low-quality face image recognition method and equipment with attention mechanism


Publications (2)

Publication Number Publication Date
CN115984949A 2023-04-18
CN115984949B CN115984949B (en) 2023-07-04

Family

ID=85958600

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310272773.4A Active CN115984949B (en) 2023-03-21 2023-03-21 Low-quality face image recognition method and equipment with attention mechanism

Country Status (1)

Country Link
CN (1) CN115984949B (en)



Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190303754A1 (en) * 2018-03-28 2019-10-03 University Of Maryland, College Park L2 constrained softmax loss for discriminative face verification
CN110781784A (en) * 2019-10-18 2020-02-11 高新兴科技集团股份有限公司 Face recognition method, device and equipment based on double-path attention mechanism
GB202007052D0 (en) * 2020-05-13 2020-06-24 Facesoft Ltd Facial re-enactment
CN112200161A (en) * 2020-12-03 2021-01-08 北京电信易通信息技术股份有限公司 Face recognition detection method based on mixed attention mechanism
CN112949565A (en) * 2021-03-25 2021-06-11 重庆邮电大学 Single-sample partially-shielded face recognition method and system based on attention mechanism
WO2023005161A1 (en) * 2021-07-27 2023-02-02 平安科技(深圳)有限公司 Face image similarity calculation method, apparatus and device, and storage medium
CN113724203A (en) * 2021-08-03 2021-11-30 唯智医疗科技(佛山)有限公司 Segmentation method and device for target features in OCT (optical coherence tomography) image
CN113688783A (en) * 2021-09-10 2021-11-23 柚皮(重庆)科技有限公司 Face feature extraction method, low-resolution face recognition method and device
CN114360030A (en) * 2022-01-17 2022-04-15 重庆锐云科技有限公司 Face recognition method based on convolutional neural network
CN114998958A (en) * 2022-05-11 2022-09-02 华南理工大学 Face recognition method based on lightweight convolutional neural network
CN115100720A (en) * 2022-07-04 2022-09-23 威海职业学院(威海市技术学院) Low-resolution face recognition method
CN115439329A (en) * 2022-11-10 2022-12-06 四川轻化工大学 Face image super-resolution reconstruction method and computer-readable storage medium
CN115661911A (en) * 2022-12-23 2023-01-31 四川轻化工大学 Face feature extraction method, device and storage medium

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
HUILIN GE et al.: "Facial expression recognition based on deep learning", Computer Methods and Programs in Biomedicine, vol. 215, pp. 1-9
REN Feikai et al.: "Research on CNN face recognition based on LBP and data augmentation", Computer Technology and Development, vol. 30, no. 3, pp. 62-66
ZHU Siyu: "Unsupervised person re-identification based on deep residual networks", China Masters' Theses Full-text Database, Information Science and Technology, vol. 2022, no. 3, pp. 138-1648
LUO Jinmei et al.: "Research on face recognition algorithm based on multi-feature fusion CNN", Aeronautical Computing Technique, vol. 49, no. 3, pp. 40-45

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116311479A (en) * 2023-05-16 2023-06-23 四川轻化工大学 Face recognition method, system and storage medium for unlocking automobile
CN116311479B (en) * 2023-05-16 2023-07-21 四川轻化工大学 Face recognition method, system and storage medium for unlocking automobile

Also Published As

Publication number Publication date
CN115984949B (en) 2023-07-04

Similar Documents

Publication Publication Date Title
Wang et al. Detect globally, refine locally: A novel approach to saliency detection
CN110532859B (en) Remote sensing image target detection method based on deep evolution pruning convolution net
CN109191382B (en) Image processing method, device, electronic equipment and computer readable storage medium
CN109063724B (en) Enhanced generation type countermeasure network and target sample identification method
CN110059728B (en) RGB-D image visual saliency detection method based on attention model
CN112116601B (en) Compressed sensing sampling reconstruction method and system based on generation of countermeasure residual error network
CN113159051A (en) Remote sensing image lightweight semantic segmentation method based on edge decoupling
CN110245683B (en) Residual error relation network construction method for less-sample target identification and application
CN110929736A (en) Multi-feature cascade RGB-D significance target detection method
KR102042168B1 (en) Methods and apparatuses for generating text to video based on time series adversarial neural network
WO2024027095A1 (en) Hyperspectral imaging method and system based on double rgb image fusion, and medium
CN110210492B (en) Stereo image visual saliency detection method based on deep learning
CN112036260B (en) Expression recognition method and system for multi-scale sub-block aggregation in natural environment
CN114332466B (en) Continuous learning method, system, equipment and storage medium for image semantic segmentation network
CN112396645A (en) Monocular image depth estimation method and system based on convolution residual learning
CN110674925B (en) No-reference VR video quality evaluation method based on 3D convolutional neural network
CN111210382A (en) Image processing method, image processing device, computer equipment and storage medium
CN115984949A (en) Low-quality face image recognition method and device with attention mechanism
CN110991563B (en) Capsule network random routing method based on feature fusion
CN114742985A (en) Hyperspectral feature extraction method and device and storage medium
CN115797808A (en) Unmanned aerial vehicle inspection defect image identification method, system, device and medium
CN114663880A (en) Three-dimensional target detection method based on multi-level cross-modal self-attention mechanism
CN115331259A (en) Three-dimensional human body posture estimation method, system and storage medium
CN108810551B (en) Video frame prediction method, terminal and computer storage medium
CN110135428A (en) Image segmentation processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant