CN115661911A - Face feature extraction method, device and storage medium - Google Patents
- Publication number: CN115661911A
- Application number: CN202211658800.3A
- Authority: CN (China)
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Classification: Image Analysis (AREA)
Abstract
The invention discloses a face feature extraction method, face feature extraction equipment and a storage medium, belonging to the technical field of face recognition. The feature extraction method comprises the steps of: obtaining a face image and a trained feature extraction network model; extracting basic feature information of the face image with a basic operation layer; inputting the basic feature map into a first deep tone feature extraction monomer; having each subsequent deep tone feature extraction monomer take the feature map output by its upstream deep tone feature extraction monomer as input, then generate and output a corresponding intermediate feature map; repeating until the last deep tone feature extraction monomer generates and outputs a final-stage feature map; and inputting the final-stage feature map into a feature shaping unit to generate a face feature vector, completing face feature extraction. By arranging a plurality of spatial attention mechanisms, the feature extraction network model of the invention gradually modulates the feature information, so that the network can effectively eliminate noise and extract the core feature information.
Description
Technical Field
The invention belongs to the technical field of face recognition, and particularly relates to a face feature extraction method, face feature extraction equipment and a storage medium.
Background
With the improvement and popularization of hardware performance, face recognition technology has gradually moved out of the laboratory and into people's daily lives. After long-term development, many face recognition algorithms can now handle common real-world scenes well and achieve satisfactory recognition accuracy. However, when the quality of the captured face image is poor (for example, unsatisfactory illumination, large posture changes, or varied expressions), existing algorithms still suffer from poor robustness; in particular, for facial image changes caused by aging, existing algorithms struggle to effectively extract the required feature information.
Disclosure of Invention
In view of the above-mentioned deficiencies in the prior art, the present invention provides a face feature extraction method, an apparatus and a storage medium, so as to effectively learn and extract feature information in age-spanning face images and improve the accuracy of recognizing such images.
In order to achieve the above purpose, the solution adopted by the invention is as follows: a face feature extraction method comprises the following steps:
s100, obtaining a face image, and obtaining a trained feature extraction network model; the characteristic extraction network model is sequentially provided with a basic operation layer, a plurality of deep tone characteristic extraction monomers and a characteristic shaping unit, wherein the plurality of deep tone characteristic extraction monomers are sequentially connected in series;
s200, inputting the face image into the feature extraction network model, extracting basic feature information of the face image by using the basic operation layer, and then generating a basic feature map;
s300, inputting the basic feature map into a first deep tone feature extraction monomer, and outputting a primary feature map by the first deep tone feature extraction monomer after feature extraction operation;
s400, the next deep tone feature extraction monomer takes the feature graph output by the upstream deep tone feature extraction monomer as input, then carries out feature extraction operation, generates and outputs a corresponding intermediate feature graph;
s500, continuously repeating the step S400 until the last deep tone feature extraction monomer generates and outputs a final-stage feature map;
s600, inputting the final-stage feature map into the feature shaping unit, and after shaping operation is carried out on the final-stage feature map, generating a face feature vector to finish face feature extraction;
the calculation operation process inside the deep tone feature extraction monomer is represented as the following mathematical model:
F1 = σ1(Conv1(X)), F2 = σ2(Conv2(F1)), F3 = σ3(Conv3(F2))

F4 = A3(F3) ⊙ F3

S = M(F1, F2, F3)

F5 = F4 + A4(S) ⊙ S

Y = V(A5(P1, P2, P3, P4, P5) ⊙ F5)

wherein X represents the precursor feature map input to said deep tone feature extraction monomer; Conv1, Conv2 and Conv3 respectively represent a first convolution operation, a second convolution operation and a third convolution operation; A1, A2, A3, A4 and A5 respectively represent a first attention module, a second attention module, a third attention module, a fourth attention module and a fifth attention module; ⊙ represents the element-wise product operation; σ1, σ2 and σ3 respectively represent a first activation function, a second activation function and a third activation function; F1, F2 and F3 respectively represent the first feature map, the second feature map and the third feature map generated after activation by σ1, σ2 and σ3; F4 represents a fourth feature map obtained by calibrating the third feature map with the third attention module; M represents the multi-scale fusion unit; S represents the side branch feature map generated by the multi-scale fusion unit fusing the first feature map, the second feature map and the third feature map; F5 represents a fifth feature map obtained by adding the side branch feature map, after calibration by the fourth attention module, to the fourth feature map; P1 represents a first process feature map output from the first attention module; P2 represents a second process feature map output from the second attention module; P3 represents a third process feature map output from the third attention module; P4 and P5 respectively represent a fourth process feature map and a fifth process feature map, both output from the fourth attention module; V represents a dimension-changing unit for increasing the number of feature map channels and decreasing the feature map width and height; and Y represents the back-driving feature map output by the deep tone feature extraction monomer.
Further, the convolution kernel sizes of the first convolution operation, the second convolution operation and the third convolution operation are all 3*3, and the step sizes are all 1; the first activation function, the second activation function, and the third activation function are all ReLU functions.
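The three stacked convolution operations and their ReLU activations can be sketched in PyTorch as follows; this is a minimal illustration, and the channel count (kept constant here, with padding chosen so that width and height are preserved) is an assumption for the sketch rather than something fixed by the text.

```python
import torch
from torch import nn

class ConvStack(nn.Module):
    """Sketch of the first/second/third convolution operations:
    3*3 kernels, step size 1, ReLU activations."""

    def __init__(self, channels: int):
        super().__init__()
        # padding=1 keeps the spatial size unchanged, matching the
        # K x G x C sizes reported for the first/second/third feature maps
        self.conv1 = nn.Conv2d(channels, channels, 3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, stride=1, padding=1)
        self.conv3 = nn.Conv2d(channels, channels, 3, stride=1, padding=1)
        self.act = nn.ReLU()

    def forward(self, x):
        f1 = self.act(self.conv1(x))   # first feature map
        f2 = self.act(self.conv2(f1))  # second feature map
        f3 = self.act(self.conv3(f2))  # third feature map
        return f1, f2, f3
```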
Further, the internal operation process of the multi-scale fusion unit is represented as the following mathematical model:
U1 = F1 + F2 + F3

U2 = F1 ⊙ F2 ⊙ F3

U3 = σ4(Conv4(Cat(U1, U2)))

S = σ5(Conv5(Cat(U3, F1, F2, F3)))

wherein F1, F2 and F3 respectively represent the first feature map, the second feature map and the third feature map, which the multi-scale fusion unit takes as input; S represents the side branch feature map output by the multi-scale fusion unit; ⊙ represents the element-wise product operation; U1 represents a first fusion feature map generated by adding the first feature map, the second feature map and the third feature map; U2 represents a second fusion feature map generated by element-wise multiplication of the first feature map, the second feature map and the third feature map; Cat( ) represents the splicing of feature maps; Conv4 and Conv5 respectively represent a fourth convolution operation and a fifth convolution operation, the convolution kernel sizes of which are both 1*1 with step sizes of 1; σ4 and σ5 respectively represent a fourth activation function and a fifth activation function, both ReLU functions; and U3 represents a third fusion feature map generated after activation by the fourth activation function.
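The multi-scale fusion unit can be sketched as follows: the three input feature maps are preliminarily fused by element-wise addition and multiplication, then two concatenate -> 1*1 convolution -> ReLU stages produce the side branch feature map. Feeding the three original input maps into the final stage alongside the third fusion feature map is an assumption drawn from the embodiment description, not a detail the claims fix.

```python
import torch
from torch import nn

class MultiScaleFusion(nn.Module):
    """Sketch of the multi-scale fusion unit (assumed wiring)."""

    def __init__(self, channels: int):
        super().__init__()
        # 1*1 convolutions with step size 1, as described in the text
        self.conv4 = nn.Conv2d(2 * channels, channels, 1, stride=1)
        self.conv5 = nn.Conv2d(4 * channels, channels, 1, stride=1)
        self.act = nn.ReLU()

    def forward(self, f1, f2, f3):
        u1 = f1 + f2 + f3   # first fusion feature map (element-wise addition)
        u2 = f1 * f2 * f3   # second fusion feature map (element-wise product)
        # third fusion feature map after splicing, conv4 and ReLU
        u3 = self.act(self.conv4(torch.cat([u1, u2], dim=1)))
        # side branch feature map: splice u3 with the three inputs (assumed)
        return self.act(self.conv5(torch.cat([u3, f1, f2, f3], dim=1)))
```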
Further, a hierarchical pooling layer and a hierarchical activation function are arranged in each of the first attention module, the second attention module and the third attention module, the hierarchical pooling layer is arranged at an upstream end of the hierarchical activation function, the hierarchical pooling layer is used for performing global maximum pooling operation on the feature map in the channel direction, and the hierarchical activation function is sigmoid;
the first, second, and third process profiles are matrices of hierarchical pooling outputs of the first, second, and third attention modules, respectively.
Furthermore, a branch pooling layer, a lead-in full-connection layer, a lead-in activation layer, a lead-out full-connection layer and a lead-out activation layer are sequentially arranged in the fourth attention module; the branch pooling layer is used for performing a global maximum pooling operation on the feature map in the spatial direction, the lead-in activation layer is the nonlinear activation function ReLU, and the lead-out activation layer is the nonlinear activation function sigmoid;
the fourth process feature map is the vector output after the operation of the branch pooling layer, and the fifth process feature map is the vector output after activation by the lead-out activation layer.
Further, the mathematical model of the fifth attention module is:
R1 = Cat(P1, P2, P3)

R2 = β1(B1(P4))

R3 = β2(B2(P5))

AttI = βI(PoolI(R1 ⊙ R2 ⊙ R3))

wherein P1, P2, P3, P4 and P5 respectively represent the first process feature map, the second process feature map, the third process feature map, the fourth process feature map and the fifth process feature map input to the fifth attention module; Cat( ) represents the splicing of the feature maps; R1 represents a first internal reference feature map generated by splicing the first process feature map, the second process feature map and the third process feature map; B1 and B2 respectively represent a first bridging full-connection layer and a second bridging full-connection layer; β1, β2 and βI respectively represent a first bridging activation function, a second bridging activation function and an integration activation function; R2 and R3 respectively represent a second internal reference feature map and a third internal reference feature map generated after activation by the first bridging activation function and the second bridging activation function; ⊙ represents the element-wise product operation; PoolI represents the integrated pooling layer, which is used for performing a global maximum pooling operation on the feature map in the channel direction; and AttI represents the integrated attention map output by the fifth attention module.
Furthermore, the dimension-changing unit comprises a dimension-changing convolution layer, a dimension-changing activation layer and a dimension-changing pooling layer arranged in sequence; the convolution kernel size of the dimension-changing convolution layer is 3*3 with a step size of 1, the dimension-changing activation layer is a ReLU function, and the dimension-changing pooling layer performs a maximum pooling operation on the feature map with a pooling window size of 2*2 and a step size of 2.
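The dimension-changing unit maps a K x G x C feature map to K/2 x G/2 x 2C; a sketch, with padding chosen (an assumption) so that the convolution itself preserves width and height:

```python
import torch
from torch import nn

class DimChange(nn.Module):
    """Sketch of the dimension-changing unit: 3*3 stride-1 convolution
    doubling the channels, ReLU, then 2*2 stride-2 max pooling."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Conv2d(channels, 2 * channels, 3, stride=1, padding=1)
        self.act = nn.ReLU()
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)

    def forward(self, x):
        # K x G x C  ->  K x G x 2C  ->  K/2 x G/2 x 2C
        return self.pool(self.act(self.conv(x)))
```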
Further, the feature shaping unit comprises a shaping pooling layer, a trunk full-connection layer and a shaping activation layer which are connected in sequence, the shaping pooling layer is used for performing global average pooling operation on the feature map in the spatial direction, and the shaping activation layer is a sigmoid function.
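The feature shaping unit can be sketched as follows; the 512-dimensional output of the trunk full-connection layer is an assumed embedding size for illustration, since the text does not state the vector length.

```python
import torch
from torch import nn

class FeatureShaping(nn.Module):
    """Sketch of the feature shaping unit: spatial global average pooling,
    trunk full-connection layer, then a sigmoid activation."""

    def __init__(self, channels: int, embed_dim: int = 512):
        super().__init__()
        self.fc = nn.Linear(channels, embed_dim)  # trunk full-connection layer

    def forward(self, x):
        pooled = x.mean(dim=(2, 3))            # shaping pooling (global average)
        return torch.sigmoid(self.fc(pooled))  # face feature vector
```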
The invention also provides a face feature extraction device, which comprises a processor and a memory, wherein the memory stores a computer program, and the processor is used for executing the face feature extraction method by loading the computer program.
The present invention also provides a storage medium having stored thereon a computer program which, when executed by a processor, implements the face feature extraction method as described above.
The invention has the beneficial effects that:
(1) As network depth increases, the receptive field of the feature map gradually grows. By arranging a plurality of spatial attention mechanisms, the feature extraction network model of the invention gradually modulates the feature information and calibrates different spatial position information more accurately; for image differences caused by different ages, the network can effectively eliminate noise and extract the core feature information;
(2) Research shows that after the multi-scale fusion unit fuses the hierarchy information, a single channel attention mechanism improves network performance about as much as placing a channel attention mechanism at each hierarchy, but with a smaller computation cost and a lighter network. This is because, in the process of fusing the hierarchy information, the multi-scale fusion unit removes interference information well, greatly improving the calibration efficiency of the channel attention mechanism;
(3) Network architectures based on pure attention mechanisms, such as the Transformer, prove that a large amount of effective information still resides inside the attention mechanism, and that fully utilizing this information enables complex feature mappings. Existing convolutional neural networks, by contrast, emphasize only the output end of the attention mechanism and lack information exchange between its intermediate stages and the rest of the network, which limits the calibration effect of the attention mechanism and reduces the network's nonlinear fitting ability in complex scenes. In the present invention, the fifth attention module integrates and utilizes the intermediate information of the other four attention modules, achieving front-back cooperative modulation between the fifth attention module and the other four, and enhancing modulation consistency and overall performance. Test results show that the accuracy of age-spanning face recognition is significantly improved after the fifth attention module of the invention is adopted.
Drawings
Fig. 1 is a feature extraction network model architecture diagram of embodiment 1;
FIG. 2 is an internal structure diagram of the deep tone feature extraction monomer in embodiment 1;
FIG. 3 is an internal structure diagram of the first attention module in embodiment 1;
FIG. 4 is an internal architecture diagram of the multi-scale fusion unit in embodiment 1;
FIG. 5 is an internal structure diagram of the fourth attention module in embodiment 1;
FIG. 6 is an internal structure diagram of the fifth attention module in embodiment 1;
FIG. 7 is an internal structure diagram of the deep tone feature extraction monomer in embodiment 2;
in the drawings:
1-face image, 2-basic operation layer, 3-deep tone feature extraction monomer, 31-first attention module, 311-hierarchical pooling layer, 312-hierarchical activation function, 32-second attention module, 33-third attention module, 34-fourth attention module, 341-branch pooling layer, 342-lead-in full-connection layer, 343-lead-in activation layer, 344-lead-out full-connection layer, 345-lead-out activation layer, 35-fifth attention module, 36-multi-scale fusion unit, 37-dimension-changing unit, 371-dimension-changing convolution layer, 372-dimension-changing activation layer, 373-dimension-changing pooling layer, 4-shaping pooling layer, 5-trunk full-connection layer, 6-shaping activation layer.
Detailed Description
The invention is further described below with reference to the accompanying drawings:
Embodiment 1:
Fig. 1 shows the structure of the feature extraction network model in this embodiment; the entire network model is implemented on a computer in Python with the PyTorch framework. After the face image 1 is input into the network model, a convolution operation is first performed by the basic operation layer 2, generating the output basic feature map. The convolution kernel size of the basic operation layer 2 is 3*3 with a step size of 1. There are 5 deep tone feature extraction monomers 3, connected sequentially end to end; the feature information passes through each deep tone feature extraction monomer 3 in turn, and each time a feature map passes through one monomer, its width and height are halved and its number of channels is doubled. The feature shaping unit comprises a shaping pooling layer 4, a trunk full-connection layer 5 and a shaping activation layer 6 connected in sequence; the shaping pooling layer 4 performs a global average pooling operation on the feature map in the spatial direction, and the shaping activation layer is a sigmoid function. Let the size of the face image 1 be W × H × 3 (width × height × channels, the same below); the feature map output by each module then has the following size:
TABLE 1 feature extraction network model output feature map size for each module
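The per-module sizes in Table 1 follow mechanically from the rules stated above (each monomer halves width and height and doubles channels); a small script can recompute them, with the base operation layer's output channel count taken as an assumed hyperparameter since the text does not state it.

```python
def feature_map_sizes(w, h, base_channels, num_monomers=5):
    """Recompute the per-module output sizes implied by the text:
    the base operation layer keeps W x H, and each of the deep tone
    feature extraction monomers halves width/height and doubles channels."""
    sizes = [("input", (w, h, 3)), ("base layer", (w, h, base_channels))]
    cw, ch, cc = w, h, base_channels
    for i in range(1, num_monomers + 1):
        cw, ch, cc = cw // 2, ch // 2, cc * 2
        sizes.append((f"monomer {i}", (cw, ch, cc)))
    return sizes

# Example with an assumed 112 x 112 input and 64 base channels
for name, (w, h, c) in feature_map_sizes(112, 112, 64):
    print(f"{name}: {w} x {h} x {c}")
```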
Fig. 2 shows the internal architecture of a single deep tone feature extraction monomer 3 in this embodiment. The feature map input to a given deep tone feature extraction monomer 3 has size K × G × C, and the first feature map, second feature map, third feature map, fourth feature map, fifth feature map and side branch feature map all have size K × G × C as well. The dimension-changing unit 37 comprises a dimension-changing convolution layer 371, a dimension-changing activation layer 372 and a dimension-changing pooling layer 373 arranged in sequence; the output feature maps of the dimension-changing convolution layer 371 and the dimension-changing activation layer 372 have size K × G × 2C, and the dimension-changing pooling layer 373 outputs a feature map of size K/2 × G/2 × 2C.
The internal operation processes of the first attention module 31, the second attention module 32 and the third attention module 33 are identical. As shown in fig. 3, each is provided with a hierarchical pooling layer 311 and a hierarchical activation function 312 connected in sequence; the hierarchical pooling layer 311 performs a global maximum pooling operation on the feature map in the channel direction, the hierarchical activation function 312 is a sigmoid function, and the outputs of the hierarchical pooling layer 311 and the hierarchical activation function 312 are matrices of size K × G × 1. The first process feature map, second process feature map and third process feature map, which are the outputs of the hierarchical pooling layer 311 in the first attention module 31, the second attention module 32 and the third attention module 33 respectively, therefore also all have size K × G × 1.
FIG. 4 shows the internal architecture of the multi-scale fusion unit 36 in this embodiment. The first feature map, second feature map and third feature map are first preliminarily fused by element-wise addition and element-wise multiplication, and the resulting first fusion feature map and second fusion feature map both have size K × G × C. Splicing, the fourth convolution operation and the fourth activation function then produce the third fusion feature map (size K × G × C). Finally, splicing, the fifth convolution operation and the fifth activation function fuse the third fusion feature map with the first, second and third feature maps to generate the side branch feature map. In this progressive, multi-pass, multi-dimensional manner, the multi-scale fusion unit 36 fuses the first, second and third feature maps efficiently and with fine denoising capability.
Fig. 5 shows the internal architecture of the fourth attention module 34 in this embodiment, which is provided with a branch pooling layer 341, a lead-in full-connection layer 342, a lead-in activation layer 343, a lead-out full-connection layer 344 and a lead-out activation layer 345 connected in sequence. The branch pooling layer 341 performs a global maximum pooling operation on the feature map in the spatial direction, and the fourth process feature map, i.e. the vector output after the operation of the branch pooling layer 341, has size 1 × C. The lead-in full-connection layer 342 has C input elements and C/8 output elements, the lead-in activation layer 343 is the nonlinear activation function ReLU, the lead-out full-connection layer 344 has C/8 input elements and C output elements, and the lead-out activation layer 345 is the nonlinear activation function sigmoid. The fifth process feature map is the vector of size 1 × C output after activation by the lead-out activation layer 345.
Fig. 6 shows the internal architecture of the fifth attention module 35 in this embodiment. The first internal reference feature map, obtained by splicing the first process feature map, the second process feature map and the third process feature map, has size K × G × 3. The first bridging full-connection layer and the second bridging full-connection layer each have C input elements and 3 output elements, and the first bridging activation function, the second bridging activation function and the integration activation function are all sigmoid functions; the second and third internal reference feature maps thus generated both have size 1 × 3 and, through element-wise product operations, assign weight parameters of different sizes to each layer of the first internal reference feature map. The integrated pooling layer performs a global maximum pooling operation on the feature map in the channel direction, and the integration activation function outputs an integrated attention map of size K × G × 1. After the element-wise product operation of the integrated attention map and the fifth feature map, different spatial positions carry different weight parameters, realizing the modulation.
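The fifth attention module described above can be sketched as follows; the wiring (per-layer weighting of the spliced process maps by the two bridged vectors, then max pooling over the three layers and a sigmoid) follows this embodiment, while tensor layouts are illustrative assumptions.

```python
import torch
from torch import nn

class IntegratedAttention(nn.Module):
    """Sketch of the fifth attention module: splice the three K x G x 1
    process maps, weight the layers using the two 1 x C process vectors
    via bridging FC layers, then channel-direction max pool + sigmoid."""

    def __init__(self, channels: int):
        super().__init__()
        self.bridge1 = nn.Linear(channels, 3)  # first bridging full-connection layer
        self.bridge2 = nn.Linear(channels, 3)  # second bridging full-connection layer

    def forward(self, p1, p2, p3, p4, p5):
        r1 = torch.cat([p1, p2, p3], dim=1)       # first internal reference map, (N, 3, K, G)
        r2 = torch.sigmoid(self.bridge1(p4))      # second internal reference map, (N, 3)
        r3 = torch.sigmoid(self.bridge2(p5))      # third internal reference map, (N, 3)
        # assign per-layer weights by broadcast element-wise product
        weighted = r1 * r2[:, :, None, None] * r3[:, :, None, None]
        pooled, _ = weighted.max(dim=1, keepdim=True)  # integrated pooling (channel direction)
        return torch.sigmoid(pooled)                   # integrated attention map, K x G x 1
```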
In practical application, the face image 1 is input into the trained feature extraction network model, and the corresponding face feature vector is obtained through feature extraction. The distance (the L1 distance in this embodiment) between this feature vector and each vector in a preset sample library is then calculated; the identity corresponding to the closest sample vector whose distance is below a preset threshold is taken as the identity of the face image 1, completing face recognition. In this embodiment, the data set VGGFace2 is used as the training set to train the network model, and a triplet loss function is adopted. The model is then tested on the common age-spanning face recognition test data set CPLFW. The test results show that the recognition accuracy of the feature extraction network model of this embodiment on CPLFW is 94.71%, whereas among existing advanced algorithms, VGGFace2 achieves 84.00% and ArcFace achieves 88.36%, both lower than this embodiment.
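The recognition step above amounts to a nearest-neighbor search under the L1 distance with a rejection threshold; a minimal sketch, in which the function name, the library layout and the threshold value are illustrative assumptions:

```python
import numpy as np

def identify(feature, library, threshold=10.0):
    """Return the identity of the closest sample vector in the preset
    library (L1 distance), or None if no sample is within the threshold.

    library: dict mapping identity -> sample feature vector (np.ndarray).
    """
    best_id, best_dist = None, float("inf")
    for identity, sample in library.items():
        dist = float(np.abs(feature - sample).sum())  # L1 distance
        if dist < best_dist:
            best_id, best_dist = identity, dist
    return best_id if best_dist < threshold else None
```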
Example 2:
This embodiment modifies only the internal structure of the deep tone feature extraction monomer 3 on the basis of embodiment 1; the other parts of the network model remain unchanged. Fig. 7 shows the internal architecture of the deep tone feature extraction monomer 3 in embodiment 2: compared with embodiment 1, the fifth attention module 35 is removed for a comparative experiment. After the same training and testing, the results show that the recognition accuracy of the network model of embodiment 2 on CPLFW is 89.24%, lower than that of embodiment 1, which fully illustrates that the fifth attention module 35 of the present invention plays an important promoting role in the network model.
The above embodiments only express specific embodiments of the present invention, and the description is specific and detailed, but not to be understood as limiting the scope of the present invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention.
Claims (10)
1. A face feature extraction method, characterized by comprising the following steps:
s100, obtaining a face image, and obtaining a trained feature extraction network model; the characteristic extraction network model is sequentially provided with a basic operation layer, a plurality of deep tone characteristic extraction monomers and a characteristic shaping unit, wherein the plurality of deep tone characteristic extraction monomers are sequentially connected in series;
s200, inputting the face image into the feature extraction network model, extracting basic feature information of the face image by using the basic operation layer, and then generating a basic feature map;
s300, inputting the basic feature map into a first deep tone feature extraction monomer, and outputting a primary feature map by the first deep tone feature extraction monomer after feature extraction operation;
s400, the next deep tone feature extraction monomer takes the feature graph output by the upstream deep tone feature extraction monomer as input, then carries out feature extraction operation, generates and outputs a corresponding intermediate feature graph;
s500, continuously repeating the step S400 until the last deep tone feature extraction monomer generates and outputs a final-stage feature map;
s600, inputting the final stage feature map into the feature shaping unit, and generating a face feature vector after shaping operation is carried out on the final stage feature map to finish face feature extraction;
the calculation operation process inside the deep tone feature extraction monomer is represented as the following mathematical model:
F1 = σ1(Conv1(X)), F2 = σ2(Conv2(F1)), F3 = σ3(Conv3(F2))

F4 = A3(F3) ⊙ F3

S = M(F1, F2, F3)

F5 = F4 + A4(S) ⊙ S

Y = V(A5(P1, P2, P3, P4, P5) ⊙ F5)

wherein X represents the precursor feature map input to said deep tone feature extraction monomer; Conv1, Conv2 and Conv3 respectively represent a first convolution operation, a second convolution operation and a third convolution operation; A1, A2, A3, A4 and A5 respectively represent a first attention module, a second attention module, a third attention module, a fourth attention module and a fifth attention module; ⊙ represents the element-wise product operation; σ1, σ2 and σ3 respectively represent a first activation function, a second activation function and a third activation function; F1, F2 and F3 respectively represent the first feature map, the second feature map and the third feature map generated after activation by σ1, σ2 and σ3; F4 represents a fourth feature map obtained by calibrating the third feature map with the third attention module; M represents the multi-scale fusion unit; S represents the side branch feature map generated by the multi-scale fusion unit fusing the first feature map, the second feature map and the third feature map; F5 represents a fifth feature map obtained by adding the side branch feature map, after calibration by the fourth attention module, to the fourth feature map; P1 represents a first process feature map output from the first attention module; P2 represents a second process feature map output from the second attention module; P3 represents a third process feature map output from the third attention module; P4 and P5 respectively represent a fourth process feature map and a fifth process feature map, both output from said fourth attention module; V represents a dimension-changing unit for increasing the number of feature map channels and decreasing the feature map width and height; and Y represents the back-driving feature map output by said deep tone feature extraction monomer.
2. The face feature extraction method of claim 1, wherein: the convolution kernel sizes of the first convolution operation, the second convolution operation and the third convolution operation are 3*3, and the step sizes are 1; the first activation function, the second activation function, and the third activation function are all ReLU functions.
3. The face feature extraction method of claim 1, wherein: the internal operation process of the multi-scale fusion unit is expressed as the following mathematical model:
wherein the multi-scale fusion unit takes the first feature map, the second feature map and the third feature map as input and outputs the side-branch feature map; a first fused feature map is generated by adding the first, second and third feature maps; a second fused feature map is generated by multiplying the first, second and third feature maps element-wise; the first and second fused feature maps are spliced together; a fourth convolution operation and a fifth convolution operation have 1×1 convolution kernels with a stride of 1; a fourth activation function and a fifth activation function are both ReLU functions; and a third fused feature map is generated after activation by the fourth activation function.
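Reading "splicing" as channel concatenation and a 1×1 convolution as a per-pixel channel mix, the fusion unit can be sketched in pure Python. Feature maps are nested `[channel][row][col]` lists; all names are illustrative assumptions, not the patented code:

```python
def add_maps(a, b):
    # Element-wise sum of two same-shaped [c][h][w] feature maps.
    return [[[x + y for x, y in zip(ra, rb)] for ra, rb in zip(ca, cb)]
            for ca, cb in zip(a, b)]

def mul_maps(a, b):
    # Element-wise product of two same-shaped [c][h][w] feature maps.
    return [[[x * y for x, y in zip(ra, rb)] for ra, rb in zip(ca, cb)]
            for ca, cb in zip(a, b)]

def conv1x1(fmap, weights):
    # weights: [c_out][c_in]; a 1x1 convolution mixes channels per pixel.
    h, w = len(fmap[0]), len(fmap[0][0])
    return [[[sum(wc * fmap[ci][i][j] for ci, wc in enumerate(row))
              for j in range(w)] for i in range(h)] for row in weights]

def relu_map(fmap):
    return [[[max(0.0, v) for v in r] for r in c] for c in fmap]

def fuse(f1, f2, f3, w4, w5):
    u1 = add_maps(add_maps(f1, f2), f3)   # first fused map: element-wise sum
    u2 = mul_maps(mul_maps(f1, f2), f3)   # second fused map: element-wise product
    cat = u1 + u2                         # splice the two along the channel axis
    u3 = relu_map(conv1x1(cat, w4))       # fourth 1x1 conv + fourth ReLU -> third fused map
    return relu_map(conv1x1(u3, w5))      # fifth 1x1 conv + fifth ReLU -> side branch
```

With all-ones 1×1 weights, the output at each pixel is just the sum of the two fused channels, which makes the data flow easy to verify by hand.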
4. The face feature extraction method of claim 1, wherein: each of the first attention module, the second attention module and the third attention module is provided with a hierarchical pooling layer and a hierarchical activation function; the hierarchical pooling layer is arranged upstream of the hierarchical activation function and performs a global maximum pooling operation on the feature map in the channel direction, and the hierarchical activation function is a sigmoid function;
the first process feature map, the second process feature map and the third process feature map are the matrices output by the hierarchical pooling layers of the first, second and third attention modules, respectively.
5. The face feature extraction method of claim 4, wherein: a branch pooling layer, a lead-in fully connected layer, a lead-in activation layer, a lead-out fully connected layer and a lead-out activation layer are arranged in sequence in the fourth attention module; the branch pooling layer performs a global maximum pooling operation on the feature map in the spatial direction, the lead-in activation layer is a ReLU nonlinear activation function, and the lead-out activation layer is a sigmoid nonlinear activation function;
the fourth process feature map is the vector output by the branch pooling layer, and the fifth process feature map is the vector output after activation by the lead-out activation layer.
6. The face feature extraction method of claim 5, wherein: the mathematical model of the fifth attention module is:
wherein the first, second, third, fourth and fifth process feature maps are the inputs of the fifth attention module; a first internal reference feature map is generated by splicing the first process feature map, the second process feature map and the third process feature map; a first bridging fully connected layer and a second bridging fully connected layer are applied, and a second internal reference feature map and a third internal reference feature map are generated after activation by a first bridging activation function and a second bridging activation function respectively; an element-wise product operation and an integrating activation function are applied; an integrated pooling layer performs global maximum pooling on the feature map in the channel direction; and the fifth attention module outputs an integrated attention map.
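Only the pooling stage of this module is fully specified in the surviving text, so the sketch below covers just that: splicing the three process matrices as channels, taking the global maximum along the channel direction, and squashing with a sigmoid. The bridging fully connected layers are omitted because their wiring is given only as images; everything here is an illustrative assumption:

```python
import math

def integrated_attention(p1, p2, p3):
    # Splice the three process feature maps as channels, then apply the
    # integrated pooling layer (global max over channels) and a sigmoid.
    spliced = [p1, p2, p3]
    h, w = len(p1), len(p1[0])
    pooled = [[max(ch[i][j] for ch in spliced) for j in range(w)]
              for i in range(h)]
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in pooled]
```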
7. The face feature extraction method of claim 1, wherein: the variable-dimension unit comprises a variable-dimension convolution layer, a variable-dimension activation layer and a variable-dimension pooling layer arranged in sequence; the convolution kernel of the variable-dimension convolution layer is 3×3 with a stride of 1; the variable-dimension activation layer is a ReLU function; and the variable-dimension pooling layer performs a maximum pooling operation on the feature map with a 2×2 pooling window and a stride of 2.
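The 2×2, stride-2 maximum pooling that halves each channel's width and height can be sketched directly in pure Python (even input dimensions assumed; the convolution and ReLU stages are standard and omitted):

```python
def max_pool_2x2(fmap):
    # 2x2 window, stride 2: each channel's width and height are halved,
    # keeping the maximum of every non-overlapping 2x2 block.
    out = []
    for ch in fmap:
        h, w = len(ch), len(ch[0])
        out.append([[max(ch[i][j], ch[i][j + 1], ch[i + 1][j], ch[i + 1][j + 1])
                     for j in range(0, w - 1, 2)] for i in range(0, h - 1, 2)])
    return out
```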
8. The face feature extraction method of claim 1, wherein: the feature shaping unit comprises a shaping pooling layer, a trunk fully connected layer and a shaping activation layer connected in sequence; the shaping pooling layer performs a global average pooling operation on the feature map in the spatial direction, and the shaping activation layer is a sigmoid function.
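A sketch of this shaping chain: spatial global average pooling collapses each channel to one value, a fully connected layer mixes the resulting vector, and a sigmoid squashes it into the final descriptor (pure Python, bias omitted, illustrative names):

```python
import math

def shape_features(fmap, weights):
    # Shaping pooling: global average over the spatial dims, per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
              for ch in fmap]
    # Trunk fully connected layer ([out][in], no bias) + sigmoid shaping activation.
    logits = [sum(w * x for w, x in zip(row, pooled)) for row in weights]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]
```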
9. A facial feature extraction apparatus comprising a processor and a memory, the memory storing a computer program, wherein: the processor is configured to execute the face feature extraction method according to any one of claims 1 to 8 by loading the computer program.
10. A storage medium on which a computer program is stored, the computer program, when executed by a processor, implementing the face feature extraction method according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211658800.3A CN115661911B (en) | 2022-12-23 | 2022-12-23 | Face feature extraction method, device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115661911A true CN115661911A (en) | 2023-01-31 |
CN115661911B CN115661911B (en) | 2023-03-17 |
Family
ID=85023076
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211658800.3A Active CN115661911B (en) | 2022-12-23 | 2022-12-23 | Face feature extraction method, device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115661911B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3690721A1 (en) * | 2019-01-31 | 2020-08-05 | StradVision, Inc. | Method for recognizing face using multiple patch combination based on deep neural network |
WO2021027555A1 (en) * | 2019-08-15 | 2021-02-18 | 华为技术有限公司 | Face retrieval method and apparatus |
CN114120406A (en) * | 2021-11-22 | 2022-03-01 | 四川轻化工大学 | Face feature extraction and classification method based on convolutional neural network |
CN114187261A (en) * | 2021-12-07 | 2022-03-15 | 天津大学 | Non-reference stereo image quality evaluation method based on multi-dimensional attention mechanism |
CN114360030A (en) * | 2022-01-17 | 2022-04-15 | 重庆锐云科技有限公司 | Face recognition method based on convolutional neural network |
CN114743014A (en) * | 2022-03-28 | 2022-07-12 | 西安电子科技大学 | Laser point cloud feature extraction method and device based on multi-head self-attention |
CN114998958A (en) * | 2022-05-11 | 2022-09-02 | 华南理工大学 | Face recognition method based on lightweight convolutional neural network |
CN115100720A (en) * | 2022-07-04 | 2022-09-23 | 威海职业学院(威海市技术学院) | Low-resolution face recognition method |
CN115223221A (en) * | 2022-07-04 | 2022-10-21 | 网易(杭州)网络有限公司 | Face detection method and device, electronic equipment and storage medium |
CN115496651A (en) * | 2021-06-02 | 2022-12-20 | 武汉Tcl集团工业研究院有限公司 | Feature processing method and device, computer-readable storage medium and electronic equipment |
Non-Patent Citations (1)
Title |
---|
SUN, Jinguang; RONG, Wenzhao: "Research on a Region-Based Age Estimation Model" * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115984949A (en) * | 2023-03-21 | 2023-04-18 | 威海职业学院(威海市技术学院) | Low-quality face image recognition method and device with attention mechanism |
CN116311479A (en) * | 2023-05-16 | 2023-06-23 | 四川轻化工大学 | Face recognition method, system and storage medium for unlocking automobile |
CN116311479B (en) * | 2023-05-16 | 2023-07-21 | 四川轻化工大学 | Face recognition method, system and storage medium for unlocking automobile |
Also Published As
Publication number | Publication date |
---|---|
CN115661911B (en) | 2023-03-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108764471B (en) | Neural network cross-layer pruning method based on feature redundancy analysis | |
CN115661911A (en) | Face feature extraction method, device and storage medium | |
CN113537138B (en) | Traffic sign identification method based on lightweight neural network | |
CN112257794B (en) | YOLO-based lightweight target detection method | |
CN110188768B (en) | Real-time image semantic segmentation method and system | |
CN110096968B (en) | Ultra-high-speed static gesture recognition method based on depth model optimization | |
CN109783910B (en) | Structure optimization design method for accelerating by using generation countermeasure network | |
CN104102919A (en) | Image classification method capable of effectively preventing convolutional neural network from being overfit | |
CN112580515B (en) | Lightweight face key point detection method based on Gaussian heat map regression | |
CN113240683B (en) | Attention mechanism-based lightweight semantic segmentation model construction method | |
CN116645716B (en) | Expression recognition method based on local features and global features | |
CN110059593B (en) | Facial expression recognition method based on feedback convolutional neural network | |
CN112001294A (en) | YOLACT + + based vehicle body surface damage detection and mask generation method and storage device | |
CN114742985A (en) | Hyperspectral feature extraction method and device and storage medium | |
CN115602152A (en) | Voice enhancement method based on multi-stage attention network | |
CN111079767A (en) | Neural network model for segmenting image and image segmentation method thereof | |
CN115240259A (en) | Face detection method and face detection system based on YOLO deep network in classroom environment | |
CN116152128A (en) | High dynamic range multi-exposure image fusion model and method based on attention mechanism | |
CN110414516B (en) | Single Chinese character recognition method based on deep learning | |
CN116051534A (en) | Warehouse ceiling solar panel defect detection method based on artificial intelligence | |
CN111414988A (en) | Remote sensing image super-resolution method based on multi-scale feature self-adaptive fusion network | |
CN114359638A (en) | Residual error capsule network classification model, classification method, equipment and storage medium of image | |
CN111507276B (en) | Construction site safety helmet detection method based on hidden layer enhanced features | |
CN116977822A (en) | Image recognition network integrating CNN and transducer model | |
CN115587628A (en) | Deep convolutional neural network lightweight method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||