CN110610129A - Deep learning face recognition system and method based on self-attention mechanism - Google Patents

Info

Publication number
CN110610129A
CN110610129A (application CN201910719368.6A)
Authority
CN
China
Prior art keywords: characteristic diagram, channel, attention, obtaining, self
Prior art date
Legal status: Pending (assumed; not a legal conclusion)
Application number
CN201910719368.6A
Other languages
Chinese (zh)
Inventor
凌贺飞 (Ling Hefei)
邬继阳 (Wu Jiyang)
李平 (Li Ping)
Current Assignee
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Huazhong University of Science and Technology
Priority to CN201910719368.6A
Publication of CN110610129A
Legal status: Pending

Classifications

    • G06N3/045 — Neural networks: combinations of networks
    • G06N3/08 — Neural networks: learning methods
    • G06V40/161 — Human faces: detection; localisation; normalisation
    • G06V40/168 — Human faces: feature extraction; face representation
    • G06V40/172 — Human faces: classification, e.g. identification


Abstract

The invention discloses a deep learning face recognition system and method based on a self-attention mechanism, belonging to the field of computer vision and pattern recognition. The invention constructs a channel self-attention module that applies dimension conversion and transposition to the three-dimensional data of a feature map and learns a cross-correlation matrix among channels to express the relative relationship between different channels; the channel-optimized features are obtained by computation with the original features, and different weights are assigned to different channels, thereby realizing channel filtering and selection and reducing redundant information in the feature channels. A spatial self-attention module is also constructed, which models the spatial information of the three-dimensional feature map and learns a cross-correlation matrix among the spatial positions of the feature map to represent the relative relationship between different positions; the spatially optimized features are obtained by computation with the input features, and different weights are given to different positions of the face feature map, achieving selection of the important feature regions of the face and concentrating the features on those regions.

Description

Deep learning face recognition system and method based on self-attention mechanism
Technical Field
The invention belongs to the field of computer vision and pattern recognition, and particularly relates to a deep learning face recognition system and method based on a self-attention mechanism.
Background
In recent years, with the rapid development of parallel computing capability, the field of computer vision has advanced greatly under the impetus of deep learning and now sees application demand in many domains. Face recognition is a technology that lets a computer automatically identify persons appearing in surveillance data by means of a vision algorithm, and it is widely applied in fields such as intelligent security, staff attendance, community inspection and self-service. For example, the sky-eye monitoring system of China's "safe city, smart community" plan uses face recognition to track and capture suspects. Face recognition is also common in daily life and work: systems installed in campus laboratories and enterprise offices both handle staff attendance and guard against intrusion by outsiders. In the field of financial payment, face recognition is likewise fully exploited, for instance on bank ATMs to prevent fraudulent card use, and in face-scan mobile payment to further guarantee security. In an actual deployment environment, face recognition generally needs only an ordinary camera to complete recognition and authentication; it is dynamic and requires no active cooperation, which makes it more convenient than traditional biometrics such as iris and fingerprint recognition, and these factors have made its application increasingly widespread.
Since 2012, computers' understanding and analysis of face images has taken a great leap forward, owing to theoretical progress in deep learning and technical progress in GPU acceleration. Face recognition with convolutional neural networks has also moved rapidly toward commercial use under non-cooperative conditions. In particular, current real-time personnel monitoring systems based on surveillance video can, while analyzing the video stream, automatically detect and capture face regions, upload them to a back-end server for real-time watch-list comparison, and raise an alarm on abnormal face images, saving a great deal of manpower, material and financial resources in the construction of today's "safe cities".
Thanks to the analysis capability of Deep Convolutional Neural Networks (DCNNs) on images, deep features based on convolutional neural networks have gradually replaced traditional hand-crafted features in face recognition. Compared with traditional shallow hand-crafted features, deep features are more discriminative and more robust. At present, face recognition algorithms based on convolutional neural networks, such as CosFace and ArcFace, mainly constrain the feature space by modifying the loss function, without targeted research on the network structure. These methods extract features with a generic classification network and then constrain the feature space at the final classification layer, so as to increase inter-class distance and reduce intra-class distance. Modifying the loss function does, to a large extent, enhance the discriminative power of the features, but such methods ignore the problems of the convolutional network structure itself in face feature extraction. Existing convolutional neural networks have a single, fixed structure, so forward-propagated feature extraction suffers from information redundancy, weak flexibility and somewhat poor generalization.
Most existing face recognition algorithms use a generic image classification backbone, and such networks have two disadvantages in actual face applications. First, the feature maps extracted by a standard CNN often have a large number of channels — in the later stages of ResNet the channel number reaches 2048 — which to a great extent introduces information redundancy and may even risk overfitting. Although regularization approaches such as Dropout alleviate this problem to some degree, the results are still unsatisfactory. Second, based on human perception of faces in the real world, different parts of a face image have different importance in actual recognition; but the parameter-sharing mechanism of convolution kernels gives the same weight to all image pixels and cannot treat different positions differently.
Disclosure of Invention
Aiming at the defects of the prior art, namely the overfitting caused by the high channel count of feature maps in generic convolutional neural networks and the failure to treat different face positions differently due to the weight-sharing mechanism of convolution kernels, the invention provides a deep learning face recognition method based on a self-attention mechanism. The method learns the cross-correlation information among feature-map channels through a channel self-attention module, obtaining the relationship matrix among channels and endowing different channels with different importance; a spatial self-attention module then learns the cross-correlation information among feature-map positions, obtaining the relationship matrix among positions and giving different weights to the spatial positions of the feature map, so as to learn the importance of different regions of the face. The method retains the excellent performance of the original convolutional neural network while optimizing the face features during forward propagation, reducing information redundancy among channels, concentrating the convolution kernels on the more important positions of the face image, improving recognition accuracy and enhancing the flexibility and generalization capability of the model.
To achieve the above object, according to one aspect of the present invention, there is provided a deep learning face recognition system based on a self-attention mechanism, the system including:
the input module is used for selecting a face picture training set and inputting a face picture to be recognized;
a self-attention-based deep learning module with ResNet as the backbone network, comprising a plurality of residual blocks and a plurality of attention modules, each attention module comprising a channel attention module and/or a spatial attention module concatenated at the end of a residual block, with a fully connected layer as the last layer; the residual blocks are used for further extracting feature maps from the input face picture or from preceding feature maps; the channel attention module is used for learning a cross-correlation matrix among feature-map channels during forward propagation to obtain a channel-optimized feature map; the spatial attention module is used for learning a cross-correlation matrix among feature-map spatial positions during forward propagation to obtain a spatially optimized feature map; and the fully connected layer is used for converting the finally optimized feature map into features;
the training module is used for training the self-attention-based deep learning module by adopting the face picture training set to obtain a trained self-attention-based deep learning module;
and the face recognition module is used for inputting the face picture to be recognized into the trained self-attention-based deep learning module and outputting a face recognition result.
Specifically, the channel attention module is realized by the following steps:
inputting a feature map F_I ∈ R^(C×H×W), and obtaining feature maps θ(F_I) ∈ R^(C×H×W) and φ(F_I) ∈ R^(C×H×W) through two parallel convolutions;
passing θ(F_I) and φ(F_I) through parallel maximum pooling and average pooling respectively to obtain feature maps Pool(F_I)_1 ∈ R^(C×(H/2)×(W/2)) and Pool(F_I)_2 ∈ R^(C×(H/2)×(W/2));
converting Pool(F_I)_1 through dimension conversion to obtain a feature map Pool'(F_I)_1 ∈ R^(C×(HW/4)); converting Pool(F_I)_2 through dimension conversion to obtain a feature map Pool'(F_I)_2 ∈ R^(C×(HW/4)), and then transposing Pool'(F_I)_2;
obtaining the channel self-attention matrix A_C = Softmax(Pool'(F_I)_1 ⊗ Pool'(F_I)_2^T) ∈ R^(C×C) through a Softmax activation function operated row by row;
passing the feature map F_I through a convolution to obtain a feature map ρ(F_I) ∈ R^(C×H×W), and converting ρ(F_I) through dimension conversion to obtain a feature map ρ'(F_I) ∈ R^(C×HW);
performing matrix multiplication on A_C and ρ'(F_I) followed by dimension conversion, and adding the converted result bit by bit to F_I to obtain the final channel-optimized feature map F_C = α·(A_C ⊗ ρ'(F_I)) ⊕ F_I with dimension C×H×W;
wherein C, H and W denote the channel dimension, height and width of the original feature map, θ, φ and ρ denote channel convolution operations, ⊕ denotes bit-by-bit (element-wise) addition, ⊗ denotes matrix multiplication, and α denotes a coefficient controlling the proportion of the original feature to the channel-optimized feature.
In particular, the channel self-attention matrix A_C expands as:
A_C(i, j) = exp(Pool'(F_I)_1(i) · Pool'(F_I)_2(j)) / Σ_{k=1..C} exp(Pool'(F_I)_1(i) · Pool'(F_I)_2(k)),
wherein A_C(i, j) denotes the correlation between the i-th channel and the j-th channel.
Specifically, the spatial attention module is realized by the following steps:
inputting a feature map F_C ∈ R^(C×H×W), and obtaining feature maps θ(F_C) ∈ R^((C/r)×H×W) and φ(F_C) ∈ R^((C/r)×H×W) through two parallel convolutions;
passing θ(F_C) through parallel maximum pooling and average pooling to obtain a feature map Pool(F_C)_1 ∈ R^((C/r)×(H/2)×(W/2));
converting φ(F_C) and Pool(F_C)_1 through dimension conversion to obtain feature maps φ'(F_C) ∈ R^((C/r)×HW) and Pool'(F_C)_1 ∈ R^((C/r)×(HW/4)), and transposing φ'(F_C);
performing matrix multiplication on φ'(F_C)^T and Pool'(F_C)_1 followed by a Softmax nonlinear activation to obtain the spatial self-attention matrix A_S ∈ R^(HW×(HW/4));
passing F_C through a convolution to obtain a feature map ρ(F_C) ∈ R^(C×H×W), then through parallel maximum pooling and average pooling to obtain a feature map in R^(C×(H/2)×(W/2)), and through dimension conversion to obtain a feature map ρ'(F_C) ∈ R^(C×(HW/4));
obtaining the spatially optimized feature map F_S = β·(ρ'(F_C) ⊗ A_S^T) ⊕ F_C with dimension C×H×W through matrix multiplication and bit-by-bit addition;
wherein C, H and W denote the channel dimension, height and width of the original feature map, θ, φ and ρ denote channel convolution operations, ⊕ denotes bit-by-bit addition, ⊗ denotes matrix multiplication, β denotes a coefficient controlling the proportion of the original feature to the spatially optimized feature, and the variable r is a channel dimension-reduction coefficient satisfying r > 1 with C/r an integer.
Specifically, the loss function L is calculated as:
L = −(1/N) Σ_{i=1..N} log [ e^(s·(cos(θ_{y_i} + m_1) − m_2)) / ( e^(s·(cos(θ_{y_i} + m_1) − m_2)) + Σ_{j=1, j≠y_i}^{n} e^(s·cos θ_j) ) ]
wherein N and n respectively denote the number of samples in the current batch and the total number of classes, the hyper-parameter s denotes a scale factor, θ_{y_i} denotes the angle between the current sample feature and the weight of its class, θ_j denotes the angle between the sample feature and the weight of class j, and m_1 and m_2, the two hyper-parameters of the loss function, respectively denote the angle margin and the cosine margin.
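As a sketch, a softmax loss with an angle margin m_1 and a cosine margin m_2 of this kind can be written in NumPy as follows. The function name, feature/weight shapes and default hyper-parameter values are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

def combined_margin_loss(features, weights, labels, s=64.0, m1=0.5, m2=0.35):
    """Sketch of a softmax loss with angle margin m1 and cosine margin m2.

    features: (N, d) sample features; weights: (n, d) class weight vectors;
    labels: (N,) ground-truth class indices. The default s, m1, m2 values
    are illustrative assumptions, not values fixed by the text.
    """
    # cosine of the angle between each normalized feature and class weight
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = f @ w.T                                   # (N, n)
    theta = np.arccos(np.clip(cos, -1.0, 1.0))
    logits = s * cos                                # non-target: s*cos(theta_j)
    idx = np.arange(len(labels))
    # target class logit: s*(cos(theta_{y_i} + m1) - m2)
    logits[idx, labels] = s * (np.cos(theta[idx, labels] + m1) - m2)
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[idx, labels].mean()
```

Setting m_1 = m_2 = 0 recovers a plain scaled-softmax cross entropy; the margins only shrink the target-class logit, which is what enlarges inter-class distance during training.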
To achieve the above object, according to another aspect of the present invention, there is provided a deep learning face recognition method based on a self-attention mechanism, the method including the steps of:
training the self-attention-based deep learning network by adopting a face picture training set to obtain a trained self-attention-based deep learning network;
inputting the face picture to be recognized into the trained self-attention-based deep learning network, and outputting a face recognition result;
the self-attention-based deep learning network takes ResNet as the backbone network and comprises a plurality of residual blocks and a plurality of attention modules, each attention module comprising a channel attention module and/or a spatial attention module concatenated at the end of a residual block, with a fully connected layer as the last layer; the residual blocks are used for further extracting feature maps from the input face picture or from preceding feature maps; the channel attention module is used for learning a cross-correlation matrix among feature-map channels during forward propagation to obtain a channel-optimized feature map; the spatial attention module is used for learning a cross-correlation matrix among feature-map spatial positions during forward propagation to obtain a spatially optimized feature map; and the fully connected layer is used for converting the finally optimized feature map into features.
Specifically, the channel attention module is realized by the following steps:
inputting a feature map F_I ∈ R^(C×H×W), and obtaining feature maps θ(F_I) ∈ R^(C×H×W) and φ(F_I) ∈ R^(C×H×W) through two parallel convolutions;
passing θ(F_I) and φ(F_I) through parallel maximum pooling and average pooling respectively to obtain feature maps Pool(F_I)_1 ∈ R^(C×(H/2)×(W/2)) and Pool(F_I)_2 ∈ R^(C×(H/2)×(W/2));
converting Pool(F_I)_1 through dimension conversion to obtain a feature map Pool'(F_I)_1 ∈ R^(C×(HW/4)); converting Pool(F_I)_2 through dimension conversion to obtain a feature map Pool'(F_I)_2 ∈ R^(C×(HW/4)), and then transposing Pool'(F_I)_2;
obtaining the channel self-attention matrix A_C = Softmax(Pool'(F_I)_1 ⊗ Pool'(F_I)_2^T) ∈ R^(C×C) through a Softmax activation function operated row by row;
passing the feature map F_I through a convolution to obtain a feature map ρ(F_I) ∈ R^(C×H×W), and converting ρ(F_I) through dimension conversion to obtain a feature map ρ'(F_I) ∈ R^(C×HW);
performing matrix multiplication on A_C and ρ'(F_I) followed by dimension conversion, and adding the converted result bit by bit to F_I to obtain the final channel-optimized feature map F_C = α·(A_C ⊗ ρ'(F_I)) ⊕ F_I with dimension C×H×W;
wherein C, H and W denote the channel dimension, height and width of the original feature map, θ, φ and ρ denote channel convolution operations, ⊕ denotes bit-by-bit (element-wise) addition, ⊗ denotes matrix multiplication, and α denotes a coefficient controlling the proportion of the original feature to the channel-optimized feature.
In particular, the channel self-attention matrix A_C expands as:
A_C(i, j) = exp(Pool'(F_I)_1(i) · Pool'(F_I)_2(j)) / Σ_{k=1..C} exp(Pool'(F_I)_1(i) · Pool'(F_I)_2(k)),
wherein A_C(i, j) denotes the correlation between the i-th channel and the j-th channel.
Specifically, the spatial attention module is realized by the following steps:
inputting a feature map F_C ∈ R^(C×H×W), and obtaining feature maps θ(F_C) ∈ R^((C/r)×H×W) and φ(F_C) ∈ R^((C/r)×H×W) through two parallel convolutions;
passing θ(F_C) through parallel maximum pooling and average pooling to obtain a feature map Pool(F_C)_1 ∈ R^((C/r)×(H/2)×(W/2));
converting φ(F_C) and Pool(F_C)_1 through dimension conversion to obtain feature maps φ'(F_C) ∈ R^((C/r)×HW) and Pool'(F_C)_1 ∈ R^((C/r)×(HW/4)), and transposing φ'(F_C);
performing matrix multiplication on φ'(F_C)^T and Pool'(F_C)_1 followed by a Softmax nonlinear activation to obtain the spatial self-attention matrix A_S ∈ R^(HW×(HW/4));
passing F_C through a convolution to obtain a feature map ρ(F_C) ∈ R^(C×H×W), then through parallel maximum pooling and average pooling to obtain a feature map in R^(C×(H/2)×(W/2)), and through dimension conversion to obtain a feature map ρ'(F_C) ∈ R^(C×(HW/4));
obtaining the spatially optimized feature map F_S = β·(ρ'(F_C) ⊗ A_S^T) ⊕ F_C with dimension C×H×W through matrix multiplication and bit-by-bit addition;
wherein C, H and W denote the channel dimension, height and width of the original feature map, θ, φ and ρ denote channel convolution operations, ⊕ denotes bit-by-bit addition, ⊗ denotes matrix multiplication, β denotes a coefficient controlling the proportion of the original feature to the spatially optimized feature, and the variable r is a channel dimension-reduction coefficient satisfying r > 1 with C/r an integer.
Specifically, the loss function L is calculated as:
L = −(1/N) Σ_{i=1..N} log [ e^(s·(cos(θ_{y_i} + m_1) − m_2)) / ( e^(s·(cos(θ_{y_i} + m_1) − m_2)) + Σ_{j=1, j≠y_i}^{n} e^(s·cos θ_j) ) ]
wherein N and n respectively denote the number of samples in the current batch and the total number of classes, the hyper-parameter s denotes a scale factor, θ_{y_i} denotes the angle between the current sample feature and the weight of its class, θ_j denotes the angle between the sample feature and the weight of class j, and m_1 and m_2, the two hyper-parameters of the loss function, respectively denote the angle margin and the cosine margin.
Generally, by the above technical solution conceived by the present invention, the following beneficial effects can be obtained:
1. According to the principle of the attention mechanism, a channel self-attention module is constructed. By applying operations such as dimension conversion and transposition to the three-dimensional data of the feature map, the module learns a cross-correlation matrix among channels that represents the relative relationship between different channels; the channel-optimized features are then obtained by computation with the original features. Having learned the cross-correlation among channels, the module assigns different weights to different channels, realizing channel filtering and selection and reducing redundant information in the feature channels.
2. According to the principles of the attention mechanism and global feature expression, a spatial self-attention module is constructed. The module models the spatial information of the three-dimensional feature map and learns a cross-correlation matrix among its spatial positions that represents the relative relationship between different positions; the spatially optimized features are then obtained by computation with the input features. Having learned the cross-correlation among spatial positions, the module gives different weights to different positions of the face feature map, realizing the selection of important facial feature regions, treating different parts differently and concentrating the features on the most important regions of the face.
Drawings
Fig. 1 is an overall framework diagram of a deep learning face recognition system based on a self-attention mechanism according to an embodiment of the present invention;
FIG. 2 is a block diagram of a channel self-attention module according to an embodiment of the present invention;
FIG. 3 is a block diagram of a spatial self-attention module according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a model combination of channel self-attention and spatial self-attention provided by an embodiment of the present invention;
fig. 5 is an effect diagram of a face recognition method based on a self-attention mechanism according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
As shown in fig. 1, the deep learning face recognition system based on the self-attention mechanism improves on standard ResNet to form a Residual Self-Attention Network model (SRANet). Specifically, the invention appends serial channel and spatial attention modules at the end of each standard residual block to compute the channel and spatial attention relationship matrices, and then obtains the final optimized features through matrix multiplication. In addition, the last average pooling layer of the original ResNet structure is removed and replaced by a fully connected layer of fixed 512-dimensional size for the final feature extraction. Compared with the single per-channel average of average pooling, the fully connected layer considers channel and spatial information simultaneously, which matches the channel and spatial attention modules and makes the design more reasonable.
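The difference between the two read-out strategies can be sketched with toy shapes. The sizes and random "weights" below are illustrative assumptions; in SRANet the fully connected layer is learned and the final feature map is larger:

```python
import numpy as np

rng = np.random.default_rng(0)
feat_map = rng.standard_normal((8, 4, 4))     # toy final feature map: C=8, H=W=4

# Global average pooling: one value per channel; every spatial position
# contributes with the same weight, so spatial layout is discarded.
gap_feature = feat_map.mean(axis=(1, 2))      # shape (8,)

# Fully connected read-out to a 512-d embedding: each (channel, position)
# element gets its own weight, so channel AND spatial information survive.
W_fc = rng.standard_normal((512, feat_map.size)) * 0.01   # hypothetical learned weights
fc_feature = W_fc @ feat_map.reshape(-1)      # shape (512,)
```

The sketch only illustrates the shapes involved: the pooled read-out loses the H × W layout entirely, while the fully connected read-out can weight each position of each channel differently.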
Taking the data in fig. 1 as an example, assume the original output of some residual block in the convolutional neural network is F_I. The self-attention-based SRANet first inputs F_I into the channel self-attention module to compute the channel attention matrix; after the cross-correlation information among different channels is obtained, F_I is matrix-multiplied with the channel self-attention matrix and added bit by bit to the original input, yielding the channel-optimized feature F_C. Similarly, the spatially optimized feature F_S is obtained by the same procedure. Finally, F_S, the output of this residual structure, is fed into the next residual structure; the ellipses in the figure indicate that there are several such structures.
The invention divides the face recognition into four stages: the method comprises a face image preprocessing stage, a self-attention model building stage, a loss function calculating stage and a feature extraction and retrieval comparison stage.
Preprocessing stage of face image
The face image preprocessing stage comprises selection of a face data set and preprocessing of the face data. The preprocessing is mainly divided into two parts: face detection with key-point alignment, and image data normalization.
For face detection and face key-point alignment, the invention uses the cascaded multi-task convolutional neural network MTCNN, common in the industry, to predict the face position and the face key points simultaneously. In actual training, 4 bounding-box coordinates and 5 key-point positions are predicted, and the detected original face is then cropped by a similarity transformation into a face picture of fixed size 112 × 112.
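MTCNN itself is not reproduced here, but given its 5 predicted key points, the similarity transformation used for cropping can be estimated by least squares (Umeyama's method). The function below is an illustrative sketch, not the patent's code:

```python
import numpy as np

def similarity_transform(src, dst):
    """Least-squares similarity transform (Umeyama's method) mapping the
    point set src onto dst, e.g. 5 detected facial key points onto a
    canonical key-point template of the 112x112 crop. Returns a 2x3
    matrix M such that dst ~ M[:, :2] @ [x, y] + M[:, 2]."""
    src = np.asarray(src, float)
    dst = np.asarray(dst, float)
    src_mean, dst_mean = src.mean(0), dst.mean(0)
    src_c, dst_c = src - src_mean, dst - dst_mean
    cov = dst_c.T @ src_c / len(src)
    U, S, Vt = np.linalg.svd(cov)
    d = np.sign(np.linalg.det(U @ Vt))           # guard against reflections
    D = np.diag([1.0, d])
    R = U @ D @ Vt                               # optimal rotation
    scale = np.trace(np.diag(S) @ D) / src_c.var(axis=0).sum()
    t = dst_mean - scale * R @ src_mean
    return np.hstack([scale * R, t[:, None]])
```

Applying the returned 2 × 3 matrix to the image with an affine warp yields the aligned fixed-size crop.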
For image data normalization, the invention normalizes the pixel values of the original RGB image to [-1, 1] by subtracting 127.5 and then dividing by 128. In addition, during training the normalized training images are horizontally flipped with a probability of 50%, which expands the data set and improves overall system accuracy.
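This normalization and flip can be sketched directly; the function name and signature are illustrative, while the constants (127.5, 128, probability 0.5) are as stated above:

```python
import numpy as np

def preprocess(img_rgb, train=True, rng=None):
    """Normalize a 112x112 RGB face crop to [-1, 1]; during training,
    flip horizontally with probability 0.5."""
    x = (img_rgb.astype(np.float32) - 127.5) / 128.0
    if train:
        if rng is None:
            rng = np.random.default_rng()
        if rng.random() < 0.5:
            x = x[:, ::-1, :]    # flip along the width axis (H, W, C layout)
    return x
```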
Self-attention model construction phase
(2.1) selecting backbone network
The invention adopts ResNet-50/ResNet-100 as the backbone network of the self-attention model to train the face recognition model, and in the design of ResNet residual block, the convolution kernel size of 3 x 3 is selected.
(2.2) design and implementation of channel self-attention Module
A channel self-attention module is added behind each residual block of the backbone network to learn the cross-correlation among feature-map channels during forward propagation of the convolutional neural network.
The structure of the channel self-attention module is shown in fig. 2. Input feature map FI∈RC×H×WFirstly, inputting an input feature map into two parallel convolution layers of 1 multiplied by 1, keeping the space scale of the input feature map unchanged, and halving the number of one channel to obtain the feature mapIn order to reduce the burden of matrix calculation, the invention simultaneously increases the maximum pooling and the average pooling which are connected in parallel before the matrix calculation and after the convolution layer, and the pooling kernels of the maximum pooling and the average pooling are the same in size and are both C multiplied by 2. On one hand, the performance stability is kept, and on the other hand, the consumption of video memory is greatly reduced. Through the operation of two pooling layers, the channel self-attention module only retains one-quarter size of spatial data for calculation, so there are:
wherein,
next, two feature maps Pool (F)I)1、Pool(FI)2Dimension conversion and/or transposition operations are performed. Characteristic diagram Pool (F)I)1Through dimension conversion, willIs converted intoObtaining a characteristic diagram Pool' (F)I)1. Characteristic diagram Pool (F)I)2Through dimension conversion, willIs converted intoObtaining a characteristic diagram Pool' (F)I)2Then is transposed into
Finally, a row-wise Softmax activation yields the channel self-attention matrix A_C ∈ R^((C/2)×(C/2)):

A_C = Softmax( Pool'(F_I)_1 ⊗ Pool'(F_I)_2^T )

Expanded, the (i, j) entry of the formula is:

A_C^(i,j) = exp( Pool'(F_I)_1^i · (Pool'(F_I)_2^T)^j ) / Σ_{j=1}^{C/2} exp( Pool'(F_I)_1^i · (Pool'(F_I)_2^T)^j )

where A_C^(i,j) expresses the correlation between the i-th channel and the j-th channel. After the channel attention matrix is computed, the input feature F_I is likewise passed through a 1×1 convolution to obtain ρ(F_I), which is reshaped by dimension conversion into the matrix ρ'(F_I). Then A_C and ρ'(F_I) are matrix-multiplied, the product is converted back to the original dimensions, and the result is added bit by bit to F_I, giving the final channel-optimized feature F_C of dimension C×H×W:

F_C = α · reshape( A_C ⊗ ρ'(F_I) ) ⊕ F_I

In all the formulas above, C, H, W denote the channel dimension, height, and width of the input feature map; θ, φ, and ρ denote channel convolution operations; ⊕ denotes bit-by-bit addition; and ⊗ denotes matrix multiplication. The coefficient α, which controls the proportion of the original features to the channel-optimized features, is a learnable parameter initialized to 0; its purpose is to reduce the training difficulty of the neural network at the start of training.
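As a hedged illustration, the forward pass just described can be sketched in numpy. The 1×1 convolutions are modeled as channel-mixing matrices and, for simplicity, keep the full channel count C (the invention halves it for θ and φ); the parallel max and average pooling branches are assumed to be summed, which the patent text does not specify; all names are illustrative:

```python
import numpy as np

def _softmax_rows(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def _pool2x2(x):
    # Parallel 2x2 max pooling and 2x2 average pooling; summing the two
    # branches is an assumption, the patent only says they run in parallel.
    c, h, w = x.shape
    t = x.reshape(c, h // 2, 2, w // 2, 2)
    return t.max(axis=(2, 4)) + t.mean(axis=(2, 4))

def channel_self_attention(F_I, W_theta, W_phi, W_rho, alpha=0.0):
    """F_I: (C, H, W); W_theta/W_phi/W_rho: (C, C) channel-mixing matrices
    standing in for the 1x1 convolutions."""
    C, H, W = F_I.shape
    flat = F_I.reshape(C, H * W)
    theta = (W_theta @ flat).reshape(C, H, W)
    phi = (W_phi @ flat).reshape(C, H, W)
    # Keep only a quarter of the spatial data before the matrix product.
    p1 = _pool2x2(theta).reshape(C, -1)        # (C, HW/4)
    p2 = _pool2x2(phi).reshape(C, -1)          # (C, HW/4)
    A_C = _softmax_rows(p1 @ p2.T)             # (C, C) channel attention
    rho = W_rho @ flat                         # (C, HW)
    out = alpha * (A_C @ rho) + flat           # residual fusion; alpha starts at 0
    return out.reshape(C, H, W)
```

With alpha initialized to 0, the module initially passes its input through unchanged, matching the patent's training-difficulty rationale.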
(2.3) design and implementation of spatial self-attention Module
After the channel self-attention module, a spatial self-attention module is connected in series to learn the relationships between feature map positions; all of its parameters are trained via neural network back-propagation and learned adaptively.
As shown in fig. 3, given an input feature map F_C ∈ R^(C×H×W), the spatial self-attention module first feeds F_C into two parallel 1×1 convolution layers that keep its spatial scale unchanged but reduce its channel dimension by a factor r, producing feature maps θ(F_C), φ(F_C) ∈ R^((C/r)×H×W), where r > 1 and C/r is an integer. Then parallel max pooling and average pooling (with identical kernels) reduce the spatial dimension of one feature map; as shown in fig. 3, θ(F_C) is pooled to obtain Pool(F_C)_1 ∈ R^((C/r)×(H/2)×(W/2)).
Then φ(F_C) and Pool(F_C)_1 are converted by dimension conversion to R^((C/r)×HW) and R^((C/r)×(HW/4)) respectively, giving the feature maps φ'(F_C) and Pool'(F_C)_1, and φ'(F_C) is transposed to R^(HW×(C/r)). The invention then performs matrix multiplication and Softmax non-linear activation on these two features to obtain the spatial self-attention matrix A_S ∈ R^(HW×(HW/4)):

A_S = Softmax( φ'(F_C)^T ⊗ Pool'(F_C)_1 )

Expanding it as for the channel self-attention: in this formula, HW/4 is the number of positions in the pooled feature's spatial dimension, and A_S is a 2-dimensional matrix describing the relationship between any two spatial positions of the input features; for example, A_S^(i,j) denotes the correlation between the i-th position of φ'(F_C)^T and the j-th position of Pool'(F_C)_1, where Softmax is computed row by row.
After the spatial self-attention relation matrix is computed, the input feature F_C is likewise passed through one convolution to obtain ρ(F_C) ∈ R^(C×H×W), which is pooled to R^(C×(H/2)×(W/2)) and then converted by dimension conversion into the matrix Pool'(ρ(F_C)) ∈ R^(C×(HW/4)). Finally, matrix multiplication and bit-by-bit addition yield the spatially optimized feature F_S of dimension C×H×W:

F_S = β · reshape( Pool'(ρ(F_C)) ⊗ A_S^T ) ⊕ F_C

In all the equations above, θ, φ, ρ denote convolution operations; ⊕ denotes bit-by-bit addition; ⊗ denotes matrix multiplication; β is a learnable parameter initialized to 0; and the variable r, the channel dimension-reduction coefficient, is finally set to 16 through comparative experiments.
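A hedged numpy sketch of the spatial self-attention forward pass, mirroring the channel module: the 1×1 convolutions are modeled as channel-mixing matrices, the max/avg pooling branches are assumed to be summed (the patent only says they run in parallel), and function and parameter names are illustrative:

```python
import numpy as np

def _softmax_rows(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def _pool2x2(x):
    # Parallel 2x2 max and average pooling, summed (assumed fusion).
    c, h, w = x.shape
    t = x.reshape(c, h // 2, 2, w // 2, 2)
    return t.max(axis=(2, 4)) + t.mean(axis=(2, 4))

def spatial_self_attention(F_C, W_theta, W_phi, W_rho, beta=0.0):
    """F_C: (C, H, W); W_theta/W_phi: (C//r, C) reduce channels by r,
    W_rho: (C, C) keeps the full channel count."""
    C, H, W = F_C.shape
    flat = F_C.reshape(C, H * W)
    theta = (W_theta @ flat).reshape(-1, H, W)          # (C/r, H, W)
    phi = W_phi @ flat                                  # (C/r, HW)
    p1 = _pool2x2(theta).reshape(theta.shape[0], -1)    # (C/r, HW/4)
    A_S = _softmax_rows(phi.T @ p1)                     # (HW, HW/4) position attention
    rho = (W_rho @ flat).reshape(C, H, W)
    pr = _pool2x2(rho).reshape(C, -1)                   # (C, HW/4)
    out = beta * (pr @ A_S.T) + flat                    # (C, HW); beta starts at 0
    return out.reshape(C, H, W)
```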
(2.4) feature optimization and feature extraction settings
As shown in fig. 4, in order to fully and comprehensively optimize the three-dimensional feature map, the channel self-attention module and the spatial self-attention module are connected in series after each ResNet residual block of the backbone network, optimizing the feature map during forward propagation. The topology of fig. 4 includes dot-multiply and dot-add operations, with arrows indicating the data flow from input to output.
In addition, for final feature extraction, the global average pooling layer of the original ResNet is removed, and the optimized features are instead fed into a fully connected layer of fixed dimension 512, which performs the final feature extraction.
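As a hedged sketch, the modified head simply flattens the optimized feature map and projects it through one 512-dimensional fully connected layer (weight shapes and names here are illustrative):

```python
import numpy as np

def embedding_head(feat_map, W_fc, b_fc):
    """Flatten the optimized feature map (global average pooling removed)
    and project it to a fixed 512-D face embedding.
    feat_map: (C, H, W); W_fc: (512, C*H*W); b_fc: (512,)."""
    x = feat_map.reshape(-1)          # (C*H*W,)
    return W_fc @ x + b_fc            # (512,)
```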
Loss function calculation stage
To effectively address the inability of conventional loss functions to comprehensively and effectively constrain all samples of the feature space, the invention proposes an improved loss function L based on multi-margin constraints:

L = -(1/N) Σ_{i=1}^{N} log( e^{s(cos(θ_{y_i} + m_1) − m_2)} / ( e^{s(cos(θ_{y_i} + m_1) − m_2)} + Σ_{j=1, j≠y_i}^{n} e^{s·cos θ_j} ) )

The formula is built on weight normalization and feature normalization, i.e. the invention first requires w_j = w_j / ||w_j||, b_j = 0, and x_i = x_i / ||x_i||. After these constraints, all sample features are distributed on a hypersphere, where x_i ∈ R^d is the feature of the i-th sample, which belongs to the y_i-th class; w_j ∈ R^d is the j-th column of the weight parameter W and b_j the corresponding bias term; N and n respectively denote the number of samples in the current batch and the total number of classes; θ_{y_i} denotes the angle between the current sample feature and the corresponding class weight; m_1 and m_2, the two hyper-parameters of the loss function, denote the angular margin and the cosine margin respectively; the hyper-parameter s is a scale factor used to avoid vanishing gradients; θ_j denotes the angle between a class weight and the corresponding sample; and || · || denotes the 2-norm.
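A hedged numpy sketch of this multi-margin loss; the default values s = 64, m1 = 0.5, m2 = 0.35 are illustrative and not taken from the patent:

```python
import numpy as np

def multi_margin_loss(x, Wc, y, s=64.0, m1=0.5, m2=0.35):
    """x: (N, d) sample features, Wc: (d, n) class-weight columns,
    y: (N,) integer class labels."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)      # feature normalization
    Wc = Wc / np.linalg.norm(Wc, axis=0, keepdims=True)   # weight normalization, b_j = 0
    cos = np.clip(x @ Wc, -1.0, 1.0)                      # (N, n) cos(theta_j)
    idx = np.arange(len(y))
    theta_y = np.arccos(cos[idx, y])                      # angle to the true class
    logits = s * cos
    logits[idx, y] = s * (np.cos(theta_y + m1) - m2)      # angular + cosine margin
    logits -= logits.max(axis=1, keepdims=True)           # numerical stability
    logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-logp[idx, y].mean())
```

With m1 = m2 = 0 and s = 1 the expression reduces to ordinary softmax cross-entropy on normalized features and weights.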
Feature extraction and retrieval comparison stage
After the face image to be recognized is processed by the trained model, a feature vector of fixed dimension is obtained. This vector is compared in real time with the features extracted offline into the library, and whether the image shows the person to be retrieved is judged from the computed cosine similarity and a set threshold. In this embodiment, the threshold is usually set in the range of 0.6 to 0.7.
Feature extraction and retrieval comparison take place when real-time face recognition is performed online: the given face to be searched is input into the trained model with the same preprocessing, a fixed 512-dimensional feature vector is extracted at the last fully connected layer, and cosine similarity comparison is carried out between this vector and the features extracted offline into the library. The cosine similarity is computed as:

cos(A, B) = Σ_{i=1}^{P} A_i B_i / ( sqrt(Σ_{i=1}^{P} A_i²) · sqrt(Σ_{i=1}^{P} B_i²) )

where A_i and B_i respectively denote the i-th components of the feature of the face image to be retrieved and of a stored face feature in the search library, and P denotes the dimension of the feature vector, here 512. The images with the highest similarities whose similarity also exceeds the set threshold are taken as the query result, completing the final face recognition process.
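The retrieval comparison can be sketched as follows (the function names, the top_k parameter, and the default threshold of 0.6 are illustrative):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def search(query, gallery, threshold=0.6, top_k=5):
    """Rank offline gallery features against a query feature and keep the
    top_k matches whose cosine similarity exceeds the threshold."""
    scored = [(i, cosine_similarity(query, g)) for i, g in enumerate(gallery)]
    kept = [(i, sim) for i, sim in scored if sim > threshold]
    return sorted(kept, key=lambda t: -t[1])[:top_k]
```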
Examples
To demonstrate that the deep learning face recognition method based on the self-attention mechanism has advantages in performance and adaptability, it is verified and analyzed through the following experiments:
A. experimental data set
Training sets: CASIA-WebFace and MS-Celeb-1M. CASIA-WebFace contains 10,575 identities and 494K face images in total. The raw MS-Celeb-1M data contains 100K identities and 10M face pictures, but many samples are mislabeled, so a cleaned version is used in training, containing 86,876 identities and 3.9M images in total.
Test sets: LFW, AgeDB-30, CFP-FP, and MegaFace. LFW, AgeDB-30, and CFP-FP test face verification accuracy at small scale; MegaFace tests million-scale face recognition accuracy and face verification accuracy at a one-in-a-million false alarm rate.
B. Evaluation criteria
The invention adopts the mainstream evaluation standards of face recognition research at home and abroad. For the face verification test, accuracy is evaluated: if the tested sample set has K pairs of pictures, of which L pairs are judged wrongly, the face verification accuracy is:

Acc = (K − L) / K
for the recognition accuracy of MegaFace in the million level, a cumulative matching feature first accuracy rate CMC @1, namely a Rank1 recognition rate, is adopted. For the assumption that the size of a face query set is Q, each image Q to be queried in the face query set isiQ performs similarity rank matching work, if each query image Q is a query image Q, i is 1, 2iThe first correctly matched image location is r (q)i) Then the calculation formula for CMC @ K is:
in the CMC curve, the identification accuracy is higher when K is larger, and in the MegaFace test protocol, the identification result of Rank1 is analyzed, namely CMC @ 1.
C. Results of the experiment
Experiments show that the face verification accuracies of the invention on LFW, AgeDB-30, and CFP-FP reach 99.83%, 98.67%, and 95.86% respectively; in addition, the million-scale Rank1 recognition rate on MegaFace is 98.38%, and the verification rate at a one-in-a-million false alarm rate is 98.45%, reaching a leading level. The invention is also compared with existing mainstream schemes on several data sets; the experimental results are shown in the following tables:
TABLE 1 face verification accuracy (%) -of LFW, AgeDB-30 and CFP-FP
TABLE 2 MegaFace test results (%)
From the above two tables it can be seen that the invention shows superior performance under the same experimental environment. In addition, the face model based on the self-attention mechanism was visualized; as shown in fig. 5, the model with the attention modules produces a clearer face contour, making the person easier to recognize. This fully demonstrates that the self-attention-based model can effectively optimize features during the forward pass of the convolutional neural network and enhances the discriminability and robustness of the face features.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A system for deep learning face recognition based on a self-attention mechanism, the system comprising:
the input module is used for selecting a face picture training set and inputting a face picture to be recognized;
a self-attention-based deep learning module with ResNet as a backbone network, comprising a plurality of residual blocks and a plurality of attention modules, the attention modules comprising a channel attention module and/or a spatial attention module connected in series at the end of each residual block, the last layer being a fully connected layer; the residual blocks are used for further extracting feature maps from the input face picture or from a feature map; the channel attention module is used for learning a cross-correlation relation matrix among feature map channels in the forward propagation process to obtain a channel-optimized feature map; the spatial attention module is used for learning a cross-correlation relation matrix between feature map spatial positions in the forward propagation process to obtain a feature map with optimized spatial positions; and the fully connected layer is used for converting the finally optimized feature map into features;
the training module is used for training the self-attention-based deep learning module by adopting the face picture training set to obtain a trained self-attention-based deep learning module;
and the face recognition module is used for inputting the face picture to be recognized into the trained self-attention-based deep learning module and outputting a face recognition result.
2. The face recognition system of claim 1, wherein the channel attention module is implemented by:
inputting a feature map F_I ∈ R^(C×H×W) and obtaining feature maps θ(F_I), φ(F_I) ∈ R^((C/2)×H×W) through two parallel convolutions respectively;
subjecting θ(F_I) and φ(F_I) respectively to parallel max pooling and average pooling to obtain feature maps Pool(F_I)_1, Pool(F_I)_2 ∈ R^((C/2)×(H/2)×(W/2));
converting Pool(F_I)_1 by dimension conversion into the feature map Pool'(F_I)_1 ∈ R^((C/2)×(HW/4)); converting Pool(F_I)_2 by dimension conversion into the feature map Pool'(F_I)_2 ∈ R^((C/2)×(HW/4)) and then transposing Pool'(F_I)_2;
obtaining the channel self-attention matrix A_C = Softmax( Pool'(F_I)_1 ⊗ Pool'(F_I)_2^T ) through a row-wise Softmax activation function;
obtaining a feature map ρ(F_I) from F_I through convolution, and converting ρ(F_I) by dimension conversion into the feature map ρ'(F_I);
matrix-multiplying A_C and ρ'(F_I), performing dimension conversion, and adding the converted result bit by bit to F_I to obtain the final channel-optimized feature map F_C = α · reshape( A_C ⊗ ρ'(F_I) ) ⊕ F_I of dimension C×H×W;
wherein C, H, W denote the channel dimension, height and width of the original feature map; θ, φ and ρ denote channel convolution operations; ⊕ denotes bit-by-bit addition; ⊗ denotes matrix multiplication; and α denotes a coefficient controlling the proportion of the original features to the channel-optimized features.
3. The face recognition system of claim 2, wherein the channel self-attention matrix A_C is expanded as:

A_C^(i,j) = exp( Pool'(F_I)_1^i · (Pool'(F_I)_2^T)^j ) / Σ_{j=1}^{C/2} exp( Pool'(F_I)_1^i · (Pool'(F_I)_2^T)^j )

wherein A_C^(i,j) expresses the correlation between the i-th channel and the j-th channel.
4. The face recognition system of claim 1, wherein the spatial attention module is implemented by:
inputting a feature map F_C ∈ R^(C×H×W) and obtaining feature maps θ(F_C), φ(F_C) ∈ R^((C/r)×H×W) through two parallel convolutions respectively;
subjecting θ(F_C) to parallel max pooling and average pooling to obtain the feature map Pool(F_C)_1 ∈ R^((C/r)×(H/2)×(W/2));
converting φ(F_C) and Pool(F_C)_1 by dimension conversion into the feature maps φ'(F_C) ∈ R^((C/r)×HW) and Pool'(F_C)_1 ∈ R^((C/r)×(HW/4)) respectively, and transposing φ'(F_C);
matrix-multiplying φ'(F_C)^T and Pool'(F_C)_1 and performing Softmax non-linear activation to obtain the spatial self-attention matrix A_S = Softmax( φ'(F_C)^T ⊗ Pool'(F_C)_1 ) ∈ R^(HW×(HW/4));
obtaining a feature map ρ(F_C) ∈ R^(C×H×W) from F_C through convolution, obtaining the feature map Pool(ρ(F_C)) ∈ R^(C×(H/2)×(W/2)) through parallel max pooling and average pooling, and obtaining the matrix Pool'(ρ(F_C)) ∈ R^(C×(HW/4)) through dimension conversion;
obtaining the spatially optimized feature map F_S = β · reshape( Pool'(ρ(F_C)) ⊗ A_S^T ) ⊕ F_C of dimension C×H×W through matrix multiplication and bit-by-bit addition;
wherein C, H, W denote the channel dimension, height and width of the original feature map; θ, φ and ρ denote channel convolution operations; ⊕ denotes bit-by-bit addition; ⊗ denotes matrix multiplication; β denotes a coefficient controlling the proportion of the original features to the spatially optimized features; and the variable r is a channel dimension-reduction coefficient satisfying r > 1 with C/r an integer.
5. The face recognition system of any one of claims 1 to 4, wherein the loss function L is calculated as:

L = -(1/N) Σ_{i=1}^{N} log( e^{s(cos(θ_{y_i} + m_1) − m_2)} / ( e^{s(cos(θ_{y_i} + m_1) − m_2)} + Σ_{j=1, j≠y_i}^{n} e^{s·cos θ_j} ) )

wherein N and n respectively denote the number of samples of the current batch and the total number of classes; the hyper-parameter s denotes a scale factor; θ_{y_i} denotes the angle between the current sample feature and the corresponding class weight; θ_j denotes the angle between a class weight and the corresponding sample; and m_1 and m_2, the two hyper-parameters of the loss function, denote the angular margin and the cosine margin respectively.
6. A deep learning face recognition method based on a self-attention mechanism is characterized by comprising the following steps:
training the self-attention-based deep learning network by adopting a face picture training set to obtain a trained self-attention-based deep learning network;
inputting the face picture to be recognized into the trained self-attention-based deep learning network, and outputting a face recognition result;
the self-attention-based deep learning network takes ResNet as a backbone network and comprises a plurality of residual blocks and a plurality of attention modules, wherein each attention module comprises a channel attention module and/or a space attention module, the channel attention module and/or the space attention module are connected in series at the tail of each residual block, the last layer is a full connection layer, the residual blocks are used for further extracting feature maps of input face pictures or feature maps, the channel attention module is used for learning a cross-correlation relationship matrix among feature map channels in the forward propagation process to obtain a feature map after channel optimization, and the space attention module is used for learning a cross-correlation relationship matrix among feature map space positions in the forward propagation process to obtain a feature map after space position optimization; the full connection layer is used for converting the finally optimized feature map into features.
7. The face recognition method of claim 6, wherein the channel attention module is implemented by:
inputting a feature map F_I ∈ R^(C×H×W) and obtaining feature maps θ(F_I), φ(F_I) ∈ R^((C/2)×H×W) through two parallel convolutions respectively;
subjecting θ(F_I) and φ(F_I) respectively to parallel max pooling and average pooling to obtain feature maps Pool(F_I)_1, Pool(F_I)_2 ∈ R^((C/2)×(H/2)×(W/2));
converting Pool(F_I)_1 by dimension conversion into the feature map Pool'(F_I)_1 ∈ R^((C/2)×(HW/4)); converting Pool(F_I)_2 by dimension conversion into the feature map Pool'(F_I)_2 ∈ R^((C/2)×(HW/4)) and then transposing Pool'(F_I)_2;
obtaining the channel self-attention matrix A_C = Softmax( Pool'(F_I)_1 ⊗ Pool'(F_I)_2^T ) through a row-wise Softmax activation function;
obtaining a feature map ρ(F_I) from F_I through convolution, and converting ρ(F_I) by dimension conversion into the feature map ρ'(F_I);
matrix-multiplying A_C and ρ'(F_I), performing dimension conversion, and adding the converted result bit by bit to F_I to obtain the final channel-optimized feature map F_C = α · reshape( A_C ⊗ ρ'(F_I) ) ⊕ F_I of dimension C×H×W;
wherein C, H, W denote the channel dimension, height and width of the original feature map; θ, φ and ρ denote channel convolution operations; ⊕ denotes bit-by-bit addition; ⊗ denotes matrix multiplication; and α denotes a coefficient controlling the proportion of the original features to the channel-optimized features.
8. The face recognition method of claim 7, wherein the channel self-attention matrix A_C is expanded as:

A_C^(i,j) = exp( Pool'(F_I)_1^i · (Pool'(F_I)_2^T)^j ) / Σ_{j=1}^{C/2} exp( Pool'(F_I)_1^i · (Pool'(F_I)_2^T)^j )

wherein A_C^(i,j) expresses the correlation between the i-th channel and the j-th channel.
9. The face recognition method of claim 6, wherein the spatial attention module is implemented by:
inputting a feature map F_C ∈ R^(C×H×W) and obtaining feature maps θ(F_C), φ(F_C) ∈ R^((C/r)×H×W) through two parallel convolutions respectively;
subjecting θ(F_C) to parallel max pooling and average pooling to obtain the feature map Pool(F_C)_1 ∈ R^((C/r)×(H/2)×(W/2));
converting φ(F_C) and Pool(F_C)_1 by dimension conversion into the feature maps φ'(F_C) ∈ R^((C/r)×HW) and Pool'(F_C)_1 ∈ R^((C/r)×(HW/4)) respectively, and transposing φ'(F_C);
matrix-multiplying φ'(F_C)^T and Pool'(F_C)_1 and performing Softmax non-linear activation to obtain the spatial self-attention matrix A_S = Softmax( φ'(F_C)^T ⊗ Pool'(F_C)_1 ) ∈ R^(HW×(HW/4));
obtaining a feature map ρ(F_C) ∈ R^(C×H×W) from F_C through convolution, obtaining the feature map Pool(ρ(F_C)) ∈ R^(C×(H/2)×(W/2)) through parallel max pooling and average pooling, and obtaining the matrix Pool'(ρ(F_C)) ∈ R^(C×(HW/4)) through dimension conversion;
obtaining the spatially optimized feature map F_S = β · reshape( Pool'(ρ(F_C)) ⊗ A_S^T ) ⊕ F_C of dimension C×H×W through matrix multiplication and bit-by-bit addition;
wherein C, H, W denote the channel dimension, height and width of the original feature map; θ, φ and ρ denote channel convolution operations; ⊕ denotes bit-by-bit addition; ⊗ denotes matrix multiplication; β denotes a coefficient controlling the proportion of the original features to the spatially optimized features; and the variable r is a channel dimension-reduction coefficient satisfying r > 1 with C/r an integer.
10. The face recognition method according to any one of claims 6 to 9, wherein the loss function L is calculated as:

L = -(1/N) Σ_{i=1}^{N} log( e^{s(cos(θ_{y_i} + m_1) − m_2)} / ( e^{s(cos(θ_{y_i} + m_1) − m_2)} + Σ_{j=1, j≠y_i}^{n} e^{s·cos θ_j} ) )

wherein N and n respectively denote the number of samples of the current batch and the total number of classes; the hyper-parameter s denotes a scale factor; θ_{y_i} denotes the angle between the current sample feature and the corresponding class weight; θ_j denotes the angle between a class weight and the corresponding sample; and m_1 and m_2, the two hyper-parameters of the loss function, denote the angular margin and the cosine margin respectively.
CN201910719368.6A 2019-08-05 2019-08-05 Deep learning face recognition system and method based on self-attention mechanism Pending CN110610129A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910719368.6A CN110610129A (en) 2019-08-05 2019-08-05 Deep learning face recognition system and method based on self-attention mechanism


Publications (1)

Publication Number Publication Date
CN110610129A true CN110610129A (en) 2019-12-24


Cited By (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111199233A (en) * 2019-12-30 2020-05-26 四川大学 Improved deep learning pornographic image identification method
CN111222515A (en) * 2020-01-06 2020-06-02 北方民族大学 Image translation method based on context-aware attention
CN111260462A (en) * 2020-01-16 2020-06-09 东华大学 Transaction fraud detection method based on heterogeneous relation network attention mechanism
CN111274999A (en) * 2020-02-17 2020-06-12 北京迈格威科技有限公司 Data processing method, image processing method, device and electronic equipment
CN111325145A (en) * 2020-02-19 2020-06-23 中山大学 Behavior identification method based on combination of time domain channel correlation blocks
CN111368815A (en) * 2020-05-28 2020-07-03 之江实验室 Pedestrian re-identification method based on multi-component self-attention mechanism
CN111582215A (en) * 2020-05-17 2020-08-25 华中科技大学同济医学院附属协和医院 Scanning identification system and method for normal anatomical structure of biliary-pancreatic system
CN111798445A (en) * 2020-07-17 2020-10-20 北京大学口腔医院 Tooth image caries identification method and system based on convolutional neural network
CN111860393A (en) * 2020-07-28 2020-10-30 浙江工业大学 Face detection and recognition method on security system
CN111881746A (en) * 2020-06-23 2020-11-03 安徽清新互联信息科技有限公司 Face feature point positioning method and system based on information fusion
CN112001215A (en) * 2020-05-25 2020-11-27 天津大学 Method for identifying identity of text-independent speaker based on three-dimensional lip movement
CN112084911A (en) * 2020-08-28 2020-12-15 安徽清新互联信息科技有限公司 Human face feature point positioning method and system based on global attention
CN112101434A (en) * 2020-09-04 2020-12-18 河南大学 Infrared image weak and small target detection method based on improved YOLO v3
CN112101456A (en) * 2020-09-15 2020-12-18 推想医疗科技股份有限公司 Attention feature map acquisition method and device and target detection method and device
CN112183190A (en) * 2020-08-18 2021-01-05 杭州翌微科技有限公司 Human face quality evaluation method based on local key feature recognition
CN112270213A (en) * 2020-10-12 2021-01-26 萱闱(北京)生物科技有限公司 Improved HRnet based on attention mechanism
CN112464787A (en) * 2020-11-25 2021-03-09 北京航空航天大学 Remote sensing image ship target fine-grained classification method based on spatial fusion attention
CN112464851A (en) * 2020-12-08 2021-03-09 国网陕西省电力公司电力科学研究院 Smart power grid foreign matter intrusion detection method and system based on visual perception
CN112465026A (en) * 2020-11-26 2021-03-09 深圳市对庄科技有限公司 Model training method and device for jadeite mosaic recognition
CN112633158A (en) * 2020-12-22 2021-04-09 广东电网有限责任公司电力科学研究院 Power transmission line corridor vehicle identification method, device, equipment and storage medium
CN112667841A (en) * 2020-12-28 2021-04-16 山东建筑大学 Weak supervision depth context-aware image characterization method and system
CN112801069A (en) * 2021-04-14 2021-05-14 四川翼飞视科技有限公司 Face key feature point detection device, method and storage medium
CN112949841A (en) * 2021-05-13 2021-06-11 德鲁动力科技(成都)有限公司 Attention-based CNN neural network training method
CN112967327A (en) * 2021-03-04 2021-06-15 国网河北省电力有限公司检修分公司 Monocular depth method based on combined self-attention mechanism
CN113065550A (en) * 2021-03-12 2021-07-02 国网河北省电力有限公司 Text recognition method based on self-attention mechanism
CN113344875A (en) * 2021-06-07 2021-09-03 武汉象点科技有限公司 Abnormal image detection method based on self-supervision learning
CN113379657A (en) * 2021-05-19 2021-09-10 上海壁仞智能科技有限公司 Image processing method and device based on random matrix
CN113392696A (en) * 2021-04-06 2021-09-14 四川大学 Intelligent court monitoring face recognition system and method based on fractional calculus
CN113469335A (en) * 2021-06-29 2021-10-01 杭州中葳数字科技有限公司 Method for distributing weight for feature by using relationship between features of different convolutional layers
CN113554151A (en) * 2021-07-07 2021-10-26 浙江工业大学 Attention mechanism method based on convolution interlayer relation
CN113616209A (en) * 2021-08-25 2021-11-09 西南石油大学 Schizophrenia patient discrimination method based on space-time attention mechanism
CN113989579A (en) * 2021-10-27 2022-01-28 腾讯科技(深圳)有限公司 Image detection method, device, equipment and storage medium
CN114005078A (en) * 2021-12-31 2022-02-01 山东交通学院 Vehicle weight identification method based on double-relation attention mechanism
CN114118140A (en) * 2021-10-29 2022-03-01 新黎明科技股份有限公司 Multi-view intelligent fault diagnosis method and system for explosion-proof motor bearing
CN114550162A (en) * 2022-02-16 2022-05-27 北京工业大学 Three-dimensional object identification method combining view importance network and self-attention mechanism
CN115100709A (en) * 2022-06-23 2022-09-23 北京邮电大学 Feature-separated image face recognition and age estimation method
WO2023005161A1 (en) * 2021-07-27 2023-02-02 平安科技(深圳)有限公司 Face image similarity calculation method, apparatus and device, and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108256450A (en) * 2018-01-04 2018-07-06 天津大学 A kind of supervised learning method of recognition of face and face verification based on deep learning
CN109543606A (en) * 2018-11-22 2019-03-29 中山大学 A kind of face identification method that attention mechanism is added


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HEFEI LING 等: ""Self Residual Attention Network for Deep Face Recognition"", 《IEEE ACCESS》 *
JIANKANG DENG 等: ""ArcFace: Additive Angular Margin Loss for Deep Face Recognition"", 《ARXIV》 *

Cited By (54)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111199233A (en) * 2019-12-30 2020-05-26 四川大学 Improved deep learning pornographic image identification method
CN111222515A (en) * 2020-01-06 2020-06-02 北方民族大学 Image translation method based on context-aware attention
CN111222515B (en) * 2020-01-06 2023-04-07 北方民族大学 Image translation method based on context-aware attention
CN111260462A (en) * 2020-01-16 2020-06-09 东华大学 Transaction fraud detection method based on heterogeneous relation network attention mechanism
CN111260462B (en) * 2020-01-16 2022-05-27 东华大学 Transaction fraud detection method based on heterogeneous relation network attention mechanism
CN111274999A (en) * 2020-02-17 2020-06-12 北京迈格威科技有限公司 Data processing method, image processing method, device and electronic equipment
CN111274999B (en) * 2020-02-17 2024-04-19 北京迈格威科技有限公司 Data processing method, image processing device and electronic equipment
CN111325145A (en) * 2020-02-19 2020-06-23 中山大学 Behavior identification method based on combination of time domain channel correlation blocks
CN111325145B (en) * 2020-02-19 2023-04-25 中山大学 Behavior recognition method based on combined time domain channel correlation block
CN111582215A (en) * 2020-05-17 2020-08-25 华中科技大学同济医学院附属协和医院 Scanning identification system and method for normal anatomical structure of biliary-pancreatic system
CN112001215A (en) * 2020-05-25 2020-11-27 天津大学 Method for identifying identity of text-independent speaker based on three-dimensional lip movement
CN112001215B (en) * 2020-05-25 2023-11-24 天津大学 Text irrelevant speaker identity recognition method based on three-dimensional lip movement
CN111368815A (en) * 2020-05-28 2020-07-03 之江实验室 Pedestrian re-identification method based on multi-component self-attention mechanism
CN111881746A (en) * 2020-06-23 2020-11-03 安徽清新互联信息科技有限公司 Face feature point positioning method and system based on information fusion
CN111881746B (en) * 2020-06-23 2024-04-02 安徽清新互联信息科技有限公司 Face feature point positioning method and system based on information fusion
CN111798445A (en) * 2020-07-17 2020-10-20 北京大学口腔医院 Tooth image caries identification method and system based on convolutional neural network
CN111798445B (en) * 2020-07-17 2023-10-31 北京大学口腔医院 Tooth image caries identification method and system based on convolutional neural network
CN111860393A (en) * 2020-07-28 2020-10-30 浙江工业大学 Face detection and recognition method on security system
CN112183190A (en) * 2020-08-18 2021-01-05 杭州翌微科技有限公司 Human face quality evaluation method based on local key feature recognition
CN112084911B (en) * 2020-08-28 2023-03-07 安徽清新互联信息科技有限公司 Human face feature point positioning method and system based on global attention
CN112084911A (en) * 2020-08-28 2020-12-15 安徽清新互联信息科技有限公司 Human face feature point positioning method and system based on global attention
CN112101434B (en) * 2020-09-04 2022-09-09 河南大学 Infrared dim and small target detection method based on improved YOLO v3
CN112101434A (en) * 2020-09-04 2020-12-18 河南大学 Infrared dim and small target detection method based on improved YOLO v3
CN112101456A (en) * 2020-09-15 2020-12-18 推想医疗科技股份有限公司 Attention feature map acquisition method and device and target detection method and device
CN112101456B (en) * 2020-09-15 2024-04-26 推想医疗科技股份有限公司 Attention feature map acquisition method and device, and target detection method and device
CN112270213A (en) * 2020-10-12 2021-01-26 萱闱(北京)生物科技有限公司 Improved HRnet based on attention mechanism
CN112464787A (en) * 2020-11-25 2021-03-09 北京航空航天大学 Remote sensing image ship target fine-grained classification method based on spatial fusion attention
CN112464787B (en) * 2020-11-25 2022-07-08 北京航空航天大学 Remote sensing image ship target fine-grained classification method based on spatial fusion attention
CN112465026A (en) * 2020-11-26 2021-03-09 深圳市对庄科技有限公司 Model training method and device for jadeite mosaic recognition
CN112464851A (en) * 2020-12-08 2021-03-09 国网陕西省电力公司电力科学研究院 Smart power grid foreign matter intrusion detection method and system based on visual perception
CN112633158A (en) * 2020-12-22 2021-04-09 广东电网有限责任公司电力科学研究院 Power transmission line corridor vehicle identification method, device, equipment and storage medium
CN112667841A (en) * 2020-12-28 2021-04-16 山东建筑大学 Weakly supervised deep context-aware image representation method and system
CN112967327A (en) * 2021-03-04 2021-06-15 国网河北省电力有限公司检修分公司 Monocular depth estimation method based on combined self-attention mechanism
CN113065550A (en) * 2021-03-12 2021-07-02 国网河北省电力有限公司 Text recognition method based on self-attention mechanism
CN113392696A (en) * 2021-04-06 2021-09-14 四川大学 Intelligent court monitoring face recognition system and method based on fractional calculus
CN112801069A (en) * 2021-04-14 2021-05-14 四川翼飞视科技有限公司 Face key feature point detection device, method and storage medium
CN112949841A (en) * 2021-05-13 2021-06-11 德鲁动力科技(成都)有限公司 Attention-based CNN neural network training method
CN113379657A (en) * 2021-05-19 2021-09-10 上海壁仞智能科技有限公司 Image processing method and device based on random matrix
CN113379657B (en) * 2021-05-19 2022-11-25 上海壁仞智能科技有限公司 Image processing method and device based on random matrix
CN113344875A (en) * 2021-06-07 2021-09-03 武汉象点科技有限公司 Abnormal image detection method based on self-supervision learning
CN113469335B (en) * 2021-06-29 2024-05-10 杭州中葳数字科技有限公司 Method for distributing weights for features by utilizing relation among features of different convolution layers
CN113469335A (en) * 2021-06-29 2021-10-01 杭州中葳数字科技有限公司 Method for distributing weight for feature by using relationship between features of different convolutional layers
CN113554151A (en) * 2021-07-07 2021-10-26 浙江工业大学 Attention mechanism method based on convolution interlayer relation
CN113554151B (en) * 2021-07-07 2024-03-22 浙江工业大学 Attention mechanism method based on convolution interlayer relation
WO2023005161A1 (en) * 2021-07-27 2023-02-02 平安科技(深圳)有限公司 Face image similarity calculation method, apparatus and device, and storage medium
CN113616209B (en) * 2021-08-25 2023-08-04 西南石油大学 Method for screening schizophrenic patients based on space-time attention mechanism
CN113616209A (en) * 2021-08-25 2021-11-09 西南石油大学 Schizophrenia patient discrimination method based on space-time attention mechanism
CN113989579A (en) * 2021-10-27 2022-01-28 腾讯科技(深圳)有限公司 Image detection method, device, equipment and storage medium
CN114118140A (en) * 2021-10-29 2022-03-01 新黎明科技股份有限公司 Multi-view intelligent fault diagnosis method and system for explosion-proof motor bearing
CN114005078B (en) * 2021-12-31 2022-03-29 山东交通学院 Vehicle re-identification method based on dual-relation attention mechanism
CN114005078A (en) * 2021-12-31 2022-02-01 山东交通学院 Vehicle re-identification method based on dual-relation attention mechanism
CN114550162B (en) * 2022-02-16 2024-04-02 北京工业大学 Three-dimensional object recognition method combining view importance network and self-attention mechanism
CN114550162A (en) * 2022-02-16 2022-05-27 北京工业大学 Three-dimensional object identification method combining view importance network and self-attention mechanism
CN115100709A (en) * 2022-06-23 2022-09-23 北京邮电大学 Feature-separated image face recognition and age estimation method

Similar Documents

Publication Publication Date Title
CN110610129A (en) Deep learning face recognition system and method based on self-attention mechanism
CN109543606B (en) Human face recognition method with attention mechanism
CN111325115B (en) Cross-modal adversarial pedestrian re-identification method and system with triplet constraint loss
CN111126360A (en) Cross-domain pedestrian re-identification method based on unsupervised combined multi-loss model
CN105138998B (en) Pedestrian based on the adaptive sub-space learning algorithm in visual angle recognition methods and system again
CN109359541A (en) Sketch face recognition method based on deep transfer learning
CN106503687A (en) Surveillance video person identification system and method fusing multi-angle face features
CN102156887A (en) Human face recognition method based on local feature learning
CN108564040B (en) Fingerprint activity detection method based on deep convolution characteristics
CN111709313B (en) Pedestrian re-identification method based on local and channel combination characteristics
CN111950525B (en) Fine-grained image classification method based on destructive reconstruction learning and GoogLeNet
Lin et al. Face gender recognition based on face recognition feature vectors
CN109145704B (en) Face portrait recognition method based on face attributes
CN112232184A (en) Multi-angle face recognition method based on deep learning and spatial transformer network
CN115830531A (en) Pedestrian re-identification method based on residual multi-channel attention multi-feature fusion
CN110414431B (en) Face recognition method and system based on elastic context relation loss function
CN106886771A (en) Image principal information extraction and face recognition method based on modular PCA
Chen et al. A finger vein recognition algorithm based on deep learning
Saravanan et al. Using machine learning principles, the classification method for face spoof detection in artificial neural networks
Ebrahimpour et al. Liveness control in face recognition with deep learning methods
Ge et al. Deep and discriminative feature learning for fingerprint classification
Elbarawy et al. Facial expressions recognition in thermal images based on deep learning techniques
Xiao et al. An improved siamese network model for handwritten signature verification
CN111898400A (en) Fingerprint activity detection method based on multi-modal feature fusion
Desai et al. Face anti-spoofing technique using CNN and SVM

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20191224)