CN107766850B - Face recognition method based on combination of face attribute information

Info

Publication number: CN107766850B (application number CN201711232374.6A)
Authority: CN (China)
Prior art keywords: layer, attribute, convolution, face, loss function
Legal status: Active (the legal status is an assumption and is not a legal conclusion)
Other languages: Chinese (zh)
Other versions: CN107766850A (application publication, 2018-03-06)
Inventors: 马争 (Ma Zheng), 解梅 (Xie Mei), 张恒胜 (Zhang Hengsheng), 涂晓光 (Tu Xiaoguang)
Current and original assignee: University of Electronic Science and Technology of China
Application filed 2017-11-30 by University of Electronic Science and Technology of China; priority date 2017-11-30; granted 2020-12-29

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 - Classification, e.g. identification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks

Abstract

The invention discloses a face recognition method based on combination of face attribute information, belonging to the technical field of digital image processing. It addresses the technical problem that existing fusion methods must train several DCNNs and then perform score fusion or feature fusion with further training, a heavy and complicated workload that hinders practical application, and discloses a new way of fusing identity information and attribute information to improve the accuracy of face recognition. The face identity authentication network and the attribute recognition network are fused into one fusion network, and the identity features and the face attribute features are learned simultaneously in a joint learning manner, so that face recognition accuracy is improved while the face attributes can also be predicted, making this a multi-task network. A cost-sensitive weighting function is adopted, so that balanced training in the source data domain is achieved without depending on the target-domain data distribution; and the modified fusion framework adds only a few parameters, with little extra computational load.

Description

Face recognition method based on combination of face attribute information
Technical Field
The invention belongs to the technical field of digital image processing, and particularly relates to a face recognition method based on combination of face attribute information.
Background
With the rapid development of deep learning, face recognition technology has advanced quickly, and many products applying it have appeared. However, current face recognition technology still has many limitations: typical problems such as large side-face poses and lighting conditions all reduce the performance of a face recognition system. Many researchers have done much work on face pose correction, domain adaptation and the like; despite many efforts, these directions are still at the exploration stage. Research shows that, even when many imaging conditions change greatly, the recognition of most facial attribute information (such as gender, eyebrow shape, nose-bridge height) is not much affected and can still be carried out accurately. Therefore, combining face attribute information can improve the accuracy of face recognition.
A number of multitask frameworks have been applied to face attribute learning. Many of these approaches, while conceptually simple, are very labor-intensive. For example: using AdaBoost to select an independent feature subspace and an independent SVM classifier for each attribute to classify the different attributes; or training a separate DCNN (deep convolutional neural network) for each attribute and then training an independent SVM classifier for classification. Such work is very cumbersome and of low practical value. Rudd et al. proposed a mixed-objective optimization network that learns the face attributes jointly, training different attributes together, which greatly reduces the workload and makes the approach easier to implement.
In terms of fusion, many researchers have tried to add attribute information to face recognition to improve its accuracy. However, there is as yet no mature algorithm for fusing face attribute information with face identity authentication information. Existing fusion methods fall roughly into two categories:
(1) Score-level fusion: the framework is shown in FIG. 1. An identity recognition network and n (n > 2) attribute recognition networks are trained separately. An input picture passes through each DCNN (deep convolutional neural network) to extract features, similarity scores (the probability values corresponding to the labels) are output through a fully connected layer FC and a softmax layer, and all the similarity scores are then added to form a new identity similarity score for predicting the target identity.
(2) Feature-level fusion, which can be further divided into aggregation methods and subspace learning methods. An aggregation method uses networks to extract the attribute features and the identity authentication features, then either simply connects the two features at the feature level, or constrains them to the same dimension and applies element-wise averaging or multiplication. A subspace learning method connects the two features in series, maps the connected feature to a more suitable subspace, and then learns the fusion parameters by a supervised or unsupervised learning method; unsupervised learning does not use identity information for the fusion learning, whereas supervised learning does. The feature-level fusion framework is shown in FIG. 2 and is similar in structure to the score-level one: an identity recognition network and n attribute recognition networks are trained separately, the picture is input into all the networks, the features of the last pooling layer of each network are extracted and fused together by a feature connector, and an SVM or another classifier is trained on the new feature to produce the prediction output.
Both kinds of methods need to train several DCNNs and then perform score fusion or feature fusion with further training; the work is heavy and complicated, which is unfavorable for practical application. A brief sketch of both categories follows.
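For concreteness, a minimal NumPy sketch of the two prior-art categories; the function names, dimensions and example numbers are illustrative, not taken from the patent:

```python
import numpy as np

def score_level_fusion(identity_scores, attribute_net_scores):
    """Category (1): each separately trained network outputs a similarity
    score vector over the same identity labels; the vectors are added to
    form a new identity similarity score, and the target identity is the
    argmax of the sum."""
    fused = np.sum([identity_scores, *attribute_net_scores], axis=0)
    return int(np.argmax(fused))

def feature_level_fusion(identity_feat, attribute_feats, mode="concat"):
    """Category (2), aggregation style: last-pooling-layer features are
    simply concatenated, or constrained to the same dimension and averaged
    element-wise; a classifier (e.g. an SVM) is then trained on the result."""
    feats = [identity_feat, *attribute_feats]
    if mode == "concat":
        return np.concatenate(feats)
    return np.mean(feats, axis=0)   # mode == "mean": same-dimension average

# illustrative usage: 3 identities, 2 attribute networks, 256-d features
print(score_level_fusion(np.array([.2, .5, .3]),
                         [np.array([.1, .6, .3]), np.array([.3, .4, .3])]))  # -> 1
print(feature_level_fusion(np.ones(256), [np.zeros(256)], mode="mean").shape)  # (256,)
```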
Disclosure of Invention
The aim of the invention: in view of the problems described above, a new way of fusing identity information and attribute information is disclosed to improve the accuracy of face recognition. The invention fuses the face identity authentication network and the attribute recognition network into a fusion network and learns the identity features and the face attribute features simultaneously by means of joint learning.
The invention relates to a face recognition method based on combination of face attribute information, which comprises the following steps:
constructing a fusion network model:
A third module BlockC serves as the input layer of the fusion network model and is connected to a first module BlockA1; the first module BlockA1 is connected to a first module BlockA2 and to a second module BlockB1. The first module BlockA2 is connected in sequence to a first module BlockA3, a pooling layer in a first global average pooling mode, a first fully connected layer, and a second fully connected layer followed by a Softmax layer, forming the identity recognition network;
the first module BlockA2 and the second module BlockB1 are both connected to a feature connector, and the feature connector is connected in sequence to the second module BlockB2, a pooling layer in a second global average pooling mode, and a third fully connected layer, forming the face attribute recognition network;
wherein the first module BlockA1 stacks 5 Inception structures, the first module BlockA2 stacks 10 Inception structures, and the first module BlockA3 stacks 5 Inception structures. An Inception structure comprises a feature connector, convolution layers, a pooling layer, normalization layers and an input interface layer, with four parallel convolution paths between the feature connector and the input interface layer: the first path is a convolution layer and a normalization layer in series, where the convolution layer is connected to the input interface layer and its kernel is 1×1; the second path is two convolution layers and a normalization layer in series, where the convolution layer connected to the input interface layer has a 1×1 kernel and the other convolution layer has a 3×3 kernel; the third path comprises two convolution layers and a normalization layer in series, where the convolution layer connected to the input interface layer has a 1×1 kernel and the other convolution layer has a 5×5 kernel; the fourth path comprises a pooling layer, a convolution layer and a normalization layer in series, where the pooling layer is connected to the input interface layer, the pooling mode is max pooling with a 2×2 kernel, and the convolution layer has a 1×1 kernel;
the second modules BlockB1 and BlockB2 are convolution structures, each comprising, connected in sequence, an input interface layer, a convolution layer with a 1×1 kernel, a convolution layer with a 3×3 kernel, and an output interface layer;
the third module BlockC comprises an input layer, 3 serial groups of convolution layer plus pooling layer, and an output interface layer, where the kernels of the convolution layers and pooling layers are 3×3 and 2×2 respectively and the pooling mode is max pooling;
training the fusion network model:
Step 101: collect a training sample set and preprocess the training samples; the preprocessing comprises size normalization, image pixel-value mean normalization and random flip normalization. Randomly divide the training sample set into several sub-training sets, each containing S samples;
Step 102: initialize the neural-network parameters and the attribute distribution weights of the attribute loss function, obtaining the network parameters and attribute distribution weights of the first iteration; the attribute distribution weights comprise the weight P_pos^i of the attribute loss function for positive samples and the weight P_neg^i for negative samples, where i is the attribute class identifier;
Step 103: use a sub-training set as the input images of the fusion network model, predict the identity label and each attribute label, compare them with the true labels, and compute the loss function

l_total = l_softmax + λ1·l_centerloss + λ2·l_multitask

where l_softmax denotes the loss function of the Softmax layer, l_centerloss denotes the center loss function of the face identities at the first fully connected layer, l_multitask denotes the face attribute loss function of the third fully connected layer, and λ1 and λ2 are preset loss weights with 0 < λ2 < λ1 < 1, taken as empirical observation values;

wherein

l_multitask = (1/S) Σ_{j=1}^{S} Σ_{i=1}^{C} P_t^i · max(0, 1 - y_j^i·FC[i]_j)

where FC[i]_j is the output of the attribute fully connected layer for attribute i of the j-th picture, y_j^i is the true label of picture j for attribute i, P_t^i is the attribute distribution weight of the attribute loss function of the positive or negative samples of the t-th iteration for attribute i, and C is the number of attribute classes;
Step 104: compute the gradient ∇L(W_t) of the loss function l_total, where W_t denotes the network parameters of the t-th iteration;

iteratively update the network parameters: W_{t+1} = W_t + V_{t+1}, where

V_{t+1} = μ·V_t - β·∇L(W_t)

β denotes the preset negative-gradient learning rate, μ (a preset value) denotes the weight of the previous gradient value, and V_t denotes the gradient term of the t-th iteration, the first being 0 (taking the initial value of t as 0, i.e. V_0 = 0);
iteratively update the attribute distribution weights of the attribute loss function:

P_{t+1}^i = P_t^i · exp(-α·y_i·FC[i]) / Z_i

where the scale parameter α = (1/2)·ln((1 + r)/(1 - r)), r = Σ P_t^i·y_i·FC[i], and the current normalization variable Z_i = Σ P_t^i·exp(-α·y_i·FC[i]); FC[i] denotes the current outputs of the third fully connected layer for attribute i, i.e. the S values FC[i]_j compose FC[i], and y_i denotes the true labels of the current sub-training set for attribute i, i.e. the S values y_j^i compose y_i;
Step 105: repeat steps 103 and 104, iteratively updating the network parameters and the attribute distribution weight of the attribute loss function of each attribute, until the loss function l_total converges; save the currently updated network parameters and attribute distribution weights of the attribute loss function;
Recognition processing of the image to be recognized:
step 201: carrying out size normalization and image pixel value mean value normalization processing on an image to be recognized;
step 202: loading the network parameters saved in the training process;
Step 203: input the image to be recognized processed in step 201 into the fusion network model constructed by the invention, perform forward propagation, and predict the identity label and the C attribute labels through the second and third fully connected layers respectively, where the identity label is the index label corresponding to the maximum probability value output by the second fully connected layer followed by the softmax layer, and the attribute labels are output directly by the third fully connected layer.
In summary, owing to the adoption of the above technical scheme, the beneficial effects of the invention are:
(1) the invention provides a new fusion framework in which face attributes supervise the face recognition learning task; under this framework the invention not only improves the accuracy of face recognition but can also predict the attribute features of the face, forming a multi-task network;
(2) the multi-task learning framework is improved and a cost-sensitive weighting function is adopted, so that balanced training in the source data domain is achieved without depending on the target-domain data distribution;
(3) the modified fusion framework adds only a few parameters, and the extra computational load is small. Compared with the existing approach of separately training a face attribute network and an identity recognition network, extracting features and then fusing them, the method reduces the workload and the computational burden to a certain extent and is more convenient for practical deployment and application.
Drawings
FIG. 1 is a diagram of a prior-art score-level fusion framework;
FIG. 2 is a diagram of a prior-art feature-level fusion framework;
FIG. 3 is a schematic diagram of the fusion framework of the present invention;
FIG. 4 is a schematic structural diagram of the first module BlockA of the fusion framework of the present invention;
FIG. 5 is a schematic structural diagram of the second module BlockB of the fusion framework of the present invention;
FIG. 6 is a schematic structural diagram of the third module BlockC of the fusion framework of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the following embodiments and accompanying drawings.
Referring to FIG. 3, the fusion network model of the present invention comprises first, second and third modules BlockA, BlockB and BlockC, pooling layers, fully connected layers FC, a feature connector (filter concatenation), and a Softmax layer. The third module BlockC is the input layer of the fusion network model and is connected to the first module BlockA1; the first module BlockA1 is connected to the first module BlockA2 and to the second module BlockB1. The first module BlockA2 is connected in sequence to the first module BlockA3, a pooling layer (global average pooling), a first fully connected layer (FC 1024) and a second fully connected layer (FC N) followed by a Softmax layer, forming the identity recognition network. The first module BlockA2 and the second module BlockB1 are both connected to the feature connector, i.e. the features produced by BlockA2 and BlockB1 are stacked in depth by the feature connector; the feature connector is connected in sequence to the second module BlockB2, a pooling layer (global average pooling) and a third fully connected layer (FC 8), forming the face attribute recognition network.
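The wiring just described can be read as the following PyTorch sketch. It is an illustration under assumptions, not the patent's reference implementation: the six sub-modules are passed in by the caller, id_dim and attr_dim are the channel counts entering the two global-average-pooling layers, and the softmax is left to the loss function:

```python
import torch
import torch.nn as nn

class FusionNet(nn.Module):
    """Fig. 3 wiring: BlockC -> BlockA1, which feeds both the identity
    branch (BlockA2 -> BlockA3 -> GAP -> FC1024 -> FC_N) and the attribute
    branch (depth-concat of BlockA2 and BlockB1 -> BlockB2 -> GAP -> FC_C)."""

    def __init__(self, blockC, blockA1, blockA2, blockA3, blockB1, blockB2,
                 id_dim, attr_dim, n_ids, n_attrs=8):
        super().__init__()
        self.blockC, self.blockA1, self.blockA2 = blockC, blockA1, blockA2
        self.blockA3, self.blockB1, self.blockB2 = blockA3, blockB1, blockB2
        self.gap = nn.AdaptiveAvgPool2d(1)       # global average pooling
        self.fc1 = nn.Linear(id_dim, 1024)       # first FC layer (center loss)
        self.fc2 = nn.Linear(1024, n_ids)        # second FC layer -> Softmax
        self.fc3 = nn.Linear(attr_dim, n_attrs)  # third FC layer (C attributes)

    def forward(self, x):
        a1 = self.blockA1(self.blockC(x))
        a2 = self.blockA2(a1)
        # identity branch
        id_feat = self.fc1(self.gap(self.blockA3(a2)).flatten(1))
        id_logits = self.fc2(id_feat)            # softmax is applied in the loss
        # attribute branch: feature connector stacks A2 and B1 outputs in depth
        fused = torch.cat([a2, self.blockB1(a1)], dim=1)
        attr_out = self.fc3(self.gap(self.blockB2(fused)).flatten(1))
        return id_logits, id_feat, attr_out
```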
The first fully connected layer is used to generate the center loss, i.e. to make the face features of each identity cluster at the center of the corresponding identity, reducing the intra-class distance and increasing the inter-class distance; the output dimension of the first fully connected layer depends on the feature dimension of the input image, for example 1024. The second fully connected layer outputs N-dimensional fully connected features (N being the number of identity classes, i.e. N persons) and produces the final identity information through the softmax layer. The output dimension of the third fully connected layer depends on the number of preset face attributes, i.e. the recognition results (whether the corresponding attribute is present) of the different attributes are obtained through the third fully connected layer, where the face attributes include gender, mouth size, lip thickness, eye size, eyebrow thickness, nose-bridge height, nose size, forehead width and so on.
The first modules BlockA(i) are blocks of stacked (serially connected) Inception structures used to extract the shallow, intermediate and high-level features of a picture; the three instances stack different numbers of Inception structures: BlockA1 stacks 5, BlockA2 stacks 10, and BlockA3 stacks 5. The Inception structure is shown in FIG. 4 and comprises a feature connector (filter concatenation), convolution layers (conv), a pooling layer (pooling), normalization layers (batch normalization) and an input interface layer (previous layer), with four parallel convolution paths between the feature connector and the input interface layer: the first path is a convolution layer and a normalization layer in series, where the convolution layer is connected to the input interface layer and its kernel is 1×1; the second path is two convolution layers and a normalization layer in series, where the convolution layer connected to the input interface layer has a 1×1 kernel and the other convolution layer has a 3×3 kernel; the third path comprises two convolution layers and a normalization layer in series, where the convolution layer connected to the input interface layer has a 1×1 kernel and the other convolution layer has a 5×5 kernel; the fourth path comprises a pooling layer, a convolution layer and a normalization layer in series, where the pooling layer is connected to the input interface layer, the pooling mode is max pooling with a 2×2 kernel, and the convolution kernel is 1×1. The normalization layers smooth out large-scale changes of the parameters, so that training becomes stable, deeper training of the network becomes easy, convergence is accelerated, and a certain regularization effect is obtained that prevents overfitting of the model.
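One such Inception structure might be sketched as follows. The channel counts c1, c3, c5, cp, the convolution paddings, the stride-1 pooling (with pre-padding so the 2×2 kernel preserves the spatial size) and the final ReLU are all assumptions made so that the four branch outputs can be concatenated; the patent fixes only the kernel sizes:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InceptionBlock(nn.Module):
    """One Inception structure of BlockA (Fig. 4): four parallel paths
    between the input interface layer and the feature connector."""

    def __init__(self, in_ch, c1, c3, c5, cp):
        super().__init__()
        # path 1: 1x1 conv + batch normalization
        self.p1 = nn.Sequential(nn.Conv2d(in_ch, c1, 1), nn.BatchNorm2d(c1))
        # path 2: 1x1 conv -> 3x3 conv + batch normalization
        self.p2 = nn.Sequential(nn.Conv2d(in_ch, c3, 1),
                                nn.Conv2d(c3, c3, 3, padding=1),
                                nn.BatchNorm2d(c3))
        # path 3: 1x1 conv -> 5x5 conv + batch normalization
        self.p3 = nn.Sequential(nn.Conv2d(in_ch, c5, 1),
                                nn.Conv2d(c5, c5, 5, padding=2),
                                nn.BatchNorm2d(c5))
        # path 4: 2x2 max pooling -> 1x1 conv + batch normalization
        self.pool = nn.MaxPool2d(2, stride=1)   # stride 1 so sizes stay aligned
        self.p4 = nn.Sequential(nn.Conv2d(in_ch, cp, 1), nn.BatchNorm2d(cp))

    def forward(self, x):
        # pad right/bottom by 1 so the 2x2 pooling preserves H and W
        y4 = self.p4(self.pool(F.pad(x, (0, 1, 0, 1))))
        # feature connector: stack the four paths along the channel axis
        return F.relu(torch.cat([self.p1(x), self.p2(x), self.p3(x), y4], dim=1))
```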
The first modules BlockA(i) of the invention apply 1×1 convolutions to perform channel dimension reduction and weighted summation on the shallow and intermediate features, forming the primary features and part of the intermediate features of the face attributes; these features are then learned through 3×3 convolution kernels, finally forming the high-level features of the face attributes. In a neural network, the shallow and intermediate features contain a certain amount of general information, while the high-level features are task-specific features guided by the learning objective. The invention combines the shallow and intermediate features of the identity recognition network to learn the high-level features of the face attributes; therefore the convolution module BlockB, which adds only a small number of parameters, enables the identity features and the attribute features to be learned simultaneously.
The second modules BlockB(i) are the convolution structures shown in FIG. 5, comprising, connected in sequence, an input interface layer (previous layer), a convolution layer with a 1×1 kernel, a convolution layer with a 3×3 kernel, and an output interface layer (top layer).
Referring to FIG. 6, the third module BlockC comprises an input layer, 3 serial groups of convolution layer plus pooling layer, and an output interface layer (top layer), where the kernels of the convolution layers and pooling layers are 3×3 and 2×2 respectively and the pooling mode is max pooling.
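The two simpler modules admit equally short sketches; the channel widths here are assumptions, since the patent specifies only the kernel sizes and the max-pooling mode. They plug directly into the FusionNet sketch above:

```python
import torch.nn as nn

def block_b(in_ch, mid_ch, out_ch):
    """BlockB (Fig. 5): a 1x1 convolution followed by a 3x3 convolution,
    between the input and output interface layers."""
    return nn.Sequential(nn.Conv2d(in_ch, mid_ch, 1),
                         nn.Conv2d(mid_ch, out_ch, 3, padding=1))

def block_c(in_ch=3, widths=(32, 64, 128)):
    """BlockC (Fig. 6): three serial groups of 3x3 convolution + 2x2 max
    pooling; `widths` are assumed channel counts."""
    layers, prev = [], in_ch
    for w in widths:
        layers += [nn.Conv2d(prev, w, 3, padding=1), nn.MaxPool2d(2)]
        prev = w
    return nn.Sequential(*layers)
```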
After adding a small number of parameters, the face attribute features and the identity features are fused, synchronous network training is realized, and the face recognition accuracy is improved. The added parameters are mainly those of BlockB and of the fully connected layer of the multitask classifier.
Added BlockB parameters: let M denote the number of input feature maps, N1 the number of 1×1 convolution kernels, and N2 the number of 3×3 convolution kernels; the BlockB parameter count is Numparam1 = M·N1 + 9·N1·N2.
Added parameters of the attribute fully connected layer: if A denotes the input feature dimension of the fully connected layer and C the number of attribute classes, the output dimension of the fully connected layer is C, so the parameter count is Numparam2 = A·C.
For example, an application scenario with M = 128, N1 = 64 and N2 = 128 gives Numparam1 = 81920; with A on the order of 10^3 and C on the order of 10^2, Numparam2 is typically on the order of 10^5. The overall added parameter count is small compared with the millions of parameters of the whole face recognition network.
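The arithmetic can be checked directly in plain Python; the magnitudes assumed for A and C are the ones quoted above:

```python
M, N1, N2 = 128, 64, 128            # input maps, 1x1 kernels, 3x3 kernels
num_param1 = M * N1 + 9 * N1 * N2   # BlockB: 1x1 weights + 3x3 weights
assert num_param1 == 81920

A, C = 10**3, 10**2                 # FC input dimension ~1e3, ~1e2 attributes
num_param2 = A * C                  # attribute fully connected layer
print(num_param1, num_param2)       # 81920 100000  (~1e5, small vs. millions)
```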
The invention uses the shallow and intermediate features of face identity recognition, further learns to generate part of the intermediate features and the final high-level attribute features, and then outputs the attribute predictions through the fully connected layer. With C denoting the number of attribute classes, the output dimension of the fully connected layer is C; for an attribute i with fully connected output FC[i], 1 ≤ i ≤ C, the classification result is Y[i] = sign(FC[i]) and the corresponding classification error E[i] is measured by the hinge loss, the loss being L_i = max(0, 1 - y_i·FC[i]) with true labels y_i ∈ {-1, +1}.
In the multitask optimization process, the data imbalance problem must be solved, so the losses of the attributes cannot simply be added directly. The invention therefore defines a hybrid objective function that performs a weighted summation of the loss of each attribute using the distribution of the attributes in the data domain; for the choice of weighting function, this is achieved with a cost-sensitive weighting function:
P_{t+1}^i = P_t^i · exp(-α·y_i·FC[i]) / Z_i

where α = (1/2)·ln((1 + r)/(1 - r)) is a scale parameter, r = Σ P_t^i·y_i·FC[i], and the current normalization variable Z_i = Σ P_t^i·exp(-α·y_i·FC[i]); FC[i] denotes the current outputs of the third fully connected layer for attribute i, i.e. the S values FC[i]_j compose FC[i], and y_i denotes the true labels of the current sub-training set for attribute i, i.e. the S values y_j^i compose y_i. This yields the multitask loss function

l_multitask = (1/N) Σ_{j=1}^{N} Σ_{i=1}^{C} P_t^i · max(0, 1 - y_j^i·FC[i]_j)

where N is the number of pictures in a batch, C is the number of attribute classes, FC[i]_j is the output of the attribute fully connected layer for attribute i of the j-th picture, and y_j^i is the label of picture j for attribute i.
The overall system loss function is then

l_total = l_softmax + λ1·l_centerloss + λ2·l_multitask

where l_softmax denotes the loss function of the Softmax layer, l_centerloss denotes the center loss function of the face identities at the first fully connected layer, l_multitask denotes the face attribute loss function of the third fully connected layer, and λ1 and λ2 are preset loss weights with 0 < λ2 < λ1 < 1, taken as empirical observation values; the suggested values are 0.08 and 0.02. The whole system is therefore trained under the joint supervision of the identity recognition loss and the attribute recognition loss, optimizing the parameters and realizing fusion of attribute recognition and identity recognition at the parameter level, rather than the existing fusion at the feature level or at the final similarity-score level.
The face recognition method based on the fusion network model constructed by the invention mainly comprises two processes of training and recognition, which are specifically as follows:
1. training process:
Step 101: acquire a training sample set and preprocess the training samples; the preprocessing comprises size normalization, image pixel-value mean normalization and random flip normalization (left-right flipping, performed to enlarge the training sample set). For example, scale the picture to 128×128×3 or 112×112×3 (H×W×C, where H is the picture height, W the picture width and C the number of channels, 3 denoting an RGB color picture), then apply mean normalization and random flip normalization.

Randomly divide the training sample set into several sub-training sets, each containing S samples.
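A minimal sketch of the step 101 preprocessing; the 128×128 target size, the per-image mean subtraction and the 0.5 flip probability are assumptions consistent with the text:

```python
import numpy as np
from PIL import Image

def preprocess(path, size=(128, 128), train=True):
    """Step 101: size normalization, pixel-value mean normalization and,
    during training, random left-right flipping."""
    img = Image.open(path).convert("RGB").resize(size)
    x = np.asarray(img, dtype=np.float32)
    x -= x.mean()                        # image pixel-value mean normalization
    if train and np.random.rand() < 0.5:
        x = x[:, ::-1, :]                # random horizontal flip (H x W x 3)
    return x
```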
Step 102: initialize the neural-network parameters (e.g. with the Xavier method) and the attribute distribution weights of the attribute loss function, thereby obtaining the network parameters and attribute distribution weights of the first iteration. The attribute distribution weights comprise the weight P_pos^i of the attribute loss function for positive samples and the weight P_neg^i for negative samples, which are the initialization weights of the positive- and negative-sample loss functions for attribute i, computed from the numbers |S_pos^i| and |S_neg^i| of positive and negative samples with attribute i in the training sample set.
Step 103: use a sub-training set as the input images of the fusion network model constructed by the invention, predict the identity label and the C attribute labels, compare them with the true labels and compute the loss function l_total. In this embodiment the preferred values of λ1 and λ2 are 0.08 and 0.02 respectively, i.e.

l_total = l_softmax + 0.08·l_centerloss + 0.02·l_multitask

wherein

l_multitask = (1/S) Σ_{j=1}^{S} Σ_{i=1}^{C} P_t^i · max(0, 1 - y_j^i·FC[i]_j)

where FC[i]_j is the output of the attribute fully connected layer for attribute i of the j-th picture, y_j^i is the label of picture j for attribute i, and P_t^i is the attribute distribution weight of the attribute loss function of the positive or negative samples of the t-th iteration for attribute i.
Step 104: compute the gradient ∇L(W_t) of the loss function, where W_t denotes the network parameters of the t-th iteration;

update the network parameters for the (t+1)-th iteration: W_{t+1} = W_t + V_{t+1}, where

V_{t+1} = μ·V_t - β·∇L(W_t)

β denotes the preset negative-gradient learning rate, μ (a preset value) denotes the weight of the previous gradient value, and V_t denotes the gradient term of the t-th iteration, the first being 0 (taking the initial value of t as 0, i.e. V_0 = 0);
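As a sketch, this update rule is (the default β and μ values are illustrative; the patent states only that they are preset):

```python
def momentum_step(W, V, grad, beta=0.01, mu=0.9):
    """Step 104 update: V_{t+1} = mu * V_t - beta * grad L(W_t),
    W_{t+1} = W_t + V_{t+1}; works on scalars or NumPy arrays, with V_0 = 0."""
    V = mu * V - beta * grad
    return W + V, V
```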
update the attribute distribution weights of the attribute loss function for the (t+1)-th iteration:

P_{t+1}^i = P_t^i · exp(-α·y_i·FC[i]) / Z_i

i.e. the attribute distribution weights of both the positive- and negative-sample attribute loss functions are updated in this way, where α = (1/2)·ln((1 + r)/(1 - r)) is the scale parameter, r = Σ P_t^i·y_i·FC[i], and the current normalization variable Z_i = Σ P_t^i·exp(-α·y_i·FC[i]); FC[i] denotes the current outputs of the third fully connected layer for attribute i, i.e. the S values FC[i]_j compose FC[i], and y_i denotes the true labels of the current sub-training set for attribute i, i.e. the S values y_j^i compose y_i;
Step 105: step 103 and step 104 are repeatedly executed, the network parameter and the attribute distribution weight of the attribute loss function of each attribute are iteratively updated until
Figure BDA00014883789000000813
And (6) converging. And storing the currently updated network parameters and attribute distribution weights of the attribute loss functions.
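Putting steps 103 to 105 together, a training-loop skeleton might look as follows. Here attr_loss, center_loss and update_P are hypothetical callables standing in for the loss terms and weight update sketched earlier; note also that PyTorch's SGD folds the learning rate into the momentum update slightly differently from the V_{t+1} formula above:

```python
import torch
import torch.nn.functional as F

def train(model, loader, P, epochs, attr_loss, center_loss, update_P,
          lam1=0.08, lam2=0.02, beta=0.01, mu=0.9):
    """Skeleton of steps 103-105: joint supervision by the softmax, center
    and cost-sensitive multitask losses, with momentum SGD for step 104."""
    opt = torch.optim.SGD(model.parameters(), lr=beta, momentum=mu)
    for _ in range(epochs):                      # until l_total converges
        for x, y_id, y_attr in loader:           # y_attr in {-1, +1}, shape (S, C)
            id_logits, id_feat, fc = model(x)
            loss = (F.cross_entropy(id_logits, y_id)       # l_softmax
                    + lam1 * center_loss(id_feat, y_id)    # l_centerloss
                    + lam2 * attr_loss(fc, y_attr, P))     # l_multitask
            opt.zero_grad(); loss.backward(); opt.step()   # W_{t+1} = W_t + V_{t+1}
            with torch.no_grad():
                P = update_P(fc, y_attr, P)                # P_{t+1} update
    return model, P
```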
2. The identification process comprises the following steps:
step 201: carrying out size normalization and image pixel value mean value normalization processing on an image to be recognized;
step 202: loading the network parameters saved in the training process;
Step 203: input the image processed in step 201 into the fusion network model constructed by the invention, compute forward, and predict the identity label and the C attribute labels through the two fully connected layers (the second and the third). In this specific embodiment, the identity label is obtained as the index label corresponding to the maximum probability value through FC 1024 and softmax; the face attribute label is output through FC 8, the attribute label being Y[i] = sign(FC[i]), i.e. attribute i is judged present when FC[i] > 0.
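A recognition sketch for steps 201 to 203, assuming the FusionNet sketch above with the saved parameters already loaded and an input already size- and mean-normalized into a 1×3×H×W tensor:

```python
import torch

def recognize(model, image):
    """Steps 201-203: forward propagation; identity = index of the maximum
    softmax probability (FC1024 -> FC_N -> softmax); attribute i is judged
    present when FC[i] > 0, i.e. Y[i] = sign(FC[i])."""
    model.eval()
    with torch.no_grad():
        id_logits, _, attr_fc = model(image)
    identity = int(torch.softmax(id_logits, dim=1).argmax(dim=1))
    attributes = (attr_fc.squeeze(0) > 0)    # boolean vector over C attributes
    return identity, attributes
```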
While the invention has been described with reference to specific embodiments, any feature disclosed in this specification may, unless expressly stated otherwise, be replaced by an alternative feature serving the same, an equivalent or a similar purpose; all of the disclosed features, or all of the method or process steps, may be combined in any manner, except for mutually exclusive features and/or steps.

Claims (2)

1. A face recognition method based on combination of face attribute information, characterized by comprising the following steps:
constructing a fusion network model:
a third module BlockC serves as the input layer of the fusion network model and is connected to a first module BlockA1; the first module BlockA1 is connected to a first module BlockA2 and to a second module BlockB1; the first module BlockA2 is connected in sequence to a first module BlockA3, a pooling layer in a first global average pooling mode, a first fully connected layer, and a second fully connected layer followed by a Softmax layer, forming the identity recognition network;
the first module BlockA2 and the second module BlockB1 are both connected to a feature connector, and the feature connector is connected in sequence to the second module BlockB2, a pooling layer in a second global average pooling mode, and a third fully connected layer, forming the face attribute recognition network;
wherein the first module BlockA1 stacks 5 Inception structures, the first module BlockA2 stacks 10 Inception structures, and the first module BlockA3 stacks 5 Inception structures; an Inception structure comprises a feature connector, convolution layers, a pooling layer, normalization layers and an input interface layer, with four parallel convolution paths between the feature connector and the input interface layer: the first path is a convolution layer and a normalization layer in series, where the convolution layer is connected to the input interface layer and its kernel is 1×1; the second path is two convolution layers and a normalization layer in series, where the convolution layer connected to the input interface layer has a 1×1 kernel and the other convolution layer has a 3×3 kernel; the third path comprises two convolution layers and a normalization layer in series, where the convolution layer connected to the input interface layer has a 1×1 kernel and the other convolution layer has a 5×5 kernel; the fourth path comprises a pooling layer, a convolution layer and a normalization layer in series, where the pooling layer is connected to the input interface layer, the pooling mode is max pooling with a 2×2 kernel, and the convolution layer has a 1×1 kernel;
the second modules BlockB1 and BlockB2 are convolution structures, each comprising, connected in sequence, an input interface layer, a convolution layer with a 1×1 kernel, a convolution layer with a 3×3 kernel, and an output interface layer;
the third module BlockC comprises an input layer, 3 serial groups of convolution layer plus pooling layer, and an output interface layer, where the kernels of the convolution layers and pooling layers are 3×3 and 2×2 respectively and the pooling mode is max pooling;
training the fusion network model:
Step 101: collect a training sample set and preprocess the training samples; the preprocessing comprises size normalization, image pixel-value mean normalization and random flip normalization; randomly divide the training sample set into several sub-training sets, each containing S samples;
Step 102: initialize the neural-network parameters and the attribute distribution weights of the attribute loss function, obtaining the network parameters and attribute distribution weights of the first iteration; the attribute distribution weights comprise the weight P_pos^i of the attribute loss function for positive samples and the weight P_neg^i for negative samples, where i is the face attribute class identifier;
Step 103: use a sub-training set as the input images of the fusion network model, predict the identity label and each attribute label, compare them with the true labels, and compute the loss function

l_total = l_softmax + λ1·l_centerloss + λ2·l_multitask

where l_softmax denotes the loss function of the Softmax layer, l_centerloss denotes the center loss function of the face identities at the first fully connected layer, l_multitask denotes the face attribute loss function of the third fully connected layer, and λ1 and λ2 are preset loss weights with 0 < λ2 < λ1 < 1, taken as empirical observation values;

wherein

l_multitask = (1/S) Σ_{j=1}^{S} Σ_{i=1}^{C} P_t^i · max(0, 1 - y_j^i·FC[i]_j)

where FC[i]_j denotes the current output of the third fully connected layer of the j-th picture for attribute i, y_j^i denotes the true label of picture j for attribute i, P_t^i denotes the attribute distribution weight of the attribute loss function of the positive or negative samples of the t-th iteration for attribute i, and C denotes the number of attribute classes;
Step 104: compute the gradient ∇L(W_t) of the loss function l_total, where W_t denotes the network parameters of the t-th iteration;

iteratively update the network parameters: W_{t+1} = W_t + V_{t+1}, where

V_{t+1} = μ·V_t - β·∇L(W_t)

β denotes the preset negative-gradient learning rate, μ (a preset value) denotes the weight of the previous gradient value, and V_t denotes the gradient term of the t-th iteration, the first being 0;
iteratively update the attribute distribution weights of the attribute loss function:

P_{t+1}^i = P_t^i · exp(-α·y_i·FC[i]) / Z_i

where the scale parameter α = (1/2)·ln((1 + r)/(1 - r)), r = Σ P_t^i·y_i·FC[i], and the current normalization variable Z_i = Σ P_t^i·exp(-α·y_i·FC[i]); FC[i] denotes the current outputs of the third fully connected layer for attribute i, i.e. the S values FC[i]_j compose FC[i], and y_i denotes the true labels of the current sub-training set for attribute i, i.e. the S values y_j^i compose y_i;
Step 105: repeat steps 103 and 104, iteratively updating the network parameters and the attribute distribution weight of the attribute loss function of each attribute, until the loss function l_total converges; save the currently updated network parameters and attribute distribution weights of the attribute loss function;
recognition processing of the image to be recognized:
step 201: carrying out size normalization and image pixel value mean value normalization processing on an image to be recognized;
step 202: loading the network parameters saved in the training process;
Step 203: input the image to be recognized processed in step 201 into the fusion network model, perform forward propagation, and predict the identity label and the C face attribute labels through the second and third fully connected layers respectively, where the identity label is the index label corresponding to the maximum probability value output by the second fully connected layer followed by the softmax layer, and the face attribute labels are output directly by the third fully connected layer.
2. The method of claim 1, wherein the initial values of the attribute distribution weight P_pos^i of the attribute loss function of the positive samples and the attribute distribution weight P_neg^i of the attribute loss function of the negative samples are:

P_pos^i = |S_neg^i| / (|S_pos^i| + |S_neg^i|)

P_neg^i = |S_pos^i| / (|S_pos^i| + |S_neg^i|)

where |S_pos^i| and |S_neg^i| represent the numbers of positive and negative samples with attribute i in the training sample set.
CN201711232374.6A 2017-11-30 2017-11-30 Face recognition method based on combination of face attribute information Active CN107766850B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711232374.6A CN107766850B (en) 2017-11-30 2017-11-30 Face recognition method based on combination of face attribute information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711232374.6A CN107766850B (en) 2017-11-30 2017-11-30 Face recognition method based on combination of face attribute information

Publications (2)

Publication Number Publication Date
CN107766850A CN107766850A (en) 2018-03-06
CN107766850B true CN107766850B (en) 2020-12-29

Family

ID=61276369

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711232374.6A Active CN107766850B (en) 2017-11-30 2017-11-30 Face recognition method based on combination of face attribute information

Country Status (1)

Country Link
CN (1) CN107766850B (en)

Families Citing this family (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108509862B (en) * 2018-03-09 2022-03-25 华南理工大学 Rapid face recognition method capable of resisting angle and shielding interference
CN108520213B (en) * 2018-03-28 2021-10-19 五邑大学 Face beauty prediction method based on multi-scale depth
CN108846380B (en) * 2018-04-09 2021-08-24 北京理工大学 Facial expression recognition method based on cost-sensitive convolutional neural network
CN110555340B (en) * 2018-05-31 2022-10-18 赛灵思电子科技(北京)有限公司 Neural network computing method and system and corresponding dual neural network implementation
CN109033938A (en) * 2018-06-01 2018-12-18 上海阅面网络科技有限公司 A kind of face identification method based on ga s safety degree Fusion Features
JP7113674B2 (en) * 2018-06-15 2022-08-05 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ Information processing device and information processing method
CN108898125A (en) * 2018-07-10 2018-11-27 深圳市巨龙创视科技有限公司 One kind being based on embedded human face identification and management system
CN109214286B (en) * 2018-08-01 2021-05-04 中国计量大学 Face recognition method based on deep neural network multi-layer feature fusion
CN109191184A (en) * 2018-08-14 2019-01-11 微梦创科网络科技(中国)有限公司 Advertisement placement method and system based on image recognition
CN109359515A (en) * 2018-08-30 2019-02-19 东软集团股份有限公司 A kind of method and device that the attributive character for target object is identified
CN109344713B (en) * 2018-08-31 2021-11-02 电子科技大学 Face recognition method of attitude robust
CN109508627A (en) * 2018-09-21 2019-03-22 国网信息通信产业集团有限公司 The unmanned plane dynamic image identifying system and method for shared parameter CNN in a kind of layer
CN109359599A (en) * 2018-10-19 2019-02-19 昆山杜克大学 Human facial expression recognition method based on combination learning identity and emotion information
CN109711386B (en) * 2019-01-10 2020-10-09 北京达佳互联信息技术有限公司 Method and device for obtaining recognition model, electronic equipment and storage medium
CN110069994B (en) * 2019-03-18 2021-03-23 中国科学院自动化研究所 Face attribute recognition system and method based on face multiple regions
CN111723613A (en) * 2019-03-20 2020-09-29 广州慧睿思通信息科技有限公司 Face image data processing method, device, equipment and storage medium
CN110009051A (en) * 2019-04-11 2019-07-12 浙江立元通信技术股份有限公司 Feature extraction unit and method, DCNN model, recognition methods and medium
CN110084216B (en) * 2019-05-06 2021-11-09 苏州科达科技股份有限公司 Face recognition model training and face recognition method, system, device and medium
CN110135389A (en) * 2019-05-24 2019-08-16 北京探境科技有限公司 Face character recognition methods and device
CN110348387B (en) * 2019-07-12 2023-06-27 腾讯科技(深圳)有限公司 Image data processing method, device and computer readable storage medium
CN110516569B (en) * 2019-08-15 2022-03-08 华侨大学 Pedestrian attribute identification method based on identity and non-identity attribute interactive learning
CN110956116B (en) * 2019-11-26 2023-09-29 上海海事大学 Face image gender identification model and method based on convolutional neural network
CN111046759A (en) * 2019-11-28 2020-04-21 深圳市华尊科技股份有限公司 Face recognition method and related device
CN111275057B (en) * 2020-02-13 2023-06-20 腾讯科技(深圳)有限公司 Image processing method, device and equipment
CN111353411A (en) * 2020-02-25 2020-06-30 四川翼飞视科技有限公司 Face-shielding identification method based on joint loss function
CN111401294B (en) * 2020-03-27 2022-07-15 山东财经大学 Multi-task face attribute classification method and system based on adaptive feature fusion
CN111428671A (en) * 2020-03-31 2020-07-17 杭州博雅鸿图视频技术有限公司 Face structured information identification method, system, device and storage medium
CN111507248B (en) * 2020-04-16 2023-05-26 成都东方天呈智能科技有限公司 Face forehead region detection and positioning method and system based on low-resolution thermodynamic diagram
CN111680595A (en) * 2020-05-29 2020-09-18 新疆爱华盈通信息技术有限公司 Face recognition method and device and electronic equipment
CN112507312B (en) * 2020-12-08 2022-10-14 电子科技大学 Digital fingerprint-based verification and tracking method in deep learning system
CN112990270B (en) * 2021-02-10 2023-04-07 华东师范大学 Automatic fusion method of traditional feature and depth feature
CN113139460A (en) * 2021-04-22 2021-07-20 广州织点智能科技有限公司 Face detection model training method, face detection method and related device thereof
CN113705439B (en) * 2021-08-27 2023-09-08 中山大学 Pedestrian attribute identification method based on weak supervision and metric learning
CN114360009B (en) * 2021-12-23 2023-07-18 电子科技大学长三角研究院(湖州) Multi-scale characteristic face attribute recognition system and method in complex scene
CN117079337B (en) * 2023-10-17 2024-02-06 成都信息工程大学 High-precision face attribute feature recognition device and method

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106203395A (en) * 2016-07-26 2016-12-07 厦门大学 Face character recognition methods based on the study of the multitask degree of depth
CN106355170A (en) * 2016-11-22 2017-01-25 Tcl集团股份有限公司 Photo classifying method and device
CN106815566A (en) * 2016-12-29 2017-06-09 天津中科智能识别产业技术研究院有限公司 A kind of face retrieval method based on multitask convolutional neural networks
CN107038429A (en) * 2017-05-03 2017-08-11 四川云图睿视科技有限公司 A kind of multitask cascade face alignment method based on deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Patch-based face hallucination with multitask deep neural network";Wei-jen Ko;《2016 ICME》;20160829;第11-15页 *
基于卷积神经网络的人脸识别方法;陈耀丹;《东北师大学报》;20160630;第48卷(第2期);第70-76页 *

Also Published As

Publication number Publication date
CN107766850A (en) 2018-03-06

Similar Documents

Publication Publication Date Title
CN107766850B (en) Face recognition method based on combination of face attribute information
CN108564029B (en) Face attribute recognition method based on cascade multitask learning deep neural network
WO2021042828A1 (en) Neural network model compression method and apparatus, and storage medium and chip
US11417148B2 (en) Human face image classification method and apparatus, and server
Lee et al. Deeply-supervised nets
CN109583322B (en) Face recognition deep network training method and system
Cheng et al. Exploiting effective facial patches for robust gender recognition
CN111639544B (en) Expression recognition method based on multi-branch cross-connection convolutional neural network
WO2020114118A1 (en) Facial attribute identification method and device, storage medium and processor
Lin et al. Regression Guided by Relative Ranking Using Convolutional Neural Network (R³CNN) for Facial Beauty Prediction
CN109033938A (en) A kind of face identification method based on ga s safety degree Fusion Features
CN102314614B (en) Image semantics classification method based on class-shared multiple kernel learning (MKL)
CN110097029B (en) Identity authentication method based on high way network multi-view gait recognition
CN111476806B (en) Image processing method, image processing device, computer equipment and storage medium
US11093800B2 (en) Method and device for identifying object and computer readable storage medium
CN109376787B (en) Manifold learning network and computer vision image set classification method based on manifold learning network
WO2015008567A1 (en) Facial impression estimation method, device, and program
Li et al. Task relation networks
CN114463812A (en) Low-resolution face recognition method based on dual-channel multi-branch fusion feature distillation
CN112101087A (en) Facial image identity de-identification method and device and electronic equipment
CN114492634A (en) Fine-grained equipment image classification and identification method and system
Watson et al. Person re-identification combining deep features and attribute detection
CN104598898A (en) Aerially photographed image quick recognizing system and aerially photographed image quick recognizing method based on multi-task topology learning
CN111401116A (en) Bimodal emotion recognition method based on enhanced convolution and space-time L STM network
Jia et al. Multiple metric learning with query adaptive weights and multi-task re-weighting for person re-identification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant