CN111382684A - Angle-robust personalized facial expression recognition method based on adversarial learning - Google Patents

Angle-robust personalized facial expression recognition method based on adversarial learning

Info

Publication number
CN111382684A
Authority
CN
China
Prior art keywords
sample
expression
domain
angle
source domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010136966.3A
Other languages
Chinese (zh)
Other versions
CN111382684B (en)
Inventor
王上飞
王灿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC filed Critical University of Science and Technology of China USTC
Priority to CN202010136966.3A priority Critical patent/CN111382684B/en
Publication of CN111382684A publication Critical patent/CN111382684A/en
Application granted granted Critical
Publication of CN111382684B publication Critical patent/CN111382684B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an angle-robust personalized facial expression recognition method based on adversarial learning, which comprises the following steps: 1. performing image preprocessing on a database containing images of N classes of facial expressions; 2. constructing an adversarial-learning-based feature decoupling and domain adaptation network model; 3. training the constructed network model by alternating iterative optimization; 4. predicting the facial image to be tested with the trained model to classify and recognize the facial expression. The invention can simultaneously overcome the negative influence of viewing angle and inter-individual differences on facial expression recognition, thereby achieving accurate recognition of facial expressions.

Description

Angle-robust personalized facial expression recognition method based on adversarial learning
Technical Field
The invention relates to the technical field of computer vision, in particular to an angle-robust personalized facial expression recognition method based on adversarial learning.
Background
Facial expression recognition is an important research topic in computer vision, with wide applications in human-computer interaction, fatigue detection, criminal investigation and medical care. Most current facial expression recognition methods assume that the facial image is frontal, but in practical application scenarios the user's relative position is not fixed and the scene is changeable, so only facial expression recognition under multi-angle conditions can meet practical needs. In recent years, researchers have therefore proposed methods to cope with the influence of viewing angle on facial expression recognition. Depending on how angle changes are handled, these methods fall into three categories: view-specific classifier methods, single-classifier methods, and angle normalization methods. View-specific classifier methods are intuitive: a separate classifier is trained for the samples of each angle; however, limited by the scarce training samples, they cannot learn a robust classifier for every angle. Single-classifier methods attempt to learn a more robust classifier from a large number of samples; thanks to generative adversarial networks and variational autoencoders, sample generation can provide richer and more diverse training samples for classifier learning. However, generating high-quality samples is hard to guarantee, and generated low-quality samples instead introduce noise into classifier learning and degrade classifier performance. Angle normalization methods map face samples or feature representations of arbitrary angles into frontal-face samples or feature representations, keeping the identity and the expression content unchanged during the conversion. However, such methods rely on paired training samples, i.e., for a non-frontal sample of an individual, a corresponding frontal sample of the same individual must exist, which severely restricts their use in practice.
Besides viewing angle, inter-individual difference is another important factor affecting facial expression recognition. For the same expression, different individuals differ greatly in how they express it, owing to differences in facial shape, personality, appearance and so on, which seriously affects recognition performance. For example, for "happiness", some people burst into laughter while others only smile with closed lips; although both belong to the expression "happy", they differ from each other at the pixel level, which makes feature learning difficult. In addition, individual appearance varies greatly, which also challenges expression analysis. Identity-robust facial expression recognition can be addressed by subject-specific methods, i.e., personalized facial expression recognition. Such methods build a dedicated classifier for a specific individual, so that the learned classifier concentrates on a single person and avoids the bias introduced by other individuals. However, limited by the sample size of a single individual, it is difficult to learn a facial expression classifier with good performance.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides an angle-robust personalized facial expression recognition method based on adversarial learning, so that the influence of viewing angle and inter-individual differences on facial expression recognition can be overcome simultaneously and the recognition rate of facial expression recognition is improved.
In order to achieve the purpose, the invention adopts the following technical scheme:
the invention relates to an angle robust personalized facial expression recognition method based on antagonistic learning, which is characterized by comprising the following steps of:
step 1, carrying out image preprocessing on a database containing images with N types of human face expressions:
carrying out face detection and alignment on all facial expression images in the database by using the MTCNN (multi-task cascaded convolutional networks) algorithm, so as to obtain a normalized facial image data set that is used as the sample set;
randomly dividing the sample set, with the individuals in the database as the division unit, to obtain a source domain data set S and a target domain data set T; let any sample in the source domain data set S be x_s, its expression label be y_s, and its angle label be p_s; let any sample in the target domain data set T be x_t;
Step 2, constructing an adversarial-learning-based feature decoupling and domain adaptation network model, comprising: a source domain feature extractor E_s and a target domain feature extractor E_t, an angle classifier D_p and an expression classifier R, an angle domain discriminator D_dp and an expression domain discriminator D_de, and a source domain image generator G_s and a target domain image generator G_t;
the source domain feature extractor E_s and the target domain feature extractor E_t have the same network structure, consisting in sequence of an input convolutional layer, M downsampling convolutional layers, Q residual convolutional layers, and two branches each containing W convolutional layers; each convolutional layer is followed by an instance normalization layer and a ReLU activation function;
the angle classifier D_p, the expression classifier R, the angle domain discriminator D_dp and the expression domain discriminator D_de are each an H-layer fully connected network;
the source domain image generator G_s and the target domain image generator G_t have the same network structure, consisting in sequence of an input convolutional layer, J upsampling deconvolutional layers and an output convolutional layer; each convolutional layer before the output convolutional layer is followed by an instance normalization layer and a ReLU activation function, and the output convolutional layer is followed by a Tanh activation function;
initializing the weights of all convolutional layers, deconvolutional layers and fully connected layers in the adversarial-learning-based feature decoupling and domain adaptation network model with a Gaussian distribution;
Step 3, four learning strategies of the adversarial-learning-based feature decoupling and domain adaptation network model: a supervised learning strategy, an adversarial domain adaptation learning strategy, a cross-adversarial feature decoupling learning strategy and an image reconstruction learning strategy;
Step 3.1, supervised learning strategy:
Step 3.1.1, input any sample x_s in the source domain into the source domain feature extractor E_s to obtain two feature vectors f_s^e and f_s^p, where f_s^e denotes the expression-related feature of sample x_s and f_s^p denotes the angle-related feature of sample x_s;
Step 3.1.2, input the angle-related feature f_s^p of the source domain sample x_s into the angle classifier D_p for angle recognition to obtain the angle category of sample x_s;
establish the angle recognition loss function l_p(E_s, D_p) with formula (1):
l_p(E_s, D_p) = Sup(D_p(f_s^p), p_s)   (1)
in formula (1), Sup(·) represents a supervised loss function;
Step 3.1.3, input the expression-related feature f_s^e of the source domain sample x_s into the expression classifier R for expression recognition to obtain the expression category of sample x_s;
establish the expression recognition loss function l_e(E_s, R) with formula (2):
l_e(E_s, R) = Sup(R(f_s^e), y_s)   (2)
Step 3.2, the adaptive learning strategy of the countermeasure field:
step 3.2.1, any sample x in the target domaintInputting the target domain feature extractor EtIn (1), two kinds of feature vectors are obtainedt e,ft pIn which ft eRepresenting a sample x in the target domaintExpression-related feature of ft pRepresenting a sample x in the target domaintThe angle-related characteristic of (a);
step 3.2.2 sample x in the Source DomainsAngle-dependent characteristic d ofs pOr samples x in the target domaintExpression-related feature f oft pInputting the angle domain discriminator DdpObtaining an angle-dependent feature fs pAs true or expression-related features ft pA false recognition result;
step 3.2.3 sample x in the Source DomainsExpression-related feature f ofs eOr samples x in the target domaintExpression-related feature f oft eInputting the expression domain discriminator DdeIn the method, expression related characteristics f are obtaineds eAs true or expression-related features ft eA false recognition result;
step 3.2.4, establishing a counterlearning loss function by using the formula (3)
Figure BDA00023976637100000315
Figure BDA00023976637100000312
Step 3.3, a countermeasure characteristic decoupling learning strategy:
step 3.3.1, sample x in the Source DomainsAngle-dependent characteristic f ofs pInputting the expression into the expression classifier R to obtain a sample x in a source domainsThe expression classification result of (1);
sample x in the source domainsExpression-related feature f ofs eInput angle classifier DpObtaining a sample x in the source domainsThe angle classification result of (1);
step 3.3.2, establishing expression classifier R for angle correlation characteristic f by using formula (4)s pAnd an angle classifier DpFor expression-related feature fs eIs classified as a loss function
Figure BDA0002397663710000045
Figure BDA0002397663710000046
Step 3.4, image reconstruction learning strategy:
step 3.4.1, sample x in the Source DomainsAngle-dependent characteristic f ofs pAnd an objectSamples x in the domaintExpression-related feature f oft eCombined and input to the source domain image generator GsGenerating a reconstructed image in the source domain
Figure BDA0002397663710000048
Step 3.4.2, sample x in target DomaintAngle-dependent characteristic f oft pAnd sample x in the source domainsExpression-related feature f ofs eAre combined and input to the target domain image generator GtGenerating a reconstructed image in the target domain
Figure BDA00023976637100000410
Step 3.4.3, establishing constraints for the reconstructed image using equation (5)
Figure BDA00023976637100000413
Figure BDA00023976637100000411
In formula (5), x'sRepresents another sample in the source domain data set S and is associated with sample xsHaving the same angle label as sample xtThe same expression label is possessed; x'tRepresents another sample in the target domain data set T and is associated with sample xtHaving the same angle label as sample xsThe same expression label is possessed;
Step 4, construct the overall loss function and train the adversarial-learning-based feature decoupling and domain adaptation network model by alternating iterative optimization to obtain the optimal facial expression recognition model:
Step 4.1, construct the total objective function with formula (6), a weighted combination of the loss functions of formulas (1)-(5);
in formula (6), α, β, η and λ are all weight factors;
Step 4.2, set the total number of training steps to K_1, with the current total step count denoted k_1;
set the numbers of optimization steps at the three inner stages to K_2, K_3 and K_4, with corresponding current step counts k_2, k_3 and k_4;
set the number of samples drawn in each training batch to B;
initialize k_1, k_2, k_3 and k_4 all to 0;
Step 4.3, randomly draw B samples from the source domain data set S and from the target domain data set T respectively, and use them as the source domain and target domain training samples for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.4, optimize the source domain feature extractor E_s and the expression classifier R with formula (7) to obtain the corresponding gradients for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.5, optimize the source domain feature extractor E_s and the target domain feature extractor E_t with formula (8) to obtain the corresponding gradients for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.6, optimize the source domain feature extractor E_s and the target domain feature extractor E_t with formula (9) to obtain the corresponding gradients for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.7, assign k_2+1 to k_2, then judge whether k_2 ≥ K_2 holds; if yes, execute step 4.8, otherwise return to step 4.3 and continue in sequence;
Step 4.8, randomly draw B samples from the source domain data set S and from the target domain data set T respectively, and use them as the source domain and target domain training samples for the k_3-th inner iteration of the k_1-th outer iteration;
Step 4.9, optimize the source domain feature extractor E_s, the target domain feature extractor E_t, the source domain image generator G_s and the target domain image generator G_t with formula (10) to obtain the corresponding gradients for the k_3-th inner iteration of the k_1-th outer iteration;
Step 4.10, assign k_3+1 to k_3, then judge whether k_3 ≥ K_3 holds; if yes, execute step 4.11, otherwise return to step 4.8 and continue in sequence;
Step 4.11, randomly draw B samples from the source domain data set S and from the target domain data set T respectively, and use them as the source domain and target domain training samples for the k_4-th inner iteration of the k_1-th outer iteration;
Step 4.12, optimize the source domain feature extractor E_s and the angle classifier D_p with formula (11) to obtain the corresponding gradients for the k_4-th inner iteration of the k_1-th outer iteration;
Step 4.13, optimize the expression domain discriminator D_de and the angle domain discriminator D_dp with formula (12) to obtain the corresponding gradients for the k_4-th inner iteration of the k_1-th outer iteration;
Step 4.14, assign k_4+1 to k_4, then judge whether k_4 ≥ K_4 holds; if yes, execute step 4.15, otherwise return to step 4.11 and continue in sequence;
Step 4.15, assign k_1+1 to k_1, then judge whether k_1 ≥ K_1 holds or whether the algorithm has converged; if yes, training ends and the optimal facial expression recognition model is obtained for classifying facial expressions; otherwise, return to step 4.3 and continue in sequence.
Compared with the prior art, the invention has the beneficial effects that:
1. By proposing the cross-adversarial feature decoupling learning strategy, the invention decouples the expression-related features from the angle-related features, so that the expression-related features contain no angle information irrelevant to expression recognition and the angle-related features contain no expression information irrelevant to angle recognition. This overcomes the limitations of existing angle-robust facial expression recognition methods, such as limited sample diversity and dependence on high-quality face image generation, and achieves more angle-robust facial expression recognition.
2. By proposing the adversarial domain adaptation learning strategy, the invention effectively transfers source domain information to the target domain, which benefits learning of the facial expression recognition task on the target domain and overcomes the limitation of traditional personalized facial expression recognition methods imposed by the small number of target domain samples. The strategy requires no expression or angle annotations for the target domain, which improves usability in practical environments and effectively copes with the influence of inter-individual differences on recognition.
3. By proposing the reconstruction learning strategy, the performance of cross-adversarial feature decoupling learning and adversarial domain adaptation learning is further improved, which further improves the facial expression recognition performance of the method.
4. The invention designs an alternating iterative optimization method that performs supervised learning, cross-adversarial feature decoupling learning, adversarial domain adaptation learning and reconstruction learning simultaneously, realizes end-to-end training and prediction, reduces manual intervention, lets the learning strategies complement one another, jointly learns angle- and identity-robust features, and optimizes the learning of expression-related features.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention;
FIG. 2 is a model block diagram of the present invention;
FIG. 3 is a graph of the reconstructed results of the present invention on the Multi-PIE and BU-3DFE databases.
Detailed Description
In this embodiment, as shown in FIG. 1, an angle-robust personalized facial expression recognition method based on adversarial learning is performed according to the following steps:
step 1, carrying out image preprocessing on a database containing images with N types of human face expressions:
performing face detection and alignment on all facial expression images in the database by using the MTCNN (multi-task cascaded convolutional networks) algorithm to obtain a normalized facial image data set that is used as the sample set, wherein all normalized facial images have a pixel size of 128 × 128;
randomly dividing the sample set, with the individuals in the database as the division unit, to obtain a source domain data set S and a target domain data set T; let any sample in the source domain data set S be x_s, its expression label be y_s, and its angle label be p_s; let any sample in the target domain data set T be x_t; the target domain samples carry no expression or angle annotation information;
In this embodiment, as shown in FIG. 3, the Multi-PIE and BU-3DFE facial expression databases are used. The Multi-PIE facial expression database contains 755,370 facial images collected from 337 volunteers at 13 angles, from -90° to 90° at 15° intervals, with the expressions labeled as: smile, surprise, squint, disgust, scream and neutral. The BU-3DFE facial expression database contains 100 3D models, from 56 female and 44 male subjects; samples at any angle can be obtained by rotating the 3D models, and the expressions are labeled as: anger, disgust, fear, happiness, neutral, sadness and surprise.
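For illustration only (this sketch is not part of the claimed method), the preprocessing and domain split of step 1 can be outlined in Python as follows; detect_and_align() is a hypothetical stand-in for an MTCNN-style detector/aligner, and the target_ratio split proportion is an assumption, since the embodiment does not fix it:

```python
# Illustrative sketch of step 1: normalize faces to 128 x 128 and split the sample set
# into a source domain S and a target domain T, using individuals as the split unit.
# detect_and_align() is a hypothetical placeholder for an MTCNN-style detector/aligner.
import random
from typing import List, Tuple

import numpy as np
from PIL import Image

IMG_SIZE = 128  # normalized face size used in this embodiment

def detect_and_align(img: Image.Image) -> Image.Image:
    """Hypothetical stand-in: detect the face, align it, and crop. Here it simply
    resizes the whole image so the sketch stays self-contained."""
    return img.resize((IMG_SIZE, IMG_SIZE))

def build_domains(samples: List[Tuple[str, int, int, int]],
                  target_ratio: float = 0.2, seed: int = 0):
    """samples: list of (image_path, subject_id, expression_label, angle_label).
    Individuals (subject_id) are the unit of the random split, so the source and
    target domains contain disjoint identities; target_ratio is illustrative only."""
    rng = random.Random(seed)
    subjects = sorted({s[1] for s in samples})
    rng.shuffle(subjects)
    target_ids = set(subjects[:max(1, int(len(subjects) * target_ratio))])

    source, target = [], []
    for path, sid, expr, angle in samples:
        face = np.asarray(detect_and_align(Image.open(path).convert("RGB")))
        if sid in target_ids:
            target.append(face)                # target-domain samples keep no labels
        else:
            source.append((face, expr, angle))
    return source, target
```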
Step 2, as shown in FIG. 2, constructing an adversarial-learning-based feature decoupling and domain adaptation network model, comprising: a source domain feature extractor E_s and a target domain feature extractor E_t, an angle classifier D_p and an expression classifier R, an angle domain discriminator D_dp and an expression domain discriminator D_de, and a source domain image generator G_s and a target domain image generator G_t;
the source domain feature extractor E_s and the target domain feature extractor E_t have the same network structure, consisting in sequence of an input convolutional layer (kernel size 7 × 7, number of kernels 3, stride 2, padding 3), M downsampling convolutional layers (in this example M is set to 4; kernel sizes 4 × 4, strides 2, paddings 1, numbers of kernels 64, 32, 16 and 8 respectively), Q residual convolutional layers (in this example Q is set to 3; kernel sizes 3 × 3, number of kernels 8, stride 2, padding 1), and two branches each containing W convolutional layers (in this example W is set to 2; kernel size 3 × 3, number of kernels 8, stride 2, padding 1); each convolutional layer is followed by an instance normalization layer and a ReLU activation function;
the angle classifier D_p, the expression classifier R, the angle domain discriminator D_dp and the expression domain discriminator D_de are each an H-layer fully connected network with input length 512 (in this example, H is set to 3);
the source domain image generator G_s and the target domain image generator G_t have the same network structure, consisting in sequence of an input convolutional layer (kernel size 7 × 7, number of kernels 8, stride 1, padding 3), J upsampling deconvolutional layers (in this example J is set to 4; kernel sizes 4 × 4, strides 2, paddings 1, numbers of kernels 8, 16, 32 and 64 respectively) and an output convolutional layer (kernel size 7 × 7, number of kernels 3, stride 1, padding 3);
initializing the weights of all convolutional layers, deconvolutional layers and fully connected layers in the adversarial-learning-based feature decoupling and domain adaptation network model with a Gaussian distribution N(0, 0.02);
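For reference, the modules of step 2 can be sketched in PyTorch as follows. This is a minimal sketch, not the exact claimed architecture: the stride of the input convolution, several channel widths and the residual/branch strides are assumptions chosen so that each branch outputs an 8 × 8 × 8 feature map (flattened length 512) for 128 × 128 inputs, matching the feature sizes described in steps 3.1 and 3.4.

```python
# Minimal PyTorch sketch of the step-2 modules (layer hyperparameters partly assumed).
import torch
import torch.nn as nn

def conv_in_relu(c_in, c_out, k, s, p):
    return nn.Sequential(nn.Conv2d(c_in, c_out, k, s, p),
                         nn.InstanceNorm2d(c_out), nn.ReLU(inplace=True))

class ResBlock(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.body = nn.Sequential(conv_in_relu(c, c, 3, 1, 1),
                                  nn.Conv2d(c, c, 3, 1, 1), nn.InstanceNorm2d(c))
    def forward(self, x):
        return torch.relu(x + self.body(x))

class FeatureExtractor(nn.Module):
    """E_s / E_t: shared topology, separate weights. Returns (f_e, f_p) feature maps."""
    def __init__(self):
        super().__init__()
        self.stem = nn.Sequential(
            conv_in_relu(3, 32, 7, 1, 3),    # input conv (width and stride assumed)
            conv_in_relu(32, 64, 4, 2, 1),   # 4 downsampling convs: 128 -> 8,
            conv_in_relu(64, 32, 4, 2, 1),   #   output channels 64/32/16/8 as in the text
            conv_in_relu(32, 16, 4, 2, 1),
            conv_in_relu(16, 8, 4, 2, 1),
            ResBlock(8), ResBlock(8), ResBlock(8))   # 3 residual blocks (stride 1 assumed)
        self.branch_e = nn.Sequential(conv_in_relu(8, 8, 3, 1, 1),
                                      conv_in_relu(8, 8, 3, 1, 1))  # expression branch
        self.branch_p = nn.Sequential(conv_in_relu(8, 8, 3, 1, 1),
                                      conv_in_relu(8, 8, 3, 1, 1))  # angle branch
    def forward(self, x):
        h = self.stem(x)
        return self.branch_e(h), self.branch_p(h)     # each B x 8 x 8 x 8

class Generator(nn.Module):
    """G_s / G_t: consumes the 8x8x16 concatenation of an angle map and an expression map."""
    def __init__(self):
        super().__init__()
        def up(c_in, c_out):
            return nn.Sequential(nn.ConvTranspose2d(c_in, c_out, 4, 2, 1),
                                 nn.InstanceNorm2d(c_out), nn.ReLU(inplace=True))
        self.net = nn.Sequential(conv_in_relu(16, 8, 7, 1, 3),
                                 up(8, 8), up(8, 16), up(16, 32), up(32, 64),
                                 nn.Conv2d(64, 3, 7, 1, 3), nn.Tanh())
    def forward(self, f_p, f_e):
        return self.net(torch.cat([f_p, f_e], dim=1))

def mlp(n_out, n_in=512, hidden=256, layers=3):
    """H-layer fully connected head used for D_p, R, D_dp, D_de (hidden width assumed)."""
    mods, c = [], n_in
    for _ in range(layers - 1):
        mods += [nn.Linear(c, hidden), nn.ReLU(inplace=True)]
        c = hidden
    mods.append(nn.Linear(c, n_out))
    return nn.Sequential(*mods)

def init_gaussian(m):
    """Gaussian N(0, 0.02) initialization of conv/deconv/linear weights."""
    if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d, nn.Linear)):
        nn.init.normal_(m.weight, 0.0, 0.02)
        if m.bias is not None:
            nn.init.zeros_(m.bias)
```

E_s and E_t (and likewise G_s and G_t) would be two instances of the same class with separate weights, e.g. E_s = FeatureExtractor(); E_s.apply(init_gaussian); the classifiers would be mlp(number_of_expressions) and mlp(number_of_angles), and the domain discriminators mlp(1).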
Step 3, four learning strategies of the adversarial-learning-based feature decoupling and domain adaptation network model: a supervised learning strategy, an adversarial domain adaptation learning strategy, a cross-adversarial feature decoupling learning strategy and an image reconstruction learning strategy;
Step 3.1, supervised learning strategy:
Step 3.1.1, input any sample x_s in the source domain into the source domain feature extractor E_s to obtain two feature vectors f_s^e and f_s^p, where f_s^e denotes the expression-related feature of sample x_s and f_s^p denotes the angle-related feature of sample x_s; both features are obtained by flattening the feature maps output by the convolutional branches, and their dimension is 512;
Step 3.1.2, input the angle-related feature f_s^p of the source domain sample x_s into the angle classifier D_p for angle recognition to obtain the angle category of sample x_s;
establish the angle recognition loss function l_p(E_s, D_p) with formula (1):
l_p(E_s, D_p) = Sup(D_p(f_s^p), p_s)   (1)
in formula (1), Sup(·) represents a supervised loss function; square loss, Softmax loss, cross-entropy loss, etc. can be used;
Step 3.1.3, input the expression-related feature f_s^e of the source domain sample x_s into the expression classifier R for expression recognition to obtain the expression category of sample x_s;
establish the expression recognition loss function l_e(E_s, R) with formula (2):
l_e(E_s, R) = Sup(R(f_s^e), y_s)   (2)
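A minimal sketch of step 3.1 follows, taking cross-entropy as the choice of Sup(·) (one of the admissible losses listed above); it is illustrative only:

```python
# Sketch of the supervised losses of step 3.1, taking cross-entropy as Sup(.).
import torch.nn.functional as F

def supervised_losses(E_s, D_p, R, x_s, y_s, p_s):
    """x_s: source images (B, 3, 128, 128); y_s: expression labels; p_s: angle labels."""
    f_e, f_p = E_s(x_s)                         # two 8 x 8 x 8 feature maps
    f_e, f_p = f_e.flatten(1), f_p.flatten(1)   # flatten to 512-d vectors
    loss_p = F.cross_entropy(D_p(f_p), p_s)     # formula (1): angle recognition loss
    loss_e = F.cross_entropy(R(f_e), y_s)       # formula (2): expression recognition loss
    return loss_e, loss_p
```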
Step 3.2, the adaptive learning strategy of the countermeasure field:
step 3.2.1, any sample x in the target DomaintInput target domain feature extractor EtIn (1), two kinds of feature vectors are obtainedt e,ft pIn which ft eRepresenting a sample x in the target domaintExpression-related feature of ft pRepresenting a sample x in the target domaintThe angle-related characteristic of (a); all in oneSimilarly, the two features are obtained by expanding a feature map output by the convolution layer, and the dimension is 512;
step 3.2.2, introduction of countermeasure domain adaptive learning strategy reduction fs pAnd ft pInter-domain distribution variability exists. In particular, sample x in the source domainsAngle-dependent characteristic f ofs pOr samples x in the target domaintExpression-related feature f oft pInput angle domain discriminator DdpObtaining an angle-dependent feature fs pAs true or expression-related features ft pA false recognition result; in the angular domain discriminator DdpTo distinguish f as much as possibles pAnd ft pTime, source domain feature extractor EsAnd a target domain feature extractor EtAs far as possible so that f is generateds pAnd ft pCan not be judged by the angle domain DdpAnd (5) identifying. Thus the source domain feature extractor EsAnd a target domain feature extractor EtAnd angle domain discriminator DdpConstituting an antagonistic relationship.
Step 3.2.3, introduction of confrontation domain adaptive learning strategy reduction fs eAnd ft eThere is a range of distribution variability. In particular, sample x in the source domainsExpression-related feature f ofs eOr samples x in the target domaintExpression-related feature f oft eInput expression domain discriminator DdeIn the method, expression related characteristics f are obtaineds eAs true or expression-related features ft eA false recognition result; in expression domain discriminator DdeTo distinguish f as much as possibles eAnd ft eTime, source domain feature extractor EsAnd a target domain feature extractor EtAs far as possible so that f is generateds eAnd ft eIdentifier D for expression domaindeAnd (5) identifying. Thus the source domain feature extractor EsAnd a target domain feature extractor EtAnd expression domain discriminator DdeConstituting an antagonistic relationship.
Step 3.2.4, establishing a counterlearning loss function by using the formula (3)
Figure BDA0002397663710000094
Figure BDA0002397663710000092
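A minimal sketch of step 3.2 follows. Since formula (3) is given only as an image in the original, a standard binary cross-entropy (GAN-style) objective is assumed here, with source-domain features labelled real, target-domain features labelled fake, and the domain discriminators taken to output a single logit:

```python
# Sketch of the adversarial domain-adaptation losses of step 3.2 (assumed GAN-style form).
import torch
import torch.nn.functional as F

def domain_discriminator_loss(D_dp, D_de, fp_s, fe_s, fp_t, fe_t):
    """Train D_dp / D_de to tell source features (real) from target features (fake).
    Features are detached so only the discriminators receive these gradients."""
    def bce(D, real, fake):
        lr, lf = D(real.detach().flatten(1)), D(fake.detach().flatten(1))
        return (F.binary_cross_entropy_with_logits(lr, torch.ones_like(lr)) +
                F.binary_cross_entropy_with_logits(lf, torch.zeros_like(lf)))
    return bce(D_dp, fp_s, fp_t) + bce(D_de, fe_s, fe_t)

def extractor_fooling_loss(D_dp, D_de, fp_t, fe_t):
    """Train the feature extractors so that target-domain features are judged 'real'
    by the domain discriminators, i.e. become indistinguishable from source features."""
    lp, le = D_dp(fp_t.flatten(1)), D_de(fe_t.flatten(1))
    return (F.binary_cross_entropy_with_logits(lp, torch.ones_like(lp)) +
            F.binary_cross_entropy_with_logits(le, torch.ones_like(le)))
```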
Step 3.3, a countermeasure characteristic decoupling learning strategy:
step 3.3.1, sample x in Source DomainsAngle-dependent characteristic f ofs pInputting the expression into a classifier R to obtain a sample x in a source domainsThe expression classification result of (1);
sample x in the source domainsExpression-related feature f ofs eInput angle classifier DpIn (1), obtain a sample x in the source domainsThe angle classification result of (1);
step 3.3.2, establishing expression classifier R for angle correlation characteristic f by using formula (4)s pAnd an angle classifier DpFor expression-related feature fs eIs classified as a loss function
Figure BDA0002397663710000104
Figure BDA0002397663710000105
By optimizing this loss, the expression classifier R cannot correlate features f to angless pRecognizing expression information while making an angle classifier DpInability to characterize the situational related features fs eIdentifying angle information so that the angle-related feature fs pThere is no expression information independent of angle, so that the expression-related feature fs pAngle information irrelevant to the expression does not exist, and decoupling of the angle and the expression information is realized;
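Formula (4) itself is given only as an image; one common realization of this cross-adversarial constraint, sketched below under that assumption, pushes the "crossed" predictions toward a uniform distribution (maximum entropy) so that R cannot read expression from f_s^p and D_p cannot read angle from f_s^e:

```python
# Sketch of the cross-adversarial decoupling constraint of step 3.3 (assumed form:
# push the crossed predictions toward the uniform distribution, i.e. maximize entropy).
import torch
import torch.nn.functional as F

def decoupling_loss(R, D_p, f_s_e, f_s_p, n_expr, n_angle):
    """f_s_e, f_s_p: flattened 512-d source-domain features."""
    def to_uniform(logits, n_cls):
        log_prob = F.log_softmax(logits, dim=1)
        uniform = torch.full_like(log_prob, 1.0 / n_cls)
        return F.kl_div(log_prob, uniform, reduction="batchmean")
    loss_expr_on_angle = to_uniform(R(f_s_p), n_expr)       # R applied to the angle feature
    loss_angle_on_expr = to_uniform(D_p(f_s_e), n_angle)    # D_p applied to the expression feature
    return loss_expr_on_angle + loss_angle_on_expr
```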
Step 3.4, image reconstruction learning strategy:
Step 3.4.1, combine the angle-related feature f_s^p of the source domain sample x_s with the expression-related feature f_t^e of the target domain sample x_t and input them into the source domain image generator G_s to generate a reconstructed image x̂_s in the source domain;
here the features f_s^p and f_t^e are the un-flattened feature maps output by the convolutional branches, of size 8 × 8 × 8 (height × width × depth), and they are concatenated directly along the depth dimension when combined, giving an 8 × 8 × 16 feature map;
Step 3.4.2, combine the angle-related feature f_t^p of the target domain sample x_t with the expression-related feature f_s^e of the source domain sample x_s and input them into the target domain image generator G_t to generate a reconstructed image x̂_t in the target domain;
likewise, the features f_t^p and f_s^e are the un-flattened 8 × 8 × 8 feature maps output by the convolutional branches, and they are concatenated along the depth dimension when combined, giving an 8 × 8 × 16 feature map;
Step 3.4.3, establish constraints on the reconstructed images with formula (5), the reconstruction constraint l_clc(E_s, E_t, G_s, G_t);
in formula (5), x'_s denotes another sample in the source domain data set S that has the same angle label as sample x_s and the same expression label as sample x_t; x'_t denotes another sample in the target domain data set T that has the same angle label as sample x_t and the same expression label as sample x_s; this requires the angle and expression labels of sample x_t, but the target domain data set carries no expression or angle annotations, so the angle and expression information of sample x_t is obtained from pseudo labels, where p̂_t denotes the pseudo angle label of sample x_t and ŷ_t denotes the pseudo expression label of sample x_t;
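A minimal sketch of step 3.4 follows. The depth-wise concatenation of the two 8 × 8 × 8 maps follows the text, while the L1 distance to x'_s / x'_t is an assumption, since formula (5) is given only as an image:

```python
# Sketch of the image-reconstruction constraint of step 3.4: the generator internally
# concatenates the 8x8x8 angle map and expression map into an 8x8x16 map, and the
# cross-domain reconstructions are compared with x'_s / x'_t (L1 distance assumed).
import torch.nn.functional as F

def reconstruction_loss(G_s, G_t, fp_s, fe_s, fp_t, fe_t, x_s_prime, x_t_prime):
    """fp_*, fe_*: un-flattened 8x8x8 feature maps from E_s / E_t.
    x_s_prime: source sample with x_s's angle label and x_t's (pseudo) expression label.
    x_t_prime: target sample with x_t's (pseudo) angle label and x_s's expression label."""
    x_hat_s = G_s(fp_s, fe_t)   # reconstruction in the source domain
    x_hat_t = G_t(fp_t, fe_s)   # reconstruction in the target domain
    return F.l1_loss(x_hat_s, x_s_prime) + F.l1_loss(x_hat_t, x_t_prime)
```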
Step 4, construct the overall loss function and train the adversarial-learning-based feature decoupling and domain adaptation network model by alternating iterative optimization to obtain the optimal facial expression recognition model:
Step 4.1, construct the total objective function with formula (6), a weighted combination of the loss functions of formulas (1)-(5);
in formula (6), α, β, η and λ are all weight factors; in this example, the four weight factors take the values 2.0, 3.0, 0.2 and 0.1, respectively;
Step 4.2, set the total number of training steps to K_1, with the current total step count denoted k_1;
set the numbers of optimization steps at the three inner stages to K_2, K_3 and K_4, with corresponding current step counts k_2, k_3 and k_4;
set the number of samples drawn in each training batch to B;
initialize k_1, k_2, k_3 and k_4 all to 0;
denote the learning rate by l_rate; in this example, K_1 is set to 30, K_2, K_3 and K_4 are set to 1, 3 and 1 respectively, B is set to 32, and the initial learning rate l_rate is 0.001;
Step 4.3, randomly draw B samples from the source domain data set S and from the target domain data set T respectively, and use them as the source domain and target domain training samples for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.4, optimize the source domain feature extractor E_s and the expression classifier R with formula (7) to obtain the corresponding gradients for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.5, optimize the source domain feature extractor E_s and the target domain feature extractor E_t with formula (8) to obtain the corresponding gradients for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.6, optimize the source domain feature extractor E_s and the target domain feature extractor E_t with formula (9) to obtain the corresponding gradients for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.7, assign k_2+1 to k_2, then judge whether k_2 ≥ K_2 holds; if yes, execute step 4.8, otherwise return to step 4.3 and continue in sequence;
Step 4.8, randomly draw B samples from the source domain data set S and from the target domain data set T respectively, and use them as the source domain and target domain training samples for the k_3-th inner iteration of the k_1-th outer iteration;
Step 4.9, optimize the source domain feature extractor E_s, the target domain feature extractor E_t, the source domain image generator G_s and the target domain image generator G_t with formula (10) to obtain the corresponding gradients for the k_3-th inner iteration of the k_1-th outer iteration;
Step 4.10, assign k_3+1 to k_3, then judge whether k_3 ≥ K_3 holds; if yes, execute step 4.11, otherwise return to step 4.8 and continue in sequence;
Step 4.11, randomly draw B samples from the source domain data set S and from the target domain data set T respectively, and use them as the source domain and target domain training samples for the k_4-th inner iteration of the k_1-th outer iteration;
Step 4.12, optimize the source domain feature extractor E_s and the angle classifier D_p with formula (11) to obtain the corresponding gradients for the k_4-th inner iteration of the k_1-th outer iteration;
Step 4.13, optimize the expression domain discriminator D_de and the angle domain discriminator D_dp with formula (12) to obtain the corresponding gradients for the k_4-th inner iteration of the k_1-th outer iteration;
Step 4.14, assign k_4+1 to k_4, then judge whether k_4 ≥ K_4 holds; if yes, execute step 4.15, otherwise return to step 4.11 and continue in sequence;
Step 4.15, assign k_1+1 to k_1, then make two judgments. First, if 20 < k_1 < K_1 holds, update the learning rate with a linear decay, i.e., l_rate = l_rate - γ × l_rate, where γ is a decay factor set to 0.1 in this example. Then judge whether k_1 ≥ K_1 holds or whether the algorithm has converged; if yes, training ends and the optimal facial expression recognition model is obtained for classifying facial expressions, the final facial expression recognition model being the composition of the target domain feature extractor E_t and the expression classifier R, i.e., R∘E_t, where ∘ denotes function composition; otherwise, return to step 4.3 and continue in sequence.
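A skeleton of the step-4 alternating optimization and of the final predictor R∘E_t is sketched below for illustration. The bodies of formulas (7)-(12) are not reproduced in the text, so the loss groupings per stage are assumptions that only follow the module pairs named in steps 4.4-4.13; hyperparameter values follow this example, while key and function names are illustrative:

```python
# Skeleton of the alternating iterative optimization of step 4 (loss groupings assumed)
# and the final predictor R o E_t of step 4.15.
import torch

HPARAMS = {
    "K1": 30, "K2": 1, "K3": 3, "K4": 1,   # outer steps and the three inner stage lengths
    "B": 32,                               # samples drawn per domain in each batch
    "l_rate": 1e-3,                        # initial learning rate
    "alpha": 2.0, "beta": 3.0, "eta": 0.2, "lambda_": 0.1,  # weight factors of formula (6)
    "gamma": 0.1,                          # linear learning-rate decay factor
}

def train(optimizers, losses, sample_batch, hp=HPARAMS):
    """optimizers / losses: dicts of torch optimizers and loss callables built from the
    sketches above; sample_batch(B) draws B source and B target samples."""
    l_rate = hp["l_rate"]
    for k1 in range(1, hp["K1"] + 1):
        for _ in range(hp["K2"]):                            # steps 4.3-4.7
            batch = sample_batch(hp["B"])
            optimizers["extractors"].zero_grad()
            (losses["supervised"](batch)                     # formula (7): E_s, R
             + losses["extractor_adv"](batch)).backward()    # formulas (8)-(9): E_s, E_t
            optimizers["extractors"].step()
        for _ in range(hp["K3"]):                            # steps 4.8-4.10
            batch = sample_batch(hp["B"])
            optimizers["generators"].zero_grad()
            losses["reconstruction"](batch).backward()       # formula (10): E_s, E_t, G_s, G_t
            optimizers["generators"].step()
        for _ in range(hp["K4"]):                            # steps 4.11-4.14
            batch = sample_batch(hp["B"])
            optimizers["adversaries"].zero_grad()
            (losses["decoupling"](batch)                     # formula (11): E_s, D_p
             + losses["discriminators"](batch)).backward()   # formula (12): D_de, D_dp
            optimizers["adversaries"].step()
        if 20 < k1 < hp["K1"]:                               # step 4.15: linear decay of l_rate
            l_rate -= hp["gamma"] * l_rate
            for opt in optimizers.values():
                for group in opt.param_groups:
                    group["lr"] = l_rate

@torch.no_grad()
def predict_expression(E_t, R, x):
    """Final model of step 4.15: the expression branch of E_t composed with R."""
    f_e, _ = E_t(x)
    return R(f_e.flatten(1)).argmax(dim=1)
```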
The test results of the present invention are further described in conjunction with the following tables.
To verify the contribution of each learning strategy to the final facial expression recognition performance, a comparison experiment was carried out covering the following four settings: (1) using only the supervised learning strategy; (2) combining the supervised learning and adversarial domain adaptation learning strategies; (3) combining the supervised learning, adversarial domain adaptation learning and cross-adversarial feature decoupling learning strategies; (4) using all learning strategies. The experimental results are shown in Tables 1 and 2.
Table 1: recognition rates (in %) of different learning strategies on the Multi-PIE database
Table 2: recognition rates (in %) of different learning strategies on the BU-3DFE database
As can be seen from the experimental results in Tables 1 and 2, the results improve markedly as more of the proposed learning strategies are used, and good facial expression recognition is still achieved at large deviations from the frontal view, demonstrating the effectiveness of the invention.

Claims (1)

1. An angle-robust personalized facial expression recognition method based on adversarial learning, characterized by comprising the following steps:
step 1, carrying out image preprocessing on a database containing images with N types of human face expressions:
carrying out face detection and alignment on all facial expression images in the database by using the MTCNN (multi-task cascaded convolutional networks) algorithm, so as to obtain a normalized facial image data set that is used as the sample set;
randomly dividing the sample set, with the individuals in the database as the division unit, to obtain a source domain data set S and a target domain data set T; let any sample in the source domain data set S be x_s, its expression label be y_s, and its angle label be p_s; let any sample in the target domain data set T be x_t;
Step 2, constructing an adversarial-learning-based feature decoupling and domain adaptation network model, comprising: a source domain feature extractor E_s and a target domain feature extractor E_t, an angle classifier D_p and an expression classifier R, an angle domain discriminator D_dp and an expression domain discriminator D_de, and a source domain image generator G_s and a target domain image generator G_t;
the source domain feature extractor E_s and the target domain feature extractor E_t have the same network structure, consisting in sequence of an input convolutional layer, M downsampling convolutional layers, Q residual convolutional layers, and two branches each containing W convolutional layers; each convolutional layer is followed by an instance normalization layer and a ReLU activation function;
the angle classifier D_p, the expression classifier R, the angle domain discriminator D_dp and the expression domain discriminator D_de are each an H-layer fully connected network;
the source domain image generator G_s and the target domain image generator G_t have the same network structure, consisting in sequence of an input convolutional layer, J upsampling deconvolutional layers and an output convolutional layer; each convolutional layer before the output convolutional layer is followed by an instance normalization layer and a ReLU activation function, and the output convolutional layer is followed by a Tanh activation function;
initializing the weights of all convolutional layers, deconvolutional layers and fully connected layers in the adversarial-learning-based feature decoupling and domain adaptation network model with a Gaussian distribution;
Step 3, four learning strategies of the adversarial-learning-based feature decoupling and domain adaptation network model: a supervised learning strategy, an adversarial domain adaptation learning strategy, a cross-adversarial feature decoupling learning strategy and an image reconstruction learning strategy;
Step 3.1, supervised learning strategy:
Step 3.1.1, input any sample x_s in the source domain into the source domain feature extractor E_s to obtain two feature vectors f_s^e and f_s^p, where f_s^e denotes the expression-related feature of sample x_s and f_s^p denotes the angle-related feature of sample x_s;
Step 3.1.2, input the angle-related feature f_s^p of the source domain sample x_s into the angle classifier D_p for angle recognition to obtain the angle category of sample x_s;
establish the angle recognition loss function l_p(E_s, D_p) with formula (1):
l_p(E_s, D_p) = Sup(D_p(f_s^p), p_s)   (1)
in formula (1), Sup(·) represents a supervised loss function;
Step 3.1.3, input the expression-related feature f_s^e of the source domain sample x_s into the expression classifier R for expression recognition to obtain the expression category of sample x_s;
establish the expression recognition loss function l_e(E_s, R) with formula (2):
l_e(E_s, R) = Sup(R(f_s^e), y_s)   (2)
Step 3.2, the adaptive learning strategy of the countermeasure field:
step 3.2.1, any sample x in the target domaintInputting the target domain feature extractor EtIn (1), two kinds of feature vectors are obtainedt e,ft pIn which ft eRepresenting a sample x in the target domaintExpression-related feature of ft pRepresenting a sample x in the target domaintThe angle-related characteristic of (a);
step 3.2.2 sample x in the Source DomainsAngle-dependent characteristic f ofs pOr samples x in the target domaintExpression-related feature f oft pInputting the angle domain discriminator DdpObtaining an angle-dependent feature fs pAs true or expression-related features ft pA false recognition result;
step 3.2.3 sample x in the Source DomainsExpression-related feature f ofs eOr samples x in the target domaintExpression-related feature f oft eInputting the expression domain discriminator DdeIn the method, expression related characteristics f are obtaineds eAs true or expression-related features ft eA false recognition result;
step 3.2.4, establishing a counterlearning loss function l by using the formula (3)adv(Es,Et,Ddp,Dde):
Figure FDA0002397663700000023
Step 3.3, a countermeasure characteristic decoupling learning strategy:
step 3.3.1, sample x in the Source DomainsAngle-dependent characteristic f ofs pInputting the expression into the expression classifier R to obtain a sample x in a source domainsThe expression classification result of (1);
sample x in the source domainsExpression-related feature f ofs eInput angle classifier DpObtaining a sample x in the source domainsThe angle classification result of (1);
step 3.3.2, establishing expression classifier R for angle correlation characteristic f by using formula (4)s pAnd an angle classifier DpFor expression-related feature fs eIs classified as a loss function
Figure FDA0002397663700000031
Figure FDA0002397663700000032
Step 3.4, image reconstruction learning strategy:
step 3.4.1, sample x in the Source DomainsAngle-dependent characteristic f ofs pAnd a target domainMiddle sample xtExpression-related feature f oft eCombined and input to the source domain image generator GsGenerating a reconstructed image in the source domain
Figure FDA0002397663700000033
Step 3.4.2, sample x in target DomaintAngle-dependent characteristic f oft pAnd sample x in the source domainsExpression-related feature f ofs eAre combined and input to the target domain image generator GtGenerating a reconstructed image in the target domain
Figure FDA0002397663700000034
Step 3.4.3, establishing constraint l of reconstructed image by using formula (5)clc(Es,Et,Gs,Gt):
Figure FDA0002397663700000035
In formula (5), x'sRepresents another sample in the source domain data set S and is associated with sample xsHaving the same angle label as sample xtThe same expression label is possessed; x'tRepresents another sample in the target domain data set T and is associated with sample xtHaving the same angle label as sample xsThe same expression label is possessed;
Step 4, construct the overall loss function and train the adversarial-learning-based feature decoupling and domain adaptation network model by alternating iterative optimization to obtain the optimal facial expression recognition model:
Step 4.1, construct the total objective function with formula (6), a weighted combination of the loss functions of formulas (1)-(5);
in formula (6), α, β, η and λ are all weight factors;
Step 4.2, set the total number of training steps to K_1, with the current total step count denoted k_1;
set the numbers of optimization steps at the three inner stages to K_2, K_3 and K_4, with corresponding current step counts k_2, k_3 and k_4;
set the number of samples drawn in each training batch to B;
initialize k_1, k_2, k_3 and k_4 all to 0;
Step 4.3, randomly draw B samples from the source domain data set S and from the target domain data set T respectively, and use them as the source domain and target domain training samples for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.4, optimize the source domain feature extractor E_s and the expression classifier R with formula (7) to obtain the corresponding gradients for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.5, optimize the source domain feature extractor E_s and the target domain feature extractor E_t with formula (8) to obtain the corresponding gradients for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.6, optimize the source domain feature extractor E_s and the target domain feature extractor E_t with formula (9) to obtain the corresponding gradients for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.7, assign k_2+1 to k_2, then judge whether k_2 ≥ K_2 holds; if yes, execute step 4.8, otherwise return to step 4.3 and continue in sequence;
Step 4.8, randomly draw B samples from the source domain data set S and from the target domain data set T respectively, and use them as the source domain and target domain training samples for the k_3-th inner iteration of the k_1-th outer iteration;
Step 4.9, optimize the source domain feature extractor E_s, the target domain feature extractor E_t, the source domain image generator G_s and the target domain image generator G_t with formula (10) to obtain the corresponding gradients for the k_3-th inner iteration of the k_1-th outer iteration;
Step 4.10, assign k_3+1 to k_3, then judge whether k_3 ≥ K_3 holds; if yes, execute step 4.11, otherwise return to step 4.8 and continue in sequence;
Step 4.11, randomly draw B samples from the source domain data set S and from the target domain data set T respectively, and use them as the source domain and target domain training samples for the k_4-th inner iteration of the k_1-th outer iteration;
Step 4.12, optimize the source domain feature extractor E_s and the angle classifier D_p with formula (11) to obtain the corresponding gradients for the k_4-th inner iteration of the k_1-th outer iteration;
Step 4.13, optimize the expression domain discriminator D_de and the angle domain discriminator D_dp with formula (12) to obtain the corresponding gradients for the k_4-th inner iteration of the k_1-th outer iteration;
Step 4.14, assign k_4+1 to k_4, then judge whether k_4 ≥ K_4 holds; if yes, execute step 4.15, otherwise return to step 4.11 and continue in sequence;
Step 4.15, assign k_1+1 to k_1, then judge whether k_1 ≥ K_1 holds or whether the algorithm has converged; if yes, training ends and the optimal facial expression recognition model is obtained for classifying facial expressions; otherwise, return to step 4.3 and continue in sequence.
CN202010136966.3A 2020-03-02 2020-03-02 Angle-robust personalized facial expression recognition method based on adversarial learning Active CN111382684B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010136966.3A CN111382684B (en) 2020-03-02 2020-03-02 Angle-robust personalized facial expression recognition method based on adversarial learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010136966.3A CN111382684B (en) 2020-03-02 2020-03-02 Angle-robust personalized facial expression recognition method based on adversarial learning

Publications (2)

Publication Number Publication Date
CN111382684A true CN111382684A (en) 2020-07-07
CN111382684B CN111382684B (en) 2022-09-06

Family

ID=71218531

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010136966.3A Active CN111382684B (en) 2020-03-02 2020-03-02 Angle-robust personalized facial expression recognition method based on adversarial learning

Country Status (1)

Country Link
CN (1) CN111382684B (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120288167A1 (en) * 2011-05-13 2012-11-15 Microsoft Corporation Pose-robust recognition
CN108446609A (en) * 2018-03-02 2018-08-24 南京邮电大学 A kind of multi-angle human facial expression recognition method based on generation confrontation network
CN109508669A (en) * 2018-11-09 2019-03-22 厦门大学 A kind of facial expression recognizing method based on production confrontation network
CN110188656A (en) * 2019-05-27 2019-08-30 南京邮电大学 The generation and recognition methods of multi-orientation Face facial expression image
CN110348330A (en) * 2019-06-24 2019-10-18 电子科技大学 Human face posture virtual view generation method based on VAE-ACGAN

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CAN WANG ET AL.: ""Identity- and Pose-Robust Facial Expression Recognition through Adversarial Feature Learning"", 《AFFECTIVE COMPUTING & FACIAL ANALYTICS》 *
姚乃明 等: ""基于生成式对抗网络的鲁棒人脸表情识别"", 《自动化学报》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112101241A (en) * 2020-09-17 2020-12-18 西南科技大学 Lightweight expression recognition method based on deep learning
CN112133311A (en) * 2020-09-18 2020-12-25 科大讯飞股份有限公司 Speaker recognition method, related device and readable storage medium

Also Published As

Publication number Publication date
CN111382684B (en) 2022-09-06

Similar Documents

Publication Publication Date Title
CN108615010B (en) Facial expression recognition method based on parallel convolution neural network feature map fusion
Liu et al. Connecting image denoising and high-level vision tasks via deep learning
CN108596039B (en) Bimodal emotion recognition method and system based on 3D convolutional neural network
CN107122809B (en) Neural network feature learning method based on image self-coding
CN110532900B (en) Facial expression recognition method based on U-Net and LS-CNN
CN106372581B (en) Method for constructing and training face recognition feature extraction network
CN108648191B (en) Pest image recognition method based on Bayesian width residual error neural network
CN112800903B (en) Dynamic expression recognition method and system based on space-time diagram convolutional neural network
CN111274921B (en) Method for recognizing human body behaviors by using gesture mask
CN109886881B (en) Face makeup removal method
CN110046671A (en) A kind of file classification method based on capsule network
CN108171318B (en) Convolution neural network integration method based on simulated annealing-Gaussian function
CN109344759A (en) A kind of relatives' recognition methods based on angle loss neural network
CN111814611B (en) Multi-scale face age estimation method and system embedded with high-order information
CN109002755B (en) Age estimation model construction method and estimation method based on face image
CN109783666A (en) A kind of image scene map generation method based on iteration fining
Ocquaye et al. Dual exclusive attentive transfer for unsupervised deep convolutional domain adaptation in speech emotion recognition
CN111582397A (en) CNN-RNN image emotion analysis method based on attention mechanism
CN112651940B (en) Collaborative visual saliency detection method based on dual-encoder generation type countermeasure network
CN107463954A (en) A kind of template matches recognition methods for obscuring different spectrogram picture
CN113642621A (en) Zero sample image classification method based on generation countermeasure network
CN111382684B (en) Angle robust personalized facial expression recognition method based on antagonistic learning
CN115966010A (en) Expression recognition method based on attention and multi-scale feature fusion
CN112667071A (en) Gesture recognition method, device, equipment and medium based on random variation information
Han et al. Multi-scale feature network for few-shot learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant