CN111382684A - Angle-robust personalized facial expression recognition method based on adversarial learning - Google Patents

Angle-robust personalized facial expression recognition method based on adversarial learning

Info

Publication number
CN111382684A
Authority
CN
China
Prior art keywords
sample
expression
domain
angle
source domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010136966.3A
Other languages
Chinese (zh)
Other versions
CN111382684B (en)
Inventor
王上飞
王灿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC filed Critical University of Science and Technology of China USTC
Priority to CN202010136966.3A priority Critical patent/CN111382684B/en
Publication of CN111382684A publication Critical patent/CN111382684A/en
Application granted granted Critical
Publication of CN111382684B publication Critical patent/CN111382684B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an angle-robust personalized facial expression recognition method based on adversarial learning, which comprises the following steps: 1. performing image preprocessing on a database containing images of N classes of facial expressions; 2. constructing an adversarial-learning-based feature decoupling and domain adaptation network model; 3. training the constructed network model by alternating iterative optimization; 4. predicting the facial image to be tested with the trained model to classify and recognize the facial expression. The invention can simultaneously overcome the negative influence of viewing angle and inter-individual differences on facial expression recognition, thereby achieving accurate recognition of facial expressions.

Description

Angle-robust personalized facial expression recognition method based on adversarial learning
Technical Field
The invention relates to the technical field of computer vision, in particular to an angle-robust personalized facial expression recognition method based on adversarial learning.
Background
Facial expression recognition is an important research topic in computer vision, with wide applications in human-computer interaction, fatigue detection, criminal investigation and medical care. Most current facial expression recognition methods assume that the facial image is frontal, but in practical application scenarios the user's relative position is not fixed and the scene is changeable, so only facial expression recognition under multi-angle conditions can meet practical needs. In recent years, researchers have therefore proposed methods to cope with the influence of viewing angle on facial expression recognition. Depending on how angle changes are handled, these methods fall into three categories: view-specific classifier methods, single-classifier methods, and angle normalization methods. View-specific classifier methods are intuitive: a separate classifier is trained for the samples of each angle; however, limited by the scarce training samples, they cannot learn a robust classifier for every angle. Single-classifier methods attempt to learn a more robust classifier from a large number of samples; thanks to generative adversarial networks and variational autoencoders, sample generation can provide richer and more diverse training samples for classifier learning. However, generating high-quality samples is hard to guarantee, and generated low-quality samples instead introduce noise into classifier learning and degrade classifier performance. Angle normalization methods map face samples or feature representations of arbitrary angles into frontal-face samples or feature representations, keeping the identity and the expression content unchanged during the conversion. However, such methods rely on paired training samples, i.e., for a non-frontal sample of an individual, a corresponding frontal sample of the same individual must exist, which severely restricts their use in practice.
Besides viewing angle, inter-individual difference is another important factor affecting facial expression recognition. For the same expression, different individuals differ greatly in how they express it, owing to differences in facial shape, personality, appearance and so on, which seriously affects recognition performance. For example, for "happiness", some people burst into laughter while others only smile with closed lips; although both belong to the expression "happy", they differ from each other at the pixel level, which makes feature learning difficult. In addition, individual appearance varies greatly, which also challenges expression analysis. Identity-robust facial expression recognition can be addressed by subject-specific methods, i.e., personalized facial expression recognition. Such methods build a dedicated classifier for a specific individual, so that the learned classifier concentrates on a single person and avoids the bias introduced by other individuals. However, limited by the sample size of a single individual, it is difficult to learn a facial expression classifier with good performance.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides an angle-robust personalized facial expression recognition method based on adversarial learning, so that the influence of viewing angle and inter-individual differences on facial expression recognition can be overcome simultaneously and the recognition rate of facial expression recognition is improved.
In order to achieve the purpose, the invention adopts the following technical scheme:
the invention relates to an angle robust personalized facial expression recognition method based on antagonistic learning, which is characterized by comprising the following steps of:
step 1, carrying out image preprocessing on a database containing images with N types of human face expressions:
carrying out face detection and alignment on all facial expression images in the database by using the MTCNN (multi-task cascaded convolutional networks) algorithm, so as to obtain a normalized facial image data set that is used as the sample set;
randomly dividing the sample set, with the individuals in the database as the division unit, to obtain a source domain data set S and a target domain data set T; let any sample in the source domain data set S be x_s, its expression label be y_s, and its angle label be p_s; let any sample in the target domain data set T be x_t;
Step 2, constructing an adversarial-learning-based feature decoupling and domain adaptation network model, comprising: a source domain feature extractor E_s and a target domain feature extractor E_t, an angle classifier D_p and an expression classifier R, an angle domain discriminator D_dp and an expression domain discriminator D_de, and a source domain image generator G_s and a target domain image generator G_t;
the source domain feature extractor E_s and the target domain feature extractor E_t have the same network structure, consisting in sequence of an input convolutional layer, M downsampling convolutional layers, Q residual convolutional layers, and two branches each containing W convolutional layers; each convolutional layer is followed by an instance normalization layer and a ReLU activation function;
the angle classifier D_p, the expression classifier R, the angle domain discriminator D_dp and the expression domain discriminator D_de are each an H-layer fully connected network;
the source domain image generator G_s and the target domain image generator G_t have the same network structure, consisting in sequence of an input convolutional layer, J upsampling deconvolutional layers and an output convolutional layer; each convolutional layer before the output convolutional layer is followed by an instance normalization layer and a ReLU activation function, and the output convolutional layer is followed by a Tanh activation function;
initializing the weights of all convolutional layers, deconvolutional layers and fully connected layers in the adversarial-learning-based feature decoupling and domain adaptation network model with a Gaussian distribution;
Step 3, four learning strategies of the adversarial-learning-based feature decoupling and domain adaptation network model: a supervised learning strategy, an adversarial domain adaptation learning strategy, a cross-adversarial feature decoupling learning strategy and an image reconstruction learning strategy;
Step 3.1, supervised learning strategy:
Step 3.1.1, input any sample x_s in the source domain into the source domain feature extractor E_s to obtain two feature vectors f_s^e and f_s^p, where f_s^e denotes the expression-related feature of sample x_s and f_s^p denotes the angle-related feature of sample x_s;
Step 3.1.2, input the angle-related feature f_s^p of the source domain sample x_s into the angle classifier D_p for angle recognition to obtain the angle category of sample x_s;
establish the angle recognition loss function l_p(E_s, D_p) with formula (1):
l_p(E_s, D_p) = Sup(D_p(f_s^p), p_s)   (1)
in formula (1), Sup(·) represents a supervised loss function;
Step 3.1.3, input the expression-related feature f_s^e of the source domain sample x_s into the expression classifier R for expression recognition to obtain the expression category of sample x_s;
establish the expression recognition loss function l_e(E_s, R) with formula (2):
l_e(E_s, R) = Sup(R(f_s^e), y_s)   (2)
Step 3.2, the adaptive learning strategy of the countermeasure field:
step 3.2.1, any sample x in the target domaintInputting the target domain feature extractor EtIn (1), two kinds of feature vectors are obtainedt e,ft pIn which ft eRepresenting a sample x in the target domaintExpression-related feature of ft pRepresenting a sample x in the target domaintThe angle-related characteristic of (a);
step 3.2.2 sample x in the Source DomainsAngle-dependent characteristic d ofs pOr samples x in the target domaintExpression-related feature f oft pInputting the angle domain discriminator DdpObtaining an angle-dependent feature fs pAs true or expression-related features ft pA false recognition result;
step 3.2.3 sample x in the Source DomainsExpression-related feature f ofs eOr samples x in the target domaintExpression-related feature f oft eInputting the expression domain discriminator DdeIn the method, expression related characteristics f are obtaineds eAs true or expression-related features ft eA false recognition result;
step 3.2.4, establishing a counterlearning loss function by using the formula (3)
Figure BDA00023976637100000315
Figure BDA00023976637100000312
Step 3.3, a countermeasure characteristic decoupling learning strategy:
step 3.3.1, sample x in the Source DomainsAngle-dependent characteristic f ofs pInputting the expression into the expression classifier R to obtain a sample x in a source domainsThe expression classification result of (1);
sample x in the source domainsExpression-related feature f ofs eInput angle classifier DpObtaining a sample x in the source domainsThe angle classification result of (1);
step 3.3.2, establishing expression classifier R for angle correlation characteristic f by using formula (4)s pAnd an angle classifier DpFor expression-related feature fs eIs classified as a loss function
Figure BDA0002397663710000045
Figure BDA0002397663710000046
Step 3.4, image reconstruction learning strategy:
step 3.4.1, sample x in the Source DomainsAngle-dependent characteristic f ofs pAnd an objectSamples x in the domaintExpression-related feature f oft eCombined and input to the source domain image generator GsGenerating a reconstructed image in the source domain
Figure BDA0002397663710000048
Step 3.4.2, sample x in target DomaintAngle-dependent characteristic f oft pAnd sample x in the source domainsExpression-related feature f ofs eAre combined and input to the target domain image generator GtGenerating a reconstructed image in the target domain
Figure BDA00023976637100000410
Step 3.4.3, establishing constraints for the reconstructed image using equation (5)
Figure BDA00023976637100000413
Figure BDA00023976637100000411
In formula (5), x'sRepresents another sample in the source domain data set S and is associated with sample xsHaving the same angle label as sample xtThe same expression label is possessed; x'tRepresents another sample in the target domain data set T and is associated with sample xtHaving the same angle label as sample xsThe same expression label is possessed;
Step 4, construct the overall loss function and train the adversarial-learning-based feature decoupling and domain adaptation network model by alternating iterative optimization to obtain the optimal facial expression recognition model:
Step 4.1, construct the total objective function with formula (6), a weighted combination of the loss functions of formulas (1)-(5);
in formula (6), α, β, η and λ are all weight factors;
Step 4.2, set the total number of training steps to K_1, with the current total step count denoted k_1;
set the numbers of optimization steps at the three inner stages to K_2, K_3 and K_4, with corresponding current step counts k_2, k_3 and k_4;
set the number of samples drawn in each training batch to B;
initialize k_1, k_2, k_3 and k_4 all to 0;
Step 4.3, randomly draw B samples from the source domain data set S and from the target domain data set T respectively, and use them as the source domain and target domain training samples for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.4, optimize the source domain feature extractor E_s and the expression classifier R with formula (7) to obtain the corresponding gradients for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.5, optimize the source domain feature extractor E_s and the target domain feature extractor E_t with formula (8) to obtain the corresponding gradients for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.6, optimize the source domain feature extractor E_s and the target domain feature extractor E_t with formula (9) to obtain the corresponding gradients for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.7, assign k_2+1 to k_2, then judge whether k_2 ≥ K_2 holds; if yes, execute step 4.8, otherwise return to step 4.3 and continue in sequence;
Step 4.8, randomly draw B samples from the source domain data set S and from the target domain data set T respectively, and use them as the source domain and target domain training samples for the k_3-th inner iteration of the k_1-th outer iteration;
Step 4.9, optimize the source domain feature extractor E_s, the target domain feature extractor E_t, the source domain image generator G_s and the target domain image generator G_t with formula (10) to obtain the corresponding gradients for the k_3-th inner iteration of the k_1-th outer iteration;
Step 4.10, assign k_3+1 to k_3, then judge whether k_3 ≥ K_3 holds; if yes, execute step 4.11, otherwise return to step 4.8 and continue in sequence;
Step 4.11, randomly draw B samples from the source domain data set S and from the target domain data set T respectively, and use them as the source domain and target domain training samples for the k_4-th inner iteration of the k_1-th outer iteration;
Step 4.12, optimize the source domain feature extractor E_s and the angle classifier D_p with formula (11) to obtain the corresponding gradients for the k_4-th inner iteration of the k_1-th outer iteration;
Step 4.13, optimize the expression domain discriminator D_de and the angle domain discriminator D_dp with formula (12) to obtain the corresponding gradients for the k_4-th inner iteration of the k_1-th outer iteration;
Step 4.14, assign k_4+1 to k_4, then judge whether k_4 ≥ K_4 holds; if yes, execute step 4.15, otherwise return to step 4.11 and continue in sequence;
Step 4.15, assign k_1+1 to k_1, then judge whether k_1 ≥ K_1 holds or whether the algorithm has converged; if yes, training ends and the optimal facial expression recognition model is obtained for classifying facial expressions; otherwise, return to step 4.3 and continue in sequence.
Compared with the prior art, the invention has the beneficial effects that:
1. By proposing the cross-adversarial feature decoupling learning strategy, the invention decouples the expression-related features from the angle-related features, so that the expression-related features contain no angle information irrelevant to expression recognition and the angle-related features contain no expression information irrelevant to angle recognition. This overcomes the limitations of existing angle-robust facial expression recognition methods, such as limited sample diversity and dependence on high-quality face image generation, and achieves more angle-robust facial expression recognition.
2. By proposing the adversarial domain adaptation learning strategy, the invention effectively transfers source domain information to the target domain, which benefits learning of the facial expression recognition task on the target domain and overcomes the limitation of traditional personalized facial expression recognition methods imposed by the small number of target domain samples. The strategy requires no expression or angle annotations for the target domain, which improves usability in practical environments and effectively copes with the influence of inter-individual differences on recognition.
3. By proposing the reconstruction learning strategy, the performance of cross-adversarial feature decoupling learning and adversarial domain adaptation learning is further improved, which further improves the facial expression recognition performance of the method.
4. The invention designs an alternating iterative optimization method that performs supervised learning, cross-adversarial feature decoupling learning, adversarial domain adaptation learning and reconstruction learning simultaneously, realizes end-to-end training and prediction, reduces manual intervention, lets the learning strategies complement one another, jointly learns angle- and identity-robust features, and optimizes the learning of expression-related features.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention;
FIG. 2 is a model block diagram of the present invention;
FIG. 3 is a graph of the reconstructed results of the present invention on the Multi-PIE and BU-3DFE databases.
Detailed Description
In this embodiment, as shown in FIG. 1, an angle-robust personalized facial expression recognition method based on adversarial learning is performed according to the following steps:
step 1, carrying out image preprocessing on a database containing images with N types of human face expressions:
performing face detection and alignment on all facial expression images in the database by using the MTCNN (multi-task cascaded convolutional networks) algorithm to obtain a normalized facial image data set that is used as the sample set, wherein all normalized facial images have a pixel size of 128 × 128;
randomly dividing the sample set, with the individuals in the database as the division unit, to obtain a source domain data set S and a target domain data set T; let any sample in the source domain data set S be x_s, its expression label be y_s, and its angle label be p_s; let any sample in the target domain data set T be x_t; the target domain samples carry no expression or angle annotation information;
In this embodiment, as shown in FIG. 3, the Multi-PIE and BU-3DFE facial expression databases are used. The Multi-PIE facial expression database contains 755,370 facial images collected from 337 volunteers at 13 angles, from -90° to 90° at 15° intervals, with the expressions labeled as: smile, surprise, squint, disgust, scream and neutral. The BU-3DFE facial expression database contains 100 3D models, from 56 female and 44 male subjects; samples at any angle can be obtained by rotating the 3D models, and the expressions are labeled as: anger, disgust, fear, happiness, neutral, sadness and surprise.
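For illustration only (this sketch is not part of the claimed method), the preprocessing and domain split of step 1 can be outlined in Python as follows; detect_and_align() is a hypothetical stand-in for an MTCNN-style detector/aligner, and the target_ratio split proportion is an assumption, since the embodiment does not fix it:

```python
# Illustrative sketch of step 1: normalize faces to 128 x 128 and split the sample set
# into a source domain S and a target domain T, using individuals as the split unit.
# detect_and_align() is a hypothetical placeholder for an MTCNN-style detector/aligner.
import random
from typing import List, Tuple

import numpy as np
from PIL import Image

IMG_SIZE = 128  # normalized face size used in this embodiment

def detect_and_align(img: Image.Image) -> Image.Image:
    """Hypothetical stand-in: detect the face, align it, and crop. Here it simply
    resizes the whole image so the sketch stays self-contained."""
    return img.resize((IMG_SIZE, IMG_SIZE))

def build_domains(samples: List[Tuple[str, int, int, int]],
                  target_ratio: float = 0.2, seed: int = 0):
    """samples: list of (image_path, subject_id, expression_label, angle_label).
    Individuals (subject_id) are the unit of the random split, so the source and
    target domains contain disjoint identities; target_ratio is illustrative only."""
    rng = random.Random(seed)
    subjects = sorted({s[1] for s in samples})
    rng.shuffle(subjects)
    target_ids = set(subjects[:max(1, int(len(subjects) * target_ratio))])

    source, target = [], []
    for path, sid, expr, angle in samples:
        face = np.asarray(detect_and_align(Image.open(path).convert("RGB")))
        if sid in target_ids:
            target.append(face)                # target-domain samples keep no labels
        else:
            source.append((face, expr, angle))
    return source, target
```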
Step 2, as shown in FIG. 2, constructing an adversarial-learning-based feature decoupling and domain adaptation network model, comprising: a source domain feature extractor E_s and a target domain feature extractor E_t, an angle classifier D_p and an expression classifier R, an angle domain discriminator D_dp and an expression domain discriminator D_de, and a source domain image generator G_s and a target domain image generator G_t;
the source domain feature extractor E_s and the target domain feature extractor E_t have the same network structure, consisting in sequence of an input convolutional layer (kernel size 7 × 7, number of kernels 3, stride 2, padding 3), M downsampling convolutional layers (in this example M is set to 4; kernel sizes 4 × 4, strides 2, paddings 1, numbers of kernels 64, 32, 16 and 8 respectively), Q residual convolutional layers (in this example Q is set to 3; kernel sizes 3 × 3, number of kernels 8, stride 2, padding 1), and two branches each containing W convolutional layers (in this example W is set to 2; kernel size 3 × 3, number of kernels 8, stride 2, padding 1); each convolutional layer is followed by an instance normalization layer and a ReLU activation function;
the angle classifier D_p, the expression classifier R, the angle domain discriminator D_dp and the expression domain discriminator D_de are each an H-layer fully connected network with input length 512 (in this example, H is set to 3);
the source domain image generator G_s and the target domain image generator G_t have the same network structure, consisting in sequence of an input convolutional layer (kernel size 7 × 7, number of kernels 8, stride 1, padding 3), J upsampling deconvolutional layers (in this example J is set to 4; kernel sizes 4 × 4, strides 2, paddings 1, numbers of kernels 8, 16, 32 and 64 respectively) and an output convolutional layer (kernel size 7 × 7, number of kernels 3, stride 1, padding 3);
initializing the weights of all convolutional layers, deconvolutional layers and fully connected layers in the adversarial-learning-based feature decoupling and domain adaptation network model with a Gaussian distribution N(0, 0.02);
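For reference, the modules of step 2 can be sketched in PyTorch as follows. This is a minimal sketch, not the exact claimed architecture: the stride of the input convolution, several channel widths and the residual/branch strides are assumptions chosen so that each branch outputs an 8 × 8 × 8 feature map (flattened length 512) for 128 × 128 inputs, matching the feature sizes described in steps 3.1 and 3.4.

```python
# Minimal PyTorch sketch of the step-2 modules (layer hyperparameters partly assumed).
import torch
import torch.nn as nn

def conv_in_relu(c_in, c_out, k, s, p):
    return nn.Sequential(nn.Conv2d(c_in, c_out, k, s, p),
                         nn.InstanceNorm2d(c_out), nn.ReLU(inplace=True))

class ResBlock(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.body = nn.Sequential(conv_in_relu(c, c, 3, 1, 1),
                                  nn.Conv2d(c, c, 3, 1, 1), nn.InstanceNorm2d(c))
    def forward(self, x):
        return torch.relu(x + self.body(x))

class FeatureExtractor(nn.Module):
    """E_s / E_t: shared topology, separate weights. Returns (f_e, f_p) feature maps."""
    def __init__(self):
        super().__init__()
        self.stem = nn.Sequential(
            conv_in_relu(3, 32, 7, 1, 3),    # input conv (width and stride assumed)
            conv_in_relu(32, 64, 4, 2, 1),   # 4 downsampling convs: 128 -> 8,
            conv_in_relu(64, 32, 4, 2, 1),   #   output channels 64/32/16/8 as in the text
            conv_in_relu(32, 16, 4, 2, 1),
            conv_in_relu(16, 8, 4, 2, 1),
            ResBlock(8), ResBlock(8), ResBlock(8))   # 3 residual blocks (stride 1 assumed)
        self.branch_e = nn.Sequential(conv_in_relu(8, 8, 3, 1, 1),
                                      conv_in_relu(8, 8, 3, 1, 1))  # expression branch
        self.branch_p = nn.Sequential(conv_in_relu(8, 8, 3, 1, 1),
                                      conv_in_relu(8, 8, 3, 1, 1))  # angle branch
    def forward(self, x):
        h = self.stem(x)
        return self.branch_e(h), self.branch_p(h)     # each B x 8 x 8 x 8

class Generator(nn.Module):
    """G_s / G_t: consumes the 8x8x16 concatenation of an angle map and an expression map."""
    def __init__(self):
        super().__init__()
        def up(c_in, c_out):
            return nn.Sequential(nn.ConvTranspose2d(c_in, c_out, 4, 2, 1),
                                 nn.InstanceNorm2d(c_out), nn.ReLU(inplace=True))
        self.net = nn.Sequential(conv_in_relu(16, 8, 7, 1, 3),
                                 up(8, 8), up(8, 16), up(16, 32), up(32, 64),
                                 nn.Conv2d(64, 3, 7, 1, 3), nn.Tanh())
    def forward(self, f_p, f_e):
        return self.net(torch.cat([f_p, f_e], dim=1))

def mlp(n_out, n_in=512, hidden=256, layers=3):
    """H-layer fully connected head used for D_p, R, D_dp, D_de (hidden width assumed)."""
    mods, c = [], n_in
    for _ in range(layers - 1):
        mods += [nn.Linear(c, hidden), nn.ReLU(inplace=True)]
        c = hidden
    mods.append(nn.Linear(c, n_out))
    return nn.Sequential(*mods)

def init_gaussian(m):
    """Gaussian N(0, 0.02) initialization of conv/deconv/linear weights."""
    if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d, nn.Linear)):
        nn.init.normal_(m.weight, 0.0, 0.02)
        if m.bias is not None:
            nn.init.zeros_(m.bias)
```

E_s and E_t (and likewise G_s and G_t) would be two instances of the same class with separate weights, e.g. E_s = FeatureExtractor(); E_s.apply(init_gaussian); the classifiers would be mlp(number_of_expressions) and mlp(number_of_angles), and the domain discriminators mlp(1).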
Step 3, four learning strategies of the adversarial-learning-based feature decoupling and domain adaptation network model: a supervised learning strategy, an adversarial domain adaptation learning strategy, a cross-adversarial feature decoupling learning strategy and an image reconstruction learning strategy;
Step 3.1, supervised learning strategy:
Step 3.1.1, input any sample x_s in the source domain into the source domain feature extractor E_s to obtain two feature vectors f_s^e and f_s^p, where f_s^e denotes the expression-related feature of sample x_s and f_s^p denotes the angle-related feature of sample x_s; both features are obtained by flattening the feature maps output by the convolutional branches, and their dimension is 512;
Step 3.1.2, input the angle-related feature f_s^p of the source domain sample x_s into the angle classifier D_p for angle recognition to obtain the angle category of sample x_s;
establish the angle recognition loss function l_p(E_s, D_p) with formula (1):
l_p(E_s, D_p) = Sup(D_p(f_s^p), p_s)   (1)
in formula (1), Sup(·) represents a supervised loss function; square loss, Softmax loss, cross-entropy loss, etc. can be used;
Step 3.1.3, input the expression-related feature f_s^e of the source domain sample x_s into the expression classifier R for expression recognition to obtain the expression category of sample x_s;
establish the expression recognition loss function l_e(E_s, R) with formula (2):
l_e(E_s, R) = Sup(R(f_s^e), y_s)   (2)
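A minimal sketch of step 3.1 follows, taking cross-entropy as the choice of Sup(·) (one of the admissible losses listed above); it is illustrative only:

```python
# Sketch of the supervised losses of step 3.1, taking cross-entropy as Sup(.).
import torch.nn.functional as F

def supervised_losses(E_s, D_p, R, x_s, y_s, p_s):
    """x_s: source images (B, 3, 128, 128); y_s: expression labels; p_s: angle labels."""
    f_e, f_p = E_s(x_s)                         # two 8 x 8 x 8 feature maps
    f_e, f_p = f_e.flatten(1), f_p.flatten(1)   # flatten to 512-d vectors
    loss_p = F.cross_entropy(D_p(f_p), p_s)     # formula (1): angle recognition loss
    loss_e = F.cross_entropy(R(f_e), y_s)       # formula (2): expression recognition loss
    return loss_e, loss_p
```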
Step 3.2, the adaptive learning strategy of the countermeasure field:
step 3.2.1, any sample x in the target DomaintInput target domain feature extractor EtIn (1), two kinds of feature vectors are obtainedt e,ft pIn which ft eRepresenting a sample x in the target domaintExpression-related feature of ft pRepresenting a sample x in the target domaintThe angle-related characteristic of (a); all in oneSimilarly, the two features are obtained by expanding a feature map output by the convolution layer, and the dimension is 512;
step 3.2.2, introduction of countermeasure domain adaptive learning strategy reduction fs pAnd ft pInter-domain distribution variability exists. In particular, sample x in the source domainsAngle-dependent characteristic f ofs pOr samples x in the target domaintExpression-related feature f oft pInput angle domain discriminator DdpObtaining an angle-dependent feature fs pAs true or expression-related features ft pA false recognition result; in the angular domain discriminator DdpTo distinguish f as much as possibles pAnd ft pTime, source domain feature extractor EsAnd a target domain feature extractor EtAs far as possible so that f is generateds pAnd ft pCan not be judged by the angle domain DdpAnd (5) identifying. Thus the source domain feature extractor EsAnd a target domain feature extractor EtAnd angle domain discriminator DdpConstituting an antagonistic relationship.
Step 3.2.3, introduction of confrontation domain adaptive learning strategy reduction fs eAnd ft eThere is a range of distribution variability. In particular, sample x in the source domainsExpression-related feature f ofs eOr samples x in the target domaintExpression-related feature f oft eInput expression domain discriminator DdeIn the method, expression related characteristics f are obtaineds eAs true or expression-related features ft eA false recognition result; in expression domain discriminator DdeTo distinguish f as much as possibles eAnd ft eTime, source domain feature extractor EsAnd a target domain feature extractor EtAs far as possible so that f is generateds eAnd ft eIdentifier D for expression domaindeAnd (5) identifying. Thus the source domain feature extractor EsAnd a target domain feature extractor EtAnd expression domain discriminator DdeConstituting an antagonistic relationship.
Step 3.2.4, establishing a counterlearning loss function by using the formula (3)
Figure BDA0002397663710000094
Figure BDA0002397663710000092
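A minimal sketch of step 3.2 follows. Since formula (3) is given only as an image in the original, a standard binary cross-entropy (GAN-style) objective is assumed here, with source-domain features labelled real, target-domain features labelled fake, and the domain discriminators taken to output a single logit:

```python
# Sketch of the adversarial domain-adaptation losses of step 3.2 (assumed GAN-style form).
import torch
import torch.nn.functional as F

def domain_discriminator_loss(D_dp, D_de, fp_s, fe_s, fp_t, fe_t):
    """Train D_dp / D_de to tell source features (real) from target features (fake).
    Features are detached so only the discriminators receive these gradients."""
    def bce(D, real, fake):
        lr, lf = D(real.detach().flatten(1)), D(fake.detach().flatten(1))
        return (F.binary_cross_entropy_with_logits(lr, torch.ones_like(lr)) +
                F.binary_cross_entropy_with_logits(lf, torch.zeros_like(lf)))
    return bce(D_dp, fp_s, fp_t) + bce(D_de, fe_s, fe_t)

def extractor_fooling_loss(D_dp, D_de, fp_t, fe_t):
    """Train the feature extractors so that target-domain features are judged 'real'
    by the domain discriminators, i.e. become indistinguishable from source features."""
    lp, le = D_dp(fp_t.flatten(1)), D_de(fe_t.flatten(1))
    return (F.binary_cross_entropy_with_logits(lp, torch.ones_like(lp)) +
            F.binary_cross_entropy_with_logits(le, torch.ones_like(le)))
```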
Step 3.3, a countermeasure characteristic decoupling learning strategy:
step 3.3.1, sample x in Source DomainsAngle-dependent characteristic f ofs pInputting the expression into a classifier R to obtain a sample x in a source domainsThe expression classification result of (1);
sample x in the source domainsExpression-related feature f ofs eInput angle classifier DpIn (1), obtain a sample x in the source domainsThe angle classification result of (1);
step 3.3.2, establishing expression classifier R for angle correlation characteristic f by using formula (4)s pAnd an angle classifier DpFor expression-related feature fs eIs classified as a loss function
Figure BDA0002397663710000104
Figure BDA0002397663710000105
By optimizing this loss, the expression classifier R cannot correlate features f to angless pRecognizing expression information while making an angle classifier DpInability to characterize the situational related features fs eIdentifying angle information so that the angle-related feature fs pThere is no expression information independent of angle, so that the expression-related feature fs pAngle information irrelevant to the expression does not exist, and decoupling of the angle and the expression information is realized;
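Formula (4) itself is given only as an image; one common realization of this cross-adversarial constraint, sketched below under that assumption, pushes the "crossed" predictions toward a uniform distribution (maximum entropy) so that R cannot read expression from f_s^p and D_p cannot read angle from f_s^e:

```python
# Sketch of the cross-adversarial decoupling constraint of step 3.3 (assumed form:
# push the crossed predictions toward the uniform distribution, i.e. maximize entropy).
import torch
import torch.nn.functional as F

def decoupling_loss(R, D_p, f_s_e, f_s_p, n_expr, n_angle):
    """f_s_e, f_s_p: flattened 512-d source-domain features."""
    def to_uniform(logits, n_cls):
        log_prob = F.log_softmax(logits, dim=1)
        uniform = torch.full_like(log_prob, 1.0 / n_cls)
        return F.kl_div(log_prob, uniform, reduction="batchmean")
    loss_expr_on_angle = to_uniform(R(f_s_p), n_expr)       # R applied to the angle feature
    loss_angle_on_expr = to_uniform(D_p(f_s_e), n_angle)    # D_p applied to the expression feature
    return loss_expr_on_angle + loss_angle_on_expr
```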
Step 3.4, image reconstruction learning strategy:
Step 3.4.1, combine the angle-related feature f_s^p of the source domain sample x_s with the expression-related feature f_t^e of the target domain sample x_t and input them into the source domain image generator G_s to generate a reconstructed image x̂_s in the source domain;
here the features f_s^p and f_t^e are the un-flattened feature maps output by the convolutional branches, of size 8 × 8 × 8 (height × width × depth), and they are concatenated directly along the depth dimension when combined, giving an 8 × 8 × 16 feature map;
Step 3.4.2, combine the angle-related feature f_t^p of the target domain sample x_t with the expression-related feature f_s^e of the source domain sample x_s and input them into the target domain image generator G_t to generate a reconstructed image x̂_t in the target domain;
likewise, the features f_t^p and f_s^e are the un-flattened 8 × 8 × 8 feature maps output by the convolutional branches, and they are concatenated along the depth dimension when combined, giving an 8 × 8 × 16 feature map;
Step 3.4.3, establish constraints on the reconstructed images with formula (5), the reconstruction constraint l_clc(E_s, E_t, G_s, G_t);
in formula (5), x'_s denotes another sample in the source domain data set S that has the same angle label as sample x_s and the same expression label as sample x_t; x'_t denotes another sample in the target domain data set T that has the same angle label as sample x_t and the same expression label as sample x_s; this requires the angle and expression labels of sample x_t, but the target domain data set carries no expression or angle annotations, so the angle and expression information of sample x_t is obtained from pseudo labels, where p̂_t denotes the pseudo angle label of sample x_t and ŷ_t denotes the pseudo expression label of sample x_t;
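A minimal sketch of step 3.4 follows. The depth-wise concatenation of the two 8 × 8 × 8 maps follows the text, while the L1 distance to x'_s / x'_t is an assumption, since formula (5) is given only as an image:

```python
# Sketch of the image-reconstruction constraint of step 3.4: the generator internally
# concatenates the 8x8x8 angle map and expression map into an 8x8x16 map, and the
# cross-domain reconstructions are compared with x'_s / x'_t (L1 distance assumed).
import torch.nn.functional as F

def reconstruction_loss(G_s, G_t, fp_s, fe_s, fp_t, fe_t, x_s_prime, x_t_prime):
    """fp_*, fe_*: un-flattened 8x8x8 feature maps from E_s / E_t.
    x_s_prime: source sample with x_s's angle label and x_t's (pseudo) expression label.
    x_t_prime: target sample with x_t's (pseudo) angle label and x_s's expression label."""
    x_hat_s = G_s(fp_s, fe_t)   # reconstruction in the source domain
    x_hat_t = G_t(fp_t, fe_s)   # reconstruction in the target domain
    return F.l1_loss(x_hat_s, x_s_prime) + F.l1_loss(x_hat_t, x_t_prime)
```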
Step 4, construct the overall loss function and train the adversarial-learning-based feature decoupling and domain adaptation network model by alternating iterative optimization to obtain the optimal facial expression recognition model:
Step 4.1, construct the total objective function with formula (6), a weighted combination of the loss functions of formulas (1)-(5);
in formula (6), α, β, η and λ are all weight factors; in this example, the four weight factors take the values 2.0, 3.0, 0.2 and 0.1, respectively;
Step 4.2, set the total number of training steps to K_1, with the current total step count denoted k_1;
set the numbers of optimization steps at the three inner stages to K_2, K_3 and K_4, with corresponding current step counts k_2, k_3 and k_4;
set the number of samples drawn in each training batch to B;
initialize k_1, k_2, k_3 and k_4 all to 0;
denote the learning rate by l_rate; in this example, K_1 is set to 30, K_2, K_3 and K_4 are set to 1, 3 and 1 respectively, B is set to 32, and the initial learning rate l_rate is 0.001;
Step 4.3, randomly draw B samples from the source domain data set S and from the target domain data set T respectively, and use them as the source domain and target domain training samples for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.4, optimize the source domain feature extractor E_s and the expression classifier R with formula (7) to obtain the corresponding gradients for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.5, optimize the source domain feature extractor E_s and the target domain feature extractor E_t with formula (8) to obtain the corresponding gradients for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.6, optimize the source domain feature extractor E_s and the target domain feature extractor E_t with formula (9) to obtain the corresponding gradients for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.7, assign k_2+1 to k_2, then judge whether k_2 ≥ K_2 holds; if yes, execute step 4.8, otherwise return to step 4.3 and continue in sequence;
Step 4.8, randomly draw B samples from the source domain data set S and from the target domain data set T respectively, and use them as the source domain and target domain training samples for the k_3-th inner iteration of the k_1-th outer iteration;
Step 4.9, optimize the source domain feature extractor E_s, the target domain feature extractor E_t, the source domain image generator G_s and the target domain image generator G_t with formula (10) to obtain the corresponding gradients for the k_3-th inner iteration of the k_1-th outer iteration;
Step 4.10, assign k_3+1 to k_3, then judge whether k_3 ≥ K_3 holds; if yes, execute step 4.11, otherwise return to step 4.8 and continue in sequence;
Step 4.11, randomly draw B samples from the source domain data set S and from the target domain data set T respectively, and use them as the source domain and target domain training samples for the k_4-th inner iteration of the k_1-th outer iteration;
Step 4.12, optimize the source domain feature extractor E_s and the angle classifier D_p with formula (11) to obtain the corresponding gradients for the k_4-th inner iteration of the k_1-th outer iteration;
Step 4.13, optimize the expression domain discriminator D_de and the angle domain discriminator D_dp with formula (12) to obtain the corresponding gradients for the k_4-th inner iteration of the k_1-th outer iteration;
Step 4.14, assign k_4+1 to k_4, then judge whether k_4 ≥ K_4 holds; if yes, execute step 4.15, otherwise return to step 4.11 and continue in sequence;
Step 4.15, assign k_1+1 to k_1, then make two judgments. First, if 20 < k_1 < K_1 holds, update the learning rate with a linear decay, i.e., l_rate = l_rate - γ × l_rate, where γ is a decay factor set to 0.1 in this example. Then judge whether k_1 ≥ K_1 holds or whether the algorithm has converged; if yes, training ends and the optimal facial expression recognition model is obtained for classifying facial expressions, the final facial expression recognition model being the composition of the target domain feature extractor E_t and the expression classifier R, i.e., R∘E_t, where ∘ denotes function composition; otherwise, return to step 4.3 and continue in sequence.
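A skeleton of the step-4 alternating optimization and of the final predictor R∘E_t is sketched below for illustration. The bodies of formulas (7)-(12) are not reproduced in the text, so the loss groupings per stage are assumptions that only follow the module pairs named in steps 4.4-4.13; hyperparameter values follow this example, while key and function names are illustrative:

```python
# Skeleton of the alternating iterative optimization of step 4 (loss groupings assumed)
# and the final predictor R o E_t of step 4.15.
import torch

HPARAMS = {
    "K1": 30, "K2": 1, "K3": 3, "K4": 1,   # outer steps and the three inner stage lengths
    "B": 32,                               # samples drawn per domain in each batch
    "l_rate": 1e-3,                        # initial learning rate
    "alpha": 2.0, "beta": 3.0, "eta": 0.2, "lambda_": 0.1,  # weight factors of formula (6)
    "gamma": 0.1,                          # linear learning-rate decay factor
}

def train(optimizers, losses, sample_batch, hp=HPARAMS):
    """optimizers / losses: dicts of torch optimizers and loss callables built from the
    sketches above; sample_batch(B) draws B source and B target samples."""
    l_rate = hp["l_rate"]
    for k1 in range(1, hp["K1"] + 1):
        for _ in range(hp["K2"]):                            # steps 4.3-4.7
            batch = sample_batch(hp["B"])
            optimizers["extractors"].zero_grad()
            (losses["supervised"](batch)                     # formula (7): E_s, R
             + losses["extractor_adv"](batch)).backward()    # formulas (8)-(9): E_s, E_t
            optimizers["extractors"].step()
        for _ in range(hp["K3"]):                            # steps 4.8-4.10
            batch = sample_batch(hp["B"])
            optimizers["generators"].zero_grad()
            losses["reconstruction"](batch).backward()       # formula (10): E_s, E_t, G_s, G_t
            optimizers["generators"].step()
        for _ in range(hp["K4"]):                            # steps 4.11-4.14
            batch = sample_batch(hp["B"])
            optimizers["adversaries"].zero_grad()
            (losses["decoupling"](batch)                     # formula (11): E_s, D_p
             + losses["discriminators"](batch)).backward()   # formula (12): D_de, D_dp
            optimizers["adversaries"].step()
        if 20 < k1 < hp["K1"]:                               # step 4.15: linear decay of l_rate
            l_rate -= hp["gamma"] * l_rate
            for opt in optimizers.values():
                for group in opt.param_groups:
                    group["lr"] = l_rate

@torch.no_grad()
def predict_expression(E_t, R, x):
    """Final model of step 4.15: the expression branch of E_t composed with R."""
    f_e, _ = E_t(x)
    return R(f_e.flatten(1)).argmax(dim=1)
```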
The test results of the present invention are further described in conjunction with the following tables.
To verify the contribution of each learning strategy to the final facial expression recognition performance, a comparison experiment was carried out covering the following four settings: (1) using only the supervised learning strategy; (2) combining the supervised learning and adversarial domain adaptation learning strategies; (3) combining the supervised learning, adversarial domain adaptation learning and cross-adversarial feature decoupling learning strategies; (4) using all learning strategies. The experimental results are shown in Tables 1 and 2.
Table 1: recognition rates (in %) of different learning strategies on the Multi-PIE database
Table 2: recognition rates (in %) of different learning strategies on the BU-3DFE database
As can be seen from the experimental results in Tables 1 and 2, the results improve markedly as more of the proposed learning strategies are used, and good facial expression recognition is still achieved at large deviations from the frontal view, demonstrating the effectiveness of the invention.

Claims (1)

1. An angle-robust personalized facial expression recognition method based on adversarial learning, characterized by comprising the following steps:
step 1, carrying out image preprocessing on a database containing images with N types of human face expressions:
carrying out face detection and alignment on all facial expression images in the database by using the MTCNN (multi-task cascaded convolutional networks) algorithm, so as to obtain a normalized facial image data set that is used as the sample set;
randomly dividing the sample set, with the individuals in the database as the division unit, to obtain a source domain data set S and a target domain data set T; let any sample in the source domain data set S be x_s, its expression label be y_s, and its angle label be p_s; let any sample in the target domain data set T be x_t;
Step 2, constructing an adversarial-learning-based feature decoupling and domain adaptation network model, comprising: a source domain feature extractor E_s and a target domain feature extractor E_t, an angle classifier D_p and an expression classifier R, an angle domain discriminator D_dp and an expression domain discriminator D_de, and a source domain image generator G_s and a target domain image generator G_t;
the source domain feature extractor E_s and the target domain feature extractor E_t have the same network structure, consisting in sequence of an input convolutional layer, M downsampling convolutional layers, Q residual convolutional layers, and two branches each containing W convolutional layers; each convolutional layer is followed by an instance normalization layer and a ReLU activation function;
the angle classifier D_p, the expression classifier R, the angle domain discriminator D_dp and the expression domain discriminator D_de are each an H-layer fully connected network;
the source domain image generator G_s and the target domain image generator G_t have the same network structure, consisting in sequence of an input convolutional layer, J upsampling deconvolutional layers and an output convolutional layer; each convolutional layer before the output convolutional layer is followed by an instance normalization layer and a ReLU activation function, and the output convolutional layer is followed by a Tanh activation function;
initializing the weights of all convolutional layers, deconvolutional layers and fully connected layers in the adversarial-learning-based feature decoupling and domain adaptation network model with a Gaussian distribution;
Step 3, four learning strategies of the adversarial-learning-based feature decoupling and domain adaptation network model: a supervised learning strategy, an adversarial domain adaptation learning strategy, a cross-adversarial feature decoupling learning strategy and an image reconstruction learning strategy;
Step 3.1, supervised learning strategy:
Step 3.1.1, input any sample x_s in the source domain into the source domain feature extractor E_s to obtain two feature vectors f_s^e and f_s^p, where f_s^e denotes the expression-related feature of sample x_s and f_s^p denotes the angle-related feature of sample x_s;
Step 3.1.2, input the angle-related feature f_s^p of the source domain sample x_s into the angle classifier D_p for angle recognition to obtain the angle category of sample x_s;
establish the angle recognition loss function l_p(E_s, D_p) with formula (1):
l_p(E_s, D_p) = Sup(D_p(f_s^p), p_s)   (1)
in formula (1), Sup(·) represents a supervised loss function;
Step 3.1.3, input the expression-related feature f_s^e of the source domain sample x_s into the expression classifier R for expression recognition to obtain the expression category of sample x_s;
establish the expression recognition loss function l_e(E_s, R) with formula (2):
l_e(E_s, R) = Sup(R(f_s^e), y_s)   (2)
Step 3.2, the adaptive learning strategy of the countermeasure field:
step 3.2.1, any sample x in the target domaintInputting the target domain feature extractor EtIn (1), two kinds of feature vectors are obtainedt e,ft pIn which ft eRepresenting a sample x in the target domaintExpression-related feature of ft pRepresenting a sample x in the target domaintThe angle-related characteristic of (a);
step 3.2.2 sample x in the Source DomainsAngle-dependent characteristic f ofs pOr samples x in the target domaintExpression-related feature f oft pInputting the angle domain discriminator DdpObtaining an angle-dependent feature fs pAs true or expression-related features ft pA false recognition result;
step 3.2.3 sample x in the Source DomainsExpression-related feature f ofs eOr samples x in the target domaintExpression-related feature f oft eInputting the expression domain discriminator DdeIn the method, expression related characteristics f are obtaineds eAs true or expression-related features ft eA false recognition result;
step 3.2.4, establishing a counterlearning loss function l by using the formula (3)adv(Es,Et,Ddp,Dde):
Figure FDA0002397663700000023
Step 3.3, a countermeasure characteristic decoupling learning strategy:
step 3.3.1, sample x in the Source DomainsAngle-dependent characteristic f ofs pInputting the expression into the expression classifier R to obtain a sample x in a source domainsThe expression classification result of (1);
sample x in the source domainsExpression-related feature f ofs eInput angle classifier DpObtaining a sample x in the source domainsThe angle classification result of (1);
step 3.3.2, establishing expression classifier R for angle correlation characteristic f by using formula (4)s pAnd an angle classifier DpFor expression-related feature fs eIs classified as a loss function
Figure FDA0002397663700000031
Figure FDA0002397663700000032
Step 3.4, image reconstruction learning strategy:
step 3.4.1, sample x in the Source DomainsAngle-dependent characteristic f ofs pAnd a target domainMiddle sample xtExpression-related feature f oft eCombined and input to the source domain image generator GsGenerating a reconstructed image in the source domain
Figure FDA0002397663700000033
Step 3.4.2, sample x in target DomaintAngle-dependent characteristic f oft pAnd sample x in the source domainsExpression-related feature f ofs eAre combined and input to the target domain image generator GtGenerating a reconstructed image in the target domain
Figure FDA0002397663700000034
Step 3.4.3, establishing constraint l of reconstructed image by using formula (5)clc(Es,Et,Gs,Gt):
Figure FDA0002397663700000035
In formula (5), x'sRepresents another sample in the source domain data set S and is associated with sample xsHaving the same angle label as sample xtThe same expression label is possessed; x'tRepresents another sample in the target domain data set T and is associated with sample xtHaving the same angle label as sample xsThe same expression label is possessed;
Step 4, construct the overall loss function and train the adversarial-learning-based feature decoupling and domain adaptation network model by alternating iterative optimization to obtain the optimal facial expression recognition model:
Step 4.1, construct the total objective function with formula (6), a weighted combination of the loss functions of formulas (1)-(5);
in formula (6), α, β, η and λ are all weight factors;
Step 4.2, set the total number of training steps to K_1, with the current total step count denoted k_1;
set the numbers of optimization steps at the three inner stages to K_2, K_3 and K_4, with corresponding current step counts k_2, k_3 and k_4;
set the number of samples drawn in each training batch to B;
initialize k_1, k_2, k_3 and k_4 all to 0;
Step 4.3, randomly draw B samples from the source domain data set S and from the target domain data set T respectively, and use them as the source domain and target domain training samples for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.4, optimize the source domain feature extractor E_s and the expression classifier R with formula (7) to obtain the corresponding gradients for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.5, optimize the source domain feature extractor E_s and the target domain feature extractor E_t with formula (8) to obtain the corresponding gradients for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.6, optimize the source domain feature extractor E_s and the target domain feature extractor E_t with formula (9) to obtain the corresponding gradients for the k_2-th inner iteration of the k_1-th outer iteration;
Step 4.7, assign k_2+1 to k_2, then judge whether k_2 ≥ K_2 holds; if yes, execute step 4.8, otherwise return to step 4.3 and continue in sequence;
Step 4.8, randomly draw B samples from the source domain data set S and from the target domain data set T respectively, and use them as the source domain and target domain training samples for the k_3-th inner iteration of the k_1-th outer iteration;
Step 4.9, optimize the source domain feature extractor E_s, the target domain feature extractor E_t, the source domain image generator G_s and the target domain image generator G_t with formula (10) to obtain the corresponding gradients for the k_3-th inner iteration of the k_1-th outer iteration;
Step 4.10, assign k_3+1 to k_3, then judge whether k_3 ≥ K_3 holds; if yes, execute step 4.11, otherwise return to step 4.8 and continue in sequence;
Step 4.11, randomly draw B samples from the source domain data set S and from the target domain data set T respectively, and use them as the source domain and target domain training samples for the k_4-th inner iteration of the k_1-th outer iteration;
Step 4.12, optimize the source domain feature extractor E_s and the angle classifier D_p with formula (11) to obtain the corresponding gradients for the k_4-th inner iteration of the k_1-th outer iteration;
Step 4.13, optimize the expression domain discriminator D_de and the angle domain discriminator D_dp with formula (12) to obtain the corresponding gradients for the k_4-th inner iteration of the k_1-th outer iteration;
Step 4.14, assign k_4+1 to k_4, then judge whether k_4 ≥ K_4 holds; if yes, execute step 4.15, otherwise return to step 4.11 and continue in sequence;
Step 4.15, assign k_1+1 to k_1, then judge whether k_1 ≥ K_1 holds or whether the algorithm has converged; if yes, training ends and the optimal facial expression recognition model is obtained for classifying facial expressions; otherwise, return to step 4.3 and continue in sequence.
CN202010136966.3A 2020-03-02 2020-03-02 Angle-robust personalized facial expression recognition method based on adversarial learning Active CN111382684B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010136966.3A CN111382684B (en) 2020-03-02 2020-03-02 Angle-robust personalized facial expression recognition method based on adversarial learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010136966.3A CN111382684B (en) 2020-03-02 2020-03-02 Angle-robust personalized facial expression recognition method based on adversarial learning

Publications (2)

Publication Number Publication Date
CN111382684A true CN111382684A (en) 2020-07-07
CN111382684B CN111382684B (en) 2022-09-06

Family

ID=71218531

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010136966.3A Active CN111382684B (en) 2020-03-02 2020-03-02 Angle-robust personalized facial expression recognition method based on adversarial learning

Country Status (1)

Country Link
CN (1) CN111382684B (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120288167A1 (en) * 2011-05-13 2012-11-15 Microsoft Corporation Pose-robust recognition
CN108446609A (en) * 2018-03-02 2018-08-24 南京邮电大学 A kind of multi-angle human facial expression recognition method based on generation confrontation network
CN109508669A (en) * 2018-11-09 2019-03-22 厦门大学 A kind of facial expression recognizing method based on production confrontation network
CN110188656A (en) * 2019-05-27 2019-08-30 南京邮电大学 The generation and recognition methods of multi-orientation Face facial expression image
CN110348330A (en) * 2019-06-24 2019-10-18 电子科技大学 Human face posture virtual view generation method based on VAE-ACGAN

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CAN WANG ET AL.: ""Identity- and Pose-Robust Facial Expression Recognition through Adversarial Feature Learning"", 《AFFECTIVE COMPUTING & FACIAL ANALYTICS》 *
姚乃明 等: ""基于生成式对抗网络的鲁棒人脸表情识别"", 《自动化学报》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112101241A (en) * 2020-09-17 2020-12-18 西南科技大学 Lightweight expression recognition method based on deep learning
CN112133311A (en) * 2020-09-18 2020-12-25 科大讯飞股份有限公司 Speaker recognition method, related device and readable storage medium

Also Published As

Publication number Publication date
CN111382684B (en) 2022-09-06

Similar Documents

Publication Publication Date Title
CN108615010B (en) Facial expression recognition method based on parallel convolution neural network feature map fusion
Liu et al. Connecting image denoising and high-level vision tasks via deep learning
CN108596039B (en) Bimodal emotion recognition method and system based on 3D convolutional neural network
CN107122809B (en) Neural network feature learning method based on image self-coding
CN110532900B (en) Facial expression recognition method based on U-Net and LS-CNN
CN106372581B (en) Method for constructing and training face recognition feature extraction network
CN108648191B (en) Pest image recognition method based on Bayesian width residual error neural network
CN112800903B (en) Dynamic expression recognition method and system based on space-time diagram convolutional neural network
CN111274921B (en) Method for recognizing human body behaviors by using gesture mask
CN109886881B (en) Face makeup removal method
CN110046671A (en) A kind of file classification method based on capsule network
CN108171318B (en) Convolution neural network integration method based on simulated annealing-Gaussian function
CN109344759A (en) A kind of relatives' recognition methods based on angle loss neural network
CN111814611B (en) Multi-scale face age estimation method and system embedded with high-order information
CN109002755B (en) Age estimation model construction method and estimation method based on face image
CN109783666A (en) A kind of image scene map generation method based on iteration fining
Ocquaye et al. Dual exclusive attentive transfer for unsupervised deep convolutional domain adaptation in speech emotion recognition
CN111582397A (en) CNN-RNN image emotion analysis method based on attention mechanism
CN112651940B (en) Collaborative visual saliency detection method based on dual-encoder generation type countermeasure network
CN107463954A (en) A kind of template matches recognition methods for obscuring different spectrogram picture
CN113642621A (en) Zero sample image classification method based on generation countermeasure network
CN111382684B (en) Angle robust personalized facial expression recognition method based on antagonistic learning
CN115966010A (en) Expression recognition method based on attention and multi-scale feature fusion
CN112667071A (en) Gesture recognition method, device, equipment and medium based on random variation information
Han et al. Multi-scale feature network for few-shot learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant