CN112183213B - Facial expression recognition method based on Intra-Class Gap GAN - Google Patents

Facial expression recognition method based on Intra-Class Gap GAN Download PDF

Info

Publication number
CN112183213B
CN112183213B (application CN202010905875.1A)
Authority
CN
China
Prior art keywords
output
image
facial expression
convolution
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010905875.1A
Other languages
Chinese (zh)
Other versions
CN112183213A (en)
Inventor
刘韵婷
陈亮
吴攀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenyang Ligong University
Original Assignee
Shenyang Ligong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenyang Ligong University filed Critical Shenyang Ligong University
Publication of CN112183213A publication Critical patent/CN112183213A/en
Application granted granted Critical
Publication of CN112183213B publication Critical patent/CN112183213B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174 Facial expression recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification

Abstract

A facial expression recognition method based on an Intra-Class Gap GAN. Construction of the recognition model comprises the following steps: (1) collecting real-time images of faces from different sources and with different expressions; (2) inputting the images into the Intra-Class Gap GAN neural network model for recognition; (3) outputting the recognition result. In this facial expression recognition method based on a generative adversarial network, the facial expression features are extracted automatically, in contrast to traditional methods that extract expression features manually; and compared with earlier neural-network-based facial expression recognition, the recognition rate is improved, so that expressions are recognized accurately.

Description

Facial expression recognition method based on Intra-Class Gap GAN
Technical Field
The invention relates to the fields of image processing and deep learning for facial expression recognition, in particular to a facial expression recognition method based on a generative adversarial network.
Background
China's huge floating population places great pressure on urban infrastructure and public services. Serious injury accidents have occurred frequently in recent years, the security situation draws close attention, and urban management and service systems lag seriously and urgently need improvement; strengthening urban monitoring and recognizing the facial expressions of lawbreakers has therefore become important. An expression is an emotional state conveyed by facial muscle changes. By recognizing a person's facial emotional expression, abnormal psychological states can be judged and extreme emotions inferred; observing facial expressions in complex environments provides technical support for further judging a person's psychology and for roughly identifying suspicious individuals, so that certain criminal activities can be prevented in time. Traditional facial expression recognition is mainly based on template matching and neural networks. Moreover, traditional methods require human intervention in feature selection, with feature extraction algorithms designed by hand; they lack sufficient computing power, are difficult to train, have low accuracy, and easily lose the original expression information.
Disclosure of Invention
The invention aims to:
according to the proposed intra-class gap existing in facial expression recognition under the real condition, aiming at the technical problems that the difficulty is high in complex environment security check and the requirement of the facial expression recognition rate cannot be met due to the intra-class gap, the facial expression recognition method based on the generated countermeasure is provided.
The technical scheme is as follows:
A facial expression recognition method based on an Intra-Class Gap GAN.
Construction of the recognition model comprises the following steps:
(1) Collecting real-time images of faces from different sources and with different expressions;
(2) Inputting the images into the Intra-Class Gap GAN neural network model for recognition;
(3) Outputting the recognition result;
The method for constructing the Intra-Class Gap GAN neural network model in step (2) comprises the following steps:
(2.1) collecting historical images of faces from different sources and with different expressions;
(2.2) preprocessing the collected face images to construct a facial expression data set;
(2.3) constructing the Intra-Class Gap GAN neural network model for the facial expression recognition problem in which the data set of step (2.2) has intra-class gaps;
(2.4) training the generator and the discriminator of the network simultaneously, combining the pixel differences and the latent-vector differences between the input image and the reconstructed image, to ensure that the difference between the reconstructed image and the input image is minimal.
The facial expression data set construction method in step (2.2) comprises the following steps:
S11: based on the Multi-PIE and JAFFE expression data sets, facial expression pictures downloaded from the network in step (2.1) are used to build the self-made facial expression data set required here; facial expressions of disgust, happiness, neutrality, anxiety, and surprise-and-fear from people of different countries, different age groups, different professions, and the like are selected for the experiments, and a large number of facial expression samples with intra-class gaps increase the complexity of the data set and serve as the input images x for network training;
S12: geometrically normalizing the input image, and performing face detection on the normalized image;
S13: scale-normalizing the image processed in step S12 to unify the image sizes.
The specific steps in step (2.4) are as follows:
S14: using the images processed in step S13, training the facial expression recognition network model based on the generative adversarial IC-GAN (Intra-Class Gap GAN) neural network;
S15: carrying out data enhancement and data expansion processing on the images;
S16: training the network model and storing the trained network model.
The step S12 includes the following steps:
S121: for the collected images, calibrating the feature points [x, y] of the two eyes and the nose to obtain the coordinate values of the feature points;
S122: rotating the image according to the coordinates of the eyes on the face to ensure a consistent face orientation, where the distance between the person's eyes is d and the midpoint between the two eyes is O;
S123: determining a square box containing the face according to the calibrated feature points and the geometric model, cropping a distance d to the left and right of O, and 0.5d upward and 1.5d downward respectively.
The step S13 includes the following steps:
S131: scale-normalizing the picture cropped in step S123, unifying the images to 256×256 pixels and completing the geometric normalization of the images.
The step S14 includes the following steps:
S141: constructing the proposed IC-GAN (Intra-Class Gap GAN) neural network using the PyTorch deep learning framework, first inputting the picture processed in step S13 into the first convolution layer and convolving the input image with a 4×4 convolution kernel; the output is 128×128×64; a LeakyReLU activation function then applies a nonlinear operation to the convolution output, which remains 128×128×64; the LeakyReLU activation function is:

$$f(x_i) = \begin{cases} x_i, & x_i \geq 0 \\ x_i / a_i, & x_i < 0 \end{cases}$$

where $a_i$ is a fixed parameter in the interval $(1, +\infty)$;
S142: continuing to convolve the output of the previous layer with a 4×4 convolution kernel, the output being 64×64×128; a BatchNorm layer then normalizes the output, and a LeakyReLU activation function applies a nonlinear operation, the output being 64×64×128;
S143: continuing the convolution, BatchNorm, and LeakyReLU operations of step S142 on the output of the previous layer, the output being 4×4×100;
S144: performing a 4×4-kernel deconvolution on the output of S143, giving an output of 29×1; applying a BatchNorm batch normalization, then a ReLU activation function, the output being 32×32×128; the ReLU activation function is:

$$f(x) = \max(0, x)$$

S145: performing the deconvolution, BatchNorm, and ReLU operations of step S144 again on the output of the previous layer, the output being 64×64×64;
S146: applying a ReLU activation to the output of the previous layer, then a 4×4-kernel deconvolution, then a Tanh activation, the output being 128×128; the Tanh activation function is:

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$

S147: applying the operations of S141-S143 again to the output of the previous layer, the output being 1×1×5;
S148: inputting the scale-normalized image from step S13 together with the output of step S147 into a 4×4 convolution layer, then applying the nonlinear activation function LeakyReLU, the output being 128×128×64;
S149: convolving the output of the previous layer with a 4×4 convolution kernel, applying a BatchNorm batch normalization, then a LeakyReLU nonlinear activation;
S1491: continuing to convolve the output of the previous layer following the process of S142, the output of the nonlinear operation being 4×4×1;
S1492: finally applying Softmax to the output of the previous layer and outputting the probability that the input is judged real;
S1493: applying a fully connected operation to the output of the S147 process and finally training the 5 expressions through a Softmax classifier, the 5 expressions being 1 = happy, 2 = disgust, 3 = neutral, 4 = anxious, 5 = surprise-and-fear, thereby realizing facial expression recognition.
Step S15 includes:
S151: the network loss function is divided into four parts; for the first part, the generator network, the difference between the original image and the reconstructed image is reduced at the pixel level, and the reconstruction error loss is:

$$L_{con} = \mathbb{E}_{x \sim pX} \lVert x - G(x) \rVert_1$$

where pX denotes the data distribution, x is the input image, and G(x) is the image generated by the generator in the network;
to reduce the instability of training, the feature matching method proposed by Salimans et al. is used to optimize at the image feature level; the feature matching error of the second part, the discriminator network, is:

$$L_{adv} = \mathbb{E}_{x \sim pX} \lVert f(x) - f(G(x)) \rVert_2$$

where f(·) denotes the discriminator model transformation;
the third part is the encoding loss between the latent vector z and the reconstructed latent vector ẑ, which preserves the facial expression information of the picture and prevents interference from picture-independent information during network decoding:

$$L_p = \mathbb{E}_{x \sim pX} \lVert h(x) - h(G(x)) \rVert_2$$

where h(·) denotes the encoder transformation;
the network loss of the fourth part is the cross-entropy loss of the Softmax layer:

$$L_s = k(y, \hat{y}) = -\sum_i y_i \log \hat{y}_i$$

where k(·) denotes the Softmax cross-entropy loss, y the true result, and ŷ the recognition result;
the overall network loss function is:

$$L = \omega_{adv} L_{adv} + \omega_{con} L_{con} + \omega_{p} L_{p} + \omega_{s} L_{s}$$

where ω_adv, ω_con, ω_p, and ω_s are weighting parameters that regulate the losses;
S152: Adam is selected as the optimizer, the learning rate is set to 0.0002, training samples are trained in batches of 16 pictures each, and the number of epochs is set to 100, 200, 300, and 400 respectively;
S153: in each training iteration, a batch of pictures of the epoch is fetched first, the loss value is then calculated, and the Adam optimizer continuously updates the network parameters to minimize the network loss.
In step (3), the picture is input into the trained IC-GAN network model for recognition, and the probability of each type of facial expression is finally output; the expression class with the highest output probability is the classification result. The probability calculation formula is:

$$S_i = \sum_j \omega_{ij} z_j + b, \qquad y_i = \frac{e^{S_i}}{\sum_k e^{S_k}}$$

where z_i denotes the i-th output of the network, ω_ij the j-th weight of the i-th neuron, b the bias, S_i the output of the i-th neuron, and y_i the i-th output value of the Softmax.
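As a quick numeric illustration of the Softmax formula above (the score values are made up), the class with the largest y_i is the classification result:

```python
import torch

s = torch.tensor([2.1, 0.3, -0.5, 0.8, 1.4])    # hypothetical S_i for 5 classes
y = torch.softmax(s, dim=0)                     # y_i = exp(S_i) / sum_k exp(S_k)
labels = ["happy", "disgust", "neutral", "anxious", "surprise-and-fear"]
print(labels[int(torch.argmax(y))], y)          # -> "happy", highest probability
```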
The advantages and effects are as follows:
The invention designs a facial expression recognition method based on a generative adversarial network, comprising a network training process and an offline recognition process for facial expression recognition with intra-class gaps; the offline recognition process should include the following steps:
S11: downloading through a network, parsing the video by frame skipping, and collecting an input image x;
S12: geometrically normalizing the input image x, and detecting the normalized image x';
S13: processing the detected and cropped image x' to a uniform size;
S14: constructing the network model for facial expression recognition based on the generative adversarial network;
S15: performing data enhancement and data expansion processing on the image x' and unifying the image size;
S16: training the network model and storing the trained network model;
for the identification process, the following steps should be included:
s21: downloading through a network, analyzing video by frame skipping, and collecting an input image I;
s22: then the input image I is input into the trained network model;
s23: and obtaining a recognition result.
The following steps should also be included for the step S12:
S121: performing geometric normalization processing on the input image; the geometric normalization includes scale normalization, head-tilt correction, and face-twist correction;
S122: performing face detection on the geometrically normalized image using a face detection method from the OpenCV open-source library, and then performing noise reduction on the detected image;
S123: obtaining the geometrically normalized image x'.
The step S13 should further include:
S131: determining the position of the image according to the coordinates of the face;
S132: obtaining the face image using OpenCV detection;
S133: adjusting the cropped face images to a uniform size of 256×256.
Still further, step S14 should further comprise: S141: constructing the IC-GAN neural network using the PyTorch deep learning framework, first inputting the picture into the conv_1 layer for convolution and convolving the input image with a 4×4 convolution kernel; the output is 128×128×64; a LeakyReLU activation function then applies a nonlinear operation to the convolution output, which remains 128×128×64; the LeakyReLU activation function is:

$$f(x_i) = \begin{cases} x_i, & x_i \geq 0 \\ x_i / a_i, & x_i < 0 \end{cases}$$

where $a_i$ is a fixed parameter in the interval $(1, +\infty)$;
S142: continuing to convolve the output of the previous layer with a 4×4 convolution kernel, the output being 64×64×128; a BatchNorm layer then normalizes the output, and a LeakyReLU activation function applies a nonlinear operation, the output being 64×64×128;
S143: continuing the convolution, BatchNorm, and LeakyReLU operations of S142 on the output of the previous layer, the output being 4×4×100;
S144: performing a 4×4-kernel deconvolution on the output of S143, giving an output of 29×1; applying a BatchNorm batch normalization, then a ReLU activation function, the output being 32×32×128;
S145: performing the deconvolution, BatchNorm, and ReLU operations of S144 again on the output of the previous layer, the output being 64×64×64;
S146: applying a ReLU activation to the output of the previous layer, then a 4×4-kernel deconvolution, then a Tanh activation, the output being 128×128;
S147: applying the operations of S141-S143 again to the output of the previous layer, the output being 1×1×5;
S148: inputting the original image together with the output of S147 into a 4×4 convolution layer, then applying the nonlinear activation function LeakyReLU, the output being 128×128×64;
S149: convolving the output of the previous layer with a 4×4 convolution kernel, applying a BatchNorm batch normalization, then a LeakyReLU nonlinear activation;
S1491: continuing to convolve the output of the previous layer following the process of S142, the output of the nonlinear operation being 4×4×1;
S1492: finally applying Softmax to the output of the previous layer and outputting the probability that the input is judged real.
S1493: applying a fully connected operation to the output of the S147 process and finally training the 5 expressions through a Softmax classifier, the 5 expressions being 1 = happy, 2 = disgust, 3 = neutral, 4 = anxious, 5 = surprise-and-fear, thereby realizing facial expression recognition;
Step S15 should also include: S151: according to the network structure and experimental characteristics, the network loss is divided into four parts; for the first part, the generator network, the difference between the original image and the reconstructed image is reduced at the pixel level, and the reconstruction error loss is:

$$L_{con} = \mathbb{E}_{x \sim pX} \lVert x - G(x) \rVert_1$$

to reduce the instability of training, the feature matching method proposed by Salimans et al. is used to optimize at the image feature level; the feature matching error of the second part, the discriminator network, is:

$$L_{adv} = \mathbb{E}_{x \sim pX} \lVert f(x) - f(G(x)) \rVert_2$$

where f(·) denotes the discriminator model transformation.
The third part is the encoding loss between the latent vector z and the reconstructed latent vector ẑ, which preserves the facial expression information of the picture and prevents interference from picture-independent information during network decoding:

$$L_p = \mathbb{E}_{x \sim pX} \lVert h(x) - h(G(x)) \rVert_2$$

where h(·) denotes the encoder transformation.
The fourth part of the network loss is the cross-entropy loss of the Softmax layer:

$$L_s = k(y, \hat{y}) = -\sum_i y_i \log \hat{y}_i$$

where k(·) denotes the Softmax cross-entropy loss, y the true result, and ŷ the recognition result.
The overall network loss function is:

$$L = \omega_{adv} L_{adv} + \omega_{con} L_{con} + \omega_{p} L_{p} + \omega_{s} L_{s}$$

where ω_adv, ω_con, ω_p, and ω_s are weighting parameters that regulate the losses.
S152: Adam is selected as the optimizer, the learning rate is set to 0.0002, training samples are trained in batches of 16 pictures each, and the number of epochs is set to 100, 200, 300, and 400 respectively.
S153: in each training iteration, a batch of pictures of the epoch is fetched first, the loss value is then calculated, and the Adam optimizer continuously updates the network parameters to minimize the network loss.
Still further, the step S16 should further include: S161: downloading through a network, frame-skipping and parsing the video, and collecting an input image;
S162: performing geometric normalization, face detection, OpenCV processing, and size unification on the input image;
S163: inputting the processed image into the trained IC-GAN network model for recognition and finally outputting the probability of each expression, the expression with the highest probability being taken as the expression recognized by the network.
Compared with the prior art, the invention has the advantages that:
according to the facial expression recognition method based on the generated countermeasure, the facial expression features are automatically extracted by comparing with the traditional method for manually extracting the expression features, and compared with the facial expression recognition of a neural network at a slightly early stage, the facial expression recognition method based on the generated countermeasure has the advantages that the recognition rate is improved, and therefore the expression recognition is accurately carried out.
Drawings
For a clearer description of the embodiments of the present invention or of the prior art, the drawings essential to describing the embodiments are briefly introduced below. The following drawings illustrate some embodiments of the present invention; other researchers in this field can derive further drawings from them without creative effort.
FIG. 1 is a flow chart of the overall process of the present invention.
FIG. 2 is a schematic diagram of the IC-GAN network model according to the present invention.
Detailed Description
A facial expression recognition method based on an Intra-Class Gap GAN.
Construction of the recognition model comprises the following steps:
(1) Collecting real-time images of faces from different sources and with different expressions;
(2) Inputting the images into the Intra-Class Gap GAN neural network model for recognition;
(3) Outputting the recognition result;
The method for constructing the Intra-Class Gap GAN neural network model in step (2) comprises the following steps:
(2.1) collecting historical images of faces from different sources and with different expressions;
(2.2) preprocessing the collected face images to construct a facial expression data set;
(2.3) constructing the Intra-Class Gap GAN neural network model for the facial expression recognition problem in which the data set of step (2.2) has intra-class gaps (differences within the same expression class are called intra-class gaps: expressions of the same class can take different forms, so the intra-class gap may be large, and captured images are further affected by external factors such as occlusions in the environment and shooting angles; for these reasons an expression such as laughing may be misrecognized as another class, and the feature differences caused by the complicated surrounding environment ultimately affect the recognition accuracy);
(2.4) training the generator and the discriminator of the network simultaneously, combining the pixel differences and the latent-vector differences between the input image (the training sample input during network training) and the reconstructed image (the image generated during training, used to match the original image), to ensure that the difference between the reconstructed image and the input image is minimal. (The network is trained by comparing the original input picture with the picture generated by the network; when the generated picture is consistent with the input picture, the network is considered trained, able to extract image features correctly, and strongest in recognition.)
The facial expression data set construction method in step (2.2) comprises the following steps:
S11: based on the Multi-PIE and JAFFE expression data sets, facial expression images downloaded from the network in step (2.1) are used to build the self-made facial expression data set required here (sample expansion); facial expressions of disgust, happiness, neutrality, anxiety, and surprise-and-fear from people of different countries, different age groups, different professions, and the like are selected for the experiments, and a large number of samples with intra-class gaps are included. (The forms one expression, such as a smile, takes for the same person under the same background environment belong to one class; whenever the background, the person, or the form of the expression differs, an intra-class gap, and possibly a large one, exists.) These samples increase the complexity of the data and serve as the input images for network training;
S12: geometrically normalizing the input image, and performing face detection on the normalized image (obtaining a suitable face image: processing as described in claim 3 yields sample data suitable for network training; for example, rotation may be required to ensure a consistent face orientation);
S13: scale-normalizing the image processed in step S12 to unify the image sizes (S12 and S13 are the preprocessing stages).
The specific steps in step (2.4) are as follows:
S14: using the images processed in step S13, training the facial expression recognition network model based on the generative adversarial IC-GAN (Intra-Class Gap GAN) neural network;
S15: carrying out data enhancement and data expansion processing on the images;
S16: training the network model and storing the trained network model.
The step S12 includes the following steps:
S121: for the collected images, calibrating the feature points [x, y] of the two eyes and the nose to obtain the coordinate values of the feature points;
S122: rotating the image according to the coordinates of the eyes on the face to ensure a consistent face orientation (reflecting, in the face-image preprocessing, the rotation invariance of the face in the image plane), where the distance between the person's eyes is d and the midpoint between the two eyes is O;
S123: determining a square box containing the face according to the calibrated feature points and the geometric model, cropping a distance d to the left and right of O, and 0.5d upward and 1.5d downward respectively.
The step S13 includes the following steps:
S131: scale-normalizing the picture cropped in step S123, unifying the images to 256×256 pixels and completing the geometric normalization of the images.
The step S14 includes the following steps:
S141: constructing the proposed IC-GAN (Intra-Class Gap GAN) neural network using the PyTorch deep learning framework, first inputting the picture processed in step S13 into the first convolution layer and convolving the input image with a 4×4 convolution kernel; the output is 128×128×64; a LeakyReLU activation function then applies a nonlinear operation to the convolution output, which remains 128×128×64; the LeakyReLU activation function is:

$$f(x_i) = \begin{cases} x_i, & x_i \geq 0 \\ x_i / a_i, & x_i < 0 \end{cases}$$

where $a_i$ is a fixed parameter in the interval $(1, +\infty)$;
S142: continuing to convolve the output of the previous layer (the first convolution layer) with a 4×4 convolution kernel, the output being 64×64×128; a BatchNorm layer then normalizes the output, and a LeakyReLU activation function applies a nonlinear operation, the output being 64×64×128;
S143: continuing the convolution, BatchNorm, and LeakyReLU operations of step S142 on the output of the previous layer, the output being 4×4×100;
S144: performing a 4×4-kernel deconvolution on the output of S143, giving an output of 29×1; applying a BatchNorm batch normalization, then a ReLU activation function, the output being 32×32×128; the ReLU activation function is:

$$f(x) = \max(0, x)$$

S145: performing the deconvolution, BatchNorm, and ReLU operations of step S144 again on the output of the previous layer, the output being 64×64×64;
S146: applying a ReLU activation to the output of the previous layer, then a 4×4-kernel deconvolution, then a Tanh activation, the output being 128×128; the Tanh activation function is:

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$

S147: applying the operations of S141-S143 again to the output of the previous layer, the output being 1×1×5;
S148: inputting the scale-normalized image from step S13 together with the output of step S147 into a 4×4 convolution layer, then applying the nonlinear activation function LeakyReLU, the output being 128×128×64;
S149: convolving the output of the previous layer with a 4×4 convolution kernel, applying a BatchNorm batch normalization, then a LeakyReLU nonlinear activation;
S1491: continuing to convolve the output of the previous layer following the process of S142, the output of the nonlinear operation being 4×4×1;
S1492: finally applying Softmax to the output of the previous layer and outputting the probability that the input is judged real.
S1493: applying a fully connected operation to the output of the S147 process and finally training the 5 expressions through a Softmax classifier, the 5 expressions being 1 = happy, 2 = disgust, 3 = neutral, 4 = anxious, 5 = surprise-and-fear, thereby realizing facial expression recognition.
Step S15 includes:
S151: according to the constructed IC-GAN network structure, the network loss function is divided into four parts; for the first part, the generator network, the difference between the original image and the reconstructed image is reduced at the pixel level, and the reconstruction error loss is:

$$L_{con} = \mathbb{E}_{x \sim pX} \lVert x - G(x) \rVert_1$$

where pX denotes the data distribution, x is the input image, and G(x) is the image generated by the generator in the network;
to reduce the instability of training, the feature matching method proposed by Salimans et al. is used to optimize at the image feature level; the feature matching error of the second part, the discriminator network, is:

$$L_{adv} = \mathbb{E}_{x \sim pX} \lVert f(x) - f(G(x)) \rVert_2$$

where f(·) denotes the discriminator model transformation.
The third part is the encoding loss between the latent vector z and the reconstructed latent vector ẑ, which preserves the facial expression information of the picture and prevents interference from picture-independent information during network decoding:

$$L_p = \mathbb{E}_{x \sim pX} \lVert h(x) - h(G(x)) \rVert_2$$

where h(·) denotes the encoder transformation.
The fourth part of the network loss is the cross-entropy loss of the Softmax layer:

$$L_s = k(y, \hat{y}) = -\sum_i y_i \log \hat{y}_i$$

where k(·) denotes the Softmax cross-entropy loss, y the true result, and ŷ the recognition result.
The overall network loss function is:

$$L = \omega_{adv} L_{adv} + \omega_{con} L_{con} + \omega_{p} L_{p} + \omega_{s} L_{s}$$

where ω_adv, ω_con, ω_p, and ω_s are weighting parameters that regulate the losses.
S152: Adam is selected as the optimizer, the learning rate is set to 0.0002, training samples are trained in batches of 16 pictures each, and the number of epochs is set to 100, 200, 300, and 400 respectively.
S153: in each training iteration, a batch of pictures of the epoch is fetched first, the loss value is then calculated, and the Adam optimizer continuously updates the network parameters to minimize the network loss.
In step (3), the picture is input into the trained IC-GAN network model for recognition, the probability of each type of facial expression is finally output, and the expression category with the highest output probability is the classification result.
In order to enable researchers in this field to understand the solution of the present invention more clearly, the solution is described below in detail and in full, by way of example only, with reference to the accompanying drawings of the embodiments of the present invention. All other embodiments obtained by researchers in this field without creative effort, being based on the embodiments of the present invention, are intended to fall within the scope of the invention.
It should be noted that the terms "first", "second", and the like in the specification and the claims of the present invention are used to distinguish similar objects and are not intended to describe a particular sequence or order. Data so used are interchangeable where appropriate, so that the implementations described here can also be carried out in sequences other than those illustrated. In addition, the terms "comprising" and "having" and their variants are intended to cover non-exclusive inclusion, so that the processes, methods, products, and apparatus described are not limited to the steps expressly listed and may include other steps inherent to them.
As shown in FIG. 1 and FIG. 2, the present invention provides a facial expression recognition method based on a generative adversarial network, which includes a network training process and an offline recognition process for facial expression recognition with intra-class gaps.
As an embodiment, the offline recognition process should include the following steps:
Step S11: downloading through a network, parsing the video by frame skipping, and collecting an input image x;
Step S12: geometrically normalizing the input image x, and detecting the normalized image x';
Step S13: processing the detected and cropped image x' to a uniform size;
Step S14: constructing the network model for facial expression recognition based on the generative adversarial network;
Step S15: performing data enhancement and data expansion processing on the image x' and unifying the image size;
Step S16: training the network model and storing the trained network model;
in a specific embodiment, step S12 should further comprise the steps of:
step S121: performing geometric normalization processing on the input image; the geometric normalization method comprises scale normalization, outer head correction and face twisting correction;
step S122: performing face detection on the geometrically normalized image by using a face detection method in an OpenCV open source library, and then performing noise reduction treatment on the detected image;
s23: the geometrically normalized image x' is obtained.
As a preferred embodiment, step S13 further comprises:
S131: determining the position of the image according to the coordinates of the face;
S132: obtaining the face image using OpenCV detection;
S133: adjusting the cropped face images to a uniform size of 256×256.
Still further, step S14 should further comprise: S141: constructing the IC-GAN neural network using the PyTorch deep learning framework, first inputting the picture into the conv_1 layer for convolution and convolving the input image with a 4×4 convolution kernel; the output is 128×128×64; a LeakyReLU activation function then applies a nonlinear operation to the convolution output, which remains 128×128×64; the LeakyReLU activation function is:

$$f(x_i) = \begin{cases} x_i, & x_i \geq 0 \\ x_i / a_i, & x_i < 0 \end{cases}$$

where $a_i$ is a fixed parameter in the interval $(1, +\infty)$;
S142: continuing to convolve the output of the previous layer with a 4×4 convolution kernel, the output being 64×64×128; a BatchNorm layer then normalizes the output, and a LeakyReLU activation function applies a nonlinear operation, the output being 64×64×128;
S143: continuing the convolution, BatchNorm, and LeakyReLU operations of S142 on the output of the previous layer, the output being 4×4×100;
S144: performing a 4×4-kernel deconvolution on the output of S143, giving an output of 29×1; applying a BatchNorm batch normalization, then a ReLU activation function, the output being 32×32×128;
S145: performing the deconvolution, BatchNorm, and ReLU operations of S144 again on the output of the previous layer, the output being 64×64×64;
S146: applying a ReLU activation to the output of the previous layer, then a 4×4-kernel deconvolution, then a Tanh activation, the output being 128×128;
S147: applying the operations of S141-S143 again to the output of the previous layer, the output being 1×1×5;
S148: inputting the original image together with the output of S147 into a 4×4 convolution layer, then applying the nonlinear activation function LeakyReLU, the output being 128×128×64;
S149: convolving the output of the previous layer with a 4×4 convolution kernel, applying a BatchNorm batch normalization, then a LeakyReLU nonlinear activation;
S1491: continuing to convolve the output of the previous layer following the process of S142, the output of the nonlinear operation being 4×4×1;
S1492: finally applying Softmax to the output of the previous layer and outputting the probability that the input is judged real.
S1493: applying a fully connected operation to the output of the S147 process and finally training the 5 expressions through a Softmax classifier, the 5 expressions being 1 = happy, 2 = disgust, 3 = neutral, 4 = anxious, 5 = surprise-and-fear, thereby realizing facial expression recognition;
as a preferred embodiment, the IC-GAN network uses a pytorch build network including an input layer, a convolution layer, an activation function, a pooling layer, a full connection layer, a BN layer, and an output layer.
As a preferred embodiment, the sizes before and after a convolution layer can be described by the following formulas:
The input size of the convolution layer is: $W_1 \times H_1 \times D_1$
The output size of the convolution layer is:

$$W_2 = \frac{W_1 - F + 2P}{S} + 1, \qquad H_2 = \frac{H_1 - F + 2P}{S} + 1, \qquad D_2 = K$$

In the above formulas, K is the number of convolution kernels, F is the size of the convolution kernels, S is the stride, and P is the boundary padding.
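For example, the first layer described in S141 maps a 256×256 input to 128×128×64 with F = 4 and K = 64 kernels; assuming a stride of S = 2 and a padding of P = 1 (values consistent with the halving, though not stated in the patent):

$$W_2 = \frac{256 - 4 + 2 \cdot 1}{2} + 1 = 128, \qquad D_2 = K = 64$$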
As a preferred embodiment, the mixed expression data set of the present application has a total of 4455 images and 5 expression labels, 1 = happy, 2 = disgust, 3 = neutral, 4 = anxious, 5 = surprise-and-fear; the data set suffers from an unbalanced sample distribution, so it is expanded by means of affine transformation, mirror transformation, contrast adjustment, brightness adjustment, and the like; the number of images in the expanded mixed expression data set is shown in Table 1:
Table 1. Number of each expression in the mixed data set after expansion
As a most preferred method of the present application, step S15 should further comprise: S151: according to the network structure and experimental characteristics, defining the network loss in four parts;
S152: Adam is selected as the optimizer, the learning rate is set to 0.0002, training samples are trained in batches of 16 pictures each, and the number of epochs is set to 100, 200, 300, and 400 respectively;
S153: in each training iteration, a batch of pictures of the epoch is fetched first, the loss value is then calculated, and the Adam optimizer continuously updates the network parameters to minimize the network loss.
Still further, step S16 should further include: S161: downloading through a network, frame-skipping and parsing the video, and collecting an input image;
S162: performing geometric normalization, face detection, OpenCV processing, and size unification on the input image;
S163: inputting the processed image into the trained IC-GAN network model for recognition and finally outputting the probability of each expression, the expression with the highest probability being the expression recognized by the network.
Compared with the prior art, the invention has the advantages that:
according to the facial expression recognition method based on the generated countermeasure, the facial expression features are automatically extracted by comparing with the traditional method for manually extracting the expression features, and compared with the facial expression recognition of a neural network at a slightly early stage, the facial expression recognition method based on the generated countermeasure has the advantages that the recognition rate is improved, and therefore the expression recognition is accurately carried out.
As an embodiment of the application, the number of samples after data enhancement is 4455 training samples and 411 test samples. The idea of model training is as follows: before the pictures are input into the network for training, the images are first cropped using OpenCV open-source code and then unified to a 256×256 size, and the preprocessed pictures are then used as network input to train the IC-GAN network model. The Softmax loss adopts the cross-entropy loss function, the optimizer adopts the Adam optimizer, the learning rate is set to 0.0002, training samples are trained in batches of 16 pictures each, and the number of epochs is set to 100, 200, 300, and 400 respectively.
As a preferred embodiment of the present application, the identification process should comprise the steps of:
s21: downloading through a network, analyzing video by frame skipping, and collecting an input image I;
s22: then the input image I is input into the trained network model;
s23: and obtaining a recognition result.
The above embodiment numbers of the present invention are for description only and do not represent the merits of the embodiments.
In the embodiments of the present invention, each embodiment is described with its own emphasis; for portions not described in detail in one embodiment, reference may be made to the corresponding descriptions in other embodiments.
In the several embodiments provided in this application, the described technical content may also be implemented in other manners. All of the above description is merely illustrative.

Claims (5)

1. A facial expression recognition method based on an Intra-Class Gap GAN, characterized by comprising the following steps:
the identification model construction comprises the following steps:
(1) Collecting real-time images of faces from different sources and with different expressions;
(2) Inputting the images into the Intra-Class Gap GAN neural network model for recognition;
(3) Outputting the recognition result;
The method for constructing the Intra-Class Gap GAN neural network model in step (2) comprises the following steps:
(2.1) collecting historical images of faces from different sources and with different expressions;
(2.2) preprocessing the collected face images to construct a facial expression data set;
(2.3) constructing the Intra-Class Gap GAN neural network model for the facial expression recognition problem in which the data set of step (2.2) has intra-class gaps;
(2.4) training the generator and the discriminator of the network simultaneously, combining the pixel differences and the latent-vector differences between the input image and the reconstructed image, ensuring that the difference between the reconstructed image and the input image is minimal;
the method comprises the following steps:
S14: using the images processed in step S13, training the facial expression recognition network model based on the generative adversarial IC-GAN neural network;
the method comprises the following steps:
S141: constructing the proposed IC-GAN (Intra-Class Gap GAN) neural network using the PyTorch deep learning framework, first inputting the picture processed in step S13 into the first convolution layer and convolving the input image with a 4×4 convolution kernel; the output is 128×128×64; a LeakyReLU activation function then applies a nonlinear operation to the convolution output, which remains 128×128×64; the LeakyReLU activation function is:

$$f(x_i) = \begin{cases} x_i, & x_i \geq 0 \\ x_i / a_i, & x_i < 0 \end{cases}$$

where $a_i$ is a fixed parameter in the interval $(1, +\infty)$;
S142: continuing to convolve the output of the previous layer with a 4×4 convolution kernel, the output being 64×64×128; a BatchNorm layer then normalizes the output, and a LeakyReLU activation function applies a nonlinear operation, the output being 64×64×128;
S143: continuing the convolution, BatchNorm, and LeakyReLU operations of step S142 on the output of the previous layer, the output being 4×4×100;
S144: performing a 4×4-kernel deconvolution on the output of S143, giving an output of 29×1; applying a BatchNorm batch normalization, then a ReLU activation function, the output being 32×32×128; the ReLU activation function is:

$$f(x) = \max(0, x)$$

S145: performing the deconvolution, BatchNorm, and ReLU operations of step S144 again on the output of the previous layer, the output being 64×64×64;
S146: applying a ReLU activation to the output of the previous layer, then a 4×4-kernel deconvolution, then a Tanh activation, the output being 128×128; the Tanh activation function is:

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$

S147: applying the operations of S141-S143 again to the output of the previous layer, the output being 1×1×5;
S148: inputting the scale-normalized image from step S13 together with the output of step S147 into a 4×4 convolution layer, then applying the nonlinear activation function LeakyReLU, the output being 128×128×64;
S149: convolving the output of the previous layer with a 4×4 convolution kernel, applying a BatchNorm batch normalization, then a LeakyReLU nonlinear activation;
S1491: continuing to convolve the output of the previous layer following the process of S142, the output of the nonlinear operation being 4×4×1;
S1492: finally applying Softmax to the output of the previous layer and outputting the probability that the input is judged real;
S1493: applying a fully connected operation to the output of the S147 process and finally training the 5 expressions through a Softmax classifier, the 5 expressions being 1 = happy, 2 = disgust, 3 = neutral, 4 = anxious, 5 = surprise-and-fear, thereby realizing facial expression recognition;
S15: carrying out data enhancement and data expansion processing on the image;
comprising:
S151: the network loss function is divided into four parts; for the first part, the generator network, the difference between the original image and the reconstructed image is reduced at the pixel level, and the reconstruction error loss is:

$$L_{con} = \mathbb{E}_{x \sim pX} \lVert x - G(x) \rVert_1$$

where pX denotes the data distribution, x is the input image, and G(x) is the image generated by the generator in the network;
to reduce the instability of training, the feature matching method proposed by Salimans et al. is used to optimize at the image feature level; the feature matching error of the second part, the discriminator network, is:

$$L_{adv} = \mathbb{E}_{x \sim pX} \lVert f(x) - f(G(x)) \rVert_2$$

where f(·) denotes the discriminator model transformation;
the third part is the encoding loss between the latent vector z and the reconstructed latent vector ẑ, which preserves the facial expression information of the picture and prevents interference from picture-independent information during network decoding:

$$L_p = \mathbb{E}_{x \sim pX} \lVert h(x) - h(G(x)) \rVert_2$$

where h(·) denotes the encoder transformation;
the network loss of the fourth part is the cross-entropy loss of the Softmax layer:

$$L_s = k(y, \hat{y}) = -\sum_i y_i \log \hat{y}_i$$

where k(·) denotes the Softmax cross-entropy loss, y the true result, and ŷ the recognition result;
the overall network loss function is:

$$L = \omega_{adv} L_{adv} + \omega_{con} L_{con} + \omega_{p} L_{p} + \omega_{s} L_{s}$$

where ω_adv, ω_con, ω_p, and ω_s are weighting parameters that regulate the losses;
S152: Adam is selected as the optimizer, the learning rate is set to 0.0002, training samples are trained in batches of 16 pictures each, and the number of epochs is set to 100, 200, 300, and 400 respectively;
S153: in each training iteration, a batch of pictures of the epoch is fetched first, the loss value is then calculated, and the Adam optimizer continuously updates the network parameters to minimize the network loss;
S16: training the network model and storing the trained network model.
2. The facial expression recognition method based on an Intra-Class Gap GAN according to claim 1, wherein:
the facial expression data set construction method in the step (2.2) comprises the following steps:
s11: based on Multi-PIE and JAFFE expression data sets, facial expression pictures are downloaded on the network through the step (2.1), the facial expression data sets required by homemade are carried out, abomination, happy, neutral, anxious and surprise and fear facial expressions of different countries, different age groups and different professional groups are selected for experiments, and a large number of facial expression characteristics with intra-class gaps are added as the complexity of the data sets to be used as input images x of network training;
s12: geometric normalization processing the input image, and performing face detection on the normalized image;
s13: the scale normalizes the image after the processing in step S12, unifying the size of the image.
3. The facial expression recognition method based on an Intra-Class Gap GAN according to claim 2, wherein: the step S12 includes the following steps:
S121: for the collected images, calibrating the feature points [x, y] of the two eyes and the nose to obtain the coordinate values of the feature points;
S122: rotating the image according to the coordinates of the eyes on the face to ensure a consistent face orientation, where the distance between the person's eyes is d and the midpoint between the two eyes is O;
S123: determining a square box containing the face according to the calibrated feature points and the geometric model, cropping a distance d to the left and right of O, and 0.5d upward and 1.5d downward respectively.
4. The facial expression recognition method based on an Intra-Class Gap GAN according to claim 2, wherein:
the step S13 includes the following steps:
S131: scale-normalizing the picture cropped in step S123, unifying the images to 256×256 pixels and completing the geometric normalization of the images.
5. The facial expression recognition method based on an Intra-Class Gap GAN according to claim 1, wherein:
in step (3), the picture is input into the trained IC-GAN network model for recognition, the probability of each type of facial expression is finally output, and the expression class with the highest output probability is the classification result.
CN202010905875.1A 2019-09-02 2020-09-01 Facial expression recognition method based on Intra-Class Gap GAN Active CN112183213B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910822252 2019-09-02
CN2019108222525 2019-09-02

Publications (2)

Publication Number Publication Date
CN112183213A CN112183213A (en) 2021-01-05
CN112183213B true CN112183213B (en) 2024-02-02

Family

ID=73924606

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010905875.1A Active CN112183213B (en) 2019-09-02 2020-09-01 Facial expression recognition method based on Intra-Class Gap GAN

Country Status (1)

Country Link
CN (1) CN112183213B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113688799B (en) * 2021-09-30 2022-10-04 合肥工业大学 Facial expression recognition method for generating confrontation network based on improved deep convolution

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106778506A (en) * 2016-11-24 2017-05-31 重庆邮电大学 A kind of expression recognition method for merging depth image and multi-channel feature
WO2018054283A1 (en) * 2016-09-23 2018-03-29 北京眼神科技有限公司 Face model training method and device, and face authentication method and device
CN108304826A (en) * 2018-03-01 2018-07-20 河海大学 Facial expression recognizing method based on convolutional neural networks
CN108615010A (en) * 2018-04-24 2018-10-02 重庆邮电大学 Facial expression recognizing method based on the fusion of parallel convolutional neural networks characteristic pattern
CN109376625A (en) * 2018-10-10 2019-02-22 东北大学 A kind of human facial expression recognition method based on convolutional neural networks

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10354159B2 (en) * 2016-09-06 2019-07-16 Carnegie Mellon University Methods and software for detecting objects in an image using a contextual multiscale fast region-based convolutional neural network

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018054283A1 (en) * 2016-09-23 2018-03-29 北京眼神科技有限公司 Face model training method and device, and face authentication method and device
CN106778506A (en) * 2016-11-24 2017-05-31 重庆邮电大学 A kind of expression recognition method for merging depth image and multi-channel feature
CN108304826A (en) * 2018-03-01 2018-07-20 河海大学 Facial expression recognizing method based on convolutional neural networks
CN108615010A (en) * 2018-04-24 2018-10-02 重庆邮电大学 Facial expression recognizing method based on the fusion of parallel convolutional neural networks characteristic pattern
CN109376625A (en) * 2018-10-10 2019-02-22 东北大学 A kind of human facial expression recognition method based on convolutional neural networks

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Facial expression recognition based on CMAC neural network; Ye Fangfang; Xu Li; Computer Simulation (08); full text *
Facial expression recognition method based on constrained cycle-consistent generative adversarial network; Hu Min; Yu Shengnan; Wang Xiaohua; Journal of Electronic Measurement and Instrumentation (04); full text *

Also Published As

Publication number Publication date
CN112183213A (en) 2021-01-05

Similar Documents

Publication Publication Date Title
CN108932536B (en) Face posture reconstruction method based on deep neural network
Christa et al. CNN-based mask detection system using openCV and MobileNetV2
KR20210025020A (en) Face image recognition using pseudo images
Kim et al. Kernel principal component analysis for texture classification
Kantarcı et al. Thermal to visible face recognition using deep autoencoders
CN108875645B (en) Face recognition method under complex illumination condition of underground coal mine
CN113486752A (en) Emotion identification method and system based on electrocardiosignals
CN112183213B (en) Facial expression recognition method based on Intril-Class Gap GAN
WO2014158345A1 (en) Methods and systems for vessel bifurcation detection
CN113221660B (en) Cross-age face recognition method based on feature fusion
EP4238073A1 (en) Human characteristic normalization with an autoencoder
Fahmy et al. Toward an automated dental identification system
Silva et al. POEM-based facial expression recognition, a new approach
Andayani et al. Identification of the tuberculosis (TB) disease based on XRay images using probabilistic neural network (PNN)
Zabihi et al. Vessel extraction of conjunctival images using LBPs and ANFIS
Nainwal et al. Convolution neural network based covid-19 screening model
Blackledge et al. Texture classification using fractal geometry for the diagnosis of skin cancers
CN115116117A (en) Learning input data acquisition method based on multi-mode fusion network
Jain et al. Brain Tumor Detection using MLops and Hybrid Multi-Cloud
Depuru et al. Hybrid CNNLBP using facial emotion recognition based on deep learning approach
CN114049668B (en) Face recognition method
Amelia Age Estimation on Human Face Image Using Support Vector Regression and Texture-Based Features
Mahmood et al. An investigational FW-MPM-LSTM approach for face recognition using defective data
Praneel et al. Malayalam Sign Language Character Recognition System
CN113269145B (en) Training method, device, equipment and storage medium of expression recognition model

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant