CN114495211A

CN114495211A - Micro-expression identification method, system and computer medium based on graph convolution network

Info

Publication number: CN114495211A
Application number: CN202210015324.7A
Authority: CN
Inventors: 司家鑫; 周华毅; 吴洺
Original assignee: Chongqing Research Institute Of Shanghai Jiaotong University
Current assignee: Chongqing Research Institute Of Shanghai Jiaotong University
Priority date: 2022-01-07
Filing date: 2022-01-07
Publication date: 2022-05-13

Abstract

The invention belongs to the technical field of expression recognition and discloses a micro-expression recognition method, system and computer medium based on a graph convolution network. The scheme combines an activation-region-based multi-scale feature extraction algorithm with two graph convolution networks, which model the correlation between action unit (AU) features and the correlation between AU classifiers respectively, thereby ensuring the accuracy of AU recognition.

Description

Micro-expression identification method, system and computer medium based on graph convolution network

Technical Field

The invention belongs to the technical field of expression recognition, and relates to a micro expression recognition method, a system and a computer medium based on a graph convolution network.

Background

Facial expressions, especially micro-expressions, reveal a person's true feelings and play an important role in interpersonal communication, mental state diagnosis, improvement of education quality, and the like.

There are two types of facial expression recognition methods: machine learning-based methods and deep learning-based methods. Machine learning-based methods are easily affected by the quality of feature extraction, whereas deep learning-based methods combine feature extraction and expression classification and thereby overcome this shortcoming. Among deep learning-based methods, convolutional neural networks are widely used to build expression classification models. Matsugu et al. (Matsugu M, Mori K, Mitari Y, et al. Subject independent facial expression recognition with robust face detection using a convolutional neural network [J]. Neural Networks, 2003, 16(5-6): 555-.

Because human facial expressions are formed by movements of the facial muscles, the prior art uses facial action units (AUs) to describe changes in facial micro-expressions more precisely and thereby reveal a person's true feelings. In one approach, a center point is designed for each AU and local features are extracted manually around that point; however, the activation regions of different AUs differ, and representing them with features of a single scale fails to capture the differences in AU scale. In another approach, an AU recognition algorithm is combined with a graph convolution network and the local AU features are updated through that network; this considers only the correlation between AU feature representations and ignores the correlation between the classifiers that finally recognize the AUs, which degrades AU recognition.

Disclosure of Invention

The invention aims to provide a micro expression recognition method, a system and a computer medium based on a graph convolution network, which improve the recognition effect of human face micro expression.

In order to achieve the purpose, the basic scheme of the invention is as follows: a face micro-expression recognition method based on a graph convolution network comprises the following steps:

extracting multiscale local features of AUs according to the corresponding relation between each AU and a specific face region;

an AU correlation matrix is constructed by using AU multi-scale local features;

based on the AU correlation matrix, two-layer graph convolution networks are constructed, wherein one graph convolution network is used for updating the characteristic representation of each AU, and the other graph convolution network is used for updating an AU classifier;

training an AU classifier by adopting a multi-label classification loss function;

and obtaining a final AU classifier for face micro-expression recognition.

The working principle and beneficial effects of the basic scheme are as follows: the scheme provides a feature representation algorithm that effectively captures the multi-scale and local features of AUs, which improves the feature expression capability for AU recognition and lays a foundation for better AU recognition. The graph neural network models the correlation between AUs and transfers this prior knowledge into the feature representation, improving the efficiency of AU recognition. The classifier likewise models the correlation between AUs, improving the final AU recognition result.

Further, the method for extracting AU multi-scale local features is as follows:

predefining a multi-scale window by using a multi-scale local region feature extraction algorithm according to the size of an AU activation region, and mapping the multi-scale window into a final feature layer extracted by a convolutional neural network to obtain AU multi-scale local features;

generating a feature with a fixed size by adopting RoIAlign, and unifying the sizes of AU features in different areas;

and fusing the paired AU features of uniform size through an average pooling operation, then reducing the dimension through a fully-connected mapping to obtain 64-dimensional AU features.

Features are extracted in a targeted manner according to the differences in the face regions activated by each AU, which facilitates subsequent recognition and improves accuracy.
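The window-mapping step can be illustrated with a minimal sketch. It assumes a 224x224 input image and a 7x7 final feature map (the size the embodiment reports for ResNet-34), so that mapping is a uniform scaling by 7/224. The function name and the window coordinates are hypothetical and are not taken from the patent.

```python
def map_window_to_feature_map(window, image_size=224, feature_size=7):
    """Scale an (x1, y1, x2, y2) window from image pixels to feature-map units."""
    # Assumption: the backbone downsamples uniformly, so one scale factor suffices.
    scale = feature_size / image_size
    return tuple(coord * scale for coord in window)

# Two hypothetical windows of different scale around the same AU centre point.
small = map_window_to_feature_map((96, 64, 128, 96))
large = map_window_to_feature_map((80, 48, 144, 112))
```

The mapped windows have fractional coordinates in general, which is why a sampling operator such as RoIAlign is then needed to produce features of a fixed size.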

Further, the method for constructing the AU correlation matrix comprises the following steps:

defining a correlation matrix A between AUs by using the simultaneous occurrence frequency and the total occurrence frequency between AUs:

$$A_{i,j} = Q_{i,j} / N_i,$$

wherein $A_{i,j}$ is the entry of the correlation matrix for the $i$-th AU and the $j$-th AU appearing simultaneously in a face image, $Q_{i,j}$ is the frequency with which the $i$-th AU and the $j$-th AU occur simultaneously, and $N_i$ is the frequency of occurrence of the $i$-th AU.

To model the correlation between AUs with the graph convolution network, an AU correlation matrix must first be defined for use in the subsequent operations.
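As a minimal sketch of this definition, the matrix can be computed from a binary annotation matrix in which each row is a face image and each column an AU. The function name, the label layout and the toy data are assumptions made for illustration only.

```python
import numpy as np

def au_correlation_matrix(labels: np.ndarray) -> np.ndarray:
    """labels: (num_images, num_aus) binary matrix, 1 where an AU is active."""
    labels = labels.astype(np.float64)
    # Q[i, j]: number of images in which AU i and AU j are both active.
    Q = labels.T @ labels
    # N[i]: number of images in which AU i is active.
    N = labels.sum(axis=0)
    # A[i, j] = Q[i, j] / N[i]; rows of AUs that never occur are left at zero.
    return np.divide(Q, N[:, None], out=np.zeros_like(Q), where=N[:, None] > 0)

# Toy annotations (hypothetical): 4 images, 3 AUs.
toy = np.array([[1, 1, 0],
                [1, 0, 0],
                [1, 1, 1],
                [0, 1, 0]])
A = au_correlation_matrix(toy)
```

Counts are used here in place of frequencies; the ratio is the same because both would be divided by the same number of images.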

Further, the graph convolution network layer for updating AU feature representation is:

wherein $H^{(l)}$ is the node representation of the $l$-th layer, $H^{(l+1)}$ is the node representation of the $(l+1)$-th layer, and $D$ is the degree matrix of the correlation matrix $A$, defined as $D_{ii} = \sum_j (A + I)_{ij}$. Because $A$ is a square matrix, the input and output dimensions of the nodes are unchanged; $\sigma$ is the activation function LeakyReLU, $I$ is the identity matrix, and $W$ is a parameter of the neural network.

An AU correlation propagation algorithm based on a graph convolution network not only improves the AU feature expression capability, but also improves the capability of an AU classifier.
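The layer formula itself is not reproduced in this text. One form consistent with the symbols defined above is the widely used graph convolution propagation rule with symmetric normalization. Whether the patent uses exactly this normalization (rather than, for example, $D^{-1}(A+I)$) is an assumption.

```latex
H^{(l+1)} = \sigma\!\left( D^{-1/2}\,(A + I)\,D^{-1/2}\, H^{(l)}\, W^{(l)} \right),
\qquad
D_{ii} = \sum_j (A + I)_{ij}
```

Under this reading, adding $I$ lets each AU keep its own representation while aggregating those of correlated AUs, and the normalization by $D$ keeps the scale of the aggregated features comparable across AUs with many or few strong correlations.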

Further, when AU correlation is propagated through the AU feature representation, the local multi-scale features extracted for each AU are used as input;

when AU correlation is propagated through the AU classifiers, the pre-trained classifiers are used as input, and the AU classifiers are then iteratively updated by the graph convolution network.

The AU classifier based on the graph convolution network transfers the correlation of AUs to the classifier, thereby directly improving the recognition effect of AUs.
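The two propagation paths can be sketched as follows. This is an illustration under several assumptions that are not stated in the patent: a symmetric-normalized graph convolution step, 12 AUs, random stand-ins for the correlation matrix, features and pre-trained classifiers, and a scoring rule that takes the inner product of each AU's updated feature with its updated classifier. The LeakyReLU slope 0.001 and the 64-dimensional features follow the values given in the embodiment.

```python
import numpy as np

def propagate(A, X, W, alpha=0.001):
    """One graph convolution step (assumed symmetric normalization) with LeakyReLU."""
    A_hat = A + np.eye(A.shape[0])                      # A + I
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))       # D^{-1/2} as a vector
    A_norm = d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]
    Z = A_norm @ X @ W
    return np.where(Z >= 0, Z, alpha * Z)

rng = np.random.default_rng(0)
num_aus, dim = 12, 64                                   # 12 AUs is hypothetical

A = rng.uniform(0.0, 1.0, size=(num_aus, num_aus))      # stand-in correlation matrix
features = rng.normal(size=(num_aus, dim))              # local multi-scale AU features
classifiers = rng.normal(size=(num_aus, dim))           # stand-in pre-trained classifiers

W_feat = rng.normal(scale=0.1, size=(dim, dim))         # weights of the feature network
W_cls = rng.normal(scale=0.1, size=(dim, dim))          # weights of the classifier network

features_out = propagate(A, features, W_feat)           # path 1: update AU features
classifiers_out = propagate(A, classifiers, W_cls)      # path 2: update AU classifiers

# Assumed scoring rule: inner product per AU, squashed to (0, 1) by a sigmoid.
logits = np.sum(features_out * classifiers_out, axis=1)
scores = 1.0 / (1.0 + np.exp(-logits))
```

Both paths share the same correlation matrix, so the prior knowledge about which AUs co-occur shapes the features and the classifiers alike.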

Further, the loss function is:

wherein the predicted value of the algorithm is computed from the AU feature representation and the AU classifier generated by the graph convolution networks; $\omega_i$ is a weight and $y_i$ is the true value, against which the predicted value is compared.

The AU classifier is trained with this loss function to optimize its recognition accuracy.
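The loss formula is not reproduced in this text. A common choice for multi-label classification with per-label weights is a weighted binary cross-entropy, sketched below. Treating the patent's loss as this form is an assumption, and the function name and example values are hypothetical.

```python
import numpy as np

def weighted_multilabel_bce(y_true, y_pred, weights, eps=1e-7):
    """Weighted binary cross-entropy summed over AUs (assumed form of the loss)."""
    # Clip predictions so that log() never receives exactly 0 or 1.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    per_au = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    return float(np.sum(weights * per_au))

y_true = np.array([1.0, 0.0, 1.0])    # ground-truth activation of three AUs
y_pred = np.array([0.9, 0.2, 0.6])    # predicted activation probabilities
weights = np.array([1.0, 1.0, 2.0])   # per-AU weights, e.g. to offset label imbalance
loss = weighted_multilabel_bce(y_true, y_pred, weights)
```

Each AU is treated as an independent binary label, which is what allows several AUs to be active in the same face image.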

The invention also provides a face micro-expression recognition system based on the graph convolution network, which comprises a face image acquisition unit and a processing unit, wherein the face image acquisition unit is used for acquiring a face image to be recognized, the output end of the face image acquisition unit is connected with the processing unit, and the processing unit executes the method of the invention to recognize the face micro-expression.

By utilizing the system and the face micro expression recognition algorithm based on the graph convolution network, the face motion unit can be automatically detected, and the recognition and the application of the follow-up micro expression are facilitated.

The present invention also provides a computer medium having stored therein a program for executing the method of the present invention.

The computer medium is convenient to use, and micro-expression recognition operation can be performed on various corresponding processing devices by utilizing the computer medium, so that the application range is expanded.

Drawings

FIG. 1 is a schematic flow chart of a face micro-expression recognition method based on a graph convolution network according to the present invention;

FIG. 2 is a schematic structural diagram of an AU multi-scale local feature extraction method in a preferred embodiment of the invention;

FIG. 3 is a schematic structural diagram of an AU correlation matrix according to a preferred embodiment of the present invention;

FIG. 4 is a schematic flow chart of the corresponding update of the graph convolution network of the face micro-expression recognition method based on the graph convolution network of the present invention.

Detailed Description

Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.

In the description of the present invention, it is to be understood that the terms "longitudinal", "lateral", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", and the like, indicate orientations or positional relationships based on those shown in the drawings, and are used merely for convenience of description and for simplicity of description, and do not indicate or imply that the referenced devices or elements must have a particular orientation, be constructed in a particular orientation, and be operated, and thus, are not to be construed as limiting the present invention.

In the description of the present invention, unless otherwise specified and limited, it is to be noted that the terms "mounted," "connected," and "connected" are to be interpreted broadly, and may be, for example, a mechanical connection or an electrical connection, a communication between two elements, a direct connection, or an indirect connection via an intermediate medium, and specific meanings of the terms may be understood by those skilled in the art according to specific situations.

Each AU corresponds to a specific face region and therefore has strong local features. At the same time, AUs are correlated with one another: for example, AU6 (cheek raiser) and AU12 (lip corner puller) often appear together, producing a smile. Both this regionality and the correlation between AUs are very important for micro-expression recognition.

Based on the above, the invention discloses a face micro-expression recognition method based on a graph convolution network that addresses both the presentation form of AUs and their internal correlation. First, because each AU corresponds to a specific face region and the activation regions of different AUs differ considerably, the scheme provides a multi-scale feature extraction algorithm based on the activation region, which extracts suitable local features for each AU and ensures the accuracy of AU recognition. Second, to capture the correlation between AUs effectively, two graph convolution networks are designed, one modeling the correlation between AU features and the other modeling the correlation between AU classifiers; the prior knowledge of AU correlation is thereby transferred into the feature representation, improving the efficiency of AU recognition.

As shown in fig. 1, the method for recognizing the micro expression of the human face of the embodiment includes the following steps:

according to the correspondence between each AU and a specific face region, and building on the concept of AU center points, improved AU center points are proposed as shown in Table 1, and the multi-scale local features of the AUs are extracted;

TABLE 1 AU center points

an AU correlation matrix is constructed by using the AU multi-scale local features, so that the correlation between AUs can subsequently be modeled with a graph convolution network;

based on the AU correlation matrix, two-layer graph convolution networks are constructed, wherein one graph convolution network is used for updating the feature representation of each AU, and the other graph convolution network is used for updating the AU classifier. As shown in FIG. 4, following MSUR feature extraction, when AU correlation is propagated through the AU feature representation, the local multi-scale features extracted for each AU are used as input. When AU correlation is propagated through the AU classifiers, the pre-trained classifiers (Pre-trained Classifiers) are used as input; based on AU correlation modeling with the GCN (AU correlation modeling with GCN), a score is predicted for each AU (predicted score for each AU), and the AU classifiers (Classifiers) are then iteratively updated by the graph convolution network. That is, the classification loss (Multi-label loss) is computed and the classifiers are updated with the BP (back-propagation) algorithm, a neural network training method, the purpose of the update being to learn classifiers better suited to the AUs;

and training the AU classifier by adopting a multi-label classification loss function to obtain a final AU classifier, and performing face micro-expression recognition.

In a preferred embodiment of the present invention, the method for extracting AU multi-scale local features comprises the following steps:

according to the differences in the face regions activated by each AU, a multi-scale local region feature extraction algorithm is used: based on the size of each AU's activation region, multi-scale windows (windows of different scales) are predefined, that is, an activation template is predefined for each AU. The activation templates differ considerably between AUs, and the preset template is applied whenever the corresponding AU is activated. The multi-scale windows are mapped onto the final feature layer extracted by the convolutional neural network ResNet-34 to obtain the AU multi-scale local features. Mapping a region of the original image onto a feature layer is a standard operation in neural networks: it is only necessary to determine which region of the feature map is associated with the given region of the original image, which can be determined from the receptive field of the feature map;

generating features of a fixed size by using RoIAlign (region-of-interest alignment; RoIAlign is a proper name that originates from Mask R-CNN by Kaiming He), so as to unify the sizes of the AU features from different regions;

because AUs occur in pairs, the paired AU features of uniform size are fused by an average pooling operation and reduced in dimension by a fully-connected mapping to obtain 64-dimensional AU features. As shown in FIG. 2, taking AU2 as an example, the final-layer feature map of ResNet-34 is 512x7x7; the preset windows are mapped onto this feature map, fixed-length features are generated by the RoIAlign operator, and the paired features are fused by an average pooling operator (avg.
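The extraction steps above can be sketched in NumPy. This is a simplified stand-in rather than the patent's implementation: the RoIAlign here takes a single bilinear sample at the centre of each output bin, the 3x3 output size and the window coordinates are hypothetical, "average pooling" of a pair is read as an element-wise mean of the two aligned features, and the fully-connected weights are untrained random values.

```python
import numpy as np

def bilinear(fmap, y, x):
    """Bilinearly interpolate a (C, H, W) feature map at a fractional (y, x)."""
    _, H, W = fmap.shape
    y = min(max(y, 0.0), H - 1.0)
    x = min(max(x, 0.0), W - 1.0)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, H - 1), min(x0 + 1, W - 1)
    wy, wx = y - y0, x - x0
    top = (1 - wx) * fmap[:, y0, x0] + wx * fmap[:, y0, x1]
    bottom = (1 - wx) * fmap[:, y1, x0] + wx * fmap[:, y1, x1]
    return (1 - wy) * top + wy * bottom

def roi_align_simple(fmap, box, out_size=3):
    """Simplified RoIAlign: one bilinear sample at the centre of each output bin."""
    x1, y1, x2, y2 = box
    bin_h, bin_w = (y2 - y1) / out_size, (x2 - x1) / out_size
    out = np.empty((fmap.shape[0], out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            out[:, i, j] = bilinear(fmap, y1 + (i + 0.5) * bin_h,
                                    x1 + (j + 0.5) * bin_w)
    return out

rng = np.random.default_rng(0)
fmap = rng.normal(size=(512, 7, 7))        # stand-in for the ResNet-34 feature map
left_box = (1.0, 1.0, 3.0, 2.5)            # hypothetical window, feature-map units
right_box = (4.0, 1.0, 6.0, 2.5)           # hypothetical counterpart of the pair

left = roi_align_simple(fmap, left_box)    # fixed size regardless of window size
right = roi_align_simple(fmap, right_box)
fused = (left + right) / 2.0               # fuse the pair by averaging

W_fc = rng.normal(scale=0.01, size=(fused.size, 64))  # untrained stand-in weights
b_fc = np.zeros(64)
au_feature = fused.reshape(-1) @ W_fc + b_fc           # 64-dimensional AU feature
```

In practice a library operator would normally replace the hand-written sampling; the point of the sketch is only that windows of different sizes yield features of one fixed shape before fusion and projection.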

In a preferred embodiment of the present invention, the method for constructing the AU correlation matrix comprises:

$$A_{i,j} = Q_{i,j} / N_i,$$

wherein $A_{i,j}$ is the entry of the correlation matrix for the $i$-th AU and the $j$-th AU appearing simultaneously in one face image, $Q_{i,j}$ is the frequency with which the $i$-th AU and the $j$-th AU occur simultaneously, and $N_i$ is the frequency of occurrence of the $i$-th AU. The specific structure of the AU correlation matrix is shown in FIG. 3.
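A short worked example may help show how the definition behaves. The counts are hypothetical: suppose AU6 occurs in 100 images, AU12 occurs in 160 images, and the two occur together in 80 images.

```latex
A_{6,12} = \frac{Q_{6,12}}{N_{6}} = \frac{80}{100} = 0.8,
\qquad
A_{12,6} = \frac{Q_{12,6}}{N_{12}} = \frac{80}{160} = 0.5
```

Because each row is normalized by the occurrence frequency of its own AU, the matrix is in general not symmetric: in this example AU12 accompanies most occurrences of AU6, while AU6 accompanies only half of the occurrences of AU12.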

In a preferred embodiment of the present invention, the graph convolution network layer for updating the AU feature representation is defined as:

wherein $H^{(l)}$ is the node representation of the $l$-th layer, $H^{(l+1)}$ is the node representation of the $(l+1)$-th layer, and $D$ is the degree matrix of the correlation matrix $A$, defined as $D_{ii} = \sum_j (A + I)_{ij}$. Because $A$ is a square matrix, the input and output dimensions of the nodes are unchanged. $\sigma$ is the activation function LeakyReLU, $f(x) = x$ if $x \ge 0$ and $f(x) = \alpha x$ if $x < 0$, with $\alpha = 0.001$; $I$ is the identity matrix and $W$ is a parameter of the graph neural network.

The loss function is:

Feedforward propagation and BP (back-propagation) training of the neural network are performed using this loss function, in which the output of the algorithm serves as the predicted value.

The invention also provides a face micro-expression recognition system based on the graph convolution network, which comprises a face image acquisition unit and a processing unit, wherein the face image acquisition unit is used for acquiring a face image to be recognized, the output end of the face image acquisition unit is connected with the processing unit, and the processing unit executes the method of the invention to recognize the face micro-expression. By utilizing the system and the face micro expression recognition algorithm based on the graph convolution network, the face motion unit can be automatically detected, and the recognition and the application of the follow-up micro expression are facilitated.

The scheme provides a multi-scale local AU feature extraction algorithm that effectively improves AU feature expression capability; because prior knowledge is incorporated through predefined windows, inference and training are also significantly faster. The AU feature correlation learning algorithm based on the graph convolution network effectively learns the correlation between AUs, and this correlation analysis improves the efficiency of AU recognition. The AU classifier based on the graph convolution network transfers the correlation between AUs to the classifier, thereby directly improving AU recognition.

The present invention also provides a computer medium having stored therein a program for executing the method of the present invention. The computer medium is convenient to use, and micro-expression recognition operation can be performed on various corresponding processing devices by utilizing the computer medium, so that the application range is expanded.

In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims

1. A face micro-expression recognition method based on a graph convolution network is characterized by comprising the following steps:

an AU correlation matrix is constructed by using AU multi-scale local features;

and obtaining a final AU classifier for face micro-expression recognition.

2. The face micro-expression recognition method based on a graph convolution network as claimed in claim 1, wherein the method for extracting the AU multi-scale local features comprises the following steps:

3. The face micro-expression recognition method based on a graph convolution network as claimed in claim 1, wherein the method for constructing the AU correlation matrix comprises:

$$A_{i,j} = Q_{i,j} / N_i,$$

4. The method of claim 1, wherein the graph convolution network layer for updating the AU feature representation comprises:

5. The method of claim 1, wherein when AU correlation is propagated through the AU feature representation, the local multi-scale features extracted for each AU are used as input;

when AU correlation is propagated through the AU classifiers, the pre-trained classifiers are used as input, and the AU classifiers are then iteratively updated by the graph convolution network.

6. The method of claim 1, wherein the loss function is:

wherein the output of the algorithm serves as the predicted value.

7. A face micro-expression recognition system based on a graph convolution network, characterized by comprising a face image acquisition unit and a processing unit, wherein the face image acquisition unit is used for acquiring a face image to be recognized, the output end of the face image acquisition unit is connected with the processing unit, and the processing unit executes the method of any one of claims 1 to 6 to perform face micro-expression recognition.

8. A computer medium, characterized in that a program for performing the method of one of claims 1 to 6 is stored in the computer medium.