CN114049675B - Facial expression recognition method based on light-weight two-channel neural network - Google Patents

Facial expression recognition method based on light-weight two-channel neural network

Info

Publication number
CN114049675B
CN114049675B (application CN202111430259.6A)
Authority
CN
China
Prior art keywords
channel
layer
feature
network
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111430259.6A
Other languages
Chinese (zh)
Other versions
CN114049675A (en)
Inventor
樊春晓
王振兴
林杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology filed Critical Hefei University of Technology
Priority to CN202111430259.6A priority Critical patent/CN114049675B/en
Publication of CN114049675A publication Critical patent/CN114049675A/en
Application granted granted Critical
Publication of CN114049675B publication Critical patent/CN114049675B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/048 Activation functions
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of expression recognition methods and discloses a facial expression recognition method based on a lightweight dual-channel neural network, comprising the following steps: S1, image preprocessing and construction of a graph structure; S2, construction of a GCN-based lightweight dual-channel network that automatically extracts the global and local features of the expression; S3, feature fusion and expression classification. The invention constructs a graph structure from the input expression image and uses the GCN to automatically extract two kinds of local features, facial-expression geometry and texture, avoiding interference from hand-crafted factors and thereby improving the accuracy of the expression classification results. The lightweight dual-channel network retains excellent classification performance despite its simplified architecture, few layers and small parameter count, and runs faster with better robustness.

Description

Facial expression recognition method based on light-weight two-channel neural network
Technical Field
The invention relates to the technical field of expression recognition methods, in particular to a facial expression recognition method based on a light-weight double-channel neural network.
Background
Existing traditional methods based on local features mainly encode and characterize the expression-variable areas of the face (such as the eyes, mouth and nose). However, the local facial features extracted by these methods are easily disturbed by human factors and can lose facial-expression information, making classification inaccurate. Deep learning methods based on global features take raw RGB face data as input, which improves recognition accuracy but also increases network complexity; and because facial-expression datasets are limited in size, overfitting occurs easily. Whether traditional methods based on local features or deep methods based on global features, these algorithms suffer from large performance fluctuations and poor robustness when facing complex scenes.
Disclosure of Invention
(I) Technical problem to be solved
Aiming at the defects of the prior art, the invention provides a facial expression recognition method based on a lightweight dual-channel neural network, which solves the problems of low accuracy and poor robustness in existing methods.
(II) Technical solution
To achieve the above purpose, the invention provides the following technical solution: a facial expression recognition method based on a lightweight dual-channel neural network, comprising the following operation steps:
S1, image preprocessing and construction of a graph structure
S11, preprocessing the face image of the input picture: first convert the face image to grayscale to reduce the data dimensionality;
S12, performing face detection and cropping to reduce the influence of face-irrelevant background information in the image on feature extraction;
S13, normalizing the cropped face image to a uniform size of 224x224 and using it as the input of the CNN channel;
S14, construction of graph structure
Detecting facial feature points in the input image and constructing a graph in which every two nodes are connected; the distance between feature points forms the weight of each edge, yielding a weighted adjacency matrix A ∈ R^(N×N) that represents the geometric features of the expression, while the pixel values around each feature point serve as the attributes of the graph nodes, yielding a node feature matrix X ∈ R^(N×D) that represents the texture features of the expression;
S2, constructing a GCN-based lightweight dual-channel network and automatically extracting the global and local features of the expression
S21, global feature channel-CNN channel
The CNN channel consists of 5 convolution units, each containing one convolution layer (3x3 kernel) and one max-pooling layer (2x2 kernel); a rectified linear unit serves as the activation function of each convolution layer; a vectorization layer flattens the multidimensional data into a one-dimensional global feature vector to ease the subsequent concatenation of feature vectors; and a batch normalization layer is added;
In batch training, the activations of each batch are normalized to zero mean and unit variance. For an m-dimensional input X = {x^(1), ..., x^(m)}, each dimension is regularized as

x̂^(k) = (x^(k) − E[x^(k)]) / √(Var[x^(k)])

where E and Var are the expectation and variance of the input X. The input to a layer in the CNN has four dimensions, so each dimension is normalized separately. By using batch normalization, all samples in a mini-batch are tied together, so the network does not produce a deterministic result from a single training sample: the output of a sample no longer depends only on the sample itself but also on the other samples in the same batch, and since the batch is drawn randomly each time, this avoids overfitting to some extent;
S22, local feature channel-GCN channel
The graph convolutional network is conceptually similar to an ordinary convolutional neural network; for the node feature matrix X and the weighted adjacency matrix A, the layer-wise propagation rule is:

H^(l+1) = σ( D̃^(−1/2) Ã D̃^(−1/2) H^(l) W^(l) )

The GCN channel is specifically composed of 4 graph convolution layers;
S3, feature fusion and expression classification
S31, concatenating the global features and local features extracted by the two channels to obtain a connection feature vector;
S32, feeding it into a fully connected layer for feature fusion and expression classification to obtain the final classification result.
Preferably, in the step S2, the four dimensions refer to batch size, channels, width and height.
Preferably, in step S2, Ã = A + I_N, where I_N is the identity matrix; D̃ is the degree matrix of Ã, given by D̃_ii = Σ_j Ã_ij; W^(l) is a trainable weight matrix; H^(l) is the feature of each layer, and for the input layer H^(0) = X; σ denotes an activation function, e.g. ReLU(·) = max(0, ·).
Preferably, in step S3, the connection feature vector can be expressed as v_c = (v_g, v_l), where v_c, v_g and v_l denote the connection feature vector, the global feature vector and the local feature vector, respectively.
(III) beneficial effects
The invention provides a facial expression recognition method based on a lightweight dual-channel neural network, with the following beneficial effects:
(1) The invention constructs a graph structure from the input expression image and uses the GCN to automatically extract two kinds of local features, facial-expression geometry and texture, avoiding interference from hand-crafted factors and improving the accuracy of the expression classification results.
(2) The invention extracts local and global features simultaneously through two channels and fuses them into a comprehensive representation, obtaining better recognition results than methods using a single type of feature.
(3) The lightweight dual-channel network retains excellent classification performance despite its simplified architecture, few layers and small parameter count, and runs faster with better robustness.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a component diagram of the present invention;
FIG. 3 is a detailed table of CNN channels of the present invention;
fig. 4 is a GCN channel detailed information table of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in fig. 1-4, the present invention provides a technical solution: the facial expression recognition method based on the light-weight double-channel neural network comprises the following operation steps:
S1, image preprocessing and construction of a graph structure
S11, preprocessing the face image of the input picture to better extract facial-expression characteristics: first convert the face image to grayscale to reduce the data dimensionality;
S12, performing face detection and cropping to reduce the influence of face-irrelevant background information in the image on feature extraction;
S13, normalizing the cropped face image to a uniform size of 224x224 and using it as the input of the CNN channel;
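As a minimal sketch of steps S11 to S13, the preprocessing can be expressed in plain NumPy. This is an illustrative assumption-laden version: a real pipeline would typically use a detector library (e.g. OpenCV or dlib) to produce the face box, and proper interpolation rather than the nearest-neighbour resize used here.

```python
import numpy as np

def preprocess_face(img_rgb, box, size=224):
    """Grayscale -> crop to the detected face box -> resize to size x size.
    box = (top, left, height, width) is assumed to come from a face detector.
    Nearest-neighbour resizing keeps the sketch dependency-free."""
    # standard luminance weights for RGB -> grey
    grey = img_rgb @ np.array([0.299, 0.587, 0.114])
    t, l, h, w = box
    face = grey[t:t + h, l:l + w]
    # nearest-neighbour index maps for the resize
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return face[np.ix_(rows, cols)]
```

The returned 224x224 array is what the CNN channel would consume as input.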
S14, construction of graph structure
Unlike the CNN, whose input is the whole picture, the input of the GCN is a graph structure. To construct a graph structure from a facial expression, facial feature points are detected in the input image and a graph is constructed as in fig. 2. In fig. 2, every two graph nodes are connected and the distances between feature points form the weights of the edges, yielding a weighted adjacency matrix A ∈ R^(N×N) that represents the geometric features of the expression; the pixel values around each feature point are the attributes of the graph nodes, yielding a node feature matrix X ∈ R^(N×D) that represents the texture features of the expression;
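The graph construction of S14 can be sketched as follows. The 5x5 pixel neighbourhood used as the node attribute is an assumption (the patent does not fix the patch size), and `landmarks` is assumed to come from a facial feature-point detector.

```python
import numpy as np

def build_graph(grey, landmarks, patch=5):
    """Fully connected graph over facial landmarks.
    Edge weight = Euclidean distance between landmark coordinates
    (weighted adjacency matrix A); node attribute = flattened pixel
    patch around the landmark (node feature matrix X)."""
    pts = np.asarray(landmarks, dtype=float)          # (N, 2) as (row, col)
    diff = pts[:, None, :] - pts[None, :, :]
    A = np.sqrt((diff ** 2).sum(-1))                  # (N, N) pairwise distances
    r = patch // 2
    padded = np.pad(grey, r, mode="edge")             # handle border landmarks
    X = np.stack([
        padded[int(y):int(y) + patch, int(x):int(x) + patch].ravel()
        for y, x in pts
    ])                                                # (N, patch*patch)
    return A, X
```

A is symmetric with a zero diagonal, as expected for a distance-weighted undirected graph.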
S2, constructing a GCN-based lightweight dual-channel network and automatically extracting the global and local features of the expression
S21, global feature channel-CNN channel
The CNN channel consists of 5 convolution units, each containing one convolution layer and one max-pooling layer, with a 3x3 convolution kernel and a 2x2 pooling kernel; the details of this channel are set forth in fig. 3. A rectified linear unit serves as the activation function of each convolution layer, and a vectorization layer flattens the multidimensional data into a one-dimensional global feature vector to ease the subsequent concatenation of feature vectors. In addition, a batch normalization layer is added to address the large intra-class and small inter-class differences of facial expressions. Unlike the face recognition task, where one category represents a single person, in facial expression recognition one category contains many individuals; images belonging to the same expression class can therefore differ in appearance, gender, skin color and age, creating large intra-class differences.
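Each of the 5 convolution units halves the spatial resolution through its 2x2 max pooling. Assuming the 3x3 convolutions preserve spatial size via padding (an assumption; the patent does not state the padding), the feature-map sizes for a 224x224 input can be checked with a short sketch:

```python
def cnn_channel_shapes(size=224, units=5):
    """Spatial size of the feature map after each of the 5 convolution
    units: only the 2x2 max pool (stride 2) changes the resolution."""
    shapes = []
    for _ in range(units):
        size = size // 2          # 2x2 pooling halves each spatial dimension
        shapes.append(size)
    return shapes
```

For a 224x224 input this yields maps of 112, 56, 28, 14 and finally 7, so the vectorization layer flattens a 7x7 map per channel.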
In batch training, the activations of each batch are normalized to zero mean and unit variance. For an m-dimensional input X = {x^(1), ..., x^(m)}, each dimension is regularized as

x̂^(k) = (x^(k) − E[x^(k)]) / √(Var[x^(k)])

where E and Var are the expectation and variance of the input X. The input to a layer in the CNN has four dimensions (batch size, channels, width and height), so each dimension is normalized separately. By using batch normalization, all samples in a mini-batch are tied together, so the network does not produce a deterministic result from a single training sample: the output of a sample no longer depends only on the sample itself but also on the other samples in the same batch, and since the batch is drawn randomly each time, this avoids overfitting to some extent;
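The per-dimension normalization x̂ = (x − E[x]) / √Var[x] can be sketched directly over the batch axis. The small `eps` added for numerical stability is an assumption, as is the omission of the usual learnable scale and shift parameters:

```python
import numpy as np

def batch_norm(X, eps=1e-5):
    """Normalise each dimension of an m-sample batch (rows = samples)
    to zero mean and unit variance: x_hat = (x - E[x]) / sqrt(Var[x] + eps)."""
    return (X - X.mean(axis=0)) / np.sqrt(X.var(axis=0) + eps)
```

After normalization each column of the batch has mean approximately 0 and variance approximately 1, illustrating how a sample's output depends on the statistics of the whole batch.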
S22, local feature channel-GCN channel
The graph convolutional network is conceptually similar to an ordinary convolutional neural network; for the node feature matrix X and the weighted adjacency matrix A, the layer-wise propagation rule is:

H^(l+1) = σ( D̃^(−1/2) Ã D̃^(−1/2) H^(l) W^(l) )

where Ã = A + I_N and I_N is the identity matrix; D̃ is the degree matrix of Ã, given by D̃_ii = Σ_j Ã_ij; W^(l) is a trainable weight matrix; H^(l) is the feature of each layer, and for the input layer H^(0) = X; σ denotes an activation function, e.g. ReLU(·) = max(0, ·).
The GCN channel is specifically composed of 4 graph convolution layers; detailed information is shown in fig. 4;
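A single propagation step of the GCN rule H^(l+1) = σ(D̃^(−1/2) Ã D̃^(−1/2) H^(l) W^(l)) can be sketched in NumPy. The ReLU activation is the example named in the text; the weight matrix would be learned in practice and is a placeholder here:

```python
import numpy as np

def gcn_layer(H, A, W):
    """One graph-convolution propagation step with symmetric
    normalisation: A~ = A + I, D~_ii = sum_j A~_ij."""
    A_tilde = A + np.eye(A.shape[0])                  # add self-loops
    d = A_tilde.sum(axis=1)                           # degree vector of A~
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    H_next = D_inv_sqrt @ A_tilde @ D_inv_sqrt @ H @ W
    return np.maximum(H_next, 0.0)                    # ReLU activation
```

Stacking four such layers (with its own W per layer) would mirror the 4-layer GCN channel; the final layer's node features would be pooled into the local feature vector v_l.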
S3, feature fusion and expression classification
S31, concatenating the global features and local features extracted by the two channels to obtain a connection feature vector, which can be expressed as:

v_c = (v_g, v_l)

where v_c, v_g and v_l denote the connection feature vector, the global feature vector and the local feature vector, respectively;
S32, feeding it into a fully connected layer for feature fusion and expression classification to obtain the final classification result.
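Steps S31 and S32 can be sketched as concatenation (v_c = (v_g, v_l)) followed by one fully connected layer with a softmax. The 7-class output and the weight values are illustrative assumptions; the patent does not fix the number of expression classes:

```python
import numpy as np

def fuse_and_classify(v_g, v_l, W, b):
    """Concatenate the global and local feature vectors, then apply a
    fully connected layer and a numerically stable softmax."""
    v_c = np.concatenate([v_g, v_l])                  # v_c = (v_g, v_l)
    logits = W @ v_c + b
    e = np.exp(logits - logits.max())                 # subtract max for stability
    return e / e.sum()                                # class probabilities
```

The output is a probability distribution over expression classes; the argmax gives the final classification result.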
The method solves the low accuracy and poor robustness of existing methods. The GCN channel extracts local features and the CNN channel extracts global features; the dual-channel neural network fuses the global and local features into a comprehensive representation, yielding better recognition results than a single type of feature. In addition, the compact lightweight dual-channel network simplifies the architecture, addressing both network complexity and overfitting, and is highly robust.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (4)

1. A facial expression recognition method based on a light-weight double-channel neural network, characterized by comprising the following operation steps:
S1, image preprocessing and construction of a graph structure
S11, preprocessing the face image of the input picture: first convert the face image to grayscale to reduce the data dimensionality;
S12, performing face detection and cropping to reduce the influence of face-irrelevant background information in the image on feature extraction;
S13, normalizing the cropped face image to a uniform size of 224x224 and using it as the input of the CNN channel;
S14, construction of graph structure
Detecting facial feature points in the input image and constructing a graph in which every two nodes are connected; the distance between feature points forms the weight of each edge, yielding a weighted adjacency matrix A ∈ R^(N×N) that represents the geometric features of the expression, while the pixel values around each feature point serve as the attributes of the graph nodes, yielding a node feature matrix X ∈ R^(N×D) that represents the texture features of the expression;
S2, constructing a GCN-based lightweight dual-channel network and automatically extracting the global and local features of the expression
S21, global feature channel-CNN channel
The CNN channel consists of 5 convolution units, each containing one convolution layer (3x3 kernel) and one max-pooling layer (2x2 kernel); a rectified linear unit serves as the activation function of each convolution layer; a vectorization layer flattens the multidimensional data into a one-dimensional global feature vector to ease the subsequent concatenation of feature vectors; and a batch normalization layer is added;
In batch training, the activations of each batch are normalized to zero mean and unit variance. For an m-dimensional input X = {x^(1), ..., x^(m)}, each dimension is regularized as

x̂^(k) = (x^(k) − E[x^(k)]) / √(Var[x^(k)])

where E and Var are the expectation and variance of the input X. The input to a layer in the CNN has four dimensions, so each dimension is normalized separately. By using batch normalization, all samples in a mini-batch are tied together, so the network does not produce a deterministic result from a single training sample: the output of a sample no longer depends only on the sample itself but also on the other samples in the same batch, and since the batch is drawn randomly each time, this avoids overfitting to some extent;
S22, local feature channel-GCN channel
The graph convolutional network is conceptually similar to an ordinary convolutional neural network; for the node feature matrix X and the weighted adjacency matrix A, the layer-wise propagation rule is:

H^(l+1) = σ( D̃^(−1/2) Ã D̃^(−1/2) H^(l) W^(l) )

The GCN channel is specifically composed of 4 graph convolution layers;
S3, feature fusion and expression classification
S31, concatenating the global features and local features extracted by the two channels to obtain a connection feature vector;
S32, feeding it into a fully connected layer for feature fusion and expression classification to obtain the final classification result.
2. The facial expression recognition method based on a lightweight two-channel neural network according to claim 1, wherein in the step S2, four dimensions refer to batch size, channels, width and height.
3. The facial expression recognition method based on the lightweight two-channel neural network according to claim 1, wherein in step S2, Ã = A + I_N, where I_N is the identity matrix; D̃ is the degree matrix of Ã, given by D̃_ii = Σ_j Ã_ij; W^(l) is a trainable weight matrix; H^(l) is the feature of each layer, and for the input layer H^(0) = X; σ denotes an activation function, e.g. ReLU(·) = max(0, ·).
4. The facial expression recognition method based on the lightweight two-channel neural network according to claim 1, wherein in step S3, the connection feature vector can be expressed as:
v_c = (v_g, v_l)
where v_c, v_g and v_l denote the connection feature vector, the global feature vector and the local feature vector, respectively.
CN202111430259.6A 2021-11-29 2021-11-29 Facial expression recognition method based on light-weight two-channel neural network Active CN114049675B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111430259.6A CN114049675B (en) 2021-11-29 2021-11-29 Facial expression recognition method based on light-weight two-channel neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111430259.6A CN114049675B (en) 2021-11-29 2021-11-29 Facial expression recognition method based on light-weight two-channel neural network

Publications (2)

Publication Number Publication Date
CN114049675A CN114049675A (en) 2022-02-15
CN114049675B (en) 2024-02-13

Family

ID=80211583

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111430259.6A Active CN114049675B (en) 2021-11-29 2021-11-29 Facial expression recognition method based on light-weight two-channel neural network

Country Status (1)

Country Link
CN (1) CN114049675B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114612987A (en) * 2022-03-17 2022-06-10 深圳集智数字科技有限公司 Expression recognition method and device

Citations (3)

Publication number Priority date Publication date Assignee Title
CN108491835A (en) * 2018-06-12 2018-09-04 常州大学 Binary channels convolutional neural networks towards human facial expression recognition
WO2019196308A1 (en) * 2018-04-09 2019-10-17 平安科技(深圳)有限公司 Device and method for generating face recognition model, and computer-readable storage medium
CN112766220A (en) * 2021-02-01 2021-05-07 西南大学 Dual-channel micro-expression recognition method and system, storage medium and computer equipment

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
CN109815785A (en) * 2018-12-05 2019-05-28 四川大学 A kind of face Emotion identification method based on double-current convolutional neural networks

Patent Citations (3)

Publication number Priority date Publication date Assignee Title
WO2019196308A1 (en) * 2018-04-09 2019-10-17 平安科技(深圳)有限公司 Device and method for generating face recognition model, and computer-readable storage medium
CN108491835A (en) * 2018-06-12 2018-09-04 常州大学 Binary channels convolutional neural networks towards human facial expression recognition
CN112766220A (en) * 2021-02-01 2021-05-07 西南大学 Dual-channel micro-expression recognition method and system, storage medium and computer equipment

Non-Patent Citations (2)

Title
A facial expression recognition method based on an improved convolutional neural network; Zou Jiancheng; Cao Xiuling; Journal of North China University of Technology; 2020-04-15 (02); full text *
Dual-channel convolutional neural network for facial expression recognition; Cao Jinmeng; Ni Rongrong; Yang Biao; Journal of Nanjing Normal University (Engineering and Technology Edition); 2018-09-20 (03); full text *

Also Published As

Publication number Publication date
CN114049675A (en) 2022-02-15

Similar Documents

Publication Publication Date Title
US11417148B2 (en) Human face image classification method and apparatus, and server
CN110288018B (en) WiFi identity recognition method fused with deep learning model
CN108615010B (en) Facial expression recognition method based on parallel convolution neural network feature map fusion
CN110348399B (en) Hyperspectral intelligent classification method based on prototype learning mechanism and multidimensional residual error network
WO2021082480A1 (en) Image classification method and related device
CN112580590A (en) Finger vein identification method based on multi-semantic feature fusion network
CN110084266B (en) Dynamic emotion recognition method based on audio-visual feature deep fusion
CN104268593A (en) Multiple-sparse-representation face recognition method for solving small sample size problem
CN105956570B (en) Smiling face's recognition methods based on lip feature and deep learning
CN110717423B (en) Training method and device for emotion recognition model of facial expression of old people
CN111160130B (en) Multi-dimensional collision recognition method for multi-platform virtual identity account
CN110046544A (en) Digital gesture identification method based on convolutional neural networks
CN112842348B (en) Automatic classification method for electrocardiosignals based on feature extraction and deep learning
CN116052218B (en) Pedestrian re-identification method
CN115862091A (en) Facial expression recognition method, device, equipment and medium based on Emo-ResNet
CN111368734B (en) Micro expression recognition method based on normal expression assistance
CN111694977A (en) Vehicle image retrieval method based on data enhancement
CN114049675B (en) Facial expression recognition method based on light-weight two-channel neural network
CN110349176B (en) Target tracking method and system based on triple convolutional network and perceptual interference learning
Guo et al. Multifeature extracting CNN with concatenation for image denoising
CN111523483A (en) Chinese food dish image identification method and device
CN113378620B (en) Cross-camera pedestrian re-identification method in surveillance video noise environment
CN114241564A (en) Facial expression recognition method based on inter-class difference strengthening network
CN113763417B (en) Target tracking method based on twin network and residual error structure
CN111914617B (en) Face attribute editing method based on balanced stack type generation type countermeasure network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant