CN113642383A - Face expression recognition method based on joint loss multi-feature fusion - Google Patents

Face expression recognition method based on joint loss multi-feature fusion

Info

Publication number
CN113642383A
CN113642383A (application CN202110697155.5A)
Authority
CN
China
Prior art keywords
layer
feature
network
features
fusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110697155.5A
Other languages
Chinese (zh)
Inventor
苗壮
林克正
李靖宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin University of Science and Technology
Original Assignee
Harbin University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin University of Science and Technology filed Critical Harbin University of Science and Technology
Priority to CN202110697155.5A priority Critical patent/CN113642383A/en
Publication of CN113642383A publication Critical patent/CN113642383A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The application relates to a facial expression recognition method based on joint loss multi-feature fusion, which comprises the following steps: detecting a human face to obtain a face image; extracting features from the face image through an improved ResNet network and a VGG network respectively; reducing the dimension of the extracted features through fully-connected layers; fusing the features by weighted fusion; and sending the fused features to a Softmax layer for classification and outputting the facial expression category. The method extracts features with two neural network architectures and fully fuses the extracted features. During training, a loss function that combines cosine loss and cross-entropy loss by weighting is used; this joint loss pulls samples of the same category closely together while pushing different categories far apart.

Description

Face expression recognition method based on joint loss multi-feature fusion
Technical Field
The invention relates to a facial expression recognition method, and belongs to the field of image recognition.
Background
Facial expression recognition is one of the research hotspots in computer vision, with a wide range of applications including human-computer interaction, safe driving, intelligent monitoring, driver assistance, and criminal investigation. Current facial expression recognition algorithms are mainly based on traditional methods and deep learning methods. Traditional face feature extraction algorithms mainly include Principal Component Analysis (PCA), Scale-Invariant Feature Transform (SIFT), Local Binary Patterns (LBP), Gabor wavelet transform, and Histogram of Oriented Gradients (HOG). As research deepened and artificial intelligence technology developed, deep learning methods came to excel in the field of image recognition, and deep neural networks (DNN) applied to expression recognition achieve better performance.
However, current expression recognition methods are easily affected by image noise and human interference factors, resulting in poor recognition rates. A single-channel neural network starts from the global image and easily ignores local features, causing feature loss; the single feature extracted by a single network model is one of the reasons for the low recognition rate.
Disclosure of Invention
The invention aims to solve the technical problem of feature loss in a single convolutional neural network during facial expression recognition, and provides a facial expression recognition method based on joint loss multi-feature fusion.
In order to achieve this purpose, the invention adopts the following technical scheme:
S1, carrying out face detection on the image to be recognized to obtain a face region;
S2, extracting features from the obtained face image through an improved ResNet network;
S3, extracting features from the obtained face image through a VGG network;
S4, sending the features obtained in steps S2 and S3 into fully-connected layers for dimension reduction;
S5, fusing the dimension-reduced features from step S4 into new features by weighted fusion;
and S6, sending the new features from step S5 into the fully-connected layer for dimension reduction, then performing class prediction on the features with a Softmax layer, and outputting the class information.
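For orientation, these six steps can be sketched as a single forward pass. The following PyTorch-style code is an illustrative assumption, not the patent's reference implementation: the framework choice, the class name DualChannelFER, and the s_dim/v_dim arguments are all introduced here for the sketch.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class DualChannelFER(nn.Module):
        # Two-channel recognizer: improved ResNet (global) and VGG (local) branches,
        # per-branch dimension reduction, weighted fusion, Softmax classification.
        def __init__(self, resnet_branch, vgg_branch, s_dim, v_dim, k=0.5, num_classes=7):
            super().__init__()
            self.resnet_branch = resnet_branch      # S2: improved ResNet feature extractor
            self.vgg_branch = vgg_branch            # S3: VGG feature extractor
            self.k = k                              # S5: channel weighting coefficient
            self.fc_s = nn.Sequential(nn.Linear(s_dim, 512), nn.RReLU(), nn.Linear(512, num_classes))
            self.fc_v = nn.Sequential(nn.Linear(v_dim, 512), nn.RReLU(), nn.Linear(512, num_classes))

        def forward(self, face):                    # face: S1 output, e.g. a 1x48x48 grayscale crop
            f_s = torch.flatten(self.resnet_branch(face), 1)   # global features
            f_v = torch.flatten(self.vgg_branch(face), 1)      # local features
            f_s, f_v = self.fc_s(f_s), self.fc_v(f_v)          # S4: reduce both to 7 dimensions
            f_z = self.k * f_s + (1 - self.k) * f_v            # S5: weighted fusion
            return F.softmax(f_z, dim=1)                       # S6: class probabilities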
Further, the face detection in step S1 uses an MSSD network model to obtain the face region, and includes:
S11, based on the SSD object detection network, replacing the original base network VGG-16 with the lightweight network MobileNet.
S12, fusing the 7th depthwise separable convolution layer (shallow features) of the network in step S11 with the feature maps of the last 5 layers (deep features): the six feature maps are each reshaped into one-dimensional vectors and then concatenated in series to realize multi-scale face detection.
S13, the detection network extracts features through the base network, and the meta-structure performs classification and bounding-box regression.
Further, the specific method of feature extraction from the obtained face image through the improved ResNet network in step S2 is: the residual block in the ResNet network is improved by adding a convolution operation, reducing the parameter count, modifying the number of network layers, and introducing a pre-activation method. Step S2 includes:
S21, the face image X = (x_1, x_2, ..., x_n) detected in S1 is fed into the improved ResNet network, and the corresponding global feature f_S = (f_S^1, f_S^2, ..., f_S^m) is obtained after processing by several residual blocks. The convolution operation is:

x_{l+1} = f(h(x_l) + F(x_l, W_l))

where x_l and x_{l+1} are the input and output of the l-th residual unit, F is the residual function, h(x_l) = x_l represents the identity mapping, and f is the RRelu activation function. The features learned from a shallow layer l to a deep layer L are:

x_L = x_l + Σ_{i=l}^{L-1} F(x_i, W_i)

S22, after the flatten layer, the features yield the feature vector F_S.
Further, the specific content of feature extraction by the VGG network in step S3 is:
The VGG network replaces a larger convolution kernel with stacks of consecutive 3 × 3 kernels; for a given receptive field, several small kernels work better than one large kernel, because the extra activation functions provide more non-linearity and a better network structure can be trained at no extra cost. The feature extraction process of the network is as follows:
The face image detected in S1 passes through several convolution and max-pooling layers of the VGG network to obtain the corresponding local feature f_V = (f_V^1, f_V^2, ..., f_V^k); after the flatten layer, the feature vector F_V is obtained.
Further, the specific dimension-reduction method in step S4 is:
S41, the feature vector F_S extracted in step S2 is fed into two fully-connected layers fc_{1-1} and fc_{1-2} for dimension reduction, using the RRelu activation function:

RRelu(x) = x, if x >= 0; RRelu(x) = a * x, if x < 0, where a is drawn from a uniform distribution U(lower, upper)

The structure of each fully-connected layer is:

fc_{1-1} = {s_1, s_2, ..., s_512}
fc_{1-2} = {s_1, s_2, ..., s_7}

where s denotes a neuron of the current fully-connected layer; fc_{1-1} has 512 neurons and fc_{1-2} has 7 neurons, so the final output of the fully-connected layers is the 7-dimensional feature vector F'_S.
S42, the feature vector F_V extracted in step S3 is fed into two fully-connected layers fc_{2-1} and fc_{2-2} for dimension reduction, with layer structures:

fc_{2-1} = {l_1, l_2, ..., l_512}
fc_{2-2} = {l_1, l_2, ..., l_7}

where l denotes a neuron of the current fully-connected layer; fc_{2-1} has 512 neurons and fc_{2-2} has 7 neurons, so the final output of the fully-connected layers is the 7-dimensional feature vector F'_V.
Further, step S5 is specifically:
The features F'_S and F'_V obtained in step S4 are weighted and fused into the new feature F_z; a weight coefficient k is set to adjust the proportion of the two channels' features, and the fusion process is:

F_z = k * F'_S + (1 - k) * F'_V

When k takes 0 or 1, only one convolutional neural network extracts features.
Further, the Softmax activation function classification process in step S6 is:

y_i = exp(z_i) / Σ_{c=1}^{C} exp(z_c)

where Z is the output of the previous layer and the input of Softmax, with dimension C equal to the number of classes, and y_i is the predicted probability of class i.
The invention has the advantages that:
1. The method extracts features with a dual convolutional neural network, improves the base networks into structures with better performance, and then fuses the two feature vectors by weighted fusion to obtain more effective feature information.
2. Local features and global features are effectively fused in the convolutional neural network; during feature extraction, the fused features are fed into subsequent convolution layers for further extraction, enriching the information of the feature maps.
3. A new joint loss function is adopted: the weighted combination of cosine loss and cross-entropy loss pulls samples of the same category closely together while pushing different categories far apart, enhancing the discriminability of the features extracted by the neural network.
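The patent text does not reproduce the joint loss formula at this point. The following is a minimal sketch under two stated assumptions: that the cosine term pulls each sample's feature toward a learned per-class center, and that lam is the weighting coefficient between the two terms; neither detail is confirmed by the source.

    import torch
    import torch.nn.functional as F

    def joint_loss(features, logits, targets, class_centers, lam=0.5):
        # Assumed form: cross entropy plus a cosine term that pulls each sample's
        # feature toward its class center, weighted by lam. The exact formula is
        # not given in the patent text; this is an illustrative sketch only.
        ce = F.cross_entropy(logits, targets)
        centers = class_centers[targets]            # one center vector per sample
        cos = 1.0 - F.cosine_similarity(features, centers).mean()
        return ce + lam * cos                       # small cos => tight same-class clusters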
Drawings
Fig. 1 is a network diagram of MSSD face detection.
Fig. 2 is a structural diagram of an improved ResNet network.
FIG. 3 is a flow chart of a facial expression recognition method based on joint loss multi-feature fusion.
Fig. 4 is an overall structure diagram of the neural network for extracting expressive features.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Embodiment 1
Referring to fig. 1 to 4, embodiment 1 provides a facial expression recognition method based on joint loss multi-feature fusion, which comprises the following steps:
S1, carrying out face detection on the image to be recognized to obtain a face region;
Referring to fig. 1, the biggest highlight of MobileNet is the depthwise separable convolution, which consists of a depthwise convolution followed by a pointwise convolution and greatly speeds up training and recognition, so the network is built from depthwise separable convolutions. In the MSSD network, the input passes through 1 standard convolution layer with a 3 × 3 kernel and stride 2, then through 13 depthwise separable convolution layers; the back end connects 4 standard convolution layers alternating 1 × 1 and 3 × 3 kernels and 1 max-pooling layer. Because pooling layers lose some effective features, the network's standard convolution layers use stride-2 kernels in place of pooling layers. Shallow layers have smaller receptive fields and richer detail information, which favors detecting small targets, so the MSSD face detection network fuses shallow features with deep features. Fusing the shallow features of layer 7 with the deep features works best, so the network uses the fused features of layers 7, 15, 16, 17, 18 and 19: the six feature maps are each reshaped into one-dimensional vectors and then concatenated in series to realize multi-scale face detection.
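A minimal sketch of the two building blocks just described, assuming PyTorch; the function names depthwise_separable and fuse_multiscale are illustrative, not from the patent.

    import torch
    import torch.nn as nn

    def depthwise_separable(in_ch, out_ch, stride=1):
        # Depthwise 3x3 convolution followed by a pointwise 1x1 convolution: the
        # MobileNet building block that MSSD stacks 13 times after the stem layer.
        return nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, stride=stride, padding=1, groups=in_ch, bias=False),
            nn.BatchNorm2d(in_ch), nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        )

    def fuse_multiscale(feature_maps):
        # Reshape each of the six selected feature maps (layers 7 and 15-19) into a
        # one-dimensional vector per sample, then concatenate them in series.
        return torch.cat([fm.flatten(1) for fm in feature_maps], dim=1)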
In step S1, the image to be recognized comes from public international facial expression datasets such as FER2013, CK+ and JAFFE, or is captured by a camera; it is then used for face detection and segmentation, with the following specific steps:
S11, based on the SSD object detection network, replacing the original base network VGG-16 with the lightweight network MobileNet.
S12, fusing the 7th depthwise separable convolution layer (shallow features) of the network in step S11 with the feature maps of the last 5 layers (deep features): the six feature maps are each reshaped into one-dimensional vectors and then concatenated in series to realize multi-scale face detection.
S13, the detection network extracts features through the base network, and the meta-structure performs classification and bounding-box regression.
Specifically, in step S1 an image is acquired from a facial expression database or a camera, the MSSD network performs face detection on it, the face region with the highest confidence is selected, background interference in the image is removed, and finally a 48 × 48 face grayscale image is obtained.
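A minimal sketch of this acquisition-and-cropping step, assuming OpenCV and a detector that returns the highest-confidence box as (x, y, w, h); the function name crop_face is illustrative.

    import cv2

    def crop_face(image_bgr, box):
        # box: (x, y, w, h) of the highest-confidence detection from the MSSD network.
        x, y, w, h = box
        face = image_bgr[y:y + h, x:x + w]             # remove background interference
        gray = cv2.cvtColor(face, cv2.COLOR_BGR2GRAY)
        return cv2.resize(gray, (48, 48))              # 48x48 grayscale face image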
S2, extracting the characteristics of the obtained face image through an improved ResNet network;
Referring to fig. 2, the improvement changes the residual block into three convolution layers: the first and last use 1 × 1 kernels while the kernel size of the middle layer is unchanged, which adds one convolution operation yet greatly reduces the network's parameter count. Pre-activation is achieved by moving the BN layer and the activation layer in front of the convolution layers; the modified ResNet trains faster and with lower error than the original ResNet.
Step S2 specifically includes:
S21, the face image X = (x_1, x_2, ..., x_n) detected in S1 is fed into the improved ResNet network, and the corresponding global feature f_S = (f_S^1, f_S^2, ..., f_S^m) is obtained after processing by several residual blocks. The convolution operation is:

x_{l+1} = f(h(x_l) + F(x_l, W_l))

where x_l and x_{l+1} are the input and output of the l-th residual unit, F is the residual function, h(x_l) = x_l represents the identity mapping, and f is the RRelu activation function. The features learned from a shallow layer l to a deep layer L are:

x_L = x_l + Σ_{i=l}^{L-1} F(x_i, W_i)

S22, after the flatten layer, the features yield the feature vector F_S.
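One possible realization of this pre-activation bottleneck, assuming PyTorch; the block name PreActBottleneck and the channel arguments are illustrative assumptions.

    import torch.nn as nn

    class PreActBottleneck(nn.Module):
        # Three convolutions (1x1 -> 3x3 -> 1x1) with BN and RReLU moved in front of
        # each convolution (pre-activation); the 1x1 layers cut the parameter count.
        def __init__(self, channels, mid_channels):
            super().__init__()
            self.body = nn.Sequential(
                nn.BatchNorm2d(channels), nn.RReLU(),
                nn.Conv2d(channels, mid_channels, 1, bias=False),
                nn.BatchNorm2d(mid_channels), nn.RReLU(),
                nn.Conv2d(mid_channels, mid_channels, 3, padding=1, bias=False),
                nn.BatchNorm2d(mid_channels), nn.RReLU(),
                nn.Conv2d(mid_channels, channels, 1, bias=False),
            )

        def forward(self, x):
            # identity mapping h(x_l) = x_l plus residual F(x_l, W_l); in the
            # pre-activation form no further activation follows the addition.
            return x + self.body(x)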
S3, extracting features from the obtained face image through a VGG network:
Specifically, the VGG network replaces large convolution kernels with stacks of consecutive 3 × 3 kernels; for a given receptive field, several small kernels work better than one large kernel, because the extra activation functions provide more non-linearity and a better network structure can be trained at no extra cost. In this basic structure the convolution kernels are 3 × 3 with zero-padding of 1, which keeps the feature-map size unchanged after each convolution; each max-pooling layer then halves the feature-map size. The image passes through five convolution stages in total, whose kernel channel counts are 64, 128, 256, 512 and 512. Two branches are used for feature fusion, with a convolution-pooling layer adjusting sizes before fusion. The two channels are transformed into feature vectors by fully-connected layers and fused together, and a dropout layer is introduced to prevent overfitting; the result is then passed to the following fully-connected layer and the subsequent softmax layer for classification prediction. The face image detected in S1 passes through several convolution and max-pooling layers of the VGG network to obtain the corresponding local feature f_V = (f_V^1, f_V^2, ..., f_V^k); after the flatten layer, the feature vector F_V is obtained.
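A compressed sketch of such a VGG-style extractor, assuming PyTorch and, for brevity, one 3 × 3 convolution per stage where VGG proper stacks several; the function name vgg_features and the single-channel 48 × 48 input are assumptions.

    import torch.nn as nn

    def vgg_features(in_ch=1, channels=(64, 128, 256, 512, 512)):
        # Stacked 3x3 convolutions with padding 1 keep the feature-map size; each
        # 2x2 max pooling then halves it (48 -> 24 -> 12 -> 6 -> 3 -> 1).
        layers = []
        for out_ch in channels:
            layers += [nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
                       nn.MaxPool2d(2)]
            in_ch = out_ch
        return nn.Sequential(*layers)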
Step S4 specifically includes:
S41, the feature vector F_S extracted in step S2 is fed into two fully-connected layers fc_{1-1} and fc_{1-2} for dimension reduction, using the RRelu activation function:

RRelu(x) = x, if x >= 0; RRelu(x) = a * x, if x < 0, where a is drawn from a uniform distribution U(lower, upper)

The structure of each layer is:

fc_{1-1} = {s_1, s_2, ..., s_512}
fc_{1-2} = {s_1, s_2, ..., s_7}

where s denotes a neuron of the current fully-connected layer; fc_{1-1} has 512 neurons and fc_{1-2} has 7 neurons, so the final output of the fully-connected layers is the 7-dimensional feature vector F'_S.
S42, the feature vector F_V extracted in step S3 is fed into two fully-connected layers fc_{2-1} and fc_{2-2} for dimension reduction, with layer structures:

fc_{2-1} = {l_1, l_2, ..., l_512}
fc_{2-2} = {l_1, l_2, ..., l_7}

where l denotes a neuron of the current fully-connected layer; fc_{2-1} has 512 neurons and fc_{2-2} has 7 neurons, so the final output of the fully-connected layers is the 7-dimensional feature vector F'_V.
Specifically, the features output by the two convolutional neural networks are each reduced to the same dimensionality, in preparation for feature fusion.
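A minimal sketch of one such dimension-reduction head, assuming PyTorch; the function name reduction_head is illustrative.

    import torch.nn as nn

    def reduction_head(in_dim):
        # Two fully-connected layers per channel: a 512-neuron layer with RReLU,
        # then a 7-neuron layer, so both branches end in a 7-dimensional vector.
        return nn.Sequential(nn.Linear(in_dim, 512), nn.RReLU(), nn.Linear(512, 7))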
S5, fusing the features subjected to dimensionality reduction in the step S4 into new features in a weighting fusion mode;
referring to fig. 4, the overall network structure is to perform a clipping operation on the VGG19 network, and then merge the network with the improved ResNet network. Then, the shallow information and the deep information are combined together and input into the next convolution layer, so that the extracted characteristic information can be more complete. The network structure can better obtain image features beneficial to classification without increasing training time. Compared with the characteristics extracted through a single channel, the characteristics after fusion are easier to match with a real label, and the recognition effect is better. Characterizing in step S4
Figure RE-GDA0003286355790000051
And
Figure RE-GDA0003286355790000052
formation of new features F after weighted fusionzSetting a weight coefficient k to adjust the characteristic proportion of the two channels, wherein the fusion process is as follows:
Figure RE-GDA0003286355790000053
when k takes 0 or 1, it means a network with only one single channel.
The advantage of weighted fusion is that the proportion of different neural network output characteristics can be adjusted, and the optimal value of k is found to be 0.5 through a large number of experiments.
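The fusion itself is a one-line operation; a sketch assuming 7-dimensional tensors from both branches:

    def weighted_fusion(f_s, f_v, k=0.5):
        # F_z = k * F'_S + (1 - k) * F'_V; k = 0.5 is the empirically best value,
        # while k = 0 or k = 1 degenerates to a single-channel network.
        return k * f_s + (1 - k) * f_v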
S6, feeding the new feature from step S5 into a fully-connected layer, classifying it with the Softmax activation function, and outputting the expression;
The Softmax activation function classification process in step S6 is:

y_i = exp(z_i) / Σ_{c=1}^{C} exp(z_c)

where Z is the output of the previous layer and the input of Softmax, with dimension C equal to the number of classes, and y_i is the predicted probability of class i. The expressions are divided into 7 classes, namely anger, disgust, fear, happiness, sadness, surprise and neutral, and the final classification result is the class whose neuron node outputs the largest probability value.
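As a minimal sketch of this final step, assuming PyTorch, a single 7-dimensional fused vector, and an illustrative label ordering:

    import torch

    LABELS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise", "neutral"]

    def predict_expression(fused):
        # Softmax turns the 7-dimensional fused feature into class probabilities;
        # the predicted expression is the class with the largest probability.
        probs = torch.softmax(fused, dim=-1)
        return LABELS[int(probs.argmax())], probs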
The invention is not described in detail, but is well known to those skilled in the art.
The foregoing detailed description of the preferred embodiments of the invention has been presented. It should be understood that numerous modifications and variations could be devised by those skilled in the art in light of the present teachings without departing from the inventive concepts. Therefore, the technical solutions available to those skilled in the art through logic analysis, reasoning and limited experiments based on the prior art according to the concept of the present invention should be within the scope of protection defined by the claims.

Claims (7)

1. A facial expression recognition method based on joint loss multi-feature fusion is characterized by comprising the following steps:
S1, carrying out face detection on the image to be recognized to obtain a face region;
S2, extracting features from the obtained face image through an improved ResNet network;
S3, extracting features from the obtained face image through a VGG network;
S4, sending the features obtained in steps S2 and S3 into fully-connected layers for dimension reduction;
S5, fusing the dimension-reduced features from step S4 into new features by weighted fusion;
and S6, sending the new features from step S5 into the fully-connected layer for dimension reduction, then performing class prediction on the features with a Softmax layer, and outputting the class information.
2. The facial expression recognition method based on joint loss multi-feature fusion according to claim 1, wherein the step S1 comprises:
S11, based on the SSD object detection network, replacing the original base network VGG-16 with the lightweight network MobileNet;
S12, fusing the 7th depthwise separable convolution layer (shallow features) of the network in step S11 with the feature maps of the last 5 layers (deep features): the six feature maps are each reshaped into one-dimensional vectors and then concatenated in series to realize multi-scale face detection;
S13, the detection network extracting features through the base network, and the meta-structure performing classification and bounding-box regression.
3. The method for recognizing facial expressions based on joint loss multi-feature fusion according to claim 2, wherein the step S2 includes:
S21, the face image X = (x_1, x_2, ..., x_n) detected in S1 is fed into the improved ResNet network, and the corresponding global feature f_S = (f_S^1, f_S^2, ..., f_S^m) is obtained after processing by several residual blocks, the convolution operation being:
x_{l+1} = f(h(x_l) + F(x_l, W_l))
where x_l and x_{l+1} are the input and output of the l-th residual unit, F is the residual function, h(x_l) = x_l represents the identity mapping, and f is the RRelu activation function; the features learned from a shallow layer l to a deep layer L are
x_L = x_l + Σ_{i=l}^{L-1} F(x_i, W_i)
S22, after the flatten layer, the features yield the feature vector F_S.
4. The method for recognizing facial expressions based on joint loss multi-feature fusion according to claim 3, wherein in the step S3, the face image detected in S1 passes through several convolution and max-pooling layers of the VGG network to obtain the corresponding local feature f_V = (f_V^1, f_V^2, ..., f_V^k); after the flatten layer, the feature vector F_V is obtained.
5. The method for recognizing facial expressions based on joint loss multi-feature fusion according to claim 4, wherein the step S4 includes:
S41, the feature vector F_S extracted in step S2 is fed into two fully-connected layers fc_{1-1} and fc_{1-2} for dimension reduction, using the RRelu activation function:
RRelu(x) = x, if x >= 0; RRelu(x) = a * x, if x < 0, where a is drawn from a uniform distribution U(lower, upper)
The structure of each layer is:
fc_{1-1} = {s_1, s_2, ..., s_512}
fc_{1-2} = {s_1, s_2, ..., s_7}
where s denotes a neuron of the current fully-connected layer; fc_{1-1} has 512 neurons and fc_{1-2} has 7 neurons, so the final output of the fully-connected layers is the 7-dimensional feature vector F'_S;
S42, the feature vector F_V extracted in step S3 is fed into two fully-connected layers fc_{2-1} and fc_{2-2} for dimension reduction, with layer structures:
fc_{2-1} = {l_1, l_2, ..., l_512}
fc_{2-2} = {l_1, l_2, ..., l_7}
where l denotes a neuron of the current fully-connected layer; fc_{2-1} has 512 neurons and fc_{2-2} has 7 neurons, so the final output of the fully-connected layers is the 7-dimensional feature vector F'_V.
6. The method for recognizing facial expressions based on joint loss multi-feature fusion according to claim 5, wherein the weighted fusion in the step S5 is calculated as follows: the features F'_S and F'_V obtained in step S4 are weighted and fused into the new feature F_z, with a weight coefficient k set to adjust the proportion of the two channels' features, the fusion process being:
F_z = k * F'_S + (1 - k) * F'_V
when k takes 0 or 1, only one convolutional neural network extracts features.
7. The method for recognizing facial expressions based on joint loss multi-feature fusion according to claim 6, wherein in the step S6 the Softmax activation function is expressed as:
y_i = exp(z_i) / Σ_{c=1}^{C} exp(z_c)
where Z is the output of the previous layer and the input of Softmax, with dimension C equal to the number of classes, and y_i is the predicted probability of class i; the expressions are divided into 7 classes, namely anger, disgust, fear, happiness, sadness, surprise and neutral, and the final classification result is the class whose neuron node outputs the largest probability value.
CN202110697155.5A 2021-06-23 2021-06-23 Face expression recognition method based on joint loss multi-feature fusion Pending CN113642383A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110697155.5A CN113642383A (en) 2021-06-23 2021-06-23 Face expression recognition method based on joint loss multi-feature fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110697155.5A CN113642383A (en) 2021-06-23 2021-06-23 Face expression recognition method based on joint loss multi-feature fusion

Publications (1)

Publication Number Publication Date
CN113642383A true CN113642383A (en) 2021-11-12

Family

ID=78416120

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110697155.5A Pending CN113642383A (en) 2021-06-23 2021-06-23 Face expression recognition method based on joint loss multi-feature fusion

Country Status (1)

Country Link
CN (1) CN113642383A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114186617A (en) * 2021-11-23 2022-03-15 浙江大学 Mechanical fault diagnosis method based on distributed deep learning

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190216334A1 (en) * 2018-01-12 2019-07-18 Futurewei Technologies, Inc. Emotion representative image to derive health rating
CN110414371A (en) * 2019-07-08 2019-11-05 西南科技大学 A kind of real-time face expression recognition method based on multiple dimensioned nuclear convolution neural network
CN110543895A (en) * 2019-08-08 2019-12-06 淮阴工学院 image classification method based on VGGNet and ResNet
CN111126472A (en) * 2019-12-18 2020-05-08 南京信息工程大学 Improved target detection method based on SSD
CN111259954A (en) * 2020-01-15 2020-06-09 北京工业大学 Hyperspectral traditional Chinese medicine tongue coating and tongue quality classification method based on D-Resnet
CN112418330A (en) * 2020-11-26 2021-02-26 河北工程大学 Improved SSD (solid State drive) -based high-precision detection method for small target object
CN112597873A (en) * 2020-12-18 2021-04-02 南京邮电大学 Dual-channel facial expression recognition method based on deep learning
CN112766413A (en) * 2021-02-05 2021-05-07 浙江农林大学 Bird classification method and system based on weighted fusion model

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190216334A1 (en) * 2018-01-12 2019-07-18 Futurewei Technologies, Inc. Emotion representative image to derive health rating
CN110414371A (en) * 2019-07-08 2019-11-05 西南科技大学 A kind of real-time face expression recognition method based on multiple dimensioned nuclear convolution neural network
CN110543895A (en) * 2019-08-08 2019-12-06 淮阴工学院 image classification method based on VGGNet and ResNet
CN111126472A (en) * 2019-12-18 2020-05-08 南京信息工程大学 Improved target detection method based on SSD
CN111259954A (en) * 2020-01-15 2020-06-09 北京工业大学 Hyperspectral traditional Chinese medicine tongue coating and tongue quality classification method based on D-Resnet
CN112418330A (en) * 2020-11-26 2021-02-26 河北工程大学 Improved SSD (solid State drive) -based high-precision detection method for small target object
CN112597873A (en) * 2020-12-18 2021-04-02 南京邮电大学 Dual-channel facial expression recognition method based on deep learning
CN112766413A (en) * 2021-02-05 2021-05-07 浙江农林大学 Bird classification method and system based on weighted fusion model

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
GU SHENGTAO et al.: "Facial expression recognition based on global and local feature fusion with CNNs", 2019 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC) *
MINHYUK JUNG et al.: "Human activity classification based on sound recognition and residual convolutional neural network", Automation in Construction *
李旻择 et al.: "Real-time facial expression recognition based on multi-scale kernel feature convolutional neural network", Journal of Computer Applications (计算机应用) *
李春虹 et al.: "Facial expression recognition based on depthwise separable convolution", Computer Engineering and Design (计算机工程与设计) *
李校林 et al.: "Facial expression recognition with feature fusion based on VGG-NET", Computer Engineering and Science (计算机工程与科学) *
郑锡聪: "Research on bimodal learning-state recognition fusing ResNet and DS evidence theory", China Masters' Theses Full-text Database, Social Sciences II (中国优秀博硕士学位论文全文数据库(硕士) 社会科学Ⅱ辑) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114186617A (en) * 2021-11-23 2022-03-15 浙江大学 Mechanical fault diagnosis method based on distributed deep learning
CN114186617B (en) * 2021-11-23 2022-08-30 浙江大学 Mechanical fault diagnosis method based on distributed deep learning

Similar Documents

Publication Publication Date Title
Guo et al. A survey on deep learning based face recognition
Dino et al. Facial expression classification based on SVM, KNN and MLP classifiers
Mane et al. A survey on supervised convolutional neural network and its major applications
Zeng et al. Multi-stage contextual deep learning for pedestrian detection
Yan et al. Multi-attributes gait identification by convolutional neural networks
Rajan et al. Novel deep learning model for facial expression recognition based on maximum boosted CNN and LSTM
Sun et al. Facial expression recognition based on a hybrid model combining deep and shallow features
Sajjanhar et al. Deep learning models for facial expression recognition
Peng et al. Towards facial expression recognition in the wild: A new database and deep recognition system
Moustafa et al. Age-invariant face recognition based on deep features analysis
CN110276248B (en) Facial expression recognition method based on sample weight distribution and deep learning
Yang et al. Semi-supervised learning of feature hierarchies for object detection in a video
KR101777601B1 (en) Distinction method and system for characters written in caoshu characters or cursive characters
Julina et al. Facial emotion recognition in videos using hog and lbp
CN112883941A (en) Facial expression recognition method based on parallel neural network
Bodapati et al. A deep learning framework with cross pooled soft attention for facial expression recognition
Sujanaa et al. Emotion recognition using support vector machine and one-dimensional convolutional neural network
CN113642383A (en) Face expression recognition method based on joint loss multi-feature fusion
CN113516047A (en) Facial expression recognition method based on deep learning feature fusion
Khemakhem et al. Facial expression recognition using convolution neural network enhancing with pre-processing stages
Dewan et al. Fish detection and classification
Espinel et al. Face gesture recognition using deep-learning models
TWI722383B (en) Pre feature extraction method applied on deep learning
Mahmoodzadeh Human Activity Recognition based on Deep Belief Network Classifier and Combination of Local and Global Features
CN113052132A (en) Video emotion recognition method based on face key point track feature map

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication
Application publication date: 20211112