CN110705430A - Multi-person facial expression recognition method and system based on deep learning - Google Patents
- Publication number
- CN110705430A (application number CN201910916023.XA)
- Authority
- CN
- China
- Prior art keywords
- expression recognition
- expression
- volume
- training
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Biomedical Technology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention discloses a method and a system for recognizing the facial expressions of multiple persons based on deep learning, wherein the recognition method comprises the following steps: 1. establishing an expression recognition model; 2. constructing a training sample set, and training the parameters of the expression recognition model; 3. detecting the faces in the image to be recognized with an MTCNN network to obtain the face windows in the image to be recognized; and inputting the detected face regions into the trained expression recognition model to obtain an expression classification result for each face in the image. The recognition method applies deep learning to expression recognition, can quickly complete the task of recognizing the facial expressions of multiple people, and achieves a high recognition rate.
Description
Technical Field
The invention belongs to the technical field of expression recognition, and particularly relates to a method and a system for recognizing facial expressions of multiple persons based on deep learning.
Background
Facial expression is a common outward expression of human emotion and is generally used to identify emotion when people communicate. With the development of human-computer interaction, facial expression recognition has become a hot topic in recent decades; it is widely applied in fields such as traffic, medicine and education, and permeates many aspects of daily life.
Traditional expression recognition algorithms extract features manually; the process is complex and computationally expensive. The concept of deep learning originates from artificial neural networks and essentially refers to a class of methods for effectively training neural networks with deep structures. The convolutional neural network, a special feedforward neural network, is the most important model in deep learning. Standard convolutional neural networks, comprising input, convolutional, pooling and output layers, have greatly advanced the development of image classification, recognition and understanding techniques. Through multi-level convolution, deep learning can automatically learn the features related to facial expression and finally complete facial expression recognition.
However, owing to the complexity of the facial expression recognition problem, the effect of applying deep learning techniques to facial expression recognition is affected by factors such as pose, occlusion and illumination, and recognition accuracy remains low.
Disclosure of Invention
Purpose of the invention: the invention aims to provide a method for recognizing the facial expressions of multiple persons with high recognition accuracy.
The technical scheme is as follows: the invention discloses a method for recognizing facial expressions of multiple persons based on deep learning, which comprises a training stage and a recognition stage, wherein the training stage comprises the following steps:
(1) establishing an expression recognition model with a VGG-19 network structure, comprising: 5 convolutional blocks and 5 maximum pooling layers arranged alternately in sequence, followed by 2 fully-connected layers and a softmax classification layer; among the 5 convolutional blocks, the first and second convolutional blocks each comprise 2 convolutional layers, and the third, fourth and fifth convolutional blocks each comprise 4 convolutional layers; the softmax classification layer is a 7-class classification layer;
the parameters of the 5 convolutional blocks and the 5 maximum pooling layers adopt the parameters of a pre-trained VGG-19 network;
(2) constructing a training sample set, and training the parameters of the 2 fully-connected layers and the softmax classification layer in the expression recognition model;
the training sample set consists of images from the CK+ and fer2013 expression data sets and comprises 7 types of expressions: anger; disgust; fear; happy; sad; surprised; neutral; all pictures are unified into 48 × 48 gray-level images;
training the constructed expression recognition network model with a stochastic gradient descent algorithm using adaptive moment estimation;
the identification phase comprises the steps of:
(3) detecting the face in the image to be recognized by adopting an MTCNN network to obtain a face window in the image to be recognized; and inputting the detected face area into a trained expression recognition model for recognition to obtain an expression classification result of each face in the image to be recognized.
The training sample set further comprises data-enhanced versions of the images in the CK+ and fer2013 expression data sets; the data enhancement comprises randomly rotating, scaling, horizontally or vertically projection-transforming, and horizontally flipping the images.
In the MTCNN network, the stride of all convolutions is 1, and the stride of pooling is 2; the activation function is PReLU:

PReLU(x) = max(0, x) + α · min(0, x)

wherein α ≤ 1 is an adjustable parameter.
On the other hand, the invention discloses a multi-person facial expression recognition system for realizing the method, which comprises: a face detection module, an expression recognition training module and an expression recognition model; the face detection module is used for detecting a face region in an input image;
the expression recognition training module is used for training an expression recognition model according to a training sample set;
the expression recognition model is used for recognizing the facial expressions in the detected face area to obtain an expression classification result.
The face detection module, the expression recognition training module and the expression recognition model run on computers equipped with NVIDIA GTX 1080Ti GPUs.
Beneficial effects: compared with the prior art, the method for identifying the facial expressions of multiple persons disclosed by the invention has the following advantages: 1. the invention applies deep learning to expression recognition, can quickly complete the task of facial expression recognition for multiple people, and maintains a high recognition rate even under the influence of factors such as pose, occlusion and illumination; 2. the method avoids the limitation of traditional expression recognition algorithms, which rely on manually extracted features: through multi-level convolution, deep learning automatically learns the features related to facial expression; 3. the invention uses transfer learning, which greatly reduces the number of network parameters to be trained, effectively extracts the multi-layer features of facial expression, and improves expression recognition accuracy while maintaining speed.
Drawings
FIG. 1 is a flow chart of a method for recognizing facial expressions of multiple persons according to the present invention;
FIG. 2 is a network architecture diagram of an expression recognition model of the present invention;
FIG. 3 is a flow chart of face detection in the present invention;
fig. 4 is a block diagram of a system for recognizing facial expressions of multiple persons according to the present invention.
Detailed Description
The invention is further elucidated with reference to the drawings and the detailed description.
As shown in FIG. 1, the invention discloses a method for recognizing facial expressions of multiple persons based on deep learning, which comprises a training stage and a recognition stage, wherein the training stage comprises the following steps:
step 1, establishing an expression recognition model, wherein the expression recognition model has a VGG-19 network structure, and comprises the following steps: sequentially alternating 5 volume blocks and 5 maximum pooling layers, 2 full-link layers and softmax sorting layers, as shown in fig. 2; in the 5 volume blocks, the first volume block and the second volume block respectively comprise 2 volume layers, and the third volume block, the fourth volume block and the fifth volume block respectively comprise 4 volume layers, and 16 volume layers are formed in total; the softmax classification layer is a classification layer of 7 classifications;
The VGG-19 pre-training model parameters are migrated to the expression recognition network model as the parameters of the 16 convolutional layers and 5 pooling layers. The expression recognition model has the same structure as the VGG-19 network, except that 2 fully-connected layers replace the 3 fully-connected layers of VGG-19, which reduces the amount of computation; the smaller number of parameters also mitigates overfitting. The 1st fully-connected layer has 1024 nodes and the 2nd fully-connected layer has 4 nodes; the original softmax classification layer of VGG-19 is replaced with a 7-class classification layer.
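As a rough illustration (not code from the patent), the following pure-Python sketch traces a 48 × 48 gray input through the backbone described above, assuming the standard VGG-19 channel widths and "same"-padded 3 × 3 convolutions followed by 2 × 2 stride-2 max-pooling in each block:

```python
# Block layout of the modified VGG-19 backbone: (number of 3x3 conv
# layers, output channels). The channel widths are the standard VGG-19
# values, assumed here since the patent does not list them.
BLOCKS = [(2, 64), (2, 128), (4, 256), (4, 512), (4, 512)]

def trace(size=48):
    """Return the (height, width, channels) shape after each block for a
    square gray input; "same"-padded 3x3 convs keep the spatial size, and
    each 2x2 stride-2 max-pool halves it."""
    shapes = []
    for n_convs, channels in BLOCKS:
        size = size // 2  # the pooling layer at the end of the block
        shapes.append((size, size, channels))
    return shapes

print(trace())
# [(24, 24, 64), (12, 12, 128), (6, 6, 256), (3, 3, 512), (1, 1, 512)]
```

The 2 + 2 + 4 + 4 + 4 = 16 convolutional layers match the count given above, and the final 1 × 1 × 512 feature map is what the fully-connected layers would flatten.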
The training sample set consists of images from the CK+ and fer2013 expression data sets and comprises 7 types of expressions: anger; disgust; fear; happy; sad; surprised; neutral; all pictures are unified into 48 × 48 gray-scale images. To improve the generalization ability of the model, the training sample set further includes data-enhanced versions of the CK+ and fer2013 images; the data enhancement comprises randomly rotating, scaling, horizontally or vertically projection-transforming, and horizontally flipping the images.
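Of the augmentations listed, the horizontal flip is simple enough to sketch without an image library. The snippet below is an illustrative assumption, not the patent's implementation; it treats a grayscale image as a list of pixel rows:

```python
import random

def hflip(img):
    """Horizontally flip a grayscale image stored as a list of pixel rows."""
    return [row[::-1] for row in img]

def augment(img, rng=None):
    """Apply the horizontal flip with probability 0.5. The other listed
    augmentations (rotation, scaling, projective transforms) would normally
    come from an image library and are omitted from this sketch."""
    rng = rng or random.Random()
    return hflip(img) if rng.random() < 0.5 else img

print(hflip([[1, 2], [3, 4]]))  # [[2, 1], [4, 3]]
```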
The invention trains the constructed expression recognition network model with a stochastic gradient descent algorithm using adaptive moment estimation, i.e., the Adam algorithm.
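The Adam update referred to above can be sketched for a single scalar parameter as follows (textbook form with the usual default hyperparameters, which the patent does not specify):

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; lr, b1, b2, eps are the
    common default values, assumed here rather than taken from the patent."""
    m = b1 * m + (1 - b1) * grad           # biased first-moment estimate
    v = b2 * v + (1 - b2) * grad ** 2      # biased second-moment estimate
    m_hat = m / (1 - b1 ** t)              # bias correction (t starts at 1)
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(theta) = theta**2, gradient 2 * theta.
theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
print(abs(theta) < 0.1)  # True: theta has moved close to the minimum at 0
```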
The identification phase comprises the steps of:
the MTCNN multitask cascade convolution neural network comprises three cascade convolution neural networks, and the face position is predicted step by step and the characteristic points are calibrated from coarse to fine. The three cascaded convolutional neural networks are a proposed Network, a purified Network and an Output Network.
As shown in fig. 3, the face detection step is:
s1, zooming the input image according to different scales to form an image pyramid which is used as the input of the three-layer cascade network;
s2, rapidly generating a face candidate window and a boundary regression vector thereof by using a proposed Network Proposal Network; correcting the candidate window using the boundary regression vector; combining the overlapping windows by using a non-maximum value inhibition method;
s3, purifying the face candidate window selected in the step S2 by using a purification Network; the RefineNework also corrects the candidate window using a boundary regression vector; further merging the overlapping windows by using a non-maximum value inhibition method;
s4, screening the face candidate window in the step S3 by using an Output Network to obtain one or more accurate face positions, and finishing face detection.
In the invention, the stride of all MTCNN convolutions is 1, and the stride of pooling is 2; an activation layer follows each convolutional layer and fully-connected layer, and the activation function is PReLU:

PReLU(x) = max(0, x) + α · min(0, x)

wherein α ≤ 1 is an adjustable parameter.
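A scalar sketch of the PReLU activation above; the initial value α = 0.25 is a common choice, not a figure from the patent (in practice α is learned during training):

```python
def prelu(x, alpha=0.25):
    """PReLU: identity for positive inputs, slope alpha (alpha <= 1) for
    negative inputs. alpha = 0.25 is an assumed illustrative value."""
    return x if x > 0 else alpha * x

print([prelu(v) for v in (-2.0, 0.0, 3.0)])  # [-0.5, 0.0, 3.0]
```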
And inputting the face area detected by the MTCNN into a trained expression recognition model for recognition to obtain an expression classification result of each face in the image to be recognized.
As shown in fig. 4, the recognition system for recognizing the facial expressions of multiple persons comprises: a face detection module 1, an expression recognition training module 2 and an expression recognition model 3; the face detection module is used for detecting a face region in an input image; the expression recognition training module is used for training the expression recognition model on a training sample set; the expression recognition model is used for recognizing the facial expressions in the detected face regions to obtain an expression classification result.
In order to improve the training speed of the multi-person facial expression recognition system, the face detection module, the expression recognition training module and the expression recognition model in this embodiment run on computers equipped with NVIDIA GTX 1080Ti GPUs.
Claims (8)
1. A method for recognizing the facial expressions of multiple persons based on deep learning, characterized by comprising a training stage and a recognition stage, wherein the training stage comprises the following steps:
(1) establishing an expression recognition model with a VGG-19 network structure, comprising: 5 convolutional blocks and 5 maximum pooling layers arranged alternately in sequence, followed by 2 fully-connected layers and a softmax classification layer; among the 5 convolutional blocks, the first and second convolutional blocks each comprise 2 convolutional layers, and the third, fourth and fifth convolutional blocks each comprise 4 convolutional layers; the softmax classification layer is a 7-class classification layer;
the parameters of the 5 convolutional blocks and the 5 maximum pooling layers adopt the parameters of a pre-trained VGG-19 network;
(2) constructing a training sample set, and training the parameters of the 2 fully-connected layers and the softmax classification layer in the expression recognition model;
the training sample set consists of images from the CK+ and fer2013 expression data sets and comprises 7 types of expressions: anger; disgust; fear; happy; sad; surprised; neutral; all pictures are unified into 48 × 48 gray-level images;
training the constructed expression recognition network model with a stochastic gradient descent algorithm using adaptive moment estimation;
the identification phase comprises the steps of:
(3) detecting the face in the image to be recognized by adopting an MTCNN network to obtain a face window in the image to be recognized; and inputting the detected face area into a trained expression recognition model for recognition to obtain an expression classification result of each face in the image to be recognized.
2. The method of claim 1, wherein the training sample set further comprises data-enhanced versions of the images in the CK+ and fer2013 expression data sets; the data enhancement comprises randomly rotating, scaling, horizontally or vertically projection-transforming, and horizontally flipping the images.
4. A multi-person facial expression recognition system based on deep learning, characterized by comprising: a face detection module, an expression recognition training module and an expression recognition model; the face detection module is used for detecting a face region in an input image;
the expression recognition training module is used for training an expression recognition model according to a training sample set;
the expression recognition model is used for recognizing the facial expressions in the detected face area to obtain an expression classification result.
5. The system of claim 4, wherein the face detection module detects faces in the input image using an MTCNN network; in the MTCNN network, the stride of all convolutions is 1, and the stride of pooling is 2; the activation function is PReLU:

PReLU(x) = max(0, x) + α · min(0, x)

wherein α ≤ 1 is an adjustable parameter.
6. The system of claim 4, wherein the expression recognition model has a VGG-19 network structure, comprising: 5 convolutional blocks and 5 maximum pooling layers arranged alternately in sequence, followed by 2 fully-connected layers and a softmax classification layer; among the 5 convolutional blocks, the first and second convolutional blocks each comprise 2 convolutional layers, and the third, fourth and fifth convolutional blocks each comprise 4 convolutional layers; the softmax classification layer is a 7-class classification layer;
the parameters of the 5 convolutional blocks and the 5 maximum pooling layers adopt the parameters of a pre-trained VGG-19 network.
7. The system of claim 4, wherein the training sample set consists of images from the CK+ and fer2013 expression data sets together with data-enhanced versions of those images, comprising 7 types of expressions: anger; disgust; fear; happy; sad; surprised; neutral; all pictures are unified into 48 × 48 gray-level images;
the data enhancement comprises randomly rotating, scaling, horizontally or vertically projection-transforming, and horizontally flipping the images.
8. The system of claim 4, wherein the face detection module, the expression recognition training module, and the expression recognition model run on computers equipped with NVIDIA GTX 1080Ti GPUs.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910916023.XA CN110705430A (en) | 2019-09-26 | 2019-09-26 | Multi-person facial expression recognition method and system based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110705430A true CN110705430A (en) | 2020-01-17 |
Family
ID=69198076
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910916023.XA Pending CN110705430A (en) | 2019-09-26 | 2019-09-26 | Multi-person facial expression recognition method and system based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110705430A (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105654049A (en) * | 2015-12-29 | 2016-06-08 | 中国科学院深圳先进技术研究院 | Facial expression recognition method and device |
CN107491726A (en) * | 2017-07-04 | 2017-12-19 | 重庆邮电大学 | A kind of real-time expression recognition method based on multi-channel parallel convolutional neural networks |
CN107729872A (en) * | 2017-11-02 | 2018-02-23 | 北方工业大学 | Facial expression recognition method and device based on deep learning |
CN109002766A (en) * | 2018-06-22 | 2018-12-14 | 北京邮电大学 | A kind of expression recognition method and device |
Non-Patent Citations (3)
Title |
---|
KAIPENG ZHANG ET AL.: "Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks", IEEE Signal Processing Letters *
XU KEHU ET AL.: "Intelligent Computing Methods and Their Applications" (《智能计算方法及其应用》), 31 July 2019 *
HUANG XIAOPING: "Research on Contemporary Deep Machine Learning Methods and Applications" (《当代机器深度学习方法与应用研究》), 30 November 2017 *
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111466878A (en) * | 2020-04-14 | 2020-07-31 | 合肥工业大学 | Real-time monitoring method and device for pain symptoms of bedridden patients based on expression recognition |
CN111507241A (en) * | 2020-04-14 | 2020-08-07 | 四川聚阳科技集团有限公司 | Lightweight network classroom expression monitoring method |
CN111680760A (en) * | 2020-06-16 | 2020-09-18 | 北京联合大学 | Clothing style identification method and device, electronic equipment and storage medium |
CN111738178A (en) * | 2020-06-28 | 2020-10-02 | 天津科技大学 | Wearing mask facial expression recognition method based on deep learning |
CN112801002A (en) * | 2021-02-05 | 2021-05-14 | 黑龙江迅锐科技有限公司 | Facial expression recognition method and device based on complex scene and electronic equipment |
CN113011253A (en) * | 2021-02-05 | 2021-06-22 | 中国地质大学(武汉) | Face expression recognition method, device, equipment and storage medium based on ResNeXt network |
CN113011253B (en) * | 2021-02-05 | 2023-04-21 | 中国地质大学(武汉) | Facial expression recognition method, device, equipment and storage medium based on ResNeXt network |
CN113069080A (en) * | 2021-03-22 | 2021-07-06 | 上海交通大学医学院附属第九人民医院 | Difficult airway assessment method and device based on artificial intelligence |
CN113642467A (en) * | 2021-08-16 | 2021-11-12 | 江苏师范大学 | Facial expression recognition method based on improved VGG network model |
CN113642467B (en) * | 2021-08-16 | 2023-12-01 | 江苏师范大学 | Facial expression recognition method based on improved VGG network model |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110705430A (en) | Multi-person facial expression recognition method and system based on deep learning | |
Park et al. | A depth camera-based human activity recognition via deep learning recurrent neural network for health and social care services | |
Acharya et al. | Deep learning based large scale handwritten Devanagari character recognition | |
WO2022111236A1 (en) | Facial expression recognition method and system combined with attention mechanism | |
CN105469065B (en) | A kind of discrete emotion identification method based on recurrent neural network | |
CN107679526B (en) | Human face micro-expression recognition method | |
CN107562784A (en) | Short text classification method based on ResLCNN models | |
CN106803069A (en) | Crowd's level of happiness recognition methods based on deep learning | |
CN109543667A (en) | A kind of text recognition method based on attention mechanism | |
CN110119785A (en) | Image classification method based on multilayer spiking convolutional neural network | |
CN108830252A (en) | A kind of convolutional neural networks human motion recognition method of amalgamation of global space-time characteristic | |
CN112906604B (en) | Behavior recognition method, device and system based on skeleton and RGB frame fusion | |
CN108133188A (en) | A kind of Activity recognition method based on motion history image and convolutional neural networks | |
Said et al. | Design of a face recognition system based on convolutional neural network (CNN) | |
CN112464865A (en) | Facial expression recognition method based on pixel and geometric mixed features | |
CN103824054A (en) | Cascaded depth neural network-based face attribute recognition method | |
CN107657233A (en) | Static sign language real-time identification method based on modified single multi-target detection device | |
CN104361316A (en) | Dimension emotion recognition method based on multi-scale time sequence modeling | |
Jayadeep et al. | Mudra: convolutional neural network based Indian sign language translator for banks | |
CN105469041A (en) | Facial point detection system based on multi-task regularization and layer-by-layer supervision neural networ | |
CN107657204A (en) | The construction method and facial expression recognizing method and system of deep layer network model | |
CN110097089A (en) | A kind of sensibility classification method of the documentation level based on attention combination neural net | |
CN110059593B (en) | Facial expression recognition method based on feedback convolutional neural network | |
CN109117817A (en) | The method and device of recognition of face | |
CN108710950A (en) | A kind of image quantization analysis method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200117 ||