CN109635709B - Facial expression recognition method based on significant expression change area assisted learning - Google Patents
- Publication number
- CN109635709B · CN201811490141.0A
- Authority
- CN
- China
- Prior art keywords
- network
- layer
- expression
- auxiliary
- main
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Health & Medical Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Health & Medical Sciences (AREA)
- Evolutionary Biology (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention discloses a facial expression recognition method based on auxiliary learning of regions with significant expression change. An auxiliary learning network is built to extract features from the regions of a facial expression image where expression changes are most significant. The main network and the auxiliary learning network share the parameters of their first 3 feature extraction layers, and the features extracted by the fourth and fifth layers of the auxiliary learning network are fused, by feature weighting, with those of the fourth and fifth layers of the main network, so that the main network can learn the significant-expression-region features from the auxiliary network. A face detection and positioning algorithm is applied to the facial expression data set to obtain face region images for training the main network; the face region images are preprocessed to obtain images containing the regions with significant expression change, which are used to train the auxiliary learning network. As a result, the main network used for expression recognition focuses more attention on the regions with significant expression change and extracts expression features that are more discriminative and robust.
Description
Technical Field
The invention relates to the field of artificial intelligence, in particular to a facial expression recognition method based on the auxiliary learning of a significant expression change area.
Background
In communication between people, facial expressions convey information that reflects the rich inner world of human beings; they are an important carrier of human behavioral and emotional information. With the development of science and technology, facial expression recognition has been deeply researched and widely applied in various fields, and is often used in human-computer interaction.
Facial expression recognition generally comprises the following steps: acquiring facial expression images; cropping and normalizing the original images; extracting expression features; training a model; and classifying the expressions. The key step is expression feature extraction, since the effectiveness of the extracted features determines the level of recognition performance. In the prior art, recognition is generally performed on the whole facial expression image, yet the important information conveyed by a facial expression is carried mainly by changes in the eyes, lips and mouth. Extracting features from the whole facial expression image therefore easily loses part of the expression feature information, so some of the original feature information is lost and the resulting recognition performance is unsatisfactory. In addition, the extracted feature dimension is very large, which is unfavorable for the subsequent classification stage, and the recognition accuracy is not high.
Disclosure of Invention
The invention provides a facial expression recognition method based on the auxiliary learning of regions with significant expression change. It aims to solve the problems of the prior art that, because feature extraction is performed only on the whole facial expression image, the recognition performance is low, the extracted feature dimension is very large, and the subsequent classification stage is hindered; the recognition accuracy is effectively improved through parameter sharing between an auxiliary learning network and a main network.
In order to achieve the purpose of the invention, the technical scheme is as follows: a facial expression recognition method based on the auxiliary learning of a significant expression change area comprises the following steps:
s1: constructing a main network comprising 5 feature extraction layers for extracting facial expression features, inputting the extracted high-level semantic features into a full connection layer, and inputting the features output by the full connection layer into a Softmax classification layer for expression classification to obtain the expression result judged by the network;
s2: constructing an auxiliary learning network comprising 5 layers of feature extraction layers for extracting significant expression features in a human face, inputting the extracted high-level semantic features into a full connection layer, and inputting the features output by the full connection layer into a Softmax classification layer for expression classification operation to obtain an expression result judged by the network;
s3: sharing the parameters of the first 3 feature extraction layers between the main network and the auxiliary learning network; then weighting and fusing the output features of the fourth and fifth layers of the auxiliary learning network with the output features of the fourth and fifth layers of the main network respectively, and inputting the fused features into the main network to continue the extraction of high-level semantic features;
s4: the main network and the auxiliary learning network adopt cross entropy loss functions to judge the network loss, the back propagation of the network is carried out according to the judgment result of the network loss, the parameters of the main network and the auxiliary learning network are adjusted, and the main network and the auxiliary learning network are continuously optimized;
s5: extracting corresponding face area images in each image from the facial expression data set with the facial expression labels by using a face detection and positioning algorithm, and inputting the face area images into a main network for training; meanwhile, preprocessing the face region image to obtain an image with a region with a significant change in expression, and inputting the image into an auxiliary learning network for training; training the main network and the auxiliary learning network in an alternate training mode;
s6: inputting the facial expression image to be recognized into the main network to complete facial expression recognition.
Preferably, each of the 5 layers of feature extraction layers of the main network and the 5 layers of feature extraction layers of the auxiliary learning network includes a convolution layer, a pooling layer, a Batch Normalization layer and a ReLU layer; and 5 layers of feature extraction layers of the main network and 5 layers of feature extraction layers of the auxiliary learning network are used for extracting facial expression features.
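As an illustration only (not the patented implementation; the kernel size, 2x2 pooling window, and single-channel input are assumptions), one such feature extraction layer — convolution, pooling, Batch Normalization, ReLU — can be sketched as:

```python
import numpy as np

def feature_extraction_layer(x, kernel, eps=1e-5):
    """One feature extraction layer as described in the text:
    convolution -> 2x2 max pooling -> batch normalization -> ReLU.
    Minimal single-channel 2-D sketch; shapes are illustrative."""
    kh, kw = kernel.shape
    h, w = x.shape
    # valid convolution
    conv = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(conv.shape[0]):
        for j in range(conv.shape[1]):
            conv[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    # 2x2 max pooling with stride 2
    ph, pw = conv.shape[0] // 2, conv.shape[1] // 2
    pooled = conv[:ph * 2, :pw * 2].reshape(ph, 2, pw, 2).max(axis=(1, 3))
    # batch normalization (here normalized over the single feature map)
    normed = (pooled - pooled.mean()) / np.sqrt(pooled.var() + eps)
    # ReLU
    return np.maximum(normed, 0.0)

x = np.arange(64, dtype=float).reshape(8, 8)   # hypothetical 8x8 input patch
out = feature_extraction_layer(x, np.ones((3, 3)))
```

The 8x8 input shrinks to 6x6 after the 3x3 valid convolution and to 3x3 after pooling; in the patent each of the five layers stacks these four operations.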
Preferably, in step S3, the weighted fusion is calculated as follows:

F_fused^i = α·F_main^i + (1 − α)·F_aux^i

wherein: α is a weighting factor; F_main^i is the feature output of the i-th layer of the main network, F_aux^i is the feature output of the i-th layer of the auxiliary network, and F_fused^i is the fused main-auxiliary feature vector, with i = 4, 5;
the fused features are input into the next layer as output features of the corresponding layer of the main network and continue to be propagated forwards.
Further, α is 0.5, so that the features extracted by the fourth and fifth feature extraction layers of the main network and the auxiliary learning network are each weighted by 0.5.
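The described feature-weighted fusion with α = 0.5 can be sketched as follows; the array shapes are hypothetical, and the form F_fused = α·F_main + (1 − α)·F_aux is inferred from the equal 0.5 weighting stated above:

```python
import numpy as np

def fuse_features(f_main, f_aux, alpha=0.5):
    """Weighted fusion of the i-th layer outputs (i = 4, 5):
    F_fused = alpha * F_main + (1 - alpha) * F_aux."""
    return alpha * f_main + (1.0 - alpha) * f_aux

# With alpha = 0.5, the main and auxiliary features each contribute half.
f_main = np.full((4, 4), 2.0)   # hypothetical layer-4 output of the main network
f_aux = np.zeros((4, 4))        # hypothetical layer-4 output of the auxiliary network
fused = fuse_features(f_main, f_aux)   # every element is 0.5*2.0 + 0.5*0.0 = 1.0
```

The fused map then replaces the main network's layer output and propagates forward, as the text describes.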
Preferably, in step S4, the loss of the cross entropy loss function is calculated as follows:

Loss = −Σ_k y_k·log(p_k)

where y_k is the ground-truth indicator of expression class k and p_k is the corresponding Softmax output probability. The whole network aims to minimize the joint loss function of the main network and the auxiliary learning network:

argmin(Loss_main + Loss_auxiliary)

wherein: Loss_main is the loss function of the main network and Loss_auxiliary is the loss function of the auxiliary learning network.
Preferably, in step S5, the training mode is to train the main network three times, then train the auxiliary learning network once, and repeat this cycle.
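The "train the main network three times, then the auxiliary learning network once" cycle can be expressed as a simple step selector (a sketch; the zero-based step numbering is an assumption):

```python
def network_for_step(step):
    """Return which network is trained at a given step under the
    3:1 alternating schedule: main, main, main, auxiliary, repeat."""
    return "auxiliary" if step % 4 == 3 else "main"

schedule = [network_for_step(s) for s in range(8)]
# -> ['main', 'main', 'main', 'auxiliary', 'main', 'main', 'main', 'auxiliary']
```

Because the first three layers are shared, every auxiliary step also updates part of the main network's parameters.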
Further, in step S5, the image with the significantly changing expression area includes feature data of an eye and an eyebrow area and feature data of a lip and a mouth area.
The invention has the following beneficial effects:
1. according to the method, an auxiliary learning network is built to extract the characteristics of the significant expression change area in the facial expression image, parameters of the main network and the first 3 layers of characteristic extraction layers of the auxiliary learning network are shared, and the characteristics extracted by the fourth layer and the fifth layer of the auxiliary learning network are subjected to characteristic weighting fusion with the fourth layer and the fifth layer of the main network, so that the main network structure can learn the characteristics of some significant expression areas in the auxiliary network.
2. Processing the facial expression data set with the facial expression labels by using a facial detection and positioning algorithm to obtain corresponding facial area images, and training a main network; and preprocessing the face region image to obtain an image with a region with significant expression change, and training the auxiliary learning network, so that the main network for expression recognition can focus more attention on the region with significant expression change, and expression features with more recognizability and robustness are extracted.
Drawings
FIG. 1 is a diagram of the total feature extraction layer of the present invention.
Fig. 2 is a feature extraction level diagram of the nth level feature extraction.
Detailed Description
The invention is described in detail below with reference to the drawings and the detailed description.
Example 1
As shown in fig. 1, a facial expression recognition method based on the auxiliary learning of regions with significant expression change includes:
s1: constructing a main network comprising 5 feature extraction layers for extracting facial expression features, wherein each of the 5 feature extraction layers comprises a convolution layer, a pooling layer, a Batch Normalization layer and a ReLU layer, as shown in FIG. 2; the extracted high-level semantic features are input into fully connected layers (a three-layer fully connected structure is adopted in this embodiment), and the features output by the fully connected layers are input into a Softmax classification layer for expression classification, obtaining the expression result judged by the network.
S2: constructing an auxiliary learning network comprising 5 feature extraction layers for extracting the significant expression features of the face, wherein each of the 5 feature extraction layers likewise comprises a convolution layer, a pooling layer, a Batch Normalization layer and a ReLU layer, as shown in FIG. 2; the extracted high-level semantic features are input into fully connected layers (again a three-layer fully connected structure), and the features output by the fully connected layers are input into a Softmax classification layer for expression classification, obtaining the expression result judged by the network.
S3: in order to enable the main network to learn the features of the significant expression regions in the auxiliary learning network, the parameters of the first 3 feature extraction layers are shared between the main network and the auxiliary learning network; in order to let the two networks still learn different features, the output features of the fourth and fifth layers of the auxiliary learning network are weighted and fused with the output features of the fourth and fifth layers of the main network respectively, and the fused features are then input into the main network to continue the extraction of high-level semantic features;
S4: in order to improve the facial expression recognition accuracy, both the main network and the auxiliary learning network adopt a cross entropy loss function to evaluate the network loss, back propagation is performed according to the evaluated loss, the network parameters are adjusted, and the main network and the auxiliary learning network are continuously optimized;
S5: extracting the corresponding face region image from each image of the facial expression data set with facial expression labels by using a face detection and positioning algorithm, and inputting the face region images into the main network for training; meanwhile, preprocessing the face region images to obtain images with the regions of significant expression change, and inputting these into the auxiliary learning network for training; the main network and the auxiliary learning network are trained in an alternate training mode;
In this embodiment, the CK+ data set is adopted as the facial expression data set: the CK+ sequences are split into frames, and the last 3 frames of each sequence are taken as the labelled expression data set; then the Adaboost face detection and positioning algorithm in OpenCV is used to extract the corresponding face region image from each image of the collected facial expression data set CK+, which removes, to a certain extent, the influence of background noise on facial expression recognition.
S6: inputting the facial expression image to be recognized into the main network to complete facial expression recognition.
In step S3 of this embodiment, the weighted fusion is calculated as follows:

F_fused^i = α·F_main^i + (1 − α)·F_aux^i

wherein: α is a weighting factor; F_main^i is the feature output of the i-th layer of the main network, F_aux^i is the feature output of the i-th layer of the auxiliary network, and F_fused^i is the fused main-auxiliary feature vector, with i = 4, 5;
the fused features are input into the next layer as output features of the corresponding layer of the main network and continue to be propagated forwards.
Here α is 0.5, so that the features extracted by the fourth and fifth feature extraction layers of the main network and the auxiliary learning network are each weighted by 0.5.
In step S4 of this embodiment, the loss of the cross entropy loss function is calculated as follows:

Loss = −Σ_k y_k·log(p_k)

where y_k is the ground-truth indicator of expression class k and p_k is the corresponding Softmax output probability. The whole network aims to minimize the joint loss function of the main network and the auxiliary learning network:

argmin(Loss_main + Loss_auxiliary)

wherein: Loss_main is the loss function of the main network and Loss_auxiliary is the loss function of the auxiliary learning network.
The parameter information of the main network is continuously adjusted through optimization of the loss function, so that the main network for expression recognition focuses more attention on the regions with significant expression change and extracts expression features with better discriminability and robustness.
In step S5 of this embodiment, the regions with significant expression change mainly comprise the eye and eyebrow region and the region near the lips and mouth; the removed region is the area near the nose, because its contribution to expression recognition is very small. The retained upper and lower parts are spliced back into a complete face image region; after this preprocessing, the image with the significant expression change regions used as the auxiliary network input is obtained, containing the feature data of the eye and eyebrow region and of the lip and mouth region.
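The preprocessing described above — keep the eye/eyebrow band and the lip/mouth band, drop the nose band between them, and splice the remaining parts back into one image — can be sketched as follows; the fractional band boundaries are hypothetical:

```python
import numpy as np

def salient_region_image(face, eye_frac=0.45, mouth_frac=0.70):
    """Keep the upper band (eyes and eyebrows) and the lower band (lips and
    mouth), remove the nose band in between, and splice the two parts back
    into one image. The fractional cut points are illustrative assumptions,
    not values from the patent."""
    h = face.shape[0]
    upper = face[: int(h * eye_frac)]     # eye and eyebrow region
    lower = face[int(h * mouth_frac):]    # lip and mouth region
    return np.vstack([upper, lower])

face = np.arange(100 * 60).reshape(100, 60)   # hypothetical 100x60 face crop
salient = salient_region_image(face)           # 45 + 30 = 75 rows remain
```

The spliced image is what the auxiliary learning network receives as input during training.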
Because the main network and the auxiliary learning network share the parameters of the first 3 layers, the training adopts an alternate training strategy: the main network is trained three times, then the auxiliary learning network is trained once, and this cycle is repeated. In this embodiment, the face region image extracted from each image of the collected facial expression data set CK+ is input into the main network for training; meanwhile, the face region image is preprocessed to obtain the image with the significant expression change regions, which is input into the auxiliary learning network for training.
The main network and the auxiliary learning network form a recognition model; inputting the image to be recognized into this model realizes facial expression recognition.
It should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention, and are not intended to limit the embodiments of the present invention. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.
Claims (4)
1. A facial expression recognition method based on the auxiliary learning of regions with significant expression change, characterized in that the recognition method comprises the following steps:
s1: constructing a main network comprising 5 layers of feature extraction layers for extracting facial expression features, inputting the extracted high-level semantic features into a full connection layer, and inputting the features output by the full connection layer into a Softmax classification layer for expression classification operation to obtain an expression result judged by the network;
s2: constructing an auxiliary learning network comprising 5 layers of feature extraction layers for extracting significant expression features in the human face, inputting the extracted high-level semantic features into a full connection layer, and inputting the features output by the full connection layer into a Softmax classification layer for expression classification operation to obtain an expression result judged by the network;
s3: sharing the parameters of the first 3 feature extraction layers between the main network and the auxiliary learning network; then weighting and fusing the output features of the fourth and fifth layers of the auxiliary learning network with the output features of the fourth and fifth layers of the main network respectively, and inputting the fused features into the main network to continue the extraction of high-level semantic features;
s4: the main network and the auxiliary learning network both adopt a cross entropy loss function to judge the network loss, perform back propagation of the network according to the judgment result of the network loss, adjust the parameters of the main network and the auxiliary learning network, and continuously optimize the main network and the auxiliary learning network;
s5: respectively extracting the corresponding face region image from each image of the facial expression data set with facial expression labels by using a face detection and positioning algorithm, and inputting the face region images into the main network for training; meanwhile, preprocessing the face region images to obtain images with the regions of significant expression change, and inputting these into the auxiliary learning network for training; training the main network and the auxiliary learning network in an alternate training mode;
s6: inputting the facial expression image to be recognized into the main network to complete facial expression recognition;
in step S3, the weighted fusion is calculated as follows:

F_fused^i = α·F_main^i + (1 − α)·F_aux^i

wherein: α is a weighting factor; F_main^i is the feature output of the i-th layer of the main network, F_aux^i is the feature output of the i-th layer of the auxiliary network, and F_fused^i is the fused main-auxiliary feature vector, with i = 4, 5;
the fused features are input into the next layer for continuous forward propagation as output features of the corresponding layer of the main network;
the α is 0.5, so that the features extracted by the fourth and fifth layers of the main network and the auxiliary learning network are each weighted by 0.5;
in step S4, the loss of the cross entropy loss function is calculated as follows:

Loss = −Σ_k y_k·log(p_k)

where y_k is the ground-truth indicator of expression class k and p_k is the corresponding Softmax output probability. The whole network aims to minimize the joint loss function of the main network and the auxiliary learning network:

argmin(Loss_main + Loss_auxiliary)

wherein: Loss_main is the loss function of the main network and Loss_auxiliary is the loss function of the auxiliary learning network.
2. The facial expression recognition method based on the significant expression change region aided learning of claim 1, wherein: each of the 5 layers of feature extraction layers of the main network and the 5 layers of feature extraction layers of the auxiliary learning network comprises a convolution layer, a pooling layer, a Batch Normalization layer and a ReLU layer; and 5 layers of feature extraction layers of the main network and 5 layers of feature extraction layers of the auxiliary learning network are used for extracting facial expression features.
3. The facial expression recognition method based on the significant expression change region aided learning of claim 1, wherein: in step S5, the alternate training mode is to train the main network three times, then train the auxiliary learning network once, and repeat this cycle.
4. The facial expression recognition method based on the significant expression change region aided learning of claim 1, wherein: in step S5, the image with the significant expression change region comprises the feature data of the eye and eyebrow region and the feature data of the lip and mouth region.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811490141.0A CN109635709B (en) | 2018-12-06 | 2018-12-06 | Facial expression recognition method based on significant expression change area assisted learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109635709A CN109635709A (en) | 2019-04-16 |
CN109635709B true CN109635709B (en) | 2022-09-23 |
Family
ID=66071879
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811490141.0A Active CN109635709B (en) | 2018-12-06 | 2018-12-06 | Facial expression recognition method based on significant expression change area assisted learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109635709B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112307889B (en) * | 2020-09-22 | 2022-07-26 | 北京航空航天大学 | Face detection algorithm based on small auxiliary network |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2015018372A (en) * | 2013-07-10 | 2015-01-29 | NEC Corporation | Expression extraction model learning device, expression extraction model learning method and computer program |
CN107292256A (en) * | 2017-06-14 | 2017-10-24 | Xidian University | Facial expression recognition method using a deep convolutional wavelet neural network based on an auxiliary task |
CN107316061A (en) * | 2017-06-22 | 2017-11-03 | South China University of Technology | An imbalanced-class ensemble method based on deep transfer learning |
CN107423727A (en) * | 2017-08-14 | 2017-12-01 | Henan Institute of Engineering | Neural-network-based method for recognizing complex facial expressions |
CN108921024A (en) * | 2018-05-31 | 2018-11-30 | Southeast University | Expression recognition method based on facial landmark information and dual-network joint training |
Non-Patent Citations (1)
Title |
---|
Attributes for Improved Attributes: A Multi-Task Network Utilizing Implicit and Explicit Relationships for Facial Attribute Classification; Emily Hand et al.; Thirty-First AAAI Conference on Artificial Intelligence; 2017-02-12; Vol. 31; entire text * |
Also Published As
Publication number | Publication date |
---|---|
CN109635709A (en) | 2019-04-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108829677B (en) | Multi-modal attention-based automatic image title generation method | |
CN111340814B (en) | RGB-D image semantic segmentation method based on multi-mode self-adaptive convolution | |
CN113158862B (en) | Multitasking-based lightweight real-time face detection method | |
CN113221639A (en) | Micro-expression recognition method for representative AU (AU) region extraction based on multitask learning | |
CN110059598A (en) | The Activity recognition method of the long time-histories speed network integration based on posture artis | |
CN111967272B (en) | Visual dialogue generating system based on semantic alignment | |
CN114360005B (en) | Micro-expression classification method based on AU region and multi-level transducer fusion module | |
CN107016046A (en) | Intelligent robot dialogue method and system based on visual display | |
CN112330718B (en) | CNN-based three-level information fusion visual target tracking method | |
CN109558805A (en) | Human bodys' response method based on multilayer depth characteristic | |
CN110175248A (en) | A kind of Research on face image retrieval and device encoded based on deep learning and Hash | |
CN109712108A (en) | A visual positioning method based on a diverse discriminative candidate-box generation network | |
CN112669343A (en) | Zhuang minority nationality clothing segmentation method based on deep learning | |
CN106127112A (en) | Data Dimensionality Reduction based on DLLE model and feature understanding method | |
CN109859222A (en) | Edge extracting method and system based on cascade neural network | |
CN111401116B (en) | Bimodal emotion recognition method based on enhanced convolution and space-time LSTM network | |
CN110633689B (en) | Face recognition model based on semi-supervised attention network | |
CN116129289A (en) | Attention edge interaction optical remote sensing image saliency target detection method | |
CN110188791B (en) | Visual emotion label distribution prediction method based on automatic estimation | |
CN109635709B (en) | Facial expression recognition method based on significant expression change area assisted learning | |
CN114764941A (en) | Expression recognition method and device and electronic equipment | |
CN110472655A (en) | A kind of marker machine learning identifying system and method for border tourism | |
CN111901610B (en) | Parallel image description method based on multilayer encoder | |
Shao et al. | DCMSTRD: End-to-end Dense Captioning via Multi-Scale Transformer Decoding | |
CN109583406B (en) | Facial expression recognition method based on feature attention mechanism |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||