CN111814713A - Expression recognition method based on BN parameter transfer learning - Google Patents
Expression recognition method based on BN parameter transfer learning
- Publication number: CN111814713A
- Application number: CN202010682216.6A
- Authority: CN (China)
- Prior art keywords: facial expression; facial; expression recognition; source; expression
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06V40/174 Facial expression recognition (G Physics; G06 Computing, calculating or counting; G06V Image or video recognition or understanding; G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data; G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians, body parts, e.g. hands; G06V40/16 Human faces, e.g. facial parts, sketches or expressions)
- G06F18/24 Classification techniques (G06F Electric digital data processing; G06F18/00 Pattern recognition; G06F18/20 Analysing)
- G06V10/50 Extraction of image or video features by performing operations within image blocks, by using histograms, e.g. histogram of oriented gradients [HoG], by summing image-intensity values, or by projection analysis (G06V10/00 Arrangements for image or video recognition or understanding; G06V10/40 Extraction of image or video features)
- G06V40/168 Feature extraction; face representation (G06V40/00; G06V40/10; G06V40/16 Human faces)
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Multimedia (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Human Computer Interaction (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to the technical field of target recognition and discloses an expression recognition method based on BN (Bayesian network) parameter transfer learning. By exploiting the transfer-learning principle of applying knowledge learned in one field to a different but related field, the method effectively alleviates the shortage of facial-expression modeling samples caused by factors such as illumination and shooting angle, reduces the impact of small sample sizes on parameter-learning accuracy and recognition results, and can be widely applied in noisy, uncertain environments where large amounts of face target data are hard to acquire.
Description
Technical Field
The invention relates to the field of target recognition in artificial intelligence, image engineering, and management science and engineering, and in particular to an expression recognition method based on BN parameter transfer learning.
Background
A Bayesian Network (BN) has practical value in uncertainty modeling and decision support. BN parameter learning is the process of obtaining the conditional probability distributions of all network nodes from sample data and prior knowledge when the structure is known.
Once a problem domain has been converted into a BN model, inference tasks can be completed using BN theory. Among the exact BN inference algorithms in current use, the junction tree algorithm is one of the fastest and most widely applied. The BN organically combines results from probability theory and graph theory, is an effective method for reasoning under uncertain and incomplete information, and is therefore an ideal tool for facial expression recognition.
Parameter learning of a BN model refers to estimating the parameters of the model given its structure. Maximum Likelihood Estimation (MLE) estimates model parameters from observed data, i.e. the model is fixed and the parameters are unknown; when data are sufficient, MLE typically achieves good parameter-learning accuracy. Maximum A Posteriori (MAP) estimation is a point estimate of a quantity that is difficult to observe, obtained from empirical data; like MLE it is a point estimate, but it additionally incorporates a prior distribution over the quantity to be estimated.
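For concreteness, the following is a minimal Python sketch (not taken from the patent) of the two estimators for a single discrete BN node, assuming a symmetric Dirichlet prior for MAP; the count table and prior strength alpha are hypothetical:

```python
import numpy as np

def mle_cpd(counts):
    """MLE of a conditional distribution from observed state counts
    (one row per parent configuration, one column per child state)."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=1, keepdims=True)

def map_cpd(counts, alpha=2.0):
    """MAP estimate with a symmetric Dirichlet(alpha) prior: the mode
    of the posterior, i.e. MLE on pseudo-count-augmented data."""
    counts = np.asarray(counts, dtype=float)
    post = counts + alpha - 1.0        # Dirichlet posterior mode
    return post / post.sum(axis=1, keepdims=True)

# Hypothetical counts for one binary AU node with a 6-state parent:
counts = [[18, 2], [1, 9], [4, 6], [7, 3], [5, 5], [2, 8]]
print(mle_cpd(counts))   # sharp estimates, unreliable for sparse rows
print(map_cpd(counts))   # the prior smooths rows with few observations
```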
Facial expression recognition is an interdisciplinary field spanning artificial intelligence, neuroscience, and computer science, with wide applications in psychological analysis, clinical medicine, vehicle monitoring, and business. Facial expressions are the various emotions conveyed by changes in the facial, eye, and mouth muscles. Through intensive research on facial expressions, Ekman further systematized the description of human facial expressions and proposed the Facial Action Coding System (FACS), which describes expressions by analyzing the motion characteristics of Action Units (AU) (see: P. Ekman, W. V. Friesen, J. C. Hager, Facial Action Coding System, A Human Face, Salt Lake City, UT, 2002).
The existing literature proposes a facial expression recognition method based on Bayesian network modeling under small data sets, addressing the scarcity of feature samples obtained during facial expression recognition (see: Guo Wenqiang et al., Facial expression recognition based on BN modeling under small data sets [J], Science Technology and Engineering, 2018, 18(35): 179-183). That method first extracts geometric and HOG features of the facial expression images and forms an Action Unit (AU) label sample set through feature fusion, normalization and related processing; it then proposes a BN structure for facial expression recognition, converts qualitative expert experience into a constraint set on the BN conditional probabilities, and introduces a convex-optimization maximization solution to estimate the BN model parameters. However, expert experience is often highly subjective, which is detrimental to BN parameter estimation.
The principle of transfer learning is to apply knowledge and experience from one field to other scenarios: labeled samples in one or more task fields (the source domains) are trained and analyzed to obtain a parameter model of those tasks, and that model is then applied to a related task field (the target domain) to classify the data of the new field.
In transfer learning, when two tasks are correlated and share commonality, the data of the source-domain and target-domain tasks need not be processed separately; the experience and knowledge of pattern recognition gained on one task's data is used to process the other task's data. BN parameter learning can thereby avoid the inaccuracy caused by relying on a single task's data, and in particular can avoid the influence of subjective expert experience.
Disclosure of Invention
The invention provides an expression recognition method based on BN parameter transfer learning, which can solve the problems in the prior art.
The invention provides an expression recognition method based on BN parameter transfer learning, which comprises the following steps:
S1, acquiring the facial activity units AU;
S2, judging whether the facial expression BN has been modeled;
judging whether the modeling flag BN_Flag is 'true' (its initial value is set to 'false'); if BN_Flag is 'true', the BN has been modeled, so the process jumps to S7 and enters the recognition flow; otherwise S3 is executed and the modeling flow begins;
S3, obtaining the sample AUs required for BN modeling;
S31, determining the number and types of facial expression recognition categories E_num;
S32, extracting the required sample data set of facial activity units AU for each expression category;
S4, determining the facial expression recognition target/source-domain network BN model structure diagrams;
using the acquired sample data set of the facial activity units AU and the relationship between facial expressions and activity units AU as prior information, establishing the target-network facial expression recognition BN model structure diagram G1 and the source-domain network facial expression recognition BN model structure diagram G2;
S5, learning the source-domain BN parameters;
S51, obtaining the sample data set M(S_n) of each source-domain facial activity unit AU;
S52, calculating the source weight coefficient k(n), i.e. the share of each source domain's sample count in the total source-domain sample count, as shown in formula (1):
k(n) = M(S_n) / Σ_m M(S_m)   (1)
where the source weight coefficients sum to 1 and each takes a real value in [0,1];
S53, obtaining the parameters θ_ni of each source-domain BN model by the Maximum Likelihood Estimation (MLE) method, where i indexes the ith child node of the source-domain BN model;
S54, fusing the parameters θ_ni of the source-domain BN models to obtain the total source-domain BN parameters θ_Si:
θ_Si = Σ_n k(n)·θ_ni   (2);
S6, acquiring the target-domain facial expression BN parameters;
S61, obtaining the parameters θ_Ti of the initial target-domain BN model by the maximum a posteriori probability (MAP) estimation method, where i indexes the ith child node of the target-domain BN model;
S62, calculating the final target-domain facial expression BN parameters θ_i according to the weighting factors:
θ_i = α1·θ_Ti + α2·θ_Si   (3)
where α1 and α2 are weighting factors and α1 + α2 = 1;
S63, setting BN_Flag to 'true', completing the BN modeling, and returning to S1;
S7, recognizing facial expressions;
S71, setting the facial expression attribute probability threshold Ψ;
S72, entering the facial expression recognition evidence AU into the constructed BN model and performing BN inference with the junction tree inference algorithm to obtain the facial expression attribute probability Ψ';
S73, judging the facial expression;
if the facial expression attribute probability Ψ' is greater than or equal to the threshold Ψ, outputting the facial expression attribute, i.e. the facial expression recognition result; otherwise re-acquiring a new AU data set.
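The parameter transfer in steps S5 and S6 amounts to a weighted average of parameter tables. Below is a minimal Python sketch of formulas (1)-(3), under the assumption that each θ is stored as a NumPy array of one common shape; the sample sizes, parameter values and α1 are illustrative only:

```python
import numpy as np

def source_weights(sample_sizes):
    """Formula (1): weight each source domain by its share of the
    total source sample count; weights lie in [0,1] and sum to 1."""
    sizes = np.asarray(sample_sizes, dtype=float)
    return sizes / sizes.sum()

def fuse_sources(thetas, k):
    """Formula (2): theta_S = sum_n k(n) * theta_n over the
    source-domain parameter tables."""
    return sum(kn * th for kn, th in zip(k, thetas))

def target_parameters(theta_T, theta_S, alpha1=0.7):
    """Formula (3): convex combination of the target MAP estimate
    and the fused source estimate, with alpha1 + alpha2 = 1."""
    return alpha1 * theta_T + (1.0 - alpha1) * theta_S

k = source_weights([300, 200])                  # two hypothetical sources
theta_S = fuse_sources([np.array([0.8, 0.2]),
                        np.array([0.6, 0.4])], k)
theta = target_parameters(np.array([0.7, 0.3]), theta_S)
```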
The specific steps of extracting the facial activity units AU for each expression category in step S32 include:
S321, obtaining the geometric features of the facial expression from the facial expression image with the CLM algorithm;
S322, extracting the texture features of the facial expression from the facial expression image with the HOG algorithm;
S323, performing feature fusion and normalization on the geometric and texture features of the facial expression to obtain the fused facial expression features;
S324, classifying the fused facial expression features with a Support Vector Machine (SVM) to obtain the data set M(t) of the target-domain facial activity unit AU and the data sets M(S_n), n = 1, 2, ..., q, of the source-domain facial activity units AU, where q is the number of source domains and a natural number.
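As a sketch of the S321-S324 pipeline, the HOG and SVM stages can be reproduced with scikit-image and scikit-learn; the landmark input, HOG parameters and one-classifier-per-AU layout are assumptions for illustration, not the OpenFace-based implementation the embodiment uses:

```python
import numpy as np
from skimage.feature import hog
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def fused_features(gray_face, landmarks):
    """Concatenate geometric features (flattened landmark coordinates,
    S321) with HOG texture features (S322), then fuse them (S323)."""
    texture = hog(gray_face, orientations=9,
                  pixels_per_cell=(8, 8), cells_per_block=(2, 2))
    return np.concatenate([np.ravel(landmarks), texture])

def train_au_classifiers(X, Y):
    """S324: one SVM per AU label (occurrence vs. non-occurrence).
    X: fused feature matrix; Y: (n_samples, n_AUs) binary labels."""
    scaler = StandardScaler().fit(X)        # normalization step
    Xs = scaler.transform(X)
    return scaler, [SVC(kernel="linear").fit(Xs, Y[:, j])
                    for j in range(Y.shape[1])]
```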
The specific step of obtaining the geometric features of the facial expression in step S321 includes:
S3211, obtaining the feature-point localization information of the expression image;
the feature points are located with the CLM facial feature point localization algorithm, and the geometric features of the facial expression are extracted.
The specific step of determining the facial expression recognition target/source network BN model structure diagram in step S4 includes:
S41, determining the BN nodes for facial expression recognition;
determining the parent node and child nodes of the BN;
S42, determining the directed acyclic graph of the facial expression recognition BN;
connecting the parent node to the child nodes of the BN in turn with directed edges, thereby establishing the BN model structure diagram G1 of the target network for facial expression recognition and the BN model structure diagram G2 of the source-domain network for facial expression recognition.
In step S31, the number of facial expression recognition categories E_num is 6, comprising the six expressions "happy", "surprised", "fear", "angry", "disgust" and "sad".
Compared with the prior art, the invention has the beneficial effects that:
(1) Transfer learning addresses the low recognition accuracy that arises when the BN training data set is insufficient (the small-data problem) in target-domain facial expression recognition. Transfer learning acquires data and information from a nearby domain to compensate for insufficient data: a learning model for the target domain is built from knowledge in the source-domain data, improving recognition accuracy.
(2) The influence of the subjectivity of expert experience on BN parameter-learning accuracy is avoided. Knowledge learned in the source task is transferred to the target task; the learning algorithms in the source task are classical sufficient-sample algorithms such as MLE (maximum likelihood estimation) and MAP (maximum a posteriori estimation), and the learning mechanism is data-driven, so subjective expert experience does not affect the accuracy of BN parameter learning.
Drawings
Fig. 1 is a flowchart of an expression recognition method based on a migration mechanism according to the present invention.
Fig. 2 is a diagram of a BN model structure of a target domain for facial expression recognition provided by the present invention.
Fig. 3 is a diagram of a BN model structure of a facial expression recognition source domain according to the present invention.
Fig. 4 is a flow chart of BN parameter learning of facial expressions in the source network according to the present invention.
Fig. 5 is a flow chart of BN parameter learning of facial expressions in a target network according to the present invention.
Fig. 6 shows six basic expressions of the CK data set according to the embodiment of the present invention.
Fig. 7 shows six basic expressions of FER2013 data set according to an embodiment of the present invention.
Fig. 8 is a facial feature point location diagram according to an embodiment of the present invention.
Detailed Description
An embodiment of the present invention will be described in detail below with reference to fig. 1-8, but it should be understood that the scope of the present invention is not limited to the embodiment.
The embodiment of the invention provides an expression recognition method based on BN parameter transfer learning: a facial expression recognition BN model structure is constructed from the relationship between facial expressions and AU labels; the final facial expression recognition BN parameters are obtained through a transfer mechanism, combining the BN parameters computed from the face source-domain data set with the initial BN parameters of the face target-domain data set; and facial expressions are recognized by BN inference using an inference algorithm from BN theory. The invention applies knowledge learned in one field to different but related fields through the transfer learning mechanism, effectively alleviates the shortage of facial-expression modeling samples caused by factors such as illumination and shooting angle, reduces the impact of small sample sizes on parameter-learning accuracy and recognition results, and can be widely applied in noisy, uncertain environments where large amounts of face target data are hard to acquire.
As shown in fig. 1, the expression recognition method based on BN parameter transfer learning provided by the present invention includes the following steps:
S1: acquiring the facial Activity Units (AU);
the meaning of the facial activity units AU is given in: Valstar M.F., Almaev T., Girard J.M., et al., FERA 2015 - second Facial Expression Recognition and Analysis challenge [C] // IEEE International Conference and Workshops on Automatic Face and Gesture Recognition, 2015: 1-8, where, for example, action unit AU6 denotes the occurrence or non-occurrence of "upward curving of the lower mouth contour", and AU25 the occurrence or non-occurrence of "parted lips with exposed teeth";
optionally, using the OpenFace open-source tool (see: Baltrusaitis T., Robinson P., Morency L.P., OpenFace: An open source facial behavior analysis toolkit [C] // 2016 IEEE Winter Conference on Applications of Computer Vision (WACV), IEEE, 2016), the CLM algorithm is applied to the input picture to obtain the geometric features of the facial expression and the HOG algorithm extracts its HOG features; feature fusion and normalization yield the fused facial expression features, which a Support Vector Machine (SVM) classifies to obtain the data set M(t) of the target-domain facial Activity Unit (AU) and the data sets M(S_n), n = 1, 2, ..., q, of the source-domain facial Activity Units (AU), where q is the number of source domains and a natural number.
S2, judging whether the facial expression BN has been modeled.
It is judged whether the BN structure modeling flag BN_Flag is "true". The initial value of BN_Flag is "false" (BN_Flag = 0). If modeling has been completed, i.e. BN_Flag = 1 ("true"), the process proceeds to S7 and facial expression recognition is performed; otherwise S3 is executed and the BN modeling process begins;
S3, obtaining the sample AUs required for BN modeling.
S31, determining the number of facial expression recognition categories E_num; in this example E_num = 6, corresponding to the six expressions "happy", "surprised", "fear", "angry", "disgust" and "sad".
S32, extracting a face activity unit AU according to each type of expression;
optionally, the geometric and texture features of the facial expression are obtained from the facial expression image using the CLM and HOG algorithms; feature fusion and normalization yield the fused features, which a Support Vector Machine (SVM) classifies to obtain the data set M(t) of the target-domain facial Activity Unit (AU) and the data sets M(S_n), n = 1, 2, ..., q, of the source-domain facial Activity Units (AU), where q is the number of source domains and a natural number.
S4, determining a BN model structure diagram of a target (source) network for facial expression recognition;
S41: determining the BN nodes for facial expression recognition.
Optionally, the "Expression" node is selected as the parent node (node to be queried) of the facial expression recognition BN, and the nodes "AU1", "AU2", "AU4", "AU5", "AU6", "AU7", "AU9", "AU12", "AU15", "AU17", "AU23" and "AU25" as the child nodes (evidence nodes) of the facial expression recognition BN.
S42: determining the directed acyclic graph of the facial expression recognition BN.
The parent node is connected to the child nodes in turn with directed edges, i.e. "Expression" serves as the tail of 12 directed edges whose heads point to AU1, AU2, AU4, AU5, AU6, AU7, AU9, AU12, AU15, AU17, AU23 and AU25. The target facial expression recognition BN model structure G1 and the source-domain facial expression recognition BN model structure G2 are thereby established;
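For illustration, this star-shaped DAG can be written down directly; the sketch below uses the pgmpy library, which is an assumption of this example and not a toolkit the patent prescribes:

```python
from pgmpy.models import BayesianNetwork

AUS = ["AU1", "AU2", "AU4", "AU5", "AU6", "AU7",
       "AU9", "AU12", "AU15", "AU17", "AU23", "AU25"]

# Directed edges run from the query node to the 12 AU evidence nodes.
G1 = BayesianNetwork([("Expression_T", au) for au in AUS])  # target domain
G2 = BayesianNetwork([("Expression_S", au) for au in AUS])  # source domain
```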
S5: learning the source-domain BN parameters;
S51: calculating the small-sample threshold C(n) of each source-domain network from the BN model of the facial expression recognition source network (see: D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques, Adaptive Computation and Machine Learning, MIT Press, 2009);
S52: determining whether the data set M(S_n) of the source-domain facial Activity Unit (AU) is larger than C(n); if so, executing S53, otherwise continuing to acquire M(S_n) by the method of S1;
S53: calculating the source weight coefficient k(n) of each source domain's sample count in the total source-domain sample count, as shown in formula (1) above;
the source weight coefficients sum to 1 and each takes a real value in [0,1], where M(S_n), n = 1, 2, ..., q, is the AU sample set of each source domain;
S54: learning the parameters θ_ni of each source-domain BN model by the Maximum Likelihood Estimation (MLE) method, where i indexes the ith child node of the source-domain BN model;
S55: fusing the source-domain BN parameters θ_ni:
θ_Si = Σ_n k(n)·θ_ni   (2)
S6: acquiring BN parameters of the facial expression of a target domain;
S61, learning the parameters θ_Ti of the initial target-domain BN model by the maximum a posteriori probability (MAP) method;
S62, calculating the final target-domain BN parameters θ_i according to the weighting factors:
θ_i = α1·θ_Ti + α2·θ_Si   (3)
where α1 and α2 are weighting factors and α1 + α2 = 1;
optionally, α1 = 0.7.
S63, setting BN_Flag to "true", indicating that BN modeling is complete, and returning to S1 to acquire facial expression recognition evidence AU;
since BN_Flag is now "true", after returning to S1 to acquire the facial expression AU the model is not rebuilt; instead facial expression recognition is performed in S7.
S7, recognizing the facial expression;
S71, setting the facial expression attribute probability threshold Ψ;
optionally, Ψ = 0.7.
S72, performing BN inference with an inference algorithm from BN theory to obtain the facial expression attribute probability Ψ';
optionally, the inference algorithm is the junction tree inference algorithm.
S73, judging the facial expression: if the facial expression attribute probability Ψ' is greater than or equal to the threshold Ψ, the facial expression attribute, i.e. the facial expression recognition result, is output; otherwise a new AU evidence sample is reacquired.
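A sketch of this recognition loop, again assuming pgmpy, whose BeliefPropagation engine performs junction-tree (clique-tree) inference; the node names follow the structure sketch above and the evidence shown is hypothetical:

```python
from pgmpy.inference import BeliefPropagation

def recognize(model, au_evidence, psi=0.7):
    """S71-S73: query the expression posterior given AU evidence and
    return the attribute that reaches the threshold psi, or None so
    the caller can re-acquire a new AU evidence sample."""
    bp = BeliefPropagation(model)           # builds the junction tree
    post = bp.query(variables=["Expression_T"], evidence=au_evidence)
    best = int(post.values.argmax())
    if post.values[best] >= psi:
        return post.state_names["Expression_T"][best]
    return None

# Hypothetical usage once G1 carries fitted CPDs:
# recognize(G1, {"AU6": 1, "AU12": 1, "AU25": 0})
```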
Examples
The data in this embodiment come from two sources:
(1) CK database. The subjects in the available portion of the CK database (see: Swati Nigam, Rajiv Singh, A. K. Misra, Efficient facial expression recognition using histogram of oriented gradients in wavelet domain, Multimedia Tools and Applications, Vol. 77(21) (2018) 28725-28747) are 97 college students enrolled in a psychology course, aged 18 to 30 years; 65% are women, 15% African-American, and 3% Asian or Hispanic. The data set contains the six basic expressions and is one of the most widely used facial expression data sets, as shown in fig. 6.
(2) FER2013 data set
The FER2013 facial expression data set (see: Hong-Wei Ng, Viet Dung Nguyen, Vassilios Vonikakis, Stefan Winkler, Deep learning for emotion recognition on small datasets using transfer learning, 2015) consists of 35886 facial expression pictures, comprising a test set, a public validation set and a private validation set. Each picture is a grayscale image of fixed size 48×48, as shown in fig. 7.
The facial expression recognition target database CK is plentiful in images, but its number of participants (97) and subject diversity are quite limited; moreover, it was captured in a laboratory environment and cannot fully simulate real scenes. To improve the robustness, i.e. recognition accuracy, of the algorithm, a transfer learning mechanism is adopted: knowledge learned from tasks on the source-domain database FER2013 is used to enhance expression classification on the target database CK.
In this embodiment the CK library serves as the target-domain data set of the experiment: 240 images (the last frames of the image sequences) are selected per expression, of which 120 expression images form the training set for parameter learning; the BN parameters are obtained by the BN parameter transfer learning method, completing expression recognition, and the remaining 120 CK images form the test set. The FER2013 database serves as the source domain: 300 images are selected as the source-domain data set of the experiment, with 300 groups of AU sample data per expression used for parameter-model transfer.
In this embodiment of the transfer-mechanism-based expression recognition method, the source-domain data set M(S_1) = 300; after BN parameter transfer with the target-domain training data set M(t) = 120, the final facial expression BN parameters are obtained. The test data set comprises 120 groups;
the method comprises the following steps:
S1: acquiring the facial Activity Units (AU);
optionally, using the OpenFace open-source tool (see: Baltrusaitis T., Robinson P., Morency L.P., OpenFace: An open source facial behavior analysis toolkit [C] // 2016 IEEE Winter Conference on Applications of Computer Vision (WACV), IEEE, 2016), the CLM algorithm is applied to the input picture to obtain the geometric features of the facial expression, the HOG algorithm extracts the HOG features, and feature fusion and normalization yield the fused facial expression features, which a Support Vector Machine (SVM) classifies to obtain the target-domain facial Activity Unit (AU) data set M(t) = 120 and the source-domain data sets M(S_n), n = 1, 2, ..., q, where q is the number of source domains, a natural number. In this example q = 1, i.e. there is a single source-domain data set M(S_1).
S2, it is determined whether the facial expression BN has been modeled.
When the method is run for the first time, BN_Flag has its initial value 0, i.e. the BN has not yet been modeled, so S3 is executed and BN modeling is carried out.
S3, obtaining the sample AUs needed for BN modeling;
S31, determining the number of facial expression recognition categories E_num; in this example E_num = 6, corresponding to the six expressions "happy", "surprised", "fear", "angry", "disgust" and "sad".
S32, extracting the facial activity units AU for each type of expression.
Using the OpenFace open-source tool, the CLM algorithm is applied to the input expression image to obtain the geometric features of the facial expression, and the HOG algorithm extracts the texture features (see: Guo Wenqiang et al., Facial expression recognition based on BN modeling under small data sets [J], Science Technology and Engineering, 2018, 18(35): 179-183):
S321: obtaining the feature-point localization information of the expression image. Optionally, feature points are located with the CLM facial feature point localization algorithm and the geometric features of the facial expression are extracted; as shown in fig. 8, the coordinates in the image of the 68 feature points of a target-domain expression image are
{260.892 243.672;263.012 269.343;266.038 293.998;270.955 317.406
281.108 340.161;299.33 357.192;326.176 367.506;353.233 376.735
……
407.442 305.409;387.881 304.42;379.841 306.079;372.826 306.621}
S322: obtaining the AU sample data set. Optionally, the expression image features are fused and normalized and then classified with a Support Vector Machine (SVM), yielding the AU sample data sets M(S_1) for the source domain and M(t) for the target domain; part of the facial expression AU sample label data sets are shown in table 1. The six basic expressions in this example are: "happy", "surprised", "fear", "angry", "disgust" and "sad".
Table 1 AU sample label dataset
S4, determining a BN model structure diagram of a target (source) network for facial expression recognition;
the target facial expression recognition BN model structure G1 and the source-domain facial expression recognition BN model structure G2 are established from the acquired AU sample data set;
The BN model structures of the target domain and the source domain are shown in fig. 2 and fig. 3, respectively. The number and orientation of the nodes are determined from the relationship between facial expressions and AUs in table 2. The parent nodes Expression_S and Expression_T represent the state of the facial expression, with six expression states; the 12 AUs serve as child nodes, and each AU has two possible events, "non-occurrence" and "occurrence". The "occurrence" and "non-occurrence" states of an AU node are denoted "1" and "2"; when Expression_S is "happy", AU6 = 1 and AU12 = 1 while the other AU labels are 0 (1 meaning the AU occurs, 0 that it does not). Clearly, the state of a facial expression can be characterized by the AU states.
S5: learning source domain BN parameters;
s51: in this embodiment, the source domain data set is FER2013 data set, where M (S) is data set1)=300;
S52: calculating the source weight coefficient k(n) of each source domain's sample count in the total source-domain sample count;
in this embodiment a single source-network model is used; from formula (1) above, with q = 1 the single source weight is k(1) = 1.
S53: obtaining the source-domain BN parameters by maximum likelihood estimation;
in this embodiment, 300 groups of AUs from the FER2013 data set are selected as the source-domain data set, and the parameters of BN node i are obtained by MLE, first for i = 1;
the other nodes i = 2, ..., 12 are handled in the same way.
S54: calculating the total source-domain BN parameter θ_S1 as shown in equation (2) above.
S6: acquiring BN parameters of the facial expression of a target domain;
S61: learning the parameters θ_Ti of the initial target-domain BN model by the maximum a posteriori probability estimation method;
in this embodiment, the AUs of 120 images from the CK data set are selected as the target-domain data set, and the initial target-domain BN parameters are obtained by maximum a posteriori estimation for i = 1; the other nodes i = 2, ..., 12 are handled in the same way.
S62: calculating the target-domain BN parameters θ_i by the parameter fusion formula (3):
θ_i = α1·θ_Ti + α2·θ_Si   (3)
In this embodiment α1 = 0.7 and α2 = 0.3. From equation (3) above, the parameter of the first child node (i = 1) of the target-domain BN can be calculated.
S63, setting BN_Flag to "true", indicating that BN modeling is complete, and returning to S1 to acquire facial expression recognition evidence AU;
since BN_Flag is "true", after returning to S1 to acquire the facial expression AU the model is not rebuilt, i.e. S3 to S6 are skipped and the process proceeds to S7 for expression recognition.
S7, recognizing facial expressions;
S71, setting the facial expression attribute probability threshold Ψ = 0.7;
S72, obtaining the observation evidence ev to be recognized in the BN model and performing BN inference with an inference algorithm from BN theory to obtain the facial expression attribute probability Ψ';
for example, in this embodiment a group of processed "angry" data is input, with evidence to be observed ev = [1 1 2 1 1 2 1 1 1 2 2 1], where the values "1" and "2" represent the AU label sample set obtained by feature processing of the angry expression. With ev = [1 1 2 1 1 2 1 1 1 2 2 1] as input, the junction tree (Junction Tree) inference algorithm yields Ψ' = 0.975;
S73: judging the facial expression.
In this embodiment Ψ' = 0.975 > Ψ, so the output facial expression attribute is "angry".
The observation evidence ev to be diagnosed is input, and the inference results obtained with the junction tree algorithm are shown in the table below.
The attribute probability of "angry" is Ψ' = 0.975; the attribute probabilities of the other expressions under Expression_T are shown in table 3.
Table 3 Expression attribute probability inference results

| Expression attribute | Happy | Fear | Disgust | Sad | Surprised |
|---|---|---|---|---|---|
| Attribute probability Ψ' | 0.000 | 0.002 | 0.012 | 0.010 | 0.001 |
Similarly, the 6 basic expressions of the 120-image test-set AU label data set of the CK data set are input; with the expression recognition method based on BN parameter transfer learning proposed in this embodiment, the expression recognition results shown in table 4 are obtained.
TABLE 4 CK data expression classification results
The invention provides an expression recognition method based on a transfer mechanism: a BN model structure for facial expression recognition is constructed from the relationship between facial expressions and action-unit labels; the final facial expression recognition BN parameters are obtained through the transfer mechanism, combining the BN parameters computed from the face source-domain data set with the initial BN parameters of the face target-domain data set; and facial expressions are recognized by BN inference using an inference algorithm from BN theory. The invention applies knowledge learned in one field to different but related fields through the transfer learning mechanism, effectively alleviates the shortage of facial-expression modeling samples caused by factors such as illumination and shooting angle, reduces the impact of small sample sizes on parameter-learning accuracy and recognition results, and can be widely applied in noisy, uncertain environments where large amounts of face target data are hard to acquire.
The method uses transfer learning to address the low recognition accuracy that arises when the BN training data set is insufficient (the small-data problem) in target-domain facial expression recognition. Transfer learning acquires data and information from a nearby domain to compensate for insufficient data: a learning model for the target domain is built from knowledge in the source-domain data, improving recognition accuracy.
The method also avoids the influence of the subjectivity of expert experience on BN parameter-learning accuracy. Knowledge learned in the source task is transferred to the target task; the learning algorithms in the source task are classical sufficient-sample algorithms such as MLE (maximum likelihood estimation) and MAP (maximum a posteriori estimation), and the learning mechanism is data-driven, so subjective expert experience does not affect the accuracy of BN parameter learning.
The above disclosure covers only a few specific embodiments of the present invention; the invention is not limited to these embodiments, however, and any variation that those skilled in the art can conceive falls within the protection scope of the invention.
Claims (5)
1. A facial expression recognition method based on BN parameter transfer learning is characterized by comprising the following steps:
S1, acquiring the facial activity units AU;
S2, judging whether the facial expression BN has been modeled;
judging whether the modeling flag BN_Flag is "true" (its initial value is set to "false"); if BN_Flag is "true", the BN has been modeled, so the process jumps to S7 and enters the recognition flow; otherwise S3 is executed and the modeling flow begins;
S3, obtaining the sample AUs required for BN modeling;
S31, determining the number and types of facial expression recognition categories E_num;
S32, extracting the required sample data set of facial activity units AU for each expression category;
S4, determining the facial expression recognition target/source-domain network BN model structure diagrams;
using the acquired sample data set of the facial activity units AU and the relationship between facial expressions and activity units AU as prior information, establishing the target-network facial expression recognition BN model structure diagram G1 and the source-domain network facial expression recognition BN model structure diagram G2;
S5, learning the source-domain BN parameters;
S51, obtaining the sample data set M(S_n) of each source-domain facial activity unit AU;
S52, calculating the source weight coefficient k(n), i.e. the share of each source domain's sample count in the total source-domain sample count, as shown in formula (1):
k(n) = M(S_n) / Σ_m M(S_m)   (1)
where the source weight coefficients sum to 1 and each takes a real value in [0,1];
S53, obtaining the parameters θ_ni of each source-domain BN model by the Maximum Likelihood Estimation (MLE) method, where i indexes the ith child node of the source-domain BN model;
S54, fusing the parameters θ_ni of the source-domain BN models to obtain the total source-domain BN parameters θ_Si:
θ_Si = Σ_n k(n)·θ_ni   (2);
S6, acquiring the target-domain facial expression BN parameters;
S61, obtaining the parameters θ_Ti of the initial target-domain BN model by the maximum a posteriori probability (MAP) estimation method, where i indexes the ith child node of the target-domain BN model;
S62, calculating the final target-domain facial expression BN parameters θ_i according to the weighting factors:
θ_i = α1·θ_Ti + α2·θ_Si   (3)
where α1 and α2 are weighting factors and α1 + α2 = 1;
S63, setting BN_Flag to "true", completing the BN modeling, and returning to S1;
S7, recognizing facial expressions;
S71, setting the facial expression attribute probability threshold Ψ;
S72, entering the facial expression recognition evidence AU into the constructed BN model and performing BN inference with the junction tree inference algorithm to obtain the facial expression attribute probability Ψ';
S73, judging the facial expression;
if the facial expression attribute probability Ψ' is greater than or equal to the threshold Ψ, outputting the facial expression attribute, i.e. the facial expression recognition result; otherwise re-acquiring a new AU data set.
2. The BN parameter migration learning-based expression recognition method as claimed in claim 1, wherein the step S32 of extracting facial activity units AU according to each type of expression includes:
S321, obtaining the geometric features of the facial expression from the facial expression image with the CLM algorithm;
S322, extracting the texture features of the facial expression from the facial expression image with the HOG algorithm;
S323, performing feature fusion and normalization on the geometric and texture features of the facial expression to obtain the fused facial expression features;
S324, classifying the fused facial expression features with a Support Vector Machine (SVM) to obtain the data set M(t) of the target-domain facial activity unit AU and the data sets M(S_n), n = 1, 2, ..., q, of the source-domain facial activity units AU, where q is the number of source domains and a natural number.
3. The expression recognition method based on BN parameter migration learning of claim 2, wherein the specific step of obtaining the geometric features of the facial expression in step S321 includes:
S3211, obtaining the feature-point localization information of the expression image;
the feature points are located with the CLM facial feature point localization algorithm, and the geometric features of the facial expression are extracted.
4. The expression recognition method based on BN parameter migration learning of claim 1, wherein the specific step of determining the facial expression recognition target/source network BN model structure diagram in step S4 includes:
S41, determining the BN nodes for facial expression recognition;
determining the parent node and child nodes of the BN;
S42, determining the directed acyclic graph of the facial expression recognition BN;
connecting the parent node to the child nodes of the BN in turn with directed edges, thereby establishing the BN model structure diagram G1 of the target network for facial expression recognition and the BN model structure diagram G2 of the source-domain network for facial expression recognition.
5. The expression recognition method based on BN parameter transfer learning of claim 1, wherein the number of facial expression recognition categories E_num in step S31 is 6, comprising the six expressions "happy", "surprised", "fear", "angry", "disgust" and "sad".
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010682216.6A CN111814713A (en) | 2020-07-15 | 2020-07-15 | Expression recognition method based on BN parameter transfer learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010682216.6A CN111814713A (en) | 2020-07-15 | 2020-07-15 | Expression recognition method based on BN parameter transfer learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111814713A true CN111814713A (en) | 2020-10-23 |
Family
ID=72865209
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010682216.6A Pending CN111814713A (en) | 2020-07-15 | 2020-07-15 | Expression recognition method based on BN parameter transfer learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111814713A (en) |
- 2020-07-15: CN CN202010682216.6A patent/CN111814713A/en, active, Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108573282A (en) * | 2018-04-16 | 2018-09-25 | 陕西科技大学 | Target identification method based on the BN parameter learnings under small data set |
CN109190490A (en) * | 2018-08-08 | 2019-01-11 | 陕西科技大学 | Based on the facial expression BN recognition methods under small data set |
CN109376692A (en) * | 2018-11-22 | 2019-02-22 | 河海大学常州校区 | Migration convolution neural network method towards facial expression recognition |
CN110689130A (en) * | 2019-10-24 | 2020-01-14 | 陕西科技大学 | Bearing fault diagnosis method |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112454390A (en) * | 2020-11-27 | 2021-03-09 | 中国科学技术大学 | Humanoid robot facial expression simulation method based on deep reinforcement learning |
CN113724367A (en) * | 2021-07-13 | 2021-11-30 | 北京理工大学 | Robot expression driving method and device |
CN113780456A (en) * | 2021-09-17 | 2021-12-10 | 陕西科技大学 | Pain recognition method, system and computer storage medium |
CN113780456B (en) * | 2021-09-17 | 2023-10-27 | 陕西科技大学 | Pain identification method, system and computer storage medium |
CN114333027A (en) * | 2021-12-31 | 2022-04-12 | 之江实验室 | Cross-domain new facial expression recognition method based on joint and alternative learning framework |
CN114333027B (en) * | 2021-12-31 | 2024-05-14 | 之江实验室 | Cross-domain novel facial expression recognition method based on combined and alternate learning frames |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |