CN112381047A - Method for enhancing and identifying facial expression image - Google Patents
- Publication number
- CN112381047A (application CN202011377211.9A)
- Authority
- CN
- China
- Prior art keywords
- face
- facial expression
- image
- shape
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/2148—Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses an enhanced recognition method for facial expression images, comprising the following steps: 1) locate the face with a Haar-feature-based Adaboost cascade detector, frame and crop the face region, preprocess the cropped image, and store it; 2) establish a mapping relation between the face appearance and the face shape with a cascaded regression tree algorithm in a regression model, and extract the facial feature points; 3) compute the corresponding Euclidean distances with a facial expression characterization model to obtain a six-element array characterizing the expression features; 4) train a classification model with a random forest algorithm and input the six-element array into the trained model for classification and recognition. On the basis of recognizing standard-pose (frontal) facial expression images, the method also recognizes expression images with a degree of head deflection, and its higher recognition efficiency and running speed meet the requirements of practical application, making it better suited to real scenes.
Description
Technical Field
The invention relates to the technical field of computer vision and pattern recognition, in particular to an enhanced recognition method of facial expression images.
Background
Facial expression recognition analyzes a person's mood by extracting expression images from pictures or videos, enabling better human-computer interaction. Because expressions carry a high density of information, facial expression recognition plays an important role in psychological analysis, clinical medicine, safe driving, and criminal investigation. Current facial expression recognition mainly targets expressions in the standard pose, i.e. frontal faces; in practice, however, people often deflect their heads unconsciously when making an expression, and many complex situations arise, so the recognition effect frequently falls short of expectations.
Existing attempts at facial expression recognition under head deflection fall into three categories: methods based on facial key points, appearance-based methods, and pose-based methods. Key-point-based methods locate key points with a geometric model and then recognize from them; they require a large number of samples with annotated key points, which are difficult to label automatically (J. Computer Application Research, 2018, 35(01):282-286). Appearance-based methods extract local or global expression features of the face under different poses, reducing interference from expression-irrelevant factors and avoiding the difficulty of key-point extraction, but their recognition performance is mediocre (Electronic Technology and Software Engineering, 2018(06):67). Pose-based methods come in two kinds: one groups the expression library by face pose and performs training, recognition, and classification per group; the other builds a relationship between non-frontal and frontal face samples, maps the non-frontal face to a frontal one, and then classifies and recognizes it (Jiangsu: CN103400105A, 2013-11-20). The latter works well, but its algorithm is complex and slow, making it unsuitable for practical application.
Disclosure of Invention
To address these problems, the invention provides an enhanced recognition method for facial expressions. The method extracts features with a regression model, which accurately captures facial expression features at a deflection angle, and combines this with ensemble learning to improve practicality. It recognizes expression images in the standard pose, performs well on images with a degree of head deflection, and, by adopting ensemble learning, reduces algorithmic complexity and raises running speed, making it better suited to real scenes.
The invention is realized by at least one of the following technical schemes.
A method for enhanced recognition of facial expression images comprises the following steps:
1) using an Adaboost cascade detector based on Haar characteristics to carry out face positioning, framing out a face part and cutting, carrying out image preprocessing on a cut image and then storing the image;
2) establishing a mapping relation between the face appearance and the face shape by using a regression model, realizing face alignment and extracting the facial feature points;
3) calculating the Euclidean distance corresponding to the human face characteristic points to obtain a six-element array representing the human face expression characteristics;
4) and inputting the six-element array into a trained classification model to realize classification and identification.
Preferably, in step 1), the face localization is performed by using an Adaboost cascade detector based on Haar features, which specifically includes:
Firstly, the Haar features of the image are obtained, and the Haar feature values are computed by traversal with an integral image; these values serve as the classifier input. Each input sample is given the same initial weight to train the weak classifiers, and the weak classifier with the minimum error is selected as the optimal weak classifier of the current round. From the error between the predicted and true values of this optimal weak classifier, the weight for the next optimal weak classifier is calculated. After multiple iterations, the N optimal weak classifiers are weighted and combined into a strong classifier, and finally a plurality of strong classifiers are cascaded for detection and localization.
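As an illustration of the integral-image step described above, the sketch below (a minimal numpy sketch, not the patent's implementation; the 4x4 image and the chosen two-rectangle feature are made-up examples) computes a summed-area table and evaluates a Haar-like feature value with a constant number of lookups:

```python
import numpy as np

def integral_image(img):
    # Summed-area table: ii[y, x] = sum of img[:y+1, :x+1].
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, h, w):
    # Sum of the h x w rectangle with top-left pixel (top, left),
    # using at most four lookups in the integral image.
    total = ii[top + h - 1, left + w - 1]
    if top > 0:
        total -= ii[top - 1, left + w - 1]
    if left > 0:
        total -= ii[top + h - 1, left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]
    return total

def haar_two_rect(ii, top, left, h, w):
    # Edge-type Haar feature: left half minus right half.
    half = w // 2
    return rect_sum(ii, top, left, h, half) - rect_sum(ii, top, left + half, h, half)

img = np.arange(16, dtype=np.float64).reshape(4, 4)
ii = integral_image(img)
```

Each feature value costs the same regardless of rectangle size, which is what makes exhaustively scanning many Haar features per detection window affordable.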
Preferably, in step 2), a mapping relation between the face appearance and the face shape is established by using a cascaded regression tree algorithm in the regression model, realizing face alignment and extracting the h facial feature points of the face.
Preferably, the extracting of the h facial feature points specifically includes:

selecting the 300-W database as the sample set for training, and defining a_i = (x_i, y_i), i = 1, 2, ..., h, as the coordinates of the i-th feature point in a picture P of the sample set; S = (a_1^T, a_2^T, ..., a_m^T), with m = h, is the coordinate vector of all the feature points in one picture and is called the shape; the regression iteration formula is:

S^(t+1) = S^(t) + λ_t(P, S^(t)),  ΔS^(t+1) = S_z - S^(t+1)

where λ_t is a regression model formed by cascading a plurality of regressors, S^(t) is the current estimate of S, λ_t(·) is the update vector the cascaded regressor predicts from the image, S_z is the true face shape, and ΔS^(t+1) is the residual. The current face shape and the sample image are input to the regression model, which predicts an update vector and yields a new shape estimate, i.e. the new current face shape, together with its residual, i.e. the difference between the current shape and the true shape. The regression model is updated iteratively according to the current residual, continually reducing it, so that the estimate gradually approaches the true face shape and the facial feature points are finally extracted accurately.
Preferably, the training process of the regression model formed by the N regressors specifically includes:

defining a sample set (P_1, S_1), ..., (P_n, S_n), where P_r (r = 1, 2, ..., n) are the expression images in the sample set and S_r (r = 1, 2, ..., n) is the shape vector corresponding to each expression image;

inputting the training sample images, the initial shape estimates and the residuals, with learning rate ρ; the initialized regression function f_0 is:

f_0 = argmin_C Σ_r || ΔS_r^(t) - C ||^2

where C is the constant minimizing the initial prediction loss function, ΔS_r^(t) is the residual, and N = nR, where R is the initialization multiple of each expression image;

taking the squared error as the loss function and differentiating it, the gradient is obtained and used as the fitting target in each iteration step:

g_rk = ΔS_r^(t) - f_{k-1}(P_r, S_r^(t))

where f_{k-1} is the regression function, ΔS_r^(t) is the residual, and S_r^(t) is the current estimate of S_r;

for k = 1, ..., K, constructing a regression tree λ_k based on the weak classifier G, where K is the number of weak classifiers G, and updating:

f_k = f_{k-1} + ρ·λ_k

finally obtaining λ_t = f_K and completing the construction of the regression model.
Preferably, in step 3), the corresponding Euclidean distances are calculated by using the facial expression characterization model to obtain a six-element array representing the facial expression features.
Preferably, the calculating of the corresponding Euclidean distances using the facial expression characterization model specifically includes: classifying and distinguishing through the facial expression characterization model, which is measured by Euclidean distance, to compute the six-element array D = (d_1, d_2, d_3, d_4, d_5, d_6), where d_1 is the distance between the two eyebrows, d_2 the distance between eyebrow and eye, d_3 the distance between the upper and lower boundaries of the eye, d_4 the height of the mouth, d_5 the width of the mouth, and d_6 the distance from the mouth corner to the highest point of the upper lip.
Preferably, in the step 4), a random forest algorithm is used for training a classification model, and the six-element array is input into the trained model to realize classification and recognition.
Preferably, the classification identification specifically includes the following steps:
step one, selecting the fer2013 expression database for training and randomly extracting a subset of the samples and a subset of the attributes;
step two, determining the splitting attribute from the candidate attributes by using the Gini coefficient, generating the node, growing a CART decision tree, and forming a random forest from the multiple generated trees;
and step three, after a sample is input, the forest produces N classification results; a voting mechanism tallies the classification results obtained for each input sample, and the class receiving the most votes is output as the recognition result.
Preferably, the determining of the splitting attribute from the candidate attributes by using the Gini coefficient specifically includes:
let the sample set X contain N classes; the Gini coefficient is defined as:

Gini(X) = 1 - Σ_i σ_i^2

where σ_i is the frequency of occurrence of class i in the sample set X. If X is divided under a selected attribute x into two sample subsets X_1 and X_2, the weighted sum of the Gini coefficients of the two subsets after division is:

Gini_split(x)(X) = (M_1/M)·Gini(X_1) + (M_2/M)·Gini(X_2)

where M_1 and M_2 are the numbers of samples in X_1 and X_2 respectively and M is the number of samples in X; the Gini gain is:

ΔGini = Gini(X) - Gini_split(x)(X)

At each split node, the attribute with the minimum Gini coefficient is selected as the splitting attribute.
According to the invention, an Adaboost cascade detector based on Haar features preprocesses the original image, a cascaded regression tree algorithm establishes the pose mapping relation to realize face alignment, a facial expression characterization model yields the facial expression features, and finally a random forest algorithm performs classification and recognition. This achieves accurate extraction of expression features at a deflection angle and improves recognition efficiency, giving the method real practical value.
Compared with the prior art, the invention has the beneficial effects that:
(1) Features are extracted with the cascaded regression tree algorithm, which avoids the poor localization accuracy of directly extracting face key points: the face is first aligned by regression iteration and the features are extracted afterwards, effectively improving extraction accuracy for non-frontal expression features.
(2) And the random forest algorithm is adopted for classification and identification, so that the identification efficiency is improved, the running time is shortened, and the practical value is enhanced.
(3) The deflection range of facial expression recognition is expanded, and the applicable scenes of the system are expanded.
Drawings
Fig. 1 is a schematic flow chart of the method for enhancing and recognizing a facial expression image according to the present embodiment.
Detailed Description
In order that those skilled in the art will better understand the technical solution of the present invention, the present invention will be further described in detail with reference to the accompanying drawings and the detailed description, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment discloses an enhanced recognition method for facial expression images, which recognizes expression images in the standard pose, performs well on expression images with a degree of head deflection, and, by adopting ensemble learning, reduces algorithmic complexity, raises running speed, and is better suited to real scenes.
As shown in fig. 1, the method for enhancing and identifying a facial expression image in this embodiment specifically includes the following steps:
1) Locate the face using an Adaboost cascade detector based on Haar features, frame and crop the face region, apply gray-level normalization and scale normalization to the cropped image, and store it; the specific process is as follows:
Firstly, the Haar features of the image are obtained, and the Haar feature values are computed by traversal with an integral image; these values serve as the classifier input. Each input sample is given the same initial weight to train the weak classifiers, and the weak classifier with the minimum error is selected as the optimal weak classifier of the current round. From the error between the predicted and true values of this optimal weak classifier, the weight for the next optimal weak classifier is calculated. After multiple iterations, the N optimal weak classifiers are weighted and combined into a strong classifier, and finally a plurality of strong classifiers are cascaded for detection and localization.
The face region is located and framed, and the face frame is expanded outward by a suitable proportion to ensure a complete face image is obtained. After cropping, the image is preprocessed, specifically by scale normalization and gray-level normalization: the image is moderately reduced in size to increase its smoothness and clarity, and it is converted to grayscale by weighted averaging, which preserves the image information as far as possible while reducing the amount of computation.
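The graying and scale normalization described above can be sketched as follows (a minimal numpy illustration with a made-up 4x4 image; the BT.601 luma weights 0.299/0.587/0.114 are a common choice for weighted-average graying, not mandated by the patent):

```python
import numpy as np

def to_gray(rgb):
    # Weighted-average graying with BT.601 luma weights.
    return rgb @ np.array([0.299, 0.587, 0.114])

def downscale2(gray):
    # Crude 2x scale normalization by 2x2 block averaging.
    h, w = gray.shape
    return gray[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

rgb = np.zeros((4, 4, 3))
rgb[..., 0] = 100.0        # a pure-red test image
gray = to_gray(rgb)        # every pixel becomes 100 * 0.299 = 29.9
small = downscale2(gray)   # half-size grayscale image
```

A production pipeline would typically use a library resize with interpolation instead of block averaging; the block average is just the simplest scheme that shows the shape change.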
2) Establish a mapping relation between the face appearance and the face shape by using the cascaded regression tree algorithm in the regression model, realize face alignment, and extract the 68 facial feature points, as follows:

The algorithm comprises a two-layer regression process. The 300-W database is selected as the sample set for training, and a_i = (x_i, y_i), i = 1, 2, ..., 68, is defined as the coordinates of the i-th feature point in a picture P of the sample set; S = (a_1^T, a_2^T, ..., a_m^T), with m = 68, is the coordinate vector of all the feature points in one picture and is called the shape. In the first-layer regression, the regression iteration formula is:

S^(t+1) = S^(t) + λ_t(P, S^(t)),  ΔS^(t+1) = S_z - S^(t+1)

where λ_t is a regression model formed by cascading a plurality of regressors, S^(t) is the current estimate of S, λ_t(·) is the update vector the cascaded regressor predicts from the image, S_z is the true face shape, and ΔS^(t+1) is the residual. The current face shape and the sample image are input to the regression model, which predicts an update vector and yields a new shape estimate, i.e. the new current face shape, together with its residual, i.e. the difference between the current shape and the true shape. The regression model is updated iteratively according to the current residual, continually reducing it, so that the estimate gradually approaches the true face shape and the facial feature points are finally extracted accurately.
The second-layer regression trains the regression model formed by the N regressors; the training process is as follows:

Define a sample set (P_1, S_1), ..., (P_n, S_n) in the training database, where P_r (r = 1, 2, ..., n) are the expression images in the sample set and S_r (r = 1, 2, ..., n) is the shape vector corresponding to each expression image.

Input the training sample images, the initial shape estimates and the residuals, with learning rate ρ; the initialized regression function f_0 is:

f_0 = argmin_C Σ_r || ΔS_r^(t) - C ||^2

where C is the constant minimizing the initial prediction loss function, ΔS_r^(t) is the residual, and N = nR, where R is the initialization multiple of each expression image.

Take the squared error as the loss function; differentiating it yields the gradient, which is the fitting target in each iteration step:

g_rk = ΔS_r^(t) - f_{k-1}(P_r, S_r^(t))

where f_{k-1} is the regression function, ΔS_r^(t) is the residual, and S_r^(t) is the current estimate of S_r.

For k = 1, ..., K, construct a regression tree λ_k based on the weak classifier G, where K is the number of weak classifiers G, and update:

f_k = f_{k-1} + ρ·λ_k

Finally obtain λ_t = f_K, completing the construction of the regression model.
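The second-layer update rule amounts to gradient boosting with squared loss. The toy sketch below (synthetic data; weak learners that fit the gradient exactly stand in for the patent's regression trees, so it illustrates only the update f_k = f_{k-1} + ρ·λ_k, not tree construction) shows the residual shrinking over the boosting rounds:

```python
import numpy as np

rng = np.random.default_rng(0)
targets = rng.normal(size=(8, 10))   # stand-ins for the shape residuals ΔS_r
rho = 0.5                            # learning rate
K = 20                               # number of boosting rounds

# f_0: the constant C minimizing the squared loss is the mean residual.
pred = np.tile(targets.mean(axis=0), (8, 1))
start = np.abs(targets - pred).mean()

for k in range(K):
    grad = targets - pred            # gradient of the squared loss = current residual
    # A real regression tree approximates grad from (P_r, S_r^(t)); here a
    # perfectly fitting weak learner stands in for it.
    pred = pred + rho * grad         # f_k = f_{k-1} + rho * lambda_k

end = np.abs(targets - pred).mean()  # residual after K rounds, far below the start
```

With a perfect weak learner the residual contracts by (1 - ρ) per round; real trees contract more slowly, which is why the cascade stacks many of them.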
After the model file is trained, inputting an expression picture yields the 68 accurately extracted facial feature points.
3) Calculate the corresponding Euclidean distances using the facial expression characterization model to obtain a six-element array representing the facial expression features, specifically:

Since single expressions can be classified and distinguished by the structural characteristics of facial expressions, a facial expression characterization model measured by Euclidean distance is adopted. The extracted facial feature points are input into the model, which computes the six-element array D = (d_1, d_2, d_3, d_4, d_5, d_6), where d_1 is the distance between the two eyebrows, d_2 the distance between eyebrow and eye, d_3 the distance between the upper and lower boundaries of the eye, d_4 the height of the mouth, d_5 the width of the mouth, and d_6 the distance from the mouth corner to the highest point of the upper lip.
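A direct way to compute such a six-element array, assuming the extracted points follow the common 68-point iBUG landmark layout (the specific index pairs below are illustrative choices, not taken from the patent):

```python
import numpy as np

def euclid(p, q):
    return float(np.linalg.norm(p - q))

def expression_features(pts):
    # pts: (68, 2) array of facial feature points in iBUG 68-point order.
    # Which exact landmark pairs the patent uses is not specified; these
    # index choices are illustrative.
    return np.array([
        euclid(pts[21], pts[22]),  # d1: distance between the two inner eyebrow ends
        euclid(pts[19], pts[37]),  # d2: eyebrow-to-eye distance
        euclid(pts[37], pts[41]),  # d3: upper-to-lower eye boundary
        euclid(pts[51], pts[57]),  # d4: mouth height
        euclid(pts[48], pts[54]),  # d5: mouth width
        euclid(pts[48], pts[51]),  # d6: mouth corner to highest point of upper lip
    ])

pts = np.zeros((68, 2))
pts[48] = (0.0, 0.0)
pts[54] = (3.0, 4.0)
feats = expression_features(pts)   # feats[4] is the 3-4-5 distance, 5.0
```

In practice each distance would usually be normalized (e.g. by the inter-ocular distance) so the array is invariant to face scale.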
4) Training a classification model by utilizing a random forest algorithm, and inputting the six-element array into the trained model to realize classification and identification, wherein the classification and identification specifically comprises the following steps:
The fer2013 facial expression library is selected for training. Most images in this library exhibit in-plane and out-of-plane rotation, and many are partially occluded by hands, hair, scarves and similar objects, so the library better matches real-life situations. The specific training process with the random forest algorithm is:
Step one: select the fer2013 expression database for training; after the expression images in the database are preprocessed and their features extracted, randomly draw a subset of the samples and a subset of the attributes. Step two: determine the splitting attribute from the candidate attributes using the Gini coefficient, generate the node, grow a CART decision tree, and form a random forest from the multiple generated trees. The splitting attribute is determined from the candidate attributes using the Gini coefficient as follows:
If the sample set X contains N classes, the Gini coefficient is defined as:

Gini(X) = 1 - Σ_i σ_i^2

where σ_i is the frequency of occurrence of class i in the sample set X. If X is divided under a selected attribute x into two sample subsets X_1 and X_2, the weighted sum of the Gini coefficients of the two subsets after division is:

Gini_split(x)(X) = (M_1/M)·Gini(X_1) + (M_2/M)·Gini(X_2)

where M_1 and M_2 are the numbers of samples in X_1 and X_2 respectively and M is the number of samples in X; the Gini gain is:

ΔGini = Gini(X) - Gini_split(x)(X)

At each split node, the attribute with the minimum Gini coefficient is selected as the splitting attribute.
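The Gini computation at a split can be sketched directly (a minimal illustration with made-up labels, not tied to the fer2013 classes):

```python
from collections import Counter

def gini(labels):
    # Gini(X) = 1 - sum_i sigma_i^2, where sigma_i is the frequency of class i in X.
    m = len(labels)
    return 1.0 - sum((c / m) ** 2 for c in Counter(labels).values())

def gini_split(left, right):
    # Weighted sum (M1/M)*Gini(X1) + (M2/M)*Gini(X2) after a binary split.
    m = len(left) + len(right)
    return len(left) / m * gini(left) + len(right) / m * gini(right)

labels = ["happy", "happy", "sad", "sad"]
g = gini(labels)                                            # 1 - (0.5^2 + 0.5^2) = 0.5
gain = g - gini_split(["happy", "happy"], ["sad", "sad"])   # a pure split: gain 0.5
```

A CART builder would evaluate `gini_split` for every candidate attribute and threshold, keeping the split with the lowest value (equivalently, the highest gain).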
Step three: after a sample is input, the forest produces N classification results; a voting mechanism tallies the classification results obtained for each input sample, and the class receiving the most votes is output as the recognition result.
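The voting step can be sketched as follows (an illustration; `forest_vote` is a hypothetical helper, with ties broken by first appearance):

```python
from collections import Counter

def forest_vote(tree_predictions):
    # Each tree casts one vote; the class with the most votes wins.
    # Counter.most_common sorts by count, ties broken by insertion order.
    return Counter(tree_predictions).most_common(1)[0][0]

votes = ["happy", "sad", "happy", "neutral", "happy"]
result = forest_vote(votes)    # "happy", with 3 of the 5 votes
```

Libraries such as scikit-learn average per-class probabilities across trees rather than counting hard votes, but the hard-vote scheme matches the patent's description.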
After the optimal model parameters are determined, the obtained six-element array is input into the model to obtain the final classification and recognition result.
The above embodiments are preferred embodiments of the present invention, but the embodiments of the present invention are not limited to the above embodiments, and any changes, modifications, substitutions, combinations, and simplifications made by the present specification and drawings without departing from the spirit and principle of the present invention should be construed as equivalents and included in the protection scope of the present invention.
Claims (10)
1. A method for enhanced recognition of facial expression images, characterized by comprising the following steps:
1) using an Adaboost cascade detector based on Haar characteristics to carry out face positioning, framing out a face part and cutting, carrying out image preprocessing on a cut image and then storing the image;
2) establishing a mapping relation between the face appearance and the face shape by using a regression model, realizing face alignment and extracting the facial feature points;
3) calculating the Euclidean distance corresponding to the human face characteristic points to obtain a six-element array representing the human face expression characteristics;
4) and inputting the six-element array into a trained classification model to realize classification and identification.
2. The method for enhancing and recognizing the facial expression image according to claim 1, wherein in the step 1), the face positioning is performed by using an Adaboost cascade detector based on Haar features, and the method specifically comprises the following steps:
Firstly, the Haar features of the image are obtained, and the Haar feature values are computed by traversal with an integral image; these values serve as the classifier input. Each input sample is given the same initial weight to train the weak classifiers, and the weak classifier with the minimum error is selected as the optimal weak classifier of the current round. From the error between the predicted and true values of this optimal weak classifier, the weight for the next optimal weak classifier is calculated. After multiple iterations, the N optimal weak classifiers are weighted and combined into a strong classifier, and finally a plurality of strong classifiers are cascaded for detection and localization.
3. The method for enhanced recognition of facial expression images according to claim 2, wherein step 2) establishes a mapping relation between the face appearance and the face shape by using a cascaded regression tree algorithm in the regression model, realizes face alignment, and extracts the h facial feature points of the face.
4. The method of claim 3, wherein the extracting of the h facial feature points specifically comprises:

selecting the 300-W database as the sample set for training, and defining a_i = (x_i, y_i), i = 1, 2, ..., h, as the coordinates of the i-th feature point in a picture P of the sample set; S = (a_1^T, a_2^T, ..., a_m^T), with m = h, is the coordinate vector of all the feature points in one picture and is called the shape; the regression iteration formula is:

S^(t+1) = S^(t) + λ_t(P, S^(t)),  ΔS^(t+1) = S_z - S^(t+1)

where λ_t is a regression model formed by cascading a plurality of regressors, S^(t) is the current estimate of S, λ_t(·) is the update vector the cascaded regressor predicts from the image, S_z is the true face shape, and ΔS^(t+1) is the residual; the current face shape and the sample image are input to the regression model, which predicts an update vector and yields a new shape estimate, i.e. the new current face shape, together with its residual, i.e. the difference between the current shape and the true shape; the regression model is updated iteratively according to the current residual, continually reducing it, so that the estimate gradually approaches the true face shape and the facial feature points are finally extracted accurately.
5. The method of claim 4, wherein the training process of the regression model formed by the N regressors specifically comprises:
defining a sample set (P)1,S1),....,(Pn,Sn),Pr(r ═ 1, 2.. times.n) are expression images in the sample set, and the shape vector corresponding to each expression image is Sr(r=1,2,...,n);
inputting the training sample images, the initial shape estimates and the residual quantities, with learning rate ρ; the initialized regression function λ_0 is:

λ_0 = argmin_C Σ_{r=1}^{N} ||ΔS_r^(t) − C||²

where C is the constant that minimizes the initial prediction loss function, ΔS_r^(t) is the residual of sample r, N = nR, and R is the initialization multiple of each expression image;
taking the squared error as the loss function and differentiating it, the gradient used as the fitting target in each iteration step is:

g_rk = ΔS_r^(t) − λ_{k−1}(P_r, S_r^(t))

where λ is the regression function, ΔS_r^(t) is the residual, and S_r^(t) is the current estimate of S_r;
for k = 1, ..., K, constructing a regression tree λ_k based on the weak classifier G, where K is the number of weak classifiers G, and updating:

λ_k = λ_{k−1} + ρ · g_k

finally obtaining λ_t and completing the construction of the regression model.
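The training loop of claim 5 is ordinary gradient boosting under squared loss: start from the constant C minimizing the loss, then repeatedly fit a weak learner to the current residuals (the negative gradient) and add it with learning rate ρ. A hedged one-dimensional sketch, with a depth-1 regression stump standing in for the patent's regression tree (all names are illustrative):

```python
# 1-D sketch of the boosting loop in claim 5 (names and data are illustrative).

def stump_fit(xs, residuals):
    """Fit a depth-1 regression tree (stump) to the residuals: pick the
    threshold whose left/right means give the smallest squared error."""
    best = None
    for thr in sorted(set(xs))[:-1]:  # a split must leave both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    _, thr, lm, rm = best
    return lambda x: lm if x <= thr else rm

def boost(xs, ys, K=50, rho=0.5):
    C = sum(ys) / len(ys)             # lambda_0 = argmin_C sum_r (y_r - C)^2
    preds = [C] * len(xs)
    learners = []
    for _ in range(K):                # k = 1..K
        residuals = [y - p for y, p in zip(ys, preds)]  # negative gradient of squared loss
        g = stump_fit(xs, residuals)  # weak learner fitted to the gradient
        learners.append(g)
        preds = [p + rho * g(x) for x, p in zip(xs, preds)]  # lambda_k = lambda_{k-1} + rho*g_k
    return lambda x: C + sum(rho * g(x) for g in learners)
```

For a toy step function such as xs = [0, 1, 2, 3], ys = [0, 0, 1, 1], fifty boosting rounds drive the training residual essentially to zero.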
6. The method for enhancing the recognition of the facial expression image according to claim 5, wherein step 3) calculates the corresponding Euclidean distances using the facial expression representation model to obtain a six-element array representing the facial expression features.
7. The method of claim 6, wherein calculating the Euclidean distances using the facial expression representation model specifically comprises: classifying and distinguishing through the facial expression characterization model, and computing, with the Euclidean distance as the measure, the six-element array D = (d_1, d_2, d_3, d_4, d_5, d_6), where d_1 is the distance between the two eyebrows, d_2 the distance between eyebrow and eye, d_3 the distance between the upper and lower boundaries of the eye, d_4 the height of the mouth, d_5 the width of the mouth, and d_6 the distance from the corner of the mouth to the highest point of the upper lip.
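As a sketch, the six-element array of claim 7 is just six pairwise Euclidean distances between landmark points. The landmark names below are hypothetical labels for illustration; the patent does not fix which of the h feature points feed each distance:

```python
import math

def dist(p, q):
    """Euclidean distance between two landmark points (x, y)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def expression_features(lm):
    """Build D = (d1, ..., d6) from a dict of landmark coordinates.
    The key names are hypothetical; the patent only fixes what each
    distance measures, not which feature-point indices are used."""
    return (
        dist(lm["left_brow_inner"], lm["right_brow_inner"]),  # d1: between the eyebrows
        dist(lm["left_brow_inner"], lm["left_eye_top"]),      # d2: eyebrow to eye
        dist(lm["left_eye_top"], lm["left_eye_bottom"]),      # d3: eye opening
        dist(lm["mouth_top"], lm["mouth_bottom"]),            # d4: mouth height
        dist(lm["mouth_left"], lm["mouth_right"]),            # d5: mouth width
        dist(lm["mouth_left"], lm["upper_lip_peak"]),         # d6: mouth corner to lip peak
    )
```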
8. The method for enhancing recognition of facial expression images according to claim 7, wherein in step 4), a random forest algorithm is used for training a classification model, and the six-element array is input into the trained model to realize classification recognition.
9. The method for enhancing recognition of facial expression images according to claim 8, wherein the classification recognition specifically comprises the following steps:
step one, selecting the fer2013 expression database for training and randomly extracting a part of the samples and a part of the attributes;
step two, determining the splitting attribute from the candidate attributes by using the Gini coefficient, generating nodes to build a CART decision tree, and forming a random forest from the multiple generated decision trees;
and step three, after a sample is input, the forest produces N classification results; a voting mechanism is applied to the classification results of all input samples, and the class with the most votes is the output recognition result.
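Step three's voting mechanism can be sketched directly: each of the N trees casts one vote and the majority class wins. The stub trees in the usage example stand in for trained CART trees:

```python
from collections import Counter

def forest_predict(trees, sample):
    """Majority vote over the N trees of the forest: each tree classifies
    the sample, and the class with the most votes is the result."""
    votes = [tree(sample) for tree in trees]
    winner, _ = Counter(votes).most_common(1)[0]
    return winner
```

For example, with three stub trees voting "happy", "happy", "sad", the forest outputs "happy".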
10. The method for enhancing and recognizing the facial expression image according to claim 9, wherein determining the splitting attribute from the candidate attributes by using the Gini coefficient specifically comprises:
letting the sample set X contain N classes, the Gini coefficient is defined as:

Gini(X) = 1 − Σ_{i=1}^{N} σ_i²

where σ_i is the frequency of occurrence of class i in the sample set X; if X is divided under the selected attribute x into two sample subsets X_1 and X_2, the weighted sum of the Gini coefficients of the two subsets is:

Gini_split(x)(X) = (M_1/M) · Gini(X_1) + (M_2/M) · Gini(X_2)

where M_1 and M_2 are the numbers of samples in X_1 and X_2 respectively, M is the number of samples in X, and the Gini coefficient gain is:

Gini = Gini(X) − Gini_split(x)(X)
and selecting the attribute with the minimum Gini coefficient as a splitting attribute at each splitting node.
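The Gini quantities of claim 10 can be computed directly from class labels. A minimal sketch (note that standard CART equivalently picks the split with the smallest Gini_split, i.e. the largest gain):

```python
from collections import Counter

def gini(labels):
    """Gini(X) = 1 - sum_i sigma_i^2, sigma_i the frequency of class i in X."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_split(left, right):
    """Weighted Gini of a binary split: (M1/M)*Gini(X1) + (M2/M)*Gini(X2)."""
    m = len(left) + len(right)
    return len(left) / m * gini(left) + len(right) / m * gini(right)

def gini_gain(labels, left, right):
    """Gini = Gini(X) - Gini_split(x)(X); the split maximising this gain
    (equivalently, minimising Gini_split) is chosen at each node."""
    return gini(labels) - gini_split(left, right)
```

A perfectly separating split drives Gini_split to 0, so its gain equals the parent impurity.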
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011377211.9A CN112381047B (en) | 2020-11-30 | 2020-11-30 | Enhanced recognition method for facial expression image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112381047A true CN112381047A (en) | 2021-02-19 |
CN112381047B CN112381047B (en) | 2023-08-22 |
Family
ID=74590391
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113065552A (en) * | 2021-03-29 | 2021-07-02 | 天津大学 | Method for automatically positioning head shadow measurement mark point |
CN113111789A (en) * | 2021-04-15 | 2021-07-13 | 山东大学 | Facial expression recognition method and system based on video stream |
CN113111789B (en) * | 2021-04-15 | 2022-12-20 | 山东大学 | Facial expression recognition method and system based on video stream |
CN117437522A (en) * | 2023-12-19 | 2024-01-23 | 福建拓尔通软件有限公司 | Face recognition model training method, face recognition method and device |
CN117437522B (en) * | 2023-12-19 | 2024-05-03 | 福建拓尔通软件有限公司 | Face recognition model training method, face recognition method and device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105631436A (en) * | 2016-01-27 | 2016-06-01 | 桂林电子科技大学 | Face alignment method based on cascade position regression of random forests |
CN106682598A (en) * | 2016-12-14 | 2017-05-17 | 华南理工大学 | Multi-pose facial feature point detection method based on cascade regression |
CN108108677A (en) * | 2017-12-12 | 2018-06-01 | 重庆邮电大学 | One kind is based on improved CNN facial expression recognizing methods |
CN109961006A (en) * | 2019-01-30 | 2019-07-02 | 东华大学 | A kind of low pixel multiple target Face datection and crucial independent positioning method and alignment schemes |
CN111523367A (en) * | 2020-01-22 | 2020-08-11 | 湖北科技学院 | Intelligent facial expression recognition method and system based on facial attribute analysis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||