CN112381047A - Method for enhancing and identifying facial expression image - Google Patents

Method for enhancing and identifying facial expression image

Info

Publication number
CN112381047A
Authority
CN
China
Prior art keywords
face
facial expression
image
shape
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011377211.9A
Other languages
Chinese (zh)
Other versions
CN112381047B (en)
Inventor
谢巍
刘彦汝
钱文轩
谢苗苗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN202011377211.9A priority Critical patent/CN112381047B/en
Publication of CN112381047A publication Critical patent/CN112381047A/en
Application granted granted Critical
Publication of CN112381047B publication Critical patent/CN112381047B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2148Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/24323Tree-organised classifiers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/20Ensemble learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an enhanced recognition method for facial expression images, comprising the following steps: 1) perform face localization with an Adaboost cascade detector based on Haar features, frame and crop the face region, preprocess the cropped image, and store it; 2) establish a mapping between face appearance and face shape with a cascaded regression tree algorithm in a regression model, and extract the facial feature points; 3) compute the corresponding Euclidean distances with a facial expression representation model to obtain a six-element array characterizing the facial expression features; 4) train a classification model with a random forest algorithm and feed the six-element array into the trained model for classification and recognition. On top of recognizing facial expression images in the standard pose, the method also recognizes expression images with a certain amount of head deflection, offers higher recognition efficiency and faster running speed, meets the needs of practical applications, and is better suited to real-world scenes.

Description

Method for enhancing and identifying facial expression image
Technical Field
The invention relates to the technical field of computer vision and pattern recognition, and in particular to an enhanced recognition method for facial expression images.
Background
Facial expression recognition technology analyzes a person's specific mood by extracting expression images from pictures or videos, enabling better human-computer interaction. Because expressions carry a high density of information, facial expression recognition plays an important role in psychological analysis, clinical medicine, safe driving, criminal investigation, and similar fields. Current facial expression recognition mainly targets expressions in the standard pose, i.e., frontal facial expressions. In practice, however, people often deflect their heads unconsciously when making an expression, and real applications face many complex situations, so the recognition performance frequently falls short of expectations.
Existing attempts at facial expression recognition under head deflection fall into three categories: methods based on facial key points, appearance-based methods, and pose-based methods. Key-point-based methods locate the key points with a geometric model and then recognize from them; they require a large number of samples with annotated key points, and the key points are difficult to annotate automatically (hero, loyal, Chua Jian, Kong Gao, Kongshi. A new deflection-angle facial expression recognition method [J]. Application Research of Computers, 2018, 35(01):282-286). Appearance-based methods obtain local or global expression features of the face under different poses, reducing interference from expression-irrelevant factors in the image and avoiding the difficulty of key-point extraction, but their recognition performance is mediocre (Wang Chenxing, Liang. A new expression recognition method [J]. Electronic Technology & Software Engineering, 2018(06):67). Pose-based methods divide into two kinds: one groups the expression library by face pose and performs grouped training, recognition, and classification; the other establishes a relationship between non-frontal and frontal face samples, maps the non-frontal face to a frontal one, and then classifies and recognizes the frontal face (Zheng civilization, Von Tianke. Non-frontal facial expression recognition method based on pose normalization [P]. Jiangsu: CN103400105A, 2013-11-20). The latter works well but, owing to its algorithmic complexity and slow running speed, is unsuitable for practical applications.
Disclosure of Invention
To address these problems, the invention provides an enhanced recognition method for facial expressions. The method extracts features with a regression model, which allows accurate extraction of facial expression features at a deflection angle, and combines this with ensemble learning to improve the method's practicality. It recognizes facial expression images in the standard pose, performs well on expression images with a certain amount of head deflection, and, by adopting ensemble learning, reduces algorithmic complexity and increases running speed, making it better suited to real-world scenes.
The invention is realized by at least one of the following technical schemes.
An enhanced recognition method for facial expression images comprises the following steps:
1) performing face localization with an Adaboost cascade detector based on Haar features, framing and cropping the face region, preprocessing the cropped image, and storing it;
2) establishing a mapping between face appearance and face shape with a regression model, achieving face alignment, and extracting the facial feature points;
3) computing the Euclidean distances corresponding to the facial feature points to obtain a six-element array characterizing the facial expression features;
4) feeding the six-element array into a trained classification model for classification and recognition.
Preferably, in step 1), face localization with the Haar-feature-based Adaboost cascade detector specifically comprises:
first obtaining the Haar features of the image and computing the Haar feature values by traversal with an integral image, then using the feature values as classifier input. Every input sample is given the same initial weight to train a weak classifier; the weak classifier with the smallest error is selected as the optimal weak classifier of the current round; the weight for the next optimal weak classifier is computed from the error between the predicted and true values of the current one; after multiple iterations, the N optimal weak classifiers are combined by weighting into a strong classifier; finally several strong classifiers are cascaded for detection and localization.
Preferably, in step 2), a cascaded regression tree algorithm in the regression model establishes the mapping between face appearance and face shape, achieves face alignment, and extracts the h facial feature points.
Preferably, extracting the h facial feature points specifically comprises:
selecting the 300-W database as the training sample set, defining $a_i = (x_i, y_i)$, $i = 1, 2, \ldots, h$, as the coordinates of the feature points in one picture $P$ of the sample set, and defining $S = (a_1^\top, a_2^\top, \ldots, a_m^\top)$, $m = h$, as the coordinate vector of all feature points in one picture, called the shape. The regression iteration formulas are:

$$\hat{S}^{(t+1)} = \hat{S}^{(t)} + \lambda_t(P, \hat{S}^{(t)})$$

$$\Delta S^{(t+1)} = S_z - \hat{S}^{(t+1)}$$

where $\lambda_t$ is one regressor in a regression model formed by cascading several regressors, $\hat{S}^{(t)}$ is the current estimate of $S$, $\lambda_t(\cdot)$ is the update vector predicted by the cascaded regressor from the image, $S_z$ is the true face shape, and $\Delta S^{(t+1)}$ is the residual. The current face shape and the sample image are fed into the regression model, which predicts an update vector to obtain a new shape estimate (the new current face shape) and its residual, i.e., the difference between the current shape and the true shape. The regression model is updated iteratively according to the current residual, continually reducing it, gradually approaching the true face shape, and finally extracting the facial feature points accurately.
Preferably, the training process of the regression model formed by the N regressors specifically comprises:
defining the sample set $(P_1, S_1), \ldots, (P_n, S_n)$, where $P_r$ ($r = 1, 2, \ldots, n$) are the expression images in the sample set and the shape vector corresponding to each expression image is $S_r$ ($r = 1, 2, \ldots, n$);
inputting the training sample images, the initial shape estimates, and the residual amounts, with learning rate $\rho$; the initialized regression function $f_0$ is:

$$f_0(P, \hat{S}^{(t)}) = \arg\min_{C} \sum_{r=1}^{N} \left\| \Delta S_r^{(t)} - C \right\|^2$$

where $C$ is the constant that minimizes the initial prediction loss function, $\Delta S_r^{(t)}$ is the residual amount, and $N = nR$, where $R$ is the initialization multiple of each expression image;
taking the squared error as the loss function and differentiating it to obtain the gradient that is fitted at each iteration step:

$$r_{rk} = \Delta S_r^{(t)} - f_{k-1}(P_r, \hat{S}_r^{(t)})$$

where $f_{k-1}$ is the regression function of the previous round, $\Delta S_r^{(t)}$ is the residual amount, and $\hat{S}_r^{(t)}$ is the current estimate of $S_r$;
for $k = 1, \ldots, K$, constructing a regression tree $\lambda_k$ from the weak classifiers $G$, where $K$ is the number of weak classifiers $G$, and updating:

$$f_k(P, \hat{S}^{(t)}) = f_{k-1}(P, \hat{S}^{(t)}) + \rho\, \lambda_k(P, \hat{S}^{(t)})$$

Finally $\lambda_t = f_K$ is obtained, completing the construction of the regression model.
Preferably, in step 3), the facial expression representation model computes the corresponding Euclidean distances to obtain the six-element array characterizing the facial expression features.
Preferably, computing the corresponding Euclidean distances with the facial expression representation model specifically comprises: classifying and discriminating with the facial expression representation model, measured by Euclidean distance, and computing the six-element array $D = (d_1, d_2, d_3, d_4, d_5, d_6)$, where $d_1$ is the distance between the two eyebrows, $d_2$ the distance between eyebrow and eye, $d_3$ the distance between the upper and lower eyelid boundaries, $d_4$ the height of the mouth, $d_5$ the width of the mouth, and $d_6$ the distance from the mouth corner to the highest point of the upper lip.
Preferably, in step 4), a random forest algorithm trains the classification model, and the six-element array is fed into the trained model for classification and recognition.
Preferably, the classification and recognition specifically comprise the following steps:
Step 1: select the expression database fer2013 for training and randomly draw a subset of the samples and a subset of the attributes;
Step 2: determine the splitting attribute among the candidate attributes with the Gini coefficient, generate the nodes, grow a CART decision tree, and form a random forest from the multiple decision trees generated;
Step 3: after a sample is input, the forest produces N classification results; a voting mechanism tallies the classification results obtained for all input samples, and the class with the most votes is the output recognition result.
Preferably, determining the splitting attribute among the candidate attributes with the Gini coefficient specifically comprises:
letting the sample set $X$ contain $N$ classes; the Gini coefficient is defined as:

$$\mathrm{Gini}(X) = 1 - \sum_{i=1}^{N} \sigma_i^2$$

where $\sigma_i$ is the frequency of class $i$ in the sample set $X$; if $X$ is split under the selected attribute $x$ into two sample subsets $X_1$ and $X_2$, the weighted sum of the Gini coefficients of the two subsets after the split is:

$$\mathrm{Gini}_{split(x)}(X) = \frac{M_1}{M}\,\mathrm{Gini}(X_1) + \frac{M_2}{M}\,\mathrm{Gini}(X_2)$$

where $M_1$ and $M_2$ are the numbers of samples in $X_1$ and $X_2$, and $M$ is the number of samples in $X$; the Gini gain is:

$$\Delta\mathrm{Gini} = \mathrm{Gini}(X) - \mathrm{Gini}_{split(x)}(X)$$

At each splitting node, the attribute with the smallest split Gini coefficient is selected as the splitting attribute.
The invention preprocesses the original image with a Haar-feature-based Adaboost cascade detector, establishes the pose mapping with a cascaded regression tree algorithm to achieve face alignment, obtains the facial expression features with the facial expression representation model, and finally classifies and recognizes with a random forest algorithm. It thereby extracts facial expression features accurately at a deflection angle, improves recognition efficiency, and has definite practical value.
Compared with the prior art, the invention has the following beneficial effects:
(1) Features are extracted with the cascaded regression tree algorithm, which overcomes the low localization accuracy of directly extracting face key points; the face is first aligned through regression iteration and the features are then extracted, effectively improving the extraction accuracy of non-frontal expression features.
(2) A random forest algorithm performs the classification and recognition, improving recognition efficiency, shortening running time, and enhancing practical value.
(3) The deflection range of facial expression recognition is enlarged, broadening the scenes to which the system applies.
Drawings
Fig. 1 is a schematic flow chart of the enhanced recognition method for facial expression images according to this embodiment.
Detailed Description
To help those skilled in the art better understand the technical solution of the invention, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments. Clearly, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the invention.
This embodiment discloses an enhanced recognition method for facial expression images that recognizes expression images in the standard pose, performs well on expression images with a certain amount of head deflection, reduces algorithmic complexity through ensemble learning, increases running speed, and is better suited to real-world scenes.
As shown in fig. 1, the method for enhancing and identifying a facial expression image in this embodiment specifically includes the following steps:
1) Perform face localization with a Haar-feature-based Adaboost cascade detector, frame and crop the face region, apply gray-level normalization and scale normalization to the cropped image, and store it. The specific process is as follows:
First obtain the Haar features of the image and compute the Haar feature values by traversal with an integral image, then use them as classifier input. Give every input sample the same initial weight and train a weak classifier; select the weak classifier with the smallest error as the optimal weak classifier of the current round; compute the weight of the next optimal weak classifier from the error between the predicted and true values of the current one; after multiple iterations, combine the N optimal weak classifiers by weighting into a strong classifier; finally cascade several strong classifiers for detection and localization.
Locate and frame the face region, then expand the face frame outward by a suitable proportion to ensure a complete face image is captured. After cropping, preprocess the image with scale normalization and gray-level normalization: shrink the image moderately to increase its smoothness and clarity, and convert it to grayscale by weighted averaging, preserving the image's information as accurately as possible while reducing the computational load.
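As a hedged illustration of the preprocessing just described (frame expansion, weighted-average graying, scale normalization), the following Python sketch uses only NumPy. The Haar/Adaboost detector itself is assumed to come from an external library such as OpenCV's `cv2.CascadeClassifier` and is not reimplemented; the 0.299/0.587/0.114 graying weights are the common BT.601 luma coefficients, an assumption since the patent does not state its weights.

```python
import numpy as np

def to_gray_weighted(img_rgb):
    """Weighted-average graying: luma = 0.299 R + 0.587 G + 0.114 B.
    The exact weights are an assumption (BT.601); the patent only says
    'weighted average graying'."""
    weights = np.array([0.299, 0.587, 0.114])
    return img_rgb.astype(np.float64) @ weights

def expand_box(x, y, w, h, ratio, img_w, img_h):
    """Expand a detected face box outward by `ratio` on each side,
    clipped to the image, so the whole face is kept before cropping."""
    dx, dy = int(w * ratio), int(h * ratio)
    x0, y0 = max(0, x - dx), max(0, y - dy)
    x1, y1 = min(img_w, x + w + dx), min(img_h, y + h + dy)
    return x0, y0, x1, y1

def scale_normalize(gray, out_size=64):
    """Crude scale normalization by block averaging to out_size x out_size.
    (A real pipeline would use cv2.resize; this keeps the sketch NumPy-only.)"""
    h, w = gray.shape
    ys = np.arange(out_size + 1) * h // out_size
    xs = np.arange(out_size + 1) * w // out_size
    out = np.empty((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            out[i, j] = gray[ys[i]:ys[i + 1], xs[i]:xs[i + 1]].mean()
    return out

# Tiny demo: a synthetic 128x128 RGB "face crop" stands in for a detection
img = np.random.default_rng(0).integers(0, 256, (128, 128, 3))
gray = to_gray_weighted(img)
face = scale_normalize(gray, out_size=64)
box = expand_box(10, 10, 20, 20, 0.25, 100, 100)
```

The output size of 64 and the expansion ratio of 0.25 are illustrative choices, not values from the patent.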
2) Establish the mapping between face appearance and face shape with a cascaded regression tree algorithm in the regression model, achieve face alignment, and extract 68 facial feature points. The specific steps are as follows:
The algorithm comprises a two-layer regression process. The 300-W database is selected as the training sample set. Define $a_i = (x_i, y_i)$, $i = 1, 2, \ldots, 68$, as the coordinates of the feature points in one picture $P$ of the sample set, and define $S = (a_1^\top, a_2^\top, \ldots, a_m^\top)$, $m = 68$, as the coordinate vector of all feature points in one picture, called the shape. In the first-layer regression, the regression iteration formulas are:

$$\hat{S}^{(t+1)} = \hat{S}^{(t)} + \lambda_t(P, \hat{S}^{(t)})$$

$$\Delta S^{(t+1)} = S_z - \hat{S}^{(t+1)}$$

where $\lambda_t$ is one regressor in a regression model formed by cascading several regressors, $\hat{S}^{(t)}$ is the current estimate of $S$, $\lambda_t(\cdot)$ is the update vector predicted by the cascaded regressor from the image, $S_z$ is the true face shape, and $\Delta S^{(t+1)}$ is the residual. The current face shape and the sample image are fed into the regression model, which predicts an update vector to obtain a new shape estimate (the new current face shape) and its residual, i.e., the difference between the current shape and the true shape. The regression model is updated iteratively according to the current residual, continually reducing it, gradually approaching the true face shape, and finally extracting the facial feature points accurately.
The second-layer regression trains the regression model formed by the N regressors. The training process is specifically:
Define the sample set $(P_1, S_1), \ldots, (P_n, S_n)$ in the training database, where $P_r$ ($r = 1, 2, \ldots, n$) are the expression images in the sample set and $S_r$ ($r = 1, 2, \ldots, n$) is the shape vector corresponding to each expression image.
Input the training sample images, the initial shape estimates, and the residual amounts, with learning rate $\rho$. The initialized regression function $f_0$ is:

$$f_0(P, \hat{S}^{(t)}) = \arg\min_{C} \sum_{r=1}^{N} \left\| \Delta S_r^{(t)} - C \right\|^2$$

where $C$ is the constant that minimizes the initial prediction loss function, $\Delta S_r^{(t)}$ is the residual amount, and $N = nR$, where $R$ is the initialization multiple of each expression image.
Take the squared error as the loss function and differentiate it to obtain the gradient that is fitted at each iteration step:

$$r_{rk} = \Delta S_r^{(t)} - f_{k-1}(P_r, \hat{S}_r^{(t)})$$

where $f_{k-1}$ is the regression function of the previous round, $\Delta S_r^{(t)}$ is the residual amount, and $\hat{S}_r^{(t)}$ is the current estimate of $S_r$.
For $k = 1, \ldots, K$, construct a regression tree $\lambda_k$ from the weak classifiers $G$, where $K$ is the number of weak classifiers $G$, and update:

$$f_k(P, \hat{S}^{(t)}) = f_{k-1}(P, \hat{S}^{(t)}) + \rho\, \lambda_k(P, \hat{S}^{(t)})$$

Finally $\lambda_t = f_K$ is obtained, completing the construction of the regression model.
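The boosting loop above — a constant initialization $f_0$, residual targets $r_{rk}$, and shrinkage by $\rho$ — can be sketched numerically. This is a hedged toy version: the patent's weak regressors are regression trees over image features, whereas here each weak regressor is a depth-1 stump on synthetic features, and the target is one-dimensional for brevity, so only the residual-shrinking behavior of the update $f_k = f_{k-1} + \rho\,\lambda_k$ is demonstrated.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 5
X = rng.normal(size=(n, p))              # stand-in image features
y = X[:, 0] * 2.0 + np.sin(X[:, 1])      # stand-in residual targets (1-D ΔS_r)

rho, K = 0.5, 50                         # learning rate ρ and number of weak regressors K
f = np.full(n, y.mean())                 # f_0: the constant C minimizing Σ ||ΔS_r − C||²
stumps = []

for k in range(K):
    r = y - f                            # gradient targets r_rk = ΔS_r − f_{k−1}
    j = k % p                            # cycle through features (toy feature selection)
    thr = np.median(X[:, j])
    left = X[:, j] <= thr
    # Depth-1 regression tree λ_k: predict the mean residual of each leaf
    g = np.where(left, r[left].mean(), r[~left].mean())
    stumps.append((j, thr))
    f = f + rho * g                      # f_k = f_{k−1} + ρ λ_k

err0 = np.mean((y - y.mean()) ** 2)      # loss of the constant model f_0
errK = np.mean((y - f) ** 2)             # loss after K boosting rounds
```

Because each leaf-mean stump has non-negative correlation with the residual, the training loss decreases monotonically across rounds, which is the convergence behavior the iteration formulas describe.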
After the model file has been trained, inputting an expression picture yields the 68 accurately extracted facial feature points.
3) Compute the corresponding Euclidean distances with the facial expression representation model to obtain the six-element array characterizing the facial expression features, specifically:
Since individual expressions can be classified and discriminated by the structural characteristics of facial expressions, a facial expression representation model measured by Euclidean distance is adopted. The extracted facial feature points are fed into the model, which computes the six-element array $D = (d_1, d_2, d_3, d_4, d_5, d_6)$, where $d_1$ is the distance between the two eyebrows, $d_2$ the distance between eyebrow and eye, $d_3$ the distance between the upper and lower eyelid boundaries, $d_4$ the height of the mouth, $d_5$ the width of the mouth, and $d_6$ the distance from the mouth corner to the highest point of the upper lip.
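A hedged sketch of computing $D = (d_1, \ldots, d_6)$ from 68 landmarks follows. The landmark indices use the common iBUG 300-W 68-point convention (eyebrows 17-26, eyes 36-47, mouth 48-67); the patent does not state its indexing, so every index choice below is an assumption for illustration.

```python
import numpy as np

def dist(pts, i, j):
    """Euclidean distance between landmarks i and j."""
    return float(np.linalg.norm(pts[i] - pts[j]))

def expression_features(pts):
    """Six-element array D from a (68, 2) landmark array.
    All index choices are assumptions following the iBUG 300-W layout."""
    d1 = dist(pts, 21, 22)   # gap between the two inner eyebrow ends
    d2 = dist(pts, 19, 37)   # eyebrow to upper eyelid (left side)
    d3 = dist(pts, 37, 41)   # upper vs lower eyelid boundary (left eye)
    d4 = dist(pts, 51, 57)   # mouth height: upper lip to lower lip
    d5 = dist(pts, 48, 54)   # mouth width: corner to corner
    d6 = dist(pts, 48, 51)   # mouth corner to highest point of upper lip
    return np.array([d1, d2, d3, d4, d5, d6])

# Demo with synthetic landmarks in a 200x200 image frame
rng = np.random.default_rng(0)
pts = rng.uniform(0, 200, size=(68, 2))
D = expression_features(pts)
```

In practice one would normalize these distances (e.g., by inter-ocular distance) to remove scale effects, though the patent does not specify such a step.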
4) Train the classification model with a random forest algorithm and feed the six-element array into the trained model for classification and recognition, specifically:
The facial expression library fer2013 is selected for training. Most images in this library exhibit in-plane and out-of-plane rotation, and many are occluded by hands, hair, scarves, and the like, so the library is closer to real-life situations. The specific training process with the random forest algorithm is:
Step 1: select the expression database fer2013 for training; after preprocessing and feature extraction on the expression images of the fer2013 database, randomly draw a subset of the samples and a subset of the attributes. Step 2: determine the splitting attribute among the candidate attributes with the Gini coefficient, generate the nodes, grow CART decision trees, and form a random forest from the multiple decision trees generated. Determining the splitting attribute with the Gini coefficient proceeds as follows:
Let the sample set $X$ contain $N$ classes. The Gini coefficient is defined as:

$$\mathrm{Gini}(X) = 1 - \sum_{i=1}^{N} \sigma_i^2$$

where $\sigma_i$ is the frequency of class $i$ in the sample set $X$. If $X$ is split under the selected attribute $x$ into two sample subsets $X_1$ and $X_2$, the weighted sum of the Gini coefficients of the two subsets after the split is:

$$\mathrm{Gini}_{split(x)}(X) = \frac{M_1}{M}\,\mathrm{Gini}(X_1) + \frac{M_2}{M}\,\mathrm{Gini}(X_2)$$

where $M_1$ and $M_2$ are the numbers of samples in $X_1$ and $X_2$, and $M$ is the number of samples in $X$. The Gini gain is:

$$\Delta\mathrm{Gini} = \mathrm{Gini}(X) - \mathrm{Gini}_{split(x)}(X)$$

At each splitting node, the attribute with the smallest split Gini coefficient is selected as the splitting attribute.
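The Gini formulas above translate directly into code. This is a minimal illustration on a toy label set, not the patent's implementation:

```python
from collections import Counter

def gini(labels):
    """Gini(X) = 1 - sum(sigma_i^2), with sigma_i the class-i frequency in X."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_split(left_labels, right_labels):
    """Weighted Gini of a binary split: (M1/M)·Gini(X1) + (M2/M)·Gini(X2).
    The best split is the one minimizing this value (maximizing the gain)."""
    m1, m2 = len(left_labels), len(right_labels)
    m = m1 + m2
    return (m1 / m) * gini(left_labels) + (m2 / m) * gini(right_labels)

labels = ["happy"] * 5 + ["sad"] * 5
# A perfect split isolates each class -> split Gini of 0
pure_split = gini_split(["happy"] * 5, ["sad"] * 5)
# A useless split leaves both subsets mixed -> higher split Gini
mixed_split = gini_split(["happy", "sad", "happy", "sad", "happy"],
                         ["sad", "happy", "sad", "happy", "sad"])
```

A CART node would evaluate `gini_split` for every candidate attribute and threshold and keep the minimum, exactly as the text describes.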
Step 3: after a sample is input, the forest produces N classification results; a voting mechanism tallies the classification results obtained for all input samples, and the class with the most votes is the output recognition result.
After the optimal model parameters are determined, the obtained six-element array is fed into the model to obtain the final classification and recognition result.
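Step 4 as a whole — training a random forest of Gini-split CART trees on six-element arrays and predicting by majority vote — can be sketched with scikit-learn. The library choice is an assumption (the patent names no implementation), and the synthetic data below merely stands in for features extracted from fer2013: two expression classes separated mainly by mouth height $d_4$.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic six-element arrays (d1..d6) for two hypothetical classes:
# class 0 has a small mouth height d4, class 1 a large one.
n = 100
X0 = rng.normal(loc=[30, 15, 8, 5, 40, 10], scale=1.0, size=(n, 6))
X1 = rng.normal(loc=[30, 15, 8, 25, 40, 10], scale=1.0, size=(n, 6))
X = np.vstack([X0, X1])
y = np.array([0] * n + [1] * n)

# Each tree is a CART grown on a bootstrap sample with random feature
# subsets; the forest predicts by majority vote over the trees.
forest = RandomForestClassifier(n_estimators=50, criterion="gini", random_state=0)
forest.fit(X, y)

probe = np.array([[30, 15, 8, 24, 40, 10]])   # large mouth height -> class 1
pred = int(forest.predict(probe)[0])
acc = forest.score(X, y)
```

With `criterion="gini"` the split selection matches the Gini-gain rule of step 2; the number of trees (50) is an illustrative choice, not a value from the patent.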
The above embodiments are preferred embodiments of the invention, but the embodiments of the invention are not limited to them. Any change, modification, substitution, combination, or simplification made without departing from the spirit and principle of the invention shall be deemed an equivalent replacement and is included in the protection scope of the invention.

Claims (10)

1. An enhanced recognition method for facial expression images, characterized by comprising the following steps:
1) performing face localization with an Adaboost cascade detector based on Haar features, framing and cropping the face region, preprocessing the cropped image, and storing it;
2) establishing a mapping between face appearance and face shape with a regression model, achieving face alignment, and extracting the facial feature points;
3) computing the Euclidean distances corresponding to the facial feature points to obtain a six-element array characterizing the facial expression features;
4) feeding the six-element array into a trained classification model for classification and recognition.
2. The enhanced recognition method for facial expression images according to claim 1, characterized in that in step 1) the face localization with the Haar-feature-based Adaboost cascade detector specifically comprises:
first obtaining the Haar features of the image and computing the Haar feature values by traversal with an integral image, then using the feature values as classifier input; giving every input sample the same initial weight and training a weak classifier; selecting the weak classifier with the smallest error as the optimal weak classifier of the current round; computing the weight of the next optimal weak classifier from the error between the predicted and true values of the current one; after multiple iterations, combining the N optimal weak classifiers by weighting into a strong classifier; and finally cascading several strong classifiers for detection and localization.
3. The method for enhanced recognition of facial expression images according to claim 2, wherein step 2) establishes the mapping relation between the face appearance and the face shape by using a cascaded regression tree algorithm as the regression model, achieves face alignment, and extracts h facial feature points.
4. The method of claim 3, wherein extracting the h facial feature points specifically comprises:
selecting the 300-W database as the training sample set, and defining a_i = (x_i, y_i) (i = 1, 2, ..., h) as the coordinates of the i-th feature point in a picture P of the sample set; S = (a_1^T, a_2^T, ..., a_m^T), with m = h, is the coordinate vector of all feature points in one picture and is called the shape; the regression iteration formulas are:

Ŝ^(t+1) = Ŝ^(t) + λ_t(P, Ŝ^(t))

ΔS^(t+1) = S_z − Ŝ^(t+1)

wherein λ_t is one regressor of the regression model formed by cascading a plurality of regressors, Ŝ^(t) is the current estimate of S, λ_t(P, Ŝ^(t)) is the update vector predicted by the cascaded regressor from the image and the current estimate, S_z is the true face shape, and ΔS^(t+1) is the residual; the current face shape and the sample image are input into the regression model, which predicts an update vector to obtain a new shape estimate, i.e. the new current face shape, and the residual of the current face shape, i.e. the difference between the current shape and the true shape; the regression model is iteratively updated according to the current residual, so that the residual decreases continuously, the estimate gradually approaches the true face shape, and finally the facial feature points are extracted accurately.
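The iteration loop of claim 4 can be sketched with a stand-in regressor: for illustration, each stage predicts a fixed fraction of the true residual, where a real cascade would use learned regression trees. The shapes below are invented toy vectors, not real landmark data:

```python
import numpy as np

def cascade_align(S0, S_true, n_stages=10, step=0.5):
    """Toy cascade: each stage adds a predicted update vector to the
    current shape estimate, shrinking the residual S_true - S_hat.
    The update here is a stub standing in for lambda_t(P, S_hat)."""
    S_hat = S0.copy()
    residual_norms = []
    for _ in range(n_stages):
        update = step * (S_true - S_hat)   # stand-in for the learned regressor
        S_hat = S_hat + update
        residual_norms.append(np.linalg.norm(S_true - S_hat))
    return S_hat, residual_norms

S0 = np.zeros(10)                    # initial (mean) shape: 5 points, flattened
S_true = np.linspace(1.0, 2.0, 10)   # toy "ground-truth" shape S_z
S_hat, residuals = cascade_align(S0, S_true)
```

The residual norm shrinks monotonically stage by stage, mirroring the claim's statement that the estimate gradually approaches the true shape.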
5. The method of claim 4, wherein the training process of the regression model formed by the cascaded regressors specifically comprises:
defining the sample set (P_1, S_1), ..., (P_n, S_n), wherein P_r (r = 1, 2, ..., n) are the expression images in the sample set and S_r (r = 1, 2, ..., n) is the shape vector corresponding to each expression image;
inputting the training sample images, the initial shape estimates and the residuals, with learning rate ρ; the initialised regression function f_0 is:

f_0 = argmin_C Σ_{r=1}^{N} ||ΔS_r^(0) − C||²

wherein C is the constant minimising the initial prediction loss function, N = nR, and R is the number of initialisations of each expression image;
taking the squared error as the loss function, the gradient obtained by differentiating the loss function serves as the fitting target in each iteration step:

g_r = ΔS_r^(t) − f_{k−1}(P_r, Ŝ_r^(t)), r = 1, 2, ..., N

wherein f_{k−1} is the current regression function, ΔS_r^(t) is the residual, and Ŝ_r^(t) is the current estimate of S_r;
for k = 1, ..., K, a regression tree λ_k is constructed on the basis of a weak classifier G fitted to g_r, K being the number of weak classifiers G, and the regression function is updated:

f_k = f_{k−1} + ρ λ_k

finally obtaining λ_t = f_K and completing the construction of the regression model.
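The gradient-boosting construction of claim 5 can be sketched on one-dimensional toy data, with depth-1 regression stumps standing in for the regression trees λ_k; all data and the stump learner are assumptions made for illustration:

```python
import numpy as np

def fit_stump(x, residual):
    """Fit a depth-1 regression tree (stump) on 1-D inputs by scanning
    thresholds and minimising the squared error of the two leaf means."""
    best = None
    for thr in np.unique(x):
        left, right = residual[x <= thr], residual[x > thr]
        if len(left) == 0 or len(right) == 0:
            continue
        pred = np.where(x <= thr, left.mean(), right.mean())
        sse = ((residual - pred) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, thr, left.mean(), right.mean())
    _, thr, lv, rv = best
    return lambda q: np.where(q <= thr, lv, rv)

def boost(x, y, rho=0.5, K=20):
    """Gradient boosting with squared loss: start from the constant C
    minimising the loss (the mean), then repeatedly fit a stump to the
    current residual (the negative gradient) and step by rho."""
    pred = np.full_like(y, y.mean())
    for _ in range(K):
        tree = fit_stump(x, y - pred)     # fit the residual g_r
        pred = pred + rho * tree(x)       # f_k = f_{k-1} + rho * lambda_k
    return pred

x = np.arange(8, dtype=float)
y = np.array([0., 0., 0., 0., 1., 1., 1., 1.])
pred = boost(x, y)
```

Each round halves the remaining residual on this toy step function, so the fitted values converge to the targets.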
6. The method for enhanced recognition of facial expression images according to claim 5, wherein step 3) calculates the corresponding Euclidean distances by using a facial expression representation model to obtain the six-element array characterizing the facial expression features.
7. The method of claim 6, wherein calculating the Euclidean distances using the facial expression representation model specifically comprises: performing classification and discrimination through the facial expression representation model, measured by Euclidean distance, to compute the six-element array D = (d1, d2, d3, d4, d5, d6), wherein d1 denotes the distance between the two eyebrows, d2 the distance between eyebrow and eye, d3 the distance between the upper and lower boundaries of the eye, d4 the height of the mouth, d5 the width of the mouth, and d6 the distance from the mouth corner to the highest point of the upper lip.
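The six-element array of claim 7 can be sketched as follows; the landmark names and coordinates below are hypothetical stand-ins for the h extracted feature points, not the patent's actual landmark scheme:

```python
import math

def euclid(p, q):
    """Euclidean distance between two 2-D landmark coordinates."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def expression_vector(pts):
    """Build D = (d1..d6) from a dict of landmark coordinates.
    The keys are invented names for the relevant feature points."""
    return (
        euclid(pts["brow_l_inner"], pts["brow_r_inner"]),  # d1: between eyebrows
        euclid(pts["brow_l_inner"], pts["eye_l_top"]),     # d2: eyebrow to eye
        euclid(pts["eye_l_top"], pts["eye_l_bottom"]),     # d3: eye opening
        euclid(pts["lip_top"], pts["lip_bottom"]),         # d4: mouth height
        euclid(pts["mouth_l"], pts["mouth_r"]),            # d5: mouth width
        euclid(pts["mouth_l"], pts["lip_top"]),            # d6: corner to upper lip
    )

# toy landmark coordinates (pixels) for one detected face
pts = {
    "brow_l_inner": (40, 30), "brow_r_inner": (60, 30),
    "eye_l_top": (40, 40), "eye_l_bottom": (40, 46),
    "lip_top": (50, 70), "lip_bottom": (50, 82),
    "mouth_l": (42, 76), "mouth_r": (58, 76),
}
D = expression_vector(pts)
```

The resulting tuple is the feature vector that claim 8 feeds into the trained classifier.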
8. The method for enhanced recognition of facial expression images according to claim 7, wherein in step 4), a random forest algorithm is used to train the classification model, and the six-element array is input into the trained model to realize classification and recognition.
9. The method for enhanced recognition of facial expression images according to claim 8, wherein the classification and recognition specifically comprise the following steps:
step one, selecting the fer2013 expression database for training and randomly extracting a subset of the samples and a subset of the attributes;
step two, determining the splitting attribute from the candidate attributes by using the Gini coefficient, generating nodes to build a CART decision tree, and forming a random forest from the multiple generated decision trees;
step three, after a sample is input, the forest produces N classification results; a voting mechanism is applied to the classification results of all the trees, and the class with the most votes is the output recognition result.
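The voting mechanism of step three can be sketched as a simple majority vote over per-tree outputs; the tree outputs below are invented for illustration:

```python
from collections import Counter

def forest_predict(tree_outputs):
    """Majority vote over the class labels emitted by the individual
    decision trees; the most-voted label is the recognised expression."""
    votes = Counter(tree_outputs)
    return votes.most_common(1)[0][0]

# hypothetical outputs of N = 7 decision trees for one input sample
label = forest_predict(["happy", "happy", "neutral", "happy",
                        "sad", "happy", "neutral"])
```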
10. The method for enhanced recognition of facial expression images according to claim 9, wherein determining the splitting attribute from the candidate attributes by using the Gini coefficient specifically comprises:
letting the sample set X contain N classes, the Gini coefficient being defined as:

Gini(X) = 1 − Σ_{i=1}^{N} σ_i²

wherein σ_i denotes the frequency of occurrence of class i in the sample set X; if X is divided under the selected attribute x into two sample subsets X_1 and X_2, the weighted sum of the Gini coefficients of the two subsets is:

Gini_split(x)(X) = (M_1 / M) · Gini(X_1) + (M_2 / M) · Gini(X_2)

wherein M_1 and M_2 are the numbers of samples in X_1 and X_2 respectively, and M is the number of samples in X; the reduction of the Gini coefficient is:

ΔGini = Gini(X) − Gini_split(x)(X)

and at each splitting node the attribute with the minimum Gini_split(x)(X), i.e. the maximum ΔGini, is selected as the splitting attribute.
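The Gini computation of claim 10 can be sketched as follows; the toy labels and candidate splits are invented for illustration:

```python
from collections import Counter

def gini(labels):
    """Gini(X) = 1 - sum_i sigma_i^2, sigma_i the frequency of class i."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_gain(labels, mask):
    """Gini(X) - Gini_split(X): the reduction achieved by splitting X
    into the subsets selected / rejected by the boolean mask."""
    left = [y for y, m in zip(labels, mask) if m]
    right = [y for y, m in zip(labels, mask) if not m]
    n = len(labels)
    split = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - split

labels = ["happy", "happy", "sad", "sad"]
perfect = [True, True, False, False]   # separates the two classes exactly
useless = [True, False, True, False]   # leaves both subsets mixed
```

A perfect split achieves the maximum reduction (here 0.5), while a split that leaves both subsets mixed reduces nothing, which is why the node chooses the attribute with the largest ΔGini.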
CN202011377211.9A 2020-11-30 2020-11-30 Enhanced recognition method for facial expression image Active CN112381047B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011377211.9A CN112381047B (en) 2020-11-30 2020-11-30 Enhanced recognition method for facial expression image


Publications (2)

Publication Number Publication Date
CN112381047A true CN112381047A (en) 2021-02-19
CN112381047B CN112381047B (en) 2023-08-22

Family

ID=74590391

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011377211.9A Active CN112381047B (en) 2020-11-30 2020-11-30 Enhanced recognition method for facial expression image

Country Status (1)

Country Link
CN (1) CN112381047B (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105631436A (en) * 2016-01-27 2016-06-01 桂林电子科技大学 Face alignment method based on cascade position regression of random forests
CN106682598A (en) * 2016-12-14 2017-05-17 华南理工大学 Multi-pose facial feature point detection method based on cascade regression
CN108108677A (en) * 2017-12-12 2018-06-01 重庆邮电大学 One kind is based on improved CNN facial expression recognizing methods
CN109961006A (en) * 2019-01-30 2019-07-02 东华大学 A kind of low pixel multiple target Face datection and crucial independent positioning method and alignment schemes
CN111523367A (en) * 2020-01-22 2020-08-11 湖北科技学院 Intelligent facial expression recognition method and system based on facial attribute analysis


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113065552A (en) * 2021-03-29 2021-07-02 天津大学 Method for automatically positioning head shadow measurement mark point
CN113111789A (en) * 2021-04-15 2021-07-13 山东大学 Facial expression recognition method and system based on video stream
CN113111789B (en) * 2021-04-15 2022-12-20 山东大学 Facial expression recognition method and system based on video stream
CN117437522A (en) * 2023-12-19 2024-01-23 福建拓尔通软件有限公司 Face recognition model training method, face recognition method and device
CN117437522B (en) * 2023-12-19 2024-05-03 福建拓尔通软件有限公司 Face recognition model training method, face recognition method and device

Also Published As

Publication number Publication date
CN112381047B (en) 2023-08-22

Similar Documents

Publication Publication Date Title
US10929649B2 (en) Multi-pose face feature point detection method based on cascade regression
Mahmood et al. WHITE STAG model: Wise human interaction tracking and estimation (WHITE) using spatio-temporal and angular-geometric (STAG) descriptors
CN108268838B (en) Facial expression recognition method and facial expression recognition system
Jiang et al. Multi-layered gesture recognition with Kinect.
Ding et al. Features versus context: An approach for precise and detailed detection and delineation of faces and facial features
Sung et al. Example-based learning for view-based human face detection
US7912246B1 (en) Method and system for determining the age category of people based on facial images
Amor et al. 4-D facial expression recognition by learning geometric deformations
CN112381047B (en) Enhanced recognition method for facial expression image
CN104616316B (en) Personage's Activity recognition method based on threshold matrix and Fusion Features vision word
CN103279768B (en) A kind of video face identification method based on incremental learning face piecemeal visual characteristic
CN106599785B (en) Method and equipment for establishing human body 3D characteristic identity information base
Li et al. Efficient 3D face recognition handling facial expression and hair occlusion
CN103093237B (en) A kind of method for detecting human face of structure based model
CN107392105B (en) Expression recognition method based on reverse collaborative salient region features
More et al. Gait recognition by cross wavelet transform and graph model
Xia et al. Face occlusion detection using deep convolutional neural networks
EP2535787A2 (en) 3D free-form gesture recognition system for character input
Du High-precision portrait classification based on mtcnn and its application on similarity judgement
Bhuyan et al. Trajectory guided recognition of hand gestures having only global motions
CN110516638B (en) Sign language recognition method based on track and random forest
Juang et al. Human posture classification using interpretable 3-D fuzzy body voxel features and hierarchical fuzzy classifiers
Saabni Facial expression recognition using multi Radial Bases Function Networks and 2-D Gabor filters
Khan et al. Suspect identification using local facial attributed by fusing facial landmarks on the forensic sketch
Kelly et al. Recognition of spatiotemporal gestures in sign language using gesture threshold hmms

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant