CN109034079B - Facial expression recognition method for non-standard posture of human face - Google Patents
- Publication number: CN109034079B (application CN201810865356.XA)
- Authority: CN (China)
- Prior art keywords: layer, vector, expression, representing, weight
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
- G06V40/175—Static expression
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
Abstract
The invention relates to a facial expression recognition method for a human face in a non-standard posture, which overcomes the prior-art defect that expression recognition cannot be carried out when facial expression information is incomplete. The invention comprises the following steps: collecting and preprocessing training images; constructing a classification model; collecting and preprocessing an image to be detected; and recognizing the facial expression. Based on expression prediction, the method infers the facial expression even when facial expression information is incomplete, so that facial expression recognition can be carried out on faces at different angles.
Description
Technical Field
The invention relates to the technical field of image recognition, in particular to a facial expression recognition method for a human face under a non-standard posture.
Background
Computer vision and pattern recognition techniques allow a computer to judge human facial expressions, and this has become a research focus for many scientists. In medicine, if a computer can effectively analyze a patient's expression, it can soothe the patient or carry out further treatment according to changes in that expression, relieving the patient's pain both psychologically and physiologically. In daily life, if a computer can effectively identify the emotions of depressed children, the psychological burden on their families can be greatly relieved.
In the prior art, facial expression recognition can be realized only by judging the whole face captured in a standard shooting pose. In practical applications, the person to be recognized must face the camera directly so that complete face information can be acquired. In medical applications of face recognition, however, it is difficult to obtain complete face information from the person to be recognized. Some technicians have proposed predicting facial expressions from partial face information using predictive analysis, but such prediction has remained at the theoretical stage, and its accuracy is very low.
Therefore, how to recognize facial expressions in the non-standard postures of the human face becomes an urgent technical problem to be solved.
Disclosure of Invention
The invention aims to solve the defect that expression recognition cannot be performed under the condition that facial expression information is incomplete in the prior art, and provides a facial expression recognition method for a human face in a non-standard posture to solve the problem.
In order to achieve the purpose, the technical scheme of the invention is as follows:
a facial expression recognition method for a human face under a nonstandard posture comprises the following steps:
11) collecting and preprocessing training images: collecting samples of 7 expressions, namely happiness, anger, sadness, surprise, disgust, fear and neutrality, all captured as facial expressions in non-standard postures, with no fewer than 100 samples of each type used as training images, and performing histogram equalization and normalization on all training images;
12) constructing a classification model: extracting the spatial information features of the 7 expressions and constructing the classification model based on these spatial information features;
13) collecting and preprocessing the image to be detected: collecting the facial expression to be recognized, shot by the acquisition equipment with the face in a non-standard posture, and performing histogram equalization and normalization on the image to be detected to generate a test sample;
14) recognizing the facial expression: inputting the preprocessed test sample into the spatial feature network model to automatically recognize the facial expression.
The construction of the classification model comprises the following steps:
21) vectorization of 7 expression training sample images:
wherein F represents the spatial information features of the 7-expression training set, and F_ij represents the feature vector of the j-th image of the i-th expression, where i = 1, 2, …, 7 and j = 1, 2, …, n, there being n images in each training set;
22) vectorizing the expression space features, wherein the direction of a vector represents the space information of facial expressions, and the length of the vector represents the category probability of the expressions;
23) calculating the weight of the spatial feature network model extraction layer,
inputting the vectorized facial expression features into a first layer of convolution network, performing nonlinear normalization, and performing convolution operation by adopting a kernel function of 11 × 11; inputting the feature vector after convolution operation into a first spatial feature network with N neuron layers, obtaining weights of the N neuron layers after iteration through a dynamic path planning method, and outputting the spatial feature vector;
24) calculating the weight of the spatial feature network model classification layer,
and inputting the spatial feature vectors into a spatial network classification layer, wherein the spatial feature network is provided with 7 neuron layers, and the weight of each neuron layer is obtained by calculating the loss function of each neuron layer, so that a classification model is obtained.
The vectorization processing of the expression space features comprises the following steps:
31) converting the i-th picture containing expression spatial features into a vector s_i of dimension w × h,
wherein w is the width of the image and h is the height of the image;
32) applying scale normalization to the input vector s_i, expressed as follows:
wherein V_i represents the feature vector of the normalized i-th picture, and sqrt denotes the square-root function.
The calculation of the weight of the spatial feature network model extraction layer comprises the following steps:
41) calculating the correlation coefficient from the i-th layer to the j-th layer:
wherein a_ij represents a constant from the i-th layer to the j-th layer, with initial value 0; a_ij changes as the weights are iteratively updated; a_ik represents a constant from the i-th layer to the k-th layer; softmax(·) denotes the softmax function; k denotes the k-th network layer; and d denotes an offset;
42) computing the prediction vector from the i-th layer to the j-th layer:
wherein û_ij denotes the prediction vector from the i-th layer to the j-th layer; w_ij represents the weight from the i-th layer to the j-th layer; and v_i represents the normalized input vector;
43) computing the activation vector s_j of the j-th layer:
wherein û_ij denotes the prediction vector from the i-th layer to the j-th layer, and cov_ij represents the correlation coefficient from the i-th layer to the j-th layer;
44) calculating the normalized vector of the j-th layer:
wherein s_j represents the activation vector;
45) repeating steps 41) to 44), continuously adjusting the values of a_ij, c_ij and w_ij until W_ij converges.
The method for calculating the weight of the spatial feature network model classification layer comprises the following steps:
51) calculating the error between the predicted value and the true value:
wherein E_loss represents the error between the predicted value and the true value; F_i represents the true value of the i-th sub-image; W_i represents the classification weight of the i-th sub-image; V_i represents the feature vector; and N represents the total number of training images;
52) iteratively adjusting the value of W_i until the E_loss value falls to 0.01; the W_i so obtained is the weight of the classification layer.
Advantageous effects
Compared with the prior art, the facial expression recognition method for the non-standard posture of the face predicts the facial expression when facial expression information is incomplete, based on predictive analysis, so as to realize expression recognition for faces at different angles.
The method eliminates the influence of illumination on facial expression recognition through preprocessing, simplifying the complex environment; it then identifies the 7 expressions by constructing a spatial feature model, so that not only the category of each expression but also its probability can be obtained, realizing true emotion computation. The method can directly identify facial expressions in non-standard postures at different angles, improving the robustness and accuracy of facial expression recognition.
Drawings
FIG. 1 is a sequence diagram of the method of the present invention;
FIG. 2 is a diagram of facial expression recognition results using a gabor feature + SVM classification method;
fig. 3 is a facial expression recognition result diagram of the method of the present invention.
Detailed Description
So that the above-recited features of the present invention can be readily understood, a more particular description of the invention, briefly summarized above, is given below with reference to embodiments, some of which are illustrated in the appended drawings:
as shown in fig. 1, the method for recognizing facial expressions under non-standard postures of human faces according to the present invention includes the following steps:
in the first step, the collection and pre-processing of training images. 7 expression samples of happiness, anger, hurry, surprise, aversion, fear and neutrality are collected, the 7 expression samples are facial expressions in non-standard postures, no less than 100 expression samples of each type are used as training images, and histogram equalization and normalization processing are carried out on all the training images.
In the second step, the classification model is constructed: the spatial information features of the 7 expressions are extracted, and the classification model is built on these features. Constructing the classification model amounts to training its weights; through this weight training, the spatial model used for predictive analysis is obtained. The specific steps are as follows:
(1) vectorization of 7 expression training sample images:
wherein F represents the spatial information features of the 7-expression training set, and F_ij represents the feature vector of the j-th image of the i-th expression, where i = 1, 2, …, 7 and j = 1, 2, …, n, there being n images in each training set.
(2) Vectorizing the expression space features, wherein the direction of the vector represents the space information of the facial expression, and the length of the vector represents the category probability of the expression. The method comprises the following steps:
A1, converting the i-th picture containing expression spatial features into a vector s_i of dimension w × h,
where w is the width of the image and h is the height of the image.
A2, applying scale normalization to the input vector s_i, expressed as follows:
wherein V_i represents the feature vector of the normalized i-th picture, and sqrt denotes the square-root function.
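Steps A1 and A2 can be sketched as below. The patent's normalization formula is given only as an image that is not reproduced in this text; from the mention of a square-root function, the sketch assumes the common unit-length form V_i = s_i / sqrt(Σ s_i²):

```python
import numpy as np

def vectorize(image):
    """A1: flatten a w x h expression image into a vector s_i of dimension w*h."""
    return image.astype(np.float64).ravel()

def scale_normalize(s):
    """A2 (assumed form): V_i = s_i / sqrt(sum(s_i^2))."""
    norm = np.sqrt(np.sum(s ** 2))
    return s / norm if norm > 0 else s
```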
(3) The weights of the spatial feature network model's extraction layer are calculated. The vectorized facial expression features are input into a first-layer convolution network, nonlinear normalization is performed, and the convolution operation is carried out with an 11 × 11 kernel; the convolved feature vectors are input into a first spatial feature network with N neuron layers, the weights of the N neuron layers are obtained after iteration by a dynamic path planning method, and the spatial feature vector is output.
Here, in order to distinguish the correlation between different expressions and to be able to calculate the probability of each expression, a softmax () function is used to calculate the correlation parameter. The method comprises the following specific steps:
B1, calculating the correlation coefficient from the i-th layer to the j-th layer:
wherein a_ij represents a constant from the i-th layer to the j-th layer, with initial value 0; a_ij changes as the weights are iteratively updated; a_ik represents a constant from the i-th layer to the k-th layer; softmax(·) denotes the softmax function; k denotes the k-th network layer; and d denotes an offset.
B2, calculating the prediction vector from the i-th layer to the j-th layer:
wherein û_ij denotes the prediction vector from the i-th layer to the j-th layer; w_ij represents the weight from the i-th layer to the j-th layer; and v_i represents the normalized input vector.
B3, calculating the activation vector s_j of the j-th layer:
wherein û_ij denotes the prediction vector from the i-th layer to the j-th layer, and cov_ij represents the correlation coefficient from the i-th layer to the j-th layer.
B4, calculating the normalized vector of the j-th layer:
wherein s_j represents the activation vector.
B5, repeating steps B1 to B4, continuously adjusting the values of a_ij, c_ij and w_ij until W_ij converges.
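Steps B1 to B5 describe an iterative weighting scheme resembling dynamic routing between capsule layers: coupling coefficients obtained from a softmax over logits a_ij, activations formed as coupling-weighted sums of prediction vectors, and a normalization so that vector length encodes a class probability. The sketch below is hypothetical — the patent's formula images are not reproduced here, so the squash normalization and agreement-based update are borrowed from the capsule-network literature:

```python
import numpy as np

def squash(s, eps=1e-9):
    """Squash s_j so that its length lies in (0, 1) and can encode a
    class probability (assumed normalization for step B4)."""
    n2 = np.sum(s ** 2, axis=-1, keepdims=True)
    return (n2 / (1.0 + n2)) * s / np.sqrt(n2 + eps)

def dynamic_routing(u_hat, n_iters=3):
    """u_hat: (num_in, num_out, dim) array of prediction vectors w_ij @ v_i.
    Iteratively refines the logits a_ij (initialized to 0, step B1) and
    returns the normalized output vectors of the num_out neuron layers."""
    num_in, num_out, _ = u_hat.shape
    a = np.zeros((num_in, num_out))
    for _ in range(n_iters):
        # B1: coupling coefficients via softmax over the output layers.
        c = np.exp(a) / np.exp(a).sum(axis=1, keepdims=True)
        # B3: activation s_j as the coupling-weighted sum of predictions.
        s = np.einsum('ij,ijd->jd', c, u_hat)
        # B4: normalized vector of layer j.
        v = squash(s)
        # B5: adjust a_ij by the agreement between predictions and outputs.
        a = a + np.einsum('ijd,jd->ij', u_hat, v)
    return v
```

Because the squash maps every activation to a length strictly below 1, the output vector lengths can be read directly as per-expression probabilities, which matches the patent's statement that vector length represents category probability.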
(4) And calculating the weight of the spatial feature network model classification layer.
The spatial feature vectors are input into the spatial network classification layer, in which the spatial feature network has 7 neuron layers; the weight of each neuron layer is obtained by calculating its loss function, thereby obtaining the classification model. The specific steps are as follows:
C1, calculating the error between the predicted value and the true value:
wherein E_loss represents the error between the predicted value and the true value; F_i represents the true value of the i-th sub-image; W_i represents the classification weight of the i-th sub-image; V_i represents the feature vector; and N represents the total number of training images;
C2, iteratively adjusting the value of W_i until the E_loss value falls to 0.01; the W_i so obtained is the weight of the classification layer.
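Steps C1 and C2 can be sketched as a least-squares fit: a loss of the form E_loss = (1/N) Σ (F_i − W·V_i)² is driven below 0.01 by gradient descent on W. The inner-product prediction form and the learning-rate value are assumptions, since the patent's loss formula is given only as an image and specifies only the 0.01 stopping threshold:

```python
import numpy as np

def train_classifier_weights(V, F, lr=0.1, tol=0.01, max_iter=10000):
    """C1/C2 sketch: gradient descent on E_loss = mean((F_i - V_i @ W)^2),
    stopping once E_loss <= tol. The prediction V_i @ W is an assumed
    form; the patent only states the loss threshold of 0.01."""
    n, d = V.shape
    W = np.zeros(d)
    loss = np.mean(F ** 2)
    for _ in range(max_iter):
        err = V @ W - F           # predicted minus true values
        loss = np.mean(err ** 2)  # E_loss
        if loss <= tol:
            break
        W -= lr * (2.0 / n) * (V.T @ err)  # gradient step on W
    return W, loss
```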
And thirdly, collecting and preprocessing the image to be detected.
The facial expression to be recognized, shot by the acquisition equipment with the face in a non-standard posture, is collected, and histogram equalization and normalization are performed on the image to be detected to generate a test sample.
And fourthly, recognizing the facial expression, namely inputting the preprocessed test sample into the spatial feature network model to automatically recognize the facial expression.
As shown in fig. 2, when the side-face expression is recognized using the gabor feature + SVM classification method, the result is a neutral expression, mainly because that method relies on side-face contour recognition, which leads to a high error rate.
As shown in fig. 3, the recognition result of the spatial feature model method of the present invention is happiness, mainly because, through the calculation (training) of the weights, the spatial feature model can obtain the category of each expression and analyze its probability, thereby realizing true emotion computation.
As shown in Table 1, which compares the data analysis results of the gabor feature + SVM classification method with those of the method of the invention, for the same number of test samples the recognition rate and accuracy of the invention are higher.
TABLE 1. Comparison of expression recognition results of the gabor feature + SVM classification method and the method of the present invention
By comparing the two methods, the spatial feature model can well and correctly identify the facial expressions of different angles, and the robustness is good.
The foregoing shows and describes the general principles, essential features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are merely illustrative of the principles of the invention, but that various changes and modifications may be made without departing from the spirit and scope of the invention, which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.
Claims (3)
1. A facial expression recognition method for a human face under a non-standard posture is characterized by comprising the following steps:
11) collecting and preprocessing training images: collecting samples of 7 expressions, namely happiness, anger, sadness, surprise, disgust, fear and neutrality, all captured as facial expressions in non-standard postures, with no fewer than 100 samples of each type used as training images, and performing histogram equalization and normalization on all training images;
12) constructing a classification model, namely extracting 7 expression spatial information characteristics and constructing the classification model based on the spatial information characteristics; the construction of the classification model comprises the following steps:
121) vectorization of 7 expression training sample images:
wherein F represents the spatial information features of the 7-expression training set, and F_ij represents the feature vector of the j-th image of the i-th expression, where i = 1, 2, …, 7 and j = 1, 2, …, n, there being n images in each training set;
122) vectorizing the expression space features, wherein the direction of a vector represents the space information of facial expressions, and the length of the vector represents the category probability of the expressions;
the vectorization processing of the expression space features comprises the following steps:
1221) converting the i-th picture containing expression spatial features into a vector s_i of dimension w × h,
wherein w is the width of the image and h is the height of the image;
1222) applying scale normalization to the input vector s_i, expressed as follows:
wherein V_i represents the feature vector of the normalized i-th picture, and sqrt denotes the square-root function;
123) calculating the weight of the spatial feature network model extraction layer,
inputting the vectorized facial expression features into a first layer of convolution network, performing nonlinear normalization, and performing convolution operation by adopting a kernel function of 11 × 11; inputting the feature vector after convolution operation into a first spatial feature network with N neuron layers, obtaining weights of the N neuron layers after iteration through a dynamic path planning method, and outputting the spatial feature vector;
124) calculating the weight of the spatial feature network model classification layer,
inputting the space feature vectors into a space network classification layer, wherein the space feature network is provided with 7 neuron layers, and the weight of each neuron layer is obtained by calculating a loss function of each neuron layer so as to obtain a classification model;
13) collecting and preprocessing the image to be detected: collecting the facial expression to be recognized, shot by the acquisition equipment with the face in a non-standard posture, and performing histogram equalization and normalization on the image to be detected to generate a test sample;
14) and identifying the facial expression, namely inputting the preprocessed test sample into a spatial feature network model to automatically identify the facial expression.
2. The method according to claim 1, wherein the calculating the weight of the spatial feature network model extraction layer comprises the following steps:
21) calculating the correlation coefficient from the i-th layer to the j-th layer:
wherein a_ij represents a constant from the i-th layer to the j-th layer, with initial value 0; a_ij changes as the weights are iteratively updated; a_ik represents a constant from the i-th layer to the k-th layer; softmax(·) denotes the softmax function; k denotes the k-th network layer; and d denotes an offset;
22) computing the prediction vector from the i-th layer to the j-th layer:
wherein û_ij denotes the prediction vector from the i-th layer to the j-th layer; w_ij represents the weight from the i-th layer to the j-th layer; and v_i represents the normalized input vector;
23) computing the activation vector s_j of the j-th layer:
wherein û_ij denotes the prediction vector from the i-th layer to the j-th layer, and cov_ij represents the correlation coefficient from the i-th layer to the j-th layer;
24) calculating the normalized vector of the j-th layer:
wherein s_j represents the activation vector;
25) repeating steps 21) to 24), continuously adjusting the values of a_ij, c_ij and w_ij until W_ij converges.
3. The method for recognizing facial expressions under non-standard postures as claimed in claim 1, wherein the step of calculating the weight of the spatial feature network model classification layer comprises the following steps:
31) calculating the error between the predicted value and the true value:
wherein E_loss represents the error between the predicted value and the true value; F_i represents the true value of the i-th sub-image; W_i represents the classification weight of the i-th sub-image; V_i represents the feature vector; and N represents the total number of training images;
32) iteratively adjusting the value of W_i until the E_loss value falls to 0.01; the W_i so obtained is the weight of the classification layer.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810865356.XA CN109034079B (en) | 2018-08-01 | 2018-08-01 | Facial expression recognition method for non-standard posture of human face |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109034079A CN109034079A (en) | 2018-12-18 |
CN109034079B true CN109034079B (en) | 2022-03-11 |
Family
ID=64648506
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810865356.XA Active CN109034079B (en) | 2018-08-01 | 2018-08-01 | Facial expression recognition method for non-standard posture of human face |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109034079B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111128369A (en) * | 2019-11-18 | 2020-05-08 | 创新工场(北京)企业管理股份有限公司 | Method and device for evaluating Parkinson's disease condition of patient |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105956570A (en) * | 2016-05-11 | 2016-09-21 | 电子科技大学 | Lip characteristic and deep learning based smiling face recognition method |
CN107742117A (en) * | 2017-11-15 | 2018-02-27 | 北京工业大学 | A kind of facial expression recognizing method based on end to end model |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101877981B1 (en) * | 2011-12-21 | 2018-07-12 | 한국전자통신연구원 | System for recognizing disguised face using gabor feature and svm classifier and method thereof |
CN104077579B (en) * | 2014-07-14 | 2017-07-04 | 上海工程技术大学 | Facial expression recognition method based on expert system |
CN105654049B (en) * | 2015-12-29 | 2019-08-16 | 中国科学院深圳先进技术研究院 | The method and device of facial expression recognition |
KR102221118B1 (en) * | 2016-02-16 | 2021-02-26 | 삼성전자주식회사 | Method for extracting feature of image to recognize object |
CN105825183B (en) * | 2016-03-14 | 2019-02-12 | 合肥工业大学 | Facial expression recognizing method based on partial occlusion image |
CN107871098B (en) * | 2016-09-23 | 2021-04-13 | 北京眼神科技有限公司 | Method and device for acquiring human face characteristic points |
CN106682616B (en) * | 2016-12-28 | 2020-04-21 | 南京邮电大学 | Method for recognizing neonatal pain expression based on two-channel feature deep learning |
CN107316059B (en) * | 2017-06-16 | 2020-07-28 | 陕西师范大学 | Learner gesture recognition method |
CN107437100A (en) * | 2017-08-08 | 2017-12-05 | 重庆邮电大学 | A kind of picture position Forecasting Methodology based on the association study of cross-module state |
CN107977650B (en) * | 2017-12-21 | 2019-08-23 | 北京华捷艾米科技有限公司 | Method for detecting human face and device |
- 2018-08-01: application CN201810865356.XA filed in CN; patent CN109034079B active
Also Published As
Publication number | Publication date |
---|---|
CN109034079A (en) | 2018-12-18 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||