CN113158825A - Facial expression recognition method based on feature extraction - Google Patents
- Publication number
- CN113158825A (application number CN202110341583.4A)
- Authority
- CN
- China
- Prior art keywords: sub, facial expression, pixel, neighborhoods, neighborhood
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06V40/174 — Facial expression recognition
- G06F18/2135 — Feature extraction based on approximation criteria, e.g. principal component analysis
- G06F18/2411 — Classification techniques based on the proximity to a decision surface, e.g. support vector machines
- G06V40/168 — Feature extraction; Face representation
Abstract
The invention discloses a facial expression recognition method based on feature extraction. The method first divides a facial expression image into sub-regions. For each pixel point, two AR-LGBP operators of different neighborhood sizes yield two eight-bit binary sequences; a bitwise logical XOR of the two sequences, taken bit by bit in one-to-one correspondence, produces a new binary sequence, and the decimal value of this sequence is taken as the pixel value of the point. Computing this value for every pixel point in a sub-region gives the sub-region's histogram, and the histograms of all sub-regions are connected to generate the facial expression feature vector. Finally, the dimension of the generated feature vector is reduced through a principal component analysis algorithm, and facial expression classification and recognition are carried out with an SVM classifier. The invention not only considers the pixel relations between neighborhoods, increasing its feature description capability, but is also extensible and can extract features at different scales.
Description
Technical Field
The invention relates to the technical field of digital image processing, and in particular to a feature extraction technique for the expression recognition process that is efficient, low in complexity, robust, and highly discriminative.
Background
At present, research into artificial intelligence has reached a high level, while comparatively little research has addressed human emotion and cognition. In real life, people expect computers to serve society as human beings do and to be more intelligent in man-machine interaction; yet computers are still far from possessing the perception capabilities of vision and hearing, and need emotion understanding and emotion recognition functions. The visual information conveyed by the human face is the most direct and important carrier of human emotional expression and interaction, and researchers can infer a person's true state of mind from changes in facial expression. Expression recognition is therefore very important for emotion research.
Due to the complexity of human emotion and facial expression, and to the ever higher accuracy now demanded of emotion recognition in some fields, the classical Local Binary Pattern (LBP) feature extraction algorithm has been substantially improved, both in texture-feature extraction and in running time.
In the classic LBP algorithm, within a 3 × 3 window, the central pixel value serves as a threshold for binarizing the 8 pixel points in the neighborhood. The gray value of each adjacent pixel g_i is compared with that of the central pixel g_c and encoded: the bit is 1 if g_i ≥ g_c, and 0 otherwise, giving an eight-bit binary number. The binary codes of the 8 sampling points in the neighborhood are then connected clockwise, starting from the upper-left corner, to form the LBP binary code sequence of the central pixel. Finally, each bit is assigned the coefficient 2^n, and the weighted sum gives the decimal LBP code of the central pixel. LBP offers gray-scale invariance, rotation invariance, strong anti-interference and texture-discrimination capability, and simple computation, and it suppresses illumination effects to a certain extent. However, the LBP computation compares only the central pixel with its adjacent pixels; it does not consider the gray-scale relations among the adjacent pixels themselves and cannot extract non-local information.
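As a concrete illustration, the classic LBP coding described above can be sketched as follows. The clockwise bit order starting at the upper-left corner follows the description; which bit receives the weight 2^0 is an assumption, since implementations order the bits differently.

```python
import numpy as np

def lbp_code(window):
    """Classic 3 x 3 LBP: threshold the 8 neighbors against the center
    pixel and weight the resulting bits by powers of two."""
    window = np.asarray(window)
    assert window.shape == (3, 3)
    g_c = window[1, 1]
    # 8 sampling points read clockwise starting from the upper-left corner
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    bits = [1 if window[r, c] >= g_c else 0 for r, c in order]
    # weighted sum: bit n is assigned the coefficient 2^n (order assumed)
    return sum(b << n for n, b in enumerate(bits))
```

For example, a window whose neighbors are all greater than or equal to the center yields the code 255.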
To address this problem, the industry proposed the Local Gradient Binary Pattern (LGBP). LGBP computes the relations among local pixels along the horizontal, vertical, and diagonal directions: for each direction, the two pixel values at the edges are compared and encoded, the bit being 1 if the comparison result is greater than or equal to 0 and 0 otherwise. The results are filled into the 8 sampling points of the neighborhood clockwise from the upper-left corner, and the binary codes of the 8 sampling points are connected to form the LGBP binary sequence; the comparison results accurately express the change of each expression region in the image. Finally, each bit is assigned the coefficient 2^n, and the weighted sum gives the decimal LGBP code. LGBP has stronger discrimination capability than LBP, but the operator is easily influenced by noise, its neighborhood size is fixed, and it cannot extract texture features well at large scales.
Disclosure of Invention
In view of this, the present invention aims to provide a facial expression recognition scheme based on an improved LGBP, namely the Asymmetric Local Gradient Binary Pattern (AR-LGBP), which not only considers the pixel relations between neighborhoods, increasing its feature description capability, but is also extensible and can extract features at different scales.
In order to achieve the above object, the technical scheme adopted by the invention is a facial expression recognition method based on feature extraction, which comprises the following steps:
step one, dividing the facial expression image into a plurality of sub-regions;
step two, adopting an asymmetric local gradient binary pattern to calculate the pixel value of each pixel point in each sub-region, obtaining a histogram of the pixel values of the sub-region;
step three, connecting the histograms of the sub-regions to generate the facial expression feature vectors;
step four, reducing the dimension of the feature vectors of step three through a Principal Component Analysis (PCA) algorithm, and then carrying out facial expression classification and recognition with a Support Vector Machine (SVM) classifier.
On the basis of this scheme, step two is further optimized by means of the logical operations used in digital circuits.
In a digital circuit, an XOR gate outputs 0 when the binary inputs a and b are the same, and 1 when they differ. The optimized scheme works as follows: for the same pixel point, the AR-LGBP operator is applied with two neighborhoods of different sizes to obtain two sequences PA and PB; a bitwise logical XOR of the two sequences yields a new 8-bit binary sequence P, whose decimal value is taken as the texture feature value of the pixel point. Each pixel point of each sub-region is computed in this way to obtain the sub-region's histogram, after which steps three and four are carried out to complete the final recognition. The new scheme is the exclusive-or asymmetric local gradient binary pattern (XOR Asymmetric Local Gradient Binary Pattern, XOR-AR-LGBP).
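The XOR step of the optimized scheme can be sketched as below. The MSB-first reading of the bit sequence is an assumption made to match the worked example in the detailed description.

```python
def xor_ar_lgbp(pa_bits, pb_bits):
    """Bitwise logical XOR of two 8-bit AR-LGBP sequences obtained with
    two neighborhood sizes; the decimal value of the resulting sequence
    is the texture feature value of the pixel point."""
    assert len(pa_bits) == len(pb_bits) == 8
    p = [a ^ b for a, b in zip(pa_bits, pb_bits)]  # XOR: 0 if equal, 1 if not
    # convert the new binary sequence (read MSB-first) to its decimal value
    return int("".join(str(b) for b in p), 2)
```

With PA = 01010001 and PB = 00010001 this returns 64.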
The above technical scheme brings the following technical effects: (1) Compared with the LGBP scheme, the sub-neighborhood size of the AR-LGBP scheme can be varied through m and n, so the operator is extensible and can extract image features at any scale, drawing on the advantages of features at any scale to achieve a better final recognition effect. Moreover, the number of bits in the binary sequence is constant and does not change with the neighborhood size of the operator, so the feature dimension of the operator is constant and the curse of dimensionality caused by increased algorithmic complexity is avoided.
(2) Inheriting the advantages of the AR-LGBP scheme, the XOR-AR-LGBP scheme combines the XOR logical operation of digital circuits and can reflect the intensity-change relations among sub-neighborhoods of different scales in the horizontal, vertical, and diagonal directions, so it extracts image texture features better and has stronger identification capability.
Drawings
FIG. 1 is a schematic diagram of the 3 × 3 neighborhood gray scale of AR-LGBP in the present invention;
FIG. 2 is a schematic diagram of the 3 × 3 neighborhood gray scale of the AR-LGBP processed in the present invention;
FIG. 3 is a schematic diagram of XOR-AR-LGBP encoding in the present invention.
Detailed Description
The invention is described in further detail below with reference to the figures and specific embodiments.
The design idea of the invention is to provide a method that can extract expression features at multiple scales and comprehensively considers the pixel relations between neighborhoods. FIG. 1 is a schematic diagram of the 3 × 3 neighborhood gray scale of AR-LGBP. The AR-LGBP operator neighborhood size is (2m + 1) × (2n + 1), where 1 ≤ m ≤ ⌊(w − 1)/2⌋ and 1 ≤ n ≤ ⌊(h − 1)/2⌋, w represents the image width, h represents the image height, m determines the width of a sub-neighborhood, n determines the height of a sub-neighborhood, and ⌊·⌋ denotes rounding down. The operator divides the neighborhood into 9 sub-neighborhoods, denoted R_i: four sub-neighborhoods R1, R3, R5, R7 of size m × n; two sub-neighborhoods R4, R8 of size m × 1; two sub-neighborhoods R2, R6 of size 1 × n; and one sub-neighborhood R9 of size 1 × 1, R9 being the central pixel point.
For ease of computation, AR-LGBP is processed as shown in FIG. 2: the pixel value r_i of each sub-neighborhood is set to the mean of all pixel points in that sub-neighborhood, r_i = (1/N_i) Σ_j p_ij, where N_i denotes the number of pixel points in the i-th sub-neighborhood and p_ij denotes the value of the j-th pixel point of the i-th sub-neighborhood. The binary sequence is written P_0, P_1, …, P_7, where each bit P_i is obtained by applying the binary threshold function s to the comparison of a pair of sub-neighborhood mean values (s(x) = 1 if x ≥ 0, otherwise 0). The AR-LGBP coding formula is AR-LGBP(x, y) = Σ_{i=0}^{7} P_i · 2^i, where P_i represents the value of the i-th bit in the binary sequence, and x and y represent the abscissa and ordinate of the pixel point.
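A minimal sketch of the sub-neighborhood averaging step follows. The 3 × 3 block layout assumed here matches the description above (corner blocks m × n, edge blocks m × 1 or 1 × n, center a single pixel); the bit weighting in `ar_lgbp_code` assumes P_0 carries the coefficient 2^0.

```python
import numpy as np

def sub_neighborhood_means(window, m, n):
    """Split a (2m+1) x (2n+1) window into the 9 sub-neighborhoods
    R1..R9 of AR-LGBP and return the mean pixel value r_i of each."""
    w = np.asarray(window, dtype=float)
    assert w.shape == (2 * m + 1, 2 * n + 1)
    rows = [slice(0, m), slice(m, m + 1), slice(m + 1, 2 * m + 1)]
    cols = [slice(0, n), slice(n, n + 1), slice(n + 1, 2 * n + 1)]
    # 3 x 3 grid of blocks, row-major; each entry is one sub-neighborhood mean
    return [w[r, c].mean() for r in rows for c in cols]

def ar_lgbp_code(bits):
    """AR-LGBP(x, y) = sum of P_i * 2^i over the 8-bit sequence."""
    return sum(b << i for i, b in enumerate(bits))
```

With m = n = 1 each block degenerates to a single pixel, so the means are the nine window values themselves.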
Further, the exclusive-or operation of digital circuits is combined with the AR-LGBP algorithm. In a digital circuit, an XOR gate outputs 0 when the binary inputs a and b are the same, and 1 when they differ. For the same pixel point, the AR-LGBP operator is applied with two neighborhoods of different sizes to obtain two sequences PA and PB; a bitwise logical XOR of the two sequences yields a new sequence P, whose decimal value is taken as the texture feature value of the pixel point, expressed as XOR-AR-LGBP(x, y) = Σ_{i=0}^{L−1} (PA_i ⊕ PB_i) · 2^i, where PA_i and PB_i are the i-th bits of the sequences PA and PB respectively, ⊕ indicates the logical exclusive-or operation, and L indicates the length of the sequence, here L = 8. FIG. 3 shows the XOR-AR-LGBP coding process, taking the two neighborhood sizes 5 × 5 and 3 × 3 as an example. For the same pixel there is intensity variation between neighborhoods: in the 3 × 3 neighborhood of the central pixel, the value of sub-neighborhood R8 is greater than that of R4, and the code of the AR-LGBP descriptor is PA = (01010001)_2 = 81. In the 5 × 5 neighborhood of the central pixel point R9, the value of sub-neighborhood R8 is less than that of R4, and the code of the AR-LGBP descriptor is PB = (00010001)_2 = 17. A bitwise logical XOR of PA and PB gives the XOR-AR-LGBP code P = (01000000)_2 = 64.
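The arithmetic of the FIG. 3 example checks out directly:

```python
pa = 0b01010001  # AR-LGBP code from the 3 x 3 neighborhood, decimal 81
pb = 0b00010001  # AR-LGBP code from the 5 x 5 neighborhood, decimal 17
p = pa ^ pb      # bitwise logical XOR
print(pa, pb, p)  # 81 17 64 -> XOR-AR-LGBP code (01000000)_2 = 64
```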
Finally, the dimension of the generated feature vector is reduced through the Principal Component Analysis (PCA) algorithm, and facial expression classification and recognition are carried out with a Support Vector Machine (SVM) classifier. Both the principal component analysis algorithm and the support vector machine classifier can be implemented with prior-art methods.
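The dimension-reduction step can be sketched with a minimal PCA based on the singular value decomposition; the reduced vectors would then be passed to any off-the-shelf SVM classifier. The feature shapes and component count here are illustrative assumptions, not values from the patent.

```python
import numpy as np

def pca_reduce(features, k):
    """Project feature vectors (one sample per row) onto their first k
    principal components, as in step four of the method."""
    centered = features - features.mean(axis=0)
    # right singular vectors of the centered data are the principal axes
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

# e.g. 10 hypothetical 256-bin concatenated histograms reduced to 5 dimensions
reduced = pca_reduce(np.random.default_rng(0).random((10, 256)), 5)
```

Because the data are centered before projection, each reduced column has (numerically) zero mean.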
Claims (5)
1. A facial expression recognition method based on feature extraction is characterized by comprising the following steps:
step one, dividing a facial expression image into a plurality of sub-regions;
step two, adopting an asymmetric local gradient binary pattern to calculate the pixel value of each pixel point in the sub-region to obtain a histogram of the sub-region;
step three, connecting the histograms of the sub-regions to generate facial expression feature vectors;
step four, reducing the dimension of the feature vectors of step three through a principal component analysis algorithm, and then carrying out facial expression classification and recognition with a support vector machine classifier.
2. The facial expression recognition method based on feature extraction as claimed in claim 1, wherein the process of calculating the pixel value is as follows: for a given pixel point, two AR-LGBP operators of different scales are applied to obtain two eight-bit binary sequences; the two binary sequences undergo a logical exclusive-or operation bit by bit in one-to-one correspondence to obtain a new eight-bit binary sequence P; and the decimal value converted from the sequence P is taken as the texture feature value, i.e. the pixel value, of the pixel point.
3. The facial expression recognition method based on feature extraction as claimed in claim 2, wherein the AR-LGBP operator neighborhood size is (2m + 1) × (2n + 1), where 1 ≤ m ≤ ⌊(w − 1)/2⌋ and 1 ≤ n ≤ ⌊(h − 1)/2⌋, w represents the image width, h represents the image height, m determines the width of a sub-neighborhood, n determines the height of a sub-neighborhood, and ⌊·⌋ denotes rounding down; the neighborhood is divided into 9 sub-neighborhoods, denoted R_i, and the pixel value of each sub-neighborhood is the mean r_i = (1/N_i) Σ_j p_ij of all pixel points in that sub-neighborhood; the binary sequence P is represented as P_0, P_1, …, P_7, where each bit P_i is obtained by applying the binary threshold function s to the comparison of a pair of sub-neighborhood mean values (s(x) = 1 if x ≥ 0, otherwise 0).
4. The facial expression recognition method based on feature extraction as claimed in claim 3, wherein the 9 sub-neighborhoods comprise four sub-neighborhoods R1, R3, R5, R7 of size m × n, two sub-neighborhoods R4, R8 of size m × 1, two sub-neighborhoods R2, R6 of size 1 × n, and one sub-neighborhood R9 of size 1 × 1, R9 being the central pixel point.
5. The facial expression recognition method based on feature extraction as claimed in claim 2, wherein the texture feature value is XOR-AR-LGBP(x, y) = Σ_{i=0}^{L−1} (PA_i ⊕ PB_i) · 2^i, where PA_i and PB_i are the i-th bits of the sequences PA and PB respectively, ⊕ expresses the logical exclusive-or operation, L expresses the length of the sequence, and x and y express the abscissa and ordinate of the central pixel point of the neighborhood.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110341583.4A CN113158825A (en) | 2021-03-30 | 2021-03-30 | Facial expression recognition method based on feature extraction |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113158825A true CN113158825A (en) | 2021-07-23 |
Family
ID=76885440
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110341583.4A Pending CN113158825A (en) | 2021-03-30 | 2021-03-30 | Facial expression recognition method based on feature extraction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113158825A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106127196A (en) * | 2016-09-14 | 2016-11-16 | 河北工业大学 | The classification of human face expression based on dynamic texture feature and recognition methods |
CN108960041A (en) * | 2018-05-17 | 2018-12-07 | 首都师范大学 | Image characteristic extracting method and device |
Non-Patent Citations (3)
Title |
---|
Yu Ming, "Facial expression recognition based on LGBP features and sparse representation", Computer Engineering and Design * |
Wu Xiaosheng et al., Local Binary Patterns and Extended Operators, Beijing: Beihang University Press, 31 October 2018 * |
Huang Liwen et al., "Facial expression recognition with asymmetric directional local binary patterns", Computer Engineering and Applications * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20210723 |