CN102831447B - Method for identifying multi-class facial expressions at high precision

Publication number: CN102831447B (granted from application CN102831447A)
Application number: CN201210314435.4A
Inventors: 罗森林, 谢尔曼, 潘丽敏
Assignee (original and current): Beijing Institute of Technology BIT
Original language: Chinese (zh)
Legal status: Expired - Fee Related
Abstract

The invention relates to a method for identifying multi-class facial expressions at high precision based on Haar-like features, belonging to the technical field of computer science and graphics/image processing. First, high-accuracy face detection is achieved using Haar-like features and a cascaded face detection classifier; next, feature selection is performed on the high-dimensional Haar-like features using the AdaBoost.MH algorithm; finally, an expression classifier is trained with the Random Forest algorithm to complete expression recognition. Compared with the prior art, the method reduces training and recognition time while increasing the multi-class expression recognition rate, and can be conveniently parallelized to further raise recognition speed and meet the requirements of real-time processing and mobile computing. The method recognizes both static images and dynamic video at high precision, and is applicable not only to desktop computers but also to mobile computing platforms such as cellphones and tablet computers.

Description

Method for high-precision recognition of multi-class facial expressions
Technical field
The present invention relates to a high-precision method for recognizing multi-class facial expressions based on Haar-like features, and belongs to the technical field of computer science and graphics/image processing.
Background art
Facial expression is an important channel of human communication, and facial expression recognition (FER), as a technology of human-computer interaction, is receiving growing attention. People usually divide the diverse range of expressions into several base classes and then use classification techniques to solve the recognition problem. For example, the Cohn-Kanade and JAFFE facial expression databases record 6 expressions: anger, disgust, fear, happiness, sadness, and surprise; the CAS-PEAL-R1 facial expression database records 5 expressions: smiling, frowning, surprise, open mouth, and closed eyes.
Facial expression recognition must solve 2 basic problems: 1. how to extract feature vectors with strong representativeness and high discrimination to characterize different facial expressions; 2. which high-accuracy, high-speed recognition method to use to distinguish different facial expressions. Surveying existing facial expression recognition technology, the commonly used methods are:
1. For feature extraction:
(1) Optical flow features: the video image sequence is binarized or converted to grayscale, and features are then extracted from the optical flow motion field of the sequence to obtain a feature sequence. Applied to expression recognition, the method has two problems: first, feature extraction is not fast enough; second, the recognition accuracy of the discriminative model is insufficient.
(2) Gabor features: a Gabor filter bank is divided into several channels, and a two-dimensional Gabor wavelet transform is applied to the normalized facial expression image to extract its texture features. The drawback of this method is that extraction is relatively slow, making real-time application difficult.
(3) Expression feature moment features: for each frame in the expression image sequence, the normalized key-point displacements of the face and the lengths of particular geometric features are extracted in turn and assembled into a feature column vector; all column vectors of the sequence are arranged in order to form a feature matrix, and each feature matrix represents one expression image sequence. Because the method involves identifying facial key points, both extraction speed and precision are deficient.
(4) Image local feature extraction based on two-dimensional partial least squares: the sample image is first divided by expression class into several equally sized sub-blocks, the texture features of each sub-block are extracted with the LBP operator to form a local texture feature matrix, and, with an adaptive weighting mechanism, two-dimensional partial least squares is used to extract statistical features from that matrix. The algorithm is relatively complex to design and extraction is slow, so it is unsuitable for real-time processing.
(5) Features based on AVR and enhanced LBP: the standard face image undergoes wavelet decomposition, LBP features are extracted, the augmented variance ratio (AVR) feature values and an additional penalty factor are then computed, and finally several groups of feature values of different dimensions, distinguished by their AVR values, are extracted. The method requires wavelet transformation, LBP feature extraction, penalty-factor computation, and other steps; extraction is slow and cannot meet the demands of real-time processing.
(6) Facial parameter features: the positions of the facial organs are first identified within the face region, and then the texture and contour parameters of each organ (eyes, nose, eyebrows, mouth corners, etc.) are extracted from the image information as the feature vector. The method involves recognizing facial organs, so both recognition precision and feature representativeness are deficient.
In addition, some earlier research also used histograms, gradient histograms, expression feature-point motion features based on piecewise affine transformations, and so on. For feature types of high dimensionality, dimensionality reduction is usually involved; common feature dimensionality-reduction methods include clustered linear discriminant analysis and principal component analysis (PCA).
2. For expression discrimination:
(1) Support vector machine (SVM): the support vector machine is built on the VC-dimension theory of statistical learning and the principle of structural risk minimization; given limited sample information, it seeks the optimal compromise between model complexity (the learning precision on the given training samples) and learning ability (the ability to classify arbitrary samples without error) so as to obtain the best generalization ability. During training, the kernel function and its parameters must be adjusted repeatedly for optimization, so the training process is often complex, which is an important shortcoming of this algorithm; in addition, SVM is a binary classification algorithm, so recognizing multiple classes requires further modification of the algorithm.
(2) Canonical correlation analysis: using the dimensionality-reduction idea of PCA, principal components are extracted from each of two groups of variables such that the correlation between the components extracted from the two groups is maximized while the components extracted within the same group remain mutually uncorrelated; the linear relationship between the two groups as a whole is then described by the pairwise correlations between the components extracted from each group. The method describes linear relationships relatively accurately, but its precision when measuring more complex relationships is unsatisfactory, which is a limitation of this algorithm in use.
(3) Histogram matching: the input consists of two groups of histogram statistics, usually treated as two one-dimensional vectors, and a one-dimensional distance metric (such as Euclidean distance, chi-square, histogram intersection, Bhattacharyya distance, or earth mover's distance) is then used to measure the similarity of the histogram statistics. However, the method imposes relatively strict requirements on the design of the histogram bins and on the representativeness of the statistics; if these two requirements are not met well, recognition performance suffers greatly.
(4) AdaBoost: an iterative algorithm whose core idea is to train different classifiers (weak classifiers) on the same training set and then combine these weak classifiers into a stronger final classifier (strong classifier); the algorithm itself works by re-weighting the data distribution. One limitation is training time: for high-dimensional data of large volume, the method often needs a great deal of time to train. Another is the choice of weak classifiers, which frequently requires extensive experimentation to find the optimum.
In summary, for the application scenario of high-precision, high-speed recognition of multiple expressions, existing feature extraction methods suffer from limited feature representativeness and insufficient precision and extraction speed; meanwhile, existing expression discrimination methods suffer from limitations such as unsatisfactory recognition precision, excessive complexity, a limited number of recognizable expression classes, and low recognition speed.
Summary of the invention
The object of the invention is to solve the problem of high-precision, high-speed recognition of multiple facial expressions by proposing a facial expression recognition method based on Haar-like features.
The design concept of the present invention is: first, Haar-like features and a cascaded face detection classifier are used to achieve high-accuracy face detection; then the AdaBoost.MH algorithm is used to perform feature selection on the high-dimensional Haar-like features; finally, the random forest algorithm is used to train the expression classifier and complete expression recognition.
The technical scheme of the present invention is realized as follows:
Step 1: to achieve automatic extraction of the face region image, multiple face region images (as positive samples) and multiple non-face region images (as negative samples) are first used for offline training to obtain a face detection classifier. The classifier can be obtained by any of several conventional training methods in the prior art; the present invention adopts the AdaBoost cascade classifier training method based on Haar-like features.
Step 2: on the basis of step 1, offline training of the facial expression classifier is carried out. The detailed process is as follows:
Step 2.1: first, expression labels are applied to the face image training data. The concrete method is: collect pictures or videos of each expression class to be recognized (for expression videos, key frames are extracted as training images), forming training image set A, in which the number of pictures is m; then use consecutive integers to number the class label of each picture or key frame, forming the expression class label set Y = {1, 2, ..., k}, where k is the number of expression classes to be recognized.
Step 2.2: face region data are extracted from each training image labeled in step 2.1, obtaining the cropped face images.
The concrete method of face region extraction is: first compute the integral image of each image. The integral image has the same size as the original image; the value at any point (x, y) on it is the sum of the pixel values at the corresponding point (x', y') of the original image and all pixels above and to its left:

ii(x, y) = \sum_{x' \le x, \, y' \le y} i(x', y'),

where ii(x, y) denotes the value of point (x, y) on the integral image, and i(x', y') denotes the pixel value of point (x', y') on the original image.
After computing the integral image, the face detection classifier obtained in step 1 is applied with a sliding-window method, extracting the Haar-like features of the image inside each window to locate the face region quickly; the face region is then cropped out, scaled to the size required for expression recognition, and kept with its original expression label, forming training image set B.
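As a concrete illustration of step 2.2, the following minimal Python sketch computes the integral image by cumulative sums and crops a detected face box to the training size. The function names, the box format, and the nearest-neighbour resize are illustrative assumptions; the cascade detector itself is taken as given.

```python
import numpy as np

def integral_image(img):
    """ii(x, y): cumulative sum so that ii[y, x] covers the pixel itself and
    everything above and to the left, matching the formula above."""
    return img.astype(np.int64).cumsum(axis=0).cumsum(axis=1)

def crop_face(img, box, out_size=32):
    """Cut out a detected face box (x, y, w, h) and scale it to out_size square."""
    x, y, w, h = box
    face = img[y:y + h, x:x + w]
    ys = np.arange(out_size) * h // out_size   # nearest-neighbour resize
    xs = np.arange(out_size) * w // out_size
    return face[np.ix_(ys, xs)]
```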
Step 2.3: to train the expression classifier, a second Haar-like feature extraction is performed on every image in training image set B formed in step 2.2. The concrete method of Haar-like feature extraction is: compute the integral image of each cropped image, and from each integral image compute the corresponding H-dimensional Haar-like feature values (where H is determined by the Haar-like feature types adopted and the image size).
The H-dimensional Haar-like feature vector of each image is recorded as one row, so that the H-dimensional Haar-like feature vectors of all m images form the feature matrix X with m rows and H columns;
Step 2.4: the AdaBoost.MH algorithm is used to perform feature selection on the Haar-like feature matrix X obtained in step 2.3. The algorithm runs F rounds of computation, iteratively screening single-feature weak classifiers so as to select F principal dimensions from the H-dimensional Haar-like feature value set, yielding the principal feature matrix X' with m rows and F columns;
The weak classifiers used in the iterative screening must satisfy the following conditions: 1. the input of a weak classifier is a one-dimensional feature value (a specific dimension of the feature vector); 2. for the class label to be recognized, the output of a weak classifier is 1 or -1.
The detailed process of feature selection with AdaBoost.MH is:
Step 2.4.1: initialize the weight of each image, denoted D_1(i, y_i) = 1/(mk), where y_i ∈ Y denotes the expression class label of the i-th image, i = 1...m;
Step 2.4.2: start round f of the iteration (f = 1...F): taking each column of the feature matrix X in turn as the input of a weak classifier, perform H computations to obtain r_{f,j}:

r_{f,j} = \sum_{i=1}^{m} D_f(i, y_i) \, K[y_i] \, h_j(x_{i,j}, y_i),

where j = 1...H; x_{i,j} denotes the j-th element of the i-th row of X (i.e. the j-th feature value of the feature vector of the i-th training image); h_j(x_{i,j}, y_i) denotes the weak classifier taking x_{i,j} as input; D_f(i, y_i) denotes the weight of the i-th training image in round f; and

K[y_i] = \begin{cases} +1, & y_i \in Y = \{1, 2, \ldots, k\} \\ -1, & y_i \notin Y = \{1, 2, \ldots, k\} \end{cases};
After the H computations end, take the maximum of the H values r_{f,j} obtained in this round, denoted r_f, and take the weak classifier h_j(x_j, Y) corresponding to r_f, which uses the j-th feature dimension x_j of X as input, as the weak classifier h_f(x_j, Y) selected in round f; at the same time, add x_j to the new feature space as a selected feature dimension;
Step 2.4.3: compute the weight α_f of the weak classifier h_f(x_j, Y) selected in step 2.4.2:

\alpha_f = \frac{1}{2} \ln \left( \frac{1 + r_f}{1 - r_f} \right);
Step 2.4.4: compute the weight D_{f+1} of each image in round f+1:

D_{f+1}(i, y_i) = \frac{D_f(i, y_i) \exp(-\alpha_f K[y_i] h_f(x_{i,j}, y_i))}{Z_f}, \quad i = 1 \ldots m,

where h_f(x_{i,j}, y_i) denotes the weak classifier selected in round f, taking the j-th feature value of the i-th image as input, and Z_f is the normalization factor

Z_f = \sum_{i=1}^{m} D_f(i, y_i) \exp(-\alpha_f K[y_i] h_f(x_{i,j}, y_i));
Step 2.4.5: substitute the new weights obtained in step 2.4.4 back into step 2.4.2 and iterate the procedure of steps 2.4.2 to 2.4.4 until F principal feature dimensions have been selected; then extract the corresponding F columns of the feature matrix X, forming the principal feature matrix X' with m rows and F columns;
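The screening loop of steps 2.4.1-2.4.5 can be summarized in code. The sketch below assumes that the per-column weak classifiers have already been trained as threshold stumps (cf. formula (3) in the embodiment) and follows the update rules exactly as stated above; the names are illustrative and the final renormalization stands in for Z_f, so this is a sketch, not a full AdaBoost.MH implementation.

```python
import numpy as np

def select_features(X, y, stumps, F, k):
    """X: (m, H) feature matrix; y: labels in {1..k}; stumps[j] maps one
    feature value to +/-1; F: number of dimensions to select."""
    m, H = X.shape
    D = np.full(m, 1.0 / (m * k))        # D_1(i, y_i) = 1/(mk)
    K = np.ones(m)                        # K[y_i] = +1, since every y_i is in Y
    selected = []
    for f in range(F):
        # r_{f,j} = sum_i D_f(i, y_i) * K[y_i] * h_j(x_{i,j}, y_i)
        r = np.array([(D * K * np.array([stumps[j](v) for v in X[:, j]])).sum()
                      for j in range(H)])
        j_best = int(np.argmax(r))        # r_f and its column
        selected.append(j_best)
        alpha = 0.5 * np.log((1 + r[j_best]) / (1 - r[j_best]))
        h_out = np.array([stumps[j_best](v) for v in X[:, j_best]])
        D = D * np.exp(-alpha * K * h_out)
        D /= D.sum()                      # plays the role of Z_f
    return X[:, selected], selected
```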
Step 3: use the principal feature matrix X' obtained through step 2 and the expression class label set Y labeled in step 2.1 to train the expression recognition classifier. Training follows the random forest algorithm; the concrete method is:
Step 3.1: according to the number of decision trees T and the node feature dimensionality u required by the design, generate T CART classification decision trees. The record format of a tree's root node is N(J), that of an intermediate node is N(V, J), and that of a leaf node is N(V, J, y_t), where J denotes the split feature dimension of node N, V denotes the feature value of node N, and y_t denotes the class label of node N.
The generation method of each CART classification decision tree is:
Step 3.1.1: perform m draws with replacement, each extracting one row of the principal feature matrix X', to form a new matrix X'' with m rows and F columns, used for growing this CART classification decision tree; the training sample labels corresponding to the rows of X'' form the new expression class label set Y''.
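A minimal sketch of the bootstrap draw of step 3.1.1, assuming X_prime holds the m x F principal feature matrix and y its label vector; the names are illustrative.

```python
import numpy as np

def bootstrap_sample(X_prime, y, rng=None):
    """m draws with replacement from the rows of X', yielding the X'' and
    Y'' that grow one CART tree."""
    rng = rng or np.random.default_rng()
    m = X_prime.shape[0]
    idx = rng.integers(0, m, size=m)
    return X_prime[idx], y[idx]
```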
Step 3.1.2: starting from the root node, carry out node splitting node by node until the whole tree is grown. The splitting process of each node is:
a) From the feature matrix X'', randomly select u columns as the training data needed for this node split, where x''_j denotes the j-th column of X'' (i.e. the j-th dimension of the F-dimensional Haar-like feature space);
b) Compute the information gain IG_j of each selected column x''_j, obtaining u values (a code sketch of this computation follows the list below):

IG_j = IG(Y'' \mid x''_j) = H(Y'') - H(Y'' \mid x''_j), \quad (1)

where H(Y'') is the information entropy of the expression class label set Y'':

H(Y'') = -\sum_{w=1}^{k} p(y'' = V_w) \cdot \log_2 p(y'' = V_w),
where V_w denotes the value of the w-th class label in Y'', i.e. V_w ∈ {1, 2, ..., k};
and H(Y'' \mid x''_j) is the conditional entropy of the expression class label set Y'':

H(Y'' \mid x''_j) = \sum_{s=1}^{h} p(x''_j = V_s) \cdot H(Y'' \mid x''_j = V_s) = -\sum_{s=1}^{h} p(x''_j = V_s) \cdot \sum_{w=1}^{q} p(y'' = V_{w|s}) \cdot \log_2 p(y'' = V_{w|s}),

where V_s denotes the s-th of the distinct values taken by the elements of x''_j; V_{w|s} denotes an expression class label corresponding to V_s; H(Y'' \mid x''_j = V_s) is the information entropy of the set of expression class labels corresponding to V_s; x''_j denotes the elements of the j-th column of X''; h ≤ m, q ≤ k;
c) Compare the u information gain values IG_j obtained in step b, extract the column of X'' that maximizes IG_j, denoted x''_J, and at the same time record the index J of this column in X' as the split feature dimension of this node, for use during recognition;
d) Count the number c of distinct feature values in x''_J, then create c child nodes of the current node, one for each distinct feature value taken as the node feature value V, and, taking each child node as the root of a subtree, generate the new subtree. The growing method of a subtree is: extract from X'' the row vectors whose element in x''_J equals that feature value, forming the new feature matrix X_v; extract the expression class labels corresponding to the extracted row vectors, forming the new expression class label set Y_v; then substitute (X_v, Y_v) for (X'', Y'') and recursively perform the operations of steps a-d until one of the following conditions is met, ending the growth of this subtree:
1. the node cannot be split further (the number of rows of X_v is less than 2, or all feature values within every column are equal); the label with the highest frequency in the corresponding expression class label set Y_v is saved as the class label y_t of this node;
2. the expression class labels in Y_v under this node are all identical; the unique expression class label is saved as the class label y_t of this node.
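As referenced in step b above, the split criterion can be sketched as follows: entropy and information gain are computed over the discrete feature values exactly as in formula (1), with illustrative helper names.

```python
import numpy as np
from collections import Counter

def entropy(labels):
    """H(Y'') over a list of class labels."""
    n = len(labels)
    return -sum((c / n) * np.log2(c / n) for c in Counter(labels).values())

def information_gain(col, labels):
    """IG_j = H(Y'') - H(Y'' | x''_j), with x''_j treated as discrete values."""
    n = len(labels)
    cond = 0.0
    for v in set(col):
        sub = [labels[i] for i, x in enumerate(col) if x == v]
        cond += (len(sub) / n) * entropy(sub)
    return entropy(labels) - cond

def best_split(X2, y2, candidate_cols):
    """Step c: among the u randomly chosen columns, pick the max-IG one."""
    gains = {j: information_gain(list(X2[:, j]), y2) for j in candidate_cols}
    return max(gains, key=gains.get)
```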
Step 3.2: save all T CART classification decision trees, forming the final random forest expression recognition classifier.
Step 4: use the random forest expression recognition classifier obtained by the offline training of step 3 to perform online recognition on the static image or dynamic video to be tested;
1) The recognition method for a static image is:
Step a: extract the face region in the static image to be recognized;
Step b: on the basis of step a, according to the Haar-like feature extraction method described in step 2.2 and the principal feature matrix X' obtained in step 2.4.5, extract the F-dimensional Haar-like features needed for recognition, forming the feature vector of the expression image to be recognized, denoted x; let x_J denote the J-th feature value of the vector x;
Step c: the T CART classification decision trees in the random forest expression recognition classifier obtained by the offline training of step 3 each recognize the feature vector x of the expression image; recognition by each CART classification decision tree starts from the root node, and the detailed process is:
c.1. Obtain the split feature dimension J of the current node from the classifier and read the J-th feature value x_J of the feature vector x;
c.2. Search the child nodes of the current node and select the child node whose node feature value is closest to x_J;
c.3. Recursively repeat operations c.1-c.2 until the current node is a leaf node; stop the recursion and output the class label y_t of this leaf node as the recognition result of this CART classification decision tree;
Step d: tally the T output results y_t of the T CART classification decision trees in the random forest expression recognition classifier and output the class label with the highest frequency as the final recognition result.
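A compact sketch of the recognition pass of steps c and d: each stored tree is walked from the root by reading its split dimension J and descending to the child whose node value is closest to x_J, and the forest takes a majority vote. The Node class is an illustrative stand-in for the stored N(V, J, y_t) records.

```python
from collections import Counter

class Node:
    def __init__(self, J=None, children=None, label=None):
        self.J = J                       # split feature dimension J
        self.children = children or {}   # node feature value V -> child Node
        self.label = label               # y_t, set only on leaf nodes

def tree_predict(node, x):
    while node.label is None:
        # c.2: descend to the child whose node value is closest to x[J]
        v = min(node.children, key=lambda V: abs(V - x[node.J]))
        node = node.children[v]
    return node.label

def forest_predict(trees, x):
    # step d: majority vote over the T tree outputs
    votes = Counter(tree_predict(t, x) for t in trees)
    return votes.most_common(1)[0][0]
```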
2) The recognition method for dynamic video is:
Step e: decode the video file and extract every frame, obtaining the image sequence to be recognized;
Step f: on the basis of step e, extract the face region data of each image in the sequence to be recognized, obtaining the face image sequence to be recognized;
Step g: according to the Haar-like feature extraction method described in step 2.2, extract from each face image of the sequence obtained in step f the F-dimensional Haar-like features selected in step 2.4.5;
Step h: on the basis of step g, use the random forest expression recognition classifier obtained by the offline training of step 3 to recognize each face image in the sequence, obtaining the expression class sequence; the recognition of each face image is identical to steps c and d;
Step i: smooth the expression class sequence obtained in step h, removing spurious single-frame judgments in the middle of the sequence, to obtain the final recognition result.
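The patent does not spell out the smoothing of step i; one plausible reading is a sliding-window mode filter that replaces isolated single-frame judgments with the locally dominant label, as sketched below (the window size is an assumption).

```python
from collections import Counter

def smooth_labels(seq, window=5):
    """Replace each frame's label by the most frequent label in a window
    centred on it; isolated single-frame decisions get voted away."""
    half = window // 2
    out = []
    for t in range(len(seq)):
        lo, hi = max(0, t - half), min(len(seq), t + half + 1)
        out.append(Counter(seq[lo:hi]).most_common(1)[0][0])
    return out
```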
Beneficial effects
Compared with methods based on skin color, edges, Gabor features, wavelet transforms, and similar features, the face detection method adopted by the present invention is fast and highly accurate.
Compared with methods based on Gabor features, wavelet transform features, optical flow features, and the like, the technique adopted by the present invention achieves higher accuracy at lower computational cost, and is applicable not only to desktop computers but also to mobile computing platforms such as mobile phones and tablet computers.
Compared with machine learning methods such as AdaBoost and SVM, with traditional canonical correlation analysis, and with similarity-measurement methods based on template matching, the present invention adopts the "feature selection + random forest" method to realize the final expression recognition, achieving faster recognition and higher recognition accuracy; it can also be parallelized easily to further improve recognition efficiency and meet the demands of real-time processing and mobile computing.
Brief description of the drawings
Fig. 1 is the schematic diagram of the facial expression recognition method of the present invention;
Fig. 2 is the schematic diagram of the face image extraction method in the embodiment;
Fig. 3 shows the 10 classes of Haar-like features used by the face image extraction method in the embodiment;
Fig. 4 shows the 5 classes of Haar-like features adopted for facial expression recognition in the embodiment;
Fig. 5 is the schematic diagram of the "feature selection + random forest" method in the embodiment;
Fig. 6 shows the performance comparison between the present invention and the traditional AdaBoost.MH algorithm on the CAS-PEAL-R1 expression database in the embodiment: panel (a) shows the recognition accuracy of traditional AdaBoost.MH for each expression class, and panel (b) that of the "feature selection + random forest" method proposed by the invention;
Fig. 7 compares the overall accuracy of the "feature selection + random forest" method of the present invention and of AdaBoost.MH on the JAFFE expression database in the embodiment.
Embodiment
To better illustrate the objects and advantages of the present invention, the method of the invention is described in further detail below with reference to the drawings and examples.
With static pictures and dynamic video as input respectively, 3 tests were designed and deployed: (1) a static picture test on the CAS-PEAL-R1 expression database, (2) a static picture test on the JAFFE expression database, (3) a dynamic video test.
CAS-PEAL-R1 is a face database built by the Institute of Computing Technology, Chinese Academy of Sciences; its facial expression subset comprises frontal photographs of 379 people, each person recording 5 expressions: frowning, smiling, closed eyes, open mouth, and surprise. JAFFE records 6 expression classes of 10 Japanese women: happy, sad, dejected, angry, surprised, and disgusted.
To plot the algorithm performance curves, the effect of different parameter combinations (i.e. values of u and T) on static picture recognition had to be tested, so in tests (1) and (2) the 400 random forest classifiers trained under different node feature dimensionalities u were each evaluated. In the first round of feature selection, the value of F was fixed at 10000; the value of u rose from 10 to 4000 in steps of 10; the value of T was taken as √u rounded up to an integer; and the overall accuracy under each pair of (u, T) values was recorded. For an N-dimensional confusion matrix C, the overall accuracy P is defined as:
P = \frac{\sum_{i=1}^{N} c_{ii}}{\sum_{i=1}^{N} \sum_{j=1}^{N} c_{ij}} \times 100\%. \quad (2)
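Formula (2) in code, for an N x N confusion matrix:

```python
import numpy as np

def overall_accuracy(C):
    """P = trace(C) / sum(C) * 100%, for an N x N confusion matrix C."""
    return C.trace() / C.sum() * 100.0
```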
To analyze algorithm performance more objectively, the overall accuracy of every point on the performance curve in test (1) was obtained by 10-fold cross-validation. For test (2), because the expression database contains fewer pictures in total, 10-fold cross-validation was unsuitable, so the stricter open-set test method was adopted.
For test (3), video captured by a camera was used as input, and the recognition result and recognition time of every frame were displayed on screen.
The 3 test procedures are described one by one below. All tests were completed on the same computer, concretely configured as: Intel dual-core CPU (1.8 GHz), 1 GB RAM, Windows XP SP3 operating system.
In all 3 tests, the same face detection classifier was used for the automatic extraction of the face region image. The concrete training flow of the face detection classifier, shown in Fig. 2, adopts the AdaBoost cascade classifier training method based on Haar-like features, trained with the 10 classes of Haar-like features shown in Fig. 3.
In addition, the same weak classifiers are used in all 3 tests. A weak classifier is defined as:

h_j(x, y) = \begin{cases} 1, & p_{j,y} x_j < p_{j,y} \theta_{j,y} \\ -1, & p_{j,y} x_j \ge p_{j,y} \theta_{j,y} \end{cases} \quad (3)

where x_j denotes the input of the weak classifier, θ_{j,y} denotes the threshold obtained after training, and p_{j,y} indicates the direction of the inequality sign.
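Formula (3) is a decision stump on a single feature dimension. The sketch below implements the classification rule together with a brute-force search for the threshold and polarity minimizing the weighted error; the training search is a standard construction assumed here, not quoted from the patent.

```python
def weak_classify(x_j, theta, p):
    """Formula (3): 1 if p * x_j < p * theta, else -1; p in {+1, -1}."""
    return 1 if p * x_j < p * theta else -1

def fit_stump(values, targets, weights):
    """Exhaustive search for the (theta, p) minimising the weighted error.
    values: one feature column; targets: +/-1; weights: sample weights."""
    best, best_err = (0.0, 1), float("inf")
    for theta in sorted(set(values)):
        for p in (1, -1):
            err = sum(w for v, t, w in zip(values, targets, weights)
                      if weak_classify(v, theta, p) != t)
            if err < best_err:
                best_err, best = err, (theta, p)
    return best
```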
1. Static picture test on the CAS-PEAL-R1 expression database
From the 5 kinds of expression images in the CAS-PEAL-R1 expression database, 150 images each were selected as experimental data. For the 10-fold cross-validation, for each combination of F and T values the 750 images were randomly divided into 10 groups, and 10 rounds of expression classifier training and recognition testing were carried out. In each round, 1 of the 10 groups served as test data for assessing classifier accuracy, and the remaining 9 groups served as training data for the offline training of the expression classifier. After the 10 rounds, with each image having been tested exactly once, the test results were aggregated into a confusion matrix, and the overall accuracy was computed according to formula (2).
The 10 rounds of testing follow the same procedure; the concrete flow of each round is:
Step 1: load the face detection classifier.
Step 2: carry out the offline training of the facial expression classifier.
Step 2.1: first apply expression labels to the face image training data. Because the expressions of the CAS-PEAL-R1 expression database are recorded in the filenames, the work of this step is to map the label keyword in each filename directly to the corresponding integer in {1, 2, 3, 4, 5} and dump it into the label store.
Step 2.2: to obtain the cropped face images, extract the face region data of each training image labeled in step 2.1. The concrete method is: first compute the integral image of the whole image; then, with the face detection classifier obtained in step 1, use the sliding-window method to extract the Haar-like features of the image within the window and locate the face region quickly (the window scaling factor is 1.2); finally crop out the image of the face region, scale it to the size required for expression recognition (32 × 32 px in this embodiment), and keep the original expression label, forming the training image set.
Step 2.3: to carry out expression recognition, perform the second Haar-like feature extraction on the face images cropped out in step 2.2.
Fig. 4 shows the 5 classes of Haar-like features used in this embodiment; these features have the following three characteristics:
1. Fast computation. With the integral image, extracting a Haar-like feature of any size requires only a fixed number of data reads and additions/subtractions: a Haar-like feature comprising 2 rectangles needs only 6 points read from the integral image for its additions/subtractions, a 3-rectangle feature needs only 8 points, and a 4-rectangle feature needs only 9 points (a sketch of this rectangle-sum trick follows this feature list).
2. Strong discrimination. The dimensionality of the Haar-like feature space is very high: for the 5 feature classes used in this embodiment on a 32 × 32 image, the total dimensionality of the 5 classes exceeds 510,000; the concrete counts are shown in Table 1.
Table 1. Counts of the 5 classes of Haar-like features for a 32 × 32 image
This dimensionality far exceeds the pixel count of the picture itself and is also far higher than the dimensionality of traditional features such as Gabor features or expression feature-point features, so it has greater discriminative potential.
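As noted under characteristic 1 above, the speed comes from the rectangle-sum identity on the integral image. A sketch follows, using a zero-padded integral image as an implementation convenience not taken from the patent:

```python
import numpy as np

def padded_integral(img):
    """Integral image with one zero row/column so rect_sum needs no checks."""
    ii = img.astype(np.int64).cumsum(axis=0).cumsum(axis=1)
    return np.pad(ii, ((1, 0), (1, 0)))

def rect_sum(ii, x, y, w, h):
    # 4 reads per rectangle; adjacent rectangles share corner points, so a
    # 2-rectangle feature needs 6 distinct reads, 3 rectangles 8, 4 rectangles 9
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

def haar_2rect(ii, x, y, w, h):
    """Horizontal two-rectangle feature: left half minus right half."""
    return rect_sum(ii, x, y, w, h) - rect_sum(ii, x + w, y, w, h)
```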
The concrete extraction method is: compute the integral image of each cropped face image and, from each integral image, compute all 510112 corresponding Haar-like feature values; record the 510112-dimensional Haar-like feature vector of each image as one row, so that the Haar-like feature vectors of all 675 images form the feature matrix X of 675 rows and 510112 columns.
In the following description, y_i ∈ Y denotes the expression class label of the i-th image; x_i denotes the i-th row of the feature matrix X (i.e. the 510112-dimensional Haar-like feature vector of the i-th training image); x_j denotes the j-th column of X (i.e. the j-th dimension of the 510112-dimensional Haar-like feature space); x_{i,j} denotes the j-th element of the i-th row of X (i.e. the j-th feature value of the feature vector of the i-th training image);
Step 2.4: the AdaBoost.MH algorithm is used to perform feature selection on the Haar-like feature matrix X obtained in step 2.3. The algorithm runs 10000 rounds of computation, iteratively screening single-feature weak classifiers so as to select 10000 principal dimensions from the 510112-dimensional Haar-like feature value set, yielding a principal feature matrix X' of 675 rows and 10000 columns;
The weak classifiers used in the above iterative computation must satisfy the following conditions: 1. the input of a weak classifier is a one-dimensional feature value (a specific dimension of the feature vector); 2. for the class label to be recognized, the output of a weak classifier is 1 or -1.
The detailed process of feature selection with AdaBoost.MH is:
Step 2.4.1: initialize the weight of each image, denoted D_1(i, y_i) = 1/(675 × 5).
Step 2.4.2: start the current round of iteration (below, f denotes the round number of the iteration): taking each column of the feature matrix X in turn as the input of a weak classifier, perform 510112 computations, calculating the value of r_{f,j} according to:

r_{f,j} = \sum_{i=1}^{m} D_f(i, y_i) \, K[y_i] \, h_j(x_{i,j}, y_i),

where j = 1...510112; x_{i,j} denotes the j-th element of the i-th row of X; h_j(x_{i,j}, y_i) denotes the weak classifier taking x_{i,j} as input; D_f(i, y_i) denotes the weight of the i-th training image in the current round (round f); and

K[y_i] = \begin{cases} +1, & y_i \in Y = \{1, 2, \ldots, k\} \\ -1, & y_i \notin Y = \{1, 2, \ldots, k\} \end{cases};
After the above 510112 computations end, compare the 510112 values r_{f,j} computed in this round and take their maximum, denoted r_f; then find the weak classifier h_j(x_j, Y) that attains the maximum value r_f, which uses the j-th feature dimension as input, and take it as the weak classifier selected in this round (denoted h_f(x_j, Y) below for convenience); at the same time add the j-th feature dimension x_j adopted by this weak classifier to the new feature space as a selected feature dimension;
Step 2.4.3: compute the weight α_f of the weak classifier h_f(x_j, Y) selected in step 2.4.2:

\alpha_f = \frac{1}{2} \ln \left( \frac{1 + r_f}{1 - r_f} \right);
Step 2.4.4: compute the weight D_{f+1} of each image in the next round of iteration:

D_{f+1}(i, y_i) = \frac{D_f(i, y_i) \exp(-\alpha_f K[y_i] h_f(x_{i,j}, y_i))}{Z_f}, \quad i = 1 \ldots 675,

where h_f(x_{i,j}, y_i) denotes the weak classifier selected in round f, taking the j-th feature value of the i-th image as input, and Z_f is the normalization factor

Z_f = \sum_{i=1}^{m} D_f(i, y_i) \exp(-\alpha_f K[y_i] h_f(x_{i,j}, y_i));
Step 2.4.5: substitute the new weights obtained in step 2.4.4 back into step 2.4.2 and iterate the procedure of steps 2.4.2 to 2.4.4 for 10000 rounds, thereby selecting 10000 principal feature dimensions and forming the new feature space; that is, extract from the feature matrix X the feature columns selected round by round, forming the principal feature matrix X' of 675 rows and 10000 columns;
Step 3: use the principal feature matrix X' obtained through step 2 and the expression class label set Y = {1, 2, 3, 4, 5} labeled in step 2.1 to train the expression recognition classifier. Training follows the random forest algorithm; the concrete method is:
Step 3.1: according to the number of decision trees T and the node feature dimensionality u required by the design, generate T CART classification decision trees. The record format of a tree's root node is N(J), that of an intermediate node is N(V, J), and that of a leaf node is N(V, J, y_t), where J denotes the split feature dimension of node N, V denotes the feature value of node N, and y_t denotes the class label of node N.
The generation method of each CART classification decision tree is:
Step 3.1.1: perform 675 draws with replacement, each extracting one row of the principal feature matrix X', to form a new matrix X'' of 675 rows and 10000 columns, dedicated to growing this CART classification decision tree; the training sample labels corresponding to the rows of X'' form the new expression class label set Y'';
Step 3.1.2: starting from the root node, carry out node splitting node by node until the whole tree is grown. The splitting process of each node is:
a) From the feature matrix X'', randomly select u columns as the training data needed for this node split, where x''_j denotes the j-th column of X'' (i.e. the j-th dimension of the F-dimensional Haar-like feature space);
b) Compute the information gain IG_j of each selected column x''_j, obtaining u values:

IG_j = IG(Y'' \mid x''_j) = H(Y'') - H(Y'' \mid x''_j), \quad (1)

where H(Y'') is the information entropy of the expression class label set Y'':

H(Y'') = -\sum_{w=1}^{k} p(y'' = V_w) \cdot \log_2 p(y'' = V_w),
where V_w denotes the value of the w-th class label in Y'', i.e. V_w ∈ {1, 2, ..., k};
and H(Y'' \mid x''_j) is the conditional entropy of the expression class label set Y'':

H(Y'' \mid x''_j) = \sum_{s=1}^{h} p(x''_j = V_s) \cdot H(Y'' \mid x''_j = V_s) = -\sum_{s=1}^{h} p(x''_j = V_s) \cdot \sum_{w=1}^{q} p(y'' = V_{w|s}) \cdot \log_2 p(y'' = V_{w|s}),
where V_s denotes the s-th of the distinct values taken by the elements of x''_j; V_{w|s} denotes an expression class label corresponding to V_s; H(Y'' \mid x''_j = V_s) is the information entropy of the set of expression class labels corresponding to V_s; x''_j denotes the elements of the j-th column of X''; h ≤ m, q ≤ k;
c) Compare the u information gain values IG_j obtained in step b, extract the column of X'' that maximizes IG_j, denoted x''_J, and at the same time record the index J of this column in X' as the split feature dimension of this node, for use during recognition;
d) Count the number c of distinct feature values in x''_J, then create c child nodes of the current node, one for each distinct feature value taken as the node feature value V, and, taking each child node as the root of a subtree, generate the new subtree. The growing method of a subtree is: extract from X'' the row vectors whose element in x''_J equals that feature value, forming the new feature matrix X_v; extract the expression class labels corresponding to the extracted row vectors, forming the new expression class label set Y_v; then substitute (X_v, Y_v) for (X'', Y'') and recursively perform the operations of steps a-d until one of the following conditions is met, ending the growth of this subtree:
1. the node cannot be split further (the number of rows of X_v is less than 2, or all feature values within every column are equal); the label with the highest frequency in the corresponding expression class label set Y_v is saved as the class label y_t of this node;
2. the expression class labels in Y_v under this node are all identical; the unique expression class label is saved as the class label y_t of this node.
Step 3.2: save all T CART classification decision trees together, forming the final random forest expression recognition classifier.
Step 4: use the expression recognition classifier trained in step 3 to perform expression recognition on each test picture, recording the recognition result and the recognition time. The concrete recognition method for each test picture is:
Step a: extract the face region in the static image to be recognized;
Step b: on the basis of step a, according to the Haar-like feature extraction method described in step 2.2 and the feature selection result of step 2.4.5, extract the 10000-dimensional Haar-like features needed for recognition, forming the feature vector of the expression image to be recognized, denoted x; let x_J denote the J-th feature value of the vector x;
Step c: using the feature vector x extracted in step b, the T CART classification decision trees in the random forest expression recognition classifier obtained by the offline training of step 3 each recognize the feature vector x of the expression image; recognition by each CART classification decision tree starts from the root node, and the detailed process is:
c.1. Obtain the split feature dimension J of the current node from the classifier and read the J-th feature value x_J of the feature vector x;
c.2. Search the child nodes of the current node and select the child node child_v whose node value v is closest to x_J;
c.3. Recursively repeat operations c.1-c.2 until the current node is a leaf node; stop the recursion and output the class label y_t of this leaf node as the recognition result of this CART classification decision tree;
Step d: tally the T output results y_t of the T CART classification decision trees in the random forest expression recognition classifier and output the class label with the highest frequency as the final recognition result.
Step e: after the 10 rounds of testing end, aggregate and compare the recognition results of all 750 pictures, obtain the confusion matrix, and compute the overall accuracy according to formula (2).
2. Static picture test on the JAFFE expression database
JAFFE records 6 expression classes of 10 Japanese women: happy, sad, dejected, angry, surprised, and disgusted. Because the expression database contains few pictures in total (216 images in all), 10-fold cross-validation was unsuitable, so the stricter open-set test method was adopted: 10 images of each expression class were drawn as the test set, and the remaining 26 of each class served as the training set.
The concrete flow is similar to test 1, differing in that: (1) the number of expression classes k is 6; (2) the number of training images m is 156.
3. Dynamic video test
To test the recognition performance of the present invention on dynamic video, video captured by a camera was used as input, and the recognition result and recognition time of every frame were displayed on screen. The expression classifier parameters with the best recognition performance in test 1 were chosen, namely T = 60, u = 3550 (the classifier has 60 CART classification decision trees; each node of each tree is grown by randomly drawing 3550 of the 10000 selected feature dimensions). The concrete steps are:
Step 1: acquire the video data from a USB camera and extract every frame, obtaining the image sequence to be recognized.
Step 2: on the basis of step 1, extract the face region data of each image in the sequence to be recognized, obtaining the face image sequence to be recognized.
Step 3: extract from each face image of the sequence obtained in step 2 the F-dimensional Haar-like features (in this example, F = 10000).
Step 4: on the basis of step 3, use the random forest expression classifier obtained by the offline training of step 3 in test 1 (T = 60, u = 3550) to recognize each face image in the sequence, obtaining the expression class sequence, and output the recognition result and recognition time of every frame on the display. The recognition of each face image is identical to step 4 of test 1.
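Putting the video test together, a hypothetical end-to-end loop might look as follows. It composes the helpers sketched earlier (crop_face, forest_predict, smooth_labels), while detect_face and extract_selected_features stand in for the cascade detector and the 10000-dimensional feature extraction and are assumptions, as is the use of OpenCV for capture.

```python
import cv2

def recognize_video(source, trees, window=5):
    cap = cv2.VideoCapture(source)           # video file path or camera index
    labels = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        box = detect_face(gray)              # assumed: cascade face detector
        if box is None:
            labels.append(labels[-1] if labels else 0)
            continue
        face = crop_face(gray, box, out_size=32)
        x = extract_selected_features(face)  # assumed: the 10000 selected dims
        labels.append(forest_predict(trees, x))
    cap.release()
    return smooth_labels(labels, window)     # step i: remove spurious frames
```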
Test results
For test 1, to contrast the method of the present invention with the AdaBoost.MH method in expression recognition accuracy, the traditional AdaBoost.MH method was applied in an experiment similar to test 1 and compared with the "feature selection + random forest" method of the present invention. The recognition accuracy of the two methods for each expression class is shown in Fig. 6. The figure clearly shows that AdaBoost.MH has the highest recognition rate for closed eyes and rather low rates (never above 75%) for the two classes open mouth and surprise. By contrast, for u > 900 the recognition accuracy of the "feature selection + random forest" method exceeds 90% for all 5 expression classes.
As for recognition speed, Table 3 records the average time of each stage when recognizing the 750 pictures of test 1. The "feature selection + random forest" recognition takes 5.2 ms; adding the time overhead of face detection, the recognition speed reaches 27.62 frames/second.
Table 3. Recognition times of the "feature selection + random forest" method
For test 2, the traditional AdaBoost.MH method was likewise applied in an experiment similar to test 2 and compared with the "feature selection + random forest" method of the present invention. Fig. 7 shows the overall accuracy of the two methods; the accuracy of the method of the present invention is significantly higher than that of the AdaBoost.MH method.
In test 3, the method achieves very high recognition accuracy; meanwhile, the expression recognition time per frame is around 5 ms.
The experimental results of the above 3 tests show that the present invention is both highly accurate and fast. The 10-fold cross-validation results on the CAS-PEAL-R1 expression database show an overall recognition accuracy of 94.7%; in the open-set test on the JAFFE expression database, a recognition accuracy of 91.2% was also obtained; as for recognition speed, the average recognition time per face is 5.2 ms, which meets the demand of real-time recognition.

Claims (5)

1. A method for high-precision recognition of multi-class facial expressions, characterized in that it comprises the following steps:
Step 1: use multiple face region images as positive samples and multiple non-face region images as negative samples for offline training, obtaining a face detection classifier;
Step 2: on the basis of step 1, carry out the offline training of the facial expression classifier; the detailed process is as follows:
Step 2.1: apply expression labels to the face image training data; the concrete method is: collect pictures or video key frames of each expression class to be recognized, forming training image set A, in which the number of pictures is m; use consecutive integers to number the class label of each picture or key frame, forming the expression class label set Y = {1, 2, ..., k}, where k is the number of expression classes to be recognized;
Step 2.2: extract the face region data of each training image labeled in step 2.1, obtaining the cropped face images and forming training image set B;
Step 2.3: to train the expression classifier, perform a second Haar-like feature extraction on every image in training image set B formed in step 2.2; the concrete method of Haar-like feature extraction is: compute the integral image of each cropped image and, from each integral image, compute the corresponding H-dimensional Haar-like feature values;
record the H-dimensional Haar-like feature vector of each image as one row, so that the H-dimensional Haar-like feature vectors of all m images form the feature matrix X with m rows and H columns;
Step 2.4: use the AdaBoost.MH algorithm to perform feature selection on the Haar-like feature matrix X obtained in step 2.3; the detailed process is:
Step 2.4.1: initialize the weight of each image, denoted D_1(i, y_i) = 1/(mk), where y_i ∈ Y denotes the expression class label of the i-th image, i = 1...m;
Step 2.4.2: start round f of the iteration, f = 1...F: taking each column of the feature matrix X in turn as the input of a weak classifier, perform H computations to obtain r_{f,j}:

r_{f,j} = \sum_{i=1}^{m} D_f(i, y_i) \, K[y_i] \, h_j(x_{i,j}, y_i),

where j = 1...H; x_{i,j} denotes the j-th element of the i-th row of X; h_j(x_{i,j}, y_i) denotes the weak classifier taking x_{i,j} as input; D_f(i, y_i) denotes the weight of the i-th training image in round f; and

K[y_i] = \begin{cases} +1, & y_i \in Y = \{1, 2, \ldots, k\} \\ -1, & y_i \notin Y = \{1, 2, \ldots, k\} \end{cases};
After the H computations end, take the maximum of the H values r_{f,j} obtained in this round, denoted r_f, and take the weak classifier h_j(x_j, Y) corresponding to r_f, which uses the j-th feature dimension x_j of X as input, as the weak classifier h_f(x_j, Y) selected in round f; at the same time, add x_j to the new feature space as a selected feature dimension;
Step 2.4.3: compute the weight α_f of the weak classifier h_f(x_j, Y) selected in step 2.4.2:

\alpha_f = \frac{1}{2} \ln \left( \frac{1 + r_f}{1 - r_f} \right);
Step 2.4.4: compute the weight D_{f+1} of each image in round f+1:

D_{f+1}(i, y_i) = \frac{D_f(i, y_i) \exp(-\alpha_f K[y_i] h_f(x_{i,j}, y_i))}{Z_f}, \quad i = 1 \ldots m,

where h_f(x_{i,j}, y_i) denotes the weak classifier selected in round f, taking the j-th feature value of the i-th image as input, and Z_f is the normalization factor

Z_f = \sum_{i=1}^{m} D_f(i, y_i) \exp(-\alpha_f K[y_i] h_f(x_{i,j}, y_i));
Step 2.4.5: substitute the new weights obtained in step 2.4.4 back into step 2.4.2 and iterate the procedure of steps 2.4.2 to 2.4.4 until F Haar-like feature dimensions have been selected; then extract the corresponding F columns of the feature matrix X, forming the principal feature matrix X' with m rows and F columns;
Step 3: use the principal feature matrix X' obtained through step 2.4.5 and the expression class label set Y labeled in step 2.1 to train the expression recognition classifier; training follows the random forest algorithm; the concrete method is:
Step 3.1: according to the number of decision trees T and the node feature dimensionality u required by the design, generate T CART classification decision trees; the record format of a tree's root node is N(J), that of an intermediate node is N(V, J), and that of a leaf node is N(V, J, y_t), where J denotes the split feature dimension of node N, V denotes the feature value of node N, and y_t denotes the class label of node N;
The generation method of each CART classification decision tree is:
Step 3.1.1: perform m draws with replacement, each extracting one row of the principal feature matrix X', to form a new matrix X'' with m rows and F columns, used for growing this CART classification decision tree; the training sample labels corresponding to the rows of X'' form the new expression class label set Y'';
Step 3.1.2: starting from the root node, carry out node splitting node by node until the whole tree is grown; the splitting process of each node is:
a) From the matrix X'', randomly select u columns as the training data needed for this node split, where x''_j denotes the j-th column of X'';
b) Compute the information gain IG_j of each selected column x''_j, obtaining u values:

IG_j = IG(Y'' \mid x''_j) = H(Y'') - H(Y'' \mid x''_j),

where H(Y'') is the information entropy of the expression class label set Y'':

H(Y'') = -\sum_{w=1}^{k} p(y'' = V_w) \cdot \log_2 p(y'' = V_w),
where V_w denotes the value of the w-th class label in Y'', V_w ∈ {1, 2, ..., k};
and H(Y'' \mid x''_j) is the conditional entropy of the expression class label set Y'':

H(Y'' \mid x''_j) = \sum_{s=1}^{h} p(x''_j = V_s) \cdot H(Y'' \mid x''_j = V_s) = -\sum_{s=1}^{h} p(x''_j = V_s) \cdot \sum_{w=1}^{q} p(y'' = V_{w|s}) \cdot \log_2 p(y'' = V_{w|s}),

where V_s denotes the s-th of the distinct values taken by the elements of x''_j; V_{w|s} denotes an expression class label corresponding to V_s; H(Y'' \mid x''_j = V_s) is the information entropy of the set of expression class labels corresponding to V_s; x''_j denotes the elements of the j-th column of X''; h ≤ m, q ≤ k;
C) u information gain value IG obtaining of comparison step b j, and by X " in make IG jthe maximum row of value extract, and are denoted as x " j, the columns J this be listed in X ' records simultaneously, and the disruptive features as this node is tieed up;
D) x is added up " jin the quantity c of all different eigenwerts, then set up c respectively with the child node of different eigenwert for node eigenwert V to current node, and using this child node as the root node of subtree, generate new subtree, the growing method of subtree is: by x " jin all values equal the element place X of this eigenwert " in row vector propose, form new eigenmatrix X v, then institute's espressiove class label corresponding for proposed row vector is proposed, form new expression class label set Y v; Then (X is used v, Y v) substitute (X ", Y "), recursively carries out the operation of step a-d, until meet following condition for the moment, terminates the growth of this subtree:
1. X vline number be less than 2, or in each row all eigenwerts all equal cause this node cannot continue division time, the expression class label set Y of correspondence vthe highest label of the middle frequency of occurrences is as the class label y of this node tpreserve;
2. the Y under this node vin expression class label all identical time, using the class label y of unique expression class label as this node tpreserve;
Step 3.2, save all T CART classification decision trees; together they form the final random forest expression recognition classifier;
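Likewise as a hedged sketch, the outer loop of steps 3.1.1-3.2 might look as follows; grow_tree is an assumed routine implementing the recursive split of step 3.1.2:

    import numpy as np

    def train_forest(X1, Y, T, u, seed=0):
        # X1 is the m x F principal feature matrix X'; Y holds the
        # corresponding expression class labels
        rng = np.random.default_rng(seed)
        forest = []
        for _ in range(T):
            # step 3.1.1: m draws with replacement give the bootstrap
            # sample X'' and its expression class label set Y''
            idx = rng.integers(0, X1.shape[0], size=X1.shape[0])
            X2, Y2 = X1[idx], Y[idx]
            # step 3.1.2: grow one CART tree on the bootstrap sample
            forest.append(grow_tree(X2, Y2, u, rng))
        return forest  # step 3.2: the saved trees are the classifier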
Step 4, use the random forest expression recognition classifier obtained by the offline training of step 3 to perform online recognition on the still image or dynamic video under test;
1) the recognition method for a still image is (a traversal-and-voting sketch follows step d below):
Step a, extract the facial region from the still image to be recognized;
Step b, on the basis of step a, extract the F-dimensional Haar-like features needed for recognition, according to the Haar-like feature extraction method and the principal feature matrix X' obtained in step 2.4.5, forming the feature vector of the expression image to be recognized, denoted x; x_J denotes the J-th feature value of the feature vector x;
Step c, let each of the T CART classification decision trees in the random forest expression recognition classifier obtained by the offline training of step 3 classify the feature vector x of the expression image to be recognized; classification by each CART classification decision tree starts from the root node and proceeds as follows:
c.1) obtain from the classifier the splitting feature dimension J of the current node, and read the J-th feature value x_J of the feature vector x of the expression image to be recognized;
c.2) search the child nodes of the current node and select the child node whose node feature value is closest to x_J;
c.3) carry out operations c.1)-c.2) recursively until the current node is a leaf node; then stop the recursion and output the class label y_t of this leaf node as the recognition result of this CART classification decision tree;
Step d, tally the T output results y_t of the T CART classification decision trees in the random forest expression recognition classifier, and output the class label with the highest frequency of occurrence as the final recognition result;
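A minimal sketch of steps c and d under assumed data structures; the Node layout below is our illustration, as the patent does not prescribe how the trees are stored:

    from collections import Counter
    from dataclasses import dataclass, field

    @dataclass
    class Node:
        split_dim: int = -1   # J, the splitting feature dimension
        value: float = 0.0    # V, the node feature value
        label: int = -1       # y_t, set on leaf nodes
        children: list = field(default_factory=list)

    def classify(forest, x):
        votes = []
        for root in forest:
            node = root
            while node.children:    # c.3): recurse until a leaf
                j = node.split_dim  # c.1): read feature value x_J
                # c.2): child whose node feature value is closest to x_J
                node = min(node.children, key=lambda c: abs(c.value - x[j]))
            votes.append(node.label)
        # step d: majority vote over the T tree outputs y_t
        return Counter(votes).most_common(1)[0][0]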
2) the recognition method for dynamic video is:
Step e, decode the video file and extract the data of every frame, obtaining the image sequence to be recognized;
Step f, on the basis of step e, extract the facial region data from every image in the image sequence to be recognized, obtaining the face image sequence to be recognized;
Step g, according to the Haar-like feature extraction method described in step 2.4, extract from every face image in the face image sequence obtained in step f the F-dimensional Haar-like features selected in step 2.4.5;
Step h, on the basis of step g, use the random forest expression recognition classifier obtained by the offline training of step 3 to recognize every face image in the face image sequence to be recognized, obtaining an expression class sequence; each face image is recognized exactly as in steps c and d;
Step i, smooth the expression class sequence obtained in step h to remove spurious judgements in the middle of the recognition sequence, obtaining the final recognition result.
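The patent does not specify the smoothing of step i; one plausible reading, sketched here purely as an assumption, is a sliding-window majority filter over the per-frame labels:

    from collections import Counter

    def smooth_labels(seq, width=5):
        # replace each frame's label by the majority label within a
        # window of the given width, suppressing isolated spurious
        # judgements in the expression class sequence
        half = width // 2
        out = []
        for t in range(len(seq)):
            window = seq[max(0, t - half): t + half + 1]
            out.append(Counter(window).most_common(1)[0][0])
        return out

For example, smooth_labels(['happy', 'happy', 'angry', 'happy', 'happy'], 3) suppresses the single stray 'angry' frame.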
2. The method for identifying multi-class facial expressions at high precision according to claim 1, characterized in that: in the feature selection method described in step 2.4, the input of the weak classifier used in the iterative computation is a one-dimensional feature value from the feature vector, and, for the expression class label y to be recognized, the output of the weak classifier is 1 or -1.
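Such a weak learner is commonly a decision stump; the threshold-and-polarity form below is our assumption, kept only to the one-dimensional input and the ±1 output that the claim requires:

    def weak_classifier(feature_value, threshold, polarity=1):
        # one-dimensional feature value in; +1 (label y) or -1 out
        return polarity if feature_value >= threshold else -polarity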
3. The method for identifying multi-class facial expressions at high precision according to claim 1, characterized in that: the facial region data described in step 2.2 are extracted as follows: first compute the integral image of each image; the integral image is the same size as the original image, and the value of any point (x, y) on it is the sum of the pixel values of the corresponding point (x', y') in the original image and of all points above and to its left:
$ii(x, y) = \sum_{x' \le x,\, y' \le y} i(x', y')$,
where ii(x, y) denotes the value of point (x, y) on the integral image, and i(x', y') denotes the pixel value of point (x', y') in the original image;
after the integral image has been computed, use the face detection classifier obtained in step 1 together with a sliding-window method to extract the Haar-like features of the image inside the sliding-window region and rapidly determine the facial region; then crop out the image of the facial region, scale it to the size required for expression recognition, and keep the original expression annotation.
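For concreteness, a hedged sketch of the integral image and of the constant-time rectangle sum it enables, which is the basis of fast Haar-like feature evaluation; box_sum is an illustrative helper name:

    import numpy as np

    def integral_image(img):
        # ii(x, y) = sum of i(x', y') over all x' <= x, y' <= y
        return img.astype(np.int64).cumsum(axis=0).cumsum(axis=1)

    def box_sum(ii, r0, c0, r1, c1):
        # pixel sum over the rectangle [r0..r1] x [c0..c1], obtained
        # from at most four integral-image lookups
        total = int(ii[r1, c1])
        if r0 > 0:
            total -= int(ii[r0 - 1, c1])
        if c0 > 0:
            total -= int(ii[r1, c0 - 1])
        if r0 > 0 and c0 > 0:
            total += int(ii[r0 - 1, c0 - 1])
        return total

A two-rectangle Haar-like feature is then simply the difference of two box_sum calls over adjacent rectangles.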
4. The method for identifying multi-class facial expressions at high precision according to claim 1, characterized in that: the value of H described in step 2.3 is determined by the Haar-like feature types adopted and by the image dimensions.
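To illustrate why H depends on both factors, the sketch below counts one Haar-like type (a horizontal two-rectangle feature) at every admissible size and position of a square window; the function name and the restriction to a single type are ours:

    def count_two_rect_features(win=24):
        # enumerate every width/height and every placement of a
        # horizontally paired two-rectangle feature in a win x win window
        n = 0
        for w in range(2, win + 1, 2):      # total width of the pair
            for h in range(1, win + 1):
                n += (win - w + 1) * (win - h + 1)
        return n

Summing such counts over all feature types yields the total dimensionality H, which grows rapidly with the window size.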
5. The method for identifying multi-class facial expressions at high precision according to claim 1, characterized in that: the face detection classifier described in step 1 is obtained by the AdaBoost cascade classifier training method based on Haar-like features.
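A cascade of this kind evaluates cheap stages first and rejects non-face windows early; a hedged sketch of cascade evaluation, with the stage format assumed rather than taken from the patent:

    def cascade_detect(window, stages):
        # stages: assumed list of (weak_classifiers, threshold) pairs;
        # a window is accepted as a face only if it passes every stage
        for classifiers, threshold in stages:
            score = sum(clf(window) for clf in classifiers)
            if score < threshold:
                return False   # rejected early by this stage
        return True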
CN201210314435.4A 2012-08-30 2012-08-30 Method for identifying multi-class facial expressions at high precision Expired - Fee Related CN102831447B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210314435.4A CN102831447B (en) 2012-08-30 2012-08-30 Method for identifying multi-class facial expressions at high precision

Publications (2)

Publication Number Publication Date
CN102831447A (en) 2012-12-19
CN102831447B (en) 2015-01-21 (granted)

Family

ID=47334573

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210314435.4A Expired - Fee Related CN102831447B (en) 2012-08-30 2012-08-30 Method for identifying multi-class facial expressions at high precision

Country Status (1)

Country Link
CN (1) CN102831447B (en)


Legal Events

Code: Title / Description
C06: Publication
PB01: Publication
C10: Entry into substantive examination
SE01: Entry into force of request for substantive examination
C14: Grant of patent or utility model
GR01: Patent grant
CF01: Termination of patent right due to non-payment of annual fee (granted publication date: 20150121; termination date: 20150830)
EXPY: Termination of patent right or utility model