CN105139004A - Face expression identification method based on video sequences - Google Patents

Face expression identification method based on video sequences

Info

Publication number: CN105139004A (application CN201510612526.XA); granted as CN105139004B (zh)
Authority: CN (China)
Prior art keywords: human face, face expression, expression sequence, image, sequence image
Legal status: Granted; Active
Inventors: 于明, 郭迎春, 师硕, 于洋, 刘依, 阎刚, 邓玉娟
Current and original assignee: Hebei University of Technology
Application CN201510612526.XA filed by Hebei University of Technology; publication of CN105139004A; application granted; publication of CN105139004B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174 Facial expression recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/192 Recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references
    • G06V30/194 References adjustable by an adaptive method, e.g. learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a facial expression recognition method based on video sequences and relates to methods for identifying figures. The method extracts the dynamic spatio-temporal texture features of facial expression sequences with the HCBP-TOP algorithm. It comprises the steps of preprocessing the facial expression sequences; layering and partitioning the facial expression sequence images with a spatial pyramid partitioning scheme; extracting the dynamic spatio-temporal texture features of the facial expression sequences with the HCBP-TOP algorithm; and training and predicting facial expressions with an SVM classifier. The invention overcomes the defects of the prior art that the center pixel is not taken into account, local detail information is neglected, the efficiency and precision of facial expression recognition are low, and general applicability is lacking.

Description

Facial expression recognition method based on video sequences
Technical field
The technical scheme of the present invention relates to methods for identifying figures, specifically to a facial expression recognition method based on video sequences.
Background art
Expression is the most effective mode of human emotional communication. In recent years facial expression recognition systems have found important applications in fields involving vision systems and pattern recognition, such as psychological research, video conferencing, affective computing, intelligent human-machine interaction and the medical industry. As human-computer interaction technology advances on all fronts, how to let computer systems perceive human expressions more capably has become a focus of current artificial intelligence research, and the research and development of facial expression recognition systems is of great significance.
Early facial expression recognition methods concentrated on expressions in still images, but the main problem with still images is that temporal information is ignored. A facial expression is a dynamic process, and its temporal information plays a very important role. Facial expression recognition methods based on video sequences can better reflect the changes of the expression itself and thereby improve the accuracy and robustness of recognition. Research on expression recognition based on video sequences therefore has important scientific value.
Existing facial expression recognition methods based on video sequences include the following. In 1999, Bartlett et al. of the University of California extracted facial image features with multi-scale, multi-orientation Gabor filters and classified them with support vector machines (SVM) to recognize different expressions; Gabor features, however, are computationally expensive, high-dimensional and susceptible to illumination interference. In 2003, Wang Yubo et al. of Tsinghua University extracted Haar-like features of facial images and classified expressions with an algorithm based on continuous Adaboost. Haar-like geometric features have the advantages of intuitiveness, low dimensionality and strong descriptive power, but the method is rather sensitive to edge and line features and its feature extraction precision is not high; moreover, when the background of the image or video is complex, the Adaboost classifier produces more misrecognitions. In 2009, Liao of the University of North Carolina combined dominant local binary patterns (Dominant LBP, DLBP) with the Gabor method to extract features, selecting the dominant features of the LBP algorithm to make computation faster, and obtained good texture-classification results by combining features extracted with DLBP and Gabor. But this method has two shortcomings: on the one hand, LBP does not consider the influence of the center pixel when expressing image texture characteristics; on the other hand, the method does not make full use of temporal information, so part of the information is lost and the recognition rate is unsatisfactory. For the LBP defect of ignoring the center pixel, the centralized binary pattern (hereinafter CBP), proposed in recent years on the basis of LBP, provides a line of solution, and on this basis multi-scale CBP (MCBP) and MCBP with the embedded image Euclidean distance (IMED), called MCBP-IMED, have been proposed. LBP is widely used in the field of expression recognition owing to its gray-scale invariance and rotation invariance, but it is difficult for it to obtain a large spatial support area, it is not robust to changes of illumination direction and viewing angle, and its texture-classification performance is not fully satisfactory.
Summary of the invention
The technical problem to be solved by the invention is to provide a facial expression recognition method based on video sequences: a method that extracts the dynamic spatio-temporal texture features of facial expression sequences with Haar-like Centralized Binary Patterns from Three Orthogonal Planes (hereinafter HCBP-TOP), overcoming the defects of the prior art that the center pixel is not considered, local detail information is ignored, the efficiency and precision of expression recognition are low, and general applicability is lacking.
The technical scheme adopted by the invention to solve this technical problem is a facial expression recognition method based on video sequences that uses the HCBP-TOP algorithm to extract the dynamic spatio-temporal texture features of facial expression sequences. The concrete steps are as follows:
First step, preprocessing of the facial expression sequence images:
(1) Cropping of the facial expression sequence images:
The facial expression sequence images read from an existing facial expression video sequence database are transformed from RGB space to gray space using formula (1):
Gray = 0.299R + 0.587G + 0.114B (1),
where Gray is the gray value, generally ranging from 0 to 255, R is the red component, G the green component and B the blue component.
According to the features of the human face and the "three sections, five eyes" geometric model of facial proportions, the facial expression sequence images transformed to gray space are cropped: let the horizontal distance between the two eyes be d and take the midpoint of the line connecting them as the reference point; the upper boundary is set 0.55d above the reference point, the lower boundary 1.45d below it, the left boundary 0.9d to the left and the right boundary 0.9d to the right, thus completing the cropping of the facial expression sequence images;
(2) Scaling of the facial expression sequence images:
The facial expression sequences cropped in (1) are rescaled with the bicubic interpolation algorithm to normalize the image size; after scaling, each facial image is 64 × 64 pixels;
(3) Gray-level balancing of the facial expression sequence images:
Histogram equalization is applied to the facial expression sequence images obtained in (2), which completes the preprocessing of the facial expression sequence images;
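For illustration, a minimal sketch of this preprocessing step follows, written in Python with OpenCV; the eye coordinates are assumed to come from an external eye detector, which the method does not specify, so left_eye and right_eye are hypothetical inputs.

```python
import cv2

def preprocess_frame(bgr, left_eye, right_eye):
    """Crop, scale and equalize one frame as in the first step (a sketch)."""
    # RGB -> gray; OpenCV's conversion uses the same weights as formula (1).
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    # Eye distance d and the midpoint of the eye line as reference point.
    d = abs(right_eye[0] - left_eye[0])
    cx = (left_eye[0] + right_eye[0]) / 2.0
    cy = (left_eye[1] + right_eye[1]) / 2.0
    # Face box: 0.55d above, 1.45d below, 0.9d left and right of the reference.
    top, bottom = int(cy - 0.55 * d), int(cy + 1.45 * d)
    left, right = int(cx - 0.9 * d), int(cx + 0.9 * d)
    face = gray[max(top, 0):bottom, max(left, 0):right]
    # Bicubic interpolation to the normalized 64 x 64 size.
    face = cv2.resize(face, (64, 64), interpolation=cv2.INTER_CUBIC)
    # Histogram equalization for gray-level balancing.
    return cv2.equalizeHist(face)
```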
Second step, layering and partitioning the facial expression sequence images with the spatial pyramid partitioning scheme:
Spatial pyramid partitioning progressively subdivides the image in space. The scheme adopted here divides the facial expression sequence image by powers of 2 along the horizontal and vertical coordinate directions. Let the total number of pyramid levels be L+1, with level indices i = 0, 1, 2, ..., L. At level i the facial expression sequence image is divided into 2^i blocks in the horizontal direction and 2^i blocks in the vertical direction, i.e. into 2^i × 2^i blocks in total. Here L = 2 is set, so the spatial pyramid partitioning scheme divides the facial expression sequence images preprocessed in the first step into 2+1 = 3 levels: level 0 is the original facial expression sequence image, and level i divides the original image into 2^i × 2^i sub-blocks, i = 1, 2;
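A small sketch of this partition in Python/NumPy; applied per frame, for L = 2 it yields the 1 + 4 + 16 = 21 sub-blocks described above (names are illustrative).

```python
import numpy as np

def pyramid_blocks(img, L=2):
    """Split a 2-D frame into 2^i x 2^i sub-blocks for each level i = 0..L."""
    h, w = img.shape
    levels = []
    for i in range(L + 1):
        n = 2 ** i  # blocks per axis at level i
        levels.append([img[r * h // n:(r + 1) * h // n,
                           c * w // n:(c + 1) * w // n]
                       for r in range(n) for c in range(n)])
    return levels  # levels[0] holds the whole frame, levels[2] its 16 sub-blocks

blocks = pyramid_blocks(np.zeros((64, 64), dtype=np.uint8))  # 64, 32 and 16 px blocks
```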
Third step, extracting the dynamic spatio-temporal texture features of the facial expression sequence images with the HCBP-TOP algorithm:
The HCBP-TOP algorithm extracts the dynamic spatio-temporal texture features of the layered blocks of the facial expression sequence images, the "layered blocks" being those obtained in the second step. After the spatial pyramid partitioning of the second step, the HCBP-TOP algorithm further takes, for each sub-block, the feature vectors of the three dimensions X axis, Y axis and time axis T and concatenates them into the overall feature vector of the block; the feature vectors of all blocks of the whole image are then integrated to form the feature data of one layer of the image; finally the dynamic spatio-temporal texture histograms of the facial expression sequence images of all pyramid levels are concatenated, according to assigned weights, into the dynamic spatio-temporal texture histogram of the whole facial expression sequence image. The concrete method is as follows:
(1) Extracting the HCBP features of the image sub-blocks:
Within each level, the HCBP-TOP algorithm extracts the HCBP features of the image sub-blocks at corresponding positions in each sequence, and the histogram of the dynamic spatio-temporal texture features of the facial expression sequence image is computed for each sub-block; the histograms of all sub-blocks are then concatenated into the histogram of each level of the whole facial expression sequence, which yields the deformation information of the expression sequence in the X-Y plane and its motion information in the X-T and Y-T planes. The extraction is computed by the following HCBP coding:
The eight HCBP encoding models M_1 to M_8 are given by formula (2):
$$M_1=\begin{bmatrix}1&1&1&0&0\\1&-1&-1&-1&0\\1&-1&0&0&0\\0&-1&0&0&0\\0&0&0&0&0\end{bmatrix},\quad M_2=\begin{bmatrix}1&1&1&1&1\\-1&-1&-1&-1&-1\\0&0&0&0&0\\0&0&0&0&0\\0&0&0&0&0\end{bmatrix},\quad M_3=\begin{bmatrix}0&0&1&1&1\\0&-1&-1&-1&1\\0&0&0&-1&1\\0&0&0&-1&0\\0&0&0&0&0\end{bmatrix},$$
$$M_4=\begin{bmatrix}0&0&0&-1&1\\0&0&0&-1&1\\0&0&0&-1&1\\0&0&0&-1&1\\0&0&0&-1&1\end{bmatrix},\quad M_5=\begin{bmatrix}0&0&0&0&0\\0&0&0&-1&0\\0&0&0&-1&1\\0&-1&-1&-1&1\\0&0&1&1&1\end{bmatrix},\quad M_6=\begin{bmatrix}0&0&0&0&0\\0&0&0&0&0\\0&0&0&0&0\\-1&-1&-1&-1&-1\\1&1&1&1&1\end{bmatrix},$$
$$M_7=\begin{bmatrix}0&0&0&0&0\\0&-1&0&0&0\\1&-1&0&0&0\\1&-1&-1&-1&0\\1&1&1&0&0\end{bmatrix},\quad M_8=\begin{bmatrix}1&-1&0&0&0\\1&-1&0&0&0\\1&-1&0&0&0\\1&-1&0&0&0\\1&-1&0&0&0\end{bmatrix}\qquad(2),$$
In the above formula, in each model the five outermost pixels are assigned weight 1, the five adjacent pixels of the second outer ring weight -1, and all other positions weight 0. The center point P_0 records the texture variation that stores the Haar-like type feature. In each of the X-Y, X-T and Y-T planes, a 5 × 5 window as in formula (3) is formed with P_0 at its center, surrounded by the 24 neighborhood points P_i (i = 1, 2, ..., 24):
$$W(x,y)=\begin{bmatrix}P_9&P_{10}&P_{11}&P_{12}&P_{13}\\P_{24}&P_1&P_2&P_3&P_{14}\\P_{23}&P_8&P_0&P_4&P_{15}\\P_{22}&P_7&P_6&P_5&P_{16}\\P_{21}&P_{20}&P_{19}&P_{18}&P_{17}\end{bmatrix}\qquad(3),$$
As can be seen from the above description, any pixel I(x, y, t) in an image sequence I has around it the small windows W_j(x, y), j = 0, 1, 2, where W_0(x, y) is the window of the pixel in the X-Y plane, W_1(x, y) its window in the X-T plane and W_2(x, y) its window in the Y-T plane. The HCBP value f_j(x, y, t) of W_j(x, y) is computed as:
$$f_j(x,y,t)=\mathrm{HCBP}(I(x,y,t))=\sum_{k=1}^{8}B(a_{j,k})\times 2^{8-k}\qquad(4),$$
where
$$B(a_{j,k})=\begin{cases}1,&a_{j,k}\ge T_{j,k}\\0,&a_{j,k}<T_{j,k}\end{cases}\qquad(5),$$
$$T_{j,k}=5\left(I(x,y,t)-\frac{C_k\cdot W_j(x,y)+I(x,y,t)}{11}\right)\qquad(6),$$
$$a_{j,k}=M_k\cdot W_j(x,y)\qquad(7),$$
where M_k is the encoding model; C_k is the summation model over the nonzero-weight pixels of each encoding model, C_k = |M_k|; B(·) is the threshold comparison function of the HCBP value; T_{j,k} is the threshold; and a_{j,k} is the decimal number obtained by convolving the small window W_j(x, y) with the encoding model M_k.
Each image sub-block is scanned pixel by pixel with the models M_1 to M_8, the model window being 5 × 5 pixels, and each model has its own threshold. The threshold is computed as follows: first sum the 10 nonzero-weight pixels of the model, C_k · W_j(x, y), together with the center pixel, 11 pixels in all; then take the mean of these 11 pixels; the threshold of the model is 5 times the difference between the center pixel and this mean. Formula (4) is then used to compute the dynamic spatio-temporal texture features of the facial expression sequence: the sum of the 5 pixel values with weight 1 minus the sum of the 5 pixel values with weight -1, i.e. the change information between the outer and inner sides, gives a decimal number a_{j,k}; a_{j,k} is compared with the threshold T_{j,k}, and the HCBP-TOP code bit is 1 if a_{j,k} is not smaller than the threshold and 0 otherwise;
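The following sketch restates formulas (4) to (7) for a single 5 × 5 window in NumPy. The model matrices are transcribed from formula (2); the function is one illustrative reading of the coding step, not a reference implementation.

```python
import numpy as np

# The eight 5 x 5 encoding models M_1..M_8 of formula (2): five outer-ring
# weights 1 and five inner-ring weights -1 along one edge or corner each.
M = np.zeros((8, 5, 5), dtype=np.int32)
M[0] = [[1,1,1,0,0],[1,-1,-1,-1,0],[1,-1,0,0,0],[0,-1,0,0,0],[0,0,0,0,0]]
M[1] = [[1,1,1,1,1],[-1,-1,-1,-1,-1],[0,0,0,0,0],[0,0,0,0,0],[0,0,0,0,0]]
M[2] = [[0,0,1,1,1],[0,-1,-1,-1,1],[0,0,0,-1,1],[0,0,0,-1,0],[0,0,0,0,0]]
M[3] = [[0,0,0,-1,1]] * 5
M[4] = [[0,0,0,0,0],[0,0,0,-1,0],[0,0,0,-1,1],[0,-1,-1,-1,1],[0,0,1,1,1]]
M[5] = [[0,0,0,0,0],[0,0,0,0,0],[0,0,0,0,0],[-1,-1,-1,-1,-1],[1,1,1,1,1]]
M[6] = [[0,0,0,0,0],[0,-1,0,0,0],[1,-1,0,0,0],[1,-1,-1,-1,0],[1,1,1,0,0]]
M[7] = [[1,-1,0,0,0]] * 5

def hcbp_code(window):
    """HCBP code of one 5 x 5 window W_j(x, y), per formulas (4)-(7)."""
    center = float(window[2, 2])                  # the center point P_0
    code = 0
    for k in range(8):
        a = float(np.sum(M[k] * window))          # a_{j,k} = M_k . W_j(x, y)
        c = float(np.sum(np.abs(M[k]) * window))  # C_k . W_j: the 10 nonzero-weight pixels
        t = 5.0 * (center - (c + center) / 11.0)  # threshold T_{j,k}
        code = (code << 1) | int(a >= t)          # bit k carries weight 2^(8-k)
    return code                                   # value in 0..255
```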
(2) Extracting the dynamic spatio-temporal texture HCBP-TOP features of the facial expression sequence images:
The HCBP-TOP algorithm extracts the dynamic spatio-temporal texture features of the facial expression sequence images from the hierarchical pyramid sub-blocks generated in the second step. From the HCBP features of the image sub-blocks extracted in (1), the dynamic spatio-temporal texture HCBP-TOP feature of the facial expression sequence image in each block is computed: for each sub-block, the HCBP feature vectors of the three dimensions X, Y, T in the XY, XT and YT directions are concatenated into the overall HCBP-TOP feature vector of the block, the feature vectors of all blocks of the whole image are integrated into the feature data of one layer of the image, and the feature histogram of that layer is obtained with a histogram function. Because each layer, and each image block within a layer, carries different spatial information and plays a different role in image classification, the dynamic spatio-temporal texture histograms of the facial expression sequence images of all pyramid levels are concatenated into the histogram of the whole facial expression sequence image according to assigned weights. The weight assignment principle is that the layer of large-scale sub-blocks receives a small histogram weight and the layer of small-scale sub-blocks a large one: the histogram weight of pyramid level i is defined as i + 1, so the original-image histogram of level 0 receives weight 1 and the level-1 histogram weight 2; the higher the level, the larger the weight of that level's histogram, i.e. the larger the proportion of that level's feature information in the overall representation. The histograms of all levels are then merged, according to the assigned weights, into the dynamic spatio-temporal texture histogram of the facial expression sequence image as a whole, where level i has 2^i × 2^i sub-blocks and the feature values of each sub-block range from 0 to 255. Normalization yields the final dynamic spatio-temporal texture representation of the facial expression sequence image, and the extracted feature data are fed into the SVM for classifier training;
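As a sketch of the weighted concatenation, assuming each sub-block of each level has already been coded into HCBP values (e.g. by applying a function like hcbp_code above pixel-wise in the three planes); the names and shapes here are illustrative.

```python
import numpy as np

def weighted_pyramid_feature(levels_codes):
    """Concatenate per-block 256-bin histograms, weighting level i by i + 1.

    levels_codes[i] is a list of integer arrays (values 0..255) holding the
    HCBP-TOP codes of the 2^i x 2^i sub-blocks of pyramid level i.
    """
    parts = []
    for i, blocks in enumerate(levels_codes):
        for codes in blocks:
            hist = np.bincount(np.ravel(codes), minlength=256).astype(np.float64)
            parts.append((i + 1) * hist)          # level-0 weight 1, level-1 weight 2, ...
    feat = np.concatenate(parts)
    return feat / (np.linalg.norm(feat) + 1e-12)  # final normalization
```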
Fourth step, training and predicting facial expressions with an SVM classifier:
The dynamic spatio-temporal texture feature data of the facial expression sequence images extracted in the third step are fed into the SVM to train the classifier and to predict facial expressions, that is, to judge to which expression class the extracted features actually belong. The leave-one-out method is adopted, and the average result of the experiments is taken as the expression recognition rate. The concrete operation flow is as follows:
(1) The dynamic spatio-temporal texture feature data of the facial expression sequence images extracted in the third step are fed into the SVM for classifier training. From these feature samples, the feature matrix of the training-sample sequences, the feature matrix of the test-sample sequences and the corresponding training and test class-label matrices are constructed, the values of the class-label matrices being the classification categories of the samples;
(2) A self-defined kernel function is adopted for the dynamic spatio-temporal texture features of the local facial expression sequence images, and cross-validation is used to select the optimal parameters c and g, giving c = 790 and g = 1.9. The feature matrices of the training samples and test samples are first fed into the svmtrain function to obtain the support vectors; the feature matrix of the test samples together with the support vectors is then fed into the svmpredict function for prediction, which completes the expression recognition. Experiments on the Cohn-Kanade and SFEW databases recognize the six expressions anger, disgust, fear, happiness, sadness and surprise, thus completing the recognition of facial expressions.
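A hedged sketch of this step, substituting scikit-learn's SVC for the LIBSVM svmtrain/svmpredict calls named above; mapping c and g onto the C and gamma of an RBF kernel is an assumption, since the self-defined kernel is not spelled out here, and the feature arrays are placeholders.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
train_x = rng.random((60, 256 * 21 * 3))  # placeholder HCBP-TOP vectors (21 blocks, 3 planes)
train_y = rng.integers(0, 6, 60)          # six expression classes
test_x = rng.random((10, 256 * 21 * 3))

clf = SVC(C=790, gamma=1.9, kernel="rbf")  # c = 790, g = 1.9 from cross-validation
clf.fit(train_x, train_y)                  # plays the role of svmtrain
pred = clf.predict(test_x)                 # plays the role of svmpredict
```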
In the above facial expression recognition method based on video sequences, the HCBP-TOP algorithm is an algorithm combining CBP features with Haar-like features; it extracts the dynamic spatio-temporal texture features of the partitioned facial expression sequences from sub-image sequences of the same frequency, the HCBP-TOP histogram being defined as:
$$H_{i,j}=\sum_{x,y,t}E\{f_j(x,y,t)=i\},\qquad i=0,\dots,255;\; j=0,1,2\qquad(8),$$
where f_j(x, y, t) denotes the HCBP value of the pixel I(x, y, t) in the j-th plane (j = 0: XY; 1: XT; 2: YT), and the function E{f} is defined as:
$$E\{f\}=\begin{cases}1,&\text{if }f=i\\0,&\text{else}\end{cases}\qquad(9).$$
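In code, formulas (8) and (9) amount to one 256-bin count per orthogonal plane; a sketch assuming the per-plane HCBP codes are available as integer arrays:

```python
import numpy as np

def hcbp_top_histograms(codes_xy, codes_xt, codes_yt):
    """H_{i,j} of formula (8): one 256-bin histogram per plane j = 0, 1, 2."""
    return np.stack([np.bincount(np.ravel(c), minlength=256)
                     for c in (codes_xy, codes_xt, codes_yt)])
```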
In the above facial expression recognition method based on video sequences, the spatial pyramid partitioning and the SVM classifier are well known.
The beneficial effects of the invention are as follows. Compared with the prior art, its outstanding substantive features and marked improvements are:
(1) In the layering step for the facial expression sequence images, the method overcomes the defect that extracting features from the whole image ignores local detail information and lowers the recognition rate: the spatial pyramid partitioning scheme divides the image sequence into L+1 levels, level i being divided into 2^i × 2^i sub-blocks, i ∈ {0, 1, ..., L}; features are extracted per sub-block at every level, and the statistical feature histograms are given different weights according to the level, the finer the partition the larger the weight, so that detail information is reflected through the weights. This raises the expression recognition rate and is generally applicable to various image sequences;
(2) In the step of extracting the dynamic spatio-temporal texture features with the HCBP-TOP algorithm, relative to the existing HLBP, whose neglect of the center pixel leads to a low recognition speed, the method takes the mean of the ten nonzero-weight elements and the central element of each pattern and uses five times the difference between this 11-element mean and the center pixel as the threshold of each pattern, improving the expression recognition rate;
(3) In the classifier training and prediction step, an SVM classifier is used for the training and prediction of facial expressions: the Haar-like-based centralized binary pattern, HCBP, extracts the dynamic spatio-temporal texture features of the facial expression sequence images; the spatial pyramid partitioning scheme assigns the weights of the feature histograms; the statistical histograms are merged into the dynamic spatio-temporal texture features of the sequence as a whole; and the extracted feature data are fed into the SVM to train the classifier and predict expressions. These Haar-like-based centralized binary pattern features describe facial expression characteristics well and give very high recognition precision, further increasing the practicality of the expression recognition system;
(4) The method makes effective use of the spatio-temporal feature information of facial expressions, overcomes the defect of the LBP recognition method of not considering the center pixel, improves the efficiency and precision of facial expression recognition in video sequences and raises the speed and efficiency of training, giving the method greater practical and general value.
The following embodiments further demonstrate the outstanding substantive features and marked improvements of the invention.
Brief description of the drawings
The invention is further described below with reference to the drawings and embodiments.
Fig. 1 is a schematic flowchart of the steps of the facial expression recognition method based on video sequences of the invention.
Fig. 2 is a schematic diagram of the process of extracting the dynamic spatio-temporal texture features of facial expression sequence images with the HCBP-TOP algorithm in the method of the invention.
Detailed description of the embodiments
The embodiment illustrated in Fig. 1 shows that the flow of the facial expression recognition method based on video sequences of the invention is: preprocessing of the facial expression sequences → layering and partitioning of the facial expression sequence images with the spatial pyramid partitioning scheme → extraction of the dynamic spatio-temporal texture features of the facial expression sequence images with the HCBP-TOP algorithm → training and prediction of facial expressions with an SVM classifier.
The embodiment illustrated in Fig. 2 shows that the process of extracting the dynamic spatio-temporal texture features of the facial expression sequence images with the HCBP-TOP algorithm in the method of the invention is: the facial expression sequence images processed by the spatial pyramid partitioning scheme undergo dynamic spatio-temporal texture feature extraction in the three dimensions X, Y, T, where X and Y denote the horizontal and vertical directions and T the time domain; on the three orthogonal planes XY, XT and YT, the deformation information of the XY plane and the motion information of the XT and YT planes are extracted; the feature vectors of the three dimensions are concatenated into the overall feature vector of each block, and the feature vectors of all blocks of the whole image are then integrated into the feature data of one layer of the image.
To elaborate further: after the spatial pyramid partitioning of the second step, the HCBP-TOP algorithm obtains for each sub-block the feature vectors of the three dimensions X, Y, T and concatenates them into the overall feature vector of the block; the feature vectors of all blocks of the whole image are integrated into the feature data of one layer of the image; and, because each layer and each image block within a layer carries different spatial information and plays a different role in classification, the dynamic spatio-temporal texture feature data of the facial expression sequence images of all layers are merged, according to the assigned weights, into the feature data of the whole image. The merged feature data are normalized to obtain the final image feature representation.
Embodiment
The present embodiment is a facial expression recognition method based on video sequences, namely a method that extracts the dynamic spatio-temporal texture features of facial expression sequences with the HCBP-TOP algorithm. The concrete steps are as follows:
First step, preprocessing of the facial expression sequence images:
(1) Cropping of the facial expression sequence images:
The facial expression sequence images read from an existing facial expression video sequence database are transformed from RGB space to gray space using formula (1):
Gray = 0.299R + 0.587G + 0.114B (1),
where Gray is the gray value, generally ranging from 0 to 255, R is the red component, G the green component and B the blue component.
According to the features of the human face and the "three sections, five eyes" geometric model of facial proportions, the facial expression sequence images transformed to gray space are cropped: let the horizontal distance between the two eyes be d and take the midpoint of the line connecting them as the reference point; the upper boundary is set 0.55d above the reference point, the lower boundary 1.45d below it, the left boundary 0.9d to the left and the right boundary 0.9d to the right, thus completing the cropping of the facial expression sequence images;
(2) Scaling of the facial expression sequence images:
The facial expression sequences cropped in (1) are rescaled with the bicubic interpolation algorithm to normalize the image size; after scaling, each facial image is 64 × 64 pixels;
(3) Gray-level balancing of the facial expression sequence images:
Histogram equalization is applied to the facial expression sequence images obtained in (2), which completes the preprocessing of the facial expression sequence images;
Second step, layering and partitioning the facial expression sequence images with the spatial pyramid partitioning scheme:
Spatial pyramid partitioning progressively subdivides the image in space. The scheme adopted here divides the facial expression sequence image by powers of 2 along the horizontal and vertical coordinate directions. Let the total number of pyramid levels be L+1, with level indices i = 0, 1, 2, ..., L. At level i the facial expression sequence image is divided into 2^i blocks in the horizontal direction and 2^i blocks in the vertical direction, i.e. into 2^i × 2^i blocks in total. Here L = 2 is set, so the spatial pyramid partitioning scheme divides the facial expression sequence images preprocessed in the first step into 2+1 = 3 levels: level 0 is the original facial expression sequence image, and level i divides the original image into 2^i × 2^i sub-blocks, i = 1, 2;
Third step, extracting the dynamic spatio-temporal texture features of the facial expression sequence images with the HCBP-TOP algorithm:
The HCBP-TOP algorithm extracts the dynamic spatio-temporal texture features of the layered blocks of the facial expression sequence images, the "layered blocks" being those obtained in the second step. After the spatial pyramid partitioning of the second step, the HCBP-TOP algorithm further takes, for each sub-block, the feature vectors of the three dimensions X axis, Y axis and time axis T and concatenates them into the overall feature vector of the block; the feature vectors of all blocks of the whole image are then integrated to form the feature data of one layer of the image; finally the dynamic spatio-temporal texture histograms of the facial expression sequence images of all pyramid levels are concatenated, according to assigned weights, into the dynamic spatio-temporal texture histogram of the whole facial expression sequence image. The concrete method is as follows:
(1) Extracting the HCBP features of the image sub-blocks:
Within each level, the HCBP-TOP algorithm extracts the HCBP features of the image sub-blocks at corresponding positions in each sequence, and the histogram of the dynamic spatio-temporal texture features of the facial expression sequence image is computed for each sub-block; the histograms of all sub-blocks are then concatenated into the histogram of each level of the whole facial expression sequence, which yields the deformation information of the expression sequence in the X-Y plane and its motion information in the X-T and Y-T planes. The extraction is computed by the following HCBP coding:
The eight HCBP encoding models M_1 to M_8 are given by formula (2):
$$M_1=\begin{bmatrix}1&1&1&0&0\\1&-1&-1&-1&0\\1&-1&0&0&0\\0&-1&0&0&0\\0&0&0&0&0\end{bmatrix},\quad M_2=\begin{bmatrix}1&1&1&1&1\\-1&-1&-1&-1&-1\\0&0&0&0&0\\0&0&0&0&0\\0&0&0&0&0\end{bmatrix},\quad M_3=\begin{bmatrix}0&0&1&1&1\\0&-1&-1&-1&1\\0&0&0&-1&1\\0&0&0&-1&0\\0&0&0&0&0\end{bmatrix},$$
$$M_4=\begin{bmatrix}0&0&0&-1&1\\0&0&0&-1&1\\0&0&0&-1&1\\0&0&0&-1&1\\0&0&0&-1&1\end{bmatrix},\quad M_5=\begin{bmatrix}0&0&0&0&0\\0&0&0&-1&0\\0&0&0&-1&1\\0&-1&-1&-1&1\\0&0&1&1&1\end{bmatrix},\quad M_6=\begin{bmatrix}0&0&0&0&0\\0&0&0&0&0\\0&0&0&0&0\\-1&-1&-1&-1&-1\\1&1&1&1&1\end{bmatrix},$$
$$M_7=\begin{bmatrix}0&0&0&0&0\\0&-1&0&0&0\\1&-1&0&0&0\\1&-1&-1&-1&0\\1&1&1&0&0\end{bmatrix},\quad M_8=\begin{bmatrix}1&-1&0&0&0\\1&-1&0&0&0\\1&-1&0&0&0\\1&-1&0&0&0\\1&-1&0&0&0\end{bmatrix}\qquad(2),$$
In the above formula, in each model the five outermost pixels are assigned weight 1, the five adjacent pixels of the second outer ring weight -1, and all other positions weight 0. The center point P_0 records the texture variation that stores the Haar-like type feature. In each of the X-Y, X-T and Y-T planes, a 5 × 5 window as in formula (3) is formed with P_0 at its center, surrounded by the 24 neighborhood points P_i (i = 1, 2, ..., 24):
$$W(x,y)=\begin{bmatrix}P_9&P_{10}&P_{11}&P_{12}&P_{13}\\P_{24}&P_1&P_2&P_3&P_{14}\\P_{23}&P_8&P_0&P_4&P_{15}\\P_{22}&P_7&P_6&P_5&P_{16}\\P_{21}&P_{20}&P_{19}&P_{18}&P_{17}\end{bmatrix}\qquad(3),$$
As can be seen from the above description, any pixel I(x, y, t) in an image sequence I has around it the small windows W_j(x, y), j = 0, 1, 2, where W_0(x, y) is the window of the pixel in the X-Y plane, W_1(x, y) its window in the X-T plane and W_2(x, y) its window in the Y-T plane. The HCBP value f_j(x, y, t) of W_j(x, y) is computed as:
$$f_j(x,y,t)=\mathrm{HCBP}(I(x,y,t))=\sum_{k=1}^{8}B(a_{j,k})\times 2^{8-k}\qquad(4),$$
where
$$B(a_{j,k})=\begin{cases}1,&a_{j,k}\ge T_{j,k}\\0,&a_{j,k}<T_{j,k}\end{cases}\qquad(5),$$
$$T_{j,k}=5\left(I(x,y,t)-\frac{C_k\cdot W_j(x,y)+I(x,y,t)}{11}\right)\qquad(6),$$
$$a_{j,k}=M_k\cdot W_j(x,y)\qquad(7),$$
where M_k is the encoding model; C_k is the summation model over the nonzero-weight pixels of each encoding model, C_k = |M_k|; B(·) is the threshold comparison function of the HCBP value; T_{j,k} is the threshold; and a_{j,k} is the decimal number obtained by convolving the small window W_j(x, y) with the encoding model M_k.
Each image sub-block is scanned pixel by pixel with the models M_1 to M_8, the model window being 5 × 5 pixels, and each model has its own threshold. The threshold is computed as follows: first sum the 10 nonzero-weight pixels of the model, C_k · W_j(x, y), together with the center pixel, 11 pixels in all; then take the mean of these 11 pixels; the threshold of the model is 5 times the difference between the center pixel and this mean. Formula (4) is then used to compute the dynamic spatio-temporal texture features of the facial expression sequence: the sum of the 5 pixel values with weight 1 minus the sum of the 5 pixel values with weight -1, i.e. the change information between the outer and inner sides, gives a decimal number a_{j,k}; a_{j,k} is compared with the threshold T_{j,k}, and the HCBP-TOP code bit is 1 if a_{j,k} is not smaller than the threshold and 0 otherwise;
(2) Extracting the dynamic spatio-temporal texture HCBP-TOP features of the facial expression sequence images:
The HCBP-TOP algorithm extracts the dynamic spatio-temporal texture features of the facial expression sequence images from the hierarchical pyramid sub-blocks generated in the second step. From the HCBP features of the image sub-blocks extracted in step (1), the dynamic spatio-temporal texture HCBP-TOP feature of the facial expression sequence image in each block is computed: for each sub-block, the HCBP feature vectors of the three dimensions X, Y, T in the XY, XT and YT directions are concatenated into the overall HCBP-TOP feature vector of the block, the feature vectors of all blocks of the whole image are integrated into the feature data of one layer of the image, and the feature histogram of that layer is obtained with a histogram function. Because each layer, and each image block within a layer, carries different spatial information and plays a different role in image classification, the dynamic spatio-temporal texture histograms of the facial expression sequence images of all pyramid levels are concatenated into the histogram of the whole facial expression sequence image according to assigned weights. The weight assignment principle is that the layer of large-scale sub-blocks receives a small histogram weight and the layer of small-scale sub-blocks a large one: the histogram weight of pyramid level i is defined as i + 1, so the original-image histogram of level 0 receives weight 1 and the level-1 histogram weight 2; the higher the level, the larger the weight of that level's histogram, i.e. the larger the proportion of that level's feature information in the overall representation. The histograms of all levels are then merged, according to the assigned weights, into the dynamic spatio-temporal texture histogram of the facial expression sequence image as a whole, where level i has 2^i × 2^i sub-blocks and the feature values of each sub-block range from 0 to 255. Normalization yields the final dynamic spatio-temporal texture representation of the facial expression sequence image, and the extracted feature data are fed into the SVM for classifier training;
Fourth step, training and predicting facial expressions with an SVM classifier:
The dynamic spatio-temporal texture feature data of the facial expression sequence images extracted in the third step are fed into the SVM to train the classifier and to predict facial expressions, that is, to judge to which expression class the extracted features actually belong. The leave-one-out method is adopted, and the average result of the experiments is taken as the expression recognition rate. The concrete operation flow is as follows:
(1) The dynamic spatio-temporal texture feature data of the facial expression sequence images extracted in the third step are fed into the SVM for classifier training. From these feature samples, the feature matrix of the training-sample sequences, the feature matrix of the test-sample sequences and the corresponding training and test class-label matrices are constructed, the values of the class-label matrices being the classification categories of the samples;
(2) A self-defined kernel function is adopted for the dynamic spatio-temporal texture features of the local facial expression sequence images, and cross-validation is used to select the optimal parameters c and g, giving c = 790 and g = 1.9. The feature matrices of the training samples and test samples are first fed into the svmtrain function to obtain the support vectors; the feature matrix of the test samples together with the support vectors is then fed into the svmpredict function for prediction, which completes the expression recognition. Experiments on the Cohn-Kanade and SFEW databases recognize the six expressions anger, disgust, fear, happiness, sadness and surprise, thus completing the recognition of facial expressions.
In the facial expression recognition method based on video sequences of the present embodiment, the HCBP-TOP algorithm is an algorithm combining CBP features with Haar-like features; the HCBP-TOP algorithm extracts from sub-image sequences of the same frequency the improved dynamic spatio-temporal texture features of the partitioned facial expression sequences, the HCBP-TOP histogram being defined as:
$$H_{i,j}=\sum_{x,y,t}E\{f_j(x,y,t)=i\},\qquad i=0,\dots,255;\; j=0,1,2\qquad(8),$$
where f_j(x, y, t) denotes the HCBP value of the pixel I(x, y, t) in the j-th plane (j = 0: XY; 1: XT; 2: YT), and the function E{f} is defined as:
$$E\{f\}=\begin{cases}1,&\text{if }f=i\\0,&\text{else}\end{cases}\qquad(9).$$
The present embodiment was tested on two existing facial expression video sequence databases, Cohn-Kanade and SFEW. From the Cohn-Kanade database, 327 facial expression video sequences were chosen; in the experiments the sequences were divided into anger, disgust, fear, happiness, sadness and surprise, comprising 38, 35, 41, 50, 43 and 48 image sequences respectively. Each sequence contains 10 frames, the start frame being the neutral expression and the end frame the apex of the expression, 3270 images in all. The experiments were run on the MATLAB R2012a platform under Windows 7, and recognition experiments were carried out on facial expression videos captured under normal, low and intense illumination. To evaluate the method of the present embodiment effectively, 3270 frames covering different skin colors and different illuminations were extracted from the facial expression video sequences for experimental analysis; the accurate recognition rate of the present embodiment is 92.86% and the false detection rate 7.14%.
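A sketch of the leave-one-out protocol behind these figures, assuming precomputed feature vectors and labels per sequence and the same assumed SVC parameters as in the earlier sketch:

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.svm import SVC

def leave_one_out_rate(features, labels):
    """Average leave-one-out recognition rate over all sequences."""
    hits = 0
    for train_idx, test_idx in LeaveOneOut().split(features):
        clf = SVC(C=790, gamma=1.9).fit(features[train_idx], labels[train_idx])
        hits += int(clf.predict(features[test_idx])[0] == labels[test_idx][0])
    return hits / len(labels)
```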
To verify the advantage of the method of the present embodiment in expression recognition rate, it was compared with two facial expression recognition methods commonly used for expression recognition, one extracting the dynamic spatio-temporal texture features of expression sequences with the LBP-TOP algorithm and one with the HLBP-TOP algorithm; SVM classifiers were trained and comparative expression recognition experiments were carried out on Cohn-Kanade. Table 1 lists the recognition rates of the different algorithms on the Cohn-Kanade database. Training and test samples were selected by randomly choosing part of the video sequences of each class as training samples and using the remainder as test samples, ensuring that the training and test samples do not overlap, which better guarantees the generality and correctness of the experimental results.
Table 1. Recognition rates of facial expressions of different algorithms on the Cohn-Kanade database
The number of levels of the spatial pyramid partitioning scheme also affects the expression recognition rate; Table 2 lists the influence of different numbers of levels on the recognition rate. The best results are obtained when the level parameter of the spatial pyramid partitioning is 2, i.e. when three levels are used.
Table 2. Influence of different numbers of spatial pyramid levels on the average facial expression recognition rate on the Cohn-Kanade database
From the SFEW database, 940 facial expression video sequence images were chosen for testing; in the experiments the sequences were divided into anger, disgust, fear, happiness, sadness and surprise, comprising 214, 66, 116, 227, 198 and 119 images respectively. Table 3 lists the recognition rates of the different algorithms on the SFEW database.
Table 3. Recognition rates of facial expressions of different algorithms on the SFEW database
The results show that the recognition rate of the method of the present embodiment, which extracts the dynamic spatio-temporal texture features of facial expression sequences with the HCBP-TOP algorithm, is clearly better than that of the facial expression recognition methods that extract them with the LBP-TOP algorithm or the HLBP-TOP algorithm.
The spatial pyramid partitioning and the SVM classifier described in the present embodiment are well known, and the equipment involved is well known in the art and commercially available.

Claims (2)

1. A facial expression recognition method based on video sequences, characterized in that it is a facial expression recognition method that extracts the dynamic spatio-temporal texture features of facial expression sequences with the HCBP-TOP algorithm, the concrete steps being as follows:
First step, preprocessing of the facial expression sequence images:
(1) Cropping of the facial expression sequence images:
The facial expression sequence images read from an existing facial expression video sequence database are transformed from RGB space to gray space using formula (1):
Gray = 0.299R + 0.587G + 0.114B (1),
where Gray is the gray value, generally ranging from 0 to 255, R is the red component, G the green component and B the blue component.
According to the features of the human face and the "three sections, five eyes" geometric model of facial proportions, the facial expression sequence images transformed to gray space are cropped: let the horizontal distance between the two eyes be d and take the midpoint of the line connecting them as the reference point; the upper boundary is set 0.55d above the reference point, the lower boundary 1.45d below it, the left boundary 0.9d to the left and the right boundary 0.9d to the right, thus completing the cropping of the facial expression sequence images;
(2) Scaling of the facial expression sequence images:
The facial expression sequences cropped in (1) are rescaled with the bicubic interpolation algorithm to normalize the image size; after scaling, each facial image is 64 × 64 pixels;
(3) Gray-level balancing of the facial expression sequence images:
Histogram equalization is applied to the facial expression sequence images obtained in (2), which completes the preprocessing of the facial expression sequence images;
Second step, layering and partitioning the facial expression sequence images with the spatial pyramid partitioning scheme:
Spatial pyramid partitioning progressively subdivides the image in space. The scheme adopted here divides the facial expression sequence image by powers of 2 along the horizontal and vertical coordinate directions. Let the total number of pyramid levels be L+1, with level indices i = 0, 1, 2, ..., L. At level i the facial expression sequence image is divided into 2^i blocks in the horizontal direction and 2^i blocks in the vertical direction, i.e. into 2^i × 2^i blocks in total. Here L = 2 is set, so the spatial pyramid partitioning scheme divides the facial expression sequence images preprocessed in the first step into 2+1 = 3 levels: level 0 is the original facial expression sequence image, and level i divides the original image into 2^i × 2^i sub-blocks, i = 1, 2;
Third step, extracting the dynamic spatio-temporal texture features of the facial expression sequence images with the HCBP-TOP algorithm:
The HCBP-TOP algorithm extracts the dynamic spatio-temporal texture features of the layered blocks of the facial expression sequence images, the "layered blocks" being those obtained in the second step. After the spatial pyramid partitioning of the second step, the HCBP-TOP algorithm further takes, for each sub-block, the feature vectors of the three dimensions X axis, Y axis and time axis T and concatenates them into the overall feature vector of the block; the feature vectors of all blocks of the whole image are then integrated to form the feature data of one layer of the image; finally the dynamic spatio-temporal texture histograms of the facial expression sequence images of all pyramid levels are concatenated, according to assigned weights, into the dynamic spatio-temporal texture histogram of the whole facial expression sequence image. The concrete method is as follows:
(1) Extracting the HCBP features of the image sub-blocks:
Within each level, the HCBP-TOP algorithm extracts the HCBP features of the image sub-blocks at corresponding positions in each sequence, and the histogram of the dynamic spatio-temporal texture features of the facial expression sequence image is computed for each sub-block; the histograms of all sub-blocks are then concatenated into the histogram of each level of the whole facial expression sequence, which yields the deformation information of the expression sequence in the X-Y plane and its motion information in the X-T and Y-T planes. The extraction is computed by the following HCBP coding:
The eight HCBP encoding models M_1 to M_8 are given by formula (2):
$$M_1=\begin{bmatrix}1&1&1&0&0\\1&-1&-1&-1&0\\1&-1&0&0&0\\0&-1&0&0&0\\0&0&0&0&0\end{bmatrix},\quad M_2=\begin{bmatrix}1&1&1&1&1\\-1&-1&-1&-1&-1\\0&0&0&0&0\\0&0&0&0&0\\0&0&0&0&0\end{bmatrix},\quad M_3=\begin{bmatrix}0&0&1&1&1\\0&-1&-1&-1&1\\0&0&0&-1&1\\0&0&0&-1&0\\0&0&0&0&0\end{bmatrix},$$
$$M_4=\begin{bmatrix}0&0&0&-1&1\\0&0&0&-1&1\\0&0&0&-1&1\\0&0&0&-1&1\\0&0&0&-1&1\end{bmatrix},\quad M_5=\begin{bmatrix}0&0&0&0&0\\0&0&0&-1&0\\0&0&0&-1&1\\0&-1&-1&-1&1\\0&0&1&1&1\end{bmatrix},\quad M_6=\begin{bmatrix}0&0&0&0&0\\0&0&0&0&0\\0&0&0&0&0\\-1&-1&-1&-1&-1\\1&1&1&1&1\end{bmatrix},$$
$$M_7=\begin{bmatrix}0&0&0&0&0\\0&-1&0&0&0\\1&-1&0&0&0\\1&-1&-1&-1&0\\1&1&1&0&0\end{bmatrix},\quad M_8=\begin{bmatrix}1&-1&0&0&0\\1&-1&0&0&0\\1&-1&0&0&0\\1&-1&0&0&0\\1&-1&0&0&0\end{bmatrix}\qquad(2),$$
In the above formula, in each model the five outermost pixels are assigned weight 1, the five adjacent pixels of the second outer ring weight -1, and all other positions weight 0. The center point P_0 records the texture variation that stores the Haar-like type feature. In each of the X-Y, X-T and Y-T planes, a 5 × 5 window as in formula (3) is formed with P_0 at its center, surrounded by the 24 neighborhood points P_i (i = 1, 2, ..., 24):
$$W(x,y)=\begin{bmatrix}P_9&P_{10}&P_{11}&P_{12}&P_{13}\\P_{24}&P_1&P_2&P_3&P_{14}\\P_{23}&P_8&P_0&P_4&P_{15}\\P_{22}&P_7&P_6&P_5&P_{16}\\P_{21}&P_{20}&P_{19}&P_{18}&P_{17}\end{bmatrix}\qquad(3),$$
As can be seen from the above description, any pixel I(x, y, t) in an image sequence I has around it the small windows W_j(x, y), j = 0, 1, 2, where W_0(x, y) is the window of the pixel in the X-Y plane, W_1(x, y) its window in the X-T plane and W_2(x, y) its window in the Y-T plane. The HCBP value f_j(x, y, t) of W_j(x, y) is computed as:
$$
f_j(x,y,t)=\mathrm{HCBP}(I(x,y,t))=\sum_{k=1}^{8}B(a_{j,k})\times 2^{8-k}\tag{4}
$$
Wherein
$$
B(a_{j,k})=\begin{cases}1,&a_{j,k}\ge T_{j,k}\\0,&a_{j,k}<T_{j,k}\end{cases}\tag{5}
$$
$$
T_{j,k}=5\left(I(x,y,t)-\frac{C_k\cdot W_j(x,y)+I(x,y,t)}{11}\right)\tag{6}
$$
$$
a_{j,k}=M_k\cdot W_j(x,y)\tag{7}
$$
where M_k is the encoding model; C_k is the pixel-summation model formed by the non-zero weights of each encoding model, C_k = |M_k|; B(x) is the threshold comparison function of the HCBP value; T_{j,k} is the threshold; and a_{j,k} is the decimal number obtained by convolving the small window W_j(x, y) with the encoding model M_k;
For each image sub-block, the models M_1–M_8 are used to scan the pixels of the image region with a model window of 5×5 pixels, each model having its own threshold. The threshold is computed as follows: first the sum of the 11 pixels consisting of the 10 non-zero-weight pixels of the model and the center pixel is calculated, i.e. C_k·W_j(x, y) plus the center pixel; then the mean value of these 11 pixels is obtained; the threshold of each model is 5 times the difference between the center pixel and this mean value. Formula (4) is then used to compute the dynamic spatio-temporal texture features of the facial expression sequence: the sum of the 5 pixel values with weight 1 minus the sum of the 5 pixel values with weight -1, i.e. the change information between the outer and the inner ring, yields a decimal number a_{j,k}; a_{j,k} is compared with the threshold T_{j,k}, and the HCBP-TOP code bit is set to 1 if a_{j,k} is not smaller than the threshold, and to 0 otherwise;
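As a rough illustration of the coding just described, the following Python/NumPy sketch computes the HCBP code of a single 5×5 window per formulas (4)-(7). It is a minimal sketch under two assumptions not stated in the text: the function and variable names are invented for illustration, and M_3–M_8 are generated as successive 90° clockwise rotations of M_1 and M_2, which reproduces the matrices of formula (2) exactly.

```python
import numpy as np

# M1 (corner model) and M2 (edge model) copied from formula (2); rotating
# each clockwise by 0, 90, 180 and 270 degrees yields M1..M8 in order.
M1 = np.array([[1,  1,  1,  0, 0],
               [1, -1, -1, -1, 0],
               [1, -1,  0,  0, 0],
               [0, -1,  0,  0, 0],
               [0,  0,  0,  0, 0]])
M2 = np.array([[ 1,  1,  1,  1,  1],
               [-1, -1, -1, -1, -1],
               [ 0,  0,  0,  0,  0],
               [ 0,  0,  0,  0,  0],
               [ 0,  0,  0,  0,  0]])
MODELS = []
for r in range(4):                    # r clockwise quarter-turns
    MODELS.append(np.rot90(M1, -r))   # M1, M3, M5, M7
    MODELS.append(np.rot90(M2, -r))   # M2, M4, M6, M8  -> order M1..M8

def hcbp(window):
    """HCBP code of one 5x5 window W_j(x, y); window[2, 2] is the
    center pixel I(x, y, t)."""
    center = float(window[2, 2])
    code = 0
    for k, Mk in enumerate(MODELS, start=1):         # k = 1..8
        a = float(np.sum(Mk * window))               # a_{j,k}, formula (7)
        ring = float(np.sum(np.abs(Mk) * window))    # C_k . W_j(x, y)
        T = 5.0 * (center - (ring + center) / 11.0)  # threshold, formula (6)
        if a >= T:                                   # B(a_{j,k}), formula (5)
            code += 2 ** (8 - k)                     # bit weight, formula (4)
    return code
```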
(2) Extracting the dynamic spatio-temporal texture HCBP-TOP features of the facial expression sequence images:
The HCBP-TOP algorithm is applied to the hierarchical pyramid sub-blocks generated in the second step to extract the dynamic spatio-temporal texture features of the facial expression sequence images. Following the HCBP feature computation for image sub-blocks in (1) above, the dynamic spatio-temporal texture HCBP-TOP feature of the facial expression sequence image in each block is calculated: for each sub-block, the HCBP feature vectors in the XY, XT and YT directions of the three dimensions X, Y and T are combined and concatenated into the overall in-block feature vector HCBP-TOP; the feature vectors of all blocks of the whole image are then integrated to form the feature data of one layer of the image, and the feature histogram of that layer is obtained with the histogram function. Because each image layer, and the spatial information of the image blocks within each layer, plays a different role in image classification, the dynamic spatio-temporal texture feature histograms of the facial expression sequence images of the pyramid layers are concatenated into the histogram of the whole facial expression sequence image according to assigned weights. The weight allocation principle is that the histogram of the layer whose sub-blocks are of large scale receives a small weight, while the histogram of the layer whose sub-blocks are of small scale receives a large weight. The histogram weight of pyramid layer i is defined as i + 1: the original-image feature histogram of layer 0 is assigned weight 1, the layer-1 feature histogram is assigned weight 2, and the larger the layer number, the larger the weight assigned to that layer's feature histogram, i.e. the larger the proportion of that layer's feature information in the total feature representation. The feature histograms of all layers are then fused and concatenated, according to the assigned weights, into the dynamic spatio-temporal texture feature histogram of the whole image, where layer i contains 2^i × 2^i sub-blocks and the dynamic spatio-temporal texture feature value of each sub-block ranges from 0 to 255. The final dynamic spatio-temporal texture feature representation of the facial expression sequence image is obtained after normalization, and the extracted feature data are fed into the SVM to train the classifier.
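A minimal sketch of the weight-allocated histogram fusion just described; the function name is invented, and the exact normalization step is not specified in the text, so L2 normalization is assumed here:

```python
import numpy as np

def weighted_pyramid_histogram(layer_hists):
    """Fuse per-layer HCBP-TOP histograms into the whole-sequence feature.
    layer_hists[i] is the concatenated histogram of all 2**i x 2**i
    sub-blocks of pyramid layer i; layer i receives weight i + 1."""
    parts = [(i + 1) * np.asarray(h, dtype=float)
             for i, h in enumerate(layer_hists)]
    feature = np.concatenate(parts)
    return feature / (np.linalg.norm(feature) + 1e-12)  # assumed L2 norm
```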
The 4th step: an SVM classifier is adopted for the training and prediction of facial expressions:
The dynamic spatio-temporal texture feature data of the facial expression sequence images extracted in the 3rd step are fed into the SVM classifier for the training and prediction of facial expressions, i.e. to judge to which class of facial expression the extracted features actually belong. The leave-one-out method is adopted, and the average result of the experiments is taken as the expression recognition rate. The concrete operation flow is as follows:
(1) The dynamic spatio-temporal texture feature data of the facial expression sequence images extracted in the 3rd step above are fed into the SVM to train the classifier. From these feature samples, the training and testing classification sample matrices are constructed, corresponding respectively to the feature matrix of the training-sample facial expression sequence images and the feature matrix of the test-sample facial expression sequence images; the values in the training and testing classification sample matrices are the classification categories of the samples;
(2) A self-defined kernel function is adopted for the dynamic spatio-temporal texture features of the local facial expression sequence images, and cross-validation is used to select the optimal parameters c and g, giving the penalty factor c = 790 and the kernel parameter g = 1.9. First, the dynamic spatio-temporal texture feature matrices of the training-sample and test-sample facial expression sequence images are fed into the svmtrain function to obtain the support vectors; then the feature matrix of the test-sample facial expression sequence images and the above support vectors are fed into the svmpredict function for prediction, which completes the facial expression recognition. Experiments on the Cohn-Kanade database and the SFEW database recognize the 6 expressions anger, disgust, fear, happiness, sadness and surprise, thereby completing the recognition of facial expressions.
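The text names LIBSVM's svmtrain and svmpredict with a self-defined kernel; purely as an illustrative stand-in, the sketch below reproduces the leave-one-out evaluation with scikit-learn's SVC, an RBF kernel substituting for the self-defined kernel, and the parameter values quoted above:

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.svm import SVC

LABELS = ["anger", "disgust", "fear", "happy", "sad", "surprise"]

def loo_recognition_rate(X, y, c=790.0, g=1.9):
    """Leave-one-out expression recognition rate.
    X: one HCBP-TOP feature vector per expression sequence;
    y: integer class indices into LABELS."""
    clf = SVC(C=c, gamma=g, kernel="rbf")  # RBF stands in for the custom kernel
    scores = cross_val_score(clf, X, y, cv=LeaveOneOut())
    return scores.mean()  # average of the per-fold 0/1 accuracies
```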
2. The facial expression recognition method based on video sequences according to claim 1, characterized in that: the HCBP-TOP algorithm is an algorithm combining CBP features with Haar-like features; the HCBP-TOP algorithm extracts the dynamic spatio-temporal texture features of the blocked facial expression sequence from sub-image sequences of the same frequency, where the HCBP-TOP histogram is defined as follows:
$$
H_{i,j}=\sum_{x,y,t}E\{f_j(x,y,t)=i\},\quad i=0,\ldots,255;\;j=0,1,2\tag{8}
$$
where f_j(x, y, t) denotes the HCBP value of pixel I(x, y, t) in the j-th plane (j = 0: XY; 1: XT; 2: YT), and the function E{f} is defined as follows:
$$
E\{f\}=\begin{cases}1,&\text{if }f=i\\0,&\text{else}\end{cases}\tag{9}
$$
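A small sketch of the histogram of formula (8), assuming the per-plane HCBP code maps f_0 (XY), f_1 (XT) and f_2 (YT) have already been computed for every pixel of the sequence; the function name is illustrative:

```python
import numpy as np

def hcbp_top_histogram(code_maps):
    """H_{i,j} of formula (8): for each plane j (0: XY, 1: XT, 2: YT),
    count how many pixels carry HCBP code i, i = 0..255."""
    return np.stack([np.bincount(np.ravel(c).astype(np.int64), minlength=256)
                     for c in code_maps])
```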
CN201510612526.XA 2015-09-23 2015-09-23 Facial expression recognizing method based on video sequence Active CN105139004B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510612526.XA CN105139004B (en) 2015-09-23 2015-09-23 Facial expression recognizing method based on video sequence


Publications (2)

Publication Number Publication Date
CN105139004A true CN105139004A (en) 2015-12-09
CN105139004B CN105139004B (en) 2018-02-06

Family

ID=54724347

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510612526.XA Active CN105139004B (en) 2015-09-23 2015-09-23 Facial expression recognizing method based on video sequence

Country Status (1)

Country Link
CN (1) CN105139004B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090226049A1 (en) * 2008-01-31 2009-09-10 University Of Southern California Practical Modeling and Acquisition of Layered Facial Reflectance
CN103488974A (en) * 2013-09-13 2014-01-01 南京华图信息技术有限公司 Facial expression recognition method and system based on simulated biological vision neural network
CN104298981A (en) * 2014-11-05 2015-01-21 河北工业大学 Face microexpression recognition method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
PHILIPP MICHEL et al.: "Real time facial expression recognition in video using support vector machines", International Conference on Multimodal Interfaces *
YU Ming et al.: "Facial expression recognition based on LGBP features and sparse representation", Computer Engineering and Design *

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105701495A (en) * 2016-01-05 2016-06-22 贵州大学 Image texture feature extraction method
CN105701459B (en) * 2016-01-06 2019-04-16 Oppo广东移动通信有限公司 A kind of image display method and terminal device
CN105701459A (en) * 2016-01-06 2016-06-22 广东欧珀移动通信有限公司 Picture display method and terminal device
CN107045618A (en) * 2016-02-05 2017-08-15 北京陌上花科技有限公司 A kind of facial expression recognizing method and device
CN107045618B (en) * 2016-02-05 2020-07-03 北京陌上花科技有限公司 Facial expression recognition method and device
CN106341724A (en) * 2016-08-29 2017-01-18 刘永娜 Expression image marking method and system
CN107294947A (en) * 2016-08-31 2017-10-24 张梅 Parking information public service platform based on Internet of Things
CN106446810A (en) * 2016-09-12 2017-02-22 合肥工业大学 Computer vision method used for mental state analysis
CN106980811A (en) * 2016-10-21 2017-07-25 商汤集团有限公司 Facial expression recognizing method and expression recognition device
CN106845483A (en) * 2017-02-10 2017-06-13 杭州当虹科技有限公司 A kind of video high definition printed words detection method
CN108537194A (en) * 2018-04-17 2018-09-14 谭红春 A kind of expression recognition method of the hepatolenticular degeneration patient based on deep learning and SVM
CN109145754A (en) * 2018-07-23 2019-01-04 上海电力学院 Merge the Emotion identification method of facial expression and limb action three-dimensional feature
CN109124604A (en) * 2018-09-20 2019-01-04 南方医科大学珠江医院 A kind of appraisal procedure of neonatal pain degree
CN109409296A (en) * 2018-10-30 2019-03-01 河北工业大学 The video feeling recognition methods that facial expression recognition and speech emotion recognition are merged
CN109409296B (en) * 2018-10-30 2020-12-01 河北工业大学 Video emotion recognition method integrating facial expression recognition and voice emotion recognition
CN110175526A (en) * 2019-04-28 2019-08-27 平安科技(深圳)有限公司 Dog Emotion identification model training method, device, computer equipment and storage medium
CN110321805A (en) * 2019-06-12 2019-10-11 华中科技大学 A kind of dynamic expression recognition methods based on sequential relationship reasoning
CN110321805B (en) * 2019-06-12 2021-08-10 华中科技大学 Dynamic expression recognition method based on time sequence relation reasoning
CN110427848A (en) * 2019-07-23 2019-11-08 京东方科技集团股份有限公司 A kind of psychoanalysis system
CN111931677A (en) * 2020-08-19 2020-11-13 北京影谱科技股份有限公司 Face detection method and device and face expression detection method and device
CN112580648A (en) * 2020-12-14 2021-03-30 成都中科大旗软件股份有限公司 Method for realizing image information identification based on image segmentation technology
CN114926886A (en) * 2022-05-30 2022-08-19 山东大学 Micro expression action unit identification method and system

Also Published As

Publication number Publication date
CN105139004B (en) 2018-02-06

Similar Documents

Publication Publication Date Title
CN105139004A (en) Face expression identification method based on video sequences
CN105956560B Vehicle type recognition method based on multi-scale pooled deep convolution features
CN106023220B Vehicle appearance component image segmentation method based on deep learning
CN107564025B Electric power equipment infrared image semantic segmentation method based on deep neural networks
CN110532900B Facial expression recognition method based on U-Net and LS-CNN
Li et al. Unsupervised learning of edges
CN103942577B Person identification method in video surveillance based on automatically built sample database and composite features
CN104182772B Gesture recognition method based on deep learning
CN106599854B Automatic facial expression recognition method based on multi-feature fusion
CN108171112A Vehicle recognition and tracking method based on convolutional neural networks
CN105005765A Facial expression recognition method based on Gabor wavelets and gray-level co-occurrence matrix
CN105825502B Saliency-guided dictionary learning method for weakly supervised image parsing
CN106682569A Fast traffic sign recognition method based on convolutional neural networks
CN105139039A Method for recognizing facial micro-expressions in video sequences
CN106127196A Facial expression classification and recognition method based on dynamic texture features
CN104281853A Behavior recognition method based on 3D convolutional neural networks
CN106909938B View-independent behavior recognition method based on deep learning networks
CN110827260B Cloth defect classification method based on LBP features and convolutional neural networks
CN105046197A Clustering-based multi-template pedestrian detection method
CN105205449A Sign language recognition method based on deep learning
CN109753950A Dynamic facial expression recognition method
CN104598885A Method for detecting and locating text signs in street view images
CN104298974A Human behavior recognition method based on depth video sequences
CN110503613A Single-image rain removal method based on cascaded dilated convolutional neural networks
CN109034066A Building recognition method based on multi-feature fusion

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant