CN105718868A

CN105718868A - Face detection system and method for multi-pose faces

Info

Publication number: CN105718868A
Application number: CN201610029680.9A
Authority: CN
Inventors: 邬书哲; 阚美娜; 山世光; 陈熙霖
Original assignee: Institute of Computing Technology of CAS
Current assignee: Institute of Computing Technology of CAS
Priority date: 2016-01-18
Filing date: 2016-01-18
Publication date: 2016-06-29
Anticipated expiration: 2036-01-18
Also published as: CN105718868B

Abstract

The invention provides a face detection system for multi-pose faces. The face detection system comprises a front-end detector and a back-end detector. The front-end detector comprises at least one layer of classifiers. Each layer comprises at least two parallel first-type classifiers for faces in different poses, wherein the first-type classifiers are used to distinguish candidate face windows from non-face windows. The back-end detector comprises a second-type classifier using a deep neural network. The second-type classifier is used to further distinguish faces from non-faces among the detection results of the front-end detector. The invention correspondingly provides a face detection method. Detection accuracy is improved while the computational cost of the detection process is effectively reduced and detection speed is effectively increased.

Description

Face detection system and method for multi-pose faces

Technical field

The invention belongs to the field of computer vision and relates in particular to face detection technology.

Background art

The goal of the face detection task is, for any given image or image sequence, to have a machine automatically determine whether a face is present in the image or sequence and, if so, to find its position and size. Face detection is generally formulated as a binary classification problem, namely distinguishing faces from non-faces. Classical face detection methods mainly learn a classifier from face and non-face image samples, and then apply the trained classifier to each image window of the input image in a sliding-window manner.

Face images vary greatly in appearance and shape because of intrinsic factors such as the age, sex, ethnicity and body weight of different individuals and their different head poses. Extrinsic factors such as illumination, occlusion, angle, distance and the capture device further enrich the variation patterns of face images and make face detection harder. Among the factors that can affect the appearance and shape of a face image, pose is one of those with the greatest effect on appearance. A change of pose is a highly nonlinear transformation, and face images in different poses differ significantly in appearance, so it is difficult to model all poses uniformly with a single model. The multi-pose face detection problem is therefore usually divided into several simpler sub-pose face detection problems: a classifier is trained for each pose, and each classifier handles faces in its corresponding pose. In the prior art, multiple classifiers are mainly organized in three structures: parallel, pyramid and tree.

However, the prior-art ways of organizing multiple classifiers all have shortcomings to varying degrees. The parallel structure trains one face detector per pose, and the vast majority of candidate image windows must be run through the detectors for all poses, so detection efficiency is low; moreover, the false positive rate of the whole detector grows as the number of parallel detectors increases. The pyramid structure can be regarded formally as a parallel structure with shared upper-level nodes: except for the first layer, every layer consists of multiple parallel models, so it has the same problems as the parallel structure in time efficiency and false positive rate. The only difference is that the upper levels save part of the computation by sharing nodes, yet modeling multi-pose faces with a single model inevitably creates a conflict between recall and the ability to filter non-face windows, which in turn lowers detection accuracy, increases computational cost, or both. In the tree structure, the classifiers on each layer are parallel and branch on pose from coarse to fine. Its top-level nodes generally require more features, and branching requires pose to be estimated explicitly or implicitly, so detection accuracy depends heavily on the accuracy of the pose estimate; an inaccurate or wrong pose estimate easily causes missed detections and thus lowers the recall of the detector. Some tree-structured detectors use a multi-branch mechanism to reduce the dependence on pose estimation accuracy, but this does not solve the problem at its root, and designing the multi-branch mechanism is itself difficult.

In summary, for face images in multiple poses, a classifier with low model complexity lacks sufficient modeling capacity, while a classifier with high model complexity has a high computational cost. Current methods that use multiple classifiers also struggle to combine strong modeling capacity with low computational cost, so it is difficult to achieve good accuracy and good speed at the same time.

Summary of the invention

An object of the invention is to provide a solution that overcomes the above technical problems.

The invention provides a face detection system for multi-pose faces, comprising a front-end detector and a back-end detector, wherein the front-end detector comprises at least one layer of classifiers, each layer comprising at least two parallel first-type classifiers for faces in different poses, which are used to distinguish candidate face windows from non-face windows; and the back-end detector comprises a second-type classifier using a deep neural network, which is used to further distinguish faces from non-faces among the detection results of the front-end detector.

Preferably, the front-end detector comprises at least two layers of classifiers, each layer comprising at least two parallel first-type classifiers for faces in different poses, wherein the number of first-type classifiers on a lower layer is no greater than the number of first-type classifiers on a higher layer.

Preferably, the first-type classifier comprises a classifier using features that can be computed quickly and/or a classifier using an algorithm that can be computed quickly.

Preferably, the features that can be computed quickly include LAB features, Haar features, accelerated SIFT features and/or SURF features.

Preferably, the first-type classifier comprises a cascade classifier, a classifier based on a part model and/or a classifier based on a neural network model.

Preferably, the different face poses comprise different angular ranges divided according to the angle of left-right rotation of the face out of the head plane, the angle of up-down rotation out of the plane, and/or the angle of rotation within the plane.

Preferably, the second-type classifier comprises a classifier formed by cascading at least two deep neural networks.

The invention correspondingly provides a face detection method for the aforementioned face detection system, comprising: step 1, inputting windows to be detected; step 2, judging, by the front-end detector, whether each window to be detected is a face in a specific pose; step 3, judging uniformly, by the back-end detector, whether each window determined in step 2 to be a face in a specific pose is a face; and step 4, marking all windows determined in step 3 to be faces as face windows.

Preferably, step 2 further comprises filtering out all windows determined not to be a face in any specific pose; and/or step 3 further comprises filtering out all windows determined to be non-faces.

Preferably, step 4 further comprises merging all windows determined in step 3 to be faces, and marking the merged windows as face windows.

Compared with the prior art, the technical solution proposed by the invention can quickly filter out the vast majority of non-face image windows while ensuring a sufficiently high recall for faces, and then accurately classifies the windows that pass, effectively reducing false detections. Detection accuracy is improved while the computational cost of the detection process is effectively reduced and detection speed is effectively increased.

Brief description of the drawings

To explain the technical solution of the invention more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described below do not limit the technical solution of the invention.

Fig. 1 is a structural diagram of a face detection system according to an embodiment of the invention;

Fig. 2 is a schematic diagram of a face detection system according to an embodiment of the invention;

Fig. 3 is a schematic diagram of a face detection system according to another embodiment of the invention;

Fig. 4 is a schematic diagram of a face detection system according to yet another embodiment of the invention;

Fig. 5 is a flow chart of a face detection method according to an embodiment of the invention.

Detailed description of the embodiments

Fig. 1 shows the structure of a face detection system according to an embodiment of the invention. As shown in Fig. 1, the face detection system of the invention comprises a front-end pose-specific detector C1 and a back-end all-pose detector C2. The front-end detector C1 quickly filters out the vast majority of non-face windows in the input image to determine the candidate face windows for back-end detection; the back-end all-pose detector C2 performs unified detection of faces in all poses to determine and output the face detection result.

Front-end detector C1

As shown in Fig. 2, according to an embodiment of the invention, the front-end detector C1 is divided into two layers. The first layer has 5 parallel AdaBoost classifiers using LAB (Locally Assembled Binary) features, each trained on samples of a different face pose. The face samples are divided into 5 poses according to the horizontal rotation angle of the face out of the head plane; that is, the out-of-plane horizontal rotation range [-90°, 90°] is divided into 5 sub-ranges: left full profile [-90°, -50°], left half profile [-50°, -25°], right full profile [50°, 90°], right half profile [25°, 50°] and near-frontal [-25°, 25°]. To filter out as many non-face windows as possible while preserving the recall of face windows (that is, the ratio of the number of correctly detected faces to the number of faces contained in the original image), 100 to 150 LAB features are preferably used.
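
The five-way yaw partition listed in this embodiment can be written as a small lookup. The following is a minimal sketch, assuming yaw is given in degrees with negative values meaning rotation to the left; the names `POSE_BINS` and `pose_bins_for_yaw` are illustrative and do not come from the patent. Because adjacent ranges share their endpoints, a boundary angle is reported for both neighbouring bins.

```python
# Yaw ranges in degrees, as listed in the embodiment (names are illustrative).
POSE_BINS = {
    "left_full_profile": (-90.0, -50.0),
    "left_half_profile": (-50.0, -25.0),
    "near_frontal": (-25.0, 25.0),
    "right_half_profile": (25.0, 50.0),
    "right_full_profile": (50.0, 90.0),
}


def pose_bins_for_yaw(yaw_deg):
    """Return every pose bin whose closed range contains yaw_deg."""
    return [name for name, (lo, hi) in POSE_BINS.items() if lo <= yaw_deg <= hi]
```

A sample at exactly 25° would thus be usable for training both the near-frontal and the right half-profile classifier.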

The second layer of the front-end detector C1 uses 2 small convolutional neural networks: one network processes the left half-profile, near-frontal and right half-profile faces that pass the first layer, and the other processes the left full-profile and right full-profile faces that pass the first layer. Preferably, each convolutional neural network on the second layer of the front end contains only one convolutional layer and two fully connected layers. The reason is that although the LAB AdaBoost classifiers of the first layer have already filtered out most windows, convolution itself is computationally expensive, so an overly large network would greatly reduce detection speed. The network takes a 20 × 20 three-channel RGB image as input; the convolutional layer uses 6 convolution kernels of size 5 × 5; the two fully connected layers have 20 and 2 nodes respectively; and between the convolutional layer and the fully connected layers there is a max-pooling layer that processes 2 × 2 regions with a stride of 2. The convolutional neural networks are trained with the back-propagation algorithm.
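
To see how small this second-layer network is, its layer sizes can be worked out by hand. The sketch below is illustrative only and rests on assumptions the text does not state: the kernel specification is read as 6 kernels of size 5 × 5, the convolution uses stride 1 with no padding, and every layer has bias terms. The function names are hypothetical.

```python
def conv_out(size, kernel, stride=1):
    """Spatial size after an unpadded convolution or pooling window."""
    return (size - kernel) // stride + 1


def small_cnn_shapes(input_size=20, n_kernels=6, kernel=5, pool=2, pool_stride=2):
    """Return (conv output size, pooled size, flattened length)."""
    after_conv = conv_out(input_size, kernel)              # 20 -> 16
    after_pool = conv_out(after_conv, pool, pool_stride)   # 16 -> 8
    flattened = n_kernels * after_pool * after_pool        # 6 * 8 * 8 = 384
    return after_conv, after_pool, flattened


def small_cnn_param_count(channels=3, n_kernels=6, kernel=5, fc1=20, fc2=2):
    """Weights plus biases of the conv layer and the two fully connected layers."""
    flattened = small_cnn_shapes(n_kernels=n_kernels, kernel=kernel)[2]
    conv = n_kernels * (kernel * kernel * channels + 1)    # assumes one bias per kernel
    dense1 = fc1 * (flattened + 1)
    dense2 = fc2 * (fc1 + 1)
    return conv + dense1 + dense2
```

Under these assumptions the network has well under ten thousand parameters, which is in keeping with the stated aim of keeping the front end cheap.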

As shown in Fig. 2, the first and second layers of the front-end detector C1 form a funnel-shaped structure (that is, the number of classifiers on a higher layer is no less than the number on a lower layer; here, "higher" to "lower" follows ascending layer number, i.e. the order from the input to the output of the front-end detector C1, so that from input to output the number of classifiers decreases layer by layer or stays the same), which makes it possible to tune the overall performance of the front-end detector C1.

According to embodiments of the invention, the front-end detector C1 is not limited to the two-layer structure above. On the one hand, the front-end detector C1 may consist of only a single layer of parallel classifiers, each trained on samples of a different face pose (for example, Haar AdaBoost cascade classifiers with a small number of stages), which makes the front-end detector structure simpler, as shown in Fig. 3.

On the other hand, the front-end detector C1 may consist of two or more layers of other types of classifiers arranged in parallel and in a funnel-shaped structure. Each layer of detector C1 may use other features that can be computed quickly, for example Haar features, accelerated SIFT (Scale-Invariant Feature Transform) features, SURF (Speeded-Up Robust Features) features and other features that can be computed quickly using integral images and feature-centric extraction. Classifiers on different layers may use different types of features, so as to describe sample variation from different angles and increase the separability of face and non-face windows. For example, if every layer of the front-end detector C1 uses AdaBoost cascade classifiers, the cascade classifiers of the first layer may use LAB features and those from the second layer onward may use Haar features. Using the quickly extracted LAB features in the first layer helps filter out a large number of non-face windows quickly, while using the more descriptive Haar features in subsequent layers helps improve the accuracy of the later classification stages. Classifiers on different layers may also be of different types (for example, cascade classifiers, classifiers based on a part model and/or classifiers based on a neural network model), which allows the overall performance of the front-end detector C1 to be tuned more flexibly. Taking a two-layer structure as an example, according to an embodiment of the invention, the first layer may use 6 parallel classifiers, each a LAB AdaBoost classifier, and the second layer may use 3 parallel classifiers, each a cascade classifier built from SURF MLPs (Multilayer Perceptrons), where each of the 3 second-layer classifiers is connected to 2 of the first-layer classifiers to form a funnel-shaped structure similar to that shown in Fig. 2.
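
The funnel wiring described here (several first-layer classifiers feeding each second-layer classifier) can be sketched as follows. This is a hypothetical illustration, not the patented implementation: classifiers are modelled as plain functions returning `True` or `False`, `links` is an assumed data structure, and which first-layer classifiers feed which second-layer classifier is not specified in the text.

```python
def funnel_accepts(window, first_layer, second_layer, links):
    """Decide whether a window passes a two-layer funnel-shaped front end.

    links[j] lists the indices of the first-layer classifiers that feed
    second-layer classifier j (an assumed representation of the wiring).
    """
    first_pass = [clf(window) for clf in first_layer]
    for j, clf2 in enumerate(second_layer):
        # A second-layer classifier only sees windows that at least one of
        # its connected first-layer classifiers has accepted.
        if any(first_pass[i] for i in links[j]) and clf2(window):
            return True
    return False
```

With 6 first-layer and 3 second-layer classifiers, one possible wiring is `links = [[0, 1], [2, 3], [4, 5]]`.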

According to embodiments of the invention, regarding the division of face samples by pose: on the one hand, face samples may be divided into different poses according to the position of the face relative to the head plane. The division may be based not only on the horizontal out-of-plane rotation angle (yaw angle) described above, but also on the vertical out-of-plane rotation angle (pitch angle), the in-plane rotation angle (roll angle), or any combination of these three angles.

On the other hand, the division of face samples by pose may also be determined by factors such as the expressive power of the selected features, the modeling capacity of the classifiers and the distribution of the training samples, and is not limited to the division used in the preceding embodiment. When dividing poses, the modeling capacity of the classifiers must be fully considered, and a balance must be struck between the window filtering ratio and face recall. Dividing into more poses and using more detectors usually raises recall, but also weakens window filtering, so that more windows reach the later, more time-consuming detectors; this greatly reduces speed and makes the task of the later classifiers harder. Dividing into fewer poses and using fewer classifiers makes the task of each individual classifier harder; if the task difficulty exceeds the modeling capacity of the classifier, the window filtering ratio drops sharply at the same recall, which slows down the front-end detector and makes training harder. Within a certain pose range the appearance of faces differs relatively little, and the difference grows as the pose span increases. Therefore, according to embodiments of the invention, the face samples are preferably divided into a small number of subclasses according to the actual distribution of the pose angles in the training samples. For example, faces with a rotation angle of less than 50° are classed as near-frontal, and faces with a rotation angle of more than 50° are further divided by rotation direction into left-profile and right-profile faces (a person skilled in the art will understand that the division is not limited to these specific angular ranges and is determined by the actual distribution of pose angles in the samples). Considering that flipping an image is merely a simple linear transformation, the left-profile and right-profile subclasses may even be merged to reduce the task difficulty and computational cost of detector C1. Because detector C1 serves only as a front end that produces candidate windows for back-end detection and need not determine the pose of the face, according to embodiments of the invention the subclass division of face poses may be flexible with respect to boundary samples: a sample lying on the boundary of a range may be assigned to either of the adjacent intervals, or to both adjacent intervals at once.

According to embodiments of the invention, following the division of the face samples, each classifier in the first layer of the front-end detector C1 is trained on one of the resulting groups of face samples, and the classifiers in subsequent layers are trained according to their connections to the classifiers of the preceding layer. Training mainly aims at a high recall and a short detection time; the false positive rate (that is, the ratio of the number of non-face sub-windows falsely detected as faces to the number of all non-face sub-windows examined in the original image) need not be strictly constrained. Taking as an example a first layer of the front-end detector C1 built from cascade classifiers, existing techniques such as the cascade (waterfall) algorithm can be used, with the false positive rate requirement suitably relaxed when constructing the cascade classifier; and/or the recall of the constructed cascade classifier can be tested on an independent validation sample set to decide how many of its leading stages to retain as the final cascade classifier.
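
One way to decide how many leading cascade stages to retain is to measure recall on the validation faces for each prefix of the cascade and keep the longest prefix that still meets a recall target. The sketch below assumes exactly that criterion, which the text does not spell out; the stage functions, `min_recall` and both function names are hypothetical.

```python
def recall(stages, faces):
    """Fraction of face samples accepted by every stage in `stages`."""
    kept = sum(all(stage(x) for stage in stages) for x in faces)
    return kept / len(faces)


def stages_to_keep(stages, validation_faces, min_recall):
    """Largest k such that the first k stages still reach min_recall."""
    best = 0
    for k in range(1, len(stages) + 1):
        if recall(stages[:k], validation_faces) >= min_recall:
            best = k
        else:
            break  # adding stages can only reject more faces, never fewer
    return best
```

Truncating the cascade this way trades a higher false positive rate for recall, which is acceptable here because the back-end detector makes the final decision.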

Combining multiple classifiers for different face poses in parallel to form the front-end detector C1, or one layer of it, compensates for the limited modeling capacity of a single classifier while keeping complexity low. During detection, image features such as Haar features can be computed quickly via integral images, and each feature requires only a few memory accesses and additions and subtractions, so the overall computational cost is very low. During training, relaxing the false positive rate requirement lets the classifiers of every layer keep both face recall and detection time within controlled bounds, which reduces the time cost of the parallel structure while losing as few faces as possible. Because the front-end detector C1 serves only as the front end of the whole detection system, it only has to supply candidate face windows and makes no final decision, which avoids the false positive rate problem of the parallel structure.

With the front-end detector described in the above embodiments, the number of windows finally passed to the back-end detector after the whole front end is relatively small (it varies with the number of faces in the image), and the proportion of non-face windows is relatively low (with suitable training and tuning it can be kept below 70%, and ideally reaches 60% or even lower). This avoids the situation in which a back-end detector that handles multi-pose faces with a single model faces too many negative examples on top of the large intra-class variation of faces in different poses, and so cannot model the variation patterns of all positive and negative examples with a single model. In other words, it avoids the situation in which no model of sufficiently low complexity can be trained for the back-end detector that guarantees both speed and accuracy.

Back-end detector C2

As shown in Fig. 2, the back-end detector C2 may consist of a classifier that uses a deep neural network and covers all face poses. For example, a multilayer perceptron (MLP) with a single hidden layer may be used as the model for building the back-end detector C2. Denote the network input by x, the mapping from the input to the hidden layer by f₁, the mapping from the hidden layer to the output layer by f₂, and the network output by y. The deep neural network model used to build detector C2 can then be written as y = f₂(f₁(x)), where y corresponds to the face/non-face class label or confidence.

According to an embodiment of the invention, the back-end detector C2 may consist of a classifier that uses SURF features and is modeled as a multilayer perceptron with a single hidden layer. The SURF feature is a gradient-based feature. On a 40 × 40 sample, local regions of preset sizes and aspect ratios are selected, and each region is divided into 2 × 2 cells. Preferably, the local region size is no smaller than 16 × 16, and the aspect ratio may be one of five values: 1:1, 2:1, 1:2, 2:3 and 3:2. The sums of gradients under different conditions are then computed in each cell and combined into a 32-dimensional vector. With a region sliding stride of 16 and a region size increment of 1, a total of 56 different SURF features can be extracted from one 40 × 40 sample; these 56 features are concatenated into a 1792-dimensional feature vector, which serves as the input to the multilayer perceptron.
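
The dimensions quoted in this paragraph are mutually consistent. As a quick check, and assuming each of the 2 × 2 cells contributes 8 gradient sums (the figure of 8 is inferred from the 32-dimensional total and is not stated in the text):

```latex
\underbrace{(2 \times 2)}_{\text{cells}} \times 8 = 32,
\qquad
56 \times 32 = 1792 .
```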

The multilayer perceptron used to build the back-end detector C2 has three layers in total: an input layer, a hidden layer and an output layer. The input layer has 1792 nodes, corresponding to the 1792-dimensional SURF feature vector; the hidden layer has 80 nodes; the output layer has 1 node, corresponding to the class label: an output of 0 denotes a non-face window and an output of 1 denotes a face window. The multilayer perceptron can be trained with the back-propagation algorithm: outputs are computed in the forward pass, gradients are propagated backward, and the process iterates until convergence, yielding the network parameters of the multilayer perceptron, mainly the connection weights and bias terms.

During detection, the SURF features of all collected windows are extracted and fed into the multilayer perceptron, the output (score) for each window is computed, and the final class label is then obtained from a preset classification threshold, completing the face/non-face decision. Preferably, the classification threshold may be set to 0.5: windows whose output is greater than 0.5 are judged to be face windows, and windows whose output is less than 0.5 are judged to be non-face windows.
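
The scoring and thresholding step can be sketched as a forward pass through a 1792-80-1 perceptron. This is a minimal illustration under stated assumptions: the text does not name the activation functions, so a sigmoid is assumed for both layers; the weights are random placeholders rather than trained parameters; and a score of exactly 0.5, which the text leaves open, is treated here as non-face.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function (the activation is an assumption)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def mlp_score(x, w1, b1, w2, b2):
    """y = f2(f1(x)) for a single-hidden-layer perceptron with one output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]                          # f1: input -> hidden
    return sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)   # f2: hidden -> output


def is_face(score, threshold=0.5):
    """Apply the preset classification threshold to a window score."""
    return score > threshold


# Randomly initialised 1792-80-1 network, only to show the shapes involved.
rng = random.Random(0)
N_IN, N_HID = 1792, 80
w1 = [[rng.uniform(-0.01, 0.01) for _ in range(N_IN)] for _ in range(N_HID)]
b1 = [0.0] * N_HID
w2 = [rng.uniform(-0.01, 0.01) for _ in range(N_HID)]
b2 = 0.0
score = mlp_score([0.0] * N_IN, w1, b1, w2, b2)
```

In a real detector the input vector would be the concatenated SURF features of a candidate window and the parameters would come from back-propagation training.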

According to an embodiment of the invention, the back-end detector C2 may also use other features suitable as input to a deep neural network model, for example image gray-level and color values, gradient features, SIFT (Scale-Invariant Feature Transform) features, HOG (Histogram of Oriented Gradients) features, and variants of these features (such as features after PCA dimensionality reduction, or shape-indexed features extracted on the basis of shape).

According to an embodiment of the invention, the back-end detector C2 may also be built with other deep neural network models (for example, autoencoder networks, deep convolutional neural networks and so on). For example, according to an embodiment of the invention, the back-end detector C2 may use one deep convolutional neural network to process all windows that pass the front-end detector C1. This network takes a 40 × 40 three-channel RGB image as input and contains 3 convolutional layers and 2 fully connected layers: the first convolutional layer uses 16 convolution kernels of size 16 × 16, the second uses 16 kernels of size 5 × 5, and the third uses 16 kernels of size 3 × 3; the two fully connected layers have 512 and 2 nodes respectively; and each convolutional layer is followed by a max-pooling layer that processes 2 × 2 regions with a stride of 2. Preferably, the outputs of the different convolutional layers may be concatenated and connected to the first fully connected layer, so as to use information at different scales.

As shown in Fig. 4, according to an embodiment of the invention, the back-end detector C2 may also consist of a classifier formed by cascading multiple deep neural networks, to further improve the modeling capacity of the back-end detector C2 and further reduce the false positive rate. During training, a bootstrap strategy may be used to build the training samples. Specifically, a portion of the full sample set is first selected at random to form an initial training set, on which the first-stage network is trained. The first-stage network is then used to filter all samples, the misclassified samples are added to the training set, and the next-stage network is trained on the new training set, and so on until the desired number of network stages is obtained.
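
The bootstrap procedure for building the cascade can be sketched as a short loop. This is an illustrative outline only: `train_fn` stands in for whatever routine trains one network and is assumed to return a callable that maps a sample to a predicted label; `init_fraction` and `seed` are hypothetical parameters not given in the text.

```python
import random


def bootstrap_cascade(samples, labels, train_fn, n_stages, init_fraction=0.2, seed=0):
    """Train n_stages classifiers, growing the training set with misclassified samples."""
    rng = random.Random(seed)
    all_idx = list(range(len(samples)))
    n_init = max(1, int(init_fraction * len(samples)))
    train_idx = set(rng.sample(all_idx, n_init))  # random initial training set
    stages = []
    for _ in range(n_stages):
        idx = sorted(train_idx)
        clf = train_fn([samples[i] for i in idx], [labels[i] for i in idx])
        stages.append(clf)
        # Run the new stage over all samples and add the ones it gets wrong.
        train_idx |= {i for i in all_idx if clf(samples[i]) != labels[i]}
    return stages
```

Each later stage therefore concentrates on the samples the earlier stages found hard, which is what lets a cascade reduce false detections step by step.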

The back-end detector C2 is built with a single deep neural network model of strong modeling capacity in order to classify the candidate face windows supplied by the front-end detector C1 more accurately. This effectively reduces the false detections of the front-end detector C1 and improves detection accuracy; at the same time, because few windows need to be processed, the computational cost of the back-end detector C2 can be kept within an acceptable range.

Applying the embodiments described above yields a face detection system for multi-pose faces that comprises a front-end detector and a back-end detector and has a coarse-to-fine converging structure. By tuning it according to the characteristics of the features and classifiers used, the system can have sufficiently high modeling capacity to filter out enough non-face windows while preserving face recall, and sufficiently low complexity to avoid excessive run time. Fig. 5 shows a flow chart of a face detection method according to an embodiment of the invention. For convenience, the face detection method according to an embodiment of the invention is described below with reference to Fig. 2, Fig. 3 and Fig. 5.

It should be noted that both the face detection system according to embodiments of the invention described above and the face detection method according to embodiments of the invention described below operate on the windows to be detected of an input image; this does not exclude the special case in which a complete face image is treated as a single window to be detected. The windows to be detected of an input image can be obtained by prior-art methods such as sliding windows or candidate region generation.

For the windows to be detected of an input image, according to an embodiment of the invention, the face detection method of the invention may comprise the following steps:

Step 1: input each window to be detected into the classifiers of the front-end detector C1 for each specific face pose (for example, C1.31-C1.33 in Fig. 3, for left profile, near-frontal and right profile respectively; or, in Fig. 2, C1.21-C1.23 for left half profile, near-frontal and right half profile together with C1.26, which further examines their results, and C1.24-C1.25 for left full profile and right full profile together with C1.27, which further examines their results) and judge whether it is a face window. If all classifiers judge it to be a non-face window, filter out the window; if at least one classifier judges it to be a face window, retain the window as a candidate face window;

Step 2: collect all candidate face windows that pass the front-end detector C1;

Step 3: input each candidate face window collected in step 2 into the back-end detector C2 and judge whether it is a face window. If it is judged to be a non-face window, filter it out; if it is judged to be a face window, retain it and mark it as a face window;

Step 4: collect all face windows that pass the back-end detector C2.
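
Steps 1 to 4 amount to a two-stage filter. The sketch below is a hypothetical outline in which the front end is flattened to a single layer of pose-specific classifiers; every classifier is modelled as a function returning `True` for "face", and the function name `detect_faces` is illustrative.

```python
def detect_faces(windows, front_end_classifiers, back_end_classifier):
    """Steps 1-4: pose-specific filtering, then one unified face/non-face decision."""
    # Steps 1-2: keep a window if at least one pose-specific classifier accepts it.
    candidates = [w for w in windows
                  if any(clf(w) for clf in front_end_classifiers)]
    # Steps 3-4: the back-end detector makes the final decision on each candidate.
    return [w for w in candidates if back_end_classifier(w)]
```

Because `any` stops at the first accepting classifier, a window that is clearly a face in one pose does not need to be evaluated by the remaining front-end classifiers.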

According to an embodiment of the invention, so that each finally detected face corresponds to only one detection box, the face detection method may further comprise:

Step 5: merge all collected face windows. The merging may use prior-art window merging methods, for example non-maximum suppression based on intersection over union (IoU), or averaging of detection boxes.
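
IoU-based non-maximum suppression, one of the merging options named in step 5, can be sketched in a few lines. The version below is the standard greedy variant, not something specified by the patent; boxes are assumed to be `(x1, y1, x2, y2)` tuples, each with a detector score, and the threshold of 0.5 is an illustrative default.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression; returns indices of the boxes kept."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

Averaging the overlapping boxes instead of discarding the weaker ones would be the other merging option mentioned in the step.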

Applying the face detection system and method described in the above embodiments, the front-end detector quickly filters out the vast majority of non-face image windows while ensuring a sufficiently high recall for faces; the back-end detector then accurately classifies the smaller number of image windows that pass the front-end detector, effectively reducing false detections. Detection accuracy is improved while the computational cost of the detection process is effectively reduced and detection speed is effectively increased.

The foregoing describes only illustrative embodiments of the present invention and is not intended to limit the scope of the present invention. Any equivalent changes, modifications and combinations made by a person skilled in the art without departing from the spirit and scope of the present invention shall fall within the scope of protection of the present invention. The scope of protection claimed by the present invention is defined by the appended claims and their equivalents.

Claims

1. A face detection system for multi-pose faces, characterized by comprising: a front-end detector and a back-end detector, wherein,

the front-end detector comprises at least one layer of classifiers, each layer comprising at least two parallel first-type classifiers for faces of different poses, the first-type classifiers being used to distinguish candidate face windows from non-face windows;

the back-end detector comprises a second-type classifier that uses a deep neural network and is used to further distinguish faces from non-faces in the detection results of the front-end detector.

2. The face detection system according to claim 1, characterized in that the front-end detector comprises at least two layers of classifiers, each layer comprising at least two parallel first-type classifiers for faces of different poses, wherein the number of first-type classifiers on a lower layer is no greater than the number of first-type classifiers on a higher layer.

3. The face detection system according to claim 1 or 2, characterized in that the first-type classifiers include a classifier that uses features that can be computed quickly, and/or a classifier that uses an algorithm that can be computed quickly.

4. The face detection system according to claim 3, characterized in that the features that can be computed quickly include LAB features, Haar features, accelerated SIFT features and/or SURF features.

5. The face detection system according to claim 1 or 2, characterized in that the first-type classifiers include a cascade classifier, a classifier based on a part-based model and/or a classifier based on a neural network model.

6. The face detection system according to claim 1, characterized in that the different face poses include different angle ranges divided according to the angle of out-of-plane left-right rotation of the head, the angle of out-of-plane up-down rotation, and/or the angle of in-plane rotation.

7. The face detection system according to claim 1, characterized in that the second-type classifier includes a classifier formed by cascading at least two deep neural networks.

8. A face detection method for the face detection system according to any one of claims 1 to 7, comprising:

Step 1, inputting a window to be detected;

Step 2, determining, by the front-end detector, whether the window to be detected is a face of each particular pose respectively;

Step 3, determining uniformly, by the back-end detector, whether each window to be detected that was determined in Step 2 to be a face of a particular pose is a face;

Step 4, marking all windows to be detected that were determined in Step 3 to be faces as face windows.

9. The face detection method according to claim 8, wherein,

Step 2 further comprises: filtering out all windows determined not to be a face of a particular pose; and/or,

Step 3 further comprises: filtering out all windows determined to be non-faces.

10. The face detection method according to claim 8 or 9, wherein Step 4 further comprises: performing window merging on all windows to be detected that were determined in Step 3 to be faces, and marking the merged windows as face windows.