CN107958238A - Face detection method based on eye, nose and mouth classification - Google Patents
- Publication number: CN107958238A
- Application number: CN201711494703.4A
- Authority: CN (China)
- Prior art keywords: nose, mouth, eye, face, classification
- Priority date: 2017-12-31
- Legal status: Withdrawn (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/165—Detection; Localisation; Normalisation using facial parts and geometric relationships
- G06V40/172—Classification, e.g. identification
Abstract
A face detection method based on eye, nose and mouth classification, comprising a training part — training a classifier for the three regions eye, nose and mouth — and a face detection flow, in which the combined classifier locates the eye, nose and mouth classes in a digital image. The method provided by the invention has the following advantages: 1. A single neural network is used; compared with the traditional cascaded-network approach, computation is faster and the amount of computation is greatly reduced. 2. Using the classification of eye, nose and mouth together with their proportional relationships yields higher robustness, with particularly good recognition of profile faces. 3. Because the method classifies the three regions eye, nose and mouth, a cascade relationship is formed inside a single neural network, giving good classification accuracy. 4. The method does not require pixel-by-pixel classification and computation, and the neural network structure is simple, so better processing speed can be obtained on mobile terminals.
Description
Technical field
The present invention relates to a face detection method based on eye, nose and mouth classification.
Background art
With digital technology and artificial intelligence now seeing practical application, face detection and face recognition technology have made great progress. In the paper "Rapid Object Detection using a Boosted Cascade of Simple Features", published in 2001, Paul Viola and Michael Jones proposed a face detection method based on a cascade of Haar features (the V-J cascade). The method uses Haar features to classify face versus non-face regions, accelerates the computation of the Haar features with the integral-image technique, and locates faces in a digital image with a multi-stage Haar-feature classifier. The cascaded-classifier scheme it introduced laid the foundation for later face detection technology, and the integral-image acceleration of the Haar features made the method usable in real-time scenarios.

However, classifying face versus non-face by Haar features has considerable limitations: on the one hand the accuracy of Haar features is relatively low (compared with neural networks), and on the other hand the classification results obtained for profile faces or richly varied facial forms are far from satisfactory.
To handle the classification and localization of the widely varying faces found in the real world, and to improve the accuracy of the face / non-face classifier, the currently more common implementation is to form a cascade of deep-learning neural networks to classify and locate the faces in a digital image. For example, the paper "Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks" proposes classifying and locating faces in a digital image with a cascade formed from three neural networks.
Compared with Haar features, a neural network requires more computation and a more complex algorithm, but achieves better accuracy. In cascade-based face detection with neural networks, because there are multiple network stages, every digital image to be examined must be computed through each stage in turn before the final result is obtained.
Whether the cascade is built from Haar features or from neural networks, the multi-stage computation affects runtime. Moreover, to obtain better classification in a cascaded system, and in particular to improve the robustness of classification, face (digital image) samples of many forms usually have to be added to the training set, because the later classifiers in the cascade must fit the samples more closely. This imposes a large cost on training.
Among documents that detect faces from facial parts, the paper "From Facial Parts Responses to Face Detection: A Deep Learning Approach" proposes a method that recognizes hair, eyes, nose, mouth and chin and then infers the face location. In that method, both training the neural network and running recognition require classifying the input image pixel by pixel, building planar response maps for hair, eyes, nose, mouth and chin, and then classifying and inferring the face position from these maps together with the spatial proportions of hair, eyes, nose, mouth and chin (obtained during training). Because the scheme uses a fully convolutional neural network with a large input size (256x256) and a relatively complex convolutional network structure, its execution efficiency cannot reach real-time performance, particularly on current mobile terminals such as smartphones, tablets and embedded devices, where computing and locating a single image takes a long time. Furthermore, because training requires pixel-by-pixel classification, the collection and processing of training samples incurs a very large cost, and an additional neural network is used to train and compute the proportions when combining hair, eyes, nose, mouth and chin, which again slows the overall execution.
Summary of the invention
The present invention proposes a face detection method based on eye, nose and mouth classification: eyes, noses and mouths in a digital image are classified, it is then inferred whether a face is present, and the coordinates of the face position are derived from the proportional grid of these characteristic regions (eye, nose and mouth), thereby achieving face detection.
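As an illustration of inferring the face position from the proportional relationships of the parts, the following sketch derives a face box from eye, nose and mouth centers. The proportional constants are invented placeholders for illustration, not the statistics the invention derives from its training data:

```python
# Sketch: deriving a face box from eye, nose and mouth positions.
# The proportional constants below are illustrative placeholders.

def face_box_from_parts(eye, nose, mouth):
    """eye/nose/mouth are (x, y) centers; returns (x1, y1, x2, y2)."""
    # Vertical extent: assume eye-to-mouth spans half the face height.
    part_h = mouth[1] - eye[1]
    y1 = eye[1] - 0.5 * part_h     # forehead above the eyes
    y2 = mouth[1] + 0.5 * part_h   # chin below the mouth
    # Horizontal extent: center on the nose, width proportional to height.
    h = y2 - y1
    w = 0.75 * h
    x1 = nose[0] - w / 2
    x2 = nose[0] + w / 2
    return (x1, y1, x2, y2)

box = face_box_from_parts(eye=(50, 40), nose=(50, 55), mouth=(50, 70))
```

In the actual method, the constants would come from the statistics collected in training step 4) below.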
To solve the above technical problems, the technical solution adopted by the present invention is as follows:

A face detection method based on eye, nose and mouth classification, comprising the following steps:

Training part: train a classifier for the three regions eye, nose and mouth, as follows:

1) Collect and label training samples of eyes, noses and mouths, plus negative samples containing none of the three regions; there are four kinds of data sample in total, i.e. a four-class classifier;

2) Define the neural network;

3) Feed the samples into the neural network for training, obtaining the trained model;

4) Analyze and collect statistics on the proportions of the three regions (eye, nose, mouth) over the face samples, obtaining a positional structure of eye, nose and mouth proportions that fits most faces;

5) To obtain a better face box after combination, also train a calibration neural network model for the face box position;

6) Train the calibration neural network model with the sample data it requires;
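Training step 4) above — collecting proportion statistics over labeled faces — can be sketched as follows; the normalization scheme and the sample annotations are assumptions for illustration:

```python
# Sketch of training step 4): averaging eye/nose/mouth proportions over
# annotated faces. Part centers are normalized by each face box before
# averaging; the annotations here are invented for illustration.

def mean_part_ratios(annotations):
    """annotations: list of dicts with 'box' (x, y, w, h) and part centers."""
    sums = {"eye": [0.0, 0.0], "nose": [0.0, 0.0], "mouth": [0.0, 0.0]}
    for ann in annotations:
        x, y, w, h = ann["box"]
        for part in sums:
            px, py = ann[part]
            sums[part][0] += (px - x) / w   # relative x inside the face box
            sums[part][1] += (py - y) / h   # relative y inside the face box
    n = len(annotations)
    return {part: (sx / n, sy / n) for part, (sx, sy) in sums.items()}

faces = [
    {"box": (0, 0, 100, 100), "eye": (50, 40), "nose": (50, 55), "mouth": (50, 70)},
    {"box": (10, 10, 50, 50),  "eye": (35, 31), "nose": (35, 38), "mouth": (35, 46)},
]
ratios = mean_part_ratios(faces)
```

The invention runs this kind of statistic over its full sample set; the resulting positional structure is what Figs. 2a, 2b and 2c depict.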
Face detection flow: the combined classifier locates the eye, nose and mouth classes in a digital image, as follows:

1) Define the minimum face size that can be located, i.e. the minimum face, and scale the original image according to the minimum face and the scaling ratio of the image pyramid, forming an image pyramid;

2) Define the size and stride of the sliding window, and scan every image in the image pyramid with the sliding window;

3) The sliding window scans all images in the pyramid, yielding candidate boxes;

4) Traverse the candidate boxes, find those that share the same nose or mouth, and merge them;

5) Apply non-maximum suppression to all candidate boxes, obtaining the face boxes to be restored;

6) Feed each face box to be restored into the calibration neural network, obtaining the final face location coordinates.
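Detection step 1) above can be sketched as follows; the window size of 40 px and the scale factor 0.709 are illustrative values not specified by the invention:

```python
# Sketch of detection step 1): computing image-pyramid scales so that the
# smallest detectable face (min_face) maps onto the fixed sliding-window
# size. window=40 and factor=0.709 are illustrative values.

def pyramid_scales(img_w, img_h, min_face=40, window=40, factor=0.709):
    scales = []
    scale = window / min_face          # make min_face fill one window
    while min(img_w, img_h) * scale >= window:
        scales.append(scale)
        scale *= factor                # shrink to catch ever larger faces
    return scales

scales = pyramid_scales(640, 480, min_face=40)
```

Each scale produces one resized copy of the original image; the sliding window of step 2) then scans every copy.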
The face detection method based on eye, nose and mouth classification provided by the invention has the following advantages:

1. A single neural network is used; compared with the traditional cascaded-network approach, computation is faster and the amount of computation is greatly reduced.

2. Using the classification of eye, nose and mouth together with their proportional relationships yields higher robustness, with particularly good recognition of profile faces.

3. Because the method classifies the three regions eye, nose and mouth, a cascade relationship is formed inside a single neural network, giving good classification accuracy.

4. The method does not require pixel-by-pixel classification and computation, and the neural network structure is simple, so better processing speed can be obtained on mobile terminals.
Brief description of the drawings
The accompanying drawings, which form a part of this application, are provided for a further understanding of the invention; the schematic embodiments of the invention and their description serve to explain the invention and do not unduly limit it. In the drawings:

Fig. 1 is a flow diagram of an embodiment of the present invention;

Figs. 2a, 2b and 2c are schematic diagrams of the positional structure of eye, nose and mouth proportions in a face, respectively.
Detailed description of the embodiments

To make the purpose, technical solution and advantages of the present invention clearer, the invention is further elaborated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here merely serve to illustrate the invention and are not intended to limit it.
Embodiment
With reference to Fig. 1, a face detection method based on eye, nose and mouth classification proceeds as follows:
1. Training part, mainly training the classifier for the three regions, as follows:

1) Collect and label training samples of eyes, noses and mouths, plus negative samples containing none of the three regions; there are four kinds of data sample in total, i.e. a four-class classifier.

2) Define the neural network; the architecture mainly follows the RNet classification network in "Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks", with four output classes.

3) Feed the samples into the neural network for training, obtaining the trained model.

4) To determine the proportional relationships of the regions during face detection, the proportions of the three regions (eye, nose, mouth) are analyzed and tallied over 200,000 faces of widely varying form, obtaining a positional structure of eye, nose and mouth proportions that fits most faces, as shown in Figs. 2a, 2b and 2c.

5) To obtain a better face box after combination, the method also trains a calibration neural network model for the face box position; the network structure mainly follows the RNet regression calibration structure in "Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks".

6) Train the calibration neural network model with the sample data it requires.
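Training steps 5)-6) produce a calibration network whose output refines a candidate box. Applying such a regression output can be sketched as follows; the offset format mirrors the RNet-style calibration the embodiment references, and the network itself is omitted (a fixed `offsets` tuple stands in for its prediction):

```python
# Sketch of applying the calibration network's regression output to a
# candidate face box. `offsets` stands in for the network's predicted
# (dx1, dy1, dx2, dy2), expressed relative to the box's width and height.

def calibrate_box(box, offsets):
    """box: (x1, y1, x2, y2); offsets: regression relative to box size."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    dx1, dy1, dx2, dy2 = offsets
    return (x1 + dx1 * w, y1 + dy1 * h, x2 + dx2 * w, y2 + dy2 * h)

refined = calibrate_box((100, 100, 140, 140), (0.05, -0.10, -0.05, 0.10))
```

This is the operation performed in detection step 6), after merging and non-maximum suppression.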
2. Face detection flow, mainly using the classifiers above to locate the eye, nose and mouth classes in a digital image, as follows:

1) Define the minimum face size that can be located (the minimum face), e.g. 40x40 px. Scale the original image according to the minimum face and the scaling ratio of the image pyramid, forming an image pyramid.

2) Define the size and stride of the sliding window, e.g. a 40x40 px window with a stride of 4 px. Scan every image in the image pyramid with the sliding window; the cases handled during scanning are as follows:

a) When the scanned window contains an eye, build the T-shaped structures facing left and right as shown in the drawings, feed the matrix data inside the rectangle on the opposite side into the neural network, and determine whether it is an eye.

b) Extract the matrix content of the nose position as shown in the drawings and feed it into the neural network to obtain a prediction; if the prediction exceeds the threshold, the region is considered to contain a nose.

c) After the nose is confirmed in b), build a rectangle at its lower edge (see Figs. 2a, 2b, 2c) and feed the matrix data inside the rectangle into the neural network for prediction; if the prediction exceeds the threshold, the region is considered to contain a mouth.

d) After the mouth is confirmed in c), build a rectangular bounding box enclosing the eye, nose and mouth and add it to the candidate boxes. The positions of the eye, nose and mouth are stored separately in each candidate box.

3) The sliding window scans all images in the pyramid, yielding candidate boxes.

4) Traverse the candidate boxes, find those that share the same nose or mouth, and merge them.

5) Apply non-maximum suppression to all candidate boxes, obtaining the face boxes to be restored.

6) Feed each face box to be restored into the calibration neural network, obtaining the final face location coordinates.
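Detection steps 4)-5) above can be sketched as follows; the candidate-box tuple format, including the stored nose center, is an assumption for illustration:

```python
# Sketch of detection steps 4)-5): merging candidate boxes that share a
# nose, then applying non-maximum suppression. Boxes are tuples
# (x1, y1, x2, y2, score, nose), where `nose` is the stored nose center.

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2, ...) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter) if inter else 0.0

def merge_by_nose(boxes):
    """Keep only the highest-scoring box among those sharing a nose."""
    best = {}
    for b in boxes:
        nose = b[5]
        if nose not in best or b[4] > best[nose][4]:
            best[nose] = b
    return list(best.values())

def nms(boxes, thresh=0.5):
    """Greedy non-maximum suppression by descending score."""
    boxes = sorted(boxes, key=lambda b: b[4], reverse=True)
    kept = []
    for b in boxes:
        if all(iou(b, k) <= thresh for k in kept):
            kept.append(b)
    return kept

cands = [
    (0, 0, 40, 40, 0.9, (20, 25)),
    (2, 2, 42, 42, 0.8, (20, 25)),   # same nose, lower score: merged away
    (100, 0, 140, 40, 0.7, (120, 25)),
]
faces = nms(merge_by_nose(cands))
```

The surviving boxes are the face boxes to be restored, which step 6) passes to the calibration network.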
The foregoing describes merely preferred embodiments of the present invention and is not intended to limit the invention; any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (2)
- 1. A face detection method based on eye, nose and mouth classification, characterized by comprising the following steps: Training part: train a classifier for the three regions eye, nose and mouth, as follows: 1) collect and label training samples of eyes, noses and mouths, plus negative samples containing none of the three regions; there are four kinds of data sample in total, i.e. a four-class classifier; 2) define the neural network; 3) feed the samples into the neural network for training, obtaining the trained model; 4) analyze and collect statistics on the proportions of the three regions over the face samples, obtaining a positional structure of eye, nose and mouth proportions that fits most faces; 5) to obtain a better face box after combination, also train a calibration neural network model for the face box position; 6) train the calibration neural network model with the sample data it requires. Face detection flow: the combined classifier locates the eye, nose and mouth classes in a digital image, as follows: 1) define the minimum face size that can be located, i.e. the minimum face, and scale the original image according to the minimum face and the scaling ratio of the image pyramid, forming an image pyramid; 2) define the size and stride of the sliding window, and scan every image in the image pyramid with the sliding window; 3) the sliding window scans all images in the pyramid, yielding candidate boxes; 4) traverse the candidate boxes, find those that share the same nose or mouth, and merge them; 5) apply non-maximum suppression to all candidate boxes, obtaining the face boxes to be restored; 6) feed each face box to be restored into the calibration neural network, obtaining the final face location coordinates.
- 2. The face detection method based on eye, nose and mouth classification according to claim 1, characterized in that: every image in the image pyramid is scanned with the sliding window, and the cases handled during scanning are as follows: a) when the scanned window contains an eye, build the T-shaped structures facing left and right as shown in the drawings, feed the matrix data inside the rectangle on the opposite side into the neural network, and determine whether it is an eye; b) extract the matrix content of the nose position as shown in the drawings and feed it into the neural network to obtain a prediction; if the prediction exceeds the threshold, the region is considered to contain a nose; c) after the nose is confirmed in b), build a rectangle at its lower edge and feed the matrix data inside the rectangle into the neural network for prediction; if the prediction exceeds the threshold, the region is considered to contain a mouth; d) after the mouth is confirmed in c), build a rectangular bounding box enclosing the eye, nose and mouth and add it to the candidate boxes, the positions of the eye, nose and mouth being stored separately in each candidate box.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN201711494703.4A | 2017-12-31 | 2017-12-31 | Face detection method based on eye, nose and mouth classification
Publications (1)
Publication Number | Publication Date
---|---
CN107958238A (en) | 2018-04-24
Family
ID=61956035
Family Applications (1)

Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN201711494703.4A | Face detection method based on eye, nose and mouth classification | 2017-12-31 | 2017-12-31

Country Status (1)

Country | Link
---|---
CN | CN107958238A (en)
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109190512A (en) * | 2018-08-13 | 2019-01-11 | 成都盯盯科技有限公司 | Method for detecting human face, device, equipment and storage medium |
Legal Events

Date | Code | Title
---|---|---
2018-04-24 | PB01 | Publication
 | SE01 | Entry into force of request for substantive examination
 | WW01 | Invention patent application withdrawn after publication