CN110443132A

CN110443132A - Face detection and face multi-attribute fusion analysis method based on deep learning

Info

Publication number: CN110443132A
Application number: CN201910590960.0A
Authority: CN
Inventors: 张赛男; 李千目
Original assignee: Nanjing Tech University
Current assignee: Nanjing Tech University
Priority date: 2019-07-02
Filing date: 2019-07-02
Publication date: 2019-11-12

Abstract

The invention discloses a face detection and face multi-attribute fusion analysis method based on deep learning. The method is as follows: first, a picture is input, an image pyramid generates a set of copies of the picture at different scales, a fully convolutional neural network processes the input images, and the first-stage network generates preliminary face candidate regions; next, the face detection module filters face candidate boxes from coarse to fine, a non-maximum suppression algorithm prunes the candidate windows whose overlap exceeds a set threshold, and the face regions are determined; then each face region is enlarged and sent to the facial attribute analysis network, which produces age and gender predictions; finally, the face detection results and attribute analysis results are annotated on the picture and saved, giving a visualized prediction result. The invention reduces the complexity of face detection and face multi-attribute fusion analysis, and has the advantages of a simple structure, few parameters, and strong practicality.

Description

Face detection and face multi-attribute fusion analysis method based on deep learning

Technical field

The present invention relates to the technical fields of machine learning and deep learning, and in particular to a face detection and face multi-attribute fusion analysis method based on deep learning.

Background

Deep learning is currently a very active research topic, especially in face analysis. In today's big-data era, as hardware computing power has grown substantially, deep learning can better exploit its strengths, and analysis accuracy has improved markedly. Many experiments on large-scale data have shown that feature representations learned through deep learning perform well in fields such as natural language processing and computer vision.

Face attributes are the attribute information implicit in a face image, such as a person's age, gender, hairstyle, makeup, and head pose. Face attribute analysis means analyzing and identifying face attributes from a face image. Constrained by current shooting conditions, some pictures taken in road-traffic scenes suffer from low resolution, small face size, blurred faces, or occluded faces. How to perform face attribute analysis on such pictures remains a difficult problem in computer vision.

Summary of the invention

The purpose of the present invention is to provide a face detection and face multi-attribute fusion analysis method based on deep learning that has a low image resolution requirement, high analysis accuracy, few parameters, and strong practicality.

The technical solution that achieves this purpose is a face detection and face multi-attribute fusion analysis method based on deep learning, comprising the following steps:

Step 1: Input a picture, generate a set of copies of the picture at different scales using an image pyramid, and process the input images with a fully convolutional neural network; the first-stage network generates preliminary face candidate regions;

Step 2: Based on the preliminary face candidate regions, the face detection module filters face candidate boxes from coarse to fine; a non-maximum suppression algorithm prunes the candidate windows, deleting those whose overlapping area is greater than a set threshold, and the face regions are determined;

Step 3: Enlarge each face region and send it to the facial attribute analysis network for analysis to obtain the age and gender predictions;

Step 4: Annotate the face detection results and attribute analysis results on the picture and save it, obtaining a visualized prediction result.

Further, in step 1, inputting a picture, generating a set of copies of the picture at different scales using an image pyramid, processing the input images with a fully convolutional neural network, and generating preliminary face candidate regions with the first-stage network proceeds as follows:

Step 1.1: Input a picture of size (h, w);

Step 1.2: Use an image pyramid to generate a set of differently sized copies of the picture. Given the configured minimum face size min_face_size, the face size net_face_size that the network layer can detect, and the scaling ratio factor between pyramid levels, scale the input image to sizes that the network layer can detect:

(h_n, w_n) = (h × net_face_size / min_face_size × factor^n, w × net_face_size / min_face_size × factor^n),

where n is the pyramid level index, counted from 0 and increased until the smaller of (h_n, w_n) is less than net_face_size. The fully convolutional neural network processes the input images, and the first-stage network generates preliminary face candidate regions.
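The scale computation of step 1.2 can be sketched in a few lines of Python. This is a minimal illustration rather than the patented implementation: the function name `pyramid_sizes` and the default values (20, 12, 0.709) are assumptions chosen for the example, and the sketch assumes the second component of the formula scales the width w.

```python
def pyramid_sizes(h, w, min_face_size=20, net_face_size=12, factor=0.709):
    """Return the (h_n, w_n) sizes for pyramid levels n = 0, 1, 2, ...

    Default values are illustrative assumptions, not taken from the patent.
    """
    base = net_face_size / min_face_size  # maps min_face_size onto net_face_size
    sizes = []
    n = 0
    while True:
        scale = base * factor ** n
        h_n, w_n = h * scale, w * scale
        # Stop once the smaller side drops below what the network can detect.
        if min(h_n, w_n) < net_face_size:
            break
        sizes.append((int(round(h_n)), int(round(w_n))))
        n += 1
    return sizes
```

Each returned size would then be used to resize the input picture before it is passed to the first-stage network.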

Further, in step 2, filtering face candidate boxes from coarse to fine based on the preliminary face candidate regions, pruning the candidate boxes with a non-maximum suppression algorithm by deleting those whose overlapping area is greater than a set threshold, and determining the face regions proceeds as follows:

Step 2.1: Train the face detection module using three cascaded multi-task convolutional neural networks that cover face region classification, face bounding-box regression, and face localization;

Step 2.2: Use the trained face detection module to filter the face candidate boxes stage by stage and obtain the final result.
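The non-maximum suppression used in step 2 can be illustrated with a greedy, score-ordered sketch. The text only says that windows whose overlap exceeds a set threshold are deleted; measuring overlap as intersection over union, the `(x1, y1, x2, y2)` box format, and the 0.5 default threshold are all assumptions of this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, threshold=0.5):
    """Return the indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any kept box too strongly.
        if all(iou(boxes[i], boxes[j]) <= threshold for j in keep):
            keep.append(i)
    return keep
```

In a cascaded detector, a routine like this would typically run after each stage so that later stages see fewer, better candidates.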

Further, in step 3, enlarging the face region, sending it to the facial attribute analysis network for analysis, and obtaining the age and gender predictions proceeds as follows:

Step 3.1: Input the faces predicted by the face detection module;

Step 3.2: Expand the picture region that will be passed to the facial attribute analysis network. Let the upper-left and lower-right coordinates of the original face bounding box be (x₁,y₁) and (x₂,y₂); then the height h and width w of the picture inside the original bounding box are:

The coordinates of the original bounding box are then adjusted, and the upper-left and lower-right coordinates (x′₁,y′₁), (x′₂,y′₂) of the picture region actually cropped are:

This adds useful information: cues from the periphery of the face, such as the ears and hairstyle, are passed to the facial attribute analysis network;
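The idea of step 3.2 can be sketched as follows. The expansion formulas are not reproduced in this text, so the symmetric `margin` ratio, the clipping to the image bounds, and the function name `expand_box` are purely hypothetical choices made for illustration.

```python
def expand_box(x1, y1, x2, y2, img_w, img_h, margin=0.2):
    """Enlarge a box by `margin` times its size on every side.

    `margin` is a hypothetical parameter; the patent's own ratio is not
    given in this text. The result is clipped to the image bounds.
    """
    w = x2 - x1  # width of the original bounding box
    h = y2 - y1  # height of the original bounding box
    nx1 = max(0, int(round(x1 - margin * w)))
    ny1 = max(0, int(round(y1 - margin * h)))
    nx2 = min(img_w, int(round(x2 + margin * w)))
    ny2 = min(img_h, int(round(y2 + margin * h)))
    return nx1, ny1, nx2, ny2
```

Clipping matters for faces near the picture border, where the enlarged region would otherwise fall outside the image.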

Step 3.3: Analyze the face with convolutional layers. The three convolutional layers have kernel sizes of 7×7, 5×5, and 3×3 respectively, the activation function is PReLU, and each convolutional layer is followed by one max pooling operation. The convolutional layers generate multiple feature maps, locally connect the features of adjacent layers, and reduce the size of the feature maps;

Step 3.4: From the extracted face features, two fully connected layers determine the probability that the face belongs to each age class and the probability that it is male or female;

Step 3.5: Compare the probabilities of the classes to obtain the age and gender predictions for the face.
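Steps 3.4 and 3.5 amount to turning network outputs into class probabilities and picking the most probable class. The sketch below assumes a softmax over raw scores, which the text does not specify; the eight age ranges and the M/F labels are the ones listed in the embodiment, while the function names are hypothetical.

```python
import math

AGE_CLASSES = ["0-2", "4-6", "8-13", "15-20", "25-32", "38-43", "48-53", "60-"]
GENDER_CLASSES = ["M", "F"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(age_logits, gender_logits):
    """Return the most probable (age class, gender) pair."""
    age_probs = softmax(age_logits)
    gender_probs = softmax(gender_logits)
    age_idx = max(range(len(age_probs)), key=age_probs.__getitem__)
    gender_idx = max(range(len(gender_probs)), key=gender_probs.__getitem__)
    return AGE_CLASSES[age_idx], GENDER_CLASSES[gender_idx]
```

Treating age as a classification over ranges rather than a regression keeps the output layer small, which fits the stated goal of a network with few parameters.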

Further, the facial attribute analysis network in step 3 uses pictures captured in real-world scenes as the data set for facial attribute analysis.

Further, the facial attribute analysis network in step 3 is trained with an online hard example mining algorithm: the required number of key samples is computed from the configured hard-example ratio and the total number of samples in the batch, the losses of the samples in the batch are sorted in descending order, the corresponding number of key samples is extracted to form a mini-batch, and training is performed on it.

Compared with the prior art, the notable advantages of the present invention are: (1) it balances model prediction accuracy against network structure and lowers the image resolution requirement; (2) it reduces the complexity of face detection and face multi-attribute fusion analysis, and has the advantages of a simple structure, few parameters, and strong practicality.
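The sample selection in online hard example mining can be sketched as a sort on per-sample loss. The function name `select_hard_examples` and the default ratio of 0.7 are assumptions for this example; the text does not state the ratio used.

```python
def select_hard_examples(losses, hard_ratio=0.7):
    """Return the indices of the highest-loss samples in a batch.

    `hard_ratio` is the fraction of the batch to keep; its default here
    is an illustrative assumption.
    """
    # Number of key samples derived from the ratio and the batch size.
    num_keep = max(1, int(len(losses) * hard_ratio))
    # Sort sample indices by loss, largest first.
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return order[:num_keep]
```

Only the samples at the returned indices would contribute to the parameter update, so the examples the model currently handles worst carry more weight in training.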

Brief description of the drawings

Fig. 1 is a flow diagram of the face detection and face multi-attribute fusion analysis method based on deep learning of the present invention.

Fig. 2 is a flow diagram of the face detection module in the present invention.

Fig. 3 is a flow diagram of the facial attribute analysis in the present invention.

Fig. 4 is a schematic diagram of the prediction results for a picture from a real scene in an embodiment of the present invention.

Detailed description of the embodiments

The present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments.

With reference to Fig. 1, the face detection and face multi-attribute fusion analysis method based on deep learning of the present invention comprises the following steps:

Step 1: Input a picture, generate a set of copies of the picture at different scales using an image pyramid, and process the input images with a fully convolutional neural network; the first-stage network generates preliminary face candidate regions. With reference to Fig. 2, the details are as follows:

Step 1.1: Input a picture of size (h, w);

Step 2: Based on the preliminary face candidate regions, the face detection module filters face candidate boxes from coarse to fine; a non-maximum suppression algorithm prunes the candidate windows, deleting those whose overlapping area is greater than a set threshold, and the face regions are determined. With reference to Fig. 2, the details are as follows:

Step 3: Enlarge each face region and send it to the facial attribute analysis network for analysis to obtain the age and gender predictions. With reference to Fig. 3, the details are as follows:

Step 3.1: Input the faces predicted by the face detection module;

Further, the facial attribute analysis network in step 3 is trained with an online hard example mining algorithm: the required number of key samples is computed from the configured hard-example ratio and the total number of samples in the batch, the losses of the samples in the batch are sorted in descending order, the corresponding number of key samples is extracted to form a mini-batch, and training is performed on it. This increases the proportion of hard examples and sharpens the focus of the model.

Embodiment 1

With reference to Fig. 4, a specific embodiment of the invention comprises the following steps:

Step 2: The face detection module filters face candidate boxes from coarse to fine; a non-maximum suppression algorithm prunes the candidate windows whose overlap exceeds a set threshold, and the face regions are determined;

Step 3: Enlarge each face region so that it includes useful picture information such as hair length and ears, send the enlarged face picture to the facial attribute analysis network for analysis, and obtain the age and gender predictions;

Step 4: Annotate the face detection results and attribute analysis results on the picture and save it, obtaining a visualized prediction result that is convenient for practical use.

The specific detection results are shown in Fig. 4 (faces in the test data are blurred to protect privacy). In the test picture, rectangles outline the face regions. The Gender label on a rectangle is the invention's prediction of the gender of that face, where M is male and F is female. The Age label on a rectangle is the invention's prediction of the age class of that face, and the value that follows it gives the corresponding age range, with 8 possibilities: 0-2, 4-6, 8-13, 15-20, 25-32, 38-43, 48-53, 60-. Although the lighting in the test picture is poor and its resolution is low, the model's predictions still match the ground truth fairly closely. It can be seen that the face detection and face multi-attribute fusion analysis method based on deep learning of the present invention balances model prediction accuracy against network structure, lowers the image resolution requirement, and reduces the complexity of face detection and face multi-attribute fusion analysis, with the advantages of a simple structure, few parameters, and strong practicality.

Claims

1. A face detection and face multi-attribute fusion analysis method based on deep learning, characterized in that it comprises the following steps:

Step 1: Input a picture, generate a set of copies of the picture at different scales using an image pyramid, and process the input images with a fully convolutional neural network; the first-stage network generates preliminary face candidate regions;

Step 2: Based on the preliminary face candidate regions, the face detection module filters face candidate boxes from coarse to fine; a non-maximum suppression algorithm prunes the candidate windows, deleting those whose overlapping area is greater than a set threshold, and the face regions are determined;

Step 3: Enlarge each face region and send it to the facial attribute analysis network for analysis to obtain the age and gender predictions;

Step 4: Annotate the face detection results and attribute analysis results on the picture and save it, obtaining a visualized prediction result.

2. The face detection and face multi-attribute fusion analysis method based on deep learning according to claim 1, characterized in that, in step 1, inputting a picture, generating a set of copies of the picture at different scales using an image pyramid, processing the input images with a fully convolutional neural network, and generating preliminary face candidate regions with the first-stage network proceeds as follows:

Step 1.1: Input a picture of size (h, w);

Step 1.2: Use an image pyramid to generate a set of differently sized copies of the picture. Given the configured minimum face size min_face_size, the face size net_face_size that the network layer can detect, and the scaling ratio factor between pyramid levels, scale the input image to sizes that the network layer can detect:

(h_n, w_n) = (h × net_face_size / min_face_size × factor^n, w × net_face_size / min_face_size × factor^n),

where n is the pyramid level index, counted from 0 and increased until the smaller of (h_n, w_n) is less than net_face_size. The fully convolutional neural network processes the input images, and the first-stage network generates preliminary face candidate regions.

3. The face detection and face multi-attribute fusion analysis method based on deep learning according to claim 1, characterized in that, in step 2, filtering face candidate boxes from coarse to fine based on the preliminary face candidate regions, pruning the candidate boxes with a non-maximum suppression algorithm by deleting those whose overlapping area is greater than a set threshold, and determining the face regions proceeds as follows:

Step 2.1: Train the face detection module using three cascaded multi-task convolutional neural networks that cover face region classification, face bounding-box regression, and face localization;

4. The face detection and face multi-attribute fusion analysis method based on deep learning according to claim 1, characterized in that, in step 3, enlarging the face region, sending it to the facial attribute analysis network for analysis, and obtaining the age and gender predictions proceeds as follows:

Step 3.1: Input the faces predicted by the face detection module;

Step 3.2: Expand the picture region that will be passed to the facial attribute analysis network. Let the upper-left and lower-right coordinates of the original face bounding box be (x₁,y₁) and (x₂,y₂); then the height h and width w of the picture inside the original bounding box are:

The coordinates of the original bounding box are then adjusted, and the upper-left and lower-right coordinates (x′₁,y′₁), (x′₂,y′₂) of the picture region actually cropped are:

Step 3.3: Analyze the face with convolutional layers. The three convolutional layers have kernel sizes of 7×7, 5×5, and 3×3 respectively, the activation function is PReLU, and each convolutional layer is followed by one max pooling operation. The convolutional layers generate multiple feature maps, locally connect the features of adjacent layers, and reduce the size of the feature maps;

Step 3.4: From the extracted face features, two fully connected layers determine the probability that the face belongs to each age class and the probability that it is male or female;

5. The face detection and face multi-attribute fusion analysis method based on deep learning according to claim 1, characterized in that the facial attribute analysis network in step 3 uses pictures captured in real-world scenes as the data set for facial attribute analysis.

6. The face detection and face multi-attribute fusion analysis method based on deep learning according to claim 1, characterized in that the facial attribute analysis network in step 3 is trained with an online hard example mining algorithm: the required number of key samples is computed from the configured hard-example ratio and the total number of samples in the batch, the losses of the samples in the batch are sorted in descending order, the corresponding number of key samples is extracted to form a mini-batch, and training is performed on it.