CN106295502A

CN106295502A - Face detection method and device

Info

Publication number: CN106295502A
Application number: CN201610590134.2A
Authority: CN
Inventors: 陈书楷; 王辉能
Original assignee: Xiamen Zhongkong Biological Recognition Information Technology Co Ltd
Current assignee: Xiamen Entropy Technology Co., Ltd
Priority date: 2016-07-25
Filing date: 2016-07-25
Publication date: 2017-01-04
Anticipated expiration: 2036-07-25
Also published as: CN106295502B

Abstract

The embodiments of the invention disclose a face detection method and device that can process images of arbitrary size and detect frontal and profile faces simultaneously, improving detection speed. The method includes: obtaining a candidate face image through a first deep convolutional network, the first deep convolutional network being a fully convolutional network used for preliminary detection; computing on the candidate face image through a second deep convolutional network to obtain a reliability value of the candidate face image, the second deep convolutional network being a deep convolutional network used for verification; and, if the reliability value of the candidate face image is greater than a preset threshold, determining the candidate face image to be a final face image.

Description

A face detection method and device

Technical field

The present invention relates to the fields of image processing and face recognition, and in particular to a face detection method and device.

Background technology

Face detection is the basis of face recognition, and accurately detecting face images in an image is extremely important for recognition. In some scenarios a computer must automatically detect faces for recognition, which requires the face detection method to detect frontal and profile face images simultaneously; in addition, the method is required to process images of arbitrary size.

At present, face detection methods can detect only frontal faces or only profile faces, and generally process images of a fixed size.

Clearly, for tasks that need to detect frontal and profile faces simultaneously, the two can only be detected separately, which makes detection too slow; moreover, images of arbitrary size cannot be processed.

Summary of the invention

Embodiments of the present invention provide a face detection method and device that can process images of arbitrary size and detect frontal and profile faces simultaneously, improving detection speed.

In view of this, a first aspect of the present invention provides a face detection method, including:

Obtaining a candidate face image through a first deep convolutional network, the first deep convolutional network being a fully convolutional network used for preliminary detection;

Computing on the candidate face image through a second deep convolutional network to obtain a reliability value of the candidate face image, the second deep convolutional network being a deep convolutional network used for verification;

If the reliability value of the candidate face image is greater than a preset threshold, determining the candidate face image to be a final face image.

Optionally:

Obtaining a candidate face image through the first deep convolutional network includes:

Generating a face heat map through the first deep convolutional network;

Determining local hottest points in the face heat map, and taking the local hottest points as candidate face positions;

Obtaining the candidate face image according to the candidate face positions.

Optionally:

Before obtaining the candidate face image according to the candidate face positions, the method includes:

Judging whether the candidate face positions overlap;

If so, merging the overlapping candidate face positions.

Optionally:

Before obtaining the candidate face image through the first deep convolutional network, the method includes:

Generating the first deep convolutional network;

Collecting face images and non-face images, and using the face images and non-face images as training samples;

Training the first deep convolutional network with the training samples.

Optionally:

The second deep convolutional network is a plurality of deep convolutional networks, and computing on the candidate face image through the second deep convolutional network to obtain the reliability value of the candidate face image includes:

Computing on the candidate face image through each of the plurality of deep convolutional networks to obtain a plurality of reliability values of the candidate face image;

Obtaining the reliability value of the candidate face image from the plurality of reliability values.

Optionally:

The first deep convolutional network comprises multiple layers, in order: a first input layer, a first convolutional layer, a first output layer, a first max pooling layer, a second output layer, a first activation function layer, a second convolutional layer, a third output layer, a second activation function layer, a third convolutional layer and a fourth output layer. The second deep convolutional network comprises multiple layers, in order: a second input layer, a fourth convolutional layer, a fifth output layer, a second max pooling layer, a sixth output layer, a third activation function layer, a fifth convolutional layer, a seventh output layer, a third max pooling layer, an eighth output layer, a fourth activation function layer, a fully connected layer and a ninth output layer.

A second aspect of the present invention provides a face detection device, including:

An acquisition module, configured to obtain a candidate face image through a first deep convolutional network, the first deep convolutional network being a fully convolutional network used for preliminary detection;

A first processing module, configured to compute on the candidate face image through a second deep convolutional network to obtain a reliability value of the candidate face image, the second deep convolutional network being a preset deep convolutional network used for verification;

A determination module, configured to determine the candidate face image to be a final face image if its reliability value is greater than a preset threshold.

Optionally:

The acquisition module includes:

A generating unit, configured to generate a face heat map through the first deep convolutional network;

A first processing unit, configured to determine local hottest points in the face heat map and take the local hottest points as candidate face positions;

An acquiring unit, configured to obtain the candidate face image according to the candidate face positions.

Optionally:

The device further includes:

A judging module, configured to judge whether the candidate face positions overlap;

A second processing module, configured to merge the overlapping candidate face positions if the judging module judges that the candidate face positions overlap.

Optionally:

The device further includes:

A generation module, configured to generate the first deep convolutional network;

A third processing module, configured to collect face images and non-face images and use the face images and non-face images as training samples;

A training module, configured to train the first deep convolutional network with the training samples.

Optionally:

The second deep convolutional network is a plurality of deep convolutional networks, and the first processing module includes:

A computing unit, configured to compute on the candidate face image through each of the plurality of deep convolutional networks to obtain a plurality of reliability values of the candidate face image;

A second processing unit, configured to obtain the reliability value of the candidate face image from the plurality of reliability values.

Optional:

As can be seen from the above technical solutions, the embodiments of the present invention have the following advantages: the network used for preliminary detection is a fully convolutional network, so the present invention can process images of arbitrary size and improve face detection speed; in addition, the candidate face images of the present invention are not restricted to frontal or profile face images, so the present invention can also detect frontal and profile faces simultaneously.

Brief description of the drawings

Fig. 1 is a schematic diagram of an embodiment of the face detection method in the embodiments of the present invention;

Fig. 2 is a schematic diagram of the first deep convolutional network in the embodiments of the present invention;

Fig. 3 is a schematic diagram of the second deep convolutional network in the embodiments of the present invention;

Fig. 4 is a schematic diagram of an embodiment of the face detection device in the embodiments of the present invention.

Detailed description of the invention

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those skilled in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.

The terms "first", "second", "third", "fourth" and so on (if present) in the description, the claims and the above drawings are used to distinguish similar objects and not to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described here can be implemented in an order other than the one illustrated or described. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion; for example, a process, method, system, product or device that contains a series of steps or units is not necessarily limited to the steps or units explicitly listed, but may include other steps or units that are not explicitly listed or that are inherent to the process, method, product or device.

Referring to Fig. 1, an embodiment of the face detection method in the embodiments of the present invention includes:

101. Obtain a candidate face image through a first deep convolutional network, the first deep convolutional network being a preset fully convolutional network used for preliminary detection;

In this embodiment, note that deep convolutional networks are typically applied only to images of a fixed size for classification or recognition. The first deep convolutional network of the present invention is a fully convolutional network; by using the structure of a fully convolutional network, it can be applied to images of arbitrary size.

Optionally, in some embodiments of the invention, obtaining the candidate face image through the first deep convolutional network is specifically:

Determining local hottest points in the face heat map, and taking the local hottest points as candidate face positions;

Obtaining the candidate face image according to the candidate face positions.

It should be noted that the first deep convolutional network can be a small network used to generate the face heat map. Local hottest points are then found in the face heat map and taken as candidate face positions, and the candidate face image is cropped from the original image according to those positions. For example, if the original image is a photo containing a little girl and a desk, the first deep convolutional network can be used to crop the little girl's face image from the photo.
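The search for local hottest points can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the 8-neighbour definition of "local", and the `threshold` parameter are assumptions. The `stride=4` and `window=32` defaults are derived from the layer sizes given for Fig. 2, assuming unpadded convolutions and non-overlapping 4 × 4 pooling.

```python
def local_hot_spots(heat, threshold=0.5):
    """Return (row, col, score) for each local maximum of a 2-D heat map.

    `heat` is a list of rows of face probabilities. A cell counts as a
    local hottest point when it exceeds `threshold` (assumed parameter)
    and is not smaller than any of its up-to-8 neighbours.
    """
    rows, cols = len(heat), len(heat[0])
    spots = []
    for r in range(rows):
        for c in range(cols):
            v = heat[r][c]
            if v <= threshold:
                continue
            neighbours = [
                heat[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(v >= n for n in neighbours):
                spots.append((r, c, v))
    return spots


def heat_to_box(r, c, stride=4, window=32):
    """Map a heat-map cell to an (x, y, w, h) crop in the original image."""
    return (c * stride, r * stride, window, window)
```

Cells with equal scores next to each other are all reported here, so a later step that merges overlapping positions is still useful.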

Further optionally, in some embodiments of the invention, before obtaining the candidate face image according to the candidate face positions, the method includes:

Judging whether the candidate face positions overlap;

If so, merging the overlapping candidate face positions.
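The description does not say how overlapping positions are merged. One common choice, shown here purely as an assumption, is non-maximum suppression: keep the highest-scoring box and drop any box whose intersection-over-union (IoU) with an already kept box exceeds a threshold. The names `iou`, `merge_overlapping` and the `iou_threshold` value are hypothetical.

```python
def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = min(ax + aw, bx + bw) - max(ax, bx)
    ih = min(ay + ah, by + bh) - max(ay, by)
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (aw * ah + bw * bh - inter)


def merge_overlapping(candidates, iou_threshold=0.3):
    """Greedy non-maximum suppression (an assumed merge rule).

    `candidates` is a list of ((x, y, w, h), score) pairs. Boxes are
    visited from highest to lowest score, and a box is kept only if it
    does not overlap a kept box by more than `iou_threshold`.
    """
    kept = []
    for box, score in sorted(candidates, key=lambda c: c[1], reverse=True):
        if all(iou(box, k[0]) <= iou_threshold for k in kept):
            kept.append((box, score))
    return kept
```

Averaging the coordinates of overlapping boxes would be an equally plausible reading of "merge"; the greedy variant is used here only because it is short.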

Optionally, in some embodiments of the invention, before obtaining the candidate face image through the first deep convolutional network, the method includes:

Generating the first deep convolutional network;

Collecting face images and non-face images, and using the face images and non-face images as training samples;

Training the first deep convolutional network with the training samples.

It should be noted that the present invention collects a large number of face and non-face images as training samples for training the first deep convolutional network. Typically, a face image lies within a larger image, and a rectangular region, called the face frame, marks the exact position and size of the face. Random perturbations are added to the size and position of the face frame, the face image is cropped out, and the crop is then randomly rotated so that faces in different orientations can be detected; repeating this yields multiple training samples from a single face image. Typically more than 10,000 original face images are needed, from which more than 100,000 training samples can be extracted, so that the first deep convolutional network learns better features.
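The perturbation of the face frame can be sketched as below. The description gives no perturbation ranges, so `max_shift`, `max_scale` and `max_angle` are assumed values, and the function only produces the jittered frame parameters; the actual cropping and rotation would be done with an image library.

```python
import random


def jitter_face_frame(frame, n_samples, max_shift=0.1, max_scale=0.1,
                      max_angle=30.0, rng=None):
    """Generate randomly perturbed copies of one face frame.

    `frame` is (x, y, w, h). Each sample is (x, y, w, h, angle_degrees):
    the frame is rescaled by up to +/- max_scale, its centre is shifted
    by up to +/- max_shift of the frame size, and a random rotation
    angle is drawn. All ranges are illustrative assumptions.
    """
    rng = rng or random.Random()
    x, y, w, h = frame
    samples = []
    for _ in range(n_samples):
        scale = 1.0 + rng.uniform(-max_scale, max_scale)
        dx = rng.uniform(-max_shift, max_shift) * w
        dy = rng.uniform(-max_shift, max_shift) * h
        angle = rng.uniform(-max_angle, max_angle)
        nw, nh = w * scale, h * scale
        # Keep the perturbed frame centred near the original centre.
        cx, cy = x + w / 2 + dx, y + h / 2 + dy
        samples.append((cx - nw / 2, cy - nh / 2, nw, nh, angle))
    return samples
```

With about ten perturbed samples per face, 10,000 original face images would give on the order of the 100,000 training samples mentioned above.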

Non-face images can be cropped at random from pictures that contain no faces, such as landscape pictures and object pictures. The number of non-face images is far greater than the number of face images, for example more than 1,000,000.

The first deep convolutional network is trained with existing neural network tools and the training samples to determine its parameters. If a cross-entropy loss function is used during training, the final convolution needs to use two convolution kernels. In use after training, however, only the output of the first convolution kernel is needed, so the result of the second convolution kernel need not be computed and that kernel can be removed.

It should be noted that the second deep convolutional network is similar to the first deep convolutional network and can likewise be trained with training samples, which is not repeated here.

Optionally, in some embodiments of the invention, the first deep convolutional network comprises multiple layers, in order: a first input layer, a first convolutional layer, a first output layer, a first max pooling layer, a second output layer, a first activation function layer, a second convolutional layer, a third output layer, a second activation function layer, a third convolutional layer and a fourth output layer. To make the first deep convolutional network easier to understand, it is described in detail below:

Referring to Fig. 2, Fig. 2 shows a preset fully convolutional network used for preliminary detection, that is, the first deep convolutional network. The first input layer receives a 3-channel (H × W) image of arbitrary size. The first convolutional layer convolves it with 32 convolution kernels of size 5 × 5, and the first output layer yields a 32-channel output image whose height and width are each reduced by 4 pixels, i.e. (H-4) × (W-4). The first max pooling layer then applies one 4 × 4 max pooling operation, which reduces the height and width to 1/4 of their previous values, so the second output layer yields a 32-channel ((H-4)/4) × ((W-4)/4) image, which the first activation function layer processes with the ReLU activation function. Next, the second convolutional layer convolves with 64 convolution kernels of size 7 × 7, and the third output layer yields a 64-channel output image of size ((H-4)/4-6) × ((W-4)/4-6), which the second activation function layer again processes with the ReLU activation function. Finally, the third convolutional layer convolves with one 1 × 1 convolution kernel, and the fourth output layer yields a 1-channel ((H-4)/4-6) × ((W-4)/4-6) image. This is the final face probability map, i.e. the heat map; the local maximum points on this heat map are possible faces.
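The size arithmetic for Fig. 2 can be checked with a few lines of code. This is only a sketch of the formulas stated above, assuming unpadded convolutions with stride 1 and non-overlapping pooling; the use of integer (floor) division when (H-4) is not a multiple of 4 is an assumption, as is the function name.

```python
def first_network_output_size(h, w):
    """Heat-map size produced by the Fig. 2 network for an H x W input."""
    h, w = h - 4, w - 4      # 5x5 convolution, no padding: lose 4 pixels
    h, w = h // 4, w // 4    # 4x4 max pooling: each side divided by 4
    h, w = h - 6, w - 6      # 7x7 convolution, no padding: lose 6 pixels
    return h, w              # 1x1 convolution leaves the size unchanged
```

For a 32 × 32 input this gives a 1 × 1 heat map, i.e. a single probability, while a 480 × 640 input gives a 113 × 153 heat map.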

This fully convolutional network is equivalent to mapping a small 32 × 32 pixel image block to one probability value. The network can therefore detect faces at only one scale (32 × 32); to detect faces at other scales, the original image must be scaled and detected again. The number of scalings and the scaling ratios are determined by the range of face sizes to be detected.
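How the scaling ratios might be derived from the face size range can be sketched as follows. The geometric `step` between consecutive face sizes is an assumed parameter; the description only says that the number of scalings and the ratios depend on the range of face sizes to be detected.

```python
def pyramid_scales(min_face, max_face, window=32, step=1.25):
    """Scale factors that map faces of min_face..max_face pixels to `window`.

    Each factor is applied to the original image before it is passed to
    the fully convolutional network; `step` (assumed) is the ratio
    between consecutive face sizes covered.
    """
    scales = []
    size = float(min_face)
    while size <= max_face:
        scales.append(window / size)
        size *= step
    return scales
```

For instance, covering faces from 32 to 64 pixels with `step=2.0` requires two passes, at scale 1.0 and at scale 0.5.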

For applications with stricter speed requirements, the size of this fully convolutional network can be reduced further, for example by using a single-channel grayscale image as the input image, or by reducing the number of convolution kernels, e.g. 4 kernels in the first convolution and 16 in the second, which can greatly increase processing speed.

The first deep convolutional network used for preliminary detection is a fully convolutional network, so it can process images of arbitrary size. The second requirement is speed: precision may be slightly lower, and reliability can typically reach 99.3% or more.

102. Compute on the candidate face image through a second deep convolutional network to obtain a reliability value of the candidate face image, the second deep convolutional network being a preset deep convolutional network used for verification;

In this embodiment, the second deep convolutional network used for verification need not meet strict speed requirements but requires higher reliability; typically, the reliability value needs to reach 99.7% or more. One or more second deep convolutional networks may be designed for verification. When there are several, for greater reliability their structures or convolution kernel designs differ considerably, so that they complement one another and can extract different features from the image. In addition, the second deep convolutional network used for verification can take an image of fixed size as input, so a fully convolutional network is not required. To achieve higher reliability, the number of layers and the number of image channels of the second deep convolutional network can be larger. Fig. 3 is a schematic diagram of a typical second deep convolutional network. The second deep convolutional network in Fig. 3 may include multiple layers, in order: a second input layer, a fourth convolutional layer, a fifth output layer, a second max pooling layer, a sixth output layer, a third activation function layer, a fifth convolutional layer, a seventh output layer, a third max pooling layer, an eighth output layer, a fourth activation function layer, a fully connected layer and a ninth output layer. Specifically, the second input layer receives a 3-channel (32 × 32) image. The fourth convolutional layer convolves it with 32 convolution kernels of size 11 × 11, and the fifth output layer yields a 32-channel (22 × 22) output image. The second max pooling layer then applies one 2 × 2 max pooling operation, the sixth output layer yields a 32-channel (11 × 11) output image, and the third activation function layer processes it with the ReLU activation function. Next, the fifth convolutional layer convolves with 64 convolution kernels of size 3 × 3, and the seventh output layer yields a 64-channel (9 × 9) output image; the third max pooling layer applies one 3 × 3 max pooling operation, the eighth output layer yields a 64-channel (3 × 3) output image, and the fourth activation function layer again processes it with the ReLU activation function. The fully connected layer takes 576 input values and produces 2 output values, where 576 × 2 denotes the 576 × 2 parameter matrix. The ninth output layer finally yields two values, which are used to compute the face and non-face probabilities; the face probability is taken as the result.
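The layer sizes quoted for Fig. 3 can be verified with a short calculation, and the final step from two output values to a face probability can be illustrated with a softmax. The description does not name the function used to turn the two values into probabilities, so the softmax here is an assumption, as are the function names.

```python
import math


def fc_input_count(size=32):
    """Number of values fed to the fully connected layer in Fig. 3."""
    s = size - 11 + 1    # 11x11 convolution, no padding: 32 -> 22
    s //= 2              # 2x2 max pooling: 22 -> 11
    s = s - 3 + 1        # 3x3 convolution, no padding: 11 -> 9
    s //= 3              # 3x3 max pooling: 9 -> 3
    return 64 * s * s    # 64 channels of 3 x 3


def face_probability(face_score, non_face_score):
    """Two-class softmax (assumed) over the two values of the ninth output layer."""
    m = max(face_score, non_face_score)   # subtract the max for stability
    e_face = math.exp(face_score - m)
    e_non = math.exp(non_face_score - m)
    return e_face / (e_face + e_non)
```

`fc_input_count()` evaluates to 576, matching the 576 input values stated for the fully connected layer.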

Optionally, in some embodiments of the invention, if the second deep convolutional network is a plurality of deep convolutional networks, step 102 is specifically:

Computing on the candidate face image through each of the plurality of deep convolutional networks to obtain a plurality of reliability values of the candidate face image;

Obtaining the reliability value of the candidate face image from the plurality of reliability values.

Specifically, the plurality of reliability values are averaged and the mean is used as the reliability value of the candidate face image; or the maximum of the plurality of reliability values is used as the reliability value of the candidate face image; or a weighted value of the plurality of reliability values is used as the reliability value of the candidate face image. Other methods may also be used, which is not limited here.
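The three combination rules named above can be sketched in one helper. The function name, the `mode` strings and the normalisation of the weights are illustrative assumptions; the description does not specify how the weights are chosen.

```python
def combine_reliability(values, mode="mean", weights=None):
    """Combine the reliability values produced by several verification networks."""
    if not values:
        raise ValueError("at least one reliability value is required")
    if mode == "mean":
        return sum(values) / len(values)
    if mode == "max":
        return max(values)
    if mode == "weighted":
        if weights is None or len(weights) != len(values):
            raise ValueError("one weight per reliability value is required")
        # Normalise so the result stays in the same range as the inputs.
        return sum(v * w for v, w in zip(values, weights)) / sum(weights)
    raise ValueError("unknown mode: " + mode)
```

Taking the maximum accepts a candidate as soon as any one network is confident, whereas the mean and the weighted value require broader agreement between the networks.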

103. If the reliability value of the candidate face image is greater than a preset threshold, determine the candidate face image to be a final face image.

In this embodiment, the reliability value represents the reliability of the candidate face image. The preset threshold may be 99.7% or another reasonable value, which is not limited here.

In this embodiment, the network used for preliminary detection is a fully convolutional network, so the present invention can process images of arbitrary size and improve face detection speed; in addition, the candidate face images of the present invention are not restricted to frontal or profile face images, so the present invention can also detect frontal and profile faces simultaneously.
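Steps 102 and 103 together amount to a filter over the candidates, which can be sketched as below. The verification networks are represented by plain callables returning a value in [0, 1], and averaging their outputs is just one of the permitted combination rules; `verify_candidates` and its parameters are hypothetical names.

```python
def verify_candidates(candidates, verifiers, threshold=0.997):
    """Keep the candidates whose reliability value exceeds the threshold.

    `candidates` is a list of candidate face images (any objects) and
    `verifiers` is a non-empty list of callables standing in for the
    second deep convolutional network(s).
    """
    final_faces = []
    for crop in candidates:
        scores = [verify(crop) for verify in verifiers]
        reliability = sum(scores) / len(scores)   # mean, as one option
        if reliability > threshold:               # strictly greater than
            final_faces.append((crop, reliability))
    return final_faces
```

A candidate whose reliability value exactly equals the threshold is rejected here, following the "greater than" wording of step 103.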

Referring to Fig. 4, an embodiment of the face detection device in the embodiments of the present invention includes:

An acquisition module 201, configured to obtain a candidate face image through a first deep convolutional network, the first deep convolutional network being a preset fully convolutional network used for preliminary detection;

A first processing module 202, configured to compute on the candidate face image through a second deep convolutional network to obtain a reliability value of the candidate face image, the second deep convolutional network being a preset deep convolutional network used for verification;

A determination module 203, configured to determine the candidate face image to be a final face image if its reliability value is greater than a preset threshold.

Optionally, the acquisition module 201 includes:

A first processing unit, configured to determine local hottest points in the face heat map and take the local hottest points as candidate face positions;

An acquiring unit, configured to obtain the candidate face image according to the candidate face positions.

Further, the device also includes:

A judging module, configured to judge whether the candidate face positions overlap;

A second processing module, configured to merge the overlapping candidate face positions if the judging module judges that the candidate face positions overlap.

Optionally, the device also includes:

A third processing module, configured to collect face images and non-face images and use the face images and non-face images as training samples;

A training module, configured to train the first deep convolutional network with the training samples.

Further, if the second deep convolutional network is a plurality of deep convolutional networks, the first processing module 202 includes:

A computing unit, configured to compute on the candidate face image through each of the plurality of deep convolutional networks to obtain a plurality of reliability values of the candidate face image;

A second processing unit, configured to obtain the reliability value of the candidate face image from the plurality of reliability values.

Optionally, the first deep convolutional network comprises multiple layers, in order: a first input layer, a first convolutional layer, a first output layer, a first max pooling layer, a second output layer, a first activation function layer, a second convolutional layer, a third output layer, a second activation function layer, a third convolutional layer and a fourth output layer. The second deep convolutional network comprises multiple layers, in order: a second input layer, a fourth convolutional layer, a fifth output layer, a second max pooling layer, a sixth output layer, a third activation function layer, a fifth convolutional layer, a seventh output layer, a third max pooling layer, an eighth output layer, a fourth activation function layer, a fully connected layer and a ninth output layer.

Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.

In the several embodiments provided in this application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. For example, the device embodiments described above are only illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Furthermore, the mutual coupling, direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or of another form.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.

If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a portable hard drive, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk or an optical disc.

The above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements of some of the technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A face detection method, characterized by comprising:

Obtaining a candidate face image through a first deep convolutional network, the first deep convolutional network being a preset fully convolutional network used for preliminary detection;

Computing on the candidate face image through a second deep convolutional network to obtain a reliability value of the candidate face image, the second deep convolutional network being a preset deep convolutional network used for verification;

2. The method according to claim 1, characterized in that obtaining the candidate face image through the first deep convolutional network comprises:

3. The method according to claim 2, characterized in that, before obtaining the candidate face image according to the candidate face positions, the method comprises:

Judging whether the candidate face positions overlap;

If so, merging the overlapping candidate face positions.

4. The method according to claim 1, characterized in that, before obtaining the candidate face image through the first deep convolutional network, the method comprises:

Generating the first deep convolutional network;

5. The method according to any one of claims 1 to 4, characterized in that the second deep convolutional network is a plurality of deep convolutional networks, and computing on the candidate face image through the second deep convolutional network to obtain the reliability value of the candidate face image comprises:

Computing on the candidate face image through each of the plurality of deep convolutional networks to obtain a plurality of reliability values of the candidate face image;

6. The method according to any one of claims 1 to 5, characterized in that the first deep convolutional network comprises multiple layers, in order: a first input layer, a first convolutional layer, a first output layer, a first max pooling layer, a second output layer, a first activation function layer, a second convolutional layer, a third output layer, a second activation function layer, a third convolutional layer and a fourth output layer; and the second deep convolutional network comprises multiple layers, in order: a second input layer, a fourth convolutional layer, a fifth output layer, a second max pooling layer, a sixth output layer, a third activation function layer, a fifth convolutional layer, a seventh output layer, a third max pooling layer, an eighth output layer, a fourth activation function layer, a fully connected layer and a ninth output layer.

7. A face detection device, characterized by comprising:

An acquisition module, configured to obtain a candidate face image through a first deep convolutional network, the first deep convolutional network being a preset fully convolutional network used for preliminary detection;

A first processing module, configured to compute on the candidate face image through a second deep convolutional network to obtain a reliability value of the candidate face image, the second deep convolutional network being a preset deep convolutional network used for verification;

A determination module, configured to determine the candidate face image to be a final face image if its reliability value is greater than a preset threshold.

8. The device according to claim 7, characterized in that the acquisition module comprises:

A first processing unit, configured to determine local hottest points in the face heat map and take the local hottest points as candidate face positions;

9. The device according to claim 8, characterized in that the device further comprises:

A second processing module, configured to merge the overlapping candidate face positions if the judging module judges that the candidate face positions overlap.

10. The device according to claim 7, characterized in that the device further comprises:

A third processing module, configured to collect face images and non-face images and use the face images and non-face images as training samples;

11. The device according to any one of claims 7 to 10, characterized in that the second deep convolutional network is a plurality of deep convolutional networks, and the first processing module comprises:

A computing unit, configured to compute on the candidate face image through each of the plurality of deep convolutional networks to obtain a plurality of reliability values of the candidate face image;

A second processing unit, configured to obtain the reliability value of the candidate face image from the plurality of reliability values.

12. The device according to any one of claims 7 to 11, characterized in that the first deep convolutional network comprises multiple layers, in order: a first input layer, a first convolutional layer, a first output layer, a first max pooling layer, a second output layer, a first activation function layer, a second convolutional layer, a third output layer, a second activation function layer, a third convolutional layer and a fourth output layer; and the second deep convolutional network comprises multiple layers, in order: a second input layer, a fourth convolutional layer, a fifth output layer, a second max pooling layer, a sixth output layer, a third activation function layer, a fifth convolutional layer, a seventh output layer, a third max pooling layer, an eighth output layer, a fourth activation function layer, a fully connected layer and a ninth output layer.