CN110189255A - Method for detecting human face based on hierarchical detection - Google Patents

Publication number
CN110189255A
Authority
CN
China
Prior art keywords
resolution
human face
image
indicate
super
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910455695.5A
Other languages
Chinese (zh)
Other versions
CN110189255B (en
Inventor
于力
刘意文
邹见效
杨瞻远
徐红兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN201910455695.5A
Publication of CN110189255A
Application granted
Publication of CN110189255B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4046 Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053 Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a face detection method based on hierarchical detection. A face detection model and a GAN-based super-resolution reconstruction model are first trained separately. The face image to be detected is then input into the face detection model to obtain the coordinates of each candidate face region and the confidence value that the region is a face, and a preliminary decision is made from the confidence value. Each undetermined face target is then input into the generator of the GAN-based super-resolution reconstruction model for a further decision. By using hierarchical detection, the invention can effectively improve the detection rate on low-resolution face images.

Description

Method for detecting human face based on hierarchical detection
Technical field
The invention belongs to the technical field of low-resolution face detection and, more specifically, relates to a face detection method based on hierarchical detection.
Background art
Face detection first arose as a sub-problem of face recognition systems and, as research deepened, gradually became an independent topic. Current face detection technology combines machine learning, computer vision, pattern recognition, and artificial intelligence. It has become the basis of all applications derived from face image analysis and has a significant impact on the response speed and detection accuracy of these applications. As face detection is applied to more and more scenarios, it increasingly encounters input face images that are too small or too low in quality for various reasons, and on such low-resolution face images the accuracy of a face detection system often drops sharply. The detection of low-quality and small-size face images is usually referred to as low-resolution face detection.
Current face detection algorithms are all essentially binary classification problems: valid features are first extracted from the region to be detected, and these features are then used to decide whether a face is present. Research on low-resolution face detection builds on the same basis. Low-resolution faces have three characteristics: little information, heavy noise, and few available tools. As a result, not enough valid features can be extracted from a candidate region to represent it. At the level of feature representation, traditional methods cannot extract enough valid features to represent a low-resolution face; in deep neural networks, the early convolutional layers cannot provide sufficiently strong feature maps, and the later convolutional layers cannot provide enough features of the low-resolution face region. This inherent deficiency makes detecting low-resolution faces extremely difficult.
To address low-resolution face detection, many researchers have carried out extensive targeted studies. Overall, work at home and abroad concentrates on three directions: finding resolution-robust feature representations for the face region, designing new classifiers for the characteristics of low-resolution faces, and image super-resolution methods. It must be acknowledged that research on low-resolution small-face detection is still at a developing stage and that many problems remain. On the one hand, how to effectively extract the contextual information of a low-resolution face and integrate it into the detection network, so as to give a low-resolution face detector better performance, still needs further exploration. On the other hand, a complete face detection system is necessarily a full-scale one, so handling low-resolution faces must also take into account the ability to detect faces at other scales. In fact, it is precisely this problem of fusing multi-scale detection that leaves current low-resolution face detection systems with either low accuracy or very slow processing, and it urgently needs to be solved.
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art by providing a face detection method based on hierarchical detection: a face detection model first filters the face image, and the undetermined samples are then examined further by a GAN-based super-resolution reconstruction model, so as to improve the detection rate on low-resolution face images.
To achieve the above object, the deep-learning-based face detection method of the invention comprises the following steps:
S1: obtain a number of face image training samples, each comprising an image containing a face and the corresponding face target information, and train the face detection model with these samples;
S2: obtain a number of super-resolution face image training samples, each comprising a low-resolution image containing a face and the corresponding high-resolution image, and train the GAN-based super-resolution reconstruction model with these samples; the GAN-based super-resolution reconstruction model comprises a generator G and a discriminator D;
S3: input the face image to be detected into the face detection model to obtain the coordinates of each candidate face region and the confidence value C that the region is a face; preset confidence thresholds T1 and T2 with 0 < T1 < T2 < 1; for each candidate region, if C ≥ T2, determine that the region contains a face target and output it as a face target region; if T1 ≤ C < T2, treat the region as an undetermined face target; otherwise determine that the region contains no face target and do not output it;
S4: input each undetermined face target into the generator G of the GAN-based super-resolution reconstruction model to generate a super-resolution reconstructed image R, then input R into the discriminator D, which judges whether it is a qualified super-resolution reconstructed image and whether it contains a face target; if image R is both a qualified super-resolution reconstructed image and contains a face target, determine that the corresponding candidate region contains a face target and output it as a face target region; otherwise determine that it contains no face target.
In the face detection method based on hierarchical detection of the invention, a face detection model and a GAN-based super-resolution reconstruction model are first trained separately. The face image to be detected is then input into the face detection model to obtain the coordinates of each candidate face region and the confidence value that the region is a face, and a preliminary decision is made from the confidence value. Each undetermined face target is then input into the generator of the GAN-based super-resolution reconstruction model for a further decision. By using hierarchical detection, the invention can effectively improve the detection rate on low-resolution face images.
Brief description of the drawings
Fig. 1 is a flow chart of a specific embodiment of the face detection method based on hierarchical detection of the invention;
Fig. 2 is a structural diagram of the R-FCN network;
Fig. 3 is a flow chart of the improved bounding-box regression algorithm in this embodiment;
Fig. 4 is a structural diagram of the generator in the SRGAN network;
Fig. 5 is a structural diagram of the discriminator in the SRGAN network;
Fig. 6 shows the PR curves of the three methods in the experimental verification;
Fig. 7 shows example detection results of the SFD face detection method in the experimental verification;
Fig. 8 shows example detection results of the R-FCN face detection method in the experimental verification;
Fig. 9 shows example detection results of the invention in the experimental verification;
Fig. 10 shows the PR curves of the three methods for face detection on the clear test sample set in the experimental verification;
Fig. 11 shows the PR curves of the three methods for face detection on the moderately blurred test sample set in the experimental verification;
Fig. 12 shows the PR curves of the three methods for face detection on the severely blurred test sample set in the experimental verification.
Detailed description of the embodiments
Specific embodiments of the invention are described below with reference to the drawings so that those skilled in the art can better understand the invention. Note in particular that, in the following description, detailed descriptions of known functions and designs are omitted where they might obscure the main content of the invention.
Embodiment
Fig. 1 is a flow chart of a specific embodiment of the face detection method based on hierarchical detection of the invention. As shown in Fig. 1, the specific steps of the method are:
S101: train the face detection model:
Obtain a number of face image training samples, each comprising an image containing a face and the corresponding face target information, and train the face detection model with these samples.
S102: train the super-resolution reconstruction model:
Obtain a number of super-resolution face image training samples, each comprising a low-resolution image containing a face and the corresponding high-resolution image, and train the GAN-based super-resolution reconstruction model with these samples. The super-resolution reconstruction model based on a GAN (Generative Adversarial Network) comprises a generator G and a discriminator D.
S103: perform preliminary detection with the face detection model:
Input the face image to be detected into the face detection model to obtain the coordinates of each candidate face region and the confidence value C that the region is a face. Preset confidence thresholds T1 and T2 with 0 < T1 < T2 < 1. For each candidate region, if C ≥ T2, determine that the region contains a face target and output it as a face target region; if T1 ≤ C < T2, treat the region as an undetermined face target; otherwise determine that the region contains no face target and do not output it.
S104: perform detection with the super-resolution reconstruction model:
Input each undetermined face target into the generator G of the GAN-based super-resolution reconstruction model to generate a super-resolution reconstructed image SR, then input SR into the discriminator D, which judges whether it is a qualified super-resolution reconstructed image and whether it contains a face target. If image SR is both a qualified super-resolution reconstructed image and contains a face target, determine that the corresponding candidate region contains a face target and output it as a face target region; otherwise determine that it contains no face target.
With the above face detection method based on hierarchical detection, the super-resolution reconstruction model serves as an auxiliary to the face detection model and further examines candidate regions whose confidence is not high enough, which avoids missed and false detections of face targets and improves detection performance.
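The two-threshold rule of S103 and the second-stage check of S104 can be sketched as follows. This is only an illustrative sketch: the box layout, the function name hierarchical_filter, and the second_stage callback (standing in for the generator G plus discriminator D decision) are assumptions introduced here, not names from the patent.

```python
from typing import Callable, List, Tuple

# Hypothetical box layout (x1, y1, x2, y2); the patent only speaks of "coordinate information".
Box = Tuple[float, float, float, float]


def hierarchical_filter(
    candidates: List[Tuple[Box, float]],
    t1: float,
    t2: float,
    second_stage: Callable[[Box], bool],
) -> List[Box]:
    """Apply the T1/T2 rule of S103, deferring undetermined regions to S104."""
    if not 0.0 < t1 < t2 < 1.0:
        raise ValueError("thresholds must satisfy 0 < T1 < T2 < 1")
    faces = []
    for box, confidence in candidates:
        if confidence >= t2:
            # High confidence: output directly as a face target region.
            faces.append(box)
        elif confidence >= t1:
            # Undetermined: keep only if the second stage accepts the region.
            if second_stage(box):
                faces.append(box)
        # confidence < T1: rejected, nothing is output.
    return faces
```

In practice second_stage would crop the region, run it through the generator, and return the discriminator's decision; here it is left as a plain callable so the control flow can be read on its own.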
Any specific face detection model may be chosen as needed. In this embodiment the R-FCN network is selected as the face detection model and is improved for low-resolution face detection to raise detection performance. R-FCN is a modification of the traditional Faster R-CNN structure. Its core idea is to build on the RPN (Region Proposal Network) proposed in Faster R-CNN, introduce position-sensitive information, move the ROI layer toward the end of the network, and use position-sensitive feature maps to compute the probability that an object in the image to be detected belongs to each class, which greatly increases detection speed while maintaining high localization accuracy. Fig. 2 is a structural diagram of the R-FCN network. As shown in Fig. 2, the workflow of R-FCN can be summarized as follows:
The image is input into a pre-trained classification network (Fig. 2 uses the part of the ResNet-101 network before Conv4), whose network parameters are fixed. Three branches operate on the feature map obtained from the last convolutional layer of the pre-trained network:
The 1st branch performs the RPN operation on this feature map to obtain the candidate regions (ROIs). Specifically, anchor boxes (anchors) are generated on the feature map according to preset parameters; the anchor boxes are a set of regions of different sizes and aspect ratios covering the whole input image. The anchor boxes that contain foreground are then identified and converted by the bounding-box regression algorithm into target bounding boxes that fit the enclosed foreground objects more tightly.
The 2nd branch obtains a K*K*(C+1)-dimensional position-sensitive score map on this feature map for classification.
The 3rd branch obtains a 4*K*K-dimensional position-sensitive score map on this feature map for regression.
Finally, position-sensitive ROI pooling (Position-Sensitive RoI Pooling; average pooling is used here) is performed on the K*K*(C+1)-dimensional and the 4*K*K-dimensional position-sensitive score maps respectively to obtain the confidence and location of each candidate region, and the class is then determined from the confidence.
In this embodiment the anchor generation parameters are improved first. A traditional R-FCN network generates anchor boxes at three scales and three aspect ratios; by default the scales are {128*128, 256*256, 512*512} and the aspect ratios are {1:1, 1:2, 2:1}, giving 9 sizes. When the targets are small faces, small face regions are easily missed. In this embodiment the anchor scales are therefore changed to the five scales {16*16, 32*32, 128*128, 256*256, 512*512}, each again generating anchors with the three aspect ratios {1:1, 1:2, 2:1}, for a total of 15 sizes. The two added small scales are used to detect small faces, and the three retained scales are used to extract face regions of conventional size.
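As an illustration of how five scales and three aspect ratios yield 15 anchor sizes, a minimal sketch is given below. It assumes that each anchor keeps the area of its scale (for example 16*16 = 256 pixels) and that a ratio a:b means width:height; neither convention is stated in the text, and the function name anchor_sizes is hypothetical.

```python
import math
from typing import List, Tuple

SCALES = [16, 32, 128, 256, 512]      # side length of the square reference box
RATIOS = [(1, 1), (1, 2), (2, 1)]     # assumed to mean width : height


def anchor_sizes(scales=SCALES, ratios=RATIOS) -> List[Tuple[float, float]]:
    """Return (width, height) pairs, one per scale/ratio combination.

    Assumption: every anchor of a given scale has area scale * scale.
    """
    sizes = []
    for s in scales:
        area = float(s * s)
        for rw, rh in ratios:
            w = math.sqrt(area * rw / rh)
            h = area / w
            sizes.append((w, h))
    return sizes
```

Calling anchor_sizes() with the default arguments returns 15 pairs, of which the first six come from the two added small scales.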
For the bounding-box regression algorithm, the prior art mostly uses NMS (Non-Maximum Suppression). Its core idea is to find local maxima and suppress non-maxima: iteratively, the anchor box with the highest confidence is compared with the other anchor boxes by intersection over union (IoU, the overlap ratio between a candidate box and a calibrated box), and the boxes with a large IoU are filtered out. However, research shows that the NMS algorithm has the following problems:
1) NMS forcibly sets to 0 the confidence of neighbouring candidate boxes that overlap, i.e. it simply deletes every candidate box whose IoU exceeds the threshold. If a real target to be detected lies in the overlap region, its detection is very likely to fail, which raises the miss rate and lowers the average recall.
2) When NMS is used for bounding-box regression, the optimal value of the IoU decision threshold Nt is hard to determine: too large a value raises the false detection rate, and too small a value raises the miss rate.
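The hard suppression described in problem 1) can be seen in a minimal NumPy sketch of conventional NMS. The (x1, y1, x2, y2) box layout and the function names iou and hard_nms are assumptions made for illustration.

```python
import numpy as np


def iou(box, boxes):
    """IoU of one box against an (N, 4) array of boxes in (x1, y1, x2, y2) form."""
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def hard_nms(boxes, scores, nt):
    """Conventional NMS: any box whose IoU with the current best exceeds nt is deleted."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = list(np.argsort(-scores))     # indices sorted by descending confidence
    keep = []
    while order:
        best = order.pop(0)
        keep.append(int(best))
        if not order:
            break
        overlaps = iou(boxes[best], boxes[order])
        # Hard decision: overlapping boxes are removed outright.
        order = [i for i, o in zip(order, overlaps) if o <= nt]
    return keep
```

With two heavily overlapping boxes, the lower-scoring one disappears entirely once its IoU passes Nt, even if it covers a second real face, and the result changes abruptly with the choice of Nt.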
To solve the above problems, this embodiment improves the bounding-box regression algorithm on the basis of the NMS algorithm. Fig. 3 is a flow chart of the improved bounding-box regression algorithm in this embodiment. As shown in Fig. 3, the specific steps of the improved algorithm are:
S301: initialize the data:
Denote the set of anchor boxes containing background as B = {b1, b2, …, bN}, where bn is the n-th anchor box, n = 1, 2, …, N, and N is the number of anchor boxes containing background; denote the confidence of each anchor box as sn. Initialize the retained anchor box set D = ∅.
S302: select the current optimal anchor box:
Select the anchor box with the highest confidence from the current anchor box set B and denote it as the current optimal anchor box b′. Add b′ to the retained anchor box set D and delete it from the anchor box set B.
S303: judge whether the anchor box set B is empty; if so, bounding-box regression ends; otherwise go to step S304.
S304: update the confidences:
For each anchor box bn in the current anchor box set B, compute its IoU with the current optimal anchor box b′, iou(b′, bn), and then update the confidence sn of each anchor box bn with the following formula:
where Nt is the preset IoU threshold.
Then return to step S302.
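The loop S301 to S304 can be sketched as below. The update formula of S304 does not appear in this text, so the sketch assumes the linear decay rule of Soft-NMS, in which a confidence is multiplied by (1 - IoU) when the IoU reaches Nt and is left unchanged otherwise; the patent's actual formula may differ. The function names and the (x1, y1, x2, y2) box layout are likewise assumptions.

```python
import numpy as np


def pairwise_iou(box, boxes):
    """IoU of one box against an (N, 4) array of boxes in (x1, y1, x2, y2) form."""
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def soft_nms_linear(boxes, scores, nt):
    """Steps S301-S304 with an ASSUMED linear decay rule for the confidence update."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float).copy()
    remaining = list(range(len(boxes)))   # set B (S301)
    kept = []                             # set D, stored as (index, confidence)
    while remaining:
        best = max(remaining, key=lambda i: scores[i])   # S302
        kept.append((best, float(scores[best])))
        remaining.remove(best)
        if not remaining:                                # S303
            break
        overlaps = pairwise_iou(boxes[best], boxes[remaining])
        for i, o in zip(remaining, overlaps):            # S304
            if o >= nt:
                scores[i] *= 1.0 - o     # decay instead of deleting the box
    return kept
```

Because overlapping boxes are only down-weighted, every box ends up in D with an updated confidence, and a later confidence threshold decides which ones are reported.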
For the GAN-based super-resolution reconstruction model, this embodiment uses the SRGAN network. SRGAN is a widely used, high-performing super-resolution image reconstruction model trained on the basis of a GAN (Generative Adversarial Network). It consists of a generator G and a discriminator D. Fig. 4 is a structural diagram of the generator in the SRGAN network, and Fig. 5 is a structural diagram of the discriminator. The core of the generator is a series of residual blocks, each containing two 3*3 convolutional layers followed by batch normalization (BN) layers, with PReLU as the activation function; two 2× sub-pixel convolution layers are used to increase the feature size. The discriminator D uses a network structure similar to VGG19 but without max pooling. It contains 8 convolutional layers; as the network deepens, the number of features keeps increasing and the feature size keeps decreasing. LeakyReLU is used as the activation function, and two fully connected layers followed by a final sigmoid activation yield the probability that the input is a real sample.
The existing SRGAN network is difficult to train and has problems with the overlap between distributions. Research shows that these problems stem from the KL divergence and JS divergence that the traditional SRGAN network uses as the measure of distance between the real sample distribution and the generated sample distribution. In this embodiment, after study, the above problems are solved with the EM divergence. The EM divergence is a symmetric divergence, defined as follows:
Let Ω ⊂ R^n be a bounded connected open set and let S be the set of all Radon probability distributions on Ω. For some p ≠ 1 and k > 0, the EM divergence is calculated as follows:
where Pr and Pg denote two different probability distributions, inf denotes the infimum, x denotes a sample drawn from Pr, x̃ denotes a sample drawn from Pg, x̂ denotes a random linear combination of x and x̃, Pu denotes the probability distribution of the sample x̂, k and p each denote a constant, C_c^1(Ω) is the function space of all first-order differentiable functions with compact support on Ω, and ‖·‖ denotes the norm.
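The formula itself is not reproduced in this text. One expression that uses exactly the symbols defined above is the Wasserstein divergence known from WGAN-div; assuming the EM divergence of this embodiment takes that form (an assumption, as are the symbol names x̃, x̂ and C_c^1(Ω)), it would read:

```latex
W_{p,k}(P_r, P_g) = \inf_{f \in C_c^1(\Omega)}
  \mathbb{E}_{x \sim P_r}\left[ f(x) \right]
  - \mathbb{E}_{\tilde{x} \sim P_g}\left[ f(\tilde{x}) \right]
  + k \, \mathbb{E}_{\hat{x} \sim P_u}\left[ \lVert \nabla f(\hat{x}) \rVert^{p} \right]
```

The last term is a gradient penalty evaluated at the interpolated samples x̂, which is what allows the measure to remain informative when Pr and Pg do not overlap.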
The advantage of the EM divergence is that, for two different distributions, it still reflects the distance between them even when they do not overlap. This means that meaningful gradients are available throughout training, so the whole SRGAN network can be trained stably, which effectively solves problems such as mode collapse caused by vanishing gradients that may occur when training the original SRGAN network. In this embodiment, the objective function used in model training is improved on the basis of the EM divergence. The optimization objective of the minimax problem of the SRGAN network improved with the EM divergence is:
where x denotes a real high-resolution sample, z denotes the low-resolution sample input to generator G, G(z) is the super-resolution reconstructed sample generated by generator G, Pg denotes the probability distribution of super-resolution reconstructed samples, Pr denotes the probability distribution of real high-resolution samples, D(x) and D(G(z)) denote the probabilities with which discriminator D judges the high-resolution sample and the super-resolution reconstructed sample, respectively, to be real samples, E[·] denotes the mathematical expectation, x̂ denotes a random linear combination of the real high-resolution sample x and the super-resolution reconstructed sample G(z), Pu denotes the probability distribution of the sample x̂, and k and p each denote a constant.
During training, the above optimization objective is decomposed into two optimization problems:
1. Optimization of the discriminator D:
2. Optimization of the generator G:
Based on the above derivation, the invention improves the training method of the SRGAN model to obtain a better SRGAN model and thereby improve the quality of the super-resolution face images. The specific training method is as follows:
First obtain a number of high-resolution face images I_HR and obtain the corresponding low-resolution face images I_LR by down-sampling; each high-resolution face image I_HR and its corresponding low-resolution face image I_LR form one training sample, giving the training sample set. In this embodiment, down-sampling uses a Gaussian pyramid: the original image is taken as the bottom image G0 (level 0 of the Gaussian pyramid) and convolved with a 5*5 Gaussian kernel; the convolved image is then down-sampled (the even-numbered rows and columns are removed) to obtain the next level G1; the process is iterated until 4× down-sampling is complete.
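A minimal NumPy sketch of this Gaussian-pyramid down-sampling is given below for a single-channel image. The 5*5 kernel weights are not given in the text; the sketch assumes the common binomial approximation [1, 4, 6, 4, 1]/16 and reflective border padding, and the function names are hypothetical.

```python
import numpy as np

# Assumed 5x5 kernel: outer product of the binomial weights [1, 4, 6, 4, 1] / 16.
_K1D = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
KERNEL = np.outer(_K1D, _K1D)


def pyr_down(img: np.ndarray) -> np.ndarray:
    """One pyramid level: blur a 2-D image with the 5x5 kernel, then drop every other row and column."""
    h, w = img.shape
    padded = np.pad(img.astype(float), 2, mode="reflect")
    blurred = np.zeros((h, w))
    for dy in range(5):
        for dx in range(5):
            blurred += KERNEL[dy, dx] * padded[dy:dy + h, dx:dx + w]
    # Keep rows/columns 0, 2, 4, ... (i.e. remove the even-numbered ones in 1-based counting).
    return blurred[::2, ::2]


def downsample_4x(img: np.ndarray) -> np.ndarray:
    """Two pyramid levels (G0 -> G1 -> G2) give 4x down-sampling."""
    return pyr_down(pyr_down(img))
```

For a colour image the same function could be applied to each channel separately; a 16*12 input yields a 4*3 output.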
The SRGAN network is then trained with the obtained training sample set. During training, the optimization objective of generator G is:
The optimization objective of discriminator D is:
where x denotes a real high-resolution face image, z denotes the low-resolution face image input to generator G, G(z) is the super-resolution reconstructed face image generated by generator G, Pg denotes the probability distribution of super-resolution reconstructed face images, Pr denotes the probability distribution of real high-resolution face images, D(x) and D(G(z)) denote the probabilities with which discriminator D judges the high-resolution face image and the super-resolution reconstructed face image, respectively, to be real face images, E[·] denotes the mathematical expectation, x̂ denotes a random linear combination of the real high-resolution face image x and the super-resolution reconstructed face image G(z), Pu denotes the probability distribution of the sample x̂, and k and p each denote a constant.
During training of the SRGAN network, the generator G first performs super-resolution reconstruction on the low-resolution face image I_LR in each training sample X. Specifically, the generator G up-samples the low-resolution face image I_LR in training sample X to obtain the super-resolution reconstructed face image I_SR. Since in this embodiment the low-resolution face image I_LR is obtained by 4× down-sampling of the high-resolution face image I_HR, the up-sampling factor used to generate the super-resolution reconstructed face image I_SR is also 4.
The high-resolution face image I_HR corresponding to the low-resolution face image I_LR and the super-resolution reconstructed face image I_SR generated by generator G are then input into discriminator D, and the loss function L_SR of the training sample is calculated according to the following formula:
where the content loss function of the training sample is calculated as follows:
where the content loss function based on the mean squared error is calculated as follows:
where W denotes the width of the high-resolution face image I_HR, H denotes its height, r denotes the down-sampling factor, and the two pixel terms denote the pixel values at coordinates (x, y) in the high-resolution face image I_HR and in the super-resolution reconstructed face image I_SR, respectively.
The VGG loss is calculated as follows:
where i denotes the index of a max-pooling layer of the VGG-19 network in discriminator D, and j denotes the index of a convolutional layer between the i-th and the (i+1)-th max-pooling layers. In the existing VGG-19 network there are 5 max-pooling layers, and the number of convolutional layers between two adjacent max-pooling layers is 2 or 4. φi,j denotes the feature map obtained by the j-th convolutional layer after the i-th max-pooling layer of the VGG-19 network in discriminator D, Wi,j denotes the width of feature map φi,j, and Hi,j denotes its height.
The adversarial loss drives the SRGAN network to "fool" the discriminator and thus favours outputs that are closer to natural images. It is calculated as follows:
where the probability term denotes the probability with which discriminator D takes the super-resolution reconstructed face image generated by the generator (i.e. I_SR) to be a real high-resolution face image, the subscripts θD and θG denote the network parameters of discriminator D and generator G respectively, w denotes the dimension index of the network parameters, w = 1, 2, …, W, and W denotes the dimension of the network parameters.
Since the super-resolution reconstruction model in the invention must detect whether the super-resolution reconstructed image contains a face target, a classification loss L_clc is added when calculating the loss function to better meet this requirement. It is calculated as follows:
where {y1, y2, …, yv, …, yV} denotes the calibration data indicating whether each region in the high-resolution face image I_HR is a face, V denotes the number of face regions calibrated in the high-resolution face image I_HR, and each value is taken from {0, 1}.
It, can preferred Adam optimization algorithm since optimization object function improved in this implementation does not have log The objective function optimization for realizing generator G and arbiter, to improve training effectiveness.For generator G, optimized using Adam The weight w of algorithm descending update generator GG:
Wherein,Indicate weight wGDecline gradient, zmIndicate super-resolution rebuilding facial image ISRIn m-th of picture The value of element, m=1,2 ..., M, M indicate pixel quantity, D (G (zm)) indicate that arbiter D judges super-resolution rebuilding facial image ISRIn m-th pixel be high-resolution human face image IHRThe probability of middle pixel, α indicate learning rate, β1Indicate single order moments estimation Exponential decay rate, β2Indicate the exponential decay rate of second order moments estimation.The typical value of three parameters of Adam optimization algorithm be α= 0.00001、β1=0.9 and β2=0.999.
The weight $w_D$ of discriminator D is updated by descent using the Adam optimization algorithm:
where the gradient term denotes the descent gradient of weight $w_D$; $x_m$ denotes the value of the m-th pixel of the high-resolution face image $I^{HR}$; $D(x_m)$ denotes the probability, as judged by discriminator D, that the m-th pixel of $I^{HR}$ is a pixel of the high-resolution face image $I^{HR}$; $\mu_m = m/M$; and the remaining symbols denote a descent gradient and the probability that discriminator D judges the corresponding sample to be a pixel of the high-resolution face image $I^{HR}$.
In this embodiment, the weight $w_G$ of generator G and the weight $w_D$ of discriminator D are preferably updated alternately: first the parameters of generator G are fixed and the parameters of discriminator D are updated, then the parameters of discriminator D are fixed and the parameters of generator G are updated, and so on in alternation.
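The alternation described above can be sketched as a simple loop. This is only an illustrative skeleton: `update_discriminator` and `update_generator` are hypothetical callbacks that each perform one optimizer step on their own network while the other network's parameters stay fixed, and the one-to-one update ratio is an assumption.

```python
def train_alternating(update_discriminator, update_generator, num_iterations):
    """Alternate discriminator and generator updates.

    Each callback receives the 1-based iteration index and returns a loss value.
    """
    history = []
    for step in range(1, num_iterations + 1):
        # Generator parameters are held fixed while D is updated.
        d_loss = update_discriminator(step)
        # Discriminator parameters are held fixed while G is updated.
        g_loss = update_generator(step)
        history.append((d_loss, g_loss))
    return history
```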
To better illustrate the technical effect of the present invention, the invention was verified experimentally on a set of low-resolution face images. In this experimental verification, the face detection model is the improved R-FCN model of this embodiment, with improved anchor box generation parameters and an improved bounding-box regression algorithm, and the GAN-based super-resolution reconstruction model is the SRGAN model obtained with the improved training method of this embodiment. Both the face detection model and the GAN-based super-resolution reconstruction model were trained on the Wider Face training sample set, and 10 images were randomly selected from each of the 61 categories, 610 images in total, as detection images. For comparison, the SFD face detection method and the R-FCN face detection method were chosen as baseline methods in this experimental verification.
To assess the technical effect of the face detection method of the invention and of the comparison methods, the PR curve is used as the evaluation criterion. The PR curve is plotted with precision as the ordinate and recall as the abscissa.
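For readers unfamiliar with the metric, the sketch below shows one common way to compute PR curve points and the area under them from scored detections. The function names, the input layout (one score and one match label per detection), and the step-wise integration rule are assumptions for illustration; the exact AP protocol used in the experiments is not stated in this text.

```python
def precision_recall_curve(scores, labels, num_ground_truth):
    """Return (precision, recall) lists, one point per detection,
    sweeping the threshold from the highest score downward.

    labels[i] is 1 if detection i matches a ground-truth face, else 0.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    precision, recall = [], []
    for i in order:
        if labels[i]:
            tp += 1
        else:
            fp += 1
        precision.append(tp / (tp + fp))
        recall.append(tp / num_ground_truth)
    return precision, recall

def average_precision(precision, recall):
    """Area under the PR curve using step-wise (rectangular) integration."""
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precision, recall):
        ap += p * (r - prev_recall)
        prev_recall = r
    return ap
```

A curve that lies closer to the upper right corner keeps precision high as recall grows, which is what a larger area under the curve reflects.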
Fig. 6 is the PR curve graph of the three methods in this experimental verification. As shown in Fig. 6, among the three face detection methods the PR curve of the present invention lies closest to the upper right corner overall, and its mAP (Mean Average Precision, i.e. the mean of the AP (average precision) values) is 0.947, the best of the three sets of data.
Fig. 7 is an example detection result of the SFD face detection method in this experimental verification. Fig. 8 is an example detection result of the R-FCN face detection method in this experimental verification. Fig. 9 is an example detection result of the present invention in this experimental verification. Comparing Fig. 7 to Fig. 9 shows that the present invention detects 14 faces in total, more than the 11 and 9 faces detected by the other two methods respectively, which demonstrates better detection performance.
Next, face detection was performed on image samples of different clarity. The Wider Face training sample set labels the blur attribute of each face target as one of three levels: clear, moderately blurred, and severely blurred. Accordingly, several samples were drawn from the image samples at each blur level to form detection sample sets. Fig. 10 is the PR curve graph of the three methods performing face detection on the clear detection sample set in this experimental verification. Fig. 11 is the PR curve graph of the three methods performing face detection on the moderately blurred detection sample set. Fig. 12 is the PR curve graph of the three methods performing face detection on the severely blurred detection sample set. As shown in Fig. 10 to Fig. 12, when sample clarity is high all three methods detect faces well, the gap between them is small, and the mAP values are all very high. In the moderately blurred test group, the mAP values of the three algorithms drop slightly but all remain above 97%, which shows that at a moderate blur level all three methods retain very good detection capability and are not seriously challenged. At the same time, the present invention already shows some advantage over SFD and R-FCN when face blur is moderate, although the advantage is not yet obvious. When the detection samples are severely blurred, the gap between the three methods begins to appear. SFD performs worst: compared with detection on clear samples, its mAP drops by about 10 percentage points. The present invention shows the smallest drop, only about 5 percentage points. In this case the mAP value of the present invention is about 2 percentage points higher than that of the original R-FCN model, and its PR curve clearly encloses the PR curves of the two comparison methods. Therefore, relative to the other two methods, the present invention offers better stability and a higher detection rate under low-resolution conditions.
Although illustrative specific embodiments of the present invention have been described above so that those skilled in the art can understand the invention, it should be clear that the invention is not limited to the scope of the specific embodiments. To those of ordinary skill in the art, as long as various changes fall within the spirit and scope of the invention as defined and determined by the appended claims, these changes are obvious, and all innovations that make use of the inventive concept are protected.

Claims (9)

1. A face detection method based on hierarchical detection, characterized by comprising the following steps:
S1: obtain a number of face image training samples, each comprising an image containing a face and the corresponding face target information, and train a face detection model using these face image training samples;
S2: obtain a number of super-resolution face image training samples, each comprising a low-resolution image containing a face and the corresponding high-resolution image, and use the super-resolution face image training samples to train a GAN-based super-resolution reconstruction model, the GAN-based super-resolution reconstruction model comprising a generator G and a discriminator D;
S3: input the face image to be detected into the face detection model to obtain the coordinate information of each candidate face target region and the confidence value C that the candidate region is a face; preset confidence thresholds $T_1$ and $T_2$ with $0 < T_1 < T_2 < 1$; for each candidate region, if its confidence value satisfies $C \ge T_2$, determine that a face target exists in the candidate region and output it as a face target region; if its confidence value satisfies $T_1 \le C < T_2$, treat the candidate region as a face target to be determined; otherwise determine that no face target exists in the candidate region and do not output it;
S4: input each face target to be determined into generator G of the GAN-based super-resolution reconstruction model to generate a super-resolution reconstructed image SR, then input SR into discriminator D, which judges whether it is a qualified super-resolution reconstructed image and whether it contains a face target; if image SR is both a qualified super-resolution reconstructed image and contains a face target, determine that a face target exists in the corresponding candidate region and output it as a face target region; otherwise determine that no face target exists in it.
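The routing logic of steps S3 and S4 can be sketched as follows. This is a hedged illustration, not the claimed implementation: `reconstruct` and `judge` are hypothetical stand-ins for generator G and discriminator D, the candidate layout `(box, confidence)` is assumed, and any concrete threshold values are chosen by the caller.

```python
def hierarchical_detect(candidates, t1, t2, reconstruct, judge):
    """Apply the two-threshold decision of S3 and the GAN check of S4.

    candidates: list of (box, confidence) pairs from the face detector.
    reconstruct(box) -> super-resolved image (stand-in for generator G).
    judge(sr) -> (is_qualified, has_face) (stand-in for discriminator D).
    """
    if not 0 < t1 < t2 < 1:
        raise ValueError("thresholds must satisfy 0 < T1 < T2 < 1")
    faces = []
    for box, confidence in candidates:
        if confidence >= t2:
            faces.append(box)              # confident face: output directly
        elif confidence >= t1:
            sr = reconstruct(box)          # uncertain: super-resolve the region
            is_qualified, has_face = judge(sr)
            if is_qualified and has_face:
                faces.append(box)
        # confidence < t1: rejected, nothing is output
    return faces
```

Only candidates in the middle confidence band incur the cost of super-resolution reconstruction, which is the point of the hierarchical design.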
2. The face detection method according to claim 1, characterized in that the face detection model uses an R-FCN network.
3. The face detection method according to claim 2, characterized in that the anchor boxes in the R-FCN network are generated at five scales {16×16, 32×32, 128×128, 256×256, 512×512} and three aspect ratios {1:1, 1:2, 2:1}.
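A small sketch of how the five scales and three ratios combine into 15 anchor shapes is given below. It assumes the common convention that each ratio is width:height and that the anchor area is preserved at scale × scale; the claim itself does not state which convention applies, and the function name is hypothetical.

```python
import math

def generate_anchor_shapes(scales=(16, 32, 128, 256, 512),
                           ratios=((1, 1), (1, 2), (2, 1))):
    """Return (width, height) pairs, one per scale/ratio combination.

    Each shape keeps the area scale*scale while width:height follows the ratio
    (an assumed convention).
    """
    shapes = []
    for s in scales:
        area = float(s * s)
        for rw, rh in ratios:
            width = math.sqrt(area * rw / rh)
            height = area / width
            shapes.append((width, height))
    return shapes
```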
4. The face detection method according to claim 2, characterized in that the specific steps of the bounding-box regression algorithm in the R-FCN network comprise:
1) denote the set of anchor boxes including background as $B = \{b_1, b_2, \ldots, b_N\}$, where $b_n$ denotes the n-th anchor box, n = 1, 2, …, N, and N denotes the number of anchor boxes including background; denote the confidence of each anchor box as $s_n$; initialize the retained anchor box set $D = \varnothing$;
2) select the anchor box with the highest confidence from the current anchor box set B and denote it as the current optimal anchor box b′; add the current optimal anchor box b′ to the retained anchor box set D, and delete it from the anchor box set B;
3) judge whether the anchor box set B is empty; if so, the bounding-box regression ends; otherwise go to step 4);
4) for each anchor box $b_n$ in the current anchor box set B, calculate its intersection over union with the current optimal anchor box b′, $iou(b', b_n)$, and then update the confidence $s_n$ of each anchor box $b_n$ using the following formula:
where $N_t$ is a preset intersection-over-union threshold;
then return to step 2).
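The loop in steps 1) to 4) has the same structure as Soft-NMS. Because the confidence update formula is not reproduced in this text, the sketch below assumes the linear Soft-NMS rule, in which a box whose overlap with the current best box reaches $N_t$ has its score multiplied by (1 − IoU). The box layout `(x1, y1, x2, y2)` and the function names are likewise assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def soft_nms_linear(boxes, scores, nt):
    """Return retained (box, score) pairs in the order they were selected."""
    remaining = list(zip(boxes, scores))   # set B
    kept = []                              # set D, initially empty
    while remaining:                       # step 3: stop when B is empty
        best_idx = max(range(len(remaining)), key=lambda i: remaining[i][1])
        best_box, best_score = remaining.pop(best_idx)   # step 2
        kept.append((best_box, best_score))
        updated = []
        for box, score in remaining:       # step 4: rescore the remaining boxes
            overlap = iou(best_box, box)
            if overlap >= nt:
                score *= (1.0 - overlap)   # assumed linear decay
            updated.append((box, score))
        remaining = updated
    return kept
```

Unlike hard suppression, no box is discarded inside the loop; overlapping boxes are only down-weighted, so a final score threshold would typically be applied to the returned list.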
5. The face detection method according to claim 1, characterized in that the GAN-based super-resolution reconstruction model uses an SRGAN network.
6. The face detection method according to claim 5, characterized in that the SRGAN network is trained using the following method:
first obtain a number of high-resolution face images $I^{HR}$ and obtain the corresponding low-resolution face images $I^{LR}$ by downsampling; each high-resolution face image $I^{HR}$ and its corresponding low-resolution face image $I^{LR}$ form one training sample, thereby giving a training sample set;
then train the SRGAN network using the obtained training sample set, where the optimization objective function of generator G during training is:
and the optimization objective function of discriminator D is:
where x denotes a true high-resolution face image, z denotes the low-resolution face image input to generator G, G(z) is the super-resolution reconstructed face image generated by generator G, $P_g$ denotes the probability distribution of super-resolution reconstructed face images, $P_r$ denotes the probability distribution of true high-resolution face images, D(x) and D(G(z)) denote the probabilities, as judged by discriminator D, that the high-resolution face image and the super-resolution reconstructed face image respectively are real face images, E[·] denotes mathematical expectation, the interpolated sample is a random linear combination of the true high-resolution face image x and the super-resolution reconstructed face image G(z), and k and p each denote a constant.
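The construction of training pairs by downsampling can be sketched as below. The downsampling kernel and factor are not specified in this text, so the average pooling used here and the default factor of 4 are assumptions; images are represented as plain 2-D lists of grayscale values to keep the example self-contained.

```python
def downsample(image, factor):
    """Average-pool a 2-D grayscale image (list of rows) by an integer factor."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image size must be divisible by factor")
    low = []
    for r in range(0, height, factor):
        row = []
        for c in range(0, width, factor):
            block = [image[r + dr][c + dc]
                     for dr in range(factor) for dc in range(factor)]
            row.append(sum(block) / len(block))
        low.append(row)
    return low

def make_training_pairs(hr_images, factor=4):
    """Pair each high-resolution image with its downsampled counterpart (LR, HR)."""
    return [(downsample(hr, factor), hr) for hr in hr_images]
```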
7. The face detection method according to claim 6, characterized in that, during training of the SRGAN network, the loss function $L_{SR}$ of a training sample is calculated according to the following formula:
where the first term denotes the content loss function of the training sample, the second term denotes the adversarial loss, and $L_{clc}$ denotes the classification loss.
8. The face detection method according to claim 6, characterized in that, during training of the SRGAN network, the Adam optimization algorithm is used to optimize the objective functions of generator G and discriminator D, specifically as follows:
the weight $w_G$ of generator G is updated by descent using the Adam optimization algorithm:
where the gradient term denotes the descent gradient of weight $w_G$; $z_m$ denotes the value of the m-th pixel of the super-resolution reconstructed face image $I^{SR}$, m = 1, 2, …, M, with M the number of pixels; $D(G(z_m))$ denotes the probability, as judged by discriminator D, that the m-th pixel of $I^{SR}$ is a pixel of the high-resolution face image $I^{HR}$; α denotes the learning rate; $\beta_1$ denotes the exponential decay rate of the first-moment estimate; and $\beta_2$ denotes the exponential decay rate of the second-moment estimate;
the weight $w_D$ of discriminator D is updated by descent using the Adam optimization algorithm:
where the gradient term denotes the descent gradient of weight $w_D$; $x_m$ denotes the value of the m-th pixel of the high-resolution face image $I^{HR}$; $D(x_m)$ denotes the probability, as judged by discriminator D, that the m-th pixel of $I^{HR}$ is a pixel of the high-resolution face image $I^{HR}$; $\mu_m = m/M$; and the remaining symbols denote a descent gradient and the probability that discriminator D judges the corresponding sample to be a pixel of the high-resolution face image $I^{HR}$.
9. The super-resolution face image method according to claim 8, characterized in that, when the objective functions of generator G and discriminator D are optimized, the weight $w_G$ of generator G and the weight $w_D$ of discriminator D are updated alternately.
CN201910455695.5A 2019-05-29 2019-05-29 Face detection method based on two-stage detection Active CN110189255B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910455695.5A CN110189255B (en) 2019-05-29 2019-05-29 Face detection method based on two-stage detection

Publications (2)

Publication Number Publication Date
CN110189255A true CN110189255A (en) 2019-08-30
CN110189255B CN110189255B (en) 2023-01-17

Family

ID=67718558

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910455695.5A Active CN110189255B (en) 2019-05-29 2019-05-29 Face detection method based on two-stage detection

Country Status (1)

Country Link
CN (1) CN110189255B (en)

Also Published As

Publication number Publication date
CN110189255B (en) 2023-01-17

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant