CN110490170A - Face candidate frame extraction method - Google Patents

Face candidate frame extraction method

Info

Publication number
CN110490170A
Authority
CN
China
Prior art keywords
frame
anchor frame
face
candidate
label
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910793249.5A
Other languages
Chinese (zh)
Other versions
CN110490170B (en)
Inventor
刘中秋
陈高曙
祝江威
张伟
唐松鹤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Miaxis Biometrics Co Ltd
Original Assignee
Miaxis Biometrics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Miaxis Biometrics Co Ltd filed Critical Miaxis Biometrics Co Ltd
Priority to CN201910793249.5A
Publication of CN110490170A
Application granted
Publication of CN110490170B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 - Detection; Localisation; Normalisation

Abstract

The face candidate frame extraction method provided by the invention mainly comprises two steps: step 1 is a candidate network training process; step 2 is a candidate frame extraction process. Step 1 comprises preparing 3 classes of training data and training the candidate network with these data as training samples, the training data being, respectively, slice images, labels of the anchor frames corresponding to the slice images, and true frame regression vectors. Step 2 performs face candidate frame extraction on an image to be detected based on the candidate network trained in step 1; the candidate network is a fully convolutional neural network. By combining multiple anchor frames with an image pyramid, the invention reduces the number of image pyramid layers and greatly improves running speed while guaranteeing accuracy, balancing accuracy and real-time performance, so that the invention has good practicability.

Description

Face candidate frame extraction method
[technical field]
The present invention relates to the field of face detection technology, and more particularly to a face candidate frame extraction method.
[background technique]
Face detection is a basic step in computer vision applications and plays a decisive role in the performance of subsequent related applications. Face detection is a computer technique that finds, in arbitrary input information (such as images or video), information about faces such as their position, size and pose; among these, the position and size of a face are the most basic face information that detection must obtain, and they are usually expressed with a face candidate frame.
Common face detection methods currently proceed as follows: first, an image pyramid operation is performed on the original image to obtain multi-layer pyramid images; second, a candidate network is used to extract possible face candidate frames in every pyramid layer; finally, one or more stages of refinement networks screen the candidate frames and correct their coordinates to obtain the final face candidate frames. In such methods, candidate frame extraction must process every pyramid layer, which is computationally heavy, takes a long time, extracts slowly, and lacks practicability.
[summary of the invention]
The present invention aims to provide a face candidate frame extraction method that effectively reduces the amount of computation and improves the extraction speed of face candidate frames.
The purpose of the present invention is achieved through the following technical solution:
A face candidate frame extraction method mainly comprises two steps. Step 1: a candidate network training process. Step 2: a candidate frame extraction process, in which face candidate frames are extracted from an image to be detected based on the candidate network trained in step 1.
Step 1: prepare 3 classes of training data, namely slice images, labels of the anchor frames corresponding to the slice images, and true frame regression vectors. The corresponding data are obtained as follows:
(1) Slice images:
A square slice whose side length is not less than A is randomly taken from an original image of the training dataset, and its side length is uniformly scaled to A, finally obtaining a slice image of size A*A, as sketched below.
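For illustration only, a minimal sketch of this slicing step, assuming OpenCV and numpy are available and the input image is larger than A on both sides (the helper name and library choice are assumptions, not part of the patent):

```python
import random

import cv2  # assumed here for cropping/resizing; any image library would do
import numpy as np

def random_square_slice(image: np.ndarray, A: int = 24) -> np.ndarray:
    """Randomly crop a square slice with side length >= A, then rescale it to A x A."""
    h, w = image.shape[:2]
    side = random.randint(A, min(h, w))   # side length not less than A
    x = random.randint(0, w - side)       # random top-left corner of the slice
    y = random.randint(0, h - side)
    slice_img = image[y:y + side, x:x + side]
    return cv2.resize(slice_img, (A, A))  # uniform scaling to A x A
```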
(2) Labels of the anchor frames corresponding to slice images:
1) Anchor frames corresponding to a slice image:
On the original image corresponding to a slice image, k square anchor frames centered on the slice image's center point are chosen, with side lengths A*scale^i respectively, where scale is the zoom factor with value range 0 to 1, i takes the values 0, 1, ..., k-1, and k is a positive integer whose value satisfies the stated constraint; the preferred value of k is the maximum value satisfying that constraint. The anchor generation is sketched below.
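As an illustration, a minimal sketch generating these k anchor frames (the corner-tuple format and function name are assumptions for illustration):

```python
def slice_anchors(cx: float, cy: float, A: float = 24, scale: float = 0.7937, k: int = 4):
    """Return k square anchor frames centered at the slice center (cx, cy)
    on the original image; the i-th anchor has side length A * scale**i."""
    anchors = []
    for i in range(k):
        s = A * scale ** i
        # anchor stored as (x1, y1, x2, y2) corners
        anchors.append((cx - s / 2, cy - s / 2, cx + s / 2, cy + s / 2))
    return anchors
```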
2) Anchor frame labels:
The IOU value of each anchor frame with the real face frames in the corresponding original image is calculated as IOU = S_inter / S_union, where S_inter is the area of the intersection of the anchor frame and a real face frame and S_union is the area of their union. Each anchor frame is assigned a label according to its maximum IOU value; the label classes are positive sample labels, part sample labels and negative sample labels. The IOU computation is sketched below.
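A minimal sketch of this IOU computation for two axis-aligned frames in (x1, y1, x2, y2) corner format (the format and name are illustrative assumptions):

```python
def iou(box_a, box_g):
    """IOU = S_inter / S_union of an anchor frame and a real face frame."""
    x1, y1 = max(box_a[0], box_g[0]), max(box_a[1], box_g[1])
    x2, y2 = min(box_a[2], box_g[2]), min(box_a[3], box_g[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)           # S_inter
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_g = (box_g[2] - box_g[0]) * (box_g[3] - box_g[1])
    union = area_a + area_g - inter                         # S_union
    return inter / union if union > 0 else 0.0
```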
(3) True frame regression vectors:
For each anchor frame assigned a positive sample label or a part sample label in (2) above, the true frame regression vector of the real face frame with the maximum IOU value relative to that anchor frame is calculated as follows:

t_x = (x_g - x_a) / w_a
t_y = (y_g - y_a) / h_a
t_w = log(w_g / w_a)
t_h = log(h_g / h_a)

where x_g, y_g are the center point coordinates of the real face frame, x_a, y_a are the center point coordinates of the corresponding anchor frame, w_g, h_g are the width and height of the real face frame, and w_a, h_a are the width and height of the corresponding anchor frame. A helper computing this target is sketched below.
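A small helper computing this target, under the assumption that both frames are given as (center x, center y, width, height) tuples and that the standard center/size parameterisation written above applies (the patent's original formula is an image not reproduced here; this is the standard form consistent with the variables it defines):

```python
import math

def regression_target(gt, anchor):
    """True frame regression vector of a real face frame w.r.t. an anchor frame.
    Both frames are (cx, cy, w, h) tuples."""
    xg, yg, wg, hg = gt
    xa, ya, wa, ha = anchor
    return ((xg - xa) / wa,     # t_x
            (yg - ya) / ha,     # t_y
            math.log(wg / wa),  # t_w
            math.log(hg / ha))  # t_h
```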
The 3 classes of training data obtained above are used as training samples to train the candidate network; the training termination condition is reaching the set number of iterations or loss convergence.
Step 2: the candidate frame extraction process; its detailed flow is as follows:
(a) Image pyramid processing is performed on the original image to be detected, obtaining multi-layer pyramid images;
(b) Starting from the k-th pyramid layer, every j-th pyramid image is input into the trained candidate network for computation, obtaining a face confidence feature map and a predicted frame regression vector feature map, where j is a positive integer less than or equal to k;
(c) The pixels whose face confidence in the face confidence feature map of (b) is greater than or equal to a threshold T are selected;
(d) For each pixel selected in (c), its corresponding anchor frame on the original image, i.e. a primary anchor frame, is obtained;
(e) For the primary anchor frames obtained in (d), redundant anchor frames are first removed using non-maximum suppression, and the coordinates of the remaining anchor frames are then corrected using the corresponding predicted frame regression vectors, finally obtaining the face candidate frames; a sketch of this flow is given below.
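A minimal end-to-end sketch of steps (a) to (e), assuming the caller supplies a build_pyramid helper, a candidate_net callable returning the two feature maps, and a pixel_to_anchor mapping from feature-map pixels back to anchor frames on the original image (all hypothetical interfaces; the patent does not specify them). It reuses the iou helper sketched earlier:

```python
import numpy as np

def apply_regression(anchor, t):
    """Refine an (x1, y1, x2, y2) anchor frame by inverting the
    center/size parameterisation used for the regression targets."""
    xa, ya = (anchor[0] + anchor[2]) / 2.0, (anchor[1] + anchor[3]) / 2.0
    wa, ha = anchor[2] - anchor[0], anchor[3] - anchor[1]
    xg, yg = xa + t[0] * wa, ya + t[1] * ha
    wg, hg = wa * np.exp(t[2]), ha * np.exp(t[3])
    return (xg - wg / 2, yg - hg / 2, xg + wg / 2, yg + hg / 2)

def nms(boxes, scores, thresh=0.5):
    """Greedy non-maximum suppression; returns indices of the kept frames."""
    order = list(np.argsort(scores)[::-1])
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if iou(boxes[i], boxes[j]) < thresh]
    return keep

def extract_candidates(image, build_pyramid, candidate_net, pixel_to_anchor,
                       k=4, j=3, T=0.6, nms_thresh=0.5):
    pyramid = build_pyramid(image)                      # (a) multi-layer pyramid
    boxes, scores, regs = [], [], []
    for layer in range(k - 1, len(pyramid), j):         # (b) every j-th layer from the k-th
        conf_map, reg_map = candidate_net(pyramid[layer])
        ys, xs = np.where(conf_map >= T)                # (c) confident pixels
        for y, x in zip(ys, xs):
            boxes.append(pixel_to_anchor(x, y, layer))  # (d) primary anchor frame
            scores.append(conf_map[y, x])
            regs.append(reg_map[:, y, x])               # reg_map assumed shape (4, H, W)
    keep = nms(boxes, scores, nms_thresh)               # (e) remove redundant frames...
    return [apply_regression(boxes[i], regs[i]) for i in keep]  # ...and correct coordinates
```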
The present invention combines multiple anchor frames with an image pyramid, so that only a fraction of the initial pyramid images needs to be processed; this effectively reduces the number of image pyramid layers and greatly improves running speed while guaranteeing accuracy, balancing accuracy and real-time performance, so that the invention has good practicability.
[Description of the drawings]
Fig. 1 is a schematic flow diagram of the candidate network training process of an embodiment of the present invention.
Fig. 2 is a schematic flow diagram of the candidate frame extraction process of an embodiment of the present invention.
Fig. 3 is a schematic structural diagram of the candidate network of an embodiment of the present invention.
[specific embodiment]
To make the purpose, technical solutions and effects of the present invention clearer and more definite, the present invention is further described below with reference to the drawings and an illustrative example.
The face candidate frame extraction method provided by the invention mainly comprises two steps: step 1 is the candidate network training process, as shown in Fig. 1; step 2 is the candidate frame extraction process, as shown in Fig. 2. Step 2 performs face candidate frame extraction on the image to be detected based on the candidate network trained in step 1.
The candidate network is a fully convolutional neural network; its specific network structure is shown in Fig. 3.
The candidate network training process of this embodiment is specifically as follows:
(1) A square slice whose side length is not less than 24 pixels is randomly taken from an original image of the training dataset, and its side length is uniformly scaled to 24, finally obtaining a 24*24-pixel slice image, which serves as one of the training inputs of the candidate network;
(2) With the center point of the slice image obtained in (1) above as the center, k square anchor frames are obtained by scaling, where the empirical value of the zoom factor scale is 0.7937; the side lengths of the square anchor frames are therefore 24*scale^i, with i taking the values 0, 1, ..., k-1 in turn and k equal to 4, giving side lengths of approximately 24, 19.05, 15.12 and 12 pixels;
(3) The IOU value of each anchor frame in (2) with the real face frames in the corresponding original image is calculated as IOU = S_inter / S_union, where S_inter is the area of the intersection of the anchor frame and a real face frame and S_union is the area of their union. Each anchor frame is assigned a label according to its maximum IOU value;
The labeling criteria are as follows: a maximum IOU value greater than or equal to 0.5 assigns the corresponding anchor frame a positive sample label; a maximum IOU value less than 0.5 and greater than or equal to 0.3 assigns the corresponding anchor frame a part sample label; a maximum IOU value less than 0.2 assigns the corresponding anchor frame a negative sample label; the remaining anchor frames are given no label. Positive sample anchor frames may be marked 1 and negative sample anchor frames 0, as sketched below;
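A minimal sketch of this labeling rule (the marks 1 and 0 follow the embodiment; the 'part' and None return values are illustrative stand-ins for part-sample and unlabeled anchors):

```python
def anchor_label(max_iou: float):
    """Label an anchor frame from its maximum IOU with the real face frames."""
    if max_iou >= 0.5:
        return 1        # positive sample label
    if max_iou >= 0.3:
        return "part"   # part sample label
    if max_iou < 0.2:
        return 0        # negative sample label
    return None         # 0.2 <= IOU < 0.3: no label, excluded from training
```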
(4) For each anchor frame assigned a positive sample label or a part sample label in (3), the true frame regression vector of the real face frame with the maximum IOU value relative to that anchor frame is calculated as follows:

t_x = (x_g - x_a) / w_a
t_y = (y_g - y_a) / h_a
t_w = log(w_g / w_a)
t_h = log(h_g / h_a)

where x_g, y_g are the center point coordinates of the real face frame, x_a, y_a are the center point coordinates of the anchor frame, w_g, h_g are the width and height of the real face frame, and w_a, h_a are the width and height of the anchor frame.
(5) The trained candidate network outputs two classes of data: face confidences and predicted frame regression vectors. Accordingly, for face confidence, training uses the anchor frame labels of all positive sample and negative sample anchor frames together with the slice images corresponding to those anchor frames; for the predicted frame regression vectors, training uses the true frame regression vectors of all positive sample and part sample anchor frames together with the slice images corresponding to those anchor frames; this sample selection is sketched below. The training termination condition is reaching the set number of iterations or loss convergence, finally yielding the trained candidate network.
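The following numpy sketch illustrates the sample selection only; the cross-entropy and squared-error terms are illustrative choices (the patent does not name its loss functions), and all identifiers are hypothetical:

```python
import numpy as np

def masked_losses(labels, conf_pred, reg_pred, reg_target):
    """Confidence loss over positive/negative samples only;
    regression loss over positive/part samples only.

    labels: per-sample label, one of 1, 0, 'part', None.
    conf_pred: (N,) predicted face confidences in (0, 1).
    reg_pred, reg_target: (N, 4) regression vectors.
    """
    cls_mask = np.array([l in (0, 1) for l in labels])        # positives + negatives
    reg_mask = np.array([l in (1, "part") for l in labels])   # positives + part samples

    y = np.array([1.0 if l == 1 else 0.0 for l in labels])
    p = np.clip(conf_pred, 1e-7, 1 - 1e-7)
    cls_loss = -(y * np.log(p) + (1 - y) * np.log(1 - p))[cls_mask].mean()

    reg_loss = ((reg_pred - reg_target) ** 2).sum(axis=1)[reg_mask].mean()
    return cls_loss, reg_loss
```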
The candidate frame extraction process of this embodiment can be described as follows:
An original image to be detected with a resolution of 1920*1080 is acquired from the intelligent front end. Image pyramid processing is first performed on the original image, yielding 18 pyramid layers. Then, starting from the 4th pyramid layer, every 3rd pyramid image is input into the trained candidate network, which outputs a face confidence feature map and a predicted frame regression vector feature map. Pixels whose face confidence in the face confidence feature map is greater than or equal to 0.6 are selected, and their corresponding anchor frames on the original image are found as the primary anchor frames. Redundant anchor frames among the primary anchor frames are then removed using non-maximum suppression, and the coordinates of the remaining anchor frames are corrected using the corresponding predicted frame regression vectors, finally obtaining the face candidate frames.
This embodiment combines multiple anchor frames with the image pyramid, reducing the pyramid images to be processed from the initial 18 layers to 6 layers, effectively reducing the number of image pyramid layers.
In tests on an ordinary personal computer, face candidate frame extraction on a 1920*1080 picture takes about 255 ms with the MTCNN face detection method, whereas face candidate frame extraction in this embodiment of the present application takes about 33 ms, greatly improving the extraction speed of face candidate frames and giving good practicability.
The above embodiments are only preferred embodiments of the present invention and are not intended to limit the invention. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications and replacements do not depart the essence of the corresponding technical solutions from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (5)

1. A face candidate frame extraction method, characterized in that it mainly comprises two steps: step 1: a candidate network training process; step 2: a candidate frame extraction process; the step 2 performs face candidate frame extraction on an image to be detected based on the candidate network trained in step 1;
The step 1 comprises: a. preparing 3 classes of training data; b. training the candidate network with the prepared data as training samples; the training data are, respectively, slice images, labels of the anchor frames corresponding to the slice images, and true frame regression vectors, obtained as follows:
(1) Slice images:
A square slice whose side length is not less than A is randomly taken from an original image of the training dataset, and its side length is uniformly scaled to A, finally obtaining a slice image of size A*A;
(2) Labels of the anchor frames corresponding to slice images:
1) Anchor frames corresponding to a slice image:
On the original image corresponding to a slice image, k square anchor frames centered on the slice image's center point are chosen, with side lengths A*scale^i respectively; wherein scale is the zoom factor with value range 0 to 1; i takes the values 0, 1, ..., k-1; and k is a positive integer whose value satisfies the stated condition;
2) Anchor frame labels:
The IOU value of each anchor frame with the real face frames in the corresponding original image is calculated as IOU = S_inter / S_union, wherein S_inter is the area of the intersection of the anchor frame and a real face frame and S_union is the area of their union; each anchor frame is assigned a label according to its maximum IOU value, the label classes being positive sample labels, part sample labels and negative sample labels;
(3) True frame regression vectors:
For each anchor frame assigned a positive sample label or a part sample label in (2) above, the true frame regression vector of the real face frame with the maximum IOU value relative to that anchor frame is calculated as follows:

t_x = (x_g - x_a) / w_a
t_y = (y_g - y_a) / h_a
t_w = log(w_g / w_a)
t_h = log(h_g / h_a)

wherein x_g, y_g are the center point coordinates of the real face frame; x_a, y_a are the center point coordinates of the corresponding anchor frame; w_g, h_g are the width and height of the real face frame; and w_a, h_a are the width and height of the corresponding anchor frame;
The detailed flow of the step 2 is as follows:
(a) performing image pyramid processing on the original image to be detected to obtain multi-layer pyramid images;
(b) starting from the k-th pyramid layer, inputting every j-th pyramid image into the trained candidate network for computation to obtain a face confidence feature map and a predicted frame regression vector feature map, wherein j is a positive integer less than or equal to k;
(c) selecting the pixels whose face confidence in the face confidence feature map of (b) is greater than or equal to a threshold T;
(d) for each pixel selected in (c), obtaining its corresponding anchor frame on the original image, namely a primary anchor frame;
(e) for the primary anchor frames obtained in (d), first removing redundant anchor frames using non-maximum suppression, then correcting the coordinates of the remaining anchor frames using the corresponding predicted frame regression vectors, finally obtaining the face candidate frames.
2. The face candidate frame extraction method according to claim 1, characterized in that the candidate network is a fully convolutional neural network.
3. The face candidate frame extraction method according to claim 1, characterized in that k is the maximum value satisfying the condition stated in claim 1.
4. The face candidate frame extraction method according to claim 1, characterized in that the threshold T is 0.6.
5. The face candidate frame extraction method according to claim 1, characterized in that a maximum IOU value greater than or equal to 0.5 assigns the corresponding anchor frame a positive sample label; a maximum IOU value less than 0.5 and greater than or equal to 0.3 assigns the corresponding anchor frame a part sample label; a maximum IOU value less than 0.2 assigns the corresponding anchor frame a negative sample label; and the remaining anchor frames are not labeled.
CN201910793249.5A 2019-08-27 2019-08-27 Face candidate frame extraction method Active CN110490170B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910793249.5A CN110490170B (en) 2019-08-27 2019-08-27 Face candidate frame extraction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910793249.5A CN110490170B (en) 2019-08-27 2019-08-27 Face candidate frame extraction method

Publications (2)

Publication Number Publication Date
CN110490170A true CN110490170A (en) 2019-11-22
CN110490170B CN110490170B (en) 2023-01-06

Family

ID=68554292

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910793249.5A Active CN110490170B (en) 2019-08-27 2019-08-27 Face candidate frame extraction method

Country Status (1)

Country Link
CN (1) CN110490170B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111046792A (en) * 2019-12-11 2020-04-21 三一重工股份有限公司 Face detection method and device, electronic equipment and computer readable storage medium
CN114299283A (en) * 2022-03-07 2022-04-08 科大天工智能装备技术(天津)有限公司 Image-based displacement measurement method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170206431A1 (en) * 2016-01-20 2017-07-20 Microsoft Technology Licensing, Llc Object detection and classification in images
CN107871134A (en) * 2016-09-23 2018-04-03 北京眼神科技有限公司 A kind of method for detecting human face and device
CN109919097A (en) * 2019-03-08 2019-06-21 中国科学院自动化研究所 Face and key point combined detection system, method based on multi-task learning

Also Published As

Publication number Publication date
CN110490170B (en) 2023-01-06

Similar Documents

Publication Publication Date Title
CN106960195B (en) Crowd counting method and device based on deep learning
CN105608456B (en) A kind of multi-direction Method for text detection based on full convolutional network
WO2019140767A1 (en) Recognition system for security check and control method thereof
US20180018503A1 (en) Method, terminal, and storage medium for tracking facial critical area
CN110287960A (en) The detection recognition method of curve text in natural scene image
CN110287826B (en) Video target detection method based on attention mechanism
CN110263712B (en) Coarse and fine pedestrian detection method based on region candidates
CN105844621A (en) Method for detecting quality of printed matter
CN109583340A (en) A kind of video object detection method based on deep learning
CN107624189A (en) Method and apparatus for generating forecast model
WO2009109127A1 (en) Real-time body segmentation system
CN110298297A (en) Flame identification method and device
CN105261021B (en) Remove the method and device of foreground detection result shade
CN111027377B (en) Double-flow neural network time sequence action positioning method
JP2006524394A (en) Delineation of human contours in images
CN113536972B (en) Self-supervision cross-domain crowd counting method based on target domain pseudo label
CN110349167A (en) A kind of image instance dividing method and device
EP3073443A1 (en) 3D Saliency map
CN110399882A (en) A kind of character detecting method based on deformable convolutional neural networks
CN109360191B (en) Image significance detection method based on variational self-encoder
CN109800756A (en) A kind of text detection recognition methods for the intensive text of Chinese historical document
CN113706481A (en) Sperm quality detection method, sperm quality detection device, computer equipment and storage medium
Zhu et al. Towards automatic wild animal detection in low quality camera-trap images using two-channeled perceiving residual pyramid networks
CN113221956A (en) Target identification method and device based on improved multi-scale depth model
CN110490170A (en) A kind of face candidate frame extracting method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant