CN109409303A - A cascaded multi-task face detection and registration method based on deep learning - Google Patents

A cascaded multi-task face detection and registration method based on deep learning Download PDF

Info

Publication number
CN109409303A
CN109409303A CN201811287109.2A CN201811287109A
Authority
CN
China
Prior art keywords
face
network
image size
registration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811287109.2A
Other languages
Chinese (zh)
Inventor
刘青山 (Liu Qingshan)
蔡珍妮 (Cai Zhenni)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Information Science and Technology
Original Assignee
Nanjing University of Information Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Information Science and Technology filed Critical Nanjing University of Information Science and Technology
Priority to CN201811287109.2A priority Critical patent/CN109409303A/en
Publication of CN109409303A publication Critical patent/CN109409303A/en
Pending legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G06T7/344Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods involving models
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face

Abstract

The present invention provides a cascaded multi-task face detection and registration method based on deep learning, comprising the following steps. Step 1: adjust the image size to form an image pyramid. Step 2: feed the images of different sizes obtained in step 1 into the P-Net network, which predicts face candidate boxes on the original image. Step 3: resize the image in each face box obtained in step 2 to n × n and input it into the R-Net network, which classifies each box as face or non-face and performs bounding-box regression. Step 4: resize the image in each face box obtained in step 3 to m × m and input it into the O-Net network, which again classifies face or non-face, performs bounding-box regression, and simultaneously outputs the coordinates of the left and right eye centers, the nose tip, and the left and right mouth corners. Step 5: resize the image in each face box obtained in step 4 to k × k and input it into the L-Net network, which finally outputs the three head-pose angles of the original image and the positions of several facial landmarks.

Description

A cascaded multi-task face detection and registration method based on deep learning
Technical field
The invention belongs to the field of face detection technology, and in particular relates to a cascaded multi-task face detection and registration method based on deep learning.
Background technique
Face detection and registration are critical to many face-related applications, such as face editing, face recognition, and facial expression analysis. In the real world, however, factors such as illumination, scale, and pose variation make face detection and registration difficult.
In face detection, a classical approach is the Viola-Jones (VJ) detector proposed by Paul Viola and Michael Jones. This method introduced the integral image to compute Haar-like features quickly, used the AdaBoost learning algorithm for feature selection and classifier training, and combined weak classifiers into a strong classifier.
Face registration methods mostly adopt a regression framework; a representative example is the supervised descent method. This method solves a nonlinear minimization problem: starting from an initial estimate of the landmark positions, it regresses SIFT features extracted at those points to obtain updated landmark positions, and then iterates the regression on the newly obtained points until the positions converge to those closest to the true landmarks.
Summary of the invention
It is an object of the invention, in view of the drawbacks of the prior art, to provide a cascaded multi-task face detection and registration method based on deep learning, which has the advantages of a small model, high speed, and good robustness to external factors such as illumination and pose variation.
The technical scheme of the invention is as follows: a cascaded multi-task face detection and registration method based on deep learning comprises the following steps. Step 1: adjust the image size to form an image pyramid. Step 2: feed the images of different sizes obtained in step 1 into the P-Net network, which predicts face candidate boxes on the original image. Step 3: resize the image in each face box obtained in step 2 to n × n and input it into the R-Net network, which classifies each box as face or non-face and performs bounding-box regression, where n is a positive integer. Step 4: resize the image in each face box obtained in step 3 to m × m and input it into the O-Net network, which classifies face or non-face, performs bounding-box regression, and simultaneously outputs the coordinates of the left and right eye centers, the nose tip, and the left and right mouth corners, where m is a positive integer. Step 5: resize the image in each face box obtained in step 4 to k × k and input it into the L-Net network, which finally outputs the three head-pose angles of the original image and the positions of several landmarks, where k is a positive integer.
Preferably, the P-Net network in step 2 is a fully convolutional neural network.
Preferably, the three head-pose angles in step 5 are yaw, pitch, and roll, which represent left-right rotation, up-down rotation, and in-plane rotation, respectively.
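As an illustrative sketch outside the patent text, the three angles can be composed into a head rotation matrix; the axis convention and composition order used here are assumptions, not specified by the invention:

```python
import math

def head_rotation_matrix(yaw, pitch, roll):
    """Compose yaw (left-right), pitch (up-down), and roll (in-plane)
    angles, given in radians, into a 3x3 rotation matrix R = Rz * Ry * Rx.
    The axis convention is an illustrative assumption."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    # Yaw: rotation about the vertical axis
    Ry = [[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]]
    # Pitch: rotation about the horizontal axis
    Rx = [[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]]
    # Roll: in-plane rotation about the camera axis
    Rz = [[cr, -sr, 0.0], [sr, cr, 0.0], [0.0, 0.0, 1.0]]

    def matmul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
                for i in range(3)]

    return matmul(Rz, matmul(Ry, Rx))
```

With all three angles at zero, the matrix reduces to the identity, corresponding to a frontal head pose.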
The technical solution provided by the invention has the following beneficial effects:
1. The cascaded multi-task face detection and registration method based on deep learning exploits the intrinsic connection between face detection and face registration: a single network simultaneously outputs the face location and the landmark positions, which improves prediction performance.
2. The method exploits the intrinsic connection between the three head-pose angles and the facial landmarks: a single network simultaneously outputs the head-pose angles and the landmark positions, which improves prediction performance.
3. The method uses four shallow neural networks, cascaded to predict the face box, the landmark positions, and the three head-pose angles in a coarse-to-fine manner; the four trained models are very small, and prediction is fast.
Detailed description of the invention
Fig. 1 is a flowchart of the cascaded multi-task face detection and registration method based on deep learning provided by an embodiment of the invention;
Fig. 2 is a structural diagram of the P-Net network in the method shown in Fig. 1;
Fig. 3 is a structural diagram of the R-Net network in the method shown in Fig. 1;
Fig. 4 is a structural diagram of the O-Net network in the method shown in Fig. 1;
Fig. 5 is a structural diagram of the L-Net network in the method shown in Fig. 1.
Specific embodiment
In order to make the objectives, technical solutions, and advantages of the present invention clearer, the invention is further described below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit it.
Although the steps in the present invention are numbered, the numbering is not intended to limit their order; unless the order of steps is explicitly stated or the execution of one step requires another, the relative order of the steps is adjustable. It should be understood that the term "and/or" used herein covers any and all possible combinations of one or more of the associated listed items.
As shown in Fig. 1, the cascaded multi-task face detection and registration method based on deep learning provided by an embodiment of the invention comprises the following steps:
Step 1: preprocessing the original image
For example, given the original image size, a scale factor of 0.709 is chosen and the image is repeatedly downscaled until it approaches 12 × 12, forming an image pyramid.
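The pyramid construction above can be sketched as follows; only the 0.709 scale factor and the 12-pixel target come from the text, while the function name and the stopping rule on the smaller side are illustrative assumptions:

```python
def pyramid_scales(width, height, min_size=12, factor=0.709):
    """Return the list of scale factors that shrink the original image
    step by step until its smaller side would drop below min_size."""
    scales = []
    scale = 1.0
    smaller = min(width, height)
    while smaller * scale >= min_size:
        scales.append(scale)
        scale *= factor  # each pyramid level is 0.709x the previous one
    return scales
```

Each scale in the returned list corresponds to one pyramid level that is later fed to P-Net.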
Step 2: predicting face candidate boxes on the original image
The images of different sizes obtained in step 1 are each fed into the P-Net network, which outputs candidate box positions on the original image. The P-Net network is a fully convolutional neural network; as shown in Fig. 2, in the P-Net network Conv denotes convolution with stride 1, and MP denotes max pooling with stride 2.
During training, face classification uses the cross-entropy loss and bounding-box regression uses the Euclidean distance loss; the two are combined in a 2:1 ratio to form the total loss of the P-Net network.
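A minimal pure-Python sketch of this weighted combination; only the cross-entropy/Euclidean choice and the 2:1 ratio come from the text, and all function names are illustrative:

```python
import math

def cross_entropy(p_face, is_face):
    """Binary cross-entropy for the face/non-face label."""
    eps = 1e-12  # guard against log(0)
    return -math.log(p_face + eps) if is_face else -math.log(1.0 - p_face + eps)

def euclidean_loss(pred, target):
    """Squared Euclidean distance between predicted and target box offsets."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))

def pnet_loss(p_face, is_face, box_pred, box_target, w_cls=2.0, w_box=1.0):
    """Total P-Net training loss: classification and box regression in a 2:1 ratio."""
    return (w_cls * cross_entropy(p_face, is_face)
            + w_box * euclidean_loss(box_pred, box_target))
```

The same pattern, with different weights, applies to the R-Net (2:1), O-Net (2:1:2), and L-Net (100:1) losses described below.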
Step 3: judging face or non-face and fine-tuning the face box positions
The image in each face box obtained in step 2 is resized to n × n and input into the R-Net network, which classifies each box as face or non-face and performs bounding-box regression, where n is a positive integer.
For example, the image in each face box obtained in step 2 is resized to 24 × 24 and input into the R-Net network, which performs face/non-face classification and bounding-box regression. The R-Net network is a fully convolutional neural network; as shown in Fig. 3, in the R-Net network Conv denotes convolution with stride 1, and MP denotes max pooling with stride 2.
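The bounding-box regression step can be sketched as follows: the network predicts corner offsets normalized by the box size, which are applied to refine the candidate box. The offset parameterization here is a common convention and an assumption, not a detail given in the text:

```python
def refine_box(box, offsets):
    """Apply regression offsets (dx1, dy1, dx2, dy2), expressed as fractions
    of the box width/height, to a candidate box (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    dx1, dy1, dx2, dy2 = offsets
    # Shift each corner proportionally to the box dimensions
    return (x1 + dx1 * w, y1 + dy1 * h, x2 + dx2 * w, y2 + dy2 * h)
```

For instance, offsets of (0.1, 0.1, -0.1, -0.1) shrink a box by 10% on each side.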
During training, face classification uses the cross-entropy loss and bounding-box regression uses the Euclidean distance loss; the two are combined in a 2:1 ratio to form the total loss of the R-Net network.
Step 4: further judging face or non-face, fine-tuning the box, and predicting the positions of several landmarks
The image in each face box obtained in step 3 is resized to m × m and input into the O-Net network, which classifies each box as face or non-face, performs bounding-box regression, and simultaneously outputs the coordinates of the left and right eye centers, the nose tip, and the left and right mouth corners, where m is a positive integer.
For example, the image in each face box obtained in step 3 is resized to 48 × 48 and input into the O-Net network, which performs face/non-face classification and bounding-box regression, and simultaneously outputs the coordinates of five points: the left and right eye centers, the nose tip, and the left and right mouth corners. The O-Net network is a fully convolutional neural network; as shown in Fig. 4, in the O-Net network Conv denotes convolution with stride 1, and MP denotes max pooling with stride 2.
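Since O-Net sees only the 48 × 48 crop, its landmark outputs must be mapped back to original-image coordinates. A sketch of this mapping, assuming (as is common but not stated in the text) that landmarks are predicted in coordinates normalized to the face box:

```python
def landmarks_to_image(landmarks, box):
    """Map landmarks predicted in normalized box coordinates ([0, 1] x [0, 1])
    back to coordinates in the original image, given the face box (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    # Scale each normalized point by the box size and offset by the box origin
    return [(x1 + lx * w, y1 + ly * h) for lx, ly in landmarks]
```

The same mapping would apply to the 68 landmarks produced by L-Net in step 5.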
During training, face classification uses the cross-entropy loss, while bounding-box regression and landmark localization both use the Euclidean distance loss; the three are combined in a 2:1:2 ratio to form the total loss of the O-Net network.
Step 5: outputting the three head-pose angles and the positions of several landmarks
The image in each face box obtained in step 4 is resized to k × k and input into the L-Net network, which finally outputs the three head-pose angles of the original image and the positions of several landmarks, where k is a positive integer.
For example, the image in each face box obtained in step 4 is resized to 48 × 48 and input into the L-Net network, which outputs the three head-pose angles of the original image and the positions of 68 landmarks. The L-Net network is a fully convolutional neural network; as shown in Fig. 5, in the L-Net network Conv denotes convolution with stride 1, and MP denotes max pooling with stride 2.
During training, facial landmark localization and head-pose estimation both use the Euclidean distance loss; to obtain more accurate landmark localization, the two are combined in a 100:1 ratio to form the total loss of the L-Net network.
It is obvious to a person skilled in the art that the invention is not limited to the details of the above exemplary embodiments, and that the invention can be realized in other specific forms without departing from its spirit or essential attributes. Therefore, from whatever point of view, the embodiments are to be considered illustrative and not restrictive; the scope of the invention is defined by the appended claims rather than by the above description, and all changes falling within the meaning and scope of the equivalents of the claims are intended to be embraced by the invention. Any reference signs in the claims should not be construed as limiting the claims involved.
In addition, it should be understood that although this specification is described in terms of embodiments, not every embodiment contains only one independent technical solution; this manner of description is adopted merely for clarity. Those skilled in the art should consider the specification as a whole; the technical solutions in the various embodiments may also be suitably combined to form other embodiments understandable to those skilled in the art.

Claims (3)

1. A cascaded multi-task face detection and registration method based on deep learning, characterized by comprising the following steps:
Step 1: adjusting the image size to form an image pyramid;
Step 2: feeding the images of different sizes obtained in step 1 into the P-Net network, which predicts face candidate boxes on the original image;
Step 3: resizing the image in each face box obtained in step 2 to n × n and inputting it into the R-Net network, which classifies each box as face or non-face and performs bounding-box regression, wherein n is a positive integer;
Step 4: resizing the image in each face box obtained in step 3 to m × m and inputting it into the O-Net network, which classifies each box as face or non-face, performs bounding-box regression, and simultaneously outputs the coordinates of the left and right eye centers, the nose tip, and the left and right mouth corners, wherein m is a positive integer;
Step 5: resizing the image in each face box obtained in step 4 to k × k and inputting it into the L-Net network, which finally outputs the three head-pose angles of the original image and the positions of several landmarks, wherein k is a positive integer.
2. The cascaded multi-task face detection and registration method based on deep learning according to claim 1, characterized in that the P-Net network in step 2 is a fully convolutional neural network.
3. The cascaded multi-task face detection and registration method based on deep learning according to claim 1, characterized in that the three head-pose angles in step 5 are yaw, pitch, and roll, which represent left-right rotation, up-down rotation, and in-plane rotation, respectively.
CN201811287109.2A 2018-10-31 2018-10-31 A cascaded multi-task face detection and registration method based on deep learning Pending CN109409303A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811287109.2A CN109409303A (en) 2018-10-31 2018-10-31 A cascaded multi-task face detection and registration method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811287109.2A CN109409303A (en) 2018-10-31 2018-10-31 A cascaded multi-task face detection and registration method based on deep learning

Publications (1)

Publication Number Publication Date
CN109409303A true CN109409303A (en) 2019-03-01

Family

ID=65470723

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811287109.2A Pending CN109409303A (en) 2018-10-31 2018-10-31 A cascaded multi-task face detection and registration method based on deep learning

Country Status (1)

Country Link
CN (1) CN109409303A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110175504A * 2019-04-08 2019-08-27 杭州电子科技大学 A target detection and alignment method based on a multi-task cascaded convolutional network
CN110458005A * 2019-07-02 2019-11-15 重庆邮电大学 A rotation-invariant face detection method based on a multi-task progressive registration network
CN111652020A * 2019-04-16 2020-09-11 上海铼锶信息技术有限公司 Method for identifying the rotation angle of a human face around the Z axis
CN111738934A * 2020-05-15 2020-10-02 西安工程大学 MTCNN-based automatic red-eye repair method
WO2024050827A1 (en) * 2022-09-09 2024-03-14 Intel Corporation Enhanced image and video object detection using multi-stage paradigm

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107330251A * 2017-06-10 2017-11-07 华南理工大学 A wind power prediction method based on a retrieval method
CN107895150A * 2016-11-30 2018-04-10 奥瞳系统科技有限公司 Face detection and head-pose angle estimation based on a small-scale convolutional neural network module for embedded systems
CN108564029A * 2018-04-12 2018-09-21 厦门大学 Facial attribute recognition method based on a cascaded multi-task learning deep neural network

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107895150A * 2016-11-30 2018-04-10 奥瞳系统科技有限公司 Face detection and head-pose angle estimation based on a small-scale convolutional neural network module for embedded systems
CN107330251A * 2017-06-10 2017-11-07 华南理工大学 A wind power prediction method based on a retrieval method
CN108564029A * 2018-04-12 2018-09-21 厦门大学 Facial attribute recognition method based on a cascaded multi-task learning deep neural network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HAO WU et al.: "Simultaneous Face Detection and Pose Estimation Using Convolutional Neural Network Cascade", Digital Object Identifier 10.1109/ACCESS.2018.2869465 *
KAIPENG ZHANG et al.: "Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks", IEEE Xplore *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110175504A * 2019-04-08 2019-08-27 杭州电子科技大学 A target detection and alignment method based on a multi-task cascaded convolutional network
CN111652020A * 2019-04-16 2020-09-11 上海铼锶信息技术有限公司 Method for identifying the rotation angle of a human face around the Z axis
CN111652020B * 2019-04-16 2023-07-11 上海铼锶信息技术有限公司 Method for identifying the rotation angle of a human face around the Z axis
CN110458005A * 2019-07-02 2019-11-15 重庆邮电大学 A rotation-invariant face detection method based on a multi-task progressive registration network
CN110458005B * 2019-07-02 2022-12-27 重庆邮电大学 Rotation-invariant face detection method based on a multi-task progressive registration network
CN111738934A * 2020-05-15 2020-10-02 西安工程大学 MTCNN-based automatic red-eye repair method
CN111738934B * 2020-05-15 2024-04-02 西安工程大学 MTCNN-based automatic red-eye repair method
WO2024050827A1 (en) * 2022-09-09 2024-03-14 Intel Corporation Enhanced image and video object detection using multi-stage paradigm

Similar Documents

Publication Publication Date Title
CN109409303A (en) A cascaded multi-task face detection and registration method based on deep learning
CN105868716B (en) A face recognition method based on facial geometric features
Liu et al. Recognizing human actions using multiple features
WO2018107979A1 (en) Multi-pose facial landmark detection method based on cascaded regression
CN107610209A (en) Facial expression synthesis method, apparatus, storage medium, and computer device
CN108171133B (en) Dynamic gesture recognition method based on characteristic covariance matrix
CN103971112B (en) Image characteristic extracting method and device
CN102938065A (en) Facial feature extraction method and face recognition method based on large-scale image data
Ashwin et al. An e-learning system with multifacial emotion recognition using supervised machine learning
KR102138809B1 (en) 2d landmark feature synthesis and facial expression strength determination for micro-facial expression detection
CN107871107A (en) Face authentication method and device
CN107704848A (en) A dense face alignment method based on multi-constraint convolutional neural networks
CN103544478A (en) All-dimensional face detection method and system
Tang et al. Facial expression recognition using AAM and local facial features
Banerjee et al. Learning unseen emotions from gestures via semantically-conditioned zero-shot perception with adversarial autoencoders
Larochelle Few-shot learning
Song et al. A design for integrated face and facial expression recognition
Yao et al. Dynamicbev: Leveraging dynamic queries and temporal context for 3d object detection
Patil et al. Emotion recognition from 3D videos using optical flow method
Luo et al. Dynamic face recognition system in recognizing facial expressions for service robotics
CN105574494B (en) Multi-classifier gesture recognition method and device
Güney et al. Cross-pose facial expression recognition
Yun et al. Head pose classification by multi-class AdaBoost with fusion of RGB and depth images
Cortés et al. A new bag of visual words encoding method for human action recognition
Zientara et al. Drones as collaborative sensors for image recognition

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
CB02 Change of applicant information

Address after: 210044 No. 219 Ningliu Road, Jiangbei New District, Nanjing City, Jiangsu Province

Applicant after: Nanjing University of Information Science and Technology

Address before: 211500 Yuting Square, 59 Wangqiao Road, Liuhe District, Nanjing City, Jiangsu Province

Applicant before: Nanjing University of Information Science and Technology

RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20190301