CN110458005B - Rotation-invariant face detection method based on multitask progressive registration network

Info

Publication number: CN110458005B (grant); CN110458005A (application)
Application number: CN201910590187.8A
Authority: CN (China)
Prior art keywords: face, image, network, rotation, layer
Legal status: Active (granted)
Original language: Chinese (zh)
Inventors: 周丽芳 (Zhou Lifang), 谷雨 (Gu Yu), 雷帮军 (Lei Bangjun), 李伟生 (Li Weisheng)
Applicant and current assignee: Chongqing University of Posts and Telecommunications
Priority/filing date: 2019-07-02
Application published: 2019-11-15 (CN110458005A); granted: 2022-12-27 (CN110458005B)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • G06V40/165Detection; Localisation; Normalisation using facial parts and geometric relationships
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification

Abstract

The invention discloses a rotation-invariant face detection method based on a multitask progressive registration network, belonging to the field of computer vision. The method mainly comprises the following steps: preprocessing the images, and constructing and training a cascaded multilayer convolutional neural network; inputting a test image, generating a set of images at different resolutions by means of an image pyramid, and feeding the set into the cascaded multilayer convolutional neural network for detection; at each stage of the network, filtering out part of the non-face windows, adjusting the candidate-box positions according to the box regression results, and predicting the face rotation angle; and then registering by flipping the image according to the predicted rotation angle. The invention achieves real-time, rotation-adaptive face detection through the multitask progressive network registration method, with good results in both accuracy and speed.

Description

Rotation-invariant face detection method based on multitask progressive registration network
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a rotation-invariant face detection method based on a convolutional neural network.
Background
Images containing faces are indispensable to vision-based human-computer interaction: face detection supplies rich visual information for the intelligent analysis of a target and can be used to identify objects of interest in an image. Face detection has accordingly become an unavoidable, fundamental problem in image processing, computer vision and pattern recognition, and has attracted wide attention from researchers. Progress in face detection underpins many problems in computer vision and pattern recognition, such as face recognition, video tracking, head pose estimation and gender recognition.
Research on detecting human faces by computer vision has been under way for decades, yet the performance of many face detection algorithms still falls short of practical requirements. Compared with a controlled environment, faces in real scenes show far greater variation in appearance: in a controlled environment the face is essentially upright, with only slight geometric deformation of the head, whereas in real scenes the face pose is more complex, its most prominent characteristic being an uncertain in-plane rotation angle between the face and the imaging device. An important shortcoming of typical existing DCNN face detection networks is their poor robustness to such image rotation and scale changes.
Disclosure of Invention
In view of the above shortcomings of the prior art, the present invention provides a face detection method that is robust to changes of the in-plane rotation angle.
The technical scheme adopted by the invention to achieve this aim is as follows. A rotation-invariant face detection method based on a multitask progressive registration network comprises the following steps:
S1, preprocessing the images, and constructing and training a cascaded multilayer convolutional neural network;
S2, inputting a test image, generating a set of images at different resolutions by means of an image pyramid, and feeding the set into the cascaded multilayer convolutional neural network to start detection;
S3, each layer of the convolutional neural network filtering out part of the non-face windows, adjusting the candidate-box positions according to a box regression result, and simultaneously predicting the face rotation angle;
and S4, registering through an image flipping operation according to the predicted rotation angle, and judging the registered image to be a face image.
Further, the image preprocessing comprises (a minimal sketch of this augmentation appears after this list):
A1, rotating the WIDER FACE dataset images by arbitrary angles to generate a large number of face images containing in-plane rotation-angle changes, with the face position information rotated and changed correspondingly;
and A2, randomly rotating the LFW dataset images by arbitrary angles to generate a large number of face images containing in-plane rotation-angle changes, with the facial key-point position information rotated and changed correspondingly.
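For concreteness, the augmentation in A1 and A2 can be sketched as follows. This is a minimal illustration, assuming OpenCV-style images and (x, y) pixel-coordinate annotations; the function name and the loading of the WIDER FACE / LFW samples are placeholders, not part of the patent.

```python
import cv2
import numpy as np

def rotate_sample(image, points, angle_deg):
    """Rotate an image about its centre and apply the same affine
    transform to its (N, 2) array of annotation points (face box
    corners or facial key points)."""
    h, w = image.shape[:2]
    m = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), angle_deg, 1.0)
    rotated = cv2.warpAffine(image, m, (w, h))
    pts = np.hstack([points, np.ones((len(points), 1))])  # homogeneous coords
    return rotated, pts @ m.T  # transformed points, shape (N, 2)

# One randomly rotated training sample (sample loading omitted):
# angle = np.random.uniform(-180.0, 180.0)
# aug_img, aug_pts = rotate_sample(img, box_corners, angle)
# For A1, an axis-aligned box would then be refitted to the rotated corners.
```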
Further, the cascaded multilayer convolutional neural network adopts a three-stage cascaded architecture: the first stage comprises 4 convolutional layers and 1 max-pooling layer, the second stage comprises 3 convolutional layers, 2 max-pooling layers and 2 fully connected layers, and the third stage comprises 4 convolutional layers, 3 max-pooling layers and 2 fully connected layers.
The invention has the following advantages and beneficial effects:
the invention mainly aims at the defect that the existing popular face detection method based on the deep convolutional neural network lacks robustness on image rotation change, and designs a rotation-invariant face detection method based on a multi-task progressive registration network. In the actual scene, the situation that the face area cannot be detected due to the fact that the face target and the imaging device have uncertain plane rotation angles may occur. The method adopts a multi-task learning mode to carry out knowledge transfer among the human face detection task, the human face key point detection task and the angle registration task, and realizes effective learning among all related tasks so as to obtain a practical human face detector with high efficiency, strong discriminative power and high robustness. In addition, the in-plane face image rotation angle information contained in the position coordinates of the key points of the face is fully considered, and in order to improve the robustness of key point detection when the face angle changes, the regression loss function of the key points of the face is redefined, and the tolerance capability of the algorithm to the in-plane rotation angle change is effectively improved. The method obtains better detection effect.
Drawings
Fig. 1 is a flowchart of the implementation of rotation-invariant face detection provided in an embodiment of the present invention;
Fig. 2 is a diagram of the first-stage network structure of the cascaded multilayer convolutional neural network provided in an embodiment of the present invention;
Fig. 3 is a diagram of the second-stage network structure of the cascaded multilayer convolutional neural network provided in an embodiment of the present invention;
Fig. 4 is a diagram of the third-stage network structure of the cascaded multilayer convolutional neural network provided in an embodiment of the present invention;
Fig. 5 is a diagram illustrating the effect of the rotation-invariant face detection provided by an embodiment of the present invention;
Fig. 6 is a flowchart of a specific implementation of step S4 of the rotation-invariant face detection method according to an embodiment of the present invention;
Fig. 7 is a diagram showing the label positions of the feature points on a face image.
Detailed Description
The embodiment of the invention is realized on the basis of cascaded multilayer convolutional neural networks: the image to be detected passes through the multilayer convolutional neural networks of each stage in turn, and each multilayer convolutional neural network performs the tasks of face classification, face candidate-box regression, facial key-point detection and angle recognition. Finally, registration is performed through an image flipping operation according to the predicted rotation angle, and the registered image is judged to be a face image.
In order to explain the technical solution of the present invention, the following description is made with reference to the accompanying drawings and specific examples.
Fig. 1 shows the implementation process of the rotation-invariant face detection provided in an embodiment of the present invention, detailed as follows:
S1, constructing and training a cascaded multilayer convolutional neural network;
S2, inputting a test image, generating a set of images at different resolutions by means of an image pyramid, and then feeding the set into the cascaded multilayer convolutional neural network to start detection;
S3, filtering out part of the non-face windows at each stage of the network, adjusting the candidate-box positions according to the box regression result, and simultaneously predicting the face rotation angle;
and S4, registering through an image flipping operation according to the predicted rotation angle.
The cascaded multilayer convolutional neural network adopts a three-stage cascaded architecture: each stage consists of a shallow convolutional neural network that simultaneously performs the face detection, angle recognition and key-point localization tasks, achieving good results in both speed and accuracy.
Further, step S1 constructs and trains a multitask convolutional neural network with a three-stage cascaded architecture, exploiting the relatedness among the tasks by combining face detection, angle recognition and key-point localization. The specific implementation steps are as follows:
The network structures of the first-, second- and third-stage networks are shown in Fig. 2, Fig. 3 and Fig. 4 respectively. Rotation-invariant face detection is decomposed into a face/non-face binary classification problem, a face-angle recognition problem and a face candidate-box regression problem: the network judges whether an input image is a face and drives the output detection box as close as possible to its ground truth. Specifically, the method comprises the following steps:
A. As shown in Fig. 2, the structure of the first-stage network is, from top to bottom: the first layer, a convolutional layer with 3 × 3 kernels, 16 kernels in total; the second layer, a max-pooling layer with a 2 × 2 pooling window; the third layer, a convolutional layer with 3 × 3 kernels, 32 kernels in total; the fourth layer, a convolutional layer with 3 × 3 kernels, 64 kernels in total; the fifth layer, four parallel sublayers each connected to the fourth layer, all convolutional layers with 1 × 1 kernels, whose supervision signals are, respectively: face/non-face classification information, face position information, facial key-point position information and face direction information (a PyTorch sketch of this stage appears after this list);
B. As shown in Fig. 3, the structure of the second-stage network is, from top to bottom: the first layer, a convolutional layer with 3 × 3 kernels, 24 kernels in total; the second layer, a max-pooling layer with a 3 × 3 pooling window; the third layer, a convolutional layer with 3 × 3 kernels, 48 kernels in total; the fourth layer, a max-pooling layer with a 3 × 3 pooling window; the fifth layer, a convolutional layer with 2 × 2 kernels, 96 kernels in total; the sixth layer, a fully connected layer with 196 neurons; the seventh layer, four parallel sublayers each connected to the sixth layer, all fully connected layers, whose supervision signals are, respectively: face/non-face classification information, face position information, facial key-point position information and face direction information;
C. As shown in Fig. 4, the structure of the third-stage network is, from top to bottom: the first layer, a convolutional layer with 3 × 3 kernels, 24 kernels in total; the second layer, a max-pooling layer with a 3 × 3 pooling window; the third layer, a convolutional layer with 3 × 3 kernels, 48 kernels in total; the fourth layer, a max-pooling layer with a 3 × 3 pooling window; the fifth layer, a convolutional layer with 2 × 2 kernels, 96 kernels in total; the sixth layer, a max-pooling layer with a 2 × 2 pooling window; the seventh layer, a convolutional layer with 2 × 2 kernels, 192 kernels in total; the eighth layer, a fully connected layer with 254 neurons; the ninth layer, three parallel sublayers each connected to the eighth layer, all fully connected layers, whose supervision signals are, respectively: face/non-face classification information, face position information and facial key-point position information;
D. In the testing stage, the first- and second-stage networks output only the face/non-face judgment f, the face candidate-box displacement t and the face direction g, while the third-stage network outputs only the face/non-face judgment f, the face candidate-box displacement t and the facial key-point positions p;
E. The convolutional neural network is trained with a stochastic gradient descent algorithm (a sketch of these losses appears after this list). The loss of the face/non-face binary classification task is computed with a cross-entropy function, as shown in Eq. (1):
$L_{cls} = y \log f + (1 - y) \log(1 - f)$ (1)
where y denotes the ground-truth face classification label.
Likewise, the loss of the angle recognition task is computed with a cross-entropy function, as shown in Eq. (2):
$L_{cal} = x \log g + (1 - x) \log(1 - g)$ (2)
where x denotes the ground-truth angle classification label.
The regression task of the face candidate box uses a Euclidean distance function, computed as:
$L_{box} = \lVert \hat{t} - t^{*} \rVert_2^2$ (3)
where $\hat{t}$ denotes the predicted box coordinates and $t^{*}$ denotes the coordinates of the ground-truth face position.
Finally, the in-plane rotation-angle information contained in the facial key-point coordinates is fully exploited in order to improve the robustness of key-point detection under face-angle changes. The regression loss function of the facial key points is therefore redefined as a function of the point-wise distance and the rotation angle (Eq. (4); the expression appears in the original only as an image and is not reproduced here), where N is the total number of training samples participating in the facial key-point task, d is the Euclidean distance between a predicted point and its ground-truth point, and θ is the rotation angle of the sample, satisfying θ ∈ [−45°, 45°].
F. In this embodiment, the public face datasets WIDER FACE and LFW are used as training sets. WIDER FACE contains 32,203 images with 393,703 annotated face detection boxes. Of the face data, 50% is used to train the face classification and candidate-box regression tasks, 40% is used as the test set, and the remaining 10% as the validation set. The LFW dataset is used to train the angle recognition and face alignment tasks.
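As a concrete illustration of the first-stage structure described in A, the following is a minimal PyTorch sketch. The kernel sizes and channel counts follow the text; the PReLU activations, the head output widths and the absence of padding are assumptions, since the patent does not specify them.

```python
import torch.nn as nn

class StageOneNet(nn.Module):
    """First-stage network: three backbone conv layers plus one
    max-pooling layer, ending in four parallel 1 x 1 convolutional
    heads supervised by the four signals listed in A."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3), nn.PReLU(),   # 12x12 -> 10x10
            nn.MaxPool2d(kernel_size=2, stride=2),         # -> 5x5
            nn.Conv2d(16, 32, kernel_size=3), nn.PReLU(),  # -> 3x3
            nn.Conv2d(32, 64, kernel_size=3), nn.PReLU(),  # -> 1x1
        )
        self.cls_head = nn.Conv2d(64, 2, kernel_size=1)   # face / non-face f
        self.box_head = nn.Conv2d(64, 4, kernel_size=1)   # box displacement t
        self.pts_head = nn.Conv2d(64, 10, kernel_size=1)  # 5 facial key points
        self.dir_head = nn.Conv2d(64, 2, kernel_size=1)   # face direction g

    def forward(self, x):
        feat = self.backbone(x)
        return (self.cls_head(feat), self.box_head(feat),
                self.pts_head(feat), self.dir_head(feat))
```

Because every layer is convolutional, the same network can slide over arbitrarily sized pyramid levels at test time, which is what keeps the coarse first-stage scan cheap.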
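The training losses in E can likewise be written out. This is a sketch under stated assumptions: Eqs. (1) and (2) are applied in their conventional negated (minimized) form, Eq. (3) is taken as the squared Euclidean distance, and, since Eq. (4) appears in the original only as an image, the angle-dependent weighting of the key-point loss is shown with a hypothetical stand-in.

```python
import torch
import torch.nn.functional as F

def cls_loss(f, y):
    """Eq. (1): cross entropy between the face score f and label y."""
    return F.binary_cross_entropy(f, y)

def cal_loss(g, x):
    """Eq. (2): cross entropy between the direction score g and label x."""
    return F.binary_cross_entropy(g, x)

def box_loss(t_pred, t_true):
    """Eq. (3): squared Euclidean distance to the ground-truth box t*."""
    return ((t_pred - t_true) ** 2).sum(dim=1).mean()

def landmark_loss(p_pred, p_true, theta_deg):
    """Eq. (4), sketched: the patent redefines this loss from the
    Euclidean distance d and the sample angle theta in [-45, 45];
    `angle_weight` below is a hypothetical stand-in for the
    unreproduced expression."""
    d = (p_pred - p_true).pow(2).sum(dim=1).sqrt()             # distance d
    angle_weight = 1.0 / torch.cos(torch.deg2rad(theta_deg))   # assumption
    return (angle_weight * d).mean()
```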
Further, in step S2 the image is fed into the cascaded multilayer convolutional neural network, which outputs the generated face candidate-box displacements, candidate-box scores, key-point positions and face rotation angles. The specific implementation steps are as follows (a sketch of the pyramid generation and of the angle-label assignment appears after this list):
A. The image to be tested is first scaled to generate an image pyramid. The input to the first-stage network is 12 × 12 × 3, where 3 indicates a 3-channel color input, i.e. an RGB image. For an input image, the first-stage network outputs the face candidate-box displacement t, the candidate-box score f and the face direction g. At this stage the face-angle recognition task is treated as a binary classification, face up versus face down, labeled 1 and 0 respectively.
The label value f₁ of the samples used to train the first-stage angle recognition is a piecewise function of the sample rotation angle θ (the original definition appears only as an image); samples with label values f₁ = 0 and f₁ = 1 participate in training the first-stage angle recognition.
B. The input to the second-stage network is 24 × 24 × 3; for an input image it outputs the face candidate-box displacement t, the candidate-box score f and the face direction g. At this stage the face-angle recognition task is treated as a three-way classification, face left, face up and face right, labeled 0, 1 and 2 respectively.
The label value f₂ of the samples used to train the second-stage angle recognition is likewise a piecewise function of the sample rotation angle θ (the original definition appears only as an image); samples with label values f₂ = 0, 1 and 2 participate in training the second-stage angle recognition task.
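A minimal sketch of the pyramid generation in A and the piecewise label assignment in A and B follows. The pyramid scale factor and the angle thresholds are assumptions (the patent gives the label definitions only as images); the thresholds below simply split the angle range evenly among the classes.

```python
import cv2

def image_pyramid(image, min_size=12, scale=0.709):
    """Downscale repeatedly until the shorter side falls below the
    12-pixel first-stage input; the 0.709 factor is an assumption."""
    pyramid = []
    h, w = image.shape[:2]
    while min(h, w) >= min_size:
        pyramid.append(cv2.resize(image, (w, h)))  # dsize is (width, height)
        h, w = int(h * scale), int(w * scale)
    return pyramid

def stage1_label(theta):
    """Binary up/down label f1 from the rotation angle theta (degrees);
    the +-90 degree split is an assumed stand-in."""
    return 1 if -90.0 <= theta <= 90.0 else 0   # 1 = face up, 0 = face down

def stage2_label(theta):
    """Three-way label f2 on the reduced range [-90, 90]:
    0 = left, 1 = up, 2 = right (thresholds and sign convention assumed)."""
    if theta < -45.0:
        return 0   # face toward the left
    if theta > 45.0:
        return 2   # face toward the right
    return 1       # face up
```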
Further, in step S3 the face rotation angle output by the network is registered by flipping the image. The specific implementation steps are as follows:
A. In step S2, after the image to be detected passes through the first-stage network, a face direction score g is generated. The corresponding in-plane rotation angle is
$p = \begin{cases} 0^{\circ}, & g \ge 0.5 \\ 180^{\circ}, & g < 0.5 \end{cases}$
where 0° denotes face up and 180° denotes face down (the threshold form is reconstructed from the binary direction score; the original formula appears only as an image).
B. When the rotation angle p is 0°, the image is not flipped; when p is 180°, the image is flipped by 180°. The range of the in-plane face rotation angle is thereby narrowed from [−180°, 180°] to [−90°, 90°]. The flipping operation is simple and computationally cheap, enabling efficient and fast in-plane registration of the face image (a sketch of these flips appears after this list);
C. In step S2, after the image to be detected passes through the second-stage network, a face direction score g is generated and converted into a direction label according to Eq. (5):
$id = \arg\max_{i} g_i, \quad i \in \{0, 1, 2\}$ (5)
where id denotes the direction label and $g_0$, $g_1$, $g_2$ denote the scores for the face pointing left, up and right, respectively.
The corresponding in-plane rotation angle is
$p = \begin{cases} 90^{\circ}, & id = 0 \\ 0^{\circ}, & id = 1 \\ -90^{\circ}, & id = 2 \end{cases}$
where 0° denotes face up, 90° denotes face left, and −90° denotes face right.
D. When the rotation angle p is 0°, the image is not rotated; when p is 90°, the image is rotated 90° to the right; when p is −90°, the image is rotated 90° to the left. The range of the in-plane face rotation angle is thereby narrowed from [−90°, 90°] to [−45°, 45°].
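The registration in B and D reduces to exact 90° and 180° flips of the candidate image, which can be sketched as follows; `cv2.rotate` performs these rotations without interpolation, consistent with the low computational cost claimed above.

```python
import cv2

def register_stage1(image, p):
    """After the first stage: flip upside-down faces (p == 180) upright."""
    return cv2.rotate(image, cv2.ROTATE_180) if p == 180 else image

def register_stage2(image, p):
    """After the second stage: remove the coarse rotation p so the
    residual in-plane angle lies within [-45, 45] degrees."""
    if p == 90:
        return cv2.rotate(image, cv2.ROTATE_90_CLOCKWISE)         # rightward
    if p == -90:
        return cv2.rotate(image, cv2.ROTATE_90_COUNTERCLOCKWISE)  # leftward
    return image
```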
Further, in step S2 the image is fed into the last-stage multitask convolutional neural network, which outputs the face candidate-box position t, the candidate-box score f and the key-point positions p. The specific implementation steps are as follows:
A. The input to the last-stage network is 48 × 48 × 3. Unlike the first- and second-stage networks, the third-stage network outputs, for an input image, the face candidate-box displacement t, the candidate-box score f and the facial key-point positions p.
B. The purpose of passing the image to be detected through the first-stage network is to rapidly generate candidate windows with a fully convolutional network while predicting the image rotation angle in a relatively coarse manner. The purpose of the second-stage network is to further refine the candidate windows produced by the first stage with a more complex convolutional neural network, discarding a large number of overlapping windows and again predicting the image rotation angle.
Further, step S4 computes a rotation angle from the geometric relationship between the face candidate-box position output in step S2 and the facial key points, and registers by flipping the image to obtain the detected face image. The specific implementation is as follows:
A. As shown in Fig. 6(a), the image to be detected undergoes face detection and key-point localization through the third-stage network; the output image of the third-stage network is judged to be a face image, and the in-plane face rotation angle now lies in the range [−45°, 45°].
B. As shown in Fig. 6(b), it is known that the eyes lie closer to the top of the head than the other key points do. Based on this prior knowledge, the forward direction of the face detection box is first determined by computing the distances from the left and right eyes to the four edges of the bounding box.
C. As shown in Fig. 7, in a standard upright face image, the angle α formed by the line between the left eye and the nose tip equals the angle β formed by the line between the nose tip and the top of the head. As shown in Fig. 6(d), using the geometric relationship between the face detection box and the key points, the rotation angle of the face image is computed as θ = (α − β) ÷ 2 (a sketch of this computation follows).
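The θ = (α − β) ÷ 2 computation can be sketched as below. The exact construction of α and β is not spelled out in the text; the reading used here, one of several consistent with the stated equality α = β for an upright face, measures the nose-tip to left-eye and nose-tip to right-eye directions against the nose-tip to head-top axis derived from the box's forward edge, so that a roll of the face shifts both angles and (α − β) / 2 recovers it. All names below are ours, not the patent's.

```python
import numpy as np

def signed_angle(v1, v2):
    """Signed angle in degrees from vector v1 to vector v2."""
    a = np.degrees(np.arctan2(v2[1], v2[0]) - np.arctan2(v1[1], v1[0]))
    return (a + 180.0) % 360.0 - 180.0          # wrap into (-180, 180]

def residual_roll(left_eye, right_eye, nose_tip, head_top):
    """theta = (alpha - beta) / 2 under the assumed reading of Fig. 7:
    head_top is taken on the forward edge of the detection box, so it
    does not rotate with the face, while the eyes do."""
    nose = np.asarray(nose_tip, dtype=float)
    up = np.asarray(head_top, dtype=float) - nose           # box-forward axis
    alpha = signed_angle(up, np.asarray(left_eye, float) - nose)
    beta = signed_angle(np.asarray(right_eye, float) - nose, up)
    return (alpha - beta) / 2.0
```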
The rotation-invariant face detection results provided by the embodiment of the present invention are shown in Fig. 5.

Claims (4)

1. A rotation-invariant face detection method based on a multitask progressive registration network, characterized by comprising the following steps:
S1, preprocessing an image, and constructing and training a cascaded multilayer convolutional neural network;
S2, inputting a test image, generating a set of images at different resolutions by means of an image pyramid, and feeding the set into the cascaded multilayer convolutional neural network to start detection;
S3, each layer of the convolutional neural network filtering out part of the non-face windows, adjusting the candidate-box positions according to a box regression result, and simultaneously predicting the face rotation angle, wherein the box regression result is embodied through the regression loss of the facial key points;
the facial key-point regression loss is a function of d and θ (the expression is given in the original only as an image), where d is the Euclidean distance between a predicted point and its ground-truth point, and θ is the rotation angle of the sample, satisfying θ ∈ [−45°, 45°];
and S4, registering through an image flipping operation according to the predicted rotation angle, and judging the registered image to be a face image.
2. The rotation-invariant face detection method based on the multitask progressive registration network according to claim 1, characterized in that the image preprocessing comprises:
A1, randomly rotating face images by arbitrary angles to generate a large number of face images containing in-plane rotation-angle changes, with the face position information rotated and changed correspondingly;
and A2, randomly rotating facial key-point images by arbitrary angles to generate a large number of facial key-point images containing in-plane rotation-angle changes, with the facial key-point position information rotated and changed correspondingly.
3. The rotation-invariant face detection method based on the multitask progressive registration network according to claim 1 or 2, characterized in that the cascaded multilayer convolutional neural network adopts a three-stage cascaded structure, wherein the first stage comprises 4 convolutional layers and 1 max-pooling layer, the second stage comprises 3 convolutional layers, 2 max-pooling layers and 2 fully connected layers, and the third stage comprises 4 convolutional layers, 3 max-pooling layers and 2 fully connected layers.
4. The rotation-invariant face detection method based on the multitask progressive registration network according to claim 3, characterized in that, in the three-stage cascaded multilayer convolutional neural network:
the label value f₁ of the samples used to train the first-stage network angle recognition is a piecewise function of the sample rotation angle θ (the expression is given in the original only as an image); samples with label values f₁ = 0 and f₁ = 1 participate in training the first-stage network angle recognition;
the label value f₂ of the samples used to train the second-stage network angle recognition is likewise a piecewise function of the sample rotation angle θ (the expression is given in the original only as an image); samples with label values f₂ = 0, 1 and 2 participate in training the second-stage network angle recognition task.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201910590187.8A | 2019-07-02 | 2019-07-02 | Rotation-invariant face detection method based on multitask progressive registration network

Publications (2)

Publication Number | Publication Date
CN110458005A | 2019-11-15
CN110458005B | 2022-12-27






Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant