CN112836566A - Multitask neural network face key point detection method for edge equipment - Google Patents

Multitask neural network face key point detection method for edge equipment

Info

Publication number
CN112836566A
CN112836566A (application CN202011386983.9A)
Authority
CN
China
Prior art keywords
face
neural network
key point
convolutional neural
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011386983.9A
Other languages
Chinese (zh)
Inventor
李思远 (Siyuan Li)
王丰 (Feng Wang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhiyunview Technology Co ltd
Original Assignee
Beijing Zhiyunview Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zhiyunview Technology Co ltd filed Critical Beijing Zhiyunview Technology Co ltd
Priority to CN202011386983.9A priority Critical patent/CN112836566A/en
Publication of CN112836566A publication Critical patent/CN112836566A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the fields of deep learning, face recognition and face key point detection, and provides a multitask neural network face key point detection method for edge devices, enabling face key point calibration and accurate face recognition on mobile devices. To this end, the technical scheme adopted by the invention is as follows: a face image to be detected is input into a convolutional neural network, and the convolutional neural network outputs the detected face key point coordinates; a loss function for the convolutional neural network is defined. The method applies mainly to face recognition and face key point detection.

Description

Multitask neural network face key point detection method for edge equipment
Technical Field
The invention relates to the fields of deep learning, face recognition and face key point detection, and in particular to a multitask neural network face key point detection method for edge devices.
Background
Face key point detection, also known as face landmark localization or face alignment, aims to automatically locate a set of predefined fiducial points on a face (e.g., eye corners, nose tip, mouth corners). As a fundamental component of various face applications such as face recognition [1, 2], face verification [3], face frontalization [4] and face editing [5], this problem has long attracted interest in computer vision and has made great progress over the past few years. However, because of the competing demands of detection accuracy, processing speed and model size, developing a practical face key point detection technique remains challenging.
The technical difficulty is that faces of very high quality are hard to acquire in real scenes; that is, the state of a face in a natural environment is uncontrolled and unconstrained. Under different lighting conditions, pose, expression and shape vary greatly, and local occlusion sometimes occurs, as illustrated in Fig. 2. The challenges in face key point detection therefore fall into the following four categories:
1. Local variation: a face image is locally disturbed by facial expression, locally extreme illumination (such as highlights and shadows), occlusion and the like, so that some key points may be invisible or abnormally located.
2. Global variation: pose and image quality are two key factors that globally influence the appearance of the face in the image; when the global structure of the face is misestimated, most key points are localized inaccurately.
3. Data imbalance: an uneven distribution of face types and attributes in the datasets available for training is quite common. Such imbalance is likely to prevent an algorithm or model from correctly characterizing the data, reducing detection accuracy.
4. Model efficiency: model size and computational cost also limit an algorithm's practicality. Given the limited computing performance and memory of mobile phones and other embedded devices, a detection algorithm must have low complexity and high processing speed.
In recent years, face key point localization has received extensive attention, and many classical algorithms have been created. One family first builds a face shape model that describes the facial feature points with low-dimensional parameters, then builds a face appearance model and updates the feature point positions according to how well the reconstructed face appearance matches the model. Representative examples are the Active Appearance Models (AAMs) proposed by Cootes et al. [6] and Constrained Local Models (CLMs), which make full use of facial position information. Active appearance models and their follow-up studies [7, 8, 9] attempt to jointly model holistic appearance and shape, while CLMs and related algorithms [10, 11] learn local information under various shape constraints. Furthermore, the tree-structured part model (TSPM) [12] uses deformable part-based models for simultaneous detection, pose estimation and key point localization. Another family, exemplified by Explicit Shape Regression (ESR) [13] and the Supervised Descent Method (SDM) [14], attacks the problem in a regression manner. The main limitations of these methods are poor robustness in complex scenes, a large computational load, or high model complexity.
Deep learning is a newer research direction in machine learning that studies multilayer neural networks. The convolutional neural network (CNN), a deep learning model widely applied to image and audio signal processing, has performed well in face key point detection in recent years. Zhang et al. [15] built a multitask learning network (TCDCN) for jointly learning key point locations and pose attributes, but owing to its multitask nature, TCDCN is difficult to train in practical applications. Trigeorgis et al. [16] proposed a coarse-to-fine recurrent convolutional model (MDM). Lv et al. [17] proposed a deep regression architecture with two-stage re-initialization (TSR), which segments the face into several parts to improve detection accuracy. The method of [18] uses the pose angles (yaw, pitch and roll) as attributes to construct a network and estimates these three angles directly to aid key point detection, but its complexity keeps its key point detection results below the ideal. The pose-invariant face alignment algorithm (PIFA) proposed by Jourabloo and Liu [19] estimates a three-dimensional-to-two-dimensional projection matrix by deep cascaded regression. The algorithm of [20] first builds a depth model of the face in a Z-buffer and then fits a three-dimensional model to the two-dimensional image.
Recently, Kumar and Chellappa [21] designed a pose conditioned dendritic convolutional neural network (PCD-CNN), which combines a modular classification network with a base classification network to improve detection accuracy. Honari et al. [22] designed a sequential multitasking (SeqMT) network with an equivariant landmark transformation (ELT) loss term. The method of [23] proposes a face calibration regression approach based on a coarse-to-fine ensemble of regression trees (ERT). To make face key point detection robust to intrinsic variations of image style, Dong et al. [24] developed a Style Aggregated Network (SAN) that combines original face images with style-aggregated images to train the key point detector. Wu et al. [25] proposed a boundary-aware face alignment algorithm (LAB) that treats boundary information as the geometry of the face to improve detection accuracy; it derives facial key points from boundary lines, largely avoiding ambiguity in the definition of face key points. Although deep learning algorithms have advanced substantially, many shortcomings remain; in practical applications especially, there is still much room to improve the accuracy, efficiency and simplicity of detection algorithms [28].
Disclosure of Invention
In order to overcome the defects of the prior art, the invention aims to provide a multitask neural network face key point detection method for edge devices, enabling face key point calibration and accurate face recognition on mobile devices. To this end, the technical scheme adopted by the invention is as follows: a face image to be detected is input into a convolutional neural network, and the convolutional neural network outputs the detected face key point coordinates; the loss function of the convolutional neural network is

$$\mathcal{L} = \frac{1}{M}\sum_{m=1}^{M}\sum_{n=1}^{N}\Bigl(\sum_{c=1}^{C}\omega_n^c\sum_{k=1}^{K}\bigl(1-\cos\theta_k^m\bigr)\Bigr)\bigl\|\mathbf{d}_n^m\bigr\|_2^2$$

where $\|\mathbf{d}_n^m\|$ is the distance between the $n$-th key point of the $m$-th input and its ground truth; $N$ is the preset number of key points to be detected on each face; $M$ is the total number of samples in the training picture set; $\theta_1$, $\theta_2$ and $\theta_3$ ($K = 3$) are the deviations between the actual and predicted values of the yaw, pitch and roll angles; $C$ indexes the different face classes, including frontal face, profile, head up, head down, expression and occlusion; and the weight $\omega_n^c$ is adjusted according to the fraction of each class among the samples, the reciprocal of the class fraction being taken as the weight.
The convolutional neural network is based on a MobileNet convolutional neural network.
The training data are augmented with the following specific steps:
1) each face picture is mirrored, and rotated in 5-degree steps from -30 degrees to 30 degrees;
2) 20% of the face area is randomly occluded in each picture.
A sub-network is introduced while training the convolutional neural network to supervise model training; it is used only in the training stage, takes the output of the fourth layer of the convolutional neural network as input, and outputs the three Euler angles of yaw, pitch and roll for computing the loss function.
The invention has the characteristics and beneficial effects that:
1. The network designed in the invention is very lightweight and supports multiple tasks: after a face image is input, the key points and the face angles are obtained simultaneously.
2. The model is very small, saving memory, and is well suited to mobile platforms such as mobile phones; it also runs fast, reaching a frame rate of 140 fps on mobile platforms.
3. For the problems of geometric constraints and data imbalance, the invention designs a new loss function that resolves both.
4. To enlarge the receptive field and better capture the global structure of the face, the invention designs a multi-scale fully connected layer for accurately locating key points in the face image.
5. Compared with other face key point detection algorithms, the method couples three-dimensional pose estimation with two-dimensional distance measurement; the network structure is simple and intuitive, making forward computation and backpropagation easy; and it is a single-stage rather than a cascaded structure, which improves the method's computational efficiency and performance.
6. The algorithm is highly accurate under various complex unconstrained conditions of pose, expression, illumination, occlusion and the like. It surpasses other advanced methods (such as TSR [17], SAN [24] and LAB [25]) on the 300W (300 Faces in-the-Wild) and AFLW (Annotated Facial Landmarks in the Wild) key point datasets. Figs. 2-5 show examples of individual faces from 300W and AFLW and examples of multi-face pictures, where green dots mark the detected face key points.
Description of the drawings:
Fig. 1 is the overall model structure diagram of the method of the present invention, illustrating the architecture of the backbone network and the auxiliary network.
Fig. 2 shows examples of human faces under different poses, expressions, illumination, occlusions and image qualities.
Fig. 3 shows examples of face key point detection results under extreme illumination, expression, occlusion and blur disturbance.
Fig. 4 shows an example of multi-face key point detection results against a complex background.
Fig. 5 shows another example of multi-face key point detection results against a complex background.
Detailed Description
The invention provides a practical deep-learning-based face key point detection method that can effectively calibrate face key points on mobile terminals. The scheme is realized mainly with a convolutional neural network, so a network model must first be designed; it consists mainly of the convolutional network structure and the loss function. The model's input is the face image to be detected, and its output is the coordinates of the detected face key points. The core of the method is therefore the design of the model, which we introduce below in terms of the loss function, the backbone network, the auxiliary network and other implementation details.
Part 1: Loss function
When the amount of data is small, the accuracy of an algorithm depends mainly on the design of the loss function, and taking geometric information into account in the loss helps address training quality. Since local expression changes hardly affect the projection, the degrees of freedom of scaling and two-dimensional translation can be dropped, and only three Euler angles need to be estimated: the pitch, yaw and roll angles.
Furthermore, in deep learning, data imbalance is another problem that often degrades detection accuracy. Penalizing the loss values of rare training samples more heavily therefore helps handle the data imbalance problem.
In view of the above, we design the loss function as follows:

$$\mathcal{L} = \frac{1}{M}\sum_{m=1}^{M}\sum_{n=1}^{N}\Bigl(\sum_{c=1}^{C}\omega_n^c\sum_{k=1}^{K}\bigl(1-\cos\theta_k^m\bigr)\Bigr)\bigl\|\mathbf{d}_n^m\bigr\|_2^2$$

where $\|\mathbf{d}_n^m\|$ is the distance between the $n$-th key point of the $m$-th input and its ground truth; $N$ is the preset number of key points to be detected on each face; $M$ is the total number of samples in the training picture set; $\theta_1$, $\theta_2$ and $\theta_3$ ($K = 3$) are the deviations between the actual and predicted values of the yaw, pitch and roll angles, so the penalty clearly increases as the angular deviation grows; $C$ indexes the different face classes, such as frontal face, profile, head up, head down, expression and occlusion; and the weight $\omega_n^c$ is adjusted according to the fraction of each class among the samples, the reciprocal of the class fraction being taken as the weight in the invention.
With this loss function, whether training is disturbed by three-dimensional pose variation or by data imbalance, our loss can cope, while the distance measurement handles local variation.
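To make the computation concrete, here is a minimal PyTorch sketch of a loss in the spirit of the formula above. It is an illustration, not the patent's implementation: the function and argument names are ours, and the per-class weighting is collapsed into a single precomputed per-sample weight (the reciprocal of the fraction of that sample's face class in the training set).

    import torch

    def pfld_style_loss(landmarks_pred, landmarks_gt, euler_pred, euler_gt, sample_weight):
        # landmarks_*: (M, N, 2) keypoint coordinates; euler_*: (M, 3) yaw/pitch/roll
        # sample_weight: (M,) reciprocal of the fraction of the sample's face class
        dist = ((landmarks_pred - landmarks_gt) ** 2).sum(dim=-1)             # ||d_n^m||^2, shape (M, N)
        angle_penalty = (1.0 - torch.cos(euler_pred - euler_gt)).sum(dim=-1)  # sum_k (1 - cos theta_k), (M,)
        per_sample = sample_weight * angle_penalty * dist.sum(dim=-1)         # weight * pose penalty * distance
        return per_sample.mean()

Because the angular term multiplies the distance term, a sample whose pose is badly estimated, or one belonging to a rare class, contributes a larger penalty, which is exactly the behavior the formula is designed to produce.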
Part 2: Backbone network
The backbone network uses a convolutional neural network to extract features and predict key points (the lower branch in Fig. 1). A face has strong global structure, such as the symmetric spatial relationships among the eyes, mouth and nose, and exploiting this global structure helps localization accuracy. We use multi-scale feature maps, performing convolutions with different strides to enlarge the receptive field. To map the abstract information learned by the preceding convolutional layers at receptive fields of different sizes into a larger space and increase the model's representational capacity, the final prediction is made by a fully connected layer over the three preceding multi-scale feature maps. Detailed backbone parameters are given in Table 1. A picture is converted into a 112 x 112 x 3 three-dimensional array as input, where 112 x 112 is the pixel size of the input image and 3 is the number of RGB channels; the output layer is our multi-scale fully connected layer, connected to the outputs of the three convolutional layers. Each layer's input is the previous layer's output: the first two dimensions give the image size and the third the number of channels. Taking the second layer, 56 x 56 x 64, as an example: 56 is the previous layer's pixel size divided by the stride, i.e., 112/2 = 56, and the third dimension, 64, is the number of channels of the previous convolutional layer.
Since the backbone is the bottleneck for processing speed and model size, MobileNet [26, 27] replaces conventional convolution operations. MobileNet is a lightweight convolutional neural network designed primarily for mobile and embedded vision applications; using it greatly reduces the backbone's computational load and so speeds up detection. In addition, the backbone can compress the network by adjusting MobileNet's width multiplier to fit different requirements, making the model smaller and faster; the model in the invention still achieves good detection accuracy after being compressed by 80%.
Table 1. Detailed parameters of the backbone network.
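For concreteness, the following PyTorch sketch shows one way a MobileNet-style backbone with a multi-scale fully connected head could be assembled. The layer counts, channel widths and the default width multiplier are our assumptions for illustration; the patent's actual parameters are those of Table 1.

    import torch
    import torch.nn as nn

    def dw_separable(in_ch, out_ch, stride):
        # Depthwise-separable convolution, the MobileNet building block
        return nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, stride, 1, groups=in_ch, bias=False),
            nn.BatchNorm2d(in_ch), nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        )

    class LandmarkBackbone(nn.Module):
        def __init__(self, num_landmarks=68, width_mult=1.0):
            super().__init__()
            def c(ch):                         # width multiplier compresses the network
                return max(8, int(ch * width_mult))
            self.stem = nn.Sequential(         # 112 x 112 x 3 -> 56 x 56 x 64
                nn.Conv2d(3, c(64), 3, stride=2, padding=1, bias=False),
                nn.BatchNorm2d(c(64)), nn.ReLU(inplace=True),
            )
            self.stage1 = dw_separable(c(64), c(128), 2)   # -> 28 x 28, scale 1
            self.stage2 = dw_separable(c(128), c(128), 2)  # -> 14 x 14, scale 2
            self.stage3 = dw_separable(c(128), c(256), 2)  # ->  7 x 7,  scale 3
            self.pool = nn.AdaptiveAvgPool2d(1)
            # Multi-scale fully connected layer over the three feature maps
            self.fc = nn.Linear(c(128) + c(128) + c(256), num_landmarks * 2)

        def forward(self, x):
            x = self.stem(x)
            s1 = self.stage1(x)
            s2 = self.stage2(s1)
            s3 = self.stage3(s2)
            feats = [self.pool(s).flatten(1) for s in (s1, s2, s3)]
            return self.fc(torch.cat(feats, dim=1))        # (batch, num_landmarks * 2)

    # landmarks = LandmarkBackbone()(torch.randn(1, 3, 112, 112))  # -> shape (1, 136)

Setting width_mult below 1.0 shrinks every channel count, which is how the width-multiplier compression described above trades model size for accuracy.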
Part 3: Auxiliary network
While training the backbone network, a sub-network is introduced to supervise model training (the upper branch in Fig. 1). This network is used only in the training phase, and its input is the output of the fourth layer of the backbone. The auxiliary network estimates three-dimensional rotation information for each input face sample, namely the three Euler angles of yaw, pitch and roll, thereby determining the head pose; it effectively improves the stability and robustness of key point detection. Its specific structure is shown in Table 2: the input is a three-dimensional array taken from the backbone network, and the output is the three Euler angles of yaw, pitch and roll, used to compute the loss function given in Part 1, i.e., the loss function of the overall network.
Table 2. Detailed parameters of the auxiliary network.
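A hedged sketch of such a training-only pose head follows; the channel counts and layer sizes below are illustrative stand-ins for the real values of Table 2.

    import torch.nn as nn

    class AuxiliaryPoseHead(nn.Module):
        # Training-only head: consumes an intermediate backbone feature map
        # and regresses the three Euler angles (yaw, pitch, roll)
        def __init__(self, in_channels=128):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(in_channels, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(128, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(32, 3),              # yaw, pitch, roll
            )

        def forward(self, feat):
            return self.net(feat)

At inference time this head is simply discarded, so it adds no cost to the deployed model.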
Part 4: Other details
For the deep neural network model to perform well, the hyper-parameters in the network are optimized; the values in Table 3 can serve as a reference:
Table 3. Hyper-parameters for network training.
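The fragment below only illustrates how the backbone, auxiliary head and loss from the sketches above could be wired together for one optimization step. The optimizer choice and every numeric value here are placeholders, not the values of Table 3, and the dummy tensors stand in for a real data loader.

    import torch

    model = LandmarkBackbone(num_landmarks=68)
    aux = AuxiliaryPoseHead(in_channels=128)
    opt = torch.optim.Adam(list(model.parameters()) + list(aux.parameters()),
                           lr=1e-4, weight_decay=1e-6)  # placeholder hyper-parameters

    images = torch.randn(8, 3, 112, 112)   # dummy batch of face crops
    lm_gt = torch.randn(8, 68, 2)          # ground-truth landmarks
    euler_gt = torch.randn(8, 3)           # ground-truth yaw/pitch/roll
    w = torch.ones(8)                      # per-sample class weights

    x = model.stem(images)
    s1 = model.stage1(x)                   # intermediate map fed to the aux head
    s2 = model.stage2(s1)
    s3 = model.stage3(s2)
    feats = torch.cat([model.pool(s).flatten(1) for s in (s1, s2, s3)], dim=1)
    lm_pred = model.fc(feats).view(8, 68, 2)
    euler_pred = aux(s1)                   # head pose estimated from an early feature map

    loss = pfld_style_loss(lm_pred, lm_gt, euler_pred, euler_gt, w)
    opt.zero_grad()
    loss.backward()
    opt.step()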
In addition, to address the problem of data imbalance, we also use a data augmentation strategy. Data augmentation, also called data expansion, refers to having limited data produce value equivalent to more data without substantially increasing the amount of data. We mainly adopt the following two methods (sketched in code below):
1) Each face picture is mirrored, then rotated in 5-degree steps from -30 degrees to 30 degrees.
2) 20% of the face area is randomly occluded in each picture.
This data augmentation strategy expands the training dataset and thereby yields better detection results.
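A short NumPy/OpenCV illustration of these two augmentations follows. The helper name, the gray occlusion block and the 0.45 side ratio (0.45 squared is roughly 20% of the area) are our choices, and transforming the landmark coordinates alongside the pixels is omitted for brevity.

    import numpy as np
    import cv2

    def augment_face(face, rng=np.random):
        # face: H x W x 3 uint8 crop of a detected face
        out = []
        # 1) mirror, then rotate in 5-degree steps over [-30, 30]
        for img in (face, cv2.flip(face, 1)):
            h, w = img.shape[:2]
            for angle in range(-30, 31, 5):
                M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
                out.append(cv2.warpAffine(img, M, (w, h)))
        # 2) randomly occlude about 20% of the face area with a gray block
        occluded = face.copy()
        h, w = face.shape[:2]
        bh, bw = int(h * 0.45), int(w * 0.45)
        y, x = rng.randint(0, h - bh), rng.randint(0, w - bw)
        occluded[y:y + bh, x:x + bw] = 127
        out.append(occluded)
        return out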
The invention provides a face key point detection method based mainly on a convolutional neural network algorithm from deep learning. The neural network model consists mainly of a backbone network and an auxiliary network: the backbone takes MobileNet blocks as its main structure and introduces a multi-scale fully connected layer to enlarge the receptive field and strengthen the representation of facial structural features, while the auxiliary network effectively estimates rotation information to improve key point localization.
The invention resolves geometric normalization and data imbalance through the proposed new loss function, and the overall algorithm surpasses the most advanced methods in accuracy, model size and running speed. As the detection results in Figs. 2-5 show, the invention still obtains satisfactory visual results even under extreme illumination, expression, occlusion and blur interference.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
Primary references
[1] Y. Liu, F. Wei, J. Shao, L. Sheng, J. Yan, and X. Wang. Exploring disentangled feature representation beyond face identification. In CVPR, 2018.
[2] X. Zhu, Z. Lei, J. Yan, D. Yi, and S. Z. Li. High-fidelity pose and expression normalization for face recognition in the wild. In CVPR, 2015.
[3] Y. Sun, X. Wang, and X. Tang. Hybrid deep learning for face verification. IEEE TPAMI, 38(10):1997-2009, 2016.
[4] T. Hassner, S. Harel, E. Paz, and R. Enbar. Effective face frontalization in unconstrained images. In CVPR, 2015.
[5] J. Thies, M. Zollhöfer, M. Stamminger, C. Theobalt, and M. Nießner. Face2face: Real-time face capture and reenactment of RGB videos. In CVPR, 2016.
[6] T. Cootes, G. Edwards, and C. Taylor. Active appearance models. IEEE TPAMI, 23(6):681-685, 2001.
[7] I. Matthews and S. Baker. Active appearance models revisited. IJCV, 60(2):135-164, 2004.
[8] F. Kahraman, M. Gökmen, S. Darkner, and R. Larsen. An active illumination and appearance (AIA) model for face alignment. In CVPR, 2007.
[9] L. Liang, R. Xiao, F. Wen, and J. Sun. Face alignment via component-based discriminative search. In ECCV, 2008.
[10] P. Belhumeur, D. Jacobs, D. Kriegman, and N. Kumar. Localizing parts of faces using a consensus of exemplars. In CVPR, 2011.
[11] M. Valstar, B. Martinez, X. Binefa, and M. Pantic. Facial point detection using boosted regression and graph models. In CVPR, 2010.
[12] X. Zhu and D. Ramanan. Face detection, pose estimation, and landmark localization in the wild. In CVPR, 2012.
[13] X. Cao, Y. Wei, F. Wen, and J. Sun. Face alignment by explicit shape regression. IJCV, 107(2):177-190, 2014.
[14] X. Xiong and F. De la Torre. Supervised descent method and its applications to face alignment. In CVPR, 2013.
[15] Z. Zhang, P. Luo, C. Loy, and X. Tang. Facial landmark detection by deep multi-task learning. In ECCV, 2014.
[16] G. Trigeorgis, P. Snape, M. Nicolaou, E. Antonakos, and S. Zafeiriou. Mnemonic descent method: A recurrent process applied for end-to-end face alignment. In CVPR, 2016.
[17] J. Lv, X. Shao, J. Xing, C. Cheng, and X. Zhou. A deep regression architecture with two-stage re-initialization for high performance facial landmark detection. In CVPR, 2017.
[18] H. Yang, W. Mou, Y. Zhang, I. Patras, H. Gunes, and P. Robinson. Face alignment assisted by head pose estimation. In BMVC, 2015.
[19] A. Jourabloo and X. Liu. Pose-invariant 3D face alignment. In ICCV, 2015.
[20] X. Zhu, Z. Lei, X. Liu, H. Shi, and S. Z. Li. Face alignment across large poses: A 3D solution. In CVPR, 2016.
[21] A. Kumar and R. Chellappa. Disentangling 3D pose in a dendritic CNN for unconstrained 2D face alignment. In CVPR, 2018.
[22] S. Honari, P. Molchanov, S. Tyree, P. Vincent, C. Pal, and J. Kautz. Improving landmark localization with semi-supervised learning. In CVPR, 2018.
[23] R. Valle, J. Buenaposada, A. Valdes, and L. Baumela. A deeply-initialized coarse-to-fine ensemble of regression trees for face alignment. In ECCV, 2018.
[24] X. Dong, Y. Yan, W. Ouyang, and Y. Yang. Style aggregated network for facial landmark detection. In CVPR, 2018.
[25] W. Wu, C. Qian, S. Yang, Q. Wang, Y. Cai, and Q. Zhou. Look at boundary: A boundary-aware face alignment algorithm. In CVPR, 2018.
[26] A. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam. MobileNets: Efficient convolutional neural networks for mobile vision applications. CoRR, abs/1704.04861, 2017.
[27] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen. MobileNetV2: Inverted residuals and linear bottlenecks. CoRR, abs/1801.04381, 2018.
[28] X. Guo, S. Li, J. Zhang, J. Ma, L. Ma, W. Liu, and H. Ling. PFLD: A practical facial landmark detector. CoRR, abs/1902.10859, 2019.

Claims (3)

1. A multitask neural network face key point detection method for edge devices, characterized in that a face image to be detected is input into a convolutional neural network, and the convolutional neural network outputs the detected face key point coordinates; the loss function of the convolutional neural network is as follows:

$$\mathcal{L} = \frac{1}{M}\sum_{m=1}^{M}\sum_{n=1}^{N}\Bigl(\sum_{c=1}^{C}\omega_n^c\sum_{k=1}^{K}\bigl(1-\cos\theta_k^m\bigr)\Bigr)\bigl\|\mathbf{d}_n^m\bigr\|_2^2$$

wherein $\|\mathbf{d}_n^m\|$ is the distance between the $n$-th key point of the $m$-th input and its ground truth; $N$ is the preset number of key points to be detected on each face; $M$ is the total number of samples in the training picture set; $\theta_1$, $\theta_2$ and $\theta_3$ ($K = 3$) are the deviations between the actual and predicted values of the yaw, pitch and roll angles; $C$ indexes the different face classes, including frontal face, profile, head up, head down, expression and occlusion; and the weight $\omega_n^c$ is adjusted according to the fraction of each class among the samples, the reciprocal of the class fraction being taken as the weight.
2. The method of claim 1, wherein the convolutional neural network is based on a MobileNet convolutional neural network.
3. The multitask neural network face key point detection method for edge devices according to claim 1, characterized in that data augmentation is performed on the training data, with the following specific steps:
1) each face picture is mirrored, and rotated in 5-degree steps from -30 degrees to 30 degrees;
2) 20% of the face area is randomly occluded in each picture;
and in that a sub-network is introduced while training the convolutional neural network to supervise model training, the sub-network being used only in the training stage, taking the output of the fourth layer of the convolutional neural network as input, and outputting the three Euler angles of yaw, pitch and roll for computing the loss function.
CN202011386983.9A 2020-12-01 2020-12-01 Multitask neural network face key point detection method for edge equipment Pending CN112836566A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011386983.9A CN112836566A (en) 2020-12-01 2020-12-01 Multitask neural network face key point detection method for edge equipment


Publications (1)

Publication Number Publication Date
CN112836566A true CN112836566A (en) 2021-05-25

Family

ID=75923432

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011386983.9A Pending CN112836566A (en) 2020-12-01 2020-12-01 Multitask neural network face key point detection method for edge equipment

Country Status (1)

Country Link
CN (1) CN112836566A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113782184A (en) * 2021-08-11 2021-12-10 杭州电子科技大学 Cerebral apoplexy auxiliary evaluation system based on facial key point and feature pre-learning
WO2022257456A1 (en) * 2021-06-10 2022-12-15 平安科技(深圳)有限公司 Hair information recognition method, apparatus and device, and storage medium
CN115984461A (en) * 2022-12-12 2023-04-18 广州紫为云科技有限公司 Face three-dimensional key point detection method based on RGBD camera


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107239736A (en) * 2017-04-28 2017-10-10 北京智慧眼科技股份有限公司 Method for detecting human face and detection means based on multitask concatenated convolutional neutral net
WO2019109526A1 (en) * 2017-12-06 2019-06-13 平安科技(深圳)有限公司 Method and device for age recognition of face image, storage medium
WO2019128367A1 (en) * 2017-12-26 2019-07-04 广州广电运通金融电子股份有限公司 Face verification method and apparatus based on triplet loss, and computer device and storage medium
CN108805977A (en) * 2018-06-06 2018-11-13 浙江大学 A kind of face three-dimensional rebuilding method based on end-to-end convolutional neural networks
CN110263774A (en) * 2019-08-19 2019-09-20 珠海亿智电子科技有限公司 A kind of method for detecting human face
CN111160269A (en) * 2019-12-30 2020-05-15 广东工业大学 Face key point detection method and device


Similar Documents

Publication Publication Date Title
CN107506717B (en) Face recognition method based on depth transformation learning in unconstrained scene
CN107832672B (en) Pedestrian re-identification method for designing multi-loss function by utilizing attitude information
CN109472198B (en) Gesture robust video smiling face recognition method
WO2020108362A1 (en) Body posture detection method, apparatus and device, and storage medium
CN112836566A (en) Multitask neural network face key point detection method for edge equipment
WO2016034059A1 (en) Target object tracking method based on color-structure features
CN110490158B (en) Robust face alignment method based on multistage model
CN107953329B (en) Object recognition and attitude estimation method and device and mechanical arm grabbing system
Tsai et al. Simultaneous 3D object recognition and pose estimation based on RGB-D images
CN111259739B (en) Human face pose estimation method based on 3D human face key points and geometric projection
US20200211220A1 (en) Method for Identifying an Object Instance and/or Orientation of an Object
CN112037320A (en) Image processing method, device, equipment and computer readable storage medium
CN110598715A (en) Image recognition method and device, computer equipment and readable storage medium
CN112308128B (en) Image matching method based on attention mechanism neural network
CN109858433B (en) Method and device for identifying two-dimensional face picture based on three-dimensional face model
CN112528902B (en) Video monitoring dynamic face recognition method and device based on 3D face model
CN108564043B (en) Human body behavior recognition method based on space-time distribution diagram
CN110188630A (en) A kind of face identification method and camera
CN111881841B (en) Face detection and recognition method based on binocular vision
Luo et al. Dynamic face recognition system in recognizing facial expressions for service robotics
CN104751144B (en) A kind of front face fast appraisement method of facing video monitoring
Luo et al. Alignment and tracking of facial features with component-based active appearance models and optical flow
CN112990047B (en) Multi-pose face verification method combining face angle information
CN102496022B (en) Effective feature point description I-BRIEF method
Deng et al. Multi-stream face anti-spoofing system using 3D information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination