CN110598601A - Face 3D key point detection method and system based on distributed thermodynamic diagram - Google Patents

Face 3D key point detection method and system based on distributed thermodynamic diagram

Info

Publication number
CN110598601A
CN110598601A (Application CN201910818437.9A)
Authority
CN
China
Prior art keywords
distributed
thermodynamic diagram
network
thermodynamic
face
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910818437.9A
Other languages
Chinese (zh)
Inventor
王正宁
何庆东
赵德明
刘怡君
曾仪
曾浩
张翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN201910818437.9A priority Critical patent/CN110598601A/en
Publication of CN110598601A publication Critical patent/CN110598601A/en
Pending legal-status Critical Current

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06T3/06
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/64 Three-dimensional objects
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Biomedical Technology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a face 3D key point detection method and system based on distributed thermodynamic diagrams, comprising the following steps: performing dimensionality-reduction projection of the N 3D reference coordinate vectors of the face key points in a database onto three two-dimensional planes; encoding each 2D reference coordinate vector into a distributed thermodynamic diagram with a distributed encoding sub-network; combining the N distributed thermodynamic diagrams into a 2D joint thermodynamic diagram by coordinate mapping; superposing the three 2D joint thermodynamic diagrams into a 3D joint thermodynamic diagram with a concat operation; and decoding the 3D joint thermodynamic diagram into N 3D detected coordinate vectors with a decoding sub-network. The method combines the advantages of existing 2D and 3D face key point detection methods: it constructs distributed thermodynamic diagrams and merges them by coordinate mapping, the distributed encoding sub-network model is simple with a small computational load, and model parameters can be further reduced and inference speed increased while maintaining high detection accuracy.

Description

Face 3D key point detection method and system based on distributed thermodynamic diagram
Technical Field
The invention relates to the technical field of image processing and computer vision, and in particular to a face 3D key point detection method and system based on distributed thermodynamic diagrams (heat maps).
Background
With the rapid development of deep learning technology in the field of computer vision, various face image processing tasks are widely applied in life, wherein face key point detection plays an important role in face recognition, expression recognition, face reconstruction and the like.
Face key point detection has achieved tremendous success in the past decade, particularly in 2D face key point detection. The ASM (Active Shape Model) algorithm proposed by Cootes et al., based on a point distribution model, is a classic face key point detection algorithm: a training set is first annotated manually, a shape model is obtained by training, and matching of a specific object is then realized by matching key points. The CPR (Cascaded Pose Regression) algorithm proposed by Dollar refines a designated initial prediction step by step through a series of regressors, each depending on the output of the previous one to execute simple image operations, so that the whole system can learn automatically from training samples. In addition, Zhang et al. proposed the multi-task cascaded convolutional neural network MTCNN (Multi-Task Cascaded Convolutional Network) for handling face detection and face key point localization simultaneously. However, in complex scenes such as large-angle poses and face occlusion, 2D-based face key point detection is difficult and limited. To address this limitation, more and more researchers have turned to 3D face key point detection, which conveys more information, including occlusion cues, than 2D detection.
3D face key point detection methods are roughly classified into model-based and non-model-based methods. Model-based: the three-dimensional morphable model (3DMM) proposed by Blanz et al. is a common approach to 3D face key point detection. Non-model-based: Tulyakov et al. proposed locating 3D face key points by computing three-dimensional shape features through cascade regression, extending the cascade regression method to 3D face key point detection. In addition, deep learning models are also used for face key point detection, mainly divided into two-stage regression methods and volumetric representation methods: a typical two-stage regression method decouples the coordinates by axis and regresses them in two stages, while the volumetric representation method expands the traditional 2D thermodynamic diagram into a 3D volumetric form and is widely applied in human body key point detection.
However, as the dimensionality of 3D space increases, the processing speed and accuracy of the corresponding algorithms face severe challenges, and existing 3D face key point detection algorithms have shortcomings, to varying degrees, in processing speed, model size and complexity, and model accuracy.
Disclosure of Invention
An objective of the present invention is to overcome the above problems in the prior art and provide a face 3D key point detection method and system based on distributed thermodynamic diagrams that simplify the model and increase processing speed while preserving accuracy.
In order to achieve the above object, the present invention adopts the following aspects.
A face 3D key point detection method based on distributed thermodynamic diagrams comprises the following steps:
Step 101, performing dimensionality-reduction projection of N 3D reference coordinate vectors of face key points in a database onto three two-dimensional planes, the xy, xz and yz planes, where x, y and z are all positive or all negative simultaneously; each two-dimensional plane contains N 2D reference coordinate vectors corresponding to the N 3D reference coordinate vectors;
Step 102, encoding each 2D reference coordinate vector into a distributed thermodynamic diagram with a distributed encoding sub-network; N distributed thermodynamic diagrams are obtained for each two-dimensional plane;
Step 103, combining the N distributed thermodynamic diagrams of a two-dimensional plane into one 2D joint thermodynamic diagram by coordinate mapping;
Step 104, superposing the 2D joint thermodynamic diagrams of the three two-dimensional planes into a 3D joint thermodynamic diagram with a concat operation;
Step 105, decoding the 3D joint thermodynamic diagram into N 3D detected coordinate vectors with a decoding sub-network.
Preferably, the distributed encoding sub-network encodes each 2D reference coordinate vector into a set of continuous values and selects the maximum of the set as the coded value; the thermodynamic diagram corresponding to the coded value is the distributed thermodynamic diagram of that 2D reference coordinate vector.
Preferably, the distributed encoding sub-network is constructed from a k-order hourglass network and trained on face images with coordinate annotations, forming a nonlinear mapping whose input is a face image with coordinate vectors and whose output is a distributed thermodynamic diagram.
Preferably, the decoding sub-network is constructed from a 2D fully convolutional network and trained on 3D joint thermodynamic diagrams, forming a nonlinear mapping whose input is the 3D joint thermodynamic diagram and whose output is the 3D detected coordinate vectors.
Preferably, the decoding sub-network comprises five 2D convolutional layers with 128, 128, 256, 256 and 512 convolution kernels respectively; each kernel is 4 × 4 with a stride of 2; batch normalization and LeakyReLU activation are inserted between the convolutional layers.
A distributed thermodynamic diagram based 3D face keypoint detection system comprising at least one processor, and a memory communicatively connected to the at least one processor; the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the above-described method.
In summary, due to the adoption of the technical scheme, the invention at least has the following beneficial effects:
The representation dimension of the 3D key point coordinate vectors is reduced by projecting them onto two-dimensional planes; combining the advantages of 2D and 3D face key point detection, a face key point detection model coupling distributed thermodynamic diagrams with coordinate regression is provided; the projected two-dimensional coordinates are distributively encoded and then expressed as a 2D joint thermodynamic diagram through coordinate mapping, preserving the relations among the coordinates while reducing the representation dimensionality; and the joint thermodynamic diagram is decoded by coordinate regression into the final detected 3D key point coordinates, realizing direct detection of N 3D key point coordinates from a single 2D face image.
Drawings
Fig. 1 is a flowchart of a distributed thermodynamic diagram-based face 3D keypoint detection method according to an exemplary embodiment of the present invention.
Fig. 2 is a schematic diagram of a distributed coding subnetwork structure according to an exemplary embodiment of the present invention.
Fig. 3 shows 2D face key points and their corresponding distributed thermodynamic diagrams, combined by coordinate mapping into a joint thermodynamic diagram (leftmost: the projection of the joint thermodynamic diagram onto a plane), according to an exemplary embodiment of the present invention.
Fig. 4 is a schematic diagram of a one-stage stacked hourglass network configuration according to an exemplary embodiment of the present invention.
Fig. 5 is a schematic diagram of a first order hourglass network structure according to an exemplary embodiment of the present invention.
Fig. 6 is a schematic diagram of a decoding sub-network structure according to an exemplary embodiment of the present invention.
Fig. 7 is a schematic structural diagram of a face 3D keypoint detection system based on a distributed thermodynamic diagram according to an exemplary embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and embodiments, so that the objects, technical solutions and advantages of the present invention will be more clearly understood. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Fig. 1 illustrates a face 3D keypoint detection method based on distributed thermodynamic diagrams according to an exemplary embodiment of the present invention. The method of this embodiment mainly includes:
Step 101, performing dimensionality-reduction projection of N 3D reference coordinate vectors of face key points in a database onto three two-dimensional planes, the xy, xz and yz planes, where x, y and z are all positive or all negative simultaneously; each two-dimensional plane contains N 2D reference coordinate vectors corresponding to the N 3D reference coordinate vectors.
specifically, the 3D reference coordinate vectors of N key points of the face are extracted from a group route (generally abbreviated as GT information) data set, and the total number of the key points of the general face is 68, so that N is preferably 68 in this embodiment. And performing dimensionality reduction decomposition on the extracted N3D key point reference coordinate vectors (x, y, z) in three two-dimensional planes.
In the projection, each 3D vector is decomposed into the three 2D reference coordinate vectors (x, y), (y, z) and (x, z). Let V_{x,y,z} = (x, y, z) denote a key point's 3D reference coordinate vector; the three 2D reference coordinate vectors generated from it are v_{x,y} = (x, y), v_{y,z} = (y, z) and v_{x,z} = (x, z).
for example: a three-dimensional space coordinate point is (1, -2, 3), and dimension reduction decomposition is carried out to obtain (1, -2), (-2, 3) and (1, 3), but in order to form a joint 2D thermodynamic diagram later, three coordinate planes of xy, yz and xz (x, y and z have the same positive and negative polarities and are positive or negative at the same time) are projected during dimension reduction; therefore, three two-dimensional reference coordinates with the same positive and negative characters can be obtained by each three-dimensional coordinate after dimension reduction. Preferably, we project it in three planes in the first quadrant (x, y, z are all positive) of the spatial coordinate system.
102, respectively encoding each 2D reference coordinate vector into a distributed thermodynamic diagram by adopting a distributed encoding sub-network; wherein, N distributed thermodynamic diagrams can be obtained by one two-dimensional plane;
specifically, fig. 2 shows the distributed coding sub-network structure, where the distributed coding sub-network codes each 2D reference coordinate vector as a set of continuous values, and selects the maximum value as a coding value, and uses the thermodynamic diagram corresponding to the coding value as the distributed thermodynamic diagram of the 2D reference coordinate vector. Order toIndicates that the mth thermodynamic diagram is located at (i)m,jm) The value of (c), m ∈ {1,2,3 }. For the nth key point on the face image, the position is vx,y,vy,z,vx,zThe (x, y) coordinate vector is encoded in 2D gaussian form (the other two coordinate vectors do the same operation), as shown in equation (1) (σ is the variance):
for a face image with N key points, selecting the maximum value of each key point in a series of continuous values coded by the key point through a max function as a coded value, wherein the thermodynamic diagram corresponding to the coded value is the distributed thermodynamic diagram of the corresponding 2D reference coordinate vector. Fig. 3 (taking N ═ 15 as an example) shows the projection (leftmost) of the corresponding 2D face key points and their corresponding distributed thermodynamic diagrams, and the joint thermodynamic diagram formed by the joint of the distributed thermodynamic diagrams through coordinate mapping, on the image plane.
The distributed encoding sub-network is constructed from a k-order hourglass network model (for example, k = 1), and the network ultimately used for distributed encoding is obtained by training: the sub-network is trained on face images with coordinate annotations to form a nonlinear mapping whose input is a face image with coordinate vectors and whose output is a distributed thermodynamic diagram. Since the distributed encoding sub-network only needs to map a face image with coordinate vectors to a distributed thermodynamic diagram, the corresponding model is simple and compact, which further increases its execution speed.
As shown in figs. 4 and 5, the k-order (k = 1) hourglass network model is centered on an hourglass sub-network (256 input channels, 512 output channels; the specific structure is shown in fig. 5), with other modules added around it to form a one-stage stacked hourglass network. The original image first passes through a convolutional layer (kernel size 7) followed by batch normalization, is then down-sampled by max-value pooling, and then passes through three residual modules (each with 128 input and output channels) before entering the hourglass sub-network. The hourglass output is processed by two linear-transformation modules and then channel-converted by a convolution (kernel size 1) to obtain the final thermodynamic diagram. The first-order hourglass network is shown in fig. 5: the upper and lower branches both comprise several residual modules (the first number denotes input channels, the second output channels) that extract deeper features step by step. The upper branch operates at the original scale, while the lower branch undergoes down-sampling followed by up-sampling; down-sampling uses max pooling and up-sampling uses nearest-neighbor interpolation. Finally, the outputs of the two branches are added to obtain the final output.
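The two-branch structure of one hourglass level can be sketched as below (Python/NumPy assumed). The residual modules are represented by placeholder callables, since their internals are not the point here; the 2×2 max pooling and nearest-neighbor up-sampling match the text.

```python
import numpy as np

def maxpool2(x):
    """2x2 max pooling, the down-sampling used by the lower branch."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def upsample2(x):
    """Nearest-neighbor 2x up-sampling, restoring the original scale."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def hourglass_level(x, upper_branch, lower_branch):
    """One level of a first-order hourglass: the upper branch keeps the
    original scale, the lower branch runs at half resolution, and the
    two outputs are added (residual modules are stand-in callables)."""
    return upper_branch(x) + upsample2(lower_branch(maxpool2(x)))

# Identity branches just demonstrate that resolution is preserved.
out = hourglass_level(np.ones((8, 8)), lambda x: x, lambda x: x)
```

In the actual network the placeholder branches would be stacks of residual modules, and the recursion to deeper levels happens inside the lower branch.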
Since the size of the joint thermodynamic diagram is w × h × 3, the encoding resolution is typically set to 128 × 128 × 3 for a face image of size 256 × 256, so that the distributed encoding sub-network E forms a mapping E(I) → H from the input face image I to the distributed thermodynamic diagram H. The network takes a face image of size 128 × 128 as input and outputs a w × h distributed thermodynamic diagram (the output resolution can be set according to actual needs). The loss function, a mean-square error between the predicted and reference thermodynamic diagrams, is shown in equation (2):
L_E = (1 / (w·h)) Σ_{i,j} (E(I)_{i,j} − H_{i,j})²   (2)
103, combining the N distributed thermodynamic diagrams under a two-dimensional plane into a 2D combined thermodynamic diagram in a coordinate mapping mode;
specifically, the N thermodynamic diagrams obtained in each two-dimensional plane in step 102 are combined into one thermodynamic diagram in a corresponding point coordinate mapping manner, so as to obtain three 2D combined thermodynamic diagrams with the size of w × h.
Step 104, superposing the 2D joint thermodynamic diagrams under the three two-dimensional planes into a 3D joint thermodynamic diagram through a concat algorithm;
specifically, 2D joint thermodynamic diagrams under three two-dimensional planes are superposed by adopting a concat method to obtain a 3D thermodynamic diagram. The Concat method is a joint vector algorithm used to join two or more arrays. These three 2D joint thermodynamic diagrams can be superimposed together by the concat method, resulting in a 3D thermodynamic diagram with a size of w × h × 3 (where 3 represents 3 channels), as shown in equation (3):
H=concat(p1,p2,p3) (3)
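The concat step amounts to stacking the three w × h joint maps along a new channel axis (Python/NumPy assumed; channel-last layout is an assumption):

```python
import numpy as np

# Three w x h joint maps (one per projection plane), here w = h = 64.
p1, p2, p3 = (np.zeros((64, 64)) for _ in range(3))

# Stack into one w x h x 3 volume, mirroring H = concat(p1, p2, p3).
H = np.stack([p1, p2, p3], axis=-1)
```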
and 105, decoding the 3D joint thermodynamic diagram into N3D detection coordinate vectors by adopting a decoding sub network.
Specifically, the 3D joint thermodynamic diagram is decoded by the decoding sub-network to obtain the detected coordinate vectors of the N 3D key points.
The decoding sub-network may be pre-trained to form a mapping D(H) → c from the joint thermodynamic diagram H to the corresponding 3D coordinate vectors c. Since the size of the joint thermodynamic diagram H is w × h × 3, the decoding sub-network is constructed as a 2D fully convolutional network, as shown in fig. 6. It comprises five 2D convolutional layers with 128, 128, 256, 256 and 512 convolution kernels respectively, each of size 4 × 4 with a stride of 2; the last convolutional layer has N × 3 channels, batch normalization and LeakyReLU activation are inserted between the convolutional layers, and the final layer is a global average pooling layer. Passing the 3D joint thermodynamic diagram obtained by the concat method through the decoding sub-network yields the N 3D key point coordinate vectors. Further, the decoding sub-network is pre-trained with a mean-square-error loss function, shown in equation (5):
L_D = ||D(H) − c||²   (5)
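The spatial shrinkage through the five stride-2 convolutional layers can be checked with the standard convolution output-size formula (a padding of 1 is an assumption; kernel size 4 and stride 2 are given in the text):

```python
def conv_out(size, kernel=4, stride=2, pad=1):
    """Spatial output size of one conv layer: (size + 2p - k) // s + 1."""
    return (size + 2 * pad - kernel) // stride + 1

# Trace a 128 x 128 joint map through the five decoder conv layers.
sizes = [128]
for _ in range(5):
    sizes.append(conv_out(sizes[-1]))
# 128 -> 64 -> 32 -> 16 -> 8 -> 4, then global average pooling
```

With these assumptions the 4 × 4 output of the last layer is what the global average pooling collapses into the N × 3 coordinate values.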
therefore, the extraction of the detection coordinate vectors of the N3D key points of the human face is completed.
Further, when building the algorithm model, the distributed encoding sub-network and the decoding sub-network can each be pre-trained separately and then connected together and fine-tuned as a whole (in the implementation, coordinate mapping is added between the two network models to combine the N thermodynamic diagrams, and the concat operation is added to superpose the 2D joint thermodynamic diagrams into the 3D thermodynamic diagram). This proceeds in two steps:
the first step is as follows: in the pre-training stage, the distributed coding sub-network is trained by using a face image with a coordinate vector, so that the non-linear mapping with 1 face image with the coordinate vector as input and one distributed thermodynamic diagram as output layer is formed. At the same time, the decoding sub-networks are trained using a 3D joint thermodynamic diagram to form a non-linear mapping whose input is the 3D joint thermodynamic diagram and whose output is the 3D detection coordinate vector.
Second step, fine-tuning: the pre-trained decoding sub-network is connected behind the pre-trained distributed encoding sub-network, with the coordinate-mapping and concat operations added between the two networks, forming the complete distributed-thermodynamic-diagram face 3D key point detection network model. The complete model is fine-tuned, and finally the whole network is trained end to end with the following loss function:
L = L_E + λ · L_D
where L_D is the loss function of the decoding sub-network; L_E is the loss function of the distributed encoding sub-network; λ is the weight of the coordinate regression loss (typically a number less than 1, such as 0.1); D denotes the decoding sub-network; c the 3D detected coordinate vectors; H the distributed thermodynamic diagram; E the distributed encoding sub-network; and I the face image with coordinate vectors.
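The end-to-end objective, the encoding loss plus λ times the coordinate-regression loss, can be sketched as below (Python/NumPy assumed; plain MSE for both terms is an assumption consistent with the mean-square-error losses described for the two sub-networks):

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two arrays."""
    return float(np.mean((a - b) ** 2))

def total_loss(H_pred, H_gt, c_pred, c_gt, lam=0.1):
    """Fine-tuning loss: heatmap (encoding) loss plus lambda times the
    coordinate-regression (decoding) loss. lam = 0.1 follows the text's
    example weight; the MSE form of each term is assumed."""
    return mse(H_pred, H_gt) + lam * mse(c_pred, c_gt)

# Toy values: both terms equal 1.0, so the total is 1.0 + 0.1 * 1.0.
loss = total_loss(np.zeros((4, 4)), np.ones((4, 4)),
                  np.zeros(6), np.ones(6))
```

During fine-tuning, gradients from both terms flow through the shared encoding sub-network, which is why the coordinate term is down-weighted.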
Fig. 7 illustrates a face 3D key point detection system based on joint thermodynamic diagrams according to an exemplary embodiment of the invention, namely an electronic device 310 (e.g., a computer server with program-execution capability) comprising at least one processor 311, a power supply 314, and a memory 312 and an input/output interface 313 communicatively connected to the at least one processor 311. The memory 312 stores instructions executable by the at least one processor 311 to enable the at least one processor 311 to perform the method disclosed in any of the foregoing embodiments; the input/output interface 313 may include a display, keyboard, mouse and USB interface for data input/output; and the power supply 314 powers the electronic device 310.
Those skilled in the art will understand that: all or part of the steps for realizing the method embodiments can be completed by hardware related to program instructions, the program can be stored in a computer readable storage medium, and the program executes the steps comprising the method embodiments when executed; and the aforementioned storage medium includes: various media that can store program codes, such as a removable Memory device, a Read Only Memory (ROM), a magnetic disk, or an optical disk.
When the integrated unit of the present invention is implemented in the form of a software functional unit and sold or used as a separate product, it may also be stored in a computer-readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present invention may be essentially implemented or a part contributing to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present invention. And the aforementioned storage medium includes: a removable storage device, a ROM, a magnetic or optical disk, or other various media that can store program code.
The foregoing is merely a detailed description of specific embodiments of the invention and is not intended to limit the invention. Various alterations, modifications and improvements will occur to those skilled in the art without departing from the spirit and scope of the invention.

Claims (6)

1. A face 3D key point detection method based on distributed thermodynamic diagrams, characterized by comprising the following steps:
Step 101, performing dimensionality-reduction projection of N 3D reference coordinate vectors of face key points in a database onto three two-dimensional planes, the xy, xz and yz planes, where x, y and z are all positive or all negative simultaneously; each two-dimensional plane contains N 2D reference coordinate vectors corresponding to the N 3D reference coordinate vectors;
Step 102, encoding each 2D reference coordinate vector into a distributed thermodynamic diagram with a distributed encoding sub-network; N distributed thermodynamic diagrams are obtained for each two-dimensional plane;
Step 103, combining the N distributed thermodynamic diagrams of a two-dimensional plane into one 2D joint thermodynamic diagram by coordinate mapping;
Step 104, superposing the 2D joint thermodynamic diagrams of the three two-dimensional planes into a 3D joint thermodynamic diagram with a concat operation;
Step 105, decoding the 3D joint thermodynamic diagram into N 3D detected coordinate vectors with a decoding sub-network.
2. The method of claim 1, characterized in that the distributed encoding sub-network encodes each 2D reference coordinate vector as a set of continuous values and selects the maximum of the set as the coded value, the thermodynamic diagram corresponding to the coded value being the distributed thermodynamic diagram of that 2D reference coordinate vector.
3. The method of claim 1, characterized in that the distributed encoding sub-network is constructed from a k-order hourglass network and trained on face images with coordinate annotations, forming a nonlinear mapping whose input is a face image with coordinate vectors and whose output is a distributed thermodynamic diagram.
4. The method of claim 1, characterized in that the decoding sub-network is constructed from a 2D fully convolutional network and trained on 3D joint thermodynamic diagrams, forming a nonlinear mapping whose input is the 3D joint thermodynamic diagram and whose output is the 3D detected coordinate vectors.
5. The method of claim 4, characterized in that the decoding sub-network comprises five 2D convolutional layers with 128, 128, 256, 256 and 512 convolution kernels respectively, each kernel being 4 × 4 with a stride of 2, and batch normalization and LeakyReLU activation inserted between the convolutional layers.
6. A human face 3D key point detection system based on distributed thermodynamic diagrams is characterized by comprising at least one processor and a memory which is in communication connection with the at least one processor; the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 5.
CN201910818437.9A 2019-08-30 2019-08-30 Face 3D key point detection method and system based on distributed thermodynamic diagram Pending CN110598601A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910818437.9A CN110598601A (en) 2019-08-30 2019-08-30 Face 3D key point detection method and system based on distributed thermodynamic diagram


Publications (1)

Publication Number Publication Date
CN110598601A true CN110598601A (en) 2019-12-20

Family

ID=68856555

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910818437.9A Pending CN110598601A (en) 2019-08-30 2019-08-30 Face 3D key point detection method and system based on distributed thermodynamic diagram

Country Status (1)

Country Link
CN (1) CN110598601A (en)



Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106530389A (en) * 2016-09-23 2017-03-22 西安电子科技大学 Three-dimensional reconstruction method based on medium wave infrared face image
CN108805977A (en) * 2018-06-06 2018-11-13 浙江大学 A kind of face three-dimensional rebuilding method based on end-to-end convolutional neural networks
US20190377409A1 (en) * 2018-06-11 2019-12-12 Fotonation Limited Neural network image processing apparatus
CN109241910A (en) * 2018-09-07 2019-01-18 高新兴科技集团股份有限公司 A kind of face key independent positioning method returned based on the cascade of depth multiple features fusion

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHENGNING WANG et al.: "A Light-Weighted Network for Facial Landmark Detection via Combined Heatmap and Coordinate Regression", 2019 IEEE International Conference on Multimedia and Expo *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111523480A (en) * 2020-04-24 2020-08-11 北京嘀嘀无限科技发展有限公司 Method and device for detecting face obstruction, electronic equipment and storage medium
CN111523480B (en) * 2020-04-24 2021-06-18 北京嘀嘀无限科技发展有限公司 Method and device for detecting face obstruction, electronic equipment and storage medium
CN111737396A (en) * 2020-08-26 2020-10-02 成都四方伟业软件股份有限公司 Method and device for improving thermodynamic diagram display performance based on 2D convolution
CN113688664A (en) * 2021-07-08 2021-11-23 三星(中国)半导体有限公司 Face key point detection method and face key point detection device
CN113688664B (en) * 2021-07-08 2024-04-26 三星(中国)半导体有限公司 Face key point detection method and face key point detection device
CN113705488A (en) * 2021-08-31 2021-11-26 中国电子科技集团公司第二十八研究所 Remote sensing image fine-grained airplane identification method based on local segmentation and feature fusion
CN114757822A (en) * 2022-06-14 2022-07-15 之江实验室 Binocular-based human body three-dimensional key point detection method and system
CN115565207A (en) * 2022-11-29 2023-01-03 武汉图科智能科技有限公司 Occlusion scene downlink person detection method with feature simulation fused

Similar Documents

Publication Publication Date Title
CN110598601A (en) Face 3D key point detection method and system based on distributed thermodynamic diagram
Mittal et al. Autosdf: Shape priors for 3d completion, reconstruction and generation
Parmar et al. Image transformer
CN110188768B (en) Real-time image semantic segmentation method and system
Lu et al. 3DCTN: 3D convolution-transformer network for point cloud classification
CN110929736B (en) Multi-feature cascading RGB-D significance target detection method
EP3396603A1 (en) Learning an autoencoder
Jiang et al. Dual attention mobdensenet (damdnet) for robust 3d face alignment
CN112215050A (en) Nonlinear 3DMM face reconstruction and posture normalization method, device, medium and equipment
CN110675316B (en) Multi-domain image conversion method, system and medium for generating countermeasure network based on condition
Spurek et al. Hypernetwork approach to generating point clouds
JP7135659B2 (en) SHAPE COMPLEMENTATION DEVICE, SHAPE COMPLEMENTATION LEARNING DEVICE, METHOD, AND PROGRAM
CN114677412B (en) Optical flow estimation method, device and equipment
CN113706686A (en) Three-dimensional point cloud reconstruction result completion method and related components
CN112132739B (en) 3D reconstruction and face pose normalization method, device, storage medium and equipment
CN114049435A (en) Three-dimensional human body reconstruction method and system based on Transformer model
CN110516643A (en) A kind of face 3D critical point detection method and system based on joint thermodynamic chart
CN111598111A (en) Three-dimensional model generation method and device, computer equipment and storage medium
US20220335685A1 (en) Method and apparatus for point cloud completion, network training method and apparatus, device, and storage medium
CN111462274A (en) Human body image synthesis method and system based on SMPL model
CN110516642A (en) A kind of lightweight face 3D critical point detection method and system
Kim et al. Deep translation prior: Test-time training for photorealistic style transfer
CN114494543A (en) Action generation method and related device, electronic equipment and storage medium
Zamyatin et al. Learning to generate chairs with generative adversarial nets
WO2023071806A1 (en) Apriori space generation method and apparatus, and computer device, storage medium, computer program and computer program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191220