CN114119697A - PIC-based 3D face model face texture diversity completion method - Google Patents
- Publication number
- CN114119697A CN114119697A CN202111403229.6A CN202111403229A CN114119697A CN 114119697 A CN114119697 A CN 114119697A CN 202111403229 A CN202111403229 A CN 202111403229A CN 114119697 A CN114119697 A CN 114119697A
- Authority
- CN
- China
- Prior art keywords
- face
- texture
- model
- pic
- incomplete
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
- G06T7/529—Depth or shape recovery from texture
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/04—Texture mapping
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
- G06T2207/30201—Face
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Graphics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Health & Medical Sciences (AREA)
- Geometry (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
- Processing Or Creating Images (AREA)
- Image Processing (AREA)
Abstract
A PIC-based 3D face model face texture diversity completion method comprises: constructing a high-definition facial texture data set through a texture completion network and a face texture data set; estimating the parameter distribution of the latent space of the masked region of the facial texture with a conditional variational autoencoder (CVAE); decomposing the latent information in the parameter distribution after sampling; training an improved PIC network model with distribution regularization loss and appearance matching loss as the generation losses; and finally generating a complete facial texture with the trained improved PIC network model to replace the incomplete texture of the 3D face model. The method can generate a complete facial texture for a 3D face model reconstructed from a single image, using texture recovery together with face images taken from different angles.
Description
Technical Field
The invention relates to a technology in the field of image processing, in particular to a facial texture diversity completion method for 3D face models based on Pluralistic Image Completion (PIC).
Background
Existing deep-learning-based 3D face reconstruction algorithms can quickly generate a 3D model from an input image, but many problems remain in training. The clarity and completeness of the reconstructed facial texture are critical. Single-image 3D face reconstruction has an obvious limitation: one image cannot show the complete facial texture, especially for profile (side-face) images, so recovering as much of the missing side texture as possible is very important.
Face parameterized models designed after the 3D morphable model also contain texture parameters, so that given a picture of a person, the facial texture closest to that person can be obtained by fitting the texture parameters. However, this approach places high demands on the face database used to build the parameterized model: a model built from a Caucasian face database fits Asian facial textures relatively poorly, and textures fitted to images with non-natural skin tones, for example antique photographs, are seriously distorted. UV-GAN [Deng J, Cheng S, Xue N, et al. UV-GAN: Adversarial Facial UV Map Completion for Pose-Invariant Face Recognition, 2017] exploits the power of generative adversarial networks to cast facial texture completion as image-to-image translation, using the real pixels in the image to construct the facial texture closest to the given image. LAB [Wu W, Qian C, Yang S, et al. Look at Boundary: A Boundary-Aware Face Alignment Algorithm, CVPR 2018] designed an Attention-Net-GAN that constructs a facial texture map through PRNet-style regression [Zhu X, Lei Z, Liu X, et al. Face Alignment Across Large Poses: A 3D Solution, CVPR 2016] and directly yields a complete facial texture, saving the time UV-GAN spends acquiring the incomplete facial texture. UV-GAN and LAB mainly address occlusion caused by the head pose of the person in the input image and do not consider the case where the face is occluded by other objects. Na et al. [Na I S, Tran C, Nguyen D, et al. Facial UV map completion for pose-invariant face recognition: a novel adversarial approach based on coupled attention residual UNets. Human-centric Computing and Information Sciences, 2020, 10(1):1-17] designed a generative adversarial network specifically for this problem: it takes a person image with facial occlusion as input and outputs a complete face image. Part of its training set adds random masks, such as sunglasses over the eyes, glasses and mouth masks, placed according to the facial key points of the image.
Disclosure of Invention
The invention provides a PIC-based 3D face model face texture diversity completion method, which addresses the following defects of the prior art: the clarity and completeness of the reconstructed facial texture are low; single-image 3D face reconstruction cannot display the complete facial texture; interpolation over the texture image generally cannot recover facial texture detail; and when the single view is a profile image, the occluded facial details cannot be recovered by interpolation at all.
The invention is realized by the following technical scheme:
the invention relates to a 3D face model face texture diversity completion method based on PIC, which constructs a high-definition face texture data set through a texture completion network and a face texture data set; and then estimating parameter distribution of a latent space of a mask area of the facial texture by adopting a Convolution Variation Automatic Encoder (CVAE), decomposing implicit information in the parameter distribution after sampling, training an improved PIC network model by using distribution regularization loss and appearance matching loss as generation loss, and finally generating complete facial texture by adopting the trained improved PIC network model to replace the 3D face model with the incomplete texture.
The invention also relates to a system for realizing the method, comprising: an incomplete texture generation unit, a residual encoder, a residual decoder and a short- and long-term attention unit, wherein: the incomplete texture generation unit generates an incomplete texture image from the 3D face reconstruction model and a public face database; the residual encoder encodes the incomplete texture image to obtain latent information, which is then sampled; the residual decoder decodes the latent information to obtain a complete texture image; and the short- and long-term attention unit combines the sampled latent information with the encoded information of the incomplete texture image, finally yielding diversified facial texture completions.
Technical effects
The invention creates an algorithm for constructing a facial texture data set and improves the PIC completion framework, applying it to facial texture completion. A large number of high-definition profile images are generated by modifying the semantic vectors of the SeePrettyFace StyleGAN model; incomplete facial textures are generated by a 3D face reconstruction framework such as 3DDFA/PRNet; and the high-definition facial texture data set is obtained by symmetric completion. The invention improves PIC and applies it to facial texture completion by eliminating the reconstruction path of PIC and changing the local appearance matching loss into a global appearance matching loss.
Drawings
FIG. 1 is a flow chart of the present invention;
fig. 2 is a schematic diagram of a network architecture according to an embodiment.
Detailed Description
As shown in fig. 1, the present embodiment relates to a method for completing facial texture diversity of a 3D face model based on PIC, which includes:
step one, constructing a high-definition facial texture data set, which specifically comprises the following steps:
a1. The high-definition facial texture data set uses the StyleGAN-based high-definition face generation pretrained model published by SeePrettyFace, which establishes a mapping from an (18,512)-dimensional vector to a (1024,1024,3)-dimensional image. Among the randomly generated pictures of this model, a profile photo of a person is found; the corresponding semantic values in the slice (0:18, 200:400) of the (18,512)-dimensional vector are fixed, and the remaining positions are randomized, so that a large number of high-definition profile photos can be acquired rapidly.
The high-definition face generation pretrained model represents each face with an (18,512)-dimensional vector; slightly changing the vector generates a slightly different new face.
The mapping relation is: the high-definition face generation pretrained model maps (18,512)-dimensional vectors to (1024,1024,3)-dimensional images, and the vectors carry semantic attributes such as age, gender and expression.
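The slice-fixing procedure of step a1 can be sketched as follows. This is a minimal, dependency-free illustration with hypothetical names: `random_latent` stands in for drawing a fresh StyleGAN-style (18,512) latent, and the actual generator that maps the latent to a (1024,1024,3) image is not shown.

```python
import random

random.seed(0)

def random_latent():
    # A fresh (18, 512) latent, standing in for a StyleGAN w+ code.
    return [[random.gauss(0.0, 1.0) for _ in range(512)] for _ in range(18)]

def make_profile_latent(profile_w):
    # Keep the semantic slice [0:18, 200:400] that encodes the profile
    # (side-face) pose fixed, and re-randomize every other position, so
    # each new sample keeps the pose but varies identity and appearance.
    w = random_latent()
    for row_new, row_ref in zip(w, profile_w):
        row_new[200:400] = row_ref[200:400]
    return w

profile_w = random_latent()       # latent known to produce a profile photo
new_w = make_profile_latent(profile_w)
```

Feeding each `new_w` to the generator would yield a new high-definition profile photo, which is how the data set is populated quickly.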
a2. According to the high-definition profile face data set obtained in step a1, generate the corresponding high-definition incomplete textures with a pretrained MobileNet-based network model, and obtain a high-definition complete texture data set through mirror-symmetry processing.
The MobileNet-based network model is implemented with, but not limited to, 3DDFA [Zhu X, Lei Z, Liu X, et al. Face Alignment Across Large Poses: A 3D Solution, CVPR 2016].
a3. Add fake illumination and fake shadows to each complete texture in the high-definition complete texture data set, finally obtaining data pairs of illuminated incomplete facial textures and illuminated complete facial textures.
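The mirror-symmetry processing of step a2 rests on the rough left-right symmetry of a face in UV space: a missing texel can be filled from its horizontally mirrored counterpart. A minimal sketch, with textures represented as nested lists for simplicity (a real implementation would operate on image arrays):

```python
def symmetric_complete(texture, missing):
    """Fill holes in a texture from their mirrored counterparts.

    texture: H x W grid of pixel values; missing: H x W booleans,
    True marking texels absent from the rendered (incomplete) texture.
    """
    h, w = len(texture), len(texture[0])
    out = [row[:] for row in texture]
    for y in range(h):
        for x in range(w):
            if missing[y][x]:
                out[y][x] = texture[y][w - 1 - x]  # mirrored texel
    return out

# Right half observed (value 1.0), left half missing.
tex = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
miss = [[True, True, False, False] for _ in range(4)]
filled = symmetric_complete(tex, miss)
```

In the toy example the missing left half is filled entirely from the observed right half; texels missing on both sides would need a different fallback, which this sketch does not handle.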
Step two, acquiring the latent vector zc through the encoding network E, which specifically includes:
b1. Using a random texture mask, obtain the incomplete texture Im and the missing part Ic of the corresponding complete texture Ig.
The encoding network E is implemented with the encoding network of PIC [Zheng C, Cham T J, Cai J. Pluralistic Image Completion, CVPR 2019].
The random texture mask is: a random stripe/block mask, or a face mask specific to face images, i.e., a front/left/right face texture mask.
b2. Obtain the parameter distribution of the latent space with the conditional variational autoencoder (CVAE), then sample the latent vector zc from that distribution; zc contains the information of the missing region. Specifically, compute the variational lower bound of the conditional log-likelihood log p(Ic|Im) of a training example: log p(Ic|Im) ≥ -KL(qψ(zc|Ic) || pφ(zc|Im)) + E_{qψ(zc|Ic)}[log pθ(Ic|zc, Im)], wherein: Ig is the original image, Im the observable part, and Ic the missing part. The KL-divergence term regularizes the learned importance sampling function qψ(·|Ic) toward the latent prior pφ(zc|Im).
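For diagonal Gaussian distributions, the KL term of this bound has a closed form. A plain-Python sketch (the function name and list-based interface are illustrative, not from the patent):

```python
import math

def kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p):
    """KL(N(mu_q, var_q) || N(mu_p, var_p)) for diagonal Gaussians,
    summed over latent dimensions. In training this regularizes the
    importance function q(z|Ic) toward the conditional prior p(z|Im)."""
    kl = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        kl += 0.5 * (lp - lq
                     + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp)
                     - 1.0)
    return kl

# KL of a distribution with itself is exactly zero.
same = kl_diag_gaussians([0.5, -0.2], [0.0, 0.1], [0.5, -0.2], [0.0, 0.1])
```

The zero self-divergence is a quick sanity check that the closed form is implemented consistently before wiring it into a training loop.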
Step three: the decoder obtains the latent information zc encoded by the encoding network E and acquires the information of the missing region.
The missing region refers to: the missing image part Ic, which during training is used to infer the importance function qψ(·|Ic) = Nψ(·). The sampled latent vector zc therefore contains information about the missing region.
Step four: obtain different generation results through the generation network G and the latent information zc obtained in step two. Specifically: when sampling from the importance function qψ(·|Ic), information about the missing region is available, and the likelihood pθ(Ic|zc, Im) emphasizes the reconstruction of Ic. In contrast, when sampling from the learned conditional prior pφ(·|Im), which does not involve Ic, the likelihood model becomes independent of the original instance's missing part, which can encourage creative generation.
The generation network G is implemented with the generation network of PIC [Zheng C, Cham T J, Cai J. Pluralistic Image Completion, CVPR 2019].
Step five: update the encoding network parameters and the generation network parameters by back-propagation with the minimized total loss until the loss converges, which includes: minimizing the adversarial loss L_ad = E[log D(Ig)] + E[log(1 - D(Igen))], where Igen is the generated image and Ig is the real image, with the discriminator parameters θD updated by back-propagation; minimizing the distribution regularization loss L_KL = KL(qψ(zc|Ic) || pφ(zc|Im)), where the learned conditional prior pφ(·|Im) is also Gaussian and qψ(·|Ic) is regularized toward it; and minimizing the appearance matching loss L_app = ||Igen - Ig||_1, where Igen is the generated image and Ig is the real image.
As shown in fig. 2, the system for implementing the method of this embodiment comprises: an incomplete texture generation unit, a residual encoder, a residual decoder and a short- and long-term attention unit, wherein: the incomplete texture generation unit generates an incomplete texture image from the 3D face reconstruction model and a public face database; the residual encoder encodes the incomplete texture image to obtain latent information, which is then sampled; the residual decoder decodes the latent information to obtain a complete texture image; and the short- and long-term attention unit combines the sampled latent information with the encoded information of the incomplete texture image, finally yielding diversified facial texture completions.
The incomplete texture generation unit comprises: a 3D face model generation unit, an interpolated facial texture generation unit and an incomplete facial texture generation unit, wherein: the 3D face model generation unit constructs a 3D face model from the input face image; the interpolated facial texture generation unit performs image interpolation using the vertex information of the 3D face model and the face image to obtain an interpolated facial texture; and the incomplete facial texture generation unit extracts the facial texture of the visible part according to the depth information of the nearest surface layer, obtaining the incomplete facial texture.
In practical experiments, on a Tesla P40 GPU under the Python PyTorch framework, the model was trained from scratch with Adam optimization at a fixed learning rate of λ = 10^-4, with β1 = 0 and β2 = 0.999. The final loss weights were α_KL = α_app = 20 and α_ad = 1. With these parameters, training the model with random irregular and center-hole masks takes approximately 5 weeks, and the average inference time is 59 ms.
Compared with the prior art, the method can construct the facial texture data set at essentially zero cost; by improving PIC, eliminating its reconstruction path, and changing the local appearance matching loss into a global appearance matching loss, it makes the completion more thorough and the completed textures richer and clearer. It is also the first face texture diversity completion method for 3D face models.
The foregoing embodiments may be modified in many different ways by those skilled in the art without departing from the spirit and scope of the invention, which is defined by the appended claims and all changes that come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.
Claims (9)
1. A PIC-based 3D face model face texture diversity completion method, characterized in that a high-definition facial texture data set is constructed through a texture completion network and a face texture data set; the parameter distribution of the latent space of the masked region of the facial texture is then estimated with a conditional variational autoencoder, the latent information in the parameter distribution is decomposed after sampling, an improved PIC network model is trained with distribution regularization loss and appearance matching loss as the generation losses, and finally a complete facial texture is generated with the trained improved PIC network model to replace the incomplete texture of the 3D face model.
2. The PIC-based 3D face model face texture diversity completion method of claim 1, wherein constructing the high-definition facial texture data set specifically comprises:
a1. the high-definition facial texture data set uses a StyleGAN-based high-definition face generation pretrained model, which establishes a mapping from an (18,512)-dimensional vector to a (1024,1024,3)-dimensional image; a profile photo of a person is then found among the randomly generated pictures of the model, the corresponding semantic values in the slice (0:18, 200:400) of the (18,512)-dimensional vector are fixed, and the remaining positions are randomized, thereby quickly obtaining a large number of high-definition profile photos;
a2. according to the high-definition profile face data set obtained in step a1, the corresponding high-definition incomplete textures are generated with a pretrained MobileNet-based network model, and a high-definition complete texture data set is obtained through mirror-symmetry processing;
a3. fake illumination and fake shadows are added to each complete texture in the high-definition complete texture data set, finally obtaining data pairs of illuminated incomplete facial textures and illuminated complete facial textures.
3. The PIC-based 3D face model face texture diversity completion method of claim 2, wherein the high-definition face generation pretrained model represents each face with an (18,512)-dimensional vector, and slightly changing the vector generates a slightly different new face.
4. The PIC-based 3D face model face texture diversity completion method of claim 1, wherein, for the parameter distribution of the latent space of the facial texture mask region, a random texture mask is used to obtain the incomplete texture Im and the missing part Ic of the corresponding complete texture Ig; the conditional variational autoencoder CVAE is then used to obtain the parameter distribution of the latent space, and the latent vector zc, which contains the information of the missing region, is sampled from that distribution; the decoder then obtains the latent information zc encoded by the encoding network E and acquires the information of the missing region.
5. The PIC-based 3D face model face texture diversity completion method of claim 4, wherein the random texture mask is: a random stripe/block mask, or a face mask specific to face images, i.e., a front/left/right face texture mask.
6. The PIC-based 3D face model face texture diversity completion method of claim 1 or 4, wherein the sampling is: computing the variational lower bound of the conditional log-likelihood log p(Ic|Im) of a training example: log p(Ic|Im) ≥ -KL(qψ(zc|Ic) || pφ(zc|Im)) + E_{qψ(zc|Ic)}[log pθ(Ic|zc, Im)], wherein: Ig is the original image, Im the observable part, and Ic the missing part; the KL-divergence term regularizes the learned importance sampling function qψ(·|Ic) toward the latent prior pφ(zc|Im).
7. The PIC-based 3D face model face texture diversity completion method of claim 1, wherein training the improved PIC network model comprises: obtaining different generation results through the generation network G and the latent information zc obtained in step two, minimizing the adversarial loss, and updating the encoding network parameters and the generation network parameters by back-propagation until the loss converges.
8. The PIC-based 3D face model face texture diversity completion method of claim 7, wherein the minimized losses are: the adversarial loss L_ad = E[log D(Ig)] + E[log(1 - D(Igen))], where Igen is the generated image and Ig the real image, with the discriminator parameters θD updated by back-propagation; the distribution regularization loss L_KL = KL(qψ(zc|Ic) || pφ(zc|Im)), where the learned conditional prior pφ(·|Im) is also Gaussian and qψ(·|Ic) is regularized toward it; and the appearance matching loss L_app = ||Igen - Ig||_1, where Igen is the generated image and Ig is the real image.
9. A system for implementing the PIC-based 3D face model face texture diversity completion method of any one of claims 1-8, comprising: an incomplete texture generation unit, a residual encoder, a residual decoder and a short- and long-term attention unit, wherein: the incomplete texture generation unit generates an incomplete texture image from the 3D face reconstruction model and a public face database; the residual encoder encodes the incomplete texture image to obtain latent information, which is then sampled; the residual decoder decodes the latent information to obtain a complete texture image; and the short- and long-term attention unit combines the sampled latent information with the encoded information of the incomplete texture image, finally yielding diversified facial texture completions;
the incomplete texture generation unit comprises: a 3D face model generation unit, an interpolated facial texture generation unit and an incomplete facial texture generation unit, wherein: the 3D face model generation unit constructs a 3D face model from the input face image; the interpolated facial texture generation unit performs image interpolation using the vertex information of the 3D face model and the face image to obtain an interpolated facial texture; and the incomplete facial texture generation unit extracts the facial texture of the visible part according to the depth information of the nearest surface layer, obtaining the incomplete facial texture.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111403229.6A CN114119697A (en) | 2021-11-24 | 2021-11-24 | PIC-based 3D face model face texture diversity completion method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114119697A true CN114119697A (en) | 2022-03-01 |
Family
ID=80371787
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111403229.6A Pending CN114119697A (en) | 2021-11-24 | 2021-11-24 | PIC-based 3D face model face texture diversity completion method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114119697A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116957991A (en) * | 2023-09-19 | 2023-10-27 | 北京渲光科技有限公司 | Three-dimensional model complement method and three-dimensional model complement model generation method |
CN116957991B (en) * | 2023-09-19 | 2023-12-15 | 北京渲光科技有限公司 | Three-dimensional model completion method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108520503B (en) | Face defect image restoration method based on self-encoder and generation countermeasure network | |
Ning et al. | Multi‐view frontal face image generation: a survey | |
Pumarola et al. | Ganimation: Anatomically-aware facial animation from a single image | |
Tang et al. | Real-time neural radiance talking portrait synthesis via audio-spatial decomposition | |
CN110738605B (en) | Image denoising method, system, equipment and medium based on transfer learning | |
CN113327278B (en) | Three-dimensional face reconstruction method, device, equipment and storage medium | |
CN111932444A (en) | Face attribute editing method based on generation countermeasure network and information processing terminal | |
CN113838176A (en) | Model training method, three-dimensional face image generation method and equipment | |
US11727628B2 (en) | Neural opacity point cloud | |
CN115914505B (en) | Video generation method and system based on voice-driven digital human model | |
Aakerberg et al. | Semantic segmentation guided real-world super-resolution | |
Roessle et al. | Ganerf: Leveraging discriminators to optimize neural radiance fields | |
Zhang et al. | Morphable model space based face super-resolution reconstruction and recognition | |
CN117422829A (en) | Face image synthesis optimization method based on nerve radiation field | |
CN114119697A (en) | PIC-based 3D face model face texture diversity completion method | |
Yang et al. | BareSkinNet: De‐makeup and De‐lighting via 3D Face Reconstruction | |
CN116703750A (en) | Image defogging method and system based on edge attention and multi-order differential loss | |
CN116703719A (en) | Face super-resolution reconstruction device and method based on face 3D priori information | |
Cao et al. | Guided cascaded super-resolution network for face image | |
Tal et al. | Nldnet++: A physics based single image dehazing network | |
Mir et al. | DiT-Head: High-Resolution Talking Head Synthesis using Diffusion Transformers | |
Lin et al. | FAEC‐GAN: An unsupervised face‐to‐anime translation based on edge enhancement and coordinate attention | |
Zhang et al. | MA-NeRF: Motion-Assisted Neural Radiance Fields for Face Synthesis from Sparse Images | |
Mohaghegh et al. | Robust monocular 3D face reconstruction under challenging viewing conditions | |
Guo et al. | Depth-Guided Robust Point Cloud Fusion NeRF for Sparse Input Views |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||