CN115880748A - Face reconstruction and occlusion region identification method, device, equipment and storage medium

Face reconstruction and occlusion region identification method, device, equipment and storage medium

Info

Publication number
CN115880748A
Authority
CN
China
Prior art keywords
face
dimensional
information
reconstruction
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211328264.0A
Other languages
Chinese (zh)
Inventor
汪叶娇
约翰·尤迪·阿迪库苏马
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bigo Technology Pte Ltd
Original Assignee
Bigo Technology Pte Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bigo Technology Pte Ltd filed Critical Bigo Technology Pte Ltd
Priority to CN202211328264.0A priority Critical patent/CN115880748A/en
Publication of CN115880748A publication Critical patent/CN115880748A/en
Priority to PCT/CN2023/123840 priority patent/WO2024088061A1/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Human Computer Interaction (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the application discloses a face reconstruction and occlusion region identification method, device, equipment and storage medium, where the method includes: inputting a face image into a first network structure, and outputting, through the first network structure, a reconstruction parameter vector of the face image during three-dimensional reconstruction; performing three-dimensional reconstruction of the face image based on the reconstruction parameter vector to generate three-dimensional face information; rendering and mapping the three-dimensional face information to generate two-dimensional face information containing mesh segmentation data; acquiring the depth feature map extracted by the first network structure, and constructing graph structure information based on the depth feature map and the two-dimensional face information; and inputting the graph structure information into a second network structure, and outputting the occlusion region corresponding to the face image through the second network structure. The scheme optimizes overall resource deployment, satisfies application scenarios with multiple parallel tasks and high real-time requirements, and determines the occlusion region with higher accuracy.

Description

Face reconstruction and occlusion region identification method, device, equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of image processing, and in particular to a face reconstruction and occlusion region identification method, device, equipment and storage medium.
Background
With the development and popularization of virtual reality technology, people are no longer satisfied with interaction on ordinary two-dimensional planes, and urgent demands have arisen for three-dimensional applications such as 3D beautification and 3D face pinching. Most current three-dimensional face reconstruction applications, such as 3D beautification and 3D stylization, are deployed in scenarios such as live streaming and social networking, which impose high requirements on real-time performance and reconstruction quality. Meanwhile, because the actual production environment is far more complex than expected, the input data does not always contain a complete face and may exhibit self-occlusion, object occlusion and the like, so the model needs to be robust to various kinds of input data. In many cases, the actual occlusion region also needs to be determined so that a series of post-processing operations can be performed.
In the related art, a convolutional neural network is usually used to extract 3D face features and to perform 2D face segmentation for determining the occlusion region. Two separate tasks need to be launched for prediction, which requires more resource deployment, cannot satisfy application scenarios with multiple parallel tasks and high real-time requirements, and yields poor robustness in occlusion region determination. In addition, most conventional approaches treat occlusion region determination as a traditional semantic segmentation task, and the accuracy of the resulting occlusion region determination is low.
Disclosure of Invention
The embodiment of the application provides a face reconstruction and occlusion region identification method, device, equipment and storage medium, which optimize overall resource deployment, satisfy application scenarios with multiple parallel tasks and high real-time requirements, and improve the accuracy of occlusion region determination, thereby solving the above problems in the related art.
In a first aspect, an embodiment of the present application provides a method for reconstructing a face and identifying an occlusion region, where the method includes:
inputting a face image into a first network structure, and outputting a reconstruction parameter vector of the face image during three-dimensional reconstruction through the first network structure;
performing three-dimensional reconstruction of the face image based on the reconstruction parameter vector to generate three-dimensional face information;
rendering and mapping the three-dimensional face information to generate two-dimensional face information containing mesh segmentation data;
acquiring the depth feature map extracted by the first network structure, and constructing graph structure information based on the depth feature map and the two-dimensional face information;
and inputting the graph structure information into a second network structure, and outputting the occlusion region corresponding to the face image through the second network structure.
In a second aspect, an embodiment of the present application further provides a device for reconstructing a face and identifying an occlusion region, including:
the parameter vector generation module is configured to input a face image into a first network structure and output a reconstruction parameter vector of the face image during three-dimensional reconstruction through the first network structure;
the three-dimensional information determination module is configured to perform three-dimensional reconstruction on the face image based on the reconstruction parameter vector to generate three-dimensional face information;
the rendering and mapping module is configured to render and map the three-dimensional face information to generate two-dimensional face information containing mesh segmentation data;
the graph information construction module is configured to obtain the depth feature map extracted by the first network structure, and construct graph structure information based on the depth feature map and the two-dimensional face information;
and the occlusion region determining module is configured to input the graph structure information into a second network structure, and output the occlusion region corresponding to the face image through the second network structure.
In a third aspect, an embodiment of the present application further provides a face reconstruction and occlusion region identification device, where the face reconstruction and occlusion region identification device includes:
one or more processors;
a storage device for storing one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the face reconstruction and occlusion region identification method according to the embodiment of the present application.
In a fourth aspect, the present application further provides a non-volatile storage medium storing computer-executable instructions which, when executed by a computer processor, are used to perform the face reconstruction and occlusion region identification method according to the embodiment of the present application.
In a fifth aspect, the present application further provides a computer program product, where the computer program product includes a computer program stored in a computer-readable storage medium; at least one processor of a device reads the computer program from the computer-readable storage medium and executes it, so that the device performs the face reconstruction and occlusion region identification method according to the embodiment of the present application.
In the embodiment of the application, a face image is input into a first network structure, a reconstruction parameter vector of the face image during three-dimensional reconstruction is output through the first network structure, three-dimensional reconstruction of the face image is performed based on the reconstruction parameter vector to generate three-dimensional face information, and rendering and mapping are performed on the three-dimensional face information to generate two-dimensional face information containing mesh segmentation data; the depth feature map extracted by the first network structure is then obtained, graph structure information is constructed based on the depth feature map and the two-dimensional face information, the graph structure information is input into a second network structure, and the occlusion region corresponding to the face image is output through the second network structure. The face reconstruction and occlusion region identification method combines face reconstruction and occlusion region identification into the same model. When the occlusion region is identified, related data from the face reconstruction process are utilized, and the occlusion problem on the two-dimensional plane is modeled as the identification of three-dimensional face occlusion, which better matches the way humans perceive; at the same time, because the related data of the three-dimensional face are utilized, hidden associated features are easier to mine, yielding a more accurate occlusion region identification result. Meanwhile, the characteristics shared by the face reconstruction and occlusion region identification tasks are fully utilized, the model is compressed as much as possible, and the two tasks promote each other, so that overall resource deployment is optimized, application scenarios with multiple parallel tasks and high real-time requirements can be satisfied, and the occlusion region is determined with higher accuracy.
Drawings
Fig. 1 is a flowchart of a face reconstruction and occlusion region identification method according to an embodiment of the present application;
fig. 2 is a flowchart of a method for performing three-dimensional reconstruction of a face image based on a reconstruction parameter vector according to an embodiment of the present application;
fig. 3 is a schematic diagram of reconstruction processing of a face image according to an embodiment of the present application;
fig. 4 is a flowchart of a method for constructing graph structure information based on a depth feature graph and two-dimensional face information according to an embodiment of the present application;
fig. 5 is a schematic diagram illustrating special effect processing performed on an image according to an embodiment of the present application;
fig. 6 is a block diagram of a face reconstruction and occlusion region recognition apparatus according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a face reconstruction and occlusion region recognition device according to an embodiment of the present application.
Detailed Description
The embodiments of the present application will be described in further detail with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of and not restrictive on the broad embodiments of the present application. It should be further noted that, for convenience of description, only some structures related to the embodiments of the present application are shown in the drawings, not all of the structures are shown.
The terms "first", "second" and the like in the description and claims of the present application are used to distinguish between similar elements and not necessarily to describe a particular sequential or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances, so that the embodiments of the application may be practiced in sequences other than those illustrated or described herein. Moreover, the terms "first", "second" and the like are generally used generically and do not limit the number of objects, e.g., the first object may be one or more than one. In addition, "and/or" in the specification and claims denotes at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the objects before and after it.
The face reconstruction and occlusion region identification method provided by the embodiment of the application can be applied to various scenarios requiring 3D face reconstruction, and can determine whether the current face is occluded and accurately identify the corresponding occlusion region. Specifically, face images captured during short videos and live streaming are processed to realize face reconstruction and occlusion region identification. After the occlusion region identification is completed, special effect processing such as makeup and beautification can be further performed based on the occlusion region identification result to ensure the quality of the special effects.
Fig. 1 is a flowchart of a face reconstruction and occlusion region identification method according to an embodiment of the present application, which specifically includes the following steps:
step S101, inputting a face image into a first network structure, and outputting a reconstruction parameter vector of the face image during three-dimensional reconstruction through the first network structure.
In one embodiment, the face image is an image containing a face region, and may be an image acquired by a camera or a picture input by a user. Optionally, before the face image is input into the first network structure, face detection and alignment correction may be performed on the face image by a face detector.
In one embodiment, the first network structure may be a convolutional neural network, such as a mobilenet-v3 network structure (a lightweight convolutional neural network, the third generation of the mobilenet series), a VGG network structure, or a backbone network structure such as resnet. The first network structure is trained in advance: during training, two-dimensional pictures are used as input, and a reconstruction parameter vector for three-dimensional face reconstruction is output through a series of convolutional layers. Optionally, the first network structure uses a weakly supervised reconstruction loss function during training.
In one embodiment, after the face image is input into the first network structure, the reconstruction parameter vector output by the first network structure illustratively includes a face feature vector, a facial expression vector and a three-dimensional face coefficient vector, and may further include other vectors used for three-dimensional reconstruction, such as an illumination vector, a reflectivity vector, a pose vector and a translation vector.
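For illustration only (the application contains no source code), the following PyTorch sketch shows how a first network structure of this kind could be organized: a mobilenet-v3 backbone whose pooled features are mapped to a reconstruction parameter vector. The split sizes (80/64/80/6/27 for identity, expression, texture, pose and illumination) are assumptions borrowed from common 3DMM setups, not values from this application.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class ReconParamNet(nn.Module):
    """Sketch of the first network structure: backbone -> reconstruction parameter vector."""
    def __init__(self, id_dim=80, exp_dim=64, tex_dim=80, pose_dim=6, light_dim=27):
        super().__init__()
        backbone = models.mobilenet_v3_small(weights=None)  # lightweight backbone (illustrative)
        self.features = backbone.features                   # also yields the depth feature map
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.dims = [id_dim, exp_dim, tex_dim, pose_dim, light_dim]
        self.head = nn.Linear(576, sum(self.dims))          # 576 = mobilenet_v3_small feature width

    def forward(self, img):                                 # img: (B, 3, H, W) aligned face crop
        fmap = self.features(img)                           # depth feature map, reused in step S104
        vec = self.head(self.pool(fmap).flatten(1))         # reconstruction parameter vector
        alpha, beta, delta, pose, light = torch.split(vec, self.dims, dim=1)
        return fmap, alpha, beta, delta, pose, light
```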
Step S102, performing three-dimensional reconstruction of the face image based on the reconstruction parameter vector to generate three-dimensional face information.
In one embodiment, after a reconstruction parameter vector corresponding to an input face image is obtained through a first network structure, three-dimensional reconstruction of the face image is performed based on the reconstruction parameter vector to obtain three-dimensional face information. An exemplary three-dimensional reconstruction process is shown in fig. 2, and fig. 2 is a flowchart of a method for performing three-dimensional reconstruction of a face image based on a reconstruction parameter vector according to an embodiment of the present application, which specifically includes:
and S1021, reconstructing a three-dimensional face point cloud based on the face feature vector, the face expression vector, the three-dimensional face coefficient vector and preset average face shape information to obtain point cloud reconstruction information.
In one embodiment, the three-dimensional face reconstruction includes reconstructing a three-dimensional face point cloud and generating face texture information. Optionally, in the process of reconstructing the three-dimensional face point cloud, reconstruction is performed based on the face feature vector, the face expression vector, the three-dimensional face coefficient vector, and preset average face shape information in the determined reconstruction parameter vector.
Illustratively, the face feature vector is denoted as $B_{id}$, the facial expression vector is denoted as $B_{exp}$, the three-dimensional face coefficient vector comprises $\alpha$, $\beta$ and $\delta$, and the preset average face shape information is denoted as $\bar{S}$. When the point cloud reconstruction information $S$ is determined, it is calculated using the following formula:

$$S = \bar{S} + B_{id}\,\alpha + B_{exp}\,\beta$$
step S1022, reconstructing a face texture based on the three-dimensional face coefficient vector and the preset average face texture information and face base information, to obtain face texture information.
In one embodiment, when generating the face texture information, the face texture information is generated based on the three-dimensional face coefficient vector in the determined reconstruction parameter vector, the obtained face base information, and the preset average face texture information. Alternatively, the face basis information may use the basis information in the disclosed face model.
Illustratively, the three-dimensional face coefficient vector comprises $\alpha$, $\beta$ and $\delta$, the face base information is denoted as $B_t$, and the preset average face texture information is denoted as $\bar{T}$. When the face texture information $T$ is determined, it is obtained using the following formula:

$$T = \bar{T} + B_t\,\delta$$
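Both formulas are plain linear combinations over fixed bases, as the following NumPy sketch makes explicit; the basis shapes given in the docstring are assumptions for illustration.

```python
import numpy as np

def reconstruct_shape_and_texture(alpha, beta, delta, S_bar, B_id, B_exp, T_bar, B_t):
    """S = S_bar + B_id @ alpha + B_exp @ beta;  T = T_bar + B_t @ delta.

    Assumed shapes: S_bar, T_bar: (3N,); B_id: (3N, K_id); B_exp: (3N, K_exp);
    B_t: (3N, K_t); alpha, beta, delta: coefficient vectors of matching lengths.
    """
    S = S_bar + B_id @ alpha + B_exp @ beta    # point cloud reconstruction information
    T = T_bar + B_t @ delta                    # face texture information
    return S.reshape(-1, 3), T.reshape(-1, 3)  # N vertices, per-vertex colors
```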
and S1023, generating three-dimensional face information based on the point cloud reconstruction information and the face texture information.
In one embodiment, after the point cloud reconstruction information and the face texture information are obtained, the final three-dimensional face information is generated based on them. Illustratively, the three-dimensional face information can be obtained by superimposing and fitting the point cloud reconstruction information and the face texture information.
Step S103, rendering and mapping the three-dimensional face information to generate two-dimensional face information containing mesh segmentation data.
In one embodiment, after the three-dimensional face information is obtained, rendering and mapping are further performed on the three-dimensional face information to generate two-dimensional face information, where the generated two-dimensional face information includes mesh segmentation data. Optionally, the rendering may be performed by a renderer that renders the three-dimensional face information into a two-dimensional face image, or by a preset rendering model or another rendering algorithm; the mapping may be performed by mapping the three-dimensional point pair data of the face in the three-dimensional face information into the two-dimensional face image based on the topological structure in the three-dimensional face information. For example, as shown in fig. 3, which is a schematic diagram of reconstruction processing of a face image according to an embodiment of the present application, the original input face image is $I_T$, and the image obtained after three-dimensional reconstruction is $V_R$. The reconstruction $V_R$ can be rendered in two dimensions by a differentiable renderer to obtain a two-dimensional face image $I_R$; then, combined with the topological structure of the reconstruction $V_R$, the triangular patches over the 1293 three-dimensional face points are projected into the two-dimensional face image $I_R$ to generate the two-dimensional face information $Tr_R$ containing the mesh segmentation data.
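The application does not fix a camera model for this projection step; a weak-perspective projection, with rotation and translation taken from the pose and translation vectors, is one common sketch (the camera model itself is an assumption here):

```python
import numpy as np

def project_vertices(V, R, t, scale, image_size):
    """Map reconstructed 3D vertices V (N, 3) to 2D image coordinates; the mesh
    topology (triangle indices) is unchanged, so the projected triangles directly
    give the mesh segmentation of the 2D face image."""
    V_cam = V @ R.T + t               # rigid pose (weak-perspective assumption)
    uv = scale * V_cam[:, :2]         # drop depth
    uv[:, 1] = image_size - uv[:, 1]  # flip y into image convention
    return uv                         # (N, 2)
```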
Step S104, obtaining the depth feature map extracted by the first network structure, and constructing graph structure information based on the depth feature map and the two-dimensional face information.
In one embodiment, when the occlusion region is identified, the depth feature map extracted by the first network structure during the three-dimensional reconstruction process is used: graph structure information is constructed based on the depth feature map and the two-dimensional face information generated in step S103, and the occlusion region is then further identified based on the graph structure information. The depth feature map is the feature of an intermediate layer computed by a convolutional layer in the first network structure.
Optionally, as shown in fig. 4, an exemplary manner of constructing graph structure information by using the depth feature map and the two-dimensional face information is shown, where fig. 4 is a flowchart of a method for constructing graph structure information based on the depth feature map and the two-dimensional face information according to an embodiment of the present application, and the method specifically includes:
and S1041, carrying out segmentation processing on the depth feature map based on network segmentation data in the two-dimensional face information to obtain a triangular one-sided segmentation result.
When the depth feature map is segmented, the same segmentation strategy as the mesh segmentation in the two-dimensional face information is adopted. Before the segmentation, the depth feature map is further resized so that its size is consistent with the image size in the two-dimensional face information; the same segmentation strategy is then applied to obtain the triangular facet segmentation result.
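One way to realize this alignment is sketched below: the depth feature map is resized to the image size, and a feature is then sampled for each projected vertex, from which per-triangle features follow via the mesh segmentation. Bilinear sampling with grid_sample is an illustrative choice, not mandated by the application.

```python
import torch
import torch.nn.functional as F

def per_vertex_features(fmap, uv, image_size):
    """fmap: (1, C, h, w) depth feature map; uv: (N, 2) torch tensor of projected
    vertex coordinates in pixels. Returns (N, C) node features."""
    fmap = F.interpolate(fmap, size=(image_size, image_size),
                         mode="bilinear", align_corners=False)  # match image size
    grid = (uv / (image_size - 1)) * 2 - 1                      # to [-1, 1]
    grid = grid.view(1, 1, -1, 2)
    feats = F.grid_sample(fmap, grid, align_corners=False)      # (1, C, 1, N)
    return feats[0, :, 0].T                                     # (N, C)
```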
Step S1042, constructing an adjacency matrix according to the vertex connectivity and the point-pair distances in the triangular facet segmentation result to obtain graph structure information.
After the triangular facet segmentation result is obtained, an adjacency matrix is constructed using the vertex connectivity of the three-dimensional face mesh and the point-pair distances between the vertices projected onto the two-dimensional plane, thereby completing the construction of the graph structure information. Illustratively, take the construction of an adjacency matrix $A_{ij} \in \mathbb{R}^{N \times N}$ as an example, where $N$ denotes the number of vertices of the three-dimensional face point cloud (1293 vertices in one embodiment). The adjacency matrix is constructed from the vertex connectivity and the point-pair distance as follows:

$$A_{ij} = C_{ij} \cdot D_{ij}$$

$$C_{ij} = \begin{cases} 1, & \text{if vertices } i \text{ and } j \text{ are connected} \\ 0, & \text{otherwise} \end{cases}$$

where $C_{ij}$ indicates whether two vertices are connected, taking the value 0 if they are not connected and 1 otherwise, and $D_{ij}$ denotes the point-pair distance between the two vertices after projection onto the two-dimensional plane.
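A direct sketch of this construction, assuming the mesh is given as triangle faces over the N = 1293 vertices and uv holds their 2D projections:

```python
import numpy as np

def build_adjacency(faces, uv, num_vertices=1293):
    """A_ij = C_ij * D_ij: mesh connectivity times projected point-pair distance.

    faces: (F, 3) integer vertex indices; uv: (N, 2) projected coordinates.
    """
    C = np.zeros((num_vertices, num_vertices), dtype=np.float32)
    for i, j, k in faces:   # the three edges of each triangle connect their vertices
        C[i, j] = C[j, i] = 1.0
        C[j, k] = C[k, j] = 1.0
        C[i, k] = C[k, i] = 1.0
    D = np.linalg.norm(uv[:, None, :] - uv[None, :, :], axis=-1)  # pairwise 2D distances
    return C * D            # adjacency matrix of the graph structure information
```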
Step S105, inputting the graph structure information into a second network structure, and outputting the occlusion region corresponding to the face image through the second network structure.
In one embodiment, the second network structure may be a graph convolutional neural network. The graph structure information generated in step S104 is input into the second network structure, and the occlusion region corresponding to the face image is output through the second network structure. Optionally, the three-dimensional face vertices in the graph structure information may be classified by the graph convolutional neural network, and the occlusion region corresponding to the face image is output based on the classification result. Optionally, the framework of the second network structure mainly uses a graph attention network, which integrates and classifies the three-dimensional face vertices to finally output the segmentation mask $M_R$ of the occlusion region.
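The application does not detail the graph attention architecture; as a sketch under that caveat, a two-layer per-vertex classifier built on the GATConv layer of the torch_geometric library (an assumed dependency) could look like:

```python
import torch
import torch.nn as nn
from torch_geometric.nn import GATConv  # graph attention layer

class OcclusionGAT(nn.Module):
    """Sketch of the second network structure: classify each 3D face vertex
    as occluded or visible (layer sizes are illustrative assumptions)."""
    def __init__(self, in_dim, hidden=64, heads=4):
        super().__init__()
        self.conv1 = GATConv(in_dim, hidden, heads=heads)
        self.conv2 = GATConv(hidden * heads, 2, heads=1)  # two classes per vertex

    def forward(self, x, edge_index):      # x: (N, in_dim) node features
        x = torch.relu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)   # per-vertex logits -> segmentation mask M_R
```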
Optionally, the second network structure uses a supervised segmentation loss function during training; for example, a dice loss function is used to supervise the predicted occlusion mask against the ground-truth mask. Experiments show that, compared with the traditional cross-entropy loss function, this enables the second network structure to predict the segmentation mask more accurately.
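For reference, a soft dice loss of the kind mentioned above can be written as follows; the smoothing term eps is a conventional choice rather than something specified by the application.

```python
import torch

def dice_loss(pred_mask, true_mask, eps=1e-6):
    """pred_mask is assumed to hold probabilities in [0, 1] (e.g. after sigmoid);
    true_mask is the ground-truth occlusion mask of the same shape."""
    inter = (pred_mask * true_mask).sum()
    union = pred_mask.sum() + true_mask.sum()
    return 1.0 - (2.0 * inter + eps) / (union + eps)
```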
According to the scheme, a face image is input into a first network structure, a reconstruction parameter vector of the face image during three-dimensional reconstruction is output through the first network structure, three-dimensional reconstruction of the face image is performed based on the reconstruction parameter vector to generate three-dimensional face information, rendering and mapping are performed on the three-dimensional face information to generate two-dimensional face information containing mesh segmentation data, the depth feature map extracted by the first network structure is then obtained, graph structure information is constructed based on the depth feature map and the two-dimensional face information, the graph structure information is input into a second network structure, and the occlusion region corresponding to the face image is output through the second network structure. The face reconstruction and occlusion region identification method combines face reconstruction and occlusion region identification into the same model; when the occlusion region is identified, related data from the face reconstruction process are utilized, and the occlusion problem on the two-dimensional plane is modeled as the identification of three-dimensional face occlusion, which better matches the way humans perceive. At the same time, because the related data of the three-dimensional face are utilized, hidden associated features are easier to mine, yielding a more accurate occlusion region identification result. Meanwhile, the characteristics shared by the two tasks are fully utilized, the model is compressed as much as possible, and the two tasks promote each other, so that overall resource deployment is optimized, application scenarios with multiple parallel tasks and high real-time requirements can be satisfied, and the occlusion region is determined with higher accuracy.
In this scheme, face reconstruction and occlusion region identification are integrated into one model for processing, and features are extracted simultaneously for the two correlated tasks, which solves the pain point that conventional solutions cannot mine the hidden features associated between the two tasks, compresses the model size, and achieves better reconstruction and segmentation effects.
On the basis of the above scheme, after the occlusion region corresponding to the face image is output through the second network structure, the method further includes: performing special effect rendering processing based on the face image, the two-dimensional face information and the occlusion region, and displaying the processing result. That is, after the three-dimensional reconstruction and the determination of the occlusion region, various special effects can be further applied based on them. For example, as shown in fig. 5, which is a schematic diagram of special effect processing performed on an image according to an embodiment of the present application, the face image $I_T$ is the original input image, the image $M_R$ is the image output after occlusion region identification, the image $I_R$ is the rendered and mapped two-dimensional face information, and the finally obtained result is the image $I_F$. Optionally, during the special effect rendering processing, the occlusion region may be rendered using the occluding object from the original image, and the part of the face image outside the occlusion region is rendered based on the three-dimensional face information.
In the image $M_R$, the black part represents an unoccluded region, indicating that this region of the original picture is not occluded; during final rendering, this region should be rendered from the reconstructed image $I_R$. In contrast, the white region in $M_R$ indicates that this region of the original input picture is occluded, and the reconstructed picture should not be rendered there. When the three-dimensional face is reconstructed, the complete face is reconstructed regardless of whether the face is occluded. Therefore, for cases such as the hand self-occlusion and the sunglasses occlusion in the example of fig. 5, the reconstruction produces the whole face, as shown in the image $I_R$, and no occluding object is reconstructed. If a post-processing flow such as three-dimensional makeup exists, the whole three-dimensional makeup material would be retained because a complete three-dimensional face is reconstructed, whereas, considering the presence of an occluding object, the makeup on the occluded part should in fact be blocked. To address this common problem, in one embodiment the occluding object is identified from the predicted mask $M_R$, and for the occluded portion, the final rendering shows the input image. For the parts occluded by the sunglasses and the hand, the finally rendered image $I_F$ uses the sunglasses and the hand from the image $I_T$ for rendering, which solves the problem that the occluding object is not rendered, or the three-dimensional makeup is wrongly rendered onto the occluding object, when three-dimensional makeup is applied.
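A minimal sketch of this compositing rule, assuming $M_R$ is a grayscale mask in [0, 1] in which white (1) marks occluded pixels, as described above:

```python
import numpy as np

def composite_final_image(I_T, I_R, M_R):
    """Occluded regions (white in M_R) keep the original input I_T, so hands or
    sunglasses stay visible; unoccluded regions (black) take the reconstructed,
    special-effect-processed render I_R. Returns the final image I_F."""
    occluded = (M_R > 0.5)[..., None]  # broadcast mask over the RGB channels
    return np.where(occluded, I_T, I_R)
```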
Fig. 6 is a block diagram of a face reconstruction and occlusion region identification apparatus according to an embodiment of the present application, where the apparatus is configured to execute the face reconstruction and occlusion region identification method of the above embodiment and has the functional modules and beneficial effects corresponding to that method. As shown in fig. 6, the apparatus specifically includes: a parameter vector generation module 101, a three-dimensional information determination module 102, a rendering and mapping module 103, a graph information construction module 104, and an occlusion region determining module 105, wherein,
the parameter vector generation module 101 is configured to input a face image into a first network structure, and output a reconstruction parameter vector of the face image during three-dimensional reconstruction through the first network structure;
a three-dimensional information determination module 102 configured to perform three-dimensional reconstruction of the face image based on the reconstruction parameter vector to generate three-dimensional face information;
the rendering and mapping module 103 is configured to render and map the three-dimensional face information to generate two-dimensional face information containing mesh segmentation data;
the graph information construction module 104 is configured to obtain the depth feature map extracted by the first network structure, and construct graph structure information based on the depth feature map and the two-dimensional face information;
and the occlusion region determining module 105 is configured to input the graph structure information into a second network structure, and output an occlusion region corresponding to the face image through the second network structure.
According to the scheme, a face image is input into a first network structure, a reconstruction parameter vector of the face image during three-dimensional reconstruction is output through the first network structure, three-dimensional reconstruction of the face image is performed based on the reconstruction parameter vector to generate three-dimensional face information, rendering and mapping are performed on the three-dimensional face information to generate two-dimensional face information containing mesh segmentation data, the depth feature map extracted by the first network structure is then obtained, graph structure information is constructed based on the depth feature map and the two-dimensional face information, the graph structure information is input into a second network structure, and the occlusion region corresponding to the face image is output through the second network structure. The face reconstruction and occlusion region identification apparatus combines face reconstruction and occlusion region identification into the same model; when the occlusion region is identified, related data from the face reconstruction process are utilized, and the occlusion problem on the two-dimensional plane is modeled as the identification of three-dimensional face occlusion, which better matches the way humans perceive. At the same time, because the related data of the three-dimensional face are utilized, hidden associated features are easier to mine, yielding a more accurate occlusion region identification result. Meanwhile, the characteristics shared by the two tasks are fully utilized, the model is compressed as much as possible, and the two tasks promote each other, so that overall resource deployment is optimized, application scenarios with multiple parallel tasks and high real-time requirements can be satisfied, and the occlusion region is determined with higher accuracy.
In a possible embodiment, the reconstruction parameter vector includes a facial feature vector, a facial expression vector, and a three-dimensional face coefficient vector, and the three-dimensional information determining module 102 is configured to:
reconstructing a three-dimensional face point cloud based on the face feature vector, the face expression vector, the three-dimensional face coefficient vector and preset average face shape information to obtain point cloud reconstruction information;
reconstructing face texture based on the three-dimensional face coefficient vector and preset average face texture information and face base information to obtain face texture information;
and generating three-dimensional face information based on the point cloud reconstruction information and the face texture information.
In one possible embodiment, the rendering mapping module 103 is configured to:
rendering the three-dimensional face information through a renderer to obtain a two-dimensional face image;
and, based on the topological structure in the three-dimensional face information, mapping the three-dimensional point pair data of the face in the three-dimensional face information into the two-dimensional face image to generate two-dimensional face information containing mesh segmentation data.
In one possible embodiment, the graph information construction module 104 is configured to:
segmenting the depth feature map based on the mesh segmentation data in the two-dimensional face information to obtain a triangular facet segmentation result;
and constructing an adjacency matrix according to the vertex connectivity and the point-pair distances in the triangular facet segmentation result to obtain graph structure information.
In a possible embodiment, the occlusion region determining module 105 is configured to:
classifying the three-dimensional face vertices in the graph structure information through a graph convolutional neural network;
and outputting the occlusion region corresponding to the face image based on the classification result.
In one possible embodiment, the apparatus further includes a special effects processing module configured to:
and after the occlusion region corresponding to the face image is output through the second network structure, performing special effect rendering processing based on the face image, the two-dimensional face information and the occlusion region, and displaying the processing result.
In one possible embodiment, the special effect processing module is configured to:
and rendering the shielding area by using an original image shielding object, and rendering the part of the face image outside the shielding area based on the three-dimensional face information.
Fig. 7 is a schematic structural diagram of a face reconstruction and occlusion region identification device provided in an embodiment of the present application. As shown in fig. 7, the device includes a processor 201, a memory 202, an input device 203 and an output device 204; the number of processors 201 in the device may be one or more, and one processor 201 is taken as an example in fig. 7; the processor 201, the memory 202, the input device 203 and the output device 204 in the device may be connected by a bus or other means, and connection by a bus is taken as an example in fig. 7. The memory 202, as a computer-readable storage medium, can be used to store software programs, computer-executable programs and modules, such as the program instructions/modules corresponding to the face reconstruction and occlusion region identification method in the embodiment of the present application. The processor 201 executes the various functional applications and data processing of the device by running the software programs, instructions and modules stored in the memory 202, thereby implementing the above face reconstruction and occlusion region identification method. The input device 203 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the device. The output device 204 may include a display device such as a display screen.
The present application further provides a non-volatile storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform the face reconstruction and occlusion region identification method described in the foregoing embodiments, the method including:
inputting a face image into a first network structure, and outputting a reconstruction parameter vector of the face image during three-dimensional reconstruction through the first network structure;
performing three-dimensional reconstruction of the face image based on the reconstruction parameter vector to generate three-dimensional face information;
rendering and mapping the three-dimensional face information to generate two-dimensional face information containing mesh segmentation data;
acquiring the depth feature map extracted by the first network structure, and constructing graph structure information based on the depth feature map and the two-dimensional face information;
and inputting the graph structure information into a second network structure, and outputting the occlusion region corresponding to the face image through the second network structure.
It should be noted that, in the embodiment of the above apparatus for reconstructing a face and identifying an occlusion region, each unit and each module included in the apparatus are only divided according to functional logic, but are not limited to the above division, as long as the corresponding function can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the embodiments of the present application.
In some possible embodiments, various aspects of the methods provided by the present application may also be implemented in a form of a program product including program code for causing a computer device to perform the steps of the methods according to various exemplary embodiments of the present application described above in this specification when the program product is run on the computer device, for example, the computer device may perform the face reconstruction and occlusion region identification methods described in the embodiments of the present application. The program product may be implemented using any combination of one or more readable media.

Claims (11)

1. A face reconstruction and occlusion region identification method, characterized by comprising:
inputting a face image into a first network structure, and outputting a reconstruction parameter vector of the face image during three-dimensional reconstruction through the first network structure;
performing three-dimensional reconstruction of the face image based on the reconstruction parameter vector to generate three-dimensional face information;
rendering and mapping the three-dimensional face information to generate two-dimensional face information containing mesh segmentation data;
acquiring the depth feature map extracted by the first network structure, and constructing graph structure information based on the depth feature map and the two-dimensional face information;
and inputting the graph structure information into a second network structure, and outputting the occlusion region corresponding to the face image through the second network structure.
2. The face reconstruction and occlusion region identification method according to claim 1, wherein the reconstruction parameter vector comprises a face feature vector, a facial expression vector and a three-dimensional face coefficient vector, and the performing three-dimensional reconstruction of the face image based on the reconstruction parameter vector to generate three-dimensional face information comprises:
reconstructing a three-dimensional face point cloud based on the face feature vector, the face expression vector, the three-dimensional face coefficient vector and preset average face shape information to obtain point cloud reconstruction information;
reconstructing face texture based on the three-dimensional face coefficient vector, preset average face texture information and face base information to obtain face texture information;
and generating three-dimensional face information based on the point cloud reconstruction information and the face texture information.
3. The face reconstruction and occlusion region identification method according to claim 1, wherein the rendering and mapping the three-dimensional face information to generate two-dimensional face information containing mesh segmentation data comprises:
rendering the three-dimensional face information through a renderer to obtain a two-dimensional face image;
and, based on the topological structure in the three-dimensional face information, mapping the three-dimensional point pair data of the face in the three-dimensional face information into the two-dimensional face image to generate two-dimensional face information containing mesh segmentation data.
4. The face reconstruction and occlusion region identification method according to any one of claims 1 to 3, wherein the constructing graph structure information based on the depth feature map and the two-dimensional face information comprises:
segmenting the depth feature map based on the mesh segmentation data in the two-dimensional face information to obtain a triangular facet segmentation result;
and constructing an adjacency matrix according to the vertex connectivity and the point-pair distances in the triangular facet segmentation result to obtain graph structure information.
5. The face reconstruction and occlusion region identification method according to any one of claims 1 to 3, wherein the second network structure comprises a graph convolutional neural network, and the outputting the occlusion region corresponding to the face image through the second network structure comprises:
classifying the three-dimensional face vertices in the graph structure information through the graph convolutional neural network;
and outputting the occlusion region corresponding to the face image based on the classification result.
6. The face reconstruction and occlusion region identification method according to any one of claims 1 to 3, wherein after the outputting the occlusion region corresponding to the face image through the second network structure, the method further comprises:
performing special effect rendering processing based on the face image, the two-dimensional face information and the occlusion region, and displaying the processing result.
7. The face reconstruction and occlusion region identification method according to claim 6, wherein the performing special effect rendering processing based on the face image, the two-dimensional face information and the occlusion region comprises:
and rendering the shielding area by using an original image shielding object, and rendering the part of the face image outside the shielding area based on the three-dimensional face information.
8. A face reconstruction and occlusion region identification apparatus, characterized by comprising:
the parameter vector generation module is configured to input a face image into a first network structure and output a reconstruction parameter vector of the face image during three-dimensional reconstruction through the first network structure;
the three-dimensional information determining module is configured to perform three-dimensional reconstruction of the face image based on the reconstruction parameter vector to generate three-dimensional face information;
the rendering and mapping module is configured to render and map the three-dimensional face information to generate two-dimensional face information containing mesh segmentation data;
the graph information construction module is configured to obtain the depth feature map extracted by the first network structure, and construct graph structure information based on the depth feature map and the two-dimensional face information;
and the occlusion region determining module is configured to input the graph structure information into a second network structure, and output the occlusion region corresponding to the face image through the second network structure.
9. A face reconstruction and occlusion region identification device, characterized by comprising: one or more processors; and a storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the face reconstruction and occlusion region identification method of any one of claims 1-7.
10. A non-volatile storage medium storing computer-executable instructions, characterized in that the instructions, when executed by a computer processor, are used to perform the face reconstruction and occlusion region identification method of any one of claims 1-7.
11. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the face reconstruction and occlusion region identification method of any one of claims 1-7.
CN202211328264.0A 2022-10-27 2022-10-27 Face reconstruction and occlusion region identification method, device, equipment and storage medium Pending CN115880748A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202211328264.0A CN115880748A (en) 2022-10-27 2022-10-27 Face reconstruction and occlusion region identification method, device, equipment and storage medium
PCT/CN2023/123840 WO2024088061A1 (en) 2022-10-27 2023-10-10 Face reconstruction and occlusion region recognition method, apparatus and device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211328264.0A CN115880748A (en) 2022-10-27 2022-10-27 Face reconstruction and occlusion region identification method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115880748A (en) 2023-03-31

Family

ID=85759023

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211328264.0A Pending CN115880748A (en) 2022-10-27 2022-10-27 Face reconstruction and occlusion region identification method, device, equipment and storage medium

Country Status (2)

Country Link
CN (1) CN115880748A (en)
WO (1) WO2024088061A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024088061A1 (en) * 2022-10-27 2024-05-02 广州市百果园信息技术有限公司 Face reconstruction and occlusion region recognition method, apparatus and device, and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020037678A1 (en) * 2018-08-24 2020-02-27 太平洋未来科技(深圳)有限公司 Method, device, and electronic apparatus for generating three-dimensional human face image from occluded image
CN113781640A (en) * 2021-09-27 2021-12-10 华中科技大学 Three-dimensional face reconstruction model establishing method based on weak supervised learning and application thereof
CN114549501A (en) * 2022-02-28 2022-05-27 佛山虎牙虎信科技有限公司 Face occlusion recognition method, three-dimensional face processing method, device, equipment and medium
CN115880748A (en) * 2022-10-27 2023-03-31 百果园技术(新加坡)有限公司 Face reconstruction and occlusion region identification method, device, equipment and storage medium


Also Published As

Publication number Publication date
WO2024088061A1 (en) 2024-05-02


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination