CN112150608A - Three-dimensional face reconstruction method based on graph convolution neural network - Google Patents

Three-dimensional face reconstruction method based on graph convolution neural network

Info

Publication number
CN112150608A
CN112150608A
Authority
CN
China
Prior art keywords
face
dimensional
image
region
subregion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010929831.2A
Other languages
Chinese (zh)
Other versions
CN112150608B (en)
Inventor
孟凡阳 (Meng Fanyang)
潘鸿鹄 (Pan Honghu)
何震宇 (He Zhenyu)
田第鸿 (Tian Dihong)
柳伟 (Liu Wei)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Graduate School Harbin Institute of Technology
Peng Cheng Laboratory
Original Assignee
Shenzhen Graduate School Harbin Institute of Technology
Peng Cheng Laboratory
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Graduate School Harbin Institute of Technology, Peng Cheng Laboratory filed Critical Shenzhen Graduate School Harbin Institute of Technology
Priority to CN202010929831.2A priority Critical patent/CN112150608B/en
Publication of CN112150608A publication Critical patent/CN112150608A/en
Application granted granted Critical
Publication of CN112150608B publication Critical patent/CN112150608B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G06T2207/30201 Face

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a three-dimensional face reconstruction method based on a graph convolution neural network, which comprises: acquiring a face image to be reconstructed and determining a plurality of face sub-regions corresponding to the face image; for each face sub-region, acquiring a face feature vector corresponding to the face sub-region; determining three-dimensional face structure information corresponding to each face feature vector based on a trained graph convolution neural network; and determining a three-dimensional face image corresponding to the face image according to all the acquired three-dimensional face structure information. Because the three-dimensional face structure information corresponding to the face image is determined by the graph convolution neural network, each piece of point cloud data in the obtained three-dimensional face structure information includes both position information and color information, which improves the realism of the three-dimensional face image.

Description

Three-dimensional face reconstruction method based on graph convolution neural network
Technical Field
The application relates to the technical field of computer science, in particular to a three-dimensional face reconstruction method based on a graph convolution neural network.
Background
Recovering three-dimensional face information from a picture is widely applied in fields such as animation and film production, video games, virtual reality, and augmented reality. At present, 3D face reconstruction technology is generally used to recover three-dimensional face information from a picture, and it typically relies on a convolutional neural network or a fully connected neural network to fit or regress face-information parameters. However, convolutional and fully connected neural networks are mainly suited to regular data in Euclidean space; for data in non-Euclidean spaces such as 3D point clouds, they cannot express the topological relations between points, which restricts the representation of face information and affects the accuracy of the three-dimensional face information.
Disclosure of Invention
In view of the above defects of the prior art, the technical problem to be solved by the present application is to provide a three-dimensional face reconstruction method based on a graph convolution neural network.
In order to solve the above technical problem, a first aspect of the embodiments of the present application provides a three-dimensional face reconstruction method based on a graph convolution neural network, where the method includes:
acquiring a face image to be reconstructed, and determining a plurality of face subregions corresponding to the face image, wherein each face subregion in the face subregions is included in the face image;
for each face subregion, acquiring a face feature vector corresponding to the face subregion;
determining three-dimensional face structure information corresponding to the face feature vector based on the trained graph convolution neural network;
and determining a three-dimensional face image corresponding to the face image according to the acquired three-dimensional face structure information.
The three-dimensional face reconstruction method based on the graph convolution neural network is characterized in that the face image is a two-dimensional face image.
The three-dimensional face reconstruction method based on the graph convolution neural network is characterized in that the three-dimensional face structure information comprises a plurality of three-dimensional face point cloud data, and each three-dimensional face point cloud data in the three-dimensional face point cloud data comprises position information and color information.
The three-dimensional face reconstruction method based on the graph convolution neural network, wherein the acquiring a face image to be reconstructed and determining a plurality of face sub-regions corresponding to the face image specifically includes:
acquiring a face image to be reconstructed, and acquiring a face feature point set of the face image;
dividing the face feature point set into a plurality of face feature point subsets;
and for each face feature point subset of the plurality of face feature point subsets, determining the face sub-region corresponding to that subset, so as to obtain the plurality of face sub-regions.
The three-dimensional face reconstruction method based on the graph convolution neural network, wherein the face sub-region is the minimum region comprising all face feature points in the face feature point subset corresponding to the face sub-region.
The three-dimensional face reconstruction method based on the graph convolution neural network, wherein for each face sub-region, acquiring the face feature vector corresponding to the face sub-region specifically includes:
for each face subregion, adjusting the region size of the face subregion to obtain an adjusted face subregion;
and determining a face feature vector corresponding to the face subregion based on the trained feature extraction model and the adjusted face subregion, wherein the region size of the adjusted face subregion is the same as the image size of the input item of the feature extraction model.
The three-dimensional face reconstruction method based on the graph convolution neural network, wherein the graph convolution neural network comprises three cascaded graph convolution layers, and the feature number of each layer increases in sequence along the cascade.
The three-dimensional face reconstruction method based on the graph convolution neural network, wherein each piece of three-dimensional face structure information comprises a plurality of pieces of three-dimensional point cloud data; and the determining, according to all the acquired three-dimensional face structure information, the three-dimensional face image corresponding to the face image specifically includes:
acquiring the overlap regions between the face sub-regions corresponding to the pieces of three-dimensional face structure information, and the residual region corresponding to the image region formed by the face sub-regions, wherein the residual region and the image region together form the face image;
for each pixel point in each acquired overlap region, acquiring all the three-dimensional point cloud data corresponding to the pixel point, and taking the average of the acquired three-dimensional point cloud data as the three-dimensional point cloud data of the pixel point, so as to obtain the three-dimensional face structure information of the image region formed by the face sub-regions;
determining predicted three-dimensional face structure information corresponding to the residual region based on an empirical method and an interpolation method;
and determining a three-dimensional face image corresponding to the face image based on the three-dimensional face structure information and the predicted three-dimensional face structure information.
A second aspect of the embodiments of the present application provides a computer-readable storage medium storing one or more programs, which are executable by one or more processors to implement the steps in the three-dimensional face reconstruction method based on the graph convolution neural network as described in any one of the above.
A third aspect of the embodiments of the present application provides a terminal device, including: a processor, a memory, and a communication bus; the memory has stored thereon a computer readable program executable by the processor;
the communication bus realizes connection communication between the processor and the memory;
the processor, when executing the computer readable program, implements the steps in the three-dimensional face reconstruction method based on the graph convolution neural network as described in any one of the above.
Advantageous effects: compared with the prior art, the present application provides a three-dimensional face reconstruction method based on a graph convolution neural network, which comprises: acquiring a face image to be reconstructed and determining a plurality of face sub-regions corresponding to the face image; for each face sub-region, acquiring a face feature vector corresponding to the face sub-region; determining three-dimensional face structure information corresponding to each face feature vector based on a trained graph convolution neural network; and determining a three-dimensional face image corresponding to the face image according to all the acquired three-dimensional face structure information. Because the three-dimensional face structure information corresponding to the face image is determined by the graph convolution neural network, each piece of point cloud data in the obtained three-dimensional face structure information includes both position information and color information, which improves the realism of the three-dimensional face image.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without any inventive work.
Fig. 1 is a flowchart of a three-dimensional face reconstruction method based on a graph convolution neural network provided in the present application.
Fig. 2 is a schematic flow diagram of a three-dimensional face reconstruction method based on a graph convolution neural network provided in the present application.
Fig. 3 is a schematic diagram of the human face feature points in the three-dimensional face reconstruction method based on the graph convolution neural network provided in the present application.
Fig. 4 is a schematic diagram of the human face feature points displayed on a face image in the three-dimensional face reconstruction method based on the graph convolution neural network provided in the present application.
Fig. 5 is a schematic diagram of a face sub-region in the three-dimensional face reconstruction method based on the graph convolution neural network provided in the present application.
Fig. 6 is a schematic structural diagram of a terminal device provided in the present application.
Detailed Description
The present application provides a three-dimensional face reconstruction method based on a graph convolution neural network. In order to make the purpose, technical scheme, and effects of the present application clearer, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit it.
As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes all or any element and all combinations of one or more of the associated listed items.
It will be understood by those within the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The inventor has found that three-dimensional face information recovered from pictures is widely used in many fields such as animation and film production, video games, virtual reality, and augmented reality. However, recovering three-dimensional face information from a single picture is a very challenging task: the first difficulty is that a single picture carries no depth information, and secondly the reconstruction effect is also constrained by occlusion and illumination changes.
In the 1970s, Parke proposed the earliest parameterized face reconstruction method, which modeled the shape of the face with 250 polygons and 400 vertices and could vary the shape through different parameters. In 1987, Waters et al. proposed a muscle model to simulate facial expression and pose, which considers the deformation of muscles under force as well as the shape change of the mesh and is more accurate than the Parke model; however, because the numbers of points and meshes are small, the realism of the reconstructed face is still insufficient.
By the 1990s, researchers had begun exploring the recovery of 3D face structure from pictures, and two comparatively influential methods were Shape From Shading (SFS) and face reconstruction based on the 3D Morphable Model (3DMM). The core idea of SFS is to reconstruct the face based on the correspondence between the gray values of the image and the height changes of the object, but this method is strongly affected by illumination. In the 3DMM, faces reconstructed with a three-dimensional scanner all share the same vertex number and topology, and the model is compressed using Principal Component Analysis (PCA), so that during face reconstruction different 3D faces can be obtained by linearly combining the three-dimensional model with the PCA parameters.
In recent years, with the continuous improvement of hardware and innovation of algorithms, deep learning has come to the fore in many fields, and researchers have explored its application to three-dimensional face reconstruction. For example, in 2017 Tewari proposed a 3D face reconstruction method based on an autoencoder structure, which extracts image features through the encoder and regresses 3DMM parameters through the decoder, thereby recovering three-dimensional face information from a two-dimensional image; in 2019 Fanzi et al. proposed using multi-view constraints to regress more accurate 3DMM parameters and achieved state-of-the-art results.
However, the existing 3D face reconstruction technology generally uses a convolutional neural network or a fully connected neural network to fit or regress face-information parameters. Convolutional and fully connected neural networks are mainly suited to regular data in Euclidean space; for data in non-Euclidean spaces such as 3D point clouds, they cannot express the topological relations between points, which restricts the representation of face information and affects the accuracy of the three-dimensional face information.
In order to solve the above problem, the method in the embodiments of the present application comprises: acquiring a face image to be reconstructed and determining a plurality of face sub-regions corresponding to the face image; for each face sub-region, acquiring a face feature vector corresponding to the face sub-region; determining three-dimensional face structure information corresponding to each face feature vector based on a trained graph convolution neural network; and determining a three-dimensional face image corresponding to the face image according to all the acquired three-dimensional face structure information. Because the three-dimensional face structure information corresponding to the face image is determined by the graph convolution neural network, each piece of point cloud data in the obtained three-dimensional face structure information includes both position information and color information, which improves the realism of the three-dimensional face image.
The following further describes the content of the application by describing the embodiments with reference to the attached drawings.
The present embodiment provides a three-dimensional face reconstruction method based on a graph convolution neural network, as shown in fig. 1 and fig. 2, the method includes:
s10, obtaining a face image to be reconstructed, and determining a plurality of face sub-regions corresponding to the face image.
Specifically, the face image to be reconstructed may be captured by an imaging system (e.g., a camera), transmitted by an external device (e.g., a smartphone), or downloaded from a network (e.g., Baidu). The face image to be reconstructed is a two-dimensional image carrying a face, and the color space of the two-dimensional image may be an RGB color space, a YUV color space, or the like. For example, the two-dimensional image is captured by a camera and belongs to the RGB color space.
In an implementation manner of this embodiment, the process of acquiring the face image may be: acquiring a to-be-processed image carrying a face, performing face recognition on the to-be-processed image to obtain the face image, and taking the obtained face image as the face image to be reconstructed. When the to-be-processed image contains a plurality of face images, the face image to be reconstructed may be selected in several ways: selecting, according to a received selection operation, the face image corresponding to that operation from the recognized face images; determining it according to the image area occupied by each face image, for example taking the face image occupying the largest image area; determining it according to the position of each face image in the to-be-processed image, for example taking the face image closest to the image center; or taking every recognized face image as a face image to be reconstructed. A sketch of the two area- and position-based rules is given below.
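The patent does not prescribe an implementation for this selection step; the following is a minimal sketch, assuming each detected face is given as an (x, y, w, h) bounding box (a hypothetical representation, not fixed by the text):

```python
# Minimal sketch of the two face-selection rules described above; the
# (x, y, w, h) bounding-box representation is an assumption for illustration.

def pick_largest_face(faces):
    """Select the face occupying the largest image area."""
    return max(faces, key=lambda f: f[2] * f[3])

def pick_most_central_face(faces, image_w, image_h):
    """Select the face whose center is closest to the image center."""
    cx, cy = image_w / 2, image_h / 2
    def dist2(f):
        fx, fy = f[0] + f[2] / 2, f[1] + f[3] / 2
        return (fx - cx) ** 2 + (fy - cy) ** 2
    return min(faces, key=dist2)
```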
Further, each of the face sub-regions is included in the face image, and each face sub-region is a partial region of the face image. Each face sub-region corresponds to a facial part, and the facial parts corresponding to different face sub-regions are different. For example, the plurality of face sub-regions may include 7 sub-regions: a left cheek region, a chin region, a right cheek region, a left eye region, a right eye region, a nose region, and a mouth region, where the left eye region includes the left eye and left eyebrow, and the right eye region includes the right eye and right eyebrow.
In an implementation manner of this embodiment, the acquiring a face image to be reconstructed and dividing the face image into a plurality of face sub-regions specifically includes:
acquiring a face image to be reconstructed, and acquiring a face feature point set of the face image;
dividing the face feature point set into a plurality of face feature point subsets, wherein the face feature points in each of the face feature point subsets are different from one another;
and for each face feature point subset of the plurality of face feature point subsets, determining the face sub-region corresponding to that subset, so as to obtain the plurality of face sub-regions.
Specifically, the face feature point set includes a plurality of face feature points, and dividing the set into a plurality of face feature point subsets means grouping those feature points and taking each resulting group as one subset. Therefore, each face feature point subset is included in the face feature point set, and each subset comprises at least one face feature point from the set.
In one implementation manner of this embodiment, the face feature point set includes 68 face feature points of the human face. As shown in figs. 3 and 4, the ordering of the 68 feature points is fixed, and from small to large sequence numbers they cover the left cheek, chin, right cheek, left eyebrow, right eyebrow, nose, left eye, right eye, and mouth: the feature points numbered 1-7 correspond to the left cheek region; points 7-11 and 58 correspond to the chin region; points 11-17 correspond to the right cheek region; points 18-22 correspond to the left eyebrow region; points 23-27 correspond to the right eyebrow region; points 28-36 correspond to the nose region; points 37-42 correspond to the left eye region; points 43-48 correspond to the right eye region; and points 49-68 correspond to the mouth region.
Therefore, the sequence numbers of the face feature points included in each of the plurality of face feature point subsets can be preset, and after the face feature points are recognized, each subset can be quickly determined from those preset sequence numbers. In one implementation of this embodiment, the plurality of face feature point subsets comprises 7 subsets, denoted the left cheek set, chin set, right cheek set, left eye set, right eye set, nose set, and mouth set: the left cheek set comprises the feature points numbered 1-7; the right cheek set comprises points 11-17; the chin set comprises points 7-11 and 58; the left eye set comprises points 18-22 and 37-42; the right eye set comprises points 23-27 and 43-48; the nose set comprises points 28-36; and the mouth set comprises points 49-68. The left cheek set corresponds to the left cheek region, the chin set to the chin region, the right cheek set to the right cheek region, the left eye set to the left eye region, the right eye set to the right eye region, the nose set to the nose region, and the mouth set to the mouth region. A sketch of this grouping is shown below.
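As an illustration only (the patent specifies the groupings but no code), a sketch of this fixed grouping, with the 1-based sequence numbers from the text converted to 0-based Python indices:

```python
# Sketch of the 68-landmark grouping described above; sequence numbers are
# 1-based as in the text and converted to 0-based indices when indexing.

LANDMARK_SUBSETS = {
    "left_cheek":  list(range(1, 8)),                          # points 1-7
    "chin":        list(range(7, 12)) + [58],                  # points 7-11 and 58
    "right_cheek": list(range(11, 18)),                        # points 11-17
    "left_eye":    list(range(18, 23)) + list(range(37, 43)),  # 18-22 and 37-42
    "right_eye":   list(range(23, 28)) + list(range(43, 49)),  # 23-27 and 43-48
    "nose":        list(range(28, 37)),                        # points 28-36
    "mouth":       list(range(49, 69)),                        # points 49-68
}

def split_landmarks(landmarks):
    """Group a list of 68 (x, y) landmarks into the seven subsets."""
    return {name: [landmarks[i - 1] for i in ids]              # 1-based -> 0-based
            for name, ids in LANDMARK_SUBSETS.items()}
```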
Further, in an implementation manner of this embodiment, after each face feature point subset is obtained, region boundary points are selected in the subset, and a rectangular frame is drawn using the selected boundary points as extreme points, so as to obtain the face sub-region corresponding to the subset. The face sub-region is thus the minimum region containing all face feature points in its corresponding subset, and the boundary points lie on the boundary of the face sub-region. For example, as shown in fig. 5, if the four extreme points of the left eye set are points 20, 42, 18, and 22, a rectangular frame is drawn with points 20, 42, 18, and 22 as boundary points, and the region enclosed by the frame is taken as the left eye region; similarly, if the top, bottom, left, and right extreme points of the nose set are points 28, 34, 32, and 36, a rectangular frame is drawn with them as boundary points, and the enclosed region is taken as the nose region. A minimal bounding-box sketch follows.
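A minimal sketch of this rectangle construction, computing the smallest axis-aligned box enclosing a landmark subset (the optional `margin` parameter is our addition, not from the text):

```python
# Sketch: the face sub-region as the minimum axis-aligned rectangle enclosing
# all feature points of one subset; `margin` is an illustrative extra.

def subset_bounding_box(points, margin=0):
    """Return (x_min, y_min, x_max, y_max) enclosing all (x, y) points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs) - margin, min(ys) - margin,
            max(xs) + margin, max(ys) + margin)
```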
In an implementation manner of this embodiment, the face sub-regions corresponding to the face image may be determined by a Dlib-based face detection and segmentation module: after the face image is acquired, it is input into the module, which determines the face sub-regions. Dlib is a modern C++ toolbox containing machine learning algorithms and tools for creating complex C++ software to solve real-world problems; it is widely used in industry and academia, including in robotics, embedded devices, mobile phones, and large high-performance computing environments. A usage sketch is given below.
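A usage sketch with Dlib's standard frontal-face detector and 68-point shape predictor (the pre-trained predictor file must be downloaded separately; how the patent's module wraps these calls is not specified):

```python
# Sketch of the Dlib-based detection step: detect a face, then predict its
# 68 landmarks with Dlib's standard pre-trained shape predictor.

import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def detect_landmarks(image_path):
    """Return the 68 (x, y) landmarks of the first detected face, or None."""
    img = dlib.load_rgb_image(image_path)
    faces = detector(img, 1)            # upsample once to find smaller faces
    if not faces:
        return None
    shape = predictor(img, faces[0])
    return [(shape.part(i).x, shape.part(i).y) for i in range(68)]
```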
And S20, acquiring the face feature vector corresponding to each face subregion.
Specifically, the face feature vector is a feature vector corresponding to the face subregion, and the three-dimensional face structure information corresponding to the face subregion may be determined based on the face feature vector. In an implementation manner of this embodiment, the face feature vector is determined based on a trained feature extraction model, where an input item of the feature extraction model is a face subregion, and an output item of the feature extraction model is a face feature vector. Correspondingly, for each face subregion, acquiring the face feature vector corresponding to the face subregion specifically includes:
for each face subregion, adjusting the region size of the face subregion to obtain an adjusted face subregion;
and determining a face feature vector corresponding to the face subregion based on the trained feature extraction model and the adjusted face subregion, wherein the region size of the adjusted face subregion is the same as the image size of the input item of the feature extraction model.
Specifically, the feature extraction model may be a VGG16 network model, whose input image size is 224 × 224. Therefore, after a face sub-region is acquired, its region size needs to be adjusted so that the adjusted size equals 224 × 224 and the sub-region can serve as an input item of the VGG16 network model. VGG16 contains 13 convolutional layers and 3 fully connected layers in total, and its use of multiple small 3 × 3 convolution kernels gives the network a large receptive field with relatively few parameters. In an implementation manner of this embodiment, the feature vector output by the 10th convolutional layer and the feature vector output by the 13th convolutional layer, counted in the cascade order of the VGG16 network model, are used to determine the feature vector corresponding to the face sub-region; for example, the two feature vectors are concatenated to obtain the feature vector of the face sub-region, as in the sketch below.
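A hedged sketch of this step in PyTorch/torchvision (the weights API assumes torchvision ≥ 0.13); pooling each tapped feature map to a flat vector before concatenation is our assumption, since the patent does not fix how the two layer outputs are combined into one vector:

```python
# Tap the 10th and 13th convolutional layers of VGG16 for a 224x224 face
# sub-region and concatenate their (pooled) outputs into one feature vector.

import torch
import torch.nn.functional as F
from torchvision import models

vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()

def face_feature_vector(region):          # region: (1, 3, 224, 224) tensor
    feats, conv_idx = [], 0
    x = region
    with torch.no_grad():
        for layer in vgg:
            x = layer(x)
            if isinstance(layer, torch.nn.Conv2d):
                conv_idx += 1
                if conv_idx in (10, 13):  # the 10th and 13th conv layers
                    # global-average-pool each feature map to a flat vector
                    feats.append(F.adaptive_avg_pool2d(x, 1).flatten(1))
    return torch.cat(feats, dim=1)        # concatenated feature vector

vec = face_feature_vector(torch.randn(1, 3, 224, 224))   # dummy sub-region
```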
And S30, determining three-dimensional face structure information corresponding to the face feature vector based on the trained graph convolution neural network.
Specifically, the graph convolution neural network is trained; its input item is a face feature vector and its output item is three-dimensional face structure information, where the three-dimensional face structure information includes a plurality of three-dimensional face point cloud data, and each datum includes position information and color information. For example, each three-dimensional face point cloud datum includes 6 dimensions, where the first three dimensions represent its position information and the last three represent its color information; when the face image belongs to the RGB color space, the color information includes an R value, a G value, and a B value. This layout is illustrated below.
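For illustration only, the 6-dimensional point layout as a NumPy array; the 53215-point count is the BFM vertex count cited below, and the array form itself is our assumption:

```python
# Each row is one 3D face point: columns 0-2 hold (x, y, z), columns 3-5 (R, G, B).

import numpy as np

points = np.zeros((53215, 6), dtype=np.float32)  # one row per face point
xyz, rgb = points[:, :3], points[:, 3:]          # views into the same buffer
```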
In an implementation manner of this embodiment, the graph convolution neural network includes three cascaded graph convolution structures; for any two adjacent structures in the cascade order, the output item of the former is the input item of the latter, the input item of the first structure is the face feature vector, and the output item of the last structure is the three-dimensional face structure information. It can be understood that the face feature vector corresponding to each face sub-region is an input item of the graph convolution neural network, and the face feature vectors of the face sub-regions are input into the network in turn, which reduces the parameter count of the network. Since each face in the BFM model has 53215 points, directly applying a graph convolution neural network to the whole face would give each layer an adjacency matrix with more than 2.5 billion entries, consuming a great deal of memory and video memory; dividing the face image into several face sub-regions reduces the number of points fed into the graph convolution network model, thereby reducing its parameter count and its memory and video-memory consumption.
In one implementation manner of this embodiment, the graph convolution neural network includes three cascaded graph convolution layers, and the feature number of each layer increases in sequence along the cascade. For example, the feature numbers of the three GCN layers are 1, 2, and 6 respectively, where 6 is the output dimension of the third layer. Choosing small feature numbers for the first two layers thus keeps the parameter count of the graph convolution network model low; a minimal sketch follows.
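A minimal sketch, assuming a standard Kipf-style propagation rule H' = A_norm · H · W (the patent does not give its exact graph-convolution formula) and using the per-layer feature numbers 1, 2, 6 from the example above; how the face feature vector is distributed over graph nodes is also our assumption:

```python
# Sketch of the three-layer cascaded graph convolution; only the 1/2/6
# feature numbers and the 6-D (position + color) output come from the text.

import torch
import torch.nn as nn

class GraphConv(nn.Module):
    """One graph-convolution layer: H' = A_norm @ (H @ W + b)."""
    def __init__(self, in_feats, out_feats):
        super().__init__()
        self.linear = nn.Linear(in_feats, out_feats)

    def forward(self, x, a_norm):         # x: (N, in_feats), a_norm: (N, N)
        return a_norm @ self.linear(x)

class FaceGCN(nn.Module):
    def __init__(self, in_feats):
        super().__init__()
        self.gc1 = GraphConv(in_feats, 1)
        self.gc2 = GraphConv(1, 2)
        self.gc3 = GraphConv(2, 6)        # 6-D output: xyz + RGB per point

    def forward(self, x, a_norm):
        x = torch.relu(self.gc1(x, a_norm))
        x = torch.relu(self.gc2(x, a_norm))
        return self.gc3(x, a_norm)        # (N, 6) point cloud

# Toy usage: 100 nodes with hypothetical 1024-D per-node input features.
net = FaceGCN(in_feats=1024)
out = net(torch.randn(100, 1024), torch.eye(100))
```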
And S40, determining a three-dimensional face image corresponding to the face image according to all the obtained three-dimensional face structure information.
Specifically, after the three-dimensional face structure information corresponding to each face sub-region is acquired, the three-dimensional face image is determined based on all the acquired three-dimensional face structure information. However, as can be seen from the above determination process of the face sub-regions, several face sub-regions may overlap, and the image region formed by the face sub-regions is smaller than the face image. Therefore, in all the acquired three-dimensional face structure information there may be two (or more) three-dimensional point cloud data corresponding to the same pixel point of the face image, and there may also be pixel points of the face image for which no three-dimensional point cloud data exists in any three-dimensional face structure information; both affect the realism and accuracy of the three-dimensional face image.
Based on this, in an implementation manner of this embodiment, the determining, according to the obtained all three-dimensional face structure information, a three-dimensional face image corresponding to the face image specifically includes:
acquiring the overlap regions between the face sub-regions corresponding to the pieces of three-dimensional face structure information, and the residual region corresponding to the image region formed by the face sub-regions, wherein the residual region and the image region together form the face image;
for each pixel point in each acquired overlap region, acquiring all the three-dimensional point cloud data corresponding to the pixel point, and taking the average of the acquired three-dimensional point cloud data as the three-dimensional point cloud data of the pixel point, so as to obtain the three-dimensional face structure information of the image region formed by the face sub-regions;
determining predicted three-dimensional face structure information corresponding to the residual region based on an empirical method and an interpolation method;
and determining a three-dimensional face image corresponding to the face image based on the three-dimensional face structure information and the predicted three-dimensional face structure information.
Specifically, an overlap region is an image region included in at least two face sub-regions, and the residual region is the image region of the face image that is not included in any face sub-region. The residual region can be obtained by combining the face sub-regions according to their position information in the face image to obtain a candidate image, and then taking the difference between the face image and the candidate image. For each pixel point in an overlap region, at least two pieces of three-dimensional point cloud data can be obtained.
Further, for each pixel point in an overlap region, all the three-dimensional point cloud data corresponding to the pixel point are determined, the average of each dimension across those data is computed, and the averages are taken as the dimension values of the three-dimensional point cloud datum of the pixel point, yielding the three-dimensional point cloud data of the overlap region. The duplicate three-dimensional point cloud data of the overlap region are then removed from the three-dimensional face structure information of each face sub-region containing that overlap region, and the remaining point cloud data of each face sub-region together with the averaged point cloud data of the overlap regions are taken as the three-dimensional face structure information of the face sub-regions. A sketch of this merging rule follows.
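A sketch of this merging rule; the `candidates` mapping from a pixel to its predicted points is assumed bookkeeping, not from the patent:

```python
# Average all 6-D point predictions that different sub-regions assign to the
# same overlap-region pixel.

import numpy as np

def merge_overlap(candidates):
    """candidates: {(row, col): [np.ndarray of shape (6,), ...]}"""
    return {px: np.mean(np.stack(preds), axis=0)
            for px, preds in candidates.items()}
```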
Further, for each pixel point in the residual region, it is determined whether a pixel point that already has three-dimensional point cloud data lies within a preset range centered on the pixel point; if so, the three-dimensional point cloud datum of the pixel point is determined by interpolation, and if not, by the empirical method. Here the empirical method replaces the missing feature with the corresponding feature of the average face in the 3DMM, and the interpolation method fits the missing feature from the features around it, as in the sketch below.
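A hedged sketch of the residual-region fill; the mean of in-window neighbors stands in for the interpolation scheme the text leaves open, and `mean_face_point` stands for the 3DMM average-face feature:

```python
# Fill one residual pixel: interpolate from reconstructed neighbors inside a
# window if any exist, otherwise fall back to the 3DMM mean-face feature.

import numpy as np

def fill_residual(pixel, known, mean_face_point, radius=3):
    """known: {(row, col): (6,) point}; returns a 6-D point for `pixel`."""
    r, c = pixel
    neighbors = [known[(i, j)]
                 for i in range(r - radius, r + radius + 1)
                 for j in range(c - radius, c + radius + 1)
                 if (i, j) in known]
    if neighbors:                        # interpolation branch
        return np.mean(np.stack(neighbors), axis=0)
    return mean_face_point               # empirical (mean-face) branch
```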
In summary, this embodiment provides a three-dimensional face reconstruction method based on a graph convolution neural network, which comprises: acquiring a face image to be reconstructed and determining a plurality of face sub-regions corresponding to the face image; for each face sub-region, acquiring a face feature vector corresponding to the face sub-region; determining three-dimensional face structure information corresponding to each face feature vector based on the trained graph convolution neural network; and determining a three-dimensional face image corresponding to the face image according to all the acquired three-dimensional face structure information. Because the three-dimensional face structure information corresponding to the face image is determined by the graph convolution neural network, each piece of point cloud data in the obtained three-dimensional face structure information includes both position information and color information, which improves the realism of the three-dimensional face image.
Based on the above three-dimensional face reconstruction method based on the graph convolution neural network, this embodiment provides a computer-readable storage medium storing one or more programs, which can be executed by one or more processors to implement the steps in the three-dimensional face reconstruction method based on the graph convolution neural network according to the above embodiment.
Based on the above three-dimensional face reconstruction method based on the graph convolution neural network, the present application also provides a terminal device; as shown in fig. 6, it includes at least one processor 20, a display screen 21, and a memory 22, and may further include a communication interface 23 and a bus 24. The processor 20, the display screen 21, the memory 22, and the communication interface 23 can communicate with one another through the bus 24. The display screen 21 is configured to display a user guidance interface preset in the initial setting mode. The communication interface 23 can transmit information. The processor 20 can call logic instructions in the memory 22 to perform the methods in the embodiments described above.
Furthermore, the logic instructions in the memory 22 may be implemented in software functional units and stored in a computer readable storage medium when sold or used as a stand-alone product.
The memory 22, which is a computer-readable storage medium, may be configured to store a software program, a computer-executable program, such as program instructions or modules corresponding to the methods in the embodiments of the present disclosure. The processor 20 executes the functional application and data processing, i.e. implements the method in the above-described embodiments, by executing the software program, instructions or modules stored in the memory 22.
The memory 22 may include a program storage area and a data storage area, where the program storage area may store the operating system and an application required by at least one function, and the data storage area may store data created according to the use of the terminal device, and the like. In addition, the memory 22 may include high-speed random access memory and may also include non-volatile memory, for example various media that can store program code, such as a USB disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk; it may also be a transient storage medium.
In addition, the specific processes loaded and executed by the storage medium and by the instruction processors of the terminal device are described in detail in the method above and are not repeated here.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (10)

1. A three-dimensional face reconstruction method based on a graph convolution neural network is characterized by comprising the following steps:
acquiring a face image to be reconstructed, and determining a plurality of face subregions corresponding to the face image, wherein each face subregion in the face subregions is included in the face image;
for each face subregion, acquiring a face feature vector corresponding to the face subregion;
determining three-dimensional face structure information corresponding to the face feature vector based on the trained graph convolution neural network;
and determining a three-dimensional face image corresponding to the face image according to the acquired three-dimensional face structure information.
2. The method of claim 1, wherein the face image is a two-dimensional face image.
3. The method of claim 1, wherein the three-dimensional face structure information comprises a plurality of three-dimensional face point cloud data, each of the plurality of three-dimensional face point cloud data comprising location information and color information.
4. The method of claim 1, wherein the obtaining a face image to be reconstructed and dividing the face image into a plurality of face sub-regions specifically comprises:
acquiring a face image to be reconstructed, and acquiring a face feature point set of the face image;
dividing the face feature point set into a plurality of face feature point subsets;
and for each face feature point subset of the plurality of face feature point subsets, determining the face sub-region corresponding to that subset, so as to obtain the plurality of face sub-regions.
5. The method of claim 4, wherein the face sub-region is a minimum region including all face feature points in a subset of face feature points corresponding to the face sub-region.
6. The method of claim 1, wherein for each face sub-region, obtaining the face feature vector corresponding to the face sub-region specifically comprises:
for each face subregion, adjusting the region size of the face subregion to obtain an adjusted face subregion;
and determining a face feature vector corresponding to the face subregion based on the trained feature extraction model and the adjusted face subregion, wherein the region size of the adjusted face subregion is the same as the image size of the input item of the feature extraction model.
7. The method of claim 1, wherein the graph convolution neural network comprises three cascaded graph convolution layers, and the feature number of each layer increases in sequence along the cascade.
8. The method of claim 1, wherein each piece of three-dimensional face structure information comprises a plurality of three-dimensional point cloud data; and the determining, according to all the acquired three-dimensional face structure information, the three-dimensional face image corresponding to the face image specifically comprises:
acquiring the overlap regions between the face sub-regions corresponding to the pieces of three-dimensional face structure information, and the residual region corresponding to the image region formed by the face sub-regions, wherein the residual region and the image region together form the face image;
for each pixel point in each acquired overlap region, acquiring all the three-dimensional point cloud data corresponding to the pixel point, and taking the average of the acquired three-dimensional point cloud data as the three-dimensional point cloud data of the pixel point, so as to obtain the three-dimensional face structure information of the image region formed by the face sub-regions;
determining predicted three-dimensional face structure information corresponding to the residual region based on an empirical method and an interpolation method;
and determining a three-dimensional face image corresponding to the face image based on the three-dimensional face structure information and the predicted three-dimensional face structure information.
9. A computer-readable storage medium storing one or more programs, which are executable by one or more processors to implement the steps of the three-dimensional face reconstruction method based on the graph convolution neural network as recited in any one of claims 1-8.
10. A terminal device, comprising: a processor, a memory, and a communication bus; the memory has stored thereon a computer readable program executable by the processor;
the communication bus realizes connection communication between the processor and the memory;
the processor, when executing the computer readable program, implements the steps in the three-dimensional face reconstruction method based on the graph convolution neural network as recited in any one of claims 1-8.
CN202010929831.2A 2020-09-07 2020-09-07 Three-dimensional face reconstruction method based on graph convolution neural network Active CN112150608B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010929831.2A CN112150608B (en) 2020-09-07 2020-09-07 Three-dimensional face reconstruction method based on graph convolution neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010929831.2A CN112150608B (en) 2020-09-07 2020-09-07 Three-dimensional face reconstruction method based on graph convolution neural network

Publications (2)

Publication Number Publication Date
CN112150608A true CN112150608A (en) 2020-12-29
CN112150608B CN112150608B (en) 2024-07-23

Family

ID=73890666

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010929831.2A Active CN112150608B (en) 2020-09-07 2020-09-07 Three-dimensional face reconstruction method based on graph convolution neural network

Country Status (1)

Country Link
CN (1) CN112150608B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112598790A (en) * 2021-01-08 2021-04-02 中国科学院深圳先进技术研究院 Brain structure three-dimensional reconstruction method and device and terminal equipment

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102986214A (en) * 2010-07-06 2013-03-20 皇家飞利浦电子股份有限公司 Generation of high dynamic range images from low dynamic range images
CN104504410A (en) * 2015-01-07 2015-04-08 深圳市唯特视科技有限公司 Three-dimensional face recognition device and method based on three-dimensional point cloud
CN108399649A (en) * 2018-03-05 2018-08-14 中科视拓(北京)科技有限公司 A kind of single picture three-dimensional facial reconstruction method based on cascade Recurrent networks
CN109934129A (en) * 2019-02-27 2019-06-25 嘉兴学院 A kind of man face characteristic point positioning method, device, computer equipment and storage medium
CN110288695A (en) * 2019-06-13 2019-09-27 电子科技大学 Single-frame images threedimensional model method of surface reconstruction based on deep learning
CN110599593A (en) * 2019-09-12 2019-12-20 北京三快在线科技有限公司 Data synthesis method, device, equipment and storage medium
US20200027269A1 (en) * 2018-07-23 2020-01-23 Fudan University Network, System and Method for 3D Shape Generation
WO2020069049A1 (en) * 2018-09-25 2020-04-02 Matterport, Inc. Employing three-dimensional data predicted from two-dimensional images using neural networks for 3d modeling applications
CN111027140A (en) * 2019-12-11 2020-04-17 南京航空航天大学 Airplane standard part model rapid reconstruction method based on multi-view point cloud data
KR20200052420A (en) * 2018-10-25 2020-05-15 주식회사 인공지능연구원 Apparatus for Predicting 3D Original Formation
CN111598998A (en) * 2020-05-13 2020-08-28 腾讯科技(深圳)有限公司 Three-dimensional virtual model reconstruction method and device, computer equipment and storage medium

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102986214A (en) * 2010-07-06 2013-03-20 皇家飞利浦电子股份有限公司 Generation of high dynamic range images from low dynamic range images
CN104504410A (en) * 2015-01-07 2015-04-08 深圳市唯特视科技有限公司 Three-dimensional face recognition device and method based on three-dimensional point cloud
CN108399649A (en) * 2018-03-05 2018-08-14 中科视拓(北京)科技有限公司 A kind of single picture three-dimensional facial reconstruction method based on cascade Recurrent networks
US20200027269A1 (en) * 2018-07-23 2020-01-23 Fudan University Network, System and Method for 3D Shape Generation
WO2020069049A1 (en) * 2018-09-25 2020-04-02 Matterport, Inc. Employing three-dimensional data predicted from two-dimensional images using neural networks for 3d modeling applications
KR20200052420A (en) * 2018-10-25 2020-05-15 주식회사 인공지능연구원 Apparatus for Predicting 3D Original Formation
CN109934129A (en) * 2019-02-27 2019-06-25 嘉兴学院 A kind of man face characteristic point positioning method, device, computer equipment and storage medium
CN110288695A (en) * 2019-06-13 2019-09-27 电子科技大学 Single-frame images threedimensional model method of surface reconstruction based on deep learning
CN110599593A (en) * 2019-09-12 2019-12-20 北京三快在线科技有限公司 Data synthesis method, device, equipment and storage medium
CN111027140A (en) * 2019-12-11 2020-04-17 南京航空航天大学 Airplane standard part model rapid reconstruction method based on multi-view point cloud data
CN111598998A (en) * 2020-05-13 2020-08-28 腾讯科技(深圳)有限公司 Three-dimensional virtual model reconstruction method and device, computer equipment and storage medium

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
JIANGKE LIN等: "Towards High-Fidelity 3D Face Reconstruction from In-the-Wild Images Using Graph Convolutional Networks", 《COMPUTER GRAPHICS AND ANIMATION》, 18 May 2020 (2020-05-18), pages 1 - 11 *
WU QIANYI (吴潜溢): "Three-dimensional face representation and its applications (三维人脸表示及其应用)", 《China Masters' Theses Full-text Database, Information Science and Technology (monthly)》, no. 8, 15 August 2019 (2019-08-15), pages 138 - 636 *
ZHUANG YUFENG (庄昱峰) et al.: "Improved 3D reconstruction from a single image based on the P2M framework (基于P2M框架改进的单张图像三维重建)", 《Electronic Measurement Technology (电子测量技术)》, no. 9, 8 May 2020 (2020-05-08), pages 66 - 69 *
ZHANG YUJUAN (张玉娟) et al.: "Building data extraction from LIDAR point clouds and 3D model construction (LIDAR点云建筑物数据提取及三维模型建立)", 《Natural Science Journal of Harbin Normal University (哈尔滨师范大学自然科学学报)》, no. 6, 15 December 2017 (2017-12-15), pages 23 - 25 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112598790A (en) * 2021-01-08 2021-04-02 中国科学院深圳先进技术研究院 Brain structure three-dimensional reconstruction method and device and terminal equipment

Also Published As

Publication number Publication date
CN112150608B (en) 2024-07-23

Similar Documents

Publication Publication Date Title
Chen et al. Fsrnet: End-to-end learning face super-resolution with facial priors
US12008797B2 (en) Image segmentation method and image processing apparatus
US10198624B2 (en) Segmentation-guided real-time facial performance capture
CN112052839B (en) Image data processing method, apparatus, device and medium
CN110176027B (en) Video target tracking method, device, equipment and storage medium
CN106778928B (en) Image processing method and device
JP7282810B2 (en) Eye-tracking method and system
CN107463949B (en) Video action classification processing method and device
CN111243093B (en) Three-dimensional face grid generation method, device, equipment and storage medium
CN109416727B (en) Method and device for removing glasses in face image
WO2022078041A1 (en) Occlusion detection model training method and facial image beautification method
CN109961507A (en) A kind of Face image synthesis method, apparatus, equipment and storage medium
CN113850168A (en) Fusion method, device and equipment of face pictures and storage medium
CN110599395A (en) Target image generation method, device, server and storage medium
CN110046574A (en) Safety cap based on deep learning wears recognition methods and equipment
CN109978077B (en) Visual recognition method, device and system and storage medium
KR20230110787A (en) Methods and systems for forming personalized 3D head and face models
CN111080754B (en) Character animation production method and device for connecting characteristic points of head and limbs
CN116310105A (en) Object three-dimensional reconstruction method, device, equipment and storage medium based on multiple views
CN111028318A (en) Virtual face synthesis method, system, device and storage medium
CN112150608B (en) Three-dimensional face reconstruction method based on graph convolution neural network
CN114677286A (en) Image processing method and device, storage medium and terminal equipment
KR102160955B1 (en) Method and apparatus of generating 3d data based on deep learning
WO2023250223A1 (en) View dependent three-dimensional morphable models
CN115035566B (en) Expression recognition method, apparatus, computer device and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant