CN112150608B - Three-dimensional face reconstruction method based on graph convolution neural network
- Publication number: CN112150608B (application CN202010929831.2A)
- Authority: CN (China)
- Prior art keywords: face, dimensional, image, neural network, subarea
- Prior art date: 2020-09-07
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T17/00 — Three dimensional [3D] modelling, e.g. data description of 3D objects (G—Physics; G06—Computing; G06T—Image data processing or generation, in general)
- G06N3/045 — Combinations of networks (G06N—Computing arrangements based on specific computational models; G06N3/00—Computing arrangements based on biological models; G06N3/02—Neural networks; G06N3/04—Architecture, e.g. interconnection topology)
- G06N3/08 — Learning methods (G06N3/00; G06N3/02—Neural networks)
- G06T2207/30201 — Face (G06T2207/00—Indexing scheme for image analysis or image enhancement; G06T2207/30—Subject of image; G06T2207/30196—Human being; person)
Abstract
The application discloses a three-dimensional face reconstruction method based on a graph convolutional neural network. The method comprises: acquiring a face image to be reconstructed and determining a plurality of face sub-regions corresponding to the face image; for each face sub-region, acquiring a face feature vector corresponding to that sub-region; determining, based on the trained graph convolutional neural network, three-dimensional face structure information corresponding to each face feature vector; and determining the three-dimensional face image corresponding to the face image according to all of the obtained three-dimensional face structure information. Because the three-dimensional face structure information is determined by a graph convolutional neural network, each item of point cloud data in the obtained information carries both position information and color information, which improves the realism of the reconstructed three-dimensional face image.
Description
Technical Field
The application relates to the technical field of computer science, and in particular to a three-dimensional face reconstruction method based on a graph convolutional neural network.
Background
Recovering three-dimensional face information from pictures has wide application in many fields such as animation and movie production, electronic games, virtual reality, and augmented reality. At present, 3D face reconstruction techniques are generally adopted to recover three-dimensional face information from a picture; these techniques usually use a convolutional neural network or a fully-connected neural network to fit parameters or regress face information. However, convolutional and fully-connected neural networks are mainly suited to the regular data of Euclidean space; for non-Euclidean data such as 3D point clouds, they cannot express the topological relations between points, which restricts the representation of face information and affects the accuracy of the three-dimensional face information.
Disclosure of Invention
In view of the defects of the prior art, the technical problem to be solved by the application is to provide a three-dimensional face reconstruction method based on a graph convolutional neural network.
In order to solve the above technical problem, a first aspect of the present application provides a three-dimensional face reconstruction method based on a graph convolutional neural network, where the method includes:
acquiring a face image to be reconstructed, and determining a plurality of face sub-regions corresponding to the face image, wherein each face sub-region in the plurality of face sub-regions is contained in the face image;
for each face sub-region, acquiring a face feature vector corresponding to the face sub-region;
determining, based on the trained graph convolutional neural network, three-dimensional face structure information corresponding to the face feature vector;
and determining the three-dimensional face image corresponding to the face image according to all of the obtained three-dimensional face structure information.
In the three-dimensional face reconstruction method based on the graph convolutional neural network, the face image is a two-dimensional face image.
In the three-dimensional face reconstruction method based on the graph convolutional neural network, the three-dimensional face structure information includes a plurality of items of three-dimensional face point cloud data, and each item of three-dimensional face point cloud data includes position information and color information.
In the three-dimensional face reconstruction method based on the graph convolutional neural network, acquiring the face image to be reconstructed and dividing the face image into a plurality of face sub-regions specifically includes:
acquiring a face image to be reconstructed, and acquiring a face feature point set of the face image;
dividing the face feature point set into a plurality of face feature point subsets;
and, for each face feature point subset in the plurality of face feature point subsets, determining the face sub-region corresponding to that face feature point subset, so as to obtain the plurality of face sub-regions.
In the three-dimensional face reconstruction method based on the graph convolutional neural network, each face sub-region is the smallest region that includes all face feature points in the face feature point subset corresponding to that sub-region.
In the three-dimensional face reconstruction method based on the graph convolutional neural network, acquiring, for each face sub-region, the face feature vector corresponding to the face sub-region specifically includes:
for each face sub-region, adjusting the region size of the face sub-region to obtain an adjusted face sub-region;
and determining the face feature vector corresponding to the face sub-region based on the trained feature extraction model and the adjusted face sub-region, wherein the region size of the adjusted face sub-region is the same as the image size of the input item of the feature extraction model.
In the three-dimensional face reconstruction method based on the graph convolutional neural network, the graph convolutional neural network includes a three-layer cascaded graph convolution structure, and the feature numbers of the graph convolution structures of the layers increase in cascade order.
In the three-dimensional face reconstruction method based on the graph convolutional neural network, each item of three-dimensional face structure information includes a plurality of items of three-dimensional point cloud data; determining the three-dimensional face image corresponding to the face image according to all of the obtained three-dimensional face structure information specifically includes:
acquiring the overlapping regions of the face sub-regions corresponding to the respective items of three-dimensional face structure information, and the remaining region corresponding to the image region formed by all of the face sub-regions, wherein the remaining region and that image region together constitute the face image;
for each obtained overlapping region, acquiring the three-dimensional point cloud data corresponding to each pixel point in the overlapping region, and taking the mean of all of the acquired three-dimensional point cloud data as the three-dimensional point cloud data of that pixel point, so as to obtain the three-dimensional face structure information corresponding to the image region formed by the face sub-regions;
determining predicted three-dimensional face structure information corresponding to the remaining region based on an empirical method and an interpolation method;
and determining the three-dimensional face image corresponding to the face image based on the three-dimensional face structure information and the predicted three-dimensional face structure information.
A second aspect of the embodiments of the present application provides a computer-readable storage medium storing one or more programs executable by one or more processors to implement the steps in any of the three-dimensional face reconstruction methods based on a graph convolutional neural network described above.
A third aspect of the embodiments of the present application provides a terminal device, including: a processor, a memory, and a communication bus; the memory stores a computer-readable program executable by the processor;
the communication bus realizes connection and communication between the processor and the memory;
and the processor, when executing the computer-readable program, implements the steps in any of the three-dimensional face reconstruction methods based on a graph convolutional neural network described above.
Beneficial effects: compared with the prior art, the application provides a three-dimensional face reconstruction method based on a graph convolutional neural network. The method comprises: acquiring a face image to be reconstructed and determining a plurality of face sub-regions corresponding to the face image; for each face sub-region, acquiring a face feature vector corresponding to that sub-region; determining, based on the trained graph convolutional neural network, three-dimensional face structure information corresponding to each face feature vector; and determining the three-dimensional face image corresponding to the face image according to all of the obtained three-dimensional face structure information. Because the three-dimensional face structure information is determined by a graph convolutional neural network, each item of point cloud data in the obtained information carries both position information and color information, which improves the realism of the three-dimensional face image.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for describing the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application; other drawings can be obtained from them by a person of ordinary skill in the art without creative effort.
Fig. 1 is a flowchart of a three-dimensional face reconstruction method based on a graph convolution neural network provided by the application.
Fig. 2 is a schematic flow chart of a three-dimensional face reconstruction method based on a graph convolution neural network.
Fig. 3 is a schematic diagram of face feature points in the three-dimensional face reconstruction method based on the graph convolution neural network.
Fig. 4 is a schematic diagram showing face feature points on a face image in the three-dimensional face reconstruction method based on the graph convolution neural network.
Fig. 5 is a schematic diagram of a face subregion in the three-dimensional face reconstruction method based on the graph convolution neural network.
Fig. 6 is a schematic structural diagram of a terminal device provided by the present application.
Detailed Description
The application provides a three-dimensional face reconstruction method based on a graph convolutional neural network. To make the purposes, technical solutions, and effects of the application clearer and more definite, the application is further described in detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the application, not to limit it.
As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may be present. Further, "connected" or "coupled" as used herein may include a wireless connection or wireless coupling. The term "and/or" as used herein includes all of, or any unit of, one or more of the associated listed items and all combinations thereof.
It will be understood by those skilled in the art that, unless defined otherwise, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. It will be further understood that terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the relevant art, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The inventor has found that recovering three-dimensional face information from pictures has wide application in many fields such as animation and movie production, electronic games, virtual reality, and augmented reality. However, recovering three-dimensional face information from a single picture is a very challenging task: the primary difficulty is that a single picture carries no depth information, and in addition the reconstruction effect is limited by occlusion and illumination changes.
In the 1970s, Parke proposed the earliest parameterized face reconstruction method, which simulated the face shape with 250 polygons and 400 vertices whose positions could be changed through different parameters. In 1987, Waters et al. proposed a muscle model to simulate facial expression and posture; it considers not only the change of the mesh shape but also the deformation of the muscles under applied force, and is more accurate than the Parke model, but the reconstructed face is still not realistic enough owing to the small number of points and meshes.
By the 1990s, researchers had begun to explore recovering 3D face structure from pictures; the two influential approaches were Shape From Shading (SFS) and face reconstruction based on the deformable 3D Morphable Model (3DMM). The core idea of SFS is to reconstruct the face from the correspondence between the gray values of the image and the height variation of the object, but the method is strongly affected by illumination. The 3DMM builds faces with a three-dimensional scanner so that all faces share the same number of vertices and the same topology; the method compresses the model with Principal Component Analysis (PCA), so that at reconstruction time different 3D faces can be obtained simply by linearly combining the three-dimensional model with the PCA parameters.
With the continuous improvement of hardware and the continuous innovation of algorithms in recent years, deep learning has come to prominence in many fields, and scholars have explored its many applications in three-dimensional face reconstruction. For example, in 2017 Tewari proposed a 3D face reconstruction method based on an autoencoder structure, in which the encoder extracts image features and the decoder regresses the 3DMM parameters to recover three-dimensional face information from a two-dimensional image; in 2019, Fanzi et al. proposed using multi-view constraints to regress more accurate 3DMM parameters, achieving state-of-the-art results.
However, existing 3D face reconstruction techniques commonly use convolutional neural networks or fully-connected neural networks to fit parameters or regress face information. These networks are mainly suited to the regular data of Euclidean space; for non-Euclidean data such as 3D point clouds, they cannot express the topological relations between points, which restricts the representation of face information and affects the accuracy of the three-dimensional face information.
In order to solve the above problems, in an embodiment of the present application the method comprises: acquiring a face image to be reconstructed and determining a plurality of face sub-regions corresponding to the face image; for each face sub-region, acquiring a face feature vector corresponding to that sub-region; determining, based on the trained graph convolutional neural network, three-dimensional face structure information corresponding to each face feature vector; and determining the three-dimensional face image corresponding to the face image according to all of the obtained three-dimensional face structure information. Because the three-dimensional face structure information is determined by a graph convolutional neural network, each item of point cloud data in the obtained information carries both position information and color information, which improves the realism of the three-dimensional face image.
The application will be further described by the description of embodiments with reference to the accompanying drawings.
This embodiment provides a three-dimensional face reconstruction method based on a graph convolutional neural network. As shown in fig. 1 and fig. 2, the method includes the following steps:
S10, acquiring a face image to be reconstructed, and determining a plurality of face sub-regions corresponding to the face image.
Specifically, the face image to be reconstructed may be obtained by photographing with an imaging system (e.g., a camera), by transmission from an external device (e.g., a smartphone), or by downloading over a network (e.g., through a search engine such as Baidu). The face image to be reconstructed is a two-dimensional image carrying a face, where the color space of the two-dimensional image may be the RGB color space, the YUV color space, and so on. For example, the two-dimensional image is captured by a camera and belongs to the RGB color space.
In one implementation of this embodiment, the process of obtaining the face image may be: acquiring an image to be processed that carries a face, performing face recognition on the image to obtain the face image it contains, and taking the obtained face image as the face image to be reconstructed. When the image to be processed contains several face images, the face image to be reconstructed may be chosen in several ways: according to a received selection operation, the face image corresponding to that operation is selected from the recognized face images; or the face image to be reconstructed is determined by the image area each face occupies, e.g., the face image with the largest occupied area; or it is determined by the position of each face image in the image to be processed, e.g., the face image closest to the image center; or each recognized face image is taken in turn as a face image to be reconstructed; and so on. A short sketch of one such strategy is given below.
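As a minimal sketch of the selection rules above (the function name and the (x, y, w, h) box format are illustrative assumptions, not part of the application), the following helper applies the largest-area rule with a nearest-to-centre tie-break:

```python
def select_face_to_reconstruct(face_boxes, image_shape):
    """Pick one detected face for reconstruction.

    face_boxes: list of (x, y, w, h) rectangles for detected faces.
    image_shape: (height, width) of the image to be processed.
    Combines two of the fallback rules described above: largest occupied
    image area first, distance to the image centre as the tie-breaker.
    """
    cy, cx = image_shape[0] / 2.0, image_shape[1] / 2.0

    def key(box):
        x, y, w, h = box
        dist = ((x + w / 2.0) - cx) ** 2 + ((y + h / 2.0) - cy) ** 2
        return (w * h, -dist)  # bigger area wins, then closer to centre

    return max(face_boxes, key=key)
```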
Further, each face sub-region in the plurality of face sub-regions is contained in the face image, and each face sub-region is a partial region of the face image. Each face sub-region corresponds to one face part, and different sub-regions correspond to different parts. For example, the plurality of face sub-regions comprises 7 sub-regions: a left cheek region, a chin region, a right cheek region, a left eye region, a right eye region, a nose region, and a mouth region, where the left eye region includes the left eye and left eyebrow, and the right eye region includes the right eye and right eyebrow.
In an implementation of this embodiment, acquiring the face image to be reconstructed and dividing the face image into a plurality of face sub-regions specifically includes:
acquiring a face image to be reconstructed, and acquiring a face feature point set of the face image;
dividing the face feature point set into a plurality of face feature point subsets, wherein the face feature points included in the respective face feature point subsets differ from one another;
and, for each face feature point subset in the plurality of face feature point subsets, determining the face sub-region corresponding to that subset, so as to obtain the plurality of face sub-regions.
Specifically, the face feature point set includes a plurality of face feature points, and dividing the set into a plurality of subsets means grouping those feature points and taking each resulting group as one face feature point subset. Thus, each face feature point subset is contained in the face feature point set, and each subset includes at least one face feature point from the set.
In one implementation of this embodiment, the face feature point set includes the 68 face feature points of the face shown in fig. 3 and fig. 4. The sequence numbers of the 68 feature points are fixed and, read from small to large, cover the left cheek, chin, right cheek, eyebrows, nose, eyes, and mouth: feature points 1-7 correspond to the left cheek region; points 7-11 and 58 correspond to the chin region; points 11-17 correspond to the right cheek region; points 18-22 correspond to the left eyebrow; points 23-27 correspond to the right eyebrow; points 28-36 correspond to the nose region; points 37-42 correspond to the left eye; points 43-48 correspond to the right eye; and points 49-68 correspond to the mouth region.
Based on the above, the sequence numbers of the face feature points in each face feature point subset can be preset, so that once the face feature points are identified, each subset can be determined quickly from those sequence numbers. In one implementation of this embodiment, the plurality of face feature point subsets includes 7 subsets, denoted the left cheek set, chin set, right cheek set, left eye set, right eye set, nose set, and mouth set. The left cheek set includes feature points 1-7; the right cheek set includes points 11-17; the chin set includes points 7-11 and 58; the left eye set includes points 18-22 and 37-42; the right eye set includes points 23-27 and 43-48; the nose set includes points 28-36; and the mouth set includes points 49-68. The left cheek set corresponds to the left cheek region, the chin set to the chin region, the right cheek set to the right cheek region, the left eye set to the left eye region, the right eye set to the right eye region, the nose set to the nose region, and the mouth set to the mouth region.
Further, in one implementation of this embodiment, after each face feature point subset is obtained, region boundary points are selected within each subset and a rectangular box is drawn with the selected boundary points as extreme points, giving the face sub-region corresponding to that subset. The face sub-region is therefore the smallest region that includes all face feature points in the corresponding subset, with the boundary points lying on the boundary of the sub-region. For example, as shown in fig. 5, the four extreme points of the left eye set are points 20, 42, 18, and 22; a rectangular box is drawn with these as boundary points, and the enclosed region is taken as the left eye region. Likewise, the four extreme points of the nose set are points 28, 34, 32, and 36, and the region enclosed by the rectangular box drawn through them is taken as the nose region.
In one implementation of this embodiment, the face sub-regions corresponding to the face image may be determined by a Dlib-based face detection and segmentation module: after the face image is acquired, it is input to this module, which determines the face sub-regions. Dlib is a modern C++ toolkit containing machine learning algorithms and tools for building complex software to solve practical problems; it is widely used in industry and academia, including robotics, embedded devices, mobile phones, and large high-performance computing environments. A sketch of this step is given below.
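The application gives no code for this step; the following sketch shows how the sub-regions could be cut out with dlib's standard 68-point shape predictor. The model file name, the region dictionary layout, and the single-face assumption are illustrative, not prescribed by the application:

```python
import dlib
import cv2
import numpy as np

# Landmark index groups per sub-region (1-based numbering, as in fig. 3/4);
# the eye groups also include the matching eyebrow points.
REGIONS = {
    "left_cheek":  list(range(1, 8)),
    "chin":        list(range(7, 12)) + [58],
    "right_cheek": list(range(11, 18)),
    "left_eye":    list(range(18, 23)) + list(range(37, 43)),
    "right_eye":   list(range(23, 28)) + list(range(43, 49)),
    "nose":        list(range(28, 37)),
    "mouth":       list(range(49, 69)),
}

detector = dlib.get_frontal_face_detector()
# dlib's standard 68-landmark model (downloaded separately)
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def face_subregions(image):
    """Return a dict of name -> cropped sub-region, each crop being the
    minimal bounding box around its group's landmarks (step S10)."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    face = detector(gray)[0]                      # assume one face
    pts = predictor(gray, face)
    coords = np.array([(pts.part(i).x, pts.part(i).y) for i in range(68)])
    crops = {}
    for name, idx in REGIONS.items():
        group = coords[[i - 1 for i in idx]]      # convert to 0-based
        x0, y0 = group.min(axis=0)
        x1, y1 = group.max(axis=0)
        crops[name] = image[y0:y1 + 1, x0:x1 + 1]
    return crops
```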
S20, for each face sub-region, acquiring the face feature vector corresponding to the face sub-region.
Specifically, the face feature vector is the feature vector corresponding to the face sub-region, and the three-dimensional face structure information corresponding to the face sub-region can be determined based on this feature vector. In one implementation of this embodiment, the face feature vector is determined by a trained feature extraction model whose input item is a face sub-region and whose output item is a face feature vector. Correspondingly, acquiring, for each face sub-region, the face feature vector corresponding to the face sub-region specifically includes:
for each face sub-region, adjusting the region size of the face sub-region to obtain an adjusted face sub-region;
and determining the face feature vector corresponding to the face sub-region based on the trained feature extraction model and the adjusted face sub-region, wherein the region size of the adjusted face sub-region is the same as the image size of the input item of the feature extraction model.
Specifically, the feature extraction model may be a VGG16 network model, whose input image size is 224×224. After a face sub-region is obtained, its region size therefore needs to be adjusted to 224×224 so that it can serve as an input item of the VGG16 network model. VGG16 contains 13 convolutional layers and 3 fully-connected layers in total; stacking multiple small 3×3 convolution kernels gives the network a large receptive field with few parameters. In one implementation of this embodiment, the feature vector corresponding to the face sub-region is determined from the feature vectors output by the 10th and 13th convolutional layers of the VGG16 network model (counted in cascade order); for example, the two output feature vectors are concatenated to obtain the feature vector corresponding to the face sub-region. A sketch is given below.
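In torchvision's VGG16, counting only the Conv2d modules, the 10th and 13th convolutions sit at indices 21 and 28 of vgg.features. The sketch below concatenates their outputs as described above; pooling each feature map to a vector before splicing is an assumption made here, since the application does not specify how the maps are flattened:

```python
import torch
import torch.nn.functional as F
from torchvision import models, transforms

# Pretrained VGG16 feature extractor (convolutional part only).
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()

preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Resize((224, 224)),          # match the model's input size
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def region_feature_vector(crop):
    """Concatenate the 10th and 13th conv outputs into one feature vector."""
    x = preprocess(crop).unsqueeze(0)       # (1, 3, 224, 224)
    feats = {}
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in (21, 28):                   # 10th and 13th Conv2d modules
            # global-average-pool each map to a vector before concatenation
            feats[i] = F.adaptive_avg_pool2d(x, 1).flatten(1)
    return torch.cat([feats[21], feats[28]], dim=1)   # (1, 512 + 512)
```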
S30, based on the trained graph convolution neural network, determining three-dimensional face structure information corresponding to the face feature vector.
Specifically, the graph convolutional neural network has been trained; its input item is a face feature vector and its output item is three-dimensional face structure information, where the three-dimensional face structure information includes a plurality of items of three-dimensional face point cloud data, each of which includes position information and color information. For example, each item of three-dimensional face point cloud data has 6 dimensions: the first three represent the position of the point, and the last three represent its color; when the face image belongs to the RGB color space, the color information consists of the R, G, and B values.
In one implementation of this embodiment, the graph convolutional neural network includes a three-layer cascaded graph convolution structure. In cascade order, for any two adjacent layers, the output item of the preceding graph convolution structure is the input item of the following one; the input item of the first structure is the face feature vector, and the output item of the last structure is the three-dimensional face structure information. It can be understood that the face feature vector corresponding to each face sub-region is an input item of the graph convolutional neural network, and the feature vectors of the face sub-regions are input into the network in turn, which reduces the number of parameters of the network. Since each face in the BFM (Basel Face Model) has 53,215 points, directly applying a graph convolutional neural network to the whole face would make the adjacency matrix of each layer exceed 2.5 billion entries, consuming a great deal of memory and GPU memory; dividing the face image into a plurality of face sub-regions reduces the number of points in the input to the network, thereby reducing its parameter count and its memory and GPU memory consumption.
In one implementation of this embodiment, the feature numbers of the layers in the three-layer cascaded graph convolution structure increase in cascade order. For example, the feature numbers of the three GCN layers are 1, 2, and 6 respectively, where 6 is the output dimension of the third layer. By choosing small feature numbers for the first two layers, the number of parameters of the graph convolutional network model is reduced. A sketch of this structure is given below.
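A minimal sketch of such a three-layer cascaded graph convolution with output widths 1, 2, and 6 follows. How the single region feature vector becomes per-vertex input is not spelled out in the application, so tiling it over the region's template vertices is an assumption made here for illustration:

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph convolution H' = act(A_hat @ H @ W), where A_hat is the
    normalized adjacency matrix of the region's mesh template."""
    def __init__(self, in_dim, out_dim, act=True):
        super().__init__()
        self.w = nn.Linear(in_dim, out_dim, bias=False)
        self.act = nn.ReLU() if act else nn.Identity()

    def forward(self, h, a_hat):               # h: (N, in_dim)
        return self.act(a_hat @ self.w(h))

class RegionGCN(nn.Module):
    """Three cascaded GCN layers with feature numbers 1, 2, and 6, as the
    patent states; the 6 output dims are (x, y, z, R, G, B) per point."""
    def __init__(self, feat_dim, n_points):
        super().__init__()
        self.n_points = n_points
        self.gc1 = GCNLayer(feat_dim, 1)
        self.gc2 = GCNLayer(1, 2)
        self.gc3 = GCNLayer(2, 6, act=False)    # last layer is linear

    def forward(self, feat, a_hat):             # feat: (feat_dim,)
        # Assumption: tile the region feature vector over all vertices.
        h = feat.unsqueeze(0).expand(self.n_points, -1)
        h = self.gc1(h, a_hat)
        h = self.gc2(h, a_hat)
        return self.gc3(h, a_hat)               # (n_points, 6) point cloud
```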
S40, determining the three-dimensional face image corresponding to the face image according to all of the obtained three-dimensional face structure information.
Specifically, after the three-dimensional face structure information corresponding to each face sub-region has been acquired, the three-dimensional face image is determined from all of the acquired three-dimensional face structure information. However, from the way the face sub-regions are determined above, the sub-regions may overlap, and the image region they form together is smaller than the image region of the face image. Consequently, two items of three-dimensional point cloud data in the obtained three-dimensional face structure information may correspond to the same pixel point of the face image, and there may be pixel points of the face image for which no three-dimensional point cloud data exists in any of the three-dimensional face structure information; both situations affect the realism and accuracy of the three-dimensional face image.
Based on this, in one implementation of this embodiment, determining the three-dimensional face image corresponding to the face image according to all of the obtained three-dimensional face structure information specifically includes:
acquiring the overlapping regions of the face sub-regions corresponding to the respective items of three-dimensional face structure information, and the remaining region corresponding to the image region formed by all of the face sub-regions, wherein the remaining region and that image region together constitute the face image;
for each obtained overlapping region, acquiring the three-dimensional point cloud data corresponding to each pixel point in the overlapping region, and taking the mean of all of the acquired three-dimensional point cloud data as the three-dimensional point cloud data of that pixel point, so as to obtain the three-dimensional face structure information corresponding to the image region formed by the face sub-regions;
determining predicted three-dimensional face structure information corresponding to the remaining region based on an empirical method and an interpolation method;
and determining the three-dimensional face image corresponding to the face image based on the three-dimensional face structure information and the predicted three-dimensional face structure information.
Specifically, an overlapping region is an image region contained in at least two face sub-regions at the same time, and the remaining region is the image region of the face image not contained in any face sub-region. The remaining region can be obtained by combining the face sub-regions according to their position information in the face image into a candidate image, and then taking the difference between the face image and the candidate image. For each pixel point in an overlapping region, at least two items of three-dimensional point cloud data are acquired.
Further, for each pixel point in the overlapping region, all three-dimensional point cloud data corresponding to that pixel point are determined, the mean of each dimension across these data is computed, and the resulting means are taken as the dimension values of the three-dimensional point cloud data for that pixel point, giving the three-dimensional point cloud data of the overlapping region. The three-dimensional point cloud data of the overlapping region then replaces the corresponding data in the three-dimensional face structure information of every face sub-region containing that overlapping region, and the remaining point cloud data of each item of three-dimensional face structure information, together with the point cloud data of the overlapping regions, are taken as the three-dimensional face structure information corresponding to the face sub-regions.
Further, for each pixel point in the remaining region, it is determined whether a preset range centred on that pixel point contains pixel points that already have three-dimensional point cloud data. If so, the interpolation method is used to determine the point cloud data of the pixel; if not, the empirical method is used. Here the empirical method replaces the missing features with the features of the average face in the 3DMM, and the interpolation method fits the missing features from the features around them. A sketch of this merging and filling step is given below.
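The overlap-averaging and gap-filling step can be sketched as follows. The dense (H, W, 6) point-map layout and the fixed 7×7 interpolation window are illustrative choices; a 3DMM mean-face lookup would stand in for the empirical fallback where no covered neighbour exists:

```python
import numpy as np

def merge_point_clouds(region_points, face_shape):
    """Fuse per-region predictions into one (H, W, 6) point map.

    region_points: dict name -> (box, points), where box = (x0, y0, x1, y1)
    in face-image coordinates and points is an (h, w, 6) array of per-pixel
    point cloud data for that crop. Pixels covered by several regions get
    the mean of their predictions; uncovered pixels are filled from nearby
    covered pixels (the interpolation method).
    """
    h, w = face_shape
    acc = np.zeros((h, w, 6), dtype=np.float64)
    cnt = np.zeros((h, w), dtype=np.int32)
    for (x0, y0, x1, y1), pts in region_points.values():
        acc[y0:y1 + 1, x0:x1 + 1] += pts
        cnt[y0:y1 + 1, x0:x1 + 1] += 1
    covered = cnt > 0
    acc[covered] /= cnt[covered][:, None]   # average overlapping predictions
    # Remaining region: average covered neighbours in a small window; a
    # 3DMM mean-face value (the empirical method) would fill any pixel
    # with no covered neighbour at all.
    for y, x in zip(*np.nonzero(~covered)):
        win = covered[max(0, y - 3):y + 4, max(0, x - 3):x + 4]
        if win.any():
            patch = acc[max(0, y - 3):y + 4, max(0, x - 3):x + 4]
            acc[y, x] = patch[win].mean(axis=0)
    return acc
```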
In summary, this embodiment provides a three-dimensional face reconstruction method based on a graph convolutional neural network. The method comprises: acquiring a face image to be reconstructed and determining a plurality of face sub-regions corresponding to the face image; for each face sub-region, acquiring a face feature vector corresponding to that sub-region; determining, based on the trained graph convolutional neural network, three-dimensional face structure information corresponding to each face feature vector; and determining the three-dimensional face image corresponding to the face image according to all of the obtained three-dimensional face structure information. Because the three-dimensional face structure information is determined by a graph convolutional neural network, each item of point cloud data in the obtained information carries both position information and color information, which improves the realism of the three-dimensional face image.
Based on the above three-dimensional face reconstruction method, this embodiment provides a computer-readable storage medium storing one or more programs that can be executed by one or more processors to implement the steps in the three-dimensional face reconstruction method based on a graph convolutional neural network described above.
Based on the above three-dimensional face reconstruction method, the application further provides a terminal device. As shown in fig. 6, it includes at least one processor 20, a display screen 21, and a memory 22, and may further include a communication interface 23 and a bus 24. The processor 20, display screen 21, memory 22, and communication interface 23 can communicate with one another via the bus 24. The display screen 21 is configured to display a user guidance interface preset in the initial setting mode. The communication interface 23 can transmit information. The processor 20 can invoke logic instructions in the memory 22 to perform the methods of the embodiments described above.
Further, the logic instructions in the memory 22 may be implemented in the form of software functional units and stored in a computer-readable storage medium when sold or used as a standalone product.
The memory 22, as a computer-readable storage medium, may be configured to store software programs and computer-executable programs, such as the program instructions or modules corresponding to the methods in the embodiments of the present disclosure. The processor 20 performs functional applications and data processing, i.e., implements the methods of the embodiments described above, by running the software programs, instructions, or modules stored in the memory 22.
The memory 22 may include a program storage area, which may store an operating system and at least one application program required for its functions, and a data storage area, which may store data created according to the use of the terminal device, etc. In addition, the memory 22 may include high-speed random access memory and may also include nonvolatile memory. For example, a USB flash disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or another medium capable of storing program code, or a transitory storage medium, may be used.
In addition, the specific processes by which the storage medium and the instructions in the terminal device are loaded and executed by the processor are described in detail in the method above and are not repeated here.
Finally, it should be noted that the above embodiments are only intended to illustrate, not to limit, the technical solutions of the present application. Although the application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art will understand that the technical solutions described in those embodiments can still be modified, or some of their technical features replaced by equivalents; such modifications and substitutions do not depart the corresponding technical solutions from the spirit and scope of the technical solutions of the embodiments of the present application.
Claims (9)
1. A three-dimensional face reconstruction method based on a graph convolutional neural network, characterized in that the method comprises:
acquiring a face image to be reconstructed, and determining a plurality of face sub-regions corresponding to the face image, wherein each face sub-region in the plurality of face sub-regions is contained in the face image;
for each face sub-region, acquiring a face feature vector corresponding to the face sub-region;
determining, based on the trained graph convolutional neural network, three-dimensional face structure information corresponding to the face feature vector;
and determining the three-dimensional face image corresponding to the face image according to all of the obtained three-dimensional face structure information;
wherein each item of three-dimensional face structure information includes a plurality of items of three-dimensional point cloud data, and determining the three-dimensional face image corresponding to the face image according to all of the obtained three-dimensional face structure information specifically includes:
acquiring the overlapping regions of the face sub-regions corresponding to the respective items of three-dimensional face structure information, and the remaining region corresponding to the image region formed by all of the face sub-regions, wherein the remaining region and that image region together constitute the face image;
for each obtained overlapping region, acquiring the three-dimensional point cloud data corresponding to each pixel point in the overlapping region, and taking the mean of all of the acquired three-dimensional point cloud data as the three-dimensional point cloud data of that pixel point, so as to obtain the three-dimensional face structure information corresponding to the image region formed by the face sub-regions;
determining predicted three-dimensional face structure information corresponding to the remaining region based on an empirical method and an interpolation method;
and determining the three-dimensional face image corresponding to the face image based on the three-dimensional face structure information and the predicted three-dimensional face structure information.
2. The three-dimensional face reconstruction method based on a graph convolutional neural network according to claim 1, wherein the face image is a two-dimensional face image.
3. The three-dimensional face reconstruction method based on a graph convolutional neural network according to claim 1, wherein the three-dimensional face structure information includes a plurality of items of three-dimensional face point cloud data, and each item of three-dimensional face point cloud data includes position information and color information.
4. The three-dimensional face reconstruction method based on a graph convolutional neural network according to claim 1, wherein acquiring the face image to be reconstructed and dividing the face image into a plurality of face sub-regions specifically includes:
acquiring a face image to be reconstructed, and acquiring a face feature point set of the face image;
dividing the face feature point set into a plurality of face feature point subsets;
and, for each face feature point subset in the plurality of face feature point subsets, determining the face sub-region corresponding to that face feature point subset, so as to obtain the plurality of face sub-regions.
5. The three-dimensional face reconstruction method based on a graph convolutional neural network according to claim 4, wherein the face sub-region is the smallest region that includes all face feature points in the face feature point subset corresponding to the face sub-region.
6. The three-dimensional face reconstruction method based on a graph convolutional neural network according to claim 1, wherein acquiring, for each face sub-region, the face feature vector corresponding to the face sub-region specifically includes:
for each face sub-region, adjusting the region size of the face sub-region to obtain an adjusted face sub-region;
and determining the face feature vector corresponding to the face sub-region based on the trained feature extraction model and the adjusted face sub-region, wherein the region size of the adjusted face sub-region is the same as the image size of the input item of the feature extraction model.
7. The three-dimensional face reconstruction method based on a graph convolutional neural network according to claim 1, wherein the graph convolutional neural network includes a three-layer cascaded graph convolution structure, and the feature numbers of the graph convolution structures of the layers increase in cascade order.
8. A computer-readable storage medium storing one or more programs executable by one or more processors to implement the steps in the three-dimensional face reconstruction method based on a graph convolutional neural network according to any one of claims 1-7.
9. A terminal device, comprising: a processor, a memory, and a communication bus; the memory stores a computer-readable program executable by the processor;
the communication bus realizes connection and communication between the processor and the memory;
and the processor, when executing the computer-readable program, implements the steps in the three-dimensional face reconstruction method based on a graph convolutional neural network according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202010929831.2A | 2020-09-07 | 2020-09-07 | Three-dimensional face reconstruction method based on graph convolution neural network
Publications (2)
Publication Number | Publication Date
---|---
CN112150608A (en) | 2020-12-29
CN112150608B (en) | 2024-07-23
Family
ID=73890666
Family Applications (1)
Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN202010929831.2A (Active) | Three-dimensional face reconstruction method based on graph convolution neural network | 2020-09-07 | 2020-09-07
Country Status (1)
Country | Link
---|---
CN | CN112150608B (en)
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112598790B (en) * | 2021-01-08 | 2024-07-05 | 中国科学院深圳先进技术研究院 | Brain structure three-dimensional reconstruction method and device and terminal equipment |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109934129A (en) * | 2019-02-27 | 2019-06-25 | 嘉兴学院 | A kind of man face characteristic point positioning method, device, computer equipment and storage medium |
CN111598998A (en) * | 2020-05-13 | 2020-08-28 | 腾讯科技(深圳)有限公司 | Three-dimensional virtual model reconstruction method and device, computer equipment and storage medium |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130107956A1 (en) * | 2010-07-06 | 2013-05-02 | Koninklijke Philips Electronics N.V. | Generation of high dynamic range images from low dynamic range images |
CN104504410A (en) * | 2015-01-07 | 2015-04-08 | 深圳市唯特视科技有限公司 | Three-dimensional face recognition device and method based on three-dimensional point cloud |
CN108399649B (en) * | 2018-03-05 | 2021-07-20 | 中科视拓(北京)科技有限公司 | Single-picture three-dimensional face reconstruction method based on cascade regression network |
CN109147048B (en) * | 2018-07-23 | 2021-02-26 | 复旦大学 | Three-dimensional mesh reconstruction method by utilizing single-sheet colorful image |
EP3857451A4 (en) * | 2018-09-25 | 2022-06-22 | Matterport, Inc. | Employing three-dimensional data predicted from two-dimensional images using neural networks for 3d modeling applications |
KR102131592B1 (en) * | 2018-10-25 | 2020-08-05 | 주식회사 인공지능연구원 | Apparatus for Predicting 3D Original Formation |
CN110288695B (en) * | 2019-06-13 | 2021-05-28 | 电子科技大学 | Single-frame image three-dimensional model surface reconstruction method based on deep learning |
CN110599593B (en) * | 2019-09-12 | 2021-03-23 | 北京三快在线科技有限公司 | Data synthesis method, device, equipment and storage medium |
CN111027140B (en) * | 2019-12-11 | 2020-09-22 | 南京航空航天大学 | Airplane standard part model rapid reconstruction method based on multi-view point cloud data |
Also Published As
Publication number | Publication date |
---|---|
CN112150608A (en) | 2020-12-29 |
Legal Events
Date | Code | Title
---|---|---
 | PB01 | Publication
 | SE01 | Entry into force of request for substantive examination
 | GR01 | Patent grant