CN107909640B - Face relighting method and device based on deep learning - Google Patents


Info

Publication number
CN107909640B
Authority
CN
China
Prior art keywords
face, network, relighting, picture, output
Prior art date
2017-11-06
Legal status
Active
Application number
CN201711078324.7A
Other languages
Chinese (zh)
Other versions
CN107909640A (en)
Inventor
徐枫 (Xu Feng)
王至博 (Wang Zhibo)
Current Assignee
Tsinghua University
Original Assignee
Tsinghua University
Priority date
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201711078324.7A
Publication of CN107909640A
Application granted
Publication of CN107909640B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/10 Geometric effects
    • G06T15/20 Perspective computation
    • G06T15/205 Image-based rendering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Geometry (AREA)
  • Computer Graphics (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a face relighting method and device based on deep learning. The method comprises the following steps: obtaining a face region map, a normal map and a map representing the illumination distribution corresponding to a face picture; processing a photo set with the camera parameter matrix, the three-dimensional face model and the illumination model to obtain a network training data set; constructing a face relighting network by a deep learning method and training it on the network training data set; and processing a face picture with the camera parameter matrix, the three-dimensional face model and the illumination model, then processing the output with the face relighting network to obtain the face relighting result. The method constructs the face relighting network by deep learning, trains it with a face inverse-relighting regularization term and a face recognition network feature regularization term, and relights faces with the trained network, thereby effectively improving the realism and reliability of face relighting.

Description

Face relighting method and device based on deep learning
Technical Field
The invention relates to the technical field of computer vision, computer graphics and deep learning, in particular to a face relighting method and device based on deep learning.
Background
When a photograph is taken, both the content and the shooting conditions (such as time, viewing angle and illumination) are recorded in the photo. Some image processing technologies can change the shooting conditions by technical means to obtain a new photo directly, without retaking it. Relighting is a method for obtaining a corresponding new photo by changing the illumination under which the photo was shot; face relighting targets face photos in particular, and is very important in photo post-processing and film post-production.
In the related art, face relighting usually builds a face illumination model to relight the face. However, the interaction between the face and illumination is very complex, and a hand-crafted model often struggles to express the true appearance of the face under every illumination, so reliability is poor. This problem needs to be solved.
Disclosure of Invention
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.
Therefore, one objective of the present invention is to provide a face relighting method based on deep learning, which can effectively improve the realism and reliability of face relighting.
Another objective of the present invention is to provide a face relighting device based on deep learning.
In order to achieve the above object, an embodiment of the present invention provides a face relighting method based on deep learning, including the following steps: determining the correspondence between the vertices of a three-dimensional face model and the positions and colors of the two-dimensional pixels of a face picture by an iterative optimization method, using a camera parameter matrix, the three-dimensional face model and an illumination model, and obtaining a face region map, a normal map and a map representing the illumination distribution corresponding to the face picture; processing a photo set with the camera parameter matrix, the three-dimensional face model and the illumination model to obtain a network training data set; constructing a face relighting network by a deep learning method, wherein the face relighting network is generated by training on the network training data set; and processing a face picture with the camera parameter matrix, the three-dimensional face model and the illumination model, and processing the output with the face relighting network to obtain a face relighting result.
The face relighting method based on deep learning of the embodiment of the invention can fit a three-dimensional face model to the face in a picture, compute an original illumination estimation map and a target illumination estimation map representing the illumination distribution on the face, construct a face relighting network by a deep learning method, and train the face relighting network with a face inverse-relighting regularization term and a face recognition network feature regularization term. For a new face picture, only the three-dimensional face reconstruction step of fitting the face model to the picture is needed before the face relighting network relights the face, which effectively improves the realism and reliability of face relighting.
In addition, the face relighting method based on deep learning according to the above embodiment of the present invention may further have the following additional technical features:
Further, in an embodiment of the present invention, determining the correspondence between the vertices of the three-dimensional face model and the positions and colors of the two-dimensional pixels of the face picture further includes: obtaining the positions of feature points in the face picture by a face feature point detection method; rendering a face picture using the camera parameter matrix and the illumination model; and iteratively optimizing both the difference between the pixel values of the rendered face picture and the real face picture, and the distance on the two-dimensional picture between the feature points and the projections of the corresponding feature points of the three-dimensional face model.
Further, in an embodiment of the present invention, training on the network training data set to generate the face relighting network further includes: obtaining an original illumination estimation map and a target illumination estimation map using the illumination model and the normal maps corresponding to the real face pictures in the training set; constructing a generative adversarial network, wherein the generator network of the generative adversarial network is the face relighting network; taking the real face pictures of the network training data set together with the corresponding original and target illumination estimation maps as the input of the generator network, taking the generator output together with the target illumination estimation map as the false input of the adversarial network, and taking the real face pictures of the network training data set together with the original illumination estimation maps as the real input of the adversarial network; and training the generative adversarial network, taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error.
Further, in an embodiment of the present invention, taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error further includes: feeding the generator output together with the target illumination estimation map and the original illumination estimation map through the generator network to obtain an output picture; and taking the per-pixel difference between the output picture and the corresponding real face picture as the inverse-relighting regularization term error.
Further, in an embodiment of the present invention, taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error further includes: obtaining the face recognition network features of the generator output and of the real face picture; and taking the difference between the generator output features and the real face picture features as the face recognition network feature regularization term error.
In order to achieve the above object, another embodiment of the present invention provides a face relighting device based on deep learning, including: an acquisition module, configured to determine, using a camera parameter matrix, a three-dimensional face model and an illumination model, the correspondence between the vertices of the three-dimensional face model and the positions and colors of the two-dimensional pixels of a face picture by an iterative optimization method, and to obtain a face region map, a normal map and a map representing the illumination distribution corresponding to the face picture; an obtaining module, configured to process a photo set with the camera parameter matrix, the three-dimensional face model and the illumination model to obtain a network training data set; a construction module, configured to construct a face relighting network by a deep learning method, wherein the face relighting network is generated by training on the network training data set; and a processing module, configured to process a face picture with the camera parameter matrix, the three-dimensional face model and the illumination model, and to process the output with the face relighting network to obtain a face relighting result.
The deep learning face relighting device of the embodiment of the invention can fit a three-dimensional face model to the face in a picture, compute an original illumination estimation map and a target illumination estimation map representing the illumination distribution on the face, construct a face relighting network by a deep learning method, and train the face relighting network with a face inverse-relighting regularization term and a face recognition network feature regularization term. For a new face picture, only the three-dimensional face reconstruction step of fitting the face model to the picture is needed before the face relighting network relights the face, which effectively improves the realism and reliability of face relighting.
In addition, the deep learning face relighting device according to the above embodiment of the present invention may also have the following additional technical features:
Further, in an embodiment of the present invention, the acquisition module is further configured to obtain the positions of feature points in the face picture by a face feature point detection method, to render a face picture using the camera parameter matrix and the illumination model, and to iteratively optimize both the difference between the pixel values of the rendered face picture and the real face picture, and the distance on the two-dimensional picture between the feature points and the projections of the corresponding feature points of the three-dimensional face model.
Further, in an embodiment of the present invention, the construction module is further configured to obtain an original illumination estimation map and a target illumination estimation map using the illumination model and the normal maps corresponding to the real face pictures in the training set, and to construct a generative adversarial network, wherein the generator network of the generative adversarial network is the face relighting network; the real face pictures of the network training data set together with the original and target illumination estimation maps are taken as the input of the generator network, the generator output together with the target illumination estimation map is taken as the false input of the adversarial network, and the real face pictures of the network training data set together with the original illumination estimation maps are taken as the real input of the adversarial network; the construction module trains the generative adversarial network taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error.
Further, in an embodiment of the present invention, the construction module is further configured to feed the generator output together with the target illumination estimation map and the original illumination estimation map through the generator network to obtain an output picture, and to take the per-pixel difference between the output picture and the corresponding real face picture as the inverse-relighting regularization term error.
Further, in an embodiment of the present invention, the construction module is further configured to obtain the face recognition network features of the generator output and of the real face picture, and to take the difference between the generator output features and the real face picture features as the face recognition network feature regularization term error.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a flow chart of a face relighting method based on deep learning according to an embodiment of the present invention;
FIG. 2 is a flowchart of a face relighting method based on deep learning according to another embodiment of the present invention;
FIG. 3 is a functional diagram of a method for using a face relighting network according to another embodiment of the present invention;
fig. 4 is a schematic structural diagram of a face relighting apparatus based on deep learning according to another embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.
The following describes a face relighting method and apparatus based on deep learning according to an embodiment of the present invention with reference to the accompanying drawings, and first, a face relighting method based on deep learning according to an embodiment of the present invention will be described with reference to the accompanying drawings.
Fig. 1 is a flowchart of a face relighting method based on deep learning according to an embodiment of the present invention.
As shown in fig. 1, the face relighting method based on deep learning includes the following steps:
in step S101, a camera parameter matrix, a three-dimensional face model and an illumination model are used, and a correspondence between a vertex of the three-dimensional face model and a position and a color of a two-dimensional pixel of a face picture is determined by an iterative optimization method, so as to obtain a face region map, a normal map and a representative illumination distribution map corresponding to the face picture.
That is to say, in the embodiment of the present invention, the camera parameter matrix, the three-dimensional face model and the illumination model may be used with an iterative optimization method to determine the correspondence between the vertices of the three-dimensional face model and the positions and colors of the two-dimensional pixels of the face picture, and the face region map, the normal map and the map representing the illumination distribution corresponding to the face picture are obtained by calculation.
Further, in an embodiment of the present invention, determining the correspondence between the vertices of the three-dimensional face model and the positions and colors of the two-dimensional pixels of the face picture by an iterative optimization method further includes: obtaining the positions of feature points in the face picture by a face feature point detection method; rendering a face picture using the camera parameter matrix and the illumination model; and iteratively optimizing both the difference between the pixel values of the rendered face picture and the real face picture, and the distance on the two-dimensional picture between the feature points and the projections of the corresponding feature points of the three-dimensional face model.
It can be understood that the method for determining the correspondence between the vertex of the three-dimensional model of the human face and the two-dimensional pixel in the embodiment of the invention comprises the following steps:
(1) obtaining the positions of feature points in the face picture by using a face feature point detection method;
(2) rendering a face picture using the camera parameter matrix and the spherical harmonic illumination model;
(3) iteratively optimizing both the difference between the pixel values of the rendered face picture and the real face picture, and the distance on the two-dimensional picture between the feature points and the projections of the corresponding feature points of the three-dimensional model.
For example, as shown in FIG. 2, an embodiment of the present invention collects a face data set. For a face picture, a three-dimensional face model is fitted to the face in the picture: a vertex $v$ of the three-dimensional face model is rotated by the rotation transformation $R$, projected through the camera parameter matrix $P$, and translated by $t$ to the position $p$ of its corresponding pixel in the face picture:

$$p = P(Rv) + t$$
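As a minimal sketch of this projection step, the formula above can be applied to every model vertex at once; the 2×3 camera matrix, the image-space translation and all names here are illustrative assumptions rather than the patent's notation:

```python
import numpy as np

def project_vertices(vertices, R, t, P):
    """Project 3D face-model vertices to 2D pixel positions via p = P(Rv) + t.

    vertices: (N, 3) model vertices; R: (3, 3) rotation;
    P: (2, 3) camera parameter matrix; t: (2,) image-space translation.
    """
    rotated = vertices @ R.T     # apply the rotation transformation R
    return rotated @ P.T + t     # project with P and add the translation t
```

During fitting, $R$, $t$, $P$ and the model coefficients are the unknowns that the iterative optimization adjusts until the projected feature points and rendered pixels match the photo.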
The embodiment of the invention can obtain the positions of the feature points in the face picture by a face feature point detection method, and can obtain the coefficient vector of the three-dimensional face model and the parameter vector $l$ of the spherical harmonic illumination model by iteratively optimizing over the positions of the three-dimensional model feature points and the picture feature points, and over the vertices and the colors of their corresponding pixels. The embodiment of the invention can use the spherical harmonic illumination model to calculate the illumination intensity at each pixel:

$$L(n) = \sum_{j=1}^{9} l_j\, b_j(n)$$

wherein each $b_j$ relates to the normal direction $n = (n_x, n_y, n_z)$ of the corresponding vertex on the face model, using the standard second-order spherical harmonic basis:

$$b_1 = 1, \quad b_2 = n_x, \quad b_3 = n_y, \quad b_4 = n_z,$$

$$b_5 = n_x n_y, \quad b_6 = n_x n_z, \quad b_7 = n_y n_z, \quad b_8 = n_x^2 - n_y^2, \quad b_9 = 3 n_z^2 - 1$$
The embodiment of the invention can obtain a normal map of the face from the three-dimensional face model, render the original illumination distribution map of the face using the normal map and the illumination intensity formula above, randomly generate a group of spherical harmonic illumination coefficients and render the target illumination estimation map from them, and project the three-dimensional face model onto the picture to take out the area it covers as the face region picture.
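A sketch of this rendering step, assuming the nine-term spherical harmonic basis written above and a per-pixel normal map; the function names are illustrative:

```python
import numpy as np

def sh_basis(normals):
    """Second-order spherical harmonic basis b_1..b_9 per pixel.

    normals: (H, W, 3) unit normal map; returns (H, W, 9).
    """
    nx, ny, nz = normals[..., 0], normals[..., 1], normals[..., 2]
    return np.stack([np.ones_like(nx), nx, ny, nz,
                     nx * ny, nx * nz, ny * nz,
                     nx**2 - ny**2, 3 * nz**2 - 1], axis=-1)

def render_illumination_map(normals, sh_coeffs):
    """Per-pixel illumination L(n) = sum_j l_j b_j(n); sh_coeffs: (9,)."""
    return sh_basis(normals) @ sh_coeffs  # (H, W) illumination map

# The original map uses the fitted coefficients; the target map uses random ones:
# original_map = render_illumination_map(normals, fitted_sh)
# target_map = render_illumination_map(normals, np.random.uniform(-1, 1, 9))
```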
In step S102, the photo set is processed by using the camera parameter matrix, the three-dimensional face model, and the illumination model, so as to obtain a network training data set.
For example, as shown in fig. 2, in the embodiment of the present invention, each picture in the face picture data set may be processed by using the method in step S101, so that a training database for face relighting network training may be obtained.
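A sketch of the resulting dataset-construction loop; `fit_face_model`, `crop_face_region` and `random_sh_coeffs` are hypothetical placeholders for the step S101 processing, and `render_illumination_map` is the helper from the previous sketch:

```python
def build_training_set(photo_set, fit_face_model, crop_face_region, random_sh_coeffs):
    """Turn each photo into a (face crop, original map, target map) training triple."""
    dataset = []
    for photo in photo_set:
        fit = fit_face_model(photo)          # 3D model, normals, fitted SH coefficients
        crop = crop_face_region(photo, fit)  # face region covered by the projected model
        original_map = render_illumination_map(fit.normals, fit.sh_coeffs)
        target_map = render_illumination_map(fit.normals, random_sh_coeffs())
        dataset.append((crop, original_map, target_map))
    return dataset
```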
In step S103, a face relighting network is constructed by a deep learning method, wherein a network training data set is used for training to generate the face relighting network.
That is to say, the embodiment of the present invention may construct the face relighting network by a deep learning method, and perform training by using the training data set, so that the face relighting network may generate a face relighting result.
For example, the face relighting network constructed in the embodiment of the invention is characterized in that it takes three pictures as input and outputs one picture.
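A minimal generator sketch consistent with this description, taking the three pictures (face crop plus two single-channel illumination maps) concatenated channel-wise and producing one picture; the patent does not fix the internal architecture, so the small encoder-decoder below is only an assumption:

```python
import torch
import torch.nn as nn

class RelightingGenerator(nn.Module):
    """Generator: face crop + original and target illumination maps -> relit face."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + 1 + 1, 64, 4, stride=2, padding=1),   # RGB face + two maps
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1),  # one output picture
            nn.Sigmoid(),
        )

    def forward(self, face, original_map, target_map):
        # face: (B, 3, H, W); each illumination map: (B, 1, H, W)
        x = torch.cat([face, original_map, target_map], dim=1)
        return self.net(x)
```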
Further, in an embodiment of the present invention, training on the network training data set to generate the face relighting network further includes: obtaining an original illumination estimation map and a target illumination estimation map using the illumination model and the normal maps corresponding to the real face pictures in the training set; constructing a generative adversarial network, wherein the generator network is the face relighting network; taking the real face pictures of the network training data set together with the corresponding original and target illumination estimation maps as the input of the generator network, taking the generator output together with the target illumination estimation map as the false input of the adversarial network, and taking the real face pictures of the network training data set together with the original illumination estimation maps as the real input of the adversarial network; and training the generative adversarial network, taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error.
It can be understood that the method for training the face relighting network in the embodiment of the invention can comprise the following steps (a code sketch follows the list):
(1) obtaining an original illumination estimation map and a target illumination estimation map using the illumination model and the normal maps corresponding to the real face pictures in the training set;
(2) constructing a generative adversarial network, wherein the generator network is the face relighting network;
(3) taking the real face pictures in the training set together with the corresponding original and target illumination estimation maps as the input of the generator network, taking the generator output together with the target illumination estimation map as the false input of the adversarial network, and taking the real face pictures and original illumination estimation maps in the training set as the real input of the adversarial network;
(4) training the generative adversarial network, taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error.
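A hedged sketch of the discriminator side of step (3), assuming `D` scores (picture, illumination map) pairs with probabilities in [0, 1] and binary cross-entropy as the adversarial error; the generator update would add the two regularization terms sketched after their descriptions below:

```python
import torch
import torch.nn.functional as F

def discriminator_step(G, D, face, orig_map, tgt_map, opt_D):
    """One update of the adversarial (discriminator) network.

    (real face, original map) is the real input; (generator output, target map)
    is the false input. D is assumed to output probabilities in [0, 1].
    """
    fake = G(face, orig_map, tgt_map)
    d_real = D(face, orig_map)
    d_fake = D(fake.detach(), tgt_map)   # detach: do not update G in this step
    loss = F.binary_cross_entropy(d_real, torch.ones_like(d_real)) + \
           F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))
    opt_D.zero_grad()
    loss.backward()
    opt_D.step()
    return fake, loss
```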
Further, in an embodiment of the present invention, taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error further includes: feeding the generator output together with the target illumination estimation map and the original illumination estimation map through the generator network to obtain an output picture; and taking the per-pixel difference between the output picture and the corresponding real face picture as the inverse-relighting regularization term error.
It can be understood that the method for calculating the inverse-relighting regularization term error in the embodiment of the invention comprises the following steps:
(1) taking the generator output together with the target illumination estimation map and the original illumination estimation map as the input of the generator network, obtaining an output picture;
(2) taking the per-pixel difference between the obtained output picture and the corresponding real face picture as the error.
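A sketch of this inverse-relighting regularization term under the same assumptions (the mean absolute per-pixel difference stands in for the unspecified norm):

```python
def inverse_relighting_reg(G, face, orig_map, tgt_map):
    """Relight the generator output back from the target to the original
    illumination and compare it per pixel with the real face picture."""
    relit = G(face, orig_map, tgt_map)   # forward pass: original -> target light
    back = G(relit, tgt_map, orig_map)   # inverse pass: target -> original light
    return (back - face).abs().mean()    # per-pixel difference as the error
```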
Optionally, in an embodiment of the present invention, taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error further includes: obtaining the face recognition network features of the generator output and of the real face picture; and taking the difference between the generator output features and the real face picture features as the face recognition network feature regularization term error.
Specifically, the method for calculating the face recognition network feature regularization term error in the embodiment of the invention comprises the following steps:
(1) calculating the face recognition network features of the generator output and of the real face picture;
(2) taking the difference between the two sets of face recognition network features as the error.
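A sketch of this feature regularization term; `feature_net` stands in for the recognition feature extractor (the patent uses the fc6 layer of a VGG face recognition network), and the mean absolute difference again stands in for the unspecified norm:

```python
def recognition_feature_reg(G, feature_net, face, orig_map, tgt_map):
    """Difference between the recognition features of the generator output
    and of the real face picture."""
    relit = G(face, orig_map, tgt_map)
    return (feature_net(relit) - feature_net(face)).abs().mean()
```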
For example, as shown in fig. 2, the embodiment of the present invention can perform face relighting network training to generate the face relighting network. The embodiment of the invention designs the face relighting network as the generator network $G$ of a generative adversarial network whose adversarial (discriminator) network is $D$. The inputs are a real face picture $I_i$, an original illumination estimation picture $S_i$ and a target illumination estimation picture $S_t$; the output is the relit picture $I_o = G(I_i, S_i, S_t)$. When face relighting network training is carried out, the adopted error is:

$$L = \mathbb{E}\big[\log D(I_i, S_i)\big] + \mathbb{E}\big[\log\big(1 - D(G(I_i, S_i, S_t), S_t)\big)\big] + Reg_1^G + Reg_2^G$$

The first two terms of the error are the errors of the adversarial network output, and the last two terms are the regularization terms. The embodiment of the invention adopts a face inverse-relighting regularization term $Reg_1^G$ and a face recognition network feature regularization term $Reg_2^G$, respectively:

$$Reg_1^G = \big\| G\big(G(I_i, S_i, S_t),\, S_t,\, S_i\big) - I_i \big\|$$

$$Reg_2^G = \big\| F\big(G(I_i, S_i, S_t)\big) - F(I_i) \big\|$$

In the face recognition network feature regularization term, the embodiment of the invention adopts the fc6 layer of a VGG (Visual Geometry Group) face recognition network as the feature extractor $F$, and performs the following iterative optimization of the error $L$ to obtain the face relighting network $G^*$:

$$G^* = \arg\min_G \max_D L$$
in step S104, the face image is processed by using the camera parameter matrix, the face three-dimensional model, and the illumination model, and the output result is processed by using the face relighting network, so as to obtain a face relighting result.
It can be understood that, in the embodiment of the present invention, the method in step S101 is adopted to process the face picture, and the output result is processed by using the face relighting network, so that a face relighting result can be obtained.
For example, after the face relighting network is obtained through training, as shown in fig. 3, a camera parameter matrix, a face three-dimensional model and an illumination model may be used to process a face picture, so as to obtain an original illumination distribution map and a target illumination distribution map, and the three pictures are used as input of the face relighting network, so as to obtain a relighting result of the face.
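A sketch of this inference pipeline, reusing `render_illumination_map` from the earlier sketch; the fitting and cropping callables are hypothetical placeholders, and the numpy-to-tensor conversion is omitted:

```python
def relight(photo, target_sh, G, fit_face_model, crop_face_region):
    """Full pipeline of fig. 3: fit the 3D face model, render the original and
    target illumination maps, and feed the three pictures to the network."""
    fit = fit_face_model(photo)                                   # step S101 processing
    crop = crop_face_region(photo, fit)
    original_map = render_illumination_map(fit.normals, fit.sh_coeffs)
    target_map = render_illumination_map(fit.normals, target_sh)
    return G(crop, original_map, target_map)                      # relit face picture
```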
To sum up, the embodiment of the present invention aims to solve the problem of face relighting in image post-processing. A face picture is input, the face region is relighted, and the illumination distribution on the face is estimated by a three-dimensional face model reconstruction method to obtain the training data of the face relighting network. The face relighting network is trained by a generative adversarial training method; the trained network then takes as input the real face picture together with the original and target illumination estimation maps obtained by three-dimensional face reconstruction, and outputs the relit face picture.
According to the face relighting method based on deep learning provided by the embodiment of the invention, a three-dimensional face model can be fitted to the face in a picture, an original illumination estimation map and a target illumination estimation map representing the illumination distribution on the face can be computed, a face relighting network can be constructed by a deep learning method, and the network can be trained with a face inverse-relighting regularization term and a face recognition network feature regularization term. For different face pictures, only the three-dimensional face reconstruction step of fitting the face model to the picture is needed before the face relighting network relights the face, so that the realism and reliability of face relighting can be effectively improved.
Next, a face relighting apparatus based on deep learning proposed according to an embodiment of the present invention will be described with reference to the drawings.
Fig. 4 is a schematic structural diagram of a face relighting device based on deep learning according to an embodiment of the present invention.
As shown in fig. 4, the deep learning based face relighting apparatus 10 includes: an acquisition module 100, an obtaining module 200, a construction module 300 and a processing module 400.
The acquisition module 100 is configured to determine, using the camera parameter matrix, the three-dimensional face model and the illumination model, the correspondence between the vertices of the three-dimensional face model and the positions and colors of the two-dimensional pixels of a face picture by an iterative optimization method, and to obtain the face region map, the normal map and the map representing the illumination distribution corresponding to the face picture. The obtaining module 200 is configured to process the photo set with the camera parameter matrix, the three-dimensional face model and the illumination model to obtain the network training data set. The construction module 300 is configured to construct a face relighting network by a deep learning method, wherein the face relighting network is generated by training on the network training data set. The processing module 400 is configured to process a face picture with the camera parameter matrix, the three-dimensional face model and the illumination model, and to process the output with the face relighting network to obtain a face relighting result. The device 10 of the embodiment of the invention constructs the face relighting network by a deep learning method, trains it with the face inverse-relighting regularization term and the face recognition network feature regularization term, and relights the face with the trained network, thereby effectively improving the realism and reliability of face relighting.
Further, in an embodiment of the present invention, the acquisition module 100 is further configured to obtain the positions of feature points in the face picture by a face feature point detection method, to render a face picture using the camera parameter matrix and the illumination model, and to iteratively optimize both the difference between the pixel values of the rendered face picture and the real face picture, and the distance on the two-dimensional picture between the feature points and the projections of the corresponding feature points of the three-dimensional face model.
Further, in an embodiment of the present invention, the construction module 300 is further configured to obtain an original illumination estimation map and a target illumination estimation map using the illumination model and the normal maps corresponding to the real face pictures in the training set, and to construct a generative adversarial network, wherein the generator network of the generative adversarial network is the face relighting network; the real face pictures of the network training data set together with the corresponding original and target illumination estimation maps are taken as the input of the generator network, the generator output together with the target illumination estimation map is taken as the false input of the adversarial network, and the real face pictures of the network training data set together with the original illumination estimation maps are taken as the real input of the adversarial network; the construction module 300 trains the generative adversarial network taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error.
Further, in an embodiment of the present invention, the construction module 300 is further configured to feed the generator output together with the target illumination estimation map and the original illumination estimation map through the generator network to obtain an output picture, and to take the per-pixel difference between the output picture and the corresponding real face picture as the inverse-relighting regularization term error.
Further, in an embodiment of the present invention, the construction module 300 is further configured to obtain the face recognition network features of the generator output and of the real face picture, and to take the difference between the generator output features and the real face picture features as the face recognition network feature regularization term error.
It should be noted that the explanation of the embodiment of the deep learning face relighting method is also applicable to the deep learning face relighting device of the embodiment, and details are not repeated here.
According to the deep learning face relighting device provided by the embodiment of the invention, a three-dimensional face model can be fitted to the face in a picture, an original illumination estimation map and a target illumination estimation map representing the illumination distribution on the face can be computed, a face relighting network can be constructed by a deep learning method, and the network can be trained with a face inverse-relighting regularization term and a face recognition network feature regularization term. For different face pictures, only the three-dimensional face reconstruction step of fitting the face model to the picture is needed before the face relighting network relights the face, so that the realism and reliability of face relighting can be effectively improved.
In the description of the present invention, it is to be understood that the terms "central," "longitudinal," "lateral," "length," "width," "thickness," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," "clockwise," "counterclockwise," "axial," "radial," "circumferential," and the like are used in the orientations and positional relationships indicated in the drawings for convenience in describing the invention and to simplify the description, and are not intended to indicate or imply that the referenced devices or elements must have a particular orientation, be constructed and operated in a particular orientation, and are therefore not to be considered limiting of the invention.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
In the present invention, unless otherwise expressly stated or limited, the terms "mounted," "connected," "secured," and the like are to be construed broadly and can, for example, be fixedly connected, detachably connected, or integrally formed; can be mechanically or electrically connected; they may be directly connected or indirectly connected through intervening media, or they may be connected internally or in any other suitable relationship, unless expressly stated otherwise. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.
In the present invention, unless otherwise expressly stated or limited, the first feature "on" or "under" the second feature may be directly contacting the first and second features or indirectly contacting the first and second features through an intermediate. Also, a first feature "on," "over," and "above" a second feature may be directly or diagonally above the second feature, or may simply indicate that the first feature is at a higher level than the second feature. A first feature being "under," "below," and "beneath" a second feature may be directly under or obliquely under the first feature, or may simply mean that the first feature is at a lesser elevation than the second feature.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (8)

1. A face relighting method based on deep learning is characterized by comprising the following steps:
determining the correspondence between the vertices of a three-dimensional face model and the positions and colors of the two-dimensional pixels of a face picture by an iterative optimization method, using a camera parameter matrix, the three-dimensional face model and an illumination model, and obtaining a face region map, a normal map and a map representing the illumination distribution corresponding to the face picture;
processing a photo set with the camera parameter matrix, the three-dimensional face model and the illumination model to obtain a network training data set;
constructing a face relighting network by a deep learning method, wherein the face relighting network is generated by training on the network training data set, and the constructing further comprises: obtaining an original illumination estimation map and a target illumination estimation map using the illumination model and the normal maps corresponding to the real face pictures in the training set; constructing a generative adversarial network, wherein the generator network of the generative adversarial network is the face relighting network; taking the real face pictures of the network training data set together with the corresponding original and target illumination estimation maps as the input of the generator network, taking the generator output together with the target illumination estimation map as the false input of the adversarial network, and taking the real face pictures of the network training data set together with the original illumination estimation maps as the real input of the adversarial network; and training the generative adversarial network, taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error; and
processing a face picture with the camera parameter matrix, the three-dimensional face model and the illumination model, and processing the output with the face relighting network to obtain a face relighting result.
2. The deep learning-based face relighting method according to claim 1, wherein the determining the correspondence between the vertices of the three-dimensional model of the face and the positions and colors of the two-dimensional pixels of the face picture by an iterative optimization method further comprises:
obtaining the positions of feature points in the face picture by a face feature point detection method;
rendering a face picture using the camera parameter matrix and the illumination model;
and iteratively optimizing both the difference between the pixel values of the rendered face picture and the real face picture, and the distance on the two-dimensional picture between the feature points and the projections of the corresponding feature points of the three-dimensional face model.
3. The deep learning-based face relighting method according to claim 1, wherein taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error further comprises:
feeding the generator output together with the target illumination estimation map and the original illumination estimation map through the generator network to obtain an output picture;
and taking the per-pixel difference between the output picture and the corresponding real face picture as the inverse-relighting regularization term error.
4. The deep learning-based face relighting method according to claim 1 or 3, wherein taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error further comprises:
obtaining the face recognition network features of the generator output and of the real face picture;
and taking the difference between the generator output features and the real face picture features as the face recognition network feature regularization term error.
5. A face relighting device based on deep learning is characterized by comprising:
an acquisition module, configured to determine, using a camera parameter matrix, a three-dimensional face model and an illumination model, the correspondence between the vertices of the three-dimensional face model and the positions and colors of the two-dimensional pixels of a face picture by an iterative optimization method, and to obtain a face region map, a normal map and a map representing the illumination distribution corresponding to the face picture;
an obtaining module, configured to process a photo set with the camera parameter matrix, the three-dimensional face model and the illumination model to obtain a network training data set;
a construction module, configured to construct a face relighting network by a deep learning method, wherein the face relighting network is generated by training on the network training data set; the construction module is further configured to obtain an original illumination estimation map and a target illumination estimation map using the illumination model and the normal maps corresponding to the real face pictures in the training set, and to construct a generative adversarial network, wherein the generator network of the generative adversarial network is the face relighting network; the real face pictures of the network training data set together with the corresponding original and target illumination estimation maps are taken as the input of the generator network, the generator output together with the target illumination estimation map is taken as the false input of the adversarial network, and the real face pictures of the network training data set together with the original illumination estimation maps are taken as the real input of the adversarial network; the generative adversarial network is trained taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error; and
a processing module, configured to process a face picture with the camera parameter matrix, the three-dimensional face model and the illumination model, and to process the output with the face relighting network to obtain a face relighting result.
6. The deep learning-based face relighting device according to claim 5, wherein the acquisition module is further configured to obtain the positions of feature points in the face picture by a face feature point detection method, to render a face picture using the camera parameter matrix and the illumination model, and to iteratively optimize both the difference between the pixel values of the rendered face picture and the real face picture, and the distance on the two-dimensional picture between the feature points and the projections of the corresponding feature points of the three-dimensional face model.
7. The deep learning-based face relighting device according to claim 5, wherein the construction module is further configured to feed the generator output together with the target illumination estimation map and the original illumination estimation map through the generator network to obtain an output picture, and to take the per-pixel difference between the output picture and the corresponding real face picture as the inverse-relighting regularization term error.
8. The deep learning-based face relighting device according to claim 5 or 7, wherein the construction module is further configured to obtain the face recognition network features of the generator output and of the real face picture, and to take the difference between the generator output features and the real face picture features as the face recognition network feature regularization term error.
CN201711078324.7A 2017-11-06 2017-11-06 Face relighting method and device based on deep learning Active CN107909640B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711078324.7A CN107909640B (en) 2017-11-06 2017-11-06 Face relighting method and device based on deep learning


Publications (2)

Publication Number Publication Date
CN107909640A CN107909640A (en) 2018-04-13
CN107909640B (en) 2020-07-28

Family

ID=61842581

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711078324.7A Active CN107909640B (en) 2017-11-06 2017-11-06 Face relighting method and device based on deep learning

Country Status (1)

Country Link
CN (1) CN107909640B (en)



Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101872491A (en) * 2010-05-21 2010-10-27 清华大学 Free view angle relighting method and system based on photometric stereo
WO2015066628A1 (en) * 2013-11-04 2015-05-07 Facebook, Inc. Systems and methods for facial representation
CN105447906A (en) * 2015-11-12 2016-03-30 浙江大学 Method for calculating lighting parameters and carrying out relighting rendering based on image and model
CN106570928A (en) * 2016-11-14 2017-04-19 河海大学 Image-based re-lighting method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Relighting algorithm for improving face recognition rate; Luo Guangming; Computer and Modernization; 2015-06-30 (No. 238); pp. 51-54 *

Also Published As

Publication number Publication date
CN107909640A (en) 2018-04-13


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant