CN107909640B - Face relighting method and device based on deep learning - Google Patents


Info

Publication number
CN107909640B
Authority
CN
China
Prior art keywords
face, network, relighting, picture, output
Prior art date
2017-11-06
Legal status
Active
Application number
CN201711078324.7A
Other languages
Chinese (zh)
Other versions
CN107909640A (en)
Inventor
徐枫 (Xu Feng)
王至博 (Wang Zhibo)
Current Assignee
Tsinghua University
Original Assignee
Tsinghua University
Priority date
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201711078324.7A
Publication of CN107909640A
Application granted
Publication of CN107909640B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/10 Geometric effects
    • G06T15/20 Perspective computation
    • G06T15/205 Image-based rendering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Geometry (AREA)
  • Computer Graphics (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a face relighting method and device based on deep learning. The method comprises the following steps: obtaining a face region map, a normal map and a map representing the illumination distribution corresponding to a face picture; processing a photo set with the camera parameter matrix, the three-dimensional face model and the illumination model to obtain a network training data set; constructing a face relighting network by a deep learning method and training it on the network training data set; and processing a face picture with the camera parameter matrix, the three-dimensional face model and the illumination model, then processing the output with the face relighting network to obtain the face relighting result. The method constructs the face relighting network by deep learning, trains it with a face inverse-relighting regularization term and a face recognition network feature regularization term, and relights faces with the trained network, thereby effectively improving the realism and reliability of face relighting.

Description

Face relighting method and device based on deep learning
Technical Field
The invention relates to the technical field of computer vision, computer graphics and deep learning, in particular to a face relighting method and device based on deep learning.
Background
When a photograph is taken, both the content and the shooting conditions (such as time, viewing angle and illumination) are recorded in the photo. Some image processing technologies can change the shooting conditions by technical means to obtain a new photo directly, without retaking it. Relighting is a method for obtaining a corresponding new photo by changing the illumination under which the photo was shot; face relighting targets face photos in particular, and is very important in photo post-processing and film post-production.
In the related art, face relighting usually builds a face illumination model to relight the face. However, the interaction between the face and illumination is very complex, and a hand-crafted model often struggles to express the true appearance of the face under every illumination, so reliability is poor. This problem needs to be solved.
Disclosure of Invention
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.
Therefore, one objective of the present invention is to provide a face relighting method based on deep learning, which can effectively improve the realism and reliability of face relighting.
Another objective of the present invention is to provide a face relighting device based on deep learning.
In order to achieve the above object, an embodiment of the present invention provides a face relighting method based on deep learning, including the following steps: determining the correspondence between the vertices of a three-dimensional face model and the positions and colors of the two-dimensional pixels of a face picture by an iterative optimization method, using a camera parameter matrix, the three-dimensional face model and an illumination model, and obtaining a face region map, a normal map and a map representing the illumination distribution corresponding to the face picture; processing a photo set with the camera parameter matrix, the three-dimensional face model and the illumination model to obtain a network training data set; constructing a face relighting network by a deep learning method, wherein the face relighting network is generated by training on the network training data set; and processing a face picture with the camera parameter matrix, the three-dimensional face model and the illumination model, and processing the output with the face relighting network to obtain a face relighting result.
The face relighting method based on deep learning of the embodiment of the invention can fit a three-dimensional face model to the face in a picture, compute an original illumination estimation map and a target illumination estimation map representing the illumination distribution on the face, construct a face relighting network by a deep learning method, and train the face relighting network with a face inverse-relighting regularization term and a face recognition network feature regularization term. For a new face picture, only the three-dimensional face reconstruction step of fitting the face model to the picture is needed before the face relighting network relights the face, which effectively improves the realism and reliability of face relighting.
In addition, the face relighting method based on deep learning according to the above embodiment of the present invention may further have the following additional technical features:
Further, in an embodiment of the present invention, determining the correspondence between the vertices of the three-dimensional face model and the positions and colors of the two-dimensional pixels of the face picture further includes: obtaining the positions of feature points in the face picture by a face feature point detection method; rendering a face picture using the camera parameter matrix and the illumination model; and iteratively optimizing both the difference between the pixel values of the rendered face picture and the real face picture, and the distance on the two-dimensional picture between the feature points and the projections of the corresponding feature points of the three-dimensional face model.
Further, in an embodiment of the present invention, training on the network training data set to generate the face relighting network further includes: obtaining an original illumination estimation map and a target illumination estimation map using the illumination model and the normal maps corresponding to the real face pictures in the training set; constructing a generative adversarial network, wherein the generator network of the generative adversarial network is the face relighting network; taking the real face pictures of the network training data set together with the corresponding original and target illumination estimation maps as the input of the generator network, taking the generator output together with the target illumination estimation map as the false input of the adversarial network, and taking the real face pictures of the network training data set together with the original illumination estimation maps as the real input of the adversarial network; and training the generative adversarial network, taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error.
Further, in an embodiment of the present invention, taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error further includes: feeding the generator output together with the target illumination estimation map and the original illumination estimation map through the generator network to obtain an output picture; and taking the per-pixel difference between the output picture and the corresponding real face picture as the inverse-relighting regularization term error.
Further, in an embodiment of the present invention, taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error further includes: obtaining the face recognition network features of the generator output and of the real face picture; and taking the difference between the generator output features and the real face picture features as the face recognition network feature regularization term error.
In order to achieve the above object, another embodiment of the present invention provides a face relighting device based on deep learning, including: an acquisition module, configured to determine, using a camera parameter matrix, a three-dimensional face model and an illumination model, the correspondence between the vertices of the three-dimensional face model and the positions and colors of the two-dimensional pixels of a face picture by an iterative optimization method, and to obtain a face region map, a normal map and a map representing the illumination distribution corresponding to the face picture; an obtaining module, configured to process a photo set with the camera parameter matrix, the three-dimensional face model and the illumination model to obtain a network training data set; a construction module, configured to construct a face relighting network by a deep learning method, wherein the face relighting network is generated by training on the network training data set; and a processing module, configured to process a face picture with the camera parameter matrix, the three-dimensional face model and the illumination model, and to process the output with the face relighting network to obtain a face relighting result.
The deep learning face relighting device of the embodiment of the invention can fit a three-dimensional face model to the face in a picture, compute an original illumination estimation map and a target illumination estimation map representing the illumination distribution on the face, construct a face relighting network by a deep learning method, and train the face relighting network with a face inverse-relighting regularization term and a face recognition network feature regularization term. For a new face picture, only the three-dimensional face reconstruction step of fitting the face model to the picture is needed before the face relighting network relights the face, which effectively improves the realism and reliability of face relighting.
In addition, the deep learning face relighting device according to the above embodiment of the present invention may also have the following additional technical features:
Further, in an embodiment of the present invention, the acquisition module is further configured to obtain the positions of feature points in the face picture by a face feature point detection method, to render a face picture using the camera parameter matrix and the illumination model, and to iteratively optimize both the difference between the pixel values of the rendered face picture and the real face picture, and the distance on the two-dimensional picture between the feature points and the projections of the corresponding feature points of the three-dimensional face model.
Further, in an embodiment of the present invention, the construction module is further configured to obtain an original illumination estimation map and a target illumination estimation map using the illumination model and the normal maps corresponding to the real face pictures in the training set, and to construct a generative adversarial network, wherein the generator network of the generative adversarial network is the face relighting network; the real face pictures of the network training data set together with the original and target illumination estimation maps are taken as the input of the generator network, the generator output together with the target illumination estimation map is taken as the false input of the adversarial network, and the real face pictures of the network training data set together with the original illumination estimation maps are taken as the real input of the adversarial network; the construction module trains the generative adversarial network taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error.
Further, in an embodiment of the present invention, the construction module is further configured to feed the generator output together with the target illumination estimation map and the original illumination estimation map through the generator network to obtain an output picture, and to take the per-pixel difference between the output picture and the corresponding real face picture as the inverse-relighting regularization term error.
Further, in an embodiment of the present invention, the construction module is further configured to obtain the face recognition network features of the generator output and of the real face picture, and to take the difference between the generator output features and the real face picture features as the face recognition network feature regularization term error.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a flow chart of a face relighting method based on deep learning according to an embodiment of the present invention;
FIG. 2 is a flowchart of a face relighting method based on deep learning according to another embodiment of the present invention;
FIG. 3 is a functional diagram of a method for using a face relighting network according to another embodiment of the present invention;
fig. 4 is a schematic structural diagram of a face relighting apparatus based on deep learning according to another embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.
The following describes a face relighting method and apparatus based on deep learning according to an embodiment of the present invention with reference to the accompanying drawings, and first, a face relighting method based on deep learning according to an embodiment of the present invention will be described with reference to the accompanying drawings.
Fig. 1 is a flowchart of a face relighting method based on deep learning according to an embodiment of the present invention.
As shown in fig. 1, the face relighting method based on deep learning includes the following steps:
in step S101, a camera parameter matrix, a three-dimensional face model and an illumination model are used, and a correspondence between a vertex of the three-dimensional face model and a position and a color of a two-dimensional pixel of a face picture is determined by an iterative optimization method, so as to obtain a face region map, a normal map and a representative illumination distribution map corresponding to the face picture.
That is to say, in the embodiment of the present invention, the camera parameter matrix, the three-dimensional face model and the illumination model may be used with an iterative optimization method to determine the correspondence between the vertices of the three-dimensional face model and the positions and colors of the two-dimensional pixels of the face picture, and the face region map, the normal map and the map representing the illumination distribution corresponding to the face picture are obtained by calculation.
Further, in an embodiment of the present invention, determining the correspondence between the vertices of the three-dimensional face model and the positions and colors of the two-dimensional pixels of the face picture by an iterative optimization method further includes: obtaining the positions of feature points in the face picture by a face feature point detection method; rendering a face picture using the camera parameter matrix and the illumination model; and iteratively optimizing both the difference between the pixel values of the rendered face picture and the real face picture, and the distance on the two-dimensional picture between the feature points and the projections of the corresponding feature points of the three-dimensional face model.
It can be understood that the method for determining the correspondence between the vertex of the three-dimensional model of the human face and the two-dimensional pixel in the embodiment of the invention comprises the following steps:
(1) obtaining the positions of feature points in the face picture by using a face feature point detection method;
(2) rendering a face picture using the camera parameter matrix and the spherical harmonic illumination model;
(3) iteratively optimizing both the difference between the pixel values of the rendered face picture and the real face picture, and the distance on the two-dimensional picture between the feature points and the projections of the corresponding feature points of the three-dimensional model.
For example, as shown in FIG. 2, an embodiment of the present invention collects a face data set. For a face picture, a three-dimensional face model is fitted to the face in the picture: a vertex $v$ of the three-dimensional face model is rotated by the rotation transformation $R$, projected through the camera parameter matrix $P$, and translated by $t$ to the position $p$ of its corresponding pixel in the face picture:

$$p = P(Rv) + t$$
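As a minimal sketch of this projection step, the formula above can be applied to every model vertex at once; the 2×3 camera matrix, the image-space translation and all names here are illustrative assumptions rather than the patent's notation:

```python
import numpy as np

def project_vertices(vertices, R, t, P):
    """Project 3D face-model vertices to 2D pixel positions via p = P(Rv) + t.

    vertices: (N, 3) model vertices; R: (3, 3) rotation;
    P: (2, 3) camera parameter matrix; t: (2,) image-space translation.
    """
    rotated = vertices @ R.T     # apply the rotation transformation R
    return rotated @ P.T + t     # project with P and add the translation t
```

During fitting, $R$, $t$, $P$ and the model coefficients are the unknowns that the iterative optimization adjusts until the projected feature points and rendered pixels match the photo.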
The embodiment of the invention can obtain the positions of the feature points in the face picture by a face feature point detection method, and can obtain the coefficient vector of the three-dimensional face model and the parameter vector $l$ of the spherical harmonic illumination model by iteratively optimizing over the positions of the three-dimensional model feature points and the picture feature points, and over the vertices and the colors of their corresponding pixels. The embodiment of the invention can use the spherical harmonic illumination model to calculate the illumination intensity at each pixel:

$$L(n) = \sum_{j=1}^{9} l_j\, b_j(n)$$

wherein each $b_j$ relates to the normal direction $n = (n_x, n_y, n_z)$ of the corresponding vertex on the face model, using the standard second-order spherical harmonic basis:

$$b_1 = 1, \quad b_2 = n_x, \quad b_3 = n_y, \quad b_4 = n_z,$$

$$b_5 = n_x n_y, \quad b_6 = n_x n_z, \quad b_7 = n_y n_z, \quad b_8 = n_x^2 - n_y^2, \quad b_9 = 3 n_z^2 - 1$$
The embodiment of the invention can obtain a normal map of the face from the three-dimensional face model, render the original illumination distribution map of the face using the normal map and the illumination intensity formula above, randomly generate a group of spherical harmonic illumination coefficients and render the target illumination estimation map from them, and project the three-dimensional face model onto the picture to take out the area it covers as the face region picture.
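A sketch of this rendering step, assuming the nine-term spherical harmonic basis written above and a per-pixel normal map; the function names are illustrative:

```python
import numpy as np

def sh_basis(normals):
    """Second-order spherical harmonic basis b_1..b_9 per pixel.

    normals: (H, W, 3) unit normal map; returns (H, W, 9).
    """
    nx, ny, nz = normals[..., 0], normals[..., 1], normals[..., 2]
    return np.stack([np.ones_like(nx), nx, ny, nz,
                     nx * ny, nx * nz, ny * nz,
                     nx**2 - ny**2, 3 * nz**2 - 1], axis=-1)

def render_illumination_map(normals, sh_coeffs):
    """Per-pixel illumination L(n) = sum_j l_j b_j(n); sh_coeffs: (9,)."""
    return sh_basis(normals) @ sh_coeffs  # (H, W) illumination map

# The original map uses the fitted coefficients; the target map uses random ones:
# original_map = render_illumination_map(normals, fitted_sh)
# target_map = render_illumination_map(normals, np.random.uniform(-1, 1, 9))
```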
In step S102, the photo set is processed by using the camera parameter matrix, the three-dimensional face model, and the illumination model, so as to obtain a network training data set.
For example, as shown in fig. 2, in the embodiment of the present invention, each picture in the face picture data set may be processed by using the method in step S101, so that a training database for face relighting network training may be obtained.
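A sketch of the resulting dataset-construction loop; `fit_face_model`, `crop_face_region` and `random_sh_coeffs` are hypothetical placeholders for the step S101 processing, and `render_illumination_map` is the helper from the previous sketch:

```python
def build_training_set(photo_set, fit_face_model, crop_face_region, random_sh_coeffs):
    """Turn each photo into a (face crop, original map, target map) training triple."""
    dataset = []
    for photo in photo_set:
        fit = fit_face_model(photo)          # 3D model, normals, fitted SH coefficients
        crop = crop_face_region(photo, fit)  # face region covered by the projected model
        original_map = render_illumination_map(fit.normals, fit.sh_coeffs)
        target_map = render_illumination_map(fit.normals, random_sh_coeffs())
        dataset.append((crop, original_map, target_map))
    return dataset
```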
In step S103, a face relighting network is constructed by a deep learning method, wherein a network training data set is used for training to generate the face relighting network.
That is to say, the embodiment of the present invention may construct the face relighting network by a deep learning method, and perform training by using the training data set, so that the face relighting network may generate a face relighting result.
For example, the face relighting network constructed in the embodiment of the invention is characterized in that it takes three pictures as input and outputs one picture.
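A minimal generator sketch consistent with this description, taking the three pictures (face crop plus two single-channel illumination maps) concatenated channel-wise and producing one picture; the patent does not fix the internal architecture, so the small encoder-decoder below is only an assumption:

```python
import torch
import torch.nn as nn

class RelightingGenerator(nn.Module):
    """Generator: face crop + original and target illumination maps -> relit face."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + 1 + 1, 64, 4, stride=2, padding=1),   # RGB face + two maps
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1),  # one output picture
            nn.Sigmoid(),
        )

    def forward(self, face, original_map, target_map):
        # face: (B, 3, H, W); each illumination map: (B, 1, H, W)
        x = torch.cat([face, original_map, target_map], dim=1)
        return self.net(x)
```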
Further, in an embodiment of the present invention, training on the network training data set to generate the face relighting network further includes: obtaining an original illumination estimation map and a target illumination estimation map using the illumination model and the normal maps corresponding to the real face pictures in the training set; constructing a generative adversarial network, wherein the generator network is the face relighting network; taking the real face pictures of the network training data set together with the corresponding original and target illumination estimation maps as the input of the generator network, taking the generator output together with the target illumination estimation map as the false input of the adversarial network, and taking the real face pictures of the network training data set together with the original illumination estimation maps as the real input of the adversarial network; and training the generative adversarial network, taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error.
It can be understood that the method for training the face relighting network in the embodiment of the invention can comprise the following steps (a code sketch follows the list):
(1) obtaining an original illumination estimation map and a target illumination estimation map using the illumination model and the normal maps corresponding to the real face pictures in the training set;
(2) constructing a generative adversarial network, wherein the generator network is the face relighting network;
(3) taking the real face pictures in the training set together with the corresponding original and target illumination estimation maps as the input of the generator network, taking the generator output together with the target illumination estimation map as the false input of the adversarial network, and taking the real face pictures and original illumination estimation maps in the training set as the real input of the adversarial network;
(4) training the generative adversarial network, taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error.
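A hedged sketch of the discriminator side of step (3), assuming `D` scores (picture, illumination map) pairs with probabilities in [0, 1] and binary cross-entropy as the adversarial error; the generator update would add the two regularization terms sketched after their descriptions below:

```python
import torch
import torch.nn.functional as F

def discriminator_step(G, D, face, orig_map, tgt_map, opt_D):
    """One update of the adversarial (discriminator) network.

    (real face, original map) is the real input; (generator output, target map)
    is the false input. D is assumed to output probabilities in [0, 1].
    """
    fake = G(face, orig_map, tgt_map)
    d_real = D(face, orig_map)
    d_fake = D(fake.detach(), tgt_map)   # detach: do not update G in this step
    loss = F.binary_cross_entropy(d_real, torch.ones_like(d_real)) + \
           F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))
    opt_D.zero_grad()
    loss.backward()
    opt_D.step()
    return fake, loss
```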
Further, in an embodiment of the present invention, taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error further includes: feeding the generator output together with the target illumination estimation map and the original illumination estimation map through the generator network to obtain an output picture; and taking the per-pixel difference between the output picture and the corresponding real face picture as the inverse-relighting regularization term error.
It can be understood that the method for calculating the inverse-relighting regularization term error in the embodiment of the invention comprises the following steps:
(1) taking the generator output together with the target illumination estimation map and the original illumination estimation map as the input of the generator network, obtaining an output picture;
(2) taking the per-pixel difference between the obtained output picture and the corresponding real face picture as the error.
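A sketch of this inverse-relighting regularization term under the same assumptions (the mean absolute per-pixel difference stands in for the unspecified norm):

```python
def inverse_relighting_reg(G, face, orig_map, tgt_map):
    """Relight the generator output back from the target to the original
    illumination and compare it per pixel with the real face picture."""
    relit = G(face, orig_map, tgt_map)   # forward pass: original -> target light
    back = G(relit, tgt_map, orig_map)   # inverse pass: target -> original light
    return (back - face).abs().mean()    # per-pixel difference as the error
```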
Optionally, in an embodiment of the present invention, taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error further includes: obtaining the face recognition network features of the generator output and of the real face picture; and taking the difference between the generator output features and the real face picture features as the face recognition network feature regularization term error.
Specifically, the method for calculating the face recognition network feature regularization term error in the embodiment of the invention comprises the following steps:
(1) calculating the face recognition network features of the generator output and of the real face picture;
(2) taking the difference between the two sets of face recognition network features as the error.
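A sketch of this feature regularization term; `feature_net` stands in for the recognition feature extractor (the patent uses the fc6 layer of a VGG face recognition network), and the mean absolute difference again stands in for the unspecified norm:

```python
def recognition_feature_reg(G, feature_net, face, orig_map, tgt_map):
    """Difference between the recognition features of the generator output
    and of the real face picture."""
    relit = G(face, orig_map, tgt_map)
    return (feature_net(relit) - feature_net(face)).abs().mean()
```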
For example, as shown in fig. 2, the embodiment of the present invention can perform face relighting network training to generate the face relighting network. The embodiment of the invention designs the face relighting network as the generator network $G$ of a generative adversarial network whose adversarial (discriminator) network is $D$. The inputs are a real face picture $I_i$, an original illumination estimation picture $S_i$ and a target illumination estimation picture $S_t$; the output is the relit picture $I_o = G(I_i, S_i, S_t)$. When face relighting network training is carried out, the adopted error is:

$$L = \mathbb{E}\big[\log D(I_i, S_i)\big] + \mathbb{E}\big[\log\big(1 - D(G(I_i, S_i, S_t), S_t)\big)\big] + Reg_1^G + Reg_2^G$$

The first two terms of the error are the errors of the adversarial network output, and the last two terms are the regularization terms. The embodiment of the invention adopts a face inverse-relighting regularization term $Reg_1^G$ and a face recognition network feature regularization term $Reg_2^G$, respectively:

$$Reg_1^G = \big\| G\big(G(I_i, S_i, S_t),\, S_t,\, S_i\big) - I_i \big\|$$

$$Reg_2^G = \big\| F\big(G(I_i, S_i, S_t)\big) - F(I_i) \big\|$$

In the face recognition network feature regularization term, the embodiment of the invention adopts the fc6 layer of a VGG (Visual Geometry Group) face recognition network as the feature extractor $F$, and performs the following iterative optimization of the error $L$ to obtain the face relighting network $G^*$:

$$G^* = \arg\min_G \max_D L$$
in step S104, the face image is processed by using the camera parameter matrix, the face three-dimensional model, and the illumination model, and the output result is processed by using the face relighting network, so as to obtain a face relighting result.
It can be understood that, in the embodiment of the present invention, the method in step S101 is adopted to process the face picture, and the output result is processed by using the face relighting network, so that a face relighting result can be obtained.
For example, after the face relighting network is obtained through training, as shown in fig. 3, a camera parameter matrix, a face three-dimensional model and an illumination model may be used to process a face picture, so as to obtain an original illumination distribution map and a target illumination distribution map, and the three pictures are used as input of the face relighting network, so as to obtain a relighting result of the face.
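A sketch of this inference pipeline, reusing `render_illumination_map` from the earlier sketch; the fitting and cropping callables are hypothetical placeholders, and the numpy-to-tensor conversion is omitted:

```python
def relight(photo, target_sh, G, fit_face_model, crop_face_region):
    """Full pipeline of fig. 3: fit the 3D face model, render the original and
    target illumination maps, and feed the three pictures to the network."""
    fit = fit_face_model(photo)                                   # step S101 processing
    crop = crop_face_region(photo, fit)
    original_map = render_illumination_map(fit.normals, fit.sh_coeffs)
    target_map = render_illumination_map(fit.normals, target_sh)
    return G(crop, original_map, target_map)                      # relit face picture
```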
To sum up, the embodiment of the present invention aims to solve the problem of face relighting in image post-processing. A face picture is input, the face region is relighted, and the illumination distribution on the face is estimated by a three-dimensional face model reconstruction method to obtain the training data of the face relighting network. The face relighting network is trained by a generative adversarial training method; the trained network then takes as input the real face picture together with the original and target illumination estimation maps obtained by three-dimensional face reconstruction, and outputs the relit face picture.
According to the face relighting method based on deep learning provided by the embodiment of the invention, a three-dimensional face model can be fitted to the face in a picture, an original illumination estimation map and a target illumination estimation map representing the illumination distribution on the face can be computed, a face relighting network can be constructed by a deep learning method, and the network can be trained with a face inverse-relighting regularization term and a face recognition network feature regularization term. For different face pictures, only the three-dimensional face reconstruction step of fitting the face model to the picture is needed before the face relighting network relights the face, so that the realism and reliability of face relighting can be effectively improved.
Next, a face relighting apparatus based on deep learning proposed according to an embodiment of the present invention will be described with reference to the drawings.
Fig. 4 is a schematic structural diagram of a face relighting device based on deep learning according to an embodiment of the present invention.
As shown in fig. 4, the deep learning based face relighting apparatus 10 includes: an acquisition module 100, an obtaining module 200, a construction module 300 and a processing module 400.
The acquisition module 100 is configured to determine, using the camera parameter matrix, the three-dimensional face model and the illumination model, the correspondence between the vertices of the three-dimensional face model and the positions and colors of the two-dimensional pixels of a face picture by an iterative optimization method, and to obtain the face region map, the normal map and the map representing the illumination distribution corresponding to the face picture. The obtaining module 200 is configured to process the photo set with the camera parameter matrix, the three-dimensional face model and the illumination model to obtain the network training data set. The construction module 300 is configured to construct a face relighting network by a deep learning method, wherein the face relighting network is generated by training on the network training data set. The processing module 400 is configured to process a face picture with the camera parameter matrix, the three-dimensional face model and the illumination model, and to process the output with the face relighting network to obtain a face relighting result. The device 10 of the embodiment of the invention constructs the face relighting network by a deep learning method, trains it with the face inverse-relighting regularization term and the face recognition network feature regularization term, and relights the face with the trained network, thereby effectively improving the realism and reliability of face relighting.
Further, in an embodiment of the present invention, the acquisition module 100 is further configured to obtain the positions of feature points in the face picture by a face feature point detection method, to render a face picture using the camera parameter matrix and the illumination model, and to iteratively optimize both the difference between the pixel values of the rendered face picture and the real face picture, and the distance on the two-dimensional picture between the feature points and the projections of the corresponding feature points of the three-dimensional face model.
Further, in an embodiment of the present invention, the construction module 300 is further configured to obtain an original illumination estimation map and a target illumination estimation map using the illumination model and the normal maps corresponding to the real face pictures in the training set, and to construct a generative adversarial network, wherein the generator network of the generative adversarial network is the face relighting network; the real face pictures of the network training data set together with the corresponding original and target illumination estimation maps are taken as the input of the generator network, the generator output together with the target illumination estimation map is taken as the false input of the adversarial network, and the real face pictures of the network training data set together with the original illumination estimation maps are taken as the real input of the adversarial network; the construction module 300 trains the generative adversarial network taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error.
Further, in an embodiment of the present invention, the construction module 300 is further configured to feed the generator output together with the target illumination estimation map and the original illumination estimation map through the generator network to obtain an output picture, and to take the per-pixel difference between the output picture and the corresponding real face picture as the inverse-relighting regularization term error.
Further, in an embodiment of the present invention, the construction module 300 is further configured to obtain the face recognition network features of the generator output and of the real face picture, and to take the difference between the generator output features and the real face picture features as the face recognition network feature regularization term error.
It should be noted that the explanation of the embodiment of the deep learning face relighting method is also applicable to the deep learning face relighting device of the embodiment, and details are not repeated here.
According to the deep learning face relighting device provided by the embodiment of the invention, a three-dimensional face model can be fitted to the face in a picture, an original illumination estimation map and a target illumination estimation map representing the illumination distribution on the face can be computed, a face relighting network can be constructed by a deep learning method, and the network can be trained with a face inverse-relighting regularization term and a face recognition network feature regularization term. For different face pictures, only the three-dimensional face reconstruction step of fitting the face model to the picture is needed before the face relighting network relights the face, so that the realism and reliability of face relighting can be effectively improved.
In the description of the present invention, it is to be understood that the terms "central," "longitudinal," "lateral," "length," "width," "thickness," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," "clockwise," "counterclockwise," "axial," "radial," "circumferential," and the like are used in the orientations and positional relationships indicated in the drawings for convenience in describing the invention and to simplify the description, and are not intended to indicate or imply that the referenced devices or elements must have a particular orientation, be constructed and operated in a particular orientation, and are therefore not to be considered limiting of the invention.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
In the present invention, unless otherwise expressly stated or limited, the terms "mounted," "connected," "secured," and the like are to be construed broadly and can, for example, be fixedly connected, detachably connected, or integrally formed; can be mechanically or electrically connected; they may be directly connected or indirectly connected through intervening media, or they may be connected internally or in any other suitable relationship, unless expressly stated otherwise. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.
In the present invention, unless otherwise expressly stated or limited, the first feature "on" or "under" the second feature may be directly contacting the first and second features or indirectly contacting the first and second features through an intermediate. Also, a first feature "on," "over," and "above" a second feature may be directly or diagonally above the second feature, or may simply indicate that the first feature is at a higher level than the second feature. A first feature being "under," "below," and "beneath" a second feature may be directly under or obliquely under the first feature, or may simply mean that the first feature is at a lesser elevation than the second feature.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (8)

1. A face relighting method based on deep learning is characterized by comprising the following steps:
determining the correspondence between the vertices of a three-dimensional face model and the positions and colors of the two-dimensional pixels of a face picture by an iterative optimization method, using a camera parameter matrix, the three-dimensional face model and an illumination model, and obtaining a face region map, a normal map and a map representing the illumination distribution corresponding to the face picture;
processing a photo set with the camera parameter matrix, the three-dimensional face model and the illumination model to obtain a network training data set;
constructing a face relighting network by a deep learning method, wherein the face relighting network is generated by training on the network training data set, and the constructing further comprises: obtaining an original illumination estimation map and a target illumination estimation map using the illumination model and the normal maps corresponding to the real face pictures in the training set; constructing a generative adversarial network, wherein the generator network of the generative adversarial network is the face relighting network; taking the real face pictures of the network training data set together with the corresponding original and target illumination estimation maps as the input of the generator network, taking the generator output together with the target illumination estimation map as the false input of the adversarial network, and taking the real face pictures of the network training data set together with the original illumination estimation maps as the real input of the adversarial network; and training the generative adversarial network, taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error; and
processing a face picture with the camera parameter matrix, the three-dimensional face model and the illumination model, and processing the output with the face relighting network to obtain a face relighting result.
2. The deep learning-based face relighting method according to claim 1, wherein the determining the correspondence between the vertices of the three-dimensional model of the face and the positions and colors of the two-dimensional pixels of the face picture by an iterative optimization method further comprises:
obtaining the positions of feature points in the face picture by a face feature point detection method;
rendering a face picture using the camera parameter matrix and the illumination model;
and iteratively optimizing both the difference between the pixel values of the rendered face picture and the real face picture, and the distance on the two-dimensional picture between the feature points and the projections of the corresponding feature points of the three-dimensional face model.
3. The deep learning-based face relighting method according to claim 1, wherein taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error further comprises:
feeding the generator output together with the target illumination estimation map and the original illumination estimation map through the generator network to obtain an output picture;
and taking the per-pixel difference between the output picture and the corresponding real face picture as the inverse-relighting regularization term error.
4. The deep learning-based face relighting method according to claim 1 or 3, wherein taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error further comprises:
obtaining the face recognition network features of the generator output and of the real face picture;
and taking the difference between the generator output features and the real face picture features as the face recognition network feature regularization term error.
5. A face relighting device based on deep learning is characterized by comprising:
an acquisition module, configured to determine, using a camera parameter matrix, a three-dimensional face model and an illumination model, the correspondence between the vertices of the three-dimensional face model and the positions and colors of the two-dimensional pixels of a face picture by an iterative optimization method, and to obtain a face region map, a normal map and a map representing the illumination distribution corresponding to the face picture;
an obtaining module, configured to process a photo set with the camera parameter matrix, the three-dimensional face model and the illumination model to obtain a network training data set;
a construction module, configured to construct a face relighting network by a deep learning method, wherein the face relighting network is generated by training on the network training data set; the construction module is further configured to obtain an original illumination estimation map and a target illumination estimation map using the illumination model and the normal maps corresponding to the real face pictures in the training set, and to construct a generative adversarial network, wherein the generator network of the generative adversarial network is the face relighting network; the real face pictures of the network training data set together with the corresponding original and target illumination estimation maps are taken as the input of the generator network, the generator output together with the target illumination estimation map is taken as the false input of the adversarial network, and the real face pictures of the network training data set together with the original illumination estimation maps are taken as the real input of the adversarial network; the generative adversarial network is trained taking as the error the sum of the adversarial network output, the inverse-relighting regularization term error and the face recognition network feature regularization term error; and
a processing module, configured to process a face picture with the camera parameter matrix, the three-dimensional face model and the illumination model, and to process the output with the face relighting network to obtain a face relighting result.
6. The deep learning-based face relighting device according to claim 5, wherein the acquisition module is further configured to obtain the positions of feature points in the face picture by a face feature point detection method, to render a face picture using the camera parameter matrix and the illumination model, and to iteratively optimize both the difference between the pixel values of the rendered face picture and the real face picture, and the distance on the two-dimensional picture between the feature points and the projections of the corresponding feature points of the three-dimensional face model.
7. The deep learning-based face relighting device according to claim 5, wherein the construction module is further configured to feed the generator output together with the target illumination estimation map and the original illumination estimation map through the generator network to obtain an output picture, and to take the per-pixel difference between the output picture and the corresponding real face picture as the inverse-relighting regularization term error.
8. The deep learning-based face relighting device according to claim 5 or 7, wherein the construction module is further configured to obtain the face recognition network features of the generator output and of the real face picture, and to take the difference between the generator output features and the real face picture features as the face recognition network feature regularization term error.
CN201711078324.7A 2017-11-06 2017-11-06 Face relighting method and device based on deep learning Active CN107909640B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711078324.7A CN107909640B (en) 2017-11-06 2017-11-06 Face relighting method and device based on deep learning


Publications (2)

Publication Number Publication Date
CN107909640A CN107909640A (en) 2018-04-13
CN107909640B (en) 2020-07-28

Family

ID=61842581

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711078324.7A Active CN107909640B (en) 2017-11-06 2017-11-06 Face relighting method and device based on deep learning

Country Status (1)

Country Link
CN (1) CN107909640B (en)



Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101872491A (en) * 2010-05-21 2010-10-27 清华大学 Free view angle relighting method and system based on photometric stereo
WO2015066628A1 (en) * 2013-11-04 2015-05-07 Facebook, Inc. Systems and methods for facial representation
CN105447906A (en) * 2015-11-12 2016-03-30 浙江大学 Method for calculating lighting parameters and carrying out relighting rendering based on image and model
CN106570928A (en) * 2016-11-14 2017-04-19 河海大学 Image-based re-lighting method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Relighting algorithm for improving face recognition rate; Luo Guangming; Computer and Modernization; 2015-06-30 (No. 238); pp. 51-54 *

Also Published As

Publication number Publication date
CN107909640A (en) 2018-04-13


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant