WO2020037680A1 - Light-based three-dimensional face optimization method and apparatus, and electronic device - Google Patents


Info

Publication number
WO2020037680A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
face
training
dimensional
model
Prior art date
Application number
PCT/CN2018/102333
Other languages
French (fr)
Chinese (zh)
Inventor
李建亿
朱利明
Original Assignee
太平洋未来科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 太平洋未来科技(深圳)有限公司 filed Critical 太平洋未来科技(深圳)有限公司
Priority to PCT/CN2018/102333 priority Critical patent/WO2020037680A1/en
Priority to CN201811031365.5A priority patent/CN109271911B/en
Publication of WO2020037680A1 publication Critical patent/WO2020037680A1/en

Classifications

    • G06V40/166: Human faces; Detection; Localisation; Normalisation using acquisition arrangements
    • G06T5/90: Image enhancement or restoration; Dynamic range modification of images or parts thereof
    • G06V40/168: Human faces; Feature extraction; Face representation
    • G06T2207/10004: Image acquisition modality; Still image; Photographic image
    • G06T2207/20081: Special algorithmic details; Training; Learning
    • G06T2207/20084: Special algorithmic details; Artificial neural networks [ANN]
    • G06T2207/30201: Subject of image; Human being; Face

Definitions

  • The invention relates to the technical field of image processing, and in particular to a light-based method, apparatus, and electronic device for optimizing a three-dimensional human face.
  • Three-dimensional face reconstruction has been widely used in the medical, education, and entertainment fields.
  • The inventor found that, in the existing process of 3D face reconstruction, multiple pictures taken from multiple angles are used to form a 3D model.
  • The reconstruction process is cumbersome and complicated, and takes a long time.
  • The key points of the face in the face image need to be located to generate their position information, and in the case of poor lighting conditions (such as backlighting or side lighting), the image information of the face is not clear, resulting in a large error in the finally generated 3D face.
  • Mobile devices such as mobile phones increasingly use 3D face reconstruction technology, and the large number of two-dimensional pictures required for 3D face reconstruction is often obtained through mobile phone cameras.
  • Mobile phones are prone to shake during shooting, which degrades image acquisition quality and indirectly affects the subsequent 3D face reconstruction effect.
  • The light-based three-dimensional face optimization method, apparatus, and electronic device provided by the embodiments of the present invention are intended to solve at least the foregoing problems in the related art.
  • One aspect of the embodiments of the present invention provides a light-based three-dimensional face optimization method, including:
  • The three-dimensional average face model is processed according to the first three-dimensional face model parameter information to obtain a three-dimensional face image corresponding to the face image.
  • The step of determining whether the face image in the obtained target picture is unevenly illuminated includes:
  • The step of performing light adjustment on the face image to obtain an optimized face image includes:
  • the training samples include a plurality of first images generated under a non-frontal uniform light source condition, and second images, corresponding to the first images, generated under a frontal uniform light source condition;
  • light adjustment is performed on the face image using the image generation model to obtain the optimized face image.
  • Using the machine learning method to train the initial image generation model based on the training samples to obtain the image generation model includes:
  • the initial image generation model is determined as the image generation model.
  • The convolutional neural network model is trained through the following steps:
  • the cross-entropy loss function is used to optimize the parameters of the convolutional neural network until the loss function between the second 3D face model parameter information and the 3D portrait scan data converges to a preset threshold.
  • The target picture is obtained through an image acquisition device, which includes a lens, an autofocus voice coil motor, a mechanical image stabilizer, and an image sensor; the lens is fixed on the autofocus voice coil motor;
  • the lens is used to acquire an image (picture);
  • the image sensor transmits the image acquired by the lens to the recognition module;
  • the autofocus voice coil motor is mounted on the mechanical image stabilizer;
  • the processing module drives the mechanical image stabilizer, according to the lens-shake feedback detected by the gyroscope in the lens, to realize lens shake compensation.
  • The mechanical image stabilizer includes a movable plate, a base plate, and a compensation mechanism.
  • Each of the movable plate and the base plate is provided with a through hole through which the lens passes, and the autofocus voice coil motor is installed on the movable plate.
  • The movable plate is mounted on the base plate, and the base plate is larger than the movable plate.
  • Driven by the processing module, the compensation mechanism moves the movable plate and the lens to realize lens shake compensation.
  • The compensation mechanism includes a first compensation component, a second compensation component, a third compensation component, and a fourth compensation component installed around the base plate; the first and third compensation components are disposed opposite each other, the second and fourth compensation components are disposed opposite each other, and the line between the first and third compensation components is perpendicular to the line between the second and fourth compensation components.
  • The first, second, third, and fourth compensation components each include a driving member, a rotating shaft, a one-way bearing, and a rotating ring gear.
  • The driving member is controlled by the processing module and is drivingly connected to the rotating shaft to drive the rotating shaft to rotate;
  • the rotating shaft is connected to the inner ring of the one-way bearing to drive the inner ring of the one-way bearing to rotate;
  • the rotating ring gear is sleeved on the one-way bearing and connected to the outer ring of the one-way bearing, and the outer surface of the rotating ring gear is provided with external teeth along its circumferential direction;
  • the bottom surface of the movable plate is provided with a plurality of rows of strip grooves arranged at even intervals; the strip grooves engage with the external teeth, and the external teeth can slide along the length direction of the strip grooves;
  • the rotatable direction of the one-way bearing of the first compensation component is opposite to that of the third compensation component, and the rotatable direction of the one-way bearing of the second compensation component is opposite to that of the fourth compensation component.
  • The driving member is a micro motor electrically connected to the processing module, with its rotary output end connected to the rotating shaft; alternatively, the driving member includes a memory alloy wire and a crank connecting rod, one end of the memory alloy wire being fixed on the fixing plate and connected to the processing module through a circuit, and the other end being connected to the rotating shaft through the crank connecting rod to drive the rotating shaft to rotate.
  • The image acquisition device is disposed on a mobile phone, and the mobile phone is provided with a bracket.
  • The bracket includes a mobile phone mounting seat and a retractable support rod.
  • The mounting seat includes a retractable connecting plate and folding plate groups installed at opposite ends of the connecting plate; one end of the support rod is connected to the middle of the connecting plate through a damping hinge.
  • Each folding plate group includes a first plate body, a second plate body, and a third plate body: one of the two opposite ends of the first plate body is hinged to the connecting plate, and the other end is hinged to one of the two opposite ends of the second plate body; the other end of the two opposite ends of the second plate body is hinged to one of the two opposite ends of the third plate body.
  • The second plate body is provided with an opening for inserting a corner of the mobile phone.
  • When the mounting seat is used to install the mobile phone, the first, second, and third plate bodies fold into a right triangle, in which the second plate body forms the hypotenuse and the first and third plate bodies form the right-angle sides; one side of the third plate body rests side by side against one side of the connecting plate.
  • One side of the third plate body is provided with a first connection portion, and the side surface of the connecting plate that contacts the third plate body is provided with a first mating portion that matches the first connection portion.
  • A second connection portion is provided on one end of the two opposite ends of the first plate body, and a second mating portion that cooperates with the second connection portion is provided on the other end of the two opposite ends of the third plate body.
  • The other end of the support rod is detachably connected to a base.
  • Another aspect of the embodiments of the present invention provides a light-based three-dimensional face optimization device, including:
  • a judging module configured to judge whether the face image in the obtained target picture has uneven illumination;
  • an optimization module configured to, if the face image has uneven illumination, perform light adjustment on the face image to obtain an optimized face image;
  • an acquisition module configured to process the optimized face image based on a pre-trained convolutional neural network model to obtain the first three-dimensional face model parameter information;
  • a processing module configured to process a three-dimensional average face model according to the first three-dimensional face model parameter information to obtain a three-dimensional face image corresponding to the face image.
  • The judgment module is specifically configured to:
  • The optimization module further includes a first training module, which is configured to:
  • the training samples include a plurality of first images generated under a non-frontal uniform light source condition, and second images, corresponding to the first images, generated under a frontal uniform light source condition;
  • light adjustment is performed on the face image using the image generation model to obtain the optimized face image.
  • The first training module is further configured to:
  • determine the initial image generation model as the image generation model.
  • The device further includes a second training module configured to: build a convolutional neural network model composed of two layers of hourglass-type convolutional neural networks; acquire a data set for training the convolutional neural network model, where the data set includes several two-dimensional face pictures and three-dimensional portrait scan data corresponding to the two-dimensional face pictures; pre-process the two-dimensional face pictures to obtain facial feature point information; input the facial feature point information into the convolutional neural network model to obtain the second three-dimensional face model parameter information; and use the cross-entropy loss function to optimize the parameters of the convolutional neural network until the loss function between the second three-dimensional face model parameter information and the three-dimensional portrait scan data converges to a preset threshold.
  • Another aspect of the embodiments of the present invention provides an electronic device, including: at least one processor; and a memory communicatively connected to the at least one processor; wherein,
  • the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute any one of the foregoing light-based three-dimensional face optimization methods.
  • The light-based three-dimensional face optimization method, device, and electronic device provided by the embodiments of the present invention optimize face images taken in poor lighting environments (such as backlighting or side lighting) to obtain a clear face; at the same time, a three-dimensional face image can be generated from only a single picture.
  • The convolutional neural network model can automatically generate more accurate and realistic facial expressions and poses without additional hardware support, reducing costs in many ways.
  • FIG. 1 is a flowchart of a light-based three-dimensional face optimization method according to an embodiment of the present invention
  • FIG. 2 is a flowchart of a light-based three-dimensional face optimization method according to an embodiment of the present invention
  • FIG. 3 is a flowchart of a light-based three-dimensional face optimization method according to an embodiment of the present invention.
  • FIG. 4 is a structural diagram of a light-based three-dimensional face optimization device according to an embodiment of the present invention.
  • FIG. 5 is a structural diagram of a light-based three-dimensional face optimization device according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of a hardware structure of an electronic device that executes a method for optimizing a three-dimensional human face according to a method embodiment of the present invention
  • FIG. 7 is a structural diagram of an image acquisition device according to an embodiment of the present invention.
  • FIG. 8 is a structural diagram of an optical image stabilizer provided by an embodiment of the present invention.
  • FIG. 9 is an enlarged view of part A of FIG. 8.
  • FIG. 10 is a schematic bottom view of the movable plate of a micro memory alloy optical image stabilizer provided by an embodiment of the present invention.
  • FIG. 11 is a structural diagram of a bracket provided by an embodiment of the present invention.
  • FIG. 12 is a schematic state diagram of a bracket according to an embodiment of the present invention.
  • FIG. 13 is a schematic view of another state of a bracket according to an embodiment of the present invention.
  • FIG. 14 is a structural state diagram when the mounting base and the mobile phone are connected according to an embodiment of the present invention.
  • FIG. 1 is a flowchart of a light-based three-dimensional face optimization method according to an embodiment of the present invention.
  • a light-based three-dimensional face optimization method provided by an embodiment of the present invention includes:
  • S101 Obtain a target picture, and determine whether a face image in the target picture is in a non-uniform light condition.
  • The target picture may be a picture taken in real time, or a picture stored locally on the terminal.
  • If the target picture is taken under backlight or side-light conditions, the face in the target picture is under non-uniform light, which makes the facial features of the portrait unclear and causes errors in the generated three-dimensional face image. Therefore, in this step, after obtaining the target picture, it is first necessary to determine whether the face image in the target picture is under non-uniform light conditions.
  • the grayscale histogram of the picture can clearly represent the light and dark distribution of the image, and its distribution has nothing to do with the content of the image.
  • the distribution of grayscale histograms of backlit or sidelight scenes and non-backlit or sidelight scenes is completely different.
  • The grayscale histogram of a backlit or side-lit scene has high pixel counts at the extremely bright and extremely dark gray levels, while in non-backlit, non-side-lit scenes the pixels are mainly concentrated in the middle gray levels. Therefore, the gray-level distribution of a backlit or side-lit picture has a large variance, while that of a non-backlit, non-side-lit scene has a small variance.
  • A critical variance of the gray-level distribution can be obtained from multiple pictures (including backlit, side-lit, non-backlit, and non-side-lit ones). If a picture's gray-level variance is greater than the critical variance, it is determined to be a backlit or side-lit picture, that is, the target picture is under non-uniform light conditions; if it is less than the critical variance, it is determined to be a non-backlit, non-side-lit picture, that is, the target picture is under uniform light conditions.
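The variance test described above can be sketched as follows (an illustrative NumPy implementation; the function names and the example threshold are assumptions, not from the patent):

```python
import numpy as np

def gray_histogram(image_gray, bins=256):
    """Normalized gray-level histogram of an 8-bit grayscale image."""
    hist, _ = np.histogram(image_gray, bins=bins, range=(0, bins))
    return hist / hist.sum()

def gray_level_variance(image_gray):
    """Variance of the gray-level distribution, treating the normalized
    histogram as a probability distribution over gray levels 0..255."""
    p = gray_histogram(image_gray)
    levels = np.arange(256)
    mean = (levels * p).sum()
    return ((levels - mean) ** 2 * p).sum()

def is_non_uniform_light(image_gray, critical_variance):
    """Backlit/side-lit pictures pile pixels at the very dark and very
    bright gray levels, so their gray-level variance exceeds the
    critical variance learned from example pictures."""
    return gray_level_variance(image_gray) > critical_variance
```

In practice the critical variance would be chosen from a labeled set of backlit, side-lit, and evenly lit pictures, as the text describes.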
  • If the face image is under non-uniform light conditions, step S102 is performed.
  • The optimized face image may be a face image presented under a uniform light source condition, under which clear facial features can be obtained.
  • The image generation model can be used to perform light adjustment on a face image captured under a non-frontal uniform light source condition to generate a face image under a frontal uniform light source condition.
  • the image generation model may be a model obtained by using a machine learning method in advance to train a model (for example, an existing convolutional neural network model, etc.) for image processing based on training samples.
  • the above convolutional neural network may include a convolutional layer, a pooling layer, a depooling layer, and a deconvolution layer.
  • The last deconvolution layer of the convolutional neural network may output the optimized face image, which can be expressed as a matrix of three RGB channels; the size of the output optimized face image can be the same as that of the face image in the target picture.
  • the image generation model can be trained by the following steps:
  • S1021 Obtain training samples and an initial image generation model (an existing model, not described in detail here); the training samples include multiple first images generated under a non-frontal uniform light source condition and, for each first image, a corresponding second image generated under a frontal uniform light source condition.
  • An initial image generation model and its initial parameters may be determined, and the output of the initial image generation model may be evaluated and corrected through a discriminant network.
  • First, the first images in the training samples are input into the initial image generation model to obtain the optimized first images output by the model; second, the optimized first images and the corresponding second images are used as the input of the discriminant network.
  • The discriminant network is trained, and its parameters are then determined and fixed; subsequent output results are evaluated and corrected using these parameters. Again, the first images are input into the initial image generation model, and training continues until the loss function converges.
  • The initial image generation model is then determined as the image generation model.
  • The value of the loss function may be used to characterize the degree of difference between the optimized first image output by the image generation model and the second image.
  • The aforementioned loss function may use a Euclidean distance function, a hinge loss function, or the like.
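As a hedged illustration of the Euclidean option, the loss between the generator's output and the ground-truth frontal-light image can be computed over the three-channel RGB matrices (NumPy sketch; the function name is illustrative, not from the patent):

```python
import numpy as np

def euclidean_loss(optimized_first_image, second_image):
    """Pixel-wise Euclidean (L2) distance between the generator's
    optimized output and the corresponding image captured under a
    frontal uniform light source. Both inputs are H x W x 3 arrays."""
    diff = optimized_first_image.astype(np.float64) - second_image.astype(np.float64)
    return np.sqrt((diff ** 2).sum())
```

During training this value would be driven toward zero, alongside the discriminant network's feedback.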
  • S103 Process the optimized face image based on a pre-trained convolutional neural network model to obtain the first three-dimensional face model parameter information.
  • the first three-dimensional face parameter information includes face shape information and facial expression information.
  • The face image obtained in step S102 is input into the pre-trained convolutional neural network model, and the first three-dimensional face model parameter information is output.
  • training the convolutional neural network model can include the following steps:
  • S1032 Obtain a data set for training the convolutional neural network model, where the data set includes a plurality of two-dimensional face pictures and three-dimensional portrait scan data corresponding to the two-dimensional face pictures.
  • The data set can be acquired first, and then the convolutional neural network model can be constructed.
  • The convolutional neural network model can also be constructed first; no restriction is imposed here.
  • The methods for obtaining the input sample data set in this step include downloading pictures directly from the Internet as the input sample data set, and manually taking pictures as the input sample data set.
  • The manually taken pictures may include pictures of people of different races and pictures of people under different light and shadow effects.
  • The 3D portrait scan data mainly includes the pose information of the face (such as the tilt angle, deflection angle, and rotation angle of the face), the shape parameters of the face feature points, and the expression parameters of the face feature points.
  • S1033 Preprocess the two-dimensional face picture to obtain face feature point information.
  • The facial feature point information includes, but is not limited to, the coordinate parameter values of the facial feature points in the picture and their texture parameters (that is, texture parameters of the RGB features).
  • The related art includes many methods for recognizing a face image; for example, the range of a face image can be recognized according to the edge information and/or color information of the image. In this embodiment, pre-defined key points are identified based on detection, and the detected key points determine the facial feature point information. For example, the eyebrows, eyes, nose, face contour, and mouth in the face image are each composed of several key points; that is, their positions and textures can be determined from the coordinate positions of the key points.
  • a facial feature point recognition algorithm may be used to obtain facial feature point information.
  • The training of the facial feature point recognition algorithm may include the following steps: first, a certain number of training pictures carrying human facial feature point information are obtained as the training set; second, the training set is used to form an initial regression function r0 and an initial training set; again, the initial training set and the initial regression function r0 are iterated to form the next training set and regression function rn. Each iteration of the regression function is learned with a gradient boosting algorithm, so that when the facial feature point information of the nth training set meets the convergence conditions, the corresponding regression function rn is the trained facial feature point recognition algorithm.
  • An algorithm is used to perform face detection on the picture to obtain the position of the face, and a bounding rectangle is used to identify the range of the face, for example (left, top, right, bottom).
  • For the input portrait photo, the regression function of the trained feature point recognition algorithm yields the first preset number of feature points and the coordinates (x_i, y_i) of each facial feature point, where i denotes the i-th feature point.
  • The first preset number may be 68, including key points of the eyebrows, eyes, nose, mouth, and facial contour.
  • For each feature point, a texture parameter (R_i, G_i, B_i) representing a second preset number of surrounding pixels is formed according to its coordinates (x_i, y_i) and a Gaussian algorithm.
  • The second preset number may be 6, 8, or the like, which is not limited in the present invention.
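Under the assumption that the "Gaussian algorithm" denotes a Gaussian-weighted average of the pixels around each landmark (the patent does not specify the exact kernel), the texture parameters could be sketched as follows (illustrative NumPy code):

```python
import numpy as np

def texture_parameters(image_rgb, landmarks, radius=1, sigma=1.0):
    """For each landmark (x_i, y_i), return an (R_i, G_i, B_i) texture
    parameter computed as a Gaussian-weighted average of the surrounding
    pixels. radius=1 covers the 8 neighbours plus the centre pixel."""
    h, w, _ = image_rgb.shape
    params = []
    for x, y in landmarks:
        acc = np.zeros(3)
        weight_sum = 0.0
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                # Clamp to the image border so edge landmarks stay valid.
                px = min(max(int(x) + dx, 0), w - 1)
                py = min(max(int(y) + dy, 0), h - 1)
                wgt = np.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
                acc += wgt * image_rgb[py, px].astype(np.float64)
                weight_sum += wgt
        params.append(acc / weight_sum)
    return np.array(params)  # shape: (num_landmarks, 3)
```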
  • S1034 Enter the feature point information of the face into the convolutional neural network model to obtain the second three-dimensional face model parameter information.
  • The input of the convolutional neural network at each pass is the face feature point information.
  • the face feature point information can reflect the current face shape information.
  • the output of the algorithm is the second three-dimensional face model parameter p.
  • the algorithm uses a convolutional neural network to fit the mapping function from input to output.
  • The network structure includes 4 convolutional layers, 3 pooling layers, and 2 fully connected layers. Multiple convolutional neural networks are cascaded until convergence on the training set; at each level, the currently predicted face shape is updated and used as the input of the next-level convolutional neural network.
  • The first two convolutional layers of the network extract facial features through weight sharing, and the last two convolutional layers extract facial features through local perception; the network then regresses a feature vector in a 256-dimensional space and outputs a 234-dimensional feature vector, namely the second three-dimensional face model parameter p.
  • The parameter p includes the face pose parameters [f, pitch, yaw, roll, t_2dx, t_2dy], the shape parameter α_id, and the expression parameter α_exp.
  • f is a scale factor
  • pitch is a tilt angle
  • yaw is a deflection angle
  • roll is a rotation angle
  • t_2dx and t_2dy are offset terms.
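The three pose angles can be assembled into the rotation matrix R used later in the weak perspective projection. The sketch below assumes an Rz·Ry·Rx composition order, which is one common convention; the text does not fix the order explicitly:

```python
import numpy as np

def rotation_matrix(pitch, yaw, roll):
    """Rotation matrix R from the pitch (x-axis), yaw (y-axis) and
    roll (z-axis) angles, in radians."""
    cx, sx = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    cz, sz = np.cos(roll), np.sin(roll)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx
```

Any valid composition yields an orthonormal matrix with determinant 1, which is what the projection step requires.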
  • S1035 Optimize the parameters of the convolutional neural network by using a cross-entropy loss function until the loss function between the second three-dimensional face model parameter information and the three-dimensional portrait scan data converges to a preset threshold.
  • The loss function reflects the degree of fit of the model to the data: the worse the fit, the larger the value of the loss function.
  • Starting from the initial parameters, an updated parameter p^k is obtained at each stage, and a neural network Net^k is trained for each stage according to the above three-dimensional portrait scan data.
  • The prediction parameter p is continuously updated through the intermediate estimates p^k.
  • The network is expressed mathematically as follows: p^(k+1) = p^k + Net^k(I, p^k).
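Assuming the standard cascaded-regression form of this update, the iteration can be sketched as follows (the stand-in callables below replace the trained networks Net^k; all names are illustrative):

```python
import numpy as np

def cascaded_update(p0, nets, features):
    """Cascaded regression: each stage-k network predicts a residual
    that refines the current parameter estimate,
        p_{k+1} = p_k + Net_k(features, p_k).
    Each element of `nets` is a callable standing in for a trained
    stage network."""
    p = p0
    for net in nets:
        p = p + net(features, p)
    return p
```

With each stage removing part of the remaining error, the estimate converges toward the scan-data ground truth.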
  • S104 Process the three-dimensional average face model according to the first three-dimensional face model parameter information to obtain a three-dimensional face image corresponding to the face image.
  • Faces share many similarities: a normal face has one nose, two eyes, one mouth, and two ears, and their top-to-bottom, left-to-right order is fixed, so a three-dimensional average face model can be built first. Because faces are highly similar, one normal face can always be deformed into another, and the average face model can be deformed by calculating the amount of change; this is also the basis of 3D face reconstruction.
  • The three-dimensional average face model is processed according to the face shape information and the facial expression information to obtain an initial three-dimensional face model:
  • S = S_0 + A_id * α_id + A_exp * α_exp
  • S is the initial three-dimensional face model;
  • S_0 is the average face model;
  • A_id is the base vector matrix of the shape;
  • α_id is the shape parameter;
  • A_exp is the base vector matrix of the expression;
  • α_exp is the expression parameter.
  • A_id and A_exp are obtained in advance using existing algorithms.
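The linear combination above can be written directly (NumPy sketch; the basis matrices are assumed to be stored as flattened 3N-dimensional vertex arrays, one column per basis vector):

```python
import numpy as np

def initial_face_model(S0, A_id, alpha_id, A_exp, alpha_exp):
    """Linear 3DMM combination from the text:
        S = S_0 + A_id * alpha_id + A_exp * alpha_exp
    S0:    mean face, flattened (3N,) vertex vector
    A_id:  (3N, n_id) shape basis;  alpha_id:  (n_id,) shape parameters
    A_exp: (3N, n_exp) expression basis; alpha_exp: (n_exp,) parameters"""
    return S0 + A_id @ alpha_id + A_exp @ alpha_exp
```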
  • The initial three-dimensional face model is adjusted according to the face posture information to obtain a three-dimensional face image corresponding to the face.
  • The initial three-dimensional face model is projected onto the image plane through a weak perspective projection to obtain the three-dimensional face image corresponding to the face; the formula is expressed as follows:
  • V(p) = f * Pr * R * (S_0 + A_id * α_id + A_exp * α_exp) + t_2d
  • V(p) is the reconstructed three-dimensional face image corresponding to the face;
  • f is the scale factor;
  • Pr is the orthographic projection matrix;
  • R is the rotation matrix.
  • The tilt angle (pitch), deflection angle (yaw), and rotation angle (roll) are obtained from the pose information of the human face in the two-dimensional image identified by the feature points.
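A sketch of the weak perspective projection, assuming Pr is the 2x3 orthographic matrix that keeps the x and y coordinates (an assumption consistent with the formula, though the patent does not print Pr explicitly):

```python
import numpy as np

def weak_perspective_projection(f, R, S, t_2d):
    """V(p) = f * Pr * R * S + t_2d, where S holds the (3, N) model
    vertices (e.g. the reshaped output of the 3DMM combination) and
    t_2d = (t_2dx, t_2dy) is the 2D offset term."""
    Pr = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])  # drop the z coordinate
    return f * (Pr @ (R @ S)) + t_2d.reshape(2, 1)
```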
  • The light-based three-dimensional face optimization method provided by the embodiment of the present invention optimizes face images taken in poor lighting environments (such as backlighting or side lighting) to obtain a clear human face; moreover, a three-dimensional face image can be generated from only a single picture.
  • The convolutional neural network model can automatically generate more accurate and realistic facial expressions and poses without additional hardware support, reducing costs in many ways.
  • FIG. 4 is a structural diagram of a light-based three-dimensional face optimization device according to an embodiment of the present invention. As shown in FIG. 4, the device specifically includes a judgment module 100, an optimization module 200, an acquisition module 300, and a processing module 400. Among them:
  • The judgment module 100 is configured to obtain a target picture and determine whether a face image in the target picture is in a non-uniform light condition; the optimization module 200 is configured to, if the face image is in a non-uniform light condition, input the face image into a pre-trained image generation model to obtain an optimized face image after light adjustment of the face image; the acquisition module 300 is configured to process the optimized face image based on a pre-trained convolutional neural network model to obtain first three-dimensional face model parameter information; and the processing module 400 is configured to process a three-dimensional average face model according to the first three-dimensional face model parameter information to obtain a three-dimensional face image corresponding to the face image.
  • The light-based three-dimensional face optimization device provided by this embodiment of the present invention is specifically configured to execute the method provided by the embodiment shown in FIG. 1; its implementation principles, methods, and functional uses are similar to those of the embodiment shown in FIG. 1 and are not repeated here.
  • FIG. 5 is a structural diagram of a light-based three-dimensional face optimization device according to an embodiment of the present invention. As shown in FIG. 5, the device specifically includes a first training module 500, a second training module 600, a judgment module 100, an optimization module 200, an acquisition module 300, and a processing module 400. Among them:
  • The judgment module 100 is configured to obtain a target picture and determine whether a face image in the target picture is in a non-uniform light condition; the optimization module 200 is configured to, if the face image is in a non-uniform light condition, input the face image into a pre-trained image generation model to obtain an optimized face image after light adjustment of the face image; the acquisition module 300 is configured to process the optimized face image based on a pre-trained convolutional neural network model to obtain first three-dimensional face model parameter information; and the processing module 400 is configured to process a three-dimensional average face model according to the first three-dimensional face model parameter information to obtain a three-dimensional face image corresponding to the face image.
  • The first training module 500 is configured to obtain training samples and an initial image generation model, where the training samples include a plurality of first images generated under a non-frontal uniform light source condition and second images, generated under a frontal uniform light source condition, corresponding to the first images; and to train, using a machine learning method, the initial image generation model based on the training samples to obtain the image generation model.
  • The second training module 600 is configured to build a convolutional neural network model composed of a two-layer stacked hourglass convolutional neural network; acquire a data set for training the convolutional neural network model, where the data set includes a plurality of two-dimensional face pictures and three-dimensional portrait scan data corresponding to the two-dimensional face pictures; pre-process the two-dimensional face pictures to obtain face feature point information; input the face feature point information into the convolutional neural network model to obtain second three-dimensional face model parameter information; and use a cross-entropy loss function to optimize the parameters of the convolutional neural network until the loss function between the second three-dimensional face model parameter information and the three-dimensional portrait scan data converges to a preset threshold.
  • The judgment module 100 is configured to process the target picture to obtain a grayscale histogram of the target picture; calculate the gray-level distribution variance of the target picture according to the grayscale histogram; and compare the gray-level distribution variance with a gray-level distribution critical variance. When the gray-level distribution variance is greater than or equal to the gray-level distribution critical variance, it is determined that the face image in the target picture is in a non-uniform light condition.
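The judgment described above (grayscale histogram, gray-level distribution variance, threshold comparison) can be sketched as follows; the critical variance value is a hypothetical tuning parameter, since the patent leaves the threshold unspecified:

```python
import numpy as np

def is_non_uniform_light(gray_image, critical_variance=2000.0):
    """Decide non-uniform lighting from the gray-level distribution variance.

    gray_image: 2-D array of 8-bit gray levels.
    critical_variance: hypothetical threshold, chosen here for illustration.
    """
    # Grayscale histogram of the target picture (256 bins, levels 0..255).
    hist, _ = np.histogram(gray_image, bins=256, range=(0, 256))
    p = hist / hist.sum()                  # normalized gray-level distribution
    levels = np.arange(256)
    mean = (levels * p).sum()
    variance = ((levels - mean) ** 2 * p).sum()
    # Non-uniform light when variance >= critical variance.
    return variance >= critical_variance

# A half-dark / half-bright picture has a widely spread histogram -> non-uniform.
img = np.zeros((64, 64), dtype=np.uint8)
img[:, 32:] = 255
print(is_non_uniform_light(img))  # True (variance is 127.5**2 ≈ 16256)
```

An evenly lit picture concentrates its histogram around one level, giving a small variance and a uniform-light verdict.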
  • The first training module 500 is further configured to input the first images into the initial image generation model to obtain output optimized first images; use the optimized first images and the second images as the input of a discrimination network, and train the discrimination network to determine the parameters of the trained discrimination network; use the first images as the input of the initial image generation model, and train the initial image generation model; input the optimized first images output by the trained initial image generation model and the second images into the trained discrimination network, and determine the loss function value of the trained discrimination network; and when the loss function value converges, determine the initial image generation model as the image generation model.
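The alternating schedule in the paragraph above — train the discrimination network on optimized/reference image pairs, train the generation model, and stop when the discriminator's loss value converges — can be sketched as control flow. The update functions here are stubs standing in for real network optimization steps, and the convergence tolerance is an assumption:

```python
# Control-flow sketch of the adversarial training schedule described above.
# The update functions are stubs; real ones would run gradient steps on networks.

def train_discriminator_step(fake_batch, real_batch, state):
    """Stub: one discriminator update; returns its current loss value."""
    state["d_loss"] *= 0.8          # pretend the loss shrinks each step
    return state["d_loss"]

def train_generator_step(batch, state):
    """Stub: one generator update on the first images."""
    state["g_loss"] *= 0.9
    return state["g_loss"]

def train_image_generation_model(first_images, second_images,
                                 tol=1e-3, max_iters=200):
    state = {"d_loss": 1.0, "g_loss": 1.0}
    prev = float("inf")
    for _ in range(max_iters):
        # 1) optimized first images + second images -> train the discrimination network
        d_loss = train_discriminator_step(first_images, second_images, state)
        # 2) first images -> train the initial image generation model
        train_generator_step(first_images, state)
        # 3) stop when the discriminator's loss function value converges
        if abs(prev - d_loss) < tol:
            return state, True
        prev = d_loss
    return state, False

state, converged = train_image_generation_model(["img_a"], ["img_b"])
print(converged)  # True
```

Only the schedule is taken from the patent text; the loss dynamics here are fabricated placeholders to make the control flow executable.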
  • The light-based three-dimensional face optimization device provided by this embodiment of the present invention is specifically configured to execute the methods provided by the embodiments shown in FIG. 1 to FIG. 3; its implementation principles, methods, and functional uses are similar to those of the embodiments shown in FIG. 1 to FIG. 3 and are not repeated here.
  • The above-mentioned light-based three-dimensional face optimization device may be used as a software or hardware functional unit independently set in the above-mentioned electronic device, or may be implemented as a functional module integrated in the processor, to execute the light-based three-dimensional face optimization method according to the embodiments of the present invention.
  • FIG. 6 is a schematic diagram of a hardware structure of an electronic device that performs a light-based three-dimensional face optimization method according to an embodiment of the method of the present invention.
  • the electronic device includes:
  • The electronic device includes one or more processors 610 and a memory 620; one processor 610 is taken as an example in FIG. 6.
  • The device for performing the light-based three-dimensional face optimization method may further include an input device 630 and an output device 640.
  • the processor 610, the memory 620, the input device 630, and the output device 640 may be connected through a bus or other methods. In FIG. 6, the connection through the bus is taken as an example.
  • The memory 620, as a non-volatile computer-readable storage medium, can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as the program instructions and modules corresponding to the light-based three-dimensional face optimization method in the embodiments of the present invention.
  • the processor 610 executes various functional applications and data processing of the server by running non-volatile software programs, instructions, and modules stored in the memory 620, that is, implementing the light-based three-dimensional face optimization method.
  • The memory 620 may include a storage program area and a storage data area, where the storage program area may store an operating system and an application program required for at least one function, and the storage data area may store data created according to the use of the light-based three-dimensional face optimization device provided by the embodiments of the present invention, and the like.
  • The memory 620 may include a high-speed random access memory, and may further include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage device.
  • The memory 620 may optionally include memories remotely disposed with respect to the processor 610, and these remote memories may be connected to the light-based three-dimensional face optimization device through a network. Examples of the above network include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
  • the input device 630 may receive inputted numeric or character information, and generate key signal inputs related to user settings and function control of a light-based three-dimensional face optimization device.
  • the input device 630 may include a device such as a pressing module.
  • the one or more modules are stored in the memory 620, and when executed by the one or more processors 610, execute the light-based three-dimensional face optimization method.
  • the electronic devices in the embodiments of the present invention exist in various forms, including but not limited to:
  • Mobile communication equipment: this type of equipment is characterized by mobile communication functions, and its main goal is to provide voice and data communication. Such terminals include smart phones (such as the iPhone), multimedia phones, feature phones, and low-end phones.
  • Ultra-mobile personal computer equipment: this type of equipment belongs to the category of personal computers, has computing and processing functions, and generally also has mobile Internet access. Such terminals include PDA, MID, and UMPC devices, such as the iPad.
  • Portable entertainment equipment: this type of equipment can display and play multimedia content. Such devices include audio and video players (such as the iPod), handheld game consoles, e-book readers, smart toys, and portable car navigation devices.
  • an image acquisition device for acquiring an image is provided on the electronic device, and a software or hardware image stabilizer is often provided on the image acquisition device to ensure the quality of the acquired image.
  • Most existing image stabilizers are driven by coils that generate a Lorentz force in a magnetic field to move the lens. The lens needs to be driven in at least two directions, which means multiple coils must be arranged; this poses certain challenges to the miniaturization of the overall structure, and the coils are easily affected by external magnetic fields, which degrades the anti-shake effect. The Chinese patent published as CN106131435A therefore provides a miniature optical anti-shake camera module, which drives the lens through temperature-induced changes of memory alloy wires.
  • The control chip of the micro memory alloy optical anti-shake actuator can change the driving signal to change the temperature of the memory alloy wire, thereby controlling its elongation and shortening, and calculates the position and moving distance of the actuator based on the resistance of the memory alloy wire. When the micro memory alloy optical image stabilization actuator moves to the specified position, the resistance of the memory alloy wire at that moment is fed back, and by comparing the deviation between this resistance value and the target value, the movement deviation of the actuator can be corrected.
  • However, the above technical solution can only compensate the lens for shake in a first direction; when a subsequent shake in a second direction occurs, the memory alloy wire cannot deform in an instant, so compensation easily comes too late, and accurate lens shake compensation cannot be achieved for multiple shakes and continuous shakes in different directions. The structure therefore needs to be improved in order to obtain better image quality and facilitate subsequent 3D image generation.
  • this embodiment improves the anti-shake device and designs it as a mechanical anti-shake device 3000.
  • the specific structure is as follows:
  • The mechanical image stabilizer 3000 of this embodiment includes a movable plate 3100, a base plate 3200, and a compensation mechanism 3300.
  • Both the movable plate 3100 and the base plate 3200 are provided with a through hole through which the lens 1000 passes.
  • An autofocus voice coil motor 2000 is mounted on the movable plate 3100, and the movable plate 3100 is mounted on the base plate 3200.
  • The size of the base plate 3200 is larger than that of the movable plate 3100, and the up-and-down movement of the movable plate 3100 is limited by the autofocus voice coil motor above it.
  • Driven by the processing module, the compensation mechanism 3300 drives the movable plate 3100 and the lens 1000 on it to move so as to achieve shake compensation of the lens 1000.
  • The compensation mechanism 3300 in this embodiment includes a first compensation component 3310, a second compensation component 3320, a third compensation component 3330, and a fourth compensation component 3340 installed around the base plate 3200.
  • The first compensation component 3310 and the third compensation component 3330 are disposed opposite to each other, the second compensation component 3320 and the fourth compensation component 3340 are disposed opposite to each other, and the connection line between the first compensation component 3310 and the third compensation component 3330 is perpendicular to the connection line between the second compensation component 3320 and the fourth compensation component 3340; that is, the four compensation components are arranged in the front, rear, left, and right directions of the movable plate 3100, respectively.
  • The first compensation component 3310 can make the movable plate 3100 move forward, the third compensation component 3330 can make it move backward, the second compensation component 3320 can make it move left, and the fourth compensation component 3340 can make it move right.
  • The first compensation component 3310 can cooperate with the second compensation component 3320 or the fourth compensation component 3340 to move the movable plate 3100 in an inclined direction, and the third compensation component 3330 can likewise cooperate with the second compensation component 3320 or the fourth compensation component 3340 to move the movable plate 3100 in an inclined direction, thereby realizing shake compensation of the lens 1000 in all directions.
  • The first compensation component 3310, the second compensation component 3320, the third compensation component 3330, and the fourth compensation component 3340 in this embodiment each include a driving member 3301, a rotating shaft 3302, a one-way bearing 3303, and a rotating ring gear 3304.
  • The driving member 3301 is controlled by the processing module and is drivingly connected to the rotating shaft 3302 to drive the rotating shaft 3302 to rotate.
  • The rotating shaft 3302 is connected to the inner ring of the one-way bearing 3303 to drive the inner ring of the one-way bearing 3303 to rotate.
  • The rotating ring gear 3304 is sleeved on the one-way bearing 3303 and fixedly connected to the outer ring of the one-way bearing 3303.
  • The outer surface of the rotating ring gear 3304 is provided with a ring of external teeth along its circumferential direction; the external teeth mesh with a strip groove 3110 on the movable plate 3100, and can slide along the length direction of the strip groove 3110.
  • The rotatable direction of the one-way bearing 3303 of the first compensation component 3310 is opposite to that of the one-way bearing 3303 of the third compensation component 3330, and the rotatable direction of the one-way bearing 3303 of the second compensation component 3320 is opposite to that of the one-way bearing 3303 of the fourth compensation component 3340.
  • A one-way bearing 3303 is a bearing that can rotate freely in one direction and locks in the other direction.
  • When shake compensation is required, the driving member 3301 of the first compensation component 3310 causes the rotating shaft 3302 to drive the inner ring of the one-way bearing 3303 to rotate.
  • In this rotation direction the one-way bearing 3303 is locked, so the inner ring of the one-way bearing 3303 drives the outer ring to rotate, which in turn drives the rotating ring gear 3304 to rotate; through its engagement with the strip groove 3110, the rotating ring gear 3304 drives the movable plate 3100 to move in a direction that compensates for the shake.
  • Afterwards, the third compensation component 3330 can be used to drive the movable plate 3100 to reset.
  • During resetting, the one-way bearing 3303 of the first compensation component 3310 is in a rotatable state, so the ring gear of the first compensation component 3310 follows the movable plate 3100 and does not hinder the reset of the movable plate 3100.
  • Mounting holes are further provided for the one-way bearings 3303 and the rotating ring gears 3304; concealing parts of the one-way bearings 3303 and the rotating ring gears 3304 in the mounting holes can reduce the overall thickness of the mechanical image stabilizer 3000, and a part of the entire compensation assembly may even be placed directly in a mounting hole.
  • The driving member 3301 in this embodiment may be a micro motor electrically connected to the processing module; the rotation output end of the micro motor is connected to the rotating shaft 3302, and the micro motor is controlled by the processing module.
  • In another embodiment, the driving member 3301 is composed of a memory alloy wire and a crank connecting rod: one end of the memory alloy wire is fixed on a fixing plate and connected to the processing module through a circuit, and the other end of the memory alloy wire is connected to the rotating shaft 3302 through the crank connecting rod to drive the rotating shaft 3302 to rotate.
  • The processing module calculates the required elongation of the memory alloy wire according to the feedback from the gyroscope and drives the corresponding circuit to raise the temperature of the shape memory alloy wire; the shape memory alloy wire stretches and drives the crank connecting rod mechanism, the crank of which drives the rotating shaft 3302 to rotate the inner ring of the one-way bearing 3303; when the one-way bearing 3303 is locked, the inner ring drives the outer ring to rotate, and the rotating ring gear 3304 drives the movable plate 3100 through the strip groove 3110.
  • The working process of the mechanical image stabilizer 3000 of this embodiment is described in detail below in combination with the above structure.
  • Suppose the movable plate 3100 needs one forward motion compensation followed by one leftward motion compensation.
  • For the forward compensation, the gyroscope feeds the detected shake direction and distance of the lens 1000 to the processing module in advance, and the processing module calculates the required movement distance of the movable plate 3100 and drives the first compensation component 3310.
  • The driving member 3301 causes the rotating shaft 3302 to drive the inner ring of the one-way bearing 3303; in this direction the one-way bearing 3303 is locked, so the inner ring drives the outer ring to rotate, which in turn drives the rotating ring gear 3304 to rotate, and the rotating ring gear 3304 drives the movable plate 3100 forward through the strip groove 3110. Afterwards, the third compensation component 3330 drives the movable plate 3100 to reset.
  • For the leftward compensation, the gyroscope again feeds the detected shake direction and distance of the lens 1000 to the processing module in advance, and the processing module calculates the required movement distance of the movable plate 3100 and drives the second compensation component 3320.
  • The driving member 3301 causes the rotating shaft 3302 to drive the inner ring of the one-way bearing 3303; the one-way bearing 3303 is locked, so the inner ring drives the outer ring to rotate, which in turn drives the rotating ring gear 3304 to rotate, and the rotating ring gear 3304 drives the movable plate 3100 to move to the left through the strip groove 3110.
  • Because the external teeth of the rotating ring gear 3304 can slide along the length direction of the strip groove 3110, the sliding fit between the movable plate 3100 and the first compensation component 3310 and the third compensation component 3330 does not hinder the leftward movement of the movable plate 3100. Afterwards, the fourth compensation component 3340 is used to drive the movable plate 3100 to reset.
  • The above are just two simple shake compensation examples; for other shakes, the basic working process is the same as the principle described above.
  • The detection feedback of the shape memory alloy resistance and the detection feedback of the gyroscope are existing technologies and are not described in detail here.
  • The mechanical image stabilizer provided by this embodiment is not affected by external magnetic fields and has a good anti-shake effect; it can also accurately compensate the lens 1000 in the case of multiple shakes, with timely and accurate compensation, greatly improving the quality of the acquired images and simplifying subsequent 3D image processing.
  • the electronic device includes a mobile phone with the image acquisition device.
  • the mobile phone includes a stand.
  • The purpose of the mobile phone stand is that, because the image acquisition environment is uncertain, the phone needs to be supported and fixed with a stand in order to obtain more stable image quality.
  • The bracket 6000 in this embodiment includes a mobile phone mounting base 6100 and a retractable support rod 6200.
  • The support rod 6200 is connected to the middle portion of the mobile phone mounting base 6100 through a damping hinge, so that depending on the position to which the support rod 6200 is rotated, the bracket 6000 may form either a selfie stick structure or a mobile phone stand structure.
  • In use, the applicant found that the combination of the mobile phone mounting base 6100 and the support rod 6200 takes up a lot of space: even if the support rod 6200 is retractable, the mobile phone mounting base 6100 cannot undergo structural change and its volume cannot be further reduced, so the bracket cannot be put in a pocket or a small bag, which makes the bracket 6000 inconvenient to carry. Therefore, in this embodiment, a second improvement is made to the bracket 6000 so that its overall storability is further improved.
  • The mobile phone mounting base 6100 of this embodiment includes a retractable connection plate 6110 and folding plate groups 6120 installed at opposite ends of the connection plate 6110; the support rod 6200 is connected to the middle portion of the connection plate 6110 through a damping hinge.
  • Each folding plate group 6120 includes a first plate body 6121, a second plate body 6122, and a third plate body 6123. One of the two opposite ends of the first plate body 6121 is hinged to the connection plate 6110, the other end of the first plate body 6121 is hinged to one end of the second plate body 6122, and the other end of the second plate body 6122 is hinged to one end of the third plate body 6123; the second plate body 6122 is provided with an opening 6130 for inserting a corner of the mobile phone.
  • In the working state, the first plate body 6121, the second plate body 6122, and the third plate body 6123 are folded into a right-triangle state, with the second plate body 6122 as the hypotenuse and the first plate body 6121 and the third plate body 6123 as the right-angle sides; one side of the third plate body 6123 is attached side by side to one side of the connection plate 6110, and the other end of the third plate body 6123 abuts against one end of the first plate body 6121.
  • This structure places the three folding plates in a self-locking state, and when the two lower corners of the mobile phone are inserted into the two openings 6130 on both sides, the lower sides of the mobile phone 5000 are located in the two right triangles, and the fixing of the mobile phone 5000 is completed through the joint work of the mobile phone, the connection plate 6110, and the folding plate groups 6120.
  • The triangle state cannot be released by external force; only after the mobile phone is pulled out of the openings 6130 can the triangle state of the folding plate groups 6120 be released.
  • When the mobile phone mounting base 6100 is not in the working state, the connection plate 6110 is contracted to its minimum length, and the folding plate groups 6120 and the connection plate 6110 are folded against each other, so the user can fold the mobile phone mounting base 6100 to a minimum volume.
  • Together with the retractability of the support rod 6200, this allows the entire bracket 6000 to be stored in a minimal volume, which improves the storability of the bracket 6000; users can even put the bracket 6000 directly into a pocket or small handbag, which is very convenient.
  • Further, a first connecting portion is provided on one side of the third plate body 6123, and the side surface where the connection plate 6110 contacts the third plate body 6123 is provided with a first mating portion that mates with the first connecting portion.
  • In this embodiment, the first connecting portion is a convex strip or protrusion (not shown in the figure), and the first mating portion is a card slot (not shown in the figure) opened on the connection plate 6110.
  • This structure not only improves the stability of the folding plate groups 6120 in the triangle state, but also facilitates the connection between the folding plate groups 6120 and the connection plate 6110 when the mobile phone mounting base 6100 needs to be folded to the minimum state.
  • Similarly, a second connecting portion is provided at one of the opposite ends of the first plate body 6121, and the other end of the opposite ends of the third plate body 6123 is provided with a second mating portion that mates with the second connecting portion; the second connecting portion and the second mating portion are engaged with each other.
  • The second connecting portion may be a protrusion (not shown in the figure), and the second mating portion may be an opening 6130 or a card slot (not shown in the figure) that cooperates with the protrusion.
  • In addition, a base (not shown in the figure) can be detachably connected to the other end of the support rod 6200. When the bracket is used as a stand, the support rod 6200 can be stretched to a certain length, the bracket 6000 placed on a plane through the base, and the mobile phone then placed in the mobile phone mounting base 6100 to complete the fixing of the mobile phone. The detachable connection between the support rod 6200 and the base allows the two to be carried separately, further improving the storability and portability of the bracket 6000.
  • The device embodiments described above are only schematic. The modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical modules; they may be located in one place or distributed across multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the objective of the solution of this embodiment. Those of ordinary skill in the art can understand and implement them without creative labor.
  • An embodiment of the present invention provides a non-transitory computer-readable storage medium, where the storage medium stores computer-executable instructions, and when the computer-executable instructions are executed by an electronic device, the electronic device is caused to perform the light-based three-dimensional face optimization method in any of the method embodiments described above.
  • An embodiment of the present invention provides a computer program product, wherein the computer program product includes a computer program stored on a non-transitory computer-readable storage medium, the computer program includes program instructions, and when the program instructions are executed by an electronic device, the electronic device is caused to execute the light-based three-dimensional face optimization method in any of the foregoing method embodiments.
  • each embodiment can be implemented by means of software plus a necessary universal hardware platform, and of course, also by hardware.
  • The essence of the above technical solution, or the part that contributes to the existing technology, may be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium; a computer-readable medium includes any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer). For example, machine-readable media include read-only memory (ROM), random-access memory (RAM), magnetic disk storage media, optical storage media, flash storage media, and electrical, optical, acoustic, or other forms of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.). The computer software product includes a number of instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods of the various embodiments or certain parts of the embodiments.


Abstract

A light-based three-dimensional face optimization method and apparatus, and an electronic device. The method comprises: acquiring a target picture, and determining whether a facial image in the target picture is in a non-uniform light condition (S101); if the facial image is in a non-uniform light condition, inputting the facial image to a pre-trained image generation model to obtain an optimized facial image after light adjustment is performed on the facial image (S102); processing the optimized facial image based on a pre-trained convolutional neural network model to obtain first three-dimensional face model parameter information (S103); and processing a three-dimensional average face model according to the first three-dimensional face model parameter information to obtain a three-dimensional facial image corresponding to the facial image (S104). By means of the method, a facial image photographed under a poor lighting environment is optimized, so as to obtain a clear face; moreover, only a single picture is needed to generate a three-dimensional face, thus reducing the costs.

Description

Light-based three-dimensional face optimization method, apparatus, and electronic device
Technical Field
The present invention relates to the technical field of image processing, and in particular to a light-based three-dimensional face optimization method, apparatus, and electronic device.
Background Art
Three-dimensional face reconstruction has been widely applied in fields such as medicine, education, and entertainment. In the process of implementing the present invention, the inventors found that 3D face reconstruction typically stitches a 3D model together from many pictures taken at multiple angles; because a large number of pictures is required, the reconstruction process is cumbersome, complicated, and time-consuming. In addition, the reconstruction process needs to locate the key points of the face in the face image in order to generate their position information, and under poor lighting conditions (for example, backlighting or side lighting) the image information of the face is unclear, which leads to large errors in the finally generated 3D face. Furthermore, mobile devices such as mobile phones increasingly use 3D face reconstruction technology, and the large number of two-dimensional pictures required for it is often captured with the phone's camera; a phone is prone to shake during shooting, which degrades the quality of the acquired images and thus indirectly degrades the subsequent 3D face reconstruction.
Summary of the Invention
The light-based three-dimensional face optimization method, apparatus, and electronic device provided by the embodiments of the present invention are intended to solve at least the above problems in the related art.
One aspect of the embodiments of the present invention provides a light-based three-dimensional face optimization method, including:
determining whether a face image in an acquired target picture is unevenly illuminated;
if the face image is unevenly illuminated, performing light adjustment on the face image to obtain an optimized face image;
processing the optimized face image based on a pre-trained convolutional neural network model to obtain first three-dimensional face model parameter information; and
processing a three-dimensional average face model according to the first three-dimensional face model parameter information to obtain a three-dimensional face image corresponding to the face image.
Further, the step of determining whether the face image in the acquired target picture is unevenly illuminated includes:
obtaining a grayscale histogram of the target picture;
calculating a gray-level distribution variance of the target picture according to the grayscale histogram; and
comparing the gray-level distribution variance with a critical gray-level distribution variance, and when the gray-level distribution variance is greater than or equal to the critical gray-level distribution variance, determining that the face image in the target picture is unevenly illuminated.
Further, the step of performing light adjustment on the face image to obtain an optimized face image includes:
acquiring training samples and an initial image generation model, the training samples including a plurality of first images generated under non-frontal uniform light source conditions and, for each first image, a corresponding second image generated under frontal uniform light source conditions;
training the initial image generation model based on the training samples using a machine learning method to obtain an image generation model; and
performing light adjustment on the face image using the image generation model to obtain the optimized face image.
Further, training the initial image generation model based on the training samples using a machine learning method to obtain the image generation model includes:
inputting the first image into the initial image generation model to obtain an output optimized first image;
using the optimized first image and the second image as input to a discrimination network, training the discrimination network, and determining the parameters of the trained discrimination network;
using the first image as input to the initial image generation model and training the initial image generation model;
inputting the optimized first image output by the trained initial image generation model, together with the second image, into the trained discrimination network, and determining the loss function value of the trained discrimination network; and
when the loss function value converges, determining the initial image generation model to be the image generation model.
Further, the convolutional neural network model is trained through the following steps:
building a convolutional neural network model composed of a two-layer hourglass convolutional neural network;
acquiring a data set for training the convolutional neural network model, the data set including a number of two-dimensional face pictures and three-dimensional portrait scan data corresponding to the two-dimensional face pictures;
preprocessing the two-dimensional face pictures to obtain face feature point information;
inputting the face feature point information into the convolutional neural network model to obtain second three-dimensional face model parameter information; and
optimizing the parameters of the convolutional neural network using a cross-entropy loss function until the loss function between the second three-dimensional face model parameter information and the three-dimensional portrait scan data converges to a preset threshold.
Further, the target picture is acquired through an image acquisition device. The image acquisition device includes a lens, an autofocus voice coil motor, a mechanical image stabilizer, and an image sensor. The lens is fixed on the autofocus voice coil motor and is used to acquire images (pictures); the image sensor transmits the images acquired by the lens to the recognition module. The autofocus voice coil motor is mounted on the mechanical image stabilizer, and the processing module drives the mechanical image stabilizer according to feedback on lens shake detected by a gyroscope in the lens, thereby realizing shake compensation for the lens.
Further, the mechanical image stabilizer includes a movable plate, a base plate, and a compensation mechanism. The middle of each of the movable plate and the base plate is provided with a through hole through which the lens passes. The autofocus voice coil motor is mounted on the movable plate, the movable plate is mounted on the base plate, and the base plate is larger than the movable plate. Driven by the processing module, the compensation mechanism moves the movable plate and the lens on it so as to realize shake compensation for the lens. The compensation mechanism includes a first compensation component, a second compensation component, a third compensation component, and a fourth compensation component mounted around the base plate, wherein the first compensation component and the third compensation component are arranged opposite each other, the second compensation component and the fourth compensation component are arranged opposite each other, and the line between the first and third compensation components is perpendicular to the line between the second and fourth compensation components. Each of the first, second, third, and fourth compensation components includes a driving member, a rotating shaft, a one-way bearing, and a rotating ring gear. The driving member is controlled by the processing module and is drivingly connected to the rotating shaft so as to drive the rotating shaft to rotate; the rotating shaft is connected to the inner ring of the one-way bearing so as to drive the inner ring to rotate; the rotating ring gear is sleeved on the one-way bearing and connected to the outer ring of the one-way bearing, and the outer surface of the rotating ring gear is provided with a ring of external teeth along its circumferential direction. The bottom surface of the movable plate is provided with a plurality of rows of evenly spaced strip-shaped grooves; the strip-shaped grooves mesh with the external teeth, and the external teeth can slide along the length direction of the grooves. The rotatable direction of the one-way bearing of the first compensation component is opposite to that of the one-way bearing of the third compensation component, and the rotatable direction of the one-way bearing of the second compensation component is opposite to that of the one-way bearing of the fourth compensation component.
Further, four through mounting holes are provided around the fixing plate, and the one-way bearings and the rotating ring gears are mounted in the mounting holes.
Further, the driving member is a micro motor, the micro motor is electrically connected to the processing module, and the rotary output end of the micro motor is connected to the rotating shaft; or, the driving member includes a memory alloy wire and a crank link, one end of the memory alloy wire being fixed to the fixing plate and connected to the processing module through a circuit, and the other end of the memory alloy wire being connected to the rotating shaft through the crank link so as to drive the rotating shaft to rotate.
Further, the image acquisition device is arranged on a mobile phone, and the mobile phone includes a stand. The stand includes a mobile phone mount and a telescopic support rod. The mobile phone mount includes a telescopic connecting plate and folding plate groups mounted at the two opposite ends of the connecting plate; one end of the support rod is connected to the middle of the connecting plate by a damping hinge. Each folding plate group includes a first plate body, a second plate body, and a third plate body, wherein one of the two opposite ends of the first plate body is hinged to the connecting plate, and the other of the two opposite ends of the first plate body is hinged to one of the two opposite ends of the second plate body; the other of the two opposite ends of the second plate body is hinged to one of the two opposite ends of the third plate body; and the second plate body is provided with an opening into which a corner of the mobile phone is inserted. When the mobile phone mount is used to install a mobile phone, the first, second, and third plate bodies fold into a right-triangle configuration in which the second plate body forms the hypotenuse and the first and third plate bodies form the two legs; one side face of the third plate body lies flush against one side face of the connecting plate, and the other of the two opposite ends of the third plate body abuts against one of the two opposite ends of the first plate body.
Further, one side face of the third plate body is provided with a first connecting portion, and the side face of the connecting plate that fits against the third plate body is provided with a first mating portion that cooperates with the first connecting portion; when the mobile phone mount of the stand is used to install a mobile phone, the first connecting portion and the first mating portion are snap-connected.
Further, one of the two opposite ends of the first plate body is provided with a second connecting portion, and the other of the two opposite ends of the third plate body is provided with a second mating portion that cooperates with the second connecting portion; when the mobile phone mount of the stand is used to install a mobile phone, the second connecting portion and the second mating portion are snap-connected.
Further, a base is detachably connected to the other end of the support rod.
Another aspect of the embodiments of the present invention provides a light-based three-dimensional face optimization apparatus, including:
a judgment module, configured to determine whether a face image in an acquired target picture is unevenly illuminated;
an optimization module, configured to perform light adjustment on the face image to obtain an optimized face image if the face image is unevenly illuminated;
an acquisition module, configured to process the optimized face image based on a pre-trained convolutional neural network model to obtain first three-dimensional face model parameter information; and
a processing module, configured to process a three-dimensional average face model according to the first three-dimensional face model parameter information to obtain a three-dimensional face image corresponding to the face image.
Further, the judgment module is specifically configured to:
obtain a grayscale histogram of the target picture;
calculate a gray-level distribution variance of the target picture according to the grayscale histogram; and
compare the gray-level distribution variance with a critical gray-level distribution variance, and when the gray-level distribution variance is greater than or equal to the critical gray-level distribution variance, determine that the face image in the target picture is unevenly illuminated.
Further, the optimization module further includes a first training module, the first training module being configured to:
acquire training samples and an initial image generation model, the training samples including a plurality of first images generated under non-frontal uniform light source conditions and, for each first image, a corresponding second image generated under frontal uniform light source conditions;
train the initial image generation model based on the training samples using a machine learning method to obtain an image generation model; and
perform light adjustment on the face image using the image generation model to obtain the optimized face image.
Further, the first training module is further configured to:
input the first image into the initial image generation model to obtain an output optimized first image;
use the optimized first image and the second image as input to a discrimination network, train the discrimination network, and determine the parameters of the trained discrimination network;
use the first image as input to the initial image generation model and train the initial image generation model;
input the optimized first image output by the trained initial image generation model, together with the second image, into the trained discrimination network, and determine the loss function value of the trained discrimination network; and
when the loss function value converges, determine the initial image generation model to be the image generation model.
Further, the apparatus further includes a second training module, the second training module being configured to: build a convolutional neural network model composed of a two-layer hourglass convolutional neural network; acquire a data set for training the convolutional neural network model, the data set including a number of two-dimensional face pictures and three-dimensional portrait scan data corresponding to the two-dimensional face pictures; preprocess the two-dimensional face pictures to obtain face feature point information; input the face feature point information into the convolutional neural network model to obtain second three-dimensional face model parameter information; and optimize the parameters of the convolutional neural network using a cross-entropy loss function until the loss function between the second three-dimensional face model parameter information and the three-dimensional portrait scan data converges to a preset threshold.
Yet another aspect of the embodiments of the present invention provides an electronic device, including: at least one processor; and a memory communicatively connected to the at least one processor, wherein
the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform any one of the light-based three-dimensional face optimization methods of the embodiments of the present invention described above.
It can be seen from the above technical solutions that the light-based three-dimensional face optimization method, apparatus, and electronic device provided by the embodiments of the present invention optimize face images taken under poor lighting conditions (for example, backlighting or side lighting), thereby obtaining a clear face. At the same time, only a single picture is needed to generate a three-dimensional face image; the convolutional neural network model can automatically generate more accurate and realistic facial expressions and poses without additional hardware support, reducing costs in many respects.
Brief Description of the Drawings
In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some of the embodiments described in the embodiments of the present invention; for those of ordinary skill in the art, other drawings can also be obtained from these drawings.
FIG. 1 is a flowchart of a light-based three-dimensional face optimization method according to an embodiment of the present invention;
FIG. 2 is a flowchart of a light-based three-dimensional face optimization method according to an embodiment of the present invention;
FIG. 3 is a flowchart of a light-based three-dimensional face optimization method according to an embodiment of the present invention;
FIG. 4 is a structural diagram of a light-based three-dimensional face optimization apparatus according to an embodiment of the present invention;
FIG. 5 is a structural diagram of a light-based three-dimensional face optimization apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of the hardware structure of an electronic device for performing the light-based three-dimensional face optimization method provided by the method embodiments of the present invention;
FIG. 7 is a structural diagram of an image acquisition device according to an embodiment of the present invention;
FIG. 8 is a structural diagram of an optical image stabilizer according to an embodiment of the present invention;
FIG. 9 is an enlarged view of part A of FIG. 8;
FIG. 10 is a schematic bottom view of the movable plate of a micro memory alloy optical image stabilizer according to an embodiment of the present invention;
FIG. 11 is a structural diagram of a stand according to an embodiment of the present invention;
FIG. 12 is a schematic diagram of one state of the stand according to an embodiment of the present invention;
FIG. 13 is a schematic diagram of another state of the stand according to an embodiment of the present invention;
FIG. 14 is a structural diagram of the mount when connected to a mobile phone according to an embodiment of the present invention.
Detailed Description of the Embodiments
In order to enable those skilled in the art to better understand the technical solutions in the embodiments of the present invention, the technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art shall fall within the protection scope of the embodiments of the present invention.
Some embodiments of the present invention are described in detail below with reference to the drawings. Where no conflict arises, the following embodiments and the features in them may be combined with one another. FIG. 1 is a flowchart of a light-based three-dimensional face optimization method according to an embodiment of the present invention. As shown in FIG. 1, the light-based three-dimensional face optimization method provided by the embodiment of the present invention includes:
S101: Acquire a target picture and determine whether the face image in the target picture is in a non-uniform light condition.
In this step, the target picture may be a picture taken in real time or an image from a picture stored locally on the terminal. When the target picture is taken under backlight or side-light conditions, the face in the target picture lies in non-uniform light, which makes the facial features unclear and introduces errors into the generated three-dimensional face image. Therefore, in this step, after the target picture is acquired, it must first be determined whether the face image in the target picture is in a non-uniform light condition.
As an optional implementation of the embodiment of the present invention, when determining whether the face image is in a non-uniform light condition, the target picture is first processed to obtain its grayscale histogram;
the gray-level distribution variance of the target picture is then calculated from the grayscale histogram and compared with the critical gray-level distribution variance: when the gray-level distribution variance is greater than or equal to the critical variance, the face image in the target picture is determined to be in a non-uniform light condition.
Specifically, the grayscale histogram of a picture clearly represents the distribution of light and dark in the image, and this distribution is independent of the image content. In general, the grayscale histograms of backlit or side-lit scenes and of evenly lit scenes are completely different: in a backlit or side-lit scene, pixels pile up at the extremely bright and extremely dark gray levels, whereas in an evenly lit scene the pixels are concentrated mainly in the middle gray levels. Therefore, the gray-level distribution variance of the histogram of a backlit or side-lit scene is large, while that of an evenly lit scene is small.
Before this step, the critical variance of the gray-level distribution can be obtained from multiple pictures (backlit, side-lit, and evenly lit). A picture whose variance exceeds the critical variance is determined to be a backlit or side-lit picture, i.e., the target picture is in a non-uniform light condition; a picture whose variance is below the critical variance is determined to be evenly lit, i.e., the target picture is in a uniform light condition.
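As an illustration, the histogram test described above can be sketched in a few lines. The threshold value below is a hypothetical placeholder; in the method it would be calibrated from labelled backlit and evenly lit sample pictures.

```python
import numpy as np

def is_unevenly_lit(gray_image, critical_variance=2000.0):
    """Decide whether an image is backlit/side-lit from the spread of
    its grayscale histogram. `critical_variance` is a hypothetical
    threshold; the method calibrates it from labelled sample pictures."""
    hist, _ = np.histogram(gray_image, bins=256, range=(0, 256))
    p = hist / hist.sum()                        # normalized histogram
    levels = np.arange(256)
    mean = (p * levels).sum()                    # mean gray level
    variance = (p * (levels - mean) ** 2).sum()  # gray-level distribution variance
    return variance >= critical_variance

# A half-dark / half-bright frame concentrates pixels at the extreme
# gray levels and so has a large histogram variance; a uniform mid-gray
# frame has zero variance.
backlit = np.concatenate(
    [np.zeros(100, np.uint8), np.full(100, 255, np.uint8)]).reshape(10, 20)
even = np.full((10, 20), 128, np.uint8)
```

Any per-pixel gray-level statistic with the same "extremes vs. middle" behavior would serve; the variance is simply the one the method names.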
When it is determined that the face image is in a non-uniform light condition, step S102 is performed.
S102: Input the face image into a pre-trained image generation model to obtain an optimized face image in which the light of the face image has been adjusted.
Specifically, the optimized face image may be the face image as it would appear under uniform light source conditions, under which clear facial features can be obtained. It should be noted that the image generation model can be used to perform light adjustment on a face image captured under non-frontal uniform light source conditions so as to generate the face image under frontal uniform light source conditions.
As an example, the image generation model may be a model obtained in advance by training a model used for image processing (for example, an existing convolutional neural network model) on training samples using a machine learning method. The convolutional neural network may include convolutional layers, pooling layers, unpooling layers, and deconvolution layers; the last deconvolution layer of the convolutional neural network may output the optimized face image, which may be expressed as a three-channel RGB matrix and may have the same size as the face image in the target picture.
如图2所示,图像生成模型可以通过如下步骤进行训练:As shown in Figure 2, the image generation model can be trained by the following steps:
S1021,获取训练样本和初始图像生成模型(现有技术,此处不做赘述),所述训练样本包括多个在非正面均匀光源条件下生成的第一图像、以及在正面均匀光源条件下生成的与所述第一图像对应的第二图像。S1021: Obtain training samples and an initial image generation model (the prior art, which is not described herein), the training samples include multiple first images generated under a non-positive uniform light source condition, and generated under a front uniform light source condition. A second image corresponding to the first image.
S1022,利用机器学习方法,基于所述训练样本对所述初始图像生成模型进行训练,得到所述图像生成模型。S1022. Use a machine learning method to train the initial image generation model based on the training samples to obtain the image generation model.
作为本发明实施例的可选实施方式,可以确定一个初始图像生成模型以及该模型中的初始参数,并通过设置判别网络来对初始图像生成模型的输出结果进行评价修正。具体地,首先,将训练样本中的第一图像输入至初始图像生成模型中,得到该初始图像生成模型输出的优化第一图像;其次,将该优化第一图像、该优化第一图像对应的第二图像作为判别网络的输入,对该判别网络进行训练,确定并固定训练后的该判别网络的参数,将利用该参数评价修正修正后续的输出结果;再次,将上述第一图像作为初始图像生成模型的输入,对初始图像生成模型进行训练,不断对该模型的初始参数进行调整修正;最后,将训练后的初始图像生成模型输出的优化第一图像和该优化第一图像对应的第二图像输入至上述训练后的判别网络,确定该训练后的所述判别网络的损失函数值,当所述损失函数值收敛,将所述初始图像生成模型确定为 所述图像生成模型。As an optional implementation of the embodiment of the present invention, an initial image generation model and initial parameters in the model may be determined, and an output of the initial image generation model may be evaluated and corrected by setting a discriminant network. Specifically, first, the first image in the training sample is input into the initial image generation model to obtain an optimized first image output by the initial image generation model; second, the optimized first image and the corresponding first optimized image are The second image is used as the input of the discriminative network. The discriminative network is trained, the parameters of the discriminated network after training are determined and fixed, and the subsequent output results will be evaluated and corrected by using this parameter. Again, the first image will be used as the initial image. Generate the input of the model, train the initial image generation model, and continuously adjust and modify the initial parameters of the model; finally, the optimized first image output by the trained initial image generation model and the second corresponding to the optimized first image An image is input to the trained discrimination network, and a loss function value of the trained discrimination network is determined. When the loss function value converges, the initial image generation model is determined as the image generation model.
The value of the loss function characterizes the degree of difference between the optimized first image output by the image generation model and the second image: the smaller the loss, the smaller the difference. For example, the loss function may be a Euclidean distance function, a hinge loss function, or the like.
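As a minimal illustration of the Euclidean-distance variant of this loss (function name and shapes are hypothetical), the pixel-wise L2 distance between the generator's output and the well-lit reference can be computed as:

```python
import numpy as np

def euclidean_loss(optimized_first_image, second_image):
    """Pixel-wise Euclidean (L2) distance between the optimized
    first image and the reference second image."""
    diff = optimized_first_image.astype(float) - second_image.astype(float)
    return float(np.sqrt(np.sum(diff ** 2)))

# Identical images give zero loss; any difference increases it.
a = np.zeros((4, 4))
b = np.ones((4, 4))
print(euclidean_loss(a, a))  # 0.0
print(euclidean_loss(a, b))  # 4.0  (sqrt of 16 unit differences)
```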
S103. Process the optimized face image with a pre-trained convolutional neural network model to obtain first three-dimensional face model parameter information.
Specifically, the first three-dimensional face model parameter information includes face shape information and facial expression information. The face image obtained in step S102 is fed into the pre-trained convolutional neural network model, which outputs the first three-dimensional face model parameter information.
Before this step is performed, the convolutional neural network model must be trained. As shown in Fig. 3, training the convolutional neural network model may include the following steps:
S1031. Build a convolutional neural network model composed of a two-stage hourglass convolutional neural network.
S1032. Obtain a data set for training the convolutional neural network model; the data set includes a number of two-dimensional face pictures and the three-dimensional portrait scan data corresponding to those pictures.
It should be noted that steps S1031 and S1032 are not restricted to this order: the data set may be obtained first and the convolutional neural network model built afterwards, or the model may be built first and the data set obtained later; the present invention places no restriction here.
Specifically, the input sample data set in this step may be obtained by downloading pictures directly from the Internet or by photographing subjects deliberately; the deliberately taken pictures may include pictures of people of different ethnicities and pictures under different lighting conditions. The three-dimensional portrait scan data mainly includes the pose information of the face (such as its pitch, yaw, and roll angles), the shape parameters of the facial feature points, and the expression parameters of the facial feature points.
S1033. Preprocess the two-dimensional face pictures to obtain facial feature point information.
Specifically, the facial feature point information includes, but is not limited to, the coordinates of the facial feature points in the picture and their texture parameters (i.e., texture parameters over the RGB channels). The related art offers many methods for detecting a face in an image; for example, the extent of the face can be determined from the edge information and/or color information of the image. In this embodiment, predefined key points are detected, and the facial feature point information is determined from the detected key points. For example, the eyebrows, eyes, nose, face contour, and mouth in a face image are each made up of several such key points, so their positions and textures can be determined from the coordinates of those key points.
As an optional implementation of this step, a facial feature point recognition algorithm may be used to obtain the facial feature point information. Training this algorithm may include the following steps. First, a training set of a certain size is obtained, consisting of pictures annotated with facial feature point information. Second, this training set is used to form an initial regression function r0 and an initial training set. Third, the initial training set and initial regression function r0 are iterated to form the next training set and regression function rn, each iteration of the regression function being learned with a gradient boosting algorithm. When the n-th training set satisfies the convergence condition with respect to the facial feature point information of the training set, the corresponding regression function rn is the trained facial feature point recognition algorithm.
In this step, a face detection algorithm is applied to the picture to obtain the position of the face, and a bounding rectangle, for example (left, top, right, bottom), marks the extent of the face. The regression function of the trained feature point recognition algorithm then yields a first preset number of feature points for the input portrait photo, together with the coordinates (x_i, y_i) of each facial feature point, where i indexes the i-th detected feature point. The first preset number may be 68, covering the key points of the eyebrows, eyes, nose, mouth, and face contour. For each facial feature point, a texture parameter (R_i, G_i, B_i) representing a second preset number of surrounding pixels is formed from its coordinates (x_i, y_i) using a Gaussian weighting. Optionally, the second preset number may be 6, 8, and so on; the present invention places no limit here.
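A Gaussian-weighted texture parameter of the kind described can be sketched as follows. The patch size, sigma, and function name are assumptions for illustration; the text does not fix the exact weighting scheme.

```python
import numpy as np

def texture_param(img, x, y, k=1, sigma=1.0):
    """Gaussian-weighted mean RGB over the (2k+1) x (2k+1) patch
    centred on landmark (x, y); img is an H x W x 3 array."""
    ys, xs = np.mgrid[-k:k + 1, -k:k + 1]
    w = np.exp(-(xs ** 2 + ys ** 2) / (2 * sigma ** 2))
    w /= w.sum()                           # normalise the Gaussian weights
    patch = img[y - k:y + k + 1, x - k:x + k + 1, :].astype(float)
    return tuple((w[..., None] * patch).sum(axis=(0, 1)))  # (R_i, G_i, B_i)

img = np.full((10, 10, 3), 128.0)          # uniform grey test image
print(texture_param(img, 5, 5))  # (128.0, 128.0, 128.0) on a uniform image
```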
S1034. Input the facial feature point information into the convolutional neural network model to obtain second three-dimensional face model parameter information.
In this step, each input to the convolutional neural network algorithm is facial feature point information, which reflects the current face shape; the output of the algorithm is the second three-dimensional face model parameter p. The algorithm uses a convolutional neural network to fit the mapping from input to output; the network structure contains 4 convolutional layers, 3 pooling layers, and 2 fully connected layers. Multiple convolutional neural networks are cascaded until convergence on the training set; at each stage the currently predicted face shape is updated and used as the input to the next-stage convolutional neural network.
The first two convolutional layers of the network extract facial features with shared weights, and the last two extract facial features through local perception, further regressing a 256-dimensional feature vector and outputting a 234-dimensional feature vector, the second three-dimensional face model parameter p. This includes the face pose parameters [f, pitch, yaw, roll, t_2dx, t_2dy], the shape parameters α_id, and the expression parameters α_exp, where f is the scale factor, pitch, yaw, and roll are the tilt, deflection, and rotation angles, and t_2dx, t_2dy are offset terms.
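The 234-dimensional output vector can be unpacked into its three groups as below. The 6-value pose block follows the text; the 199/29 split between shape and expression coefficients is an assumption (a common 3D morphable model convention), since the text gives only the total of 234.

```python
import numpy as np

# Hypothetical layout of the 234-dim parameter vector p:
# [f, pitch, yaw, roll, t_2dx, t_2dy] then alpha_id then alpha_exp.
# The 199/29 split is assumed, not stated in the text.
N_POSE, N_ID, N_EXP = 6, 199, 29

def split_params(p):
    assert p.shape == (N_POSE + N_ID + N_EXP,)
    pose = p[:N_POSE]
    alpha_id = p[N_POSE:N_POSE + N_ID]
    alpha_exp = p[N_POSE + N_ID:]
    return pose, alpha_id, alpha_exp

p = np.arange(234.0)
pose, a_id, a_exp = split_params(p)
print(pose.shape, a_id.shape, a_exp.shape)  # (6,) (199,) (29,)
```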
S1035. Optimize the parameters of the convolutional neural network with a cross-entropy loss function until the loss between the second three-dimensional face model parameter information and the three-dimensional portrait scan data converges to a preset threshold.
In deep learning, the loss function reflects how well the model fits the data: the worse the fit, the larger the value of the loss function. Overall, after k (k = 0, 1, ..., K) iterations, a parameter p_k is obtained from changes to the initialized parameters; a neural network Net_K is trained on the three-dimensional portrait scan data described above to predict the parameter p, continuously updating p_k. The network is expressed mathematically as follows:
Δp_k = Net_K(I, PNCC(p_k))
Each iteration of the network model yields a better parameter p_(k+1) = p_k + Δp_k as the input of the next-stage network, whose structure is the same as Net_K, until the loss between p_(k+1) and the three-dimensional portrait scan data converges to a preset threshold, indicating that training of the convolutional neural network model is complete.
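The cascaded update p_(k+1) = p_k + Δp_k can be sketched with a mock regressor in place of the trained Net_K. The stand-in simply pulls p halfway toward a known target each step, so the loop converges; a real system would evaluate a trained network on the image I and PNCC(p_k).

```python
import numpy as np

def mock_net(image, p, target):
    """Stand-in for Net_K(I, PNCC(p_k)): returns a correction Δp_k.
    Here it moves p halfway toward a known target (toy behaviour)."""
    return 0.5 * (target - p)

target = np.array([1.0, 2.0, 3.0])   # "ground truth" parameters (toy)
p = np.zeros(3)                      # initial parameter p_0
for k in range(20):
    p = p + mock_net(None, p, target)   # p_{k+1} = p_k + Δp_k
print(np.round(p, 5))  # close to [1. 2. 3.]
```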
S104. Process the three-dimensional average face model according to the first three-dimensional face model parameter information to obtain the three-dimensional face image corresponding to the face image.
Faces have much in common: a normal face has one nose, two eyes, one mouth, and two ears, and their top-to-bottom, left-to-right order never changes. A three-dimensional average face model can therefore be built first. Because faces are highly similar, one normal face can always be morphed into another, and the average face model can be modified by computing the amount of change; this is the basis of three-dimensional face reconstruction.
Specifically, first, the three-dimensional average face model is processed according to the face shape information and the facial expression information to obtain an initial three-dimensional face model.
Specifically, the processing may follow the formula:
S = S_0 + A_id * α_id + A_exp * α_exp
In the above formula, S is the initial three-dimensional face model, S_0 is the average face model, A_id is the shape basis, α_id is the shape parameter, A_exp is the expression basis, and α_exp is the expression parameter. A_id and A_exp are each obtained in advance using existing algorithms.
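The linear combination S = S_0 + A_id * α_id + A_exp * α_exp is a straightforward matrix-vector computation. The sketch below uses toy dimensions (6 flattened vertex coordinates, 2 shape and 2 expression basis vectors); real bases such as those in L519 would have hundreds of columns.

```python
import numpy as np

S0 = np.zeros(6)               # average face, flattened (x, y, z per vertex); toy
A_id = np.eye(6)[:, :2]        # shape basis, toy: unit vectors e0, e1
A_exp = np.eye(6)[:, 2:4]      # expression basis, toy: unit vectors e2, e3
alpha_id = np.array([1.0, -2.0])
alpha_exp = np.array([0.5, 0.0])

# S = S0 + A_id * alpha_id + A_exp * alpha_exp
S = S0 + A_id @ alpha_id + A_exp @ alpha_exp
print(S)  # [ 1.  -2.   0.5  0.   0.   0. ]
```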
Second, the initial three-dimensional face model is adjusted according to the face pose information to obtain the three-dimensional face image corresponding to the face.
Specifically, the initial three-dimensional face model is projected onto the image plane by weak perspective projection to obtain the three-dimensional face image corresponding to the face, expressed by the formula:
V(p) = f * Pr * R * (S_0 + A_id α_id + A_exp α_exp) + t_2d
In the above formula, V(p) is the reconstructed three-dimensional face image corresponding to the face, f is the scale factor, Pr is the orthographic projection matrix, and R is the rotation matrix composed of the pitch, yaw, and roll angles, obtained from the pose information of the face in the two-dimensional image identified from the feature points.
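The projection V(p) = f * Pr * R * S + t_2d can be sketched as below. The Euler-angle composition order in the rotation matrix is an assumption, since the text only says R is built from pitch, yaw, and roll.

```python
import numpy as np

def rotation_matrix(pitch, yaw, roll):
    """R = Rz(roll) @ Ry(yaw) @ Rx(pitch); composition order assumed."""
    cx, sx = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    cz, sz = np.cos(roll), np.sin(roll)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def weak_perspective(S, f, pitch, yaw, roll, t2d):
    """V(p) = f * Pr * R * S + t_2d for a 3 x N vertex matrix S."""
    Pr = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])     # orthographic projection matrix
    return f * Pr @ rotation_matrix(pitch, yaw, roll) @ S + t2d.reshape(2, 1)

S = np.array([[1.0], [2.0], [3.0]])      # a single model vertex
V = weak_perspective(S, f=2.0, pitch=0.0, yaw=0.0, roll=0.0,
                     t2d=np.array([10.0, 20.0]))
print(V.ravel())  # [12. 24.]  (identity rotation: 2*[1,2] + [10,20])
```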
The light-based three-dimensional face optimization method provided by this embodiment of the invention optimizes face images taken under poor lighting (for example backlighting or side lighting), yielding a clear face. At the same time, a three-dimensional face image can be generated from only a single picture; the convolutional neural network model automatically generates more accurate and realistic facial expressions and poses without hardware support, reducing costs in many respects.
Fig. 4 is a structural diagram of a light-based three-dimensional face optimization apparatus provided by an embodiment of the present invention. As shown in Fig. 4, the apparatus specifically includes a judgment module 100, an optimization module 200, an acquisition module 300, and a processing module 400, where:
the judgment module 100 is configured to obtain a target picture and judge whether the face image in the target picture is under non-uniform lighting; the optimization module 200 is configured to, if the face image is under non-uniform lighting, input the face image into a pre-trained image generation model to obtain an optimized face image with adjusted lighting; the acquisition module 300 is configured to process the optimized face image with a pre-trained convolutional neural network model to obtain first three-dimensional face model parameter information; and the processing module 400 is configured to process a three-dimensional average face model according to the first three-dimensional face model parameter information to obtain the three-dimensional face image corresponding to the face image.
The light-based three-dimensional face optimization apparatus provided by this embodiment of the invention is specifically configured to execute the method provided by the embodiment shown in Fig. 1; its implementation principles, methods, and functional uses are similar to those of the embodiment shown in Fig. 1 and are not repeated here.
Fig. 5 is a structural diagram of a light-based three-dimensional face optimization apparatus provided by an embodiment of the present invention. As shown in Fig. 5, the apparatus specifically includes a first training module 500, a second training module 600, a judgment module 100, an optimization module 200, an acquisition module 300, and a processing module 400, where:
the judgment module 100 is configured to obtain a target picture and judge whether the face image in the target picture is under non-uniform lighting; the optimization module 200 is configured to, if the face image is under non-uniform lighting, input the face image into a pre-trained image generation model to obtain an optimized face image with adjusted lighting; the acquisition module 300 is configured to process the optimized face image with a pre-trained convolutional neural network model to obtain first three-dimensional face model parameter information; and the processing module 400 is configured to process a three-dimensional average face model according to the first three-dimensional face model parameter information to obtain the three-dimensional face image corresponding to the face image.
The first training module 500 is configured to obtain training samples and an initial image generation model, the training samples including a plurality of first images generated under non-frontal uniform lighting and second images, corresponding to the first images, generated under frontal uniform lighting; and to train the initial image generation model on the training samples with a machine learning method to obtain the image generation model.
The second training module 600 is configured to build a convolutional neural network model composed of a two-stage hourglass convolutional neural network; obtain a data set for training the convolutional neural network model, the data set including a number of two-dimensional face pictures and the three-dimensional portrait scan data corresponding to them; preprocess the two-dimensional face pictures to obtain facial feature point information; input the facial feature point information into the convolutional neural network model to obtain second three-dimensional face model parameter information; and optimize the parameters of the convolutional neural network with a cross-entropy loss function until the loss between the second three-dimensional face model parameter information and the three-dimensional portrait scan data converges to a preset threshold.
Optionally, the judgment module 100 is configured to process the target picture to obtain its grayscale histogram; calculate the gray-level distribution variance of the target picture from the grayscale histogram; and compare the gray-level distribution variance with a critical gray-level distribution variance, determining that the face image in the target picture is under non-uniform lighting when the gray-level distribution variance is greater than or equal to the critical gray-level distribution variance.
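The histogram-variance test performed by the judgment module 100 can be sketched as follows. The critical-variance value is hypothetical; the text does not specify one.

```python
import numpy as np

def is_non_uniform(gray_img, critical_variance=2000.0):
    """Build the 256-bin grey-level histogram, compute the variance of
    the grey-level distribution, and compare with a critical variance.
    The threshold of 2000.0 is an illustrative assumption."""
    hist, _ = np.histogram(gray_img, bins=256, range=(0, 256))
    prob = hist / hist.sum()
    levels = np.arange(256)
    mean = (levels * prob).sum()
    variance = (((levels - mean) ** 2) * prob).sum()
    return variance >= critical_variance

flat = np.full((8, 8), 128, dtype=np.uint8)           # evenly lit patch
split = np.hstack([np.zeros((8, 4), np.uint8),        # half dark,
                   np.full((8, 4), 255, np.uint8)])   # half bright
print(is_non_uniform(flat), is_non_uniform(split))  # False True
```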
Optionally, the first training module 500 is further configured to input the first image into the initial image generation model to obtain the output optimized first image; use the optimized first image and the second image as inputs to the discriminant network, train the discriminant network, and determine the parameters of the trained discriminant network; use the first image as input to the initial image generation model and train the initial image generation model; input the optimized first image output by the trained initial image generation model and the second image into the trained discriminant network and determine the loss function value of the trained discriminant network; and, when the loss function value converges, determine the initial image generation model as the image generation model.
The light-based three-dimensional face optimization apparatus provided by this embodiment of the invention is specifically configured to execute the methods provided by the embodiments shown in Figs. 1-3; its implementation principles, methods, and functional uses are similar to those of the embodiments shown in Figs. 1-3 and are not repeated here.
The light-based three-dimensional face optimization apparatuses of the above embodiments of the invention may be set independently in the above electronic device as a software or hardware functional unit, or integrated in the processor as one of its functional modules, to execute the light-based three-dimensional face optimization method of the embodiments of the invention.
Fig. 6 is a schematic diagram of the hardware structure of an electronic device that executes the light-based three-dimensional face optimization method provided by a method embodiment of the present invention. As shown in Fig. 6, the electronic device includes:
one or more processors 610 and a memory 620 (one processor 610 is taken as an example in Fig. 6).
The device executing the light-based three-dimensional face optimization method may further include an input apparatus 630 and an output apparatus 640.
The processor 610, the memory 620, the input apparatus 630, and the output apparatus 640 may be connected by a bus or in other ways; connection by a bus is taken as an example in Fig. 6.
As a non-volatile computer-readable storage medium, the memory 620 can store non-volatile software programs, non-volatile computer-executable programs, and modules, such as the program instructions/modules corresponding to the light-based three-dimensional face optimization method in the embodiments of the present invention. By running the non-volatile software programs, instructions, and modules stored in the memory 620, the processor 610 executes the various functional applications and data processing of the server, i.e., implements the light-based three-dimensional face optimization method.
The memory 620 may include a program storage area and a data storage area, where the program storage area can store the operating system and the application programs required by at least one function, and the data storage area can store data created by the use of the light-based three-dimensional face optimization apparatus provided by the embodiments of the present invention, and so on. In addition, the memory 620 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, the memory 620 may optionally include memory set remotely relative to the processor 610; such remote memory may be connected to the light-based three-dimensional face optimization apparatus through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input apparatus 630 can receive input numeric or character information and generate key signal inputs related to the user settings and function control of the light-based three-dimensional face optimization apparatus. The input apparatus 630 may include devices such as a pressing module.
The one or more modules are stored in the memory 620 and, when executed by the one or more processors 610, execute the light-based three-dimensional face optimization method.
The electronic devices of the embodiments of the present invention exist in various forms, including but not limited to:
(1) Mobile communication devices: these devices are characterized by mobile communication functions, with voice and data communication as their main goal. Such terminals include smart phones (e.g. the iPhone), multimedia phones, feature phones, and low-end phones.
(2) Ultra-mobile personal computer devices: these belong to the category of personal computers, have computing and processing functions, and generally also have mobile Internet access. Such terminals include PDA, MID, and UMPC devices, e.g. the iPad.
(3) Portable entertainment devices: these can display and play multimedia content. They include audio and video players (e.g. the iPod), handheld game consoles, e-books, smart toys, and portable car navigation devices.
(4) Other electronic apparatuses with data interaction functions.
Preferably, the electronic device is provided with an image acquisition device for capturing images, and the image acquisition device is usually fitted with a software or hardware image stabilizer to guarantee the quality of the captured images. Most existing stabilizers drive the lens with the Lorentz force produced by an energized coil in a magnetic field; to achieve optical stabilization, the lens must be driven in at least two directions, which means arranging multiple coils. This poses a challenge to miniaturization of the overall structure, and the coils are easily disturbed by external magnetic fields, degrading the stabilization effect. Chinese patent publication CN106131435A therefore provides a miniature optical image-stabilizing camera module in which a memory alloy wire is stretched and shortened through temperature changes, thereby pulling an auto-focus voice coil motor to compensate lens shake. The control chip of the miniature memory-alloy optical stabilizing actuator can vary the drive signal to change the temperature of the memory alloy wire, thus controlling its elongation and shortening, and the position and travel of the actuator are calculated from the resistance of the wire. When the actuator has moved to the specified position, the resistance of the memory alloy wire at that moment is fed back, and by comparing the deviation between this resistance value and the target value, the movement deviation of the miniature memory-alloy optical stabilizing actuator can be corrected.
However, the applicant found that, because of the randomness and uncertainty of shake, the structure of the above technical solution alone cannot compensate the lens accurately when shake occurs repeatedly: a shape memory alloy needs a certain time both to heat up and to cool down. When shake occurs in a first direction, the above solution can compensate the lens for that shake, but when shake in a second direction follows, the memory alloy wire cannot deform in an instant, so compensation is easily delayed, and lens-shake compensation for repeated shake and for continuous shake in different directions cannot be achieved accurately. Its structure therefore needs to be improved to obtain better image quality and thus facilitate the subsequent generation of three-dimensional images.
With reference to Figs. 8-10, this embodiment improves on the optical stabilizer by designing it as a mechanical stabilizer 3000, whose specific structure is as follows:
The mechanical stabilizer 3000 of this embodiment includes a movable plate 3100, a base plate 3200, and a compensation mechanism 3300. The middles of the movable plate 3100 and the base plate 3200 are each provided with a through hole through which the lens 1000 passes. The auto-focus voice coil motor 2000 is mounted on the movable plate 3100, and the movable plate 3100 is mounted on the base plate 3200, the base plate 3200 being larger than the movable plate 3100; the up-and-down movement of the movable plate 3100 is limited by the auto-focus voice coil motor above it. Driven by the processing module, the compensation mechanism 3300 moves the movable plate 3100 and the lens 1000 on it, thereby compensating for shake of the lens 1000.
Specifically, the compensation mechanism 3300 of this embodiment includes a first compensation assembly 3310, a second compensation assembly 3320, a third compensation assembly 3330, and a fourth compensation assembly 3340 mounted around the base plate 3200. The first compensation assembly 3310 and the third compensation assembly 3330 are arranged opposite each other, and the second compensation assembly 3320 and the fourth compensation assembly 3340 are arranged opposite each other; the line between the first compensation assembly 3310 and the third compensation assembly 3330 is perpendicular to the line between the second compensation assembly 3320 and the fourth compensation assembly 3340. That is, the first, second, third, and fourth compensation assemblies are laid out on the front, rear, left, and right of the movable plate 3100: the first compensation assembly 3310 can move the movable plate 3100 forward, the third compensation assembly 3330 can move it backward, the second compensation assembly 3320 can move it to the left, and the fourth compensation assembly 3340 can move it to the right. Moreover, the first compensation assembly 3310 can cooperate with the second compensation assembly 3320 or the fourth compensation assembly 3340 to move the movable plate 3100 in an oblique direction, and the third compensation assembly 3330 can likewise cooperate with the second compensation assembly 3320 or the fourth compensation assembly 3340 to move the movable plate 3100 in an oblique direction, so that the lens 1000 can be compensated in every shake direction.
具体的，本实施例的所述第一补偿组件3310、第二补偿组件3320、第三补偿组件3330以及第四补偿组件3340均包括驱动件3301、转轴3302、单向轴承3303以及转动齿圈3304。所述驱动件3301受控于所述处理模块，所述驱动件3301与所述转轴3302传动连接，以带动所述转轴3302转动。所述转轴3302与所述单向轴承3303的内圈相连接，以带动所述单向轴承3303的内圈转动；所述转动齿圈3304套设在所述单向轴承3303上并与所述单向轴承3303的外圈固定连接，所述转动齿圈3304的外表面沿其周向设有一圈外齿，所述活动板3100的底面设有多排均匀间隔布设的条形槽3110，所述条形槽3110与所述外齿相啮合，且所述外齿可沿所述条形槽3110的长度方向滑动；其中，所述第一补偿组件3310的单向轴承3303的可转动方向与所述第三补偿组件3330的单向轴承3303的可转动方向相反，所述第二补偿组件3320的单向轴承3303的可转动方向与所述第四补偿组件3340的单向轴承3303的可转动方向相反。Specifically, the first compensation component 3310, the second compensation component 3320, the third compensation component 3330, and the fourth compensation component 3340 in this embodiment each include a driving member 3301, a rotating shaft 3302, a one-way bearing 3303, and a rotating ring gear 3304. The driving member 3301 is controlled by the processing module and is drivingly connected to the rotating shaft 3302 to drive it to rotate. The rotating shaft 3302 is connected to the inner ring of the one-way bearing 3303 to drive that inner ring; the rotating ring gear 3304 is sleeved on the one-way bearing 3303 and fixedly connected to its outer ring. The outer surface of the rotating ring gear 3304 is provided with a ring of external teeth along its circumference, and the bottom surface of the movable plate 3100 is provided with several rows of evenly spaced strip grooves 3110. The strip grooves 3110 mesh with the external teeth, and the external teeth can slide along the length direction of the strip grooves 3110. The rotatable direction of the one-way bearing 3303 of the first compensation component 3310 is opposite to that of the one-way bearing 3303 of the third compensation component 3330, and the rotatable direction of the one-way bearing 3303 of the second compensation component 3320 is opposite to that of the one-way bearing 3303 of the fourth compensation component 3340.
单向轴承3303是在一个方向上可以自由转动，而在另一个方向上锁死的一种轴承，当需要使得活动板3100向前移动时，第一补偿组件3310的驱动件3301使得转轴3302带动单向轴承3303的内圈转动，此时，单向轴承3303处于锁死状态，因此单向轴承3303的内圈可以带动外圈转动，进而带动转动齿圈3304转动，转动齿圈3304通过与条形槽3110的啮合带动活动板3100向可以补偿抖动的方向运动；当抖动补偿后需要活动板3100复位时，可以通过第三补偿组件3330带动活动板3100复位，第三补偿组件3330的运行过程与第一补偿组件3310同理，此时，第一补偿组件3310的单向轴承3303处于可转动状态，因此第一补偿组件3310上的齿圈处于与活动板3100随动的状态，不会影响活动板3100的复位。The one-way bearing 3303 is a bearing that rotates freely in one direction and locks in the other. When the movable plate 3100 needs to move forward, the driving member 3301 of the first compensation component 3310 makes the rotating shaft 3302 drive the inner ring of the one-way bearing 3303. At this moment the one-way bearing 3303 is in its locked state, so its inner ring drives the outer ring, which in turn drives the rotating ring gear 3304; through its engagement with the strip grooves 3110, the rotating ring gear 3304 moves the movable plate 3100 in the direction that compensates the shake. When the movable plate 3100 needs to be reset after compensation, the third compensation component 3330 drives it back; the third compensation component 3330 operates in the same way as the first compensation component 3310. At this moment, the one-way bearing 3303 of the first compensation component 3310 is in its freely rotatable state, so the ring gear of the first compensation component 3310 simply follows the movable plate 3100 and does not hinder its reset.
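The direction-selection logic described above (each one-way bearing locks for only one rotation direction, so each compensation component can drive the movable plate in exactly one direction, and the facing component resets it) can be sketched as a small Python routine. The component naming and the direction mapping below follow the embodiment's description, but the function and dictionary names are illustrative assumptions, not part of the patent.

```python
# Minimal sketch of the compensation-component selection logic: component
# 3310 drives the plate forward, 3330 backward, 3320 left, 3340 right.
# Each component's one-way bearing locks only for its drive direction,
# so the opposite component is used to reset the plate afterwards.
DRIVE = {
    "forward": "component_3310",
    "backward": "component_3330",
    "left": "component_3320",
    "right": "component_3340",
}
OPPOSITE = {
    "forward": "backward",
    "backward": "forward",
    "left": "right",
    "right": "left",
}

def compensate(plate_direction, distance):
    """Return the (drive, reset) component pair for one compensation move.

    plate_direction is the motion the movable plate must perform to cancel
    the detected shake; distance would come from the gyroscope feedback.
    """
    drive = DRIVE[plate_direction]
    reset = DRIVE[OPPOSITE[plate_direction]]
    return drive, reset

# The worked example below: one forward compensation, then one leftward one.
print(compensate("forward", 0.2))  # ('component_3310', 'component_3330')
print(compensate("left", 0.1))     # ('component_3320', 'component_3340')
```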
优选的，为了降低整个机械防抖器3000的整体厚度，本实施例在所述固定板的四周开设有四个贯穿的安装孔(图中未示出)，所述安装孔内安装有所述单向轴承3303和所述转动齿圈3304，通过将单向轴承3303和转动齿圈3304的部分隐藏在安装孔内，以降低整个机械防抖器3000的整体厚度。或者直接将整个补偿组件的部分置于所述安装孔内。Preferably, in order to reduce the overall thickness of the mechanical image stabilizer 3000, in this embodiment four through mounting holes (not shown in the drawings) are provided around the fixing plate, and the one-way bearing 3303 and the rotating ring gear 3304 are installed in these mounting holes. By hiding part of the one-way bearing 3303 and the rotating ring gear 3304 inside the mounting holes, the overall thickness of the mechanical image stabilizer 3000 is reduced. Alternatively, part of the entire compensation assembly may be placed directly in the mounting hole.
具体的，本实施例的所述驱动件3301可以是微型电机，所述微型电机与所述处理模块电连接，所述微型电机的转动输出端与所述转轴3302相连接，所述微型电机受控于所述处理模块。或者，所述驱动件3301由记忆合金丝和曲柄连杆组成，所述记忆合金丝一端固定于所述固定板上，并与所述处理模块通过电路相连接，所述记忆合金丝另一端通过所述曲柄连杆与所述转轴3302相连接，以带动所述转轴3302转动，具体为处理模块根据陀螺仪的反馈计算出记忆合金丝的伸长量，并驱动相应的电路对该形状记忆合金丝进行升温，该形状记忆合金丝伸长带动曲柄连杆机构运动，曲柄连杆机构的曲柄带动转轴3302转动，使得单向轴承3303的内圈转动，单向轴承3303处于锁死状态时，内圈带动外圈转动，转动齿圈3304通过条形槽3110带动活动板3100运动。Specifically, the driving member 3301 of this embodiment may be a micro motor that is electrically connected to and controlled by the processing module, with its rotary output end connected to the rotating shaft 3302. Alternatively, the driving member 3301 may consist of a memory alloy wire and a crank linkage: one end of the memory alloy wire is fixed to the fixing plate and connected to the processing module through a circuit, and the other end is connected to the rotating shaft 3302 through the crank linkage so as to drive the shaft. Concretely, the processing module calculates the required elongation of the memory alloy wire from the gyroscope feedback and drives the corresponding circuit to heat the shape memory alloy wire; the elongating wire drives the crank linkage, whose crank rotates the shaft 3302 and thus the inner ring of the one-way bearing 3303. When the one-way bearing 3303 is locked, the inner ring drives the outer ring, and the rotating ring gear 3304 moves the movable plate 3100 through the strip grooves 3110.
下面结合上述结构对本实施例的机械防抖器3000的工作过程进行详细的描述，以镜头1000两次抖动为例，两次抖动方向相反，且需要使得活动板3100向前运动补偿一次，并随后向左运动补偿一次。需要活动板3100向前运动补偿时，陀螺仪事先将检测到的镜头1000抖动方向和距离反馈给所述处理模块，处理模块计算出需要活动板3100的运动距离，进而驱动第一补偿组件3310的驱动件3301使得转轴3302带动单向轴承3303的内圈转动，此时，单向轴承3303处于锁死状态，因此内圈可以带动外圈转动，进而带动转动齿圈3304转动，转动齿圈3304通过条形槽3110带动活动板3100向前运动，随后第三补偿组件3330带动活动板3100复位。需要活动板3100向左运动补偿时，陀螺仪事先将检测到的镜头1000抖动方向和距离反馈给所述处理模块，处理模块计算出需要活动板3100的运动距离，进而驱动第二补偿组件3320的驱动件3301使得转轴3302带动单向轴承3303的内圈转动，此时，单向轴承3303处于锁死状态，因此内圈可以带动外圈转动，进而带动转动齿圈3304转动，转动齿圈3304通过条形槽3110带动活动板3100向左运动，而且由于转动齿圈3304的外齿可沿所述条形槽3110的长度方向滑动，在活动板3100向左运动时，活动板3100与第一补偿组件3310和第三补偿组件3330之间为滑动配合，不会影响活动板3100向左运动，在补偿结束后，再通过第四补偿组件3340带动活动板3100复位。The working process of the mechanical image stabilizer 3000 of this embodiment is described in detail below in combination with the above structure, taking two shakes of the lens 1000 as an example: the two shakes are in opposite directions, requiring the movable plate 3100 to be compensated once forward and then once leftward. When forward compensation is needed, the gyroscope feeds the detected shake direction and distance of the lens 1000 to the processing module, which calculates the required travel of the movable plate 3100 and drives the driving member 3301 of the first compensation component 3310, so that the rotating shaft 3302 turns the inner ring of the one-way bearing 3303. The one-way bearing 3303 is locked at this moment, so the inner ring drives the outer ring and hence the rotating ring gear 3304, which moves the movable plate 3100 forward through the strip grooves 3110; afterwards the third compensation component 3330 drives the movable plate 3100 back to its rest position. When leftward compensation is needed, the gyroscope likewise feeds the detected shake direction and distance to the processing module, which calculates the required travel and drives the driving member 3301 of the second compensation component 3320, so that the rotating shaft 3302 turns the inner ring of the one-way bearing 3303; the locked bearing drives the rotating ring gear 3304, which moves the movable plate 3100 leftward through the strip grooves 3110. Because the external teeth of the rotating ring gear 3304 can slide along the length of the strip grooves 3110, while the movable plate 3100 moves left it is only in sliding fit with the first compensation component 3310 and the third compensation component 3330, so its leftward movement is unhindered; after the compensation, the fourth compensation component 3340 drives the movable plate 3100 back to its rest position.
当然上述仅仅为简单的两次抖动，当发生多次抖动时，或者抖动的方向并非往复运动时，可以通过驱动多个补偿组件以补偿抖动，其基础工作过程与上述描述原理相同，这里不过多赘述，另外关于形状记忆合金电阻的检测反馈、陀螺仪的检测反馈等均为现有技术，这里也不过多描述。Of course, the above covers only two simple shakes. When multiple shakes occur, or when the shake direction is not a simple back-and-forth, several compensation components can be driven together to compensate; the basic working process follows the same principle described above and is not repeated here. In addition, the detection feedback of the shape memory alloy resistance and the detection feedback of the gyroscope are existing technologies and are likewise not described further here.
结合上述说明可知，本实施例提供的机械补偿器不仅不会受到外界磁场干扰，防抖效果好，而且可以实现在多次抖动发生的情况下能够对镜头1000进行精确的补偿，补偿及时准确，大大改善了获取图像的质量，简化了后续三维图像的处理难度。From the above description it can be seen that the mechanical compensator provided by this embodiment is immune to external magnetic interference and has a good anti-shake effect; moreover, it can compensate the lens 1000 accurately even when multiple shakes occur, and the compensation is timely and precise, which greatly improves the quality of the acquired image and simplifies the subsequent three-dimensional image processing.
进一步地，电子设备包括带有所述图像获取设备的手机。该手机包括支架，手机支架的目的是由于图像获取环境的不确定性，因此需要使用支架对手机进行支撑和固定，以期获得更稳定的图像质量。Further, the electronic device includes a mobile phone equipped with the image acquisition device. The mobile phone is used with a stand: because of the uncertainty of the image acquisition environment, the phone needs to be supported and fixed by a stand in order to obtain more stable image quality.
另外，申请人发现，现有的手机支架仅仅具有支撑手机的功能，而不具有自拍杆的功能，因此申请人对支架做出第一步改进，将手机支架6000和支撑杆6200相结合，结合附图11所示，本实施例的所述支架6000包括手机安装座6100和可伸缩的支撑杆6200，支撑杆6200与手机安装座6100的中部(具体为下述基板3200的中部)通过阻尼铰链相连接，使得支撑杆6200在转动至图12的状态时，支架6000可形成自拍杆结构，而支撑杆6200在转动至图13的状态时，支架6000可形成手机支架结构。In addition, the applicant found that existing mobile phone stands only support the phone and cannot serve as a selfie stick, so the applicant made a first improvement to the stand by combining the phone stand 6000 with a support rod 6200. As shown in FIG. 11, the bracket 6000 of this embodiment includes a mobile phone mount 6100 and a retractable support rod 6200; the support rod 6200 is connected to the middle of the mobile phone mount 6100 (specifically, the middle of the base plate 3200 described below) through a damping hinge, so that when the support rod 6200 is rotated to the state of FIG. 12 the bracket 6000 forms a selfie-stick structure, and when it is rotated to the state of FIG. 13 the bracket 6000 forms a phone-stand structure.
而结合上述支架结构申请人又发现，手机安装座6100与支撑杆6200结合后占用空间较大，即使支撑杆6200可伸缩，但是手机安装座6100无法进行结构的变化，体积不会进一步缩小，无法将其放入衣兜或者小型的包内，造成支架6000携带不便的问题，因此本实施例对支架6000做出第二步改进，使得支架6000的整体收容性得到进一步的提高。In combination with the above bracket structure, the applicant further found that the combined mobile phone mount 6100 and support rod 6200 occupy considerable space: even though the support rod 6200 is retractable, the mobile phone mount 6100 cannot change its structure, so the volume cannot be reduced further and the bracket cannot be put into a pocket or a small bag, making the bracket 6000 inconvenient to carry. Therefore, this embodiment makes a second improvement to the bracket 6000 so that its overall stowability is further enhanced.
结合图12-14所示，本实施例的所述手机安装座6100包括可伸缩的连接板6110和安装于连接板6110相对两端的折叠板组6120，所述支撑杆6200与所述连接板6110中部通过阻尼铰链相连接；所述折叠板组6120包括第一板体6121、第二板体6122及第三板体6123，其中，所述第一板体6121的相对两端中的一端与所述连接板6110相铰接，所述第一板体6121的相对两端中的另一端与所述第二板体6122的相对两端中的一端相铰接；所述第二板体6122相对两端的另一端与所述第三板体6123相对两端中的一端相铰接；所述第二板体6122设有供手机边角插入的开口6130。As shown in FIGS. 12-14, the mobile phone mount 6100 of this embodiment includes a retractable connecting plate 6110 and folding plate groups 6120 mounted at opposite ends of the connecting plate 6110; the support rod 6200 is connected to the middle of the connecting plate 6110 through a damping hinge. Each folding plate group 6120 includes a first plate body 6121, a second plate body 6122, and a third plate body 6123. One end of the first plate body 6121 is hinged to the connecting plate 6110, and its other end is hinged to one end of the second plate body 6122; the other end of the second plate body 6122 is hinged to one end of the third plate body 6123. The second plate body 6122 is provided with an opening 6130 into which a corner of the mobile phone is inserted.
结合附图14所示，所述手机安装座6100用于安装手机时，所述第一板体6121、第二板体6122和第三板体6123折叠呈直角三角形状态，所述第二板体6122为直角三角形的斜边，所述第一板体6121和所述第三板体6123为直角三角形的直角边，其中，所述第三板体6123的一个侧面与所述连接板6110的一个侧面并排贴合，所述第三板体6123相对两端中的另一端与所述第一板体6121相对两端中的一端相抵，该结构可以使得三个折叠板处于自锁状态，并且将手机下部的两个边角插入到两侧的两个开口6130时，手机5000的下部两侧位于两个直角三角形内，通过手机、连接板6110和折叠板组6120的共同作用可以完成手机5000的固定，三角形状态在外力情况下无法打开，只有从开口6130抽出手机后才能解除折叠板组6120的三角形状态。As shown in FIG. 14, when the mobile phone mount 6100 is used to hold a phone, the first plate body 6121, the second plate body 6122, and the third plate body 6123 fold into a right-triangle configuration: the second plate body 6122 forms the hypotenuse, while the first plate body 6121 and the third plate body 6123 form the two legs. One side face of the third plate body 6123 lies flat against a side face of the connecting plate 6110, and the far end of the third plate body 6123 abuts one end of the first plate body 6121, so that the three folding plates are self-locking. When the two lower corners of the phone are inserted into the two openings 6130 on either side, the lower sides of the phone 5000 sit inside the two right triangles, and the phone 5000 is held fixed by the combined action of the phone, the connecting plate 6110, and the folding plate groups 6120. The triangle state cannot be opened by external force; it can only be released by withdrawing the phone from the openings 6130.
而在手机安装座6100不处于工作状态时，将连接板6110缩小至最小长度，并且将折叠板组6120与连接板6110相互折叠，用户可以将手机安装座6100折叠呈最小体积，而由于支撑杆6200的可伸缩性，因此可以将整个支架6000收容呈体积最小的状态，提高了支架6000的收容性，用户甚至可以直接将支架6000放入衣兜或小的手包内，十分方便。When the mobile phone mount 6100 is not in use, the connecting plate 6110 is retracted to its minimum length and the folding plate groups 6120 are folded against it, so the user can fold the mount 6100 to its minimum volume; and since the support rod 6200 is telescopic, the entire bracket 6000 can be stowed in its smallest state. This improves the stowability of the bracket 6000, and users can even put the bracket 6000 directly into a pocket or a small handbag, which is very convenient.
优选的，本实施例还在所述第三板体6123的一个侧面设有第一连接部，所述连接板6110与所述第三板体6123相贴合的侧面设有与所述第一连接部相配合的第一配合部，所述支架6000的手机安装座6100用于安装手机时，所述第一连接部和所述第一配合部卡合连接。具体的，本实施例的第一连接部为一个凸条或凸起(图中未示出)，第一配合部为开设在连接板6110上的卡槽(图中未示出)。该结构不仅提高了折叠板组6120处于三角形状态时的稳定性，而且在需要将手机安装座6100折叠至最小状态时也便于折叠板组6120与连接板6110的连接。Preferably, in this embodiment a first connecting portion is provided on one side face of the third plate body 6123, and the side face of the connecting plate 6110 that fits against the third plate body 6123 is provided with a first mating portion matching the first connecting portion; when the mount 6100 of the bracket 6000 holds a phone, the first connecting portion and the first mating portion snap together. Specifically, the first connecting portion in this embodiment is a rib or a protrusion (not shown), and the first mating portion is a slot (not shown) formed in the connecting plate 6110. This structure not only improves the stability of the folding plate group 6120 in its triangle state, but also makes it easier to join the folding plate group 6120 to the connecting plate 6110 when the mount 6100 needs to be folded to its minimum state.
优选的，本实施例还在所述第一板体6121相对两端中的一端设有第二连接部，所述第三板体6123相对两端中的另一端设有与所述第二连接部相配合的第二配合部，所述支架6000的手机安装座6100用于安装手机时，所述第二连接部和所述第二配合部卡合连接。第二连接部可以是凸起(图中未示出)，第二配合部为与凸起相配合的开口6130或卡槽(图中未示出)。该结构提高了折叠板组6120处于三角形状态时的稳定性。Preferably, in this embodiment a second connecting portion is provided at one end of the first plate body 6121, and the other end of the third plate body 6123 is provided with a second mating portion matching the second connecting portion; when the mount 6100 of the bracket 6000 holds a phone, the second connecting portion and the second mating portion snap together. The second connecting portion may be a protrusion (not shown), and the second mating portion an opening 6130 or a slot (not shown) that receives the protrusion. This structure improves the stability of the folding plate group in its triangle state.
另外，本实施例还可以在所述支撑杆6200的另一端可拆卸连接有底座(图中未示出)，在需要固定手机并且使手机5000具有一定高度时，可以将支撑杆6200拉伸呈一定长度，并通过底座将支架6000置于一个平面上，再将手机放置到手机安装座6100内，完成手机的固定；而支撑杆6200和底座的可拆卸连接可以使得两者可以单独携带，进一步提高了支架6000的收容性和携带的方便性。In addition, in this embodiment a base (not shown) may be detachably connected to the other end of the support rod 6200. When the phone needs to be fixed at a certain height, the support rod 6200 can be extended to the required length and the bracket 6000 stood on a flat surface via the base, after which the phone is placed into the mount 6100 to complete the fixing. The detachable connection between the support rod 6200 and the base allows the two to be carried separately, further improving the stowability and portability of the bracket 6000.
以上所描述的装置实施例仅仅是示意性的，其中所述作为分离部件说明的模块可以是或者也可以不是物理上分开的，作为模块显示的部件可以是或者也可以不是物理模块，即可以位于一个地方，或者也可以分布到多个网络模块上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。本领域普通技术人员在不付出创造性的劳动的情况下，即可以理解并实施。The device embodiments described above are merely illustrative. The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed across multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
本发明实施例提供了一种非暂态计算机可读存储介质，所述计算机存储介质存储有计算机可执行指令，其中，当所述计算机可执行指令被电子设备执行时，使所述电子设备执行上述任意方法实施例中的基于光线的三维人脸优化方法。An embodiment of the present invention provides a non-transitory computer-readable storage medium. The computer storage medium stores computer-executable instructions which, when executed by an electronic device, cause the electronic device to perform the light-based three-dimensional face optimization method in any of the above method embodiments.
本发明实施例提供了一种计算机程序产品，其中，所述计算机程序产品包括存储在非暂态计算机可读存储介质上的计算机程序，所述计算机程序包括程序指令，其中，当所述程序指令被电子设备执行时，使所述电子设备执行上述任意方法实施例中的基于光线的三维人脸优化方法。An embodiment of the present invention provides a computer program product. The computer program product includes a computer program stored on a non-transitory computer-readable storage medium; the computer program includes program instructions which, when executed by an electronic device, cause the electronic device to perform the light-based three-dimensional face optimization method in any of the above method embodiments.
通过以上的实施方式的描述，本领域的技术人员可以清楚地了解到各实施方式可借助软件加必需的通用硬件平台的方式来实现，当然也可以通过硬件。基于这样的理解，上述技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来，该计算机软件产品可以存储在计算机可读存储介质中，所述计算机可读记录介质包括用于以机器（例如计算机）可读的形式存储或传送信息的任何机制。例如，机器可读介质包括只读存储器（ROM）、随机存取存储器（RAM）、磁盘存储介质、光存储介质、闪速存储介质、电、光、声或其他形式的传播信号（例如，载波、红外信号、数字信号等）等，该计算机软件产品包括若干指令用以使得一台计算机设备（可以是个人计算机，服务器，或者网络设备等）执行各个实施例或者实施例的某些部分所述的方法。From the description of the above embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, or of course by hardware. Based on this understanding, the essence of the above technical solution, or the part contributing to the prior art, can be embodied as a software product. The computer software product may be stored in a computer-readable storage medium, where a computer-readable medium includes any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer): for example, read-only memory (ROM), random-access memory (RAM), magnetic disk storage media, optical storage media, flash storage media, or electrical, optical, acoustic, or other forms of propagated signals (e.g., carrier waves, infrared signals, digital signals). The computer software product includes a number of instructions causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods described in the various embodiments or in certain parts of the embodiments.
最后应说明的是：以上实施例仅用以说明本发明实施例的技术方案，而非对其限制；尽管参照前述实施例对本发明进行了详细的说明，本领域的普通技术人员应当理解：其依然可以对前述各实施例所记载的技术方案进行修改，或者对其中部分技术特征进行等同替换；而这些修改或者替换，并不使相应技术方案的本质脱离本发明各实施例技术方案的精神和范围。Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the embodiments of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent substitutions for some of the technical features; such modifications or substitutions do not depart the essence of the corresponding technical solutions from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

  1. 一种基于光线的三维人脸优化方法,其特征在于,包括:A light-based three-dimensional face optimization method, which is characterized by:
    判断获取的目标图片中的人脸图像是否光照不均匀;Determine whether the face image in the obtained target picture is unevenly lit;
    若所述人脸图像存在光照不均匀现象,则对所述人脸图像进行光线调整,以获得优化人脸图像;If the face image has uneven illumination, performing light adjustment on the face image to obtain an optimized face image;
    基于预先训练的卷积神经网络模型对所述优化人脸图像进行处理,得到第一三维人脸模型参数信息;Processing the optimized face image based on a pre-trained convolutional neural network model to obtain the first three-dimensional face model parameter information;
    根据所述第一三维人脸模型参数信息对三维平均人脸模型进行处理,得到所述人脸图像对应的三维人脸图像。The three-dimensional average face model is processed according to the first three-dimensional face model parameter information to obtain a three-dimensional face image corresponding to the face image.
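The four steps of claim 1 can be sketched as a short Python pipeline. Every stage below is a deliberately trivial placeholder (an identity-style light adjustment, a fixed parameter vector), since the real stages are only specified in claims 2-5; the function names are illustrative assumptions, not the patent's API.

```python
# High-level flow of claim 1, with each stage stubbed out for illustration.

def uneven_lighting(image):
    # Placeholder test; claim 2 specifies a histogram-variance criterion.
    return max(image) - min(image) > 100

def adjust_lighting(image):
    # Placeholder equalisation; claim 3 uses a trained image generation model.
    mean = sum(image) / len(image)
    return [mean] * len(image)

def cnn_model_parameters(image):
    # Placeholder; claim 5 trains a two-layer hourglass CNN for this.
    return {"shape": [0.1, -0.2], "expression": [0.05]}

def deform_average_model(params):
    # Placeholder for deforming the 3D average face model by the parameters.
    return ("3d_face", params)

def optimize_face(image):
    if uneven_lighting(image):            # step 1: detect uneven lighting
        image = adjust_lighting(image)    # step 2: light adjustment
    params = cnn_model_parameters(image)  # step 3: CNN -> model parameters
    return deform_average_model(params)   # step 4: 3D face reconstruction

print(optimize_face([10, 200, 30])[0])  # 3d_face
```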
  2. 根据权利要求1所述的方法,其特征在于,所述判断获取的目标图片中的人脸图像是否光照不均匀的步骤包括:The method according to claim 1, wherein the step of determining whether the face image in the obtained target picture has uneven illumination includes:
    获得所述目标图片的灰度直方图;Obtaining a grayscale histogram of the target picture;
    根据所述灰度直方图计算所述目标图片的灰度级分布方差;Calculating the gray level distribution variance of the target picture according to the gray level histogram;
    将所述灰度级分布方差与灰度级分布临界方差进行比较，当所述灰度级分布方差大于或等于所述灰度级分布临界方差时，确定所述目标图片中的人脸图像存在光照不均匀。Comparing the gray level distribution variance with a gray level distribution critical variance, and when the gray level distribution variance is greater than or equal to the gray level distribution critical variance, determining that the face image in the target picture is unevenly illuminated.
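The test of claim 2 can be sketched directly: build the grayscale histogram, take the variance of the gray-level distribution, and compare it against a critical variance. The 256-level quantization and the threshold value below are illustrative assumptions; the claim fixes neither.

```python
def gray_histogram(pixels, levels=256):
    """Histogram of gray values (0..levels-1) as a probability distribution."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    n = len(pixels)
    return [c / n for c in hist]

def gray_level_variance(pixels, levels=256):
    """Variance of the gray-level distribution computed from the histogram."""
    hist = gray_histogram(pixels, levels)
    mean = sum(g * w for g, w in enumerate(hist))
    return sum(w * (g - mean) ** 2 for g, w in enumerate(hist))

def is_unevenly_lit(pixels, critical_variance=3000.0):
    # Claim 2: uneven illumination when the distribution variance is greater
    # than or equal to the critical variance (the value 3000 is assumed).
    return gray_level_variance(pixels) >= critical_variance

# A half-dark / half-bright image spreads its gray levels far more widely
# than a uniformly lit one:
flat = [128] * 100
split = [20] * 50 + [235] * 50
print(is_unevenly_lit(flat), is_unevenly_lit(split))  # False True
```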
  3. 根据权利要求1所述的方法,其特征在于:所述对所述人脸图像进行光线调整,以获得优化人脸图像的步骤包括:The method according to claim 1, wherein the step of performing light adjustment on the face image to obtain an optimized face image comprises:
    获取训练样本和初始图像生成模型,所述训练样本包括多个在非正面均匀光源条件下生成的第一图像、以及在正面均匀光源条件下生成的与所述第一图像对应的第二图像;Obtaining a training sample and an initial image generation model, where the training sample includes a plurality of first images generated under a non-positive uniform light source condition, and a second image corresponding to the first image generated under a front uniform light source condition;
    利用机器学习方法,基于所述训练样本对所述初始图像生成模型进行训练,得到图像生成模型;Using a machine learning method to train the initial image generation model based on the training samples to obtain an image generation model;
    利用所述图像生成模型对所述人脸图像进行光线调整,以获得优化人脸图像。Light adjustment is performed on the face image using the image generation model to obtain an optimized face image.
  4. 根据权利要求3所述的方法,其特征在于,所述利用机器学习方法,基于所述训练样本对所述初始图像生成模型进行训练,得到所述图像生成模型,包括:The method according to claim 3, wherein the using a machine learning method to train the initial image generation model based on the training samples to obtain the image generation model comprises:
    将所述第一图像输入至所述初始图像生成模型中,得到输出的优化第一图像;Inputting the first image into the initial image generation model to obtain an output optimized first image;
    将所述优化第一图像、所述第二图像作为判别网络的输入,对所述判别网络进行训练,确定训练后的所述判别网络的参数;Using the optimized first image and the second image as input of a discrimination network, training the discrimination network, and determining parameters of the discrimination network after training;
    将所述第一图像作为所述初始图像生成模型的输入,对所述初始图像生成模型进行训练;Training the initial image generation model by using the first image as an input of the initial image generation model;
    将训练后的所述初始图像生成模型输出的优化第一图像和所述第二图像输入至所述训练后的所述判别网络,确定所述训练后的所述判别网络的损失函数值;Inputting the optimized first image and the second image output by the initial image generation model after training to the discriminant network after training to determine a loss function value of the discriminant network after training;
    当所述损失函数值收敛,将所述初始图像生成模型确定为所述图像生成模型。When the value of the loss function converges, the initial image generation model is determined as the image generation model.
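The alternating scheme of claim 4 (train, evaluate a loss on the generated and reference images, stop when the loss converges and freeze the model) can be illustrated with a toy numeric stand-in. Here the "generator" is a single brightness gain applied to the unevenly lit first image and the "discriminator loss" is replaced by a plain mean-squared gap to the evenly lit second image; a real implementation would train two networks in a GAN framework. The image values and learning rate are assumptions for illustration.

```python
# Toy stand-in for the convergence-checked training loop of claim 4.
first_image = [10.0, 20.0, 30.0]    # captured under uneven light
second_image = [20.0, 40.0, 60.0]   # same scene under front uniform light

def generate(g, image):
    """'Generator': a single brightness gain g applied to the image."""
    return [g * p for p in image]

def loss(gen, target):
    """Stand-in for the discriminator loss: mean squared gap to the target."""
    return sum((a - b) ** 2 for a, b in zip(gen, target)) / len(gen)

g = 1.0
lr = 0.0005
for step in range(1000):
    current = loss(generate(g, first_image), second_image)
    if current < 1e-6:  # loss has converged: freeze the model (claim 4)
        break
    # Gradient of the loss w.r.t. the gain, derived by hand for this toy case.
    grad = sum(2 * p * (g * p - t)
               for p, t in zip(first_image, second_image)) / len(first_image)
    g -= lr * grad

print(round(g, 3))  # prints 2.0: the gain that maps first_image onto second_image
```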
  5. 根据权利要求1-4任一项所述的方法，其特征在于，所述卷积神经网络模型通过如下步骤进行训练：The method according to any one of claims 1-4, wherein the convolutional neural network model is trained by the following steps:
    搭建由两层沙漏型卷积神经网络组成的卷积神经网络模型;Build a convolutional neural network model consisting of two layers of hourglass convolutional neural networks;
    获取用于训练所述卷积神经网络模型的数据集,所述数据集中包括若干二维人脸图片和所述二维人脸图片对应的三维人像扫描数据;Obtaining a data set for training the convolutional neural network model, where the data set includes several two-dimensional face pictures and three-dimensional portrait scan data corresponding to the two-dimensional face pictures;
    对所述二维人脸图片进行预处理得到人脸特征点信息;Pre-processing the two-dimensional face picture to obtain facial feature point information;
    将所述人脸特征点信息输入至所述卷积神经网络模型得到第二三维人脸模型参数信息;Inputting the facial feature point information into the convolutional neural network model to obtain the second three-dimensional facial model parameter information;
    利用交叉熵损失函数对所述卷积神经网络的参数进行优化，直至所述第二三维人脸模型参数信息与所述三维人像扫描数据的损失函数收敛到预设阈值。The parameters of the convolutional neural network are optimized using a cross-entropy loss function until the loss function between the second three-dimensional face model parameter information and the three-dimensional portrait scan data converges to a preset threshold.
  6. 一种基于光线的三维人脸优化装置,其特征在于,包括:A light-based three-dimensional human face optimization device, comprising:
    判断模块,用于判断获取的目标图片中的人脸图像是否光照不均匀;A judging module, configured to judge whether the face image in the obtained target picture has uneven illumination;
    优化模块,若所述人脸图像存在光照不均匀现象,则对所述人脸图像进行光线调整,以获得优化人脸图像;An optimization module, if the face image has uneven illumination, performing light adjustment on the face image to obtain an optimized face image;
    获取模块，用于基于预先训练的卷积神经网络模型对所述优化人脸图像进行处理，得到第一三维人脸模型参数信息；An acquisition module, configured to process the optimized face image based on a pre-trained convolutional neural network model to obtain first three-dimensional face model parameter information;
    处理模块,用于根据所述第一三维人脸模型参数信息对三维平均人脸模型进行处理,得到所述人脸图像对应的三维人脸图像。A processing module is configured to process a three-dimensional average face model according to the first three-dimensional face model parameter information to obtain a three-dimensional face image corresponding to the face image.
  7. 根据权利要求6所述的装置,其特征在于,所述判断模块具体用于:The device according to claim 6, wherein the determining module is specifically configured to:
    获得所述目标图片的灰度直方图;Obtaining a grayscale histogram of the target picture;
    根据所述灰度直方图计算所述目标图片的灰度级分布方差;Calculating the gray level distribution variance of the target picture according to the gray level histogram;
    将所述灰度级分布方差与灰度级分布临界方差进行比较，当所述灰度级分布方差大于或等于所述灰度级分布临界方差时，确定所述目标图片中的人脸图像存在光照不均匀。Comparing the gray level distribution variance with a gray level distribution critical variance, and when the gray level distribution variance is greater than or equal to the gray level distribution critical variance, determining that the face image in the target picture is unevenly illuminated.
  8. 根据权利要求6所述的装置,其特征在于,所述优化模块还包括第一训练模块,所述第一训练模块用于:The apparatus according to claim 6, wherein the optimization module further comprises a first training module, and the first training module is configured to:
    获取训练样本和初始图像生成模型,所述训练样本包括多个在非正面均匀光源条件下生成的第一图像、以及在正面均匀光源条件下生成的与所述第一图像对应的第二图像;Obtaining a training sample and an initial image generation model, where the training sample includes a plurality of first images generated under a non-positive uniform light source condition, and a second image corresponding to the first image generated under a front uniform light source condition;
    利用机器学习方法,基于所述训练样本对所述初始图像生成模型进行训练,得到图像生成模型;Using a machine learning method to train the initial image generation model based on the training samples to obtain an image generation model;
    利用所述图像生成模型对所述人脸图像进行光线调整,以获得优化人脸图像。Light adjustment is performed on the face image using the image generation model to obtain an optimized face image.
  9. 根据权利要求8所述的装置，其特征在于，所述第一训练模块还用于：The device according to claim 8, wherein the first training module is further configured to:
    将所述第一图像输入至所述初始图像生成模型中,得到输出的优化第一图像;Inputting the first image into the initial image generation model to obtain an output optimized first image;
    将所述优化第一图像、所述第二图像作为判别网络的输入,对所述判别网络进行训练,确定训练后的所述判别网络的参数;Using the optimized first image and the second image as input of a discrimination network, training the discrimination network, and determining parameters of the discrimination network after training;
    将所述第一图像作为所述初始图像生成模型的输入,对所述初始图像生成模型进行训练;Training the initial image generation model by using the first image as an input of the initial image generation model;
    将训练后的所述初始图像生成模型输出的优化第一图像和所述第二图像输入至所述训练后的所述判别网络,确定所述训练后的所述判别网络的损失函数值;Inputting the optimized first image and the second image output by the initial image generation model after training to the discriminant network after training to determine a loss function value of the discriminant network after training;
    当所述损失函数值收敛,将所述初始图像生成模型确定为所述图像生成模型。When the value of the loss function converges, the initial image generation model is determined as the image generation model.
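The alternating procedure in claim 9 follows the usual adversarial-training recipe: train the discriminator on (generated, reference) pairs, train the generator, and stop when the discriminator's loss value converges. The control flow can be sketched framework-free as below; every callable here is a hypothetical placeholder standing in for real network updates, not the application's own code:

```python
from typing import Callable, List, Tuple

def train_image_generation_model(
    pairs: List[Tuple[object, object]],            # (first image, second image) pairs
    generator: Callable[[object], object],          # initial image generation model
    train_discriminator: Callable[[list], None],    # one discriminator update
    train_generator: Callable[[list], None],        # one generator update
    discriminator_loss: Callable[[list], float],    # loss of the trained discriminator
    max_rounds: int = 100,
    tol: float = 1e-3,
) -> Callable[[object], object]:
    """Alternate discriminator/generator training until the discriminator's
    loss value converges, then return the generator as the final image
    generation model."""
    prev_loss = None
    for _ in range(max_rounds):
        # Feed the first images through the generator, then train the
        # discriminator on (optimized first image, second image) pairs.
        fakes = [(generator(x), y) for x, y in pairs]
        train_discriminator(fakes)
        # Train the generator on the first images.
        train_generator([x for x, _ in pairs])
        # Evaluate the trained discriminator on freshly generated images
        # and check its loss value for convergence.
        loss = discriminator_loss([(generator(x), y) for x, y in pairs])
        if prev_loss is not None and abs(loss - prev_loss) < tol:
            break
        prev_loss = loss
    return generator
```

A concrete implementation would replace the placeholder callables with, e.g., gradient steps on a convolutional generator and discriminator; the claim fixes only the alternation and the convergence stopping rule, not the network architecture.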
  10. An electronic device, comprising: at least one processor; and
    a memory communicatively connected to the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to perform the light-based three-dimensional face optimization method according to any one of claims 1 to 5.
PCT/CN2018/102333 2018-08-24 2018-08-24 Light-based three-dimensional face optimization method and apparatus, and electronic device WO2020037680A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2018/102333 WO2020037680A1 (en) 2018-08-24 2018-08-24 Light-based three-dimensional face optimization method and apparatus, and electronic device
CN201811031365.5A CN109271911B (en) 2018-08-24 2018-09-05 Three-dimensional face optimization method and device based on light rays and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/102333 WO2020037680A1 (en) 2018-08-24 2018-08-24 Light-based three-dimensional face optimization method and apparatus, and electronic device

Publications (1)

Publication Number Publication Date
WO2020037680A1 true WO2020037680A1 (en) 2020-02-27

Family

ID=65187883

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/102333 WO2020037680A1 (en) 2018-08-24 2018-08-24 Light-based three-dimensional face optimization method and apparatus, and electronic device

Country Status (2)

Country Link
CN (1) CN109271911B (en)
WO (1) WO2020037680A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110473248A (en) * 2019-08-16 2019-11-19 上海索倍信息科技有限公司 A kind of measurement method using picture construction human 3d model
CN110706169A (en) * 2019-09-26 2020-01-17 深圳市半冬科技有限公司 Star portrait optimization method and device and storage device
CN110691195B (en) * 2019-10-15 2021-03-30 重庆灵翎互娱科技有限公司 Light detection method and device based on three-dimensional face shooting
CN112766105A (en) * 2021-01-07 2021-05-07 北京码牛科技有限公司 Image conversion method and device applied to image code joint acquisition system

Citations (5)

Publication number Priority date Publication date Assignee Title
CN101159015A (en) * 2007-11-08 2008-04-09 清华大学 Two-dimension human face image recognizing method
CN106910247A (en) * 2017-03-20 2017-06-30 厦门幻世网络科技有限公司 Method and apparatus for generating three-dimensional head portrait model
CN107358648A (en) * 2017-07-17 2017-11-17 中国科学技术大学 Real-time full-automatic high quality three-dimensional facial reconstruction method based on individual facial image
CN108133201A (en) * 2018-01-17 2018-06-08 百度在线网络技术(北京)有限公司 Face character recognition methods and device
CN108229276A (en) * 2017-03-31 2018-06-29 北京市商汤科技开发有限公司 Neural metwork training and image processing method, device and electronic equipment

Family Cites Families (11)

Publication number Priority date Publication date Assignee Title
JP4857021B2 (en) * 2006-05-08 2012-01-18 株式会社タムロン Actuator and lens unit and camera provided with the same
CN101251706A (en) * 2008-03-24 2008-08-27 香港应用科技研究院有限公司 Optical die set, camera and mobile terminal equipment
TW200951493A (en) * 2008-06-05 2009-12-16 Asia Optical Co Inc Camera and image detecting module thereof
CN105629626A (en) * 2014-11-05 2016-06-01 惠州友华微电子科技有限公司 Camera, optical system and optical image stabilization camera apparatus
KR101825731B1 (en) * 2014-12-04 2018-03-23 에이에이씨 어쿠스틱 테크놀로지스(심천)컴퍼니 리미티드 Camera lens module with structure for optical image stabilization
KR101589149B1 (en) * 2015-05-27 2016-02-03 수원대학교산학협력단 Face recognition and face tracking method using radial basis function neural networks pattern classifier and object tracking algorithm and system for executing the same
CN106067190B (en) * 2016-05-27 2019-04-30 俞怡斐 A kind of generation of fast face threedimensional model and transform method based on single image
CN106169067B (en) * 2016-07-01 2019-05-28 恒东信息科技无锡有限公司 A kind of police dynamic human face acquisition comparison method of high throughput and system
CN106131435B (en) * 2016-08-25 2021-12-07 东莞市亚登电子有限公司 Miniature optical anti-shake camera module
CN107680158A (en) * 2017-11-01 2018-02-09 长沙学院 A kind of three-dimensional facial reconstruction method based on convolutional neural networks model
CN108171206B (en) * 2018-01-17 2019-10-25 百度在线网络技术(北京)有限公司 Information generating method and device

Cited By (12)

Publication number Priority date Publication date Assignee Title
CN113345069A (en) * 2020-03-02 2021-09-03 京东方科技集团股份有限公司 Modeling method, device and system of three-dimensional human body model and storage medium
CN111951372A (en) * 2020-06-30 2020-11-17 重庆灵翎互娱科技有限公司 Three-dimensional face model generation method and equipment
CN111951372B (en) * 2020-06-30 2024-01-05 重庆灵翎互娱科技有限公司 Three-dimensional face model generation method and equipment
CN112052805A (en) * 2020-09-10 2020-12-08 深圳数联天下智能科技有限公司 Face detection frame display method, image processing device, equipment and storage medium
CN112052805B (en) * 2020-09-10 2023-12-12 深圳数联天下智能科技有限公司 Face detection frame display method, image processing device, equipment and storage medium
CN112488178A (en) * 2020-11-26 2021-03-12 推想医疗科技股份有限公司 Network model training method and device, image processing method and device, and equipment
CN112488178B (en) * 2020-11-26 2022-07-26 推想医疗科技股份有限公司 Network model training method and device, image processing method and device, and equipment
CN113255511A (en) * 2021-05-21 2021-08-13 北京百度网讯科技有限公司 Method, apparatus, device and storage medium for living body identification
CN113378715A (en) * 2021-06-10 2021-09-10 北京华捷艾米科技有限公司 Living body detection method based on color face image and related equipment
CN113378715B (en) * 2021-06-10 2024-01-05 北京华捷艾米科技有限公司 Living body detection method based on color face image and related equipment
CN117456144A (en) * 2023-11-10 2024-01-26 中国人民解放军海军航空大学 Target building three-dimensional model optimization method based on visible light remote sensing image
CN117456144B (en) * 2023-11-10 2024-05-07 中国人民解放军海军航空大学 Target building three-dimensional model optimization method based on visible light remote sensing image

Also Published As

Publication number Publication date
CN109271911A (en) 2019-01-25
CN109271911B (en) 2021-04-16

Similar Documents

Publication Publication Date Title
WO2020037680A1 (en) Light-based three-dimensional face optimization method and apparatus, and electronic device
WO2020037676A1 (en) Three-dimensional face image generation method and apparatus, and electronic device
WO2020037679A1 (en) Video processing method and apparatus, and electronic device
WO2020037678A1 (en) Method, device, and electronic apparatus for generating three-dimensional human face image from occluded image
US11582423B2 (en) Virtual 3D communications with actual to virtual cameras optical axes compensation
CN108596827B (en) Three-dimensional face model generation method and device and electronic equipment
CN109727303B (en) Video display method, system, computer equipment, storage medium and terminal
WO2020037681A1 (en) Video generation method and apparatus, and electronic device
CN108537870B (en) Image processing method, device and electronic equipment
US11765332B2 (en) Virtual 3D communications with participant viewpoint adjustment
CN110677591B (en) Sample set construction method, image imaging method, device, medium and electronic equipment
CN108200337B (en) Photographing processing method, device, terminal and storage medium
WO2020056690A1 (en) Method and apparatus for presenting interface associated with video content, and electronic device
WO2020056689A1 (en) Ar imaging method and apparatus and electronic device
WO2020056692A1 (en) Information interaction method and apparatus, and electronic device
US11734888B2 (en) Real-time 3D facial animation from binocular video
CN116095353A (en) Live broadcast method and device based on volume video, electronic equipment and storage medium
WO2022231582A1 (en) Photo relighting and background replacement based on machine learning models
WO2020056693A1 (en) Picture synthesizing method and apparatus, and electronic device
CN114049250B (en) Method, device and medium for correcting face pose of certificate photo
CN113658304A (en) Animation method of target object and related device and equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18931038

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18931038

Country of ref document: EP

Kind code of ref document: A1