CN111951373B - Face image processing method and equipment - Google Patents

Face image processing method and equipment

Info

Publication number
CN111951373B
CN111951373B · CN202010623139.7A
Authority
CN
China
Prior art keywords
face image
matrix
neural network
preset
processed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010623139.7A
Other languages
Chinese (zh)
Other versions
CN111951373A (en)
Inventor
徐博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Spiritplume Interactive Entertainment Technology Co ltd
Original Assignee
Chongqing Spiritplume Interactive Entertainment Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Spiritplume Interactive Entertainment Technology Co ltd filed Critical Chongqing Spiritplume Interactive Entertainment Technology Co ltd
Priority to CN202010623139.7A priority Critical patent/CN111951373B/en
Publication of CN111951373A publication Critical patent/CN111951373A/en
Application granted granted Critical
Publication of CN111951373B publication Critical patent/CN111951373B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/50 Lighting effects
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a face image processing method and device. A preset neural network model is generated in advance from training data and a preset neural network structure, the preset neural network structure comprising a convolutional neural network block model and convolution kernels. The method comprises: receiving a face image to be processed, and acquiring illumination parameters of the face image to be processed based on the preset neural network model, wherein the illumination parameters comprise spherical harmonic illumination coefficients and a normal map; and generating a de-lit image of the face image to be processed according to the face image to be processed and the illumination parameters. Because the illumination information of the face image is obtained through the preset neural network model, no dedicated light detection equipment is needed, so the accuracy and stability of de-lighting the face image are improved without increasing cost.

Description

Face image processing method and equipment
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to a method and apparatus for processing a face image.
Background
When a face is scanned with a camera for 3D face reconstruction, the image information in the photo needs to be acquired, made into a UV map (a texture map addressed by UV coordinates), and attached to a 3D face mesh for display in a 3D scene. The image in the photo contains rich light information from the shooting environment, while a typical 3D scene additionally requires simulated directional and point light sources. If the face image in the photo is not de-lit, the photo lighting and the scene lighting are superimposed, which seriously degrades the appearance of the face in the 3D scene.
One precondition for removing light from face images in photos is that the illumination information at the position of the face can be captured at shooting time. The illumination information includes ambient light, directional light, spot light, and the like. The most direct approach in the prior art is a light detection device placed at the position of the face to collect the illumination information. Current mobile phones, however, are limited by technology and cost: most have no light-detection sensor, and those that do lack sufficient accuracy.
The prior art also modifies large areas of the original image's pixels through global histogram adjustment and gamma correction. However, this scheme can only change the overall brightness; it cannot effectively remove shadow and highlight areas.
There are also prior art schemes that perform iterative gamma correction after shadow-based object detection. However, this scheme is limited by the accuracy of shadow detection and is not stable enough, because no proper threshold can be determined for identifying shadows and setting the gamma-correction coefficient.
Therefore, how to improve the accuracy and stability of de-lighting face images without increasing cost is a technical problem to be solved.
Disclosure of Invention
In view of the defects of prior-art face image de-lighting methods, namely high cost, insufficient precision and poor stability, the invention provides a face image processing method in which a preset neural network model is generated in advance based on training data and a preset neural network structure. The method comprises the following steps:
receiving a face image to be processed, and acquiring illumination parameters of the face image to be processed based on the preset neural network model, wherein the illumination parameters comprise spherical harmonic illumination coefficients and a normal map;
generating a de-lit image of the face image to be processed according to the face image to be processed and the illumination parameters;
wherein the training data comprises a preset face image, a real spherical harmonic illumination coefficient of the preset face image, and a real normal map.
Preferably, generating the de-lit image of the face image to be processed according to the face image to be processed and the illumination parameters specifically comprises:
acquiring a first matrix corresponding to the face image to be processed;
acquiring, from the illumination parameters, a second matrix corresponding to the spherical harmonic illumination coefficients and a third matrix corresponding to the normal map;
and acquiring a fourth matrix according to the first matrix, the second matrix and the third matrix, and acquiring the de-lit image based on the fourth matrix.
Preferably, the fourth matrix is obtained according to the first matrix, the second matrix and the third matrix, specifically:
the fourth matrix is obtained according to a de-lighting formula: A = B / (C × D),
where A is the fourth matrix, B is the first matrix, C is the second matrix, and D is the third matrix.
Preferably, the convolutional neural network block model is a residual network block model, wherein a preset number of residual network block models are not connected to the fully connected layer of the preset neural network structure.
Preferably, the training data is data subjected to data enhancement processing, and the data enhancement processing includes increasing the background of the preset face image and/or changing the rotation angle of the preset face image.
Preferably, the preset neural network model is generated based on training data and a preset neural network structure, specifically:
determining initial parameters of a preset neural network structure according to the length and the width of the preset face image, wherein the initial parameters comprise the number of units of an input layer, the input number and the output number of each hidden layer and an initial weight value;
inputting the preset face image into the input layer, and determining an output layer result based on a forward propagation algorithm and the initial parameters;
determining a loss function according to the output layer result and the training data;
training according to a preset learning rate based on an optimization algorithm and a back propagation algorithm, and determining a minimum loss value of the loss function according to the training result, wherein the preset learning rate is a learning rate determined by the adaptive moment estimation (Adam) algorithm;
and determining the preset neural network model according to the weight value corresponding to the minimum loss value.
Correspondingly, the invention also provides a processing device of the face image, which comprises:
the acquisition module is used for receiving the face image to be processed and acquiring illumination parameters of the face image to be processed based on a preset neural network model, wherein the illumination parameters comprise spherical harmonic illumination coefficients and a normal map, and the preset neural network model is generated in advance based on training data and a preset neural network structure;
the generating module is used for generating a de-lit image of the face image to be processed according to the face image to be processed and the illumination parameters;
the training data comprises a preset face image, a real spherical harmonic illumination coefficient of the preset face image and a real normal map.
Preferably, the generating module is specifically configured to:
acquiring a first matrix corresponding to the face image to be processed;
acquiring, from the illumination parameters, a second matrix corresponding to the spherical harmonic illumination coefficients and a third matrix corresponding to the normal map;
and acquiring a fourth matrix according to the first matrix, the second matrix and the third matrix, and acquiring the de-lit image based on the fourth matrix.
Preferably, the generating module is further configured to:
the fourth matrix is obtained according to a de-lighting formula: A = B / (C × D),
where A is the fourth matrix, B is the first matrix, C is the second matrix, and D is the third matrix.
Preferably, the convolutional neural network block model is a residual network block model, wherein a preset number of residual network block models are not connected to the fully connected layer of the preset neural network structure.
Compared with the prior art, the invention has the following beneficial effects:
the invention discloses a processing method and equipment of a face image, which are used for generating a preset neural network model in advance based on training data and a preset neural network structure, wherein the preset neural network structure comprises a convolution neural network block model and a convolution kernel, and the method comprises the following steps: receiving a face image to be processed, and acquiring illumination parameters of the face image to be processed based on the preset neural network model, wherein the illumination parameters comprise spherical harmonic illumination coefficients and normal mapping; and generating a dimming image of the face image to be processed according to the face image to be processed and the illumination parameter. The illumination information of the face image is obtained through the preset neural network model, and special light detection equipment is not needed, so that the accuracy and stability of deglazing the face image are improved on the basis of not improving the cost, and the user experience is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a face image processing method according to an embodiment of the present invention;
fig. 2 is a flow chart illustrating a face image processing method according to another embodiment of the present invention;
FIG. 3 shows a schematic diagram of a set of training data in an embodiment of the invention;
FIG. 4 is a schematic diagram of a preset neural network according to an embodiment of the present invention;
fig. 5 shows a before-and-after comparison of face image de-lighting in an embodiment of the present invention;
fig. 6 shows a schematic structural diagram of a face image processing device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
As described in the background, prior-art methods for removing light from the face image in a photo suffer from high cost, insufficient precision and poor stability.
To solve these problems, an embodiment of the present invention provides a face image processing method which generates a preset neural network model in advance based on training data and a preset neural network structure, acquires the illumination parameters of the face image to be processed based on the preset neural network model, and generates a de-lit image of the face image to be processed according to the face image to be processed and the illumination parameters. Because the illumination information of the face image is obtained through the preset neural network model, no dedicated light detection equipment is needed, so the accuracy and stability of de-lighting the face image are improved without increasing cost.
Fig. 1 is a schematic flow chart of a face image processing method according to an embodiment of the present invention, where the method generates a preset neural network model in advance based on training data and a preset neural network structure.
The method comprises the following steps:
s101, receiving a face image to be processed, and acquiring illumination parameters of the face image to be processed based on the preset neural network model.
In a specific implementation scenario, the face image to be processed may be received from various sources such as a mobile phone camera. After it is received, its illumination parameters need to be obtained in order to de-light it.
The spherical harmonic illumination coefficients are in fact a multi-dimensional set of coefficients obtained by sampling the ambient light; during rendering, these coefficients are used to reconstruct the illumination. They can be regarded as a simplification of the ambient light, which in turn simplifies the computation.
A normal map stores the surface normal at each point of the bumpy surface of the original object, with the normal direction encoded in the RGB color channels. When a light source is applied at a particular position, a low-detail surface can then produce the accurate, high-detail illumination directions and reflection effects.
The illumination parameters such as the spherical harmonic illumination coefficients and the normal map are acquired in order to analyze the illumination information in the face image to be processed, so that a de-lit image can be further obtained and the de-lighting of the face image to be processed is realized.
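The patent does not spell out how the spherical harmonic coefficients and normal map combine into per-pixel lighting. As an illustration under common rendering conventions (second-order real spherical harmonics, 9 basis functions per RGB channel, matching the 27-dimensional coefficient described later), a shading image could be computed roughly as follows; all names and shapes here are assumptions:

```python
import numpy as np

# Sketch: evaluate 2nd-order (9-basis) real spherical harmonics on a
# per-pixel normal map and combine with 27 coefficients (9 per RGB
# channel) to get a per-pixel shading image.
def sh_basis(n):
    """n: (..., 3) unit normals -> (..., 9) SH basis values."""
    x, y, z = n[..., 0], n[..., 1], n[..., 2]
    c = [0.282095, 0.488603, 1.092548, 0.315392, 0.546274]
    return np.stack([
        np.full_like(x, c[0]),            # l=0 (constant)
        c[1] * y, c[1] * z, c[1] * x,     # l=1
        c[2] * x * y, c[2] * y * z,       # l=2
        c[3] * (3 * z * z - 1),
        c[2] * x * z,
        c[4] * (x * x - y * y),
    ], axis=-1)

def shade(normals, sh_coeffs):
    """normals: (H, W, 3); sh_coeffs: (3, 9) -> (H, W, 3) shading."""
    basis = sh_basis(normals)                   # (H, W, 9)
    return np.einsum('hwk,ck->hwc', basis, sh_coeffs)

normals = np.zeros((2, 2, 3)); normals[..., 2] = 1.0  # all facing +z
coeffs = np.zeros((3, 9)); coeffs[:, 0] = 1.0         # ambient-only light
shading = shade(normals, coeffs)
```

With only the constant (l = 0) coefficient set, every pixel receives the same ambient shading value, regardless of its normal.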
The preset neural network model is generated in advance based on training data and a preset neural network structure.
The training data comprises a preset face image, a real spherical harmonic illumination coefficient of the preset face image and a real normal map. The real spherical harmonic illumination coefficient and the real normal map are the real values of the spherical harmonic illumination coefficient and the normal map of the preset face image.
To ensure the training effect of the training data on the preset neural network model, in a preferred embodiment of the present application the training data is data subjected to data enhancement processing, where the data enhancement processing includes adding a background to the preset face image and/or changing the rotation angle of the preset face image.
In a specific implementation scenario, various backgrounds are unavoidable in face images. To improve the preset neural network model's ability to recognize a face against a complex background, a background image can be added to the preset face image for training: for example, an image is randomly selected from an atlas as the background, the background is rendered first, then the face image is rendered, and the two are composited. By changing the rotation angle of the preset face image during training, the preset neural network model learns to recognize faces at various angles.
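These two augmentations can be sketched roughly as below. The alpha-mask compositing and the 90-degree rotation step are simplifications chosen to keep the example dependency-free (a real pipeline would rotate by arbitrary angles); the function names are hypothetical:

```python
import numpy as np

# Sketch of the two augmentations: composite a face over a random
# background (background rendered first, face on top), then rotate.
rng = np.random.default_rng(1)

def composite(face, alpha, background):
    """Overlay face on background using a per-pixel alpha mask."""
    return alpha[..., None] * face + (1 - alpha[..., None]) * background

def augment(face, alpha, background, k):
    out = composite(face, alpha, background)
    return np.rot90(out, k)                 # change the rotation angle

face = np.ones((8, 8, 3))                   # toy "face" image
alpha = np.zeros((8, 8)); alpha[2:6, 2:6] = 1.0  # face occupies the center
background = rng.uniform(size=(8, 8, 3))
aug = augment(face, alpha, background, k=1)
```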
It should be noted that, the scheme of the above preferred embodiment is only one specific implementation scheme provided in the present application, and other ways of processing training data in order to increase the training effect of the neural network are all within the protection scope of the present application.
The preset neural network structure comprises a convolutional neural network block model and convolution kernels. A convolution kernel is the function by which, given an input image, each pixel of the output image is computed as a weighted average of the pixels in a small region of the input image; the weights are defined by the kernel.
In order for the preset neural network structure to obtain higher accuracy and recall, in a preferred embodiment of the present application the convolutional neural network block model is a residual network block model, where a preset number of residual network block models are not connected to the fully connected layer of the preset neural network structure.
In a specific implementation scenario, the convolutional neural network block model is chosen to be a residual network block model: residual networks are easy to optimize and can gain accuracy from considerably increased depth. The residual blocks inside the deep neural network use skip connections, which alleviate the vanishing-gradient problem caused by increasing depth. To avoid the information loss caused by the fully connected layer, a preset number of residual network block models are not connected to the fully connected layer of the preset neural network structure; for example, 6 groups of residual network blocks skip the fully connected layer.
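The skip connection at the heart of a residual block can be illustrated with a minimal sketch (an assumed toy architecture, not the patent's exact network): the block computes F(x) + x, so the identity path carries signal and gradient even when F's layers would attenuate them.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Two linear layers with a skip connection: y = relu(x @ w1) @ w2 + x."""
    return relu(x @ w1) @ w2 + x

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
w1 = rng.standard_normal((8, 8)) * 0.01     # tiny weights: F(x) ~ 0
w2 = rng.standard_normal((8, 8)) * 0.01
y = residual_block(x, w1, w2)
# With near-zero weights the block approximates the identity mapping,
# which is why residual networks remain trainable at great depth.
```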
It should be noted that the scheme of the above preferred embodiment is only one specific implementation provided in the present application: the preset number of residual network blocks that skip the fully connected layer may be determined according to the specific implementation scenario, and other types of convolutional neural network block models may also be selected so that the preset neural network structure obtains higher accuracy and recall. These modifications of the preset neural network structure all fall within the protection scope of the present application.
In order to obtain an accurate preset neural network model, in a preferred embodiment of the present application, the preset neural network model is generated based on training data and a preset neural network structure, specifically:
determining initial parameters of a preset neural network structure according to the length and the width of the preset face image, wherein the initial parameters comprise the number of units of an input layer, the input number and the output number of each hidden layer and an initial weight value;
inputting the preset face image into the input layer, and determining an output layer result based on a forward propagation algorithm and the initial parameters;
determining a loss function according to the output layer result and the training data;
training according to a preset learning rate based on an optimization algorithm and a back propagation algorithm, and determining a minimum loss value of the loss function according to the training result, wherein the preset learning rate is a learning rate determined by the adaptive moment estimation (Adam) algorithm;
and determining the preset neural network model according to the weight value corresponding to the minimum loss value.
In a specific implementation scenario, the number of units of the input layer is determined according to the length and width of the preset face image, the input and output counts of each hidden layer are set, and the weights are initialized randomly; the preset face image is then fed into the input layer, and the output-layer result is determined with these initial parameters through a forward propagation algorithm. Adam computes first- and second-moment estimates of the gradients to design an independent adaptive learning rate for each parameter, yielding an efficient training process. The back propagation algorithm, combined with the optimization algorithm, computes the gradient of the loss function with respect to all weights in the network; this gradient is fed back to the optimizer to update the weights and minimize the loss function. The minimum loss value is determined over thousands of iterations with learning-rate adjustment, and the weights saved at the iteration that attains the minimum loss become the weights of the preset neural network model.
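The Adam update described above can be sketched in a few lines. The hyperparameters are the commonly used defaults, assumed rather than taken from the patent, and the toy loss is chosen only to show the update converging:

```python
import numpy as np

# Adam: first- and second-moment estimates of the gradient give each
# weight its own effective step size.
def adam_step(w, grad, m, v, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad           # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * grad ** 2      # second-moment estimate
    m_hat = m / (1 - b1 ** t)              # bias correction
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Minimize the toy loss (w - 3)^2, whose gradient is 2 * (w - 3).
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    w, m, v = adam_step(w, 2 * (w - 3.0), m, v, t)
```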
It should be noted that, the scheme of the above preferred embodiment is only one specific implementation scheme provided in the present application, and other ways of generating the preset neural network model based on the training data and the preset neural network structure are all within the protection scope of the present application.
S102, generating a de-lit image of the face image to be processed according to the face image to be processed and the illumination parameters.
In a specific implementation scenario, after the illumination parameters of the face image to be processed are predicted by the preset neural network model, the de-lit image can be generated from the face image to be processed and the illumination parameters.
In order to improve the quality of the generated de-lit image, in a preferred embodiment of the present application, generating the de-lit image of the face image to be processed according to the face image to be processed and the illumination parameters specifically comprises:
acquiring a first matrix corresponding to the face image to be processed based on the face image to be processed;
acquiring a second matrix corresponding to the spherical harmonic illumination coefficient and a third matrix corresponding to the normal map based on the illumination parameter;
and acquiring a fourth matrix according to the first matrix, the second matrix and the third matrix, and acquiring the de-lit image based on the fourth matrix.
In a specific implementation scenario, the face image to be processed, the spherical harmonic illumination coefficients and the normal map are converted into corresponding matrices to facilitate image-processing computation; the fourth matrix corresponding to the de-lit image is obtained by computing on the converted matrices, and the fourth matrix is then converted back to obtain the de-lit image.
It should be noted that the above preferred embodiment is only one specific implementation provided in the present application; other ways of generating the de-lit image of the face image to be processed according to the face image to be processed and the illumination parameters all fall within the protection scope of the present application.
In order to ensure the accuracy of obtaining the de-lit image using matrices, in a preferred embodiment of the present application the fourth matrix is obtained according to the first matrix, the second matrix and the third matrix, specifically:
the fourth matrix is obtained according to a de-lighting formula: A = B / (C × D),
where A is the fourth matrix, B is the first matrix, C is the second matrix, and D is the third matrix.
In a specific implementation scenario, the converted matrices are processed with the above de-lighting formula to obtain the fourth matrix corresponding to the de-lit image; the illumination information in the face image to be processed is removed while sufficient detail is preserved. The above de-lighting formula is only a preferred formula provided by the present invention; under its teaching, other formulas can also be used to compute a de-lit image by matrix calculation, and such ways of obtaining a de-lit image by matrix calculation fall within the protection scope of the present application.
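A minimal per-pixel sketch of A = B / (C × D): the original image B is divided by the shading term (lighting term C times the normal-map term D), leaving the de-lit image A. The elementwise interpretation, the shapes, and the epsilon guard are assumptions for illustration only:

```python
import numpy as np

eps = 1e-6                                  # guard against division by zero
B = np.full((2, 2, 3), 0.5)                 # original image (first matrix)
C = np.full((2, 2, 3), 2.0)                 # per-pixel lighting term (second)
D = np.full((2, 2, 3), 0.5)                 # normal-derived term (third)
A = B / (C * D + eps)                       # de-lit image (fourth matrix)
```

Here C × D = 1 everywhere, so A recovers B: a pixel whose shading term is 1 is already unlit and passes through unchanged, while brighter-lit pixels (C × D > 1) are darkened toward their albedo.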
The invention discloses a face image processing method and device. A preset neural network model is generated in advance from training data and a preset neural network structure, the preset neural network structure comprising a convolutional neural network block model and convolution kernels. The method comprises: receiving a face image to be processed, and acquiring illumination parameters of the face image to be processed based on the preset neural network model, wherein the illumination parameters comprise spherical harmonic illumination coefficients and a normal map; and generating a de-lit image of the face image to be processed according to the face image to be processed and the illumination parameters. Because the illumination information of the face image is obtained through the preset neural network model, no dedicated light detection equipment is needed, so the accuracy and stability of de-lighting the face image are improved without increasing cost.
In order to further explain the technical idea of the invention, the technical scheme of the invention is described with specific application scenarios.
The embodiment of the invention provides a face image processing method in which a neural network model is first trained on many groups (e.g., 400,000 groups) of data. Once a face photo is input, the model outputs a predicted 27-dimensional spherical harmonic illumination coefficient and a per-pixel normal map, from which the illumination information is computed. Each pixel of the original image is then combined with the illumination information to finally obtain the de-lit image. The implementation flow is shown in fig. 2; the specific steps are as follows:
the first step: a plurality of sets of training data for training a predetermined neural network model are collected. An example of a set of training data is shown in fig. 3. The training data is that accurate face photos are obtained in advance based on parameterized 3D face models, and spherical harmonic illumination coefficients and normal maps corresponding to the face photos. And taking the face photo as input data, and taking the spherical harmonic illumination coefficient and the normal map as the true value of output data. In order to make the preset neural network model have better generalization capability, data enhancement processing is also performed on training data, such as adding various backgrounds.
In a specific implementation scenario, training data may be collected based on a three-dimensional morphable face model (3DMM), using the BFM2017 database. With the morphable model built on a three-dimensional face database, face shape and face texture statistics serve as constraints, and the influence of face pose and illumination factors is considered at the same time, so the generated three-dimensional face model has high precision.
The second step: select a residual network block model to extract features from the face image. The preset neural network structure is shown in fig. 4. Residual networks are easy to optimize and can gain accuracy from considerably increased depth; the residual blocks inside the deep neural network use skip connections, alleviating the vanishing-gradient problem caused by added depth. Different convolution kernels are then added to extract features from pixel regions of different sizes, so that global and local key information points can be captured more accurately. Specifically, convolution kernels of shapes (3, 3), (3, 4) and (3, 5) are slid over the picture. Because a padding mechanism is used, this is independent of the image's pixel size and area. Each convolution uses one kernel to extract one set of features. To avoid the information loss caused by the fully connected layer, some residual network blocks skip the fully connected layer, yielding higher accuracy and recall.
In a specific implementation scenario, in order to extract features from the face image, the preset neural network structure must first be trained with the collected training data to generate the preset neural network model. This specifically includes the following steps:
1. Selecting a preset neural network structure
The number of units of the input layer is determined from the length and width of the face images in the training data. The input and output counts of each hidden layer are then set: the encoding stage uses input/output counts of (3, 64), (64, 128), (256, 256) and (256, 512), and the decoding stage uses (512, 256), (256, 256), (256, 64) and (64, 3).
2. Randomly initializing weights
The weight values in the neural network are initialized to small numbers near 0, but not exactly 0.
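A minimal sketch of such an initialization in NumPy; the 0.01 scale, the Gaussian distribution, and the exact-zero guard are illustrative assumptions rather than values taken from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

def init_weights(shape, scale=0.01):
    """Small random weights near (but never exactly) 0: breaks symmetry
    between units while keeping initial activations well-behaved."""
    w = rng.normal(0.0, scale, size=shape)
    w[w == 0] = scale          # guard against an exact zero (vanishingly unlikely)
    return w

# e.g. a (out_channels, in_channels, kh, kw) convolution weight tensor
W = init_weights((64, 3, 3, 3))
```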
3. Performing forward propagation FP algorithm
For forward propagation, the process can be expressed by the following formula:
a^n = σ(a^(n-1) * W^n + b^n)
where the superscript denotes the layer index, * denotes convolution, b is the bias term, σ is the activation function, and W is the weight value.
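The layer formula can be sketched directly in NumPy. Sigmoid is chosen here as an illustrative σ (the patent does not name the activation), and the example is single-channel with no padding:

```python
import numpy as np

def sigma(x):
    """Activation function σ (sigmoid, as an illustrative choice)."""
    return 1.0 / (1.0 + np.exp(-x))

def forward_layer(a_prev, W, b):
    """One forward-propagation layer: a_n = σ(a_{n-1} * W_n + b_n),
    with '*' the (valid, no-padding) convolution of the formula."""
    kh, kw = W.shape
    h = a_prev.shape[0] - kh + 1
    w = a_prev.shape[1] - kw + 1
    z = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            z[i, j] = np.sum(a_prev[i:i + kh, j:j + kw] * W) + b
    return sigma(z)

a0 = np.random.rand(8, 8)                       # activations of layer n-1
a1 = forward_layer(a0, np.random.rand(3, 3) * 0.1, b=0.0)
```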
4. Calculating a loss function
The calculation formula is as follows:
Loss(image) = λ1*E-RECON + λ2*E-Normal + (1-λ1-λ2)*E-Light
where image is the face image, E-RECON is the difference between the predicted dimming image and the original image, E-Normal is the difference between the predicted normal map and the normal map in the training data, E-Light is the difference between the predicted illumination and the illumination in the training data, λ1 = 0.3, and λ2 = 0.3.
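A sketch of the weighted loss above. The patent only names the three difference terms, so the mean-squared error used below for each term is an assumed metric, not the patent's necessarily:

```python
import numpy as np

def mse(pred, truth):
    """Mean-squared error -- one plausible choice for each difference term."""
    return float(np.mean((pred - truth) ** 2))

def total_loss(e_recon, e_normal, e_light, lam1=0.3, lam2=0.3):
    """Loss(image) = λ1*E-RECON + λ2*E-Normal + (1-λ1-λ2)*E-Light,
    with λ1 = λ2 = 0.3 as in the text (weights sum to 1)."""
    return lam1 * e_recon + lam2 * e_normal + (1 - lam1 - lam2) * e_light
```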
5. Minimizing loss function using optimization algorithm and back propagation algorithm
Stochastic gradient descent keeps a single learning rate for all weight updates, and that rate does not change during training. Adam instead computes first- and second-moment estimates of the gradients to give each parameter its own adaptive learning rate, yielding an efficient training process. The back propagation algorithm, combined with the optimization algorithm, computes the gradient of the loss function with respect to every weight value in the network; this gradient is fed back to the optimizer to update the weight values and minimize the loss function.
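One Adam update step written out in NumPy, to show the per-parameter adaptive rates built from the first (m) and second (v) moment estimates. The hyperparameter defaults are the commonly used ones, assumed rather than taken from the patent:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: unlike plain SGD's single global rate, the
    effective step for each weight is scaled by its own gradient moments."""
    m = b1 * m + (1 - b1) * grad           # first-moment estimate
    v = b2 * v + (1 - b2) * grad ** 2      # second-moment estimate
    m_hat = m / (1 - b1 ** t)              # bias correction
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Toy problem: minimize 0.5*||w||^2, whose gradient is simply w.
w = np.array([1.0, -1.0])
m, v = np.zeros_like(w), np.zeros_like(w)
for t in range(1, 101):
    w, m, v = adam_step(w, w, m, v, t)
# each weight shrinks toward 0 by roughly lr per step
```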
6. Preserving neural network values
After thousands of iterations and learning-rate adjustments, training ends when the loss value can no longer be reduced. The neural network weights saved at the iteration with the smallest loss value are used as the model for the usage stage.
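Keeping the weights from the lowest-loss iteration can be sketched as follows; the loss curve is illustrative and `weights` stands in for the network's real weight tensors:

```python
# Track the best (lowest-loss) checkpoint across training iterations.
best_loss, best_weights = float("inf"), None
for epoch, loss in enumerate([0.9, 0.5, 0.3, 0.35, 0.31]):  # illustrative curve
    weights = {"epoch": epoch}        # stand-in for the real weight tensors
    if loss < best_loss:
        best_loss, best_weights = loss, dict(weights)
# best_weights now holds the checkpoint from the smallest-loss iteration,
# which is what the usage stage loads.
```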
The third step: the trained preset neural network model can be packaged into a web service or an SDK (Software Development Kit) for applications to invoke. The mobile phone sends the face image captured by its camera to the preset neural network model and thereby obtains the spherical harmonic illumination coefficients and the normal map. The face image, the spherical harmonic illumination coefficients and the normal map are converted into matrices, and the dimming formula is then applied to obtain the dimming image matrix. Finally, the de-lighted face image is obtained, as shown in fig. 5.
Specifically, the picture is a 3-channel array of n×n pixels, i.e. an n×n×3 matrix. The spherical harmonic illumination coefficients form a 1×27 matrix; for an n×n picture, they are expanded to an n×n×27 matrix and then computed against the matrix of each channel of the picture.
Wherein, the dimming formula is specifically:
A=B/(C×D)
wherein A is a dimming image matrix, B is an original face image matrix, C is a spherical harmonic illumination coefficient matrix, and D is a normal map matrix.
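A sketch of A = B/(C×D) in NumPy, under the assumption that the 27 coefficients are 9 order-2 spherical-harmonic coefficients per RGB channel, evaluated against the normal map with the standard real SH basis (a common shading model); the per-pixel expansion and the `eps` division guard are illustrative assumptions:

```python
import numpy as np

def sh_basis(normals):
    """First 9 real spherical-harmonic basis values per pixel (order-2 SH);
    the constants are the standard SH normalization factors."""
    x, y, z = normals[..., 0], normals[..., 1], normals[..., 2]
    one = np.ones_like(x)
    return np.stack([
        0.282095 * one,
        0.488603 * y, 0.488603 * z, 0.488603 * x,
        1.092548 * x * y, 1.092548 * y * z,
        0.315392 * (3 * z ** 2 - 1),
        1.092548 * x * z,
        0.546274 * (x ** 2 - y ** 2),
    ], axis=-1)                                     # shape (h, w, 9)

def delight(image, sh_coeffs, normals, eps=1e-6):
    """A = B / (C x D): divide the image by the shading that the 27 SH
    coefficients (9 per RGB channel) and the normal map predict."""
    basis = sh_basis(normals)                       # (h, w, 9)
    coeffs = sh_coeffs.reshape(3, 9)                # 1x27 -> 9 per channel
    shading = np.einsum("hwk,ck->hwc", basis, coeffs)   # (h, w, 3)
    return image / (shading + eps)

h = w = 4
img = np.full((h, w, 3), 0.5)
normals = np.zeros((h, w, 3)); normals[..., 2] = 1.0    # flat frontal surface
coeffs = np.zeros(27); coeffs[0] = coeffs[9] = coeffs[18] = 1.0  # ambient only
albedo = delight(img, coeffs, normals)
```

With ambient-only lighting the shading is constant, so the recovered de-lighted image is the input scaled by one factor per channel.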
By applying this technical solution, a preset neural network model is generated in advance based on training data and a preset neural network structure, the illumination parameters of the face image to be processed are acquired with the preset neural network model, and a dimming image of the face image to be processed is generated from the face image and the illumination parameters. Because the illumination information of the face image is obtained through the preset neural network model, no special light-detection equipment is needed, so the accuracy and stability of de-lighting the face image are improved without increasing cost.
Corresponding to the processing method of the face image provided in the embodiment of the present application, the embodiment of the present application further provides a processing device of the face image, as shown in fig. 6, where the device includes:
the acquiring module 601 is configured to receive a face image to be processed, and acquire illumination parameters of the face image to be processed based on a preset neural network model, where the illumination parameters include a spherical harmonic illumination coefficient and a normal map, and the preset neural network model is generated in advance based on training data and a preset neural network structure;
a generating module 602, configured to generate a dimming image of the face image to be processed according to the face image to be processed and the illumination parameter;
the training data comprises a preset face image, a real spherical harmonic illumination coefficient of the preset face image and a real normal map.
In a specific application scenario of the present application, the generating module 602 is specifically configured to:
acquiring a first matrix corresponding to the face image to be processed based on the face image to be processed;
acquiring a second matrix corresponding to the spherical harmonic illumination coefficient and a third matrix corresponding to the normal map based on the illumination parameter;
and acquiring a fourth matrix according to the first matrix, the second matrix and the third matrix, and acquiring the dimming image based on the fourth matrix.
In a specific application scenario of the present application, the generating module 602 is further configured to:
the fourth matrix is obtained according to a dimming formula, wherein the dimming formula specifically comprises: a=b/(C x D),
wherein a is the fourth matrix, B is the first matrix, C is the second matrix, and D is the third matrix.
In a specific application scenario of the present application, the convolutional neural network block model is a residual network block model, where a preset number of residual network block models are not connected to a full connection layer of the preset neural network structure.
In a specific application scenario of the present application, the training data is data subjected to data enhancement processing, where the data enhancement processing includes increasing a background of the preset face image and/or changing a rotation angle of the preset face image.
In a specific application scenario of the present application, the apparatus further includes a training module, specifically configured to:
determining initial parameters of a preset neural network structure according to the length and the width of the preset face image, wherein the initial parameters comprise the number of units of an input layer, the input number and the output number of each hidden layer and an initial weight value;
inputting the preset face image into the input layer, and determining an output layer result based on a forward propagation algorithm and the initial parameters;
determining a loss function according to the output layer result and the training data;
training according to a preset learning rate based on an optimization algorithm and a back propagation algorithm, and determining a minimum loss value of the loss function according to a training result, wherein the preset learning rate is the learning rate determined by the adaptive moment estimation (Adam) algorithm;
and determining the preset neural network model according to the weight value corresponding to the minimum loss value.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present application, not to limit it. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will appreciate that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (2)

1. A method for processing a face image, wherein a preset neural network model is generated in advance based on training data and a preset neural network structure, the method comprising:
receiving a face image to be processed, and acquiring illumination parameters of the face image to be processed based on the preset neural network model, wherein the illumination parameters comprise spherical harmonic illumination coefficients and normal mapping;
generating a dimming image of the face image to be processed according to the face image to be processed and the illumination parameter; specifically, a first matrix corresponding to the face image to be processed is obtained based on the face image to be processed; acquiring a second matrix corresponding to the spherical harmonic illumination coefficient and a third matrix corresponding to the normal map based on the illumination parameter; obtaining a fourth matrix according to the first matrix, the second matrix and the third matrix, specifically obtaining the fourth matrix according to a dimming formula, wherein the dimming formula specifically includes: A = B/(C×D), where A is the fourth matrix, B is the first matrix, C is the second matrix, and D is the third matrix; and acquiring the dimming image based on the fourth matrix;
the convolutional neural network block model is a residual network block model, wherein a preset number of residual network block models are not connected with a full-connection layer of the preset neural network structure; the training data comprises a preset face image, a real spherical harmonic illumination coefficient of the preset face image and a real normal map; the training data is data subjected to data enhancement processing, and the data enhancement processing comprises the steps of increasing the background of the preset face image and/or changing the rotation angle of the preset face image;
generating a preset neural network model based on training data and a preset neural network structure, wherein the method specifically comprises the following steps: determining initial parameters of a preset neural network structure according to the length and the width of the preset face image, wherein the initial parameters comprise the number of units of an input layer, the input number and the output number of each hidden layer and an initial weight value; inputting the preset face image into the input layer, and determining an output layer result based on a forward propagation algorithm and the initial parameters; determining a loss function according to the output layer result and the training data; training according to a preset learning rate based on an optimization algorithm and a back propagation algorithm, and determining a minimum loss value of the loss function according to a training result, wherein the preset learning rate is the learning rate determined by the adaptive moment estimation (Adam) algorithm; and determining the preset neural network model according to the weight value corresponding to the minimum loss value.
2. A processing apparatus for face images, the apparatus comprising:
the acquisition module is used for receiving the face image to be processed and acquiring illumination parameters of the face image to be processed based on a preset neural network model, wherein the illumination parameters comprise spherical harmonic illumination coefficients and normal mapping, and the preset neural network model is generated in advance based on training data and a preset neural network structure;
the generating module is used for generating a dimming image of the face image to be processed according to the face image to be processed and the illumination parameter; the method is particularly used for acquiring a first matrix corresponding to the face image to be processed based on the face image to be processed; acquiring a second matrix corresponding to the spherical harmonic illumination coefficient and a third matrix corresponding to the normal map based on the illumination parameter; acquiring a fourth matrix according to the first matrix, the second matrix and the third matrix, and acquiring the dimming image based on the fourth matrix;
the generating module is further configured to obtain the fourth matrix according to a dimming formula, where the dimming formula specifically is: A = B/(C×D), where A is the fourth matrix, B is the first matrix, C is the second matrix, and D is the third matrix;
the training data comprises a preset face image, a real spherical harmonic illumination coefficient of the preset face image and a real normal map; the convolutional neural network block model is a residual network block model, wherein a preset number of residual network block models are not connected with a full-connection layer of the preset neural network structure.
CN202010623139.7A 2020-06-30 2020-06-30 Face image processing method and equipment Active CN111951373B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010623139.7A CN111951373B (en) 2020-06-30 2020-06-30 Face image processing method and equipment


Publications (2)

Publication Number Publication Date
CN111951373A CN111951373A (en) 2020-11-17
CN111951373B true CN111951373B (en) 2024-02-13

Family

ID=73337864


Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113989473B (en) * 2021-12-23 2022-08-12 北京天图万境科技有限公司 Method and device for relighting
CN114677291B (en) * 2022-02-25 2023-05-12 荣耀终端有限公司 Image processing method, device and related equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016110030A1 (en) * 2015-01-09 2016-07-14 杭州海康威视数字技术股份有限公司 Retrieval system and method for face image
CN107909640A (en) * 2017-11-06 2018-04-13 清华大学 Face weight illumination method and device based on deep learning
CN108334847A (en) * 2018-02-06 2018-07-27 哈尔滨工业大学 A kind of face identification method based on deep learning under real scene
CN110874632A (en) * 2018-08-31 2020-03-10 北京嘉楠捷思信息技术有限公司 Image recognition processing method and device
CN111028273A (en) * 2019-11-27 2020-04-17 山东大学 Light field depth estimation method based on multi-stream convolution neural network and implementation system thereof
CN111091492A (en) * 2019-12-23 2020-05-01 韶鼎人工智能科技有限公司 Face image illumination migration method based on convolutional neural network
CN111275651A (en) * 2020-02-25 2020-06-12 东南大学 Face bright removal method based on antagonistic neural network


Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
Registration of 3D facial surfaces using covariance matrix pyramids; Moritz Kaiser et al.; 2010 IEEE International Conference on Robotics and Automation; 1-15 *
A linear reconstruction method for face images under canonical illumination; Xiong Pengfei et al.; Pattern Recognition and Artificial Intelligence (04); 1-4 *
Analysis of illumination processing methods in face recognition; Liu Dujin et al.; Computer Systems & Applications (1); 1-4 *
A CycleGAN-based illumination normalization method for unpaired face images; Zeng Bi et al.; Journal of Guangdong University of Technology (05); 1-5 *
GAN-based illumination transfer for face images; Ning Ning et al.; Journal of Beijing Electronic Science and Technology Institute (04); 1-4 *
Face illumination compensation based on linear subspace and quotient image theory; Liu Lihua; Computer Engineering and Applications (25); 1-3 *
Accurate estimation of illumination direction in face images using classical illumination models; Chen Xiaogang et al.; Computer Engineering and Applications (11); 1-3 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant