CN110363116A - Irregular face correction method, system and medium based on GLD-GAN - Google Patents

Irregular face correction method, system and medium based on GLD-GAN

Info

Publication number
CN110363116A
Authority
CN
China
Prior art keywords
image
face
module
network
representing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910575810.2A
Other languages
Chinese (zh)
Other versions
CN110363116B (en)
Inventor
孙锬锋
蒋兴浩
许可
陆翼龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University filed Critical Shanghai Jiaotong University
Priority to CN201910575810.2A priority Critical patent/CN110363116B/en
Publication of CN110363116A publication Critical patent/CN110363116A/en
Application granted granted Critical
Publication of CN110363116B publication Critical patent/CN110363116B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/02 Affine transformations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4053 Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/90 Determination of colour characteristics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 Feature extraction; Face representation
    • G06V 40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20024 Filtering details
    • G06T 2207/20028 Bilateral filtering

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present invention provides an irregular face correction method, system and medium based on GLD-GAN, comprising: step S1: recognizing and cropping the face in an image to obtain a face image P0; step S2: according to the obtained face image P0, eliminating background redundancy with a CRF-RNN-based image segmentation technique; step S3: classifying the images with an Inception model and dividing the side faces into a plurality of classes by angle; step S4: training a GLD-GAN network model for each class. The invention applies generative adversarial networks to the regularization of side faces, realizing the conversion of a single side face into the corresponding frontal face through adversarial learning, and combines the side-face regularization method with super-resolution reconstruction technology to realize end-to-end mapping from side-face portraits under multiple poses and different illumination conditions to high-quality frontal views.

Description

Irregular human face correction method, system and medium based on GLD-GAN
Technical Field
The invention relates to the technical field of image recognition, and in particular to an irregular face correction method, system and medium based on GLD-GAN.
Background
Face recognition technology has matured steadily and is widely applied in practice, for example in security monitoring and attendance check-in. However, existing face recognition remains rather constrained: most current applications work only on clear, front-facing faces, and once the illumination changes or the face deflects or is even occluded, the accuracy of existing face recognition technology drops sharply. Face recognition under complicated, difficult conditions therefore remains a challenging problem. The invention provides an irregular face correction algorithm based on GLD-GAN, which improves face recognition accuracy in complex environments by correcting an irregular face into a clear frontal face.
Few related patents have been published so far. Chinese patent CN108510061A, published 2018.09.07 and titled "Method for synthesizing a frontal face from multiple surveillance videos based on a conditional generative adversarial network", provides a method for synthesizing a frontal face from faces in multiple surveillance videos; it must collect faces at unconstrained deflection angles together with a frontal face from surveillance video and screen out the frontal face, so frontal-face synthesis requires multiple side faces as input. Patent CN108537743A, published 2018.09.14 and titled "Face image enhancement method based on generative adversarial networks", offers another route to irregular face correction based on generative adversarial networks: a 3D dense face alignment method preprocesses face images of various poses before they are fed into the network for training, and the images obtained by this method keep the original illumination, look realistic, and retain the original identity information. Because it adopts a 3D dense face alignment method, however, it runs slowly, and its accuracy drops markedly when the face is deflected at a large angle or occluded. Further, patent CN108491775A, published 2018.09.04 and titled "Image correction method and mobile terminal", aims to improve the display effect of photographs by correcting the orientation of the pupils in a portrait. Although this method corrects irregular faces to some extent, it fails for large-angle side faces, where information for the left or right eye is usually lost.
The invention studies a regularization method for side-face portraits, i.e., reconstructing a frontal face image under ideal illumination conditions from a two-dimensional face image with a non-frontal pose and unsatisfactory illumination, specifically covering pose and illumination correction of the portrait and automatic compensation of facial texture loss. The aim is to realize end-to-end mapping from a single side-face portrait at multiple angles to a frontal portrait view, to solve to some extent the problem of face recognition in complex real environments, and to further improve face recognition accuracy in practical applications.
Patent document CN106874861A (application number: 201710053937.9) discloses a face correction method and system. The method includes: setting a standard face to obtain standard points, the standard points comprising at least five feature points: the center of the left eyeball, the center of the right eyeball, the nose tip, and the left and right mouth corners; selecting different feature points from the standard points and correcting faces at multiple deflection angles by computing similarity-transformation parameters in different ways for input faces at different deflection angles: when the deflection angle is within a first deflection-angle threshold, computing the similarity-transformation parameters from the coordinates of the left and right eyeball centers; when within a second threshold, from the coordinate midway between the two eyeballs and the coordinate midway between the two mouth corners; and when within a third threshold, from unilateral feature points.
Disclosure of Invention
Aiming at the defects in the prior art, the invention aims to provide a method, a system and a medium for correcting irregular human faces based on GLD-GAN.
The irregular human face correction method based on the GLD-GAN provided by the invention comprises the following steps:
step S1: recognizing and cropping the face in the image to obtain a face image P0;
step S2: according to the obtained face image P0, eliminating background redundancy with a CRF-RNN-based image segmentation technique;
step S3: classifying the images with an Inception model, and dividing the side faces into a plurality of classes by angle;
step S4: training a GLD-GAN network model for each class;
step S5: inputting the side-face image into the GLD-GAN network model to obtain a frontal face image a;
step S6: inputting the side-face image into the open-source TP-GAN to obtain a frontal face image b;
step S7: fusing the frontal face images a and b at the ratio a:b = 3:1, and replacing the frontal face in image b with the fused face to obtain a frontal face image c;
step S8: optimizing the frontal face image c with a bilateral filtering algorithm;
step S9: training the super-resolution network SRGAN;
step S10: performing super-resolution processing on the frontal face image c with the SRGAN. A sketch of the whole pipeline follows.
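The list above fixes the order of the stages; a minimal sketch of the glue between them is given below. Every helper name here is a hypothetical placeholder for the corresponding stage, to be supplied by the implementer, not an API taken from the patent.

```python
def correct_irregular_face(image, stages):
    """Chain steps S1-S10; `stages` maps stage names to callables."""
    p0 = stages["detect_and_crop"](image)    # S1: detect and crop the face
    p2 = stages["crf_rnn_segment"](p0)       # S2: remove background redundancy
    k = stages["angle_class"](p2)            # S3: Inception V3 angle class
    a = stages["gld_gan"][k](p2)             # S5: class-specific GLD-GAN
    b = stages["tp_gan"](p2)                 # S6: open-source TP-GAN
    c = stages["fuse_and_swap"](a, b)        # S7: 3:1 fusion pasted into b
    c = stages["bilateral_denoise"](c)       # S8: bilateral filtering
    return stages["srgan_upscale"](c)        # S10: super-resolution output
```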
Preferably, the step S1 includes:
step S101: detecting whether a face exists in the image: if yes, continuing to step S102; otherwise, exiting the current cycle and reporting a no-face-detected error;
step S102: computing the facial-landmark coordinates and the corresponding bounding box from the detection result;
step S103: first locating the nose among the computed facial-landmark coordinates, taking the nose as the central axis of the cropped face image, adjusting the bounding-box position according to the central axis, and cropping the box as the new face image P0.
In step S102:
an image is read, the face is recognized and analyzed with face_recognition, an open-source library based on dlib, and the facial-landmark coordinates and the bounding box of the face are returned;
in step S103, the bounding-box position is adjusted according to the central axis:
let right be the right-border coordinate of the original bounding box and left its left-border coordinate; let the original central-axis coordinate x = (right - left)/2 and the nose abscissa be mid. If x > mid, then right = right - (x - mid) and left is unchanged; if x < mid, then left = left + (mid - x) and right is unchanged; if x = mid, no operation is performed. A sketch of this rule follows.
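The centering rule just stated translates directly into code. A minimal sketch under the definitions above; `mid` is assumed to be the nose abscissa returned by the landmark detector.

```python
def center_box_on_nose(left, right, mid):
    """Adjust the box borders so the nose lies on the central axis."""
    x = (right - left) / 2         # central-axis coordinate as defined above
    if x > mid:
        right = right - (x - mid)  # pull the right border in; left unchanged
    elif x < mid:
        left = left + (mid - x)    # pull the left border in; right unchanged
    return left, right             # x == mid: no operation
```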
Preferably, the step S2 includes:
step S201: segmenting the face image P0 with the open-source CRF-RNN model to obtain the segmented color map P1;
step S202: traversing P0 and P1: if a pixel of P1 has the first color, the corresponding pixel of P0 is changed to the second color, thereby segmenting the face image and yielding an image P2 in which face and background are separated and only the face remains, as in the masking sketch below.
The step S3 includes:
dividing all training images for side-face correction into a plurality of classes according to side-face angle, then feeding them into an Inception V3 model for classification training to obtain a trained Inception V3 network;
the input of the Inception V3 network is a face image, and the output is the angle by which the face is deflected relative to the frontal view;
the step S4 includes:
step S401: preparing a first training, testing and verification data set, in which each side-face image corresponds to one frontal face image;
step S402: splicing each side-face image in the first training, testing and verification data set with its corresponding frontal face image onto one image;
step S403: training one GLD-GAN network model for each class divided by side-face angle, where the loss function of the GLD-GAN network is:

$$G^* = \arg\min_G \max_D L_{GAN}(G,D) + \lambda_1 L_1(G) + \lambda_2 L_2(G) + \lambda_3 L_3(G)$$

wherein:
G* represents the loss function of the GLD-GAN network;
L_GAN(G,D) represents the loss of the GAN network, i.e., a stable local optimum is reached through the max-min game;
λ1 is the weight of the L1 loss;
λ2 is the weight of the blur loss;
λ3 is the weight of the identity-retention loss;
L1 represents the L1 (one-norm) loss, i.e., the global loss function between the generated image and the real-data ground truth;
L2 represents the blur (sharpness) loss function of the generated image;
L3 represents the identity-information loss function between the generated image and the real-data ground truth;
G denotes the generator network; D denotes the discriminator network, whose loss is the local loss function between the generated image and the real-data ground truth; the max-min game between the two adversarial networks drives them to a stable local optimum.
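The composite objective above is a weighted sum of four terms. An illustrative PyTorch composition is sketched below; the blur and identity terms are passed in as callables because the patent does not spell out their exact form, and the weight values are hypothetical.

```python
import torch
import torch.nn.functional as F

def generator_objective(fake, real, disc_fake_logits, blur_loss, id_loss,
                        lam1=100.0, lam2=1.0, lam3=1.0):
    """Assemble L_GAN + lam1*L1 + lam2*L2 + lam3*L3 for the generator."""
    adv = F.binary_cross_entropy_with_logits(            # generator side of
        disc_fake_logits, torch.ones_like(disc_fake_logits))  # the max-min game
    l1 = F.l1_loss(fake, real)                           # one-norm vs ground truth
    return adv + lam1 * l1 + lam2 * blur_loss(fake) + lam3 * id_loss(fake, real)
```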
Preferably, the step S5 includes:
step S501: routing the input face image P2, according to the Inception V3 network, to the GLD-GAN network model of the corresponding class;
step S502: in the GLD-GAN network model, converting the input face image P2 to a picture of preset size and outputting the corresponding frontal face image P3.
The step S6 includes:
in the open-source TP-GAN network model, converting the input face image P2 to a picture of preset size and outputting the corresponding frontal face image P4.
The step S7 includes:
step S701: fusing the frontal face images P3 and P4 at the first preset proportion; the face-fusion algorithm is based on the open-source libraries dlib and OpenCV, where dlib extracts the facial feature model and OpenCV fuses the feature models proportionally to obtain the fused face;
step S702: replacing the original frontal face image P4 with the obtained fused face to obtain image P5; the face-replacement algorithm is likewise based on dlib and OpenCV:
convex hulls are computed from the face keypoints detected by dlib, Delaunay triangulation is performed on the keypoints, and the fused face is mapped onto the frontal face image P4 by affine transformation, as in the sketch below.
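The fuse-and-replace step can be sketched compactly. The sketch below assumes a dlib 68-point landmark model is available on disk, that P3 and P4 are aligned images of the same size, and that a simple pixel blend plus seamless cloning stands in for the proportional feature-model fusion and per-triangle affine warps described above.

```python
import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def landmarks(img):
    """68 face keypoints of the first detected face."""
    rect = detector(img, 1)[0]            # assumes a face is found
    return np.array([(p.x, p.y) for p in predictor(img, rect).parts()],
                    dtype=np.int32)

def fuse_and_replace(p3, p4, ratio=0.75):
    fused = cv2.addWeighted(p3, ratio, p4, 1.0 - ratio, 0)  # 3:1 pixel blend
    hull = cv2.convexHull(landmarks(fused))                 # face convex hull
    mask = np.zeros(p4.shape[:2], np.uint8)
    cv2.fillConvexPoly(mask, hull, 255)
    cx, cy = hull.reshape(-1, 2).mean(axis=0).astype(int)
    # seamlessClone stands in for warping each Delaunay triangle by its own
    # affine map; a full implementation would warp triangle pairs instead.
    return cv2.seamlessClone(fused, p4, mask, (int(cx), int(cy)),
                             cv2.NORMAL_CLONE)
```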
Preferably, the step S8 includes:
optimizing image P5 with a bilateral filter to obtain image P6.
The bilateral filter generates its distance (spatial) template with a two-dimensional Gaussian function and its value-range template with a one-dimensional Gaussian function. The distance template coefficients are:

$$d(i,j,k,l) = \exp\left(-\frac{(i-k)^2 + (j-l)^2}{2\sigma_d^2}\right)$$

wherein:
d(i,j,k,l) represents the distance template coefficients;
k, l are the center coordinates of the template window;
i, j are the coordinates of the other coefficients of the template window;
σ_d is the standard deviation of the spatial Gaussian function.
The value-range template coefficients are generated by:

$$r(i,j,k,l) = \exp\left(-\frac{\|f(i,j) - f(k,l)\|^2}{2\sigma_r^2}\right)$$

wherein:
r(i,j,k,l) represents the value-range template coefficients;
f(i,j) represents the pixel value of the point with coordinates i, j;
f(k,l) represents the pixel value of the point with coordinates k, l;
σ_r represents the standard deviation of the range Gaussian function;
the function f represents the image to be processed.
In summary, the template of the bilateral filter is:

$$w(i,j,k,l) = \exp\left(-\frac{(i-k)^2 + (j-l)^2}{2\sigma_d^2} - \frac{\|f(i,j) - f(k,l)\|^2}{2\sigma_r^2}\right)$$

wherein w(i,j,k,l) represents the template of the bilateral filter. A one-line OpenCV equivalent follows.
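OpenCV ships this filter, so step S8 reduces to a single call in practice. A minimal sketch; the file names, the neighborhood diameter and the two sigma values (sigmaSpace corresponds to σ_d, sigmaColor to σ_r) are hypothetical settings, not values from the patent.

```python
import cv2

p5 = cv2.imread("p5_fused_face.png")                    # hypothetical path
p6 = cv2.bilateralFilter(p5, d=9, sigmaColor=75, sigmaSpace=75)
cv2.imwrite("p6_denoised_face.png", p6)
```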
the step S9 includes:
preparing a second training, testing and verifying data set, and training a super-resolution network model SRGAN;
the size ratio of the low resolution images and the high resolution images in the second training, testing and verifying dataset is a second preset ratio.
Preferably, the step S10 includes:
inputting image P6 into the trained super-resolution network model SRGAN, which performs super-resolution processing on the input image, raises its resolution to the preset resolution, and outputs the final image P7.
The loss function of the generator network of the SRGAN is:

$$l^{SR}_{VGG/i,j} = \frac{1}{W_{i,j} H_{i,j}} \sum_{x=1}^{W_{i,j}} \sum_{y=1}^{H_{i,j}} \left(\phi_{i,j}(I^{HR})_{x,y} - \phi_{i,j}(G_{\theta_G}(I^{LR}))_{x,y}\right)^2$$

wherein:
l^{SR}_{VGG/i,j} represents the loss function of the generator network in the SRGAN;
W_{i,j} represents the width of the VGG-19 feature map φ_{i,j};
H_{i,j} represents the height of the VGG-19 feature map φ_{i,j};
φ_{i,j} represents the feature map obtained from the VGG-19 network after the j-th convolution (after activation) and before the i-th max-pooling layer;
I^{HR} represents the high-resolution image;
(I^{HR})_{x,y} represents the pixel at coordinates x, y of the high-resolution image;
φ_{i,j}(G_{θ_G}(I^{LR}))_{x,y} represents the value at coordinates x, y of the feature map of the image generated from the low-resolution input;
I^{LR} represents the low-resolution image.
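The content loss above compares VGG-19 feature maps rather than raw pixels. A PyTorch sketch under stated assumptions: torchvision's slice `features[:36]` ends with the activation after conv5_4 and before the fifth max-pooling layer, i.e., one concrete choice of φ_{i,j}; in practice pretrained ImageNet weights would be loaded instead of `weights=None`.

```python
import torch
import torchvision

vgg = torchvision.models.vgg19(weights=None).features[:36].eval()
for p in vgg.parameters():
    p.requires_grad = False      # the feature extractor stays frozen

def vgg_content_loss(sr, hr):
    """Mean squared feature-map difference; the mean realizes 1/(W*H)."""
    return torch.nn.functional.mse_loss(vgg(sr), vgg(hr))
```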
The invention provides an irregular human face correction system based on GLD-GAN, which comprises:
module S1: recognizing and cropping the face in the image to obtain a face image P0;
module S2: according to the obtained face image P0, eliminating background redundancy with a CRF-RNN-based image segmentation technique;
module S3: classifying the images with an Inception model, and dividing the side faces into a plurality of classes by angle;
module S4: training a GLD-GAN network model for each class;
module S5: inputting the side-face image into the GLD-GAN network model to obtain a frontal face image a;
module S6: inputting the side-face image into the open-source TP-GAN to obtain a frontal face image b;
module S7: fusing the frontal face images a and b at a preset proportion, and replacing the frontal face in image b with the fused face to obtain a frontal face image c;
module S8: optimizing the frontal face image c with a bilateral filtering algorithm;
module S9: training the super-resolution network SRGAN;
module S10: performing super-resolution processing on the frontal face image c with the SRGAN.
Preferably, the module S1 includes:
a module S101: detecting whether a face exists in the image: if yes, invoking module S102; otherwise, exiting the current cycle and reporting a no-face-detected error;
a module S102: computing the facial-landmark coordinates and the corresponding bounding box from the detection result;
a module S103: first locating the nose among the computed facial-landmark coordinates, taking the nose as the central axis of the cropped face image, adjusting the bounding-box position according to the central axis, and cropping the box as the new face image P0.
In module S102:
an image is read, the face is recognized and analyzed with face_recognition, an open-source library based on dlib, and the facial-landmark coordinates and the bounding box of the face are returned;
in module S103, the bounding-box position is adjusted according to the central axis:
let right be the right-border coordinate of the original bounding box and left its left-border coordinate; let the original central-axis coordinate x = (right - left)/2 and the nose abscissa be mid. If x > mid, then right = right - (x - mid) and left is unchanged; if x < mid, then left = left + (mid - x) and right is unchanged; if x = mid, no operation is performed;
the module S2 includes:
a module S201: segmenting the face image P0 with the open-source CRF-RNN model to obtain the segmented color map P1;
a module S202: traversing P0 and P1: if a pixel of P1 has the first color, the corresponding pixel of P0 is changed to the second color, thereby segmenting the face image and yielding an image P2 in which face and background are separated and only the face remains.
The module S3 includes:
dividing all training images for side-face correction into a plurality of classes according to side-face angle, then feeding them into an Inception V3 model for classification training to obtain a trained Inception V3 network;
the input of the Inception V3 network is a face image, and the output is the angle by which the face is deflected relative to the frontal view;
the module S4 includes:
a module S401: preparing a first training, testing and verification data set, in which each side-face image corresponds to one frontal face image;
a module S402: splicing each side-face image in the first training, testing and verification data set with its corresponding frontal face image onto one image;
a module S403: training one GLD-GAN network model for each class divided by side-face angle, where the loss function of the GLD-GAN network is:

$$G^* = \arg\min_G \max_D L_{GAN}(G,D) + \lambda_1 L_1(G) + \lambda_2 L_2(G) + \lambda_3 L_3(G)$$

wherein:
G* represents the loss function of the GLD-GAN network;
L_GAN(G,D) represents the loss of the GAN network, i.e., a stable local optimum is reached through the max-min game;
λ1 is the weight of the L1 loss;
λ2 is the weight of the blur loss;
λ3 is the weight of the identity-retention loss;
L1 represents the L1 (one-norm) loss, i.e., the global loss function between the generated image and the real-data ground truth;
L2 represents the blur (sharpness) loss function of the generated image;
L3 represents the identity-information loss function between the generated image and the real-data ground truth;
G denotes the generator network; D denotes the discriminator network, whose loss is the local loss function between the generated image and the real-data ground truth; the max-min game between the two adversarial networks drives them to a stable local optimum;
the module S5 includes:
a module S501: routing the input face image P2, according to the Inception V3 network, to the GLD-GAN network model of the corresponding class;
a module S502: in the GLD-GAN network model, converting the input face image P2 to a picture of preset size and outputting the corresponding frontal face image P3.
The module S6 includes:
in the open-source TP-GAN network model, converting the input face image P2 to a picture of preset size and outputting the corresponding frontal face image P4.
The module S7 includes:
a module S701: fusing the frontal face images P3 and P4 at the first preset proportion; the face-fusion algorithm is based on the open-source libraries dlib and OpenCV, where dlib extracts the facial feature model and OpenCV fuses the feature models proportionally to obtain the fused face;
a module S702: replacing the original frontal face image P4 with the obtained fused face to obtain image P5; the face-replacement algorithm is likewise based on dlib and OpenCV:
convex hulls are computed from the face keypoints detected by dlib, Delaunay triangulation is performed on the keypoints, and the fused face is mapped onto the frontal face image P4 by affine transformation.
Preferably, the module S8 includes:
optimizing image P5 with a bilateral filter to obtain image P6.
The bilateral filter generates its distance (spatial) template with a two-dimensional Gaussian function and its value-range template with a one-dimensional Gaussian function. The distance template coefficients are:

$$d(i,j,k,l) = \exp\left(-\frac{(i-k)^2 + (j-l)^2}{2\sigma_d^2}\right)$$

wherein:
d(i,j,k,l) represents the distance template coefficients;
k, l are the center coordinates of the template window;
i, j are the coordinates of the other coefficients of the template window;
σ_d is the standard deviation of the spatial Gaussian function.
The value-range template coefficients are generated by:

$$r(i,j,k,l) = \exp\left(-\frac{\|f(i,j) - f(k,l)\|^2}{2\sigma_r^2}\right)$$

wherein:
r(i,j,k,l) represents the value-range template coefficients;
f(i,j) represents the pixel value of the point with coordinates i, j;
f(k,l) represents the pixel value of the point with coordinates k, l;
σ_r represents the standard deviation of the range Gaussian function;
the function f represents the image to be processed.
In summary, the template of the bilateral filter is:

$$w(i,j,k,l) = \exp\left(-\frac{(i-k)^2 + (j-l)^2}{2\sigma_d^2} - \frac{\|f(i,j) - f(k,l)\|^2}{2\sigma_r^2}\right)$$

wherein w(i,j,k,l) represents the template of the bilateral filter.
the module S9 includes:
preparing a second training, testing and verifying data set, and training a super-resolution network model SRGAN;
the size ratio of the low-resolution images and the high-resolution images in the second training, testing and verifying data set is a second preset ratio;
the module S10 includes:
inputting image P6 into the trained super-resolution network model SRGAN, which performs super-resolution processing on the input image, raises its resolution to the preset resolution, and outputs the final image P7.
The loss function of the generator network of the SRGAN is:

$$l^{SR}_{VGG/i,j} = \frac{1}{W_{i,j} H_{i,j}} \sum_{x=1}^{W_{i,j}} \sum_{y=1}^{H_{i,j}} \left(\phi_{i,j}(I^{HR})_{x,y} - \phi_{i,j}(G_{\theta_G}(I^{LR}))_{x,y}\right)^2$$

wherein:
l^{SR}_{VGG/i,j} represents the loss function of the generator network in the SRGAN;
W_{i,j} represents the width of the VGG-19 feature map φ_{i,j};
H_{i,j} represents the height of the VGG-19 feature map φ_{i,j};
φ_{i,j} represents the feature map obtained from the VGG-19 network after the j-th convolution (after activation) and before the i-th max-pooling layer;
I^{HR} represents the high-resolution image;
(I^{HR})_{x,y} represents the pixel at coordinates x, y of the high-resolution image;
φ_{i,j}(G_{θ_G}(I^{LR}))_{x,y} represents the value at coordinates x, y of the feature map of the image generated from the low-resolution input;
I^{LR} represents the low-resolution image.
According to the present invention, there is provided a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the GLD-GAN-based irregular face correction method described in any one of the above.
Compared with the prior art, the invention has the following beneficial effects:
1. The invention applies generative adversarial network technology to the regularization of side faces, realizing the conversion from a single side face to the corresponding frontal face through adversarial learning.
2. The invention fuses the faces generated by the TP-GAN network and the GLD-GAN, preserving facial identity features on top of side-face regularization.
3. The invention filters the generated image with a bilateral filter, removing noise interference.
4. The invention combines the side-face regularization method with super-resolution reconstruction technology, realizing end-to-end mapping from side-face images to high-quality frontal views under multiple poses and different illumination.
5. The invention applies generative adversarial network technology to side-face regularization, realizes the conversion from a single side face to the corresponding frontal face through adversarial learning, and at the same time combines the side-face regularization method with super-resolution reconstruction to realize end-to-end mapping from side-face portraits to high-quality frontal views under multiple poses and different illumination.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading the detailed description of non-limiting embodiments with reference to the following drawings:
Fig. 1 is a schematic flow diagram of the irregular face correction method based on GLD-GAN provided by the present invention.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit the invention in any way. It should be noted that it would be obvious to those skilled in the art that various changes and modifications can be made without departing from the spirit of the invention, all of which fall within the scope of the present invention.
The irregular human face correction method based on the GLD-GAN provided by the invention comprises the following steps:
step S1: recognizing and cropping the face in the image to obtain a face image P0;
step S2: according to the obtained face image P0, eliminating background redundancy with a CRF-RNN-based image segmentation technique;
step S3: classifying the images with an Inception model, and dividing the side faces into a plurality of classes by angle;
step S4: training a GLD-GAN network model for each class;
step S5: inputting the side-face image into the GLD-GAN network model to obtain a frontal face image a;
step S6: inputting the side-face image into the open-source TP-GAN to obtain a frontal face image b;
step S7: fusing the frontal face images a and b at the ratio a:b = 3:1, and replacing the frontal face in image b with the fused face to obtain a frontal face image c;
step S8: optimizing the frontal face image c with a bilateral filtering algorithm;
step S9: training the super-resolution network SRGAN;
step S10: performing super-resolution processing on the frontal face image c with the SRGAN.
Specifically, the step S1 includes:
step S101: detecting whether a face exists in the image: if yes, continuing to step S102; otherwise, exiting the current cycle and reporting a no-face-detected error;
step S102: computing the facial-landmark coordinates and the corresponding bounding box from the detection result;
step S103: first locating the nose among the computed facial-landmark coordinates, taking the nose as the central axis of the cropped face image, adjusting the bounding-box position according to the central axis, and cropping the box as the new face image P0.
In step S102:
an image is read, the face is recognized and analyzed with face_recognition, an open-source library based on dlib, and the facial-landmark coordinates and the bounding box of the face are returned;
in step S103, the bounding-box position is adjusted according to the central axis:
let right be the right-border coordinate of the original bounding box and left its left-border coordinate; let the original central-axis coordinate x = (right - left)/2 and the nose abscissa be mid. If x > mid, then right = right - (x - mid) and left is unchanged; if x < mid, then left = left + (mid - x) and right is unchanged; if x = mid, no operation is performed.
Specifically, the step S2 includes:
step S201: segmenting the face image P0 with the open-source CRF-RNN model to obtain the segmented color map P1;
step S202: traversing P0 and P1: if a pixel of P1 has the first color, the corresponding pixel of P0 is changed to the second color, thereby segmenting the face image and yielding an image P2 in which face and background are separated and only the face remains.
The step S3 includes:
dividing all training images for side-face correction into a plurality of classes according to side-face angle, then feeding them into an Inception V3 model for classification training to obtain a trained Inception V3 network;
the input of the Inception V3 network is a face image, and the output is the angle by which the face is deflected relative to the frontal view;
the step S4 includes:
step S401: preparing a first training, testing and verification data set, in which each side-face image corresponds to one frontal face image;
step S402: splicing each side-face image in the first training, testing and verification data set with its corresponding frontal face image onto one image;
step S403: training one GLD-GAN network model for each class divided by side-face angle, where the loss function of the GLD-GAN network is:

$$G^* = \arg\min_G \max_D L_{GAN}(G,D) + \lambda_1 L_1(G) + \lambda_2 L_2(G) + \lambda_3 L_3(G)$$

wherein:
G* represents the loss function of the GLD-GAN network;
L_GAN(G,D) represents the loss of the GAN network, i.e., a stable local optimum is reached through the max-min game;
λ1 is the weight of the L1 loss;
λ2 is the weight of the blur loss;
λ3 is the weight of the identity-retention loss;
L1 represents the L1 (one-norm) loss, i.e., the global loss function between the generated image and the real-data ground truth;
L2 represents the blur (sharpness) loss function of the generated image;
L3 represents the identity-information loss function between the generated image and the real-data ground truth;
G denotes the generator network; D denotes the discriminator network, whose loss is the local loss function between the generated image and the real-data ground truth; the max-min game between the two adversarial networks drives them to a stable local optimum.
Specifically, the step S5 includes:
step S501: routing the input face image P2, according to the Inception V3 network, to the GLD-GAN network model of the corresponding class;
step S502: in the GLD-GAN network model, converting the input face image P2 to a picture of preset size and outputting the corresponding frontal face image P3.
The step S6 includes:
in the open-source TP-GAN network model, converting the input face image P2 to a picture of preset size and outputting the corresponding frontal face image P4.
The step S7 includes:
step S701: fusing the frontal face images P3 and P4 at the first preset proportion; the face-fusion algorithm is based on the open-source libraries dlib and OpenCV, where dlib extracts the facial feature model and OpenCV fuses the feature models proportionally to obtain the fused face;
step S702: replacing the original frontal face image P4 with the obtained fused face to obtain image P5; the face-replacement algorithm is likewise based on dlib and OpenCV:
convex hulls are computed from the face keypoints detected by dlib, Delaunay triangulation is performed on the keypoints, and the fused face is mapped onto the frontal face image P4 by affine transformation.
Specifically, the step S8 includes:
optimizing image P5 with a bilateral filter to obtain image P6.
The bilateral filter generates its distance (spatial) template with a two-dimensional Gaussian function and its value-range template with a one-dimensional Gaussian function. The distance template coefficients are:

$$d(i,j,k,l) = \exp\left(-\frac{(i-k)^2 + (j-l)^2}{2\sigma_d^2}\right)$$

wherein:
d(i,j,k,l) represents the distance template coefficients;
k, l are the center coordinates of the template window;
i, j are the coordinates of the other coefficients of the template window;
σ_d is the standard deviation of the spatial Gaussian function.
The value-range template coefficients are generated by:

$$r(i,j,k,l) = \exp\left(-\frac{\|f(i,j) - f(k,l)\|^2}{2\sigma_r^2}\right)$$

wherein:
r(i,j,k,l) represents the value-range template coefficients;
f(i,j) represents the pixel value of the point with coordinates i, j;
f(k,l) represents the pixel value of the point with coordinates k, l;
σ_r represents the standard deviation of the range Gaussian function;
the function f represents the image to be processed.
In summary, the template of the bilateral filter is:

$$w(i,j,k,l) = \exp\left(-\frac{(i-k)^2 + (j-l)^2}{2\sigma_d^2} - \frac{\|f(i,j) - f(k,l)\|^2}{2\sigma_r^2}\right)$$

wherein w(i,j,k,l) represents the template of the bilateral filter.
the step S9 includes:
preparing a second training, testing and verifying data set, and training a super-resolution network model SRGAN;
the size ratio of the low resolution images and the high resolution images in the second training, testing and verifying dataset is a second preset ratio.
Specifically, the step S10 includes:
inputting image P6 into the trained super-resolution network model SRGAN, which performs super-resolution processing on the input image, raises its resolution to the preset resolution, and outputs the final image P7.
The loss function of the generator network of the SRGAN is:

$$l^{SR}_{VGG/i,j} = \frac{1}{W_{i,j} H_{i,j}} \sum_{x=1}^{W_{i,j}} \sum_{y=1}^{H_{i,j}} \left(\phi_{i,j}(I^{HR})_{x,y} - \phi_{i,j}(G_{\theta_G}(I^{LR}))_{x,y}\right)^2$$

wherein:
l^{SR}_{VGG/i,j} represents the loss function of the generator network in the SRGAN;
W_{i,j} represents the width of the VGG-19 feature map φ_{i,j};
H_{i,j} represents the height of the VGG-19 feature map φ_{i,j};
φ_{i,j} represents the feature map obtained from the VGG-19 network after the j-th convolution (after activation) and before the i-th max-pooling layer;
I^{HR} represents the high-resolution image;
(I^{HR})_{x,y} represents the pixel at coordinates x, y of the high-resolution image;
φ_{i,j}(G_{θ_G}(I^{LR}))_{x,y} represents the value at coordinates x, y of the feature map of the image generated from the low-resolution input;
I^{LR} represents the low-resolution image.
The irregular face correction system based on GLD-GAN can be realized through the steps and flow of the irregular face correction method based on GLD-GAN; those skilled in the art can understand the method as a preferred example of the system.
The invention provides an irregular human face correction system based on GLD-GAN, which comprises:
module S1: recognizing and cropping the face in the image to obtain a face image P0;
module S2: according to the obtained face image P0, eliminating background redundancy with a CRF-RNN-based image segmentation technique;
module S3: classifying the images with an Inception model, and dividing the side faces into a plurality of classes by angle;
module S4: training a GLD-GAN network model for each class;
module S5: inputting the side-face image into the GLD-GAN network model to obtain a frontal face image a;
module S6: inputting the side-face image into the open-source TP-GAN to obtain a frontal face image b;
module S7: fusing the frontal face images a and b at a preset proportion, and replacing the frontal face in image b with the fused face to obtain a frontal face image c;
module S8: optimizing the frontal face image c with a bilateral filtering algorithm;
module S9: training the super-resolution network SRGAN;
module S10: performing super-resolution processing on the frontal face image c with the SRGAN.
Specifically, the module S1 includes:
a module S101: detecting whether a face exists in the image: if yes, invoking module S102; otherwise, exiting the current cycle and reporting a no-face-detected error;
a module S102: computing the facial-landmark coordinates and the corresponding bounding box from the detection result;
a module S103: first locating the nose among the computed facial-landmark coordinates, taking the nose as the central axis of the cropped face image, adjusting the bounding-box position according to the central axis, and cropping the box as the new face image P0.
In module S102:
an image is read, the face is recognized and analyzed with face_recognition, an open-source library based on dlib, and the facial-landmark coordinates and the bounding box of the face are returned;
in module S103, the bounding-box position is adjusted according to the central axis:
let right be the right-border coordinate of the original bounding box and left its left-border coordinate; let the original central-axis coordinate x = (right - left)/2 and the nose abscissa be mid. If x > mid, then right = right - (x - mid) and left is unchanged; if x < mid, then left = left + (mid - x) and right is unchanged; if x = mid, no operation is performed;
the module S2 includes:
a module S201: segmenting the face image P0 with the open-source CRF-RNN model to obtain the segmented color map P1;
a module S202: traversing P0 and P1: if a pixel of P1 has the first color, the corresponding pixel of P0 is changed to the second color, thereby segmenting the face image and yielding an image P2 in which face and background are separated and only the face remains.
The module S3 includes:
dividing all training images for side-face correction into a plurality of classes according to side-face angle, then feeding them into an Inception V3 model for classification training to obtain a trained Inception V3 network;
the input of the Inception V3 network is a face image, and the output is the angle by which the face is deflected relative to the frontal view;
the module S4 includes:
a module S401: preparing a first training, testing and verification data set, in which each side-face image corresponds to one frontal face image;
a module S402: splicing each side-face image in the first training, testing and verification data set with its corresponding frontal face image onto one image;
a module S403: training one GLD-GAN network model for each class divided by side-face angle, where the loss function of the GLD-GAN network is:

$$G^* = \arg\min_G \max_D L_{GAN}(G,D) + \lambda_1 L_1(G) + \lambda_2 L_2(G) + \lambda_3 L_3(G)$$

wherein:
G* represents the loss function of the GLD-GAN network;
L_GAN(G,D) represents the loss of the GAN network, i.e., a stable local optimum is reached through the max-min game;
λ1 is the weight of the L1 loss;
λ2 is the weight of the blur loss;
λ3 is the weight of the identity-retention loss;
L1 represents the L1 (one-norm) loss, i.e., the global loss function between the generated image and the real-data ground truth;
L2 represents the blur (sharpness) loss function of the generated image;
L3 represents the identity-information loss function between the generated image and the real-data ground truth;
G denotes the generator network; D denotes the discriminator network, whose loss is the local loss function between the generated image and the real-data ground truth; the max-min game between the two adversarial networks drives them to a stable local optimum;
the module S5 includes:
a module S501: routing the input face image P2, according to the Inception V3 network, to the GLD-GAN network model of the corresponding class;
a module S502: in the GLD-GAN network model, converting the input face image P2 to a picture of preset size and outputting the corresponding frontal face image P3.
The module S6 includes:
in the open-source TP-GAN network model, converting the input face image P2 to a picture of preset size and outputting the corresponding frontal face image P4.
The module S7 includes:
a module S701: fusing the frontal face images P3 and P4 at the first preset proportion; the face-fusion algorithm is based on the open-source libraries dlib and OpenCV, where dlib extracts the facial feature model and OpenCV fuses the feature models proportionally to obtain the fused face;
a module S702: replacing the original frontal face image P4 with the obtained fused face to obtain image P5; the face-replacement algorithm is likewise based on dlib and OpenCV:
convex hulls are computed from the face keypoints detected by dlib, Delaunay triangulation is performed on the keypoints, and the fused face is mapped onto the frontal face image P4 by affine transformation.
Specifically, the module S8 includes:
optimizing image P5 with a bilateral filter to obtain image P6.
The bilateral filter generates its distance (spatial) template with a two-dimensional Gaussian function and its value-range template with a one-dimensional Gaussian function. The distance template coefficients are:

$$d(i,j,k,l) = \exp\left(-\frac{(i-k)^2 + (j-l)^2}{2\sigma_d^2}\right)$$

wherein:
d(i,j,k,l) represents the distance template coefficients;
k, l are the center coordinates of the template window;
i, j are the coordinates of the other coefficients of the template window;
σ_d is the standard deviation of the spatial Gaussian function.
The value-range template coefficients are generated by:

$$r(i,j,k,l) = \exp\left(-\frac{\|f(i,j) - f(k,l)\|^2}{2\sigma_r^2}\right)$$

wherein:
r(i,j,k,l) represents the value-range template coefficients;
f(i,j) represents the pixel value of the point with coordinates i, j;
f(k,l) represents the pixel value of the point with coordinates k, l;
σ_r represents the standard deviation of the range Gaussian function;
the function f represents the image to be processed.
In summary, the template of the bilateral filter is:

$$w(i,j,k,l) = \exp\left(-\frac{(i-k)^2 + (j-l)^2}{2\sigma_d^2} - \frac{\|f(i,j) - f(k,l)\|^2}{2\sigma_r^2}\right)$$

wherein w(i,j,k,l) represents the template of the bilateral filter.
the module S9 includes:
preparing a second training, testing and verifying data set, and training a super-resolution network model SRGAN;
the size ratio of the low-resolution images and the high-resolution images in the second training, testing and verifying data set is a second preset ratio;
the module S10 includes:
inputting image P6 into the trained super-resolution network model SRGAN, which performs super-resolution processing on the input image, raises its resolution to the preset resolution, and outputs the final image P7.
The loss function of the generator network of the SRGAN is:

$$l^{SR}_{VGG/i,j} = \frac{1}{W_{i,j} H_{i,j}} \sum_{x=1}^{W_{i,j}} \sum_{y=1}^{H_{i,j}} \left(\phi_{i,j}(I^{HR})_{x,y} - \phi_{i,j}(G_{\theta_G}(I^{LR}))_{x,y}\right)^2$$

wherein:
l^{SR}_{VGG/i,j} represents the loss function of the generator network in the SRGAN;
W_{i,j} represents the width of the VGG-19 feature map φ_{i,j};
H_{i,j} represents the height of the VGG-19 feature map φ_{i,j};
φ_{i,j} represents the feature map obtained from the VGG-19 network after the j-th convolution (after activation) and before the i-th max-pooling layer;
I^{HR} represents the high-resolution image;
(I^{HR})_{x,y} represents the pixel at coordinates x, y of the high-resolution image;
φ_{i,j}(G_{θ_G}(I^{LR}))_{x,y} represents the value at coordinates x, y of the feature map of the image generated from the low-resolution input;
I^{LR} represents the low-resolution image.
According to the present invention, there is provided a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the GLD-GAN-based irregular face correction method described in any one of the above.
The present invention will be described more specifically below with reference to preferred examples.
Preferred example 1:
As shown in Fig. 1, the irregular face correction method based on GLD-GAN provided by the present invention includes the following steps:
step S1: recognizing and cropping a face in an image;
step S2: eliminating background redundancy with a CRF-RNN-based image segmentation technique;
step S3: classifying the images with an Inception model and dividing the side faces into classes by angle;
step S4: training a GLD-GAN network model for each angle;
step S5: inputting the side-face image into the GLD-GAN to obtain a frontal face image a;
step S6: inputting the side-face image into the open-source TP-GAN to obtain a frontal face image b;
step S7: fusing the frontal face images a and b at the ratio a:b = 3:1, and replacing the frontal face in image b with the fused face to obtain a frontal face image c;
step S8: optimizing the frontal face image c with a bilateral filtering algorithm;
step S9: training the super-resolution network SRGAN;
step S10: performing super-resolution processing on the frontal face image c with the SRGAN.
The step S1 includes:
step S1.1: detecting whether a face exists in the image;
step S1.2: if yes, computing the facial-landmark coordinates and a bounding box;
specifically, an image is read, the face is recognized and analyzed with face_recognition, an open-source library based on dlib, and the facial-landmark coordinates and bounding box of the face are returned;
step S1.3: the nose is first located among the landmark coordinates and taken as the central axis of the cropped face image; the bounding-box position is adjusted according to the central axis and the box is cropped as the new face image P0.
Specifically, let the original central axis x = (right - left)/2 and the nose abscissa be mid; if x > mid, then length = mid - left, right = right - (x - mid), and left is unchanged; if x < mid, then length = right - mid, left = left + (mid - x), and right is unchanged.
The step S2 includes:
step S2.1: segmenting the face image P0 obtained in step 1 with the open-source CRF-RNN model to obtain the segmented color map P1;
step S2.2: traversing P0 and P1: if a pixel of P1 corresponds to red (192, 0, 0), the corresponding pixel of P0 is changed to pure black (0, 0, 0), thereby segmenting the face image;
step S2.3: storing the image P2 obtained in step 2.2, in which face and background are separated and only the face remains.
The step S3 includes: manually labeling all training images for side-face correction by side-face angle, dividing them into seven classes of 0, 15, 30, 45, 60, 75 and 90 degrees, and then feeding them into an Inception V3 model for classification training;
specifically, seven folders corresponding to the seven classes are prepared, the corresponding training data are placed in them, and the classification model is then read and trained with Inception V3. Each time an image is input, the model returns the likelihood that the image belongs to each class; in this project only the most likely class is required. A training sketch follows.
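A minimal torchvision sketch of this seven-way classifier, assuming the seven folders are named "0" through "90" under a hypothetical train/ directory; the transforms, batch size, learning rate and single epoch shown are assumptions, and only the main logits head is resized and used.

```python
import torch
import torchvision

tfm = torchvision.transforms.Compose([
    torchvision.transforms.Resize((299, 299)),   # Inception V3 input size
    torchvision.transforms.ToTensor(),
])
data = torchvision.datasets.ImageFolder("train/", transform=tfm)
loader = torch.utils.data.DataLoader(data, batch_size=32, shuffle=True)

model = torchvision.models.inception_v3(weights=None, aux_logits=True)
model.fc = torch.nn.Linear(model.fc.in_features, 7)   # seven angle classes
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

model.train()
for images, labels in loader:                         # one epoch
    out = model(images)                               # (logits, aux_logits)
    loss = torch.nn.functional.cross_entropy(out.logits, labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
```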
The step S4 includes:
step S4.1: preparing the corresponding training, testing and verification data sets;
step S4.2: the data sets required by the GLD-GAN network are paired one-to-one, i.e., each side-face image corresponds to one frontal face image, and the two images are then spliced onto one image;
step S4.3: because there are seven classes in total, a GLD-GAN network model must be trained for each class. The loss function of the GLD-GAN network is:

$$G^* = \arg\min_G \max_D L_{GAN}(G,D) + \lambda_1 L_1(G) + \lambda_2 L_2(G) + \lambda_3 L_3(G)$$

wherein:
G* represents the loss function of the GLD-GAN network;
L_GAN(G,D) represents the loss of the GAN network, i.e., a stable local optimum is reached through the max-min game;
λ1 is the weight of the L1 loss;
λ2 is the weight of the blur loss;
λ3 is the weight of the identity-retention loss;
L1 represents the L1 (one-norm) loss, i.e., the global loss function between the generated image and the ground truth;
L2 represents the blur (sharpness) loss function of the generated image;
L3 represents the identity-information loss function between the generated image and the ground truth;
G denotes the generator network; D denotes the discriminator network, whose loss is the local loss function between the generated image and the ground truth; the max-min game between the two adversarial networks drives them to a stable local optimum.
The step S5 includes:
step S5.1: routing the input face image P2, according to the Inception V3 network, to the GLD-GAN network of the corresponding angle;
step S5.2: regardless of the size of the input face image P2, it is converted to 256 × 256, and the corresponding frontal face image P3 is output.
The step S6 includes: regardless of the size of the input face image P2, it is converted to 256 × 256, and the corresponding frontal face image P4 is output.
The step S7 includes:
step S7.1: will face upImage P3:P4And performing face fusion according to the ratio of 3:1, wherein a face fusion algorithm is based on an open source library dlib and OpenCV. And the dlib is used for extracting a feature model of the human face, and the feature models are fused according to the proportion by using OpenCV.
Step S7.2: the obtained fused face replaces the original front face image P4 to obtain the image P5; the face replacement algorithm is likewise based on dlib and OpenCV.
Specifically, a convex hull is computed from the face key points detected by dlib, Delaunay triangulation is performed on the key points, and the fused face is mapped onto the face image P4 by an affine transformation. The affine transformation formula is

(x', y')ᵀ = M · (x, y, 1)ᵀ, where M = [ m00 m01 m02 ; m10 m11 m12 ]

wherein (x, y) are the coordinates of a feature point of the fused face, (x', y') are the coordinates of the transformed feature point in the image P4, M is the affine transformation matrix comprising translation, rotation and scaling operations, and m00 to m12 are the parameters of the affine transformation.
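The per-triangle mapping can be sketched with OpenCV as follows; the triangulation itself (from the dlib key points, e.g. via cv2.Subdiv2D) is assumed to have been computed already, and the function name is illustrative.

    import cv2
    import numpy as np

    def warp_triangle(src, dst, tri_src, tri_dst):
        """Map one Delaunay triangle of the fused face onto the target image.

        tri_src / tri_dst: 3x2 float32 arrays of corresponding landmark triangles.
        """
        r1 = cv2.boundingRect(tri_src)
        r2 = cv2.boundingRect(tri_dst)
        # Triangle coordinates relative to their bounding rectangles.
        t1 = tri_src - np.array(r1[:2], dtype=np.float32)
        t2 = tri_dst - np.array(r2[:2], dtype=np.float32)
        # 2x3 matrix M = [[m00 m01 m02], [m10 m11 m12]] from the formula above.
        M = cv2.getAffineTransform(t1, t2)
        patch = src[r1[1]:r1[1] + r1[3], r1[0]:r1[0] + r1[2]]
        warped = cv2.warpAffine(patch, M, (r2[2], r2[3]),
                                flags=cv2.INTER_LINEAR,
                                borderMode=cv2.BORDER_REFLECT_101)
        # Paste only inside the destination triangle.
        mask = np.zeros((r2[3], r2[2], 3), dtype=np.float32)
        cv2.fillConvexPoly(mask, np.int32(t2), (1.0, 1.0, 1.0))
        roi = dst[r2[1]:r2[1] + r2[3], r2[0]:r2[0] + r2[2]]
        dst[r2[1]:r2[1] + r2[3], r2[0]:r2[0] + r2[2]] = roi * (1 - mask) + warped * mask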
The step S8 includes: using a bilateral filtering algorithm, the image P5 is optimized and denoised to a certain extent to obtain the image P6. The bilateral filter generates a distance template with a two-dimensional Gaussian function and a value-range template with a one-dimensional Gaussian function. The distance template coefficient is

d(i, j, k, l) = exp( −((i − k)² + (j − l)²) / (2σd²) )

wherein d(i, j, k, l) represents the distance template coefficient; k and l are the center coordinates of the template window; i and j are the coordinates of the other coefficients of the template window; and σd is the standard deviation of the Gaussian function. The value-range template coefficient is generated by the formula

r(i, j, k, l) = exp( −|f(i, j) − f(k, l)|² / (2σr²) )

wherein r(i, j, k, l) represents the value-range template coefficient; f(i, j) and f(k, l) represent the pixel values of the points with coordinates (i, j) and (k, l), respectively; σr is the standard deviation of the Gaussian function; and the function f represents the image to be processed. In summary, the template of the bilateral filter is

w(i, j, k, l) = d(i, j, k, l) · r(i, j, k, l) = exp( −((i − k)² + (j − l)²) / (2σd²) − |f(i, j) − f(k, l)|² / (2σr²) )

wherein w(i, j, k, l) represents the template of the bilateral filter.
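OpenCV provides this filter directly; in the sketch below d is the neighborhood diameter, sigmaColor plays the role of σr and sigmaSpace the role of σd, with illustrative values and a placeholder file name.

    import cv2

    # Step S8 sketch: bilateral filtering of the fused image P5.
    p5 = cv2.imread("P5_fused.png")
    p6 = cv2.bilateralFilter(p5, d=9, sigmaColor=75, sigmaSpace=75)
    cv2.imwrite("P6_denoised.png", p6)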
The step S9 includes: preparing corresponding training, testing and verifying data sets; the size ratio of the low-resolution to the high-resolution images is 1:4. In practice a ready-made low-resolution image is not required: it can be obtained directly by downscaling the high-resolution image, as sketched below.
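A minimal sketch of building one such 1:4 pair by downscaling, with placeholder file names:

    import cv2

    # Build a low-/high-resolution training pair for SRGAN from an HR image.
    hr = cv2.imread("HR_face.png")                  # e.g. 1024 x 1024
    h, w = hr.shape[:2]
    lr = cv2.resize(hr, (w // 4, h // 4), interpolation=cv2.INTER_CUBIC)
    cv2.imwrite("LR_face.png", lr)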
The step S10 includes: the image P6 is input into the SRGAN network model; the super-resolution model performs super-resolution processing on the input image, greatly improving its resolution to 1024 × 1024, and outputs the final image P7. The loss function of the generating network of the SRGAN network is

l^SR_VGG/i,j = (1 / (Wi,j · Hi,j)) · Σ_{x=1..Wi,j} Σ_{y=1..Hi,j} ( φi,j(I^HR)_{x,y} − φi,j(G_θG(I^LR))_{x,y} )²

wherein,
l^SR_VGG/i,j represents the loss function of the generating network in the SRGAN network;
Wi,j represents the width of the corresponding VGG-19 feature map;
Hi,j represents the height of the corresponding VGG-19 feature map;
φi,j represents the feature map obtained from the j-th convolution (after activation) before the i-th max-pooling layer of the VGG-19 network;
I^HR represents the high-resolution image;
(I^HR)_{x,y} represents the pixel with coordinates (x, y) in the high-resolution image;
φi,j(·)_{x,y} represents the pixel with coordinates (x, y) in the feature map corresponding to an image;
I^LR represents the low-resolution image;
φi,j(G_θG(I^LR))_{x,y} represents the pixel with coordinates (x, y) in the feature map of the image generated from the low-resolution image.
This loss takes the pixel-wise loss on the feature map of a chosen layer as the content loss, rather than the pixel-wise loss on the final output, so that the network can learn the manifold space in which the images lie. The loss function of the SRGAN discrimination network is in fact a summation of negative logarithms, which benefits training.
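A hedged PyTorch sketch of this VGG-based content loss follows; the cut at layer 36 of torchvision's VGG-19 (the activation of the last convolution before the fifth max-pooling layer) is one common choice of φi,j, not the only one.

    import torch
    import torch.nn.functional as F
    import torchvision.models as models

    # Frozen VGG-19 feature extractor for the SRGAN content loss.
    vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).features[:36].eval()
    for p in vgg.parameters():
        p.requires_grad_(False)

    def vgg_content_loss(sr, hr):
        """Pixel-wise MSE between VGG-19 feature maps of the super-resolved
        and high-resolution images, rather than between the images themselves."""
        return F.mse_loss(vgg(sr), vgg(hr))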
Preferred example 2:
An irregular face correction method based on the independently developed GLD-GAN comprises the following steps:
step S1: recognizing and intercepting a face in an image;
step S2: eliminating background redundancy by adopting an image segmentation technology based on CRF-RNN;
step S3: image classification is carried out by adopting an Inception model, and the side faces are classified according to angle;
Step S4: training a GLD-GAN network model of each angle;
step S5: inputting the side face image into GLD-GAN to obtain a front face image a;
step S6: inputting the side face image into an open source TP-GAN to obtain a front face image b;
step S7: fusing the front face images a and b at the ratio a:b = 3:1, and replacing the front face in image b with the fused front face to obtain a front face image c;
step S8: performing optimization adjustment by adopting a bilateral filtering algorithm to obtain a frontal face image c;
step S9: training a super-resolution network SRGAN;
step S10: and performing super-resolution processing on the front face image c by using the SRGAN.
The step S1 includes:
step S1.1: detecting whether a human face exists in the image;
step S1.2: if yes, calculating the coordinates of the five sense organs and a bounding box;
specifically, reading an image, identifying and analyzing the human face, and returning the coordinates of the five sense organs and a bounding box of the face, by utilizing the dlib-based open source library face_recognition;
step S1.3: the position of the nose is first located among the coordinates of the five sense organs, ensuring that the nose lies on the central axis of the cropped face image; the position of the bounding box is adjusted according to this central axis, and the bounding box is cropped out as the new face image P0.
Specifically, let right and left be the right and left boundary coordinates of the original bounding box, let the original central-axis coordinate be x = (right − left)/2, and let the abscissa of the nose be mid: if x > mid, then right = right − (x − mid) and left is unchanged; if x < mid, then left = left + (mid − x) and right is unchanged.
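Taken literally, this recentering rule can be sketched with the face_recognition library as follows; the input path is a placeholder and the choice of the nose-bridge mean as the nose abscissa is an assumption.

    import face_recognition
    import numpy as np

    # Step S1 sketch: detect the face, then recenter the bounding box on the nose.
    image = face_recognition.load_image_file("input.jpg")
    locations = face_recognition.face_locations(image)
    if not locations:
        raise RuntimeError("no face detected")      # the error case of step S1.1

    top, right, bottom, left = locations[0]
    landmarks = face_recognition.face_landmarks(image)[0]
    mid = int(np.mean([p[0] for p in landmarks["nose_bridge"]]))  # nose abscissa

    x = (right - left) // 2          # central-axis coordinate as defined above
    if x > mid:
        right = right - (x - mid)    # left unchanged
    elif x < mid:
        left = left + (mid - x)      # right unchanged

    p0 = image[top:bottom, left:right]  # new face image P0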
The step S2 includes:
step S2.1: utilizing the open source model CRF-RNN, the face image P0 obtained in step S1 is segmented to obtain an image-segmentation color map P1;
Step S2.2: traverse P0 and P1; if a pixel in P1 corresponds to red (192,0,0), the corresponding pixel in P0 is changed to pure black (0,0,0);
Step S2.3: storing the image P2 obtained by the processing of step S2.2, in which the face and the background are separated and only the face remains.
The step S3 includes: dividing all training images for side-face correction into seven classes of 0, 15, 30, 45, 60, 75 and 90 degrees according to the angle of the side face, and then putting them into an Inception V3 model for classification training;
specifically, seven folders corresponding to the seven classes are prepared, the corresponding training data is placed in each, and the classification model is then read and trained using Inception V3. Each time an image is input into the model, the model returns the likelihood that the image belongs to each class; in this project only the most likely result is returned.
The step S4 includes:
step S4.1: preparing corresponding training, testing and verifying data sets;
step S4.2: because there are seven classes in total, it is necessary to train a GLD-GAN network model for each class;
the step S5 includes:
step S5.1: according to the Inception V3 network, the input face image P2 is assigned to the GLD-GAN network of the corresponding angle.
Step S5.2: regardless of the size of the input face image P2, it is converted to 256 × 256 and the corresponding front face image P3 is output.
The step S6 includes: regardless of the size of the input face image P2, it is converted to 256 × 256 and the corresponding front face image P4 is output.
The step S7 includes:
step S7.1: the front face images P3 and P4 are fused at a ratio of 3:1; the face fusion algorithm is based on the open source libraries dlib and OpenCV.
Step S7.2: the obtained fused face replaces the original front face image P4 to obtain the image P5; the face replacement algorithm is likewise based on dlib and OpenCV.
The step S8 includes: using a bilateral filtering algorithm, the image P5 is optimized and denoised to a certain extent to obtain the image P6.
The step S9 includes: preparing corresponding training, testing and verifying data sets;
the step S10 includes: the image P6 is input into the SRGAN network model; the super-resolution model performs super-resolution processing on the input image, greatly improving its resolution to 1024 × 1024.
In the description of the present application, it is to be understood that the terms "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", and the like indicate orientations or positional relationships based on those shown in the drawings, and are only for convenience in describing the present application and simplifying the description, but do not indicate or imply that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and thus, should not be construed as limiting the present application.
Those skilled in the art will appreciate that, in addition to implementing the systems, apparatus, and various modules thereof provided by the present invention in purely computer readable program code, the same procedures can be implemented entirely by logically programming method steps such that the systems, apparatus, and various modules thereof are provided in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system, the device and the modules thereof provided by the present invention can be considered as a hardware component, and the modules included in the system, the device and the modules thereof for implementing various programs can also be considered as structures in the hardware component; modules for performing various functions may also be considered to be both software programs for performing the methods and structures within hardware components.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments and features of the embodiments of the present application may be combined with each other arbitrarily without conflict.

Claims (10)

1. An irregular human face correction method based on GLD-GAN is characterized by comprising the following steps:
step S1: recognizing and intercepting the face in the image to obtain a face image P0
Step S2: according to the obtained face image P0, eliminating background redundancy by adopting an image segmentation technology based on CRF-RNN;
step S3: classifying the images by adopting an Inception model, and dividing the side faces into a plurality of classes according to angle;
step S4: training a GLD-GAN network model of each class;
step S5: inputting the side face image into a GLD-GAN network model to obtain a front face image a;
step S6: inputting the side face image into an open source TP-GAN to obtain a front face image b;
step S7: fusing the front face images a and b at the ratio a:b = 3:1, and replacing the front face in image b with the fused front face to obtain a front face image c;
step S8: performing optimization adjustment by adopting a bilateral filtering algorithm to obtain a frontal face image c;
step S9: training a super-resolution network SRGAN;
step S10: and performing super-resolution processing on the front face image c by using the SRGAN.
2. The GLD-GAN based irregular face correction method according to claim 1, wherein the step S1 comprises:
step S101: detecting whether a human face exists in the image: if yes, the step S102 is carried out continuously, otherwise, the current cycle is skipped out and the error of the undetected face is reported;
step S102: calculating, according to the detection, the coordinates of the five sense organs and a corresponding bounding box;
step S103: firstly positioning the nose among the calculated coordinates of the five sense organs, ensuring that the nose lies on the central axis of the cropped face image, adjusting the position of the bounding box according to the central axis, and cropping the bounding box out as a new face image P0;
The step S102:
reading an image, identifying and analyzing a human face by utilizing an open source library face _ recognition based on dlib, and returning coordinates and a bounding box of five sense organs of the human face;
in the step S103, the position of the bounding box is adjusted according to the central axis:
let right be the right boundary coordinate of the original bounding box and left the left boundary coordinate, let the original central-axis coordinate x = (right − left)/2, and let the abscissa of the nose be mid; if x > mid, then right = right − (x − mid) and left is unchanged; if x < mid, then left = left + (mid − x) and right is unchanged; if x = mid, no operation is performed.
3. The GLD-GAN based irregular face correction method according to claim 2, wherein the step S2 comprises:
step S201: utilizing the open source model CRF-RNN, the face image P0 is segmented to obtain an image-segmentation color map P1;
Step S202: traverse P0 and P1; if a pixel in P1 corresponds to the first color, the corresponding pixel in P0 is changed to the second color, thereby achieving the effect of segmenting the face image and obtaining the image P2 in which the face and the background are separated and only the face remains.
The step S3 includes:
dividing all training images for side-face correction into a plurality of classes according to the angle of the side face, and then putting them into an Inception V3 model for classification training to obtain a trained Inception V3 network;
the input of the Inception V3 network is a face image, and the output is the angle by which the face image is deflected relative to the front face;
the step S4 includes:
step S401: preparing a first training, testing and verifying data set, wherein one side face image of an image in the first training, testing and verifying data set corresponds to one front face image;
step S402: splicing the side face image and the corresponding front face image in the first training, testing and verifying data set on one image;
step S403: training a GLD-GAN network model for each class divided according to the angle of the side face, wherein the loss function of the GLD-GAN network is:

G* = arg min_G max_D L_GAN(G, D) + λ1·L1(G) + λ2·L2(G) + λ3·L3(G)

wherein,
G* represents the overall loss function of the GLD-GAN network;
L_GAN(G, D) represents the loss of the GAN itself, i.e., a stable local optimal solution is reached through the maximum-minimum value;
λ1 represents a parameter of the L1 loss;
λ2 represents a parameter of the blur loss;
λ3 represents a parameter of the identity-retention loss;
L1(G) represents the L1 loss, i.e., the one-norm loss;
L2(G) represents the blur loss;
L3(G) represents the identity-retention loss;
G represents the loss function of the generating network;
D represents the loss function of the discrimination network, i.e., the local loss function between the generated image and the real data (ground truth);
the maximum and minimum values are the game between the generative adversarial networks, so as to reach a stable local optimal solution;
L1 denotes the overall loss function between the generated image and the real data ground truth;
L2 denotes the sharpness loss function of the generated image;
L3 denotes the identity-information loss function between the generated image and the real data ground truth.
4. The GLD-GAN based irregular face correction method according to claim 3, wherein the step S5 comprises:
step S501: according to the Inception V3 network, the input face image P2 is assigned to the GLD-GAN network model of the corresponding class;
step S502: in the GLD-GAN network model, the input face image P2 is converted into a picture of a preset size, and the corresponding front face image P3 is output;
The step S6 includes:
in the open source TP-GAN network model, the input face image P2 is converted into a picture of a preset size, and the corresponding front face image P4 is output;
The step S7 includes:
step S701: the front face images P3 and P4 are fused at a first preset ratio; the face fusion algorithm is based on the open source libraries dlib and OpenCV, wherein dlib extracts the feature model of the face and OpenCV then fuses the feature models proportionally to obtain the fused face;
step S702: the obtained fused face replaces the original front face image P4 to obtain the image P5; the face replacement algorithm is likewise based on dlib and OpenCV;
in replacing the original front face image P4 with the obtained fused face, a convex hull is computed from the face key points detected by dlib, Delaunay triangulation is performed on the key points, and the fused face is mapped onto the face image P4 by affine transformation.
5. The GLD-GAN based irregular face correction method according to claim 4, wherein the step S8 comprises:
using a bilateral filter, the image P5 is optimized to obtain the image P6;
the bilateral filter generates a distance template with a two-dimensional Gaussian function and a value-range template with a one-dimensional Gaussian function; the distance template coefficient is:

d(i, j, k, l) = exp( −((i − k)² + (j − l)²) / (2σd²) )

wherein,
d(i, j, k, l) represents the distance template coefficient;
k and l are the center coordinates of the template window;
i and j are the coordinates of the other coefficients of the template window;
σd is the standard deviation of the Gaussian function;
the value-range template coefficient is generated by the formula:

r(i, j, k, l) = exp( −|f(i, j) − f(k, l)|² / (2σr²) )

wherein,
r(i, j, k, l) represents the value-range template coefficient;
f(i, j) represents the pixel value of the point with coordinates (i, j);
f(k, l) represents the pixel value of the point with coordinates (k, l);
σr represents the standard deviation of the Gaussian function;
the function f represents the image to be processed;
in summary, the template of the bilateral filter is:

w(i, j, k, l) = d(i, j, k, l) · r(i, j, k, l) = exp( −((i − k)² + (j − l)²) / (2σd²) − |f(i, j) − f(k, l)|² / (2σr²) )

wherein w(i, j, k, l) represents the template of the bilateral filter;
the step S9 includes:
preparing a second training, testing and verifying data set, and training a super-resolution network model SRGAN;
the size ratio of the low resolution images and the high resolution images in the second training, testing and verifying dataset is a second preset ratio.
6. The GLD-GAN based irregular face correction method according to claim 5, wherein the step S10 comprises:
the image P6 is input into the trained super-resolution network model SRGAN, which performs super-resolution processing on the input image, increases the image resolution to a preset resolution, and outputs the final image P7;
wherein the loss function of the generating network of the SRGAN network is:

l^SR_VGG/i,j = (1 / (Wi,j · Hi,j)) · Σ_{x=1..Wi,j} Σ_{y=1..Hi,j} ( φi,j(I^HR)_{x,y} − φi,j(G_θG(I^LR))_{x,y} )²

wherein,
l^SR_VGG/i,j represents the loss function of the generating network in the SRGAN network;
Wi,j represents the width of the corresponding VGG-19 feature map;
Hi,j represents the height of the corresponding VGG-19 feature map;
φi,j represents the feature map obtained from the j-th convolution (after activation) before the i-th max-pooling layer of the VGG-19 network;
I^HR represents the high-resolution image;
(I^HR)_{x,y} represents the pixel with coordinates (x, y) in the high-resolution image;
φi,j(·)_{x,y} represents the pixel with coordinates (x, y) in the feature map corresponding to an image;
I^LR represents the low-resolution image;
φi,j(G_θG(I^LR))_{x,y} represents the pixel with coordinates (x, y) in the feature map of the image generated from the low-resolution image.
7. An irregular face rectification system based on GLD-GAN, comprising:
module S1: recognizing and intercepting the face in the image to obtain a face image P0
Module S2: according to the obtained face image P0, eliminating background redundancy by adopting an image segmentation technology based on CRF-RNN;
module S3: classifying the images by adopting an Inception model, and dividing the side faces into a plurality of classes according to angle;
module S4: training a GLD-GAN network model of each class;
module S5: inputting the side face image into a GLD-GAN network model to obtain a front face image a;
module S6: inputting the side face image into an open source TP-GAN to obtain a front face image b;
module S7: fusing the front face images a and b according to a preset proportion, and replacing the front face in the image b with the fused front face to obtain a front face image c;
module S8: performing optimization adjustment by adopting a bilateral filtering algorithm to obtain a frontal face image c;
module S9: training a super-resolution network SRGAN;
module S10: and performing super-resolution processing on the front face image c by using the SRGAN.
8. The GLD-GAN based irregular face correction system of claim 7, wherein the module S1 comprises:
a module S101: detecting whether a human face exists in the image: if yes, calling the module S102, otherwise, jumping out of the current cycle and reporting the error of the undetected face;
a module S102: calculating, according to the detection, the coordinates of the five sense organs and a corresponding bounding box;
a module S103: firstly positioning the nose among the calculated coordinates of the five sense organs, ensuring that the nose lies on the central axis of the cropped face image, adjusting the position of the bounding box according to the central axis, and cropping the bounding box out as a new face image P0;
The module S102:
reading an image, identifying and analyzing a human face by utilizing the dlib-based open source library face_recognition, and returning the coordinates of the five sense organs and a bounding box of the human face;
the module S103, adjusting the position of the bounding box according to the central axis:
let right be the right boundary coordinate of the original bounding box and left the left boundary coordinate, let the original central-axis coordinate x = (right − left)/2, and let the abscissa of the nose be mid; if x > mid, then right = right − (x − mid) and left is unchanged; if x < mid, then left = left + (mid − x) and right is unchanged; if x = mid, no operation is performed;
the module S2 includes:
a module S201: utilizing the open source model CRF-RNN, the face image P0 is segmented to obtain an image-segmentation color map P1;
a module S202: traverse P0 and P1; if a pixel in P1 corresponds to the first color, the corresponding pixel in P0 is changed to the second color, thereby achieving the effect of segmenting the face image and obtaining the image P2 in which the face and the background are separated and only the face remains.
The module S3 includes:
dividing all training images for side-face correction into a plurality of classes according to the angle of the side face, and then putting them into an Inception V3 model for classification training to obtain a trained Inception V3 network;
the input of the Inception V3 network is a face image, and the output is the angle by which the face image is deflected relative to the front face;
the module S4 includes:
a module S401: preparing a first training, testing and verifying data set, wherein one side face image of an image in the first training, testing and verifying data set corresponds to one front face image;
a module S402: splicing the side face image and the corresponding front face image in the first training, testing and verifying data set on one image;
a module S403: training a GLD-GAN network model for each class divided according to the angle of the side face, wherein the loss function of the GLD-GAN network is:

G* = arg min_G max_D L_GAN(G, D) + λ1·L1(G) + λ2·L2(G) + λ3·L3(G)

wherein,
G* represents the overall loss function of the GLD-GAN network;
L_GAN(G, D) represents the loss of the GAN itself, i.e., a stable local optimal solution is reached through the maximum-minimum value;
λ1 represents a parameter of the L1 loss;
λ2 represents a parameter of the blur loss;
λ3 represents a parameter of the identity-retention loss;
L1(G) represents the L1 loss, i.e., the one-norm loss;
L2(G) represents the blur loss;
L3(G) represents the identity-retention loss;
G represents the loss function of the generating network;
D represents the loss function of the discrimination network, i.e., the local loss function between the generated image and the real data (ground truth);
the maximum and minimum values are the game between the generative adversarial networks, so as to reach a stable local optimal solution;
L1 denotes the overall loss function between the generated image and the real data ground truth;
L2 denotes the sharpness loss function of the generated image;
L3 denotes the identity-information loss function between the generated image and the real data ground truth;
the module S5 includes:
a module S501: according to the Inception V3 network, the input face image P2 is assigned to the GLD-GAN network model of the corresponding class;
a module S502: in the GLD-GAN network model, the input face image P2 is converted into a picture of a preset size, and the corresponding front face image P3 is output;
The module S6 includes:
in the open source TP-GAN network model, the input face image P2 is converted into a picture of a preset size, and the corresponding front face image P4 is output;
The module S7 includes:
a module S701: the front face images P3 and P4 are fused at a first preset ratio; the face fusion algorithm is based on the open source libraries dlib and OpenCV, wherein dlib extracts the feature model of the face and OpenCV then fuses the feature models proportionally to obtain the fused face;
a module S702: the obtained fused face replaces the original front face image P4 to obtain the image P5; the face replacement algorithm is likewise based on dlib and OpenCV;
in replacing the original front face image P4 with the obtained fused face, a convex hull is computed from the face key points detected by dlib, Delaunay triangulation is performed on the key points, and the fused face is mapped onto the face image P4 by affine transformation.
9. The GLD-GAN based irregular face correction system of claim 8, wherein the module S8 comprises:
using a bilateral filter, the image P5 is optimized to obtain the image P6;
the bilateral filter generates a distance template with a two-dimensional Gaussian function and a value-range template with a one-dimensional Gaussian function; the distance template coefficient is:

d(i, j, k, l) = exp( −((i − k)² + (j − l)²) / (2σd²) )

wherein,
d(i, j, k, l) represents the distance template coefficient;
k and l are the center coordinates of the template window;
i and j are the coordinates of the other coefficients of the template window;
σd is the standard deviation of the Gaussian function;
the value-range template coefficient is generated by the formula:

r(i, j, k, l) = exp( −|f(i, j) − f(k, l)|² / (2σr²) )

wherein,
r(i, j, k, l) represents the value-range template coefficient;
f(i, j) represents the pixel value of the point with coordinates (i, j);
f(k, l) represents the pixel value of the point with coordinates (k, l);
σr represents the standard deviation of the Gaussian function;
the function f represents the image to be processed;
in summary, the template of the bilateral filter is:

w(i, j, k, l) = d(i, j, k, l) · r(i, j, k, l) = exp( −((i − k)² + (j − l)²) / (2σd²) − |f(i, j) − f(k, l)|² / (2σr²) )

wherein w(i, j, k, l) represents the template of the bilateral filter;
the module S9 includes:
preparing a second training, testing and verifying data set, and training a super-resolution network model SRGAN;
the size ratio of the low-resolution images and the high-resolution images in the second training, testing and verifying data set is a second preset ratio;
the module S10 includes:
the image P6 is input into the trained super-resolution network model SRGAN, which performs super-resolution processing on the input image, increases the image resolution to a preset resolution, and outputs the final image P7;
wherein the loss function of the generating network of the SRGAN network is:

l^SR_VGG/i,j = (1 / (Wi,j · Hi,j)) · Σ_{x=1..Wi,j} Σ_{y=1..Hi,j} ( φi,j(I^HR)_{x,y} − φi,j(G_θG(I^LR))_{x,y} )²

wherein,
l^SR_VGG/i,j represents the loss function of the generating network in the SRGAN network;
Wi,j represents the width of the corresponding VGG-19 feature map;
Hi,j represents the height of the corresponding VGG-19 feature map;
φi,j represents the feature map obtained from the j-th convolution (after activation) before the i-th max-pooling layer of the VGG-19 network;
I^HR represents the high-resolution image;
(I^HR)_{x,y} represents the pixel with coordinates (x, y) in the high-resolution image;
φi,j(·)_{x,y} represents the pixel with coordinates (x, y) in the feature map corresponding to an image;
I^LR represents the low-resolution image;
φi,j(G_θG(I^LR))_{x,y} represents the pixel with coordinates (x, y) in the feature map of the image generated from the low-resolution image.
10. A computer readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the GLD-GAN based irregular face correction method according to any one of claims 1 to 6.
CN201910575810.2A 2019-06-28 2019-06-28 Irregular human face correction method, system and medium based on GLD-GAN Active CN110363116B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910575810.2A CN110363116B (en) 2019-06-28 2019-06-28 Irregular human face correction method, system and medium based on GLD-GAN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910575810.2A CN110363116B (en) 2019-06-28 2019-06-28 Irregular human face correction method, system and medium based on GLD-GAN

Publications (2)

Publication Number Publication Date
CN110363116A true CN110363116A (en) 2019-10-22
CN110363116B CN110363116B (en) 2021-07-23

Family

ID=68216060

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910575810.2A Active CN110363116B (en) 2019-06-28 2019-06-28 Irregular human face correction method, system and medium based on GLD-GAN

Country Status (1)

Country Link
CN (1) CN110363116B (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111291669A (en) * 2020-01-22 2020-06-16 武汉大学 Two-channel depression angle human face fusion correction GAN network and human face fusion correction method
CN111488865A (en) * 2020-06-28 2020-08-04 腾讯科技(深圳)有限公司 Image optimization method and device, computer storage medium and electronic equipment
CN111539337A (en) * 2020-04-26 2020-08-14 上海眼控科技股份有限公司 Vehicle posture correction method, device and equipment
CN111737429A (en) * 2020-06-16 2020-10-02 平安科技(深圳)有限公司 Training method, AI interview method and related equipment
CN112001360A (en) * 2020-09-09 2020-11-27 深圳中神电子科技有限公司 Face recognition monitoring system based on intelligent adjustment
CN112164002A (en) * 2020-09-10 2021-01-01 深圳前海微众银行股份有限公司 Training method and device for face correction model, electronic equipment and storage medium
CN112348783A (en) * 2020-10-27 2021-02-09 基建通(三亚)国际科技有限公司 Image-based person identification method and device and computer-readable storage medium
CN112598580A (en) * 2020-12-29 2021-04-02 广州光锥元信息科技有限公司 Method and device for improving definition of portrait photo
CN113158784A (en) * 2021-03-10 2021-07-23 苏州臻迪智能科技有限公司 Face recognition method with improved recognition accuracy and unmanned aerial vehicle
WO2021179485A1 (en) * 2020-03-11 2021-09-16 平安科技(深圳)有限公司 Image rectification processing method and apparatus, storage medium, and computer device
CN114049250A (en) * 2022-01-13 2022-02-15 广州卓腾科技有限公司 Method, device and medium for correcting face pose of certificate photo
CN114283265A (en) * 2021-12-03 2022-04-05 北京航空航天大学 Unsupervised face correcting method based on 3D rotation modeling
CN115908962A (en) * 2022-06-13 2023-04-04 北京融合未来技术有限公司 Neural network training method, pulse signal reconstruction image generation method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107818313A (en) * 2017-11-20 2018-03-20 腾讯科技(深圳)有限公司 Vivo identification method, device, storage medium and computer equipment
CN109284738A (en) * 2018-10-25 2019-01-29 上海交通大学 Irregular face antidote and system
CN109919873A (en) * 2019-03-07 2019-06-21 电子科技大学 A kind of eye fundus image Enhancement Method based on picture breakdown

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107818313A (en) * 2017-11-20 2018-03-20 腾讯科技(深圳)有限公司 Vivo identification method, device, storage medium and computer equipment
CN109284738A (en) * 2018-10-25 2019-01-29 上海交通大学 Irregular face antidote and system
CN109919873A (en) * 2019-03-07 2019-06-21 电子科技大学 A kind of eye fundus image Enhancement Method based on picture breakdown

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
FEI PENG ET AL.: "FD-GAN: Face de-morphing generative adversarial network for restoring accomplice's facial image", IEEE *

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111291669B (en) * 2020-01-22 2023-08-04 武汉大学 Dual-channel depression angle face fusion correction GAN network and face fusion correction method
CN111291669A (en) * 2020-01-22 2020-06-16 武汉大学 Two-channel depression angle human face fusion correction GAN network and human face fusion correction method
WO2021179485A1 (en) * 2020-03-11 2021-09-16 平安科技(深圳)有限公司 Image rectification processing method and apparatus, storage medium, and computer device
CN111539337A (en) * 2020-04-26 2020-08-14 上海眼控科技股份有限公司 Vehicle posture correction method, device and equipment
WO2021139234A1 (en) * 2020-06-16 2021-07-15 平安科技(深圳)有限公司 Training method, ai interviewing method, and related devices
CN111737429B (en) * 2020-06-16 2023-11-03 平安科技(深圳)有限公司 Training method, AI interview method and related equipment
CN111737429A (en) * 2020-06-16 2020-10-02 平安科技(深圳)有限公司 Training method, AI interview method and related equipment
CN111488865A (en) * 2020-06-28 2020-08-04 腾讯科技(深圳)有限公司 Image optimization method and device, computer storage medium and electronic equipment
CN112001360B (en) * 2020-09-09 2021-06-04 深圳市集互共享科技有限公司 Face recognition monitoring system based on intelligent adjustment
CN112001360A (en) * 2020-09-09 2020-11-27 深圳中神电子科技有限公司 Face recognition monitoring system based on intelligent adjustment
CN112164002A (en) * 2020-09-10 2021-01-01 深圳前海微众银行股份有限公司 Training method and device for face correction model, electronic equipment and storage medium
CN112164002B (en) * 2020-09-10 2024-02-09 深圳前海微众银行股份有限公司 Training method and device of face correction model, electronic equipment and storage medium
WO2022052530A1 (en) * 2020-09-10 2022-03-17 深圳前海微众银行股份有限公司 Method and apparatus for training face correction model, electronic device, and storage medium
CN112348783B (en) * 2020-10-27 2022-08-05 基建通(三亚)国际科技有限公司 Image-based person identification method and device and computer-readable storage medium
CN112348783A (en) * 2020-10-27 2021-02-09 基建通(三亚)国际科技有限公司 Image-based person identification method and device and computer-readable storage medium
CN112598580B (en) * 2020-12-29 2023-07-25 广州光锥元信息科技有限公司 Method and device for improving definition of portrait photo
CN112598580A (en) * 2020-12-29 2021-04-02 广州光锥元信息科技有限公司 Method and device for improving definition of portrait photo
CN113158784A (en) * 2021-03-10 2021-07-23 苏州臻迪智能科技有限公司 Face recognition method with improved recognition accuracy and unmanned aerial vehicle
CN114283265A (en) * 2021-12-03 2022-04-05 北京航空航天大学 Unsupervised face correcting method based on 3D rotation modeling
CN114283265B (en) * 2021-12-03 2024-06-21 北京航空航天大学 Unsupervised face alignment method based on 3D rotation modeling
CN114049250A (en) * 2022-01-13 2022-02-15 广州卓腾科技有限公司 Method, device and medium for correcting face pose of certificate photo
CN115908962A (en) * 2022-06-13 2023-04-04 北京融合未来技术有限公司 Neural network training method, pulse signal reconstruction image generation method and device
CN115908962B (en) * 2022-06-13 2023-11-14 北京融合未来技术有限公司 Training method of neural network, pulse signal reconstruction image generation method and device

Also Published As

Publication number Publication date
CN110363116B (en) 2021-07-23

Similar Documents

Publication Publication Date Title
CN110363116B (en) Irregular human face correction method, system and medium based on GLD-GAN
CN109284738B (en) Irregular face correction method and system
Li et al. Underwater image enhancement via medium transmission-guided multi-color space embedding
CN108985181B (en) End-to-end face labeling method based on detection segmentation
TWI709107B (en) Image feature extraction method and saliency prediction method including the same
CN112766160A (en) Face replacement method based on multi-stage attribute encoder and attention mechanism
US20190251675A1 (en) Image processing method, image processing device and storage medium
WO2022156640A1 (en) Gaze correction method and apparatus for image, electronic device, computer-readable storage medium, and computer program product
CN111325823A (en) Method, device and equipment for acquiring face texture image and storage medium
WO2022156626A1 (en) Image sight correction method and apparatus, electronic device, computer-readable storage medium, and computer program product
WO2022156622A1 (en) Sight correction method and apparatus for face image, device, computer-readable storage medium, and computer program product
CN106909875A (en) Face shape of face sorting technique and system
CN109087261B (en) Face correction method based on unlimited acquisition scene
CN111445410A (en) Texture enhancement method, device and equipment based on texture image and storage medium
US20180357819A1 (en) Method for generating a set of annotated images
CN109711268B (en) Face image screening method and device
CN113343878A (en) High-fidelity face privacy protection method and system based on generation countermeasure network
US20220270215A1 (en) Method for applying bokeh effect to video image and recording medium
CN114120163A (en) Video frame processing method and device, and related equipment and storage medium thereof
CN113159158A (en) License plate correction and reconstruction method and system based on generation countermeasure network
CN116342519A (en) Image processing method based on machine learning
CN111915735A (en) Depth optimization method for three-dimensional structure contour in video
Zhang 2D Computer Vision
CN113887329A (en) Head posture positioning and detecting method and application and system thereof
CN110298229B (en) Video image processing method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant