CN108596024B - Portrait generation method based on face structure information - Google Patents

Info

Publication number
CN108596024B
CN108596024B
Authority
CN
China
Prior art keywords
image
face
sketch
network
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810206139.XA
Other languages
Chinese (zh)
Other versions
CN108596024A (en)
Inventor
俞俊 (Yu Jun)
施圣洁 (Shi Shengjie)
高飞 (Gao Fei)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN201810206139.XA
Publication of CN108596024A
Application granted
Publication of CN108596024B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Abstract

The invention discloses a portrait generation method based on face structure information. The method comprises the following steps: 1. Data preprocessing is performed on the original image, the target image, and the face structure information. 2. Feature extraction and fusion are performed with a face structure information model at the input of the image generator. 3. A combined loss function based on the face structure components is used in the loss function of the image generator. 4. A generative adversarial network is used, in which images are produced by the generator and judged by the discriminator. 5. Model training: the neural network parameters are trained with the back-propagation algorithm. The invention provides a neural network model for generating a sketch portrait from a face photo, in particular a method that guides the portrait generator with face-part information and computes a per-part loss from that information to optimize the network parameters.

Description

Portrait generation method based on face structure information
Technical Field
The invention relates to a generative adversarial network for face photo-to-sketch synthesis (Photo-Sketch Synthesis), and mainly relates to modeling the generation of face sketch portraits and using face structure information to guide the optimization of the generated images.
Background
The problem of face sketch portrait generation (Sketch Portrait Generation) is to generate a corresponding face sketch from a given face photo; it is also called photo-sketch conversion (Photo-Sketch Conversion) or face sketch synthesis (Face Sketch Synthesis). Face sketch generation has many applications, for example in entertainment and criminal investigation. An ideal generated face sketch has two characteristics: first, it preserves the person's appearance, so that the identity information in the sketch can be recognized with high accuracy; second, it should look like a genuine sketch drawing, so that it is visually convincing. Although some successful methods have been proposed in this field, existing generation methods, even those based on deep learning, still produce blurring or severe deformation during face sketch generation.
In recent years, generative adversarial networks (GANs) have been highly successful on problems such as image style transfer (Image Style Transfer), image super-resolution (Image Super-Resolution), and image-to-image translation (Image-to-Image Translation). The sketch generation problem can be cast as a photo-to-sketch translation problem, which is well handled by modeling with a conditional generative adversarial network (cGAN). Although a cGAN can achieve good performance, for example generating good textures, it is difficult for it to effectively model the relationships between the parts of a face without a given face structure.
In real scenes, photo-to-sketch conversion is also widely applied, particularly in criminal investigation and security, where it can help investigators locate a suspect or narrow the search scope. Although surveillance video is now widely deployed, in practice the suspect is often not captured or the resolution is too low; in such cases a witness describes the suspect's facial features, a professional artist draws a face sketch portrait, and the sketch is compared against a police database to find the suspect. Face recognition on face photos is mature, but matching sketches against photos is still not well solved. On the other hand, sketch generation also pursues visual quality: a good generated sketch preserves identity information and has sketch-like texture and clear detail in each part, which makes it widely applicable to entertainment.
Because real face images have complex content, many facial parts, and differences in the detail each part must present, photo-to-sketch conversion faces great challenges. Specifically, there are two main difficulties:
(1) Modeling the face photo image, extracting features, and preserving identity information: in traditional face sketch generation, preserving identity information is an important problem. However, extracting and retaining identity features during sketch generation remains difficult, especially when visual quality is also pursued. In criminal investigation applications, face identity information is indispensable, so it must be preserved throughout the photo-to-sketch generation process.
(2) The structure of each face part and the overall visual effect: problems that commonly occur in face sketch generation include deformation of the face structure and blurring of details, especially hair texture; a blurred or unrealistic overall drawing is also frequently encountered. In particular, when the goal is better recognition, the influence of the identity features on the visual effect is even more pronounced. For applications such as entertainment, improving the visual quality of the face sketch is also an important aspect.
Disclosure of Invention
The object of the invention is to provide a portrait generation method based on face structure information, addressing the deficiencies of the prior art.
The technical solution adopted by the invention to solve this technical problem comprises the following steps:
step (1), data preprocessing
The original face photo image has size 250 × 200 with 3 RGB channels; the corresponding original face sketch image also has size 250 × 200 and is a grayscale image with 1 channel. The face photo image and its sketch image Y are aligned uniformly: the aligned face photo image keeps the size and channel number of the original, with an inter-eye distance of 50 and a distance of 125 from the eyes to the top of the image. From the aligned face photo image, a face structure part probability map is obtained with an existing method, namely the semantic parsing network (P-Net) in Fig. 1; this map has size 250 × 200 and 11 channels, each channel giving the probability of one of 11 parts. The probability maps of the left and right eyebrows, the left and right eyes, and the upper and lower lips among the 11 parts are merged by pixel-wise addition, finally yielding probability maps for 8 parts in total. During training, the three kinds of images are zero-padded at the edges to 286 × 286, and images of size 256 × 256 at the same random position are taken from each for every training step.
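For illustration, the following is a minimal NumPy sketch of this padding and cropping step, applying one shared random 256 × 256 crop to the photo, the sketch, and the part probability map; the helper name and array shapes are assumptions, not part of the patent.

```python
import numpy as np

def pad_and_random_crop(photo, sketch, prob_map, pad_to=286, crop=256):
    """photo: (250, 200, 3); sketch: (250, 200, 1); prob_map: (250, 200, 8)."""
    def pad(img):
        h, w = img.shape[:2]
        top, left = (pad_to - h) // 2, (pad_to - w) // 2  # equal top/bottom, left/right
        return np.pad(img, ((top, pad_to - h - top),
                            (left, pad_to - w - left),
                            (0, 0)))
    photo, sketch, prob_map = pad(photo), pad(sketch), pad(prob_map)
    y = np.random.randint(0, pad_to - crop + 1)           # one offset shared by all three
    x = np.random.randint(0, pad_to - crop + 1)
    return (photo[y:y + crop, x:x + crop],
            sketch[y:y + crop, x:x + crop],
            prob_map[y:y + crop, x:x + crop])
```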
Step (2), feature extraction and fusion based on face structure information
Based on the existing original U-shaped network (U-Net), face structure information is added at the input to improve the U-shaped network; the model is implemented as a neural network. The structure is shown in Fig. 1: the face photo image X is input into the appearance encoder (Appearance Encoder), finally yielding a feature map of size 1 × 1, while the feature map after each convolution operation is retained; the face structure part probability map P is input into the structure encoder (Composition Encoder) and processed in the same way as the face photo. In the decoder (Decoder), each input is concatenated with the same-size feature maps retained in the encoders before the next operation. Finally, the required face sketch image is obtained at the end of the decoder.
The aligned face photo image X, the corresponding sketch image Y, and the face structure part probability map P form triples (X, Y, P), which serve as the training set.
Step (3), a combined loss function based on the face structure component:
By the method of step (2), a face sketch image of size 256 × 256 has been obtained. Based on the existing face structure part probability maps, each part is optimized separately: the generated image is multiplied pixel-wise by the probability map of each part, the same is done for the original face sketch image, and the Manhattan distance ($L_1$ distance) between the two results is used to optimize the network model.
Step (4) generating a countermeasure network
The network is divided into a generator and a discriminator; the generator makes the generated portrait approach the distribution of real portraits, and the discriminator calculates and optimizes a loss function by judging whether a portrait is a real original portrait or a generated one.
Step (5), model training
Using a training set consisting of the existing "photo-structure information-portrait" triples, portraits are generated with the model of step (2), the loss of the network is calculated with steps (3) and (4), and the neural network model parameters of steps (2) and (4) are trained with the back-propagation algorithm until the whole neural network model converges.
Preprocessing the data in the step (1):
First, face alignment is performed; the face photo image X obtained after alignment has the same size and channel number as the original image, with an inter-eye distance of 50 and a distance of 125 from the eyes to the top edge of the image.
Secondly, the face photo image X is decomposed into 8 parts of probability graph based on pixel points by a face semantic analysis method
Figure BDA0001595926880000041
Wherein
Figure BDA0001595926880000042
The probability output for each component is separately,
Figure BDA0001595926880000043
the probability that the pixel Xi, j belongs to the C-th component is shown, wherein C is 1, …, C, and C is 8. The 8 parts are respectively: eyes, eyebrows, nose, upper and lower lips, mouth, face, hair, background;
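As a hedged NumPy illustration of merging the 11 parsing channels into 8 by pixel-wise addition of the paired eyebrow, eye, and lip channels; the channel ordering used below is an assumption for the sketch, not taken from the patent.

```python
import numpy as np

# assumed channel layout of the 11-channel parsing output (illustrative only)
PAIRS = [(1, 2), (3, 4), (5, 6)]   # left/right eyebrows, left/right eyes, upper/lower lips
SINGLES = [0, 7, 8, 9, 10]         # e.g. face, nose, mouth, hair, background

def merge_parts(p11):
    """p11: (h, w, 11) probability map -> (h, w, 8) by adding paired channels."""
    merged = [p11[..., a] + p11[..., b] for a, b in PAIRS]
    singles = [p11[..., i] for i in SINGLES]
    return np.stack(merged + singles, axis=-1)
```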
The aligned face photo image is $X \in \mathbb{R}^{h \times w \times c}$ and the face sketch image is $Y \in \mathbb{R}^{h \times w \times 1}$, where h, w, and c denote the height, width, and channel number of the face photo image, respectively.
The feature extraction and fusion based on the face structure information in the step (2) are specifically as follows:
2-1. First, the original U-shaped network (U-Net) has the following specific structure:
The U-shaped network is divided into two parts: an encoder and a decoder.
The encoder (Encoder) consists of 8 modules (Blocks); modules 2-7 each consist of 3 operations, in order: a Leaky rectified linear unit (Leaky ReLU), a convolution (CNN), and batch normalization (BN); the first module contains only a convolution, and the last module contains a Leaky rectified linear unit and a convolution. Meanwhile, the output of each module is retained as a feature map for use in the decoder.
The decoder (Decoder) likewise consists of 8 modules; modules 1-7 each consist of 3 operations, in order: a rectified linear unit (ReLU), a convolution (CNN), and batch normalization (BN); the last module consists of a rectified linear unit, a convolution, and a Tanh.
In the decoder part, the last feature map (Feature map) of the encoder is used as the input of the first decoder module, and the input of each subsequent decoder module is the previous module's output concatenated (Concatenated) with the encoder feature map of the corresponding size. The required output image is obtained at the end of the decoder.
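As an illustration only, here is a minimal PyTorch sketch of the encoder and decoder modules just described, using the layer parameters given later in the detailed description (kernel size 4, stride 2, padding 1, Leaky ReLU negative slope 0.2); the helper names and the use of transposed convolutions for the decoder's upsampling are assumptions, not stated in the patent.

```python
import torch.nn as nn

def encoder_block(c_in, c_out, first=False, last=False):
    # modules 2-7: LeakyReLU -> Conv -> BatchNorm;
    # first module: Conv only; last module: LeakyReLU -> Conv
    layers = [] if first else [nn.LeakyReLU(0.2)]
    layers.append(nn.Conv2d(c_in, c_out, kernel_size=4, stride=2, padding=1))
    if not (first or last):
        layers.append(nn.BatchNorm2d(c_out))
    return nn.Sequential(*layers)

def decoder_block(c_in, c_out, last=False):
    # modules 1-7: ReLU -> transposed Conv -> BatchNorm;
    # last module: ReLU -> transposed Conv -> Tanh
    layers = [nn.ReLU(),
              nn.ConvTranspose2d(c_in, c_out, kernel_size=4, stride=2, padding=1)]
    layers.append(nn.Tanh() if last else nn.BatchNorm2d(c_out))
    return nn.Sequential(*layers)
```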
2-2. the U-shaped network added with the face structure information has the following specific structure:
the U-shaped network added with the face structure information comprises two encoders and a decoder.
The two encoders process the face photo image X and the face structure information P, respectively; that is, X and P serve as the inputs of the two encoders, and the specific network structure and the retained feature maps of each are the same as in the original U-shaped network.
The encoder for the face photo image X yields a feature-map set $F_X = \{F_X^1, \dots, F_X^S\}$, and the encoder for the face structure information P yields a feature-map set $F_P = \{F_P^1, \dots, F_P^S\}$, where S = 8.
In the decoder part, the operation of each module is the same as in the original U-shaped network. At the input of the first module, the last feature maps of the two encoders are concatenated to obtain the feature map $I_1 = [F_X^S; F_P^S]$, which serves as the input of the first module. Likewise, the output O of each module is concatenated with the feature maps of corresponding size from the two encoders; that is, for the output $O_1$ of the first decoder module, $F_X^{S-1}$, $F_P^{S-1}$, and $O_1$ are concatenated to form the input $I_2 = [O_1; F_X^{S-1}; F_P^{S-1}]$ of the next module, and the later outputs follow by analogy. Finally, the required 256 × 256 face sketch image $\hat{Y}$ is obtained.
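A hedged PyTorch sketch of this two-encoder U-shaped generator: both encoders retain their per-module feature maps, the deepest maps are concatenated as $I_1$, and each later decoder input is the previous output concatenated with the same-resolution maps from both encoders. The class name and the assumption that the modules are supplied as nn.ModuleList objects are illustrative.

```python
import torch
import torch.nn as nn

class TwoEncoderUNet(nn.Module):
    """Sketch of the two-encoder U-shaped generator (names are illustrative)."""

    def __init__(self, enc_x, enc_p, dec):
        super().__init__()
        # enc_x, enc_p, dec: nn.ModuleList objects with 8 modules each,
        # e.g. built from the encoder_block/decoder_block helpers above
        self.enc_x, self.enc_p, self.dec = enc_x, enc_p, dec

    def forward(self, x, p):
        feats_x, feats_p = [], []
        for ex, ep in zip(self.enc_x, self.enc_p):
            x, p = ex(x), ep(p)
            feats_x.append(x)
            feats_p.append(p)
        out = torch.cat([feats_x[-1], feats_p[-1]], dim=1)   # I_1
        for i, block in enumerate(self.dec):
            out = block(out)
            if i < len(self.dec) - 1:                        # skips for modules 1..7
                out = torch.cat([out, feats_x[-2 - i], feats_p[-2 - i]], dim=1)
        return out   # the generated 256 x 256 sketch
```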
The combined loss function based on the face structure part probability map P in step (3) is as follows:
The loss function comprises two parts: a global loss and a loss for the individual components, denoted $\mathcal{L}_{global}$ and $\mathcal{L}_{comp}$, respectively.
For $\mathcal{L}_{global}$, the specific formula of the loss function is:
$$\mathcal{L}_{global}(\hat{Y}, Y) = \frac{1}{mn}\sum_{i=1}^{m}\sum_{j=1}^{n}\left|\hat{Y}_{i,j} - Y_{i,j}\right|,$$
where $\hat{Y}$ denotes the generated face sketch image, Y denotes the target sketch image, and m and n denote the height and width of the sketch image, respectively; $\hat{Y}$ is computed as $\hat{Y} = G(X, P)$, where G denotes the generator.
For $\mathcal{L}_{comp}$: each part is optimized separately on the basis of the existing face structure part probability map P, as follows.
First, a weight factor is introduced to balance the losses of parts containing different numbers of pixels. For each part c, the specific formula is:
$$\mathcal{L}_{c}(\hat{Y}, Y) = \frac{1}{N_c}\left\|P^{c} \odot \hat{Y} - P^{c} \odot Y\right\|_{1},$$
where $N_c = \sum_{i,j} P^{c}_{i,j}$ denotes the sum of the probabilities of all pixels in the c-th component, and $\odot$ denotes multiplication of corresponding pixel points.
The total loss function of the 8 components is therefore:
$$\mathcal{L}_{comp}(\hat{Y}, Y) = \sum_{c=1}^{C}\mathcal{L}_{c}(\hat{Y}, Y), \qquad C = 8.$$
The total loss function of the resulting generator is:
$$\mathcal{L}_{comb} = \alpha\,\mathcal{L}_{global} + (1 - \alpha)\,\mathcal{L}_{comp},$$
where α preferably ranges from 0 to 1.
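A minimal sketch of this combined loss as reconstructed above: a global $L_1$ term plus per-part $L_1$ terms normalized by each part's probability mass $N_c$. The convex combination via α matches the stated range 0 to 1, but the exact weighting form is an assumption.

```python
import torch

def combined_loss(y_hat, y, prob_map, alpha=0.7, eps=1e-8):
    """y_hat, y: (B, 1, H, W) sketches; prob_map: (B, 8, H, W) part probabilities."""
    diff = (y_hat - y).abs()
    l_global = diff.mean()                    # global L1 term
    weighted = prob_map * diff                # broadcast over the 8 parts
    n_c = prob_map.sum(dim=(2, 3)) + eps      # probability mass N_c per part
    l_comp = (weighted.sum(dim=(2, 3)) / n_c).sum(dim=1).mean()
    return alpha * l_global + (1 - alpha) * l_comp
```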
The generative adversarial network of step (4) is specifically as follows:
The generative adversarial network is divided into two parts as a whole: a generator (Generator) and a discriminator (Discriminator); the two encoders and the decoder of step (2) together form the generator of the generative adversarial network.
The input of the discriminator is the pair $(X, Y)$ or $(X, \hat{Y})$, where $\hat{Y}$ denotes the generated sketch image; the discriminator judges whether the sketch is real or generated, and the discrimination loss function formula is:
$$\mathcal{L}_{adv}(G, D) = \mathbb{E}_{X,Y}\left[\log D(X, Y)\right] + \mathbb{E}_{X}\left[\log\left(1 - D\left(X, G(X, P)\right)\right)\right].$$
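A hedged sketch of this discrimination loss in the usual conditional-GAN form; conditioning the discriminator on the photo X and the use of logits with binary cross-entropy are assumptions consistent with the cGAN framing in the background, not details given by the patent.

```python
import torch
import torch.nn.functional as F

def discriminator_loss(d, x, y_real, y_fake):
    """d(x, y) is assumed to return raw logits for a (photo, sketch) pair."""
    real = d(x, y_real)
    fake = d(x, y_fake.detach())   # detach: no gradient into the generator here
    return (F.binary_cross_entropy_with_logits(real, torch.ones_like(real)) +
            F.binary_cross_entropy_with_logits(fake, torch.zeros_like(fake)))
```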
The training model in step (5) is as follows:
The loss value $\mathcal{L}_G$ of the generator combines the combined loss $\mathcal{L}_{comb}$ of step (3) with the adversarial loss $\mathcal{L}_{adv}$ of step (4); the loss value $\mathcal{L}_D$ of the discriminator is the discrimination loss above.
According to the calculated loss values $\mathcal{L}_G$ and $\mathcal{L}_D$, the parameters in the network are adjusted using the back-propagation (BP) algorithm.
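For illustration, a minimal alternating training step under the assumptions above, reusing the combined_loss sketch from step (3); the relative weight lambda_adv between the combined loss and the adversarial term is not specified in the text, so it is an illustrative knob.

```python
import torch
import torch.nn.functional as F

def adv_loss(logits, real):
    # standard GAN binary cross-entropy on discriminator logits
    target = torch.ones_like(logits) if real else torch.zeros_like(logits)
    return F.binary_cross_entropy_with_logits(logits, target)

def train_step(g, d, opt_g, opt_d, x, p, y, combined_loss, alpha=0.7, lambda_adv=1.0):
    y_hat = g(x, p)

    # discriminator update: real pair vs. detached generated pair
    opt_d.zero_grad()
    loss_d = adv_loss(d(x, y), True) + adv_loss(d(x, y_hat.detach()), False)
    loss_d.backward()
    opt_d.step()

    # generator update: combined L1 loss plus the adversarial term
    opt_g.zero_grad()
    loss_g = combined_loss(y_hat, y, p, alpha) + lambda_adv * adv_loss(d(x, y_hat), True)
    loss_g.backward()
    opt_g.step()
    return loss_g.item(), loss_d.item()
```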
The invention has the following beneficial effects:
because the human face has strong geometric constraint and very complex structural details, the introduction of the human face structural information to assist the generation of the human face sketch is very promising. Recently, the face part marking technology based on face pixel points is rapidly developed, and with the inspiration, face structure information is introduced to generate faces. In addition, we add structure information not only at the input, but also at the Loss function part at the output, and use the upgraded version of the Loss function, which we call composite Loss.
The invention provides a deep neural network architecture for generating sketches from face photo images based on face structure information, aiming to solve the two difficult problems above: 1. generating a face sketch image with good visual quality, avoiding unreasonable structure while preserving details, so that it looks more like a hand-drawn picture; 2. preserving face identity information, i.e. achieving very high accuracy on the face recognition problem.
Drawings
FIG. 1 is a schematic view of the present invention.
Detailed Description
The detailed parameters of the invention are described in more detail below.
As shown in fig. 1, a portrait generation method based on face structure information includes the following steps:
step (1), data preprocessing:
Face alignment is applied uniformly to the face photo of original size 250 × 200 and the portrait; the aligned face photo has size 250 × 200, an inter-eye distance of 50, and a distance of 125 from the eyes to the top of the image; the images are zero-padded at the edges to obtain 286 × 286 images, and a face photo image X of size 256 × 256 is randomly taken for each training step; features with spatial and texture information are extracted using a U-Net network;
the aligned face photo image X, the corresponding sketch image Y, and the face structure part probability map P form triples (X, Y, P), which serve as the training set;
step (2) based on the probability graph of the face structure part
P, the feature extraction and fusion:
the face photo image X and the face structure part probability map P are encoded with two different encoders, the features extracted by the two encoders are concatenated with the features of the decoder, and the finally generated sketch image Y is output;
step (3) probability graph based on face structure part
P, the combined loss function:
for the sketch image Y generated in step (2), loss functions between each part and the target image are calculated according to the existing face structure part probability map P, and the network parameters are optimized by adding the loss function between the whole sketch image Y and the target image;
step (4), generating a countermeasure network:
the network is divided into a generator and a discriminator; the generator makes the generated sketch image Y approach the distribution of real images, and the discriminator calculates and optimizes a loss function by judging whether an image is a real original sketch or the generated sketch image Y;
step (5), model training:
and (3) generating a sketch image according to the model in the step (2) by using a training set consisting of the existing 'photo-structure information-sketch image' triple, calculating the loss of the network by using the steps (3) and (4), and training the model parameters of the neural network in the steps (2) and (4) by using a back propagation algorithm until the whole neural network model converges.
The data preprocessing in the step (1) comprises the following specific steps:
the CUFS data set is used here as training and testing data.
1-1. The face photo image is $X \in \mathbb{R}^{h \times w \times c}$, where c is the number of channels of the image and h = 250 and w = 200 are the height and width of the face photo image, respectively; the face sketch image is $Y \in \mathbb{R}^{h \times w \times 1}$, with the same height and width. First, face alignment is applied to X and Y; the images obtained after alignment have the same size and channel number as the originals, with an inter-eye distance of 50 and a distance of 125 from the eyes to the top edge of the image.
1-2. From the face photo X obtained after alignment in 1-1, the structural components are predicted to obtain the structure probability map of the face photo, $P \in \mathbb{R}^{h \times w \times C}$, where C is the number of channels and h = 250 and w = 200 are the height and width of the face photo image, respectively; initially C = 11, each channel giving the probability of one of 11 parts. The probability maps of the left and right eyebrows, the left and right eyes, and the upper and lower lips among the 11 parts are merged by pixel-wise addition, finally giving probability maps for 8 parts in total, i.e. $P \in \mathbb{R}^{h \times w \times C}$ with C = 8.
1-3. After obtaining X, Y, and P, the sizes of the training samples are processed uniformly: X, Y, and P are each zero-padded at the edges to obtain 286 × 286 images, and images of size 256 × 256 at the same random position are taken for each training step. When padding with 0, the numbers of zeros at the top and bottom of the image are equal, as are the numbers on the left and right.
The feature extraction and fusion based on the face structure information in the step (2) are specifically as follows:
2-1. In the encoder part, the negative slope (Negative Slope) parameter of the Leaky rectified linear units is 0.2; the convolution operations all have kernel size (Kernel Size) 4, stride (Stride) 2, and zero padding (Zero Padding) 1, and the number of feature-map channels increases in powers of 2 from 64 up to a maximum of 512.
The combined loss function based on the face structure component in the step (3) is specifically as follows:
In the loss function described in step (3), α preferably ranges from 0 to 1; here α = 0.7.

Claims (3)

1. A portrait generation method based on face structure information is characterized by comprising the following steps:
step (1), data preprocessing:
carrying out face alignment uniformly on a face photo of original size 250 × 200 and the portrait, the aligned face photo having size 250 × 200, an inter-eye distance of 50, and a distance of 125 from the eyes to the top of the image; zero-padding the images at the edges to obtain 286 × 286 images and randomly taking a face photo image X of size 256 × 256 for each training step; extracting features with spatial and texture information using a U-Net network;
forming triples (X, Y, P) from the aligned face photo image X, the corresponding sketch image Y, and the face structure part probability map P, and using them as the training set;
step (2) based on the probability graph of the face structure part
P, the feature extraction and fusion:
encoding the face photo image X and the face structure part probability map P with two different encoders, concatenating the features extracted by the two encoders with the features of the decoder, and outputting the finally generated sketch image Y;
step (3) probability graph based on face structure part
P, the combined loss function:
for the sketch image Y generated in step (2), calculating loss functions between each part and the target image according to the existing face structure part probability map P, and optimizing the network parameters by adding the loss function between the whole sketch image Y and the target image;
step (4), generating a countermeasure network:
the network is divided into a generator and a discriminator; the generator makes the generated sketch image Y approach the distribution of real images, and the discriminator calculates and optimizes a loss function by judging whether an image is a real original sketch or the generated sketch image Y;
step (5), model training:
using a training set consisting of the existing "photo-structure information-sketch image" triples, generating sketch images with the model of step (2), calculating the loss of the network with steps (3) and (4), and training the neural network model parameters of steps (2) and (4) with the back-propagation algorithm until the whole neural network model converges;
the feature extraction and fusion based on the face structure information in the step (2) are specifically as follows:
2-1. First, the original U-shaped network (U-Net) has the following specific structure:
the U-shaped network is divided into two parts: an encoder and a decoder;
in the encoder part, there are 8 modules; modules 2 to 7 each consist of 3 operations, in order: a Leaky rectified linear unit, a convolution, and batch normalization; the first module contains only a convolution, and the last module contains a Leaky rectified linear unit and a convolution; meanwhile, the output result of each module is retained as a feature map and used in the decoder;
in the decoder part, there are 8 modules; modules 1 to 7 each consist of 3 operations, in order: a rectified linear unit, a convolution, and batch normalization; the last module consists of a rectified linear unit, a convolution, and a hyperbolic tangent;
in the decoder part, the last feature map of the encoder is used as the input of the first decoder module, and the input of each decoder module is the previous output concatenated with the encoder feature map of corresponding size; the required output image is obtained at the end of the decoder;
2-2. the U-shaped network added with the face structure information has the following specific structure:
the U-shaped network added with the face structure information comprises two encoders and a decoder;
the two encoders process the face photo image X and the face structure information P, respectively, i.e. X and P serve as the inputs of the two encoders; the specific network structure and the retained feature maps are the same as in the original U-shaped network;
the encoder for the face photo image X yields a feature-map set $F_X = \{F_X^1, \dots, F_X^S\}$, and the encoder for the face structure information P yields a feature-map set $F_P = \{F_P^1, \dots, F_P^S\}$, where S = 8;
in the decoder part, the operation of each module is the same as in the original U-shaped network; at the input of the first module, the last feature maps of the two encoders are concatenated to obtain the feature map $I_1 = [F_X^S; F_P^S]$, which serves as the input of the first module; likewise, the output O of each module is concatenated with the feature maps of corresponding size from the two encoders, i.e. for the output $O_1$ of the first decoder module, $F_X^{S-1}$, $F_P^{S-1}$, and $O_1$ are concatenated to form the input $I_2 = [O_1; F_X^{S-1}; F_P^{S-1}]$ of the next module, and the subsequent outputs follow by analogy; finally, the required 256 × 256 face sketch image $\hat{Y}$ is obtained;
the combined loss function based on the face structure part probability map P in step (3) is as follows:
the loss function comprises two parts: a global loss and a loss for the individual components, denoted $\mathcal{L}_{global}$ and $\mathcal{L}_{comp}$, respectively;
for $\mathcal{L}_{global}$, the specific formula of the loss function is:
$$\mathcal{L}_{global}(\hat{Y}, Y) = \frac{1}{mn}\sum_{i=1}^{m}\sum_{j=1}^{n}\left|\hat{Y}_{i,j} - Y_{i,j}\right|,$$
where $\hat{Y}$ denotes the generated face sketch image, Y denotes the target sketch image, m and n denote the height and width of the sketch image, respectively, and $\hat{Y} = G(X, P)$ with G the generator;
for $\mathcal{L}_{comp}$: each part is optimized separately on the basis of the existing face structure part probability map P, as follows:
first, a weight factor is introduced to balance the losses of parts containing different numbers of pixels; for each part c, the specific formula is:
$$\mathcal{L}_{c}(\hat{Y}, Y) = \frac{1}{N_c}\left\|P^{c} \odot \hat{Y} - P^{c} \odot Y\right\|_{1},$$
where $N_c = \sum_{i,j} P^{c}_{i,j}$ denotes the sum of the probabilities of all pixels in the c-th component, and $\odot$ denotes multiplication of corresponding pixels;
the total loss function of the 8 components is therefore:
$$\mathcal{L}_{comp}(\hat{Y}, Y) = \sum_{c=1}^{C}\mathcal{L}_{c}(\hat{Y}, Y), \qquad C = 8;$$
the total loss function of the resulting generator is:
$$\mathcal{L}_{comb} = \alpha\,\mathcal{L}_{global} + (1 - \alpha)\,\mathcal{L}_{comp},$$
where α preferably ranges from 0 to 1.
2. The portrait generation method based on face structure information as claimed in claim 1, wherein the data preprocessing of step (1):
first, face alignment is performed; the face photo image X obtained after alignment has the same size and channel number as the original image, with an inter-eye distance of 50 and a distance of 125 from the eyes to the top edge of the image;
second, using a face semantic parsing method, the face photo image X is decomposed pixel-wise into a probability map over 8 parts, $P = \{P^1, \dots, P^C\}$, where $P^c$ is the probability output for the c-th component and $P^c_{i,j}$ denotes the probability that pixel $X_{i,j}$ belongs to the c-th component, with $c = 1, \dots, C$ and $C = 8$; the 8 parts are: eyes, eyebrows, nose, upper and lower lips, mouth, face, hair, background;
the aligned face photo image is $X \in \mathbb{R}^{h \times w \times c}$ and the face sketch image is $Y \in \mathbb{R}^{h \times w \times 1}$, where h, w, and c denote the height, width, and channel number of the face photo image, respectively.
3. The portrait generation method based on face structure information as claimed in claim 2, wherein the generative adversarial network of step (4) is as follows:
the generative adversarial network is divided into two parts as a whole: a generator and a discriminator;
the two encoders and one decoder of step (2) together form the generator of the generative adversarial network;
the input of the discriminator is the pair $(X, Y)$ or $(X, \hat{Y})$, where $\hat{Y}$ denotes the generated sketch image; the discriminator judges whether the sketch is real or generated, and the discrimination loss function formula is:
$$\mathcal{L}_{adv}(G, D) = \mathbb{E}_{X,Y}\left[\log D(X, Y)\right] + \mathbb{E}_{X}\left[\log\left(1 - D\left(X, G(X, P)\right)\right)\right];$$
the training model of step (5) is as follows:
the loss value $\mathcal{L}_G$ of the generator combines the combined loss $\mathcal{L}_{comb}$ of step (3) with the adversarial loss $\mathcal{L}_{adv}$ of step (4); the loss value $\mathcal{L}_D$ of the discriminator is the discrimination loss above;
according to the calculated loss values $\mathcal{L}_G$ and $\mathcal{L}_D$, the parameters in the network are adjusted using the back propagation algorithm.
CN201810206139.XA 2018-03-13 2018-03-13 Portrait generation method based on face structure information Active CN108596024B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810206139.XA CN108596024B (en) 2018-03-13 2018-03-13 Portrait generation method based on face structure information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810206139.XA CN108596024B (en) 2018-03-13 2018-03-13 Portrait generation method based on face structure information

Publications (2)

Publication Number Publication Date
CN108596024A CN108596024A (en) 2018-09-28
CN108596024B true CN108596024B (en) 2021-05-04

Family

ID=63626274

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810206139.XA Active CN108596024B (en) 2018-03-13 2018-03-13 Portrait generation method based on face structure information

Country Status (1)

Country Link
CN (1) CN108596024B (en)

Families Citing this family (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109448083B (en) * 2018-09-29 2019-09-13 浙江大学 A method of human face animation is generated from single image
WO2020062120A1 (en) 2018-09-29 2020-04-02 浙江大学 Method for generating facial animation from single image
CN109448079A (en) * 2018-10-25 2019-03-08 广东智媒云图科技股份有限公司 A kind of drawing bootstrap technique and equipment
CN109360231B (en) * 2018-10-25 2022-01-07 哈尔滨工程大学 Sea ice remote sensing image simulation method for generating confrontation network based on fractal depth convolution
CN109472838A (en) * 2018-10-25 2019-03-15 广东智媒云图科技股份有限公司 A kind of sketch generation method and device
CN109640068A (en) * 2018-10-31 2019-04-16 百度在线网络技术(北京)有限公司 Information forecasting method, device, equipment and the storage medium of video frame
CN111127304B (en) * 2018-10-31 2024-02-20 微软技术许可有限责任公司 Cross-domain image conversion
CN109741247B (en) * 2018-12-29 2020-04-21 四川大学 Portrait cartoon generating method based on neural network
CN109920021B (en) * 2019-03-07 2023-05-23 华东理工大学 Face sketch synthesis method based on regularized width learning network
CN110069992B (en) * 2019-03-18 2021-02-09 西安电子科技大学 Face image synthesis method and device, electronic equipment and storage medium
CN111860041A (en) * 2019-04-26 2020-10-30 北京陌陌信息技术有限公司 Face conversion model training method, device, equipment and medium
CN110619315B (en) * 2019-09-24 2020-10-30 重庆紫光华山智安科技有限公司 Training method and device of face recognition model and electronic equipment
CN112861579B (en) * 2019-11-27 2022-10-18 四川大学 Automatic detection method for three-dimensional facial markers
CN111127309B (en) * 2019-12-12 2023-08-11 杭州格像科技有限公司 Portrait style migration model training method, portrait style migration method and device
CN111223057B (en) * 2019-12-16 2023-09-22 杭州电子科技大学 Incremental focused image-to-image conversion method based on generation of countermeasure network
CN111242837B (en) * 2020-01-03 2023-05-12 杭州电子科技大学 Face anonymity privacy protection method based on generation countermeasure network
CN111275778B (en) * 2020-01-08 2023-11-21 杭州未名信科科技有限公司 Face simple drawing generation method and device
CN111243051B (en) * 2020-01-08 2023-08-18 杭州未名信科科技有限公司 Portrait photo-based simple drawing generation method, system and storage medium
CN111243050B (en) * 2020-01-08 2024-02-27 杭州未名信科科技有限公司 Portrait simple drawing figure generation method and system and painting robot
CN111353546B (en) * 2020-03-09 2022-12-23 腾讯科技(深圳)有限公司 Training method and device of image processing model, computer equipment and storage medium
CN111402407B (en) * 2020-03-23 2023-05-02 杭州相芯科技有限公司 High-precision portrait model rapid generation method based on single RGBD image
CN111523413B (en) * 2020-04-10 2023-06-23 北京百度网讯科技有限公司 Method and device for generating face image
CN111667007A (en) * 2020-06-08 2020-09-15 大连民族大学 Face pencil drawing image generation method based on confrontation generation network
CN111783647B (en) * 2020-06-30 2023-11-03 北京百度网讯科技有限公司 Training method of face fusion model, face fusion method, device and equipment
US20220191027A1 (en) * 2020-12-16 2022-06-16 Kyndryl, Inc. Mutual multi-factor authentication technology
CN112633288B (en) * 2020-12-29 2024-02-13 杭州电子科技大学 Face sketch generation method based on painting brush touch guidance
CN112800898A (en) * 2021-01-18 2021-05-14 深圳市网联安瑞网络科技有限公司 Pedestrian re-identification data set enhancement method, system, terminal, camera and medium
CN112907692B (en) * 2021-04-09 2023-04-14 吉林大学 SFRC-GAN-based sketch-to-face reconstruction method

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106709875A (en) * 2016-12-30 2017-05-24 北京工业大学 Compressed low-resolution image restoration method based on combined deep network
CN107066969A (en) * 2017-04-12 2017-08-18 南京维睛视空信息科技有限公司 A kind of face identification method
CN107038429A (en) * 2017-05-03 2017-08-11 四川云图睿视科技有限公司 A kind of multitask cascade face alignment method based on deep learning
CN107358626A (en) * 2017-07-17 2017-11-17 清华大学深圳研究生院 A kind of method that confrontation network calculations parallax is generated using condition
CN107527318A (en) * 2017-07-17 2017-12-29 复旦大学 A kind of hair style replacing options based on generation confrontation type network model
CN107577985A (en) * 2017-07-18 2018-01-12 南京邮电大学 The implementation method of the face head portrait cartooning of confrontation network is generated based on circulation
CN107633218A (en) * 2017-09-08 2018-01-26 百度在线网络技术(北京)有限公司 Method and apparatus for generating image
CN107665339A (en) * 2017-09-22 2018-02-06 中山大学 A kind of method changed by neural fusion face character
CN107633232A (en) * 2017-09-26 2018-01-26 四川长虹电器股份有限公司 A kind of low-dimensional faceform's training method based on deep learning

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Portrait sketch generation based on facial features and line integral convolution; Zhao Yandan et al.; Journal of Computer-Aided Design & Computer Graphics; 2014-10-30; Vol. 26, No. 10; pp. 160-168 *
Automatic portrait generation algorithm based on example learning; Chen Hong et al.; Chinese Journal of Computers; 2003-02-28; Vol. 26, No. 2; pp. 148-156 *
Cartoon face portrait generation based on feature discovery; Zhou Renqin et al.; Journal of Computer-Aided Design & Computer Graphics; 2006-09-30; Vol. 18, No. 9; pp. 1362-1369 *
Portrait sketch caricature generation system based on correlation analysis; Hua Bo et al.; Computer Applications and Software; 2015-07-30; Vol. 32, No. 7; pp. 1712-1716 *

Also Published As

Publication number Publication date
CN108596024A (en) 2018-09-28

Similar Documents

Publication Publication Date Title
CN108596024B (en) Portrait generation method based on face structure information
CN109815826B (en) Method and device for generating face attribute model
US20200401842A1 (en) Human Hairstyle Generation Method Based on Multi-Feature Retrieval and Deformation
CN112766160A (en) Face replacement method based on multi-stage attribute encoder and attention mechanism
CN107680158A (en) A kind of three-dimensional facial reconstruction method based on convolutional neural networks model
CN110827312B (en) Learning method based on cooperative visual attention neural network
CN113240691B (en) Medical image segmentation method based on U-shaped network
CN107067429A (en) Video editing system and method that face three-dimensional reconstruction and face based on deep learning are replaced
CN109816011A (en) Generate the method and video key frame extracting method of portrait parted pattern
CN112580515B (en) Lightweight face key point detection method based on Gaussian heat map regression
CN111223057B (en) Incremental focused image-to-image conversion method based on generation of countermeasure network
Liu et al. Image decolorization combining local features and exposure features
CN113034355B (en) Portrait image double-chin removing method based on deep learning
CN109753996A (en) Hyperspectral image classification method based on D light quantisation depth network
CN113378812A (en) Digital dial plate identification method based on Mask R-CNN and CRNN
CN115731597A (en) Automatic segmentation and restoration management platform and method for mask image of face mask
CN109712095B (en) Face beautifying method with rapid edge preservation
CN114066871A (en) Method for training new coronary pneumonia focus region segmentation model
CN110555379B (en) Human face pleasure degree estimation method capable of dynamically adjusting features according to gender
CN105069767A (en) Image super-resolution reconstruction method based on representational learning and neighbor constraint embedding
CN114783039B (en) Motion migration method driven by 3D human body model
CN116311472A (en) Micro-expression recognition method and device based on multi-level graph convolution network
CN113688698B (en) Face correction recognition method and system based on artificial intelligence
CN112784800B (en) Face key point detection method based on neural network and shape constraint
CN111064905A (en) Video scene conversion method for automatic driving

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant