CN114863527B - Makeup style migration method based on FP-SCGAN model - Google Patents

Makeup style migration method based on FP-SCGAN model

Info

Publication number
CN114863527B
Authority
CN
China
Prior art keywords: makeup, image, migration, loss, discriminator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210488449.1A
Other languages
Chinese (zh)
Other versions
CN114863527A (en)
Inventor
李妹纳
杭丽君
熊攀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN202210488449.1A priority Critical patent/CN114863527B/en
Publication of CN114863527A publication Critical patent/CN114863527A/en
Application granted Critical
Publication of CN114863527B publication Critical patent/CN114863527B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 - Feature extraction; Face representation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G06N 3/084 - Backpropagation, e.g. using gradient descent
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/806 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Biophysics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a makeup style migration method based on an FP-SCGAN model, which combines a feature pyramid with the SCGAN algorithm. The FP-SCGAN network includes four parts: PSEnc, FIEnc, MFDec and a Markov discriminator. PSEnc extracts the reference makeup features, FIEnc extracts the facial features of the picture to be migrated, MFDec fuses the facial features of the original picture with the makeup features of the reference picture, and the Markov discriminator measures the distance between the generated distribution and the real distribution. The improved algorithm solves problems encountered during makeup migration, such as unnatural edges around the eye sockets and the failure to migrate lighter eye makeup, and improves the migration effect compared with the current mainstream SCGAN makeup migration algorithm.

Description

Makeup style migration method based on FP-SCGAN model
Technical Field
The invention belongs to the technical field of makeup migration methods, and relates to a makeup style migration method based on an FP-SCGAN model.
Background
Computer vision is one of the most popular research directions in deep learning and is now widely applied in many fields. Along with the development and application of image processing algorithms, the short video industry has grown rapidly, and more and more functions such as camera filters, beautification and special effects have been implemented, attracting a large number of users. These functions are inseparable from the style migration algorithms in image processing.
The goal of image style migration is to migrate the style of a reference picture into another picture or pictures. Before neural networks, image style migration followed a common idea: analyze an image of a certain style, build a mathematical or statistical model, and then change the image to be migrated so that it better conforms to the established model. This approach has a major drawback: a program can basically handle only one style or one scene, so practical applications based on traditional style migration research are very limited. At present, style migration algorithms are mainly based on deep learning: a neural network extracts features from the style image and the image to be migrated, the features are fused, and the image is then restored by upsampling, thereby realizing style migration.
Many current style migration algorithms focus on the migration of face attributes, and makeup migration is a typical face attribute migration. GAN-based makeup migration algorithms perform very well among the many approaches. SCGAN can migrate the reference makeup to the target image well, and even for targets whose makeup positions differ greatly it can still produce a good migration effect. However, during migration, problems such as unnatural edges around the eye sockets and the failure to migrate lighter eye makeup easily occur.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a makeup style migration method based on an FP-SCGAN model, which combines a feature pyramid with SCGAN to obtain the makeup migration network FP-SCGAN, so that the above problems can be effectively solved and the migration effect improved. The method comprises the following steps:
The FP-SCGAN network comprises PSEnc, FIEnc, MFDec and a Markov discriminator. Training of the network is carried out in a mutual game between the generator G and the discriminator D, and the network converges when dynamic balance is finally reached. The training specifically comprises the following steps:
S10, obtaining style features: the non-makeup image x and the makeup image y are sent into FIEnc, and the facial features c_x, c_y of the pictures to be migrated are obtained through the feature extraction, downsampling and residual modules; the key regions of the makeup reference images are sent into PSEnc, features are extracted through a pretrained VGG19 network and fused by a feature pyramid, and the style features s_x, s_y are obtained;
S20, obtaining the fused features of the reference makeup image and the image to be migrated: the obtained style features are sent into a multi-layer perceptron, which maps them into a feature space to obtain the style feature codes code_x, code_y; the facial features of the pictures to be migrated and the style feature codes are sent into MFDec, and feature fusion is carried out through AdaIN in the decoder; meanwhile, AdaIN is also used to introduce features into the shallow layers of MFDec, and the fused features x_y, y_x, x_x, y_y of the reference makeup images and the images to be migrated are obtained through the MFDec network;
S30, optimizing the discriminator and the generator: the parameters of the generator G are fixed, the generator loss is calculated, and the discriminator D is optimized so that its discrimination capability is enhanced; back propagation is then carried out and the discriminator parameters are updated, the two discriminators being used to discriminate the generated makeup image and the makeup-removal image respectively and having identical structures; then the parameters of the discriminator D are fixed, the discriminator loss is calculated, and the generator G is optimized so that its ability to deceive the discriminator D is enhanced;
S40, calculating the various losses: these comprise the identity loss, in which the generator reconstructs the image to be migrated; the makeup loss, which guides the migration of the makeup in the key regions; the local vgg loss, which enhances the retention of semantic information in the key regions; and the global vgg loss, which ensures that the generated image is similar to the original image in semantic information;
S50, updating the generator parameters: x_y, y_x are sent into FIEnc to extract the content features c_{x,fake}, c_{y,fake}; then c_{x,fake} with code_x and c_{y,fake} with code_y are respectively sent into MFDec to obtain x_rec and y_rec; the reconstruction loss is further calculated, which guides the network to carry out the overall style migration while preserving the basic features of the original image; finally, back propagation is carried out and the generator parameters are updated.
Preferably, the formula for calculating the generator loss is:

L_adv^{D_X} = E_{x~X,y~Y}[D_x(G(y,x))] - E_{x~X}[D_x(x)],  L_adv^{D_Y} = E_{x~X,y~Y}[D_y(G(x,y))] - E_{y~Y}[D_y(y)]

wherein E_{x~X} denotes the expectation over the true distribution of non-makeup images; E_{y~Y} denotes the expectation over the true distribution of makeup images; E_{x~X,y~Y} denotes the expectation over the generated images; D_x(·), D_y(·) represent the discriminator outputs on samples from the generated data; D_x, D_y represent the discriminator outputs on samples from the real data; G(x,y) is the migration of x with reference to the makeup of y; G(y,x) is the migration of y with reference to the makeup of x.
Preferably, the formula for calculating the discriminator loss is:

L_adv^{G_X} = -E_{x~X,y~Y}[D_x(G(y,x))],  L_adv^{G_Y} = -E_{x~X,y~Y}[D_y(G(x,y))]

wherein D_x(·), D_y(·) are the discriminator output functions, E_{x~X,y~Y} denotes the expectation over the generated images, G(x,y) is the migration of x with reference to the makeup of y, and G(y,x) is the migration of y with reference to the makeup of x.
Preferably, the calculation formula of the identity loss is:
L_idt = ||G(x,x) - x||_1 + ||G(y,y) - y||_1

wherein G(x,x) is the migration of x with reference to the makeup of x; G(y,y) is the migration of y with reference to the makeup of y; ||·||_1 is the L1 loss, which calculates the absolute error between the real data and the generated data.
Preferably, the formula for calculating the makeup loss is:

L_makeup = Σ_i ||(G(x,y) - x̃) ⊙ M_{x,i}||_1 + Σ_i ||(G(y,x) - ỹ) ⊙ M_{y,i}||_1

wherein x̃ represents the generated pairing data of x, ỹ represents the generated pairing data of y, x represents an image without makeup, y represents an image with makeup, M_{x,i} is the face mask of the pre-makeup image and M_{y,i} is the face mask of the post-makeup image, i being the sequence number of the key regions comprising the three parts of the eye sockets, the face and the lips; G(x,y) represents the migration of x with reference to the makeup of y; G(y,x) represents the migration of y with reference to the makeup of x; ||·||_1 is the L1 loss, which calculates the absolute error between the real data and the generated data.
Preferably, the calculation formula of the local vgg loss is:

L_vgg^l = Σ_i ( ||F_l(G(x,y) ⊙ M_{x,i}) - F_l(x ⊙ M_{x,i})||_2 + ||F_l(G(y,x) ⊙ M_{y,i}) - F_l(y ⊙ M_{y,i})||_2 )

wherein M_{x,i} is the face mask of the pre-makeup image and M_{y,i} is the face mask of the post-makeup image, i being the sequence number of the key regions comprising the three parts of the eye sockets, the face and the lips; G(x,y) represents the migration of x with reference to the makeup of y; G(y,x) represents the migration of y with reference to the makeup of x; F_l(·) represents the layer-l feature in the vgg network; ||·||_2 represents the L2 loss, i.e. the squared error between the real data and the generated data.
Preferably, the calculation formula of the global vgg loss is:

L_vgg^g = ||F_l(G(x,y)) - F_l(x)||_2 + ||F_l(G(y,x)) - F_l(y)||_2

wherein G(x,y) represents the migration of x with reference to the makeup of y; G(y,x) represents the migration of y with reference to the makeup of x; F_l(·) represents the layer-l feature in the vgg network; ||·||_2 is the L2 loss.
Preferably, the calculation formula of the reconstruction loss is:
L_cyc = ||G(G(y,x), y) - y||_1 + ||G(G(x,y), x) - x||_1

wherein G(G(y,x), y) represents that y is first migrated with reference to the makeup of x and the result is then migrated back with reference to the makeup of y; G(G(x,y), x) represents that x is first migrated with reference to the makeup of y and the result is then migrated back with reference to the makeup of x; ||·||_1 is the L1 loss, which calculates the absolute error between the real data and the generated data.
The beneficial effects of the invention are as follows:
compared with the prior art, the invention provides a makeup style migration method based on an FP-SCGAN model, PSEnc is used for extracting reference makeup features, FIEnc is used for extracting facial features of pictures to be migrated, MFDec is used for fusing the facial features of original pictures and the makeup features of reference pictures, and Markov discriminators are used for measuring the distance between generated distribution and actual distribution. The improved algorithm can solve the problems that the eyesockets have unnatural edges, lighter eyepieces cannot be migrated and the like during makeup migration, and compared with the current mainstream SCGAN makeup migration algorithm, the migration effect is improved.
Drawings
FIG. 1 is a diagram of the overall structure of an FP-SCGAN network in a makeup style migration method based on an FP-SCGAN model according to an embodiment of the invention;
FIG. 2 is a FIEnc structure diagram in a makeup style migration method based on an FP-SCGAN model according to an embodiment of the invention;
FIG. 3 is a PSEnc structure diagram in a makeup style migration method based on an FP-SCGAN model according to an embodiment of the present invention;
FIG. 4 is a diagram of the MFDec structure in the makeup style migration method based on the FP-SCGAN model according to the embodiment of the invention;
FIG. 5 is a schematic diagram of a Markov discriminator in a makeup style migration method based on an FP-SCGAN model according to an embodiment of the invention;
fig. 6 is a flow chart of steps of a makeup style migration method based on an FP-SCGAN model according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
On the contrary, the invention is intended to cover any alternatives, modifications, equivalents and variations that may be included within the spirit and scope of the invention as defined by the appended claims. Further, in the following detailed description of the present invention, certain specific details are set forth in order to provide a better understanding of the present invention; however, the invention can be fully understood by those skilled in the art without some of the details described herein.
Referring to fig. 1, which shows the overall network structure of the present invention, the FP-SCGAN network is composed of four parts: PSEnc, FIEnc, MFDec and a discriminator.
PSEnc is used to extract the reference makeup features, including various types of information such as color, texture and edges. The network comprises feature extraction and feature fusion using a feature pyramid.
The PSEnc network structure is shown in fig. 3. Because the input makeup reference image contains a great deal of information irrelevant to the makeup, only the images of the three parts extracted from the reference image, namely the eye sockets, the face and the lips, are used as input. First, feature extraction is carried out on the reference makeup image through a pretrained VGG19 network, and the conv1_1, conv2_1, conv3_1 and conv4_1 feature maps output by the VGG19 network are fused using a feature pyramid structure. In order to enhance the feature extraction capability of the network, the four feature maps extracted by VGG19 are first convolved and then fused. Four levels of features are output after feature pyramid fusion; the extracted features are then sent into a fully connected layer and mapped to a suitable scale for the subsequent AdaIN.
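For illustration only, such a pyramid-style encoder might be sketched in PyTorch as follows; the channel widths, the 1x1 and 3x3 fusion convolutions and the size of the final embedding are assumptions made for the sketch and are not taken from the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

class PSEncSketch(nn.Module):
    """Illustrative PSEnc: VGG19 features fused by a feature pyramid.
    Channel widths and the embedding size are assumptions, not the patent's values."""
    def __init__(self, style_dim=192):
        super().__init__()
        vgg = torchvision.models.vgg19(weights="IMAGENET1K_V1").features.eval()
        for p in vgg.parameters():
            p.requires_grad_(False)
        # conv1_1, conv2_1, conv3_1 and conv4_1 sit at indices 0, 5, 10 and 19 of vgg19.features
        self.slices = nn.ModuleList([vgg[:1], vgg[1:6], vgg[6:11], vgg[11:20]])
        dims = [64, 128, 256, 512]
        # each VGG feature map is first convolved, as described above, before fusion
        self.lateral = nn.ModuleList([nn.Conv2d(d, 128, 1) for d in dims])
        self.smooth = nn.ModuleList([nn.Conv2d(128, 128, 3, padding=1) for _ in dims])
        self.fc = nn.Linear(128 * 4, style_dim)   # fully connected layer mapping to a style feature

    def forward(self, region):                    # region: masked eye-socket / face / lip image
        feats, h = [], region
        for s in self.slices:
            h = s(h)
            feats.append(h)
        laterals = [l(f) for l, f in zip(self.lateral, feats)]
        # top-down fusion: upsample the deeper map and add it to the shallower one
        for i in range(len(laterals) - 1, 0, -1):
            laterals[i - 1] = laterals[i - 1] + F.interpolate(
                laterals[i], size=laterals[i - 1].shape[-2:], mode="nearest")
        fused = [s(l) for s, l in zip(self.smooth, laterals)]
        pooled = torch.cat([f.mean(dim=(2, 3)) for f in fused], dim=1)
        return self.fc(pooled)                    # style feature of one key region
```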
The FIEnc network is used for extracting facial features of the pictures to be migrated and comprises a feature extraction module, a downsampling module and a residual error module.
The FIEnc network structure is shown in fig. 2 and includes the feature extraction, downsampling and residual modules. To preserve the features of the image to be migrated, the network extracts facial features directly through stacked convolutions. The first two stages raise the feature dimension and downsample, and the residual module improves the expressive power of the network.
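A minimal sketch of such an encoder, assuming instance normalization, two stride-2 downsampling convolutions and three residual blocks (none of which are specified above), could look as follows.

```python
import torch.nn as nn

class ResBlock(nn.Module):
    """Plain residual block used to deepen the encoder."""
    def __init__(self, dim):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(dim, dim, 3, padding=1), nn.InstanceNorm2d(dim), nn.ReLU(inplace=True),
            nn.Conv2d(dim, dim, 3, padding=1), nn.InstanceNorm2d(dim))

    def forward(self, x):
        return x + self.body(x)

class FIEncSketch(nn.Module):
    """Illustrative FIEnc: stacked convolutions for feature extraction, two stride-2
    convolutions for downsampling, then residual blocks, i.e. c = Res(Down(f(X)))."""
    def __init__(self, in_ch=3, base=64, n_res=3):
        super().__init__()
        self.extract = nn.Sequential(        # f: feature extraction
            nn.Conv2d(in_ch, base, 7, padding=3), nn.InstanceNorm2d(base), nn.ReLU(inplace=True))
        self.down = nn.Sequential(           # Down: raise channels, halve resolution twice
            nn.Conv2d(base, base * 2, 4, stride=2, padding=1), nn.InstanceNorm2d(base * 2), nn.ReLU(inplace=True),
            nn.Conv2d(base * 2, base * 4, 4, stride=2, padding=1), nn.InstanceNorm2d(base * 4), nn.ReLU(inplace=True))
        self.res = nn.Sequential(*[ResBlock(base * 4) for _ in range(n_res)])  # Res

    def forward(self, x):
        return self.res(self.down(self.extract(x)))
```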
The MFDec network is used for fusing the facial features of the original image and the cosmetic features of the reference image, and the decoder adopts AdaIN. The network includes a residual module, upsampling, and convolution.
Referring to fig. 4, the three feature maps of different scales output by PSEnc are first mapped into different network layers through the MLP; because the features of the reference image lie in a different feature space from the features of the original image, using the MLP for mapping projects the reference features into a more reasonable feature space. The mapped feature maps are then fused with the feature maps output by FIEnc through AdaIN. AdaIN is also used in the shallow layers of the MFDec network to introduce features, so that lighter makeup in the reference makeup can be retained completely. A residual module is adopted in the backbone structure of the network to improve its expressive power, and upsampling and convolution are used to restore the features into an image.
When the multi-layer perceptron MLP is adopted, the reference features become closer to the distribution of the original image while retaining some of the original information, and the migration effect is more ideal.
All normalization layers of MFDec use AdaIN, in order to preserve as many features of the reference image as possible. The expression of AdaIN is

AdaIN(x) = γ · (x - μ) / sqrt(σ² + ε) + α

wherein x is the feature of the original image, μ is the mean of x in the channel direction, σ² is the variance of x in the channel direction, ε is a very small number, α is the mean of the reference image feature y in the channel direction, and γ is the standard deviation of y in the channel direction.
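A minimal sketch of this AdaIN operation, assuming the usual per-sample spatial statistics and a small ε, is given below; in FP-SCGAN the reference statistics α and γ would come from the style code produced by the MLP.

```python
import torch

def adain(x, ref_mean, ref_std, eps=1e-5):
    """AdaIN(x) = gamma * (x - mu) / sqrt(sigma^2 + eps) + alpha, as defined above.
    x: (N, C, H, W) content feature; ref_mean (alpha) and ref_std (gamma): (N, C) statistics."""
    mu = x.mean(dim=(2, 3), keepdim=True)                   # per-channel mean of x
    var = x.var(dim=(2, 3), keepdim=True, unbiased=False)   # per-channel variance of x
    x_norm = (x - mu) / torch.sqrt(var + eps)
    return ref_std.view(*ref_std.shape, 1, 1) * x_norm + ref_mean.view(*ref_mean.shape, 1, 1)
```

Replacing the affine parameters of every normalization layer in MFDec with (α, γ) predicted from the style code is what lets the decoder shift the content features toward the distribution of the reference image.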
The discriminator is used to measure the distance between the generated distribution and the real distribution. Considering that images generated with an ordinary discriminator tend to be blurry, a Markov discriminator is used. Compared with an ordinary discriminator, a Markov discriminator judges whether each local region is a generated image, so the generated image is finer.
the construction of the arbiter used is shown in fig. 5. Where SN is spectral normalization (Spectral Normalization), this normalization allows the network to meet lipschz continuity (Lipschitz continuity), limiting dramatic changes in the function, and making the model training process more stable. In addition, the design of the arbiter adopts the proposal in the WGAN, and the cross entropy loss is not adopted when the loss is calculated, but the L1 loss is adopted.
Network overall framework: in the forward pass of the network, the image to be made up is directly input into FIEnc to obtain the image features. The reference image is divided into the three parts of eye sockets, skin and lips, which are processed and then input into the fully connected layer to obtain the style code, i.e. the mean and variance for AdaIN. When the image features obtained from FIEnc pass through the residual layers of MFDec, since AdaIN is adopted in the residual layers, the mean and variance of the features shift towards the values in the style code, i.e. the distribution of the features is migrated to the distribution of the reference image. The photo after makeup migration is obtained after the features are upsampled twice.
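Assuming the illustrative modules sketched above and a hypothetical mlp/mfdec interface (neither is specified in this form by the patent), the forward pass described in this paragraph could be written as:

```python
import torch

def transfer_sketch(fienc, psenc, mlp, mfdec, source, ref_eye, ref_lip, ref_face):
    """Forward pass as described above: encode the face to be made up, extract the reference
    style from the three key regions, map it to a style code with the MLP, and decode with
    AdaIN inside MFDec. The mfdec(content, style_code) interface is an assumption."""
    content = fienc(source)                                  # facial features of the image to be migrated
    style = torch.cat([psenc(ref_eye), psenc(ref_lip), psenc(ref_face)], dim=1)
    style_code = mlp(style)                                  # means / variances consumed by AdaIN
    return mfdec(content, style_code)                        # photo after makeup migration
```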
The training of the network is performed in the mutual game between the generator G and the discriminator D, and the network converges when dynamic balance is finally reached. The loss function of FP-SCGAN is shown in equation (1.1):

L_total = min_G max_D ( λ_adv·L_adv + λ_cyc·L_cyc + λ_g·L_vgg^g + λ_l·L_vgg^l + λ_makeup·L_makeup )    (1.1)

wherein L_adv is the adversarial loss, including the generator loss and the discriminator loss, and λ_adv is its loss coefficient; L_cyc is the reconstruction loss and λ_cyc is its loss coefficient; L_vgg^g is the global vgg loss and λ_g is its loss coefficient; L_vgg^l is the local vgg loss and λ_l is its loss coefficient; L_makeup is the makeup loss and λ_makeup is its loss coefficient. min_G max_D indicates that the optimization process of the network is the adversarial training of G and D: when the parameters of G are fixed, max_D means that the confidence of D on real samples is enhanced as much as possible; when the parameters of D are fixed, min_G means that G is optimized so that the gap between real samples and generated samples is minimized.
During network training, the specific steps are as follows:
input: training setWherein x represents an image without makeup and y represents an image with makeup +.>Pairing data representing the generated x, +.>Pairing data representing the generated y, M x Face mask representing pre-makeup image, M y A face mask representing the post-cosmetic image. The training batch size is B, the training set data size is N, the learning rate is gamma, and the learning rate is gamma, the training iteration number is J, I 1 Is lost for L1.
Task: through continuous iterative training on the training set, the generator and the discriminator are made to converge, thereby achieving the goal of makeup migration.
Referring to fig. 6, the specific steps are:
S10, obtaining style features: the non-makeup image x and the makeup image y are sent into FIEnc, and the facial features c_x, c_y of the pictures to be migrated are obtained through the feature extraction, downsampling and residual modules; the key regions of the makeup reference images are sent into PSEnc, features are extracted through a pretrained VGG19 network and fused by a feature pyramid, and the style features s_x, s_y are obtained;
S20, obtaining the fused features of the reference makeup image and the image to be migrated: the obtained style features are sent into the MLP, which maps them into a feature space to obtain the style feature codes code_x, code_y; the facial features of the pictures to be migrated and the style feature codes are sent into MFDec, and feature fusion is carried out through AdaIN in the decoder; meanwhile, AdaIN is also used to introduce features into the shallow layers of MFDec, and the fused features x_y, y_x, x_x, y_y of the reference makeup images and the images to be migrated are obtained through the MFDec network;
S30, optimizing the discriminator and the generator: the parameters of the generator G are fixed, the generator loss is calculated, and the discriminator D is optimized so that its discrimination capability is enhanced; back propagation is then carried out and the discriminator parameters are updated, the two discriminators being used to discriminate the generated makeup image and the makeup-removal image respectively and having identical structures; then the parameters of the discriminator D are fixed, the discriminator loss is calculated, and the generator G is optimized so that its ability to deceive the discriminator D is enhanced;
S40, calculating the various losses: these comprise the identity loss, in which the generator reconstructs the image to be migrated; the makeup loss, which guides the migration of the makeup in the key regions; the local vgg loss, which enhances the retention of semantic information in the key regions; and the global vgg loss, which ensures that the generated image is similar to the original image in semantic information;
S50, updating the generator parameters: x_y, y_x are sent into FIEnc to extract the content features c_{x,fake}, c_{y,fake}; then c_{x,fake} with code_x and c_{y,fake} with code_y are respectively sent into MFDec to obtain x_rec and y_rec; the reconstruction loss is further calculated, which guides the network to carry out the overall style migration while preserving the basic features of the original image; finally, back propagation is carried out and the generator parameters are updated.
S10 specifically comprises taking B samples from the N training samples to form a batch.
Sample x and sample y are fed into FIEnc (see fig. 2 for the structure) to obtain c_x and c_y. The calculation process is shown in formula (1.2), wherein X represents the input image, f is the feature extraction module, Down is the downsampling module, and Res is the residual module.
c=Res(Down(f(X))) (1.2)
The key regions of the face are fed into PSEnc (see fig. 3 for the structure) to obtain the style features s_x and s_y. The process of extracting the makeup style of an image is shown in formula (1.3).
s = concat(E(X*mask_eye), E(X*mask_lip), E(X*mask_face))    (1.3)
Wherein E is PSEnc, X is the input image, mask_eye is the mask of the eye-socket part of the input image, mask_lip is the lip mask of the input image, and mask_face is the face mask of the input image. concat means that the three features are spliced along the feature channel direction. The calculation of E(·) is shown in formula (1.4), wherein mask_item denotes the mask of one of the three key regions, VGG is the pretrained VGG network, and FP is the feature pyramid. After the VGG network extracts the features, the feature pyramid is used to fuse the features of different sizes.
E(X*mask_item) = FP(VGG(X*mask_item))    (1.4)
S20 specifically includes sending s_x and s_y into the MLP to obtain the feature codes code_x and code_y.
The obtained c_x, c_y and code_x, code_y are fed into MFDec (see fig. 4 for the structure) to give x_y, y_x, x_x, y_y; the calculation process is shown in formula (1.5).
out = conv(up(res(Dec(x, MLP(y_code)))))    (1.5)
Where conv is the convolutional layer, up is the upsampling, res is the residual block, and Dec () is the decoder.
S30 specifically includes fixing the parameters of the generator G and calculating the generator losses L_adv^{D_X} and L_adv^{D_Y}, which are used to optimize the discriminator D so that its discrimination capability is enhanced. The calculation process is shown in formula (1.6):

L_adv^{D_X} = E_{x~X,y~Y}[D_x(G(y,x))] - E_{x~X}[D_x(x)],  L_adv^{D_Y} = E_{x~X,y~Y}[D_y(G(x,y))] - E_{y~Y}[D_y(y)]    (1.6)

wherein D_x(·), D_y(·) represent the discriminator outputs on samples from the generated data, and D_x, D_y represent the discriminator outputs on samples from the real data.
Back propagation is then carried out and the discriminator parameters are updated (see fig. 5 for the structure). The two discriminators are used to discriminate the generated makeup image and the makeup-removal image respectively, and the two discriminators are identical in structure.
The parameters of the discriminator D are then fixed and the discriminator losses L_adv^{G_X} and L_adv^{G_Y} are calculated, which are used to optimize the generator G so that its ability to deceive the discriminator D is enhanced. The calculation process is shown in formula (1.7):

L_adv^{G_X} = -E_{x~X,y~Y}[D_x(G(y,x))],  L_adv^{G_Y} = -E_{x~X,y~Y}[D_y(G(x,y))]    (1.7)
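Because equations (1.6) and (1.7) are reconstructed here only from the variable definitions and the WGAN remark above, the following fragment is an assumed illustration of the two adversarial terms rather than the patented formulation; note that the description labels the discriminator-update term the "generator loss" and the generator-update term the "discriminator loss".

```python
def d_loss_sketch(D_x, D_y, x, y, fake_x, fake_y):
    """Assumed form of eq. (1.6), computed with G frozen to optimize D: raise the scores of
    real images and lower those of generated ones (fake_x = G(y, x), fake_y = G(x, y));
    WGAN-style linear scores are assumed since cross entropy is explicitly not adopted."""
    return (D_x(fake_x.detach()).mean() - D_x(x).mean()
            + D_y(fake_y.detach()).mean() - D_y(y).mean())

def g_loss_sketch(D_x, D_y, fake_x, fake_y):
    """Assumed form of eq. (1.7), computed with D frozen to optimize G: make the generated
    images score like real ones."""
    return -(D_x(fake_x).mean() + D_y(fake_y).mean())
```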
S40 specifically includes calculating the identity losses L_idt(x_x, x) and L_idt(y_y, y); the calculation process is shown in formula (1.8). This loss uses the generator to reconstruct x and y, so that the network retains the features of the original images to a greater extent.
L_idt = ||G(x,x) - x||_1 + ||G(y,y) - y||_1    (1.8)
The makeup losses for x and y are then calculated; the calculation process is shown in formula (1.9). The effect of this loss is to guide the migration of the makeup in the key regions:

L_makeup = Σ_i ||(G(x,y) - x̃) ⊙ M_{x,i}||_1 + Σ_i ||(G(y,x) - ỹ) ⊙ M_{y,i}||_1    (1.9)
The local vgg losses are calculated; the calculation process is shown in formula (1.10), wherein M_{y,i} is the face mask of the post-makeup image, i is the sequence number of the key regions comprising the eye sockets, the face and the lips, M_{x,i} is defined similarly, and F_l(·) represents the layer-l feature in the vgg network. The effect of this loss is to enhance the retention of semantic information in the key regions:

L_vgg^l = Σ_i ( ||F_l(G(x,y) ⊙ M_{x,i}) - F_l(x ⊙ M_{x,i})||_2 + ||F_l(G(y,x) ⊙ M_{y,i}) - F_l(y ⊙ M_{y,i})||_2 )    (1.10)
The global vgg losses are calculated; the calculation process is shown in formula (1.11). The effect of this loss is to ensure that the generated image is similar to the original image in semantic information:

L_vgg^g = ||F_l(G(x,y)) - F_l(x)||_2 + ||F_l(G(y,x)) - F_l(y)||_2    (1.11)
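Under the same caveat as above, the makeup loss and the two vgg losses might be sketched as follows; the mask handling, the choice of VGG layer and the function names are assumptions made for the sketch.

```python
import torch.nn.functional as F

def makeup_loss_sketch(x_y, y_x, pseudo_x, pseudo_y, masks_x, masks_y):
    """Assumed form of eq. (1.9): L1 distance between the generated images and the
    pseudo-paired targets, restricted to each key region. x_y = G(x, y), y_x = G(y, x);
    masks_x / masks_y hold the eye-socket, face and lip masks M_{x,i} / M_{y,i}."""
    loss = 0.0
    for m_x, m_y in zip(masks_x, masks_y):
        loss = loss + F.l1_loss(x_y * m_x, pseudo_x * m_x)
        loss = loss + F.l1_loss(y_x * m_y, pseudo_y * m_y)
    return loss

def local_vgg_loss_sketch(vgg_layer, x_y, x, y_x, y, masks_x, masks_y):
    """Assumed form of eq. (1.10): squared error between the layer-l VGG features of the
    generated and original images inside each key region."""
    loss = 0.0
    for m_x, m_y in zip(masks_x, masks_y):
        loss = loss + F.mse_loss(vgg_layer(x_y * m_x), vgg_layer(x * m_x))
        loss = loss + F.mse_loss(vgg_layer(y_x * m_y), vgg_layer(y * m_y))
    return loss

def global_vgg_loss_sketch(vgg_layer, x_y, x, y_x, y):
    """Eq. (1.11): keep the whole-image VGG semantics of the generated images close to the originals."""
    return F.mse_loss(vgg_layer(x_y), vgg_layer(x)) + F.mse_loss(vgg_layer(y_x), vgg_layer(y))
```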
S50 specifically includes sending x_y and y_x into FIEnc to extract the content features c_{x,fake} and c_{y,fake}; c_{x,fake} and code_x are then fed into MFDec to obtain x_rec, and c_{y,fake} and code_y are fed into MFDec to obtain y_rec.
The reconstruction losses L_cyc(x_rec, x) and L_cyc(y_rec, y) are calculated; the calculation process is shown in formula (1.12), wherein G(G(y,x), y) represents that y is first migrated with reference to the makeup of x and the result is then migrated back with reference to the makeup of y, and G(G(x,y), x) is analogous. The effect of this loss is to guide the network to carry out the overall style migration while preserving the basic features of the original image.
L_cyc = ||G(G(y,x), y) - y||_1 + ||G(G(x,y), x) - x||_1    (1.12)
Finally, back propagation is carried out and the generator parameters are updated.
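Tying S30-S50 together, one training iteration might look like the following sketch; `losses` and `lambdas` are hypothetical containers for the loss functions above and their coefficients λ, neither of which is specified in this form by the patent.

```python
def train_step_sketch(G, D_x, D_y, opt_g, opt_d, x, y, losses, lambdas):
    """One illustrative training iteration covering S30-S50."""
    x_y, y_x = G(x, y), G(y, x)          # makeup transfer and makeup removal

    # S30: update the two discriminators with the generator frozen (eq. 1.6);
    # the adversarial_d sketch detaches the generated images internally.
    opt_d.zero_grad()
    d_loss = losses.adversarial_d(D_x, D_y, x, y, y_x, x_y)
    d_loss.backward()
    opt_d.step()

    # S30-S50: update the generator with the discriminators frozen
    opt_g.zero_grad()
    g_loss = (lambdas["adv"] * losses.adversarial_g(D_x, D_y, y_x, x_y)      # eq. 1.7
              + losses.identity(G, x, y)                                     # eq. 1.8
              + lambdas["makeup"] * losses.makeup(x_y, y_x, x, y)            # eq. 1.9
              + lambdas["l"] * losses.local_vgg(x_y, y_x, x, y)              # eq. 1.10
              + lambdas["g"] * losses.global_vgg(x_y, y_x, x, y)             # eq. 1.11
              + lambdas["cyc"] * losses.reconstruction(G, x_y, y_x, x, y))   # eq. 1.12
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```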
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.

Claims (8)

1. A makeup style migration method based on an FP-SCGAN model, characterized in that the FP-SCGAN network comprises PSEnc, FIEnc, MFDec and a Markov discriminator, training of the network is carried out in a mutual game between the generator G and the discriminator D, and the network converges when dynamic balance is finally reached, the training specifically comprising the following steps:
S10, obtaining style features: the non-makeup image x and the makeup image y are sent into FIEnc, and the facial features c_x, c_y of the pictures to be migrated are obtained through the feature extraction, downsampling and residual modules; the key regions of the makeup reference images are sent into PSEnc, features are extracted through a pretrained VGG19 network and fused by a feature pyramid, and the style features s_x, s_y are obtained;
S20, obtaining the fused features of the reference makeup image and the image to be migrated: the obtained style features are sent into a multi-layer perceptron, which maps them into a feature space to obtain the style feature codes code_x, code_y; the facial features of the pictures to be migrated and the style feature codes are sent into MFDec, and feature fusion is carried out through AdaIN in the decoder; meanwhile, AdaIN is also used to introduce features into the shallow layers of MFDec, and the fused features x_y, y_x, x_x, y_y of the reference makeup images and the images to be migrated are obtained through the MFDec network;
S30, optimizing the discriminator and the generator: the parameters of the generator G are fixed, the generator loss is calculated, and the discriminator D is optimized so that its discrimination capability is enhanced; back propagation is then carried out and the discriminator parameters are updated, the two discriminators being used to discriminate the generated makeup image and the makeup-removal image respectively and having identical structures; then the parameters of the discriminator D are fixed, the discriminator loss is calculated, and the generator G is optimized so that its ability to deceive the discriminator D is enhanced;
S40, calculating the various losses: these comprise the identity loss, in which the generator reconstructs the image to be migrated; the makeup loss, which guides the migration of the makeup in the key regions; the local vgg loss, which enhances the retention of semantic information in the key regions; and the global vgg loss, which ensures that the generated image is similar to the original image in semantic information;
S50, updating the generator parameters: x_y, y_x are sent into FIEnc to extract the content features c_{x,fake}, c_{y,fake}; then c_{x,fake} with code_x and c_{y,fake} with code_y are respectively sent into MFDec to obtain x_rec and y_rec; the reconstruction loss is further calculated, which guides the network to carry out the overall style migration while preserving the basic features of the original image; finally, back propagation is carried out and the generator parameters are updated.
2. The method of claim 1, wherein the formula for calculating the generator loss is:

L_adv^{D_X} = E_{x~X,y~Y}[D_x(G(y,x))] - E_{x~X}[D_x(x)],  L_adv^{D_Y} = E_{x~X,y~Y}[D_y(G(x,y))] - E_{y~Y}[D_y(y)]

wherein E_{x~X} denotes the expectation over the true distribution of non-makeup images; E_{y~Y} denotes the expectation over the true distribution of makeup images; E_{x~X,y~Y} denotes the expectation over the generated images; D_x(·), D_y(·) represent the discriminator outputs on samples from the generated data; D_x, D_y represent the discriminator outputs on samples from the real data; G(x,y) is the migration of x with reference to the makeup of y; G(y,x) is the migration of y with reference to the makeup of x.
3. The method of claim 1, wherein the formula for calculating the discriminator loss is:

L_adv^{G_X} = -E_{x~X,y~Y}[D_x(G(y,x))],  L_adv^{G_Y} = -E_{x~X,y~Y}[D_y(G(x,y))]

wherein D_x(·), D_y(·) represent the discriminator outputs on samples from the generated data, E_{x~X,y~Y} denotes the expectation over the generated images, G(x,y) is the migration of x with reference to the makeup of y, and G(y,x) is the migration of y with reference to the makeup of x.
4. The method of claim 1, wherein the identity loss is calculated by the formula:
L_idt = ||G(x,x) - x||_1 + ||G(y,y) - y||_1

wherein G(x,x) is the migration of x with reference to the makeup of x; G(y,y) is the migration of y with reference to the makeup of y; ||·||_1 is the L1 loss, which calculates the absolute error between the real data and the generated data.
5. The method according to claim 1, wherein the calculation formula of the makeup loss is:

L_makeup = Σ_i ||(G(x,y) - x̃) ⊙ M_{x,i}||_1 + Σ_i ||(G(y,x) - ỹ) ⊙ M_{y,i}||_1

wherein x̃ represents the generated pairing data of x, ỹ represents the generated pairing data of y, x represents an image without makeup, y represents an image with makeup, M_{x,i} is the face mask of the pre-makeup image and M_{y,i} is the face mask of the post-makeup image, i being the sequence number of the key regions comprising the three parts of the eye sockets, the face and the lips; G(x,y) represents the migration of x with reference to the makeup of y; G(y,x) represents the migration of y with reference to the makeup of x; ||·||_1 is the L1 loss, which calculates the absolute error between the real data and the generated data.
6. The method of claim 1, wherein the local vgg loss is calculated as:

L_vgg^l = Σ_i ( ||F_l(G(x,y) ⊙ M_{x,i}) - F_l(x ⊙ M_{x,i})||_2 + ||F_l(G(y,x) ⊙ M_{y,i}) - F_l(y ⊙ M_{y,i})||_2 )

wherein M_{x,i} is the face mask of the pre-makeup image and M_{y,i} is the face mask of the post-makeup image, i being the sequence number of the key regions comprising the three parts of the eye sockets, the face and the lips; G(x,y) represents the migration of x with reference to the makeup of y; G(y,x) represents the migration of y with reference to the makeup of x; F_l(·) represents the layer-l feature in the vgg network; ||·||_2 represents the L2 loss, i.e. the squared error between the real data and the generated data.
7. The method of claim 1, wherein the global vgg penalty is calculated by the formula:

L_vgg^g = ||F_l(G(x,y)) - F_l(x)||_2 + ||F_l(G(y,x)) - F_l(y)||_2

wherein G(x,y) represents the migration of x with reference to the makeup of y; G(y,x) represents the migration of y with reference to the makeup of x; F_l(·) represents the layer-l feature in the vgg network; ||·||_2 is the L2 loss.
8. The method of claim 1, wherein the reconstruction loss is calculated as:
L_cyc = ||G(G(y,x), y) - y||_1 + ||G(G(x,y), x) - x||_1

wherein G(G(y,x), y) represents that y is first migrated with reference to the makeup of x and the result is then migrated back with reference to the makeup of y; G(G(x,y), x) represents that x is first migrated with reference to the makeup of y and the result is then migrated back with reference to the makeup of x; ||·||_1 is the L1 loss, which calculates the absolute error between the real data and the generated data.
CN202210488449.1A 2022-05-06 2022-05-06 Makeup style migration method based on FP-SCGAN model Active CN114863527B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210488449.1A CN114863527B (en) 2022-05-06 2022-05-06 Makeup style migration method based on FP-SCGAN model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210488449.1A CN114863527B (en) 2022-05-06 2022-05-06 Makeup style migration method based on FP-SCGAN model

Publications (2)

Publication Number Publication Date
CN114863527A CN114863527A (en) 2022-08-05
CN114863527B true CN114863527B (en) 2024-03-19

Family

ID=82634559

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210488449.1A Active CN114863527B (en) 2022-05-06 2022-05-06 Makeup style migration method based on FP-SCGAN model

Country Status (1)

Country Link
CN (1) CN114863527B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107464210A (en) * 2017-07-06 2017-12-12 浙江工业大学 An image style transfer method based on a generative adversarial network
CN107644006A (en) * 2017-09-29 2018-01-30 北京大学 An automatic Chinese font library generation method based on a deep neural network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10762337B2 (en) * 2018-04-27 2020-09-01 Apple Inc. Face synthesis using generative adversarial networks


Also Published As

Publication number Publication date
CN114863527A (en) 2022-08-05


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant