CN110706303A - Face image generation method based on GANs - Google Patents

Face image generation method based on GANs

Info

Publication number
CN110706303A
CN110706303A
Authority
CN
China
Prior art keywords
image
training
face image
gans
generator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910975725.5A
Other languages
Chinese (zh)
Other versions
CN110706303B (en)
Inventor
和红杰
陈泓佑
陈帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest Jiaotong University
Original Assignee
Southwest Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest Jiaotong University filed Critical Southwest Jiaotong University
Priority to CN201910975725.5A priority Critical patent/CN110706303B/en
Publication of CN110706303A publication Critical patent/CN110706303A/en
Application granted granted Critical
Publication of CN110706303B publication Critical patent/CN110706303B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 11/00 2D [Two Dimensional] image generation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 Feature extraction; Face representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a face image generation method based on GANs, relating to the field of computer technology. The face image produced by the generator is associated not only with the random vector but also with the feature vector, so the generated image is directly influenced by the features of the training images, which increases interpretability. Gradient vanishing is effectively avoided: decoding training can take place before the two-class adversarial training, which avoids the gradient-vanishing phenomenon caused by optimizing the JS divergence and improves the quality of the generated images. The decoder learns good structural features of the images, so the generator learns them as well, reducing images with distorted faces while learning image sharpness more reasonably. Because of the feature decoding constraint, the gradient descent direction is also constrained to some extent when the objective function is optimized, so fewer epochs are needed during training.

Description

Face image generation method based on GANs
Technical Field
The invention relates to the technical field of computers, in particular to a human face image generation method based on GANs.
Background
Image generation based on GANs is one of the hot topics of current artificial-intelligence research. In theory, GANs-based image generation can effectively simulate many kinds of image content, such as human faces, buildings, indoor scenes, flowers, and animal images. Generating such images also has practical significance. For example, effective generation of real or cartoon faces can replace the virtual creation of common character roles in film, television, or animation works, saving cost; for indoor scenes, generation can protect indoor background information that photographers want to keep private; and when the number of images in a certain category is small, more images of that category can be produced, achieving data augmentation.
The basic structure of GANs comprises two neural networks (hereinafter "networks"): a generator G and a discriminator D. The generator G produces a generated image G(z) from a fed random vector z, and the discriminator D performs two-class training using training-set images x and generated images G(z) as positive and negative samples. The goal of the generator G is to make the distribution of G(z) as similar as possible to the distribution of the image samples in the training set X; the goal of the discriminator D is to distinguish, as well as possible, the true labels of the positive samples x and the negative samples G(z). The two networks are trained in turn, and through this adversarial learning the generator G finally acquires the ability to generate images similar to the training set.
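As an illustration of this alternating scheme, the following sketch performs one discriminator update followed by one generator update in PyTorch. The network shapes, learning rate, and binary cross-entropy loss are assumptions of the sketch, not the configuration claimed later in this document.

import torch
import torch.nn as nn

z_dim, img_dim, k = 100, 64 * 64 * 3, 64    # latent size, flattened 64x64 RGB image, batchsize

G = nn.Sequential(nn.Linear(z_dim, 512), nn.ReLU(),
                  nn.Linear(512, img_dim), nn.Tanh())       # generator output in [-1, 1]
D = nn.Sequential(nn.Linear(img_dim, 512), nn.LeakyReLU(0.2),
                  nn.Linear(512, 1), nn.Sigmoid())          # discrimination value in (0, 1)

opt_G = torch.optim.RMSprop(G.parameters(), lr=2e-4)        # RMSProp, as in steps S43/S44 below
opt_D = torch.optim.RMSprop(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

def adversarial_step(x):
    """One alternating update; x: (k, img_dim) training images scaled to [-1, 1]."""
    z = torch.rand(k, z_dim) * 2 - 1                        # excitation vector z, uniform on [-1, 1]
    fake = G(z)
    # Discriminator: positive samples x vs. negative samples G(z).
    d_loss = bce(D(x), torch.ones(k, 1)) + bce(D(fake.detach()), torch.zeros(k, 1))
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()
    # Generator: push D's judgment of G(z) toward "real".
    g_loss = bce(D(G(z)), torch.ones(k, 1))
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()
    return d_loss.item(), g_loss.item()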
One of the more important lines of work for improving the diversity and quality of images generated by GANs is to optimize some other divergence instead of the JS divergence. For the classical GANs model, the optimization goal is the JS divergence between the training-data distribution and the generated-data distribution, as in the original GANs and DCGANs. However, JS-divergence optimization has a deficiency: when the overlap between the training-data distribution and the generated-data distribution is small, the JS divergence is approximately constant, so the gradient of the optimized objective function vanishes, which harms the image-generation quality of GANs. Typical replacements use other distances or divergences: WGANs, WGANsGP, and BEGANs take the Wasserstein (W) distance between the training-data and generated-data distributions as the optimization target, while LSGANs use the Pearson divergence. Compared with DCGANs, which optimize the JS divergence, these models improve generation quality to some extent. WGANs use relatively crude weight clipping to make the discriminator network satisfy the 1-Lipschitz condition, which affects generation quality to some degree. WGANsGP replaces weight clipping with a gradient penalty centered at 1 on the discriminator network, a more reasonable treatment. BEGANs improve the discriminator with an encoding-decoding idea and further optimize the W distance, but the optimization target is more complex, and generated images easily show small blotchy regions or lose detailed texture. In addition, all of these methods require a large number of parameter-update iterations to reach a good training result.
Disclosure of Invention
The object of the present invention is to provide a face image generation method based on GANs that alleviates the above problems.
In order to alleviate the above problems, the technical solution adopted by the invention is as follows:
the invention provides a face image generation method based on GANs, which comprises the following steps:
s1, acquiring a training set X, wherein the training set X consists of a plurality of face images;
s2, extracting the hidden features of all the face images in the training set X to obtain a hidden feature set C of the face images;
s3, face image decoding training, specifically comprising:
S31, sample batchsize face images from the training set X without repetition, and perform pixel-value scale transformation on the sampled face images;
S32, compute the Boolean value δ of the discriminant function according to equation (1),
δ = δ(t, r, l) ∈ {0, 1} (1) [the original equation image is not recoverable from the text; δ switches the decoding constraint on or off as a function of t, r and l]
where t is the current epoch number, r controls the frequency with which the decoding constraint is invoked, and l is the last epoch for which the decoding constraint holds; if δ = 1, continue to step S33; if δ = 0, take the generator G of the GANs as the adversarial-learning generator G and jump to step S4;
S33, construct a decoder Dec that has the same network structure as the generator G of the GANs and shares its weights, and perform decoding training on the decoder Dec with the RMSProp optimization method according to equation (2),
L_Dec = (λ/k) Σ_{i=1}^{k} ||Dec(c_i) − x_i||² (2)
where λ is the weight coefficient of the decoding loss function, x_i is the i-th face image after the pixel-value scale transformation of step S31, c_i is the hidden feature corresponding to x_i in the face-image hidden-feature set C, Dec(c_i) denotes the output image decoded from c_i by the decoder Dec, and k is the batchsize value;
through the training of the decoder Dec, the generator G of the GANs is updated by parameter sharing, and the updated generator G is taken as the adversarial-learning generator G;
S4, adversarial learning in face image generation, specifically comprising:
S41, sample batchsize face images from the training set X without repetition, and perform pixel-value scale transformation on the sampled face images;
S42, take the batchsize images after the pixel-value scale transformation of step S41 as positive samples, generate batchsize random vectors by a random generation method, and feed the batchsize random vectors, as the input information source, into the adversarial-learning generator G to obtain batchsize generated images as negative samples;
S43, feed the positive samples and the negative samples into the discriminator D of the GANs, perform weight-update training of the discriminator D with the RMSProp optimization method, and output the trained discriminator D, the optimized loss function being equation (3),
L_D = −(1/k) Σ_{i=1}^{k} [log D(x_i) + log(1 − D(G(z_i)))] (3)
where D(x_i) is the discrimination value of the discriminator D for the i-th positive sample x_i, and D(G(z_i)) is the discrimination value of the discriminator D for the i-th negative sample G(z_i);
S44, feed the batchsize random vectors obtained in step S42 into the adversarial-learning generator G, and perform weight-update training of the adversarial-learning generator G with the RMSProp optimization method, the optimized loss function being equation (4),
L_G = (1/k) Σ_{i=1}^{k} log(1 − D(G(z_i))) (4)
if the current epoch training is not finished or the current epoch number has not reached the maximum epoch number, jump to step S41; if the current epoch training is finished and the current epoch number has reached the maximum epoch number, output the trained generator G;
S45, save the GANs composed of the trained discriminator D and the trained generator G;
S5, generate image random vectors by a random generation method, use the GANs obtained in step S45 to generate face images with the image random vectors as input, and perform pixel-value scale transformation on the generated face images to complete the face image generation.
The technical effect of this solution is as follows: the face image produced by the generator is associated not only with the random vector but also with the feature vector, so the generated image is directly influenced by the features of the training images, which increases interpretability; gradient vanishing is effectively avoided, since decoding training can take place before the two-class adversarial training, avoiding the gradient-vanishing phenomenon caused by optimizing the JS divergence and improving the quality of the generated images; the decoder learns good structural features of the images, so the generator learns them as well, reducing images with distorted faces while learning image sharpness more reasonably; and because of the feature decoding constraint, the gradient descent direction is also constrained to some extent when the objective function is optimized, so fewer epochs can be used during training.
Optionally, the step S2 specifically includes:
S21, sample batchsize face images from the training set X without repetition, and perform pixel-value scale transformation on the sampled face images;
S22, train a feature learning network with the batchsize face images after the pixel-value scale transformation of step S21;
S23, extract the hidden features of the face images after the pixel-value scale transformation of step S21 through the feature learning network; if the hidden features of the face images in the training set X have not all been extracted, jump to step S21; otherwise, output the face-image hidden-feature set C.
The technical effect of this solution is as follows: after this processing, the hidden features of every image in the face-image training set X are obtained; X and C are in one-to-one correspondence, and each face image has its corresponding hidden feature.
Optionally, the pixel-value scale transformation of the face images in step S21, step S31 and step S41 is performed according to equation (5), which maps pixel values to [-1,1],
x_i ← x_i/127.5 − 1 (5)
where i is the index of an image among the batchsize images, i ∈ [1, batchsize].
The technical effect of this solution is as follows: the pixel-value range of each image becomes the real interval [-1,1], which standardizes the training-set images and makes them convenient to feed into the network for learning.
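A small helper pair illustrating this transform and its inverse (equation (7) below); the uint8 input type and the use of NumPy are assumptions of the sketch.

import numpy as np

def to_unit_range(img_u8: np.ndarray) -> np.ndarray:
    """Map uint8 pixels [0,255] -> float [-1,1]:  x <- x/127.5 - 1  (eq. (5))."""
    return img_u8.astype(np.float32) / 127.5 - 1.0

def to_pixel_range(img_f: np.ndarray) -> np.ndarray:
    """Map generator output [-1,1] -> [0,255]:  G(z) <- 127.5*(G(z)+1)  (eq. (7))."""
    return np.clip(127.5 * (img_f + 1.0), 0, 255).astype(np.uint8)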
Optionally, the step S22 specifically includes:
constructing an initial feature learning network, feeding the batchsize face images after the scale transformation of step S21 into the feature learning network, and training it fully with the Adam optimizer on the mean-square-error loss function shown in equation (6); after the maximum epoch number of the feature learning network is reached, the training of the feature learning network is complete,
L = (1/k) Σ_{i=1}^{k} ||x_i* − x_i||² (6)
where x_i* is the i-th reconstructed image output by the feature learning network, corresponding to the i-th face image x_i in the training set X.
The technical effect of this solution is as follows: the training image and the reconstructed image are made as similar as possible, and since equation (6) is a convex optimization function, the objective is easier to optimize; when the network has converged after sufficient training, the closer the training images are to the reconstructed images, the better the features of the network's middle layer represent the training-set images.
Optionally, the feature learning network is any one of a deep neural network, a convolutional neural network, a U-Net-type autoencoder, a DenseNet-type autoencoder and a sparse autoencoder.
The technical effect of this solution is as follows: such a network combined with equation (6) can reconstruct the face images of the training set, and after sufficient training the features in the feature set are taken from the features of the network's middle layer.
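For concreteness, a minimal convolutional autoencoder of this kind might be trained with the MSE loss of equation (6), exposing its bottleneck output as the hidden feature. The layer sizes and the 100-dimensional bottleneck are illustrative assumptions (chosen to match the 100-dimensional excitation vector mentioned later), not a prescribed architecture.

import torch
import torch.nn as nn

class FeatureNet(nn.Module):
    """Autoencoder; the bottleneck output serves as the hidden feature c."""
    def __init__(self, feat_dim=100):
        super().__init__()
        self.encoder = nn.Sequential(                       # 3x64x64 -> feat_dim
            nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),           # -> 32x32x32
            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),          # -> 64x16x16
            nn.Flatten(), nn.Linear(64 * 16 * 16, feat_dim), nn.Tanh())
        self.decoder = nn.Sequential(                       # feat_dim -> 3x64x64
            nn.Linear(feat_dim, 64 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (64, 16, 16)),
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh())  # outputs in [-1, 1]

    def forward(self, x):
        c = self.encoder(x)                                 # middle-layer output
        return self.decoder(c), c

net = FeatureNet()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)           # Adam, as in step S22

def reconstruct_step(x):
    """x: (k, 3, 64, 64) images scaled to [-1, 1]; MSE reconstruction, eq. (6)."""
    x_rec, _ = net(x)
    loss = ((x_rec - x) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()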
Optionally, in step S44, the adversarial-learning generator G is trained twice in succession.
The technical effect of this solution is as follows: the generation quality can be improved to a certain extent.
Optionally, the step S5 specifically includes:
S51, set the number N of required images, determine the distribution type of the image random vectors so that it is consistent with the distribution type of the random vectors in step S42, and load the GANs obtained in step S45;
S52, generate N image random vectors by a random generation method, feed them sequentially into the trained generator G, and output N generated images with pixel values in [-1,1];
S53, perform pixel-value scale transformation on the N generated images of step S52 using equation (7) to complete the face image generation,
G(z_j) ← 127.5 × (G(z_j) + 1) (7)
where G(z_j) is the j-th generated image, z_j is its image random vector, j = 1, 2, 3, ..., N.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
FIG. 1 is a flow chart of a face image generation method according to an embodiment of the present invention;
FIG. 2 is a general block diagram of the human face image generation GANs in the embodiment of the invention;
FIG. 3 is a flow chart of feature extraction in an embodiment of the present invention;
FIG. 4 is a flow chart of decoding constraint and countermeasure learning according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The key terms involved in the foregoing background and the following examples are explained as follows:
GANs: Generative Adversarial Networks, which learn by competition between two networks; by feeding in excitation vectors, the generator network is made to output samples similar to the training-set distribution.
JS divergence: Jensen–Shannon divergence, which measures the distance between two distributions. The greater the difference between two distributions, the larger their JS divergence, and once the difference exceeds a certain degree the JS divergence approaches a constant value; the smaller the difference, the smaller their JS divergence, with the minimum value 0 attained if and only if the two distributions are identical.
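For reference, the standard definition is the textbook formula (not reproduced from the patent text):
JS(P‖Q) = (1/2)·KL(P‖M) + (1/2)·KL(Q‖M), where M = (P + Q)/2 and KL is the Kullback–Leibler divergence.
It satisfies 0 ≤ JS(P‖Q) ≤ log 2, and JS(P‖Q) = log 2 exactly when P and Q have disjoint supports; this constant-value regime is what lies behind the vanishing-gradient problem discussed in the Background.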
DCGANs: deep Convolution generated adaptive Networks (DEEP) are the GANs models of a generator G and a discriminator D designed by utilizing a Convolution neural network, and the Deep Convolution generated adaptive Networks are one of the GANs standard models, and have relatively great breakthrough.
WGANs: wasserstein general adaptive Networks, a W distance generating countermeasure network, is a method of changing an optimization function to optimize the W distance between a generated image set distribution and a training image set distribution. Compared with the GANs for optimizing JS divergence, the method can overcome the problem of gradient disappearance caused by JS divergence, but WGANs need to enable the discriminator D to meet 1-Lipschitz continuity, so a brute force weight pruning method is used for the discriminator D.
WGANsGP: wasserstein genetic adaptive Networks, Gradient Penalty. Compared with WGANs, the WGANs model with the gradient penalty avoids a brute force processing mode of weight pruning, and the training effect is better.
BEGANs: boundary balanced GANs, Boundary balanced basic Networks. And improving the GANs optimization function by using a boundary balance strategy to avoid the gradient disappearance phenomenon possibly brought by optimizing JS divergence. The objective function it optimizes is the W distance.
LSGANs: the raw Squares genetic adaptive Networks. And the optimized objective function is converted into Pearson divergence, so that the gradient disappearance phenomenon possibly brought by JS divergence is avoided. The design of the loss function is a least square loss function.
Batchsize: the size of the sample fed by each batch of training GANs.
An Epoch: and in the training period, the number of the training sets is fixed, the training is carried out by feeding Batchsize samples into the GANs each time without repeating, and one training period is carried out after all the training set samples are covered (generally, samples sampled in each batch are sequentially sampled from the training sets.
Example 1
Referring to fig. 1, fig. 2 and fig. 4, an embodiment of the present invention provides a method for generating a human face image based on GANs, including the following steps:
and S1, acquiring a training set X, wherein the training set X consists of a plurality of face images.
In this embodiment, two methods of acquiring the training set X are given. The first is to center-crop the CELEBA face data set into face images of a fixed size, such as 64×64 or 96×96. The second is to crawl pictures of people on the public internet with a crawler, cut out the face images with face-recognition technology, and finally scale the images to a fixed size such as 64×64 or 96×96.
S2, extracting the hidden features of all the face images in the training set X to obtain a hidden feature set C of the face images.
S3, face image decoding training
S31, sample batchsize face images x_1, x_2, ..., x_k (k = batchsize) from the training set X without repetition, and scale their pixel values to [-1,1] according to equation (5), where i is the index of an image among the batchsize images, i ∈ [1, batchsize]; the transformed images are still denoted x_1, x_2, ..., x_k (k = batchsize).
S32, compute the Boolean value δ of the discriminant function according to equation (1),
δ = δ(t, r, l) ∈ {0, 1} (1) [the original equation image is not recoverable from the text; δ switches the decoding constraint on or off as a function of t, r and l]
where t is the current epoch number, r controls the frequency with which the decoding constraint is invoked, and l is the last epoch for which the decoding constraint holds. If δ = 1, continue to step S33; if δ = 0, take the generator G of the GANs as the adversarial-learning generator G and jump to step S4; that is, the generator G of the GANs directly performs adversarial learning in the following steps without decoding training.
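The equation image for (1) is not reproduced in the available text. One plausible reading consistent with the description (invoke decoding every r-th epoch, up to a last decoding epoch l) is sketched below; the exact rule is an assumption.

def delta(t: int, r: int, l: int) -> int:
    # Hypothetical gate for eq. (1): decode on every r-th epoch while t <= l.
    # The rule in the original equation image may differ.
    return 1 if (t % r == 0 and t <= l) else 0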
S33, construct a decoder Dec that has the same network structure as the generator G of the GANs and shares its weights, and perform decoding training on the decoder Dec with the RMSProp optimization method according to equation (2),
L_Dec = (λ/k) Σ_{i=1}^{k} ||Dec(c_i) − x_i||² (2)
where λ is the weight coefficient of the decoding loss function, x_i is the i-th face image after the pixel-value scale transformation of step S31, c_i is the hidden feature corresponding to x_i in the face-image hidden-feature set C, Dec(c_i) denotes the output image decoded from c_i by the decoder Dec, and k is the batchsize value;
through the training of the decoder Dec, the generator G of the GANs is updated by parameter sharing, and the generator G updated this time is taken as the adversarial-learning generator G; that is, the following steps perform adversarial learning with this updated generator G.
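A minimal sketch of this decoding pass, reusing the generator G and the torch import from the sketch after the Background section: binding Dec to the same module object realizes "same network structure and weight sharing", so every Dec update is a G update. This assumes the hidden feature c has the same dimension as the excitation vector z; the λ value and the per-element mean are assumptions of the sketch.

lam = 1.0                                    # decoding-loss weight coefficient λ (assumed value)
Dec = G                                      # shared weights: one module, two roles
opt_dec = torch.optim.RMSprop(Dec.parameters(), lr=2e-4)

def decode_step(x, c):
    """x: (k, img_dim) images scaled to [-1,1]; c: (k, z_dim) hidden features."""
    loss = lam * ((Dec(c) - x) ** 2).mean()  # eq. (2), up to a pixel-count factor
    opt_dec.zero_grad(); loss.backward(); opt_dec.step()
    return loss.item()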
S4, adversarial learning in face image generation
S41, sample batchsize face images x_1, x_2, ..., x_k (k = batchsize) from the training set X without repetition, and scale their pixel values to [-1,1] according to equation (5), where i is the index of an image among the batchsize images, i ∈ [1, batchsize]; the transformed images are still denoted x_1, x_2, ..., x_k (k = batchsize).
S42, obtain the positive and negative samples of the discriminator D. Take the batchsize pixel-value-scaled images x_1, x_2, ..., x_k (k = batchsize) of step S41 as positive samples. Randomly generate batchsize vectors z_1, z_2, ..., z_k (k = batchsize) by a random generation method; the z vector is the excitation vector of the image generated by the adversarial-learning generator G, serving as its input information source, and has a fixed dimension, e.g. 100. Each element of a z vector is a real number; one choice is the uniform distribution on [-1,1], though a normal distribution with mean 0 and standard deviation 1 can also be used, in which case the element values are not necessarily confined to [-1,1]. Feed the z vectors into the adversarial-learning generator G to obtain batchsize generated images G(z_1), G(z_2), ..., G(z_k) (k = batchsize) as negative samples.
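A sketch of the two sampling choices just described; the 100-dimensional default follows the example in the text.

import torch

def sample_z(k: int, z_dim: int = 100, dist: str = "uniform") -> torch.Tensor:
    if dist == "uniform":
        return torch.rand(k, z_dim) * 2 - 1   # uniform on [-1, 1]
    return torch.randn(k, z_dim)              # N(0, 1); values not confined to [-1, 1]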
S43, feed the positive samples and the negative samples into the discriminator D of the GANs, perform weight-update training of the discriminator D with the RMSProp optimization method, and output the trained discriminator D, the optimized loss function being equation (3),
L_D = −(1/k) Σ_{i=1}^{k} [log D(x_i) + log(1 − D(G(z_i)))] (3)
where D(x_i) is the discrimination value of the discriminator D for the i-th positive sample x_i, and D(G(z_i)) is the discrimination value of the discriminator D for the i-th negative sample G(z_i).
S44, feed the batchsize random vectors z_1, z_2, ..., z_k obtained in step S42 into the adversarial-learning generator G, and perform weight-update training of the adversarial-learning generator G with the RMSProp optimization method, training G twice in succession; the optimized loss function is equation (4),
L_G = (1/k) Σ_{i=1}^{k} log(1 − D(G(z_i))) (4)
Repeat steps S3–S4: for the training set X, feed batchsize images into the GANs for each training step; after all face images of the training set X have been covered, one epoch of training is complete. Continue this process until the maximum epoch number is reached, and finally save the trained generator G.
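A skeleton of this overall schedule, tying together the sketch functions above (delta, decode_step, adversarial_step); the loader is a hypothetical iterable yielding batches of scaled images and their hidden features.

def train(loader, max_epoch, r, l):
    """loader: hypothetical iterable of (x, c) batches — images in [-1,1] and features."""
    for t in range(1, max_epoch + 1):
        use_decode = delta(t, r, l) == 1       # eq. (1): one δ per epoch
        for x, c in loader:
            if use_decode:
                decode_step(x, c)              # S3: decoding constraint updates G via Dec
            adversarial_step(x)                # S4: D update, then G update
            # per step S44, the generator update inside the step may be run twice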
S45, save the GANs composed of the trained discriminator D and the trained generator G.
S5, face image generation
S51, setting the number N of required images, determining the distribution type of random vectors of the images to be consistent with the distribution type of the random vectors when the GANs are trained in the step S42, and loading the GANs obtained in the step S45;
s52, generating N random vectors z of images by using random generation method1,z2,...,zNSequentially feeding the N random vectors of the images into the trained generator G to obtain the original values G (z) of the output images of the N generators G (the value range of the output values is [ -1,1]])。
S53, using formula (7) to carry out image pixel value scale transformation on N G (z) to make the pixel value be [0,255], at this time, storing the N images G (z), completing the generation of the face image,
G(z_j) ← 127.5 × (G(z_j) + 1) (7)
where G(z_j) is the j-th generated image, j = 1, 2, 3, ..., N.
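A sketch of this generation stage, combining the trained generator with the rescaling of equation (7); the function and parameter names are illustrative.

import torch

@torch.no_grad()
def generate_faces(G, n: int, z_dim: int = 100) -> torch.Tensor:
    z = torch.rand(n, z_dim) * 2 - 1          # same distribution as in training (step S42)
    imgs = G(z)                               # generator outputs lie in [-1, 1]
    return (127.5 * (imgs + 1)).round().clamp(0, 255).to(torch.uint8)  # eq. (7), pixels in [0, 255]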
Compared with the prior art, the method for generating the human face image based on the GANs has the following advantages:
1. Increased interpretability: in most GANs, the information fed to the generator G is a random vector z, which means G generates images with a noise signal z as its excitation feature; it is then hard to interpret any intrinsic relation between the feature z of G(z) and the features c of the training set X. In the proposed face generation method, the excitation signal of the generator G (the decoder Dec shares the weights of the generator G, and their network structures are identical) is no longer only a random vector z but also the feature vector c of a training image x. The face image produced by the generator G is therefore related not only to z but also to c, which shows that the generated image G(z) is directly influenced by the feature c of the training image x, increasing interpretability.
2. The gradient-vanishing problem of optimizing the JS divergence is alleviated: optimizing equations (3) and (4) optimizes the JS divergence between the distribution of the training image set X and the distribution of the generated image set G(z). From the nature of the JS divergence, the more similar the two distributions, the less likely the JS divergence is to approach a constant, so gradient vanishing is effectively avoided. Decoding training can take place before the two-class adversarial training; its aim is to reconstruct the training image set X, which amounts to increasing the similarity between the distribution of G(z) and that of X. This helps avoid the gradient vanishing caused by optimizing the JS divergence and thus improves the quality of the generated images.
3. Improved visual effect of the generated images: the decoder Dec undergoes decoding training before the adversarial training. As noted in point 2, decoding reconstruction forces the output image of the decoder Dec to agree with the training image pixel by pixel, so the decoder Dec learns good structural features of the images; the generator G therefore learns better structural features too, reducing images with distorted faces, while the sharpness (texture features) of the images can be learned more reasonably.
4. Fewer training iterations: because the feature decoding constraint is imposed on the objective function, the gradient descent direction is also constrained to some extent when the objective function is optimized, so fewer epochs can be used in training.
Example 2
Referring to fig. 3, step S2 in embodiment 1 specifically includes:
S21, sequentially sample batchsize face images x_1, x_2, ..., x_k (k = batchsize) from the training set X without repetition (within one epoch, sequential sampling under a fixed ordering of the training set is sampling without repetition), and scale their pixel values to [-1,1] according to equation (5), where i is the index of an image among the batchsize images, i ∈ [1, batchsize]; the transformed images are still denoted x_1, x_2, ..., x_k (k = batchsize).
S22, train the feature learning network with the batchsize face images after the pixel-value scale transformation of step S21, as follows:
construct an initial feature learning network, feed the batchsize pixel-value-scaled images x_1, x_2, ..., x_k (k = batchsize) of step S21 into the feature learning network, and train it fully with the Adam optimizer on the mean-square-error loss function shown in equation (6); after the maximum epoch number of the feature learning network is reached, the training of the feature learning network is complete,
L = (1/k) Σ_{i=1}^{k} ||x_i* − x_i||² (6)
where x_i* is the i-th reconstructed image output by the feature learning network, corresponding to the i-th face image x_i in the training set X.
S23, after the feature learning network has been fully trained, feed the pixel-value-scaled face images x_1, x_2, ..., x_k (k = batchsize) of step S21 into the feature learning network and record the output values of the network's middle layer, obtaining the corresponding hidden features c_1, c_2, ..., c_k (k = batchsize) of the face images.
Following this feature extraction method, repeat steps S21 to S23 until the hidden-feature extraction of the whole training set X is complete, obtaining the face-image feature set C. The whole face-image feature extraction process is a pre-training process.
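One way to record the middle-layer output is a forward hook on the bottleneck, sketched below against the FeatureNet example given earlier; net is the name from that sketch, and image_batches is a hypothetical iterable of scaled training-image batches.

import torch

features = []

def grab(module, inputs, output):
    features.append(output.detach())               # bottleneck output = hidden feature c

handle = net.encoder.register_forward_hook(grab)   # hook the encoder's final output
with torch.no_grad():
    for x in image_batches:                        # hypothetical batches of scaled images
        net(x)                                     # forward pass; hook records c per batch
handle.remove()
C = torch.cat(features)                            # hidden-feature set C, row-aligned with X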
In this embodiment, the feature learning network may be any one of a deep neural network, a convolutional neural network, a U-Net-type autoencoder, a DenseNet-type autoencoder and a sparse autoencoder.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (7)

1. A face image generation method based on GANs is characterized by comprising the following steps:
s1, acquiring a training set X, wherein the training set X consists of a plurality of face images;
s2, extracting the hidden features of all the face images in the training set X to obtain a hidden feature set C of the face images;
s3, face image decoding training, specifically comprising:
S31, sample batchsize face images from the training set X without repetition, and perform pixel-value scale transformation on the sampled face images;
S32, compute the Boolean value δ of the discriminant function according to equation (1),
δ = δ(t, r, l) ∈ {0, 1} (1) [the original equation image is not recoverable from the text; δ switches the decoding constraint on or off as a function of t, r and l]
where t is the current epoch number, r controls the frequency with which the decoding constraint is invoked, and l is the last epoch for which the decoding constraint holds; if δ = 1, continue to step S33; if δ = 0, take the generator G of the GANs as the adversarial-learning generator G and jump to step S4;
S33, construct a decoder Dec that has the same network structure as the generator G of the GANs and shares its weights, and perform decoding training on the decoder Dec with the RMSProp optimization method according to equation (2),
L_Dec = (λ/k) Σ_{i=1}^{k} ||Dec(c_i) − x_i||² (2)
where λ is the weight coefficient of the decoding loss function, x_i is the i-th face image after the pixel-value scale transformation of step S31, c_i is the hidden feature corresponding to x_i in the face-image hidden-feature set C, Dec(c_i) denotes the output image decoded from c_i by the decoder Dec, and k is the batchsize value;
through the training of the decoder Dec, the generator G of the GANs is updated by parameter sharing, and the updated generator G is taken as the adversarial-learning generator G;
S4, adversarial learning in face image generation, specifically comprising:
S41, sample batchsize face images from the training set X without repetition, and perform pixel-value scale transformation on the sampled face images;
S42, take the batchsize images after the pixel-value scale transformation of step S41 as positive samples, generate batchsize random vectors by a random generation method, and feed the batchsize random vectors, as the input information source, into the adversarial-learning generator G to obtain batchsize generated images as negative samples;
S43, feed the positive samples and the negative samples into the discriminator D of the GANs, perform weight-update training of the discriminator D with the RMSProp optimization method, and output the trained discriminator D, the optimized loss function being equation (3),
L_D = −(1/k) Σ_{i=1}^{k} [log D(x_i) + log(1 − D(G(z_i)))] (3)
where D(x_i) is the discrimination value of the discriminator D for the i-th positive sample x_i, and D(G(z_i)) is the discrimination value of the discriminator D for the i-th negative sample G(z_i);
S44, feed the batchsize random vectors obtained in step S42 into the adversarial-learning generator G, and perform weight-update training of the adversarial-learning generator G with the RMSProp optimization method, the optimized loss function being equation (4),
L_G = (1/k) Σ_{i=1}^{k} log(1 − D(G(z_i))) (4)
if the current epoch training is not finished or the current epoch number has not reached the maximum epoch number, jump to step S41; if the current epoch training is finished and the current epoch number has reached the maximum epoch number, output the trained generator G;
S45, save the GANs composed of the trained discriminator D and the trained generator G;
S5, generate image random vectors by a random generation method, use the GANs obtained in step S45 to generate face images with the image random vectors as input, and perform pixel-value scale transformation on the generated face images to complete the face image generation.
2. The face image generation method based on GANs according to claim 1, wherein step S2 specifically comprises:
S21, sample batchsize face images from the training set X without repetition, and perform pixel-value scale transformation on the sampled face images;
S22, train a feature learning network with the batchsize face images after the pixel-value scale transformation of step S21;
S23, extract the hidden features of the face images after the pixel-value scale transformation of step S21 through the feature learning network; if the hidden features of the face images in the training set X have not all been extracted, jump to step S21; otherwise, output the face-image hidden-feature set C.
3. The face image generation method based on GANs according to claim 2, wherein in step S21, step S31 and step S41 the pixel-value scale transformation of the face images is performed according to equation (5), which maps pixel values to [-1,1],
x_i ← x_i/127.5 − 1 (5)
where i is the index of an image among the batchsize images, i ∈ [1, batchsize].
4. The face image generation method based on GANs according to claim 2, wherein step S22 specifically comprises:
constructing an initial feature learning network, feeding the batchsize face images after the pixel-value scale transformation of step S21 into the feature learning network, and training it fully with the Adam optimizer on the mean-square-error loss function shown in equation (6); after the maximum epoch number of the feature learning network is reached, the training of the feature learning network is complete,
L = (1/k) Σ_{i=1}^{k} ||x_i* − x_i||² (6)
where x_i* is the i-th reconstructed image output by the feature learning network, corresponding to the i-th face image x_i in the training set X.
5. The face image generation method based on GANs according to claim 4, wherein the feature learning network is any one of a deep neural network, a convolutional neural network, a U-Net-type autoencoder, a DenseNet-type autoencoder and a sparse autoencoder.
6. The face image generation method based on GANs according to claim 1, wherein in step S44 the adversarial-learning generator G is trained twice in succession.
7. The face image generation method based on GANs according to claim 1, wherein step S5 specifically comprises:
S51, set the number N of required images, determine the distribution type of the image random vectors so that it is consistent with the distribution type of the random vectors in step S42, and load the GANs obtained in step S45;
S52, generate N image random vectors by a random generation method, feed them sequentially into the trained generator G, and output N generated images with pixel values in [-1,1];
S53, perform pixel-value scale transformation on the N generated images of step S52 using equation (7) to complete the face image generation,
G(z_j) ← 127.5 × (G(z_j) + 1) (7)
where G(z_j) is the j-th generated image, z_j is its image random vector, j = 1, 2, 3, ..., N.
CN201910975725.5A 2019-10-15 2019-10-15 Face image generation method based on GANs Active CN110706303B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910975725.5A CN110706303B (en) 2019-10-15 2019-10-15 Face image generation method based on GANs

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910975725.5A CN110706303B (en) 2019-10-15 2019-10-15 Face image generation method based on GANs

Publications (2)

Publication Number Publication Date
CN110706303A (en) 2020-01-17
CN110706303B CN110706303B (en) 2021-05-11

Family

ID=69198332

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910975725.5A Active CN110706303B (en) 2019-10-15 2019-10-15 Face image generation method based on GANs

Country Status (1)

Country Link
CN (1) CN110706303B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111353995A (en) * 2020-03-31 2020-06-30 成都信息工程大学 Cervical single cell image data generation method based on generation countermeasure network
CN112102194A (en) * 2020-09-15 2020-12-18 北京金山云网络技术有限公司 Face restoration model training method and device
CN112288013A (en) * 2020-10-30 2021-01-29 中南大学 Small sample remote sensing scene classification method based on element metric learning
CN112966429A (en) * 2020-08-11 2021-06-15 中国矿业大学 Non-linear industrial process modeling method based on WGANs data enhancement
CN113191950A (en) * 2021-05-07 2021-07-30 西南交通大学 Super-resolution face image reconstruction method
JP2021114279A (en) * 2020-01-20 2021-08-05 ベイジン バイドゥ ネットコム サイエンス アンド テクノロジー カンパニー リミテッド Image generation method, generation device, electronic apparatus, computer readable medium, and computer program
CN114005170A (en) * 2022-01-05 2022-02-01 中国科学院自动化研究所 DeepFake defense method and system based on visual countermeasure reconstruction
WO2022062449A1 (en) * 2020-09-25 2022-03-31 平安科技(深圳)有限公司 User grouping method and apparatus, and electronic device and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180268202A1 (en) * 2017-03-15 2018-09-20 Nec Laboratories America, Inc. Video surveillance system based on larger pose face frontalization
CN109615582A (en) * 2018-11-30 2019-04-12 北京工业大学 A kind of face image super-resolution reconstruction method generating confrontation network based on attribute description
CN109635774A (en) * 2018-12-21 2019-04-16 中山大学 A kind of human face synthesizing method based on generation confrontation network
CN109785258A (en) * 2019-01-10 2019-05-21 华南理工大学 A kind of facial image restorative procedure generating confrontation network based on more arbiters
CN109815928A (en) * 2019-01-31 2019-05-28 中国电子进出口有限公司 A kind of face image synthesis method and apparatus based on confrontation study

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180268202A1 (en) * 2017-03-15 2018-09-20 Nec Laboratories America, Inc. Video surveillance system based on larger pose face frontalization
CN109615582A (en) * 2018-11-30 2019-04-12 北京工业大学 A kind of face image super-resolution reconstruction method generating confrontation network based on attribute description
CN109635774A (en) * 2018-12-21 2019-04-16 中山大学 A kind of human face synthesizing method based on generation confrontation network
CN109785258A (en) * 2019-01-10 2019-05-21 华南理工大学 A kind of facial image restorative procedure generating confrontation network based on more arbiters
CN109815928A (en) * 2019-01-31 2019-05-28 中国电子进出口有限公司 A kind of face image synthesis method and apparatus based on confrontation study

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陈泓佑 et al.: "DCGANs training method based on sub-sample set construction" (基于子样本集构建的DCGANs训练方法), 《自动化学报》 (Acta Automatica Sinica) *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2021114279A (en) * 2020-01-20 2021-08-05 ベイジン バイドゥ ネットコム サイエンス アンド テクノロジー カンパニー リミテッド Image generation method, generation device, electronic apparatus, computer readable medium, and computer program
JP7084457B2 (en) 2020-01-20 2022-06-14 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド Image generation methods, generators, electronic devices, computer-readable media and computer programs
US11463631B2 (en) 2020-01-20 2022-10-04 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for generating face image
CN111353995A (en) * 2020-03-31 2020-06-30 成都信息工程大学 Cervical single cell image data generation method based on generation countermeasure network
CN111353995B (en) * 2020-03-31 2023-03-28 成都信息工程大学 Cervical single cell image data generation method based on generation countermeasure network
CN112966429A (en) * 2020-08-11 2021-06-15 中国矿业大学 Non-linear industrial process modeling method based on WGANs data enhancement
CN112102194A (en) * 2020-09-15 2020-12-18 北京金山云网络技术有限公司 Face restoration model training method and device
WO2022062449A1 (en) * 2020-09-25 2022-03-31 平安科技(深圳)有限公司 User grouping method and apparatus, and electronic device and storage medium
CN112288013A (en) * 2020-10-30 2021-01-29 中南大学 Small sample remote sensing scene classification method based on element metric learning
CN113191950A (en) * 2021-05-07 2021-07-30 西南交通大学 Super-resolution face image reconstruction method
CN114005170A (en) * 2022-01-05 2022-02-01 中国科学院自动化研究所 DeepFake defense method and system based on visual countermeasure reconstruction

Also Published As

Publication number Publication date
CN110706303B (en) 2021-05-11

Similar Documents

Publication Publication Date Title
CN110706303B (en) Face image generation method based on GANs
CN113469356B (en) Improved VGG16 network pig identity recognition method based on transfer learning
CN112541864A (en) Image restoration method based on multi-scale generation type confrontation network model
CN110390638B (en) High-resolution three-dimensional voxel model reconstruction method
CN112215050A (en) Nonlinear 3DMM face reconstruction and posture normalization method, device, medium and equipment
CN110728219A (en) 3D face generation method based on multi-column multi-scale graph convolution neural network
CN111652049A (en) Face image processing model training method and device, electronic equipment and storage medium
CN111861945B (en) Text-guided image restoration method and system
CN112686817B (en) Image completion method based on uncertainty estimation
CN113989100B (en) Infrared texture sample expansion method based on style generation countermeasure network
Saquil et al. Ranking cgans: Subjective control over semantic image attributes
CN112819689B (en) Training method of human face attribute editing model, human face attribute editing method and human face attribute editing equipment
CN112183742A (en) Neural network hybrid quantization method based on progressive quantization and Hessian information
CN108959512B (en) Image description network and technology based on attribute enhanced attention model
CN114330736A (en) Latent variable generative model with noise contrast prior
Shariff et al. Artificial (or) fake human face generator using generative adversarial network (GAN) machine learning model
CN114638408A (en) Pedestrian trajectory prediction method based on spatiotemporal information
CN116383639A (en) Knowledge distillation method, device, equipment and storage medium for generating countermeasure network
CN111667006A (en) Method for generating family font based on AttGan model
CN109598771B (en) Terrain synthesis method of multi-landform feature constraint
Golts et al. Image compression optimized for 3D reconstruction by utilizing deep neural networks
CN117750155A (en) Method and device for generating video based on image and electronic equipment
CN117522674A (en) Image reconstruction system and method combining local and global information
CN112528077A (en) Video face retrieval method and system based on video embedding
CN117094365A (en) Training method and device for image-text generation model, electronic equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant