CN112598125A

CN112598125A - Handwritten number generation method for generating countermeasure network based on double-discriminator weighting

Info

Publication number: CN112598125A
Application number: CN202011342015.8A
Authority: CN
Inventors: 刘宝; 高娜; 黄梦涛; 刘海; 闫洪霖; 张金玉; 宋美玉; 王良; 师露露; 翟晓航
Original assignee: Shaanxi Zhongyi Times Technology Co ltd; Xian University of Science and Technology
Current assignee: Shaanxi Zhongyi Times Technology Co ltd; Xian University of Science and Technology
Priority date: 2020-11-25
Filing date: 2020-11-25
Publication date: 2021-04-02
Anticipated expiration: 2040-11-25
Also published as: CN112598125B

Abstract

The invention discloses a handwritten digit generation method for generating an antagonistic network based on double-discriminator weighting, which relates to the technical field of deep learning and aims at the technical problem of D2GAN (D2 GAN). the application provides the handwritten digit generation method for generating the antagonistic network based on double-discriminator weighting, a new target function is constructed by introducing the weighting idea, and the phenomenon of gradient disappearance is avoided from the perspective of a loss function; the advantages of the forward KL divergence and the reverse KL divergence are combined, so that the generated modes are diversified, and the mode collapse problem of the GAN is solved.

Description

Handwritten number generation method for generating countermeasure network based on double-discriminator weighting

Technical Field

The invention relates to the technical field of deep learning, in particular to a handwritten number generation method for generating an anti-network based on double-discriminator weighting.

Background

Generation of Antagonistic Networks (GAN) is a method of antagonistic learning that has developed in recent years. The GAN is composed of a generator G and a discriminator D, which are based on the idea of game theory and play with each other, and the goal is to find Nash equilibrium (Nash equilibrium) in the continuous non-convex problem with high-dimensional parameters. The GAN is proved to be capable of generating vivid images, is helpful in data enhancement and image completion, and is mainly applied to the fields of image super-resolution reconstruction, transfer learning, image restoration and the like.

But the loss function of the generator is equivalent to minimizing the real data P given an optimal discriminator_data(x) And generating a sample P_g(z) JS divergence between (z) (Jensen-Shannon). The two distributions are difficult to intersect in a high-dimensional space, even if the two distributions intersect, the intersecting part is a low-dimensional manifold in the high-dimensional space, the measure is 0, the intersecting part can be ignored, and at the moment, the JS divergence is constant, and the problem of gradient disappearance occurs. To solve this problem, Goodfellow et al redefines the loss function-log (D (G (z))) of the generator. Although the gradient vanishing problem is solved, the contradiction of minimizing KL divergence and maximizing JS divergence exists in the objective function, so that the training of the generator is unstable, and the generated samples are mostly safe samples and have the defect of repeated safetyThe generated images tend to be consistent and have reduced diversity due to poor diversity, and the generated samples are deficient in variety, so that the problem of mode collapse occurs.

Aiming at the problem of mode collapse of GAN, a D2GAN algorithm is proposed in the prior art, the idea of double discriminators is introduced, and the KL divergence and the inverse KL divergence are comprehensively utilized in an objective function so as to balance the confrontation of a generator and a discriminator. The objective function of the method introduces hyper-parameters alpha and beta, and has two purposes: one is to reduce-D by reducing alpha and beta, respectively, for the purpose of model-stable learning₁(G (z)) and-D₂(x) The impact on the optimization; second, by increasing α, β, minimization of KL divergence and inverse KL divergence is encouraged, respectively. Similar to GAN algorithm, the introduction of the hyper-parameter has a alleviating effect on model stability and mode collapse, but the decrease and increase of the hyper-parameter are contradictory, so that the two functions can be mutually counteracted, thereby making the training of the generator unstable.

D2GAN introduces a double discriminator to try to solve the GAN pattern collapse problem, but does not give guiding suggestions beyond the definition of parameters, and its generator can learn most distributions, but still some distributions are forgotten. Therefore, the application provides a handwritten number generation method for generating an anti-network based on dual-discriminator weighting, a new objective function is constructed by introducing a weighting idea, and the phenomenon of gradient disappearance is avoided from the perspective of a loss function; the advantages of the forward KL divergence and the reverse KL divergence are combined, so that the generated modes are diversified, and the mode collapse problem of the GAN is solved.

Disclosure of Invention

The invention aims to provide a handwritten number generation method for generating an antagonistic network based on double-discriminator weighting, which introduces a weighting thought to construct a new target function and avoids the phenomenon of gradient disappearance from the angle of a loss function; the advantages of the forward KL divergence and the reverse KL divergence are combined, so that the generated modes are diversified, and the mode collapse problem of the GAN is solved.

The invention provides a handwritten digit generation method for generating a confrontation network based on double-discriminator weighting, which comprises the following steps:

s1: establishing a D2WGAN network model, wherein the D2WGAN network model comprises a generator G and a discriminator D₁And a discriminator D₂The D2WGAN network model is trained through back propagation;

s2: and performing theoretical analysis on the D2WGAN network model, verifying that the generator recovers and generates real data of handwritten numbers through the KL divergence and the reverse KL divergence between the minimum model and the real data under an optimal discriminator.

Further, the training process of step S1 is specifically:

s11: an MNIST data set is adopted as a training sample;

s12: building a generator and a discriminator model;

s13: establishing a loss function of a discriminator and a generator;

s14: training generators and discriminator models.

Further, the input-output relationship between the generator and the arbiter in step S12 is as follows:

the input and output expression relation of the generator is as follows:

wherein: t (G) denotes the output of the generation network G,

to slave noise space P_zSampling m samples;

discriminator D₁The input and output expression relation of (1) is as follows:

wherein: t (D)₁) Representation discrimination network D₁Is then outputted from the output of (a),

to be derived from the real data space P_dataSampling m samples;

discriminator D₂The input and output expression relation of (1) is as follows:

wherein: t (D)₂) Representation discrimination network D₂To output of (c).

Further, in the step S13, the input of the generator and the discriminator is obtained first, the input of the generator is the noise z, and z satisfies the random noise distribution P_zThe input of the discriminator comprises a sample generated by the generator and a real data sample x, the noise data is input into the generator to obtain G (z), and the samples G (z) generated by the generator are respectively input into a discriminator D₁And a discriminator D₂To obtain D₁(G (z) and D₂(G (z)), the real data x is input to a discriminator D₁And a discriminator D₂To obtain D₁(x) And D₂(x)；

The penalty function for the discriminator is:

Loss_D＝Loss_D₁+Loss_D₂ (6)

wherein: loss _ D₁Is a discriminator D₁Loss function of (Loss _ D)₂Is a discriminator D₂Loss _ D is the Loss function of the total discriminator, and the relationship between the two hyper-parameters is: a discriminator D for 0 ≤ ω ≤ 1 and ρ + ω ═ 1₁Mainly focusing on the real data, discriminator D₂Mainly emphasizes the data generated by the generator, and the two discriminators are connected through weighting;

the loss function of the generator is:

further, in the training process in step S14, an Adam optimizer is used for training, and the formula is as follows:

m_t＝μ*m_t-1+(1-μ)*g_t (8)

further, the discriminator D in the step S1₁And a discriminator D₂The D2WGAN network model is characterized in that all sensors are multilayer sensors, an objective function of the D2WGAN network model is formed by weighting a forward objective function and a reverse objective function, and the form of the objective function is as follows:

introducing hyper-parameters rho and omega, wherein the rho and omega are weights of forward and reverse objective functions respectively; wherein rho + omega is 1 and 0 is less than or equal to rho, and omega is less than or equal to 1;

the forward objective function is:

the inverse objective function is:

when ρ is 1 and ω is 0, the D2WGAN algorithm degenerates to a forward objective function, which is:

the optimal discriminator is as follows:

on the basis of the optimal discriminator, the optimal generator is as follows:

when ρ is 0 and ω is 1, the D2WGAN algorithm degenerates to an inverse objective function, i.e. the objective function is:

the optimal discriminator is as follows:

on the basis of the optimal discriminator, the optimal generator is as follows:

when the D2WGAN exists in both the forward direction and the reverse direction objective functions, namely 0 < rho, omega < 1 and rho + omega is 1, the network is equivalent to the weighted fusion of KL divergence and reverse KL divergence according to different emphasis degrees of single and various modes, and the complementary characteristics of the two divergences are utilized to avoid the generation of bad samples on the basis of generating various modes.

Further, the step S2 gives a fixed G, and T (G, D)₁，D₂) Maximizing to obtain the following optimal arbiter in closed form

Further, the step S2 is given

And

under Nash equilibrium, if and only if P_G＝P_dataWhen the temperature of the water is higher than the set temperature,

reaching a global minimum;

wherein the content of the first and second substances,

compared with the prior art, the invention has the following remarkable advantages:

(1) compared with a D2GAN target function, the method is different in concept. The D2WGAN introduces the weighting idea to ensure that the network has a certain emphasisThe heuristic defect of the hyperparameter in the D2GAN is improved, so that fusion and complementation of the forward KL divergence and the reverse KL divergence are realized, generation of bad samples is avoided on the basis of generating various modes, and the mode collapse problem can be effectively solved. The introduction purposes of the hyper-parameters in the two objective functions are different. D2WGAN introduces hyper-parameters to weight the forward and backward KL divergence, while D2GAN introduces to stabilize the model, reduce-D₁(G (z)) and-D₂(x) And controlling the effect of the forward and reverse KL divergence on the optimization. The two constraints are different. An emphasis weighting idea is introduced into a D2WGAN algorithm, the constraint conditions are that rho is 1, omega is 0, rho is 0, and omega is 1, and the completeness of an emphasis point is considered; the constraint conditions in D2GAN are 0 < alpha, beta < 1, and lack of interpretability.

2) The optimized objective function of the invention can degrade towards the positive direction and the negative direction KL divergence. The D2WGAN algorithm can be degenerated into KL divergence and reverse KL divergence in two extreme cases of rho-1, omega-0 and rho-0, omega-1, is beneficial to realizing the generation of multiple modes or capturing a single mode, and has strong interpretability; whereas only 0 < α, β ≦ 1 is required in D2GAN, the objective function itself determines that it cannot degrade to KL and inverse KL divergence, and is poorly interpretable.

3) The results of the invention after optimization are different. When P is present_G＝P_dataThe optimal discriminator in the D2WGAN algorithm is

The optimal discriminator in the D2GAN is D₁ ^*(x)＝α，D₂ ^*(x) β. When the generator is optimized on the basis of the optimal arbiter, the D2WGAN result is

D2GAN results in J (G, D)₁ ^*，D₂ ^*) Alog (α -1) + β log (β -1). The result shows that when the D2GAN judges whether the network obtains the optimal discriminator and the optimal generator, the results are all provided with the hyper-parameters alpha and beta, and the judgment standard is changed every time along with the change of the hyper-parameters alpha and beta, so that the judgment is not intuitive; the D2WGAN results of the inventionThe number of the integers is not changed along with the change of the super parameters, and the judgment result is more visual when the network is judged whether to reach Nash balance or not.

Drawings

FIG. 1 is a schematic diagram of a handwritten digit generation method for generating a countermeasure network by using dual discriminators with weighting according to an embodiment of the present invention;

FIG. 2 is a flowchart of a handwritten digit generation method for generating a countermeasure network by weighting a dual discriminator according to an embodiment of the present invention;

FIG. 3 is a block diagram of a dual arbiter weighted generation countermeasure network according to an embodiment of the present invention;

FIG. 4 is a graph of maximum minimum loss and non-saturation loss provided by an embodiment of the present invention;

FIG. 5 is a graph of linear loss provided by an embodiment of the present invention;

FIG. 6 is a flowchart of training discriminant networks D1 and D2 according to an embodiment of the present invention;

FIG. 7 is a flow chart of training of a generator network G according to an embodiment of the present invention;

fig. 8 is a diagram illustrating an effect of a handwritten digital image generated by a GAN network using an MNIST data set according to an embodiment of the present invention;

fig. 9 is an effect diagram of a handwritten digital image generated by a D2GAN network using an MNIST data set according to an embodiment of the present invention;

fig. 10 is a diagram illustrating the effect of generating a handwritten digital image by using an MNIST data set in a D2WGAN network according to an embodiment of the present invention.

Detailed Description

The technical solutions of the embodiments of the present invention are clearly and completely described below with reference to the drawings in the present invention, and it is obvious that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, shall fall within the scope of protection of the present invention.

Aiming at the problem of mode collapse of GAN, the capability of generating diversity samples is poor; the invention provides a dual-discriminator weighted generation countermeasure network, introduces the weighting thought to construct a new target function, and avoids the phenomenon of gradient disappearance from the angle of a loss function; the advantages of the forward KL divergence and the reverse KL divergence are combined, so that the generated modes are diversified, and the mode collapse problem is avoided.

Two loss functions are proposed in GAN to train the generator G. In early training, the minimax loss function may not provide enough gradient to allow a generator to perform good training, and is easy to saturate. In the experimental process, the two loss functions are found to cause gradient disappearance during generator training, and as a result, as shown in fig. 1, when an input value is small, an output value of a discriminator tends to be 0, and saturation occurs at the initial training stage by using a minimx loss function; when the input value is large, the output value of the discriminator tends to 1, and saturation occurs in the later training period by using the unsaturated loss function. As shown in FIG. 2, the derivative of the linear loss function is-1, and the gradient of the linear loss function is not attenuated, so that the phenomenon of gradient disappearance can be effectively relieved; by utilizing the linear relation, the method can quickly converge in the random gradient descent and has high calculation speed. Practice has shown that the samples generated using the linear loss function-D (-) training generator are more diverse.

Fig. 1 is a schematic diagram of a handwritten number generation method for generating a countermeasure network based on dual-discriminator weighting, and fig. 2 is a flowchart of the handwritten number generation method for generating a countermeasure network based on dual-discriminator weighting, in which a handwritten number data set is prepared as a training sample, random noise is used as an input of a generator, and data generated by the generator and real data obtained by sampling are used as inputs of a discriminator. Initializing the weight parameters of a generator and two discriminators, firstly fixing the weight parameters of the generator, training discriminators D1 and D2 networks, wherein the two discriminators have the same model structure and different loss functions during training, so that the emphasis points are different; and secondly, fixing the weight parameters of the two discriminators and training a discriminator G network.

Referring to fig. 1-10, the invention provides a handwritten digit generation method for generating a confrontation network based on dual-discriminator weighting, which comprises the following steps:

s1: establishing a D2WGAN network model, wherein the D2WGAN network model comprises a generator G and a discriminator D₁And a discriminator D₂Fig. 3 is a structural diagram of a dual-arbiter weighted generation countermeasure network, the D2WGAN network model being trained by back propagation;

Example 1

The step S1 training process specifically includes:

s11: an MNIST data set is adopted as a training sample;

s12: building a generator and a discriminator model;

s13: establishing a loss function of a discriminator and a generator;

s14: training generators and discriminator models.

Wherein said step S11 prepares a MNIST dataset as a training sample, the MNIST dataset consisting of numbers handwritten from 250 different persons, of which 50% are high school students and 50% are from staff of the Census Bureau of population. The data set comprises 70000 handwritten digital pictures, wherein 60000 are training sets, 10000 are testing sets, 10 types of handwritten digital pictures are divided into 0-9, and the 10 types of handwritten digital pictures are normalized into 28 × 28 grayscale pictures.

In step S12, the input-output relationship between the generator and the discriminator is as follows:

in order to compare the effect of generating the confrontation network by the dual-discriminator weight and the effect of generating the confrontation network and generating the handwritten number by the dual-discriminator generated confrontation network, the model structure of the generator and the discriminator is set to be very simple, and only one hidden layer is provided.

The input and output expression relation of the generator is as follows:

wherein: t (G) denotes the output of the generation network G,

to slave noise space P_zSampling m samples;

to be derived from the real data space P_dataSampling m samples;

wherein: t (D)₂) Representation discrimination network D₂To output of (c).

In step S13, the input of the generator and the discriminator is obtained first, the input of the generator is noise z, and z satisfies the random noise distribution P_zThe input of the discriminator comprises a sample generated by the generator and a real data sample x, the noise data is input into the generator to obtain G (z), and the samples G (z) generated by the generator are respectively input into a discriminator D₁And a discriminator D₂To obtain D₁(G (z) and D₂(G (z)), the real data x is input to a discriminator D₁And a discriminator D₂To obtain D₁(x) And D₂(x) (ii) a Fig. 6 is a flowchart of training the arbiter networks D1 and D2 according to an embodiment of the present invention, and fig. 7 is a flowchart of training the generator network G according to an embodiment of the present invention;

the penalty function for the discriminator is:

Loss_D＝Loss_D₁+Loss_D₂ (6)

the loss function of the generator is:

in the step S14, an Adam optimizer is used for training in the training process, and the formula is as follows:

m_t＝μ*m_t-1+(1-μ)*g_t (8)

equations (8) and (9) are first and second order moment estimates, respectively, of the gradient, and equations (10) and (11) are corrections to the first order second order moment estimate, which can be approximated as unbiased estimates of the expectation. It can be seen that the moment estimation directly on the gradient has no additional requirements on the memory and can be dynamically adjusted according to the gradient. The last term is preceded by a dynamic constraint formed on the learning rate n and having a well-defined range.

Example 2

The discriminator D in the step S1₁And a discriminator D₂The D2WGAN network model is a multilayer perceptron, an objective function of the D2WGAN network model is composed of a forward objective function and a reverse objective function in a weighting mode, and the objective function is in the form of:

the forward objective function is:

the inverse objective function is:

the optimal discriminator is as follows:

on the basis of the optimal discriminator, the optimal generator is as follows:

in this case, the network is equivalent to optimizing KL divergence, which facilitates generation of multi-mode distributions, but may generate poor samples.

the optimal discriminator is as follows:

on the basis of the optimal discriminator, the optimal generator is as follows:

the network then acts to optimize the inverse KL divergence to help better capture the single mode, but may lose some samples.

Example 3

The step S2 gives a fixed G, and T (G, D)₁，D₂) Maximization to obtain the optimal arbiter of the following closed form

Example 4

The step S2 is given

And

reaching a global minimum;

wherein the content of the first and second substances,

example 5

The effect of the invention is further described by combining simulation experiments.

1. The experimental environment is as follows:

the simulation experiment environment of the invention is as follows: the processor is InterXeon E5-2620 v4, the operating system is 64-bit Windows 10, the graphics card is NVIDIA GeForce RTX 2080Ti, the pycharm editor is used, the python3.7 version is used, and the Tensorflow deep learning framework is used. The MNIST is used for handwriting digital images, a data set comprises 70000 handwritten digital images, 60000 are training sets, 10000 are testing sets, and only 60000 training data sets are used in the experiment.

2. Simulation experiment contents:

and generating the handwritten digital picture on the originally generated countermeasure network, the double-discriminant generation countermeasure network and the double-discriminant weighting generation countermeasure network respectively. Except for the self algorithm, the network structure is basically consistent and comprises a hidden layer, iterative training is carried out for the same times, the handwritten number generation results are compared, and the results are shown in fig. 8, fig. 9 and fig. 10.

3. And (3) simulation result analysis:

as can be seen from fig. 8, when the GAN generates MNIST data, most of the generated handwritten digits are 0, 3, and 8, and other handwritten digits appear very little, and at the later stage of training, other handwritten digits except 0, 3, and 8 do not appear substantially. The reason is that GAN finds a pattern that is easier to fool the discriminator during training, and thus has an increasing probability of generating such a pattern, resulting in fewer handwritten digit types being generated. As can be seen from fig. 9, the D2GAN generates more types of handwritten numbers, and there are other handwritten numbers except for 2 which appears less during the training process in the later stage of the run. The D2GAN can learn most of the distribution in the learning process, but a part of the distribution is still forgotten. As can be seen from fig. 10, D2WGAN generates all handwritten digits within 0-9 and is evenly distributed. Overall, D2WGAN better balances the parameters of KL divergence and inverse KL divergence, with better diversity of generation.

The above disclosure is only for a few specific embodiments of the present invention, however, the present invention is not limited thereto, and any variations that can be made by those skilled in the art are intended to fall within the scope of the present invention.

Claims

1. A handwritten digit generation method for generating a confrontation network based on dual-discriminator weighting is characterized by comprising the following steps:

s2: and performing theoretical analysis on the D2WGAN network model, verifying that the generator recovers and generates real data of handwritten digits by minimizing KL divergence and reverse KL divergence between the model and the real data under an optimal discriminator.

2. The method as claimed in claim 1, wherein the training process of step S1 is specifically as follows:

s11: an MNIST data set is adopted as a training sample;

s12: building a generator and a discriminator model;

s13: establishing a loss function of a discriminator and a generator;

s14: training generators and discriminator models.

3. The method for generating handwritten numeral based on dual-discriminator-weighted generation countermeasure network as claimed in claim 2, wherein the input-output relationship of the generator and the discriminator in step S12 is as follows:

the input and output expression relation of the generator is as follows:

wherein: t (G) denotes the output of the generation network G,

to slave noise space P_zSampling m samples;

to be derived from the real data space P_dataSampling m samples;

wherein: t (D)₂) Representation discrimination network D₂To output of (c).

4. The method as claimed in claim 2, wherein the generator and the discriminator are obtained as inputs in step S13, the generator input is noise z, and z satisfies the random noise distribution P_zThe input of the discriminator comprises a sample generated by the generator and a real data sample x, the noise data is input into the generator to obtain G (z), and the samples G (z) generated by the generator are respectively input into a discriminator D₁And a discriminator D₂To obtain D₁(G (z) and D₂(G (z)), the real data x is input to a discriminator D₁And a discriminator D₂To obtain D₁(x) And D₂(x)；

The penalty function for the discriminator is:

Loss_D＝Loss_D₁+Loss_D₂ (6)

wherein: loss _ D₁Is a discriminator D₁Loss function of (Loss _ D)₂Is a discriminator D₂Loss _ D is the Loss function of the total discriminator, and the relationship between the two hyper-parameters is: a discriminator D for 0 ≤ ω ≤ 1 and ρ + ω ═ 1₁Mainly focusing on the real data, discriminatorsD₂Mainly focuses on the data generated by the generator, and the two discriminators are connected through weighting;

the loss function of the generator is:

5. the method as claimed in claim 2, wherein the training in step S14 is performed by using Adam optimizer, and the formula is as follows:

6. the method for generating handwritten numeral based on dual-discriminator-weighted generation countermeasure network as claimed in claim 1, wherein said discriminator D in said step S1₁And a discriminator D₂Are all multilayer perceptrons, the D2WGAN networkThe objective function of the model is composed of a forward objective function and a reverse objective function in a weighted mode, and the objective function is in the following form:

the forward objective function is:

the inverse objective function is:

the optimal discriminator is as follows:

on the basis of the optimal discriminator, the optimal generator is as follows:

the optimal discriminator is as follows:

on the basis of the optimal discriminator, the optimal generator is as follows:

when the D2WGAN exists in both the forward direction and the reverse direction objective functions, namely 0 < rho, omega < 1 and rho + omega is 1, the network is equivalent to the weighted fusion of KL divergence and reverse KL divergence according to different emphasis degrees of single and multiple modes, and the complementary characteristics of the two divergences are utilized to avoid the generation of bad samples on the basis of generating multiple modes.

7. The method as claimed in claim 1, wherein the step S2 gives a fixed G, and gives T (G, D)₁，D₂) Maximization to obtain the optimal arbiter of the following closed form

8. The method for generating handwritten numeral based on bidiscriminant weighting generation countermeasure network as claimed in claim 1, wherein said step S2 specifies

And

reaching a global minimum;

wherein the content of the first and second substances,