CN114612589A - Application of stable generation countermeasure network in style migration based on attention mechanism - Google Patents

Application of stable generation countermeasure network in style migration based on attention mechanism

Publication number
CN114612589A
Authority
CN
China
Prior art keywords
attention
generator
network
cyclegan
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210250457.2A
Other languages
Chinese (zh)
Inventor
李庚隆
徐蔚鸿
张康
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changsha University of Science and Technology
Original Assignee
Changsha University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changsha University of Science and Technology
Priority to CN202210250457.2A
Publication of CN114612589A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00: 2D [Two Dimensional] image generation
    • G06T11/60: Editing figures and text; Combining figures or text
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/047: Probabilistic or stochastic networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/048: Activation functions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06N3/084: Backpropagation, e.g. using gradient descent
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06N3/088: Non-supervised learning, e.g. competitive learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00: 2D [Two Dimensional] image generation
    • G06T11/001: Texturing; Colouring; Generation of texture or colour
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00: Geometric image transformations in the plane of the image
    • G06T3/04: Context-preserving transformations, e.g. by using an importance map

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

An attention-based stable generative adversarial network (ASGAN) is provided, which effectively enlarges the receptive field. The ASGAN training process is stabilized by using instance normalization and the AdaBelief optimizer, and the network is applied to image style transfer. Theoretical analysis shows that ASGAN has a lower computational cost and fewer total parameters than CycleGAN. Qualitative and quantitative experiments show that, compared with CycleGAN and AGGAN, ASGAN produces better and more stable image style transfer results and scores better on the PSNR, SSIM, LPIPS and FID metrics.

Description

Application of an attention-based stable generative adversarial network in style transfer
Technical Field
The invention relates to the field of style transfer, and in particular to the application of an attention-based stable generative adversarial network in style transfer.
Background
In recent years, with the development of deep learning, the generative adversarial network (GAN) model, proposed by Ian Goodfellow while he was studying at the University of Montreal, has been continuously improved. While researching generative models, Goodfellow hoped to generate pictures by simulating the way the human brain thinks, but the generated pictures were consistently poor in quality, blurred and unclear. Doubting the use of a single traditional neural network, he proposed an entirely new idea: use two neural networks at the same time and set them against each other in a game. This was the original idea behind the generative adversarial network. Today, this game-theoretic idea is applied in many areas, such as images, speech and network security.
As GANs have developed, researchers have analyzed them from different perspectives. Arjovsky et al. analyzed some of the problems of the original GANs and proposed research directions based on them, laying a guiding foundation for the subsequent development of GANs. Kurach et al. analyzed GAN regularization methods, normalization methods and network structures, some of which are redundant, and by using the latest loss functions and network structures obtained better results than conventional GANs. Bau et al. demonstrated several practical applications of their analysis framework, from comparing the internal representations of different layers, models and datasets, to improving GANs by locating and removing the elements that cause distortion, to interactively controlling objects in a scene. Lucic et al. analyzed GAN models reported to achieve state-of-the-art results, found that the experimental results did not hold up, and proposed computing resources as a direction for optimization, opening a new path for the development of GANs. Arora et al. analyzed the assumptions behind GANs and found that the objective function cannot solve the problems of mode collapse and the learning of meaningless features. Mescheder et al. showed that discontinuous distributions in the training samples are one cause of the non-convergence of unregularized GANs, and also analyzed several regularization methods recently proposed to stabilize training. Nagarajan et al. analyzed the training convergence of GANs from a dynamical-systems perspective.
The GAN variants proposed so far fall into two main categories. The first changes the network structure, for example using a convolutional neural network to process images or a recurrent neural network to process time-series data. The second changes the loss function, which can make the learning of the generator more stable.
A generative adversarial network consists of two network structures, a generator and a discriminator, which can be combined to form a GAN or used separately. This makes GANs highly adaptable, and GANs with different network structures have been proposed for different fields. The original GAN uses fully connected neural networks and can only handle relatively simple image datasets such as MNIST, CIFAR-10 and the Toronto Face Dataset; it performs poorly on complex image types and high-resolution image datasets. Traditional GANs are also often unstable during training, so although the GAN was an innovation in generative modeling, it has a number of shortcomings. Because the original GAN uses fully connected networks, it has many parameters, and CNN-based GAN models were proposed to solve this problem. The deep convolutional generative adversarial network (DCGAN) replaces the fully connected layers of the original GAN with a CNN and modifies the network structure, which greatly reduces the amount of computation and gives better results than the original GAN. However, the model still has problems; for example, training still becomes unstable as the training time increases. To widen the range of applications of GANs, the conditional GAN (CGAN) was introduced, which extends the original GAN into a conditional model and constrains the output of the network by adding extra conditions. Although the results shown for CGAN are very basic, they demonstrate the potential of conditional adversarial networks and show great promise for applications.
Although the GANs proposed so far have been successful in theory, researchers have found that many problems still occur when training them, the most significant being the extreme instability of training. One way to stabilize the training process is to start from the loss function. For example, the pioneering WGAN (Wasserstein GAN) achieves stable training by introducing the Wasserstein distance, which is smoother than the KL and JS divergences, so in theory it handles vanishing gradients better and stabilizes training more effectively. However, the way WGAN chooses the discriminator's gradient values is not reasonable, so WGAN-GP uses a gradient penalty to keep the gradient updates smooth, that is, to satisfy the 1-Lipschitz condition, which addresses vanishing and exploding gradients during training. RSGAN can produce more stable, higher-quality data than other GAN variants. Standard RSGAN with a gradient penalty generates data of better quality than WGAN-GP and is reported to reduce by 400% the time required to match the best network architecture among contemporary GAN variants. Compared with GANs and LSGAN, RSGAN can generate reasonably high-resolution images from a very small sample, and the image quality is significantly better than that of WGAN-GP.
By improving both the network structure and the loss function of the original GAN, training stability and training results have improved greatly. At the same time, applications of GANs continue to be developed, and style transfer is one of the most active. The most pioneering work applied CGAN to image style transfer, offering a conditional adversarial network as a general solution to the image-to-image translation problem. The model learns features of the input image and transfers them to the output image; for example, it can turn a night scene of a place into a daytime scene, or convert a label map into a realistic image. However, the model requires paired datasets: a daytime picture of a place must be matched with a night picture of the same location, so the dataset is often difficult to acquire. CycleGAN was proposed to solve this problem; by learning with a cycle consistency loss, it achieves good style transfer without paired data.
Today most GANs use CNNs, but a CNN captures only local spatial information, and its receptive field is not large enough to cover the whole image. This makes it difficult to learn many kinds of datasets and can displace key parts of an image, for example misplacing facial features when generating faces. The attention-based GAN SAGAN was therefore proposed. It combines Non-local Neural Networks with GANs so that the network has a larger receptive field without losing computational efficiency, and its greatest advantage is that it handles geometric structure well. When the Non-local block captures the correlations of a position, it computes the global correlation between that position and every position in the picture, which increases the amount of computation. To solve this problem while maintaining accuracy, GCNet establishes a unified three-step general framework for global context modeling; it is a lightweight model that captures global information effectively.
Common optimizers can currently be divided roughly into two types: adaptive methods (such as Adam) and accelerated schemes (such as stochastic gradient descent, SGD, with momentum). For many models (such as convolutional neural networks), adaptive methods generally converge faster than SGD but generalize worse. For complex settings such as generative adversarial networks (GANs), adaptive methods are usually the default because of their stability. The AdaBelief optimizer adjusts the step size according to its "belief" in the current gradient direction, treating the exponential moving average (EMA) of the noisy gradients as the prediction of the next gradient. If the observed gradient deviates significantly from the prediction, the current observation is not trusted and a smaller step is taken; if the observed gradient is close to the prediction, the current observation is trusted and a larger step is taken. Experiments show that the optimizer combines three advantages: the fast convergence of adaptive methods, good generalization, and training stability.
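The belief-based step-size rule described above can be illustrated with a minimal sketch of a single AdaBelief update for one scalar parameter. The function name `adabelief_step`, the default hyperparameters and the scalar setting are assumptions made for illustration; the text does not specify them, and real training would rely on an optimizer implementation from a deep learning library.

```python
import math


def adabelief_step(theta, grad, m, s, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-16):
    """One AdaBelief update for a single scalar parameter (illustrative sketch).

    m: EMA of past gradients, used as the prediction of the next gradient.
    s: EMA of the squared deviation between observed and predicted gradient.
    t: 1-based step counter, used for bias correction.
    All default hyperparameter values are assumptions, not taken from the text.
    """
    m = beta1 * m + (1.0 - beta1) * grad
    # A large deviation (grad - m) means low "belief" and inflates s,
    # which shrinks the step taken below.
    s = beta2 * s + (1.0 - beta2) * (grad - m) ** 2 + eps
    m_hat = m / (1.0 - beta1 ** t)  # bias-corrected prediction
    s_hat = s / (1.0 - beta2 ** t)  # bias-corrected deviation
    theta = theta - lr * m_hat / (math.sqrt(s_hat) + eps)
    return theta, m, s


if __name__ == "__main__":
    # Toy usage: one step on f(theta) = theta**2 starting from theta = 5.
    theta, m, s = adabelief_step(5.0, 2.0 * 5.0, 0.0, 0.0, t=1, lr=0.01)
    print(theta, m, s)
```

The only difference from an Adam-style update in this sketch is the second-moment term: it tracks `(grad - m) ** 2` rather than `grad ** 2`, which is what ties the step size to how well the gradient matches its prediction.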
Research on CycleGAN keeps growing and its transfer results keep improving, but some problems remain: first, CycleGAN's transfer results are unsatisfactory when geometric shapes are involved; second, the quality of the generated images still leaves room for improvement.
Disclosure of Invention
The invention aims to solve the problem of poor transfer results in the field of style transfer, and provides an attention-based stable generative adversarial network.
The purpose of the invention can be achieved by the following technical scheme:
An application of an attention-based stable generative adversarial network in style transfer, comprising the following steps:
1) selecting a dataset from the official style transfer datasets, in order to transfer between the two domains in the dataset;
2) inputting the sample data of the first domain into a first generator to generate the transferred second-domain image;
3) passing the generated second-domain image into a discriminator to obtain a discrimination result, and calculating the adversarial loss of the first-domain transfer;
4) inputting the generated second-domain image into a second generator to generate a first-domain image, and calculating the cycle consistency loss of the first domain;
5) performing the same steps on the second-domain images, and calculating the adversarial loss and the cycle consistency loss of the second domain;
6) adding all the losses to obtain the total loss;
7) fixing the parameters of the discriminators so that they do not undergo gradient descent, and performing back propagation and parameter updating on the generators;
8) allowing gradient descent on the discriminators again, and performing back propagation and parameter updating on them;
9) optimizing the model through continued iteration to finally obtain the trained model;
10) inputting the test set images into the trained model to obtain test results;
11) evaluating the test results of the trained model with the PSNR, SSIM, LPIPS and FID metrics, and outputting the measured results.
In step 3), the adversarial loss is calculated as follows:
For the mapping function G: X → Y and its discriminator D_Y, the adversarial loss is expressed as:
$$L_{GAN}(G, D_Y, X, Y) = \mathbb{E}_{y \sim p_{data}(y)}[\log D_Y(y)] + \mathbb{E}_{x \sim p_{data}(x)}[\log(1 - D_Y(G(x)))] \quad (1)$$
For the mapping function F: Y → X and its discriminator D_X, the adversarial loss is expressed as:
$$L_{GAN}(F, D_X, Y, X) = \mathbb{E}_{x \sim p_{data}(x)}[\log D_X(x)] + \mathbb{E}_{y \sim p_{data}(y)}[\log(1 - D_X(F(y)))] \quad (2)$$
In formula (1), G denotes the generator from domain X to domain Y, and D_Y denotes the discriminator of domain Y;
In formula (2), F denotes the generator from domain Y to domain X, and D_X denotes the discriminator of domain X;
In formulas (1) and (2), $x \sim p_{data}(x)$ and $y \sim p_{data}(y)$ denote the training examples of domain X and domain Y, respectively;
In step 4), the cycle consistency loss is calculated as follows:
$$L_{cyc}(G, F) = \mathbb{E}_{x \sim p_{data}(x)}[\|F(G(x)) - x\|_1] + \mathbb{E}_{y \sim p_{data}(y)}[\|G(F(y)) - y\|_1] \quad (3)$$
In step 6), the total loss is calculated as follows:
$$L(G, F, D_X, D_Y) = L_{GAN}(G, D_Y, X, Y) + L_{GAN}(F, D_X, Y, X) + \lambda L_{cyc}(G, F) \quad (4)$$
In formula (4), λ represents the correlation between the two domains;
In step 9), the model is optimized with the following objective:
$$G^*, F^* = \arg\min_{G, F} \max_{D_X, D_Y} L(G, F, D_X, D_Y) \quad (5)$$
In formula (5), G* and F* are the generators under the optimal condition;
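As a rough illustration of how the terms in formula (4) combine, the sketch below evaluates the total loss on toy scalar "images". It assumes the adversarial terms take the standard CycleGAN log-likelihood form and that the cycle consistency term is an L1 distance; the function names, the scalar toy data and the default `lam=10.0` are hypothetical choices for illustration, not values given in the text.

```python
import math


def adversarial_loss(gen, disc, source, target):
    """L_GAN(gen, disc, source, target) in the assumed standard CycleGAN form:
    mean log D(y) over real target samples plus mean log(1 - D(gen(x)))
    over translated source samples."""
    real_term = sum(math.log(disc(y)) for y in target) / len(target)
    fake_term = sum(math.log(1.0 - disc(gen(x))) for x in source) / len(source)
    return real_term + fake_term


def cycle_loss(G, F, xs, ys):
    """L_cyc(G, F): mean L1 error of X -> Y -> X plus that of Y -> X -> Y."""
    forward = sum(abs(F(G(x)) - x) for x in xs) / len(xs)
    backward = sum(abs(G(F(y)) - y) for y in ys) / len(ys)
    return forward + backward


def total_loss(G, F, D_X, D_Y, xs, ys, lam=10.0):
    """Formula (4): both adversarial losses plus the weighted cycle loss.
    lam = 10.0 is an assumed weight, not stated in the text."""
    return (adversarial_loss(G, D_Y, xs, ys)
            + adversarial_loss(F, D_X, ys, xs)
            + lam * cycle_loss(G, F, xs, ys))


if __name__ == "__main__":
    # Toy domains: Y is X shifted by 1, so G and F are exact inverses.
    xs, ys = [0.25, 0.5], [1.25, 1.5]
    G = lambda x: x + 1.0
    F = lambda y: y - 1.0
    undecided = lambda v: 0.5  # a discriminator that cannot tell real from fake
    print(total_loss(G, F, undecided, undecided, xs, ys))
```

In actual training the generators minimize this quantity while the discriminators maximize it, which is why steps 7) and 8) update the two groups of networks separately.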
The invention has the following beneficial effects:
First, the invention provides an attention-based stable generative adversarial network for the style transfer problem, and the network effectively improves image transfer results.
Second, the invention uses an attention mechanism to perform a second round of feature extraction on the input features.
Third, the invention uses a sub-pixel convolution mechanism to improve upsampling in the model, thereby improving image transfer results.
Fourth, a spectral normalization mechanism is used to stabilize the model training process.
Fifth, a weight coefficient is added to the attention map extracted by the attention mechanism, and the proportion of attention in the network is adjusted dynamically by modifying this coefficient.
Sixth, the AdaBelief optimizer is used to further optimize the model and improve the generation results.
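Sub-pixel convolution, named in the third effect, typically upsamples by running an ordinary convolution that produces r² times as many channels and then rearranging those channels into spatial positions. The sketch below shows only that rearrangement step (often called pixel shuffle). The channel-first nested-list layout, the function name `pixel_shuffle` and the upscaling factor `r` are assumptions for illustration; the text does not describe the layer in this detail.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Illustrative sketch of the rearrangement used by sub-pixel convolution;
    the preceding convolution that produces the C*r*r channels is omitted.
    """
    channels_in = len(x)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = channels_in // (r * r)
    h, w = len(x[0]), len(x[0][0])
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):          # row offset inside each r x r output block
            for j in range(r):      # column offset inside each r x r output block
                src = x[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out


if __name__ == "__main__":
    # Four 1x1 channels become one 2x2 channel when r = 2.
    print(pixel_shuffle([[[1]], [[2]], [[3]], [[4]]], 2))
```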
Drawings
FIG. 1 is a diagram of an example network model of the present invention.
FIG. 2 is a diagram of an attention mechanism network model used in an embodiment of the present invention.
FIG. 3 is a diagram of an example generator network model of the present invention.
FIG. 4 is a diagram of an example discriminator network model of the present invention.
FIG. 5 compares the results of the example of the present invention on the apple2orange dataset with those of CycleGAN and AGGAN.
FIG. 6 compares the results of the example of the present invention on the horse2zebra dataset with those of CycleGAN and AGGAN.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
Examples
In an application of an attention-based stable generative adversarial network to style transfer, images of two domains are taken from a dataset, and a generator and a discriminator, each augmented with an attention mechanism, extract features from the input images and are trained on them to achieve transfer between the two domains. The network model is shown in FIG. 1, the attention mechanism in FIG. 2, the generator in FIG. 3, and the discriminator in FIG. 4. The specific steps are as follows:
1. Selection of datasets
We selected horse2zebra and apple2orange from the official style transfer datasets. The horse2zebra training set contains 1067 horse images and 1334 zebra images, and its test set contains 120 horse images and 140 zebra images. The apple2orange training set contains 995 apple images and 1019 orange images, and its test set contains 266 apple images and 248 orange images.
2. The new generator extracts input image features and performs the transfer
The new generator uses 9 residual blocks. To extract further image features in the generator, an attention mechanism re-extracts the features; the generator network model is shown in FIG. 3.
3. The new discriminator judges the generator's transfer results
To improve discrimination accuracy, an attention mechanism is also added to the discriminator to extract further image features on top of the original feature map; the discriminator network model is shown in FIG. 4.
4. Training the model
The total loss was calculated and back-propagated, and the trained model was obtained after 200 iterations in the experimental environment of Table 1.
Table 1 Experimental environment
Name | Configuration
Operating system | Ubuntu 18.04
GPU | NVIDIA GeForce RTX 2080 Ti
CPU | Intel Xeon Processor (Skylake, IBRS), 2
RAM | 16 GB
GPU libraries | CUDA 10.2, cuDNN 7.6
Deep learning framework | PyTorch
5. Testing the model
The test set images of the dataset are input into the generators of the trained model to obtain test results. The trained model is then evaluated with the PSNR, SSIM, LPIPS and FID metrics to measure the transfer results.
6. Description of evaluation metrics
We evaluated our test results using the PSNR, SSIM, LPIPS and FID metrics.
PSNR, the peak signal-to-noise ratio, is defined in terms of the mean squared error (MSE). Given an image G and a noisy image H of the same size m × n, the MSE and PSNR are defined as follows:
$$MSE = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} [G(i,j) - H(i,j)]^2 \quad (6)$$
$$PSNR = 10 \log_{10}\left(\frac{MAX^2}{MSE}\right) \quad (7)$$
In formulas (6) and (7), MAX is the maximum possible pixel value of the picture, MSE is the mean squared error between the current image G and the reference image H, and m and n are the height and width of the image, respectively.
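The PSNR definition described above (mean squared error over an m × n image, with MAX the peak pixel value) can be sketched in a few lines. This is a minimal single-channel version on nested lists; the function names and the default `max_value=255.0` (8-bit images) are assumptions for illustration, and the text does not say which implementation was used in the experiments.

```python
import math


def mse(g, h):
    """Mean squared error between two equally sized single-channel images,
    given as nested lists of pixel values (m rows, n columns)."""
    m, n = len(g), len(g[0])
    total = sum((g[i][j] - h[i][j]) ** 2 for i in range(m) for j in range(n))
    return total / (m * n)


def psnr(g, h, max_value=255.0):
    """Peak signal-to-noise ratio in decibels.
    max_value = 255 assumes 8-bit pixels; identical images give infinity."""
    err = mse(g, h)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


if __name__ == "__main__":
    generated = [[10, 20], [30, 40]]
    reference = [[10, 30], [30, 40]]
    print(psnr(generated, reference))
```

A higher PSNR means a smaller pixel-wise error relative to the reference image.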
SSIM measures the similarity of two images. It compares samples x and y in terms of luminance, contrast and structure. The more similar the two images, the higher the SSIM value.
LPIPS is a newer metric that simulates human perception. Its calculation relies on a VGG network, which extracts deep features of the pictures across different structures and tasks.
FID estimates the distributions of the real images and the generated images in a deep neural network and calculates the distance between them. Its estimates agree closely with human perception, and the smaller the FID value, the closer the two sets of images are.
7. Description of comparison models
ASGAN is compared with CycleGAN and AGGAN.
CycleGAN is used for image style transfer between unpaired data. It uses a cycle consistency loss to learn not only the mapping from the source domain to the target domain but also the reverse mapping from the target domain to the source domain.
AGGAN addresses the difficulty that traditional unsupervised image style transfer techniques have in focusing on one or more objects without changing the background of the picture. It does so by introducing an unsupervised attention mechanism that is trained adversarially together with the generator and the discriminator.
8. Computational cost analysis
Assume that the input to the ASGAN generator and discriminator has dimension H × W × C, that the input feature map of a given calculation unit has dimension h × w × c, and that its processed output feature map has dimension h′ × w′ × c′. Table 2 lists the number of executions of each simple operation for all the basic calculation units involved.
Table 2 Number of simple operations executed by each basic calculation unit in ASGAN
Basic calculation unit | S× | S÷ | S+ | S- | S>,<,≥,≤,==,≠ | S=
Convolution | ck²h′w′c′ | 0 | ck²h′w′c′+h′w′c′ | 0 | 0 | h′w′c′
ReLU | 0 | 0 | 0 | 0 | hwc | h′w′c′
Tanh | 6hwc | hwc | 13hwc | hwc | 0 | h′w′c′
Softmax | hwc | hwc | 4hwc-1 | hwc | hwc-1 | h′w′c′
Adding | 0 | 0 | hwc | 0 | 0 | h′w′c′
Mapping | 0 | 0 | 0 | 0 | 0 | h′w′c′
Matmuling | hwc | 0 | h(w-1)c | 0 | 0 | h′w′c′
Arranging | 0 | 0 | 0 | 0 | 0 | h′w′c′
In Table 2, S×, S÷, S+, S- and S= denote the number of executions of the simple operations multiplication, division, addition, subtraction and assignment, respectively, and S>,<,≥,≤,==,≠ denotes the number of executions of comparison operations (greater than, less than, greater than or equal to, less than or equal to, equal to, and not equal to). The Convolution unit represents convolution or deconvolution, and k is the convolution kernel size. The ReLU, Tanh and Softmax units are activation functions, each of which can be expressed as a combination of several simple operations. The Adding, Mapping, Matmuling and Arranging units represent matrix addition, matrix mapping, matrix multiplication, and the sequential arrangement of row vectors into a three-dimensional matrix, respectively. The Tanh and Softmax activation functions use an approximation of e^x that takes one multiplication, three additions and two assignments, with an average error in the range 0.02 to 0.04.
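To make the per-unit counts concrete, the sketch below evaluates the Convolution and ReLU rows of Table 2 for given feature-map dimensions. The function names and the dictionary keys are hypothetical; only the counting expressions are taken from the table.

```python
def convolution_op_counts(c, k, h_out, w_out, c_out):
    """Simple-operation counts for one convolution unit, following the
    Convolution row of Table 2: c input channels, k x k kernel, and an
    output feature map of size h_out x w_out x c_out."""
    outputs = h_out * w_out * c_out
    mults = c * k * k * outputs
    return {
        "mul": mults,
        "div": 0,
        "add": mults + outputs,
        "sub": 0,
        "cmp": 0,
        "assign": outputs,
    }


def relu_op_counts(h, w, c, h_out, w_out, c_out):
    """ReLU row of Table 2: one comparison per input element and one
    assignment per output element."""
    return {"mul": 0, "div": 0, "add": 0, "sub": 0,
            "cmp": h * w * c, "assign": h_out * w_out * c_out}


if __name__ == "__main__":
    # Example: 3 input channels, 3x3 kernel, 2x2x4 output feature map.
    print(convolution_op_counts(c=3, k=3, h_out=2, w_out=2, c_out=4))
```

Summing such per-unit counts over every unit of the generator and discriminator is the accumulation step that yields the network-level totals.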
According to FIGS. 3 and 4, once the dimension of the model input is fixed, the input and output feature map dimensions of every basic calculation unit are also fixed. Table 3 is then obtained by using Table 2 and these fixed dimensions to accumulate the actual number of simple operations executed by every basic calculation unit in ASGAN.
Table 3 Total number of executions of each simple operation in ASGAN
[Table 3: image not available]
The computational cost of CycleGAN is obtained with the same counting method and is shown in Table 4.
Table 4 Total number of executions of each simple operation in CycleGAN
Simple operation | Number of executions
S× | 13068HWC+1794376HW
S÷ | 2HWC
S+ | 13084HWC+1795656.03125HW
S- | 2HWC
S>,<,≥,≤,==,≠ | 776HW
S= | 20HWC+2128.03125HW
Because multiplication and division have similar instruction cycles, their computational costs are similar, and the same holds among addition, subtraction, comparison and assignment. From Tables 3 and 4:
The computational cost of ASGAN multiplication and division is:
Cost_ASGAN = 13070HWC + 1499784.25HW + 131072 (8)
The computational cost of ASGAN addition, subtraction, comparison and assignment is:
[Formula (9): image not available]
The computational cost of CycleGAN multiplication and division is:
Cost_CycleGAN = 13070HWC + 1794376HW (10)
The computational cost of CycleGAN addition, subtraction, comparison and assignment is:
[Formula (11): image not available]
From formulas (8) and (10), the reduction in ASGAN's computational cost relative to CycleGAN for multiplication and division is:
[Formula (12): image not available]
From formulas (9) and (11), the reduction in ASGAN's computational cost relative to CycleGAN for addition, subtraction, comparison and assignment is:
[Formula (13): image not available]
where H ≥ 1, W ≥ 1 and C ≥ 1. From formulas (12) and (13), compared with CycleGAN, ASGAN reduces the computational cost of multiplication and division by between 0 and 16.30%, and that of addition, subtraction, comparison and assignment by between 0 and 16.27%. ASGAN therefore runs in less time than CycleGAN.
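The stated 16.30% bound can be checked numerically from formulas (8) and (10). The sketch below assumes the reduction is defined as (Cost_CycleGAN − Cost_ASGAN) / Cost_CycleGAN, which is an assumption about the form of formula (12); it also sums the addition, subtraction, comparison and assignment rows of Table 4 for CycleGAN. All function names are hypothetical.

```python
def asgan_muldiv_cost(H, W, C):
    """Formula (8): multiplications plus divisions in ASGAN."""
    return 13070 * H * W * C + 1499784.25 * H * W + 131072


def cyclegan_muldiv_cost(H, W, C):
    """Formula (10): multiplications plus divisions in CycleGAN."""
    return 13070 * H * W * C + 1794376 * H * W


def muldiv_reduction(H, W, C):
    """Relative saving of ASGAN over CycleGAN (assumed form of formula (12))."""
    base = cyclegan_muldiv_cost(H, W, C)
    return (base - asgan_muldiv_cost(H, W, C)) / base


def cyclegan_other_cost(H, W, C):
    """Sum of the S+, S-, comparison and S= rows of Table 4."""
    return (13084 + 2 + 20) * H * W * C + (1795656.03125 + 776 + 2128.03125) * H * W


if __name__ == "__main__":
    # The saving is largest for C = 1 and large H, W, and shrinks as C grows.
    print(muldiv_reduction(10_000, 10_000, 1))
    print(muldiv_reduction(256, 256, 3))
    print(muldiv_reduction(256, 256, 1024))
```

Under this assumed definition, the saving approaches 294591.75 / 1807446 (about 16.30%) for a single input channel and a large image, and tends to 0 as C grows, which is consistent with the range quoted in the text.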
9. Analysis of the total number of parameters
The number of parameters required by a convolution basic calculation unit is k²·c·c′ + c′, where k is the size of the convolution kernel, c is the number of channels of the convolution's input feature map, and c′ is the number of channels of its output feature map [43]. The total numbers of parameters of ASGAN and CycleGAN are therefore:
P_ASGAN = 14592C + 27773058 (14)
P_CycleGAN = 14592C + 28241920 (15)
From formulas (14) and (15):
[Formula (16): image not available]
where C ≥ 1. From formula (16), ASGAN has between 0 and 1.66% fewer parameters than CycleGAN, and therefore occupies less computer memory than CycleGAN.
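The parameter counts can be checked the same way. The sketch below implements the per-convolution count k²·c·c′ + c′ and the totals in formulas (14) and (15), and assumes the relative saving is (P_CycleGAN − P_ASGAN) / P_CycleGAN, which is an assumption about the form of formula (16). The function names are hypothetical.

```python
def conv_params(k, c_in, c_out):
    """Parameters of one convolution unit: k*k*c_in*c_out weights plus c_out biases."""
    return k * k * c_in * c_out + c_out


def asgan_params(C):
    """Formula (14): total ASGAN parameters for C input channels."""
    return 14592 * C + 27773058


def cyclegan_params(C):
    """Formula (15): total CycleGAN parameters for C input channels."""
    return 14592 * C + 28241920


def param_reduction(C):
    """Relative parameter saving of ASGAN (assumed form of formula (16))."""
    base = cyclegan_params(C)
    return (base - asgan_params(C)) / base


if __name__ == "__main__":
    print(conv_params(3, 64, 128))
    print(param_reduction(1))   # largest saving, at C = 1
    print(param_reduction(3))   # RGB input
```

The absolute difference between the two totals is a constant 468862 parameters, so the relative saving is largest at C = 1 and decreases slowly as C increases.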
10. Qualitative comparison
The results of the models on the apple2orange dataset are shown in FIG. 5. For apple-to-orange conversion, in the first row both CycleGAN and AGGAN show blurring artifacts to different degrees, while ASGAN shows none. In the second row, CycleGAN fails to fully convert the lower right of the apple, and AGGAN leaves a watermark at the same position. For orange-to-apple conversion, in the third row both CycleGAN and AGGAN destroy the background of the input image, while ASGAN transfers the orange to an apple well. In the fourth row, both CycleGAN and AGGAN produce artifacts that degrade the transfer result, while ASGAN transfers between the two domains better.
The results of the models on horse2zebra are shown in FIG. 6. For horse-to-zebra conversion, in the first row CycleGAN produces artifacts on the horse's tail and head and changes the background, and AGGAN produces rough contours, whereas ASGAN keeps the background color unchanged and shows neither artifacts nor blurred contours. In the second row, CycleGAN produces obvious artifacts near the horse's tail; AGGAN produces artifacts and changes the color of the horse; ASGAN produces no artifacts and leaves the color of the horse unchanged. For zebra-to-horse conversion, in the third row the head of the horse becomes transparent in the CycleGAN and AGGAN results, while ASGAN retains the features of the zebra's head and transfers the features of the rest of the zebra. In the fourth row, CycleGAN changes the color of the horse's ears and AGGAN changes the color of the horse, while ASGAN transfers better.
11. Description of quantitative comparison
The model proposed by the invention is compared with two models, CycleGAN and AGGAN. Results are generated on the test sets of the apple2orange and horse2zebra datasets and evaluated with the PSNR, SSIM, LPIPS and FID indexes, each averaged over the test set. The results on the apple2orange dataset are shown in Table 2, and the results on the horse2zebra dataset are shown in Table 3. Comparison of these indexes shows that the proposed model improves the image migration effect.
TABLE 2 Average PSNR, SSIM, LPIPS and FID of CycleGAN, AGGAN and ASGAN on the apple2orange dataset
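As an illustration of how one of the listed indexes can be computed and averaged, the following is a minimal sketch of PSNR. It assumes 8-bit images supplied as flat lists of pixel values with a peak value of 255; the function names `psnr` and `mean_psnr` are hypothetical and are not taken from the invention. SSIM, LPIPS and FID need considerably more machinery and are not shown.

```python
import math

def psnr(reference, generated, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are flat sequences of pixel values (an assumption of this sketch).
    """
    if len(reference) != len(generated) or not reference:
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error over all pixels.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

def mean_psnr(pairs, peak=255.0):
    """Average PSNR over an iterable of (reference, generated) image pairs."""
    scores = [psnr(r, g, peak) for r, g in pairs]
    return sum(scores) / len(scores)
```

A higher PSNR means a smaller pixel-wise difference between the two images being compared.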
Figure BDA0003546584810000065
Figure BDA0003546584810000071
TABLE 3 Average PSNR, SSIM, LPIPS and FID of CycleGAN, AGGAN and ASGAN on the horse2zebra dataset
Figure BDA0003546584810000072
The above embodiments describe in detail the application of the attention-based stable generative adversarial network of the present invention in style migration; they are intended only to help in understanding the proposed method and its core idea.

Claims (6)

1. An application of an attention-based stable generative adversarial network in style migration, characterized by comprising the following steps:
(1) selecting a dataset for migration from the official style migration datasets;
(2) forward propagation: inputting the sample datasets of the two domains into a new generator, and applying convolution, an attention mechanism, residual blocks and sub-pixel convolution to obtain a migrated generated image;
(3) back propagation: first, fixing the parameters of a new discriminator so that no gradient descent is performed on them, and back-propagating through the new generator and updating its parameters; then enabling gradient descent for the new discriminator, and performing back propagation and parameter updating on it;
(4) testing the trained model with the PSNR, SSIM, LPIPS and FID evaluation indexes together with parameter count, video memory consumption and training time, to measure the migration result.
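The alternating update of step (3) can be sketched on a toy problem. The code below is an illustrative assumption only: it uses a scalar generator G(x) = a*x, a linear discriminator D(y) = w*y + b, a least-squares adversarial loss and plain gradient descent, none of which is the ASGAN network or its optimizer. All function names are hypothetical. It shows only the order of operations: the generator is updated while the discriminator parameters are held fixed, and the discriminator is then updated without changing the generator.

```python
def d_out(w, b, y):
    """Toy linear discriminator D(y) = w*y + b."""
    return w * y + b

def generator_step(a, w, b, x, lr=0.01):
    """Update the generator G(x) = a*x with the discriminator (w, b) held fixed.

    Minimizes the least-squares adversarial loss (D(G(x)) - 1)^2.
    """
    fake = a * x
    grad_a = 2.0 * (d_out(w, b, fake) - 1.0) * w * x  # chain rule through D
    return a - lr * grad_a  # w and b are not modified here

def discriminator_step(a, w, b, x, real, lr=0.01):
    """Update the discriminator; the fake sample is treated as a constant.

    Minimizes (D(real) - 1)^2 + D(fake)^2.
    """
    fake = a * x  # no gradient flows back to a
    err_real = d_out(w, b, real) - 1.0
    err_fake = d_out(w, b, fake)
    grad_w = 2.0 * err_real * real + 2.0 * err_fake * fake
    grad_b = 2.0 * err_real + 2.0 * err_fake
    return w - lr * grad_w, b - lr * grad_b

def train_iteration(a, w, b, x, real, lr=0.01):
    """One iteration: generator first (D frozen), then discriminator."""
    a = generator_step(a, w, b, x, lr)
    w, b = discriminator_step(a, w, b, x, real, lr)
    return a, w, b
```

In a deep-learning framework the same ordering is usually obtained by disabling gradient tracking on the discriminator parameters during the generator update and re-enabling it afterwards.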
2. The application of the attention-based stable generative adversarial network in style migration according to claim 1,
characterized in that the attention mechanism of the new generator in step (2) is the GC Block attention mechanism, which is added to the improved CycleGAN generator and the improved discriminator, so that the receptive field is enlarged, the improved CycleGAN generator can capture more spatial information, and the model achieves a better effect when processing geometric images.
3. The application of the attention-based stable generative adversarial network in style migration according to claim 1,
characterized in that the up-sampling mechanism of the new generator in step (2) replaces the transposed convolution in the CycleGAN generator with sub-pixel convolution, so that the generation result of the model is better.
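For illustration, sub-pixel convolution is commonly realized as an ordinary convolution that outputs C·r² channels followed by a rearrangement of those channels into an image that is r times larger in each spatial dimension. The sketch below shows only that rearrangement on nested lists; the convolution itself is omitted. The function name `pixel_shuffle` and the channel ordering (the one used by common deep-learning libraries) are assumptions and are not specified by the invention.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into a (C, H*r, W*r) nested list.

    Mapping: out[c][h*r + i][w*r + j] = x[c*r*r + i*r + j][h][w]
    """
    channels_in = len(x)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = channels_in // (r * r)
    height, width = len(x[0]), len(x[0][0])
    out = [[[0] * (width * r) for _ in range(height * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = x[c * r * r + i * r + j]  # one low-resolution channel
                for h in range(height):
                    for w in range(width):
                        out[c][h * r + i][w * r + j] = src[h][w]
    return out
```

With r = 2, four 1x1 input channels become a single 2x2 output channel, which is how the spatial resolution is doubled without a transposed convolution.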
4. The application of the attention-based stable generative adversarial network in style migration according to claim 1,
characterized in that the normalization mechanism of the new generator in step (2) adds spectral normalization to the generator and the discriminator of CycleGAN to stabilize the training process of the model.
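For illustration, spectral normalization divides a weight matrix by an estimate of its largest singular value, usually obtained by power iteration. The sketch below is a minimal version under stated assumptions: the weight is a 2-D nested list (a convolution kernel would first be reshaped to two dimensions), the start vector is deterministic for reproducibility where practical implementations typically use a random one, and the weight is assumed to be non-zero. The function names are hypothetical.

```python
import math

def _matvec(m, v):
    """Matrix-vector product for a nested-list matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def _transpose(m):
    return [list(col) for col in zip(*m)]

def _normalize(v, eps=1e-12):
    norm = math.sqrt(sum(a * a for a in v))
    return [a / (norm + eps) for a in v]

def spectral_norm(weight, n_iter=50):
    """Estimate the largest singular value of a 2-D weight by power iteration."""
    wt = _transpose(weight)
    u = _normalize([1.0] * len(weight))  # deterministic start vector (assumption)
    for _ in range(n_iter):
        v = _normalize(_matvec(wt, u))
        u = _normalize(_matvec(weight, v))
    # sigma is approximately u^T W v once u and v have converged.
    return sum(a * b for a, b in zip(u, _matvec(weight, v)))

def spectrally_normalize(weight, n_iter=50):
    """Return weight / sigma so that its largest singular value is about 1."""
    sigma = spectral_norm(weight, n_iter)
    return [[a / sigma for a in row] for row in weight]
```

Bounding the largest singular value of each layer in this way limits how sharply the layer output can change with its input, which is the usual reason this normalization is associated with more stable adversarial training.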
5. The application of the attention-based stable generative adversarial network in style migration according to claim 1,
characterized in that a weight coefficient is added to the attention map extracted by the attention mechanism of the new generator in step (2), and the proportion of attention in the network is adjusted dynamically by modifying the weight coefficient.
6. The application of the attention-based stable generative adversarial network in style migration according to claim 1,
characterized in that the optimizer in step (3) is the AdaBelief optimizer, which further optimizes the model and improves the generation effect.
CN202210250457.2A 2022-03-15 2022-03-15 Application of stable generation countermeasure network in style migration based on attention mechanism Pending CN114612589A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210250457.2A CN114612589A (en) 2022-03-15 2022-03-15 Application of stable generation countermeasure network in style migration based on attention mechanism

Publications (1)

Publication Number Publication Date
CN114612589A true CN114612589A (en) 2022-06-10

Family

ID=81863751

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210250457.2A Pending CN114612589A (en) 2022-03-15 2022-03-15 Application of stable generation countermeasure network in style migration based on attention mechanism

Country Status (1)

Country Link
CN (1) CN114612589A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116958468A (en) * 2023-07-05 2023-10-27 中国科学院地理科学与资源研究所 Mountain snow environment simulation method and system based on SCycleGAN

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination