CN112802048A - Method and device for generating layer generation countermeasure network with asymmetric structure - Google Patents

Method and device for generating layer generation countermeasure network with asymmetric structure

Info

Publication number
CN112802048A
CN112802048A (application CN202110120086.1A)
Authority
CN
China
Prior art keywords
image
foreground
layer
network
mask
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110120086.1A
Other languages
Chinese (zh)
Other versions
CN112802048B (en)
Inventor
季向阳 (Ji Xiangyang)
杨宇 (Yang Yu)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN202110120086.1A
Publication of CN112802048A
Application granted
Publication of CN112802048B
Active legal status
Anticipated expiration legal status


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/194 Segmentation; Edge detection involving foreground-background segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F 18/2132 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on discrimination criteria, e.g. discriminant analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/088 Non-supervised learning, e.g. competitive learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method and a device for generating a layered generative adversarial network with an asymmetric structure. A layered generative adversarial network with an asymmetric structure is trained, a large number of images carrying foreground-mask pseudo-labels are generated with this network, and the images are used as a training data set for a segmentation network, thereby realizing unsupervised object segmentation. The method solves the problem that training of a layered adversarial generation network easily falls into a degenerate solution, so that the network can effectively generate layers. At the same time, the foreground mask serves as a segmentation pseudo-label of the image and is used to train a segmenter; no additional manual labels are needed in the whole process, which greatly reduces the time and cost of preparing data.

Description

Method and device for generating a layered generative adversarial network with an asymmetric structure
Technical Field
The invention relates to the technical fields of pattern recognition, computer vision and machine learning, and in particular to a method and a device for generating a layered generative adversarial network with an asymmetric structure.
Background
The generative adversarial network (GAN) is a basic and important technology at the intersection of deep learning, machine learning and computer vision. A GAN generally includes a generator and a discriminator. The generator converts random vectors drawn from a standard distribution (such as a normal distribution) into specific data samples (such as images), while the discriminator distinguishes the generated data samples from real data samples. The generator and the discriminator are updated alternately and iteratively in an adversarial manner: the discriminator gradually improves its discrimination ability, and correspondingly the data samples that the generator produces to fool the discriminator become more and more realistic. The final generator can generate pictures so realistic that the discriminator is essentially unable to distinguish them from real data.
On the one hand, a GAN can generate high-quality synthetic images, which makes it an important tool for numerous applications such as data generation, data conversion and image editing. On the other hand, training a GAN requires no manual supervision, which makes it an important method for unsupervised learning, weakly supervised learning and semi-supervised learning.
A layered GAN differs from an ordinary GAN mainly in the generator. The generator of an ordinary GAN maps a random vector directly to a generated image, whereas the generator of a layered GAN maps the random vector to image layers, such as a foreground object image, a foreground object mask and a background image, and then composites these layers to obtain the final generated image. A naive layered GAN has an important defect: it very easily falls into a degenerate solution during training, i.e. all layers collapse onto a single layer, so that one layer directly generates the whole picture and the corresponding foreground object mask becomes all zeros or all ones.
Disclosure of Invention
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.
Therefore, an object of the present invention is to provide a method for generating a layered GAN with an asymmetric structure, which solves the problem that training of a layered GAN easily falls into a degenerate solution, so that the layered GAN can effectively generate layers.
Another object of the present invention is to provide an apparatus for generating a layered GAN with an asymmetric structure.
In order to achieve the above object, an embodiment of the present invention provides a method for generating a layered GAN with an asymmetric structure, including:
inputting a continuous random variable into the background generator of an asymmetric layer generator, and outputting a background image;
inputting the continuous random variable, a first discrete variable and a second discrete variable into the foreground generator of the asymmetric layer generator, and outputting a foreground image and a foreground mask;
perturbing the foreground image and the foreground mask through a layer perturber, and compositing the background image, the perturbed foreground image and the perturbed foreground mask to obtain a generated image;
inputting the generated image and a real image into a discriminator at the same time, obtaining the discriminator loss from the adversarial learning loss function, and training the discriminator;
inputting the generated image into the discriminator and computing the adversarial learning loss, pseudo-classifying the generated image and the generated foreground mask with auxiliary classifiers, computing cross entropies from the first discrete variable, the second discrete variable and the outputs of the auxiliary classifiers to obtain the generator loss, and training the generator;
repeating the alternate training of the discriminator and the generator to obtain the trained layered GAN.
In the method for generating a layered GAN with an asymmetric structure provided by the embodiment of the invention, perturbation introduced when compositing the layers prevents the all-ones foreground-mask degenerate solution, and the asymmetric structure prevents the all-zeros foreground-mask degenerate solution. A well-trained layered GAN can generate a large number of realistic images, and these images carry a layered representation that includes the foreground object mask. This further provides a solution for unsupervised object segmentation: using the generated data, the foreground object masks are regarded as segmentation labels and a segmentation network is trained, yielding an effective segmenter.
In addition, the method for generating a layered GAN with an asymmetric structure according to the above embodiment of the present invention may further have the following additional technical features:
further, in an embodiment of the present invention, the method further includes:
and generating a plurality of images with foreground masks by using the trained image layer confrontation network, training a segmentation network by using the images with the plurality of foreground masks, and segmenting the object by using the trained segmentation network.
Further, in an embodiment of the present invention, the first discrete variable and the second discrete variable are related through a hierarchical relationship, and the first discrete variable can be derived from the second discrete variable; the first discrete variable represents the category of shape and pose characteristics of the foreground object and is reflected in the foreground mask, while the second discrete variable represents the specific appearance style of the foreground object and is reflected in the foreground image.
Further, in an embodiment of the present invention, applying a perturbation to the foreground image and the foreground mask through the layer perturber includes:
applying a perturbation to the position, size and angle of the foreground layer:
$\tilde{x}(u, v) = x(u', v')$
$\begin{pmatrix} u' \\ v' \end{pmatrix} = s \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} u - \Delta u \\ v - \Delta v \end{pmatrix}$
where x is the foreground layer (a foreground image or a foreground mask), u and v are pixel coordinates, α is the perturbation angle, Δu and Δv are the perturbation offsets, and s is the perturbation scale factor.
Further, in an embodiment of the present invention, compositing the background image, the perturbed foreground image and the perturbed foreground mask to obtain a generated image includes:
$x_g = \hat{m} \odot \hat{x}_f + (1 - \hat{m}) \odot x_b$
where $x_g$ is the generated image, $x_b$ is the background image, $x_f$ is the foreground image, $\hat{m}$ is the perturbed foreground mask and $\hat{x}_f$ is the perturbed foreground image.
Further, in an embodiment of the present invention, the loss function of the layered GAN is:
$\min_{G, Q_c, Q_p} \max_{D} \; V(D, G) + \lambda_{MI,c} V_{MI,c}(G, Q_c) + \lambda_{MI,p} V_{MI,p}(G, Q_p) + \lambda_{bin} V_{bin}(G)$
where V(D, G) is the adversarial learning loss term, λ_MI,c is the weight of the second mutual-information loss term V_MI,c(G, Q_c), λ_MI,p is the weight of the first mutual-information loss term V_MI,p(G, Q_p), and λ_bin is the weight of the binarization loss term V_bin(G).
In order to achieve the above object, an embodiment of another aspect of the present invention provides an apparatus for generating a layered GAN with an asymmetric structure, including:
a background generation module, configured to input a continuous random variable into the background generator of an asymmetric layer generator and output a background image;
a foreground generation module, configured to input the continuous random variable, a first discrete variable and a second discrete variable into the foreground generator of the asymmetric layer generator and output a foreground image and a foreground mask;
a perturbation module, configured to apply a perturbation to the foreground image and the foreground mask through a layer perturber, and to composite the background image, the perturbed foreground image and the perturbed foreground mask to obtain a generated image;
a processing module, configured to judge the authenticity of the generated image with a discriminator to obtain an adversarial learning loss; to pseudo-classify the generated image with an auxiliary image classifier and compute a cross entropy from the second discrete variable and the class output by the auxiliary image classifier; to pseudo-classify the foreground mask with an auxiliary mask classifier and compute a cross entropy from the first discrete variable and the class output by the auxiliary mask classifier; and to sum the three terms with weights to obtain the generator loss;
a training module, configured to alternately train the discriminator and the generator according to the loss function of the layered GAN to obtain the trained layered GAN.
With the apparatus for generating a layered GAN with an asymmetric structure provided by the embodiment of the invention, perturbation introduced when compositing the layers prevents the all-ones foreground-mask degenerate solution, and the asymmetric structure prevents the all-zeros foreground-mask degenerate solution. A well-trained layered GAN can generate a large number of realistic images that carry a layered representation including the foreground object mask. This further provides a solution for unsupervised object segmentation: using the generated data, the foreground object masks are treated as segmentation labels and a segmentation network is trained, yielding an effective segmenter.
In addition, the apparatus for generating a layered GAN with an asymmetric structure according to the above embodiment of the present invention may further have the following additional technical features:
Further, in an embodiment of the present invention, the apparatus further includes:
a segmentation module, configured to generate a plurality of images with foreground masks using the trained layered GAN, train a segmentation network with these images, and segment objects with the trained segmentation network.
Further, in an embodiment of the present invention, the first discrete variable and the second discrete variable are related through a hierarchical relationship, and the first discrete variable can be derived from the second discrete variable; the first discrete variable represents the category of shape and pose characteristics of the foreground object and is reflected in the foreground mask, while the second discrete variable represents the specific appearance style of the foreground object and is reflected in the foreground image.
Further, in an embodiment of the present invention, compositing the background image, the perturbed foreground image and the perturbed foreground mask to obtain a generated image includes:
$x_g = \hat{m} \odot \hat{x}_f + (1 - \hat{m}) \odot x_b$
where $x_g$ is the generated image, $x_b$ is the background image, $x_f$ is the foreground image, $\hat{m}$ is the perturbed foreground mask and $\hat{x}_f$ is the perturbed foreground image.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a flowchart of a method for generating a layered generative adversarial network with an asymmetric structure according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method for generating a layered generative adversarial network with an asymmetric structure according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of segmentation network training according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the degenerate-solution collapse avoided by the asymmetric layered generative adversarial network according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an asymmetric layered generative adversarial network according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of the segmentation network according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an apparatus for generating a layered generative adversarial network with an asymmetric structure according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.
The following describes a method and an apparatus for generating a layered GAN with an asymmetric structure according to embodiments of the present invention with reference to the drawings.
First, the method for generating a layered GAN with an asymmetric structure according to an embodiment of the present invention will be described with reference to the drawings.
Fig. 1 is a flowchart of a method for generating a layered GAN with an asymmetric structure according to an embodiment of the present invention.
Fig. 2 is a flowchart of a method for generating a layered GAN with an asymmetric structure according to an embodiment of the present invention.
As shown in fig. 1 and fig. 2, the method for generating a layered GAN with an asymmetric structure includes the following steps:
and step S1, inputting the continuous random variable into a background generator of the asymmetric layer generator, and outputting to obtain a background image.
And step S2, inputting the continuous random variable, the first discrete variable and the second discrete variable into a foreground generator of the asymmetric image layer generator, and outputting to obtain a foreground image and a foreground mask.
The asymmetric layered GAN comprises the asymmetric layer generator, a layer perturber, a discriminator and auxiliary classifiers. The inputs to the asymmetric layer generator are a random variable z, a first discrete variable c_p and a second discrete variable c_c, and its output is a set of layers including a background image, a foreground mask and a foreground image. The layer perturber applies a small perturbation to characteristics of the foreground layer such as its position, size and angle; this module is introduced to prevent degenerate solutions from occurring during training.
Specifically, the asymmetric layer generator includes a foreground generator and a background generator; because their structures differ, their functions are not interchangeable, which is why the generator is called asymmetric. The input to the background generator is a continuous random variable z and the output is a background image. The input to the foreground generator is the continuous random variable z shared with the background generator together with its exclusive discrete random variables c_p and c_c, and the output is the foreground image and the foreground mask. Here c_p and c_c are related through a hierarchical relationship, i.e. c_p can be derived from c_c. The variable c_p represents the category of shape and pose characteristics of the foreground object and is reflected in the foreground mask, while c_c represents the specific appearance style of the foreground object and is reflected in the generated foreground image. The fact that the input to the foreground generator contains exclusive variables can prevent the degenerate solution from occurring.
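To make the asymmetric structure concrete, the following is a minimal sketch of the two generators, assuming PyTorch; the layer widths, latent dimensions, output resolution and the names G_b/G_f are illustrative assumptions rather than values taken from the patent. The background generator receives only the shared continuous code z, while the foreground generator additionally receives the exclusive discrete codes c_p and c_c and produces both a foreground image and a foreground mask.

```python
import torch
import torch.nn as nn

def up_block(c_in, c_out):
    """Upsample by 2 followed by a 3x3 convolution."""
    return nn.Sequential(nn.Upsample(scale_factor=2),
                         nn.Conv2d(c_in, c_out, 3, padding=1),
                         nn.ReLU(inplace=True))

class BackgroundGenerator(nn.Module):
    """x_b = G_b(z): depends only on the shared continuous variable z."""
    def __init__(self, z_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, 256 * 4 * 4), nn.ReLU(inplace=True),
            nn.Unflatten(1, (256, 4, 4)),
            up_block(256, 128), up_block(128, 64), up_block(64, 32),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh())

    def forward(self, z):
        return self.net(z)

class ForegroundGenerator(nn.Module):
    """(x_f, m_f) = G_f(z, c_p, c_c): shares z with G_b but also takes the exclusive codes."""
    def __init__(self, z_dim=128, cp_dim=10, cc_dim=30):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(z_dim + cp_dim + cc_dim, 256 * 4 * 4), nn.ReLU(inplace=True),
            nn.Unflatten(1, (256, 4, 4)),
            up_block(256, 128), up_block(128, 64), up_block(64, 32))
        self.to_image = nn.Sequential(nn.Conv2d(32, 3, 3, padding=1), nn.Tanh())
        self.to_mask = nn.Sequential(nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, z, c_p, c_c):
        h = self.backbone(torch.cat([z, c_p, c_c], dim=1))
        return self.to_image(h), self.to_mask(h)   # foreground image x_f, foreground mask m_f
```

A hierarchical sampling scheme for the discrete codes could, for example, draw c_c from 30 appearance styles and derive c_p as the shape class that groups every few styles; the exact cardinalities and grouping are not specified here and are purely illustrative.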
In step S3, the foreground image and the foreground mask are perturbed by the layer perturber, and the background image, the perturbed foreground image and the perturbed foreground mask are composited to obtain a generated image.
Specifically, the foreground layer passes through the layer perturber, which applies a small perturbation to its position, size, angle and other characteristics, that is,
$\tilde{x}(u, v) = x(u', v')$
$\begin{pmatrix} u' \\ v' \end{pmatrix} = s \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} u - \Delta u \\ v - \Delta v \end{pmatrix}$
where α is the perturbation angle, Δu and Δv are the perturbation offsets, and s is the perturbation scale factor; during training they are sampled uniformly from fixed intervals. The perturbations are generally small so as not to affect the realism of the final composited image. The perturber also helps prevent degenerate solutions from occurring.
The perturbed layers are then composited to obtain the generated image:
$x_g = \hat{m} \odot \hat{x}_f + (1 - \hat{m}) \odot x_b$
where $x_b$ is the background image, $\hat{x}_f$ is the perturbed foreground image and $\hat{m}$ is the perturbed foreground mask.
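A minimal sketch of the layer perturber and the compositing step follows; the use of PyTorch's affine_grid/grid_sample, the sampling ranges, and the exact parameterization of the shift, rotation and scale are illustrative assumptions rather than the patent's own formulation.

```python
import math
import torch
import torch.nn.functional as F

def perturb_layers(layers, max_shift=0.1, max_angle=10.0, scale_range=(0.9, 1.1)):
    """Apply one small random similarity transform (shift, rotation, scale) per example
    to a stack of foreground layers (image channels and mask concatenated)."""
    b = layers.size(0)
    angle = (torch.rand(b) * 2 - 1) * math.radians(max_angle)
    shift = (torch.rand(b, 2) * 2 - 1) * max_shift
    scale = torch.empty(b).uniform_(*scale_range)
    cos, sin = torch.cos(angle) * scale, torch.sin(angle) * scale
    theta = torch.stack([torch.stack([cos, -sin, shift[:, 0]], dim=1),
                         torch.stack([sin,  cos, shift[:, 1]], dim=1)], dim=1)
    grid = F.affine_grid(theta.to(layers.device), layers.size(), align_corners=False)
    return F.grid_sample(layers, grid, align_corners=False, padding_mode="zeros")

def compose(x_b, x_f, m_f):
    """Perturb foreground image and mask with the same transform, then alpha-composite."""
    perturbed = perturb_layers(torch.cat([x_f, m_f], dim=1))
    xf_hat, m_hat = perturbed[:, :-1], perturbed[:, -1:]
    x_g = m_hat * xf_hat + (1 - m_hat) * x_b      # alpha-composite foreground over background
    return x_g, m_hat
```

Zero padding outside the sampled region leaves the mask zero there, so the background shows through where the foreground layer has been shifted away; whether the patent handles the border differently is not specified here.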
In step S4, the generated image and a real image are input into the discriminator at the same time, the discriminator loss is obtained from the adversarial learning loss function, and the discriminator is trained.
In step S5, the generated image is input into the discriminator and the adversarial learning loss is computed; the auxiliary classifiers pseudo-classify the generated image and the generated foreground mask, and cross entropies are computed from the first discrete variable, the second discrete variable and the outputs of the auxiliary classifiers to obtain the generator loss, with which the generator is trained.
In step S6, the alternate training of the discriminator and the generator is repeated to obtain the trained layered GAN.
The layers are composited into the final generated image, which is sent to the discriminator to judge its authenticity. During training the discriminator also receives real images and compares them with the generated images, so that the generated images become more and more realistic. In addition, the discriminator has an extra branch that performs a pseudo-classification, used to maximize the mutual information between the generated image and c_c. Furthermore, an auxiliary classifier performs a pseudo-classification of the generated foreground mask, further preventing degenerate solutions.
Specifically, the generated image is sent to the discriminator D to discriminate its authenticity. Besides the authenticity branch, the discriminator has an auxiliary branch that pseudo-classifies the generated image. Computing the cross entropy between the c_c used during generation and the class distribution output by this branch gives the loss function of this branch:
$V_{MI,c}(G, Q_c) = \mathbb{E}\left[-\log Q_c(c_c \mid x_g)\right]$
In addition, the auxiliary classifier D_m introduces a pseudo-classification of the foreground mask; similarly, its loss function is:
$V_{MI,p}(G, Q_p) = \mathbb{E}\left[-\log Q_p(c_p \mid \hat{m})\right]$
the learning problem of the whole map layer generating countermeasure network can be summarized according to the mode of countermeasure learning as follows:
$\min_{G, Q_c, Q_p} \max_{D} \; V(D, G) + \lambda_{MI,c} V_{MI,c}(G, Q_c) + \lambda_{MI,p} V_{MI,p}(G, Q_p) + \lambda_{bin} V_{bin}(G)$
where
$V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}}\left[f(D(x))\right] + \mathbb{E}\left[f(-D(x_g))\right]$
is the adversarial learning loss function and f is a concave function; depending on the particular adversarial generation network, the hinge loss, i.e. $f(t) = -\max(0, 1-t)$, may for example be chosen. The last term, $V_{bin}(G)$, is a binarization loss used to make the foreground mask as binary as possible.
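The alternating update described above can be sketched as follows, assuming PyTorch. The hinge form of the adversarial loss matches the example given in the text, but the two-output interface of the discriminator, the min(m, 1 - m) form of the binarization penalty and all loss weights are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def discriminator_step(D, x_real, x_fake, opt_d):
    """Hinge adversarial loss for the discriminator; D(x) is assumed to return
    (real/fake score, class logits for c_c)."""
    opt_d.zero_grad()
    loss_d = (F.relu(1 - D(x_real)[0]).mean() +
              F.relu(1 + D(x_fake.detach())[0]).mean())
    loss_d.backward()
    opt_d.step()
    return loss_d.item()

def generator_step(D, D_m, x_fake, m_fake, cc_idx, cp_idx, opt_g,
                   lam_c=1.0, lam_p=1.0, lam_bin=1.0):
    """Generator loss = adversarial term + cross-entropy terms for c_c (on the image)
    and c_p (on the mask) + a penalty pushing the mask values toward 0 or 1."""
    opt_g.zero_grad()
    score, cc_logits = D(x_fake)                    # image head predicts c_c
    _, cp_logits = D_m(m_fake)                      # mask classifier predicts c_p
    loss_g = (-score.mean()
              + lam_c * F.cross_entropy(cc_logits, cc_idx)
              + lam_p * F.cross_entropy(cp_logits, cp_idx)
              + lam_bin * torch.minimum(m_fake, 1 - m_fake).mean())
    loss_g.backward()
    opt_g.step()
    return loss_g.item()
```

Calling discriminator_step and generator_step in turn on each batch reproduces the alternate training of step S6.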
Further, in an embodiment of the present invention, the method further includes:
and generating a plurality of images with foreground masks by using the trained image layer confrontation network, training a segmentation network by using the images with the plurality of foreground masks, and segmenting the object by using the trained segmentation network.
It will be appreciated that, after training of the layered GAN is completed, it can be used to generate a large number of images with foreground masks. These images are then used to train a segmentation network in the usual supervised fashion, thereby achieving object segmentation.
Specifically, as shown in fig. 3, the generator of the trained layered GAN synthesizes images with foreground masks, and the foreground masks are used as segmentation pseudo-labels of these images for training the segmenter. On the basis of the layered GAN, the unsupervised object segmentation problem is thus solved: model training for object segmentation no longer depends on labeled data, which greatly reduces the time and cost of preparing data.
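As a sketch of how the pseudo-labels might be used: the helper names G_b, G_f and compose refer to the illustrative sketches above, the 0.5 binarization threshold and the BCE objective are assumptions, and the segmentation network can be any standard fully convolutional model.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def synthesize_pseudo_labeled_batch(G_b, G_f, batch=16, z_dim=128, n_cc=30, group=3):
    """Sample latent codes, render a generated image and a binary foreground-mask pseudo-label."""
    z = torch.randn(batch, z_dim)
    cc_idx = torch.randint(0, n_cc, (batch,))
    cp_idx = cc_idx // group                               # hierarchical: c_p derived from c_c
    c_c = F.one_hot(cc_idx, n_cc).float()
    c_p = F.one_hot(cp_idx, n_cc // group).float()
    x_f, m_f = G_f(z, c_p, c_c)
    x_g, m_hat = compose(G_b(z), x_f, m_f)
    return x_g, (m_hat > 0.5).float()                      # image + pseudo-label

def train_segmenter(seg_net, G_b, G_f, opt, steps=10_000):
    """Ordinary supervised-style training of the segmentation network on generated data only."""
    for _ in range(steps):
        x, y = synthesize_pseudo_labeled_batch(G_b, G_f)
        loss = F.binary_cross_entropy_with_logits(seg_net(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
```

The whole pipeline therefore never touches a manually annotated mask: the only supervision signal for the segmenter comes from the generated layers.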
A layered GAN very easily falls into a degenerate solution during training, i.e. the multiple layers collapse into a single layer and the corresponding foreground mask becomes all zeros or all ones. On the one hand, embodiments of the invention introduce perturbations when compositing the layers to prevent the all-ones foreground-mask degenerate solution. On the other hand, embodiments of the invention provide an asymmetric structure to prevent the all-zeros foreground-mask degenerate solution. Specifically, generation of the background layer depends only on the shared variable, while generation of the foreground layer depends on both the shared variable and the exclusive variables; finally, the generated image is required to have high mutual information with the exclusive variables. This asymmetric structure effectively prevents the all-zeros foreground-mask degenerate solution.
As shown in fig. 4, the asymmetric layered GAN provided by the invention effectively avoids falling into degenerate solutions during training. The principle is as follows: if the all-ones foreground-mask degenerate solution occurs, the composited images will have abrupt, unrealistic boundaries because of the layer perturber, which is penalized by the discriminator; if the all-zeros foreground-mask degenerate solution occurs, the generated image contains no information about the exclusive variable c, so the mutual information between the generated image and c is low, which is penalized by the mutual-information loss function. For the normal solution, the generated image contains the information of c and the mutual information between the generated image and c is maximized.
Fig. 5 shows a network structure of the layer generator, the discriminator and the auxiliary classifier at 128 × 128 resolution. All convolutional layers of the generator use 3 × 3 kernels, while all convolutional layers of the discriminator use 4 × 4 kernels. The discriminator has two head branches that respectively realize real/fake discrimination and mutual-information estimation; the auxiliary classifier keeps only the mutual-information-estimation head.
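The two-head discriminator and the auxiliary classifier described for fig. 5 could be sketched as follows; the 4 × 4 convolutions and the two heads follow the description above, while the channel widths, pooling, class counts and the shared class name are illustrative assumptions.

```python
import torch.nn as nn

class TwoHeadDiscriminator(nn.Module):
    """4x4-conv backbone with a real/fake head and a mutual-information (classification) head.
    The auxiliary mask classifier uses the same layout but keeps only the classification head."""
    def __init__(self, in_ch=3, n_classes=30, with_adv_head=True):
        super().__init__()
        widths = [in_ch, 64, 128, 256, 512]
        blocks = []
        for c_in, c_out in zip(widths[:-1], widths[1:]):
            blocks += [nn.Conv2d(c_in, c_out, 4, stride=2, padding=1),
                       nn.LeakyReLU(0.2, inplace=True)]
        self.backbone = nn.Sequential(*blocks, nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.adv_head = nn.Linear(512, 1) if with_adv_head else None
        self.cls_head = nn.Linear(512, n_classes)

    def forward(self, x):
        h = self.backbone(x)
        score = self.adv_head(h) if self.adv_head is not None else None
        return score, self.cls_head(h)

# Illustrative instantiation: an image discriminator predicting c_c, and an auxiliary
# mask classifier predicting c_p, with no real/fake head.
D = TwoHeadDiscriminator(in_ch=3, n_classes=30)
D_m = TwoHeadDiscriminator(in_ch=1, n_classes=10, with_adv_head=False)
```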
As shown in fig. 6, the unsupervised object segmentation method does not rely on manual labeling, so it places few constraints on data collection and greatly reduces the time and cost of preparing data. The segmentation network can be trained and tested at any resolution, such as 128 × 128 or 64 × 64, and all of its convolutional layers use 3 × 3 kernels.
According to the method for generating a layered GAN with an asymmetric structure provided by the embodiment of the invention, perturbation introduced when compositing the layers prevents the all-ones foreground-mask degenerate solution, and the asymmetric structure prevents the all-zeros foreground-mask degenerate solution. A well-trained layered GAN can generate a large number of realistic images that carry a layered representation including the foreground object mask. This further provides a solution for unsupervised object segmentation: using the generated data, the foreground object masks are regarded as segmentation labels and a segmentation network is trained, yielding an effective segmenter.
Next, an apparatus for generating a layered GAN with an asymmetric structure according to an embodiment of the present invention will be described with reference to the drawings.
Fig. 7 is a schematic structural diagram of an apparatus for generating a layered GAN with an asymmetric structure according to an embodiment of the present invention.
As shown in fig. 7, the apparatus for generating a layered GAN with an asymmetric structure includes: a background generation module 701, a foreground generation module 702, a perturbation module 703, a processing module 704 and a training module 705.
The background generation module 701 is configured to input a continuous random variable into the background generator of the asymmetric layer generator and output a background image.
The foreground generation module 702 is configured to input the continuous random variable, a first discrete variable and a second discrete variable into the foreground generator of the asymmetric layer generator and output a foreground image and a foreground mask.
The perturbation module 703 is configured to apply a perturbation to the foreground image and the foreground mask through the layer perturber, and to composite the background image, the perturbed foreground image and the perturbed foreground mask to obtain a generated image.
The processing module 704 is configured to judge the authenticity of the generated image with the discriminator to obtain an adversarial learning loss; to pseudo-classify the generated image with the auxiliary image classifier and compute a cross entropy from the second discrete variable and the class output by the auxiliary image classifier; to pseudo-classify the foreground mask with the auxiliary mask classifier and compute a cross entropy from the first discrete variable and the class output by the auxiliary mask classifier; and to sum the three terms with weights to obtain the generator loss.
The training module 705 is configured to alternately train the discriminator and the generator according to the loss function of the layered GAN to obtain the trained layered GAN.
Further, in an embodiment of the present invention, the apparatus further includes:
a segmentation module, configured to generate a plurality of images with foreground masks using the trained layered GAN, train a segmentation network with these images, and segment objects with the trained segmentation network.
Further, in an embodiment of the present invention, the first discrete variable and the second discrete variable are related through a hierarchical relationship, and the first discrete variable can be derived from the second discrete variable; the first discrete variable represents the category of shape and pose characteristics of the foreground object and is reflected in the foreground mask, while the second discrete variable represents the specific appearance style of the foreground object and is reflected in the foreground image.
Further, in an embodiment of the present invention, compositing the background image, the perturbed foreground image and the perturbed foreground mask to obtain a generated image includes:
$x_g = \hat{m} \odot \hat{x}_f + (1 - \hat{m}) \odot x_b$
where $x_g$ is the generated image, $x_b$ is the background image, $x_f$ is the foreground image, $\hat{m}$ is the perturbed foreground mask and $\hat{x}_f$ is the perturbed foreground image.
It should be noted that the foregoing explanation of the method embodiment is also applicable to the apparatus of this embodiment, and is not repeated herein.
According to the apparatus for generating a layered GAN with an asymmetric structure provided by the embodiment of the invention, perturbation introduced when compositing the layers prevents the all-ones foreground-mask degenerate solution, and the asymmetric structure prevents the all-zeros foreground-mask degenerate solution. A well-trained layered GAN can generate a large number of realistic images that carry a layered representation including the foreground object mask. This further provides a solution for unsupervised object segmentation: using the generated data, the foreground object masks are regarded as segmentation labels and a segmentation network is trained, yielding an effective segmenter.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (10)

1. A method for generating a layered generative adversarial network with an asymmetric structure, characterized by comprising the following steps:
inputting a continuous random variable into the background generator of an asymmetric layer generator, and outputting a background image;
inputting the continuous random variable, a first discrete variable and a second discrete variable into the foreground generator of the asymmetric layer generator, and outputting a foreground image and a foreground mask;
perturbing the foreground image and the foreground mask through a layer perturber, and compositing the background image, the perturbed foreground image and the perturbed foreground mask to obtain a generated image;
inputting the generated image and a real image into a discriminator at the same time, obtaining the discriminator loss from the adversarial learning loss function, and training the discriminator;
inputting the generated image into the discriminator and computing the adversarial learning loss, pseudo-classifying the generated image and the generated foreground mask with auxiliary classifiers, computing cross entropies from the first discrete variable, the second discrete variable and the outputs of the auxiliary classifiers to obtain the generator loss, and training the generator;
repeating the alternate training of the discriminator and the generator to obtain the trained layered generative adversarial network.
2. The method of claim 1, further comprising:
generating a plurality of images with foreground masks using the trained layered generative adversarial network, training a segmentation network with these images, and segmenting objects with the trained segmentation network.
3. The method according to claim 1, wherein the first discrete variable and the second discrete variable are related through a hierarchical relationship, the first discrete variable being derivable from the second discrete variable; the first discrete variable represents the category of shape and pose characteristics of the foreground object and is reflected in the foreground mask, and the second discrete variable represents the specific appearance style of the foreground object and is reflected in the foreground image.
4. The method of claim 1, wherein applying a perturbation to the foreground image and the foreground mask through the layer perturber comprises:
applying a perturbation to the position, size and angle of the foreground layer:
$\tilde{x}(u, v) = x(u', v')$
$\begin{pmatrix} u' \\ v' \end{pmatrix} = s \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} u - \Delta u \\ v - \Delta v \end{pmatrix}$
where x is the foreground layer (a foreground image or a foreground mask), u and v are pixel coordinates, α is the perturbation angle, Δu and Δv are the perturbation offsets, and s is the perturbation scale factor.
5. The method of claim 1, wherein compositing the background image, the perturbed foreground image and the perturbed foreground mask to obtain a generated image comprises:
$x_g = \hat{m} \odot \hat{x}_f + (1 - \hat{m}) \odot x_b$
where $x_g$ is the generated image, $x_b$ is the background image, $x_f$ is the foreground image, $\hat{m}$ is the perturbed foreground mask and $\hat{x}_f$ is the perturbed foreground image.
6. The method of claim 1, wherein the loss function of the layered generative adversarial network is:
$\min_{G, Q_c, Q_p} \max_{D} \; V(D, G) + \lambda_{MI,c} V_{MI,c}(G, Q_c) + \lambda_{MI,p} V_{MI,p}(G, Q_p) + \lambda_{bin} V_{bin}(G)$
where V(D, G) is the adversarial learning loss term, λ_MI,c is the weight of the second mutual-information loss term V_MI,c(G, Q_c), λ_MI,p is the weight of the first mutual-information loss term V_MI,p(G, Q_p), and λ_bin is the weight of the binarization loss term V_bin(G).
7. An apparatus for generating a layered generative adversarial network with an asymmetric structure, comprising:
a background generation module, configured to input a continuous random variable into the background generator of an asymmetric layer generator and output a background image;
a foreground generation module, configured to input the continuous random variable, a first discrete variable and a second discrete variable into the foreground generator of the asymmetric layer generator and output a foreground image and a foreground mask;
a perturbation module, configured to apply a perturbation to the foreground image and the foreground mask through a layer perturber, and to composite the background image, the perturbed foreground image and the perturbed foreground mask to obtain a generated image;
a processing module, configured to judge the authenticity of the generated image with a discriminator to obtain an adversarial learning loss; to pseudo-classify the generated image with an auxiliary image classifier and compute a cross entropy from the second discrete variable and the class output by the auxiliary image classifier; to pseudo-classify the foreground mask with an auxiliary mask classifier and compute a cross entropy from the first discrete variable and the class output by the auxiliary mask classifier; and to sum the three terms with weights to obtain the generator loss;
a training module, configured to alternately train the discriminator and the generator according to the loss function of the layered generative adversarial network to obtain the trained layered generative adversarial network.
8. The apparatus of claim 7, further comprising:
a segmentation module, configured to generate a plurality of images with foreground masks using the trained layered generative adversarial network, train a segmentation network with these images, and segment objects with the trained segmentation network.
9. The apparatus according to claim 7, wherein the first discrete variable and the second discrete variable are related through a hierarchical relationship, the first discrete variable being derivable from the second discrete variable; the first discrete variable represents the category of shape and pose characteristics of the foreground object and is reflected in the foreground mask, and the second discrete variable represents the specific appearance style of the foreground object and is reflected in the foreground image.
10. The apparatus of claim 7, wherein compositing the background image, the perturbed foreground image and the perturbed foreground mask to obtain a generated image comprises:
$x_g = \hat{m} \odot \hat{x}_f + (1 - \hat{m}) \odot x_b$
where $x_g$ is the generated image, $x_b$ is the background image, $x_f$ is the foreground image, $\hat{m}$ is the perturbed foreground mask and $\hat{x}_f$ is the perturbed foreground image.
CN202110120086.1A 2021-01-28 2021-01-28 Method and device for generating layer generation countermeasure network with asymmetric structure Active CN112802048B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110120086.1A CN112802048B (en) 2021-01-28 2021-01-28 Method and device for generating layer generation countermeasure network with asymmetric structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110120086.1A CN112802048B (en) 2021-01-28 2021-01-28 Method and device for generating layer generation countermeasure network with asymmetric structure

Publications (2)

Publication Number Publication Date
CN112802048A true CN112802048A (en) 2021-05-14
CN112802048B CN112802048B (en) 2022-09-09

Family

ID=75812603

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110120086.1A Active CN112802048B (en) 2021-01-28 2021-01-28 Method and device for generating layer generation countermeasure network with asymmetric structure

Country Status (1)

Country Link
CN (1) CN112802048B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114820685A (en) * 2022-04-24 2022-07-29 清华大学 Generation method and device for generating countermeasure network by independent layer
CN114900586A (en) * 2022-04-28 2022-08-12 中国人民武装警察部队工程大学 Information steganography method and device based on DCGAN

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180260957A1 (en) * 2017-03-08 2018-09-13 Siemens Healthcare Gmbh Automatic Liver Segmentation Using Adversarial Image-to-Image Network
CN108665414A (en) * 2018-05-10 2018-10-16 上海交通大学 Natural scene picture generation method
US20190251401A1 (en) * 2018-02-15 2019-08-15 Adobe Inc. Image composites using a generative adversarial neural network
CN110443203A (en) * 2019-08-07 2019-11-12 中新国际联合研究院 The face fraud detection system counter sample generating method of network is generated based on confrontation

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180260957A1 (en) * 2017-03-08 2018-09-13 Siemens Healthcare Gmbh Automatic Liver Segmentation Using Adversarial Image-to-Image Network
US20190251401A1 (en) * 2018-02-15 2019-08-15 Adobe Inc. Image composites using a generative adversarial neural network
CN108665414A (en) * 2018-05-10 2018-10-16 上海交通大学 Natural scene picture generation method
CN110443203A (en) * 2019-08-07 2019-11-12 中新国际联合研究院 The face fraud detection system counter sample generating method of network is generated based on confrontation

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DUC MINH VO et al.: "Paired-D GAN for Semantic Image Synthesis", Computer Vision - ACCV 2018, Pt IV *
SIDDHARTH PANDEY et al.: "An image augmentation approach using two-stage generative adversarial network for nuclei image segmentation", Biomedical Signal Processing and Control *
王晓红 (Wang Xiaohong) et al.: "Stylized calligraphy image generation based on generative adversarial networks", 《包装工程》 (Packaging Engineering) *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114820685A (en) * 2022-04-24 2022-07-29 清华大学 Generation method and device for generating countermeasure network by independent layer
CN114820685B (en) * 2022-04-24 2023-01-31 清华大学 Generation method and device for generating countermeasure network by independent layer
CN114900586A (en) * 2022-04-28 2022-08-12 中国人民武装警察部队工程大学 Information steganography method and device based on DCGAN
CN114900586B (en) * 2022-04-28 2024-04-16 中国人民武装警察部队工程大学 Information steganography method and device based on DCGAN

Also Published As

Publication number Publication date
CN112802048B (en) 2022-09-09

Similar Documents

Publication Publication Date Title
CN110135366B (en) Shielded pedestrian re-identification method based on multi-scale generation countermeasure network
Anwar et al. Image colorization: A survey and dataset
Baldassarre et al. Deep koalarization: Image colorization using cnns and inception-resnet-v2
CN111340122B (en) Multi-modal feature fusion text-guided image restoration method
CN111612807B (en) Small target image segmentation method based on scale and edge information
Bakkay et al. BSCGAN: Deep background subtraction with conditional generative adversarial networks
CN109919013A (en) Method for detecting human face and device in video image based on deep learning
Akey Sungheetha Classification of remote sensing image scenes using double feature extraction hybrid deep learning approach
CN112802048B (en) Method and device for generating layer generation countermeasure network with asymmetric structure
CN113379771B (en) Hierarchical human body analysis semantic segmentation method with edge constraint
Singh et al. Steganalysis of digital images using deep fractal network
CN113344110B (en) Fuzzy image classification method based on super-resolution reconstruction
Rios et al. Feature visualization for 3D point cloud autoencoders
CN115439694A (en) High-precision point cloud completion method and device based on deep learning
Pan et al. Residual meshnet: Learning to deform meshes for single-view 3d reconstruction
Ling et al. Re-visiting discriminator for blind free-viewpoint image quality assessment
CN111652240A (en) Image local feature detection and description method based on CNN
Bounsaythip et al. Genetic algorithms in image processing-a review
CN114331946A (en) Image data processing method, device and medium
CN114187506B (en) Remote sensing image scene classification method of viewpoint-aware dynamic routing capsule network
CN113705358B (en) Multi-angle side face normalization method based on feature mapping
CN117094895B (en) Image panorama stitching method and system
CN112668662A (en) Outdoor mountain forest environment target detection method based on improved YOLOv3 network
CN117011515A (en) Interactive image segmentation model based on attention mechanism and segmentation method thereof
CN111191729A (en) Three-dimensional object fusion feature representation method based on multi-modal feature fusion

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Ji Xiangyang

Inventor after: Yang Yu

Inventor after: Zou Qiran

Inventor before: Ji Xiangyang

Inventor before: Yang Yu

CB03 Change of inventor or designer information
GR01 Patent grant
GR01 Patent grant