CN112241741A - Adaptive image attribute editing model and method based on a classification adversarial network - Google Patents

Adaptive image attribute editing model and method based on a classification adversarial network

Info

Publication number
CN112241741A
CN112241741A (application CN202010861642.6A)
Authority
CN
China
Prior art keywords
attribute
image
label
classifier
source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010861642.6A
Other languages
Chinese (zh)
Inventor
Xiang Jinhai
Liu Ying
Ni Fuchuan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong Agricultural University
Original Assignee
Huazhong Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong Agricultural University
Priority to CN202010861642.6A
Publication of CN112241741A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/40: Extraction of image or video features
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2411: Classification techniques relating to the classification model, based on the proximity to a decision surface, e.g. support vector machines
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an adaptive image attribute editing model based on a classification adversarial network. Accurate attribute transfer and high-quality image generation are achieved by constructing an up-convolution residual network and adding an attribute adversarial classifier Atta-cls to the discriminator. The decoder is built from the up-convolution residual network Tr-resnet, which selectively exploits attribute features and content features; this overcomes the limitation of skip connections in deep encoder-decoder structures, strengthens the attribute features of the target image, yields more accurate, higher-quality images, and improves model performance. Following the idea of generative adversarial networks, the attribute adversarial classifier Atta-cls learns the shortcomings of the transformed image through adversarial learning on attribute differences, and the generator is then optimized against those shortcomings. The invention also drives the estimated attribute label toward the source label through an attribute continuity loss function, ensuring the attribute continuity of the generated image.

Description

Adaptive image attribute editing model and method based on a classification adversarial network
Technical Field
The invention belongs to the technical field of attribute editing in image generation, and in particular relates to an adaptive image attribute editing model and method based on a classification adversarial network.
Background
Attribute editing, also known as attribute transfer, aims to change one or more attributes of an image (hair color, gender, style, etc.) while leaving the remaining attributes unchanged. The key to attribute editing is to achieve accurate attribute transfer while generating high-quality images. In recent years, Generative Adversarial Networks (GANs) have greatly advanced attribute editing. A GAN is defined as a minimax game between a generator, which produces images as realistic as possible, and a discriminator, which tries to distinguish synthetic images from real ones. GANs have been applied across computer vision, for example in image generation, image translation, super-resolution, and image deblurring. Many GAN variants have been proposed to improve image quality and training stability, including novel generator/discriminator architectures, choices of loss function, and regularization techniques. GANs have also been applied to change local attributes of an image (such as hair color, adding accessories, or changing facial expression) or global attributes (such as gender, age, or style), but existing methods suffer from either poor transfer quality or high computational cost.
To obtain accurately attribute-transformed images, an encoder-decoder structure can be introduced into the GAN for feature extraction. VAE/GAN is a combination of a VAE and a GAN that reconstructs an image by introducing an encoder-decoder structure to capture its high-level semantic information, and corrects the image through reconstruction and adversarial losses. Although this approach performs well, the bottleneck layer can degrade the quality of the generated image. To address this, skip connections (or variants thereof) are applied to the encoder-decoder structure serving as the GAN generator to improve image quality and render realistic images. However, skip connections introduce a trade-off between image quality and transfer accuracy: they yield high-quality images, but at the cost of lower attribute accuracy. Because the source attribute features and content features output by the encoder are passed to the decoder together, excessive attention to the source attribute features interferes with applying the target attribute features; that is, this limitation of skip connections in the encoder-decoder structure degrades the model's attribute-transfer performance. To address it, STGAN introduced the Selective Transfer Unit (STU) as a new kind of skip connection. However, STUs require many more parameters and much more computation, which severely limits their application.
The Conditional Generative Adversarial Network (CGAN) is a GAN trained under supervision: a CGAN generates a specific image matching a reference label supplied as input to both the generator and the discriminator. Inspired by CGAN, researchers have contributed heavily to style transfer and attribute editing. In style transfer, the Pix2Pix and CycleGAN models respectively achieve paired and unpaired image translation between two domains; in attribute editing there are likewise two-domain translation models (e.g., GeneGAN). However, the number of models required by two-domain methods grows rapidly with the number of domains, so these models generalize poorly and are not universal. StarGAN controls image attribute transfer through a domain-classification constraint and achieved multi-domain attribute transfer for the first time; however, StarGAN reconstructs the original image through a cycle-consistency loss, which can hinder the generation of high-quality images. The AttGAN model adopts a style controller and achieves not only multi-domain transfer from a source image but also multi-modal transfer of specific attributes. Building on multi-domain transfer, and to avoid affecting irrelevant attributes, STGAN and RelGAN take difference attribute labels as input, while AME-GAN and AG-UIT split the input image information into an attribute part and a background part. Achieving accurate attribute transfer while generating high-quality images remains the central challenge in this area.
Under the GAN paradigm, both real and generated images must be fed to the discriminator while it is trained, so that it learns the defects of generated images and can guide the optimization of the generator accordingly. In existing image attribute editing methods, however, only original images are used as input when training the attribute classifier, and the optimized classifier is then used to improve the generator. This ignores the contribution generated images could make to attribute-transfer accuracy and makes it hard to discover the attribute differences between generated and real images.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: to provide an adaptive image attribute editing model and editing method based on a classification adversarial network that simultaneously achieve accurate attribute transfer and high-quality image generation.
The technical scheme adopted by the invention to solve this problem is as follows. The adaptive image attribute editing model based on a classification adversarial network comprises a generator G, a classifier C and a discriminator D. Let the source domain be P_data, the source image x_r, the source label t_r, and the source attribute label l_r; let the generated domain be P_g, the reconstructed image x_rec, the generated image x_f, the estimated source label s_r, the generated label s_f, and the target attribute label l_f. The generator G receives a source image x_r and a target attribute label l_f, edits the source image x_r, and outputs a generated image x_f or a reconstructed image x_rec. The generator G comprises an encoder and a decoder T; the encoder comprises an attribute encoder E_a and a content encoder E_c. The attribute encoder E_a receives the source image x_r, extracts the image's attribute features l_att, outputs the estimated attribute label E_a(x_r), and ensures the continuity of E_a(x_r) through a label-approximation method. The content encoder E_c receives the source image x_r, extracts the image's content features l_content, and outputs the estimated content representation E_c(x_r). The decoder T is an up-convolution residual network Tr-resnet; it receives the target attribute label l_f and the estimated attribute label E_a(x_r), combines them with the content features E_c(x_r) to construct the target image features l_target, and outputs the generated image x_f or the reconstructed image x_rec. The classifier C is an attribute adversarial classifier Atta-cls; it receives the source image x_r and the generated image x_f and, according to whether the image's attributes are separable, outputs the estimated source label s_r or generated label s_f. The discriminator D distinguishes real images from generated images through adversarial training; the generator G is trained until it outputs images whose attributes the classifier C cannot separate and which the discriminator D cannot distinguish from real images.
According to the scheme, the method further comprises an attribute continuity module; let a source image xrSource attribute tag of lrThe attribute continuity module is used for passing the attribute continuity loss function LaLet evaluation attribute label Ea(xr) Approximation source attribute label lrThen L isaComprises the following steps:
La=||lr-Ea(xr)||1
wherein the attribute label E is evaluateda(xr) With source attribute label l as a referencerIs a vector of the same dimension, | ·| non-woven phosphor1Is L1And (4) norm.
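For illustration, a minimal PyTorch sketch of this loss (the function name, tensor shapes, and the batch-mean reduction are assumptions; the text fixes only the L1 form):

```python
import torch

def attribute_continuity_loss(l_r: torch.Tensor, l_est: torch.Tensor) -> torch.Tensor:
    # L_a = ||l_r - E_a(x_r)||_1, averaged over the batch.
    # l_r:   source attribute labels, shape (batch, n)
    # l_est: attribute labels estimated by the attribute encoder E_a, same shape
    return (l_r - l_est).abs().sum(dim=1).mean()
```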
According to the scheme, the upper convolution residual error net Tr-resnet comprises a plurality of layers of upper convolution residual error blocks Tr-resnet-block, and the input and output combination of the layers of partial upper convolution residual error blocks is used as the output of the unit; let y be the input characteristic information of the convolution residual block on the l-th layerl-1And output characteristic information of ylAn output of flMatching input characteristic information yl-1And outputting the characteristic information ylThe size of the transposed convolution operation is Transpose, and the characteristic information of the encoder of layer 2 is x2Let the weights be α and β, respectively, and initialize α ═ a1,a2,…,as),β=(b1,b2,…,bs) Wherein a isi,biSubject to a standard normal distribution, s denotes ylOr x2The number of medium feature maps; the convolution residual block on the l-th layer is:
Figure BDA0002648342180000041
Figure BDA0002648342180000042
when l is 3, carrying out weighted summation on the characteristic information of the encoder of the layer 2, the input characteristic information and the output characteristic information of the convolution residual block on the layer 3 to obtain final output information; when l ≠ 3, the output of the convolution residual block is a weighted sum of the input characteristic information and the output characteristic information of the convolution residual block on the l-th layer.
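A minimal PyTorch sketch of one such block (channel counts, kernel sizes, and the placement of normalization are assumptions; only the weighted-sum skip structure follows the text):

```python
import torch
import torch.nn as nn

class TrResnetBlock(nn.Module):
    """Up-convolution residual block: output = f_l(y_{l-1}) + alpha * Transpose(y_{l-1}),
    plus beta * x_2 (the encoder's layer-2 feature) when this is the l = 3 block."""
    def __init__(self, in_ch: int, out_ch: int, fuse_encoder: bool = False):
        super().__init__()
        self.body = nn.Sequential(                    # f_l: the block's main path
            nn.ConvTranspose2d(in_ch, out_ch, 4, stride=2, padding=1),
            nn.InstanceNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.InstanceNorm2d(out_ch),
        )
        # Transpose: size-matching transposed convolution for the skip path
        self.match = nn.ConvTranspose2d(in_ch, out_ch, 4, stride=2, padding=1)
        # per-feature-map weights alpha (and beta), initialized ~ N(0, 1)
        self.alpha = nn.Parameter(torch.randn(out_ch, 1, 1))
        self.fuse_encoder = fuse_encoder
        if fuse_encoder:
            self.beta = nn.Parameter(torch.randn(out_ch, 1, 1))

    def forward(self, y_prev: torch.Tensor, x2: torch.Tensor = None) -> torch.Tensor:
        out = self.body(y_prev) + self.alpha * self.match(y_prev)
        if self.fuse_encoder and x2 is not None:      # the l = 3 block only
            out = out + self.beta * x2
        return out
```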
According to the scheme, if the attribute label is an n-dimensional vector, the label output by the classifier C is an n + 1-dimensional vector, and the source label t isrAnd generating a tag tfAre all n + 1-dimensional vectors, tr/tf∈Rn+1(ii) a In the stage of training classifier C, let tr 1When the attribute of the real image is recognizable, the classifier C optimizes all the attributes of the image; t is tf 1When the value is false, the attribute of the generated image is not recognizable, and the classifier C does not process the attribute of the image; source tag trThe last n-dimensional vector of (2) represents the source attribute label lrGenerating a label tfThe last n-dimensional vector of (2) represents the target attribute label lf
Further, a difference attribute label l is set*Tag l for target attributefWith source attribute label lrDifference between l*=lf-lr(ii) a The functional relationship between the generator G and the decoder T is:
G(x,l*)=T(Ec(x),l*)。
further, the model also comprises an image sharpness module, which suppresses blur and preserves sharpness in the reconstructed image through the reconstruction loss function L_rec. The attribute label of the reconstructed image x_rec coincides with the source attribute label l_r, so the difference attribute label l* is 0, and the reconstruction loss function L_rec is:

L_rec = ||x - T(E_c(x), 0)||_1.
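A sketch of this objective (the call signatures of E_c and T are assumptions; the patent fixes only the L1 objective with a zero difference label):

```python
import torch

def reconstruction_loss(x, E_c, T, n_attrs: int) -> torch.Tensor:
    # L_rec = ||x - T(E_c(x), 0)||_1: decode the content features with a zero
    # difference attribute label l* = l_f - l_r = 0 and penalize the L1 error.
    zero_label = torch.zeros(x.size(0), n_attrs, device=x.device)
    x_rec = T(E_c(x), zero_label)
    return (x - x_rec).abs().mean()
```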
further, the model also comprises an image quality module, which ensures good quality of the generated images through adversarial loss functions. Sampling x' between the source image x_r and the generated image x_f, the adversarial loss functions L_D of the discriminator D and L_G of the generator G follow the WGAN-GP form:

L_D = E_{x_f ~ P_g}[D(x_f)] - E_{x_r ~ P_data}[D(x_r)] + λ_gp E_{x'}[(||∇_{x'} D(x')||_2 - 1)^2],
L_G = -E_{x_f ~ P_g}[D(x_f)],

where λ_gp is the gradient-penalty coefficient.
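A sketch of these adversarial losses in PyTorch (the interpolation scheme and the default λ_gp = 10 follow common WGAN-GP practice, not the patent text):

```python
import torch

def gradient_penalty(D, x_real, x_fake):
    # Penalty on samples x' interpolated between source and generated images
    eps = torch.rand(x_real.size(0), 1, 1, 1, device=x_real.device)
    x_hat = (eps * x_real + (1 - eps) * x_fake).requires_grad_(True)
    grad, = torch.autograd.grad(D(x_hat).sum(), x_hat, create_graph=True)
    return ((grad.flatten(1).norm(2, dim=1) - 1) ** 2).mean()

def discriminator_loss(D, x_real, x_fake, lambda_gp: float = 10.0):
    # L_D: push D(x_real) up and D(x_fake) down, with the gradient penalty
    return D(x_fake).mean() - D(x_real).mean() + lambda_gp * gradient_penalty(D, x_real, x_fake)

def generator_adv_loss(D, x_fake):
    # L_G: the generator tries to raise the critic score of generated images
    return -D(x_fake).mean()
```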
further, the model also comprises an attribute transfer module, which raises the attribute transfer rate and keeps the model stable through an attribute-classification adversarial method. Let the probability that classifier C correctly classifies the attributes of image x be the vector s_r = C(x), with C(x)_1, the first element of the vector C(x), indicating whether the image's attributes are separable; the probability that classifier C misclassifies the image's attributes is (1 - C(x)_1). With superscript T denoting transposition, the penalty function loss_c is:

loss_c(x, t) = -[ t^T log C(x) + (1 - t)^T log(1 - C(x)) ].

Sampling x* between the source image x_r and the generated image x_f, with E_{x*}[·] denoting a gradient-penalty term with respect to x*, the attribute-classification adversarial loss L_Cd of classifier C and L_Cg of generator G are:

L_Cd = E_{x_r ~ P_data}[loss_c(x_r, t_r)] + E_{x_f ~ P_g}[loss_c(x_f, t_f)] + λ_gp E_{x*}[(||∇_{x*} C(x*)||_2 - 1)^2],
L_Cg = E_{x_f ~ P_g}[loss_c(x_f, t'_f)],

where t'_f is the generated label with its first element set to 1 (true) and its last n dimensions equal to the target attribute label l_f.
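A hedged sketch of these classification losses (reading the vectorized formula as multi-label binary cross-entropy, and assuming C outputs logits, are both assumptions; the gradient-penalty term of L_Cd can reuse gradient_penalty above):

```python
import torch
import torch.nn.functional as F

def loss_c(logits: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    # Cross-entropy over the (n+1)-dim label; first element = "recognizable"
    return F.binary_cross_entropy_with_logits(logits, t)

def classifier_adv_loss(C, x_real, t_real, x_fake):
    # L_Cd: supervise every attribute of real images; for generated images only
    # push the first element toward 0 (false), matching "the classifier does
    # not process the attributes" of generated images.
    real_term = loss_c(C(x_real), t_real)
    fake_first = C(x_fake)[:, :1]
    fake_term = F.binary_cross_entropy_with_logits(fake_first, torch.zeros_like(fake_first))
    return real_term + fake_term

def generator_cls_loss(C, x_fake, l_f):
    # L_Cg: the generator wants C to call x_fake recognizable (first element 1)
    # and to match the target attribute label l_f.
    ones = torch.ones(l_f.size(0), 1, device=l_f.device)
    return loss_c(C(x_fake), torch.cat([ones, l_f], dim=1))
```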
further, let λ_0, λ_1, λ_2, λ_3, λ_4 be model trade-off parameters. The objective loss function for jointly training the discriminator D and the classifier C comprises the adversarial loss L_D of discriminator D and the attribute-classification adversarial loss L_Cd of classifier C:

min_{D,C} L_{D,C} = λ_0 L_D + λ_1 L_Cd.

The objective loss function for training the generator G comprises the adversarial loss L_G of generator G, the attribute-classification adversarial loss L_Cg of generator G, the reconstruction loss L_rec, and the attribute continuity loss L_a:

min_G L_G^total = L_G + λ_2 L_Cg + λ_3 L_rec + λ_4 L_a.

Through the minimax game among the discriminator D, the classifier C and the generator G, the attribute transfer rate of the generated images is raised while the images output by generator G remain high quality, and the generated domain approaches the source domain.
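A sketch of the two weighted objectives (the assignment of each λ to a term mirrors the reconstructed formulas above and is an assumption; the defaults are placeholders, not tuned values):

```python
def d_c_objective(L_D, L_Cd, lam0=1.0, lam1=1.0):
    # joint objective for discriminator D and classifier C
    return lam0 * L_D + lam1 * L_Cd

def g_objective(L_G, L_Cg, L_rec, L_a, lam2=1.0, lam3=1.0, lam4=1.0):
    # generator objective: adversarial + classification + reconstruction + continuity
    return L_G + lam2 * L_Cg + lam3 * L_rec + lam4 * L_a
```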
The adaptive attribute editing method based on a classification adversarial network is characterized by comprising the following steps (a schematic training step is sketched in code after step S6):

S1: Construct the adaptive image attribute editing model based on a classification adversarial network, comprising a generator G, a classifier C and a discriminator D. The generator G receives a source image x_r and a target attribute label l_f, edits the source image x_r, and outputs a generated image x_f or a reconstructed image x_rec. The generator G comprises an encoder and a decoder T; the encoder comprises an attribute encoder E_a and a content encoder E_c. The attribute encoder E_a receives the source image x_r, extracts the attribute features l_att, outputs the estimated attribute label E_a(x_r), and ensures the continuity of E_a(x_r) through a label-approximation method. The content encoder E_c receives the source image x_r, extracts the content features l_content, and outputs the estimated content representation E_c(x_r). The decoder T is an up-convolution residual network Tr-resnet; it receives the target attribute label l_f and the estimated attribute label E_a(x_r), combines them with the content features E_c(x_r) to construct the target image features l_target, and outputs the generated image x_f or the reconstructed image x_rec. The classifier C is an attribute adversarial classifier Atta-cls; it receives the source image x_r and the generated image x_f and, according to whether the image's attributes are separable, outputs the estimated source label s_r or generated label s_f. The discriminator D distinguishes real images from generated images through adversarial training; the generator G is trained until it outputs images whose attributes the classifier C cannot classify and which the discriminator D cannot distinguish. Initialize the model parameters.

S2: Feed the source image x_r and the target attribute label l_f into the generator G, which outputs a generated image x_f or a reconstructed image x_rec.

S3: Sample between the source image x_r and the generated image x_f.

S4: Feed the samples of step S3 into the discriminator D and the classifier C with the generator G fixed, and train the classifier C and the discriminator D with the objective loss obtained from the adversarial loss L_D of discriminator D and the attribute-classification adversarial loss L_Cd of classifier C. If the probability that classifier C classifies correctly and discriminator D discriminates correctly is maximal, execute step S5; if not, repeat this step.

S5: Fix the classifier C and the discriminator D, adjust the weights of the attribute features in the generated image through the up-convolution residual network Tr-resnet, and train the generator G through the attribute continuity loss L_a, the reconstruction loss L_rec, the adversarial loss L_G of generator G, and the attribute-classification adversarial loss L_Cg of generator G. If classifier C classifies the generated image x_f correctly and the probability that discriminator D correctly discriminates the generated image x_f approaches 1/2, execute step S6; otherwise, repeat this step.

S6: Repeat steps S2 to S5 until the generated images are indistinguishable from real images.
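Putting S2 to S5 together, a schematic training step that reuses the helper functions sketched earlier (the optimizer setup, the attribute access G.E_a / G.E_c / G.T, and the sampling details are assumptions):

```python
import torch

def train_step(G, D, C, opt_dc, opt_g, x_r, l_r, l_f):
    ones = torch.ones(l_r.size(0), 1, device=l_r.device)
    t_r = torch.cat([ones, l_r], dim=1)          # source label: recognizable + l_r

    # S2: generate an edited image from the difference attribute label
    x_f = G(x_r, l_f - l_r)

    # S3/S4: update D and C with G fixed
    loss_dc = d_c_objective(discriminator_loss(D, x_r, x_f.detach()),
                            classifier_adv_loss(C, x_r, t_r, x_f.detach()))
    opt_dc.zero_grad(); loss_dc.backward(); opt_dc.step()

    # S5: update G with D and C fixed
    x_f = G(x_r, l_f - l_r)
    loss_g = g_objective(generator_adv_loss(D, x_f),
                         generator_cls_loss(C, x_f, l_f),
                         reconstruction_loss(x_r, G.E_c, G.T, l_r.size(1)),
                         attribute_continuity_loss(l_r, G.E_a(x_r)))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_dc.item(), loss_g.item()
```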
The invention has the following beneficial effects.

1. The adaptive image attribute editing model based on a classification adversarial network achieves accurate attribute transfer and high-quality image generation by constructing an attribute-classification adversarial network and an up-convolution residual network.

2. The attribute continuity loss function drives the attribute label estimated by the network toward the source attribute label, ensuring the attribute continuity of the generated images.

3. The decoder built from the up-convolution residual network selectively exploits attribute features and content features, overcomes the limitation of skip connections in the encoder-decoder structure, strengthens the attribute features of the generated image, produces images with more accurate attribute transfer and higher quality, and improves model performance.

4. Real and generated images are fed to the classifier together during its training, so the classifier can guide the optimization of the generator from the defect information and accurately discover attribute differences between generated and real images.

5. High-quality image generation is ensured through the reconstruction loss and the adversarial loss.
Drawings
FIG. 1 is a functional block diagram of an embodiment of the present invention.
Fig. 2 is a flow chart of an embodiment of the present invention.
FIG. 3 is a diagram of the model architecture processing an image according to an embodiment of the present invention.
FIG. 4 shows images generated by an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
Referring to fig. 1 and fig. 3, an embodiment of the present invention provides an adaptive image attribute editing model based on a classification adversarial network, ClsGAN (Classification Adversarial Networks), comprising a generator G, a classifier C and a discriminator D, where the output of the generator G is connected to the inputs of the classifier C and the discriminator D. The generator G receives the source image and the target attribute label, edits the attributes of the source image, and outputs a generated or reconstructed image. The classifier C receives the source image and the generated image and, according to whether the image's attributes are separable, outputs the estimated source label or the generated label. The discriminator D distinguishes real images from generated images through adversarial training, and the generator G is trained until it outputs images whose attributes the classifier C cannot separate and which the discriminator D cannot distinguish. Referring to fig. 2, in the stage of training the discriminator D and the classifier C, both the source image and the generated image are used as inputs to D and C, so that the classifier C evaluates the recognizability of all attributes of real images as correctly as possible while attending to individual attributes; the classifier C predicts the attributes of generated images following a loss-function approach from object detection. After the optimization of D and C is finished, the generator G is trained and optimized against the defects of the generated images, so that the attribute class of images generated by G becomes recognizable (its value defined as 1, or true) while the estimated attribute label agrees with the target attribute label.
Let the source domain be P_data, the source image x_r, the source label t_r, and the source attribute label l_r; let the generated domain be P_g, the reconstructed image x_rec, the generated image x_f, the estimated source label s_r, the generated label s_f, and the target attribute label l_f. The generator G comprises an encoder and a decoder T; the encoder comprises an attribute encoder E_a and a content encoder E_c. The attribute encoder E_a and the content encoder E_c decouple the attribute information of the source image x_r from the content information that is not to be modified.
The attribute encoder E_a is a generic convolutional neural network with Convolution-InstanceNorm-ReLU as its basic unit; it receives the source image x_r, extracts the image's attribute features l_att, and outputs the estimated attribute label E_a(x_r) as the basis for the label-continuity operation. To ensure the continuity of the estimated attribute label E_a(x_r), the attribute continuity loss function L_a drives E_a(x_r) toward the true attribute label value, i.e., the source attribute label l_r:

L_a = ||l_r - E_a(x_r)||_1,

where the estimated attribute label E_a(x_r) is a vector of the same dimension as the source attribute label l_r, and ||·||_1 is the L1 norm.
The content encoder E_c is a convolutional neural network that receives the source image x_r, extracts the image's content features l_content, and outputs the estimated content representation E_c(x_r); the content features l_content, of size 512 x 16, are high-level semantic features of the content of the source image x_r.
Let the difference attribute label l* represent the difference between the target attribute label l_f and the source attribute label l_r, i.e.:

l* = l_f - l_r.
The content features l_content output by the content encoder E_c are concatenated with the difference attribute label l* to construct the target image features l_target, which are input to the decoder T; the decoder outputs the generated image x_f or the reconstructed image x_rec. The functional relationship between the generator G and the decoder T is:

G(x, l*) = T(E_c(x), l*).

Because the attribute label of the reconstructed image x_rec coincides with the source attribute label l_r, the difference attribute label l* is 0, and the attribute label and content features are fed to the decoder T to reconstruct the image; the reconstruction loss function L_rec is:

L_rec = ||x - T(E_c(x), 0)||_1.

The L1 norm of the reconstruction loss L_rec suppresses blur and preserves sharpness in the reconstructed image.
To strengthen the attribute features of the generated image x_f, the decoder T must selectively exploit the input attribute features and content features. Referring to fig. 3, the residual neural network ResNet is merged into the up-convolution layers to form the up-convolution residual network Tr-resnet, which comprises several up-convolution residual blocks Tr-resnet-block; an up-convolution residual block selectively acquires the information of the source image x_r and the target attribute label l_f through a combination of its input and output, in order to generate a more accurate, higher-quality image.

In some up-convolution residual blocks, the combination of the layer's input and output serves as the unit's output. To use the source image x_r and the target attribute label l_f efficiently, a weighting strategy is applied, in a specific unit, to the encoder's source-image information and to the input and output information of that up-convolution residual block. Let the input feature of the l-th up-convolution residual block be y_{l-1} and its output feature y_l, with f_l denoting the block's mapping; let Transpose denote the transposed convolution matching the size of the input feature y_{l-1} to that of the output feature y_l; and let x_2 denote the feature of the second encoder layer. Let the weights be α and β, initialized as α = (a_1, a_2, ..., a_s) and β = (b_1, b_2, ..., b_s), where a_i and b_i follow a standard normal distribution and s is the number of feature maps in y_l or x_2. The l-th up-convolution residual block computes:

y_l = f_l(y_{l-1}) + α ⊙ Transpose(y_{l-1}),  for l ≠ 3,
y_l = f_l(y_{l-1}) + α ⊙ Transpose(y_{l-1}) + β ⊙ x_2,  for l = 3,

where ⊙ denotes per-feature-map weighting. When l = 3, the final output is the weighted sum of the second encoder layer's features and the input and output features of the third-layer block; when l ≠ 3, the block's output is the weighted sum of its own input and output features.
The classifier C is the attribute adversarial classifier Atta-cls, based on an adversarial method, and serves to improve the attribute-transfer performance of the image. The discriminator D comprises a series of convolution layers; the classifier C has the same structure as the discriminator D and shares its parameters except for the last layer.

If the attribute label is an n-dimensional vector, the label output by the classifier C is an (n+1)-dimensional vector, and the source label t_r and the generated label t_f are both (n+1)-dimensional vectors, i.e., t_r, t_f ∈ R^{n+1}. In the stage of training classifier C, the first dimension of the label distinguishes whether the attributes are recognizable (separable): t_r^1 = 1 (Real) indicates that the source attributes are recognizable; t_f^1 = 0 (Fake) indicates that the generated attributes are not recognizable. The last n dimensions of the label correspond to the n attributes of the image: the last n dimensions of the source label t_r represent the source attribute label l_r, and the last n dimensions of the generated label t_f represent the target attribute label l_f. Referring to fig. 3, when a source image x_r is input, the first dimension of the real image's attribute-prediction vector is defined as 1 (true) and the class is deemed recognizable; the individual attribute values 0.5, 0.7, 0.3, ..., 0.6 should then agree with the source attribute label l_r as closely as possible, so classifier C must optimize over the attributes of the source image x_r. When a generated image x_f is input, the first dimension of the generated image's attribute-prediction vector is defined as 0 (false), the class is deemed unrecognizable, and the subsequent attribute information is not considered.
Let the probability that classifier C correctly classifies the attributes of image x be the vector s_r = C(x), with C(x)_1 denoting the first element of the vector C(x) (whether the image's attributes are separable); the probability that classifier C misclassifies the image's attributes is (1 - C(x)_1). With superscript T denoting transposition, the penalty function loss_c is:

loss_c(x, t) = -[ t^T log C(x) + (1 - t)^T log(1 - C(x)) ].

Sampling x* between the source image x_r and the generated image x_f, with E_{x*}[·] denoting a gradient-penalty term with respect to x*, the attribute-classification adversarial loss L_Cd of classifier C and L_Cg of generator G are:

L_Cd = E_{x_r ~ P_data}[loss_c(x_r, t_r)] + E_{x_f ~ P_g}[loss_c(x_f, t_f)] + λ_gp E_{x*}[(||∇_{x*} C(x*)||_2 - 1)^2],
L_Cg = E_{x_f ~ P_g}[loss_c(x_f, t'_f)],

where t'_f is the generated label with its first element set to 1 (true) and its last n dimensions equal to the target attribute label l_f.
the invention uses the generation countermeasure network GAN to ensure that the generated image quality is good. For stable training of the discriminator D, the loss function defined by WGAN-GP is adopted as the countermeasure loss. Set in the source image xrAnd generating image xfInter-sampling to obtain a sample x', the penalty function L of the discriminator DDPenalty function L of sum generator GGRespectively as follows:
Figure BDA0002648342180000104
Figure BDA0002648342180000105
let λ0、λ1、λ2、λ3、λ4For model trade-off parameters, the target loss function of the joint training discriminants D and the classifier C comprises a countering loss function LDAttribute classification of classifier C confrontation loss function LCdThe method specifically comprises the following steps:
Figure BDA0002648342180000106
the target loss function of the training generator G includes the opponent loss function L of the generator GGGenerator G Attribute Classification penalty function LCgReconstruction loss function LrecAnd attribute continuity loss function LaThe method specifically comprises the following steps:
Figure BDA0002648342180000111
through the maximum and minimum games of the discriminator D, the classifier C and the generator G, the attribute conversion rate of the image generated by the generator G is improved, the high quality of the image is kept, and the generated domain approaches to the source domain. Referring to fig. 4, the model of the present invention generates images with precisely transformed attributes including hair style, hair color, skin tone, wrinkles, facings, gender features, etc., and with a sense of realism, and it can be observed that these generated images exhibit high quality and accurate attribute transfer characteristics.
The above embodiments are intended only to illustrate the design idea and features of the present invention, so that those skilled in the art can understand the content of the invention and implement it accordingly; the protection scope of the present invention is not limited to the above embodiments. Therefore, all equivalent changes and modifications made in accordance with the principles and concepts disclosed herein are intended to fall within the protection scope of the present invention.

Claims (10)

1. An adaptive image attribute editing model based on a classification adversarial network, characterized in that:

it comprises a generator G, a classifier C and a discriminator D;

let the source domain be P_data, the source image x_r, the source label t_r, and the source attribute label l_r; let the generated domain be P_g, the reconstructed image x_rec, the generated image x_f, the estimated source label s_r, the generated label s_f, and the target attribute label l_f; the generator G is used for receiving a source image x_r and a target attribute label l_f, editing the source image x_r, and outputting a generated image x_f or a reconstructed image x_rec; the generator G comprises an encoder and a decoder T;

the encoder comprises an attribute encoder E_a and a content encoder E_c;

the attribute encoder E_a is used for receiving the source image x_r, extracting the image's attribute features l_att, outputting the estimated attribute label E_a(x_r), and ensuring the continuity of the estimated attribute label E_a(x_r) through a label-approximation method;

the content encoder E_c is used for receiving the source image x_r, extracting the image's content features l_content, and outputting the estimated content representation E_c(x_r);

the decoder T is an up-convolution residual network Tr-resnet, used for receiving the target attribute label l_f and the estimated attribute label E_a(x_r), combining them with the content features E_c(x_r) to construct the target image features l_target, and outputting the generated image x_f or the reconstructed image x_rec;

the classifier C is an attribute adversarial classifier Atta-cls, used for receiving the source image x_r and the generated image x_f and, according to whether the image's attributes are separable, outputting the estimated source label s_r or generated label s_f;

and the discriminator D is used for distinguishing real images from generated images through an adversarial method, the generator G being trained until it outputs images whose attributes the classifier C cannot separate and which the discriminator D cannot distinguish.
2. The adaptive image attribute editing model based on a classification adversarial network according to claim 1, characterized in that: it further comprises an attribute continuity module; let the source attribute label of source image x_r be l_r; the attribute continuity module uses the attribute continuity loss function L_a to drive the estimated attribute label E_a(x_r) toward the source attribute label l_r:

L_a = ||l_r - E_a(x_r)||_1,

where the estimated attribute label E_a(x_r) is a vector of the same dimension as the source attribute label l_r, and ||·||_1 is the L1 norm.
3. The adaptive image attribute editing model based on a classification adversarial network according to claim 1, characterized in that: the up-convolution residual network Tr-resnet comprises several layers of up-convolution residual blocks Tr-resnet-block, and in some of these blocks the combination of the layer's input and output serves as the unit's output; let the input feature of the l-th up-convolution residual block be y_{l-1} and its output feature y_l, with f_l denoting the block's mapping; let Transpose denote the transposed convolution matching the size of the input feature y_{l-1} to that of the output feature y_l, and let x_2 denote the feature of the second encoder layer; let the weights be α and β, initialized as α = (a_1, a_2, ..., a_s) and β = (b_1, b_2, ..., b_s), where a_i and b_i follow a standard normal distribution and s denotes the number of feature maps in y_l or x_2; the l-th up-convolution residual block computes:

y_l = f_l(y_{l-1}) + α ⊙ Transpose(y_{l-1}),  for l ≠ 3,
y_l = f_l(y_{l-1}) + α ⊙ Transpose(y_{l-1}) + β ⊙ x_2,  for l = 3,

where ⊙ denotes per-feature-map weighting; when l = 3, the final output is obtained by weighted summation of the second encoder layer's features and the input and output features of the third-layer up-convolution residual block; when l ≠ 3, the block's output is the weighted sum of its own input and output features.
4. The adaptive image attribute editing model based on a classification adversarial network according to claim 1, characterized in that: if the attribute label is an n-dimensional vector, the label output by the classifier C is an (n+1)-dimensional vector, and the source label t_r and generated label t_f are both (n+1)-dimensional vectors, t_r, t_f ∈ R^{n+1}; in the stage of training classifier C, t_r^1 = true means the attributes of the real image are recognizable and classifier C optimizes over all attributes of the image, while t_f^1 = false means the attributes of the generated image are not recognizable and classifier C does not process that image's attributes; the last n dimensions of the source label t_r represent the source attribute label l_r, and the last n dimensions of the generated label t_f represent the target attribute label l_f.
5. The adaptive image attribute editing model based on a classification adversarial network according to claim 4, characterized in that: a difference attribute label l* is defined as the difference between the target attribute label l_f and the source attribute label l_r, l* = l_f - l_r; the functional relationship between the generator G and the decoder T is:

G(x, l*) = T(E_c(x), l*).
6. The adaptive image attribute editing model based on a classification adversarial network according to claim 5, characterized in that: it further comprises an image sharpness module for suppressing blur and preserving sharpness in the reconstructed image through the reconstruction loss function L_rec; the attribute label of the reconstructed image x_rec coincides with the source attribute label l_r, so the difference attribute label l* is 0, and the reconstruction loss function L_rec is:

L_rec = ||x - T(E_c(x), 0)||_1.
7. The adaptive image attribute editing model based on a classification adversarial network according to claim 5, characterized in that: it further comprises an image quality module for ensuring good quality of the generated images through adversarial loss functions; sampling x' between the source image x_r and the generated image x_f, the adversarial loss functions L_D of the discriminator D and L_G of the generator G are:

L_D = E_{x_f ~ P_g}[D(x_f)] - E_{x_r ~ P_data}[D(x_r)] + λ_gp E_{x'}[(||∇_{x'} D(x')||_2 - 1)^2],
L_G = -E_{x_f ~ P_g}[D(x_f)],

where λ_gp is the gradient-penalty coefficient.
8. The adaptive image attribute editing model based on a classification adversarial network according to claim 5, characterized in that: it further comprises an attribute transfer module for raising the attribute transfer rate and keeping the model stable through an attribute-classification adversarial method; let the probability that classifier C correctly classifies the attributes of image x be the vector s_r = C(x), with C(x)_1, the first element of the vector C(x), indicating whether the image's attributes are separable; the probability that classifier C misclassifies the image's attributes is (1 - C(x)_1); with superscript T denoting transposition, the penalty function loss_c is:

loss_c(x, t) = -[ t^T log C(x) + (1 - t)^T log(1 - C(x)) ];

sampling x* between the source image x_r and the generated image x_f, with E_{x*}[·] denoting a gradient-penalty term with respect to x*, the attribute-classification adversarial loss L_Cd of classifier C and L_Cg of generator G are:

L_Cd = E_{x_r ~ P_data}[loss_c(x_r, t_r)] + E_{x_f ~ P_g}[loss_c(x_f, t_f)] + λ_gp E_{x*}[(||∇_{x*} C(x*)||_2 - 1)^2],
L_Cg = E_{x_f ~ P_g}[loss_c(x_f, t'_f)],

where t'_f is the generated label with its first element set to 1 (true) and its last n dimensions equal to the target attribute label l_f.
9. The adaptive image attribute editing model based on a classification adversarial network according to any one of claims 2, 6, 7 and 8, characterized in that: let λ_0, λ_1, λ_2, λ_3, λ_4 be model trade-off parameters; the objective loss function for jointly training the discriminator D and the classifier C comprises the adversarial loss L_D of discriminator D and the attribute-classification adversarial loss L_Cd of classifier C:

min_{D,C} L_{D,C} = λ_0 L_D + λ_1 L_Cd;

the objective loss function for training the generator G comprises the adversarial loss L_G of generator G, the attribute-classification adversarial loss L_Cg of generator G, the reconstruction loss L_rec, and the attribute continuity loss L_a:

min_G L_G^total = L_G + λ_2 L_Cg + λ_3 L_rec + λ_4 L_a;

through the minimax game among the discriminator D, the classifier C and the generator G, the attribute transfer rate of the generated images is raised while the images output by generator G remain high quality, and the generated domain approaches the source domain.
10. The editing method of the adaptive attribute editing model based on a classification adversarial network according to any one of claims 1 to 9, characterized in that it comprises the following steps:

S1: construct the adaptive image attribute editing model based on a classification adversarial network, comprising a generator G, a classifier C and a discriminator D; the generator G is used for receiving a source image x_r and a target attribute label l_f, editing the source image x_r and outputting a generated image x_f or a reconstructed image x_rec; the generator G comprises an encoder and a decoder T; the encoder comprises an attribute encoder E_a and a content encoder E_c; the attribute encoder E_a is used for receiving the source image x_r, extracting the attribute features l_att, outputting the estimated attribute label E_a(x_r), and ensuring the continuity of E_a(x_r) through a label-approximation method; the content encoder E_c is used for receiving the source image x_r, extracting the content features l_content, and outputting the estimated content representation E_c(x_r); the decoder T is an up-convolution residual network Tr-resnet, used for receiving the target attribute label l_f and the estimated attribute label E_a(x_r), combining them with the content features E_c(x_r) to construct the target image features l_target, and outputting the generated image x_f or reconstructed image x_rec; the classifier C is an attribute adversarial classifier Atta-cls, used for receiving the source image x_r and the generated image x_f and, according to whether the image's attributes are separable, outputting the estimated source label s_r or generated label s_f; the discriminator D is used for distinguishing real images from generated images through an adversarial method, the generator G being trained until it outputs images whose attributes the classifier C cannot classify and which the discriminator D cannot distinguish; initialize the model parameters;

S2: feed the source image x_r and the target attribute label l_f into the generator G, which outputs a generated image x_f or a reconstructed image x_rec;

S3: sample between the source image x_r and the generated image x_f;

S4: feed the samples of step S3 into the discriminator D and the classifier C with the generator G fixed, and train the classifier C and the discriminator D with the objective loss obtained from the adversarial loss L_D of discriminator D and the attribute-classification adversarial loss L_Cd of classifier C; if the probability that classifier C classifies correctly and discriminator D discriminates correctly is maximal, execute step S5; if not, repeat this step;

S5: fix the classifier C and the discriminator D, adjust the weights of the attribute features in the generated image through the up-convolution residual network Tr-resnet, and train the generator G through the attribute continuity loss L_a, the reconstruction loss L_rec, the adversarial loss L_G of generator G, and the attribute-classification adversarial loss L_Cg of generator G; if classifier C classifies correctly and the probability that discriminator D correctly discriminates the generated image x_f approaches 1/2, execute step S6; otherwise, repeat this step;

S6: repeat steps S2 to S5 until the generated images are indistinguishable from real images.
CN202010861642.6A 2020-08-25 2020-08-25 Adaptive image attribute editing model and method based on a classification adversarial network Pending CN112241741A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010861642.6A CN112241741A (en) Adaptive image attribute editing model and method based on a classification adversarial network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010861642.6A CN112241741A (en) Adaptive image attribute editing model and method based on a classification adversarial network

Publications (1)

Publication Number Publication Date
CN112241741A 2021-01-19

Family

ID=74170779

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010861642.6A Adaptive image attribute editing model and method based on a classification adversarial network

Country Status (1)

Country Link
CN (1) CN112241741A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113408673A (en) * 2021-08-19 2021-09-17 联想新视界(南昌)人工智能工研院有限公司 Generation countermeasure network subspace decoupling and generation editing method, system and computer
WO2023239302A1 (en) * 2022-06-10 2023-12-14 脸萌有限公司 Image processing method and apparatus, electronic device, and storage medium
CN115639605A (en) * 2022-10-28 2023-01-24 中国地质大学(武汉) Automatic high-resolution fault identification method and device based on deep learning
CN115639605B (en) * 2022-10-28 2024-05-28 中国地质大学(武汉) Automatic identification method and device for high-resolution fault based on deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU YING et al.: "ClsGAN: Selective Attribute Editing Model based on Classification Adversarial Network", arXiv:1910.11764 *

Similar Documents

Publication Publication Date Title
Xu et al. Adversarially approximated autoencoder for image generation and manipulation
Zhu et al. In-domain gan inversion for real image editing
Pan et al. Recent progress on generative adversarial networks (GANs): A survey
CN112241741A (en) Adaptive image attribute editing model and method based on a classification adversarial network
CN109191409B (en) Image processing method, network training method, device, electronic equipment and storage medium
Li et al. The theoretical research of generative adversarial networks: an overview
Li et al. Improved generative adversarial networks with reconstruction loss
CN113837229B (en) Knowledge-driven text-to-image generation method
CN114998602B (en) Domain adaptive learning method and system based on low confidence sample contrast loss
Walsh et al. Automated human cell classification in sparse datasets using few-shot learning
Johari et al. Context-aware colorization of gray-scale images utilizing a cycle-consistent generative adversarial network architecture
CN115546461A (en) Face attribute editing method based on mask denoising and feature selection
Song et al. Editing out-of-domain gan inversion via differential activations
Song et al. Toward a controllable disentanglement network
Nickabadi et al. A comprehensive survey on semantic facial attribute editing using generative adversarial networks
Wenzel Generative adversarial networks and other generative models
Tibebu et al. Text to image synthesis using stacked conditional variational autoencoders and conditional generative adversarial networks
CN115457374B (en) Deep pseudo-image detection model generalization evaluation method and device based on reasoning mode
CN111382871A (en) Domain generalization and domain self-adaptive learning method based on data expansion consistency
CN116844008A (en) Attention mechanism guided content perception non-reference image quality evaluation method
Gan et al. Generative adversarial networks with augmentation and penalty
Oladipo et al. A novel genetic-artificial neural network based age estimation system
Imamverdiyev et al. Analysis of generative adversarial networks
Qiao et al. Progressive text-to-face synthesis with generative adversarial network
Saaim et al. Generative Models for Data Synthesis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20210119)