CN113052230A - Clothing image generation system and method based on disentanglement network


Info

Publication number
CN113052230A
CN113052230A
Authority
CN
China
Prior art keywords
network
image
clothing
discriminator
disentanglement
Prior art date
Legal status
Pending
Application number
CN202110304774.3A
Other languages
Chinese (zh)
Inventor
张建明
宋阳
王志坚
Current Assignee
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202110304774.3A priority Critical patent/CN113052230A/en
Publication of CN113052230A publication Critical patent/CN113052230A/en
Pending legal-status Critical Current

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/241 — Pattern recognition; classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/213 — Feature extraction, e.g. by transforming the feature space; summarisation; mappings, e.g. subspace methods
    • G06F 18/2148 — Generating training patterns; bootstrap methods, e.g. bagging or boosting, characterised by the process organisation or structure, e.g. boosting cascade
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/045 — Neural networks; architecture; combinations of networks
    • G06N 3/084 — Neural networks; learning methods; backpropagation, e.g. using gradient descent
    • G06Q 30/0621 — Electronic shopping [e-shopping]; item configuration or customization

Abstract

The invention relates to the field of computer vision, and in particular to a clothing image generation system and method based on a disentanglement network. The method comprises the following steps: step S101, acquiring a plurality of clothing images with category labels; step S102, acquiring a color label for each clothing image and concatenating it with the category label; step S103, training a disentanglement neural network, initializing the generator and discriminator network parameters of the disentanglement neural network; step S104, discriminating between the real image and the clothing image produced by the disentanglement generator; and step S105, adjusting and optimizing the parameters of the disentanglement network according to the discrimination value and the output image. The invention combines artificial intelligence technology with the traditional clothing industry, overcomes problems such as monotonous designs, low user satisfaction and high design cost in the traditional clothing industry, improves the efficiency of clothing design, and ensures that the designed clothing fully meets users' design requirements.

Description

Clothing image generation system and method based on disentanglement network
Technical Field
The invention relates to the field of computer vision, and in particular to a clothing image generation system and method based on a disentanglement network.
Background
Computer vision technology has many concrete applications in fields such as image generation, translation and restoration. A conditional generative adversarial network can generate, from an input label, a garment image that is indistinguishable from a real one, providing inspiration for garment designers and consumers. The key to garment image generation is diversified style design, which requires the network to disentangle the features of the garment image well, such as its color features and shape features. Meanwhile, a user can select the category or color style of the garment to be generated, which solves the customized garment design problem to a certain extent and realizes user-centered intelligent garment design, drawing design inspiration from the user's requirements. All the generated garment images are produced from scratch according to the real image distribution; conditional random generation ensures the diversity of the generated garment images while enabling customized generation according to user requirements. At the same time, it can provide design inspiration for garment designers, facilitating the combination of the garment industry with artificial intelligence technology and promoting the development of the garment industry.
Of course, there are many difficulties in designing garment images with artificial intelligence techniques, mainly the following:
1) generative adversarial networks are difficult to train and hard to converge;
2) the randomness of the generated images;
3) clothing images have many features and complex textures and patterns, which easily cause feature entanglement and image blurring;
4) high-resolution, high-definition images are hard to generate.
Disclosure of Invention
In order to remedy the defects of the prior art, the invention provides a clothing image generation method based on a disentanglement network, which generates clothing images of various styles according to user conditions and provides clothing design inspiration for consumers and designers. The specific technical scheme is as follows:
a clothing image generation method based on an disentanglement neural network comprises the following steps:
s101, acquiring a plurality of clothing images with category labels;
s102, acquiring a color label of the clothing image, and cascading the color label with a clothing type label;
s103, training a de-entanglement neural network, and initializing a de-entanglement generator parameter and a discriminator network parameter of the de-entanglement neural network;
s104, inputting the cascaded labels into the de-entanglement neural network, and judging a real image and a clothing image generated by the de-entanglement generator;
and S105, adjusting and optimizing the disentanglement network parameters according to the judgment value and the output clothing image.
Further, the category label and the color label of the clothing image are obtained by one-hot encoding, where the classification of clothing image colors uses the OpenCV tool to convert the RGB model of the clothing image into an HSV model.
Further, step S103 specifically includes:
s103_1, training a de-entanglement neural network, wherein the de-entanglement neural network is a conditional countermeasure network and comprises a de-entanglement generator G and a multi-stage discriminator D, the de-entanglement generator generates a picture after extracting the style features of the clothing image, the multi-stage discriminator discriminates the real picture from the generated picture, and the input of the de-entanglement neural network is a category label l of the clothing imageclassColor label lcolorAnd a random noise variable z, the disentangle generator output G (z, (l)class,lcolor) Log (G (z, (l)) is output from the multi-stage discriminatorclass,lcolor) )) and log (I)real) Corresponding to the discrimination results of the generated picture and the real picture by the multi-stage discriminator, respectively, wherein IrealIs the concatenation of the real garment image and its label;
the overall objective function during training is:
Figure BDA0002986573760000021
i.e., the overall GAN loss function is:
Figure BDA0002986573760000022
among them, the above-mentioned materials are used,
Figure BDA0002986573760000023
respectively, subject to the discrimination result expectation of the true distribution and the de-entanglement generator generated distribution,
Figure BDA0002986573760000024
training processes of the minimum de-entanglement generator of the discriminator to generate a distribution discrimination expectation and a maximum true distribution discrimination expectation, respectively,/true、xtrueL respectively representing a label of the real clothing image, the real clothing image and a label for generating the clothing image;
s103_2, performing spectrum normalization on all network layers of the de-entanglement generator and the multi-stage discriminator, wherein the weight initialization of all network layers is subjected to Gaussian distribution, the mean value is 0, and the variance is 1.
Further, the multi-stage discriminator is composed of a local discriminator and a global discriminator, which perform down-sampling discrimination of the real picture and the generated picture at two different scales; the sampling results are finally combined to obtain the discrimination result of the multi-stage discriminator.
Further, the intermediate feature outputs of the multi-stage discriminator for the generated image are matched against those for the real image, with the feature matching loss function:

    L_FM(G, D_k) = E_{(x_true, l_true)} Σ_{i=1}^{T} (1/N_i) ||D_k^(i)(x_true, l_true) − D_k^(i)(G(z, l), l)||_1

where T is the total number of network layers, N_i denotes the number of elements in layer i, and D_k denotes a sub-discriminator acting as a feature extractor; the minimization of the feature matching loss, min_G L_FM(G, D_k), is performed only when training the disentanglement generator.
Further, the disentanglement generator is composed of a mapping network ω and a progressive generation network G_progress. The mapping network ω consists of fully-connected network layers; random noise and the label encoding are mapped into an intermediate latent space, and the output intermediate latent code ψ = (ψ_style, ψ_bias) controls the parameters of an adaptive instance normalization layer, whose normalization function is:

    AdaIN(x_i, ψ) = ψ_style,i · (x_i − μ(x_i)) / σ(x_i) + ψ_bias,i

where each feature map x_i is normalized separately, then scaled by the intermediate latent code ψ_style and shifted by the offset ψ_bias. The progressive generation network G_progress is a stack of convolution modules with adaptive instance normalization layers; each convolution module upsamples by linear interpolation with a magnification factor of 2, its inputs are the intermediate latent code and random Gaussian noise obeying a Gaussian distribution with mean 0 and variance 1, and finally the output is converted into an RGB image by a convolution layer with a 1 × 1 kernel.
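The adaptive instance normalization above normalizes each feature map separately, then scales it by ψ_style and shifts it by ψ_bias. A small NumPy sketch, assuming (C, H, W) feature maps and per-channel latent codes (the shapes are illustrative):

```python
import numpy as np

def adain(x, psi_style, psi_bias, eps=1e-5):
    """Adaptive instance normalization: normalize each feature map x_i
    to zero mean / unit variance over its spatial axes, then scale by
    psi_style and shift by psi_bias.
    x: (C, H, W) feature maps; psi_style, psi_bias: (C,) latent codes."""
    mu = x.mean(axis=(1, 2), keepdims=True)
    sigma = x.std(axis=(1, 2), keepdims=True)
    x_norm = (x - mu) / (sigma + eps)
    return psi_style[:, None, None] * x_norm + psi_bias[:, None, None]

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16, 16))   # 8 feature maps
psi_style = rng.normal(size=8)     # "scale" half of the latent code
psi_bias = rng.normal(size=8)      # "bias" half of the latent code
y = adain(x, psi_style, psi_bias)
```

Because each output channel's mean is exactly ψ_bias and its scale is |ψ_style|, the latent code fully dictates the per-channel statistics, which is how the style is injected; this also makes concrete why ψ has twice as many entries as there are feature maps.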
Further, the mapping network maps the input signal into a latent space variable ω, which a mapping transformation then converts into a style variable y = (y_s, y_b) output to the progressive generation network G_progress; y_s and y_b are the scaling factor and the bias respectively. After each convolution module, the style variable controls the parameters of adaptive instance normalization, whose operation is expressed as:

    AdaIN(x_i, y) = y_s,i · (x_i − μ(x_i)) / σ(x_i) + y_b,i

where each feature map x_i is regularized separately and then normalized with the style variable y, so the dimension of y is twice the number of feature maps of that network layer.
Further, step S104 specifically includes: the real image together with its category label and color style label is fed to the multi-stage discriminator to be judged true or false; at the start of training, the multi-stage discriminator judges the real image with its matched label as true and the clothing image generated by the disentanglement generator as false.
Further, step S105 specifically includes: in each training round, the parameters are optimized according to the discrimination value and the objective function; specifically, the features of the generated images are extracted by a VGG network, and a perceptual loss function is added in the training process:

    L_percep = Σ_{i=1}^{T} (1/M_i) ||F^(i)(x_true) − F^(i)(G(z, l))||_1

where F^(i) denotes the i-th layer of the VGG network, containing M_i elements. The objective function is:

    min_G ( (max_D Σ_k L_GAN(G, D_k)) + λ1 Σ_k L_FM(G, D_k) + λ2 L_percep )

where λ1 and λ2 are hyper-parameters that need to be tuned during training. After each iteration of the network, the parameters of the disentanglement generator and the multi-stage discriminator are updated by the gradient descent method: the generator parameters θ_G are updated 5 times for each single update of the discriminator parameters θ_D, and the error is back-propagated.
A clothing image generation system based on a disentanglement neural network, comprising:
a user registration module: confirming the identity information of the user, and recording the clothes type and the color style preferred by the user;
an input conversion module: receiving a clothing design requirement input by a user, and converting the clothing design requirement into an input corresponding to a model;
a clothing image design generation module: the converted clothing design requirements of the user are fed into the trained model; the disentanglement generation network receives the noise signal and the condition input and outputs a corresponding clothing image;
a display module: the system displays the clothing image output by the disentanglement network to the user, and the user can select the favorite and satisfactory image.
The method adopts a conditional disentanglement generation network, which preserves elements such as the textures and patterns of the clothing well, realizes customized design according to user requirements, and improves the diversity and specificity of the user experience. By adopting generative adversarial training, in which generator and discriminator improve together, the quality and effect of the generated images can be improved continuously. Once the algorithm is trained, the system can efficiently generate customized clothing images in large quantities and perform clothing design according to user conditions, producing the clothing category and color style the user requires.
Drawings
FIG. 1 is a schematic block diagram of the process flow of the present invention;
FIG. 2 is a flow schematic block diagram of the system architecture of the present invention;
FIG. 3 is a schematic representation of a sample garment image used in the training of the method of the present invention;
FIG. 4 is a schematic diagram of the overall structure of a disentanglement network used in the training of the method of the present invention;
FIG. 5 is a schematic diagram of a multi-stage discriminator network in a de-entanglement network used in the training of the method of the present invention;
fig. 6 is a diagram of the effects actually produced by the system of the present invention.
Detailed Description
In order to make the objects, technical solutions and technical effects of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings.
A clothing image generation system and method based on a disentanglement network, mainly intended to assist clothing designers and to provide clothing image designs for consumers. The method is:
The system receives the user's clothing generation requirement in advance, converts the requirement into a label vector L, feeds it into a pre-trained neural network, and generates and displays clothing images according to the user's input conditions.
The neural network is a conditional generative adversarial network and mainly comprises two parts: 1) the disentanglement generator for garment images, a generative model that captures the image distribution of the original data set and, according to the input conditions, generates garment images ever closer to that distribution; 2) the multi-stage discriminator, which judges whether its input is true or false: it judges a real garment image with a specific category attribute as true, and a garment image produced by the generative model as false. The generative model continuously improves its parameters according to the discrimination results, finally generating clothing images whose image quality and category attributes meet the user's requirements.
More specifically, as shown in fig. 1, a clothing image generation method based on a disentanglement network includes the following steps:
s101, acquiring a plurality of clothing images with category labels.
The clothing images are clothing images of various kinds with white backgrounds and size 512 × 512; the clothing categories may be suit, shirt, one-piece dress, sweater and so on. Each garment has a specific type, and one-hot encoding is adopted: for example, 1 represents that the garment is a sweater and 4 that it is a suit. A specific code is set for each category and converted into a label vector with a 1 at the corresponding position and 0 elsewhere; this encoding serves as the condition input of the generation network.
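The one-hot scheme described above can be sketched in a few lines of Python. The category table below is hypothetical (the patent only gives "sweater = 1" and "suit = 4" as examples); the vector length of 15 matches the 15 garment types of the data set used later:

```python
def one_hot(index, num_classes):
    """Return a one-hot label vector with a 1 at `index`
    (1-based, matching the patent's numbering, e.g. 1 = sweater, 4 = suit)."""
    vec = [0] * num_classes
    vec[index - 1] = 1
    return vec

# hypothetical category table for illustration only
categories = {"sweater": 1, "shirt": 2, "dress": 3, "suit": 4}
label = one_hot(categories["suit"], num_classes=15)
```

The resulting vector has a single 1 at the position of the chosen category and 0 everywhere else, and is concatenated with the color label before being fed to the conditional network.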
S102, acquiring a color label of the clothing image, and cascading with the clothing category label.
The main color styles of the garments include white, black, green, etc., and each garment is assigned a color label. Specifically: the dominant color of each clothing image can be obtained with the OpenCV tool; the color of each clothing image is encoded separately, for example red as 1 and green as 5, and then converted into the corresponding color label by one-hot encoding. The color label is then concatenated with the category label of each clothing image and used as the condition of the conditional adversarial generation network, controlling the generation result.
The OpenCV tool converts the RGB model of the clothing image into an HSV model, in which H ∈ [0,180), S ∈ [0,255] and V ∈ [0,255] represent the hue, saturation and value (brightness) of the image respectively.
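The hue-based color labeling can be sketched with the standard library alone: `colorsys` returns H in [0, 1), which is rescaled to OpenCV's H ∈ [0,180), S, V ∈ [0,255] ranges quoted above. The hue and saturation thresholds below are commonly used illustrative values, not the patent's actual table:

```python
import colorsys

def dominant_color_label(r, g, b):
    """Map an RGB pixel (0-255 per channel) to a coarse color name via HSV,
    using OpenCV-style ranges H in [0,180), S and V in [0,255].
    The threshold bands are illustrative assumptions."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    h, s, v = h * 180, s * 255, v * 255   # rescale to OpenCV ranges
    if v < 46:                   # very dark pixels
        return "black"
    if s < 43 and v > 200:       # bright, unsaturated pixels
        return "white"
    if h < 10 or h >= 156:       # hue wraps around at red
        return "red"
    if 35 <= h < 77:
        return "green"
    return "other"
```

In practice the label would come from the most frequent band over all garment pixels rather than a single pixel, but the per-pixel mapping is the core of the step.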
S103, training a de-entanglement neural network, and initializing a de-entanglement generator parameter and a discriminator network parameter of the de-entanglement neural network.
The training data required by the system's pre-trained disentanglement neural network are original garment images, each of which is preprocessed and its categories encoded;
In the training stage, the clothing data set is used to train a disentanglement neural network consisting of a disentanglement generator and a multi-stage discriminator. The inputs of the disentanglement neural network are the category label l_class of the clothing image, the color label l_color, and a random noise variable z; the disentanglement generator outputs G(z, (l_class, l_color)), and the multi-stage discriminator outputs log D(G(z, (l_class, l_color))) and log D(I_real), its discrimination results for the generated picture and the real picture respectively, where I_real is the concatenation of the real garment image and its label;
the overall objective function during training is:
Figure BDA0002986573760000061
i.e., the global GAN loss function is:
Figure BDA0002986573760000062
among them, the above-mentioned materials are used,
Figure BDA0002986573760000063
respectively, subject to the discrimination result expectation of the true distribution and the de-entanglement generator generated distribution,
Figure BDA0002986573760000064
training processes of the minimum de-entanglement generator of the discriminator to generate a distribution discrimination expectation and a maximum true distribution discrimination expectation, respectively,/true、xtrueAnd l respectively represent a label of the real clothing image, the real clothing image and a label for generating the clothing image.
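The two expectation terms of L_GAN above reduce to batch means of log-scores. A toy NumPy sketch with made-up discriminator outputs (no real network is involved; the batch values are purely illustrative):

```python
import numpy as np

def gan_value(d_real, d_fake):
    """Monte-Carlo estimate of the GAN value function
    E[log D(x_true, l_true)] + E[log(1 - D(G(z, l), l))].
    d_real, d_fake: discriminator outputs in (0, 1) on real / generated batches."""
    return np.mean(np.log(d_real)) + np.mean(np.log1p(-d_fake))

# illustrative discriminator scores on a small batch
d_real = np.array([0.9, 0.8, 0.95])   # D should push these toward 1
d_fake = np.array([0.1, 0.2, 0.05])   # D should push these toward 0
v = gan_value(d_real, d_fake)
```

The discriminator's gradient step raises this value while the generator's step lowers it, which is exactly the min_G max_D alternation in the objective.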
To improve the effect of the GAN loss function, the multi-stage discriminator adopts intermediate feature matching: it can output intermediate features, and the intermediate feature outputs for the generated image are matched against those for the real image, with the feature matching loss function:

    L_FM(G, D_k) = E_{(x_true, l_true)} Σ_{i=1}^{T} (1/N_i) ||D_k^(i)(x_true, l_true) − D_k^(i)(G(z, l), l)||_1

where T is the total number of network layers, N_i denotes the number of elements in layer i, and D_k represents one of the two sub-discriminators, acting as a feature extractor; the minimization of the feature matching loss, min_G L_FM(G, D_k), is performed only when training the disentanglement generator.
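The per-layer 1/N_i · L1 term above is just the mean absolute difference of one layer's features, summed over layers. A NumPy sketch with random stand-in features (the layer shapes are illustrative, not the discriminator's real ones):

```python
import numpy as np

def feature_matching_loss(feats_real, feats_fake):
    """L_FM = sum over layers i of (1/N_i) * ||D^(i)(real) - D^(i)(fake)||_1,
    where N_i is the number of elements in layer i. Used only when
    training the generator, with the discriminator as feature extractor."""
    loss = 0.0
    for fr, ff in zip(feats_real, feats_fake):
        loss += np.abs(fr - ff).mean()   # (1/N_i) * L1 norm of the difference
    return loss

rng = np.random.default_rng(0)
feats_real = [rng.normal(size=(16, 32, 32)), rng.normal(size=(32, 16, 16))]
feats_fake = [f + 0.1 for f in feats_real]   # fake features offset by a constant
loss = feature_matching_loss(feats_real, feats_fake)
```

With a constant offset of 0.1 per element over two layers the loss is 0.2, and it vanishes when the generated features exactly match the real ones, which is the training signal this term provides.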
The disentanglement generator extracts the category features and style features of the clothing image and disentangles the features of the clothing's colors and patterns. It is mainly composed of a mapping network ω and a progressive generation network G_progress, where G_progress is a stack of convolution modules with adaptive instance normalization modules.
The random noise z and the label encoding L are the input of the mapping network ω, and the intermediate latent code together with adaptive random noise is the input of the progressive generation network G_progress; for an output image resolution of 512 × 512, G_progress has 16 layers, and the final convolution layer converts the result to an RGB image using a 1 × 1 convolution kernel. The mapping network ω outputs the intermediate latent code ψ = (ψ_style, ψ_bias), which controls the parameters of the adaptive instance normalization module, the normalization function being:
    AdaIN(x_i, ψ) = ψ_style,i · (x_i − μ(x_i)) / σ(x_i) + ψ_bias,i

where each feature map x_i is normalized separately, scaled by the intermediate latent code ψ_style and shifted by the offset ψ_bias; the dimension of the intermediate latent code ψ is therefore twice that of the feature maps x.
The mapping network ω is composed of 6 fully-connected network layers with input and output sizes of 512 × 512; its input is a constant plus the one-hot label encoding, which it maps into an intermediate latent space. The mapped variable then passes through convolution modules with adaptive instance normalization modules; each convolution module can be regarded as an upsampling module, and the progressive generation network is composed of these upsampling modules, upsampling by linear interpolation with a magnification factor of 2. The convolution layers use 3 × 3 kernels; as the number of stacked channels decreases, each convolution layer is followed by a ReLU nonlinear transformation layer and a batch normalization layer. Gaussian noise is added at the end of each layer to increase the randomness of the generated image; the inputs are the intermediate latent features and random Gaussian noise obeying a Gaussian distribution with mean 0 and variance 1. Finally, a convolution layer with a 1 × 1 kernel converts the output into RGB data, the final output image.
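The 2× linear-interpolation upsampling used by each convolution module can be sketched with separable 1-D interpolation in NumPy; a single-channel feature map is shown for brevity (a real module would apply this per channel before its 3 × 3 convolution):

```python
import numpy as np

def upsample2x_linear(feat):
    """Double the spatial resolution of an (H, W) feature map with
    separable linear interpolation, magnification factor 2."""
    h, w = feat.shape
    rows_new = np.linspace(0, h - 1, 2 * h)
    cols_new = np.linspace(0, w - 1, 2 * w)
    # interpolate along columns first, then along rows
    tmp = np.stack([np.interp(cols_new, np.arange(w), row) for row in feat])
    out = np.stack([np.interp(rows_new, np.arange(h), tmp[:, j])
                    for j in range(2 * w)], axis=1)
    return out

feat = np.arange(16.0).reshape(4, 4)   # tiny 4x4 feature map
up = upsample2x_linear(feat)           # 8x8 result
```

Unlike nearest-neighbor repetition, linear interpolation produces smooth transitions between grid points while preserving the original corner values, which reduces blocky artifacts in the progressively grown image.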
The convolution module contains an adaptive instance normalization module and uses the style variable y = (y_s, y_b) output by the mapping network for instance normalization. The mapping network maps the input signal into a latent space variable ω, which is then converted into the style variable y = (y_s, y_b); y_s and y_b are the scaling factor and the bias respectively. After each convolution module, the style variable controls the parameters of adaptive instance normalization, whose operation can be expressed as:

    AdaIN(x_i, y) = y_s,i · (x_i − μ(x_i)) / σ(x_i) + y_b,i

where each feature map x_i is regularized separately and then normalized with the style variable y, so the dimension of y is twice the number of feature maps of that network layer; finally, a noise signal is added directly to increase the randomness of the generated result.
The multi-stage discriminator is a down-sampling network with a local discriminator and a global discriminator: for an image of size 512 × 512, the local discriminator down-samples features of size 256 × 256 while the global discriminator down-samples the whole image, and the final sampling results are combined to obtain the discrimination result of the multi-stage discriminator. To ensure the stability of the GAN during training, spectral normalization is applied to all network layers of the disentanglement generator and the multi-stage discriminator, and the weights of all network layers are initialized from a Gaussian distribution with mean 0 and variance 1.
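The local/global two-scale scheme can be sketched as one scoring function applied at two resolutions and averaged. The toy `score_fn` below is a placeholder for a real discriminator network, and averaging is one plausible way to "combine" the results (the patent does not specify the combination rule):

```python
import numpy as np

def downsample2x(img):
    """2x average-pool downsampling (a stand-in for strided convolutions)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def multiscale_score(img, score_fn):
    """Combine a global score on the full image with a local score on a
    2x-downsampled copy, mimicking the global + local discriminator pair."""
    global_score = score_fn(img)
    local_score = score_fn(downsample2x(img))
    return (global_score + local_score) / 2

rng = np.random.default_rng(0)
img = rng.uniform(size=(512, 512))
toy_score = lambda x: float(x.mean())  # placeholder for a real discriminator
score = multiscale_score(img, toy_score)
```

Running the same judgment at two scales lets the discriminator penalize both global layout errors and local texture errors, which is the motivation for the multi-stage design.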
And S104, inputting the real image and the clothing image generated by the disentanglement generator into the discriminator for judgment.
The main objective of the disentanglement generator is to generate a false image that is "indistinguishable" from the true image, and the objective of the discriminator is to determine the true image as true and the false image as false.
Specifically, the real image together with its category label and color style label is fed to the multi-stage discriminator to be judged true or false. At the start of training, the multi-stage discriminator judges the real image with its matched label as true, i.e. 1, and the clothing image generated by the disentanglement generator as false, i.e. 0. The disentanglement generator continuously improves the quality of its generated clothing images in the hope of fooling the discriminator, while the discriminator continuously improves its discrimination ability so as to distinguish real images from fake ones accurately; the two compete with each other until, finally, the images generated by the disentanglement generator can confuse the discriminator.
S105, adjusting and optimizing the parameters of the disentanglement network according to the discrimination value and the output image. The disentanglement generator and the multi-stage discriminator are iterated alternately; the global objective function during training is expressed as:

    min_G max_D L_GAN(G, D)

an alternating iterative process in which the discriminator minimizes its discrimination values on the images generated by the disentanglement generator and maximizes its discrimination expectation on real images.
In each training round, parameters are optimized according to the discrimination value and the objective function. To improve the quality of image generation, a VGG network is used to extract features of the generated images, and a perceptual loss function is added in the training process:

    L_percep = Σ_{i=1}^{T} (1/M_i) ||F^(i)(x_true) − F^(i)(G(z, l))||_1

where F^(i) denotes the i-th layer of the VGG network, containing M_i elements. The objective function is:

    min_G ( (max_D Σ_k L_GAN(G, D_k)) + λ1 Σ_k L_FM(G, D_k) + λ2 L_percep )
λ1 and λ2 are hyper-parameters that need to be tuned during training; the parameters of the disentanglement generator and the discriminator are updated by the gradient descent method.
To speed up the training process when optimizing the overall objective, the disentanglement generator parameters θ_G are updated 5 times for each single update of the discriminator parameters θ_D, and the error is back-propagated. The disentanglement generator and the discriminator are each updated by a gradient descent algorithm to reduce the loss function, using the Adam optimizer with initial learning rates of 0.001 for the disentanglement generator and 0.004 for the discriminator. The total number of training rounds is 20000; the learning rate is unchanged during the first 10000 rounds and decays linearly to 0 during the last 10000 rounds. The optimizer parameters are β1 = 0 and β2 = 0.999, and the weights are initialized from a Gaussian distribution with mean 0 and variance 0.01.
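The update schedule described above (5 generator updates per discriminator update; a constant learning rate for the first 10000 rounds, then linear decay to 0) can be captured in two small helper functions; this sketch only models the schedule, not the optimizer itself:

```python
def learning_rate(step, base_lr, total_steps=20000, decay_start=10000):
    """Constant base_lr for the first decay_start rounds, then linear
    decay to 0 over the remaining rounds, as in the patent's schedule."""
    if step < decay_start:
        return base_lr
    return base_lr * (total_steps - step) / (total_steps - decay_start)

def update_discriminator(step, g_updates_per_d=5):
    """The generator is updated every round; the discriminator only once
    per 5 generator updates."""
    return step % g_updates_per_d == 0

g_lr = learning_rate(5000, 0.001)    # generator lr, still in the constant phase
d_lr = learning_rate(15000, 0.004)   # discriminator lr, halfway through decay
```

At round 15000 the discriminator's rate has decayed to half of 0.004; updating the discriminator less often keeps it from overpowering the generator early in training.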
As shown in fig. 2, the system using the clothing image generation method based on the disentanglement network includes:
a user registration module: confirming the identity information of the user, and recording the clothes type and the color style preferred by the user;
an input conversion module: receiving a clothing design requirement input by a user, and converting the clothing design requirement into an input corresponding to a model;
a clothing image design generation module: the converted clothing design requirements of the user are sent into a trained model, a network receiving noise signal and condition input are generated through disentanglement, and a corresponding clothing image is output;
a display module: the system displays the clothing image output by the disentanglement network to the user, and the user can select the favorite and satisfactory image.
By processing images of different data sets, experimental results verifying the scheme provided by this embodiment of the application were obtained; the data sets and experimental results are introduced as follows:
the data set used by the invention is a data set provided in an Appeal Classification With Style (ACWS) article, the data set comprises more than 80000 pictures and 15 garment types, the pictures of the data set are unified into 512 x 512 size by methods of zooming, stretching, bilinear interpolation and the like, the label and the garment image of each garment type are simultaneously sent to a multi-stage discriminator for judgment, after iterative training 20000 rounds, the experimental result is tested, as shown in figure 6, the trained model forms an algorithm module of a garment image generation system, the user inputs different types of garment type labels and color Style labels, and the system can generate the garment image specified by the user.
To demonstrate the effectiveness of the invention in feature disentanglement, ablation experiments were also performed. They show that the disentanglement generator and the multi-stage discriminator both contribute greatly to the quality of the generated clothing images. The Inception Score (IS) is an objective evaluation index commonly used for generative models; a higher score indicates a better generative model. The Inception Score is used to evaluate the results of the invention, and the evaluation results are shown in the following table:
Method | Inception Score (IS)
Without disentanglement generator (convolution with up-sampling only) | 1.7894 ± 0.1136
Without multi-stage discriminator | 1.9347 ± 0.1220
The invention | 2.2010 ± 0.0884
It can be seen that the method adopted by the invention has the highest IS value, i.e. the best generation effect. However, for a generative model it is not sufficient to evaluate by the IS value alone. To illustrate the perceptual quality of the generated images, the Learned Perceptual Image Patch Similarity (LPIPS) index is also tested; the lower the value, the better the perceptual effect. The evaluation results are shown in the following table:
Method | Image perceptual similarity index (LPIPS)
Without disentanglement generator (convolution with up-sampling only) | 0.1297
Without multi-stage discriminator | 0.1349
The invention | 0.1126
The LPIPS value of the method adopted by the invention is the lowest, indicating the perception closest to the real image. Together, the two quantitative evaluation indices show that the method adopted by the invention achieves the best effect.
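The Inception Score used above is defined as IS = exp(E_x[KL(p(y|x) ‖ p(y))]), where p(y|x) is a classifier's class distribution for a generated image and p(y) its marginal over all images. A minimal sketch of that formula, assuming the per-image class probabilities are already given (a real evaluation would obtain them from an Inception network):

```python
import math

def inception_score(probs):
    """IS = exp(mean_x KL(p(y|x) || p(y))), with probs a list of per-image
    class-probability vectors."""
    n, k = len(probs), len(probs[0])
    marginal = [sum(p[j] for p in probs) / n for j in range(k)]
    kl_sum = 0.0
    for p in probs:
        kl_sum += sum(p[j] * math.log(p[j] / marginal[j])
                      for j in range(k) if p[j] > 0)
    return math.exp(kl_sum / n)

# Identical uniform predictions carry no diversity or confidence signal: IS = 1.
uniform = inception_score([[0.5, 0.5], [0.5, 0.5]])
# Confident and diverse predictions push IS toward the class count (2 here).
sharp = inception_score([[1.0, 0.0], [0.0, 1.0]])
```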
The scheme provided by the invention can be applied to various fields such as electronic commerce, application software and the garment design industry. It should be noted that all the optional technical solutions above may be combined arbitrarily to form optional embodiments of the present application, and details are not repeated here.

Claims (10)

1. A clothing image generation method based on a disentanglement neural network, characterized by comprising the following steps:
s101, acquiring a plurality of clothing images with category labels;
s102, acquiring a color label of the clothing image, and cascading the color label with a clothing type label;
s103, training a de-entanglement neural network, and initializing a de-entanglement generator parameter and a discriminator network parameter of the de-entanglement neural network;
s104, inputting the cascaded labels into the de-entanglement neural network, and judging a real image and a clothing image generated by the de-entanglement generator;
and S105, adjusting and optimizing the disentanglement network parameters according to the judgment value and the output clothing image.
2. The clothing image generation method based on the disentanglement neural network as claimed in claim 1, wherein the category label and the color label of the clothing image are obtained by one-hot coding, and the color classification of the clothing image is performed by converting the RGB model of the clothing image into an HSV model using an OpenCV tool.
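The RGB-to-HSV conversion and one-hot labelling of claim 2 can be sketched with the standard library instead of OpenCV. Note that `colorsys` returns h, s, v in [0, 1], while OpenCV scales H to [0, 180] for 8-bit images; the bucket count and the achromatic thresholds below are illustrative assumptions:

```python
import colorsys

def color_bucket(r, g, b, buckets=6):
    """Map an RGB pixel to a coarse hue bucket usable as a color class label."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    if s < 0.2 or v < 0.2:      # assumed thresholds for gray/black tones
        return buckets          # extra bucket for achromatic colors
    return int(h * buckets) % buckets

def one_hot(index, size):
    """One-hot encode a class index, as used for the category and color labels."""
    return [1.0 if i == index else 0.0 for i in range(size)]

label = one_hot(color_bucket(255, 0, 0), 7)  # pure red falls in hue bucket 0
```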
3. The clothing image generation method based on the disentanglement neural network according to claim 1, wherein the step S103 specifically includes:
s103_1, training a disentanglement neural network, wherein the disentanglement neural network is a conditional adversarial network comprising a disentanglement generator G and a multi-stage discriminator D; the disentanglement generator generates a picture after extracting the style features of the clothing image, and the multi-stage discriminator discriminates between the real picture and the generated picture. The input of the disentanglement neural network is the category label l_class of the clothing image, the color label l_color and a random noise variable z; the disentanglement generator outputs G(z, (l_class, l_color)), and the multi-stage discriminator outputs log(D(G(z, (l_class, l_color)))) and log(D(I_real)), corresponding to its discrimination results on the generated picture and the real picture respectively, where I_real is the concatenation of the real clothing image and its label;
the overall objective function during training is:
min_G max_D L_GAN(G, D)
i.e., the overall GAN loss function is:
L_GAN(G, D) = E_(x_true, l_true)[log D(x_true, l_true)] + E_z[log(1 − D(G(z, l), l))]
where E_(x_true, l_true)[·] and E_z[·] are the expectations of the discrimination results over the true distribution and over the distribution generated by the disentanglement generator respectively; min_G and max_D denote the training processes in which the disentanglement generator minimizes the generated-distribution discrimination expectation and the discriminator maximizes the true-distribution discrimination expectation; l_true, x_true and l respectively denote the label of the real clothing image, the real clothing image, and the label used to generate the clothing image;
s103_2, performing spectral normalization on all network layers of the disentanglement generator and the multi-stage discriminator; the weight initialization of all network layers follows a Gaussian distribution with mean 0 and variance 1.
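The conditional GAN objective of claim 3 can be evaluated for scalar discriminator outputs to see how the two terms trade off. This is a minimal numeric sketch, treating `d_real` and `d_fake` as the discriminator's probabilities for a (real image, label) pair and a (generated image, label) pair:

```python
import math

def gan_loss(d_real, d_fake):
    """Value of log D(x_true, l_true) + log(1 - D(G(z, l), l)) for scalar
    discriminator probabilities in (0, 1)."""
    return math.log(d_real) + math.log(1.0 - d_fake)

# The discriminator maximizes this quantity; the generator minimizes the
# second term. At the classic equilibrium D(.) = 0.5 everywhere, the value
# is 2 * log(0.5) = -log(4).
equilibrium = gan_loss(0.5, 0.5)
confident = gan_loss(0.9, 0.1)  # a strong discriminator scores higher
```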
4. The clothing image generation method based on the disentanglement neural network as claimed in claim 3, wherein the multi-stage discriminator is composed of a local discriminator and a global discriminator, which perform down-sampling discrimination on the real picture and the generated picture at two different scales respectively; the discrimination results at the two scales are finally combined to obtain the discrimination result of the multi-stage discriminator.
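The two-scale scheme of claim 4 can be sketched as follows: the same (stub) discriminator scores the image at full resolution and after down-sampling, and the two scores are combined. The averaging combination and the mean-intensity stub are illustrative assumptions:

```python
def downsample2(img):
    """Average-pool a 2-D grid by a factor of 2 (even dimensions assumed),
    producing the coarser scale fed to the global discriminator."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2*r][2*c] + img[2*r][2*c+1] +
              img[2*r+1][2*c] + img[2*r+1][2*c+1]) / 4.0
             for c in range(w)] for r in range(h)]

def multi_scale_score(img, discriminator):
    """Score the image at two scales and combine the results by averaging."""
    full = discriminator(img)
    coarse = discriminator(downsample2(img))
    return (full + coarse) / 2.0

mean_intensity = lambda im: sum(map(sum, im)) / (len(im) * len(im[0]))  # stub D
score = multi_scale_score([[0.0, 1.0], [1.0, 0.0]], mean_intensity)
```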
5. The clothing image generation method based on the disentanglement neural network according to claim 3, wherein the intermediate feature outputs of the multi-stage discriminator for the generated image are matched with those for the real image, and the feature matching loss function is:
L_FM(G, D_k) = Σ_{i=1}^{T} (1/N_i) [ ‖ D_k^(i)(x_true, l_true) − D_k^(i)(G(z, l), l) ‖_1 ]
where T is the total number of network layers, N_i denotes the number of elements in layer i, and D_k denotes a sub-discriminator used as a feature extractor; the feature matching loss min_G L_FM(G, D_k) is minimized only when training the disentanglement generator.
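The feature matching loss of claim 5 reduces to a per-layer normalized L1 distance between discriminator features of the real and generated images. A minimal sketch, with hand-made feature vectors standing in for the sub-discriminator's layer outputs:

```python
def feature_matching_loss(real_feats, fake_feats):
    """L_FM = sum_i (1/N_i) * ||D^(i)(real) - D^(i)(fake)||_1, where the two
    arguments are lists of per-layer feature vectors from one sub-discriminator."""
    loss = 0.0
    for fr, ff in zip(real_feats, fake_feats):
        n_i = len(fr)  # N_i: number of elements in layer i
        loss += sum(abs(a - b) for a, b in zip(fr, ff)) / n_i
    return loss

real = [[1.0, 2.0], [0.0, 0.0, 0.0, 4.0]]   # two layers, N_1 = 2, N_2 = 4
fake = [[1.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
l_fm = feature_matching_loss(real, fake)     # |2-0|/2 + |4-0|/4 = 2.0
```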
6. The clothing image generation method based on the disentanglement neural network as claimed in claim 3, wherein the disentanglement generator is composed of a mapping network ω and a progressive generation network G_progress; the mapping network ω is composed of fully-connected network layers, takes the random noise and the label codes as input to an intermediate latent space, and outputs an intermediate latent code ψ = (ψ_style, ψ_bias) that controls the parameters of an adaptive instance normalization layer. The normalization function is:
AdaIN(x_i, ψ) = ψ_style,i · (x_i − μ(x_i)) / σ(x_i) + ψ_bias,i
wherein each feature map x_i is normalized separately, scaled by a factor ψ_style and offset by ψ_bias. The progressive generation network G_progress is composed of convolution modules with adaptive instance normalization layers; each convolution module performs up-sampling by linear interpolation with a magnification factor of 2. Its inputs are the intermediate latent code and random Gaussian noise, the added noise following a Gaussian distribution with mean 0 and variance 1. Finally, the output is converted into an RGB image through a convolution layer with kernel size 1.
7. The method as claimed in claim 6, wherein the mapping network maps the input signal to a latent space variable ω, and an affine transformation then transforms the latent variable into a style variable y = (y_s, y_b) that is output to the progressive generation network G_progress; y_s and y_b are the scaling factor and the bias respectively. After each convolution module, the style variable controls the parameters of the adaptive instance normalization, whose operation is expressed as:
AdaIN(x_i, y) = y_s,i · (x_i − μ(x_i)) / σ(x_i) + y_b,i
wherein each feature map x_i is normalized separately and then modulated with the style variable y, so the dimension of y is twice the number of feature maps in that network layer.
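The adaptive instance normalization of claims 6 and 7 can be sketched for a single flattened feature map: normalize to zero mean and unit variance, then apply the style scale and bias. The epsilon term is a standard numerical-stability assumption:

```python
import math

def adain(feature_map, scale, bias, eps=1e-8):
    """Adaptive instance normalization of one feature map x_i:
    scale * (x_i - mean) / std + bias."""
    n = len(feature_map)
    mean = sum(feature_map) / n
    var = sum((v - mean) ** 2 for v in feature_map) / n
    std = math.sqrt(var + eps)
    return [scale * (v - mean) / std + bias for v in feature_map]

x = [1.0, 2.0, 3.0, 4.0]
y = adain(x, scale=2.0, bias=5.0)  # output mean ~ 5.0, output std ~ 2.0
```

After AdaIN the statistics of the feature map are fully determined by the style code, which is exactly why the latent code can steer the generated style at every resolution.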
8. The clothing image generation method based on the disentanglement neural network according to claim 1, wherein the step S104 specifically comprises: the real image together with its category label and color style label is sent to the multi-stage discriminator to judge true or false; during training of the multi-stage discriminator, the real image with its matching labels is judged as true, and the clothing image generated by the disentanglement generator is judged as false.
9. The clothing image generation method based on the disentanglement neural network according to claim 1, wherein the step S105 specifically comprises: in each training iteration, the parameters are optimized according to the judgment value and the objective function. Specifically, features of the generated images are extracted through a VGG network, and a perceptual loss function is added to the training process:
L_percep = Σ_i (1/M_i) ‖ F^(i)(x_true) − F^(i)(G(z, l)) ‖_1
where F^(i) denotes the i-th layer of the VGG network, containing M_i elements. The overall objective function is:
min_G ( max_D L_GAN(G, D) + λ1 L_FM(G, D_k) + λ2 L_percep )
wherein λ1 and λ2 are hyper-parameters adjusted during training. After each iteration of the network, the parameters of the disentanglement generator and the multi-stage discriminator are updated by gradient descent: for every 5 updates of the generator parameters θ_G, the discriminator parameters θ_D are updated once, and the error is propagated backwards.
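The combined objective and the 5:1 update schedule of claim 9 can be sketched as follows. The λ defaults of 10.0 are assumptions for illustration, not values stated in the patent:

```python
def total_loss(gan, fm, percep, lam1=10.0, lam2=10.0):
    """Combined objective: GAN loss plus weighted feature-matching and
    perceptual (VGG) terms; lam1/lam2 are the tunable hyper-parameters."""
    return gan + lam1 * fm + lam2 * percep

def training_schedule(iterations):
    """Count updates when the generator parameters theta_G are updated 5 times
    for every single update of the discriminator parameters theta_D."""
    g_updates = d_updates = 0
    for step in range(iterations):
        g_updates += 1                 # theta_G updated every iteration
        if (step + 1) % 5 == 0:
            d_updates += 1             # theta_D updated once per 5 G updates
    return g_updates, d_updates

g, d = training_schedule(20)           # 20 generator updates, 4 discriminator updates
```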
10. A clothing image generation system based on a disentanglement neural network is characterized by comprising:
a user registration module: confirming the identity information of the user, and recording the clothes type and the color style preferred by the user;
an input conversion module: receiving a clothing design requirement input by a user, and converting the clothing design requirement into an input corresponding to a model;
a clothing image design generation module: the converted clothing design requirements of the user are sent into the trained model; the disentanglement generation network receives the noise signal and the condition input and outputs a corresponding clothing image;
a display module: the system displays the clothing images output by the disentanglement network to the user, from which the user can select satisfactory images.
CN202110304774.3A 2021-03-22 2021-03-22 Clothing image generation system and method based on disentanglement network Pending CN113052230A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110304774.3A CN113052230A (en) 2021-03-22 2021-03-22 Clothing image generation system and method based on disentanglement network


Publications (1)

Publication Number Publication Date
CN113052230A true CN113052230A (en) 2021-06-29

Family

ID=76514440

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110304774.3A Pending CN113052230A (en) 2021-03-22 2021-03-22 Clothing image generation system and method based on disentanglement network

Country Status (1)

Country Link
CN (1) CN113052230A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113722783A (en) * 2021-07-08 2021-11-30 浙江海阔人工智能科技有限公司 User-oriented intelligent garment design system and method based on deep learning model

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111785261A (en) * 2020-05-18 2020-10-16 南京邮电大学 Cross-language voice conversion method and system based on disentanglement and explanatory representation
CN111951153A (en) * 2020-08-12 2020-11-17 杭州电子科技大学 Face attribute fine editing method based on generation of confrontation network hidden space deconstruction
CN112100908A (en) * 2020-08-31 2020-12-18 西安工程大学 Garment design method for generating confrontation network based on multi-condition deep convolution


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
GÖKHAN YILDIRIM ET AL.: "Disentangling Multiple Conditional Inputs in GANs", arXiv *
NEERAJ KUMAR ET AL.: "Robust One Shot Audio to Video Generation", 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) *
TERO KARRAS ET AL.: "A Style-Based Generator Architecture for Generative Adversarial Networks", arXiv *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210629