CN113052230A - Clothing image generation system and method based on disentanglement network - Google Patents
- Publication number: CN113052230A
- Application number: CN202110304774.3A
- Authority: CN (China)
- Prior art keywords: network, image, clothing, discriminator, disentanglement
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F18/241: Pattern recognition; classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/213: Feature extraction, e.g. by transforming the feature space; summarisation; mappings, e.g. subspace methods
- G06F18/2148: Generating training patterns; bootstrap methods characterised by the process organisation or structure, e.g. boosting cascade
- G06N3/045: Neural network architectures; combinations of networks
- G06N3/084: Neural network learning methods; backpropagation, e.g. using gradient descent
- G06Q30/0621: Electronic shopping; item configuration or customization
Abstract
The invention relates to the field of computer vision, in particular to a clothing image generation system and method based on a disentanglement network. The method comprises the following steps: step S101, acquiring a plurality of clothing images with category labels; step S102, acquiring a color label for each clothing image and concatenating it with the category label; step S103, training a disentanglement neural network, and initializing the generator and discriminator network parameters of the disentanglement neural network; step S104, discriminating between the real image and the clothing image generated by the disentanglement generator; and step S105, adjusting and optimizing the parameters of the disentanglement network according to the discrimination value and the output image. The invention combines artificial intelligence technology with the traditional clothing industry, overcomes problems such as monotonous design, insufficient user satisfaction and high design cost in the traditional clothing industry, improves clothing design efficiency, and ensures that the designed clothing fully meets the design requirements of users.
Description
Technical Field
The invention relates to the field of computer vision, in particular to a clothing image generation system and method based on a disentanglement network.
Background
Computer vision technology has many concrete applications in fields such as image generation, translation and restoration. A conditional generative adversarial network can generate a garment image that is indistinguishable from a real image according to an input label, providing inspiration for garment designers and consumers. The key to garment image generation is diversified style design, which requires the network to disentangle the features of the garment image well, such as its color features and shape features. Meanwhile, a user can select the category or color style of the garment to be generated, which solves the customized garment design problem to some extent and realizes user-centered intelligent garment design, giving design inspiration according to user requirements. All generated garment images are produced from scratch according to the real image distribution; conditional random generation ensures the diversity of the generated garment images, enables customized generation according to user requirements, and at the same time provides design inspiration for garment designers, facilitating the combination of the garment industry with artificial intelligence technology and promoting the development of the garment industry.
Of course, there are many difficulties in designing garment images using artificial intelligence techniques, mainly:
1) a generative adversarial network is hard to train and difficult to converge;
2) the generated images are random;
3) garment images have many features and complex textures and patterns, which easily cause feature entanglement and image blurring;
4) high-resolution, high-definition images are hard to generate well.
Disclosure of Invention
In order to remedy the defects of the prior art, the invention provides a clothing image generation method based on a disentanglement network, which generates clothing images of various styles according to user conditions and provides clothing design inspiration for consumers and designers. The specific technical scheme is as follows:
A clothing image generation method based on a disentanglement neural network comprises the following steps:
S101, acquiring a plurality of clothing images with category labels;
S102, acquiring a color label for each clothing image and concatenating it with the clothing category label;
S103, training a disentanglement neural network, and initializing the disentanglement generator parameters and discriminator network parameters of the disentanglement neural network;
S104, inputting the concatenated labels into the disentanglement neural network, and discriminating between the real image and the clothing image generated by the disentanglement generator;
and S105, adjusting and optimizing the disentanglement network parameters according to the discrimination value and the output clothing image.
Further, the category label and the color label of the clothing image are obtained by one-hot encoding, and the clothing image colors are classified by using the OpenCV tool to convert the RGB model of the clothing image into an HSV model.
Further, step S103 specifically includes:
s103_1, training a de-entanglement neural network, wherein the de-entanglement neural network is a conditional countermeasure network and comprises a de-entanglement generator G and a multi-stage discriminator D, the de-entanglement generator generates a picture after extracting the style features of the clothing image, the multi-stage discriminator discriminates the real picture from the generated picture, and the input of the de-entanglement neural network is a category label l of the clothing imageclassColor label lcolorAnd a random noise variable z, the disentangle generator output G (z, (l)class,lcolor) Log (G (z, (l)) is output from the multi-stage discriminatorclass,lcolor) )) and log (I)real) Corresponding to the discrimination results of the generated picture and the real picture by the multi-stage discriminator, respectively, wherein IrealIs the concatenation of the real garment image and its label;
among them, the above-mentioned materials are used,respectively, subject to the discrimination result expectation of the true distribution and the de-entanglement generator generated distribution,training processes of the minimum de-entanglement generator of the discriminator to generate a distribution discrimination expectation and a maximum true distribution discrimination expectation, respectively,/true、xtrueL respectively representing a label of the real clothing image, the real clothing image and a label for generating the clothing image;
s103_2, performing spectrum normalization on all network layers of the de-entanglement generator and the multi-stage discriminator, wherein the weight initialization of all network layers is subjected to Gaussian distribution, the mean value is 0, and the variance is 1.
Further, the multi-stage discriminator is composed of a local discriminator and a global discriminator; the local discriminator and the global discriminator perform down-sampling discrimination on the real picture and the generated picture at two different scales, and the sampled results are finally combined to obtain the discrimination result of the multi-stage discriminator.
Further, the intermediate feature outputs of the multi-stage discriminator for the generated image are matched with its intermediate feature outputs for the real image. The feature matching loss function is:

L_FM(G, D_k) = E[ sum_{i=1}^{T} (1/N_i) * || D_k^(i)(x_true, l) - D_k^(i)(G(z, l), l) ||_1 ]

where T is the total number of network layers, N_i denotes the number of elements in the i-th layer, and D_k denotes a sub-discriminator used as a feature extractor; the feature matching loss is minimized only when training the disentanglement generator.
Further, the disentanglement generator is composed of a mapping network ω and a progressive generation network G_progress. The mapping network ω consists of fully-connected network layers; it maps the random noise and label code into an intermediate latent space and outputs an intermediate latent code ψ = (ψ_style, ψ_bias), which controls the parameters of an adaptive instance normalization layer. The normalization function is:

AdaIN(x_i, ψ) = ψ_style,i * (x_i - μ(x_i)) / σ(x_i) + ψ_bias,i

where each feature map x_i is normalized separately, then scaled by the intermediate latent code ψ_style and shifted by the offset ψ_bias. The progressive generation network G_progress consists of convolution modules with adaptive instance normalization layers; each convolution module upsamples by linear interpolation with a magnification factor of 2. Its inputs are the intermediate latent code and random Gaussian noise, where the added noise follows a Gaussian distribution with mean 0 and variance 1; finally the output is converted into an RGB image by a convolution layer with kernel size 1.

Further, the mapping network maps the input signal into a latent space variable ω, and an affine transformation then converts the latent variable into a style variable y = (y_s, y_b), which is output to the progressive generation network G_progress; y_s and y_b are the scaling factor and the bias respectively. After each convolution module, the style variable controls the parameters of adaptive instance normalization:

AdaIN(x_i, y) = y_s,i * (x_i - μ(x_i)) / σ(x_i) + y_b,i

where each feature map x_i is regularized separately and then modulated by the style variable y; the dimension of y is therefore twice the number of feature maps in that network layer.
Further, step S104 specifically includes: the real image together with its category label and color style label is sent to the multi-stage discriminator for real/fake judgment; at the start of training, the multi-stage discriminator judges the real image with its matching label as real, and judges the clothing image generated by the disentanglement generator as fake.
Further, step S105 specifically includes: in each round of training, the parameters are optimized according to the discrimination value and the objective function. Specifically, features of the generated images are extracted with a VGG network, and a perceptual loss function is added during training:

L_percep = sum_i (1/M_i) * || F^(i)(x_true) - F^(i)(G(z, l)) ||_1

where F^(i) denotes the i-th layer of the VGG network, containing M_i elements. The overall objective function is:

min_G max_D V(D, G) + λ1 * L_FM + λ2 * L_percep

where λ1 and λ2 are hyper-parameters that need to be tuned during training. After each iteration of the network, the parameters of the disentanglement generator and the multi-stage discriminator are updated by gradient descent: the generator parameters θ_G are updated 5 times for every single update of the discriminator parameters θ_D, and the error is back-propagated.
A clothing image generation system based on a de-entangled neural network, comprising:
a user registration module: confirming the identity information of the user, and recording the clothes type and the color style preferred by the user;
an input conversion module: receiving a clothing design requirement input by a user, and converting the clothing design requirement into an input corresponding to a model;
a clothing image design generation module: the converted clothing design requirements of the user are sent into a trained model, a network receiving noise signal and condition input are generated through disentanglement, and a corresponding clothing image is output;
a display module: the system displays the clothing image output by the disentanglement network to the user, and the user can select the favorite and satisfactory image.
The method adopts a conditional disentanglement generation network, which preserves elements such as the textures and patterns of the clothes well, enables customized design according to user requirements, and improves the diversity and specificity of the user experience; with adversarial generation, the generator and discriminator improve together, so the quality and effect of the generated images keep improving; once the algorithm is trained, the system can efficiently generate customized clothing images in large quantities and carry out clothing design according to user conditions, producing the clothing category and color style required by the user.
Drawings
FIG. 1 is a schematic block diagram of the process flow of the present invention;
FIG. 2 is a flow schematic block diagram of the system architecture of the present invention;
FIG. 3 is a schematic representation of a sample garment image used in the training of the method of the present invention;
FIG. 4 is a schematic diagram of the overall structure of a disentanglement network used in the training of the method of the present invention;
FIG. 5 is a schematic diagram of a multi-stage discriminator network in a de-entanglement network used in the training of the method of the present invention;
fig. 6 is a diagram of the effects actually produced by the system of the present invention.
Detailed Description
In order to make the objects, technical solutions and technical effects of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings.
A clothing image generation system and method based on a disentanglement network, mainly intended to assist clothing designers and provide clothing image designs for consumers. The method is as follows:
the system receives the clothing generation requirement of the user in advance, converts the requirement into a label vector L, sends the label vector L into a pre-trained neural network, and generates and displays clothing images according to the input conditions of the user.
The neural network is a conditional generative adversarial network and mainly comprises two parts: 1) the disentanglement garment-image generator, a generative model that captures the image distribution of the original data set and generates garment images ever closer to that distribution according to the input conditions; 2) the multi-stage discriminator, which judges whether its input is real or fake: it judges a real garment image with a specific category attribute as real, and a garment image produced by the generative model as fake. The generative model continuously improves its parameters according to the discrimination results, and finally generates clothing images whose quality and category attributes meet the requirements of the user.
More specifically, as shown in fig. 1, a clothing image generation method based on a disentanglement network includes the following steps:
s101, acquiring a plurality of clothing images with category labels.
The clothing images are various garment images with white backgrounds and a size of 512 x 512; the clothing categories can be suits, shirts, one-piece dresses, sweaters and the like. Each garment has a specific type, and a one-hot encoding method is adopted, for example 1 represents a sweater and 4 represents a suit. A specific code is set for each category and converted into a label vector in which the corresponding position is 1 and the remaining positions are 0; this code serves as the input of the conditional generation network.
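As a minimal sketch of the label encoding described above, concatenating a category one-hot vector with a color one-hot vector might look as follows (the vocabularies and their ordering are hypothetical, since the patent only gives a few examples):

```python
# Hypothetical vocabularies; the patent's full category/color code tables are not given.
CATEGORIES = ["sweater", "shirt", "dress", "suit"]
COLORS = ["red", "green", "white", "black"]

def one_hot(label, vocab):
    # The position of the label is set to 1, all other positions to 0.
    vec = [0.0] * len(vocab)
    vec[vocab.index(label)] = 1.0
    return vec

def cascade_labels(category, color):
    # "Cascading" the color label with the category label = concatenation.
    return one_hot(category, CATEGORIES) + one_hot(color, COLORS)

print(cascade_labels("sweater", "green"))
# [1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
```

The concatenated vector, together with a noise vector z, is what the conditional network would consume as its label input.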
S102, acquiring a color label of the clothing image, and cascading with the clothing category label.
The main color styles of the garments include white, black, green and so on, and each garment is given a color label. Specifically, the dominant color label of each clothing image can be obtained with the OpenCV tool; the color of each clothing image is encoded separately, for example red is coded 1 and green is coded 5, and then converted into the corresponding color label by one-hot encoding. The color label is then concatenated with the category label of each clothing image and used as the condition of the conditional adversarial generation network, controlling the generation result.
The OpenCV tool converts the RGB model of the clothing image into an HSV model, where H ∈ [0, 180), S ∈ [0, 255] and V ∈ [0, 255] represent the hue, saturation and brightness of the image respectively.
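A rough sketch of this conversion, using only the standard library rather than OpenCV itself but scaled to OpenCV's 8-bit HSV ranges:

```python
import colorsys

def to_opencv_hsv(r, g, b):
    """Convert an 8-bit RGB pixel to OpenCV's 8-bit HSV convention:
    H in [0, 180), S in [0, 255], V in [0, 255]."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return int(round(h * 180)) % 180, int(round(s * 255)), int(round(v * 255))

# Hue alone is often enough to bucket a dominant color into a coarse label.
print(to_opencv_hsv(255, 0, 0))  # pure red   -> (0, 255, 255)
print(to_opencv_hsv(0, 255, 0))  # pure green -> (60, 255, 255)
```

With OpenCV itself, `cv2.cvtColor(img, cv2.COLOR_BGR2HSV)` produces the same ranges; note that OpenCV images are BGR-ordered by default.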
S103, training a de-entanglement neural network, and initializing a de-entanglement generator parameter and a discriminator network parameter of the de-entanglement neural network.
Training data required by a pre-trained de-entanglement neural network of the system are original garment images, and preprocessing and classified encoding are carried out on each original garment image;
In the training stage, the clothing data set is used to train a disentanglement neural network consisting of a disentanglement generator and a multi-stage discriminator. The inputs of the disentanglement neural network are the category label l_class of the clothing image, the color label l_color, and a random noise variable z; the disentanglement generator outputs G(z, (l_class, l_color)), and the multi-stage discriminator outputs log(D(G(z, (l_class, l_color)))) and log(D(I_real)), where I_real is the concatenation of the real garment image and its label, corresponding respectively to the multi-stage discriminator's discrimination of the generated picture and of the real picture;

The adversarial objective is

min_G max_D V(D, G) = E_{x_true ~ p_data}[log D(x_true, l_true)] + E_{z ~ p_z}[log(1 - D(G(z, l), l))]

where E_{x_true ~ p_data}[.] and E_{z ~ p_z}[.] are the expected discrimination results over the true distribution and over the distribution generated by the disentanglement generator respectively; max_D and min_G denote the discriminator maximizing the discrimination expectation of the true distribution and the disentanglement generator minimizing the discrimination expectation of the generated distribution; l_true, x_true and l denote, respectively, the label of the real clothing image, the real clothing image itself, and the label used to generate the clothing image.
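Numerically, the two sides of this minimax game reduce to binary cross-entropy terms. A toy sketch (the discriminator outputs below are assumed probabilities, not the patent's actual network outputs):

```python
import math

def discriminator_loss(d_real, d_fake):
    # D maximizes log D(x_true, l_true) + log(1 - D(G(z, l), l)),
    # i.e. minimizes the negative of that sum.
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def generator_loss(d_fake):
    # G minimizes log(1 - D(G(z, l), l)); the commonly used
    # non-saturating form instead minimizes -log D(G(z, l), l).
    return -math.log(d_fake)

# A confident discriminator (real -> 0.9, fake -> 0.1) has low loss;
# a fooled one (real -> 0.5, fake -> 0.5) has a higher loss.
print(discriminator_loss(0.9, 0.1))  # ~0.211
print(discriminator_loss(0.5, 0.5))  # ~1.386
```

Alternating these two updates is exactly the adversarial iteration described in step S105.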
In order to improve the effect of the GAN loss function, the multi-stage discriminator uses an intermediate feature matching process and can output intermediate features: the intermediate feature outputs of the multi-stage discriminator for the generated image are matched with its intermediate feature outputs for the real image. The feature matching loss function is:

L_FM(G, D_k) = E[ sum_{i=1}^{T} (1/N_i) * || D_k^(i)(x_true, l) - D_k^(i)(G(z, l), l) ||_1 ]

where T is the total number of network layers, N_i denotes the number of elements in the i-th layer, and D_k represents one of the two sub-discriminators used as a feature extractor; the feature matching loss is minimized only when training the disentanglement generator.
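The per-layer L1 matching can be sketched as follows (the nested lists stand in for a sub-discriminator's flattened intermediate activations; an illustrative sketch, not the patent's implementation):

```python
def feature_matching_loss(feats_real, feats_fake):
    """Sum over T layers of (1/N_i) * ||D_k^(i)(real) - D_k^(i)(fake)||_1,
    where feats_* are per-layer lists of flattened feature values."""
    total = 0.0
    for layer_real, layer_fake in zip(feats_real, feats_fake):
        n_i = len(layer_real)  # number of elements N_i in this layer
        total += sum(abs(a - b) for a, b in zip(layer_real, layer_fake)) / n_i
    return total

# Identical features across two layers give zero loss.
print(feature_matching_loss([[1.0, 2.0], [3.0]], [[1.0, 2.0], [3.0]]))  # 0.0
print(feature_matching_loss([[1.0, 2.0]], [[0.0, 0.0]]))                # 1.5
```

Minimizing this term pushes the generator to match the real images layer by layer inside the discriminator, which tends to stabilize GAN training.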
The disentanglement generator can extract the category features and style features of the clothing image and disentangle the features of garment patterns and textures. The generation network G is mainly composed of a mapping network ω and a progressive generation network G_progress, where G_progress is built from convolution modules with adaptive instance normalization modules.
Random noise z and the label code L serve as the input of the mapping network ω, while the intermediate latent code and adaptive random noise serve as the input of the progressive generation network G_progress. For an image with a resolution of 512 x 512, G_progress has 16 layers, and the final convolution layer converts the result to an RGB image using a 1 x 1 convolution kernel. The mapping network ω outputs the intermediate latent code ψ = (ψ_style, ψ_bias), which controls the parameters of the adaptive instance normalization module. The normalization function is:

AdaIN(x_i, ψ) = ψ_style,i * (x_i - μ(x_i)) / σ(x_i) + ψ_bias,i

where each feature map x_i is normalized separately, then scaled by the intermediate latent code ψ_style and shifted by the offset ψ_bias; the dimension of the intermediate latent code ψ is therefore twice that of the feature maps x.
The mapping network ω is composed of 6 fully-connected network layers with input and output sizes of 512 x 512; its input is a constant together with the one-hot label code, which it maps to the intermediate latent space. The mapped variable then passes through convolution modules with adaptive instance normalization; each convolution module can be regarded as an up-sampling module, and the progressive generation network is composed of these up-sampling modules, which up-sample by linear interpolation with a magnification factor of 2. The convolution kernel size of the convolution layers is 3 x 3, and the number of channels decreases as the network stacks up; each convolution layer is followed by a ReLU nonlinear transformation layer and a batch normalization layer. Gaussian noise is added at the end of each layer to increase the randomness of the generated image: the inputs are the intermediate latent features and random Gaussian noise, where the random Gaussian noise follows a Gaussian distribution with mean 0 and variance 1. Finally, the output is converted into RGB data, i.e. the final output image, by a convolution layer with kernel size 1.
The convolution module contains an adaptive instance normalization module, which uses the style variable y = (y_s, y_b) output by the mapping network for instance normalization. The mapping network maps the input signal into a latent space variable ω, which is then converted into the style variable y = (y_s, y_b); y_s and y_b are the scaling factor and the bias respectively. After each convolution module, the style variable controls the parameters of adaptive instance normalization, whose operation can be expressed as:

AdaIN(x_i, y) = y_s,i * (x_i - μ(x_i)) / σ(x_i) + y_b,i

where each feature map x_i is independently regularized and then modulated with the style variable y, so the dimension of y is twice the number of feature maps in that network layer; finally, a noise signal is added directly to increase the randomness of the generated result.
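A minimal numeric sketch of this adaptive instance normalization operation on a single feature map (standard library only; a real implementation would operate on whole tensors per channel):

```python
import math

def adain(feature_map, y_scale, y_bias, eps=1e-8):
    """y_s * (x_i - mean(x_i)) / std(x_i) + y_b for one feature map x_i."""
    n = len(feature_map)
    mu = sum(feature_map) / n
    var = sum((v - mu) ** 2 for v in feature_map) / n
    std = math.sqrt(var + eps)  # eps guards against a constant feature map
    return [y_scale * (v - mu) / std + y_bias for v in feature_map]

out = adain([1.0, 2.0, 3.0], y_scale=2.0, y_bias=5.0)
# After normalization the map has mean 0 and unit std; the style then
# rescales its spread by y_scale and shifts its mean to y_bias.
print(out)
```

This is why the style vector has twice the dimension of the feature maps: one scale and one bias per map.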
The multi-stage discriminator is a down-sampling network with a local discriminator and a global discriminator, for an image with the image size of 512 x 512, the local discriminator performs down-sampling on the features with the size of 256 x 256, the global discriminator performs down-sampling on the whole image, and the final sampling results are combined to obtain the discrimination results of the multi-stage discriminator. In order to ensure the stability of the GAN network during training, all network layers of the de-entanglement generator and the multi-stage discriminator are subjected to spectrum normalization, the weight initialization of all network layers is subjected to Gaussian distribution, the mean value is 0, and the variance is 1.
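The two-scale combination described above can be sketched as follows (the sub-discriminators here are stand-in callables, and the combination rule, averaging, is an assumption since the patent only says the sampled results are combined):

```python
def average_pool_2x(img):
    # Downsample a 2-D list of floats by averaging 2x2 blocks.
    h, w = len(img), len(img[0])
    return [[(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
             for c in range(0, w, 2)]
            for r in range(0, h, 2)]

def multiscale_score(img, global_d, local_d):
    # The global discriminator sees the full image, the local discriminator
    # a downsampled copy; their scores are combined (here: averaged).
    return 0.5 * (global_d(img) + local_d(average_pool_2x(img)))

mean_score = lambda im: sum(sum(row) for row in im) / (len(im) * len(im[0]))
img = [[1.0, 3.0], [5.0, 7.0]]
print(multiscale_score(img, mean_score, mean_score))  # 4.0
```

Judging the same image at two resolutions lets the discriminator penalize both global structure errors and local texture errors.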
And S104, inputting the real image and the clothing image generated by the disentanglement generator into the discriminator for judgment.
The main objective of the disentanglement generator is to generate a false image that is "indistinguishable" from the true image, and the objective of the discriminator is to determine the true image as true and the false image as false.
Specifically, the real image together with its category label and color style label is sent to the multi-stage discriminator for real/fake judgment. At the start of training, the multi-stage discriminator judges the real image with its matching label as real, i.e. 1, and judges the clothing image generated by the disentanglement generator as fake, i.e. 0. The disentanglement generator continuously improves the quality of the generated clothing images in the hope of fooling the discriminator, while the discriminator continuously improves its discrimination ability to distinguish real images from fake ones accurately; the two confront each other, and finally the images generated by the disentanglement generator can confuse the discriminator.
S105, adjusting and optimizing the parameters of the disentanglement network according to the discrimination value and the output image. The disentanglement generator and the multi-stage discriminator are iterated alternately, and the global objective function during training is expressed as:

min_G max_D V(D, G) = E_{x_true ~ p_data}[log D(x_true, l_true)] + E_{z ~ p_z}[log(1 - D(G(z, l), l))]

This describes the alternating iterative process in which the discriminator minimizes the discrimination value of the distribution of images produced by the disentanglement generator and maximizes the discrimination expectation of real images.
In each round of training, the parameters are optimized according to the discrimination value and the objective function. To improve the quality of image generation, a VGG network is used to extract features of the generated images, and a perceptual loss function is added during training:

L_percep = sum_i (1/M_i) * || F^(i)(x_true) - F^(i)(G(z, l)) ||_1

where F^(i) denotes the i-th layer of the VGG network, containing M_i elements. The overall objective function is:

min_G max_D V(D, G) + λ1 * L_FM + λ2 * L_percep

where λ1 and λ2 are hyper-parameters that need to be tuned during training; the parameters of the disentanglement generator and the discriminator are updated by gradient descent.
To speed up the training process, the overall objective function is optimized by updating the disentanglement generator parameters θ_G 5 times for every single update of the discriminator parameters θ_D, with the error back-propagated; the disentanglement generator and the discriminator are updated separately with a gradient descent algorithm to reduce the loss function. An Adam optimizer is used with initial learning rates of 0.001 for the disentanglement generator and 0.004 for the discriminator; the total number of training rounds is 20000, with the learning rate held constant during the first 10000 rounds and decayed linearly to 0 during the last 10000 rounds; the optimizer parameters are β1 = 0 and β2 = 0.999, and the weights are initialized from a Gaussian distribution with mean 0 and variance 0.01.
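The schedule described above, constant for the first half of training and then linearly decayed to zero, can be sketched together with the 5:1 update ratio (a simple sketch of the stated schedule, not the patent's actual code):

```python
def lr_schedule(step, total_rounds=20000, base_lr=0.001):
    """Learning rate: constant for the first half, linear decay to 0 after."""
    half = total_rounds // 2
    if step < half:
        return base_lr
    return base_lr * (1.0 - (step - half) / half)

def update_discriminator_now(generator_updates, ratio=5):
    # theta_G is updated `ratio` times for every single theta_D update.
    return generator_updates % ratio == 0

print(lr_schedule(5000))   # 0.001
print(lr_schedule(15000))  # 0.0005
print(lr_schedule(20000))  # 0.0
```

The same schedule would apply to both optimizers, each starting from its own base learning rate (0.001 for the generator, 0.004 for the discriminator).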
As shown in fig. 2, the system using the clothing image generation method based on the disentanglement network includes:
a user registration module: confirming the identity information of the user, and recording the clothes type and the color style preferred by the user;
an input conversion module: receiving a clothing design requirement input by a user, and converting the clothing design requirement into an input corresponding to a model;
a clothing image design generation module: the converted clothing design requirements of the user are sent into a trained model, a network receiving noise signal and condition input are generated through disentanglement, and a corresponding clothing image is output;
a display module: the system displays the clothing image output by the disentanglement network to the user, and the user can select the favorite and satisfactory image.
By processing images of different data sets, experimental results verifying the scheme provided by the embodiments of the application were obtained; the data set and the experimental results are introduced as follows:
The data set used by the invention is the one provided with the Apparel Classification with Style (ACWS) article; it contains more than 80000 pictures covering 15 garment types. The pictures are unified to a size of 512 × 512 by scaling, stretching, bilinear interpolation, and similar methods. The label of each garment type and the garment image are fed simultaneously into the multi-stage discriminator for judgment. After 20000 rounds of iterative training, the experimental results are tested, as shown in figure 6: the trained model forms the algorithm module of the clothing image generation system, the user inputs different garment type labels and color style labels, and the system generates the clothing image specified by the user.
In order to demonstrate the effectiveness of the invention in feature disentanglement, ablation experiments were also performed, showing that the disentanglement generator and the multi-stage discriminator contribute greatly to the quality of the generated clothing images. The Inception Score (IS) is an objective evaluation index commonly used for generative models, a higher score indicating a better generative model; the IS was used to evaluate the results of the invention, and the evaluation results are shown in the following table:
Method | Inception Score (IS)
---|---
Without disentanglement generator (convolution with up-sampling only) | 1.7894 ± 0.1136
Without multi-stage discriminator | 1.9347 ± 0.1220
The invention | 2.2010 ± 0.0884
It can be seen that the method adopted by the invention attains the highest IS value, i.e. the best effect. However, for a generative model it is not enough to evaluate with the IS value alone; to illustrate the perceptual quality of the generated images, the Learned Perceptual Image Patch Similarity (LPIPS) was also measured, a lower value indicating a better perceptual match. The evaluation results are shown in the following table:
Method | LPIPS
---|---
Without disentanglement generator (convolution with up-sampling only) | 0.1297
Without multi-stage discriminator | 0.1349
The invention | 0.1126
The LPIPS value of the method adopted by the invention is the lowest, indicating the perception closest to the real image. Together, the two quantitative evaluation indices show that the method adopted by the invention achieves the best results.
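For reference, the Inception Score reported above is defined as IS = exp(E_x KL(p(y|x) || p(y))), computed from a classifier's per-image class probabilities; a minimal NumPy sketch with toy inputs in place of real classifier outputs:

```python
import numpy as np

def inception_score(probs):
    """IS = exp(mean over images of KL(p(y|x) || p(y))), where probs has
    shape (n_images, n_classes) and each row is a class distribution."""
    probs = np.asarray(probs, dtype=np.float64)
    marginal = probs.mean(axis=0)  # p(y): the marginal class distribution
    kl = (probs * (np.log(probs + 1e-12) - np.log(marginal + 1e-12))).sum(axis=1)
    return float(np.exp(kl.mean()))

# uniform predictions carry no class information -> IS = 1 (the minimum)
print(round(inception_score(np.full((10, 5), 0.2)), 4))  # 1.0
# confident, diverse predictions -> IS approaches the class count
print(round(inception_score(np.eye(5)), 4))              # 5.0
```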
The scheme provided by the invention can be applied to fields such as electronic commerce, application software, and the garment design industry. It should be noted that all of the above optional technical solutions may be combined arbitrarily to form optional embodiments of the present application, which are not described in detail herein.
Claims (10)
1. A clothing image generation method based on a disentanglement neural network, characterized by comprising the following steps:
S101, acquiring a plurality of clothing images with category labels;
S102, acquiring a color label of each clothing image, and concatenating the color label with the clothing type label;
S103, training a disentanglement neural network, and initializing the disentanglement generator parameters and the discriminator network parameters of the disentanglement neural network;
S104, inputting the concatenated labels into the disentanglement neural network, and discriminating between the real image and the clothing image generated by the disentanglement generator;
and S105, adjusting and optimizing the disentanglement network parameters according to the judgment value and the output clothing image.
2. The clothing image generation method based on the disentanglement neural network as claimed in claim 1, wherein the class label and the color label of the clothing image are obtained by a one-hot coding method, and the classification of the clothing image color is performed by converting an RGB model of the clothing image into an HSV model by using an OpenCV tool.
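A dependency-free sketch of the color-labelling step in this claim, using Python's standard-library `colorsys` in place of OpenCV's RGB-to-HSV conversion; the hue thresholds and label set here are illustrative assumptions, not taken from the patent:

```python
import colorsys

def dominant_color_label(r, g, b):
    """Classify an RGB color (0-255 per channel) into a coarse color label
    via HSV. Thresholds are illustrative, not from the patent."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    if v < 0.2:
        return "black"
    if s < 0.15:
        return "white" if v > 0.8 else "gray"
    hue_deg = h * 360.0
    for name, upper in [("red", 20), ("yellow", 70), ("green", 160),
                        ("blue", 260), ("purple", 330)]:
        if hue_deg < upper:
            return name
    return "red"  # hue wraps back around to red

def one_hot(label, labels):
    """One-hot encode a label, ready to concatenate with the garment-type code."""
    return [1 if l == label else 0 for l in labels]

print(dominant_color_label(200, 30, 30))          # red
print(one_hot("red", ["red", "green", "blue"]))   # [1, 0, 0]
```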
3. The clothing image generation method based on the disentanglement neural network according to claim 1, wherein step S103 specifically comprises:
S103_1, training the disentanglement neural network, which is a conditional adversarial network comprising a disentanglement generator G and a multi-stage discriminator D; the disentanglement generator generates a picture after extracting the pattern features of the clothing image, and the multi-stage discriminator discriminates between real pictures and generated pictures. The inputs of the disentanglement neural network are the class label l_class of the clothing image, the color label l_color, and a random noise variable z; the disentanglement generator outputs G(z, (l_class, l_color)), and the multi-stage discriminator outputs log D(G(z, (l_class, l_color))) and log D(I_real), corresponding to its discrimination results on the generated picture and the real picture respectively, where I_real is the concatenation of the real clothing image and its label;
wherein E_{x_true~p_data(x)}[·] and E_{z~p_z(z)}[·] denote the expectations of the discrimination results under the true distribution and under the distribution produced by the disentanglement generator, respectively; the training process is the min–max game

min_G max_D V(D, G) = E_{x_true~p_data(x)}[log D(x_true, l_true)] + E_{z~p_z(z)}[log(1 − D(G(z, l), l))]

in which the discriminator maximizes the discrimination expectation of the true distribution while the disentanglement generator minimizes the discrimination expectation of the generated distribution; l_true, x_true, and l respectively denote the label of the real clothing image, the real clothing image, and the label used to generate the clothing image;
S103_2, applying spectral normalization to all network layers of the disentanglement generator and the multi-stage discriminator, wherein the weight initialization of all network layers follows a Gaussian distribution with mean 0 and variance 1.
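Spectral normalization, as applied in S103_2, divides each weight matrix by an estimate of its largest singular value; a minimal NumPy sketch using power iteration (illustrative only — in practice this is applied per layer on every training step, typically with a single cached iteration):

```python
import numpy as np

def spectral_normalize(w, n_iter=50):
    """Divide a weight matrix by its largest singular value,
    estimated with power iteration."""
    rng = np.random.default_rng(0)
    u = rng.standard_normal(w.shape[0])
    for _ in range(n_iter):
        v = w.T @ u
        v /= np.linalg.norm(v) + 1e-12
        u = w @ v
        u /= np.linalg.norm(u) + 1e-12
    sigma = u @ w @ v  # estimated largest singular value
    return w / sigma

w = np.random.default_rng(1).standard_normal((16, 8))
w_sn = spectral_normalize(w)
# the normalized matrix has spectral norm ~1
print(round(float(np.linalg.svd(w_sn, compute_uv=False)[0]), 4))  # 1.0
```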
4. The clothing image generation method based on the disentanglement neural network according to claim 3, wherein the multi-stage discriminator is composed of a local discriminator and a global discriminator, which perform down-sampling discrimination on the real picture and the generated picture at two different scales, respectively; the sampled results are finally combined to obtain the discrimination result of the multi-stage discriminator.
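A toy sketch of the two-scale judgment in this claim: the same image is scored at full resolution and after 2× down-sampling, and the two judgments are combined. `score_fn` stands in for the actual discriminator networks, which are not shown; names and the averaging rule are assumptions:

```python
import numpy as np

def downsample2(img):
    """2x average-pool down-sampling of a 2-D image (H and W assumed even)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def multi_scale_score(img, score_fn):
    """Score the image at full scale and at a 2x-downsampled scale,
    then combine the two judgments by averaging."""
    return (score_fn(img) + score_fn(downsample2(img))) / 2.0

img = np.arange(16.0).reshape(4, 4)
mean_score = lambda x: float(x.mean())  # trivial stand-in discriminator
print(multi_scale_score(img, mean_score))  # 7.5
```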
5. The clothing image generation method based on the disentanglement neural network according to claim 3, wherein the intermediate feature outputs of the multi-stage discriminator for the generated image are matched with those for the real image, the feature matching loss function being

L_FM(G, D) = E Σ_i (1/N_i) ||D^(i)(x_true, l_true) − D^(i)(G(z, l), l)||_1

where D^(i) denotes the i-th layer of the multi-stage discriminator and N_i the number of elements in that layer.
6. The clothing image generation method based on the disentanglement neural network according to claim 3, wherein the disentanglement generator is composed of a mapping network ω and a progressive generation network G_progress; the mapping network ω is composed of fully-connected network layers, takes the random noise and the label code as input, maps them into an intermediate latent space, and outputs an intermediate latent code ψ = (ψ_style, ψ_bias) that controls the parameters of an adaptive instance normalization layer, the normalization function being

AdaIN(x_i, ψ) = ψ_style · (x_i − μ(x_i)) / σ(x_i) + ψ_bias

wherein each feature map x_i is normalized separately, then scaled by the intermediate latent code ψ_style and shifted by the bias ψ_bias; the progressive generation network G_progress is a stack of convolution modules with adaptive instance normalization layers, each convolution module performing up-sampling by linear interpolation with an amplification factor of 2; its inputs are the intermediate latent code and random Gaussian noise, the added noise following a Gaussian distribution with mean 0 and variance 1; finally the output is converted into an RGB image through a convolution layer with a 1 × 1 convolution kernel.
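The adaptive instance normalization operation in this claim can be sketched for a single feature map as follows (NumPy, illustrative only; in the network, ψ_style and ψ_bias come from the mapping network rather than being fixed constants):

```python
import numpy as np

def adain(x, psi_style, psi_bias, eps=1e-8):
    """Adaptive instance normalization of one feature map x:
    normalize to zero mean / unit variance, then scale by psi_style
    and shift by psi_bias."""
    normed = (x - x.mean()) / (x.std() + eps)
    return psi_style * normed + psi_bias

x = np.random.default_rng(0).standard_normal((8, 8))
y = adain(x, psi_style=2.0, psi_bias=0.5)
# the output statistics are set by the style parameters
print(round(float(y.mean()), 6), round(float(y.std()), 6))  # 0.5 2.0
```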
7. The method as claimed in claim 6, wherein the mapping network maps the input signal to a latent space variable ω, which an affine transformation then converts into a style variable y = (y_s, y_b) output to the progressive generation network G_progress, y_s and y_b being the scaling multiple and the bias, respectively; after each convolution module, the style variable controls the parameters of the adaptive instance normalization, the operation being

AdaIN(x_i, y) = y_s · (x_i − μ(x_i)) / σ(x_i) + y_b

wherein each feature map x_i is normalized separately and then modulated by the style variable y; the dimension of y is therefore twice the number of feature maps in the network layer.
8. The clothing image generation method based on the disentanglement neural network according to claim 1, wherein step S104 specifically comprises: the real image together with its category label and color style label is fed to the multi-stage discriminator to be judged true or false; at the start of training, the multi-stage discriminator judges the real image with its matching labels as true and the clothing image generated by the disentanglement generator as false.
9. The clothing image generation method based on the disentanglement neural network according to claim 1, wherein step S105 specifically comprises: in each training round, the parameters are optimized according to the judgment value and the objective function; specifically, features of the generated images are extracted through a VGG network, and a perceptual loss function is added to the training process:

L_perc = Σ_i (1/M_i) ||F^(i)(x_true) − F^(i)(G(z, (l_class, l_color)))||_1

where F^(i) denotes the i-th layer of the VGG network, containing M_i elements; the overall objective function is:

L = L_GAN + λ1 · L_FM + λ2 · L_perc

wherein λ1 and λ2 are hyper-parameters that need to be adjusted during training; after each iteration of the network, the parameters of the disentanglement generator and the multi-stage discriminator are updated by the gradient descent method, the generator parameters θ_G being updated five times for every single update of the discriminator parameters θ_D, with the error back-propagated.
10. A clothing image generation system based on a disentanglement neural network is characterized by comprising:
a user registration module: confirming the identity information of the user, and recording the clothes type and the color style preferred by the user;
an input conversion module: receiving a clothing design requirement input by a user, and converting the clothing design requirement into an input corresponding to a model;
a clothing image design generation module: the converted clothing design requirements of the user are fed into the trained model; the disentanglement generation network receives the noise signal and the condition input and outputs a corresponding clothing image;
a display module: the system displays the clothing image output by the disentanglement network to the user, and the user can select the favorite and satisfactory image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110304774.3A CN113052230A (en) | 2021-03-22 | 2021-03-22 | Clothing image generation system and method based on disentanglement network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113052230A true CN113052230A (en) | 2021-06-29 |
Family
ID=76514440
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110304774.3A Pending CN113052230A (en) | 2021-03-22 | 2021-03-22 | Clothing image generation system and method based on disentanglement network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113052230A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113722783A (en) * | 2021-07-08 | 2021-11-30 | 浙江海阔人工智能科技有限公司 | User-oriented intelligent garment design system and method based on deep learning model |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111785261A (en) * | 2020-05-18 | 2020-10-16 | 南京邮电大学 | Cross-language voice conversion method and system based on disentanglement and explanatory representation |
CN111951153A (en) * | 2020-08-12 | 2020-11-17 | 杭州电子科技大学 | Face attribute fine editing method based on generation of confrontation network hidden space deconstruction |
CN112100908A (en) * | 2020-08-31 | 2020-12-18 | 西安工程大学 | Garment design method for generating confrontation network based on multi-condition deep convolution |
Non-Patent Citations (3)
Title |
---|
GÖKHAN YILDIRIM ET AL.: "Disentangling Multiple Conditional Inputs in GANs", 《ARXIV》 * |
NEERAJ KUMAR ET AL.: "Robust One Shot Audio to Video Generation", 《2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW)》 * |
TERO KARRAS ET AL.: "A Style-Based Generator Architecture for Generative Adversarial Networks", 《ARXIV》 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Liu et al. | Adversarial learning for constrained image splicing detection and localization based on atrous convolution | |
CN109670528B (en) | Data expansion method facing pedestrian re-identification task and based on paired sample random occlusion strategy | |
CN110378985B (en) | Animation drawing auxiliary creation method based on GAN | |
CN110097609B (en) | Sample domain-based refined embroidery texture migration method | |
CN107977932A (en) | It is a kind of based on can differentiate attribute constraint generation confrontation network face image super-resolution reconstruction method | |
CN111798369B (en) | Face aging image synthesis method for generating confrontation network based on circulation condition | |
CN108960304B (en) | Deep learning detection method for network transaction fraud behaviors | |
CN112801015B (en) | Multi-mode face recognition method based on attention mechanism | |
CN111967930A (en) | Clothing style recognition recommendation method based on multi-network fusion | |
CN114708272A (en) | Garment image segmentation model establishing method and garment image segmentation method | |
CN109886281A (en) | One kind is transfinited learning machine color image recognition method based on quaternary number | |
Bounsaythip et al. | Genetic algorithms in image processing-a review | |
CN113052230A (en) | Clothing image generation system and method based on disentanglement network | |
CN109947960B (en) | Face multi-attribute joint estimation model construction method based on depth convolution | |
CN113724354B (en) | Gray image coloring method based on reference picture color style | |
CN111400525A (en) | Intelligent fashionable garment matching and recommending method based on visual combination relation learning | |
CN113722783A (en) | User-oriented intelligent garment design system and method based on deep learning model | |
CN110070587B (en) | Pedestrian image generation method based on conditional cascade confrontation generation network | |
CN116681921A (en) | Target labeling method and system based on multi-feature loss function fusion | |
Al Sasongko et al. | Application of Gray Scale Matrix Technique for Identification of Lombok Songket Patterns Based on Backpropagation Learning | |
CN114299184B (en) | Hidden building colored drawing line manuscript painting method and device based on semantic matching | |
CN111754459B (en) | Dyeing fake image detection method based on statistical depth characteristics and electronic device | |
CN111680760A (en) | Clothing style identification method and device, electronic equipment and storage medium | |
CN113658285A (en) | Method for generating face photo to artistic sketch | |
Zhang et al. | Supplementary meta-learning: Towards a dynamic model for deep neural networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20210629 |