CN113793397B - Garment attribute editing method based on improved WGAN - Google Patents


Info

Publication number
CN113793397B
CN113793397B CN202110871983.6A
Authority
CN
China
Prior art keywords
network
clothing
wgan
garment
data
Prior art date
Legal status
Active
Application number
CN202110871983.6A
Other languages
Chinese (zh)
Other versions
CN113793397A (en)
Inventor
张建明
王文靖
王志坚
Current Assignee
Yuyao Zhejiang University Robot Research Center
Zhejiang University ZJU
Original Assignee
Yuyao Zhejiang University Robot Research Center
Zhejiang University ZJU
Priority date
Filing date
Publication date
Application filed by Yuyao Zhejiang University Robot Research Center, Zhejiang University ZJU filed Critical Yuyao Zhejiang University Robot Research Center
Priority to CN202110871983.6A priority Critical patent/CN113793397B/en
Publication of CN113793397A publication Critical patent/CN113793397A/en
Application granted granted Critical
Publication of CN113793397B publication Critical patent/CN113793397B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 - 2D [Two Dimensional] image generation
    • G06T11/001 - Texturing; Colouring; Generation of texture or colour
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 - Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 - Computing systems specially adapted for manufacturing

Abstract

The application belongs to the field of computer vision and discloses a garment attribute editing method based on an improved WGAN. The method can generate new clothing attributes from existing ones, achieving the goal of intelligent computer-aided garment design. Relative to the original WGAN network, the application adds a cross-connection residual structure to the generator's decoding network; the residual structure promotes gradient flow, fuses shallow features, and alleviates gradient vanishing. Secondly, on the basis of the original WGAN loss function, an L2-norm term is added to the reconstruction loss; its main advantage is a smoother gradient, which alleviates gradient vanishing and gradient explosion. By optimizing the network structure and the loss function, the improved WGAN performs better than before. The method features fast end-to-end generation, many editable attributes, and good results.

Description

Garment attribute editing method based on improved WGAN
Technical Field
The application belongs to the field of computer vision, and particularly relates to a clothing attribute editing method based on an improved WGAN.
Background
Before standard garment sizes were adopted, all garments were tailored by hand to the individual's figure. With the development of standard clothing sizes, sewing machines, and modern industry and commerce, the time and cost of mass-producing clothing fell sharply, and in recent decades line-produced ready-made garments have been very popular with consumers: on one hand because their low price makes them affordable, and on the other because the variety of styles gives consumers more choice. However, such garments are typically designed to standard sizes, and in practice it is difficult for most non-standard-sized consumers to find a fully fitted, satisfactory garment.
As the modern economy has grown rapidly, the clothing industry has developed quickly and people's expectations of dress have risen. Beyond the basic functional requirements of the past, such as protection, comfort, and warmth, people increasingly demand fashion and uniqueness in clothing style; clothing that suits the wearer brings confidence and satisfaction. Selection criteria are therefore gradually shifting from popular standards of comfort and attractiveness toward personalized design that, on the basis of fit, comprehensively considers personal figure, temperament, and aesthetic preference, and personalized customization and consumption of clothing are becoming a future trend.
Artificial intelligence has developed rapidly in recent years and has begun to serve the apparel industry. AI techniques have been explored for the detection and segmentation of fashion images, virtual fitting, clothing recommendation, fashion trend prediction, and so on. However, there has been little research on designing the attributes of garments themselves. The development of generative adversarial network (GAN) technology in recent years, in fields such as facial expression editing, offers much inspiration for clothing attribute design. This application therefore explores deep-learning-based intelligent garment design to improve clothing attributes.
Traditional garment design is based on a designer's inspiration or on collocation with an auxiliary material library, and depends entirely on the designer's skill. Academia is also researching the application of artificial intelligence to intelligent aided design, for which the first task is to build a high-quality clothing dataset. The DeepFashion garment dataset was established by the Multimedia Laboratory team at the Chinese University of Hong Kong led by Xiaoou Tang; it contains clothing attribute labels and provides data support for training deep learning algorithms for intelligent garment design. The Fashionpedia dataset, a garment attribute segmentation dataset established by Cornell University and Google, comprises 48,825 garment images in total, finely subdivides garment items, and provides fine-grained attributes for the subdivided categories; it unifies instance segmentation and visual attribute recognition, laying a solid foundation for a structural understanding of clothing attributes. Based on such datasets, Wei-Lin Hsiao of UT Austin proposed the Fashion++ clothing design network, which uses texture and shape encoders to extract and combine the corresponding garment attribute features; the algorithm uses a generative neural network to learn clothing attribute codes to synthesize garments, and then explicitly decomposes the latent code by shape and texture to realize attribute editing. That algorithm focuses mainly on editing the overall shape of the garment and does not edit local attributes such as sleeves and collars. Ping Q. et al. proposed the Fashion-AttGAN algorithm, which focuses on editing upper-garment attributes such as collar and sleeve length; it encodes certain attributes and then trains with AttGAN for generation.
These algorithms open new possibilities for user-driven apparel design and may benefit virtual try-on, clothing recommendation, visual search, and the like, but the generation quality of current algorithms still needs improvement.
Disclosure of Invention
The present application addresses the above problems by providing a garment attribute editing method based on an improved WGAN. The method can generate new clothing attributes from existing ones, achieving the goal of intelligent computer-aided garment design.
In order to solve the technical problems above, the specific technical scheme of the garment attribute editing method based on the improved WGAN is as follows:
the garment attribute editing method based on the improved WGAN is divided into a training stage and a testing stage; the training stage is optimized in a supervised learning mode, and the testing stage uses the converged network to generate clothing attributes. The method specifically comprises the following steps:
step 1: in the training stage, first collect garment images of various styles to construct a clothing dataset, then label the attributes of each clothing sample; the attributes comprise three primary-granularity attributes: sleeve length, color, and collar;
step 2: input the training data with attribute labels into a generative adversarial network for training.
Further, the generative adversarial network in step 2 consists of a generator network and a discriminator network. The generator network comprises an encoder and a decoder, which learn the garment's attribute features from the training data and reconstruct the input garment data, over a normal distribution, from the learned feature expression. The discriminator network consists of a decoder and discriminators; its decoder shares weights with the generator network's decoder and reconstructs false clothing data from the generator network, which is then input, together with the original data and labels, to the discriminators for analysis and judgment. The generator network and the discriminator network maintain a Nash equilibrium.
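The described generator can be sketched in PyTorch, the framework stated later in the document. This is a minimal illustration only: the layer sizes, the 64x64 input, the 14-dimensional attribute vector, and the module names are assumptions, not the patented configuration, and the shared-weight discriminator decoder is omitted for brevity.

```python
import torch
import torch.nn as nn

class Genc(nn.Module):
    """Encoder: learns garment attribute features (illustrative layer sizes)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),   # 64 -> 32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),  # 32 -> 16
        )
    def forward(self, x):
        return self.net(x)

class Gdec(nn.Module):
    """Decoder: reconstructs the garment conditioned on an attribute vector."""
    def __init__(self, n_attr=14):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(64 + n_attr, 32, 4, stride=2, padding=1), nn.ReLU(),  # 16 -> 32
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Tanh(),            # 32 -> 64
        )
    def forward(self, z, attr):
        # Broadcast the attribute vector over the spatial feature map.
        a = attr.view(attr.size(0), -1, 1, 1).expand(-1, -1, z.size(2), z.size(3))
        return self.net(torch.cat([z, a], dim=1))

enc, dec = Genc(), Gdec()
x0 = torch.randn(2, 3, 64, 64)              # a batch of garment images X0
attr = torch.zeros(2, 14); attr[:, 0] = 1   # e.g. "high collar" set to 1
x1 = dec(enc(x0), attr)                     # reconstructed picture X1
```

The reconstruction `x1` has the same shape as the input, so reconstruction and discriminator losses can be computed directly between `x0` and `x1`.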
Further, the three primary-granularity attributes of sleeve length, color, and collar comprise 14 fine-granularity attributes: high collar, V-shaped collar, round collar, red, orange, yellow, green, cyan, blue, purple, long sleeve, half sleeve, short sleeve, and sleeveless.
Further, step 1 comprises the following specific steps:
first, collect garment images of various styles to construct a clothing dataset X, then label the attributes of each clothing sample. The collar attribute is labeled as high collar, V-shaped collar, or round collar; the color attribute as red, orange, yellow, green, cyan, blue, or purple; and the sleeve length attribute as long sleeve, half sleeve, short sleeve, or sleeveless.
Further, step 2 comprises the following specific steps:
the training data X0 with labels is input into a generative adversarial network Z for training. The network Z consists of a generator network and a discriminator network. The generator network comprises an encoder Genc and a decoder Gdec, which learn the garment's attribute features from the training data and then reconstruct the input garment data X0, over a normal distribution, from the learned feature expression, yielding a reconstructed picture X1. The generator network computes a reconstruction loss L between the reconstructed picture and the original input sample, and optimizes its reconstruction effect by minimizing L.
The discriminator network consists of the decoder Gdec and two discriminators, C and D. The decoder Gdec of the discriminator network shares weights with the decoder Gdec of the generator network and reconstructs false clothing data X2 from the generator network Z; X2 and the original data X0, with their corresponding labels, are then input to the discriminator D for discrimination.
Further, the discriminator D learns the features of the input data to judge whether an input picture is a real sample X0 or a reconstructed false sample X2, and is optimized through a cross-entropy loss function. The discriminator C judges, at finer granularity, which attribute classes the input false sample X2 exhibits and compares them with the attributes of the real sample X0; optimizing this cross-entropy loss drives the attributes produced by the generator toward those of the real samples.
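The two discriminator objectives can be sketched as follows. This is a hedged illustration: the tensors below stand in for network outputs, the binary/multi-label cross-entropy forms are a common reading of "cross entropy loss function", and any relative weighting of the two terms is not specified in the text.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
d_real = torch.rand(8, 1).clamp(1e-3, 1 - 1e-3)  # D's scores for real samples X0
d_fake = torch.rand(8, 1).clamp(1e-3, 1 - 1e-3)  # D's scores for false samples X2
c_fake = torch.randn(8, 14)                       # C's attribute logits for X2
attr_0 = torch.randint(0, 2, (8, 14)).float()     # attribute labels of real samples

# Coarse-grained true/false loss for D (binary cross-entropy):
# real samples should score 1, reconstructed fakes should score 0.
loss0 = (F.binary_cross_entropy(d_real, torch.ones_like(d_real)) +
         F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake)))

# Fine-grained attribute loss for C: push the attributes of generated
# samples toward those of the real sample (multi-label cross-entropy).
att_loss = F.binary_cross_entropy_with_logits(c_fake, attr_0)
```

In adversarial training the generator would minimize the attribute loss while trying to fool D, and the discriminators would minimize both terms on their own parameters.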
Further, the discriminator D is a coarse-grained discriminator that judges whether generated data is real or false, and the discriminator C is a fine-grained discriminator that judges the attributes of false samples.
Further, a cross-connection residual structure is added to the decoding network of the generator on the basis of WGAN.
Further, an overall loss function is established, composed of three parts: the reconstruction loss L, the true/false discrimination loss loss0, and the attribute loss att_loss, as shown below:
L_all = L + att_loss + loss0
further, the weight of the generated network Z is frozen, then a clothing picture without an attribute tag is input, the network is improved to generate a new clothing style with new attribute, and the intelligent clothing editing target is realized.
The garment attribute editing method based on the improved WGAN has the following advantages. Relative to the original WGAN network, the application adds a cross-connection residual structure to the generator's decoding network; the residual structure promotes gradient flow, fuses shallow features, and alleviates gradient vanishing. Secondly, on the basis of the original WGAN loss function, an L2-norm term is added to the reconstruction loss; its main advantage is a smoother gradient, which alleviates gradient vanishing and gradient explosion. By optimizing the network structure and the loss function, the improved WGAN performs better than before. The method therefore features fast end-to-end generation, many editable attributes, and good results.
Drawings
FIG. 1 is a flow chart of a method for garment property editing based on an improved WGAN of the present application;
fig. 2 is a WGAN generation result diagram;
fig. 3 is a graph of the results of the improved WGAN generation of the present application;
FIG. 4 is a graph of DCGAN generation results;
fig. 5 is a diagram of LSGAN generation results.
Detailed Description
For a better understanding of the objects, structures, and functions of the present application, the garment attribute editing method based on an improved WGAN is described in further detail with reference to the accompanying drawings.
1. Experimental environment and data set
The computer runs a 64-bit Ubuntu system with an Intel i7-8700K processor, 24 GB of memory, and a GTX 1070 Ti graphics card. The software environment is PyTorch 1.4.
The CV-PTON dataset used contains 14,221 garment images, with three primary-granularity attributes (sleeve length, collar, color) and 14 fine-granularity attributes.
2. The specific implementation steps are as follows:
as shown in fig. 1, the garment attribute editing method based on the improved WGAN of the present application is divided into a training phase and a testing phase, wherein the training phase is optimized by adopting a supervised learning mode. And in the test stage, the converged network is adopted for clothing attribute generation.
In particular, the algorithm defines three primary-granularity attributes (sleeve length, color, and collar) and 14 fine-granularity attributes: high collar, V-shaped collar, round collar, red, orange, yellow, green, cyan, blue, purple, long sleeve, half sleeve, short sleeve, and sleeveless.
in the training stage, firstly, a clothing construction clothing data set X of various styles is collected, and then, attribute labeling is carried out on each clothing sample. The collar attribute labels are divided into { high collar, V-shaped collar and round collar }; color attributes are labeled { red, orange, yellow, green, cyan, blue, purple }; the sleeve length attribute is marked as { long sleeve, half sleeve, short sleeve, no sleeve }.
The training data X0 with labels is input into a generative adversarial network Z for training. The network Z consists of a generator network and a discriminator network. The generator network comprises an encoder Genc and a decoder Gdec, which learn the garment's attribute features from the training data and then reconstruct the input garment data X0, over a normal distribution, from the learned feature expression, yielding a reconstructed picture X1. The generator network computes a reconstruction loss L between the reconstructed picture and the original input sample, and optimizes its reconstruction effect by minimizing L.
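The reconstruction-loss formula itself appears only as a figure in the original publication and is not reproduced here. Since the application states that an L2-norm term is added to the original reconstruction loss, one plausible reading is an L1 reconstruction term plus an L2 term; the sketch below is written under that assumption, and the weighting `lam` is also an assumption.

```python
import torch

def reconstruction_loss(x0, x1, lam=1.0):
    """Hedged sketch of L: an L1 reconstruction term plus the added L2 term.
    The exact formula and the weight `lam` are assumptions; the text only
    states that an L2-norm term is added to the WGAN reconstruction loss."""
    l1 = (x0 - x1).abs().mean()        # mean absolute reconstruction error
    l2 = ((x0 - x1) ** 2).mean()       # added L2 (squared-error) term
    return l1 + lam * l2

x0 = torch.rand(2, 3, 64, 64)          # original input X0
x1 = torch.rand(2, 3, 64, 64)          # reconstructed picture X1
L = reconstruction_loss(x0, x1)
```

The loss is zero exactly when the reconstruction matches the input, and the squared term dominates for large errors, which is the smoothing behavior the application attributes to the added L2 norm.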
The discriminator network is composed of the decoder Gdec and the discriminators C and D. In particular, the decoder Gdec of the discriminator network shares weights with the decoder Gdec of the generator network and reconstructs false garment data X2 from the generator network Z; X2 and the original data X0, with their corresponding labels, are then input to the discriminator D for discrimination.
In particular, the discriminator D learns the features of the input data to judge whether an input picture is a real sample X0 or a reconstructed false sample X2, and is optimized through a cross-entropy loss function. Further, the discriminator C judges, at finer granularity, which attribute classes the input false sample X2 exhibits and compares them with the attributes of the real sample X0; optimizing this cross-entropy loss drives the attributes produced by the generator toward those of the real samples. During network training, the generator network produces false pictures as close to the real data as possible, while the discriminator network distinguishes the generated false pictures as well as possible, and the two maintain a Nash equilibrium.
In particular, the application uses the coarse-grained discriminator D to judge whether generated data is real or false, and then the fine-grained discriminator C to judge the attributes of false samples. This effectively improves the discriminators' performance and further improves the generation quality of the whole algorithm.
In particular, in terms of network structure, the application adds a cross-connection residual structure to the decoding network of the generator on the basis of WGAN. The residual structure promotes gradient flow, fuses shallow features, and alleviates the gradient-vanishing problem.
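One common way to realize such a cross-connection residual structure in a decoder is to carry a same-resolution shallow (encoder) feature across the network and add it to the upsampled deep feature, so gradients can flow through the shortcut. The sketch below illustrates that pattern; the channel sizes and module name are assumptions, not the patent's exact design.

```python
import torch
import torch.nn as nn

class CrossResidualUp(nn.Module):
    """Decoder block with a cross-connection residual: upsample the deep
    feature, then add the skip-connected shallow feature (illustrative)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.up = nn.ConvTranspose2d(in_ch, out_ch, 4, stride=2, padding=1)
        self.act = nn.ReLU()
    def forward(self, deep, shallow):
        # `shallow` is the same-resolution encoder feature crossed over;
        # the addition fuses shallow features and eases gradient flow.
        return self.act(self.up(deep) + shallow)

block = CrossResidualUp(64, 32)
deep = torch.randn(1, 64, 16, 16)      # deep decoder feature
shallow = torch.randn(1, 32, 32, 32)   # skip-connected shallow feature
out = block(deep, shallow)
```

Because the shortcut is an identity addition, the block only needs to learn the residual on top of the shallow feature, which is the simplification of learning the description attributes to residual structures.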
The overall loss function of the application is composed of three parts: the reconstruction loss L, the true/false discrimination loss loss0, and the attribute loss att_loss, as shown below:
L_all = L + att_loss + loss0
In the testing stage, the weights of the trained network Z are frozen, and a clothing picture without attribute tags is then input; the improved network generates a new clothing style with new attributes, achieving the goal of intelligent clothing editing. As shown in fig. 1, the improved network generates new garments with the long-sleeve attribute from an input short-sleeve picture.
In particular, the application features fast end-to-end generation, many editable attributes, and good results.
3. Test results and analysis
(1) Contrast with WGAN
On the CV-PTON dataset, the loss of WGAN was 10%, whereas the improved WGAN of the present application reduces the loss to 5%. Figures 2 and 3 show the garment attribute editing results of WGAN and of the improved WGAN. The rows in each image are: true value, reconstruction, sleeveless, short sleeve, long sleeve, high collar, POLO collar/T-shirt collar, V collar, red, gray, black, white, beige, navy, blue, and green.
Overall, the improved WGAN produces clearer edges and better color. For example, in the "short sleeve" column, WGAN often generates incomplete sleeves, which rarely occurs in the results of the improved WGAN. Beyond shape attributes, the improved WGAN also generates colors more effectively: for the last column, "green", the green produced by the improved WGAN is more saturated, while WGAN is more affected by the garment's original color. Table 1 shows the color comparison for the WGAN results, and Table 2 the color comparison for the improved WGAN results.
Table 1: WGAN generation result color comparison table
Table 2: improved WGAN generation result color comparison table
| True value | Red | Gray | Black | White | Beige | Navy blue | Blue | Green |
|---|---|---|---|---|---|---|---|---|
| #462726 | #372e35 | #9d999d | #1f1f21 | #f7f7f9 | #3d2c30 | #282f42 | #42608d | #444639 |
| #8496ab | #c63c39 | #a7a6ab | #141519 | #f7f7f7 | #937373 | #222c3f | #b16768 | #748170 |
| #151517 | #7c1f20 | #e0d7df | #4e485a | #f9f9fb | #2b232a | #16202b | #436cac | #36463f |
| #615f54 | #b73a33 | #828682 | #101015 | #f6f6f6 | #765f52 | #2a3846 | #478090 | #7e5951 |
| #1f2223 | #802524 | #b7b3b5 | #595150 | #f4f2f5 | #4b4143 | #2c3446 | #2b4f82 | #24332a |
| #292a3e | #6a1f24 | #9a96a8 | #1b1e23 | #f2f0f6 | #342d39 | #382b34 | #2e5280 | #324245 |
| #36322f | #6c201f | #8a9290 | #44423e | #f0f0f0 | #423b36 | #2e3847 | #3a6479 | #405747 |
| #60607d | #ad4e6b | #9f95ab | #3a3442 | #fbfbff | #877586 | #444459 | #65869d | #515151 |
| #71696f | #c23b3d | #80484f | #17181c | #f4f4f6 | #805c5f | #263142 | #4e85a1 | #687262 |
| #323746 | #77282f | #5f6268 | #202126 | #f2f2f4 | #3f3d49 | #483f46 | #305c86 | #374646 |
(2) In contrast to other GANs
Fig. 4 shows the DCGAN generation results. DCGAN essentially does not modify the garment's shape attributes, and for color modification it can only tint slightly on the basis of the original garment; it cannot apply vivid colors to the garment as the improved WGAN does. Table 3 shows the color comparison for the DCGAN results.
Table 3: DCGAN generation result color comparison table
Fig. 5 shows the LSGAN generation results. The garment's shape attributes are slightly affected but not well modified; LSGAN's modification of garment shape is as ineffective as DCGAN's, and its coloring is also poor. The comparison with other GANs thus further verifies the effectiveness of the improved WGAN. Table 4 shows the color comparison for the LSGAN results.
Table 4: LSGAN generation result color comparison table
| True value | Red | Gray | Black | White | Beige | Navy blue | Blue | Green |
|---|---|---|---|---|---|---|---|---|
| #462726 | #4a292d | #482b2d | #3e2829 | #f3f0eb | #4d2a2d | #37222a | #35222b | #3a2624 |
| #8496ab | #766b7e | #8595a9 | #5d6778 | #cfd4d5 | #929ab0 | #7988a0 | #8c99aa | #748994 |
| #151517 | #140c10 | #101317 | #1d2325 | #89867d | #1b1a20 | #15151a | #171823 | #131416 |
| #615f54 | #765853 | #5c5a54 | #5a5752 | #d1d0cc | #625f55 | #5e5b53 | #585659 | #6b635e |
| #1f2223 | #2b2225 | #24292b | #383a3c | #f4f6f4 | #27272d | #1f232b | #20262f | #1f2526 |
| #292a3e | #352239 | #2b2b43 | #2b2c44 | #e8e7eb | #29293f | #2a283e | #242c4f | #2a3042 |
| #36322f | #42282d | #383232 | #4d4543 | #e9ebe5 | #342b2d | #302b31 | #333137 | #373b37 |
| #60607d | #86617a | #636978 | #545265 | #cecbd1 | #6c677a | #4c4d69 | #929dbc | #66747e |
| #71696f | #80636a | #786e76 | #736d73 | #dddeda | #676168 | #726879 | #767588 | #717171 |
| #323746 | #524153 | #383f4c | #373c4d | #b4b8bc | #474a5a | #414654 | #2f3956 | #35404f |
Relative to the original WGAN network, the present application adds a cross-connection residual structure to the generator's decoding network. The residual structure promotes gradient flow, fuses shallow features, and alleviates gradient vanishing. Its main advantages are as follows:
(1) Simplifying learning process
Without the residual structure, the neural network must learn the original signal; with it, the network learns only the residual (the difference), which simplifies the learning process. The application demonstrates that the residual structure is an effective network optimization and works well in alleviating gradient vanishing and gradient explosion.
(2) Mitigating network degradation
As the number of network layers increases, network degradation occurs, mainly due to the symmetry of the network. Residual connections break this symmetry and thus alleviate the network degradation problem.
(3) Enhancing generalization capability of a network
A deep network containing residual structures can be regarded as an ensemble of shallower networks of different depths, so removing a given layer does not greatly affect performance; that is, the residual structure enhances the network's generalization ability.
Secondly, on the basis of the original WGAN loss function, the application adds an L2-norm term to the reconstruction loss. Its main advantage is a smoother gradient, which alleviates gradient vanishing and gradient explosion: when some weight in the network takes an abnormally large value, the squaring in the two-norm amplifies it, forcing it to shrink during backpropagation. In addition, the L2-norm term helps prevent overfitting.
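The stated effect of the squared term can be checked numerically: for an abnormally large residual, the gradient of an L2 term scales with the residual itself, while the gradient of an L1 term stays at magnitude 1, so the L2 term pushes the outlier down much harder during backpropagation.

```python
import torch

# One abnormally large value among small ones.
err = torch.tensor([0.1, 0.1, 5.0], requires_grad=True)

l2 = (err ** 2).sum()
l2.backward()
g2 = err.grad.clone()   # gradient of the L2 term: 2 * err

err.grad = None
l1 = err.abs().sum()
l1.backward()
g1 = err.grad.clone()   # gradient of the L1 term: sign(err)
```

For the outlier entry, the L2 gradient is 2 * 5.0 = 10, fifty times the gradient on the small entries, whereas the L1 gradient is 1 everywhere, illustrating why squaring amplifies and then suppresses abnormal values.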
By optimizing the network structure and the loss function, the improved WGAN performs better than before.
In summary, the application features fast end-to-end generation, many editable attributes, and good results.
It will be understood that the application has been described in terms of several embodiments, and that various changes and equivalents may be made to these features and embodiments by those skilled in the art without departing from the spirit and scope of the application. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the application without departing from the essential scope thereof. Therefore, it is intended that the application not be limited to the particular embodiment disclosed, but that the application will include all embodiments falling within the scope of the appended claims.

Claims (5)

1. A garment attribute editing method based on an improved WGAN, divided into a training stage and a testing stage, wherein the training stage is optimized in a supervised learning mode and the testing stage uses the converged network to generate clothing attributes, characterized by comprising the following steps:
step 1: in the training stage, first collect garment images of various styles to construct a clothing dataset, then label the attributes of each clothing sample; the attributes comprise three primary-granularity attributes: sleeve length, color, and collar;
step 2: input the training data with attribute labels into a generative adversarial network for training;
the training data X0 with labels is input into a generative adversarial network Z for training; the network Z consists of a generator network and a discriminator network; the generator network comprises an encoder Genc and a decoder Gdec, which learn the garment's attribute features from the training data and then reconstruct the input garment data X0, over a normal distribution, from the learned feature expression, yielding a reconstructed picture X1; the generator network computes a reconstruction loss L between the reconstructed picture and the original input sample and optimizes its reconstruction effect by minimizing L;
the discriminator network consists of the decoder Gdec and the discriminators C and D; the decoder Gdec of the discriminator network shares weights with the decoder Gdec of the generator network and reconstructs false clothing data X2 from the generator network Z; X2 and the original data X0, with their corresponding labels, are then input to the discriminator D for discrimination;
the discriminator D learns the features of the input data to judge whether an input picture is a real sample X0 or a reconstructed false sample X2, and is optimized through a cross-entropy loss function; the discriminator C judges, at finer granularity, which attribute classes the input false sample X2 exhibits and compares them with the attributes of the real sample X0; optimizing this cross-entropy loss drives the attributes generated by the generator toward the real samples;
the discriminator D is a coarse-grained discriminator that judges whether generated data is real or false, and the discriminator C is a fine-grained discriminator that judges the attributes of false samples;
a cross-connection residual structure is added to the decoding network of the generator on the basis of WGAN;
an overall loss function is established, composed of three parts: the reconstruction loss L, the true/false discrimination loss loss0, and the attribute loss att_loss, as shown below:
L_all = L + att_loss + loss0
2. the garment property editing method based on improved WGAN as claimed in claim 1, wherein the generating countermeasure network in step 2 is composed of two parts of generating network and discriminating network; the generating network comprises a generator and a decoder for learning attribute features of the garment from training data, reconstructing input garment data based on the learned feature expressions on a normal distribution; the distinguishing network consists of a decoder and a distinguishing device; the decoder of the judging network shares weight with the decoder of the generating network, and the decoder of the judging network reconstructs false clothing data based on the generating network and then inputs the false clothing data and the original data label to the judging device for analysis and judgment; the generating network and the discriminating network maintain Nash equilibrium.
3. The improved WGAN-based garment attribute editing method of claim 1, wherein the three primary attributes of sleeve length, color, and collar comprise 14 fine-grained attributes: high collar, V collar, round collar, red, orange, yellow, green, cyan, blue, violet, long sleeve, half sleeve, short sleeve, and sleeveless.
4. The garment attribute editing method based on improved WGAN as claimed in claim 3, wherein said step 1 comprises the following specific steps:
firstly, garment data of various styles are collected to construct a garment data set X, and each garment sample is then labeled with its attributes: the collar attribute is labeled as high collar, V collar, or round collar; the color attribute is labeled as red, orange, yellow, green, cyan, blue, or violet; and the sleeve-length attribute is labeled as long sleeve, half sleeve, short sleeve, or sleeveless.
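Since each sample carries exactly one label per primary attribute, the 14 fine-grained labels map naturally onto a 14-dimensional multi-hot vector with one active entry per group. A minimal encoding sketch (the dictionary layout is our own, not from the patent):

```python
# The 14 fine-grained labels grouped under the 3 primary attributes.
ATTRS = {
    "collar": ["high collar", "V collar", "round collar"],
    "color":  ["red", "orange", "yellow", "green", "cyan", "blue", "violet"],
    "sleeve": ["long sleeve", "half sleeve", "short sleeve", "sleeveless"],
}
VOCAB = [v for group in ATTRS.values() for v in group]  # 14 entries

def encode(sample):
    """Map e.g. {'collar': 'V collar', ...} to a 14-dim multi-hot vector."""
    return [1 if v in sample.values() else 0 for v in VOCAB]
```

Each encoded vector has exactly three active entries, one per primary attribute, which matches the per-attribute cross-entropy used by discriminator C.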
5. The garment attribute editing method based on the improved WGAN as claimed in claim 1, wherein the weights of the trained generating network are frozen, a garment picture without attribute labels is then input, and the improved network generates a new garment style with new attributes, achieving the goal of intelligent garment editing.
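In PyTorch terms, freezing the trained generating network before attribute editing amounts to disabling gradients on its parameters and running it in inference mode. The tiny stand-in network below is purely illustrative.

```python
import torch
import torch.nn as nn

# A hypothetical trained generating network (stand-in for the real one).
generator = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 64))

# Freeze every weight so editing an unlabeled garment picture
# cannot alter the trained network.
for p in generator.parameters():
    p.requires_grad = False

# Inference on an (unlabeled) input under no_grad for efficiency.
with torch.no_grad():
    edited = generator(torch.randn(1, 16))
```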
CN202110871983.6A 2021-07-30 2021-07-30 Garment attribute editing method based on improved WGAN Active CN113793397B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110871983.6A CN113793397B (en) 2021-07-30 2021-07-30 Garment attribute editing method based on improved WGAN

Publications (2)

Publication Number Publication Date
CN113793397A CN113793397A (en) 2021-12-14
CN113793397B true CN113793397B (en) 2023-11-03

Family

ID=79181455

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110871983.6A Active CN113793397B (en) 2021-07-30 2021-07-30 Garment attribute editing method based on improved WGAN

Country Status (1)

Country Link
CN (1) CN113793397B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114694012B (en) * 2022-04-07 2023-10-20 哈尔滨理工大学 Clothing identification method based on convolutional neural network

Citations (5)

Publication number Priority date Publication date Assignee Title
CN110189253A (en) * 2019-04-16 2019-08-30 浙江工业大学 A kind of image super-resolution rebuilding method generating confrontation network based on improvement
CN111798369A (en) * 2020-06-29 2020-10-20 电子科技大学 Face aging image synthesis method for generating confrontation network based on circulation condition
CN111932444A (en) * 2020-07-16 2020-11-13 中国石油大学(华东) Face attribute editing method based on generation countermeasure network and information processing terminal
CN112529768A (en) * 2020-12-04 2021-03-19 中山大学 Garment editing and generating method based on generation countermeasure network
CN112837247A (en) * 2021-04-06 2021-05-25 哈尔滨理工大学 GAN image denoising algorithm fusing improved residual error network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant