CN111667547B - GAN network training method, garment picture generation method and device and electronic equipment


Info

Publication number
CN111667547B
Authority
CN
China
Prior art keywords
characteristic information
clothing
gan network
training
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010520461.7A
Other languages
Chinese (zh)
Other versions
CN111667547A (en)
Inventor
张发恩
吴佳洪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alnnovation Beijing Technology Co ltd
Original Assignee
Alnnovation Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alnnovation Beijing Technology Co ltd
Priority to CN202010520461.7A
Publication of CN111667547A
Application granted
Publication of CN111667547B
Legal status: Active


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 - 2D [Two Dimensional] image generation
    • G06T11/001 - Texturing; Colouring; Generation of texture or colour
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 - Geometric image transformations in the plane of the image
    • G06T3/40 - Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4038 - Image mosaicing, e.g. composing plane images from plane sub-images
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 - Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 - Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a GAN network training method, a clothing picture generation method and apparatus, and an electronic device. The GAN network training method comprises the following steps: acquiring a plurality of training samples, where each training sample comprises a clothing picture and descriptive text describing that picture; extracting, for each training sample, first feature information of the clothing picture and second feature information of the corresponding descriptive text, to form a feature information group comprising the first feature information and the second feature information; and training a preset GAN network according to each feature information group to obtain a target GAN network. The method and apparatus can generate a clothing picture corresponding to a user's descriptive text, meeting the user's personalized clothing-design requirements, thereby increasing the degree of automation of clothing picture generation and producing clothing pictures quickly and automatically.

Description

GAN network training method, garment picture generation method and device and electronic equipment
Technical Field
The application relates to the technical field of computer networks, in particular to a GAN network training method, a clothing picture generation method and apparatus, and an electronic device.
Background
Existing clothing style generation systems mostly assemble and combine elements of clothing pictures to produce new clothing pictures. This approach lacks evaluation by human aesthetic judgment: although many clothing styles can be generated at once, the system cannot generate the clothing pictures a user actually wants based on the user's preferences, so it cannot quickly meet the user's personalized clothing requirements.
No effective technical solution to the above problems is currently available.
Disclosure of Invention
The embodiments of the present application aim to provide a GAN network training method, a clothing picture generation method and apparatus, and an electronic device, which can generate a clothing picture corresponding to a user's descriptive text and thereby meet the user's personalized clothing-design requirements.
In a first aspect, an embodiment of the present application provides a GAN network training method, including the following steps:
acquiring a plurality of training samples, where each training sample comprises a clothing picture and descriptive text describing the clothing picture;
extracting, for each training sample, first feature information of the clothing picture and second feature information of the corresponding descriptive text, to form a feature information group comprising the first feature information and the second feature information;
and training a preset GAN network according to each feature information group to obtain a target GAN network.
According to the embodiments of the application, the target GAN network is trained on clothing pictures together with their descriptive text, so a clothing picture corresponding to a user's descriptive text can be generated and the user's personalized clothing-design requirements can be met; this increases the degree of automation of clothing picture generation and allows clothing pictures to be generated quickly and automatically.
Optionally, in the GAN network training method of the embodiments of the present application, the clothing picture includes a plurality of clothing elements, and the descriptive text includes a plurality of description fields;
the step of training the preset GAN network according to each feature information group to obtain the target GAN network includes:
dividing each feature information group into a plurality of first feature information groups, where each first feature information group comprises one piece of first sub-feature information and the corresponding second sub-feature information, the first sub-feature information corresponds to one clothing element, the second sub-feature information corresponds to one description field, the first feature information comprises a plurality of pieces of first sub-feature information, and the second feature information comprises a plurality of pieces of second sub-feature information;
and training the preset GAN network according to the plurality of first feature information groups of each feature information group to obtain the target GAN network.
Optionally, in the GAN network training method of the embodiments of the application, the step of dividing each feature information group into a plurality of first feature information groups includes:
performing word segmentation on each descriptive text to obtain a plurality of description fields;
screening out at least two clothing pictures whose descriptive texts share the same description field;
extracting the first sub-feature information corresponding to the shared description field from the first feature information of the at least two clothing pictures;
extracting the second sub-feature information corresponding to the shared description field from the second feature information of the at least two clothing pictures;
and forming a first feature information group from the extracted first sub-feature information and the corresponding second sub-feature information.
Optionally, in the GAN network training method of the embodiments of the present application, the step of extracting the first sub-feature information corresponding to the shared description field from the first feature information of the at least two clothing pictures includes:
extracting the feature information segment that is identical across the first feature information of the at least two clothing pictures, as the first sub-feature information corresponding to the shared description field.
Optionally, in the GAN network training method of the embodiments of the application, each training sample of the plurality of training samples shares at least one clothing element with at least one other training sample.
In a second aspect, an embodiment of the present application provides a GAN network training apparatus, including:
a first acquisition module, configured to acquire a plurality of training samples, where each training sample comprises a clothing picture and descriptive text describing the clothing picture;
a first extraction module, configured to extract, for each training sample, first feature information of the clothing picture and second feature information of the corresponding descriptive text, to form a feature information group comprising the first feature information and the second feature information;
and a training module, configured to train a preset GAN network according to each feature information group to obtain a target GAN network.
Optionally, in the GAN network training device of the embodiments of the present application, the clothing picture comprises a plurality of clothing elements, the descriptive text comprises a plurality of description fields, and the training module includes:
a dividing unit, configured to divide each feature information group into a plurality of first feature information groups, where each first feature information group comprises one piece of first sub-feature information and the corresponding second sub-feature information, the first sub-feature information corresponds to one clothing element, the second sub-feature information corresponds to one description field, the first feature information comprises a plurality of pieces of first sub-feature information, and the second feature information comprises a plurality of pieces of second sub-feature information;
and a training unit, configured to train the preset GAN network according to the plurality of first feature information groups of each feature information group to obtain the target GAN network.
In a third aspect, embodiments of the present application provide a clothing picture generation method that uses a target GAN network trained by the GAN network training method described in any one of the above. The method includes the following steps:
acquiring descriptive text, input by a user, describing the target clothing picture to be generated;
and inputting the descriptive text into the target GAN network to generate a target clothing picture corresponding to the descriptive text.
In a fourth aspect, embodiments of the present application provide a clothing picture generation device that uses a target GAN network trained by the GAN network training method described in any one of the above. The device includes:
a second acquisition module, configured to acquire descriptive text, input by a user, describing the target clothing picture to be generated;
and a generation module, configured to input the descriptive text into the target GAN network to generate a target clothing picture corresponding to the descriptive text.
In a fifth aspect, an embodiment of the present application provides an electronic device comprising a processor and a memory storing computer readable instructions which, when executed by the processor, perform the steps of the method as provided in the first aspect above.
In a sixth aspect, an embodiment of the present application provides a storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method as provided in the first aspect above.
As can be seen from the above, embodiments of the present application acquire a plurality of training samples, each comprising a clothing picture and descriptive text describing the clothing picture; extract, for each training sample, first feature information of the clothing picture and second feature information of the corresponding descriptive text, forming a feature information group from the two; and train a preset GAN network according to each feature information group to obtain a target GAN network. The trained target GAN network can generate a clothing picture corresponding to a user's descriptive text, meeting the user's personalized clothing-design requirements, increasing the degree of automation of clothing picture generation, and allowing clothing pictures to be generated quickly and automatically.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of the embodiments of the application. The objectives and other advantages of the application will be realized and attained by the structure particularly pointed out in the written description and claims thereof as well as the appended drawings.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and should not be regarded as limiting its scope; a person skilled in the art can obtain other related drawings from these drawings without inventive effort.
Fig. 1 is a flowchart of a GAN network training method according to an embodiment of the present application.
Fig. 2 is a schematic structural diagram of a GAN network training device according to an embodiment of the present application.
Fig. 3 is a flowchart of a clothing picture generation method according to an embodiment of the present application.
Fig. 4 is a schematic structural diagram of a clothing picture generation device according to an embodiment of the present application.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by a person skilled in the art without making any inventive effort, are intended to be within the scope of the present application.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only to distinguish the description, and are not to be construed as indicating or implying relative importance.
Referring to fig. 1, fig. 1 is a flowchart of a GAN network training method according to some embodiments of the application. The GAN network training method comprises the following steps:
s101, acquiring a plurality of training samples, wherein each training sample comprises a clothing picture and descriptive text for describing the clothing picture.
S102, extracting first characteristic information of the clothing pictures and second characteristic information of corresponding descriptive words in each training sample to form a characteristic information group with the first characteristic information and the second characteristic information.
And S103, training the preset GAN network according to each characteristic information group to obtain a target GAN network.
In step S101, the plurality of training samples are clothing pictures obtained from the network, to each of which corresponding descriptive text is added to describe the clothing shown. Each clothing picture contains a plurality of clothing elements and, correspondingly, its descriptive text contains a plurality of description fields, each description field corresponding to one clothing element. For example, for a black high-collar sweater, the corresponding description fields are "black", "high-collar", and "sweater".
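As a concrete illustration of the structure just described, the following minimal Python sketch shows one possible way to represent a training sample that pairs a clothing picture with its descriptive text and description fields; the class and field names are hypothetical and are not taken from the patent.

```python
# Hypothetical training-sample structure: a clothing picture plus descriptive
# text, with the text already split into description fields (one per clothing
# element). Names and paths are illustrative assumptions only.
from dataclasses import dataclass
from typing import List

@dataclass
class TrainingSample:
    image_path: str        # path to the clothing picture
    description: str       # descriptive text for the picture
    fields: List[str]      # description fields, one per clothing element

sample = TrainingSample(
    image_path="images/sweater_001.jpg",
    description="black high-collar sweater",
    fields=["black", "high-collar", "sweater"],
)
```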
Preferably, each training sample of the plurality of training samples shares at least one clothing element with at least one other training sample. For example, if garment A is a black high-collar sweater and garment B is a white round-collar sweater, the shared clothing element of garments A and B is the sweater. Further, each clothing element occurs at least twice in the plurality of training samples.
In step S102, a CNN may be used to extract the first feature information from each clothing picture, and an RNN may be used to extract the second feature information from each descriptive text. A clothing picture and its corresponding descriptive text are input into the CNN and the RNN respectively, so that a corresponding pair of first and second feature information is obtained; the mutually corresponding first and second feature information then form one feature information group.
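The sketch below illustrates this paired extraction under stated assumptions: a small CNN encodes a clothing picture into first feature information, a GRU-based RNN encodes the matching descriptive text into second feature information, and the two vectors are kept together as one feature information group. The use of PyTorch, the layer sizes, and the vocabulary size are illustrative assumptions, not the patent's specification.

```python
# Assumed encoders for step S102: CNN for pictures, RNN for descriptive text.
import torch
import torch.nn as nn

class ImageEncoder(nn.Module):
    """CNN: clothing picture -> first feature information."""
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(32, feat_dim)

    def forward(self, img: torch.Tensor) -> torch.Tensor:  # img: (B, 3, H, W)
        return self.fc(self.conv(img).flatten(1))          # (B, feat_dim)

class TextEncoder(nn.Module):
    """RNN: tokenized descriptive text -> second feature information."""
    def __init__(self, vocab_size: int = 1000, feat_dim: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, 64)
        self.rnn = nn.GRU(64, feat_dim, batch_first=True)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:  # tokens: (B, T)
        _, h = self.rnn(self.embed(tokens))
        return h[-1]                                           # (B, feat_dim)

# One feature information group: matching image and text features, kept in pairs.
img_feat = ImageEncoder()(torch.randn(1, 3, 64, 64))
txt_feat = TextEncoder()(torch.randint(0, 1000, (1, 6)))
feature_group = (img_feat, txt_feat)
```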
In step S103, the feature information groups are input into a preset GAN network for training to obtain the target GAN network; that is, the first feature information of each clothing picture and the second feature information of its descriptive text are input into the preset GAN network in pairs, and the target GAN network is obtained by training.
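A minimal sketch of such paired training follows, assuming a generic conditional-GAN objective (the patent does not fix a particular architecture or loss): the generator is conditioned on the second (text) feature information, and the discriminator scores pairs of first and second feature information as real or generated.

```python
# Assumed conditional-GAN training step for S103; all hyperparameters are illustrative.
import torch
import torch.nn as nn

feat_dim, noise_dim = 128, 32

generator = nn.Sequential(            # text feature + noise -> synthetic image feature
    nn.Linear(feat_dim + noise_dim, 256), nn.ReLU(),
    nn.Linear(256, feat_dim),
)
discriminator = nn.Sequential(        # (image feature, text feature) -> real/fake score
    nn.Linear(feat_dim * 2, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1),
)
bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

def train_step(img_feat: torch.Tensor, txt_feat: torch.Tensor):
    batch = img_feat.size(0)
    noise = torch.randn(batch, noise_dim)
    fake_feat = generator(torch.cat([txt_feat, noise], dim=1))

    # Discriminator: real (image, text) pairs -> 1, generated pairs -> 0.
    d_real = discriminator(torch.cat([img_feat, txt_feat], dim=1))
    d_fake = discriminator(torch.cat([fake_feat.detach(), txt_feat], dim=1))
    d_loss = bce(d_real, torch.ones(batch, 1)) + bce(d_fake, torch.zeros(batch, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: fool the discriminator on its text-conditioned output.
    g_out = discriminator(torch.cat([fake_feat, txt_feat], dim=1))
    g_loss = bce(g_out, torch.ones(batch, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

# d_loss, g_loss = train_step(img_feat, txt_feat)  # features from the sketch above
```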
Specifically, in some embodiments, step S103 comprises the following sub-steps. S1031: dividing each feature information group into a plurality of first feature information groups, where each first feature information group comprises one piece of first sub-feature information and the corresponding second sub-feature information, the first sub-feature information corresponds to one clothing element, the second sub-feature information corresponds to one description field, the first feature information comprises a plurality of pieces of first sub-feature information, and the second feature information comprises a plurality of pieces of second sub-feature information. S1032: training the preset GAN network according to the plurality of first feature information groups of each feature information group to obtain the target GAN network.
In step S1031, when a clothing picture and its corresponding descriptive text are input, the descriptive text is divided into a plurality of description fields, and the clothing element for each description field is marked in the clothing picture. During feature extraction, each time the CNN extracts the first sub-feature information of one clothing element, the RNN extracts the second sub-feature information of the description field corresponding to that clothing element; the sub-feature information of each clothing element and its description field is thus extracted step by step, in pairs, to form a plurality of first feature information groups.
It will be appreciated that, in some embodiments, step S1031 includes: S10311, performing word segmentation on each descriptive text to obtain a plurality of description fields; S10312, screening out at least two clothing pictures whose descriptive texts share the same description field; S10313, extracting the first sub-feature information corresponding to the shared description field from the first feature information of the at least two clothing pictures; S10314, extracting the second sub-feature information corresponding to the shared description field from the second feature information of the at least two clothing pictures; S10315, forming a first feature information group from the extracted first sub-feature information and the corresponding second sub-feature information.
In step S10311, for example, if the descriptive text is "white short-sleeved shirt", word segmentation yields the three description fields "white", "short-sleeved", and "shirt". In step S10312, the descriptive texts of the selected clothing pictures share exactly one common description field. For example, the three description fields of clothing picture A are white, long-sleeved, and plaid shirt; those of clothing picture B are black, short-sleeved, and plaid shirt; and those of clothing picture C are red, mid-length-sleeved, and plaid shirt. Clothing pictures A, B, and C are therefore screened out as sharing the description field "plaid shirt". In step S10313, based on the first feature information of clothing pictures A, B, and C, the first sub-feature information corresponding to the plaid shirt can be determined by finding the feature information segments they have in common; applying the same method in turn determines the first sub-feature information corresponding to each description field. Step S10314 works on the same principle as step S10313: the common feature information segment is extracted from the second feature information of the descriptive texts of clothing pictures A, B, and C, giving the second sub-feature information corresponding to the common description field. In step S10315, each piece of first sub-feature information and the corresponding second sub-feature information are combined into a first feature information group.
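The sketch below walks through sub-steps S10311 to S10315 under simplifying assumptions: whitespace splitting stands in for real word segmentation, and "the same feature information segment" is interpreted as the vector positions on which the features of the screened pictures nearly agree. Both interpretations are illustrative assumptions rather than the patent's definitions.

```python
# Assumed implementation sketch of S10311-S10315.
from typing import Dict, List
import numpy as np

def segment(description: str) -> List[str]:
    # Stand-in for real word segmentation (S10311).
    return description.split()

def screen_common_field(descs: Dict[str, List[str]]) -> Dict[str, List[str]]:
    # S10312: map each description field to the pictures whose texts contain it,
    # keeping only fields shared by at least two pictures.
    index: Dict[str, List[str]] = {}
    for pic, fields in descs.items():
        for field in fields:
            index.setdefault(field, []).append(pic)
    return {f: pics for f, pics in index.items() if len(pics) >= 2}

def common_segment(feats: List[np.ndarray], tol: float = 1e-3) -> np.ndarray:
    # S10313/S10314: keep the positions where all feature vectors (nearly)
    # agree; these are taken as the sub-feature of the shared field.
    mask = np.all(np.abs(np.diff(np.stack(feats), axis=0)) < tol, axis=0)
    return np.where(mask, feats[0], 0.0)

descs = {
    "A": segment("white long-sleeved plaid-shirt"),
    "B": segment("black short-sleeved plaid-shirt"),
    "C": segment("red mid-length-sleeved plaid-shirt"),
}
print(screen_common_field(descs))   # {'plaid-shirt': ['A', 'B', 'C']}

feats = [np.array([0.9, 0.1, 0.5]), np.array([0.9, 0.7, 0.5]), np.array([0.9, 0.3, 0.5])]
print(common_segment(feats))        # [0.9 0.  0.5]: the shared 'plaid-shirt' segment
```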
As can be seen from the above, embodiments of the present application acquire a plurality of training samples, each comprising a clothing picture and descriptive text describing the clothing picture; extract, for each training sample, first feature information of the clothing picture and second feature information of the corresponding descriptive text, forming a feature information group from the two; and train a preset GAN network according to each feature information group to obtain a target GAN network. The trained target GAN network can generate a clothing picture corresponding to a user's descriptive text, meeting the user's personalized clothing-design requirements, increasing the degree of automation of clothing picture generation, and allowing clothing pictures to be generated quickly and automatically.
Referring to fig. 2, fig. 2 is a schematic structural diagram of a GAN network training device according to some embodiments of the application. The GAN network training apparatus includes: a first acquisition module 201, a first extraction module 202, and a training module 203.
The first acquisition module 201 is configured to acquire a plurality of training samples, each comprising a clothing picture and descriptive text describing the clothing picture. The training samples are clothing pictures obtained from the network, to each of which corresponding descriptive text is added to describe the clothing shown. Each clothing picture contains a plurality of clothing elements and, correspondingly, its descriptive text contains a plurality of description fields, each corresponding to one clothing element. For example, for a black high-collar sweater, the corresponding description fields are "black", "high-collar", and "sweater".
Preferably, each training sample of the plurality of training samples shares at least one clothing element with at least one other training sample. For example, if garment A is a black high-collar sweater and garment B is a white round-collar sweater, the shared clothing element of garments A and B is the sweater. Further, each clothing element occurs at least twice in the plurality of training samples.
The first extraction module 202 is configured to extract, for each training sample, the first feature information of the clothing picture and the second feature information of the corresponding descriptive text, to form a feature information group comprising the two. The first extraction module 202 may use a CNN to extract the first feature information from each clothing picture and an RNN to extract the second feature information from each descriptive text. A clothing picture and its corresponding descriptive text are input into the CNN and the RNN respectively, so that a corresponding pair of first and second feature information is obtained; the mutually corresponding first and second feature information form one feature information group.
The training module 203 is configured to train the preset GAN network according to each feature information group to obtain a target GAN network. The training module 203 inputs the feature information groups into the preset GAN network for training; that is, the first feature information of each clothing picture and the second feature information of its descriptive text are input into the preset GAN network in pairs, and the target GAN network is obtained by training.
Specifically, in some embodiments, the training module 203 includes: a dividing unit, configured to divide each feature information group into a plurality of first feature information groups, where each first feature information group comprises one piece of first sub-feature information and the corresponding second sub-feature information, the first sub-feature information corresponds to one clothing element, the second sub-feature information corresponds to one description field, the first feature information comprises a plurality of pieces of first sub-feature information, and the second feature information comprises a plurality of pieces of second sub-feature information; and a training unit, configured to train the preset GAN network according to the plurality of first feature information groups of each feature information group to obtain the target GAN network. Specifically, when a clothing picture and its corresponding descriptive text are input, the descriptive text is divided into a plurality of description fields and the clothing element for each description field is marked in the clothing picture. During feature extraction, each time the CNN extracts the first sub-feature information of one clothing element, the RNN extracts the second sub-feature information of the description field corresponding to that clothing element; the sub-feature information of each clothing element and its description field is thus extracted step by step, in pairs, to form a plurality of first feature information groups.
It will be appreciated that, in some embodiments, the dividing unit is configured to: perform word segmentation on each descriptive text to obtain a plurality of description fields; screen out at least two clothing pictures whose descriptive texts share the same description field; extract the first sub-feature information corresponding to the shared description field from the first feature information of the at least two clothing pictures; extract the second sub-feature information corresponding to the shared description field from the second feature information of the at least two clothing pictures; and form a first feature information group from the extracted first sub-feature information and the corresponding second sub-feature information.
For example, if the descriptive text is "white short-sleeved shirt", word segmentation yields the three description fields "white", "short-sleeved", and "shirt". The descriptive texts of the selected clothing pictures share exactly one common description field. For example, the three description fields of clothing picture A are white, long-sleeved, and plaid shirt; those of clothing picture B are black, short-sleeved, and plaid shirt; and those of clothing picture C are red, mid-length-sleeved, and plaid shirt. Clothing pictures A, B, and C are therefore screened out as sharing the description field "plaid shirt". Based on the first feature information of clothing pictures A, B, and C, the first sub-feature information corresponding to the plaid shirt can be determined by finding the feature information segments they have in common; applying the same method in turn determines the first sub-feature information corresponding to each description field.
As can be seen from the above, embodiments of the present application acquire a plurality of training samples, each comprising a clothing picture and descriptive text describing the clothing picture; extract, for each training sample, first feature information of the clothing picture and second feature information of the corresponding descriptive text, forming a feature information group from the two; and train a preset GAN network according to each feature information group to obtain a target GAN network. The trained target GAN network can generate a clothing picture corresponding to a user's descriptive text, meeting the user's personalized clothing-design requirements, increasing the degree of automation of clothing picture generation, and allowing clothing pictures to be generated quickly and automatically.
Referring to fig. 3, fig. 3 is a flowchart of a clothing picture generation method according to some embodiments of the present application; the method generates pictures using a target GAN network trained by the GAN network training method of any of the above embodiments. The method comprises the following steps:
S301, acquiring descriptive text, input by a user, describing the target clothing picture to be generated.
S302, inputting the descriptive text into the target GAN network to generate a target clothing picture corresponding to the descriptive text.
In step S301, the descriptive text may include only one description field or a plurality of description fields.
In step S302, when the descriptive text includes only one description field, the target GAN network directly extracts the second sub-feature information of that description field and then outputs a plurality of clothing pictures corresponding to the description field, based on the second sub-feature information.
In some embodiments, when the descriptive text has a plurality of description fields, the target GAN network extracts the second sub-feature information of each description field and generates a clothing picture having the first sub-feature information corresponding to the plurality of pieces of second sub-feature information.
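Putting the pieces together, the following hypothetical inference sketch for steps S301 and S302 reuses the TextEncoder, generator, feat_dim, and noise_dim names from the training sketches above; the tokenizer and the decoder that maps a generated feature back to a picture are stubs, since the patent does not spell out those stages.

```python
# Assumed end-to-end generation sketch for S301/S302.
import torch

def tokenize(description: str, vocab: dict, max_len: int = 8) -> torch.Tensor:
    # Stub tokenizer: map each description field to an integer id.
    ids = [vocab.get(word, 0) for word in description.split()][:max_len]
    return torch.tensor([ids])

@torch.no_grad()
def generate_clothing_picture(description: str, vocab: dict) -> torch.Tensor:
    txt_feat = TextEncoder()(tokenize(description, vocab))      # second feature information
    noise = torch.randn(1, noise_dim)
    img_feat = generator(torch.cat([txt_feat, noise], dim=1))   # generated first feature
    # Stub decoder: project the generated feature to a 3x64x64 "picture".
    decoder = torch.nn.Linear(feat_dim, 3 * 64 * 64)
    return decoder(img_feat).view(1, 3, 64, 64)

vocab = {"white": 1, "short-sleeved": 2, "shirt": 3}
picture = generate_clothing_picture("white short-sleeved shirt", vocab)
```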
Referring to fig. 4, fig. 4 is a schematic structural diagram of a clothing picture generation device according to some embodiments of the present application. The device generates pictures using a target GAN network trained by the GAN network training method of any of the above embodiments, and includes: a second acquisition module 401 and a generation module 402.
The second acquisition module 401 is configured to acquire descriptive text, input by a user, describing the target clothing picture to be generated. The descriptive text may include only one description field or a plurality of description fields.
The generation module 402 is configured to input the descriptive text into the target GAN network to generate a target clothing picture corresponding to the descriptive text. When the descriptive text includes only one description field, the target GAN network directly extracts the second sub-feature information of that description field and then outputs a plurality of clothing pictures corresponding to the description field, based on the second sub-feature information. In some embodiments, when the descriptive text has a plurality of description fields, the target GAN network extracts the second sub-feature information of each description field and generates a clothing picture having the first sub-feature information corresponding to the plurality of pieces of second sub-feature information.
Referring to fig. 5, fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The present application provides an electronic device 5 comprising a processor 501 and a memory 502, which are interconnected and communicate via a communication bus 503 and/or another form of connection mechanism (not shown). The memory 502 stores a computer program executable by the processor 501; when the computing device runs, the processor 501 executes the computer program to perform the method in any optional implementation of the above embodiments.
The present application provides a storage medium storing a computer program which, when executed by a processor, performs the method of any optional implementation of the above embodiments. The storage medium may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
As can be seen from the above, embodiments of the present application acquire a plurality of training samples, each comprising a clothing picture and descriptive text describing the clothing picture; extract, for each training sample, first feature information of the clothing picture and second feature information of the corresponding descriptive text, forming a feature information group from the two; and train a preset GAN network according to each feature information group to obtain a target GAN network. The trained target GAN network can generate a clothing picture corresponding to a user's descriptive text, meeting the user's personalized clothing-design requirements, increasing the degree of automation of clothing picture generation, and allowing clothing pictures to be generated quickly and automatically.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative; for example, the division into units is only a logical functional division, and there may be other divisions in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Furthermore, the couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through communication interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
Further, units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purposes of the solutions of the embodiments.
Furthermore, functional modules in various embodiments of the present application may be integrated together to form a single portion, or each module may exist alone, or two or more modules may be integrated to form a single portion.
In this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and variations will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (8)

1. A GAN network training method, characterized by comprising the following steps:
acquiring a plurality of training samples, where each training sample comprises a clothing picture and descriptive text describing the clothing picture;
extracting, for each training sample, first feature information of the clothing picture and second feature information of the corresponding descriptive text, to form a feature information group comprising the first feature information and the second feature information;
training a preset GAN network according to each feature information group to obtain a target GAN network, where the input of the target GAN network is descriptive text and its output is a clothing picture;
wherein the clothing picture comprises a plurality of clothing elements, and the descriptive text comprises a plurality of description fields;
the step of training the preset GAN network according to each feature information group to obtain the target GAN network includes:
dividing each feature information group into a plurality of first feature information groups, where each first feature information group comprises one piece of first sub-feature information and the corresponding second sub-feature information, the first sub-feature information corresponds to one clothing element, the second sub-feature information corresponds to one description field, the first feature information comprises a plurality of pieces of first sub-feature information, and the second feature information comprises a plurality of pieces of second sub-feature information;
and training the preset GAN network according to the plurality of first feature information groups of each feature information group to obtain the target GAN network.
2. The GAN network training method of claim 1, wherein the step of dividing each feature information group into a plurality of first feature information groups comprises:
performing word segmentation on each descriptive text to obtain a plurality of description fields;
screening out, from the descriptive texts of the plurality of clothing pictures, at least two clothing pictures that share the same description field;
extracting the first sub-feature information corresponding to the shared description field from the first feature information of the at least two clothing pictures;
extracting the second sub-feature information corresponding to the shared description field from the second feature information of the at least two clothing pictures;
and forming a first feature information group from the extracted first sub-feature information and the corresponding second sub-feature information.
3. The GAN network training method of claim 2, wherein the step of extracting the first sub-feature information corresponding to the shared description field from the first feature information of the at least two clothing pictures comprises:
extracting the feature information segment that is identical across the first feature information of the at least two clothing pictures, as the first sub-feature information corresponding to the shared description field.
4. The GAN network training method of claim 1, wherein each training sample of the plurality of training samples shares at least one clothing element with at least one other training sample.
5. A GAN network training device, characterized by comprising:
a first acquisition module, configured to acquire a plurality of training samples, where each training sample comprises a clothing picture and descriptive text describing the clothing picture;
a first extraction module, configured to extract, for each training sample, first feature information of the clothing picture and second feature information of the corresponding descriptive text, to form a feature information group comprising the first feature information and the second feature information;
and a training module, configured to train a preset GAN network according to each feature information group to obtain a target GAN network, where the input of the target GAN network is descriptive text and its output is a clothing picture; the clothing picture comprises a plurality of clothing elements, and the descriptive text comprises a plurality of description fields; the training module comprises: a dividing unit, configured to divide each feature information group into a plurality of first feature information groups, where each first feature information group comprises one piece of first sub-feature information and the corresponding second sub-feature information, the first sub-feature information corresponds to one clothing element, the second sub-feature information corresponds to one description field, the first feature information comprises a plurality of pieces of first sub-feature information, and the second feature information comprises a plurality of pieces of second sub-feature information; and a training unit, configured to train the preset GAN network according to the plurality of first feature information groups of each feature information group to obtain the target GAN network.
6. A clothing picture generation method, characterized by using a target GAN network trained by the GAN network training method of any one of claims 1 to 4, the method comprising the following steps:
acquiring descriptive text, input by a user, describing the target clothing picture to be generated;
and inputting the descriptive text into the target GAN network to generate a target clothing picture corresponding to the descriptive text.
7. A clothing picture generation device, characterized by using a target GAN network trained by the GAN network training method of any one of claims 1 to 4, the device comprising:
a second acquisition module, configured to acquire descriptive text, input by a user, describing the target clothing picture to be generated;
and a generation module, configured to input the descriptive text into the target GAN network to generate a target clothing picture corresponding to the descriptive text.
8. An electronic device comprising a processor and a memory storing computer-readable instructions which, when executed by the processor, perform the method of any one of claims 1-4 or 6.
CN202010520461.7A 2020-06-09 2020-06-09 GAN network training method, garment picture generation method and device and electronic equipment Active CN111667547B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010520461.7A CN111667547B (en) 2020-06-09 2020-06-09 GAN network training method, garment picture generation method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010520461.7A CN111667547B (en) 2020-06-09 2020-06-09 GAN network training method, garment picture generation method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN111667547A CN111667547A (en) 2020-09-15
CN111667547B true CN111667547B (en) 2023-08-11

Family

ID=72386466

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010520461.7A Active CN111667547B (en) 2020-06-09 2020-06-09 GAN network training method, garment picture generation method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN111667547B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114359435A (en) * 2022-03-17 2022-04-15 阿里巴巴(中国)有限公司 Image generation method, model generation method and equipment

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109783798A (en) * 2018-12-12 2019-05-21 平安科技(深圳)有限公司 Method, apparatus, terminal and the storage medium of text information addition picture
CN110021051A (en) * 2019-04-01 2019-07-16 浙江大学 One kind passing through text Conrad object image generation method based on confrontation network is generated
EP3518245A1 (en) * 2018-01-29 2019-07-31 Siemens Healthcare GmbH Image generation from a medical text report
CN110135336A (en) * 2019-05-14 2019-08-16 腾讯科技(深圳)有限公司 Training method, device and the storage medium of pedestrian's generation model
CN110163267A (en) * 2019-05-09 2019-08-23 厦门美图之家科技有限公司 A kind of method that image generates the training method of model and generates image
CN110443293A (en) * 2019-07-25 2019-11-12 天津大学 Based on double zero sample image classification methods for differentiating and generating confrontation network text and reconstructing
CN110490953A (en) * 2019-07-25 2019-11-22 维沃移动通信有限公司 Text based image generating method, terminal device and medium
US10540757B1 (en) * 2018-03-12 2020-01-21 Amazon Technologies, Inc. Method and system for generating combined images utilizing image processing of multiple images
CN110909754A (en) * 2018-09-14 2020-03-24 哈尔滨工业大学(深圳) Attribute generation countermeasure network and matching clothing generation method based on same

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10755479B2 (en) * 2017-06-27 2020-08-25 Mad Street Den, Inc. Systems and methods for synthesizing images of apparel ensembles on models

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3518245A1 (en) * 2018-01-29 2019-07-31 Siemens Healthcare GmbH Image generation from a medical text report
US10540757B1 (en) * 2018-03-12 2020-01-21 Amazon Technologies, Inc. Method and system for generating combined images utilizing image processing of multiple images
CN110909754A (en) * 2018-09-14 2020-03-24 哈尔滨工业大学(深圳) Attribute generation countermeasure network and matching clothing generation method based on same
KR20200034917A (en) * 2018-09-14 2020-04-01 하얼빈 인스티튜트 오브 테크놀로지, 썬전 An attribute generation contention network and a clothing matching generation method based on the network
CN109783798A (en) * 2018-12-12 2019-05-21 平安科技(深圳)有限公司 Method, apparatus, terminal and the storage medium of text information addition picture
CN110021051A (en) * 2019-04-01 2019-07-16 浙江大学 One kind passing through text Conrad object image generation method based on confrontation network is generated
CN110163267A (en) * 2019-05-09 2019-08-23 厦门美图之家科技有限公司 A kind of method that image generates the training method of model and generates image
CN110135336A (en) * 2019-05-14 2019-08-16 腾讯科技(深圳)有限公司 Training method, device and the storage medium of pedestrian's generation model
CN110443293A (en) * 2019-07-25 2019-11-12 天津大学 Based on double zero sample image classification methods for differentiating and generating confrontation network text and reconstructing
CN110490953A (en) * 2019-07-25 2019-11-22 维沃移动通信有限公司 Text based image generating method, terminal device and medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A survey of text description methods for images; 马龙龙, 韩先培, 孙乐; Journal of Chinese Information Processing (04); full text *

Also Published As

Publication number Publication date
CN111667547A 2020-09-15


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant