CN111667547A - GAN network training method, clothing picture generation method, device and electronic equipment - Google Patents
- Publication number: CN111667547A (application CN202010520461.7A)
- Authority: CN (China)
- Prior art keywords: characteristic information, clothing, training, gan network, sub
- Prior art date
- Legal status: Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/001—Texturing; Colouring; Generation of texture or colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformation in the plane of the image
- G06T3/40—Scaling the whole image or part thereof
- G06T3/4038—Scaling the whole image or part thereof for image mosaicing, i.e. plane images composed of plane sub-images
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The application provides a GAN network training method, a clothing picture generation method and apparatus, and an electronic device. The GAN network training method comprises the following steps: acquiring a plurality of training samples, wherein each training sample comprises a clothing picture and description text describing the clothing picture; extracting first feature information of the clothing picture and corresponding second feature information of the description text in each training sample, so as to form a feature information group containing the first feature information and the second feature information; and training a preset GAN network according to each feature information group to obtain a target GAN network. The method and apparatus can generate a clothing picture corresponding to a user's description text and meet the user's personalized requirements for clothing design, so the degree of automation of clothing picture generation is improved and clothing pictures can be generated quickly and automatically.
Description
Technical Field
The application relates to the technical field of computer networks, and in particular to a GAN network training method, a clothing picture generation method and apparatus, and an electronic device.
Background
Existing clothing style generation systems mostly assemble and combine elements of clothing pictures to produce new clothing pictures. Such methods lack evaluation by a human aesthetic system: although various clothing styles can be generated at once, the desired clothing picture cannot be generated according to a user's preference, and the user's personalized clothing requirements cannot be met quickly.
In view of the above problems, no effective technical solution currently exists.
Disclosure of Invention
An object of the embodiments of the present application is to provide a GAN network training method, a clothing picture generation method and apparatus, and an electronic device, which can generate a corresponding clothing picture based on a user's description text and thereby meet the user's personalized requirements for clothing design.
In a first aspect, an embodiment of the present application provides a GAN network training method, comprising the following steps:
acquiring a plurality of training samples, wherein each training sample comprises a clothing picture and description text describing the clothing picture;
extracting first feature information of the clothing picture and corresponding second feature information of the description text in each training sample, so as to form a feature information group containing the first feature information and the second feature information;
and training a preset GAN network according to each feature information group to obtain a target GAN network.
According to the embodiments of the present application, the target GAN network is trained on clothing pictures and their description text, so a corresponding clothing picture can be generated from a user's description text. This meets the user's personalized requirements for clothing design, improves the degree of automation of clothing picture generation, and allows clothing pictures to be generated quickly and automatically.
Optionally, in the GAN network training method according to the embodiment of the present application, the clothing picture comprises a plurality of clothing elements, and the description text comprises a plurality of description fields;
the step of training a preset GAN network according to each feature information group to obtain a target GAN network comprises:
dividing each feature information group into a plurality of first feature information groups, wherein each first feature information group comprises a piece of first sub-feature information and corresponding second sub-feature information, the first sub-feature information corresponds to one clothing element, the second sub-feature information corresponds to one description field, the first feature information comprises a plurality of pieces of first sub-feature information, and the second feature information comprises a plurality of pieces of second sub-feature information;
and training a preset GAN network according to the plurality of first feature information groups of each feature information group to obtain a target GAN network.
Optionally, in the GAN network training method according to the embodiment of the present application, the step of dividing each feature information group into a plurality of first feature information groups comprises:
performing word segmentation on each piece of description text to obtain a plurality of description fields;
screening out, from the description text of the clothing pictures, at least two clothing pictures that share the same description field;
extracting the first sub-feature information corresponding to the shared description field from the first feature information of the at least two clothing pictures;
extracting the second sub-feature information corresponding to the shared description field from the second feature information of the at least two clothing pictures;
and combining the extracted first sub-feature information and the corresponding second sub-feature information into a first feature information group.
Optionally, in the GAN network training method according to the embodiment of the present application, the step of extracting the first sub-feature information corresponding to the shared description field from the first feature information of the at least two clothing pictures comprises:
extracting the feature information segment common to the first feature information of the at least two clothing pictures as the first sub-feature information corresponding to the shared description field.
Optionally, in the GAN network training method according to an embodiment of the present application, each training sample shares at least one clothing element with at least one other training sample.
In a second aspect, an embodiment of the present application provides a GAN network training apparatus, comprising:
a first obtaining module, configured to obtain a plurality of training samples, wherein each training sample comprises a clothing picture and description text describing the clothing picture;
a first extraction module, configured to extract first feature information of the clothing picture and corresponding second feature information of the description text in each training sample, so as to form a feature information group containing the first feature information and the second feature information;
and a training module, configured to train a preset GAN network according to each feature information group to obtain a target GAN network.
Optionally, in the GAN network training apparatus according to the embodiment of the present application, the clothing picture comprises a plurality of clothing elements, the description text comprises a plurality of description fields, and the training module comprises:
a dividing unit, configured to divide each feature information group into a plurality of first feature information groups, wherein each first feature information group comprises a piece of first sub-feature information and corresponding second sub-feature information, the first sub-feature information corresponds to one clothing element, the second sub-feature information corresponds to one description field, the first feature information comprises a plurality of pieces of first sub-feature information, and the second feature information comprises a plurality of pieces of second sub-feature information;
and a training unit, configured to train a preset GAN network according to the plurality of first feature information groups of each feature information group to obtain a target GAN network.
In a third aspect, an embodiment of the present application provides a clothing picture generation method that uses a target GAN network obtained by training with any of the GAN network training methods described above, comprising the following steps:
obtaining description text, input by a user, that describes a target clothing picture to be generated;
and inputting the description text into the target GAN network to generate a target clothing picture corresponding to the description text.
In a fourth aspect, an embodiment of the present application provides a clothing picture generation apparatus that uses a target GAN network obtained by training with any of the GAN network training methods described above, comprising:
a second obtaining module, configured to obtain description text, input by a user, that describes a target clothing picture to be generated;
and a generation module, configured to input the description text into the target GAN network to generate a target clothing picture corresponding to the description text.
In a fifth aspect, an embodiment of the present application provides an electronic device comprising a processor and a memory, the memory storing computer-readable instructions which, when executed by the processor, perform the steps of the method provided in the first aspect.
In a sixth aspect, embodiments of the present application provide a storage medium on which a computer program is stored, the computer program, when executed by a processor, performing the steps of the method provided in the first aspect.
As can be seen from the above, in the embodiments of the present application, a plurality of training samples are obtained, each comprising a clothing picture and description text describing the clothing picture; first feature information of the clothing picture and corresponding second feature information of the description text are extracted for each training sample to form a feature information group; and a preset GAN network is trained according to each feature information group to obtain a target GAN network. The target GAN network so trained can generate a clothing picture corresponding to a user's description text, meeting the user's personalized requirements for clothing design, improving the degree of automation of clothing picture generation, and allowing clothing pictures to be generated quickly and automatically.
Additional features and advantages of the present application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the embodiments of the present application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be considered limiting of its scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.
Fig. 1 is a flowchart of a GAN network training method according to an embodiment of the present application.
Fig. 2 is a schematic structural diagram of a GAN network training apparatus according to an embodiment of the present application.
Fig. 3 is a flowchart of a clothing picture generation method according to an embodiment of the present application.
Fig. 4 is a schematic structural diagram of a clothing picture generation apparatus according to an embodiment of the present application.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. The components of the embodiments, as generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of configurations. Thus, the following detailed description is not intended to limit the scope of the claimed application but merely represents selected embodiments. All other embodiments derived by a person skilled in the art from the embodiments of the present application without creative effort shall fall within the protection scope of the present application.
It should be noted that like reference numerals and letters refer to like items in the following figures; once an item is defined in one figure, it need not be further defined or explained in subsequent figures. In addition, in the description of the present application the terms "first", "second" and the like are used only to distinguish the description and are not to be construed as indicating or implying relative importance.
Referring to fig. 1, fig. 1 is a flowchart of a GAN network training method in some embodiments of the present application. The GAN network training method comprises the following steps:
S101, obtaining a plurality of training samples, wherein each training sample comprises a clothing picture and description text describing the clothing picture.
S102, extracting first feature information of the clothing picture and corresponding second feature information of the description text in each training sample, so as to form a feature information group containing the first feature information and the second feature information.
S103, training a preset GAN network according to each feature information group to obtain a target GAN network.
In step S101, the training samples are clothing pictures obtained from a network, with corresponding description text added to describe the clothing in each picture. Each clothing picture contains a plurality of clothing elements; correspondingly, its description text contains a plurality of description fields, each description field corresponding to one clothing element. For example, for a black high-collar sweater, the corresponding description fields are "black", "high collar" and "sweater".
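As a hypothetical illustration only (the data layout, file name and field names below are assumptions, not part of the patent), a training sample pairing a clothing picture with its description text and description fields might be modeled as:

```python
from dataclasses import dataclass

@dataclass
class TrainingSample:
    """One training sample: a clothing picture plus its description text."""
    image_path: str   # the clothing picture
    description: str  # the description text
    fields: list      # description fields, each naming one clothing element

sample = TrainingSample(
    image_path="sweater_001.jpg",                # hypothetical file name
    description="black high-collar sweater",
    fields=["black", "high collar", "sweater"],  # one field per clothing element
)
assert len(sample.fields) == 3
```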
Preferably, each training sample shares at least one clothing element with at least one other training sample. For example, if garment A is a black high-collar sweater and garment B is a white round-collar sweater, the shared clothing element of garments A and B is "sweater". Further, each clothing element appears at least twice across the plurality of training samples.
In step S102, a CNN may be used to extract the first feature information from each clothing picture, and an RNN may be used to extract the second feature information from each piece of description text. During collection, each corresponding clothing picture and its description text are input into the CNN and the RNN respectively, so that a corresponding pair of first and second feature information is collected; this pair forms one feature information group.
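The patent does not fix concrete CNN or RNN architectures. As a minimal PyTorch sketch under that caveat (all layer sizes, dimensions and class names are illustrative assumptions), the paired extraction of a feature information group could look like this:

```python
import torch
import torch.nn as nn

class ImageEncoder(nn.Module):
    """Tiny CNN producing the 'first feature information' of a clothing picture."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),          # global pooling to one vector
        )
        self.fc = nn.Linear(32, feat_dim)

    def forward(self, img):                   # img: (B, 3, H, W)
        return self.fc(self.conv(img).flatten(1))

class TextEncoder(nn.Module):
    """Tiny RNN producing the 'second feature information' of the description text."""
    def __init__(self, vocab_size=1000, feat_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, 64)
        self.rnn = nn.GRU(64, feat_dim, batch_first=True)

    def forward(self, tokens):                # tokens: (B, T) integer ids
        _, h = self.rnn(self.embed(tokens))
        return h[-1]                          # final hidden state, (B, feat_dim)

# a corresponding picture/text pair yields one feature information group
img_feat = ImageEncoder()(torch.randn(2, 3, 64, 64))
txt_feat = TextEncoder()(torch.randint(0, 1000, (2, 5)))
group = (img_feat, txt_feat)
assert img_feat.shape == txt_feat.shape == (2, 128)
```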
In step S103, the feature information groups are input into a preset GAN network for training to obtain the target GAN network; that is, the first feature information of each clothing picture and the second feature information of its description text are input into the preset GAN network in pairs, and the target GAN network is obtained through training.
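The patent leaves the adversarial objective unspecified. The sketch below assumes a standard text-conditioned GAN step, with the generator conditioned on the second (text) feature and the discriminator judging image-feature/text-feature pairs; the module shapes, loss choice and learning rates are illustrative assumptions, not the patented method:

```python
import torch
import torch.nn as nn

feat_dim, noise_dim = 128, 32

# Generator: text feature + noise -> synthetic image feature (picture stand-in)
G = nn.Sequential(nn.Linear(feat_dim + noise_dim, 256), nn.ReLU(),
                  nn.Linear(256, feat_dim))
# Discriminator: (image feature, text feature) pair -> real/fake score
D = nn.Sequential(nn.Linear(feat_dim * 2, 256), nn.ReLU(),
                  nn.Linear(256, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(img_feat, txt_feat):
    """One adversarial update on a batch of feature information groups."""
    b = img_feat.size(0)
    noise = torch.randn(b, noise_dim)
    fake = G(torch.cat([txt_feat, noise], dim=1))

    # discriminator: real pairs -> 1, fake pairs -> 0
    d_loss = (bce(D(torch.cat([img_feat, txt_feat], 1)), torch.ones(b, 1)) +
              bce(D(torch.cat([fake.detach(), txt_feat], 1)), torch.zeros(b, 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # generator: try to fool the discriminator
    g_loss = bce(D(torch.cat([fake, txt_feat], 1)), torch.ones(b, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

d_loss, g_loss = train_step(torch.randn(4, feat_dim), torch.randn(4, feat_dim))
assert d_loss > 0 and g_loss > 0
```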
Specifically, in some embodiments, step S103 comprises the following sub-steps: S1031, dividing each feature information group into a plurality of first feature information groups, wherein each first feature information group comprises a piece of first sub-feature information and corresponding second sub-feature information, the first sub-feature information corresponds to one clothing element, the second sub-feature information corresponds to one description field, the first feature information comprises a plurality of pieces of first sub-feature information, and the second feature information comprises a plurality of pieces of second sub-feature information; S1032, training a preset GAN network according to the plurality of first feature information groups of each feature information group to obtain a target GAN network.
In step S1031, when a clothing picture and its description text are first input, the description text is divided into a plurality of description fields, and the clothing element for each description field is marked in the clothing picture. During feature extraction, each time the CNN extracts the first sub-feature information of one clothing element, the RNN extracts the second sub-feature information of the corresponding description field; the sub-feature information of each clothing element and its description field is thus extracted in pairs, step by step, to form a plurality of first feature information groups.
It is to be understood that, in some embodiments, step S1031 comprises: S10311, performing word segmentation on each piece of description text to obtain a plurality of description fields; S10312, screening out at least two clothing pictures whose description text shares the same description field; S10313, extracting the first sub-feature information corresponding to the shared description field from the first feature information of the at least two clothing pictures; S10314, extracting the second sub-feature information corresponding to the shared description field from the second feature information of the at least two clothing pictures; and S10315, combining the extracted first sub-feature information and the corresponding second sub-feature information into a first feature information group.
In step S10311, for example, the description text "white short-sleeved shirt" yields the three description fields "white", "short-sleeved" and "shirt" after word segmentation. In step S10312, the description text of the screened clothing pictures shares exactly one common description field. For example, the three description fields of clothing picture A are white, long-sleeved and checked shirt; those of clothing picture B are black, short-sleeved and checked shirt; and those of clothing picture C are red, mid-length-sleeved and checked shirt. Clothing pictures A, B and C are therefore screened out as sharing the description field "checked shirt". In step S10313, the common feature information segment is found in the first feature information of clothing pictures A, B and C, which determines the first sub-feature information corresponding to "checked shirt"; repeating this procedure determines the first sub-feature information for each description field in turn. Step S10314 follows the same principle as step S10313: the feature information segment common to the second feature information of the description text of clothing pictures A, B and C is the second sub-feature information corresponding to the shared description field. In step S10315, each piece of first sub-feature information and its corresponding second sub-feature information are combined into a first feature information group.
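A minimal pure-Python sketch of steps S10311 and S10312, using the A/B/C example above; the comma-based "word segmentation" and the field strings are illustrative assumptions (a real system would use a proper Chinese segmenter):

```python
from itertools import combinations

def segment(description):
    """S10311: naive word segmentation of description text into description fields."""
    return set(description.split(", "))

pictures = {
    "A": segment("white, long-sleeved, checked shirt"),
    "B": segment("black, short-sleeved, checked shirt"),
    "C": segment("red, mid-length-sleeved, checked shirt"),
}

def screen_common_field(pictures):
    """S10312: group pictures whose description text shares exactly one field."""
    groups = {}
    for (a, fa), (b, fb) in combinations(pictures.items(), 2):
        common = fa & fb
        if len(common) == 1:  # exactly one shared description field
            groups.setdefault(common.pop(), set()).update({a, b})
    return groups

groups = screen_common_field(pictures)
assert groups == {"checked shirt": {"A", "B", "C"}}
```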
As can be seen from the above, in the embodiments of the present application, a plurality of training samples are obtained, each comprising a clothing picture and description text describing the clothing picture; first feature information of the clothing picture and corresponding second feature information of the description text are extracted for each training sample to form a feature information group; and a preset GAN network is trained according to each feature information group to obtain a target GAN network. The target GAN network so trained can generate a clothing picture corresponding to a user's description text, meeting the user's personalized requirements for clothing design, improving the degree of automation of clothing picture generation, and allowing clothing pictures to be generated quickly and automatically.
Referring to fig. 2, fig. 2 is a schematic structural diagram of a GAN network training apparatus in some embodiments of the present application. The GAN network training apparatus comprises: a first obtaining module 201, a first extraction module 202 and a training module 203.
The first obtaining module 201 is configured to obtain a plurality of training samples, each comprising a clothing picture and description text describing the clothing picture. The training samples are clothing pictures obtained from a network, with corresponding description text added to describe the clothing in each picture. Each clothing picture contains a plurality of clothing elements; correspondingly, its description text contains a plurality of description fields, each description field corresponding to one clothing element. For example, for a black high-collar sweater, the corresponding description fields are "black", "high collar" and "sweater".
Preferably, each training sample shares at least one clothing element with at least one other training sample. For example, if garment A is a black high-collar sweater and garment B is a white round-collar sweater, the shared clothing element of garments A and B is "sweater". Further, each clothing element appears at least twice across the plurality of training samples.
The first extraction module 202 is configured to extract first feature information of the clothing picture and second feature information of the corresponding description text in each training sample, so as to form a feature information group containing the first feature information and the second feature information. The first extraction module 202 may use a CNN to extract the first feature information from each clothing picture and an RNN to extract the second feature information from each piece of description text. During collection, each corresponding clothing picture and its description text are input into the CNN and the RNN respectively, so that a corresponding pair of first and second feature information is collected; this pair forms one feature information group.
The training module 203 is configured to train a preset GAN network according to each feature information group to obtain a target GAN network. The training module 203 inputs the feature information groups into the preset GAN network for training; that is, the first feature information of each clothing picture and the second feature information of its description text are input into the preset GAN network in pairs, and the target GAN network is obtained through training.
Specifically, in some embodiments, the training module 203 comprises: a dividing unit, configured to divide each feature information group into a plurality of first feature information groups, wherein each first feature information group comprises a piece of first sub-feature information and corresponding second sub-feature information, the first sub-feature information corresponds to one clothing element, the second sub-feature information corresponds to one description field, the first feature information comprises a plurality of pieces of first sub-feature information, and the second feature information comprises a plurality of pieces of second sub-feature information; and a training unit, configured to train a preset GAN network according to the plurality of first feature information groups of each feature information group to obtain a target GAN network. Specifically, when a clothing picture and its description text are input, the description text is divided into a plurality of description fields, and the clothing element for each description field is marked in the clothing picture; during feature extraction, each time the CNN extracts the first sub-feature information of one clothing element, the RNN extracts the second sub-feature information of the corresponding description field, and the sub-feature information of each clothing element and its description field is thus extracted in pairs, step by step, to form a plurality of first feature information groups.
It will be appreciated that, in some embodiments, the dividing unit is configured to: perform word segmentation on each piece of description text to obtain a plurality of description fields; screen out at least two clothing pictures whose description text shares the same description field; extract the first sub-feature information corresponding to the shared description field from the first feature information of the at least two clothing pictures; extract the second sub-feature information corresponding to the shared description field from the second feature information of the at least two clothing pictures; and combine the extracted first sub-feature information and the corresponding second sub-feature information into a first feature information group.
For example, the description text "white short-sleeved shirt" yields the three description fields "white", "short-sleeved" and "shirt" after word segmentation. The description text of the screened clothing pictures shares exactly one common description field. For example, the three description fields of clothing picture A are white, long-sleeved and checked shirt; those of clothing picture B are black, short-sleeved and checked shirt; and those of clothing picture C are red, mid-length-sleeved and checked shirt. Clothing pictures A, B and C are therefore screened out as sharing the description field "checked shirt". The feature information segment common to the first feature information of clothing pictures A, B and C determines the first sub-feature information corresponding to "checked shirt"; repeating this procedure determines the first sub-feature information for each description field in turn.
As can be seen from the above, in the embodiments of the present application, a plurality of training samples are obtained, each comprising a clothing picture and description text describing the clothing picture; first feature information of the clothing picture and corresponding second feature information of the description text are extracted for each training sample to form a feature information group; and a preset GAN network is trained according to each feature information group to obtain a target GAN network. The target GAN network so trained can generate a clothing picture corresponding to a user's description text, meeting the user's personalized requirements for clothing design, improving the degree of automation of clothing picture generation, and allowing clothing pictures to be generated quickly and automatically.
Referring to fig. 3, fig. 3 is a flowchart of a clothing picture generation method in some embodiments of the present application; the method generates pictures using a target GAN network obtained by training with the GAN network training method of any of the above embodiments. The method comprises the following steps:
S301, obtaining the description text, input by a user, for describing the target clothing picture to be generated.
S302, inputting the description text into the target GAN network to generate a target clothing picture corresponding to the description text.
In step S301, the description text may include only one description field, or it may include a plurality of description fields.
In step S302, when the description text includes only one description field, the target GAN network directly extracts the second sub-feature information of that field and then outputs a plurality of clothing pictures whose first sub-feature information corresponds to it.
In some embodiments, when the description text has a plurality of description fields, the target GAN network extracts the second sub-feature information of each description field and generates a clothing picture having the first sub-feature information corresponding to those pieces of second sub-feature information.
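As a rough illustration of this inference path, the sketch below segments the description text, derives one second sub-feature vector per description field, and conditions a caller-supplied generator on their concatenation plus noise. The hash-based embedding and the `generator` callable are hypothetical placeholders, not the target GAN network itself.

```python
import hashlib
import numpy as np

def text_sub_feature(field, dim=4):
    # Hypothetical second sub-feature extractor: a deterministic embedding
    # seeded from a hash of the description field, so the same field always
    # maps to the same vector.
    seed = int.from_bytes(hashlib.md5(field.encode()).digest()[:4], "big")
    return np.random.default_rng(seed).standard_normal(dim)

def generate_picture(description, generator, dim=4):
    """Condition a (hypothetical) generator on the concatenated second
    sub-feature information of every description field in the text."""
    fields = description.split()
    condition = np.concatenate([text_sub_feature(f, dim) for f in fields])
    noise = np.random.default_rng(0).standard_normal(dim)
    return generator(np.concatenate([condition, noise]))
```

With three description fields and `dim=4`, the generator receives a 16-dimensional input (12 conditioning values plus 4 noise values); a real system would replace the identity-style generator with the trained network.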
Referring to fig. 4, fig. 4 is a schematic structural diagram of a clothing picture generation device in some embodiments of the present application. The device generates pictures using a target GAN network obtained by training with the GAN network training method of any of the above embodiments, and comprises a second obtaining module 401 and a generating module 402.
The second obtaining module 401 is configured to obtain the description text, input by a user, for describing the target clothing picture to be generated. The description text may include only one description field or a plurality of description fields.
The generating module 402 is configured to input the description text into the target GAN network to generate a target clothing picture corresponding to it. When the description text includes only one description field, the target GAN network directly extracts the second sub-feature information of that field and then outputs a plurality of clothing pictures whose first sub-feature information corresponds to it. In some embodiments, when the description text has a plurality of description fields, the target GAN network extracts the second sub-feature information of each description field and generates a clothing picture having the corresponding first sub-feature information.
Referring to fig. 5, fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device 5 comprises a processor 501 and a memory 502, interconnected and communicating with each other via a communication bus 503 and/or another form of connection mechanism (not shown). The memory 502 stores a computer program executable by the processor 501; when the computing device runs, the processor 501 executes the computer program to perform the method in any optional implementation of the above embodiments.
An embodiment of the present application provides a storage medium storing a computer program which, when executed by a processor, performs the method in any optional implementation of the above embodiments. The storage medium may be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc.
As can be seen from the above, in the embodiment of the present application, a plurality of training samples are obtained, each comprising a clothing picture and a description text describing it; the first feature information of the clothing picture and the second feature information of the corresponding description text are extracted from each training sample to form a feature information group; and a preset GAN network is trained on these feature information groups to obtain the target GAN network. The trained target GAN network can generate the corresponding clothing picture from a user's description text, meeting users' individual clothing-design requirements while raising the degree of automation, so that clothing pictures can be generated quickly and automatically.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative; for example, the division into units is only a logical division, and other divisions are possible in actual implementation: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through communication interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
In addition, units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
Furthermore, the functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
In this document, relational terms such as first and second may be used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between those entities or actions.
The above description is only an example of the present application and is not intended to limit its scope; those skilled in the art may make various modifications and changes. Any modification, equivalent replacement, improvement, or the like made within the spirit and principle of the present application shall fall within its protection scope.
Claims (10)
1. A GAN network training method is characterized by comprising the following steps:
acquiring a plurality of training samples, wherein each training sample comprises a clothing picture and description words for describing the clothing picture;
extracting first characteristic information of the clothing picture in each training sample and corresponding second characteristic information of the description characters to form a characteristic information group with the first characteristic information and the second characteristic information;
and training a preset GAN network according to each characteristic information group to obtain a target GAN network.
2. The GAN network training method of claim 1, wherein the clothing picture comprises a plurality of clothing elements, and the descriptive text comprises a plurality of descriptive fields;
the step of training a preset GAN network according to each characteristic information group to obtain a target GAN network comprises the following steps:
dividing each characteristic information group into a plurality of first characteristic information groups, wherein each first characteristic information group comprises first sub-characteristic information and corresponding second sub-characteristic information, the first sub-characteristic information corresponds to one clothing element, the second sub-characteristic information corresponds to one description field, the first characteristic information comprises a plurality of first sub-characteristic information, and the second characteristic information comprises a plurality of second sub-characteristic information;
and training a preset GAN network according to the plurality of first characteristic information groups of each characteristic information group to obtain a target GAN network.
3. The GAN network training method of claim 2, wherein the step of dividing each of the characteristic information groups into a plurality of first characteristic information groups comprises:
performing word segmentation processing on each description character to obtain a plurality of description fields;
screening at least two clothing pictures with the same description field from description texts of the clothing pictures;
extracting first sub-feature information corresponding to the same description field from first feature information corresponding to the at least two clothing pictures;
extracting second sub-feature information corresponding to the same description field from second feature information corresponding to the at least two clothing pictures;
and combining the extracted first sub-feature information and the corresponding second sub-feature information into a first feature information group.
4. The GAN network training method of claim 3, wherein the step of extracting the first sub-feature information corresponding to the same description field from the first feature information corresponding to the at least two clothing pictures comprises:
extracting the same characteristic information segment in the corresponding first characteristic information of the at least two clothing pictures to be used as the first sub-characteristic information corresponding to the same description field.
5. The GAN network training method of claim 2, wherein each of the plurality of training samples has the same clothing element as at least one other training sample.
6. A GAN network training apparatus, comprising:
a first acquisition module, used for acquiring a plurality of training samples, wherein each training sample comprises a clothing picture and description words for describing the clothing picture;
the first extraction module is used for extracting first characteristic information of the clothing pictures in each training sample and corresponding second characteristic information of the description characters to form a characteristic information group with the first characteristic information and the second characteristic information;
and the training module is used for training a preset GAN network according to each characteristic information group to obtain a target GAN network.
7. The GAN network training device of claim 6, wherein the clothing picture comprises a plurality of clothing elements, and the descriptive text comprises a plurality of descriptive fields;
the training module comprises:
the dividing unit is used for dividing each characteristic information group into a plurality of first characteristic information groups, each first characteristic information group comprises first sub-characteristic information and corresponding second sub-characteristic information, the first sub-characteristic information corresponds to one clothing element, the second sub-characteristic information corresponds to one description field, the first characteristic information comprises a plurality of first sub-characteristic information, and the second characteristic information comprises a plurality of second sub-characteristic information;
and the training unit is used for training a preset GAN network according to the plurality of first characteristic information groups of each characteristic information group to obtain a target GAN network.
8. A clothing picture generating method, characterized in that picture generation is performed by using a target GAN network obtained by training with the GAN network training method of any one of claims 1 to 5, the method comprising the following steps:
obtaining description words input by a user and used for describing a target clothing picture to be generated;
and inputting the description words into the target GAN network to generate target clothing pictures corresponding to the description words.
9. A clothing picture generating device, characterized in that it adopts a target GAN network obtained by training with the GAN network training method according to any one of claims 1 to 5, the device comprising:
the second acquisition module is used for acquiring description characters which are input by a user and are used for describing a target clothing picture to be generated;
and the generating module is used for inputting the description characters into the target GAN network so as to generate a target clothing picture corresponding to the description characters.
10. An electronic device comprising a processor and a memory, the memory storing computer readable instructions that, when executed by the processor, perform the method of any of claims 1-5 or 8.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010520461.7A (CN111667547B) | 2020-06-09 | 2020-06-09 | GAN network training method, garment picture generation method and device and electronic equipment |

Applications Claiming Priority (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010520461.7A (CN111667547B) | 2020-06-09 | 2020-06-09 | GAN network training method, garment picture generation method and device and electronic equipment |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN111667547A (application) | 2020-09-15 |
| CN111667547B (grant) | 2023-08-11 |

Family

ID=72386466

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202010520461.7A (granted as CN111667547B, active) | GAN network training method, garment picture generation method and device and electronic equipment | 2020-06-09 | 2020-06-09 |

Country Status (1)

| Country | Link |
|---|---|
| CN | CN111667547B |
Cited By (1)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114359435A * | 2022-03-17 | 2022-04-15 | 阿里巴巴(中国)有限公司 | Image generation method, model generation method and equipment |
Citations (10)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109783798A * | 2018-12-12 | 2019-05-21 | 平安科技(深圳)有限公司 | Method, apparatus, terminal and storage medium for adding pictures to text information |
| CN110021051A * | 2019-04-01 | 2019-07-16 | 浙江大学 | Text-guided object image generation method based on generative adversarial networks |
| EP3518245A1 * | 2018-01-29 | 2019-07-31 | Siemens Healthcare GmbH | Image generation from a medical text report |
| CN110135336A * | 2019-05-14 | 2019-08-16 | 腾讯科技(深圳)有限公司 | Training method, device and storage medium for a pedestrian generation model |
| CN110163267A * | 2019-05-09 | 2019-08-23 | 厦门美图之家科技有限公司 | Training method for an image generation model and method for generating images |
| US20190287301A1 * | 2017-06-27 | 2019-09-19 | Mad Street Den, Inc. | Systems and Methods for Synthesizing Images of Apparel Ensembles on Models |
| CN110443293A * | 2019-07-25 | 2019-11-12 | 天津大学 | Zero-shot image classification method based on dual discrimination and generative adversarial text reconstruction |
| CN110490953A * | 2019-07-25 | 2019-11-22 | 维沃移动通信有限公司 | Text-based image generation method, terminal device and medium |
| US10540757B1 * | 2018-03-12 | 2020-01-21 | Amazon Technologies, Inc. | Method and system for generating combined images utilizing image processing of multiple images |
| CN110909754A * | 2018-09-14 | 2020-03-24 | 哈尔滨工业大学(深圳) | Attribute generative adversarial network and matching-clothing generation method based on it |
2020-06-09: CN application CN202010520461.7A filed, granted as CN111667547B (active)
Patent Citations (11)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20190287301A1 * | 2017-06-27 | 2019-09-19 | Mad Street Den, Inc. | Systems and Methods for Synthesizing Images of Apparel Ensembles on Models |
| EP3518245A1 * | 2018-01-29 | 2019-07-31 | Siemens Healthcare GmbH | Image generation from a medical text report |
| US10540757B1 * | 2018-03-12 | 2020-01-21 | Amazon Technologies, Inc. | Method and system for generating combined images utilizing image processing of multiple images |
| CN110909754A * | 2018-09-14 | 2020-03-24 | 哈尔滨工业大学(深圳) | Attribute generative adversarial network and matching-clothing generation method based on it |
| KR20200034917A * | 2018-09-14 | 2020-04-01 | 하얼빈 인스티튜트 오브 테크놀로지, 썬전 | An attribute generative adversarial network and a clothing-matching generation method based on the network |
| CN109783798A * | 2018-12-12 | 2019-05-21 | 平安科技(深圳)有限公司 | Method, apparatus, terminal and storage medium for adding pictures to text information |
| CN110021051A * | 2019-04-01 | 2019-07-16 | 浙江大学 | Text-guided object image generation method based on generative adversarial networks |
| CN110163267A * | 2019-05-09 | 2019-08-23 | 厦门美图之家科技有限公司 | Training method for an image generation model and method for generating images |
| CN110135336A * | 2019-05-14 | 2019-08-16 | 腾讯科技(深圳)有限公司 | Training method, device and storage medium for a pedestrian generation model |
| CN110443293A * | 2019-07-25 | 2019-11-12 | 天津大学 | Zero-shot image classification method based on dual discrimination and generative adversarial text reconstruction |
| CN110490953A * | 2019-07-25 | 2019-11-22 | 维沃移动通信有限公司 | Text-based image generation method, terminal device and medium |
Non-Patent Citations (2)

- Andrej Karpathy; Li Fei-Fei: "Deep Visual-Semantic Alignments for Generating Image Descriptions", IEEE Transactions on Pattern Analysis and Machine Intelligence. *
- 马龙龙; 韩先培; 孙乐: "图像的文本描述方法研究综述" (A survey of textual description methods for images), 中文信息学报 (Journal of Chinese Information Processing), no. 04. *
Also Published As

| Publication number | Publication date |
|---|---|
| CN111667547B | 2023-08-11 |
Similar Documents

- CN109947967B: Image recognition method, image recognition device, storage medium and computer equipment
- US20160117295A1: Method and apparatus for forming a structured document from unstructured information
- CN107798001B: Webpage processing method, device and equipment
- CN107590524A: Preparation method and device of an electronic business card
- US10114888B2: Terminal, system, method, and program for presenting sentence candidate
- JP6459231B2: Template management apparatus and program
- CN108280051A: Method, device and equipment for detecting erroneous characters in text data
- JP7132046B2: Search device, search method and program
- CN113361525A: OCR-based page generation method and device, computer equipment and storage medium
- CN113849748A: Information display method and device, electronic equipment and readable storage medium
- CN112131837A: Service report configuration method, device, computer equipment and storage medium
- US11119886B2: Software analysis apparatus, software analysis method, and computer readable medium
- CN111667547B: GAN network training method, garment picture generation method and device and electronic equipment
- CN113962199A: Text recognition method, device, equipment, storage medium and program product
- CN106293369B: Barrage-based interaction method, interaction device and user equipment
- CN113836272A: Key information display method and system, computer equipment and readable storage medium
- CN113094287A: Page compatibility detection method, device, equipment and storage medium
- CN109271607A: User page layout detection method and device, and electronic equipment
- JP6499763B2: Method and apparatus for verifying video information
- WO2023273501A1: AR interaction method and apparatus, electronic device, medium and program
- US20190018841A1: Term extraction method and apparatus
- US9437020B2: System and method to check the correct rendering of a font
- KR102444172B1: Method and system for intelligent mining of digital image big data
- WO2019136920A1: Presentation method for visualization of topic evolution, application server, and computer readable storage medium
- JP2017010365A5: Dictionary terminal, information display control method, and information display control program
Legal Events

| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |