WO2023051244A1 - Image generation method, apparatus, device and storage medium - Google Patents

Image generation method, apparatus, device and storage medium

Info

Publication number
WO2023051244A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
human body
clothing
map
segmentation
Prior art date
Application number
PCT/CN2022/118670
Other languages
English (en)
French (fr)
Inventor
刘礼杰
Original Assignee
北京字跳网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字跳网络技术有限公司
Publication of WO2023051244A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Definitions

  • Embodiments of the present disclosure relate to the technical field of image processing, for example, to an image generation method, apparatus, device, and storage medium.
  • Virtual dressing refers to applying image fusion technology to fuse a user's human body image with a clothing image containing target clothing to obtain an image of the user wearing the target clothing, so that the user can see the wearing effect of the target clothing without actually trying it on.
  • In the related art, an image fusion model is usually applied to extract features from the human body image and the clothing image separately, and a new image, that is, an image of the user wearing the target clothing, is generated based on the two extracted sets of image features.
  • However, because the image fusion model extracts only coarse image features, the newly generated image tends to lack detail information, which in turn leads to a distorted image generation effect and a poor virtual dressing result.
  • Embodiments of the present disclosure provide an image generation method, device, device, and storage medium, which can improve the authenticity of generated images.
  • an embodiment of the present disclosure provides an image generation method, including:
  • an embodiment of the present disclosure further provides an image generation device, including:
  • a human body image acquisition module configured to acquire a first human body image including the target human body and a first clothing image including the target clothing;
  • a segmentation map acquisition module configured to perform key point extraction, portrait segmentation, and human body part segmentation on the first human body image to obtain a key point feature map, a portrait segmentation map, and a human body part segmentation map;
  • a second clothing image acquisition module configured to input the key point feature map, the portrait segmentation map, the human body part segmentation map, and the first clothing image into a deformation model to obtain a deformed second clothing image;
  • a second human body image acquisition module configured to input the second clothing image, the first human body image, the key point feature map, the portrait segmentation map, and the human body part segmentation map into a hybrid model to obtain a second human body image, wherein the target human body in the second human body image wears the target clothing.
  • an embodiment of the present disclosure further provides an electronic device, and the electronic device includes:
  • a storage device configured to store one or more programs; and
  • one or more processing devices, wherein when the one or more programs are executed by the one or more processing devices, the one or more processing devices implement the image generation method according to the embodiments of the present disclosure.
  • the embodiments of the present disclosure further provide a computer-readable medium on which a computer program is stored, and when the program is executed by a processing device, the image generation method as described in the embodiments of the present disclosure is implemented.
  • FIG. 1 is a flowchart of an image generation method in an embodiment of the disclosure
  • Fig. 2 is a schematic diagram of a human body image and a clothing image in an embodiment of the present disclosure
  • Fig. 3a is an example diagram of human body key point extraction in an embodiment of the present disclosure
  • Fig. 3b is an example diagram of portrait segmentation in an embodiment of the present disclosure
  • Fig. 3c is an example diagram of human body part segmentation in an embodiment of the present disclosure.
  • Fig. 3d is an example diagram of adjusting the first human body image in an embodiment of the present disclosure
  • Fig. 4 is an example diagram of deforming the target clothing in an embodiment of the present disclosure
  • Fig. 5 is an example diagram of obtaining a human body image after changing clothes in an embodiment of the present disclosure
  • FIG. 6 is a schematic structural diagram of an image generating device in an embodiment of the present disclosure.
  • Fig. 7 is a schematic structural diagram of an electronic device in an embodiment of the present disclosure.
  • the term “comprise” and its variations are open-ended, i.e., “including but not limited to”.
  • the term “based on” is “based at least in part on”.
  • the term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one further embodiment”; the term “some embodiments” means “at least some embodiments.”
  • the relevant definitions of other terms will be given in the description below.
  • Fig. 1 is a flowchart of an image generation method provided by an embodiment of the present disclosure.
  • This embodiment is applicable to the situation of changing the clothes of a target person in a human body image. The method can be executed by an image generation device, which can be implemented in hardware and/or software and can generally be integrated into a device with an image generation function.
  • the device may be an electronic device such as a server, a mobile terminal, or a server cluster.
  • the method includes the following steps:
  • Step 110 acquiring a first human body image including the target human body and a first clothing image including the target clothing.
  • the target human body may be a portrait displayed in a certain posture
  • the target clothing may be clothing displayed in a tiled image.
  • FIG. 2 is a schematic diagram of a human body image and a clothing image. As shown in FIG. 2, the left side is the first clothing image including the target clothing, and the right side is the first human body image including the target human body. In Fig. 2, the target clothing is displayed in tiled (flat-lay) form.
  • Step 120 perform key point extraction, portrait segmentation, and human body part segmentation on the first human body image to obtain a key point feature map, a portrait segmentation map, and a human body part segmentation map.
  • human body key point extraction can be understood as human body pose estimation.
  • Human body key points can include 17 joint points, namely the nose, left and right eyes, left and right ears, left and right shoulders, left and right elbows, left and right wrists, left and right hips, left and right knees, and left and right ankles.
  • any human body key point detection algorithm may be used to perform human body key point detection on the first human body image (not limited here), or the first human body image may be input into a key point extraction model to obtain a key point feature map.
  • FIG. 3a is an example diagram of human body key point extraction. As shown in FIG. 3a, the left figure is the acquired first human body image including the target human body, and the right figure is the key point feature map. The relative positional relationship between the multiple key points can represent the posture information of the human body.
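The disclosure does not say how the key point feature map is encoded. As an illustrative sketch only, a common convention in pose-estimation pipelines is to render each joint as a Gaussian heatmap channel; the function name and the sigma value here are assumptions, not part of the patent:

```python
import numpy as np

def keypoints_to_heatmaps(keypoints, height, width, sigma=2.0):
    """Render (x, y) joint coordinates as one Gaussian heatmap per joint.

    keypoints: list of (x, y) pixel coordinates, e.g. 17 entries for the
    joints listed above. Returns an array of shape (num_joints, height, width).
    """
    ys, xs = np.mgrid[0:height, 0:width]
    maps = np.zeros((len(keypoints), height, width), dtype=np.float32)
    for i, (x, y) in enumerate(keypoints):
        # Peak of 1.0 at the joint location, falling off with distance.
        maps[i] = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
    return maps

# Toy example with two joints on a 64x48 image.
heatmaps = keypoints_to_heatmaps([(10, 20), (30, 40)], height=64, width=48)
```

Stacked this way, the relative positions of the channels' peaks carry the posture information referred to above.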
  • the portrait segmentation map can be understood as an image that separates the portrait from the background.
  • any portrait segmentation technology may be used to perform portrait segmentation (not limited here), or the first human body image may be input into a portrait segmentation model to obtain a related portrait segmentation map.
  • FIG. 3b is an example diagram of portrait segmentation. As shown in FIG. 3b, the left figure is the obtained first human body image including the target human body, and the right figure is the portrait segmentation map. It can be seen from Fig. 3b that the portrait segmentation map is an image that separates the portrait from the background.
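As a toy illustration of what the portrait segmentation map provides (the mask itself would come from a segmentation model; the function name and the flat-background convention are assumptions):

```python
import numpy as np

def apply_portrait_mask(image, mask, background=255):
    """Separate the portrait from the background: keep pixels where mask == 1
    and fill everything else with a flat background value.

    image: (H, W, 3) uint8 array; mask: (H, W) array of 0/1.
    """
    out = np.full_like(image, background)
    keep = mask.astype(bool)
    out[keep] = image[keep]
    return out

img = np.random.randint(0, 255, (4, 4, 3), dtype=np.uint8)
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1  # toy "portrait" region
segmented = apply_portrait_mask(img, mask)
```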
  • the human body part segmentation map can be understood as an image in which multiple parts of the human body are segmented, for example, segmented regions of the face, hair, arms, upper body, legs, and so on.
  • any human body part segmentation algorithm can be used to perform body part segmentation on the first human body image (not limited here), or the first human body image can be input into a human body part segmentation model to obtain a human body part segmentation map.
  • FIG. 3c is an example diagram of human body part segmentation. As shown in FIG. 3c, the left figure is the acquired first human body image including the target human body, and the right figure is the corresponding human body part segmentation map.
  • the posture information of the human body can be obtained through the key point feature map
  • the size information of the human body can be obtained through the portrait segmentation map
  • the area where the clothes are located can be obtained through the human body part segmentation map. Therefore, the pose of the clothing image can be adjusted according to the key point feature map, the size of the clothing image can be adjusted according to the portrait segmentation image, and the clothing image can be cropped according to the human body part segmentation image.
  • Through posture adjustment, size adjustment, and cropping of the tiled clothing image, a deformed clothing image can be obtained, which ensures that the deformed clothing image fits the current human body more closely.
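The cropping step can be sketched as follows, assuming the human body part segmentation map is an integer label image; the clothing label id used here is hypothetical:

```python
import numpy as np

def crop_to_region(image, part_map, clothing_label):
    """Crop an image to the bounding box of the pixels carrying
    `clothing_label` in a human body part segmentation map."""
    ys, xs = np.nonzero(part_map == clothing_label)
    if ys.size == 0:
        raise ValueError("label not present in segmentation map")
    return image[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

part_map = np.zeros((6, 6), dtype=np.int32)
part_map[2:5, 1:4] = 3               # label 3 = upper-body clothing (assumed id)
image = np.arange(36).reshape(6, 6)  # stand-in for a clothing image
crop = crop_to_region(image, part_map, clothing_label=3)
```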
  • In an embodiment, the following steps are also included: acquiring reference key point distribution information; and adjusting the key points of the first human body image based on the reference key point distribution information to obtain an adjusted first human body image.
  • the reference key point distribution information can be understood as the distribution information of multiple human body key points in the reference image.
  • The extracted key points are aligned with the reference key points, so as to adjust the size of the picture and the proportion of the portrait in the image.
  • FIG. 3d is an example diagram of adjusting the first human body image in this embodiment. As shown in Fig. 3d, the proportion of the human body and the picture size in image (1) do not match the reference image.
  • The key points of the human body in (1) are extracted to obtain image (2), and the key points in image (2) are then aligned with the reference key points to obtain the adjusted image (3).
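The alignment itself is not specified in the disclosure; one simple realization, offered here only as a sketch, is a least-squares fit of an isotropic scale and translation taking the extracted key points onto the reference key points:

```python
import numpy as np

def fit_scale_translation(src, ref):
    """Least-squares isotropic scale s and translation t with s * src + t ≈ ref.

    src, ref: (N, 2) arrays of corresponding key points.
    """
    src_c = src - src.mean(axis=0)
    ref_c = ref - ref.mean(axis=0)
    s = (src_c * ref_c).sum() / (src_c ** 2).sum()
    t = ref.mean(axis=0) - s * src.mean(axis=0)
    return s, t

src = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 4.0]])
ref = src * 1.5 + np.array([3.0, -1.0])  # known scale 1.5, shift (3, -1)
s, t = fit_scale_translation(src, ref)
```

Applying the recovered s and t to the whole first human body image then adjusts the picture size and the proportion of the portrait as described above.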
  • a manner of performing portrait segmentation and human body part segmentation on the first human body image respectively may be: performing portrait segmentation and human body part segmentation on the adjusted first human body image respectively.
  • Step 130 input the key point feature map, the portrait segmentation map, the human body part segmentation map and the first clothing image into the deformation model to obtain the deformed second clothing image.
  • the deformation model may be obtained by training a neural network based on the human body sample image and the clothing sample image.
  • the neural network may be a convolutional neural network or the like.
  • Fig. 4 is an example diagram of deforming the target clothing in this embodiment.
  • The process of inputting the key point feature map, the portrait segmentation map, the human body part segmentation map, and the first clothing image into the deformation model to obtain the deformed second clothing image can be as follows: the deformation model adjusts the posture of the first clothing image according to the key point feature map; adjusts the size of the posture-adjusted clothing image according to the portrait segmentation map; and crops the size-adjusted clothing image according to the clothing area in the human body part segmentation map to obtain the deformed second clothing image.
  • The posture adjustment, size adjustment, and cropping of the first clothing image can be performed sequentially to obtain the deformed second clothing image, which ensures that the deformed second clothing image fits the current human body more closely.
  • The training method of the deformation model is as follows: acquire a human body sample image and a clothing sample image, wherein the human body in the human body sample image wears the clothes in the clothing sample image; perform key point extraction, portrait segmentation, and human body part segmentation on the human body sample image to obtain a key point feature sample map, a portrait segmentation sample map, and a human body part segmentation sample map; input the key point feature sample map, the portrait segmentation sample map, the human body part segmentation sample map, and the clothing sample image into an initial model to obtain a first deformed clothing image; calculate a loss function according to the first deformed clothing image and the human body sample image; and train the initial model according to the loss function to obtain the deformation model.
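The loss function is not spelled out in the disclosure. Purely as an illustration, a masked L1 loss could compare the first deformed clothing image against the clothing pixels of the human body sample image, using the part segmentation sample map to select the clothing region (the label id is again hypothetical):

```python
import numpy as np

def masked_l1_loss(warped_clothing, human_sample, part_map, clothing_label):
    """Mean absolute difference restricted to the clothing region."""
    mask = (part_map == clothing_label)[..., None]  # broadcast over channels
    diff = np.abs(warped_clothing.astype(np.float32)
                  - human_sample.astype(np.float32))
    denom = max(int(mask.sum()) * diff.shape[-1], 1)
    return float((diff * mask).sum() / denom)

human = np.full((4, 4, 3), 100, dtype=np.uint8)  # sample wearing the clothes
warped = np.full((4, 4, 3), 90, dtype=np.uint8)  # deformed clothing estimate
parts = np.zeros((4, 4), dtype=np.int32)
parts[1:3, 1:3] = 3                              # assumed clothing label
loss = masked_l1_loss(warped, human, parts, clothing_label=3)
```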
  • The method of performing key point extraction, portrait segmentation, and human body part segmentation on the human body sample image can also be: input the human body sample image into the key point extraction model, the portrait segmentation model, and the human body part segmentation model respectively, to obtain the key point feature sample map, the portrait segmentation sample map, and the human body part segmentation sample map.
  • Step 140 input the second clothing image, the first human body image, key point feature map, portrait segmentation map and human body part segmentation map into the hybrid model to obtain a second human body image.
  • the target human body in the second human body image wears the target clothing.
  • The hybrid model can be obtained by training the generation model in a generative adversarial network (GAN) based on the human body sample image and the clothing sample image.
  • The second clothing image, the first human body image, the key point feature map, the portrait segmentation map, and the human body part segmentation map are input into the hybrid model to obtain the second human body image.
  • FIG. 5 is an example diagram of acquiring a human body image after changing clothing in an embodiment of the present disclosure.
  • The process of inputting the second clothing image, the first human body image, the key point feature map, the portrait segmentation map, and the human body part segmentation map into the hybrid model to obtain the second human body image can be as follows: the hybrid model fuses the second clothing image and the first human body image to obtain an initial image; the clothing posture in the initial image is optimized according to the key point feature map, the clothing size in the initial image is optimized according to the portrait segmentation map, and the clothing in the initial image is crop-optimized according to the human body part segmentation map to obtain the second human body image.
  • After fusing the second clothing image and the first human body image, the fit between the clothing and the human body in the initial image may be poor, so the initial image needs to be optimized.
  • The initial image is sequentially optimized in terms of posture, size, and cropping, so that the clothing and human body in the acquired second human body image are closer to the real effect.
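A crude, non-learned stand-in for the fusion step is a mask-based composite: paste the deformed clothing onto the body image wherever a clothing mask is set. The real hybrid model is a trained generator that also optimizes posture, size, and cropping; this sketch only illustrates the data flow:

```python
import numpy as np

def composite(warped_clothing, body_image, clothing_mask):
    """Select clothing pixels where the mask is 1, body pixels elsewhere."""
    mask = clothing_mask.astype(bool)[..., None]  # broadcast over channels
    return np.where(mask, warped_clothing, body_image)

body = np.zeros((4, 4, 3), dtype=np.uint8)
clothes = np.full((4, 4, 3), 200, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, :] = 1  # rows covered by the garment
fused = composite(clothes, body, mask)
```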
  • The training method of the hybrid model is as follows: input the key point feature sample map, the portrait segmentation sample map, the human body part segmentation sample map, and the clothing sample image into the deformation model to obtain a second deformed clothing image; input the second deformed clothing image, the human body sample image, the key point feature sample map, the portrait segmentation sample map, the human body part segmentation sample map, and the clothing sample image into the generation model to obtain a generated human body image; input the generated human body image into the discrimination model to obtain a discrimination result; and train the generation model according to the discrimination result to obtain the hybrid model.
  • the hybrid model is trained based on the deformation model.
  • The generation model is trained adversarially against the discrimination model, which can improve the accuracy of the final hybrid model.
  • In the embodiments of the present disclosure, a first human body image including the target human body and a first clothing image including the target clothing are acquired; key point extraction, portrait segmentation, and human body part segmentation are performed on the first human body image to obtain a key point feature map, a portrait segmentation map, and a human body part segmentation map; the key point feature map, the portrait segmentation map, the human body part segmentation map, and the first clothing image are input into the deformation model to obtain a deformed second clothing image; and the second clothing image, the first human body image, the key point feature map, the portrait segmentation map, and the human body part segmentation map are input into the hybrid model to obtain a second human body image, wherein the target human body in the second human body image wears the target clothing.
  • The image generation method provided by the embodiments of the present disclosure uses a deformation model to deform the target clothing in the first clothing image to obtain a deformed second clothing image, and uses a hybrid model to blend the deformed target clothing with the target human body to obtain a second human body image wearing the target clothing, which can improve the realism of the generated image.
  • Fig. 6 is a schematic structural diagram of an image generating device provided by an embodiment of the present disclosure. As shown in Figure 6, the device includes:
  • the human body image acquiring module 210 is configured to acquire the first human body image including the target human body and the first clothing image including the target clothing;
  • the segmentation map acquisition module 220 is configured to perform key point extraction, portrait segmentation, and human body part segmentation on the first human body image, to obtain a key point feature map, a portrait segmentation map, and a human body part segmentation map;
  • the second clothing image acquisition module 230 is configured to input the key point feature map, the portrait segmentation map, the human body part segmentation map and the first clothing image into the deformation model to obtain the deformed second clothing image ;
  • the second human body image acquisition module 240 is configured to input the second clothing image, the first human body image, the key point feature map, the portrait segmentation map, and the human body part segmentation map into a hybrid model to obtain a second human body image, wherein the target human body in the second human body image wears the target clothing.
  • segmentation map acquisition module 220 is also set to:
  • the first human body image is respectively input into the key point extraction model, the portrait segmentation model and the human body part segmentation model to obtain the key point feature map, the portrait segmentation map and the human body part segmentation map.
  • the second clothing image acquisition module 230 is also set to:
  • the deformation model adjusts the posture of the first clothing image according to the key point feature map, adjusts the size of the posture-adjusted clothing image according to the portrait segmentation map, and crops the size-adjusted clothing image according to the clothing area in the human body part segmentation map to obtain the deformed second clothing image.
  • the second human body image acquisition module 240 is also set to:
  • the hybrid model fuses the second clothing image and the first human body image to obtain an initial image, and optimizes the initial image according to the key point feature map, the portrait segmentation map, and the human body part segmentation map to obtain the second human body image.
  • the image generation device also includes: a first human body image adjustment module, configured to:
  • acquire reference key point distribution information, and adjust the key points of the first human body image based on the reference key point distribution information to obtain an adjusted first human body image.
  • segmentation map acquisition module 220 is also set to:
  • a portrait segmentation and a human body part segmentation are respectively performed on the adjusted first human body image.
  • the image generation device also includes: a deformation model training module, which is set to:
  • the initial model is trained according to the loss function to obtain a deformation model.
  • the image generation device also includes: a mixed model training module, which is set to:
  • the generation model is trained according to the discrimination result to obtain a hybrid model.
  • the clothing image is a clothing tile map.
  • the above-mentioned device can execute the methods provided by all the foregoing embodiments of the present disclosure, and has corresponding functional modules and advantageous effects for executing the above-mentioned methods.
  • FIG. 7 it shows a schematic structural diagram of an electronic device 300 suitable for implementing the embodiments of the present disclosure.
  • Electronic devices in embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (PDAs), tablet computers (PADs), portable multimedia players (PMPs), and vehicle-mounted terminals (such as car navigation terminals); fixed terminals such as digital TVs and desktop computers; and various forms of servers, such as independent servers or server clusters.
  • the electronic device shown in FIG. 7 is only an example, and should not limit the functions and application scope of the embodiments of the present disclosure.
  • an electronic device 300 may include a processing device (such as a central processing unit, a graphics processing unit, etc.) 301, which may execute various appropriate actions and processes according to a program stored in a read-only memory (ROM) 302 or a program loaded from a storage device 308 into a random access memory (RAM) 303. The RAM 303 also stores various programs and data necessary for the operation of the electronic device 300.
  • the processing device 301, ROM 302, and RAM 303 are connected to each other through a bus 304.
  • An input/output (I/O) interface 305 is also connected to the bus 304 .
  • the following devices can be connected to the I/O interface 305: an input device 306 including, for example, a touch screen, a touchpad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, etc.; an output device 307 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, etc.; a storage device 308 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 309.
  • the communication device 309 may allow the electronic device 300 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 7 shows an electronic device 300 having various devices, it should be understood that implementing or having all of the devices shown is not required; more or fewer devices may alternatively be implemented or provided.
  • embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart.
  • the computer program may be downloaded and installed from the network via the communication device 309, or installed from the storage device 308, or installed from the ROM 302.
  • the processing device 301 When the computer program is executed by the processing device 301, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are executed.
  • the above-mentioned computer-readable medium in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the above two.
  • a computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to, an electrical connection with one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave carrying computer-readable program code therein. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transport a program for use by or in conjunction with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium may be transmitted by any appropriate medium, including but not limited to wires, optical cables, RF (radio frequency), etc., or any suitable combination of the above.
  • the computer readable storage medium may be a non-transitory computer readable storage medium.
  • the client and the server can communicate using any currently known or future network protocol such as the Hypertext Transfer Protocol (HTTP), and can be interconnected with digital data communication in any form or medium (for example, a communication network).
  • Examples of communication networks include local area networks (LANs), wide area networks (WANs), internetworks (eg, the Internet), and peer-to-peer networks (eg, ad hoc peer-to-peer networks), as well as any currently known or future developed networks.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device, or may exist independently without being incorporated into the electronic device.
  • the above-mentioned computer-readable medium carries one or more programs, and when the above-mentioned one or more programs are executed by the electronic device, the electronic device: acquires the first human body image containing the target human body and the first clothing image containing the target clothing ; Carry out key point extraction, portrait segmentation and human body parts segmentation respectively to the first human body image, obtain key point feature map, portrait segmentation map and human body part segmentation map; Describe key point feature map, described portrait segmentation map, The human body part segmentation map and the first clothing image are input into a deformation model to obtain a deformed second clothing image; the second clothing image, the first human body image, the key point feature map, and the The portrait segmentation map and the human body part segmentation map are input into a hybrid model to obtain a second human body image; wherein, the target human body in the second human body image wears the target clothing.
  • Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, or combinations thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical function(s).
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments described in the present disclosure may be implemented by software or by hardware. The name of a unit does not, in some cases, constitute a limitation on the unit itself.
  • For example, without limitation, exemplary types of hardware logic components that may be used include Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device.
  • a machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • a machine-readable medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing.
  • machine-readable storage media include electrical connections based on one or more wires, portable computer disks, hard drives, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, compact disc read-only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.
  • the embodiments of the present disclosure disclose an image generation method, including:
  • the second clothing image, the first human body image, the key point feature map, the portrait segmentation map, and the human body part segmentation map are input into a hybrid model to obtain a second human body image, in which the target human body wears the target clothing.
  • key point extraction, portrait segmentation, and human body part segmentation are respectively performed on the first human body image to obtain a key point feature map, a portrait segmentation map, and a human body part segmentation map, including:
  • the first human body image is respectively input into the key point extraction model, the portrait segmentation model and the human body part segmentation model to obtain the key point feature map, the portrait segmentation map and the human body part segmentation map.
  • inputting the key point feature map, the portrait segmentation map, the human body part segmentation map, and the first clothing image into the deformation model to obtain the deformed second clothing image includes:
  • the deformation model adjusts the posture of the first clothing image according to the key point feature map;
  • the posture-adjusted clothing image is resized according to the portrait segmentation map;
  • the resized clothing image is cropped according to the clothing region in the human body part segmentation map to obtain the deformed second clothing image.
  • inputting the second clothing image, the first human body image, the key point feature map, the portrait segmentation map, and the human body part segmentation map into a hybrid model to obtain a second human body image includes:
  • the hybrid model fuses the second clothing image and the first human body image to obtain an initial image; the clothing posture in the initial image is optimized according to the key point feature map, the clothing size according to the portrait segmentation map, and the clothing cropping according to the human body part segmentation map, to obtain the second human body image.
  • after performing key point extraction on the first human body image and before portrait segmentation, the method further includes: acquiring reference key point distribution information;
  • the key points of the first human body image are adjusted based on the reference key point distribution information to obtain an adjusted first human body image.
  • performing portrait segmentation and human body part segmentation on the first human body image includes:
  • a portrait segmentation and a human body part segmentation are respectively performed on the adjusted first human body image.
  • the training method of the deformation model is:
  • the initial model is trained according to the loss function to obtain a deformation model.
  • the training method of the hybrid model is:
  • the generation model is trained according to the discrimination result to obtain a hybrid model.
  • the clothing image is a flat-lay clothing image (the clothing shown laid out flat).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Processing Or Creating Images (AREA)

Abstract

Embodiments of the present disclosure disclose an image generation method, apparatus, device, and storage medium, including: acquiring a first human body image containing a target human body and a first clothing image containing target clothing; performing key point extraction, portrait segmentation, and human body part segmentation on the first human body image to obtain a key point feature map, a portrait segmentation map, and a human body part segmentation map; inputting the key point feature map, the portrait segmentation map, the human body part segmentation map, and the first clothing image into a deformation model to obtain a deformed second clothing image; and inputting the second clothing image, the first human body image, the key point feature map, the portrait segmentation map, and the human body part segmentation map into a hybrid model to obtain a second human body image, wherein the target human body in the second human body image wears the target clothing.

Description

Image generation method, apparatus, device, and storage medium
This application claims priority to Chinese Patent Application No. 202111151607.6, filed with the China National Intellectual Property Administration on September 29, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
Embodiments of the present disclosure relate to the technical field of image processing, for example, to an image generation method, apparatus, device, and storage medium.
Background
With the development of technology, more and more applications have entered users' lives and gradually enriched their leisure time, short-video apps being one example. Users can record their lives in videos, photos, and the like, and upload them to a short-video app.
Short-video apps offer many special-effect features based on image algorithms and rendering technology. Among them, virtual dressing applies image fusion technology to fuse a user's human body image with a clothing image containing target clothing, producing an image of the user wearing the target clothing, so that the user can see how the target clothing looks without actually trying it on.
At present, virtual dressing typically applies an image fusion model that extracts features from the human body image and the clothing image separately and generates a new image, i.e., an image of the user wearing the target clothing, from the two extracted sets of features. However, because the image fusion model extracts only coarse image features, the newly generated image tends to lack detail, which distorts the generation result and degrades the virtual dressing effect.
Summary
Embodiments of the present disclosure provide an image generation method, apparatus, device, and storage medium that can improve the realism of generated images.
In a first aspect, an embodiment of the present disclosure provides an image generation method, including:
acquiring a first human body image containing a target human body and a first clothing image containing target clothing;
performing key point extraction, portrait segmentation, and human body part segmentation on the first human body image to obtain a key point feature map, a portrait segmentation map, and a human body part segmentation map;
inputting the key point feature map, the portrait segmentation map, the human body part segmentation map, and the first clothing image into a deformation model to obtain a deformed second clothing image; and
inputting the second clothing image, the first human body image, the key point feature map, the portrait segmentation map, and the human body part segmentation map into a hybrid model to obtain a second human body image, wherein the target human body in the second human body image wears the target clothing.
In a second aspect, an embodiment of the present disclosure further provides an image generation apparatus, including:
a human body image acquisition module configured to acquire a first human body image containing a target human body and a first clothing image containing target clothing;
a segmentation map acquisition module configured to perform key point extraction, portrait segmentation, and human body part segmentation on the first human body image to obtain a key point feature map, a portrait segmentation map, and a human body part segmentation map;
a second clothing image acquisition module configured to input the key point feature map, the portrait segmentation map, the human body part segmentation map, and the first clothing image into a deformation model to obtain a deformed second clothing image; and
a second human body image acquisition module configured to input the second clothing image, the first human body image, the key point feature map, the portrait segmentation map, and the human body part segmentation map into a hybrid model to obtain a second human body image, wherein the target human body in the second human body image wears the target clothing.
In a third aspect, an embodiment of the present disclosure further provides an electronic device, including:
one or more processing apparatuses; and
a storage apparatus configured to store one or more programs,
wherein the one or more programs, when executed by the one or more processing apparatuses, cause the one or more processing apparatuses to implement the image generation method described in the embodiments of the present disclosure.
In a fourth aspect, an embodiment of the present disclosure further provides a computer-readable medium storing a computer program that, when executed by a processing apparatus, implements the image generation method described in the embodiments of the present disclosure.
Brief Description of the Drawings
Fig. 1 is a flowchart of an image generation method in an embodiment of the present disclosure;
Fig. 2 is a schematic diagram of a human body image and a clothing image in an embodiment of the present disclosure;
Fig. 3a is an example diagram of human body key point extraction in an embodiment of the present disclosure;
Fig. 3b is an example diagram of portrait segmentation in an embodiment of the present disclosure;
Fig. 3c is an example diagram of human body part segmentation in an embodiment of the present disclosure;
Fig. 3d is an example diagram of adjusting the first human body image in an embodiment of the present disclosure;
Fig. 4 is an example diagram of deforming the target clothing in an embodiment of the present disclosure;
Fig. 5 is an example diagram of obtaining the dressed human body image in an embodiment of the present disclosure;
Fig. 6 is a schematic structural diagram of an image generation apparatus in an embodiment of the present disclosure;
Fig. 7 is a schematic structural diagram of an electronic device in an embodiment of the present disclosure.
Detailed Description
It should be understood that the steps recorded in the method embodiments of the present disclosure may be executed in different orders and/or in parallel. Furthermore, method embodiments may include additional steps and/or omit steps shown. The scope of the present disclosure is not limited in this respect.
The term "include" and its variants as used herein are open-ended, i.e., "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions of other terms will be given in the following description.
Note that the concepts "first", "second", etc. mentioned in the present disclosure are only used to distinguish different apparatuses, modules, or units, and are not used to limit the order or interdependence of the functions performed by these apparatuses, modules, or units.
Note that the modifiers "a/an" and "multiple" mentioned in the present disclosure are illustrative rather than restrictive; those skilled in the art should understand that, unless the context clearly indicates otherwise, they should be understood as "one or more".
The names of the messages or information exchanged between multiple apparatuses in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of those messages or information.
Fig. 1 is a flowchart of an image generation method provided by an embodiment of the present disclosure. This embodiment is applicable to changing the clothing of a target person in a human body image. The method may be executed by an image generation apparatus, which may be composed of hardware and/or software and may generally be integrated into a device with an image generation function; the device may be an electronic device such as a server, a mobile terminal, or a server cluster. As shown in Fig. 1, the method includes the following steps:
Step 110: acquire a first human body image containing a target human body and a first clothing image containing target clothing.
The target human body may be a person shown in a certain pose, and the target clothing may be clothing shown as a flat-lay image. Illustratively, Fig. 2 is a schematic diagram of a human body image and a clothing image. As shown in Fig. 2, the left side is the first clothing image containing the target clothing, and the right side is the first human body image containing the target human body. In Fig. 2, the target clothing is shown laid flat.
Step 120: perform key point extraction, portrait segmentation, and human body part segmentation on the first human body image to obtain a key point feature map, a portrait segmentation map, and a human body part segmentation map.
Human body key point extraction can be understood as human pose estimation. The human body key points may include 17 joints: the nose, left and right eyes, left and right ears, left and right shoulders, left and right elbows, left and right wrists, left and right hips, left and right knees, and left and right ankles. In this embodiment, any human body key point detection algorithm may be applied to the first human body image (not limited here), or the first human body image may be input into a key point extraction model to obtain the key point feature map. Illustratively, Fig. 3a is an example of human body key point extraction; as shown in Fig. 3a, the left image is the acquired first human body image containing the target human body, and the right image is the key point feature map. The relative positions of the key points characterize the pose of the human body.
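The 17 joints listed above match the widely used COCO keypoint convention. As a minimal illustrative sketch (the joint names and ordering are an assumption based on that convention, not taken from the patent), a detected pose can be packed into a fixed-order coordinate vector whose relative joint positions encode the body's posture:

```python
# Sketch: representing the 17 human-body key points described above.
# Joint names follow the COCO convention; the coordinates are illustrative.

COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

def pose_vector(keypoints):
    """Flatten a {name: (x, y)} dict into a fixed-order coordinate vector.

    The relative positions between joints encode the body pose, which is
    what the deformation model consumes downstream.
    """
    vec = []
    for name in COCO_KEYPOINTS:
        x, y = keypoints.get(name, (0.0, 0.0))  # missing joints map to the origin
        vec.extend([x, y])
    return vec

example = {"nose": (0.5, 0.1), "left_shoulder": (0.4, 0.3)}
v = pose_vector(example)
# v has 34 entries (17 joints x 2 coordinates)
```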
The portrait segmentation map can be understood as an image in which the portrait is separated from the background. In this embodiment, any portrait segmentation technique may be used (not limited here), or the first human body image may be input into a portrait segmentation model to obtain the portrait segmentation map. Illustratively, Fig. 3b is an example of portrait segmentation; as shown in Fig. 3b, the left image is the acquired first human body image containing the target human body, and the right image is the portrait segmentation map. As can be seen from Fig. 3b, the portrait segmentation map separates the portrait from the background.
The human body part segmentation map can be understood as an image in which multiple body parts are separated from each other, for example an image separating the face, hair, arms, upper body, legs, and so on. In this embodiment, any human body part segmentation algorithm may be applied to the first human body image (not limited here), or the first human body image may be input into a human body part segmentation model to obtain the human body part segmentation map. Illustratively, Fig. 3c is an example of human body part segmentation; as shown in Fig. 3c, the left image is the acquired first human body image containing the target human body, and the right image is the corresponding human body part segmentation map.
In this embodiment, the pose of the human body can be obtained from the key point feature map, the size of the human body from the portrait segmentation map, and the region where the clothing is located from the human body part segmentation map. The clothing image can therefore be pose-adjusted according to the key point feature map, resized according to the portrait segmentation map, and cropped according to the human body part segmentation map. After the flat-lay clothing image has been pose-adjusted, resized, and cropped, a deformed clothing image is obtained, which ensures that the deformed clothing fits the current human body better.
For example, after key point extraction and before portrait segmentation of the first human body image, the method further includes the following steps: acquiring reference key point distribution information; and adjusting the key points of the first human body image based on the reference key point distribution information to obtain an adjusted first human body image.
The reference key point distribution information can be understood as the distribution of multiple human body key points in a reference image. In this embodiment, after key point extraction, the extracted key points are aligned with the reference key points, thereby adjusting the image size and the proportion of the portrait within the image. Illustratively, Fig. 3d is an example of adjusting the first human body image in this embodiment. Referring to Fig. 3d, in image (1) the proportion of the human body in the image and the image size do not match the reference image; key point extraction is performed on the body in (1) to obtain image (2), and the key points in (2) are then aligned with the reference key points to obtain the adjusted image (3).
Portrait segmentation and human body part segmentation of the first human body image may then be performed on the adjusted first human body image.
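A minimal sketch of the alignment step above: fitting a scale-and-translation transform that maps the detected key points onto the reference key points in the least-squares sense. The closed-form fit below is a common choice and an assumption on my part; the patent does not specify the alignment procedure:

```python
def fit_scale_translation(src, dst):
    """Least-squares scale + translation mapping src points onto dst points.

    src/dst are equal-length lists of (x, y) pairs, e.g. detected key points
    and the reference key points they should be aligned with.
    """
    n = len(src)
    scx = sum(p[0] for p in src) / n
    scy = sum(p[1] for p in src) / n
    dcx = sum(q[0] for q in dst) / n
    dcy = sum(q[1] for q in dst) / n
    # Closed-form 1-D least squares on the centred coordinates.
    num = sum((p[0] - scx) * (q[0] - dcx) + (p[1] - scy) * (q[1] - dcy)
              for p, q in zip(src, dst))
    den = sum((p[0] - scx) ** 2 + (p[1] - scy) ** 2 for p in src)
    s = num / den
    return s, dcx - s * scx, dcy - s * scy

def apply_transform(points, s, tx, ty):
    """Apply the fitted scale and translation to a list of (x, y) points."""
    return [(s * x + tx, s * y + ty) for x, y in points]

src = [(0, 0), (2, 0), (0, 2)]
dst = [(1, 1), (2, 1), (1, 2)]  # src scaled by 0.5 and shifted by (1, 1)
s, tx, ty = fit_scale_translation(src, dst)
# s == 0.5 and (tx, ty) == (1.0, 1.0), so the detected points land on the reference
```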
Step 130: input the key point feature map, the portrait segmentation map, the human body part segmentation map, and the first clothing image into the deformation model to obtain the deformed second clothing image.
The deformation model may be obtained by training a given neural network, such as a convolutional neural network, on human body sample images and clothing sample images.
For example, after the key point feature map, the portrait segmentation map, the human body part segmentation map, and the first clothing image are obtained, they are input into the deformation model to obtain the deformed second clothing image. Illustratively, Fig. 4 is an example of deforming the target clothing in this embodiment.
For example, the process of inputting the key point feature map, the portrait segmentation map, the human body part segmentation map, and the first clothing image into the deformation model to obtain the deformed second clothing image may be: the deformation model adjusts the pose of the first clothing image according to the key point feature map; resizes the pose-adjusted clothing image according to the portrait segmentation map; and crops the resized clothing image according to the clothing region in the human body part segmentation map to obtain the deformed second clothing image.
After the first clothing image has been pose-adjusted, resized, and cropped in turn according to the key point feature map, the portrait segmentation map, and the human body part segmentation map, the deformed second clothing image is obtained, which ensures that the deformed second clothing fits the current human body better.
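The last two of the three stages above (resizing and cropping) can be sketched on toy 2-D grids; the pose-adjustment stage depends on the learned model and is omitted. The nearest-neighbour resize and mask crop below are illustrative stand-ins, not the patent's implementation:

```python
def resize_nearest(img, new_h, new_w):
    """Nearest-neighbour resize of a 2-D grid (stand-in for the size
    adjustment driven by the portrait segmentation map)."""
    h, w = len(img), len(img[0])
    return [[img[i * h // new_h][j * w // new_w] for j in range(new_w)]
            for i in range(new_h)]

def crop_to_region(img, region_mask):
    """Zero out pixels outside the clothing region of the part-segmentation
    map, keeping only the clothing that should remain visible."""
    return [[p if m else 0 for p, m in zip(row, mrow)]
            for row, mrow in zip(img, region_mask)]

cloth = [[1, 2], [3, 4]]                 # tiny stand-in for a clothing image
resized = resize_nearest(cloth, 4, 4)    # scale up to the portrait's extent
mask = [[1, 1, 0, 0]] * 4                # clothing region covers the left half
cropped = crop_to_region(resized, mask)
# cropped keeps only the left half of the resized clothing
```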
In this embodiment, the deformation model is trained as follows: acquire human body sample images and clothing sample images, where the human body in a human body sample image wears the clothing in the corresponding clothing sample image; perform key point extraction, portrait segmentation, and human body part segmentation on the human body sample image to obtain a key point feature sample map, a portrait segmentation sample map, and a human body part segmentation sample map; input the key point feature sample map, the portrait segmentation sample map, the human body part segmentation sample map, and the clothing sample image into an initial model to obtain a first deformed clothing map; compute a loss function from the first deformed clothing map and the human body sample image; and train the initial model according to the loss function to obtain the deformation model.
Key point extraction, portrait segmentation, and human body part segmentation of the human body sample image may likewise be performed by inputting the human body sample image into the key point extraction model, the portrait segmentation model, and the human body part segmentation model, respectively, to obtain the key point feature sample map, the portrait segmentation sample map, and the human body part segmentation sample map.
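The loss above is computed from the first deformed clothing map and the human body sample image; the patent does not name a specific loss, so a per-pixel L1 reconstruction loss against the clothing actually worn in the sample is assumed in this sketch:

```python
def l1_loss(pred, target):
    """Mean absolute difference between two equally sized 2-D grids, e.g.
    the deformed clothing map vs. the clothing region actually worn in
    the human body sample image."""
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    assert len(flat_pred) == len(flat_target)
    return sum(abs(a - b) for a, b in zip(flat_pred, flat_target)) / len(flat_pred)

warped = [[0.0, 0.5], [1.0, 1.0]]        # model's deformed clothing (toy values)
ground_truth = [[0.0, 1.0], [1.0, 0.0]]  # clothing region of the body sample
loss = l1_loss(warped, ground_truth)
# loss == (0 + 0.5 + 0 + 1.0) / 4 == 0.375
```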
Step 140: input the second clothing image, the first human body image, the key point feature map, the portrait segmentation map, and the human body part segmentation map into the hybrid model to obtain a second human body image.
In the second human body image, the target human body wears the target clothing. The hybrid model may be obtained by training the generative model of a generative adversarial network on human body sample images and clothing sample images. For example, the second clothing image, the key point feature map, the portrait segmentation map, and the human body part segmentation map are input into the hybrid model to obtain the second human body image. Illustratively, Fig. 5 is an example of obtaining the dressed human body image in an embodiment of the present disclosure.
For example, the process of inputting the second clothing image, the first human body image, the key point feature map, the portrait segmentation map, and the human body part segmentation map into the hybrid model to obtain the second human body image may be: the hybrid model fuses the second clothing image and the first human body image to obtain an initial image; the clothing pose in the initial image is optimized according to the key point feature map, the clothing size is optimized according to the portrait segmentation map, and the clothing is optimally cropped according to the human body part segmentation map, to obtain the second human body image.
In this embodiment, the clothing and the human body fit together poorly in the initial image obtained by fusing the second clothing image and the first human body image, so the initial image needs to be optimized. After the pose, size, and cropping of the clothing in the initial image are optimized in turn according to the key point feature map, the portrait segmentation map, and the human body part segmentation map, the clothing and the human body in the resulting second human body image fit together better and look closer to a real photograph.
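The fuse-then-refine idea can be illustrated with the crudest possible fusion: pasting the deformed clothing onto the body image inside a clothing mask. The learned hybrid model replaces this paste with a generative blend plus the three refinements described above; the code below is only a stand-in for the fusion step:

```python
def naive_fuse(body, clothing, clothing_mask):
    """Paste clothing pixels over the body image wherever the mask is set.

    This yields the poorly fitting "initial image" described above; the
    hybrid model then refines pose, size, and cropping.
    """
    return [[c if m else b for b, c, m in zip(brow, crow, mrow)]
            for brow, crow, mrow in zip(body, clothing, clothing_mask)]

body = [[7, 7], [7, 7]]    # toy body image
cloth = [[1, 2], [3, 4]]   # toy deformed clothing image
mask = [[1, 0], [0, 1]]    # where the clothing should land
initial = naive_fuse(body, cloth, mask)
# initial == [[1, 7], [7, 4]]
```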
In this embodiment, the hybrid model is trained as follows: input the key point feature sample map, the portrait segmentation sample map, the human body part segmentation sample map, and the clothing sample image into the deformation model to obtain a second deformed clothing map; input the second deformed clothing map, the human body sample image, the key point feature sample map, the portrait segmentation sample map, the human body part segmentation sample map, and the clothing sample image into a generative model to obtain a generated human body image; input the generated human body image into a discriminative model to obtain a discrimination result; and train the generative model according to the discrimination result to obtain the hybrid model.
The hybrid model is thus trained on top of the deformation model. For example, the generative model and the discriminative model undergo adversarial training, which can improve the accuracy of the final hybrid model.
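The adversarial protocol above (generate a try-on image, score it with the discriminator, update the generator from the score) can be sketched structurally with trivial one-dimensional stand-ins for both models; this is a toy illustration of the training loop, not the patent's networks:

```python
def train_generator_adversarially(gen_param, discriminate, grad_step, rounds=50):
    """Alternating scheme from the description: generate a sample, score it
    with the discriminator, then update the generator from the score."""
    for _ in range(rounds):
        sample = gen_param               # toy generator: outputs its parameter
        score = discriminate(sample)     # realism score in (0, 1]
        gen_param = grad_step(gen_param, score)
    return gen_param

# Toy setup: "real" samples sit at 2.0, and the discriminator scores
# proximity to them. The update direction (towards 2.0) is hard-coded in
# this stand-in; a real GAN obtains it by backpropagating through the
# discriminator.
discriminate = lambda x: 1.0 / (1.0 + abs(x - 2.0))
grad_step = lambda p, score: p + (1.0 - score) * (2.0 - p)
trained = train_generator_adversarially(0.0, discriminate, grad_step)
# trained approaches 2.0, where the discriminator can no longer separate
# generated samples from real ones
```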
In the technical solution of this embodiment, a first human body image containing a target human body and a first clothing image containing target clothing are acquired; key point extraction, portrait segmentation, and human body part segmentation are performed on the first human body image to obtain a key point feature map, a portrait segmentation map, and a human body part segmentation map; the key point feature map, the portrait segmentation map, the human body part segmentation map, and the first clothing image are input into a deformation model to obtain a deformed second clothing image; and the second clothing image, the first human body image, the key point feature map, the portrait segmentation map, and the human body part segmentation map are input into a hybrid model to obtain a second human body image, in which the target human body wears the target clothing. In the image generation method provided by the embodiments of the present disclosure, the target clothing in the first clothing image is deformed by the deformation model to obtain the deformed second clothing image, and the deformed target clothing is blended with the target human body by the hybrid model to obtain the second human body image of the target human body wearing the target clothing, which can improve the realism of the generated image.
Fig. 6 is a schematic structural diagram of an image generation apparatus provided by an embodiment of the present disclosure. As shown in Fig. 6, the apparatus includes:
a human body image acquisition module 210 configured to acquire a first human body image containing a target human body and a first clothing image containing target clothing;
a segmentation map acquisition module 220 configured to perform key point extraction, portrait segmentation, and human body part segmentation on the first human body image to obtain a key point feature map, a portrait segmentation map, and a human body part segmentation map;
a second clothing image acquisition module 230 configured to input the key point feature map, the portrait segmentation map, the human body part segmentation map, and the first clothing image into a deformation model to obtain a deformed second clothing image; and
a second human body image acquisition module 240 configured to input the second clothing image, the first human body image, the key point feature map, the portrait segmentation map, and the human body part segmentation map into a hybrid model to obtain a second human body image, wherein the target human body in the second human body image wears the target clothing.
For example, the segmentation map acquisition module 220 is further configured to:
input the first human body image into a key point extraction model, a portrait segmentation model, and a human body part segmentation model, respectively, to obtain the key point feature map, the portrait segmentation map, and the human body part segmentation map.
For example, the second clothing image acquisition module 230 is further configured to:
have the deformation model adjust the pose of the first clothing image according to the key point feature map;
resize the pose-adjusted clothing image according to the portrait segmentation map; and
crop the resized clothing image according to the clothing region in the human body part segmentation map to obtain the deformed second clothing image.
For example, the second human body image acquisition module 240 is further configured to:
have the hybrid model fuse the second clothing image and the first human body image to obtain an initial image; and
optimize the clothing pose in the initial image according to the key point feature map, optimize the clothing size in the initial image according to the portrait segmentation map, and optimally crop the clothing in the initial image according to the human body part segmentation map, to obtain the second human body image.
For example, the image generation apparatus further includes a first human body image adjustment module configured to:
acquire reference key point distribution information; and
adjust the key points of the first human body image based on the reference key point distribution information to obtain an adjusted first human body image.
For example, the segmentation map acquisition module 220 is further configured to:
perform portrait segmentation and human body part segmentation on the adjusted first human body image.
For example, the image generation apparatus further includes a deformation model training module configured to:
acquire human body sample images and clothing sample images, wherein the human body in a human body sample image wears the clothing in the corresponding clothing sample image;
perform key point extraction, portrait segmentation, and human body part segmentation on the human body sample image to obtain a key point feature sample map, a portrait segmentation sample map, and a human body part segmentation sample map;
input the key point feature sample map, the portrait segmentation sample map, the human body part segmentation sample map, and the clothing sample image into an initial model to obtain a first deformed clothing map;
compute a loss function from the first deformed clothing map and the human body sample image; and
train the initial model according to the loss function to obtain the deformation model.
For example, the image generation apparatus further includes a hybrid model training module configured to:
input the key point feature sample map, the portrait segmentation sample map, the human body part segmentation sample map, and the clothing sample image into the deformation model to obtain a second deformed clothing map;
input the second deformed clothing map, the human body sample image, the key point feature sample map, the portrait segmentation sample map, the human body part segmentation sample map, and the clothing sample image into a generative model to obtain a generated human body image;
input the generated human body image into a discriminative model to obtain a discrimination result; and
train the generative model according to the discrimination result to obtain the hybrid model.
For example, the clothing image is a flat-lay clothing image.
The above apparatus can execute the methods provided by all the foregoing embodiments of the present disclosure and has the corresponding functional modules and beneficial effects for executing those methods. For technical details not described exhaustively in this embodiment, refer to the methods provided by all the foregoing embodiments of the present disclosure.
Referring now to Fig. 7, which shows a schematic structural diagram of an electronic device 300 suitable for implementing the embodiments of the present disclosure. The electronic device in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, laptop computers, digital broadcast receivers, personal digital assistants (PDAs), tablet computers (PADs), portable multimedia players (PMPs), and vehicle-mounted terminals (e.g., vehicle navigation terminals), fixed terminals such as digital TVs and desktop computers, or various forms of servers, such as stand-alone servers or server clusters. The electronic device shown in Fig. 7 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in Fig. 7, the electronic device 300 may include a processing apparatus (e.g., a central processing unit, a graphics processing unit, etc.) 301, which may perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 302 or a program loaded from a storage apparatus 308 into a random access memory (RAM) 303. The RAM 303 also stores various programs and data required for the operation of the electronic device 300. The processing apparatus 301, the ROM 302, and the RAM 303 are connected to each other through a bus 304. An input/output (I/O) interface 305 is also connected to the bus 304.
Generally, the following apparatuses may be connected to the I/O interface 305: input apparatuses 306 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, and gyroscope; output apparatuses 307 including, for example, a liquid crystal display (LCD), speaker, and vibrator; storage apparatuses 308 including, for example, a magnetic tape and hard disk; and a communication apparatus 309. The communication apparatus 309 may allow the electronic device 300 to communicate wirelessly or by wire with other devices to exchange data. Although Fig. 7 shows the electronic device 300 with various apparatuses, it should be understood that it is not required to implement or have all of the apparatuses shown; more or fewer apparatuses may alternatively be implemented or provided.
According to an embodiment of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the image generation method. In such an embodiment, the computer program may be downloaded and installed from a network through the communication apparatus 309, or installed from the storage apparatus 308, or installed from the ROM 302. When the computer program is executed by the processing apparatus 301, the above functions defined in the method of the embodiments of the present disclosure are executed.
It should be noted that the computer-readable medium described above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of a computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in combination with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; it may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted by any appropriate medium, including but not limited to a wire, an optical cable, RF (radio frequency), or any suitable combination of the above. The computer-readable storage medium may be a non-transitory computer-readable storage medium.
In some implementations, the client and the server may communicate using any currently known or future-developed network protocol such as HyperText Transfer Protocol (HTTP), and may be interconnected with digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed network.
The computer-readable medium may be included in the above electronic device, or it may exist separately without being assembled into the electronic device.
The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire a first human body image containing a target human body and a first clothing image containing target clothing; perform key point extraction, portrait segmentation, and human body part segmentation on the first human body image to obtain a key point feature map, a portrait segmentation map, and a human body part segmentation map; input the key point feature map, the portrait segmentation map, the human body part segmentation map, and the first clothing image into a deformation model to obtain a deformed second clothing image; and input the second clothing image, the first human body image, the key point feature map, the portrait segmentation map, and the human body part segmentation map into a hybrid model to obtain a second human body image, wherein the target human body in the second human body image wears the target clothing.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages or combinations thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer; partly on the user's computer; as a stand-alone software package; partly on the user's computer and partly on a remote computer; or entirely on the remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units involved in the embodiments described in the present disclosure may be implemented by software or by hardware. The name of a unit does not, in some circumstances, constitute a limitation of the unit itself.
The functions described herein above may be executed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chips (SOCs), and complex programmable logic devices (CPLDs).
In the context of the present disclosure, a machine-readable medium may be a tangible medium that may contain or store a program for use by or in combination with an instruction execution system, apparatus, or device. A machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present disclosure, the embodiments of the present disclosure disclose an image generation method, including:
acquiring a first human body image containing a target human body and a first clothing image containing target clothing;
performing key point extraction, portrait segmentation, and human body part segmentation on the first human body image to obtain a key point feature map, a portrait segmentation map, and a human body part segmentation map;
inputting the key point feature map, the portrait segmentation map, the human body part segmentation map, and the first clothing image into a deformation model to obtain a deformed second clothing image; and
inputting the second clothing image, the first human body image, the key point feature map, the portrait segmentation map, and the human body part segmentation map into a hybrid model to obtain a second human body image, wherein the target human body in the second human body image wears the target clothing.
For example, performing key point extraction, portrait segmentation, and human body part segmentation on the first human body image to obtain a key point feature map, a portrait segmentation map, and a human body part segmentation map includes:
inputting the first human body image into a key point extraction model, a portrait segmentation model, and a human body part segmentation model, respectively, to obtain the key point feature map, the portrait segmentation map, and the human body part segmentation map.
For example, inputting the key point feature map, the portrait segmentation map, the human body part segmentation map, and the first clothing image into the deformation model to obtain the deformed second clothing image includes:
the deformation model adjusting the pose of the first clothing image according to the key point feature map;
resizing the pose-adjusted clothing image according to the portrait segmentation map; and
cropping the resized clothing image according to the clothing region in the human body part segmentation map to obtain the deformed second clothing image.
For example, inputting the second clothing image, the first human body image, the key point feature map, the portrait segmentation map, and the human body part segmentation map into the hybrid model to obtain the second human body image includes:
the hybrid model fusing the second clothing image and the first human body image to obtain an initial image; and
optimizing the clothing pose in the initial image according to the key point feature map, optimizing the clothing size in the initial image according to the portrait segmentation map, and optimally cropping the clothing in the initial image according to the human body part segmentation map to obtain the second human body image.
For example, after performing key point extraction on the first human body image and before portrait segmentation, the method further includes:
acquiring reference key point distribution information; and
adjusting the key points of the first human body image based on the reference key point distribution information to obtain an adjusted first human body image.
For example, performing portrait segmentation and human body part segmentation on the first human body image includes:
performing portrait segmentation and human body part segmentation on the adjusted first human body image.
For example, the deformation model is trained as follows:
acquiring human body sample images and clothing sample images, wherein the human body in a human body sample image wears the clothing in the corresponding clothing sample image;
performing key point extraction, portrait segmentation, and human body part segmentation on the human body sample image to obtain a key point feature sample map, a portrait segmentation sample map, and a human body part segmentation sample map;
inputting the key point feature sample map, the portrait segmentation sample map, the human body part segmentation sample map, and the clothing sample image into an initial model to obtain a first deformed clothing map;
computing a loss function from the first deformed clothing map and the human body sample image; and
training the initial model according to the loss function to obtain the deformation model.
For example, the hybrid model is trained as follows:
inputting the key point feature sample map, the portrait segmentation sample map, the human body part segmentation sample map, and the clothing sample image into the deformation model to obtain a second deformed clothing map;
inputting the second deformed clothing map, the human body sample image, the key point feature sample map, the portrait segmentation sample map, the human body part segmentation sample map, and the clothing sample image into a generative model to obtain a generated human body image;
inputting the generated human body image into a discriminative model to obtain a discrimination result; and
training the generative model according to the discrimination result to obtain the hybrid model.
For example, the clothing image is a flat-lay clothing image.

Claims (12)

  1. An image generation method, comprising:
    acquiring a first human body image containing a target human body and a first clothing image containing target clothing;
    performing key point extraction, portrait segmentation, and human body part segmentation on the first human body image to obtain a key point feature map, a portrait segmentation map, and a human body part segmentation map;
    inputting the key point feature map, the portrait segmentation map, the human body part segmentation map, and the first clothing image into a deformation model to obtain a deformed second clothing image; and
    inputting the second clothing image, the first human body image, the key point feature map, the portrait segmentation map, and the human body part segmentation map into a hybrid model to obtain a second human body image, wherein the target human body in the second human body image wears the target clothing.
  2. The method according to claim 1, wherein performing key point extraction, portrait segmentation, and human body part segmentation on the first human body image to obtain a key point feature map, a portrait segmentation map, and a human body part segmentation map comprises:
    inputting the first human body image into a key point extraction model, a portrait segmentation model, and a human body part segmentation model, respectively, to obtain the key point feature map, the portrait segmentation map, and the human body part segmentation map.
  3. The method according to claim 1, wherein inputting the key point feature map, the portrait segmentation map, the human body part segmentation map, and the first clothing image into the deformation model to obtain the deformed second clothing image comprises:
    the deformation model adjusting the pose of the first clothing image according to the key point feature map;
    resizing the pose-adjusted clothing image according to the portrait segmentation map; and
    cropping the resized clothing image according to the clothing region in the human body part segmentation map to obtain the deformed second clothing image.
  4. The method according to claim 1, wherein inputting the second clothing image, the first human body image, the key point feature map, the portrait segmentation map, and the human body part segmentation map into the hybrid model to obtain the second human body image comprises:
    the hybrid model fusing the second clothing image and the first human body image to obtain an initial image; and
    optimizing the clothing pose in the initial image according to the key point feature map, optimizing the clothing size in the initial image according to the portrait segmentation map, and optimally cropping the clothing in the initial image according to the human body part segmentation map to obtain the second human body image.
  5. The method according to claim 1, further comprising, after performing key point extraction on the first human body image and before portrait segmentation:
    acquiring reference key point distribution information; and
    adjusting the key points of the first human body image based on the reference key point distribution information to obtain an adjusted first human body image.
  6. The method according to claim 5, wherein performing portrait segmentation and human body part segmentation on the first human body image comprises:
    performing portrait segmentation and human body part segmentation on the adjusted first human body image.
  7. The method according to claim 1, wherein the deformation model is trained by:
    acquiring human body sample images and clothing sample images, wherein the human body in a human body sample image wears the clothing in the corresponding clothing sample image;
    performing key point extraction, portrait segmentation, and human body part segmentation on the human body sample image to obtain a key point feature sample map, a portrait segmentation sample map, and a human body part segmentation sample map;
    inputting the key point feature sample map, the portrait segmentation sample map, the human body part segmentation sample map, and the clothing sample image into an initial model to obtain a first deformed clothing map;
    computing a loss function from the first deformed clothing map and the human body sample image; and
    training the initial model according to the loss function to obtain the deformation model.
  8. The method according to claim 7, wherein the hybrid model is trained by:
    inputting the key point feature sample map, the portrait segmentation sample map, the human body part segmentation sample map, and the clothing sample image into the deformation model to obtain a second deformed clothing map;
    inputting the second deformed clothing map, the human body sample image, the key point feature sample map, the portrait segmentation sample map, the human body part segmentation sample map, and the clothing sample image into a generative model to obtain a generated human body image;
    inputting the generated human body image into a discriminative model to obtain a discrimination result; and
    training the generative model according to the discrimination result to obtain the hybrid model.
  9. The method according to any one of claims 1-8, wherein the clothing image is a flat-lay clothing image.
  10. An image generation apparatus, comprising:
    a human body image acquisition module configured to acquire a first human body image containing a target human body and a first clothing image containing target clothing;
    a segmentation map acquisition module configured to perform key point extraction, portrait segmentation, and human body part segmentation on the first human body image to obtain a key point feature map, a portrait segmentation map, and a human body part segmentation map;
    a second clothing image acquisition module configured to input the key point feature map, the portrait segmentation map, the human body part segmentation map, and the first clothing image into a deformation model to obtain a deformed second clothing image; and
    a second human body image acquisition module configured to input the second clothing image, the first human body image, the key point feature map, the portrait segmentation map, and the human body part segmentation map into a hybrid model to obtain a second human body image, wherein the target human body in the second human body image wears the target clothing.
  11. An electronic device, comprising:
    one or more processing apparatuses; and
    a storage apparatus configured to store one or more programs,
    wherein the one or more programs, when executed by the one or more processing apparatuses, cause the one or more processing apparatuses to implement the image generation method according to any one of claims 1-9.
  12. A computer-readable medium storing a computer program that, when executed by a processing apparatus, implements the image generation method according to any one of claims 1-9.
PCT/CN2022/118670 2021-09-29 2022-09-14 Image generation method, apparatus, device, and storage medium WO2023051244A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111151607.6A CN113850212A (zh) 2021-09-29 2021-09-29 Image generation method, apparatus, device, and storage medium
CN202111151607.6 2021-09-29

Publications (1)

Publication Number Publication Date
WO2023051244A1 true WO2023051244A1 (zh) 2023-04-06

Family

ID=78976935

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/118670 WO2023051244A1 (zh) 2021-09-29 2022-09-14 图像生成方法、装置、设备及存储介质

Country Status (2)

Country Link
CN (1) CN113850212A (zh)
WO (1) WO2023051244A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117115926A (zh) * 2023-10-25 2023-11-24 天津大树智能科技有限公司 Human body action standard determination method and device based on real-time image processing

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113850212A (zh) * 2021-09-29 2021-12-28 北京字跳网络技术有限公司 图像生成方法、装置、设备及存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109427007A (zh) * 2018-09-17 2019-03-05 叠境数字科技(上海)有限公司 Multi-view-based virtual fitting method
CN111784845A (zh) * 2020-06-12 2020-10-16 腾讯科技(深圳)有限公司 Artificial-intelligence-based virtual try-on method, apparatus, server, and storage medium
CN112330580A (zh) * 2020-10-30 2021-02-05 北京百度网讯科技有限公司 Method, apparatus, computing device, and medium for generating a fused human-body/clothing image
CN112784865A (zh) * 2019-11-04 2021-05-11 奥多比公司 Garment deformation using multi-scale patch adversarial loss
US20210241531A1 * 2020-02-04 2021-08-05 Nhn Corporation Method and apparatus for providing virtual clothing wearing service based on deep-learning
CN113850212A (zh) * 2021-09-29 2021-12-28 北京字跳网络技术有限公司 Image generation method, apparatus, device, and storage medium


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117115926A (zh) * 2023-10-25 2023-11-24 天津大树智能科技有限公司 Human body action standard determination method and device based on real-time image processing
CN117115926B (zh) * 2023-10-25 2024-02-06 天津大树智能科技有限公司 Human body action standard determination method and device based on real-time image processing

Also Published As

Publication number Publication date
CN113850212A (zh) 2021-12-28

Similar Documents

Publication Publication Date Title
US11989350B2 (en) Hand key point recognition model training method, hand key point recognition method and device
WO2023051244A1 (zh) 图像生成方法、装置、设备及存储介质
CN112257876B (zh) 联邦学习方法、装置、计算机设备及介质
WO2022083383A1 (zh) 图像处理方法、装置、电子设备及计算机可读存储介质
WO2022105862A1 (zh) 视频生成及显示方法、装置、设备、介质
WO2023138560A1 (zh) 风格化图像生成方法、装置、电子设备及存储介质
US11425524B2 (en) Method and device for processing audio signal
WO2023072015A1 (zh) 人物风格形象图的生成方法、装置、设备及存储介质
WO2022100680A1 (zh) 混血人脸图像生成方法、模型训练方法、装置和设备
WO2022037602A1 (zh) 表情变换方法、装置、电子设备和计算机可读介质
CN111476783A (zh) 基于人工智能的图像处理方法、装置、设备及存储介质
WO2023273697A1 (zh) 图像处理方法、模型训练方法、装置、电子设备及介质
WO2020253716A1 (zh) 图像生成方法和装置
WO2023232056A1 (zh) 图像处理方法、装置、存储介质及电子设备
WO2023098664A1 (zh) 特效视频的生成方法、装置、设备及存储介质
WO2021088790A1 (zh) 用于目标设备的显示样式调整方法和装置
WO2023030381A1 (zh) 三维人头重建方法、装置、设备及介质
WO2023143222A1 (zh) 图像处理方法、装置、设备及存储介质
WO2022233223A1 (zh) 图像拼接方法、装置、设备及介质
WO2023138441A1 (zh) 视频生成方法、装置、设备及存储介质
CN111833242A (zh) 人脸变换方法、装置、电子设备和计算机可读介质
CN114049417B (zh) 虚拟角色图像的生成方法、装置、可读介质及电子设备
CN108055461B (zh) 自拍角度的推荐方法、装置、终端设备及存储介质
WO2023098649A1 (zh) 视频生成方法、装置、设备及存储介质
WO2024027819A1 (zh) 图像处理方法、装置、设备及存储介质

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 18696889

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 08.07.24).