WO2023124391A1 - Makeup transfer and training method and apparatus for a makeup transfer network - Google Patents

Makeup transfer and training method and apparatus for a makeup transfer network

Info

Publication number
WO2023124391A1
WO2023124391A1 · PCT/CN2022/125086 · CN2022125086W
Authority
WO
WIPO (PCT)
Prior art keywords
image
makeup
partial
sample
transferred
Prior art date
Application number
PCT/CN2022/125086
Other languages
English (en)
French (fr)
Inventor
吴文岩
郑程耀
甘世康
唐斯伟
张丽
钱晨
Original Assignee
上海商汤智能科技有限公司
Priority date
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司
Publication of WO2023124391A1

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 — Computing arrangements based on biological models
    • G06N 3/02 — Neural networks
    • G06N 3/04 — Architecture, e.g. interconnection topology
    • G06N 3/08 — Learning methods
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 — Geometric image transformations in the plane of the image
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 — Arrangements for image or video recognition or understanding
    • G06V 10/20 — Image preprocessing
    • G06V 10/26 — Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
    • G06V 10/40 — Extraction of image or video features
    • G06V 10/46 — Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; salient regional features
    • G06V 10/70 — Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 — Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; blind source separation
    • G06V 10/80 — Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/82 — Arrangements for image or video recognition or understanding using pattern recognition or machine learning, using neural networks

Definitions

  • The present disclosure relates to the technical field of computer vision, and in particular to makeup transfer and to a training method, apparatus, computer device, and storage medium for a makeup transfer network.
  • Makeup transfer is an important direction in the field of image generation in computer vision. Makeup transfer refers to transferring a makeup style to an image that does not have that makeup style. For example, for a face image, makeup transfer may be to transfer a certain makeup style onto a face image without makeup. However, traditional makeup transfer methods achieve only a low degree of makeup restoration (fidelity).
  • A first aspect of the embodiments of the present disclosure provides a makeup transfer method. The method includes: acquiring a target image to be transferred and a partial image to be transferred, where the target image to be transferred includes a target object and the partial image to be transferred includes a local area of the target object; transferring a preset makeup style onto the target image to be transferred through a target makeup transfer network to obtain a transfer target image; transferring the preset makeup style onto the partial image to be transferred through the target makeup transfer network to obtain a transferred partial image; and fusing the transfer target image and the transferred partial image to obtain a makeup transfer result of the target object.
  • A second aspect of the embodiments of the present disclosure provides a method for training a makeup transfer network. The method includes: acquiring a sample image to be transferred and a partial sample image to be transferred, where the sample image to be transferred includes a second sample object and the partial sample image to be transferred includes a local area of the second sample object; transferring the makeup style of a first sample object in a reference sample image onto the sample image to be transferred through an original makeup transfer network to obtain a transferred sample image; transferring the makeup style of the first sample object in a reference partial sample image onto the partial sample image to be transferred through the original makeup transfer network to obtain a transferred partial sample image, where the reference partial sample image includes a local area of the first sample object in the reference sample image, and the local area included in the partial sample image to be transferred is the same as the local area included in the reference partial sample image; and training the original makeup transfer network based on the transferred sample image and the transferred partial sample image to obtain a target makeup transfer network.
  • A third aspect of the embodiments of the present disclosure provides a makeup transfer apparatus. The apparatus includes: an acquisition module, configured to acquire a target image to be transferred and a partial image to be transferred, where the target image to be transferred includes a target object and the partial image to be transferred includes a local area of the target object; a transfer module, configured to transfer a preset makeup style onto the target image to be transferred through a target makeup transfer network to obtain a transfer target image, and to transfer the preset makeup style onto the partial image to be transferred to obtain a transferred partial image; and a fusion module, configured to fuse the transfer target image and the transferred partial image to obtain a makeup transfer result of the target object.
  • A fourth aspect of the embodiments of the present disclosure provides a training apparatus for a makeup transfer network. The apparatus includes: an acquisition module, configured to acquire a sample image to be transferred and a partial sample image to be transferred, where the sample image to be transferred includes a second sample object and the partial sample image to be transferred includes a local area of the second sample object; a first transfer module, configured to transfer the makeup style of a first sample object in a reference sample image onto the sample image to be transferred through an original makeup transfer network to obtain a transferred sample image; a second transfer module, configured to transfer the makeup style of the first sample object in a reference partial sample image onto the partial sample image to be transferred through the original makeup transfer network to obtain a transferred partial sample image, where the reference partial sample image includes a local area of the first sample object in the reference sample image, and the local area included in the partial sample image to be transferred is the same as the local area included in the reference partial sample image; and a training module, configured to train the original makeup transfer network based on the transferred sample image and the transferred partial sample image to obtain a target makeup transfer network.
  • A fifth aspect of the embodiments of the present disclosure provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the method described in any embodiment is implemented.
  • A sixth aspect of the embodiments of the present disclosure provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, the method described in any embodiment is implemented.
  • A seventh aspect of the embodiments of the present disclosure provides a computer program product including a computer program; when the computer program is executed by a processor, the method described in any embodiment is implemented.
  • FIG. 1 is a schematic diagram of makeup transfer.
  • FIG. 2 is a flowchart of a makeup transfer method according to an embodiment of the present disclosure.
  • FIG. 3 is a schematic diagram of a target image to be transferred and a partial image to be transferred according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic diagram of an output result according to an embodiment of the present disclosure.
  • FIG. 5 is a schematic diagram of a training process according to an embodiment of the present disclosure.
  • FIG. 6 is a flowchart of a makeup transfer method according to another embodiment of the present disclosure.
  • FIG. 7 is a flowchart of a training method of a makeup transfer network according to an embodiment of the present disclosure.
  • FIG. 8 is a flowchart of a training method of a makeup transfer network according to another embodiment of the present disclosure.
  • FIG. 9 is a block diagram of a makeup transfer apparatus according to an embodiment of the present disclosure.
  • FIG. 10 is a block diagram of a makeup transfer apparatus according to another embodiment of the present disclosure.
  • FIG. 11 is a block diagram of a training apparatus for a makeup transfer network according to an embodiment of the present disclosure.
  • FIG. 12 is a block diagram of a training apparatus for a makeup transfer network according to another embodiment of the present disclosure.
  • FIG. 13 is a schematic structural diagram of a computer device according to an embodiment of the present disclosure.
  • Although the terms first, second, third, etc. may be used in the present disclosure to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of the present disclosure, first information may also be called second information, and similarly, second information may also be called first information. Depending on the context, the word "if" as used herein may be interpreted as "upon", "when", or "in response to determining".
  • Makeup transfer refers to the transfer of a makeup style to an image that does not have that makeup style.
  • For example, an image with a certain makeup can be obtained first, the features of the makeup in that image are then extracted through a neural network, and the extracted makeup features are transferred to images that do not have that makeup style.
  • The makeup can be produced by any means such as painting, tattooing, stickers, or cosmetics, and the image can be an image of a certain body part (e.g., back, face) of a person or animal.
  • FIG. 1 is a schematic diagram of makeup transfer.
  • As shown in FIG. 1, a preset makeup style can be transferred onto a plain-face image 101 to obtain a transfer result 102 with the preset makeup style.
  • The face included in the transfer result 102 and the face included in the plain-face image 101 are faces of the same person.
  • The preset makeup style may also include makeup styles of other areas.
  • The plain-face image in this embodiment refers to a facial image without makeup.
  • The preset makeup style can also be transferred onto a facial image that already has another makeup style.
  • In this case, a new makeup style can be obtained by overwriting the original makeup style on the face image, or by merging with the original makeup style on the face image.
  • The makeup transfer algorithm needs to ensure the makeup restoration of each area of the facial makeup: for example, the eye shadow, eyeliner, and colored contact lenses in the eye area, and the lipstick highlights, colors, textures, etc. in the mouth area, all need to be transferred onto the user's face with a certain degree of makeup restoration.
  • In related technologies, the robustness and naturalness of the transfer results are low. Since the facial image to be transferred may have arbitrary illumination, angle, face shape, or occlusion, it is difficult for related technologies to ensure that the transferred image remains natural or to handle a wide variety of facial images, so the transfer results are often incongruous.
  • In addition, the transfer algorithm must avoid changing the user's identifiable id (identity) information, also called id attribute information, while transferring the makeup.
  • The id information is used to represent the user's identity. A change in any one or more of the user's facial features, expression, face angle, the single/double-eyelid attribute of the eyes, the opening and closing of the mouth, etc., may cause the id information to change, that is, cause one user to be identified as another user. Therefore, on the premise that the transfer strength is sufficient, it is difficult for related technologies to ensure that the user's own id information remains unchanged, that is, it is difficult to ensure a high degree of id retention.
  • In related technologies, enhancement can be made on only one of the above dimensions of makeup restoration, naturalness, and retention of the user's identifiable id information; the effects of all dimensions cannot be achieved at the same time. That is, related technologies either guarantee the naturalness and id retention of the transferred image at the cost of lower makeup restoration, or sacrifice the naturalness of the transfer and noticeably modify some id attribute information of the face image in order to guarantee strong makeup restoration.
  • An embodiment of the present disclosure provides a makeup transfer method. As shown in FIG. 2, the method includes steps 201 to 203.
  • Step 201: Obtain a target image to be transferred pic_global and a partial image to be transferred pic_local, where the target image to be transferred pic_global includes a target object and the partial image to be transferred pic_local includes a local area of the target object.
  • Step 202: Transfer a preset makeup style onto the target image to be transferred pic_global and the partial image to be transferred pic_local through a target makeup transfer network to obtain a transfer target image gen_global and a transferred partial image gen_local.
  • Step 203: Fuse the transfer target image gen_global and the transferred partial image gen_local to obtain a makeup transfer result of the target object.
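For illustration only, the following Python sketch shows the structure of steps 201 to 203. The network here is a stand-in identity function and the crop boxes are illustrative assumptions; the actual method uses the trained target makeup transfer network and the key-point-based cropping described below.

```python
import numpy as np

def transfer_net(img: np.ndarray) -> np.ndarray:
    return img.copy()  # placeholder for the trained makeup transfer network

def makeup_transfer(pic_global: np.ndarray, local_boxes: dict) -> np.ndarray:
    # Step 202: transfer makeup onto the global image and each local crop.
    gen_global = transfer_net(pic_global)
    gen_locals = {name: transfer_net(pic_global[y0:y1, x0:x1])
                  for name, (y0, y1, x0, x1) in local_boxes.items()}
    # Step 203: fuse each transferred partial image back into the global result
    # (direct paste here; the disclosure uses segmentation-guided blending).
    for name, (y0, y1, x0, x1) in local_boxes.items():
        gen_global[y0:y1, x0:x1] = gen_locals[name]
    return gen_global

pic = np.zeros((1024, 1024, 3), dtype=np.uint8)
result = makeup_transfer(pic, {"left_eye": (300, 428, 250, 378),
                               "mouth": (650, 778, 448, 576)})
```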
  • The solutions of the embodiments of the present disclosure can be used in products such as interactive entertainment, makeup beautification, and virtual makeup try-on.
  • The target makeup transfer network is used to perform makeup transfer on the target image to be transferred pic_global and on the partial image to be transferred pic_local.
  • By performing makeup transfer on the partial image to be transferred pic_local, the makeup details of the local area can be transferred more faithfully, thereby improving the restoration fidelity of the makeup transfer; by performing makeup transfer on the target image to be transferred pic_global, which includes the complete target object, the naturalness of the makeup transfer can be improved.
  • Therefore, the solutions of the embodiments of the present disclosure can take into account both the naturalness and the restoration fidelity of makeup transfer.
  • The target image to be transferred can be a single image, or one or more video frames in a video; it can be collected in real time, or collected and stored in advance.
  • In some cases, the target image to be transferred is an image, or a video frame including the target object in a video, collected in real time, and the partial image to be transferred is cropped from that image or video frame.
  • The video collected in real time may include multiple continuous or discontinuous video frames. Makeup transfer may be performed on all video frames that include the target object, or only on video frames that include the target object and meet preset conditions.
  • The preset conditions may include, but are not limited to, a sharpness condition, a size condition of the target object, and the like.
  • Satisfying the sharpness condition means that the sharpness of the video frame is greater than a preset sharpness threshold; satisfying the size condition of the target object means that the size of the target object in the video frame is within a preset size range.
  • The target image to be transferred pic_global may include a complete target object, for example, a face; the target object is the object whose makeup needs to be transferred.
  • A target object may include multiple local areas; for example, a face includes areas such as the eyes, nose, and mouth.
  • The partial image to be transferred pic_local may include one or more local areas. For example, considering the differences in makeup types and makeup effects, the partial image to be transferred pic_local may include at least one of the left-eye area, the right-eye area, the nose area, the forehead area, the apple-cheek area, and so on. In addition, in order to restore details better, the partial image to be transferred pic_local may also include only one local area, according to the requirements of the makeup transfer.
  • The partial image to be transferred pic_local can be obtained by performing object detection and image segmentation on the target image to be transferred pic_global.
  • In some cases, target detection may be performed on the original image of the target object to determine the key point positions of the target object in the original image; the partial image to be transferred is then cropped from the original image based on the key point positions of the target object.
  • For example, facial key points such as left-eye key points, right-eye key points, nose key points, and mouth key points can be detected, and a partial image to be transferred for the left-eye area can be cropped from the original image based on the positions of the left-eye key points.
  • The partial image to be transferred cropped from the original image may be an image area of the original image with a first preset size, where the first preset size is smaller than the size of the target object and larger than the size of the local area, and the local area is located at a first preset position in the image area of the first preset size.
  • The first preset position may be the center of the image, the intersection of the horizontal and vertical one-third lines of the image, or another position. Since the first preset size is larger than the size of the local area, the cropped partial image to be transferred can include a complete local area, for example, a complete eye area or mouth area.
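A minimal sketch of this key-point-based cropping, assuming the first preset position is the window center and the first preset size is 256*256 pixels (the example values given below); the (y, x) key-point convention and border clamping are implementation assumptions:

```python
import numpy as np

def crop_local_region(image: np.ndarray, keypoints: np.ndarray,
                      size: int = 256) -> np.ndarray:
    # keypoints: array of (y, x) positions for one local area, e.g. the left eye.
    cy, cx = keypoints.mean(axis=0).astype(int)      # centre of the local area
    h, w = image.shape[:2]
    y0 = int(np.clip(cy - size // 2, 0, h - size))   # clamp to image bounds
    x0 = int(np.clip(cx - size // 2, 0, w - size))
    return image[y0:y0 + size, x0:x0 + size]

img = np.zeros((1024, 1024, 3), dtype=np.uint8)
left_eye_kpts = np.array([[400, 320], [410, 360], [395, 340]])
pic_local = crop_local_region(img, left_eye_kpts)    # 256x256 crop, eye centred
```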
  • In some cases, an affine matrix may be established based on the position and angle of the target object in the original image, and the target image to be transferred is cropped from the original image based on the affine matrix.
  • The target image cropped from the original image may be an image area of the original image with a second preset size, where the second preset size is larger than the size of the target object, and the target object is located at a second preset position in the image area of the second preset size.
  • The second preset position may be the center of the image, the intersection of the horizontal and vertical one-third lines of the image, or another position. Since the second preset size is larger than the size of the target object, the cropped target image to be transferred can include the complete target object.
  • The first preset size is, for example, 256*256 pixels; the second preset size is, for example, 1024*1024 pixels; and the size of the target object is, for example, 800*800 pixels.
  • The above numerical values are only illustrative and are not intended to limit the present disclosure; in practical applications, other sizes can be set as required. If the original image is too large or too small, it may be scaled first, and the target image to be transferred is then cropped from the scaled original image at the required size. Alternatively, an image area including the target object may first be cropped from the original image and then scaled to the required size.
  • The original image including the target object may include a background area; the original image may therefore be cropped or background-segmented to obtain the image area where the target object is located (i.e., the target image to be transferred).
  • The target object in the target image to be transferred pic_global may also be adjusted to a preset angle; for example, the preset angle satisfies the condition that the top of the head and the chin of the target object are aligned in the vertical direction.
  • The angle adjustment can be realized by an affine transformation. Adjusting the angle makes it convenient to perform various kinds of processing on the target image to be transferred pic_global, for example, image segmentation and feature extraction.
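A sketch of such an affine angle adjustment using OpenCV, rotating the image so that the chin-to-head-top line becomes vertical. The two key points are assumed inputs from the face key-point detector, and the rotation sign convention may need flipping depending on the detector's coordinate system:

```python
import cv2
import numpy as np

def align_face(image: np.ndarray, head_top: tuple, chin: tuple) -> np.ndarray:
    # head_top and chin are (x, y) pixel coordinates.
    dx = chin[0] - head_top[0]
    dy = chin[1] - head_top[1]
    angle = np.degrees(np.arctan2(dx, dy))        # deviation from vertical
    center = ((head_top[0] + chin[0]) / 2.0, (head_top[1] + chin[1]) / 2.0)
    m = cv2.getRotationMatrix2D(center, -angle, 1.0)  # affine rotation matrix
    h, w = image.shape[:2]
    return cv2.warpAffine(image, m, (w, h))
```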
  • The original image may be an image uploaded by a user (for example, an image stored in a mobile phone photo album) or an image collected in real time by an image acquisition device. The user may also crop the original image to obtain a target image to be transferred that meets the requirements and directly upload that image for makeup transfer.
  • In some cases, the partial image to be transferred is cropped from the original image based on the key point positions of the target object; in other cases, the target image to be transferred is cropped first, and the partial image to be transferred is then cropped from the target image to be transferred.
  • The target image to be transferred pic_global and the partial images to be transferred pic_local of some embodiments are shown in FIG. 3, where the partial images to be transferred pic_local include a right-eye partial image, a left-eye partial image, and a mouth partial image.
  • The partial images to be transferred pic_local are not limited to the above three kinds of partial images; images of local areas other than the above three can also be used according to the actual makeup transfer requirements, which is not limited by the present disclosure.
  • The target makeup transfer network can learn the makeup features of the preset makeup style in advance, so that the preset makeup style can be transferred onto the target image to be transferred pic_global and the partial image to be transferred pic_local.
  • In some embodiments, one target makeup transfer network corresponds to one preset makeup style.
  • In this case, images with the preset makeup style can be used as sample images to train the target makeup transfer network.
  • In other embodiments, one target makeup transfer network may correspond to multiple preset makeup styles.
  • In this case, the target makeup transfer network can be trained by using an image of each of the multiple preset makeup styles as a sample image, where each image carries label information, and the label information is used to identify the preset makeup style corresponding to the image.
  • The target makeup transfer network may include a first sub-network and a second sub-network, where the first sub-network is used to transfer the preset makeup style onto the target image to be transferred pic_global, and the second sub-network is used to transfer the preset makeup style onto the partial image to be transferred pic_local.
  • There may be one or more second sub-networks; in the case of at least two second sub-networks, different second sub-networks are used to perform makeup transfer on partial images to be transferred pic_local of different local areas.
  • For example, a facial image may include local areas such as the left eye, right eye, nose, and mouth.
  • Each sub-network can include a makeup style extractor and a generator. The makeup style extractor is used to extract the makeup feature F_ref from an image ref that has the makeup; the generator is used to generate a style-transferred image (i.e., the transfer target image gen_global or the transferred partial image gen_local) based on the makeup feature F_ref and the image to be transferred (the target image to be transferred pic_global or the partial image to be transferred pic_local).
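A minimal PyTorch sketch of one such sub-network. The convolution stacks are illustrative stand-ins, not the architecture claimed in the disclosure; only the extractor/generator split and the F_ref conditioning follow the text above:

```python
import torch
import torch.nn as nn

class MakeupSubNetwork(nn.Module):
    def __init__(self, feat_dim: int = 64):
        super().__init__()
        self.extractor = nn.Sequential(          # makeup style extractor
            nn.Conv2d(3, feat_dim, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))             # F_ref: (B, feat_dim, 1, 1)
        self.generator = nn.Sequential(          # generator
            nn.Conv2d(3 + feat_dim, feat_dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_dim, 3, 3, padding=1), nn.Tanh())

    def forward(self, pic: torch.Tensor, ref: torch.Tensor) -> torch.Tensor:
        f_ref = self.extractor(ref)                             # makeup feature
        f_map = f_ref.expand(-1, -1, pic.size(2), pic.size(3))  # broadcast HxW
        return self.generator(torch.cat([pic, f_map], dim=1))   # gen image

net = MakeupSubNetwork()
gen_local = net(torch.randn(1, 3, 256, 256), torch.randn(1, 3, 256, 256))
```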
  • By taking the target image to be transferred pic_global as a whole, the overall makeup transfer effect for the target object can be obtained; at the same time, the makeup features of each local area of the target object can be fully exploited, thereby obtaining the makeup transfer effect of the local area.
  • The transferred partial image gen_local is fused into the transfer target image gen_global to obtain the final transferred image gen_face (i.e., the makeup transfer result). The transferred image gen_face includes the target object included in the target image to be transferred, and the target object has the preset makeup style.
  • For example, the target image to be transferred pic_global includes a no-makeup face image of user A, and the preset makeup style includes gray eye shadow, red lipstick, and blue cosmetic contact lenses.
  • Then the transferred image gen_face includes a face image of user A with gray eye shadow, red lipstick, and blue cosmetic contact lenses.
  • Semantic segmentation can be performed on the transfer target image to obtain the position of the local area in the transfer target image; based on that position, the transferred partial image is fused into the transfer target image to obtain the makeup transfer result of the target object.
  • The above image fusion process can be realized by algorithms such as Laplacian (pyramid) fusion and feathering fusion; the specific algorithm used is not limited by this disclosure.
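As one concrete option, a sketch of feathering fusion: the transferred partial image is alpha-blended into the transfer target image with a Gaussian-softened mask, so the seam is smooth. The mask and placement are assumed to come from the semantic segmentation step above; the blur kernel size is an illustrative choice:

```python
import cv2
import numpy as np

def feather_fuse(gen_global: np.ndarray, gen_local: np.ndarray,
                 mask: np.ndarray, y0: int, x0: int) -> np.ndarray:
    # mask: HxW array in {0, 1} marking the local area inside gen_local.
    h, w = gen_local.shape[:2]
    alpha = cv2.GaussianBlur(mask.astype(np.float32), (31, 31), 0)[..., None]
    roi = gen_global[y0:y0 + h, x0:x0 + w].astype(np.float32)
    blended = alpha * gen_local.astype(np.float32) + (1.0 - alpha) * roi
    out = gen_global.copy()
    out[y0:y0 + h, x0:x0 + w] = blended.astype(np.uint8)
    return out
```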
  • In some embodiments, the target makeup transfer network may include a first sub-network, a second sub-network, and a third sub-network.
  • The first sub-network is used to transfer the preset makeup style onto the target image to be transferred to obtain the transfer target image;
  • the second sub-network is used to transfer the preset makeup style onto the partial image to be transferred to obtain the transferred partial image;
  • the third sub-network is used to acquire the transfer target image and the transferred partial image and to fuse them.
  • The original makeup transfer network can be trained in advance based on a sample image to be transferred, a partial sample image to be transferred, a reference sample image, and a reference partial sample image, to obtain the target makeup transfer network.
  • The reference sample image ref_global includes a complete sample object with the preset makeup style, and the sample object included in the reference sample image ref_global and the sample object included in the sample image to be transferred samp_global are sample objects of the same category, for example, both are human faces.
  • The sample image to be transferred samp_global includes a sample object with a makeup style other than the preset makeup style; that is, the sample object included in the reference sample image ref_global and the sample object included in the sample image to be transferred samp_global have different makeup.
  • Plain (no) makeup can be regarded as a special makeup look, i.e., as one of the makeup looks that the sample image to be transferred may contain.
  • The partial sample image to be transferred samp_local includes a local area of the sample object in the sample image to be transferred samp_global; the reference partial sample image ref_local includes a local area of the sample object in the reference sample image ref_global; and the local area included in the partial sample image to be transferred samp_local is the same as the local area included in the reference partial sample image ref_local.
  • Both the partial sample image to be transferred samp_local and the reference partial sample image ref_local may include one or more local areas; for example, both include the left-eye area, or both include the left-eye area and the nose area.
  • The partial sample image to be transferred samp_local can be obtained by performing target detection and image segmentation on the sample image to be transferred samp_global.
  • In some embodiments, the reference sample image is selected from a first image set; the first image set includes a plurality of images, and each image in the first image set includes the same sample object with the preset makeup style.
  • In some embodiments, the reference sample images include a plurality of video frames of a sample video, and each of the plurality of video frames includes a sample object with the preset makeup style.
  • The sample video may be a video directly captured by an image capture device, or may be an edited video.
  • The plurality of video frames of the sample video may be temporally continuous or may include temporally discontinuous frames.
  • In some embodiments, the plurality of video frames of the sample video meet at least one of the following conditions: the angles and/or expressions of the sample objects in at least two video frames are different; the light intensity in at least two video frames is different.
  • In some embodiments, the sample image to be transferred is selected from a second image set; the second image set includes a plurality of images, each image in the second image set includes a sample object that does not have the preset makeup style, and the sample objects included in at least two images in the second image set are different (i.e., have different identities).
  • In this way, the trained target makeup transfer network can fully learn the ability to transfer makeup styles onto objects with different ids, making the makeup transfer results more natural.
  • For ease of description, the sample object included in the sample image to be transferred may be referred to as a second sample object, and the sample object included in the reference sample image may be referred to as a first sample object.
  • During training, the makeup style in the reference sample image ref_global can be transferred onto the sample image to be transferred samp_global through the original makeup transfer network to obtain the transferred sample image samp_gen_global; the makeup style in the reference partial sample image ref_local can be transferred onto the partial sample image to be transferred samp_local through the original makeup transfer network to obtain the transferred partial sample image samp_gen_local; and the original makeup transfer network is trained based on the transferred sample image samp_gen_global and the transferred partial sample image samp_gen_local to obtain the target makeup transfer network.
  • Before training, the sample image to be transferred samp_global and the partial sample image to be transferred samp_local, as well as the reference sample image ref_global and the reference partial sample image ref_local, can be preprocessed, including adjusting the image size and the angle of the sample object. For example, the sample image to be transferred samp_global and the reference sample image ref_global may be adjusted to the same size (for example, the second preset size), and the partial sample image to be transferred samp_local and the reference partial sample image ref_local may be adjusted to the same size (for example, the first preset size).
  • A first loss function can be established based on the transferred sample image samp_gen_global, and a second loss function can be established based on the transferred partial sample image samp_gen_local; the original makeup transfer network is then trained based on the first loss function and the second loss function to obtain the target makeup transfer network.
  • In the case where the target makeup transfer network includes a first sub-network and a second sub-network, the original first sub-network in the original makeup transfer network can be trained based on the first loss function to obtain the first sub-network, and the original second sub-network in the original makeup transfer network can be trained based on the second loss function to obtain the second sub-network.
  • The loss function used to train a sub-network may include at least one of the following:
  • A loss function used to characterize the authenticity loss of the output image of the sub-network. The output image of the sub-network can be input to a discriminator, and the discriminator judges whether the output image is a synthetic image obtained through makeup transfer.
  • The goal is to make the discriminator unable to identify whether the output image is a real image or a synthetic image; therefore, the loss function can be obtained according to the difference between the output result of the discriminator and the real result.
  • This loss function is obtained through the adversarial interplay between the generator and its opponent (the discriminator), so it can also be called the adversarial generation loss function.
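A sketch of such an adversarial generation loss in the non-saturating binary cross-entropy form; the disclosure does not fix a particular GAN loss, so this specific formulation is an assumption:

```python
import torch
import torch.nn.functional as F

def discriminator_loss(d, real: torch.Tensor, fake: torch.Tensor) -> torch.Tensor:
    # Train D to label real images 1 and transferred (synthetic) images 0.
    real_logits, fake_logits = d(real), d(fake.detach())
    return (F.binary_cross_entropy_with_logits(real_logits,
                                               torch.ones_like(real_logits))
            + F.binary_cross_entropy_with_logits(fake_logits,
                                                 torch.zeros_like(fake_logits)))

def generator_adv_loss(d, fake: torch.Tensor) -> torch.Tensor:
    # Train the generator so that D cannot tell fakes from real images.
    logits = d(fake)
    return F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
```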
  • A loss function used to characterize the attribute loss of the output image of the sub-network. The attributes corresponding to different local areas are different.
  • For example, the attributes corresponding to the eye area may include an eyelid attribute, which indicates whether the eyelids are single or double eyelids;
  • the attribute corresponding to the nose area may include the height of the bridge of the nose;
  • the attributes corresponding to the mouth area may include the curvature of the corners of the mouth, and the like.
  • The output image of the sub-network can be input into an attribute classifier to obtain the attribute category corresponding to the output image, and the loss function is obtained by comparing the attribute category of the output image with that of the image to be transferred input into the sub-network (the sample image to be transferred samp_global or the partial sample image to be transferred samp_local).
  • Through this loss function, it can be ensured that the id attribute information of the sample object after transfer is consistent with that of the sample object before transfer; therefore, this loss function can also be called the attribute preservation loss function.
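A sketch of one way to realize this attribute preservation loss: a frozen attribute classifier (e.g., the eyelid classifier) is applied to both the source image and the sub-network output, and divergent predictions are penalized. The KL-divergence form is an assumption; any prediction-consistency measure would fit the description above:

```python
import torch
import torch.nn.functional as F

def attribute_preservation_loss(classifier, src: torch.Tensor,
                                gen: torch.Tensor) -> torch.Tensor:
    with torch.no_grad():
        src_prob = F.softmax(classifier(src), dim=1)   # attribute of source image
    gen_logprob = F.log_softmax(classifier(gen), dim=1)
    # Penalize the generated image drifting to a different attribute category.
    return F.kl_div(gen_logprob, src_prob, reduction="batchmean")
```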
  • A loss function used to characterize the makeup similarity loss between the output image of the sub-network and the reference image input into the sub-network. The output image can be input to the makeup style extractor included in the sub-network to extract the makeup features of the output image, and these are compared for similarity with the makeup features of the reference image input into the sub-network (the reference sample image ref_global or the reference partial sample image ref_local) to obtain the loss function.
  • Through this loss function, it can be ensured that the makeup style after transfer is consistent with the makeup style in the reference image, thereby improving the restoration fidelity of the makeup transfer; therefore, this loss function can also be called the style-consistent loss function.
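A sketch of this style-consistent loss; the L1 distance between extracted makeup features is an illustrative choice, not a distance prescribed by the disclosure:

```python
import torch

def style_consistent_loss(extractor, gen: torch.Tensor,
                          ref: torch.Tensor) -> torch.Tensor:
    f_gen = extractor(gen)   # makeup feature of the transferred image
    f_ref = extractor(ref)   # makeup feature F_ref of the reference image
    return torch.mean(torch.abs(f_gen - f_ref))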
  • A loss function determined based on the similarity between the output image of the sub-network and the image to be transferred input into the sub-network, which may also be called a cycle-consistent loss function.
  • Through this loss function, the output image of the sub-network can be kept consistent with the structural information of the sample object in the original image to be transferred.
  • The structural information includes the semantic information of each point in the image; the semantic information is used to indicate the local area to which a pixel belongs, for example, whether the pixel belongs to the nose area or the mouth area.
  • Here the sub-network may be the first sub-network or the second sub-network.
  • For the first sub-network, the image to be transferred, the reference image input into the sub-network, and the output image of the sub-network are all images that include a complete target object.
  • For the second sub-network, the image to be transferred, the reference image input into the sub-network, and the output image of the sub-network are all images that include a local area of the target object.
  • During training, the first sub-network and each second sub-network can be trained based on at least one of the above four loss functions, and the attribute preservation loss functions used by different second sub-networks can be obtained based on different attribute categories.
  • For example, the attribute preservation loss function used by the second sub-network that processes the left-eye and right-eye areas can be obtained based on the eyelid attribute category,
  • and the attribute preservation loss function used by the second sub-network that processes the mouth area can be obtained based on the lip thickness category and/or the mouth-corner curvature category.
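A sketch of combining the four losses above into one training objective per sub-network; the weights are hypothetical hyper-parameters, not values given in the disclosure:

```python
def total_generator_loss(adv, attr, style, cycle,
                         w_adv=1.0, w_attr=1.0, w_style=10.0, w_cycle=10.0):
    # Weighted sum of adversarial, attribute preservation, style-consistent,
    # and cycle-consistent terms.
    return w_adv * adv + w_attr * attr + w_style * style + w_cycle * cycle
```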
  • Makeup transfer may cause a transferred local area on the target object to have a different color from a non-transferred local area on the target object.
  • For example, the color of the face may differ from the color of the neck after transfer. Therefore, in order to further reduce the incongruity of the makeup transfer result and improve its naturalness, after the transfer target image and the transferred partial image are fused, color migration can be performed on the local areas of the target object that have not undergone makeup transfer.
  • Specifically, the transferred color of the target object may be obtained, and based on the transferred color of the target object, the color of the areas on the target object that have not undergone makeup transfer is adjusted.
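One way to realize this color migration is statistics matching in LAB color space, in the spirit of Reinhard color transfer; treating the "transferred color" as LAB mean/standard-deviation statistics is an assumption of this sketch:

```python
import cv2
import numpy as np

def migrate_color(region: np.ndarray, transferred_skin: np.ndarray) -> np.ndarray:
    # region: untransferred area (e.g. neck); transferred_skin: facial skin
    # after makeup transfer. Both are BGR uint8 images.
    src = cv2.cvtColor(region, cv2.COLOR_BGR2LAB).astype(np.float32)
    ref = cv2.cvtColor(transferred_skin, cv2.COLOR_BGR2LAB).astype(np.float32)
    s_mean, s_std = src.mean((0, 1)), src.std((0, 1)) + 1e-6
    r_mean, r_std = ref.mean((0, 1)), ref.std((0, 1))
    out = (src - s_mean) / s_std * r_std + r_mean     # match LAB statistics
    return cv2.cvtColor(np.clip(out, 0, 255).astype(np.uint8), cv2.COLOR_LAB2BGR)
```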
  • The output image may also be restored to the size of the original image. For example, assuming that the target image to be transferred cropped from the original image has a size of 1024*1024 pixels, the output image can be restored from 1024*1024 pixels to the size of the original image.
  • The partial images corresponding to the main makeup areas may include a left-eyebrow partial image, a right-eyebrow partial image, a left-eye partial image, a right-eye partial image, a nose partial image, and/or a mouth partial image.
  • The following description takes the case where the partial images corresponding to the main makeup areas include the left-eye partial image, the right-eye partial image, and the mouth partial image as an example.
  • First, the face image in the original image and the partial images corresponding to the main makeup areas of the face need to be cropped out and adjusted to specified sizes; these data are prepared for the next stage, the makeup transfer, which includes step (1) to step (4).
  • For example, a face image with a size of 1024*1024 pixels can be cropped from the original image, where the face is centered and the face part occupies an area of 800*800 pixels.
  • Step (4): Perform makeup transfer on the face image, the left-eye partial image, the right-eye partial image, and the mouth partial image, and finally merge the makeup transfer results of the above four images into one image.
  • After that, color migration is performed on the areas of the user's originally exposed skin that have not undergone makeup transfer (for example, the neck and ears) to reduce incongruity and improve naturalness.
  • Step (4) further includes step (4.1) to step (4.3).
  • After the transferred partial images (for example, the mouth partial image gen_mouth) are obtained, a segmentation map of the local areas of the human face can be drawn; the segmentation map indicates the positions of the mouth partial image, the left-eye partial image, and the right-eye partial image, so that they can be fused back into the face image.
  • The target makeup transfer network in the embodiments of the present disclosure may include a first sub-network and at least one second sub-network, and each of the first sub-network and the second sub-networks may include a feature extractor and a generator.
  • The training framework of the second sub-network used for makeup transfer on the left-eye partial image is shown in FIG. 5, where the second sub-network is jointly trained with a discriminator, an eyelid attribute classifier, and a makeup style extractor (also called a feature extractor).
  • The two feature extractors in the above training framework have the same network structure.
  • The other second sub-networks and the first sub-network can be trained using a similar training framework; it is only necessary to replace the eyelid attribute classifier and the eyelid attribute preservation loss function with the corresponding attribute classifier and attribute preservation loss function.
  • The following takes the training process of the second sub-network used for makeup transfer on the left-eye partial image as an example; the training processes of the other sub-networks are similar.
  • The training process includes step (1) to step (7).
  • A single-id makeup video can be used, where single id means that the sample objects included in the video frames have the same id information, and the sample objects in all the video frames have the same makeup style.
  • Such a video generally contains 1,000 to 5,000 video frames.
  • A specified number of video frames may be extracted according to a certain frame-extraction strategy (for example, at a fixed frame-number interval, or by random frame extraction). In addition, no-makeup face images with different id information can be collected to establish a multi-id plain-face data set.
  • The multi-id plain-face data set can include 15,000 images, each image including an isolated human face without makeup.
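A sketch of the fixed-interval frame-extraction strategy mentioned above; the interval and limit values are illustrative assumptions:

```python
import cv2

def extract_frames(video_path: str, interval: int = 10, limit: int = 500):
    # Sample one frame every `interval` frames from a single-id makeup video.
    cap = cv2.VideoCapture(video_path)
    frames, idx = [], 0
    while len(frames) < limit:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % interval == 0:
            frames.append(frame)
        idx += 1
    cap.release()
    return frames
```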
  • Each image in the single-id makeup data set and the multi-id plain-face data set can be processed by the above-mentioned face key point detection, face cropping, and local area cropping.
  • During training, a left-eye partial image src_left_eye is randomly selected from the multi-id plain-face data set as the plain image (that is, the partial sample image to be transferred), and a left-eye partial image is randomly selected from the single-id makeup data set as the reference partial sample image of the left-eye area, denoted ref_left_eye.
  • Adversarial generation loss function: the discriminator distinguishes whether the transferred image is a synthetic image or a real image, thereby establishing the adversarial generation loss function.
  • The adversarial loss function can improve the fidelity (i.e., realism) of the generated results.
  • Eyelid attribute preservation loss function:
  • the transferred image is input into the eyelid attribute classifier for classification.
  • The classification result indicates whether the eyelid attribute in the transferred image is single eyelid or double eyelid. Since the user's id information is expected to remain unchanged, this classification result should be consistent with the classification result of the plain (no-makeup) image.
  • For the other areas, the makeup transfer generation framework is similar to the above-mentioned framework for the left-eye area.
  • Users can upload their own photos to a device such as a terminal and use the makeup transfer method of the present disclosure to obtain a transferred user photo.
  • A makeup video can be used as input to train the target makeup transfer network.
  • The makeup video includes multiple video frames, and each video frame includes a target object with the preset makeup style. In this way, the level of detail and the restoration fidelity of the makeup transfer can be improved.
  • The attribute preservation loss function of a local area is used to ensure that the user's local attributes are not changed, thereby improving the retention of the user's id information during the makeup transfer process.
  • For example, the eyelid attribute preservation loss function is used in the training process of the second sub-network in FIG. 5. If the eyelid attribute is changed in the image generated from the partial sample image to be transferred after makeup transfer by the second sub-network, the value of the eyelid attribute preservation loss function is larger; if the eyelid attribute is not changed in the generated image, the value is smaller. Therefore, by adjusting the network parameters of the second sub-network so that the eyelid attribute preservation loss function takes a smaller value, the retention of the eyelid attribute before and after makeup transfer is improved, thereby improving the retention of the user's id information.
  • The multi-id plain-face data set and the single-id makeup data set are used as sample data. The different images in the single-id makeup data set cover the visual effects presented under various angles and expressions for a specific makeup, and the different images in the multi-id plain-face data set cover the attribute information of target objects with a variety of id information. A target makeup transfer network trained on these two data sets can therefore learn both the subtle variations of the same makeup style and the subtle variations of different id information, so that the retention of user id information and the strength of the makeup transfer (that is, its restoration fidelity) can both be achieved.
  • By using the partial images, the target makeup transfer network can better capture the makeup details of the local areas; by using the complete image, the target makeup transfer network can better grasp the overall makeup characteristics of the target object. Therefore, it can both guarantee the naturalness of the target object and ensure a high degree of restoration of the key makeup areas.
  • In summary, the embodiments of the present disclosure can take into account both the detail restoration and the id retention of makeup transfer, and cover face images in various situations, improving the robustness of makeup transfer; the makeup transfer of the embodiments of the present disclosure does not come at the cost of sacrificing performance on any dimension (restoration fidelity, id retention, robustness, etc.).
  • An embodiment of the present disclosure also provides a makeup transfer method; the method includes steps 601 to 602.
  • Step 601: Obtain a target image to be transferred.
  • Step 602: Transfer a preset makeup style onto the target image to be transferred through a pre-trained target makeup transfer network to obtain a transfer target image.
  • The target makeup transfer network is obtained by training an original makeup transfer network using each of a plurality of video frames and the transferred sample image corresponding to each video frame, where the plurality of video frames include sample objects with the same makeup style, and the transferred sample image corresponding to a video frame is an image obtained by transferring the makeup style of the sample object in that video frame onto a sample image to be transferred through the original makeup transfer network.
  • An embodiment of the present disclosure also provides a method for training a makeup transfer network; the method may include steps 701 to 704.
  • Step 701: Obtain a sample image to be transferred and a partial sample image to be transferred, where the sample image to be transferred includes a second sample object and the partial sample image to be transferred includes a local area of the second sample object.
  • Step 702: Transfer the makeup style of a first sample object in a reference sample image onto the sample image to be transferred through an original makeup transfer network to obtain a transferred sample image.
  • Step 703: Transfer the makeup style of the first sample object in a reference partial sample image onto the partial sample image to be transferred through the original makeup transfer network to obtain a transferred partial sample image.
  • The reference partial sample image includes a local area of the first sample object in the reference sample image, and the local area included in the partial sample image to be transferred is the same as the local area included in the reference partial sample image.
  • Step 704: Train the original makeup transfer network based on the transferred sample image and the transferred partial sample image to obtain a target makeup transfer network.
  • An embodiment of the present disclosure also provides another method for training a makeup transfer network; the method may include steps 801 to 803.
  • Step 801: Obtain a plurality of video frames, where the plurality of video frames include sample objects with the same makeup style.
  • Step 802: For each video frame of the plurality of video frames, transfer the makeup style of the sample object in the video frame onto a sample image to be transferred through the original makeup transfer network to obtain the transferred sample image corresponding to the video frame.
  • Step 803: Train the original makeup transfer network based on each of the plurality of video frames and the transferred sample image corresponding to each video frame to obtain a target makeup transfer network.
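A minimal training-loop sketch for steps 801 to 803, reusing the sub-network and adversarial-loss sketches given earlier. The optimizer settings are assumptions, and only the adversarial term is shown; a full objective would add the attribute, style, and cycle terms:

```python
import torch

def train(net, discriminator, video_frames, plain_loader, epochs: int = 1):
    opt = torch.optim.Adam(net.parameters(), lr=2e-4, betas=(0.5, 0.999))
    for _ in range(epochs):
        for ref, samp in zip(video_frames, plain_loader):
            gen = net(samp, ref)                           # step 802: transfer
            loss = generator_adv_loss(discriminator, gen)  # plus other losses
            opt.zero_grad()
            loss.backward()
            opt.step()                                     # step 803: update
    return net
```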
  • The present disclosure relates to the field of augmented reality. By acquiring the image information of a target object in the real environment and then using various vision-related algorithms to detect or identify the relevant features, states, and attributes of the target object, an AR effect that combines the virtual and the real and matches the specific application is obtained.
  • The target object may involve parts related to the human body, such as faces, limbs, gestures, and actions, or markers related to objects, or sand tables, display areas, or display items related to venues or places.
  • Vision-related algorithms may involve visual positioning, SLAM, 3D reconstruction, image registration, background segmentation, object key point extraction and tracking, object pose or depth detection, and the like.
  • Specific applications can involve not only interactive scenes related to real scenes or objects, such as guided tours, navigation, explanation, reconstruction, and virtual-effect overlay and display, but also special-effect processing related to people, such as interactive scenes of makeup beautification, body beautification, special-effect display, and virtual model display.
  • The relevant features, states, and attributes of the target object can be detected or identified through a convolutional neural network.
  • The above-mentioned convolutional neural network is a network model obtained by performing model training based on a deep learning framework.
  • The writing order of the steps does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of the steps should be determined by their functions and possible internal logic.
  • As shown in FIG. 9, an embodiment of the present disclosure also provides a makeup transfer apparatus. The apparatus includes: an acquisition module 901, configured to acquire a target image to be transferred and a partial image to be transferred, where the target image to be transferred includes a target object and the partial image to be transferred includes a local area of the target object;
  • a transfer module 902, configured to transfer a preset makeup style onto the target image to be transferred through a target makeup transfer network to obtain a transfer target image,
  • and to transfer the preset makeup style onto the partial image to be transferred through the target makeup transfer network to obtain a transferred partial image;
  • and a fusion module 903, configured to fuse the transfer target image and the transferred partial image to obtain the makeup transfer result of the target object.
  • As shown in FIG. 10, an embodiment of the present disclosure also provides a makeup transfer apparatus, which includes: an acquisition module 1001, configured to acquire a target image to be transferred; and a transfer module 1002, configured to transfer a preset makeup style onto the target image to be transferred through a target makeup transfer network to obtain a transfer target image. The target makeup transfer network is trained from an original makeup transfer network using each of a plurality of video frames and the transferred sample image corresponding to each video frame, where the plurality of video frames include sample objects with the same makeup style, and the transferred sample image corresponding to a video frame is an image obtained by transferring the makeup style of the sample object in that video frame onto a sample image to be transferred.
  • As shown in FIG. 11, an embodiment of the present disclosure also provides a training apparatus for a makeup transfer network. The apparatus includes: an acquisition module 1101, configured to acquire a sample image to be transferred and a partial sample image to be transferred, where the sample image to be transferred includes a second sample object and the partial sample image to be transferred includes a local area of the second sample object;
  • a first transfer module 1102, configured to transfer the makeup style of a first sample object in a reference sample image onto the sample image to be transferred through an original makeup transfer network to obtain a transferred sample image;
  • a second transfer module 1103, configured to transfer the makeup style of the first sample object in a reference partial sample image onto the partial sample image to be transferred through the original makeup transfer network to obtain a transferred partial sample image, where the reference partial sample image includes a local area of the first sample object in the reference sample image, and the local area included in the partial sample image to be transferred is the same as the local area included in the reference partial sample image;
  • and a training module 1104, configured to train the original makeup transfer network based on the transferred sample image and the transferred partial sample image to obtain a target makeup transfer network.
  • As shown in FIG. 12, an embodiment of the present disclosure also provides a training apparatus for a makeup transfer network. The apparatus includes: an acquisition module 1201, configured to acquire a plurality of video frames, where the plurality of video frames include sample objects with the same makeup style; a transfer module 1202, configured to, for each video frame of the plurality of video frames, transfer the makeup style of the sample object in the video frame onto a sample image to be transferred through the original makeup transfer network to obtain the transferred sample image corresponding to the video frame; and a training module 1203, configured to train the original makeup transfer network based on each of the plurality of video frames and the transferred sample images corresponding to the respective video frames to obtain a target makeup transfer network.
  • The functions or modules included in the apparatuses provided by the embodiments of the present disclosure can be used to execute the methods described in the method embodiments above; for their specific implementation, reference may be made to the description of the method embodiments above, and details are not repeated here for brevity.
  • An embodiment of this specification also provides a computer device, which includes at least a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the method described in any of the preceding embodiments when executing the program.
  • FIG. 13 shows a schematic diagram of a more specific hardware structure of a computing device provided by the embodiment of this specification.
  • the device may include: a processor 1301 , a memory 1302 , an input/output interface 1303 , a communication interface 1304 and a bus 1305 .
  • the processor 1301 , the memory 1302 , the input/output interface 1303 and the communication interface 1304 are connected to each other within the device through the bus 1305 .
  • the processor 1301 may be implemented by a general-purpose CPU (Central Processing Unit, central processing unit), a microprocessor, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, and is used to execute related programs to realize the technical solutions provided by the embodiments of this specification.
  • the processor 1301 may also include a graphics card, and the graphics card may be an Nvidia titan X graphics card or a 1080Ti graphics card.
  • the memory 1302 can be implemented in the form of ROM (Read Only Memory, read-only memory), RAM (Random Access Memory, random access memory), static storage device, dynamic storage device, etc.
  • the memory 1302 can store operating systems and other application programs. When implementing the technical solutions provided by the embodiments of this specification through software or firmware, the relevant program codes are stored in the memory 1302 and invoked by the processor 1301 for execution.
  • The input/output interface 1303 is used to connect an input/output module to realize information input and output. The input/output module may be configured in the device as a component (not shown in the figure) or externally connected to the device to provide corresponding functions. Input devices may include a keyboard, a mouse, a touch screen, a microphone and various sensors; output devices may include a display, a speaker, a vibrator, an indicator light, and the like.
  • The communication interface 1304 is used to connect a communication module (not shown in the figure) to realize communication between this device and other devices. The communication module may communicate in a wired manner (e.g., USB or network cable) or in a wireless manner (e.g., mobile network, WiFi or Bluetooth).
  • The bus 1305 includes a path for transferring information between the components of the device (e.g., the processor 1301, the memory 1302, the input/output interface 1303 and the communication interface 1304).
  • It should be noted that, although the above device shows only the processor 1301, the memory 1302, the input/output interface 1303, the communication interface 1304 and the bus 1305, in specific implementations the device may further include other components necessary for normal operation. Moreover, the device may include only the components necessary to implement the solutions of the embodiments of this specification, rather than all the components shown in the figure.
  • An embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored, where the program, when executed by a processor, implements the method described in any one of the foregoing embodiments.
  • Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media exclude transitory computer-readable media, such as modulated data signals and carrier waves.
  • A typical implementing device is a computer, which may take the form of a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an e-mail device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
  • The embodiments in this specification are described in a progressive manner; the same or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the device embodiments, being substantially similar to the method embodiments, are described relatively simply, and for relevant parts reference may be made to the description of the method embodiments. The device embodiments described above are merely illustrative: the modules described as separate components may or may not be physically separated, and the functions of the modules may be implemented in one or more pieces of software and/or hardware. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiments, which can be understood and implemented by those skilled in the art without creative effort.


Abstract

Provided are a makeup transfer method and apparatus, and a training method and apparatus for a makeup transfer network. The makeup transfer method includes: acquiring a target image to be transferred and a partial image to be transferred, the target image to be transferred including a target object, and the partial image to be transferred including a local region of the target object; transferring, through a target makeup transfer network, a preset makeup style onto the target image to be transferred and onto the partial image to be transferred respectively, to obtain a transferred target image and a transferred partial image; and fusing the transferred target image and the transferred partial image to obtain a makeup transfer result of the target object.

Description

Makeup transfer, and training method and apparatus for a makeup transfer network
Cross-reference to related application
This application claims priority to Chinese patent application No. 202111653519.6, filed on December 30, 2021, the entire contents of which are incorporated herein by reference.
Technical field
The present disclosure relates to the field of computer vision, and in particular to a makeup transfer method and apparatus, a training method and apparatus for a makeup transfer network, a computer device, and a storage medium.
Background
Makeup transfer is an important direction in the field of image generation in computer vision. Makeup transfer refers to transferring a makeup style onto an image that does not have that makeup style; for a face image, for example, makeup transfer may transfer a certain makeup style onto a bare-faced image. However, conventional makeup transfer methods achieve only a low degree of makeup fidelity.
Summary
A first aspect of the embodiments of the present disclosure provides a makeup transfer method, including: acquiring a target image to be transferred and a partial image to be transferred, the target image to be transferred including a target object, and the partial image to be transferred including a local region of the target object; transferring a preset makeup style onto the target image to be transferred through a target makeup transfer network, to obtain a transferred target image; transferring the preset makeup style onto the partial image to be transferred through the target makeup transfer network, to obtain a transferred partial image; and fusing the transferred target image and the transferred partial image, to obtain a makeup transfer result of the target object.
A second aspect of the embodiments of the present disclosure provides a training method for a makeup transfer network, including: acquiring a sample image to be transferred and a partial sample image to be transferred, the sample image to be transferred including a second sample object, and the partial sample image to be transferred including a local region of the second sample object; transferring the makeup style of a first sample object in a reference sample image onto the sample image to be transferred through an original makeup transfer network, to obtain a transferred sample image; transferring the makeup style of the first sample object in a reference partial sample image onto the partial sample image to be transferred through the original makeup transfer network, to obtain a transferred partial sample image, the reference partial sample image including a local region of the first sample object in the reference sample image, and the local region included in the partial sample image to be transferred being the same as the local region included in the reference partial sample image; and training the original makeup transfer network based on the transferred sample image and the transferred partial sample image, to obtain a target makeup transfer network.
A third aspect of the embodiments of the present disclosure provides a makeup transfer apparatus, including: an acquisition module configured to acquire a target image to be transferred and a partial image to be transferred, the target image to be transferred including a target object, and the partial image to be transferred including a local region of the target object; a transfer module configured to transfer a preset makeup style onto the target image to be transferred through a target makeup transfer network to obtain a transferred target image, and to transfer the preset makeup style onto the partial image to be transferred through the target makeup transfer network to obtain a transferred partial image; and a fusion module configured to fuse the transferred target image and the transferred partial image to obtain a makeup transfer result of the target object.
A fourth aspect of the embodiments of the present disclosure provides a training apparatus for a makeup transfer network, including: an acquisition module configured to acquire a sample image to be transferred and a partial sample image to be transferred, the sample image to be transferred including a second sample object, and the partial sample image to be transferred including a local region of the second sample object; a first transfer module configured to transfer the makeup style of a first sample object in a reference sample image onto the sample image to be transferred through an original makeup transfer network, to obtain a transferred sample image; a second transfer module configured to transfer the makeup style of the first sample object in a reference partial sample image onto the partial sample image to be transferred through the original makeup transfer network, to obtain a transferred partial sample image, the reference partial sample image including a local region of the first sample object in the reference sample image, and the local region included in the partial sample image to be transferred being the same as the local region included in the reference partial sample image; and a training module configured to train the original makeup transfer network based on the transferred sample image and the transferred partial sample image, to obtain a target makeup transfer network.
A fifth aspect of the embodiments of the present disclosure provides a computer-readable storage medium on which a computer program is stored, the program implementing the method of any embodiment when executed by a processor.
A sixth aspect of the embodiments of the present disclosure provides a computer device including a memory, a processor, and a computer program stored in the memory and executable by the processor, the processor implementing the method of any embodiment when executing the program.
A seventh aspect of the embodiments of the present disclosure provides a computer program product including a computer program, the computer program implementing the method of any embodiment when executed by a processor.
It should be understood that the above general description and the following detailed description are exemplary and explanatory only and do not limit the present disclosure.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain the technical solutions of the present disclosure.
FIG. 1 is a schematic diagram of makeup transfer.
FIG. 2 is a flowchart of a makeup transfer method according to an embodiment of the present disclosure.
FIG. 3 is a schematic diagram of a target image to be transferred and partial images to be transferred according to an embodiment of the present disclosure.
FIG. 4 is a schematic diagram of an output result according to an embodiment of the present disclosure.
FIG. 5 is a schematic diagram of a training process according to an embodiment of the present disclosure.
FIG. 6 is a flowchart of a makeup transfer method according to another embodiment of the present disclosure.
FIG. 7 is a flowchart of a training method for a makeup transfer network according to an embodiment of the present disclosure.
FIG. 8 is a flowchart of a training method for a makeup transfer network according to another embodiment of the present disclosure.
FIG. 9 is a block diagram of a makeup transfer apparatus according to an embodiment of the present disclosure.
FIG. 10 is a block diagram of a makeup transfer apparatus according to another embodiment of the present disclosure.
FIG. 11 is a block diagram of a training apparatus for a makeup transfer network according to an embodiment of the present disclosure.
FIG. 12 is a block diagram of a training apparatus for a makeup transfer network according to another embodiment of the present disclosure.
FIG. 13 is a schematic structural diagram of a computer device according to an embodiment of the present disclosure.
Detailed description
Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure; rather, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
The terms used in the present disclosure are for the purpose of describing particular embodiments only and are not intended to limit the present disclosure. The singular forms "a", "said" and "the" used in the present disclosure and the appended claims are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. In addition, the term "at least one" herein means any one of a plurality, or any combination of at least two of a plurality.
It should be understood that, although the terms "first", "second", "third", etc. may be used in the present disclosure to describe various kinds of information, such information should not be limited to these terms, which are only used to distinguish information of the same type from one another. For example, without departing from the scope of the present disclosure, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "while" or "in response to determining".
To enable those skilled in the art to better understand the technical solutions in the embodiments of the present disclosure, and to make the above objects, features and advantages of the embodiments of the present disclosure clearer and easier to understand, the technical solutions in the embodiments of the present disclosure are described in further detail below with reference to the drawings.
Makeup transfer refers to transferring a makeup style onto an image that does not have that makeup style. Typically, an image with a certain makeup is first acquired, the features of the makeup in the image are then extracted by a neural network, and the extracted makeup features are transferred onto an image without that makeup style. The makeup may be obtained by any means, such as face painting, tattooing, stickers or cosmetics, and the image may be an image of a body part (e.g., the back or the face) of a person or an animal. For ease of description, the solutions of the embodiments of the present disclosure are explained below by taking face images as an example. FIG. 1 is a schematic diagram of makeup transfer: a preset makeup style can be transferred onto a bare-faced image 101 to obtain a transfer result 102 with the preset makeup style, where the face in the transfer result 102 and the face in the bare-faced image 101 belong to the same person. Besides the makeup styles of the eye region 1021, the cheek region 1022 and the lip region 1023 shown in FIG. 1, the preset makeup style may also include makeup styles of other regions. A bare-faced image in this embodiment refers to a face image without makeup. Of course, besides being transferred onto a bare-faced image, the preset makeup style may also be transferred onto a face image with another makeup style, either to cover the original makeup style of the face image or to be blended with it to obtain a new makeup style.
A makeup transfer algorithm needs to guarantee the fidelity of every region of the facial makeup: for example, the eye shadow, eyeliner and cosmetic contact lenses of the eye region, and the lipstick highlights, color and texture of the mouth region, all need to be transferred onto the user's face with a certain degree of fidelity. In the related art, however, it is often difficult to reach a sufficiently high makeup fidelity.
In addition, makeup transfer methods in the related art often suffer from the following problems:
(1) The robustness and naturalness of the transfer result are low. Since the face image to be transferred may have arbitrary illumination, angle, face shape or occlusion, the related art can hardly guarantee that the transferred image remains natural, nor can it handle diverse face images, so the transfer result often looks incongruous.
(2) While transferring the makeup, transfer algorithms tend to change the user's recognizable id (identity information, also called id attribute information). The id information characterizes the user's identity; a change in any one or more of the shape of the facial features, the expression, the face angle, the single/double-eyelid attribute of the eyes, or whether the mouth is open or closed may change the id information, i.e., cause one user to be recognized as another. Therefore, under a sufficient transfer strength, the related art can hardly keep the user's id information unchanged, i.e., can hardly guarantee a high degree of id preservation.
(3) Only a single image can be used as the reference makeup, so the appearance of the makeup under different viewing angles and expressions cannot be captured, resulting in poor fidelity and naturalness of the transferred makeup.
Moreover, the related art can generally enhance only one of the above dimensions of makeup fidelity, naturalness and recognizable user id information, and cannot balance all of them: it either guarantees the naturalness and id preservation of the transferred image at the cost of low makeup fidelity, or sacrifices the naturalness of the transfer and noticeably modifies some id attribute information of the face image in order to guarantee strong makeup fidelity.
On this basis, an embodiment of the present disclosure provides a makeup transfer method. Referring to FIG. 2, the method includes steps 201 to 203.
Step 201: Acquire a target image to be transferred, pic_global, and a partial image to be transferred, pic_local, where the target image pic_global includes a target object and the partial image pic_local includes a local region of the target object.
Step 202: Transfer a preset makeup style onto the target image pic_global and the partial image pic_local respectively through a target makeup transfer network, to obtain a transferred target image gen_global and a transferred partial image gen_local.
Step 203: Fuse the transferred target image gen_global and the transferred partial image gen_local, to obtain the makeup transfer result of the target object.
The solutions of the embodiments of the present disclosure can be used in products such as interactive entertainment, makeup beautification and virtual try-on. By performing makeup transfer on the target image pic_global and on the partial image pic_local separately through the target makeup transfer network, the makeup details of the local region can be transferred better, which improves the fidelity of the makeup transfer. Moreover, performing makeup transfer on the target image pic_global, which contains the complete target object, improves the naturalness of the transfer. In summary, the solutions of the embodiments of the present disclosure can balance the naturalness and the fidelity of makeup transfer.
In step 201, the target image to be transferred may be a standalone image, or one or more frames of a video; it may be captured in real time, or captured and stored in advance. In some embodiments, the target image to be transferred is an image or video frame, in a video captured in real time, that includes the target object, and the partial image to be transferred is a partial image, cropped from that image or video frame, that includes the local region. The video captured in real time may include multiple consecutive or non-consecutive frames; makeup transfer may be performed on all frames that include the target object, or only on frames that include the target object and satisfy preset conditions. The preset conditions may include, but are not limited to, a sharpness condition and a size condition of the target object: for example, the sharpness condition is satisfied when the sharpness of the frame is greater than a preset sharpness threshold, and the size condition is satisfied when the size of the target object in the frame falls within a preset size range.
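As a non-limiting illustration of these frame-selection conditions, the following Python sketch keeps a frame only if a simple sharpness proxy (variance of the Laplacian, one common choice) exceeds a threshold and the detected face box falls within a preset size range; the face detector, both thresholds and all names here are assumptions, not part of the disclosure:

```python
import cv2
import numpy as np

def frame_qualifies(frame: np.ndarray, face_box: tuple,
                    min_sharpness: float = 100.0,
                    size_range: tuple = (200, 1200)) -> bool:
    """Return True if the frame meets the (assumed) preset conditions."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Variance of the Laplacian: higher means sharper.
    sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()
    x0, y0, x1, y1 = face_box  # box from an external face detector
    side = max(x1 - x0, y1 - y0)
    return sharpness > min_sharpness and size_range[0] <= side <= size_range[1]
```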
The target image pic_global may include the complete target object, e.g., a face; the target object is the object on which makeup transfer is to be performed. The target object may include multiple local regions; for example, a face includes regions such as the eyes, nose and mouth. The partial image pic_local may include one or more local regions: for example, considering different makeup types and effects, pic_local may include at least one of a left-eye region, a right-eye region, a nose region, a forehead region, a cheek-apple region, etc. Moreover, for better restoration of details, pic_local may also include only a single local region, depending on the needs of the makeup transfer.
The partial image pic_local can be obtained by performing object detection and image segmentation on the target image pic_global. In some embodiments, object detection may be performed on an original image of the target object to determine the keypoint positions of the target object in the original image, and the partial image to be transferred is then cropped from the original image based on those keypoint positions. For example, face keypoints such as left-eye, right-eye, nose and mouth keypoints can be detected, and the partial image of the left-eye region is cropped from the original image based on the positions of the left-eye keypoints.
In some embodiments, the partial image cropped from the original image may be an image region of the original image with a first preset size, where the first preset size is smaller than the size of the target object and larger than the size of the local region, and the local region is located at a first preset position within the image region of the first preset size. In some embodiments, the first preset position may be the center of the image, the intersection of the horizontal and vertical one-third lines of the image, or some other position. Since the first preset size is larger than the size of the local region, the cropped partial image can contain the complete local region, e.g., the complete eye region or mouth region.
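A minimal sketch of this keypoint-centered cropping is given below; the keypoint detector itself is assumed, `landmarks` is a hypothetical (N, 2) array of (x, y) keypoints of one local region (e.g., the left eye), and the image is assumed to be larger than the crop:

```python
import numpy as np

def crop_local_region(image: np.ndarray, landmarks: np.ndarray,
                      crop_size: int = 256) -> np.ndarray:
    """Crop a crop_size x crop_size patch centered on the mean of the
    given keypoints, clamped so the patch stays inside the image."""
    cx, cy = landmarks.mean(axis=0).astype(int)
    half = crop_size // 2
    h, w = image.shape[:2]
    x0 = min(max(cx - half, 0), w - crop_size)
    y0 = min(max(cy - half, 0), h - crop_size)
    return image[y0:y0 + crop_size, x0:x0 + crop_size].copy()
```

Centering the local region in the patch corresponds to choosing the center of the image as the first preset position; other preset positions would simply shift the offsets above.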
Further, object detection may be performed on the original image of the target object to determine the position and angle of the target object in the original image, and the target image to be transferred is cropped from the original image based on that position and angle. Specifically, an affine matrix may be built based on the position and angle of the target object in the original image, and the target image to be transferred is cropped from the original image based on the affine matrix.
In some embodiments, the target image cropped from the original image may be an image region of the original image with a second preset size, where the second preset size is larger than the size of the target object, and the target object is located at a second preset position within the image region of the second preset size. In some embodiments, the second preset position may be the center of the image, the intersection of the horizontal and vertical one-third lines of the image, or some other position. Since the second preset size is larger than the size of the target object, the cropped target image can contain the complete target object.
In some embodiments, the first preset size is, for example, 256*256 pixels, the second preset size is, for example, 1024*1024 pixels, and the size of the target object is, for example, 800*800 pixels. These values are, of course, only illustrative and do not limit the present disclosure; in practical applications, other sizes may be set as needed. If the original image is too large or too small, it may first be scaled and the target image of the required size may then be cropped from the scaled original image; alternatively, the image region containing the target object may first be cropped from the original image and then scaled to the required size.
Further, since the original image containing the target object may include a background region, the original image may also be cropped or background-segmented to obtain the image region where the target object is located (i.e., the target image to be transferred).
In some embodiments, the target object in the target image pic_global may also be rotated to a preset angle; for example, the preset angle is such that the top of the target object's head and its chin are vertically aligned. The angle adjustment can be implemented by an affine transformation, and facilitates various processing of pic_global, such as image segmentation and feature extraction.
In the above embodiments, the original image may be an image uploaded by the user (e.g., an image stored in a phone album), or an image captured in real time by an image capture device. The user may also crop the original image to obtain a qualified target image and directly upload that target image for makeup transfer. Cropping the partial image from the original image based on the keypoint positions of the target object may be done directly from the original image, or the target image may first be cropped from the original image and the partial image then cropped from the target image.
The target image pic_global and partial images pic_local of some embodiments are shown in FIG. 3, where the partial images pic_local include a right-eye partial image, a left-eye partial image and a mouth partial image. In practical applications, pic_local is not limited to (a subset of) these three partial images; images of other local regions may also be configured according to actual makeup transfer needs, which is not limited by the present disclosure.
In step 202, the target makeup transfer network may learn the makeup features of the preset makeup style in advance, so that it can transfer the preset makeup style onto the target image pic_global and the partial image pic_local. In some embodiments, one target makeup transfer network corresponds to one preset makeup style and can be trained using images with that preset makeup style as sample images. In other embodiments, one target makeup transfer network may correspond to multiple preset makeup styles and can be trained using images with each of those preset makeup styles as sample images, where each image carries label information identifying the preset makeup style it corresponds to.
In some embodiments, the target makeup transfer network may include a first sub-network and a second sub-network, the first sub-network being used to transfer the preset makeup style onto the target image pic_global, and the second sub-network being used to transfer the preset makeup style onto the partial image pic_local. There may be one or more second sub-networks; when there are at least two, different second sub-networks perform makeup transfer on partial images pic_local of different local regions. For example, a face image may include local regions such as the left eye, right eye, nose and mouth, so at least four second sub-networks may be used to transfer makeup for the left-eye, right-eye, nose and mouth regions respectively. Each sub-network may include a makeup style extractor and a generator: the makeup style extractor extracts a makeup feature F_ref from a made-up image ref, and the generator generates the style-transferred image (the transferred target image gen_global or the transferred partial image gen_local) based on the extracted makeup feature F_ref and the image to be transferred (the target image pic_global or the partial image pic_local).
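The extractor-generator split of one sub-network can be illustrated with a minimal PyTorch sketch; the layer sizes, module names and fusion scheme below are illustrative assumptions and do not describe the architecture of the disclosure:

```python
import torch
import torch.nn as nn

class StyleExtractor(nn.Module):
    """Maps a made-up reference image to a compact makeup feature F_ref."""
    def __init__(self, feat_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )

    def forward(self, ref: torch.Tensor) -> torch.Tensor:
        return self.net(ref)  # (B, feat_dim) makeup feature

class Generator(nn.Module):
    """Applies a makeup feature to an image to be transferred."""
    def __init__(self, feat_dim: int = 64):
        super().__init__()
        self.enc = nn.Conv2d(3, 64, 3, padding=1)
        self.fuse = nn.Conv2d(64 + feat_dim, 64, 3, padding=1)
        self.dec = nn.Conv2d(64, 3, 3, padding=1)

    def forward(self, src: torch.Tensor, f_ref: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.enc(src))
        # Broadcast the makeup feature over the spatial grid and fuse.
        f = f_ref[:, :, None, None].expand(-1, -1, h.size(2), h.size(3))
        h = torch.relu(self.fuse(torch.cat([h, f], dim=1)))
        return torch.tanh(self.dec(h))  # transferred image gen_*
```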
By this step, the overall makeup transfer effect of the target object can be obtained by taking the target image pic_global as a whole; at the same time, the makeup features of the local regions of the target object can be fully mined, thereby obtaining the makeup transfer effect of those local regions.
In step 203, the transferred partial image gen_local is fused into the transferred target image gen_global to obtain the final transferred image gen_face (i.e., the makeup transfer result). The transferred image gen_face includes the target object of the target image to be transferred, and that target object has the preset makeup style. For example, if the target image pic_global includes a bare-faced image of user A and the preset makeup style includes gray eye shadow, red lipstick and blue cosmetic contact lenses, then the transferred image gen_face includes a face image of user A with gray eye shadow, red lipstick and blue cosmetic contact lenses.
In some embodiments, semantic segmentation may be performed on the transferred target image to obtain the position of the local region in the transferred target image, and the transferred partial image is fused into the transferred target image based on that position, obtaining the makeup transfer result of the target object. The image fusion may use algorithms such as Laplacian blending or feathered blending; the specific algorithm is not limited by the present disclosure.
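As one possible rendering of this mask-guided fusion, the sketch below performs a feathered alpha-blend; it assumes a binary `seg_mask` for the local region produced by the semantic-segmentation step, with the local image already placed in global coordinates, and stands in for whatever blending algorithm (e.g., Laplacian blending) an implementation actually uses:

```python
import cv2
import numpy as np

def feathered_fuse(gen_global: np.ndarray, gen_local: np.ndarray,
                   seg_mask: np.ndarray, feather: int = 15) -> np.ndarray:
    """Alpha-blend gen_local into gen_global using a softened mask.
    All three inputs share the same height and width."""
    # Soften the binary mask so the seam is invisible (kernel must be odd).
    alpha = cv2.GaussianBlur(seg_mask.astype(np.float32),
                             (feather * 2 + 1, feather * 2 + 1), 0)
    alpha = alpha[..., None]  # broadcast over the color channels
    out = alpha * gen_local.astype(np.float32) \
        + (1.0 - alpha) * gen_global.astype(np.float32)
    return np.clip(out, 0, 255).astype(np.uint8)
```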
The above fusion may be implemented outside the target makeup transfer network or inside it. In embodiments where the fusion is implemented inside the target makeup transfer network, the network may include a first sub-network, a second sub-network and a third sub-network: the first sub-network transfers the preset makeup style onto the target image to be transferred to obtain the transferred target image; the second sub-network transfers the preset makeup style onto the partial image to be transferred to obtain the transferred partial image; and the third sub-network acquires the transferred target image and the transferred partial image and fuses them.
In some embodiments, to achieve a good makeup transfer effect, the original makeup transfer network may be trained in advance based on sample images to be transferred, partial sample images to be transferred, reference sample images and reference partial sample images, to obtain the target makeup transfer network. The reference sample image ref_global includes a complete sample object with the preset makeup style; the sample object in ref_global and the sample object in the sample image samp_global belong to the same category, e.g., both are human faces. The sample image samp_global includes a sample object with a makeup style other than the preset makeup style, i.e., the sample objects in ref_global and in samp_global wear different makeup. In particular, in the embodiments of the present disclosure, a bare face may be regarded as a special kind of makeup and may also be one of the makeup categories contained in the sample images to be transferred.
The partial sample image samp_local includes a local region of the sample object in the sample image samp_global; the reference partial sample image ref_local includes a local region of the sample object in the reference sample image ref_global; and the local region included in samp_local is the same as the local region included in ref_local. Both samp_local and ref_local may include one or more kinds of local regions: for example, both include the left-eye region, or both include the left-eye region and the nose region. samp_local can be obtained by performing object detection and image segmentation on samp_global.
In some embodiments, the reference sample images are selected from a first image set, which includes multiple images, each of which contains the same sample object with the preset makeup style. In some embodiments, the reference sample images include multiple video frames of a sample video, each of which contains a sample object with the preset makeup style. The sample video may be captured directly by an image capture device or may be an edited video, and the multiple frames of the sample video may be temporally consecutive or non-consecutive.
In some embodiments, the multiple frames of the sample video satisfy at least one of the following conditions: the angle and/or expression of the sample object differs in at least two frames; the illumination intensity differs in at least two frames. Using multiple video frames as reference sample images ref_global provides a sufficient number of makeup images, so that the trained target makeup transfer network can repeatedly mine the detailed variations of the preset makeup style under different angles, illumination and expressions, thereby improving the fidelity of the makeup transfer.
In some embodiments, the sample images to be transferred are selected from a second image set, which includes multiple images, each containing a sample object with a makeup style other than the preset makeup style; in other words, each image of the second image set contains a sample object without the preset makeup style, and the sample objects included in at least two images of the second image set are different. Using images of different target objects as sample images samp_global enables the trained target makeup transfer network to fully learn the ability to transfer a makeup style onto objects with different ids, making the transfer result more natural. In the embodiments of the present disclosure, the sample object in a sample image to be transferred may be called the second sample object, and the sample object in a reference sample image may be called the first sample object.
During training, the makeup style in the reference sample image ref_global is transferred onto the sample image samp_global through the original makeup transfer network to obtain a transferred sample image samp_gen_global; the makeup style in the reference partial sample image ref_local is transferred onto the partial sample image samp_local through the original makeup transfer network to obtain a transferred partial sample image samp_gen_local; and the original makeup transfer network is trained based on samp_gen_global and samp_gen_local to obtain the target makeup transfer network.
Before the original makeup transfer network is trained, the sample image samp_global and the partial sample image samp_local, as well as the reference sample image ref_global and the reference partial sample image ref_local, may be preprocessed, including adjusting the image sizes and the angle of the sample objects in the images. Here, samp_global and ref_global may be adjusted to the same size (e.g., the second preset size), and samp_local and ref_local may be adjusted to the same size (e.g., the first preset size).
In some embodiments, a first loss function may be built based on the transferred sample image samp_gen_global, a second loss function may be built based on the transferred partial sample image samp_gen_local, and the original makeup transfer network is trained based on the first loss function and the second loss function to obtain the target makeup transfer network. In some embodiments, where the target makeup transfer network includes a first sub-network and a second sub-network, the original first sub-network may be trained based on the first loss function to obtain the first sub-network, and the original second sub-network may be trained based on the second loss function to obtain the second sub-network.
The loss function used to train a sub-network may include at least one of the following:
(1) A loss function characterizing the realism loss of the sub-network's output image. The output image of the sub-network may be input to a discriminator, which judges whether the output image is a composite image obtained by makeup transfer; the goal is that the discriminator cannot tell whether the output image is a real image or a composite one, so this loss function can be derived from the difference between the discriminator's output and the ground truth. Using this loss function improves the realism and naturalness of the output image. Since it is obtained through the adversarial interplay of the generator and the discriminator, it may also be called the adversarial (GAN) loss function.
(2) A loss function characterizing the attribute-similarity loss between the sub-network's output image and the image to be transferred that is input to the sub-network. Different local regions correspond to different attributes: for example, the attributes of the eye regions (the left-eye and right-eye regions) may include an eyelid attribute characterizing whether the eyelid is single or double; the attributes of the nose region may include the height of the nose bridge; the attributes of the mouth region may include the curvature of the mouth corners, etc. The output image of the sub-network may be input to an attribute classifier to obtain the attribute class of the output image, and this loss function is obtained by comparing the similarity between the attribute class of the output image and that of the input image to be transferred (samp_global or samp_local). Using this loss function keeps the id attribute information of the sample object consistent before and after the transfer as much as possible, so it may also be called the attribute-preserving loss function.
(3) A loss function characterizing the makeup-similarity loss between the sub-network's output image and the reference image input to the sub-network. The output image may be input to the makeup style extractor of the sub-network to extract its makeup feature, which is then compared for similarity with the makeup feature of the reference image (ref_global or ref_local) input to the sub-network. Using this loss function keeps the transferred makeup style consistent with the makeup style of the reference image as much as possible, improving the fidelity of the transfer, so it may also be called the style-consistency loss function.
(4) A loss function characterizing the similarity loss between a target sample image and the image to be transferred that is input to the sub-network, the target sample image being obtained by transferring the makeup style of the input image to be transferred onto the sub-network's output image. That is, the output image of the sub-network is taken as a new image to be transferred, the original input image is taken as the reference image, makeup transfer is performed again by the same sub-network, and this loss function is determined from the similarity between the resulting transfer result (the target sample image) and the original input image; it may also be called the cycle-consistency loss function. Using this loss function keeps the structural information of the sample object consistent between the sub-network's output image and the original image to be transferred. The structural information includes the semantic information of each point in the image, which indicates the local region a pixel belongs to, e.g., whether a pixel belongs to the nose region or to the mouth region.
In the above embodiments, the sub-network may be the first sub-network or the second sub-network. When the sub-network is the first sub-network, the input image to be transferred, the reference image and the output image are all images containing the complete target object; when it is the second sub-network, they are all images containing a local region of the target object. When there are multiple second sub-networks, the first sub-network and each second sub-network can each be trained based on at least one of the above four loss functions, and the attribute-preserving loss functions used by different second sub-networks can be based on different attribute categories: for example, the attribute-preserving loss function of the second sub-network handling the left-eye and right-eye regions can be based on the eyelid attribute category, while that of the second sub-network handling the mouth region can be based on a lip-thickness category and/or a mouth-corner-curvature category.
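To make the four supervision signals concrete, here is a minimal PyTorch sketch of the combined objective; the module names (`disc`, `attr_clf`, `style_ext`, `gen`), the equal loss weights and the particular distance functions are all illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def subnetwork_losses(src, ref, gen_img, disc, attr_clf, style_ext, gen):
    """src: image to be transferred; ref: reference image;
    gen_img: the sub-network's output for (src, ref)."""
    # (1) Adversarial loss: push the discriminator to score gen_img as real.
    d_fake = disc(gen_img)
    adv = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    # (2) Attribute-preserving loss: the attribute class (e.g. single vs.
    # double eyelid) predicted for the result must match the source's.
    attr = F.cross_entropy(attr_clf(gen_img), attr_clf(src).argmax(dim=1))
    # (3) Style-consistency loss: the makeup feature of the result should
    # match the makeup feature extracted from the reference image.
    style = F.l1_loss(style_ext(gen_img), style_ext(ref))
    # (4) Cycle-consistency loss: transferring the source's makeup back
    # onto the result should reconstruct the source.
    cyc = F.l1_loss(gen(gen_img, style_ext(src)), src)
    return adv + attr + style + cyc
```

In practice the four terms would carry tuned weights, and the discriminator and attribute classifier would be updated in their own optimization steps; those details are omitted here.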
In some embodiments, makeup transfer may cause the post-transfer color of the transferred local regions of the target object to differ from the color of the non-transferred local regions; for example, after makeup transfer on a face, the color of the face may differ from the color of the neck. Therefore, to further reduce incongruity and improve naturalness, after the transferred target image and the transferred partial image are fused, color transfer may be performed on the local regions of the target object on which no makeup transfer has been performed. In some embodiments, the post-transfer color of the target object may be acquired, and the color of the regions of the target object on which no makeup transfer has been performed is adjusted based on that post-transfer color.
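One simple way to realize such a color adjustment is a Reinhard-style statistics match in Lab space; the sketch below is an assumption-laden stand-in for whatever color-transfer method is actually used, with the hypothetical `region_mask` marking the untouched skin (e.g., neck, ears) and `face_mask` marking the transferred face:

```python
import cv2
import numpy as np

def match_region_color(image: np.ndarray, region_mask: np.ndarray,
                       face_mask: np.ndarray) -> np.ndarray:
    """Shift the mean/std of the untouched region's Lab statistics toward
    those of the transferred face, channel by channel."""
    lab = cv2.cvtColor(image, cv2.COLOR_BGR2LAB).astype(np.float32)
    reg, face = region_mask.astype(bool), face_mask.astype(bool)
    out = lab.copy()
    for c in range(3):
        mu_r, std_r = lab[..., c][reg].mean(), lab[..., c][reg].std() + 1e-6
        mu_f, std_f = lab[..., c][face].mean(), lab[..., c][face].std()
        out[..., c][reg] = (lab[..., c][reg] - mu_r) / std_r * std_f + mu_f
    out = np.clip(out, 0, 255).astype(np.uint8)
    return cv2.cvtColor(out, cv2.COLOR_LAB2BGR)
```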
Further, after the makeup transfer result (output image) of the target object is obtained, the output image may also be restored to the same size as the target image to be transferred. For example, if the target image cropped from the original image is 1024*1024 pixels, it can be restored from 1024*1024 pixels to the original size.
The overall flow of an embodiment of the present disclosure is described below, taking the case where the target image to be transferred is a face image and the partial images to be transferred are the partial images corresponding to the main makeup regions of the face. The partial images of the main makeup regions may include a left-eyebrow partial image, a right-eyebrow partial image, a left-eye partial image, a right-eye partial image, a nose partial image and/or a mouth partial image; the following embodiment takes the case where the partial images of the main makeup regions include the left-eye, right-eye and mouth partial images as an example. The overall flow of the makeup transfer method of this embodiment is as follows.
Given an original image of arbitrary size, the face image in the original image and the partial images corresponding to the main makeup regions of the face need to be cropped out and resized to specified sizes; this data is prepared for the subsequent makeup transfer in steps (1) to (4). A code sketch of the geometric preparation and restoration around these steps follows the numbered steps below.
(1) Perform face keypoint detection on the original image to obtain the keypoint coordinates and the position and angle information of the face; an affine matrix can be generated based on the position and angle information of the face.
(2) Based on the affine matrix, crop a face image of 1024*1024 pixels from the original image, with the face centered and the face portion occupying 800*800 pixels.
(3) In the 1024*1024-pixel face image, crop out the left-eye, right-eye and mouth partial images based on the above face keypoints, and normalize them to 256*256 pixels.
(4) Perform makeup transfer on the face image, the left-eye partial image, the right-eye partial image and the mouth partial image respectively, and finally fuse the makeup transfer results of these four images into one image. Meanwhile, considering that the color after makeup transfer differs from the original skin color, color transfer is performed on the user's originally exposed skin regions on which no makeup transfer was performed (e.g., the neck and ears), to reduce incongruity and improve naturalness. Step (4) further includes steps (4.1) to (4.3).
(4.1) Based on the keypoints obtained in (1) and the four transferred images (the transferred face image gen_face, the transferred left-eye partial image gen_left_eye, the transferred right-eye partial image gen_right_eye and the transferred mouth partial image gen_mouth), a segmentation map of the local regions of the face can be drawn, the segmentation map including the mouth, left-eye and right-eye partial regions.
(4.2) According to the segmentation map, fuse the transferred left-eye partial image gen_left_eye, the transferred right-eye partial image gen_right_eye and the transferred mouth partial image gen_mouth into the transferred face image gen_face to obtain a fusion result blend_face; the fusion algorithm may be Laplacian blending or another blending method.
(4.3) Perform an affine transformation on the fusion result blend_face based on the inverse of the affine matrix obtained in (1), so that blend_face is restored from 1024*1024 pixels to the original size, yielding the final transfer result, as shown in FIG. 4.
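The geometric bookkeeping of steps (1), (2) and (4.3), that is, building the affine matrix, warping the face into the canonical 1024*1024 crop, and warping the fused result back with the inverse matrix, can be sketched with OpenCV as follows; the detected face center, angle and size are assumed inputs from the keypoint-detection step:

```python
import cv2
import numpy as np

def face_to_canonical(image, center_xy, angle_deg, face_size,
                      out_size: int = 1024, face_px: int = 800):
    """Rotate/scale about the face center so the face occupies face_px
    pixels, then shift it to the center of an out_size crop."""
    scale = face_px / float(face_size)
    M = cv2.getRotationMatrix2D(tuple(center_xy), angle_deg, scale)
    M[0, 2] += out_size / 2 - center_xy[0]
    M[1, 2] += out_size / 2 - center_xy[1]
    crop = cv2.warpAffine(image, M, (out_size, out_size))
    return crop, M

def restore_to_original(blend_face, M, orig_shape):
    """Warp the fused 1024x1024 result back to original coordinates; in
    practice the warped region is then composited over the original."""
    M_inv = cv2.invertAffineTransform(M)
    h, w = orig_shape[:2]
    return cv2.warpAffine(blend_face, M_inv, (w, h))
```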
The target makeup transfer network of the embodiments of the present disclosure may include a first sub-network and at least one second sub-network, and the first sub-network and each second sub-network may each include a feature extractor and a generator. The training framework of the second sub-network used to transfer makeup for the left-eye partial image is shown in FIG. 5, where this second sub-network can be jointly trained with a discriminator, an eyelid attribute classifier and a makeup style extractor (also called a feature extractor); the two feature extractors in the above training framework share the same network structure. The other second sub-networks and the first sub-network can be trained with similar frameworks, replacing only the eyelid attribute classifier and the eyelid attribute-preserving loss function with the corresponding attribute classifier and attribute-preserving loss function. The training process of the second sub-network used to transfer makeup for the left-eye partial image is described below as an example; the training processes of the other sub-networks may refer to it. The training process includes steps (1) to (7), and a simplified sketch of one training step follows the numbered steps below.
(1) Extract a set of video frames from a whole single-id video, and build a single-id makeup dataset (i.e., the aforementioned first image set) from this set of frames. Single-id means that the sample objects in the video frames all have the same id information, and the sample objects in the respective frames have the same makeup style. Depending on the video length, one video generally contains 1000 to 5000 frames; a specified number of frames may be extracted according to a certain sampling strategy (e.g., at a fixed frame interval, or randomly). Images of bare, un-made-up faces with different id information can also be collected to build a multi-id bare-face dataset (i.e., the aforementioned second image set); the multi-id bare-face dataset may, for example, contain 15,000 images, each including one distinct un-made-up bare face. Every image in the single-id makeup dataset and the multi-id bare-face dataset may undergo the above face keypoint detection, face cropping and local-region cropping.
(2) Jointly train the second sub-network with the discriminator, the eyelid attribute classifier and the makeup style extractor.
(3) In each training iteration, randomly draw a left-eye partial image src_left_eye from the multi-id bare-face dataset as the bare-face image (i.e., the partial sample image to be transferred), and randomly draw a left-eye partial image from the single-id makeup dataset as the reference partial sample image of the left-eye region, denoted ref_left_eye.
(4) Feed the reference partial sample image ref_left_eye into the feature extractor to obtain a makeup feature (e.g., a 64*1 tensor).
(5) Input the left-eye partial image src_left_eye together with the makeup feature into the generator to generate the transferred image gen_left_eye; this is the result of transferring the makeup feature of the reference partial sample image onto the left-eye partial image. It preserves the id information corresponding to the left-eye partial image, such as its single/double-eyelid attribute and the size and shape of the eye, while carrying the makeup information of the makeup image, such as the cosmetic contact-lens color, the eyelashes and the eye shadow.
(6) Supervise the transferred image gen_left_eye from the following four aspects, which is the core of the transfer training algorithm:
(6.1) The discriminator judges whether the transferred image is a composite image or a real image, thereby establishing the adversarial loss function; the adversarial loss improves the realism of the generated result.
(6.2) The eyelid attribute-preserving loss function: the transferred image is input to the eyelid attribute classifier for classification, the classification result indicating whether the eyelid in the transferred image is single or double. Since the user's id information is expected not to be changed, the classification result should be consistent with that of the bare-face image.
(6.3) The style-consistency loss function: to ensure that the transferred makeup style is the same as the makeup style of the reference makeup image, the transferred image gen_left_eye is also input to the makeup style extractor to obtain a 64*1 makeup feature, which should be similar to the makeup feature obtained from the reference partial sample image ref_left_eye.
(6.4) The cycle-consistency loss function: the transferred image gen_left_eye (the generated image) is taken as a new bare-face image, and the bare-face left-eye partial image src_left_eye is taken as the reference partial sample image; the new bare-face image is transferred again through the transfer framework of FIG. 5, and the result should be similar to the bare-face left-eye partial image src_left_eye.
(7) For the other regions (the whole face, the mouth, the right eye), the makeup transfer generation framework is similar to the makeup transfer generation framework of the left-eye region described above.
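As promised above, a simplified sketch of one training step ties steps (3) to (6) together; the dataset objects, the omitted discriminator and classifier updates, and all hyperparameters are assumptions, and `loss_fn` is, for example, a closure over the four losses sketched earlier:

```python
import random
import torch

def train_step(bare_set, makeup_frames, gen, style_ext, loss_fn, optimizer):
    """One iteration: draw a bare-face crop and a reference crop, run the
    transfer, and optimize the combined objective."""
    src = random.choice(bare_set)       # e.g. src_left_eye, (1, 3, 256, 256)
    ref = random.choice(makeup_frames)  # e.g. ref_left_eye, same shape
    f_ref = style_ext(ref)              # 64-d makeup feature
    gen_img = gen(src, f_ref)           # transferred image gen_left_eye
    loss = loss_fn(src, ref, gen_img)   # the four losses from above
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```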
A user may upload their own photo to, for example, a processor of a terminal and, using the makeup transfer method of the present disclosure, obtain the transferred user photo.
The embodiments of the present disclosure have the following advantages:
(1) The target makeup transfer network can be trained with a makeup video as input, the makeup video containing multiple video frames, each of which includes a target object with the preset makeup style. This improves the detail and fidelity of the makeup transfer.
(2) Attribute-preserving loss functions for the local regions are used during training to ensure that the user's local attributes are not changed, thereby improving the preservation of the user's id information during makeup transfer. For example, when the local region is the left-eye region, the eyelid attribute-preserving loss function is used in the training process of the second sub-network of FIG. 5: if the eyelid attribute changes in the generated image obtained after the partial sample image is transferred by the second sub-network, the eyelid attribute-preserving loss takes a large value; if the eyelid attribute does not change in the generated image, the loss takes a small value. The network parameter values of the second sub-network can therefore be adjusted so that the eyelid attribute-preserving loss takes a small value, which improves the preservation of the eyelid attribute before and after the transfer and hence the preservation of the user's id information.
(3) A multi-id bare-face dataset and a single-id makeup dataset are used as sample data: the different images of the single-id makeup dataset cover the visual appearance of a specific makeup under various angles and expressions, while the different images of the multi-id bare-face dataset cover the attribute information of target objects with various id information. A target makeup transfer network trained on both datasets can therefore learn both the subtle variations of one makeup style and the subtle variations among different ids, balancing the preservation of the user's id information and the strength (i.e., fidelity) of the makeup transfer.
(4) Partial images and the complete image of the target object are used together for the makeup transfer and for the training of the target makeup transfer network: the partial images let the target makeup transfer network better capture the makeup details of the local regions, while the complete image lets it better grasp the overall makeup features of the target object, guaranteeing both the naturalness of the target object and a high fidelity of the key makeup regions.
In summary, the embodiments of the present disclosure can simultaneously take into account the detail fidelity of makeup transfer and the preservation of id, cover face images in various conditions, and improve the robustness of makeup transfer, without sacrificing performance in any dimension (fidelity, id preservation, robustness, etc.).
Referring to FIG. 6, an embodiment of the present disclosure further provides a makeup transfer method, which includes steps 601 to 602.
Step 601: Acquire a target image to be transferred.
Step 602: Transfer a preset makeup style onto the target image to be transferred through a pre-trained target makeup transfer network, to obtain a transferred target image.
The target makeup transfer network is obtained by training an original makeup transfer network with each of a plurality of video frames and the transferred sample image corresponding to each video frame, where the plurality of video frames include sample objects with the same makeup style, and the transferred sample image corresponding to a video frame is an image obtained by transferring the makeup style of the sample object in that video frame onto a sample image to be transferred through the original makeup transfer network.
For details of this embodiment, refer to the makeup transfer method of the previous embodiment, which is not repeated here.
Referring to FIG. 7, an embodiment of the present disclosure further provides a training method for a makeup transfer network, which may include steps 701 to 704.
Step 701: Acquire a sample image to be transferred and a partial sample image to be transferred, the sample image to be transferred including a second sample object, and the partial sample image to be transferred including a local region of the second sample object.
Step 702: Transfer the makeup style of a first sample object in a reference sample image onto the sample image to be transferred through an original makeup transfer network, to obtain a transferred sample image.
Step 703: Transfer the makeup style of the first sample object in a reference partial sample image onto the partial sample image to be transferred through the original makeup transfer network, to obtain a transferred partial sample image, where the reference partial sample image includes the local region of the first sample object in the reference sample image, and the local region included in the partial sample image to be transferred is the same as the local region included in the reference partial sample image.
Step 704: Train the original makeup transfer network based on the transferred sample image and the transferred partial sample image, to obtain a target makeup transfer network.
For details of the steps of this training method, refer to the embodiments of the training process of the target makeup transfer network in the foregoing makeup transfer method, which are not repeated here.
Referring to FIG. 8, an embodiment of the present disclosure further provides a training method for a makeup transfer network, which may include steps 801 to 803.
Step 801: Acquire a plurality of video frames, the plurality of video frames including sample objects with the same makeup style.
Step 802: For each of the plurality of video frames, transfer the makeup style of the sample object in the video frame onto a sample image to be transferred through an original makeup transfer network, to obtain a transferred sample image corresponding to the video frame.
Step 803: Train the original makeup transfer network based on each of the plurality of video frames and the transferred sample images corresponding to the respective video frames, to obtain a target makeup transfer network.
For details of the steps of this training method, refer to the embodiments of the training process of the target makeup transfer network in the foregoing makeup transfer method, which are not repeated here.
The present disclosure relates to the field of augmented reality: by acquiring image information of a target object in a real environment and then detecting or recognizing relevant features, states and attributes of the target object with various vision-related algorithms, an AR effect combining the virtual and the real that matches a specific application can be obtained. Exemplarily, the target object may involve faces, limbs, gestures and actions related to the human body, or markers and identifiers related to objects, or sand tables, display areas or display items related to venues or places. The vision-related algorithms may involve visual positioning, SLAM, three-dimensional reconstruction, image registration, background segmentation, keypoint extraction and tracking of objects, pose or depth detection of objects, etc. Specific applications may involve not only interactive scenarios such as tours, navigation, explanation, reconstruction and virtual-effect overlay related to real scenes or items, but also special-effect processing related to people, such as makeup beautification, body beautification, special-effect display and virtual-model display. The detection or recognition of the relevant features, states and attributes of the target object can be realized by a convolutional neural network, which is a network model obtained by model training based on a deep learning framework.
Those skilled in the art will understand that, in the above methods of the specific implementations, the order in which the steps are written does not imply a strict execution order or impose any limitation on the implementation process; the specific execution order of the steps should be determined by their functions and possible internal logic.
As shown in FIG. 9, an embodiment of the present disclosure further provides a makeup transfer apparatus, including: an acquisition module 901 configured to acquire a target image to be transferred and a partial image to be transferred, the target image to be transferred including a target object, and the partial image to be transferred including a local region of the target object; a transfer module 902 configured to transfer a preset makeup style onto the target image to be transferred through a target makeup transfer network to obtain a transferred target image, and to transfer the preset makeup style onto the partial image to be transferred through the target makeup transfer network to obtain a transferred partial image; and a fusion module 903 configured to fuse the transferred target image and the transferred partial image to obtain a makeup transfer result of the target object.
As shown in FIG. 10, an embodiment of the present disclosure further provides a makeup transfer apparatus, including: an acquisition module 1001 configured to acquire a target image to be transferred; and a transfer module 1002 configured to transfer a preset makeup style onto the target image to be transferred through a target makeup transfer network to obtain a transferred target image, where the target makeup transfer network is obtained by training an original makeup transfer network with each of a plurality of video frames and the transferred sample image corresponding to each video frame, the plurality of video frames including sample objects with the same makeup style, and the transferred sample image corresponding to a video frame being an image obtained by transferring the makeup style of the sample object in that video frame onto a sample image to be transferred through the original makeup transfer network.
As shown in FIG. 11, an embodiment of the present disclosure further provides a training apparatus for a makeup transfer network, including: an acquisition module 1101 configured to acquire a sample image to be transferred and a partial sample image to be transferred, the sample image to be transferred including a second sample object, and the partial sample image to be transferred including a local region of the second sample object; a first transfer module 1102 configured to transfer the makeup style of a first sample object in a reference sample image onto the sample image to be transferred through an original makeup transfer network, to obtain a transferred sample image; a second transfer module 1103 configured to transfer the makeup style of the first sample object in a reference partial sample image onto the partial sample image to be transferred through the original makeup transfer network, to obtain a transferred partial sample image, where the reference partial sample image includes the local region of the first sample object in the reference sample image, and the local region included in the partial sample image to be transferred is the same as the local region included in the reference partial sample image; and a training module 1104 configured to train the original makeup transfer network based on the transferred sample image and the transferred partial sample image, to obtain a target makeup transfer network.
As shown in FIG. 12, an embodiment of the present disclosure further provides a training apparatus for a makeup transfer network, including: an acquisition module 1201 configured to acquire a plurality of video frames, the plurality of video frames including sample objects with the same makeup style; a transfer module 1202 configured to, for each of the plurality of video frames, transfer the makeup style of the sample object in the video frame onto a sample image to be transferred through an original makeup transfer network, to obtain a transferred sample image corresponding to the video frame; and a training module 1203 configured to train the original makeup transfer network based on each of the plurality of video frames and the transferred sample images corresponding to the respective video frames, to obtain a target makeup transfer network.
In some embodiments, the functions of, or modules included in, the apparatuses provided by the embodiments of the present disclosure can be used to execute the methods described in the method embodiments above; for their specific implementation, reference may be made to the description of those method embodiments, which is not repeated here for brevity.
An embodiment of this specification further provides a computer device, which includes at least a memory, a processor, and a computer program stored in the memory and executable by the processor, where the processor, when executing the program, implements the method described in any of the preceding embodiments.
FIG. 13 shows a more specific schematic diagram of the hardware structure of a computing device provided by an embodiment of this specification. The device may include a processor 1301, a memory 1302, an input/output interface 1303, a communication interface 1304 and a bus 1305, where the processor 1301, the memory 1302, the input/output interface 1303 and the communication interface 1304 are communicatively connected to one another within the device through the bus 1305.
The processor 1301 may be implemented as a general-purpose CPU (Central Processing Unit), a microprocessor, an Application-Specific Integrated Circuit (ASIC), or one or more integrated circuits, and is configured to execute relevant programs to realize the technical solutions provided by the embodiments of this specification. The processor 1301 may further include a graphics card, such as an Nvidia titan X or a 1080Ti graphics card.
The memory 1302 may be implemented as a ROM (Read-Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1302 may store an operating system and other application programs; when the technical solutions provided by the embodiments of this specification are implemented in software or firmware, the relevant program code is stored in the memory 1302 and invoked for execution by the processor 1301.
The input/output interface 1303 is used to connect an input/output module to realize information input and output. The input/output module may be configured in the device as a component (not shown in the figure) or externally connected to the device to provide corresponding functions. Input devices may include a keyboard, a mouse, a touch screen, a microphone and various sensors, and output devices may include a display, a speaker, a vibrator, an indicator light, and the like.
The communication interface 1304 is used to connect a communication module (not shown in the figure) to realize communication between this device and other devices. The communication module may communicate in a wired manner (e.g., USB or network cable) or in a wireless manner (e.g., mobile network, WiFi or Bluetooth).
The bus 1305 includes a path for transferring information between the components of the device (e.g., the processor 1301, the memory 1302, the input/output interface 1303 and the communication interface 1304).
An embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored, where the program, when executed by a processor, implements the method described in any of the preceding embodiments.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media exclude transitory computer-readable media, such as modulated data signals and carrier waves.
From the description of the above implementations, those skilled in the art can clearly understand that the embodiments of this specification can be implemented by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the embodiments of this specification, in essence or in the part contributing to the prior art, can be embodied in the form of a software product; the computer software product can be stored in a storage medium such as a ROM/RAM, a magnetic disk or an optical disc, and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods described in the embodiments, or in parts of the embodiments, of this specification.
The systems, apparatuses, modules or units set forth in the above embodiments may specifically be implemented by a computer chip or entity, or by a product with a certain function. A typical implementing device is a computer, which may take the form of a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an e-mail device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
The embodiments in this specification are described in a progressive manner; the same or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the apparatus embodiments, being substantially similar to the method embodiments, are described relatively simply, and for relevant parts reference may be made to the description of the method embodiments. The apparatus embodiments described above are merely illustrative: the modules described as separate components may or may not be physically separated, and when implementing the solutions of the embodiments of this specification, the functions of the modules may be implemented in one or more pieces of software and/or hardware; some or all of the modules may also be selected according to actual needs to achieve the purpose of the solution of the embodiments, which can be understood and implemented by those of ordinary skill in the art without creative effort.
The above are only specific implementations of the embodiments of this specification. It should be noted that those of ordinary skill in the art can make several improvements and modifications without departing from the principles of the embodiments of this specification, and such improvements and modifications should also be regarded as falling within the protection scope of the embodiments of this specification.

Claims (21)

  1. A makeup transfer method, comprising:
    acquiring a target image to be transferred and a partial image to be transferred, the target image to be transferred comprising a target object, and the partial image to be transferred comprising a local region of the target object;
    transferring a preset makeup style onto the target image to be transferred through a target makeup transfer network, to obtain a transferred target image;
    transferring the preset makeup style onto the partial image to be transferred through the target makeup transfer network, to obtain a transferred partial image; and
    fusing the transferred target image and the transferred partial image, to obtain a makeup transfer result of the target object.
  2. The method according to claim 1, wherein acquiring the partial image to be transferred comprises:
    performing object detection on an original image of the target object to determine keypoint positions of the target object in the original image; and
    cropping the partial image to be transferred from the original image based on the keypoint positions of the target object.
  3. The method according to claim 2, wherein cropping the partial image to be transferred from the original image based on the keypoint positions of the target object comprises:
    cropping, from the original image based on the keypoint positions of the target object, an image region of a first preset size, the first preset size being smaller than a size of the target object and larger than a size of the local region of the target object, and the local region of the target object being located at a first preset position within the image region of the first preset size; and
    determining the image region of the first preset size as the partial image to be transferred.
  4. The method according to any one of claims 1 to 3, wherein the target image to be transferred is an image, in a video captured in real time, that comprises the target object, and the partial image to be transferred is a partial image, cropped from the image comprising the target object, that comprises the local region of the target object.
  5. The method according to any one of claims 1 to 4, wherein the target makeup transfer network is obtained by joint training based on sample images to be transferred, partial sample images to be transferred, reference sample images and reference partial sample images;
    wherein the reference sample image comprises a first sample object having the preset makeup style;
    the sample image to be transferred comprises a second sample object having a makeup style other than the preset makeup style;
    the partial sample image to be transferred comprises a local region of the second sample object in the sample image to be transferred; and
    the reference partial sample image comprises a local region of the first sample object in the reference sample image, and the local region comprised in the partial sample image to be transferred is the same as the local region comprised in the reference partial sample image.
  6. The method according to claim 5, wherein the reference sample images are selected from a first image set, the first image set comprising a plurality of images, each image of the first image set comprising a same first sample object having the preset makeup style; and
    the sample images to be transferred are selected from a second image set, the second image set comprising a plurality of images, each image of the second image set comprising a second sample object having a makeup style other than the preset makeup style, and the second sample objects comprised in at least two images of the second image set being different.
  7. The method according to claim 5 or 6, wherein the reference sample images comprise a plurality of video frames of a sample video, each of the plurality of video frames of the sample video comprising a first sample object having the preset makeup style.
  8. The method according to any one of claims 1 to 7, wherein the target makeup transfer network comprises a first sub-network and a second sub-network;
    the first sub-network is configured to transfer the preset makeup style onto the target image to be transferred; and
    the second sub-network is configured to transfer the preset makeup style onto the partial image to be transferred.
  9. The method according to any one of claims 1 to 8, wherein fusing the transferred target image and the transferred partial image to obtain the makeup transfer result of the target object comprises:
    performing semantic segmentation on the transferred target image to obtain a position of the local region in the transferred target image; and
    fusing the transferred partial image into the transferred target image based on the position of the local region in the transferred target image, to obtain the makeup transfer result of the target object.
  10. The method according to any one of claims 1 to 9, wherein, after fusing the transferred target image and the transferred partial image, the method further comprises:
    acquiring a post-transfer color of the target object; and
    adjusting, based on the post-transfer color of the target object, a color of a region of the target object on which no makeup transfer has been performed.
  11. A training method for a makeup transfer network, comprising:
    acquiring a sample image to be transferred and a partial sample image to be transferred, the sample image to be transferred comprising a second sample object, and the partial sample image to be transferred comprising a local region of the second sample object;
    transferring a makeup style of a first sample object in a reference sample image onto the sample image to be transferred through an original makeup transfer network, to obtain a transferred sample image;
    transferring the makeup style of the first sample object in a reference partial sample image onto the partial sample image to be transferred through the original makeup transfer network, to obtain a transferred partial sample image, the reference partial sample image comprising a local region of the first sample object in the reference sample image, and the local region comprised in the partial sample image to be transferred being the same as the local region comprised in the reference partial sample image; and
    training the original makeup transfer network based on the transferred sample image and the transferred partial sample image, to obtain a target makeup transfer network.
  12. The method according to claim 11, wherein training the original makeup transfer network based on the transferred sample image and the transferred partial sample image comprises:
    building a first loss function based on the transferred sample image;
    building a second loss function based on the transferred partial sample image; and
    training the original makeup transfer network based on the first loss function and the second loss function, to obtain the target makeup transfer network.
  13. The method according to claim 12, wherein the target makeup transfer network comprises a first sub-network and a second sub-network, the first sub-network being configured to transfer the preset makeup style onto the target image to be transferred, and the second sub-network being configured to transfer the preset makeup style onto the partial image to be transferred; and
    training the original makeup transfer network based on the first loss function and the second loss function comprises:
    training an original first sub-network based on the first loss function, to obtain the first sub-network; and
    training an original second sub-network based on the second loss function, to obtain the second sub-network.
  14. The method according to claim 13, wherein a loss function used to train a sub-network comprises at least one of:
    a loss function characterizing a realism loss of an output image of the sub-network;
    a loss function characterizing an attribute-similarity loss between the output image of the sub-network and an image to be transferred that is input to the sub-network;
    a loss function characterizing a makeup-similarity loss between the output image of the sub-network and a reference image input to the sub-network; and
    a loss function characterizing a similarity loss between a target sample image and the image to be transferred that is input to the sub-network, the target sample image being obtained by transferring the makeup style of the image to be transferred that is input to the sub-network onto the output image of the sub-network;
    wherein the sub-network is the first sub-network or the second sub-network.
  15. The method according to any one of claims 11 to 14, wherein the reference sample images comprise a plurality of video frames of a video, each reference partial sample image comprises a local region of the first sample object in one video frame, and the first sample objects comprised in the respective video frames of the plurality of video frames are the same and have the same makeup style.
  16. The method according to any one of claims 11 to 15, wherein the number of sample images to be transferred is greater than one, the second sample objects in at least two sample images to be transferred are different objects, and each partial sample image to be transferred comprises a local region of the second sample object in one sample image to be transferred.
  17. A makeup transfer apparatus, comprising:
    an acquisition module configured to acquire a target image to be transferred and a partial image to be transferred, the target image to be transferred comprising a target object, and the partial image to be transferred comprising a local region of the target object;
    a transfer module configured to transfer a preset makeup style onto the target image to be transferred through a target makeup transfer network to obtain a transferred target image, and to transfer the preset makeup style onto the partial image to be transferred through the target makeup transfer network to obtain a transferred partial image; and
    a fusion module configured to fuse the transferred target image and the transferred partial image to obtain a makeup transfer result of the target object.
  18. A training apparatus for a makeup transfer network, comprising:
    an acquisition module configured to acquire a sample image to be transferred and a partial sample image to be transferred, the sample image to be transferred comprising a target object, and the partial sample image to be transferred comprising a local region of the target object;
    a first transfer module configured to transfer a makeup style of the target object in a reference sample image onto the sample image to be transferred through an original makeup transfer network, to obtain a transferred sample image;
    a second transfer module configured to transfer the makeup style of the target object in a reference partial sample image onto the partial sample image to be transferred through the original makeup transfer network, to obtain a transferred partial sample image, the reference partial sample image comprising a local region of the target object in the reference sample image, and the local region comprised in the partial sample image to be transferred being the same as the local region comprised in the reference partial sample image; and
    a training module configured to train the original makeup transfer network based on the transferred sample image and the transferred partial sample image, to obtain a target makeup transfer network.
  19. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1 to 16.
  20. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable by the processor, wherein the processor, when executing the program, implements the method of any one of claims 1 to 16.
  21. A computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the method of any one of claims 1 to 16.
PCT/CN2022/125086 2021-12-30 2022-10-13 Makeup transfer, and training method and apparatus for makeup transfer network WO2023124391A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111653519.6 2021-12-30
CN202111653519.6A CN114283052A (zh) 2021-12-30 Makeup transfer, and training method and apparatus for makeup transfer network

Publications (1)

Publication Number Publication Date
WO2023124391A1 true WO2023124391A1 (zh) 2023-07-06

Family

ID=80878778

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/125086 WO2023124391A1 (zh) 2021-12-30 2022-10-13 Makeup transfer, and training method and apparatus for makeup transfer network

Country Status (2)

Country Link
CN (1) CN114283052A (zh)
WO (1) WO2023124391A1 (zh)


Also Published As

Publication number Publication date
CN114283052A (zh) 2022-04-05


Legal Events

Code 121 (Ep): the EPO has been informed by WIPO that EP was designated in this application.
Ref document number: 22913655; Country of ref document: EP; Kind code of ref document: A1