WO2023273697A1 - Image processing method, model training method, apparatus, electronic device and medium - Google Patents

Image processing method, model training method, apparatus, electronic device and medium

Info

Publication number
WO2023273697A1
WO2023273697A1 (PCT/CN2022/094586)
Authority
WO
WIPO (PCT)
Prior art keywords
image
special effect
generator
image processing
target object
Prior art date
Application number
PCT/CN2022/094586
Other languages
English (en)
French (fr)
Inventor
周思宇
何茜
Original Assignee
北京字跳网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字跳网络技术有限公司
Publication of WO2023273697A1


Classifications

    • G06T3/04
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/002D [Two Dimensional] image generation
    • G06T11/001Texturing; Colouring; Generation of texture or colour

Definitions

  • Embodiments of the present disclosure relate to the field of image processing, for example, to an image processing method, a model training method, a device, an electronic device, and a medium.
  • In the related art, when an App performs face image beautification, it usually extracts the corresponding special effect data from a special effect library based on the selected special effect and applies it at the corresponding position of the face image.
  • The deficiencies of the related art include at least the following: when the special effect data is simply applied to the face image, the resulting special effect image looks unrealistic, and the beautification effect cannot be guaranteed.
  • Embodiments of the present disclosure provide an image processing method, a model training method, an apparatus, an electronic device, and a medium, which can obtain special effect images with better realism and a good beautification effect.
  • In a first aspect, an embodiment of the present disclosure provides an image processing method, including: in response to a special effect trigger instruction, inputting an image to be processed into an image processing model; and outputting, through the image processing model, a target image that contains a special effect object and from which a conflicting object corresponding to the special effect object is removed;
  • wherein the image processing model is trained on images in which the target object is superimposed and the conflicting object is removed; the target object includes an object that has the same rendering effect as the special effect object and is adjustable; and the image from which the conflicting object is removed is generated by a generator trained with a generative adversarial network.
  • In a second aspect, an embodiment of the present disclosure provides a model training method, including: inputting an original image into a first generator and generating, through the first generator, a first image from which a conflicting object corresponding to a special effect object is removed; inputting the first image into a second generator and generating, through the second generator, a second image containing the special effect object;
  • generating a target object based on the special effect object in the second image, and superimposing the target object on the first image to obtain a third image; wherein the target object has the same presentation effect as the special effect object and is adjustable; and training an image processing model according to the original image and the third image;
  • wherein the first generator and the second generator are trained with a generative adversarial network.
  • In a third aspect, an embodiment of the present disclosure further provides an image processing device, including:
  • an input module, configured to input the image to be processed into the image processing model in response to the special effect trigger instruction;
  • an output module, configured to output, through the image processing model, a target image that contains the special effect object and from which the conflicting object corresponding to the special effect object is removed;
  • wherein the image processing model is trained on images in which the target object is superimposed and the conflicting object is removed; the target object includes an object that has the same rendering effect as the special effect object and is adjustable; and the image from which the conflicting object is removed is generated by a generator trained with a generative adversarial network.
  • In a fourth aspect, an embodiment of the present disclosure further provides a model training device, including:
  • a first image generation module, configured to input the original image into the first generator and generate, through the first generator, the first image from which the conflicting object corresponding to the special effect object is removed;
  • a second image generation module, configured to input the first image into a second generator and generate, through the second generator, a second image containing the special effect object;
  • a third image generation module, configured to generate a target object based on the special effect object in the second image and superimpose the target object on the first image to obtain a third image; wherein the target object includes an object that has the same rendering effect as the special effect object and is adjustable;
  • a training module, configured to train an image processing model according to the original image and the third image;
  • wherein the first generator and the second generator are trained with a generative adversarial network.
  • In a fifth aspect, an embodiment of the present disclosure further provides an electronic device, including:
  • one or more processors; and
  • a storage device configured to store one or more programs,
  • wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the image processing method according to any embodiment of the present disclosure, or implement the model training method according to any embodiment of the present disclosure.
  • In a sixth aspect, embodiments of the present disclosure further provide a storage medium containing computer-executable instructions which, when executed by a computer processor, are configured to perform the image processing method according to any embodiment of the present disclosure, or to implement the model training method according to any embodiment of the present disclosure.
  • FIG. 1 is a schematic flowchart of an image processing method provided by an embodiment of the present disclosure;
  • FIG. 2 is a schematic flowchart of an image processing method provided by another embodiment of the present disclosure;
  • FIG. 3 is a schematic flowchart of a model training method provided by an embodiment of the present disclosure;
  • FIG. 4 is a schematic structural diagram of an image processing device provided by an embodiment of the present disclosure;
  • FIG. 5 is a schematic structural diagram of a model training device provided by an embodiment of the present disclosure;
  • FIG. 6 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
  • The term “comprise” and its variations are open-ended, i.e., “including but not limited to”.
  • The term “based on” means “based at least in part on”.
  • The term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one further embodiment”; the term “some embodiments” means “at least some embodiments”. Relevant definitions of other terms will be given in the description below.
  • FIG. 1 is a schematic flowchart of an image processing method provided by an embodiment of the present disclosure.
  • Embodiments of the present disclosure are applicable to image processing scenarios, for example, to beautifying face images.
  • The method can be performed by an image processing device, which can be implemented in the form of software and/or hardware.
  • The device can be integrated into application software and installed, along with the application software, in electronic equipment such as mobile phones and computers.
  • As shown in FIG. 1, the image processing method provided in this embodiment includes the following steps.
  • The device for executing the image processing method provided by the embodiments of the present disclosure can be integrated into application software supporting an image processing function, and can be installed, along with the application software, in electronic devices such as mobile phones and computers.
  • The application software may be multimedia application software related to images/videos, such as image/video acquisition software, image/video processing software, multimedia sharing software, multimedia communication software, and the like, which are not exhaustively listed here.
  • When the electronic device runs such application software, a special effect trigger instruction can be received through the user interface provided by the application software. After receiving the special effect trigger instruction, the application software can call the image processing device to execute the image processing method.
  • The special effect trigger instruction may be an instruction for triggering the execution of a special effect on an image/video.
  • The special effect trigger instruction may carry the special effect identifier of a special effect, and each special effect identifier can uniquely represent the corresponding special effect.
  • Special effects may include, but are not limited to, adding virtual objects to and/or removing real objects from the image.
  • For example, when the application software is a face image beautification application, the special effects may include, but are not limited to, adding a lying-silkworm (aegyo sal) effect, adding a double-eyelid effect, adding a dimple effect, removing eye wrinkles, removing nasolabial folds, and the like.
  • The image to be processed may be an image captured by the application software, or may be an image read by the application software from the storage space of the electronic device.
  • When the application software acquires the image to be processed and receives the special effect trigger instruction, it can use the acquired image to be processed as a call parameter to call the image processing device, so that the image processing device executes the special effect on the image to be processed.
  • The image processing model may be a pre-trained machine learning model, for example, a machine learning model pre-trained by the server of the application software.
  • After training, the server can send the model to the application software for image processing.
  • For each special effect, the server can pre-train a corresponding image processing model; accordingly, the application software may receive multiple image processing models.
  • For this purpose, the image processing device may record the correspondence between the special effect identifier of each special effect and the model identifier of the corresponding image processing model.
  • When the application software acquires the image to be processed and receives the special effect trigger instruction, it can also pass the special effect identifier carried in the instruction to the image processing device as a call parameter.
  • The image processing device can then first determine the target model identifier corresponding to the received special effect identifier according to the recorded correspondence between special effect identifiers and model identifiers, and select the image processing model corresponding to the target model identifier to execute the special effect on the image to be processed.
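The patent describes this identifier lookup only abstractly. As a rough, purely illustrative Python sketch (every name here, including the effect IDs and the `model_store` helper, is hypothetical rather than taken from the patent), the record-and-dispatch logic might look like this:

```python
# Hypothetical sketch of the special-effect-ID -> model-ID dispatch described
# above; the identifiers and the model_store helper are illustrative only.
EFFECT_TO_MODEL = {
    "lying_silkworm": "model_lying_silkworm_v1",
    "double_eyelid": "model_double_eyelid_v1",
}

def run_special_effect(effect_id, image, model_store):
    """Select the image processing model recorded for this effect and apply it."""
    model_id = EFFECT_TO_MODEL[effect_id]  # recorded effect-ID -> model-ID mapping
    model = model_store.load(model_id)     # assumed model-loading helper
    return model(image)                    # target image: effect added, conflict removed
```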
  • The special effect object may be an added virtual object; the conflicting object may be a real object in the image to be processed.
  • For example, the conflicting object may be an object that reduces the rendering effect of the special effect object.
  • For example, if the special effect object is a virtual double eyelid, a real double eyelid in the image to be processed will affect the rendering effect of the special effect, so the real double eyelid can be regarded as the conflicting object corresponding to the special effect object.
  • During the training stage, the image processing model can learn the correspondence between special effect objects and conflicting objects, so that when a special effect is executed based on the trained image processing model, the special effect object can be generated in the image to be processed and the conflicting object can be removed, producing the target image after the special effect is executed.
  • When the image processing model is pre-trained, it may be trained on images in which the target object is superimposed and the conflicting object is removed. For example, training may be performed on pairs consisting of the original image and the original image after the conflicting object has been removed and the target object has been superimposed.
  • The target object may include an object that has the same presentation effect as the special effect object and is adjustable.
  • For example, the target object may be generated based on a special effect object with a good rendering effect.
  • Images containing special effect objects can be generated by a generator trained with a generative adversarial network (GAN), and the image from which conflicting objects are removed can likewise be generated by such a generator. Because the generator and the discriminator learn through an adversarial game during GAN training, a generator obtained in this way can produce more realistic images.
  • Since the target object can be adjusted in advance so that the special effect object is presented well, and images generated by a GAN-trained generator are highly realistic, training the image processing model on images built from the target object and the generator output makes images processed by the trained model more realistic and better beautified.
  • In some embodiments, the image processing method can be applied to a face image beautification application; the special effect object includes a first face tissue object generated based on the image processing model, and the conflicting object includes a second face tissue object contained in the image to be processed.
  • That is, when the application software is a face image beautification application, the special effect object may be a virtual first face tissue object generated based on the image processing model, and the conflicting object may be a real second face tissue object contained in the image to be processed.
  • For example, if the special effect object includes a lying silkworm, the conflicting objects may include real lying silkworms, tear troughs, and/or eye bags; if the special effect object includes a double eyelid, the conflicting object includes a real double eyelid.
  • In the related art, the obtained special effect is simply overlaid at a specific position of the face area, which makes the effect look very false; and because the overlay position is inaccurate, the special effect often fails to produce the expected beautification, resulting in a poor user experience.
  • In contrast, when a special effect is executed by the image processing device of this embodiment, not only can the special effect object be generated, but conflicting objects that affect the presentation of the special effect object can also be removed.
  • Moreover, the image processing model can be trained on images built from the target object and the generator output, so that special effect objects generated by the trained image processing model are more realistic and produce a better beautification effect, which can improve the user experience.
  • In the above embodiments, the image processing method is executed by an image processing device installed in the application software, while the training process of the image processing model is executed by the server of the application software.
  • Alternatively, both the image processing method and the training process of the image processing model may be executed by the application software, or both may be executed by the server; the above description therefore does not limit the execution subject of the image processing method or of the training process of the image processing model.
  • In the technical solution of this embodiment, in response to a special effect trigger instruction, the image to be processed is input into the image processing model, and the image processing model outputs a target image that contains the special effect object and from which the conflicting object corresponding to the special effect object is removed. The image processing model is trained on images in which the target object is superimposed and the conflicting object is removed; the target object includes an object that has the same rendering effect as the special effect object and is adjustable; and the image from which the conflicting object is removed is generated by a generator trained with a GAN.
  • In this way, images for training the image processing model can be obtained. Since the target object can be adjusted in advance to present the special effect object well, and images generated by a GAN-trained generator are highly realistic, training the image processing model on images built from the special effect mask and the generator output makes images processed by the trained model more realistic and better beautified.
  • Embodiments of the present disclosure may be combined with various exemplary solutions in the image processing methods provided in the above embodiments.
  • The image processing method provided in this embodiment describes the training process of the image processing model in detail.
  • By generating images with generators trained on generative adversarial networks, the generation effect of special effect objects can be improved to a certain extent, and the rendering effect of the generated target object (such as the special effect mask) can be improved.
  • By adjusting the target object, such as the special effect mask, to its best rendering effect and superimposing the adjusted target object on the image from which the conflicting object is removed, high-quality images for training the image processing model can be generated.
  • Generating training images from an adjustable target object, rather than repeatedly retraining the generator to obtain better training images, shortens the generation time of the training images while keeping their quality, thereby improving the training efficiency of the image processing model.
  • FIG. 2 is a schematic flowchart of an image processing method provided by another embodiment of the present disclosure. As shown in FIG. 2, the image processing method provided in this embodiment includes:
  • The original image may be a random sample image.
  • The original image can be obtained through acquisition, virtual rendering, or network generation, which are not exhaustively listed here.
  • For example, when the image processing method is applied to a face image beautification application, the original image may be a random face image under different angles/lighting.
  • The first generator can be part of a first generative adversarial network (GAN) during training, and can be trained along with the first GAN based on random first sample images and second sample images that do not contain conflicting objects.
  • The first sample image is a random sample image, and may come from the same sample set as the original image.
  • The second sample image may be a sample image that does not contain conflicting objects.
  • The first sample image and the second sample image can also be obtained through acquisition, virtual rendering, or network generation. For example, assuming the original image is a random face image under different angles/lighting, the special effect object is a lying silkworm, and the conflicting object is an eye bag, the first sample image may also be a random face image under different angles/lighting, and the second sample image may be a face image without under-eye bags under different angles/lighting.
  • The training process of the first generator along with the first GAN may include: first, inputting the first sample image into the first generator so that the first generator generates a first output image that does not contain conflicting objects; then, inputting the second sample image and the first output image into the first discriminator of the first GAN so that the first discriminator judges the authenticity of the two types of images, where the standard judgment is that the second sample image is real and the first output image is fake; and finally, training the first discriminator with the goal that it accurately distinguishes the two types of images, while training the first generator with the goal that the first discriminator can no longer reliably judge the authenticity of its output images. Through this adversarial game between the first generator and the first discriminator, the first generator acquires a good image processing effect for removing conflicting objects.
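The adversarial training just described is standard GAN training. Below is a condensed PyTorch-style sketch of one such training step, offered as an illustration under stated assumptions: the generator `G1`, discriminator `D1`, their optimizers, and the sample batches are assumed to exist, and the binary cross-entropy loss is a common choice rather than one specified by the patent. The second generative adversarial network is trained symmetrically, with images containing the special effect object playing the role of the real samples.

```python
import torch
import torch.nn.functional as F

def gan_train_step(G1, D1, opt_g, opt_d, first_samples, second_samples):
    """One adversarial step: D1 learns to tell conflict-free real images from
    G1's outputs, and G1 learns to produce outputs that D1 judges as real."""
    fake = G1(first_samples)  # first output image: conflicting objects removed

    # Discriminator update: second sample images (no conflicting objects) are
    # labelled real, the generator's outputs are labelled fake.
    real_logits = D1(second_samples)
    fake_logits = D1(fake.detach())
    d_loss = (F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
              + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator update: make the discriminator label G1's output as real.
    fake_logits = D1(fake)
    g_loss = F.binary_cross_entropy_with_logits(fake_logits, torch.ones_like(fake_logits))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```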
  • To improve the training effect and efficiency of the first GAN, preprocessing may also be performed on the first sample image and the second sample image before training.
  • The preprocessing may include, but is not limited to, cropping, rotation, and other processing.
  • For example, assume the special effect object is a lying silkworm and the conflicting object is an eye bag.
  • In that case, the eye area can be determined according to the key points of the face in the image, and the eye area can be cropped out.
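As a concrete illustration of this cropping step, the sketch below derives a padded bounding box around the eye key points and cuts it out of the image; the landmark layout (which indices belong to the eye region) is an assumption, since the patent does not fix a particular key-point scheme.

```python
import numpy as np

def crop_eye_region(image, landmarks, eye_indices, margin=16):
    """Cut out the eye area determined by the face key points, plus a margin."""
    eye_pts = np.asarray(landmarks)[list(eye_indices)]
    x0, y0 = (eye_pts.min(axis=0) - margin).astype(int)
    x1, y1 = (eye_pts.max(axis=0) + margin).astype(int)
    h, w = image.shape[:2]
    x0, y0 = max(x0, 0), max(y0, 0)      # clamp the crop box to the image bounds
    x1, y1 = min(x1, w), min(y1, h)
    return image[y0:y1, x0:x1]
```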
  • In this way, the conflicting objects in the original image can be removed by the first generator, generating the first image from which the conflicting objects are removed.
  • The second generator may be part of a second GAN during training, and may be trained along with the second GAN based on the first sample images and third sample images containing the special effect object.
  • The third sample image may be a sample image containing the special effect object.
  • The third sample image may also be obtained through acquisition, virtual rendering, or network generation.
  • For example, assuming the special effect object is a lying silkworm and the conflicting object is an eye bag, the third sample image may be a face image containing lying silkworms under different angles/lighting.
  • The training process of the second generator along with the second GAN may include: first, inputting the first sample image into the second generator so that the second generator generates a second output image containing the special effect object;
  • then, inputting the third sample image and the second output image into the second discriminator of the second GAN so that the second discriminator judges the authenticity of the two types of images, where the standard judgment is that the third sample image is real and the second output image is fake;
  • and finally, training the second discriminator with the goal that it accurately distinguishes the two types of images, while training the second generator with the goal that the second discriminator can no longer reliably judge the authenticity of its output images. Through this adversarial game between the second generator and the second discriminator, the second generator acquires a good image processing effect for generating special effect objects.
  • Before training, the third sample image can be preprocessed in the same manner, thereby improving the training effect and efficiency of the second generative adversarial network.
  • In this way, by inputting the first image into the second generator, the second image containing the special effect object can be generated.
  • By generating images with GAN-trained generators, the generation effect of special effect objects can be improved to a certain extent.
  • By extracting the special effect object in the second image, an object that has the same presentation effect as the special effect object and is adjustable can be generated.
  • By superimposing the target object on the first image, a third image that contains the special effect object and from which the conflicting object is removed can be obtained.
  • In some embodiments, the target object includes a special effect mask.
  • Generating the target object based on the special effect object in the second image may include: acquiring key points of the special effect object in the second image, and generating the special effect mask according to the key points.
  • The special effect mask may be an overlay layer that exhibits the same effect as the special effect object.
  • The key points of the special effect object in the second image can be extracted by means of the Active Shape Model (ASM) algorithm, the Active Appearance Model (AAM) algorithm, Cascaded Pose Regression (CPR), or deep learning methods.
  • Based on the connection area between the key points, the region of the special effect object can be determined, so as to generate a special effect mask with the same presentation effect as the special effect object.
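The patent does not spell out how the mask is rasterized from the key points. One plausible reading, sketched below with OpenCV, fills the polygon enclosed by the connected key points and feathers its edge so that the overlay blends into the surrounding skin; the feathering step is an assumption.

```python
import cv2
import numpy as np

def make_effect_mask(image_shape, keypoints, feather=5):
    """Turn the effect object's key points into a 0..1 alpha mask by filling
    the region enclosed by the connected key points and softening its edge."""
    mask = np.zeros(image_shape[:2], dtype=np.uint8)
    cv2.fillPoly(mask, [np.asarray(keypoints, dtype=np.int32)], 255)
    k = 2 * feather + 1                       # odd kernel size for the blur
    mask = cv2.GaussianBlur(mask, (k, k), 0)  # feather the mask boundary
    return mask.astype(np.float32) / 255.0
```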
  • After the special effect mask is generated according to the key points, the method may further include: adjusting the special effect mask based on an adjustment instruction. Correspondingly, superimposing the target object on the first image may include: superimposing the adjusted special effect mask on the first image.
  • Adjusting the special effect mask may mean adjusting attributes of the mask such as its shape, size, and strength.
  • The strength attribute of the mask can be understood as the transparency attribute of the mask.
  • In this way, the special effect mask can be adjusted to its optimal presentation effect.
  • The third image used for model training is thus generated from the special effect mask with the best rendering effect, so that images processed by the trained image processing model achieve a better beautification effect.
  • In other embodiments, a target object other than a special effect mask can also be adjusted and superimposed on the first image to obtain the third image, so that the target object presents its best effect.
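A minimal sketch of the adjust-and-superimpose step follows, under the assumption that the strength attribute acts as a scale on the mask's transparency and that the special effect pixels come from the second image; shape and size adjustments could likewise be realized as warps of the mask, which this sketch omits.

```python
import numpy as np

def superimpose(first_image, second_image, mask, strength=0.8):
    """Alpha-blend the special-effect pixels of the second image onto the
    first image, weighted by the strength-adjusted special effect mask."""
    alpha = np.clip(mask * strength, 0.0, 1.0)[..., None]  # strength ~ transparency
    out = (alpha * second_image.astype(np.float32)
           + (1.0 - alpha) * first_image.astype(np.float32))
    return out.astype(first_image.dtype)  # third image for model training
```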
  • For example, training the image processing model according to the original image and the third image may include: inputting the original image into the image processing model so that the image processing model generates a third output image, and training the image processing model with the goal that the deviation between the third output image and the third image is smaller than a preset deviation.
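This objective could be realized as in the sketch below; treating the deviation as an L1 distance is an assumption, since the patent speaks only of a "deviation" and a "preset deviation".

```python
import torch.nn.functional as F

def fit_image_processing_model(model, optimizer, pairs,
                               max_deviation=0.01, max_epochs=100):
    """Train until the deviation between the model's output and the third
    image falls below the preset threshold, or the epoch budget runs out."""
    for _ in range(max_epochs):
        worst = 0.0
        for original, third in pairs:        # (original image, third image) pairs
            output = model(original)         # third output image
            loss = F.l1_loss(output, third)  # deviation measure (assumed L1)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            worst = max(worst, loss.item())
        if worst < max_deviation:            # preset deviation reached
            return
```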
  • In this way, the trained image processing model can output images with the same effect as those generated based on the generators and the special effect mask.
  • In other embodiments, the second image generated by the generators can also be used directly for training the image processing model.
  • However, since the rendering effect of the target object can be adjusted flexibly, there is no need to repeatedly retrain the generators to obtain better training images, which reduces the training time and improves the training efficiency of the model.
  • In an application scenario of beautifying face images, the process of training the image processing model can be summarized as follows: first, based on the first generator G1, the first image G1(A), from which conflicting objects are removed, is generated from the original image A; second, based on the second generator G2, the second image G2(G1(A)), which contains the special effect object, is generated from the first image G1(A); third, a special effect mask covering the special effect object area is made according to the key points of the second image G2(G1(A)); then, by adjusting the mask, the presentation of the special effect object can be tuned to its best; next, the adjusted mask is superimposed on the first image G1(A) to generate the target image mask(G1(A)); finally, the target image mask(G1(A)) and the original image A can be used to train the image processing model.
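Tying the summary together, the sketch below builds one (input, supervision) training pair by reusing the helpers sketched earlier (`make_effect_mask`, `superimpose`); `G1`, `G2`, and the key-point extractor are assumed to be available as callables, and the mask adjustment is folded into `superimpose`'s strength parameter.

```python
def build_training_pair(A, G1, G2, extract_keypoints):
    """Produce the (original image, target image) pair used to train the model."""
    g1_a = G1(A)                                 # first image G1(A): conflict removed
    g2_g1_a = G2(g1_a)                           # second image G2(G1(A)): effect added
    kps = extract_keypoints(g2_g1_a)             # key points of the effect object
    mask = make_effect_mask(g2_g1_a.shape, kps)  # special effect mask (adjustable)
    target = superimpose(g1_a, g2_g1_a, mask)    # target image mask(G1(A))
    return A, target                             # training pair for the model
```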
  • In the technical solution of this embodiment, the image processing model is trained on images in which the target object is superimposed and the conflicting object is removed; the target object includes an object that has the same rendering effect as the special effect object and is adjustable; and the image from which the conflicting object is removed is generated by a generator trained with a GAN.
  • After training, the image processing model can be installed in the face image beautification application, so that the application software can process face images online.
  • In other embodiments, the trained first generator and second generator can also be applied directly in the face image beautification application.
  • However, compared with deploying the two generators, the image processing model is smaller.
  • The technical solutions of the embodiments of the present disclosure describe the training process of the image processing model in detail.
  • By generating images with generators trained on GANs, the generation effect of special effect objects can be improved to a certain extent, and the rendering effect of the generated target object (such as the special effect mask) can be improved.
  • The image processing method provided by this embodiment belongs to the same disclosed concept as the image processing method provided by the above embodiment; technical details not described in detail in this embodiment can be found in the above embodiment, and the same technical features have the same beneficial effects in this embodiment as in the above embodiment.
  • FIG. 3 is a schematic flowchart of a model training method provided by an embodiment of the present disclosure.
  • The embodiments of the present disclosure are applicable to training an image processing model, for example, an image processing model for beautifying face images.
  • The method can be executed by a model training device, which can be implemented in the form of software and/or hardware and configured in an electronic device, such as a server.
  • As shown in FIG. 3, the model training method provided in this embodiment includes:
  • The first generator and the second generator are trained with generative adversarial networks.
  • In some embodiments, the first generator is part of a first GAN during training, and is trained with the first GAN based on random first sample images and second sample images that do not contain the conflicting object;
  • the second generator is part of a second GAN during training, and is trained with the second GAN based on the first sample images and third sample images containing the special effect object.
  • In some embodiments, the target object may include a special effect mask.
  • Generating the target object based on the special effect object in the second image may include: acquiring key points of the special effect object in the second image, and generating the special effect mask according to the key points.
  • After the special effect mask is generated according to the key points, the method may further include: adjusting the special effect mask based on an adjustment instruction; correspondingly, superimposing the target object on the first image includes: superimposing the adjusted special effect mask on the first image.
  • Since the target object can be adjusted in advance to present the special effect object well, and images generated by a GAN-trained generator are highly realistic, training the image processing model on images built from the target object and the generator output makes images processed by the trained model more realistic and better beautified.
  • After training, the image processing model can be applied to execute the image processing method disclosed in the above embodiments, to obtain a target image that contains the special effect object and from which conflicting objects are removed.
  • The trained image processing model can be applied to a face image beautification application; the special effect object includes a first face tissue object generated based on the image processing model, and the conflicting object includes a second face tissue object contained in the image to be processed.
  • For example, if the special effect object includes a lying silkworm, the conflicting objects may include real lying silkworms, tear troughs, and/or eye bags; if the special effect object includes a double eyelid, the conflicting object includes a real double eyelid.
  • In the technical solution of this embodiment, the original image is input into the first generator, which generates the first image from which the conflicting object corresponding to the special effect object is removed; the first image is input into the second generator, which generates the second image containing the special effect object; a target object is generated based on the special effect object in the second image and superimposed on the first image to obtain the third image; and the image processing model is trained according to the original image and the third image, where the first generator and the second generator are trained with generative adversarial networks.
  • By generating images with GAN-trained generators, the generation effect of special effect objects can be improved to a certain extent, and the rendering effect of the generated target object can be improved.
  • The target object can be a special effect mask.
  • By adjusting the target object, such as the special effect mask, to its best rendering effect and superimposing it on the image from which the conflicting object is removed, high-quality images for model training can be generated.
  • Generating training images from an adjustable target object, rather than repeatedly retraining the generators to obtain better training images, shortens the generation time of the training images while keeping their quality, thereby improving the training efficiency of the image processing model.
  • The model training method provided by this embodiment belongs to the same disclosed concept as the image processing method provided by the above embodiments; technical details not described in detail in this embodiment can be found in the above embodiments, and the same technical features have the same beneficial effects in this embodiment as in the above embodiments.
  • FIG. 4 is a schematic structural diagram of an image processing device provided by an embodiment of the present disclosure.
  • The image processing device provided in this embodiment is suitable for image processing, for example, for beautifying face images.
  • As shown in FIG. 4, this embodiment provides an image processing device, including:
  • an input module 410, configured to input the image to be processed into the image processing model in response to the special effect trigger instruction;
  • an output module 420, configured to output, through the image processing model, the target image that contains the special effect object and from which the conflicting object corresponding to the special effect object is removed;
  • wherein the image processing model is trained on images in which the target object is superimposed and the conflicting object is removed; the target object includes an object that has the same rendering effect as the special effect object and is adjustable; and the image from which the conflicting object is removed is generated by a generator trained with a GAN.
  • In some embodiments, the image processing device may also include:
  • a pre-training module, configured to train the image processing model based on the following steps:
  • inputting the original image into the first generator, which generates the first image from which the conflicting object is removed; inputting the first image into the second generator, which generates the second image containing the special effect object; generating the target object based on the special effect object in the second image and superimposing the target object on the first image to obtain the third image; and training the image processing model according to the original image and the third image.
  • In some embodiments, the first generator is part of a first GAN during training, and is trained with the first GAN based on random first sample images and second sample images that do not contain conflicting objects;
  • the second generator is part of a second GAN during training, and is trained with the second GAN based on the first sample images and third sample images containing the special effect object.
  • In some embodiments, the target object includes a special effect mask,
  • and the pre-training module may be configured to: obtain key points of the special effect object in the second image, and generate the special effect mask according to the key points.
  • The pre-training module may also be configured to: after generating the special effect mask according to the key points, adjust the special effect mask based on the adjustment instruction, and superimpose the adjusted special effect mask on the first image.
  • In some embodiments, the image processing device may be applied to a face image beautification application; the special effect object includes a first face tissue object generated based on the image processing model, and the conflicting object includes a second face tissue object contained in the image to be processed.
  • The image processing device provided by the embodiments of the present disclosure can execute the image processing method provided by any embodiment of the present disclosure, and has corresponding functional modules and beneficial effects for executing the method.
  • FIG. 5 is a schematic structural diagram of a model training device provided by an embodiment of the present disclosure.
  • The model training device provided in this embodiment is suitable for training an image processing model, for example, an image processing model for beautifying face images.
  • As shown in FIG. 5, the model training device provided in this embodiment includes:
  • a first image generation module 510, configured to input the original image into the first generator and generate, through the first generator, the first image from which the conflicting object corresponding to the special effect object is removed;
  • a second image generation module 520, configured to input the first image into the second generator and generate, through the second generator, the second image containing the special effect object;
  • a third image generation module 530, configured to generate a target object based on the special effect object in the second image and superimpose the target object on the first image to obtain a third image; wherein the target object has the same rendering effect as the special effect object and is adjustable;
  • a training module 540, configured to train the image processing model according to the original image and the third image;
  • wherein the first generator and the second generator are trained with generative adversarial networks.
  • In some embodiments, the first generator is part of a first GAN during training, and is trained with the first GAN based on random first sample images and second sample images that do not contain conflicting objects;
  • the second generator is part of a second GAN during training, and is trained with the second GAN based on the first sample images and third sample images containing the special effect object.
  • In some embodiments, the target object includes a special effect mask,
  • and the third image generation module may be configured to: acquire key points of the special effect object in the second image, and generate the special effect mask according to the key points.
  • The third image generation module may also be configured to: after generating the special effect mask according to the key points, adjust the special effect mask based on the adjustment instruction, and superimpose the adjusted special effect mask on the first image.
  • In some embodiments, the model training device may also include:
  • an image processing module, configured to, after the training of the image processing model is completed, input the image to be processed into the image processing model in response to the special effect trigger instruction, and output, through the image processing model, the target image that contains the special effect object and from which the conflicting object corresponding to the special effect object is removed.
  • The image processing module can be applied to a face image beautification application; the special effect object includes a first face tissue object generated based on the image processing model, and the conflicting object includes a second face tissue object contained in the image to be processed.
  • The model training device provided by the embodiments of the present disclosure can execute the model training method provided by any embodiment of the present disclosure, and has corresponding functional modules and beneficial effects for executing the method.
  • Referring now to FIG. 6, it shows a schematic structural diagram of an electronic device (such as a terminal device or a server) 600 suitable for implementing the embodiments of the present disclosure.
  • The terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and vehicle-mounted terminals (such as car navigation terminals), as well as fixed terminals such as digital TVs and desktop computers.
  • The electronic device shown in FIG. 6 is only an example, and should not impose any limitation on the functions and application scope of the embodiments of the present disclosure.
  • As shown in FIG. 6, the electronic device 600 may include a processing device (such as a central processing unit, a graphics processing unit, etc.) 601, which can execute various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603.
  • In the RAM 603, various programs and data necessary for the operation of the electronic device 600 are also stored.
  • The processing device 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604.
  • An input/output (I/O) interface 605 is also connected to the bus 604.
  • Generally, the following devices can be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, a touchpad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, etc.; output devices 607 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, etc.; storage devices 608 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 609.
  • The communication device 609 may allow the electronic device 600 to communicate wirelessly or by wire with other devices to exchange data. While FIG. 6 shows an electronic device 600 having various devices, it should be understood that it is not required to implement or have all of the devices shown; more or fewer devices may alternatively be implemented or provided.
  • According to an embodiment of the present disclosure, embodiments of the present disclosure include a computer program product, which includes a computer program carried on a non-transitory computer-readable medium, where the computer program contains program code for executing the method shown in the flowchart.
  • The computer program may be downloaded and installed from a network via the communication device 609, or installed from the storage device 608, or installed from the ROM 602.
  • When the computer program is executed by the processing device 601, the above-mentioned functions defined in the image processing method of the embodiments of the present disclosure are executed, or the above-mentioned functions defined in the model training method of the embodiments of the present disclosure are executed.
  • The electronic device provided by this embodiment belongs to the same disclosed concept as the image processing method and the model training method provided by the above embodiments; technical details not described in detail in this embodiment can be found in the above embodiments, and this embodiment has the same beneficial effects as the above embodiments.
  • An embodiment of the present disclosure provides a computer storage medium on which a computer program is stored.
  • When the program is executed by a processor, the image processing method provided in the above embodiments is implemented, or the model training method provided in the above embodiments is implemented.
  • The above-mentioned computer-readable medium in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two.
  • A computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof.
  • Computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM) or flash memory, an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device.
  • A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing.
  • A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transport a program for use by or in combination with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer-readable medium may be transmitted by any appropriate medium, including but not limited to wires, optical cables, RF (radio frequency), etc., or any suitable combination of the above.
  • In some embodiments, the client and the server can communicate using any currently known or future-developed network protocol such as HTTP (HyperText Transfer Protocol), and can be interconnected with digital data communication in any form or medium (e.g., a communication network).
  • Examples of communication networks include local area networks (LANs), wide area networks (WANs), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any network currently known or developed in the future.
  • The above-mentioned computer-readable medium may be included in the above-mentioned electronic device, or may exist independently without being assembled into the electronic device.
  • The above-mentioned computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device is caused to:
  • in response to the special effect trigger instruction, input the image to be processed into the image processing model; and output, through the image processing model, the target image that contains the special effect object and from which the conflicting object corresponding to the special effect object is removed; wherein the image processing model is trained on images in which the target object is superimposed and the conflicting object is removed; the target object includes an object that has the same rendering effect as the special effect object and is adjustable; and the image from which the conflicting object is removed is generated by a generator trained with a GAN.
  • Alternatively, the above-mentioned computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device is caused to:
  • input the original image into the first generator, which generates the first image from which the conflicting object corresponding to the special effect object is removed; input the first image into the second generator, which generates the second image containing the special effect object; generate a target object based on the special effect object in the second image and superimpose the target object on the first image to obtain a third image, wherein the target object includes an object that has the same rendering effect as the special effect object and is adjustable; and train the image processing model according to the original image and the third image, wherein the first generator and the second generator are trained with generative adversarial networks.
  • Computer program code for carrying out the operations of the present disclosure may be written in one or more programming languages or combinations thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • In the case involving a remote computer, the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • Each block in the flowcharts or block diagrams may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • Each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or can be implemented by a combination of dedicated hardware and computer instructions.
  • The units involved in the embodiments described in the present disclosure may be implemented by software or by hardware. The names of the units and modules do not, in some cases, constitute a limitation on the units and modules themselves; for example, the input module may also be described as an "image input module".
  • The functions described above herein may be executed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that can be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
  • In the context of the present disclosure, a machine-readable medium may be a tangible medium that may contain or store a program for use by or in combination with an instruction execution system, apparatus, or device.
  • The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing.
  • Machine-readable storage media would include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • Example 1 provides an image processing method, including: in response to a special effect trigger instruction, inputting an image to be processed into an image processing model; and outputting, through the image processing model, a target image that contains a special effect object and from which a conflicting object corresponding to the special effect object is removed;
  • wherein the image processing model is trained on images in which the target object is superimposed and the conflicting object is removed; the target object includes an object that has the same rendering effect as the special effect object and is adjustable; and the image from which the conflicting object is removed is generated by a generator trained with a generative adversarial network.
  • Example 2 provides an image processing method, further comprising:
  • the image processing model is trained based on the following steps: inputting the original image into the first generator, which generates the first image from which the conflicting object is removed; inputting the first image into the second generator, which generates the second image containing the special effect object; generating the target object based on the special effect object in the second image and superimposing it on the first image to obtain the third image;
  • and training the image processing model according to the original image and the third image.
  • Example 3 provides an image processing method, further comprising:
  • the first generator is part of a first generative adversarial network during training, and is trained with the first generative adversarial network based on random first sample images and second sample images that do not contain the conflicting object;
  • the second generator is part of a second generative adversarial network during training, and is trained with the second generative adversarial network based on the first sample images and third sample images containing the special effect object.
  • Example 4 provides an image processing method, further comprising:
  • the target object includes a special effect mask; and generating the target object based on the special effect object in the second image includes: acquiring key points of the special effect object in the second image, and generating the special effect mask according to the key points.
  • Example 5 provides an image processing method, further comprising:
  • after the special effect mask is generated according to the key points, the method further includes: adjusting the special effect mask based on an adjustment instruction;
  • and the superimposing of the target object on the first image includes: superimposing the adjusted special effect mask on the first image.
  • Example 6 provides an image processing method, further comprising:
  • the image processing method is applied to a face image beautification application; the special effect object includes a first face tissue object generated based on the image processing model, and the conflicting object includes a second face tissue object contained in the image to be processed.
  • Example 7 provides a model training method, including: inputting an original image into a first generator and generating, through the first generator, a first image from which a conflicting object corresponding to a special effect object is removed; inputting the first image into a second generator and generating, through the second generator, a second image containing the special effect object;
  • generating a target object based on the special effect object in the second image, and superimposing the target object on the first image to obtain a third image; wherein the target object has the same presentation effect as the special effect object and is adjustable; and training an image processing model according to the original image and the third image;
  • wherein the first generator and the second generator are trained with a generative adversarial network.
  • the image processing model is trained based on the generator in the generative confrontation network and the image generated by the adjustable target object, based on the target image output by the image processor, the processing result is more authentic and the beautification effect is better.

Abstract

Embodiments of the present disclosure disclose an image processing method, a model training method, an apparatus, an electronic device and a medium. The image processing method includes: in response to a special-effect trigger instruction, inputting an image to be processed into an image processing model; and outputting, through the image processing model, a target image that contains a special effect object and from which a conflicting object corresponding to the special effect object has been removed. The image processing model is trained on images from which the conflicting object has been removed and on which a target object is superimposed; the target object includes an object that has the same presentation effect as the special effect object and is adjustable; and the image from which the conflicting object has been removed is generated by a generator trained with a generative adversarial network.

Description

Image processing method, model training method, apparatus, electronic device and medium
This application claims priority to Chinese Patent Application No. 202110737811.X, filed with the Chinese Patent Office on June 30, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
Embodiments of the present disclosure relate to the field of image processing, for example, to an image processing method, a model training method, an apparatus, an electronic device and a medium.
Background
With the development of technology, more and more application software (Applications, Apps) has entered users' lives. At present, some Apps support special-effect features for beautifying face images, which are very popular with users.
In the related art, when an App beautifies a face image, it usually extracts the corresponding special-effect data from a special-effect library according to the selected special effect, and applies the data to the corresponding position of the face image.
The shortcomings of the related art at least include the following: when the special-effect data is applied to the face image in this way, the resulting special-effect image has poor authenticity, and the beautification effect cannot be guaranteed.
Summary
Embodiments of the present disclosure provide an image processing method, a model training method, an apparatus, an electronic device and a medium, which can obtain a special-effect image with better authenticity and a good beautification effect.
In a first aspect, an embodiment of the present disclosure provides an image processing method, including:
in response to a special-effect trigger instruction, inputting an image to be processed into an image processing model;
outputting, through the image processing model, a target image that contains a special effect object and from which a conflicting object corresponding to the special effect object has been removed;
wherein the image processing model is trained on images from which the conflicting object has been removed and on which a target object is superimposed; the target object includes an object that has the same presentation effect as the special effect object and is adjustable; and the image from which the conflicting object has been removed is generated by a generator trained with a generative adversarial network.
In a second aspect, an embodiment of the present disclosure provides a model training method, including:
inputting an original image into a first generator, and generating, through the first generator, a first image from which a conflicting object corresponding to a special effect object has been removed;
inputting the first image into a second generator, and generating, through the second generator, a second image containing the special effect object;
generating a target object based on the special effect object in the second image, and superimposing the target object on the first image to obtain a third image, where the target object includes an object that has the same presentation effect as the special effect object and is adjustable;
training an image processing model according to the original image and the third image;
wherein the first generator and the second generator are trained with generative adversarial networks.
In a third aspect, an embodiment of the present disclosure further provides an image processing apparatus, including:
an input module configured to input an image to be processed into an image processing model in response to a special-effect trigger instruction;
an output module configured to output, through the image processing model, a target image that contains a special effect object and from which a conflicting object corresponding to the special effect object has been removed;
wherein the image processing model is trained on images from which the conflicting object has been removed and on which a target object is superimposed; the target object includes an object that has the same presentation effect as the special effect object and is adjustable; and the image from which the conflicting object has been removed is generated by a generator trained with a generative adversarial network.
In a fourth aspect, an embodiment of the present disclosure further provides a model training apparatus, including:
a first image generation module configured to input an original image into a first generator and generate, through the first generator, a first image from which a conflicting object corresponding to a special effect object has been removed;
a second image generation module configured to input the first image into a second generator and generate, through the second generator, a second image containing the special effect object;
a third image generation module configured to generate a target object based on the special effect object in the second image and superimpose the target object on the first image to obtain a third image, where the target object includes an object that has the same presentation effect as the special effect object and is adjustable;
a training module configured to train an image processing model according to the original image and the third image;
wherein the first generator and the second generator are trained with generative adversarial networks.
In a fifth aspect, an embodiment of the present disclosure further provides an electronic device, including:
one or more processors; and
a storage apparatus configured to store one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the image processing method according to any embodiment of the present disclosure, or implement the model training method according to any embodiment of the present disclosure.
In a sixth aspect, an embodiment of the present disclosure further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, are configured to perform the image processing method according to any embodiment of the present disclosure, or to implement the model training method according to any embodiment of the present disclosure.
Brief Description of the Drawings
Throughout the drawings, the same or similar reference numerals denote the same or similar elements. It should be understood that the drawings are schematic and that parts and elements are not necessarily drawn to scale.
FIG. 1 is a schematic flowchart of an image processing method provided by an embodiment of the present disclosure;
FIG. 2 is a schematic flowchart of an image processing method provided by another embodiment of the present disclosure;
FIG. 3 is a schematic flowchart of a model training method provided by an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of a model training apparatus provided by an embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the drawings. Although certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be construed as being limited to the embodiments set forth here; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for exemplary purposes only and are not intended to limit the protection scope of the present disclosure.
It should be understood that the steps described in the method embodiments of the present disclosure may be performed in a different order and/or in parallel. Furthermore, the method embodiments may include additional steps and/or omit performing the steps shown. The scope of the present disclosure is not limited in this respect.
As used herein, the term "include" and its variants are open-ended, i.e., "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions of other terms will be given in the description below.
It should be noted that concepts such as "first" and "second" mentioned in the present disclosure are only used to distinguish different apparatuses, modules or units, and are not used to limit the order of, or interdependence between, the functions performed by these apparatuses, modules or units.
It should be noted that the modifiers "a/an" and "multiple" mentioned in the present disclosure are illustrative rather than restrictive, and those skilled in the art should understand that, unless the context clearly indicates otherwise, they should be understood as "one or more".
FIG. 1 is a schematic flowchart of an image processing method provided by an embodiment of the present disclosure. The embodiment of the present disclosure is applicable to image processing scenarios, for example, to beautifying a face image. The method may be performed by an image processing apparatus, which may be implemented in the form of software and/or hardware; the apparatus may be integrated into application software and installed with the application software in an electronic device, such as a mobile phone or a computer.
As shown in FIG. 1, the image processing method provided by this embodiment includes:
S110. In response to a special-effect trigger instruction, input an image to be processed into an image processing model.
The apparatus performing the image processing method provided by the embodiments of the present disclosure may be integrated into application software that supports image processing functions, and may be installed with the application software in electronic devices such as mobile phones and computers. The application software may be multimedia application software involving images/videos, such as image/video capture software, image/video processing software, multimedia sharing software and multimedia communication software, which are not exhaustively listed here.
When the electronic device runs such application software, a special-effect trigger instruction may be received through a user interface provided by the application software. After receiving the special-effect trigger instruction, the application software may call the image processing apparatus to perform the image processing method. The special-effect trigger instruction may be an instruction for triggering a special effect on an image/video, and may carry a special-effect identifier, where each special-effect identifier uniquely identifies the corresponding special effect. Special effects may include, but are not limited to, effects that add virtual objects and/or remove real objects from the image. For example, when the application software is a face-image beautification application, the special effects may include, but are not limited to, adding an aegyo-sal (lying-silkworm) effect, adding a double-eyelid effect, adding a dimple effect, removing under-eye wrinkles, removing nasolabial folds, and so on.
The image to be processed may be an image captured by the application software, or an image read by the application software from the storage space of the electronic device. When the application software obtains the image to be processed and receives the special-effect trigger instruction, it may call the image processing apparatus with the obtained image as a call parameter, so that the image processing apparatus applies the special effect to the image.
The image processing model may be a pre-trained machine learning model, for example, one pre-trained by the server of the application software. After training, the server may deliver the model to the application software for use in image processing. For each kind of special effect, the server may pre-train a corresponding image processing model; that is, the application software may receive multiple image processing models. In response to the application software receiving an image processing model, the image processing apparatus may record the correspondence between the special-effect identifier of the effect and the model identifier of the image processing model.
For example, when the application software obtains the image to be processed and receives the special-effect trigger instruction, it may also call the image processing apparatus with the special-effect identifier carried in the instruction as a call parameter. The image processing apparatus may first determine, according to the recorded correspondence between special-effect identifiers and model identifiers, the target model identifier corresponding to the received special-effect identifier, and may then select the image processing model corresponding to the target model identifier to apply the special effect to the image to be processed.
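Purely as an illustration of this dispatch, a lookup from special-effect identifiers to model identifiers might look like the following Python sketch; EFFECT_TO_MODEL, load_model and the key names are hypothetical and are not part of this disclosure.

```python
# Hypothetical mapping from special-effect identifiers to model identifiers.
EFFECT_TO_MODEL = {
    "add_aegyo_sal": "aegyo_sal_model_v1",          # adding the aegyo-sal effect
    "add_double_eyelid": "double_eyelid_model_v1",  # adding the double-eyelid effect
}

_MODEL_CACHE = {}

def apply_effect(image, effect_id):
    model_id = EFFECT_TO_MODEL[effect_id]      # determine the target model identifier
    if model_id not in _MODEL_CACHE:
        _MODEL_CACHE[model_id] = load_model(model_id)  # load_model is hypothetical
    return _MODEL_CACHE[model_id](image)       # output the target image
```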
S120. Output, through the image processing model, a target image that contains a special effect object and from which a conflicting object corresponding to the special effect object has been removed.
The special effect object may be an added virtual object; the conflicting object may be a real object in the image to be processed, i.e., an object that degrades the presentation of the special effect object. For example, when the special effect object is a virtual double eyelid, the real double eyelid in the image to be processed will affect the presentation of the effect, so the real double eyelid may be regarded as the conflicting object corresponding to the special effect object.
During training, the image processing model may learn the correspondence between the special effect object and the conflicting object, so that when the special effect is applied with the trained image processing model, the special effect object can be generated in the image to be processed and the conflicting object can be removed, thereby generating the target image after the special effect is applied.
During pre-training, the image processing model may be trained on images from which the conflicting object has been removed and on which the target object is superimposed. For example, the model may be trained with pairs consisting of an original image and the image obtained from the original image by removing the conflicting object and superimposing the target object.
The target object may include an object that has the same presentation effect as the special effect object and is adjustable. For example, the target object may be generated from a special effect object with a good presentation effect. The image containing the special effect object may be generated by a generator trained with a generative adversarial network, and the image from which the conflicting object has been removed may likewise be generated by such a generator. Because the generator and the discriminator learn from each other through adversarial play during the training of a generative adversarial network, images generated by a generator obtained through such training can be more authentic and better in effect.
Since the target object may be a pre-adjusted object that presents the special effect well, and images generated by a generator trained with a generative adversarial network are more authentic, training the image processing model with images generated from the target object and the generator enables the images processed by the trained model to be more authentic and well beautified.
For example, the image processing method may be applied to a face-image beautification application, where the special effect object includes a first facial tissue object generated by the image processing model, and the conflicting object includes a second facial tissue object contained in the image to be processed.
When the application software is a face-image beautification application, the special effect object may be a virtual first facial tissue object generated by the image processing model, and the conflicting object may be a real second facial tissue object contained in the image to be processed. For example, if the special effect object includes aegyo-sal, the conflicting object may include real aegyo-sal, tear troughs and/or eye bags; if the special effect object includes a double eyelid, the conflicting object includes the real double eyelid.
In the related art, when special-effect data is applied to a face image, the resulting effect merely covers a fixed position of the face region, which makes the effect look very fake. Because the covering position is inaccurate, the effect often fails to produce the expected beautification, resulting in a poor user experience.
In the example implementations of the present application, when the special effect is applied by the image processing apparatus, not only can the special effect object be generated, but the conflicting object that affects its presentation can also be removed. Since the image processing model is trained with images generated from the target object and the generator, the special effect objects generated by the trained model are more authentic and better beautified, which can improve the user experience.
It is worth noting that the above embodiments disclose that the image processing method may be performed by the image processing apparatus installed in the application software, and that the training of the image processing model may be performed by the server of the application software. In theory, however, both the image processing method and the training of the image processing model may be performed by the application software, or both by the server. Therefore, the execution subjects disclosed above do not limit the execution subjects of the image processing method and of the training process of the image processing model.
According to the image processing method provided by the embodiments of the present disclosure, in response to a special-effect trigger instruction, an image to be processed is input into an image processing model; through the image processing model, a target image that contains a special effect object and from which a conflicting object corresponding to the special effect object has been removed is output; the image processing model is trained on images from which the conflicting object has been removed and on which a target object is superimposed; the target object includes an object that has the same presentation effect as the special effect object and is adjustable; and the image from which the conflicting object has been removed is generated by a generator trained with a generative adversarial network.
By superimposing the adjustable target object, which presents the same effect as the special effect object, on the conflict-removed image generated by the generator trained with the generative adversarial network, images for training the image processing model can be obtained. Since the target object may be a pre-adjusted object that presents the special effect well, and images generated by such a generator are more authentic, training the image processing model with images generated from the special effect mask and the generator enables the images processed by the trained model to be more authentic and well beautified.
The embodiments of the present disclosure may be combined with the example schemes of the image processing method provided in the above embodiments. The image processing method provided by this embodiment describes the training process of the image processing model in detail. By first removing the conflicting object from the original image and then generating the special effect object on that basis, the generation quality of the special effect object can be improved to a certain extent, so that the generated target object (for example, a special effect mask) presents a better effect. Furthermore, by adjusting the target object such as the special effect mask to its best presentation and superimposing the adjusted target object on the conflict-removed image, high-quality images for training the image processing model can be generated. Compared with repeatedly training the generator to obtain training images with better effects, generating the training images from an adjustable target object can shorten the generation time of the training images while keeping their quality, thereby improving the training efficiency of the image processing model.
FIG. 2 is a schematic flowchart of an image processing method provided by an embodiment of the present disclosure. As shown in FIG. 2, the image processing method provided by this embodiment includes:
S210. Input an original image into a first generator, and generate, through the first generator, a first image from which the conflicting object has been removed.
In this embodiment, the original image may be a random sample image, which may be obtained by capture, generated by virtual rendering, or generated by a network, which are not exhaustively listed here. For example, assuming the image processing method is applied to a face-image beautification application, the original image may be a random face image under different angles/lighting.
During training, the first generator may be contained in a first generative adversarial network, and may be trained with the first generative adversarial network based on random first sample images and second sample images that do not contain the conflicting object.
The first sample image is a random sample image and may come from the same sample set as the original image. The second sample image may be a sample image that does not contain the conflicting object. The first and second sample images may likewise be obtained by capture, generated by virtual rendering, or generated by a network. For example, assume the original image is a random face image under different angles/lighting, the special effect object is aegyo-sal, and the conflicting object is eye bags; then the first sample image may also be a random face image under different angles/lighting, and the second sample image may be a face image without eye bags under different angles/lighting.
The process of training the first generator with the first generative adversarial network may include: first, inputting the first sample image into the first generator so that the first generator generates a first output image that does not contain the conflicting object; then, inputting the second sample image and the first output image into the first discriminator of the first generative adversarial network so that the first discriminator judges whether the two kinds of images are real or fake, where the standard judgment result is that the second sample image is real and the first output image is fake; finally, training the first discriminator with the goal that it can accurately judge the two kinds of images as real or fake, while training the first generator with the goal that its output images are difficult for the first discriminator to judge accurately. Through this adversarial learning between the first generator and the first discriminator, the first generator acquires a good image processing effect of removing the conflicting object.
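The adversarial game described above might be sketched as follows in PyTorch; G1, D1 and the optimizers are assumed to be defined elsewhere, and the binary cross-entropy loss over a single logit per image is one common choice rather than the procedure fixed by this disclosure.

```python
import torch
import torch.nn.functional as F

# One training step of the first generative adversarial network: G1 learns to
# remove the conflicting object; D1 learns to tell real conflict-free images
# (second sample images) from G1's outputs (first output images).
def first_gan_step(G1, D1, opt_g, opt_d, first_sample, second_sample):
    # Discriminator step: the second sample image is labeled real,
    # the first output image is labeled fake.
    real_logits = D1(second_sample)
    fake_logits = D1(G1(first_sample).detach())
    d_loss = (
        F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
        + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits))
    )
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step: G1 tries to make its conflict-removed output pass as real.
    gen_logits = D1(G1(first_sample))
    g_loss = F.binary_cross_entropy_with_logits(gen_logits, torch.ones_like(gen_logits))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```

The training of the second generative adversarial network described later follows the same pattern, with the third sample images playing the role of the real class.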
For different application scenarios, the first and second sample images may also be preprocessed before being used for training. The preprocessing may include, but is not limited to, cropping, rotation and the like. For example, assume the special effect object is aegyo-sal and the conflicting object is eye bags; after the first and second sample images are obtained, the eye region may be determined according to the facial key points in the images and cropped out. Training the generative adversarial network with the cropped images focuses the training on the important eye region and ignores other regions, which helps improve training effect and efficiency.
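As one illustration of such cropping, assuming facial key points are already available as an (N, 2) array of pixel coordinates, the eye region might be cut out as follows; the margin value is an assumption made for the example.

```python
import numpy as np

# Crop the eye region from a face image given key points of the eye area.
def crop_eye_region(image, eye_points, margin=16):
    x0, y0 = eye_points.min(axis=0).astype(int) - margin  # top-left of the box
    x1, y1 = eye_points.max(axis=0).astype(int) + margin  # bottom-right of the box
    h, w = image.shape[:2]
    x0, y0 = max(x0, 0), max(y0, 0)                       # clamp to image bounds
    x1, y1 = min(x1, w), min(y1, h)
    return image[y0:y1, x0:x1]
```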
By inputting the original image into the trained first generator, the conflicting object in the original image can be removed by the first generator, generating the first image from which the conflicting object has been removed.
S220. Input the first image into a second generator, and generate, through the second generator, a second image containing the special effect object.
In this embodiment, the second generator may be contained in a second generative adversarial network during training, and may be trained with the second generative adversarial network based on the first sample images and third sample images containing the special effect object.
The third sample image may be a sample image containing the special effect object, and may likewise be obtained by capture, generated by virtual rendering, or generated by a network. For example, assuming the special effect object is aegyo-sal and the conflicting object is eye bags, the third sample image may be a face image containing aegyo-sal under different angles/lighting.
The process of training the second generator with the second generative adversarial network may include: first, inputting the first sample image into the second generator so that the second generator generates a second output image containing the special effect object; then, inputting the third sample image and the second output image into the second discriminator of the second generative adversarial network so that the second discriminator judges whether the two kinds of images are real or fake, where the standard judgment result is that the third sample image is real and the second output image is fake; finally, training the second discriminator with the goal that it can accurately judge the two kinds of images as real or fake, while training the second generator with the goal that its output images are difficult for the second discriminator to judge accurately. Through this adversarial learning between the second generator and the second discriminator, the second generator acquires a good image processing effect of generating the special effect object.
Where the first and second sample images have been preprocessed, the third sample image may be processed in the same way, which can improve the training effect and efficiency of the second generative adversarial network.
By first removing the conflicting object from the original image to generate the first image, and then inputting the first image into the trained second generator, the second image containing the special effect object can be generated. Removing the conflicting object first and then generating the special effect object on that basis can improve the generation quality of the special effect object to a certain extent.
S230. Generate a target object based on the special effect object in the second image, and superimpose the target object on the first image to obtain a third image.
In this embodiment, an adjustable object that presents the same effect as the special effect object may be generated by extracting the special effect object from the second image. By superimposing the target object on the first image, a third image that both removes the conflicting object and contains the special effect object can be obtained.
For example, the target object includes a special effect mask; generating the target object based on the special effect object in the second image may include: acquiring key points of the special effect object in the second image, and generating the special effect mask according to the key points.
In these example implementations, the special effect mask (which may be called a mask) may be an overlay layer presenting the same effect as the special effect object. The key points of the special effect object in the second image may be extracted based on an Active Shape Model (ASM) algorithm, an Active Appearance Model (AAM) algorithm, Cascaded Pose Regression (CPR), a deep learning method, or the like. The region connected by the key points may then determine attributes of the mask such as its shape (e.g., a rectangle, a triangle or an irregular polygon) and pixel grayscale, so as to generate a special effect mask presenting the same effect as the special effect object. For example, after the special effect mask is generated according to the key points, the method may further include adjusting the special effect mask based on an adjustment instruction; for example, superimposing the target object on the first image may include superimposing the adjusted special effect mask on the first image.
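A minimal sketch of one plausible mask construction from the extracted key points follows; the convex-hull fill and Gaussian feathering are assumptions, since the text leaves the exact mask shape open.

```python
import cv2
import numpy as np

# Build a soft special-effect mask from key points of the effect region.
def make_effect_mask(image_shape, keypoints, feather=5):
    mask = np.zeros(image_shape[:2], dtype=np.uint8)
    hull = cv2.convexHull(keypoints.astype(np.int32))  # region enclosed by key points
    cv2.fillConvexPoly(mask, hull, 255)                # fill the mask region
    k = 2 * feather + 1                                # Gaussian kernel size must be odd
    mask = cv2.GaussianBlur(mask, (k, k), 0)           # feather the edge
    return mask.astype(np.float32) / 255.0             # soft mask in [0, 1]
```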
For example, adjusting the special effect mask may mean adjusting attributes of the mask such as its shape, size and strength, where the strength attribute of the mask may be understood as its transparency. By further adjusting the attributes of the special effect mask based on adjustment instructions, the mask can be tuned to its best presentation. Generating the third image used for model training from the best-presenting special effect mask makes the images processed by the trained image processing model better beautified. Likewise, target objects other than special effect masks may also be adjusted before being superimposed on the first image to obtain the third image, so that the target object presents its best effect.
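The adjustment and superimposition might be sketched as follows, with strength realizing the adjustable opacity described above; effect_layer stands for the special-effect pixels taken from the second image and is an illustrative name, not a term of this disclosure.

```python
import numpy as np

# Alpha-blend the masked special-effect pixels onto the first image.
def overlay_mask(first_image, effect_layer, mask, strength=1.0):
    alpha = np.clip(mask * strength, 0.0, 1.0)[..., None]  # per-pixel opacity
    blended = (alpha * effect_layer.astype(np.float32)
               + (1.0 - alpha) * first_image.astype(np.float32))
    return blended.astype(first_image.dtype)
```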
S240. Train the image processing model according to the original image and the third image.
Training the image processing model according to the original image and the third image may include: inputting the original image into the image processing model so that the model generates a third output image, and training the image processing model with the goal that the deviation between the third output image and the third image is smaller than a preset deviation, so that the image processing model can output images presenting the same effect as the images generated from the generator and the special effect mask.
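Assuming a PyTorch model and optimizer, one step of this training might be sketched as follows; the L1 loss is one common measure of the deviation, which the text itself does not fix.

```python
import torch.nn.functional as F

# Pull the model's output for the original image toward the third image.
def model_train_step(model, optimizer, original, third_image):
    output = model(original)                 # the third output image
    loss = F.l1_loss(output, third_image)    # deviation from the third image
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```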
In addition, the second image generated by the generator can also be used for training the image processing model. However, compared with the third image generated from the generator and the target object, the presentation of the target object can be adjusted flexibly, so there is no need to repeatedly train the generator to obtain training images with a good effect, which reduces training time and improves model training efficiency.
For example, denote the original image as A, the first generator as G1 and the second generator as G2. The process of training the image model can then be summarized as follows: first, based on the first generator G1, the first image G1(A) with the conflicting object removed is generated from the original image A; second, based on the second generator G2, the second image G2(G1(A)) containing the special effect object is generated from the first image G1(A); third, according to the key points of the second image G2(G1(A)), the special effect mask of the special effect region is made; next, by adjusting the mask, the special effect object can be tuned to its best presentation; then, the adjusted mask can be superimposed on the first image G1(A) to generate the target image mask(G1(A)); finally, the image processing model can be trained with the target image mask(G1(A)) and the original image A.
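Putting the pieces together, the construction of one training pair might be sketched as follows; G1 and G2 are treated as plain callables on images, and extract_keypoints, make_effect_mask and overlay_mask are the illustrative helpers sketched earlier, so all names here are assumptions rather than part of the disclosure.

```python
# Build one (input, supervision) pair: A -> G1(A) -> G2(G1(A)) -> mask -> mask(G1(A)).
def build_training_pair(A, G1, G2, extract_keypoints, strength=1.0):
    first = G1(A)                                  # conflicting object removed
    second = G2(first)                             # special effect object generated
    kps = extract_keypoints(second)                # key points of the effect region
    mask = make_effect_mask(second.shape, kps)     # adjustable special effect mask
    target = overlay_mask(first, second, mask, strength)
    return A, target                               # original image and target image
```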
S250. In response to a special-effect trigger instruction, input an image to be processed into the image processing model.
S260. Output, through the image processing model, a target image that contains the special effect object and from which the conflicting object corresponding to the special effect object has been removed.
The image processing model is trained on images from which the conflicting object has been removed and on which the target object is superimposed; the target object includes an object that has the same presentation effect as the special effect object and is adjustable; and the image from which the conflicting object has been removed is generated by a generator trained with a generative adversarial network.
For example, assuming the image processing method is applied to a face-image beautification application, after the image processing model is trained, it may be installed in the face-image beautification application for online processing of face images. The trained first and second generators could also be applied to the application; however, compared with beautifying images with the generators, the image processing model is more compact, and applying it in online application software can save the computing resources consumed by the software and speed up processing.
The technical solution of the embodiments of the present disclosure describes the training process of the image processing model in detail. By first removing the conflicting object from the original image and then generating the special effect object on that basis, the generation quality of the special effect object can be improved to a certain extent, so that the generated target object (for example, a special effect mask) presents a better effect. Furthermore, by adjusting the target object such as the special effect mask to its best presentation and superimposing the adjusted target object on the conflict-removed image, high-quality images for training the image processing model can be generated. Compared with repeatedly training the generator to obtain training images with better effects, generating the training images from an adjustable target object can shorten the generation time of the training images while keeping their quality, thereby improving the training efficiency of the image processing model.
In addition, the image processing method provided by this embodiment of the present disclosure belongs to the same disclosed concept as the image processing method provided by the above embodiments; technical details not described in detail in this embodiment can be found in the above embodiments, and the same technical features have the same beneficial effects in this embodiment as in the above embodiments.
FIG. 3 is a schematic flowchart of a model training method provided by an embodiment of the present disclosure. The embodiment of the present disclosure is applicable to training an image processing model, for example, to training an image processing model for beautifying face images. The method may be performed by a model training apparatus, which may be implemented in the form of software and/or hardware and may be configured in an electronic device, for example, in a server.
As shown in FIG. 3, the model training method provided by this embodiment includes:
S310. Input an original image into a first generator, and generate, through the first generator, a first image from which a conflicting object corresponding to a special effect object has been removed.
S320. Input the first image into a second generator, and generate, through the second generator, a second image containing the special effect object.
The first generator and the second generator are trained with generative adversarial networks.
During training, the first generator is contained in a first generative adversarial network and is trained with the first generative adversarial network based on random first sample images and second sample images that do not contain the conflicting object; the second generator is contained in a second generative adversarial network during training and is trained with the second generative adversarial network based on the first sample images and third sample images containing the special effect object.
S330. Generate a target object based on the special effect object in the second image, and superimpose the target object on the first image to obtain a third image, where the target object includes an object that has the same presentation effect as the special effect object and is adjustable.
The target object may include a special effect mask; generating the target object based on the special effect object in the second image may include: acquiring key points of the special effect object in the second image, and generating the special effect mask according to the key points. After the special effect mask is generated according to the key points, the method may further include adjusting the special effect mask based on an adjustment instruction; for example, superimposing the target object on the first image includes superimposing the adjusted special effect mask on the first image.
S340. Train the image processing model according to the original image and the third image.
Since the target object may be a pre-adjusted object that presents the special effect well, and images generated by a generator trained with a generative adversarial network are more authentic, training the image processing model with images generated from the target object and the generator enables the images processed by the trained model to be more authentic and well beautified.
In addition, after the image processing model is trained, it may be applied to perform the image processing method disclosed in the above embodiments, so as to obtain a target image from which the conflicting object has been removed and which contains the special effect object.
The trained image processing model may be applied to a face-image beautification application, where the special effect object includes a first facial tissue object generated by the image processing model, and the conflicting object includes a second facial tissue object contained in the image to be processed. For example, if the special effect object includes aegyo-sal, the conflicting object may include real aegyo-sal, tear troughs and/or eye bags; if the special effect object includes a double eyelid, the conflicting object includes the real double eyelid.
According to the model training method of the embodiments of the present disclosure, an original image is input into a first generator, and a first image from which a conflicting object corresponding to a special effect object has been removed is generated through the first generator; the first image is input into a second generator, and a second image containing the special effect object is generated through the second generator; a target object is generated based on the special effect object in the second image and superimposed on the first image to obtain a third image; and the image processing model is trained according to the original image and the third image, where the first generator and the second generator are trained with generative adversarial networks.
By first removing the conflicting object from the original image and then generating the special effect object on that basis, the generation quality of the special effect object can be improved to a certain extent, so that the generated target object presents a better effect. The target object may be a special effect mask; by adjusting the target object such as the special effect mask to its best presentation and superimposing it on the conflict-removed image, high-quality images for training the image processing model can be generated. Compared with repeatedly training the generator to obtain training images with better effects, generating the training images from an adjustable target object shortens their generation time while keeping their quality, thereby improving the training efficiency of the image processing model.
In addition, the model training method provided by this embodiment of the present disclosure belongs to the same disclosed concept as the image processing method provided by the above embodiments; technical details not described in detail in this embodiment can be found in the above embodiments, and the same technical features have the same beneficial effects in this embodiment as in the above embodiments.
FIG. 4 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present disclosure. The image processing apparatus provided by this embodiment is applicable to image processing scenarios, for example, to beautifying face images.
As shown in FIG. 4, the image processing apparatus provided by this embodiment includes:
an input module 410 configured to input an image to be processed into an image processing model in response to a special-effect trigger instruction;
an output module 420 configured to output, through the image processing model, a target image that contains a special effect object and from which a conflicting object corresponding to the special effect object has been removed;
wherein the image processing model is trained on images from which the conflicting object has been removed and on which a target object is superimposed; the target object includes an object that has the same presentation effect as the special effect object and is adjustable; and the image from which the conflicting object has been removed is generated by a generator trained with a generative adversarial network.
For example, the image processing apparatus may further include:
a pre-training module configured to train the image processing model based on the following steps:
inputting an original image into a first generator, and generating, through the first generator, a first image from which the conflicting object has been removed; inputting the first image into a second generator, and generating, through the second generator, a second image containing the special effect object; generating a target object based on the special effect object in the second image, and superimposing the target object on the first image to obtain a third image; and training the image processing model according to the original image and the third image.
For example, during training the first generator is contained in a first generative adversarial network and is trained with the first generative adversarial network based on random first sample images and second sample images that do not contain the conflicting object; the second generator is contained in a second generative adversarial network during training and is trained with the second generative adversarial network based on the first sample images and third sample images containing the special effect object.
For example, the target object includes a special effect mask; the pre-training module may be configured to acquire key points of the special effect object in the second image and generate the special effect mask according to the key points.
For example, the pre-training module may be further configured to adjust the special effect mask based on an adjustment instruction after the mask is generated according to the key points, and to superimpose the adjusted special effect mask on the first image.
For example, the image processing apparatus may be applied to a face-image beautification application, where the special effect object includes a first facial tissue object generated by the image processing model, and the conflicting object includes a second facial tissue object contained in the image to be processed.
The image processing apparatus provided by the embodiments of the present disclosure can perform the image processing method provided by any embodiment of the present disclosure, and has functional modules and beneficial effects corresponding to the performed method.
It is worth noting that the units and modules included in the above apparatus are only divided according to functional logic, and the division is not limited to the above as long as the corresponding functions can be implemented; in addition, the specific names of the functional units are only for the convenience of distinguishing them from each other and are not used to limit the protection scope of the embodiments of the present disclosure.
FIG. 5 is a schematic structural diagram of a model training apparatus provided by an embodiment of the present disclosure. The model training apparatus provided by this embodiment is applicable to training an image processing model, for example, to training an image processing model for beautifying face images.
As shown in FIG. 5, the model training apparatus provided by this embodiment includes:
a first image generation module 510 configured to input an original image into a first generator and generate, through the first generator, a first image from which a conflicting object corresponding to a special effect object has been removed;
a second image generation module 520 configured to input the first image into a second generator and generate, through the second generator, a second image containing the special effect object;
a third image generation module 530 configured to generate a target object based on the special effect object in the second image and superimpose the target object on the first image to obtain a third image, where the target object includes an object that has the same presentation effect as the special effect object and is adjustable;
a training module 540 configured to train an image processing model according to the original image and the third image;
wherein the first generator and the second generator are trained with generative adversarial networks.
For example, during training the first generator is contained in a first generative adversarial network and is trained with the first generative adversarial network based on random first sample images and second sample images that do not contain the conflicting object; the second generator is contained in a second generative adversarial network during training and is trained with the second generative adversarial network based on the first sample images and third sample images containing the special effect object.
For example, the target object includes a special effect mask; the third image generation module may be configured to acquire key points of the special effect object in the second image and generate the special effect mask according to the key points.
For example, the third image generation module may be further configured to adjust the special effect mask based on an adjustment instruction after the mask is generated according to the key points, and to superimpose the adjusted special effect mask on the first image.
For example, the model training apparatus may further include:
an image processing module configured to, after the image processing model is trained, input an image to be processed into the image processing model in response to a special-effect trigger instruction, and to output, through the image processing model, a target image that contains the special effect object and from which the conflicting object corresponding to the special effect object has been removed.
For example, the image processing module may be applied to a face-image beautification application, where the special effect object includes a first facial tissue object generated by the image processing model, and the conflicting object includes a second facial tissue object contained in the image to be processed.
The model training apparatus provided by the embodiments of the present disclosure can perform the model training method provided by any embodiment of the present disclosure, and has functional modules and beneficial effects corresponding to the performed method.
It is worth noting that the units and modules included in the above apparatus are only divided according to functional logic, and the division is not limited to the above as long as the corresponding functions can be implemented; in addition, the specific names of the functional units are only for the convenience of distinguishing them from each other and are not used to limit the protection scope of the embodiments of the present disclosure.
Reference is now made to FIG. 6, which shows a schematic structural diagram of an electronic device (e.g., the terminal device or server in FIG. 6) 600 suitable for implementing the embodiments of the present disclosure. The terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players) and vehicle-mounted terminals (e.g., vehicle navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The electronic device shown in FIG. 6 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in FIG. 6, the electronic device 600 may include a processing apparatus (e.g., a central processing unit, a graphics processing unit, etc.) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage apparatus 606 into a random access memory (RAM) 603. Various programs and data required for the operation of the electronic device 600 are also stored in the RAM 603. The processing apparatus 601, the ROM 602 and the RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
Generally, the following apparatuses may be connected to the I/O interface 605: an input apparatus 606 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer and a gyroscope; an output apparatus 607 including, for example, a liquid crystal display (LCD), a speaker and a vibrator; a storage apparatus 608 including, for example, a magnetic tape and a hard disk; and a communication apparatus 609. The communication apparatus 609 may allow the electronic device 600 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 6 shows the electronic device 600 with various apparatuses, it should be understood that it is not required to implement or provide all of the apparatuses shown; more or fewer apparatuses may alternatively be implemented or provided.
For example, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, an embodiment of the present disclosure includes a computer program product, which includes a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication apparatus 609, or installed from the storage apparatus 606, or installed from the ROM 602. When the computer program is executed by the processing apparatus 601, the above functions defined in the image processing method of the embodiments of the present disclosure, or the above functions defined in the model training method of the embodiments of the present disclosure, are performed.
The electronic device provided by this embodiment of the present disclosure belongs to the same disclosed concept as the image processing method and the model training method provided by the above embodiments; technical details not described in detail in this embodiment can be found in the above embodiments, and this embodiment has the same beneficial effects as the above embodiments.
An embodiment of the present disclosure provides a computer storage medium on which a computer program is stored; when executed by a processor, the program implements the image processing method provided by the above embodiments, or implements the model training method provided by the above embodiments.
It should be noted that the computer-readable medium described above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM) or flash memory (FLASH), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program, which may be used by or in combination with an instruction execution system, apparatus or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take a variety of forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; the computer-readable signal medium can send, propagate or transmit a program for use by or in combination with an instruction execution system, apparatus or device. The program code contained on the computer-readable medium may be transmitted with any appropriate medium, including but not limited to: a wire, an optical cable, RF (radio frequency), etc., or any suitable combination of the above.
In some embodiments, the client and the server may communicate using any currently known or future-developed network protocol such as HTTP (Hyper Text Transfer Protocol), and may be interconnected with digital data communication (e.g., a communication network) in any form or medium. Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet) and a peer-to-peer network (e.g., an ad hoc peer-to-peer network), as well as any currently known or future-developed network.
The above computer-readable medium may be contained in the above electronic device, or may exist alone without being assembled into the electronic device.
The above computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to:
in response to a special-effect trigger instruction, input an image to be processed into an image processing model; output, through the image processing model, a target image that contains a special effect object and from which a conflicting object corresponding to the special effect object has been removed; wherein the image processing model is trained on images from which the conflicting object has been removed and on which a target object is superimposed; the target object includes an object that has the same presentation effect as the special effect object and is adjustable; and the image from which the conflicting object has been removed is generated by a generator trained with a generative adversarial network.
Alternatively, the above computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to:
input an original image into a first generator, and generate, through the first generator, a first image from which a conflicting object corresponding to a special effect object has been removed; input the first image into a second generator, and generate, through the second generator, a second image containing the special effect object; generate a target object based on the special effect object in the second image, and superimpose the target object on the first image to obtain a third image, wherein the target object includes an object that has the same presentation effect as the special effect object and is adjustable; and train the image processing model according to the original image and the third image, wherein the first generator and the second generator are trained with generative adversarial networks.
Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a standalone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., through the Internet using an Internet service provider).
The flowcharts and block diagrams in the drawings illustrate the possible architectures, functions and operations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a part of code that contains one or more executable instructions for implementing a specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented with a dedicated hardware-based system that performs the specified functions or operations, or with a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or by hardware, where the name of a unit or module does not in some cases constitute a limitation on the unit or module itself; for example, the input module may also be described as an "image input module".
The functions described herein above may be performed at least in part by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that can be used include: field programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), and so on.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that may contain or store a program for use by or in combination with an instruction execution system, apparatus or device. A machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any suitable combination of the above. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
According to one or more embodiments of the present disclosure, [Example 1] provides an image processing method, the method including:
in response to a special-effect trigger instruction, inputting an image to be processed into an image processing model;
outputting, through the image processing model, a target image that contains a special effect object and from which a conflicting object corresponding to the special effect object has been removed;
wherein the image processing model is trained on images from which the conflicting object has been removed and on which a target object is superimposed; the target object includes an object that has the same presentation effect as the special effect object and is adjustable; and the image from which the conflicting object has been removed is generated by a generator trained with a generative adversarial network.
According to one or more embodiments of the present disclosure, [Example 2] provides an image processing method, further including:
for example, the image processing model is trained based on the following steps:
inputting an original image into a first generator, and generating, through the first generator, a first image from which the conflicting object has been removed;
inputting the first image into a second generator, and generating, through the second generator, a second image containing the special effect object;
generating a target object based on the special effect object in the second image, and superimposing the target object on the first image to obtain a third image;
training the image processing model according to the original image and the third image.
According to one or more embodiments of the present disclosure, [Example 3] provides an image processing method, further including:
for example, during training the first generator is contained in a first generative adversarial network, and is trained with the first generative adversarial network based on random first sample images and second sample images that do not contain the conflicting object;
the second generator is contained in a second generative adversarial network during training, and is trained with the second generative adversarial network based on the first sample images and third sample images containing the special effect object.
According to one or more embodiments of the present disclosure, [Example 4] provides an image processing method, further including:
the target object includes a special effect mask; generating the target object based on the special effect object in the second image includes:
acquiring key points of the special effect object in the second image, and generating the special effect mask according to the key points.
According to one or more embodiments of the present disclosure, [Example 5] provides an image processing method, further including:
for example, after the special effect mask is generated according to the key points, the method further includes: adjusting the special effect mask based on an adjustment instruction;
superimposing the target object on the first image includes: superimposing the adjusted special effect mask on the first image.
According to one or more embodiments of the present disclosure, [Example 6] provides an image processing method, further including:
for example, the method is applied to a face-image beautification application;
wherein the special effect object includes a first facial tissue object generated by the image processing model; and the conflicting object includes a second facial tissue object contained in the image to be processed.
According to one or more embodiments of the present disclosure, [Example 7] provides a model training method, including:
inputting an original image into a first generator, and generating, through the first generator, a first image from which a conflicting object corresponding to a special effect object has been removed;
inputting the first image into a second generator, and generating, through the second generator, a second image containing the special effect object;
generating a target object based on the special effect object in the second image, and superimposing the target object on the first image to obtain a third image; wherein the target object includes an object that has the same presentation effect as the special effect object and is adjustable;
training the image processing model according to the original image and the third image;
wherein the first generator and the second generator are trained with generative adversarial networks.
The above description is merely an illustration of the example embodiments of the present disclosure and of the technical principles applied. Those skilled in the art should understand that the scope of disclosure involved in the present disclosure is not limited to technical solutions formed by the specific combinations of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above disclosed concept, for example, technical solutions formed by replacing the above features with technical features with similar functions disclosed in (but not limited to) the present disclosure.
In addition, although operations are depicted in a specific order, this should not be understood as requiring that these operations be performed in the specific order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although several specific implementation details are contained in the above discussion, these should not be construed as limitations on the scope of the present disclosure. Certain features described in the context of separate embodiments can also be implemented in combination in a single embodiment; conversely, various features described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable sub-combination.
Since the image processing model is trained on images generated from the generator of a generative adversarial network and an adjustable target object, the target image output by the image processing model has better authenticity and a better beautification effect.

Claims (11)

  1. An image processing method, comprising:
    in response to a special-effect trigger instruction, inputting an image to be processed into an image processing model;
    outputting, through the image processing model, a target image that contains a special effect object and from which a conflicting object corresponding to the special effect object has been removed;
    wherein the image processing model is trained on images from which the conflicting object has been removed and on which a target object is superimposed; the target object comprises an object that has the same presentation effect as the special effect object and is adjustable; and the image from which the conflicting object has been removed is generated by a generator trained with a generative adversarial network.
  2. The method according to claim 1, wherein the image processing model is trained based on the following steps:
    inputting an original image into a first generator, and generating, through the first generator, a first image from which the conflicting object has been removed;
    inputting the first image into a second generator, and generating, through the second generator, a second image containing the special effect object;
    generating a target object based on the special effect object in the second image, and superimposing the target object on the first image to obtain a third image;
    training the image processing model according to the original image and the third image.
  3. The method according to claim 2, wherein, during training, the first generator is contained in a first generative adversarial network and is trained with the first generative adversarial network based on random first sample images and second sample images that do not contain the conflicting object;
    during training, the second generator is contained in a second generative adversarial network and is trained with the second generative adversarial network based on the first sample images and third sample images containing the special effect object.
  4. The method according to claim 2, wherein the target object comprises a special effect mask; and generating the target object based on the special effect object in the second image comprises:
    acquiring key points of the special effect object in the second image, and generating the special effect mask according to the key points.
  5. The method according to claim 4, further comprising, after the special effect mask is generated according to the key points: adjusting the special effect mask based on an adjustment instruction;
    wherein superimposing the target object on the first image comprises: superimposing the adjusted special effect mask on the first image.
  6. The method according to any one of claims 1-5, applied to a face-image beautification application;
    wherein the special effect object comprises a first facial tissue object generated by the image processing model, and the conflicting object comprises a second facial tissue object contained in the image to be processed.
  7. A model training method, comprising:
    inputting an original image into a first generator, and generating, through the first generator, a first image from which a conflicting object corresponding to a special effect object has been removed;
    inputting the first image into a second generator, and generating, through the second generator, a second image containing the special effect object;
    generating a target object based on the special effect object in the second image, and superimposing the target object on the first image to obtain a third image, wherein the target object comprises an object that has the same presentation effect as the special effect object and is adjustable;
    training an image processing model according to the original image and the third image;
    wherein the first generator and the second generator are trained with generative adversarial networks.
  8. An image processing apparatus, comprising:
    an input module configured to input an image to be processed into an image processing model in response to a special-effect trigger instruction;
    an output module configured to output, through the image processing model, a target image that contains a special effect object and from which a conflicting object corresponding to the special effect object has been removed;
    wherein the image processing model is trained on images from which the conflicting object has been removed and on which a target object is superimposed; the target object comprises an object that has the same presentation effect as the special effect object and is adjustable; and the image from which the conflicting object has been removed is generated by a generator trained with a generative adversarial network.
  9. A model training apparatus, comprising:
    a first image generation module configured to input an original image into a first generator and generate, through the first generator, a first image from which a conflicting object corresponding to a special effect object has been removed;
    a second image generation module configured to input the first image into a second generator and generate, through the second generator, a second image containing the special effect object;
    a third image generation module configured to generate a target object based on the special effect object in the second image and superimpose the target object on the first image to obtain a third image, wherein the target object comprises an object that has the same presentation effect as the special effect object and is adjustable;
    a training module configured to train an image processing model according to the original image and the third image;
    wherein the first generator and the second generator are trained with generative adversarial networks.
  10. An electronic device, comprising:
    one or more processors; and
    a storage apparatus configured to store one or more programs,
    wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the image processing method according to any one of claims 1-6, or implement the model training method according to claim 7.
  11. A storage medium containing computer-executable instructions which, when executed by a computer processor, are configured to perform the image processing method according to any one of claims 1-6, or to implement the model training method according to claim 7.
PCT/CN2022/094586 2021-06-30 2022-05-24 Image processing method, model training method, apparatus, electronic device and medium WO2023273697A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110737811.XA CN113344776B (zh) 2021-06-30 2021-06-30 图像处理方法、模型训练方法、装置、电子设备及介质
CN202110737811.X 2021-06-30

Publications (1)

Publication Number Publication Date
WO2023273697A1 (zh) 2023-01-05

Family

ID=77481891

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/094586 WO2023273697A1 (zh) 2021-06-30 2022-05-24 Image processing method, model training method, apparatus, electronic device and medium

Country Status (2)

Country Link
CN (1) CN113344776B (zh)
WO (1) WO2023273697A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113344776B (zh) * 2021-06-30 2023-06-27 北京字跳网络技术有限公司 图像处理方法、模型训练方法、装置、电子设备及介质
CN113989103B (zh) * 2021-10-25 2024-04-26 北京字节跳动网络技术有限公司 模型训练方法、图像处理方法、装置、电子设备及介质

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107945188A (zh) * 2017-11-20 2018-04-20 北京奇虎科技有限公司 基于场景分割的人物装扮方法及装置、计算设备
CN108898546A (zh) * 2018-06-15 2018-11-27 北京小米移动软件有限公司 人脸图像处理方法、装置及设备、可读存储介质
CN110136054A (zh) * 2019-05-17 2019-08-16 北京字节跳动网络技术有限公司 图像处理方法和装置
CN110913205A (zh) * 2019-11-27 2020-03-24 腾讯科技(深圳)有限公司 视频特效的校验方法及装置
CN111833461A (zh) * 2020-07-10 2020-10-27 北京字节跳动网络技术有限公司 一种图像特效的实现方法、装置、电子设备及存储介质
US20210065454A1 (en) * 2019-08-28 2021-03-04 Snap Inc. Generating 3d data in a messaging system
CN112489169A (zh) * 2020-12-17 2021-03-12 脸萌有限公司 人像图像处理方法及装置
CN113344776A (zh) * 2021-06-30 2021-09-03 北京字跳网络技术有限公司 图像处理方法、模型训练方法、装置、电子设备及介质

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109727320A (zh) * 2018-12-29 2019-05-07 三星电子(中国)研发中心 一种虚拟化身的生成方法和设备
WO2020258668A1 (zh) * 2019-06-26 2020-12-30 平安科技(深圳)有限公司 基于对抗网络模型的人脸图像生成方法及装置、非易失性可读存储介质、计算机设备
CN110288523B (zh) * 2019-07-02 2023-10-13 北京字节跳动网络技术有限公司 图像生成方法和装置
CN112330526B (zh) * 2019-08-05 2024-02-09 深圳Tcl新技术有限公司 一种人脸转换模型的训练方法、存储介质及终端设备
CN111325657A (zh) * 2020-02-18 2020-06-23 北京奇艺世纪科技有限公司 图像处理方法、装置、电子设备和计算机可读存储介质
CN111563855B (zh) * 2020-04-29 2023-08-01 百度在线网络技术(北京)有限公司 图像处理的方法及装置
CN112381717A (zh) * 2020-11-18 2021-02-19 北京字节跳动网络技术有限公司 图像处理方法、模型训练方法、装置、介质及设备


Also Published As

Publication number Publication date
CN113344776B (zh) 2023-06-27
CN113344776A (zh) 2021-09-03


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 22831520; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)