WO2023005358A1 - Style transfer model training method, image style transfer method and apparatus - Google Patents

Style transfer model training method, image style transfer method and apparatus

Info

Publication number
WO2023005358A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
training
style
sample image
training sample
Application number
PCT/CN2022/093144
Other languages
English (en)
French (fr)
Inventor
白须
Original Assignee
北京字跳网络技术有限公司
Application filed by 北京字跳网络技术有限公司
Publication of WO2023005358A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/0475 Generative networks
    • G06N 3/08 Learning methods
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image

Definitions

  • Embodiments of the present disclosure relate to the field of computers, and in particular, to a style transfer model training method, an image style transfer method and an apparatus.
  • Image style transfer refers to an image processing technique that renders an image into a painting with a specific artistic style.
  • In existing techniques, a neural network model is generally used to extract the style features of the style image and mix them with the content of the target image, so that the target image is reconstructed to obtain a target image carrying the style features.
  • the embodiments of the present disclosure provide a style transfer model training method, an image style transfer method and an apparatus.
  • In a first aspect, the present disclosure provides a method for training a style transfer model, including:
  • acquiring an unpaired training sample image set, where the unpaired training sample image set includes at least a preset style training sample image and a first original training sample image;
  • training a style image generation model with the unpaired training sample image set, so as to process a second original training sample image with the trained style image generation model and obtain a style sample image corresponding to the second original training sample image;
  • obtaining a paired training sample image set from the second original training sample image and its corresponding style sample image; and
  • training the style transfer model with the paired training sample image set to obtain the trained style transfer model.
  • In a second aspect, the present disclosure provides an image style transfer method, including:
  • acquiring a target image;
  • inputting the target image into the trained style transfer model for image style transfer processing, where the trained image style transfer model is obtained by the style transfer model training method of any implementation of the first aspect; and
  • obtaining a style image corresponding to the target image.
  • In a third aspect, the present disclosure provides a style transfer model training apparatus, including:
  • an acquisition module configured to acquire an unpaired training sample image set, where the unpaired training sample image set includes at least a preset style training sample image and a first original training sample image;
  • a first training module configured to train the style image generation model with the unpaired training sample image set, so as to process the second original training sample image with the trained style image generation model and obtain the style sample image corresponding to the second original training sample image; and
  • a second training module configured to obtain a paired training sample image set from the second original training sample image and its corresponding style sample image, and to train the style transfer model with the paired training sample image set to obtain the trained style transfer model.
  • In a fourth aspect, the present disclosure provides an image style transfer apparatus, including:
  • an acquisition module configured to acquire a target image;
  • a processing module configured to input the target image into the trained style transfer model to perform image style transfer processing, where the trained image style transfer model is obtained by the style transfer model training method of any implementation of the first aspect;
  • where the acquisition module is further configured to obtain the style image corresponding to the target image.
  • In a fifth aspect, an embodiment of the present disclosure provides an electronic device, including: at least one processor and a memory;
  • the memory stores computer-executable instructions;
  • the at least one processor executes the computer-executable instructions stored in the memory, causing the at least one processor to perform the style transfer model training method of the first aspect and its various possible implementations, or the image style transfer method of the second aspect and its various possible implementations.
  • In a sixth aspect, embodiments of the present disclosure provide a computer-readable storage medium storing computer-executable instructions; when a processor executes the computer-executable instructions, the style transfer model training method of the first aspect and its various possible implementations, or the image style transfer method of the second aspect and its various possible implementations, is realized.
  • In a seventh aspect, embodiments of the present disclosure provide a computer program product including computer instructions; when the computer instructions are executed by a processor, the style transfer model training method of the first aspect and its various possible implementations, or the image style transfer method of the second aspect and its various possible implementations, is realized.
  • In an eighth aspect, an embodiment of the present disclosure provides a computer program which, when executed by a processor, implements the style transfer model training method of the first aspect and its various possible implementations, or the image style transfer method of the second aspect and its various possible implementations.
  • In these methods and apparatuses, the style image generation model is first trained with the unpaired training image set, so that a paired training image set can be obtained from the trained style image generation model; the style transfer model is then trained on the paired training image set to obtain a trained style transfer model, which can perform style transfer processing on a target image to obtain the corresponding style image. Because the training images of the style transfer model are produced by the trained style image generation model, they are sufficient in number and relatively uniform in quality, so the style transfer model trains better; the style images that the trained style transfer model outputs for target images are therefore more robust, and the style effect is better.
  • FIG. 1 is a schematic diagram of a network architecture on which the present disclosure is based;
  • FIG. 2 is a schematic flowchart of a style transfer model training method provided by an embodiment of the present disclosure;
  • FIG. 3 is a schematic diagram of a data flow when training a style image generation model provided by an embodiment of the present disclosure
  • FIG. 4 is a schematic diagram of data flow when training a style transfer model provided by an embodiment of the present disclosure
  • FIG. 5 is a schematic flowchart of an image style transfer method provided by an embodiment of the present disclosure;
  • FIG. 6 is a structural block diagram of a style transfer model training device provided by an embodiment of the present disclosure.
  • FIG. 7 is a structural block diagram of an image style transfer device provided by an embodiment of the present disclosure;
  • FIG. 8 is a schematic diagram of a hardware structure of an electronic device provided by an embodiment of the present disclosure.
  • In recent years, generative adversarial networks (GANs) have been used more and more widely in computer vision research, and image style transfer is a practical application of GANs in the computer vision field.
  • Image style transfer refers to an image processing technique that renders an image into a painting with a specific artistic style.
  • A trained generative adversarial model can extract the style features of a style image and mix them with the content of a target image so as to reconstruct the target image, obtaining a target image that carries the style features.
  • During training, target images and style images with a specific style are usually used to train the generative adversarial model, so that the trained model can process a target image and obtain a corresponding style image in the same style as the style images, completing the style transfer processing of the image.
  • To address this, the inventor conceived of constructing, from an unpaired training sample image set, a style image generation model that generates a paired training sample image set. Because the style sample images in the paired training sample image set are generated by the style image generation model, their image quality is better, which makes the training effect of the style transfer model better.
  • When the style transfer model is trained with the paired training images, the style features in the training images are extracted more accurately, and the resulting model has high robustness.
  • FIG. 1 is a schematic diagram of a network architecture on which the present disclosure is based.
  • the network architecture shown in FIG. 1 may specifically include at least one terminal 1 and a server 2.
  • the terminal 1 may specifically be a hardware device such as a user's mobile phone, a smart home device, a tablet computer, or a wearable device.
  • the server 2 may specifically be a server or a server cluster deployed in the cloud, in which a style transfer model training apparatus for generating the style transfer model is integrated or installed; this apparatus provides the hardware or software for implementing the style transfer model training method of the present disclosure.
  • the server 2 is equipped with a style image generation model and a style transfer model, and the server 2 can obtain a trained style transfer model based on the style transfer model training method provided in the present disclosure.
  • the style transfer model can be trained by constructing a style image generation model to obtain a paired training sample image set.
  • the server can send the trained style transfer model to terminal 1, so that terminal 1 can use the trained style transfer model to process the target image to obtain the style image of the target image.
  • the architecture shown in FIG. 1 can be applied to various image processing application (APP) scenarios.
  • the above-mentioned server 2 may be a server on which the application (APP) runs, providing the terminal 1 with the corresponding basic application functions through interaction with the terminal 1.
  • the style transfer model training method and the image style transfer method provided in the present disclosure can be applied to scenarios such as image stylization processing, image special effect editing, and the like.
  • FIG. 2 is a schematic flowchart of a method for training a style transfer model provided by an embodiment of the present disclosure.
  • the style transfer model training method provided by the embodiment of the present disclosure includes:
  • Step 101: Acquire an unpaired training sample image set, where the unpaired training sample image set includes at least a preset style training sample image and a first original training sample image.
  • Step 102: Train the style image generation model with the unpaired training sample image set, so as to process the second original training sample image with the trained style image generation model and obtain the corresponding style sample image.
  • Step 103: Obtain a paired training sample image set from the second original training sample image and the style sample image corresponding to the second original training sample image.
  • Step 104: Train the style transfer model with the paired training sample image set to obtain the trained style transfer model. (A sketch of how these four steps fit together follows below.)
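  • As an illustration of how Steps 101-104 fit together, the following is a minimal two-stage training sketch in Python with PyTorch. The names (style_image_gen, style_transfer_model, unpaired_loader, training_step and so on) are hypothetical stand-ins for whatever GAN implementations are used; the patent does not fix concrete architectures.

```python
import torch

def build_paired_set(generator, original_images):
    """Step 103: run the trained stage-1 generator over original images to
    produce (original, style) pairs -- the paired training sample image set."""
    generator.eval()
    pairs = []
    with torch.no_grad():
        for img in original_images:
            style_img = generator(img.unsqueeze(0)).squeeze(0)
            pairs.append((img, style_img))
    return pairs

def train_two_stage(style_image_gen, style_transfer_model,
                    unpaired_loader, second_originals, epochs=10):
    # Steps 101/102: train the style image generation model on UNPAIRED data.
    for _ in range(epochs):
        for originals, style_refs in unpaired_loader:
            style_image_gen.training_step(originals, style_refs)  # hypothetical API

    # Step 103: synthesize the paired training sample image set.
    paired_set = build_paired_set(style_image_gen.generator, second_originals)

    # Step 104: train the style transfer model on the paired set.
    for _ in range(epochs):
        for original, style_target in paired_set:
            style_transfer_model.training_step(original, style_target)  # hypothetical API
    return style_transfer_model
```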
  • The style transfer model training method provided in this embodiment is executed by the aforementioned style transfer model training device. In some implementations the device may be a server; in other implementations it may be a terminal, so that the terminal can process the target image based on the trained style transfer model.
  • two neural network models will be constructed in the embodiments provided in the present disclosure to cope with different processing requirements in the image style transfer process.
  • the unpaired training image set is first used to train the style image generation model, so that the trained style image generation model can be used to obtain the paired training image set.
  • the style image generation model may specifically include a generative adversarial network, enabling it to produce the paired training set.
  • the unpaired training image set includes at least a preset style training sample image and a first original training sample image.
  • the preset style training sample images refer to images with a unified target style, for example, images all having one specific artistic style.
  • the first original training sample image refers to a real image, that is, an image that has not undergone style transfer processing.
  • The raw data of the first original training sample image may be preprocessed in advance to improve the image quality of the first original training sample image. That is, the raw data of the first original training sample image is acquired and data preprocessing is performed on it to obtain the first original training sample image; the first original training sample image and the pre-acquired preset style training sample image then constitute the unpaired training image set.
  • An image includes a target area, for example a face area, and the size and proportion of the face area in the image affect the model's feature extraction and other processing.
  • Specifically, target detection may first be performed on the raw data to determine the target area in the first original training sample image; then, based on the target area, cropping and alignment processing may be performed to obtain the first original training sample image in the unpaired training image set.
  • The target detection may specifically be face detection, and the target area may specifically be a face area.
  • A face detection frame can be obtained to mark the face area where the face in the first original training sample image is located. The raw data can then be cropped and aligned according to the face area marked by the face detection frame, so that the aspect ratio of the resulting image is greater than 1:1, for example 1.3:1; at this aspect ratio, the model achieves a better training effect.
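  • As a concrete sketch of this crop-and-align preprocessing, the snippet below uses OpenCV's bundled Haar cascade face detector; the 1.3:1 ratio comes from the example above, while the margin factor, the cascade choice, and treating the ratio as height:width are assumptions made for illustration.

```python
import cv2

def crop_face_region(image_bgr, target_ratio=1.3, margin=0.4):
    """Detect a face and crop an (approximately) height:width = target_ratio
    region around it, clipped to the image bounds."""
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None  # no face found: skip this sample
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # largest detection box
    # Expand the detection box by a margin, keeping the target aspect ratio.
    cw = int(w * (1 + margin))
    ch = int(cw * target_ratio)
    cx, cy = x + w // 2, y + h // 2
    x0 = max(cx - cw // 2, 0)
    y0 = max(cy - ch // 2, 0)
    H, W = image_bgr.shape[:2]
    return image_bgr[y0:min(y0 + ch, H), x0:min(x0 + cw, W)]
```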
  • an unpaired training image set including the first original training sample image and a preset style training sample image can be constructed, and the pre-established style image generation model can be trained using the unpaired training image set.
  • The training of the style image generation model is generally a multi-round process. That is, after the first original training sample images and the preset style training sample images in the unpaired training image set have been input into the style image generation model and the model trained this round is obtained, the images in the same unpaired training image set can, according to actual needs, be input again into the previously trained model to retrain the style image generation model, repeating this process a preset number of times. Based on actual requirements, when the trained style image generation model can output a style image corresponding to the first original training sample image, training can be stopped and the trained style image generation model is obtained.
  • The style image generation model includes a first generative adversarial network, and the first generative adversarial network includes a first generator and a first discriminator.
  • Training the style image generation model with the unpaired training image set then includes: selecting any training sample image from the first original training sample images in the unpaired training image set as the first generated training image; inputting the first generated training image into the first generator to obtain a first intermediate image; inputting the first intermediate image, together with the preset style training sample image in the unpaired training image set corresponding to the first generated training image, into the first discriminator to obtain a discrimination result; and adjusting the parameters of the first generator according to the discrimination result, selecting the next training sample image from the first original training sample images in the unpaired training image set as the first generated training image, and returning to the step of inputting the first generated training image into the first generator until the first generator meets the preset condition.
  • FIG. 3 is a schematic diagram of a data flow when training a style image generation model provided by an embodiment of the present disclosure.
  • the style image generation model includes a first generator and a first discriminator.
  • The purpose of the first generator is to process, based on its weight parameters, the first original training sample image input to it and output a first intermediate image; the first intermediate image and the preset style image are then input simultaneously into the first discriminator, which discriminates between the two images. The purpose of training is thus to make the first discriminator's result for the first intermediate image consistent with its result for the preset style image.
  • In this way, the first generator can be made to generate first intermediate images that the first discriminator judges to have the style features of the preset style image.
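  • For concreteness, a minimal sketch of one such adversarial update in PyTorch follows. The non-saturating BCE losses and the alternating generator/discriminator updates are common GAN defaults assumed here; the patent itself only requires that the first generator's parameters be adjusted according to the first discriminator's result.

```python
import torch
import torch.nn.functional as F

def gan_step(G, D, opt_G, opt_D, original, style_ref):
    """One adversarial update for the unpaired stage: D learns to tell
    generated images from preset-style references; G learns to fool D."""
    fake = G(original)  # the first intermediate image

    # --- discriminator update ---
    opt_D.zero_grad()
    d_real = D(style_ref)
    d_fake = D(fake.detach())  # detach so only D's weights move here
    loss_D = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    loss_D.backward()
    opt_D.step()

    # --- generator update (parameter adjustment per the discrimination result) ---
    opt_G.zero_grad()
    d_fake = D(fake)
    loss_G = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    loss_G.backward()
    opt_G.step()
    return loss_G.item(), loss_D.item()
```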
  • The trained style image generation model can thus produce, for a second original training sample image, a style sample image that carries the style features of the preset style image; the second original training sample image and its corresponding style sample image can then serve as the paired training image set for subsequent model training.
  • the style transfer model training device can obtain a paired training image set from the second original training sample image and the style sample image corresponding to the second original training sample image, and use the paired training image set to train the style transfer model to obtain the trained style transfer model.
  • the above-mentioned second original training sample image may specifically refer to a real image, that is, an image that has not undergone style transfer processing.
  • The style transfer model is trained with paired training images; that is, each pair in the paired training image set includes a second original training sample image and the style sample image corresponding to it obtained through the aforementioned processing.
  • The second original training sample images are images selected from the first original training sample images. That is, a subset of the first original training sample images can be selected as the second original training sample images, and the aforementioned trained style image generation model can then be used to obtain the style sample image corresponding to each second original training sample image.
  • the image size of the target image is usually not fixed.
  • The process of training the style transfer model may therefore also include:
  • performing image adjustment processing at multiple image sizes for each image pair in the paired training image set, to obtain versions of each image pair at multiple image sizes; and
  • training the style transfer model with the versions of each image pair in the paired training image set at the multiple image sizes.
  • In this way, the model can adapt to target images of different sizes, and the generalization ability and robustness of the model are improved.
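  • One simple reading of this multi-size adjustment is to resize every (original, style) pair to several resolutions before training. The sketch below shows the idea; the particular size ladder is an assumption, since the patent does not list concrete resolutions.

```python
import torch.nn.functional as F

SIZES = [256, 384, 512]  # assumed size ladder, for illustration only

def multiscale_pairs(original, style_target):
    """Yield resized versions of one (original, style) training pair,
    given CHW tensors, so the model sees multiple image sizes."""
    for s in SIZES:
        yield (F.interpolate(original.unsqueeze(0), size=(s, s),
                             mode="bilinear", align_corners=False).squeeze(0),
               F.interpolate(style_target.unsqueeze(0), size=(s, s),
                             mode="bilinear", align_corners=False).squeeze(0))
```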
  • The process of training the style transfer model with the paired training image set may also include randomly adjusting the image brightness of the target area of the second original training sample image in the paired training image set, and then training the style transfer model with the brightness-adjusted second original training sample image and the style sample image corresponding to the second original training sample image.
  • Specifically, random brightness processing may be performed on the global image of the second original training sample image in the paired training image set, and brightness enhancement processing may be performed on the target area image of the second original training sample image in the paired training image set.
  • the above-mentioned target area may specifically include a human face area.
  • Performing random brightness processing on the global image means randomly assigning the gamma value applied to all areas of the image, so that the lighting condition of the image is in a random state.
  • Brightness enhancement processing means performing face area recognition on the image, building a face mask from the recognized face area, and then using the face mask to adjust the brightness of the face part of the image, so that the brightness of the face area is higher than that of the non-face area.
  • In this way, the image brightness in the paired training image set simulates real-world image brightness as closely as possible, so that a style transfer model trained on such a paired training image set can process images of different brightness.
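  • A minimal sketch of these two brightness operations follows, assuming float images in [0, 1] and a precomputed binary face mask from some segmentation step; the gamma ranges are illustrative assumptions.

```python
import numpy as np

def random_global_gamma(img, low=0.5, high=2.0, rng=np.random):
    """Global random brightness: apply one random gamma to the whole image."""
    gamma = rng.uniform(low, high)
    return np.clip(img, 0.0, 1.0) ** gamma

def enhance_face_brightness(img, face_mask, gamma=0.8):
    """Raise brightness inside the face mask (gamma < 1 brightens),
    leaving non-face areas untouched."""
    brightened = np.clip(img, 0.0, 1.0) ** gamma
    mask = face_mask[..., None].astype(img.dtype)  # H x W -> H x W x 1
    return img * (1.0 - mask) + brightened * mask
```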
  • After the above processing, this embodiment uses the paired training image set to train the style transfer model and obtain the trained style transfer model.
  • the style transfer model is a generative adversarial network including a second generator and a second discriminator.
  • Training the style transfer model with the paired training image set to obtain the trained style transfer model includes: selecting any sample image from the second original training sample images in the paired training image set as the second generated training image; inputting the second generated training image into the second generator to obtain a second intermediate image; inputting the second intermediate image, together with the style sample image in the paired training image set corresponding to the second original training sample image, into the second discriminator to obtain a discrimination result; and adjusting the second generator according to the discrimination result, selecting the next training sample image from the second original training sample images in the paired training image set as the second generated training image, and returning to the step of inputting the second generated training image into the second generator until the second generator meets the preset condition.
  • FIG. 4 is a schematic diagram of a data flow when training a style transfer model provided by an embodiment of the present disclosure.
  • the style transfer model includes a second generator and a second discriminator.
  • The purpose of the second generator is to process, based on its weight parameters, the second original training sample image input to it and output a second intermediate image.
  • The second intermediate image and the style image corresponding to the second original training sample image input to the second generator are input into the second discriminator at the same time, so that the second discriminator can discriminate between the two images.
  • The purpose of training is to make the second discriminator's result for the second intermediate image consistent with its result for the style image obtained in advance.
  • In this way, the second generator can be made to generate second intermediate images in which the second discriminator identifies the style features of the preset style image.
  • The second discriminator is specifically used to: respectively extract the features of the second intermediate image and its corresponding style sample image; determine the difference between the features of the second intermediate image and its corresponding style sample image; and obtain the discrimination result according to the feature difference and the output of the second discriminator.
  • That is, the second discriminator not only distinguishes whether an image is real, but also supervises whether the high-level features are similar. Specifically, this supervision can be realized through a loss function, namely a content loss based on VGG19-bn, to output the discrimination result.
  • The second intermediate image can be input into the second discriminator so that the VGG19-bn within it extracts the high-level features of the second intermediate image and obtains the corresponding content representation; the second discriminator performs the same processing on the style sample image in the paired training image set corresponding to the second original training sample image, that is, it extracts the high-level features of the style image with VGG19-bn; finally, the content loss is computed from the difference between the two feature sets.
  • From this, a discrimination result is obtained, which can be fed back to the second generator for adjusting the parameters of the second generator.
  • the trained style transfer model can use the second generator after repeated parameter adjustment to process the target image, generate a corresponding style image, and realize the style transfer processing of the image.
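  • A minimal sketch of such a VGG19-bn content loss in PyTorch follows. The cut point in torchvision's vgg19_bn feature stack and the L1 distance are assumptions for illustration (the patent only specifies a VGG19-bn-based content loss), and ImageNet input normalization is omitted for brevity. The resulting loss can be added to the adversarial term when adjusting the second generator.

```python
import torch
import torch.nn.functional as F
from torchvision import models

class VGGContentLoss(torch.nn.Module):
    """Content loss: L1 distance between high-level VGG19-bn features of the
    generated (second intermediate) image and its paired style sample image."""
    def __init__(self, layer_cut=40):  # assumed cut point into vgg19_bn.features
        super().__init__()
        vgg = models.vgg19_bn(weights="DEFAULT").features[:layer_cut]
        vgg.eval()
        for p in vgg.parameters():
            p.requires_grad_(False)  # fixed feature extractor, never trained
        self.vgg = vgg

    def forward(self, generated, style_target):
        return F.l1_loss(self.vgg(generated), self.vgg(style_target))
```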
  • In the style transfer model training method provided by this embodiment, the style image generation model is first trained with the unpaired training image set, so that the paired training image set is obtained from the trained style image generation model;
  • the style transfer model is then trained to obtain the trained style transfer model, and the trained style transfer model can be used to perform style transfer processing on the target image to obtain the corresponding style image. Because the training images of the style transfer model are produced by the trained style image generation model, they are sufficient in number and relatively uniform in quality, so the training effect of the style transfer model is better; the style images that the trained style transfer model outputs for target images are therefore more robust, and the style effect is better.
  • FIG. 5 is a schematic flowchart of an image style transfer method provided by an embodiment of the present disclosure. As shown in FIG. 5 , the method includes:
  • Step 201: Acquire a target image.
  • Step 202: Input the target image into the trained style transfer model and perform image style transfer processing.
  • Step 203: Obtain the style image corresponding to the target image.
  • The style transfer model involved in this embodiment can be obtained by training according to any of the foregoing embodiments; the process of obtaining the model is not described again here.
  • Before the style transfer, the image to be processed may also be preprocessed to obtain the target image to be processed.
  • Acquiring the target image may include: acquiring the image to be processed; performing target detection on the image to be processed to determine the target area in the image to be processed; and preprocessing the target area to obtain the target image.
  • the target may specifically include a human face, the target detection includes human face detection, and the target area includes a human face area.
  • the terminal may also perform preprocessing on the target image to be processed.
  • The face area of the target image can be size-cropped to retain the part containing the face and make the retained part meet a certain size ratio, so as to facilitate the style transfer model's processing of the image.
  • Target segmentation is performed on the cropped target region to obtain a mask image of the target, and gamma correction is performed on the mask image to enhance brightness. That is, a face mask is built from the face area and then used to adjust the brightness of the target image to a certain extent, so that the image brightness of the face area is enhanced and the face area appears more vivid.
  • the processed target image can be input to the style transfer model for processing, and the style image corresponding to the target image can be obtained.
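  • Putting the inference path together, here is a minimal sketch that reuses the hypothetical helpers sketched earlier (crop_face_region, enhance_face_brightness) plus an assumed segment_face callable; none of these names come from the patent, and channel-order and normalization details are elided.

```python
import numpy as np
import torch

def stylize(image_bgr, style_transfer_model, segment_face):
    """Preprocess (crop, mask, gamma-brighten) then run the style transfer model.
    `segment_face` is an assumed callable returning a binary H x W face mask."""
    face = crop_face_region(image_bgr)          # detection + size cropping
    if face is None:
        return None
    img = face.astype(np.float32) / 255.0
    mask = segment_face(face)                   # target segmentation -> face mask
    img = enhance_face_brightness(img, mask)    # gamma correction via the mask
    x = torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0)  # HWC -> NCHW
    with torch.no_grad():
        y = style_transfer_model(x)
    out = y.squeeze(0).permute(1, 2, 0).clamp(0, 1).numpy()
    return (out * 255).astype(np.uint8)
```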
  • Because the training images of the style transfer model are produced by the trained style image generation model, they are sufficient in number and relatively uniform in quality and the training effect on the style transfer model is better, so the style image of the target image output by the trained style transfer model is more robust and the style effect is better.
  • FIG. 6 is a structural block diagram of a device for training a style transfer model provided in an embodiment of the present disclosure.
  • the style transfer model training device includes:
  • an acquisition module 11, configured to acquire an unpaired training sample image set, where the unpaired training sample image set includes at least a preset style training sample image and a first original training sample image;
  • a first training module 12, configured to train the style image generation model with the unpaired training sample image set, so as to process the second original training sample image with the trained style image generation model and obtain the style sample image corresponding to the second original training sample image; and
  • a second training module 13, configured to obtain a paired training sample image set from the second original training sample image and the style sample image corresponding to the second original training sample image, and to train the style transfer model with the paired training sample image set to obtain the trained style transfer model.
  • In some embodiments, the acquisition module 11 is specifically configured to acquire the raw data of the first original training sample image and perform data preprocessing on it to obtain the first original training sample image; the first original training sample image and the pre-acquired preset style training sample image form the unpaired training image set.
  • In some embodiments, the second training module 13 is configured to: perform image adjustment processing at multiple image sizes for each image pair in the paired training image set, to obtain versions of each image pair at multiple image sizes; and train the style transfer model with the versions of each image pair in the paired training image set at the multiple image sizes.
  • In some embodiments, the second training module 13 is configured to randomly adjust the image brightness of the target area of the second original training sample image in the paired training image set, and to train the style transfer model with the brightness-adjusted second original training sample image and the style sample image corresponding to the second original training sample image; specifically, it may perform random brightness processing on the global image of the second original training sample image and area-mask-based brightness enhancement processing on its target area image.
  • In some embodiments, the style image generation model includes a first generative adversarial network, and the first generative adversarial network includes a first generator and a first discriminator; correspondingly, the first training module 12 is configured to:
  • select any training sample image from the first original training sample images in the unpaired training image set as the first generated training image; input the first generated training image into the first generator to obtain a first intermediate image; input the first intermediate image and the preset style training sample image in the unpaired training image set corresponding to the first generated training image into the first discriminator to obtain a discrimination result; and adjust the parameters of the first generator according to the discrimination result, select the next training sample image from the first original training sample images in the unpaired training image set as the first generated training image, and return to the step of inputting the first generated training image into the first generator until the first generator meets a preset condition.
  • the second original training sample image includes an image selected from the first original training sample image.
  • In some embodiments, the style transfer model includes a second generative adversarial network, and the second generative adversarial network includes a second generator and a second discriminator; correspondingly, the second training module 13 is configured to: select any sample image from the second original training sample images in the paired training image set as the second generated training image; input the second generated training image into the second generator to obtain a second intermediate image; input the second intermediate image and the style sample image in the paired training image set corresponding to the second original training sample image into the second discriminator to obtain a discrimination result; and adjust the second generator according to the discrimination result, select the next training sample image from the second original training sample images in the paired training image set as the second generated training image, and return to the step of inputting the second generated training image into the second generator until the second generator meets the preset condition.
  • In some embodiments, the second training module 13 is further configured to: respectively extract the features of the second intermediate image and its corresponding style sample image; determine the difference between those features; and obtain the discrimination result according to the feature difference and the output of the second discriminator.
  • With the style transfer model training device provided by this embodiment, the style image generation model is first trained with the unpaired training image set, so that the paired training image set is obtained from the trained style image generation model; the style transfer model is then trained on the paired training image set to obtain the trained style transfer model, which can be used to perform style transfer processing on a target image to obtain the corresponding style image. Because the training images of the style transfer model are produced by the trained style image generation model, they are sufficient in number and relatively uniform in quality, so the training effect of the style transfer model is better; the style images that the trained style transfer model outputs for target images are therefore more robust, and the style effect is better.
  • FIG. 7 is a structural block diagram of an image style transfer device provided in an embodiment of the present disclosure.
  • the image style transfer device includes:
  • an acquisition module 21, configured to acquire a target image;
  • a processing module 22, configured to input the target image into the trained style transfer model to perform image style transfer processing;
  • the acquisition module 21 is further configured to obtain the style image corresponding to the target image.
  • the trained image style transfer model is obtained according to the style transfer model training method described in the foregoing embodiments.
  • the acquiring module 21 is configured to: acquire an image to be processed; perform target detection on the image to be processed and determine the target area in the image to be processed; and preprocess the target area to obtain the target image.
  • the preprocessing includes at least one of the following: size cropping processing and brightness processing of the face area.
  • the processing module 22 is configured to perform size cropping processing on the target area.
  • the processing module 22 is further configured to perform target segmentation on the cropped target region to obtain a mask image of the target, and to perform gamma correction on the mask image to enhance brightness.
  • Because the training images of the style transfer model are produced by the trained style image generation model, they are sufficient in number and relatively uniform in quality, so that the training effect on the style transfer model is better and the style image of the target image output by the trained style transfer model is more robust, with a better style effect.
  • The electronic device provided in this embodiment can be used to implement the technical solutions of the foregoing method embodiments; its implementation principles and technical effects are similar and are not repeated here.
  • the electronic device 900 may be a terminal device or a media library.
  • the terminal equipment may include, but is not limited to, mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (PDAs), tablet computers (PADs), portable multimedia players (PMPs), vehicle-mounted terminals (such as vehicle-mounted navigation terminals), mobile terminals such as wearable electronic devices, and fixed terminals such as digital TVs, desktop computers, and smart home devices.
  • the electronic device shown in FIG. 8 is only an embodiment, and should not limit the functions and application scope of the embodiments of the present disclosure.
  • The electronic device 900 may include a processor 901 (such as a central processing unit or a graphics processing unit) that can execute various appropriate actions and processes according to a program stored in a read-only memory (ROM) 902 or a program loaded from a storage device 908 into a random access memory (RAM) 903, so as to perform the aforementioned methods. The RAM 903 also stores various programs and data necessary for the operation of the electronic device 900.
  • The processor 901, the ROM 902, and the RAM 903 are connected to each other through a bus 904; an input/output (I/O) interface 905 is also connected to the bus 904.
  • The following devices may be connected to the I/O interface 905: an input device 906 including, for example, a touch screen, a touchpad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, etc.; an output device 907 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, etc.; and a storage device 908 including, for example, a magnetic tape, a hard disk, etc.
  • the communication means 909 may allow the electronic device 900 to perform wireless or wired communication with other devices to exchange data. While FIG. 8 shows electronic device 900 having various means, it should be understood that it is not a requirement to implement or have all of the illustrated means, and more or fewer means may instead be implemented or provided.
  • an embodiment of the present disclosure includes a computer program product, which includes a computer program carried on a computer-readable medium, and the computer program is used to execute the methods shown in the flow charts according to the embodiments of the present disclosure.
  • program code may be downloaded and installed from a network via communication means 909, or from storage means 908, or from ROM 902.
  • When the computer program is executed by the processor 901, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are executed.
  • the above-mentioned computer-readable medium in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the above two.
  • A computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave carrying computer-readable program code therein. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can transmit, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device .
  • the program code contained on the computer readable medium can be transmitted by any appropriate medium, including but not limited to: electric wire, optical cable, radio frequency (Radio Frequency, RF for short), etc., or any suitable combination of the above.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device, or may exist independently without being incorporated into the electronic device.
  • the above-mentioned computer-readable medium carries one or more programs, and when the above-mentioned one or more programs are executed by the electronic device, the electronic device is made to execute the methods shown in the above-mentioned embodiments.
  • Computer program code for carrying out the operations of the present disclosure can be written in one or more programming languages, or combinations thereof, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as "C" or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or media library.
  • The remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, via the Internet using an Internet service provider).
  • each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments described in the present disclosure may be implemented by software or by hardware. Wherein, the name of the unit does not constitute a limitation of the unit itself under certain circumstances, for example, the first obtaining unit may also be described as "a unit for obtaining at least two Internet Protocol addresses".
  • Exemplary types of hardware logic components include: field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), etc.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device.
  • a machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • a machine-readable medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing.
  • More specific examples of machine-readable storage media would include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • a style transfer model training method includes:
  • acquiring an unpaired training sample image set, where the unpaired training sample image set includes at least a preset style training sample image and a first original training sample image;
  • training a style image generation model with the unpaired training sample image set, so as to process a second original training sample image with the trained style image generation model and obtain a style sample image corresponding to the second original training sample image;
  • obtaining a paired training sample image set from the second original training sample image and its corresponding style sample image; and
  • training the style transfer model with the paired training sample image set to obtain the trained style transfer model.
  • In some embodiments, acquiring the unpaired training image set includes: acquiring the raw data of the first original training sample image and performing data preprocessing on it to obtain the first original training sample image;
  • the first original training sample image and the pre-acquired preset style training sample image form the unpaired training image set.
  • In some embodiments, acquiring the raw data of the first original training sample image and performing data preprocessing on it to obtain the first original training sample image includes: performing target detection on the raw data to determine the target area in the first original training sample image; and cropping and aligning the target area to obtain the first original training sample image in the unpaired training image set.
  • In some embodiments, training the style transfer model with the paired training image set includes:
  • performing image adjustment processing at multiple image sizes for each image pair in the paired training image set, to obtain versions of each image pair at multiple image sizes; and
  • training the style transfer model with the versions of each image pair in the paired training image set at the multiple image sizes.
  • In some embodiments, training the style transfer model with the paired training image set includes: randomly adjusting the image brightness of the target area of the second original training sample image in the paired training image set, and training the style transfer model with the brightness-adjusted second original training sample image and the style sample image corresponding to the second original training sample image.
  • In some embodiments, randomly adjusting the image brightness of the target area of the second original training sample image in the paired training image set includes: performing random brightness processing on the global image of the second original training sample image in the paired training image set; and performing area-mask-based brightness enhancement processing on the target area image of the second original training sample image in the paired training image set.
  • In some embodiments, the style image generation model includes a first generative adversarial network, and the first generative adversarial network includes a first generator and a first discriminator;
  • training the style image generation model with the unpaired training image set includes: selecting any training sample image from the first original training sample images in the unpaired training image set as the first generated training image; inputting the first generated training image into the first generator to obtain a first intermediate image; inputting the first intermediate image and the preset style training sample image in the unpaired training image set corresponding to the first generated training image into the first discriminator to obtain a discrimination result; and
  • adjusting the parameters of the first generator according to the discrimination result, selecting the next training sample image from the first original training sample images in the unpaired training image set as the first generated training image, and returning to the step of inputting the first generated training image into the first generator until the first generator meets preset conditions.
  • the second original training sample image includes an image selected from the first original training sample image.
  • In some embodiments, the style transfer model includes a second generative adversarial network, and the second generative adversarial network includes a second generator and a second discriminator;
  • training the style transfer model with the paired training image set to obtain the trained style transfer model includes: selecting any sample image from the second original training sample images in the paired training image set as the second generated training image; inputting the second generated training image into the second generator to obtain a second intermediate image; inputting the second intermediate image and the style sample image in the paired training image set corresponding to the second original training sample image into the second discriminator to obtain a discrimination result; and adjusting the second generator according to the discrimination result, selecting the next training sample image from the second original training sample images in the paired training image set as the second generated training image, and returning to the step of inputting the second generated training image into the second generator until the second generator meets the preset condition.
  • In some embodiments, inputting the second intermediate image and the style sample image in the paired training image set corresponding to the second original training sample image into the second discriminator to obtain the discrimination result further includes: respectively extracting the features of the second intermediate image and its corresponding style sample image; determining the difference between the features of the second intermediate image and its corresponding style sample image; and obtaining the discrimination result according to the feature difference and the output of the second discriminator.
  • An image style transfer method includes:
  • acquiring a target image;
  • inputting the target image into the trained image style transfer model for image style transfer processing, where the trained image style transfer model is obtained by the style transfer model training method of any implementation of the first aspect; and
  • obtaining a style image corresponding to the target image.
  • In some embodiments, acquiring the target image includes: acquiring an image to be processed; performing target detection on the image to be processed to determine the target area in the image to be processed; and preprocessing the target area to obtain the target image.
  • In some embodiments, preprocessing the target area includes: performing size cropping processing on the target area.
  • In some embodiments, preprocessing the target area further includes: performing target segmentation on the cropped target area to obtain a mask image of the target; and
  • performing gamma correction on the mask image to enhance brightness.
  • A style transfer model training device includes:
  • an acquisition module configured to acquire an unpaired training sample image set, where the unpaired training sample image set includes at least a preset style training sample image and a first original training sample image;
  • a first training module configured to train the style image generation model with the unpaired training sample image set, so as to process the second original training sample image with the trained style image generation model and obtain the style sample image corresponding to the second original training sample image; and
  • a second training module configured to obtain a paired training sample image set from the second original training sample image and the style sample image corresponding to the second original training sample image, and to train the style transfer model with the paired training sample image set to obtain the trained style transfer model.
  • In some embodiments, the acquisition module is specifically configured to acquire the raw data of the first original training sample image and perform data preprocessing on it to obtain the first original training sample image;
  • the first original training sample image and the pre-acquired preset style training sample image form the unpaired training image set.
  • In some embodiments, the first training module is configured to perform target detection on the raw data to determine the target area in the first original training sample image, and to crop and align the target area to obtain the first original training sample image in the unpaired training image set.
  • In some embodiments, the second training module is configured to perform image adjustment processing at multiple image sizes for each image pair in the paired training image set to obtain versions of each image pair at multiple image sizes, and to train the style transfer model with the versions of each image pair in the paired training image set at the multiple image sizes.
  • In some embodiments, the second training module is configured to randomly adjust the image brightness of the target area of the second original training sample image in the paired training image set, and to train the style transfer model with the brightness-adjusted second original training sample image and the style sample image corresponding to the second original training sample image.
  • In some embodiments, the second training module is specifically configured to perform random brightness processing on the global image of the second original training sample image in the paired training image set, and to perform area-mask-based brightness enhancement processing on the target area image of the second original training sample image in the paired training image set.
  • In some embodiments, the style image generation model includes a first generative adversarial network, and the first generative adversarial network includes a first generator and a first discriminator;
  • the first training module is specifically configured to: select any training sample image from the first original training sample images in the unpaired training image set as the first generated training image; input the first generated training image into the first generator to obtain the first intermediate image; input the first intermediate image and the preset style training sample image in the unpaired training image set corresponding to the first generated training image into the first discriminator to obtain a discrimination result; and adjust the parameters of the first generator according to the discrimination result, select the next training sample image from the first original training sample images in the unpaired training image set as the first generated training image, and return to the step of inputting the first generated training image into the first generator until the first generator meets the preset condition.
  • the second original training sample image includes an image selected from the first original training sample image.
  • In some embodiments, the style transfer model includes a second generative adversarial network, and the second generative adversarial network includes a second generator and a second discriminator;
  • the second training module is specifically configured to: select any sample image from the second original training sample images in the paired training image set as the second generated training image; input the second generated training image into the second generator to obtain a second intermediate image; input the second intermediate image and the style sample image in the paired training image set corresponding to the second original training sample image into the second discriminator to obtain a discrimination result; and adjust the parameters of the second generator according to the discrimination result, select the next training sample image from the second original training sample images in the paired training image set as the second generated training image, and return to the step of inputting the second generated training image into the second generator until the second generator meets the preset condition.
  • In some embodiments, the second training module is specifically configured to: respectively extract the features of the second intermediate image and its corresponding style sample image; determine the difference between the features of the second intermediate image and its corresponding style sample image; and obtain the discrimination result according to the feature difference and the output of the second discriminator.
  • an image style transfer device includes:
  • An acquisition module configured to acquire a target image
  • a processing module configured to input the target image into the trained style transfer model to perform image style transfer processing, where the trained image style transfer model is obtained by the style transfer model training method of any implementation of the first aspect;
  • the acquisition module is further configured to obtain the style image corresponding to the target image.
  • In some embodiments, the acquiring module is specifically configured to: acquire an image to be processed; perform target detection on the image to be processed to determine the target area in the image to be processed; and preprocess the target area to obtain the target image.
  • the processing module is configured to perform size cutting processing on the target area.
  • the acquisition module is specifically configured to perform target segmentation on the clipped target area to obtain a mask image of the target; and perform gamma correction on the mask image to enhance brightness.
  • an electronic device includes: at least one processor and a memory;
  • the memory stores computer-executable instructions
  • the at least one processor executes the computer-executable instructions stored in the memory, such that the at least one processor performs the method as described in any one of the preceding items.
  • a computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement the method described in any one of the preceding items.
  • a computer program product includes computer instructions, and when the computer instructions are executed by a processor, implement the method as described in any one of the preceding items.
  • a computer program when executed by a processor, implements the method described in any one of the preceding items.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

Provided are a style transfer model training method, an image style transfer method and an apparatus. The method includes: first training a style image generation model with an unpaired training image set, so as to obtain a paired training image set from the trained style image generation model; and then training a style transfer model based on the paired training image set to obtain a trained style transfer model. The trained style transfer model can be used to perform style transfer processing on a target image to obtain a corresponding style image.

Description

Style transfer model training method, image style transfer method and apparatus
Cross-Reference to Related Application
This application claims priority to Chinese Patent Application No. 202110858177.5, filed on July 28, 2021 and entitled "Style Transfer Model Training Method, Image Style Transfer Method and Apparatus", the entire contents of which are incorporated herein by reference.
Technical Field
Embodiments of the present disclosure relate to the field of computers, and in particular to a style transfer model training method, an image style transfer method and an apparatus.
Background
In recent years, GAN networks have been applied ever more widely in computer vision research, and image style transfer is one practical application of GAN networks in the computer vision field. Image style transfer refers to an image processing technique that renders an image as a painting with a specific artistic style.
In the process of transferring an original image to an image of a specific style with existing image style transfer techniques, however, a neural network model is generally used to extract the style features from the style image and mix them with the content of the target image so as to reconstruct the target image, thereby obtaining a target image that carries the style features.
During the training of such a neural network model, the shortage of training samples and their uneven quality cause the trained neural network model to produce unsatisfactory style-feature target images when it performs image style transfer.
Summary
In view of the above problems, embodiments of the present disclosure provide a style transfer model training method, an image style transfer method and an apparatus.
In a first aspect, the present disclosure provides a style transfer model training method, including:
acquiring an unpaired training sample image set, where the unpaired training sample image set includes at least a preset style training sample image and a first original training sample image;
training a style image generation model with the unpaired training sample image set, so as to process a second original training sample image with the trained style image generation model to obtain a style sample image corresponding to the second original training sample image;
obtaining a paired training sample image set according to the second original training sample image and the style sample image corresponding to the second original training sample image; and
training a style transfer model with the paired training sample image set to obtain a trained style transfer model.
In a second aspect, the present disclosure provides an image style transfer method, including:
acquiring a target image;
inputting the target image into a trained style transfer model for image style transfer processing, where the trained image style transfer model is obtained according to the style transfer model training method described in any one of the first aspect; and
obtaining a style image corresponding to the target image.
In a third aspect, the present disclosure provides a style transfer model training apparatus, including:
an acquisition module configured to acquire an unpaired training sample image set, where the unpaired training sample image set includes at least a preset style training sample image and a first original training sample image;
a first training module configured to train a style image generation model with the unpaired training sample image set, so as to process a second original training sample image with the trained style image generation model to obtain a style sample image corresponding to the second original training sample image; and
a second training module configured to obtain a paired training sample image set according to the second original training sample image and the style sample image corresponding to the second original training sample image, and to train a style transfer model with the paired training sample image set to obtain a trained style transfer model.
In a fourth aspect, the present disclosure provides an image style transfer apparatus, including:
an acquisition module configured to acquire a target image; and
a processing module configured to input the target image into a trained style transfer model for image style transfer processing, where the trained image style transfer model is obtained according to the style transfer model training method described in any one of the first aspect;
where the acquisition module is further configured to obtain a style image corresponding to the target image.
In a fifth aspect, an embodiment of the present disclosure provides an electronic device, including at least one processor and a memory;
the memory stores computer-executable instructions; and
the at least one processor executes the computer-executable instructions stored in the memory, causing the at least one processor to perform the style transfer model training method according to the first aspect and its various possible designs, or the image style transfer method according to the second aspect and its various possible designs.
In a sixth aspect, an embodiment of the present disclosure provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the style transfer model training method according to the first aspect and its various possible designs, or the image style transfer method according to the second aspect and its various possible designs.
In a seventh aspect, an embodiment of the present disclosure provides a computer program product including computer instructions which, when executed by a processor, implement the style transfer model training method according to the first aspect and its various possible designs, or the image style transfer method according to the second aspect and its various possible designs.
In an eighth aspect, an embodiment of the present disclosure provides a computer program which, when executed by a processor, implements the style transfer model training method according to the first aspect and its various possible designs, or the image style transfer method according to the second aspect and its various possible designs.
According to the style transfer model training method, image style transfer method and apparatus provided by the embodiments of the present disclosure, a style image generation model is first trained with an unpaired training image set so that a paired training image set is obtained from the trained style image generation model; the style transfer model is then trained based on the paired training image set to obtain the trained style transfer model, which can be used to perform style transfer processing on a target image to obtain the corresponding style image. Since the training images of the style transfer model are obtained with the trained style image generation model, they are sufficient in number and relatively uniform in quality, so the style transfer model is trained more effectively; in turn, the style image of the target image output by the trained style transfer model is more robust and has a better style effect.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present disclosure or the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Apparently, the drawings described below show some embodiments of the present disclosure, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a network architecture on which the present disclosure is based;
FIG. 2 is a schematic flowchart of a style transfer model training method provided by an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of the data flow when a style image generation model is trained, provided by an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of the data flow when a style transfer model is trained, provided by an embodiment of the present disclosure;
FIG. 5 is a schematic flowchart of an image style transfer method provided by an embodiment of the present disclosure;
FIG. 6 is a structural block diagram of a style transfer model training apparatus provided by an embodiment of the present disclosure;
FIG. 7 is a structural block diagram of an image style transfer apparatus provided by an embodiment of the present disclosure;
FIG. 8 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
To make the objectives, technical solutions and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments of the present disclosure are described below clearly and completely with reference to the accompanying drawings. Apparently, the described embodiments are only some rather than all of the embodiments of the present disclosure. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present disclosure without creative effort fall within the protection scope of the present disclosure.
In recent years, GAN networks have been applied ever more widely in computer vision research, and image style transfer is one practical application of GAN networks in the computer vision field.
Image style transfer refers to an image processing technique that renders an image as a painting with a specific artistic style. Generally, to implement image style transfer, a trained generative adversarial model is used to extract the style features from a style image and mix them with the content of a target image so as to reconstruct the target image, obtaining a target image that carries the style features.
In the prior art, to train the neural network model, a generative adversarial model is usually trained with target images and style images of a specific style, so that the trained generative adversarial model can process a target image and produce a style image of the target image in the same style as the style images, thereby completing the style transfer processing of the image.
However, for style transfer in some specific scenes and specific styles, the existing processing produces style images whose style robustness is low and whose effect is poor.
To address this problem, the inventor first conceived of building, from an unpaired training sample image set, a style image generation model used to generate a paired training sample image set. Since the style sample images in the paired training sample image set are generated by the style image generation model, their image quality is relatively high, which improves the training of the style transfer model; at the same time, because the style transfer model is trained with paired training images, the extraction of style features from the training images is more robust.
Referring to FIG. 1, which is a schematic diagram of a network architecture on which the present disclosure is based, the network architecture shown in FIG. 1 may specifically include at least one terminal 1 and a server 2.
The terminal 1 may specifically be a hardware device such as a user mobile phone, a smart home device, a tablet computer or a wearable device. The server 2 may specifically be a server or server cluster deployed in the cloud, in which a style transfer model training apparatus for generating a style transfer model is integrated or installed, the apparatus being the hardware or software that executes the style transfer model training method of the present disclosure.
A style image generation model and a style transfer model are deployed in the server 2, and the server 2 can obtain a trained style transfer model based on the style transfer model training method provided by the present disclosure.
When the style transfer model is trained, a style image generation model can be built to obtain a paired training sample image set, thereby enabling the training of the style transfer model.
The server can send the trained style transfer model to the terminal 1, so that the terminal 1 can process a target image with the trained style transfer model and obtain the style image of the target image.
The architecture shown in FIG. 1 is applicable to scenarios of various image processing applications (APPs). Whatever the APP scenario, the above server 2 may be the running server of the APP, which provides the terminal 1 with the corresponding basic application functions through interaction with the terminal 1.
Specifically, the style transfer model training method and image style transfer method provided by the present disclosure are applicable to scenarios such as image stylization and image special-effect editing.
The style transfer model training method provided by the present disclosure is further described below.
In a first aspect, FIG. 2 is a schematic flowchart of a style transfer model training method provided by an embodiment of the present disclosure. Referring to FIG. 2, the style transfer model training method provided by this embodiment of the present disclosure includes:
Step 101: acquiring an unpaired training sample image set, where the unpaired training sample image set includes at least a preset style training sample image and a first original training sample image;
Step 102: training a style image generation model with the unpaired training sample image set, so as to process a second original training sample image with the trained style image generation model to obtain a style sample image corresponding to the second original training sample image;
Step 103: obtaining a paired training sample image set according to the second original training sample image and the style sample image corresponding to the second original training sample image; and
Step 104: training a style transfer model with the paired training sample image set to obtain a trained style transfer model.
It should be noted that the execution subject of the style transfer model training method provided by this embodiment is the aforementioned style transfer model training apparatus. In some embodiments of the present disclosure, the style transfer model training apparatus may be a server; in other implementations, the apparatus may reside on a terminal, so that the terminal can process target images based on the trained style transfer model.
To give the trained style transfer model a better image style transfer capability, the implementations provided by the present disclosure build two neural network models to handle the different processing requirements that arise during image style transfer.
On this basis, to obtain a higher-quality training effect for the style transfer model, in this implementation a style image generation model is first trained with the unpaired training image set, so that the trained style image generation model can be used to obtain the paired training image set.
Specifically, the style image generation model may include a generative adversarial network, which enables it to produce the paired training set. First, an unpaired training image set is acquired for the unpaired training of the style image generation model. The unpaired training image set includes at least a preset style training sample image and a first original training sample image.
The preset style training sample image refers to an image with a uniform target style, for example, an image in a particular artistic style.
The first original training sample image refers to a real image, that is, an image that has not undergone style transfer processing.
Optionally, so that the first original training sample images in the unpaired training image set have high image quality, the raw data of the first original training sample image may be preprocessed in advance to improve its image quality. That is, the raw data of the first original training sample image is acquired and data-preprocessed to obtain the first original training sample image, and the first original training sample image together with the pre-acquired preset style training sample image constitutes the unpaired training image set.
A first original training sample image contains a target area, for example, a face area, and the size and proportion of the face area in the image affect how well the model can perform feature extraction and other processing on it. On this basis, in an optional implementation, target detection may first be performed on the raw data to determine the target area in the first original training sample image; then, based on the target area, cropping and alignment processing is performed on the target area to obtain the first original training sample image in the unpaired training image set. The target detection may specifically be face detection, and the target area may specifically be a face area.
For face detection, commonly used techniques for locating the face area in an image can be adopted; this implementation does not restrict the specific way in which face detection is implemented.
After face detection is performed on the raw data, a face detection box is obtained, indicating the face area where the face in the first original training sample image is located. Then, based on the face area marked by the face detection box, the raw data can be cropped and aligned so that the resulting original image has a height-to-width ratio greater than 1:1, for example 1.3:1; at this aspect ratio the model can be trained more effectively, as illustrated by the sketch below.
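For illustration only, a minimal Python/NumPy sketch of such a crop-and-align step is given below. The function name crop_align, the choice to keep the detected box width, and the clamping behavior at image borders are assumptions of this sketch, not details fixed by the disclosure:

```python
import numpy as np

def crop_align(image: np.ndarray, face_box, ratio: float = 1.3):
    """Crop `image` (H x W x C) around `face_box` = (x0, y0, x1, y1) so that
    the cropped region approaches a height:width ratio of `ratio` (1.3:1).
    Clamping at the image borders may make the ratio only approximate."""
    h, w = image.shape[:2]
    x0, y0, x1, y1 = face_box
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    bw = x1 - x0                 # keep the detected box width
    bh = bw * ratio              # target height from the 1.3:1 ratio
    nx0 = int(max(cx - bw / 2, 0))
    nx1 = int(min(cx + bw / 2, w))
    ny0 = int(max(cy - bh / 2, 0))
    ny1 = int(min(cy + bh / 2, h))
    return image[ny0:ny1, nx0:nx1]
```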
After the above processing is completed, an unpaired training image set including the first original training sample images and the preset style training sample images can be constructed, and the pre-established style image generation model is trained with this unpaired training image set.
Specifically, training the style image generation model is generally a multi-round, cyclic process: after the first original training sample images and the preset style training sample images in the unpaired training image set are input to the style image generation model and the model trained in this round is obtained, the images of the same unpaired training image set can, according to actual needs, be re-input into the previously trained style image generation model to train it again, and this process is repeated up to a preset number of times. Based on actual needs, when the trained style image generation model can output style images corresponding to the first original training sample images, training can stop, and the trained style image generation model is obtained.
Further, the style image generation model includes a first generative adversarial network, and the first generative adversarial network includes a first generator and a first discriminator. Accordingly, training the style image generation model with the unpaired training image set includes: selecting any training sample image from the first original training sample images in the unpaired training image set as a first generated training image; inputting the first generated training image to the first generator to obtain a first intermediate image; inputting the first intermediate image and the preset style training sample image corresponding to the first generated training image in the unpaired training image set to the first discriminator to obtain a discrimination result; adjusting the parameters of the first generator according to the discrimination result, selecting the next training sample image from the first original training sample images in the unpaired training image set as the first generated training image, and returning to the step of inputting the first generated training image to the first generator until the first generator meets a preset condition.
FIG. 3 is a schematic diagram of the data flow when a style image generation model is trained according to an embodiment of the present disclosure. Referring to FIG. 3, the style image generation model includes a first generator and a first discriminator.
The purpose of the first generator is to process, based on its weight parameters, the first original training sample image input to it and to output a first intermediate image; the first intermediate image and a preset style image are input to the first discriminator at the same time, so that the first discriminator can discriminate between the two images. Evidently, the goal of training is to make the first discriminator's result for the first intermediate image consistent with its result for the preset style image.
On this basis, when the style image generation model is trained with the unpaired training image set, the discrimination result obtained by the first discriminator on the first intermediate image and the preset style image is also used to adjust the parameters of the first generator so that the first generator is optimized, and the above process is repeated until the first discriminator's result for the first intermediate image is consistent with its result for the preset style image. In this way, the first generator can generate first intermediate images that the first discriminator judges to carry the style features of the preset style image.
By training the style image generation model in this manner, the trained model, when processing a second original training sample image with the repeatedly tuned first generator, can obtain a style sample image corresponding to the second original training sample image that carries the style features of the preset style image; moreover, the style sample image corresponding to the second original training sample image and the second original training sample image itself can serve as a paired training image set for training the subsequent model.
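As an illustration of this alternating parameter adjustment, the sketch below implements one training iteration in PyTorch. The disclosure does not fix a concrete network architecture or loss, so the stand-in layers, the binary cross-entropy adversarial loss, and the learning rates are assumptions of the sketch only:

```python
import torch
import torch.nn as nn

# Minimal stand-in networks; the disclosure does not fix an architecture.
generator = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
)
discriminator = nn.Sequential(
    nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 1, 4, stride=2, padding=1),  # patch-wise real/fake logits
)
bce = nn.BCEWithLogitsLoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

def train_step(first_original, preset_style):
    """One iteration: a first original sample is mapped to a first intermediate
    image, the first discriminator compares it against a preset style sample,
    and both networks are adjusted from the discrimination result."""
    # Discriminator update: style samples count as real, intermediates as fake.
    with torch.no_grad():
        fake = generator(first_original)
    d_real = discriminator(preset_style)
    d_fake = discriminator(fake)
    d_loss = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator update ("parameter adjustment according to the discrimination
    # result"): push the discriminator toward judging the intermediate as style.
    g_out = discriminator(generator(first_original))
    g_loss = bce(g_out, torch.ones_like(g_out))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()
```

The same alternating scheme also describes the second generator and second discriminator introduced later, with the unpaired style samples replaced by the paired style sample images.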
The style transfer model training apparatus can obtain the paired training image set according to the second original training sample image and the style sample image corresponding to the second original training sample image, and train the style transfer model with the paired training image set to obtain the trained style transfer model.
The above second original training sample image may specifically refer to a real image, that is, an image that has not undergone style transfer processing.
Specifically, so that the style transfer model remains more robust during style transfer, in this implementation the style transfer model can be trained with paired training images; that is, the training images in the paired training image set include the second original training sample images and the style sample images corresponding to them obtained by the foregoing processing.
Optionally, to improve training efficiency, the second original training sample images are images selected from the first original training sample images. That is, some images may first be selected from the first original training sample images as the second original training sample images, and the foregoing trained style image generation model is then used to process the second original training samples to obtain the style sample images corresponding to the second original training sample images.
In addition, in actual use the size of the target image is usually not fixed. To improve the stability of style transfer for images of different sizes, in an optional implementation, the process of training the style transfer model with the paired training image set may further include:
performing, on each image pair in the paired training image set, image adjustment processing based on multiple image size dimensions to obtain, for each image pair, image pairs at multiple image sizes; and
training the style transfer model with the image pairs of each image pair in the paired training image set at multiple image sizes.
Training the style transfer model with second original training sample images of different sizes and their corresponding style images allows the model to adapt to target images of different sizes, improving the generalization ability and robustness of the model; a sketch of such multi-size adjustment follows.
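For illustration, one way to realize the multi-size adjustment is to resize each (second original, style sample) pair to several resolutions before training. The concrete sizes below are assumptions of this sketch; the disclosure does not list any:

```python
import torch.nn.functional as F

def multi_scale_pairs(original, style, sizes=(256, 384, 512)):
    """Resize one (second original, style sample) pair, given as N x C x H x W
    tensors, to several image size dimensions so the style transfer model is
    trained on each pair at multiple sizes."""
    pairs = []
    for s in sizes:
        pairs.append((
            F.interpolate(original, size=(s, s), mode="bilinear", align_corners=False),
            F.interpolate(style, size=(s, s), mode="bilinear", align_corners=False),
        ))
    return pairs
```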
On the basis of the above implementation, in actual use the brightness of target images varies greatly. To improve the stability of style transfer for images of different brightness, in other optional implementations, the process of training the style transfer model with the paired training image set may further include randomly adjusting the image brightness of the target area of the second original training sample images in the paired training image set, and training the style transfer model with the brightness-adjusted second original training sample images and the style sample images corresponding to the second original training sample images.
In real images, the global brightness is quite random, while the face part of an image is relatively bright. To better simulate the brightness of target images under real conditions, in an optional implementation, random brightness processing may be performed on the global image of the second original training sample images in the paired training image set, and brightness enhancement processing may be performed on the target area images of the second original training sample images in the paired training image set.
The above target area may specifically include a face area. Accordingly, performing random brightness processing on the global image means randomly assigning the gamma value over the whole image so that the lighting condition of the image is random; brightness enhancement means recognizing the face area of the image, building a face mask from the recognized face area image, and then adjusting the brightness of the face part of the image with the face mask so that the brightness of the face area is higher than that of the non-face area.
Through the above brightness processing, the images in the paired training image set simulate real-world image brightness as closely as possible, so that a style transfer model trained with such a paired training image set can handle target images under different brightness conditions; a sketch of this augmentation is given below.
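For illustration, the two brightness operations might be combined as follows. The numeric gamma range and face gain are assumptions of this sketch; the embodiment only requires that the global gamma be random and that the face area end up brighter than the non-face area:

```python
import torch

def randomize_brightness(image, face_mask, gamma_range=(0.5, 1.8), face_gain=1.25):
    """Global random gamma plus mask-based face brightening for a second
    original training sample image (C x H x W tensor in [0, 1]); `face_mask`
    is 1 inside the face area and 0 elsewhere."""
    gamma = torch.empty(1).uniform_(*gamma_range).item()
    out = image.clamp(0, 1) ** gamma                    # random global lighting
    out = out * (1.0 + (face_gain - 1.0) * face_mask)   # brighten face region
    return out.clamp(0, 1)
```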
After the processing of the second original training sample images and the style images in the paired training image set is completed, this implementation trains the model with the paired training image set.
The style transfer model is trained with the paired training image set to obtain the trained style transfer model.
Specifically, the style transfer model is a generative adversarial network that includes a second generator and a second discriminator. Accordingly, training the style transfer model with the paired training image set to obtain the trained style transfer model includes: selecting any sample image from the second original training sample images in the paired training image set as a second generated training image; inputting the second generated training image to the second generator to obtain a second intermediate image; inputting the second intermediate image and the style sample image corresponding to the second original training sample image in the paired training image set to the second discriminator to obtain a discrimination result; adjusting the parameters of the second generator according to the discrimination result, selecting the next training sample image from the second original training sample images in the paired training image set as the second generated training image, and returning to the step of inputting the second generated training image to the second generator until the second generator meets a preset condition.
FIG. 4 is a schematic diagram of the data flow when a style transfer model is trained according to an embodiment of the present disclosure. Referring to FIG. 4, the style transfer model includes a second generator and a second discriminator.
The purpose of the second generator is to process, based on its weight parameters, the second original training sample image input to it and to output a second intermediate image. The second intermediate image and the style image corresponding to the second original training sample image input to the second generator are input to the second discriminator at the same time, so that the second discriminator can discriminate between the two images. Evidently, the goal of training is to make the second discriminator's result for the second intermediate image consistent with its result for the style image obtained in advance.
On this basis, when the style transfer model is trained with the paired training image set, the discrimination result obtained by the second discriminator on the second intermediate image and the style image is also used to adjust the parameters of the second generator so that the second generator is optimized, and the above process is repeated.
In this way, the second generator can generate second intermediate images that the second discriminator judges to carry the style features of the preset style image.
In this process, to discriminate between the images, the second discriminator is specifically configured to: separately extract the features of the second intermediate image and of its corresponding style sample image; determine the difference between the features of the second intermediate image and of its corresponding style sample image; and obtain the discrimination result according to the feature difference and the output of the second discriminator.
That is, the second discriminator not only judges whether an image is real or fake, but also supervises whether the high-level features are similar. Specifically, this supervision can be implemented with a loss function, namely a VGG19-bn based content loss applied to the discrimination output. After the second generator generates the second intermediate image corresponding to the second original training sample image, the second intermediate image can be input to the second discriminator so that the VGG19-bn inside it extracts the high-level features of the second intermediate image, yielding the corresponding content loss; the second discriminator then applies the same processing to the style sample image corresponding to the second original training sample image in the paired training image set, that is, extracts the high-level features of the style image based on VGG19-bn, yielding the corresponding content loss; finally, the difference between the two content losses is computed. The discrimination result is obtained from the feature difference and the real/fake output of the second discriminator, and can be fed back to the second generator to adjust its parameters.
Naturally, the smaller the absolute value of the feature difference, the better the quality of the image generated by the second generator, the closer the generated image is to the style image, and the better the training effect of the model.
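For illustration, the feature-difference term might be computed as below, assuming a recent torchvision that provides the VGG19_BN_Weights enum. Which layer counts as "high-level", the L1 distance, and the omission of ImageNet input normalization are simplifying assumptions of this sketch; the disclosure only names VGG19-bn:

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg19_bn, VGG19_BN_Weights

# Frozen VGG19-bn feature extractor used for content supervision.
vgg_features = vgg19_bn(weights=VGG19_BN_Weights.IMAGENET1K_V1).features[:40].eval()
for p in vgg_features.parameters():
    p.requires_grad_(False)

def content_loss(second_intermediate, style_sample):
    """Difference between the high-level features of the second intermediate
    image and those of its corresponding style sample image; a smaller value
    means the generated image is closer to the style image."""
    return F.l1_loss(vgg_features(second_intermediate), vgg_features(style_sample))
```

This term would be added to the adversarial real/fake output of the second discriminator to form the discrimination result fed back to the second generator.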
By training the style transfer model in the above manner, the trained style transfer model can process a target image with the repeatedly tuned second generator and generate the corresponding style image, implementing style transfer processing of the image.
In the image style transfer method provided by this embodiment of the present disclosure, a style image generation model is first trained with an unpaired training image set so that a paired training image set is obtained from the trained style image generation model; the style transfer model is then trained based on the paired training image set to obtain the trained style transfer model, which can be used to perform style transfer processing on a target image to obtain the corresponding style image. Since the training images of the style transfer model are obtained with the trained style image generation model, they are sufficient in number and relatively uniform in quality, so the style transfer model is trained more effectively; in turn, the style image of the target image output by the trained style transfer model is more robust and has a better style effect.
On the basis of the above implementation, FIG. 5 is a schematic flowchart of an image style transfer method provided by an embodiment of the present disclosure. As shown in FIG. 5, the method includes:
Step 201: acquiring a target image;
Step 202: inputting the target image into a trained style transfer model for image style transfer processing; and
Step 203: obtaining a style image corresponding to the target image.
The style transfer model involved in this implementation can be obtained by training according to any of the foregoing implementations; the process of obtaining the model is not repeated here.
In this implementation, in view of the stylistic characteristic that both the brightness and the color of the face area in a specific style image are very vivid, the image to be processed may also be preprocessed during style transfer to obtain the target image to be processed.
Specifically, acquiring the target image may include: acquiring an image to be processed; performing target detection on the image to be processed to determine a target area in the image to be processed; and preprocessing the target area to obtain the target image. The target may specifically include a face, the target detection includes face detection, and the target area includes a face area.
In other words, before performing style transfer processing on the target image, the terminal may first preprocess the target image to be processed. In one implementation, based on the recognized face area, size cropping may be performed on the face area of the target image so as to retain the part containing the face area and to make the retained part meet a certain size ratio, which facilitates the model's processing of the image. After that, target segmentation is performed on the cropped target area to obtain a mask image of the target, and gamma correction is performed on the mask image to enhance brightness. That is, a face mask is built from the face area, and the brightness of the target image is then adjusted with the face mask so that the image brightness of the face area is enhanced to a certain extent, making the image of the face area more vivid.
After the above processing is completed, the processed target image can be input to the style transfer model for processing, and the style image corresponding to the target image is obtained; a sketch of this mask-based preprocessing follows.
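For illustration, the inference-time brightening mirrors the training-time augmentation but is deterministic. In the sketch below, the upstream face detector/segmenter that produces `face_mask` is assumed to exist and is not shown; the gamma value 0.8 is likewise an assumption:

```python
import numpy as np

def enhance_face(image: np.ndarray, face_mask: np.ndarray, gamma: float = 0.8):
    """Mask-based gamma correction for the cropped target image: pixels inside
    the face mask are brightened (gamma < 1 raises mid-tones), pixels outside
    are left unchanged. `image` is H x W x C uint8; `face_mask` is H x W."""
    img = image.astype(np.float32) / 255.0
    corrected = img ** gamma
    mask = face_mask.astype(np.float32)[..., None]   # broadcast over channels
    out = mask * corrected + (1.0 - mask) * img
    return (out * 255.0).clip(0, 255).astype(np.uint8)
```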
In the image style transfer method provided by this embodiment of the present disclosure, since the training images of the style transfer model are obtained with the trained style image generation model, they are sufficient in number and relatively uniform in quality, the style transfer model is trained more effectively, and the style image of the target image output by the trained style transfer model is more robust with a better style effect.
Corresponding to the style transfer model training method of the above embodiments, FIG. 6 is a structural block diagram of the style transfer model training apparatus provided by an embodiment of the present disclosure. For ease of description, only the parts related to the embodiments of the present disclosure are shown. Referring to FIG. 6, the style transfer model training apparatus includes:
an acquisition module 11 configured to acquire an unpaired training sample image set, where the unpaired training sample image set includes at least a preset style training sample image and a first original training sample image;
a first training module 12 configured to train a style image generation model with the unpaired training sample image set, so as to process a second original training sample image with the trained style image generation model to obtain a style sample image corresponding to the second original training sample image; and
a second training module 13 configured to obtain a paired training sample image set according to the second original training sample image and the style sample image corresponding to the second original training sample image, and to train a style transfer model with the paired training sample image set to obtain a trained style transfer model.
Optionally, the acquisition module 11 is configured to: acquire raw data of the first original training sample image and perform data preprocessing on the raw data of the first original training sample image to obtain the first original training sample image; and form the unpaired training image set from the first original training sample image and the pre-acquired preset style training sample image.
Optionally, the acquisition module 11 is configured to: perform target detection on the raw data to determine the target area in the first original training sample image; and, based on the obtained target area, perform cropping and alignment processing on the target area to obtain the first original training sample image in the unpaired training image set.
Optionally, the second training module 13 is configured to: perform, on each image pair in the paired training image set, image adjustment processing based on multiple image size dimensions to obtain, for each image pair, image pairs at multiple image sizes; and train the style transfer model with the image pairs of each image pair in the paired training image set at multiple image sizes.
Optionally, the second training module 13 is configured to randomly adjust the image brightness of the target area of the second original training sample image in the paired training image set, and to train the style transfer model with the brightness-adjusted second original training sample image and the style sample image corresponding to the second original training sample image.
Optionally, the second training module 13 is specifically configured to: perform random brightness processing on the global image of the second original training sample image in the paired training image set; and perform region-mask-based brightness enhancement processing on the target area image of the second original training sample image in the paired training image set.
Optionally, the style image generation model includes a first generative adversarial network, and the first generative adversarial network includes a first generator and a first discriminator; accordingly, the first training module 12 is configured to:
select any training sample image from the first original training sample images in the unpaired training image set as a first generated training image; input the first generated training image to the first generator to obtain a first intermediate image; input the first intermediate image and the preset style training sample image corresponding to the first generated training image in the unpaired training image set to the first discriminator to obtain a discrimination result; adjust the parameters of the first generator according to the discrimination result, select the next training sample image from the first original training sample images in the unpaired training image set as the first generated training image, and return to the step of inputting the first generated training image to the first generator until the first generator meets a preset condition.
Optionally, the second original training sample image includes an image selected from the first original training sample images.
Optionally, the style transfer model includes a second generative adversarial network, and the second generative adversarial network includes a second generator and a second discriminator; accordingly, the second training module 13 is configured to: select any sample image from the second original training sample images in the paired training image set as a second generated training image; input the second generated training image to the second generator to obtain a second intermediate image; input the second intermediate image and the style sample image corresponding to the second original training sample image in the paired training image set to the second discriminator to obtain a discrimination result; adjust the parameters of the second generator according to the discrimination result, select the next training sample image from the second original training sample images in the paired training image set as the second generated training image, and return to the step of inputting the second generated training image to the second generator until the second generator meets a preset condition.
Optionally, the second training module 13 is configured to: separately extract the features of the second intermediate image and of its corresponding style sample image; determine the difference between the features of the second intermediate image and of its corresponding style sample image; and obtain the discrimination result according to the feature difference and the output of the second discriminator.
In the style transfer model training apparatus provided by this embodiment of the present disclosure, a style image generation model is first trained with an unpaired training image set so that a paired training image set is obtained from the trained style image generation model; the style transfer model is then trained based on the paired training image set to obtain the trained style transfer model, which can be used to perform style transfer processing on a target image to obtain the corresponding style image. Since the training images of the style transfer model are obtained with the trained style image generation model, they are sufficient in number and relatively uniform in quality, the style transfer model is trained more effectively, and the style image of the target image output by the trained style transfer model is more robust with a better style effect.
Corresponding to the image style transfer method of the above embodiments, FIG. 7 is a structural block diagram of the image style transfer apparatus provided by an embodiment of the present disclosure. For ease of description, only the parts related to the embodiments of the present disclosure are shown. Referring to FIG. 7, the image style transfer apparatus includes:
an acquisition module 21 configured to acquire a target image; and
a processing module 22 configured to input the target image into a trained style transfer model for image style transfer processing;
where the acquisition module 21 is further configured to obtain a style image corresponding to the target image.
The trained image style transfer model is obtained according to the style transfer model training method described in the foregoing embodiments.
Optionally, the acquisition module 21 is configured to: acquire an image to be processed; perform target detection on the image to be processed to determine a target area in the image to be processed; and preprocess the target area to obtain the target image.
Optionally, the preprocessing includes at least one of the following: size cropping processing and brightness processing of the face area.
Optionally, the processing module 22 is configured to perform size cropping processing on the target area.
Optionally, the processing module 22 is further configured to perform target segmentation on the cropped target area to obtain a mask image of the target, and to perform gamma correction on the mask image to enhance brightness.
In the image style transfer apparatus provided by this embodiment of the present disclosure, since the training images of the style transfer model are obtained with the trained style image generation model, they are sufficient in number and relatively uniform in quality, the style transfer model is trained more effectively, and the style image of the target image output by the trained style transfer model is more robust with a better style effect.
The electronic device provided by this embodiment can be used to execute the technical solutions of the above method embodiments; its implementation principle and technical effect are similar and are not repeated here.
Referring to FIG. 8, which shows a schematic structural diagram of an electronic device 900 suitable for implementing the embodiments of the present disclosure, the electronic device 900 may be a terminal device or a media library. The terminal device may include, but is not limited to, mobile terminals such as mobile phones, laptop computers, digital broadcast receivers, personal digital assistants (PDAs), tablet computers (Portable Android Devices, PADs), portable multimedia players (PMPs), vehicle-mounted terminals (e.g., vehicle navigation terminals) and wearable electronic devices, as well as fixed terminals such as digital TVs, desktop computers and smart home devices. The electronic device shown in FIG. 8 is merely an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in FIG. 8, the electronic device 900 may include a processor 901 (e.g., a central processing unit, a graphics processing unit, etc.) for executing the foregoing methods, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 902 or a program loaded from a storage device 908 into a random access memory (RAM) 903. The RAM 903 also stores various programs and data required for the operation of the electronic device 900. The processor 901, the ROM 902 and the RAM 903 are connected to one another through a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.
Generally, the following devices can be connected to the I/O interface 905: input devices 906 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer and a gyroscope; output devices 907 including, for example, a liquid crystal display (LCD), a speaker and a vibrator; storage devices 908 including, for example, a magnetic tape and a hard disk; and a communication device 909. The communication device 909 may allow the electronic device 900 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 8 shows the electronic device 900 with various devices, it should be understood that it is not required to implement or provide all the devices shown; more or fewer devices may alternatively be implemented or provided.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts can be implemented as computer software programs. For example, an embodiment of the present disclosure includes a computer program product, which includes a computer program carried on a computer-readable medium, the computer program containing program code for executing the methods shown in the flowcharts according to the embodiments of the present disclosure. In such an embodiment, the computer program can be downloaded and installed from a network through the communication device 909, or installed from the storage device 908, or installed from the ROM 902. When the computer program is executed by the processor 901, the above functions defined in the methods of the embodiments of the present disclosure are executed.
It should be noted that the above computer-readable medium of the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example but not limited to, an electric, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium containing or storing a program that can be used by or in combination with an instruction execution system, apparatus or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; it can send, propagate or transmit a program for use by or in combination with an instruction execution system, apparatus or device. The program code contained on the computer-readable medium can be transmitted by any suitable medium, including but not limited to an electric wire, an optical cable, radio frequency (RF) and the like, or any suitable combination of the above.
The above computer-readable medium may be included in the above electronic device, or may exist alone without being assembled into the electronic device.
The above computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device is caused to execute the methods shown in the above embodiments.
Computer program code for executing the operations of the present disclosure can be written in one or more programming languages or combinations thereof, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code can be executed entirely on the user's computer, partly on the user's computer, as an independent software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or media library. In the case involving a remote computer, the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions and operations of the systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment or a part of code that contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two blocks shown in succession may actually be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented with a dedicated hardware-based system that performs the specified functions or operations, or with a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure can be implemented by software or by hardware. The name of a unit does not constitute a limitation on the unit itself in certain cases; for example, the first acquisition unit can also be described as "a unit for acquiring at least two Internet protocol addresses".
The functions described above herein can be executed at least in part by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that can be used include: field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs) and so on.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that may contain or store a program for use by or in combination with an instruction execution system, apparatus or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any suitable combination of the above. More specific examples of the machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
The following are some embodiments of the present disclosure.
In a first aspect, according to one or more embodiments of the present disclosure, a style transfer model training method includes:
acquiring an unpaired training sample image set, where the unpaired training sample image set includes at least a preset style training sample image and a first original training sample image;
training a style image generation model with the unpaired training sample image set, so as to process a second original training sample image with the trained style image generation model to obtain a style sample image corresponding to the second original training sample image;
obtaining a paired training sample image set according to the second original training sample image and the style sample image corresponding to the second original training sample image; and
training a style transfer model with the paired training sample image set to obtain a trained style transfer model.
Optionally, acquiring the unpaired training image set includes:
acquiring raw data of the first original training sample image, and performing data preprocessing on the raw data of the first original training sample image to obtain the first original training sample image; and
forming the unpaired training image set from the first original training sample image and the pre-acquired preset style training sample image.
Optionally, acquiring the raw data of the first original training sample image and performing data preprocessing on the raw data of the first original training sample image to obtain the first original training sample image includes:
performing target detection on the raw data to determine the target area in the first original training sample image; and
based on the obtained target area, performing cropping and alignment processing on the target area to obtain the first original training sample image in the unpaired training image set.
Optionally, training the style transfer model with the paired training image set includes:
performing, on each image pair in the paired training image set, image adjustment processing based on multiple image size dimensions to obtain, for each image pair, image pairs at multiple image sizes; and
training the style transfer model with the image pairs of each image pair in the paired training image set at multiple image sizes.
Optionally, training the style transfer model with the paired training image set includes:
randomly adjusting the image brightness of the target area of the second original training sample image in the paired training image set, and training the style transfer model with the brightness-adjusted second original training sample image and the style sample image corresponding to the second original training sample image.
Optionally, randomly adjusting the image brightness of the target area of the second original training sample image in the paired training image set includes:
performing random brightness processing on the global image of the second original training sample image in the paired training image set; and
performing region-mask-based brightness enhancement processing on the target area image of the second original training sample image in the paired training image set.
Optionally, the style image generation model includes a first generative adversarial network, and the first generative adversarial network includes a first generator and a first discriminator;
accordingly, training the style image generation model with the unpaired training image set includes:
selecting any training sample image from the first original training sample images in the unpaired training image set as a first generated training image;
inputting the first generated training image to the first generator to obtain a first intermediate image;
inputting the first intermediate image and the preset style training sample image corresponding to the first generated training image in the unpaired training image set to the first discriminator to obtain a discrimination result; and
adjusting the parameters of the first generator according to the discrimination result, selecting the next training sample image from the first original training sample images in the unpaired training image set as the first generated training image, and returning to the step of inputting the first generated training image to the first generator until the first generator meets a preset condition.
Optionally, the second original training sample image includes an image selected from the first original training sample images.
Optionally, the style transfer model includes a second generative adversarial network, and the second generative adversarial network includes a second generator and a second discriminator;
training the style transfer model with the paired training image set to obtain the trained style transfer model includes:
selecting any sample image from the second original training sample images in the paired training image set as a second generated training image;
inputting the second generated training image to the second generator to obtain a second intermediate image;
inputting the second intermediate image and the style sample image corresponding to the second original training sample image in the paired training image set to the second discriminator to obtain a discrimination result; and
adjusting the parameters of the second generator according to the discrimination result, selecting the next training sample image from the second original training sample images in the paired training image set as the second generated training image, and returning to the step of inputting the second generated training image to the second generator until the second generator meets a preset condition.
Optionally, inputting the second intermediate image and the style sample image corresponding to the second original training sample image in the paired training image set to the second discriminator to obtain the discrimination result further includes: separately extracting the features of the second intermediate image and of its corresponding style sample image; determining the difference between the features of the second intermediate image and of its corresponding style sample image; and obtaining the discrimination result according to the feature difference and the output of the second discriminator.
In a second aspect, according to one or more embodiments of the present disclosure, a style transfer method includes:
acquiring a target image;
inputting the target image into a trained style transfer model for image style transfer processing, where the trained image style transfer model is obtained according to the style transfer model training method described in any one of the first aspect; and
obtaining a style image corresponding to the target image.
Optionally, acquiring the target image includes:
acquiring an image to be processed;
performing target detection on the image to be processed to determine a target area in the image to be processed; and
preprocessing the target area to obtain the target image.
Optionally, preprocessing the target area includes:
performing size cropping processing on the target area.
Optionally, preprocessing the target area further includes:
performing target segmentation on the cropped target area to obtain a mask image of the target; and
performing gamma correction on the mask image to enhance brightness.
In a third aspect, according to one or more embodiments of the present disclosure, a style transfer model training apparatus includes:
an acquisition module configured to acquire an unpaired training sample image set, where the unpaired training sample image set includes at least a preset style training sample image and a first original training sample image;
a first training module configured to train a style image generation model with the unpaired training sample image set, so as to process a second original training sample image with the trained style image generation model to obtain a style sample image corresponding to the second original training sample image; and
a second training module configured to obtain a paired training sample image set according to the second original training sample image and the style sample image corresponding to the second original training sample image, and to train a style transfer model with the paired training sample image set to obtain a trained style transfer model.
Optionally, the acquisition module is specifically configured to acquire raw data of the first original training sample image, and to perform data preprocessing on the raw data of the first original training sample image to obtain the first original training sample image; and
to form the unpaired training image set from the first original training sample image and the pre-acquired preset style training sample image.
Optionally, the first training module is configured to perform target detection on the raw data to determine the target area in the first original training sample image, and, based on the obtained target area, to perform cropping and alignment processing on the target area to obtain the first original training sample image in the unpaired training image set.
Optionally, the second training module is configured to perform, on each image pair in the paired training image set, image adjustment processing based on multiple image size dimensions to obtain, for each image pair, image pairs at multiple image sizes, and to train the style transfer model with the image pairs of each image pair in the paired training image set at multiple image sizes.
Optionally, the second training module is configured to randomly adjust the image brightness of the target area of the second original training sample image in the paired training image set, and to train the style transfer model with the brightness-adjusted second original training sample image and the style sample image corresponding to the second original training sample image.
Optionally, the second training module is specifically configured to perform random brightness processing on the global image of the second original training sample image in the paired training image set, and to perform region-mask-based brightness enhancement processing on the target area image of the second original training sample image in the paired training image set.
Optionally, the style image generation model includes a first generative adversarial network, and the first generative adversarial network includes a first generator and a first discriminator;
the first training module is specifically configured to: select any training sample image from the first original training sample images in the unpaired training image set as a first generated training image; input the first generated training image to the first generator to obtain a first intermediate image; input the first intermediate image and the preset style training sample image corresponding to the first generated training image in the unpaired training image set to the first discriminator to obtain a discrimination result; adjust the parameters of the first generator according to the discrimination result, select the next training sample image from the first original training sample images in the unpaired training image set as the first generated training image, and return to the step of inputting the first generated training image to the first generator until the first generator meets a preset condition.
Optionally, the second original training sample image includes an image selected from the first original training sample images.
Optionally, the style transfer model includes a second generative adversarial network, and the second generative adversarial network includes a second generator and a second discriminator;
the second training module is specifically configured to: select any sample image from the second original training sample images in the paired training image set as a second generated training image; input the second generated training image to the second generator to obtain a second intermediate image; input the second intermediate image and the style sample image corresponding to the second original training sample image in the paired training image set to the second discriminator to obtain a discrimination result; adjust the parameters of the second generator according to the discrimination result, select the next training sample image from the second original training sample images in the paired training image set as the second generated training image, and return to the step of inputting the second generated training image to the second generator until the second generator meets a preset condition.
Optionally, the second training module is specifically configured to: separately extract the features of the second intermediate image and of its corresponding style sample image; determine the difference between the features of the second intermediate image and of its corresponding style sample image; and obtain the discrimination result according to the feature difference and the output of the second discriminator.
In a fourth aspect, according to one or more embodiments of the present disclosure, an image style transfer apparatus includes:
an acquisition module configured to acquire a target image; and
a processing module configured to input the target image into a trained style transfer model for image style transfer processing, where the trained image style transfer model is obtained according to the style transfer model training method described in any one of the first aspect;
where the acquisition module is further configured to obtain a style image corresponding to the target image.
Optionally, the acquisition module is specifically configured to: acquire an image to be processed; perform target detection on the image to be processed to determine a target area in the image to be processed; and preprocess the target area to obtain the target image.
Optionally, the processing module is configured to perform size cropping processing on the target area.
Optionally, the acquisition module is specifically configured to perform target segmentation on the cropped target area to obtain a mask image of the target, and to perform gamma correction on the mask image to enhance brightness.
In a fifth aspect, according to one or more embodiments of the present disclosure, an electronic device includes at least one processor and a memory;
the memory stores computer-executable instructions; and
the at least one processor executes the computer-executable instructions stored in the memory, causing the at least one processor to perform the method according to any one of the preceding items.
In a sixth aspect, according to one or more embodiments of the present disclosure, a computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement the method according to any one of the preceding items.
In a seventh aspect, according to one or more embodiments of the present disclosure, a computer program product includes computer instructions which, when executed by a processor, implement the method according to any one of the preceding items.
In an eighth aspect, according to one or more embodiments of the present disclosure, a computer program, when executed by a processor, implements the method according to any one of the preceding items.
The above description is merely a description of the preferred embodiments of the present disclosure and of the technical principles applied. Those skilled in the art should understand that the scope of disclosure involved in the present disclosure is not limited to technical solutions formed by specific combinations of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above disclosed concept, for example, technical solutions formed by replacing the above features with technical features with similar functions disclosed in (but not limited to) the present disclosure.
Furthermore, although the operations are depicted in a specific order, this should not be understood as requiring that the operations be executed in the specific order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although the above discussion contains several specific implementation details, these should not be interpreted as limiting the scope of the present disclosure. Certain features described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable sub-combination.
Although the subject matter has been described in language specific to structural features and/or logical actions of methods, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or actions described above. On the contrary, the specific features and actions described above are merely example forms of implementing the claims.

Claims (20)

  1. A style transfer model training method, comprising:
    acquiring an unpaired training sample image set, wherein the unpaired training sample image set comprises at least a preset style training sample image and a first original training sample image;
    training a style image generation model with the unpaired training sample image set, so as to process a second original training sample image with the trained style image generation model to obtain a style sample image corresponding to the second original training sample image;
    obtaining a paired training sample image set according to the second original training sample image and the style sample image corresponding to the second original training sample image; and
    training a style transfer model with the paired training sample image set to obtain a trained style transfer model.
  2. The style transfer model training method according to claim 1, wherein the acquiring an unpaired training image set comprises:
    acquiring raw data of the first original training sample image, and performing data preprocessing on the raw data of the first original training sample image to obtain the first original training sample image; and
    forming the unpaired training image set from the first original training sample image and a pre-acquired preset style training sample image.
  3. The style transfer model training method according to claim 2, wherein the acquiring raw data of the first original training sample image, and performing data preprocessing on the raw data of the first original training sample image to obtain the first original training sample image comprises:
    performing target detection on the raw data to determine a target area in the first original training sample image; and
    based on the obtained target area, performing cropping and alignment processing on the target area to obtain the first original training sample image in the unpaired training image set.
  4. The style transfer model training method according to any one of claims 1 to 3, wherein the training a style transfer model with the paired training image set comprises:
    performing, on each image pair in the paired training image set, image adjustment processing based on multiple image size dimensions to obtain, for each image pair, image pairs at multiple image size dimensions; and
    training the style transfer model with the image pairs of each image pair in the paired training image set at multiple image size dimensions.
  5. The style transfer model training method according to any one of claims 1 to 4, wherein the training a style transfer model with the paired training image set comprises:
    randomly adjusting image brightness of a target area of the second original training sample image in the paired training image set, and training the style transfer model with the second original training sample image whose image brightness has been randomly adjusted and the style sample image corresponding to the second original training sample image.
  6. The style transfer model training method according to claim 5, wherein the randomly adjusting image brightness of a target area of the second original training sample image in the paired training image set comprises:
    performing random brightness processing on a global image of the second original training sample image in the paired training image set; and
    performing region-mask-based brightness enhancement processing on a target area image of the second original training sample image in the paired training image set.
  7. The style transfer model training method according to any one of claims 1 to 6, wherein the style image generation model comprises a first generative adversarial network, and the first generative adversarial network comprises a first generator and a first discriminator;
    correspondingly, the training a style image generation model with the unpaired training image set comprises:
    selecting any training sample image from the first original training sample images in the unpaired training image set as a first generated training image;
    inputting the first generated training image to the first generator to obtain a first intermediate image;
    inputting the first intermediate image and the preset style training sample image corresponding to the first generated training image in the unpaired training image set to the first discriminator to obtain a discrimination result; and
    adjusting parameters of the first generator according to the discrimination result, selecting a next training sample image from the first original training sample images in the unpaired training image set as the first generated training image, and returning to the step of inputting the first generated training image to the first generator until the first generator meets a preset condition.
  8. The style transfer model training method according to any one of claims 1 to 7, wherein the second original training sample image comprises an image selected from the first original training sample images.
  9. The style transfer model training method according to any one of claims 1 to 8, wherein the style transfer model comprises a second generative adversarial network, and the second generative adversarial network comprises a second generator and a second discriminator;
    the training a style transfer model with the paired training image set to obtain a trained style transfer model comprises:
    selecting any sample image from the second original training sample images in the paired training image set as a second generated training image;
    inputting the second generated training image to the second generator to obtain a second intermediate image;
    inputting the second intermediate image and the style sample image corresponding to the second original training sample image in the paired training image set to the second discriminator to obtain a discrimination result; and
    adjusting parameters of the second generator according to the discrimination result, selecting a next training sample image from the second original training sample images in the paired training image set as the second generated training image, and returning to the step of inputting the second generated training image to the second generator until the second generator meets a preset condition.
  10. The style transfer model training method according to claim 9, wherein the inputting the second intermediate image and the style sample image corresponding to the second original training sample image in the paired training image set to the second discriminator to obtain a discrimination result further comprises:
    separately extracting features of the second intermediate image and of its corresponding style sample image; and
    determining a difference between the features of the second intermediate image and of its corresponding style sample image, and obtaining the discrimination result according to the feature difference and a result output by the second discriminator.
  11. An image style transfer method, comprising:
    acquiring a target image;
    inputting the target image into a trained style transfer model for image style transfer processing, wherein the trained image style transfer model is obtained according to the style transfer model training method according to any one of claims 1 to 10; and
    obtaining a style image corresponding to the target image.
  12. The image style transfer method according to claim 11, wherein the acquiring a target image comprises:
    acquiring an image to be processed;
    performing target detection on the image to be processed to determine a target area in the image to be processed; and
    preprocessing the target area to obtain the target image.
  13. The image style transfer method according to claim 12, wherein the preprocessing the target area comprises:
    performing size cropping processing on the target area.
  14. The image style transfer method according to claim 13, wherein the preprocessing the target area further comprises:
    performing target segmentation on the cropped target area to obtain a mask image of the target; and
    performing gamma correction on the mask image to enhance brightness.
  15. A style transfer model training apparatus, comprising:
    an acquisition module configured to acquire an unpaired training sample image set, wherein the unpaired training sample image set comprises at least a preset style training sample image and a first original training sample image;
    a first training module configured to train a style image generation model with the unpaired training sample image set, so as to process a second original training sample image with the trained style image generation model to obtain a style sample image corresponding to the second original training sample image; and
    a second training module configured to obtain a paired training sample image set according to the second original training sample image and the style sample image corresponding to the second original training sample image, and to train a style transfer model with the paired training sample image set to obtain a trained style transfer model.
  16. An image style transfer apparatus, comprising:
    an acquisition module configured to acquire a target image; and
    a processing module configured to input the target image into a trained style transfer model for image style transfer processing, wherein the trained image style transfer model is obtained according to the style transfer model training method according to any one of claims 1 to 10;
    wherein the acquisition module is further configured to obtain a style image corresponding to the target image.
  17. An electronic device, comprising:
    at least one processor; and
    a memory;
    wherein the memory stores computer-executable instructions; and
    the at least one processor executes the computer-executable instructions stored in the memory, causing the at least one processor to perform the style transfer model training method according to any one of claims 1 to 10, or the image style transfer method according to any one of claims 11 to 14.
  18. A computer-readable storage medium, wherein the computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement the style transfer model training method according to any one of claims 1 to 10, or the image style transfer method according to any one of claims 11 to 14.
  19. A computer program product comprising computer instructions, wherein the computer instructions, when executed by a processor, implement the style transfer model training method according to any one of claims 1 to 10, or the image style transfer method according to any one of claims 11 to 14.
  20. A computer program, wherein the computer program, when executed by a processor, implements the style transfer model training method according to any one of claims 1 to 10, or the image style transfer method according to any one of claims 11 to 14.
PCT/CN2022/093144 2021-07-28 2022-05-16 Style transfer model training method, image style transfer method and apparatus WO2023005358A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110858177.5 2021-07-28
CN202110858177.5A CN115689863A (zh) 2021-07-28 2021-07-28 Style transfer model training method, image style transfer method and apparatus

Publications (1)

Publication Number Publication Date
WO2023005358A1 true WO2023005358A1 (zh) 2023-02-02

Family

ID=85057897

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/093144 WO2023005358A1 (zh) 2022-05-16 Style transfer model training method, image style transfer method and apparatus

Country Status (2)

Country Link
CN (1) CN115689863A (zh)
WO (1) WO2023005358A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109741244A (zh) * 2018-12-27 2019-05-10 广州小狗机器人技术有限公司 Picture generation method and apparatus, storage medium and electronic device
US20190244060A1 (en) * 2018-02-02 2019-08-08 Nvidia Corporation Domain Stylization Using a Neural Network Model
CN111814660A (zh) * 2020-07-07 2020-10-23 集美大学 Image recognition method, terminal device and storage medium
CN112418310A (zh) * 2020-11-20 2021-02-26 第四范式(北京)技术有限公司 Text style transfer model training method and system, and image generation method and system
CN112989904A (zh) * 2020-09-30 2021-06-18 北京字节跳动网络技术有限公司 Style image generation method, model training method, apparatus, device and medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190244060A1 (en) * 2018-02-02 2019-08-08 Nvidia Corporation Domain Stylization Using a Neural Network Model
CN109741244A (zh) * 2018-12-27 2019-05-10 广州小狗机器人技术有限公司 Picture generation method and apparatus, storage medium and electronic device
CN111814660A (zh) * 2020-07-07 2020-10-23 集美大学 Image recognition method, terminal device and storage medium
CN112989904A (zh) * 2020-09-30 2021-06-18 北京字节跳动网络技术有限公司 Style image generation method, model training method, apparatus, device and medium
CN112418310A (zh) * 2020-11-20 2021-02-26 第四范式(北京)技术有限公司 Text style transfer model training method and system, and image generation method and system

Also Published As

Publication number Publication date
CN115689863A (zh) 2023-02-03

Similar Documents

Publication Publication Date Title
CN111275784B (zh) Method and apparatus for generating images
CN110070896B (zh) Image processing method and apparatus, hardware apparatus
JP2023547917A (ja) Image segmentation method, apparatus, device and storage medium
WO2023125374A1 (zh) Image processing method and apparatus, electronic device and storage medium
WO2020248900A1 (zh) Panoramic video processing method and apparatus, and storage medium
EP4243398A1 (en) Video processing method and apparatus, electronic device, and storage medium
CN112839223B (zh) Image compression method and apparatus, storage medium and electronic device
US20240013359A1 (en) Image processing method, model training method, apparatus, medium and device
CN111414879A (zh) Face occlusion degree recognition method and apparatus, electronic device and readable storage medium
WO2023143129A1 (zh) Image processing method and apparatus, electronic device and storage medium
WO2023273697A1 (zh) Image processing method, model training method, apparatus, electronic device and medium
CN105979283A (zh) Video transcoding method and apparatus
CN115731341A (zh) Three-dimensional head reconstruction method, apparatus, device and medium
CN113902636A (zh) Image deblurring method and apparatus, computer-readable medium and electronic device
US11443537B2 (en) Electronic apparatus and controlling method thereof
CN110689478B (zh) Image stylization processing method and apparatus, electronic device and readable medium
WO2023138441A1 (zh) Video generation method, apparatus, device and storage medium
CN110069641B (zh) Image processing method and apparatus, and electronic device
WO2023005358A1 (zh) Style transfer model training method, image style transfer method and apparatus
CN110059739B (zh) Image synthesis method and apparatus, electronic device and computer-readable storage medium
WO2023143118A1 (zh) Image processing method and apparatus, device and medium
WO2020155908A1 (zh) Method and apparatus for generating information
WO2022262473A1 (zh) Image processing method and apparatus, device and storage medium
CN115953597B (zh) Image processing method and apparatus, device and medium
CN115760553A (zh) Special effect processing method, apparatus, device and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22847971

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE