RU2023121327A - METHOD AND DEVICE FOR TRAINING AN IMAGE GENERATION MODEL, METHOD AND DEVICE FOR GENERATING IMAGES, AND DEVICES THEREOF

Info

Publication number: RU2023121327A
Application number: RU2023121327A
Authority: RU
Inventors: An LI; Yule LI; Wei XIANG
Original assignee: Bigo Technology Pte. Ltd.
Priority date: 2021-02-02
Filing date: 2022-01-28
Publication date: 2023-10-04

Claims

1. A method for training an image generation model, comprising:

obtaining a first transformation model by training, wherein the first transformation model is configured to generate a first training image based on a first noise sample, and the first training image is a first style image;

obtaining a reconstruction model by training based on the first transformation model, wherein the reconstruction model is configured to map an original image sample to a latent variable corresponding to the original image sample;

obtaining a second transformation model by training, wherein the second transformation model is configured to generate a second training image based on a second noise sample, and the second training image is a second style image;

generating a spliced transformation model by splicing the first transformation model with the second transformation model; and

creating an image generation model based on the reconstruction model and the spliced transformation model, wherein the image generation model is configured to transform the first style image to be transformed into a second style target image.

2. The method according to claim 1, in which:

the first transformation model contains a first mapping network and a first synthesis network; and

obtaining the first transformation model by training involves:

obtaining a first set of training samples, wherein the first set of training samples contains a plurality of first noise samples;

obtaining latent variables corresponding to the plurality of first noise samples by inputting the plurality of first noise samples into the first mapping network;

obtaining first training images corresponding to the plurality of first noise samples by inputting the latent variables corresponding to the plurality of first noise samples into the first synthesis network; and

adjusting the weight parameter of the first transformation model based on the first training images corresponding to the plurality of first noise samples.
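
The data flow of claim 2 (noise sample, then latent variable, then training image) can be illustrated with a minimal sketch in plain Python. This is not the claimed implementation: the class name `ToyTransformationModel`, the use of a single linear layer for each network, the dimensions, and the weight initialisation are all assumptions made for the example, since the claims do not specify an architecture.

```python
import random
from typing import List

Vector = List[float]
Matrix = List[Vector]


def init_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    """Small random weights; the initialisation scheme is an assumption."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(weights: Matrix, x: Vector) -> Vector:
    """Matrix-vector product: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class ToyTransformationModel:
    """Hypothetical stand-in for a transformation model (mapping + synthesis)."""

    def __init__(self, noise_dim: int, latent_dim: int, num_pixels: int, seed: int = 0):
        rng = random.Random(seed)
        self.mapping = init_matrix(latent_dim, noise_dim, rng)     # noise -> latent
        self.synthesis = init_matrix(num_pixels, latent_dim, rng)  # latent -> image

    def map_noise(self, noise: Vector) -> Vector:
        """Mapping network: noise sample to latent variable."""
        return linear(self.mapping, noise)

    def synthesize(self, latent: Vector) -> Vector:
        """Synthesis network: latent variable to a flat list of pixel values."""
        return linear(self.synthesis, latent)

    def generate(self, noise: Vector) -> Vector:
        return self.synthesize(self.map_noise(noise))


# A set of two noise samples, each of dimension 4.
rng = random.Random(42)
noise_batch = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(2)]

model = ToyTransformationModel(noise_dim=4, latent_dim=3, num_pixels=6)
training_images = [model.generate(z) for z in noise_batch]
```

Each entry of `training_images` is the toy counterpart of a first training image; the weight adjustment step itself is not shown here.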

3. The method according to claim 2, in which:

the first transformation model contains a first discrimination network; and

adjusting the weight parameter of the first transformation model based on the first training images corresponding to the plurality of first noise samples involves:

obtaining first discrimination losses corresponding to the plurality of first noise samples by inputting the first training images corresponding to the plurality of first noise samples into the first discrimination network; and

adjusting the weight parameter of the first transformation model based on the first discrimination losses corresponding to the plurality of first noise samples.
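
A toy sketch of the loop in claim 3 follows, under strong simplifying assumptions that are not taken from the claims: the generator is a single scalar weight (`G(z) = gen_weight * z`), the discriminator is a fixed linear scorer, and the discrimination loss is the non-saturating form `softplus(-D(G(z)))`. In practice the discrimination network would normally be trained as well; it is held fixed here only to keep the example short.

```python
import math
from typing import List

DISC_SLOPE = 1.0  # hypothetical fixed discriminator: D(x) = DISC_SLOPE * x + DISC_BIAS
DISC_BIAS = 0.0


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def softplus(x: float) -> float:
    return math.log1p(math.exp(x))


def discriminator_score(image: float) -> float:
    """Higher score means the toy discriminator rates the image as more realistic."""
    return DISC_SLOPE * image + DISC_BIAS


def discrimination_loss(gen_weight: float, noise_batch: List[float]) -> float:
    """Mean of softplus(-D(G(z))) over the noise samples."""
    scores = [discriminator_score(gen_weight * z) for z in noise_batch]
    return sum(softplus(-s) for s in scores) / len(scores)


def loss_gradient(gen_weight: float, noise_batch: List[float]) -> float:
    """Analytic derivative of discrimination_loss with respect to gen_weight."""
    grads = []
    for z in noise_batch:
        score = discriminator_score(gen_weight * z)
        # d softplus(-s)/ds = -sigmoid(-s); ds/d(gen_weight) = DISC_SLOPE * z
        grads.append(-sigmoid(-score) * DISC_SLOPE * z)
    return sum(grads) / len(grads)


noise_batch = [0.5, 1.0, -0.2]
gen_weight = 0.0
loss_before = discrimination_loss(gen_weight, noise_batch)

# Adjust the weight parameter by gradient descent on the discrimination loss.
for _ in range(20):
    gen_weight -= 0.5 * loss_gradient(gen_weight, noise_batch)

loss_after = discrimination_loss(gen_weight, noise_batch)
```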

4. The method according to claim 1, in which obtaining a reconstruction model by training based on the first transformation model involves:

obtaining a second set of training samples, wherein the second set of training samples contains a plurality of original image samples;

generating latent variables corresponding to the plurality of original image samples by inputting the plurality of original image samples into the reconstruction model;

generating reconstructed images corresponding to the plurality of original image samples by inputting the latent variables corresponding to the plurality of original image samples into the first transformation model, wherein the plurality of original image samples and the reconstructed images corresponding to the plurality of original image samples are first style images;

determining a loss of the reconstruction model corresponding to the plurality of original image samples based on the plurality of original image samples and the reconstructed images corresponding to the plurality of original image samples; and

adjusting the weight parameter of the reconstruction model based on the loss of the reconstruction model corresponding to the plurality of original image samples.
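
The training procedure of claim 4 can be sketched with scalars, assuming (for illustration only) a frozen generator `G(w) = 2.0 * w`, an encoder `E(x) = encoder_weight * x` standing in for the reconstruction model, and a plain squared-error reconstruction loss. None of these choices come from the claims. Only the encoder weight is updated; the generator stays fixed.

```python
from typing import List

GENERATOR_WEIGHT = 2.0  # frozen toy first transformation model: G(w) = 2.0 * w


def generate(latent: float) -> float:
    """Toy first transformation model, mapping a latent variable to an 'image'."""
    return GENERATOR_WEIGHT * latent


def train_reconstruction_model(samples: List[float], steps: int = 50, lr: float = 0.05) -> float:
    """Fit a scalar encoder E(x) = encoder_weight * x so that G(E(x)) approximates x."""
    encoder_weight = 0.0
    for _ in range(steps):
        grad = 0.0
        for x in samples:
            latent = encoder_weight * x        # reconstruction model output
            reconstructed = generate(latent)   # reconstructed image
            # Derivative of (reconstructed - x)^2 with respect to encoder_weight.
            grad += 2.0 * (reconstructed - x) * GENERATOR_WEIGHT * x
        encoder_weight -= lr * grad / len(samples)
    return encoder_weight


original_samples = [0.5, 1.0, 1.5]
encoder_weight = train_reconstruction_model(original_samples)
reconstructions = [generate(encoder_weight * x) for x in original_samples]
```

After training, the encoder approximately inverts the frozen generator, so each reconstruction is close to its original sample.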

5. The method according to claim 4, in which:

the first transformation model includes a first discrimination network; and

determining the loss of the reconstruction model corresponding to the plurality of original image samples, based on the plurality of original image samples and the reconstructed images corresponding to the plurality of original image samples, involves:

determining a first sub-loss based on an output obtained by inputting each of the reconstructed images corresponding to the plurality of original image samples into the first discrimination network, the first sub-loss indicating a first characteristic of the reconstructed image;

determining a second sub-loss based on an output obtained by inputting each of the plurality of original image samples and each of the reconstructed images corresponding to the plurality of original image samples into a perceptual network, the second sub-loss indicating a first degree of correspondence, with respect to a target attribute, between the original image sample and the reconstructed image corresponding to the original image sample;

determining a third sub-loss based on an output obtained by inputting each of the plurality of original image samples and each of the reconstructed images corresponding to the plurality of original image samples into a regression function, the third sub-loss indicating a second degree of correspondence, with respect to the target attribute, between the original image sample and the reconstructed image corresponding to the original image sample; and

determining the loss of the reconstruction model based on the first sub-loss, the second sub-loss and the third sub-loss.
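
One possible way to combine the three sub-losses of claim 5 is a weighted sum, sketched below. The claims do not give formulas, so every concrete choice here is an assumption: the first sub-loss is taken as `softplus(-score)` on the discriminator output, the second as a mean squared distance between feature vectors, the regression function as a mean absolute (L1) pixel difference, and `toy_features` is a hypothetical stand-in for a perceptual network.

```python
import math
from typing import Callable, List, Sequence

Image = List[float]


def adversarial_sub_loss(disc_score: float) -> float:
    """First sub-loss: small when the discriminator rates the reconstruction as realistic."""
    return math.log1p(math.exp(-disc_score))


def perceptual_sub_loss(original: Image, reconstructed: Image,
                        features: Callable[[Image], List[float]]) -> float:
    """Second sub-loss: mean squared distance between feature vectors."""
    f_orig, f_rec = features(original), features(reconstructed)
    return sum((a - b) ** 2 for a, b in zip(f_orig, f_rec)) / len(f_orig)


def regression_sub_loss(original: Image, reconstructed: Image) -> float:
    """Third sub-loss: mean absolute pixel difference (assumed L1 form)."""
    return sum(abs(a - b) for a, b in zip(original, reconstructed)) / len(original)


def reconstruction_loss(original: Image, reconstructed: Image, disc_score: float,
                        features: Callable[[Image], List[float]],
                        weights: Sequence[float] = (1.0, 1.0, 1.0)) -> float:
    """Weighted sum of the three sub-losses; the weights are hypothetical."""
    w_adv, w_perc, w_reg = weights
    return (w_adv * adversarial_sub_loss(disc_score)
            + w_perc * perceptual_sub_loss(original, reconstructed, features)
            + w_reg * regression_sub_loss(original, reconstructed))


def toy_features(image: Image) -> List[float]:
    """Hypothetical 'perceptual network': mean intensity and intensity range."""
    return [sum(image) / len(image), max(image) - min(image)]


original = [0.1, 0.5, 0.9]
loss_exact = reconstruction_loss(original, [0.1, 0.5, 0.9], 0.0, toy_features)
loss_off = reconstruction_loss(original, [0.2, 0.5, 0.9], 0.0, toy_features)
```

With an exact reconstruction the second and third sub-losses vanish, so only the adversarial term remains.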

6. The method according to any one of claims 1-5, in which, during the training process, the initial weight parameter of the second transformation model is the weight parameter of the first transformation model.
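
The initialisation in claim 6 amounts to starting the second model from a copy of the first model's trained weights. A minimal sketch, assuming weights are held in a plain dictionary (the layer names and values are hypothetical):

```python
import copy
from typing import Dict, List

Weights = Dict[str, List[float]]

# Hypothetical trained weights of the first transformation model.
first_model_weights: Weights = {
    "mapping": [0.2, -0.1],
    "synthesis": [0.7, 0.3, -0.5],
}

# Deep copy, so that later updates to the second model leave the first model untouched.
second_model_weights: Weights = copy.deepcopy(first_model_weights)

# One illustrative fine-tuning update on second-style data.
second_model_weights["synthesis"][0] -= 0.05
```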

7. The method according to any one of claims 1-5, in which generating a spliced transformation model by splicing the first transformation model with the second transformation model involves:

generating a spliced transformation model by splicing n weight network layers from among the plurality of weight network layers in the first transformation model with m weight network layers from among the plurality of weight network layers in the second transformation model, wherein a different number of n weight network layers and m weight network layers are provided, the value n is a positive integer and the value m is a positive integer; or

generating a spliced transformation model by performing a sum operation or an averaging operation or a difference operation with respect to the weight parameters of the plurality of weight network layers in the first transformation model and the corresponding weight parameters of the plurality of weight network layers in the second transformation model.
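
Both alternatives in claim 7 can be illustrated on toy weight dictionaries. In this sketch the layer names, the four-layer depth, and the choice to take the leading layers from the first model and the trailing layers from the second are assumptions for the example; the claim only requires n layers from one model and m from the other, or an element-wise sum, average, or difference of corresponding weights.

```python
from typing import Callable, Dict, List

Weights = Dict[str, List[float]]

LAYER_ORDER = ["layer0", "layer1", "layer2", "layer3"]  # hypothetical layer names


def splice_layers(first: Weights, second: Weights, n: int) -> Weights:
    """Take the first n layers from `first` and the remaining m layers from `second`."""
    m = len(LAYER_ORDER) - n
    if n < 1 or m < 1:
        raise ValueError("both n and m must be positive")
    return {
        name: list((first if index < n else second)[name])
        for index, name in enumerate(LAYER_ORDER)
    }


COMBINERS: Dict[str, Callable[[float, float], float]] = {
    "sum": lambda a, b: a + b,
    "average": lambda a, b: (a + b) / 2.0,
    "difference": lambda a, b: a - b,
}


def combine_weights(first: Weights, second: Weights, operation: str) -> Weights:
    """Apply the chosen operation to corresponding weights of every layer."""
    op = COMBINERS[operation]
    return {
        name: [op(a, b) for a, b in zip(first[name], second[name])]
        for name in LAYER_ORDER
    }


first = {name: [1.0, 1.0] for name in LAYER_ORDER}
second = {name: [3.0, 3.0] for name in LAYER_ORDER}

spliced = splice_layers(first, second, n=1)           # n = 1, m = 3
averaged = combine_weights(first, second, "average")
```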

8. The method according to any one of claims 1-5, in which creating an image generation model based on the reconstruction model and the spliced transformation model involves:

obtaining a combined transformation model by combining the reconstruction model and the spliced transformation model;

obtaining a fourth set of training samples, the fourth set of training samples including at least one original image sample and a second style image corresponding to the at least one original image sample; and

creating an image generation model by adjusting the combined transformation model using the fourth set of training samples.

9. A method for generating images, comprising:

generating a latent variable corresponding to the image to be transformed by inputting the first style image to be transformed into the reconstruction model; and

generating, based on the latent variable corresponding to the image to be transformed, a target image corresponding to the image to be transformed using a spliced transformation model, the target image being a second style image;

wherein the spliced transformation model is a model generated by splicing a first transformation model with a second transformation model; the first transformation model is configured to generate a first style image in accordance with a first noise sample; and the second transformation model is configured to generate a second style image in accordance with a second noise sample.
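
The two-step inference of claim 9 (image to latent variable, then latent variable to target image) is a simple composition. The sketch below uses hypothetical stand-in callables for the reconstruction model and the spliced transformation model; their behaviour is invented purely to make the example runnable.

```python
from typing import Callable, List

Image = List[float]
Latent = List[float]


def generate_target_image(image: Image,
                          reconstruction_model: Callable[[Image], Latent],
                          spliced_model: Callable[[Latent], Image]) -> Image:
    """Encode the first style image, then decode it with the spliced model."""
    latent = reconstruction_model(image)
    return spliced_model(latent)


def toy_reconstruction_model(image: Image) -> Latent:
    """Hypothetical encoder: the latent variable is the mean pixel value."""
    return [sum(image) / len(image)]


def toy_spliced_model(latent: Latent) -> Image:
    """Hypothetical generator: a flat three-pixel image at the latent intensity."""
    return [latent[0]] * 3


target = generate_target_image([0.2, 0.4, 0.6], toy_reconstruction_model, toy_spliced_model)
```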

10. A device for training an image generation model, comprising:

a model training module configured to obtain a first transformation model by training, wherein the first transformation model is configured to generate a first training image in accordance with a first noise sample, and the first training image is a first style image, wherein:

the model training module is further configured to obtain a reconstruction model by training based on the first transformation model, wherein the reconstruction model is configured to map an original image sample to a latent variable corresponding to the original image sample; and

the model training module is further configured to obtain a second transformation model by training, wherein the second transformation model is configured to generate a second training image in accordance with a second noise sample, and the second training image is a second style image; and

a model generation module configured to generate a spliced transformation model by splicing a first transformation model with a second transformation model; wherein:

the model generation module is further configured to create an image generation model based on the reconstruction model and the spliced transformation model, wherein the image generation model is configured to convert the first style image to be transformed into a second style target image.

11. An image generating device comprising:

a variable generation module configured to generate a latent variable corresponding to the image to be transformed by inputting the first style image to be transformed into the reconstruction model; and

an image generating module configured to generate, based on a latent variable corresponding to an image to be transformed, a target image corresponding to the image to be transformed using a spliced transformation model, wherein the target image is a second style image;

wherein the spliced transformation model is a model generated by splicing a first transformation model with a second transformation model, the first transformation model is configured to generate a first style image in accordance with a first noise sample, and the second transformation model is configured to generate a second style image in accordance with a second noise sample.

12. A computer device for training an image generation model, comprising a processor and a memory in which one or more computer programs are stored, wherein the one or more computer programs, when loaded and executed by the processor of the computer device, cause the computer device to implement the method for training an image generation model according to any one of claims 1-8.

13. A computer-readable storage medium storing one or more computer programs, wherein the one or more computer programs, when loaded and executed by a processor, cause the processor to implement the method for training an image generation model according to any one of claims 1-8.

14. A computer device for generating images, comprising a processor and a memory, wherein the memory stores one or more computer programs, and the one or more computer programs, when loaded and executed by the processor of the computer device, cause the computer device to implement the image generation method of claim 9.

15. A computer-readable storage medium storing one or more computer programs, wherein the one or more computer programs, when loaded and executed by a processor, cause the processor to implement the image generation method of claim 9.