CN117788629A - Image generation method, device and storage medium with style personalization - Google Patents


Info

Publication number: CN117788629A (application CN202410217621.9A); granted as CN117788629B
Authority: CN; other languages: Chinese (zh)
Prior art keywords: image, style, noise, prediction, network
Legal status: Granted; Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Inventors: Xu Xiaolong (徐小龙), Xu Yifei (许逸非)
Assignee (current and original): Nanjing University of Posts and Telecommunications
Application filed by Nanjing University of Posts and Telecommunications; priority to CN202410217621.9A

Landscapes

  • Image Processing (AREA)
Abstract

The invention discloses an image generation method, device and storage medium with style personalization in the technical field of artificial intelligence, aiming at the technical problems of quality and accuracy in image generation. The method comprises the following steps: selecting a style image and inputting it into a pre-constructed VGG19_f3 network model to obtain feature maps of different sizes; calculating the Gram matrices of the feature maps to extract their style information; inputting the text encoding obtained by a text encoder, a noise image, and the style information into a pre-constructed noise prediction network with a style guidance module for noise prediction to obtain predicted noise; repeatedly denoising the noise image to obtain a latent space image; and decoding the latent space image with an image decoder to obtain the finally generated style personalized image. The method and device enable the model to generate style personalized images end to end, give the model stronger style personalization capability, and at the same time guarantee the quality and accuracy of the generated images.

Description

Image generation method, device and storage medium with style personalization
Technical Field
The invention relates to an image generation method, device and storage medium with style personalization, belonging to the technical field of artificial intelligence.
Background
Intelligent illustration generation for electronic publications is essentially a process of generating images from text. Existing large-scale image generation models can produce high-quality and diverse images from natural-language text prompts. However, to attract more readers, illustration generation for electronic publications requires the model to generate illustrations in a personalized style according to the user's personal style preference. Against this background, existing large-scale image generation models lack style personalization capability and cannot meet the application requirements.
For the image generation problem, existing research is usually built on pre-trained large image generation models to fully exploit their generation capability. Most such solutions must train the model in several stages or modules, or fine-tune the pre-trained large model in several steps, so their complexity is high compared with an end-to-end model; moreover, the model weights must be re-tuned for every new style, which is time- and memory-consuming. The scalability and practicality of these methods are therefore greatly limited.
Disclosure of Invention
The invention aims to overcome the above defects in the prior art by providing an image generation method, device and storage medium with style personalization, which guide the image generation process with style features and do not require retraining the model for each new style; at the same time, the constructed noise prediction network with a style guidance module gives the model stronger style personalization capability while guaranteeing the quality and accuracy of the generated images.
In order to achieve the above purpose, the invention is realized by adopting the following technical scheme:
in a first aspect, the present invention provides an image generation method with style personalization, including:
selecting a style image, and inputting the style image into a pre-constructed VGG19_f3 network model to obtain feature images with different sizes;
calculating a Gram matrix of the feature map to extract style information of the feature map;
inputting the text encoding obtained by the text encoder, a randomly sampled noise image, and the style information into a pre-constructed noise prediction network with a style guidance module for noise prediction to obtain predicted noise;
carrying out repeated denoising operation on the noise image by utilizing the predicted noise to obtain a latent space image;
and decoding the latent space image through an image decoder in a large image generation model to obtain a finally generated style personalized image.
With reference to the first aspect, further, the obtaining feature maps with different sizes includes:
setting the pixel size of the style image to a set size;
the first three downsampling blocks of the VGG19 network model are utilized to form a VGG19_f3 network model, the style image is input into the VGG19_f3 network model, and a first feature map with a large designated size is outputSecond characteristic map->And third characteristic diagram->
With reference to the first aspect, the extracting style information of the feature map includes:
calculating the first feature mapSecond characteristic map->And third characteristic diagram->Is used for obtaining the first style characteristic +.>Second windLattice characterization->And third style characteristics->
Characterizing the first styleSecond style characteristics->After the maximum pooling operation, the third style characteristic is +.>And adding to obtain style information S.
With reference to the first aspect, the Gram matrix of a feature map is computed as follows:

$$G_{ij} = \frac{1}{HW} \sum_{k=1}^{HW} F_{i,k}\, F_{j,k}$$

wherein $G_{ij}$ is the element in row $i$, column $j$ of the Gram matrix of the feature map $F$; $F_{i,k}$ is the $k$-th element of channel $i$ of $F$; $F_{j,k}$ is the $k$-th element of channel $j$ of $F$; $H$ is the height of the feature map $F$; $W$ is the width of the feature map $F$.
With reference to the first aspect, the construction process of the noise prediction network with the style guidance module is as follows:
and copying a downsampling block and an intermediate block of a noise prediction network in the large image generation model to obtain a downsampling network, wherein the downsampling network and the noise prediction network in the large image generation model form the noise prediction network with the style guiding module.
With reference to the first aspect, the obtaining the prediction noise includes:
inputting the style information into the downsampling network to obtain first style information $s_1$, second style information $s_2$, third style information $s_3$, fourth style information $s_4$ and fifth style information $s_5$;
inputting the text encoding $c$ obtained by the text encoder and the randomly sampled noise image $x_T$ sequentially through the downsampling blocks and the intermediate block of the noise prediction network to obtain a first prediction noise;
adding the fifth style information $s_5$, fourth style information $s_4$, third style information $s_3$, second style information $s_2$ and first style information $s_1$ in sequence to the first prediction noise to output the final predicted noise $\epsilon_\theta$.
With reference to the first aspect, the obtaining a latent space image includes:
performing a first denoising operation on the noise image using the predicted noise to obtain a first denoised image $x_{T-1}$; inputting the first denoised image $x_{T-1}$, the style information $S$ and the text encoding $c$ into the pre-constructed noise prediction network with the style guidance module, and outputting a first denoising prediction result; performing a second denoising operation on the first denoised image $x_{T-1}$ using the first denoising prediction result to obtain a second denoised image $x_{T-2}$;
repeating the denoising operation a set number of times to obtain the latent space image $x_0$.
In combination with the first aspect, the expression of the denoising operation is as follows:

$$x_{t-1} = \frac{1}{\sqrt{\alpha_t}} \left( x_t - \frac{1 - \alpha_t}{\sqrt{1 - \bar{\alpha}_t}}\, \epsilon_\theta(x_t, t, c, S) \right) + \sqrt{\frac{\beta_t \left(1 - \bar{\alpha}_{t-1}\right)}{1 - \bar{\alpha}_t}}\; z$$

wherein $\beta_t$ is the added noise variance; $\alpha_t = 1 - \beta_t$ is the decreasing sequence computed from $\beta_t$; $\bar{\alpha}_t$ is the product of $\alpha_1$ to $\alpha_t$; $\bar{\alpha}_{t-1}$ is the product of $\alpha_1$ to $\alpha_{t-1}$; $z$ is a noise image randomly sampled from the standard normal distribution; $x_{t-1}$ is the noise image after one denoising step; $x_t$ is the noise image at step $t$; $c$ is the encoded text information; $t$ is the step index; $S$ is the style information; $\epsilon_\theta$ is the predicted noise.
In a second aspect, an image generation apparatus with style personalization, the apparatus comprising:
the image input module is used for selecting a style image, inputting the style image into a pre-constructed VGG19_f3 network model, and obtaining feature images with different sizes;
the style information extracting module is used for calculating a Gram matrix of the feature map so as to extract style information of the feature map;
the noise prediction module is used for inputting the text encoding obtained by the text encoder, the randomly sampled noise image, and the style information into a pre-constructed noise prediction network with a style guidance module for noise prediction to obtain predicted noise;
the noise removing module is used for carrying out repeated noise removing operation on the noise image by utilizing the predicted noise to obtain a latent space image;
and the image decoding module is used for decoding the latent space image through an image decoder in the large-scale image generation model to obtain a finally generated style personalized image.
In a third aspect, the present invention provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of any of the methods described in the preceding claims.
Compared with the prior art, the invention has the beneficial effects that:
according to the invention, the VGG19-f3 network model is utilized to extract the feature images with different pixel sizes, the Gram matrix is utilized to calculate the style characteristics of each layer of the feature images so as to extract the style information, the style information is utilized to guide the generation process of the images, and the content characteristics of the images do not participate in the guide, so that the model is not required to be trained by using the images corresponding to the content.
Drawings
FIG. 1 is a schematic diagram of an image generation process provided by an embodiment of the present invention;
FIG. 2 is stylistic image data provided by an embodiment of the present invention;
FIG. 3 is a truth image provided by an embodiment of the present invention;
FIG. 4 is an image with style personalization provided by an embodiment of the present invention;
fig. 5 is an image generated by a conditional image generation model ControlNet of the current mainstream provided in the embodiment of the present invention.
Detailed Description
The technical solutions of the present invention are described in detail below with the accompanying drawings and specific embodiments. It should be understood that the specific features of the embodiments are detailed descriptions of the technical solutions of the invention rather than limitations on them, and that the embodiments and their technical features may be combined with each other without conflict.
The term "and/or" in the present invention is merely an association relation describing the association object, and indicates that three kinds of relations may exist, for example, a and/or B may indicate: a exists alone, A and B exist together, and B exists alone. In addition, the character "/" herein generally indicates that the front and rear associated objects are an "or" relationship.
Example 1
Fig. 1 is a flowchart of an image generation method with style personalization in embodiment 1 of the present invention. The flow chart merely shows the logical sequence of the method according to the present embodiment, and the steps shown or described may be performed in a different order than shown in fig. 1 in other possible embodiments of the invention without mutual conflict.
Referring to fig. 1, the method of the present embodiment specifically includes the following steps:
s1, selecting a style image, and inputting the style image into a pre-constructed VGG19_f3 network model to obtain feature images with different sizes;
It should be noted that the style images of the present invention are 10,000 images randomly selected from the Pinterest open-source image dataset.
Specifically, referring to fig. 2, to meet the network model's input-size requirement, the pixel size of the selected style image data is set to 512×512. The 512×512 style image data is input into the VGG19_f3 network model, composed of the first three downsampling blocks of the VGG19 network model, to extract a first feature map $F_1$, a second feature map $F_2$ and a third feature map $F_3$ with sizes 128×128×64, 64×64×128 and 32×32×256, respectively. The three feature maps are stored to facilitate the subsequent extraction of style information.
For the composition and parameters of the VGG19_f3 network model, refer to the network structure and feature map sizes of the first three downsampling blocks in Table 1:
TABLE 1 composition and parameters of VGG19_f3 network model
S2, calculating a Gram matrix of the feature map to extract style information of the feature map;
Specifically, the style information of the style image data is extracted by calculating the Gram matrices of the first feature map $F_1$, the second feature map $F_2$ and the third feature map $F_3$; the computational expression of the Gram matrix is as follows:

$$G_{ij} = \frac{1}{HW} \sum_{k=1}^{HW} F_{i,k}\, F_{j,k}$$

wherein $G_{ij}$ is the element in row $i$, column $j$ of the Gram matrix of the feature map $F$; $F_{i,k}$ is the $k$-th element of channel $i$ of $F$; $F_{j,k}$ is the $k$-th element of channel $j$ of $F$; $H$ is the height of the feature map $F$; $W$ is the width of the feature map $F$.
Further, the style features of each layer, $G_1$, $G_2$ and $G_3$, are obtained with the above expression; $G_1$ and $G_2$ are max-pooled and then added to $G_3$ to obtain the style information $S$.
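The Gram-matrix computation of step S2 can be sketched as follows. This is a hedged reconstruction: the $1/(HW)$ normalization matches the $H$ and $W$ terms that survive in the formula's definitions, but the original normalization is not fully recoverable from the patent text:

```python
import numpy as np

def gram_matrix(feat):
    """Gram matrix of a (C, H, W) feature map: G[i, j] = (1/HW) * sum_k F[i,k] * F[j,k]."""
    c, h, w = feat.shape
    f = feat.reshape(c, h * w)   # flatten each channel into a row of length H*W
    return f @ f.T / (h * w)     # (C, C) channel-correlation matrix
```

The resulting $C \times C$ matrix depends only on channel correlations, not on where features occur spatially, which is why it captures style rather than content.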
S3, inputting the codes of the texts obtained through the text coder, the noise images obtained through random sampling and style information into a pre-constructed noise prediction network with a style guide module for noise prediction to obtain prediction noise;
Specifically, image generation is realized with a large image generation model (Stable Diffusion, SD), and the noise prediction network with the style guidance module is designed according to the SD model structure.
The noise prediction network in SD comprises four downsampling blocks $D_1$, $D_2$, $D_3$ and $D_4$, one middle block $M$, and four upsampling blocks $U_1$, $U_2$, $U_3$ and $U_4$.
Further, the downsampling blocks and the middle block of the noise prediction network are replicated to obtain downsampling block copies $D_1'$, $D_2'$, $D_3'$ and $D_4'$ and a middle block copy $M'$; these five network blocks together form a downsampling network, and this downsampling network together with the noise prediction network of SD forms the noise prediction network with the style guidance module.
It should be noted that the image feature sizes of the downsampling block copies $D_1'$, $D_2'$, $D_3'$ and $D_4'$ and the middle block copy $M'$ are 64×64, 32×32, 16×16, 8×8 and 8×8, respectively.
As can be seen from fig. 1, the final noise prediction is obtained by adding the outputs of the downsampling network to the corresponding outputs in the noise prediction network, as follows:
When the style information $S$ obtained in step S2 is input into the downsampling network, the output $s_1$ of $D_1'$, the output $s_2$ of $D_2'$, the output $s_3$ of $D_3'$, the output $s_4$ of $D_4'$ and the output $s_5$ of $M'$ are obtained. At the same time, the text encoding $c$ of the input text prompt and a randomly sampled noise image $x_T$ are input into downsampling block $D_1$; the output of $D_1$ is input into $D_2$; the output of $D_2$ into $D_3$; the output of $D_3$ into $D_4$; and the output of $D_4$ into the middle block $M$.
Next, the output $s_5$ of $M'$ is added to the output of the middle block $M$ and used as the input of upsampling block $U_1$; the feature map $s_4$ is added to the output of $U_1$ and used as the input of $U_2$; $s_3$ is added to the output of $U_2$ and used as the input of $U_3$; $s_2$ is added to the output of $U_3$ and used as the input of $U_4$; finally $s_1$ is added to the output of $U_4$, and the final predicted noise $\epsilon_\theta$ is output.
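The copy-and-inject wiring above can be sketched with stand-in blocks. Several things here are assumptions: the `Block` modules are hypothetical placeholders for SD's real ResNet/attention blocks, the channel widths are invented, the style input is assumed to be projected to the latent resolution, and the style outputs $s_5 \ldots s_1$ are injected together with the U-Net skip connections (as in ControlNet) so that the tensor shapes line up, since the exact injection points are not fully recoverable from the text:

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class Block(nn.Module):
    """Stand-in for an SD down/middle/up block (the real blocks are ResNet/attention stacks)."""
    def __init__(self, cin, cout, mode):
        super().__init__()
        self.conv = nn.Conv2d(cin, cout, 3, padding=1)
        self.mode = mode  # "down" halves, "up" doubles, "same" keeps the spatial size

    def forward(self, x):
        x = torch.relu(self.conv(x))
        if self.mode == "down":
            return F.avg_pool2d(x, 2)
        if self.mode == "up":
            return F.interpolate(x, scale_factor=2.0)
        return x

class StyleGuidedUNet(nn.Module):
    def __init__(self):
        super().__init__()
        chans = [4, 64, 128, 256, 256]
        self.down = nn.ModuleList(
            Block(chans[i], chans[i + 1], "down") for i in range(4))
        self.mid = Block(256, 256, "same")
        # style guidance branch: copies of the down path and the middle block
        self.style_down = copy.deepcopy(self.down)
        self.style_mid = copy.deepcopy(self.mid)
        up_specs = [(256, 256), (256, 128), (128, 64), (64, 4)]
        self.up = nn.ModuleList(Block(ci, co, "up") for ci, co in up_specs)

    def forward(self, x, style):
        s, s_outs = style, []
        for blk in self.style_down:          # s1..s4 from the copied down blocks
            s = blk(s)
            s_outs.append(s)
        s_outs.append(self.style_mid(s))     # s5 from the copied middle block
        h, skips = x, []
        for blk in self.down:
            h = blk(h)
            skips.append(h)
        h = self.mid(h) + s_outs[4]          # inject s5 at the middle block output
        for i, blk in enumerate(self.up):    # inject s4..s1 with the matching skips
            h = blk(h + skips[3 - i] + s_outs[3 - i])
        return h                             # predicted noise, same shape as x
```

Because the guidance branch only copies the down path and middle block, its parameter count is roughly half that of a second U-Net, which is the design motivation behind this ControlNet-style construction.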
S4, carrying out repeated denoising operation on the noise image by using the predicted noise to obtain a latent space image;
Specifically, a first denoising operation is performed on the noise image $x_T$ using the predicted noise $\epsilon_\theta$ to obtain a first denoised image $x_{T-1}$; the first denoised image $x_{T-1}$, the style information $S$ and the text encoding $c$ are then input into the pre-constructed noise prediction network with the style guidance module, which outputs a first denoising prediction result; a second denoising operation is performed on $x_{T-1}$ using this result to obtain a second denoised image $x_{T-2}$; the second denoised image $x_{T-2}$, the style information $S$ and the text encoding $c$ are again input into the network, which outputs a second denoising prediction result. The above denoising operation is repeated 50 times until the latent space image $x_0$ is obtained.
The noise image is denoised using the final predicted noise; the denoising expression is as follows:

$$x_{t-1} = \frac{1}{\sqrt{\alpha_t}} \left( x_t - \frac{1 - \alpha_t}{\sqrt{1 - \bar{\alpha}_t}}\, \epsilon_\theta(x_t, t, c, S) \right) + \sqrt{\frac{\beta_t \left(1 - \bar{\alpha}_{t-1}\right)}{1 - \bar{\alpha}_t}}\; z$$

wherein $\beta_t$ is the added noise variance, a hyperparameter that increases linearly with $t$; $\alpha_t = 1 - \beta_t$ is the decreasing sequence computed from $\beta_t$; $\bar{\alpha}_t$ is the product of $\alpha_1$ to $\alpha_t$; $\bar{\alpha}_{t-1}$ is the product of $\alpha_1$ to $\alpha_{t-1}$; $z$ is noise randomly sampled from the standard normal distribution, i.e. $z \sim \mathcal{N}(0, I)$; one denoising step takes the noise image $x_t$ to the noise image $x_{t-1}$.
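The 50-step sampling loop can be sketched numerically. This is a standard DDPM update consistent with the symbols above; the linear $\beta$ schedule endpoints $10^{-4}$ and $0.02$ are the common defaults, assumed here because the patent does not state them:

```python
import numpy as np

T = 50                                    # the patent denoises for 50 steps
betas = np.linspace(1e-4, 0.02, T)        # linearly increasing noise variances beta_t
alphas = 1.0 - betas                      # decreasing sequence alpha_t
alpha_bars = np.cumprod(alphas)           # running products alpha_bar_t

def ddpm_step(x_t, eps_pred, t, rng):
    """One denoising step x_t -> x_{t-1} given the predicted noise eps_pred."""
    a_t, ab_t = alphas[t], alpha_bars[t]
    ab_prev = alpha_bars[t - 1] if t > 0 else 1.0
    mean = (x_t - (1.0 - a_t) / np.sqrt(1.0 - ab_t) * eps_pred) / np.sqrt(a_t)
    if t == 0:
        return mean                       # no fresh noise is added at the final step
    sigma = np.sqrt(betas[t] * (1.0 - ab_prev) / (1.0 - ab_t))
    return mean + sigma * rng.standard_normal(x_t.shape)
```

In the full method, `eps_pred` would come from the style-guided noise prediction network evaluated at $(x_t, t, c, S)$.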
Further, the training method of the noise prediction network with the style guidance module is as follows:
for a selected original image, noise is continuously added to it; the noised image $x_t$, the corresponding text encoding $c$ and the style information $S$ are input into the noise prediction network with the style guidance module $\epsilon_\theta$; the network is optimized with the mean square error between the calculated predicted noise $\epsilon_\theta(x_t, t, c, S)$ and the added real noise $\epsilon$ as the loss function $L$, whose expression is:

$$L = \mathbb{E}\left[ \left\| \epsilon - \epsilon_\theta(x_t, t, c, S) \right\|^2 \right]$$
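The training objective (noise the image with the forward process, then regress the added noise) can be sketched as follows; `alpha_bars` is the same running product $\bar{\alpha}_t$ used in the sampling expression:

```python
import numpy as np

def forward_noise(x0, t, eps, alpha_bars):
    """Forward diffusion q(x_t | x_0): mix the clean image with Gaussian noise eps."""
    ab = alpha_bars[t]
    return np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * eps

def diffusion_loss(eps_true, eps_pred):
    """Mean squared error between the added noise and the network's prediction."""
    return np.mean((eps_true - eps_pred) ** 2)
```

At each training step one would sample $t$, sample $\epsilon \sim \mathcal{N}(0, I)$, form $x_t$ with `forward_noise`, and backpropagate `diffusion_loss` through the style-guided network.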
Step S5: the latent space image $x_0$ is decoded using the image decoder in SD to obtain the final style personalized image.
Example 2
In this embodiment, the method and model of Embodiment 1 are compared with ControlNet, the current mainstream conditional image generation model, on a test set composed of 2000 items of user data. The experiments compute the indices of the model-generated images against the ground-truth images and against the style image set, respectively; the overall performance comparison results are shown in Table 2 below. FID (Fréchet Inception Distance) is an important index in the image generation field for measuring the quality of generated images; KID (Kernel Inception Distance) is an important index for measuring the diversity of generated images; ClipSim is an important index for measuring the consistency between a generated image and its text; StySim (Style Similarity) is an index for measuring the similarity of image styles, and is calculated as follows:
for two imagesAnd->Extracting their features using the first four network blocks of VGG19 network, each layer outputting the result +.>And +.>Calculate their +.>Matrix arrayObtain->And->Calculating the mean square error of a pair of Gram matrixes of each layer, and finally summing the calculation results of the four mean square errors to obtain the style similarity,/V>Is calculated by the formula of (2)
Wherein,for the number of elements of the Gram matrix in each layer, the smaller the value of the index is, the more similar the styles of the two images are, and the stronger the style individuation performance of the model is; />Is->Gram matrix of (a); />Is->Gram matrix of (a);and (5) measuring the Style Similarity of the images.
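The StySim metric reduces to per-layer Gram-matrix mean squared errors summed over four layers, and can be sketched as follows (the feature-map lists are assumed to come from the first four VGG19 blocks; `gram_matrix` recomputes the Gram matrix of a (C, H, W) feature map):

```python
import numpy as np

def gram_matrix(feat):
    c, h, w = feat.shape
    f = feat.reshape(c, h * w)
    return f @ f.T / (h * w)

def stysim(feats1, feats2):
    """Sum over layers of the mean squared error between paired Gram matrices.

    feats1, feats2: lists of four (C, H, W) feature maps, one per VGG19 block.
    Lower values mean the two images are closer in style.
    """
    total = 0.0
    for f1, f2 in zip(feats1, feats2):
        g1, g2 = gram_matrix(f1), gram_matrix(f2)
        total += np.mean((g1 - g2) ** 2)   # (1/N_l) * sum of squared differences
    return total
```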
Table 2 results of performance comparisons
As can be seen from Table 2, the invention is comparable to ControlNet on the quality index FID. On the diversity index KID the invention performs below ControlNet, because constraining the style of the generated images reduces their diversity, which indirectly confirms the stronger style personalization of the invention. On the style similarity index StySim, the invention improves on ControlNet by 4.3% ((1.85−1.77)/1.85×100%) on the ground-truth image set and by 7.5% ((2.14−1.98)/2.14×100%) on the style image set. The invention is comparable to ControlNet on the text-image consistency index ClipSim.
The generation results of the different methods under one set of style conditions are shown in fig. 3, fig. 4 and fig. 5. The lion image in fig. 3 is the ground truth, i.e. the original image; the lion image in fig. 4 is the style personalized image obtained by the invention; the lion in fig. 5 is the image generated by ControlNet, the current mainstream conditional image generation model. Compared with ControlNet, the style personalization effect of the invention is more obvious: the result is closer to the style images and to the true flat-illustration style.
Example 3
An image generation apparatus with style personalization, the apparatus comprising:
the image input module is used for selecting a style image, inputting the style image into a pre-constructed VGG19_f3 network model, and obtaining feature images with different sizes;
the style information extracting module is used for calculating a Gram matrix of the feature map so as to extract style information of the feature map;
the noise prediction module is used for inputting the text encoding obtained by the text encoder, the randomly sampled noise image, and the style information into a pre-constructed noise prediction network with a style guidance module for noise prediction to obtain predicted noise;
the noise removing module is used for carrying out repeated noise removing operation on the noise image by utilizing the predicted noise to obtain a latent space image;
and the image decoding module is used for decoding the latent space image through an image decoder in the large-scale image generation model to obtain a finally generated style personalized image.
Example 4
The embodiment of the present invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method of Embodiment 1.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and variations could be made by those skilled in the art without departing from the technical principles of the present invention, and such modifications and variations should also be regarded as being within the scope of the invention.

Claims (10)

1. An image generation method with style personalization, comprising:
selecting a style image, and inputting the style image into a pre-constructed VGG19_f3 network model to obtain feature images with different sizes;
calculating a Gram matrix of the feature map to extract style information of the feature map;
inputting the text encoding obtained by the text encoder, a randomly sampled noise image, and the style information into a pre-constructed noise prediction network with a style guidance module for noise prediction to obtain predicted noise;
carrying out repeated denoising operation on the noise image by utilizing the predicted noise to obtain a latent space image;
and decoding the latent space image through an image decoder in a large image generation model to obtain a finally generated style personalized image.
2. The method for generating an image with style personalization according to claim 1, wherein the obtaining feature maps of different sizes comprises:
setting the pixel size of the style image to a set size;
forming the VGG19_f3 network model from the first three downsampling blocks of the VGG19 network model, inputting the style image into the VGG19_f3 network model, and outputting a first feature map $F_1$, a second feature map $F_2$ and a third feature map $F_3$ of specified sizes.
3. The method for generating an image with style personalization according to claim 2, wherein extracting style information of the feature map comprises:
calculating the first feature mapSecond characteristic map->And third characteristic diagram->Obtain a first style characteristic of Gram matrix of (2)Second style characteristics->And third style characteristics->
Characterizing the first styleSecond style characteristics->After the maximum pooling operation, the third style characteristic is +.>And adding to obtain style information S.
4. The method of generating an image with style personalization of claim 3, wherein the Gram matrices of the first feature map $F_1$, the second feature map $F_2$ and the third feature map $F_3$ are computed as follows:

$$G_{ij} = \frac{1}{HW} \sum_{k=1}^{HW} F_{i,k}\, F_{j,k}$$

wherein $G_{ij}$ is the element in row $i$, column $j$ of the Gram matrix of the feature map $F$; $F_{i,k}$ is the $k$-th element of channel $i$ of $F$; $F_{j,k}$ is the $k$-th element of channel $j$ of $F$; $H$ is the height of the feature map $F$; $W$ is the width of the feature map $F$.
5. The method for generating an image with style personalization according to claim 1, wherein the noise prediction network with the style guidance module is constructed as follows:
and copying a downsampling block and an intermediate block of a noise prediction network in the large image generation model to obtain a downsampling network, wherein the downsampling network and the noise prediction network in the large image generation model form the noise prediction network with the style guiding module.
6. The method for generating an image with style personalization of claim 5, wherein said obtaining a prediction noise comprises:
inputting the style information into the downsampling network to obtain first style information $s_1$, second style information $s_2$, third style information $s_3$, fourth style information $s_4$ and fifth style information $s_5$;
inputting the text encoding $c$ obtained by the text encoder and the randomly sampled noise image $x_T$ sequentially through the downsampling blocks and the intermediate block of the noise prediction network to obtain a first prediction noise;
adding the fifth style information $s_5$, fourth style information $s_4$, third style information $s_3$, second style information $s_2$ and first style information $s_1$ in sequence to the first prediction noise to output the final predicted noise $\epsilon_\theta$.
7. The method of generating an image with style personalization of claim 6, wherein the obtaining a latent space image comprises:
performing a first denoising operation on the noise image by using the predicted noise to obtain a first denoising operationImage processing apparatusThe first denoising image is +.>Style information S, coding of text +.>Then input the first noise removal prediction result to a pre-constructed noise prediction network with a style guiding module, and output the first noise removal prediction result +.>Using the first denoising prediction result to denoise the first denoising image +.>Performing a second denoising operation to obtain a second denoised image +.>
repeating the denoising operation a set number of times to obtain a latent space image z_0.
8. The method for generating an image with style personalization according to claim 7, wherein the expression of the denoising operation is as follows:
z_{t−1} = (1/√(α_t)) · (z_t − ((1 − α_t)/√(1 − ᾱ_t)) · ε_θ(z_t, t, c, S)) + σ_t · ε

wherein σ_t² = ((1 − ᾱ_{t−1})/(1 − ᾱ_t)) · β_t is the added noise variance; α_t = 1 − β_t is a decreasing sequence calculated from β_t; ᾱ_t is the product of α_1 through α_t; ᾱ_{t−1} is the product of α_1 through α_{t−1}; ε is a noise image randomly sampled from the standard normal distribution; z_{t−1} is the image after one denoising step; z_t is the noise image at step t; c is the encoded text information; t is the time step; S is the style information; ε_θ is the prediction noise.
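One denoising step of claim 8 can be sketched with NumPy under the standard DDPM posterior. The function name, the linear beta schedule in the example, and the choice σ_t² = ((1 − ᾱ_{t−1})/(1 − ᾱ_t)) · β_t are assumptions consistent with the symbols the claim lists, not the patent's exact implementation.

```python
import numpy as np

def ddpm_step(z_t, eps_pred, t, betas, rng):
    """One denoising step z_t -> z_{t-1} (standard DDPM form).

    eps_pred stands in for the style-guided network's prediction
    eps_theta(z_t, t, c, S); betas is the noise schedule."""
    alphas = 1.0 - betas                 # alpha_t = 1 - beta_t
    alpha_bar = np.cumprod(alphas)       # product alpha_1 .. alpha_t
    a_t = alphas[t]
    ab_t = alpha_bar[t]
    ab_prev = alpha_bar[t - 1] if t > 0 else 1.0
    # Added noise variance sigma_t^2 (assumed posterior-variance choice).
    sigma2 = (1.0 - ab_prev) / (1.0 - ab_t) * betas[t]
    mean = (z_t - (1.0 - a_t) / np.sqrt(1.0 - ab_t) * eps_pred) / np.sqrt(a_t)
    noise = rng.standard_normal(z_t.shape) if t > 0 else 0.0
    return mean + np.sqrt(sigma2) * noise

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 10)      # hypothetical short schedule
z = rng.standard_normal((2, 2))
out = ddpm_step(z, np.zeros((2, 2)), 5, betas, rng)
```

Iterating this step from t = T − 1 down to t = 0 yields the latent space image z_0 of claim 7.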
9. An image generation apparatus with style personalization, the apparatus comprising:
the image input module is used for selecting a style image, inputting the style image into a pre-constructed VGG19_f3 network model, and obtaining feature images with different sizes;
the style information extracting module is used for calculating a Gram matrix of the feature map so as to extract style information of the feature map;
the noise prediction module is used for inputting the text encoding obtained by the text encoder, the randomly sampled noise image and the style information into the pre-constructed noise prediction network with the style guidance module for noise prediction, so as to obtain prediction noise;
the denoising module is used for performing repeated denoising operations on the noise image by using the prediction noise to obtain a latent space image;
and the image decoding module is used for decoding the latent space image through an image decoder in the large-scale image generation model to obtain a finally generated style personalized image.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 8.
CN202410217621.9A 2024-02-28 2024-02-28 Image generation method, device and storage medium with style personalization Active CN117788629B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410217621.9A CN117788629B (en) 2024-02-28 2024-02-28 Image generation method, device and storage medium with style personalization

Publications (2)

Publication Number Publication Date
CN117788629A true CN117788629A (en) 2024-03-29
CN117788629B CN117788629B (en) 2024-05-10

Family

ID=90385321

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410217621.9A Active CN117788629B (en) 2024-02-28 2024-02-28 Image generation method, device and storage medium with style personalization

Country Status (1)

Country Link
CN (1) CN117788629B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110210410A * 2019-06-04 2019-09-06 Nanjing University of Posts and Telecommunications Handwritten digit recognition method based on image features
CN111325681A * 2020-01-20 2020-06-23 Nanjing University of Posts and Telecommunications Image style migration method combining meta-learning mechanism and feature fusion
CN114692733A * 2022-03-11 2022-07-01 South China University of Technology End-to-end video style migration method, system and storage medium for suppressing temporal noise amplification
CN116664719A * 2023-07-28 2023-08-29 Tencent Technology (Shenzhen) Co., Ltd. Image redrawing model training method, image redrawing method and device


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHANG, L. et al.: "New Image Processing: VGG Image Style Transfer with Gram Matrix Style Features", 2023 5th International Conference on Artificial Intelligence and Computer Applications (ICAICA), 23 February 2024 (2024-02-23), pages 468-472 *
GAO, Xuan: "Research on Controllable Image Style Transfer Theory and Methods Based on Attribute Decomposition", China Master's Theses Full-text Database, no. 1, 31 January 2023 (2023-01-31), pages 1-84 *


Similar Documents

Publication Publication Date Title
CN107392973B (en) Pixel-level handwritten Chinese character automatic generation method, storage device and processing device
CN112734634B (en) Face changing method and device, electronic equipment and storage medium
CN111242841B (en) Image background style migration method based on semantic segmentation and deep learning
US20210397945A1 (en) Deep hierarchical variational autoencoder
CN109087258A (en) Image rain removal method and device based on deep learning
CN116704079B (en) Image generation method, device, equipment and storage medium
CN116721334B (en) Training method, device, equipment and storage medium of image generation model
CN113449787B (en) Chinese character stroke structure-based font library completion method and system
CN112184582B (en) Attention mechanism-based image completion method and device
CN111402365A (en) Method for generating picture from characters based on bidirectional architecture confrontation generation network
CN113140020A (en) Method for generating image based on text of countermeasure network generated by accompanying supervision
CN114330736A (en) Latent variable generative model with noise contrast prior
CN115526223A (en) Score-based generative modeling in a potential space
CN117635418B (en) Training method for generating countermeasure network, bidirectional image style conversion method and device
CN113962192B (en) Method and device for generating Chinese character font generation model and Chinese character font generation method and device
CN115049556A (en) StyleGAN-based face image restoration method
CN114529785A (en) Model training method, video generation method and device, equipment and medium
CN117058276B (en) Image generation method, device, equipment and storage medium
CN116958324A (en) Training method, device, equipment and storage medium of image generation model
CN117788629B (en) Image generation method, device and storage medium with style personalization
Luhman et al. High fidelity image synthesis with deep vaes in latent space
CN114119923B (en) Three-dimensional face reconstruction method and device and electronic equipment
CN114494387A (en) Data set network generation model and fog map generation method
CN113111906B (en) Method for generating confrontation network model based on condition of single pair image training
CN114897884A (en) No-reference screen content image quality evaluation method based on multi-scale edge feature fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant