WO2022088878A1 - Style image generation method, model training method and apparatus, and device and medium - Google Patents

Style image generation method, model training method and apparatus, and device and medium Download PDF

Info

Publication number
WO2022088878A1
WO2022088878A1 PCT/CN2021/114211 CN2021114211W
Authority
WO
WIPO (PCT)
Prior art keywords
image
style
style image
vector
generation model
Prior art date
Application number
PCT/CN2021/114211
Other languages
French (fr)
Chinese (zh)
Inventor
尹淳骥
张耀
李文越
Original Assignee
北京字节跳动网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字节跳动网络技术有限公司 filed Critical 北京字节跳动网络技术有限公司
Publication of WO2022088878A1 publication Critical patent/WO2022088878A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/04Context-preserving transformations, e.g. by using an importance map
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/28Determining representative reference patterns, e.g. by averaging or distorting; Generating dictionaries
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Definitions

  • the present disclosure relates to the technical field of data processing, and in particular, to a style image generation method, a model training method, an apparatus, a device and a storage medium.
  • image style conversion refers to converting an image from one style to another style that meets user needs.
  • the present disclosure provides a style image generation method, a model training method, an apparatus, a device and a storage medium, which can generate high-quality style images and improve user experience.
  • the present disclosure provides a style image generation method executed by an image generation model, the method comprising:
  • the image generation model is obtained by training based on the first style image samples and the second style image samples having a corresponding relationship, and the latent space of the image generation model is constrained during the training process.
  • the latent space of the image generation model is constrained to be a vector dictionary containing a preset number of latent vectors; generating a second style image after performing the style conversion process on the first style image includes:
  • a second style image corresponding to the first style image is generated.
  • the latent space of the image generation model is constrained to be a normal distribution; generating a second style image after performing style conversion processing on the first style image includes:
  • a second style image corresponding to the first style image is generated.
  • before generating the second style image corresponding to the first style image based on the feature vector, the method further includes:
  • the feature vector is updated based on the target weight coefficient to obtain the updated feature vector;
  • the target weight coefficient is used to represent the distance between the feature vector and the origin of the normal distribution;
  • generating a second style image corresponding to the first style image based on the feature vector includes:
  • a second style image corresponding to the first style image is generated.
  • before the feature vector is updated based on the target weight coefficient to obtain the updated feature vector, the method further includes:
  • the target weight coefficient is acquired in response to an input operation of the target weight coefficient.
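The patent does not give a formula for how the target weight coefficient updates the feature vector; one minimal reading, sketched below with numpy, is that it simply scales the vector relative to the origin of the (assumed standard) normal latent distribution. The function name and the pure-scaling form are assumptions for illustration only.

```python
import numpy as np

def apply_weight_coefficient(feature_vec: np.ndarray, w: float) -> np.ndarray:
    """Scale a latent feature vector toward or away from the origin of the
    latent normal distribution.

    `w` is the user-supplied target weight coefficient: w < 1 pulls the
    vector toward the origin (a more "average" decoded style), w > 1 pushes
    it further away. This pure-scaling form is an assumption, not the
    patent's stated implementation.
    """
    return w * feature_vec
```

A user-entered coefficient (per the input operation above) would be passed straight in as `w` before decoding.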
  • the first style image includes a line art style image
  • the second style image includes a comic style image
  • the present disclosure provides a training method for an image generation model, the method comprising:
  • the latent space is constrained to obtain a trained image generation model.
  • the latent space is constrained to obtain a trained image generation model, including:
  • from the vector dictionary, determine the latent vector with the smallest distance from the feature vector as the target vector corresponding to the feature vector; the vector dictionary stores a preset number of latent vectors;
  • the vector dictionary is updated based on the loss value, and the next round of iterative training is entered until a preset convergence condition is reached, and a trained image generation model is obtained.
  • before extracting the feature vector of the first style image sample, the method further includes:
  • the extracting the feature vector of the first style image sample includes:
  • Feature vectors of the enhanced image samples are extracted.
  • the latent space is constrained to obtain a trained image generation model, including:
  • the current vector distribution is updated based on the standard normal distribution and the feature vector
  • before mapping the first style image sample to the current vector distribution to obtain the feature vector of the first style image sample, the method further includes:
  • the first style image sample is mapped to the current vector distribution to obtain the feature vector of the first style image sample, including:
  • the enhanced image sample is mapped to the current vector distribution to obtain the feature vector of the first style image sample.
  • the present disclosure provides a style image generation apparatus, the apparatus is applied to an image generation model, and the apparatus includes:
  • a receiving module for receiving the first style image
  • a generating module configured to generate a second style image after performing style conversion processing on the first style image
  • the image generation model is obtained by training based on the first style image samples and the second style image samples having a corresponding relationship, and the latent space of the image generation model is constrained during the training process.
  • the present disclosure provides an apparatus for training an image generation model, the apparatus comprising:
  • an acquisition module configured to acquire image samples of the first style and image samples of the second style that have a corresponding relationship
  • the constraint module is configured to constrain the latent space in the process of training based on the first style image sample and the second style image sample having the corresponding relationship to obtain a trained image generation model.
  • the present disclosure provides a computer-readable storage medium, where instructions are stored in the computer-readable storage medium, and when the instructions are executed on a terminal device, the terminal device is made to implement the above method.
  • the present disclosure provides a device comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the above method.
  • An embodiment of the present disclosure provides a style image generation method. First, an image generation model receives a first style image, and then, after performing style conversion processing on the first style image, a second style image is generated.
  • the embodiments of the present disclosure constrain the latent space of the image generation model during the training of the image generation model, so that the image generation model can be used to generate style images with better quality and effect, thereby improving user experience.
  • FIG. 1 is an application environment architecture diagram of a style image generation method provided by an embodiment of the present disclosure
  • FIG. 2 is a flowchart of a method for generating a style image according to an embodiment of the present disclosure
  • FIG. 3 is a schematic diagram of the effect of implementing image style conversion based on an image generation model according to an embodiment of the present disclosure
  • FIG. 4 is a schematic diagram of a training process of an image generation model according to an embodiment of the present disclosure
  • FIG. 5 is a flowchart of a training method for an image generation model provided by an embodiment of the present disclosure
  • FIG. 6 is a schematic diagram of a model training process for constraining the latent space of an image generation model according to an embodiment of the present disclosure
  • FIG. 7 is a flowchart of a model training method for constraining the latent space of an image generation model according to an embodiment of the present disclosure
  • FIG. 8 is an image enhancement effect diagram provided by an embodiment of the present disclosure.
  • FIG. 9 is a flowchart of a method for generating a style image according to an embodiment of the present disclosure.
  • FIG. 10 is a schematic diagram of another model training process for constraining the latent space of an image generation model according to an embodiment of the present disclosure
  • FIG. 11 is a flowchart of another model training method for constraining the latent space of an image generation model according to an embodiment of the present disclosure
  • FIG. 13 is a schematic structural diagram of a style image generating apparatus according to an embodiment of the present disclosure.
  • FIG. 14 is a schematic structural diagram of an apparatus for training an image generation model according to an embodiment of the present disclosure
  • FIG. 15 is a schematic structural diagram of a style image generating device according to an embodiment of the present disclosure.
  • FIG. 16 is a schematic structural diagram of a training device for an image generation model according to an embodiment of the present disclosure.
  • the present disclosure provides a style image generation method performed by an image generation model whose latent space is constrained during training; therefore, after the image generation model receives a first style image and performs the style conversion process, the quality and effect of the generated second style image can be guaranteed, thereby improving user experience.
  • the application environment of the style image generation method provided by the present disclosure is first introduced.
  • FIG. 1 is an application environment architecture diagram of a style image generation method provided by an embodiment of the present disclosure, wherein the style image generation method provided by the embodiment of the present disclosure can be applied to the server 102.
  • after the terminal 101 obtains the first style image, the first style image is sent to the server 102 through the network connection between the terminal 101 and the server 102, and the image generation model deployed on the server 102 performs style conversion processing on the first style image to generate a second style image.
  • the terminal 101 can be a desktop computer or a mobile terminal (such as a smart phone, smart glasses, tablet computer, laptop computer, wearable electronic device, smart home device, etc.), and the server 102 can be an independent server or a cluster of multiple servers.
  • after acquiring the first style image, the terminal 101 sends it to the server 102.
  • after receiving the first style image, the server 102 performs style conversion processing to generate a second style image.
  • the image generation model is obtained through training by the server 102 based on the first style image samples and the second style image samples having the corresponding relationship, and the latent space of the image generation model is constrained during the training process.
  • the style image generation method provided by the embodiment of the present disclosure can be directly applied to the terminal 101. Specifically, after receiving the first style image, the terminal 101 sends the first style image to the image generation model, and after the image generation model performs style conversion processing, a second style image is generated.
  • the image generation model is obtained by the terminal 101 through training based on the first style image samples and the second style image samples having the corresponding relationship, and the latent space of the image generation model is constrained during the training process.
  • an embodiment of the present disclosure provides a method for generating a style image.
  • FIG. 2 is a flowchart of a method for generating a style image provided by an embodiment of the present disclosure. The method can be applied to an image generation model, and the method includes:
  • S201 Receive a first style image.
  • the first style image may be an image of any style, such as a line art style image, a two-dimensional style image, a sketch style image, an oil painting style image, a cartoon style image, a real style image, a comic style image, etc.
  • the first style image may be an image captured in real time by a camera invoked by the terminal, an image drawn in real time by a user on a terminal interface, or an image obtained from an album of the terminal. This disclosure does not limit this.
  • the first style image may include a face
  • the first style image may be a face image
  • the style of the face image can be converted based on the style image generation method provided by the embodiment of the present disclosure, such as converting a line art face image into a comic face image.
  • the image generation model is obtained by training based on the image samples of the first style and the image samples of the second style that have a corresponding relationship, and the latent space of the image generation model is constrained during the training process.
  • the image generation model is used to convert the first style image into the second style image.
  • the first style and the second style belong to two different styles of the image.
  • the style of the image may include a line art style, a two-dimensional style, a sketch style, an oil painting style, a cartoon style, a real style, and a comic style.
  • the image generation model provided by the embodiments of the present disclosure is used to convert an image from one style to another.
  • FIG. 3 is a schematic diagram of the effect of implementing image style conversion based on an image generation model according to an embodiment of the present disclosure.
  • taking the first style image as a line art style image and the second style image as a comic style image as an example, the image generation model achieves the effect of converting a line art style image into a comic style image.
  • the image generation model is first trained before applying the image generation model.
  • the image generation model is trained based on the first style image samples and the second style image samples with the corresponding relationship, and the trained image generation model is obtained, which is used to convert the style of the image.
  • the embodiment of the present disclosure constrains the latent space of the image generation model, so that the latent space, constrained over multiple rounds of iterative training based on high-quality image samples, can better constrain the feature vector of the input first style image, and finally a second style image of higher quality is generated, which improves the user experience.
  • constraining the latent space refers to constraining the latent code in the latent space.
  • by constraining the latent vectors, the feature vector input to the decoder of the image generation model in the model application stage is constrained, so that decoding by the decoder results in a higher quality second style image.
  • in the style image generation method, by constraining the latent space of the image generation model during the training process, a style image with higher quality and effect can be generated in the application stage of the image generation model, improving the user experience.
  • the embodiment of the present disclosure first introduces the training method of the image generation model.
  • FIG. 4 is a schematic diagram of a training process of an image generation model provided by an embodiment of the present disclosure, wherein the image generation model to be trained may be a conditional generative adversarial network (CGAN), and the conditional generative adversarial network includes a generator and a discriminator.
  • the conditional generative adversarial network is trained to obtain the trained image generation model.
  • the first style image sample and the second style image sample with the corresponding relationship are first obtained; the output image is obtained by inputting the first style image sample into the generator, then the output image of the generator and the second style image sample corresponding to the first style image sample are input to the discriminator at the same time, and the parameters in the generator are adjusted based on the loss value output by the discriminator, so as to complete the current round of iterative training for the generator.
  • the discriminator is used to perform multiple rounds of training on the generator until the preset convergence condition is reached.
  • the trained image generation model is obtained based on a large number of high-quality first style image samples and second style image samples with corresponding relationships.
  • the second style image sample may be drawn by a professional based on the first style image sample; thus, the first style image sample and the second style image sample have a corresponding relationship, and since they are drawn by a professional, the quality of the image samples can also be guaranteed, thereby ensuring that the image generation model trained based on the image samples can generate high-quality style images.
  • FIG. 5 is a flowchart of an image generation model training method provided in an embodiment of the present disclosure, wherein the method can be applied to an image generation model to be trained, and the method includes:
  • S501 Acquire a first style image sample and a second style image sample having a corresponding relationship.
  • the first style image sample and the second style image sample may be two different style images among a line art style image, a two-dimensional style image, a sketch style image, an oil painting style image, a cartoon style image, a real style image, a comic style image, etc.
  • the first style image sample includes a line art style image
  • the second style image sample includes a comic style image.
  • S502 In the process of training based on the first style image sample and the second style image sample having the corresponding relationship, constrain the latent space of the image generation model to obtain a trained image generation model.
  • the latent space of the image generation model is continuously constrained in each round of iterative training, and finally an image generation model with constrained latent space is obtained, which can be used for Image style conversion.
  • the embodiments of the present disclosure provide the following two model training methods for constraining the latent space of the image generation model.
  • FIG. 6 is a schematic diagram of a model training process for constraining the latent space of an image generation model provided by an embodiment of the present disclosure, wherein the generator of the image generation model includes an encoder and a decoder.
  • the training of the generator is realized by adjusting the parameters in the encoder and decoder, and the trained image generation model is obtained.
  • the first style image sample is input to the encoder in the generator, the feature vector of the first style image sample is extracted by the encoder, and then the latent vector with the smallest distance from the feature vector is determined from the vector dictionary as the target vector corresponding to the feature vector.
  • the vector dictionary is used to store a preset number of latent vectors. After the target vector corresponding to the feature vector is determined, the target vector is input into the decoder, and the decoder decodes it to obtain the first output image corresponding to the first style image sample; the first output image and the second style image sample corresponding to the first style image sample are simultaneously input into the discriminator, and after being processed by the discriminator, the loss value is obtained.
  • the vector dictionary is then updated based on the loss value, so that the latent vectors in the vector dictionary are continuously adjusted to vectors that can generate higher quality style images.
  • a trained image generation model is finally obtained through multiple rounds of iterative training based on high-quality image samples, which is used to convert the style of the image.
  • the latent vector in the vector dictionary is continuously adjusted, so as to realize the constraint of the latent space in the image generation model, so that the image generation model with the constrained latent space can generate high-quality style images and improve user experience.
  • an embodiment of the present disclosure also provides a flowchart of a model training method for constraining the latent space of an image generation model.
  • the method includes:
  • S701 Acquire a first style image sample and a second style image sample having a corresponding relationship.
  • the feature vectors extracted by the encoder may be N (N is a positive integer, for example, 25) M-dimensional (M is a positive integer, for example, 64) feature vectors.
  • the image generation model to be trained first performs image enhancement processing on the first style image sample based on the target image enhancement method to obtain the enhanced image sample, and then extracts the feature vector of the enhanced image sample.
  • the target image enhancement method may be one of image enhancement methods such as random expansion, erosion, rotation, translation, scaling, and deformation.
  • the trained image generation model has lower requirements on the quality of the images input to the model; that is to say, style transfer images with better effect can be obtained for images of different quality. For example, the requirements on the stroke thickness of the line art style image are low, and for line art style images with different stroke thicknesses, a comic style image with better effect can be generated.
  • FIG. 8 is a data enhancement effect diagram provided by an embodiment of the present disclosure, in which, based on two line art style images A and B of different qualities, a comic style image of better quality can be obtained.
  • image enhancement processing is performed on the first style image sample based on a random deformation image enhancement method.
  • the preset image is divided into N×N squares of the same size; an offset vector (dx, dy) is randomly determined at the center of each square, the offset vector is then linearly spread to the entire square, and the offset at the border of the square is controlled to be 0, to obtain the offset field corresponding to the preset image.
  • image enhancement processing is performed on the first style image sample according to the above offset field to obtain an enhanced image sample.
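The random-deformation augmentation described above can be sketched in numpy as follows. This is a minimal reading of the patent's description (random offset at each square's center, fading linearly to zero at the square's border); the function name, the nearest-neighbour resampling, and the assumption that the image size is divisible by the grid size are all illustration choices, not the patent's stated implementation.

```python
import numpy as np

def random_square_deform(img, n=4, max_offset=5.0, rng=None):
    """Split `img` into an n x n grid, draw a random (dx, dy) at each
    square's center, fade it linearly to zero at the square's border,
    then resample the image through the resulting offset field."""
    rng = np.random.default_rng(rng)
    h, w = img.shape[:2]
    sh, sw = h // n, w // n            # assumes h and w are divisible by n
    # per-square weight: 1 at the center, 0 at the border
    wy = 1.0 - np.abs(np.linspace(-1.0, 1.0, sh))
    wx = 1.0 - np.abs(np.linspace(-1.0, 1.0, sw))
    weight = wy[:, None] * wx[None, :]
    field = np.zeros((h, w, 2))
    for i in range(n):
        for j in range(n):
            dy, dx = rng.uniform(-max_offset, max_offset, size=2)
            field[i*sh:(i+1)*sh, j*sw:(j+1)*sw, 0] = dy * weight
            field[i*sh:(i+1)*sh, j*sw:(j+1)*sw, 1] = dx * weight
    # nearest-neighbour resampling through the offset field
    yy, xx = np.mgrid[0:h, 0:w]
    src_y = np.clip(np.round(yy + field[..., 0]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xx + field[..., 1]).astype(int), 0, w - 1)
    return img[src_y, src_x]
```

Because the offset fades to zero at every square border, pixels on those borders stay fixed, which keeps the deformed squares seamless.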
  • S703 From the vector dictionary, determine the latent vector with the smallest distance from the feature vector as the target vector corresponding to the feature vector; wherein a preset number of latent vectors is stored in the vector dictionary.
  • the vector dictionary may include 64×64 64-dimensional latent vectors; after extracting the N 64-dimensional feature vectors corresponding to the first style image sample, the latent vector with the smallest distance from each of the N feature vectors is obtained from the vector dictionary and used as the target vector of the corresponding feature vector.
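The nearest-latent-vector lookup in S703 can be written as a short numpy sketch. The function name and array shapes are assumptions; the dictionary is treated as a `(K, M)` codebook and distances are Euclidean, which matches "smallest distance" but is not explicitly fixed by the patent.

```python
import numpy as np

def nearest_latents(features: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """For each M-dimensional feature vector, pick the vector-dictionary
    entry with the smallest Euclidean distance.

    features: (N, M) encoder outputs; codebook: (K, M) latent vectors.
    Returns the (N, M) target vectors fed to the decoder.
    """
    # (N, K) matrix of pairwise squared distances via broadcasting
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d2.argmin(axis=1)
    return codebook[idx]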
  • S704 Based on the target vector, generate a first output image corresponding to the first style image sample.
  • the replaced target vector is input into the decoder, and the decoder decodes to obtain the first output image corresponding to the first style image sample.
  • S705 Input the second style image sample and the first output image corresponding to the first style image sample into the discriminator, and obtain the loss value after being processed by the discriminator.
  • S706 Update the vector dictionary based on the loss value, and enter the next round of iterative training until a preset convergence condition is reached, and a trained image generation model is obtained.
  • the training objective of the generator is to minimize the loss value
  • the training objective of the discriminator is to maximize the loss value.
  • the vector dictionary is used as a parameter in the model. During the adversarial training between the generator and the discriminator, it is necessary to continuously adjust the latent vectors in the vector dictionary, so that the final trained image generation model can generate style images of higher quality.
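The min/max objectives stated above (generator minimizes the loss, discriminator maximizes it) can be sketched with the standard GAN log-losses; the patent does not specify the exact loss form, so the functions below are an illustrative assumption, written so that both networks appear as minimizers.

```python
import numpy as np

def d_loss(d_real: np.ndarray, d_fake: np.ndarray) -> float:
    """Discriminator maximizes log D(real) + log(1 - D(fake)); returning
    the negated value lets both networks be trained as minimizers.
    d_real / d_fake are discriminator outputs in (0, 1)."""
    return float(-(np.log(d_real) + np.log(1.0 - d_fake)).mean())

def g_loss(d_fake: np.ndarray) -> float:
    """Generator minimizes -log D(fake) (the non-saturating form)."""
    return float(-np.log(d_fake).mean())
```

As the discriminator gets better (`d_real` up, `d_fake` down) its loss drops, and as the generator fools it (`d_fake` up) the generator loss drops, matching the opposing training objectives.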
  • the trained image generation model is used to convert the first style image into the second style image.
  • FIG. 9 is a flowchart of a method for generating a style image provided by an embodiment of the present disclosure; the method includes:
  • S902 Extract the feature vector of the first style image; wherein, the latent space of the image generation model is constrained to be a vector dictionary including a preset number of latent vectors.
  • the encoder extracts the feature vector of the first style image.
  • S903 From the vector dictionary, determine the latent vector with the smallest distance from the feature vector as the target vector corresponding to the feature vector.
  • a latent vector with the smallest distance from the feature vector is determined from the trained vector dictionary, and this latent vector is used as the target vector corresponding to the feature vector; the latent vector refers to a vector in the latent space.
  • S904 Based on the target vector, generate a second style image corresponding to the first style image.
  • the decoder in the image generation model decodes the target vector to obtain a second style image corresponding to the first style image, and the image generation model outputs the second style image.
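The S901-S904 inference flow (encode, snap to the nearest dictionary entry, decode) can be sketched end to end as below. The `encoder` and `decoder` arguments stand in for the trained networks and are assumed to be plain callables; the function name and shapes are illustration choices.

```python
import numpy as np

def generate_style_image(image, encoder, codebook, decoder):
    """Inference sketch: encode the first-style image into (N, M) feature
    vectors, replace each with its nearest vector-dictionary entry, and
    decode the targets into the second-style image."""
    feats = encoder(image)                                     # (N, M)
    d2 = ((feats[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    targets = codebook[d2.argmin(axis=1)]                      # (N, M)
    return decoder(targets)
```

In the trained model, the quantization step is what enforces the latent-space constraint at application time: the decoder only ever sees vectors from the learned dictionary.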
  • the image generation model can generate a style image with better quality and effect, meet the user's image style conversion needs, and improve the user's experience.
  • FIG. 10 is another schematic diagram of a model training process for constraining the latent space of an image generation model provided by an embodiment of the present disclosure, wherein the image generation model to be trained may be a conditional generative adversarial network (CGAN) including a generator and a discriminator; the conditional generative adversarial network is trained to obtain the trained image generation model.
  • the first style image sample is input to the encoder in the generator, and the encoder maps the first style image sample to the current vector distribution to obtain the feature vector of the first style image sample, so as to realize feature vector extraction.
  • a vector corresponding to the feature vector is randomly determined from the standard normal distribution, and then the maximum mean discrepancy (MMD) algorithm is used to calculate the difference between the vector and the feature vector; the difference is determined as a distribution loss value, and based on the distribution loss value, the current vector distribution in the encoder can be adjusted and updated, so that the current vector distribution is continuously adjusted toward a distribution that can be used to obtain higher quality style images.
  • the maximum mean discrepancy (MMD) algorithm is often used to measure the difference between two distributions.
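A compact numpy sketch of the distribution loss follows. The patent names the maximum mean discrepancy but not the kernel, so the RBF kernel and bandwidth below are assumptions; the estimate compares the encoder's feature-vector samples against samples drawn from the standard normal distribution.

```python
import numpy as np

def mmd_rbf(x: np.ndarray, y: np.ndarray, sigma: float = 1.0) -> float:
    """Squared maximum mean discrepancy between sample sets x (n, d) and
    y (m, d) under an RBF kernel with bandwidth `sigma` (assumed)."""
    def k(a, b):
        # pairwise squared distances -> Gaussian kernel matrix
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))
    return float(k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean())
```

Minimizing this value over the encoder's parameters is what pulls the current vector distribution toward the standard normal target.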
  • after the feature vector is obtained, it is input to the decoder, and the decoder decodes it to obtain the first output image; then the second style image sample corresponding to the first style image sample and the first output image are input to the discriminator at the same time to obtain the loss value, and the parameters in the generator and the discriminator can be adjusted based on the loss value.
  • a trained image generation model is finally obtained through multiple rounds of iterative training based on high-quality image samples, which is used to convert the style of the image.
  • the current vector distribution in the encoder is continuously adjusted, so as to realize the process of constraining the current vector distribution in the image generation model toward the standard normal distribution; this achieves the purpose of constraining the latent space of the image generation model, so that the image generation model with the constrained latent space can generate high-quality style images and improve user experience.
  • an embodiment of the present disclosure also provides a flowchart of another model training method for constraining the latent space of an image generation model.
  • the method includes:
  • S1101 Acquire a first style image sample and a second style image sample having a corresponding relationship.
  • S1102 Map the first style image sample to the current vector distribution to obtain a feature vector of the first style image sample.
  • the feature vector extracted by the encoder may be a P-dimensional feature vector (P is a positive integer, for example, 512).
  • image enhancement processing is first performed on the first style image sample based on the target image enhancement method to obtain the enhanced image sample, and then the enhanced image sample is mapped to the current vector distribution to obtain the feature vector of the first style image sample.
  • the target image enhancement method may be one of image enhancement methods such as random dilation, erosion, rotation, translation, scaling, and deformation.
  • style transfer images with better effect can thus be obtained for input images of varying quality.
  • the requirements on the stroke thickness of the line art style image are thus relaxed; for line art style images with different stroke thicknesses, a comic style image with better effect and quality can still be generated.
  • image enhancement processing is performed on the first style image sample based on a random deformation image enhancement method.
  • the preset image is divided into N×N squares of the same size; for the center of each square, an offset vector (dx, dy) is randomly determined, the offset vector is then spread linearly over the entire square while the offset at the square's borders is constrained to 0, yielding the offset field corresponding to the preset image.
  • image enhancement processing is performed on the first style image sample according to the above offset field to obtain an enhanced image sample.
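The offset-field construction and its application described in the two steps above can be sketched as follows. The grid size, the bound on the random offsets, and the nearest-neighbour resampling are assumed choices; the disclosure only specifies the linear spread with zero offset at square borders:

```python
import numpy as np

def random_offset_field(h, w, n=4, max_offset=8.0, rng=None):
    """Divide an h x w image into n x n squares, draw a random (dx, dy) at each
    square's center, spread it linearly over the square, and force the offset
    to 0 on the square borders.  `max_offset` is an assumed bound."""
    if rng is None:
        rng = np.random.default_rng()
    field = np.zeros((h, w, 2))
    sh, sw = h // n, w // n
    # Linear "tent" weight: ~1 at the square center, exactly 0 at its borders.
    wy = 1.0 - np.abs(np.linspace(-1, 1, sh))
    wx = 1.0 - np.abs(np.linspace(-1, 1, sw))
    tent = np.outer(wy, wx)
    for i in range(n):
        for j in range(n):
            dx, dy = rng.uniform(-max_offset, max_offset, size=2)
            field[i*sh:(i+1)*sh, j*sw:(j+1)*sw, 0] = tent * dx
            field[i*sh:(i+1)*sh, j*sw:(j+1)*sw, 1] = tent * dy
    return field

def deform(img, field):
    """Apply the offset field with nearest-neighbour resampling (a simple choice)."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(np.round(ys + field[..., 1]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs + field[..., 0]).astype(int), 0, w - 1)
    return img[src_y, src_x]

field = random_offset_field(64, 64, n=4, rng=np.random.default_rng(1))
img = np.arange(64 * 64, dtype=float).reshape(64, 64)
warped = deform(img, field)   # enhanced image sample
```

Bilinear interpolation (e.g. `scipy.ndimage.map_coordinates`) would give smoother results than nearest-neighbour lookup, at the cost of an extra dependency.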
  • S1103 Use the maximum mean discrepancy algorithm to update the current vector distribution based on the standard normal distribution and the feature vectors.
  • a vector corresponding to the feature vector is randomly determined from the standard normal distribution, and the maximum mean discrepancy algorithm is then used to calculate the difference between that vector and the feature vector. The difference is determined as a distribution loss value; based on the distribution loss value, the current vector distribution in the encoder can be adjusted and updated, so that it is continuously pulled toward a distribution from which higher-quality style images can be obtained.
  • the image generation model can also be trained on multiple pairs of corresponding image samples at the same time. After a corresponding vector is determined from the standard normal distribution for the feature vector of each first style image sample, the maximum mean discrepancy algorithm is used to calculate the difference between each vector and its corresponding feature vector, and the distribution loss value is determined based on these differences. The current vector distribution in the encoder is then adjusted and updated based on the distribution loss value, which improves the update efficiency of the current vector distribution in the encoder.
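The distribution loss described above can be sketched with a Gaussian-kernel estimate of the squared maximum mean discrepancy between encoder features and samples drawn from the standard normal distribution. The kernel form, its bandwidth, and the sample sizes are common choices assumed here, not values given by the disclosure:

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """k(x, y) = exp(-||x - y||^2 / (2 sigma^2)) for all pairs of rows of a and b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd2(x, y, sigma=4.0):
    """Biased squared-MMD estimate between two samples (rows are vectors)."""
    return (gaussian_kernel(x, x, sigma).mean()
            + gaussian_kernel(y, y, sigma).mean()
            - 2.0 * gaussian_kernel(x, y, sigma).mean())

rng = np.random.default_rng(0)
target = rng.standard_normal((128, 8))            # vectors drawn from N(0, I)
feats_far = rng.standard_normal((128, 8)) + 3.0   # encoder features far from N(0, I)
feats_near = rng.standard_normal((128, 8))        # features already close to N(0, I)

loss_far = mmd2(feats_far, target)    # large distribution loss -> large update
loss_near = mmd2(feats_near, target)  # small distribution loss -> small update
```

Minimizing this loss with respect to the encoder parameters is what pulls the current vector distribution toward the standard normal distribution.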
  • S1104 Based on the feature vector, generate a first output image corresponding to the first style image sample.
  • S1105 Input the second style image sample corresponding to the first style image sample and the first output image into the discriminator to complete this round of iterative training, and enter the next round of iterative training until a preset convergence condition is reached, to obtain the trained image generation model.
  • during training, the current vector distribution in the encoder is continuously adjusted, so that the image generation model obtained after training can generate style images with better quality and effect, meeting the user's image style conversion needs and improving the user's experience.
  • the first style image is converted into the second style image.
  • referring to FIG. 12, which is a flowchart of another style image generation method provided by an embodiment of the present disclosure, the method includes:
  • S1202 Map the first style image into a normal distribution to obtain a feature vector of the first style image; wherein, the latent space of the image generation model is constrained to be a normal distribution.
  • the first style image is input to the encoder of the image generation model, and the encoder maps the first style image to the trained normal distribution to obtain the feature vector of the first style image, realizing feature vector extraction.
  • the embodiments of the present disclosure can be used to generate high-quality style images based on the feature vector mapped to the normal distribution after training, so as to improve user experience.
  • S1203 Based on the feature vector, generate a second style image corresponding to the first style image.
  • the feature vector is input into the decoder of the image generation model
  • the second style image corresponding to the first style image is obtained after decoding by the decoder
  • that is, after the feature vector is obtained, the image generation model outputs the second style image corresponding to the first style image.
  • in this way, a style image with better quality and effect can be generated, which better meets the user's image style conversion needs and improves the user's experience.
  • since the distance between the mapped feature vector and the vector corresponding to the origin of the normal distribution can represent the aesthetics of the style image generated based on that feature vector, the aesthetics of the generated style image can be improved by adjusting this distance.
  • the embodiments of the present disclosure therefore provide a target weight coefficient to represent the distance between the feature vector and the origin of the normal distribution; by adjusting the target weight coefficient, the distance can be changed, thereby affecting the aesthetics of the generated style image.
  • the image generation model may update the feature vector of the first style image based on the target weight coefficient to obtain the updated feature vector. Then, based on the updated feature vector, a second style image corresponding to the first style image is generated to change the aesthetics of the style image.
  • the target weight coefficient may be adjusted by the user to obtain a style image that meets the user's aesthetic requirements.
  • the image generation model receives the user's input operation on the target weight coefficient, obtains the target weight coefficient corresponding to the operation, and then updates the feature vector of the first style image based on the target weight coefficient to obtain the updated feature vector. Finally, based on the updated feature vector, a style image that meets the user's aesthetic requirements is generated.
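The feature-vector update by the target weight coefficient described above can be sketched as a simple scaling relative to the origin of the normal distribution. Treating the coefficient as a multiplicative factor is an assumption for illustration; the disclosure does not give an explicit update formula:

```python
import numpy as np

def update_feature_vector(z, target_weight):
    """Scale the feature vector so that its distance to the origin of N(0, I)
    is controlled by `target_weight` (assumed multiplicative form)."""
    return target_weight * np.asarray(z, dtype=float)

z = np.array([0.5, -1.0, 2.0])              # feature vector of the first style image
z_half = update_feature_vector(z, 0.5)      # pulled toward the origin
z_double = update_feature_vector(z, 2.0)    # pushed away from the origin
```

A user-supplied target weight coefficient would simply be passed in as `target_weight`, and the decoder would then generate the second style image from the updated vector.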
  • the present disclosure also provides a style image generating apparatus.
  • referring to FIG. 13, a schematic structural diagram of a style image generating apparatus provided in an embodiment of the present disclosure; the apparatus is applied to an image generation model and includes:
  • a receiving module 1301, configured to receive a first style image
  • the image generation model is obtained by training based on the first style image samples and the second style image samples having a corresponding relationship, and the latent space of the image generation model is constrained during the training process.
  • the latent space of the image generation model is constrained to a vector dictionary containing a preset number of latent vectors; the generation module 1302 includes:
  • an extraction submodule for extracting the feature vector of the first style image
  • a determining submodule, configured to determine, from the vector dictionary, the latent vector with the smallest distance to the feature vector, as the target vector corresponding to the feature vector;
  • the first generating submodule is configured to generate a second style image corresponding to the first style image based on the target vector.
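The lookup performed by the determining submodule can be sketched as a nearest-neighbour search over the vector dictionary (a vector-quantization-style step). The dictionary size and vector dimension below are illustrative assumptions:

```python
import numpy as np

def nearest_latent(feature, dictionary):
    """Return the latent vector in `dictionary` (rows) closest to `feature`
    in Euclidean distance -- the target vector for that feature vector."""
    dists = np.linalg.norm(dictionary - feature, axis=1)
    idx = int(np.argmin(dists))
    return dictionary[idx], idx

rng = np.random.default_rng(0)
vector_dictionary = rng.standard_normal((32, 4))   # 32 latent vectors (assumed sizes)
feature = vector_dictionary[7] + 0.01              # a feature lying near entry 7
target, idx = nearest_latent(feature, vector_dictionary)
```

The decoder would then generate the second style image from `target` rather than from the raw feature vector, which is what constrains the latent space to the dictionary.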
  • the latent space of the image generation model is constrained to be a normal distribution; the generation module 1302 includes:
  • mapping submodule for mapping the first style image to the normal distribution to obtain a feature vector of the first style image
  • the second generating sub-module is configured to generate a second style image corresponding to the first style image based on the feature vector.
  • the generating module 1302 further includes:
  • an update sub-module for updating the feature vector based on the target weight coefficient to obtain the updated feature vector;
  • the target weight coefficient is used to represent the distance between the feature vector and the origin of the normal distribution;
  • the second generation submodule is specifically used for:
  • generating a second style image corresponding to the first style image based on the updated feature vector.
  • the generating module 1302 further includes:
  • the obtaining sub-module is configured to obtain the target weight coefficient in response to the input operation of the target weight coefficient.
  • the first style image includes a line art style image
  • the second style image includes a comic style image
  • in the above style image generation method, by constraining the latent space of the image generation model during its training, a style image with higher quality and better effect can be generated in the application stage of the image generation model, improving the user experience.
  • the present disclosure also provides an apparatus for training an image generation model.
  • FIG. 14 it is a schematic structural diagram of an apparatus for training an image generation model provided in an embodiment of the present disclosure.
  • the device includes:
  • an acquisition module 1401 configured to acquire a first style image sample and a second style image sample with a corresponding relationship
  • the constraint module 1402 is configured to constrain the latent space during the training process based on the first style image samples and the second style image samples having the corresponding relationship to obtain a trained image generation model.
  • the constraint module 1402 includes:
  • an extraction submodule for extracting the feature vector of the first style image sample
  • a determination submodule, configured to determine, from the vector dictionary, the latent vector with the smallest distance to the feature vector as the target vector corresponding to the feature vector; a preset number of latent vectors are stored in the vector dictionary;
  • a first generating submodule configured to generate a first output image corresponding to the first style image sample based on the target vector
  • the first processing sub-module is used for inputting the second style image sample corresponding to the first style image sample and the first output image into the discriminator, and obtaining the loss value after processing by the discriminator;
  • the first update sub-module is configured to update the current vector dictionary based on the loss value, and enter the next round of iterative training until a preset convergence condition is reached, and a trained image generation model is obtained.
  • the constraint module 1402 further includes:
  • a first enhancement sub-module configured to perform image enhancement processing on the first style image sample based on the target image enhancement method to obtain an enhanced image sample
  • the extraction submodule is specifically used for:
  • Feature vectors of the enhanced image samples are extracted.
  • the constraint module 1402 includes:
  • mapping submodule configured to map the first style image sample to the current vector distribution after inputting the first style image sample into the image generation model, to obtain a feature vector of the first style image sample
  • the second update submodule is used to update the current vector distribution based on the standard normal distribution and the feature vector using the maximum mean discrepancy algorithm;
  • a second generating submodule configured to generate a first output image corresponding to the first style image sample based on the feature vector
  • the training sub-module is used to input the second style image sample corresponding to the first style image sample and the first output image into the discriminator to complete this round of iterative training, and to enter the next round of iterative training until the preset convergence condition is reached, obtaining the trained image generation model.
  • the constraint module 1402 further includes:
  • a second enhancement sub-module configured to perform image enhancement processing on the first style image sample based on the target image enhancement method to obtain an enhanced image sample
  • mapping submodule is specifically used for:
  • the enhanced image sample is mapped to the current vector distribution to obtain the feature vector of the first style image sample.
  • in this way, the latent space of the image generation model is continuously constrained in each round of iterative training, and an image generation model with a constrained latent space is finally obtained, which can be used for image style conversion.
  • embodiments of the present disclosure also provide a computer-readable storage medium, where instructions are stored in the computer-readable storage medium; when the instructions are run on a terminal device, the terminal device is caused to implement the style image generation method or the image generation model training method according to the embodiments of the present disclosure.
  • an embodiment of the present disclosure further provides a style image generating device, as shown in FIG. 15 , which may include:
  • the number of processors 1501 in the style image generating device may be one or more, and one processor is taken as an example in FIG. 15 .
  • the processor 1501, the memory 1502, the input device 1503, and the output device 1504 may be connected through a bus or other means, wherein the connection through a bus is taken as an example in FIG. 15 .
  • the memory 1502 can be used to store computer programs and modules, and the processor 1501 executes various functional applications and data processing of the style image generating apparatus by running the computer programs and modules stored in the memory 1502 .
  • the memory 1502 may mainly include a stored program area and a stored data area, wherein the stored program area may store an operating system, an application program required for at least one function, and the like. Additionally, the memory 1502 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
  • the input device 1503 may be used to receive input numerical or character information, and generate signal input related to user settings and function control of the style image generating apparatus.
  • the processor 1501 loads the executable files corresponding to the processes of one or more computer programs into the memory 1502, and executes the computer programs stored in the memory 1502, thereby realizing each step of the above style image generation method.
  • an embodiment of the present disclosure also provides a training device for an image generation model, as shown in FIG. 16 , which may include:
  • the number of processors 1601 in the image generation model training device may be one or more, and one processor is taken as an example in FIG. 16 .
  • the processor 1601 , the memory 1602 , the input device 1603 and the output device 1604 may be connected by a bus or other means, wherein the connection by a bus is taken as an example in FIG. 16 .
  • the memory 1602 can be used to store computer programs and modules, and the processor 1601 executes various functional applications and data processing of the image generation model training device by running the computer programs and modules stored in the memory 1602 .
  • the memory 1602 may mainly include a stored program area and a stored data area, wherein the stored program area may store an operating system, an application program required for at least one function, and the like. Additionally, the memory 1602 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
  • the input device 1603 may be used to receive input numerical or character information, and to generate signal input related to user settings and functional control of the training device for the image generation model.
  • the processor 1601 loads the executable files corresponding to the processes of one or more computer programs into the memory 1602, and executes the computer programs stored in the memory 1602, thereby realizing each step of the above training method of an image generation model.


Abstract

A style image generation method, a model training method and apparatus, a device, and a storage medium. The method comprises: first, receiving a first style image; and then, after performing style conversion processing on the first style image, generating a second style image. By means of the method, the latent space of an image generation model is constrained during the training of the image generation model, such that the image generation model applied in the style image generation method can generate a style image with relatively good quality and effect, thereby improving the user experience.

Description

Style image generation method, model training method, apparatus, device, and medium
This disclosure claims priority to the Chinese patent application No. 202011197824.4, entitled "Style image generation method, model training method, apparatus, device, and medium" and filed with the China Patent Office on October 30, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the technical field of data processing, and in particular, to a style image generation method, a model training method, an apparatus, a device, and a storage medium.
Background
With the continuous development of image processing technology, the image style conversion function has become a new and interesting feature in the field of image applications. Specifically, image style conversion refers to converting an image from one style into another style that meets user needs.
At present, machine models have become a common approach in image processing; for example, image style conversion can be realized based on a machine model. However, how to improve the quality of the style images obtained by machine-model-based image style conversion is a technical problem that urgently needs to be solved.
Summary of the Invention
In order to solve the above technical problems, or at least partially solve them, the present disclosure provides a style image generation method, a model training method, an apparatus, a device, and a storage medium, which can generate high-quality style images and improve the user experience.
In a first aspect, the present disclosure provides a style image generation method, executed by an image generation model, the method comprising:
receiving a first style image;
generating a second style image after performing style conversion processing on the first style image;
wherein the image generation model is trained based on first style image samples and second style image samples having a corresponding relationship, and the latent space of the image generation model is constrained during training.
In an optional implementation, the latent space of the image generation model is constrained to a vector dictionary containing a preset number of latent vectors; the generating a second style image after performing style conversion processing on the first style image comprises:
extracting a feature vector of the first style image;
determining, from the vector dictionary, the latent vector with the smallest distance to the feature vector as a target vector corresponding to the feature vector;
generating, based on the target vector, a second style image corresponding to the first style image.
In an optional implementation, the latent space of the image generation model is constrained to a normal distribution; the generating a second style image after performing style conversion processing on the first style image comprises:
mapping the first style image into the normal distribution to obtain a feature vector of the first style image;
generating, based on the feature vector, a second style image corresponding to the first style image.
In an optional implementation, before the generating, based on the feature vector, a second style image corresponding to the first style image, the method further comprises:
updating the feature vector based on a target weight coefficient to obtain an updated feature vector, wherein the target weight coefficient is used to represent the distance between the feature vector and the origin of the normal distribution;
correspondingly, the generating, based on the feature vector, a second style image corresponding to the first style image comprises:
generating, based on the updated feature vector, a second style image corresponding to the first style image.
In an optional implementation, before the updating the feature vector based on a target weight coefficient to obtain an updated feature vector, the method further comprises:
acquiring the target weight coefficient in response to an input operation of the target weight coefficient.
In an optional implementation, the first style image comprises a line art style image, and the second style image comprises a comic style image.
In a second aspect, the present disclosure provides a training method for an image generation model, the method comprising:
acquiring first style image samples and second style image samples having a corresponding relationship;
constraining the latent space, in the process of training based on the first style image samples and the second style image samples having the corresponding relationship, to obtain a trained image generation model.
In an optional implementation, the constraining the latent space in the process of training based on the first style image samples and the second style image samples having the corresponding relationship, to obtain a trained image generation model, comprises:
extracting a feature vector of the first style image sample;
determining, from a vector dictionary, the latent vector with the smallest distance to the feature vector as a target vector corresponding to the feature vector, wherein a preset number of latent vectors are stored in the vector dictionary;
generating, based on the target vector, a first output image corresponding to the first style image sample;
inputting a second style image sample having a corresponding relationship with the first style image sample, together with the first output image, into a discriminator, and obtaining a loss value after processing by the discriminator;
updating the vector dictionary based on the loss value, and entering the next round of iterative training until a preset convergence condition is reached, to obtain a trained image generation model.
In an optional implementation, before the extracting a feature vector of the first style image sample, the method further comprises:
performing image enhancement processing on the first style image sample based on a target image enhancement method to obtain an enhanced image sample;
correspondingly, the extracting a feature vector of the first style image sample comprises:
extracting a feature vector of the enhanced image sample.
In an optional implementation, the constraining the latent space in the process of training based on the first style image samples and the second style image samples having the corresponding relationship, to obtain a trained image generation model, comprises:
mapping the first style image sample into a current vector distribution to obtain a feature vector of the first style image sample;
updating the current vector distribution based on a standard normal distribution and the feature vector using a maximum mean discrepancy algorithm;
generating, based on the feature vector, a first output image corresponding to the first style image sample;
inputting a second style image sample having a corresponding relationship with the first style image sample, together with the first output image, into a discriminator to complete the current round of iterative training, and entering the next round of iterative training until a preset convergence condition is reached, to obtain a trained image generation model.
In an optional implementation, before the mapping the first style image sample into a current vector distribution to obtain a feature vector of the first style image sample, the method further comprises:
performing image enhancement processing on the first style image sample based on a target image enhancement method to obtain an enhanced image sample;
correspondingly, the mapping the first style image sample into the current vector distribution to obtain a feature vector of the first style image sample comprises:
mapping the enhanced image sample into the current vector distribution to obtain the feature vector of the first style image sample.
In a third aspect, the present disclosure provides a style image generation apparatus applied to an image generation model, the apparatus comprising:
a receiving module, configured to receive a first style image;
a generation module, configured to generate a second style image after performing style conversion processing on the first style image;
wherein the image generation model is trained based on first style image samples and second style image samples having a corresponding relationship, and the latent space of the image generation model is constrained during training.
In a fourth aspect, the present disclosure provides a training apparatus for an image generation model, the apparatus comprising:
an acquisition module, configured to acquire first style image samples and second style image samples having a corresponding relationship;
a constraint module, configured to constrain the latent space in the process of training based on the first style image samples and the second style image samples having the corresponding relationship, to obtain a trained image generation model.
In a fifth aspect, the present disclosure provides a computer-readable storage medium storing instructions which, when run on a terminal device, cause the terminal device to implement the above method.
In a sixth aspect, the present disclosure provides a device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the above method.
Compared with the prior art, the technical solutions provided by the embodiments of the present disclosure have the following advantages:
An embodiment of the present disclosure provides a style image generation method. First, an image generation model receives a first style image; then, after performing style conversion processing on the first style image, a second style image is generated. By constraining the latent space of the image generation model during its training, the embodiments of the present disclosure enable the image generation model to generate style images with better quality and effect, improving the user experience.
Brief Description of the Drawings
The accompanying drawings, which are incorporated into and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the present disclosure.
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the accompanying drawings required in the description of the embodiments or the prior art are briefly introduced below. Obviously, for those of ordinary skill in the art, other drawings can also be obtained from these drawings without creative effort.
FIG. 1 is an architecture diagram of an application environment of a style image generation method provided by an embodiment of the present disclosure;
FIG. 2 is a flowchart of a style image generation method provided by an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of the effect of implementing image style conversion based on an image generation model provided by an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a training process of an image generation model provided by an embodiment of the present disclosure;
FIG. 5 is a flowchart of a training method for an image generation model provided by an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of a model training process for constraining the latent space of an image generation model provided by an embodiment of the present disclosure;
FIG. 7 is a flowchart of a model training method for constraining the latent space of an image generation model provided by an embodiment of the present disclosure;
FIG. 8 is an image enhancement effect diagram provided by an embodiment of the present disclosure;
FIG. 9 is a flowchart of a style image generation method provided by an embodiment of the present disclosure;
FIG. 10 is a schematic diagram of another model training process for constraining the latent space of an image generation model provided by an embodiment of the present disclosure;
FIG. 11 is a flowchart of another model training method for constraining the latent space of an image generation model provided by an embodiment of the present disclosure;
FIG. 12 is a flowchart of another style image generation method provided by an embodiment of the present disclosure;
FIG. 13 is a schematic structural diagram of a style image generating apparatus provided by an embodiment of the present disclosure;
图14为本公开实施例提供的一种图像生成模型的训练装置的结构示意图;14 is a schematic structural diagram of an apparatus for training an image generation model according to an embodiment of the present disclosure;
图15为本公开实施例提供的一种风格图像生成设备的结构示意图;FIG. 15 is a schematic structural diagram of a style image generating device according to an embodiment of the present disclosure;
图16为本公开实施例提供的一种图像生成模型的训练设备的结构示意图。FIG. 16 is a schematic structural diagram of a training device for an image generation model according to an embodiment of the present disclosure.
具体实施方式 Detailed Description of Embodiments
为了能够更清楚地理解本公开的上述目的、特征和优点,下面将对本公开的方案进行进一步描述。需要说明的是,在不冲突的情况下,本公开的实施例及实施例中的特征可以相互组合。In order to more clearly understand the above objects, features and advantages of the present disclosure, the solutions of the present disclosure will be further described below. It should be noted that the embodiments of the present disclosure and the features in the embodiments may be combined with each other under the condition of no conflict.
在下面的描述中阐述了很多具体细节以便于充分理解本公开,但本公开还可以采用其他不同于在此描述的方式来实施;显然,说明书中的实施例只是本公开的一部分实施例,而不是全部的实施例。Many specific details are set forth in the following description to facilitate a full understanding of the present disclosure, but the present disclosure can also be implemented in other ways different from those described herein; obviously, the embodiments in the specification are only a part of, rather than all of, the embodiments of the present disclosure.
本公开提供了一种风格图像生成方法,该风格图像生成方法由图像生成模型执行,并且,图像生成模型在训练的过程中约束了隐空间,因此,图像生成模型在接收第一风格图像对其进行风格转换处理后,生成的第二风格图像的质量和效果能够得到保证,进而提升了用户体验。The present disclosure provides a style image generation method, which is performed by an image generation model; and since the image generation model constrains the latent space during training, after the image generation model receives a first style image and performs style conversion processing on it, the quality and effect of the generated second style image can be guaranteed, thereby improving the user experience.
在对本公开提供的风格图像生成方法进行具体介绍之前,首先介绍一下本公开提供的风格图像生成方法的应用环境。Before the specific introduction of the style image generation method provided by the present disclosure, the application environment of the style image generation method provided by the present disclosure is first introduced.
参考图1,为本公开实施例提供的一种风格图像生成方法的应用环境架构图,其中,本公开实施例提供的风格图像生成方法可以应用于服务端102,具体的,终端101在获取到第一风格图像后,通过终端101和服务器102之间的网络连接,将第一风格图像发送至服务端102,由部署于服务端102的图像生成模型对第一风格图像进行风格转换处理,生成第二风格图像。其中,终端101可以为台式计算机、移动终端(如智能手机、智能眼镜、平板电脑、笔记本电脑、可穿戴电子设备、智能家居设备等),服务器102可以为独立的服务器,也可以由多个服务器组成的服务器集群实现。Referring to FIG. 1, which is an application environment architecture diagram of a style image generation method provided by an embodiment of the present disclosure, the style image generation method provided by the embodiment of the present disclosure can be applied to the server 102. Specifically, after acquiring the first style image, the terminal 101 sends the first style image to the server 102 through the network connection between the terminal 101 and the server 102, and the image generation model deployed on the server 102 performs style conversion processing on the first style image to generate a second style image. The terminal 101 may be a desktop computer or a mobile terminal (such as a smartphone, smart glasses, a tablet computer, a laptop computer, a wearable electronic device, a smart home device, etc.), and the server 102 may be an independent server or may be implemented by a server cluster composed of multiple servers.
实际应用中,终端101获取到第一风格图像后,将其发送至服务器102。服务器102接收到第一风格图像后,进行风格转换处理,生成第二风格图像。其中,图像生成模型是服务器102基于具有对应关系的第一风格图像样本和第二风格图像样本训练得到,且在训练的过程中对图像生成模型的隐空间进行约束。In practical applications, after acquiring the first style image, the terminal 101 sends it to the server 102 . After receiving the first style image, the server 102 performs style conversion processing to generate a second style image. The image generation model is obtained through training by the server 102 based on the first style image samples and the second style image samples having the corresponding relationship, and the latent space of the image generation model is constrained during the training process.
在另一种应用环境中,本公开实施例提供的风格图像生成方法可以直接应用于终端101,具体的,终端101在接收到第一风格图像后,将该第一风格图像发送至图像生成模型中,由图像生成模型进行风格转换处理后,生成第二风格图像。其中,图像生成模型是终端101基于具有对应关系的第一风格图像样本和第二风格图像样本训练得到,且在训练的过程中对图像生成模型的隐空间进行约束。In another application environment, the style image generation method provided by the embodiment of the present disclosure can be directly applied to the terminal 101. Specifically, after receiving the first style image, the terminal 101 sends the first style image to the image generation model, and the image generation model performs style conversion processing to generate a second style image. The image generation model is obtained by the terminal 101 through training based on first style image samples and second style image samples having a corresponding relationship, and the latent space of the image generation model is constrained during the training process.
在上述应用环境的基础上,以下对本公开提供的风格图像生成方法进行具体介绍。On the basis of the above application environment, the style image generation method provided by the present disclosure will be specifically introduced below.
具体的,本公开实施例提供了一种风格图像生成方法,参考图2,为本公开实施例提供的一种风格图像生成方法的流程图,该方法可以应用于图像生成模型中,该方法包括:Specifically, an embodiment of the present disclosure provides a style image generation method. Referring to FIG. 2, which is a flowchart of a style image generation method provided by an embodiment of the present disclosure, the method can be applied to an image generation model, and the method includes:
S201:接收第一风格图像。S201: Receive a first style image.
本公开实施例中,第一风格图像可以为任意风格的图像,例如可以为线稿风格图像、二次元风格图像、素描风格图像、油画风格图像、卡通风格图像、真实风格图像、漫画风格图像等。In the embodiment of the present disclosure, the first style image may be an image of any style, such as a line art style image, a two-dimensional style image, a sketch style image, an oil painting style image, a cartoon style image, a real style image, a comic style image, etc. .
实际应用中,第一风格图像可以为终端调用摄像头实时拍摄的图像,也可以为用户在终端界面上实时绘制的图像,还可以为从终端相册中获取的图像等。本公开不对此进行限制。In practical applications, the first style image may be an image captured in real time by a camera invoked by the terminal, an image drawn in real time by a user on a terminal interface, or an image obtained from an album of the terminal. This disclosure does not limit this.
一种可选的实施例中,第一风格图像可以包括脸部,例如第一风格图像可以为人脸图像,基于本公开实施例提供的风格图像生成方法能够对人脸图像的风格进行转换,例如将线稿人脸图像转换为漫画人脸图像。In an optional embodiment, the first style image may include a face. For example, the first style image may be a face image, and the style of the face image can be converted based on the style image generation method provided by the embodiment of the present disclosure, for example, converting a line art face image into a comic-style face image.
S202:对第一风格图像进行风格转换处理后,生成第二风格图像。S202: After performing style conversion processing on the first style image, a second style image is generated.
其中,图像生成模型为基于具有对应关系的第一风格图像样本与第二风格图像样本训练得到,图像生成模型的隐空间在训练的过程中被约束。The image generation model is obtained by training based on the image samples of the first style and the image samples of the second style that have a corresponding relationship, and the latent space of the image generation model is constrained during the training process.
本公开实施例中,图像生成模型用于将第一风格图像转换为第二风格图像。其中,第一风格和第二风格属于图像的两种不同风格,具体的,图像的风格可以包括线稿风格、二次元风格、素描风格、油画风格、卡通风格、真实风格、漫画风格等。本公开实施例提供的图像生成模型用于将图像从一种风格转换为另一种风格。In the embodiment of the present disclosure, the image generation model is used to convert the first style image into the second style image. Among them, the first style and the second style belong to two different styles of the image. Specifically, the style of the image may include a line art style, a two-dimensional style, a sketch style, an oil painting style, a cartoon style, a real style, and a comic style. The image generation model provided by the embodiments of the present disclosure is used to convert an image from one style to another.
参考图3,为本公开实施例提供的一种基于图像生成模型实现图像风格转换的效果示意图。以第一风格图像为线稿风格图像,第二风格图像为漫画风格图像为例,将第一风格图像输入图像生成模型之后,经过图像生成模型的处理,得到与第一风格图像对应的第二风格图像,实现将线稿风格图像转换为漫画风格图像的效果。Referring to FIG. 3, which is a schematic diagram of the effect of implementing image style conversion based on an image generation model according to an embodiment of the present disclosure, taking the first style image being a line art style image and the second style image being a comic style image as an example, after the first style image is input into the image generation model and processed by the image generation model, a second style image corresponding to the first style image is obtained, achieving the effect of converting a line art style image into a comic style image.
实际应用中,在对图像生成模型进行应用之前,首先对图像生成模型进行训练。本公开实施例中,基于具有对应关系的第一风格图像样本和第二风格图像样本对图像生成模型进行训练,得到训练后的图像生成模型,用于对图像的风格进行转换。In practical applications, before applying the image generation model, the image generation model is first trained. In the embodiment of the present disclosure, the image generation model is trained based on the first style image samples and the second style image samples with the corresponding relationship, and the trained image generation model is obtained, which is used to convert the style of the image.
在对图像生成模型进行训练的过程中,本公开实施例对图像生成模型的隐空间latent space进行约束,使得在基于质量较高的图像样本进行多轮迭代训练中约束的隐空间,能够在图像生成模型应用的过程中,对输入的第一风格图像的特征向量进行较好的约束,最终生成质量较高的第二风格图像,提升用户的体验。In the process of training the image generation model, the embodiment of the present disclosure constrains the latent space of the image generation model, so that the latent space, constrained over multiple rounds of iterative training based on high-quality image samples, can better constrain the feature vector of the input first style image when the image generation model is applied, finally generating a higher-quality second style image and improving the user experience.
其中,对隐空间进行约束是指对隐空间中的隐向量(latent code)进行约束,通过对隐向量的约束使得模型应用阶段被输入到图像生成模型的解码器的特征向量被约束,最终经过解码器的解码得到质量较高的第二风格图像。Here, constraining the latent space means constraining the latent codes in the latent space; by constraining the latent vectors, the feature vector input to the decoder of the image generation model in the model application stage is constrained, and decoding by the decoder finally yields a higher-quality second style image.
本公开实施例提供的风格图像生成方法中,通过在图像生成模型训练的过程中对图像生成模型的隐空间进行约束,使得在图像生成模型的应用阶段,能够生成质量和效果较高的风格图像,提升用户的体验。In the style image generation method provided by the embodiment of the present disclosure, by constraining the latent space of the image generation model during the training of the image generation model, style images with higher quality and better effect can be generated in the application stage of the image generation model, improving the user experience.
为了便于对方案的理解,在对后续的风格图像生成方法进行介绍之前,本公开实施例首先对图像生成模型的训练方法进行介绍。参考图4,为本公开实施例提供的一种图像生成模型的训练过程的示意图,其中,待训练图像生成模型可以为条件对抗生成网络CGAN,该条件对抗生成网络包括生成器和判别器,通过对条件对抗生成网络进行训练,从而得到训练后的图像生成模型。In order to facilitate the understanding of the solution, before introducing the subsequent style image generation method, the embodiment of the present disclosure first introduces the training method of the image generation model. Referring to FIG. 4, which is a schematic diagram of a training process of an image generation model provided by an embodiment of the present disclosure, the image generation model to be trained may be a conditional generative adversarial network (CGAN), which includes a generator and a discriminator; the trained image generation model is obtained by training the conditional generative adversarial network.
实际训练过程中,首先获取具有对应关系的第一风格图像样本和第二风格图像样本,通过将第一风格图像样本输入至生成器得到输出图像,然后将该生成器的输出图像和与第一风格图像样本具有对应关系的第二风格图像样本,同时输入至判别器,基于判别器输出的损失值对生成器中的参数进行调整,以完成对生成器的本轮迭代训练。依照上述训练方式,基于大量质量较高的具有对应关系的第一风格图像样本和第二风格图像样本,利用判别器对生成器进行多轮训练,直到达到预设收敛条件结束训练,从而得到训练后的图像生成模型。In the actual training process, a first style image sample and a second style image sample having a corresponding relationship are first obtained. An output image is obtained by inputting the first style image sample into the generator, and then the output image of the generator and the second style image sample corresponding to the first style image sample are input into the discriminator at the same time. The parameters in the generator are adjusted based on the loss value output by the discriminator, so as to complete the current round of iterative training of the generator. Following the above training method, the discriminator is used to train the generator over multiple rounds based on a large number of high-quality first style image samples and second style image samples with corresponding relationships, until a preset convergence condition is reached and training ends, thereby obtaining the trained image generation model.
一种可选的实施方式中,第二风格图像样本可以为由专业人员基于第一风格图像样本绘制的,由此,第一风格样本图像和第二样本图像具有对应关系,并且经过专业人员绘制的图像还可以保证图像样本的质量,进而保证基于图像样本训练得到的图像生成模型能够生成质量较高的风格图像。In an optional implementation, the second style image sample may be drawn by a professional based on the first style image sample. In this way, the first style sample image and the second style sample image have a corresponding relationship, and professionally drawn images also guarantee the quality of the image samples, thereby ensuring that the image generation model trained on the image samples can generate high-quality style images.
参考图5,为本公开实施例提供的一种图像生成模型的训练方法的流程图,其中,该方法可以应用于待训练图像生成模型,该方法包括:Referring to FIG. 5 , a flowchart of an image generation model training method provided in an embodiment of the present disclosure, wherein the method can be applied to an image generation model to be trained, and the method includes:
S501:获取具有对应关系的第一风格图像样本与第二风格图像样本。S501: Acquire a first style image sample and a second style image sample having a corresponding relationship.
其中,第一风格图像样本与第二风格图像样本可以为线稿风格图像、二次元风格图像、素描风格图像、油画风格图像、卡通风格图像、真实风格图像、漫画风格图像等中的两种不同风格图像。例如,第一风格图形样本包括线稿风格图像,第二风格图形样本包括漫画风格图像。The first style image sample and the second style image sample may be two different styles of images among line art style images, two-dimensional style images, sketch style images, oil painting style images, cartoon style images, real style images, comic style images, and the like. For example, the first style image sample includes a line art style image, and the second style image sample includes a comic style image.
S502:在基于具有对应关系的第一风格图像样本与第二风格图像样本进行训练的过程中,对图像生成模型的隐空间进行约束,得到训练后的图像生成模型。S502: In the process of training based on the first style image sample and the second style image sample having the corresponding relationship, constrain the latent space of the image generation model to obtain a trained image generation model.
本公开实施例中,在对图像生成模型进行训练的过程中,在每轮迭代训练中分别对图像生成模型的隐空间进行不断的约束,最终得到隐空间被约束的图像生成模型,能够用于图像风格的转换。In the embodiment of the present disclosure, in the process of training the image generation model, the latent space of the image generation model is continuously constrained in each round of iterative training, and finally an image generation model with a constrained latent space is obtained, which can be used for image style conversion.
为了进一步的对图像生成模型的训练过程进行介绍,本公开实施例提供了以下两种对图像生成模型的隐空间进行约束的模型训练方法。In order to further introduce the training process of the image generation model, the embodiments of the present disclosure provide the following two model training methods for constraining the latent space of the image generation model.
参考图6,为本公开实施例提供的一种对图像生成模型的隐空间进行约束的模型训练过程示意图,其中,图像生成模型的生成器包括编码器和解码器,在模型训练的过程中通过对编码器和解码器中参数的调整实现对生成器的训练,得到训练后的图像生成模型。Referring to FIG. 6, which is a schematic diagram of a model training process for constraining the latent space of an image generation model provided by an embodiment of the present disclosure, the generator of the image generation model includes an encoder and a decoder; during model training, the generator is trained by adjusting the parameters in the encoder and the decoder, and the trained image generation model is obtained.
实际训练过程中,将第一风格图像样本输入至生成器中的编码器,由编码器提取该第一风格图像样本的特征向量,然后从向量字典中确定与该特征向量的距离最小的隐向量latent vector,作为该特征向量对应的目标向量。其中,向量字典用于存储预设个数的隐向量。在确定该特征向量对应的目标向量之后,将该目标向量输入至解码器中,由解码器解码得到该第一风格图像样本对应的第一输出图像,并将第一输出图像和与该第一风格图像样本对应的第二风格图像样本同时输入至判别器中,经过判别器的处理后,得到损失值。然后基于损失值更新向量字典,以使得向量字典中的隐向量被不断调整为能够生成质量较高的风格图像的向量。依照上述训练方式,基于质量较高的图像样本通过多轮迭代训练最终得到训练后的图像生成模型,用于对图像的风格进行转换。In the actual training process, the first style image sample is input to the encoder in the generator, and the encoder extracts the feature vector of the first style image sample; then the latent vector with the smallest distance to the feature vector is determined from the vector dictionary as the target vector corresponding to the feature vector. The vector dictionary is used to store a preset number of latent vectors. After the target vector corresponding to the feature vector is determined, the target vector is input into the decoder, which decodes it to obtain a first output image corresponding to the first style image sample; the first output image and the second style image sample corresponding to the first style image sample are then input into the discriminator at the same time, and a loss value is obtained after processing by the discriminator. The vector dictionary is then updated based on the loss value, so that the latent vectors in the vector dictionary are continuously adjusted into vectors that can generate higher-quality style images. Following the above training method, a trained image generation model is finally obtained through multiple rounds of iterative training based on high-quality image samples, and is used to convert the style of images.
本公开实施例提供的图像生成模型的训练方法中,在对图像生成模型训练的过程中,不断的对向量字典中的隐向量进行调整,实现对图像生成模型中隐空间的约束,使得经过隐空间约束的图像生成模型能够生成质量较高的风格图像,提升用户体验。In the training method of the image generation model provided by the embodiment of the present disclosure, the latent vectors in the vector dictionary are continuously adjusted during the training of the image generation model, thereby constraining the latent space of the image generation model, so that the latent-space-constrained image generation model can generate high-quality style images and improve the user experience.
与上述训练方法相对应的,本公开实施例还提供了一种对图像生成模型的隐空间进行约束的模型训练方法的流程图,参考图7,该方法包括:Corresponding to the above training method, an embodiment of the present disclosure also provides a flowchart of a model training method for constraining the latent space of an image generation model. Referring to FIG. 7 , the method includes:
S701:获取具有对应关系的第一风格图像样本与第二风格图像样本。S701: Acquire a first style image sample and a second style image sample having a corresponding relationship.
S702:提取第一风格图像样本的特征向量。S702: Extract the feature vector of the first style image sample.
本公开实施例中,当第一风格图像样本为线稿图像样本,且第二风格图像样本为漫画图像样本时,将第一风格图像样本输入至待训练图像生成模型的编码器后,由编码器提取到的特征向量可以为N(N为正整数,例如为25)个M维(M为正整数,例如为64)的特征向量。In the embodiment of the present disclosure, when the first style image sample is a line art image sample and the second style image sample is a comic image sample, after the first style image sample is input into the encoder of the image generation model to be trained, the feature vectors extracted by the encoder may be N (N is a positive integer, for example, 25) M-dimensional (M is a positive integer, for example, 64) feature vectors.
为了提高图像生成模型的鲁棒性,本公开实施例在将第一风格图像输入至待训练图像生成模型后,首先由待训练图像生成模型基于目标图像增强方式,对第一风格图像样本进行图像增强处理,得到增强后图像样本,然后提取增强后图像样本的特征向量。其中,目标图像增强方式可以为随机膨胀、腐蚀、旋转、平移、放缩和形变等图像增强方式之一。In order to improve the robustness of the image generation model, in the embodiment of the present disclosure, after the first style image is input into the image generation model to be trained, the model first performs image enhancement processing on the first style image sample based on a target image enhancement method to obtain an enhanced image sample, and then extracts the feature vector of the enhanced image sample. The target image enhancement method may be one of image enhancement methods such as random dilation, erosion, rotation, translation, scaling, and deformation.
本公开实施例中,在图像生成模型的训练过程中,对图像样本进行随机图像增强处理,使得训练后的图像生成模型对输入至模型的图像质量要求较低。也就是说,对于不同质量的图像均能够得到效果较好的风格转换图像。例如,对于线稿风格图像的笔画粗细要求较低,对于不同笔画粗细的线稿风格图像,均能够生成效果较好的漫画风格图像。如图8所示,为本公开实施例提供的一种数据增强效果图,其中,基于不同质量的A、B两张线稿风格图像,均能够得到质量较好的漫画风格图像。In the embodiment of the present disclosure, during the training of the image generation model, random image enhancement processing is performed on the image samples, so that the trained image generation model has lower requirements on the quality of the images input to the model. That is to say, style-converted images with good effect can be obtained for images of different quality. For example, the requirements on the stroke thickness of line art style images are low, and comic style images with good effect can be generated from line art style images with different stroke thicknesses. FIG. 8 shows a data enhancement effect diagram provided by an embodiment of the present disclosure, in which comic style images of good quality can be obtained from two line art style images A and B of different quality.
一种可选的实施方式中,基于随机形变的图像增强方式对第一风格图像样本进行图像增强处理,具体的,将预设图片划分为N×N个大小相同的正方形,在每个正方形中心随机确定一个偏移向量(dx,dy),然后将偏移向量线性地扩散到整个正方形中,且控制正方形边界处的偏移为0,得到预设图片对应的偏移场。最终,根据上述偏移场对第一风格图像样本进行图像增强处理,得到增强后图像样本。In an optional implementation, image enhancement processing is performed on the first style image sample based on a random deformation image enhancement method. Specifically, a preset picture is divided into N×N squares of the same size, an offset vector (dx, dy) is randomly determined at the center of each square, the offset vector is then linearly spread over the entire square, and the offset at the square border is controlled to be 0, so as to obtain an offset field corresponding to the preset picture. Finally, image enhancement processing is performed on the first style image sample according to the above offset field to obtain an enhanced image sample.
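The random-deformation step above can be sketched in NumPy as follows. This is a minimal illustration, not the patent's implementation: all function names are our own, the number of squares and offset magnitude are arbitrary demo values, and a simple nearest-neighbour warp stands in for whatever resampling the model actually uses. Each square's centre receives a random (dy, dx) offset that decays linearly to zero at the square borders, matching the constraint described in the paragraph above.

```python
import numpy as np

def random_offset_field(h, w, n=4, max_shift=3.0, rng=None):
    """Per-pixel (dy, dx) offset field: the h x w image is split into
    n x n equal squares; each square's centre gets a random offset that
    linearly decays to 0 at the square borders (so borders stay fixed)."""
    rng = np.random.default_rng(rng)
    field = np.zeros((h, w, 2))
    sh, sw = h // n, w // n
    # Triangular ("tent") weights: highest at the centre, exactly 0 at the edges.
    ty = 1.0 - np.abs(np.linspace(-1.0, 1.0, sh))
    tx = 1.0 - np.abs(np.linspace(-1.0, 1.0, sw))
    weight = np.outer(ty, tx)                        # shape (sh, sw)
    for i in range(n):
        for j in range(n):
            dy, dx = rng.uniform(-max_shift, max_shift, size=2)
            field[i * sh:(i + 1) * sh, j * sw:(j + 1) * sw, 0] = dy * weight
            field[i * sh:(i + 1) * sh, j * sw:(j + 1) * sw, 1] = dx * weight
    return field

def warp(img, field):
    """Backward-warp a grayscale image with nearest-neighbour sampling."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    sy = np.clip(np.round(ys + field[..., 0]).astype(int), 0, h - 1)
    sx = np.clip(np.round(xs + field[..., 1]).astype(int), 0, w - 1)
    return img[sy, sx]
```

Because the tent weights vanish exactly on the square boundaries, the offset field is zero there, which keeps adjacent squares consistent with each other as the text requires.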
S703:从向量字典中,确定与特征向量之间的距离最小的隐向量,作为特征向量对应的目标向量;其中,向量字典中存储有预设个数隐向量。S703: From the vector dictionary, determine the hidden vector with the smallest distance from the feature vector as the target vector corresponding to the feature vector; wherein, a preset number of hidden vectors is stored in the vector dictionary.
本公开实施例中,向量字典中可以包括64*64个64维的隐向量,在提取到第一风格图像样本对应的N个64维的特征向量之后,从向量字典中分别获取与N个64维特征向量中每个特征向量距离最小的隐向量,作为对应的特征向量的目标向量。In the embodiment of the present disclosure, the vector dictionary may include 64*64 latent vectors of 64 dimensions. After the N 64-dimensional feature vectors corresponding to the first style image sample are extracted, the latent vector with the smallest distance to each of the N 64-dimensional feature vectors is obtained from the vector dictionary and used as the target vector of the corresponding feature vector.
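The nearest-vector lookup of step S703 can be sketched as follows (a minimal NumPy illustration; the dictionary here is randomly initialised purely for demonstration, whereas in the patent it is a trained model parameter, and the Euclidean metric is an assumption since the patent only says "smallest distance"):

```python
import numpy as np

def quantize(features, dictionary):
    """For each feature vector, pick the latent vector in the dictionary
    with the smallest Euclidean distance (step S703)."""
    # features: (N, M); dictionary: (K, M)
    d2 = ((features[:, None, :] - dictionary[None, :, :]) ** 2).sum(axis=-1)
    idx = d2.argmin(axis=1)          # index of the closest latent vector
    return dictionary[idx], idx      # target vectors, shape (N, M)

rng = np.random.default_rng(0)
codebook = rng.normal(size=(64 * 64, 64))   # 64*64 latent vectors, 64-dim each
features = rng.normal(size=(25, 64))        # N = 25 extracted feature vectors
targets, idx = quantize(features, codebook)
```

The returned `targets` then replace the feature vectors before being passed to the decoder, as steps S704 and onward describe.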
S704:基于目标向量,生成第一风格图像样本对应的第一输出图像。S704: Based on the target vector, generate a first output image corresponding to the first style image sample.
本公开实施例中,利用目标向量替换对应的特征向量之后,将替换后的目标向量输入至解码器中,由解码器解码得到第一风格图像样本对应的第一输出图像。In the embodiment of the present disclosure, after replacing the corresponding feature vector with the target vector, the replaced target vector is input into the decoder, and the decoder decodes to obtain the first output image corresponding to the first style image sample.
S705:将与第一风格图像样本具有对应关系的第二风格图像样本和第一输出图像输入至判别器中,经过判别器的处理后,得到损失值。S705: Input the second style image sample and the first output image corresponding to the first style image sample into the discriminator, and obtain the loss value after being processed by the discriminator.
S706:基于损失值更新向量字典,并进入下一轮迭代训练,直到达到预设收敛条件,得到训练后的图像生成模型。S706: Update the vector dictionary based on the loss value, and enter the next round of iterative training until a preset convergence condition is reached, and a trained image generation model is obtained.
本公开实施例中,在模型训练的过程中,生成器的训练目标是使得损失值最小化,而判别器的训练目标是使得损失值最大化,在生成器和判别器的对抗训练中,通过不断的调整模型中的参数,使得判别器无法识别出生成器输出的第一输出图像的真伪,最终得到训练后的图像生成模型。In the embodiment of the present disclosure, during model training, the training objective of the generator is to minimize the loss value, while the training objective of the discriminator is to maximize the loss value. In the adversarial training between the generator and the discriminator, the parameters in the model are continuously adjusted so that the discriminator cannot tell whether the first output image produced by the generator is real or fake, and a trained image generation model is finally obtained.
本公开实施例中,将向量字典作为模型中的参数,在生成器和判别器的对抗训练中,需要不断的调整向量字典中的隐向量,使得最终得到的训练后的图像生成模型能够生成质量较高的风格图像。In the embodiment of the present disclosure, the vector dictionary is treated as a parameter of the model. During the adversarial training between the generator and the discriminator, the latent vectors in the vector dictionary need to be continuously adjusted, so that the finally obtained trained image generation model can generate high-quality style images.
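The opposing minimize/maximize objectives described above can be illustrated with the standard GAN losses. This is a hedged sketch: the patent does not specify the exact loss form, so the common binary cross-entropy formulation is assumed here, and the discriminator scores are hand-picked numbers rather than a real model's outputs.

```python
import numpy as np

def discriminator_loss(d_real, d_fake):
    """The discriminator wants real samples scored near 1 and generated
    samples near 0 (equivalently, it maximises the adversarial value)."""
    return -(np.log(d_real) + np.log(1.0 - d_fake)).mean()

def generator_loss(d_fake):
    """The generator wants the discriminator to score its output as real."""
    return -np.log(d_fake).mean()

# A generator that fools the discriminator (scores near 1) gets a lower loss
# than one whose output is easily recognised as fake.
good = generator_loss(np.array([0.9, 0.8]))
bad = generator_loss(np.array([0.1, 0.2]))
```

Training alternates between the two losses; with the vector dictionary treated as a model parameter, its latent vectors receive gradient updates from the same adversarial objective.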
基于上述对图像生成模型的隐空间进行约束的模型训练方法得到的图像生成模型,将第一风格图像转换为第二风格图像。具体的,参考图9,为本公开实施例提供的一种风格图像生成方法的流程图,该方法包括:Based on the image generation model obtained by the model training method for constraining the latent space of the image generation model, the first style image is converted into the second style image. Specifically, referring to FIG. 9 , which is a flowchart of a method for generating a style image provided by an embodiment of the present disclosure, the method includes:
S901:获取第一风格图像。S901: Acquire a first style image.
S902:提取第一风格图像的特征向量;其中,图像生成模型的隐空间被约束为包含预设个数隐向量的向量字典。S902: Extract the feature vector of the first style image; wherein, the latent space of the image generation model is constrained to be a vector dictionary including a preset number of latent vectors.
本公开实施例中,将第一风格图像输入至图像生成模型中的编码器之后,由编码器提取该第一风格图像的特征向量。In the embodiment of the present disclosure, after the first style image is input to the encoder in the image generation model, the encoder extracts the feature vector of the first style image.
S903:从向量字典中,确定与特征向量之间的距离最小的隐向量,作为特征向量对应 的目标向量。S903: From the vector dictionary, determine the hidden vector with the smallest distance from the feature vector as the target vector corresponding to the feature vector.
本公开实施例中,在提取到第一风格图像的特征向量之后,从已训练的向量字典中,确定与该特征向量之间的距离最小的隐向量,并将该隐向量作为该特征向量对应的目标向量。其中,隐向量是指隐空间中的向量。In the embodiment of the present disclosure, after the feature vector of the first style image is extracted, the latent vector with the smallest distance to the feature vector is determined from the trained vector dictionary, and this latent vector is used as the target vector corresponding to the feature vector. Here, a latent vector refers to a vector in the latent space.
本公开实施例中,将目标向量替换对应的特征向量之后,传入到图像生成模型中的解码器。In the embodiment of the present disclosure, after replacing the corresponding feature vector with the target vector, it is passed to the decoder in the image generation model.
S904:基于目标向量,生成第一风格图像对应的第二风格图像。S904: Based on the target vector, generate a second style image corresponding to the first style image.
本公开实施例中,图像生成模型中的解码器对目标向量解码得到第一风格图像对应的第二风格图像,并由图像生成模型输出该第二风格图像。In the embodiment of the present disclosure, the decoder in the image generation model decodes the target vector to obtain a second style image corresponding to the first style image, and the image generation model outputs the second style image.
本公开实施例中,通过将第一风格对象对应的特征向量替换为训练后的向量字典中与其距离最接近的隐向量,使得图像生成模型能够生成质量和效果较好的风格图像,较好的满足用户的图像风格转换需求,提升用户的体验。In the embodiment of the present disclosure, by replacing the feature vector corresponding to the first style object with the closest latent vector in the trained vector dictionary, the image generation model can generate style images with better quality and effect, better meeting the user's image style conversion needs and improving the user experience.
参考图10,为本公开实施例提供的另一种对图像生成模型的隐空间进行约束的模型训练过程示意图,其中,待训练图像生成模型可以为条件对抗生成网络CGAN,该条件对抗生成网络包括生成器和判别器,通过对条件对抗生成网络进行训练,从而得到训练后的图像生成模型。Referring to FIG. 10, which is a schematic diagram of another model training process for constraining the latent space of an image generation model provided by an embodiment of the present disclosure, the image generation model to be trained may be a conditional generative adversarial network (CGAN), which includes a generator and a discriminator; the trained image generation model is obtained by training the conditional generative adversarial network.
实际训练过程中,将第一风格图像样本输入至生成器中的编码器,由编码器将第一风格图像样本映射到当前向量分布中,得到该第一风格图像样本的特征向量,实现特征向量的提取。在得到该第一风格图像样本的特征向量之后,从标准正态分布中随机确定出一个与该特征向量对应的向量,然后利用最大平均差异算法,计算该向量与特征向量的差异,并将该差异确定为分布损失值,基于分布损失值可以对编码器中的当前向量分布进行调整更新,以使得当前向量分布被不断调整为能够用于得到质量较高的风格图像的向量。其中,最大平均差异算法常被用来度量两个分布之间的差异。In the actual training process, the first style image sample is input to the encoder in the generator, and the encoder maps the first style image sample into the current vector distribution to obtain the feature vector of the first style image sample, thereby realizing feature vector extraction. After the feature vector of the first style image sample is obtained, a vector corresponding to the feature vector is randomly drawn from the standard normal distribution, and the maximum mean discrepancy (MMD) algorithm is then used to calculate the difference between this vector and the feature vector; this difference is taken as a distribution loss value. Based on the distribution loss value, the current vector distribution in the encoder can be adjusted and updated, so that the current vector distribution is continuously adjusted toward vectors that can be used to obtain high-quality style images. The maximum mean discrepancy is commonly used to measure the difference between two distributions.
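The distribution loss described above can be sketched with a biased RBF-kernel estimate of the squared maximum mean discrepancy. This is our own minimal NumPy illustration: the kernel choice, bandwidth, and sample sizes are assumptions, since the patent only names the maximum mean discrepancy without fixing these details.

```python
import numpy as np

def mmd2_rbf(x, y, sigma=1.0):
    """Biased estimate of squared MMD between sample sets x and y under
    an RBF kernel; it is 0 when the two sample sets coincide and grows
    as the two underlying distributions drift apart."""
    def gram(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))
    return gram(x, x).mean() + gram(y, y).mean() - 2.0 * gram(x, y).mean()

rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 8))            # stand-in for encoder features
ref = rng.normal(size=(100, 8))              # samples from N(0, I)
loss_matched = mmd2_rbf(feats, ref)          # small: distributions agree
loss_shifted = mmd2_rbf(feats + 3.0, ref)    # large: badly shifted features
```

Minimising this loss pulls the encoder's current vector distribution toward the standard normal reference, which is exactly the constraint on the latent space described in this paragraph.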
本公开实施例中,在得到特征向量之后,输入至解码器中,由解码器解码得到第一输出图像,然后将与第一风格图像样本具有对应关系的第二风格图像样本和该第一输出图像同时输入至判别器中,得到损失值,基于该损失值可以对生成器和判别器中的参数进行调整。In the embodiment of the present disclosure, after the feature vector is obtained, it is input into the decoder, which decodes it to obtain a first output image; then the second style image sample corresponding to the first style image sample and the first output image are input into the discriminator at the same time to obtain a loss value, based on which the parameters in the generator and the discriminator can be adjusted.
依照上述训练方式,基于质量较高的图像样本通过多轮迭代训练最终得到训练后的图像生成模型,用于对图像的风格进行转换。According to the above training method, a trained image generation model is finally obtained through multiple rounds of iterative training based on high-quality image samples, which is used to convert the style of the image.
本公开实施例提供的图像生成模型的训练方法中，在对图像生成模型训练的过程中，不断的对编码器中的当前向量分布进行调整，实现将图像生成模型中当前向量分布向标准正态分布约束的过程，达到对图像生成模型的隐空间进行约束的目的，使得经过隐空间约束的图像生成模型能够生成质量较高的风格图像，提升用户体验。In the image generation model training method provided by the embodiments of the present disclosure, the current vector distribution in the encoder is continuously adjusted during training, constraining the current vector distribution of the image generation model toward the standard normal distribution and thereby constraining the latent space of the image generation model, so that the latent-space-constrained image generation model can generate high-quality style images and improve the user experience.
与上述训练方法相对应的,本公开实施例还提供了另一种对图像生成模型的隐空间进行约束的模型训练方法的流程图,参考图11,该方法包括:Corresponding to the above training method, an embodiment of the present disclosure also provides a flowchart of another model training method for constraining the latent space of an image generation model. Referring to FIG. 11 , the method includes:
S1101:获取具有对应关系的第一风格图像样本与第二风格图像样本。S1101: Acquire a first style image sample and a second style image sample having a corresponding relationship.
S1102:将第一风格图像样本映射到当前向量分布中，得到第一风格图像样本的特征向量。S1102: Map the first style image sample to the current vector distribution to obtain a feature vector of the first style image sample.
本公开实施例中，当第一风格图像样本为线稿图像样本，且第二风格图像样本为漫画图像样本时，将第一风格图像样本输入至编码器后，由编码器提取到的特征向量可以为大小为P(P为正整数，例如为512)维的特征向量。In the embodiments of the present disclosure, when the first style image sample is a line art image sample and the second style image sample is a comic image sample, after the first style image sample is input into the encoder, the feature vector extracted by the encoder may be a P-dimensional feature vector, where P is a positive integer, for example 512.
为了提高图像生成模型的鲁棒性，本公开实施例在将第一风格图像输入至待训练图像生成模型后，首先基于目标图像增强方式，对第一风格图像样本进行图像增强处理，得到增强后图像样本，然后将增强后图像样本映射到当前向量分布中，得到该第一风格图像样本的特征向量。其中，目标图像增强方式可以为随机膨胀、腐蚀、旋转、平移、放缩和形变等图像增强方式之一。To improve the robustness of the image generation model, in the embodiments of the present disclosure, after the first style image is input into the image generation model to be trained, image enhancement processing is first performed on the first style image sample based on a target image enhancement method to obtain an enhanced image sample, and the enhanced image sample is then mapped into the current vector distribution to obtain the feature vector of the first style image sample. The target image enhancement method may be one of image enhancement methods such as random dilation, erosion, rotation, translation, scaling, and deformation.
本公开实施例中，在图像生成模型的训练过程中，对图像样本进行随机图像增强处理，使得训练后的图像生成模型对输入至模型的图像质量要求较低。也就是说，对于不同质量的图像均能够得到效果较好的风格转换图像。例如，对于线稿风格图像的笔画粗细要求较低，对于不同笔画粗细的线稿风格图像，均能够生成效果较好的漫画风格图像。如图8所示，基于不同质量的A、B两张线稿风格图像，均能够得到质量较好的漫画风格图像。In the embodiments of the present disclosure, random image enhancement processing is performed on the image samples during the training of the image generation model, so that the trained image generation model places lower requirements on the quality of the images input to it. That is, style-converted images with good results can be obtained for images of different quality. For example, the requirements on the stroke thickness of a line art style image are low, and comic style images with good results can be generated from line art style images of different stroke thicknesses. As shown in FIG. 8, good-quality comic style images can be obtained from both line art style images A and B, which differ in quality.
一种可选的实施方式中，基于随机形变的图像增强方式对第一风格图像样本进行图像增强处理，具体的，将预设图片划分为N×N个大小相同的正方形，在每个正方形中心随机确定一个偏移向量(dx,dy)，然后将偏移向量线性地扩散到整个正方形中，且控制正方形边界处的偏移为0，得到预设图片对应的偏移场。最终，根据上述偏移场对第一风格图像样本进行图像增强处理，得到增强后图像样本。In an optional implementation, image enhancement processing is performed on the first style image sample based on a random deformation image enhancement method. Specifically, a preset picture is divided into N×N squares of the same size, an offset vector (dx, dy) is randomly determined at the center of each square, the offset vector is then spread linearly over the entire square with the offset at the square borders controlled to be 0, and the offset field corresponding to the preset picture is obtained. Finally, image enhancement processing is performed on the first style image sample according to the above offset field to obtain the enhanced image sample.
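The offset-field construction just described can be sketched as follows. The tent-shaped linear weighting (1 at each square's center, 0 on its borders), the uniform sampling range `max_offset`, and the nearest-neighbour warp are illustrative assumptions; the disclosure specifies only the per-square random offset, the linear spreading, and the zero offset at square borders.

```python
import numpy as np

def make_offset_field(height, width, n=4, max_offset=5.0, rng=None):
    """Build an offset field: one random (dx, dy) per square center, spread
    linearly inside the square, forced to 0 on the square borders.
    Assumes height and width are divisible by n."""
    rng = np.random.default_rng(rng)
    field = np.zeros((height, width, 2))
    sh, sw = height // n, width // n
    # bilinear "tent" weight: 1 at the square center, 0 on its borders
    wy = 1.0 - np.abs(np.linspace(-1.0, 1.0, sh))
    wx = 1.0 - np.abs(np.linspace(-1.0, 1.0, sw))
    tent = np.outer(wy, wx)
    for i in range(n):
        for j in range(n):
            dx, dy = rng.uniform(-max_offset, max_offset, size=2)
            field[i * sh:(i + 1) * sh, j * sw:(j + 1) * sw, 0] = tent * dx
            field[i * sh:(i + 1) * sh, j * sw:(j + 1) * sw, 1] = tent * dy
    return field

def warp(image, field):
    # nearest-neighbour warp of a grayscale image by the offset field
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(np.round(ys + field[..., 1]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs + field[..., 0]).astype(int), 0, w - 1)
    return image[src_y, src_x]
```

Because the offset vanishes on every square border, adjacent squares deform independently without tearing the image at their shared edges.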
S1103:利用最大平均差异算法，基于标准正态分布和特征向量对当前向量分布进行更新。S1103: Use the maximum mean discrepancy algorithm to update the current vector distribution based on the standard normal distribution and the feature vector.
本公开实施例中，在得到第一风格图像样本的特征向量之后，从标准正态分布中随机确定出一个与该特征向量对应的向量，然后利用最大平均差异算法，计算该向量与特征向量的差异，并将该差异确定为分布损失值，基于分布损失值可以对编码器中的当前向量分布进行调整更新，以使得当前向量分布被不断调整为能够用于得到质量较高的风格图像的向量。In the embodiments of the present disclosure, after the feature vector of the first style image sample is obtained, a vector corresponding to the feature vector is randomly determined from the standard normal distribution, the maximum mean discrepancy algorithm is then used to calculate the difference between this vector and the feature vector, and the difference is determined as a distribution loss value. Based on the distribution loss value, the current vector distribution in the encoder can be adjusted and updated, so that the current vector distribution is continuously adjusted toward one from which higher-quality style images can be obtained.
一种优选的实施方式中，可以同时基于多对具有对应关系的图像样本对图像生成模型进行训练，在从标准正态分布中分别为各个第一风格图像样本的特征向量确定出对应的向量之后，利用最大平均差异算法，计算各个向量与对应的特征向量之间的差异，然后基于该差异确定分布损失值，基于分布损失值可以对编码器中的当前向量分布进行调整更新，提高了编码器中的当前向量分布的更新效率。In a preferred implementation, the image generation model may be trained on multiple pairs of corresponding image samples simultaneously. After a corresponding vector is determined from the standard normal distribution for the feature vector of each first style image sample, the maximum mean discrepancy algorithm is used to calculate the difference between each vector and its corresponding feature vector, a distribution loss value is determined based on these differences, and the current vector distribution in the encoder is adjusted and updated based on the distribution loss value, which improves the update efficiency of the current vector distribution in the encoder.
S1104:基于特征向量,生成第一风格图像样本对应的第一输出图像。S1104: Based on the feature vector, generate a first output image corresponding to the first style image sample.
S1105:将与第一风格图像样本具有对应关系的第二风格图像样本和第一输出图像输入至判别器中，实现本轮迭代训练，并进入下一轮迭代训练，直到达到预设收敛条件，得到训练后的图像生成模型。S1105: Input the second style image sample corresponding to the first style image sample and the first output image into the discriminator to complete this round of iterative training, and proceed to the next round of iterative training until a preset convergence condition is reached, thereby obtaining the trained image generation model.
本公开实施例中，在对图像生成模型的训练过程中，通过不断的调整编码器中的当前向量分布，使得经过训练得到的图像生成模型能够生成质量和效果较好的风格图像，较好的满足用户的图像风格转换需求，提升用户的体验。In the embodiments of the present disclosure, during the training of the image generation model, the current vector distribution in the encoder is continuously adjusted, so that the trained image generation model can generate style images of good quality and effect, better meeting users' image style conversion needs and improving the user experience.
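Steps S1101 through S1105 can be sketched as the loss computation of a single training round. The encoder, decoder, and discriminator below are plain callables standing in for the networks, and the distribution loss is a simple moment-matching proxy for the maximum mean discrepancy, used here only to keep the example short; none of these simplifications come from the disclosure itself.

```python
import numpy as np

def iteration_losses(x_first, x_second, encoder, decoder, discriminator):
    """Losses of one training round: a distribution loss pulling the encoded
    features toward N(0, 1) moments (stand-in for MMD), and a standard GAN
    discriminator loss over the paired real image and the generated output."""
    z = encoder(x_first)                     # S1102: feature vector from the current distribution
    # S1103 proxy: penalize deviation of the feature moments from N(0, 1)
    dist_loss = float(np.mean(z) ** 2 + (np.std(z) - 1.0) ** 2)
    y = decoder(z)                           # S1104: first output image
    # S1105: discriminator scores the paired real second-style sample and the output
    d_real, d_fake = discriminator(x_second), discriminator(y)
    adv_loss = -float(np.log(d_real + 1e-8) + np.log(1.0 - d_fake + 1e-8))
    return dist_loss, adv_loss
```

In a real training loop these losses would drive gradient updates of the generator and discriminator each round until the convergence condition of S1105 is met.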
基于上述对图像生成模型的隐空间进行约束的模型训练方法得到的图像生成模型,将第一风格图像转换为第二风格图像。具体的,参考图12,为本公开实施例提供的另一种风格图像生成方法的流程图,该方法包括:Based on the image generation model obtained by the model training method for constraining the latent space of the image generation model, the first style image is converted into the second style image. Specifically, referring to FIG. 12 , which is a flowchart of another style image generation method provided by an embodiment of the present disclosure, the method includes:
S1201:获取第一风格图像。S1201: Acquire a first style image.
S1202:将第一风格图像映射到正态分布中,得到第一风格图像的特征向量;其中,图像生成模型的隐空间被约束为正态分布。S1202: Map the first style image into a normal distribution to obtain a feature vector of the first style image; wherein, the latent space of the image generation model is constrained to be a normal distribution.
本公开实施例中，将第一风格图像输入至图像生成模型的编码器，由编码器将第一风格图像映射到训练后的正态分布中，得到第一风格图像的特征向量，实现特征向量的提取。In the embodiments of the present disclosure, the first style image is input into the encoder of the image generation model, and the encoder maps the first style image into the trained normal distribution to obtain the feature vector of the first style image, thereby extracting the feature vector.
由于本公开实施例中的编码器的参数被约束为正态分布，而输入的图像较大概率会被映射到正态分布的原点位置对应的向量，而基于原点位置对应的向量生成的风格图像的美观度较高，因此，本公开实施例基于训练后的正态分布映射到的特征向量，能够用于生成质量较高的风格图像，提升用户体验。Since the parameters of the encoder in the embodiments of the present disclosure are constrained to a normal distribution, an input image is mapped with high probability to a vector corresponding to the origin of the normal distribution, and style images generated from vectors corresponding to the origin are more aesthetically pleasing. Therefore, in the embodiments of the present disclosure, the feature vector obtained by mapping into the trained normal distribution can be used to generate high-quality style images, improving the user experience.
S1203:基于特征向量,生成第一风格图像对应的第二风格图像。S1203: Based on the feature vector, generate a second style image corresponding to the first style image.
本公开实施例中，在得到特征向量之后，将该特征向量输入至图像生成模型的解码器中，经过解码器解码后得到第一风格图像对应的第二风格图像，并由图像生成模型输出第二风格图像。In the embodiments of the present disclosure, after the feature vector is obtained, it is input into the decoder of the image generation model, the second style image corresponding to the first style image is obtained after decoding by the decoder, and the image generation model outputs the second style image.
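The encode-then-decode inference path of S1201 through S1203 can be sketched with a toy linear generator. The weights here are random placeholders rather than trained parameters, so the sketch illustrates only the data flow of inference, not the disclosed networks.

```python
import numpy as np

class ToyGenerator:
    """Toy stand-in for the trained generator: a linear encoder mapping an
    image into the (normality-constrained) latent space and a linear decoder
    mapping the latent vector back to an image."""

    def __init__(self, image_dim=64, latent_dim=8, rng=0):
        r = np.random.default_rng(rng)
        self.enc = r.standard_normal((latent_dim, image_dim)) / np.sqrt(image_dim)
        self.dec = r.standard_normal((image_dim, latent_dim)) / np.sqrt(latent_dim)

    def encode(self, image):
        # S1202: map the first style image to a latent feature vector
        return self.enc @ np.asarray(image, dtype=float).ravel()

    def decode(self, z):
        # S1203: decode the feature vector into the second style image (flattened)
        return self.dec @ z

    def stylize(self, image):
        return self.decode(self.encode(image))
```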
本公开实施例中，通过基于已训练的正态分布映射得到第一风格对象对应的特征向量，能够生成质量和效果较好的风格图像，较好的满足用户的图像风格转换需求，提升用户的体验。In the embodiments of the present disclosure, by obtaining the feature vector corresponding to the first style image through mapping based on the trained normal distribution, style images of good quality and effect can be generated, better meeting users' image style conversion needs and improving the user experience.
另外,由于映射得到的特征向量与正态分布的原点对应的向量的距离,能够表示基于该特征向量生成的风格图像的美观度,因此,可以通过调节该距离提高生成的风格图像的美观度。In addition, since the distance between the feature vector obtained by mapping and the vector corresponding to the origin of the normal distribution can represent the aesthetics of the style image generated based on the feature vector, the aesthetics of the generated style image can be improved by adjusting the distance.
为此,本公开实施例设置有目标权重系数,用于表示特征向量与正态分布的原点之间的距离,通过调节目标权重系数,能够改变该距离,进而影响生成的风格图像的美观度。To this end, the embodiments of the present disclosure are provided with a target weight coefficient to represent the distance between the feature vector and the origin of the normal distribution. By adjusting the target weight coefficient, the distance can be changed, thereby affecting the aesthetics of the generated style image.
一种可选的实施方式中,图像生成模型可以基于目标权重系数更新第一风格图像的特征向量,得到更新后特征向量。然后,基于该更新后特征向量,生成第一风格图像对应的第二风格图像,以改变风格图像的美观度。In an optional implementation manner, the image generation model may update the feature vector of the first style image based on the target weight coefficient to obtain the updated feature vector. Then, based on the updated feature vector, a second style image corresponding to the first style image is generated to change the aesthetics of the style image.
一种可选的实施方式中,可以由用户调整目标权重系数,以得到满足用户美观度要求的风格图像。具体的,图像生成模型接收用户对目标权重系数的输入操作,获取该操作对应的目标权重系数,然后基于该目标权重系数更新第一风格图像的特征向量,得到更新后特征向量。最终,基于该更新后特征向量,生成满足用户美观度要求的风格图像。In an optional implementation manner, the target weight coefficient may be adjusted by the user to obtain a style image that meets the user's aesthetic requirements. Specifically, the image generation model receives the user's input operation on the target weight coefficient, obtains the target weight coefficient corresponding to the operation, and then updates the feature vector of the first style image based on the target weight coefficient to obtain the updated feature vector. Finally, based on the updated feature vector, a style image that meets the user's aesthetic requirements is generated.
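One plausible reading of the target weight coefficient described above is a scalar that rescales the feature vector along the ray through the origin of the latent normal distribution, shrinking or enlarging its distance to the origin. That interpretation is an assumption on our part; the sketch below implements it:

```python
import numpy as np

def apply_target_weight(feature, target_weight):
    """Rescale the feature vector's distance to the origin of the latent
    normal distribution by `target_weight` (assumed interpretation: the
    coefficient scales the vector along the ray through the origin)."""
    return float(target_weight) * np.asarray(feature, dtype=float)
```

Under this reading, a coefficient below 1 moves the updated feature vector closer to the origin, where the disclosure states generated style images tend to be more aesthetically pleasing.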
与上述方法实施例基于同一个发明构思，本公开还提供了一种风格图像生成装置，参考图13，为本公开实施例提供的一种风格图像生成装置的结构示意图，所述装置应用于图像生成模型中，所述装置包括：Based on the same inventive concept as the above method embodiments, the present disclosure further provides a style image generation apparatus. Referring to FIG. 13, which is a schematic structural diagram of a style image generation apparatus provided by an embodiment of the present disclosure, the apparatus is applied to an image generation model, and the apparatus includes:
接收模块1301,用于接收第一风格图像;a receiving module 1301, configured to receive a first style image;
生成模块1302,用于对所述第一风格图像进行风格转换处理后,生成第二风格图像;A generating module 1302, configured to generate a second style image after performing style conversion processing on the first style image;
其中,所述图像生成模型为基于具有对应关系的第一风格图像样本与第二风格图像样本训练得到,所述图像生成模型的隐空间是在训练的过程中被约束。The image generation model is obtained by training based on the first style image samples and the second style image samples having a corresponding relationship, and the latent space of the image generation model is constrained during the training process.
一种可选的实施方式中,所述图像生成模型的隐空间被约束为包含预设个数隐向量的向量字典;所述生成模块1302包括:In an optional embodiment, the latent space of the image generation model is constrained to a vector dictionary containing a preset number of latent vectors; the generation module 1302 includes:
提取子模块，用于提取所述第一风格图像的特征向量；an extraction submodule, configured to extract the feature vector of the first style image;
确定子模块，用于从所述向量字典中，确定与所述特征向量之间的距离最小的隐向量，作为所述特征向量对应的目标向量；a determining submodule, configured to determine, from the vector dictionary, the latent vector with the smallest distance to the feature vector as the target vector corresponding to the feature vector;
第一生成子模块,用于基于所述目标向量,生成所述第一风格图像对应的第二风格图像。The first generating submodule is configured to generate a second style image corresponding to the first style image based on the target vector.
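The lookup performed by the determining submodule, picking from the vector dictionary the latent vector nearest to the extracted feature vector, can be sketched as a vector-quantization-style search. The NumPy form and the Euclidean metric are illustrative; the disclosure specifies only "smallest distance".

```python
import numpy as np

def nearest_latent_vector(feature, vector_dictionary):
    """Return the index and value of the dictionary latent vector with the
    smallest Euclidean distance to the feature vector (the target vector).
    `vector_dictionary` is a K x D array of K latent vectors."""
    d = np.asarray(vector_dictionary, dtype=float)
    f = np.asarray(feature, dtype=float)
    dists = np.linalg.norm(d - f, axis=1)  # distance to each dictionary entry
    idx = int(np.argmin(dists))
    return idx, d[idx]
```

The returned target vector, rather than the raw feature vector, is then handed to the first generating submodule to produce the second style image.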
一种可选的实施方式中,所述图像生成模型的隐空间被约束为正态分布;所述生成模块1302包括:In an optional embodiment, the latent space of the image generation model is constrained to be a normal distribution; the generation module 1302 includes:
映射子模块,用于将所述第一风格图像映射到所述正态分布中,得到所述第一风格图像的特征向量;a mapping submodule for mapping the first style image to the normal distribution to obtain a feature vector of the first style image;
第二生成子模块,用于基于所述特征向量,生成所述第一风格图像对应的第二风格图像。The second generating sub-module is configured to generate a second style image corresponding to the first style image based on the feature vector.
一种可选的实施方式中,所述生成模块1302还包括:In an optional implementation manner, the generating module 1302 further includes:
更新子模块,用于基于目标权重系数更新所述特征向量,得到更新后特征向量;所述目标权重系数用于表示所述特征向量与所述正态分布的原点之间的距离;an update sub-module for updating the feature vector based on the target weight coefficient to obtain the updated feature vector; the target weight coefficient is used to represent the distance between the feature vector and the origin of the normal distribution;
相应的,所述第二生成子模块具体用于:Correspondingly, the second generation submodule is specifically used for:
基于所述更新后特征向量,生成所述第一风格图像对应的第二风格图像。Based on the updated feature vector, a second style image corresponding to the first style image is generated.
一种可选的实施方式中,所述生成模块1302还包括:In an optional implementation manner, the generating module 1302 further includes:
获取子模块,用于响应于目标权重系数的输入操作,获取所述目标权重系数。The obtaining sub-module is configured to obtain the target weight coefficient in response to the input operation of the target weight coefficient.
一种可选的实施方式中,所述第一风格图像包括线稿风格图像,所述第二风格图像包括漫画风格图像。In an optional implementation manner, the first style image includes a line art style image, and the second style image includes a comic style image.
本公开实施例提供的风格图像生成方法中，通过在图像生成模型训练的过程中对图像生成模型的隐空间进行约束，使得在图像生成模型的应用阶段，能够生成质量和效果较高的风格图像，提升用户的体验。In the style image generation method provided by the embodiments of the present disclosure, by constraining the latent space of the image generation model during its training, style images of higher quality and effect can be generated in the application stage of the image generation model, improving the user experience.
与上述方法实施例基于同一个发明构思,本公开还提供了一种图像生成模型的训练装置,参考图14,为本公开实施例提供的一种图像生成模型的训练装置的结构示意图,所述装置包括:Based on the same inventive concept as the above method embodiments, the present disclosure also provides an apparatus for training an image generation model. Referring to FIG. 14 , it is a schematic structural diagram of an apparatus for training an image generation model provided in an embodiment of the present disclosure. The device includes:
获取模块1401,用于获取具有对应关系的第一风格图像样本与第二风格图像样本;an acquisition module 1401, configured to acquire a first style image sample and a second style image sample with a corresponding relationship;
约束模块1402,用于在基于所述具有对应关系的第一风格图像样本与第二风格图像样本进行训练的过程中,对隐空间进行约束,得到训练后的图像生成模型。The constraint module 1402 is configured to constrain the latent space during the training process based on the first style image samples and the second style image samples having the corresponding relationship to obtain a trained image generation model.
一种可选的实施方式中,所述约束模块1402包括:In an optional implementation manner, the constraint module 1402 includes:
提取子模块,用于提取所述第一风格图像样本的特征向量;an extraction submodule for extracting the feature vector of the first style image sample;
确定子模块,用于从向量字典中,确定与所述特征向量的距离最小的隐向量,作为所述特征向量对应的目标向量;所述向量字典中存储有预设个数隐向量;A determination submodule is used to determine, from the vector dictionary, a latent vector with the smallest distance from the feature vector as a target vector corresponding to the feature vector; a preset number of hidden vectors is stored in the vector dictionary;
第一生成子模块,用于基于所述目标向量,生成所述第一风格图像样本对应的第一输出图像;a first generating submodule, configured to generate a first output image corresponding to the first style image sample based on the target vector;
第一处理子模块，用于将与所述第一风格图像样本具有对应关系的第二风格图像样本和所述第一输出图像输入至判别器中，经过所述判别器的处理后，得到损失值；a first processing submodule, configured to input the second style image sample corresponding to the first style image sample and the first output image into the discriminator, and to obtain a loss value after processing by the discriminator;
第一更新子模块,用于基于所述损失值更新所述当前向量字典,并进入下一轮迭代训练,直到达到预设收敛条件,得到训练后的图像生成模型。The first update sub-module is configured to update the current vector dictionary based on the loss value, and enter the next round of iterative training until a preset convergence condition is reached, and a trained image generation model is obtained.
一种可选的实施方式中,所述约束模块1402还包括:In an optional implementation manner, the constraint module 1402 further includes:
第一增强子模块,用于基于目标图像增强方式,对所述第一风格图像样本进行图像增强处理,得到增强后图像样本;a first enhancement sub-module, configured to perform image enhancement processing on the first style image sample based on the target image enhancement method to obtain an enhanced image sample;
相应的,所述提取子模块具体用于:Correspondingly, the extraction submodule is specifically used for:
提取所述增强后图像样本的特征向量。Feature vectors of the enhanced image samples are extracted.
一种可选的实施方式中,所述约束模块1402包括:In an optional implementation manner, the constraint module 1402 includes:
映射子模块,用于将所述第一风格图像样本输入至图像生成模型后,将所述第一风格图像样本映射到当前向量分布中,得到所述第一风格图像样本的特征向量;a mapping submodule, configured to map the first style image sample to the current vector distribution after inputting the first style image sample into the image generation model, to obtain a feature vector of the first style image sample;
第二更新子模块，用于利用最大平均差异算法，基于标准正态分布和所述特征向量对所述当前向量分布进行更新；a second update submodule, configured to update the current vector distribution based on the standard normal distribution and the feature vector using the maximum mean discrepancy algorithm;
第二生成子模块,用于基于所述特征向量,生成所述第一风格图像样本对应的第一输出图像;a second generating submodule, configured to generate a first output image corresponding to the first style image sample based on the feature vector;
训练子模块，用于将与所述第一风格图像样本具有对应关系的第二风格图像样本和所述第一输出图像输入至判别器中，实现本轮迭代训练，并进入下一轮迭代训练，直到达到预设收敛条件，得到训练后的图像生成模型。a training submodule, configured to input the second style image sample corresponding to the first style image sample and the first output image into the discriminator to complete this round of iterative training, and to proceed to the next round of iterative training until a preset convergence condition is reached, thereby obtaining the trained image generation model.
一种可选的实施方式中,所述约束模块1402还包括:In an optional implementation manner, the constraint module 1402 further includes:
第二增强子模块,用于基于目标图像增强方式,对所述第一风格图像样本进行图像增强处理,得到增强后图像样本;a second enhancement sub-module, configured to perform image enhancement processing on the first style image sample based on the target image enhancement method to obtain an enhanced image sample;
相应的,所述映射子模块具体用于:Correspondingly, the mapping submodule is specifically used for:
将所述增强后图像样本映射到当前向量分布中,得到所述第一风格图像样本的特征向量。The enhanced image sample is mapped to the current vector distribution to obtain the feature vector of the first style image sample.
本公开实施例中，在对图像生成模型进行训练的过程中，在每轮迭代训练中分别对图像生成模型的隐空间进行不断的约束，最终得到隐空间被约束的图像生成模型，能够用于图像风格的转换。In the embodiments of the present disclosure, during the training of the image generation model, the latent space of the image generation model is continuously constrained in each round of iterative training, finally yielding an image generation model whose latent space is constrained and which can be used for image style conversion.
除了上述方法和装置以外，本公开实施例还提供了一种计算机可读存储介质，计算机可读存储介质中存储有指令，当所述指令在终端设备上运行时，使得所述终端设备实现本公开实施例所述的风格图像生成方法或图像生成模型的训练方法。In addition to the above methods and apparatuses, embodiments of the present disclosure further provide a computer-readable storage medium storing instructions that, when run on a terminal device, cause the terminal device to implement the style image generation method or the image generation model training method described in the embodiments of the present disclosure.
另外,本公开实施例还提供了一种风格图像生成设备,参见图15所示,可以包括:In addition, an embodiment of the present disclosure further provides a style image generating device, as shown in FIG. 15 , which may include:
处理器1501、存储器1502、输入装置1503和输出装置1504。风格图像生成设备中的处理器1501的数量可以一个或多个，图15中以一个处理器为例。在本公开的一些实施例中，处理器1501、存储器1502、输入装置1503和输出装置1504可通过总线或其它方式连接，其中，图15中以通过总线连接为例。A processor 1501, a memory 1502, an input device 1503, and an output device 1504. The number of processors 1501 in the style image generation device may be one or more, and one processor is taken as an example in FIG. 15. In some embodiments of the present disclosure, the processor 1501, the memory 1502, the input device 1503, and the output device 1504 may be connected by a bus or in other ways; connection by a bus is taken as an example in FIG. 15.
存储器1502可用于存储计算机程序以及模块,处理器1501通过运行存储在存储器1502的计算机程序以及模块,从而执行风格图像生成设备的各种功能应用以及数据处理。存储器1502可主要包括存储程序区和存储数据区,其中,存储程序区可存储操作系统、至少一个功能所需的应用程序等。此外,存储器1502可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件、闪存器件、或其他易失性固态存储器件。输入装置1503可用于接收输入的数字或字符信息,以及产生与风格图像生成设备的用户设置以及功能控制有关的信号输入。The memory 1502 can be used to store computer programs and modules, and the processor 1501 executes various functional applications and data processing of the style image generating apparatus by running the computer programs and modules stored in the memory 1502 . The memory 1502 may mainly include a stored program area and a stored data area, wherein the stored program area may store an operating system, an application program required for at least one function, and the like. Additionally, memory 1502 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. The input device 1503 may be used to receive input numerical or character information, and generate signal input related to user settings and function control of the style image generating apparatus.
具体在本实施例中，处理器1501会按照如下的指令，将一个或一个以上的计算机程序的进程对应的可执行文件加载到存储器1502中，并由处理器1501来运行存储在存储器1502中的计算机程序，从而实现上述风格图像生成方法中的各个步骤。Specifically, in this embodiment, the processor 1501 loads the executable files corresponding to the processes of one or more computer programs into the memory 1502 according to the following instructions, and the processor 1501 runs the computer programs stored in the memory 1502, thereby implementing the steps of the above style image generation method.
另外,本公开实施例还提供了一种图像生成模型的训练设备,参见图16所示,可以包括:In addition, an embodiment of the present disclosure also provides a training device for an image generation model, as shown in FIG. 16 , which may include:
处理器1601、存储器1602、输入装置1603和输出装置1604。图像生成模型的训练设备中的处理器1601的数量可以一个或多个,图16中以一个处理器为例。在本公开的一些实施例中,处理器1601、存储器1602、输入装置1603和输出装置1604可通过总线或其它方式连接,其中,图16中以通过总线连接为例。A processor 1601, a memory 1602, an input device 1603, and an output device 1604. The number of processors 1601 in the image generation model training device may be one or more, and one processor is taken as an example in FIG. 16 . In some embodiments of the present disclosure, the processor 1601 , the memory 1602 , the input device 1603 and the output device 1604 may be connected by a bus or other means, wherein the connection by a bus is taken as an example in FIG. 16 .
存储器1602可用于存储计算机程序以及模块,处理器1601通过运行存储在存储器1602的计算机程序以及模块,从而执行图像生成模型的训练设备的各种功能应用以及数据处理。存储器1602可主要包括存储程序区和存储数据区,其中,存储程序区可存储操作系统、至少一个功能所需的应用程序等。此外,存储器1602可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件、闪存器件、或其他易失性固态存储器件。输入装置1603可用于接收输入的数字或字符信息,以及产生与图像生成模型的训练设备的用户设置以及功能控制有关的信号输入。The memory 1602 can be used to store computer programs and modules, and the processor 1601 executes various functional applications and data processing of the image generation model training device by running the computer programs and modules stored in the memory 1602 . The memory 1602 may mainly include a stored program area and a stored data area, wherein the stored program area may store an operating system, an application program required for at least one function, and the like. Additionally, memory 1602 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. The input device 1603 may be used to receive input numerical or character information, and to generate signal input related to user settings and functional control of the training device for the image generation model.
具体在本实施例中，处理器1601会按照如下的指令，将一个或一个以上的计算机程序的进程对应的可执行文件加载到存储器1602中，并由处理器1601来运行存储在存储器1602中的计算机程序，从而实现上述图像生成模型的训练方法中的各个步骤。Specifically, in this embodiment, the processor 1601 loads the executable files corresponding to the processes of one or more computer programs into the memory 1602 according to the following instructions, and the processor 1601 runs the computer programs stored in the memory 1602, thereby implementing the steps of the above image generation model training method.
需要说明的是，在本文中，诸如"第一"和"第二"等之类的关系术语仅仅用来将一个实体或者操作与另一个实体或操作区分开来，而不一定要求或者暗示这些实体或操作之间存在任何这种实际的关系或者顺序。而且，术语"包括"、"包含"或者其任何其他变体意在涵盖非排他性的包含，从而使得包括一系列要素的过程、方法、物品或者设备不仅包括那些要素，而且还包括没有明确列出的其他要素，或者是还包括为这种过程、方法、物品或者设备所固有的要素。在没有更多限制的情况下，由语句"包括一个……"限定的要素，并不排除在包括所述要素的过程、方法、物品或者设备中还存在另外的相同要素。It should be noted that, herein, relational terms such as "first" and "second" are merely used to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "comprise", "include", or any other variants thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or device that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
以上所述仅是本公开的具体实施方式，使本领域技术人员能够理解或实现本公开。对这些实施例的多种修改对本领域的技术人员来说将是显而易见的，本文中所定义的一般原理可以在不脱离本公开的精神或范围的情况下，在其它实施例中实现。因此，本公开将不会被限制于本文所述的这些实施例，而是要符合与本文所公开的原理和新颖特点相一致的最宽的范围。The above descriptions are merely specific embodiments of the present disclosure, enabling those skilled in the art to understand or implement the present disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present disclosure. Therefore, the present disclosure is not to be limited to the embodiments described herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (15)

  1. 一种风格图像生成方法,由图像生成模型执行,其特征在于,所述方法包括:A style image generation method, executed by an image generation model, characterized in that the method comprises:
    接收第一风格图像;receiving a first style image;
    对所述第一风格图像进行风格转换处理后,生成第二风格图像;After performing style conversion processing on the first style image, a second style image is generated;
    其中,所述图像生成模型为基于具有对应关系的第一风格图像样本与第二风格图像样本训练得到,所述图像生成模型的隐空间在训练的过程中被约束。The image generation model is obtained by training based on the first style image samples and the second style image samples having a corresponding relationship, and the latent space of the image generation model is constrained during the training process.
  2. 根据权利要求1所述的方法，其特征在于，所述图像生成模型的隐空间被约束为包含预设个数隐向量的向量字典；所述对所述第一风格图像进行风格转换处理后，生成第二风格图像，包括：2. The method according to claim 1, wherein the latent space of the image generation model is constrained to a vector dictionary containing a preset number of latent vectors, and performing style conversion processing on the first style image to generate the second style image comprises:
    提取所述第一风格图像的特征向量;extracting the feature vector of the first style image;
    从所述向量字典中,确定与所述特征向量之间的距离最小的隐向量,作为所述特征向量对应的目标向量;From the vector dictionary, determine the latent vector with the smallest distance from the feature vector as the target vector corresponding to the feature vector;
    基于所述目标向量,生成所述第一风格图像对应的第二风格图像。Based on the target vector, a second style image corresponding to the first style image is generated.
  3. 根据权利要求1所述的方法，其特征在于，所述图像生成模型的隐空间被约束为正态分布；所述对所述第一风格图像进行风格转换处理后，生成第二风格图像，包括：3. The method according to claim 1, wherein the latent space of the image generation model is constrained to a normal distribution, and performing style conversion processing on the first style image to generate the second style image comprises:
    将所述第一风格图像映射到所述正态分布中,得到所述第一风格图像的特征向量;mapping the first style image to the normal distribution to obtain a feature vector of the first style image;
    基于所述特征向量,生成所述第一风格图像对应的第二风格图像。Based on the feature vector, a second style image corresponding to the first style image is generated.
  4. 根据权利要求3所述的方法,其特征在于,在所述基于所述特征向量,生成所述第一风格图像对应的第二风格图像之前,所述方法还包括:The method according to claim 3, characterized in that before generating the second style image corresponding to the first style image based on the feature vector, the method further comprises:
    基于目标权重系数更新所述特征向量，得到更新后特征向量；所述目标权重系数用于表示所述特征向量与所述正态分布的原点之间的距离；updating the feature vector based on a target weight coefficient to obtain an updated feature vector, wherein the target weight coefficient is used to represent the distance between the feature vector and the origin of the normal distribution;
    相应的,所述基于所述特征向量,生成所述第一风格图像对应的第二风格图像,包括:Correspondingly, generating a second style image corresponding to the first style image based on the feature vector includes:
    基于所述更新后特征向量,生成所述第一风格图像对应的第二风格图像。Based on the updated feature vector, a second style image corresponding to the first style image is generated.
  5. The method according to claim 4, wherein before updating the feature vector based on the target weight coefficient to obtain the updated feature vector, the method further comprises:
    acquiring the target weight coefficient in response to an input operation for the target weight coefficient.
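One plausible, non-limiting reading of the update in claims 4-5 is a rescaling of the feature vector relative to the origin of the latent normal distribution, with the user-supplied coefficient controlling the resulting distance. The helper name is hypothetical:

```python
import numpy as np

def apply_target_weight(feature: np.ndarray, weight: float) -> np.ndarray:
    """Rescale a latent feature vector toward or away from the origin.

    Scaling by `weight` multiplies the vector's distance to the origin
    of the latent distribution by the same factor, so the coefficient
    directly controls how far the updated vector sits from the origin.
    """
    return weight * feature
```

Under this reading, a coefficient below 1 pulls the latent code toward the distribution's mode (a more "typical" output), while a coefficient above 1 pushes it outward.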
  6. The method according to any one of claims 1-5, wherein the first style image comprises a line-art style image, and the second style image comprises a comic style image.
  7. A training method for an image generation model, wherein the method comprises:
    acquiring first style image samples and second style image samples having a corresponding relationship; and
    constraining a latent space during training based on the first style image samples and the second style image samples having the corresponding relationship, to obtain a trained image generation model.
  8. The method according to claim 7, wherein constraining the latent space during training based on the first style image samples and the second style image samples having the corresponding relationship, to obtain the trained image generation model, comprises:
    extracting a feature vector of a first style image sample;
    determining, from a vector dictionary in which a preset number of latent vectors are stored, the latent vector with the smallest distance to the feature vector as a target vector corresponding to the feature vector;
    generating, based on the target vector, a first output image corresponding to the first style image sample;
    inputting a second style image sample having the corresponding relationship with the first style image sample and the first output image into a discriminator, and obtaining a loss value after processing by the discriminator; and
    updating the vector dictionary based on the loss value, and proceeding to the next round of iterative training until a preset convergence condition is reached, to obtain the trained image generation model.
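A heavily simplified sketch of one dictionary-update step in the style of claim 8 is shown below. In the claimed method the loss comes from a GAN discriminator comparing the output against the paired second-style sample; here, as a stand-in assumption, the selected codebook entry is simply nudged toward the encoder feature (a VQ-VAE-style update). All names and the learning-rate parameter are hypothetical:

```python
import numpy as np

def vq_train_step(feature: np.ndarray, vector_dict: np.ndarray, lr: float = 0.1):
    """One simplified vector-dictionary update step (numpy sketch).

    Selects the nearest codebook entry to the encoder feature, then
    moves that entry toward the feature. Returns the selected index
    and the updated dictionary.
    """
    # Nearest-neighbor lookup, as in the quantization step of claim 2.
    idx = int(np.argmin(np.sum((vector_dict - feature) ** 2, axis=1)))
    # Dictionary update; the full method would instead backpropagate a
    # discriminator-derived loss through this assignment.
    vector_dict[idx] += lr * (feature - vector_dict[idx])
    return idx, vector_dict
```

Iterating this step over paired samples until a convergence criterion is met corresponds to the training loop described in the claim.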
  9. The method according to claim 8, wherein before extracting the feature vector of the first style image sample, the method further comprises:
    performing image enhancement processing on the first style image sample based on a target image enhancement method, to obtain an enhanced image sample; and
    correspondingly, extracting the feature vector of the first style image sample comprises:
    extracting a feature vector of the enhanced image sample.
  10. The method according to claim 7, wherein constraining the latent space during training based on the first style image samples and the second style image samples having the corresponding relationship, to obtain the trained image generation model, comprises:
    receiving a first style image sample, and mapping the first style image sample to a current vector distribution to obtain a feature vector of the first style image sample;
    updating the current vector distribution based on a standard normal distribution and the feature vector by using a maximum mean discrepancy (MMD) algorithm;
    generating, based on the feature vector, a first output image corresponding to the first style image sample; and
    inputting a second style image sample having the corresponding relationship with the first style image sample and the first output image into a discriminator to complete the current round of iterative training, and proceeding to the next round of iterative training until a preset convergence condition is reached, to obtain the trained image generation model.
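For illustration, the maximum mean discrepancy used in claim 10 to pull the current vector distribution toward a standard normal can be estimated from samples with an RBF kernel. This is a generic, non-limiting sketch (the kernel choice and bandwidth are assumptions, not specified by the claim):

```python
import numpy as np

def rbf_mmd2(x: np.ndarray, y: np.ndarray, sigma: float = 1.0) -> float:
    """Biased sample estimate of squared MMD with an RBF kernel.

    x, y: arrays of shape (n, d) and (m, d) holding samples from the
    two distributions being compared (e.g. encoder features vs. draws
    from a standard normal).
    """
    def k(a, b):
        # Pairwise squared distances, then Gaussian kernel.
        d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))

    return float(k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean())
```

Minimizing this quantity between encoder features and standard-normal draws is one standard way to constrain a latent space toward a normal distribution during training.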
  11. The method according to claim 10, wherein before mapping the first style image sample to the current vector distribution to obtain the feature vector of the first style image sample, the method further comprises:
    performing image enhancement processing on the first style image sample based on a target image enhancement method, to obtain an enhanced image sample; and
    correspondingly, mapping the first style image sample to the current vector distribution to obtain the feature vector of the first style image sample comprises:
    mapping the enhanced image sample to the current vector distribution to obtain the feature vector of the first style image sample.
  12. A style image generation apparatus, wherein the apparatus is applied to an image generation model, and the apparatus comprises:
    a receiving module, configured to receive a first style image; and
    a generation module, configured to generate a second style image after performing style conversion processing on the first style image,
    wherein the image generation model is obtained by training based on first style image samples and second style image samples having a corresponding relationship, and the latent space of the image generation model is constrained during the training process.
  13. A training apparatus for an image generation model, wherein the apparatus comprises:
    an acquisition module, configured to acquire first style image samples and second style image samples having a corresponding relationship; and
    a constraint module, configured to constrain a latent space during training based on the first style image samples and the second style image samples having the corresponding relationship, to obtain a trained image generation model.
  14. A computer-readable storage medium storing instructions which, when run on a terminal device, cause the terminal device to implement the method according to any one of claims 1-6 or the method according to any one of claims 7-11.
  15. A device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the method according to any one of claims 1-6 or the method according to any one of claims 7-11.
PCT/CN2021/114211 2020-10-30 2021-08-24 Style image generation method, model training method and apparatus, and device and medium WO2022088878A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011197824.4A CN112991148B (en) 2020-10-30 2020-10-30 Style image generation method, model training method, device, equipment and medium
CN202011197824.4 2020-10-30

Publications (1)

Publication Number Publication Date
WO2022088878A1 (en) 2022-05-05

Family

ID=76344502

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/114211 WO2022088878A1 (en) 2020-10-30 2021-08-24 Style image generation method, model training method and apparatus, and device and medium

Country Status (2)

Country Link
CN (1) CN112991148B (en)
WO (1) WO2022088878A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115909296A (en) * 2022-12-20 2023-04-04 无锡慧眼人工智能科技有限公司 Driver state analysis and judgment method based on deep learning neuron network

Families Citing this family (1)

Publication number Priority date Publication date Assignee Title
CN112991148B (en) * 2020-10-30 2023-08-11 抖音视界有限公司 Style image generation method, model training method, device, equipment and medium

Citations (4)

Publication number Priority date Publication date Assignee Title
CN109685749A (en) * 2018-09-25 2019-04-26 平安科技(深圳)有限公司 Image style conversion method, device, equipment and computer storage medium
CN109816589A (en) * 2019-01-30 2019-05-28 北京字节跳动网络技术有限公司 Method and apparatus for generating cartoon style transformation model
WO2020125505A1 (en) * 2018-12-21 2020-06-25 Land And Fields Limited Image processing system
CN112991148A (en) * 2020-10-30 2021-06-18 北京字节跳动网络技术有限公司 Method for generating style image, method, device, equipment and medium for training model

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
CN110458216B (en) * 2019-07-31 2022-04-12 中山大学 Image style migration method for generating countermeasure network based on conditions
CN110930295B (en) * 2019-10-25 2023-12-26 广东开放大学(广东理工职业学院) Image style migration method, system, device and storage medium
CN111062426A (en) * 2019-12-11 2020-04-24 北京金山云网络技术有限公司 Method, device, electronic equipment and medium for establishing training set
CN111508048B (en) * 2020-05-22 2023-06-20 南京大学 Automatic generation method of interactive arbitrary deformation style face cartoon



Also Published As

Publication number Publication date
CN112991148A (en) 2021-06-18
CN112991148B (en) 2023-08-11

Similar Documents

Publication Publication Date Title
JP7373554B2 (en) Cross-domain image transformation
CN109711422B (en) Image data processing method, image data processing device, image data model building method, image data model building device, computer equipment and storage medium
WO2022088878A1 (en) Style image generation method, model training method and apparatus, and device and medium
WO2021169404A1 (en) Depth image generation method and apparatus, and storage medium
CN113674146A (en) Image super-resolution
CN112614110B (en) Method and device for evaluating image quality and terminal equipment
CN113487618B (en) Portrait segmentation method, portrait segmentation device, electronic equipment and storage medium
CN112132208B (en) Image conversion model generation method and device, electronic equipment and storage medium
CN108734653A (en) Image style conversion method and device
CN112084920B (en) Method, device, electronic equipment and medium for extracting hotwords
CN114330236A (en) Character generation method and device, electronic equipment and storage medium
CN112149545B (en) Sample generation method, device, electronic equipment and storage medium
CN115965840A (en) Image style migration and model training method, device, equipment and medium
CA3110260A1 (en) Method for book recognition and book reading device
CN114627244A (en) Three-dimensional reconstruction method and device, electronic equipment and computer readable medium
JP2023541745A (en) Facial image processing method, facial image processing model training method, device, equipment, and computer program
WO2022170982A1 (en) Image processing method and apparatus, image generation method and apparatus, device, and medium
CN113766117B (en) Video de-jitter method and device
CN117094362B (en) Task processing method and related device
CN111836058A (en) Method, device and equipment for real-time video playing and storage medium
Han Texture Image Compression Algorithm Based on Self‐Organizing Neural Network
CN113723294A (en) Data processing method and device and object identification method and device
CN115205157B (en) Image processing method and system, electronic device and storage medium
CN116934591A (en) Image stitching method, device and equipment for multi-scale feature extraction and storage medium
CN114418835A (en) Image processing method, apparatus, device and medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21884610

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 10.08.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21884610

Country of ref document: EP

Kind code of ref document: A1