CN114066718A - Image style migration method and device, storage medium and terminal

Info

Publication number
CN114066718A
Authority
CN
China
Prior art keywords
model
style migration
image
data set
style
Prior art date
Legal status
Pending
Application number
CN202111198634.9A
Other languages
Chinese (zh)
Inventor
刘斌
徐博诚
胡航
黄鹏
Current Assignee
Terminus Technology Group Co Ltd
Original Assignee
Terminus Technology Group Co Ltd
Priority date
2021-10-14
Filing date
2021-10-14
Publication date
2022-02-18
Application filed by Terminus Technology Group Co Ltd filed Critical Terminus Technology Group Co Ltd
Priority to CN202111198634.9A
Publication of CN114066718A
Legal status: Pending


Classifications

    • G06T 3/04 Context-preserving transformations, e.g. by using an importance map (under G06T 3/00 Geometric image transformations in the plane of the image)
    • G06N 3/045 Combinations of networks (under G06N 3/00 Computing arrangements based on biological models; G06N 3/02 Neural networks; G06N 3/04 Architecture, e.g. interconnection topology)
    • G06T 11/001 Texturing; Colouring; Generation of texture or colour (under G06T 11/00 2D [Two Dimensional] image generation)
    • G06T 7/11 Region-based segmentation (under G06T 7/00 Image analysis; G06T 7/10 Segmentation; Edge detection)
    • G06T 2207/20081 Training; Learning (under G06T 2207/00 Indexing scheme for image analysis or image enhancement; G06T 2207/20 Special algorithmic details)
    • G06T 2207/20221 Image fusion; Image merging (under G06T 2207/20212 Image combination)

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image style migration method, an image style migration device, a storage medium and a terminal, wherein the method comprises the following steps: acquiring a target image to be rendered, and determining a target style parameter to be migrated; inputting the target image and the target style parameter into a pre-trained style migration model, wherein the style migration model is generated by training on an acquired image style migration data set and an enhanced data set, and the enhanced data set is generated by using a generative adversarial network (GAN) and the image style migration data set; and outputting a stylized composite image corresponding to the target image. By using the GAN and the image style migration data set to generate the enhanced data set, and expanding the acquired image style migration data set with it, the method and the device increase the scale of the model training samples and thereby improve the robustness and diversity of the image style migration model.

Description

Image style migration method and device, storage medium and terminal
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image style migration method, an image style migration apparatus, a storage medium, and a terminal.
Background
With the development of internet technology, image processing techniques have become increasingly rich. Image style conversion changes the details and style of an image while preserving its main content, for example changing the apparent age of a portrait, converting a portrait into a hand-drawn style, or converting a photograph into an animation style.
The main task of image style migration is to fuse a content picture and a style picture into a style-rendered composite picture. The basic principle is to define two distances, one for content (DC) and one for style (DS): DC measures how different the content of two pictures is, while DS measures how different their style is. A third picture, the input, is then transformed so as to minimize both its content distance to the content picture and its style distance to the style picture, as sketched below.
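As an illustrative sketch only (the patent itself prescribes no code), the two distances can be written in a few lines of PyTorch; the Gram-matrix style representation and the weighting factors alpha and beta are assumptions borrowed from common neural style transfer practice:

```python
import torch
import torch.nn.functional as F

def content_distance(feat_input, feat_content):
    # DC: mean squared difference between the feature maps of the input
    # picture and the content picture at a chosen network layer.
    return F.mse_loss(feat_input, feat_content)

def gram_matrix(feat):
    # The style of a layer is commonly summarized by the Gram matrix of
    # its feature maps (channel-to-channel correlations).
    b, c, h, w = feat.size()
    f = feat.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def style_distance(feat_input, feat_style):
    # DS: mean squared difference between the Gram matrices of the input
    # picture and the style picture.
    return F.mse_loss(gram_matrix(feat_input), gram_matrix(feat_style))

# The input picture is optimized to reduce a weighted sum of the two
# distances; alpha and beta are hypothetical weighting factors:
#   loss = alpha * content_distance(...) + beta * style_distance(...)
```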
In existing image style conversion, a neural network model is trained on a large image style migration data set to obtain good image-fitting capability. However, few model training data sets for image style migration are publicly available, and their size and variety are limited, which constrains the development of image style migration technology: the image style output by the trained model after conversion is poor, and the robustness and diversity of the image style migration model are reduced.
Disclosure of Invention
The embodiment of the application provides an image style migration method and device, a storage medium and a terminal. The following presents a simplified summary in order to provide a basic understanding of some aspects of the disclosed embodiments. This summary is not an extensive overview and is intended to neither identify key/critical elements nor delineate the scope of such embodiments. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is presented later.
In a first aspect, an embodiment of the present application provides an image style migration method, where the method includes:
acquiring a target image to be rendered, and determining a target style parameter to be migrated;
inputting the target image and the target style parameters into a pre-trained style migration model; wherein the style migration model is generated by training on an acquired image style migration data set and an enhanced data set, and the enhanced data set is generated by using a generative adversarial network and the image style migration data set;
and outputting the stylized composite image corresponding to the target image.
Optionally, the pre-trained style migration model is generated according to the following steps:
collecting an image style migration data set;
generating an enhanced data set by using a generative adversarial network and the image style migration data set;
merging the enhanced data set into an image style migration data set to generate a model training sample;
creating a style migration model;
training the style migration model according to the model training sample to obtain a loss value of the style migration model;
and when the loss value of the style migration model reaches the minimum value, generating a pre-trained style migration model.
Optionally, generating the enhanced data set by using the generative adversarial network and the image style migration data set includes:
calculating a prior distribution according to the image style migration data set;
obtaining a first number of data samples from the prior distribution;
and inputting the first number of data samples into a pre-trained generative model in the generative adversarial network, and outputting the enhanced data set.
Optionally, the pre-trained generative model in the generative adversarial network is generated according to the following steps:
acquiring a generative adversarial network, wherein the generative adversarial network comprises a generative model and a discriminant model;
normalizing the images in the image style migration data set to generate normalized image style migration data samples;
acquiring a second number of image samples from the normalized image style migration data samples;
obtaining a third number of samples from the prior distribution;
inputting a third number of samples into the generative model, and outputting a first target image sample;
inputting the second number of image samples and the first target image sample into the discriminant model, and outputting a loss value of the discriminant model;
when the loss value of the discriminant model reaches the minimum value, generating a pre-trained discriminant model;
and generating a pre-trained generative model according to the pre-trained discriminant model.
Optionally, generating the pre-trained generative model according to the pre-trained discriminant model includes:
obtaining a fourth number of samples from the prior distribution;
inputting a fourth number of samples into the generative model, and outputting a second target image sample;
inputting the second target image sample into the pre-trained discriminant model, and outputting a loss value of the generative model;
when the loss value of the generative model reaches the minimum, the pre-trained generative model is generated.
Optionally, training the style migration model according to the model training samples to obtain the loss value of the style migration model includes:
cutting images in the model training sample according to a plurality of preset sizes to generate cut image samples with various sizes;
inputting the cut images of other sizes except the minimum size in the cut image samples of various sizes into a 2D convolution layer of the style migration model to obtain weighted feature maps of various dimensions;
performing feature fusion on the weighted feature maps of various dimensions to generate texture pictures;
inputting the texture picture and a real texture label preset on each image in the model training sample into a pre-trained VGG network, and outputting a multilayer feature map in the VGG network;
and calculating the loss value of the style migration model according to the multilayer characteristic diagram.
Optionally, when the loss value of the style migration model reaches the minimum, generating a pre-trained style migration model, including:
and when the loss value of the style migration model does not reach the minimum value, updating the model parameters of the style migration model based on the loss value, and executing again the step of training the style migration model according to the model training samples to obtain the loss value of the style migration model.
In a second aspect, an embodiment of the present application provides an image style migration apparatus, including:
the data acquisition module is used for acquiring a target image to be rendered and determining a target style parameter to be migrated;
the data input module is used for inputting the target image and the target style parameters into a pre-trained style migration model; wherein the style migration model is generated by training on an acquired image style migration data set and an enhanced data set, and the enhanced data set is generated by using a generative adversarial network and the image style migration data set;
and the synthetic image generating module is used for outputting the stylized synthetic image corresponding to the target image.
In a third aspect, embodiments of the present application provide a computer storage medium having stored thereon a plurality of instructions adapted to be loaded by a processor and to perform the above-mentioned method steps.
In a fourth aspect, an embodiment of the present application provides a terminal, which may include: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the above-mentioned method steps.
The technical scheme provided by the embodiment of the application can have the following beneficial effects:
In the embodiment of the application, an image style migration device first acquires a target image to be rendered and determines a target style parameter to be migrated, and then inputs the target image and the target style parameter into a pre-trained style migration model; the style migration model is generated by training on an acquired image style migration data set and an enhanced data set, the enhanced data set is generated by using a generative adversarial network and the image style migration data set, and finally a stylized composite image corresponding to the target image is output. By using the generative adversarial network and the image style migration data set to generate the enhanced data set, and expanding the acquired image style migration data set with it, the application increases the scale of the model training samples and thereby improves the robustness and diversity of the image style migration model.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
Fig. 1 is a schematic flowchart of an image style migration method according to an embodiment of the present application;
FIG. 2 is a scene schematic diagram of an image style migration according to an embodiment of the present disclosure;
FIG. 3 is a schematic block diagram of a flowchart of a method for training an image style migration model according to an embodiment of the present application;
fig. 4 is a schematic network structure diagram of a VGG network according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of an image style migration apparatus according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of a terminal according to an embodiment of the present application.
Detailed Description
The following description and the drawings sufficiently illustrate specific embodiments of the invention to enable those skilled in the art to practice them.
It should be understood that the described embodiments are only some embodiments of the invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
In the description of the present invention, it is to be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art. In addition, in the description of the present invention, "a plurality" means two or more unless otherwise specified. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
The application provides an image style migration method and apparatus, a storage medium and a terminal, which solve the above problems in the related art. In the technical scheme provided by the application, a generative adversarial network and the image style migration data set are used to generate an enhanced data set, and the acquired image style migration data set is expanded with the enhanced data set, so that the scale of the model training samples is increased and the robustness and diversity of the image style migration model are improved, as described in detail in the following exemplary embodiments.
The image style migration method provided by the embodiments of the present application will be described in detail below with reference to fig. 1 to 4. The method may be implemented by a computer program executable on an image style migration apparatus based on the von Neumann architecture. The computer program may be integrated into an application or may run as a separate tool-type application.
Referring to fig. 1, a flowchart of an image style migration method is provided in an embodiment of the present application. As shown in fig. 1, the method of the embodiment of the present application may include the following steps:
s101, acquiring a target image to be rendered, and determining a target style parameter to be transferred;
the target image to be rendered is a content image, and is an image that needs to be subjected to style rendering, such as a face image or an animal image. The target style parameters to be migrated are style parameters for rendering the target image, and the style parameters are obtained according to a style migration model trained in advance.
Generally, the target image is an image on the local terminal that needs style rendering, but it may also be an image online or in the cloud. When determining the target style parameter to be migrated, several target style parameters may be retrieved at random from the pre-trained style migration model, or a target style parameter may be looked up in the pre-trained style migration model according to a specific style key identifier.
In a possible implementation, when image style migration is performed, an image sent online that needs style rendering is received, the pre-trained style migration model is loaded, an input style identifier is received, and finally the target style parameter to be migrated is looked up in the pre-trained style migration model according to the style identifier.
S102, inputting a target image and a target style parameter into a style migration model trained in advance;
the style migration model is generated by training on an acquired image style migration data set and an enhanced data set, and the enhanced data set is generated by using a generative adversarial network and the image style migration data set;
Generally, the pre-trained style migration model is a mathematical model for image style migration, namely a trained lightweight multi-scale feedforward generator network, which can effectively fuse low-level and high-level image features as well as global feature information and local texture information, effectively improving the training effect and generating high-quality style migration pictures.
In the embodiment of the application, the pre-trained style migration model is generated according to the following steps: first, an image style migration data set is collected; then an enhanced data set is generated by using a generative adversarial network and the image style migration data set, and the enhanced data set is merged into the image style migration data set to generate the model training samples; next, a style migration model is created; the style migration model is then trained on the model training samples to obtain its loss value; and finally, when the loss value of the style migration model reaches the minimum, the pre-trained style migration model is generated. A sketch of this pipeline is given below.
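Purely for illustration, the training pipeline listed above can be expressed as a short Python routine; every callable below is a hypothetical placeholder for the corresponding step rather than an interface defined by the patent:

```python
def build_style_migration_model(collect_dataset, generate_enhanced,
                                create_model, train_step):
    """Hypothetical orchestration of the training steps listed above."""
    dataset = collect_dataset()             # collect image style migration data set
    enhanced = generate_enhanced(dataset)   # GAN-generated enhanced data set
    training_samples = dataset + enhanced   # merge into model training samples (lists)
    model = create_model()                  # create the style migration model
    best_loss = float("inf")
    while True:
        loss = train_step(model, training_samples)  # one training pass, returns loss
        if loss >= best_loss:               # loss has reached its minimum
            return model                    # pre-trained style migration model
        best_loss = loss
```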
In a possible implementation, after the target image to be rendered and the target style parameter to be migrated are obtained in step S101, they may be input into the pre-trained lightweight multi-scale feedforward generator network for processing, and composite images with different specific textures can be generated in real time. This network makes style migration more accurate and less time-consuming.
And S103, outputting the stylized composite image corresponding to the target image.
For example, as shown in fig. 2, the target image to be rendered is an image of a cat, and the target style parameter to be migrated is the style parameter of a Van Gogh self-portrait; the image of the cat and the style parameter of the Van Gogh self-portrait are input into the pre-trained generator network for processing, and a cat image rendered in the style of the Van Gogh portrait is output, as sketched below.
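For illustration, such a feedforward stylization pass might look as follows in PyTorch; the checkpoint file name, the style index and the generator's (image, style) call signature are assumptions, since the patent does not fix a concrete interface:

```python
import torch
from PIL import Image
from torchvision import transforms

# Hypothetical artifacts: the checkpoint file, the style index and the
# generator's (image, style) call signature are illustrative assumptions.
generator = torch.load("style_migration_generator.pt", map_location="cpu")
generator.eval()

preprocess = transforms.Compose([transforms.Resize((256, 256)),
                                 transforms.ToTensor()])
content = preprocess(Image.open("cat.jpg").convert("RGB")).unsqueeze(0)
van_gogh_style = 7  # hypothetical index of the "Van Gogh self-portrait" style

with torch.no_grad():
    stylized = generator(content, van_gogh_style)  # one feedforward pass

transforms.ToPILImage()(stylized.squeeze(0).clamp(0, 1)).save("cat_van_gogh.jpg")
```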
In the embodiment of the application, an image style migration device first acquires a target image to be rendered and determines a target style parameter to be migrated, and then inputs the target image and the target style parameter into a pre-trained style migration model; the style migration model is generated by training on an acquired image style migration data set and an enhanced data set, the enhanced data set is generated by using a generative adversarial network and the image style migration data set, and finally a stylized composite image corresponding to the target image is output. By using the generative adversarial network and the image style migration data set to generate the enhanced data set, and expanding the acquired image style migration data set with it, the application increases the scale of the model training samples and thereby improves the robustness and diversity of the image style migration model.
Referring to fig. 3, a schematic flow chart of training an image style migration model is provided according to an embodiment of the present application. As shown in fig. 3, the method of the embodiment of the present application may include the following steps:
s201, collecting an image style migration data set;
The image style migration data set consists of the originally collected sample images for model training.
In one possible implementation, the image style migration data set is a collection of approximately one million style migration pictures of various types gathered from the web, covering 100 categories.
It should be noted that, because the number and variety of publicly available model training data sets for image style migration are limited, the development of image style migration technology is constrained; the application therefore needs to expand the image style migration data set.
S202, generating an enhanced data set by using a generative adversarial network and the image style migration data set;
A generative adversarial network (GAN) is a deep learning model; a GAN includes at least two sub-models, namely a generative model (generator) and a discriminant model (discriminator).
In general, a GAN is a way of building a generative model on top of a differentiable generator network: in practice, it fits a new distribution by applying the generative model to a prior distribution. The prior distribution is defined as a probability distribution over the overall distribution parameter θ. The basic idea of Bayesian theory is that in any statistical inference problem concerning the overall distribution parameter θ, besides the information provided by the samples, a prior distribution must be specified; it is an indispensable element of the inference. Bayesians hold that the prior distribution need not have an objective basis and may rest partly or wholly on subjective belief.
In a possible implementation, when generating the enhanced data set, a prior distribution is first calculated according to the image style migration data set, then a first number of data samples are obtained from the prior distribution, and finally the first number of data samples are input into the pre-trained generative model in the generative adversarial network, which outputs the enhanced data set.
Specifically, when the prior distribution is calculated according to the image style migration data set, prior information about the data set is first specified by the user based on subjective knowledge and input into the client; the client computes a probability distribution from the prior information using the Bayesian formula, and this probability distribution is taken as the prior distribution, as sketched below.
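A minimal sketch of this sampling step is given below, assuming a Gaussian prior whose parameters (mu, sigma) stand in for the prior distribution computed from the image style migration data set; the function and argument names are illustrative:

```python
import torch

def generate_enhanced_dataset(generator, num_samples, latent_dim=128,
                              mu=None, sigma=None, batch_size=64):
    """Draw the first number of samples from the prior distribution and
    pass them through the pre-trained generative model G.

    A Gaussian prior with parameters (mu, sigma) stands in for the prior
    computed from the image style migration data set; the defaults fall
    back to a standard normal prior.
    """
    mu = torch.zeros(latent_dim) if mu is None else mu
    sigma = torch.ones(latent_dim) if sigma is None else sigma
    generator.eval()
    images = []
    with torch.no_grad():
        for start in range(0, num_samples, batch_size):
            n = min(batch_size, num_samples - start)
            z = mu + sigma * torch.randn(n, latent_dim)  # z ~ p_z(z)
            images.append(generator(z))                  # G(z): enhanced images
    return torch.cat(images)
```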
Further, the pre-trained generative model in the generative adversarial network is generated according to the following steps. First, a generative adversarial network is acquired, the generative adversarial network comprising a generative model and a discriminant model. The images in the image style migration data set are normalized to generate normalized image style migration data samples, and a second number of image samples are obtained from the normalized samples. A third number of samples are obtained from the prior distribution and input into the generative model, which outputs a first target image sample. The second number of image samples and the first target image sample are input into the discriminant model, which outputs the loss value of the discriminant model. Finally, when the loss value of the discriminant model reaches the minimum, the pre-trained discriminant model is generated, and the pre-trained generative model is generated according to the pre-trained discriminant model.
Further, when the pre-trained generative model is generated according to the pre-trained discriminant model, a fourth number of samples are obtained from the prior distribution and input into the generative model, which outputs a second target image sample; the second target image sample is input into the pre-trained discriminant model, which outputs the loss value of the generative model; finally, when the loss value of the generative model reaches the minimum, the pre-trained generative model is generated.
Specifically, the images in the image style migration data set are first normalized; the specific formula is

x* = (x − min(x)) / (max(x) − min(x))

where x denotes the input data and x* denotes the normalized output, so that all data lie in [0, 1]; normalization increases the speed of convergence and can improve the accuracy of the model. Then a sample of capacity m, {x(1), ..., x(m)}, is obtained from the normalized image style migration data samples, and a sample of capacity m, {z(1), ..., z(m)}, is taken from the prior distribution pz(z). The samples {z(1), ..., z(m)} are input into the generative model G, which outputs m samples {G(z(1)), ..., G(z(m))}. The samples {x(1), ..., x(m)} and {G(z(1)), ..., G(z(m))} are then input into the discriminant model D, and the discriminant model optimizes its network by stochastic gradient ascent, yielding the pre-trained discriminant model; the specific formula is

∇θd (1/m) Σi=1..m [ log D(x(i)) + log(1 − D(G(z(i)))) ]

where x(i) denotes the i-th of the m data samples, z(i) denotes the i-th of the m prior samples, D denotes the discriminator, G denotes the generator, and m denotes the number of samples taken.

After the discriminant model is generated, another sample of capacity m, {z(1), ..., z(m)}, is taken from the prior distribution, and the generator optimizes its network by stochastic gradient descent; the specific formula is

∇θg (1/m) Σi=1..m log(1 − D(G(z(i))))

where z(i) denotes the separately taken prior samples, D denotes the discriminator, G denotes the generator, and m denotes the number of samples taken.
After repeated iterations of network optimization, the training of the generative model and the discriminant model becomes stable; the criterion for stability is that the model loss functions decrease normally and the generated samples are plausible. Finally, an enhanced data sample of capacity m is generated with the trained generative model. A sketch of this alternating training loop is given below.
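The alternating optimization just described might be sketched in PyTorch as follows; the network definitions, optimizer choice and learning rate are assumptions, and the generator here literally descends on log(1 − D(G(z))) as in the formula above (in practice the non-saturating variant is often preferred for gradient quality):

```python
import torch
import torch.nn as nn

def train_gan(G, D, data_loader, latent_dim=128, epochs=50, lr=2e-4):
    # Assumes D maps a batch of images to probabilities of shape (m, 1).
    bce = nn.BCELoss()
    opt_d = torch.optim.Adam(D.parameters(), lr=lr)
    opt_g = torch.optim.Adam(G.parameters(), lr=lr)
    for _ in range(epochs):
        for x in data_loader:                  # x: batch of m normalized real images
            m = x.size(0)
            # Discriminator step: ascend log D(x) + log(1 - D(G(z))),
            # implemented as descent on the equivalent BCE loss.
            z = torch.randn(m, latent_dim)     # sample of capacity m from the prior
            fake = G(z).detach()
            loss_d = bce(D(x), torch.ones(m, 1)) + bce(D(fake), torch.zeros(m, 1))
            opt_d.zero_grad(); loss_d.backward(); opt_d.step()
            # Generator step: descend log(1 - D(G(z))) on a fresh prior sample.
            z = torch.randn(m, latent_dim)
            loss_g = torch.log(1.0 - D(G(z)) + 1e-8).mean()
            opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return G, D
```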
S203, merging the enhanced data set into an image style migration data set to generate a model training sample;
in one possible implementation, after generating the enhanced data set according to step S202, the enhanced data set is supplemented to the original data set to obtain an enhanced image style migration data set, i.e. a model training sample.
S204, creating a style migration model;
typically, the style migration model is created using a generator network.
S205, training the style migration model according to the model training sample to obtain a loss value of the style migration model;
in the embodiment of the application, firstly, images in model training samples are cut according to a plurality of preset sizes to generate cut image samples of various sizes, then, cut images of other sizes except for the minimum size in the cut image samples of various sizes are input into a 2D convolution layer of a style migration model to obtain weighted feature maps of various dimensions, then, feature fusion is carried out on the weighted feature maps of various dimensions to generate texture pictures, then, the texture pictures and real texture labels preset on each image in the model training samples are input into a pre-trained VGG network, multi-layer feature maps in the VGG network are output, and finally, loss values of the style migration model are calculated according to the multi-layer feature maps.
In one possible implementation, since the pictures in the model training samples are mostly of size 256 × 256 (picture inputs are typically powers of 2), all pictures in the model training samples are cropped to 128 × 128, 64 × 64, 32 × 32 and 16 × 16, denoted x1, x2, x3, x4, according to the three-layer down-sampling structure of the generator network, and all cropped images except the minimum size, i.e. x1, x2, x3, are fed into 2D convolution layers to obtain weighted feature maps of the corresponding dimensions; the specific formula is:
yn = Conv2D(xn, inchannel, outchannel, ksize), (n = 1, 2, 3)
where xn (n = 1, 2, 3) denotes the cropped images x1, x2, x3 obtained by cropping the source image to 128 × 128, 64 × 64 and 32 × 32 respectively, inchannel denotes the number of input channels (set according to the number of image channels), outchannel denotes the number of output channels (usually set empirically according to the number of network layers), ksize denotes the convolution kernel size, Conv2D denotes 2D convolution, and yn (n = 1, 2, 3) denotes the weighted feature maps output at the different dimensions.
Feature fusion is performed on the weighted feature maps output at the different dimensions through the corresponding connection strategy to obtain a feature map containing richer global feature information and local texture information, generating the texture picture G(Z).
The specific connection strategy is as follows:
① The minimum-size cropped image x4 is up-sampled by deconvolution to z1, which has the same size as x3, and then added to the weighted feature map y3 to obtain the fused feature map f1; the specific formulas are:
z1 = DConv2D(x4, inchannel, outchannel, ksize)
f1 = z1 + y3
where inchannel represents the number of input channels, outchannel the number of output channels, ksize the deconvolution size, and DConv2D the 2D deconvolution.
② The fused feature map f1 is up-sampled by deconvolution to z2, which has the same size as x2, and then added to the weighted feature map y2 to obtain the fused feature map f2; the specific formulas are:
z2 = DConv2D(f1, inchannel, outchannel, ksize)
f2 = z2 + y2
where inchannel represents the number of input channels, outchannel the number of output channels, ksize the deconvolution size, and DConv2D the 2D deconvolution.
③ The fused feature map f2 is up-sampled by deconvolution to z3, which has the same size as x1, and then added to the weighted feature map y1 to obtain the fused feature map f3; the specific formulas are:
z3 = DConv2D(f2, inchannel, outchannel, ksize)
f3 = z3 + y1
where inchannel represents the number of input channels, outchannel the number of output channels, ksize the deconvolution size, and DConv2D the 2D deconvolution.
④ The fused feature map f3 is up-sampled by deconvolution to z4, which has the same size as the source image x, and then added to the source image x to generate the texture picture G(Z); the specific formulas are:
z4 = DConv2D(f3, inchannel, outchannel, ksize)
G(Z) = z4 + x.
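One possible PyTorch realization of connection strategies ① to ④ is sketched below; the channel widths, kernel sizes and the use of down-sampled copies of the source image in place of crops are assumptions made so that the tensor shapes line up:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleFusionGenerator(nn.Module):
    # Channel width and kernel size are hypothetical settings.
    def __init__(self, ch=3, width=32, ksize=3):
        super().__init__()
        pad = ksize // 2
        # 2D convolutions producing the weighted feature maps y1, y2, y3
        self.conv = nn.ModuleList(
            nn.Conv2d(ch, width, ksize, padding=pad) for _ in range(3))
        # 2D deconvolutions; stride 2 doubles the spatial size at each step
        self.deconv = nn.ModuleList(
            nn.ConvTranspose2d(c_in, width, 4, stride=2, padding=1)
            for c_in in (ch, width, width, width))
        self.to_rgb = nn.Conv2d(width, ch, ksize, padding=pad)

    def forward(self, x):                      # x: (b, ch, 256, 256) source image
        # x1..x4 at 128/64/32/16 (down-sampled copies stand in for crops)
        x1, x2, x3, x4 = (F.interpolate(x, scale_factor=s)
                          for s in (0.5, 0.25, 0.125, 0.0625))
        y1, y2, y3 = (conv(xi) for conv, xi in zip(self.conv, (x1, x2, x3)))
        f1 = self.deconv[0](x4) + y3           # z1 = DConv2D(x4), f1 = z1 + y3
        f2 = self.deconv[1](f1) + y2           # z2 = DConv2D(f1), f2 = z2 + y2
        f3 = self.deconv[2](f2) + y1           # z3 = DConv2D(f2), f3 = z3 + y1
        z4 = self.deconv[3](f3)                # up-sample to the source resolution
        return self.to_rgb(z4) + x             # G(Z) = z4 + x (residual output)
```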
The generated texture picture G(Z) and the real texture label Z′ preset for each image in the model training samples are input into a discriminator composed of a pre-trained VGG network; because the pre-trained VGG network has been trained on a very large image data set, it can recognize many low-level and high-level image features. Outputs are extracted from the layers relu1_1, relu2_1 and relu3_1 of the network, recorded as l = 1, 2, 3 respectively; the specific formula is

yl = VGGl(G(Z)), ŷl = VGGl(Z′), (l = 1, 2, 3)

The outputs yl and ŷl are used to calculate the perceptual loss, and the update parameters are propagated back; the specific formula is

Lperc = (1/L) Σl=1..L (1/Cl) Σj ||ylj − ŷlj||²

where the superscript l denotes the l-th layer output, the subscript j denotes the j-th channel, and L and Cl respectively denote the number of output layers and the number of channels.
It should be noted that VGG is one of the neural networks commonly used in transfer learning. VGG is a deep convolutional network developed by the Computer Vision Group of the University of Oxford together with DeepMind; it took second place in the classification task and first place in the localization task of the 2014 ILSVRC competition. The network structure of VGG is shown in fig. 4: it consists of 5 blocks of convolutional layers, 3 fully connected layers and an output layer, with max pooling between blocks, and the activation units of all hidden layers use the ReLU function. Replacing one convolutional layer with a large kernel by several convolutional layers with smaller (3 × 3) kernels reduces the number of parameters while performing more nonlinear mappings, which increases the fitting/expressive capability of the network. A sketch of the perceptual-loss computation follows.
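Below is a sketch of extracting the relu1_1, relu2_1 and relu3_1 feature maps with torchvision's pre-trained VGG and averaging the per-layer differences; the layer indices assume the VGG-19 variant, and the normalization only approximates the reconstructed formula above (F.mse_loss averages over all elements):

```python
import torch
import torch.nn.functional as F
from torchvision import models

# Indices of relu1_1, relu2_1, relu3_1 in torchvision's VGG-19 feature
# stack (an assumption; for VGG-16 they would be 1, 6, 11 as well only
# up to relu2_1, so the variant matters).
VGG19_RELU_IDX = (1, 6, 11)

class VGGFeatures(torch.nn.Module):
    def __init__(self):
        super().__init__()
        vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).features
        self.vgg = vgg.eval()
        for p in self.vgg.parameters():
            p.requires_grad_(False)       # the VGG discriminator side stays frozen

    def forward(self, x):
        feats, out = [], x
        for i, layer in enumerate(self.vgg):
            out = layer(out)
            if i in VGG19_RELU_IDX:
                feats.append(out)         # y^l for l = 1, 2, 3
                if i == VGG19_RELU_IDX[-1]:
                    break
        return feats

def perceptual_loss(vgg_feats, generated, target):
    # Mean squared difference between the multi-layer feature maps of the
    # texture picture G(Z) and the real texture label Z'.
    ys, ys_hat = vgg_feats(generated), vgg_feats(target)
    return sum(F.mse_loss(y, y_hat) for y, y_hat in zip(ys, ys_hat)) / len(ys)
```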
And S206, when the loss value of the style migration model reaches the minimum value, generating a style migration model trained in advance.
In a possible implementation, when the loss value of the style migration model reaches the minimum, the pre-trained style migration model is generated; when it does not, the model parameters of the style migration model are updated based on the loss value, and the step of training the style migration model on the model training samples to obtain its loss value is executed again.
In the embodiment of the application, an image style migration device first acquires a target image to be rendered and determines a target style parameter to be migrated, and then inputs the target image and the target style parameter into a pre-trained style migration model; the style migration model is generated by training on an acquired image style migration data set and an enhanced data set, the enhanced data set is generated by using a generative adversarial network and the image style migration data set, and finally a stylized composite image corresponding to the target image is output. By using the generative adversarial network and the image style migration data set to generate the enhanced data set, and expanding the acquired image style migration data set with it, the application increases the scale of the model training samples and thereby improves the robustness and diversity of the image style migration model.
The following are embodiments of the apparatus of the present invention that may be used to perform embodiments of the method of the present invention. For details which are not disclosed in the embodiments of the apparatus of the present invention, reference is made to the embodiments of the method of the present invention.
Referring to fig. 5, a schematic structural diagram of an image style migration apparatus according to an exemplary embodiment of the present invention is shown. The image style migration apparatus may be implemented as all or a part of the terminal by software, hardware, or a combination of both. The device 1 comprises a data acquisition module 10, a data input module 20 and a composite image generation module 30.
The data acquisition module 10 is configured to acquire a target image to be rendered and determine a target style parameter to be migrated;
a data input module 20, configured to input the target image and the target style parameters into a pre-trained style migration model; wherein the style migration model is generated by training on an acquired image style migration data set and an enhanced data set, and the enhanced data set is generated by using a generative adversarial network and the image style migration data set;
and a synthetic image generating module 30 for outputting the stylized synthetic image corresponding to the target image.
It should be noted that, when the image style migration apparatus provided in the foregoing embodiment executes the image style migration method, only the division of the functional modules is illustrated, and in practical applications, the functions may be distributed to different functional modules according to needs, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the image style migration apparatus and the image style migration method provided in the above embodiments belong to the same concept, and details of implementation processes thereof are referred to in the method embodiments, and are not described herein again.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
In the embodiment of the application, an image style migration device first acquires a target image to be rendered and determines a target style parameter to be migrated, and then inputs the target image and the target style parameter into a pre-trained style migration model; the style migration model is generated by training on an acquired image style migration data set and an enhanced data set, the enhanced data set is generated by using a generative adversarial network and the image style migration data set, and finally a stylized composite image corresponding to the target image is output. By using the generative adversarial network and the image style migration data set to generate the enhanced data set, and expanding the acquired image style migration data set with it, the application increases the scale of the model training samples and thereby improves the robustness and diversity of the image style migration model.
The present invention also provides a computer readable medium, on which program instructions are stored, which program instructions, when executed by a processor, implement the image style migration method provided by the above-mentioned various method embodiments.
The present invention also provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the image style migration method of the above-described respective method embodiments.
Please refer to fig. 6, which provides a schematic structural diagram of a terminal according to an embodiment of the present application. As shown in fig. 6, terminal 1000 can include: at least one processor 1001, at least one network interface 1004, a user interface 1003, memory 1005, at least one communication bus 1002.
Wherein a communication bus 1002 is used to enable connective communication between these components.
The user interface 1003 may include a Display screen (Display) and a Camera (Camera), and the optional user interface 1003 may also include a standard wired interface and a wireless interface.
The network interface 1004 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface), among others.
Processor 1001 may include one or more processing cores, among other things. The processor 1001 interfaces various components throughout the electronic device 1000 using various interfaces and lines to perform various functions of the electronic device 1000 and to process data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 1005 and invoking data stored in the memory 1005. Alternatively, the processor 1001 may be implemented in at least one hardware form of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The processor 1001 may integrate one or more of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. Wherein, the CPU mainly processes an operating system, a user interface, an application program and the like; the GPU is used for rendering and drawing the content required to be displayed by the display screen; the modem is used to handle wireless communications. It is understood that the modem may not be integrated into the processor 1001, but may be implemented by a separate chip.
The Memory 1005 may include a Random Access Memory (RAM) or a Read-Only Memory (Read-Only Memory). Optionally, the memory 1005 includes a non-transitory computer-readable medium. The memory 1005 may be used to store an instruction, a program, code, a set of codes, or a set of instructions. The memory 1005 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the various method embodiments described above, and the like; the storage data area may store data and the like referred to in the above respective method embodiments. The memory 1005 may optionally be at least one memory device located remotely from the processor 1001. As shown in fig. 6, a memory 1005, which is a kind of computer storage medium, may include therein an operating system, a network communication module, a user interface module, and an image style migration application program.
In the terminal 1000 shown in fig. 6, the user interface 1003 is mainly used as an interface for providing input for a user, and acquiring data input by the user; and the processor 1001 may be configured to invoke the image style migration application stored in the memory 1005 and specifically perform the following operations:
acquiring a target image to be rendered, and determining a target style parameter to be migrated;
inputting the target image and the target style parameters into a pre-trained style migration model; wherein the style migration model is generated by training on an acquired image style migration data set and an enhanced data set, and the enhanced data set is generated by using a generative adversarial network and the image style migration data set;
and outputting the stylized composite image corresponding to the target image.
In one embodiment, the processor 1001 specifically performs the following operations when performing the generation of the pre-trained style migration model:
collecting an image style migration data set;
generating an enhanced data set by using a generative adversarial network and the image style migration data set;
merging the enhanced data set into an image style migration data set to generate a model training sample;
creating a style migration model;
training the style migration model according to the model training sample to obtain a loss value of the style migration model;
and when the loss value of the style migration model reaches the minimum value, generating a pre-trained style migration model.
In one embodiment, when generating the enhanced data set by using the generative adversarial network and the image style migration data set, the processor 1001 specifically performs the following operations:
calculating a prior distribution according to the image style migration data set;
obtaining a first number of data samples from the prior distribution;
and inputting the first number of data samples into a pre-trained generative model in the generative adversarial network, and outputting the enhanced data set.
In one embodiment, when generating the pre-trained generative model, the processor 1001 specifically performs the following operations:
acquiring a generative adversarial network, wherein the generative adversarial network comprises a generative model and a discriminant model;
normalizing the images in the image style migration data set to generate normalized image style migration data samples;
acquiring a second number of image samples from the normalized image style migration data samples;
obtaining a third number of samples from the prior distribution;
inputting a third number of samples into the generative model, and outputting a first target image sample;
inputting the second number of image samples and the first target image sample into the discriminant model, and outputting a loss value of the discriminant model;
when the loss value of the discriminant model reaches the minimum value, generating a pre-trained discriminant model;
and generating a pre-trained generative model according to the pre-trained discriminant model.
In one embodiment, the processor 1001, when executing the generation of the pre-trained generative model according to the pre-trained discriminant model, specifically performs the following operations:
obtaining a fourth number of samples from the prior distribution;
inputting a fourth number of samples into the generative model, and outputting a second target image sample;
inputting the second target image sample into the pre-trained discriminant model, and outputting a loss value of the generative model;
when the loss value of the generative model reaches the minimum, the pre-trained generative model is generated.
In one embodiment, when training the style migration model according to the model training samples to obtain the loss value of the style migration model, the processor 1001 specifically performs the following operations:
cutting images in the model training sample according to a plurality of preset sizes to generate cut image samples with various sizes;
inputting the cut images of other sizes except the minimum size in the cut image samples of various sizes into a 2D convolution layer of the style migration model to obtain weighted feature maps of various dimensions;
performing feature fusion on the weighted feature maps of various dimensions to generate texture pictures;
inputting the texture picture and a real texture label preset on each image in the model training sample into a pre-trained VGG network, and outputting a multilayer feature map in the VGG network;
and calculating the loss value of the style migration model according to the multilayer characteristic diagram.
In one embodiment, when generating the pre-trained style migration model upon the loss value of the style migration model reaching the minimum, the processor 1001 specifically performs the following operations:
and when the loss value of the style migration model does not reach the minimum value, updating the model parameters of the style migration model based on the loss value, and executing again the step of training the style migration model according to the model training samples to obtain the loss value of the style migration model.
In the embodiment of the application, an image style migration device first acquires a target image to be rendered and determines a target style parameter to be migrated, and then inputs the target image and the target style parameter into a pre-trained style migration model; the style migration model is generated by training on an acquired image style migration data set and an enhanced data set, the enhanced data set is generated by using a generative adversarial network and the image style migration data set, and finally a stylized composite image corresponding to the target image is output. By using the generative adversarial network and the image style migration data set to generate the enhanced data set, and expanding the acquired image style migration data set with it, the application increases the scale of the model training samples and thereby improves the robustness and diversity of the image style migration model.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above may be implemented by a computer program to instruct associated hardware, and the program for image style migration may be stored in a computer-readable storage medium, and when executed, may include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory or a random access memory.
The above disclosure is only a preferred embodiment of the present application and certainly cannot be taken to limit its scope; the present application is not limited thereto, and all equivalent variations and modifications remain within its coverage.

Claims (10)

1. An image style migration method, characterized in that the method comprises:
acquiring a target image to be rendered, and determining a target style parameter to be migrated;
inputting the target image and the target style parameters into a pre-trained style migration model; wherein the style migration model is generated by training on an acquired image style migration data set and an enhanced data set, and the enhanced data set is generated by using a generative adversarial network and the image style migration data set;
and outputting the stylized composite image corresponding to the target image.
2. The method of claim 1, wherein generating a pre-trained style migration model comprises:
collecting an image style migration data set;
generating an enhanced data set using a generative adversarial network and the image style migration data set;
merging the enhanced data set into the image style migration data set to generate a model training sample;
creating a style migration model;
training the style migration model according to the model training sample to obtain a loss value of the style migration model;
and when the loss value of the style migration model reaches the minimum value, generating a pre-trained style migration model.
3. The method according to claim 2, wherein the training the style migration model according to the model training sample to obtain a loss value of the style migration model comprises:
cutting the images in the model training samples according to a plurality of preset sizes to generate cut image samples with various sizes;
inputting the cut images of other sizes except the minimum size in the cut image samples of the multiple sizes into the 2D convolution layer of the style migration model to obtain weighted feature maps of multiple different dimensions;
performing feature fusion on the weighted feature maps of the multiple different dimensions to generate texture pictures;
inputting the texture picture and a real texture label preset on each image in the model training sample into a pre-trained VGG network, and outputting a multilayer feature map in the VGG network;
and calculating the loss value of the style migration model according to the multilayer characteristic diagram.
4. The method of claim 2, wherein generating a pre-trained style migration model when the loss value of the style migration model reaches a minimum comprises:
and when the loss value of the style migration model does not reach the minimum value, updating the model parameters of the style migration model based on the loss value, and executing again the step of training the style migration model according to the model training samples to obtain the loss value of the style migration model.
5. The method of any of claims 2-4, wherein generating the enhanced data set using the generative adversarial network and the image style migration data set comprises:
calculating a prior distribution according to the image style migration data set;
obtaining a first number of data samples from the prior distribution;
and inputting the first number of data samples into a pre-trained generative model in the generative adversarial network, and outputting an enhanced data set.
6. The method of claim 5, wherein the pre-trained generative model in the generative adversarial network is generated by:
acquiring a generative adversarial network, wherein the generative adversarial network comprises a generative model and a discriminant model;
normalizing the images in the image style migration data set to generate normalized image style migration data samples;
acquiring a second number of image samples from the normalized image style migration data samples;
obtaining a third number of samples from the prior distribution;
inputting the third number of samples into the generative model, and outputting a first target image sample;
inputting the second number of image samples and the first target image samples into the discriminant model, and outputting a loss value of the discriminant model;
when the loss value of the discrimination model reaches the minimum value, generating a pre-trained discrimination model;
and generating a pre-trained generation model according to the pre-trained discrimination model.
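Claim 6's discriminator phase follows the familiar GAN alternation: normalized real style images against generator outputs. The binary cross-entropy objective and the [0, 1] input range below are assumptions; the claim names neither.

```python
# Claim-6 discriminator step; D is assumed to output probabilities in (0, 1).
import torch
import torch.nn.functional as F

def discriminator_step(G, D, real_batch, prior, opt_D, third_number=64):
    real = (real_batch - 0.5) / 0.5      # the "second number" of samples, normalized
    z = prior.sample((third_number,))    # a "third number" of prior samples
    fake = G(z).detach()                 # the first target image samples
    loss_D = (F.binary_cross_entropy(D(real), torch.ones(real.size(0), 1))
              + F.binary_cross_entropy(D(fake), torch.zeros(fake.size(0), 1)))
    opt_D.zero_grad()
    loss_D.backward()
    opt_D.step()                         # repeat until loss_D stops improving
    return loss_D.item()
```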
7. The method of claim 6, wherein generating a pre-trained generative model from a pre-trained discriminative model comprises:
obtaining a fourth number of samples from the prior distribution;
inputting the fourth number of samples into the generative model, and outputting second target image samples;
inputting the second target image samples into the pre-trained discriminative model, and outputting a loss value of the generative model;
and when the loss value of the generative model reaches the minimum value, generating a pre-trained generative model.
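Claim 7 then holds the pre-trained discriminator fixed and trains the generator so that its outputs score as real; the loss choice is again an assumption.

```python
# Claim-7 generator step against the frozen, pre-trained discriminative model.
import torch
import torch.nn.functional as F

def generator_step(G, D, prior, opt_G, fourth_number=64):
    for p in D.parameters():
        p.requires_grad_(False)          # the discriminative model stays fixed
    z = prior.sample((fourth_number,))   # a "fourth number" of prior samples
    fake = G(z)                          # the second target image samples
    loss_G = F.binary_cross_entropy(D(fake), torch.ones(fake.size(0), 1))
    opt_G.zero_grad()
    loss_G.backward()
    opt_G.step()                         # repeat until loss_G stops improving
    for p in D.parameters():
        p.requires_grad_(True)           # restore D for the next alternation
    return loss_G.item()
```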
8. An image style migration apparatus, characterized in that the apparatus comprises:
the data acquisition module is used for acquiring a target image to be rendered and determining a target style parameter to be migrated;
the data input module is used for inputting the target image and the target style parameters into a pre-trained style migration model; the style migration model is generated by training on an acquired image style migration data set and an enhanced data set, and the enhanced data set is generated using a generative adversarial network and the image style migration data set;
and the composite image generation module is used for outputting the stylized composite image corresponding to the target image.
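The claim-8 apparatus mirrors the method as three components. The class below is a purely illustrative mapping of those modules onto code; no name in it comes from the patent.

```python
# Hypothetical decomposition of the claim-8 apparatus.
class ImageStyleMigrationApparatus:
    def __init__(self, style_model):
        self.style_model = style_model   # pre-trained style migration model

    def acquire(self, target_image, style_params):    # data acquisition module
        return target_image, style_params

    def compose(self, target_image, style_params):    # data input + composite image
        # Feed the target image and style parameters to the model and return
        # the stylized composite image.
        return self.style_model(target_image, style_params)
```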
9. A computer storage medium, characterized in that it stores a plurality of instructions adapted to be loaded by a processor and to perform the method steps according to any of claims 1-7.
10. A terminal, comprising: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the method steps of any of claims 1-7.
CN202111198634.9A 2021-10-14 2021-10-14 Image style migration method and device, storage medium and terminal Pending CN114066718A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111198634.9A CN114066718A (en) 2021-10-14 2021-10-14 Image style migration method and device, storage medium and terminal

Publications (1)

Publication Number Publication Date
CN114066718A true CN114066718A (en) 2022-02-18

Family

ID=80234560

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111198634.9A Pending CN114066718A (en) 2021-10-14 2021-10-14 Image style migration method and device, storage medium and terminal

Country Status (1)

Country Link
CN (1) CN114066718A (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115187706B (en) * 2022-06-28 2024-04-05 北京汉仪创新科技股份有限公司 Lightweight method and system for face style migration, storage medium and electronic equipment
CN115187706A (en) * 2022-06-28 2022-10-14 北京汉仪创新科技股份有限公司 Lightweight method and system for face style migration, storage medium and electronic equipment
CN115249221A (en) * 2022-09-23 2022-10-28 阿里巴巴(中国)有限公司 Image processing method and device and cloud equipment
CN116051683A (en) * 2022-12-20 2023-05-02 中国科学院空天信息创新研究院 Remote sensing image generation method, storage medium and device based on style self-organization
CN116051683B (en) * 2022-12-20 2023-07-04 中国科学院空天信息创新研究院 Remote sensing image generation method, storage medium and device based on style self-organization
CN115761267A (en) * 2022-12-27 2023-03-07 四川数聚智造科技有限公司 Detection method for solving outdoor low-frequency image acquisition abnormity
CN115761267B (en) * 2022-12-27 2023-06-16 四川数聚智造科技有限公司 Detection method for solving outdoor low-frequency image acquisition abnormality
WO2024145808A1 (en) * 2023-01-04 2024-07-11 京东方科技集团股份有限公司 Font style migration network training method and apparatus, device, and storage medium
WO2024193489A1 (en) * 2023-02-17 2024-09-26 北京字跳网络技术有限公司 Image processing method and apparatus, device and storage medium
CN115861312A (en) * 2023-02-24 2023-03-28 季华实验室 OLED dry film defect detection method based on style migration positive sample generation
CN115861312B (en) * 2023-02-24 2023-05-26 季华实验室 OLED dry film defect detection method based on style migration positive sample generation
CN116596753B (en) * 2023-07-20 2024-02-02 哈尔滨工程大学三亚南海创新发展基地 Acoustic image dataset expansion method and system based on style migration network
CN116596753A (en) * 2023-07-20 2023-08-15 哈尔滨工程大学三亚南海创新发展基地 Acoustic image dataset expansion method and system based on style migration network

Similar Documents

Publication Publication Date Title
CN114066718A (en) Image style migration method and device, storage medium and terminal
CN110674829B (en) Three-dimensional target detection method based on graph convolution attention network
US11977960B2 (en) Techniques for generating designs that reflect stylistic preferences
CN112434721A (en) Image classification method, system, storage medium and terminal based on small sample learning
US20210150807A1 (en) Generating realistic point clouds
CN111476708B (en) Model generation method, model acquisition method, device, equipment and storage medium
GB2585396A (en) Utilizing a critical edge detection neural network and a geometric model to determine camera parameters from a single digital image
US9262853B2 (en) Virtual scene generation based on imagery
CN117157678A (en) Method and system for graph-based panorama segmentation
CN113836338B (en) Fine granularity image classification method, device, storage medium and terminal
CN110807362A (en) Image detection method and device and computer readable storage medium
CN113408570A (en) Image category identification method and device based on model distillation, storage medium and terminal
US20230153965A1 (en) Image processing method and related device
CN115131849A (en) Image generation method and related device
CN113869371A (en) Model training method, clothing fine-grained segmentation method and related device
CN112819510A (en) Fashion trend prediction method, system and equipment based on clothing multi-attribute recognition
CN114610272A (en) AI model generation method, electronic device, and storage medium
CN111967478B (en) Feature map reconstruction method, system, storage medium and terminal based on weight overturn
CN113554655A (en) Optical remote sensing image segmentation method and device based on multi-feature enhancement
CN112668675A (en) Image processing method and device, computer equipment and storage medium
CN114913330B (en) Point cloud component segmentation method and device, electronic equipment and storage medium
CN114913305B (en) Model processing method, device, equipment, storage medium and computer program product
CN116978042A (en) Image processing method, related device and storage medium
CN113658338A (en) Point cloud tree monomer segmentation method and device, electronic equipment and storage medium
CN113840169A (en) Video processing method and device, computing equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination