CN114066718A - Image style migration method and device, storage medium and terminal - Google Patents
Image style migration method and device, storage medium and terminal
- Publication number
- CN114066718A (application CN202111198634.9A)
- Authority
- CN
- China
- Prior art keywords
- model
- style migration
- image
- data set
- style
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06T3/04—Context-preserving transformations, e.g. by using an importance map
- G06N3/045—Combinations of networks
- G06T11/001—Texturing; Colouring; Generation of texture or colour
- G06T7/11—Region-based segmentation
- G06T2207/20081—Training; Learning
- G06T2207/20221—Image fusion; Image merging
Abstract
The invention discloses an image style migration method, an image style migration apparatus, a storage medium and a terminal, wherein the method comprises the following steps: acquiring a target image to be rendered and determining a target style parameter to be migrated; inputting the target image and the target style parameter into a pre-trained style migration model, wherein the style migration model is generated by training on a collected image style migration data set and an enhanced data set, and the enhanced data set is generated from the image style migration data set by a generative adversarial network; and outputting a stylized composite image corresponding to the target image. Because the generative adversarial network generates an enhanced data set from the image style migration data set, and the enhanced data set expands the collected data set, the scale of the model training samples is increased, and the robustness and diversity of the image style migration model are improved.
Description
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image style migration method, an image style migration apparatus, a storage medium, and a terminal.
Background
With the development of internet technology, image processing techniques have become increasingly diverse. Image style conversion changes the details and style of an image while preserving its main content, for example aging a portrait, converting a portrait into a hand-drawn style, or converting a photo into an animation style.
The main task of image style migration is to fuse a content picture and a style picture into a style-rendered composite picture. The basic principle is to define two distances, one for content (DC) and one for style (DS): DC measures how different the content of two pictures is, while DS measures how different their style is. A third picture, the input, is then optimized so as to minimize both its content distance from the content picture and its style distance from the style picture.
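Purely as an illustrative sketch, and not as the method claimed herein, the two distances can be written, for example, as a feature-map mean squared error (DC) and a Gram-matrix mean squared error (DS) in PyTorch; the feature maps are assumed to come from any fixed convolutional network:

```python
# Illustrative sketch of DC and DS (PyTorch assumed; not the claimed method):
# DC is a mean squared error between feature maps, DS a mean squared error
# between Gram matrices, which summarize texture/style statistics.
import torch
import torch.nn.functional as F

def content_distance(feat_input: torch.Tensor, feat_content: torch.Tensor) -> torch.Tensor:
    # DC: how different the content of two pictures is
    return F.mse_loss(feat_input, feat_content)

def gram_matrix(feat: torch.Tensor) -> torch.Tensor:
    # Channel-wise correlations of a (B, C, H, W) feature map
    b, c, h, w = feat.size()
    f = feat.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def style_distance(feat_input: torch.Tensor, feat_style: torch.Tensor) -> torch.Tensor:
    # DS: how different the style (texture statistics) of two pictures is
    return F.mse_loss(gram_matrix(feat_input), gram_matrix(feat_style))
```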
In existing image style conversion, a neural network model is trained on a large image style migration data set to obtain good image fitting capability. However, few model training data sets for image style migration are publicly available, and their number and variety are limited, which restricts the development of image style migration technology. As a result, the style of images output by the trained conversion model is poor, and the robustness and diversity of the image style migration model are reduced.
Disclosure of Invention
The embodiment of the application provides an image style migration method and device, a storage medium and a terminal. The following presents a simplified summary in order to provide a basic understanding of some aspects of the disclosed embodiments. This summary is not an extensive overview and is intended to neither identify key/critical elements nor delineate the scope of such embodiments. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is presented later.
In a first aspect, an embodiment of the present application provides an image style migration method, where the method includes:
acquiring a target image to be rendered, and determining a target style parameter to be migrated;
inputting the target image and the target style parameters into a pre-trained style migration model; the style migration model is generated by training on a collected image style migration data set and an enhanced data set, and the enhanced data set is generated from the image style migration data set by a generative adversarial network;
and outputting the stylized composite image corresponding to the target image.
Optionally, the pre-trained style migration model is generated according to the following steps:
collecting an image style migration data set;
generating an enhanced data set by using a generative adversarial network and the image style migration data set;
merging the enhanced data set into an image style migration data set to generate a model training sample;
creating a style migration model;
training the style migration model according to the model training sample to obtain a loss value of the style migration model;
and when the loss value of the style migration model reaches the minimum value, generating a pre-trained style migration model.
Optionally, generating the enhanced data set by using the generative adversarial network and the image style migration data set includes:
calculating prior distribution according to the image style migration data set;
obtaining a first number of data samples from a prior distribution;
and inputting the first number of data samples into a pre-trained generative model in the generative adversarial network, and outputting the enhanced data set.
Optionally, the pre-trained generative model in the generative adversarial network is generated according to the following steps:
acquiring a generative adversarial network, wherein the generative adversarial network comprises a generative model and a discrimination model;
normalizing the images in the image style migration data set to generate normalized image style migration data samples;
acquiring a second number of image samples from the normalized image style migration data samples;
obtaining a third number of samples from the prior distribution;
inputting a third number of samples into the generative model, and outputting a first target image sample;
inputting a second number of image samples and the first target image samples into a discrimination model, and outputting a loss value of the discrimination model;
when the loss value of the discrimination model reaches the minimum value, generating a pre-trained discrimination model;
and generating a pre-trained generation model according to the pre-trained discrimination model.
Optionally, generating a pre-trained generative model according to the pre-trained discriminant model includes:
obtaining a fourth number of samples from the prior distribution;
inputting a fourth number of samples into the generative model, and outputting a second target image sample;
inputting a second target image sample into a pre-trained discrimination model, and outputting a loss value of the generated model;
when the loss value of the generative model reaches the minimum, the pre-trained generative model is generated.
Optionally, training the style migration model according to the model training sample to obtain the loss value of the style migration model includes:
cutting images in the model training sample according to a plurality of preset sizes to generate cut image samples with various sizes;
inputting the cut images of other sizes except the minimum size in the cut image samples of various sizes into a 2D convolution layer of the style migration model to obtain weighted feature maps of various dimensions;
performing feature fusion on the weighted feature maps of various dimensions to generate texture pictures;
inputting the texture picture and a real texture label preset on each image in the model training sample into a pre-trained VGG network, and outputting a multilayer feature map in the VGG network;
and calculating the loss value of the style migration model according to the multilayer characteristic diagram.
Optionally, when the loss value of the style migration model reaches the minimum, generating a pre-trained style migration model, including:
and when the loss value of the style migration model does not reach the minimum value, updating the model parameters of the style migration model based on the loss value of the style migration model, and continuing to execute the step of obtaining the loss value of the style migration model after training the style migration model according to the model training sample.
In a second aspect, an embodiment of the present application provides an image style migration apparatus, including:
the data acquisition module is used for acquiring a target image to be rendered and determining a target style parameter to be migrated;
the data input module is used for inputting the target image and the target style parameters into a pre-trained style migration model; the style migration model is generated by training on a collected image style migration data set and an enhanced data set, and the enhanced data set is generated from the image style migration data set by a generative adversarial network;
and the synthetic image generating module is used for outputting the stylized synthetic image corresponding to the target image.
In a third aspect, embodiments of the present application provide a computer storage medium having stored thereon a plurality of instructions adapted to be loaded by a processor and to perform the above-mentioned method steps.
In a fourth aspect, an embodiment of the present application provides a terminal, which may include: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the above-mentioned method steps.
The technical scheme provided by the embodiment of the application can have the following beneficial effects:
In the embodiment of the application, the image style migration apparatus first acquires a target image to be rendered and determines a target style parameter to be migrated, then inputs the target image and the target style parameter into a pre-trained style migration model, wherein the style migration model is generated by training on a collected image style migration data set and an enhanced data set, and the enhanced data set is generated from the image style migration data set by a generative adversarial network; finally, a stylized composite image corresponding to the target image is output. Because the enhanced data set generated by the generative adversarial network expands the collected image style migration data set, the scale of the model training samples is increased, and the robustness and diversity of the image style migration model are improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
Fig. 1 is a schematic flowchart of an image style migration method according to an embodiment of the present application;
FIG. 2 is a scene schematic diagram of an image style migration according to an embodiment of the present disclosure;
FIG. 3 is a schematic block diagram of a flowchart of a method for training an image style migration model according to an embodiment of the present application;
fig. 4 is a schematic network structure diagram of a VGG network according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of an image style migration apparatus according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of a terminal according to an embodiment of the present application.
Detailed Description
The following description and the drawings sufficiently illustrate specific embodiments of the invention to enable those skilled in the art to practice them.
It should be understood that the described embodiments are only some embodiments of the invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
In the description of the present invention, it is to be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art. In addition, in the description of the present invention, "a plurality" means two or more unless otherwise specified. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
The application provides an image style migration method, an image style migration apparatus, a storage medium and a terminal to solve the problems in the related art. In the technical scheme provided by the application, a generative adversarial network and the image style migration data set are used to generate an enhanced data set, and the collected image style migration data set is expanded with the enhanced data set, so that the scale of the model training samples is increased and the robustness and diversity of the image style migration model are improved, as described in detail in the exemplary embodiments below.
The image style migration method provided by the embodiment of the present application will be described in detail below with reference to fig. 1 to 4. The method may be implemented by a computer program and can run on an image style migration apparatus based on the von Neumann architecture. The computer program may be integrated into an application or may run as a separate tool-style application.
Referring to fig. 1, a flowchart of an image style migration method is provided in an embodiment of the present application. As shown in fig. 1, the method of the embodiment of the present application may include the following steps:
s101, acquiring a target image to be rendered, and determining a target style parameter to be transferred;
the target image to be rendered is a content image, and is an image that needs to be subjected to style rendering, such as a face image or an animal image. The target style parameters to be migrated are style parameters for rendering the target image, and the style parameters are obtained according to a style migration model trained in advance.
Generally, the target image is an image on the local terminal that needs style rendering, but it may also be an image online or in the cloud. When determining the target style parameters to be migrated, several parameter styles can be randomly selected from the pre-trained style migration model, or a specific style can be looked up in the model according to a style key identifier.
In one possible implementation, when performing image style migration, an image sent online that needs style rendering is received, the pre-trained style migration model is loaded, an input style identifier is received, and finally the target style parameter to be migrated is looked up in the pre-trained style migration model according to the style identifier.
S102, inputting a target image and a target style parameter into a style migration model trained in advance;
the style migration model is generated by training on a collected image style migration data set and an enhanced data set, and the enhanced data set is generated from the image style migration data set by a generative adversarial network;
Generally, the pre-trained style migration model is a mathematical model for image style migration, namely a trained lightweight multi-scale feed-forward generator network. It can effectively fuse low-level and high-level image characteristics, fuse global feature information with local texture information, effectively improve the training effect, and generate high-quality style migration pictures.
In the embodiment of the application, the pre-trained style migration model is generated according to the following steps: first, an image style migration data set is collected; an enhanced data set is then generated using a generative adversarial network and the image style migration data set, and merged into the image style migration data set to form the model training samples; next, a style migration model is created and trained on the model training samples to obtain its loss value; finally, when the loss value of the style migration model reaches the minimum, the pre-trained style migration model is generated.
In one possible implementation, after the target image to be rendered and the target style parameter to be migrated are obtained in step S101, they can be input into the pre-trained lightweight multi-scale feed-forward generator network for processing, and composite images with different specific textures can be generated in real time. This makes the style migration more accurate and less time consuming.
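A minimal sketch of this inference flow (steps S101 to S103) is given below; the file names, the saved-model format and the model's call signature are assumptions for illustration only:

```python
# Minimal sketch of the inference flow S101-S103; file names and the model
# interface are assumptions, not the patented implementation.
import torch
from PIL import Image
from torchvision import transforms

to_tensor = transforms.Compose([transforms.Resize((256, 256)),
                                transforms.ToTensor()])

model = torch.load("style_migration_model.pt")   # pre-trained generator (assumed path)
model.eval()

target_image = to_tensor(Image.open("content.jpg")).unsqueeze(0)  # image to be rendered
style_params = torch.load("target_style_params.pt")               # target style parameter (assumed)

with torch.no_grad():
    composite = model(target_image, style_params)  # stylized composite image (S103)
```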
And S103, outputting the stylized composite image corresponding to the target image.
For example, as shown in fig. 2, the target image to be rendered is an image of a cat and the target style parameter to be migrated is the style parameter of a Van Gogh self-portrait; the image of the cat and the Van Gogh style parameter are input into the pre-trained generator network for processing, and a cat image in a style similar to the Van Gogh portrait is output.
In the embodiment of the application, the image style migration apparatus first acquires a target image to be rendered and determines a target style parameter to be migrated, then inputs the target image and the target style parameter into a pre-trained style migration model, wherein the style migration model is generated by training on a collected image style migration data set and an enhanced data set, and the enhanced data set is generated from the image style migration data set by a generative adversarial network; finally, a stylized composite image corresponding to the target image is output. Because the enhanced data set generated by the generative adversarial network expands the collected image style migration data set, the scale of the model training samples is increased, and the robustness and diversity of the image style migration model are improved.
Referring to fig. 3, a schematic flow chart of training an image style migration model is provided according to an embodiment of the present application. As shown in fig. 3, the method of the embodiment of the present application may include the following steps:
s201, collecting an image style migration data set;
wherein the image style migration data set consists of the originally collected sample images used for model training.
In one possible implementation, the image style migration data set is a collection of approximately 1 million style migration pictures of various types gathered from the web, covering 100 categories.
It should be noted that, because the number and types of model training data sets disclosed in the current image style migration are limited, the development of the image style migration technology is limited, and thus the application needs to perform data expansion on the image style migration data sets.
S202, generating an enhanced data set by using a generative adversarial network and the image style migration data set;
A generative adversarial network (GAN) is a deep learning model. A GAN includes at least two sub-models: a generative model and a discrimination model.
In general, a GAN is a way of building a generative model on top of a differentiable generator network: in effect, it fits a new distribution by applying the generative model to a prior distribution. The prior distribution is defined as a probability distribution over the overall distribution parameter θ. The basic idea of Bayesian theory is that in any statistical inference problem concerning the overall distribution parameter θ, a prior distribution must be specified in addition to the information provided by the samples; it is an indispensable element of the inference. The prior distribution need not have an objective basis and may rest partly or wholly on subjective belief.
In one possible implementation, when generating the enhanced data set, the prior distribution is first calculated from the image style migration data set, then a first number of data samples is drawn from the prior distribution, and finally the first number of data samples is input into the pre-trained generative model of the generative adversarial network, which outputs the enhanced data set.
Specifically, when calculating the prior distribution from the image style migration data set, the prior information of the data set is first specified by the user based on subjective knowledge and entered at the client; the client then computes a probability distribution from the prior information using the Bayesian formula, and this probability distribution is taken as the prior distribution.
Further, the pre-trained generative model in the generative adversarial network is generated according to the following steps. First, a generative adversarial network comprising a generative model and a discrimination model is obtained. The images in the image style migration data set are normalized to produce normalized image style migration data samples, from which a second number of image samples is obtained. A third number of samples is drawn from the prior distribution and input into the generative model, which outputs a first target image sample. The second number of image samples and the first target image sample are input into the discrimination model, which outputs the loss value of the discrimination model. Finally, when the loss value of the discrimination model reaches the minimum, the pre-trained discrimination model is generated, and the pre-trained generative model is generated according to the pre-trained discrimination model.
Further, when generating the pre-trained generative model according to the pre-trained discrimination model, a fourth number of samples is drawn from the prior distribution and input into the generative model, which outputs a second target image sample. The second target image sample is input into the pre-trained discrimination model, which outputs the loss value of the generative model; the pre-trained generative model is generated when this loss value reaches the minimum.
Specifically, the images in the image style migration data set are first normalized according to

$x^* = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$

where $x$ denotes the input data and $x^*$ the normalized output, so that all data lie in $[0, 1]$; normalization speeds up convergence and improves model accuracy. A sample of capacity $m$, $\{x^{(1)}, \ldots, x^{(m)}\}$, is then obtained from the normalized image style migration data samples, and a sample of capacity $m$, $\{z^{(1)}, \ldots, z^{(m)}\}$, is drawn from the prior distribution $p_z(z)$ and input into the generative model G to obtain $m$ output samples $\{G(z^{(1)}), \ldots, G(z^{(m)})\}$. Both sets of samples are input into the discrimination model D, which optimizes its network by stochastic gradient ascent and thereby produces the pre-trained discrimination model. The specific formula is:

$\nabla_{\theta_d} \frac{1}{m} \sum_{i=1}^{m} \left[ \log D\left(x^{(i)}\right) + \log\left(1 - D\left(G\left(z^{(i)}\right)\right)\right) \right]$

where $x^{(i)}$ and $z^{(i)}$ denote the $i$-th of the $m$ samples, D denotes the discriminator, G denotes the generator, and $m$ denotes the number of samples taken.
After the discrimination model is generated, a further sample of capacity $m$, $\{z^{(1)}, \ldots, z^{(m)}\}$, is taken from the prior distribution, and the generator optimizes its network by stochastic gradient descent. The specific formula is:

$\nabla_{\theta_g} \frac{1}{m} \sum_{i=1}^{m} \log\left(1 - D\left(G\left(z^{(i)}\right)\right)\right)$

where $z$ denotes the separately drawn sample of capacity $m$, D the discriminator, G the generator, and $m$ the number of samples taken.
After repeated iterations of this network optimization, training of the generative model and the discrimination model becomes stable, the stability condition being that the model loss functions decrease normally and the generated samples are reasonable. Finally, an enhanced data sample of capacity $m$ is generated by the trained generative model.
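The alternating optimization described above can be sketched as follows; this is a sketch under the standard GAN objective, and the Adam optimizer, the learning rate and the assumption that D outputs a probability are illustrative choices, not the patented implementation:

```python
# Sketch of the alternating GAN training described above. The normalization,
# the gradient-ascent step for D and the gradient-descent step for G follow
# the formulas; the optimizer and D outputting a probability are assumptions.
import torch

def normalize(x: torch.Tensor) -> torch.Tensor:
    # x* = (x - min) / (max - min), mapping all data into [0, 1]
    return (x - x.min()) / (x.max() - x.min())

def train_gan(G, D, real_loader, latent_dim, epochs, lr=2e-4):
    opt_d = torch.optim.Adam(D.parameters(), lr=lr)
    opt_g = torch.optim.Adam(G.parameters(), lr=lr)
    for _ in range(epochs):
        for x in real_loader:
            x = normalize(x)                   # sample of capacity m
            m = x.size(0)
            z = torch.randn(m, latent_dim)     # sample of capacity m from p_z(z)
            # Discriminator: ascend (1/m) sum [log D(x) + log(1 - D(G(z)))]
            loss_d = -(torch.log(D(x)).mean()
                       + torch.log(1 - D(G(z).detach())).mean())
            opt_d.zero_grad(); loss_d.backward(); opt_d.step()
            # Generator: descend (1/m) sum log(1 - D(G(z)))
            z = torch.randn(m, latent_dim)
            loss_g = torch.log(1 - D(G(z))).mean()
            opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return G, D
```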
S203, merging the enhanced data set into an image style migration data set to generate a model training sample;
In one possible implementation, after the enhanced data set is generated according to step S202, it is supplemented to the original data set to obtain an enhanced image style migration data set, i.e. the model training samples.
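The merging can be as simple as concatenating the two sample sets; the tensor shapes below are assumptions for illustration:

```python
# Minimal sketch of S203: supplement the collected data set with the enhanced
# samples to form the model training samples (tensor shapes are assumptions).
import torch
from torch.utils.data import ConcatDataset, TensorDataset

collected = TensorDataset(torch.rand(1000, 3, 256, 256))  # image style migration data set
enhanced = TensorDataset(torch.rand(200, 3, 256, 256))    # GAN-generated enhanced data set
model_training_samples = ConcatDataset([collected, enhanced])
```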
S204, creating a style migration model;
typically, the style migration model is created using a generator network.
S205, training the style migration model according to the model training sample to obtain a loss value of the style migration model;
In the embodiment of the application, the images in the model training samples are first cropped at several preset sizes to generate cropped image samples of multiple sizes. The cropped images of all sizes except the smallest are input into the 2D convolution layers of the style migration model to obtain weighted feature maps of the corresponding dimensions. These weighted feature maps are then fused to generate a texture picture. The texture picture, together with the real texture label preset for each image in the model training samples, is input into a pre-trained VGG network, which outputs multi-layer feature maps; finally, the loss value of the style migration model is calculated from these multi-layer feature maps.
In one possible implementation, since the pictures in the model training samples are mostly 256 × 256 (picture inputs are typically powers of 2), all pictures in the model training samples are cropped, following the three-layer down-sampling structure of the generator network, into sizes 128 × 128, 64 × 64, 32 × 32 and 16 × 16, denoted $x_1, x_2, x_3, x_4$ respectively. All cropped images except the smallest, namely $x_1, x_2, x_3$, are each fed into a 2D convolution layer to obtain a weighted feature map of the corresponding dimension. The specific formula is:

$y_n = \mathrm{Conv2D}(x_n, \text{inchannel}, \text{outchannel}, \text{ksize}), \quad n = 1, 2, 3$

where $x_n$ ($n = 1, 2, 3$) denotes the crops of the source image at 128 × 128, 64 × 64 and 32 × 32; inchannel denotes the number of input channels, set according to the number of image channels; outchannel denotes the number of output channels, usually set empirically according to the number of network layers; ksize denotes the convolution kernel size; Conv2D denotes 2D convolution; and $y_n$ ($n = 1, 2, 3$) denotes the weighted feature maps output at the different dimensions.
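A sketch of this cropping and convolution step is given below; the use of separate per-scale convolutions and the channel counts are assumptions for illustration:

```python
# Sketch of the multi-scale cropping and 2D convolution step; per-scale
# convolutions and the channel counts are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision.transforms.functional import center_crop

def multiscale_weighted_features(x, inchannel=3, outchannel=32, ksize=3):
    # x1..x4: 128x128, 64x64, 32x32 and 16x16 crops of the 256x256 source image
    crops = [center_crop(x, [s, s]) for s in (128, 64, 32, 16)]
    convs = nn.ModuleList(
        nn.Conv2d(inchannel, outchannel, ksize, padding=ksize // 2)
        for _ in range(3))
    # y_n = Conv2D(x_n) for every crop except the smallest (x4)
    ys = [conv(c) for conv, c in zip(convs, crops[:3])]
    return crops, ys
```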
Feature fusion is then performed on the weighted feature maps output at different dimensions through the corresponding connection strategy to obtain a feature map containing more global feature information and local texture information, generating the texture picture G(Z).
The specific connection strategy is as follows:
① The minimum-size cropped image $x_4$ is up-sampled by deconvolution to $z_1$, the same size as $x_3$, and then added to the weighted feature map $y_3$ to obtain the fused feature map $f_1$. The specific formulas are:

$z_1 = \mathrm{DConv2D}(x_4, \text{inchannel}, \text{outchannel}, \text{ksize})$

$f_1 = z_1 + y_3$
where inchannel represents the number of input channels, outchannel the number of output channels, ksize the deconvolution size, and DConv2D the 2D deconvolution.
② The fused feature map $f_1$ is up-sampled by deconvolution to $z_2$, the same size as $x_2$, and then added to the weighted feature map $y_2$ to obtain the fused feature map $f_2$. The specific formulas are:

$z_2 = \mathrm{DConv2D}(f_1, \text{inchannel}, \text{outchannel}, \text{ksize})$

$f_2 = z_2 + y_2$
where inchannel represents the number of input channels, outchannel the number of output channels, ksize the deconvolution size, and DConv2D the 2D deconvolution.
③ The fused feature map $f_2$ is up-sampled by deconvolution to $z_3$, the same size as $x_1$, and then added to the weighted feature map $y_1$ to obtain the fused feature map $f_3$. The specific formulas are:

$z_3 = \mathrm{DConv2D}(f_2, \text{inchannel}, \text{outchannel}, \text{ksize})$

$f_3 = z_3 + y_1$
where inchannel represents the number of input channels, outchannel the number of output channels, ksize the deconvolution size, and DConv2D the 2D deconvolution.
④ The fused feature map $f_3$ is up-sampled by deconvolution to $z_4$, the same size as the source image $x$, and added to $x$ to generate the texture picture G(Z). The specific formulas are:

$z_4 = \mathrm{DConv2D}(f_3, \text{inchannel}, \text{outchannel}, \text{ksize})$

$G(Z) = z_4 + x$
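Steps ① to ④ can be sketched as a small decoder module; stride-2 transposed convolutions stand in for the DConv2D up-sampling, and the kernel sizes and channel counts are assumptions:

```python
# Sketch of connection strategy steps ①-④ with stride-2 transposed
# convolutions as DConv2D; kernel sizes and channel counts are assumptions.
import torch.nn as nn

class FusionDecoder(nn.Module):
    def __init__(self, c=32, img_c=3):
        super().__init__()
        self.up1 = nn.ConvTranspose2d(img_c, c, 4, stride=2, padding=1)  # x4 -> size of x3
        self.up2 = nn.ConvTranspose2d(c, c, 4, stride=2, padding=1)      # f1 -> size of x2
        self.up3 = nn.ConvTranspose2d(c, c, 4, stride=2, padding=1)      # f2 -> size of x1
        self.up4 = nn.ConvTranspose2d(c, img_c, 4, stride=2, padding=1)  # f3 -> size of x

    def forward(self, x4, y1, y2, y3, x):
        f1 = self.up1(x4) + y3   # ① z1 = DConv2D(x4); f1 = z1 + y3
        f2 = self.up2(f1) + y2   # ② z2 = DConv2D(f1); f2 = z2 + y2
        f3 = self.up3(f2) + y1   # ③ z3 = DConv2D(f2); f3 = z3 + y1
        return self.up4(f3) + x  # ④ texture picture G(Z) = z4 + x
```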
The generated texture picture G(Z) and the real texture label Z′ preset for each image in the model training samples are input into a discriminator composed of a pre-trained VGG network; because the pre-trained VGG network has been trained on a very large image data set, it can recognize many low-level and high-level image characteristics. Outputs are extracted from the three layers relu1_1, relu2_1 and relu3_1 of the network, denoted $l = 1, 2, 3$ respectively, giving feature maps $y^l$ for G(Z) and $\hat{y}^l$ for Z′.
The outputs $y^l$ and $\hat{y}^l$ are used to calculate the perceptual loss and return the update parameters. The specific formula is:

$\mathcal{L} = \frac{1}{LC} \sum_{l=1}^{L} \sum_{j=1}^{C} \left\| y_j^l - \hat{y}_j^l \right\|_2^2$

where the superscript $l$ denotes the $l$-th layer output and the subscript $j$ denotes the $j$-th channel; $L$ and $C$ respectively denote the number of output layers and the number of channels.
It should be noted that VGG is one of the neural networks commonly used in transfer learning. VGG is a deep convolutional network developed by the Visual Geometry Group at the University of Oxford together with DeepMind; it took second place in the classification task and first place in the localization task of the 2014 ILSVRC competition. Its network structure is shown in fig. 4: the VGG structure consists of 5 convolutional blocks, 3 fully connected layers and an output layer, with max pooling between blocks, and the ReLU function is used as the activation unit of all hidden layers. Replacing a convolution layer with a large kernel by several convolution layers with smaller (3 × 3) kernels reduces the number of parameters while performing more non-linear mappings, which increases the fitting and expressive capability of the network.
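A sketch of this perceptual loss is given below; the choice of torchvision's VGG16 (whose layers at indices 1, 6 and 11 correspond to relu1_1, relu2_1 and relu3_1) is an assumption for illustration:

```python
# Sketch of the perceptual loss: relu1_1, relu2_1 and relu3_1 activations of a
# fixed pre-trained VGG (torchvision's VGG16 is an assumption) are compared
# for the texture picture G(Z) and the real texture label Z'.
import torch
import torch.nn.functional as F
from torchvision.models import vgg16

class VGGPerceptualLoss(torch.nn.Module):
    def __init__(self):
        super().__init__()
        feats = vgg16(weights="IMAGENET1K_V1").features.eval()
        for p in feats.parameters():
            p.requires_grad_(False)
        # slices ending at relu1_1 (idx 1), relu2_1 (idx 6), relu3_1 (idx 11)
        self.slices = torch.nn.ModuleList([feats[:2], feats[2:7], feats[7:12]])

    def forward(self, g_z, z_real):
        loss, a, b = 0.0, g_z, z_real
        for slc in self.slices:             # l = 1, 2, 3
            a, b = slc(a), slc(b)
            loss = loss + F.mse_loss(a, b)  # mean over channels and positions
        return loss / len(self.slices)      # average over the L output layers
```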
And S206, when the loss value of the style migration model reaches the minimum value, generating a style migration model trained in advance.
In one possible implementation, when the loss value of the style migration model reaches the minimum, the pre-trained style migration model is generated; when the loss value has not reached the minimum, the model parameters of the style migration model are updated based on the loss value, and the step of training the style migration model according to the model training samples to obtain its loss value is executed again.
In the embodiment of the application, the image style migration apparatus first acquires a target image to be rendered and determines a target style parameter to be migrated, then inputs the target image and the target style parameter into a pre-trained style migration model, wherein the style migration model is generated by training on a collected image style migration data set and an enhanced data set, and the enhanced data set is generated from the image style migration data set by a generative adversarial network; finally, a stylized composite image corresponding to the target image is output. Because the enhanced data set generated by the generative adversarial network expands the collected image style migration data set, the scale of the model training samples is increased, and the robustness and diversity of the image style migration model are improved.
The following are embodiments of the apparatus of the present invention that may be used to perform embodiments of the method of the present invention. For details which are not disclosed in the embodiments of the apparatus of the present invention, reference is made to the embodiments of the method of the present invention.
Referring to fig. 5, a schematic structural diagram of an image style migration apparatus according to an exemplary embodiment of the present invention is shown. The image style migration apparatus may be implemented as all or a part of the terminal by software, hardware, or a combination of both. The device 1 comprises a data acquisition module 10, a data input module 20 and a composite image generation module 30.
The data acquisition module 10 is configured to acquire a target image to be rendered and determine a target style parameter to be migrated;
a data input module 20, configured to input the target image and the target style parameters into a pre-trained style migration model; the style migration model is generated by training on a collected image style migration data set and an enhanced data set, and the enhanced data set is generated from the image style migration data set by a generative adversarial network;
and a synthetic image generating module 30 for outputting the stylized synthetic image corresponding to the target image.
It should be noted that, when the image style migration apparatus provided in the foregoing embodiment executes the image style migration method, only the division of the functional modules is illustrated, and in practical applications, the functions may be distributed to different functional modules according to needs, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the image style migration apparatus and the image style migration method provided in the above embodiments belong to the same concept, and details of implementation processes thereof are referred to in the method embodiments, and are not described herein again.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
In the embodiment of the application, the image style migration apparatus first acquires a target image to be rendered and determines a target style parameter to be migrated, then inputs the target image and the target style parameter into a pre-trained style migration model, wherein the style migration model is generated by training on a collected image style migration data set and an enhanced data set, and the enhanced data set is generated from the image style migration data set by a generative adversarial network; finally, a stylized composite image corresponding to the target image is output. Because the enhanced data set generated by the generative adversarial network expands the collected image style migration data set, the scale of the model training samples is increased, and the robustness and diversity of the image style migration model are improved.
The present invention also provides a computer readable medium, on which program instructions are stored, which program instructions, when executed by a processor, implement the image style migration method provided by the above-mentioned various method embodiments.
The present invention also provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the image style migration method of the above-described respective method embodiments.
Please refer to fig. 6, which provides a schematic structural diagram of a terminal according to an embodiment of the present application. As shown in fig. 6, terminal 1000 can include: at least one processor 1001, at least one network interface 1004, a user interface 1003, memory 1005, at least one communication bus 1002.
Wherein a communication bus 1002 is used to enable connective communication between these components.
The user interface 1003 may include a Display screen (Display) and a Camera (Camera), and the optional user interface 1003 may also include a standard wired interface and a wireless interface.
The network interface 1004 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface), among others.
The Memory 1005 may include a Random Access Memory (RAM) or a Read-Only Memory (Read-Only Memory). Optionally, the memory 1005 includes a non-transitory computer-readable medium. The memory 1005 may be used to store an instruction, a program, code, a set of codes, or a set of instructions. The memory 1005 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the various method embodiments described above, and the like; the storage data area may store data and the like referred to in the above respective method embodiments. The memory 1005 may optionally be at least one memory device located remotely from the processor 1001. As shown in fig. 6, a memory 1005, which is a kind of computer storage medium, may include therein an operating system, a network communication module, a user interface module, and an image style migration application program.
In the terminal 1000 shown in fig. 6, the user interface 1003 is mainly used as an interface for providing input for a user, and acquiring data input by the user; and the processor 1001 may be configured to invoke the image style migration application stored in the memory 1005 and specifically perform the following operations:
acquiring a target image to be rendered, and determining a target style parameter to be migrated;
inputting the target image and the target style parameters into a pre-trained style migration model; the style migration model is generated by training on a collected image style migration data set and an enhanced data set, and the enhanced data set is generated from the image style migration data set by a generative adversarial network;
and outputting the stylized composite image corresponding to the target image.
In one embodiment, when generating the pre-trained style migration model, the processor 1001 specifically performs the following operations:
collecting an image style migration data set;
generating an enhanced data set by using the generative adversarial network and the image style migration data set;
merging the enhanced data set into an image style migration data set to generate a model training sample;
creating a style migration model;
training the style migration model according to the model training sample to obtain a loss value of the style migration model;
and when the loss value of the style migration model reaches the minimum value, generating a pre-trained style migration model.
In one embodiment, when generating the enhanced data set using the generative adversarial network and the image style migration data set, the processor 1001 specifically performs the following operations:
calculating prior distribution according to the image style migration data set;
obtaining a first number of data samples from a prior distribution;
and inputting the first number of data samples into a pre-trained generative model in the generative adversarial network, and outputting the enhanced data set.
In one embodiment, when generating the pre-trained generative model, the processor 1001 specifically performs the following operations:
acquiring a generative adversarial network, wherein the generative adversarial network comprises a generative model and a discrimination model;
normalizing the images in the image style migration data set to generate normalized image style migration data samples;
acquiring a second number of image samples from the normalized image style migration data samples;
obtaining a third number of samples from the prior distribution;
inputting a third number of samples into the generative model, and outputting a first target image sample;
inputting a second number of image samples and the first target image samples into a discrimination model, and outputting a loss value of the discrimination model;
when the loss value of the discrimination model reaches the minimum value, generating a pre-trained discrimination model;
and generating a pre-trained generation model according to the pre-trained discrimination model.
In one embodiment, the processor 1001, when executing the generation of the pre-trained generative model according to the pre-trained discriminant model, specifically performs the following operations:
obtaining a fourth number of samples from the prior distribution;
inputting a fourth number of samples into the generative model, and outputting a second target image sample;
inputting a second target image sample into a pre-trained discrimination model, and outputting a loss value of the generated model;
when the loss value of the generative model reaches the minimum, the pre-trained generative model is generated.
In one embodiment, when training the style migration model according to the model training sample to obtain the loss value of the style migration model, the processor 1001 specifically performs the following operations:
cutting images in the model training sample according to a plurality of preset sizes to generate cut image samples with various sizes;
inputting the cut images of other sizes except the minimum size in the cut image samples of various sizes into a 2D convolution layer of the style migration model to obtain weighted feature maps of various dimensions;
performing feature fusion on the weighted feature maps of various dimensions to generate texture pictures;
inputting the texture picture and a real texture label preset on each image in the model training sample into a pre-trained VGG network, and outputting a multilayer feature map in the VGG network;
and calculating the loss value of the style migration model according to the multilayer characteristic diagram.
In one embodiment, when generating the pre-trained style migration model upon the loss value of the style migration model reaching the minimum, the processor 1001 specifically performs the following operations:
and when the loss value of the style migration model does not reach the minimum value, updating the model parameters of the style migration model based on the loss value of the style migration model, and continuing to execute the step of obtaining the loss value of the style migration model after training the style migration model according to the model training sample.
In the embodiment of the application, the image style migration apparatus first acquires a target image to be rendered and determines a target style parameter to be migrated, then inputs the target image and the target style parameter into a pre-trained style migration model, wherein the style migration model is generated by training on a collected image style migration data set and an enhanced data set, and the enhanced data set is generated from the image style migration data set by a generative adversarial network; finally, a stylized composite image corresponding to the target image is output. Because the enhanced data set generated by the generative adversarial network expands the collected image style migration data set, the scale of the model training samples is increased, and the robustness and diversity of the image style migration model are improved.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above may be implemented by a computer program to instruct associated hardware, and the program for image style migration may be stored in a computer-readable storage medium, and when executed, may include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory or a random access memory.
The above disclosure is only a preferred embodiment of the present application and is not intended to limit its scope of protection; the present application is therefore not limited thereto, and all equivalent variations and modifications made in accordance with the claims of the present application remain within the scope of the present application.
Claims (10)
1. An image style migration method, characterized in that the method comprises:
acquiring a target image to be rendered, and determining a target style parameter to be migrated;
inputting the target image and the target style parameters into a pre-trained style migration model; the style migration model is generated by training on a collected image style migration data set and an enhanced data set, and the enhanced data set is generated from the image style migration data set by a generative adversarial network;
and outputting the stylized composite image corresponding to the target image.
2. The method of claim 1, wherein generating a pre-trained style migration model comprises:
collecting an image style migration data set;
generating an enhanced data set using a generative adversarial network and the image style migration data set;
merging the enhanced data set into the image style migration data set to generate a model training sample;
creating a style migration model;
training the style migration model according to the model training sample to obtain a loss value of the style migration model;
and when the loss value of the style migration model reaches the minimum value, generating a pre-trained style migration model.
3. The method according to claim 2, wherein the training the style migration model according to the model training sample to obtain a loss value of the style migration model comprises:
cutting the images in the model training samples according to a plurality of preset sizes to generate cut image samples with various sizes;
inputting the cut images of other sizes except the minimum size in the cut image samples of the multiple sizes into the 2D convolution layer of the style migration model to obtain weighted feature maps of multiple different dimensions;
performing feature fusion on the weighted feature maps of the multiple different dimensions to generate texture pictures;
inputting the texture picture and a real texture label preset on each image in the model training sample into a pre-trained VGG network, and outputting a multilayer feature map in the VGG network;
and calculating the loss value of the style migration model according to the multilayer characteristic diagram.
4. The method of claim 2, wherein generating a pre-trained style migration model when the loss value of the style migration model reaches a minimum comprises:
and when the loss value of the style migration model does not reach the minimum value, updating the model parameters of the style migration model based on the loss value of the style migration model, and continuing to execute the step of obtaining the loss value of the style migration model after training the style migration model according to the model training sample.
5. The method of any of claims 2-4, wherein generating the enhanced data set using the generative adversarial network and the image style migration data set comprises:
calculating prior distribution according to the image style migration data set;
obtaining a first number of data samples from the prior distribution;
and inputting the first number of data samples into a pre-trained generative model in the generative adversarial network, and outputting an enhanced data set.
6. The method of claim 5, wherein generating the pre-trained generative model in the generative adversarial network comprises:
acquiring a generative adversarial network, wherein the generative adversarial network comprises a generative model and a discrimination model;
normalizing the images in the image style migration data set to generate normalized image style migration data samples;
acquiring a second number of image samples from the normalized image style migration data samples;
obtaining a third number of samples from the prior distribution;
inputting the third number of samples into the generative model, and outputting a first target image sample;
inputting the second number of image samples and the first target image samples into the discriminant model, and outputting a loss value of the discriminant model;
when the loss value of the discrimination model reaches the minimum value, generating a pre-trained discrimination model;
and generating a pre-trained generation model according to the pre-trained discrimination model.
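For illustration only: one discriminator update in the spirit of claim 6. The toy fully-connected discriminative model, the [-1, 1] normalization scheme, the binary cross-entropy loss, and the standard-normal prior are all assumptions; `generator` is reused from the sketch under claim 2.

```python
import torch
import torch.nn as nn

disc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 1))  # toy discriminative model
bce = nn.BCEWithLogitsLoss()
d_opt = torch.optim.Adam(disc.parameters(), lr=2e-4)

def discriminator_step(real_images, generator, latent_dim=100):
    real = real_images * 2.0 - 1.0     # normalize images (scheme is an assumption)
    n = real.size(0)

    z = torch.randn(n, latent_dim)     # the "third number" of samples from the prior
    with torch.no_grad():              # the generative model is frozen in this step
        fake = generator(z).view(-1, 3, 64, 64)   # first target image samples

    # Loss of the discriminative model: real labeled 1, generated labeled 0.
    loss = (bce(disc(real), torch.ones(n, 1)) +
            bce(disc(fake), torch.zeros(n, 1)))
    d_opt.zero_grad()
    loss.backward()
    d_opt.step()
    return loss.item()
```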
7. The method of claim 6, wherein generating the pre-trained generative model according to the pre-trained discriminative model comprises:
obtaining a fourth number of samples from the prior distribution;
inputting the fourth number of samples into the generative model, and outputting second target image samples;
inputting the second target image samples into the pre-trained discriminative model, and outputting a loss value of the generative model; and
when the loss value of the generative model reaches its minimum, generating the pre-trained generative model.
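For illustration only: the complementary generator update of claim 7. The pre-trained discriminative model is held fixed while the generative model is trained to make its outputs be classified as real; the batch size and optimizer are assumptions, and `disc`, `bce`, and `generator` are reused from the sketches above.

```python
import torch

g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)

def generator_step(latent_dim=100, n=64):
    z = torch.randn(n, latent_dim)             # the "fourth number" of samples
    fake = generator(z).view(-1, 3, 64, 64)    # second target image samples

    # Loss of the generative model, computed by the frozen pre-trained
    # discriminative model: the generator wants its outputs labeled as real.
    loss = bce(disc(fake), torch.ones(n, 1))
    g_opt.zero_grad()
    loss.backward()
    g_opt.step()
    return loss.item()
```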
8. An image style migration apparatus, characterized in that the apparatus comprises:
a data acquisition module, configured to acquire a target image to be rendered and determine target style parameters to be migrated;
a data input module, configured to input the target image and the target style parameters into a pre-trained style migration model, wherein the style migration model is generated by training on an acquired image style migration data set and an enhanced data set, and the enhanced data set is generated by using a generative adversarial network and the image style migration data set; and
a synthetic image generation module, configured to output a stylized synthetic image corresponding to the target image.
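For illustration only: the three modules of the apparatus of claim 8 map naturally onto a small wrapper class. The sketch is hypothetical; in particular, how the target style parameters condition the network is left open by the claim and is elided here.

```python
import torch

class ImageStyleMigrationApparatus:
    def __init__(self, style_model):
        self.style_model = style_model.eval()   # pre-trained style migration model

    def acquire(self, target_image, style_params):
        # Data acquisition module: target image to render + style parameters to migrate.
        self.target_image = target_image
        self.style_params = style_params        # would condition the model in practice

    def generate(self):
        # Data input module feeds the model; the synthetic image generation module
        # returns the stylized synthetic image corresponding to the target image.
        with torch.no_grad():
            return self.style_model(self.target_image)
```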
9. A computer storage medium, characterized in that it stores a plurality of instructions adapted to be loaded by a processor and to perform the method steps according to any of claims 1-7.
10. A terminal, comprising: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the method steps of any of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111198634.9A CN114066718A (en) | 2021-10-14 | 2021-10-14 | Image style migration method and device, storage medium and terminal |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114066718A (en) | 2022-02-18 |
Family
ID=80234560
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111198634.9A Pending CN114066718A (en) | 2021-10-14 | 2021-10-14 | Image style migration method and device, storage medium and terminal |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114066718A (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115187706B (en) * | 2022-06-28 | 2024-04-05 | 北京汉仪创新科技股份有限公司 | Lightweight method and system for face style migration, storage medium and electronic equipment |
CN115187706A (en) * | 2022-06-28 | 2022-10-14 | 北京汉仪创新科技股份有限公司 | Lightweight method and system for face style migration, storage medium and electronic equipment |
CN115249221A (en) * | 2022-09-23 | 2022-10-28 | 阿里巴巴(中国)有限公司 | Image processing method and device and cloud equipment |
CN116051683A (en) * | 2022-12-20 | 2023-05-02 | 中国科学院空天信息创新研究院 | Remote sensing image generation method, storage medium and device based on style self-organization |
CN116051683B (en) * | 2022-12-20 | 2023-07-04 | 中国科学院空天信息创新研究院 | Remote sensing image generation method, storage medium and device based on style self-organization |
CN115761267A (en) * | 2022-12-27 | 2023-03-07 | 四川数聚智造科技有限公司 | Detection method for solving outdoor low-frequency image acquisition abnormity |
CN115761267B (en) * | 2022-12-27 | 2023-06-16 | 四川数聚智造科技有限公司 | Detection method for solving outdoor low-frequency image acquisition abnormality |
WO2024145808A1 (en) * | 2023-01-04 | 2024-07-11 | 京东方科技集团股份有限公司 | Font style migration network training method and apparatus, device, and storage medium |
WO2024193489A1 (en) * | 2023-02-17 | 2024-09-26 | 北京字跳网络技术有限公司 | Image processing method and apparatus, device and storage medium |
CN115861312A (en) * | 2023-02-24 | 2023-03-28 | 季华实验室 | OLED dry film defect detection method based on style migration positive sample generation |
CN115861312B (en) * | 2023-02-24 | 2023-05-26 | 季华实验室 | OLED dry film defect detection method based on style migration positive sample generation |
CN116596753B (en) * | 2023-07-20 | 2024-02-02 | 哈尔滨工程大学三亚南海创新发展基地 | Acoustic image dataset expansion method and system based on style migration network |
CN116596753A (en) * | 2023-07-20 | 2023-08-15 | 哈尔滨工程大学三亚南海创新发展基地 | Acoustic image dataset expansion method and system based on style migration network |
Similar Documents
Publication | Title
---|---
CN114066718A (en) | Image style migration method and device, storage medium and terminal | |
CN110674829B (en) | Three-dimensional target detection method based on graph convolution attention network | |
US11977960B2 (en) | Techniques for generating designs that reflect stylistic preferences | |
CN112434721A (en) | Image classification method, system, storage medium and terminal based on small sample learning | |
US20210150807A1 (en) | Generating realistic point clouds | |
CN111476708B (en) | Model generation method, model acquisition method, device, equipment and storage medium | |
GB2585396A (en) | Utilizing a critical edge detection neural network and a geometric model to determine camera parameters from a single digital image | |
US9262853B2 (en) | Virtual scene generation based on imagery | |
CN117157678A (en) | Method and system for graph-based panorama segmentation | |
CN113836338B (en) | Fine granularity image classification method, device, storage medium and terminal | |
CN110807362A (en) | Image detection method and device and computer readable storage medium | |
CN113408570A (en) | Image category identification method and device based on model distillation, storage medium and terminal | |
US20230153965A1 (en) | Image processing method and related device | |
CN115131849A (en) | Image generation method and related device | |
CN113869371A (en) | Model training method, clothing fine-grained segmentation method and related device | |
CN112819510A (en) | Fashion trend prediction method, system and equipment based on clothing multi-attribute recognition | |
CN114610272A (en) | AI model generation method, electronic device, and storage medium | |
CN111967478B (en) | Feature map reconstruction method, system, storage medium and terminal based on weight overturn | |
CN113554655A (en) | Optical remote sensing image segmentation method and device based on multi-feature enhancement | |
CN112668675A (en) | Image processing method and device, computer equipment and storage medium | |
CN114913330B (en) | Point cloud component segmentation method and device, electronic equipment and storage medium | |
CN114913305B (en) | Model processing method, device, equipment, storage medium and computer program product | |
CN116978042A (en) | Image processing method, related device and storage medium | |
CN113658338A (en) | Point cloud tree monomer segmentation method and device, electronic equipment and storage medium | |
CN113840169A (en) | Video processing method and device, computing equipment and storage medium |
Legal Events
Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination