CN115861041A - Image style migration method and device, computer equipment, storage medium and product


Info

Publication number
CN115861041A
Authority
CN
China
Prior art keywords
image
style
network
style migration
sample
Prior art date
Legal status
Pending
Application number
CN202211625387.0A
Other languages
Chinese (zh)
Inventor
姚彤
董露露
陈燕科
Current Assignee
Dawning Information Industry Beijing Co Ltd
Original Assignee
Dawning Information Industry Beijing Co Ltd
Priority date
Filing date
Publication date
Application filed by Dawning Information Industry Beijing Co Ltd
Priority to CN202211625387.0A
Publication of CN115861041A

Landscapes

  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The application relates to an image style migration method and apparatus, a computer device, a storage medium and a program product. The method includes: acquiring a content image to be processed; inputting the content image into a preset style migration network for style migration processing to obtain a target style migration image, the preset style migration network being used for converting the original style of the content image into a preset style and being obtained by training an initial style migration network based on a preset weight parameter set, a sample content image and a sample style image; and outputting the target style migration image. According to the method, the fusion proportion between the sample content image and the sample style image during training of the initial style migration network is adjusted based on different preset weight parameters, and the weight parameter with the best style migration effect is obtained through traversal iteration and used as the fusion proportion between the content image and the style image, which improves the accuracy of the style migration network and the image effect of the style migration.

Description

Image style migration method and device, computer equipment, storage medium and product
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image style migration method, apparatus, computer device, storage medium, and product.
Background
Currently, style migration technology is widely applied in image processing, computer picture synthesis, computer vision, and the like. Style migration refers to an image processing technique that migrates an image from its original style to another style while ensuring that the content of the image is unchanged. For example, when style migration is performed on a real face image based on a cartoon image, the style of the cartoon image can be migrated onto the real face image, so that the original style of the real face image is converted into the cartoon style and a cartoon face image containing the content of the real face image is generated.
In conventional approaches, however, the style migration effect is often poor when style migration processing is performed on an image.
Disclosure of Invention
In view of the foregoing, it is desirable to provide an image style migration method, an apparatus, a computer device, a computer readable storage medium, and a computer program product capable of improving a style migration effect of an image.
In a first aspect, the present application provides an image style migration method. The method comprises the following steps:
acquiring a content image to be processed;
inputting the content image into a preset style migration network for style migration processing to obtain a target style migration image; the preset style migration network is used for converting the original style of the content image into a preset style, and is obtained by training an initial style migration network based on a preset weight parameter set, a sample content image and a sample style image;
and outputting the target style migration image.
In this embodiment, a target style migration image is obtained by acquiring a content image to be processed and inputting it into a preset style migration network for style migration processing, and the target style migration image is then output. The preset style migration network is used for converting the original style of the content image into a preset style, and is obtained by training an initial style migration network based on a preset weight parameter set, sample content images and sample style images. That is, based on different preset weight parameters, the fusion ratio between the sample content image and the sample style image is adjusted during training of the initial style migration network, and each preset weight parameter in the preset weight parameter set is traversed iteratively so that the preset weight parameter with the better style migration effect is obtained through training and used as the fusion ratio between the content image and the style image, achieving a better style migration effect. Compared with the conventional technology, in which the fusion proportion between the content image and the style image is difficult to balance and the style migration effect is therefore poor, the image style migration method provided by the application improves the accuracy of the style migration network and the image effect of style migration.
In one embodiment, the training process of the pre-set style migration network includes:
acquiring a preset weight parameter set, a sample content image and a sample style image;
training the initial style migration network according to each preset weight parameter in the preset weight parameter set, the sample content image, the sample style image and the initial style migration network to generate a candidate style migration network;
determining a preset style migration network from the candidate style migration networks according to the sample style image.
In this embodiment, a preset weight parameter set, a sample content image and a sample style image are obtained; the initial style migration network is trained according to each preset weight parameter in the preset weight parameter set, the sample content image, the sample style image and the initial style migration network to generate candidate style migration networks; and the preset style migration network is determined from the candidate style migration networks according to the sample style image. Specifically, the preset weight parameter set is determined according to the value range of the fusion proportion, the initial style migration network is updated and iteratively trained with each preset weight parameter in the set to obtain a candidate style migration network corresponding to each preset weight parameter, and the candidate style migration network with the best style migration effect is selected from the candidates as the final preset style migration network. Traversal training over different preset weight parameters determines the preset weight parameter with the best style migration effect, i.e., the optimal image fusion proportion, which greatly improves the accuracy of the trained preset style migration network and the processing effect of image style migration.
In one embodiment, training the initial style migration network according to each preset weight parameter in the preset weight parameter set, the sample content image, the sample style image, and the initial style migration network to generate a candidate style migration network includes:
performing, according to each preset weight parameter in the preset weight parameter set, the sample content image, the sample style image and the initial style migration network, style migration processing on the sample content image and training of the initial style migration network, to generate candidate style migration images corresponding to the sample content image and candidate style migration networks corresponding to the candidate style migration images.
In this embodiment, according to each preset weight parameter in the preset weight parameter set, the sample content image, the sample style image and the initial style migration network, style migration processing is performed on the sample content image and the initial style migration network is trained, generating a candidate style migration image corresponding to the sample content image and a candidate style migration network corresponding to that candidate style migration image. For each different preset weight parameter, a corresponding candidate style migration image and candidate style migration network are obtained by training, so that the fusion weight parameter with the best style migration effect can subsequently be determined from the candidate style migration images and the final preset style migration network selected from the candidate style migration networks. This improves the style migration effect and robustness of the preset style migration network.
In one embodiment, performing, according to each preset weight parameter in the preset weight parameter set, the sample content image, the sample style image and the initial style migration network, style migration processing on the sample content image and training of the initial style migration network, to generate a candidate style migration image corresponding to the sample content image and a candidate style migration network corresponding to the candidate style migration image, includes:
inputting preset weight parameters, sample content images and sample style images in a preset weight parameter set into an initial style migration network to perform style migration processing on the sample content images, and generating candidate style migration images corresponding to the sample content images;
training the initial style migration network according to the candidate style migration image and the sample style image to obtain a candidate style migration network; and
taking the next preset weight parameter after the current preset weight parameter as a new preset weight parameter, and iteratively generating a new candidate style migration image and a new candidate style migration network corresponding to the sample content image, until the iteration reaches the last preset weight parameter in the preset weight parameter set; the candidate style migration image and each new candidate style migration image are taken as the candidate style migration images, and the candidate style migration network and each new candidate style migration network are taken as the candidate style migration networks.
In the embodiment, each preset weight parameter in the preset weight parameter set is sequentially used as a network parameter of the style migration network, and network iterative training is performed according to the sample content image and the sample style image, so that a candidate style migration network and a candidate style migration image corresponding to the preset weight parameter under different fusion proportions are obtained, and an image fusion proportion with the best style migration effect is selected according to each candidate style migration image and the sample style image, so that the preset style migration network is obtained; by adopting the training process of the preset style migration network in the embodiment, candidate style migration networks under different fusion proportions can be obtained, so that the fusion proportions between the content images and the style images are balanced, and the preset style migration network with the best style migration effect is obtained; not only can the network training be realized efficiently, but also the robustness and the accuracy of the style migration network can be improved.
In one embodiment, the initial style migration network comprises an initial feature extraction network and an initial adaptive instance normalization network, and the preset weight parameters comprise first-type preset weight parameters; inputting the preset weight parameter in the preset weight parameter set, the sample content image and the sample style image into the initial style migration network to perform style migration processing on the sample content image and generate a candidate style migration image corresponding to the sample content image comprises the following steps:
inputting the sample content image and the sample style image into an initial feature extraction network for feature extraction to obtain a first feature image of the sample content image and a second feature image of the sample style image;
updating the network parameters of the adaptive instance normalization network to the first-type preset weight parameter to generate an initial adaptive instance normalization network;
and inputting the first feature image and the second feature image into the initial adaptive instance normalization network for style migration, generating a candidate style migration image corresponding to the sample content image.
In this embodiment, the initial style migration network includes an initial feature extraction network and an initial adaptive instance normalization network; the preset weight parameters include first-type preset weight parameters, which are network parameters in the adaptive instance normalization network used to represent the fusion ratio between the content image and the style image. When the server trains the initial style migration network to generate candidate style migration images corresponding to the sample content image, the sample content image and the sample style image are input into the initial feature extraction network for feature extraction to obtain a first feature image of the sample content image and a second feature image of the sample style image; the network parameters of the adaptive instance normalization network are then updated to the first-type preset weight parameter to generate the initial adaptive instance normalization network; and the first feature image and the second feature image are input into the initial adaptive instance normalization network for style migration, generating a candidate style migration image corresponding to the sample content image. In other words, this embodiment improves the traditional adaptive instance normalization network by adding a weight parameter to balance the fusion ratio between the content image and the style image, which can greatly improve the stylized fusion image processing effect.
In one embodiment, inputting the first feature image and the second feature image into an initial adaptive instance normalization network for style migration, and generating a first candidate style migration image corresponding to the sample content image, includes:
calculating a weighted sum of the mean value of the first characteristic image and the mean value of the second characteristic image based on the first type of preset weight parameters;
calculating a weighted sum of the variance of the first characteristic image and the variance of the second characteristic image based on the first type of preset weight parameters;
and inputting the weighted sum of the means, the weighted sum of the variances, the mean and the variance of the first feature image, and the sample content image into the initial adaptive instance normalization network for style migration, generating a first candidate style migration image corresponding to the sample content image.
In this embodiment, the fusion mean and the fusion variance between the sample content image and the sample style image are calculated by presetting the weight parameters, so as to balance the sample content image and the sample style image, improve the balance of image fusion, and further improve the fusion effect of the stylized image.
In one embodiment, the preset weight parameters further include a second type of preset weight parameters; the initial feature extraction network comprises a preset convolutional neural network, a first encoder network and a second encoder network; inputting the sample content image and the sample style image into the initial feature extraction network for feature extraction to obtain a first feature image of the sample content image and a second feature image of the sample style image comprises the following steps:
inputting the sample content image into a preset convolutional neural network for feature extraction to obtain a first feature of the sample content image, and inputting the sample content image into a first encoder network for feature extraction to obtain a second feature of the sample content image;
according to the second type of preset weight parameters, performing feature fusion on the first features of the sample content images and the second features of the sample content images to generate first feature images of the sample content images;
and inputting the sample style image into a second encoder network for feature extraction to obtain a second feature image of the sample style image.
In this embodiment, the preset weight parameters further include second-type preset weight parameters, and the initial feature extraction network includes a preset convolutional neural network, a first encoder network and a second encoder network; the second-type preset weight parameter is used to balance the output proportions of the preset convolutional neural network and the first encoder network. When extracting the feature images of the sample content image and the sample style image through the initial feature extraction network, the server obtains a first feature of the sample content image by inputting it into the preset convolutional neural network for feature extraction, and a second feature of the sample content image by inputting it into the first encoder network for feature extraction; according to the second-type preset weight parameter, the first feature and the second feature of the sample content image are fused to generate the first feature image of the sample content image; and the sample style image is input into the second encoder network for feature extraction to obtain the second feature image of the sample style image. In this embodiment, the features of the content image are extracted through two different networks; especially for face images, this alleviates the loss of face semantic information and improves the accuracy and integrity of face feature extraction.
In one embodiment, the preset convolutional neural network is obtained by training based on a preset image database corresponding to the sample content image.
In this embodiment, a convolutional neural network for extracting human face features is trained in advance based on a preset image database, and more semantic information of a human face image can be extracted through the convolutional neural network, so that the integrity of subsequent human face feature extraction is improved, and the problem of semantic information loss is avoided.
In a second aspect, the application further provides an image style migration apparatus. The device includes:
the acquisition module is used for acquiring a content image to be processed;
the processing module is used for inputting the content image into a preset style migration network for style migration processing to obtain a target style migration image; the preset style migration network is used for converting the original style of the content image into a preset style, and is obtained by training an initial style migration network based on a preset weight parameter set, a sample content image and a sample style image;
and the output module is used for outputting the target style migration image.
In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor that implements the steps of the image style migration method of the first aspect when executing the computer program.
In a fourth aspect, the present application further provides a computer-readable storage medium. The computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the image style migration method of the first aspect.
In a fifth aspect, the present application further provides a computer program product comprising a computer program that, when executed by a processor, implements the steps of the image style migration method of the first aspect.
According to the image style migration method and apparatus, the computer device, the storage medium and the computer program product, a target style migration image is obtained by acquiring a content image to be processed and inputting it into a preset style migration network for style migration processing, and the target style migration image is then output. The preset style migration network is used for converting the original style of the content image into a preset style, and is obtained by training an initial style migration network based on a preset weight parameter set, a sample content image and a sample style image. That is, based on different preset weight parameters, the fusion ratio between the sample content image and the sample style image is adjusted during training of the initial style migration network, and each preset weight parameter in the preset weight parameter set is traversed iteratively so that the preset weight parameter with the better style migration effect is obtained through training and used as the fusion ratio between the content image and the style image. Compared with the conventional technology, in which the fusion proportion between the content image and the style image is difficult to balance and the style migration effect is therefore poor, the image style migration method provided by the application improves the accuracy of the style migration network and the image effect of style migration.
Drawings
FIG. 1 is a diagram of an application environment of an image style migration method in one embodiment;
FIG. 2 is a flowchart illustrating an image style migration method according to an embodiment;
FIG. 3 is a flowchart illustrating an image style migration method according to another embodiment;
FIG. 4 is a flowchart illustrating an image style migration method according to another embodiment;
FIG. 5 is a flowchart illustrating an image style migration method according to another embodiment;
FIG. 6 is a flowchart illustrating an image style migration method according to another embodiment;
FIG. 7 is a diagram illustrating the overall network architecture of a preset style migration network in one embodiment;
FIG. 8 is a schematic diagram of a network structure of a face feature extraction network in one embodiment;
FIG. 9 is a block diagram showing the configuration of an image style migration apparatus according to an embodiment;
FIG. 10 is a block diagram showing the construction of an image style migration apparatus according to another embodiment;
FIG. 11 is a diagram illustrating an internal structure of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely intended to illustrate the application, not to limit it.
A generative adversarial network (GAN) is a type of generative model. Its principle is to iteratively train a model so that the trained model can generate data following the rules of the data given during training. GANs can produce samples in parallel, place very few restrictions on the functions they use, and perform better than earlier generative models. During neural network training, the weights of every layer of the network change constantly, so later layers must continually adapt to the changing data distribution; at the same time, the vanishing gradient problem easily occurs, which slows model convergence. Batch Normalization (BN) therefore emerged to process the data of each batch so that the input features follow a distribution with mean 0 and variance 1. Early style migration networks mostly applied batch normalization, but later research found that Instance Normalization (IN) is more suitable for style migration; it differs from batch normalization in that normalization is performed per sample, and its convergence speed and loss are on the whole better than those of batch normalization. On this basis, Adaptive Instance Normalization (AdaIN) then appeared, a method that fuses the styles of two images to realize style migration, with the following advantages:
(1) AdaIN is based on a feed-forward neural network, so generation is fast.
(2) Migration of any style is supported: its inputs are a content image and a style image, so migration is not limited to a specific style.
Given the features x of a content image and the features y of a style image, adaptive instance normalization can migrate the style of y onto x. The model mainly consists of a down-sampling encoder (VGG Encoder), adaptive instance normalization (AdaIN), and an up-sampling decoder (Decoder). The down-sampling encoder adopts a pre-trained VGG model whose parameters are fixed during training, using only the first layers of the VGG (truncated at the ReLU4-1 layer). The feature maps of the content image and the style image obtained by the down-sampling encoder are input together into the adaptive instance normalization layer, where the features of the style image are migrated into the content image to obtain a new feature image; the style-migrated picture is then obtained by passing the new feature image through the up-sampling decoder.
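For illustration, a minimal PyTorch sketch of this encoder-AdaIN-decoder pipeline follows; the truncation index, module layout and the decoder are assumptions for the sketch, not details taken from the patent:

```python
import torch.nn as nn
from torchvision.models import vgg19

def adain(x, y, eps=1e-5):
    """Plain AdaIN: align the channel-wise mean/std of content features x
    with those of style features y. x, y: (N, C, H, W) tensors."""
    mu_x = x.mean(dim=(2, 3), keepdim=True)
    sigma_x = x.std(dim=(2, 3), keepdim=True) + eps
    mu_y = y.mean(dim=(2, 3), keepdim=True)
    sigma_y = y.std(dim=(2, 3), keepdim=True)
    return sigma_y * (x - mu_x) / sigma_x + mu_y

# Encoder: a pre-trained VGG-19 truncated at ReLU4-1, frozen during training.
encoder = nn.Sequential(*list(vgg19(pretrained=True).features)[:21]).eval()
for p in encoder.parameters():
    p.requires_grad_(False)

# decoder would be a learned up-sampling network mirroring the encoder:
# stylized = decoder(adain(encoder(content), encoder(style)))
```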
Conventional style migration networks use cycle-consistent generative adversarial networks (CycleGAN) and their variants to convert between two styles, but the particularity of face features means that face stylization is not well suited to such cycle-based approaches, and a deeper, more powerful neural network is required to extract face features.
The traditional adaptive instance normalization algorithm computes pixels in a channel-wise manner, as follows: the feature map pixels of the input content image are reduced by the mean of the content image pixels and divided by their variance, then multiplied by the variance of the style image pixels and added to the mean of the style image pixels to obtain the stylized image. The disadvantage of this approach is that it is difficult to balance the proportions of the content image and the style image characteristics. When the proportion of the style image is too large, the generated image easily loses the original semantic information in both the foreground and the background, which affects its appearance.
Therefore, the application provides a new weight-based adaptive instance normalization method, so that the generated style image better matches human visual perception.
The following describes technical solutions related to the embodiments of the present application with reference to a scenario in which the embodiments of the present application are applied.
The image style migration method provided by the embodiment of the application can be applied to the application environment shown in fig. 1. Wherein the terminal 102 communicates with the server 104 via a network. The data storage system may store data that the server 104 needs to process. The data storage system may be integrated on the server 104, or may be located on the cloud or other network server. The terminal 102 may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, internet of things devices and portable wearable devices, and the internet of things devices may be smart speakers, smart televisions, smart air conditioners, smart car-mounted devices, and the like. The portable wearable device can be a smart watch, a smart bracelet, a head-mounted device, and the like. The server 104 may be implemented as a stand-alone server or a server cluster comprised of multiple servers.
Illustratively, the terminal 102 may respond to a content image to be processed input by the user and transmit it to the server 104. After receiving the content image to be processed, the server 104 performs style migration processing on it based on a preset style migration network to obtain a target image, i.e., an image in which the original style of the content image has been converted into the preset style while the content of the content image is preserved. The server 104 then transmits the stylized target image to the terminal 102 so that the terminal 102 can output and display it.
In one embodiment, as shown in fig. 2, an image style migration method is provided, which is described by taking the method as an example applied to the server in fig. 1, and includes the following steps:
step 220, acquiring a content image to be processed.
Alternatively, the content image may be any type of image that requires a genre conversion, such as: images captured by a mobile phone, a camera, or the like, images generated by scanning, rendering, processing, synthesizing, or the like by image processing software, or the like; in addition, the content image may be any content image such as a landscape image, a building image, a person image, an animal image, and an object image.
Optionally, the server may receive a content image to be processed sent by the terminal, may also obtain the content image to be processed from a preset storage location according to a preset path, may also obtain the content image to be processed from a preset database, and the like; the embodiment of the present application does not specifically limit the manner in which the server acquires the content image to be processed.
Step 240, inputting the content image into a preset style migration network for style migration processing to obtain a target style migration image.
The preset style migration network is used for converting the original style of the content image into a preset style. The preset style migration network is obtained by training the initial style migration network based on the preset weight parameter set, the sample content image and the sample style image. The sample style image is a sample image corresponding to the preset style.
Optionally, the preset weight parameter set includes at least one preset weight parameter. A preset weight parameter may be used to represent the fusion ratio when the content image and the style image are fused; balancing the content image and the style image with the preset weight parameter avoids the problem that semantic information is lost or the stylization is not obvious in the stylized image because one of the two images occupies an excessively high proportion. Since the preset weight parameter represents the fusion ratio of the content image and the style image, its value range should lie strictly between 0 and 1, excluding 0 and 1; to avoid either image dominating, the value range may be set, for example, to between 0.3 and 0.7, or between 0.4 and 0.6, and the like.
Further, after the value range of the preset weight parameter is determined, a plurality of preset weight parameters can be selected from the value range at a preset granularity to generate the preset weight parameter set; illustratively, the preset granularity may be 0.1, 0.05, and the like, and the finer the granularity, the higher the accuracy of the trained preset style migration network. Based on this, when the initial style migration network is trained with the sample content image and the sample style image, iterative training can be performed sequentially based on each preset weight parameter in the preset weight parameter set, and the preset weight parameter with the better style conversion effect, that is, the better fusion ratio between the content image and the style image, is determined from the training results. It should be noted that the corresponding fusion proportion may differ for different content images and style images.
Optionally, after the initial style migration network is trained based on the preset weight parameter set, the sample content image and the sample style image to obtain the preset style migration network, the style conversion processing may be performed on the content image to be processed through the preset style migration network to obtain the target style migration image after the style conversion.
Step 260, outputting the target style migration image.
Optionally, after the server performs the style migration processing on the content image to be processed to obtain the target style migration image, the server may send the target style migration image to the terminal, so that the terminal may output and display the target style migration image to the user.
In the image style migration method above, the server acquires a content image to be processed, inputs it into a preset style migration network for style migration processing to obtain a target style migration image, and then outputs the target style migration image. The preset style migration network is used for converting the original style of the content image into a preset style, and is obtained by training an initial style migration network based on a preset weight parameter set, sample content images and sample style images. That is, based on different preset weight parameters, the fusion ratio between the sample content image and the sample style image is adjusted during training of the initial style migration network, and each preset weight parameter in the preset weight parameter set is traversed iteratively so that the preset weight parameter with the better style migration effect is obtained through training and used as the fusion ratio between the content image and the style image, achieving a better style migration effect. Compared with the conventional technology, in which the fusion proportion between the content image and the style image is difficult to balance and the style migration effect is therefore poor, this method improves the accuracy of the style migration network and the image effect of style migration.
FIG. 3 is a flowchart illustrating an image style migration method according to another embodiment. Based on the foregoing embodiment, as shown in fig. 3, the method further includes:
and step 320, acquiring a preset weight parameter set, a sample content image and a sample style image.
Taking face style migration as an example, optionally, according to past experience in controlling the proportions of the face image and the cartoon image during style fusion, the variation range of the preset weight parameter may be set to 0.3-0.7; dividing this range at a granularity of 0.05 yields the preset weight parameter set [0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7]. The sample content image may be a real face image, and the sample style image may be a cartoon face image.
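For illustration only, the parameter set above can be generated from the value range and granularity as follows (variable names are assumed):

```python
import numpy as np

low, high, step = 0.3, 0.7, 0.05  # value range and granularity from this example
preset_weights = [round(float(w), 2) for w in np.arange(low, high + step / 2, step)]
print(preset_weights)  # [0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7]
```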
Step 340, training the initial style migration network according to each preset weight parameter in the preset weight parameter set, the sample content image, the sample style image and the initial style migration network, and generating a candidate style migration network.
The preset weight parameter serves as a network parameter in the initial style migration network; when the preset weight parameter takes different values, that is, when the content image and the style image are fused in different proportions, the style migration images generated by the initial style migration network differ.
Optionally, each preset weight parameter in the preset weight parameter set may be sequentially used as a network parameter in the initial style migration network, the initial style migration network is trained by using a sample content image and a sample style image, and other network parameters in the initial style migration network, such as a neuron parameter in the network, are optimized in an iterative training process; after the training satisfies a certain preset iteration stop condition, candidate style migration networks respectively corresponding to the preset weight parameters can be obtained.
Optionally, the first preset weight parameter in the preset weight parameter set, for example 0.3, may be used as the network parameter of the initial style migration network; the sample content image and the sample style image are used to iteratively train the initial style migration network, and after a preset iteration stop condition is met, the candidate style migration network corresponding to the first preset weight parameter is obtained. Then, the second preset weight parameter in the set, for example 0.35, is used as the network parameter of that candidate style migration network, and iterative training continues with the sample content image and the sample style image until the iteration stop condition is met again, yielding the candidate style migration network corresponding to the second preset weight parameter. This loop repeats until the candidate style migration network corresponding to the last preset weight parameter in the preset weight parameter set is obtained.
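A minimal sketch of this traversal training loop follows; train_fn and set_fusion_weight are hypothetical placeholders standing in for the patent's training procedure and weight update, not names from the patent:

```python
import copy

def train_candidates(initial_net, preset_weights, content_imgs, style_imgs,
                     train_fn, num_iters):
    """Traverse the preset weight parameter set, training one candidate
    network per weight. train_fn(net, content_imgs, style_imgs, num_iters)
    is assumed to run optimization until the iteration stop condition and
    return (trained_net, migrated_images)."""
    candidates = {}
    net = copy.deepcopy(initial_net)
    for w in preset_weights:
        net.set_fusion_weight(w)  # hypothetical setter for the fusion-ratio parameter
        net, migrated = train_fn(net, content_imgs, style_imgs, num_iters)
        candidates[w] = (copy.deepcopy(net), migrated)
        # the next weight warm-starts from the network just trained
    return candidates
```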
Step 360, determining a preset style migration network from the candidate style migration networks according to the sample style image.
Optionally, after obtaining the candidate style migration network corresponding to each preset weight parameter in the preset weight parameter set, further, according to the sample style image, one candidate style migration network with a better style migration effect may be determined from each candidate style migration network as the preset style migration network.
For example, a candidate style migration image with a better style migration effect may be determined according to the similarity between the sample style image and the candidate style migration image that each candidate style migration network outputs for the sample content image; the candidate style migration network corresponding to that candidate style migration image is then used as the preset style migration network; that is, the preset weight parameter corresponding to that candidate network is used as the network parameter of the preset style migration network. It should be noted that after the candidate style migration network corresponding to the target weight parameter with the better style migration effect has been determined, the preset style migration network may be determined based on that candidate network: the network parameter for image fusion in the preset style migration network is the target weight parameter of the candidate network, while the other network parameters may be the same as or different from those in the candidate network, or may be determined in other ways, which is not specifically limited in this embodiment of the application.
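The selection step might then look like the following sketch; the concrete similarity measure is an assumption, since the patent only requires comparing the candidate style migration images with the sample style image:

```python
def select_preset_network(candidates, sample_style_img, similarity_fn):
    """candidates: {weight: (network, migrated_image)} from traversal training.
    similarity_fn(img_a, img_b) -> float, higher meaning more similar, e.g. a
    perceptual similarity over VGG features (the metric choice is assumed)."""
    best_w = max(candidates,
                 key=lambda w: similarity_fn(candidates[w][1], sample_style_img))
    best_net, _ = candidates[best_w]
    return best_w, best_net  # optimal fusion ratio and the preset style migration network
```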
In this embodiment, a preset weight parameter set, a sample content image and a sample style image are obtained; the initial style migration network is trained according to each preset weight parameter in the preset weight parameter set, the sample content image, the sample style image and the initial style migration network to generate candidate style migration networks; and the preset style migration network is determined from the candidate style migration networks according to the sample style image. Specifically, the preset weight parameter set is determined according to the value range of the fusion proportion, the initial style migration network is updated and iteratively trained with each preset weight parameter in the set to obtain a candidate style migration network corresponding to each preset weight parameter, and the candidate style migration network with the best style migration effect is selected from the candidates as the final preset style migration network. Traversal training over different preset weight parameters determines the preset weight parameter with the best style migration effect, i.e., the optimal image fusion proportion, which greatly improves the accuracy of the trained preset style migration network and the processing effect of image style migration.
In an optional embodiment of the present application, training the initial style migration network in step 340 according to each preset weight parameter in the preset weight parameter set, the sample content image, the sample style image and the initial style migration network to generate the candidate style migration networks may include: performing, according to each preset weight parameter in the preset weight parameter set, the sample content image, the sample style image and the initial style migration network, style migration processing on the sample content image and training of the initial style migration network, to generate candidate style migration images corresponding to the sample content image and candidate style migration networks corresponding to the candidate style migration images. That is, during iterative training, the candidate style migration network corresponding to each preset weight parameter is saved together with the candidate style migration image produced by that network. Based on this, when the preset style migration network is determined from the candidate style migration networks according to the sample style image in step 360, the candidate style migration network with the best style migration effect may be determined from the multiple candidates according to the similarity between the candidate style migration image corresponding to each preset weight parameter and the sample style image; in other words, the optimal fusion ratio, i.e., the preset weight parameter with the best fusion effect, is determined from the preset weight parameter set, and the preset style migration network is then determined based on that preset weight parameter.
Optionally, as shown in fig. 4, the process of performing, according to each preset weight parameter in the preset weight parameter set, the sample content image, the sample style image, and the initial style migration network, style migration processing on the sample content image and training the initial style migration network to generate a candidate style migration image corresponding to the sample content image and a candidate style migration network corresponding to the candidate style migration image may include:
and step 420, inputting the preset weight parameters, the sample content images and the sample style images in the preset weight parameter set to an initial style migration network to perform style migration processing on the sample content images, and generating candidate style migration images corresponding to the sample content images.
Step 440, training the initial style migration network according to the candidate style migration image and the sample style image to obtain a candidate style migration network.
Step 460, using the next preset weight parameter after the current one as a new preset weight parameter and iteratively generating a new candidate style migration image and a new candidate style migration network corresponding to the sample content image, until the iteration reaches the last preset weight parameter in the preset weight parameter set; the candidate style migration image and each new candidate style migration image are taken as the candidate style migration images, and the candidate style migration network and each new candidate style migration network are taken as the candidate style migration networks.
As described for step 340 above, each preset weight parameter in the preset weight parameter set is used in turn as a network parameter of the style migration network for iterative training, so as to obtain the candidate style migration network and candidate style migration image corresponding to each preset weight parameter. Exemplarily, the first preset weight parameter is used as the network parameter of the initial style migration network for a preset number of training iterations, after which the candidate style migration network and candidate style migration image corresponding to the first preset weight parameter are obtained; then the second preset weight parameter is used as the network parameter of that candidate style migration network for a preset number of iterations, yielding the candidate style migration network and candidate style migration image corresponding to the second preset weight parameter; then the third preset weight parameter is used in the same way, and so on, until the last preset weight parameter has been used for the preset number of iterations and the corresponding candidate style migration network and candidate style migration image are obtained.
In the embodiment, each preset weight parameter in the preset weight parameter set is sequentially used as a network parameter of the style migration network, and network iterative training is performed according to the sample content image and the sample style image, so that a candidate style migration network and a candidate style migration image corresponding to the preset weight parameter under different fusion proportions are obtained, and an image fusion proportion with the best style migration effect is selected according to each candidate style migration image and the sample style image, so that the preset style migration network is obtained; by adopting the training process of the preset style migration network in the embodiment, candidate style migration networks under different fusion proportions can be obtained, so that the fusion proportions between the content images and the style images are balanced, and the preset style migration network with the best style migration effect is obtained; not only can the network training be realized efficiently, but also the robustness and the accuracy of the style migration network can be improved.
In an optional embodiment of the present application, the initial style migration network may include an initial feature extraction network and an initial adaptive instance normalization network. The initial feature extraction network is used for performing feature extraction on the sample content image and the sample style image to obtain their feature maps; the initial adaptive instance normalization network is used for fusing the feature map of the sample content image with the feature map of the sample style image, that is, realizing style migration to obtain a style-fused image. The preset weight parameters include a first-type preset weight parameter, which is a network parameter in the initial adaptive instance normalization network used, for example, as the fusion ratio when fusing the feature map of the sample content image with the feature map of the sample style image.
Based on this, as shown in fig. 5, inputting the preset weight parameter in the preset weight parameter set, the sample content image and the sample style image into the initial style migration network in step 420 to perform style migration processing on the sample content image and generate a candidate style migration image corresponding to the sample content image includes:
and step 520, inputting the sample content image and the sample style image into an initial feature extraction network for feature extraction to obtain a first feature image of the sample content image and a second feature image of the sample style image.
Optionally, the initial feature extraction network may be a neural-network-based feature extraction network, including but not limited to a Convolutional Neural Network (CNN), a Deep Neural Network (DNN), a Generative Adversarial Network (GAN), and so on. Optionally, the feature extraction network may also be an encoder network, a Visual Geometry Group network (VGG), and the like, which is not specifically limited in this embodiment of the application. It should be noted that one feature extraction network may be used to extract the features of both the sample content image and the sample style image, or two feature extraction networks may be used to extract them separately; when two feature extraction networks are used, the two networks may be the same or different. In addition, the feature extraction network for the sample content image or for the sample style image may be a single network or a combination of multiple networks; that is, multiple networks of different types may be used to extract features, the extracted features fused, and the fused feature map used as the final feature map. The network structure and type are not specifically limited in this embodiment of the application.
Whatever the network structure of the initial feature extraction network, the sample content image and the sample style image need only be input into it to obtain the first feature image corresponding to the sample content image and the second feature image corresponding to the sample style image.
Step 540, updating the network parameters of the adaptive instance normalization network to the first-type preset weight parameter to generate the initial adaptive instance normalization network.
Conventionally, an adaptive instance normalization network can be realized by an adaptive instance normalization algorithm, which can be represented by the following formula (1).
AdaIN(x, y) = σ(y) × ((x − μ(x)) / σ(x)) + μ(y)    (1)
where x is the input content image features (a sample content image when network training is performed), y is the input style image features (a sample style image when network training is performed), μ(x) is the mean of the content image, σ(x) is the variance of the content image, μ(y) is the mean of the style image, σ(y) is the variance of the style image, and AdaIN(x, y) is the style migration image.
Alternatively, in the present embodiment, the improved adaptive instance normalization algorithm can be represented by the following equation (2).
W-AdaIN(x, y) = σ(x, y) × ((x − μ(x)) / σ(x)) + μ(x, y)    (2)
where σ(x, y) is the fused variance of the content image and the style image, μ(x, y) is the fused mean of the content image and the style image, and W-AdaIN(x, y) is the style migration image.
Alternatively, σ (x, y) can be expressed as:
σ(x, y) = σ(x) × λ + σ(y) × (1 − λ)    (3)
μ (x, y) can be expressed as:
μ(x, y) = μ(x) × λ + μ(y) × (1 − λ)    (4)
where λ is a preset weight parameter representing the fusion ratio between the content image and the style image; λ also serves as a network parameter of the adaptive instance normalization network, i.e., as the first-type preset weight parameter.
During iterative training, the network parameters of the adaptive instance normalization network are updated to the first-type preset weight parameter to generate an initial adaptive instance normalization network. Illustratively, based on the preset weight parameter set in the above example, the first preset weight parameter in the set, for example 0.3, may be used as the network parameter of the adaptive instance normalization network to obtain the initial adaptive instance normalization network, which may be represented as:
W-AdaIN(x, y) = (0.3σ(x) + 0.7σ(y)) × ((x − μ(x)) / σ(x)) + 0.3μ(x) + 0.7μ(y)    (5)
Step 560, inputting the first feature image and the second feature image into the initial adaptive instance normalization network for style migration, generating a candidate style migration image corresponding to the sample content image.
Optionally, the first feature image of the sample content image may be used as x in formula (5) and the second feature image of the sample style image as y in formula (5); the candidate style migration image corresponding to the sample content image is obtained by performing the style migration calculation of formula (5).
That is, based on the first type of preset weight parameter λ, a weighted sum of the mean of the first feature image and the mean of the second feature image may be calculated, i.e., μ(x, y) = 0.3 × μ(x) + 0.7 × μ(y), together with a weighted sum of the standard deviation of the first feature image and the standard deviation of the second feature image, i.e., σ(x, y) = 0.3 × σ(x) + 0.7 × σ(y); then, the weighted mean μ(x, y), the weighted standard deviation σ(x, y), the mean μ(x) and standard deviation σ(x) of the first feature image, and the sample content image x are input into the initial adaptive instance normalization network for style migration, generating a first candidate style migration image corresponding to the sample content image.
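Continuing the PyTorch sketch above, a weighted variant of the adain function per formulas (2)-(4) might look as follows; the default λ = 0.3 mirrors the example, and the helper again assumes (N, C, H, W) feature maps.

```python
import torch

def w_adain(x: torch.Tensor, y: torch.Tensor, lam: float = 0.3,
            eps: float = 1e-5) -> torch.Tensor:
    """Formulas (2)-(4): fuse content and style statistics with weight
    lam before re-normalizing the content features x."""
    mu_x = x.mean(dim=(2, 3), keepdim=True)
    sigma_x = x.std(dim=(2, 3), keepdim=True) + eps
    mu_y = y.mean(dim=(2, 3), keepdim=True)
    sigma_y = y.std(dim=(2, 3), keepdim=True)
    sigma_xy = lam * sigma_x + (1 - lam) * sigma_y  # formula (3)
    mu_xy = lam * mu_x + (1 - lam) * mu_y           # formula (4)
    return sigma_xy * (x - mu_x) / sigma_x + mu_xy  # formula (2)
```

Note that with lam = 0 the function reduces to vanilla AdaIN, while with lam = 1 the content statistics are kept unchanged; this is the intuition behind sweeping λ between 0.3 and 0.7 during training.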
In this embodiment, the initial style migration network includes an initial feature extraction network and an initial adaptive instance normalization network. The preset weight parameters include a first type of preset weight parameter, which is a network parameter in the adaptive instance normalization network and represents the fusion ratio between the content image and the style image. When the server trains the initial style migration network to generate candidate style migration images corresponding to the sample content images, it inputs the sample content image and the sample style image into the initial feature extraction network for feature extraction, obtaining a first feature image of the sample content image and a second feature image of the sample style image; it then updates the network parameters of the adaptive instance normalization network to the first type of preset weight parameters to generate the initial adaptive instance normalization network; finally, it inputs the first feature image and the second feature image into the initial adaptive instance normalization network for style migration, generating a candidate style migration image corresponding to the sample content image. In other words, this embodiment improves on the traditional adaptive instance normalization network by introducing a weight parameter to balance the fusion ratio between the content image and the style image, which can greatly improve the stylized fusion effect.
In an optional embodiment of the present application, the preset weight parameters further include a second type of preset weight parameter. The initial feature extraction network may include a preset convolutional neural network, a first encoder network, and a second encoder network, wherein the preset convolutional neural network and the first encoder network are used for extracting the features of the content image, and the second encoder network is used for extracting the features of the style image. The second type of preset weight parameter serves as the fusion ratio between the preset convolutional neural network and the first encoder network; that is, the feature map of the content image output by the preset convolutional neural network and the feature map of the content image output by the first encoder network are fused to obtain the final feature map of the content image.
Based on this, as shown in fig. 6, the step 501 of inputting the sample content image and the sample style image into the initial feature extraction network for feature extraction to obtain a first feature image of the sample content image and a second feature image of the sample style image includes:
Step 620, inputting the sample content image into a preset convolutional neural network for feature extraction to obtain a first feature of the sample content image, and inputting the sample content image into a first encoder network for feature extraction to obtain a second feature of the sample content image.
Optionally, the network structure of the preset convolutional neural network may be that of a conventional convolutional neural network, an improved structure based on a conventional convolutional neural network, a combination of a conventional convolutional neural network with other neural networks, and the like. The first encoder network may likewise adopt any neural network structure, which is not specifically limited in this embodiment of the present application.
Optionally, the preset convolutional neural network may be a network trained in advance on a preset image database corresponding to the sample content image; for example, the preset convolutional neural network may be the first 8 layers of a VGG19 network pre-trained on a face database. It should be noted that the first 8 layers of the pre-trained VGG19 network recited in this embodiment are only one example of the preset convolutional neural network and are not used to limit it.
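As a rough sketch of that setup in PyTorch, one might slice the first 8 layers out of a pretrained torchvision VGG19 and freeze them. Note that torchvision's weights come from ImageNet rather than the face database described here, so this is only an approximation of the pre-training assumed by the embodiment.

```python
import torch.nn as nn
from torchvision import models

# Pretrained VGG19; the embodiment assumes face-database pre-training,
# whereas these weights are ImageNet-trained (an approximation).
vgg19 = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1)

# Take the first 8 layers of the feature extractor as the fixed network V.
v_net = nn.Sequential(*list(vgg19.features.children())[:8]).eval()
for p in v_net.parameters():
    p.requires_grad_(False)  # parameters are fixed, per the text
```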
Optionally, the network parameter in the first encoder network may be an initialized network parameter, and in the training process, the network parameter in the first encoder network is continuously optimized and updated through iteration to obtain a finally trained first encoder network; optionally, the network parameters in the first encoder network may include neuron parameters in the network.
Optionally, the server may input the sample content image into a plurality of different feature extraction networks, respectively, to obtain a plurality of features corresponding to the sample content image. For example: inputting the sample content image into the preset convolutional neural network for feature extraction to obtain a first feature of the sample content image, and inputting the sample content image into the first encoder network for feature extraction to obtain a second feature of the sample content image.
And step 640, performing feature fusion on the first feature of the sample content image and the second feature of the sample content image according to the second type of preset weight parameter, and generating a first feature image of the sample content image.
Optionally, the second type of preset weight parameter may be used as a fusion weight or a fusion proportion for performing feature fusion on the first feature of the sample content image and the second feature of the sample content image; illustratively, the fusion process of different features of the content image can be represented by the following formula (6).
F(X) = ω × V(X) + (1 − ω) × E(X)    (6)

Where ω is the second type of preset weight parameter, V(X) is the first feature of the sample content image, E(X) is the second feature of the sample content image, and F(X) is the first feature image of the sample content image.
Optionally, for the second type of preset weight parameter ω, a value range thereof may be 0 to 1; illustratively, the initialization of ω may be 0, which gradually increases with the update of the network parameters during the network training process; or, the initialization of ω may be 1, and gradually decreases with the update of the network parameters during the network training process; of course, the value range of the second type of preset weight parameter ω may also be 0 to 0.7, 0.1 to 0.9, 0.2 to 0.8, 0.1 to 0.7, and the like, which is not specifically limited in this embodiment of the application.
The second type of preset weight parameter ω is continuously updated through network training to obtain the value of ω with the best fusion effect.
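In the same sketch style as above, the channel-level fusion of formula (6) is a one-liner; v_x and e_x stand for V(X) and E(X) and are assumed to have matching shapes.

```python
import torch

def fuse_features(v_x: torch.Tensor, e_x: torch.Tensor,
                  omega: float) -> torch.Tensor:
    """Formula (6): channel-level weighted sum of the fixed-VGG features
    V(X) and the learnable-encoder features E(X); both (N, C, H, W)."""
    return omega * v_x + (1 - omega) * e_x
```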
And step 660, inputting the sample style image into the second encoder network for feature extraction to obtain a second feature image of the sample style image.
Optionally, the network parameters in the second encoder network may be initialized network parameters; in the training process, the network parameters in the second encoder network are continuously optimized and updated through iteration to obtain the finally trained second encoder network. Optionally, the network parameters in the second encoder network may include the neuron parameters in the network.
Optionally, the server may input the sample-style image into a second encoder network for feature extraction, so as to obtain a second feature image of the sample-style image.
In this embodiment, the preset weight parameters further include a second type of preset weight parameter; the initial feature extraction network includes a preset convolutional neural network, a first encoder network, and a second encoder network; and the second type of preset weight parameter is used to balance the output ratio of the preset convolutional neural network and the first encoder network. When extracting the feature images of the sample content image and the sample style image through the initial feature extraction network, the server inputs the sample content image into the preset convolutional neural network for feature extraction to obtain a first feature of the sample content image, and into the first encoder network for feature extraction to obtain a second feature of the sample content image; performs feature fusion on the first and second features according to the second type of preset weight parameter to generate the first feature image of the sample content image; and inputs the sample style image into the second encoder network for feature extraction to obtain the second feature image of the sample style image. Because the features of the content image are extracted through two different networks, particularly for face images, the problem of face semantic information loss can be alleviated, improving the accuracy and completeness of face feature extraction.
The technical solution of the present application is described in its entirety below. For example, referring to fig. 7, the input images of the network may include an RGB real face image of size 512 × 512 with 3 channels and an RGB cartoon face image of size 512 × 512 with 3 channels. Alternatively, if the size of an input image exceeds 512 × 512 pixels, it can be scaled to 512 × 512 pixels to fit the input of the network, and the data set images used in the present application can be manually adjusted to a proper size to fit the model training requirements. The whole network is divided into three parts. The first part is a face feature extraction network, comprising a VGG network and a first encoder network, responsible for extracting face features and providing them to the second part, i.e., the W-AdaIN adaptive instance normalization network, for fusion and upsampling. The second part is a cartoon style generation network, i.e., the second encoder network, responsible for extracting cartoon face features and inputting them into the W-AdaIN adaptive instance normalization network for normalized combination, after which a decoder outputs the final cartoon face stylized image. The third part is the discriminator of the network model of the present application, responsible for improving the performance of the first encoder network, the second encoder network, and the decoder network: the generated cartoon face stylized image and the cartoon face image input to the network are provided to the discriminator, which scores the generated cartoon face image, and the neuron parameters of the first encoder network, the second encoder network, and the decoder network are adjusted according to the score and the loss function.
Optionally, as shown in fig. 8, the input image is a real face image of size 512 × 512 with 3 pixel channels, denoted X. The first face feature extraction network intercepts the first 8 layers of the pre-trained VGG19 network and is denoted V; the input real face image is subjected to convolution, downsampling, and similar operations through the V network to extract a feature image with a relatively distinct face feature profile, denoted V(X). The second face feature extraction network is an encoder structure whose parameters can be learned through back propagation, denoted E; the input real face image is subjected to convolution and downsampling through the E network to extract another common face feature image, denoted E(X), and the encoder obtains stronger feature extraction capability after many iterations of learning. The two feature images are each given a weight via the parameter ω (i.e., the second type of preset weight parameter) and combined by channel-level weighted summation to obtain a new face feature image, denoted F, which is the final face style feature image output by the network. The initialization of ω may be 0; its value range may be 0 to 0.7 as the network parameters are gradually updated, and the final weight value is obtained through training.
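Putting the pieces together, a schematic generator along the lines of figs. 7-8 might look as follows. It reuses the w_adain and fuse_features helpers sketched earlier; the content encoder, style encoder, and decoder are placeholder modules, since the text does not specify their architectures.

```python
import torch.nn as nn

class WAdaINGenerator(nn.Module):
    """Schematic three-part generator: fixed VGG front (V), learnable
    content encoder (E), cartoon style encoder, W-AdaIN fusion, decoder."""
    def __init__(self, v_net: nn.Module, content_enc: nn.Module,
                 style_enc: nn.Module, decoder: nn.Module):
        super().__init__()
        self.v_net = v_net              # fixed first 8 VGG19 layers
        self.content_enc = content_enc  # learnable encoder E
        self.style_enc = style_enc      # cartoon face encoder
        self.decoder = decoder          # upsamples back to an image

    def forward(self, x, y, omega: float, lam: float):
        # Formula (6): fuse the two content feature maps (shapes must match).
        f = fuse_features(self.v_net(x), self.content_enc(x), omega)
        s = self.style_enc(y)                     # cartoon style features
        return self.decoder(w_adain(f, s, lam))   # formulas (2)-(4)
```

In a full training setup, the decoder output would additionally be fed to the discriminator described above, whose score and loss drive the parameter updates of both encoders and the decoder.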
The new face feature extraction model provided by the application ensures that semantic information is not lost during style migration image generation; in addition, the application provides a novel weight-based adaptive instance normalization method, so that the generated style images better match human perception.
The formula of the weight-based adaptive instance normalization cartoon face image generation algorithm may be as shown in formula (2) above, wherein the fused standard deviation may be as shown in formula (3) and the fused mean as shown in formula (4); the fusion ratio between the content image and the style image is balanced by the weight parameter λ.
Exemplarily, according to prior experience with the proportional control of face images and cartoon images in style fusion, the variation range of λ can be set to 0.3-0.7, and the weight parameter value best suited to the preset style migration network of the present application is finally determined through iterative updating of the network parameters. Optionally, a preset weight parameter set may be provided as a preset weight parameter array, for example: [0.3, 0.35, 0.4, 0.45, 0.5, 0.5, 0.55, 0.6, 0.65, 0.7]. The length of the array is 10, and the entries correspond to values used across 100 training epochs: during training, the array is traversed, and each weight value taken out serves as the initialized weight value λ for the next 10 epochs. Finally, the checkpoint with the best training result is selected according to the generated image quality and evaluation indexes, and the corresponding λ value is taken as the final weight parameter value.
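As a small illustration of that traversal, the epoch-to-λ mapping could be expressed as follows. The array is the one given above (it contains 0.5 twice in the source, which gives it the stated length of 10 for a 100-epoch run), and the function name is ours.

```python
# Preset weight parameter array from the text (10 entries, 100 epochs).
LAMBDA_SCHEDULE = [0.3, 0.35, 0.4, 0.45, 0.5, 0.5, 0.55, 0.6, 0.65, 0.7]

def lambda_for_epoch(epoch: int, block: int = 10) -> float:
    """Each array entry initializes lambda for 10 consecutive epochs;
    the checkpoint with the best generated-image quality then fixes
    the final lambda."""
    return LAMBDA_SCHEDULE[min(epoch // block, len(LAMBDA_SCHEDULE) - 1)]
```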
Optionally, for the above weight parameter ω, the parameter value may be updated every 100 training epochs. Illustratively, if the parameter range is 0-1 and the update granularity is 0.05, then after every 100 epochs, 0.05 is added to obtain a new weight parameter ω, and so on.
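That update rule could be sketched similarly; the upper bound of 1.0 follows the 0-1 range stated here (elsewhere the text also mentions a 0-0.7 range, so the bound is left configurable), and the function name is again ours.

```python
def omega_for_epoch(epoch: int, step: float = 0.05,
                    upper: float = 1.0) -> float:
    """omega starts at 0 and grows by `step` every 100 epochs,
    clamped to the configured upper bound."""
    return min((epoch // 100) * step, upper)
```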
Aiming at problems such as the loss of face semantic information in style migration tasks, the invention provides a novel face image feature extraction network based on the fusion of a VGG (Visual Geometry Group) network and an encoder. The face image feature extraction network comprises two feature extraction networks: one is a VGG network trained on a high-definition face data set, whose first eight layers are taken, with parameters fixed, for the face feature extraction of the invention; the other is an encoder network, with a gradually increasing parameter set to balance the output ratio of the two networks. It should be noted that the first eight layers of the VGG network selected in this application are only for illustration and are not used to limit the VGG network.
Regarding the original adaptive instance normalization algorithm: it directly aligns the variance and mean of the real image to the variance and mean of the cartoon style image, so the generated images suffer from discontinuity and color confusion. Therefore, the application provides a new weight-based adaptive instance normalization cartoon image generation method that adjusts the proportion of the variance and mean of the content image and the cartoon image by introducing a weight parameter, so that style migration can be performed better and the image effect of style migration is improved.
In summary, the application provides a face image feature extraction network based on the fusion of a VGG network and an encoder, together with an improved weight-based adaptive instance normalization cartoon face style image generation algorithm, which show excellent performance and effect in the field of face cartoon image generation.
It should be understood that, although the steps in the flowcharts related to the embodiments described above are shown in sequence as indicated by the arrows, the steps are not necessarily performed in that sequence. Unless explicitly stated otherwise herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least a part of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; the execution order of these sub-steps or stages is not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least a part of the sub-steps or stages of other steps.
Based on the same inventive concept, the embodiment of the present application further provides an image style migration apparatus for implementing the image style migration method. The implementation scheme for solving the problem provided by the device is similar to the implementation scheme described in the above method, so specific limitations in one or more embodiments of the image style migration device provided below may refer to the limitations on the image style migration method in the foregoing, and details are not described herein again.
In one embodiment, as shown in fig. 9, there is provided an image style migration apparatus including: an obtaining module 920, a processing module 940 and an output module 960, wherein:
an obtaining module 920, configured to obtain a content image to be processed;
the processing module 940 is configured to input the content image into a preset style migration network for style migration processing to obtain a target style migration image; the preset style migration network is used for converting the original style of the content image into a preset style; wherein the preset style migration network is obtained by training an initial style migration network based on a preset weight parameter set, a sample content image, and a sample style image;
an output module 960 for outputting the target style migration image.
In one embodiment, as shown in fig. 10, the apparatus further includes a network training module 980 configured to train the preset style migration network. The network training module 980 comprises an acquisition unit, a training unit, and a determination unit, wherein the acquisition unit is used for acquiring the preset weight parameter set, the sample content image, and the sample style image; the training unit is used for training the initial style migration network according to each preset weight parameter in the preset weight parameter set, the sample content image, the sample style image, and the initial style migration network to generate candidate style migration networks; and the determination unit is used for determining the preset style migration network from the candidate style migration networks according to the sample style image.
In one embodiment, the training unit is configured to perform style migration processing on the sample content image and train the initial style migration network according to each preset weight parameter in the preset weight parameter set, the sample content image, the sample style image, and the initial style migration network, and generate a candidate style migration image corresponding to the sample content image and a candidate style migration network corresponding to the candidate style migration image.
In one embodiment, the training unit is configured to input preset weight parameters, sample content images, and sample style images in a preset weight parameter set to an initial style migration network to perform style migration processing on the sample content images, and generate candidate style migration images corresponding to the sample content images; train the initial style migration network according to the candidate style migration image and the sample style image to obtain a candidate style migration network; and take the next preset weight parameter of the preset weight parameters as a new preset weight parameter and iteratively generate a new candidate style migration image and a new candidate style migration network corresponding to the sample content image, until the iteration reaches the last preset weight parameter in the preset weight parameter set, taking the candidate style migration image and the new candidate style migration image as the candidate style migration images, and the candidate style migration network and the new candidate style migration network as the candidate style migration networks.
In one embodiment, the initial style migration network comprises an initial feature extraction network and an initial adaptive instance normalization network; the preset weight parameters comprise a first type of preset weight parameters; the training unit is used for inputting the sample content images and the sample style images into the initial feature extraction network for feature extraction to obtain first feature images of the sample content images and second feature images of the sample style images; updating the network parameters of the adaptive instance normalization network to the first type of preset weight parameters to generate the initial adaptive instance normalization network; and inputting the first feature image and the second feature image into the initial adaptive instance normalization network for style migration to generate a candidate style migration image corresponding to the sample content image.
In one embodiment, the training unit is configured to calculate a weighted sum of the mean of the first feature image and the mean of the second feature image based on the first type of preset weight parameter; calculate a weighted sum of the variance of the first feature image and the variance of the second feature image based on the first type of preset weight parameter; and input the weighted sum of the means, the weighted sum of the variances, the mean and variance of the first feature image, and the sample content image into the initial adaptive instance normalization network for style migration to generate a first candidate style migration image corresponding to the sample content image.
In one embodiment, the preset weight parameters further include a second type of preset weight parameters; the initial feature extraction network comprises a preset convolutional neural network, a first encoder network, and a second encoder network; the training unit is used for inputting the sample content image into the preset convolutional neural network for feature extraction to obtain a first feature of the sample content image, and inputting the sample content image into the first encoder network for feature extraction to obtain a second feature of the sample content image; performing feature fusion on the first feature and the second feature of the sample content image according to the second type of preset weight parameter to generate a first feature image of the sample content image; and inputting the sample style image into the second encoder network for feature extraction to obtain a second feature image of the sample style image.
In one embodiment, the predetermined convolutional neural network is trained based on a predetermined image database corresponding to the sample content image.
The modules in the image style migration apparatus may be wholly or partially implemented by software, hardware, or a combination thereof. The modules can be embedded, in hardware form, in or be independent of a processor in the computer device, or be stored, in software form, in a memory in the computer device, so that the processor can invoke and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 11. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operating system and the computer program to run on the non-volatile storage medium. The database of the computer device is used for storing network data of the preset style migration network. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement an image style migration method.
Those skilled in the art will appreciate that the architecture shown in fig. 11 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply; particular computing devices may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the image style migration method in the above embodiments when executing the computer program.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which when executed by a processor implements the steps of the image style migration method in the various embodiments described above.
In an embodiment, a computer program product is provided, comprising a computer program which, when being executed by a processor, realizes the steps of the image style migration method in the above-mentioned respective embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, Resistive Random Access Memory (ReRAM), Magnetoresistive Random Access Memory (MRAM), Ferroelectric Random Access Memory (FRAM), Phase Change Memory (PCM), graphene memory, and the like. Volatile memory can include Random Access Memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases referred to in the various embodiments provided herein may include at least one of relational and non-relational databases. Non-relational databases may include, but are not limited to, blockchain-based distributed databases and the like. The processors referred to in the embodiments provided herein may be general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, data processing logic devices based on quantum computing, and the like, without limitation.
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction between them, such combinations should be considered within the scope of this specification.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is specific and detailed, but not construed as limiting the scope of the present application. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, and these are all within the scope of protection of the present application. Therefore, the protection scope of the present application shall be subject to the appended claims.

Claims (12)

1. An image style migration method, characterized in that the method comprises:
acquiring a content image to be processed;
inputting the content image into a preset style migration network for style migration processing to obtain a target style migration image; the preset style migration network is used for converting the original style of the content image into a preset style; wherein the preset style migration network is obtained by training an initial style migration network based on a preset weight parameter set, a sample content image and a sample style image;
and outputting the target style transition image.
2. The method of claim 1, wherein the training process of the pre-defined style migration network comprises:
acquiring the preset weight parameter set, the sample content image and the sample style image;
training the initial style migration network according to each preset weight parameter in the preset weight parameter set, the sample content image, the sample style image and the initial style migration network to generate a candidate style migration network;
and determining the preset style migration network from the candidate style migration networks according to the sample style images.
3. The method according to claim 2, wherein the training the initial style migration network according to each preset weight parameter in the preset weight parameter set, the sample content image, the sample style image, and the initial style migration network to generate a candidate style migration network comprises:
and performing style migration processing on the sample content images and training the initial style migration network according to each preset weight parameter in the preset weight parameter set, the sample content images, the sample style images and the initial style migration network to generate candidate style migration images corresponding to the sample content images and candidate style migration networks corresponding to the candidate style migration images.
4. The method according to claim 3, wherein performing style migration processing on the sample content image and training on the initial style migration network according to each preset weight parameter in the preset weight parameter set, the sample content image, the sample style image, and the initial style migration network to generate a candidate style migration image corresponding to the sample content image and a candidate style migration network corresponding to the candidate style migration image comprises:
inputting preset weight parameters in the preset weight parameter set, the sample content images and the sample style images into the initial style migration network to perform style migration processing on the sample content images, and generating candidate style migration images corresponding to the sample content images;
training the initial style migration network according to the candidate style migration image and the sample style image to obtain a candidate style migration network;
taking the next preset weight parameter of the preset weight parameters as a new preset weight parameter, and iteratively generating a new candidate style migration image and a new candidate style migration network corresponding to the sample content image, until the last preset weight parameter in the preset weight parameter set is iterated; and taking the candidate style migration image and the new candidate style migration image as candidate style migration images, and the candidate style migration network and the new candidate style migration network as candidate style migration networks.
5. The method of claim 4, wherein the initial style migration network comprises an initial feature extraction network and an initial adaptive instance normalization network; the preset weight parameters comprise a first type of preset weight parameters; the inputting preset weight parameters in the preset weight parameter set, the sample content images and the sample style images into the initial style migration network to perform style migration processing on the sample content images and generate candidate style migration images corresponding to the sample content images includes:
inputting the sample content image and the sample style image into an initial feature extraction network for feature extraction to obtain a first feature image of the sample content image and a second feature image of the sample style image;
updating the network parameters of the adaptive instance normalization network to the first type of preset weight parameters, and generating the initial adaptive instance normalization network;
and inputting the first feature image and the second feature image into the initial adaptive instance normalization network for style migration, and generating a candidate style migration image corresponding to the sample content image.
6. The method of claim 5, wherein inputting the first feature image and the second feature image into the initial adaptive instance normalization network for style migration to generate a first candidate style migration image corresponding to the sample content image comprises:
calculating a weighted sum of the mean value of the first characteristic image and the mean value of the second characteristic image based on a first type of preset weight parameter;
calculating a weighted sum of the variance of the first characteristic image and the variance of the second characteristic image based on the first type of preset weight parameters;
and inputting the weighted sum of the mean values, the weighted sum of the variances, the mean value and the variance of the first characteristic image and the sample content image into the initial adaptive instance normalization network for style migration, and generating a first candidate style migration image corresponding to the sample content image.
7. The method according to claim 5 or 6, wherein the preset weight parameters further comprise a second type of preset weight parameters; the initial feature extraction network comprises a preset convolutional neural network, a first encoder network and a second encoder network; inputting the sample content image and the sample style image into an initial feature extraction network for feature extraction to obtain a first feature image of the sample content image and a second feature image of the sample style image, wherein the method comprises the following steps:
inputting the sample content image into the preset convolutional neural network for feature extraction to obtain a first feature of the sample content image, and inputting the sample content image into the first encoder network for feature extraction to obtain a second feature of the sample content image;
according to the second type of preset weight parameters, performing feature fusion on the first features of the sample content images and the second features of the sample content images to generate first feature images of the sample content images;
and inputting the sample style image into the second encoder network for feature extraction to obtain a second feature image of the sample style image.
8. The method of claim 7, wherein the predetermined convolutional neural network is trained based on a predetermined image database corresponding to the sample content images.
9. An image processing apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring a content image to be processed;
the processing module is used for inputting the content image into a preset style migration network for style migration processing to obtain a target style migration image; the preset style migration network is used for converting the original style of the content image into a preset style; wherein the preset style migration network is obtained by training an initial style migration network based on a preset weight parameter set, a sample content image and a sample style image;
and the output module is used for outputting the target style migration image.
10. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 8.
11. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 8.
12. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, realizes the steps of the method of any one of claims 1 to 8.
CN202211625387.0A 2022-12-16 2022-12-16 Image style migration method and device, computer equipment, storage medium and product Pending CN115861041A (en)
