CN113837926A - Image migration method based on mean standard deviation
- Publication number
- CN113837926A CN113837926A CN202111035198.3A CN202111035198A CN113837926A CN 113837926 A CN113837926 A CN 113837926A CN 202111035198 A CN202111035198 A CN 202111035198A CN 113837926 A CN113837926 A CN 113837926A
- Authority
- CN
- China
- Prior art keywords
- style
- image
- feature
- content
- loss function
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06T3/04
- G06N3/045 — Combinations of networks
- G06N3/084 — Backpropagation, e.g. using gradient descent
- G06T5/50 — Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
- G06T2207/20221 — Image fusion; Image merging
Abstract
The invention discloses an image migration method based on the mean and standard deviation. The method constructs a feature space that stores the feature information of different filters and applies different normalization statistics in different network layers, so that multi-scale, stable features are obtained without training on real data and style transfer can be performed flexibly. Guided by CNN theoretical analysis, the feature information reconstructed by the high-level network is extracted, and the FC and soft-max layers are removed to improve computational efficiency. Experimental results show that in the style-transfer process the mean-standard-deviation algorithm outperforms the Gram algorithm: style-transfer distortion is small, and runtime efficiency improves by roughly a factor of 30.
Description
Technical Field
The invention belongs to the intersection of digital image processing, artificial intelligence and neural networks. Its subject matter is an image style migration method based on a mean-standard-deviation algorithm: a method with small style-transfer distortion and high computational efficiency.
Background
Image style migration extracts the distinctive style characteristics of a style image and migrates them into a content image, so that the feature images of the two are combined. During style migration, the generated feature map must faithfully represent the artistic features of the original style image while also rendering the texture features produced by combining the content image and the style image. The goal of style migration is to make the intermediate image consistent with the content image in content and with the style image in style, through repeated parameter adjustment.
Image style migration is closely related to the texture features of an image; the two are complementary. Before deep learning, the texture features of images were captured by statistical models built from local features of images in a given style, and migrating an image required adapting the model to fit it. Kolliopoulos used local morphology to describe the placement of brush strokes, producing different features in different semantic regions of an image. Models based on local features, however, are poorly suited to extracting artistic features: they have inherent limitations and weak capability for capturing global features. Once deep learning became a research hotspot, Gatys et al. extracted a feature space based on a convolutional neural network and introduced a new natural texture generation model whose sample information is expressed accurately and with high perceptual quality, proving the effectiveness of convolutional neural networks for feature expression. Subsequently, Gatys et al. found that extracting information from different layers of the network yields different expression effects, and that a multi-layer feature fusion method makes the style expression richer. When computing the loss, content loss and style loss must be considered simultaneously, with a parameter coefficient controlling their weights; and when measuring the texture features of an image, a Gram matrix is introduced so that the products in a feature map do not depend on position. Li et al. questioned the Gram-matrix style metric and proved theoretically that matching Gram matrices is equivalent to minimizing the Maximum Mean Discrepancy (MMD) with a second-order polynomial kernel.
Disclosure of Invention
The method studies style-transfer technology based on convolutional neural networks. It extracts features of the style image and the content image in an image feature space and improves the Gram-matrix-based image style transfer algorithm: instead of the correlations of multi-layer feature information in the CNN, the mean and standard deviation obtained in a high-level network serve as the style feature and realize normalization, improving the computational efficiency of the style features. The generated feature image is compared with the content image in terms of PSNR and SSIM image quality, and by examining the loss-function trend of the improved algorithm, distortion and artifacts of the image features are reduced.
The main technical scheme comprises: (1) a style extraction network, which designs a feature space that can be constructed on the convolution kernels of various network layers to store the feature information on different convolution kernels; combining the features of convolution kernels from different network layers yields rich and stable features; and (2) style migration training, in which the loss function and weights of the VGG-19 network are trained; the loss function and weights in neural style migration have different meanings.
During VGG-19 network training, the weights are updated through back-propagation: the weights change, the loss value is related to the weights, and the pixels of the input image do not change. In neural style migration, by contrast, a deep neural network trained for object recognition lets us operate in a feature space that explicitly represents the high-level content of an image. The target loss is computed, and the noise image x is adjusted by optimizing the loss function so that the generated content feature set approaches the original content feature set and the generated style feature set approaches the style feature set of the original image. The overall loss function is defined as a weighted sum of a content loss function and a style loss function, and the generated style image is obtained by optimizing this loss.
Experimental simulation shows that the method can well improve the operation efficiency of style characteristics and reduce distortion and artifacts of image characteristics.
Drawings
The main figures of the process are as follows.
FIG. 1 is a flow chart of the method of the present invention.
Fig. 2 is an original content image.
Fig. 3 shows the original style images corresponding to the respective original content images.
FIG. 4 is a migration diagram generated by the G-NST algorithm.
FIG. 5 is a migration map generated based on the mean standard deviation algorithm.
FIG. 6 is a graph of the loss of the G-NST algorithm over the number of iterations.
FIG. 7 is a graph of the loss of the mean standard deviation algorithm over the number of iterations.
Detailed Description
The invention is further described with reference to the following figures and detailed description.
The flow chart of the invention is shown in fig. 1, and the image style migration method based on the mean standard deviation algorithm specifically comprises the following steps:
Step one: style extraction network
Design a feature space that can be constructed on the convolution kernels of various network layers to store the feature information on different convolution kernels.
The features of the convolution kernels of different network layers are combined to obtain rich and stable features. In the field of neural style migration, we can operate in a feature space by using a deep neural network trained by object recognition to explicitly represent the high-level content of an image.
Wherein l denotes the l-th network layer, P_ij^l denotes the feature value of the content picture p at position j on the i-th feature map in the l-th network layer, and F_ij^l denotes the feature value of the generated picture x at position j on the i-th feature map in the l-th network layer. The content loss is then Lc(p, x, l) = (1/2) Σ_ij (F_ij^l − P_ij^l)².
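As a minimal sketch (assuming the standard Gatys-style per-layer content loss, with feature maps flattened to shape [N, M]):

```python
import numpy as np

def content_loss(F, P):
    """Content loss Lc = 1/2 * sum((F - P)^2) between the layer-l
    features F of the generated image and P of the content image,
    each of shape [N feature maps, M spatial positions]."""
    return 0.5 * np.sum((F - P) ** 2)
```

Minimizing this loss alone would reproduce the content image's layer-l activations exactly.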
When the style characteristics of the image are calculated, they are represented by the inner products of the i-th and j-th feature maps in layer l. The Gram matrix of the generated image is G_ij^l = Σ_k F_ik^l F_jk^l. Similarly, the style characteristic A^l of the style image can be calculated. The style loss of layer l is then E_l = (1/(4 N_l² M_l²)) Σ_ij (G_ij^l − A_ij^l)², where N_l is the number of feature maps in layer l and M_l their size.
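The Gram-matrix style representation and per-layer style loss can be sketched as follows (a hedged illustration on flattened [N, M] feature maps, using the usual Gatys-style 1/(4N²M²) normalization):

```python
import numpy as np

def gram_matrix(F):
    """G[i, j] = sum_k F[i, k] * F[j, k]: inner products between
    feature maps, discarding all position information."""
    return F @ F.T

def layer_style_loss(F, A):
    """Per-layer style loss E_l = 1/(4 N^2 M^2) * sum((G - A)^2),
    where A is the Gram matrix of the style image at this layer."""
    N, M = F.shape
    G = gram_matrix(F)
    return np.sum((G - A) ** 2) / (4.0 * N ** 2 * M ** 2)
```

Note the O(N²) pairwise products: this is the computational cost the mean-standard-deviation representation avoids.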
The Gram matrix does not include position information; it computes the correlation between every pair of features, so its computational complexity is high. The mean and variance, by contrast, capture features of different domains. Let F^l(a) and F^l(x) denote the feature maps of images a and x at layer l, and let μ_i^l and σ_i^l denote the mean and standard deviation of the i-th feature channel at layer l for images a and x respectively: μ_i^l = (1/M_l) Σ_j F_ij^l and σ_i^l = sqrt((1/M_l) Σ_j (F_ij^l − μ_i^l)²).
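A sketch of the mean-standard-deviation style representation (assuming flattened [N, M] feature maps; the exact loss weighting used by the invention is not stated, so a plain sum of squared differences is shown):

```python
import numpy as np

def channel_stats(F):
    """Per-channel mean and standard deviation of a [N, M] feature
    map; these statistics replace the Gram matrix as the style
    representation, reducing cost from O(N^2 M) to O(N M)."""
    return F.mean(axis=1), F.std(axis=1)

def mean_std_style_loss(F_x, F_s):
    """Match the channel means/stds of the generated image x
    against those of the style image s at one layer."""
    mu_x, sig_x = channel_stats(F_x)
    mu_s, sig_s = channel_stats(F_s)
    return np.sum((mu_x - mu_s) ** 2) + np.sum((sig_x - sig_s) ** 2)
```

Only 2N numbers per layer are compared, instead of an N×N Gram matrix.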
The total loss function is defined as a weighted sum of the content loss function and the style loss function:
Ltotal = αLc(c, e) + βLs(s, e)
The total loss depends on the content loss and the style loss; the parameters α and β weight the loss function. When α dominates, the content features are more faithfully preserved; when β dominates, the style features are more prominent. This in turn influences the training process.
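The weighted total loss and the α/β trade-off can be sketched as (the default β here is an illustrative value, not one taken from the invention):

```python
def total_loss(L_c, L_s, alpha=1.0, beta=1000.0):
    """L_total = alpha * Lc + beta * Ls: a larger alpha preserves
    content, a larger beta emphasizes style."""
    return alpha * L_c + beta * L_s
```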
Step two: style migration training
1) The loss function and the weights trained in the VGG-19 network have different meanings from the loss function and weights in neural style migration.
2) During VGG-19 network training, the weights are updated through back-propagation: the weights change, the loss function value is related to the weights, and the pixels of the input image do not change.
3) In neural style migration, by using a deep neural network trained for object recognition, we can operate in a feature space that explicitly represents the high-level content of an image.
4) The target loss is calculated, and the noise image x is adjusted by optimizing the loss function so that the generated content feature set approaches the original content feature set, and the generated style feature set approaches the style feature set of the original image.
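The adjustment of the noise image x in step 4) can be sketched as plain gradient descent on the loss; this is a toy example that matches a target feature vector directly rather than real VGG-19 features:

```python
import numpy as np

def optimize_image(x0, target, lr=0.1, steps=200):
    """Iteratively adjust x so that its 'features' (here x itself)
    approach the target, mimicking how the noise image is updated
    while the network weights stay fixed."""
    x = np.asarray(x0, float).copy()
    for _ in range(steps):
        grad = 2.0 * (x - target)   # d/dx of sum((x - target)^2)
        x -= lr * grad
    return x
```

In the actual method, `grad` would come from back-propagating the total loss through the fixed VGG-19 network.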
Step three: obtaining generation-style images by optimizing optimal loss
1) To find the feature map corresponding to the maximum activation response, gradient iteration must be performed continuously, so that matching the high-level network features fuses the picture content and the artistic texture well without excessively preserving specific pixel information.
2) The total loss depends on the content loss and the style loss; the parameters α and β weight the loss function. When α dominates, the content features are more obvious; when β dominates, the style features are more prominent. This in turn influences the training process.
Method test
The simulation experiment selects three original content images and their corresponding style images for a series of experiments.
The performance of the algorithm is evaluated with PSNR and SSIM. PSNR evaluates the quality of a reconstructed image against the original: the higher the PSNR value, the more robust the image, the smaller the distortion, and the better it reflects the visual quality of the image. It is calculated as PSNR = 10·log10(255² / MSE), where MSE = (1/(N1·N2)) Σ_ij (I_ij − I'_ij)², I_ij and I'_ij are the pixel values of the two images being compared, and N1×N2 is the image size.
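A sketch of the PSNR calculation, assuming 8-bit images (peak value 255):

```python
import numpy as np

def psnr(I, I2, peak=255.0):
    """PSNR = 10 * log10(peak^2 / MSE) over two same-sized images;
    identical images give infinite PSNR."""
    mse = np.mean((np.asarray(I, float) - np.asarray(I2, float)) ** 2)
    if mse == 0.0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)
```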
SSIM measures structural similarity in terms of luminance, contrast and structure, and is calculated as:
SSIM(x, y) = ((2μ_x μ_y + c1)(2σ_xy + c2)) / ((μ_x² + μ_y² + c1)(σ_x² + σ_y² + c2))
where μ_x is the mean of x, μ_y is the mean of y, σ_x² is the variance of x, σ_y² is the variance of y, σ_xy is the covariance of x and y, and c1 and c2 are two constants that keep the denominator from being 0.
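A single-window (global) SSIM sketch; practical implementations use a sliding Gaussian window, and the constants c1 and c2 follow the common 0.01/0.03 convention rather than values stated in the invention:

```python
import numpy as np

def ssim(x, y, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """SSIM(x, y) = (2*mu_x*mu_y + c1)(2*cov_xy + c2) /
    ((mu_x^2 + mu_y^2 + c1)(var_x + var_y + c2)),
    computed over the whole image as one window."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )
```

SSIM of an image with itself is exactly 1, the maximum value.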
Table 1 shows the PSNR values of the effect images and the content images in repeated experiments.
Table 2 shows the SSIM values of the effect images and the content images in repeated experiments.
Claims (2)
1. An image migration method based on the mean and standard deviation, characterized in that: by extracting features of the style image and the content image in an image feature space, the method improves the Gram-matrix-based image style transfer algorithm: instead of the correlations of multi-layer feature information in the CNN, the mean and standard deviation obtained in a high-level network serve as the style feature and realize normalization, improving the computational efficiency of the style features; the generated feature image is compared with the content image in terms of PSNR and SSIM image quality, and by examining the loss-function value trend of the improved algorithm, distortion and artifacts of the image features are reduced.
2. The method for image migration according to claim 1, wherein: the method specifically comprises the following steps:
step one: style extraction network
Designing a feature space that can be constructed on the convolution kernels of various network layers to store the feature information on different convolution kernels; combining the features of convolution kernels from different network layers yields rich and stable features; in the field of neural style migration, by using a deep neural network trained for object recognition, we can operate in a feature space that expresses the high-level content of an image; the style of the image is extracted not only by observing the pixels of the style image, but also by combining the features extracted by a pre-trained model with the content of the style image;
step two: style migration training
1) training the loss function and weights of the VGG-19 network, where the loss function and weights in neural style migration have different meanings;
2) in the VGG-19 network training process, the weights are updated through back-propagation: the weights change, the loss function value is related to the weights, and the pixels of the input image do not change;
3) in the field of neural style migration, by using a deep neural network trained for object recognition, we can operate in a feature space that expresses the high-level content of an image;
4) calculating the target loss, and adjusting the noise image x by optimizing the loss function so that the generated content feature set approaches the original content feature set, and the generated style feature set approaches the style feature set of the original image;
step three: obtaining generation-style images by optimizing optimal loss
1) to find the feature map corresponding to the maximum activation response, gradient iteration must be performed continuously, so that matching the high-level network features fuses the picture content and the artistic texture well without excessively preserving specific pixel information;
2) when the style characteristics of the image are calculated, they are expressed by the inner products of the i-th and j-th feature maps in layer l;
3) the Gram matrix does not include position information and calculates the correlation between every pair of features, so its computational complexity is high; the statistics (mean and variance) of the batch normalization layer, by contrast, contain features of different domains;
4) the overall loss function is defined as a weighted sum of the content loss function and the style loss function:
Ltotal=αLc(c,e)+βLs(s,e)
5) the total loss depends on the content loss and the style loss; the parameters α and β weight the loss function: when α dominates, the content features are more obvious, and when β dominates, the style features are more prominent, which in turn influences the training process.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111035198.3A CN113837926A (en) | 2021-09-05 | 2021-09-05 | Image migration method based on mean standard deviation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113837926A true CN113837926A (en) | 2021-12-24 |
Family
ID=78962329
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111035198.3A Pending CN113837926A (en) | 2021-09-05 | 2021-09-05 | Image migration method based on mean standard deviation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113837926A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107977414A (en) * | 2017-11-22 | 2018-05-01 | 西安财经学院 | Image Style Transfer method and its system based on deep learning |
US20180240257A1 (en) * | 2017-02-21 | 2018-08-23 | Adobe Systems Incorporated | Deep high-resolution style synthesis |
CN110490791A (en) * | 2019-07-10 | 2019-11-22 | 西安理工大学 | Dress ornament Graphic Arts generation method based on deep learning Style Transfer |
CN110570377A (en) * | 2019-09-11 | 2019-12-13 | 辽宁工程技术大学 | group normalization-based rapid image style migration method |
CN111161134A (en) * | 2019-12-30 | 2020-05-15 | 桂林理工大学 | Image artistic style conversion method based on gamma conversion |
- 2021-09-05: CN CN202111035198.3A patent/CN113837926A/en active Pending
Non-Patent Citations (1)
Title |
---|
Zhang Yue et al.: "Design and Analysis of an Image Style Transfer Algorithm Based on VGG-19", Information Technology and Informatization, no. 1, pages 70-72 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||