CN111768335A - CNN-based user interactive image local clothing style migration method - Google Patents
- Publication number
- CN111768335A (application CN202010628294.8A)
- Authority
- CN
- China
- Prior art keywords
- style
- image
- content
- clothing
- loss
- Prior art date
- Legal status: Granted (status assumed by Google Patents; not a legal conclusion)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/04—Context-preserving transformations, e.g. by using an importance map
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/001—Texturing; Colouring; Generation of texture or colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/194—Segmentation; Edge detection involving foreground-background segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20076—Probabilistic image processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20172—Image enhancement details
- G06T2207/20192—Edge enhancement; Edge preservation
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The invention discloses a CNN-based user-interactive method for local clothing style migration in images, comprising the following steps: (1) inputting the content image and the style image into a CNN for feature mapping to obtain content features and style features; (2) interactively segmenting the content image with the GrabCut algorithm, framing the local clothing with a rectangle to extract it and generate a local clothing contour map; (3) converting the contour map into a binary map and performing a distance transformation to generate a distance transformation matrix; (4) enlarging the gap between pixels inside and outside the local clothing contour with a power operation to form the contour feature; (5) computing the content loss, style loss, and contour loss of a random noise image from these features; (6) combining the three losses and adding a regularization term to smooth and denoise the boundary area. By introducing a contour loss through a user-interactive method, the invention preserves the clothing shape, restricts the style migration region, and effectively achieves style migration of local clothing.
Description
Technical Field
The invention relates to the technical field of image processing and recognition, in particular to a CNN-based user interactive image local clothing style migration method.
Background
Image style migration refers to extracting the style of one picture and applying it to another. In early computer vision work, stylization was generally treated as an extension of texture synthesis: a new image was generated through texture modeling, but the quality of images produced this way is low. Some studies attempt style migration with generative adversarial networks (GANs) and obtain good migration effects, but GAN-based methods are unstable: their generation is too unconstrained and must be carefully restricted to produce reasonable results stably. Moreover, GANs are data-driven and presuppose large amounts of data, so they struggle when data is scarce. In recent years, image style migration research has mainly focused on mapping content and style features through a convolutional neural network, iteratively generating a new image by continually reducing the content and style losses. These methods achieve good results, but they cannot preserve content details well during migration and lack the semantic and depth information contained in the content image. Applied directly to fashion style migration of clothing, they produce low-resolution garment pictures in which the clothing shape deforms, the original clothing color is retained, the style migrates irregularly onto the background instead of the local clothing, and the clothing is hard to fuse with the new style.
Disclosure of Invention
In order to apply style migration to the fashion field and realize style design of fashion clothing, the method performs image segmentation with the user-interactive GrabCut algorithm: the user only needs to frame a local garment with a rectangle to extract its contour map. Combined with a convolutional neural network, a contour loss is introduced to preserve the clothing shape and restrict the style migration region, overcoming the defects that the clothing shape deforms and cannot fuse with the new style. With this simple user-interactive method, the picture style is fused only with the local clothing, yielding a new clothing style and realizing user-interactive local clothing style migration in images.
The technical scheme adopted by the invention for solving the technical problems is as follows: a CNN-based user interactive image local clothing style migration method comprises the following steps:
step 1: taking a clothing image as a content image, taking a picture as a style image, inputting the picture into a CNN network for feature mapping to obtain content features and style features;
step 2: performing interactive image segmentation by using a GrabCut algorithm, framing the local clothing to be subjected to style migration in the content image in the step 1 by using a rectangle, and marking the local clothing as unknown; marking the region outside the local clothing as a background, calculating the probability that an unknown pixel in a rectangular frame belongs to the background or the target according to a Gaussian mixture model, thereby segmenting the image into the background and the target, extracting the target in the rectangular frame, namely the local clothing, and generating a local clothing contour map;
step 3: converting the local clothing contour map from step 2 into a binary map, and performing a distance transformation with the Euclidean distance formula to obtain the distance transformation matrix;
step 4: after the distance transformation of step 3, pixel values inside the local clothing contour are 0; enlarging the pixel values outside the contour with a power operation to increase the distance gap between inside and outside, forming the contour feature;
step 5: obtaining the content loss, style loss, and contour loss by differencing the features of the random noise image against the content features and style features from step 1 and the contour feature from step 4, respectively;
step 6: summing the three losses with different weights, adding a regularization penalty term, and finally updating the network weights with gradient descent to minimize the loss and generate the result image.
The specific process of the step 1 is as follows:
step 1.1: input a clothing image as the content image into a trained VGG-19 network model and define a random noise image x. Let N_l be the number of convolution kernels of layer l and M_l the size of the feature map of layer l. The features of the random noise image at layer l are expressed as a matrix F^l of size N_l × M_l, where F^l_ij is the activation value of the noise image at position j on the i-th convolution kernel of layer l of the CNN. Similarly, define the input content image p; its features at layer l are P^l, where P^l_ij is the activation value of the content image at position j on the i-th convolution kernel of layer l of the CNN;
step 1.2: input a picture as the style image into the trained VGG-19 network model. Style features are computed as inner products between different feature maps of the same layer, combining the features of several convolutional layers. A Gram matrix is introduced: G^l_ij denotes the inner product of feature map i and feature map j of layer l of the random noise image, given by
G^l_ij = ∑_k F^l_ik F^l_jk (1)
where k indexes the k-th element of the feature map.
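A minimal numpy sketch of the Gram matrix of equation (1); the toy feature values (N_l = 3 kernels over M_l = 4 positions) are illustrative assumptions, not values from the patent:

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix G^l of a layer-l feature map.

    features: array of shape (N_l, M_l), each of the N_l kernels
    flattened to M_l positions; G^l_ij = sum_k F^l_ik * F^l_jk.
    """
    return features @ features.T

# Toy layer with N_l = 3 kernels over M_l = 4 positions.
F = np.array([[1., 0., 2., 1.],
              [0., 1., 1., 0.],
              [2., 1., 0., 1.]])
G = gram_matrix(F)
```

The matrix product F F^T computes every pairwise inner product at once, which is why implementations flatten each feature map before taking it.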
The specific process of the step 2 is as follows:
performing interactive image segmentation by using a GrabCut algorithm, framing the local clothing decoration needing style migration in the content image in the step 1 by using a rectangle, and marking the local clothing decoration as unknown; marking the region outside the local clothing as a background, calculating the probability that an unknown pixel in a rectangular frame belongs to the background or the target according to a Gaussian mixture model, thereby segmenting the image into the background and the target, extracting the target in the rectangular frame, namely the local clothing, and generating a local clothing contour map;
denote the gray values of the original grayscale image as z = (z_1, z_2, …, z_n), where z_n is the gray value of the n-th pixel. Pixel labels are represented by the opacity α = (α_1, α_2, …, α_n), α_n ∈ {0, 1}, where 0 marks background and 1 marks foreground. The algorithm models the foreground and background of a color image with Gaussian mixture models (GMMs) of K components each; k = (k_1, k_2, …, k_n), k_n ∈ {1, 2, …, K}, indicates which Gaussian component each pixel belongs to. The Gibbs energy function of the GrabCut algorithm is as follows:
E(α,k,θ,z)=U(α,k,θ,z)+V(α,z) (2)
where E is the Gibbs energy; U is the data term of the energy function, the negative logarithm of the probability that a pixel belongs to the target or the background; V is the smoothing term of the energy function; and θ = {h(z; α), α = 0, 1} is the gray-value histogram describing the distribution of the gray values z of foreground and background. The data term U is defined as follows:
U(α, k, θ, z) = ∑_n D(α_n, k_n, θ, z_n) (3)
wherein the area term D is defined by the following formula:
D(α_n, k_n, θ, z_n) = −log p(z_n | α_n, k_n, θ) − log π(α_n, k_n) (4)
where p(·) is a Gaussian probability distribution function and π(·) is the mixture weight of each Gaussian component within the whole model. Expanding further:
D(α_n, k_n, θ, z_n) = −log π(α_n, k_n) + (1/2) log det Σ(α_n, k_n) + (1/2) [z_n − μ(α_n, k_n)]^T Σ(α_n, k_n)^{−1} [z_n − μ(α_n, k_n)] (5)
where Σ(α_n, k_n) is the covariance matrix and det denotes the determinant; the GMM parameter vector θ is represented as follows:
θ = {π(α, k), μ(α, k), Σ(α, k)}, α = 0, 1, k = 1 … K (6)
where π is the mixture weight, μ the mean, and Σ the covariance of each Gaussian component; the smoothing term V is defined as follows:
V(α, z) = γ ∑_{(m,n)∈C} [α_n ≠ α_m] exp(−β ‖z_m − z_n‖²) (7)
where γ weights the degree of smoothing; m, n index neighboring pixels in the neighborhood system C; and ‖z_m − z_n‖ is the Euclidean distance between the gray values of neighboring pixels. The parameter β is determined by the image contrast: for a low-contrast image a large β is chosen to amplify pixel differences, and for a high-contrast image a small β is chosen to attenuate them.
The specific process of the step 3 is as follows:
convert the local clothing contour map into a binary map, and compute with the Euclidean distance formula the distance between the inside and outside of the contour to obtain the distance transformation matrix D. The distance transformation formula is:
D(p) = min(dist(p, q)), p ∈ O, q ∈ B (8)
where pixel p, with coordinates (x_1, y_1), belongs to the image region O inside the local clothing contour; pixel q, with coordinates (x_2, y_2), belongs to the region B outside the contour; and dist(p, q) = sqrt((x_1 − x_2)² + (y_1 − y_2)²).
The specific process of the step 4 is as follows:
after the distance transformation, pixel values inside the local clothing contour are 0. Pixel values outside the contour are amplified by a power operation to enlarge the distance gap between inside and outside, forming the contour feature:
D′(p) = D(p)^n (9)
where n is at least 2.
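Steps 3 and 4 can be sketched together in numpy. The brute-force distance computation and the 5x5 toy contour map are illustrative assumptions; a real implementation would use an optimized distance transform:

```python
import numpy as np

def distance_transform(binary):
    """Equation (8), brute force: pixels inside the contour (value 1)
    get 0; each outside pixel gets its minimum Euclidean distance to
    an inside pixel. Fine for small illustrative images."""
    inside = np.argwhere(binary == 1)
    D = np.zeros(binary.shape, dtype=float)
    for p in np.argwhere(binary == 0):
        D[tuple(p)] = np.sqrt(((inside - p) ** 2).sum(axis=1)).min()
    return D

# 5x5 binary contour map: a 3x3 "garment" block in the middle.
binary = np.zeros((5, 5), dtype=int)
binary[1:4, 1:4] = 1

D = distance_transform(binary)
D_amp = D ** 2   # power amplification of equation (9) with n = 2
```

Squaring leaves the inside-contour zeros untouched while growing every outside value, which is exactly the inside/outside gap the contour feature relies on.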
The specific process of the step 5 is as follows:
step 5.1: the content loss is obtained by differencing the content features F^l of the random noise image x against the content features P^l of the content image p:
L_c = (1/2) ∑_{i,j} (F^l_ij − P^l_ij)² (10)
step 5.2: the style loss is obtained by differencing the style features G^l of the random noise image x against the style features A^l of the style image. The style loss E_l of layer l is defined as:
E_l = 1/(4 N_l² M_l²) ∑_{i,j} (G^l_ij − A^l_ij)² (11)
The style loss over all layers of the CNN is defined by:
L_s = ∑_l w_l E_l (12)
where w_l is the weight of the style loss of each CNN layer.
Step 5.3: define the contour feature of the random noise image after distance transformation as the matrix D̂, and the contour feature of the input local clothing contour map after distance transformation as the matrix D. The contour loss L_d is defined as:
L_d = ∑_{i,j} (D̂_ij − D_ij)² (13)
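A numpy sketch of the three losses of equations (10) to (13) on toy features; the shapes, random values, and single-layer style weight w_l = 1 are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy layer-l features: N_l = 4 kernels over M_l = 9 spatial positions.
F = rng.standard_normal((4, 9))   # features of the random noise image
P = rng.standard_normal((4, 9))   # content features of the content image
A = rng.standard_normal((4, 4))   # style Gram target (assumed precomputed)
D_hat = rng.random((6, 6))        # contour features of the noise image
D = rng.random((6, 6))            # contour features of the clothing contour map

N_l, M_l = F.shape

L_c = 0.5 * ((F - P) ** 2).sum()                    # content loss, eq. (10)
G = F @ F.T                                         # Gram matrix, eq. (1)
E_l = ((G - A) ** 2).sum() / (4 * N_l**2 * M_l**2)  # layer style loss, eq. (11)
L_s = 1.0 * E_l                                     # w_l = 1, single layer, eq. (12)
L_d = ((D_hat - D) ** 2).sum()                      # contour loss, eq. (13)
```

All three losses are squared differences, which is what lets them share one gradient-descent loop in step 6.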
the specific process of the step 6 is as follows:
step 6.1: α, β, γ, r are four weight coefficients, and L_TV is a total variation regularization term introduced to suppress the noise generated during style migration and smooth the boundary of the local clothing contour:
L_TV = ‖D_x‖² / N_x + ‖D_y‖² / N_y (14)
L_total = α L_c + β L_s + γ L_d + r L_TV (15)
where D_x and D_y are the horizontal and vertical differences of the result image, and N_x and N_y are the numbers of elements in the corresponding difference results.
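A numpy sketch of the regularized total loss of equation (15); the normalized squared-difference form of the TV term and all weight and loss values below are illustrative assumptions:

```python
import numpy as np

def total_variation(x):
    """Total variation regularizer: squared horizontal and vertical
    pixel differences, each normalized by its element count."""
    dx = x[:, 1:] - x[:, :-1]   # horizontal differences D_x
    dy = x[1:, :] - x[:-1, :]   # vertical differences D_y
    return (dx ** 2).sum() / dx.size + (dy ** 2).sum() / dy.size

# Illustrative weights: the patent leaves alpha, beta, gamma, r open.
alpha, beta, gamma_w, r = 1.0, 1e3, 10.0, 1e-2
L_c, L_s, L_d = 2.0, 0.004, 1.5            # placeholder loss values
x = np.arange(9.0).reshape(3, 3)           # stand-in result image
L_total = alpha * L_c + beta * L_s + gamma_w * L_d + r * total_variation(x)
```

A constant image has zero TV, so the regularizer only penalizes pixel-to-pixel jumps, smoothing the contour boundary without fighting the other loss terms elsewhere.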
Step 6.2:
the loss function is minimized using gradient descent. The derivative of the content loss function is:
∂L_c/∂F^l_ij = (F^l − P^l)_ij if F^l_ij > 0, and 0 if F^l_ij < 0 (16)
the derivative of the style loss function is:
∂E_l/∂F^l_ij = (1/(N_l² M_l²)) [(F^l)^T (G^l − A^l)]_ji if F^l_ij > 0, and 0 if F^l_ij < 0 (17)
step 6.3: update the network weights to minimize the loss, generating a new clothing style in which the picture style is fused only with the local clothing.
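A toy gradient-descent loop in the spirit of step 6.2, minimizing only the content loss of equation (10) on a flattened feature vector; the learning rate, iteration count, and vector size are assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)
p = rng.standard_normal(16)     # target content features
x = rng.standard_normal(16)     # random noise initialization

lr = 0.1
losses = []
for _ in range(200):
    grad = x - p                        # derivative of 0.5*||x - p||^2
    x -= lr * grad                      # gradient-descent update
    losses.append(0.5 * ((x - p) ** 2).sum())
```

In the full method the gradient would combine equations (16), (17), the contour term, and the TV term, but the update rule is this same descent step.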
Advantageous effects:
The method lets users control the clothing style themselves. With the user-interactive approach, a user extracts the clothing contour simply by framing the garment with a rectangle, so ordinary users who are not professional designers can select a garment that suits them according to current trends, generate a new design style from their own fashion sense and preferences, and finally form a unique clothing style that meets their fashion needs. Meanwhile, the invention can provide professional designers with inspiration: drafts can be designed quickly from the style a user supplies, improving efficiency and customer satisfaction. The method is efficient and low-cost, and the generated designs are of high quality.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 shows the four content images input to the invention, wherein (a), (b), (c), and (d) are the first, second, third, and fourth input content images respectively;
FIG. 3 shows the local clothing contour maps extracted by GrabCut, wherein (a), (b), (c), and (d) are the contour maps extracted from the first, second, third, and fourth content images respectively;
FIG. 4 shows the two style images input to the invention, wherein (a) and (b) are the first and second input style images respectively;
FIG. 5 shows the results, wherein (e) and (f) are the local migration results generated from content FIGS. 2(a) and 2(b) with style FIG. 4(a), and (g) and (h) are the local migration results generated from content FIGS. 2(c) and 2(d) with style FIG. 4(b).
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention rather than all of them; all other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.
The invention discloses a CNN-based user interactive image local clothing style migration method, which comprises the following steps as shown in figure 1:
step 1: taking a clothing image as a content image, taking a picture as a style image, inputting the picture into a CNN network for feature mapping to obtain content features and style features;
step 2: performing interactive image segmentation by using a GrabCut algorithm, framing the local clothing to be subjected to style migration in the content image in the step 1 by using a rectangle, and marking the local clothing as unknown; marking the region outside the local clothing as a background, calculating the probability that an unknown pixel in a rectangular frame belongs to the background or the target according to a Gaussian mixture model, thereby segmenting the image into the background and the target, extracting the target in the rectangular frame, namely the local clothing, and generating a local clothing contour map;
step 3: converting the local clothing contour map from step 2 into a binary map, and performing a distance transformation with the Euclidean distance formula to obtain the distance transformation matrix;
step 4: after the distance transformation of step 3, pixel values inside the local clothing contour are 0; enlarging the pixel values outside the contour with a power operation to increase the distance gap between inside and outside, forming the contour feature;
step 5: obtaining the content loss, style loss, and contour loss by differencing the features of the random noise image against the content features and style features from step 1 and the contour feature from step 4, respectively;
step 6: summing the three losses with different weights, adding a regularization penalty term, and finally updating the network weights with gradient descent to minimize the loss and generate the result image.
Specifically, the step 1 includes:
step 1.1: input the content images (in this embodiment, the four content images of FIGS. 2(a), (b), (c), and (d)) into the trained VGG-19 network model and define a random noise image x. Let N_l be the number of convolution kernels of layer l and M_l the size of the feature map of layer l. The features of the random noise image at layer l are expressed as a matrix F^l of size N_l × M_l, where F^l_ij is the activation value of the noise image at position j on the i-th convolution kernel of layer l of the CNN. Similarly, define the input content image p; its features at layer l are P^l, where P^l_ij is the activation value of the content image at position j on the i-th convolution kernel of layer l of the CNN;
step 1.2: input the style images (in this embodiment, the two style images of FIGS. 4(a) and (b)) into the trained VGG-19 network model. Style features are computed as inner products between different feature maps of the same layer, combining the features of several convolutional layers. A Gram matrix is introduced: G^l_ij denotes the inner product of feature map i and feature map j of layer l of the random noise image, given by
G^l_ij = ∑_k F^l_ik F^l_jk (1)
where k indexes the k-th element of the feature map.
Step 2: perform interactive image segmentation with the GrabCut algorithm: frame the local clothing requiring style migration in the content image of step 1 with a rectangle and mark it as unknown; mark the region outside the local clothing as background; compute, from a Gaussian mixture model, the probability that each unknown pixel inside the rectangle belongs to the background or the target, thereby segmenting the image into background and target; and extract the target inside the rectangle, i.e., the local clothing, generating the local clothing contour map. FIGS. 3(a), (b), (c), and (d) show the four local clothing contour maps of this embodiment;
denote the gray values of the original grayscale image as z = (z_1, z_2, …, z_n), where z_n is the gray value of the n-th pixel. Pixel labels are represented by the opacity α = (α_1, α_2, …, α_n), α_n ∈ {0, 1}, where 0 marks background and 1 marks foreground. The algorithm models the foreground and background of a color image with Gaussian mixture models (GMMs) of K components each; k = (k_1, k_2, …, k_n), k_n ∈ {1, 2, …, K}, indicates which Gaussian component each pixel belongs to. The Gibbs energy function of the GrabCut algorithm is as follows:
E(α,k,θ,z)=U(α,k,θ,z)+V(α,z) (2)
where E is the Gibbs energy; U is the data term of the energy function, the negative logarithm of the probability that a pixel belongs to the target or the background; V is the smoothing term of the energy function; and θ = {h(z; α), α = 0, 1} is the gray-value histogram describing the distribution of the gray values z of foreground and background. The data term U is defined as follows:
U(α, k, θ, z) = ∑_n D(α_n, k_n, θ, z_n) (3)
wherein the area term D is defined by the following formula:
D(α_n, k_n, θ, z_n) = −log p(z_n | α_n, k_n, θ) − log π(α_n, k_n) (4)
where p(·) is a Gaussian probability distribution function and π(·) is the mixture weight of each Gaussian component within the whole model. Expanding further:
D(α_n, k_n, θ, z_n) = −log π(α_n, k_n) + (1/2) log det Σ(α_n, k_n) + (1/2) [z_n − μ(α_n, k_n)]^T Σ(α_n, k_n)^{−1} [z_n − μ(α_n, k_n)] (5)
where Σ(α_n, k_n) is the covariance matrix and det denotes the determinant; the GMM parameter vector θ is represented as follows:
θ = {π(α, k), μ(α, k), Σ(α, k)}, α = 0, 1, k = 1 … K (6)
where π is the mixture weight, μ the mean, and Σ the covariance of each Gaussian component; the smoothing term V is defined as follows:
V(α, z) = γ ∑_{(m,n)∈C} [α_n ≠ α_m] exp(−β ‖z_m − z_n‖²) (7)
where γ weights the degree of smoothing; m, n index neighboring pixels in the neighborhood system C; and ‖z_m − z_n‖ is the Euclidean distance between the gray values of neighboring pixels. The parameter β is determined by the image contrast: for a low-contrast image a large β is chosen to amplify pixel differences, and for a high-contrast image a small β is chosen to attenuate them.
Step 3: convert the local clothing contour map into a binary map, and compute with the Euclidean distance formula the distance between the inside and outside of the contour to obtain the distance transformation matrix D. The distance transformation formula is:
D(p) = min(dist(p, q)), p ∈ O, q ∈ B (8)
where pixel p, with coordinates (x_1, y_1), belongs to the image region O inside the local clothing contour; pixel q, with coordinates (x_2, y_2), belongs to the region B outside the contour; and dist(p, q) = sqrt((x_1 − x_2)² + (y_1 − y_2)²).
Step 4: after the distance transformation, pixel values inside the local clothing contour are 0. Pixel values outside the contour are amplified by a power operation to enlarge the distance gap between inside and outside, forming the contour feature:
D′(p) = D(p)^n (9)
where n is at least 2.
Step 5, specifically comprising:
step 5.1: the content loss is obtained by differencing the content features F^l of the random noise image x against the content features P^l of the content image p:
L_c = (1/2) ∑_{i,j} (F^l_ij − P^l_ij)² (10)
step 5.2: the style loss is obtained by differencing the style features G^l of the random noise image x against the style features A^l of the style image. The style loss E_l of layer l is defined as:
E_l = 1/(4 N_l² M_l²) ∑_{i,j} (G^l_ij − A^l_ij)² (11)
The style loss over all layers of the CNN is defined by:
L_s = ∑_l w_l E_l (12)
where w_l is the weight of the style loss of each CNN layer.
Step 5.3: define the contour feature of the random noise image after distance transformation as the matrix D̂, and the contour feature of the input local clothing contour map after distance transformation as the matrix D. The contour loss L_d is defined as:
L_d = ∑_{i,j} (D̂_ij − D_ij)² (13)
the step 6 specifically comprises the following steps:
step 6.1: α, β, γ, r are four weight coefficients, and L_TV is a total variation regularization term introduced to suppress the noise generated during style migration and smooth the boundary of the local clothing contour:
L_TV = ‖D_x‖² / N_x + ‖D_y‖² / N_y (14)
L_total = α L_c + β L_s + γ L_d + r L_TV (15)
where D_x and D_y are the horizontal and vertical differences of the result image, and N_x and N_y are the numbers of elements in the corresponding difference results.
Step 6.2:
the loss function is minimized using gradient descent. The derivative of the content loss function is:
∂L_c/∂F^l_ij = (F^l − P^l)_ij if F^l_ij > 0, and 0 if F^l_ij < 0 (16)
the derivative of the style loss function is:
∂E_l/∂F^l_ij = (1/(N_l² M_l²)) [(F^l)^T (G^l − A^l)]_ji if F^l_ij > 0, and 0 if F^l_ij < 0 (17)
step 6.3: update the network weights to minimize the loss, generating a new clothing style in which the picture style is fused only with the local clothing. As shown in FIG. 5, (e) and (f) are the local migration results generated from content FIGS. 2(a) and 2(b) with style FIG. 4(a), and (g) and (h) are the local migration results generated from content FIGS. 2(c) and 2(d) with style FIG. 4(b).
Although illustrative embodiments of the present invention have been described above to help those skilled in the art understand the invention, it should be understood that the invention is not limited to the scope of these embodiments. To a person of ordinary skill in the art, various changes are permissible within the spirit and scope of the invention as defined and limited by the appended claims, and all inventions utilizing the inventive concept set forth herein are protected.
Claims (7)
1. A CNN-based user interactive image local clothing style migration method is characterized by comprising the following steps:
step 1: taking a clothing image as the content image and a picture as the style image, and inputting them into a CNN network for feature mapping to obtain content features and style features;
step 2: performing interactive image segmentation with the GrabCut algorithm: framing, with a rectangle, the local clothing in the content image of step 1 that is to undergo style migration and marking it as unknown; marking the region outside the local clothing as background; computing, from a Gaussian mixture model, the probability that each unknown pixel in the rectangular frame belongs to the background or the target, thereby segmenting the image into background and target; and extracting the target in the rectangular frame, i.e. the local clothing, to generate a local clothing contour map;
step 3: converting the local clothing contour map of step 2 into a binary image and performing a distance transformation with the Euclidean distance formula to obtain a distance-transformation matrix;
step 4: after the distance transformation of step 3, the pixel values inside the local clothing contour are 0; the pixel values outside the contour are amplified by a power operation to increase the distance difference between inside and outside, forming the contour features;
step 5: obtaining the content loss, style loss and contour loss by differencing the features of the random noise map against the content features and style features obtained in step 1 and the contour features obtained in step 4, respectively;
step 6: weighting and summing the three losses, adding a regularization penalty term, and finally updating the network weights by gradient descent to minimize the loss and generate the result image.
2. The CNN-based user-interactive image partial apparel style migration method of claim 1, wherein: the step 1 specifically comprises the following steps:
step 1.1: a clothing image is input as the content image into the trained VGG-19 network model, and a random noise image is defined. The letter l denotes the layer index; the l-th layer has N_l convolution kernels, and its feature maps have size M_l. The features of the random noise image at layer l are expressed as a matrix F^l, where F^l_ij represents the activation value at position j on the i-th convolution kernel of the l-th layer of the CNN. The input content image is defined similarly; its features at layer l are expressed as P^l, where P^l_ij represents the activation value at position j on the i-th convolution kernel of the l-th layer of the CNN for the content image;
step 1.2: a picture is input as the style image into the trained VGG-19 network model. The style features are computed as inner products of different features from the same layer, defined by combining the features of several convolutional layers. A Gram matrix G^l is introduced, whose entry G^l_ij is the inner product of feature map i and feature map j at layer l of the random noise image, with the following formula:

G^l_ij = Σ_k F^l_ik F^l_jk (1)
where k represents the kth element of the feature map.
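The Gram-matrix computation of formula (1) can be sketched in NumPy as follows; the matrix sizes and random values stand in for real CNN activations and are illustrative, not taken from the patent.

```python
import numpy as np

# Feature matrix F^l of one layer: N_l filters by M_l spatial positions.
# Random values stand in for CNN activations.
N_l, M_l = 4, 9
rng = np.random.default_rng(0)
F = rng.standard_normal((N_l, M_l))

# Gram matrix: G_ij = sum_k F_ik * F_jk, i.e. inner products of feature maps.
G = F @ F.T

# G is symmetric, and its diagonal holds each feature map's squared norm.
assert np.allclose(G, G.T)
assert np.allclose(np.diag(G), (F ** 2).sum(axis=1))
```

Because G only records correlations between filter responses, it discards spatial layout, which is what makes it a style (texture) descriptor.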
3. The CNN-based user-interactive image partial apparel style migration method of claim 1, wherein:
in step 2, interactive image segmentation is performed with the GrabCut algorithm: the local clothing in the content image of step 1 that is to undergo style migration is framed with a rectangle and marked as unknown; the region outside the local clothing is marked as background; the probability that each unknown pixel in the rectangular frame belongs to the background or the target is computed from a Gaussian mixture model, thereby segmenting the image into background and target; and the target in the rectangular frame, i.e. the local clothing, is extracted to generate a local clothing contour map;
the gray values of the original grayscale image are represented as z = (z_1, z_2, …, z_n), where z_n denotes the gray value of the n-th pixel. The value of each pixel is represented by an opacity α = (α_1, α_2, …, α_n), with α ∈ [0, 1]; a value of 0 represents the background of the image and a value of 1 represents the foreground. The algorithm models the foreground and the background of the color image with Gaussian Mixture Models (GMMs); each GMM is treated as a K-component mixture, and k = (k_1, k_2, …, k_n), k_n ∈ {1, 2, …, K} indicates which Gaussian component each pixel belongs to. The formula of the Gibbs energy function of the GrabCut algorithm is as follows:
E(α,k,θ,z)=U(α,k,θ,z)+V(α,z) (2)
where E is the Gibbs energy, U is the data term of the energy function, representing the negative logarithm of the probability that a pixel belongs to the target or the background, V is the smoothing term of the energy function, and θ = {h(z; α), α = 0, 1} is the gray-value histogram describing the distribution of the gray values z of the foreground and the background; the data term U is defined as follows:
U(α, k, θ, z) = Σ_n D(α_n, k_n, θ, z_n) (3)
wherein the area term D is defined by the following formula:
D(α_n, k_n, θ, z_n) = −log p(z_n | α_n, k_n, θ) − log π(α_n, k_n) (4)
where the function p(·) is a Gaussian probability distribution function and π(·) is the mixture weight of each Gaussian component in the whole model; expanding further gives:

D(α_n, k_n, θ, z_n) = −log π(α_n, k_n) + ½ log det Σ(α_n, k_n) + ½ [z_n − μ(α_n, k_n)]^T Σ(α_n, k_n)^{−1} [z_n − μ(α_n, k_n)] (5)
where Σ (α)n,kn) Is covariance matrix, det is determinant symbol; the parameter vector θ for GMM is represented as follows:
θ = {π(α, k), μ(α, k), Σ(α, k)}, α = 0, 1, k = 1 … K (6)
where π is the mixture weight, μ is the mean of the Gaussian component, and Σ is the covariance; the smoothing-term function V is defined as follows:
V(α, z) = γ Σ_{(m,n)∈C} [α_n ≠ α_m] exp(−β‖z_m − z_n‖²) (7)
where the parameter γ weights the degree of smoothing; m, n index neighboring pixel pairs in the set C of image pixels; ‖z_m − z_n‖² measures the difference between neighboring pixel values. The parameter β is determined by the contrast of the image: if the contrast is low, a large β is chosen to amplify the pixel differences, and if the contrast is high, a small β is chosen to reduce them.
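The GMM data term of formulas (4)–(5) can be sketched for a single pixel; a 1-D gray value with a scalar variance is used here instead of the patent's full color covariances, and all parameter values are illustrative.

```python
import numpy as np

def data_term(z_n, pi, mu, var):
    """GrabCut-style data term for one pixel:
    D = -log pi - log p(z | mu, var), with p a 1-D Gaussian.
    (The patent uses full covariance matrices; a scalar variance
    keeps this sketch short.)"""
    gauss = np.exp(-0.5 * (z_n - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    return -np.log(pi) - np.log(gauss)

# A pixel whose gray value sits near the component mean has lower
# energy (is a better fit) than one far from the mean.
near = data_term(z_n=0.5, pi=0.3, mu=0.5, var=0.01)
far = data_term(z_n=0.9, pi=0.3, mu=0.5, var=0.01)
assert near < far
```

In the full algorithm this term is summed over all pixels (formula (3)) and combined with the smoothing term V to build the graph-cut energy.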
4. The CNN-based user-interactive image partial apparel style migration method of claim 1, wherein:
in step 3, the local clothing contour map is converted into a binary image, and the Euclidean distance between the inside and the outside of the contour is computed to obtain a distance-transformation matrix, defined as D; the distance-transformation formula is as follows:
D(p) = min(dist(p, q)), p ∈ O, q ∈ B (8)
where pixel p belongs to the image region O inside the local clothing contour and has coordinates (x_1, y_1), and pixel q belongs to the region B outside the contour and has coordinates (x_2, y_2); dist(p, q) = √((x_1 − x_2)² + (y_1 − y_2)²).
5. The CNN-based user-interactive image partial apparel style migration method of claim 1, wherein:
in step 4, the pixel values inside the local clothing contour after the distance transformation are 0; the pixel values outside the contour are amplified by a power operation to increase the distance difference between inside and outside and form the contour features, the formula being as follows:

D̂(p) = D(p)^n (9)

where n is at least 2.
6. The CNN-based user-interactive image partial apparel style migration method of claim 1, wherein: the step 5 specifically includes:
step 5.1: random noise imageContent characteristics ofAnd content mapContent characteristics ofThe difference is taken to obtain the content loss, and the formula is as follows:
step 5.2: random noise imageStyle characteristics ofAnd style sheetStyle characteristics ofMaking difference to obtain style loss, defining style loss E of layerlThe formula is as follows:
the style loss for all layers of the CNN is defined by the following formula:
where w_l represents the weight of the style loss of layer l of the CNN;
step 5.3: the contour features of the random noise image after distance transformation are defined as a matrix X_d, and the contour features of the input local clothing contour map after distance transformation are defined as a matrix C_d; the contour loss L_d is defined by the following formula:

L_d = ½ Σ_{i,j} ((X_d)_ij − (C_d)_ij)² (13)
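The three losses of claim 6 can be sketched in NumPy on toy feature matrices. The names F, P, G, A, X_d, C_d mirror the symbols above; the random values and sizes are illustrative stand-ins for real CNN features and distance-transform maps.

```python
import numpy as np

rng = np.random.default_rng(1)
N_l, M_l = 3, 8
F = rng.standard_normal((N_l, M_l))    # noise-image features at layer l
P = rng.standard_normal((N_l, M_l))    # content-image features at layer l

# Content loss (10): half the squared feature difference.
L_c = 0.5 * ((F - P) ** 2).sum()

# Style loss at layer l (11): normalized squared Gram-matrix difference.
G = F @ F.T                            # Gram matrix of the noise image
S = rng.standard_normal((N_l, M_l))    # style-image features at layer l
A = S @ S.T                            # Gram matrix of the style image
E_l = ((G - A) ** 2).sum() / (4 * N_l ** 2 * M_l ** 2)

# Contour loss (13): squared difference of distance-transformed contours.
X_d = rng.random((5, 5))               # noise image after distance transform
C_d = rng.random((5, 5))               # content contour after distance transform
L_d = 0.5 * ((X_d - C_d) ** 2).sum()

assert L_c > 0 and E_l > 0 and L_d > 0
```

These three scalars are the quantities that step 6 combines with the weights α, β, γ before adding the total-variation penalty.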
7. the CNN-based user-interactive image partial apparel style migration method of claim 1, wherein: the step 6 specifically includes:
step 6.1: α, β, γ, r are four weight coefficients, and L_TV is an introduced total variation regularization term whose role is to suppress the noise generated during style migration and smooth the boundary of the local clothing contour, the formula being as follows:
L_total = αL_c + βL_s + γL_d + rL_TV (15)
where D_x and D_y denote the horizontal and vertical differences of the generated image, respectively, and |D_x| and |D_y| denote the number of elements in each difference result; the total variation term takes the form L_TV = ‖D_x‖²/|D_x| + ‖D_y‖²/|D_y|;
step 6.2: the loss function is minimized using gradient descent; the derivative of the content loss function is:

∂L_c/∂F^l_ij = (F^l − P^l)_ij if F^l_ij > 0, and 0 if F^l_ij < 0

the derivative of the style loss function is:

∂E_l/∂F^l_ij = (1/(N_l² M_l²)) ((F^l)^T (G^l − A^l))_ji if F^l_ij > 0, and 0 if F^l_ij < 0;
step 6.3: the network weights are updated to minimize the loss, generating a new clothing image in which the picture style is fused only with the local clothing.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010628294.8A CN111768335B (en) | 2020-07-02 | 2020-07-02 | CNN-based user interactive image local clothing style migration method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111768335A true CN111768335A (en) | 2020-10-13 |
CN111768335B CN111768335B (en) | 2023-08-04 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180068463A1 (en) * | 2016-09-02 | 2018-03-08 | Artomatix Ltd. | Systems and Methods for Providing Convolutional Neural Network Based Image Synthesis Using Stable and Controllable Parametric Models, a Multiscale Synthesis Framework and Novel Network Architectures |
CN109829537A (en) * | 2019-01-30 | 2019-05-31 | 华侨大学 | Style transfer method and equipment based on deep learning GAN network children's garment clothes |
CN110111291A (en) * | 2019-05-10 | 2019-08-09 | 衡阳师范学院 | Based on part and global optimization blending image convolutional neural networks Style Transfer method |
CN110490791A (en) * | 2019-07-10 | 2019-11-22 | 西安理工大学 | Dress ornament Graphic Arts generation method based on deep learning Style Transfer |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112508966A (en) * | 2020-10-27 | 2021-03-16 | 北京科技大学 | Interactive image segmentation method and system |
CN113160033A (en) * | 2020-12-28 | 2021-07-23 | 武汉纺织大学 | Garment style migration system and method |
CN113160033B (en) * | 2020-12-28 | 2023-04-28 | 武汉纺织大学 | Clothing style migration system and method |
CN115205167A (en) * | 2021-04-12 | 2022-10-18 | 北京字跳网络技术有限公司 | Image processing method and device |
WO2023151299A1 (en) * | 2022-02-11 | 2023-08-17 | 华为云计算技术有限公司 | Data generation method and apparatus, device, and storage medium |
CN114782653A (en) * | 2022-06-23 | 2022-07-22 | 杭州彩连科技有限公司 | Method and system for automatically expanding dress design layout |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||