CN112802160B - Improved U-GAT-IT-based method for cartoon style transfer of Qinqiang opera characters


Info

Publication number
CN112802160B
Authority
CN
China
Prior art keywords
image
gat
cartoon
qin
qin cavity
Prior art date
Legal status
Active
Application number
CN202110037379.3A
Other languages
Chinese (zh)
Other versions
CN112802160A (en)
Inventor
耿国华
马星锐
刘喆
周明全
冯龙
李启航
刘晓宁
刘阳洋
Current Assignee
NORTHWEST UNIVERSITY
Original Assignee
NORTHWEST UNIVERSITY
Priority date: 2021-01-12
Filing date: 2021-01-12
Publication date: 2023-10-17
Application filed by NORTHWEST UNIVERSITY
Priority to CN202110037379.3A
Publication of CN112802160A: 2021-05-14
Application granted
Publication of CN112802160B: 2023-10-17

Classifications

    • G06T 13/40: 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings (under G06T 13/00 Animation; G06T 13/20 3D animation)
    • G06N 3/045: Combinations of networks (under G06N 3/00 Computing arrangements based on biological models; G06N 3/04 Architecture, e.g. interconnection topology)
    • G06N 3/084: Backpropagation, e.g. using gradient descent (under G06N 3/08 Learning methods)
    • G06T 7/11: Region-based segmentation (under G06T 7/00 Image analysis; G06T 7/10 Segmentation; Edge detection)
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The invention discloses an improved U-GAT-IT-based method for cartoon style transfer of Qinqiang opera characters. First, a dedicated Qinqiang character cartoon dataset is built, with preprocessing operations such as resizing the collected face pictures. Second, two stacked up-down sampling convolution modules are added before the encoder and after the decoder of the U-GAT-IT network, gradually improving the model's ability to abstract and reconstruct features. In addition, a face loss function is added to the U-GAT-IT network to constrain the newly generated Qinqiang character cartoon image so that its identity information stays as consistent as possible with the input face. Finally, the network model is built with PyTorch, and the processed face image is fed into the network to obtain a cartoon image with Qinqiang character features. The method solves the problem that the identity information of the original image is easily lost during cartoon style transfer, alleviates U-GAT-IT's need for large datasets, and renders the details of the cartoon image more accurately.

Description

Improved U-GAT-IT-based method for cartoon style transfer of Qinqiang opera characters
Technical Field
The invention belongs to the technical field of computer graphics processing, and particularly relates to an improved U-GAT-IT-based method for cartoon style transfer of Qinqiang opera characters.
Background
Qinqiang opera is one of the oldest Chinese theatrical genres and is often called the progenitor of Chinese drama. Its role system largely inherits that of Yuan zaju drama and the "twelve jianghu roles" of Ming-dynasty chuanqi; broadly speaking, it has four main role types: Sheng (male), Dan (female), Jing (painted face) and Chou (clown). With the rapid development of the modern economy, the entertainment habits of the general public have changed: the original entertainment function of Qinqiang opera has weakened, its audience has aged, and the art faces a crisis of survival and development. By transferring face images into cartoon-style images on the basis of U-GAT-IT, computers can help revitalize Qinqiang opera as intangible cultural heritage, achieve cultural innovation of the art, and let people regain interest in traditional opera in a more modern way.
Here, U-GAT-IT refers to Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation.
Face cartoon style transfer converts a real face image into a non-photorealistic cartoon-style image while keeping the identity information of the original image unchanged and preserving as much texture detail as possible. A cartoon image consists of simplified lines with clear texture and smooth color blocks. However, existing techniques easily run into the following problems during cartoonization: loss of the face's identity information and image details, great difficulty in collecting cartoon image data, and the high cost of producing cartoon images.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention aims to provide an improved U-GAT-IT-based method for cartoon style transfer of Qinqiang opera characters, which solves the problem that the identity information of the original image is easily lost during cartoon style transfer, alleviates U-GAT-IT's need for large datasets, and renders the details of the cartoon image more accurately.
In order to achieve the above purpose, the invention adopts the following technical scheme:
a U-GAT-IT-based improved method for migrating the cartoon style of a Qin cavity character comprises the following steps of;
step 1: acquiring a cartoon data set of a Qin cavity role and a face picture set;
step 2: constructing an original U-GAT-IT network frame by using a pyrach, wherein the U-GAT-IT network frame comprises a Generator and a Discriminator, and one face image acquired in the step 1 is input into the U-GAT-IT network frame as source domain images X-X S Simultaneously inputting the cartoon image of the Qin cavity character obtained in the step 1 as the target domain images X-X y
Step 3: the source domain image and the target domain image obtained in the step 2 are processed through two stacked up-down sampling convolution modules, the characteristics of the source domain image and the target domain image are extracted under the condition that the semantic information position is unchanged, and then the characteristics of the source domain image and the target domain image are processed through an encoder E in a Generator S Coding to obtain a coded characteristic diagram
Step 4: the feature map obtained in the step 3 is passed through an auxiliary classifier eta of two classifications in a Generator s Classifying to distinguish whether the image comes from the source domain or the target domain to obtain the weight information of each feature map
Step 5: multiplying the weight information of each feature map obtained in the step 4 with the encoded feature map to obtain a feature map with attention;
step 6: obtaining a class activation diagram from the attention characteristic diagram obtained in the step 5 through a convolution and activation function layer of 1*1
Wherein a represents each class activation graph, w represents weight information of each feature graph, and E represents the encoded feature graph;
step 7: the class activation diagram obtained in the step 6 is processedThe decoder G in the Generator is obtained by the full connection layer fc t 2c parameters gamma and beta used for carrying out normalization treatment on AdaLIN in the Adaptive Layer-Instance Normalization Layer;
step 8: the gamma and beta obtained in the step 7 are further generated into an image through an adaptive residual block Adaptive Residual Blocks;
step 9: the image obtained in the step 8 is further generated into a target domain image from the embedded feature through two newly added stacked up-down sampling convolution modules;
step 10: judging the authenticity of the image by the CAM activation diagram in the Discriminator;
step 11: and (3) calculating the result judged in the step (10) through GAN Loss, cycle Loss, identity Loss, CAM Loss and Face Loss functions in the U-GAT-IT network framework to obtain the cartoon image of the character characteristics of the Qin cavity.
Further, the Instance Normalization and Layer Normalization formulas in AdaLIN in step 7 are:
â_I = (λ − μ_I) / √(δ_I² + ε),  â_L = (λ − μ_L) / √(δ_L² + ε)
where μ_I and δ_I are the mean and standard deviation computed per instance in the Instance Normalization mode, and μ_L and δ_L are the mean and standard deviation computed per layer in the Layer Normalization mode;
the normalized weighted-sum formula of Instance Normalization and Layer Normalization in AdaLIN is:
AdaLIN(λ, γ, β) = γ · (θ · â_I + (1 − θ) · â_L) + β
where λ is the attention feature map, γ is the scaling factor, and β is the offset; θ is a learnable weight updated by back propagation.
Further, the θ interval clipping formula is:
θ←clip[0,1](θ-τΔθ)
where τ is the learning rate.
Further, the two stacked up-down sampling convolution modules in step 9 contain four pairs of down-sampling blocks and up-sampling blocks that correspond to each other end to end; their purpose is to extract the shallow features of the face in the picture before it enters the backbone network, abstract the input image into a form that is easy to encode, and keep the positions of semantic information unchanged while extracting features.
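A minimal PyTorch sketch of such a module is given below, under one reading of this description: four mirrored pairs of down-sampling and up-sampling convolution blocks. The channel widths, kernel sizes and normalization choice are assumptions for illustration; the patent does not fix them.

    import torch
    import torch.nn as nn

    class StackedUpDownSampling(nn.Module):
        """Four mirrored down-/up-sampling convolution pairs (a sketch; widths assumed)."""
        def __init__(self, ch=3, width=32):
            super().__init__()
            downs, ups = [], []
            c_in = ch
            for i in range(4):
                c_out = width * 2 ** i
                downs.append(nn.Sequential(
                    nn.Conv2d(c_in, c_out, 3, stride=2, padding=1),
                    nn.InstanceNorm2d(c_out), nn.ReLU(inplace=True)))
                # mirror block, inserted in reverse so the ups undo the downs end to end
                ups.insert(0, nn.Sequential(
                    nn.ConvTranspose2d(c_out, c_in, 3, stride=2,
                                       padding=1, output_padding=1),
                    nn.InstanceNorm2d(c_in), nn.ReLU(inplace=True)))
                c_in = c_out
            self.downs = nn.ModuleList(downs)
            self.ups = nn.ModuleList(ups)

        def forward(self, x):
            for d in self.downs:   # abstract the image into an easy-to-encode form
                x = d(x)
            for u in self.ups:     # restore resolution, keeping semantic positions
                x = u(x)
            return x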
Further, the Face Loss function in step 11 is:
L_face = 1 − F(P_source) · F(P_target) / (‖F(P_source)‖ · ‖F(P_target)‖)
where P_source is the source domain image, P_target is the target domain image, and F(·) denotes the extracted face features; L_face uses the cosine distance to constrain the features of the source domain image and the features of the target domain image.
Further, the Qinqiang character cartoon dataset in step 1 is a Huadan (young female role) cartoon dataset, and the face picture set is a set of female face pictures.
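Collecting and resizing these pictures is ordinary preprocessing; the Python/PIL sketch below shows one way it might be done, using the 512 × 512 cartoon size and 256 × 256 face size given in the detailed embodiment. The folder names and JPEG-only glob are assumptions for illustration.

    from pathlib import Path
    from PIL import Image

    def resize_folder(src_dir, dst_dir, size):
        """Resize every image in src_dir to `size` (width, height) and save to dst_dir."""
        dst = Path(dst_dir)
        dst.mkdir(parents=True, exist_ok=True)
        for img_path in Path(src_dir).glob("*.jpg"):
            with Image.open(img_path) as img:
                img.convert("RGB").resize(size, Image.BICUBIC).save(dst / img_path.name)

    # Hypothetical layout: Huadan cartoons to 512 x 512, face photos to 256 x 256.
    resize_folder("raw/huadan", "trainB", (512, 512))
    resize_folder("raw/faces", "trainA", (256, 256))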
Compared with the prior art, the invention has the following beneficial effects:
the invention discloses a method for transferring cartoon style of a Qin cavity character based on U-GAT-IT improvement, which comprises the steps of firstly obtaining a cartoon data set and a face picture set of the Qin cavity character, and secondly adding two stacked up-down sampling convolution modules before an encoder and after a decoder on the basis of a U-GAT-IT network so as to gradually improve the characteristic abstract and reconstruction capability of a model. In addition, a face loss function is added into the U-GAT-IT network, and the newly generated cartoon diagram of the Qin cavity role is restrained, so that the identity information of the cartoon diagram of the Qin cavity role is kept consistent with the input face information as much as possible. And finally, constructing a network model through a pytorch, and transmitting the processed face image into the network to obtain the cartoon graph with the characteristic of the Qin cavity role. The method solves the problem that the identity information of the original image is easy to lose when the cartoon image is migrated in style, improves the defect that a large number of data sets are needed by U-GAT-IT, and can more accurately outline the detailed information of the cartoon image.
Drawings
FIG. 1 is a Qinqiang Huadan cartoon image produced by the invention;
FIG. 2 is a network block diagram of the present invention;
FIG. 3 is a block diagram of the stacked up-down sampling convolution module of the present invention;
FIG. 4 is a network architecture diagram within the Generator of the present invention;
FIG. 5 is a network structure diagram of the inside of the Discriminator of the present invention;
FIG. 6 is a block diagram of the AdaLIN adaptive layer-instance normalization of the present invention;
FIG. 7 shows the results of Qinqiang character cartoon style transfer through the improved U-GAT-IT network.
Detailed Description
The present invention will be described in further detail with reference to specific examples, but is not limited thereto.
The invention provides an improved U-GAT-IT-based method for cartoon style transfer of Qinqiang opera characters, which comprises the following steps:
Step 1: the Qinqiang Huadan cartoon dataset is produced by collecting Huadan pictures from Qinqiang opera websites, with a picture size of 512 × 512 pixels; female face pictures are collected and sized to 256 × 256 pixels (see FIG. 1).
Step 2: the original U-GAT-IT network framework (see FIG. 2) is constructed with PyTorch; the framework comprises a Generator (see FIG. 4) and a Discriminator (see FIG. 5). A face image acquired in step 1 is input into the U-GAT-IT network framework as the source domain image x ∈ X_s, while a Qinqiang character cartoon image obtained in step 1 is input as the target domain image x ∈ X_t.
Step 3: the source domain image and the target domain image obtained in step 2 are processed through two stacked up-down sampling convolution modules (see FIG. 3), which extract their features while keeping the positions of semantic information unchanged; the features are then encoded by the encoder E_s in the Generator to obtain the encoded feature map E_s(x).
Step 4: the feature map from step 3 is classified by the binary auxiliary classifier η_s in the Generator to distinguish whether the image comes from the source domain or the target domain, yielding the weight w_k of each feature map.
Step 5: the weight of each feature map obtained in step 4 is multiplied with the encoded feature map to obtain the attention feature map.
Step 6: the attention feature map obtained in step 5 is passed through a 1×1 convolution and activation layer to obtain the class activation map
a(x) = Σ_k w_k · E_k(x)
where a denotes the class activation map, w_k the weight of the k-th feature map, and E_k the k-th channel of the encoded feature map.
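Steps 4 to 6 can be summarized in code. The sketch below follows the global-average-pooling branch of the U-GAT-IT CAM design: the auxiliary classifier η_s produces a domain logit, its per-channel weights w_k re-weight the encoded feature map, and a 1×1 convolution with an activation yields the class activation map. The channel count and the single-branch layout are simplifying assumptions (the published U-GAT-IT also uses a max-pooling branch).

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CAMAttention(nn.Module):
        """Steps 4-6: classifier weights w_k re-weight E_k(x); a 1x1 conv gives a(x)."""
        def __init__(self, channels=256):
            super().__init__()
            self.aux_fc = nn.Linear(channels, 1, bias=False)     # eta_s: source/target
            self.conv1x1 = nn.Conv2d(channels, channels, kernel_size=1)

        def forward(self, feat):
            pooled = F.adaptive_avg_pool2d(feat, 1).flatten(1)   # [B, C]
            logit = self.aux_fc(pooled)                          # domain prediction (step 4)
            w = self.aux_fc.weight.view(1, -1, 1, 1)             # one weight w_k per channel
            attn = feat * w                                      # attention feature map (step 5)
            cam = torch.relu(self.conv1x1(attn))                 # class activation map a(x) (step 6)
            return cam, logit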
Step 7: the class activation map obtained in step 6 is passed through the fully connected layer fc to obtain the 2c parameters γ and β used by AdaLIN (see FIG. 6), the Adaptive Layer-Instance Normalization layer in the decoder G_t of the Generator.
the normalized formulas of Instance Normalization and Layer Normalization in AdaLIN are:
wherein mu I And delta I Mean and standard deviation, mu, calculated for the Instance Normalization example normalization L And delta L Mean and standard deviation were determined for the Layer Normalization layer normalization approach.
The normalized weighted sum formula for Instance Normalization and Layer Normalization in AdaLIN is:
where λ is the feature map with attention, λ is the scaling factor, and β is the offset. θ is a learning weight, updated by back propagation.
In order to prevent θ from exceeding the range [0, 1], θ is subjected to interval clipping with the clipping formula
θ←clip[0,1](θ-τΔθ)
Where τ is the learning rate.
Step 8: γ and β obtained in step 7 are passed through the adaptive residual blocks (Adaptive Residual Blocks) to further generate an image.
Step 9: the image obtained in step 8 is passed through the two newly added stacked up-down sampling convolution modules to generate the target domain image from the embedded features.
Step 10: the CAM activation map in the Discriminator judges the authenticity of the image.
Step 11: the result judged in step 10 is computed through the GAN Loss, Cycle Loss, Identity Loss, CAM Loss and Face Loss functions in the U-GAT-IT network framework to obtain a cartoon image with Qinqiang Huadan features (see FIG. 7).
The Face Loss function is:
L_face = 1 − F(P_source) · F(P_target) / (‖F(P_source)‖ · ‖F(P_target)‖)
where P_source is the source domain image, P_target is the target domain image, and F(·) denotes the extracted face features. L_face uses the cosine distance to constrain the features of the source domain image and the features of the target domain image. The Face Loss term constrains the newly generated Qinqiang Huadan cartoon image so that its identity information stays as consistent as possible with the input face information.
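A sketch of the Face Loss is shown below, assuming a pretrained face recognition embedder face_net (the patent does not name a specific network) that maps an image to an identity feature vector:

    import torch.nn.functional as F

    def face_loss(p_source, p_target, face_net):
        """Cosine-distance identity constraint between the input face and the cartoon."""
        f_src = face_net(p_source)    # features F(P_source) of the input face
        f_tgt = face_net(p_target)    # features F(P_target) of the generated cartoon
        return 1.0 - F.cosine_similarity(f_src, f_tgt, dim=1).mean()

    # Combined with the other U-GAT-IT losses (the weight lambda_face is an assumption):
    # total = gan_loss + cycle_loss + identity_loss + cam_loss + lambda_face * face_loss(...)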
In summary, this embodiment achieves style transfer from a face image to a Qinqiang Huadan cartoon while keeping the face information and the identity information of the Qinqiang Huadan cartoon consistent. It addresses the difficulty of acquiring traditional cartoon image data and the high cost of producing cartoon images, solves the problem that the identity information of the original image is easily lost during style transfer, alleviates U-GAT-IT's need for large datasets, and renders the details of the cartoon image more accurately.

Claims (6)

1. An improved U-GAT-IT-based method for cartoon style transfer of Qinqiang opera characters, characterized in that it comprises the following steps:
Step 1: acquiring a Qinqiang character cartoon dataset and a face picture set;
Step 2: constructing the original U-GAT-IT network framework with PyTorch, the framework comprising a Generator and a Discriminator; inputting a face image acquired in step 1 into the U-GAT-IT network framework as the source domain image x ∈ X_s, and simultaneously inputting a Qinqiang character cartoon image acquired in step 1 as the target domain image x ∈ X_t;
Step 3: processing the source domain image and the target domain image obtained in step 2 through two stacked up-down sampling convolution modules, extracting their features while keeping the positions of semantic information unchanged, and then encoding the features with the encoder E_s in the Generator to obtain the encoded feature map E_s(x);
Step 4: classifying the feature map obtained in step 3 with the binary auxiliary classifier η_s in the Generator to distinguish whether the image comes from the source domain or the target domain, obtaining the weight w_k of each feature map;
Step 5: multiplying the weight of each feature map obtained in step 4 with the encoded feature map to obtain the attention feature map;
Step 6: passing the attention feature map obtained in step 5 through a 1×1 convolution and activation layer to obtain the class activation map
a(x) = Σ_k w_k · E_k(x)
where a denotes the class activation map, w_k the weight of the k-th feature map, and E_k the k-th channel of the encoded feature map;
Step 7: passing the class activation map obtained in step 6 through the fully connected layer fc to obtain the 2c parameters γ and β used by AdaLIN, the Adaptive Layer-Instance Normalization layer in the decoder G_t of the Generator;
Step 8: passing γ and β obtained in step 7 through the adaptive residual blocks (Adaptive Residual Blocks) to further generate an image;
Step 9: passing the image obtained in step 8 through the two newly added stacked up-down sampling convolution modules to generate the target domain image from the embedded features;
Step 10: judging the authenticity of the image by the CAM activation map in the Discriminator;
Step 11: computing the result judged in step 10 through the GAN Loss, Cycle Loss, Identity Loss, CAM Loss and Face Loss functions in the U-GAT-IT network framework to obtain a cartoon image with Qinqiang character features.
2. The improved U-GAT-IT-based method for cartoon style transfer of Qinqiang opera characters according to claim 1, characterized in that the Instance Normalization and Layer Normalization formulas in AdaLIN in step 7 are:
â_I = (λ − μ_I) / √(δ_I² + ε),  â_L = (λ − μ_L) / √(δ_L² + ε)
where μ_I and δ_I are the mean and standard deviation computed per instance in the Instance Normalization mode, and μ_L and δ_L are the mean and standard deviation computed per layer in the Layer Normalization mode;
the normalized weighted-sum formula of Instance Normalization and Layer Normalization in AdaLIN is:
AdaLIN(λ, γ, β) = γ · (θ · â_I + (1 − θ) · â_L) + β
where λ is the attention feature map, γ is the scaling factor, and β is the offset; θ is a learnable weight updated by back propagation.
3. The improved U-GAT-IT-based method for cartoon style transfer of Qinqiang opera characters according to claim 2, characterized in that the θ interval clipping formula is:
θ←clip[0,1](θ-τΔθ)
where τ is the learning rate.
4. The improved U-GAT-IT-based method for cartoon style transfer of Qinqiang opera characters according to any one of claims 1 to 3, characterized in that the two stacked up-down sampling convolution modules in step 9 contain four pairs of down-sampling blocks and up-sampling blocks corresponding to each other end to end, so as to extract the shallow features of the face in the picture before it enters the backbone network, abstract the input image into a form that is easy to encode, and keep the positions of semantic information unchanged while extracting features.
5. The improved U-GAT-IT-based method for cartoon style transfer of Qinqiang opera characters according to any one of claims 1 to 3, characterized in that the Face Loss function in step 11 is:
L_face = 1 − F(P_source) · F(P_target) / (‖F(P_source)‖ · ‖F(P_target)‖)
where P_source is the source domain image, P_target is the target domain image, and F(·) denotes the extracted face features; L_face uses the cosine distance to constrain the features of the source domain image and the features of the target domain image.
6. The improved U-GAT-IT-based method for cartoon style transfer of Qinqiang opera characters according to any one of claims 1 to 3, characterized in that the Qinqiang character cartoon dataset in step 1 is a Huadan cartoon dataset, and the face picture set is a set of female face pictures.
CN202110037379.3A 2021-01-12 2021-01-12 Improved U-GAT-IT-based method for cartoon style transfer of Qinqiang opera characters Active CN112802160B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110037379.3A CN112802160B (en) 2021-01-12 2021-01-12 Improved U-GAT-IT-based method for cartoon style transfer of Qinqiang opera characters

Publications (2)

Publication Number Publication Date
CN112802160A CN112802160A (en) 2021-05-14
CN112802160B (en) 2023-10-17

Family

ID=75810121

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110037379.3A Active CN112802160B (en) 2021-01-12 2021-01-12 Improved U-GAT-IT-based method for cartoon style transfer of Qinqiang opera characters

Country Status (1)

Country Link
CN (1) CN112802160B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019042139A1 (en) * 2017-08-29 2019-03-07 京东方科技集团股份有限公司 Image processing method, image processing apparatus, and a neural network training method
CN109886881A (en) * 2019-01-10 2019-06-14 中国科学院自动化研究所 Face makeup removal method
WO2020168731A1 (en) * 2019-02-19 2020-08-27 华南理工大学 Generative adversarial mechanism and attention mechanism-based standard face generation method
CN110310221A (en) * 2019-06-14 2019-10-08 大连理工大学 Multi-domain image style transfer method based on generative adversarial networks
CN111243066A (en) * 2020-01-09 2020-06-05 浙江大学 Facial expression transfer method based on self-supervised learning and an adversarial generation mechanism
CN111667401A (en) * 2020-06-08 2020-09-15 武汉理工大学 Multi-level gradient image style migration method and system
CN111754446A (en) * 2020-06-22 2020-10-09 怀光智能科技(武汉)有限公司 Image fusion method, system and storage medium based on generative adversarial networks
CN112132208A (en) * 2020-09-18 2020-12-25 北京奇艺世纪科技有限公司 Image conversion model generation method and device, electronic equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation; Junho Kim et al.; arXiv; 2020-04-08; pp. 1-19 *
Cartoon stylization generation algorithm for key facial contour regions (关键人脸轮廓区域卡通风格化生成算法); Fan Linlong et al.; Journal of Graphics (图学学报); 2020-12-25; Vol. 42, No. 1, pp. 1-8 *

Also Published As

Publication number Publication date
CN112802160A (en) 2021-05-14


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant