CN113393370A - Method, system and intelligent terminal for migrating Chinese calligraphy character image styles


Info

Publication number
CN113393370A
Authority
CN
China
Prior art keywords: style, picture, discriminator, content, generator
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application number: CN202110616129.5A
Other languages: Chinese (zh)
Inventor
李康
张云朋
张妮
李文勇
耿国华
Current Assignee: Northwest University (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original Assignee: Northwest University
Application filed by Northwest University filed Critical Northwest University
Priority to CN202110616129.5A priority Critical patent/CN113393370A/en
Publication of CN113393370A publication Critical patent/CN113393370A/en
Pending legal-status Critical Current


Classifications

    • G06T3/04
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding

Abstract

The invention belongs to the technical field of character image processing and discloses a method, a system and an intelligent terminal for Chinese calligraphy character image style migration. n source-style pictures and n target-style pictures are rendered from a TTF font library and output in jpg format, giving 2n pictures as training samples. A generative adversarial network is trained on the source-style and target-style pictures, yielding the pictures produced by the generator and the discriminator's judgment of their authenticity. The prepared training set is fed into the adversarial network, and the generator and discriminator are updated iteratively: the generator produces fake targets to confuse the discriminator, while the discriminator learns to tell real targets from fake ones and improves its discrimination ability, until the two reach a Nash equilibrium, giving the calligraphy style migration model. The method removes the prior-art requirement that the input data of a Chinese calligraphy style migration model be strictly paired, and addresses the poor results and low generation efficiency of traditional Chinese calligraphy character migration models.

Description

Method, system and intelligent terminal for migrating Chinese calligraphy character image styles
Technical Field
The invention belongs to the technical field of character image processing and particularly relates to a method, a system and an intelligent terminal for migrating Chinese calligraphy character image styles.
Background
At present, converting an input content image into a target-style image raises many problems in Chinese calligraphy style migration, and these conversion problems have long been a focus of research.
Traditional Chinese calligraphy style migration generates character images by decomposing and recombining Chinese characters. Although this can achieve the migration effect, complex Chinese characters are difficult to separate and recombine, manual intervention is then required, and the process is very time-consuming. The best-known neural-network style migration models of recent years are Pix2Pix, Rewrite and Zi2Zi. Their construction ideas differ slightly, but all of them are built on generative adversarial networks. Pix2Pix achieves style migration by training the model in a supervised manner on paired training data; Rewrite applies a convolutional neural network to font image generation; Zi2Zi achieves calligraphy style migration by distinguishing multiple input content and style images through category labels. However, large amounts of paired training data are hard to obtain in practice, and current models neither generalize well to new styles nor meet the requirements of all Chinese calligraphy style migration tasks.
In Chinese calligraphy style migration, structural differences arise when a content image is converted into an image of another style, and existing models can only learn from the given content image and the target-style image to realize the migration. If the content image and the style image could be encoded and processed separately, many complex problems could be solved, such as migrating the content image to a new style image; this is essential for realizing calligraphy style migration.
Through the above analysis, the problems and defects of the prior art are as follows: when migrating Chinese calligraphy styles, converting a content image into an image of another style involves structural differences; existing models can only learn from the given content image and the target-style image; the character style migration effect is poor under unsupervised conditions, so the calligraphy cannot be migrated well to a new style, local blurring and style confusion occur when migrating to a new style, and generation efficiency is low.
The difficulties in solving the above problems and defects are: acquiring source-style and target-style picture data sets; designing the content encoder and the style encoder as separate parts of a self-encoding network structure; and implementing the concrete code.
The significance of solving these problems and defects is as follows: the trained calligraphy character image style migration method can extract a calligrapher's style from existing authentic calligraphy works and use modern technology to generate calligraphy characters consistent with that style. This is of great significance for the inheritance of Chinese calligraphy culture and of great value for the virtual restoration of cultural relics bearing calligraphy characters.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a method, a system and an intelligent terminal for migrating Chinese calligraphy character image styles.
The invention is realized as follows: a method for migrating Chinese calligraphy character image styles comprises the following steps:
rendering n source-style pictures and n target-style pictures from a TTF font library and outputting them in jpg format, giving 2n pictures as training samples; this step supplies the sample data set for training the calligraphy style migration model and prepares for model training (a rendering sketch is given after these steps);
using the source-style pictures to provide content features and the target-style pictures to provide style features, and training a generative adversarial network on the source-style and target-style pictures to obtain the pictures generated by the generator and the discriminator's judgment of their authenticity;
feeding the prepared training set into the adversarial network and iteratively updating the generator and the discriminator, the generator producing fake targets to confuse the discriminator and the discriminator learning to tell real targets from fake ones, improving its discrimination ability, until the two reach a Nash equilibrium, yielding the calligraphy style migration model.
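By way of illustration only, the data-preparation step above could be realized as in the following sketch, which renders glyphs from two TTF files with the Pillow library. The font paths, output directories, image size and character range are hypothetical choices, not values fixed by the invention.

```python
# Hypothetical sketch: render the same n characters from a source-style and a
# target-style TTF font to JPG images, giving the 2n training samples.
from pathlib import Path
from PIL import Image, ImageDraw, ImageFont

def render_charset(ttf_path, chars, out_dir, size=224):
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    font = ImageFont.truetype(ttf_path, int(size * 0.8))
    for i, ch in enumerate(chars):
        img = Image.new("L", (size, size), color=255)    # white canvas
        draw = ImageDraw.Draw(img)
        left, top, right, bottom = draw.textbbox((0, 0), ch, font=font)
        x = (size - (right - left)) // 2 - left          # center the glyph
        y = (size - (bottom - top)) // 2 - top
        draw.text((x, y), ch, fill=0, font=font)
        img.convert("RGB").save(out / f"{i:05d}.jpg", "JPEG")

# Character range and font file names are placeholders.
chars = [chr(c) for c in range(0x4E00, 0x4E00 + 1000)]   # first 1000 CJK codepoints
render_charset("source_style.ttf", chars, "data/source") # n source-style pictures
render_charset("target_style.ttf", chars, "data/target") # n target-style pictures
```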
Further, the source-style pictures provide content features and the target-style pictures provide style features; the generative adversarial network is trained on them to obtain the generator's output pictures and the discriminator's authenticity judgment. This is implemented in the following steps:
(1) preprocessing the acquired data set, wherein the preprocessing comprises unifying the size of the picture and removing blank pictures;
(2) a generative adversarial network is set up, consisting of 1 generator G and 1 discriminator D; the generator comprises 2 encoders, denoted the content encoder Ec and the style encoder Es, and 1 decoder De, with 1 AdaIN module and 1 Mask module arranged between the encoders and the decoder; the network has 3 loss functions, namely the adversarial loss, the content loss and the style loss;
(3) the generator G is responsible for transforming the preprocessed source-style pictures into pictures of the target style and then updating the content loss and the style loss to obtain the generated pictures; the discriminator D takes the preprocessed target-style pictures and the pictures generated by G as input, updates the adversarial loss, and predicts whether each picture was generated by the generator or is a real target-style picture.
Further, the prepared training set is input into the adversarial network, the generator and discriminator are updated iteratively, the generator produces fake targets to confuse the discriminator, and the discriminator learns to identify real and fake targets, improving its discrimination ability, until the two reach a Nash equilibrium and the calligraphy style migration model is obtained. This is implemented in the following steps:
(1) any one of the preprocessed source-style pictures C is selected as a sample and input into the encoder Ec to obtain the source-style picture's feature x; any one of the preprocessed target-style pictures S is selected as a sample and input into the encoder Es to obtain the target-style picture's feature y;
(2) the obtained content feature x of the source-style picture and style feature y of the target-style picture are input into the AdaIN module, which first de-stylizes the content feature x to obtain the feature ω, with the specific formula:
ω = (x − μ(x)) / σ(x)    (1)
where x represents the content feature of the content image, y represents the style feature of the style image, μ(x) represents the mean of the content feature, σ(x) represents the standard deviation of the content feature, and ω represents the de-stylized feature produced by this whitening operation. Then the style of the target-style picture is blended into the source-style picture to obtain AdaIN(x, y), with the specific formula:
AdaIN(x, y) = ω · σ(y) + μ(y)    (2)
where y represents the style feature of the style image, σ(y) represents the standard deviation of the target-style image, μ(y) represents the mean of the target-style image, ω represents the de-stylized feature from the whitening operation in the first step, and AdaIN(x, y) represents the feature obtained by removing the original style of the source-style image and blending in the target style;
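Formulas (1) and (2) together amount to adaptive instance normalization. A minimal PyTorch sketch follows, assuming feature tensors of shape (N, C, H, W) with per-channel statistics; the eps term is our own numerical-stability assumption, not part of the patent.

```python
import torch

def adain(x: torch.Tensor, y: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    # x, y: feature maps from Ec and Es respectively, shape (N, C, H, W).
    mu_x = x.mean(dim=(2, 3), keepdim=True)            # μ(x)
    sigma_x = x.std(dim=(2, 3), keepdim=True) + eps    # σ(x)
    mu_y = y.mean(dim=(2, 3), keepdim=True)            # μ(y)
    sigma_y = y.std(dim=(2, 3), keepdim=True)          # σ(y)
    omega = (x - mu_x) / sigma_x                       # eq. (1): de-stylized feature ω
    return omega * sigma_y + mu_y                      # eq. (2): AdaIN(x, y)
```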
(3) the obtained source-style picture feature x, target-style picture feature y and the result AdaIN(x, y) are input into the Mask module to obtain the feature z, with the specific formula:
z=M(x,y)·x+[1-M(x,y)]·AdaIN(x,y) (3)
wherein M (x, y) represents a mask generated by a mask module, x represents content features extracted by a content encoder, y is style features extracted by a style encoder, AdaIN (x, y) represents a result of adaptive instance normalization, and z represents a result of fusing source style picture features, target style picture features and AdaIN (x, y) features;
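Continuing the sketch above, the fusion of formula (3) is a single element-wise gating step; M(x, y) is assumed here to already take values in [0, 1].

```python
def fuse(mask, x, adain_xy):
    # eq. (3): z = M(x, y) · x + [1 − M(x, y)] · AdaIN(x, y); `mask` is M(x, y),
    # assumed to lie in [0, 1] and to broadcast against the feature tensors.
    return mask * x + (1.0 - mask) * adain_xy
```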
(4) inputting the obtained characteristic z into a decoder De to generate a picture S' with a target style;
(5) the generated picture S' is input into the content encoder Ec and the style encoder Es to calculate the content loss and the style loss; both are computed with the L1 loss, with the specific formulas:

L_content = E[ ‖Ec(S') − x‖1 ]    (4)

L_style = E[ ‖Es(S') − y‖1 ]    (5)

where L_content represents the content loss and L_style the style loss; x represents the content features extracted by the content encoder Ec; Ec(S') represents the content features of the fake image S' generated by the generator, extracted by Ec; Es(S') represents the style features extracted after S' passes through the style encoder Es; and the expectations in (4) and (5) are taken over generated images S' obeying, in content and in style respectively, the probability distribution of the original style image S;
(6) the generated picture S' and the preprocessed target-style picture S are input into the discriminator D, and the adversarial loss L_adv is calculated, with the specific formula:

L_adv = E_x[log(D(x))] + E_S'[log(1 − D(S'))]    (6)

where L_adv represents the adversarial loss; x represents the content features of the source-style image S extracted by the content encoder Ec; D(x) represents the discriminator's output for the real input; D(S') represents the discriminator's output for the generated fake image S'; E_x denotes the expectation over the true data probability distribution; and E_S' denotes the expectation over the generated data probability distribution;
(7) the final total loss function is the sum of the generator loss and the discriminator loss, expressed as follows:

L_total = L_adv + α·L_content + β·L_style    (7)

where L_total denotes the total loss of the adversarial network, L_adv the adversarial loss of the discriminator, L_content the content loss of the generator, and L_style the style loss of the generator; α and β are the weights of the sub-loss functions. Through adversarial training the network parameters are continuously updated and the loss values of the generator and discriminator are optimized; the smaller the loss value, the more successful the training, i.e. the closer the style of the generated picture is to the target-style picture.
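Under the stated definitions, losses (4) to (7) could be written as in the following PyTorch sketch. The encoders Ec and Es, the discriminator D and the generated picture s_fake are assumed to exist; α and β are free hyperparameters whose values the patent does not fix.

```python
import torch
import torch.nn.functional as F

def generator_losses(Ec, Es, D, x, y, s_fake, alpha=1.0, beta=1.0):
    l_content = F.l1_loss(Ec(s_fake), x)          # eq. (4): L1 content loss
    l_style = F.l1_loss(Es(s_fake), y)            # eq. (5): L1 style loss
    # Generator's side of eq. (6): push D(s_fake) toward "real".
    l_adv_g = -torch.log(D(s_fake) + 1e-8).mean()
    return l_adv_g + alpha * l_content + beta * l_style   # eq. (7)

def discriminator_loss(D, s_real, s_fake):
    # Eq. (6): the discriminator maximizes log D(real) + log(1 − D(fake));
    # we minimize the negative, detaching s_fake so only D is updated here.
    return -(torch.log(D(s_real) + 1e-8).mean()
             + torch.log(1.0 - D(s_fake.detach()) + 1e-8).mean())
```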
It is a further object of the invention to provide a computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of:
rendering n source-style pictures and n target-style pictures from a TTF font library and outputting them in jpg format, giving 2n pictures as training samples;
using the source-style pictures to provide content features and the target-style pictures to provide style features, and training a generative adversarial network on them to obtain the pictures generated by the generator and the discriminator's authenticity judgment;
feeding the prepared training set into the adversarial network and iteratively updating the generator and the discriminator, the generator producing fake targets to confuse the discriminator and the discriminator learning to identify real and fake targets, until the two reach a Nash equilibrium, yielding the calligraphy style migration model.
Another object of the invention is to provide an information data processing terminal for implementing the method for migrating Chinese calligraphy character image styles.
Another object of the present invention is to provide a system for migrating Chinese calligraphy character image styles that implements the described method, the system comprising:
a training sample module for rendering n source-style pictures and n target-style pictures from a TTF font library and outputting them in jpg format, giving 2n pictures as training samples;
a training sample processing module for using the source-style pictures to provide content features and the target-style pictures to provide style features, and training a generative adversarial network on them to obtain the pictures generated by the generator and the discriminator's authenticity judgment;
and a calligraphy style migration model acquisition module for feeding the prepared training set into the adversarial network and iteratively updating the generator and the discriminator, the generator producing fake targets to confuse the discriminator and the discriminator learning to identify real and fake targets, until the two reach a Nash equilibrium, yielding the calligraphy style migration model.
Further, the generative adversarial network comprises a generator and a discriminator, the generator comprising a content encoder, a style encoder, an AdaIN module, a Mask module and a decoder;
the content encoder extracts the content features of the source-style pictures; the style encoder extracts the style features of the target-style pictures; the AdaIN module de-stylizes the content features and then blends in the target style; the Mask module is an attention-mechanism module that fuses the content features, the style features and the features produced by the AdaIN module; the decoder generates the target-style picture from the features processed by the Mask module; and the discriminator evaluates the generator's output, with the evaluation given according to the loss functions.
Further, the content encoder and the style encoder each consist of 5 coding blocks, each formed by corresponding convolution, pooling and activation layers. The first coding block contains two convolution layers with channel numbers 3 and 64; the second contains two convolution layers with channel numbers 64 and 128; the third contains two convolution layers with channel numbers 128 and 256; the fourth is formed by four convolution layers with channel numbers 256, 256 and 512; the fifth contains four convolution layers with channel numbers 512, 512 and 512. The activation function in the encoder is the LeakyReLU function. In this example, the feature-map sizes of the input picture after each stage are 224, 224, 112, 56, 28 and 14, respectively.
Further, the Mask module consists of 5 convolution layers and 1 deconvolution layer, with channel numbers 256, 512, 256 and 1472 respectively; the activation functions of the layers in the Mask module are LeakyReLU except for a final Tanh; the Mask module also uses dropout as a pruning strategy, with a dropout probability of 0.5.
Further, the decoder mirrors the structure of the encoder, also with 5 blocks in total; its convolution layers are the reverse of the encoder's, and the decoder restores the features by deconvolution to generate the target-style image. The activation function in the decoder is the ReLU function. The feature-map sizes obtained as the Mask module features pass through the decoder are 14, 28, 56, 112 and 224, respectively;
the discriminator consists of convolution layers, pooling layers and a fully connected layer connected in sequence, with the LeakyReLU activation function.
Combining all the above technical schemes, the advantages and positive effects of the invention are: based on separate content and style encoders, the content features of the source-style picture and the style features of the target-style picture are extracted respectively; an AdaIN module and a Mask module are introduced between the encoders and the decoder to process the extracted features; and the adversarial network is trained by continuous iterative updates, yielding the calligraphy style migration model. The method solves calligraphy character style migration without paired data while improving the accuracy of the migration. It is of great significance for the inheritance of Chinese calligraphy culture and of great value for the virtual restoration of cultural relics bearing calligraphy characters.
Drawings
Fig. 1 is a flowchart of a method for migrating Chinese calligraphy character image styles according to an embodiment of the present invention.
FIG. 2 is a schematic structural diagram of a system for transferring Chinese calligraphy character image styles according to an embodiment of the present invention;
in fig. 2: 1. training a sample module; 2. a training sample processing module; 3. and a calligraphy style migration model acquisition module.
Fig. 3 is a flowchart of an implementation of the method for migrating the style of chinese calligraphy character images according to the embodiment of the present invention.
Fig. 4(a) is a partial view of the source content data set provided by an embodiment of the present invention.
Fig. 4(b) is a partial view of the target style data set provided by an embodiment of the present invention.
Fig. 5 is a network structure diagram provided in the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Aiming at the problems in the prior art, the invention provides a method, a system and an intelligent terminal for migrating Chinese calligraphy character image styles, described in detail below with reference to the accompanying drawings.
As shown in Fig. 1, the method for migrating Chinese calligraphy character image styles provided by the invention comprises the following steps:
S101: rendering n source-style pictures and n target-style pictures from a TTF font library and outputting them in jpg format, giving 2n pictures as training samples;
S102: using the source-style pictures to provide content features and the target-style pictures to provide style features, and training a generative adversarial network on them to obtain the pictures generated by the generator and the discriminator's authenticity judgment;
S103: feeding the prepared training set into the adversarial network and iteratively updating the generator and the discriminator, the generator producing fake targets to confuse the discriminator and the discriminator learning to identify real and fake targets, until the two reach a Nash equilibrium, yielding the calligraphy style migration model.
Persons of ordinary skill in the art can also implement the method for migrating Chinese calligraphy character image styles with other steps; the method shown in Fig. 1 is merely a specific embodiment.
As shown in Fig. 2, the system for migrating Chinese calligraphy character image styles provided by the present invention comprises:
the training sample module 1, for rendering n source-style pictures and n target-style pictures from a TTF font library and outputting them in jpg format, giving 2n pictures as training samples;
the training sample processing module 2, for using the source-style pictures to provide content features and the target-style pictures to provide style features, and training a generative adversarial network on them to obtain the pictures generated by the generator and the discriminator's authenticity judgment;
and the calligraphy style migration model acquisition module 3, for feeding the prepared training set into the adversarial network and iteratively updating the generator and the discriminator, the generator producing fake targets to confuse the discriminator and the discriminator learning to identify real and fake targets, until the two reach a Nash equilibrium, yielding the calligraphy style migration model.
The technical solution of the present invention is further described below with reference to the accompanying drawings.
As shown in Fig. 3, the method for migrating Chinese calligraphy character image styles provided by the invention comprises the following steps:
step one: n source-style pictures and n target-style pictures are rendered from a TTF font library and output in jpg format, giving 2n pictures as training samples;
step two: the source-style pictures provide content features and the target-style pictures provide style features; the generative adversarial network is trained on them to obtain the generator's output pictures and the discriminator's authenticity judgment, implemented in the following steps:
step 2-1, preprocessing the data set acquired in step one, including unifying the picture sizes and removing blank pictures; part of the preprocessed data set is shown in Fig. 4;
step 2-2, setting up a generative adversarial network consisting of 1 generator G and 1 discriminator D, where the generator comprises 2 encoders, denoted the content encoder Ec and the style encoder Es, and 1 decoder De, with 1 AdaIN module and 1 Mask module arranged between the encoders and the decoder; the network has 3 loss functions, namely the adversarial loss, the content loss and the style loss; the network structure is shown in Fig. 5;
step 2-3, the generator G is responsible for transforming the source-style pictures preprocessed in step two into pictures of the target style and then updating the content loss and the style loss to obtain the generated pictures; the discriminator D takes the preprocessed target-style pictures and the pictures generated by G as input, updates the adversarial loss, and predicts whether each picture was generated by the generator or is a real target-style picture;
step three: the training set obtained in step 2-1 is input into the adversarial network of step 2-2, and the generator and discriminator are updated iteratively, the generator producing fake targets to confuse the discriminator and the discriminator learning to identify real and fake targets, improving its discrimination ability, until the two reach a Nash equilibrium and the calligraphy style migration model is obtained, implemented in the following steps:
step 3-1, selecting any one of the source-style pictures C preprocessed in step 2-1 as a sample and inputting it into the encoder Ec to obtain the source-style picture's feature x, and selecting any one of the target-style pictures S preprocessed in step 2-1 as a sample and inputting it into the encoder Es to obtain the target-style picture's feature y;
step 3-2, inputting the content feature x of the source-style picture and the style feature y of the target-style picture obtained in step 3-1 into the AdaIN module, which first de-stylizes the content feature x to obtain the feature ω, with the specific formula:
ω = (x − μ(x)) / σ(x)    (1)
where x represents the content feature of the content image, y represents the style feature of the style image, μ(x) represents the mean of the content feature, σ(x) represents the standard deviation of the content feature, and ω represents the de-stylized feature produced by this whitening operation. Then the style of the target-style picture is blended into the source-style picture to obtain AdaIN(x, y), with the specific formula:
AdaIN(x, y) = ω · σ(y) + μ(y)    (2)
where y represents the style feature of the style image, σ(y) represents the standard deviation of the target-style image, μ(y) represents the mean of the target-style image, ω represents the de-stylized feature from the whitening operation in the first step, and AdaIN(x, y) represents the feature obtained by removing the original style of the source-style image and blending in the target style;
step 3-3, inputting the source-style picture feature x and target-style picture feature y obtained in step 3-1 and the result AdaIN(x, y) obtained in step 3-2 into the Mask module to obtain the feature z, with the specific formula:
z=M(x,y)·x+[1-M(x,y)]·AdaIN(x,y) (3)
wherein M (x, y) represents a mask generated by a mask module, x represents content features extracted by a content encoder, y is style features extracted by a style encoder, AdaIN (x, y) represents a result of adaptive instance normalization, and z represents a result of fusing source style picture features, target style picture features and AdaIN (x, y) features;
step 3-4, inputting the characteristic z obtained in the step 3-3 into a decoder De to generate a picture S' of a target style;
step 3-5, inputting the picture S' generated in step 3-4 into the content encoder Ec and the style encoder Es and calculating the content loss and the style loss; both are computed with the L1 loss, with the specific formulas:

L_content = E[ ‖Ec(S') − x‖1 ]    (4)

L_style = E[ ‖Es(S') − y‖1 ]    (5)

where L_content represents the content loss and L_style the style loss; x represents the content features extracted by the content encoder Ec; Ec(S') represents the content features of the fake image S' generated by the generator, extracted by Ec; Es(S') represents the style features extracted after S' passes through the style encoder Es; and the expectations in (4) and (5) are taken over generated images S' obeying, in content and in style respectively, the probability distribution of the original style image S;
step 3-6, inputting the picture S' generated in step 3-4 and the target-style picture S preprocessed in step 2-1 into the discriminator D and calculating the adversarial loss L_adv, with the specific formula:

L_adv = E_x[log(D(x))] + E_S'[log(1 − D(S'))]    (6)

where L_adv represents the adversarial loss; x represents the content features of the source-style image S extracted by the content encoder Ec; D(x) represents the discriminator's output for the real input (the probability that the image is a real image); D(S') represents the discriminator's output for the generated fake image S' (the probability that the image is a real image); E_x denotes the expectation over the true data probability distribution; and E_S' denotes the expectation over the generated data probability distribution.
Step 3-7, the final total loss function is the sum of the generator loss of step 3-5 and the discriminator loss of step 3-6, expressed as follows:

L_total = L_adv + α·L_content + β·L_style    (7)

where L_total denotes the total loss of the adversarial network, L_adv the adversarial loss of the discriminator, L_content the content loss of the generator, and L_style the style loss of the generator; α and β are the weights of the sub-loss functions. Through adversarial training the network parameters are continuously updated and the loss values of the generator and discriminator are optimized; the smaller the loss value, the more successful the training, i.e. the closer the style of the generated picture is to the target-style picture.
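The alternating updates described in steps 3-1 to 3-7 can be sketched as below, reusing the generator_losses and discriminator_loss helpers from the earlier sketch. The generator, discriminator and data loader objects, the optimizer settings and the generator.Ec/generator.Es attributes are all illustrative assumptions; only the alternation of discriminator and generator updates follows the text.

```python
import torch

# Assumed to exist: generator, discriminator, loader, num_epochs,
# generator_losses(), discriminator_loss().
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.5, 0.999))

for epoch in range(num_epochs):
    for c_img, s_img in loader:                  # source-style / target-style batches
        s_fake = generator(c_img, s_img)         # Ec, Es, AdaIN, Mask, then De

        # Discriminator step: separate real target-style images from fakes.
        d_opt.zero_grad()
        d_loss = discriminator_loss(discriminator, s_img, s_fake)
        d_loss.backward()
        d_opt.step()

        # Generator step: fool the discriminator while keeping content and style.
        g_opt.zero_grad()
        x = generator.Ec(c_img)                  # content feature of C
        y = generator.Es(s_img)                  # style feature of S
        g_loss = generator_losses(generator.Ec, generator.Es,
                                  discriminator, x, y, s_fake)
        g_loss.backward()
        g_opt.step()
```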
In this example, the generative adversarial network comprises a generator and a discriminator. The generator comprises a content encoder, a style encoder, an AdaIN module, a Mask module and a decoder: the content encoder extracts the content features of the source-style picture; the style encoder extracts the style features of the target-style picture; the AdaIN module de-stylizes the content features and then blends in the target style; the Mask module is an attention-mechanism module that fuses the content features, the style features and the features produced by the AdaIN module; and the decoder generates the target-style picture from the features processed by the Mask module. The discriminator evaluates the generator's output, with the evaluation given according to the loss functions.
The content encoder and the style encoder have the same structure, shown in Table 1: 5 coding blocks in total, each formed by corresponding convolution, pooling and activation layers. The first coding block contains two convolution layers with channel numbers 3 and 64; the second contains two convolution layers with channel numbers 64 and 128; the third contains two convolution layers with channel numbers 128 and 256; the fourth is formed by four convolution layers with channel numbers 256, 256 and 512; the fifth contains four convolution layers with channel numbers 512, 512 and 512. The activation function in the encoder is the LeakyReLU function. In this example, the feature-map sizes of the input picture after each stage are 224, 224, 112, 56, 28 and 14, respectively.
Table 1 network architecture of the encoder
[Table 1 appears as an image in the original publication.]
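Read as a VGG-style network, the encoder description above could be sketched as follows. Where the text is ambiguous about repeated channel counts within a block, a VGG-19-like plan is assumed, so this is an interpretation rather than the exact network of Table 1.

```python
import torch.nn as nn

def conv_block(channels, pool=True):
    # One coding block: a run of 3x3 convolutions with LeakyReLU, optionally
    # followed by 2x2 max pooling that halves the spatial size.
    layers = []
    for cin, cout in zip(channels, channels[1:]):
        layers += [nn.Conv2d(cin, cout, kernel_size=3, padding=1),
                   nn.LeakyReLU(0.2, inplace=True)]
    if pool:
        layers.append(nn.MaxPool2d(2))
    return nn.Sequential(*layers)

# Shared structure assumed for the content encoder Ec and the style encoder Es.
encoder = nn.Sequential(                      # input: 3 x 224 x 224
    conv_block([3, 64, 64], pool=False),      # block 1 -> 64 x 224 x 224
    conv_block([64, 128, 128]),               # block 2 -> 128 x 112 x 112
    conv_block([128, 256, 256]),              # block 3 -> 256 x 56 x 56
    conv_block([256, 256, 512, 512, 512]),    # block 4 (four convs) -> 512 x 28 x 28
    conv_block([512, 512, 512, 512, 512]),    # block 5 (four convs) -> 512 x 14 x 14
)
```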
The specific structure of the Mask module is shown in Table 2: 5 convolution layers and 1 deconvolution layer, with channel numbers 256, 512, 256 and 1472; the activation functions of the layers are LeakyReLU except for a final Tanh; the Mask module also uses dropout as a pruning strategy, with a dropout probability of 0.5.
Table 2 network architecture of Mask modules
[Table 2 appears as an image in the original publication.]
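A hedged sketch of the Mask module follows. The source's channel list (256, 512, 256, 1472) is unclear, so the widths, the input channel count and the rescaling of the Tanh output into [0, 1] are all our assumptions.

```python
import torch
import torch.nn as nn

class MaskModule(nn.Module):
    # M(x, y): convolutions with LeakyReLU and dropout 0.5, one deconvolution,
    # and a Tanh output, per the text; layer widths are placeholder assumptions.
    def __init__(self, feat_ch=512, width=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2 * feat_ch, width, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(width, 512, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(512, 256, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Dropout2d(0.5),                           # pruning strategy, p = 0.5
            nn.Conv2d(256, 256, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(256, 256, 3, padding=1), nn.LeakyReLU(0.2),
            nn.ConvTranspose2d(256, feat_ch, 3, padding=1),  # the deconvolution layer
            nn.Tanh(),
        )

    def forward(self, x, y):
        m = self.net(torch.cat([x, y], dim=1))   # mask from content + style features
        return (m + 1.0) / 2.0                   # assumption: map Tanh output to [0, 1]
```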
The decoder corresponds to the structure of the encoder, shown in Table 3: also 5 blocks in total, with convolution layers that reverse the encoder's, the decoder restoring the features by deconvolution to generate the target-style image. The activation function in the decoder is the ReLU function. The feature-map sizes obtained as the Mask module features pass through the decoder are 14, 28, 56, 112 and 224, respectively.
table 3 network architecture of decoder
[Table 3 appears as an image in the original publication.]
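A corresponding decoder sketch is given below, mirroring the encoder with transposed convolutions and ReLU activations as described; the channel plan and the final Tanh output layer are assumptions.

```python
import torch.nn as nn

def up_block(cin, cout):
    # Transposed convolution that doubles the spatial size, then ReLU.
    return nn.Sequential(
        nn.ConvTranspose2d(cin, cout, kernel_size=4, stride=2, padding=1),
        nn.ReLU(inplace=True),
    )

decoder = nn.Sequential(            # input: fused feature z, 512 x 14 x 14
    up_block(512, 512),             # 14 -> 28
    up_block(512, 256),             # 28 -> 56
    up_block(256, 128),             # 56 -> 112
    up_block(128, 64),              # 112 -> 224
    nn.Conv2d(64, 3, 3, padding=1), # project to an RGB image
    nn.Tanh(),                      # assumed output range [-1, 1]
)
```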
The structure of the discriminator is shown in Table 4: convolution layers, pooling layers and a fully connected layer connected in sequence, with the LeakyReLU activation function;
table 4 network architecture of discriminator
First layer: convolution layer, pooling layer, activation function layer
Second layer: convolution layer, activation function layer, Batch Norm layer
Third layer: convolution layer, activation function layer, Batch Norm layer
Fourth layer: convolution layer, activation function layer, Batch Norm layer
Fifth layer: convolution layer, activation function layer, Batch Norm layer
Sixth layer: fully connected layer
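Following the rebuilt Table 4, one possible discriminator is sketched below; kernel sizes, strides and channel widths are illustrative, as the source does not specify them.

```python
import torch.nn as nn

discriminator = nn.Sequential(
    # First layer: convolution + pooling + activation, per Table 4.
    nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.MaxPool2d(2), nn.LeakyReLU(0.2),
    # Layers 2-5: convolution + activation + batch norm, per Table 4.
    nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2), nn.BatchNorm2d(128),
    nn.Conv2d(128, 256, 4, stride=2, padding=1), nn.LeakyReLU(0.2), nn.BatchNorm2d(256),
    nn.Conv2d(256, 512, 4, stride=2, padding=1), nn.LeakyReLU(0.2), nn.BatchNorm2d(512),
    nn.Conv2d(512, 512, 4, stride=2, padding=1), nn.LeakyReLU(0.2), nn.BatchNorm2d(512),
    # Sixth layer: fully connected, producing a real/fake probability.
    nn.Flatten(),
    nn.LazyLinear(1),
    nn.Sigmoid(),
)
```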
In this example the images were numbered, and a total of 100 calligraphy character images were selected, with odd numbers assigned to real data and even numbers to data generated by the designed calligraphy migration model. The evaluation results are shown in Table 5. Before the experiment the experts were told only that, of the 100 pictures, 50 were genuine calligraphy and 50 were fake. The 4 experts were given enough time to distinguish the samples. As a result, they were unable to distinguish genuine calligraphy images from model-generated fake images, which demonstrates that the calligraphy images generated by the present invention are sufficiently realistic.
TABLE 5 evaluation results
[Table 5 appears as an image in the original publication.]
In Table 5, true samples denote original calligraphy images and false samples denote calligraphy images generated by the method provided by the invention.
It should be noted that the embodiments of the present invention can be realized by hardware, software, or a combination of software and hardware. The hardware portion may be implemented using dedicated logic; the software portions may be stored in a memory and executed by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the apparatus and methods described above may be implemented using computer-executable instructions and/or embodied in processor control code, such code being provided on a carrier medium such as a disk, CD- or DVD-ROM, programmable memory such as read-only memory (firmware), or a data carrier such as an optical or electronic signal carrier. The apparatus and its modules of the present invention may be implemented by hardware circuits such as very-large-scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field programmable gate arrays and programmable logic devices; or by software executed by various types of processors; or by a combination of hardware circuits and software, e.g. firmware.
The above description is only a specific embodiment of the present invention and is not intended to limit the scope of protection; all modifications, equivalents and improvements made within the spirit and scope of the invention as defined by the appended claims are intended to be covered.

Claims (10)

1. A method for migrating Chinese calligraphy character image styles, characterized by comprising the following steps:
rendering n source-style pictures and n target-style pictures from a TTF font library and outputting them in jpg format, giving 2n pictures as training samples;
using the source-style pictures to provide content features and the target-style pictures to provide style features, and training a generative adversarial network on them to obtain the pictures generated by the generator and the discriminator's authenticity judgment;
feeding the prepared training set into the adversarial network and iteratively updating the generator and the discriminator, the generator producing fake targets to confuse the discriminator and the discriminator learning to identify real and fake targets, until the two reach a Nash equilibrium, yielding the calligraphy style migration model.
2. The method for migrating Chinese calligraphy character image styles according to claim 1, characterized in that the source-style pictures provide content features, the target-style pictures provide style features, and the generative adversarial network is trained on them to obtain the generator's output pictures and the discriminator's authenticity judgment, implemented in the following steps:
(1) preprocessing the acquired data set, wherein the preprocessing comprises unifying the size of the picture and removing blank pictures;
(2) a generative adversarial network is set up, consisting of 1 generator G and 1 discriminator D; the generator comprises 2 encoders, denoted the content encoder Ec and the style encoder Es, and 1 decoder De, with 1 AdaIN module and 1 Mask module arranged between the encoders and the decoder; the network has 3 loss functions, namely the adversarial loss, the content loss and the style loss;
(3) the generator G is responsible for transforming the preprocessed source-style pictures into pictures of the target style and then updating the content loss and the style loss to obtain the generated pictures; the discriminator D takes the preprocessed target-style pictures and the pictures generated by G as input, updates the adversarial loss, and predicts whether each picture was generated by the generator or is a real target-style picture.
3. The method for migrating Chinese calligraphy character image styles according to claim 1, characterized in that the prepared training set is input into the adversarial network, the generator and discriminator are updated iteratively, the generator produces fake targets to confuse the discriminator, and the discriminator learns to identify real and fake targets, improving its discrimination ability, until the two reach a Nash equilibrium and the calligraphy style migration model is obtained, implemented in the following steps:
(1) any one of the preprocessed source-style pictures C is selected as a sample and input into the encoder Ec to obtain the source-style picture's feature x; any one of the preprocessed target-style pictures S is selected as a sample and input into the encoder Es to obtain the target-style picture's feature y;
(2) the obtained content feature x of the source-style picture and style feature y of the target-style picture are input into the AdaIN module, which first de-stylizes the content feature x to obtain the feature ω, with the specific formula:
ω = (x − μ(x)) / σ(x)    (1)
where x represents the content feature of the content image, y represents the style feature of the style image, μ(x) represents the mean of the content feature, σ(x) represents the standard deviation of the content feature, and ω represents the de-stylized feature produced by this whitening operation; then the style of the target-style picture is blended into the source-style picture to obtain AdaIN(x, y), with the specific formula:
AdaIN(x, y) = ω · σ(y) + μ(y)    (2)
where y represents the style feature of the style image, σ(y) represents the standard deviation of the target-style image, μ(y) represents the mean of the target-style image, ω represents the de-stylized feature from the whitening operation in the first step, and AdaIN(x, y) represents the feature obtained by removing the original style of the source-style image and blending in the target style;
(3) the obtained source-style picture feature x, target-style picture feature y and the result AdaIN(x, y) are input into the Mask module to obtain the feature z, with the specific formula:
z = M(x,y)·x + [1 − M(x,y)]·AdaIN(x,y)    (3)
wherein M (x, y) represents a mask generated by a mask module, x represents content features extracted by a content encoder, y is style features extracted by a style encoder, AdaIN (x, y) represents a result of adaptive instance normalization, and z represents a result of fusing source style picture features, target style picture features and AdaIN (x, y) features;
(4) inputting the obtained characteristic z into a decoder De to generate a picture S' with a target style;
(5) the generated picture S' is input into the content encoder Ec and the style encoder Es to calculate the content loss and the style loss; both are computed with the L1 loss, with the specific formulas:

L_content = E[ ‖Ec(S') − x‖1 ]    (4)

L_style = E[ ‖Es(S') − y‖1 ]    (5)

where L_content represents the content loss and L_style the style loss; x represents the content features extracted by the content encoder Ec; Ec(S') represents the content features of the fake image S' generated by the generator, extracted by Ec; Es(S') represents the style features extracted after S' passes through the style encoder Es; and the expectations in (4) and (5) are taken over generated images S' obeying, in content and in style respectively, the probability distribution of the original style image S;
(6) the generated picture S' and the preprocessed target-style picture S are input into the discriminator D, and the adversarial loss L_adv is calculated, with the specific formula:

L_adv = E_x[log(D(x))] + E_S'[log(1 − D(S'))]    (6)

where L_adv represents the adversarial loss; x represents the content features of the source-style image S extracted by the content encoder Ec; D(x) represents the discriminator's output for the real input; D(S') represents the discriminator's output for the generated fake image S'; E_x denotes the expectation over the true data probability distribution; and E_S' denotes the expectation over the generated data probability distribution;
(7) the final total loss function is the sum of the generator loss and the discriminator loss, expressed as follows:

L_total = L_adv + α·L_content + β·L_style    (7)

where L_total denotes the total loss of the adversarial network, L_adv the adversarial loss of the discriminator, L_content the content loss of the generator, and L_style the style loss of the generator; α and β are the weights of the sub-loss functions. Through adversarial training the network parameters are continuously updated and the loss values of the generator and discriminator are optimized; the smaller the loss value, the more successful the training, i.e. the closer the style of the generated picture is to the target-style picture.
4. A computer device, characterized in that the computer device comprises a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to carry out the steps of:
rendering n source-style pictures and n target-style pictures from a TTF font library and outputting them in jpg format, giving 2n pictures as training samples;
using the source-style pictures to provide content features and the target-style pictures to provide style features, and training a generative adversarial network on them to obtain the pictures generated by the generator and the discriminator's authenticity judgment;
feeding the prepared training set into the adversarial network and iteratively updating the generator and the discriminator, the generator producing fake targets to confuse the discriminator and the discriminator learning to identify real and fake targets, until the two reach a Nash equilibrium, yielding the calligraphy style migration model.
5. An information data processing terminal, characterized in that the information data processing terminal is used for realizing the method for migrating the style of Chinese calligraphy characters and images according to any one of claims 1-3.
6. A system for migrating Chinese calligraphy character image styles, implementing the method according to any one of claims 1-3, characterized in that the system comprises:
a training sample module for rendering n source-style pictures and n target-style pictures from a TTF font library and outputting them in jpg format, giving 2n pictures as training samples;
a training sample processing module for using the source-style pictures to provide content features and the target-style pictures to provide style features, and training a generative adversarial network on them to obtain the pictures generated by the generator and the discriminator's authenticity judgment;
and a calligraphy style migration model acquisition module for feeding the prepared training set into the adversarial network and iteratively updating the generator and the discriminator, the generator producing fake targets to confuse the discriminator and the discriminator learning to identify real and fake targets, until the two reach a Nash equilibrium, yielding the calligraphy style migration model.
7. The system for migrating Chinese calligraphy character image styles according to claim 6, characterized in that the generative adversarial network comprises a generator and a discriminator, the generator comprising a content encoder, a style encoder, an AdaIN module, a Mask module and a decoder;
the content encoder extracts the content features of the source-style pictures; the style encoder extracts the style features of the target-style pictures; the AdaIN module de-stylizes the content features and then blends in the target style; the Mask module is an attention-mechanism module that fuses the content features, the style features and the features produced by the AdaIN module; the decoder generates the target-style picture from the features processed by the Mask module; and the discriminator evaluates the generator's output, with the evaluation given according to the loss functions.
8. The system for migrating Chinese calligraphy character image styles according to claim 7, characterized in that the content encoder and the style encoder each consist of 5 coding blocks, each formed by corresponding convolution, pooling and activation layers; the first coding block contains two convolution layers with channel numbers 3 and 64; the second contains two convolution layers with channel numbers 64 and 128; the third contains two convolution layers with channel numbers 128 and 256; the fourth is formed by four convolution layers with channel numbers 256, 256 and 512; the fifth contains four convolution layers with channel numbers 512, 512 and 512; the activation function in the encoder is the LeakyReLU function; in this embodiment, the feature-map sizes of the input picture after each stage are 224, 224, 112, 56, 28 and 14, respectively.
9. The system for migrating Chinese calligraphy character image styles according to claim 7, characterized in that the Mask module consists of 5 convolution layers and 1 deconvolution layer, with channel numbers 256, 512, 256 and 1472 respectively; the activation functions of the layers in the Mask module are LeakyReLU except for a final Tanh; and the Mask module uses dropout as a pruning strategy, with a dropout probability of 0.5.
10. The system for migrating Chinese calligraphy character image styles according to claim 7, characterized in that the decoder mirrors the structure of the encoder, also with 5 blocks in total; its convolution layers are the reverse of the encoder's, and the decoder restores the features by deconvolution to generate the target-style image; the activation function in the decoder is the ReLU function; the feature-map sizes obtained as the Mask module features pass through the decoder are 14, 28, 56, 112 and 224, respectively;
the discriminator consists of convolution layers, pooling layers and a fully connected layer connected in sequence, with the LeakyReLU activation function.
CN202110616129.5A 2021-06-02 2021-06-02 Method, system and intelligent terminal for migrating Chinese calligraphy character and image styles Pending CN113393370A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110616129.5A CN113393370A (en) 2021-06-02 2021-06-02 Method, system and intelligent terminal for migrating Chinese calligraphy character and image styles

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110616129.5A CN113393370A (en) 2021-06-02 2021-06-02 Method, system and intelligent terminal for migrating Chinese calligraphy character and image styles

Publications (1)

Publication Number Publication Date
CN113393370A true CN113393370A (en) 2021-09-14

Family

ID=77619968

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110616129.5A Pending CN113393370A (en) 2021-06-02 2021-06-02 Method, system and intelligent terminal for migrating Chinese calligraphy character and image styles

Country Status (1)

Country Link
CN (1) CN113393370A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113807430A (en) * 2021-09-15 2021-12-17 网易(杭州)网络有限公司 Model training method and device, computer equipment and storage medium
CN113807430B (en) * 2021-09-15 2023-08-08 网易(杭州)网络有限公司 Model training method, device, computer equipment and storage medium
CN113808011A (en) * 2021-09-30 2021-12-17 深圳万兴软件有限公司 Feature fusion based style migration method and device and related components thereof
CN113808011B (en) * 2021-09-30 2023-08-11 深圳万兴软件有限公司 Style migration method and device based on feature fusion and related components thereof
CN116402067A (en) * 2023-04-06 2023-07-07 哈尔滨工业大学 Cross-language self-supervision generation method for multi-language character style retention
CN116402067B (en) * 2023-04-06 2024-01-30 哈尔滨工业大学 Cross-language self-supervision generation method for multi-language character style retention
CN116721306A (en) * 2023-05-24 2023-09-08 北京思想天下教育科技有限公司 Online learning content recommendation system based on big data cloud platform
CN116721306B (en) * 2023-05-24 2024-02-02 北京思想天下教育科技有限公司 Online learning content recommendation system based on big data cloud platform
CN117236284A (en) * 2023-11-13 2023-12-15 江西师范大学 Font generation method and device based on style information and content information adaptation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination