CN115240201B - Chinese character generation method for alleviating network mode collapse problem by using Chinese character skeleton information - Google Patents

Info

- Publication number: CN115240201B
- Application number: CN202211146858.XA
- Other versions: CN115240201A (Chinese)
- Authority: CN (China)
- Prior art keywords: image, source domain, skeleton, style, network
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
- Inventors: 曾锦山, 周杰, 徐瑞英, 程诺, 黄箐
- Current and original assignee: Jiangxi Normal University (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
- Application filed by Jiangxi Normal University
- Priority: CN202211146858.XA (the priority date is an assumption and is not a legal conclusion)
- Granted as CN115240201B; earlier published as CN115240201A

Classifications

  • G06V30/141 — Character recognition; image acquisition using multiple overlapping images; image stitching
  • G06N3/084 — Neural network learning methods; backpropagation, e.g. using gradient descent
  • G06N3/088 — Neural network learning methods; non-supervised learning, e.g. competitive learning
  • G06T3/4038 — Geometric image transformations; image mosaicing, e.g. composing plane images from plane sub-images
  • G06V10/82 — Image or video recognition or understanding using pattern recognition or machine learning using neural networks
  • G06V30/19147 — Character recognition; obtaining sets of training patterns; bootstrap methods, e.g. bagging or boosting
  • G06T2200/32 — Indexing scheme for image data processing or generation involving image mosaicing


Abstract

The invention discloses a Chinese character generation method that uses Chinese character skeleton information to alleviate network mode collapse. The method comprises the following steps. Step one: extract the corresponding skeleton image from a source domain image, concatenate the source domain image with its skeleton image, input the result to a generator to produce a target-style image, and feed that image to a discriminator that judges whether it is real or fake. Step two: extract the corresponding skeleton image from the target-style image, concatenate the two, input the result to another generator to produce a source-domain-style image, and feed that image to another discriminator for judgment. Step three: extract a skeleton image from the image reconstructed by the second generator and compute a pixel-level loss between it and the source-domain skeleton extracted in step one; this loss is back-propagated as part of the network gradient and used to optimize the model during training.

Description

Chinese character generation method for alleviating network mode collapse problem by utilizing Chinese character skeleton information
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to a Chinese character generation method for alleviating the problem of network mode collapse by utilizing Chinese character skeleton information.
Background
Chinese character generation is a difficult task: Chinese glyphs are complex, the set of commonly used characters is large, and producing a full font library takes a long time. Early on, practitioners first extracted explicit features of Chinese characters, such as strokes and components, and then generated new characters with traditional machine learning methods. The quality of this up-front feature extraction strongly affects the results, and because the features were usually crafted by hand, the process was very time-consuming and labor-intensive.
Some recent methods improve network performance by introducing paired datasets, but in practice paired data is hard to obtain, especially for ancient-text restoration and handwriting generation. Moreover, paired datasets are built by manually partitioning a given dataset, which demands substantial manpower and resources. To ease the difficulty of obtaining paired data for Chinese character generation, some prior work has explored this direction, but those methods rely heavily on additional training steps or additional labels. Extra training steps raise the training cost of the neural network, and extra labels are made by hand, which again consumes considerable effort. Furthermore, the unpaired models in current use share a common problem: mode collapse.
Some methods have begun to address mode collapse in Chinese character generation from several angles. For example, the AAAI paper on reducing mode collapse in Chinese font generation via stroke encoding (StrokeGAN) adds a one-hot stroke encoding, but that encoding only records whether a given stroke occurs in a character and ignores the relationship between the strokes and the character as a whole. Characters built from an identical set of strokes, such as the pair both glossed here as 'already', are therefore indistinguishable under this encoding; the character glossed as 'king' gives another such example, again because the underlying strokes composing the confusable characters are identical. Other work, such as the paper on self-supervised Chinese font generation based on block transformation, divides a character into four parts and lets the network learn the spatial structure among them; however, the spatial structure information learned this way is shallow, and no constraint is imposed on stroke details.
Disclosure of Invention
The invention aims to provide a Chinese character generation method that uses Chinese character skeleton information to alleviate network mode collapse, solving the technical problem of mode collapse during network generation in the prior art while keeping Chinese character generation fast and inexpensive.
The Chinese character generation method for alleviating the network mode collapse problem by utilizing the Chinese character skeleton information comprises the following steps:
Step one: from a source domain image x, extract the corresponding source domain skeleton image s_x; concatenate x and s_x and input the result to the generator G to generate a target-style image ŷ; feed ŷ to the discriminator D_Y, which judges whether ŷ is real or fake.
Step two: from the target-style image ŷ, extract the corresponding target-style skeleton image s_ŷ; concatenate s_ŷ with ŷ and input the result to another generator F to generate a source-domain-style image x̂; feed x̂ to another discriminator D_X for judgment.
Step three: from the source-domain-style image x̂ reconstructed by the generator F, extract its skeleton image s_x̂; compute a pixel-level loss between s_x̂ and the source domain skeleton image s_x extracted in step one. This loss is back-propagated as part of the network gradient and used to optimize the model during training.
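The three steps can be sketched end-to-end as follows. This is a minimal illustration, not the patent's actual networks: the names `splice`, `cycle_step`, the identity stand-in generators, and the channel-mean skeleton extractor are all assumptions made here for demonstration.

```python
import numpy as np

def splice(rgb, skeleton):
    """Concatenate an RGB image (3, H, W) with a single-channel skeleton
    (H, W) along the channel axis, giving the four-channel generator input."""
    return np.concatenate([rgb, skeleton[None]], axis=0)

def cycle_step(x, extract_skeleton, G, F):
    """One X -> Y -> X pass of the three steps.
    x: source image (3, H, W); G, F: generators (4, H, W) -> (3, H, W);
    extract_skeleton: (3, H, W) -> (H, W). Returns the skeleton loss."""
    s_x = extract_skeleton(x)          # step 1: source-domain skeleton
    y_fake = G(splice(x, s_x))         # step 1: target-style image
    s_y = extract_skeleton(y_fake)     # step 2: target-style skeleton
    x_rec = F(splice(y_fake, s_y))     # step 2: reconstructed source-style image
    s_rec = extract_skeleton(x_rec)    # step 3: skeleton of the reconstruction
    # step 3: pixel-level skeleton loss, back-propagated during training
    return np.abs(s_rec.astype(float) - s_x.astype(float)).mean()

# Sanity check with identity "generators" that simply drop the extra channel:
# a perfect reconstruction gives zero skeleton loss.
drop_skeleton_channel = lambda x4: x4[:3]
channel_mean = lambda img: img.mean(axis=0)   # stand-in skeleton extractor
x = np.random.rand(3, 8, 8)
loss = cycle_step(x, channel_mean, drop_skeleton_channel, drop_skeleton_channel)
```

In training, real generators and a real skeleton extractor replace the stand-ins, and the returned loss joins the adversarial and cycle-consistency losses in the gradient back-propagation.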
Preferably, in step one, the source domain image x is an RGB three-channel image and every skeleton image is a single-channel grayscale image. The concatenation joins the three RGB channels of x with the single gray channel of the extracted source domain skeleton s_x into four channels of information, which are fed to the generator G of the network to generate the RGB three-channel target-style image ŷ.
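This channel splicing can be sketched with NumPy as below; the channel-first (channels, height, width) layout and the function name are assumptions made here for illustration, since the patent fixes no memory layout.

```python
import numpy as np

def splice_channels(rgb, gray_skeleton):
    """Join the three RGB channels of an image with the single gray channel
    of its skeleton into one four-channel array (channel-first layout)."""
    assert rgb.shape[0] == 3 and rgb.shape[1:] == gray_skeleton.shape
    return np.concatenate([rgb, gray_skeleton[None]], axis=0)

x = np.random.rand(3, 64, 64).astype(np.float32)   # RGB source domain image
s_x = np.zeros((64, 64), dtype=np.float32)         # single-channel skeleton
four_channel = splice_channels(x, s_x)             # shape (4, 64, 64)
```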
Preferably, in step two, the three RGB channels of the target-style image ŷ are likewise joined with the single gray channel of the extracted target-style skeleton s_ŷ into four channels of information, which are fed to the generator F of the network to generate the RGB three-channel source-domain-style image x̂.
Preferably, in step three, the error value between the skeleton of the source domain image and the skeleton of the reconstructed source-domain-style image is computed and back-propagated as part of the network gradient to optimize the model during training; this error value is the pixel-level loss. An optimized pixel-level loss below a set threshold indicates that the reconstructed image x̂ is similar to the source domain image x of step one at the skeleton level.
Preferably, the cycle-generation network used by the method comprises a skeleton extraction and integration module, a reconstruction font-generation module, two generators, two discriminators, and a skeleton loss calculation module.
Preferably, the skeleton extraction and integration module extracts a source domain skeleton image from the input source domain image, splices the extracted skeleton image and the source domain image along the channel dimension into four channels of information, and feeds them to the generator G of the network to generate a target-style image.
Preferably, the reconstruction font-generation module extracts the corresponding target-style skeleton image from the target-style image produced by the generator G, splices the generated target-style image with that skeleton image, and passes the resulting four channels of information to the generator F to generate a source-domain-style image.
Preferably, the two generators are the generator F, which produces source-domain-style images, and the generator G, which produces target-style images. Each generator takes as input the spliced four-channel image, passes it through a series of convolutional layers, and outputs a three-channel image.
Preferably, the two discriminator modules judge whether an input image is a real image or a fake image produced by the network. Discriminator and generator stand in an adversarial relationship, so that each drives the other to improve.
Preferably, the model is optimized during training by calculating an error value between the skeleton of the source domain image and the skeleton of the reconstructed source domain style image as part of a network gradient back pass.
The invention has the following advantages:
1. Using the spatial structure information of the skeleton alleviates mode collapse during network generation; compared with stroke information or segmented local spatial information, skeleton information provides more comprehensive global information and also constrains the network's rendering of stroke details.
2. The CycleGAN network and its cycle-generation idea remove the need for paired datasets.
3. Skeleton information is obtained with an automatic skeleton extraction algorithm, so no manual feature engineering is required.
4. Because skeleton extraction is cheap, a complete set of Chinese fonts can be generated easily, addressing the high cost of Chinese character generation.
5. The method extends easily to other network models and is highly general.
Drawings
FIG. 1 is a flow chart of a Chinese character generation method for alleviating the network mode collapse problem based on Chinese character skeleton information according to the present invention.
FIG. 2 is a diagram of a skeleton extraction integration module according to the present invention.
FIG. 3 is a schematic diagram of a module for generating a font by reconstruction according to the present invention.
FIG. 4 is a diagram of a module for calculating skeletal loss according to the present invention.
FIG. 5 is a diagram of the font generation results of each model.
FIG. 6 is a diagram of font generation results on Attention GAN with and without the method of the present invention.
FIG. 7 is a diagram of font generation results on FUNIT with and without the method of the present invention.
FIG. 8 is a diagram of font generation results on SQ-GAN with and without the method of the present invention.
FIG. 9 is a diagram of font generation results on StrokeGAN with and without the method of the present invention.
FIG. 10 is a diagram of font generation results on UGATIT with and without the method of the present invention.
Attention GAN, FUNIT, SQ-GAN, stroke GAN and UGATIT in the attached drawings are English abbreviation of corresponding models.
Detailed Description
The following detailed description, taken together with the accompanying drawings, is intended to give those skilled in the art a complete and accurate understanding of the inventive concept and technical solutions of the present invention.
The first embodiment is as follows:
as shown in FIGS. 1-4, the present invention provides a method for generating Chinese characters by using Chinese character skeleton information to alleviate the problem of network mode collapse, comprising the following steps.
Step one: from a source domain image x, extract the corresponding source domain skeleton image s_x; concatenate x and s_x and input the result to the generator G to generate a target-style image ŷ; feed ŷ to the discriminator D_Y, which judges whether ŷ is real or fake.
The source domain image x is an RGB three-channel image and every skeleton image is a single-channel grayscale image. The concatenation joins the three RGB channels of x with the single gray channel of s_x into four channels of information, which are fed to the generator G of the network; the generated target-style image ŷ is likewise an RGB three-channel image.
Step two: following the idea of a cycle-generation network, extract from the target-style image ŷ the corresponding target-style skeleton image s_ŷ; concatenate s_ŷ with ŷ and input the result to another generator F to generate a source-domain-style image x̂, which is fed to another discriminator D_X for judgment.
The concatenation is analogous to step one: the three RGB channels of ŷ are joined with the single gray channel of s_ŷ into four channels of information, which are fed to the generator F of the network; the generated source-domain-style image x̂ is also an RGB three-channel image.
Step three: from the source-domain-style image x̂ reconstructed by the generator F, extract its skeleton image s_x̂, and compute a pixel-level loss between s_x̂ and the source domain skeleton image s_x extracted in step one, ensuring that after optimization the reconstructed x̂ is similar to the source domain image x at the skeleton level. An optimized pixel-level loss below a set threshold indicates that the reconstruction is skeleton-consistent with x. This step computes the error value between the two skeletons as part of the network gradient back-propagation and uses it to optimize the model during training.
The method adopts a cycle-generation network in order to achieve self-supervision without a paired dataset. The X -> Y -> X process realizes a pseudo-pairing of the network, so training can proceed on large amounts of unpaired data. In practice, most available target-font data is unpaired with the font to be converted, as in ancient-text restoration and handwriting generation. Existing methods that require paired datasets cannot use such data directly without extensive preprocessing, whereas the cycle-generation idea lets the model be trained on unpaired datasets.
The cycle-generation network used by the method comprises a skeleton extraction and integration module, a reconstruction font-generation module, two generators, two discriminators, and a skeleton loss calculation module. The function and implementation of each module are as follows.
Skeleton extraction and integration module: extracts the input source domain image into a source domain skeleton image and splices the extracted skeleton image with the source domain image along the channel dimension. The skeleton image is a single-channel grayscale image; the splicing joins the three RGB channels of the source domain image with the single channel of the extracted skeleton into four channels of information, which are fed to the generator G of the network to generate a target-style image.
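The patent does not name its automatic skeleton extraction algorithm. Zhang-Suen thinning is a common choice for reducing a binary glyph to a one-pixel-wide skeleton, and is sketched below as one plausible instantiation, an assumption rather than the patent's actual extractor.

```python
import numpy as np

def zhang_suen_thin(img):
    """Zhang-Suen thinning: reduce a binary image (1 = ink, 0 = background)
    to a one-pixel-wide skeleton by repeatedly deleting boundary pixels."""
    img = np.pad(img.astype(np.uint8), 1)  # zero border simplifies indexing
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_delete = []
            for i in range(1, img.shape[0] - 1):
                for j in range(1, img.shape[1] - 1):
                    if img[i, j] == 0:
                        continue
                    # neighbours P2..P9, clockwise starting from north
                    p = [img[i-1, j], img[i-1, j+1], img[i, j+1],
                         img[i+1, j+1], img[i+1, j], img[i+1, j-1],
                         img[i, j-1], img[i-1, j-1]]
                    b = sum(p)                                  # non-zero neighbours
                    a = sum(p[k] == 0 and p[(k + 1) % 8] == 1   # 0 -> 1 transitions
                            for k in range(8))
                    if not (2 <= b <= 6 and a == 1):
                        continue
                    if step == 0 and p[0]*p[2]*p[4] == 0 and p[2]*p[4]*p[6] == 0:
                        to_delete.append((i, j))
                    elif step == 1 and p[0]*p[2]*p[6] == 0 and p[0]*p[4]*p[6] == 0:
                        to_delete.append((i, j))
            for i, j in to_delete:  # delete simultaneously after each scan
                img[i, j] = 0
            changed = changed or bool(to_delete)
    return img[1:-1, 1:-1]

# A 3-pixel-thick bar thins toward its one-pixel medial line.
glyph = np.zeros((5, 12), dtype=np.uint8)
glyph[1:4, 1:11] = 1
skeleton = zhang_suen_thin(glyph)
```

Library implementations (for example, a morphological skeletonize routine from an image-processing package) would normally be used in practice instead of this didactic loop.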
Reconstruction font-generation module: extracts the corresponding target-style skeleton image from the target-style image produced by the generator G. Using the same splicing method as the skeleton extraction and integration module, the single channel of the target-style skeleton is spliced onto the RGB channels of the generated target-style image to form four channels of information, which are then passed to the generator F to generate a source-domain-style image.
Two generators: the generator F, which produces source-domain-style images, and the generator G, which produces target-style images. Both take as input the spliced four-channel image, i.e. the four channels of information produced by the skeleton extraction module, and pass it through a series of convolutional layers to output a three-channel image.
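As a shape-level illustration of the four-channel-in, three-channel-out mapping: a real generator is a deep convolutional network, but a single per-pixel 1×1 "convolution" (an assumption made purely for illustration) already shows the channel arithmetic.

```python
import numpy as np

def toy_generator(x4, w):
    """Mix the four input channels into three output channels per pixel.
    w has shape (3, 4): (out_channels, in_channels)."""
    return np.einsum('oc,chw->ohw', w, x4)

x4 = np.random.rand(4, 16, 16)   # spliced four-channel input
w = np.random.rand(3, 4)
y = toy_generator(x4, w)         # three-channel, RGB-shaped output
```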
Two discriminator modules: judge whether an input image is a real image or a fake image produced by the network. They are the discriminator D_X, which judges whether source-domain-style images are real or fake, and the discriminator D_Y, which does the same for target-style images. Discriminator and generator stand in an adversarial relationship that optimizes both: the generator tries to produce images that fool the discriminator, while the discriminator tries to classify incoming images correctly as real or fake.
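The adversarial relationship can be written as a pair of losses. Least-squares GAN losses are shown here as one common choice; the patent does not specify which adversarial loss it uses, so this is an assumption for illustration.

```python
import numpy as np

def d_loss(d_real, d_fake):
    """Discriminator objective: push scores on real images toward 1 and
    scores on generated (fake) images toward 0."""
    return ((d_real - 1.0) ** 2).mean() + (d_fake ** 2).mean()

def g_loss(d_fake):
    """Generator objective: fool the discriminator by pushing its scores
    on generated images toward 1."""
    return ((d_fake - 1.0) ** 2).mean()

perfect_d = d_loss(np.ones(8), np.zeros(8))   # discriminator is never wrong
fooled_d = g_loss(np.ones(8))                 # generator fully fools it
```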
Skeleton loss calculation module: computes the error value between the skeleton of the source domain image and the skeleton of the reconstructed source-domain-style image as part of the network gradient back-propagation, optimizing the model during training. Because the method uses a cycle-generation network trained on unpaired data, the dataset contains no one-to-one image pairs; training still requires gradient back-propagation, so the network must compute a loss value to supply those gradients.
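A minimal sketch of the pixel-level skeleton loss follows. Mean absolute error is assumed here; the patent only specifies "pixel-level loss", so the exact norm is an assumption.

```python
import numpy as np

def skeleton_loss(sk_source, sk_reconstructed):
    """Pixel-level error between the source-domain skeleton and the skeleton
    extracted from the reconstructed image; used as part of the gradient
    back-propagation during training."""
    a = sk_source.astype(np.float64)
    b = sk_reconstructed.astype(np.float64)
    return np.abs(a - b).mean()

s_src = np.zeros((32, 32))
s_rec = np.zeros((32, 32))
s_rec[10, 10] = 1.0              # one mismatched skeleton pixel
```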
FIG. 5 shows the font generation results of each model. The following figures show the effect of the method on other models (SK denotes the use of our method).
FIG. 6 shows our method applied to Attention GAN: its input is expanded to four channels of information and the rest of the network is unchanged. Our method yields a large improvement on Attention GAN.
FIG. 7 shows our method applied to FUNIT, which decomposes content and style; the skeleton information is added to the content module, and the model's generation quality improves noticeably.
FIG. 8 shows our method applied to SQ-GAN: its input is expanded to four channels of information and the rest of the network is unchanged. Our method yields a large improvement on SQ-GAN.
FIG. 9 shows our method applied to StrokeGAN: its input is expanded to four channels of information and the rest of the network is unchanged. Our method yields a large improvement on StrokeGAN.
FIG. 10 shows our method applied to UGATIT: its input is expanded to four channels of information and the rest of the network is unchanged. Our method yields a large improvement on UGATIT.
The method's skeleton loss calculation module computes the pixel-level loss between skeleton images. This not only optimizes the generators used by the method; the spatial structure information of the skeleton also alleviates mode collapse during network generation. Compared with stroke information or segmented local spatial information, skeleton information provides more comprehensive global information and also constrains the network's rendering of stroke details.
The invention has been described above with reference to the accompanying drawings. Its implementation is clearly not limited to the manner described: various insubstantial modifications of the inventive concept and technical solution, and direct applications of them to other settings without modification, all fall within the scope of the invention.

Claims (8)

1. A Chinese character generation method for alleviating the network mode collapse problem by utilizing Chinese character skeleton information, characterized by comprising the following steps:
Step one: from a source domain image x, extract the corresponding source domain skeleton image s_x; concatenate x and s_x and input the result to the generator G to generate a target-style image ŷ; feed ŷ to the discriminator D_Y, which judges whether ŷ is real or fake.
In step one, the source domain image x is an RGB three-channel image and every skeleton image is a single-channel grayscale image; the concatenation joins the three RGB channels of x with the single gray channel of the extracted skeleton s_x into four channels of information, which are fed to the generator G of the network to generate the RGB three-channel target-style image ŷ.
Step two: from the target-style image ŷ, extract the corresponding target-style skeleton image s_ŷ; concatenate s_ŷ with ŷ and input the result to another generator F to generate a source-domain-style image x̂; feed x̂ to another discriminator D_X for judgment.
In step two, the three RGB channels of ŷ are joined with the single gray channel of the extracted target-style skeleton s_ŷ into four channels of information, which are fed to the generator F of the network to generate the RGB three-channel source-domain-style image x̂.
step three, a source-domain-style skeleton image is extracted from the source-domain-style image reconstructed by the generator; a pixel-level loss is calculated between the extracted source-domain-style skeleton image and the source domain skeleton image extracted in step one, and this loss is back-propagated as part of the network gradient and used to optimize the model during training.
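The pixel-level skeleton loss of step three can be sketched as a mean absolute error between the two skeleton maps (the patent does not fix the exact norm; the L1 choice here is an assumption):

```python
import numpy as np

def skeleton_pixel_loss(reconstructed_skeleton: np.ndarray,
                        source_skeleton: np.ndarray) -> float:
    """Pixel-level error between the skeleton extracted from the reconstructed
    source-domain-style image and the skeleton of the original source domain
    image. A mean absolute (L1) norm is assumed for illustration."""
    assert reconstructed_skeleton.shape == source_skeleton.shape
    return float(np.abs(reconstructed_skeleton - source_skeleton).mean())

perfect = skeleton_pixel_loss(np.ones((4, 4)), np.ones((4, 4)))
worst = skeleton_pixel_loss(np.ones((4, 4)), np.zeros((4, 4)))
print(perfect, worst)  # 0.0 1.0
```

During training this scalar would be added to the generators' objective so its gradient flows back through the reconstruction path.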
2. The method for generating Chinese characters using Chinese character skeleton information to alleviate network mode collapse problems of claim 1, wherein: in the third step, an error value between the skeleton of the source domain image and the skeleton of the reconstructed source-domain-style image is calculated and back-propagated as part of the network gradient to optimize the model during training; this error value is the pixel-level loss, and a pixel-level loss below the set loss threshold after optimization indicates that the reconstructed source-domain-style image is similar, at the skeleton level, to the source domain image of step one.
3. The method for generating Chinese characters utilizing Chinese character skeleton information to alleviate network mode collapse problems of any of claims 1-2, wherein: the cycle generation network used by the method comprises a skeleton extraction and integration module, a font reconstruction and generation module, two generators, two discriminators and a skeleton loss calculation module.
4. The method for generating Chinese characters according to claim 3, wherein: the skeleton extraction and integration module extracts a source domain skeleton image from the input source domain image, splices the extracted source domain skeleton image with the source domain image in the channel dimension, and feeds the combined four-channel information into the generator of the network to generate a target-style image.
5. The method of claim 3, wherein: the font reconstruction and generation module extracts a corresponding target-style skeleton image from the target-style image produced by the first generator, splices the generated target-style image with the target-style skeleton image, and passes the resulting four-channel information into the other generator to generate a source-domain-style image.
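The reconstruction path of claims 4 and 5 can be sketched end to end with stand-in components (the skeleton extractor and both generators below are placeholders for trained modules; all names are ours):

```python
import numpy as np

def splice(img3: np.ndarray, skel1: np.ndarray) -> np.ndarray:
    """Channel-wise splice: RGB (H, W, 3) + grayscale skeleton (H, W) -> (H, W, 4)."""
    return np.concatenate([img3, skel1[..., None]], axis=2)

def extract_skeleton(img3: np.ndarray) -> np.ndarray:
    """Placeholder skeleton extractor: the patent uses a dedicated skeleton
    extraction module; here we just threshold the channel mean."""
    return (img3.mean(axis=2) > 0.5).astype(np.float32)

def stub_generator(spliced4: np.ndarray) -> np.ndarray:
    """Stand-in for a trained generator: drops the skeleton channel and
    returns the remaining 3-channel image unchanged."""
    return spliced4[..., :3]

def cycle_reconstruct(source_img: np.ndarray) -> np.ndarray:
    # Step one: splice source image with its skeleton, generate target style.
    target_style = stub_generator(splice(source_img, extract_skeleton(source_img)))
    # Step two: splice target-style image with its skeleton, reconstruct source style.
    return stub_generator(splice(target_style, extract_skeleton(target_style)))

src = np.random.default_rng(0).random((32, 32, 3)).astype(np.float32)
rec = cycle_reconstruct(src)
print(rec.shape)  # (32, 32, 3)
```

With these identity-like stubs the reconstruction equals the input exactly; in the trained network the two generators instead map between styles, and the skeleton loss of claim 2 constrains the reconstruction at the skeleton level.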
6. The method of claim 3, wherein: the two generators are a generator for generating source-domain-style images and a generator for generating target-style images; the input of each generator is the four-channel image produced by splicing, which passes through a series of convolution layers, and the output of each generator is a three-channel image.
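The four-channel-in, three-channel-out interface of claim 6 can be illustrated with a toy generator built from 1x1 convolutions (a deliberately minimal sketch; the patent's generators use a deeper series of convolution layers, and the hidden width of 16 is invented here):

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(x: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """A 1x1 convolution is a per-pixel linear map over channels:
    x (H, W, C_in) @ weight (C_in, C_out) -> (H, W, C_out)."""
    return x @ weight

# Toy generator weights: 4-channel spliced input -> hidden -> RGB output.
w1 = rng.standard_normal((4, 16))
w2 = rng.standard_normal((16, 3))

def toy_generator(spliced: np.ndarray) -> np.ndarray:
    h = np.maximum(conv1x1(spliced, w1), 0.0)   # ReLU nonlinearity
    return np.tanh(conv1x1(h, w2))              # RGB output bounded in (-1, 1)

x = rng.standard_normal((64, 64, 4))            # spliced four-channel input
y = toy_generator(x)
print(y.shape)  # (64, 64, 3)
```

The spatial size is preserved and only the channel count changes, matching the claim's description of the generators' input and output.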
7. The method of claim 6, wherein: the two discriminator modules judge whether an input image is a real image or a fake image generated by the network; the discriminators and the generators stand in an adversarial relation and mutually optimize each other's capabilities.
8. The method of claim 3, wherein: the skeleton loss calculation module calculates an error value between the skeleton of the source domain image and the skeleton of the reconstructed source-domain-style image, which is back-propagated as part of the network gradient and used to optimize the model during training.
CN202211146858.XA 2022-09-21 2022-09-21 Chinese character generation method for alleviating network mode collapse problem by using Chinese character skeleton information Active CN115240201B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211146858.XA CN115240201B (en) 2022-09-21 2022-09-21 Chinese character generation method for alleviating network mode collapse problem by using Chinese character skeleton information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211146858.XA CN115240201B (en) 2022-09-21 2022-09-21 Chinese character generation method for alleviating network mode collapse problem by using Chinese character skeleton information

Publications (2)

Publication Number Publication Date
CN115240201A CN115240201A (en) 2022-10-25
CN115240201B true CN115240201B (en) 2022-12-23

Family

ID=83682194

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211146858.XA Active CN115240201B (en) 2022-09-21 2022-09-21 Chinese character generation method for alleviating network mode collapse problem by using Chinese character skeleton information

Country Status (1)

Country Link
CN (1) CN115240201B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116129207B (en) * 2023-04-18 2023-08-04 江西师范大学 Image data processing method for attention of multi-scale channel
CN117078921B (en) * 2023-10-16 2024-01-23 江西师范大学 Self-supervision small sample Chinese character generation method based on multi-scale edge information

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109408776A (en) * 2018-10-09 2019-03-01 西华大学 A kind of calligraphy font automatic generating calculation based on production confrontation network
CN110033054B (en) * 2019-03-14 2021-05-25 上海交通大学 Personalized handwriting migration method and system based on collaborative stroke optimization
CN111859852A (en) * 2019-04-26 2020-10-30 普天信息技术有限公司 Training device and method for Chinese character style migration model
CN112036137A (en) * 2020-08-27 2020-12-04 哈尔滨工业大学(深圳) Deep learning-based multi-style calligraphy digital ink simulation method and system
CN113657397B (en) * 2021-08-17 2023-07-11 北京百度网讯科技有限公司 Training method for circularly generating network model, method and device for establishing word stock
CN114742714A (en) * 2021-10-29 2022-07-12 天津大学 Chinese character image restoration algorithm based on skeleton extraction and antagonistic learning

Also Published As

Publication number Publication date
CN115240201A (en) 2022-10-25

Similar Documents

Publication Publication Date Title
CN115240201B (en) Chinese character generation method for alleviating network mode collapse problem by using Chinese character skeleton information
CN108985181B (en) End-to-end face labeling method based on detection segmentation
CN112966684A (en) Cooperative learning character recognition method under attention mechanism
CN112884758B (en) Defect insulator sample generation method and system based on style migration method
CN109255826B (en) Chinese training image generation method, device, computer equipment and storage medium
CN110046116B (en) Tensor filling method, device, equipment and storage medium
CN111401156B (en) Image identification method based on Gabor convolution neural network
CN115131560A (en) Point cloud segmentation method based on global feature learning and local feature discrimination aggregation
CN117058266B (en) Handwriting word generation method based on skeleton and outline
CN106227836B (en) Unsupervised joint visual concept learning system and unsupervised joint visual concept learning method based on images and characters
CN114972847A (en) Image processing method and device
CN112037239A (en) Text guidance image segmentation method based on multi-level explicit relation selection
CN115908639A (en) Transformer-based scene image character modification method and device, electronic equipment and storage medium
CN113836319A (en) Knowledge completion method and system for fusing entity neighbors
Ling et al. A facial expression recognition system for smart learning based on YOLO and vision transformer
CN114529450B (en) Face image super-resolution method based on improved depth iteration cooperative network
US11734389B2 (en) Method for generating human-computer interactive abstract image
US20230154077A1 (en) Training method for character generation model, character generation method, apparatus and storage medium
CN115100451A (en) Data expansion method for monitoring oil leakage of hydraulic pump
Liu et al. Facial landmark detection using generative adversarial network combined with autoencoder for occlusion
Fan et al. Image inpainting based on structural constraint and multi-scale feature fusion
CN114332491A (en) Saliency target detection algorithm based on feature reconstruction
Xiao et al. CTNet: hybrid architecture based on CNN and transformer for image inpainting detection
CN113065407A (en) Financial bill seal erasing method based on attention mechanism and generation countermeasure network
CN117875362B (en) Distributed training method and device for large model and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Zeng Jinshan

Inventor after: Zhou Jie

Inventor after: Xu Ruiying

Inventor after: Cheng Nuo

Inventor after: Huang Jing

Inventor before: Zeng Jinshan

Inventor before: Zhou Jie

Inventor before: Xu Ruiying

Inventor before: Cheng Nuo

Inventor before: Huang Jing

GR01 Patent grant
GR01 Patent grant