CN117058266A - Calligraphy character generation method based on skeleton and contour


Info

Publication number
CN117058266A
Authority
CN
China
Prior art keywords
image
skeleton
contour
loss
style
Prior art date
Legal status
Granted
Application number
CN202311313408.XA
Other languages
Chinese (zh)
Other versions
CN117058266B (en)
Inventor
曾锦山
章燕
汪叶飞
熊佳鹭
汪蕊
Current Assignee
Jiangxi Normal University
Original Assignee
Jiangxi Normal University
Priority date
Filing date
Publication date
Application filed by Jiangxi Normal University
Priority to CN202311313408.XA
Publication of CN117058266A
Application granted
Publication of CN117058266B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
          • G06T 11/00 2D [Two Dimensional] image generation
        • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N 3/00 Computing arrangements based on biological models
            • G06N 3/02 Neural networks
              • G06N 3/04 Architecture, e.g. interconnection topology
                • G06N 3/044 Recurrent networks, e.g. Hopfield networks
                • G06N 3/0475 Generative networks
              • G06N 3/08 Learning methods
                • G06N 3/094 Adversarial learning
        • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
          • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
            • G06V 30/10 Character recognition
              • G06V 30/19 Recognition using electronic means
                • G06V 30/191 Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
                  • G06V 30/1918 Fusion techniques, i.e. combining data from various sources, e.g. sensor fusion
              • G06V 30/32 Digital ink
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
      • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
        • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
          • Y02T 10/00 Road transport of goods or passengers
            • Y02T 10/10 Internal combustion engine [ICE] based vehicles
              • Y02T 10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for generating calligraphy characters based on skeleton and contour, comprising the following steps. Step 1: build a model; the model uses the CycleGAN model as the backbone network, the CycleGAN model contains two generative adversarial networks, and the model further includes the modules Con, Ske, IPaD, and SCF. Step 2: train the model; Chinese character images in the source-domain style are fed to the model as original images, the first generative adversarial network converts the original images into target-style images, the second generative adversarial network converts the target-style images output by the first network into reconstructed images, and during training the model is optimized by computing the loss of the whole model. Step 3: use the optimized model for automatic calligraphy font generation. The invention introduces an effective skeleton-contour fusion module to fuse skeleton information and contour information, and achieves a high-quality rendering of content and style even in the absence of accurately paired font samples.

Description

A calligraphy character generation method based on skeleton and contour

Technical field

The invention belongs to the technical field of computer vision, and specifically relates to a calligraphy character generation method based on skeleton and contour.

Background

Chinese calligraphy is an art form based on Chinese characters, written mainly with a brush. In recent years, with the rapid development of artificial intelligence, research on the automatic generation of Chinese calligraphy has gradually emerged, devoted to the digital preservation and inheritance of cultural heritage and to building a widely applicable database of Chinese calligraphy texts. However, the automatic generation of calligraphic Chinese characters is technically quite challenging, mainly in two respects: 1. the shapes of calligraphic characters are highly diverse, and the overall shapes of different calligraphy fonts also differ greatly; 2. most calligraphic characters are traditional characters, whose structures are more complex than those of simplified characters.

To address these two challenges, existing Chinese character generation methods usually treat the task as an image-to-image translation problem. In the prior art, some approaches use the Pix2Pix model for Chinese font generation, building a deep neural network that generates calligraphy characters directly from standard font characters. Another prior approach builds an effective calligraphy generation model, LF-Font, which extracts content and style representations by exploiting paired characters and components. However, these models require paired data for training, and collecting large numbers of paired samples is often impractical and laborious, especially for certain font generation problems such as ancient calligraphy fonts. As a result, the prior art struggles to obtain enough paired fonts in small-sample settings, and these models can hardly produce accurate and reliable results.

To solve the data pairing problem, some practitioners use the CycleGAN model to generate Chinese fonts from unpaired data, such as the deformable generative model DG-Font. This technique introduces certain stroke encodings to alleviate mode collapse; some prior work also proposes semi-supervised variants that use a small number of paired samples as supervision, others use multiple square-block transformations to capture the glyph structure of Chinese characters, and still others use the contours of Chinese characters to obtain global information.

Although these supervised, unsupervised, and self-supervised models are very effective for generating general Chinese fonts, their results remain unsatisfactory when applied to Chinese calligraphy generation, because of the diverse shapes of Chinese characters and the very different styles of different calligraphy fonts. In particular, it is difficult to produce a high-quality rendering of content and style, which is the key to Chinese calligraphy generation. Some of the above techniques still require a certain amount of paired data to provide important supervision for the generated results, but collecting that much paired data is very difficult. Meanwhile, using only the skeleton or only the contour of characters often yields defects in the style or content of the generated font, which still cannot meet the needs of automatic Chinese calligraphy font generation.

Summary of the invention

The purpose of the present invention is to provide a calligraphy character generation method based on skeleton and contour, to solve the technical problem in the prior art that, without sufficient paired-font supervision, generated Chinese calligraphy fonts can hardly achieve a high-quality rendering of content and style.

The method for generating calligraphy characters based on skeleton and contour comprises the following steps.

Step 1: build a model; the model uses the CycleGAN model as the backbone network, and the CycleGAN model contains two generative adversarial networks.

Step 2: train the model. The Chinese character font style fed to the model is the source-domain style, and Chinese character images in that style are the source-domain images, collected as training samples. The font style of the calligraphy font images to be generated is the target style, and calligraphy font images in the target style are the target-domain images, collected to form a calligraphy dataset. During training, a source-domain image is fed to the model as the original image; the first generative adversarial network converts the original image into a target-style image, and the second generative adversarial network converts the target-style image output by the first network into a reconstructed image. The font style of the target-style image should match the target style, and the font style of the reconstructed image should match the source-domain style. The model is optimized during training by computing the loss of the whole model, with the objective of minimizing that loss.

Step 3: obtain the optimized model for automatic calligraphy font generation.

Both generative adversarial networks include the contour extraction module Con, the skeleton extraction module Ske, and the skeleton-contour fusion module SCF; the model further includes the inexact paired data module IPaD.

In Step 2, both generative adversarial networks extract skeleton information and contour information through the skeleton extraction module Ske and the contour extraction module Con respectively, fuse the skeleton and contour information through the skeleton-contour fusion module SCF, concatenate the fused features with the image fed to the generator inside the generator, and let the corresponding generator process them to generate the output image.

The inexact paired data module IPaD automatically recognizes the characters in the calligraphy dataset and records them as recognition labels, and then performs inexact pairing in the calligraphy dataset according to the target-style image; during pairing, incorrect recognition labels are allowed for the calligraphy samples involved, which yields the inexact paired data.

The loss of the whole model comprises the first adversarial loss L_advy, the second adversarial loss L_advx, the cycle-consistency loss L_cyc, the skeleton-consistency loss L_ske, the contour-consistency loss L_con, and the inexact pairing loss L_inex.

Preferably, in Step 1, the first generative adversarial network comprises a generator G_y and a discriminator D_y, and the second generative adversarial network comprises a generator G_x and a discriminator D_x. Generator G_y converts the original image into a target-style image, and discriminator D_y judges whether the font style of the generated target-style image is consistent with that of the target-domain images. The second generative adversarial network runs the reverse process to reconstruct the output of the first: generator G_x converts the target-style image into a reconstructed image in the source-domain style, and discriminator D_x judges whether the font style of the reconstructed image is consistent with that of the source-domain images.

Preferably, in Step 2, in the first generative adversarial network, the source-domain image x serving as the input original image is processed by the skeleton extraction module Ske and the contour extraction module Con, which extract the skeleton information sx and the contour information cx respectively; the two are fused by the skeleton-contour fusion module SCF. The original image x is fed to generator G_y, which during processing concatenates x at the channel level with the skeleton feature E_asx and contour feature E_bcx produced by the SCF module, and generates the target-style image ŷ. Target-domain images y are collected to form the target-domain dataset Y. The target-style image ŷ and a target-domain image y from Y are each fed to discriminator D_y, which judges whether the results it returns for the two are consistent, thereby assessing the authenticity of the target-style image ŷ.

Preferably, after the skeleton information and contour information of a given Chinese character are fed to the skeleton-contour fusion module SCF, the SCF module first passes them through the corresponding skeleton encoder and contour encoder to produce the skeleton feature E_sx and the contour feature E_cx; the encoded features E_sx and E_cx are then added to obtain the feature E_scx, and the SoftMax function is applied to obtain the normalized feature Z. Based on Z, the attention weight formula computes the weight a_c of the skeleton feature E_sx and the weight b_c of the contour feature E_cx. Finally, the computed weights a_c and b_c are multiplied by the corresponding features E_sx and E_cx to obtain the weighted skeleton feature E_asx and the weighted contour feature E_bcx. The computation is described as follows:

$$a_c=\frac{e^{A_cZ}}{e^{A_cZ}+e^{B_cZ}},\qquad b_c=\frac{e^{B_cZ}}{e^{A_cZ}+e^{B_cZ}},\qquad E_{asx}=a_c\cdot E_{sx},\qquad E_{bcx}=b_c\cdot E_{cx}$$

where the subscript c denotes computation on channel c of the corresponding quantity, and A and B are two matrices of learnable parameters.

Preferably, in the second generative adversarial network, the target-style image ŷ is processed by the skeleton extraction module Ske and the contour extraction module Con to extract the corresponding skeleton information sŷ and contour information cŷ, which are fused by the skeleton-contour fusion module SCF. The target-style image ŷ is fed to generator G_x, which during processing concatenates ŷ at the channel level with the corresponding skeleton and contour features produced by the SCF module, and reconstructs a reconstructed image x̂ consistent with the source-domain style. Source-domain images x are collected to form the source-domain dataset X; the reconstructed image x̂ and a source-domain image x from X are fed to discriminator D_x, which judges whether the results it returns for the two are consistent, thereby assessing the authenticity of the reconstructed image x̂.

Preferably, in Step 2, in the first generative adversarial network of the CycleGAN model, discriminator D_y computes the difference in font style between the target-style image and the target-domain images, i.e., the first adversarial loss L_advy, which is used to optimize generator G_y. The input of the second generative adversarial network is based on the output of generator G_y in the first network; discriminator D_x in the second network computes the difference in font style between the source-domain image and the reconstructed image, i.e., the second adversarial loss L_advx, which is used to optimize generator G_x.

The cycle-consistency loss L_cyc, the skeleton-consistency loss L_ske, the contour-consistency loss L_con, and the inexact pairing loss L_inex all serve to optimize generators G_x and G_y. The cycle-consistency loss L_cyc is the loss between the source-domain-style original image x and the reconstructed image x̂; the skeleton-consistency loss L_ske is the loss between the skeleton information sx of the original image x and the skeleton information extracted from the reconstructed image x̂; the contour-consistency loss L_con is the loss between the contour information cx of the original image x and the contour information extracted from the reconstructed image x̂; the inexact pairing loss L_inex is the loss between the inexact paired data y_inex and the target-style image ŷ_inex matched to the inexact paired data.

Preferably, the second adversarial loss L_advx, the first adversarial loss L_advy, the cycle-consistency loss L_cyc, the skeleton-consistency loss L_ske, the contour-consistency loss L_con, and the inexact pairing loss L_inex are computed, in that order, as follows:

$$L_{advx}=\mathbb{E}_{x\sim X}\left[\log D_x(x)\right]+\mathbb{E}_{\hat{x}\sim\hat{X}}\left[\log\left(1-D_x(\hat{x})\right)\right]$$

$$L_{advy}=\mathbb{E}_{y\sim Y}\left[\log D_y(y)\right]+\mathbb{E}_{\hat{y}\sim\hat{Y}}\left[\log\left(1-D_y(\hat{y})\right)\right]$$

$$L_{cyc}=\mathbb{E}_{x\sim X,\,\hat{x}\sim\hat{X}}\left[\left\|x-\hat{x}\right\|_1\right]$$

$$L_{ske}=\mathbb{E}_{x\sim X,\,\hat{x}\sim\hat{X}}\left[\left\|Ske(x)-Ske(\hat{x})\right\|_1\right]$$

$$L_{con}=\mathbb{E}_{x\sim X,\,\hat{x}\sim\hat{X}}\left[\left\|Con(x)-Con(\hat{x})\right\|_1\right]$$

$$L_{inex}=\mathbb{E}_{y_{inex}\sim Y_{inex},\,\hat{y}_{inex}\sim\hat{Y}_{inex}}\left[\left\|y_{inex}-\hat{y}_{inex}\right\|_1\right]$$

where E_{x∼X}[·] denotes the expectation of the bracketed quantity over the distribution of source-domain images x in the source-domain dataset X, and E_{x̂∼X̂}[·] the expectation over the distribution of reconstructed images x̂ in the set X̂ of reconstructed images; D_x(x) denotes the probability that discriminator D_x identifies the source-domain image x as a source-domain image, and log(1 − D_x(x̂)) corresponds to the probability that D_x identifies the reconstructed image x̂ as not being a source-domain image. E_{y∼Y}[·] denotes the expectation over the distribution of target-domain images y in the target-domain dataset Y, and E_{ŷ∼Ŷ}[·] the expectation over the distribution of target-style images ŷ in the set Ŷ of target-style images; D_y(y) denotes the probability that discriminator D_y identifies the target-domain image y as a target-domain image, and log(1 − D_y(ŷ)) corresponds to the probability that D_y identifies the target-style image ŷ as not being a target-domain image. E_{x∼X, x̂∼X̂}[‖·‖₁] denotes the expectation of the 1-norm of the bracketed quantity over the distributions of x in X and x̂ in X̂; Ske(x) and Ske(x̂) denote the results of processing the source-domain image x and the reconstructed image x̂ with the skeleton extraction module Ske, and Con(x) and Con(x̂) the results of processing them with the contour extraction module Con. X̂ denotes the set of reconstructed images x̂, Y_inex the set of inexact paired data y_inex, ŷ_inex the target-style image matched to the inexact paired data, and Ŷ_inex the set of such images; E_{y_inex∼Y_inex, ŷ_inex∼Ŷ_inex}[‖·‖₁] denotes the expectation of the 1-norm over the distributions of y_inex in Y_inex and ŷ_inex in Ŷ_inex.

Preferably, the model loss L of the whole model is computed as:

$$L=L_{advx}+L_{advy}+\lambda_{cyc}L_{cyc}+\lambda_{ske}L_{ske}+\lambda_{con}L_{con}+\lambda_{inex}L_{inex}$$

where λ_cyc, λ_ske, λ_con, and λ_inex are four tunable hyperparameters corresponding to the cycle-consistency loss L_cyc, the skeleton-consistency loss L_ske, the contour-consistency loss L_con, and the inexact pairing loss L_inex respectively, each representing the weight of the corresponding loss in the whole model loss.

The present invention has the following advantages. Calligraphy fonts are more complex than ordinary fonts, exhibiting many calligraphic style features such as connected strokes, stroke sharpness, and stroke thickness, which are hard to characterize with skeletons, stroke encodings, or other components alone. This scheme therefore introduces contours to represent these style features. Since contour information alone cannot determine the content of a character, an effective skeleton-contour fusion module is introduced to fuse skeleton and contour information. The scheme also lets the inexact paired data module IPaD automatically recognize the characters in the calligraphy dataset and record them as recognition labels, yielding an inexact paired dataset, which is used to compute an image-level loss between each generated image and the corresponding inexact paired image. With these technical features, the scheme jointly exploits skeleton and contour information, generates Chinese calligraphy fonts automatically without requiring a large number of paired samples, and achieves a high-quality rendering of content and style.

Brief description of the drawings

Figure 1 is the model flow chart of the calligraphy character generation method based on skeleton and contour of the present invention.

Figure 2 is a schematic diagram of the workflow of the skeleton-contour fusion module SCF of the present invention.

Figure 3 compares the Chinese character generation results of the present invention with those of the prior art.

Figure 4 compares the regular script font with the calligraphy fonts of Yu Youren and Zhu Suiliang.

Figure 5 shows the results of the present invention converting four groups of Chinese characters in regular script into the calligraphy fonts of Bada Shanren, Huang Tingjian, Zhu Suiliang, and Master Hongyi respectively.

Detailed description of the embodiments

The specific embodiments of the present invention are described in further detail below through the description of examples with reference to the accompanying drawings, to help those skilled in the art gain a more complete, accurate, and thorough understanding of the inventive concept and technical solution of the present invention.

As shown in Figures 1 and 2, the present invention provides a calligraphy character generation method based on skeleton and contour, comprising the following steps.

Step 1: build the model.

The CycleGAN model, i.e., the cycle-consistent generative adversarial model, is an unsupervised learning model. It contains two generative adversarial networks: the first comprises a generator G_y and a discriminator D_y, and the second comprises a generator G_x and a discriminator D_x. In this scheme, generator G_y converts the original image into a target-style image, and discriminator D_y judges whether the font style of the generated target-style image is consistent with that of the target-domain images, i.e., judges the authenticity of the target-style image. The second generative adversarial network runs the reverse process to reconstruct the output of the first: generator G_x converts the target-style image into a reconstructed image in the source-domain style, and discriminator D_x judges whether the font style of the reconstructed image is consistent with that of the source-domain images, i.e., judges the authenticity of the reconstructed image.

Each generator in the above generative adversarial networks consists of an encoder, a converter, and a decoder. In the first generative adversarial network of the CycleGAN model, discriminator D_y computes the difference in font style between the target-style image and the target-domain images; the loss of discriminator D_y combined with the loss of generator G_y forms the first adversarial loss L_advy, used to optimize generator G_y. The input of the second generative adversarial network is based on the output of generator G_y in the first network; discriminator D_x computes the difference in font style between the source-domain image and the reconstructed image, and its loss combined with the loss of generator G_x forms the second adversarial loss L_advx, used to optimize generator G_x. During training, discriminators D_y and D_x are generally trained first, and the corresponding generators are then optimized with the two adversarial losses obtained from the discriminators. Training the generators in a conventional CycleGAN model is essentially the process of minimizing these two adversarial losses. Training can also alternate between the discriminators and the generators.
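A minimal PyTorch sketch of one such alternating training step is given below for illustration. The one-layer stand-in networks, the optimizer settings, and the cycle-loss weight of 10.0 are placeholder assumptions rather than the patent's implementation; only the alternation and the loss structure follow the description above.

```python
import torch
import torch.nn as nn

# One-layer stand-ins so the loop runs; the patent's generators actually
# consist of an encoder, a converter, and a decoder.
class G(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 1, 3, padding=1)
    def forward(self, img):
        return torch.sigmoid(self.conv(img))

class D(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 1, 3, padding=1)
    def forward(self, img):
        # probability that img belongs to this discriminator's "real" domain
        return torch.sigmoid(self.conv(img).mean(dim=(1, 2, 3)))

G_y, G_x, D_y, D_x = G(), G(), D(), D()
opt_G = torch.optim.Adam(list(G_y.parameters()) + list(G_x.parameters()), lr=2e-4)
opt_D = torch.optim.Adam(list(D_y.parameters()) + list(D_x.parameters()), lr=2e-4)
eps = 1e-8

x = torch.rand(4, 1, 64, 64)  # source-domain batch (e.g. regular script)
y = torch.rand(4, 1, 64, 64)  # target-domain batch (calligraphy samples)

# 1) Discriminator step: maximize log D(real) + log(1 - D(fake)).
with torch.no_grad():
    y_hat = G_y(x)
    x_hat = G_x(y_hat)
loss_D = -(torch.log(D_y(y) + eps).mean()
           + torch.log(1 - D_y(y_hat) + eps).mean()
           + torch.log(D_x(x) + eps).mean()
           + torch.log(1 - D_x(x_hat) + eps).mean())
opt_D.zero_grad(); loss_D.backward(); opt_D.step()

# 2) Generator step: minimize log(1 - D(fake)) plus the cycle term.
y_hat = G_y(x)
x_hat = G_x(y_hat)
loss_adv = (torch.log(1 - D_y(y_hat) + eps).mean()
            + torch.log(1 - D_x(x_hat) + eps).mean())
loss_cyc = (x - x_hat).abs().mean()   # L1 cycle-consistency
loss_G = loss_adv + 10.0 * loss_cyc   # 10.0 is an assumed weight
opt_G.zero_grad(); loss_G.backward(); opt_G.step()
```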

In this scheme, the CycleGAN model serves as the base model, so that the two mappings between the source domain and the target domain can be learned; CycleGAN introduces a cycle-consistency loss that helps overcome the limitation of paired data. The model built in this scheme uses CycleGAN as the backbone network; it contains two generative adversarial networks, and both include the contour extraction module Con, the skeleton extraction module Ske, and the skeleton-contour fusion module SCF. In the contour extraction module, since calligraphy feature images are usually represented in grayscale, contour information can easily be extracted with the well-known Canny operator. The skeleton extraction module adopts an existing rule-based skeleton scheme (the extraction method disclosed in the paper: Jie Zhou, Yefei Wang, Yiyang Yuan, Qing Huang, and Jinshan Zeng, "SGCE-Font: Skeleton guided channel expansion for Chinese font generation," arXiv preprint arXiv:2211.14475, 2022) to extract skeleton information effectively. In addition, the model has an inexact paired data module IPaD, which uses an existing Chinese character recognition (CCR) method (for example, the recognition method disclosed in the paper: Jinshan Zeng, Ruiying Xu, Yu Wu, Hongwei Li, and Jiaxing Lu, "Zero-shot Chinese character recognition with stroke and radical-level decompositions," in Proceedings of the International Joint Conference on Neural Networks, 2023) to automatically recognize the characters in the calligraphy dataset and record them as recognition labels; after a target-style image is generated, similarity pairing is performed according to that image. The IPaD module differs from the prior art in that incorrect recognition labels are allowed for the calligraphy samples involved during pairing, i.e., a paired result may be a Chinese character that is similar to but different from the original image. Although some calligraphy characters are thus recognized incorrectly, they can still provide important reference information for the related calligraphy characters.
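As an illustration of the two extraction modules, the following sketch uses OpenCV's Canny operator for the contour, as the description suggests; for the skeleton it substitutes scikit-image's morphological skeletonize as a simple stand-in for the rule-based SGCE-Font scheme, whose exact procedure is not reproduced here. The thresholds, the binarization cutoff, and the file name are illustrative assumptions.

```python
import cv2
import numpy as np
from skimage.morphology import skeletonize

def extract_contour(glyph_gray: np.ndarray) -> np.ndarray:
    """Contour map via the Canny operator; the 100/200 hysteresis
    thresholds are illustrative choices, not values from the patent."""
    return cv2.Canny(glyph_gray, 100, 200)

def extract_skeleton(glyph_gray: np.ndarray) -> np.ndarray:
    """One-pixel-wide skeleton. The patent adopts the rule-based scheme
    of SGCE-Font; morphological skeletonization is only a stand-in."""
    ink = glyph_gray < 128  # assume dark strokes on a light background
    return skeletonize(ink).astype(np.uint8) * 255

glyph = cv2.imread("glyph.png", cv2.IMREAD_GRAYSCALE)  # hypothetical sample
contour_map = extract_contour(glyph)
skeleton_map = extract_skeleton(glyph)
```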

Compared with simplified Chinese fonts, calligraphy fonts are more complex, exhibiting many calligraphic style features such as connected strokes, stroke sharpness, and stroke thickness, which are hard to characterize with skeletons, stroke encodings, or other components alone. Contours are therefore introduced to represent these style features. Since contour information alone cannot determine the content of a character, an effective skeleton-contour fusion module is introduced to fuse skeleton and contour information. The architecture of the skeleton-contour fusion module is shown in Figure 2.

Step 2: train the model.

The above model integrates the skeleton-contour fusion module SCF with the inexact paired data module IPaD. The proposed model fuses the skeleton and contour information of Chinese characters, providing comprehensive structural supervision.

The basic training workflow is as follows. The Chinese character font style fed to the model is the source-domain style, and Chinese character images in that style are the source-domain images, collected as training samples. The font style of the calligraphy font images to be generated is the target style, and calligraphy font images in the target style are the target-domain images, collected to form the calligraphy dataset. During training, a source-domain image is fed to the model as the original image; the first generative adversarial network converts it into a target-style image, and the second generative adversarial network converts the target-style image output by the first network into a reconstructed image. The font style of the target-style image should match the target style, and the font style of the reconstructed image should match the source-domain style. The model is optimized during training by computing the loss of the whole model, with the objective of minimizing that loss. Meanwhile, the inexact paired data module IPaD automatically recognizes the characters in the calligraphy dataset and records them as recognition labels, and then performs inexact pairing in the calligraphy dataset according to the target-style image; that is, incorrect recognition labels are allowed for the calligraphy samples involved during pairing.
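The inexact pairing can be pictured as a label-indexed lookup, sketched below under assumptions: `recognize` stands for the cited CCR method as an assumed black box, and the matching policy (first glyph per label) is a placeholder. The point is that mislabeled samples simply stay in the pool, which is what makes the pairing inexact.

```python
from collections import defaultdict

def build_inexact_pairs(calligraphy_imgs, source_labels, recognize):
    """calligraphy_imgs: target-domain calligraphy glyph images.
    source_labels: the character contained in each source/generated image.
    recognize: assumed CCR black box mapping a glyph image to a character
    label; its occasional mistakes are tolerated, which is exactly what
    makes the resulting pairing inexact."""
    by_label = defaultdict(list)
    for img in calligraphy_imgs:
        by_label[recognize(img)].append(img)  # label may be wrong
    pairs = {}
    for ch in source_labels:
        if by_label[ch]:
            pairs[ch] = by_label[ch][0]  # similar but possibly different glyph
    return pairs
```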

Specifically, in the first generative adversarial network, the source-domain image x serving as the input original image is processed by the skeleton extraction module Ske and the contour extraction module Con, which extract the skeleton information sx and the contour information cx respectively; the two are fused by the skeleton-contour fusion module SCF. The SCF module is a form of cross-attention module. After the skeleton and contour information of a given Chinese character are fed to it, the SCF module first passes them through the relevant encoders (the corresponding skeleton encoder and contour encoder) to produce the skeleton feature E_sx and the contour feature E_cx; the encoded features E_sx and E_cx are then added to obtain the feature E_scx, and the SoftMax function is applied to obtain the normalized feature Z. Based on Z, the attention weight formula computes the weight a_c of the skeleton feature E_sx and the weight b_c of the contour feature E_cx. Finally, the computed weights a_c and b_c are multiplied by the corresponding features E_sx and E_cx to obtain the weighted skeleton feature E_asx and the weighted contour feature E_bcx, computed as follows:

$$a_c=\frac{e^{A_cZ}}{e^{A_cZ}+e^{B_cZ}},\qquad b_c=\frac{e^{B_cZ}}{e^{A_cZ}+e^{B_cZ}},\qquad E_{asx}=a_c\cdot E_{sx},\qquad E_{bcx}=b_c\cdot E_{cx}$$

where the subscript c denotes computation on channel c of the corresponding quantity, and A and B are two matrices of learnable parameters.
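The fusion rule can be written compactly in PyTorch, as in the sketch below. The encoder depth, the channel width, and the per-channel global average taken before the SoftMax are assumptions; only the element-wise sum, the normalization, the two-branch weighting with learnable matrices A and B, and the re-weighting of E_sx and E_cx follow the description above.

```python
import torch
import torch.nn as nn

class SCF(nn.Module):
    """Skeleton-contour fusion: splits a per-channel unit weight budget
    (a_c + b_c = 1) between the skeleton and contour branches."""
    def __init__(self, ch: int = 64):
        super().__init__()
        self.enc_s = nn.Conv2d(1, ch, 3, padding=1)  # skeleton encoder stub
        self.enc_c = nn.Conv2d(1, ch, 3, padding=1)  # contour encoder stub
        self.A = nn.Linear(ch, ch, bias=False)       # learnable matrix A
        self.B = nn.Linear(ch, ch, bias=False)       # learnable matrix B

    def forward(self, skel, cont):
        E_sx, E_cx = self.enc_s(skel), self.enc_c(cont)
        E_scx = E_sx + E_cx                               # element-wise sum
        Z = torch.softmax(E_scx.mean(dim=(2, 3)), dim=1)  # normalized feature
        ea, eb = self.A(Z).exp(), self.B(Z).exp()
        a = (ea / (ea + eb))[..., None, None]             # weight a_c
        b = (eb / (ea + eb))[..., None, None]             # weight b_c = 1 - a_c
        return a * E_sx, b * E_cx                         # E_asx, E_bcx

scf = SCF()
sx = torch.rand(2, 1, 64, 64)  # skeleton map of a character
cx = torch.rand(2, 1, 64, 64)  # contour map of the same character
E_asx, E_bcx = scf(sx, cx)
```

By construction a_c + b_c = 1 on every channel, so the module divides a unit attention budget between the skeleton branch and the contour branch.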

The original image x is fed to generator G_y, which during processing concatenates x at the channel level with the skeleton feature E_asx and contour feature E_bcx produced by the skeleton-contour fusion module SCF, and generates the target-style image ŷ; discriminator D_y then assesses the authenticity of ŷ, i.e., the target-style image ŷ and a target-domain image are each fed to discriminator D_y, which judges whether the results it returns for the two are consistent.
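Continuing the SCF sketch above, the channel-level splicing that precedes the generator body can be pictured as a plain concatenation; the first generator layer shown is a hypothetical stub sized to accept the widened input.

```python
x = torch.rand(2, 1, 64, 64)                  # original source-domain image
gen_in = torch.cat([x, E_asx, E_bcx], dim=1)  # splice along the channel axis
first_layer = nn.Conv2d(1 + 64 + 64, 64, 3, padding=1)  # assumed head of G_y
hidden = first_layer(gen_in)                  # the generator proceeds from here
```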

Then, in the second generative adversarial network, the target-style image ŷ is processed by the skeleton extraction module Ske and the contour extraction module Con to extract the corresponding skeleton information and contour information, which are fused by the skeleton-contour fusion module SCF. The target-style image ŷ is fed to generator G_x, which during processing concatenates ŷ at the channel level with the corresponding skeleton and contour features produced by the SCF module, and reconstructs a reconstructed image x̂ consistent with the source-domain style. Discriminator D_x then assesses the authenticity of x̂: the reconstructed image x̂ and images from the source-domain dataset X are fed to discriminator D_x, which judges whether the results it returns for the two are consistent.

According to the workflow described above, the model loss proposed in this scheme has six main components: the cycle-consistency loss L_cyc, the skeleton-consistency loss L_ske, the contour-consistency loss L_con, the inexact pairing loss L_inex, and the two adversarial losses L_advx and L_advy. Of the two adversarial losses, L_advx corresponds to generator G_x and discriminator D_x, and L_advy corresponds to generator G_y and discriminator D_y; the cycle-consistency loss L_cyc is the loss between the source-domain-style original image x and the reconstructed image x̂. The two adversarial losses and the cycle-consistency loss are the loss functions of the CycleGAN model itself; during training, the model is optimized by minimizing these loss functions, completing the corresponding model training.

Since this scheme also extracts the skeleton and contour information of the images, the model additionally has a contour-consistency loss and a skeleton-consistency loss. The skeleton-consistency loss L_ske is the loss between the skeleton information sx of the original image x and the skeleton information extracted from the reconstructed image x̂; the contour-consistency loss L_con is the loss between the contour information cx of the original image x and the contour information extracted from the reconstructed image x̂. Finally, since the scheme applies the inexact paired data module IPaD to the target-domain dataset to inexactly pair the calligraphy dataset, the loss also includes the inexact pairing loss.

If a target-style image ŷ generated by generator G_y cannot be exactly paired with a target-domain image, inexact pairing is performed, i.e., incorrect recognition labels are allowed for the calligraphy samples involved, which yields the inexact paired data y_inex; the corresponding target-style image is then the target-style image ŷ_inex matched to the inexact paired data y_inex. The inexact pairing loss L_inex is the loss between y_inex and ŷ_inex. Among the losses above, the cycle-consistency loss L_cyc, the skeleton-consistency loss L_ske, the contour-consistency loss L_con, and the inexact pairing loss L_inex are all used to optimize generators G_y and G_x. The loss functions are computed as follows:

$$L_{advx}=\mathbb{E}_{x\sim X}\left[\log D_x(x)\right]+\mathbb{E}_{\hat{x}\sim\hat{X}}\left[\log\left(1-D_x(\hat{x})\right)\right]$$

$$L_{advy}=\mathbb{E}_{y\sim Y}\left[\log D_y(y)\right]+\mathbb{E}_{\hat{y}\sim\hat{Y}}\left[\log\left(1-D_y(\hat{y})\right)\right]$$

$$L_{cyc}=\mathbb{E}_{x\sim X,\,\hat{x}\sim\hat{X}}\left[\left\|x-\hat{x}\right\|_1\right]$$

$$L_{ske}=\mathbb{E}_{x\sim X,\,\hat{x}\sim\hat{X}}\left[\left\|Ske(x)-Ske(\hat{x})\right\|_1\right]$$

$$L_{con}=\mathbb{E}_{x\sim X,\,\hat{x}\sim\hat{X}}\left[\left\|Con(x)-Con(\hat{x})\right\|_1\right]$$

$$L_{inex}=\mathbb{E}_{y_{inex}\sim Y_{inex},\,\hat{y}_{inex}\sim\hat{Y}_{inex}}\left[\left\|y_{inex}-\hat{y}_{inex}\right\|_1\right]$$

where E_{x∼X}[·] denotes the expectation of the bracketed quantity over the distribution of source-domain images x in the source-domain dataset X, and E_{x̂∼X̂}[·] the expectation over the distribution of reconstructed images x̂ in the set X̂ of reconstructed images. D_x(x) denotes the probability that discriminator D_x identifies the source-domain image x as a source-domain image: the smaller the loss of discriminator D_x, the larger log D_x(x) and the smaller the second adversarial loss. log(1 − D_x(x̂)) corresponds to the probability that D_x identifies the reconstructed image x̂ as not being a source-domain image: as generator G_x is optimized during training, a smaller generator loss indicates a smaller font-style difference between the reconstructed image x̂ and the source-domain image x, so log(1 − D_x(x̂)) shrinks and the probability that D_x identifies x̂ correctly falls, which increases the loss of discriminator D_x while the second adversarial loss decreases. E_{y∼Y}[·] denotes the expectation over the distribution of target-domain images y in the target-domain dataset Y, and E_{ŷ∼Ŷ}[·] the expectation over the distribution of target-style images ŷ in the set Ŷ of target-style images. D_y(y) denotes the probability that discriminator D_y identifies the target-domain image y as a target-domain image: the smaller the loss of discriminator D_y, the larger log D_y(y) and the smaller the first adversarial loss. log(1 − D_y(ŷ)) corresponds to the probability that D_y identifies the target-style image ŷ as not being a target-domain image: as generator G_y is optimized during training, a smaller generator loss indicates a smaller font-style difference between the target-style image ŷ and the target-domain image y, so log(1 − D_y(ŷ)) shrinks and the probability that D_y identifies ŷ correctly falls, which increases the loss of discriminator D_y while the first adversarial loss decreases. E_{x∼X, x̂∼X̂}[‖·‖₁] denotes the expectation of the 1-norm of the bracketed quantity over the distributions of x in X and x̂ in X̂; Ske(x) and Ske(x̂) denote the results of processing the source-domain image x and the reconstructed image x̂ with the skeleton extraction module Ske, and Con(x) and Con(x̂) the results of processing them with the contour extraction module Con. X̂ denotes the set of reconstructed images x̂, Y_inex the set of inexact paired data y_inex, ŷ_inex the target-style image matched to the inexact paired data, and Ŷ_inex the set of such images; E_{y_inex∼Y_inex, ŷ_inex∼Ŷ_inex}[‖·‖₁] denotes the expectation of the 1-norm over the distributions of y_inex in Y_inex and ŷ_inex in Ŷ_inex.
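Reusing the stand-in modules from the earlier sketches, the six losses can be assembled as follows; Ske and Con stand for the extraction modules and (y_inex, ŷ_inex) for an inexactly matched pair, all placeholders rather than the patent's implementation.

```python
def model_losses(x, y, y_inex, y_hat_inex,
                 G_y, G_x, D_y, D_x, Ske, Con, eps=1e-8):
    """Assembles the six losses; Ske/Con are the extraction modules and
    (y_inex, y_hat_inex) an inexactly matched pair, all placeholders."""
    y_hat = G_y(x)      # target-style image ŷ
    x_hat = G_x(y_hat)  # reconstructed image x̂
    L_advy = (torch.log(D_y(y) + eps)
              + torch.log(1 - D_y(y_hat) + eps)).mean()
    L_advx = (torch.log(D_x(x) + eps)
              + torch.log(1 - D_x(x_hat) + eps)).mean()
    L_cyc = (x - x_hat).abs().mean()             # L1 cycle-consistency
    L_ske = (Ske(x) - Ske(x_hat)).abs().mean()   # skeleton consistency
    L_con = (Con(x) - Con(x_hat)).abs().mean()   # contour consistency
    L_inex = (y_inex - y_hat_inex).abs().mean()  # inexact pairing
    return L_advx, L_advy, L_cyc, L_ske, L_con, L_inex
```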

In the CycleGAN model, the relationship between the loss L of the whole model (this model, i.e., the model of the calligraphy character generation method based on skeleton, contour, and inexact paired data, is abbreviated SCI-Font, where S, C, and I correspond to the skeleton extraction module Ske, the contour extraction module Con, and the inexact paired data module IPaD respectively), the losses of all generators G, and the losses of all discriminators D can be described by the following expression:

$$\min_{G}\max_{D}\;L(G,D)$$

where min over G and max over D indicates that the larger the loss of the discriminators D in the model, the smaller the loss of the generators G. The expression means that, over the range of the losses of all generators G and of all discriminators D, a point is sought at which L attains its minimum: there the losses of all generators G are minimal, the losses of all discriminators D are maximal, and L reaches its optimum. Based on this relationship, the model is optimized during training using the model loss fed back in training, reducing the model loss.

Combining the other loss functions, the model loss L of the whole model is computed as:

$$L=L_{advx}+L_{advy}+\lambda_{cyc}L_{cyc}+\lambda_{ske}L_{ske}+\lambda_{con}L_{con}+\lambda_{inex}L_{inex}$$

where λ_cyc, λ_ske, λ_con, and λ_inex are four tunable hyperparameters corresponding to the cycle-consistency loss L_cyc, the skeleton-consistency loss L_ske, the contour-consistency loss L_con, and the inexact pairing loss L_inex respectively, each representing the weight of its loss in the whole model loss; the hyperparameters are optimized and an optimal set is selected to improve learning performance.
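Continuing the loss sketch above, the weighted sum is then straightforward; the λ values and the identity stand-ins shown are assumed, not taken from the patent.

```python
y_inex = torch.rand(4, 1, 64, 64)  # inexactly paired calligraphy batch
y_hat_inex = G_y(x)                # generated glyphs matched to y_inex
Ske = Con = lambda t: t            # identity stand-ins for the modules
lam_cyc, lam_ske, lam_con, lam_inex = 10.0, 1.0, 1.0, 1.0  # assumed weights

L_advx, L_advy, L_cyc, L_ske, L_con, L_inex = model_losses(
    x, y, y_inex, y_hat_inex, G_y, G_x, D_y, D_x, Ske, Con)
L_total = (L_advx + L_advy + lam_cyc * L_cyc + lam_ske * L_ske
           + lam_con * L_con + lam_inex * L_inex)
```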

Step 3: obtain the optimized model for automatic calligraphy font generation.

Based on the above model and training method, the model of this scheme fuses the skeleton and contour information of Chinese characters and uses them as explicit representations to strengthen the latent content-style representation produced by the decoder, which effectively captures the content and style characteristics of calligraphy fonts. Noting the difficulty of collecting paired data, automatic Chinese character recognition techniques are used to generate an inexact paired dataset that further supervises model performance; inexact paired data better supervise the glyph differences between the source and target domains. Although some calligraphy characters are recognized incorrectly, they still provide important reference information for the related calligraphy characters, all of which gives important technical support for generating the content of calligraphy characters.

Font generation experiments were conducted with the method provided by the present invention and compared with other existing font generation techniques. The comparison of the generated fonts is shown in Figure 3, where the results of the different generation methods are arranged from top to bottom. The Chinese characters used are divided into three groups from left to right, each group containing four different characters: from left to right, Liu Gongquan's calligraphy font generated from regular script, Yan Zhenqing's calligraphy font generated from regular script, and Ouyang Xiu's calligraphy font generated from regular script. The circles marked in the figure indicate defects that appeared during generation, and the characters enclosed in boxes indicate that the shapes of the generated characters are inaccurate, i.e., mode collapse occurred. The second-to-last row uses the method and model of the present invention (abbreviated SCI-Font); the figure shows that this method generates calligraphy characters well. Figure 4 compares the regular script font with the calligraphy fonts of Yu Youren and Zhu Suiliang: the strokes and styles of the same characters vary greatly across calligraphy fonts, and there are also simplified/traditional variations, reflecting the complex strokes and varied styles of calligraphy fonts. Figure 5 shows the results of applying this method to convert four groups of regular-script characters into the calligraphy fonts of Bada Shanren, Huang Tingjian, Zhu Suiliang, and Master Hongyi respectively. Among the four groups, "悼" in the first group, "秉" in the second, "蜀" in the third, and "郝" in the fourth all produced characters different from the input characters, while the calligraphy font style met the requirements, demonstrating the inexact pairing phenomenon in this method.

The present invention has been described above by way of example with reference to the accompanying drawings. The specific implementation of the present invention is evidently not limited to the above manner: any non-substantive improvement made using the inventive concept and technical solution of the present invention, or any direct application of that concept and solution to other situations without improvement, falls within the protection scope of the present invention.

Claims (8)

1. A calligraphy character generation method based on skeleton and contour, comprising the following steps:

Step 1: build a model; the model uses the CycleGAN model as its backbone network, and the CycleGAN model comprises two generative adversarial networks;

Step 2: train the model; the Chinese character font style input to the model is the source-domain style, and a Chinese character image in the source-domain style is a source-domain image; source-domain images are collected as training samples; the font style of the calligraphy images to be generated is the target style, and a calligraphy image in the target style is a target-domain image; target-domain images are collected to form a calligraphy dataset; during training, a source-domain image is input to the model as the original image, the first generative adversarial network converts the original image into a target-style image, and the second generative adversarial network converts the target-style image output by the first generative adversarial network into a reconstructed image; the font style of the target-style image should be consistent with the target style, and the font style of the reconstructed image should be consistent with the source-domain style; during training, the model is optimized by computing the loss of the entire model, the optimization objective being to minimize this loss;

Step 3: obtain the optimized model for automatic generation of calligraphy fonts;

characterized in that: both generative adversarial networks include a contour extraction module Con, a skeleton extraction module Ske and a skeleton-contour fusion module SCF, and the model further includes an inexact paired data module IPaD;

in Step 2, both generative adversarial networks extract skeleton information and contour information through the skeleton extraction module Ske and the contour extraction module Con respectively, fuse the skeleton and contour information through the skeleton-contour fusion module SCF, concatenate the fused result inside the generator with the image input to the generator, and the corresponding generator then processes the result to produce the generated image;

the inexact paired data module IPaD automatically recognizes the characters in the calligraphy dataset and records them as recognition labels, and then performs inexact pairing within the calligraphy dataset according to the target-style image; during pairing, wrong recognition labels are allowed for the relevant calligraphy data, thereby obtaining inexact paired data;

the loss of the entire model includes the first adversarial loss L_advy, the second adversarial loss L_advx, the cycle consistency loss L_cyc, the skeleton consistency loss L_ske, the contour consistency loss L_con and the inexact pairing loss L_inex.
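The IPaD pairing described in claim 1 can be sketched as follows. The recognizer callable `recognize` and the `{character: image}` layout of the source set are assumptions, since the claim only requires that each calligraphy sample carry a (possibly wrong) recognition label and be paired by that label.

```python
# Hypothetical IPaD pairing: labels come from an off-the-shelf recognizer and
# may be wrong; pairing simply trusts them, which makes the pairs "inexact".
from collections import defaultdict

def build_inexact_pairs(calligraphy_images, source_images, recognize):
    """recognize(img) -> str is an assumed character-recognition callable;
    source_images is an assumed {character: image} mapping."""
    by_label = defaultdict(list)
    for img in calligraphy_images:
        by_label[recognize(img)].append(img)    # wrong labels are kept on purpose
    pairs = []
    for char, src in source_images.items():
        for target in by_label.get(char, []):   # may match a misrecognized glyph
            pairs.append((src, target))         # (x, y_inex) pairs used by L_inex
    return pairs
```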
2. The calligraphy character generation method based on skeleton and contour according to claim 1, characterized in that: in Step 1, the first generative adversarial network comprises a generator G_y and a discriminator D_y, and the second generative adversarial network comprises a generator G_x and a discriminator D_x; the generator G_y converts the original image into a target-style image, and the discriminator D_y judges whether the font style of the generated target-style image is consistent with that of the target-domain images; the second generative adversarial network applies the reverse process to reconstruct the output of the first, i.e., the generator G_x converts the target-style image into a reconstructed image in the source-domain style, and the discriminator D_x judges whether the font style of the reconstructed image is consistent with that of the source-domain images.

3. The calligraphy character generation method based on skeleton and contour according to claim 2, characterized in that: in Step 2, in the first generative adversarial network, the source-domain image x, as the input original image, is processed by the skeleton extraction module Ske and the contour extraction module Con to extract skeleton information sx and contour information cx, which are fused through the skeleton-contour fusion module SCF; the original image x is input to the generator G_y, which during processing concatenates, at the channel level, the original image x with the weighted skeleton feature E_asx and the weighted contour feature E_bcx produced by the SCF module, and generates the target-style image ŷ; target-domain images y are collected to form the target-domain dataset Y; the target-style image ŷ and a target-domain image y from Y are separately input to the discriminator D_y, and whether the results returned by D_y for the two are consistent is used to assess the realism of the target-style image ŷ.
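The data flow of claims 2-3 (together with the reverse pass of claim 5 below) can be summarized in a short PyTorch sketch; G_y, G_x, D_y, D_x, ske, con and scf are placeholder modules, and only the channel-level concatenation and cycle wiring follow the claim text.

```python
import torch

def forward_cycle(x, G_y, G_x, D_y, D_x, ske, con, scf):
    """One pass x -> y_hat -> x_hat through both GANs; all module arguments
    are placeholders for the networks named in the claims."""
    e_asx, e_bcx = scf(ske(x), con(x))                  # weighted skeleton/contour
    y_hat = G_y(torch.cat([x, e_asx, e_bcx], dim=1))    # channel-level concat
    e_asy, e_bcy = scf(ske(y_hat), con(y_hat))
    x_hat = G_x(torch.cat([y_hat, e_asy, e_bcy], dim=1))
    return y_hat, x_hat, D_y(y_hat), D_x(x_hat)         # images + realism scores
```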
4. The calligraphy character generation method based on skeleton and contour according to claim 3, characterized in that: after the skeleton information and contour information of a given Chinese character are input to the skeleton-contour fusion module SCF, the SCF module first feeds them into the corresponding skeleton encoder and contour encoder to produce the skeleton feature E_sx and the contour feature E_cx; the encoded skeleton feature E_sx and contour feature E_cx are then added to obtain the feature E_scx, and the SoftMax function is applied to obtain the normalized feature z; based on z, the attention weight formula computes the weight a_c of the skeleton feature E_sx and the weight b_c of the contour feature E_cx; finally, the computed weights a_c and b_c are multiplied with the corresponding skeleton feature E_sx and contour feature E_cx to obtain the weighted skeleton feature E_asx and the weighted contour feature E_bcx. The computation is described as

a_c = e^{A_c z} / (e^{A_c z} + e^{B_c z}),  b_c = e^{B_c z} / (e^{A_c z} + e^{B_c z}),  a_c + b_c = 1,

E_asx = a_c · E_sx,  E_bcx = b_c · E_cx,

where the subscript c in a_c, b_c and z denotes the computation on channel c, and A and B are two matrices of learnable parameters (A_c and B_c denoting their c-th rows).

5. The calligraphy character generation method based on skeleton and contour according to claim 4, characterized in that: in the second generative adversarial network, the target-style image ŷ is again processed by the skeleton extraction module Ske and the contour extraction module Con to extract the corresponding skeleton information sŷ and contour information cŷ, which are fused through the skeleton-contour fusion module SCF; ŷ is input to the generator G_x, which during processing concatenates ŷ at the channel level with the corresponding weighted skeleton feature and weighted contour feature obtained from the SCF module, and reconstructs a reconstructed image x̂ consistent with the source-domain style; source-domain images x are collected to form the source-domain dataset X; the reconstructed image x̂ and a source-domain image x from X are input to the discriminator D_x, and whether the results returned by D_x for the two are consistent is used to assess the realism of the reconstructed image x̂.
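A PyTorch sketch of the SCF fusion of claim 4 follows, under the weight formula reconstructed above. The single-convolution encoders and the global average pooling before the SoftMax are assumptions, since the patent does not specify the encoder architecture.

```python
import torch
import torch.nn as nn

class SCF(nn.Module):
    """Skeleton-contour fusion: per-channel softmax weights over two branches."""
    def __init__(self, channels: int):
        super().__init__()
        self.ske_enc = nn.Conv2d(1, channels, 3, padding=1)  # stand-in encoders
        self.con_enc = nn.Conv2d(1, channels, 3, padding=1)
        self.A = nn.Linear(channels, channels, bias=False)   # learnable matrix A
        self.B = nn.Linear(channels, channels, bias=False)   # learnable matrix B

    def forward(self, skeleton, contour):
        e_sx, e_cx = self.ske_enc(skeleton), self.con_enc(contour)
        e_scx = e_sx + e_cx                                  # element-wise sum
        z = e_scx.mean(dim=(2, 3)).softmax(dim=1)            # normalized feature z
        logits = torch.stack([self.A(z), self.B(z)], dim=0)  # shape (2, N, C)
        a, b = logits.softmax(dim=0)                         # a_c + b_c = 1
        e_asx = a[..., None, None] * e_sx                    # weighted skeleton
        e_bcx = b[..., None, None] * e_cx                    # weighted contour
        return e_asx, e_bcx
```

The softmax over the stacked logits computes exactly e^{Az}/(e^{Az}+e^{Bz}) and e^{Bz}/(e^{Az}+e^{Bz}) per channel, matching the reconstructed weight formula.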
6. The calligraphy character generation method based on skeleton and contour according to claim 5, characterized in that: in Step 2, in the first generative adversarial network of the CycleGAN model, the discriminator D_y computes the difference in font style between the target-style image and the target-domain image, i.e., the first adversarial loss L_advy, which is used to optimize the generator G_y; the input of the second generative adversarial network is based on the output of the generator G_y of the first, and the discriminator D_x of the second generative adversarial network computes the difference in font style between the source-domain image and the reconstructed image, i.e., the second adversarial loss L_advx, which is used to optimize the generator G_x;

the cycle consistency loss L_cyc, the skeleton consistency loss L_ske, the contour consistency loss L_con and the inexact pairing loss L_inex all jointly optimize the generator G_x and the generator G_y; the cycle consistency loss L_cyc is the loss between the source-domain original image x and the reconstructed image x̂; the skeleton consistency loss L_ske is the loss between the skeleton information sx of the original image x and the skeleton information sx̂ extracted from the reconstructed image x̂; the contour consistency loss L_con is the loss between the contour information cx of the original image x and the contour information cx̂ extracted from the reconstructed image x̂; the inexact pairing loss L_inex is the loss between the inexact paired data y_inex and the target-style image ŷ_inex corresponding to the inexact paired data.
7. The calligraphy character generation method based on skeleton and contour according to claim 6, characterized in that: the second adversarial loss L_advx, the first adversarial loss L_advy, the cycle consistency loss L_cyc, the skeleton consistency loss L_ske, the contour consistency loss L_con and the inexact pairing loss L_inex are computed, in that order, as:

L_advx = E_{x~X}[log D_x(x)] + E_{x̂~X̂}[log(1 − D_x(x̂))]

L_advy = E_{y~Y}[log D_y(y)] + E_{ŷ~Ŷ}[log(1 − D_y(ŷ))]

L_cyc = E_{x~X, x̂~X̂}[‖x − x̂‖₁]

L_ske = E_{x~X, x̂~X̂}[‖Ske(x) − Ske(x̂)‖₁]

L_con = E_{x~X, x̂~X̂}[‖Con(x) − Con(x̂)‖₁]

L_inex = E_{y_inex~Y_inex, ŷ_inex~Ŷ_inex}[‖y_inex − ŷ_inex‖₁]

where E_{x~X}[ ] denotes the expectation of the bracketed quantity under the distribution of source-domain images x in the source-domain dataset X, and E_{x̂~X̂}[ ] the expectation under the distribution of reconstructed images x̂ in the set X̂ of reconstructed images; D_x(x) denotes the probability that the discriminator D_x identifies the source-domain image x as a source-domain image, and 1 − D_x(x̂) the probability that it identifies the reconstructed image x̂ as not being a source-domain image; E_{y~Y}[ ] denotes the expectation under the distribution of target-domain images y in the target-domain dataset Y, and E_{ŷ~Ŷ}[ ] the expectation under the distribution of target-style images ŷ in the set Ŷ of target-style images; D_y(y) denotes the probability that the discriminator D_y identifies the target-domain image y as a target-domain image, and 1 − D_y(ŷ) the probability that it identifies the target-style image ŷ as not being a target-domain image; E_{x~X, x̂~X̂}[‖·‖₁] denotes the expectation of the 1-norm of the bracketed quantity under the distributions of x in X and x̂ in X̂; Ske(x) and Ske(x̂) denote the results of processing x and x̂ with the skeleton extraction module Ske, and Con(x) and Con(x̂) the results of processing x and x̂ with the contour extraction module Con; Y_inex denotes the set of inexact paired data y_inex, ŷ_inex denotes the target-style image corresponding to the inexact paired data, Ŷ_inex denotes the set of such target-style images, and E_{y_inex~Y_inex, ŷ_inex~Ŷ_inex}[‖·‖₁] denotes the expectation of the 1-norm under the distributions of y_inex in Y_inex and ŷ_inex in Ŷ_inex.
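The six losses of claim 7 translate directly into code. The sketch below assumes sigmoid-output discriminators and writes the adversarial terms as binary cross-entropy, the negated form of the log-likelihood objective above; minimizing the BCE is equivalent to maximizing the adversarial objective.

```python
import torch
import torch.nn.functional as F

def l1(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Empirical expectation of the 1-norm; used by L_cyc, L_ske, L_con, L_inex."""
    return (a - b).abs().mean()

def d_loss(d_real: torch.Tensor, d_fake: torch.Tensor) -> torch.Tensor:
    """Negated form of E[log D(real)] + E[log(1 - D(fake))]; assumes the
    discriminators end in a sigmoid so their outputs lie in (0, 1)."""
    return (F.binary_cross_entropy(d_real, torch.ones_like(d_real)) +
            F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake)))

# Per-batch consistency terms, with ske/con as in the earlier sketch:
#   loss_cyc  = l1(x, x_hat)
#   loss_ske  = l1(ske(x), ske(x_hat))
#   loss_con  = l1(con(x), con(x_hat))
#   loss_inex = l1(y_inex, y_hat_inex)
```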
8. The calligraphy character generation method based on skeleton and contour according to claim 7, characterized in that: the model loss L of the entire model is computed as

L = L_advx + L_advy + λ_cyc · L_cyc + λ_ske · L_ske + λ_con · L_con + λ_inex · L_inex,

where λ_cyc, λ_ske, λ_con and λ_inex are four tunable hyperparameters corresponding to the cycle consistency loss L_cyc, the skeleton consistency loss L_ske, the contour consistency loss L_con and the inexact pairing loss L_inex respectively, representing the weight of the corresponding loss in the overall model loss.
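The total objective of claim 8 then combines the six terms. The default weights below are placeholders, since the patent only states that the four hyperparameters are tunable.

```python
def total_loss(l_advx, l_advy, l_cyc, l_ske, l_con, l_inex,
               lam_cyc=10.0, lam_ske=1.0, lam_con=1.0, lam_inex=1.0):
    """Weighted sum of claim 8; the lambda defaults are hypothetical."""
    return (l_advx + l_advy + lam_cyc * l_cyc + lam_ske * l_ske
            + lam_con * l_con + lam_inex * l_inex)
```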
CN202311313408.XA 2023-10-11 2023-10-11 A calligraphy character generation method based on skeleton and outline Active CN117058266B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311313408.XA CN117058266B (en) 2023-10-11 2023-10-11 A calligraphy character generation method based on skeleton and outline

Publications (2)

Publication Number Publication Date
CN117058266A (en) 2023-11-14
CN117058266B (en) 2023-12-26

Family

ID=88655783

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311313408.XA Active CN117058266B (en) 2023-10-11 2023-10-11 A calligraphy character generation method based on skeleton and outline

Country Status (1)

Country Link
CN (1) CN117058266B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109408776A (en) * 2018-10-09 2019-03-01 西华大学 A kind of calligraphy font automatic generating calculation based on production confrontation network
CN109746916A (en) * 2019-01-28 2019-05-14 武汉科技大学 A method and system for robot writing calligraphy
US20210390686A1 (en) * 2020-06-15 2021-12-16 Dalian University Of Technology Unsupervised content-preserved domain adaptation method for multiple ct lung texture recognition
CN116823983A (en) * 2023-06-15 2023-09-29 西北大学 One-to-many style handwriting picture generation method based on style collection mechanism

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG Xiaohong; LU Hui; MA Xiangcai: "Stylized Calligraphy Image Generation Based on Generative Adversarial Networks" (基于生成对抗网络的风格化书法图像生成), Packaging Engineering (包装工程), no. 11 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117830074A (en) * 2023-12-20 2024-04-05 广州美术学院 A Chinese character font generation method based on font outline information
CN118036555A (en) * 2024-04-11 2024-05-14 江西师范大学 Low-sample font generation method based on skeleton transfer and structure contrast learning
CN118537660A (en) * 2024-07-19 2024-08-23 南通理工学院 Tobacco leaf detection method integrating main pulse characteristics and edge characteristics
CN118537660B (en) * 2024-07-19 2024-10-18 南通理工学院 Tobacco leaf detection method integrating main pulse characteristics and edge characteristics
CN118799892A (en) * 2024-09-12 2024-10-18 南昌大学 A method and system for generating Chinese characters in calligraphic style
CN118799892B (en) * 2024-09-12 2025-03-18 南昌大学 A method and system for generating Chinese characters in calligraphic style

Also Published As

Publication number Publication date
CN117058266B (en) 2023-12-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant