CN113962192B - Method and device for generating Chinese character font generation model and Chinese character font generation method and device

Info

Publication number: CN113962192B
Application number: CN202110467098.1A
Authority: CN
Inventors: 曾锦山; 汪叶飞; 陈琪; 王明文
Original assignee: Jiangxi Normal University
Current assignee: Jiangxi Normal University
Priority date: 2021-04-28
Filing date: 2021-04-28
Publication date: 2022-11-15
Anticipated expiration: 2041-04-28
Also published as: CN113962192A

Abstract

The application discloses a method and a device for generating a Chinese character font generation model, and belongs to the technical field of artificial intelligence. The method comprises the following steps: acquiring a Chinese character image; performing geometric transformation processing on the Chinese character image to generate a Chinese character transformation image; inputting the Chinese character transformation image into a Chinese character identification model, identifying the Chinese character transformation image through the Chinese character identification model, and outputting authenticity information of the Chinese character transformation image; and adjusting model parameters of the font generation model based on the authenticity information to obtain the trained font generation model. In the technical solution provided by the embodiments of the application, geometric transformation of the Chinese character image guides the model to extract high-quality Chinese character features, and the model parameters are adjusted according to the authenticity judgment that the Chinese character identification model makes on the Chinese character transformation image. This effectively addresses the mode collapse problem in deep learning model training, makes feature extraction better guided and more targeted, and significantly improves the quality of the generated Chinese characters.

Description

Method and device for generating Chinese character font generation model and Chinese character font generation method and device

Technical Field

The application relates to the technical field of artificial intelligence, and in particular to a method and a device for generating a Chinese character font generation model, and to a Chinese character font generation method and device.

Background

In recent years, the automatic generation of Chinese characters has attracted much attention because of its wide application in artistic font generation, personalized font design, and the generation of calligraphy works.

Existing Chinese character font generation methods can be roughly divided into two categories. The first category is based mainly on explicit features of Chinese characters, such as their structure, radicals and strokes, and uses traditional machine learning methods. The core idea of these methods is "decomposition and recombination": a Chinese character is first split, local explicit features such as its hierarchical structure, strokes and radicals are extracted, and the parts are then recombined by a traditional machine learning algorithm to form a new Chinese character. The second category is based mainly on deep learning. Its core idea is to regard Chinese characters as images, so that Chinese character font generation is treated as an image style conversion task and new techniques developed in the field of image style conversion can be applied to it. However, deep learning models often suffer from mode collapse during training and ignore some features unique to Chinese characters, so feature extraction lacks guidance and pertinence, and the generated Chinese characters have quality problems.

Disclosure of Invention

The embodiments of the present application provide a method for generating a Chinese character font generation model, a Chinese character font generation method, and corresponding apparatuses and devices.

According to an aspect of an embodiment of the present application, there is provided a method for generating a Chinese character font generation model, the method including:

acquiring a Chinese character image, wherein the Chinese character image comprises a source style Chinese character image and a target style Chinese character image;

performing geometric transformation processing on the Chinese character image to generate a Chinese character transformation image, wherein the geometric transformation processing refers to an image processing mode of dividing the Chinese character image into different areas and then adjusting the positions of the areas in the Chinese character image, and the Chinese character transformation image refers to an image obtained after the positions of the areas are adjusted;

inputting the Chinese character transformation image into a Chinese character identification model, identifying the Chinese character transformation image through the Chinese character identification model, and outputting authenticity information of the Chinese character transformation image;

and adjusting model parameters of a font generation model based on the authenticity information to obtain a trained font generation model, wherein the font generation model is used for converting the source style Chinese character image into the target style Chinese character image.

According to an aspect of an embodiment of the present application, there is provided a Chinese character font generation method, the method including:

acquiring a source style Chinese character image;

inputting the source style Chinese character image into a trained font generation model, and performing Chinese character style conversion processing on the source style Chinese character image through the trained font generation model to generate a target style Chinese character image;

the trained font generation model is a machine learning model obtained by adversarial training with a Chinese character identification model, the Chinese character identification model performs identification processing on a Chinese character transformation image and outputs authenticity information of the Chinese character transformation image, which is used to adjust model parameters of the font generation model, the Chinese character transformation image is an image generated from the Chinese character image through geometric transformation processing, and the geometric transformation processing refers to an image processing mode of dividing the Chinese character image into different regions and then adjusting the positions of the regions in the Chinese character image.

According to an aspect of an embodiment of the present application, there is provided an apparatus for generating a Chinese character font generation model, the apparatus including:

the Chinese character image acquisition module is used for acquiring Chinese character images, and the Chinese character images comprise source style Chinese character images and target style Chinese character images;

the geometric transformation processing module is used for performing geometric transformation processing on the Chinese character image to generate a Chinese character transformation image, the geometric transformation processing refers to an image processing mode of adjusting the positions of the areas in the Chinese character image after dividing the Chinese character image into different areas, and the Chinese character transformation image refers to an image obtained after the positions of the areas are adjusted;

the Chinese character identification module is used for inputting the Chinese character transformation image into a Chinese character identification model, identifying the Chinese character transformation image through the Chinese character identification model and outputting the authenticity information of the Chinese character transformation image;

and the model parameter adjusting module is used for adjusting the model parameters of the font generation model based on the authenticity information to obtain the trained font generation model, and the font generation model is used for converting the source style Chinese character image into the target style Chinese character image.

According to an aspect of an embodiment of the present application, there is provided a Chinese character font generation apparatus, including:

the source style Chinese character image acquisition module is used for acquiring a source style Chinese character image;

the Chinese character style conversion module is used for inputting the source style Chinese character images into a trained font generation model, and performing Chinese character style conversion processing on the source style Chinese character images through the trained font generation model to generate target style Chinese character images;

According to an aspect of the embodiments of the present application, there is provided a computer device, including a processor and a memory, where at least one instruction, at least one program, a code set, or an instruction set is stored in the memory, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the above method for generating the Chinese character font generation model.

According to an aspect of the embodiments of the present application, there is provided a computer device, including a processor and a memory, where at least one instruction, at least one program, a code set, or an instruction set is stored in the memory, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the above Chinese character font generation method.

According to an aspect of the embodiments of the present application, there is provided a computer-readable storage medium, in which at least one instruction, at least one program, a code set, or an instruction set is stored, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the above method for generating the Chinese character font generation model.

According to an aspect of the embodiments of the present application, there is provided a computer-readable storage medium, in which at least one instruction, at least one program, a code set, or an instruction set is stored, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the above Chinese character font generation method.

The technical scheme provided by the embodiment of the application can bring the following beneficial effects:

Geometric transformation of the Chinese character image guides the model network to extract higher-quality Chinese character features. The authenticity of the Chinese character transformation image is judged by the Chinese character identification model, and the parameters of the font generation model are then adjusted according to the output of the Chinese character identification model and the real attribute of the image, so as to obtain the font generation model with the best effect. This effectively addresses the mode collapse that deep learning models often exhibit in training, makes feature extraction better guided and more targeted, and significantly improves the quality of the generated Chinese characters.

Drawings

To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic diagram of an application execution environment provided by one embodiment of the present application;

FIG. 2 is a flowchart of a method for generating a Chinese character font generation model according to an embodiment of the present application;

FIG. 3 is a flowchart of a method for generating a Chinese character font generation model according to another embodiment of the present application;

FIG. 4 (a) is a schematic diagram illustrating a geometric transformation of a Chinese character image;

FIG. 4 (b) is a schematic diagram illustrating a geometric transformation of a Chinese character image;

FIG. 4 (c) is a schematic diagram showing an exemplary geometric transformation of a Chinese character image;

FIG. 4 (d) is a schematic diagram illustrating a geometric transformation of a Chinese character image;

FIG. 5 is a schematic diagram illustrating a technical architecture of a Chinese character font generation network based on field-character grid (Tian-zi-ge, 田字格) transformation;

FIG. 6 is an exemplary illustration of a Chinese character comparison graph;

FIG. 7 is a schematic diagram illustrating one manner of reducing mode collapse;

FIG. 8 (a) illustrates a first diagram representing model training loss;

FIG. 8 (b) illustrates a second diagram characterizing model training loss;

FIG. 9 is a schematic diagram illustrating Chinese character generation by embedding font generation models trained by different image transformation schemes;

FIG. 10 is a flowchart of a Chinese character font generation method according to an embodiment of the present application;

FIG. 11 is a block diagram of an apparatus for generating a Chinese character font generation model according to an embodiment of the present application;

FIG. 12 is a block diagram of a Chinese character font generation apparatus according to an embodiment of the present application;

FIG. 13 is a block diagram of a computer device according to an embodiment of the present application.

Detailed Description

To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.

Referring to fig. 1, a schematic diagram of an application execution environment according to an embodiment of the present application is shown. The application execution environment may include: a terminal 10 and a server 20.

The terminal 10 may be an electronic device such as a mobile phone, a tablet computer, a game console, an e-book reader, or a PC (Personal Computer). The terminal 10 may execute the method for generating the Chinese character font generation model or the Chinese character font generation method.

The server 20 is used for providing network background services for the terminal 10. For example, the server 20 may be a background server that provides network services for the terminal 10. The server 20 may be a single server, a server cluster composed of a plurality of servers, or a cloud computing service center. Optionally, the server 20 provides network background services for a plurality of terminals 10 simultaneously. The server 20 may also execute the method for generating the Chinese character font generation model or the Chinese character font generation method.

Alternatively, the terminal 10 and the server 20 may communicate with each other through the network 30.

Before the method embodiments provided in the present application are described, relevant terms that may be involved in them are briefly explained for ease of understanding by those skilled in the art.

The generative adversarial network (GAN) is one of the most popular deep generative models today. A GAN consists of two parts, a generator and a discriminator. The main task of the generator is to generate fake samples that are as realistic as possible, and the main task of the discriminator is to distinguish the generated samples from real samples and feed the result back to the generator. During training, the two eventually reach an equilibrium through continuous interaction. Specifically, a GAN can be expressed in the following mathematical form:

$$\min_G \max_D \; \mathbb{E}_{x \sim P_{data}}\left[\log D(x)\right] + \mathbb{E}_{z \sim P_z}\left[\log\left(1 - D(G(z))\right)\right]$$

where $P_{data}$ and $P_z$ are the distributions of the real data x and the input random noise z, respectively, and G and D are the generator and the discriminator, both of which are represented by deep neural network models. Although a GAN can generate samples of relatively high quality, its unsupervised nature makes the training process highly random, and it cannot generate samples that meet specified conditions. In addition, GANs often suffer from mode collapse during training, which significantly degrades the quality and diversity of the generated samples.
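The interplay between the two networks can be made concrete with a small numerical sketch. It assumes that the mathematical form referred to above is the standard GAN value function V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], which the discriminator tries to maximize and the generator tries to minimize. The function name `gan_value` and the sample probabilities are hypothetical and serve only as an illustration.

```python
import math

def gan_value(d_real, d_fake):
    """Monte Carlo estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator outputs D(x) on real samples, each in (0, 1).
    d_fake: discriminator outputs D(G(z)) on generated samples, each in (0, 1).
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A discriminator that cannot tell real from generated samples outputs 0.5
# everywhere, which gives V = log(0.5) + log(0.5) = -2 * log(2).
balanced = gan_value([0.5, 0.5], [0.5, 0.5])

# A confident and correct discriminator pushes V towards its upper bound of 0.
confident = gan_value([0.99, 0.98], [0.01, 0.02])
```

In this toy setting the generator improves by making `d_fake` larger, which lowers V, while the discriminator improves by separating the two lists, which raises V.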

Self-supervised learning: in deep learning, unsupervised learning lacks data annotation, so the content of the samples generated during training is highly random. Supervised learning can effectively control the content of generated samples through labeled training samples and can learn models with better generalization performance, but data labeling is relatively costly. To address the difficulty of acquiring labeled data sets, self-supervised learning has been developed and has attracted much attention in recent years. The basic idea of self-supervised learning is to design an auxiliary task that helps the model network extract more useful features, thereby improving the performance of the model network on downstream tasks.

Referring to fig. 2, a flowchart of a method for generating a Chinese character font generation model according to an embodiment of the present application is shown. The method can be applied to a computer device, which refers to an electronic device with data calculation and processing capabilities; for example, the execution subject of each step can be the terminal 10 or the server 20 in the application execution environment shown in fig. 1. The method may include the following steps (210-240).

Step 210, acquiring a Chinese character image.

The Chinese character image comprises a source style Chinese character image and a target style Chinese character image. The source style Chinese character image is the image of the source font Chinese character, and the target style Chinese character image is the image of the target font Chinese character. The target font may be any type other than the source font.

In an exemplary embodiment, the above step 210 includes the following sub-steps (211-212).

And step 211, acquiring a source style Chinese character image.

Step 212, inputting the source style Chinese character image into a font generation model, and performing Chinese character style conversion processing on the source style Chinese character image through the font generation model to generate a target style Chinese character image.

The font generation model is used for converting the source style Chinese character image into the target style Chinese character image. The font generation model is a deep learning model constructed based on the generator network of a cycle-consistent generative adversarial network (CycleGAN). In one possible embodiment, the font generation model includes an input layer, a downsampling layer, at least one residual module, an upsampling layer, and an output layer.

In an exemplary embodiment, table 1 illustrates a network structure of a font generation model.

TABLE 1

Here, BN denotes batch normalization, CONV denotes a convolution structure, RELU denotes the activation function, h and w denote the height and width of the input picture, N denotes the number of neurons in the network layer, K denotes the size of the convolution kernel, S denotes the stride, and P denotes the size of the padding.
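The parameters K, S and P determine how each convolution changes the picture size h by w. The sketch below applies the usual convolution arithmetic, output = floor((input + 2P - K) / S) + 1. The layer settings are hypothetical and are not the values of Table 1, which is not reproduced here.

```python
def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution: floor((size + 2P - K) / S) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

# Hypothetical settings for illustration only (not taken from Table 1).
h = 128
h_input = conv_output_size(h, kernel=7, stride=1, padding=3)       # size is preserved
h_down = conv_output_size(h_input, kernel=3, stride=2, padding=1)  # size is halved
h_res = conv_output_size(h_down, kernel=3, stride=1, padding=1)    # size is preserved
```

With these assumed values a 128-pixel side stays at 128 after the input layer, drops to 64 after one downsampling convolution, and stays at 64 through a residual module.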

Step 220, performing geometric transformation processing on the Chinese character image to generate a Chinese character transformation image.

The geometric transformation processing is an image processing mode of dividing the Chinese character image into different areas and then adjusting the positions of the areas in the Chinese character image, and correspondingly, the Chinese character transformation image is an image obtained after the positions of the areas are adjusted.

In an exemplary embodiment, the above step 220 includes the following sub-steps (221-222).

Step 221, dividing the Chinese character image into at least two areas.

Optionally, the Chinese character image is divided based on the region division mode of the field-character grid (Tian-zi-ge, 田字格). Optionally, the Chinese character image is divided based on the region division mode of the Chinese character grid. Optionally, the Chinese character image is divided based on the region division mode of the nine-square grid. The embodiments of the present application do not limit the manner of dividing the Chinese character image; the manner can be set according to the specific image processing task, and the division mode with the best effect can be selected.
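As an illustration of region division, the following minimal sketch splits an image into a regular grid of regions. It assumes that the image is stored as a nested list of pixel rows and that its height and width are divisible by the grid dimensions. The helper name `split_grid` is hypothetical; a 2 by 2 split corresponds to the field-character grid and a 3 by 3 split to the nine-square grid.

```python
def split_grid(image, rows, cols):
    """Split a 2-D image (list of pixel rows) into rows x cols regions.

    Regions are returned in row-major order. The image height and width
    are assumed to be divisible by rows and cols respectively.
    """
    height, width = len(image), len(image[0])
    rh, cw = height // rows, width // cols
    regions = []
    for r in range(rows):
        for c in range(cols):
            block = [row[c * cw:(c + 1) * cw]
                     for row in image[r * rh:(r + 1) * rh]]
            regions.append(block)
    return regions

# 6x6 toy image whose pixel value encodes its position (10 * row + column).
image = [[10 * y + x for x in range(6)] for y in range(6)]
field_grid = split_grid(image, 2, 2)  # four 3x3 regions
nine_grid = split_grid(image, 3, 3)   # nine 2x2 regions
```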

Step 222, performing geometric transformation processing on at least two areas to generate a Chinese character transformation image.

Determining a geometric transformation mode; and performing geometric transformation processing on the at least two areas according to a geometric transformation mode to generate a Chinese character transformation image.

Optionally, the positions of the at least two regions are adjusted according to a region position adjustment rule determined by the geometric transformation mode to generate a Chinese character transformation image.

Step 230, inputting the Chinese character transformation image into a Chinese character identification model, identifying the Chinese character transformation image through the Chinese character identification model, and outputting the authenticity information of the Chinese character transformation image.

In an exemplary embodiment, the Chinese character identification model also outputs the geometric transformation mode of the Chinese character transformation image.

The Chinese character identification model is a deep learning model constructed based on the discriminator network of a cycle-consistent generative adversarial network (CycleGAN). In one possible implementation, the Chinese character identification model comprises an input layer, at least one hidden layer, an authenticity information output layer (Dsrc) and a geometric transformation mode output layer (Dt).
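The two output layers can be pictured as two heads on top of shared hidden features. The toy sketch below assumes a sigmoid for the authenticity head (Dsrc) and a softmax over transformation modes for the other head (Dt). These activation choices, the function names, the weights and the feature vector are all assumptions made for illustration; the text does not specify them.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def softmax(values):
    m = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def dot(weights, features):
    return sum(w * f for w, f in zip(weights, features))

def discriminator_heads(features, w_src, w_t):
    """Map shared hidden features to the two outputs.

    w_src: weight vector of the authenticity head (Dsrc).
    w_t:   one weight vector per geometric transformation mode (Dt).
    """
    p_real = sigmoid(dot(w_src, features))
    p_mode = softmax([dot(w, features) for w in w_t])
    return p_real, p_mode

features = [0.2, -0.4, 1.0]   # hypothetical hidden-layer output
w_src = [0.5, 0.1, 0.3]       # hypothetical weights
w_t = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
p_real, p_mode = discriminator_heads(features, w_src, w_t)
```

Here `p_real` plays the role of the authenticity information and `p_mode` the estimated geometric transformation mode.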

In an exemplary embodiment, table 2 shows a network structure of the Chinese character identification model.

TABLE 2

Step 240, adjusting model parameters of the font generation model based on the authenticity information to obtain the trained font generation model.

Adversarial loss information of the font generation model and the Chinese character identification model is obtained based on the authenticity information, and geometric transformation reconstruction loss information is obtained based on the geometric transformation mode.

The adversarial loss information can be generated by an adversarial loss function. The adversarial loss function of the font generation model provided in this embodiment can be expressed as follows:

where x is the input Chinese character in the source style domain, G(x) is the Chinese character in the target style domain generated by the generator, and the value of the function is the adversarial loss value.

Optionally, the above geometric transformation reconstruction loss information may be generated by a field-character grid (Tian-zi-ge, 田字格) geometric transformation reconstruction loss function. The embodiments of the present application introduce the auxiliary task of field-character grid geometric transformation reconstruction to guide the model network of the font generation model to extract features better. Specifically, the field-character grid geometric transformation reconstruction loss may be defined as follows:

where T_c(x) and T_c(G(x)) are, respectively, the source style Chinese character x and the generated target style Chinese character G(x) after the c-th field-character grid transformation, D_tian(T_c(x), T_c(G(x))) is the discriminator's estimate of the field-character grid transformation code, c is the true field-character grid transformation code,

is as follows.

The model parameters of the font generation model are adjusted based on the adversarial loss information and the geometric transformation reconstruction loss information to obtain the trained font generation model.

Optionally, the adversarial loss value and the field-character grid (Tian-zi-ge) geometric transformation reconstruction loss value are reduced by adjusting the model parameters of the font generation model, finally yielding the trained font generation model.
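How the two loss values might be combined can be sketched as follows. The exact loss formulas are not reproduced in this text, so the sketch makes three labeled assumptions: the generator-side adversarial loss is the negative mean of log D(G(x)), the reconstruction loss is a cross-entropy between the discriminator's estimated transformation probabilities and the true transformation code c, and the two are combined with a hypothetical weight `lam`. None of these should be read as the formulas of the embodiment.

```python
import math

def adversarial_loss_g(d_fake):
    """Generator-side adversarial loss, -mean(log D(G(x))) (assumed form)."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

def transform_reconstruction_loss(pred_probs, true_codes):
    """Cross-entropy between estimated transformation probabilities and
    the true transformation codes c (assumed form)."""
    losses = [-math.log(p[c]) for p, c in zip(pred_probs, true_codes)]
    return sum(losses) / len(losses)

def generator_objective(d_fake, pred_probs, true_codes, lam=1.0):
    """Weighted sum of both losses; lam is a hypothetical weighting factor."""
    return (adversarial_loss_g(d_fake)
            + lam * transform_reconstruction_loss(pred_probs, true_codes))

d_fake = [0.4, 0.6]                               # D(G(x)) for two samples
pred_probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]   # estimated mode probabilities
true_codes = [0, 1]                               # true transformation codes
total = generator_objective(d_fake, pred_probs, true_codes, lam=0.5)

# Sharper, correct estimates of the transformation code lower the objective.
better = generator_objective(d_fake, [[0.9, 0.05, 0.05], [0.05, 0.9, 0.05]],
                             true_codes, lam=0.5)
```

Reducing both terms by gradient steps on the generator parameters corresponds to the parameter adjustment described above.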

In summary, the technical solution provided by the embodiments of the present application guides the model network to extract high-quality Chinese character features by performing geometric transformation on the Chinese character images. The authenticity of the Chinese character transformation image is judged by the Chinese character identification model, and the parameters of the font generation model are then adjusted according to the output of the Chinese character identification model and the real attribute of the image, so as to obtain the font generation model with the best effect. This effectively addresses the mode collapse that deep learning models often exhibit in training, makes feature extraction better guided and more targeted, and significantly improves the quality of the generated Chinese characters.

Referring to fig. 3, a flowchart of a method for generating a Chinese character font generation model according to another embodiment of the present application is shown. The method can be applied to a computer device, which refers to an electronic device with data calculation and processing capabilities; for example, the execution subject of each step can be the terminal 10 or the server 20 in the application execution environment shown in fig. 1. The method may comprise the following steps (301-304).

Step 301, obtaining a source style Chinese character image.

Step 302, inputting the source style Chinese character image into a font generation model, and performing Chinese character style conversion processing on the source style Chinese character image through the font generation model to generate a target style Chinese character image.

Step 303, dividing the Chinese character image into a first region image, a second region image, a third region image and a fourth region image through a horizontal center line and a vertical center line.

The first region image is the image in the region enclosed by the horizontal center line, the vertical center line, the left edge of the Chinese character image and the upper edge of the Chinese character image. The second region image is the image in the region enclosed by the horizontal center line, the vertical center line, the right edge of the Chinese character image and the upper edge of the Chinese character image. The third region image is the image in the region enclosed by the horizontal center line, the vertical center line, the left edge of the Chinese character image and the lower edge of the Chinese character image. The fourth region image is the image in the region enclosed by the horizontal center line, the vertical center line, the right edge of the Chinese character image and the lower edge of the Chinese character image.

In one possible implementation, the source style Chinese character image is divided into a first region image, a second region image, a third region image and a fourth region image corresponding to the source style Chinese character image.

In one possible implementation, the target style Chinese character image is divided into a first region image, a second region image, a third region image and a fourth region image corresponding to the target style Chinese character image.

Step 304, determining a geometric transformation mode.

Optionally, the geometric transformation is randomly selected.

According to the geometric transformation mode, geometric transformation processing is performed on the first region image, the second region image, the third region image and the fourth region image to generate a Chinese character transformation image.

Optionally, the image positions of the first region image, the second region image, the third region image and the fourth region image are adjusted according to a region position adjustment rule corresponding to the geometric transformation mode, and the first region image, the second region image, the third region image and the fourth region image are spliced according to the adjusted positions to obtain the Chinese character transformation image.

In a possible implementation manner, according to a geometric transformation manner, geometric transformation processing is performed on a first region image, a second region image, a third region image and a fourth region image corresponding to a source style Chinese character image, so as to generate a source style Chinese character transformation image.

In a possible implementation manner, according to a geometric transformation manner, geometric transformation processing is performed on a first region image, a second region image, a third region image and a fourth region image corresponding to the target style Chinese character image, so as to generate a target style Chinese character transformation image.
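The select, rearrange and splice sequence can be sketched as follows, using the three transformation modes described in steps 305 to 307. The sketch assumes that the image is a nested list of pixel rows with even height and width. The mode numbering, the `MODES` table and the function names are hypothetical.

```python
import random

# Region order after each transformation, as indices into
# (first, second, third, fourth) = (top-left, top-right, bottom-left, bottom-right).
MODES = {
    0: (0, 1, 2, 3),  # first mode: unit transformation, nothing moves
    1: (1, 0, 3, 2),  # second mode: swap first/second and third/fourth
    2: (2, 3, 0, 1),  # third mode: swap first/third and second/fourth
}

def split_quadrants(image):
    """Cut the image along its horizontal and vertical center lines."""
    h2, w2 = len(image) // 2, len(image[0]) // 2
    top, bottom = image[:h2], image[h2:]
    return ([r[:w2] for r in top], [r[w2:] for r in top],
            [r[:w2] for r in bottom], [r[w2:] for r in bottom])

def stitch(tl, tr, bl, br):
    """Splice four region images back into one image."""
    return [a + b for a, b in zip(tl, tr)] + [a + b for a, b in zip(bl, br)]

def transform(image, mode):
    regions = split_quadrants(image)
    return stitch(*(regions[i] for i in MODES[mode]))

def random_transform(image, rng=random):
    """Pick a mode at random and return the transformed image with its code."""
    mode = rng.choice(sorted(MODES))
    return transform(image, mode), mode

image = [[1, 1, 2, 2],
         [1, 1, 2, 2],
         [3, 3, 4, 4],
         [3, 3, 4, 4]]
second = transform(image, 1)
third = transform(image, 2)
```

Returning the mode together with the image keeps the true transformation code available as the label for the auxiliary task. The same call can be applied to the source style image and to the generated target style image.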

Step 305, in the case where the geometric transformation mode is the first geometric transformation mode, keeping the positions of the first region image, the second region image, the third region image and the fourth region image unchanged, and generating a unit transformation image of the Chinese character image.

The first geometric transformation mode, also called unit transformation, keeps the position of each region of the Chinese character image unchanged. It is mainly inspired by Chinese characters with an inseparable structure, that is, single-component characters such as "日" (day) and "月" (month).

In one possible embodiment, when the geometric transformation mode is the first geometric transformation mode, the unit transformation image of the source style Chinese character is generated while keeping the positions of the first region image, the second region image, the third region image and the fourth region image corresponding to the source style Chinese character image unchanged.

In one possible embodiment, when the geometric transformation mode is the first geometric transformation mode, the unit transformation image of the target style Chinese character is generated while keeping the positions of the first region image, the second region image, the third region image and the fourth region image corresponding to the target style Chinese character image unchanged.

Step 306, in the case where the geometric transformation mode is the second geometric transformation mode, interchanging the first region image and the second region image, and interchanging the third region image and the fourth region image, to generate a second transformation image of the Chinese character image.

The second geometric transformation mode is mainly inspired by Chinese characters with a left-right structure.

In a possible implementation manner, in the case that the geometric transformation manner is the second geometric transformation manner, the first region image and the second region image corresponding to the source style Chinese character image are interchanged, and the third region image and the fourth region image corresponding to the source style Chinese character image are interchanged, so that the second transformation image of the source style Chinese character is generated.

In a possible implementation manner, in the case that the geometric transformation manner is the second geometric transformation manner, the first region image and the second region image corresponding to the target style Chinese character image are exchanged, and the third region image and the fourth region image corresponding to the target style Chinese character image are exchanged, so as to generate the second transformation image of the target style Chinese character.

Step 307, in the case where the geometric transformation mode is the third geometric transformation mode, interchanging the first region image and the third region image, and interchanging the second region image and the fourth region image, to generate a third transformation image of the Chinese character image.

The third geometric transformation mode is mainly inspired by Chinese characters with upper and lower structures, such as "will", "Miao", and the like.

In a possible implementation manner, in the case that the geometric transformation manner is the third geometric transformation manner, the first region image and the third region image corresponding to the source style Chinese character image are interchanged, and the second region image and the fourth region image corresponding to the source style Chinese character image are interchanged, so as to generate a third transformation image of the source style Chinese character.

In a possible implementation manner, in the case that the geometric transformation manner is the third geometric transformation manner, the first region image and the third region image corresponding to the target style Chinese character image are interchanged, and the second region image and the fourth region image corresponding to the target style Chinese character image are interchanged, so as to generate a third transformation image of the target style Chinese character.

Step 308, in the case where the geometric transformation mode is the fourth geometric transformation mode, interchanging the second region image and the third region image while keeping the positions of the first region image and the fourth region image unchanged, to generate a fourth transformation image of the Chinese character image.

The fourth geometric transformation mode is mainly inspired by Chinese characters with upper right surrounding and lower left surrounding structures, such as 'sentence', 'can', 'build', 'link' and the like.

In a possible implementation manner, in the case that the geometric transformation manner is the fourth geometric transformation manner, the second region image and the third region image corresponding to the source style Chinese character image are exchanged, and the positions of the first region image and the fourth region image corresponding to the source style Chinese character image are kept unchanged, so as to generate a fourth transformation image of the source style Chinese character.

In a possible implementation manner, in the case that the geometric transformation manner is the fourth geometric transformation manner, the second region image and the third region image corresponding to the target style Chinese character image are exchanged, and the positions of the first region image and the fourth region image corresponding to the target style Chinese character image are kept unchanged, so as to generate a fourth transformation image of the target style Chinese character.

The four geometric transformation modes reflect the natural structure of Chinese characters to a certain extent. Besides the four field-character-grid geometric transformations defined above, other geometric transformations are possible, such as interchanging regions 1 and 4 while keeping regions 2 and 3 unchanged; this transformation is mainly inspired by Chinese characters with upper-left surrounding and lower-right surrounding structures. In addition, geometric transformations such as interchanging only regions 1 and 2, or interchanging only regions 3 and 4, can correspond to Chinese characters with an upper-lower structure whose upper part or lower part can be further split, such as 'crying'.

Optionally, the geometric transformation modes are encoded by one-hot encoding; for example, the first geometric transformation mode is encoded as (1,0,0,0). It should be noted that the field-character-grid region numbering, the four geometric transformation modes and the corresponding codes can all be produced automatically by a simple computer program, without any manual labeling.
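As a minimal sketch of such a program, the following Python code splits an image into four regions, rearranges them according to one of the four modes, and returns the transformed image together with its one-hot code. The region numbering (1 = upper left, 2 = upper right, 3 = lower left, 4 = lower right), the representation of an image as a list of pixel rows, and the names `PERMUTATIONS`, `split_quadrants`, `merge_quadrants` and `tian_transform` are assumptions made for illustration; the numbering is chosen so that the second mode swaps the left and right halves and the third mode swaps the upper and lower halves, as described above.

```python
# Illustrative sketch only; all names and the region numbering are assumptions.
# PERMUTATIONS[mode][k] is the 0-based source region placed at target position k.
PERMUTATIONS = [
    (0, 1, 2, 3),  # first mode: unit transformation, nothing moves
    (1, 0, 3, 2),  # second mode: swap regions 1<->2 and 3<->4
    (2, 3, 0, 1),  # third mode: swap regions 1<->3 and 2<->4
    (0, 2, 1, 3),  # fourth mode: swap regions 2<->3, keep 1 and 4
]


def split_quadrants(image):
    """Return the regions [1, 2, 3, 4] of an image given as a list of rows."""
    height, width = len(image), len(image[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must be even")
    top, bottom = image[: height // 2], image[height // 2:]
    return [
        [row[: width // 2] for row in top],      # region 1 (upper left)
        [row[width // 2:] for row in top],       # region 2 (upper right)
        [row[: width // 2] for row in bottom],   # region 3 (lower left)
        [row[width // 2:] for row in bottom],    # region 4 (lower right)
    ]


def merge_quadrants(quads):
    """Reassemble four regions, given in positions 1..4, into one image."""
    upper = [left + right for left, right in zip(quads[0], quads[1])]
    lower = [left + right for left, right in zip(quads[2], quads[3])]
    return upper + lower


def tian_transform(image, mode):
    """Apply mode 0..3 (first..fourth mode); return (image, one-hot code)."""
    quads = split_quadrants(image)
    transformed = merge_quadrants([quads[src] for src in PERMUTATIONS[mode]])
    one_hot = tuple(1 if i == mode else 0 for i in range(len(PERMUTATIONS)))
    return transformed, one_hot


# A 2 x 2 image whose pixel values equal the region numbers.
image = [[1, 2], [3, 4]]
for mode in range(4):
    print(tian_transform(image, mode))
```

Because the code is derived from the chosen mode rather than from the image content, the label for the auxiliary task comes for free, which is what allows the transformation type to be marked without manual effort.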

In one example, as shown in fig. 4 (a) to 4 (d), fig. 4 (a) exemplarily shows a first schematic diagram of a geometric transformation mode of a Chinese character image, fig. 4 (b) exemplarily shows a second schematic diagram, fig. 4 (c) exemplarily shows a third schematic diagram, and fig. 4 (d) exemplarily shows a fourth schematic diagram. Given a Chinese character, the Chinese character image is divided into four regions according to the field-character grid, and the four regions are numbered 1, 2, 3 and 4 respectively. Fig. 4 (a) is the unit transformed image obtained by keeping the positions of the regions of the Chinese character image unchanged according to the first geometric transformation mode. Fig. 4 (b) is the second transformation image obtained by interchanging regions 1 and 2 and interchanging regions 3 and 4 according to the second geometric transformation mode. Fig. 4 (c) is the third transformation image obtained by interchanging regions 1 and 3 and interchanging regions 2 and 4 according to the third geometric transformation mode. Fig. 4 (d) is the fourth transformation image obtained by interchanging regions 2 and 3 while keeping regions 1 and 4 unchanged according to the fourth geometric transformation mode.

Step 309, inputting the Chinese character transformation image into a Chinese character identification model, identifying the Chinese character transformation image through the Chinese character identification model, and outputting authenticity information of the Chinese character transformation image and a geometric transformation mode of the Chinese character transformation image.

In an exemplary embodiment, the Chinese character identification model also outputs the geometric transformation mode of the Chinese character transformation image. For example, the Chinese character identification model outputs the code corresponding to the geometric transformation mode of the Chinese character transformation image, so as to determine which geometric transformation mode was used to obtain the Chinese character transformation image.

In one possible implementation mode, the source style Chinese character transformation image is input into a Chinese character identification model, the source style Chinese character transformation image is identified through the Chinese character identification model, and the authenticity information of the source style Chinese character transformation image and the geometric transformation mode of the source style Chinese character transformation image are output.

In one possible implementation mode, the target style Chinese character transformation image is input into a Chinese character identification model, the target style Chinese character transformation image is identified through the Chinese character identification model, and the authenticity information of the target style Chinese character transformation image and the geometric transformation mode of the target style Chinese character transformation image are output.

Step 310, inputting the target style Chinese character image into the font generation model, and performing inverse Chinese character style conversion processing on the target style Chinese character image through the font generation model to generate a reconstructed source style Chinese character image.

The reconstructed source style Chinese character image can be paired with the source style Chinese character image to generate a paired sample for training the font generation model.

Step 311, obtaining adversarial loss information of the font generation model and the Chinese character identification model based on the authenticity information.

In one possible implementation mode, the adversarial loss information of the font generation model and the Chinese character identification model is obtained based on the authenticity information of the source style Chinese character transformation image and the target style Chinese character transformation image.

Step 312, obtaining geometric transformation reconstruction loss information based on the geometric transformation mode.

In a possible implementation mode, the geometric transformation reconstruction loss information is obtained based on the real geometric transformation modes corresponding to the source style Chinese character transformation image and the target style Chinese character transformation image respectively, and on the geometric transformation modes of the source style Chinese character transformation image and the target style Chinese character transformation image output by the Chinese character identification model.
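One common way to compare the real geometric transformation mode with the mode output by the identification model is a cross-entropy over the one-hot codes. The text does not state which loss function is used, so the cross-entropy, the softmax normalization, and the names `softmax` and `transform_reconstruction_loss` below are assumptions for illustration only.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def transform_reconstruction_loss(one_hot, scores):
    """Cross-entropy between the real one-hot code and the predicted scores.

    Assumption: the identification model emits one raw score per mode.
    """
    probs = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))


real_code = (0, 1, 0, 0)  # second geometric transformation mode
good = transform_reconstruction_loss(real_code, [0.1, 4.0, 0.2, 0.1])
bad = transform_reconstruction_loss(real_code, [4.0, 0.1, 0.2, 0.1])
print(good < bad)  # the correct prediction yields the smaller loss
```

Under this assumption the loss is small only when the model assigns most of its probability to the transformation that was actually applied, which requires it to attend to the layout of the four regions.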

Step 313, obtaining cycle consistency loss information based on the source style Chinese character image and the reconstructed source style Chinese character image.

Cycle consistency loss: to overcome the difficulty of acquiring paired data sets, a cycle consistency loss is introduced to generate "pseudo-paired samples". It is defined as follows:

$$\mathcal{L}_{cyc} = \mathbb{E}_{x}\left[\lVert x - G(G(x)) \rVert_1\right]$$

wherein x is the input source style Chinese character, G(x) is the target style Chinese character generated by the generator, G(G(x)) is the source style Chinese character reconstructed from the generated target style Chinese character, and $\lVert \cdot \rVert_1$ is the L1 norm.
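The cycle consistency loss described above can be sketched in a few lines of Python. This is an illustration under stated assumptions: images are flat lists of pixel values, the same generator is applied twice to obtain the reconstruction, the loss is averaged over the batch, and the names `l1_norm`, `cycle_consistency_loss`, `invert` and `halve` are hypothetical. The two toy generators stand in for a trained model.

```python
def l1_norm(a, b):
    """Sum of absolute pixel differences between two flat images."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(p - q) for p, q in zip(a, b))


def cycle_consistency_loss(sources, generator):
    """Batch average of the L1 distance between x and its reconstruction."""
    distances = [l1_norm(x, generator(generator(x))) for x in sources]
    return sum(distances) / len(distances)


def invert(image):
    """Toy generator that is its own inverse, so reconstruction is exact."""
    return [1.0 - p for p in image]


def halve(image):
    """Toy generator that loses information, so reconstruction is imperfect."""
    return [p / 2 for p in image]


batch = [[0.0, 0.25, 1.0], [0.5, 0.5, 0.75]]
print(cycle_consistency_loss(batch, invert))  # 0.0
print(cycle_consistency_loss(batch, halve))   # 1.125
```

The loss needs only the source image and its own reconstruction, so no ground-truth target style image is required.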

Step 314, adjusting model parameters of the font generation model based on the adversarial loss information, the geometric transformation reconstruction loss information and the cycle consistency loss information to obtain the trained font generation model.

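Combining the above adversarial loss, cycle consistency loss and geometric transformation reconstruction loss, the total loss of the font generation model is as follows:

$$\mathcal{L} = \mathcal{L}_{adv} + \lambda_{cyc}\,\mathcal{L}_{cyc} + \lambda_{tian}\,\mathcal{L}_{tian}$$

wherein $\lambda_{cyc}$ and $\lambda_{tian}$ are weighting parameters, $\mathcal{L}_{adv}$, $\mathcal{L}_{cyc}$ and $\mathcal{L}_{tian}$ denote the adversarial loss, the cycle consistency loss and the geometric transformation reconstruction loss respectively, and $\mathcal{L}$ is the total loss value.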
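As a small worked example, and assuming the total loss is the adversarial loss plus the two weighted auxiliary terms, the combination can be computed as follows. The function name `total_loss` and all numeric values, including the two weights, are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def total_loss(adv_loss, cyc_loss, tian_loss, lambda_cyc, lambda_tian):
    """Adversarial loss plus weighted cycle and transformation terms."""
    return adv_loss + lambda_cyc * cyc_loss + lambda_tian * tian_loss


# Hypothetical values: 0.5 + 4.0 * 0.25 + 0.5 * 2.0 = 2.5
example = total_loss(adv_loss=0.5, cyc_loss=0.25, tian_loss=2.0,
                     lambda_cyc=4.0, lambda_tian=0.5)
print(example)  # 2.5
```

Setting `lambda_tian` to zero removes the field-character-grid auxiliary task and leaves only the adversarial and cycle consistency terms.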

Based on the above loss, the generator seeks to generate fake samples that are as realistic as possible by minimizing the loss, while the discriminator seeks to distinguish generated samples from real ones as well as possible by maximizing the loss. Thus, the training objective of the font generation model described above may be expressed as follows:

$$G^{*} = \arg\min_{G}\max_{D}\,\mathcal{L}$$

wherein G is the generator, D is the discriminator and $\mathcal{L}$ is the total loss.

In one example, fig. 5 exemplarily shows a schematic diagram of the technical architecture of a Chinese character font generation network based on field-character-grid transformation. The Chinese character font generation network shown in fig. 5 is mainly divided into three modules, namely a generator, a field-character-grid geometric transformation module and a discriminator. The first module is the generator, i.e. the font generation model mentioned above, which mainly undertakes two functions: (1) generating target style Chinese characters by taking the source style Chinese characters as input, and (2) reconstructing the source style Chinese characters by taking the generated target style Chinese characters as input. Function (2) is mainly used for generating pseudo-paired samples, thereby removing the need for paired data. The second module is the field-character-grid geometric transformation module Tc, wherein c is the one-hot code corresponding to the field-character-grid geometric transformation mode. In this module, the same four field-character-grid geometric transformations are applied to the source style Chinese character image and the correspondingly generated target style Chinese character image, and the results are used as the input of the discriminator. The third module is the discriminator, i.e. the Chinese character identification model, which not only needs to judge the authenticity of the generated Chinese character, but also needs to output the code of the corresponding geometric transformation mode. Accordingly, the Chinese character identification model has two parts (Dsrc, Dtian), wherein Dsrc is responsible for judging the authenticity of the generated Chinese character, and Dtian reconstructs the code corresponding to the field-character-grid geometric transformation mode.

In summary, according to the technical scheme provided by the embodiment of the application, first, the source style Chinese characters are converted into target style Chinese characters through the font generation model, and the generated target style Chinese characters are further inverted into reconstructed source style Chinese characters, which serve as paired samples of the source style Chinese characters; secondly, the corresponding field-character-grid geometric transformations are carried out on the real source style Chinese character images and the generated target style Chinese character images respectively, and the types of the transformations are marked automatically; finally, the Chinese character identification model judges the authenticity of the generated and real Chinese characters and the transformation mode of the input Chinese characters after the field-character-grid geometric transformation, and the parameters of the font generation model are adjusted according to the output of the Chinese character identification model and the difference between the paired samples, so as to obtain the font generation model with the optimal effect. The scheme effectively solves the problem that deep learning models often suffer mode collapse in training; the self-supervised method based on field-character-grid transformation guides the model network to extract high-quality Chinese character features, improves the guidance and pertinence of feature extraction, significantly improves the Chinese character generation effect, and reduces the labor cost of producing paired samples.

The following provides a further summary of the embodiments of the present application. When learning to write Chinese characters, the field-character grid ('Tian' character grid) plays an auxiliary role and helps learners understand the frame structure of Chinese characters. Inspired by this, and aiming at the problem that existing deep Chinese character generation models lack guidance when extracting features, the application provides a self-supervised Chinese character font generation method based on field-character-grid transformation. The basic idea of the embodiment of the application is to guide the network to automatically extract important Chinese character features by introducing an auxiliary task of field-character-grid geometric transformation reconstruction, so that the effect of generating Chinese character fonts is significantly improved.

Specifically, inspired by the use of the field-character grid in writing Chinese characters, four simple field-character-grid geometric transformations are designed, and the task of reconstructing the geometric transformation is then embedded into the training of a deep generative model, so that the deep neural network is guided to extract features in a more targeted way. By embedding this auxiliary task, the method can attend to both the overall style and the local structural information of Chinese characters without changing the model network or increasing labor cost.

In an exemplary embodiment, the present application provides a model verification method, and the model verification process is described below. The model verification method verifies the effectiveness of the font generation model provided by the embodiment of the application, and in particular of the embedded field-character-grid geometric transformation reconstruction task, through a series of designed experiments.

1. Experimental setup

A. Data set. In the process of model verification, data of ten different fonts are used, comprising one handwritten form, three printed forms (namely an imitation Song style, a regular style and a black body) and six pseudo handwritten forms (namely a comfortable form, a Huawen amber form, a Han dynasty style, a Han dynasty doll form, a Han dynasty thin round form and a square and regular form black handwritten simple form). The handwritten data were collected from 300 participants: for 3755 common Chinese characters, each person wrote every character once, so the collected handwritten data comprise 300 × 3755 samples in total. To construct the handwritten data set used in this experiment, for each Chinese character one sample is randomly selected from the corresponding 300 samples, so that the size of the handwritten Chinese character data set constructed in this embodiment is 3755. The other font data sets can be constructed by first crawling the fonts from the internet and then generating the images automatically with a TTF (TrueType Font) tool. The size of each font data set is given in table 3 below. The size of each Chinese character image is 256 × 256 × 3 (unit: pixels). In the experiment, for each font data set, 80% of the data was used for training and the remaining 20% was used for testing.

TABLE 3 size of data set
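The construction of the handwritten data set and the 80/20 split can be sketched as follows. The text does not say how the split is drawn, so the random shuffle, the fixed seed, the use of integer indices in place of image files, and the names `pick_one_sample_per_character` and `split_train_test` are assumptions for illustration.

```python
import random


def pick_one_sample_per_character(num_chars, num_writers, rng):
    """For each character index, pick one writer's sample at random."""
    return [(char, rng.randrange(num_writers)) for char in range(num_chars)]


def split_train_test(items, train_percent, rng):
    """Shuffle the items and split them by an integer percentage."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = len(shuffled) * train_percent // 100  # integer arithmetic, no rounding
    return shuffled[:cut], shuffled[cut:]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
handwritten = pick_one_sample_per_character(3755, 300, rng)
train_set, test_set = split_train_test(handwritten, 80, rng)
print(len(handwritten), len(train_set), len(test_set))  # 3755 3004 751
```

With 3755 characters, an 80/20 split gives 3004 training samples and 751 test samples for the handwritten data set.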

B. Network structure and optimizer. The font generation model comprises 1 down-sampling module, 9 residual modules and 1 up-sampling module, wherein the down-sampling module comprises 2 convolutional layers, each residual module comprises 2 convolutional layers, and the up-sampling module comprises 2 deconvolutional layers. The Chinese character identification model comprises 6 hidden convolutional layers and 2 convolutional layers in the output module.

In this embodiment, the adaptive moment estimation (Adam) algorithm is adopted as the optimizer, wherein the algorithm parameters are set to (0.5, 0.999), the learning rate is fixed at 0.0002, the batch size is set to 2, and the parameters λ_cyc and λ_tian are set to the optimal values obtained through empirical adjustment.
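To make the optimizer settings concrete, the sketch below implements one Adam update in plain Python with the stated learning rate. Reading the pair (0.5, 0.999) as Adam's two exponential decay rates (commonly written β1 and β2) is an assumption, as are the small constant `eps`, the function name `adam_step`, and the toy objective used to exercise it.

```python
import math


def adam_step(params, grads, state, lr=0.0002, betas=(0.5, 0.999), eps=1e-8):
    """Apply one Adam update to params in place and return them."""
    beta1, beta2 = betas  # assumed meaning of the pair (0.5, 0.999)
    state["t"] += 1
    t = state["t"]
    for i, g in enumerate(grads):
        # Exponential moving averages of the gradient and its square.
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        # Bias correction for the zero-initialised averages.
        m_hat = state["m"][i] / (1 - beta1 ** t)
        v_hat = state["v"][i] / (1 - beta2 ** t)
        params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return params


# Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3).
params = [0.0]
state = {"t": 0, "m": [0.0], "v": [0.0]}
for _ in range(100):
    adam_step(params, [2 * (params[0] - 3.0)], state)
print(params[0])  # moves from 0 towards 3 by roughly lr per step
```

Because Adam normalizes the step by the gradient magnitude, each early update changes a parameter by about the learning rate, which is why a small fixed rate such as 0.0002 yields slow but steady movement.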

C. Evaluation indexes. To evaluate the effectiveness of the proposed method, the following four general evaluation criteria were used:

(1) Accuracy of the generated content. This index is mainly used for measuring the accuracy of the generated Chinese character content. The recognition rate of a Chinese character recognition model is used as the accuracy of the samples generated by the model; the higher the accuracy, the better the quality of the generated samples. Optionally, the Chinese character recognition model is a trained machine learning model based on OCR (Optical Character Recognition) technology.

(2) Fréchet Inception Distance (FID). This index is mainly used for measuring the distance between the generated sample distribution and the real sample distribution; the smaller the FID value, the closer the generated sample distribution is to the real sample distribution and the more diverse the generated samples.

(3) L1 loss. This index is mainly used for measuring the pixel-wise L1 loss between the generated sample and the real sample; the smaller the L1 loss, the closer the generated sample is to the real sample. The L1 loss function is also known as least absolute deviations (LAD) or least absolute errors (LAE). In general, it minimizes the sum S of the absolute differences between the target values Yi and the estimated values f(xi).

(4) Intersection over Union (IOU). This index calculates the ratio of the intersection to the union of the generated sample and the real sample, and is mainly used for measuring the degree of overlap between them; the larger the IOU, the closer the generated sample is to the real sample.
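Two of the four indexes, the L1 loss and the IOU, are simple enough to sketch directly. The sketch assumes flat lists of grayscale values in [0, 1], dark pixels counted as strokes after thresholding at 0.5, and a per-pixel mean for the L1 loss; none of these details are stated in the text, and the names `l1_loss` and `iou` are hypothetical.

```python
def l1_loss(generated, real):
    """Mean absolute pixel difference between two flat images."""
    if len(generated) != len(real):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(g - r) for g, r in zip(generated, real)) / len(real)


def iou(generated, real, threshold=0.5):
    """Intersection over union of the stroke pixels of two flat images."""
    if len(generated) != len(real):
        raise ValueError("images must have the same number of pixels")
    gen_mask = [p < threshold for p in generated]  # dark pixel = stroke (assumed)
    real_mask = [p < threshold for p in real]
    intersection = sum(g and r for g, r in zip(gen_mask, real_mask))
    union = sum(g or r for g, r in zip(gen_mask, real_mask))
    return intersection / union if union else 1.0


real = [0.0, 0.0, 1.0, 1.0]       # strokes at pixels 0 and 1
generated = [0.0, 1.0, 1.0, 0.0]  # strokes at pixels 0 and 3
print(l1_loss(generated, real))   # 0.5
print(iou(generated, real))       # one shared stroke pixel out of three
```

The two indexes are complementary: the L1 loss penalizes every differing pixel, whereas the IOU considers only where strokes are present.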

2. Effectiveness of field-grid transformation reconstruction

In the embodiment, a series of experiments are designed to verify the effectiveness of the reconstruction task based on the field lattice transformation, wherein the experiments include improving the generation effect, reducing the mode collapse, improving the training stability and the like. For this purpose, an existing model is selected as a reference model for comparison.

A. Improving the generation effect. In the embodiment of the application, a plurality of one-to-one font style conversion experiments are designed and compared with the reference model, and the corresponding experimental results are shown in table 4 below. As can be seen from table 4, compared with the reference model, the model provided by the present application improves both the content accuracy and the FID index. This shows that embedding the auxiliary task of field-character-grid transformation reconstruction into the cycle generation network helps the font generation model and the Chinese character identification model to extract Chinese character features better, thereby improving the generation performance of the font generation model.

TABLE 4

Further, in one example, as shown in fig. 6, a comparison of generated Chinese characters is exemplarily shown. As can be seen from the characters in fig. 6 (only a part of the generated characters is shown), the characters generated by the font generation model provided by the present application have higher quality, with significant advantages in content accuracy, stroke integrity and the like.

B. Reducing mode collapse. Mode collapse is an important problem that plagues existing deep generative models based on Generative Adversarial Networks (GAN). Mode collapse refers to the generative model producing the same pattern for different inputs (or fewer patterns than the number of inputs), which significantly reduces the performance and diversity of the deep generative model.

In one example, as shown in fig. 7, a schematic diagram of reducing mode collapse is exemplarily shown. The reference model collapses when converting from the imitation Song style to the black style, see the output Chinese character images in the second row circled by box 71 in fig. 7. Specifically, the reference model outputs the same pattern for different inputs, such as the "sink" character and the "dirty" character. As can be seen from the third row circled by box 72 in fig. 7, in the embodiment of the present application, after the self-supervised task based on field-character-grid geometric transformation reconstruction is embedded, the mode collapse of the font generation model is significantly alleviated. This experimental result indicates that the self-supervised task provided by the present application can indeed help the network to extract more important features, so that the pattern information of the Chinese characters is better preserved.

C. Improving training stability. In one example, as shown in fig. 8 (a) to 8 (b), fig. 8 (a) exemplarily shows a first diagram of the model training loss, and fig. 8 (b) exemplarily shows a second diagram of the model training loss. As shown in fig. 8 (a), the loss value of the reference model increases suddenly as the number of iterations increases, and is unstable. As shown in fig. 8 (b), by embedding the self-supervised task of field-character-grid geometric transformation code reconstruction in the training process of the font generation model, the training stability of the font generation model is effectively improved, and its loss value decreases steadily as the number of iterations increases.

3. Effect of the types of field-character-grid geometric transformation

The validation experiment of this embodiment explores the effect of the number of field-character-grid geometric transformations. In addition to the image transformation scheme proposed in the above embodiment (comprising the four field-character-grid geometric transformation modes, abbreviated herein as Tian4), the present application also considers the following three different groups of field-character-grid geometric transformations for comparison:

The first comparative image transformation scheme comprises three geometric transformation modes, namely (1) the unit transformation, (2) regions 1 and 2 interchanged and regions 3 and 4 interchanged, and (3) regions 1 and 3 interchanged and regions 2 and 4 interchanged. For convenience of description, it is referred to as Tian3.

The second comparative image transformation scheme comprises five geometric transformation modes, namely the following two geometric transformation modes added on the basis of the first comparative image transformation scheme: (1) regions 1 and 4 unchanged and regions 2 and 3 interchanged, and (2) regions 1 and 4 interchanged and regions 2 and 3 unchanged. For convenience of description, it is referred to as Tian5.

The third comparative image transformation scheme comprises seven geometric transformation modes, namely the following two geometric transformation modes added on the basis of the second comparative image transformation scheme: (1) regions 1 and 2 interchanged and regions 3 and 4 unchanged, and (2) regions 1 and 2 unchanged and regions 3 and 4 interchanged. For convenience of description, it is referred to as Tian7.
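The four schemes differ only in which region permutations they contain, so they can be written down compactly. In the sketch below, each permutation lists, for target positions 1 to 4, the number of the source region placed there; this representation and all constant and function names are assumptions for illustration.

```python
# Each tuple gives, for target positions 1..4, the source region placed there.
UNIT = (1, 2, 3, 4)        # nothing moves
SWAP_12_34 = (2, 1, 4, 3)  # regions 1<->2 and 3<->4
SWAP_13_24 = (3, 4, 1, 2)  # regions 1<->3 and 2<->4
SWAP_23 = (1, 3, 2, 4)     # regions 2<->3, regions 1 and 4 unchanged
SWAP_14 = (4, 2, 3, 1)     # regions 1<->4, regions 2 and 3 unchanged
SWAP_12 = (2, 1, 3, 4)     # regions 1<->2, regions 3 and 4 unchanged
SWAP_34 = (1, 2, 4, 3)     # regions 3<->4, regions 1 and 2 unchanged

TIAN3 = [UNIT, SWAP_12_34, SWAP_13_24]
TIAN4 = TIAN3 + [SWAP_23]
TIAN5 = TIAN4 + [SWAP_14]
TIAN7 = TIAN5 + [SWAP_12, SWAP_34]


def apply_permutation(perm, regions):
    """Rearrange four regions (given in positions 1..4) according to perm."""
    return [regions[src - 1] for src in perm]


def one_hot(scheme, perm):
    """One-hot code of a permutation within a scheme."""
    return tuple(1 if candidate == perm else 0 for candidate in scheme)


print(apply_permutation(SWAP_23, ["r1", "r2", "r3", "r4"]))
print(one_hot(TIAN4, UNIT), one_hot(TIAN7, SWAP_34))
```

The length of the one-hot code equals the number of transformations in the scheme, so moving from Tian4 to Tian7 changes the auxiliary classification task from 4 classes to 7.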

For the above four image transformation schemes, training was performed on the square-positive black-handwritten simple data set, and the corresponding experimental results are shown in table 5 and fig. 9. As can be seen from table 5, the field-character-grid geometric transformation scheme proposed in the embodiment of the present application (Tian4) has the highest content accuracy and the lowest FID value, indicating that this scheme performs best among the four.

TABLE 5

Further, in one example, as shown in fig. 9, Chinese characters generated by font generation models trained with the different image transformation schemes embedded are exemplarily shown. As can be seen from fig. 9, embedding the Tian4 image transformation scheme in the training process of the font generation model yields Chinese characters of higher quality, for example with the best performance in terms of details and stroke integrity.

In summary, the effectiveness of the method provided in the embodiments of the present application is verified on ten different Chinese font data sets. The ten fonts comprise a handwritten style, a simple Shu Ti, a Chinese amber style, a Chinese character Ling wave style, an imitation Song style, a Chinese character doll style, a Chinese character thin round style, a Chinese character regular black handwritten style, a regular style and a black style. Experimental results show that, by introducing the auxiliary task of field-character-grid transformation reconstruction, the font generation model based on the cycle generation network is significantly improved in content accuracy, style diversity and the like. In addition, the mode collapse that occurs when training the font generation model based on the cycle generation network is also greatly alleviated. Furthermore, the field-character-grid geometric transformation reconstruction task designed by the application can be seamlessly embedded into other deep generative models and Chinese character generation tasks, such as few-sample Chinese character generation.

Referring to fig. 10, a flowchart of a method for generating a Chinese character font according to an embodiment of the present application is shown. The method can be applied to the application program running environment shown in fig. 1. The method may include the following steps (1010-1020).

Step 1010, obtaining a source style Chinese character image.

Step 1020, inputting the source style Chinese character image into the trained font generation model, and performing Chinese character style conversion processing on the source style Chinese character image through the trained font generation model to generate a target style Chinese character image.

The trained font generation model is a machine learning model obtained by adversarial training with a Chinese character identification model. The Chinese character identification model performs identification processing on a Chinese character transformation image and outputs authenticity information of the Chinese character transformation image, which is used to adjust model parameters of the font generation model. The Chinese character transformation image is an image generated from a Chinese character image through geometric transformation processing, where geometric transformation processing refers to an image processing mode of dividing the Chinese character image into different regions and then adjusting the positions of the regions in the Chinese character image. Correspondingly, the Chinese character transformation image is the image obtained after the positions of the regions are adjusted.

In summary, according to the technical scheme provided by the embodiment of the application, geometric transformation is performed on the Chinese character image to guide the model network to extract higher-quality Chinese character features; the authenticity of the Chinese character transformation image is judged by the Chinese character identification model; and finally the parameters of the font generation model are adjusted according to the output of the Chinese character identification model and the real attribute of the image, so as to obtain the font generation model with the optimal effect, which can then convert source style Chinese characters into target style Chinese characters. The problem that deep learning models often suffer mode collapse in training is thereby effectively solved, the guidance and pertinence of feature extraction are improved, and the Chinese character generation effect is significantly improved.

Aiming at the problem that existing deep Chinese character font generation models based on unpaired data sets lack guidance in automatic feature extraction, the embodiment of the application improves the quality of the features extracted by the model network by designing an auxiliary task based on field-character-grid transformation and embedding it into an existing deep generative model, and thereby significantly improves the effect of the existing model in the Chinese character font generation task. The effectiveness of the designed auxiliary task is verified in a series of experiments. After the designed auxiliary task is embedded, the new model is clearly improved in the quality of the generated Chinese characters, in alleviating mode collapse, in training stability and the like. With the method provided herein, indexes such as the accuracy of the generated content, FID, L1 loss and IOU are clearly improved, and the generated Chinese characters also perform better in stroke integrity and fidelity. The field-character-grid auxiliary task requires neither additional labor cost nor changes to the network structure of the existing model, and it can be effectively transplanted to other deep Chinese character font generation models.

The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.

Referring to FIG. 11, a block diagram of an apparatus for generating a Chinese character font generation model according to an embodiment of the present application is shown. The apparatus has the function of implementing the above method for generating the Chinese character font generation model, and the function may be implemented by hardware or by hardware executing corresponding software. The apparatus may be a computer device or may be provided in a computer device. The apparatus 1100 may include: a Chinese character image acquisition module 1110, a geometric transformation processing module 1120, a Chinese character identification module 1130, and a model parameter adjustment module 1140.

The Chinese character image acquisition module 1110 is configured to acquire a Chinese character image, where the Chinese character image includes a source style Chinese character image and a target style Chinese character image.

The geometric transformation processing module 1120 is configured to perform geometric transformation processing on the Chinese character image to generate a Chinese character transformation image, where the geometric transformation processing refers to an image processing manner in which the Chinese character image is divided into different regions and the positions of the regions are then adjusted, and the Chinese character transformation image refers to the image obtained after the positions of the regions are adjusted.

The Chinese character identification module 1130 is configured to input the Chinese character transformation image into a Chinese character identification model, perform identification processing on the Chinese character transformation image through the Chinese character identification model, and output authenticity information of the Chinese character transformation image.

The model parameter adjustment module 1140 is configured to adjust model parameters of a font generation model based on the authenticity information to obtain a trained font generation model, where the font generation model is used to convert the source style Chinese character image into the target style Chinese character image.

In an exemplary embodiment, the Chinese character image acquisition module 1110 includes:

a source style Chinese character image acquisition unit, configured to acquire the source style Chinese character image; and

a target style Chinese character image generation unit, configured to input the source style Chinese character image into the font generation model, and to perform Chinese character style conversion processing on the source style Chinese character image through the font generation model to generate the target style Chinese character image.

In an exemplary embodiment, the geometric transformation processing module 1120 includes:

an image segmentation unit, configured to divide the Chinese character image into at least two areas; and

a geometric transformation unit, configured to perform the geometric transformation processing on the at least two areas to generate the Chinese character transformation image.

In an exemplary embodiment, the geometric transformation unit includes:

a transformation mode determining subunit, configured to determine the geometric transformation mode; and

a region geometric transformation subunit, configured to perform the geometric transformation processing on the at least two areas according to the geometric transformation mode to generate the Chinese character transformation image.

In an exemplary embodiment, the image segmentation unit is further configured to:

divide the Chinese character image into a first area image, a second area image, a third area image and a fourth area image by a horizontal center line and a vertical center line.

The first area image is the image in the area enclosed by the horizontal center line, the vertical center line, the left side line of the Chinese character image and the upper side line of the Chinese character image (the upper-left quadrant). The second area image is the image in the area enclosed by the horizontal center line, the vertical center line, the right side line of the Chinese character image and the upper side line of the Chinese character image (the upper-right quadrant). The third area image is the image in the area enclosed by the horizontal center line, the vertical center line, the left side line of the Chinese character image and the lower side line of the Chinese character image (the lower-left quadrant). The fourth area image is the image in the area enclosed by the horizontal center line, the vertical center line, the right side line of the Chinese character image and the lower side line of the Chinese character image (the lower-right quadrant).
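The division into four area images can be sketched as follows. This is a minimal illustration assuming the Chinese character image is held as a NumPy array whose first two axes are height and width; the function name `split_quadrants` is hypothetical and is not part of the present application.

```python
import numpy as np

def split_quadrants(img):
    """Split an image at its horizontal and vertical center lines.

    Returns the first (upper-left), second (upper-right), third
    (lower-left) and fourth (lower-right) area images, in that order.
    """
    h, w = img.shape[:2]
    cy, cx = h // 2, w // 2   # positions of the two center lines
    first = img[:cy, :cx]     # upper-left
    second = img[:cy, cx:]    # upper-right
    third = img[cy:, :cx]     # lower-left
    fourth = img[cy:, cx:]    # lower-right
    return first, second, third, fourth
```

For an image with even height and width, the four area images have identical shapes, which is what allows them to be interchanged freely.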

In an exemplary embodiment, the geometric transformation mode includes a first geometric transformation mode, and the Chinese character transformation image includes a unit transformation image.

The region geometric transformation subunit is further configured to:

generate the unit transformation image of the Chinese character image by keeping the positions of the first area image, the second area image, the third area image and the fourth area image unchanged, in a case where the geometric transformation mode is the first geometric transformation mode.

In an exemplary embodiment, the geometric transformation mode includes a second geometric transformation mode, and the Chinese character transformation image includes a second transformation image.

The region geometric transformation subunit is further configured to:

generate the second transformation image of the Chinese character image by interchanging the first area image and the second area image and interchanging the third area image and the fourth area image, in a case where the geometric transformation mode is the second geometric transformation mode.

In an exemplary embodiment, the geometric transformation mode includes a third geometric transformation mode, and the Chinese character transformation image includes a third transformation image.

The region geometric transformation subunit is further configured to:

generate the third transformation image of the Chinese character image by interchanging the first area image and the third area image and interchanging the second area image and the fourth area image, in a case where the geometric transformation mode is the third geometric transformation mode.

In an exemplary embodiment, the geometric transformation mode includes a fourth geometric transformation mode, and the Chinese character transformation image includes a fourth transformation image.

The region geometric transformation subunit is further configured to:

generate the fourth transformation image of the Chinese character image by interchanging the second area image and the third area image while keeping the positions of the first area image and the fourth area image unchanged, in a case where the geometric transformation mode is the fourth geometric transformation mode.
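The four geometric transformation modes described above can be sketched in a single function. This is a minimal illustration, assuming a NumPy image with even height and width and modes numbered 1 to 4; the name `grid_transform` and the integer encoding of the modes are hypothetical.

```python
import numpy as np

def grid_transform(img, mode):
    """Rearrange the four area images of `img` according to `mode` (1 to 4)."""
    h, w = img.shape[:2]
    if h % 2 or w % 2:
        raise ValueError("this sketch assumes even height and width")
    cy, cx = h // 2, w // 2
    q1, q2 = img[:cy, :cx], img[:cy, cx:]   # first (upper-left), second (upper-right)
    q3, q4 = img[cy:, :cx], img[cy:, cx:]   # third (lower-left), fourth (lower-right)
    layouts = {
        1: ((q1, q2), (q3, q4)),  # first mode: positions unchanged
        2: ((q2, q1), (q4, q3)),  # second mode: first<->second, third<->fourth
        3: ((q3, q4), (q1, q2)),  # third mode: first<->third, second<->fourth
        4: ((q1, q3), (q2, q4)),  # fourth mode: second<->third only
    }
    if mode not in layouts:
        raise ValueError("mode must be 1, 2, 3 or 4")
    (top_left, top_right), (bottom_left, bottom_right) = layouts[mode]
    top = np.concatenate([top_left, top_right], axis=1)
    bottom = np.concatenate([bottom_left, bottom_right], axis=1)
    return np.concatenate([top, bottom], axis=0)
```

Because each mode is its own inverse, applying the same mode twice restores the original image, which makes the applied mode a well-defined label for the identification model to predict.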

In an exemplary embodiment, the Chinese character identification model further outputs the geometric transformation mode of the Chinese character transformation image, and the model parameter adjustment module 1140 includes:

an adversarial loss acquisition unit, configured to acquire adversarial loss information of the font generation model and the Chinese character identification model based on the authenticity information;

a geometric transformation reconstruction loss acquisition unit, configured to acquire geometric transformation reconstruction loss information based on the geometric transformation mode; and

a parameter adjustment unit, configured to adjust the model parameters of the font generation model based on the adversarial loss information and the geometric transformation reconstruction loss information to obtain the trained font generation model.

In an exemplary embodiment, the apparatus 1100 further includes:

a reconstructed source style Chinese character image generation module, configured to input the target style Chinese character image into the font generation model, and to perform inverse Chinese character style conversion processing on the target style Chinese character image through the font generation model to generate a reconstructed source style Chinese character image.

In an exemplary embodiment, the model parameter adjustment module 1140 further includes:

a cycle consistency loss acquisition unit, configured to acquire cycle consistency loss information based on the reconstructed source style Chinese character image and the source style Chinese character image.

The parameter adjustment unit is further configured to:

adjust the model parameters of the font generation model based on the adversarial loss information, the geometric transformation reconstruction loss information and the cycle consistency loss information to obtain the trained font generation model.
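The combination of the three loss terms can be sketched as follows. This section does not restate the loss formulas, so every form below is an assumption made for illustration: a binary cross-entropy adversarial (countermeasure) loss, a cross-entropy over the predicted transformation mode as the geometric transformation reconstruction loss, an L1 cycle consistency loss, and the weights `lambda_geo` and `lambda_cyc`. All function names are hypothetical.

```python
import numpy as np

def adversarial_loss(real_prob, fake_prob, eps=1e-8):
    """Assumed binary cross-entropy form, from the identification model's view.

    real_prob / fake_prob: predicted probability of "real" for real and
    generated transformation images respectively.
    """
    real_term = -np.mean(np.log(real_prob + eps))
    fake_term = -np.mean(np.log(1.0 - fake_prob + eps))
    return float(real_term + fake_term)

def transform_mode_loss(mode_probs, true_modes, eps=1e-8):
    """Assumed cross-entropy between predicted and applied transformation modes.

    mode_probs: (N, 4) array of predicted mode probabilities.
    true_modes: (N,) integer array with values 0..3 (zero-based mode index).
    """
    picked = mode_probs[np.arange(len(true_modes)), true_modes]
    return float(-np.mean(np.log(picked + eps)))

def cycle_consistency_loss(source, reconstructed):
    """Assumed L1 distance between source and reconstructed source images."""
    return float(np.mean(np.abs(source - reconstructed)))

def total_loss(adv, geo, cyc, lambda_geo=1.0, lambda_cyc=10.0):
    """Weighted sum of the three terms; the weights are hypothetical."""
    return adv + lambda_geo * geo + lambda_cyc * cyc
```

In an actual training loop these quantities would be computed with an automatic differentiation framework so that gradients can flow back to the font generation model; the NumPy version only shows how the terms are assembled.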

In summary, in the technical solution provided by the embodiments of the present application, geometric transformation is performed on the Chinese character image to guide the model network to extract higher-quality Chinese character features. The authenticity of the Chinese character transformation image is judged by the Chinese character identification model, and the parameters of the font generation model are then adjusted according to the output of the Chinese character identification model and the real attribute of the image, so as to obtain a font generation model with the best effect. This effectively solves the problem that deep learning models often suffer mode collapse during training, makes feature extraction better guided and more targeted, and significantly improves the Chinese character generation effect.

Referring to FIG. 12, a block diagram of a Chinese character font generation apparatus according to an embodiment of the present application is shown. The apparatus has the function of implementing the above Chinese character font generation method, and the function may be implemented by hardware or by hardware executing corresponding software. The apparatus may be a computer device or may be provided in a computer device. The apparatus 1200 may include: a source style Chinese character image acquisition module 1210 and a Chinese character style conversion module 1220.

The source style Chinese character image acquisition module 1210 is configured to acquire a source style Chinese character image.

The Chinese character style conversion module 1220 is configured to input the source style Chinese character image into a trained font generation model, and to perform Chinese character style conversion processing on the source style Chinese character image through the trained font generation model to generate a target style Chinese character image.

It should be noted that, when the apparatus provided in the foregoing embodiment implements the functions thereof, the division of each functional module is merely used as an example, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules, so as to complete all or part of the functions described above. In addition, the apparatus and method embodiments provided in the above embodiments belong to the same concept, and specific implementation processes thereof are described in detail in the method embodiments, which are not described herein again.

Referring to FIG. 13, a block diagram of a computer device according to an embodiment of the present application is shown. The computer device may be a terminal or a server. The computer device is used to implement the method for generating the Chinese character font generation model provided in the above embodiments, or to implement the Chinese character font generation method provided in the above embodiments. Specifically:

Generally, the computer device 1300 includes: a processor 1301 and a memory 1302.

The processor 1301 may include one or more processing cores, for example a 4-core processor or an 8-core processor. The processor 1301 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 1301 may also include a main processor and a coprocessor, where the main processor is a processor for processing data in an awake state, also referred to as a CPU (Central Processing Unit), and the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 1301 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on a display screen. In some embodiments, the processor 1301 may further include an AI (Artificial Intelligence) processor for processing computational operations related to machine learning.

The memory 1302 may include one or more computer-readable storage media, which may be non-transitory. The memory 1302 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In some embodiments, a non-transitory computer-readable storage medium in the memory 1302 is used to store at least one instruction, at least one program, a set of codes, or a set of instructions configured to be executed by one or more processors to implement the method for generating the Chinese character font generation model provided in the above embodiments, or to implement the Chinese character font generation method provided in the above embodiments.

In some embodiments, the computer device 1300 may optionally further include: a peripheral device interface 1303 and at least one peripheral device. The processor 1301, the memory 1302, and the peripheral device interface 1303 may be connected by a bus or a signal line. Each peripheral device may be connected to the peripheral device interface 1303 via a bus, a signal line, or a circuit board. Specifically, the peripheral device includes at least one of: a radio frequency circuit 1304, a touch display 1305, a camera assembly 1306, an audio circuit 1307, a positioning assembly 1308, and a power supply 1309.

Those skilled in the art will appreciate that the architecture shown in FIG. 13 does not limit the computer device 1300, which may include more or fewer components than those shown, may combine some components, or may use a different arrangement of components.

In an exemplary embodiment, there is also provided a computer-readable storage medium having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions which, when executed by a processor, implements the above method for generating the Chinese character font generation model.

In an exemplary embodiment, there is also provided a computer-readable storage medium having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions which, when executed by a processor, implements the above Chinese character font generation method.

Optionally, the computer-readable storage medium may include: a ROM (Read Only Memory), a RAM (Random Access Memory), an SSD (Solid State Drive), or an optical disc. The random access memory may include a ReRAM (Resistive Random Access Memory) and a DRAM (Dynamic Random Access Memory).

It should be understood that "a plurality" herein means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate that A exists alone, that A and B exist simultaneously, or that B exists alone. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. In addition, the step numbers described herein only exemplarily show one possible execution order of the steps. In some other embodiments, the steps may be executed out of the numbered order; for example, two steps with different numbers may be executed simultaneously, or in an order opposite to that shown in the figures, which is not limited by the embodiments of the present application.

The above description is only exemplary of the present application and should not be taken as limiting the present application, and any modifications, equivalents, improvements and the like that are made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims

1. A method for generating a Chinese character font generation model is characterized by comprising the following steps:

dividing the Chinese character image into a first area image, a second area image, a third area image and a fourth area image respectively through a horizontal central line and a vertical central line, wherein the first area image is an image in an area surrounded by the horizontal central line, the vertical central line, a left side line of the Chinese character image and an upper side line of the Chinese character image, the second area image is an image in an area surrounded by the horizontal central line, the vertical central line, a right side line of the Chinese character image and the upper side line of the Chinese character image, the third area image is an image in an area surrounded by the horizontal central line, the vertical central line, the left side line of the Chinese character image and a lower side line of the Chinese character image, and the fourth area image is an image in an area surrounded by the horizontal central line, the vertical central line, the right side line of the Chinese character image and the lower side line of the Chinese character image;

determining a geometric transformation mode;

generating a unit transformation image of the Chinese character image while keeping the positions of the first area image, the second area image, the third area image and the fourth area image unchanged, in a case where the geometric transformation mode is a first geometric transformation mode;

under the condition that the geometric transformation mode is a second geometric transformation mode, interchanging the first area image and the second area image, interchanging the third area image and the fourth area image, and generating a second transformation image of the Chinese character image;

under the condition that the geometric transformation mode is a third geometric transformation mode, interchanging the first area image and the third area image, interchanging the second area image and the fourth area image, and generating a third transformation image of the Chinese character image;

under the condition that the geometric transformation mode is a fourth geometric transformation mode, interchanging the second area image and the third area image, keeping the positions of the first area image and the fourth area image unchanged, and generating a fourth transformation image of the Chinese character image;

inputting a Chinese character transformation image into a Chinese character identification model, identifying the Chinese character transformation image through the Chinese character identification model, and outputting authenticity information of the Chinese character transformation image and a geometric transformation mode of the Chinese character transformation image, wherein the Chinese character transformation image comprises at least one of the unit transformation image, the second transformation image, the third transformation image and the fourth transformation image;

obtaining adversarial loss information of the font generation model and the Chinese character identification model based on the authenticity information;

obtaining geometric transformation reconstruction loss information based on the geometric transformation mode;

inputting the target style Chinese character image into the font generation model, and performing Chinese character style conversion inverse processing on the target style Chinese character image through the font generation model to generate a reconstructed source style Chinese character image;

obtaining cycle consistency loss information based on the reconstructed source style Chinese character image and the source style Chinese character image;

and adjusting model parameters of the font generation model based on the adversarial loss information, the geometric transformation reconstruction loss information and the cycle consistency loss information to obtain a trained font generation model, wherein the font generation model is used for converting the source style Chinese character image into the target style Chinese character image.

2. The method of claim 1, wherein said obtaining a Chinese character image comprises:

acquiring the source style Chinese character image;

and inputting the source style Chinese character image into the font generation model, and performing Chinese character style conversion processing on the source style Chinese character image through the font generation model to generate the target style Chinese character image.

3. A method for generating a Chinese character font, the method comprising:

acquiring a source style Chinese character image;

the training process of the trained font generation model is as follows:

acquiring a Chinese character image, wherein the Chinese character image comprises the source style Chinese character image and the target style Chinese character image;

dividing the Chinese character image into a first area image, a second area image, a third area image and a fourth area image respectively through a horizontal central line and a vertical central line, wherein the first area image is an image in an area surrounded by the horizontal central line, the vertical central line, a left side line of the Chinese character image and an upper side line of the Chinese character image, the second area image is an image in an area surrounded by the horizontal central line, the vertical central line, a right side line of the Chinese character image and an upper side line of the Chinese character image, the third area image is an image in an area surrounded by the horizontal central line, the vertical central line, a left side line of the Chinese character image and a lower side line of the Chinese character image, and the fourth area image is an image in an area surrounded by the horizontal central line, the vertical central line, a right side line of the Chinese character image and a lower side line of the Chinese character image;

determining a geometric transformation mode;

4. An apparatus for generating a Chinese character font generation model, the apparatus comprising:

a geometric transformation processing module to: dividing the Chinese character image into a first area image, a second area image, a third area image and a fourth area image respectively through a horizontal central line and a vertical central line, wherein the first area image is an image in an area surrounded by the horizontal central line, the vertical central line, a left side line of the Chinese character image and an upper side line of the Chinese character image, the second area image is an image in an area surrounded by the horizontal central line, the vertical central line, a right side line of the Chinese character image and the upper side line of the Chinese character image, the third area image is an image in an area surrounded by the horizontal central line, the vertical central line, the left side line of the Chinese character image and a lower side line of the Chinese character image, and the fourth area image is an image in an area surrounded by the horizontal central line, the vertical central line, the right side line of the Chinese character image and the lower side line of the Chinese character image; determining a geometric transformation mode; generating a unit transformation image of the Chinese character image while keeping the positions of the first area image, the second area image, the third area image and the fourth area image unchanged, in a case where the geometric transformation mode is a first geometric transformation mode; under the condition that the geometric transformation mode is a second geometric transformation mode, interchanging the first area image and the second area image, interchanging the third area image and the fourth area image, and generating a second transformation image of the Chinese character image; under the condition that the geometric transformation mode is a third geometric transformation mode, interchanging the first area image and the third area image, interchanging the second area image and the fourth area image, and generating a third transformation image of the Chinese character image; under the condition that the geometric transformation mode is a fourth geometric transformation mode, interchanging the second area image and the third area image, keeping the positions of the first area image and the fourth area image unchanged, and generating a fourth transformation image of the Chinese character image;

a Chinese character identification module, configured to input a Chinese character transformation image to a Chinese character identification model, perform identification processing on the Chinese character transformation image through the Chinese character identification model, and output authenticity information of the Chinese character transformation image and a geometric transformation mode of the Chinese character transformation image, where the Chinese character transformation image includes at least one of the unit transformation image, the second transformation image, the third transformation image, and the fourth transformation image;

a model parameter adjustment module to: obtaining adversarial loss information of the font generation model and the Chinese character identification model based on the authenticity information; obtaining geometric transformation reconstruction loss information based on the geometric transformation mode; inputting the target style Chinese character image into the font generation model, and performing Chinese character style conversion inverse processing on the target style Chinese character image through the font generation model to generate a reconstructed source style Chinese character image; obtaining cycle consistency loss information based on the reconstructed source style Chinese character image and the source style Chinese character image; and adjusting model parameters of the font generation model based on the adversarial loss information, the geometric transformation reconstruction loss information and the cycle consistency loss information to obtain a trained font generation model, wherein the font generation model is used for converting the source style Chinese character image into the target style Chinese character image.

5. A Chinese character font generation apparatus, comprising:

the training process of the trained font generation model is as follows:

determining a geometric transformation mode;

obtaining adversarial loss information of a font generation model and the Chinese character identification model based on the authenticity information;

6. A computer device comprising a processor and a memory, wherein the memory has stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by the processor to implement the method for generating a Chinese character font generation model as claimed in claim 1 or 2, or to implement the method for generating a Chinese character font as claimed in claim 3.

7. A computer-readable storage medium having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement the method for generating a Chinese character font generation model as claimed in claim 1 or 2, or to implement the method for generating a Chinese character font as claimed in claim 3.