CN114118012A - Method for generating personalized fonts based on cycleGAN


Info

Publication number
CN114118012A
Authority
CN
China
Prior art keywords
image
picture
domain
generated
discriminator
Prior art date
Legal status
Pending
Application number
CN202111404620.8A
Other languages
Chinese (zh)
Inventor
李治江
徐慧婷
黄煜城
曹丽琴
Current Assignee
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date
Filing date
Publication date
Application filed by Wuhan University WHU
Priority to CN202111404620.8A
Publication of CN114118012A
Legal status: Pending (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/10 Text processing
    • G06F 40/103 Formatting, i.e. changing of presentation of documents
    • G06F 40/109 Font handling; Temporal or kinetic typography
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Abstract

The invention provides a method for generating personalized fonts based on CycleGAN. Building on the idea of image style transfer, a font style transfer model is constructed on a cycle-consistent generative adversarial network (CycleGAN): the image style transfer approach is applied to font style design, the generator of the network is optimized by replacing its original ResNet structure with a U-net structure, appropriate model parameters are selected, a dedicated data set is built and augmented with data-enhancement operations, and a generated font image with a good effect is finally obtained.

Description

Method for generating personalized fonts based on cycleGAN
Technical Field
The invention belongs to the field of style image generation, and particularly relates to a font image generation model for handwritten personalized fonts based on a cycle-consistent generative adversarial network.
Background
The style design of Chinese fonts refers to converting the glyphs of target content directly into a given style on the basis of a small number of existing style glyphs. Chinese is a highly complex logographic language, and Chinese font design is correspondingly complicated. The number and variety of Chinese characters far exceed those of the phonetic English alphabet, so the workload of completing a Chinese font with reasonably full coverage is enormous. Moreover, in an era that pursues personalization, many people want to build their own font libraries; building such a library, however, requires a person to write and upload a large number of Chinese characters, so the threshold for establishing a personal font library is high.
In recent years, many models have been tried in research on image style transfer and have achieved good results. Relatively mature technologies include generative adversarial networks (GANs), pix2pix, and the like. These image style transfer methods serve as an important reference for Chinese character style transfer. Influenced by this idea, Chinese characters are treated directly as pictures: a small number of Chinese character pictures are fed into the network, their style is learned, and Chinese character pictures in that style are output.
Shamir et al. proposed a parameterized feature-modeling approach for font design, but the method is not automatic and requires manual extraction of features and constraints, which is quite complicated. Suveeranont et al. proposed an example-based method for automatically generating new fonts that relies on user-defined instances: corresponding topological features are extracted from the outline of each character, the extracted features are retained and combined with the original font, and corresponding weights are added to generate a new font. Automatic identification techniques have also been proposed to recognize specific content in text images across different fonts and formats. Xu et al. proposed an intelligent algorithm that simulates the style of handwritten Chinese characters by decomposing and recombining character components, but the algorithm is complex and of limited applicability. Compared with English fonts, the strokes and structures of Chinese characters are more complex, so the generated results are often unsatisfactory, with distortion, breakage, and similar defects.
With the development of deep learning, the field of image style transfer has seen major breakthroughs, including methods for image-to-image translation based on generative adversarial networks. Image-to-image translation means taking an image from a specific domain and "translating" it, according to its content characteristics, into another domain to generate a corresponding image. Generative adversarial networks have greatly expanded the scope of style transfer, extending it to transfer between arbitrary image sets. Researchers have therefore explored font style transfer with deep learning techniques, initially targeting the simpler English fonts. In 2017, Baluja et al. proposed a deep model that learns English fonts with a deep neural network: taking 4 English letters as input, the model learns their style and generates the remaining 22 letters in the same style. Also in 2017, Azadi et al. proposed a stacked conditional generative adversarial network (Stacked cGAN) for synthesizing artistic English lettering. The model takes raw inputs and outputs predictions directly, requires no separate feature extraction, and is end-to-end; with a small number of example letters it can synthesize the required artistic English letters. The model is also broadly applicable: it can learn from various data sets and generate correspondingly styled fonts.
However, in terms of strokes, structure, and glyphs, Chinese characters are far more complicated than English letters, and style transfer methods that work well on English fonts give unsatisfactory results when applied to Chinese characters. Accordingly, more researchers have begun to study style transfer for Chinese characters. In 2017, Lyu et al. proposed a deep neural network (DNN) based model for font style transfer of Chinese calligraphy. Borrowing the idea of ordinary image style transfer, the model realizes style transfer from Chinese character image to Chinese character image; however, like ordinary deep learning networks for this task, its training requires a large number of paired images, which are difficult to obtain, so its range of application is narrow.
In summary, for personalized Chinese fonts, how to realize style transfer between unpaired images is a problem of great concern to researchers in this field.
Disclosure of Invention
Aiming at the problems in the prior art, the generation of Chinese character fonts can be regarded as a variant of the image style transfer task, and a CycleGAN network can be fully exploited to realize style transfer between unpaired images.
The invention provides a Chinese character image style transfer model based on CycleGAN, which improves the generator structure in the CycleGAN network, replacing it with an improved encoder-decoder structure suited to Chinese character image style transfer and increasing the capacity to learn Chinese character styles. The method specifically comprises the following steps:
step 1, collecting an image data set, preprocessing image data in the image data set, and constructing a training data set required by an experiment, wherein the training data set comprises a Chinese character image data set and a handwritten Chinese character image data set;
step 2, constructing a cycle-consistent generative adversarial network model to realize style transfer of Chinese character images;
in the cycle-consistent generative adversarial network model, the generator structure in the original CycleGAN network is replaced by a U-net structure, and a PatchGAN network is used in the discriminator;
the aim of the cycle-consistent generative adversarial network model is to realize domain-to-domain mapping, i.e., to learn the mapping relation of style conversion between an input domain A and a target domain B rather than a one-to-one mapping between a specific input picture a and a specific target picture b in the two data domains, thereby removing the model's heavy dependence on paired pictures;
step 3, training the cycle-consistent generative adversarial network model constructed in step 2 by using the training data set from step 1 to generate style pictures;
and step 4, continuously correcting the parameters of the cycle-consistent generative adversarial network model according to the effect of the generated pictures and modifying the loss function of the model to obtain style pictures with better effect, finally determining the best-performing parameters and generating the corresponding result pictures.
Further, the training data set required by the experiment comprises a Song-style Chinese character image data set and a handwritten Chinese character image data set; the two data sets each contain N images of size 128 × 128, whose content is N Song-style Chinese characters and the corresponding handwritten Chinese characters; a number of images are selected from the data sets as the test set, and the following processing is carried out before training to construct the data sets:
step 11: song style Chinese character image selection
When constructing the Song-style Chinese character image data set, a large set of characters is first generated according to the encoding range of Chinese characters and stored in a txt file; the characters stored in the txt file are rendered, in order, as corresponding 128 × 128 Song-style pictures; the Song-style images are preprocessed, and the N Song-style character images in most common everyday use are selected;
step 12: handwritten Chinese character image acquisition
Selecting N Chinese characters to write by hand, ensuring a uniform style and identical picture sizes;
step 13: data enhancement
In order to expand the data volume, each picture of the data set is inverted to generate a new picture, and the stroke characteristics of the new picture are ensured not to change.
Further, the specific structure of the cycle-consistent generative adversarial network model comprises:
two generators, Generator A2B and Generator B2A, and two discriminators, Discriminator A and Discriminator B;
the generator adopts a U-net network, the U-net network is divided into a left encoding part and a right decoding part, the encoding part is used for extracting the characteristics of a comparison surface of a picture, the characteristic extraction process is realized by down-sampling and convolution, the number of channels is increased while the size of the picture is reduced, the decoding part is used for extracting the characteristics of a deeper comparison kernel of the picture, the process is realized by up-sampling and deconvolution, the filling mode of deconvolution selects a valid mode of stride and valid, the size of the picture is increased and the number of channels is reduced during the decoding process, the connection mode in the middle of the U-net network is jump connection, the mode combines the characteristics obtained by the encoding part and the decoding part, the fusion of shallow layer characteristics and deep layer characteristics is realized, the final characteristic image is obtained, and the obtained characteristic image is subdivided, performing predictive segmentation to obtain a final predictive segmentation map; the method specifically comprises the following steps: inputting a handwritten font picture into a U-net network, alternately extracting shallow layer features through a convolution layer of 3 x 3 and a maximum pooling layer of 2 x2, extracting deep layer features of the picture through an deconvolution layer and an upsampling layer, wherein the size of the picture is increased in the upsampling process, and finally outputting a feature image; splicing the features extracted by the coding part with the decoding part through jump connection, and combining the shallow features with the deep layers;
the discriminator uses a 70 × 70 PatchGAN network, three sequential layers are used in the discriminator to carry out down-sampling on the spliced image, in addition, Zero _ padding layers are added to carry out edge padding on the image, the size of the image output by the discriminator is 30 × 30, and the number of channels is 1.
Further, the cycle-consistent generative adversarial network model obtains an input image Input_A from the input domain A and passes Input_A to the first generator, Generator A2B; the generator converts the input image into the target domain B and generates an image Generated_B with the target style; the generated style image Generated_B enters Discriminator B, which judges whether the generated image is real; meanwhile, Generated_B is passed through Generator B2A to produce an image Cyclic_A similar to the original input image, and a loss function is computed between Cyclic_A and the original input image Input_A; when the value of the loss function is smaller than a certain threshold, the training of the model meets the requirement.
Further, the loss function of the cycle-consistent generative adversarial network model consists of three parts: two GAN loss functions and one cycle-consistency loss function;
the loss functions of the two GANs are shown in (2) and (3), respectively:

L_GAN(G, D_Y, X, Y) = E_{y~p_data(y)}[log D_Y(y)] + E_{x~p_data(x)}[log(1 - D_Y(G(x)))]    (2)

L_GAN(F, D_X, Y, X) = E_{x~p_data(x)}[log D_X(x)] + E_{y~p_data(y)}[log(1 - D_X(F(y)))]    (3)

in formula (2), p_data is the data distribution of real images, E denotes the expectation, X and Y denote the Song-style Chinese character image domain and the handwritten Chinese character image domain, x denotes image data in the X domain, y denotes image data in the Y domain, G denotes the generator from the X domain to the Y domain, and D_Y is the discriminator that judges whether a Y-domain picture is real; D_Y(y) is the discriminator's judgment of a real picture y in the Y domain, G(x) is a fake Y-domain picture produced by the generator, and D_Y(G(x)) is the discriminator's judgment of that generated fake picture; in formula (3), D_X is the discriminator that judges whether an X-domain picture is real, D_X(x) is its judgment of a picture x in the X domain, F denotes the generator from the Y domain to the X domain, F(y) is a generated fake X-domain picture, and D_X(F(y)) is the discriminator's judgment of that generated fake picture;
the cycle-consistency loss function computes the loss between the input picture x and the fake picture F(G(x)) generated after passing through the two generators, and between y and the fake picture G(F(y)) generated after passing through the two generators; the specific formula is shown in (4):

L_cyc(G, F) = E_{x~p_data(x)}[||F(G(x)) - x||_1] + E_{y~p_data(y)}[||G(F(y)) - y||_1]    (4)

finally, the full loss function is shown in (5):

L(G, F, D_X, D_Y) = L_GAN(G, D_Y, X, Y) + L_GAN(F, D_X, Y, X) + λ·L_cyc(G, F)    (5)

where λ is the weight coefficient of the cycle-consistency loss and can be tuned as needed.
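The three loss terms can be sketched in PyTorch as follows. This assumes generator and discriminator modules like the sketches above; the binary cross-entropy mirrors the log terms of (2) and (3), and λ = 10 is only an illustrative default, since the patent leaves λ adjustable.

```python
import torch
import torch.nn.functional as nnf

def gan_losses(D, real, fake):
    """Adversarial loss of one GAN, cf. eqs. (2)/(3): D should score real
    pictures as 1 and generated pictures as 0; G tries to fool D."""
    pred_real = D(real)
    pred_fake = D(fake.detach())                 # detach: D's loss must not train G
    loss_d = (nnf.binary_cross_entropy_with_logits(pred_real, torch.ones_like(pred_real))
              + nnf.binary_cross_entropy_with_logits(pred_fake, torch.zeros_like(pred_fake)))
    pred_fool = D(fake)                          # generator side: no detach
    loss_g = nnf.binary_cross_entropy_with_logits(pred_fool, torch.ones_like(pred_fool))
    return loss_d, loss_g

def cycle_loss(x, cyclic_x, y, cyclic_y, lam=10.0):
    """Cycle-consistency loss, cf. eq. (4), weighted by lambda as in eq. (5):
    cyclic_x = F(G(x)) and cyclic_y = G(F(y)) should reproduce the inputs."""
    return lam * (nnf.l1_loss(cyclic_x, x) + nnf.l1_loss(cyclic_y, y))
```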
Compared with the prior art, the invention has the following advantages and beneficial effects: training of the CycleGAN model adopted in the invention does not require paired images, so a data set is easier to acquire. The model uses a generative adversarial network, so the generated images are more realistic, with high definition and consistency. The generator structure in the CycleGAN model is replaced by a U-net structure, which better matches the stroke-line character of fonts and is better suited to font style transfer.
Drawings
FIG. 1 shows the structure of the original CycleGAN generator in this experiment;
FIG. 2 is a U-net network architecture;
FIG. 3 is a diagram showing the structure of PatchGAN;
FIG. 4 is a structural diagram of a modified cycleGAN;
FIG. 5 is the experimental virtual environment;
FIG. 6 is a diagram of the generated font effects.
Detailed Description
The invention provides a method for generating personalized fonts based on CycleGAN, comprising the following research content:
(1) A Song-style Chinese character data set and a self-built handwritten Chinese character data set are used as training sets, and a data-enhancement method that inverts the Chinese character images is adopted, doubling the size of the data sets while keeping the Song-style and handwritten content consistent.
(2) A cycle-consistent generative adversarial network (CycleGAN) is constructed to realize style transfer of Chinese character images. Pictures from the preprocessed training (train) data set are input for model training; data from the test set (test) are then used for testing, and the font style transfer effect is observed. On this basis, the parameters, loss functions, and generator structure of the model are improved, and Chinese character style transfer pictures with better effect are output.
The construction of the cycle-consistent generative adversarial network model comprises the following steps:
Step 1: Construct the Song-style Chinese character data set and the handwritten Chinese character data set. The structure of the generator of the original CycleGAN network (FIG. 1) is improved: the generator is replaced by a U-net structure, and the U-net network model consists of downsampling, upsampling, and skip connections (FIG. 2). A handwritten font picture is input into the U-net network, and shallow features are extracted by alternating 3 × 3 convolution layers and 2 × 2 max-pooling layers. Deep features of the picture are extracted through the deconvolution layers and upsampling, the picture size increasing during upsampling, and a feature image is finally output. The middle of the network is joined by skip connections, which splice the features extracted by the encoding part with the decoding part, combining shallow features with deep ones. The discriminator in the CycleGAN network uses a 70 × 70 PatchGAN network (FIG. 3). Three Sequential layers are used in the discriminator to downsample the spliced image, and a ZeroPadding layer is added to the network to pad the image edges. The image output by the discriminator is 30 × 30 in size with 1 channel; the PatchGAN network treats the input image as 70 × 70 patches, obtained by tracing back from the feature map finally output by the discriminator, so that each output element corresponds to a 70 × 70 region of the input image. The improved CycleGAN network (FIG. 4) has two generators (Generator A2B (U-net) and Generator B2A (U-net)) and two discriminators (Discriminator A and Discriminator B). The model takes an input image Input_A from the input domain A and passes it to the first generator, Generator A2B (U-net), which converts the input image into the target domain B and generates an image Generated_B with the target style. The generated style image Generated_B enters Discriminator B, which judges whether the generated image is real. Meanwhile, Generated_B is passed through Generator B2A (U-net) to produce an image Cyclic_A similar to the original input image, and a loss is computed against the original input Input_A; when this loss is less than a certain value, the training of the model meets the requirement. The improved CycleGAN network is trained to generate 128 × 128 handwritten-style images, iterating repeatedly until the network can generate sufficiently realistic images; one such training step is sketched below.
Step 2: Using the model saved in step 1, input the pictures in the test set and generate handwritten-style font pictures through the game between the generator and the discriminator.
Step 3: Calculate the loss functions, including the basic loss functions of the two GAN networks and the cycle-consistency loss function.
The loss function of the CycleGAN network consists of three parts: two GAN loss functions and one cycle-consistency loss function. The optimization problem of a GAN network is in fact a minimax problem: first the generator G is kept unchanged and the discriminator D is optimized so that its discrimination accuracy is maximal; then the discriminator is kept unchanged and the generator G is optimized so that the generated fake pictures approach the real pictures, the best effect being reached when the two are indistinguishable. The basic GAN loss is shown in (1):

min_G max_D V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))]    (1)

In formula (1), x represents image data, z is the input noise of G, G(z) represents a fake picture generated by the generator, p_data is the data distribution of real images, p_z is the distribution of the noise z fed through the G network, and E denotes the expectation. The inner term max_D V(D, G) means that the generator G is kept unchanged while the discriminator is optimized to distinguish, as reliably as possible, whether the data comes from a generated fake image or a real image. Treating that term as a whole, the outer min_G then means that the discriminator D is kept unchanged while the generator is optimized so that the difference between the fake images it generates and the real images is minimized.
Since there are two symmetric GAN networks in the CycleGAN network, the GAN loss likewise comes in two "symmetric" functions, shown in (2) and (3) below:

L_GAN(G, D_Y, X, Y) = E_{y~p_data(y)}[log D_Y(y)] + E_{x~p_data(x)}[log(1 - D_Y(G(x)))]    (2)

L_GAN(F, D_X, Y, X) = E_{x~p_data(x)}[log D_X(x)] + E_{y~p_data(y)}[log(1 - D_X(F(y)))]    (3)

In formula (2), X and Y represent the Song-style Chinese character image domain and the handwritten Chinese character image domain, respectively; x denotes image data in the X domain, y denotes image data in the Y domain, G denotes the X-to-Y generator, and D_Y is the discriminator that judges whether a Y-domain picture is real. D_Y(y) denotes the discriminator's judgment of a real picture y in the Y domain, G(x) denotes a fake picture produced by the generator, and D_Y(G(x)) is the discriminator's judgment of that generated fake Y-domain picture. Formula (3) has the same meaning with the domains exchanged: D_X is the discriminator that judges whether an X-domain picture is real, D_X(x) is its judgment of a picture x in the X domain, F denotes the Y-to-X generator, F(y) is a generated fake X-domain picture, and D_X(F(y)) is the discriminator's judgment of that generated fake picture.

In addition to the loss function of a conventional GAN network, CycleGAN has its own unique loss, the cycle-consistency loss. To ensure that the model is a one-to-one mapping, the CycleGAN network introduces a cycle-consistency loss that computes the loss between the input picture x (resp. y) and the fake picture F(G(x)) (resp. G(F(y))) generated after passing through the two generators. The specific formula is shown in (4):

L_cyc(G, F) = E_{x~p_data(x)}[||F(G(x)) - x||_1] + E_{y~p_data(y)}[||G(F(y)) - y||_1]    (4)

Finally, the full loss function is shown in (5):

L(G, F, D_X, D_Y) = L_GAN(G, D_Y, X, Y) + L_GAN(F, D_X, Y, X) + λ·L_cyc(G, F)    (5)

where λ is the weight coefficient of the cycle-consistency loss and can be tuned as needed.
Step 4: Compute the result through the loss function, update the model parameters, and iterate continuously (i.e., repeat steps 2, 3, and 4) until pictures close to the handwritten font style are generated. The model parameters are corrected continuously to obtain style pictures with better effect; finally, the best-performing hyperparameters are fixed and the corresponding result pictures are generated.
The network can be trained and used for inference on a computer, implemented with the PyTorch deep learning framework under the Ubuntu 16.04 LTS operating system. The specific experimental environment configuration is shown in FIG. 5.
the specific implementation mode is as follows:
step 1: and constructing a data set. The training data set used in the experiment is a Song style Chinese character data set and a self-built handwritten character data set, the two data sets respectively comprise 750 images of 128 x 128 size, the image content comprises 750 Song style Chinese characters and corresponding handwritten character, and 50 images are selected from the data set to serve as a test set. Before training, the following processing is performed in order to construct a data set:
(1) song style Chinese character image selection
When constructing the Song-style Chinese character image data set, 20869 characters are generated according to the encoding range of Chinese characters and stored in a txt file. The characters stored in the txt file are rendered, in order, as corresponding 128 × 128 Song-style pictures, 20869 pictures in total. The pictures are preprocessed, and the 750 Song-style character images in most common everyday use are selected.
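This rendering step can be sketched with Pillow; the patent does not name the tooling, so the library choice, the font file simsun.ttc, and the file layout below are assumptions:

```python
from PIL import Image, ImageDraw, ImageFont

FONT_PATH = "simsun.ttc"                  # hypothetical path to a Song-typeface font file
font = ImageFont.truetype(FONT_PATH, size=112)

with open("characters.txt", encoding="utf-8") as f:
    chars = f.read().strip()              # the characters generated into the txt file

for i, ch in enumerate(chars):
    img = Image.new("L", (128, 128), color=255)      # white 128x128 grayscale canvas
    draw = ImageDraw.Draw(img)
    # center the glyph on the canvas before saving
    left, top, right, bottom = draw.textbbox((0, 0), ch, font=font)
    draw.text(((128 - (right - left)) // 2 - left, (128 - (bottom - top)) // 2 - top),
              ch, fill=0, font=font)
    img.save(f"song/{i:05d}.png")
```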
(2) Handwritten Chinese character image acquisition
This work constructs its own handwritten Chinese character data set. The data set was written personally by a single student, who selected 750 Chinese characters from a pool of thousands to write, ensuring a uniform style and identical picture sizes.
(3) Data enhancement
Training a deep learning network model requires a large amount of data. To expand the data volume, each picture of the data set is inverted to generate a new picture whose stroke characteristics are guaranteed not to change. This greatly expands the data volume, enlarging the training set from 750 to 1400 images.
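Reading "inverted" as pixel-value inversion (black strokes on white become white strokes on black, leaving stroke shapes untouched), the augmentation can be sketched as follows; the directory layout is hypothetical:

```python
from pathlib import Path
from PIL import Image, ImageOps

# every training picture in train/ gets an inverted twin, doubling the
# data volume without changing any stroke characteristics
for path in Path("train").glob("*.png"):
    img = Image.open(path).convert("L")
    ImageOps.invert(img).save(path.with_name(path.stem + "_inv.png"))
```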
step 2: and (5) training a cycleGAN model. Putting the constructed training set into corresponding data folders (databases), improving an original cycleGAN model, and training the model according to the set hyper-parameters, wherein the improved cycleGAN model has the following structure:
(1) There are two generators (Generator A2B and Generator B2A) and two discriminators (Discriminator A and Discriminator B) in the CycleGAN network. To enable training on unpaired images, the CycleGAN network must ensure that the mapping relationship is not many-to-one; that is, the data X1, X2, ... in the input domain X must not all be converted into the same data Y in the destination domain Y through the mapping. To satisfy this condition, the CycleGAN network introduces a new concept, the cycle: the mapped data y is mapped back to the input domain X to generate a corresponding fake image, and a cycle-consistency loss keeps the real image and this fake image close, ensuring that the model is a one-to-one, domain-to-domain mapping that needs no matched images.
(2) The improved generator network structure adopts a U-net network. The U-net network is divided into a left encoding part and a right decoding part: the encoding part extracts the relatively shallow features of the picture, a process realized by downsampling and convolution that reduces the picture size while increasing the number of channels; the decoding part extracts the relatively deep features of the picture, a process realized by upsampling and deconvolution (the deconvolution uses strided, 'valid'-style padding) that increases the picture size while reducing the number of channels. The two parts of the U-net network are joined by skip connections, which combine the features obtained by the encoding and decoding parts, fusing shallow and deep features to obtain the final feature image; the feature image is then subdivided and predictively segmented to obtain the final prediction segmentation map;
(3) The discriminator in the CycleGAN network uses a 70 × 70 PatchGAN network. The CycleGAN network model has two discriminators; Discriminator B is taken for analysis. The discriminator takes as input the fake picture Generated_B produced by Generator A2B and the style picture in the target domain, splices the two with a concatenation function, and downsamples the result to obtain the image features and judge whether the image is real or fake.
(4) In the generator, the downsampling part uses the Leaky ReLU activation function and the upsampling part uses the ReLU activation function. In the discriminator, only the Leaky ReLU activation function is used, since there is only downsampling.
Step 3: Update the model parameters using the Song-style Chinese character images in the test set and the computed loss functions (including the basic loss functions of the two GAN networks and the cycle-consistency loss function), continuously correcting the model parameters to obtain style pictures with better effect.
FIG. 6 shows a selection of the generated fonts for comparison, covering four different fonts: the Song font, the font generated by the original CycleGAN network, the font generated by the improved CycleGAN network (with the generator replaced by U-net), and the handwriting. Comparing the original CycleGAN network with the improved one, the improved network has a clear advantage: its generated images perform well in the fluency of font strokes, the combination of font structure, and other respects.
The specific implementations described herein merely illustrate the invention. Those skilled in the art may make various modifications or additions to the described embodiments, or substitute alternatives, without departing from the spirit of the invention or the scope defined by the appended claims.

Claims (5)

1. A method for generating personalized fonts based on CycleGAN, characterized by comprising the following steps:
step 1, collecting an image data set, preprocessing image data in the image data set, and constructing a training data set required by an experiment, wherein the training data set comprises a Chinese character image data set and a handwritten Chinese character image data set;
step 2, constructing a cycle-consistent generative adversarial network model to realize style transfer of Chinese character images;
in the cycle-consistent generative adversarial network model, the generator structure in the original CycleGAN network is replaced by a U-net structure, and a PatchGAN network is used in the discriminator;
the cycle-consistent generative adversarial network model is used to realize the mapping from one domain to another, i.e., to learn the mapping relation of style conversion between the input domain and the target domain rather than a one-to-one mapping between specific input pictures and target pictures in the two data domains, thereby removing the model's heavy dependence on paired pictures;
step 3, training the cycle-consistent generative adversarial network model constructed in step 2 by using the training data set from step 1 to generate style pictures;
and step 4, continuously correcting the parameters of the cycle-consistent generative adversarial network model according to the effect of the generated style pictures and modifying the loss function of the model to obtain style pictures with better effect, finally determining the best-performing parameters and generating the corresponding result pictures.
2. The method for generating personalized fonts based on CycleGAN as claimed in claim 1, characterized in that: the training data set required by the experiment comprises a Song-style Chinese character image data set and a handwritten Chinese character image data set; the two data sets each contain N images of size 128 × 128, whose content is N Song-style Chinese characters and the corresponding handwritten Chinese characters; a number of images are selected from the data sets as the test set, and the following processing is carried out before training to construct the data sets:
step 11: song style Chinese character image selection
When constructing the Song-style Chinese character image data set, a large set of characters is first generated according to the encoding range of Chinese characters and stored in a txt file; the characters stored in the txt file are rendered, in order, as corresponding 128 × 128 Song-style pictures; the Song-style images are preprocessed, and the N Song-style character images in most common everyday use are selected;
step 12: handwritten Chinese character image acquisition
Selecting N Chinese characters to write by hand, ensuring a uniform style and identical picture sizes;
step 13: data enhancement
In order to expand the data volume, each picture of the data set is inverted to generate a new picture, and the stroke characteristics of the new picture are ensured not to change.
3. The method for generating personalized fonts based on CycleGAN as claimed in claim 1, characterized in that: the specific structure of the cycle-consistent generative adversarial network model comprises:
two generators, Generator A2B and Generator B2A, and two discriminators, Discriminator A and Discriminator B;
the generator adopts a U-net network; the U-net network is divided into a left encoding part and a right decoding part: the encoding part extracts the relatively shallow features of the picture, a process realized by downsampling and convolution that reduces the picture size while increasing the number of channels; the decoding part extracts the relatively deep features of the picture, a process realized by upsampling and deconvolution (the deconvolution uses strided, 'valid'-style padding) that increases the picture size while reducing the number of channels; the two parts of the U-net network are joined by skip connections, which combine the features obtained by the encoding and decoding parts, fusing shallow and deep features to obtain the final feature image; the feature image is then subdivided and predictively segmented to obtain the final prediction segmentation map; specifically: a handwritten font picture is input into the U-net network, and shallow features are extracted by alternating 3 × 3 convolution layers and 2 × 2 max-pooling layers; deep features of the picture are then extracted through the deconvolution and upsampling layers, the picture size increasing during upsampling, and a feature image is finally output; the features extracted by the encoding part are spliced with the decoding part through the skip connections, combining shallow features with deep ones;
the discriminator uses a 70 × 70 PatchGAN network: three Sequential layers are used in the discriminator to downsample the spliced image, and ZeroPadding layers are added to pad the image edges; the image output by the discriminator is 30 × 30 in size with 1 channel.
4. The method for generating personalized fonts based on CycleGAN as claimed in claim 3, characterized in that: the cycle-consistent generative adversarial network model obtains an input image Input_A from the input domain A and passes Input_A to the first generator, Generator A2B; the generator converts the input image into the target domain B and generates an image Generated_B with the target style; the generated style image Generated_B enters Discriminator B, which judges whether the generated image is real; meanwhile, Generated_B is passed through Generator B2A to produce an image Cyclic_A similar to the original input image, and a loss function is computed between Cyclic_A and the original input image Input_A; when the value of the loss function is smaller than a certain threshold, the training of the model meets the requirement.
5. The method for generating personalized fonts based on CycleGAN as claimed in claim 1, characterized in that: the loss function of the cycle-consistent generative adversarial network model consists of three parts: two GAN loss functions and one cycle-consistency loss function;
the loss functions of the two GANs are shown in (2) and (3), respectively:

L_GAN(G, D_Y, X, Y) = E_{y~p_data(y)}[log D_Y(y)] + E_{x~p_data(x)}[log(1 - D_Y(G(x)))]    (2)

L_GAN(F, D_X, Y, X) = E_{x~p_data(x)}[log D_X(x)] + E_{y~p_data(y)}[log(1 - D_X(F(y)))]    (3)

in formula (2), p_data is the data distribution of real images, E denotes the expectation, X and Y denote the Song-style Chinese character image domain and the handwritten Chinese character image domain, x denotes image data in the X domain, y denotes image data in the Y domain, G denotes the generator from the X domain to the Y domain, and D_Y is the discriminator that judges whether a Y-domain picture is real; D_Y(y) is the discriminator's judgment of a real picture y in the Y domain, G(x) is a fake Y-domain picture produced by the generator, and D_Y(G(x)) is the discriminator's judgment of that generated fake picture; in formula (3), D_X is the discriminator that judges whether an X-domain picture is real, D_X(x) is its judgment of a picture x in the X domain, F denotes the generator from the Y domain to the X domain, F(y) is a generated fake X-domain picture, and D_X(F(y)) is the discriminator's judgment of that generated fake picture;
the cycle-consistency loss function computes the loss between the input picture x and the fake picture F(G(x)) generated after passing through the two generators, and between the input y and the fake picture G(F(y)) generated after passing through the two generators; the specific formula is shown in (4):

L_cyc(G, F) = E_{x~p_data(x)}[||F(G(x)) - x||_1] + E_{y~p_data(y)}[||G(F(y)) - y||_1]    (4)

finally, the full loss function is shown in (5):

L(G, F, D_X, D_Y) = L_GAN(G, D_Y, X, Y) + L_GAN(F, D_X, Y, X) + λ·L_cyc(G, F)    (5)

where λ is the weight coefficient of the cycle-consistency loss and can be tuned as needed.
CN202111404620.8A 2021-11-24 2021-11-24 Method for generating personalized fonts based on cycleGAN Pending CN114118012A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111404620.8A CN114118012A (en) 2021-11-24 2021-11-24 Method for generating personalized fonts based on cycleGAN


Publications (1)

Publication Number Publication Date
CN114118012A true CN114118012A (en) 2022-03-01

Family

ID=80371909

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111404620.8A Pending CN114118012A (en) 2021-11-24 2021-11-24 Method for generating personalized fonts based on cycleGAN

Country Status (1)

Country Link
CN (1) CN114118012A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114494789A (en) * 2022-04-02 2022-05-13 中国科学技术大学 Training method, system, equipment and storage medium of image style migration model
CN114494789B (en) * 2022-04-02 2022-07-15 中国科学技术大学 Training method, system, equipment and storage medium of image style migration model
CN114821602A (en) * 2022-06-28 2022-07-29 北京汉仪创新科技股份有限公司 Method, system, apparatus and medium for training an antagonistic neural network to generate a word stock
CN116596753A (en) * 2023-07-20 2023-08-15 哈尔滨工程大学三亚南海创新发展基地 Acoustic image dataset expansion method and system based on style migration network
CN116596753B (en) * 2023-07-20 2024-02-02 哈尔滨工程大学三亚南海创新发展基地 Acoustic image dataset expansion method and system based on style migration network
CN117115064A (en) * 2023-10-17 2023-11-24 南昌大学 Image synthesis method based on multi-mode control
CN117115064B (en) * 2023-10-17 2024-02-02 南昌大学 Image synthesis method based on multi-mode control


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination