CN109285111A - Font conversion method, apparatus, device, and computer-readable storage medium - Google Patents
- Publication number: CN109285111A (application CN201811101699.5A)
- Authority
- CN
- China
- Prior art keywords
- font
- training
- target
- picture
- standard font
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06T3/04
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a font conversion method. A deep learning model is trained in advance on a standard-font picture set and target-font picture sets, and during training each target font is assigned a style embedding block uniquely corresponding to that font: every target character carries the same character embedding block but also its own unique style embedding block. The trained deep learning model can therefore output different target fonts by switching among the style embedding blocks when performing font conversion. Feeding a standard-font picture into the trained model converts it into target-font text pictures, so text pictures in multiple target fonts can be obtained from a single model. The method improves both the efficiency of training a model for multiple fonts and the efficiency with which the deep learning model converts to a new font. The invention also provides a font conversion apparatus, device, and computer-readable storage medium with the same beneficial effects.
Description
Technical field
The present invention relates to the field of image recognition and processing, and in particular to a font conversion method, apparatus, device, and computer-readable storage medium.
Background technique
As the number of computer font types grows, handwritten fonts are gradually added to computer font libraries, and the fonts of certain famous people are sought after by many users. Every time a new font is added to the library, a conversion model must be trained on pictures of that font. In the prior art, a new font is learned by training a deep learning model, and the new font is then stored. In practice, however, generating only a single style of font is often insufficient, and training a generative model is time-consuming: every new font style requires remaking the training set and retraining, which consumes still more time.
How to improve the efficiency of training and converting to a new font is therefore a technical problem that those skilled in the art need to solve.
Summary of the invention
The object of the present invention is to provide a font conversion method, apparatus, device, and computer-readable storage medium for improving the efficiency of training and converting to a new font.
To solve the above technical problem, the present invention provides a font conversion method, comprising:
obtaining a deep learning model trained in advance on a standard-font picture set and multiple target-font picture sets, wherein during training a style embedding block uniquely corresponding to each target font is embedded into the target characters of that font, and the character embedding block contained in the target characters of every target font is identical;
obtaining a standard-font text picture;
inputting the standard-font text picture into the deep learning model to obtain a target-font text picture.
Optionally, the deep learning model is specifically a cGAN network model.
Optionally, obtaining the deep learning model trained on the standard-font picture set and the target-font picture sets, with a style embedding block uniquely corresponding to each target font embedded during training, specifically comprises:
processing the standard-font picture set and the target-font picture sets into labeled binary training data;
determining a generator, a discriminator, a first encoder for embedding the style category blocks, and a second encoder for computing a loss value, and constructing an original cGAN network model;
training the original cGAN network model with the binary training data, and adjusting its parameters according to the conversion results and the loss value until the loss value meets a first preset condition, thereby determining the cGAN network model.
Optionally, obtaining the standard-font text picture specifically comprises:
training in advance a convolutional neural network model for recognizing initial text;
receiving an input initial text picture, and inputting the initial text picture into the convolutional neural network model to obtain the standard-font text picture.
Optionally, training in advance the convolutional neural network model for recognizing initial text specifically comprises:
determining a picture set of commonly used Chinese characters and the number corresponding to each character, and generating a data set for training;
determining the input layer, convolutional layers, down-sampling layers, and output layer of the convolutional neural network;
training the convolutional neural network with the data set until the training parameters meet a second preset condition, thereby determining the convolutional neural network model.
Optionally, inputting the initial text picture into the convolutional neural network model to obtain the standard-font text picture specifically comprises:
inputting the initial text picture into the convolutional neural network model to obtain a character number;
obtaining and outputting, according to a preset correspondence, the standard-font text picture corresponding to the character number.
To solve the above technical problem, the present invention also provides a font conversion apparatus, comprising:
a first training unit, configured to obtain a deep learning model trained in advance on a standard-font picture set and target-font picture sets, with a style embedding block uniquely corresponding to each target font embedded into its target characters during training, wherein the character embedding block contained in the target characters of every target font is identical;
a first conversion unit, configured to obtain a standard-font text picture and input it into the deep learning model to obtain a target-font text picture.
Optionally, the apparatus further comprises:
a second training unit, configured to train in advance a convolutional neural network model for recognizing initial text;
a second conversion unit, configured to receive an input initial text picture and input it into the convolutional neural network model to obtain the standard-font text picture.
To solve the above technical problem, the present invention also provides a font conversion device, comprising:
a memory for storing instructions, the instructions comprising the steps of the font conversion method of any one of the above;
a processor for executing the instructions.
To solve the above technical problem, the present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the font conversion method of any one of the above.
In the font conversion method provided by the present invention, a deep learning model is trained in advance on a standard-font picture set and target-font picture sets, and during training the target characters of each target font are assigned a style embedding block uniquely corresponding to that font. Since every target character also carries an identical character embedding block, the trained model maps the same character to the same vector through the shared character embedding block while outputting different target fonts through the different style embedding blocks. A standard-font picture can then be converted into target-font text pictures by the trained model, yielding text pictures in multiple target fonts. A single training run thus produces a deep learning model that can convert into fonts of multiple styles, which improves the efficiency of training a multi-font model; and the model converts a standard-font text picture into target-font text pictures of multiple fonts in one pass, which improves the efficiency of converting to a new font. The present invention also provides a font conversion apparatus, device, and computer-readable storage medium with the same beneficial effects, which are not described in detail here.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a flow chart of a font conversion method provided by an embodiment of the present invention;
Fig. 2 is a flow chart of a specific implementation of step S10 provided by an embodiment of the present invention;
Fig. 3 is a flow chart of another font conversion method provided by an embodiment of the present invention;
Fig. 4 is a flow chart of a specific implementation of step S30 provided by an embodiment of the present invention;
Fig. 5 is a schematic diagram of the convolutional neural network model provided by an embodiment of the present invention;
Fig. 6 is a flow chart of a specific implementation of step S32 provided by an embodiment of the present invention;
Fig. 7 is a schematic diagram of a font conversion apparatus provided by an embodiment of the present invention;
Fig. 8 is a schematic diagram of another font conversion apparatus provided by an embodiment of the present invention;
Fig. 9 is a structural schematic diagram of a font conversion device provided by an embodiment of the present invention.
Specific embodiment
The core of the present invention is to provide a font conversion method, apparatus, device, and computer-readable storage medium for improving the efficiency of training and converting to a new font.
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Fig. 1 is a flow chart of a font conversion method provided by an embodiment of the present invention. Fig. 2 is a flow chart of a specific implementation of step S10 provided by an embodiment of the present invention.
As shown in Fig. 1, the font conversion method comprises:
S10: obtaining a deep learning model trained in advance on a standard-font picture set and multiple target-font picture sets, wherein during training a style embedding block uniquely corresponding to each target font is embedded into the target characters of that font, and the character embedding block contained in the target characters of every target font is identical.
In a specific implementation, the deep learning model may be a cGAN network model. Experiments show that a cGAN network model learns and converts fonts better than other models. A cGAN (conditional GAN) can learn a generative model G mapping an image x and a random noise vector z to y, i.e. {x, z} → y. As shown in Fig. 3, G(x) is the model during training and y is the target model. Through adversarial training, the generator G learns to produce output images "real" enough that the discriminator D cannot tell them apart, while the discriminator D learns to detect the generator's "fake" images as well as possible. In other words, the goal of the generator G is to produce images the discriminator D cannot resolve, and the goal of the discriminator D is to tell whether a received picture is a real picture; their game is the network's learning process, and when the two sides reach equilibrium the generator G can produce high-quality pictures.
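The generator/discriminator game described above can be illustrated numerically. The sketch below is not the patent's implementation; it only shows, for the commonly used sigmoid cross-entropy GAN losses, that a confident discriminator yields a low discriminator loss while a fooled discriminator yields a low generator loss (the function names and logit values are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def d_loss(d_real_logits, d_fake_logits):
    # Discriminator objective: push D(real) toward 1 and D(G(x, z)) toward 0.
    return -np.mean(np.log(sigmoid(d_real_logits)) +
                    np.log(1.0 - sigmoid(d_fake_logits)))

def g_loss(d_fake_logits):
    # Generator objective: make the discriminator score generated pictures as real.
    return -np.mean(np.log(sigmoid(d_fake_logits)))

# A confident discriminator (real scored high, fake scored low) has a small loss;
low = d_loss(np.array([5.0]), np.array([-5.0]))
# a generator that fools the discriminator also has a small loss.
fooled = g_loss(np.array([5.0]))
print(low, fooled)
```

At equilibrium neither side can drive its own loss down further, which is the balance point the description above refers to.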
During training, a style embedding block uniquely corresponding to each target font is embedded into the target characters of that font. Since the character embedding block contained in the target characters of every font is identical, the deep learning model maps the same character to the same vector during training and conversion, and then outputs different target characters according to the respective style embedding blocks.
As shown in Fig. 2, step S10 can specifically include:
S20: processing the standard-font picture set and the target-font picture sets into labeled binary training data.
The standard font may be regular script, and the standard-font picture set may be produced in advance.
S21: determining a generator, a discriminator, a first encoder for embedding the style category blocks, and a second encoder for computing a loss value, and constructing an original cGAN network model.
A cGAN network model is constructed whose generator is a U-Net network structure and whose discriminator is a convolutional neural network model, to which a first encoder for embedding multiple style categories and a second encoder for computing the loss are added.
The first encoder (category encoder) uses a convolutional neural network model to extract and encode style features from random noise. Because ordinary U-Net and GAN models cannot handle the uncertainty of the one-to-many relationship in converting one font into multiple fonts, the embodiment of the present invention introduces into the generator a first encoder that embeds multiple style categories: before the decoder, a non-trained Gaussian noise vector serving as the style embedding block is concatenated with the character embedding block. With this method the encoder still maps the same character to the same vector, but the decoder generates the target character from both the content embedding and the style embedding of the character.
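The concatenation of a shared character embedding block with a per-font, non-trained Gaussian style embedding block can be sketched as follows. The dimensions, character name, and font names are illustrative assumptions, not values from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)
EMBED_DIM = 8

# One character embedding per character, shared by every target font.
char_embed = {"yong": rng.normal(size=EMBED_DIM)}
# One fixed (non-trained) Gaussian style embedding per target font.
style_embed = {"font_a": rng.normal(size=EMBED_DIM),
               "font_b": rng.normal(size=EMBED_DIM)}

def encoder_output(char, font):
    # Concatenate content and style before the decoder, as described above.
    return np.concatenate([char_embed[char], style_embed[font]])

a = encoder_output("yong", "font_a")
b = encoder_output("yong", "font_b")
```

The first half of both vectors is identical (same character, same vector), while the differing second half is what lets the decoder produce a different target font from the same content.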
The second encoder (auxiliary encoder) works on the same principle as the first encoder, but focuses on extracting features from the pictures produced by the generator for the subsequent computation of the generated-picture loss; this loss likewise constrains the learning process of the cGAN.
The loss value here is computed with an L1 regularization function:

L_L1(G) = E_{x,y,z}[ ||y − G(x, z)||_1 ]        (1)

The optimization objective of the whole cGAN network model is then:

G* = arg min_G max_D L_cGAN(G, D) + λ L_L1(G)        (2)

where x denotes the source picture, y the picture produced by the generator, z the Gaussian noise, G the generator function, D the discriminator function, L_cGAN the loss function of the cGAN, E_{x,y,z} the expectation, G* the optimization objective, and λ a scale factor.
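Formulas (1) and (2) can be checked on toy data. The sketch below is illustrative only; in particular the value λ = 100 is an assumption, since the patent only calls λ a scale factor:

```python
import numpy as np

def l1_loss(y, g_xz):
    # Formula (1): expected L1 distance between target picture y
    # and generated picture G(x, z), approximated by a sample mean.
    return np.mean(np.abs(y - g_xz))

def total_g_loss(cgan_loss, y, g_xz, lam=100.0):
    # Formula (2), generator side: adversarial term plus the
    # lambda-weighted L1 reconstruction term.
    return cgan_loss + lam * l1_loss(y, g_xz)

y = np.array([[0.0, 1.0], [1.0, 0.0]])   # toy target picture
g = np.array([[0.1, 0.9], [0.9, 0.1]])   # toy generated picture
print(l1_loss(y, g))  # mean absolute error, ≈ 0.1
```

The L1 term penalizes per-pixel deviation from the target font picture, which is what keeps the generated characters legible while the adversarial term keeps them stylistically plausible.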
S22: training the original cGAN network model with the binary training data, and adjusting its parameters according to the conversion results and the loss value until the loss value meets a first preset condition, thereby determining the cGAN network model.
The cGAN network model is trained repeatedly with the binary training data obtained in step S20, observing the converted text pictures and computing, according to formulas (1) and (2), the loss between the text in the converted pictures and the corresponding text in the target-font pictures, while the network parameters are adjusted, tested, and tuned repeatedly. When the values of the loss function meet expectations, the real pictures are compared with the generated pictures and the final cGAN network model for font conversion is determined.
S11: obtaining a standard-font text picture.
A standard-font text picture sent by another unit is received.
S12: inputting the standard-font text picture into the deep learning model to obtain target-font text pictures.
With the trained deep learning model, text is converted into multiple fonts in a single pass.
Optionally, the target-font text pictures are output.
In a specific implementation, the target-font text pictures are displayed on a human-computer interaction interface or sent to a designated location. The method may also include receiving a user's selection of the fonts to output, and outputting the selected fonts through the first encoder.
In the font conversion method provided by the embodiment of the present invention, a deep learning model is trained in advance on a standard-font picture set and target-font picture sets, and during training the target characters of each target font are assigned a style embedding block uniquely corresponding to that font. Since every target character also carries an identical character embedding block, the trained model maps the same character to the same vector through the shared character embedding block while outputting different target fonts through the different style embedding blocks. A standard-font picture can then be converted into target-font text pictures by the trained model, yielding text pictures in multiple target fonts. A single training run thus produces a deep learning model that can convert into fonts of multiple styles, which improves the efficiency of training a multi-font model; and the model converts a standard-font text picture into target-font text pictures of multiple fonts in one pass, which improves the efficiency of converting to a new font.
Fig. 3 is a flow chart of another font conversion method provided by an embodiment of the present invention. Fig. 4 is a flow chart of a specific implementation of step S30 provided by an embodiment of the present invention. Fig. 5 is a schematic diagram of the convolutional neural network model provided by an embodiment of the present invention. Fig. 6 is a flow chart of a specific implementation of step S32 provided by an embodiment of the present invention.
As shown in Fig. 3, on the basis of the above embodiments, in another embodiment step S11 of the font conversion method specifically comprises:
S30: training in advance a convolutional neural network model for recognizing initial text.
It should be noted that steps S30 and S10 are not ordered with respect to each other.
To make the method more convenient for users, the user may be offered a mode in which handwritten text is input and then converted into another font. This requires first recognizing the user's handwritten text, i.e. processing and recognizing the image of the initial text input by the user. In the traditional field of character image processing, Chinese characters require preprocessing such as character localization and character segmentation, and character segmentation algorithms mainly include projection and connected-domain analysis. Mainstream Chinese character recognition systems generally use two approaches: extracting the structure of the characters, and extracting statistical features. However, since personal handwriting styles differ and writing environments vary widely, accurately extracting the structural features of every handwriting is nearly impossible, so recognition is based on methods that classify statistical features. Several statistical features of Chinese characters are commonly used: elastic mesh features, directional element features, Gabor features, and moment features.
To improve the accuracy of recognizing the initial text, as shown in Fig. 4, step S30 may specifically comprise:
S40: determining a picture set of commonly used Chinese characters and the number corresponding to each character, and generating a data set for training.
In a specific implementation, the handwritten picture set of commonly used Chinese characters CASIA-HWDB1.1 and the number label corresponding to each character may be used to produce the data set for training, which is split into a training set and a test set at a ratio of 10:1.
S41: determining the input layer, convolutional layers, down-sampling layers, and output layer of the convolutional neural network.
As shown in Fig. 5, the convolutional neural network model may include an input layer (input), 5 convolutional layers (conv1-5), 4 down-sampling layers (pool1-4), and a fully connected output layer (output), where the activation function of the fully connected layer is ReLU and the convolution kernels are all 3 × 3.
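The layer layout above determines how the feature map shrinks on its way to the fully connected output layer. The following sketch traces those shapes under stated assumptions (a 64 × 64 input, 'same'-padded stride-1 convolutions, and 2 × 2 pooling with stride 2 — the patent fixes only the 3 × 3 kernel size and the layer counts):

```python
def conv_out(size, kernel=3, pad=1, stride=1):
    # Spatial size after a convolution; pad=1 with a 3x3 kernel
    # preserves the size (padding is an assumption here).
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    # Spatial size after a 2x2, stride-2 down-sampling layer.
    return (size - kernel) // stride + 1

size = 64  # assumed input resolution of the character picture
trace = [("input", size)]
for i in range(1, 6):        # conv1 .. conv5
    size = conv_out(size)
    trace.append((f"conv{i}", size))
    if i <= 4:               # pool1 .. pool4
        size = pool_out(size)
        trace.append((f"pool{i}", size))
print(trace[-1])  # feature map fed to the fully connected output layer
```

Each pooling layer halves the spatial size, so four of them reduce 64 × 64 to 4 × 4 before the fully connected classification layer.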
S42: training the convolutional neural network with the data set until the training parameters meet a second preset condition, thereby determining the convolutional neural network model.
Specifically, the network may be trained on the handwritten-font data set with the Dropout method (i.e. during the training of a deep learning network, neural network units are temporarily dropped from the network with a certain probability). The convolutional neural network model is trained repeatedly; when the number of training iterations reaches a preset threshold or the accuracy on the test set reaches a target value, the training parameters have met the second preset condition, and the convolutional neural network model at that point is determined as the final one.
S31: receiving an input initial text picture.
Specifically, an initial text picture handwritten by the user may be received.
S32: inputting the initial text picture into the convolutional neural network model to obtain a standard-font text picture.
In a specific implementation, as shown in Fig. 6, step S32 may comprise:
S60: inputting the initial text picture into the convolutional neural network model to obtain a character number.
S61: obtaining and outputting, according to a preset correspondence, the standard-font text picture corresponding to the character number.
Since the trained convolutional neural network model outputs a character number, while the input and output of the cGAN network model must be the same kind of data, the character numbers can be put in one-to-one correspondence with UNICODE codes, with an additional functional block that generates the corresponding standard-font text picture from the UNICODE code.
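The number-to-UNICODE correspondence can be sketched as follows. The three-entry table and the characters in it are purely illustrative assumptions; the patent only requires a one-to-one mapping between the CNN's class numbers and UNICODE codes:

```python
# Toy label-to-codepoint table; a real system would cover the full
# label set of the training data (e.g. CASIA-HWDB).
LABEL_TO_CODEPOINT = {0: 0x6C38, 1: 0x5B57, 2: 0x4F53}  # 永, 字, 体

def label_to_char(label):
    # The CNN emits a class number; mapping it through the UNICODE
    # table yields the character whose standard-font picture is then
    # rendered and fed to the cGAN.
    return chr(LABEL_TO_CODEPOINT[label])

print(label_to_char(0))  # 永
```

Rendering the mapped character in the standard font then produces the picture that the cGAN consumes, closing the handwriting-to-target-font pipeline.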
The font conversion method provided by the embodiment of the present invention thus offers a way of converting one handwritten font into another, extending the function of font conversion.
The embodiments of the font conversion method are described in detail above. On this basis, the invention also discloses a font conversion apparatus corresponding to the above method.
Fig. 7 is a schematic diagram of a font conversion apparatus provided by an embodiment of the present invention. As shown in Fig. 7, the font conversion apparatus provided by the embodiment of the present invention comprises:
a first training unit 701, configured to obtain a deep learning model trained in advance on a standard-font picture set and target-font picture sets, with a style embedding block uniquely corresponding to each target font embedded into its target characters during training, wherein the character embedding block contained in the target characters of every target font is identical;
a first conversion unit 702, configured to obtain a standard-font text picture and input it into the deep learning model to obtain a target-font text picture.
Fig. 8 is a schematic diagram of another font conversion apparatus provided by an embodiment of the present invention. As shown in Fig. 8, on the basis of the above embodiment, in another embodiment the font conversion apparatus further comprises:
a second training unit 801, configured to train in advance a convolutional neural network model for recognizing initial text;
a second conversion unit 802, configured to receive an input initial text picture and input it into the convolutional neural network model to obtain a standard-font text picture.
Since the embodiments of the apparatus correspond to the embodiments of the method, for the apparatus embodiments please refer to the description of the method embodiments, which is not repeated here.
Fig. 9 is a structural schematic diagram of a font conversion device provided by an embodiment of the present invention. As shown in Fig. 9, the font conversion device may vary considerably with configuration and performance, and may include one or more processors (central processing units, CPU) 910, a memory 920, and one or more storage media 930 (such as one or more mass storage devices) storing application programs 933 or data 932. The memory 920 and the storage medium 930 may provide transient or persistent storage. The program stored in the storage medium 930 may include one or more modules (not marked in the figure), each of which may include a series of instruction operations on the computing device. Further, the processor 910 may be configured to communicate with the storage medium 930 and to execute, on the font conversion device 900, the series of instruction operations in the storage medium 930.
The font conversion device 900 may also include one or more power supplies 940, one or more wired or wireless network interfaces 950, one or more input/output interfaces 990, and/or one or more operating systems 931, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
The steps of the font conversion method described above with reference to Figs. 1 to 6 are implemented by the font conversion equipment based on the structure shown in Fig. 9.
It is apparent to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described font conversion equipment and computer-readable storage medium may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed method, device, equipment and computer-readable storage medium may be implemented in other ways. For example, the device embodiments described above are merely schematic; the division into modules is only a division by logical function, and other divisions are possible in actual implementation: multiple modules or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices or modules, and may be electrical, mechanical or in other forms. Modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules; they may be located in one place or distributed over multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional modules in the embodiments of this application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module.
If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods of the embodiments of this application. The aforementioned storage media include various media capable of storing program code, such as a USB flash disk, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk or an optical disk.
The font conversion method, device, equipment and computer-readable storage medium provided by the present invention have been described in detail above. The embodiments in the specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Since the devices disclosed in the embodiments correspond to the methods disclosed in the embodiments, their description is relatively brief, and reference may be made to the description of the method part where relevant. It should be pointed out that those of ordinary skill in the art may make several improvements and modifications to the present invention without departing from the principles of the present invention, and these improvements and modifications also fall within the protection scope of the claims of the present invention.
It should also be noted that, in this specification, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or equipment comprising a series of elements includes not only those elements, but also other elements not explicitly listed, or further includes elements inherent to such a process, method, article or equipment. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or equipment that includes the element.
Claims (10)
1. A font conversion method, characterized by comprising:
training a deep learning model in advance according to a standard-font picture set and a plurality of target-font picture sets, wherein during the training process the target characters of each target font are embedded together with a style embedding block uniquely corresponding to that target font, and the character embedding blocks comprised by the target characters of the respective target fonts are identical;
obtaining a standard-font text picture;
inputting the standard-font text picture into the deep learning model to obtain a target-font text picture.
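As a toy illustration of the embedding scheme in claim 1 — one character embedding shared by all target fonts plus a style embedding block unique to each target font — consider the following pure-Python sketch. The dimensions, table sizes and random initialization are all hypothetical.

```python
import random

random.seed(0)
EMBED_DIM = 4
NUM_CHARS = 10   # character vocabulary shared by all target fonts
NUM_FONTS = 3    # each target font gets its own style embedding block

# One character-embedding table shared across fonts; one style vector per font.
char_table = [[random.random() for _ in range(EMBED_DIM)] for _ in range(NUM_CHARS)]
style_table = [[random.random() for _ in range(EMBED_DIM)] for _ in range(NUM_FONTS)]

def embed(char_id, font_id):
    """Concatenate the shared character embedding with the font's style embedding."""
    return char_table[char_id] + style_table[font_id]

v1 = embed(2, 0)
v2 = embed(2, 1)
# Same character rendered in two different target fonts: the character half
# of the vector is identical, while the style half differs.
assert v1[:EMBED_DIM] == v2[:EMBED_DIM]
assert v1[EMBED_DIM:] != v2[EMBED_DIM:]
```

In a trained model these tables would be learned parameters; the sketch only shows how a shared character block and a per-font style block combine into one input vector.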
2. The method according to claim 1, wherein the deep learning model is specifically a cGAN network model.
3. The method according to claim 2, wherein training the deep learning model according to the standard-font picture set and the target-font picture sets, with the target characters of each target font embedded during training together with the style embedding block uniquely corresponding to that target font, specifically comprises:
processing the standard-font picture set and the target-font picture sets to obtain labeled binary training data;
determining a generator, a discriminator, a first encoder for embedding the style-category block, and a second encoder for computing a loss value, and constructing an original cGAN network model;
training the original cGAN network model with the binary training data, and adjusting parameters according to the conversion results and the loss value until the loss value meets a first preset condition, thereby determining the cGAN network model.
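The training procedure of claim 3 can be outlined as a skeleton. The four components and the update rule below are stand-in stubs, not a working cGAN; the point is only the control flow: construct the model, train on the labeled binary data, and adjust parameters until the loss value satisfies the first preset condition.

```python
# Skeleton of the claim-3 training loop; every name and value is hypothetical.

def build_cgan():
    """Stand-ins for the four components named in claim 3."""
    generator = {"w": 0.0}
    discriminator = {"w": 0.0}
    style_encoder = {"w": 0.0}   # first encoder: embeds the style-category block
    loss_encoder = {"w": 0.0}    # second encoder: computes the loss value
    return generator, discriminator, style_encoder, loss_encoder

def train(binary_data, loss_threshold=0.1, lr=0.5):
    gen, disc, enc1, enc2 = build_cgan()
    loss = 1.0
    steps = 0
    while loss > loss_threshold:   # "first preset condition" on the loss value
        loss *= (1.0 - lr)         # toy surrogate for a real optimizer update
        gen["w"] += lr             # parameter adjustment per conversion result/loss
        steps += 1
    return gen, steps, loss

model, steps, final_loss = train(binary_data=[])
assert final_loss <= 0.1
```

A real cGAN would alternate discriminator and generator updates with gradient descent; the skeleton only mirrors the build/train/stop structure of the claim.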
4. The method according to claim 1, wherein obtaining the standard-font text picture specifically comprises:
training in advance a convolutional neural network model for recognizing initial text;
receiving an input initial text picture, and inputting the initial text picture into the convolutional neural network model to obtain the standard-font text picture.
5. The method according to claim 4, wherein training in advance the convolutional neural network model for recognizing initial text specifically comprises:
determining a picture set of commonly used Chinese characters and a number corresponding to each Chinese character, and generating a data set for training;
determining an input layer, convolutional layers, down-sampling layers and an output layer of the convolutional neural network;
training the convolutional neural network with the data set until the training parameters meet a second preset condition, thereby determining the convolutional neural network model.
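A toy forward pass through the layer sequence of claim 5 (input, convolution, down-sampling, output) might look as follows. The 6x6 input, the averaging kernel and the 2x2 max-pooling are illustrative choices; a real recognizer would be trained on the Chinese-character data set.

```python
# Pure-Python toy of the claim-5 layer structure; sizes and kernel are illustrative.

def conv2d_3x3(image, kernel):
    """Valid 3x3 convolution over a 2D list-of-lists image."""
    h, w = len(image), len(image[0])
    out = [[0.0] * (w - 2) for _ in range(h - 2)]
    for i in range(h - 2):
        for j in range(w - 2):
            out[i][j] = sum(image[i + di][j + dj] * kernel[di][dj]
                            for di in range(3) for dj in range(3))
    return out

def max_pool_2x2(fmap):
    """2x2 max-pooling: the down-sampling layer."""
    h, w = len(fmap), len(fmap[0])
    return [[max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, w - 1, 2)] for i in range(0, h - 1, 2)]

image = [[1.0] * 6 for _ in range(6)]        # toy 6x6 "glyph" at the input layer
kernel = [[1.0 / 9] * 3 for _ in range(3)]   # averaging kernel
feat = conv2d_3x3(image, kernel)             # convolutional layer -> 4x4 feature map
pooled = max_pool_2x2(feat)                  # down-sampling layer -> 2x2
assert len(pooled) == 2 and len(pooled[0]) == 2
```

The output layer of the claimed model would map such pooled features to a character number; that classification head is omitted here.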
6. The method according to claim 4, wherein inputting the initial text picture into the convolutional neural network model to obtain the standard-font text picture specifically comprises:
inputting the initial text picture into the convolutional neural network model to obtain a character number;
obtaining and outputting, according to a preset correspondence, the standard-font text picture corresponding to the character number.
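The preset correspondence in claim 6 amounts to a lookup from the character number produced by the network to a standard-font text picture. A minimal sketch, with hypothetical file names:

```python
# Hypothetical preset correspondence: character number -> standard-font picture file.
CHAR_TO_STANDARD_PICTURE = {0: "std_0.png", 1: "std_1.png", 2: "std_2.png"}

def lookup_standard_picture(char_number):
    """Map the CNN's output number to the corresponding standard-font picture."""
    return CHAR_TO_STANDARD_PICTURE[char_number]

print(lookup_standard_picture(1))  # prints "std_1.png"
```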
7. A font conversion device, characterized by comprising:
a first training unit, configured to train a deep learning model in advance according to a standard-font picture set and target-font picture sets, wherein during the training process the target characters of each target font are embedded together with a style embedding block uniquely corresponding to that target font, and the character embedding blocks comprised by the target characters of the respective target fonts are identical;
a first converting unit, configured to obtain a standard-font text picture and input the standard-font text picture into the deep learning model to obtain a target-font text picture.
8. The device according to claim 7, further comprising:
a second training unit, configured to train in advance a convolutional neural network model for recognizing initial text;
a second converting unit, configured to receive an input initial text picture and input the initial text picture into the convolutional neural network model to obtain the standard-font text picture.
9. A font conversion equipment, characterized by comprising:
a memory for storing instructions, the instructions comprising the steps of the font conversion method according to any one of claims 1 to 6;
a processor for executing the instructions.
10. A computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the steps of the font conversion method according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811101699.5A CN109285111B (en) | 2018-09-20 | 2018-09-20 | Font conversion method, device, equipment and computer readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109285111A true CN109285111A (en) | 2019-01-29 |
CN109285111B CN109285111B (en) | 2023-05-09 |
Family
ID=65181771
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811101699.5A Active CN109285111B (en) | 2018-09-20 | 2018-09-20 | Font conversion method, device, equipment and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109285111B (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110211203A (en) * | 2019-06-10 | 2019-09-06 | 大连民族大学 | Method for generating Chinese character fonts based on conditional generative adversarial networks |
CN110211032A (en) * | 2019-06-06 | 2019-09-06 | 北大方正集团有限公司 | Chinese character generation method, device and readable storage medium |
CN110852042A (en) * | 2019-12-13 | 2020-02-28 | 北京华宇信息技术有限公司 | Character type conversion method and device |
CN110866543A (en) * | 2019-10-18 | 2020-03-06 | 支付宝(杭州)信息技术有限公司 | Picture detection and picture classification model training method and device |
CN111242114A (en) * | 2020-01-08 | 2020-06-05 | 腾讯科技(深圳)有限公司 | Character recognition method and device |
CN111402367A (en) * | 2020-03-27 | 2020-07-10 | 维沃移动通信有限公司 | Image processing method and electronic equipment |
CN112417959A (en) * | 2020-10-19 | 2021-02-26 | 上海臣星软件技术有限公司 | Picture generation method and device, electronic equipment and computer storage medium |
CN112861471A (en) * | 2021-02-10 | 2021-05-28 | 上海臣星软件技术有限公司 | Object display method, device, equipment and storage medium |
CN112966470A (en) * | 2021-02-23 | 2021-06-15 | 北京三快在线科技有限公司 | Character generation method and device, storage medium and electronic equipment |
CN113807430A (en) * | 2021-09-15 | 2021-12-17 | 网易(杭州)网络有限公司 | Model training method and device, computer equipment and storage medium |
WO2023125379A1 (en) * | 2021-12-29 | 2023-07-06 | 北京字跳网络技术有限公司 | Character generation method and apparatus, electronic device, and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2000271317A1 (en) * | 2000-02-09 | 2001-11-01 | Amtrol Licensing Inc. | Full jacket gas cylinder |
CN107506736A (en) * | 2017-08-29 | 2017-12-22 | 北京大生在线科技有限公司 | Online education video fineness picture intercept method based on deep learning |
WO2018081089A1 (en) * | 2016-10-26 | 2018-05-03 | Deepmind Technologies Limited | Processing text sequences using neural networks |
CN108170649A (en) * | 2018-01-26 | 2018-06-15 | 广东工业大学 | Hanzi font library generation method and device based on DCGAN deep networks |
US20180261071A1 (en) * | 2017-03-10 | 2018-09-13 | Turing Video, Inc. | Surveillance method and system based on human behavior recognition |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
ATE366389T1 (en) * | 2000-02-09 | 2007-07-15 | Amtrol Inc | FULL LINING FOR GAS CYLINDERS |
2018-09-20: CN application CN201811101699.5A filed; published as CN109285111B (en), status Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2000271317A1 (en) * | 2000-02-09 | 2001-11-01 | Amtrol Licensing Inc. | Full jacket gas cylinder |
WO2018081089A1 (en) * | 2016-10-26 | 2018-05-03 | Deepmind Technologies Limited | Processing text sequences using neural networks |
US20180261071A1 (en) * | 2017-03-10 | 2018-09-13 | Turing Video, Inc. | Surveillance method and system based on human behavior recognition |
CN107506736A (en) * | 2017-08-29 | 2017-12-22 | 北京大生在线科技有限公司 | Online education video fineness picture intercept method based on deep learning |
CN108170649A (en) * | 2018-01-26 | 2018-06-15 | 广东工业大学 | Hanzi font library generation method and device based on DCGAN deep networks |
Non-Patent Citations (1)
Title |
---|
GU Xianfeng et al.: "Image classification algorithm based on deep unsupervised learning", Journal of Pingdingshan University * |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110211032B (en) * | 2019-06-06 | 2021-02-09 | 北大方正集团有限公司 | Chinese character generating method and device and readable storage medium |
CN110211032A (en) * | 2019-06-06 | 2019-09-06 | 北大方正集团有限公司 | Chinese character generation method, device and readable storage medium |
CN110211203A (en) * | 2019-06-10 | 2019-09-06 | 大连民族大学 | Method for generating Chinese character fonts based on conditional generative adversarial networks |
CN110866543A (en) * | 2019-10-18 | 2020-03-06 | 支付宝(杭州)信息技术有限公司 | Picture detection and picture classification model training method and device |
CN110866543B (en) * | 2019-10-18 | 2022-07-15 | 支付宝(杭州)信息技术有限公司 | Picture detection and picture classification model training method and device |
CN110852042A (en) * | 2019-12-13 | 2020-02-28 | 北京华宇信息技术有限公司 | Character type conversion method and device |
CN111242114B (en) * | 2020-01-08 | 2023-04-07 | 腾讯科技(深圳)有限公司 | Character recognition method and device |
CN111242114A (en) * | 2020-01-08 | 2020-06-05 | 腾讯科技(深圳)有限公司 | Character recognition method and device |
CN111402367A (en) * | 2020-03-27 | 2020-07-10 | 维沃移动通信有限公司 | Image processing method and electronic equipment |
CN111402367B (en) * | 2020-03-27 | 2023-09-26 | 维沃移动通信有限公司 | Image processing method and electronic equipment |
CN112417959A (en) * | 2020-10-19 | 2021-02-26 | 上海臣星软件技术有限公司 | Picture generation method and device, electronic equipment and computer storage medium |
CN112861471A (en) * | 2021-02-10 | 2021-05-28 | 上海臣星软件技术有限公司 | Object display method, device, equipment and storage medium |
CN112966470A (en) * | 2021-02-23 | 2021-06-15 | 北京三快在线科技有限公司 | Character generation method and device, storage medium and electronic equipment |
CN113807430A (en) * | 2021-09-15 | 2021-12-17 | 网易(杭州)网络有限公司 | Model training method and device, computer equipment and storage medium |
CN113807430B (en) * | 2021-09-15 | 2023-08-08 | 网易(杭州)网络有限公司 | Model training method, device, computer equipment and storage medium |
WO2023125379A1 (en) * | 2021-12-29 | 2023-07-06 | 北京字跳网络技术有限公司 | Character generation method and apparatus, electronic device, and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN109285111B (en) | 2023-05-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109285111A (en) | Font conversion method, apparatus, equipment and computer readable storage medium | |
CN107122375B (en) | Image subject identification method based on image features | |
CN112464993B (en) | Multi-mode model training method, device, equipment and storage medium | |
CN110232439B (en) | Intention identification method based on deep learning network | |
CN107480688B (en) | Fine-grained image identification method based on zero sample learning | |
CN107391495B (en) | Sentence alignment method of bilingual parallel corpus | |
CN109993164A (en) | Natural scene character recognition method based on RCRNN neural network | |
CN104463101A (en) | Answer recognition method and system for textual test question | |
CN114022882B (en) | Text recognition model training method, text recognition device, text recognition equipment and medium | |
CN113298151A (en) | Remote sensing image semantic description method based on multi-level feature fusion | |
CN112819686A (en) | Image style processing method and device based on artificial intelligence and electronic equipment | |
CN109145946B (en) | Intelligent image recognition and description method | |
CN115700515A (en) | Text multi-label classification method and device | |
CN114528835A (en) | Semi-supervised specialized term extraction method, medium and equipment based on interval discrimination | |
CN114299512A (en) | Zero-sample small seal character recognition method based on Chinese character etymon structure | |
CN113656547A (en) | Text matching method, device, equipment and storage medium | |
CN113806645A (en) | Label classification system and training system of label classification model | |
CN113688955B (en) | Text recognition method, device, equipment and medium | |
CN113609819B (en) | Punctuation mark determination model and determination method | |
CN114153942A (en) | Event time sequence relation extraction method based on dynamic attention mechanism | |
CN113761188A (en) | Text label determination method and device, computer equipment and storage medium | |
CN109034279A (en) | Handwriting model training method, hand-written character recognizing method, device, equipment and medium | |
WO1996008787A1 (en) | System and method for automatic subcharacter unit and lexicon generation for handwriting recognition | |
CN111523325A (en) | Chinese named entity recognition method based on strokes | |
CN113553847A (en) | Method, device, system and storage medium for parsing address text |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||