CN115937869A

CN115937869A - Chinese character skeleton generation method and device, electronic equipment and storage medium

Info

Publication number: CN115937869A
Application number: CN202211699803.1A
Authority: CN
Inventors: 于凤丽; 刘辰宇; 吴嘉嘉; 胡金水; 殷兵
Original assignee: iFlytek Co Ltd
Current assignee: iFlytek Co Ltd
Priority date: 2022-12-28
Filing date: 2022-12-28
Publication date: 2023-04-07

Abstract

The invention provides a Chinese character skeleton generation method and device, electronic equipment and a storage medium. The method comprises the following steps: determining a handwritten Chinese character image of a target user and a standard track sequence of a target Chinese character; performing style extraction based on the handwritten Chinese character image to obtain the handwriting style characteristics of the target user; performing content extraction based on the standard track sequence to obtain the content structure characteristics of the target Chinese character; and generating a Chinese character skeleton based on the handwriting style characteristics and the content structure characteristics to obtain a handwriting track sequence of the target user for the target Chinese character. The method and the device overcome the defects of the traditional scheme, in which the style similarity and structural stability of Chinese character skeleton synthesis are difficult to guarantee and the use scene is limited. Stable handwriting style extraction is realized from a handwritten Chinese character image containing an arbitrary, small number of handwritten Chinese characters, which guarantees style consistency between the generated Chinese characters and the handwritten Chinese characters, improves the content correctness and structural stability of the generated Chinese characters, and keeps the method widely applicable.

Description

Chinese character skeleton generation method and device, electronic equipment and storage medium

Technical Field

The invention relates to the technical field of image processing, in particular to a method and a device for generating a Chinese character skeleton, electronic equipment and a storage medium.

Background

Reading and writing occupy an extremely important place in people's lives, being respectively the input of information from the world and the output of information to it. How to give devices reading (handwritten text recognition) and writing (handwritten text generation) capabilities is therefore a popular research topic.

At present, handwritten text recognition technology is mature. In the synthesis of handwritten data, however, there is little research on skeleton generation for handwritten characters because Chinese characters are so numerous. In the few existing studies, variations in a writer's writing style, content style, font usage, character spacing and the like make it difficult for a device to define or predict the appearance of a character, so the naturalness and fidelity of the synthesized handwritten data remain a concern.

Disclosure of Invention

The invention provides a method and a device for generating a Chinese character skeleton, electronic equipment and a storage medium, which are used for solving the defects that the style similarity and the structural stability of Chinese character skeleton synthesis in the prior art are difficult to ensure and the use scene is limited.

The invention provides a Chinese character skeleton generating method, which comprises the following steps:

determining a handwritten Chinese character image of a target user and a standard track sequence of a target Chinese character;

performing style extraction based on the handwritten Chinese character image to obtain the handwriting style characteristics of the target user;

extracting content based on the standard track sequence to obtain the content structure characteristics of the target Chinese character;

and generating a Chinese character skeleton based on the handwriting style characteristics and the content structure characteristics to obtain a handwriting track sequence of the target user for the target Chinese character.

According to the method for generating the Chinese character skeleton provided by the invention, the style extraction is carried out based on the handwritten Chinese character image to obtain the handwriting style characteristics of the target user, and the method comprises the following steps:

inputting the handwritten Chinese character image into an image style extraction model to obtain the handwriting style characteristics of the target user output by the image style extraction model;

the image style extraction model is obtained by joint training with a sequence style extraction model, based on the sample handwriting style characteristics of a sample handwritten Chinese character image and the sample handwriting style characteristics of the sample handwritten Chinese character track sequence corresponding to that image;

the sequence style extraction model is used for carrying out style extraction based on the sample handwritten Chinese character track sequence to obtain sample handwritten style characteristics of the sample handwritten Chinese character track sequence.

According to the Chinese character skeleton generation method provided by the invention, the image style extraction model is determined based on the following steps:

respectively determining sample handwriting style characteristics of the sample handwritten Chinese character image and sample handwriting style characteristics of the sample handwritten Chinese character track sequence based on an initial image style extraction model and an initial sequence style extraction model;

determining the similarity between the sample handwriting style characteristics of a positive sample and the similarity between the sample handwriting style characteristics of a negative sample, wherein the positive sample is a sample handwritten Chinese character image of one person and the corresponding sample handwritten Chinese character track sequence of the same person, and the negative sample is a sample handwritten Chinese character image and/or a sample handwritten Chinese character track sequence from different persons;

and determining a contrast loss based on the similarity between the sample handwriting style characteristics of the positive sample and the similarity between the sample handwriting style characteristics of the negative sample, and performing parameter adjustment on the initial image style extraction model and the initial sequence style extraction model based on the contrast loss to obtain an image style extraction model and a sequence style extraction model.
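The contrast loss step above can be illustrated with a minimal sketch. The source does not give a formula, so the InfoNCE-style form, the use of cosine similarity, the `temperature` parameter and the function names `cosine` and `contrast_loss` are all assumptions made for illustration. Row i of each feature list is assumed to come from the same writer (a positive pair); all other combinations are treated as negatives.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrast_loss(image_feats, seq_feats, temperature=0.1):
    """InfoNCE-style contrast loss (an assumed formulation, not the patent's).

    image_feats[i] and seq_feats[i] form a positive pair (same writer);
    image_feats[i] and seq_feats[j] with j != i are negatives.
    """
    n = len(image_feats)
    total = 0.0
    for i in range(n):
        logits = [cosine(image_feats[i], seq_feats[j]) / temperature
                  for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the positive pair.
        total += log_denominator - logits[i]
    return total / n
```

Under these assumptions the loss is small when each image feature is closest to the track-sequence feature of the same writer, and grows when the pairing is mismatched, which is the behavior the parameter adjustment is meant to exploit.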

According to the method for generating a Chinese character skeleton provided by the invention, the parameter adjustment is carried out on the initial image style extraction model and the initial sequence style extraction model based on the contrast loss to obtain an image style extraction model and a sequence style extraction model, and the method comprises the following steps:

determining distribution loss based on the sample handwriting style characteristics of the sample handwritten Chinese character image and the distance between the sample handwriting style characteristics and an initial class center matrix of an initial handwriting style class center; the initial class center matrix is obtained by clustering all handwriting styles in the initial handwriting style class center;

and determining a joint training loss based on the distribution loss and the contrast loss, and carrying out parameter adjustment on the initial sequence style extraction model, the initial image style extraction model and the initial handwriting style class center based on the joint training loss to obtain a sequence style extraction model, an image style extraction model and a handwriting style class center.
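As a rough illustration of the two steps above, the sketch below computes a distribution loss from the distance between style features and class centers and adds it to a contrast loss. The exact form is not specified in the source: the use of squared Euclidean distance to the nearest class center, the weight `dist_weight` and the function names are assumptions.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def distribution_loss(style_feats, class_centers):
    """Mean squared distance from each style feature to its nearest
    class center (an assumed form of the distribution loss)."""
    total = 0.0
    for feat in style_feats:
        total += min(squared_distance(feat, center) for center in class_centers)
    return total / len(style_feats)

def joint_training_loss(contrast, distribution, dist_weight=1.0):
    """Weighted sum of the two losses; the weighting is hypothetical."""
    return contrast + dist_weight * distribution
```

In a real training loop the class centers would be trainable parameters updated together with the two style extraction models; here they are plain lists so that the arithmetic is easy to follow.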

According to the method for generating the Chinese character skeleton provided by the invention, the step of determining the joint training loss based on the distribution loss and the contrast loss comprises the following steps:

under the condition that any sample handwritten Chinese character image lacks a corresponding sample handwritten Chinese character track sequence, determining a predicted handwriting track sequence corresponding to that sample handwritten Chinese character image;

determining the predicted handwriting style characteristics of the predicted handwriting track sequence based on the initial sequence style extraction model, wherein the predicted handwriting track sequence is obtained by generating a Chinese character skeleton based on the sample handwriting style characteristics of any sample handwritten Chinese character image and the sample content structure characteristics of a sample standard track sequence of a reference Chinese character;

and determining consistency loss based on the similarity between the sample handwriting style characteristics of any sample handwritten Chinese character image and the predicted handwriting style characteristics of the predicted handwriting track sequence, and determining joint training loss based on the consistency loss, the distribution loss and the contrast loss.
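The consistency loss described above can be sketched as one minus the cosine similarity between the style feature of the image and the style feature of the predicted track sequence. This particular form, the weights `w_dist` and `w_cons`, and the function names are assumptions; the source only states that the loss is based on the similarity between the two features.

```python
import math

def consistency_loss(image_style, predicted_style):
    """1 - cosine similarity between the image style feature and the style
    feature extracted from the predicted track sequence (assumed form)."""
    dot = sum(a * b for a, b in zip(image_style, predicted_style))
    norm_a = math.sqrt(sum(a * a for a in image_style))
    norm_b = math.sqrt(sum(b * b for b in predicted_style))
    return 1.0 - dot / (norm_a * norm_b)

def joint_loss_with_consistency(contrast, distribution, consistency,
                                w_dist=1.0, w_cons=1.0):
    """Weighted sum of the three losses; the weights are hypothetical."""
    return contrast + w_dist * distribution + w_cons * consistency
```

The loss is zero when the generated track sequence carries exactly the same style direction as the source image and increases as the two features diverge.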

According to the method for generating the Chinese character skeleton provided by the invention, the handwritten Chinese character image is input to an image style extraction model, and the handwritten style characteristics of the target user output by the image style extraction model are obtained, and the method comprises the following steps:

inputting the handwritten Chinese character image into an image style extraction model to obtain the initial handwriting style characteristics of the target user output by the image style extraction model;

and determining a class center matrix of a handwriting style class center, and determining the handwriting style characteristics of the target user based on the correlation between the initial handwriting style characteristics and the class center matrix.
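One plausible reading of "correlation between the initial handwriting style characteristics and the class center matrix" is an attention-style weighting, sketched below. The dot-product scores, the softmax normalization and the name `refine_style` are assumptions rather than details taken from the source.

```python
import math

def refine_style(initial_feat, class_centers):
    """Re-express an initial style feature as a softmax-weighted mixture of
    class centers (an assumed interpretation of the correlation step)."""
    # Correlation scores: dot product with every class center.
    scores = [sum(a * b for a, b in zip(initial_feat, center))
              for center in class_centers]
    # Numerically stable softmax over the scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    # Weighted sum of class centers, dimension by dimension.
    dim = len(class_centers[0])
    refined = [sum(w * center[d] for w, center in zip(weights, class_centers))
               for d in range(dim)]
    return refined, weights
```

Projecting onto learned class centers in this way would tend to pull a noisy feature, extracted from only a few characters, toward a style region seen during training.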

According to the method for generating a Chinese character skeleton provided by the invention, the Chinese character skeleton is generated based on the handwriting style characteristics and the content structure characteristics to obtain the handwriting track sequence of the target user for the target Chinese character, and the method comprises the following steps:

performing track prediction based on the handwriting style characteristics and the content structure characteristics to obtain the relative position of each handwriting track point of the target Chinese character;

performing state prediction based on the handwriting style characteristics and the content structure characteristics to obtain track states of all handwriting track points of the target Chinese character;

and generating a Chinese character skeleton based on the relative position and the track state of each handwritten track point to obtain a handwritten track sequence of the target user for the target Chinese character.
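The last step above, assembling a skeleton from predicted relative positions and track states, can be sketched as follows. The source does not enumerate the track states; the three labels used here ("down", "up", "end") follow a common convention for online handwriting data and are an assumption, as is the function name `build_skeleton`.

```python
def build_skeleton(relative_positions, states, start=(0.0, 0.0)):
    """Accumulate (dx, dy) offsets into absolute points and split them
    into strokes.

    states[i] is one of the assumed labels:
      "down": the pen stays on the page after this point
      "up":   the current stroke ends at this point
      "end":  the character ends at this point
    """
    strokes, current = [], []
    x, y = start
    for (dx, dy), state in zip(relative_positions, states):
        x, y = x + dx, y + dy
        current.append((x, y))
        if state in ("up", "end"):
            strokes.append(current)
            current = []
        if state == "end":
            break
    if current:
        # Keep a trailing stroke that was never explicitly closed.
        strokes.append(current)
    return strokes
```

For example, four offsets with states `["down", "up", "down", "end"]` yield two strokes of two points each, which together form the handwriting track sequence of one character.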

The invention also provides a Chinese character skeleton generating device, comprising:

the data determining unit is used for determining a handwritten Chinese character image of a target user and a standard track sequence of a target Chinese character;

the style extraction unit is used for extracting the style based on the handwritten Chinese character image to obtain the handwriting style characteristics of the target user;

the content extraction unit is used for extracting content based on the standard track sequence to obtain the content structure characteristics of the target Chinese character;

and the skeleton generation unit is used for generating Chinese character skeletons based on the handwriting style characteristics and the content structure characteristics to obtain a handwriting track sequence of the target user for the target Chinese characters.

The invention also provides electronic equipment which comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, wherein the processor executes the program to realize the Chinese character skeleton generating method.

The present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method for generating a Chinese character skeleton as described in any of the above.

With the Chinese character skeleton generation method and device, electronic equipment and storage medium provided by the invention, style extraction is performed on the handwritten Chinese character image and content extraction is performed on the standard track sequence, so that the handwriting style characteristics of the target user and the content structure characteristics of the target Chinese character are obtained respectively; a Chinese character skeleton is then generated from the two kinds of characteristics to obtain the handwriting track sequence of the target user for the target Chinese character. This overcomes the defects of the traditional scheme, in which the style similarity and structural stability of Chinese character skeleton synthesis are difficult to guarantee and the use scene is limited. Stable handwriting style extraction is realized from a handwritten Chinese character image containing an arbitrary, small number of handwritten Chinese characters, which guarantees style consistency between the generated Chinese characters and the handwritten Chinese characters, improves the content correctness and structural stability of the generated Chinese characters, and keeps the method widely applicable.

Drawings

In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.

FIG. 1 is an exemplary diagram of handwritten text and stroke sequences provided by the present invention;

FIG. 2 is a schematic flow chart of a method for generating a Chinese character skeleton according to the present invention;

FIG. 3 is a schematic flow chart of a comparative training process provided by the present invention;

FIG. 4 is an exemplary diagram of a sample handwritten Chinese character binary image provided by the present invention;

FIG. 5 is a flow chart of a joint training process provided by the present invention;

FIG. 6 is a flow chart diagram of a consistency loss determination process provided by the present invention;

FIG. 7 is an overall framework diagram of the image style extraction model training process provided by the present invention;

FIG. 8 is a flow chart illustrating a process for determining handwriting style characteristics provided by the present invention;

FIG. 9 is a flowchart illustrating a process for determining a sequence of handwritten traces provided by the present invention;

FIG. 10 is a general frame diagram of the Chinese character skeleton generation method provided by the present invention;

FIG. 11 is a schematic structural diagram of a Chinese character skeleton generation apparatus provided in the present invention;

FIG. 12 is a schematic structural diagram of an electronic device provided in the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Currently, handwritten text recognition technology is mature and recognition accuracy has improved greatly. Synthesizing handwritten data, however, remains difficult because of variations in a writer's handwriting style, content style, font usage, character spacing and the like. FIG. 1 is an exemplary diagram of handwritten text and stroke sequences provided by the invention. As shown in FIG. 1, disconnected and connected strokes are mixed within one writer's content, font usage changes, and character spacing and slant differ, so it is difficult to define or predict the appearance of a specific character, and synthesizing realistic handwritten data is very challenging.

At present, customizing a set of personalized fonts for a user takes a font design team about one year. Font design is time-consuming and labor-intensive, so personalized font customization is out of reach for most people. For languages such as English and Korean, which have few basic constituent units and simple structures, font synthesis has been studied extensively and the technology is relatively mature. For Chinese, because the characters are numerous (more than twenty thousand) and structurally complex, there is little research on skeleton generation for personalized handwriting fonts, and existing approaches are of limited practical use. How to generate the full set of Chinese characters in a uniform style is therefore a technical problem that urgently needs to be solved; solving it would also save cost and improve efficiency for ordinary users and professional font library producers.

Handwritten data typically has two forms of representation: the first treats it as an array of pixels, such as a static image of writing on paper; the second is the stroke sequence shown in FIG. 1, i.e. the handwriting track sequence. Handwritten Chinese character synthesis aims to imitate handwriting of a specific style: a corresponding handwritten Chinese character skeleton is synthesized according to the writing style of the handwritten Chinese characters, and the generated result is the corresponding handwritten stroke sequence.

Research on handwritten Chinese character synthesis falls into two main categories: synthesis based on offline pictures, and synthesis based on online track points (skeleton sequences). The former mostly restricts the writing content of the offline picture and extracts a track sequence from the picture; effective extraction of that track sequence is often difficult to guarantee, which affects the style similarity of the generated Chinese characters. The latter requires the user to input an online handwriting skeleton sequence for style extraction, so it is difficult to apply to photographing scenes and its application range is narrow; that is, it cannot extract style from a user's handwriting data captured in an arbitrary photographing scene in order to generate handwritten Chinese characters.

In short, most of the current handwritten Chinese character synthesis schemes have obvious limitations in use scenes, namely, the writing content and the writing mode are limited; moreover, the stability and style similarity of the generated handwritten Chinese characters are difficult to be ensured.

In addition, there is a scheme that synthesizes handwritten Chinese characters with a generative adversarial network (GAN) based on character component information. Its output is directly in picture form and carries no writing stroke sequence information, i.e. no handwriting track sequence, so it is difficult to apply to online handwritten data scenarios such as smartphones, electronic whiteboards, tablet computers and text written in digital ink form, and the structural stability of the generated Chinese characters is difficult to guarantee.

In the RNN (Recurrent Neural Network) based scheme, handwritten data can be simulated by an RNN-based generator model, but the synthesis result is unstable and Chinese characters in a specific style cannot be synthesized. In the FontRNN-based scheme, FontRNN generates a Chinese character skeleton through an RNN using a transfer learning strategy, but the scheme focuses on font generation: each trained model can synthesize only one font (the same as its training set) and can generate a font library only from online track sequences, so the use scene is limited.

In summary, the difficulty of current handwritten Chinese character synthesis lies in keeping the generated Chinese characters consistent with the reference characters in handwriting style and in the structural stability of the generated Chinese characters. How to capture a user's handwriting style from an arbitrary, small number of Chinese characters written by the user, and thereby generate the full set of Chinese characters in a uniform and complete style, is therefore a technical problem that urgently needs to be solved; solving it would also save cost and improve efficiency for ordinary users and professional font library producers.

Therefore, the invention provides a Chinese character skeleton generation method, which extracts handwriting style characteristics from a handwritten Chinese character image containing an arbitrary, small number of handwritten Chinese characters and combines them with the content structure characteristics of the target Chinese character to generate the handwriting track sequence of the target Chinese character, thereby ensuring style consistency between the generated Chinese character and the handwritten Chinese characters and improving the structural stability and content correctness of the generated Chinese character. FIG. 2 is a schematic flow chart of the Chinese character skeleton generation method provided by the invention. As shown in FIG. 2, the method includes:

step 210, determining a handwritten Chinese character image of a target user and a standard track sequence of a target Chinese character;

Specifically, before a Chinese character skeleton is generated, reference objects need to be determined. The reference objects cover two aspects, namely handwriting style and content structure: the former provides a stylistic reference for the generated Chinese characters, and the latter provides guidance on their content and structure. Specifically, the reference objects may be handwritten Chinese characters of the target user presented in image form and the written strokes of the target Chinese character presented in sequence form, i.e. the handwritten Chinese character image of the target user and the standard track sequence of the target Chinese character.

The target user is the person for whom personalized font customization or handwritten Chinese character skeleton generation is to be performed. The target Chinese character, i.e. the reference character, can be any Chinese character and provides guidance on content and structure for the finally generated handwriting track sequence. The standard track sequence (standard stroke sequence) of the target Chinese character can be obtained by network query, crawling or downloading.

The handwritten Chinese character image can be obtained by imaging an online skeleton sequence of any Chinese character input by a target user, or can be obtained by shooting any Chinese character written on a writing page by the target user through image acquisition equipment; here, the image capturing device may be a camera, a scanner, or the like, and the writing page may be a fixed paper, a blackboard/whiteboard, an electronic writing board, or the like. It should be noted that the specific content and number of the handwritten Chinese characters contained in the handwritten Chinese character image are arbitrary and random, in other words, the specific content and number of the Chinese characters written by the target user are not limited in the embodiment of the present invention.

Step 220, performing style extraction based on the handwritten Chinese character image to obtain the handwriting style characteristics of the target user;

Specifically, after the handwritten Chinese character image of the target user is obtained, style extraction can be performed on it to obtain the handwriting style characteristics of the target user. The specific process is as follows:

First, style extraction is performed on the handwritten Chinese character image to extract the information it contains about the target user's handwriting style, thereby obtaining the handwriting style characteristics of the target user.

The style extraction process based on the handwritten Chinese character image can be realized through an image style extraction model, namely, the handwritten Chinese character image can be input into the image style extraction model firstly, then the image style extraction model can carry out style extraction according to the input handwritten Chinese character image so as to extract the handwriting style characteristics of the handwritten Chinese characters contained in the handwritten Chinese character image, and finally the handwriting style characteristics of a target user output by the image style extraction model can be obtained.

It is worth noting that before the handwritten Chinese character image is input into the image style extraction model, the handwritten Chinese character image needs to be subjected to image segmentation, namely, the handwritten Chinese character image can be cut through the single character detection model to be cut into a series of images only containing a single Chinese character, so that a plurality of independent Chinese character single character images are obtained, and then style extraction can be performed on the basis of the series of independent Chinese character single character images to obtain the handwriting style characteristics of the target user.

Step 230, extracting content based on the standard track sequence to obtain the content structure characteristics of the target Chinese character;

Specifically, after the standard track sequence of the target Chinese character is obtained, content extraction can be performed on it to obtain the content structure characteristics of the target Chinese character. The specific process is as follows:

First, content extraction is performed on the standard track sequence to extract the information it contains about the glyph content and glyph structure of the target Chinese character, thereby obtaining the content structure characteristics of the target Chinese character.

The content extraction process based on the standard track sequence can be realized through a content structure extraction model, namely the standard track sequence of the target Chinese character is firstly input into the content structure extraction model, then the content structure extraction model carries out content structure extraction according to the input standard track sequence so as to extract the characteristics of the target Chinese character contained in the standard track sequence on the content and the structure, and finally the content structure characteristics of the target Chinese character output by the content structure extraction model can be obtained.

Before the standard track sequence is input into the content structure extraction model, the content structure extraction model can be obtained by training with sample standard track sequences of reference Chinese characters and the sample content structure characteristics corresponding to those sequences. Here, the initial content structure extraction model used in training can be constructed on the basis of a Bi-LSTM (Bi-directional Long Short-Term Memory network).

Step 240, generating a Chinese character skeleton based on the handwriting style characteristics and the content structure characteristics to obtain a handwriting track sequence of the target user for the target Chinese character.

Specifically, after the handwriting style characteristics of the target user and the content structure characteristics of the target Chinese character are obtained through the above processes, a Chinese character skeleton can be generated from the two kinds of characteristics to obtain the handwriting track sequence of the target user for the target Chinese character. The specific process may include:

A Chinese character skeleton is generated from the handwriting style characteristics of the target user and the content structure characteristics of the target Chinese character to obtain the handwriting track sequence of the target user for the target Chinese character, i.e. a handwriting track sequence (the generated Chinese character) that corresponds to the target Chinese character and has the handwriting style of the target user. In other words, taking the content structure characteristics of the target Chinese character as the basis and the handwriting style characteristics of the target user as the reference, the handwriting style of the target user is simulated to produce a handwriting track sequence that carries the content information and structure information of the target Chinese character.

Specifically, the handwriting style characteristics of the target user and the content structure characteristics of the target Chinese character can be input together into a skeleton generation model. The skeleton generation model then generates the Chinese character skeleton from the two inputs, i.e. it simulates the handwriting style of the target user to produce a handwritten Chinese character with the content and structure characteristics of the target Chinese character, and finally outputs the handwriting track sequence corresponding to the target Chinese character.
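The data flow of steps 220 to 240 can be summarized in a short sketch. The three models are passed in as plain callables, and the toy stand-ins below (a scalar "style" derived from ink density, an identity content extractor, and a shear that imitates slant) are purely hypothetical placeholders for the trained image style extraction model, content structure extraction model and skeleton generation model.

```python
def generate_skeleton(handwritten_image, standard_track,
                      style_model, content_model, skeleton_model):
    """Wire the three models together in the order of steps 220-240."""
    style = style_model(handwritten_image)      # step 220: style extraction
    content = content_model(standard_track)     # step 230: content extraction
    return skeleton_model(style, content)       # step 240: skeleton generation

# Toy stand-ins, for illustration only (not the patent's models).
def toy_style_model(image):
    # "Style" here is just the fraction of ink pixels in a binary image.
    ink = sum(sum(row) for row in image)
    return ink / (len(image) * len(image[0]))

def toy_content_model(track):
    # "Content" here is the standard track itself.
    return list(track)

def toy_skeleton_model(style, content):
    # Shear x by style * y to mimic a writer-specific slant.
    return [(x + style * y, y) for x, y in content]

track = generate_skeleton([[1, 0], [0, 1]], [(0, 0), (0, 2)],
                          toy_style_model, toy_content_model,
                          toy_skeleton_model)
```

The point of the sketch is only the separation of concerns: style comes from the user's image, content comes from the standard track sequence, and the generator sees both.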

Before the handwriting style characteristics and the content structure characteristics are input into the skeleton generation model, the skeleton generation model can be obtained by pre-training with the sample handwriting style characteristics of sample handwritten Chinese character images, the sample content structure characteristics of the sample standard track sequences of reference Chinese characters, and sample handwriting track sequences. Here, the initial skeleton generation model used in training may be constructed on the basis of a two-layer bidirectional long short-term memory network.

It should be noted that, to improve the style similarity of the generated Chinese character skeleton (handwriting track sequence), a handwriting style consistency constraint based on the track sequence can be applied during training of the skeleton generation model; that is, the initial skeleton generation model is trained with the handwriting-style consistency between the predicted handwriting track sequence generated by the model and the sample handwriting track sequence as a constraint condition.

In the embodiment of the invention, the handwriting style characteristic and the content structure characteristic are utilized to simulate the handwriting Chinese character skeleton with the handwriting style of the target user to generate the handwriting stroke sequence (handwriting track sequence) corresponding to the target Chinese character, thereby realizing the stable handwriting style extraction based on the handwriting Chinese character image containing any and a small number of handwriting Chinese characters, ensuring the similarity of the generated Chinese characters and the handwriting style of the target user, and improving the content correctness and the structure stability of the generated Chinese characters.

It should be noted that the handwriting track sequence generated in the embodiment of the present invention may be converted into an offline static image by related technical means and applied to data enhancement in specific scenes, which is significant for improving the performance of devices on recognition, detection and similar tasks in many fields.

The method for generating the Chinese character skeleton provided by the invention performs style extraction on the handwritten Chinese character image and content extraction on the standard track sequence, obtaining the handwriting style characteristics of the target user and the content structure characteristics of the target Chinese character respectively, and then generates the Chinese character skeleton from these two kinds of characteristics to obtain the handwriting track sequence of the target user for the target Chinese character. This overcomes the defects of the traditional scheme, in which the style similarity and structure stability of Chinese character skeleton synthesis are difficult to guarantee and the usage scenarios are relatively limited. It realizes stable handwriting style extraction from a handwritten Chinese character image containing arbitrary handwritten Chinese characters in a small number, ensures style consistency between the generated Chinese characters and the handwritten Chinese characters, improves the content correctness and structure stability of the generated Chinese characters, and at the same time preserves a wide application range.

Based on the above embodiment, in step 220, performing style extraction based on the handwritten Chinese character image to obtain the handwriting style characteristics of the target user includes:

inputting the handwritten Chinese character image into an image style extraction model to obtain the handwriting style characteristics of a target user output by the image style extraction model;

the image style extraction model is obtained by joint training with a sequence style extraction model, based on the sample handwriting style characteristics of the sample handwritten Chinese character image and the sample handwriting style characteristics of the sample handwritten Chinese character track sequence corresponding to that image.

Specifically, in step 220, the process of performing style extraction according to the handwritten Chinese character image to obtain the handwriting style characteristics of the target user may specifically include the following steps:

firstly, the handwritten Chinese character image can be segmented into a series of independent single-character images; specifically, a single-character detection model segments the handwritten Chinese character image, taking each Chinese character it contains as a unit, into a series of images that each contain only one Chinese character;

then, a preset number of single-character images are randomly selected from these independent single-character images and input into the image style extraction model, which performs style extraction, that is, extracts from the handwritten Chinese character images information that reflects the internal features and appearance characteristics of the target user's handwriting, such as writing content style, font usage habits, character spacing, character slant, stroke awkwardness and stroke trajectory tendencies, finally yielding the handwriting style features of the target user output by the image style extraction model.

The preset number can be set according to actual conditions and requirements, for example 5, 10 or 15. Preferably, the preset number is set to 10 in the embodiment of the invention, that is, 10 images are randomly selected from the segmented single-character images for style extraction.
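The random selection step can be illustrated with a minimal sketch. The function name `select_style_samples`, the file names, and the fallback of returning every crop when fewer than the preset number are available are assumptions made for illustration; the source only states that a preset number (preferably 10) of single-character images is chosen at random.

```python
import random


def select_style_samples(char_images, preset_number=10, seed=None):
    """Randomly pick `preset_number` single-character images without replacement.

    Assumption: if fewer crops than `preset_number` exist, all of them are used.
    """
    rng = random.Random(seed)
    if len(char_images) <= preset_number:
        return list(char_images)
    return rng.sample(char_images, preset_number)


# Hypothetical crops produced by a single-character detection model.
crops = [f"char_{i:02d}.png" for i in range(37)]
selected = select_style_samples(crops, preset_number=10, seed=0)
print(len(selected))
```

The selected crops would then be passed to the image style extraction model.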

Before the handwritten Chinese character image is input into the image style extraction model, the image style extraction model can be obtained in advance by joint training with the sequence style extraction model, using the sample handwriting style characteristics of the sample handwritten Chinese character image and the sample handwriting style characteristics of the corresponding sample handwritten Chinese character track sequence. The sequence style extraction model performs style extraction on an input sample handwritten Chinese character track sequence to obtain the sample handwriting style characteristics of that sequence.

Unlike the traditional multi-task training mode, the embodiment of the invention takes into account that multi-task learning requires abstract representation information to be completely shared among different modalities. If this condition is not satisfied, the model cannot aggregate matching high-dimensional representations, so that training deviates and prediction performance degrades. Therefore, the consistency of the handwriting style represented by sample handwriting data of the image modality and the sequence modality is used for model training, so as to obtain the trained image style extraction model.

Specifically, when performing model training, firstly, a large amount of sample handwriting data of an image modality and a sequence modality need to be collected, the sample handwriting data are respectively a sample handwritten Chinese character image and a sample handwritten Chinese character track sequence, and sample handwriting style characteristics of the sample handwriting data of the corresponding modality can be respectively determined through an initial style extraction model of the image modality and an initial style extraction model of the sequence modality, wherein the sample handwriting style characteristics are obtained by style extraction of the sample handwriting data of the corresponding modality; then, the similarity between the sample handwriting style characteristics of the sample handwriting data of the image modality and the sequence modality can be applied to carry out parameter iteration on the initial style extraction model of the image modality and the initial style extraction model of the sequence modality, so that the trained style extraction model of the image modality and the trained style extraction model of the sequence modality, namely the image style extraction model and the sequence style extraction model, are obtained.

Compared with the traditional scheme, in which the error between the predicted value and the labeled value drives the parameter update of the model, the training mode selected in the embodiment of the invention, which relies on the consistency of the handwriting styles represented by sample handwriting data of the image modality and the sequence modality, does not require complete sharing of abstract representation information between modalities as a precondition. The initial style extraction model is trained using the similarity between the sample handwriting style characteristics of sample handwriting data of the image modality and the sequence modality, so that it can fully learn the distance relation between the sample handwriting style characteristics of the two modalities, which provides key assistance for improving the style similarity between the generated Chinese characters and the handwritten Chinese characters.

In the embodiment of the invention, based on the model training process of the similarity between the sample handwriting style characteristics of the sample handwriting data in different modes, the initial style extraction model can judge the similarity between the sample handwriting style characteristics according to the difference of the corresponding personnel of the sample handwriting data, so that the similarity between the sample handwriting style characteristics is as high as possible when the sample handwriting data in the image mode and the sequence mode corresponds to the same personnel, namely when the sample handwriting data in the two modes can form a positive sample; conversely, when the sample handwriting data of the image modality and/or the sequence modality correspond to different persons, that is, the sample handwriting data of the same or different modalities may constitute a negative sample, the similarity between the sample handwriting style features is made as low as possible.
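The positive and negative sample construction described above can be sketched as follows. The record layout `(writer_id, modality, sample_id)` and the function name `build_pairs` are hypothetical; pairs from the same person in the same modality are simply skipped here, since the source does not say how they are treated.

```python
from itertools import combinations


def build_pairs(samples):
    """Split sample pairs into positives and negatives.

    samples: list of (writer_id, modality, sample_id) tuples (assumed layout).
    Positive: same writer, different modalities.
    Negative: different writers, same or different modalities.
    """
    positives, negatives = [], []
    for a, b in combinations(samples, 2):
        same_writer = a[0] == b[0]
        cross_modal = a[1] != b[1]
        if same_writer and cross_modal:
            positives.append((a[2], b[2]))
        elif not same_writer:
            negatives.append((a[2], b[2]))
    return positives, negatives


samples = [
    ("writer_a", "image", "a_img"),
    ("writer_a", "sequence", "a_seq"),
    ("writer_b", "image", "b_img"),
    ("writer_b", "sequence", "b_seq"),
]
positives, negatives = build_pairs(samples)
print(positives)
print(negatives)
```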

The method provided by the embodiment of the invention takes the consistency of the handwriting styles represented by the sample handwriting data in different modes as a reference to carry out model training, so that the model can fully learn the distance relation between the sample handwriting style characteristics of the sample handwriting data in different modes in the training process, thereby providing key assistance for promoting the style similarity between the generated Chinese characters and the handwritten Chinese characters, and overcoming the defect that the traditional training mode requires the complete sharing of abstract representation information in different modes so as to have poor model training effect; and the model is trained by utilizing the complementary relation and the corresponding relation of the same handwriting style among different modes, the generalization capability of the model can be improved, and the effectiveness and the accuracy of the handwriting style characteristic extraction are improved.

Based on the above embodiment, fig. 3 is a schematic flowchart of a contrastive training process provided by the present invention, and as shown in fig. 3, the image style extraction model is determined based on the following steps:

step 310, respectively determining sample handwriting style characteristics of a sample handwritten Chinese character image and sample handwriting style characteristics of a sample handwritten Chinese character track sequence based on an initial image style extraction model and an initial sequence style extraction model;

step 320, determining the similarity between the sample handwriting style characteristics of the positive sample and the similarity between the sample handwriting style characteristics of the negative sample, wherein the positive sample is a sample handwritten Chinese character image of the same person and a corresponding sample handwritten Chinese character track sequence, and the negative sample is a sample handwritten Chinese character image of different persons and/or a sample handwritten Chinese character track sequence;

step 330, determining the contrast loss based on the similarity between the sample handwriting style characteristics of the positive sample and the similarity between the sample handwriting style characteristics of the negative sample, and performing parameter adjustment on the initial image style extraction model and the initial sequence style extraction model based on the contrast loss to obtain the image style extraction model and the sequence style extraction model.

Specifically, the training process of the image style extraction model may specifically include the following steps:

step 310, firstly, an initial style extraction model of the image modality and an initial style extraction model of the sequence modality need to be determined, namely the initial image style extraction model and the initial sequence style extraction model respectively; here, the initial image style extraction model may be constructed on the basis of a fully convolutional neural network (CNN), and the initial sequence style extraction model may be constructed on the basis of a bidirectional long short-term memory (Bi-LSTM) network;

simultaneously, sample handwriting data of an image modality and sample handwriting data of a sequence modality need to be determined, wherein the sample handwriting data are a sample handwritten Chinese character image and a sample handwritten Chinese character track sequence respectively; in short, the sample handwriting data set not only contains sample handwriting data of the same person in two modes, but also contains sample handwriting data of different persons in the same mode or different modes.

In view of the fact that it is generally difficult to acquire sample handwritten data of the same person in different modalities, that is, a sample handwritten Chinese character image and a sample handwritten Chinese character track sequence of the same person cannot be acquired simultaneously, in the embodiment of the invention, a style migration method is adopted to render the sample handwritten Chinese character track sequence into an image, and background information is added to obtain the sample handwritten Chinese character image.

Specifically, the sample handwritten chinese character image may be determined by the following process, specifically:

firstly, acquiring a sample handwritten Chinese character track sequence, wherein the sample handwritten Chinese character track sequence can be derived from track point data on electronic equipment, and the sample handwritten Chinese character track sequence can be acquired through a tablet personal computer;

then, preprocessing the sample handwritten Chinese character track sequence to construct a sample handwritten data pair of a track sequence-Chinese character image, specifically, rendering the sample handwritten Chinese character track sequence to render the sample handwritten Chinese character track sequence into a binary image, so as to obtain a sample handwritten Chinese character binary image; fig. 4 is an exemplary diagram of a sample handwritten Chinese character binary image provided by the present invention, and as shown in fig. 4, a binary image without a background can be obtained through rendering, and then background addition is performed on the sample handwritten Chinese character binary image, that is, based on a style migration method, background information in a real photographing scene is added to the sample handwritten Chinese character binary image to obtain a sample handwritten Chinese character image with a background.
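The rendering of a track sequence into a background-free binary image can be sketched as below. This is only an illustration under stated assumptions: strokes are given as lists of integer (x, y) points, consecutive points are connected with Bresenham's line algorithm (a choice made here, not one named in the source), and the background-addition step is not shown.

```python
def render_binary_image(strokes, width, height):
    """Rasterize strokes (lists of integer (x, y) points) into a 0/1 pixel grid."""
    image = [[0] * width for _ in range(height)]

    def draw_line(x0, y0, x1, y1):
        # Bresenham's line algorithm: marks every pixel between two points.
        dx, dy = abs(x1 - x0), -abs(y1 - y0)
        sx = 1 if x0 < x1 else -1
        sy = 1 if y0 < y1 else -1
        err = dx + dy
        while True:
            if 0 <= x0 < width and 0 <= y0 < height:
                image[y0][x0] = 1
            if x0 == x1 and y0 == y1:
                break
            e2 = 2 * err
            if e2 >= dy:
                err += dy
                x0 += sx
            if e2 <= dx:
                err += dx
                y0 += sy

    for stroke in strokes:
        if len(stroke) == 1:
            # A single-point stroke is drawn as one pixel.
            draw_line(*stroke[0], *stroke[0])
        for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
            draw_line(x0, y0, x1, y1)
    return image


# Two strokes forming a cross, similar to the character "十".
strokes = [[(1, 3), (5, 3)], [(3, 1), (3, 5)]]
image = render_binary_image(strokes, width=7, height=7)
for row in image:
    print("".join("#" if pixel else "." for pixel in row))
```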

And then, respectively carrying out style extraction on the sample handwritten Chinese character image and the sample handwritten Chinese character track sequence by utilizing the initial image style extraction model and the initial sequence style extraction model to obtain the sample handwritten style characteristics of the sample handwritten Chinese character image and the sample handwritten style characteristics of the sample handwritten Chinese character track sequence.

Step 320, a positive sample and a negative sample are determined from the sample handwritten data set, where the positive sample can be understood as sample handwritten data of the same person in different modalities in the sample handwritten data set, and the corresponding negative sample is sample handwritten data of different persons in the sample handwritten data set in the same or different modalities, specifically, a sample handwritten Chinese character image of the same person and a corresponding sample handwritten Chinese character track sequence are selected from the sample handwritten data set to construct a positive sample, and simultaneously sample handwritten Chinese character images of different persons and/or sample handwritten Chinese character track sequences are selected from the sample handwritten data set to construct a negative sample;

then, the similarity between the sample handwriting style characteristics of the sample handwritten Chinese character image and those of the sample handwritten Chinese character track sequence in the positive sample is determined, together with the similarity between the sample handwriting style characteristics of the sample handwritten Chinese character images and/or sample handwritten Chinese character track sequences in the negative sample; that is, the similarity between the sample handwriting style characteristics of the positive sample and the similarity between the sample handwriting style characteristics of the negative sample are calculated. It is noted that the similarity between sample handwriting style features here can be measured by cosine similarity, Euclidean distance, Minkowski distance, and the like.
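The three measures named above can be written out in a few lines. This is a plain-Python sketch on small feature vectors; the function names are illustrative. Note that cosine similarity grows with similarity, whereas the Euclidean and Minkowski distances shrink as features become more similar.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (higher = more similar)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def minkowski_distance(u, v, p=2.0):
    """Minkowski distance of order p (lower = more similar)."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)


def euclidean_distance(u, v):
    """Euclidean distance, i.e. the Minkowski distance with p = 2."""
    return minkowski_distance(u, v, p=2.0)


image_style = [1.0, 0.0]
sequence_style = [0.0, 1.0]
print(cosine_similarity(image_style, sequence_style))
print(euclidean_distance(image_style, sequence_style))
print(minkowski_distance(image_style, sequence_style, p=1.0))
```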

Step 330, the loss of the model training process, i.e. the contrast loss, is determined according to the similarity between the sample handwriting style features of the positive sample and the similarity between the sample handwriting style features of the negative sample. Specifically, the contrast loss is calculated from the similarity between the sample handwriting style features of the sample handwritten Chinese character image and those of the sample handwritten Chinese character track sequence in the positive sample, and from the similarity between the sample handwriting style features of the sample handwritten Chinese character images and/or sample handwritten Chinese character track sequences in the negative sample;

the training target of the initial image style extraction model is to make the similarity between the sample handwriting style characteristics of the sample handwriting data in different modes as high as possible under the condition that the sample handwriting data forms a positive sample, namely under the condition that the sample handwriting data in different modes correspond to the same person; correspondingly, in the case where the sample handwriting data constitutes a negative sample, i.e., the sample handwriting data of the same or different modalities correspond to different persons, the similarity between the sample handwriting style characteristics of the sample handwriting data of the same or different modalities is made as low as possible.

Therefore, under the condition that the similarity between the sample handwriting style characteristics of each sample handwriting data in the positive sample is high, and the similarity between the sample handwriting style characteristics of each sample handwriting data in the negative sample is low, the contrast loss can be determined to be small; accordingly, in the case where the similarity between the sample handwriting style features of the respective sample handwriting data in the positive sample is low, and/or the similarity between the sample handwriting style features of the respective sample handwriting data in the negative sample is high, it can be determined that the contrast loss is large.

After that, parameter iteration can be performed on the initial style extraction models of the two modalities according to the contrast loss, specifically, the initial image style extraction model and the initial sequence style extraction model are subjected to parameter adjustment by using the contrast loss, so that the adjusted initial style extraction models of the two modalities can judge that the similarity between the handwriting style characteristics of the samples is as high as possible under the condition that the handwriting data of the samples of different modalities belongs to a positive sample, correspondingly, the similarity between the handwriting style characteristics of the samples is as low as possible under the condition that the handwriting data of the samples of the same or different modalities belongs to a negative sample, and finally, the trained style extraction models of the two modalities, namely, the image style extraction model and the sequence style extraction model, are obtained.
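The behavior of the contrast loss described above (small when positive similarity is high and negative similarity is low, large otherwise) can be illustrated with a minimal sketch. The source does not give the loss formula; the InfoNCE-style form, the `temperature` parameter and the function names below are assumptions chosen only because they exhibit the stated behavior.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def contrast_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss (assumed form, not taken from the source).

    Small when the anchor is close to the positive and far from the negatives.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


image_feat = [1.0, 0.0]         # image-modality style feature of writer A
same_writer_seq = [0.9, 0.1]    # sequence-modality style feature of writer A
other_writer_seq = [0.0, 1.0]   # sequence-modality style feature of writer B

well_aligned = contrast_loss(image_feat, same_writer_seq, [other_writer_seq])
misaligned = contrast_loss(image_feat, other_writer_seq, [same_writer_seq])
print(well_aligned, misaligned)
```

In an actual training loop the gradient of this loss would be used to update both the image and the sequence style extraction models.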

Based on the above embodiment, fig. 5 is a schematic flowchart of a joint training process provided by the present invention, and as shown in fig. 5, the obtaining of the image style extraction model and the sequence style extraction model by performing parameter adjustment on the initial image style extraction model and the initial sequence style extraction model based on the contrast loss includes:

step 510, determining distribution loss based on the sample handwriting style characteristics of the sample handwritten Chinese character image and the distance between the initial class center matrix of the initial handwriting style class center; the initial class center matrix is obtained by clustering the handwriting styles in the initial handwriting style class center;

step 520, determining the joint training loss based on the distribution loss and the contrast loss, and performing parameter adjustment on the initial sequence style extraction model, the initial image style extraction model and the initial handwriting style class center based on the joint training loss to obtain a sequence style extraction model, an image style extraction model and a handwriting style class center.

Specifically, the process of obtaining the style extraction models of the two modalities by performing parameter adjustment on the initial style extraction models of the two modalities according to the contrast loss may specifically include:

in order to ensure the stability of the generation of the Chinese character skeleton in the out-of-domain handwriting style in the process of generating the Chinese character skeleton, in the embodiment of the invention, in the training process of the image style extraction model, the Chinese character skeleton and the writing style class center are required to be subjected to joint training based on attention similarity.

Step 510, firstly, an initial class center matrix of an initial handwriting style class center needs to be determined, which can be obtained by clustering all handwriting styles in the initial handwriting style class center, specifically, the writing style class center is clustered based on a prototype clustering mode, so as to obtain an initial class center matrix of the initial handwriting style class center;

then, calculating the distance between the sample handwriting style characteristic of the sample handwritten Chinese character image and an initial class center matrix of an initial handwriting style class center, and determining the distribution loss of the joint training process according to the distance;

step 520, the loss of the joint training process, that is, the joint training loss, is determined by combining the contrast loss and the distribution loss. Specifically, when both the contrast loss and the distribution loss are small, the joint training loss can be determined to be small; when both are large, the joint training loss can be determined to be large; when one is large and the other is small, the weights of the contrast loss and the distribution loss can be combined to measure the magnitude of the joint training loss;

and finally obtaining the trained sequence style extraction model, image style extraction model and handwriting style class center by carrying out parameter adjustment on the initial style extraction model and the initial handwriting style class center of the two modes according to the combined training loss.

Based on the above embodiment, fig. 6 is a schematic flowchart of a process for determining a consistency loss provided by the present invention, and as shown in fig. 6, determining a joint training loss based on a distribution loss and a contrast loss includes:

step 610, determining a predicted handwritten trajectory sequence corresponding to a sample handwritten Chinese character image under the condition that any sample handwritten Chinese character image lacks a corresponding sample handwritten Chinese character trajectory sequence;

step 620, determining a predicted handwriting style characteristic of a predicted handwriting track sequence based on the initial sequence style extraction model, wherein the predicted handwriting track sequence is obtained by generating a Chinese character skeleton based on the sample handwriting style characteristic of the sample handwritten Chinese character image and the sample content structure characteristic of a sample standard track sequence of a reference Chinese character;

step 630, determining consistency loss based on the similarity between the sample handwriting style characteristics of the sample handwritten Chinese character image and the predicted handwriting style characteristics of the predicted handwriting track sequence, and determining the joint training loss based on the consistency loss, the distribution loss and the contrast loss.

Specifically, in the above process, the process of determining the joint training loss according to the distribution loss and the contrast loss may specifically include the following steps:

in the training process of the image style extraction model, not all sample handwritten Chinese character images are obtained by rendering and background addition based on a style migration method on the basis of a sample handwritten Chinese character track sequence, and part of the sample handwritten Chinese character images are directly acquired, or the sample handwritten Chinese character track sequence is lost carelessly after sample handwritten data pairs are formed, and only the sample handwritten Chinese character images are left.

In such cases, in order to ensure consistency of the generated predicted handwritten trajectory sequence and the sample handwritten Chinese character image in the handwriting style, that is, to make the generated predicted handwritten trajectory sequence have better similarity in style, in the embodiments of the present invention, consistency loss based on handwriting style characteristics may be used to improve style consistency of the generated Chinese character skeleton.

Specifically, under the condition that any sample handwritten Chinese character image in the sample handwritten data set lacks a corresponding sample handwritten Chinese character track sequence, namely the sample handwritten Chinese character image is directly acquired and is not obtained through data rendering, or the corresponding sample handwritten Chinese character track sequence is lost, a predicted handwritten track sequence corresponding to the sample handwritten Chinese character image needs to be determined, the predicted handwritten track sequence is obtained by generating a Chinese character skeleton by utilizing the sample handwritten style characteristics of the sample handwritten Chinese character image and the sample content structural characteristics of the reference Chinese character, the process of generating the Chinese character skeleton is described in detail above, and the description is omitted herein;

then, style extraction can be performed on the generated predicted handwriting track sequence by the initial style extraction model of the sequence modality to obtain the predicted handwriting style characteristics of the predicted handwriting track sequence; specifically, the generated predicted handwriting track sequence is input into the initial sequence style extraction model, which performs style extraction and outputs the predicted handwriting style characteristics of the predicted handwriting track sequence;

then, the similarity between the sample handwriting style characteristics of the sample handwritten Chinese character image and the predicted handwriting style characteristics of the predicted handwriting track sequence can be calculated, where the similarity between features can be measured by cosine similarity, Euclidean distance, Minkowski distance, and the like; the loss on handwriting style in the model training process, namely the consistency loss based on handwriting style, can then be determined from this similarity, and the joint training loss can be determined from the consistency loss, the distribution loss and the contrast loss, that is, the three losses are combined to measure the loss of the joint training process.

Specifically, under the condition that consistency loss, contrast loss and distribution loss are all small, it can be determined that the joint training loss is small; under the condition that consistency loss, contrast loss and distribution loss are all large, the large loss of the joint training can be determined; under the condition that the three trends are different, the loss of the joint training can be measured by combining the weight of each loss.
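A minimal sketch of the three-term joint training loss follows. The source states only that the consistency loss is derived from a feature similarity and that the three losses are combined with weights; the specific form `1 - cosine similarity` and the weighted sum with default weights of 1 are assumptions made for illustration.

```python
import math


def consistency_loss(image_style, predicted_style):
    """1 - cosine similarity between the image style feature and the style
    feature extracted from the predicted track sequence (assumed form)."""
    dot = sum(a * b for a, b in zip(image_style, predicted_style))
    norm = math.sqrt(sum(a * a for a in image_style)) * math.sqrt(
        sum(b * b for b in predicted_style)
    )
    return 1.0 - dot / norm


def joint_training_loss(consistency, distribution, contrast, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three losses; the weights are hypothetical hyperparameters."""
    w_cons, w_dist, w_con = weights
    return w_cons * consistency + w_dist * distribution + w_con * contrast


same_style = consistency_loss([1.0, 0.0], [2.0, 0.0])
different_style = consistency_loss([1.0, 0.0], [0.0, 1.0])
total = joint_training_loss(different_style, 0.25, 0.5, weights=(0.5, 2.0, 1.0))
print(same_style, different_style, total)
```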

Based on the above embodiments, fig. 7 is an overall framework diagram of the image style extraction model training process provided by the present invention, and as shown in fig. 7, the image style extraction model training process specifically includes:

in order to improve the handwriting style extraction capability of the image style extraction model, that is, the effectiveness, stability and style similarity of the handwriting style features it extracts, the embodiment of the invention provides a training scheme based on a cross-modal initial style extraction model, which specifically comprises the following steps:

in view of the fact that it is usually difficult to obtain both a sample handwritten Chinese character image and the corresponding sample handwritten Chinese character track sequence of the same person, in the embodiment of the invention a style transfer method is adopted to render the sample handwritten Chinese character track sequence into an image, and background information is added to obtain the sample handwritten Chinese character image.

Here, the sample handwritten Chinese character track sequence may be derived from track point data on an electronic device; for example, it may be acquired by a tablet computer. It may be regarded as the label of a sample handwritten Chinese character image and may be represented as a series of track points ordered in time, each track point consisting of a coordinate (x, y) on the electronic device and a pen-up event. The pen coordinates are integer values limited by the screen resolution; when the pen is lifted from the screen, the state of the pen-up event is recorded as 1, and otherwise as 0. A sample handwritten Chinese character track sequence may thus be defined as the ordered sequence of track points $x_1, x_2, \dots, x_T$,

where $x_t$ denotes the t-th track point and $T$ denotes the total number of track points.

In the training process, the sample handwriting data (track sequence) input at a given moment can be defined as the position of the track point at the current moment relative to the track point at the previous moment. Denoting the relative position of each track point by P, the input consists of a real-valued pair $(x_1, x_2)$ and a binary flag $x_3$,

where $x_3$ indicates the state of the pen-up event: $x_3 = 1$ when the current track point has been completed or the pen is lifted from the screen, and $x_3 = 0$ otherwise.
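As a minimal sketch of this relative encoding, the snippet below converts absolute track points (x, y, pen-up) into offsets relative to the previous point. The function name `to_relative` and the zero offset assigned to the first point are assumptions made for illustration; the text does not specify how the first point is handled.

```python
import numpy as np

def to_relative(points):
    """Convert absolute track points to relative offsets.

    points: array-like of shape (T, 3) with rows (x, y, pen_up).
    Returns an array of shape (T, 3) with rows (dx, dy, pen_up).
    The first point has no predecessor, so its offset is set to (0, 0)
    (an assumption of this sketch).
    """
    points = np.asarray(points, dtype=np.float64)
    rel = points.copy()
    rel[1:, :2] = points[1:, :2] - points[:-1, :2]  # difference to previous point
    rel[0, :2] = 0.0
    return rel
```

The pen-up flag is carried over unchanged, so each row keeps the real-valued pair and the binary flag together.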

During model training, the cross-modal initial style extraction model is trained by exploiting the fact that sample handwriting data of the same person in different modalities share the same handwriting style, so that the handwriting styles of the same person in the two modalities are mapped into the same feature space, thereby improving the model's ability to extract handwriting style. The cross-modal initial style extraction model comprises an initial sequence style extraction model and an initial image style extraction model; using a contrastive learning strategy, the sample handwriting style features of the same person in different modalities are pulled closer together, and the sample handwriting style features of different persons are pushed further apart.

Specifically, the input of the initial sequence style extraction model is the relative position P of each track point, from which it extracts the sample handwriting style feature seq_feat; the input of the initial image style extraction model is a sample handwritten Chinese character image, from which it extracts the sample handwriting style feature img_feat. When the two models are trained on the two types of handwriting data, based on the contrastive learning strategy, the contrast loss (triplet_loss) is used to pull together the sample handwriting style features of the same person's sample handwriting data in the two modalities and to push apart the sample handwriting style features of different persons' sample handwriting data in the same or different modalities.
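One common way to realize such a contrast loss is a margin-based triplet loss, sketched below. Here the anchor could be the img_feat of one writer, the positive the seq_feat of the same writer, and the negative a feature of a different writer. The Euclidean distance and the margin value are assumptions of this sketch; the text names the loss triplet_loss but does not give its exact form.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Margin-based triplet loss on style features.

    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin` (an illustrative hyperparameter).
    """
    d_pos = np.linalg.norm(anchor - positive)  # same writer, other modality
    d_neg = np.linalg.norm(anchor - negative)  # different writer
    return float(max(0.0, d_pos - d_neg + margin))
```

Minimizing this value reduces the same-writer distance and increases the different-writer distance, which matches the pull-together and push-apart behavior described above.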

For the image style extraction model, the distribution loss (dist_loss) is used to train the initial handwriting style class center in a prototype clustering manner, and the matrix parameters of the initial class center matrix participate in training as the class center parameters of the two-stage network.
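A prototype-style distribution loss can be sketched as the distance from each image style feature to its nearest class center. The exact form of dist_loss is not given in the text, so the squared Euclidean distance, the nearest-center assignment and the averaging below are assumptions for illustration only.

```python
import numpy as np

def distribution_loss(img_feats, centers):
    """Mean squared distance from each feature to its nearest class center.

    img_feats: array of shape (N, D), image style features.
    centers:   array of shape (K, D), the class center matrix.
    """
    diffs = img_feats[:, None, :] - centers[None, :, :]  # (N, K, D)
    sq_dist = np.sum(diffs ** 2, axis=-1)                # (N, K)
    return float(np.mean(np.min(sq_dist, axis=1)))
```

In a trainable setting the centers would be learnable parameters updated together with the feature extractor, so that features cluster around a small set of style prototypes.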

To ensure the effectiveness of style extraction, the training stage of the initial style extraction model also applies a consistency loss to the sample handwriting data of the same person, so as to reduce the distance between that person's sample handwriting style feature seq_feat and sample handwriting style feature img_feat. That is, for an unlabeled sample handwritten Chinese character image, in other words a sample handwritten Chinese character image that lacks a corresponding sample handwritten Chinese character track sequence, the consistency loss (cycle_loss) based on handwriting style features can be used to ensure that the generated predicted handwriting track sequence is consistent in handwriting style with the sample handwritten Chinese character image, thereby promoting the style consistency of the generated Chinese character skeleton. Concretely, the generated predicted handwriting track sequence is input into the initial sequence style extraction model to extract its predicted handwriting style features, and a distance is measured between these and the sample handwriting style features of the corresponding sample handwritten Chinese character image; reducing this distance realizes the consistency constraint based on handwriting style.

The overall loss of the model training process can be expressed as:

$$\mathrm{Loss} = L_{\mathrm{triplet\_loss}} + L_{\mathrm{dist\_loss}} + L_{\mathrm{cycle\_loss}}$$

where Loss is the overall loss, i.e., the joint training loss.

According to the method provided by the embodiment of the invention, the initial image style extraction model is trained so that, once training has converged, the image style extraction model can model and reconstruct the handwriting style of the target user from a handwritten Chinese character image containing an arbitrary, small number of Chinese characters written by the target user. This ensures the stability of style extraction from handwritten Chinese character images and the effectiveness of the extracted handwriting style features, improves the style similarity of the handwriting style features, and provides key support for automatically generating a large-scale handwritten Chinese character skeleton font library in the handwriting style of the target user.

Based on the foregoing embodiment, fig. 8 is a flowchart illustrating a process for determining a handwriting style characteristic provided by the present invention, and as shown in fig. 8, inputting a handwritten chinese character image into an image style extraction model to obtain a handwriting style characteristic of a target user output by the image style extraction model, where the process includes:

step 810, inputting the handwritten Chinese character image into an image style extraction model to obtain initial handwriting style characteristics of a target user output by the image style extraction model;

step 820, determining a class center matrix of the handwriting style class center, and determining the handwriting style characteristics of the target user based on the correlation between the initial handwriting style characteristics and the class center matrix.

Specifically, the process of inputting a handwritten Chinese character image into the image style extraction model to obtain the handwriting style characteristics of the target user output by the image style extraction model may specifically include:

step 810, first, the handwritten Chinese character image is input into the image style extraction model, which performs style extraction to extract from the image information reflecting the intrinsic features and outward appearance of the target user's handwritten Chinese characters, for example, writing content style, font usage habits, character spacing, character slant, stroke sharpness, writing track direction and the like, so that the initial handwriting style features of the target user output by the image style extraction model are obtained;

step 820, next, to ensure the stability of Chinese character skeleton generation for out-of-domain handwriting styles, in the embodiment of the invention, after the initial handwriting style features output by the image style extraction model are obtained, an attention similarity calculation is performed between the initial handwriting style features and the handwriting style class center to obtain the final handwriting style features. Specifically, a class center matrix of the handwriting style class center is determined: because the matrix parameters of the initial class center matrix participate in joint training as the class center parameters of the two-stage network when the initial handwriting style class center is trained, the class center matrix is obtained once training is finished. The attention similarity calculation can then be performed between the initial handwriting style features and the class center matrix to obtain the handwriting style features of the target user.
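The attention similarity calculation between the initial style feature and the class center matrix can be sketched as below. Scaled dot-product attention is assumed here as one plausible form; the text does not state the scoring function, and the names `style_attention`, `s_feat` and `meta_style` (the latter borrowed from the overall framework description) are used for illustration only.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / np.sum(e)

def style_attention(s_feat, meta_style):
    """Attend from an initial style feature over the class centers.

    s_feat:     array of shape (D,), initial handwriting style feature.
    meta_style: array of shape (K, D), class center matrix.
    Returns an array of shape (D,): a convex combination of class centers.
    """
    d = s_feat.shape[0]
    scores = meta_style @ s_feat / np.sqrt(d)  # (K,) similarity to each center
    weights = softmax(scores)                  # (K,) attention weights
    return weights @ meta_style
```

Because the output is a weighted mixture of learned class centers, an unusual (out-of-domain) input style is expressed in terms of styles seen during training, which is one way to read the stability claim made above.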

Based on the foregoing embodiment, fig. 9 is a schematic flowchart of a process for determining a handwriting track sequence provided by the present invention, and as shown in fig. 9, based on handwriting style characteristics and content structure characteristics, a chinese character skeleton is generated to obtain a handwriting track sequence of a target user for a target chinese character, where the process includes:

step 910, performing track prediction based on the handwriting style characteristics and the content structure characteristics to obtain the relative position of each handwriting track point of the target Chinese character;

step 920, performing state prediction based on the handwriting style characteristics and the content structure characteristics to obtain the track state of each handwriting track point of the target Chinese character;

step 930, generating a Chinese character skeleton based on the relative position and the track state of each handwritten track point to obtain a handwritten track sequence of the target user for the target Chinese character.

Specifically, in step 140, the process of generating a chinese character skeleton according to the handwriting style characteristics and the content structure characteristics to obtain the handwriting track sequence of the target user for the target chinese character may specifically include the following steps:

step 910, firstly, a trajectory prediction may be performed by using the handwriting style characteristics and the content structure characteristics to obtain the relative position of each handwriting trajectory point of the target Chinese character, specifically, the trajectory prediction may be performed by using the handwriting style characteristics of the target user and the content structure characteristics of the target Chinese character as references to obtain the relative position of each handwriting trajectory point of the target Chinese character;

specifically, the skeleton generation network for Chinese character skeleton generation can be regarded as a decoder based on a two-layer bidirectional LSTM network, so that when it is applied to Chinese character skeleton generation, the decoder can predict each handwritten track point in the handwriting track sequence according to the handwriting style features and the content structure features. Specifically, the position offset $(d_x, d_y)$ of each handwritten track point can be modeled by a Gaussian mixture model (GMM) with R bivariate normal distributions; that is, the relative position of each handwritten track point generated by the decoder is obtained by sampling from the Gaussian mixture distribution over relative positions predicted by the decoder.
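Sampling an offset from a mixture of R bivariate normal distributions can be sketched as follows. The parameterization (mixture weights `pi`, means `mu`, standard deviations `sigma`, correlations `rho`) is a common convention assumed here; the text does not specify how the decoder outputs the mixture parameters.

```python
import numpy as np

def sample_offset(pi, mu, sigma, rho, rng):
    """Sample one (d_x, d_y) offset from a bivariate Gaussian mixture.

    pi:    (R,)   mixture weights, summing to 1.
    mu:    (R, 2) component means.
    sigma: (R, 2) component standard deviations for x and y.
    rho:   (R,)   component correlations in (-1, 1).
    rng:   a numpy.random.Generator.
    """
    r = rng.choice(len(pi), p=pi)  # pick a mixture component
    sx, sy = sigma[r]
    cov = np.array([[sx * sx, rho[r] * sx * sy],
                    [rho[r] * sx * sy, sy * sy]])
    return rng.multivariate_normal(mu[r], cov)
```

A typical call would pass `rng = np.random.default_rng()` together with the mixture parameters predicted at the current decoding step.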

Corresponding to the training phase, at the current decoding time t, the predicted handwritten track point $p_{t-1}$ generated at the previous decoding time, the content output $c_t$ of the content structure extraction model and the sample handwriting style features $s_{context}$ can be concatenated to obtain $a_t = [p_{t-1}, c_t, s_{context}]$; $a_t$ is used to determine the hidden state of the decoder at the current decoding time, $h_t = \mathrm{DEC}(h_{t-1}, a_t)$; finally, $h_t$ is mapped through a linear layer to predict and output the predicted handwritten track point $p_t$ at the current decoding time.
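A single decoding step of this form can be sketched as below. For brevity a plain tanh recurrent cell stands in for the LSTM-based DEC, which is a deliberate simplification; the parameter names in `params` and all dimensions are hypothetical.

```python
import numpy as np

def decoder_step(p_prev, c_t, s_context, h_prev, params):
    """One decoding step: concatenate inputs, update state, map to output.

    A tanh recurrent cell is used in place of the LSTM decoder (assumption
    of this sketch). `params` holds hypothetical weight matrices and biases.
    """
    a_t = np.concatenate([p_prev, c_t, s_context])     # a_t = [p_{t-1}, c_t, s_context]
    h_t = np.tanh(params["W_h"] @ h_prev
                  + params["W_a"] @ a_t
                  + params["b_h"])                     # h_t = DEC(h_{t-1}, a_t)
    p_t = params["W_o"] @ h_t + params["b_o"]          # linear layer mapping
    return p_t, h_t
```

Running the step in a loop, feeding each `p_t` back in as `p_prev`, yields the autoregressive generation of track points described above.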

Step 920, simultaneously, according to the handwriting style characteristics and the content structure characteristics, performing state prediction to obtain the track state of each handwriting track point of the target Chinese character, specifically, performing state prediction by taking the handwriting style characteristics of the target user and the content structure characteristics of the target Chinese character as references, thereby obtaining the track state of each handwriting track point of the target Chinese character;

specifically, when state prediction is performed on the basis of the handwriting style features and the content structure features using the skeleton generation model regarded as a decoder, the state class $(p_1, p_2, p_3)$ of each handwritten track point can be modeled by a three-class classifier (Softmax layer); that is, the track state of each handwritten track point of the target Chinese character is predicted from the state class output by the decoder.
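The three-class state prediction can be sketched as a linear layer followed by a softmax over the decoder hidden state. The weight matrix `W`, the bias `b` and the use of argmax to pick the state are assumptions of this sketch; the text does not say what each of the three classes denotes, so no meaning is attached to the indices here.

```python
import numpy as np

def predict_state(h_t, W, b):
    """Predict the track state of one track point.

    h_t: (H,) decoder hidden state; W: (3, H); b: (3,).
    Returns the probabilities (p_1, p_2, p_3) and the most likely class index.
    """
    logits = W @ h_t + b
    logits = logits - np.max(logits)  # numerical stability
    probs = np.exp(logits) / np.sum(np.exp(logits))
    return probs, int(np.argmax(probs))
```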

Step 930, a Chinese character skeleton is generated according to the relative position of each handwritten track point obtained by track prediction and the track state of each handwritten track point obtained by state prediction, so as to obtain the handwriting track sequence of the target user for the target Chinese character.
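Assembling a skeleton from relative positions and track states can be sketched as a cumulative sum followed by splitting into strokes. For simplicity the track state is reduced here to the binary pen-up flag described for the training data (1 when the pen is lifted), and the starting coordinate is a free parameter; both are assumptions of this sketch.

```python
import numpy as np

def to_strokes(offsets, pen_up, start=(0.0, 0.0)):
    """Rebuild absolute strokes from relative offsets and pen-up flags.

    offsets: array-like (T, 2) of (d_x, d_y) per track point.
    pen_up:  sequence of T flags; 1 ends the current stroke.
    Returns a list of arrays, one (n, 2) array of coordinates per stroke.
    """
    offsets = np.asarray(offsets, dtype=np.float64)
    coords = np.asarray(start) + np.cumsum(offsets, axis=0)  # absolute positions
    strokes, current = [], []
    for xy, up in zip(coords, pen_up):
        current.append(xy)
        if up == 1:  # pen lifted: close the stroke
            strokes.append(np.array(current))
            current = []
    if current:  # trailing points without a final pen-up
        strokes.append(np.array(current))
    return strokes
```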

Fig. 10 is an overall framework diagram of the Chinese character skeleton generation method provided by the present invention. As shown in fig. 10, the overall framework consists of three parts: an image style extraction model, a content structure extraction model and a skeleton generation model. The image style extraction model consists of a fully convolutional CNN, which encodes the input handwritten Chinese character image $X_s$ into the initial handwriting style features $S_{feat} = \mathrm{ENC}_{style}(X_s)$; an attention mechanism is introduced between the image style extraction model and the skeleton generation model, so that attention is computed between the initial handwriting style features $S_{feat}$ and the class center matrix meta_style of the handwriting style class center, giving the handwriting style features $S_{context} = \mathrm{attention}(S_{feat}, \mathrm{meta\_style})$. The content structure extraction model consists of an LSTM network layer, which encodes the input standard track sequence $X_c$ of the target Chinese character into the content structure features $H_{context} = \mathrm{ENC}_{context}(X_c)$. The skeleton generation model consists of a two-layer bidirectional LSTM network, which generates the Chinese character skeleton from the input handwriting style features $S_{context}$ and content structure features $H_{context}$ to obtain the handwriting track sequence. Specifically, the method may comprise:

firstly, determining a handwritten Chinese character image of a target user and a standard track sequence of a target Chinese character;

then, style extraction is performed based on the handwritten Chinese character image to obtain the handwriting style features of the target user; specifically, the handwritten Chinese character image is input to the image style extraction model to obtain the handwriting style features of the target user output by the image style extraction model. The image style extraction model is obtained by training jointly with a sequence style extraction model, based on the sample handwriting style features of a sample handwritten Chinese character image and the sample handwriting style features of the sample handwritten Chinese character track sequence corresponding to that image; the sequence style extraction model performs style extraction based on the sample handwritten Chinese character track sequence to obtain the sample handwriting style features of the sample handwritten Chinese character track sequence;

the method for inputting the handwritten Chinese character image into the image style extraction model to obtain the handwriting style characteristics of the target user output by the image style extraction model comprises the following steps: inputting the handwritten Chinese character image into an image style extraction model to obtain initial handwriting style characteristics of a target user output by the image style extraction model; and determining a class center matrix of the handwriting style class center, and determining the handwriting style characteristics of the target user based on the correlation between the initial handwriting style characteristics and the class center matrix.

Here, the image style extraction model is determined based on the following steps: respectively determining sample handwriting style characteristics of the sample handwritten Chinese character image and sample handwriting style characteristics of the sample handwritten Chinese character track sequence based on the initial image style extraction model and the initial sequence style extraction model; determining the similarity between the sample handwriting style characteristics of a positive sample and the similarity between the sample handwriting style characteristics of a negative sample, wherein the positive sample is a sample handwritten Chinese character image of the same person and a corresponding sample handwritten Chinese character track sequence, and the negative sample is a sample handwritten Chinese character image of different persons and/or a sample handwritten Chinese character track sequence; and determining the contrast loss based on the similarity between the sample handwriting style characteristics of the positive sample and the similarity between the sample handwriting style characteristics of the negative sample, and performing parameter adjustment on the initial image style extraction model and the initial sequence style extraction model based on the contrast loss to obtain an image style extraction model and a sequence style extraction model.

Based on the contrast loss, parameter adjustment is carried out on the initial image style extraction model and the initial sequence style extraction model to obtain an image style extraction model and a sequence style extraction model, and the method specifically comprises the following steps: determining distribution loss based on the sample handwriting style characteristics of the sample handwritten Chinese character image and the distance between the sample handwriting style characteristics and an initial class center matrix of an initial handwriting style class center; the initial class center matrix is obtained by clustering the handwriting styles in the initial handwriting style class center; determining the joint training loss based on the distribution loss and the contrast loss, and carrying out parameter adjustment on the initial sequence style extraction model, the initial image style extraction model and the initial handwriting style class center based on the joint training loss to obtain the sequence style extraction model, the image style extraction model and the handwriting style class center.

Wherein, determining the joint training loss based on the distribution loss and the contrast loss specifically comprises: when any sample handwritten Chinese character image lacks a corresponding sample handwritten Chinese character track sequence, determining a predicted handwriting track sequence corresponding to the sample handwritten Chinese character image; determining the predicted handwriting style features of the predicted handwriting track sequence based on the initial sequence style extraction model, wherein the predicted handwriting track sequence is obtained by generating a Chinese character skeleton based on the sample handwriting style features of the sample handwritten Chinese character image and the sample content structure features of the sample standard track sequence of a reference Chinese character; and determining the consistency loss based on the similarity between the sample handwriting style features of the sample handwritten Chinese character image and the predicted handwriting style features of the predicted handwriting track sequence, and determining the joint training loss based on the consistency loss, the distribution loss and the contrast loss.

Then, extracting content based on the standard track sequence to obtain the content structure characteristics of the target Chinese character;

then, generating a Chinese character skeleton based on the handwriting style characteristics and the content structure characteristics to obtain a handwriting track sequence of the target user for the target Chinese character, specifically, based on the handwriting style characteristics and the content structure characteristics, performing track prediction to obtain the relative position of each handwriting track point of the target Chinese character; performing state prediction based on the handwriting style characteristics and the content structure characteristics to obtain the track state of each handwriting track point of the target Chinese character; and generating a Chinese character skeleton based on the relative position and the track state of each handwritten track point to obtain a handwritten track sequence of the target user for the target Chinese character.

The method provided by the embodiment of the invention performs style extraction on the handwritten Chinese character image and content extraction on the standard track sequence to obtain, respectively, the handwriting style features of the target user and the content structure features of the target Chinese character, and then generates a Chinese character skeleton from these features to obtain the handwriting track sequence of the target user for the target Chinese character.

The following describes the apparatus for generating a skeleton of chinese characters according to the present invention, and the apparatus for generating a skeleton of chinese characters described below and the method for generating a skeleton of chinese characters described above may be referred to in correspondence with each other.

Fig. 11 is a schematic structural diagram of a chinese character skeleton generating device provided in the present invention, and as shown in fig. 11, the device includes:

a data determining unit 1110, configured to determine a handwritten chinese character image of a target user and a standard trajectory sequence of a target chinese character;

a style extraction unit 1120, configured to perform style extraction based on the handwritten Chinese character image to obtain a handwritten style characteristic of the target user;

a content extraction unit 1130, configured to perform content extraction based on the standard trajectory sequence to obtain a content structure feature of the target chinese character;

and a skeleton generation unit 1140, configured to perform Chinese character skeleton generation based on the handwriting style features and the content structure features, so as to obtain a handwriting trajectory sequence of the target user for the target Chinese character.

The Chinese character skeleton generation device provided by the invention performs style extraction on the handwritten Chinese character image and content extraction on the standard track sequence to obtain, respectively, the handwriting style features of the target user and the content structure features of the target Chinese character, and then generates a Chinese character skeleton from these features to obtain the handwriting track sequence of the target user for the target Chinese character.
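The division of the device into units can be sketched as a small composition of callables. The class name `SkeletonGenerator` and its field names are hypothetical; each callable is a placeholder for the corresponding trained model and carries no implementation here.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class SkeletonGenerator:
    """Wires the style, content and generation units together (sketch)."""
    extract_style: Callable[[Any], Any]    # role of style extraction unit 1120
    extract_content: Callable[[Any], Any]  # role of content extraction unit 1130
    generate: Callable[[Any, Any], Any]    # role of skeleton generation unit 1140

    def run(self, handwritten_image, standard_sequence):
        # The two arguments play the role of the data determining unit 1110.
        style = self.extract_style(handwritten_image)
        content = self.extract_content(standard_sequence)
        return self.generate(style, content)
```

Keeping the three units as independent callables mirrors the statement that style and content are extracted separately and only combined in the generation step.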

Based on the above embodiment, the style extraction unit 1120 is configured to:

Based on the above embodiment, the apparatus further includes a model training unit, configured to:

Based on the above embodiment, the model training unit is configured to:

under the condition that any sample handwritten Chinese character image lacks a corresponding sample handwritten Chinese character track sequence, determining a predicted handwritten track sequence corresponding to the sample handwritten Chinese character image;

determining the predicted handwriting style characteristics of the predicted handwriting track sequence based on the initial sequence style extraction model, wherein the predicted handwriting track sequence is obtained by generating a Chinese character framework based on the sample handwriting style characteristics of the sample handwritten Chinese character image and the sample content structure characteristics of the sample standard track sequence of the reference Chinese character;

and determining consistency loss based on the similarity between the sample handwriting style characteristics of the sample handwritten Chinese character image and the predicted handwriting style characteristics of the predicted handwriting track sequence, and determining joint training loss based on the consistency loss, the distribution loss and the contrast loss.

Based on the above embodiment, the style extraction unit 1120 is configured to:

inputting the handwritten Chinese character image into an image style extraction model to obtain initial handwriting style characteristics of the target user output by the image style extraction model;

Based on the above embodiment, the skeleton generation unit 1140 is configured to:

Fig. 12 illustrates a physical structure diagram of an electronic device, which may include, as shown in fig. 12: a processor (processor) 1210, a communication Interface (Communications Interface) 1220, a memory (memory) 1230, and a communication bus 1240, wherein the processor 1210, the communication Interface 1220, and the memory 1230 communicate with each other via the communication bus 1240. Processor 1210 may invoke logic instructions in memory 1230 to perform a chinese character skeleton generation method comprising: determining a handwritten Chinese character image of a target user and a standard track sequence of a target Chinese character; performing style extraction based on the handwritten Chinese character image to obtain the handwriting style characteristics of the target user; extracting content based on the standard track sequence to obtain the content structure characteristics of the target Chinese character; and generating a Chinese character skeleton based on the handwriting style characteristics and the content structure characteristics to obtain a handwriting track sequence of the target user for the target Chinese character.

In addition, the logic instructions in the memory 1230 may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.

In another aspect, the present invention also provides a computer program product, the computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions, when the program instructions are executed by a computer, the computer being capable of executing the method for generating a skeleton of chinese characters provided by the above methods, the method comprising: determining a handwritten Chinese character image of a target user and a standard track sequence of a target Chinese character; performing style extraction based on the handwritten Chinese character image to obtain the handwriting style characteristics of the target user; extracting content based on the standard track sequence to obtain the content structure characteristics of the target Chinese character; and generating a Chinese character skeleton based on the handwriting style characteristics and the content structure characteristics to obtain a handwriting track sequence of the target user for the target Chinese character.

In another aspect, the present invention also provides a non-transitory computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program performs the Chinese character skeleton generation method provided by the above methods, the method comprising: determining a handwritten Chinese character image of a target user and a standard track sequence of a target Chinese character; performing style extraction based on the handwritten Chinese character image to obtain the handwriting style characteristics of the target user; extracting content based on the standard track sequence to obtain the content structure characteristics of the target Chinese character; and generating a Chinese character skeleton based on the handwriting style characteristics and the content structure characteristics to obtain a handwriting track sequence of the target user for the target Chinese character.

The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

1. A Chinese character skeleton generation method, characterized by comprising the following steps:

2. The method for generating a Chinese character skeleton according to claim 1, wherein the performing style extraction based on the handwritten Chinese character image to obtain the handwriting style characteristics of the target user comprises:

the sequence style extraction model is used for carrying out style extraction on the basis of the sample handwritten Chinese character track sequence to obtain sample handwritten style characteristics of the sample handwritten Chinese character track sequence.

3. The method for generating Chinese character skeletons according to claim 2, wherein the image style extraction model is determined based on the following steps:

4. The method for generating Chinese character skeletons according to claim 3, wherein the parameter adjustment of the initial image style extraction model and the initial sequence style extraction model based on the contrast loss to obtain an image style extraction model and a sequence style extraction model comprises:

5. The method for generating a Chinese character skeleton according to claim 4, wherein the determining a joint training loss based on the distribution loss and the contrast loss comprises:

6. The method for generating a Chinese character skeleton according to any one of claims 2 to 5, wherein the inputting the handwritten Chinese character image to an image style extraction model to obtain the handwriting style characteristics of the target user output by the image style extraction model comprises:

7. The method for generating a Chinese character skeleton according to any one of claims 1 to 5, wherein the generating a Chinese character skeleton based on the handwriting style characteristics and the content structure characteristics to obtain a handwriting track sequence of the target user for the target Chinese character comprises:

8. A Chinese character skeleton generation device, comprising:

9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the method for generating a Chinese character skeleton according to any one of claims 1 to 7.

10. A non-transitory computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method for generating a Chinese character skeleton according to any one of claims 1 to 7.