WO2019232869A1

WO2019232869A1 - Handwriting model training method, text recognition method and apparatus, device, and medium

Info

Publication number: WO2019232869A1
Application number: PCT/CN2018/094344
Authority: WO
Inventors: 孙强; 周罡
Original assignee: 平安科技（深圳）有限公司
Priority date: 2018-06-04
Filing date: 2018-07-03
Publication date: 2019-12-12
Also published as: CN109086654B; CN109086654A

Abstract

Disclosed are a handwriting model training method, a text recognition method and apparatus, a device, and a medium. The handwriting model training method comprises: obtaining a standard Chinese text training sample, dividing the standard Chinese text training sample into batches according to a preset batch, inputting the batched standard Chinese text training sample into a recurrent neural network, performing training based on a connectionist temporal classification (CTC) algorithm, and updating the network parameters by using a backpropagation through time (BPTT) algorithm to obtain a standard Chinese text recognition model; obtaining a non-standard Chinese text training sample and using it to train and obtain an adjusted Chinese handwritten text recognition model; obtaining a Chinese text sample to be tested and using it to obtain an error text training sample; and using the error text training sample to update the network parameters of the adjusted Chinese handwritten text recognition model to obtain a target Chinese handwritten text recognition model. With this handwriting model training method, a target Chinese handwritten text recognition model with a high recognition rate for handwritten text can be obtained.

Description

Handwriting model training method, text recognition method, apparatus, device, and medium

This application is based on and claims priority to Chinese patent application No. 201810564063.8, filed on June 4, 2018 and entitled "Handwriting Model Training Method, Text Recognition Method, Apparatus, Device, and Medium".

Technical field

The present application relates to the field of Chinese text recognition, and in particular to a handwriting model training method, a text recognition method, an apparatus, a device, and a medium.

Background

When traditional text recognition methods are used to recognize relatively sloppy non-standard text (handwritten Chinese text), the recognition accuracy is low and the recognition results are unsatisfactory. Traditional text recognition methods can, for the most part, only recognize standard text, and their accuracy is low when recognizing the wide variety of handwritten text found in real life.

Summary of the Invention

The embodiments of the present application provide a handwriting model training method, an apparatus, a device, and a medium to solve the problem that the accuracy of current handwritten Chinese text recognition is low.

A handwriting model training method includes:

Obtaining standard Chinese text training samples, dividing the standard Chinese text training samples into batches according to a preset batch, inputting the batched standard Chinese text training samples into a recurrent neural network, performing training based on a connectionist temporal classification (CTC) algorithm, and updating the network parameters of the recurrent neural network by using a backpropagation through time (BPTT) algorithm to obtain a standard Chinese text recognition model;

Obtaining non-standard Chinese text training samples, dividing the non-standard Chinese text training samples into batches according to the preset batch, inputting the batched non-standard Chinese text training samples into the standard Chinese text recognition model, performing training based on the CTC algorithm, and updating the network parameters of the standard Chinese text recognition model by using the BPTT algorithm to obtain an adjusted Chinese handwritten text recognition model;

Obtaining Chinese text samples to be tested, recognizing the Chinese text samples to be tested by using the adjusted Chinese handwritten text recognition model, obtaining error texts whose recognition results do not match the true results, and using all the error texts as error text training samples;

Inputting the error text training samples into the adjusted Chinese handwritten text recognition model, performing training based on the CTC algorithm, and updating the network parameters of the adjusted Chinese handwritten text recognition model by using a BPTT algorithm with batch gradient descent to obtain a target Chinese handwritten text recognition model.

A handwriting model training apparatus includes:

A standard Chinese text recognition model acquisition module, configured to obtain standard Chinese text training samples, divide the standard Chinese text training samples into batches according to a preset batch, input the batched standard Chinese text training samples into a recurrent neural network, perform training based on a connectionist temporal classification (CTC) algorithm, and update the network parameters of the recurrent neural network by using a backpropagation through time (BPTT) algorithm to obtain a standard Chinese text recognition model;

An adjusted Chinese handwritten text recognition model acquisition module, configured to obtain non-standard Chinese text training samples, divide the non-standard Chinese text training samples into batches according to the preset batch, input the batched non-standard Chinese text training samples into the standard Chinese text recognition model, perform training based on the CTC algorithm, and update the network parameters of the standard Chinese text recognition model by using the BPTT algorithm to obtain an adjusted Chinese handwritten text recognition model;

An error text training sample acquisition module, configured to obtain Chinese text samples to be tested, recognize the Chinese text samples to be tested by using the adjusted Chinese handwritten text recognition model, obtain error texts whose recognition results do not match the true results, and use all the error texts as error text training samples;

A target Chinese handwritten text recognition model acquisition module, configured to input the error text training samples into the adjusted Chinese handwritten text recognition model, perform training based on the CTC algorithm, and update the network parameters of the adjusted Chinese handwritten text recognition model by using a BPTT algorithm with batch gradient descent to obtain a target Chinese handwritten text recognition model.

A computer device includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor. When the processor executes the computer-readable instructions, the following steps are implemented:

One or more non-volatile readable storage media storing computer-readable instructions, which when executed by one or more processors, cause the one or more processors to perform the following steps:

The embodiments of the present application further provide a text recognition method, an apparatus, a device, and a medium to solve the problem that the accuracy of current handwritten text recognition is low.

A text recognition method includes:

Obtaining Chinese text to be recognized, recognizing the Chinese text to be recognized by using a target Chinese handwritten text recognition model, and obtaining the output values of the Chinese text to be recognized in the target Chinese handwritten text recognition model, where the target Chinese handwritten text recognition model is obtained by using the handwriting model training method;

Selecting the maximum output value among the output values corresponding to the Chinese text to be recognized, and obtaining the recognition result of the Chinese text to be recognized according to the maximum output value.
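The selection of the maximum output value can be illustrated with a minimal sketch. It assumes that the output values for one character position are available as a mapping from candidate character to score; the function name `pick_recognition_result` and the sample scores are hypothetical and are not taken from the present application.

```python
def pick_recognition_result(output_values):
    """Return the candidate whose output value is the largest.

    output_values: dict mapping a candidate character to its output value
    (an assumed representation of the model output, for illustration only).
    """
    if not output_values:
        raise ValueError("no output values to choose from")
    return max(output_values, key=output_values.get)


# Hypothetical output values for a single handwritten character.
scores = {"你": 0.91, "仿": 0.06, "休": 0.03}
result = pick_recognition_result(scores)
print(result)
```

In practice the output values would come from the target Chinese handwritten text recognition model rather than from a hand-written dictionary.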

A text recognition apparatus includes:

An output value acquisition module, configured to obtain Chinese text to be recognized, recognize the Chinese text to be recognized by using a target Chinese handwritten text recognition model, and obtain the output values of the Chinese text to be recognized in the target Chinese handwritten text recognition model, where the target Chinese handwritten text recognition model is obtained by using the handwriting model training method;

A recognition result acquisition module, configured to select the maximum output value among the output values corresponding to the Chinese text to be recognized, and obtain the recognition result of the Chinese text to be recognized according to the maximum output value.

Obtaining Chinese text to be recognized, recognizing the Chinese text to be recognized by using a target Chinese handwritten text recognition model, and obtaining the output values of the Chinese text to be recognized in the target Chinese handwritten text recognition model, where the target Chinese handwritten text recognition model is obtained by using the above handwriting model training method;

Obtaining Chinese text to be recognized, recognizing the Chinese text to be recognized by using a target Chinese handwritten text recognition model, and obtaining the output values of the Chinese text to be recognized in the target Chinese handwritten text recognition model, where the target Chinese handwritten text recognition model is obtained by the above handwriting model training method;

Details of one or more embodiments of the present application are set forth in the accompanying drawings and description below, and other features and advantages of the present application will become apparent from the description, the drawings, and the claims.

Brief Description of the Drawings

In order to explain the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application. A person of ordinary skill in the art can obtain other drawings based on these drawings without creative effort.

FIG. 1 is an application environment diagram of a handwriting model training method according to an embodiment of the present application;

FIG. 2 is a flowchart of a handwriting model training method according to an embodiment of the present application;

FIG. 3 is a specific flowchart of step S10 in FIG. 2;

FIG. 4 is another specific flowchart of step S10 in FIG. 2;

FIG. 5 is a specific flowchart of step S30 in FIG. 2;

FIG. 6 is a schematic diagram of a handwriting model training apparatus according to an embodiment of the present application;

FIG. 7 is a flowchart of a text recognition method according to an embodiment of the present application;

FIG. 8 is a schematic diagram of a text recognition apparatus according to an embodiment of the present application;

FIG. 9 is a schematic diagram of a computer device in an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort shall fall within the protection scope of this application.

FIG. 1 illustrates the application environment of the handwriting model training method provided by an embodiment of the present application. The application environment includes a server and a client connected through a network. The client is a device that can interact with the user, including but not limited to devices such as computers, smartphones, and tablets. The server may be implemented as an independent server or as a server cluster consisting of multiple servers. The handwriting model training method provided in the embodiments of the present application is applied to the server.

FIG. 2 shows a flowchart of a handwriting model training method in an embodiment of the present application. As shown in FIG. 2, the handwriting model training method includes the following steps:

S10: Obtain standard Chinese text training samples, divide the standard Chinese text training samples into batches according to a preset batch, input the batched standard Chinese text training samples into a recurrent neural network, perform training based on a connectionist temporal classification (CTC) algorithm, and update the network parameters of the recurrent neural network by using a backpropagation through time (BPTT) algorithm to obtain a standard Chinese text recognition model.

Here, the standard Chinese text training samples are training samples obtained from standard text (for example, text in a Chinese typeface such as Kai, Song, or Lishu; Kai or Song is generally selected). A recurrent neural network (RNN) is a neural network that models sequence data. Chinese text consists of several characters arranged in order, so an RNN can better learn the deep features of Chinese text as a sequence. The connectionist temporal classification (CTC) algorithm is a fully end-to-end training algorithm (widely used for acoustic models). It does not require the training samples to be aligned in advance; an input sequence and an output sequence are all that is needed for training. In one embodiment, the batched standard Chinese text training samples are input into the recurrent neural network for training, and the weights and biases of the recurrent neural network are updated with the back-propagation algorithm using mini-batch gradient descent. Mini-batch gradient descent (MBGD) accumulates the errors generated during training according to the preset batches to obtain the accumulated error of each batch, and updates the parameters using the accumulated error of each batch. The backpropagation through time (BPTT) algorithm is a training method in neural network learning that is used to update and adjust the network parameters between nodes in the neural network. Adjusting the network parameters with the BPTT algorithm requires finding the minimum of the error function; in this embodiment, that minimum is found using mini-batch gradient descent.
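The mini-batch update can be illustrated with a minimal sketch. It is not the recurrent neural network of this embodiment: as a stand-in, it fits a single weight `w` of the toy model y = w * x, and the function names, learning rate, batch size, and data are all assumptions chosen for illustration. What it shows is the update pattern, in which the error gradient is accumulated over each preset batch and the parameter is updated once per batch.

```python
def make_batches(samples, batch_size):
    """Split the samples into consecutive batches of at most batch_size items."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]


def minibatch_gradient_descent(samples, batch_size=2, lr=0.05, epochs=200):
    """Fit y = w * x (toy model) with one parameter update per batch."""
    w = 0.0
    for _ in range(epochs):
        for batch in make_batches(samples, batch_size):
            # Accumulate the gradient of 0.5 * (w*x - y)^2 over the batch.
            grad = sum((w * x - y) * x for x, y in batch)
            # Update once per batch with the averaged accumulated gradient.
            w -= lr * grad / len(batch)
    return w


# Hypothetical data generated by y = 2 * x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w = minibatch_gradient_descent(data)
print(round(w, 3))
```

With a batch size equal to the number of samples, the same loop reduces to batch gradient descent; with a batch size of one, it reduces to stochastic gradient descent.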

In this embodiment, standard Chinese text training samples are obtained and divided into batches according to a preset batch. The typeface used in the standard Chinese text training samples is uniform (multiple typefaces are not mixed); for example, the standard Chinese text training samples used for model training are all in the Song typeface, and this embodiment takes the Song typeface as an example. Understandably, the typefaces that make up standard text are the mainstream Chinese typefaces in current use, such as Song, the default typeface in the input methods of computer devices, and Kai, the mainstream typeface commonly used for copying practice. Typefaces that are less common in daily life, such as cursive script and YouYuan, are not among the typefaces that make up standard text. After the standard Chinese text training samples are obtained and batched, the batched samples are input into a recurrent neural network and trained based on the connectionist temporal classification (CTC) algorithm, and the network parameters of the recurrent neural network are updated using the backpropagation through time (BPTT) algorithm (with mini-batch gradient descent) to obtain a standard Chinese text recognition model. During training, the standard Chinese text recognition model learns the deep features of the standard Chinese text training samples, so that the model can accurately recognize standard text. Moreover, training the standard Chinese text recognition model does not require manual labeling and data alignment of the standard Chinese text training samples; end-to-end training can be performed directly.
It should be noted that whichever typeface the standard Chinese text training samples use (Kai, Song, Lishu, or another Chinese typeface), standard text in these typefaces differs little as far as character recognition is concerned. The trained standard Chinese text recognition model can therefore accurately recognize standard text in typefaces such as Kai, Song, or Lishu and obtain relatively accurate recognition results.

S20: Obtain non-standard Chinese text training samples, divide the non-standard Chinese text training samples into batches according to the preset batch, input the batched non-standard Chinese text training samples into the standard Chinese text recognition model, perform training based on the connectionist temporal classification (CTC) algorithm, and update the network parameters of the standard Chinese text recognition model by using the backpropagation through time (BPTT) algorithm to obtain an adjusted Chinese handwritten text recognition model.

The non-standard Chinese text training samples are training samples obtained from handwritten Chinese text. The handwritten Chinese text may specifically be text handwritten in the style of a mainstream typeface such as Kai, Song, or Lishu. Understandably, the non-standard Chinese text training samples differ from the standard Chinese text training samples in that they are obtained from handwritten Chinese text; because the text is handwritten, it naturally contains a variety of different character forms.

In this embodiment, the server obtains non-standard Chinese text training samples and divides them into batches according to the preset batch. These training samples contain the features of handwritten Chinese text. The batched non-standard Chinese text training samples are input into the standard Chinese text recognition model, training and adjustment are performed based on the CTC algorithm, and the network parameters of the standard Chinese text recognition model are updated using the BPTT algorithm (with mini-batch gradient descent) to obtain the adjusted Chinese handwritten text recognition model. Understandably, the standard Chinese text recognition model is able to recognize standard Chinese text, but its accuracy is not high when recognizing handwritten Chinese text. This embodiment therefore trains with the non-standard Chinese text training samples, so that the standard Chinese text recognition model adjusts its network parameters on the basis of its existing ability to recognize standard text, yielding the adjusted Chinese handwritten text recognition model. The adjusted Chinese handwritten text recognition model learns the deep features of handwritten Chinese text on top of its original ability to recognize standard text, so that it combines the deep features of standard text and handwritten Chinese text and can effectively recognize both, obtaining recognition results with high accuracy.

In text recognition, the recurrent neural network makes its judgment based on the pixel distribution and sequence of the text. In real life, handwritten Chinese text differs from the corresponding standard text, but this difference is much smaller than the difference from standard text that does not correspond to it. For example, the pixel distribution of "hello" in handwritten Chinese text differs from that of "hello" in standard text, but this difference is clearly smaller than the difference between handwritten "hello" and "goodbye" in standard text. In other words, even though handwritten Chinese text differs somewhat from the corresponding standard text, it differs far more from non-corresponding standard text, so the recognition result can be determined by the principle of greatest similarity (that is, smallest difference). The adjusted Chinese handwritten text recognition model is obtained by training a recurrent neural network. The model combines the deep features of standard text and handwritten Chinese text and can effectively recognize handwritten Chinese text based on these deep features.

It should be noted that the order of steps S10 and S20 in this embodiment is not interchangeable: step S10 must be executed before step S20. Training the recurrent neural network with the standard Chinese text training samples first gives the resulting standard Chinese text recognition model good recognition ability and accurate recognition results for standard text. The fine-tuning of step S20 is then performed on this basis, so that the adjusted Chinese handwritten text recognition model obtained by training can effectively recognize handwritten Chinese text based on the deep features it has learned and produce more accurate recognition results for handwritten Chinese text. If step S20 were performed first, or if only step S20 were performed, then because handwritten Chinese text takes many different forms, the features learned by training directly on handwritten Chinese text would not reflect the characteristics of handwritten Chinese text well. The model would learn "badly" from the start, and it would be difficult to obtain accurate recognition results for handwritten Chinese text. Although each person's handwritten Chinese text is different, most of it resembles standard text (for example, handwriting that imitates standard text). Training the model on standard text first is therefore more consistent with the actual situation and more effective than training directly on handwritten Chinese text: corresponding adjustments can be made starting from a "good" model to obtain an adjusted Chinese handwritten text recognition model with a high recognition rate for handwritten Chinese text.
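The principle of greatest similarity (smallest difference) can be illustrated with a minimal sketch. It is not the recurrent neural network of this embodiment, which learns deep features rather than comparing raw pixels; it only shows, on hypothetical 3x3 binary pixel grids, that a handwritten sample is closer to its corresponding standard form than to a non-corresponding one. The function names and the grids are assumptions made for illustration.

```python
def pixel_difference(a, b):
    """Sum of absolute differences between two equally sized pixel grids."""
    return sum(
        abs(pa - pb)
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )


def most_similar(sample, templates):
    """Return the label of the template with the smallest pixel difference."""
    return min(templates, key=lambda label: pixel_difference(sample, templates[label]))


# Hypothetical standard forms drawn on a 3x3 grid (1 = ink, 0 = background).
standard_forms = {
    "十": [[0, 1, 0], [1, 1, 1], [0, 1, 0]],
    "口": [[1, 1, 1], [1, 0, 1], [1, 1, 1]],
}

# A hypothetical handwritten sample with a shortened horizontal stroke.
handwritten = [[0, 1, 0], [1, 1, 0], [0, 1, 0]]

print(most_similar(handwritten, standard_forms))
```

Here the handwritten sample differs from the corresponding standard form in one pixel and from the other form in six pixels, so the corresponding form is selected.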

S30: Obtain Chinese text samples to be tested, recognize the Chinese text samples to be tested by using the adjusted Chinese handwritten text recognition model, obtain error texts whose recognition results do not match the true results, and use all the error texts as error text training samples.

The Chinese text samples to be tested are samples obtained from standard text and handwritten Chinese text for testing. The standard text used in this step is the same as the standard text used for training in step S10 (because each character in a typeface such as Kai or Song is uniquely determined). The handwritten Chinese text used in this step may differ from that used for training in step S20: Chinese text handwritten by different people is not identical, and each character of handwritten Chinese text can correspond to multiple forms. To distinguish these samples from the non-standard Chinese text training samples used in step S20 and to avoid overfitting in model training, this step generally uses handwritten Chinese text different from that used in step S20.

In this embodiment, the trained adjusted Chinese handwritten text recognition model is used to recognize the Chinese text samples to be tested. In training, the standard text and the handwritten Chinese text may be input into the adjusted Chinese handwritten text recognition model in a mixed manner. When the adjusted Chinese handwritten text recognition model recognizes the Chinese text samples to be tested, the corresponding recognition results are obtained, and all error texts whose recognition results do not match the label values (true results) are used as the error text training samples. The error text training samples reflect the remaining shortfall in recognition accuracy of the adjusted Chinese handwritten text recognition model, so that the model can be further updated and optimized based on these samples.

The recognition accuracy of the adjusted Chinese handwritten text recognition model is affected by both the standard Chinese text training samples and the non-standard Chinese text training samples. Because the network parameters are first updated with the standard Chinese text training samples and then with the non-standard Chinese text training samples, the resulting adjusted Chinese handwritten text recognition model tends to over-learn the features of the non-standard Chinese text training samples. As a result, it has very high recognition accuracy on the non-standard Chinese text training samples themselves (including the handwritten Chinese text they contain), but this over-learning reduces its recognition accuracy on handwritten Chinese text outside those training samples. Step S30 therefore uses the Chinese text samples to be tested, recognized by the adjusted Chinese handwritten text recognition model, to largely eliminate the over-learning of the non-standard Chinese text training samples used during training. That is, recognizing the Chinese text samples to be tested with the adjusted Chinese handwritten text recognition model exposes the errors caused by over-learning. These errors are reflected concretely in the error texts, so the network parameters of the adjusted Chinese handwritten text recognition model can be further updated and optimized based on the error texts.
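The collection of error texts can be illustrated with a minimal sketch. It assumes each sample to be tested is an (image, label) pair and that the adjusted model is available as a callable; the function name `collect_error_texts`, the image identifiers, and the stand-in predictions are hypothetical.

```python
def collect_error_texts(samples, recognize):
    """Return the (image, label) pairs whose recognition result differs from the label.

    samples:   iterable of (image, label) pairs to be tested
    recognize: callable mapping an image to a recognition result
               (a stand-in for the adjusted model; assumed interface)
    """
    return [(image, label) for image, label in samples if recognize(image) != label]


# Stand-in for the adjusted model: a lookup table of hypothetical predictions.
fake_predictions = {"img_1": "你好", "img_2": "再贝", "img_3": "谢谢"}

test_samples = [("img_1", "你好"), ("img_2", "再见"), ("img_3", "谢谢")]
error_text_training_samples = collect_error_texts(test_samples, fake_predictions.get)
print(error_text_training_samples)
```

Only the mismatched pair is kept, paired with its true label, so that it can serve as a training sample in the subsequent update.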

S40: Input the error text training samples into the adjusted Chinese handwritten text recognition model, perform training based on the connectionist temporal classification (CTC) algorithm, and update the network parameters of the adjusted Chinese handwritten text recognition model by using a backpropagation through time (BPTT) algorithm with batch gradient descent to obtain a target Chinese handwritten text recognition model.

In this embodiment, the error text training samples are input into the adjusted Chinese handwritten text recognition model, and training is performed based on the CTC algorithm. The error text training samples reflect the inaccurate recognition that arises when the adjusted Chinese handwritten text recognition model, having over-learned the features of the non-standard Chinese text training samples during training, recognizes handwritten Chinese text outside those samples. In addition, training the model first with the standard Chinese text training samples and then with the non-standard Chinese text training samples excessively weakens the previously learned features of standard text, which affects the "framework" for recognizing standard text that the model initially established. Using the error text training samples addresses both the over-learning and the over-weakening: based on the recognition accuracy problems that the error text training samples reveal, the adverse effects of over-learning and over-weakening produced during the original training can be largely eliminated.

Specifically, the error text training samples are used for training with the BPTT algorithm with batch gradient descent, and the network parameters of the adjusted Chinese handwritten text recognition model are updated according to this algorithm to obtain the target Chinese handwritten text recognition model. The target Chinese handwritten text recognition model is the finally trained model that can be used to recognize Chinese handwritten text. The training uses a recurrent neural network, which can exploit the sequential nature of Chinese text to learn its deep features and improve the recognition rate of the target Chinese handwritten text recognition model. The training algorithm is the CTC algorithm. Training with this algorithm does not require manual labeling and data alignment of the training samples, which reduces model complexity and allows unaligned, variable-length sequences to be trained directly. When the network parameters are updated, the error text training samples are few in number (there are relatively few error texts). With batch gradient descent, all errors produced by the error text training samples during training of the recurrent neural network are back-propagated, which ensures that every error produced contributes to adjusting and updating the network, trains the recurrent neural network thoroughly, and improves the recognition accuracy of the target Chinese handwritten text recognition model.
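Batch gradient descent on a small sample set can be illustrated with a minimal sketch. As a stand-in for the recurrent neural network, it fits a single weight `w` of the toy model y = w * x; the function name, learning rate, and data are assumptions chosen for illustration. Each update uses the errors of all samples, which is practical here because the error text training samples are few.

```python
def batch_gradient_descent(samples, lr=0.05, epochs=500):
    """Fit y = w * x (toy model) using every sample in each parameter update."""
    w = 0.0
    for _ in range(epochs):
        # Gradient of the mean of 0.5 * (w*x - y)^2 over ALL samples.
        grad = sum((w * x - y) * x for x, y in samples) / len(samples)
        # One update per pass over the full sample set.
        w -= lr * grad
    return w


# Hypothetical small sample set generated by y = 3 * x.
error_samples = [(1.0, 3.0), (2.0, 6.0)]
w_target = batch_gradient_descent(error_samples)
print(round(w_target, 3))
```

The difference from a mini-batch scheme is only in how many samples contribute to each update: here every sample contributes to every update.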

In steps S10 to S40, the standard Chinese text training samples are used to train and obtain the standard Chinese text recognition model, and the standard Chinese text recognition model is then adjusted and updated using the batched non-standard Chinese text training samples. The resulting adjusted Chinese handwritten text recognition model, which already has the ability to recognize standard text, learns the deep features of handwritten Chinese text through this training and updating, so that it can better recognize handwritten Chinese text. The adjusted Chinese handwritten text recognition model is then used to recognize the Chinese text samples to be tested, the error texts whose recognition results do not match the true results are obtained, and all the error texts are input as error text training samples into the adjusted Chinese handwritten text recognition model, which is trained and updated based on the connectionist temporal classification (CTC) algorithm to obtain the target Chinese handwritten text recognition model. Using the error text training samples can largely eliminate the adverse effects of over-learning and over-weakening produced during the original training and further improve recognition accuracy.

The standard Chinese text recognition model and the adjusted Chinese handwritten text recognition model are trained with the backpropagation through time (BPTT) algorithm with mini-batch gradient descent, which retains good training efficiency and training results even with a large number of training samples. Compared with the error from a single training sample, the error of a mini-batch is also representative of the whole within a certain range, which makes it easier to find the minimum of the error function. The target Chinese handwritten text recognition model is trained with the BPTT algorithm with batch gradient descent. Batch gradient descent ensures that the parameters of the model are fully updated: all errors produced by the training samples during training are back-propagated, and the parameters are updated comprehensively according to those errors, improving the recognition accuracy of the resulting model. Each model is trained with a recurrent neural network, which can exploit the sequential nature of Chinese text to learn its deep features and recognize different handwritten Chinese text. Each model is trained with the CTC algorithm. Training with this algorithm does not require manual labeling and data alignment of the training samples, which reduces model complexity and allows unaligned, variable-length sequences to be trained directly.
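Why this training algorithm needs no prior alignment can be illustrated with a minimal sketch of the collapse rule used in connectionist temporal classification: a frame-level path is mapped to a label sequence by merging consecutive repeats and then removing a special blank symbol, so many different frame-level paths correspond to the same label sequence. The present application does not spell out this rule; the blank symbol `-` and the function name `ctc_collapse` are illustrative choices.

```python
BLANK = "-"  # illustrative blank symbol, not taken from the application


def ctc_collapse(path, blank=BLANK):
    """Merge consecutive repeated symbols, then drop blanks."""
    out = []
    prev = None
    for symbol in path:
        # Keep a symbol only when it starts a new run and is not the blank.
        if symbol != prev and symbol != blank:
            out.append(symbol)
        prev = symbol
    return "".join(out)


# Two different frame-level paths that collapse to the same label sequence.
print(ctc_collapse("--你你-好好-"))
print(ctc_collapse("你-----好好"))
```

A blank between two identical symbols keeps them distinct, which is how repeated characters in the label sequence are represented.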

In an embodiment, as shown in FIG. 3, in step S10, obtaining the standard Chinese text training samples and batching the standard Chinese text training samples according to a preset batch specifically includes the following steps:

S101: Obtain a pixel value feature matrix of each Chinese text in the Chinese text training samples to be processed, and normalize each pixel value in the pixel value feature matrix of each Chinese text to obtain a normalized pixel value feature matrix of each Chinese text, where the normalization formula is

$$y = \frac{x - MinValue}{MaxValue - MinValue}$$

MaxValue is the maximum pixel value in the pixel value feature matrix, MinValue is the minimum pixel value in the pixel value feature matrix, x is the pixel value before normalization, and y is the pixel value after normalization.

Here, the Chinese text training samples to be processed refer to the training samples as initially acquired, before any processing.

In this embodiment, a mature, open-source convolutional neural network may be used to extract the features of the Chinese text training samples to be processed and obtain the pixel value feature matrix of each Chinese text in them. The pixel value feature matrix of each Chinese text represents the features of the corresponding text; here, the pixel values represent the features of the text. Since the text is represented two-dimensionally by an image, the pixel values can be represented by a matrix, that is, the pixel value feature matrix. The computer device can recognize the form of the pixel value feature matrix and read the values in it. After the server obtains the pixel value feature matrix of each Chinese text, it uses the normalization formula to normalize each pixel value in the feature matrix and obtain the normalized pixel value feature matrix of each Chinese text. In this embodiment, normalization compresses the pixel value feature matrix of each Chinese text into the same range, which speeds up the calculations involving the pixel value feature matrix and helps improve the efficiency of training the standard Chinese text recognition model.
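The normalization in step S101 can be illustrated with a minimal sketch. It assumes the pixel value feature matrix is a plain list of rows; the function name `normalize` and the handling of a constant image (mapped to all zeros) are assumptions made for the illustration, not part of the described method.

```python
def normalize(matrix):
    """Min-max normalize a 2-D list of pixel values into [0, 1]."""
    flat = [v for row in matrix for v in row]
    min_value, max_value = min(flat), max(flat)
    span = max_value - min_value
    if span == 0:
        # Assumption: a constant image is mapped to zeros to avoid dividing by zero.
        return [[0.0 for _ in row] for row in matrix]
    # y = (x - MinValue) / (MaxValue - MinValue)
    return [[(v - min_value) / span for v in row] for row in matrix]


pixels = [[0, 51], [102, 255]]
print(normalize(pixels))
```

After this step every matrix lies in the same [0, 1] range regardless of the brightness or contrast of the original image.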

S102: Divide the pixel values in the normalized pixel value feature matrix of each Chinese text into two types of pixel values, build a binarized pixel value feature matrix of each Chinese text based on the two types of pixel values, combine the Chinese texts corresponding to the binarized pixel value feature matrices as the standard Chinese text training samples, and batch the standard Chinese text training samples according to a preset batch.

In this embodiment, the pixel values in the normalized pixel value feature matrix of each Chinese text are divided into two types of pixel values, meaning that the matrix contains only a pixel value A or a pixel value B. Specifically, a pixel value greater than or equal to 0.5 in the normalized pixel value feature matrix can be taken as 1 and a pixel value less than 0.5 can be taken as 0, which establishes the corresponding binarized pixel value feature matrix of each Chinese text, whose elements are only 0 or 1. After the binarized pixel value feature matrix of each Chinese text is established, the Chinese texts corresponding to the binarized pixel value feature matrices are combined as the standard Chinese text training samples, and the standard Chinese text training samples are batched according to a preset batch. For example, an image containing text has a portion of text pixels and a portion of blank pixels, and the pixels on the text are generally darker. The "1" in the binarized pixel value feature matrix represents the text pixels, and the "0" represents the blank pixels of the image. Understandably, establishing the binarized pixel value feature matrix further simplifies the feature representation of the text: each text can be represented and distinguished using only a matrix of 0s and 1s, which increases the speed at which the computer processes the feature matrix of the text and further improves the efficiency of training the standard Chinese text recognition model.

In steps S101-S102, the Chinese text training samples to be processed are normalized and divided into two types of values to obtain the binarized pixel value feature matrix of each Chinese text, and the texts corresponding to the binarized pixel value feature matrices are used as the standard Chinese text training samples, which can significantly shorten the time needed to train the standard Chinese text recognition model.
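Step S102 can be sketched as follows, using the 0.5 threshold given in the text. The function names `binarize` and `make_batches` and the batch size of 2 are hypothetical choices for the illustration; the text only says that the samples are divided according to a preset batch.

```python
def binarize(normalized, threshold=0.5):
    """Map values >= threshold to 1 and all other values to 0."""
    return [[1 if v >= threshold else 0 for v in row] for row in normalized]


def make_batches(samples, batch_size):
    """Split the training samples into consecutive batches of a preset size."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]


normalized = [[0.0, 0.2], [0.5, 1.0]]
print(binarize(normalized))                    # [[0, 0], [1, 1]]
print(make_batches(["s1", "s2", "s3", "s4", "s5"], 2))
```

The last batch may be smaller than the preset size when the number of samples is not a multiple of it.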

In an embodiment, as shown in FIG. 4, in step S10, inputting the batched standard Chinese text training samples into the recurrent neural network, training based on the continuous time classification algorithm, and updating the network parameters of the recurrent neural network by using the time-dependent back-propagation algorithm to obtain the standard Chinese text recognition model includes the following steps:

S111: Input the batched standard Chinese text training samples into the recurrent neural network and train based on the continuous time classification algorithm to obtain the forward propagation output and the backward propagation output of the batched standard Chinese text training samples in the recurrent neural network. The forward propagation output is expressed as

$$\alpha(t,u) = y^{t}_{l'_u} \sum_{i=f(u)}^{u} \alpha(t-1,i)$$

where t represents the sequence step, u represents the index of the label value of the output corresponding to t, $y^{t}_{l'_u}$ represents the probability that the output at step t is the label value $l'_u$, and f(u) is the lower limit of the summation over positions at the previous step.

The backward propagation output is expressed as

$$\beta(t,u) = \sum_{i=u}^{g(u)} \beta(t+1,i)\, y^{t+1}_{l'_i}$$

where $y^{t+1}_{l'_i}$ represents the probability that the output at step t + 1 is the label value $l'_i$, and g(u) is the upper limit of the summation over positions at the next step.

In this embodiment, the batched standard Chinese text training samples are input into the recurrent neural network, and training is performed based on the continuous time classification (CTC) algorithm, which is more commonly known as connectionist temporal classification. The CTC algorithm is essentially an algorithm for calculating an error function, which measures the error between the output obtained after the input sequence data passes through the neural network and the real result (the objective fact, also called the label value). The corresponding error function can therefore be constructed by obtaining the forward propagation output and the backward propagation output of the batched standard Chinese text training samples in the recurrent neural network and then describing the error function in terms of them. Some basic definitions in CTC are first briefly introduced to make the implementation of CTC easier to understand.

$y^{t}_{k}$ represents the probability that the output at step t is the label value k. For example, when the output sequence is (a-ab-), $y^{3}_{a}$ represents the probability that the letter output at step 3 is a, the letter a being the label value corresponding to the third step. p(π|x) represents the probability that the output path is π given the input sequence x. Since the probabilities of the label values output at different sequence steps are independent of each other, p(π|x) is expressed by the formula

$$p(\pi|x) = \prod_{t=1}^{T} y^{t}_{\pi_t}$$

which can be understood as the product, over the sequence steps, of the probability of the label value that the output path π takes at each step. F represents a many-to-one mapping, a transformation that maps the output path π to the label sequence l, for example F(a-ab-) = F(-aa-abb) = aab (where - represents a blank, also called a space). In this embodiment, the mapping may consist of removing repeated characters and removing blanks as in the above example. p(l|x) represents the probability that the output is the sequence l given the input sequence x (such as a sample in the standard Chinese text training samples). The probability that the output is the sequence l can therefore be expressed as the sum of the probabilities of all output paths π that are mapped to l, expressed by the formula:

$$p(l|x) = \sum_{\pi \in F^{-1}(l)} p(\pi|x)$$

Understandably, as the length of the sequence l increases, the number of corresponding paths grows exponentially. An iterative approach can therefore be adopted: the path probability corresponding to the sequence l is calculated recursively, relating step t to steps t-1 and t+1 by forward propagation and backward propagation, which improves the efficiency of the calculation.
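To make the definitions of F and p(l|x) concrete, the following minimal sketch enumerates every path and sums the probabilities of those that F maps to l. It is a brute-force illustration only, which is exactly the exponential computation the recursive approach avoids. The names `collapse` and `brute_force_prob`, the blank symbol "-", and the uniform example probabilities are assumptions made for the illustration.

```python
from itertools import product

BLANK = "-"  # hypothetical blank symbol, as in the example F(a-ab-) = aab


def collapse(path):
    """Mapping F: merge adjacent repeated symbols, then drop blanks."""
    out, prev = [], None
    for symbol in path:
        if symbol != prev and symbol != BLANK:
            out.append(symbol)
        prev = symbol
    return "".join(out)


def brute_force_prob(y, alphabet, label):
    """p(l|x): sum of p(pi|x) over every path pi of length T with F(pi) == label."""
    total = 0.0
    for path in product(alphabet, repeat=len(y)):
        if collapse(path) == label:
            p = 1.0
            for t, symbol in enumerate(path):
                p *= y[t][symbol]  # product of per-step probabilities
            total += p
    return total


# y[t][k] is the probability of label k at step t (uniform here for simplicity).
y = [{"-": 1 / 3, "a": 1 / 3, "b": 1 / 3} for _ in range(3)]
print(collapse("a-ab-"), collapse("-aa-abb"))
print(brute_force_prob(y, ["-", "a", "b"], "ab"))
```

With three steps and three symbols there are 27 paths, of which five (ab-, a-b, -ab, aab, abb) map to "ab", so the result is 5/27.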

Specifically, before the calculation, the sequence l needs some preprocessing: blanks are added at the beginning and end of the sequence l and between every pair of adjacent letters. If the length of the original sequence l is U, the length of the preprocessed sequence l′ is U′ = 2U + 1. For a sequence l, the forward variable α(t, u) can be defined as the sum of the probabilities of all paths of length t that are mapped by F to the first u/2 labels of l and whose output at step t is $l'_u$, expressed by the formula:

$$\alpha(t,u) = \sum_{\pi \in V(t,u)} \prod_{i=1}^{t} y^{i}_{\pi_i}$$

where $V(t,u) = \{\pi \in A'^{t} : F(\pi) = l_{1:u/2},\ \pi_t = l'_u\}$ is the set of all paths of length t that are mapped by F to $l_{1:u/2}$ and whose output at step t is $l'_u$; here u/2 is an index and is therefore rounded down. All correct paths must begin with a blank or with $l_1$ (that is, the first letter of the sequence l), so the initialization constraints are:

$$\alpha(1,1) = y^{1}_{b}, \qquad \alpha(1,2) = y^{1}_{l_1}, \qquad \alpha(1,u) = 0 \quad \forall u > 2$$

(b represents the blank, or space).

Then p(l|x) can be represented by the forward variables, that is, p(l|x) = α(T, U′) + α(T, U′-1), where α(T, U′) and α(T, U′-1) can be understood as the total probability of all paths of length T that are mapped by F to the sequence l and whose output at time T is $l'_{U'}$ or $l'_{U'-1}$ respectively, that is, according to whether or not the path ends with a blank. The forward variable can therefore be computed recursively over time, expressed by the formula:

$$\alpha(t,u) = y^{t}_{l'_u} \sum_{i=f(u)}^{u} \alpha(t-1,i)$$

where f(u) determines the positions at the previous moment from which position u can be reached, and is given by:

$$f(u) = \begin{cases} u-1, & \text{if } l'_u = b \text{ or } l'_{u-2} = l'_u \\ u-2, & \text{otherwise} \end{cases}$$
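The forward recursion can be sketched in a few lines. The sketch assumes the usual CTC transition rule: position u of l′ is reached from position u or u-1, and additionally from u-2 when $l'_u$ is not a blank and differs from $l'_{u-2}$. The function names, the 0-based indexing, and the example probabilities are assumptions made for the illustration.

```python
BLANK = "-"  # hypothetical blank symbol


def extend(label):
    """Build l': blanks at both ends and between labels (length 2U + 1)."""
    ext = [BLANK]
    for ch in label:
        ext += [ch, BLANK]
    return ext


def forward(y, label):
    """Return alpha[t][u] for all steps t and positions u (0-based indices)."""
    ext = extend(label)
    T, U = len(y), len(ext)
    alpha = [[0.0] * U for _ in range(T)]
    # Initialization: a correct path starts with a blank or with the first label.
    alpha[0][0] = y[0][BLANK]
    if U > 1:
        alpha[0][1] = y[0][ext[1]]
    for t in range(1, T):
        for u in range(U):
            # f(u): lowest position at step t-1 from which u can be reached.
            if ext[u] == BLANK or (u >= 2 and ext[u - 2] == ext[u]):
                start = u - 1
            else:
                start = u - 2
            start = max(start, 0)
            alpha[t][u] = y[t][ext[u]] * sum(alpha[t - 1][start:u + 1])
    return alpha


def label_prob(y, label):
    """p(l|x) = alpha(T, U') + alpha(T, U'-1)."""
    last = forward(y, label)[-1]
    return last[-1] + (last[-2] if len(last) > 1 else 0.0)


y = [{"-": 1 / 3, "a": 1 / 3, "b": 1 / 3} for _ in range(3)]
print(label_prob(y, "ab"))  # equals 5/27 for this uniform example
```

The cost is proportional to T times U′, instead of growing exponentially with the sequence length as direct enumeration of paths does.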

Similar to the process of forward propagation, a backward variable β(t, u) can be defined as the sum of the probabilities of all paths π starting from time t + 1 that, when appended to any path π′ counted in the forward variable α(t, u), are finally mapped by F to the sequence l, expressed by the formula:

$$\beta(t,u) = \sum_{\pi \in W(t,u)} \prod_{i=1}^{T-t} y^{t+i}_{\pi_i}$$

where $W(t,u) = \{\pi \in A'^{T-t} : F(\pi' + \pi) = l,\ \forall \pi' \in V(t,u)\}$.

Backward propagation has the corresponding initialization conditions: β(T, U′) = β(T, U′-1) = 1, and β(T, u) = 0 for all u < U′-1.

Therefore, the backward variable can also be obtained recursively, expressed by the formula:

$$\beta(t,u) = \sum_{i=u}^{g(u)} \beta(t+1,i)\, y^{t+1}_{l'_i}$$

where g(u) represents the path selection function at time t + 1, which is expressed as

$$g(u) = \begin{cases} u+1, & \text{if } l'_u = b \text{ or } l'_{u+2} = l'_u \\ u+2, & \text{otherwise} \end{cases}$$

According to the recursive expression of the forward variable and the recursive expression of the backward variable, the processes of forward propagation and backward propagation can be described, and the corresponding forward propagation output and backward propagation output can be obtained (the recursive expression of the forward variable is the forward propagation output, and the recursive expression of the backward variable is the backward propagation output).

S112: Construct an error function according to the forward propagation output and the backward propagation output.

In an embodiment, an error function may be constructed based on the forward propagation output and the backward propagation output. Specifically, the negative logarithm of the probability may be used as the error function. Let l = z; then the error function can be expressed as

$$L(S) = -\sum_{(x,z) \in S} \ln p(z|x)$$

where S represents the standard Chinese text training samples. p(z|x) in this formula can be calculated from the forward propagation output and the backward propagation output. First define a set X, which represents all the correct paths that pass through position u at time t, expressed by the formula $X(t,u) = \{\pi \in A'^{T} : F(\pi) = z,\ \pi_t = z'_u\}$. The product of the forward and backward variables at any time then gives the sum of the probabilities of all these paths, that is,

$$\alpha(t,u)\,\beta(t,u) = \sum_{\pi \in X(t,u)} p(\pi|x)$$

This formula is the sum of the probabilities of all the correct paths that pass through position u at time t. In general, for any time t, summing over the correct paths of all positions gives the total probability:

$$p(z|x) = \sum_{u=1}^{|z'|} \alpha(t,u)\,\beta(t,u)$$

According to its definition, the error function is then obtained as

$$L(S) = -\sum_{(x,z) \in S} \ln \sum_{u=1}^{|z'|} \alpha(t,u)\,\beta(t,u)$$

After obtaining the error function, the network parameters can be updated according to the error function to obtain a standard Chinese text recognition model.
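The construction of the error function from the forward and backward variables can be checked numerically with the sketch below. It assumes the usual CTC transition rules for f(u) and g(u) and computes the error for a single sample (x, z) as the negative logarithm of the sum over u of α(t, u)β(t, u); the function names, the 0-based indexing, and the example probabilities are assumptions made for the illustration.

```python
import math

BLANK = "-"  # hypothetical blank symbol


def extend(label):
    ext = [BLANK]
    for ch in label:
        ext += [ch, BLANK]
    return ext


def forward(y, ext):
    T, U = len(y), len(ext)
    alpha = [[0.0] * U for _ in range(T)]
    alpha[0][0] = y[0][BLANK]
    if U > 1:
        alpha[0][1] = y[0][ext[1]]
    for t in range(1, T):
        for u in range(U):
            near = ext[u] == BLANK or (u >= 2 and ext[u - 2] == ext[u])
            start = max(u - 1 if near else u - 2, 0)
            alpha[t][u] = y[t][ext[u]] * sum(alpha[t - 1][start:u + 1])
    return alpha


def backward(y, ext):
    T, U = len(y), len(ext)
    beta = [[0.0] * U for _ in range(T)]
    # Initialization: a correct path ends on the last label or the final blank.
    beta[T - 1][U - 1] = 1.0
    if U > 1:
        beta[T - 1][U - 2] = 1.0
    for t in range(T - 2, -1, -1):
        for u in range(U):
            near = ext[u] == BLANK or (u + 2 < U and ext[u + 2] == ext[u])
            end = min(u + 1 if near else u + 2, U - 1)  # g(u), clipped
            beta[t][u] = sum(beta[t + 1][i] * y[t + 1][ext[i]]
                             for i in range(u, end + 1))
    return beta


def ctc_loss(y, label, t=0):
    """-ln p(z|x), with p(z|x) = sum_u alpha(t, u) * beta(t, u) at any step t."""
    ext = extend(label)
    alpha, beta = forward(y, ext), backward(y, ext)
    return -math.log(sum(a * b for a, b in zip(alpha[t], beta[t])))


y = [{"-": 1 / 3, "a": 1 / 3, "b": 1 / 3} for _ in range(3)]
print([ctc_loss(y, "ab", t) for t in range(3)])  # the same value at every t
```

Because the sum over positions gives p(z|x) at any step, the printed values are identical, here -ln(5/27), about 1.686.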

S113: According to the error function, a time-dependent back-propagation algorithm is used to update the network parameters of the recurrent neural network to obtain a standardized Chinese text recognition model.

In one embodiment, according to the obtained error function

$$L(S) = -\sum_{(x,z) \in S} \ln \sum_{u=1}^{|z'|} \alpha(t,u)\,\beta(t,u)$$

a time-dependent back-propagation algorithm (based on mini-batch gradients) can be used to update the network parameters. Specifically, the partial derivative of the error function with respect to the network output before the softmax layer (that is, the gradient) is obtained, and the network parameters are updated by subtracting the product of the gradient and the learning rate from the original network parameters.
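The update rule itself (parameter minus learning rate times gradient) can be sketched as follows. The sketch assumes the per-sample gradients have already been obtained by back-propagation and that the mini-batch gradient is their average; averaging, the function name `sgd_step`, and the flat parameter list are assumptions made for the illustration.

```python
def sgd_step(params, batch_grads, learning_rate):
    """One mini-batch update: params - learning_rate * mean gradient."""
    n = len(batch_grads)
    mean_grad = [sum(g[i] for g in batch_grads) / n for i in range(len(params))]
    return [p - learning_rate * g for p, g in zip(params, mean_grad)]


params = [0.5, -0.2]
batch_grads = [[0.1, 0.3], [0.3, -0.1]]  # one gradient per sample in the batch
print(sgd_step(params, batch_grads, learning_rate=0.1))
```

Passing the gradients of all training samples at once corresponds to batch gradient descent, while passing one preset batch at a time corresponds to the mini-batch variant.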

In steps S111-S113, an error function can be constructed from the forward propagation output and the backward propagation output of the standard Chinese text training samples in the recurrent neural network:

$$L(S) = -\sum_{(x,z) \in S} \ln \sum_{u=1}^{|z'|} \alpha(t,u)\,\beta(t,u)$$

Based on this error function, the error is back-propagated and the network parameters are updated to obtain the standard Chinese text recognition model. The model learns the deep features of the standard Chinese text training samples and can accurately recognize standard text.

In an embodiment, as shown in FIG. 5, in step S30, using the adjusted Chinese handwritten text recognition model to recognize the Chinese text samples to be tested, obtaining the error texts whose recognition results do not match the real results, and using all the error texts as the error text training samples includes the following steps:

S31: Input the Chinese text sample to be tested into the adjusted Chinese handwritten text recognition model, and obtain the output value of each text in the Chinese text sample to be tested in the adjusted Chinese handwritten text recognition model.

In this embodiment, the adjusted Chinese handwritten text recognition model is used to recognize the Chinese text samples to be tested, which include several Chinese texts. Each text consists of characters, and the output value of each text mentioned in this embodiment specifically refers to the output values corresponding to each character in that text. The Chinese character library contains more than 3,000 commonly used Chinese characters (including the blank and various Chinese punctuation marks). The output layer of the adjusted Chinese handwritten text recognition model should provide, for each character in the Chinese character library, a probability value of its similarity to the character in the input Chinese text sample to be tested, which can be implemented with the softmax function. Understandably, suppose a text sample in the Chinese text samples to be tested is an image with a resolution of 8 * 8 that contains the three characters "你们好" ("hello"). During recognition, the image is cut vertically into 8 columns, and the resulting 8 three-dimensional vectors are used as the 8 inputs of the adjusted Chinese handwritten text recognition model. The number of outputs of the adjusted Chinese handwritten text recognition model should equal the number of inputs, but the text sample actually contains only 3 characters rather than 8, so the actual output contains repeated characters and blanks, for example "你_们_们_好_" or "_你你_们_好_" (where _ represents a blank). For each of these 8 outputs, there is a probability value of similarity between the corresponding character and each character in the Chinese character library. These probability values are the output values of each text of the Chinese text samples to be tested in the adjusted Chinese handwritten text recognition model. There are many output values, each being the probability value of the similarity between the character corresponding to an output and one character in the Chinese character library, and the recognition result of each text can be determined according to these probability values.

S32: Select the maximum output value among the output values corresponding to each text, and obtain the recognition result of each text according to the maximum output value.

In this embodiment, the maximum output value among all the output values corresponding to each text is selected, and the recognition result of the text can be obtained according to the maximum output value. Understandably, the output value directly reflects the similarity between a character in the input Chinese text sample and each character in the Chinese character library, and the maximum output value indicates which character in the Chinese character library the character in the text sample to be tested is closest to. The actual output can therefore be determined from the characters corresponding to the maximum output values, so that the actual output is, for example, "你_们_们_好_" or "_你你_们_好_" rather than an output containing a similar-looking but wrong character, such as "_你你_扪_好_". According to the definition of the continuous time classification algorithm, the actual output needs further processing: repeated characters in the actual output are removed so that only one remains, and the blanks are removed, which gives the recognition result; for example, the recognition result in this embodiment is "你们好" ("hello"). The correctness of the characters in the actual output is determined by the maximum output value, and the removal of repeated characters and blanks then effectively yields the recognition result of each text.
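Selecting the maximum output value at each step and then removing repeated characters and blanks can be sketched as below. The sketch follows the standard CTC convention of merging only adjacent repeats before dropping blanks, which is an assumption about how the removal is carried out; the names `greedy_decode` and `peaked`, the four-symbol vocabulary, and the 0.9 peak probability are hypothetical and stand in for the softmax outputs over the full character library.

```python
BLANK = "_"  # hypothetical blank symbol


def greedy_decode(step_probs):
    """Pick the most probable symbol per step, merge adjacent repeats, drop blanks."""
    best = [max(probs, key=probs.get) for probs in step_probs]
    out, prev = [], None
    for symbol in best:
        if symbol != prev and symbol != BLANK:
            out.append(symbol)
        prev = symbol
    return "".join(out)


def peaked(symbol, vocab, high=0.9):
    """Toy stand-in for a softmax output that favors one symbol."""
    low = (1.0 - high) / (len(vocab) - 1)
    return {s: (high if s == symbol else low) for s in vocab}


vocab = ["_", "你", "们", "好"]
steps = [peaked(s, vocab) for s in "_你你_们_好_"]  # 8 outputs for 8 input columns
print(greedy_decode(steps))  # 你们好
```

A repeated character separated by a blank is kept twice under this convention, which is how a text containing the same character twice in a row can still be recognized.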

S33: According to the recognition results, obtain error texts whose recognition results do not match the real results, and use all the error texts as training samples of the error texts.

In this embodiment, the obtained recognition result is compared with the real result (the objective fact), and each error text whose recognition result does not match the real result is used as an error text training sample. Understandably, the recognition result is only the result produced by the adjusted Chinese handwritten text recognition model for the Chinese text samples to be tested, and it may differ from the real result. Such differences show that the model still has shortcomings in recognition accuracy, and these shortcomings can be reduced by training with the error text training samples to achieve more accurate recognition.

In steps S31-S33, according to the output values of each text of the Chinese text samples to be tested in the adjusted Chinese handwritten text recognition model, the maximum output value, which reflects the degree of similarity between texts (in fact, between characters), is selected from the output values; the recognition result is then obtained from the maximum output value, and the error text training samples are obtained according to the recognition result, which provides an important technical premise for subsequently using the error text training samples to further optimize the recognition accuracy.
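The selection of error text training samples in step S33 amounts to keeping every tested sample whose recognition result differs from its real result. A minimal sketch follows; `collect_error_samples`, the `recognize` callable, and the toy lookup table standing in for the adjusted model are hypothetical names introduced for the illustration.

```python
def collect_error_samples(samples, recognize):
    """Return the (image, real_text) pairs that the model recognizes incorrectly.

    samples:   list of (image, real_text) pairs to be tested
    recognize: callable mapping an image to the recognized text
    """
    return [(image, real) for image, real in samples if recognize(image) != real]


# Toy stand-in for the adjusted model: a lookup table of recognition results.
fake_results = {"img1": "你们好", "img2": "你扪好", "img3": "谢谢"}
samples = [("img1", "你们好"), ("img2", "你们好"), ("img3", "谢谢")]
print(collect_error_samples(samples, fake_results.get))
```

Only the second sample is returned here, because its recognition result contains a wrong character.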

In one embodiment, before step S10, that is, before the step of obtaining training samples of standard Chinese text, the handwriting model training method further includes the following steps: initializing a recurrent neural network.

In one embodiment, initializing the recurrent neural network means initializing the network parameters, that is, assigning initial values to them. If the initial weights lie in a relatively flat region of the error surface, the training of the recurrent neural network model may converge abnormally slowly. The network parameters can be initialized to be uniformly distributed with zero mean in a relatively small interval, such as [-0.30, +0.30]. Reasonable initialization gives the network more flexibility in the initial stage, allows it to be adjusted effectively during training, and allows the minimum of the error function to be found quickly and effectively, which facilitates the updating and adjustment of the recurrent neural network, so that the model obtained by training based on the recurrent neural network recognizes Chinese handwriting accurately.
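The initialization described above can be sketched with the standard library alone. The interval [-0.30, +0.30] is taken from the text; the function name `init_weights`, the matrix shape, and the fixed seed are assumptions made so that the example is reproducible.

```python
import random


def init_weights(rows, cols, limit=0.30, seed=0):
    """Draw each weight uniformly from [-limit, +limit], a zero-mean interval."""
    rng = random.Random(seed)  # fixed seed only for reproducibility of the sketch
    return [[rng.uniform(-limit, limit) for _ in range(cols)] for _ in range(rows)]


weights = init_weights(3, 4)
print(min(min(row) for row in weights), max(max(row) for row in weights))
```

Every weight falls inside the chosen interval, and because the interval is symmetric about zero the expected value of each weight is zero.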

In the handwriting model training method provided in this embodiment, the network parameters of the recurrent neural network are initialized to be uniformly distributed with zero mean in a relatively small interval, such as [-0.30, +0.30], which allows the minimum of the error function to be found quickly and effectively and facilitates the updating and adjustment of the recurrent neural network. The Chinese text training samples to be processed are normalized and divided into two types of values to obtain the binarized pixel value feature matrix of each Chinese text, and the texts corresponding to the binarized pixel value feature matrices are used as the standard Chinese text training samples, which can significantly shorten the time needed to train the standard Chinese text recognition model. An error function is constructed from the forward propagation output and the backward propagation output of the batched standard Chinese text training samples in the recurrent neural network:

$$L(S) = -\sum_{(x,z) \in S} \ln \sum_{u=1}^{|z'|} \alpha(t,u)\,\beta(t,u)$$

Based on this error function, the network parameters are updated to obtain the standard Chinese text recognition model, which learns the deep features of the standard Chinese text training samples and can accurately recognize standard text. The standard Chinese text recognition model is then adjusted and updated with the batched non-standard Chinese text training samples, so that the adjusted Chinese handwritten text recognition model obtained after the update, on the premise that it can already recognize standard Chinese text, learns the deep features of non-standard Chinese text through training and updating and can better recognize non-standard Chinese handwritten text. Next, according to the output values of each text of the Chinese text samples to be tested in the adjusted Chinese handwritten text recognition model, the maximum output value, which reflects the degree of similarity between texts, is selected from the output values and used to obtain the recognition result; the error text training samples are obtained according to the recognition result, all the error texts are input as error text training samples into the adjusted Chinese handwritten text recognition model, and training and updating are performed based on the continuous time classification algorithm to obtain the target Chinese handwritten text recognition model. Using the error text training samples can largely eliminate the adverse effects of over-learning and over-weakening produced during the original training process and can further optimize the recognition accuracy. In addition, in the handwriting model training method provided in this embodiment, the standard Chinese text recognition model and the adjusted Chinese handwritten text recognition model are trained with a back-propagation algorithm based on mini-batch gradients (the training samples being batched according to a preset batch), which retains good training efficiency and training effect when the number of training samples is large. The target Chinese handwritten text recognition model is trained with a time-dependent back-propagation algorithm using batch gradient descent, which ensures that the parameters of the model are fully updated: the errors generated by the training samples during training are back-propagated, and the parameters are updated comprehensively according to those errors, which improves the recognition accuracy of the obtained model.

It should be understood that the size of the sequence numbers of the steps in the above embodiments does not mean the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application.

FIG. 6 shows a principle block diagram of a handwriting model training device corresponding one-to-one to the handwriting model training method in the embodiment. As shown in FIG. 6, the handwriting model training device includes a standard Chinese text recognition model acquisition module 10, an adjusted Chinese handwritten text recognition model acquisition module 20, an error text training sample acquisition module 30, and a target Chinese handwritten text recognition model acquisition module 40. The functions implemented by the standard Chinese text recognition model acquisition module 10, the adjusted Chinese handwritten text recognition model acquisition module 20, the error text training sample acquisition module 30, and the target Chinese handwritten text recognition model acquisition module 40 correspond one-to-one to the steps of the handwriting model training method in the embodiment. To avoid redundancy, they are not described in detail one by one in this embodiment.

The standard Chinese text recognition model acquisition module 10 is used to obtain standard Chinese text training samples, batch the standard Chinese text training samples according to a preset batch, input the batched standard Chinese text training samples into a recurrent neural network, train based on a continuous time classification algorithm, and update the network parameters of the recurrent neural network by using a time-dependent back-propagation algorithm to obtain a standard Chinese text recognition model.

The adjusted Chinese handwritten text recognition model acquisition module 20 is used to obtain non-standard Chinese text training samples, batch the non-standard Chinese text training samples according to a preset batch, input the batched non-standard Chinese text training samples into the standard Chinese text recognition model, train based on the continuous time classification algorithm, and update the network parameters of the standard Chinese text recognition model by using the time-dependent back-propagation algorithm to obtain an adjusted Chinese handwritten text recognition model.

The error text training sample acquisition module 30 is used to obtain Chinese text samples to be tested, use the adjusted Chinese handwritten text recognition model to recognize the Chinese text samples to be tested, obtain the error texts whose recognition results do not match the real results, and use all the error texts as error text training samples.

The target Chinese handwritten text recognition model acquisition module 40 is used to input the error text training samples into the adjusted Chinese handwritten text recognition model, train based on the continuous time classification algorithm, and update the network parameters of the adjusted Chinese handwritten text recognition model by using a time-dependent back-propagation algorithm with batch gradient descent to obtain the target Chinese handwritten text recognition model.

Preferably, the standard Chinese text recognition model acquisition module 10 includes a normalized pixel value feature matrix acquisition unit 101, a standard Chinese text training sample acquisition unit 102, a propagation output acquisition unit 111, an error function construction unit 112, and a standard Chinese text recognition model acquisition unit 113.

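The normalized pixel value feature matrix acquisition unit 101 is configured to obtain a pixel value feature matrix of each Chinese text in the Chinese text training samples to be processed, and normalize each pixel value in the pixel value feature matrix of each Chinese text to obtain a normalized pixel value feature matrix of each Chinese text, where the normalization formula is

$$y = \frac{x - MinValue}{MaxValue - MinValue}$$

MaxValue is the maximum pixel value in the pixel value feature matrix, MinValue is the minimum pixel value in the pixel value feature matrix, x is the pixel value before normalization, and y is the pixel value after normalization.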

The standard Chinese text training sample acquisition unit 102 is configured to divide the pixel values in the normalized pixel value feature matrix of each Chinese text into two types of pixel values, establish a binarized pixel value feature matrix of each Chinese text based on the two types of pixel values, combine the Chinese texts corresponding to the binarized pixel value feature matrices as the standard Chinese text training samples, and batch the standard Chinese text training samples according to a preset batch.

The propagation output acquisition unit 111 is configured to input the batched standard Chinese text training samples into the recurrent neural network and train based on the continuous time classification algorithm to obtain the forward propagation output and the backward propagation output of the batched standard Chinese text training samples in the recurrent neural network. The forward propagation output is expressed as

$$\alpha(t,u) = y^{t}_{l'_u} \sum_{i=f(u)}^{u} \alpha(t-1,i)$$

The backward propagation output is expressed as

$$\beta(t,u) = \sum_{i=u}^{g(u)} \beta(t+1,i)\, y^{t+1}_{l'_i}$$

The error function constructing unit 112 is configured to construct an error function according to the forward propagation output and the backward propagation output.

The standard Chinese text recognition model acquisition unit 113 is configured to update the network parameters of the recurrent neural network by using a time-dependent back-propagation algorithm according to an error function to obtain a standard Chinese text recognition model.

Preferably, the error text training sample acquisition module 30 includes a model output value acquisition unit 31, a model recognition result acquisition unit 32, and an error text training sample acquisition unit 33.

The model output value acquiring unit 31 is configured to input a Chinese text sample to be tested into the adjusted Chinese handwritten text recognition model, and obtain an output value of each text in the Chinese text sample to be tested in the adjusted Chinese handwritten text recognition model.

The model recognition result obtaining unit 32 is configured to select a maximum output value among output values corresponding to each text, and obtain a recognition result of each text according to the maximum output value.

The error text training sample acquisition unit 33 is configured to obtain error texts whose recognition results do not match the real results according to the recognition results, and use all the error texts as the error text training samples.

Preferably, the handwriting model training device further includes an initialization module 50 for initializing a recurrent neural network.

FIG. 7 shows a flowchart of the text recognition method in this embodiment. The text recognition method can be applied to computer equipment configured by banks, investment, and insurance institutions to recognize handwritten Chinese text and achieve artificial intelligence purposes. As shown in FIG. 7, the text recognition method includes the following steps:

S50: Obtain the Chinese text to be recognized, use the target Chinese handwritten text recognition model to recognize the Chinese text to be recognized, and obtain the output value of the Chinese text to be recognized in the target Chinese handwritten text recognition model, where the target Chinese handwritten text recognition model is obtained by using the above handwriting model training method.

The Chinese text to be recognized refers to the Chinese text on which recognition is to be performed.

In this embodiment, the Chinese text to be recognized is obtained and input into the target Chinese handwritten text recognition model for recognition, and for each output of the target Chinese handwritten text recognition model, the probability value of the similarity between the corresponding character and each character in the Chinese character library is obtained. These probability values are the output values of the Chinese text to be recognized in the target Chinese handwritten text recognition model, and the recognition result of the Chinese text to be recognized can be determined based on the output values.

S60: Select the maximum output value among the output values corresponding to the Chinese text to be recognized, and obtain the recognition result of the Chinese text to be recognized according to the maximum output value.

In this embodiment, the maximum output value among all output values corresponding to the Chinese text to be recognized is selected, and the corresponding actual output is determined according to the maximum output value; for example, the actual output is "you_men_men_好_". The actual output is then processed further: repeated characters in the actual output are collapsed so that only one remains, and the spaces are removed, which yields the recognition result of the Chinese text to be recognized. The maximum output value determines the characters at the actual-output stage, and the subsequent de-duplication and space removal effectively yield the recognition result of each text and improve recognition accuracy.
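
The selection, de-duplication, and space removal described in step S60 can be sketched as follows. This is a minimal illustration, assuming the model emits one probability vector per sequence step and that the space is a dedicated blank symbol written "_", as in greedy decoding for connectionist temporal classification; the function name `greedy_decode` and the toy alphabet are hypothetical and not part of the application.

```python
from itertools import groupby
from typing import Sequence

BLANK = "_"  # assumed symbol for the space/blank output


def greedy_decode(step_probs: Sequence[Sequence[float]], alphabet: Sequence[str]) -> str:
    """Pick the most probable symbol per step, collapse repeats, drop blanks."""
    # Maximum output value at each sequence step gives the "actual output".
    best_path = [alphabet[max(range(len(p)), key=p.__getitem__)] for p in step_probs]
    # Collapse runs of identical adjacent symbols so that only one remains.
    collapsed = [symbol for symbol, _ in groupby(best_path)]
    # Remove the blank symbols to obtain the recognition result.
    return "".join(symbol for symbol in collapsed if symbol != BLANK)


# Illustrative input whose per-step argmax is: 你 们 们 _ 好 _
alphabet = ["_", "你", "们", "好"]
path = [1, 2, 2, 0, 3, 0]
step_probs = [[0.9 if k == i else 0.1 / 3 for k in range(4)] for i in path]
print(greedy_decode(step_probs, alphabet))  # 你们好
```

Collapsing before removing blanks matters: a blank between two identical characters keeps them as two separate characters, whereas adjacent repeats without a blank merge into one.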

In steps S50-S60, the target Chinese handwritten text recognition model is used to recognize the Chinese text to be recognized, and the recognition result is obtained from the maximum output value together with the removal of repeated characters and spaces. The target Chinese handwritten text recognition model itself has high recognition accuracy, and combining it with the Chinese semantic thesaurus further improves the accuracy of Chinese handwriting recognition.

In the text recognition method provided in the embodiment of the present application, the Chinese text to be recognized is input into the target Chinese handwritten text recognition model for recognition, and the recognition result is obtained in combination with a preset Chinese semantic thesaurus. When the target Chinese handwritten text recognition model is used to recognize Chinese handwritten text, accurate recognition results can be obtained.

FIG. 8 shows a principle block diagram of a text recognition device that corresponds one-to-one to the text recognition method in the embodiment. As shown in FIG. 8, the text recognition device includes an output value acquisition module 60 and a recognition result acquisition module 70, whose functions correspond one-to-one to the steps of the text recognition method in the embodiment. To avoid redundancy, they are not described in detail here.

The output value acquisition module 60 is configured to obtain the Chinese text to be recognized, recognize it using the target Chinese handwritten text recognition model, and obtain the output value of the Chinese text to be recognized in the target Chinese handwritten text recognition model; the target Chinese handwritten text recognition model is obtained using the handwriting model training method.

The recognition result acquisition module 70 is configured to select a maximum output value among output values corresponding to the Chinese text to be recognized, and obtain a recognition result of the Chinese text to be recognized according to the maximum output value.
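
The division of labor between modules 60 and 70 can be pictured with a small sketch. It is only an illustration under stated assumptions: the model is treated as a callable returning a mapping from candidate character to output value, and all class and method names are hypothetical rather than taken from the application.

```python
from typing import Callable, Dict

# Hypothetical model interface: maps an input image to {character: output value}.
Model = Callable[[object], Dict[str, float]]


class OutputValueAcquisitionModule:
    """Counterpart of module 60: runs the model and returns its output values."""

    def __init__(self, model: Model) -> None:
        self.model = model

    def run(self, text_image: object) -> Dict[str, float]:
        return self.model(text_image)


class RecognitionResultAcquisitionModule:
    """Counterpart of module 70: selects the result with the maximum output value."""

    def run(self, output_values: Dict[str, float]) -> str:
        return max(output_values, key=output_values.get)


class TextRecognitionDevice:
    """Chains the two modules in the order of steps S50 and S60."""

    def __init__(self, model: Model) -> None:
        self.output_module = OutputValueAcquisitionModule(model)
        self.result_module = RecognitionResultAcquisitionModule()

    def recognize(self, text_image: object) -> str:
        return self.result_module.run(self.output_module.run(text_image))


# Usage with a stub model standing in for the trained recognition model.
device = TextRecognitionDevice(lambda image: {"你": 0.2, "好": 0.8})
print(device.recognize(None))  # 好
```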

This embodiment provides one or more non-volatile readable storage media storing computer-readable instructions. When the computer-readable instructions are executed by one or more processors, the one or more processors implement the handwriting model training method in the embodiment. Alternatively, when the computer-readable instructions are executed by one or more processors, the one or more processors implement the functions of each module/unit of the handwriting model training device in the embodiment. Alternatively, when the computer-readable instructions are executed by one or more processors, the one or more processors implement the functions of each step of the text recognition method in the embodiment. Alternatively, when the computer-readable instructions are executed by one or more processors, the one or more processors implement the functions of each module/unit of the text recognition device in the embodiment. To avoid repetition, these are not described again here.

FIG. 9 is a schematic diagram of a computer device according to an embodiment of the present application. As shown in FIG. 9, the computer device 80 of this embodiment includes a processor 81, a memory 82, and computer-readable instructions 83 stored in the memory 82 and executable on the processor 81. When executed by the processor 81, the computer-readable instructions 83 implement the handwriting model training method in the embodiment. Alternatively, when the computer-readable instructions 83 are executed by the processor 81, the functions of each module/unit of the handwriting model training device in the embodiment are implemented. Alternatively, when the computer-readable instructions 83 are executed by the processor 81, the functions of the steps of the text recognition method in the embodiment are implemented. Alternatively, when the computer-readable instructions 83 are executed by the processor 81, the functions of each module/unit of the text recognition device in the embodiment are implemented. To avoid repetition, these are not described again here.

The computer device 80 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The computer device may include, but is not limited to, a processor 81 and a memory 82. Those skilled in the art can understand that FIG. 9 is only an example of the computer device 80 and does not constitute a limitation on it; the computer device 80 may include more or fewer components than shown in the figure, combine certain components, or use different components. For example, the computer device may also include input and output devices, network access devices, and buses.

The processor 81 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.

The memory 82 may be an internal storage unit of the computer device 80, such as a hard disk or a memory of the computer device 80. The memory 82 may also be an external storage device of the computer device 80, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the computer device 80. Further, the memory 82 may include both an internal storage unit of the computer device 80 and an external storage device. The memory 82 is used to store the computer-readable instructions 83 and other programs and data required by the computer device. The memory 82 may also be used to temporarily store data that has been output or is to be output.

Those skilled in the art can clearly understand that, for convenience and brevity of description, only the above division of functional units and modules is used as an example. In practical applications, the above functions can be assigned to different functional units and modules as needed; that is, the internal structure of the device can be divided into different functional units or modules to complete all or part of the functions described above.

In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, or each of the units may exist separately physically, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware or in the form of software functional unit.

The above embodiments are only used to describe the technical solutions of the present application and are not intended to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that they can still modify the technical solutions described in the foregoing embodiments or equivalently replace some of the technical features; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and should be included within the scope of protection of the present application.

Claims

1. A handwriting model training method, characterized by comprising:

obtaining standard Chinese text training samples, batching the standard Chinese text training samples according to a preset batch, inputting the batched standard Chinese text training samples into a recurrent neural network, training based on a continuous time classification algorithm, and updating the network parameters of the recurrent neural network using a time-dependent back-propagation algorithm to obtain a standard Chinese text recognition model;

obtaining non-standard Chinese text training samples, batching the non-standard Chinese text training samples according to a preset batch, inputting the batched non-standard Chinese text training samples into the standard Chinese text recognition model, training based on the continuous time classification algorithm, and updating the network parameters of the standard Chinese text recognition model using the time-dependent back-propagation algorithm to obtain an adjusted Chinese handwritten text recognition model;

obtaining Chinese text samples to be tested, recognizing the Chinese text samples to be tested using the adjusted Chinese handwritten text recognition model, obtaining error texts whose recognition results do not match the true results, and using all the error texts as error text training samples;

inputting the error text training samples into the adjusted Chinese handwritten text recognition model, training based on the continuous time classification algorithm, and updating the network parameters of the adjusted Chinese handwritten text recognition model using the time-dependent back-propagation algorithm with batch gradient descent to obtain a target Chinese handwritten text recognition model.
2. The handwriting model training method according to claim 1, wherein obtaining the standard Chinese text training samples and batching the standard Chinese text training samples according to a preset batch comprises:

obtaining a pixel value feature matrix of each Chinese text in the Chinese text training samples to be processed, and normalizing each pixel value in the pixel value feature matrix of each Chinese text to obtain a normalized pixel value feature matrix of each Chinese text, wherein the normalization formula is

$$y = \frac{x - \mathrm{MinValue}}{\mathrm{MaxValue} - \mathrm{MinValue}}$$

where MaxValue is the maximum pixel value in the pixel value feature matrix, MinValue is the minimum pixel value in the pixel value feature matrix, x is the pixel value before normalization, and y is the pixel value after normalization;

dividing the pixel values in the normalized pixel value feature matrix of each Chinese text into two classes of pixel values, establishing a binarized pixel value feature matrix of each Chinese text based on the two classes of pixel values, combining the Chinese texts corresponding to the binarized pixel value feature matrices as the standard Chinese text training samples, and batching the standard Chinese text training samples according to the preset batch.
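
The normalization and binarization recited in this claim can be illustrated with a short sketch. It is only a sketch under stated assumptions: the normalization is taken to be the usual min-max form y = (x − MinValue) / (MaxValue − MinValue) implied by the variable definitions, the two pixel classes are separated by a fixed threshold of 0.5, and a constant matrix is mapped to zeros. The threshold, the constant-matrix handling, and the function names are hypothetical and are not specified in the claim.

```python
from typing import List

Matrix = List[List[float]]


def normalize(matrix: Matrix) -> Matrix:
    """Min-max normalize every pixel value into the range [0, 1]."""
    flat = [value for row in matrix for value in row]
    min_value, max_value = min(flat), max(flat)
    if max_value == min_value:
        # Assumption: a constant image carries no contrast, so map it to zeros.
        return [[0.0 for _ in row] for row in matrix]
    span = max_value - min_value
    return [[(value - min_value) / span for value in row] for row in matrix]


def binarize(matrix: Matrix, threshold: float = 0.5) -> List[List[int]]:
    """Split normalized pixel values into two classes (0 and 1)."""
    # Assumption: a fixed threshold; the claim does not say how the classes are chosen.
    return [[1 if value >= threshold else 0 for value in row] for row in matrix]


pixels = [[0, 128], [255, 64]]
print(binarize(normalize(pixels)))  # [[0, 1], [1, 0]]
```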
3. The handwriting model training method according to claim 1, wherein inputting the batched standard Chinese text training samples into the recurrent neural network, training based on the continuous time classification algorithm, and updating the network parameters of the recurrent neural network using the time-dependent back-propagation algorithm to obtain the standard Chinese text recognition model comprises:

inputting the batched standard Chinese text training samples into the recurrent neural network and training based on the continuous time classification algorithm to obtain the forward-propagation output and the backward-propagation output of the batched standard Chinese text training samples in the recurrent neural network, wherein the forward-propagation output is expressed as

$$\alpha(t,u) = y^{t}_{l'_u} \sum_{i=f(u)}^{u} \alpha(t-1,i)$$

where t denotes the sequence step, u denotes the label value of the output corresponding to t, $y^{t}_{l'_u}$ denotes the probability that the output at step t is the label value $l'_u$, and

$$f(u) = \begin{cases} u-1, & \text{if } l'_u = \text{blank or } l'_{u-2} = l'_u \\ u-2, & \text{otherwise} \end{cases}$$

and the backward-propagation output is expressed as

$$\beta(t,u) = \sum_{i=u}^{g(u)} \beta(t+1,i)\, y^{t+1}_{l'_i}$$

where t denotes the sequence step, u denotes the label value of the output corresponding to t, $y^{t+1}_{l'_i}$ denotes the probability that the output at step t + 1 is the label value $l'_i$, and

$$g(u) = \begin{cases} u+1, & \text{if } l'_u = \text{blank or } l'_{u+2} = l'_u \\ u+2, & \text{otherwise} \end{cases}$$

constructing an error function according to the forward-propagation output and the backward-propagation output;

updating, according to the error function, the network parameters of the recurrent neural network using the time-dependent back-propagation algorithm to obtain the standard Chinese text recognition model.
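
The forward-propagation output recited in this claim can be illustrated numerically. The sketch below assumes that the "continuous time classification algorithm" corresponds to connectionist temporal classification and that the forward output follows its standard forward recursion over a label sequence extended with blanks; the function name `ctc_forward`, the blank index, and the toy probabilities are hypothetical. It computes only the sequence probability from which an error function such as the negative log-probability could be built, not the parameter update itself.

```python
from typing import List, Sequence


def ctc_forward(probs: Sequence[Sequence[float]], label: Sequence[int], blank: int = 0) -> float:
    """Probability of `label` given per-step class probabilities `probs`."""
    # Extended label l': a blank before, between, and after the labels.
    ext: List[int] = [blank]
    for symbol in label:
        ext += [symbol, blank]
    steps, positions = len(probs), len(ext)

    alpha = [[0.0] * positions for _ in range(steps)]
    # A valid path starts with the leading blank or with the first label.
    alpha[0][0] = probs[0][ext[0]]
    if positions > 1:
        alpha[0][1] = probs[0][ext[1]]

    for t in range(1, steps):
        for u in range(positions):
            total = alpha[t - 1][u]
            if u >= 1:
                total += alpha[t - 1][u - 1]
            # Skipping position u-1 is allowed only when it is a blank
            # separating two different labels.
            if u >= 2 and ext[u] != blank and ext[u] != ext[u - 2]:
                total += alpha[t - 1][u - 2]
            alpha[t][u] = probs[t][ext[u]] * total

    # A valid path ends on the last label or on the trailing blank.
    result = alpha[steps - 1][positions - 1]
    if positions > 1:
        result += alpha[steps - 1][positions - 2]
    return result


# Two steps, classes {0: blank, 1: "a"}, uniform probabilities.
uniform = [[0.5, 0.5], [0.5, 0.5]]
print(ctc_forward(uniform, [1]))  # 0.75
```

With two steps and uniform probabilities, three of the four possible paths (a-a, a-blank, blank-a) collapse to the single label "a", which gives 0.75.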
4. The handwriting model training method according to claim 1, wherein recognizing the Chinese text samples to be tested using the adjusted Chinese handwritten text recognition model, obtaining error texts whose recognition results do not match the true results, and using all the error texts as error text training samples comprises:

inputting the Chinese text samples to be tested into the adjusted Chinese handwritten text recognition model, and obtaining an output value of each text in the Chinese text samples to be tested in the adjusted Chinese handwritten text recognition model;

selecting a maximum output value among the output values corresponding to each text, and obtaining a recognition result of each text according to the maximum output value;

obtaining, according to the recognition results, error texts whose recognition results do not match the true results, and using all the error texts as the error text training samples.
5. The handwriting model training method according to claim 1, wherein before the step of obtaining the standard Chinese text training samples, the handwriting model training method further comprises:

initializing the recurrent neural network.
6. A text recognition method, comprising:

obtaining Chinese text to be recognized, recognizing the Chinese text to be recognized using a target Chinese handwritten text recognition model, and obtaining an output value of the Chinese text to be recognized in the target Chinese handwritten text recognition model, wherein the target Chinese handwritten text recognition model is obtained using the handwriting model training method according to any one of claims 1-5;

selecting a maximum output value among the output values corresponding to the Chinese text to be recognized, and obtaining a recognition result of the Chinese text to be recognized according to the maximum output value.
7. A handwriting model training device, comprising:

a standard Chinese text recognition model acquisition module, configured to obtain standard Chinese text training samples, batch the standard Chinese text training samples according to a preset batch, input the batched standard Chinese text training samples into a recurrent neural network, train based on a continuous time classification algorithm, and update the network parameters of the recurrent neural network using a time-dependent back-propagation algorithm to obtain a standard Chinese text recognition model;

an adjusted Chinese handwritten text recognition model acquisition module, configured to obtain non-standard Chinese text training samples, batch the non-standard Chinese text training samples according to a preset batch, input the batched non-standard Chinese text training samples into the standard Chinese text recognition model, train based on the continuous time classification algorithm, and update the network parameters of the standard Chinese text recognition model using the time-dependent back-propagation algorithm to obtain an adjusted Chinese handwritten text recognition model;

an error text training sample acquisition module, configured to obtain Chinese text samples to be tested, recognize the Chinese text samples to be tested using the adjusted Chinese handwritten text recognition model, obtain error texts whose recognition results do not match the true results, and use all the error texts as error text training samples;

a target Chinese handwritten text recognition model acquisition module, configured to input the error text training samples into the adjusted Chinese handwritten text recognition model, train based on the continuous time classification algorithm, and update the network parameters of the adjusted Chinese handwritten text recognition model using the time-dependent back-propagation algorithm with batch gradient descent to obtain a target Chinese handwritten text recognition model.
8. A text recognition device, comprising:

an output value acquisition module, configured to obtain Chinese text to be recognized, recognize the Chinese text to be recognized using a target Chinese handwritten text recognition model, and obtain an output value of the Chinese text to be recognized in the target Chinese handwritten text recognition model, wherein the target Chinese handwritten text recognition model is obtained using the handwriting model training method according to any one of claims 1-5;

a recognition result acquisition module, configured to select a maximum output value among the output values corresponding to the Chinese text to be recognized, and obtain a recognition result of the Chinese text to be recognized according to the maximum output value.
9. A computer device comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, characterized in that the processor implements the following steps when executing the computer-readable instructions:

obtaining standard Chinese text training samples, batching the standard Chinese text training samples according to a preset batch, inputting the batched standard Chinese text training samples into a recurrent neural network, training based on a continuous time classification algorithm, and updating the network parameters of the recurrent neural network using a time-dependent back-propagation algorithm to obtain a standard Chinese text recognition model;

obtaining non-standard Chinese text training samples, batching the non-standard Chinese text training samples according to a preset batch, inputting the batched non-standard Chinese text training samples into the standard Chinese text recognition model, training based on the continuous time classification algorithm, and updating the network parameters of the standard Chinese text recognition model using the time-dependent back-propagation algorithm to obtain an adjusted Chinese handwritten text recognition model;

obtaining Chinese text samples to be tested, recognizing the Chinese text samples to be tested using the adjusted Chinese handwritten text recognition model, obtaining error texts whose recognition results do not match the true results, and using all the error texts as error text training samples;

inputting the error text training samples into the adjusted Chinese handwritten text recognition model, training based on the continuous time classification algorithm, and updating the network parameters of the adjusted Chinese handwritten text recognition model using the time-dependent back-propagation algorithm with batch gradient descent to obtain a target Chinese handwritten text recognition model.
10. The computer device according to claim 9, wherein obtaining the standard Chinese text training samples and batching the standard Chinese text training samples according to a preset batch comprises:

obtaining a pixel value feature matrix of each Chinese text in the Chinese text training samples to be processed, and normalizing each pixel value in the pixel value feature matrix of each Chinese text to obtain a normalized pixel value feature matrix of each Chinese text, wherein the normalization formula is

$$y = \frac{x - \mathrm{MinValue}}{\mathrm{MaxValue} - \mathrm{MinValue}}$$

where MaxValue is the maximum pixel value in the pixel value feature matrix, MinValue is the minimum pixel value in the pixel value feature matrix, x is the pixel value before normalization, and y is the pixel value after normalization;

dividing the pixel values in the normalized pixel value feature matrix of each Chinese text into two classes of pixel values, establishing a binarized pixel value feature matrix of each Chinese text based on the two classes of pixel values, combining the Chinese texts corresponding to the binarized pixel value feature matrices as the standard Chinese text training samples, and batching the standard Chinese text training samples according to the preset batch.
11. The computer device according to claim 9, wherein inputting the batched standard Chinese text training samples into the recurrent neural network, training based on the continuous time classification algorithm, and updating the network parameters of the recurrent neural network using the time-dependent back-propagation algorithm to obtain the standard Chinese text recognition model comprises:

inputting the batched standard Chinese text training samples into the recurrent neural network and training based on the continuous time classification algorithm to obtain the forward-propagation output and the backward-propagation output of the batched standard Chinese text training samples in the recurrent neural network, wherein the forward-propagation output is expressed as

$$\alpha(t,u) = y^{t}_{l'_u} \sum_{i=f(u)}^{u} \alpha(t-1,i)$$

where t denotes the sequence step, u denotes the label value of the output corresponding to t, $y^{t}_{l'_u}$ denotes the probability that the output at step t is the label value $l'_u$, and

$$f(u) = \begin{cases} u-1, & \text{if } l'_u = \text{blank or } l'_{u-2} = l'_u \\ u-2, & \text{otherwise} \end{cases}$$

and the backward-propagation output is expressed as

$$\beta(t,u) = \sum_{i=u}^{g(u)} \beta(t+1,i)\, y^{t+1}_{l'_i}$$

where t denotes the sequence step, u denotes the label value of the output corresponding to t, $y^{t+1}_{l'_i}$ denotes the probability that the output at step t + 1 is the label value $l'_i$, and

$$g(u) = \begin{cases} u+1, & \text{if } l'_u = \text{blank or } l'_{u+2} = l'_u \\ u+2, & \text{otherwise} \end{cases}$$

constructing an error function according to the forward-propagation output and the backward-propagation output;

updating, according to the error function, the network parameters of the recurrent neural network using the time-dependent back-propagation algorithm to obtain the standard Chinese text recognition model.
12. The computer device according to claim 9, wherein recognizing the Chinese text samples to be tested using the adjusted Chinese handwritten text recognition model, obtaining error texts whose recognition results do not match the true results, and using all the error texts as error text training samples comprises:

inputting the Chinese text samples to be tested into the adjusted Chinese handwritten text recognition model, and obtaining an output value of each text in the Chinese text samples to be tested in the adjusted Chinese handwritten text recognition model;

selecting a maximum output value among the output values corresponding to each text, and obtaining a recognition result of each text according to the maximum output value;

obtaining, according to the recognition results, error texts whose recognition results do not match the true results, and using all the error texts as the error text training samples.
13. The computer device according to claim 9, wherein before the step of obtaining the standard Chinese text training samples, the processor further implements the following step when executing the computer-readable instructions:

initializing the recurrent neural network.
14. A computer device comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, characterized in that the processor implements the following steps when executing the computer-readable instructions:

obtaining Chinese text to be recognized, recognizing the Chinese text to be recognized using a target Chinese handwritten text recognition model, and obtaining an output value of the Chinese text to be recognized in the target Chinese handwritten text recognition model, wherein the target Chinese handwritten text recognition model is obtained using the handwriting model training method according to any one of claims 1-5;

selecting a maximum output value among the output values corresponding to the Chinese text to be recognized, and obtaining a recognition result of the Chinese text to be recognized according to the maximum output value.
15. One or more non-volatile readable storage media storing computer-readable instructions, characterized in that the computer-readable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps:

obtaining standard Chinese text training samples, batching the standard Chinese text training samples according to a preset batch, inputting the batched standard Chinese text training samples into a recurrent neural network, training based on a continuous time classification algorithm, and updating the network parameters of the recurrent neural network using a time-dependent back-propagation algorithm to obtain a standard Chinese text recognition model;

obtaining non-standard Chinese text training samples, batching the non-standard Chinese text training samples according to a preset batch, inputting the batched non-standard Chinese text training samples into the standard Chinese text recognition model, training based on the continuous time classification algorithm, and updating the network parameters of the standard Chinese text recognition model using the time-dependent back-propagation algorithm to obtain an adjusted Chinese handwritten text recognition model;

obtaining Chinese text samples to be tested, recognizing the Chinese text samples to be tested using the adjusted Chinese handwritten text recognition model, obtaining error texts whose recognition results do not match the true results, and using all the error texts as error text training samples;

inputting the error text training samples into the adjusted Chinese handwritten text recognition model, training based on the continuous time classification algorithm, and updating the network parameters of the adjusted Chinese handwritten text recognition model using the time-dependent back-propagation algorithm with batch gradient descent to obtain a target Chinese handwritten text recognition model.
16. The non-volatile readable storage medium according to claim 15, wherein obtaining the standard Chinese text training samples and batching the standard Chinese text training samples according to a preset batch comprises:

obtaining a pixel value feature matrix of each Chinese text in the Chinese text training samples to be processed, and normalizing each pixel value in the pixel value feature matrix of each Chinese text to obtain a normalized pixel value feature matrix of each Chinese text, wherein the normalization formula is

$$y = \frac{x - \mathrm{MinValue}}{\mathrm{MaxValue} - \mathrm{MinValue}}$$

where MaxValue is the maximum pixel value in the pixel value feature matrix, MinValue is the minimum pixel value in the pixel value feature matrix, x is the pixel value before normalization, and y is the pixel value after normalization;

dividing the pixel values in the normalized pixel value feature matrix of each Chinese text into two classes of pixel values, establishing a binarized pixel value feature matrix of each Chinese text based on the two classes of pixel values, combining the Chinese texts corresponding to the binarized pixel value feature matrices as the standard Chinese text training samples, and batching the standard Chinese text training samples according to the preset batch.
17. The non-volatile readable storage medium according to claim 15, wherein inputting the batched standard Chinese text training samples into the recurrent neural network, training based on the continuous time classification algorithm, and updating the network parameters of the recurrent neural network using the time-dependent back-propagation algorithm to obtain the standard Chinese text recognition model comprises:

inputting the batched standard Chinese text training samples into the recurrent neural network and training based on the continuous time classification algorithm to obtain the forward-propagation output and the backward-propagation output of the batched standard Chinese text training samples in the recurrent neural network, wherein the forward-propagation output is expressed as

$$\alpha(t,u) = y^{t}_{l'_u} \sum_{i=f(u)}^{u} \alpha(t-1,i)$$

where t denotes the sequence step, u denotes the label value of the output corresponding to t, $y^{t}_{l'_u}$ denotes the probability that the output at step t is the label value $l'_u$, and

$$f(u) = \begin{cases} u-1, & \text{if } l'_u = \text{blank or } l'_{u-2} = l'_u \\ u-2, & \text{otherwise} \end{cases}$$

and the backward-propagation output is expressed as

$$\beta(t,u) = \sum_{i=u}^{g(u)} \beta(t+1,i)\, y^{t+1}_{l'_i}$$

where t denotes the sequence step, u denotes the label value of the output corresponding to t, $y^{t+1}_{l'_i}$ denotes the probability that the output at step t + 1 is the label value $l'_i$, and

$$g(u) = \begin{cases} u+1, & \text{if } l'_u = \text{blank or } l'_{u+2} = l'_u \\ u+2, & \text{otherwise} \end{cases}$$

constructing an error function according to the forward-propagation output and the backward-propagation output;

updating, according to the error function, the network parameters of the recurrent neural network using the time-dependent back-propagation algorithm to obtain the standard Chinese text recognition model.
18. The non-volatile readable storage medium according to claim 15, wherein recognizing the Chinese text samples to be tested using the adjusted Chinese handwritten text recognition model, obtaining error texts whose recognition results do not match the true results, and using all the error texts as error text training samples comprises:

inputting the Chinese text samples to be tested into the adjusted Chinese handwritten text recognition model, and obtaining an output value of each text in the Chinese text samples to be tested in the adjusted Chinese handwritten text recognition model;

selecting a maximum output value among the output values corresponding to each text, and obtaining a recognition result of each text according to the maximum output value;

obtaining, according to the recognition results, error texts whose recognition results do not match the true results, and using all the error texts as the error text training samples.
19. The non-volatile readable storage medium according to claim 15, wherein before the step of obtaining the standard Chinese text training samples, the computer-readable instructions, when executed by one or more processors, further cause the one or more processors to perform the following step:

initializing the recurrent neural network.
20. One or more non-volatile readable storage media storing computer-readable instructions, characterized in that the computer-readable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps:

obtaining Chinese text to be recognized, recognizing the Chinese text to be recognized using a target Chinese handwritten text recognition model, and obtaining an output value of the Chinese text to be recognized in the target Chinese handwritten text recognition model, wherein the target Chinese handwritten text recognition model is obtained using the handwriting model training method according to any one of claims 1-5;

selecting a maximum output value among the output values corresponding to the Chinese text to be recognized, and obtaining a recognition result of the Chinese text to be recognized according to the maximum output value.