WO2019232847A1 - Handwriting model training method, handwritten character recognition method, apparatus, device and medium

Handwriting model training method, handwritten character recognition method, apparatus, device and medium

Info

Publication number
WO2019232847A1
Authority
WO
WIPO (PCT)
Prior art keywords
chinese character
chinese
training
recognition model
neural network
Prior art date
Application number
PCT/CN2018/094193
Other languages
English (en)
French (fr)
Inventor
黄春岑
周罡
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2019232847A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/22 Character recognition characterised by the type of writing
    • G06V30/226 Character recognition characterised by the type of writing of cursive writing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Definitions

  • The present application relates to the field of Chinese character recognition, and in particular to a handwriting model training method, a handwriting recognition method, an apparatus, a device, and a medium.
  • Traditional handwriting recognition methods mostly include binarization processing, character segmentation, feature extraction, and support vector machine recognition.
  • When the traditional handwriting recognition methods are used to identify sloppier, non-standard characters (handwritten Chinese characters), the recognition accuracy is not high, which makes their recognition effect unsatisfactory.
  • Traditional handwriting recognition methods can only recognize standard characters to a large extent, and the accuracy rate is low when identifying various handwritings in real life.
  • The embodiments of the present application provide a handwriting model training method, an apparatus, a device, and a medium to solve the problem that the current accuracy of handwriting recognition is not high.
  • a handwriting model training method includes:
  • Adopting optical character recognition technology to obtain a pixel value feature matrix of each Chinese character in a training sample of Chinese characters to be processed
  • Obtaining non-standard Chinese character training samples, inputting the non-standard Chinese character training samples into the standard Chinese character recognition model for training, and using a back-propagation algorithm based on stochastic gradient descent to update the weights and offsets of the standard Chinese character recognition model to obtain an adjusted Chinese handwriting recognition model;
  • a handwriting model training device includes:
  • a pixel value feature matrix acquisition module which is used to obtain the pixel value feature matrix of each Chinese character in a training sample of Chinese characters to be processed using optical character recognition technology
  • a standard Chinese character training sample acquisition module, which is used to obtain the standard Chinese character training samples based on the pixel value feature matrix of each Chinese character in the to-be-processed Chinese character training samples;
  • a standard Chinese character recognition model acquisition module, which is used to input the standard Chinese character training samples into a convolutional neural network for training, and use a back-propagation algorithm based on stochastic gradient descent to update the weights and biases of the convolutional neural network to obtain a standard Chinese character recognition model;
  • an adjusted Chinese handwriting recognition model acquisition module, which is used to obtain non-standard Chinese character training samples, input the non-standard Chinese character training samples into the standard Chinese character recognition model for training, and use a back-propagation algorithm based on stochastic gradient descent to update the weights and offsets of the standard Chinese character recognition model to obtain an adjusted Chinese handwriting recognition model;
  • an error word training sample acquisition module, which is used to obtain Chinese character samples to be tested, use the adjusted Chinese handwriting recognition model to identify the Chinese character samples to be tested, obtain error words whose recognition results do not match the real results, and use all the error words as error word training samples;
  • a target Chinese handwriting recognition model acquisition module, which is configured to input the error word training samples into the adjusted Chinese handwriting recognition model for training, and use a back-propagation algorithm based on batch gradient descent to update the weights and biases of the adjusted Chinese handwriting recognition model to obtain the target Chinese handwriting recognition model.
  • a computer device includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor.
  • the processor executes the computer-readable instructions, the following steps are implemented:
  • Adopting optical character recognition technology to obtain a pixel value feature matrix of each Chinese character in a training sample of Chinese characters to be processed
  • Obtaining non-standard Chinese character training samples, inputting the non-standard Chinese character training samples into the standard Chinese character recognition model for training, and using a back-propagation algorithm based on stochastic gradient descent to update the weights and offsets of the standard Chinese character recognition model to obtain an adjusted Chinese handwriting recognition model;
  • One or more non-volatile readable storage media storing computer-readable instructions, which when executed by one or more processors, cause the one or more processors to perform the following steps:
  • Adopting optical character recognition technology to obtain a pixel value feature matrix of each Chinese character in a training sample of Chinese characters to be processed
  • Obtaining non-standard Chinese character training samples, inputting the non-standard Chinese character training samples into the standard Chinese character recognition model for training, and using a back-propagation algorithm based on stochastic gradient descent to update the weights and offsets of the standard Chinese character recognition model to obtain an adjusted Chinese handwriting recognition model;
  • the embodiments of the present application further provide a handwriting recognition method, device, device, and medium to solve the problem that the current handwriting recognition accuracy is not high.
  • a handwriting recognition method includes:
  • a target probability output value is obtained according to the output value and a preset Chinese semantic thesaurus, and a recognition result of the Chinese character to be recognized is obtained based on the target probability output value.
  • An embodiment of the present application provides a handwriting recognition device, including:
  • An output value acquisition module configured to acquire Chinese characters to be identified, identify the Chinese characters to be identified using a target Chinese handwriting recognition model, and obtain output values of the Chinese characters to be identified in the target Chinese handwriting recognition model;
  • the target Chinese handwriting recognition model is obtained by using the handwriting model training method;
  • a recognition result obtaining module is configured to obtain a target probability output value according to the output value and a preset Chinese semantic lexicon, and obtain a recognition result of the Chinese character to be recognized based on the target probability output value.
  • a computer device includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor.
  • the processor executes the computer-readable instructions, the following steps are implemented:
  • a target probability output value is obtained according to the output value and a preset Chinese semantic thesaurus, and a recognition result of the Chinese character to be recognized is obtained based on the target probability output value.
  • One or more non-volatile readable storage media storing computer-readable instructions, which when executed by one or more processors, cause the one or more processors to perform the following steps:
  • a target probability output value is obtained according to the output value and a preset Chinese semantic thesaurus, and a recognition result of the Chinese character to be recognized is obtained based on the target probability output value.
  • FIG. 1 is an application environment diagram of a handwriting model training method according to an embodiment of the present application
  • FIG. 2 is a flowchart of a handwriting model training method according to an embodiment of the present application
  • FIG. 3 is a specific flowchart of step S20 in FIG. 2;
  • FIG. 4 is a specific flowchart of step S40 in FIG. 2;
  • FIG. 5 is a specific flowchart of step S60 in FIG. 2;
  • FIG. 6 is a schematic diagram of a handwriting model training device according to an embodiment of the present application.
  • FIG. 7 is a flowchart of a handwriting recognition method according to an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a handwriting recognition device according to an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a computer device in an embodiment of the present application.
  • FIG. 1 illustrates an application environment of a handwriting model training method provided by an embodiment of the present application.
  • the application environment of the handwriting model training method includes a server and a client, wherein the server and the client are connected through a network, and the client is a device that can interact with the user, including, but not limited to, a computer and a smart phone.
  • the server can be implemented with an independent server or a server cluster consisting of multiple servers.
  • the handwriting model training method provided in the embodiment of the present application is applied to a server.
  • FIG. 2 shows a flowchart of a handwriting model training method according to an embodiment of the present application.
  • the handwriting model training method includes the following steps:
  • Optical Character Recognition refers to converting text on an image into computer-editable text content.
  • the training samples of Chinese characters to be processed refer to the initially acquired, unprocessed training samples.
  • the pixel value feature matrix is a matrix that uses pixel values as features and is expressed in a matrix manner.
  • the OCR technology is used to perform operations such as positioning, segmentation, and feature extraction on the text on the image to obtain the pixel value feature matrix of each Chinese character in the to-be-processed Chinese character training sample.
  • the pixel value feature matrix can be directly used by the computer.
  • the pixel value features of the Chinese text training samples to be processed can be extracted and represented by a matrix.
  • The standard Chinese character training samples refer to training samples obtained from standard regular characters (such as characters in the Kaiti, Songti, or Lishu fonts; in general, Kaiti or Songti is selected).
  • That is, the characters in the to-be-processed Chinese character training samples are standard regular characters.
  • a standard Chinese character training sample for training a convolutional neural network is obtained based on a pixel value feature matrix of each Chinese character in a Chinese character training sample to be processed, so as to improve the efficiency of network training.
  • The standard Chinese character training samples are obtained from standard regular characters in Chinese fonts such as Kaiti, Songti, or Lishu.
  • Songti is used as an example for description in this embodiment.
  • The standard characters here refer to characters in the current mainstream Chinese fonts, such as the default Songti font in the input methods of computer devices and commonly used characters in other mainstream fonts such as Lishu. Characters in less commonly used fonts, such as cursive script and the Youyuan font, are not included in the scope of standard characters.
  • Initializing the convolutional neural network includes: making the initialized weights of the convolutional neural network satisfy the formula $S(W_l) = \frac{2}{n_l}$, where $n_l$ represents the number of training samples input at the l-th layer, $S(\cdot)$ represents the variance operation, $W_l$ represents the weights of the l-th layer, l is arbitrary, and l denotes the l-th layer in the convolutional neural network.
  • Convolutional Neural Network is a kind of feed-forward neural network. Its artificial neurons can respond to a part of the surrounding cells in the coverage area, and can perform image processing and recognition.
  • Compared with a general deep neural network (DNN), the convolutional neural network includes a convolutional layer and a pooling layer, which provide important technical support for processing and recognizing images containing text.
  • the convolutional neural network includes the weights and biases of each neuron connection between the layers. These weights and biases determine the recognition effect of the convolutional neural network.
  • the convolutional neural network is initialized, and the initialization operation is to set initial values of weights and biases in the convolutional neural network.
  • In the formula, $S(\cdot)$ represents the variance operation, and $n_l$ represents the number of training samples input at the l-th layer.
  • The activation function used by the convolutional layers in the convolutional neural network is ReLU (Rectified Linear Unit, whose Chinese name is the linear rectification function), also known as the rectified linear unit; it is an activation function commonly used in artificial neural networks and usually refers to the non-linear functions represented by the ramp function and its variants.
  • Reasonably initializing the convolutional neural network makes the network more flexible in the initial stage, allows the network to be adjusted effectively during training, and makes it possible to find the minimum value of the error function quickly and effectively, which is beneficial to updating and adjusting the convolutional neural network, so that the model obtained by training based on the convolutional neural network has an accurate recognition effect when performing Chinese handwriting recognition.
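  • As a rough illustration only: assuming the initialization rule reconstructed above is the He-style condition $S(W_l) = 2/n_l$, and interpreting $n_l$ as the number of inputs feeding layer l, a NumPy sketch of such an initialization could look as follows (the layer sizes are made up).

```python
import numpy as np

def init_layer_weights(n_in, n_out, seed=0):
    """Draw weights whose variance is 2 / n_in, a common choice for ReLU layers."""
    rng = np.random.default_rng(seed)
    std = np.sqrt(2.0 / n_in)
    return rng.normal(loc=0.0, scale=std, size=(n_out, n_in))

# e.g. a 32x32 binarized character image flattened into 1024 inputs (illustrative sizes)
W1 = init_layer_weights(n_in=1024, n_out=256)
print(W1.var())   # close to 2/1024, i.e. about 0.00195
```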
  • S40: The standard Chinese character training samples are input into the convolutional neural network for training, and a back-propagation algorithm based on stochastic gradient descent is used to update the weights and offsets of the convolutional neural network to obtain the standard Chinese character recognition model.
  • Stochastic gradient descent means that, when updating the network parameters, the error generated by a single randomly selected training sample during training is used each time, and this random selection is repeated many times.
  • Back Propagation is a training and learning method in neural network learning, which is used to adjust the weights and offsets between nodes in the neural network.
  • During training, the minimum value of the error function is sought; in this embodiment, the minimum value of the error function is calculated using the stochastic gradient descent method.
  • a training sample of standard Chinese characters is input to a convolutional neural network for training, and a backpropagation algorithm based on stochastic gradient descent is used to update the weights and offsets of the convolutional neural network to obtain a standard Chinese character recognition model.
  • The standard Chinese character recognition model learns the deep features of the standard Chinese character training samples during the training process, which enables the model to accurately recognize standard regular characters.
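  • For illustration, a minimal PyTorch sketch of this training stage is given below; the network layout, the 64x64 input size, the 3000-character output size, the learning rate, and the use of cross-entropy loss (rather than the quadratic error function described later) are assumptions made for the sketch, not details taken from the application, while the one-sample-at-a-time update order mirrors the stochastic gradient descent described above.

```python
import torch
import torch.nn as nn
import torch.optim as optim

class CharCNN(nn.Module):
    """Illustrative CNN: convolution/pooling stages followed by a fully connected classifier."""
    def __init__(self, num_classes=3000):               # ~3000 common characters (assumption)
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                             # 64x64 -> 32x32 (input size assumed)
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                             # 32x32 -> 16x16
        )
        self.classifier = nn.Linear(64 * 16 * 16, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

def train_standard_model(samples, labels, epochs=5, lr=0.01):
    """Train on standard (printed-font) character samples with per-sample SGD updates."""
    model = CharCNN()
    opt = optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for i in torch.randperm(len(samples)):           # random order: stochastic gradient descent
            x = samples[i].unsqueeze(0)                   # shape (1, 1, 64, 64)
            y = labels[i].unsqueeze(0)
            opt.zero_grad()
            loss_fn(model(x), y).backward()               # back-propagate the single-sample error
            opt.step()                                    # update weights and biases
    return model
```

  • In step S50 below, the same loop would simply be continued on the non-standard (handwritten) character samples, starting from the already trained model; the later sketches reuse this model interface.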
  • S50: Obtain non-standard Chinese character training samples, input the non-standard Chinese character training samples into the standard Chinese character recognition model for training, and use a back-propagation algorithm based on stochastic gradient descent to update the weights and offsets of the standard Chinese character recognition model to obtain the adjusted Chinese handwriting recognition model.
  • the non-standard Chinese character training sample refers to a training sample obtained based on handwritten Chinese characters.
  • The handwritten Chinese characters may specifically be characters obtained by handwriting following the form of standard regular characters in fonts such as Kaiti, Songti, or Lishu. Understandably, the difference between the non-standard Chinese character training samples and the standard Chinese character training samples is that the non-standard samples are obtained from handwritten Chinese characters; since they are handwritten, they naturally contain a variety of different character forms.
  • The server obtains the non-standard Chinese character training samples, which contain the characteristics of handwritten Chinese characters, inputs them into the standard Chinese character recognition model for training, and uses the back-propagation algorithm based on stochastic gradient descent to update the weights and offsets of the standard Chinese character recognition model to obtain the adjusted Chinese handwriting recognition model.
  • The standard Chinese character recognition model has the ability to recognize standard regular Chinese characters, but its recognition accuracy is not high when recognizing handwritten Chinese characters. Therefore, this embodiment uses the non-standard Chinese character training samples for training, so that the standard Chinese character recognition model can adjust its parameters (weights and offsets) on the basis of its existing ability to recognize standard regular characters, obtaining the adjusted Chinese handwriting recognition model.
  • The adjusted Chinese handwriting recognition model learns the deep features of handwritten Chinese characters on the basis of its original ability to recognize standard regular characters, so that it combines the deep features of standard regular characters and handwritten Chinese characters, can effectively recognize both at the same time, and obtains recognition results with higher accuracy.
  • The convolutional neural network uses the pixel distribution of characters for character recognition. In real life there is a difference between a handwritten Chinese character and the corresponding standard regular character, but this difference is much smaller than the difference from other, non-corresponding standard regular characters. For example, there is a difference in pixel distribution between a handwritten "我" and the standard regular "我", but this difference is significantly smaller than the difference between a handwritten "你" and "我". It can therefore be considered that, even if there is a certain difference between a handwritten Chinese character and its corresponding standard regular character, this difference is much smaller than for non-corresponding standard regular characters, so the recognition result is determined by the most similar character (that is, the one with the smallest difference).
  • The adjusted Chinese handwriting recognition model is obtained by training a convolutional neural network; the model combines the deep features of standard regular characters and handwritten Chinese characters, and can effectively recognize handwritten Chinese characters based on these deep features.
  • The back-propagation algorithm based on stochastic gradient descent is used to perform the error back-propagation update, so that model training can proceed smoothly even with a large number of training samples, which improves the efficiency and effect of network training and makes the training more effective.
  • The order of step S40 and step S50 in this embodiment is not interchangeable.
  • Step S40 is performed first and then step S50 is performed.
  • Training the convolutional neural network with the standard Chinese character training samples first gives the resulting standard Chinese character recognition model better recognition ability, so that it produces accurate recognition results for standard regular characters.
  • Then the fine-tuning of step S50 is performed, so that the adjusted Chinese handwriting recognition model obtained by training can effectively recognize handwritten Chinese characters based on the learned deep features of handwritten Chinese characters, producing more accurate recognition results for handwritten Chinese character recognition.
  • If step S50 were performed first, or only step S50 were performed, then because handwritten Chinese characters come in many forms, the features learned by training directly on handwritten Chinese characters would not reflect the characteristics of handwritten Chinese characters well, so the model would learn "badly" at the beginning, making it difficult to produce accurate recognition results for handwritten Chinese characters.
  • Although each person's handwritten Chinese characters are different, most of them are similar to standard Chinese characters (for example, handwritten Chinese characters imitate standard Chinese characters). Therefore, training the model on standard regular characters first is more in line with the objective situation and more effective than training the model directly on handwritten Chinese characters; corresponding adjustments can then be made on top of a "good" model to obtain an adjusted Chinese handwriting recognition model with a high recognition rate for handwritten Chinese characters.
  • The Chinese character samples to be tested refer to samples obtained from standard regular characters and handwritten Chinese characters for testing.
  • The standard regular characters used in this step are the same as those used for training in step S40 (because each character in a font such as Kaiti or Songti is uniquely determined); the handwritten Chinese characters used may be different from those used for training in step S50 (Chinese characters written by different people are not exactly the same, so each handwritten character can correspond to multiple character forms). In order to distinguish the test samples from the non-standard Chinese character training samples used for training in step S50 and to avoid over-fitting in model training, handwritten Chinese characters different from those in step S50 are generally used in this step.
  • the trained adjusted Chinese handwriting recognition model is used to identify a sample of Chinese characters to be tested.
  • The Chinese character samples to be tested include standard regular characters and their preset label values (that is, the real results), as well as handwritten Chinese characters and their preset label values.
  • standard and handwritten Chinese characters can be input to the Chinese handwriting recognition model in a mixed manner.
  • When the adjusted Chinese handwriting recognition model is used to recognize the Chinese character samples to be tested, the corresponding recognition results are obtained, and all error words whose recognition results do not match the label values (real results) are used as the error word training samples.
  • The error word training samples reflect the respects in which the adjusted Chinese handwriting recognition model still has insufficient recognition accuracy, so that the model can be further updated and optimized based on the error word training samples.
  • Under the premise that the network parameters (weights and offsets) were first updated with the standard Chinese character training samples and then updated with the non-standard Chinese character training samples, the resulting adjusted Chinese handwriting recognition model may over-learn the characteristics of the non-standard Chinese character training samples. As a result, the adjusted Chinese handwriting recognition model has very high recognition accuracy on the non-standard Chinese character training samples (including their handwritten Chinese characters), but this over-learning affects the recognition of handwritten Chinese characters outside the non-standard Chinese character training samples.
  • Step S60 uses the Chinese character samples to be tested to test the adjusted Chinese handwriting recognition model, which can largely eliminate the over-learning of the non-standard Chinese character training samples used in training. That is, by using the adjusted Chinese handwriting recognition model to identify the Chinese character samples to be tested, the errors caused by over-learning are found; these errors are specifically reflected by the error words, so the network parameters of the adjusted Chinese handwriting recognition model can be further updated and optimized based on the error words.
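  • The following sketch illustrates this error-word collection step; the names adjusted_model, test_samples, and test_labels are hypothetical, and the model interface matches the PyTorch sketch above.

```python
import torch

def collect_error_words(adjusted_model, test_samples, test_labels):
    """Keep the test characters whose predicted class does not match the label (real result)."""
    adjusted_model.eval()
    error_samples, error_labels = [], []
    with torch.no_grad():
        for x, y in zip(test_samples, test_labels):
            logits = adjusted_model(x.unsqueeze(0))      # forward pass for one character image
            predicted = logits.argmax(dim=1).item()      # recognition result
            if predicted != y.item():                    # mismatch with the real result
                error_samples.append(x)
                error_labels.append(y)
    if not error_samples:                                # no error words found
        return None, None
    return torch.stack(error_samples), torch.stack(error_labels)
```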
  • S70: Input the error word training samples into the adjusted Chinese handwriting recognition model for training, and use the back-propagation algorithm based on batch gradient descent to update the weights and offsets of the adjusted Chinese handwriting recognition model to obtain the target Chinese handwriting recognition model.
  • The error word training samples are input into the adjusted Chinese handwriting recognition model for training. The error word training samples reflect the fact that, because the characteristics of the non-standard Chinese character training samples were over-learned during the training of the adjusted Chinese handwriting recognition model, the adjusted model recognizes handwritten Chinese characters outside the non-standard Chinese character training samples inaccurately.
  • In addition, training the model first with the standard Chinese character training samples and then with the non-standard Chinese character training samples may excessively weaken the originally learned characteristics of the standard characters, which affects the "framework" initially established by the model for recognizing standard characters.
  • the use of error word training samples can well solve the problems of over-learning and over-weakening.
  • the training method using the error word training sample uses a back-propagation algorithm based on batch gradient descent, and updates the weight and offset of the Chinese handwriting recognition model according to the algorithm to obtain the target Chinese handwriting recognition model.
  • the target Chinese handwriting recognition model refers to the finally trained model that can be used to recognize Chinese handwriting.
  • The sample size of the error word training samples is small (there are relatively few error words).
  • Back-propagation updates are therefore performed on all the errors generated, ensuring that every error contributes to adjusting and updating the network, which fully trains the convolutional neural network and improves the recognition accuracy of the target Chinese handwriting recognition model.
  • steps S40 and S50 use a back-propagation algorithm based on stochastic gradient descent; step S70 uses a back-propagation algorithm based on batch gradient descent.
  • In step S40, the process of updating the weights and offsets of the convolutional neural network using a back-propagation algorithm based on stochastic gradient descent includes the following steps:
  • In step S50, the process of updating the weights and offsets of the convolutional neural network using a back-propagation algorithm based on stochastic gradient descent is similar to that of step S40 and is not repeated here.
  • In step S70, the process of updating the weights and offsets of the convolutional neural network using a back-propagation algorithm based on batch gradient descent specifically includes the following steps:
  • Obtain the binarized pixel value feature matrix corresponding to one training sample in the error word training samples, input it into the adjusted Chinese handwriting recognition model (which is essentially a convolutional neural network) to obtain the forward output, and calculate the error between the forward output and the real result; then obtain and sequentially input the binarized pixel value feature matrices corresponding to the remaining training samples into the adjusted Chinese handwriting recognition model, calculate the error between each corresponding forward output and real result, and accumulate the errors to obtain the total error. The total error is used to perform a back-propagation based on gradient descent, updating the network weights and offsets; the process of calculating the total error and updating the weights and offsets of the network with it is repeated until the total error is less than the stop-iteration threshold ε2, at which point the loop ends, the updated weights and offsets are obtained, and the target Chinese handwriting recognition model is obtained.
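  • A compact sketch of this batch-gradient-descent update on the error word training samples, continuing the PyTorch sketches above; the learning rate, epoch cap, stop threshold, and the single full-batch forward pass (mathematically equivalent to accumulating the per-sample errors described above) are illustrative assumptions, and cross-entropy again stands in for the error function used in the application.

```python
import torch.nn as nn
import torch.optim as optim

def finetune_on_error_words(model, error_samples, error_labels,
                            max_epochs=50, lr=0.001, stop_threshold=1e-3):
    """Batch gradient descent: every update uses the total error over ALL error-word samples."""
    opt = optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(max_epochs):
        opt.zero_grad()
        total_error = loss_fn(model(error_samples), error_labels)  # error accumulated over the whole batch
        if total_error.item() < stop_threshold:                    # stop-iteration threshold
            break
        total_error.backward()                                     # back-propagate the total error
        opt.step()                                                 # one update per full pass
    return model   # the fine-tuned (target) Chinese handwriting recognition model
```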
  • In steps S40 and S50, because the number of training samples used for model training is relatively large, using a back-propagation algorithm based on batch gradient descent would affect the efficiency and effect of network training, and model training might not even proceed normally, making it difficult to train effectively.
  • Using the back propagation algorithm based on stochastic gradient descent to perform error back propagation update can improve the efficiency and effect of network training and make training more effective.
  • the sample size of the error word training samples is small (the number of error words is small).
  • With the back-propagation algorithm based on batch gradient descent, all the errors generated by the characters in the error word training samples can be back-propagated and updated, ensuring that every error contributes to adjusting and updating the network, so the convolutional neural network can be fully trained.
  • Comparing the back-propagation algorithm based on batch gradient descent with the back-propagation algorithm based on stochastic gradient descent: the gradient of the former is exact and can fully train the convolutional neural network, whereas the latter randomly draws one training sample from the training set each time, so the gradient used to update the network parameters is approximate rather than exact, and its training accuracy is lower than that of the former.
  • the use of batch gradient descent-based back propagation algorithm can improve the accuracy of model training, so that the target Chinese handwriting recognition model obtained by training has accurate recognition ability.
  • In steps S10-S70, the optical character recognition technology is used to obtain the pixel value feature matrix of each Chinese character in the to-be-processed Chinese character training samples, and the standard Chinese character training samples are obtained based on the pixel value feature matrix of each Chinese character in the to-be-processed Chinese character training samples; the standard Chinese character training samples can be directly recognized and read by the computer. Then the convolutional neural network is initialized, which helps to improve the training efficiency of the neural network.
  • error word training samples can largely eliminate the adverse effects caused by over-learning and over-weakening during the original training process, and can further optimize the recognition accuracy.
  • When training the standard Chinese character recognition model and the adjusted Chinese handwriting recognition model, a back-propagation algorithm based on stochastic gradient descent is used, which still yields a good training effect when the number of training samples is large.
  • When training the target Chinese handwriting recognition model, a back-propagation algorithm based on batch gradient descent is used. Using batch gradient descent ensures that the parameters in the model are fully updated: the errors generated by the training samples during the training process are back-propagated and updated, and the parameters are updated comprehensively according to the generated errors to improve the recognition accuracy of the obtained model.
  • Obtaining the standard Chinese character training samples based on the pixel value feature matrix of each Chinese character in the to-be-processed Chinese character training samples specifically includes the following steps:
  • S21 Obtain a pixel value feature matrix of each Chinese character in the training sample of Chinese characters to be processed, normalize each pixel value in the pixel value feature matrix, and obtain a normalized pixel value feature matrix of each Chinese character.
  • The normalization formula is $y = \frac{x - \text{MinValue}}{\text{MaxValue} - \text{MinValue}}$, where MaxValue is the maximum pixel value in the pixel value feature matrix of each Chinese character, MinValue is the minimum pixel value in the pixel value feature matrix of each Chinese character, x is the pixel value before normalization, and y is the pixel value after normalization.
  • a pixel value feature matrix of each Chinese character in a training sample of Chinese characters to be processed is obtained.
  • the pixel value feature matrix of each Chinese character represents the feature of the corresponding word.
  • the pixel value represents the feature of the word.
  • Characters are based on a two-dimensional representation (generally a character is represented by an m × n image), so the pixel values can be represented by a matrix, that is, a pixel value feature matrix is formed.
  • the computer device can recognize the form of the pixel value characteristic matrix and read the value in the pixel value characteristic matrix.
  • The server uses the normalization formula to normalize each pixel value in the pixel value feature matrix of each Chinese character to obtain the normalized pixel value feature matrix of each Chinese character.
  • The normalization processing compresses the pixel value feature matrix of each Chinese character into the same range, which speeds up calculations related to the pixel value feature matrix and helps improve the training efficiency of the standard Chinese character recognition model.
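  • A minimal sketch of this min-max normalization, assuming each character image is held as a NumPy array of grayscale pixel values:

```python
import numpy as np

def normalize_pixel_matrix(pixel_matrix):
    """Min-max normalization: y = (x - MinValue) / (MaxValue - MinValue), giving values in [0, 1]."""
    pixel_matrix = pixel_matrix.astype(np.float32)
    min_value, max_value = pixel_matrix.min(), pixel_matrix.max()
    if max_value == min_value:                 # constant image: avoid division by zero
        return np.zeros_like(pixel_matrix)
    return (pixel_matrix - min_value) / (max_value - min_value)
```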
  • S22 Divide the pixel values in the normalized pixel value feature matrix of each Chinese character into two types of pixel values, and establish a binary pixel value feature matrix of each Chinese character based on the two types of pixel values.
  • the combination of the binarized pixel feature matrix is used as the standard Chinese character training sample.
  • the pixel values in the normalized pixel value feature matrix of each Chinese character are divided into two types of pixel values.
  • the two types of pixel values refer to that the pixel values include only the pixel value A or the pixel value B.
  • a pixel value greater than or equal to 0.5 in the normalized pixel feature matrix can be taken as 1
  • a pixel value less than 0.5 can be taken as 0, and a corresponding binary pixel feature matrix for each Chinese character can be established.
  • The elements in the binarized pixel value feature matrix of each Chinese character contain only 0 or 1.
  • the Chinese character combination corresponding to the binarized pixel value feature matrix is used as a standard Chinese character training sample.
  • the feature representation of the characters can be further simplified by establishing a binary pixel value feature matrix.
  • Each Chinese character can then be represented and distinguished using only a matrix of 0s and 1s, which increases the speed at which the computer processes the feature matrices of Chinese characters and further improves the training efficiency of the standard Chinese character recognition model.
  • In steps S21-S22, the to-be-processed Chinese character training samples are normalized and the values are divided into two types, the binarized pixel value feature matrix of each Chinese character is obtained, and the characters corresponding to the binarized pixel value feature matrices are used as the standard Chinese character training samples, which can significantly shorten the time for training the standard Chinese character recognition model.
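  • A short NumPy sketch of the binarization of step S22, using the 0.5 threshold from the example above:

```python
import numpy as np

def binarize_pixel_matrix(normalized_matrix, threshold=0.5):
    """Map normalized pixel values to two classes: values >= threshold become 1, the rest become 0."""
    return (normalized_matrix >= threshold).astype(np.uint8)

# Example: a 2x2 normalized patch becomes a matrix containing only 0s and 1s.
patch = np.array([[0.1, 0.7], [0.5, 0.2]])
print(binarize_pixel_matrix(patch))   # [[0 1]
                                      #  [1 0]]
```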
  • In step S40, the standard Chinese character training samples are input into the convolutional neural network for training, and a back-propagation algorithm based on stochastic gradient descent is used to update the weights and offsets of the convolutional neural network to obtain the standard Chinese character recognition model, which includes the following steps:
  • S41: Input the standard Chinese character training samples into the convolutional neural network, and obtain the forward output of the standard Chinese character training samples in the convolutional neural network.
  • the convolutional neural network is a kind of feedforward neural network. Its artificial neurons can respond to a part of the surrounding cells within the coverage area, and can perform image processing and recognition.
  • Convolutional neural networks usually include at least two non-linearly trainable convolutional layers, at least two non-linear pooling layers, and at least one fully connected layer, that is, at least five hidden layers, in addition to an input layer and an output layer.
  • The standard Chinese character training samples are input into the convolutional neural network for training and are processed in each layer of the convolutional neural network (specifically, the weights and offsets process the standard Chinese character training samples accordingly), and a corresponding output value is obtained in each layer. Because the convolutional neural network contains many layers and the functions of the layers differ, the output of each layer is different.
  • Average pooling is also commonly used, that is, taking the average value of the n × n block of samples as the sampled value.
  • The output of each layer in the convolutional neural network can be obtained, and the output $a^L$ of the output layer is finally obtained; this output is the forward output.
  • The forward output obtained in step S41 can reflect the behaviour of the standard Chinese character training samples in the convolutional neural network; this output can be compared with the objective facts (real results) to determine the error between the two, based on which the convolutional neural network is adjusted.
  • S42 Construct an error function according to the forward output and the real result.
  • The expression of the error function is $E = \frac{1}{2n}\sum_{i=1}^{n}\lVert x_i - y_i \rVert^2$, where n represents the total number of training samples, $x_i$ represents the forward output of the i-th training sample, and $y_i$ represents the real result of the i-th training sample corresponding to $x_i$.
  • The real result is the objective fact. For example, if the input character is a handwritten "太" (tai), the forward output may be "大" (da) or some other result, while the real result is the originally input "太". The real result can be understood as the label value of the training sample, used to calculate the error with respect to the forward output.
  • A corresponding error function can be constructed according to the error, so that the convolutional neural network can be trained using the error function to update the weights and biases; the updated weights and biases can then process the forward-input training samples to produce a forward output that is the same as or close to the real result.
  • an appropriate error function can be constructed according to the actual situation.
  • The error function constructed in this embodiment, $E = \frac{1}{2n}\sum_{i=1}^{n}\lVert x_i - y_i \rVert^2$, can better reflect the error between the forward output and the true result.
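  • For illustration, the quadratic error function as reconstructed above can be computed as follows; the example forward outputs and one-hot label values are made up:

```python
import numpy as np

def quadratic_error(forward_outputs, real_results):
    """Quadratic cost E = 1/(2n) * sum_i ||x_i - y_i||^2 over n training samples."""
    n = len(forward_outputs)
    return np.sum((forward_outputs - real_results) ** 2) / (2 * n)

x = np.array([[0.8, 0.1, 0.1], [0.2, 0.7, 0.1]])   # forward outputs for two samples
y = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])   # real results (label values)
print(quadratic_error(x, y))                       # 0.05
```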
  • the weights and offsets of the convolutional neural network are updated by using a back-propagation algorithm based on stochastic gradient descent to obtain a standard Chinese character recognition model.
  • The weights are updated as follows: in the fully connected layer of the convolutional neural network, the formula for updating the weights is $W_l' = W_l - \frac{\alpha}{m}\sum_{i=1}^{m}\delta^{i,l}(a^{i,l-1})^T$; in the convolutional layer of the convolutional neural network, the formula for updating the weights is $W_l' = W_l - \frac{\alpha}{m}\sum_{i=1}^{m}\delta^{i,l} * \mathrm{rot180}(a^{i,l-1})$, where $W_l'$ represents the updated weight, $W_l$ represents the weight before the update, α represents the learning rate, m represents the number of standard Chinese character training samples, i represents the i-th input Chinese character sample, $\delta^{i,l}$ represents the sensitivity of the i-th input Chinese character sample at the l-th layer, $a^{i,l-1}$ represents the output of the i-th input Chinese character sample at the (l-1)-th layer, T represents the matrix transposition operation, * represents the convolution operation, and rot180 represents the operation of rotating the matrix by 180 degrees. In the fully connected layer of the convolutional neural network, the formula for updating the offset is $b_l' = b_l - \frac{\alpha}{m}\sum_{i=1}^{m}\delta^{i,l}$; in the convolutional layer, the formula for updating the offset is $b_l' = b_l - \frac{\alpha}{m}\sum_{i=1}^{m}\sum_{u,v}(\delta^{i,l})_{u,v}$.
  • a back propagation algorithm based on stochastic gradient descent is used to update the network parameters, and the updated convolutional neural network is used as the standard Chinese character recognition model.
  • backward propagation should be performed according to the actual situation of each layer to update the network parameters.
  • To update the weights and offsets, the output layer is handled first: the error function is used to take partial derivatives with respect to the weights W and the offsets b, and a common factor, namely the sensitivity $\delta^L$ of the output layer, is obtained. From the sensitivity $\delta^L$, the sensitivity $\delta^l$ of each layer l can be determined in sequence; the gradient of the neural network at layer l is obtained according to $\delta^l$, and the gradients are then used to update the weights and biases of the convolutional neural network.
  • For a fully connected layer, $\delta^l = (W^{l+1})^T \delta^{l+1} \odot \sigma'(z^l)$, where $W^{l+1}$ represents the weight of the (l+1)-th layer, T represents the matrix transpose operation, $\delta^{l+1}$ represents the sensitivity of the (l+1)-th layer, ⊙ represents the element-wise multiplication of two matrices (the Hadamard product), σ represents the activation function, and $z^l$ represents the output before the activation function is applied in the forward propagation calculation.
  • For a convolutional layer, $\delta^l = \delta^{l+1} * \mathrm{rot180}(W^{l+1}) \odot \sigma'(z^l)$, where * represents the convolution operation and rot180 represents the operation of rotating the matrix by 180 degrees.
  • For a pooling layer, $\delta^l = \mathrm{upsample}(\delta^{l+1}) \odot \sigma'(z^l)$, where upsample represents the upsampling operation.
  • The corresponding sensitivity $\delta^l$ is obtained according to the above formulas for the layers of the convolutional neural network, and the weights and offsets of layer l are updated according to the sensitivity $\delta^l$.
  • the pooling layer has no weights and biases, so only the weights and biases of the fully connected layer and the convolutional layer need to be updated.
  • In step S43, if the current layer is a fully connected layer, the formula for updating the weights is $W_l' = W_l - \frac{\alpha}{m}\sum_{i=1}^{m}\delta^{i,l}(a^{i,l-1})^T$, where $W_l'$ represents the updated weight, $W_l$ represents the weight before the update, α represents the learning rate, m represents the number of standard Chinese character training samples, i represents the i-th input Chinese character sample, $\delta^{i,l}$ represents the sensitivity of the i-th input Chinese character sample at the l-th layer, $a^{i,l-1}$ represents the output of the i-th input Chinese character sample at the (l-1)-th layer, and T represents the matrix transposition operation; the sum $\sum_{i=1}^{m}\delta^{i,l}(a^{i,l-1})^T$ is the gradient of the weight W at layer l. The formula for updating the offset is $b_l' = b_l - \frac{\alpha}{m}\sum_{i=1}^{m}\delta^{i,l}$, where $b_l'$ represents the updated offset and $b_l$ represents the offset before the update. If the current layer is a convolutional layer, the formula for updating the weights is $W_l' = W_l - \frac{\alpha}{m}\sum_{i=1}^{m}\delta^{i,l} * \mathrm{rot180}(a^{i,l-1})$, and the formula for updating the offset is $b_l' = b_l - \frac{\alpha}{m}\sum_{i=1}^{m}\sum_{u,v}(\delta^{i,l})_{u,v}$, where (u, v) refers to the position of a small block (an element of the convolution feature map) in each convolution feature map obtained during the convolution operation.
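  • To make the fully connected layer update concrete, here is a small NumPy sketch of the formulas above; the array shapes, the learning rate, and the toy sensitivities are illustrative assumptions.

```python
import numpy as np

def update_fc_layer(W, b, deltas, prev_activations, alpha):
    """Fully connected layer update:
    W' = W - (alpha/m) * sum_i delta^{i,l} (a^{i,l-1})^T
    b' = b - (alpha/m) * sum_i delta^{i,l}
    """
    m = len(deltas)                                            # number of samples in the update
    grad_W = sum(np.outer(d, a) for d, a in zip(deltas, prev_activations)) / m
    grad_b = sum(deltas) / m
    return W - alpha * grad_W, b - alpha * grad_b

# Toy example: 2 samples, a layer with 4 inputs and 3 outputs.
W = np.zeros((3, 4))
b = np.zeros(3)
deltas = [np.array([0.1, -0.2, 0.05]), np.array([0.0, 0.3, -0.1])]   # sensitivities delta^{i,l}
acts = [np.ones(4), 0.5 * np.ones(4)]                                # outputs a^{i,l-1} of layer l-1
W_new, b_new = update_fc_layer(W, b, deltas, acts, alpha=0.5)
```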
  • Steps S41-S43 construct an error function according to the forward output obtained by passing the standard Chinese character training samples through the convolutional neural network, and based on the error function the weights and offsets are updated by back-propagation to obtain the standard Chinese character recognition model. The model learns the deep features of the standard Chinese character training samples and can accurately identify standard regular characters.
  • In step S60, the adjusted Chinese handwriting recognition model is used to identify the Chinese character samples to be tested, error words whose recognition results do not match the real results are obtained, and all the error words are used as the error word training samples; this includes the following steps:
  • S61 Input the Chinese character sample to be tested into the adjusted Chinese handwriting recognition model, and obtain the output value of each character in the Chinese character sample to be tested in the adjusted Chinese handwriting recognition model.
  • The adjusted Chinese handwriting recognition model is used to recognize the Chinese character samples to be tested, which include several Chinese characters.
  • In the Chinese character library, there are about 3,000 commonly used Chinese characters.
  • A probability value for the similarity between each character in the Chinese character library and the input Chinese character sample to be tested is obtained.
  • This probability value is the output value of each character of the Chinese character sample to be tested in the adjusted Chinese handwriting recognition model, which can be obtained with the softmax function. Put simply, when the character "我" is input, an output value (represented as a probability) corresponding to each character in the Chinese character library is obtained in the adjusted Chinese handwriting recognition model; for example, the output value corresponding to "我" in the Chinese character library is 99.5%, and the output values of the remaining characters add up to 0.5%.
  • S62 Select the maximum output value among the output values corresponding to each word, and obtain the recognition result of each word according to the maximum output value.
  • a maximum output value among all output values corresponding to each word is selected, and a recognition result of the word can be obtained according to the maximum output value.
  • the output value directly reflects the similarity between the words in the input Chinese character sample to be tested and each character in the Chinese character library, and the maximum output value indicates that the sample of the character to be tested is closest to a word in the Chinese character library.
  • The recognition result of the character can thus be obtained; for example, the recognition result finally output for the character "我" is "我".
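  • A brief NumPy sketch of steps S61-S62; the three-character library stands in for the roughly 3,000-character library mentioned above, and the raw score values are made up.

```python
import numpy as np

def softmax(scores):
    """Convert raw scores into probabilities over the character library."""
    e = np.exp(scores - scores.max())     # subtract the max for numerical stability
    return e / e.sum()

def recognize(scores, character_library):
    """Step S61: an output value per library character; step S62: take the maximum output value."""
    probs = softmax(scores)
    best = int(np.argmax(probs))          # index of the maximum output value
    return character_library[best], probs[best]

library = ["我", "你", "他"]              # tiny stand-in for the Chinese character library
scores = np.array([9.2, 1.3, 0.5])        # illustrative raw outputs for one input character
print(recognize(scores, library))         # ('我', ~0.999)
```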
  • S63: The obtained recognition result is compared with the real result (objective fact), and the error words whose recognition results do not match the real results are used as the error word training samples.
  • The recognition result is only the result recognized by the adjusted Chinese handwriting recognition model on the Chinese character samples to be tested and may differ from the real result, reflecting that the model still has shortcomings in recognition accuracy; these shortcomings can be remedied by training on the error word training samples to achieve more accurate recognition results.
  • Steps S61-S63 obtain the output value of each character of the Chinese character samples to be tested in the adjusted Chinese handwriting recognition model, select from the output values the maximum output value, which reflects the degree of similarity between characters, obtain the recognition result from the maximum output value, and obtain the error word training samples according to the recognition results, providing an important technical premise for subsequently using the error word training samples to further optimize the recognition accuracy.
  • In this embodiment, optical character recognition technology is used to obtain the pixel value feature matrix of each Chinese character in the to-be-processed Chinese character training samples, and the standard Chinese character training samples, which can be directly recognized and read by the computer, are obtained based on the pixel value feature matrix of each Chinese character in the to-be-processed Chinese character training samples.
  • This initialization method makes it possible to find the minimum value of the error function quickly and efficiently, which is beneficial to updating and adjusting the convolutional neural network.
  • The non-standard Chinese character training samples are then used to update the standard Chinese character recognition model, so that the adjusted Chinese handwriting recognition model obtained after the update, on the premise of already being able to recognize standard Chinese characters, learns the deep features of non-standard Chinese characters through training and updating, which allows the adjusted Chinese handwriting recognition model to better recognize non-standard Chinese handwriting.
  • the maximum output value that reflects the degree of similarity between words is selected from the output values, and the recognition result is obtained by using the maximum output value.
  • The error word training samples are obtained according to the recognition results, and all the error words are input as error word training samples into the adjusted Chinese handwriting recognition model for a training update to obtain the target Chinese handwriting recognition model.
  • error word training samples can largely eliminate the adverse effects caused by over-learning and over-weakening during the original training process, and can further optimize the recognition accuracy.
  • The standard Chinese character recognition model and the adjusted Chinese handwriting recognition model are trained using a back-propagation algorithm based on stochastic gradient descent, which still provides good training efficiency and training effect when the number of training samples is large.
  • the target Chinese handwriting recognition model is trained using a back-propagation algorithm based on batch gradient descent, which can ensure that the parameters in the model are fully updated, and the errors generated by the training samples during the training process are back-propagated and updated.
  • the parameters are updated according to the generated errors to improve the recognition accuracy of the obtained model.
  • FIG. 6 shows a principle block diagram of a handwriting model training device corresponding to the handwriting model training method in the embodiment.
  • The handwriting model training device includes a pixel value feature matrix acquisition module 10, a standard Chinese character training sample acquisition module 20, an initialization module 30, a standard Chinese character recognition model acquisition module 40, an adjusted Chinese handwriting recognition model acquisition module 50, an error word training sample acquisition module 60, and a target Chinese handwriting recognition model acquisition module 70.
  • The pixel value feature matrix acquisition module 10 and the other modules correspond one-to-one to the steps of the handwriting model training method in the embodiment; to avoid redundant description, this embodiment does not detail them one by one.
  • the pixel value feature matrix obtaining module 10 is configured to obtain the pixel value feature matrix of each Chinese character in a training sample of Chinese characters to be processed by using optical character recognition technology.
  • The standard Chinese character training sample acquisition module 20 is configured to obtain the standard Chinese character training samples based on the pixel value feature matrix of each Chinese character in the to-be-processed Chinese character training samples.
  • the initialization module 30 is configured to initialize a convolutional neural network.
  • The standard Chinese character recognition model acquisition module 40 is used to input the standard Chinese character training samples into the convolutional neural network for training, and uses a back-propagation algorithm based on stochastic gradient descent to update the weights and biases of the convolutional neural network to obtain the standard Chinese character recognition model.
  • The adjusted Chinese handwriting recognition model acquisition module 50 is used to obtain non-standard Chinese character training samples, input the non-standard Chinese character training samples into the standard Chinese character recognition model for training, and use a back-propagation algorithm based on stochastic gradient descent to update the weights and offsets of the standard Chinese character recognition model to obtain the adjusted Chinese handwriting recognition model.
  • The error word training sample acquisition module 60 is used to obtain the Chinese character samples to be tested, use the adjusted Chinese handwriting recognition model to identify the Chinese character samples to be tested, obtain the error words whose recognition results do not match the real results, and use all the error words as the error word training samples.
  • The target Chinese handwriting recognition model acquisition module 70 is used to input the error word training samples into the adjusted Chinese handwriting recognition model for training, and uses a back-propagation algorithm based on batch gradient descent to update the weights and offsets of the adjusted Chinese handwriting recognition model to obtain the target Chinese handwriting recognition model.
  • The standard Chinese character training sample acquisition module 20 includes a normalized pixel value feature matrix obtaining unit 21 and a standard Chinese character training sample obtaining unit 22.
  • The normalized pixel value feature matrix obtaining unit 21 is configured to obtain the pixel value feature matrix of each Chinese character in the to-be-processed Chinese character training samples and normalize each pixel value in the pixel value feature matrix to obtain the normalized pixel value feature matrix of each Chinese character, where the normalization formula is $y = \frac{x - \text{MinValue}}{\text{MaxValue} - \text{MinValue}}$, MaxValue is the maximum pixel value in the pixel value feature matrix of each Chinese character, MinValue is the minimum pixel value in the pixel value feature matrix of each Chinese character, x is the pixel value before normalization, and y is the pixel value after normalization.
  • the standard Chinese character training sample acquisition unit 22 is configured to divide the pixel values in the normalized pixel value feature matrix of each Chinese character into two types of pixel values, establish a binarized pixel value feature matrix of each Chinese character based on the two types of pixel values, and use the combination of the binarized pixel feature matrices of the Chinese characters as the standard Chinese character training samples.
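  • As a minimal sketch of how units 21 and 22 could be realized, the Python/NumPy snippet below normalizes a character's pixel value feature matrix with y = (x - MinValue) / (MaxValue - MinValue) and then binarizes it at the 0.5 threshold described above; the function names, the toy 4x4 patch and the grayscale value range are illustrative assumptions, not taken from the original.

```python
import numpy as np

def normalize_pixel_matrix(pixel_matrix: np.ndarray) -> np.ndarray:
    """Min-max normalize a character's pixel value feature matrix into [0, 1].

    Implements y = (x - MinValue) / (MaxValue - MinValue) from the description.
    """
    pixel_matrix = pixel_matrix.astype(np.float32)
    min_value = pixel_matrix.min()
    max_value = pixel_matrix.max()
    if max_value == min_value:            # guard against a completely flat image
        return np.zeros_like(pixel_matrix)
    return (pixel_matrix - min_value) / (max_value - min_value)

def binarize_pixel_matrix(normalized: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Split normalized pixel values into two classes: >= threshold -> 1, else 0."""
    return (normalized >= threshold).astype(np.uint8)

# Example: a toy 4x4 grayscale character patch (values 0-255), assumed layout.
patch = np.array([[  0,  10, 200, 255],
                  [  5, 180, 220,  30],
                  [  0, 210, 190,  15],
                  [  0,  20,  25,   0]])
binary_sample = binarize_pixel_matrix(normalize_pixel_matrix(patch))
print(binary_sample)
```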
  • the initialization module 30 is configured to initialize the convolutional neural network, wherein the initialized weights of the convolutional neural network satisfy S(W_l) = 2 / n_l for every layer l, where n_l represents the number of training samples input at the l-th layer, S() represents the variance operation, W_l represents the weights of the l-th layer, and l denotes the l-th layer of the convolutional neural network.
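  • A small sketch of one way this initialization could be implemented, assuming n_l is read as the fan-in of the layer and that the weights are drawn from a zero-mean Gaussian with variance 2/n_l (a He-style initialization consistent with the ReLU derivation in the description); the small uniform bias range [-0.3, 0.3] follows the description, while the shapes and names are illustrative assumptions.

```python
import numpy as np

def init_layer(n_l: int, weight_shape: tuple, rng: np.random.Generator) -> tuple:
    """Zero-mean weights with variance 2 / n_l (so S(W_l) = 2 / n_l) and a small
    uniform bias in [-0.3, 0.3]; n_l is interpreted here as the layer fan-in."""
    std = np.sqrt(2.0 / n_l)
    weights = rng.normal(loc=0.0, scale=std, size=weight_shape)
    bias = rng.uniform(-0.3, 0.3, size=weight_shape[0])
    return weights, bias

# Example: a bank of 16 conv kernels of size 3x3 over 8 input channels.
rng = np.random.default_rng(0)
W, b = init_layer(n_l=8 * 3 * 3, weight_shape=(16, 8, 3, 3), rng=rng)
print(round(float(W.var()), 4), b.shape)  # variance close to 2/72, bias shape (16,)
```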
  • the standard Chinese character recognition model acquisition module 40 includes a forward output acquisition unit 41, an error function construction unit 42 and a standard Chinese character recognition model acquisition unit 43.
  • the forward output acquisition unit 41 is configured to input the standard Chinese character training samples into the convolutional neural network, and to obtain the forward output of the standard Chinese character training samples in the convolutional neural network.
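  • To make the notion of a forward output concrete, the sketch below pushes one binarized character through a toy pipeline of convolution, ReLU, max pooling, a fully connected layer and a softmax output, in the spirit of a_l = sigma(a_{l-1} * W_l + b_l) with a softmax output layer; the single kernel, the tiny 8x8 input and the 10-class output are illustrative assumptions, not the claimed network.

```python
import numpy as np

def conv2d(x, kernel, bias):
    """Naive 'valid' 2-D convolution of a single-channel image with one kernel."""
    kh, kw = kernel.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel) + bias
    return out

def relu(x):
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    oh, ow = x.shape[0] // size, x.shape[1] // size
    return x[:oh * size, :ow * size].reshape(oh, size, ow, size).max(axis=(1, 3))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def forward(binary_char, kernel, conv_bias, W_fc, b_fc):
    """conv -> ReLU -> max-pool -> fully connected -> softmax forward output."""
    a1 = relu(conv2d(binary_char, kernel, conv_bias))
    a2 = max_pool(a1)
    flat = a2.reshape(-1)
    return softmax(W_fc @ flat + b_fc)

# Toy example: an 8x8 binarized character, one 3x3 kernel, a 10-class output layer.
rng = np.random.default_rng(0)
char = (rng.random((8, 8)) > 0.5).astype(float)
kernel = rng.normal(size=(3, 3)); W_fc = rng.normal(size=(10, 9)); b_fc = np.zeros(10)
print(forward(char, kernel, 0.0, W_fc, b_fc).sum())  # probabilities sum to 1
```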
  • the error function constructing unit 42 is configured to construct an error function according to the forward output and the real result.
  • the error function is defined over all the training samples, where n represents the total number of training samples, x_i represents the forward output of the i-th training sample, and y_i represents the real result of the i-th training sample corresponding to x_i.
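  • The error function itself is given only as a figure; one plausible reading, consistent with the quantities named above, is a quadratic cost over the n training samples, sketched below. The 1/(2n) factor and the one-hot encoding of the real results are assumptions, not taken from the original.

```python
import numpy as np

def quadratic_error(forward_outputs: np.ndarray, true_results: np.ndarray) -> float:
    """Assumed form E = 1/(2n) * sum_i ||x_i - y_i||^2, where x_i is the forward
    output of the i-th training sample and y_i its real (one-hot) result."""
    n = forward_outputs.shape[0]
    return float(np.sum((forward_outputs - true_results) ** 2) / (2 * n))

# Toy example: 2 samples, 4-class outputs.
x = np.array([[0.7, 0.1, 0.1, 0.1],
              [0.2, 0.5, 0.2, 0.1]])
y = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
print(quadratic_error(x, y))
```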
  • the standard Chinese character recognition model acquisition unit 43 is configured to update the weights and offsets of the convolutional neural network according to the error function, using a back-propagation algorithm based on stochastic gradient descent, so as to obtain the standard Chinese character recognition model.
  • the fully connected layer and the convolutional layer of the convolutional neural network use different formulas for updating the weights and the offsets, built from the layer sensitivities, where W_l' represents the updated weight, W_l represents the weight before the update, α represents the learning rate, m represents the standard Chinese character training samples, i represents the i-th input Chinese character sample, δ_{i,l} represents the sensitivity of the i-th input Chinese character sample at the l-th layer, a_{i,l-1} represents the output of the i-th input Chinese character sample at the (l-1)-th layer, T represents the matrix transposition operation, * represents the convolution operation, and rot180 represents the operation of flipping a matrix by 180 degrees; for the offsets, b_l' represents the updated offset, b_l represents the offset before the update, and (u,v) denotes the position of each small block in the convolution feature maps obtained during the convolution operation.
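  • The exact update formulas appear only as figures; the sketch below implements a fully connected layer update in the form the parameter list suggests, accumulating the per-sample sensitivities against the previous-layer outputs over the m samples. The 1/m averaging, the sign convention and the function name are assumptions, and the convolutional-layer case (with rot180 and per-block offset sums) is omitted here.

```python
import numpy as np

def sgd_update_fc_layer(W, b, deltas, prev_outputs, lr=0.01):
    """Update a fully connected layer's weights and offsets from per-sample
    sensitivities, in the spirit of W_l' = W_l - (lr/m) * sum_i delta_il a_il-1^T
    and b_l' = b_l - (lr/m) * sum_i delta_il (assumed reading of the figures).

    deltas:       (m, out_dim) sensitivity of each of the m samples at layer l
    prev_outputs: (m, in_dim)  output of each sample at layer l-1
    """
    m = deltas.shape[0]
    grad_W = deltas.T @ prev_outputs / m          # (out_dim, in_dim)
    grad_b = deltas.mean(axis=0)                  # (out_dim,)
    return W - lr * grad_W, b - lr * grad_b

# Toy example: a 3-in, 2-out fully connected layer updated from 4 samples.
rng = np.random.default_rng(1)
W = rng.normal(size=(2, 3)); b = np.zeros(2)
deltas = rng.normal(size=(4, 2)); prev = rng.normal(size=(4, 3))
W_new, b_new = sgd_update_fc_layer(W, b, deltas, prev)
print(W_new.shape, b_new.shape)
```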
  • the error word training sample acquisition module 60 includes a model output value acquisition unit 61, a model recognition result acquisition unit 62, and an error word training sample acquisition unit 63.
  • the model output value obtaining unit 61 is configured to input a sample of the Chinese character to be tested into the adjusted Chinese handwriting recognition model, and obtain an output value of each character in the sample of the Chinese character to be tested in the adjusted Chinese handwriting recognition model.
  • the model recognition result acquisition unit 62 is configured to select the maximum output value among the output values corresponding to each character, and to obtain the recognition result of each character according to the maximum output value.
  • the error word training sample acquisition unit 63 is configured to obtain, according to the recognition results, the error words whose recognition results do not match the real results, and to use all the error words as the error word training samples.
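  • A sketch of how units 61 to 63 could gather the error word training samples, assuming the adjusted model exposes a function that returns a probability vector over the character library for one sample; the helper name, the toy 10-class dummy model and the integer labels are illustrative assumptions.

```python
import numpy as np

def collect_error_word_samples(model_predict, test_samples, true_labels):
    """Run the adjusted model over the test samples, keep those whose top-scoring
    character does not match the real result, and return them as the error word
    training set (sample, true_label) used for the batch-gradient fine-tuning step.

    model_predict(sample) is assumed to return a probability vector over the
    character library (e.g. a softmax output with 3000+ entries).
    """
    error_samples = []
    for sample, label in zip(test_samples, true_labels):
        outputs = model_predict(sample)
        predicted = int(np.argmax(outputs))   # maximum output value -> recognition result
        if predicted != label:
            error_samples.append((sample, label))
    return error_samples

# Toy usage with a dummy 10-class "model".
dummy = lambda s: np.eye(10)[int(s) % 10]
errors = collect_error_word_samples(dummy, test_samples=[3, 4], true_labels=[3, 7])
print(len(errors))  # -> 1 (the sample labelled 7 was misrecognized)
```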
  • FIG. 7 shows a flowchart of the handwriting recognition method in this embodiment.
  • the handwriting recognition method can be applied to computer equipment deployed by banking, investment, and insurance institutions to recognize handwritten Chinese characters for artificial intelligence purposes. As shown in FIG. 7, the handwriting recognition method includes the following steps:
  • S80 Obtain the Chinese characters to be recognized, use the target Chinese handwriting recognition model to identify the Chinese characters to be recognized, and obtain the output values of the Chinese characters to be recognized in the target Chinese handwriting recognition model.
  • the target Chinese handwriting recognition model is obtained using the handwriting model training method described above.
  • the Chinese characters to be recognized refer to the Chinese characters on which recognition is to be performed.
  • the Chinese characters to be recognized are obtained and input into the target Chinese handwriting recognition model for recognition, and the output values of the Chinese characters to be recognized in the target Chinese handwriting recognition model are obtained; each Chinese character to be recognized corresponds to more than three thousand output values (the specific number depends on the Chinese character library), and the recognition result of the Chinese character to be recognized can be determined based on these output values.
  • the Chinese characters to be recognized are specifically represented by a binary pixel value feature matrix that can be directly recognized by a computer.
  • S90 Obtain a target probability output value according to the output value and a preset Chinese semantic lexicon, and obtain a recognition result of the Chinese character to be recognized based on the target probability output value.
  • the preset Chinese semantic lexicon refers to a preset lexicon that describes the semantic relationship between Chinese words based on word frequency. For example, in the Chinese semantic lexicon, for two-character words of the form "X阳", the probability of "太阳" (sun) appearing is 30.5%, the probability of "大阳" appearing is 0.5%, and the probabilities of the remaining "X阳" words, such as "骄阳", sum to 69%.
  • the target probability output value is a probability value obtained by combining the output value and a preset Chinese semantic lexicon to obtain the recognition result of the Chinese character to be recognized.
  • obtaining the target probability output value from the output values and the preset Chinese semantic lexicon includes the following steps: (1) selecting, for each character in the Chinese characters to be recognized, the maximum value among its output values as the first probability value, and obtaining a preliminary recognition result of the Chinese characters to be recognized according to the first probability value; (2) obtaining the leftward semantic probability value and the rightward semantic probability value of the character to be recognized according to the preliminary recognition result and the Chinese semantic lexicon; understandably, for a text, the characters have a sequence, so for a text such as "红X阳", the character "X" has two two-character words, "红X" on its left and "X阳" on its right, each with a corresponding probability value, namely the leftward semantic probability value and the rightward semantic probability value; (3) setting a weight for the output value of each character to be recognized, a weight for the leftward semantic probability value and a weight for the rightward semantic probability value, for example 0.4 for the output value, 0.3 for the leftward semantic probability value and 0.3 for the rightward semantic probability value; (4) multiplying each probability value by its weight and summing the weighted values to obtain the target probability output values (there are several target probability output values, the specific number depending on the Chinese character library), and selecting the character corresponding to the maximum target probability output value as the recognition result of the Chinese character to be recognized.
  • in practice, the five largest output values can be selected first; these top five probability values represent the five most likely characters (recognition results), and only these five characters are combined with the Chinese semantic lexicon to calculate the target probability output values, so there are only five target probability output values, which can greatly improve the efficiency of recognition.
  • by combining the output values with the preset Chinese semantic lexicon, accurate recognition results can be obtained. Understandably, for the recognition of a single character (not a text), the corresponding recognition result can be obtained directly from the maximum value among the output values, without adding recognition based on Chinese semantics.
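  • A sketch of the S90 combination, using the example weights from the description (0.4 for the model output value, 0.3 each for the leftward and rightward semantic probability values); the toy lexicon contents, the candidate list and the helper names are illustrative assumptions.

```python
import numpy as np

# Toy semantic lexicon: probability of a two-character word, keyed by the word itself.
SEMANTIC_LEXICON = {"太阳": 0.305, "大阳": 0.005, "红太": 0.20, "红大": 0.01}

def semantic_prob(word: str) -> float:
    return SEMANTIC_LEXICON.get(word, 0.0)

def rescore_candidates(candidates, left_char, right_char,
                       w_out=0.4, w_left=0.3, w_right=0.3):
    """Re-score the top-5 candidate characters for one position in a text.

    candidates: list of (char, model_output_value) pairs for the character "X";
    left_char / right_char: the already-recognized neighbours, e.g. "红" and "阳".
    Returns the candidate with the largest target probability output value.
    """
    scored = []
    for char, output_value in candidates:
        left_p = semantic_prob(left_char + char) if left_char else 0.0
        right_p = semantic_prob(char + right_char) if right_char else 0.0
        target = w_out * output_value + w_left * left_p + w_right * right_p
        scored.append((target, char))
    return max(scored)[1]

# Example: deciding the middle character of "红X阳" from two candidates.
print(rescore_candidates([("太", 0.55), ("大", 0.45)], left_char="红", right_char="阳"))
```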
  • the target Chinese handwriting recognition model is used to recognize the Chinese characters to be recognized, and the output value and the preset Chinese semantic thesaurus are used to obtain the recognition results of the Chinese characters to be recognized.
  • the target Chinese handwriting recognition model itself has high recognition accuracy, and combining it with the Chinese semantic lexicon further improves the accuracy of Chinese handwriting recognition.
  • the Chinese characters to be recognized are input into the target Chinese handwriting recognition model for recognition, and the recognition result is obtained by combining with a preset Chinese semantic thesaurus.
  • when the target Chinese handwriting recognition model is used to recognize Chinese handwriting, accurate recognition results can be obtained.
  • FIG. 8 shows a schematic block diagram of a handwriting recognition device corresponding to the handwriting recognition method in the embodiment.
  • the handwriting recognition device includes an output value acquisition module 80 and a recognition result acquisition module 90.
  • the implementation functions of the output value acquisition module 80 and the recognition result acquisition module 90 correspond one-to-one to the steps of the handwriting recognition method in the embodiment. To avoid redundant description, this embodiment does not detail them one by one.
  • the handwriting recognition device includes an output value acquisition module 80 for obtaining the Chinese characters to be recognized, using the target Chinese handwriting recognition model to identify the Chinese characters to be recognized, and obtaining the output values of the Chinese characters to be recognized in the target Chinese handwriting recognition model;
  • the target Chinese handwriting recognition model is obtained by using the handwriting model training method described above.
  • the recognition result obtaining module 90 is configured to obtain a target probability output value according to the output value and a preset Chinese semantic lexicon, and obtain a recognition result of the Chinese characters to be recognized based on the target probability output value.
  • This embodiment provides one or more non-volatile readable storage media storing computer-readable instructions.
  • when the computer-readable instructions are executed by one or more processors, the one or more processors implement the handwriting model training method in the embodiment. To avoid repetition, details are not repeated here.
  • alternatively, when the computer-readable instructions are executed by one or more processors, the one or more processors implement the functions of each module/unit of the handwriting model training device in the embodiment. To avoid repetition, details are not repeated here.
  • alternatively, when the computer-readable instructions are executed by one or more processors, the one or more processors implement the functions of each step in the handwriting recognition method in the embodiment. To avoid repetition, details are not repeated here.
  • alternatively, when the computer-readable instructions are executed by one or more processors, the one or more processors implement the functions of each module/unit in the handwriting recognition device in the embodiment. To avoid repetition, details are not repeated here.
  • FIG. 9 is a schematic diagram of a computer device according to an embodiment of the present application.
  • the computer device 100 of this embodiment includes a processor 101, a memory 102, and computer-readable instructions 103 stored in the memory 102 and executable on the processor 101.
  • when the computer-readable instructions 103 are executed by the processor 101, the handwriting model training method in the embodiment is implemented. To avoid repetition, details are not described here one by one.
  • the computer-readable instructions 103 are executed by the processor 101, the functions of each model / unit in the handwriting model training device in the embodiment are implemented. To avoid repetition, details are not repeated here.
  • the computer-readable instructions 103 are executed by the processor 101, the functions of the steps in the handwriting recognition method in the embodiment are implemented. To avoid repetition, details are not described here one by one.
  • the computer-readable instruction 103 is executed by the processor 101, the functions of each module / unit in the handwriting recognition device in the embodiment are implemented. To avoid repetition, details are not described here one by one.
  • the computer device 100 may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server.
  • the computer device may include, but is not limited to, a processor 101 and a memory 102.
  • FIG. 9 is only an example of the computer device 100, and does not constitute a limitation on the computer device 100. It may include more or fewer components than shown in the figure, or combine some components or different components.
  • computer equipment may also include input and output equipment, network access equipment, and buses.
  • the so-called processor 101 may be a central processing unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application specific integrated circuits (ASICs), Field-Programmable Gate Array (FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the memory 102 may be an internal storage unit of the computer device 100, such as a hard disk or a memory of the computer device 100.
  • the memory 102 may also be an external storage device of the computer device 100, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the computer device 100, and so on.
  • the memory 102 may also include both an internal storage unit of the computer device 100 and an external storage device.
  • the memory 102 is used to store computer-readable instructions 103 and other programs and data required by the computer device.
  • the memory 102 may also be used to temporarily store data that has been output or is to be output.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each of the units may exist separately physically, or two or more units may be integrated into one unit.
  • the above integrated unit may be implemented in the form of hardware or in the form of software functional unit.

Abstract

A handwriting model training method, a handwriting recognition method, an apparatus, a device and a medium. The handwriting model training method includes: obtaining standard Chinese character training samples; initializing a convolutional neural network; inputting the standard Chinese character training samples into the convolutional neural network for training, and updating the weights and offsets of the convolutional neural network using a back-propagation algorithm based on stochastic gradient descent to obtain a standard Chinese character recognition model; obtaining and using non-standard Chinese character training samples to train and obtain an adjusted Chinese handwriting recognition model; obtaining and using Chinese character samples to be tested to obtain error word training samples; and updating the weights and offsets of the adjusted Chinese handwriting recognition model with the error word training samples using a back-propagation algorithm based on batch gradient descent to obtain a target Chinese handwriting recognition model. With this handwriting model training method, a target Chinese handwriting recognition model with a high handwriting recognition rate can be obtained.

Description

手写模型训练方法、手写字识别方法、装置、设备及介质
本申请以2018年6月4日提交的申请号为201810564062.3,名称为“手写模型训练方法、手写字识别方法、装置、设备及介质”的中国专利申请为基础,并要求其优先权。
技术领域
本申请涉及中文字识别领域,尤其涉及一种手写模型训练方法、手写字识别方法、装置、设备及介质。
背景技术
传统手写字识别方法大多包括二值化处理、字符分割、特征提取和支持向量机等步骤进行识别,采用传统手写字识别方法在识别较为潦草的非规范字(手写中文字)时,识别的精确度不高,使得其识别效果不理想。传统手写字识别方法很大程度上只能识别规范字,对实际生活中各种各样的手写字进行识别时,准确率较低。
发明内容
本申请实施例提供一种手写模型训练方法、装置、设备及介质,以解决当前手写字识别准确率不高的问题。
一种手写模型训练方法,包括:
采用光学字符识别技术获取待处理中文字训练样本中每个中文字的像素值特征矩阵;
基于待处理中文字训练样本中每个中文字的像素值特征矩阵获取规范中文字训练样本;
初始化卷积神经网络;
将所述规范中文字训练样本输入到卷积神经网络中进行训练,采用基于随机梯度下降的后向传播算法更新卷积神经网络的权值和偏置,获取规范中文字识别模型;
获取非规范中文字训练样本,将所述非规范中文字训练样本输入到所述规范中文字识别模型中进行训练,采用基于随机梯度下降的后向传播算法更新所述规范中文字识别模型的权值和偏置,获取调整中文手写字识别模型;
获取待测试中文字样本,采用所述调整中文手写字识别模型识别所述待测试中文字样本,获取识别结果与真实结果不符的出错字,把所有所述出错字作为出错字训练样本;
将所述出错字训练样本输入到所述调整中文手写字识别模型中进行训练,采用基于批量梯度下降的后向传播算法更新调整中文手写字识别模型的权值和偏置,获取目标中文手写字识别模型。
一种手写模型训练装置,包括:
像素值特征矩阵获取模块,用于采用光学字符识别技术获取待处理中文字训练样本中每个中文字的像素值特征矩阵;
规范中文字训练样本获取模块,用于基于待处理中文字训练样本中每个中文字的像素值特征矩阵获取规范中文字训练样本;
初始化模块,用于初始化卷积神经网络;
规范中文字识别模型获取模块,用于将所述规范中文字训练样本输入到卷积神经网络中进行训练,采用基于随机梯度下降的后向传播算法更新卷积神经网络的权值和偏置,获取规范中文字识别模型;
调整中文手写字识别模型获取模块,用于获取非规范中文字训练样本,将所述非规范中文字训练样本输入到所述规范中文字识别模型中进行训练,采用基于随机梯度下降的后向传播算法更新所述规范中文字识别模型的权值和偏置,获取调整中文手写字识别模型;
出错字训练样本获取模块,用于获取待测试中文字样本,采用所述调整中文手写字识别模型识别所述待测试中文字样本,获取识别结果与真实结果不符的出错字,把所有所述出错字作为出错字训练样本;
目标中文手写字识别模型获取模块,用于将所述出错字训练样本输入到所述调整中文手写字识别模型中进行训练,采用基于批量梯度下降的后向传播算法更新调整中文手写字识别模型的权值和偏置,获 取目标中文手写字识别模型。
一种计算机设备,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机可读指令,所述处理器执行所述计算机可读指令时实现如下步骤:
采用光学字符识别技术获取待处理中文字训练样本中每个中文字的像素值特征矩阵;
基于待处理中文字训练样本中每个中文字的像素值特征矩阵获取规范中文字训练样本;
初始化卷积神经网络;
将所述规范中文字训练样本输入到卷积神经网络中进行训练,采用基于随机梯度下降的后向传播算法更新卷积神经网络的权值和偏置,获取规范中文字识别模型;
获取非规范中文字训练样本,将所述非规范中文字训练样本输入到所述规范中文字识别模型中进行训练,采用基于随机梯度下降的后向传播算法更新所述规范中文字识别模型的权值和偏置,获取调整中文手写字识别模型;
获取待测试中文字样本,采用所述调整中文手写字识别模型识别所述待测试中文字样本,获取识别结果与真实结果不符的出错字,把所有所述出错字作为出错字训练样本;
将所述出错字训练样本输入到所述调整中文手写字识别模型中进行训练,采用基于批量梯度下降的后向传播算法更新调整中文手写字识别模型的权值和偏置,获取目标中文手写字识别模型。
一个或多个存储有计算机可读指令的非易失性可读存储介质,所述计算机可读指令被一个或多个处理器执行时,使得所述一个或多个处理器执行如下步骤:
采用光学字符识别技术获取待处理中文字训练样本中每个中文字的像素值特征矩阵;
基于待处理中文字训练样本中每个中文字的像素值特征矩阵获取规范中文字训练样本;
初始化卷积神经网络;
将所述规范中文字训练样本输入到卷积神经网络中进行训练,采用基于随机梯度下降的后向传播算法更新卷积神经网络的权值和偏置,获取规范中文字识别模型;
获取非规范中文字训练样本,将所述非规范中文字训练样本输入到所述规范中文字识别模型中进行训练,采用基于随机梯度下降的后向传播算法更新所述规范中文字识别模型的权值和偏置,获取调整中文手写字识别模型;
获取待测试中文字样本,采用所述调整中文手写字识别模型识别所述待测试中文字样本,获取识别结果与真实结果不符的出错字,把所有所述出错字作为出错字训练样本;
将所述出错字训练样本输入到所述调整中文手写字识别模型中进行训练,采用基于批量梯度下降的后向传播算法更新调整中文手写字识别模型的权值和偏置,获取目标中文手写字识别模型。
本申请实施例还提供一种手写字识别方法、装置、设备及介质,以解决当前手写字识别准确率不高的问题。
一种手写字识别方法,包括:
获取待识别中文字,采用目标中文手写字识别模型识别所述待识别中文字,获取所述待识别中文字在所述目标中文手写字识别模型中的输出值;所述目标中文手写字识别模型是采用所述手写模型训练方法获取到的;
根据所述输出值和预设的中文语义词库获取目标概率输出值,基于所述目标概率输出值获取所述待识别中文字的识别结果。
本申请实施例提供一种手写字识别装置,包括:
输出值获取模块,用于获取待识别中文字,采用目标中文手写字识别模型识别所述待识别中文字,获取所述待识别中文字在所述目标中文手写字识别模型中的输出值;所述目标中文手写字识别模型是采用所述手写模型训练方法获取到的;
识别结果获取模块,用于根据所述输出值和预设的中文语义词库获取目标概率输出值,基于所述目标概率输出值获取所述待识别中文字的识别结果。
一种计算机设备,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机可读指令,所述处理器执行所述计算机可读指令时实现如下步骤:
获取待识别中文字,采用目标中文手写字识别模型识别所述待识别中文字,获取所述待识别中文字 在所述目标中文手写字识别模型中的输出值;所述目标中文手写字识别模型是采用所述手写模型训练方法获取到的;
根据所述输出值和预设的中文语义词库获取目标概率输出值,基于所述目标概率输出值获取所述待识别中文字的识别结果。
一个或多个存储有计算机可读指令的非易失性可读存储介质,所述计算机可读指令被一个或多个处理器执行时,使得所述一个或多个处理器执行如下步骤:
获取待识别中文字,采用目标中文手写字识别模型识别所述待识别中文字,获取所述待识别中文字在所述目标中文手写字识别模型中的输出值;所述目标中文手写字识别模型是采用所述手写模型训练方法获取到的;
根据所述输出值和预设的中文语义词库获取目标概率输出值,基于所述目标概率输出值获取所述待识别中文字的识别结果。
本申请的一个或多个实施例的细节在下面的附图和描述中提出,本申请的其他特征和优点将从说明书、附图以及权利要求变得明显。
附图说明
为了更清楚地说明本申请实施例的技术方案,下面将对本申请实施例的描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动性的前提下,还可以根据这些附图获得其他的附图。
图1是本申请一实施例中手写模型训练方法的一应用环境图;
图2是本申请一实施例中手写模型训练方法的一流程图;
图3是图2中步骤S20的一具体流程图;
图4是图2中步骤S40的一具体流程图;
图5是图2中步骤S60的一具体流程图;
图6是本申请一实施例中手写模型训练装置的一示意图;
图7是本申请一实施例中手写字识别方法的一流程图;
图8是本申请一实施例中手写字识别装置的一示意图;
图9是本申请一实施例中计算机设备的一示意图。
具体实施方式
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
图1示出本申请实施例提供的手写模型训练方法的应用环境。该手写模型训练方法的应用环境包括服务端和客户端,其中,服务端和客户端之间通过网络进行连接,客户端是可与用户进行人机交互的设备,包括但不限于电脑、智能手机和平板等设备,服务端具体可以用独立的服务器或者多个服务器组成的服务器集群实现。本申请实施例提供的手写模型训练方法应用于服务端。
如图2所示,图2示出本申请实施例中手写模型训练方法的一流程图,该手写模型训练方法包括如下步骤:
S10:采用光学字符识别技术获取待处理中文字训练样本中每个中文字的像素值特征矩阵。
其中,光学字符识别技术(Optical Character Recognition,简称OCR),是指将图像上的文字转化为计算机可编辑的文字内容,待处理中文字训练样本是指初始获取的,未经处理的训练样本。像素值特征矩阵即采用像素值作为特征,并采用矩阵的方式进行表示的矩阵。
本实施例中,采用OCR技术,对图像上的文字进行定位、分割和特征提取等操作,获取待处理中文字训练样本中每个中文字的像素值特征矩阵,像素值特征矩阵能够被计算机直接读取和识别,能够将提取待处理中文字训练样本的像素值特征,并采用矩阵进行表示。
S20:基于待处理中文字训练样本中每个中文字的像素值特征矩阵获取规范中文字训练样本。
其中,规范中文字训练样本是指根据标准规范字(如属于楷体、宋体或隶书等字体的字,一般字体选择楷体或者宋体)所获取的训练样本,待处理中文字训练样本中的字属于标准规范字。
本实施例中,基于待处理中文字训练样本中每个中文字的像素值特征矩阵,获取到用于训练卷积神经网络的规范中文字训练样本,以提高网络训练的效率。该规范中文字训练样本是由属于楷体、宋体或隶书等中文字体的标准规范字获取而来,本实施例中以宋体为例进行说明。可以理解地,这里的标准规范字是指属于目前中文字体中主流字体的字,如计算机设备的输入法中的默认字体宋体的字,常用于临摹的主流字体楷体的字等;而像日常生活中比较少使用的中文字体的字如草书的字、幼圆的字,则不列入该标准规范字的范围。
S30:初始化卷积神经网络。
在一实施例中,初始化卷积神经网络,包括:令卷积神经网络初始化的权值满足公式
Figure PCTCN2018094193-appb-000001
其中,n l表示在第l层输入的训练样本的样本个数,S()表示方差运算,W l表示第l层的权值,
Figure PCTCN2018094193-appb-000002
表示任意,l表示卷积神经网络中的第l层。
其中,卷积神经网络(Convolutional Neural Network,简称CNN)是一种前馈神经网络,它的人工神经元可以响应一部分覆盖范围内的周围单元,能够进行图像处理和识别。卷积神经网络与一般的深度神经网络(Deep Neural Networks,简称DNN)的主要区别在于卷积神经网络包括卷积层和池化层,这为卷积神经网络能够对带有文字的图像进行处理和识别提供重要的技术支持。
卷积神经网络中包括各层之间各个神经元连接的权值和偏置,这些权值和偏置决定卷积神经网络的识别效果。
本实施例中,初始化卷积神经网络,该初始化操作即设置卷积神经网络中权值和偏置的初始值。具体地,设C l的卷积神经网络中第l层的卷积,由卷积神经网络的性质可知C l=W lx l+b l,其中,W l表示第l层的权值,x l表示第l层输入的用于初始化的训练样本,b l表示第l层的偏置。则C l的方差可求得为S(C l)=n lS(W lx l),其中,S()表示方差运算,n l表示在第l层输入的训练样本的样本个数。在卷积神经网络进行训练时,权值的均值过大可能导致梯度过大,而无法有效找到误差函数的极小值,因此此处将权值W设为满足均值0的情况,则上述C l的方差表达式可进一步写为S(C l)=n lS(W l)E((x l) 2),其中,E()表示数学期望运算。
特别地,卷积神经网络中卷积层采用的激活函数为ReLU(Rectified Linear Unit,中文名称为线性整流函数),又称修正线性单元,是一种人工神经网络中常用的激活函数,通常指代以斜坡函数及其变种为代表的非线性函数。由激活函数ReLU能够得到x l=ReLU(C l-1)和
Figure PCTCN2018094193-appb-000003
将这两个式子代入到上述C l的方差表达式S(C l)=n lS(W l)E((x l) 2),得到
Figure PCTCN2018094193-appb-000004
在卷积神经网络训练时应尽量保持方差一致,才不会导致方差在训练过程中变得越来越大或越来越小,导致梯度收敛得过快或过慢,从而出现无法有效找到误差函数的极小值或训练的速度过慢的问题。因此,为了使方差保持一致,由上式
Figure PCTCN2018094193-appb-000005
可知,权值应满足
Figure PCTCN2018094193-appb-000006
表示任意,则根据该式可以相应地设置卷积神经网络的权值。偏置在初始设置时可以设置为较小的值,如设置在区间[-0.3,0.3]之间。
合理地初始化卷积神经网络可以使网络在初期有较灵活的调整能力,可以在训练过程中对网络进行有效的调整,能够快速有效地找到误差函数的极小值,有利于卷积神经网络的更新和调整,使得基于卷积神经网络进行模型训练获取的模型在进行中文手写字识别时具备精确的识别效果。
S40:将规范中文字训练样本输入到卷积神经网络中进行训练,采用基于随机梯度下降的后向传播算法更新卷积神经网络的权值和偏置,获取规范中文字识别模型。
其中,随机梯度下降(Stochastic Gradient Descent,简称SGD)是在更新网络参数时,获取各个训练样本在训练过程中产生的误差,并多次随机采用单个样本在训练过程中产生的误差,对网络参数进行更新的处理方法。后向传播算法(Back Propagation,简称BP算法)是神经网络学习中一种训练与学习方法,用来调整神经网络中各个节点之间的权值和偏置。采用后向传播算法对神经网络中的权值和偏置进行调整时需要求出误差函数的极小值,而在本实施例中,误差函数的极小值具体采用随机梯度下降的处理方法求出。
本实施例中,将规范中文字训练样本输入到卷积神经网络中进行训练,采用基于随机梯度下降的后向传播算法更新卷积神经网络的权值和偏置,获取规范中文字识别模型。该规范中文字识别模型在训练过程中学习了规范中文字训练样本的深层特征,使得该模型能够对标准规范字进行精确的识别,具备对标准规范字的识别能力。需要说明的是,无论规范中文字训练样本采用的是楷体、宋体或隶书等其他中文字体对应的标准规范字,由于这些标准规范字在字体识别的层面上差别并不大,因此该规范中文字识别模型可以对楷体、宋体或隶书等字体对应的标准规范字进行精确的识别,得到较准确的识别结果。
S50:获取非规范中文字训练样本,将非规范中文字训练样本输入到规范中文字识别模型中进行训练,采用基于随机梯度下降的后向传播算法更新规范中文字识别模型的权值和偏置,获取调整中文手写字识别模型。
其中,非规范中文字训练样本是指根据手写中文字所获取的训练样本,该手写中文字具体可以是按照楷体、宋体或隶书等字体对应的标准规范字的字体形态通过手写方式得到的字。可以理解地,该非规范中文字训练样本与规范中文字训练样本的区别在于非规范中文字训练样本是由手写中文字所获取的,既然是手写的,当然就包含各种各样不同的字体形态。
本实施例中,服务端获取非规范中文字训练样本,该训练样本包含有手写中文字的特征,将非规范中文字训练样本输入到规范中文字识别模型中进行训练并调整,采用基于随机梯度下降的后向传播算法更新规范中文字识别模型的权值和偏置,获取调整中文手写字识别模型。可以理解地,规范中文字识别模型具备识别标准规范中文字的能力,但是在对手写中文字进行识别时并没有较高的识别精确度。因此本实施例采用非规范中文字训练样本进行训练,让规范中文手写字识别模型在已有识别标准规范字的基础上,对模型中的参数(权值和偏置)进行调整,获取调整中文手写字识别模型。该调整中文手写字识别模型在原本识别标准规范字的基础上学习手写中文字的深层特征,使得该调整中文手写字识别模型结合了标准规范字和手写中文字的深层特征,能够同时对标准规范字和手写中文字进行有效的识别,得到准确率较高的识别结果。
卷积神经网络在进行字识别时,是根据字的像素分布进行判断的,在实际生活中的手写中文字与标准规范字存在差别,但是这种差别相比与其他不对应标准规范字的差别小很多的,例如,手写中文字的“我”和标准规范字的“我”在像素分布上存在差别,但是这种差别相比于手写中文字“你”和标准规范字“我”之间的差别明显小很多。可以这样认为,即使手写中文字与相对应的标准规范字之间存在一定的差别,但是这种差别与不相对应的标准规范字的差别小得多,因此,可以通过最相似(即差别最小)的原则确定识别结果。调整中文手写字识别模型是由卷积神经网络训练而来的,该模型结合标准规范字和手写中文字的深层特征,能够根据该深层特征对手写中文字进行有效的识别。
对于步骤S40和S50,采用基于随机梯度下降的后向传播算法进行误差反传更新,能够在训练样本的数量较为庞大的情况下仍顺利进行模型训练,可以提高网络训练的效率和效果,使得训练更有效。
需要说明的是,本实施例的步骤S40和步骤S50的顺序是不可调换的,先执行步骤S40再执行步骤S50。先采用规范中文训练样本训练卷积神经网络可以使获取的规范中文字识别模型拥有较好的识别能力,使其对标准规范字有精确的识别结果。在拥有良好的识别能力的基础上再进行步骤S50的微调,使得训练获取的调整中文手写字识别模型能够根据学习到的手写中文字的深层特征对手写中文字进行有效的识别,使其对手写中文字识别有较精确的识别结果。若先执行步骤S50或只执行步骤S50,由于手写中文字有各种各样的形态,直接采用手写中文字训练学习到的特征并不能较好地反映手写中文字的特征,会使一开始模型就学“坏”,导致后来再怎么进行调整也难以使得对手写中文字识别有精确的识别 结果。虽然每个人的手写中文字都不一样,但是极大部分都是与标准规范字相似(如手写中文字模仿标准规范字)。因此,一开始根据标准规范字进行模型训练更符合客观情况,要比直接对手写中文字进行模型训练的效果更好,可以在“好”的模型下进行相应的调整,获取手写中文字识别率高的调整中文手写字识别模型。
S60:获取待测试中文字样本,采用调整中文手写字识别模型识别待测试中文字样本,获取识别结果与真实结果不符的出错字,把所有出错字作为出错字训练样本。
其中,待测试中文字样本是指根据标准规范字和手写中文字所获取的用于测试的训练样本,该步骤采用的标准规范字和步骤S40中用于训练的标准规范字是相同的(因为如楷体、宋体等字体所对应的每个字都是唯一确定的);采用的手写中文字与和步骤S50中用于训练的手写中文字可以是不同的(不同人手写的中文字是不完全相同的,手写中文字所对应的每个字可以对应多种字体形态,为了与步骤S50用于训练的非规范中文字训练样本区分开来,避免模型训练过拟合的情况,一般该步骤采用与步骤S50不同的手写中文字)。
本实施例中,将训练好的调整中文手写字识别模型用来识别待测试中文字样本,该待测试中文字样本包括标准规范字和其预先设置好的标签值(即真实结果),以及手写中文字和其预先设置好的标签值。训练时标准规范字和手写中文字可以是采用混合的方式输入到调整中文手写字识别模型。在采用调整中文手写字识别模型对待测试中文字样本进行识别时,将获取到相应的识别结果,把识别结果与标签值(真实结果)不相符的所有出错字作为出错字训练样本。该出错字训练样本反映调整中文字手写识别模型仍然存在识别精度不足的问题,以便后续根据该出错字训练样本进一步更新、优化调整中文手写字识别模型。
由于调整中文手写字识别模型的识别精度实际上受到规范中文字训练样本和非规范中文字训练样本的共同影响,在先采用规范中文字训练样本更新网络参数(权值和偏置),再采用非规范中文字训练样本更新网络参数(权值和偏置)的前提下,会导致获取到的调整中文手写字识别模型过度学习非规范中文字训练样本的特征,使得获取的调整中文手写字识别模型对非规范中文字训练样本(包括手写中文字)拥有非常高的识别精度,但却过度学习该非规范中文字样本的特征,影响除该非规范中文字训练样本以外的手写中文字的识别精度,因此,步骤S60采用待测试中文字样本对调整中文手写字识别模型进行识别,能够很大程度上消除训练时采用的非规范中文字训练样本的过度学习。即通过调整中文手写字识别模型识别待测试中文字样本,以找出由于过度学习而产生的误差,该误差具体可以通过出错字反映出来,因此能够根据该出错字进一步地更新、优化调整中文手写字识别模型的网络参数。
S70:将出错字训练样本输入到调整中文手写字识别模型中进行训练,采用基于批量梯度下降的后向传播算法更新调整中文手写字识别模型的权值和偏置,获取目标中文手写字识别模型。
本实施例中,将出错字训练样本输入到调整中文手写字识别模型中进行训练,该出错字训练样本反映了在训练调整中文手写字识别模型时,由于过度学习非规范中文字训练样本的特征,导致调整中文手写字识别模型在识别非规范中文字训练样本以外的手写中文字时出现的识别不精确的问题。并且,由于先采用规范中文字训练样本再采用非规范中文字训练样本训练模型的原因,会过度削弱原先学习的标准规范字的特征,这会影响模型初始搭建的对标准规范字进行识别的“框架”。利用出错字训练样本可以很好地解决过度学习和过度削弱的问题,可以根据出错字训练样本反映的识别精确度上的问题,在很大程度上消除原本训练过程中产生的过度学习和过度削弱带来的不利影响。具体地,采用出错字训练样本进行训练时采用的是基于批量梯度下降的后向传播算法,根据该算法更新调整中文手写字识别模型的权值和偏置,获取目标中文手写字识别模型,该目标中文手写字识别模型是指最终训练出来的可用于识别中文手写字的模型。在更新网络参数时,出错字训练样本的样本容量较少(出错字较少),采用基于批量梯度下降的后向传播算法能够将所有出错字训练样本在卷积神经网络训练时产生的误差都进行反传更新,保证产生的所有误差都能对网络进行调整和更新,能够全面地训练卷积神经网络,提高目标中文手写字识别模型的识别准确率。
需要说明的是,本实施例中,步骤S40和S50采用的是基于随机梯度下降的后向传播算法;步骤S70采用的是基于批量梯度下降的后向传播算法。
步骤S40中,采用基于随机梯度下降的后向传播算法更新卷积神经网络的权值和偏置的过程具体包 括如下步骤:
获取规范中文字训练样本中每一训练样本(每一个字)对应的二值化像素值特征矩阵,把每一二值化像素值特征矩阵随机输入到卷积神经网络中得到每一对应的前向输出,计算每一前向输出与对应的标签值(真实结果)之间的误差,每获取一误差即相应进行一次梯度下降的反向传播,更新网络的权值和偏置。重复上述计算每一误差并采用每一误差更新网络的权值和偏置的过程,直到误差小于停止迭代阈值ε 1时,结束该循环,得到更新好的权值和偏置,即得到规范中文字识别模型。
步骤S50采用基于随机梯度下降的后向传播算法更新卷积神经网络的权值和偏置的过程与步骤S40的过程相似,在此不再赘述。
步骤S70中,采用基于批量梯度下降的后向传播算法更新卷积神经网络的权值和偏置的过程具体包括如下步骤:
获取出错字训练样本中的一个训练样本对应的二值化像素值特征矩阵,把该二值化像素值特征矩阵输入到调整中文手写字识别模型(本质上还是一个卷积神经网络)中得到前向输出,计算该前向输出与真实结果之间的误差,获取并依次输入剩余的训练样本对应的二值化像素值特征矩阵到调整中文手写字识别模型中,计算相应的前向输出与真实结果之间误差,并累加误差得到调整中文手写字识别模型对于出错字训练样本的总误差,采用总误差进行一次基于梯度下降的反向传播,更新网络的权值和偏置,重复上述计算总误差和采用总误差更新网络的权值和偏置的过程,直到误差小于停止迭代阈值ε 2时,结束该循环,得到更新好的权值和偏置,即得到目标中文手写字识别模型。
可以理解地,对于步骤S40和S50,由于进行模型训练所采用的训练样本的数量较为庞大,若采用基于批量梯度下降的后向传播算法将影响网络训练的效率和效果,甚至无法正常进行模型训练,难以有效地进行训练。采用基于随机梯度下降的后向传播算法进行误差反传更新可以提高网络训练的效率和效果,使得训练更有效。
对于步骤S70,出错字训练样本的样本容量较少(出错字较少),采用基于批量梯度下降的后向传播算法能够将所有出错字训练样本在卷积神经网络训练时产生的误差都进行反传更新,保证产生的所有误差都能对网络进行调整和更新,能够全面地训练卷积神经网络。基于批量梯度下降的后向传播算法相比较于基于随机梯度下降的后向传播算法,前者的梯度是标准的,能够全面地训练卷积神经网络;而后者每次随机从训练样本抽取一个训练样本更新网络的参数,其梯度是近似的,并不标准,在训练的准确性上不如前者。采用基于批量梯度下降的后向传播算法能够提高模型训练的准确性,使得训练获取的目标中文手写字识别模型拥有精确的识别能力。
步骤S10-S70中,采用光学字符识别技术获取待处理中文字训练样本中每个中文字的像素值特征矩阵,基于待处理中文字训练样本中每个中文字的像素值特征矩阵获取规范中文字训练样本,该规范中文字训练样本能够被计算机直接识别和读取。然后初始化卷积神经网络,有利于提高神经网络的训练效率。采用规范中文字训练样本训练并获取规范中文字识别模型,再通过非规范中文字对规范中文字识别模型进行调整性的更新,使得更新后获取的调整中文手写字识别模型在具备识别标准规范字能力的前提下,通过训练更新的方式学习手写中文字的深层特征,使得调整中文手写字识别模型能够较好地识别手写中文字。然后采用调整中文手写字识别模型识别待测试中文字样本,获取识别结果与真实结果不相符的出错字,并将所有出错字作为出错字训练样本输入到调整中文手写字识别模型中进行训练更新,获取目标中文手写字识别模型。采用出错字训练样本可以在很大程度上消除原本训练过程中产生的过度学习和过度削弱带来的不利影响,能够进一步优化识别准确率。训练规范中文字识别模型和调整中文手写字识别模型采用了基于随机梯度下降的后向传播算法,能够在训练样本数量多的情况下仍然有较好的训练效果。训练目标中文手写字识别模型采用了基于批量梯度下降的后向传播算法,采用批量梯度下降能够保证对模型中参数的充分更新,对训练样本在训练过程中产生的误差都进行反向传播更新,全面地根据产生的误差进行参数更新,提高所获取的模型的识别准确率。
在一实施例中,如图3所示,步骤S20中,基于待处理中文字训练样本中每个中文字的像素值特征矩阵获取规范中文字训练样本,具体包括如下步骤:
S21:获取待处理中文字训练样本中每个中文字的像素值特征矩阵,将像素值特征矩阵中每个像素值进行归一化处理,获取每个中文字的归一化像素值特征矩阵,其中,归一化处理的公式为
Figure PCTCN2018094193-appb-000007
MaxValue为每个中文字的像素值特征矩阵中像素值的最大值,MinValue为每个中文字的像素值特征矩阵中像素值的最小值,x为归一化前的像素值,y为归一化后的像素值。
本实施例中,获取待处理中文字训练样本中每个中文字的像素值特征矩阵,每个中文字的像素值特征矩阵代表着对应字的特征,在这里用像素值代表字的特征,由于字是基于二维表示的(一般一个字用一张m×n的图像表示),故像素值可以采用矩阵表示,即形成像素值特征矩阵。计算机设备能够识别像素值特征矩阵的形式,读取像素值特征矩阵中的数值。服务端获取像素值特征矩阵后,采用归一化处理的公式对特征矩阵中的每个中文字的像素值进行归一化处理,获取每个中文字的归一化像素值特征。本实施例中,采用归一化处理方式能够将每个中文字的像素值特征矩阵都压缩在同一个范围区间内,能够加快与该像素值特征矩阵相关的计算,有助于提高训练规范中文字识别模型的训练效率。
S22:将每个中文字的归一化像素值特征矩阵中的像素值划分为两类像素值,基于两类像素值建立每个中文字的二值化像素值特征矩阵,将每个中文字的二值化像素特征矩阵组合作为规范中文字训练样本。
本实施例中,将每个中文字的归一化像素值特征矩阵中的像素值划分为两类像素值,该两类像素值是指像素值中只包含像素值A或者像素值B。具体地,可以将归一化像素特征矩阵中大于或等于0.5的像素值取为1,将小于0.5的像素值取为0,建立相应的每个中文字的二值化像素值特征矩阵,每个中文字的二值化像素特征矩阵中的原始只包含0或1。在建立每个中文字的二值化像素值特征矩阵后,将二值化像素值特征矩阵对应的中文字组合作为规范中文字训练样本。例如,在一张包含字的图像中,包含字像素的部分和空白像素的部分。字上的像素值一般颜色会比较深,二值化像素值特征矩阵中的“1”代表字像素的部分,而“0”则代表图像中空白像素的部分。可以理解地,通过建立二值化像素值特征矩阵可以进一步简化对字的特征表示,仅采用0和1的矩阵就可以将每个中文字表示并区别开来,能够提高计算机处理关于中文字的特征矩阵的速度,进一步提高训练规范中文字识别模型的训练效率。
步骤S21-S22对待处理中文字训练样本进行归一化处理并进行二类值的划分,获取每个中文字的二值化像素值特征矩阵,并将每个中文字的二值化像素值特征矩阵对应的字作为规范中文字训练样本,能够显著缩短训练规范中文字识别模型的时长。
在一实施例中,如图4所示,步骤S40中,将规范中文字训练样本输入到卷积神经网络中进行训练,采用基于随机梯度下降的后向传播算法更新卷积神经网络的权值和偏置,获取规范中文字识别模型,具体包括如下步骤:
S41:将规范中文字训练样本输入到卷积神经网络中,获取规范中文字训练样本在卷积神经网络中的前向输出。
其中,卷积神经网络是一种前馈神经网络,它的人工神经元可以响应一部分覆盖范围内的周围单元,能够进行图像处理和识别。卷积神经网络通常包括至少两个非线性可训练的卷积层,至少两个非线性的池化层和至少一个全连接层,即包括至少五个隐含层,此外还包括输入层和输出层。
本实施例中,将规范中文字训练样本输入到卷积神经网络中进行训练,规范中文字训练样本在卷积神经网络经过各层处理后(具体为权值和偏置对规范中文字训练样本的响应处理),会在卷积神经网络的每一层都得到处理后相应的输出值。由于卷积神经网络包含的层数较多,且各层的功能不同,因此各层的输出是不同的。
具体地,若第l层是卷积层,则卷积层的输出可以表示为a l=σ(z l)=σ(a l-1*W l+b l),其中,a l表示第l层的输出,z l表示未采用激活函数处理前的输出,a l-1表示l-1层的输出(即上一层的输出),σ表示激活函数(对于卷积层采用的激活函数σ为ReLU,相比其他激活函数的效果会更好),*表示卷积运算,W l表示第l层的权值,b l表示第l层的偏置。若第l层是池化层,则池化层的输出可以表示为 a l=pool(a l-1),其中pool是指下采样计算,该下采样计算可以选择最大池化的方法,最大池化实际上就是在n*n的样本中取最大值,作为采样后的样本值。除了最大池化之外,常用的还有平均池化,即取在n*n的样本中取各样本的平均值作为采样后的样本值。若第l层是全连接层,则计算该全连接层的输出与传统深度神经网络计算输出的方式相同,用公式表示为a l=σ(z l)=σ(W la l-1+b l),参数的含义与上述提及的解释相同,在此不再进行赘述。特别地,对于输出层L,激活函数σ采用的是softmax函数,计算输出层L输出的公式为a L=softmax(z l)=softmax(W La L-1+b L)。根据上述卷积神经网络每一层的计算公式,可以求出卷积神经网络中每一层的输出,并最终得到输出层的输出a L,该输出即前向输出。可以理解地,步骤S111中得到的前向输出,能够反映规范中文字训练样本在卷积神经网络中的输出情况,可以根据该输出情况与客观事实(真实结果)进行比较,以根据两者之间的误差对卷积神经网络进行调整。
S42:根据前向输出和真实结果构建误差函数,误差函数的表达式为
Figure PCTCN2018094193-appb-000008
其中,n表示训练样本总数,x i表示第i个训练样本的前向输出,y i表示与x i相对应的第i个训练样本的真实结果。
其中,真实结果即客观事实,例如输入的字为楷体的“太”,则前向输出的结果可能是“大”等其他结果,而真实结果就是原本输入的“太”,可以将真实结果理解为训练样本的标签值,用于计算与前向输出的误差。
本实施例中,由于卷积神经网络对规范中文字训练样本进行处理后得到的前向输出与真实结果是存在误差的,那么可以根据该误差构建对应的误差函数,以便利用该误差函数训练卷积神经网络,更新权值和偏置,以使更新后的权值和偏置在处理输入的训练样本是能够得到与真实结果相同或更相似的前向输出。具体地,可以根据实际情况构建合适的误差函数,本实施例构建的误差函数为
Figure PCTCN2018094193-appb-000009
能够较好地反映前向输出和真实结果之间的误差。
S43:根据误差函数,采用基于随机梯度下降的后向传播算法更新卷积神经网络的权值和偏置,获取规范中文字识别模型,其中,在卷积神经网络的全连接层,更新权值的公式为
Figure PCTCN2018094193-appb-000010
在卷积神经网络的卷积层,更新权值的公式为
Figure PCTCN2018094193-appb-000011
W l'表示更新后的权值,W l表示更新前的权值,α表示学习率,m表示规范中文字训练样本,i表示输入的第i个中文字样本,δ i,l表示输入的第i个中文字样本在第l层的灵敏度,a i,l-1表示输入的第i个中文字样本在第l-1层的输出,T表示矩阵转置运算,*表示卷积运算,rot180表示将矩阵翻转180度的运算;在卷积神经网络的全连接层,更新偏置的公式为
Figure PCTCN2018094193-appb-000012
在卷积神经网络的卷积层,更新偏置的公式为
Figure PCTCN2018094193-appb-000013
b l'表示 更新后的偏置,b l表示更新前的偏置,α表示学习率,m表示规范中文字训练样本,i表示输入的第i个中文字样本,δ i,l表示输入的第i个中文字样本在第l层的灵敏度,(u,v)是指进行卷积运算时获取的卷积特征图中每一个卷积特征图中的小块位置。
本实施例中,在构建合适的误差函数后,采用基于随机梯度下降的后向传播算法更新网络参数,并将更新后的卷积神经网络作为规范中文字识别模型。具体地,在后向传播过程中由于卷积神经网络各层有较大差异,因此应根据每一层的实际情况进行后向传播,对网络参数进行更新。在反向传播过程中,首先对更新后输出层的权值和偏置进行计算,采用误差函数分别对权值W和偏置b进行求偏导的运算,能够得到公共的因子,即输出层的灵敏度δ L(L表示输出层),由该灵敏度δ L能够依次求出第l层的灵敏度δ l,根据δ l求得神经网络中第l层的梯度,再利用梯度更新卷积神经网络的权值和偏置。具体地,若当前为全连接层,则δ l=(W l+1) Tδ l+1οσ'(z l),其中,W l+1表示l+1层的权值,T表示矩阵转置运算,δ l+1表示l+1层的灵敏度,ο表示两个矩阵对应元素相乘的运算(Hadamard积),σ表示激活函数,z l表示在计算前向传播过程中未采用激活函数处理前的输出。若当前为卷积层,则δ l=δ l+1*rot180(W l+1)οσ'(z l),其中,*表示卷积运算,rot180表示将矩阵翻转180度的运算,公式中其余参数的含义参见上文度参数含义进行解释的内容,在此不再赘述。若当前为池化层,则δ l=upsample(δ l+1)οσ'(z l),upsample表示上采样运算。根据上述卷积神经网络各个层求相应的灵敏度δ l,根据灵敏度δ l更新层l的权值和偏置。池化层没有权值和偏置,因此只需要更新全连接层和卷积层的权值和偏置即可。
具体地,步骤S43中,若当前是全连接层,则更新权值的公式表示为
Figure PCTCN2018094193-appb-000014
其中,W l'表示更新后的权值,W l表示更新前的权值,α表示学习率,m表示规范中文字训练样本,i表示输入的第i个中文字样本,δ i,l表示输入的第i个中文字样本在第l层的灵敏度,a i,l-1表示输入的第i个中文字样本在第l-1层的输出,T表示矩阵转置运算,
Figure PCTCN2018094193-appb-000015
即l层权值W的梯度;更新偏置的公式表示为
Figure PCTCN2018094193-appb-000016
b l'表示更新后的偏置,b l表示更新前的偏置α表示学习率,m表示规范中文字训练样本,i表示输入的第i个中文字样本,δ i,l表示输入的第i个中文字样本在第l层的灵敏度。若当前是卷积层,则更新权值的公式为
Figure PCTCN2018094193-appb-000017
更新偏置的 公式为
Figure PCTCN2018094193-appb-000018
其中,(u,v)是指进行卷积运算时获取的卷积特征图中每一个卷积特征图中的小块(组成卷积特征图的元素)位置。通过采用随机梯度下降的后向传播算法,对卷积神经网络中各层的权值和偏置进行相应的更新获取规范中文字识别模型。
步骤S41-S43能够根据规范中文字训练样本在卷积神经网络得到的前向输出构建误差函数
Figure PCTCN2018094193-appb-000019
并根据该误差函数反传更新权值和偏置,能够获取规范中文字识别模型,该模型学习了规范中文字训练样本的深层特征,能够精确地识别标准规范字。
在一实施例中,如图5所示,步骤S60中,采用调整中文手写字识别模型识别待测试中文字样本,获取识别结果与真实结果不符的出错字,把所有出错字作为出错字训练样本,具体包括如下步骤:
S61:将待测试中文字样本输入到调整中文手写字识别模型,获取待测试中文字样本中每一个字在调整中文手写字识别模型中的输出值。
本实施例中,采用调整中文手写字识别模型对待测试中文字样本进行识别,待测试中文字样本中包含若干中文字。在中文字库中,常用的中文字大概有三千多个,在调整中文手写字识别模型的输出层应设置中文字库中每一个字与输入的待测试中文字样本相似程度的概率值,该概率值为待测试中文字样本中每一个字在调整中文手写字识别模型中的输出值,具体可以是通过softmax函数实现。简单地说,当输入“我”字时,在调整中文手写字识别模型中将会获取其与中文字库中每一个字对应的输出值(用概率表示),如与中文字库中“我”对应的输出值为99.5%,其余字的输出值加起来为0.5%。通过设置待测试中文字样本,在经过调整中文手写字识别模型识别后的与中文字库中每一个字对应的输出值,可以根据该输出值得到合理的识别结果。
S62:选取每一个字对应的输出值中的最大输出值,根据最大输出值获取每一个字的识别结果。
本实施例中,选择每一个字对应的所有输出值中的最大输出值,根据该最大输出值即可获取该字的识别结果。可以理解地,输出值直接反映了输入的待测试中文字样本中的字与中文字库中每一个字的相似程度,而最大输出值则表明待测试字样本最接近中文字库中的某个字,则可以根据该最大输出值对应的字即为该字的识别结果,如输入“我”字最后输出的识别结果为“我”。
S63:根据识别结果,获取识别结果与真实结果不符的出错字,把所有出错字作为出错字训练样本。
本实施例中,将得到的识别结果与真实结果(客观事实)作比较,将比较识别结果与真实结果不符的出错字作为出错字训练样本。可以理解地,该识别结果只是待测试中文字训练样本在调整中文手写字识别模型识别出来的结果,与真实结果相比有可能是不相同的,反映了该模型在识别的精确度上仍有不足,而这些不足可以通过出错字训练样本进行优化,以达到更精确的识别效果。
步骤S61-S63根据待测试中文字样本中每一个字在调整中文手写字识别模型中的输出值,从输出值中选择能够反映字间相似程度的最大输出值;再通过最大输出值得到识别结果,并根据识别结果得到出错字训练样本,为后续利用出错字训练样本进一步优化识别精确度提供了重要的技术前提。
本实施例所提供的手写模型训练方法中,采用光学字符识别技术获取待处理中文字训练样本中每个中文字的像素值特征矩阵,基于待处理中文字训练样本中每个中文字的像素值特征矩阵获取规范中文字训练样本,该规范中文字训练样本能够被计算机直接识别和读取。根据式
Figure PCTCN2018094193-appb-000020
初始化卷积神经网络的权值,用较小的值如区间[-0.3,0.3]初始化偏置,采用该初始化的方式能够快速有效地找到误差函数的极小值,有利于卷积神经网络的更新和调整。对待处理中文字训练样本进行归一化处理并进行二类值的划分,获取二值化像素值特征矩阵,并将特征矩阵对应的字作为规范中文字训练样本,能够显著缩短训练规范中文字识别模型的时长。根据规范中文字训练样本在卷积神经网络得到的前向输出构建误差函数
Figure PCTCN2018094193-appb-000021
并根据该误差函数反传更新权值和偏置, 能够获取规范中文字识别模型,该模型学习了规范中文字训练样本的深层特征,能够精确地识别标准规范字。接着通过非规范中文字对规范中文字识别模型进行调整性的更新,使得更新后获取的调整中文手写字识别模型在具备识别规范中文手写字能力的前提下,通过训练更新的方式学习非规范中文字的深层特征,使得调整中文手写字识别模型能够较好地识别非规范中文手写字。接着,根据待测试中文字样本中每一个字在调整中文手写字识别模型中的输出值,从输出值中选择能够反映字间相似程度的最大输出值,利用最大输出值得到识别结果,并根据识别结果得到出错字训练样本,并将所有出错字作为出错字训练样本输入到调整中文手写字识别模型中进行训练更新,获取目标中文手写字识别模型。采用出错字训练样本可以在很大程度上消除原本训练过程中产生的过度学习和过度削弱带来的不利影响,能够进一步优化识别准确率。此外,本实施例所提供的手写模型训练方法中,规范中文字识别模型和调整中文手写字识别模型在训练时采用的是基于随机梯度的后向传播算法,在训练样本数量多的情况下仍然有较好的训练效率和训练效果。目标中文手写字识别模型在训练时采用的是基于批量梯度下降的后向传播算法,能够保证对模型中参数的充分更新,对训练样本在训练过程中产生的误差都进行反向传播更新,全面地根据产生的误差进行参数更新,提高所获取的模型的识别准确率。
应理解,上述实施例中各步骤的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
图6示出与实施例中手写模型训练方法一一对应的手写模型训练装置的原理框图。如图6所示,该手写模型训练装置包括像素值特征矩阵获取模块10、规范中文字训练样本获取模块20、初始化模块30、规范中文字识别模型获取模块40、调整中文手写字识别模型获取模块50、出错字训练样本获取模块60和目标中文手写字识别模型获取模块70。其中,像素值特征矩阵获取模块10、规范中文字训练样本获取模块20、初始化模块30、规范中文字识别模型获取模块40、调整中文手写字识别模型获取模块50、出错字训练样本获取模块60和目标中文手写字识别模型获取模块70的实现功能与实施例中手写模型训练方法对应的步骤一一对应,为避免赘述,本实施例不一一详述。
像素值特征矩阵获取模块10,用于采用光学字符识别技术获取待处理中文字训练样本中每个中文字的像素值特征矩阵。
规范中文字训练样本获取模块20,用于基于待处理中文字训练样本中每个中文字的像素值特征矩阵获取规范中文字训练样本。
初始化模块30,用于初始化卷积神经网络。
规范中文字识别模型获取模块40,用于将规范中文字训练样本输入到卷积神经网络中进行训练,采用基于随机梯度下降的后向传播算法更新卷积神经网络的权值和偏置,获取规范中文字识别模型。
调整中文手写字识别模型获取模块50,用于获取非规范中文字训练样本,将非规范中文字训练样本输入到规范中文字识别模型中进行训练,采用基于随机梯度下降的后向传播算法更新规范中文字识别模型的权值和偏置,获取调整中文手写字识别模型。
出错字训练样本获取模块60,用于获取待测试中文字样本,采用调整中文手写字识别模型识别待测试中文字样本,获取识别结果与真实结果不符的出错字,把所有出错字作为出错字训练样本。
目标中文手写字识别模型获取模块70,用于将出错字训练样本输入到调整中文手写字识别模型中进行训练,采用基于批量梯度下降的后向传播算法更新调整中文手写字识别模型的权值和偏置,获取目标中文手写字识别模型。
优选地,规范中文字训练样本获取模块20包括归一化像素值特征矩阵获取单元21和规范中文字训练样本获取单元22。
归一化像素值特征矩阵获取单元21,用于获取待处理中文字训练样本中每个中文字的像素值特征矩阵,将像素值特征矩阵中每个像素值进行归一化处理,获取每个中文字的归一化像素值特征矩阵,其中,归一化处理的公式为
Figure PCTCN2018094193-appb-000022
MaxValue为每个中文字的像素值特征矩阵中像素值的最大值,MinValue为每个中文字的像素值特征矩阵中像素值的最小值,x为归一化前的像素值,y为归一化后的像素值。
规范中文字训练样本获取单元22,用于将每个中文字的归一化像素值特征矩阵中的像素值划分为 两类像素值,基于两类像素值建立每个中文字的二值化像素值特征矩阵,将每个中文字的二值化像素特征矩阵组合作为规范中文字训练样本。
优选地,初始化模块30,用于初始化卷积神经网络,其中,卷积神经网络初始化的权值满足公式
Figure PCTCN2018094193-appb-000023
n l表示在第l层输入的训练样本的样本个数,S()表示方差运算,W l表示第l层的权值,
Figure PCTCN2018094193-appb-000024
表示任意,l表示卷积神经网络中的第l层。
优选地,规范中文字识别模型获取模块40包括前向输出获取单元41、误差函数构建单元42和规范中文字识别模型获取单元43。
前向输出获取单元41,用于将规范中文字训练样本输入到卷积神经网络中,获取规范中文字训练样本在卷积神经网络中的前向输出。误差函数构建单元42,用于根据前向输出和真实结果构建误差函数,误差函数的表达式为
Figure PCTCN2018094193-appb-000025
其中,n表示训练样本总数,x i表示第i个训练样本的前向输出,y i表示与x i相对应的第i个训练样本的真实结果。
规范中文字识别模型获取单元43,用于根据误差函数,采用基于随机梯度下降的后向传播算法更新卷积神经网络的权值和偏置,获取规范中文字识别模型,其中,在卷积神经网络的全连接层,更新权值的公式为
Figure PCTCN2018094193-appb-000026
在卷积神经网络的卷积层,更新权值的公式为
Figure PCTCN2018094193-appb-000027
W l'表示更新后的权值,W l表示更新前的权值,α表示学习率,m表示规范中文字训练样本,i表示输入的第i个中文字样本,δ i,l表示输入的第i个中文字样本在第l层的灵敏度,a i,l-1表示输入的第i个中文字样本在第l-1层的输出,T表示矩阵转置运算,*表示卷积运算,rot180表示将矩阵翻转180度的运算;在卷积神经网络的全连接层,更新偏置的公式为
Figure PCTCN2018094193-appb-000028
在卷积神经网络的卷积层,更新偏置的公式为
Figure PCTCN2018094193-appb-000029
b l'表示更新后的偏置,b l表示更新前的偏置,α表示学习率,m表示规范中文字训练样本,i表示输入的第i个中文字样本,δ i,l表示输入的第i个中文字样本在第l层的灵敏度,(u,v)是指进行卷积运算时获取的卷积特征图中每一个卷积特征图中的小块位置。
优选地,出错字训练样本获取模块60包括模型输出值获取单元61、模型识别结果获取单元62和出错字训练样本获取单元63。
模型输出值获取单元61,用于将待测试中文字样本输入到调整中文手写字识别模型,获取待测试中文字样本中每一个字在调整中文手写字识别模型中的输出值。
模型识别结果获取单元62,用于选取每一个字对应的输出值中的最大输出值,根据最大输出值获取每一个字的识别结果。
出错字训练样本获取单元63,用于根据识别结果,获取识别结果与真实结果不符的出错字,把所有出错字作为出错字训练样本。
图7示出本实施例中手写字识别方法的一流程图。该手写字识别方法可应用在银行、投资和保险等 机构配置的计算机设备,用于对手写中文字进行识别,达到人工智能目的。如图7所示,该手写字识别方法包括如下步骤:
S80:获取待识别中文字,采用目标中文手写字识别模型识别待识别中文字,获取待识别中文字在目标中文手写字识别模型中的输出值,目标中文手写字识别模型是采用上述手写模型训练方法获取到的。
其中,待识别中文字是指要进行识别的中文字。
本实施例中,获取待识别中文字将待识别中文字输入到目标中文手写字识别模型中进行识别,获取待识别中文字在目标中文手写字识别模型中的输出值,一个待识别中文字对应有三千多个(具体数量以中文字库为准)输出值,可以基于该输出值确定该待识别中文字的识别结果。具体地,待识别中文字具体是采用计算机能够直接识别的二值化像素值特征矩阵表示。
S90:根据输出值和预设的中文语义词库获取目标概率输出值,基于目标概率输出值获取待识别中文字的识别结果。
其中,预设的中文语义词库是指预先设置好的基于词频的描述中文词语间语义关系的词库。例如,在该中文语义词库中对于“X阳”这两个字的词,“太阳”出现的概率为30.5%,“大阳”出现的概率为0.5%,剩余的如“骄阳”等“X阳”的两个字的词出现的概率之和为69%。目标概率输出值是结合输出值和预设的中文语义词库,得到的用于获取待识别中文字的识别结果的概率值。
具体地,采用输出值和预设的中文语义词库获取目标概率输出值包括如下步骤:(1)选取待识别中文字中每一个字对应的输出值中最大值作为第一概率值,根据第一概率值获取待识别中文字初步的识别结果。(2)根据该初步的识别结果和中文语义词库获取待识别字的向左语义概率值和向右语义概率值。可以理解地,对于一文本,该文本中的字是有先后顺序的,如“红X阳”,则对于“X”字而言,有向左向右两个方向词语“红X”和“X阳”对应的概率值,即向左语义概率值和向右语义概率值。(3)分别设置待识别中文字中每一个字对应的输出值的权值、向左语义概率值的权值和向右语义概率值的权值。具体地,可以赋予待识别中文字中每一个字对应的输出值0.4的权值,赋予向左语义概率值0.3的权值,赋予0.3向右语义概率值的权值。(4)根据上述设置的各个权值分别乘以各自对应的概率值得到各个加权运算后的概率值,将各个加权运算后的概率值相加得到目标概率输出值(目标概率输出值有多个,具体个数可以按中文字库为准),并选取目标概率输出值中最大值对应的字作为待识别中文字的识别结果。实际上,可以先选取输出值中,数值最大的前5个概率值,该前5个概率值代表最有可能的5个字(识别结果),只对这5字结合中文语义词库算出目标概率输出值,则目标概率输出值就只有5个,可以大大提高识别的效率。通过结合输出值和预设的中文语义词库,可以得到精确的识别结果。可以理解地,对于单个字(非文本)的识别,则可以根据输出值中最大值直接得到相应的识别结果即可,而不必加入基于中文语义的识别。
步骤S80-S90,采用目标中文手写字识别模型识别待识别中文字,结合输出值和预设的中文语义词库获取待识别中文字的识别结果。采用该目标中文手写字识别模型本身拥有较高的识别精确度,再结合中文语义词库进一步提高中文手写的识别准确率。
本申请实施例所提供的手写字识别方法中,将待识别中文字输入到目标中文手写字识别模型中进行识别,并结合预设的中文语义词库获取识别结果。采用该目标中文手写字识别模型对中文手写字进行识别时,可以得到精确的识别结果。
图8示出与实施例中手写字识别方法一一对应的手写字识别装置的原理框图。如图8所示,该手写字识别装置包括输出值获取模块80和识别结果获取模块90。其中,输出值获取模块80和识别结果获取模块90的实现功能与实施例中手写字识别方法对应的步骤一一对应,为避免赘述,本实施例不一一详述。
手写字识别装置包括输出值获取模块80,用于获取待识别中文字,采用目标中文手写字识别模型识别待识别中文字,获取待识别中文字在目标中文手写字识别模型中的输出值;目标中文手写字识别模型是采用手写模型训练方法获取到的。
识别结果获取模块90,用于根据输出值和预设的中文语义词库获取目标概率输出值,基于目标概率输出值获取待识别中文字的识别结果。
本实施例提供一个或多个存储有计算机可读指令的非易失性可读存储介质,所述计算机可读指令被一个或多个处理器执行时,使得所述一个或多个处理器执行时实现实施例中手写模型训练方法,为避免重复,这里不再赘述。或者,所述计算机可读指令被一个或多个处理器执行时,使得所述一个或多个处理器执行时实现实施例中手写模型训练装置的各模块/单元的功能,为避免重复,这里不再赘述。或者,所述计算机可读指令被一个或多个处理器执行时,使得所述一个或多个处理器执行时实现实施例中手写字识别方法中各步骤的功能,为避免重复,此处不一一赘述。或者,所述计算机可读指令被一个或多个处理器执行时,使得所述一个或多个处理器执行时实现实施例中手写字识别装置中各模块/单元的功能,为避免重复,此处不一一赘述。
图9是本申请一实施例提供的计算机设备的示意图。如图9所示,该实施例的计算机设备100包括:处理器101、存储器102以及存储在存储器102中并可在处理器101上运行的计算机可读指令103,该计算机可读指令103被处理器101执行时实现实施例中的手写模型训练方法,为避免重复,此处不一一赘述。或者,该计算机可读指令103被处理器101执行时实现实施例中手写模型训练装置中各模型/单元的功能,为避免重复,此处不一一赘述。或者,该计算机可读指令103被处理器101执行时实现实施例中手写字识别方法中各步骤的功能,为避免重复,此处不一一赘述。或者,该计算机可读指令103被处理器101执行时实现实施例中手写字识别装置中各模块/单元的功能,为避免重复,此处不一一赘述。
计算机设备100可以是桌上型计算机、笔记本、掌上电脑及云端服务器等计算设备。计算机设备可包括,但不仅限于,处理器101、存储器102。本领域技术人员可以理解,图9仅仅是计算机设备100的示例,并不构成对计算机设备100的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件,例如计算机设备还可以包括输入输出设备、网络接入设备、总线等。
所称处理器101可以是中央处理单元(Central Processing Unit,CPU),还可以是其他通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field-Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。
存储器102可以是计算机设备100的内部存储单元,例如计算机设备100的硬盘或内存。存储器102也可以是计算机设备100的外部存储设备,例如计算机设备100上配备的插接式硬盘,智能存储卡(Smart Media Card,SMC),安全数字(Secure Digital,SD)卡,闪存卡(Flash Card)等。进一步地,存储器102还可以既包括计算机设备100的内部存储单元也包括外部存储设备。存储器102用于存储计算机可读指令103以及计算机设备所需的其他程序和数据。存储器102还可以用于暂时地存储已经输出或者将要输出的数据。
所属领域的技术人员可以清楚地了解到,为了描述的方便和简洁,仅以上述各功能单元、模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能单元、模块完成,即将所述装置的内部结构划分成不同的功能单元或模块,以完成以上描述的全部或者部分功能。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
以上所述实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围,均应包含在本申请的保护范围之内。

Claims (20)

  1. 一种手写模型训练方法,其特征在于,包括:
    采用光学字符识别技术获取待处理中文字训练样本中每个中文字的像素值特征矩阵;
    基于待处理中文字训练样本中每个中文字的像素值特征矩阵获取规范中文字训练样本;
    初始化卷积神经网络;
    将所述规范中文字训练样本输入到卷积神经网络中进行训练,采用基于随机梯度下降的后向传播算法更新卷积神经网络的权值和偏置,获取规范中文字识别模型;
    获取非规范中文字训练样本,将所述非规范中文字训练样本输入到所述规范中文字识别模型中进行训练,采用基于随机梯度下降的后向传播算法更新所述规范中文字识别模型的权值和偏置,获取调整中文手写字识别模型;
    获取待测试中文字样本,采用所述调整中文手写字识别模型识别所述待测试中文字样本,获取识别结果与真实结果不符的出错字,把所有所述出错字作为出错字训练样本;
    将所述出错字训练样本输入到所述调整中文手写字识别模型中进行训练,采用基于批量梯度下降的后向传播算法更新调整中文手写字识别模型的权值和偏置,获取目标中文手写字识别模型。
  2. 根据权利要求1所述的手写模型训练方法,其特征在于,所述基于待处理中文字训练样本中每个中文字的像素值特征矩阵获取规范中文字训练样本,包括:
    获取待处理中文字训练样本中每个中文字的像素值特征矩阵,将所述像素值特征矩阵中每个像素值进行归一化处理,获取每个中文字的归一化像素值特征矩阵,其中,归一化处理的公式为
    Figure PCTCN2018094193-appb-100001
    MaxValue为每个中文字的像素值特征矩阵中像素值的最大值,MinValue为每个中文字的像素值特征矩阵中像素值的最小值,x为归一化前的像素值,y为归一化后的像素值;
    将每个中文字的归一化像素值特征矩阵中的像素值划分为两类像素值,基于所述两类像素值建立每个中文字的二值化像素值特征矩阵,将每个中文字的二值化像素特征矩阵组合作为规范中文字训练样本。
  3. 根据权利要求1所述的手写模型训练方法,其特征在于,所述将所述规范中文字训练样本输入到卷积神经网络中进行训练,采用基于随机梯度下降的后向传播算法更新卷积神经网络的权值和偏置,获取规范中文字识别模型,包括:
    将规范中文字训练样本输入到卷积神经网络中,获取所述规范中文字训练样本在所述卷积神经网络中的前向输出;
    根据所述前向输出和真实结果构建误差函数,所述误差函数的表达式为
    Figure PCTCN2018094193-appb-100002
    其中,n表示训练样本总数,x i表示第i个训练样本的前向输出,y i表示与x i相对应的第i个训练样本的真实结果;
    根据所述误差函数,采用基于随机梯度下降的后向传播算法更新卷积神经网络的权值和偏置,获取规范中文字识别模型,其中,在所述卷积神经网络的全连接层,更新权值的公式为
    Figure PCTCN2018094193-appb-100003
    在所述卷积神经网络的卷积层,更新权值的公式为
    Figure PCTCN2018094193-appb-100004
    W l'表示更新后的权值,W l表示更新前的权值,α表示学习率,m表示规范中文字训练样本,i表示输入的第i个中文字样本,δ i,l表示输入的第i个中文字样本在第l 层的灵敏度,a i,l-1表示输入的第i个中文字样本在第l-1层的输出,T表示矩阵转置运算,*表示卷积运算,rot180表示将矩阵翻转180度的运算;在所述卷积神经网络的全连接层,更新偏置的公式为
    Figure PCTCN2018094193-appb-100005
    在所述卷积神经网络的卷积层,更新偏置的公式为
    Figure PCTCN2018094193-appb-100006
    b l'表示更新后的偏置,b l表示更新前的偏置,α表示学习率,m表示规范中文字训练样本,i表示输入的第i个中文字样本,δ i,l表示输入的第i个中文字样本在第l层的灵敏度,(u,v)是指进行卷积运算时获取的卷积特征图中每一个卷积特征图中的小块位置。
  4. 根据权利要求1所述的手写模型训练方法,其特征在于,所述采用调整中文手写字识别模型识别待测试中文字样本,获取识别结果与真实结果不符的出错字,把所有所述出错字作为出错字训练样本,包括:
    将待测试中文字样本输入到调整中文手写字识别模型,获取所述待测试中文字样本中每一个字在所述调整中文手写字识别模型中的输出值;
    选取每一个所述字对应的输出值中的最大输出值,根据所述最大输出值获取每一个所述字的识别结果;
    根据识别结果,获取识别结果与真实结果不符的出错字,把所有所述出错字作为出错字训练样本。
  5. 根据权利要求1所述的手写模型训练方法,其特征在于,所述初始化卷积神经网络,包括:
    令卷积神经网络初始化的权值满足公式
    Figure PCTCN2018094193-appb-100007
    其中,n l表示在第l层输入的训练样本的样本个数,S()表示方差运算,W l表示第l层的权值,
    Figure PCTCN2018094193-appb-100008
    表示任意,l表示卷积神经网络中的第l层。
  6. 一种手写字识别方法,其特征在于,包括:
    获取待识别中文字,采用目标中文手写字识别模型识别所述待识别中文字,获取所述待识别中文字在所述目标中文手写字识别模型中的输出值;所述目标中文手写字识别模型是采用权利要求1-5任一项所述手写模型训练方法获取到的;
    根据所述输出值和预设的中文语义词库获取目标概率输出值,基于所述目标概率输出值获取所述待识别中文字的识别结果。
  7. 一种手写模型训练装置,其特征在于,包括:
    像素值特征矩阵获取模块,用于采用光学字符识别技术获取待处理中文字训练样本中每个中文字的像素值特征矩阵;
    规范中文字训练样本获取模块,用于基于待处理中文字训练样本中每个中文字的像素值特征矩阵获取规范中文字训练样本;
    初始化模块,用于初始化卷积神经网络;
    规范中文字识别模型获取模块,用于将所述规范中文字训练样本输入到卷积神经网络中进行训练,采用基于随机梯度下降的后向传播算法更新卷积神经网络的权值和偏置,获取规范中文字识别模型;
    调整中文手写字识别模型获取模块,用于获取非规范中文字训练样本,将所述非规范中文字训练样本输入到所述规范中文字识别模型中进行训练,采用基于随机梯度下降的后向传播算法更新所述规范中文字识别模型的权值和偏置,获取调整中文手写字识别模型;
    出错字训练样本获取模块,用于获取待测试中文字样本,采用所述调整中文手写字识别模型识别所述待测试中文字样本,获取识别结果与真实结果不符的出错字,把所有所述出错字作为出错字训练样本;
    目标中文手写字识别模型获取模块,用于将所述出错字训练样本输入到所述调整中文手写字识别模型中进行训练,采用基于批量梯度下降的后向传播算法更新调整中文手写字识别模型的权值和偏置,获取目标中文手写字识别模型。
  8. 一种手写字识别装置,其特征在于,包括:
    输出值获取模块,用于获取待识别中文字,采用目标中文手写字识别模型识别所述待识别中文字,获取所述待识别中文字在所述目标中文手写字识别模型中的输出值;所述目标中文手写字识别模型是采用权利要求1-5任一项所述手写模型训练方法获取到的;
    识别结果获取模块,用于根据所述输出值和预设的中文语义词库获取目标概率输出值,基于所述目标概率输出值获取所述待识别中文字的识别结果。
  9. 一种计算机设备,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机可读指令,其特征在于,所述处理器执行所述计算机可读指令时实现如下步骤:
    采用光学字符识别技术获取待处理中文字训练样本中每个中文字的像素值特征矩阵;
    基于待处理中文字训练样本中每个中文字的像素值特征矩阵获取规范中文字训练样本;
    初始化卷积神经网络;
    将所述规范中文字训练样本输入到卷积神经网络中进行训练,采用基于随机梯度下降的后向传播算法更新卷积神经网络的权值和偏置,获取规范中文字识别模型;
    获取非规范中文字训练样本,将所述非规范中文字训练样本输入到所述规范中文字识别模型中进行训练,采用基于随机梯度下降的后向传播算法更新所述规范中文字识别模型的权值和偏置,获取调整中文手写字识别模型;
    获取待测试中文字样本,采用所述调整中文手写字识别模型识别所述待测试中文字样本,获取识别结果与真实结果不符的出错字,把所有所述出错字作为出错字训练样本;
    将所述出错字训练样本输入到所述调整中文手写字识别模型中进行训练,采用基于批量梯度下降的后向传播算法更新调整中文手写字识别模型的权值和偏置,获取目标中文手写字识别模型。
  10. 根据权利要求9所述的计算机设备,其特征在于,所述基于待处理中文字训练样本中每个中文字的像素值特征矩阵获取规范中文字训练样本,包括:
    获取待处理中文字训练样本中每个中文字的像素值特征矩阵,将所述像素值特征矩阵中每个像素值进行归一化处理,获取每个中文字的归一化像素值特征矩阵,其中,归一化处理的公式为
    Figure PCTCN2018094193-appb-100009
    MaxValue为每个中文字的像素值特征矩阵中像素值的最大值,MinValue为每个中文字的像素值特征矩阵中像素值的最小值,x为归一化前的像素值,y为归一化后的像素值;
    将每个中文字的归一化像素值特征矩阵中的像素值划分为两类像素值,基于所述两类像素值建立每个中文字的二值化像素值特征矩阵,将每个中文字的二值化像素特征矩阵组合作为规范中文字训练样本。
  11. 根据权利要求9所述的计算机设备,其特征在于,所述将所述规范中文字训练样本输入到卷积神经网络中进行训练,采用基于随机梯度下降的后向传播算法更新卷积神经网络的权值和偏置,获取规范中文字识别模型,包括:
    将规范中文字训练样本输入到卷积神经网络中,获取所述规范中文字训练样本在所述卷积神经网络中的前向输出;
    根据所述前向输出和真实结果构建误差函数,所述误差函数的表达式为
    Figure PCTCN2018094193-appb-100010
    其中,n表示训练样本总数,x i表示第i个训练样本的前向输出,y i表示与x i相对应的第i个训练样本的真实结果;
    根据所述误差函数,采用基于随机梯度下降的后向传播算法更新卷积神经网络的权值和偏置,获取规范中文字识别模型,其中,在所述卷积神经网络的全连接层,更新权值的公式为
    Figure PCTCN2018094193-appb-100011
    在所述卷积神经网络的卷积层,更新权值的公式为
    Figure PCTCN2018094193-appb-100012
    W l'表示更新后的权值,W l表示更新前的权值,α表示学习率,m表示规范中文字训练样本,i表示输入的第i个中文字样本,δ i,l表示输入的第i个中文字样本在第l层的灵敏度,a i,l-1表示输入的第i个中文字样本在第l-1层的输出,T表示矩阵转置运算,*表示卷积运算,rot180表示将矩阵翻转180度的运算;在所述卷积神经网络的全连接层,更新偏置的公式为
    Figure PCTCN2018094193-appb-100013
    在所述卷积神经网络的卷积层,更新偏置的公式为
    Figure PCTCN2018094193-appb-100014
    b l'表示更新后的偏置,b l表示更新前的偏置,α表示学习率,m表示规范中文字训练样本,i表示输入的第i个中文字样本,δ i,l表示输入的第i个中文字样本在第l层的灵敏度,(u,v)是指进行卷积运算时获取的卷积特征图中每一个卷积特征图中的小块位置。
  12. 根据权利要求9所述的计算机设备,其特征在于,所述采用调整中文手写字识别模型识别待测试中文字样本,获取识别结果与真实结果不符的出错字,把所有所述出错字作为出错字训练样本,包括:
    将待测试中文字样本输入到调整中文手写字识别模型,获取所述待测试中文字样本中每一个字在所述调整中文手写字识别模型中的输出值;
    选取每一个所述字对应的输出值中的最大输出值,根据所述最大输出值获取每一个所述字的识别结果;
    根据识别结果,获取识别结果与真实结果不符的出错字,把所有所述出错字作为出错字训练样本。
  13. 根据权利要求9所述的计算机设备,其特征在于,所述初始化卷积神经网络,包括:
    令卷积神经网络初始化的权值满足公式
    Figure PCTCN2018094193-appb-100015
    其中,n l表示在第l层输入的训练样本的样本个数,S()表示方差运算,W l表示第l层的权值,
    Figure PCTCN2018094193-appb-100016
    表示任意,l表示卷积神经网络中的第l层。
  14. 一种计算机设备,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机可读指令,其特征在于,所述处理器执行所述计算机可读指令时实现如下步骤:
    获取待识别中文字,采用目标中文手写字识别模型识别所述待识别中文字,获取所述待识别中文字在所述目标中文手写字识别模型中的输出值;所述目标中文手写字识别模型是采用权利要求1-5任一项所述手写模型训练方法获取到的;
    根据所述输出值和预设的中文语义词库获取目标概率输出值,基于所述目标概率输出值获取所述待识别中文字的识别结果。
  15. 一个或多个存储有计算机可读指令的非易失性可读存储介质,其特征在于,所述计算机可读指令被一个或多个处理器执行时,使得所述一个或多个处理器执行如下步骤:
    采用光学字符识别技术获取待处理中文字训练样本中每个中文字的像素值特征矩阵;
    基于待处理中文字训练样本中每个中文字的像素值特征矩阵获取规范中文字训练样本;
    初始化卷积神经网络;
    将所述规范中文字训练样本输入到卷积神经网络中进行训练,采用基于随机梯度下降的后向传播算法更新卷积神经网络的权值和偏置,获取规范中文字识别模型;
    获取非规范中文字训练样本,将所述非规范中文字训练样本输入到所述规范中文字识别模型中进行训练,采用基于随机梯度下降的后向传播算法更新所述规范中文字识别模型的权值和偏置,获取调整中文手写字识别模型;
    获取待测试中文字样本,采用所述调整中文手写字识别模型识别所述待测试中文字样本,获取识别 结果与真实结果不符的出错字,把所有所述出错字作为出错字训练样本;
    将所述出错字训练样本输入到所述调整中文手写字识别模型中进行训练,采用基于批量梯度下降的后向传播算法更新调整中文手写字识别模型的权值和偏置,获取目标中文手写字识别模型。
  16. 根据权利要求15所述的非易失性可读存储介质,其特征在于,所述基于待处理中文字训练样本中每个中文字的像素值特征矩阵获取规范中文字训练样本,包括:
    获取待处理中文字训练样本中每个中文字的像素值特征矩阵,将所述像素值特征矩阵中每个像素值进行归一化处理,获取每个中文字的归一化像素值特征矩阵,其中,归一化处理的公式为
    Figure PCTCN2018094193-appb-100017
    MaxValue为每个中文字的像素值特征矩阵中像素值的最大值,MinValue为每个中文字的像素值特征矩阵中像素值的最小值,x为归一化前的像素值,y为归一化后的像素值;
    将每个中文字的归一化像素值特征矩阵中的像素值划分为两类像素值,基于所述两类像素值建立每个中文字的二值化像素值特征矩阵,将每个中文字的二值化像素特征矩阵组合作为规范中文字训练样本。
  17. 根据权利要求15所述的非易失性可读存储介质,其特征在于,所述将所述规范中文字训练样本输入到卷积神经网络中进行训练,采用基于随机梯度下降的后向传播算法更新卷积神经网络的权值和偏置,获取规范中文字识别模型,包括:
    将规范中文字训练样本输入到卷积神经网络中,获取所述规范中文字训练样本在所述卷积神经网络中的前向输出;
    根据所述前向输出和真实结果构建误差函数,所述误差函数的表达式为
    Figure PCTCN2018094193-appb-100018
    其中,n表示训练样本总数,x i表示第i个训练样本的前向输出,y i表示与x i相对应的第i个训练样本的真实结果;
    根据所述误差函数,采用基于随机梯度下降的后向传播算法更新卷积神经网络的权值和偏置,获取规范中文字识别模型,其中,在所述卷积神经网络的全连接层,更新权值的公式为
    Figure PCTCN2018094193-appb-100019
    在所述卷积神经网络的卷积层,更新权值的公式为
    Figure PCTCN2018094193-appb-100020
    W l'表示更新后的权值,W l表示更新前的权值,α表示学习率,m表示规范中文字训练样本,i表示输入的第i个中文字样本,δ i,l表示输入的第i个中文字样本在第l层的灵敏度,a i,l-1表示输入的第i个中文字样本在第l-1层的输出,T表示矩阵转置运算,*表示卷积运算,rot180表示将矩阵翻转180度的运算;在所述卷积神经网络的全连接层,更新偏置的公式为
    Figure PCTCN2018094193-appb-100021
    在所述卷积神经网络的卷积层,更新偏置的公式为
    Figure PCTCN2018094193-appb-100022
    b l'表示更新后的偏置,b l表示更新前的偏置,α表示学习率,m表示规范中文字训练样本,i表示输入的第i个中文字样本,δ i,l表示输入的第i个中文字样本在第l层的灵敏度,(u,v)是指进行卷积运算时获取的卷积特征图中每一个卷积特征图中的小块位置。
  18. 根据权利要求15所述的非易失性可读存储介质,其特征在于,所述采用调整中文手写字识别模型识别待测试中文字样本,获取识别结果与真实结果不符的出错字,把所有所述出错字作为出错字训练样本,包括:
    将待测试中文字样本输入到调整中文手写字识别模型,获取所述待测试中文字样本中每一个字在所述调整中文手写字识别模型中的输出值;
    选取每一个所述字对应的输出值中的最大输出值,根据所述最大输出值获取每一个所述字的识别结果;
    根据识别结果,获取识别结果与真实结果不符的出错字,把所有所述出错字作为出错字训练样本。
  19. 根据权利要求15所述的非易失性可读存储介质,其特征在于,所述初始化卷积神经网络,包括:
    令卷积神经网络初始化的权值满足公式
    Figure PCTCN2018094193-appb-100023
    其中,n l表示在第l层输入的训练样本的样本个数,S()表示方差运算,W l表示第l层的权值,
    Figure PCTCN2018094193-appb-100024
    表示任意,l表示卷积神经网络中的第l层。
  20. 一个或多个存储有计算机可读指令的非易失性可读存储介质,其特征在于,所述计算机可读指令被一个或多个处理器执行时,使得所述一个或多个处理器执行如下步骤:
    获取待识别中文字,采用目标中文手写字识别模型识别所述待识别中文字,获取所述待识别中文字在所述目标中文手写字识别模型中的输出值;所述目标中文手写字识别模型是采用权利要求1-5任一项所述手写模型训练方法获取到的;
    根据所述输出值和预设的中文语义词库获取目标概率输出值,基于所述目标概率输出值获取所述待识别中文字的识别结果。
PCT/CN2018/094193 2018-06-04 2018-07-03 手写模型训练方法、手写字识别方法、装置、设备及介质 WO2019232847A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810564062.3 2018-06-04
CN201810564062.3A CN108764195B (zh) 2018-06-04 2018-06-04 手写模型训练方法、手写字识别方法、装置、设备及介质

Publications (1)

Publication Number Publication Date
WO2019232847A1 true WO2019232847A1 (zh) 2019-12-12

Family

ID=64002667

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/094193 WO2019232847A1 (zh) 2018-06-04 2018-07-03 手写模型训练方法、手写字识别方法、装置、设备及介质

Country Status (2)

Country Link
CN (1) CN108764195B (zh)
WO (1) WO2019232847A1 (zh)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111476237A (zh) * 2020-04-28 2020-07-31 有米科技股份有限公司 一种文字识别方法、装置、服务器及存储介质
CN111680690A (zh) * 2020-04-26 2020-09-18 泰康保险集团股份有限公司 一种文字识别方法及装置
CN111950548A (zh) * 2020-08-10 2020-11-17 河南大学 一种引入字库文字图像进行深度模板匹配的汉字识别方法
CN112308058A (zh) * 2020-10-25 2021-02-02 北京信息科技大学 一种手写字符的识别方法
CN112766051A (zh) * 2020-12-29 2021-05-07 有米科技股份有限公司 基于Attention的图像文字识别方法及装置
CN112801085A (zh) * 2021-02-09 2021-05-14 沈阳麟龙科技股份有限公司 一种图像中文字的识别方法、装置、介质及电子设备
CN116012860A (zh) * 2022-12-29 2023-04-25 华南师范大学 一种基于图像识别的教师板书设计水平诊断方法及装置
CN116311543A (zh) * 2023-02-03 2023-06-23 汇金智融(深圳)科技有限公司 一种基于图像识别技术的笔迹分析方法及系统

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109685968A (zh) * 2018-12-15 2019-04-26 西安建筑科技大学 一种基于卷积神经网络的纸币图像缺陷的识别模型构建以及识别方法
US10846553B2 (en) * 2019-03-20 2020-11-24 Sap Se Recognizing typewritten and handwritten characters using end-to-end deep learning
WO2020223859A1 (zh) * 2019-05-05 2020-11-12 华为技术有限公司 一种检测倾斜文字的方法、装置及设备
CN110399912B (zh) * 2019-07-12 2023-04-07 广东浪潮大数据研究有限公司 一种字符识别的方法、系统、设备及计算机可读存储介质
CN110378318B (zh) * 2019-07-30 2022-07-15 腾讯科技(深圳)有限公司 文字识别方法、装置、计算机设备及存储介质
CN110688997B (zh) * 2019-09-24 2023-04-18 北京猎户星空科技有限公司 一种图像处理方法及装置
CN111738269B (zh) * 2020-08-25 2020-11-20 北京易真学思教育科技有限公司 模型训练方法、图像处理方法及装置、设备、存储介质
CN114120336B (zh) * 2020-08-25 2023-08-08 本源量子计算科技(合肥)股份有限公司 手写数字识别方法、系统、设备及计算机可读存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101630368A (zh) * 2009-08-25 2010-01-20 华南理工大学 一种用于手写汉字识别的用户书写风格自适应方法
US20140067738A1 (en) * 2012-08-28 2014-03-06 International Business Machines Corporation Training Deep Neural Network Acoustic Models Using Distributed Hessian-Free Optimization
CN105184226A (zh) * 2015-08-11 2015-12-23 北京新晨阳光科技有限公司 数字识别方法和装置及神经网络训练方法和装置
CN105654135A (zh) * 2015-12-30 2016-06-08 成都数联铭品科技有限公司 一种基于递归神经网络的图像文字序列识别系统
CN106599941A (zh) * 2016-12-12 2017-04-26 西安电子科技大学 基于卷积神经网络与支持向量机的手写数字识别方法

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7983478B2 (en) * 2007-08-10 2011-07-19 Microsoft Corporation Hidden markov model based handwriting/calligraphy generation
US20150317336A1 (en) * 2014-04-30 2015-11-05 Hewlett-Packard Development Company, L.P. Data reconstruction
CN107316054A (zh) * 2017-05-26 2017-11-03 昆山遥矽微电子科技有限公司 基于卷积神经网络和支持向量机的非标准字符识别方法

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101630368A (zh) * 2009-08-25 2010-01-20 华南理工大学 一种用于手写汉字识别的用户书写风格自适应方法
US20140067738A1 (en) * 2012-08-28 2014-03-06 International Business Machines Corporation Training Deep Neural Network Acoustic Models Using Distributed Hessian-Free Optimization
CN105184226A (zh) * 2015-08-11 2015-12-23 北京新晨阳光科技有限公司 数字识别方法和装置及神经网络训练方法和装置
CN105654135A (zh) * 2015-12-30 2016-06-08 成都数联铭品科技有限公司 一种基于递归神经网络的图像文字序列识别系统
CN106599941A (zh) * 2016-12-12 2017-04-26 西安电子科技大学 基于卷积神经网络与支持向量机的手写数字识别方法

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111680690B (zh) * 2020-04-26 2023-07-11 泰康保险集团股份有限公司 一种文字识别方法及装置
CN111680690A (zh) * 2020-04-26 2020-09-18 泰康保险集团股份有限公司 一种文字识别方法及装置
CN111476237A (zh) * 2020-04-28 2020-07-31 有米科技股份有限公司 一种文字识别方法、装置、服务器及存储介质
CN111950548A (zh) * 2020-08-10 2020-11-17 河南大学 一种引入字库文字图像进行深度模板匹配的汉字识别方法
CN111950548B (zh) * 2020-08-10 2023-07-28 河南大学 一种引入字库文字图像进行深度模板匹配的汉字识别方法
CN112308058A (zh) * 2020-10-25 2021-02-02 北京信息科技大学 一种手写字符的识别方法
CN112308058B (zh) * 2020-10-25 2023-10-24 北京信息科技大学 一种手写字符的识别方法
CN112766051A (zh) * 2020-12-29 2021-05-07 有米科技股份有限公司 基于Attention的图像文字识别方法及装置
CN112801085A (zh) * 2021-02-09 2021-05-14 沈阳麟龙科技股份有限公司 一种图像中文字的识别方法、装置、介质及电子设备
CN116012860A (zh) * 2022-12-29 2023-04-25 华南师范大学 一种基于图像识别的教师板书设计水平诊断方法及装置
CN116012860B (zh) * 2022-12-29 2024-01-16 华南师范大学 一种基于图像识别的教师板书设计水平诊断方法及装置
CN116311543A (zh) * 2023-02-03 2023-06-23 汇金智融(深圳)科技有限公司 一种基于图像识别技术的笔迹分析方法及系统
CN116311543B (zh) * 2023-02-03 2024-03-08 汇金智融(深圳)科技有限公司 一种基于图像识别技术的笔迹分析方法及系统

Also Published As

Publication number Publication date
CN108764195B (zh) 2023-04-18
CN108764195A (zh) 2018-11-06

Similar Documents

Publication Publication Date Title
WO2019232847A1 (zh) 手写模型训练方法、手写字识别方法、装置、设备及介质
WO2019232854A1 (zh) 手写模型训练方法、手写字识别方法、装置、设备及介质
WO2019232869A1 (zh) 手写模型训练方法、文本识别方法、装置、设备及介质
WO2019232855A1 (zh) 手写模型训练方法、手写字识别方法、装置、设备及介质
US9195934B1 (en) Spiking neuron classifier apparatus and methods using conditionally independent subsets
WO2019232861A1 (zh) 手写模型训练方法、文本识别方法、装置、设备及介质
CN108171318B (zh) 一种基于模拟退火—高斯函数的卷积神经网络集成方法
WO2019232857A1 (zh) 手写模型训练方法、手写字识别方法、装置、设备及介质
AU2020100052A4 (en) Unattended video classifying system based on transfer learning
CN114266897A (zh) 痘痘类别的预测方法、装置、电子设备及存储介质
WO2019232859A1 (zh) 手写模型训练方法、手写字识别方法、装置、设备及介质
WO2019232844A1 (zh) 手写模型训练方法、手写字识别方法、装置、设备及介质
CN110414541B (zh) 用于识别物体的方法、设备和计算机可读存储介质
CN112749737A (zh) 图像分类方法及装置、电子设备、存储介质
CN114186063A (zh) 跨域文本情绪分类模型的训练方法和分类方法
CN111340051A (zh) 图片处理方法、装置及存储介质
CN108496174B (zh) 用于面部识别的方法和系统
CN109101984B (zh) 一种基于卷积神经网络的图像识别方法及装置
CN111079930B (zh) 数据集质量参数的确定方法、装置及电子设备
WO2019232856A1 (zh) 手写模型训练方法、手写字识别方法、装置、设备及介质
Velandia et al. Applications of deep neural networks
Liu et al. Multi-digit Recognition with Convolutional Neural Network and Long Short-term Memory
Li et al. A pre-training strategy for convolutional neural network applied to Chinese digital gesture recognition
US20230360370A1 (en) Neural network architectures for invariant object representation and classification using local hebbian rule-based updates
WO2023196917A1 (en) Neural network architectures for invariant object representation and classification using local hebbian rule-based updates

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18921829

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 12/03/2021)

122 Ep: pct application non-entry in european phase

Ref document number: 18921829

Country of ref document: EP

Kind code of ref document: A1