CN111667066A - Network model training and character recognition method and device and electronic equipment - Google Patents

Network model training and character recognition method and device and electronic equipment

Info

Publication number
CN111667066A
Authority
CN
China
Prior art keywords
model
trained
target
character recognition
training sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010330213.6A
Other languages
Chinese (zh)
Inventor
张婕蕾
万昭祎
姚聪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kuangshi Technology Co Ltd
Beijing Megvii Technology Co Ltd
Original Assignee
Beijing Kuangshi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kuangshi Technology Co Ltd filed Critical Beijing Kuangshi Technology Co Ltd
Priority to CN202010330213.6A priority Critical patent/CN111667066A/en
Publication of CN111667066A publication Critical patent/CN111667066A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/18Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40Document-oriented image-based pattern recognition
    • G06V30/41Analysis of document content
    • G06V30/418Document matching, e.g. of document images

Abstract

The invention provides a network model training and character recognition method, a device and electronic equipment, relating to the technical field of artificial intelligence. The method comprises: obtaining a plurality of models to be trained and target training samples for the plurality of models to be trained; performing character recognition processing on the target training sample through each model to be trained to obtain a plurality of character recognition results, wherein each character recognition result represents the prediction probability that each character to be recognized in the target training sample is each preset character; determining a relative entropy loss value of each model to be trained based on the plurality of character recognition results and the label information of the target training sample, wherein the relative entropy loss value represents the degree of difference between the character recognition results; and adjusting the model parameters of the corresponding model to be trained according to the relative entropy loss value. This solves the technical problem of poor recognition accuracy when existing character recognition models perform character recognition.

Description

Network model training and character recognition method and device and electronic equipment
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a network model training and character recognition method, a network model training and character recognition device and electronic equipment.
Background
In the working process, people often need to process the characters in a picture, and because characters in a picture cannot be edited directly, they must first be recognized. In the prior art, an Optical Character Recognition (OCR) model can generally be used to recognize the characters in a picture, but the accuracy of the characters recognized by such a model is low. With the development of artificial intelligence technology, characters can now be recognized by deep learning algorithms. In the deep learning field, there are many methods for recognizing characters, for example: the first is an attention-based character decoder (attention-decoder); the second is a model based on CTC loss (Connectionist Temporal Classification); the third is an image segmentation network (segmentation).
In the use of these models, it has been found that the attention-based character decoder has stronger sequence modeling capability as a language model, that is, a stronger ability to use linguistic context, while the image segmentation network focuses on the processing of image features. However, a model that relies on sequence modeling capability cannot recognize words that did not appear in the training set, while a model that emphasizes image features suffers a large drop in recognition accuracy when the image quality is poor.
Disclosure of Invention
In view of the above, the present invention provides a method and an apparatus for training a network model and character recognition, and an electronic device, so as to solve the technical problem that the recognition accuracy is poor in the process of performing character recognition by using the existing character recognition model.
In a first aspect, an embodiment of the present invention provides a method for training a network model, including: obtaining a plurality of models to be trained and target training samples of the plurality of models to be trained; respectively performing character recognition processing on the target training sample through each model to be trained to obtain a plurality of character recognition results, wherein each character recognition result represents the prediction probability that each character to be recognized in the target training sample is each preset character; determining a relative entropy loss value of each model to be trained based on the plurality of character recognition results and the label information of the target training sample; wherein the relative entropy loss value is used for representing the difference degree between a plurality of character recognition results; and adjusting the model parameters of the corresponding model to be trained according to the relative entropy loss value.
Further, determining a relative entropy loss value of each model to be trained based on the plurality of character recognition results and the label information of the target training sample comprises: determining a character recognition result of a first model to be trained and a character recognition result of a second model to be trained in the plurality of character recognition results, and respectively obtaining a first character recognition result and a second character recognition result, wherein the first model to be trained is a model of the plurality of models to be trained of which the relative entropy loss value is to be calculated at the current moment, and the second model to be trained is another model of the plurality of models to be trained except the first model to be trained; calculating KL divergence between the first character recognition result and the second character recognition result to obtain a target KL divergence value; and determining a relative entropy loss value of the first model to be trained based on the target KL divergence value, the first character recognition result and the label information of the target training sample.
Further, the number of the second models to be trained is multiple; each second model to be trained corresponds to a second character recognition result; calculating the KL divergence between the first and second character recognition results comprises: calculating KL divergence between the first character recognition result and each second character recognition result to obtain a plurality of target KL divergence values; determining the relative entropy loss value of the first model to be trained based on the target KL divergence value, the first character recognition result and the label information of the target training sample comprises: determining a relative entropy loss value of the first model to be trained based on the plurality of target KL divergence values, the first character recognition result and the label information of the target training sample.
Further, the label information is used for representing the actual probability that the character to be recognized in the target training sample is each preset character; determining the relative entropy loss value of the first model to be trained based on the target KL divergence value, the first character recognition result and the label information of the target training sample comprises: summing the prediction probability of each character to be recognized in the first character recognition result and the actual probability of the corresponding character to be recognized in the label information to obtain a target calculation result corresponding to each character to be recognized; and summing the target calculation result corresponding to each character to be recognized and the target KL divergence value to obtain a relative entropy loss value of the first model to be trained.
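The combination of the target KL divergence value with the label information described above can be sketched as follows. This is an illustrative Python sketch only: the function names (`kl_divergence`, `relative_entropy_loss`) are hypothetical, and both the interpretation of the supervised term as a cross-entropy against the label information and the direction of the KL term are assumptions not fixed by the text.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    # D_KL(p || q) = sum_i p_i * log(p_i / q_i), with eps for numerical safety
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def relative_entropy_loss(first_result, second_result, labels, eps=1e-12):
    # first_result / second_result: per-character predicted distributions over
    # the preset character set; labels: per-character actual distributions
    # taken from the target training sample's label information.
    ce = 0.0   # supervised term against the labels (assumed to be cross-entropy)
    kl = 0.0   # divergence between the two models' recognition results
    for p1, p2, y in zip(first_result, second_result, labels):
        ce += -sum(yi * math.log(pi + eps) for yi, pi in zip(y, p1))
        kl += kl_divergence(p2, p1, eps)
    return ce + kl
```

With multiple second models to be trained, the KL term would be accumulated once per second character recognition result before being combined with the supervised term.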
Further, calculating the KL divergence between the first and second character recognition results comprises: transforming the first character recognition result to obtain a first logits vector; transforming the second character recognition result to obtain a second logits vector; and calculating a KL divergence between the first and second logits vectors.
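The transformation of recognition results into logits vectors before the KL computation might look like the following sketch. The text does not specify the exact transform, so the log-space `to_logits` step and the softmax renormalization inside the KL computation are assumptions.

```python
import math

def softmax(logits):
    # Numerically stable softmax: shift by the maximum before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def to_logits(probs, eps=1e-12):
    # Map a probability vector back into log space (one possible "transform"
    # of a character recognition result into a logits vector).
    return [math.log(p + eps) for p in probs]

def kl_between_logits(logits_p, logits_q):
    # Normalize both logits vectors back to distributions, then compute KL.
    p, q = softmax(logits_p), softmax(logits_q)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```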
Further, the plurality of models to be trained include the following types of models: a neural network model based on an attention mechanism and an image segmentation network model.
In a second aspect, an embodiment of the present invention further provides a method for training a network model, including: obtaining a plurality of models to be trained and a plurality of target training sample groups of the models to be trained; the determination modes of the sample labels in different target training sample groups are different, and the determination modes of the sample labels are associated with the types of the models to be trained; training each model to be trained by using each target training sample group by the method of any one of the first aspect above; testing the trained model to be trained by using a target test set to obtain a plurality of model test results, wherein one target training sample group corresponds to one model test result; the model test result is used for representing the accuracy of character recognition of the model; determining a balance ability index of each model to be trained based on the model test result; the balance ability index is used for measuring the ability of each model to be trained to execute a plurality of operations; and training each model to be trained again based on the balance ability index.
Further, the plurality of models to be trained include the following types of models: a neural network model based on an attention mechanism and an image segmentation network model; the plurality of target training sample groups comprise: a first type of target training sample and a second type of target training sample; the label information of each type of target training sample is determined in the following manner: randomly generating, for the first type of target training sample, the probability that each character to be recognized is each preset character to obtain the label information of the first type of target training sample; and acquiring a target test set for testing the plurality of models to be trained, and determining the label information of the second type of target training sample based on the label information of the target test set.
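The random generation of label information for the first type of target training sample might be sketched as follows; normalizing the random values into a probability distribution per character position is an assumption, since the text only says the probabilities are randomly generated.

```python
import random

def random_label_info(num_chars, charset_size, seed=None):
    # For each character to be recognized, randomly generate the probability
    # that it is each preset character, normalized into a distribution.
    rng = random.Random(seed)
    labels = []
    for _ in range(num_chars):
        weights = [rng.random() for _ in range(charset_size)]
        total = sum(weights)
        labels.append([w / total for w in weights])
    return labels
```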
Further, the length of the label information of the first type of target training sample is the same as the length of the label information of the second type of target training sample, and the length of the label information of the second type of target training sample is the same as the length of the label information of the target test set.
Further, determining label information for the second type of target training sample based on the label information for the target test set comprises: determining a test sample in the target test set, wherein the text sequence contained in the test sample is the same as the text sequence contained in the second type of target training sample; determining label information for the second type of target training sample based on the label information for the test sample.
Further, determining label information for the second type of target training sample based on the label information for the test sample comprises: determining the label information of the test sample as the label information of the second type of target training sample; and/or converting the label information of the test sample according to a preset writing format to obtain converted label information, and determining the converted label information as the label information of the second type of target training sample.
Further, the plurality of model test results includes: a first model test result and a second model test result; determining the balance ability index of each model to be trained based on the model test result comprises: calculating a difference between the first model test result and the second model test result, and determining the difference as the balance ability index; training each model to be trained again based on the balance ability index comprises: and if the difference is larger than the preset difference, training each model to be trained again.
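The balance ability index described above can be sketched as follows; reading "calculating a difference" as the absolute difference between the two models' test accuracies, and the retraining condition, are assumptions.

```python
def balance_ability_index(first_test_result, second_test_result):
    # Difference between the two model test results (assumed absolute).
    return abs(first_test_result - second_test_result)

def needs_retraining(first_test_result, second_test_result, preset_difference):
    # Train each model again if the balance ability index exceeds the
    # preset difference.
    return balance_ability_index(first_test_result, second_test_result) > preset_difference
```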
In a third aspect, an embodiment of the present invention further provides a text recognition method, including: acquiring an image to be identified; and performing character recognition on the image to be recognized through a target neural network to obtain a character recognition result, wherein the target neural network is a model obtained by training by adopting the method of any one of the first aspect or the second aspect.
In a fourth aspect, an embodiment of the present invention further provides a device for training a network model, including: the device comprises a first acquisition unit, a second acquisition unit and a control unit, wherein the first acquisition unit is used for acquiring a plurality of models to be trained and target training samples of the plurality of models to be trained; the recognition processing unit is used for respectively carrying out character recognition processing on the target training sample through each model to be trained to obtain a plurality of character recognition results, wherein each character recognition result represents the prediction probability that each character to be recognized in the target training sample is each preset character; a first determining unit, configured to determine a relative entropy loss value of each model to be trained based on the multiple character recognition results and the label information of the target training sample; wherein the relative entropy loss value is used for representing the difference degree between a plurality of character recognition results; and the adjusting unit is used for adjusting the model parameters of the corresponding model to be trained through the relative entropy loss value.
In a fifth aspect, an embodiment of the present invention further provides a device for training a network model, including: the second acquisition unit is used for acquiring a plurality of models to be trained and a plurality of target training sample sets of the models to be trained; the determination modes of the sample labels in different target training sample groups are different, and the determination modes of the sample labels are associated with the types of the models to be trained; a first training unit, configured to train each model to be trained by using each target training sample group through the method according to any one of the first aspect; the test unit is used for testing the trained model to be trained by utilizing a target test set to obtain a plurality of model test results, wherein one target training sample group corresponds to one model test result; the model test result is used for representing the accuracy of character recognition of the model; a second determining unit, configured to determine a balance ability index of each model to be trained based on the model test result; the balance ability index is used for measuring the ability of each model to be trained to execute a plurality of operations; and the second training unit is used for training each model to be trained again based on the balance ability index.
In a sixth aspect, an embodiment of the present invention further provides a text recognition apparatus, including: a third acquisition unit configured to acquire an image to be recognized; and the character recognition unit is used for performing character recognition on the image to be recognized through a target neural network to obtain a character recognition result, wherein the target neural network is a model obtained by training by adopting the method of any one of the first aspect or the second aspect.
In a seventh aspect, an embodiment of the present invention further provides an electronic device, including a memory, a processing device, and a computer program stored on the memory and executable on the processing device, where the processing device implements the steps of the method in any one of the first aspect or the second aspect when executing the computer program.
In an eighth aspect, the present invention further provides a computer-readable medium having a non-volatile program code executable by a processing device, where the program code causes the processing device to perform the steps of the method in any one of the first aspect or the second aspect above.
In the embodiment of the invention, a plurality of models to be trained and target training samples for the models to be trained are first obtained; character recognition processing is then performed on the target training samples through each model to be trained to obtain a plurality of character recognition results; next, a relative entropy loss value of each model to be trained is determined based on the plurality of character recognition results and the label information of the target training samples; finally, the model parameters of the corresponding model to be trained are adjusted according to the relative entropy loss value. According to the above description, in the present application, by calculating the relative entropy loss value of each model to be trained and training the models to be trained with these values, mutual learning among the plurality of models to be trained can be realized, so that any model to be trained acquires the functions or capabilities of the other models to be trained. This improves the adaptability of the models to be trained and the character recognition accuracy of the models, and solves the technical problem of poor recognition accuracy when existing character recognition models perform character recognition.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic diagram of an electronic device according to an embodiment of the invention;
FIG. 2 is a flow chart of a method of training a network model according to an embodiment of the present invention;
FIG. 3 is a flow chart of another method of training a network model according to an embodiment of the invention;
FIG. 4 is a flow chart of yet another method of training a network model according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a target training sample in accordance with an embodiment of the present invention;
FIG. 6 is a schematic diagram of a network model training apparatus according to an embodiment of the present invention;
fig. 7 is a schematic diagram of a training apparatus for a network model according to another embodiment of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1:
First, an electronic device 100 for implementing an embodiment of the present invention, which may be used to run the network model training method of embodiments of the present invention, is described with reference to fig. 1.
As shown in fig. 1, electronic device 100 includes one or more processing devices 102, one or more memories 104. Optionally, the electronic device 100 may also include an input device 106, an output device 108, and a data acquisition device 110, which are interconnected via a bus system 112 and/or other form of connection mechanism (not shown). It should be noted that the components and structure of the electronic device 100 shown in fig. 1 are merely exemplary and not limiting, and the electronic device may also have some of the components shown in fig. 1 or other components and structures not shown in fig. 1, as desired.
The processing device 102 may be implemented in at least one hardware form of a Digital Signal Processor (DSP), a Field-Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), or an Application-Specific Integrated Circuit (ASIC). The processing device 102 may be a Central Processing Unit (CPU) or another form of processing unit having data processing capability and/or instruction execution capability, and may control other components in the electronic device 100 to perform desired functions.
The memory 104 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM), cache memory (cache), and/or the like. The non-volatile memory may include, for example, Read Only Memory (ROM), hard disk, flash memory, etc. On which one or more computer program instructions may be stored that may be executed by processing device 102 to implement client functionality (implemented by the processing device) and/or other desired functionality in embodiments of the present invention described below. Various applications and various data, such as various data used and/or generated by the applications, may also be stored in the computer-readable storage medium.
The input device 106 may be a device used by a user to input instructions and may include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like.
The output device 108 may output various information (e.g., images or sounds) to the outside (e.g., a user), and may include one or more of a display, a speaker, and the like.
The data acquisition device 110 is configured to obtain a plurality of models to be trained and target training samples of the plurality of models to be trained; the data acquired by the data acquisition device is then used by the network model training method to obtain trained models.
Example 2:
In accordance with an embodiment of the present invention, an embodiment of a method for training a network model is provided. It should be noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system, such as by a set of computer-executable instructions, and that, although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in an order different from that herein.
Fig. 2 is a flowchart of a method for training a network model according to an embodiment of the present invention, as shown in fig. 2, the method includes the following steps:
step S202, a plurality of models to be trained and target training samples of the plurality of models to be trained are obtained.
In the present application, the capabilities or functions of the plurality of models to be trained may not be identical. For example, the plurality of models to be trained may include an attention-based character decoder (attention-decoder), whose capability focuses on the sequence modeling capability of a language model. As another example, the models to be trained may include an image segmentation network (segmentation), whose capability focuses on image feature processing.
In the present application, the attention-based character decoder and the image segmentation network are described only as examples. In addition to these two models, models with other capabilities or functions may be selected, and the present application is not limited thereto.
It should be noted that, in the present application, the model structures of the models to be trained may be the same in size or may be different in size. For example, the plurality of models to be trained may include models with larger model sizes and include models with smaller model sizes.
Step S204, respectively performing character recognition processing on the target training sample through each model to be trained to obtain a plurality of character recognition results, wherein each character recognition result represents the prediction probability that each character to be recognized in the target training sample is each preset character.
In the present application, the character to be recognized may be any recognizable character, such as a Chinese character or an uppercase or lowercase English letter; the content of the character to be recognized is not specifically limited.
In the method, after each model to be trained performs character recognition on a target training sample, a character recognition result is obtained, and the character recognition result can represent the prediction probability that the character to be recognized corresponding to each region to be recognized in the target training sample is each preset character.
It should be noted that the preset characters may be the 26 uppercase and 26 lowercase English letters, or may be preset Chinese characters; that is, the preset characters are associated with the type of the characters to be recognized, which is not specifically limited in this application.
Step S206, determining a relative entropy loss value of each model to be trained based on the plurality of character recognition results and the label information of the target training sample; wherein the relative entropy loss value is used for representing the difference degree between a plurality of character recognition results.
In the application, the label information is label information preset for the target training sample, and the label information is used for representing the actual probability that each character to be recognized in the target training sample is each preset character. Wherein, the actual probability is used to characterize the actual text sequence contained in the target training sample.
And S208, adjusting model parameters of the corresponding model to be trained according to the relative entropy loss value.
In the embodiment of the invention, a plurality of models to be trained and target training samples for the models to be trained are first obtained; character recognition processing is then performed on the target training samples through each model to be trained to obtain a plurality of character recognition results; next, a relative entropy loss value of each model to be trained is determined based on the plurality of character recognition results and the label information of the target training samples; finally, the model parameters of the corresponding model to be trained are adjusted according to the relative entropy loss value. According to the above description, in the present application, by calculating the relative entropy loss value of each model to be trained and training the models to be trained with these values, mutual learning among the plurality of models to be trained can be realized, so that any model to be trained acquires the functions or capabilities of the other models to be trained. This improves the adaptability of the models to be trained and the character recognition accuracy of the models, and solves the technical problem of poor recognition accuracy when existing character recognition models perform character recognition.
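Steps S202–S208 can be sketched for a single character position as follows. This is a toy Python sketch: it computes one relative entropy loss value per model from all models' recognition results and the label information (steps S204–S206); the direction of the KL terms and the summation over the other models' results are assumptions, and step S208 would pass each loss value to that model's optimizer, which is omitted here.

```python
import math

def mutual_learning_step(model_outputs, label, eps=1e-12):
    # model_outputs: one predicted distribution per model to be trained, for a
    # single character to be recognized; label: the actual distribution from
    # the target training sample's label information.
    def kl(p, q):
        return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))
    losses = []
    for i, p in enumerate(model_outputs):
        # Supervised term against the label information (assumed cross-entropy).
        ce = -sum(yi * math.log(pi + eps) for yi, pi in zip(label, p))
        # Mutual-learning term: KL against every other model's result.
        peers = [q for j, q in enumerate(model_outputs) if j != i]
        kl_term = sum(kl(q, p) for q in peers)
        losses.append(ce + kl_term)
    return losses
```

In a real training loop each loss would be backpropagated through its own model, so that each model to be trained gradually absorbs the behavior of the others.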
In an optional embodiment, in step S206, determining a relative entropy loss value of each model to be trained based on the plurality of character recognition results and the label information of the target training sample includes the following processes:
step S2061, determining, among the plurality of character recognition results, the character recognition result of a first model to be trained and the character recognition result of a second model to be trained, to obtain a first character recognition result and a second character recognition result respectively. The first model to be trained is the model, among the plurality of models to be trained, whose relative entropy loss value is to be calculated at the current moment; the second model to be trained is a model, among the plurality of models to be trained, other than the first model to be trained.
Specifically, the present application takes one model among the plurality of models to be trained, referred to as the first model to be trained, as an example for explanation. The first model to be trained may be any one of the plurality of models to be trained, and the present application does not specifically limit this.
First, in the present application, a character recognition result of a first model to be trained is determined, and then, a character recognition result of a second model to be trained is determined.
Step S2062, calculating KL divergence between the first character recognition result and the second character recognition result to obtain a target KL divergence value.
In the present application, after the first character recognition result and the second character recognition result are determined, the KL divergence between them may be calculated. The calculation formula of the KL divergence can be described as:

KL(p‖q) = Σ_{i=1}^{n} p(x_i) · log( p(x_i) / q(x_i) )

in the formula, i runs from 1 to n in sequence, n is the number of preset characters, p(x_i) represents the probability that the character to be recognized in the first character recognition result is the ith character among the preset characters, and q(x_i) represents the probability that the character to be recognized in the second character recognition result is the ith character among the preset characters.
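The KL divergence formula above can be sketched numerically as follows (the function name and the eps smoothing term are illustrative assumptions, not part of the patent):

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) = sum over i of p(x_i) * log(p(x_i) / q(x_i));
    # eps guards against taking the log of zero.
    p = np.asarray(p, dtype=np.float64)
    q = np.asarray(q, dtype=np.float64)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

# Predictions of two models for one character over four preset characters.
p1 = [0.7, 0.1, 0.1, 0.1]   # first character recognition result
p2 = [0.4, 0.3, 0.2, 0.1]   # second character recognition result
d = kl_divergence(p1, p2)   # target KL divergence value, > 0 when p1 != p2
```

Note that the divergence is asymmetric: kl_divergence(p1, p2) generally differs from kl_divergence(p2, p1), which is why each model later receives its own loss term.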
Step S2063, determining a relative entropy loss value of the first model to be trained based on the target KL divergence value, the first character recognition result and the label information of the target training sample.
In the present application, after the target KL divergence value is determined and obtained in the manner described above, the relative entropy loss value of the first model to be trained may be determined based on the target KL divergence value, the first character recognition result, and the label information of the target training sample.
According to the description, the KL divergence is combined when the relative entropy loss value of the model to be trained is calculated, and the KL divergence can represent the difference degree between two probability distributions, so that the KL divergence is added to the loss function of the model to be trained, mutual learning among a plurality of models to be trained can be assisted, and the learning capacity of each model to be trained is enriched.
It should be noted that, in the present application, the number of the second models to be trained may be multiple; at this time, each second model to be trained corresponds to one second character recognition result.
Based on this, in the present application, calculating the KL divergence between the first character recognition result and the second character recognition results includes the following process:
and calculating KL divergence between the first character recognition result and each second character recognition result to obtain a plurality of target KL divergence values.
And if the number of the second models to be trained is multiple, each second model to be trained respectively performs character recognition processing on the target training sample to obtain a second character recognition result.
Based on this, in the present application, the KL divergence between the first character recognition result and each second character recognition result can be calculated, where a plurality of target KL divergence values will be obtained.
It should be noted that, in the present application, the formula KL(p‖q) = Σ_{i=1}^{n} p(x_i) · log( p(x_i) / q(x_i) ) described above can be adopted to calculate the KL divergence between the first character recognition result and each second character recognition result. It should be appreciated that the greater the number of second models to be trained, the stronger the learning ability of the first model to be trained.
In this application, after obtaining a plurality of target KL divergence values, the relative entropy loss value of the first model to be trained may be determined based on the plurality of target KL divergence values, the first character recognition result, and the label information of the target training sample.
It should be noted that, in the present application, the label information is preset, and is used to represent the actual probability that the character to be recognized in the target training sample is each preset character.
Based on this, determining the relative entropy loss value of the first model to be trained based on the target KL divergence value, the first character recognition result and the label information of the target training sample includes the following processes:
(1) summing the predicted probability of each character to be recognized in the first character recognition result and the actual probability of the corresponding character to be recognized in the label information, to obtain a target calculation result corresponding to each character to be recognized.
As can be seen from the above description, in the present application, the first character recognition result includes a probability distribution of each character to be recognized, where the probability distribution represents a prediction probability that each character to be recognized is a preset character. The label information also contains the probability distribution of each character to be recognized, wherein the probability distribution represents the actual probability that each character to be recognized is a preset character.
Based on this, in the application, the prediction probability of each character to be recognized and the actual probability of the corresponding character to be recognized can be summed to obtain the target calculation result.
If the preset characters are 26 english letters, the first character recognition result includes the prediction probability that each character to be recognized is 26 english letters. The label information contains the actual probability that each character to be recognized is 26 english letters. If the number of characters to be recognized is n, the first character recognition result and the label information may be a vector with a length of 26 × n. For each element in the length-26 x n vector, a summation calculation is performed, and the summation calculation result is determined as a target calculation result.
(2) summing the target calculation result corresponding to each character to be recognized and the target KL divergence value, to obtain the relative entropy loss value of the first model to be trained.

After the target calculation result is determined in the above manner, the target calculation result and the target KL divergence value may be summed to obtain the relative entropy loss value of the first model to be trained.
It should be noted that each character to be recognized corresponds to one KL divergence value vector containing m × 1 numerical values, where m is the number of preset characters. For each character to be recognized, the target calculation result and the numerical values in the KL divergence value vector may then be summed correspondingly to obtain the relative entropy loss value of the first model to be trained.
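A minimal sketch of the two summation steps above, assuming the probabilities and the KL divergence value vectors are stored as (number of characters) × m arrays (the helper name and array shapes are assumptions):

```python
import numpy as np

def relative_entropy_loss(pred_probs, label_probs, kl_vectors):
    # pred_probs / label_probs: (n_chars, m) predicted and actual
    # probabilities over m preset characters per character to recognize.
    # kl_vectors: one KL divergence value vector (m values) per character.
    target = pred_probs + label_probs           # target calculation result
    return float(np.sum(target + kl_vectors))   # summed with the KL vector

rng = np.random.default_rng(0)
n_chars, m = 3, 26
loss = relative_entropy_loss(rng.random((n_chars, m)),
                             rng.random((n_chars, m)),
                             rng.random((n_chars, m)))
```

This mirrors the element-wise summation described in the text; in practice the per-position results would be reduced to a scalar before backpropagation, as done here with the final sum.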
In an optional embodiment of the present application, calculating the KL divergence between the first character recognition result and the second character recognition result comprises:
(1) transforming the first character recognition result to obtain a first logits vector, and transforming the second character recognition result to obtain a second logits vector;
(2) calculating a KL divergence between the first and second logits vectors.
Specifically, in the present application, after the target training samples are respectively subjected to the character recognition processing by each model to be trained to obtain a plurality of character recognition results, the first character recognition result may be transformed, where the transformation formula may be described as:
z = log( p / (1 − p) )

in the equation, p represents each prediction probability in the first character recognition result or, correspondingly, each prediction probability in the second character recognition result, and z is the resulting logits value.
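A sketch of such a probability-to-logits transform, assuming the standard per-element logit function z = log(p / (1 − p)) (the clipping bound and function name are illustrative assumptions):

```python
import numpy as np

def to_logits(p, eps=1e-12):
    # Assumed per-element logit transform z = log(p / (1 - p));
    # clipping keeps the log well-defined at probabilities of 0 or 1.
    p = np.clip(np.asarray(p, dtype=np.float64), eps, 1.0 - eps)
    return np.log(p / (1.0 - p))

z = to_logits([0.5, 0.9, 0.1])  # z[0] = 0, z[1] > 0, z[2] < 0
```

The transform is monotone, so the ordering of the prediction probabilities is preserved in the logits vector.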
The above process is described below with reference to fig. 3.
As shown in FIG. 3, Base model θ1 is the first model to be trained and Base model θ2 is the second model to be trained. First, Base model θ1 and Base model θ2 each perform character recognition processing on the target training sample to obtain a first character recognition result p1 and a second character recognition result p2. Then, the first character recognition result p1 and the second character recognition result p2 are respectively transformed according to the transformation formula above to obtain a first logits vector and a second logits vector, and the target KL divergence values KL(p1‖p2) and KL(p2‖p1) are calculated based on the two logits vectors. Next, the relative entropy loss function loss1 of Base model θ1 is calculated by combining the first character recognition result p1, the target KL divergence value KL(p1‖p2), and the label information of the target training sample; likewise, the relative entropy loss function loss2 of Base model θ2 is calculated by combining the second character recognition result p2, the target KL divergence value KL(p2‖p1), and the label information of the target training sample.
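The two-model flow of FIG. 3 can be sketched as follows; cross-entropy is used here as a stand-in for the label-information term, and all function names and shapes are illustrative assumptions rather than the patent's exact formulation:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl(p, q, eps=1e-12):
    # Per-row KL divergence KL(p || q).
    return np.sum(p * np.log((p + eps) / (q + eps)), axis=-1)

def mutual_losses(p1, p2, labels):
    # loss1 / loss2 each combine the model's own prediction, the KL
    # divergence toward the other model's prediction, and the label
    # information (cross-entropy stand-in for the label term).
    ce1 = -np.sum(labels * np.log(p1 + 1e-12), axis=-1)
    ce2 = -np.sum(labels * np.log(p2 + 1e-12), axis=-1)
    loss1 = float(np.mean(ce1 + kl(p1, p2)))   # for Base model theta1
    loss2 = float(np.mean(ce2 + kl(p2, p1)))   # for Base model theta2
    return loss1, loss2

# Two characters over four preset characters, identical predictions.
labels = np.eye(4)[[0, 1]]          # one-hot label information
p = softmax(np.zeros((2, 4)))       # uniform predictions from both models
l1, l2 = mutual_losses(p, p, labels)
```

When the two models agree exactly, the KL terms vanish and each loss reduces to its own label term, so neither model is pulled away from the shared prediction.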
As can be seen from the above, by calculating the relative entropy loss value of each model to be trained and training the models with this loss value, mutual learning among the plurality of models to be trained can be realized, so that any one model to be trained acquires functions or capabilities of the other models to be trained. This improves the adaptability of the models to be trained and their character recognition accuracy, and solves the technical problem of poor recognition accuracy when an existing character recognition model is used for character recognition.
Example 3:
According to an embodiment of the present invention, an embodiment of a training method for a network model is provided.
Fig. 4 is a flowchart of a method for training a network model according to an embodiment of the present invention, and as shown in fig. 4, the method includes the following steps:
step S402, obtaining a plurality of models to be trained and a plurality of target training sample groups of the models to be trained; the determination modes of the sample labels in different target training sample groups are different, and the determination modes of the sample labels are associated with the types of the models to be trained;
step S404, training each model to be trained with each target training sample group by the method described in embodiment 1 above;
step S406, testing the trained model to be trained by using a target test set to obtain a plurality of model test results, wherein one target training sample group corresponds to one model test result; the model test result is used for representing the accuracy of character recognition of the model;
step S408, determining the balance ability index of each model to be trained based on the model test result; the balance ability index is used for measuring the ability of each model to be trained to execute a plurality of operations;
and step S410, training each model to be trained again based on the balance ability index.
As can be seen from the above description, in the present application, the ability of each model to be trained to perform multiple operations can be determined by calculating the balance ability index of the model to be trained. By adopting the method, the balance capability of the model among a plurality of functions can be enhanced on the basis of improving the character recognition accuracy of the model.
In the present embodiment, it is assumed that the plurality of models to be trained includes the following types of models: a neural network model based on an attention mechanism, and an image segmentation network model. The plurality of target training samples includes: a first type of target training sample and a second type of target training sample.
Based on this, the label information of each kind of target training sample can be determined in the following manner, specifically including:
firstly, randomly generating the probability that each character to be recognized is each preset character for the first type of target training sample to obtain the label information of the first type of target training sample.
In an alternative embodiment, assuming that the number of preset characters is n, the probability that each character to be recognized is any given preset character can be randomly generated as 1/n for the first type of target training sample. For example, if the preset characters are the 26 English letters, the probability may be 1/26.
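The uniform label generation for the first type of target training sample can be sketched as follows (the function name and array layout are assumptions):

```python
import numpy as np

def uniform_labels(n_chars, n_preset):
    # Each character to be recognized is assigned probability 1/n for
    # every one of the n preset characters (first-type training sample).
    return np.full((n_chars, n_preset), 1.0 / n_preset)

labels = uniform_labels(5, 26)  # 5 characters, 26 English letters
```

Each row is a valid probability distribution summing to 1, with every preset character equally likely, so the label carries no language-sequence information by construction.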
Then, a target test set used for testing the plurality of models to be trained is obtained, and label information of a second type of target training sample is determined based on the label information of the target test set.
Specifically, in the present application, a target test set is first obtained, where the target test set may be any one or more of the following: ICDAR test set, SVT test set, CUTE test set, IIIT test set.
After the target test set is obtained, a test sample can be determined in the target test set, wherein the character sequence contained in the test sample is the same as the character sequence contained in the second type of target training sample. Then, the label information of the test sample is determined as the label information of the second type of target training sample.
In addition, the label information of the test sample can be transformed according to a preset writing format to obtain transformed label information, and the transformed label information is determined as the label information of the second type of target training sample.
It should be noted that, in the present application, in order to adapt to various case distributions, each piece of label information is generated in triplicate according to the same rule: full capitalization, first-letter capitalization, and full lowercase.
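The triplicate case rule above amounts to the following (the function name is an assumption):

```python
def triplicate_case_labels(text):
    # Full capitalization, first-letter capitalization, full lowercase.
    return [text.upper(), text.capitalize(), text.lower()]

variants = triplicate_case_labels("hello")  # three case variants of one label
```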
It should be further noted that, in the present application, a tag generation engine may be preset, and then the tag generation step is executed by the tag generation engine.
According to the above description, the label information of the first type of target training sample is randomly generated. After the model to be trained is trained with this type of target training sample, the model can be tested with the target test set, which reflects the model's ability to recognize image features.
The label information of the second type of target training sample is generated based on the label information of the target test set, and after the model to be trained is trained through the target training sample, the model is tested through the target test set, so that the modeling capacity of the model on the language sequence can be embodied.
In an optional embodiment, the length of the label information of the first type of target training sample is the same as the length of the label information of the second type of target training sample, and the length of the label information of the second type of target training sample is the same as the length of the label information of the target test set.
As shown in fig. 5, the label information of the left six target training samples is derived from the label information of the test set, and the label information of the right six target training samples is derived from pure random generation and is consistent with the length distribution of the label information of the left six target training samples.
In an alternative embodiment of the present application, if the plurality of model test results includes a first model test result and a second model test result, determining the balance ability index of each model to be trained based on the model test results includes the following process:

calculating a difference between the first model test result and the second model test result, and determining the difference as the balance ability index.
In this application, the first model test result may represent the accuracy of the model after it is tested by the target test set after the model to be trained is trained by one target training sample set. Likewise, the second model test result may represent the accuracy of the model after it has been tested by the target test set after it has been trained by another target training sample set on the model to be trained.
After the above two accuracy rates (i.e., the first model test result and the second model test result) are obtained, the difference between them may be calculated. If the difference is greater than a preset difference and the first model test result is greater than the second model test result, the balance ability of the model to be trained is poor.
That is, in the present application, if the difference is greater than the preset difference, each of the models to be trained is trained again.
By the training method, the balance capability of the model to be trained can be adjusted, and the character recognition precision of the model is improved.
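The balance ability check described above can be sketched as follows (function names and the threshold parameter are assumptions):

```python
def balance_index(first_result, second_result):
    # Balance ability index: the difference between the two model test
    # results (character recognition accuracies on the target test set).
    return first_result - second_result

def needs_retraining(first_result, second_result, preset_difference):
    # Retrain when the difference exceeds the preset difference and the
    # first model test result is the larger of the two.
    return (balance_index(first_result, second_result) > preset_difference
            and first_result > second_result)
```

For example, accuracies of 0.9 and 0.7 with a preset difference of 0.1 would flag the model for retraining, while 0.8 and 0.79 would not.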
In the method, an image to be recognized is first obtained; character recognition is then performed on the image to be recognized through a target neural network to obtain a character recognition result, where the target neural network is a model determined by the method described in embodiment 1 or embodiment 2.
In the application, when the target neural network obtained by using the training method described in embodiment 1 or embodiment 2 identifies an image to be identified, the character identification accuracy can be improved, and the technical problem that the existing character identification model is poor in identification accuracy in the character identification process is solved.
Example 4:
the embodiment of the present invention further provides a training device for a network model, where the training device for a network model is mainly used to execute the training method for a network model provided in the foregoing content of the embodiment of the present invention, and the following describes the training device for a network model provided in the embodiment of the present invention in detail.
Fig. 6 is a schematic diagram of a training apparatus for a network model according to an embodiment of the present invention, as shown in fig. 6, the training apparatus for a network model mainly includes a first obtaining unit 10, a recognition processing unit 20, a first determining unit 30 and an adjusting unit 40, where:
a first obtaining unit 10, configured to obtain a plurality of models to be trained and target training samples of the plurality of models to be trained;
the recognition processing unit 20 is configured to perform character recognition processing on the target training sample through each to-be-trained model to obtain a plurality of character recognition results, where each character recognition result represents a prediction probability that each to-be-recognized character in the target training sample is a preset character;
a first determining unit 30, configured to determine a relative entropy loss value of each model to be trained based on the multiple character recognition results and the label information of the target training sample; wherein the relative entropy loss value is used for representing the difference degree between a plurality of character recognition results;
and the adjusting unit 40 is used for adjusting the model parameters of the corresponding model to be trained through the relative entropy loss value.
According to the description, in the application, through calculating the relative entropy loss value of the model to be trained, and through the mode that the model to be trained is trained through the relative entropy loss value, mutual learning among a plurality of models to be trained can be realized, so that any model to be trained has functions or capabilities of other models to be trained, the adaptability of the model to be trained is improved, the character recognition accuracy of the model is improved, and the technical problem that the recognition accuracy is poor when the existing character recognition model is used for character recognition is solved.
Optionally, the first determining unit is configured to: determining a character recognition result of a first model to be trained and a character recognition result of a second model to be trained in the plurality of character recognition results, and respectively obtaining a first character recognition result and a second character recognition result, wherein the first model to be trained is a model of the plurality of models to be trained, of which the relative entropy loss value is to be calculated at the current moment, and the second model to be trained is the other model except the first model to be trained; calculating KL divergence between the first character recognition result and the second character recognition result to obtain a target KL divergence value; and determining a relative entropy loss value of the first model to be trained based on the target KL divergence value, the first character recognition result and the label information of the target training sample.
Optionally, the first determining unit is further configured to: calculating the KL divergence between the first and second word recognition results comprises: calculating KL divergence between the first character recognition result and each second character recognition result to obtain a plurality of target KL divergence values; determining the relative entropy loss value of the first model to be trained based on the target KL divergence value, the first character recognition result and the label information of the target training sample comprises: and determining a relative entropy loss value of the first model to be trained based on the plurality of target KL divergence values, the first character recognition result and the label information of the target training sample.
Optionally, the first determining unit is further configured to: determining the relative entropy loss value of the first model to be trained based on the target KL divergence value, the first character recognition result and the label information of the target training sample comprises: summing the prediction probability of each character to be recognized in the first character recognition result and the actual probability of the corresponding character to be recognized in the label information to obtain a target calculation result corresponding to each character to be recognized; and summing the target calculation result corresponding to each character to be recognized and the target KL divergence value to obtain a relative entropy loss value of the first model to be trained.
Optionally, the first determining unit is further configured to: transforming the first character recognition result to obtain a first logits vector; and transforming the second character recognition result to obtain a second logits vector; calculating a KL divergence between the first and second logits vectors.
Optionally, the plurality of models to be trained comprises the following types of models: a neural network model based on an attention mechanism and an image segmentation network model.
Example 5:
the embodiment of the present invention further provides another training apparatus for a network model, where the training apparatus for a network model is mainly used for executing the training method for a network model provided in the foregoing content of the embodiment of the present invention, and the following describes the training apparatus for a network model provided in the embodiment of the present invention in detail.
Fig. 7 is a schematic diagram of a training apparatus for a network model according to an embodiment of the present invention, as shown in fig. 7, the training apparatus for a network model mainly includes a second obtaining unit 50, a first training unit 60, a testing unit 70, a second determining unit 80, and a second training unit 90, where:
a second obtaining unit 50, configured to obtain a plurality of models to be trained and a plurality of target training sample sets of the plurality of models to be trained; the determination modes of the sample labels in different target training sample groups are different, and the determination modes of the sample labels are associated with the types of the models to be trained;
a first training unit 60, configured to train each model to be trained with each target training sample group by the method described in embodiment 1 above;
the test unit 70 is configured to test the trained model to be trained by using a target test set to obtain a plurality of model test results, where one target training sample group corresponds to one model test result; the model test result is used for representing the accuracy of character recognition of the model;
a second determining unit 80, configured to determine a balance ability index of each model to be trained based on the model test result; the balance ability index is used for measuring the ability of each model to be trained to execute a plurality of operations;
and a second training unit 90, configured to train each model to be trained again based on the balance ability index.
Optionally, the plurality of models to be trained comprises the following types of models: a neural network model and an image segmentation network model based on an attention mechanism; the plurality of target training sample sets comprises: a first type of target training sample and a second type of target training sample; the apparatus is also configured to: determining label information of each kind of target training sample by the following method, specifically comprising: randomly generating the probability that each character to be recognized is each preset character for the first type of target training sample to obtain the label information of the first type of target training sample; and acquiring a target test set for testing the plurality of models to be trained, and determining label information of a second type of target training sample based on the label information of the target test set.
Optionally, the length of the label information of the first type of target training sample is the same as the length of the label information of the second type of target training sample, and the length of the label information of the second type of target training sample is the same as the length of the label information of the target test set.
Optionally, the apparatus is further configured to: determining a test sample in the target test set, wherein the text sequence contained in the test sample is the same as the text sequence contained in the second type of target training sample; determining label information for the second type of target training sample based on the label information for the test sample.
Optionally, the apparatus is further configured to: determining label information of the test sample as label information of the second type of target training sample; and/or converting the label information of the test sample according to a preset writing format to obtain converted label information, and determining the converted label information as the label information of the second type of target training sample.
Optionally, the plurality of model test results comprise: a first model test result and a second model test result; the second determination unit is configured to: calculating a difference between the first model test result and the second model test result, and determining the difference as the balance ability index; the second training unit is to: and if the difference is larger than the preset difference, training each model to be trained again.
The embodiment of the invention also provides a character recognition device. The character recognition device mainly comprises a third acquisition unit and a character recognition unit, wherein:
a third acquisition unit configured to acquire an image to be recognized;
and the character recognition unit is used for performing character recognition on the image to be recognized through a target neural network to obtain a character recognition result, wherein the target neural network is a model obtained by training by adopting the method in any one of the embodiment 1 or the embodiment 2.
In the application, when the target neural network obtained by using the training method described in embodiment 1 or embodiment 2 identifies an image to be identified, the character identification accuracy can be improved, and the technical problem that the existing character identification model is poor in identification accuracy in the character identification process is solved.
The device provided by the embodiment of the present invention has the same implementation principle and technical effect as the method embodiments, and for the sake of brief description, reference may be made to the corresponding contents in the method embodiments without reference to the device embodiments.
In addition, in the description of the embodiments of the present invention, unless otherwise explicitly specified or limited, the terms "mounted," "connected," and "connected" are to be construed broadly, e.g., as meaning either a fixed connection, a removable connection, or an integral connection; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
In the description of the present invention, it should be noted that the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc., indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are only for convenience of description and simplicity of description, but do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processing device. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that the above embodiments are only specific embodiments of the present invention, used to illustrate the technical solutions of the present invention rather than to limit them, and the protection scope of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that any person familiar with the technical field may, within the technical scope disclosed herein, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes, or make equivalent substitutions for some technical features; such modifications, changes, or substitutions do not depart from the spirit and scope of the embodiments of the present invention and shall all be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (18)

1. A method for training a network model, comprising:
obtaining a plurality of models to be trained and target training samples of the plurality of models to be trained;
respectively performing character recognition processing on the target training sample through each model to be trained to obtain a plurality of character recognition results, wherein each character recognition result represents the prediction probability that each character to be recognized in the target training sample is each preset character;
determining a relative entropy loss value of each model to be trained based on the plurality of character recognition results and the label information of the target training sample; wherein the relative entropy loss value is used for representing the difference degree between a plurality of character recognition results;
and adjusting the model parameters of the corresponding model to be trained according to the relative entropy loss value.
2. The method of claim 1, wherein determining a relative entropy loss value for each model to be trained based on the plurality of word recognition results and label information for the target training samples comprises:
determining, from the plurality of character recognition results, a character recognition result of a first model to be trained and a character recognition result of a second model to be trained, to obtain a first character recognition result and a second character recognition result respectively, wherein the first model to be trained is the model, among the plurality of models to be trained, whose relative entropy loss value is to be calculated at the current moment, and the second model to be trained is a model other than the first model to be trained;
calculating KL divergence between the first character recognition result and the second character recognition result to obtain a target KL divergence value;
and determining a relative entropy loss value of the first model to be trained based on the target KL divergence value, the first character recognition result and the label information of the target training sample.
3. The method according to claim 2, wherein there are a plurality of second models to be trained, and each second model to be trained corresponds to a second character recognition result;
calculating the KL divergence between the first and second word recognition results comprises: calculating KL divergence between the first character recognition result and each second character recognition result to obtain a plurality of target KL divergence values;
determining the relative entropy loss value of the first model to be trained based on the target KL divergence value, the first character recognition result and the label information of the target training sample comprises: and determining a relative entropy loss value of the first model to be trained based on the plurality of target KL divergence values, the first character recognition result and the label information of the target training sample.
4. The method according to claim 2 or 3, wherein the label information is used for representing the actual probability that each character to be recognized in the target training sample is each preset character;
determining the relative entropy loss value of the first model to be trained based on the target KL divergence value, the first character recognition result and the label information of the target training sample comprises:
summing the prediction probability of each character to be recognized in the first character recognition result and the actual probability of the corresponding character to be recognized in the label information to obtain a target calculation result corresponding to each character to be recognized;
and summing the target calculation result corresponding to each character to be recognized and the target KL divergence value to obtain a relative entropy loss value of the first model to be trained.
5. The method of claim 2, wherein calculating the KL divergence between the first and second word recognition results comprises:
transforming the first character recognition result to obtain a first logits vector; and transforming the second character recognition result to obtain a second logits vector;
calculating KL divergence between the first and second logits vectors to obtain the target KL divergence value.
6. The method of claim 1, wherein the plurality of models to be trained comprise the following types of models: a neural network model based on an attention mechanism and an image segmentation network model.
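The relative-entropy loss defined in claims 1 to 5 can be sketched in NumPy as follows. This is a minimal illustration, not the patent's exact formulation: the function names, array shapes, and the use of one-hot label information are assumptions made for the example.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) between two models' prediction probabilities, summed over
    # the preset characters and averaged over the characters to be
    # recognized; p and q have shape (num_positions, num_preset_chars).
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.mean(np.sum(p * np.log(p / q), axis=-1)))

def cross_entropy(p, labels, eps=1e-12):
    # Supervised term against the label information, here assumed to be
    # one-hot actual probabilities of each preset character.
    p = np.clip(p, eps, 1.0)
    return float(np.mean(-np.sum(labels * np.log(p), axis=-1)))

def relative_entropy_loss(first_probs, peer_probs, labels):
    # Loss for the first model to be trained: supervised term plus the KL
    # divergence against every second (peer) model, per claims 2 and 3.
    loss = cross_entropy(first_probs, labels)
    loss += sum(kl_divergence(first_probs, q) for q in peer_probs)
    return loss
```

When the peer models agree exactly with the first model, the KL term vanishes and only the supervised term remains, which matches the intuition that the loss measures the degree of difference between the character recognition results.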
7. A method for training a network model, comprising:
obtaining a plurality of models to be trained and a plurality of target training sample groups for the plurality of models to be trained, wherein the sample labels in different target training sample groups are determined in different manners, and the manner of determining the sample labels is associated with the type of the model to be trained;
training each model to be trained with each target training sample group by the method of any one of the preceding claims 1 to 6;
testing the trained model to be trained by using a target test set to obtain a plurality of model test results, wherein one target training sample group corresponds to one model test result; the model test result is used for representing the accuracy of character recognition of the model;
determining a balance ability index of each model to be trained based on the model test result; the balance ability index is used for measuring the ability of each model to be trained to execute various operations;
and training each model to be trained again based on the balance ability index.
8. The method of claim 7, wherein the plurality of models to be trained comprise the following types of models: a neural network model based on an attention mechanism and an image segmentation network model; and the plurality of target training sample groups comprise: a first type of target training sample and a second type of target training sample;
wherein the label information of each type of target training sample is determined as follows:
randomly generating the probability that each character to be recognized is each preset character for the first type of target training sample to obtain the label information of the first type of target training sample;
and acquiring a target test set for testing the plurality of models to be trained, and determining label information of a second type of target training sample based on the label information of the target test set.
9. The method of claim 8, wherein the length of the label information of the first type of target training sample is the same as the length of the label information of the second type of target training sample, and the length of the label information of the second type of target training sample is the same as the length of the label information of the target test set.
10. The method of claim 8, wherein determining label information for the second type of target training sample based on the label information for the target test set comprises:
determining a test sample in the target test set, wherein the text sequence contained in the test sample is the same as the text sequence contained in the second type of target training sample;
determining label information for the second type of target training sample based on the label information for the test sample.
11. The method of claim 10, wherein determining the label information of the second type of target training sample based on the label information of the test sample comprises:
determining label information of the test sample as label information of the second type of target training sample;
and/or
converting the label information of the test sample according to a preset writing format to obtain converted label information, and determining the converted label information as the label information of the second type of target training sample.
12. The method of claim 7, wherein the plurality of model test results comprises: a first model test result and a second model test result;
determining the balance ability index of each model to be trained based on the model test result comprises: calculating a difference between the first model test result and the second model test result, and determining the difference as the balance ability index;
training each model to be trained again based on the balance ability index comprises: and if the difference is larger than the preset difference, training each model to be trained again.
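Claim 12 reduces the balance ability index to a gap check between two model test results. A minimal sketch, assuming the test results are accuracy values in [0, 1] and using a hypothetical preset difference of 0.05 (the patent leaves the threshold unspecified):

```python
def balance_ability_index(first_result: float, second_result: float) -> float:
    # Claim 12: the index is the difference between the first and second
    # model test results (character recognition accuracies on the test set).
    return abs(first_result - second_result)

def should_retrain(first_result: float, second_result: float,
                   preset_difference: float = 0.05) -> bool:
    # Train every model to be trained again when the gap exceeds the preset
    # difference; 0.05 is an illustrative threshold, not from the patent.
    return balance_ability_index(first_result, second_result) > preset_difference
```

A large gap indicates that the models' abilities to execute the various operations are unbalanced, which triggers the second training stage of claim 7.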
13. A method for recognizing a character, comprising:
acquiring an image to be recognized;
and performing character recognition on the image to be recognized through a target neural network to obtain a character recognition result, wherein the target neural network is a model obtained by training by adopting the method of any one of claims 1 to 12.
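Since each character recognition result holds a prediction probability for every preset character, the final text of claim 13 can be obtained by a greedy arg-max decode over the network's output. A minimal sketch with a hypothetical preset character set; the patent does not specify the decoding step:

```python
import numpy as np

def decode_characters(probs, preset_chars):
    # Greedy decoding: for each character to be recognized, pick the preset
    # character with the highest prediction probability.
    # probs has shape (num_positions, len(preset_chars)).
    return "".join(preset_chars[int(i)] for i in np.argmax(probs, axis=-1))
```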
14. An apparatus for training a network model, comprising:
the device comprises a first acquisition unit, a second acquisition unit and a control unit, wherein the first acquisition unit is used for acquiring a plurality of models to be trained and target training samples of the plurality of models to be trained;
the recognition processing unit is used for respectively carrying out character recognition processing on the target training sample through each model to be trained to obtain a plurality of character recognition results, wherein each character recognition result represents the prediction probability that each character to be recognized in the target training sample is each preset character;
a first determining unit, configured to determine a relative entropy loss value of each model to be trained based on the multiple character recognition results and the label information of the target training sample; wherein the relative entropy loss value is used for representing the difference degree between a plurality of character recognition results;
and the adjusting unit is used for adjusting the model parameters of the corresponding model to be trained according to the relative entropy loss value.
15. An apparatus for training a network model, comprising:
the second acquisition unit is used for acquiring a plurality of models to be trained and a plurality of target training sample sets of the models to be trained; the determination modes of the sample labels in different target training sample groups are different, and the determination modes of the sample labels are associated with the types of the models to be trained;
a first training unit for training each model to be trained by the method of any one of the preceding claims 1 to 6 with each target training sample set;
the test unit is used for testing the trained model to be trained by utilizing a target test set to obtain a plurality of model test results, wherein one target training sample group corresponds to one model test result; the model test result is used for representing the accuracy of character recognition of the model;
a second determining unit, configured to determine a balance ability index of each model to be trained based on the model test result; the balance ability index is used for measuring the ability of each model to be trained to execute a plurality of operations;
and the second training unit is used for training each model to be trained again based on the balance ability index.
16. A character recognition apparatus, comprising:
a third acquisition unit configured to acquire an image to be recognized;
a character recognition unit, configured to perform character recognition on the image to be recognized through a target neural network to obtain a character recognition result, where the target neural network is a model obtained by training according to the method of any one of claims 1 to 12.
17. An electronic device comprising a memory, a processing device and a computer program stored on the memory and executable on the processing device, wherein the processing device implements the steps of the method of any of the preceding claims 1 to 6, or implements the steps of the method of any of the preceding claims 7 to 12, or implements the steps of the method of claim 13 when executing the computer program.
18. A computer readable medium having non-volatile program code executable by a processing device, characterized in that the program code causes the processing device to perform the steps of the method of any of the preceding claims 1 to 6, or to carry out the steps of the method of any of the preceding claims 7 to 12, or to carry out the steps of the method of claim 13.
CN202010330213.6A 2020-04-23 2020-04-23 Network model training and character recognition method and device and electronic equipment Pending CN111667066A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010330213.6A CN111667066A (en) 2020-04-23 2020-04-23 Network model training and character recognition method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN111667066A true CN111667066A (en) 2020-09-15

Family

ID=72382904

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010330213.6A Pending CN111667066A (en) 2020-04-23 2020-04-23 Network model training and character recognition method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN111667066A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104966097A (en) * 2015-06-12 2015-10-07 成都数联铭品科技有限公司 Complex character recognition method based on deep learning
CN108920622A (en) * 2018-06-29 2018-11-30 北京奇艺世纪科技有限公司 A kind of training method of intention assessment, training device and identification device
CN109086871A (en) * 2018-07-27 2018-12-25 北京迈格威科技有限公司 Training method, device, electronic equipment and the computer-readable medium of neural network
US20190102678A1 (en) * 2017-09-29 2019-04-04 Samsung Electronics Co., Ltd. Neural network recogntion and training method and apparatus
CN109960808A (en) * 2019-03-26 2019-07-02 广东工业大学 A kind of text recognition method, device, equipment and computer readable storage medium
CN110059828A (en) * 2019-04-23 2019-07-26 杭州智趣智能信息技术有限公司 A kind of training sample mask method, device, equipment and medium
CN110209817A (en) * 2019-05-31 2019-09-06 安徽省泰岳祥升软件有限公司 Training method, device and the text handling method of text-processing model
CN110858307A (en) * 2018-08-24 2020-03-03 国信优易数据有限公司 Character recognition model training method and device and character recognition method and device
CN111027345A (en) * 2018-10-09 2020-04-17 北京金山办公软件股份有限公司 Font identification method and apparatus

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112270316B (en) * 2020-09-23 2023-06-20 北京旷视科技有限公司 Character recognition, training method and device of character recognition model and electronic equipment
CN112270316A (en) * 2020-09-23 2021-01-26 北京旷视科技有限公司 Character recognition method, character recognition model training method, character recognition device, and electronic equipment
CN112364860A (en) * 2020-11-05 2021-02-12 北京字跳网络技术有限公司 Training method and device of character recognition model and electronic equipment
CN112784903A (en) * 2021-01-26 2021-05-11 上海明略人工智能(集团)有限公司 Method, device and equipment for training target recognition model
CN112784903B (en) * 2021-01-26 2023-12-12 上海明略人工智能(集团)有限公司 Method, device and equipment for training target recognition model
CN112990429A (en) * 2021-02-01 2021-06-18 深圳市华尊科技股份有限公司 Machine learning method, electronic equipment and related product
CN113128220A (en) * 2021-04-30 2021-07-16 北京奇艺世纪科技有限公司 Text distinguishing method and device, electronic equipment and storage medium
CN113128220B (en) * 2021-04-30 2023-07-18 北京奇艺世纪科技有限公司 Text discrimination method, text discrimination device, electronic equipment and storage medium
WO2023273985A1 (en) * 2021-06-30 2023-01-05 北京有竹居网络技术有限公司 Method and apparatus for training speech recognition model and device
CN113470626A (en) * 2021-06-30 2021-10-01 北京有竹居网络技术有限公司 Training method, device and equipment of voice recognition model
CN113420689A (en) * 2021-06-30 2021-09-21 平安科技(深圳)有限公司 Character recognition method and device based on probability calibration, computer equipment and medium
CN113470626B (en) * 2021-06-30 2024-01-26 北京有竹居网络技术有限公司 Training method, device and equipment for voice recognition model
CN113420689B (en) * 2021-06-30 2024-03-22 平安科技(深圳)有限公司 Character recognition method, device, computer equipment and medium based on probability calibration
CN113469188A (en) * 2021-07-15 2021-10-01 有米科技股份有限公司 Method and device for data enhancement and character recognition of character recognition model training
CN114937267A (en) * 2022-04-20 2022-08-23 北京世纪好未来教育科技有限公司 Training method and device for text recognition model and electronic equipment
CN114937267B (en) * 2022-04-20 2024-04-02 北京世纪好未来教育科技有限公司 Training method and device for text recognition model and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination