CN115565179A

CN115565179A - Method, system and device for correcting errors after character recognition

Info

Publication number: CN115565179A
Application number: CN202211222382.3A
Authority: CN
Inventors: 王博帝; 彭斌; 姚毅
Original assignee: Shenzhen Lingyun Shixun Technology Co ltd
Current assignee: Shenzhen Lingyun Shixun Technology Co ltd
Priority date: 2022-10-08
Filing date: 2022-10-08
Publication date: 2023-01-03

Abstract

The application relates to the technical field of character recognition methods, in particular to a method, a system and a device for correcting errors after character recognition, which can solve the problem of low recognition accuracy when text recognition is carried out through deep learning to a certain extent. The method for correcting errors after character recognition comprises the following steps: acquiring a posterior probability matrix, wherein each one-hot vector in the posterior probability matrix corresponds to one character; determining the maximum value and the second largest value in all the one-hot vectors, and when the score difference value between the maximum value and the second largest value in any one-hot vector is smaller than or equal to a first threshold value, taking the character corresponding to the maximum value and the character corresponding to the second largest value as candidate characters; combining the candidate characters and the characters corresponding to the maximum value in the other one-hot vectors according to the sequence from left to right to obtain a character string to be processed; and carrying out duplication removal operation on the character string to be processed to obtain a candidate character string, sending the candidate character string to an error correction branch, wherein the error correction branch is used for processing the candidate character string and outputting an error correction result.

Description

Method, system and device for correcting errors after character recognition

Technical Field

The present application relates to the technical field of character recognition methods, and in particular, to a method, a system, and an apparatus for error correction after character recognition.

Background

With the development of science and technology, deep learning is widely applied to industrial vision, and the performance of the deep learning in a complex scene is obviously superior to that of a traditional image processing algorithm. The deep learning is driven by data, images are mapped to high-dimensional feature space, and then different processing is carried out according to different tasks.

Deep learning character recognition is a specialized application of deep learning in the vision field and can be divided by task into text detection, text recognition, and end-to-end text recognition. Text recognition extracts features from a cropped text image and performs sequence modeling, and the text recognition network outputs a posterior probability matrix. In practice, text recognition is often implemented with a convolutional recurrent neural network (CRNN) or an attention network.

However, the industrial vision field often involves scenarios with visually similar characters. When text recognition is performed with a conventional deep learning model, similar characters such as 8/B, O/O/0, 5/S, I/L, and U/V are still misrecognized, which reduces the accuracy of text recognition.

Disclosure of Invention

In order to solve the problem of low recognition accuracy in text recognition through deep learning, the application provides a method, a system and a device for correcting errors after character recognition.

The embodiment of the application is realized as follows:

a first aspect of an embodiment of the present application provides a method for correcting errors after character recognition, including:

acquiring a posterior probability matrix, wherein each one-hot vector in the posterior probability matrix corresponds to one character;

determining a maximum value and a second maximum value in all one-hot vectors, wherein when a score difference value between the maximum value and the second maximum value in any one-hot vector is smaller than or equal to a first threshold value, a character corresponding to the maximum value and a character corresponding to the second maximum value are candidate characters;

combining the candidate characters and characters corresponding to the maximum values in the rest one-hot vectors according to the sequence from left to right to obtain character strings to be processed;

and performing a deduplication operation on the character string to be processed to obtain a candidate character string, and sending the candidate character string to an error correction branch, wherein the error correction branch is used for processing the candidate character string and outputting an error correction result.

In some embodiments, between the step of obtaining a posterior probability matrix, where each one-hot vector in the posterior probability matrix corresponds to one character, and the step of combining the candidate characters and the characters corresponding to the maximum value in the remaining one-hot vectors in the initial left-to-right order to obtain the character string to be processed, the method further includes:

determining a maximum value, a second maximum value and a third maximum value in all one-hot vectors, wherein when a score difference value between the maximum value and the second maximum value in any one-hot vector is less than or equal to a first threshold value, a character corresponding to the maximum value, a character corresponding to the second maximum value and a character corresponding to the third maximum value are candidate characters.

In some embodiments, in the step of outputting the error correction result by the error correction branch, the method further includes:

calculating a CTC loss value based on the candidate character string and the posterior probability matrix;

sorting the CTC loss values in ascending order;

and outputting an error correction result, wherein the error correction result is the candidate character string with the minimum CTC loss value.

In some embodiments, when the score difference between the maximum value and the second maximum value is greater than the first threshold in every one-hot vector, no candidate character exists. In this case, a recognition character string is obtained by performing the deduplication operation on the character string formed by the characters corresponding to the maximum value in all one-hot vectors, and the recognition character string is sent to a matching branch, where the matching branch is used for comparing a preset regular expression with the character sequence in the recognition character string;

when the recognition character string can be matched with the regular expression, the character string is considered to be correctly recognized;

the regular expression is preset according to the composition rule of the correct character sequence.

In some embodiments, in the step of determining that the character corresponding to the maximum value and the character corresponding to the second largest value are candidate characters when the score difference between the maximum value and the second largest value in the one-hot vector at any position is less than or equal to a first threshold, the first threshold is 0.05 to 0.15.

A second aspect of an embodiment of the present application provides a post-character-recognition error correction system, including:

an acquisition module: configured to obtain a posterior probability matrix, wherein each one-hot vector in the posterior probability matrix corresponds to one character;

a candidate character determination module: configured to determine the maximum value and the second largest value in all one-hot vectors, wherein when the score difference between the maximum value and the second largest value in any one-hot vector is smaller than or equal to a first threshold value, the character corresponding to the maximum value and the character corresponding to the second largest value are candidate characters;

a character string acquisition module: configured to combine the candidate characters and the characters corresponding to the maximum value in the other one-hot vectors in left-to-right order to obtain a character string to be processed;

a character string processing module: configured to perform a deduplication operation on the character string to be processed to obtain a candidate character string, and to send the candidate character string to an error correction branch, wherein the error correction branch is used for processing the candidate character string and outputting an error correction result.

In some embodiments, the candidate character determination module is further configured to:

determining a maximum value, a second maximum value and a third maximum value in all the one-hot vectors, wherein when a score difference value between the maximum value and the second maximum value in any one-hot vector is smaller than or equal to a first threshold value, a character corresponding to the maximum value, a character corresponding to the second maximum value and a character corresponding to the third maximum value are candidate characters.

In some embodiments, the string processing module is further configured to:

calculating a CTC loss value based on the candidate character string and the posterior probability matrix;

sorting the CTC loss values in ascending order;

In some embodiments, the string processing module is further configured to:

when the score difference between the maximum value and the second maximum value is greater than the first threshold in every one-hot vector, no candidate character exists; in this case, a recognition character string is obtained by performing the deduplication operation on the character string formed by the characters corresponding to the maximum value in all one-hot vectors, and the recognition character string is sent to a matching branch, wherein the matching branch is used for comparing a preset regular expression with the character sequence in the recognition character string;

when the recognition character string can be matched with the regular expression, the character string is considered to be correctly recognized;

A third aspect of the embodiments of the present application provides a post-character-recognition error correction device, including: at least one processor, a memory, and an input-output unit; wherein the memory is configured to store a computer program, and the processor is configured to call the computer program stored in the memory to execute the post-character-recognition error correction method of the first aspect.

The beneficial effects of this application are as follows: when a candidate character exists, the candidate characters and the characters corresponding to the maximum value in the remaining one-hot vectors are combined in the initial left-to-right order, and the candidate character strings formed by the deduplication operation are then processed by the error correction branch. The error correction result can thus be output efficiently and accurately, which improves the correctness of the character string recognition result and the applicability of deep learning text recognition in scenarios with similar characters.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and those skilled in the art can obtain other drawings without inventive labor.

FIG. 1 is a schematic flow chart of a method for error correction after character recognition according to some embodiments of the present application;

FIG. 2 is a flow chart illustrating a method for post-character-recognition error correction according to further embodiments of the present application;

FIG. 3 is a flowchart illustrating a step of outputting an error correction result by an error correction branch during error correction after character recognition according to one or more embodiments of the present application;

FIG. 4 is a block diagram of a post character recognition error correction system in accordance with one or more embodiments of the present application;

FIG. 5 is a schematic structural diagram of an error correction device after character recognition according to one or more embodiments of the present application.

Detailed Description

To make the objects, embodiments and advantages of the present application clearer, the following is a clear and complete description of exemplary embodiments of the present application with reference to the attached drawings in exemplary embodiments of the present application, and it is apparent that the exemplary embodiments described are only a part of the embodiments of the present application, and not all of the embodiments.

It should be noted that the brief descriptions of the terms in the present application are only for the convenience of understanding the embodiments described below, and are not intended to limit the embodiments of the present application. These terms should be understood in their ordinary and customary meaning unless otherwise indicated.

The terms "first," "second," "third," and the like in the description and claims of this application and in the above-described drawings are used for distinguishing between similar or analogous objects or entities and not necessarily for describing a particular sequential or chronological order, unless otherwise indicated. It is to be understood that the terms so used are interchangeable under appropriate circumstances.

The terms "comprises" and "comprising," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a product or apparatus that comprises a list of elements is not necessarily limited to all elements expressly listed, but may include other elements not expressly listed or inherent to such product or apparatus.

FIG. 1 is a flowchart illustrating an error correction method after character recognition according to an embodiment.

As shown in fig. 1, the method for correcting errors after character recognition specifically includes the following steps:

in step 100, a posterior probability matrix is obtained, wherein each one-hot vector in the posterior probability matrix corresponds to one character.

After receiving the text image, an information processing unit such as a processor performs optical character recognition on the text image through a neural network recognition model and outputs a posterior probability matrix through sequence modeling. The posterior probability matrix records the probabilities of the different characters at the different time steps and is auxiliary information generated while the neural network recognition model recognizes the text. In the posterior probability matrix, the rows represent the width of the cropped text image after downsampling, and each column of the matrix corresponds to one one-hot vector output.
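As an illustration of what such a matrix might look like, the following minimal sketch stores it as a nested list with one entry per time step. The alphabet, the scores, the orientation (one row per time step) and the helper name `best_path` are all assumptions made for this example and do not come from the application.

```python
# Hypothetical alphabet; index 0 is the CTC blank, written "_" as in the text.
ALPHABET = ["_", "A", "B", "8"]

# Toy posterior probability matrix: one row per time step,
# one score per character of the alphabet (values are invented).
posterior = [
    [0.05, 0.90, 0.03, 0.02],  # step 0: "A" is clearly the best
    [0.02, 0.03, 0.45, 0.50],  # step 1: "8" narrowly beats "B"
    [0.94, 0.02, 0.02, 0.02],  # step 2: blank
]

def best_path(matrix, alphabet):
    """Return the highest-scoring character at every time step."""
    return [alphabet[max(range(len(row)), key=row.__getitem__)] for row in matrix]

print(best_path(posterior, ALPHABET))  # ['A', '8', '_']
```

Step 1 is the kind of position the method targets: the two best scores are close, so taking only the maximum may pick the wrong one of two similar characters.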

In step 200, the maximum value and the second largest value in all the one-hot vectors are determined, and when the score difference between the maximum value and the second largest value in any one-hot vector is less than or equal to a first threshold, the character corresponding to the maximum value and the character corresponding to the second largest value are candidate characters.

It should be noted that, the first threshold value is 0.05 to 0.15, and in some embodiments, when the score difference is less than or equal to 0.05, the character corresponding to the maximum value and the character corresponding to the second largest value are candidate characters. In some embodiments, when the score difference is less than or equal to 0.1, the character corresponding to the maximum value and the character corresponding to the second largest value are candidate characters. In other embodiments, the maximum value, the second largest value and the third largest value in all the one-hot vectors are determined, and when the score difference between the maximum value and the second largest value in any one of the one-hot vectors is less than or equal to 0.15, the character corresponding to the maximum value, the character corresponding to the second largest value and the character corresponding to the third largest value are candidate characters, at this time, the error correction accuracy is favorably improved, but the calculation amount is increased, and the method is more advantageous in scenes with higher accuracy requirements.

It is understood that when there is no candidate character, i.e., the maximum value and the second maximum value in the one-hot vector have a numerical difference greater than 0.15, the character string to be processed is not formed.
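The candidate selection of step 200 can be sketched as follows. This is a minimal illustration under assumptions: the function name `find_candidates`, the `top_k` parameter (2 for the basic variant, 3 for the variant that also keeps the third largest value) and the toy matrix are hypothetical.

```python
ALPHABET = ["_", "A", "B", "8"]  # hypothetical alphabet, "_" is the blank
posterior = [
    [0.05, 0.90, 0.03, 0.02],
    [0.02, 0.03, 0.45, 0.50],
    [0.94, 0.02, 0.02, 0.02],
]

def find_candidates(matrix, alphabet, threshold=0.1, top_k=2):
    """For each time step, list the characters kept for that position.

    If the gap between the best and second-best score is at most
    `threshold`, the top `top_k` characters are kept as candidates;
    otherwise only the best character is kept.
    """
    per_step = []
    for row in matrix:
        order = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        margin = row[order[0]] - row[order[1]]
        keep = order[:top_k] if margin <= threshold else order[:1]
        per_step.append([alphabet[i] for i in keep])
    return per_step

print(find_candidates(posterior, ALPHABET))
# [['A'], ['8', 'B'], ['_']]
```

A larger threshold or `top_k=3` yields more candidates per position, which matches the trade-off noted above: better coverage at the cost of more strings to evaluate.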

In step 300, the candidate characters and the characters corresponding to the maximum values in the remaining one-hot vectors are combined in the order from left to right to obtain the character string to be processed.
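One way to read this combination step is as a Cartesian product over the per-position character lists, taken in left-to-right order. The sketch below assumes that reading; `build_strings` and the sample input are hypothetical.

```python
from itertools import product

def build_strings(per_step_candidates):
    """Enumerate every string obtainable by picking one character per position."""
    return ["".join(chars) for chars in product(*per_step_candidates)]

# Position 1 has two candidates, every other position has only its maximum.
per_step = [["A"], ["8", "B"], ["_"], ["1"]]
print(build_strings(per_step))  # ['A8_1', 'AB_1']
```

The number of strings is the product of the list lengths, so it grows quickly when many positions have candidates.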

In step 400, a candidate character string is obtained by performing a deduplication operation on the character string to be processed, and the candidate character string is sent to an error correction branch, where the error correction branch is used to process the candidate character string and output an error correction result.

It should be noted that the predicted result may contain the same character several times in a row, in which case the deduplication operation ensures that only one of them is retained. If the text itself contains a repeated character, as in "apple" with its two consecutive "p" characters, the prediction takes the form "p_p", and the deduplication operation then removes the "_". The deduplication operation therefore does not affect text that genuinely contains repeated characters.
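The deduplication described here behaves like the usual CTC collapse: merge runs of identical symbols first, then drop the blank. A minimal sketch, assuming "_" is the blank symbol and using the hypothetical name `ctc_collapse`:

```python
from itertools import groupby

BLANK = "_"  # assumed blank symbol, as written in the text

def ctc_collapse(raw):
    """Merge consecutive duplicates, then remove blanks."""
    merged = (ch for ch, _group in groupby(raw))
    return "".join(ch for ch in merged if ch != BLANK)

print(ctc_collapse("aap_pplle"))  # apple
print(ctc_collapse("app"))        # ap
```

The second call shows why the blank matters: without a "_" between the two "p" characters, the repeat is treated as one character.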

By exhaustively enumerating the candidate characters, combining them with the other characters in left-to-right order, and then applying the deduplication and error correction branch processing, the correct character string can be obtained and the character recognition accuracy is improved.

FIG. 2 is a flowchart illustrating an error correction method after character recognition during error correction according to another embodiment.

As shown in fig. 2, in some embodiments, the post-character-recognition error correction method employs the following steps:

in step 210, a posterior probability matrix is obtained, where each one-hot vector in the posterior probability matrix corresponds to one character.

In step 220, the maximum value and the second largest value in all the one-hot vectors are determined, and it is judged whether the score difference between the maximum value and the second largest value in any one-hot vector is smaller than or equal to a first threshold value.

In step 230, when the score difference between the maximum value and the second largest value in the one-hot vector at any position is less than or equal to the first threshold, the character corresponding to the maximum value and the character corresponding to the second largest value are candidate characters.

In step 240, the candidate characters and the characters corresponding to the maximum value in the remaining one-hot vectors are combined in order from left to right to obtain the character string to be processed.

In step 250, a candidate character string is obtained by performing the deduplication operation on the character string to be processed, and the candidate character string is sent to an error correction branch, which is used for processing the candidate character string and outputting an error correction result.

In step 260, when the score difference between the maximum value and the second largest value is greater than the first threshold in every one-hot vector, no candidate character exists. In this case, a deduplication operation is performed on the character string formed by the characters corresponding to the maximum value in all one-hot vectors to obtain a recognition character string, and the recognition character string is sent to a matching branch, where the matching branch is used for comparing a preset regular expression with the character sequence in the recognition character string.

In step 270, when the recognition character string can match the regular expression, the character string recognition is considered to be correct.

It should be noted that character sequences often follow a certain composition rule, so the regular expression may be set manually according to that rule. For example, if the composition rule of a batch of product serial codes is that the first character is a capital letter and the last four characters are digits, such as A1234, the regular expression is written manually to express this rule. When the recognition character string of the matching branch matches the regular expression, the original recognition is considered accurate, and the character string formed by the characters corresponding to the maximum value in all one-hot vectors does not need to be corrected. When the recognition character string of the matching branch does not match the regular expression, the recognition is considered erroneous and cannot be corrected, so the user is warned that the character string recognition result is wrong.
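For the serial code example above (one capital letter followed by four digits), the matching branch could be sketched as below. The pattern and the function name `is_valid_serial` are assumptions for illustration; a real deployment would use whatever rule fits its own codes.

```python
import re

# One capital letter followed by exactly four digits, e.g. "A1234".
SERIAL_PATTERN = re.compile(r"[A-Z][0-9]{4}")

def is_valid_serial(text):
    """True when the whole recognized string follows the composition rule."""
    return SERIAL_PATTERN.fullmatch(text) is not None

print(is_valid_serial("A1234"))  # True
print(is_valid_serial("81234"))  # False
```

`fullmatch` is used so that a longer or shorter string that merely contains a valid code is still rejected.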

By combining the two mechanisms of exhaustive candidate enumeration and regular expression matching, the correctness of the character string recognition result is effectively improved. This makes deep learning text recognition usable in scenarios with similar characters, improves the applicability of deep learning in industrial visual character recognition, and thereby improves its production efficiency.
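The choice between the two branches in FIG. 2 depends only on whether any position produced more than one character. A minimal sketch of that decision, with the hypothetical name `choose_branch` and hypothetical branch labels:

```python
def choose_branch(per_step_candidates):
    """Return which branch handles a recognition result.

    `per_step_candidates` holds, for each position, the characters kept
    for it; more than one character at a position means candidates exist.
    """
    has_candidates = any(len(chars) > 1 for chars in per_step_candidates)
    return "error_correction" if has_candidates else "matching"

print(choose_branch([["A"], ["8", "B"], ["1"]]))  # error_correction
print(choose_branch([["A"], ["1"], ["2"]]))       # matching
```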

FIG. 3 is a flowchart illustrating the steps of outputting error correction results by the error correction branch after character recognition and error correction.

As shown in fig. 3, when the error correction branch processes the candidate character strings, the following manner is adopted:

in step 310, a CTC loss value is calculated based on the candidate character string and the posterior probability matrix.

Specifically, a bidirectional LSTM (Long Short-Term Memory) is introduced on the basis of the CRNN (Convolutional Recurrent Neural Network) text recognition algorithm to enhance context modeling, so that the context information in the candidate character strings is effectively extracted and a feature sequence is output; the output feature sequence is then input to a CTC (Connectionist Temporal Classification) unit to calculate the CTC loss value.

In step 320, the CTC loss values are sorted in ascending order; at this point a plurality of CTC loss values have been obtained, each corresponding to one candidate character string.

In step 330, an error correction result is output, where the error correction result is the candidate character string with the smallest CTC loss value.

It can be understood that the output error correction result is taken as the correct character string. Correcting the candidate character strings according to the CTC loss value is consistent with the optimization direction of the training stage and provides a confidence measure for the error correction result, thereby improving the accuracy of the overall character recognition.

In summary, the post-character-recognition error correction method is independent of the structure of the text recognition network: the backbone may be a CNN or a Transformer, and the sequence modeling part may be an RNN or only a fully connected (FC) layer. As long as the loss function used in the training phase is the CTC loss, the method of the present application can be applied, so it has a wide scope of application.
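Steps 310 to 330 can be illustrated with a small, pure-Python version of the standard CTC forward algorithm applied directly to a posterior matrix. This is a sketch under assumptions, not the application's implementation: the names `ctc_loss` and `correct`, the toy matrix and the alphabet are invented, and a production system would normally rely on the CTC loss provided by its deep learning framework.

```python
import math

ALPHABET = ["_", "A", "B", "8"]  # hypothetical alphabet, "_" is the blank
posterior = [
    [0.05, 0.90, 0.03, 0.02],
    [0.02, 0.03, 0.45, 0.50],
    [0.94, 0.02, 0.02, 0.02],
]

def ctc_loss(matrix, alphabet, target, blank="_"):
    """Negative log-likelihood of `target` under CTC (forward algorithm)."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    # Interleave blanks around the labels: "A8" -> [_, A, _, 8, _]
    ext = [index[blank]]
    for ch in target:
        ext += [index[ch], index[blank]]
    alpha = [0.0] * len(ext)
    alpha[0] = matrix[0][ext[0]]
    if len(ext) > 1:
        alpha[1] = matrix[0][ext[1]]
    for row in matrix[1:]:
        prev, alpha = alpha, [0.0] * len(ext)
        for s, label in enumerate(ext):
            total = prev[s]
            if s >= 1:
                total += prev[s - 1]
            # The blank between two labels may be skipped only if they differ.
            if s >= 2 and label != index[blank] and label != ext[s - 2]:
                total += prev[s - 2]
            alpha[s] = total * row[label]
    prob = alpha[-1] + (alpha[-2] if len(ext) > 1 else 0.0)
    return -math.log(prob) if prob > 0 else math.inf

def correct(matrix, alphabet, candidates):
    """Sort candidates by ascending CTC loss and return the first one."""
    ranked = sorted(candidates, key=lambda s: ctc_loss(matrix, alphabet, s))
    return ranked[0]

print(correct(posterior, ALPHABET, ["AB", "A8"]))  # A8
```

Because the loss sums over every alignment that collapses to a given string, it can rank two candidates differently from a simple per-position comparison when probability mass is spread across neighbouring time steps.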

In a second aspect, the present application further provides a post-character-recognition error correction system.

Fig. 4 is a structural diagram of the entire post-character-recognition error correction system.

As shown in fig. 4, the post-character-recognition error correction system includes:

the candidate character determination module: configured to determine the maximum value and the second largest value in all one-hot vectors, wherein when the score difference between the maximum value and the second largest value in any one-hot vector is smaller than or equal to a first threshold value, the character corresponding to the maximum value and the character corresponding to the second largest value are candidate characters;

a character string acquisition module: configured to combine the candidate characters and the characters corresponding to the maximum value in the other one-hot vectors in left-to-right order to obtain a character string to be processed;

a character string processing module: configured to perform a deduplication operation on the character string to be processed to obtain a candidate character string, and to send the candidate character string to an error correction branch, wherein the error correction branch is used for processing the candidate character string and outputting an error correction result.

In some embodiments, the candidate character determining module is further configured to, between the step of obtaining a posterior probability matrix, where each one-hot vector in the posterior probability matrix corresponds to one character, and the step of combining the candidate characters and the characters corresponding to the maximum value in the remaining one-hot vectors in the initial left-to-right order to obtain the character string to be processed, perform:

and determining the maximum value, the second maximum value and the third maximum value in all the one-hot vectors, wherein when the score difference between the maximum value and the second maximum value in any one-hot vector is smaller than or equal to a first threshold value, the character corresponding to the maximum value, the character corresponding to the second maximum value and the character corresponding to the third maximum value are candidate characters. The range of the first threshold value can be 0.05-0.15, and the value of the first threshold value can be set according to the actual situation.

In some embodiments, in the step of outputting the error correction result by the error correction branch, the string processing module is further configured to:

sorting the CTC loss values in ascending order;

and outputting an error correction result, wherein the error correction result is the candidate character string with the minimum CTC loss value.

In some embodiments, the string processing module is further configured to:

when the score difference between the maximum value and the second maximum value is greater than the first threshold in every one-hot vector, no candidate character exists; in this case, a recognition character string is obtained by performing the deduplication operation on the character string formed by the characters corresponding to the maximum value in all one-hot vectors, and the recognition character string is sent to a matching branch, wherein the matching branch is used for comparing a preset regular expression with the character sequence in the recognition character string;

All modules in the character recognition error correction system cooperate to enable the whole system to efficiently and accurately recognize characters.

In a third aspect, the present application further discloses a post-character-recognition error correction apparatus, as shown in fig. 5, the post-character-recognition error correction apparatus includes: at least one processor, a memory, and an input-output unit; wherein the memory is adapted to store a computer program and the processor is adapted to call the computer program stored in the memory to perform the method of the first aspect.

In a fourth aspect, the present application further discloses a computer-readable storage medium, in which at least one executable instruction is stored, and when the executable instruction is executed on the post-character-recognition error correction apparatus, the post-character-recognition error correction apparatus performs the operations of the post-character-recognition error correction method in the above first aspect.

The advantages of the embodiments are as follows. Whether a candidate character exists is judged by determining the maximum value and the second maximum value in all the one-hot vectors; when a candidate character exists, the candidate character strings are processed by the error correction branch and the error correction result is output, so that the characters can be accurately recognized. Further, when no candidate character exists, the matching branch compares the preset regular expression with the character sequence in the recognition character string to determine whether the character string is correctly recognized, which screens out character strings with recognition errors and improves the accuracy of the whole character recognition process. Furthermore, the CTC loss value is calculated based on the candidate character string and the posterior probability matrix, which is beneficial to improving the applicability of the whole method.

The foregoing description, for purposes of explanation, has been presented in conjunction with specific embodiments. However, the foregoing discussion in some embodiments is not intended to be exhaustive or to limit the embodiments to the precise forms disclosed above. Many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles and the practical application, to thereby enable others skilled in the art to best utilize the embodiments and various embodiments with various modifications as are suited to the particular use contemplated.

Claims

1. A method for error correction after character recognition, the method comprising:

combining the candidate characters and characters corresponding to the maximum value in the rest one-hot vectors according to the initial left-to-right sequence to obtain character strings to be processed;

2. The method for error correction after character recognition according to claim 1, wherein, between the step of obtaining a posterior probability matrix, where each one-hot vector in the posterior probability matrix corresponds to a character, and the step of combining the candidate characters and the characters corresponding to the maximum value in the remaining one-hot vectors in the initial left-to-right order to obtain the character string to be processed, further comprises:

3. The post-character-recognition error correction method according to claim 1, wherein in the step of outputting the error correction result by the error correction branch, further comprising:

sorting the CTC loss values in ascending order;

4. The post-character-recognition error correction method according to any one of claims 1 to 3, further comprising:

when the score difference between the maximum value and the second maximum value is greater than the first threshold in every one-hot vector, no candidate character exists; in this case, a recognition character string is obtained by performing a deduplication operation on the character string formed by the characters corresponding to the maximum value in all one-hot vectors, the recognition character string is sent to a matching branch, and the matching branch is used for comparing a preset regular expression with the character sequence in the recognition character string;

5. The post-character-recognition error-correction method according to claim 4, wherein in the step of determining the character corresponding to the maximum value and the character corresponding to the second largest value as candidate characters when the score difference between the maximum value and the second largest value in any one of the one-hot vectors is less than or equal to a first threshold, the first threshold is 0.05-0.15.

6. A post-character-recognition error correction system, the system comprising:

an acquisition module: configured to obtain a posterior probability matrix, wherein each one-hot vector in the posterior probability matrix corresponds to a character;

a candidate character determination module: configured to determine the maximum value and the second largest value in all one-hot vectors, wherein when the score difference between the maximum value and the second largest value in any one-hot vector is smaller than or equal to a first threshold value, the character corresponding to the maximum value and the character corresponding to the second largest value are candidate characters;

a character string acquisition module: configured to combine the candidate characters and the characters corresponding to the maximum value in the remaining one-hot vectors in the initial left-to-right order to obtain a character string to be processed;

a character string processing module: configured to perform a deduplication operation on the character string to be processed to obtain a candidate character string, and to send the candidate character string to an error correction branch, wherein the error correction branch is used for processing the candidate character string and outputting an error correction result.

7. The post-character-recognition error correction system of claim 6, wherein the candidate character determination module is further configured to:

8. The post-character-recognition error-correction system of claim 6, wherein the character string processing module is further configured to:

sorting the CTC loss values in ascending order;

9. The post-character-recognition error-correction system according to any one of claims 6-7, wherein the character string processing module is further configured to:

10. A post-character-recognition error correction apparatus, comprising: at least one processor, a memory, and an input-output unit; wherein the memory is for storing a computer program and the processor is for invoking the computer program stored in the memory to perform the method of any of claims 1-5.